Site: YippeeSoft开心软件

Permalink: Blocking Spiders

Spider traffic


14 16.87 16.87  0.00 0.00 0.00 0.00
15 488.33 488.33  0.00 0.00 0.00 0.00
16 569.05 569.05  0.00 0.00 0.00 0.00
17 513.98 513.98  0.00 0.00 0.00 0.00
18 1260.21 1260.21  0.00 0.00 0.00 0.00


Damn spiders: they burn through 1 GB of my bandwidth in a single day.


I have already blocked a whole bunch of them, but I can hardly block Google and Baidu as well.
Frustrating.


I tried a spider simulator to test the site, and it doesn't look like that much:
http://www.webconfs.com/search-engine-spider-simulator.php


Changing Firefox's user agent to a blocked one also gets a 403 response, so the rules work as expected.


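Every browser sends a string identifying itself and its operating system to the sites it visits. The User Agent Switcher extension is used to spoof that identification. To change it by hand:
   1. Type about:config into the Firefox address bar.
   2. Create or modify the String preference general.useragent.override.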
   3. Set its value to one of the user agent strings listed below.


   1. "Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.6) Gecko/20070914 Firefox/2.0.0.7"
   2. "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7"
   3. "Mozilla/5.0 (Windows; U; Windows NT 6.0; en) AppleWebKit/522.15.5 (KHTML, like Gecko) Version/3.0.3 Safari/522.15.5"
   4. "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/103u (KHTML, like Gecko) safari/100"
   5. "Opera/9.23 (X11; Linux x86_64; U; en)"
   6. "Opera/9.23 (Windows NT 5.1; U; en)"
   7. "Mozilla/4.0 (compatible; MSIE 6.1; Windows XP)"
   8. "Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0)"
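
The manual check above can also be scripted. What follows is only a minimal Python sketch, not something from the original setup: it sends a request with a chosen User-Agent header and reports the status code the server returns. The function names build_request and fetch_status, and the URL in the usage comment, are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent):
    """Build a GET request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code (for example
        # 403) is available on the exception object.
        return error.code


# Usage (hypothetical URL; replace it with your own site):
#   fetch_status("http://www.example.com/", "Wget/1.12")
#   fetch_status("http://www.example.com/", "Opera/9.23 (Windows NT 5.1; U; en)")
```

If the blocking rules are active, a blocked agent should come back as 403 while an ordinary browser string should still get 200.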


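# Return 403 Forbidden to any request whose User-Agent matches one of the patterns below.
# RewriteEngine On is required unless it is already enabled earlier in the same file.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Teleport [OR]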
RewriteCond %{HTTP_USER_AGENT} Webdup [OR]
RewriteCond %{HTTP_USER_AGENT} NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} psbot [OR]
RewriteCond %{HTTP_USER_AGENT} btbot [OR]
RewriteCond %{HTTP_USER_AGENT} Wget [OR]
RewriteCond %{HTTP_USER_AGENT} Website\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} WebPic [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [OR]
RewriteCond %{HTTP_USER_AGENT} mp3Spider [OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Internet\ Explorer$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0$ [OR]
RewriteCond %{HTTP_USER_AGENT} psycheclone [OR]
RewriteCond %{HTTP_USER_AGENT} tspyyp [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snapbot$ [OR]
RewriteCond %{HTTP_USER_AGENT} Pic\ Agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSN\ Bot$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\/4\.0\ \(compatible;\)$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\/4\.0\ \(compatible;\ \)$ [OR]
RewriteCond %{HTTP_USER_AGENT} mozilla\.com [OR]
RewriteCond %{HTTP_USER_AGENT} lanshanbot [OR]
RewriteCond %{HTTP_USER_AGENT} 我的浏览器 [OR]
RewriteCond %{HTTP_USER_AGENT} InetURL [OR]
RewriteCond %{HTTP_USER_AGENT} Outfox [OR]
RewriteCond %{HTTP_USER_AGENT} TMCrawler [OR]
RewriteCond %{HTTP_USER_AGENT} hl_ftien_spider [OR]
RewriteCond %{HTTP_USER_AGENT} DigExt [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F,L]
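
Before uploading a list like this, it can help to try the patterns against a few user agent strings offline. The Python sketch below is an approximation under stated assumptions: it copies only a subset of the RewriteCond patterns above, rewrites them as Python regular expressions, and matches case-sensitively because the conditions carry no [NC] flag. The names BLOCKED_PATTERNS and is_blocked are made up for this example, and Python's re module stands in for Apache's regex engine, which behaves the same way for patterns this simple.

```python
import re

# A subset of the RewriteCond patterns, as Python regexes. Apache needs "\ "
# to escape a space in an unquoted pattern; a Python raw string does not.
BLOCKED_PATTERNS = [
    r"Teleport",
    r"Wget",
    r"Indy Library",
    r"^Mozilla/4\.0$",
    r"^Mozilla/4\.0 \(compatible;\)$",
    r"^ia_archiver$",
    r"DigExt",
    r"^$",  # empty User-Agent header
]


def is_blocked(user_agent):
    """Return True if any pattern matches (case-sensitive, like no [NC])."""
    return any(re.search(pattern, user_agent) for pattern in BLOCKED_PATTERNS)


for agent in ["Wget/1.12", "wget/1.12", "", "Mozilla/4.0",
              "Mozilla/4.0 (compatible; MSIE 6.1; Windows XP)"]:
    print(repr(agent), is_blocked(agent))
```

Two things show up in such a test. A pattern without [NC] misses a differently cased agent such as "wget/1.12". A pattern anchored with both ^ and $ matches only when the whole header is exactly that string, so an agent that appends a version or a URL after the name slips through.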


SetEnvIfNoCase User-Agent "^HTTrack" ban_bot
SetEnvIfNoCase User-Agent "^EmailCollector" ban_bot
SetEnvIfNoCase User-Agent "^EmailWolf" ban_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" ban_bot
SetEnvIfNoCase User-Agent "^Offline" ban_bot
SetEnvIfNoCase User-Agent "^WebCopier" ban_bot
SetEnvIfNoCase User-Agent "^Webdupe" ban_bot
SetEnvIfNoCase User-Agent "^WebZIP" ban_bot
SetEnvIfNoCase User-Agent "^Web Downloader" ban_bot
SetEnvIfNoCase User-Agent "^WebAuto" ban_bot
SetEnvIfNoCase User-Agent "^WebCapture" ban_bot
SetEnvIfNoCase User-Agent "^WebMirror" ban_bot
SetEnvIfNoCase User-Agent "^WebStripper" ban_bot
SetEnvIfNoCase User-Agent ^Mozilla.*Indy ban_bot
SetEnvIfNoCase User-Agent "^Slurp" ban_bot
SetEnvIfNoCase User-Agent "^Yahoo! Slurp China" ban_bot
SetEnvIfNoCase User-Agent "^Yahoo! Slurp" ban_bot
SetEnvIfNoCase User-Agent "^ia_archiver" ban_bot
SetEnvIfNoCase User-Agent "^lanshanbot" ban_bot
SetEnvIfNoCase User-Agent "^iaskspider" ban_bot


deny from env=ban_bot
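
The SetEnvIfNoCase rules behave differently from the RewriteCond list: they ignore case, and every pattern is anchored to the start of the header with ^. The Python sketch below imitates that behaviour for a subset of the patterns so the effect of the anchor can be seen. It is an illustration only; the names BAN_BOT_PATTERNS and sets_ban_bot are invented here, and the sample Yahoo string is the form that crawler is commonly reported to send, which is worth confirming against your own access log.

```python
import re

# A subset of the SetEnvIfNoCase patterns, all anchored at the start.
BAN_BOT_PATTERNS = [
    r"^HTTrack",
    r"^WebZIP",
    r"^Web Downloader",
    r"^Mozilla.*Indy",
    r"^Yahoo! Slurp",
    r"^ia_archiver",
]


def sets_ban_bot(user_agent):
    """Return True if a pattern matches, ignoring case like SetEnvIfNoCase."""
    return any(re.search(pattern, user_agent, re.IGNORECASE)
               for pattern in BAN_BOT_PATTERNS)


samples = [
    "httrack 3.0",
    "Mozilla/3.0 (compatible; Indy Library)",
    "Yahoo! Slurp",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
]
for agent in samples:
    print(agent, "->", sets_ban_bot(agent))
```

In this sketch the last sample is not flagged, because the name "Yahoo! Slurp" appears in the middle of the header rather than at its start. If a crawler keeps showing up in the log despite a rule like this, dropping the ^ from its pattern is one thing to try.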

Original article. If you repost it, please credit the source: YippeeSoft开心软件
