
Re: strange web log entry



> 65.214.36.45 - - [26/Feb/2002:05:14:25] "GET /robots.txt
> HTTP/1.0" - 404
>   - - "Mozilla/2.0 (compatible; Ask Jeeves)"
..
> What, exactly, is the robots.txt file that I always see
> them going after, anyway?  

robots.txt is a file for *robots* (automated crawlers) that are going to index
your website. It tells them which parts of the site you'd like them to stay out of.
Apparently, Ask Jeeves is indexing your site for their search engine.

Jeeves is a nice one. It looked for robots.txt and told you who it was.
It probably won't rip your entire site in one massive request stream either.
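For illustration, here is roughly what a well-behaved robot does with that file before fetching anything else. This is only a sketch using Python's standard urllib.robotparser; the robots.txt contents, the "ExampleBot" name, and the example.com URLs are all made up for the example, and I have no idea how Ask Jeeves actually implements it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: every robot is asked to skip /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/notes.html"):
        print(url, allowed(ROBOTS_TXT, "ExampleBot", url))
```

Keep in mind the file is only a request. A polite robot honors it; nothing forces a rude one to.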

Other robots are less friendly. The obnoxious ones ignore all of the above
and scrape your site for email addresses to spam.

Then again, it could be someone impersonating Ask Jeeves. See where the IP
comes from.
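One way to check is a reverse DNS lookup on the address, followed by a forward lookup on the name you get back to see that the two agree. A rough sketch with Python's socket module; the function name is my own invention, and whether any particular crawler's addresses have reverse DNS set up at all is an assumption you'd have to test.

```python
import socket

def forward_confirmed_host(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the hostname for ip if reverse and forward DNS agree, else None.

    reverse and forward default to the real resolver calls; they are
    parameters only so the check can be exercised without a network.
    """
    try:
        hostname = reverse(ip)[0]        # PTR lookup: address -> name
        addresses = forward(hostname)[2]  # A lookup: name -> list of addresses
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname if ip in addresses else None

if __name__ == "__main__":
    # The address from the log entry quoted above.
    print(forward_confirmed_host("65.214.36.45"))
```

If the name that comes back belongs to the search engine's domain and resolves back to the same address, the visitor is probably who it says it is. A whois lookup on the address is another option.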

A Google search for "robots.txt exclusion file" will give you plenty of info.

Mike808/

---------------------------------------------
http://www.valuenet.net


