Robotstxt.org - http://www.robotstxt.org/
Information on the robots.txt Robots Exclusion Standard, plus articles about writing well-behaved Web robots.

ACAP - Automated Content Access Protocol - http://www.the-acap.org/
A standard being developed on behalf of content publishers to communicate permissions information in more detail than robots.txt allows. Includes project documents, implementation and background information.

User-Agents.org - http://www.user-agents.org/
Large list of search engine spiders, similar Web robots, and Web browsers, with their web-log identification and links to their originators.

Bots vs Browsers - http://www.botsvsbrowsers.com
This large database lists user agents in categories and distinguishes between robots and browsers.

Search Engine Robots and Other User Agents - http://www.jafsoft.com/searchengines/webbots.html
John A. Fotheringham presents data in tabular form on the robots sent by search engines and other sites to read and index Web pages: their origins, names and IP addresses. |