Introducing the websites of two unfamiliar search bots

1. http://www.gigablast.com (this search bot is especially diligent)

2006-04-13 17:12:32,796 - User: userID:836,userName:guest; IP: 66.154.103.128 USER-AGENT: Gigabot/2.0/gigablast.com/spider.html REFERER: null
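A log entry like the one above can be split into its fields with a small script. The following is a minimal sketch in Python, assuming the labels `IP:`, `USER-AGENT:` and `REFERER:` always appear in this order. The helper name `parse_entry` is hypothetical and is not part of the site's own code.

```python
import re

# Assumed layout: "... IP: <addr> USER-AGENT: <agent> REFERER: <url or null>"
ENTRY_RE = re.compile(
    r"IP:\s*(?P<ip>\S+)\s+USER-AGENT:\s*(?P<agent>.*?)\s+REFERER:\s*(?P<referer>\S*)"
)

def parse_entry(entry: str) -> dict:
    """Return the ip, agent and referer fields of one log entry (hypothetical helper)."""
    # Collapse line breaks so that a wrapped entry is matched as a single line.
    flat = " ".join(entry.split())
    match = ENTRY_RE.search(flat)
    if match is None:
        raise ValueError("unrecognized log entry")
    return match.groupdict()

entry = (
    "2006-04-13 17:12:32,796 - User: userID:836,userName:guest; "
    "IP: 66.154.103.128 USER-AGENT: Gigabot/2.0/gigablast.com/spider.html "
    "REFERER: null"
)
fields = parse_entry(entry)
print(fields["ip"], fields["agent"], fields["referer"])
```

Grouping the parsed entries by `agent` is then enough to see which bot visits most often.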

Gigablast

Gigablast provides large-scale, high-performance, real-time information retrieval technology for partner sites.



About Us

With one of the largest and freshest indexes in the world, Gigablast Inc. has recently joined the elite ranks of major search engine companies.

Founded in 2000, Gigablast was created to index up to 200 Billion pages with the least amount of hardware possible. Gigablast provides large-scale, high-performance, real-time information retrieval technology for partner sites. The company offers a variety of features including topic generation and the ability to index multiple document formats. This search delivery mechanism gives a partner "turn key" search capability and the capacity to instantly offer search at maximum scalability with minimum cost. In addition, the Gigablast website (www.gigablast.com) provides unique "Gigabits" of information, enabling visitors to easily refine their search based upon related topics from search results. Clients range from NASDAQ 100 listed corporations to boutique companies.

About The Founder

Matt Wells

Matt Wells got a BS in CS and an MS in Mathematics from New Mexico Tech. While a graduate student he developed and implemented a site called The Artists' Den in 1996. It was on Yahoo's What's Cool twice, on Netscape's What's New once and a semi-permanent feature on Infoseek's Reference page.


Copyright © 2000-2005 Gigablast, Inc. All rights reserved.

2. http://www.greaterera.com/

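(I only discovered this one today. It crawls nothing but RSS feeds, and it carries a session ID in the links it requests. It also keeps that session for a long time: from 13:00 this afternoon until now, about 8 hours, it has used a single session. Compared with the typical search bot, which opens a new session for every link it visits, this surely puts far less load on the site. What I don't understand is why most search bots, such as Google's, open a new session for each link they visit.)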

2006-04-13 20:18:05,859 - get feed of latestDiariesOfGoal:feedType=rss_1.0 entriesNum=39 channelID=416, channelName is:学习页面制作; User: userID:836,userName:guest; IP: 65.59.220.23 USER-AGENT: Mozilla/5.0 (compatible; heritrix/1.7.0 +http://www.greaterera.com/) REFERER: http://www.learndiary.com/indexAction.do;jsessionid=a-efeghgjtyPc
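The session reuse described above can be checked from the logs by pulling the `jsessionid` out of each REFERER URL and counting the distinct values per user agent. A likely explanation for the difference, which the logs alone do not confirm, is that a crawler that neither returns cookies nor follows the rewritten `;jsessionid=` links looks like a new visitor on every request, while this bot follows the rewritten links and so stays in one session. The following is a minimal sketch in Python. The helper names `session_id` and `sessions_per_agent` are hypothetical, and the second request in the sample data is a made-up repeat added only for illustration.

```python
import re
from collections import defaultdict

# With URL rewriting, servlet containers append the session ID to the URL path
# as ";jsessionid=<id>".
SESSION_RE = re.compile(r";jsessionid=([^?#\s]+)", re.IGNORECASE)

def session_id(url: str):
    """Return the jsessionid embedded in a URL, or None if there is none."""
    match = SESSION_RE.search(url)
    return match.group(1) if match else None

def sessions_per_agent(requests):
    """Map each user agent to the set of session IDs seen in its REFERER URLs."""
    seen = defaultdict(set)
    for agent, referer in requests:
        sid = session_id(referer)
        if sid is not None:
            seen[agent].add(sid)
    return dict(seen)

# The first pair comes from the log entry above; the second is an illustrative
# repeat request, not real log data.
requests = [
    ("heritrix/1.7.0", "http://www.learndiary.com/indexAction.do;jsessionid=a-efeghgjtyPc"),
    ("heritrix/1.7.0", "http://www.learndiary.com/indexAction.do;jsessionid=a-efeghgjtyPc"),
]
print(sessions_per_agent(requests))
```

A bot that reuses its session ends up with a single ID in its set no matter how many requests it makes, whereas a bot that opens one session per link would accumulate roughly one ID per request.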

greaterera.com

We are a group of young professional people with ambition and enthusiasm. We pursue truth, cherish our cultures and appreciate achievements of our civilization. We are experimenting a project of collecting contents of interesting websites that we believe would be good representations of our current Internet community. The purpose is to record and preserve part of our Internet heritage for future generations. We are preparing for a greater era that is coming ahead. For any question or comments, please send it to mail@greaterera.com.