Resources on the Internet are abundant, but finding useful information among them is difficult. A search engine is the best way to solve this problem. This article first introduces the system structure of an Internet-based search engine in detail, then gives a detailed explanation of its parts: the spider, the search engine, and the web server. To understand the technology more deeply, I have programmed a news search engine myself.
The news search engine starts from a designated web page, parses it, and follows its hyperlinks to reach further pages. It indexes every item it retrieves and adds it to the index database. After receiving a user's request from the web server, it quickly retrieves the matching news from the index.
The chapter introducing the search engine not only elaborates on the core technology but also combines it with code and pictures, so that it is easy to understand.
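As a rough illustration of the crawl, index, and query flow described above, here is a minimal Python sketch. It is not the author's implementation: the `PAGES` dictionary stands in for real HTTP fetching, and the names `LinkAndTextParser`, `crawl_and_index`, and `search` are hypothetical, chosen only for this example. The index is a plain in-memory inverted index rather than a real index database.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

# Hypothetical in-memory "web": URL -> HTML. A real spider would fetch over HTTP.
PAGES = {
    "/index": '<a href="/a">A</a> <a href="/b">B</a> Daily news portal',
    "/a": '<a href="/b">B</a> Search engine released today',
    "/b": '<a href="/index">Home</a> New engine wins award',
    "/orphan": "Never linked, so never crawled",
}


class LinkAndTextParser(HTMLParser):
    """Collects the href targets of <a> tags and the text of one page."""

    def __init__(self):
        super().__init__()
        self.links, self.text = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)

    def handle_data(self, data):
        self.text.append(data)


def crawl_and_index(start, pages):
    """Breadth-first crawl from `start`; returns a word -> set-of-URLs index."""
    index = defaultdict(set)
    seen, queue = {start}, deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkAndTextParser()
        parser.feed(pages[url])
        # Index every word on the page (lowercased).
        for word in re.findall(r"\w+", " ".join(parser.text).lower()):
            index[word].add(url)
        # Follow hyperlinks to known pages that have not been visited yet.
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the sorted URLs of pages containing every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = crawl_and_index("/index", PAGES)
print(search(index, "engine"))
print(search(index, "new engine"))
```

The `seen` set keeps the spider from revisiting pages when links form a cycle, and a page that no crawled page links to (here `/orphan`) never enters the index. A production system would add politeness delays, persistent storage, and result ranking, none of which are shown here.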
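The first generation of search engines appeared in 1994. These search engines generally indexed fewer than 1,000,000 web pages and rarely re-crawled pages to refresh their indexes. Their retrieval speed was also very slow, typically requiring a wait of 10 seconds or even longer. In terms of implementation, they largely reused relatively mature technologies such as IR (Information Retrieval), networking, and databases, which made them essentially an application on the WWW built from existing technologies. From March to April 1994, the web crawler World Wide Web Worm (WWWW) received an average of about 1,500 queries per day.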