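Baidu regards the enterprise search market as a "chicken rib" (too little meat to be worth eating, yet a pity to throw away), but Oracle is committing serious resources to the field and gearing up for a fight. A few days ago, Oracle led by example and used its own product, Oracle Secure Enterprise Search, to power its own site search: http://search.oracle.com .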
Here is what the Oracle Secure Enterprise Search (SES) crawler looks like in the web log:
"GET /OpenSource.htm HTTP/1.0" 200 7336 "-" \
"Oracle Secure Enterprise Search"
"GET /Publications.htm HTTP/1.0" 200 6959 "-" \
"Oracle Secure Enterprise Search"
"GET /OracleTech.htm HTTP/1.0" 200 14086 "-" \
"Oracle Secure Enterprise Search"
"GET /Others.htm HTTP/1.0" 200 5863 "-" \
"Oracle Secure Enterprise Search"
"GET /Others/Service.htm HTTP/1.0" 200 4268 "-" \
"Oracle Secure Enterprise Search"
"GET /Others/AboutMe.htm HTTP/1.0" 200 5186 "-" \
"Oracle Secure Enterprise Search"
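The crawler identifies itself simply as Oracle Secure Enterprise Search. The web log does not reveal a version number; the log on the SES server side shows that the current crawler version is 10.1.6.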
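If you want to pick the SES crawler's requests out of your own web log, a minimal sketch in Python might look like the following. It makes several assumptions: that the log uses the common Apache "combined" layout (request, status, size, referer, user agent), that each entry sits on a single line in the real file (the backslashes in the excerpt are treated as display wrapping only), and that the crawler always sends exactly the user-agent string shown in the excerpt. The names `LOG_TAIL` and `ses_requests` are made up for this illustration, and the second sample line is a hypothetical non-SES request added only to show the filtering.

```python
import re

# User-agent string exactly as it appears in the log excerpt.
SES_AGENT = "Oracle Secure Enterprise Search"

# Matches the tail of a "combined"-style entry:
#   "METHOD /path PROTO" status size "referer" "user-agent"
# search() is used below, so any host/date prefix before the request is ignored.
LOG_TAIL = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def ses_requests(lines, agent=SES_AGENT):
    """Return (path, status, size) for every line whose user agent equals `agent`."""
    hits = []
    for line in lines:
        m = LOG_TAIL.search(line)
        if m and m.group("agent") == agent:
            # A size of "-" means no body was sent; count it as 0 bytes.
            size = 0 if m.group("size") == "-" else int(m.group("size"))
            hits.append((m.group("path"), int(m.group("status")), size))
    return hits


sample = [
    # Taken from the excerpt, joined onto one line.
    '"GET /OpenSource.htm HTTP/1.0" 200 7336 "-" "Oracle Secure Enterprise Search"',
    # Hypothetical request from another client, for contrast.
    '"GET /Others.htm HTTP/1.0" 200 5863 "-" "Mozilla/5.0"',
]

for path, status, size in ses_requests(sample):
    print(path, status, size)
```

Since the user-agent string carries no version number, a filter like this can tell you which pages SES fetched but not which crawler release fetched them; for that you still need the SES server-side log.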
–EOF–