Benchmarking the Advanced Search Interfaces of Eight Major WWW Search Engines

Dr. Randy D. Ralph & John W. Felts, Jr.

Keywords: information retrieval, search engines, World Wide Web, benchmarking, advanced search, search interfaces

Abstract: This research project was designed to benchmark the performance of the advanced search interfaces of eight of the major World Wide Web (WWW) search engines, excluding the meta engines. A review of the literature did not find any previous benchmarking studies of the advanced interfaces based on quantitative data. The research was performed by fifty-two graduate students of library and information studies (LIS) on three campuses of the University of North Carolina (UNC) as a class research project for course LIS 645, Computer-Related Technologies in Library Management. The class was offered by the Department of Library and Information Studies at UNC Greensboro through the North Carolina Research and Education Network (NC-REN). The LIS students selected Altavista, Excite, Go/Infoseek, Google, Hotbot, Lycos, Northernlight, and Yahoo for comparative study. Each researcher submitted a total of five questions in a range of subject areas to each of the eight selected search engines, totaling 2,080 individual searches in 260 search panels of eight search engine trials.
Data was collected in the following categories on the first 20 unique citations viewed in the search output lists from the engines:

1) an index of relative recall based on the actual or estimated recall reported by the search engine
2) the number of direct hits among the first 20 unique citations viewed
3) the number of false coordinations among the first 20 unique citations viewed
4) the number of citations to websites with duplicate content
5) the number of citations to websites resulting in failed views
6) the depth to the first solid hit among the citations in the search output list

The aim of the research was to identify the engines that might best meet the needs of a library patron. While, on the whole, the search engines performed equally well on a number of the parameters tested, it was found that the engines differed most significantly in:

1) the percent of relevancy in results from direct hits
2) the depth to the first solid hit
3) the number of duplicate citations delivered
4) the number of citations which resulted in failed views

A discussion and summary of the results, conclusions and recommendations for further research are included.

1. BACKGROUND

1.1 Overview

This project builds on previous work conducted by classes in the Department of Library and Information Studies of the School of Education at the University of North Carolina (UNC) at Greensboro under the direction of Dr. Randy D. Ralph. Six of the top eight global World Wide Web (WWW) search engines identified in the previous comparative testing in 1997 as part of an Indexing and Abstracting course (WWW Search Engine Test Methods, available at URL http://www.netstrider.com/search/methods.html) and in 1999 as part of a course in library automation (Computer-Related Technologies in Library Management, by Randy D. Ralph and John W. Felts, at URL http://library.uncg.edu/search/) were again selected for comparative benchmark testing, this time using the fall 2000 Computer-Related Technologies in Library Management classes (LIS 645), meeting at UNC's Asheville, Charlotte and Greensboro campuses. Each of the fifty-two students devised five (5) search queries in diverse subject areas and genres in order to gauge the overall performance of the eight selected search engines. In a departure from the earlier study, advanced search queries were presented to the search engines using their own advanced search interfaces, rather than the simple default interfaces. The search engines selected were Altavista, Excite, Go/Infoseek, Google, Hotbot, Lycos, Northernlight and Yahoo.

1.2 Rationale

There is still a need for the type of examination performed here. While more and more librarians (among the rest of us) are using search engines, few real statistical analyses, as opposed to popular informal comparisons, have been conducted. Because search engines evolve so rapidly, many earlier studies are now outdated. New studies are underway, but this study builds on earlier research only three years old, expanding the earlier parameters. Moreover, as the Internet becomes more and more commercialized, the need for an unbiased and statistically valid comparison is greater now than ever before. This research can be periodically repeated, taking into account the evolution of the search engines as well as that of the Internet itself.

1.3 Background of Search Engines

Search engines came into existence only after 1994. A search engine is software that searches web sites and indexes found in the World Wide Web, and returns the matches, such as documents compatible with the search