DOI: 10.1145/860435.860448
Article

Implicit link analysis for small web search

Published: 28 July 2003

Abstract

Current Web search engines generally apply link analysis-based re-ranking to web-page retrieval. However, the same techniques, when applied directly to small web search such as intranet and site search, cannot achieve the same performance because the link structures of small webs differ from that of the global Web. In this paper, we propose an approach that constructs implicit links by mining users' access patterns and then applies a modified PageRank algorithm to re-rank web pages for small web search. Our experimental results indicate that the proposed method outperforms the content-based method by 16%, explicit link-based PageRank by 20%, and DirectHit by 14%.
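
The abstract does not spell out how implicit links are mined or how PageRank is modified, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes an implicit link is an ordered page pair that co-occurs within a short window of the same user session at least `min_support` times, and it substitutes a standard weighted PageRank for the paper's modified variant. The function names, the `window` and `min_support` parameters, and the sample sessions are all hypothetical.

```python
from collections import Counter


def implicit_links(sessions, window=3, min_support=2):
    """Mine ordered page pairs (a, b) where b follows a within `window`
    steps of a session. Each session counts a given pair at most once.
    Assumption: this stands in for the paper's access-pattern mining."""
    counts = Counter()
    for session in sessions:
        seen = set()
        for i, a in enumerate(session):
            for b in session[i + 1 : i + 1 + window]:
                if a != b and (a, b) not in seen:
                    seen.add((a, b))
                    counts[(a, b)] += 1
    # Keep only pairs supported by enough sessions.
    return {pair: n for pair, n in counts.items() if n >= min_support}


def weighted_pagerank(links, damping=0.85, iterations=50):
    """Standard weighted PageRank by power iteration over the implicit
    link graph. Assumption: not the paper's modified algorithm."""
    pages = sorted({page for pair in links for page in pair})
    n = len(pages)
    if n == 0:
        return {}
    out_weight = Counter()
    for (a, _), w in links.items():
        out_weight[a] += w
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if out_weight[p] == 0)
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for (a, b), w in links.items():
            new_rank[b] += damping * rank[a] * w / out_weight[a]
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical click sessions from a small site's access log.
    sessions = [
        ["home", "docs", "api"],
        ["home", "docs"],
        ["search", "docs", "api"],
        ["home", "api"],
    ]
    links = implicit_links(sessions)
    for page, score in sorted(weighted_pagerank(links).items()):
        print(page, round(score, 3))
```

In a re-ranking setting, scores like these would be combined with a content-based relevance score; how the paper combines them is not stated in the abstract.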


    Published In

    SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
    July 2003
    490 pages
    ISBN:1581136463
    DOI:10.1145/860435

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. information retrieval
    2. link analysis
    3. log mining
    4. web search

    Qualifiers

    • Article

    Conference

    SIGIR03
    Acceptance Rates

    SIGIR '03 Paper Acceptance Rate 46 of 266 submissions, 17%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%
