DOI: 10.1145/1076034.1076120
Article

Improving web search results using affinity graph

Published: 15 August 2005
    Abstract

    In this paper, we propose a novel ranking scheme named Affinity Ranking (AR) that re-ranks search results by optimizing two metrics: (1) diversity, which indicates the variance of topics in a group of documents; and (2) information richness, which measures how well a single document covers its topic. Both metrics are computed from a directed link graph named the Affinity Graph (AG), which models the structure of a group of documents based on the asymmetric content similarities between each pair of documents. Experimental results on Yahoo! Directory, ODP, and newsgroup data demonstrate that the proposed ranking algorithm significantly improves search performance. Specifically, within the top 10 search results, the algorithm achieves relative improvements of 31% in diversity and 12% in information richness.
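
The abstract names the ingredients (a directed graph built from asymmetric content similarities, a per-document information richness score, and a diversity-aware re-ranking) but does not give formulas. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation: the affinity formula, the PageRank-style richness score, the greedy penalty, and all function and parameter names (`affinity_graph`, `information_richness`, `diversity_rerank`, `threshold`, `damping`) are illustrative choices made here.

```python
def affinity_graph(docs, threshold=0.0):
    """Build a directed affinity matrix from term-weight dicts.

    Assumption: aff(i -> j) = dot(d_i, d_j) / |d_i|^2, which is asymmetric
    whenever the two vectors have different norms.
    """
    n = len(docs)
    graph = [[0.0] * n for _ in range(n)]
    for i, di in enumerate(docs):
        norm_sq = sum(w * w for w in di.values())
        if norm_sq == 0.0:
            continue  # an empty document has no outgoing affinity
        for j, dj in enumerate(docs):
            if i == j:
                continue
            dot = sum(w * dj.get(term, 0.0) for term, w in di.items())
            aff = dot / norm_sq
            if aff > threshold:
                graph[i][j] = aff
    return graph


def information_richness(graph, damping=0.85, iterations=50):
    """Assumption: richness is a PageRank-style score on the row-normalized graph."""
    n = len(graph)
    if n == 0:
        return []
    out_sums = [sum(row) for row in graph]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_sums[i] == 0.0:
                # A node without out-links spreads its score uniformly.
                share = damping * scores[i] / n
                for j in range(n):
                    new[j] += share
            else:
                for j in range(n):
                    if graph[i][j] > 0.0:
                        new[j] += damping * scores[i] * graph[i][j] / out_sums[i]
        scores = new
    return scores


def diversity_rerank(graph, scores):
    """Assumption: greedy selection; after picking a document, every remaining
    document j is penalized by its normalized affinity to the pick times the
    pick's richness, so near-duplicates of earlier picks sink in the order."""
    out_sums = [sum(row) for row in graph]
    remaining = dict(enumerate(scores))
    order = []
    while remaining:
        best = max(remaining, key=lambda k: (remaining[k], -k))
        order.append(best)
        del remaining[best]
        for j in remaining:
            if out_sums[j] > 0.0:
                remaining[j] -= graph[j][best] / out_sums[j] * scores[best]
    return order


if __name__ == "__main__":
    # Two overlapping documents and one on a different topic (toy data).
    docs = [
        {"a": 1.0, "b": 1.0},
        {"a": 1.0, "b": 1.0, "c": 1.0},
        {"x": 1.0, "y": 1.0},
    ]
    graph = affinity_graph(docs)
    richness = information_richness(graph)
    print(diversity_rerank(graph, richness))
```

In this toy setup the two overlapping documents reinforce each other's richness, and the penalty step then pushes the second of them below the unrelated document, which is the kind of trade-off between richness and diversity the abstract describes.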

    Published In

    SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
    August 2005
    708 pages
    ISBN:1595930345
    DOI:10.1145/1076034

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. affinity ranking
    2. diversity
    3. information retrieval
    4. information richness
    5. link analysis

    Qualifiers

    • Article

    Conference

    SIGIR05

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2023) PSLOG: Pretraining with Search Logs for Document Ranking. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 2072-2082. DOI: 10.1145/3580305.3599477. Online publication date: 6-Aug-2023
    • (2023) Query-Based Extractive Text Summarization Using Sense-Oriented Semantic Relatedness Measure. Arabian Journal for Science and Engineering. DOI: 10.1007/s13369-023-07983-7. Online publication date: 18-Aug-2023
    • (2022) An overview of cluster-based image search result organization: background, techniques, and ongoing challenges. Knowledge and Information Systems. DOI: 10.1007/s10115-021-01650-9. Online publication date: 11-Feb-2022
    • (2021) Ambiguity and Clarification. Result Page Generation for Web Searching, pp. 28-42. DOI: 10.4018/978-1-7998-0961-6.ch004. Online publication date: 2021
    • (2021) Modeling Intent Graph for Search Result Diversification. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 736-746. DOI: 10.1145/3404835.3462872. Online publication date: 11-Jul-2021
    • (2020) Efficient Outlier Detection in Text Corpus Using Rare Frequency and Ranking. ACM Transactions on Knowledge Discovery from Data 14:6, pp. 1-30. DOI: 10.1145/3399712. Online publication date: 3-Oct-2020
    • (2020) Measuring effectiveness of sample-based product-line testing. ACM SIGPLAN Notices 53:9, pp. 119-133. DOI: 10.1145/3393934.3278130. Online publication date: 7-Apr-2020
    • (2020) Pattern matching in an open world. ACM SIGPLAN Notices 53:9, pp. 134-146. DOI: 10.1145/3393934.3278124. Online publication date: 7-Apr-2020
    • (2020) A multimedia document browser based on multilayer networks. Multimedia Tools and Applications. DOI: 10.1007/s11042-020-09872-9. Online publication date: 12-Nov-2020
    • (2020) Toward action comprehension for searching. Journal of the Association for Information Science and Technology 71:2, pp. 143-157. DOI: 10.1002/asi.24220. Online publication date: 1-Jan-2020
