
On ranking techniques for desktop search

Published: 08 April 2008

Abstract

Users tend to store huge numbers of files, in various formats, on their personal computers. As a result, finding a specific, desired file within the file system is a challenging task. This article addresses the desktop search problem by considering various techniques for ranking the results of a search query over the file system. First, basic ranking techniques, which are based on various file features (e.g., file name, access date, file size), are considered and their effectiveness is empirically analyzed. Next, two learning-based ranking schemes are presented and are shown to be significantly more effective than the basic ranking methods. Finally, a novel ranking technique, based on query selectiveness, is considered for use during the cold-start period of the system. This method is also shown to be empirically effective, even though it does not involve any learning.
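As a rough illustration of what a basic, feature-based ranking could look like, the sketch below orders candidate files by a single feature at a time. It is not the article's method: the FileInfo record, the function names, and the sort directions (more name matches first, most recently accessed first, smaller files first) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class FileInfo:
    """Hypothetical record holding the file features named in the abstract."""
    name: str
    last_access: datetime
    size_bytes: int


def rank_by_name_match(query: str, files: List[FileInfo]) -> List[FileInfo]:
    """Put files whose name contains more query terms first.

    Assumption: a case-insensitive substring match per query term.
    sorted() is stable, so ties keep their input order.
    """
    terms = query.lower().split()

    def hits(f: FileInfo) -> int:
        return sum(term in f.name.lower() for term in terms)

    return sorted(files, key=hits, reverse=True)


def rank_by_recency(files: List[FileInfo]) -> List[FileInfo]:
    """Put the most recently accessed files first (assumed direction)."""
    return sorted(files, key=lambda f: f.last_access, reverse=True)


def rank_by_size(files: List[FileInfo]) -> List[FileInfo]:
    """Put smaller files first (assumed direction; the reverse is equally plausible)."""
    return sorted(files, key=lambda f: f.size_bytes)


if __name__ == "__main__":
    candidates = [
        FileInfo("budget_2007.xls", datetime(2008, 1, 5), 40_000),
        FileInfo("notes.txt", datetime(2008, 3, 1), 2_000),
        FileInfo("budget_draft.doc", datetime(2008, 2, 10), 90_000),
    ]
    for f in rank_by_name_match("budget draft", candidates):
        print(f.name)
```

Each function yields a different ordering of the same candidates, which is why the relative effectiveness of individual features has to be measured empirically, as the abstract describes.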


    Published In

    ACM Transactions on Information Systems  Volume 26, Issue 2
    March 2008
    214 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/1344411
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 April 2008
    Accepted: 01 September 2007
    Revised: 01 August 2007
    Received: 01 May 2007
    Published in TOIS Volume 26, Issue 2

    Author Tags

    1. Desktop search
    2. personal information management
    3. ranking

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Cited By

    • (2023) An Analysis of Untargeted Poisoning Attack and Defense Methods for Federated Online Learning to Rank Systems. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 215-224. DOI: 10.1145/3578337.3605117. Online publication date: 9-Aug-2023
    • (2018) Activity-based linkage and ranking methods for personal dataspace. International Journal of Mobile Communications 16:3, 266-285. DOI: 10.1504/IJMC.2018.091381. Online publication date: 1-Jan-2018
    • (2016) Ontology-assisted provenance visualization for supporting enterprise search of engineering and business files. Advanced Engineering Informatics 30:2, 244-257. DOI: 10.1016/j.aei.2016.04.003. Online publication date: 1-Apr-2016
    • (2014) Using Context to Discern User Tasks on Desktop. Applied Mechanics and Materials 519-520, 318-321. DOI: 10.4028/www.scientific.net/AMM.519-520.318. Online publication date: Mar-2014
    • (2013) FRIDAL: A Desktop Search System Based on Latent Interfile Relationships. Software and Data Technologies, 220-234. DOI: 10.1007/978-3-642-29578-2_14. Online publication date: 2013
    • (2012) Minersoft. ACM Transactions on Internet Technology 12:1, 1-34. DOI: 10.1145/2220352.2220354. Online publication date: 5-Jul-2012
    • (2012) Superiority of agent based search in advanced educational technology. 2012 IEEE International Conference on Technology Enhanced Education (ICTEE), 1-8. DOI: 10.1109/ICTEE.2012.6208659. Online publication date: Jan-2012
    • (2012) SoDesktop. Proceedings of the 2012 International Conference on Communication Systems and Network Technologies, 463-467. DOI: 10.1109/CSNT.2012.106. Online publication date: 11-May-2012
    • (2011) Seeding simulated queries with user-study data for personal search evaluation. Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, 25-34. DOI: 10.1145/2009916.2009924. Online publication date: 24-Jul-2011
    • (2011) Unified structure and content search for personal information management systems. Proceedings of the 14th International Conference on Extending Database Technology, 201-212. DOI: 10.1145/1951365.1951391. Online publication date: 21-Mar-2011
