DOI: 10.1145/1183579.1183586

Distributed cache table: efficient query-driven processing of multi-term queries in P2P networks

Published: 11 November 2006

Abstract

The state-of-the-art techniques for processing multi-term queries in P2P environments are query flooding and inverted list intersection. However, it has been shown that neither method scales well enough to support full-text search over large document collections distributed among the nodes of a P2P network. Although a number of optimizations based on these techniques have been suggested recently, little evidence has been given of their scalability. In this paper we propose a novel query-driven indexing strategy that generates and maintains only those index entries that are actually used for query processing. In our approach, called Distributed Cache Table (DCT) by analogy with Distributed Hash Table (DHT), we propose to abandon the distinction between data indexing and query caching and to store result sets (caches) for the most profitable queries. DCT employs a distributed index to efficiently locate caches that can answer a given multi-term query, and broadcasts the query to all peers only if no such cache is found. Evaluations on real data and query loads show that DCT converges to a high cache-hit ratio and offers a large-scale distributed solution for storing and efficiently querying vast amounts of documents in the P2P setting. DCT reduces traffic consumption by two orders of magnitude compared to a standard distributed single-term indexing approach.
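
The lookup-then-broadcast flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the class `DCTSketch`, its methods, and the in-memory dictionary standing in for the distributed index are all hypothetical. The sketch also assumes conjunctive queries, assumes that a cache stored for a subset of the query terms can answer the query by local filtering, and caches every miss rather than only the "most profitable" queries.

```python
from itertools import combinations


class DCTSketch:
    """Toy stand-in for a Distributed Cache Table (all names are hypothetical)."""

    def __init__(self, peers):
        # Each peer is a dict: doc_id -> set of terms in that document.
        self.peers = peers
        # Stand-in for the distributed index: frozenset(terms) -> {doc_id: terms}.
        self.index = {}
        # Counts how often a query had to be flooded to all peers.
        self.broadcasts = 0

    def _find_cache(self, terms):
        # Assumption: a cache for any subset of the query terms holds a superset
        # of the answer. Try larger (more selective) subsets first.
        for size in range(len(terms), 0, -1):
            for subset in combinations(sorted(terms), size):
                cache = self.index.get(frozenset(subset))
                if cache is not None:
                    return cache
        return None

    def _broadcast(self, terms):
        # Fallback: ask every peer for documents containing all query terms.
        self.broadcasts += 1
        return {
            doc_id: doc_terms
            for peer in self.peers
            for doc_id, doc_terms in peer.items()
            if terms <= doc_terms
        }

    def query(self, terms):
        terms = frozenset(terms)
        cache = self._find_cache(terms)
        if cache is None:
            # Cache miss: broadcast, then store the result set under the query key.
            cache = self._broadcast(terms)
            self.index[terms] = cache
        # Filter locally so that a subset cache yields the exact answer.
        return sorted(d for d, t in cache.items() if terms <= t)
```

In this sketch, a query for `{"p2p", "cache"}` triggers one broadcast, and a later query for `{"p2p", "cache", "query"}` is answered from the stored cache without contacting the peers again.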



Published In

P2PIR '06: Proceedings of the international workshop on Information retrieval in peer-to-peer networks
November 2006
66 pages
ISBN: 1595935274
DOI: 10.1145/1183579

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. DHT
  2. P2P
  3. caching
  4. indexing
  5. multi-term
  6. processing
  7. query
  8. query-driven

Qualifiers

  • Article

Conference

CIKM06: Conference on Information and Knowledge Management
November 11, 2006
Arlington, Virginia, USA
