Abstract
Large, highly distributed data sets are poorly supported by current query technologies. Applications such as endsystem-based network management are characterized by data stored on large numbers of endsystems, with frequent local updates and relatively infrequent global one-shot queries. The challenges are scale (10^3 to 10^9 endsystems) and endsystem unavailability. In such large systems, a significant fraction of endsystems and their data will be unavailable at any given time. Existing methods to provide high data availability despite endsystem unavailability involve centralizing, redistributing or replicating the data. These methods do not scale to systems of this size. We advocate a design that trades query delay for completeness, incrementally returning results as endsystems become available. We also introduce the idea of completeness prediction, which provides the user with explicit feedback about this delay/completeness trade-off. Completeness prediction is based on replication of compact data summaries and availability models. This metadata is orders of magnitude smaller than the data. Seaweed is a scalable query infrastructure supporting incremental results, online in-network aggregation and completeness prediction. It is built on a distributed hash table (DHT) but, unlike previous DHT-based approaches, it does not redistribute data across the network. Instead, it exploits the DHT infrastructure for failure-resilient metadata replication, query dissemination, and result aggregation. We analytically compare Seaweed’s scalability against other approaches and also evaluate the Seaweed prototype running on a large-scale network simulator driven by real-world traces.
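The delay/completeness trade-off described above can be illustrated with a minimal sketch. It is not Seaweed's actual algorithm: the `EndsystemSummary` type, its fields, and the linear availability models are hypothetical and chosen only for illustration. The sketch assumes each endsystem's replicated metadata yields an estimated count of matching rows and a function giving the probability that the endsystem is available within a given delay; predicted completeness is then the expected fraction of matching rows reachable by that delay.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EndsystemSummary:
    """Hypothetical replicated metadata for one endsystem."""
    online: bool                           # is the endsystem reachable right now?
    est_rows: float                        # rows estimated to match the query
    p_up_within: Callable[[float], float]  # P(available within t seconds)


def predicted_completeness(summaries: List[EndsystemSummary], delay_s: float) -> float:
    """Expected fraction of matching rows returned after waiting delay_s seconds."""
    total = sum(s.est_rows for s in summaries)
    if total == 0:
        return 1.0  # nothing matches anywhere, so the answer is trivially complete
    expected = sum(
        s.est_rows * (1.0 if s.online else s.p_up_within(delay_s))
        for s in summaries
    )
    return expected / total


def linear_model(full_after_s: float) -> Callable[[float], float]:
    """Toy availability model (an assumption): probability grows linearly to 1."""
    return lambda t: min(1.0, max(0.0, t / full_after_s))


summaries = [
    EndsystemSummary(online=True, est_rows=100, p_up_within=linear_model(1.0)),
    EndsystemSummary(online=False, est_rows=300, p_up_within=linear_model(3600.0)),
    EndsystemSummary(online=False, est_rows=100, p_up_within=linear_model(7200.0)),
]

for delay in (0.0, 1800.0, 7200.0):
    print(delay, round(predicted_completeness(summaries, delay), 2))
```

Under these toy assumptions, a user could see before waiting that an immediate answer covers only a small share of the matching rows, while a longer delay covers progressively more.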
Cite this article
Narayanan, D., Donnelly, A., Mortier, R. et al. Delay aware querying with Seaweed. The VLDB Journal 17, 315–331 (2008). https://doi.org/10.1007/s00778-007-0060-3
DOI: https://doi.org/10.1007/s00778-007-0060-3