Research article
DOI: 10.1145/3135974.3135976

Efficient covering for top-k filtering in content-based publish/subscribe systems

Published: 11 December 2017

Abstract

We investigate the use of content-based publish/subscribe for data dissemination in large-scale applications with expressive filtering requirements. In particular, we focus on top-k subscription filtering, where a publication is delivered only to the k best ranked subscribers, as ordered using expressive semantics such as relevance, fairness, and diversity. The naive approach of filtering early at the publisher edge works only if complete knowledge of the subscriptions is available, which is incompatible with the well-established covering optimization in scalable content-based publish/subscribe systems. We propose an efficient rank-cover technique to reconcile top-k subscription filtering with covering. We extend the covering model to support top-k and describe a novel algorithm for forwarding subscriptions to publishers while maintaining correctness. We also establish a framework for supporting different types of ranking semantics and propose an implementation to support fairness. Finally, we compare our solutions to a baseline covering system and perform sensitivity analysis to demonstrate that our optimized rank-cover algorithm retains both covering and fairness while achieving properties advantageous to our targeted workloads. In a typical setting, our optimized solution is scalable, selects fairly, and provides over 81% of the covering benefit.
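
The tension between covering and early top-k filtering described in the abstract can be illustrated with a small sketch. This is not the paper's rank-cover algorithm. It assumes a toy model in which each subscription is a closed interval over a single numeric attribute and carries a fixed ranking score; the names `Subscription`, `weight`, `forwarded`, and `top_k` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    name: str
    low: float
    high: float
    weight: float  # hypothetical ranking score (higher ranks first)

    def matches(self, value: float) -> bool:
        return self.low <= value <= self.high

    def covers(self, other: "Subscription") -> bool:
        # self covers other if every value matching other also matches self
        return self.low <= other.low and other.high <= self.high


def forwarded(subs):
    """Keep only subscriptions not covered by another one (plain covering)."""
    kept = []
    for s in subs:
        if any(o.covers(s) for o in kept):
            continue  # s is covered, so it is not forwarded upstream
        kept = [o for o in kept if not s.covers(o)]
        kept.append(s)
    return kept


def top_k(subs, value, k):
    """Return the k highest-weight subscriptions matching the publication."""
    matching = [s for s in subs if s.matches(value)]
    return sorted(matching, key=lambda s: s.weight, reverse=True)[:k]


subs = [
    Subscription("wide", 0, 100, weight=1.0),
    Subscription("narrow", 40, 60, weight=5.0),
    Subscription("other", 90, 120, weight=2.0),
]

edge_view = forwarded(subs)            # what the publisher edge knows
true_top1 = top_k(subs, 50, k=1)       # ranking with complete knowledge
edge_top1 = top_k(edge_view, 50, k=1)  # ranking after plain covering

print([s.name for s in edge_view])
print(true_top1[0].name, edge_top1[0].name)
```

In this toy setting the covered subscription is the best-ranked match, yet the publisher edge never sees it, so filtering there selects a different subscriber. Under these assumptions, that mismatch is the kind of correctness problem a covering model extended for top-k has to avoid.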

Published In

Middleware '17: Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference
December 2017
268 pages
ISBN: 9781450347204
DOI: 10.1145/3135974

Sponsors

In-Cooperation

  • USENIX Association
  • IFIP

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

Middleware '17: 18th International Middleware Conference
December 11 - 15, 2017
Las Vegas, Nevada

Acceptance Rates

Middleware '17 paper acceptance rate: 20 of 85 submissions (24%)
Overall acceptance rate: 203 of 948 submissions (21%)

Article Metrics

  • Downloads (last 12 months): 8
  • Downloads (last 6 weeks): 3
Reflects downloads up to 23 Jan 2025.

Cited By

  • (2024) ShutPub. Proceedings of the 7th International Workshop on Edge Systems, Analytics and Networking, 13-18. DOI: 10.1145/3642968.3654815. Online publication date: 22-Apr-2024
  • (2023) Lotus: Serverless In-Transit Data Processing for Edge-based Pub/Sub. Proceedings of the 6th International Workshop on Edge Systems, Analytics and Networking, 31-35. DOI: 10.1145/3578354.3592869. Online publication date: 8-May-2023
  • (2022) Human as a Service: Towards Resilient Parking Search System With Sensorless Sensing. IEEE Transactions on Intelligent Transportation Systems 23:8, 13863-13877. DOI: 10.1109/TITS.2021.3133713. Online publication date: Aug-2022
  • (2018) FireDeX. Proceedings of the 19th International Middleware Conference, 279-292. DOI: 10.1145/3274808.3274830. Online publication date: 26-Nov-2018
  • (2018) Cloud-MOM: A Content-Based Real-Time Message-Oriented Middleware for Cloud. 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 750-757. DOI: 10.1109/HPCC/SmartCity/DSS.2018.00128. Online publication date: Jun-2018
  • (2018) Adjusting Matching Algorithm to Adapt to Dynamic Subscriptions in Content-Based Publish/Subscribe Systems. 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), 369-376. DOI: 10.1109/BDCloud.2018.00064. Online publication date: Dec-2018
