Research article. DOI: 10.1145/2602622.2602629

Data stream processing in dynamic and decentralized peer-to-peer networks

Published: 18 June 2014

Abstract

Data stream management systems (DSMS) process data streams: potentially unbounded sequences of data sent by active data sources. Distributed DSMS spread this processing across a network of interconnected machines to increase the available processing power. Typically, clusters of homogeneous, non-autonomous machines are used. In some applications, however, such a cluster is unavailable, infeasible, too expensive to acquire, or too complex to deploy. An alternative is to use a collection of notebooks, personal computers, or smartphones, which yields a network consisting solely of autonomous and heterogeneous machines. Such a network is dynamic and decentralized, and distributed data stream processing has to take both properties into account. In this paper, I present my PhD project on developing and deploying a distributed DSMS that can run in a peer-to-peer (P2P) network of autonomous and heterogeneous peers. My approach addresses three main challenges: data source management, continuous query distribution, and distributed query management. A prototype implementation is already in place, and its evaluation is planned.
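To make the idea of continuous query distribution in a dynamic network more concrete, the following is a minimal sketch and not the method developed in the paper, which the abstract does not specify. It assumes that every peer advertises a relative processing capacity and every query operator has an estimated cost; the names `Peer`, `place`, and `handle_departure` are hypothetical. Operators are assigned greedily to the peer whose relative load stays lowest, and they are placed again when a peer leaves.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical autonomous peer with a relative processing capacity."""
    name: str
    capacity: float
    operators: list = field(default_factory=list)  # (operator name, cost) pairs

    def load(self) -> float:
        # Relative load: total assigned operator cost divided by capacity.
        return sum(cost for _, cost in self.operators) / self.capacity


def place(operators, peers):
    """Greedily assign each (name, cost) operator, most expensive first,
    to the peer that would have the lowest relative load afterwards."""
    for op in sorted(operators, key=lambda o: o[1], reverse=True):
        target = min(
            peers,
            key=lambda p: (p.load() + op[1] / p.capacity, p.name),
        )
        target.operators.append(op)


def handle_departure(leaving, peers):
    """Move the operators of a departing peer to the remaining peers."""
    remaining = [p for p in peers if p is not leaving]
    if not remaining:
        raise RuntimeError("no peers left to host the query")
    orphaned, leaving.operators = leaving.operators, []
    place(orphaned, remaining)
    return remaining


if __name__ == "__main__":
    notebook = Peer("notebook", capacity=4.0)
    phone = Peer("phone", capacity=1.0)
    query = [("select", 2.0), ("join", 4.0), ("aggregate", 1.0)]

    place(query, [notebook, phone])
    for peer in (notebook, phone):
        print(peer.name, [name for name, _ in peer.operators])

    # The phone leaves the network; its operators move to the notebook.
    handle_departure(phone, [notebook, phone])
    print(notebook.name, [name for name, _ in notebook.operators])
```

A real P2P deployment would also have to discover peers, migrate operator state, and account for network latency, none of which this sketch attempts.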

    Published In

    SIGMOD'14 PhD Symposium: Proceedings of the 2014 SIGMOD PhD symposium
    June 2014
    58 pages
    ISBN:9781450329248
    DOI:10.1145/2602622

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data streams
    2. distributed systems
    3. p2p network
    4. peer computing

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'14
    Acceptance Rates

    SIGMOD'14 PhD Symposium Paper Acceptance Rate 10 of 13 submissions, 77%;
    Overall Acceptance Rate 40 of 60 submissions, 67%

    Cited By

    • (2018) A Knowledge Carrying Service-Component Architecture for Smart Cyber Physical Systems. Service-Oriented Computing – ICSOC 2017 Workshops, pp. 270-282. DOI: 10.1007/978-3-319-91764-1_22. Online publication date: 16-Jun-2018
    • (2017) IoT-Based Big Data Storage Systems in Cloud Computing: Perspectives and Challenges. IEEE Internet of Things Journal 4:1, pp. 75-87. DOI: 10.1109/JIOT.2016.2619369. Online publication date: Feb-2017
    • (2015) Herakles. Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems, pp. 356-359. DOI: 10.1145/2675743.2776775. Online publication date: 24-Jun-2015
    • (2015) Modulares Verteilungskonzept für Datenstrommanagementsysteme. Datenbank-Spektrum 15:3, pp. 213-221. DOI: 10.1007/s13222-015-0199-9. Online publication date: 30-Sep-2015
    • (2014) Die Abteilung Informationssysteme der Universität Oldenburg. Datenbank-Spektrum 14:3, pp. 237-242. DOI: 10.1007/s13222-014-0166-x. Online publication date: 17-Sep-2014
