DOI: 10.1145/2933267.2933539
research-article

Taming velocity and variety simultaneously in big data with stream reasoning: tutorial

Published: 13 June 2016

Abstract

Many "big data" applications must tame velocity (processing data in motion) and variety (processing many different types of data) simultaneously.
The research on knowledge representation and reasoning has focused on the variety of data, devising data representation and processing techniques that promote integration and reasoning on available data to extract implicit information. On the other hand, the event and stream processing community has focused on the velocity of data, producing systems that efficiently operate on streams of data on-the-fly according to pre-deployed processing rules or queries. Several recent works explore the synergy between stream processing and reasoning to fully capture the requirements of modern data-intensive applications, thus giving birth to the research domain of stream reasoning.
This tutorial paper offers an overview of the theoretical and technological achievements in stream reasoning, highlighting the key benefits and limitations of existing approaches, and discussing the open challenges and the opportunities for future research. The paper mainly targets researchers and practitioners in the area of event and stream processing. The paper aims to stimulate the discussion on stream reasoning and to further promote the integration of reasoning techniques within event and stream processing systems in three ways: (i) by presenting an active research domain, where researchers on event and stream processing can apply their expertise; (ii) by discussing techniques and technologies that can help advance the state of the art in event and stream processing; (iii) by identifying the open problems in the field of stream reasoning, and drawing attention to promising research directions.
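
The combination the abstract describes, processing a stream on-the-fly while reasoning over it to extract implicit information, can be illustrated with a small sketch. It is not taken from the paper or from any of the systems it surveys: the class hierarchy, the `WindowedReasoner` class, and the vehicle data are hypothetical names chosen for illustration, and the sketch assumes stream elements arrive in non-decreasing timestamp order.

```python
from collections import deque

# Hypothetical static background knowledge: (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("Tram", "PublicTransport"),
    ("Bus", "PublicTransport"),
    ("PublicTransport", "Vehicle"),
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return cls together with every class reachable via subclass links."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


class WindowedReasoner:
    """Keeps the facts of the last `width` time units and reasons over them."""

    def __init__(self, width):
        self.width = width
        # Each entry is (timestamp, subject, asserted_class), in arrival order.
        self.window = deque()

    def push(self, timestamp, subject, asserted_class):
        """Add one stream element, then evict elements that left the window.

        Assumes timestamps are non-decreasing across calls.
        """
        self.window.append((timestamp, subject, asserted_class))
        while self.window and self.window[0][0] <= timestamp - self.width:
            self.window.popleft()

    def instances_of(self, cls):
        """Subjects in the current window whose asserted class implies cls."""
        return {
            subject
            for _, subject, asserted in self.window
            if cls in superclasses(asserted)
        }


reasoner = WindowedReasoner(width=10)
reasoner.push(1, "vehicle_a", "Tram")
reasoner.push(5, "vehicle_b", "Bus")
early = reasoner.instances_of("Vehicle")          # both are vehicles by inference
reasoner.push(12, "vehicle_c", "Car")             # evicts the element at time 1
late = reasoner.instances_of("PublicTransport")   # only the bus remains
```

The window handles velocity by bounding how much of the stream is considered at once, while the subclass closure handles variety by answering a query about a class that no stream element mentions explicitly. Recomputing the closure on every query is the simplest option; an incremental approach would be a natural refinement, at the cost of a longer example.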

Published In

DEBS '16: Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems
June 2016
456 pages
ISBN: 9781450340212
DOI: 10.1145/2933267

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. complex event processing
2. event processing
3. reasoning
4. stream processing
5. stream reasoning

Qualifiers

• Research-article

Conference

DEBS '16

Acceptance Rates

Overall Acceptance Rate: 145 of 583 submissions, 25%
