DOI: 10.1007/978-3-031-19433-7_40
Article

RMLStreamer-SISO: An RDF Stream Generator from Streaming Heterogeneous Data

Published: 23 October 2022

Abstract

Stream-reasoning query languages such as CQELS and C-SPARQL enable query answering over RDF streams. However, efficient RDF stream generators to feed RDF stream reasoners are currently lacking: state-of-the-art generators are limited in the velocity and volume of streaming data they can handle. To generate RDF streams efficiently and scalably, we extended the RMLStreamer to also generate RDF streams from dynamic heterogeneous data streams. This paper introduces a scalable solution that relies on a dynamic window approach to generate RDF streams with low latency and high throughput from multiple heterogeneous data streams. Our evaluation shows that our solution outperforms the state of the art by achieving millisecond latency (compared to the seconds that state-of-the-art solutions need), constant memory usage for all workloads, and a sustainable throughput of around 70,000 records/s (compared to the 10,000 records/s that state-of-the-art solutions reach). This opens up access to numerous data streams for integration with the semantic web.
Resource type: Software
License: MIT License
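
The abstract refers to a dynamic window approach for combining multiple data streams. As a rough illustration only, the sketch below joins two keyed streams inside a window whose length shrinks under high load and grows under low load. It is not the RMLStreamer-SISO implementation: the class name `DynamicWindowJoin`, the thresholds, and the halve/grow adaptation rule are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DynamicWindowJoin:
    """Hypothetical window join whose length adapts to the observed load."""
    size_ms: float = 1000.0   # current window length (assumed default)
    min_ms: float = 50.0      # lower bound on the window length
    max_ms: float = 5000.0    # upper bound on the window length
    high_load: int = 100      # more records than this per window: shrink
    low_load: int = 10        # fewer records than this per window: grow
    window_start: float = 0.0
    count: int = 0
    left: dict = field(default_factory=lambda: defaultdict(list))
    right: dict = field(default_factory=lambda: defaultdict(list))

    def _roll(self, ts: float) -> None:
        """Close the window if it has expired and adapt its length."""
        if ts < self.window_start + self.size_ms:
            return
        if self.count > self.high_load:
            # Busy window: halve it to bound buffered state and latency.
            self.size_ms = max(self.min_ms, self.size_ms / 2)
        elif self.count < self.low_load:
            # Quiet window: grow it so join partners have time to arrive.
            self.size_ms = min(self.max_ms, self.size_ms * 1.5)
        self.window_start = ts
        self.count = 0
        self.left.clear()
        self.right.clear()

    def push(self, side: str, key: str, value: str, ts: float) -> list:
        """Buffer one record and return the (left, right) pairs it completes."""
        self._roll(ts)
        own, other = (self.left, self.right) if side == "left" else (self.right, self.left)
        own[key].append(value)
        self.count += 1
        partners = other.get(key, [])
        if side == "left":
            return [(value, p) for p in partners]
        return [(p, value) for p in partners]


join = DynamicWindowJoin(size_ms=100, high_load=2, low_load=1)
join.push("left", "sensor-a", "L1", ts=0)
pairs = join.push("right", "sensor-a", "R1", ts=10)
```

Each joined pair could then be handed to a mapping step that emits RDF triples. Buffered state is discarded whenever a window closes, which is one simple way to keep memory bounded; a real stream processor would manage this state very differently.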

Cited By

  • (2024) FlexRML: A Flexible and Memory Efficient Knowledge Graph Materializer. The Semantic Web, pp. 40-56. DOI: 10.1007/978-3-031-60635-9_3. Online publication date: 26-May-2024
  • (2023) Boosting Knowledge Graph Generation from Tabular Data with RML Views. The Semantic Web, pp. 484-501. DOI: 10.1007/978-3-031-33455-9_29. Online publication date: 28-May-2023

Published In

The Semantic Web – ISWC 2022: 21st International Semantic Web Conference, Virtual Event, October 23–27, 2022, Proceedings
Oct 2022
898 pages
ISBN:978-3-031-19432-0
DOI:10.1007/978-3-031-19433-7

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. RML
  2. Stream processing
  3. Window joins
  4. Knowledge graph generation

Qualifiers

  • Article
