Abstract
NOA-AID (Network Overlays for Adaptive information Aggregation, Indexing and Discovery on the fog) is an approach for the decentralized indexing, aggregation and discovery of data belonging to streams. It is organized in two network layers. The upper layer supports information discovery by providing a distributed index structure. The lower layer is devoted to resource aggregation and relies on epidemic protocols designed for highly dynamic environments, which makes it well suited to stream-oriented scenarios. NOA-AID defines a flexible way to express queries targeting highly heterogeneous data, together with a self-organizing dynamic system that allows the optimal resolution of queries on the most suitable stream producers. The paper presents a theoretical study of the costs of information management operations and validates its findings empirically. Finally, it reports an extended experimental evaluation showing that NOA-AID is effective and efficient at retrieving information extracted from streams in highly dynamic, distributed processing architectures.
Cite this article
Dazzi, P., Mordacchini, M. Scalable Decentralized Indexing and Querying of Multi-Streams in the Fog. J Grid Computing 18, 395–418 (2020). https://doi.org/10.1007/s10723-020-09521-3
DOI: https://doi.org/10.1007/s10723-020-09521-3