research-article

Serverless data pipeline approaches for IoT data in fog and cloud computing

Published: 01 May 2022

Abstract

With the increasing number of Internet of Things (IoT) devices, massive amounts of raw data are being generated. The latency, cost, and other challenges of cloud-based IoT data processing have driven the adoption of Edge and Fog computing models, in which some data processing tasks are moved closer to the data sources. Handling the flow of such data properly requires building data pipelines that control the complete life cycle of data streams, from data acquisition at the data source, through edge and fog processing, to Cloud-side storage and analytics. Data analytics tasks need to be executed dynamically at different distances from the data sources, often on very heterogeneous hardware devices. This can be streamlined by using a Serverless (or Function-as-a-Service, FaaS) cloud computing model, in which tasks are defined as virtual functions that can be migrated from edge to cloud (and vice versa) and executed in an event-driven manner on data streams. In this work, we investigate the benefits of building Serverless data pipelines (SDPs) for IoT data analytics and evaluate three approaches to designing SDPs: (1) off-the-shelf data flow tool (DFT) based, (2) object storage service (OSS) based, and (3) MQTT based. We applied these approaches to three fog applications (Aeneas, PocketSphinx, and a custom video processing application) and evaluated their performance by comparing processing time (computation time, network communication time, and disk access time) and resource utilization. The results show that DFT is unsuitable for compute-intensive applications such as video or image processing, whereas OSS is best suited to such tasks. However, DFT is well suited to bandwidth-intensive applications because of its minimal use of network resources. The MQTT-based SDP, on the other hand, showed increasing CPU and memory usage as the number of users rose and dropped data units in the pipeline for the PocketSphinx and custom video processing applications; however, it performed well for Aeneas, whose data units are small.
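
The event-driven, message-based style of pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `InMemoryBroker` class is a hypothetical in-process stand-in for an MQTT broker, and the topic names (`edge/in`, `fog/in`, `cloud/in`), the stage functions, and the temperature-reading payload are all assumptions chosen for illustration. Each stage is a plain function that runs only when a message arrives on its topic and then publishes its result to the next stage.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[Any], None]


class InMemoryBroker:
    """Hypothetical in-process stand-in for a message broker (not a real MQTT client)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        # A publish is the event that triggers every function subscribed to the topic.
        for handler in list(self._subscribers[topic]):
            handler(payload)


def build_pipeline(broker: InMemoryBroker, cloud_store: List[dict]) -> None:
    """Wire three illustrative stages (edge, fog, cloud) together through topics."""

    def edge_filter(reading: dict) -> None:
        # Edge stage (assumed): discard readings that carry no value.
        if reading.get("value") is not None:
            broker.publish("fog/in", reading)

    def fog_convert(reading: dict) -> None:
        # Fog stage (assumed): add a Fahrenheit value computed from Celsius.
        enriched = dict(reading, value_f=reading["value"] * 9 / 5 + 32)
        broker.publish("cloud/in", enriched)

    def cloud_persist(record: dict) -> None:
        # Cloud stage (assumed): a list stands in for cloud-side storage.
        cloud_store.append(record)

    broker.subscribe("edge/in", edge_filter)
    broker.subscribe("fog/in", fog_convert)
    broker.subscribe("cloud/in", cloud_persist)


if __name__ == "__main__":
    broker = InMemoryBroker()
    store: List[dict] = []
    build_pipeline(broker, store)
    broker.publish("edge/in", {"sensor": "t1", "value": 21.5})
    broker.publish("edge/in", {"sensor": "t2", "value": None})
    print(store)
```

Because the stages share only topic names, a stage could in principle be redeployed on a different tier without changing the others, which is the property the Serverless model is meant to provide. A real deployment would replace the in-process broker with a networked one and would have to handle delivery guarantees, which this sketch ignores.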

Highlights

Exhibits Serverless Data Pipelines (SDPs) deployed in three-layered IoT architectures.
Designs SDP approaches using data pipelines, message queues, and object storage.
Applies the proposed SDP approaches to real-time fog computing workloads.
Evaluates the performance of SDPs for suitability in various IoT applications.


Published In

Future Generation Computer Systems, Volume 130, Issue C
May 2022
321 pages

Publisher

Elsevier Science Publishers B. V., Netherlands

Author Tags

1. Serverless computing
2. Data pipelines
3. Cloud computing
4. Fog computing
5. Edge computing
6. Internet of things
