Demand-Driven Data Acquisition for Large Scale Fleets
Abstract
1. Introduction
- Local data privacy laws create a fragmented global regulatory landscape. The European General Data Protection Regulation (GDPR) is among the strictest of these regulations, so complying with it eases compliance with weaker frameworks. The GDPR requires obtaining informed consent from the data subjects before their Personally Identifiable Information (PII) can be collected [12]. Additionally, GDPR key principles such as data minimization and storage limitation must be applied [13].
- System-design issues arise from the large number of cars and their diverse sensing capabilities. The potential supply of data exceeds the demand many times over. Typical data consumers only access a subset of the whole fleet and, from these vehicles, require only a limited number of sensors for their use cases. Consequently, each data consumer accesses an individual subset of the total sensing capacity. The system must map this access pattern efficiently. Nevertheless, these subsets can be of substantial size and place high demands on the scaling properties of the system.
- Even for a single automaker, hundreds of vehicle models are regularly customized into model derivatives for specific regions and local markets. Each derivative can have its own sensing capabilities. Additionally, customers can select optional features that might introduce additional sensors. The sensors themselves might be supplied by different manufacturers with varying data-access channels. Therefore, in addition to the legal heterogeneity, the individual cars are highly heterogeneous.
Contributions
- A method to minimize the scope of acquired sensor data for each vehicle individually: Each vehicle only transmits data required to fulfill the demand relevant to it. A data consumer’s demand is considered relevant if the data subject has consented to fulfill it (personal data) or the individual vehicle was assigned to the demand (non-personal data).
- Dynamic demand determination in the context of a fixed purpose and data scope. For this purpose, the data consumers manage tasks which may also include constraints evaluated on each vehicle locally and data processing options.
- An abstraction layer that enables the integration of a heterogeneous fleet through a single interface. For this purpose, we continuously distribute instructions to the vehicles. Each instruction describes an individual sensor access pattern. The vehicle checks these instructions with a trial-and-error scheme to determine its sensor capabilities and to access the corresponding data.
- A compression strategy that combines custom data preprocessing with an existing algorithm. The combination achieves a better compression ratio than either the preprocessing or any of the evaluated existing algorithms alone. The compression ratio was evaluated using actual vehicle data.
- A cloud-based reference implementation that handles data processing as distributed and trip-related transactions. Overall system throughput is automatically adjusted by adding or removing servers.
- Validation of the scaling properties of the overall system through a realistic simulation of over 200,000 simultaneously active vehicles.
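As an illustration of the first and third contributions, the following Python sketch shows how a per-vehicle acquisition scope and a trial-and-error capability check could be computed. It is a minimal sketch under stated assumptions, not the reference implementation: the `Demand` and `Vehicle` classes, the functions `is_relevant`, `sensors_to_transmit`, and `probe_capabilities`, and the access-pattern strings are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Demand:
    """A data consumer's request for a set of sensors (hypothetical model)."""
    consumer: str
    sensors: frozenset
    personal: bool  # True if the demand covers personal data


@dataclass
class Vehicle:
    vin: str
    consented: set = field(default_factory=set)  # consumers with consent
    assigned: set = field(default_factory=set)   # consumers assigned to this vehicle


def is_relevant(demand, vehicle):
    """Personal data requires consent; non-personal data requires an assignment."""
    if demand.personal:
        return demand.consumer in vehicle.consented
    return demand.consumer in vehicle.assigned


def sensors_to_transmit(demands, vehicle):
    """Union of the sensors of all demands that are relevant to this vehicle."""
    scope = set()
    for demand in demands:
        if is_relevant(demand, vehicle):
            scope |= demand.sensors
    return frozenset(scope)


def probe_capabilities(instructions, try_read):
    """Trial and error: keep the first access pattern that works per sensor.

    `try_read` is a hypothetical callable that raises LookupError when the
    vehicle does not support the given access pattern.
    """
    supported = {}
    for sensor, pattern in instructions:
        if sensor in supported:
            continue
        try:
            try_read(pattern)
        except LookupError:
            continue
        supported[sensor] = pattern
    return supported


demands = [
    Demand("insurer", frozenset({"speed", "gps"}), personal=True),
    Demand("oem", frozenset({"oil_temp"}), personal=False),
    Demand("city", frozenset({"wiper"}), personal=True),  # no consent given
]
car = Vehicle("VIN-A", consented={"insurer"}, assigned={"oem"})
scope = sensors_to_transmit(demands, car)


def fake_read(pattern):
    # Stand-in for a real sensor access; only one pattern works on this car.
    if pattern != "can:0x0C":
        raise LookupError(pattern)
    return 0


instructions = [("speed", "obd:0x0D"), ("speed", "can:0x0C"), ("gps", "nmea")]
supported = probe_capabilities(instructions, fake_read)
```

In this example, the vehicle would transmit `speed`, `gps`, and `oil_temp` but not `wiper`, because no consent exists for the third demand.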
2. Related Work
2.1. Academia
2.2. Commercial Systems
2.2.1. Software Platforms
2.2.2. Hardware/Software Platforms
3. Proposed System
3.1. GDPR Related Requirements
3.2. Demand-Driven Data Acquisition
3.3. Sensor Abstraction
3.4. Data Transmission and Processing
3.5. Endpoint Messaging
3.6. Vehicle Authentication
3.7. Vehicle Simulators
3.8. Acquisition of Non-Personal Data
4. Implementation
4.1. Utilized Technology
- App Engine: A managed service to host web applications. It deploys a user-supplied application to proprietary virtual machines that spawn within seconds. Thus, it allows handling sudden spikes of traffic by adjusting the underlying servers just in time [66].
- Datastore: A managed NoSQL database that stores items with unique keys. Its underlying servers each manage a contiguous subset of the keyspace. Every item is limited to one update per second. It supports consistent queries and transactions spanning multiple operations [67].
- Cloud Tasks: Provides the capability to schedule asynchronous HTTP requests. It supports delayed executions and retries failed requests until they finally succeed. The requests are stored within queues [68].
- Pub/Sub: Asynchronous messaging service. Messages are published to topics that can have multiple subscriptions. Published messages are replicated to every subscription and will be delivered at least once [69].
- Compute Engine: A service that provides virtual servers and auto-scaling. The scaling can be tied to the in-flight messages of a Pub/Sub subscription [70].
- Cloud Storage: An object store that can persist unstructured blobs/objects within buckets. Events can be pushed to a Pub/Sub Topic [71].
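To make the idea of tying server capacity to in-flight Pub/Sub messages more concrete, the following Python sketch shows a simple proportional scaling rule. It is an assumption-laden illustration and not the algorithm used by the Compute Engine autoscaler: the function `target_server_count`, its parameters, and the default bounds are hypothetical.

```python
import math


def target_server_count(in_flight_messages, messages_per_server,
                        min_servers=1, max_servers=100):
    """Number of servers so that each handles at most `messages_per_server`.

    The result is clamped to [min_servers, max_servers]. All parameters are
    hypothetical tuning knobs for this sketch.
    """
    if messages_per_server <= 0:
        raise ValueError("messages_per_server must be positive")
    wanted = math.ceil(in_flight_messages / messages_per_server)
    return max(min_servers, min(max_servers, wanted))


idle = target_server_count(0, 50)          # falls back to the minimum
busy = target_server_count(1200, 50)       # 1200 / 50 = 24 servers
capped = target_server_count(10**6, 50)    # limited by max_servers
```

A rule of this shape scales capacity up when the backlog grows and back down when it shrinks, which is the behavior examined in the performance evaluation.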
4.2. Instruction Distribution
4.3. Consent Lookup
4.4. Task Distribution
4.5. Chunk Compression
1. Timestamp: The system does not have real-time capabilities and cannot perform measurements at exact intervals. In addition, most data-access channels only report a measurement if it differs from the previous one. Consequently, we cannot drop the timestamps in favor of storing a single interval length for the sequence. However, many timestamps still represent recurring intervals with slight variations. They originate from sensors whose values change with almost every measurement. For example, the engine speed most likely varies continuously, given a resolution of 1 RPM. In [46], the authors found that Delta-Of-Delta (DOD) encoding is a good fit for timestamps with such characteristics. It enhances Delta encoding, a procedure that keeps only the first value of a sequence unchanged and stores each subsequent value as the delta to its predecessor ($d_i = t_i - t_{i-1}$). In DOD encoding, the first value is unchanged, and the second is the delta to the first. Subsequent values are computed as follows: $\mathit{dod}_i = (t_i - t_{i-1}) - (t_{i-1} - t_{i-2})$. Thus, they represent the delta of two deltas. Applying DOD encoding to the timestamps results in an average compression ratio of 3.66 (see Table 3).
2. Integer: We apply Delta encoding to integer values. The average compression ratio is 3.76. DOD encoding did not lead to further improvements (see Table 3).
3. Float: We found that no sensor exploits the full precision of a float. Thus, we can perform a reversible conversion into integers (F2I) without information loss. The conversion shifts the decimal point n places to the right ($x \cdot 10^{n}$) and cuts off the remaining decimals. n equals the number of decimal places of the sensor resolution. A resolution of 0.01 corresponds to an n of 2. Applying F2I with Delta and VLI encoding results in an average compression ratio of 3.16 (see Table 3). We dismissed two other approaches. The authors of [46] use XOR-based compression. Their method is optimized for repeating values and can encode them with a single bit. The reference data consist of 59% repetitions. Deducting them from their results gives a compression ratio of 2.09. Furthermore, we validated the strategy presented by Lindstrom and Isenburg [45]. It relies on a predictor that benefits from repeated values as well. Applying it to our dataset resulted in an average compression ratio of 1.24.
4. String: The majority of string-producing sensors yield low-cardinality ENUMs. We apply a dictionary compression scheme to them. The strings are stored once within the dictionary, and the time series only contains dictionary indexes. An average compression ratio of 3.53 was achieved (see Table 3).
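The four preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration and not the evaluated implementation: all function names are hypothetical, `round()` replaces plain truncation in the F2I step to absorb binary floating-point error, and a zigzag base-128 varint stands in for the VLI step on the assumption that VLI denotes a variable-length integer encoding.

```python
def delta_encode(values):
    """Keep the first value; each later value becomes the delta to its predecessor."""
    return [v if i == 0 else v - values[i - 1] for i, v in enumerate(values)]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the deltas."""
    out = []
    for i, d in enumerate(deltas):
        out.append(d if i == 0 else out[-1] + d)
    return out


def dod_encode(values):
    """First value unchanged, second a plain delta, the rest deltas of deltas."""
    deltas = delta_encode(values)
    return deltas[:1] + delta_encode(deltas[1:])


def dod_decode(encoded):
    """Invert dod_encode: rebuild the deltas, then the original values."""
    deltas = encoded[:1] + delta_decode(encoded[1:])
    return delta_decode(deltas)


def f2i(values, n):
    """Shift the decimal point n places to the right and store integers.

    round() is used because, for example, 0.29 * 100 evaluates to
    28.999999999999996 in binary floating point.
    """
    return [round(v * 10 ** n) for v in values]


def i2f(values, n):
    """Invert f2i for values that had at most n decimal places."""
    return [v / 10 ** n for v in values]


def dict_encode(strings):
    """Store each distinct string once; the series holds dictionary indexes."""
    dictionary, positions, indexes = [], {}, []
    for s in strings:
        if s not in positions:
            positions[s] = len(dictionary)
            dictionary.append(s)
        indexes.append(positions[s])
    return dictionary, indexes


def vli_encode(values):
    """Zigzag + base-128 varint, a stand-in for the VLI step (assumption)."""
    out = bytearray()
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1  # zigzag: signed -> unsigned
        while u >= 0x80:
            out.append((u & 0x7F) | 0x80)
            u >>= 7
        out.append(u)
    return bytes(out)


timestamps = [1000, 1100, 1201, 1300, 1400]   # roughly 100 ms apart
deltas = delta_encode(timestamps)             # [1000, 100, 101, 99, 100]
dods = dod_encode(timestamps)                 # [1000, 100, 1, -2, 1]
speeds = f2i([12.34, 12.35, 12.35], n=2)      # [1234, 1235, 1235]
gears, gear_series = dict_encode(["D", "D", "N", "D"])
```

For this toy series, the DOD values cluster around zero and therefore need fewer varint bytes than the plain deltas (7 instead of 10 bytes), which mirrors the reasoning given for timestamps with nearly regular intervals.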
4.6. Chunk Processing
4.7. Chunk Processing Fault Tolerance
4.8. Trip-File Management
1. The Trip File Manager must not accept files associated with revoked consents.
2. The revocation of consent must eventually result in the removal of all associated files without any leftovers.
3. A previously deleted file must not become available again after its repeated submission.
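The three requirements above can be illustrated with a small in-memory Python sketch. It is not the described implementation: the class `TripFileManager` as written here, its methods, and the use of a tombstone set to remember deleted files are assumptions made for this example, and a real system would have to persist this state and handle concurrent, distributed updates.

```python
class TripFileManager:
    """In-memory sketch of the three trip-file requirements (hypothetical API)."""

    def __init__(self):
        self.revoked = set()     # ids of revoked consents
        self.files = {}          # file id -> consent id
        self.tombstones = set()  # ids of files that were deleted

    def submit(self, file_id, consent_id):
        """Accept a file unless its consent is revoked or it was deleted before."""
        if consent_id in self.revoked:   # requirement 1
            return False
        if file_id in self.tombstones:   # requirement 3
            return False
        self.files[file_id] = consent_id
        return True

    def delete(self, file_id):
        """Remove a file and remember its id so it cannot reappear."""
        self.files.pop(file_id, None)
        self.tombstones.add(file_id)

    def revoke(self, consent_id):
        """Block new files for the consent, then remove all existing ones."""
        self.revoked.add(consent_id)
        doomed = [f for f, c in self.files.items() if c == consent_id]
        for file_id in doomed:           # requirement 2
            self.delete(file_id)


manager = TripFileManager()
accepted = manager.submit("trip-1", "consent-A")
manager.revoke("consent-A")
resubmitted = manager.submit("trip-1", "consent-B")
late_upload = manager.submit("trip-2", "consent-A")
```

Marking the consent as revoked before deleting its files means that a file arriving during the cleanup is rejected rather than left behind.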
5. Performance Evaluation
1. Can the system react to elevated load with an automated increase in server capacity?
2. Is the increased server capacity able to keep processing time constant?
3. Can the system respond to reduced load by automatically reducing server capacity?
5.1. Setup
5.2. Results
6. Discussion
6.1. Demand-Driven Data Acquisition
6.2. Abstraction Layer
6.3. Data Compression
6.4. Data Processing
6.5. Data Authenticity
6.6. Performance Evaluation
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
References
- Automotive Usage Based Insurance Market Forecast to 2027—COVID-19 Impact and Global Analysis by Technology Fitted (Smartphones, Black Box, and Dongles); and Policy Type (Pay-As-You-Drive (PAYD) and Pay-How-You-Drive (PHYD)); and Geography. Available online: https://www.theinsightpartners.com/reports/automotive-usage-based-insurance-market/ (accessed on 8 December 2020).
- Mai, A.; Schlesinger, D. A Business Case for Connecting Vehicles. Available online: https://www.cisco.com/c/dam/en_us/about/ac79/docs/mfg/Connected-Vehicles_Exec_Summary.pdf (accessed on 8 December 2020).
- Ullah, S.; Kim, D.H. Lightweight driver behavior identification model with sparse learning on in-vehicle can-bus sensor data. Sensors 2020, 20, 5030.
- Díaz-Álvarez, A.; Clavijo, M.; Jiménez, F.; Serradilla, F. Inferring the driver’s lane change intention through lidar-based environment analysis using convolutional neural networks. Sensors 2021, 21, 475.
- Jeon, Y.; Kim, B.; Baek, Y. Ensemble CNN to Detect Drowsy Driving with In-Vehicle Sensor Data. Sensors 2021, 21, 2372.
- Young, R.; Fallon, S.; Jacob, P.; O’Dwyer, D. Vehicle Telematics and Its Role as a Key Enabler in the Development of Smart Cities. IEEE Sens. J. 2020, 20, 11713–11724.
- Delussu, F.; Imran, F.; Mattia, C.; Meo, R. Fuel Prediction and Reduction in Public Transportation by Sensor Monitoring and Bayesian Networks. Sensors 2021, 21, 4733.
- Zahid, M.; Chen, Y.; Jamal, A.; Memon, M.Q. Short term traffic state prediction via hyperparameter optimization based classifiers. Sensors 2020, 20, 685.
- Fox, A.; Kumar, B.V.; Chen, J.; Bai, F. Multi-Lane Pothole Detection from Crowdsourced Undersampled Vehicle Sensor Data. IEEE Trans. Mob. Comput. 2017, 16, 3417–3430.
- Enriquez, D.; Bautista, A.; Field, P.; Kim, S.i.; Jensen, S.; Ali, M.; Miller, J. CANOPNR: CAN-OBD programmable-expandable network-enabled reader for real-time tracking of slippery road conditions using vehicular parameters. In Proceedings of the 15th International IEEE Conference on Intelligent Transportation Systems (ITSC), Anchorage, AK, USA, 16–19 September 2012; pp. 260–264.
- Bishop, J.D.; Stettler, M.E.; Molden, N.; Boies, A.M. Engine maps of fuel use and emissions from transient driving cycles. Appl. Energy 2016, 183, 202–217.
- Lee, G.Y.; Cha, K.J.; Kim, H.J. Designing the GDPR Compliant Consent Procedure for Personal Information Collection in the IoT Environment. In Proceedings of the 2019 IEEE International Congress on Internet of Things (ICIOT), Milan, Italy, 8–13 July 2019; pp. 79–81.
- Vallet, F. The GDPR and Its Application in Connected Vehicles—Compliance and Good Practices. In Electronic Components and Systems for Automotive Applications; Springer: Cham, Switzerland, 2019; pp. 245–254.
- AutoPi Documentation. Available online: https://docs.autopi.io/ (accessed on 13 October 2021).
- Freematics Homepage. Available online: https://freematics.com/ (accessed on 13 October 2021).
- Peppes, N.; Alexakis, T.; Adamopoulou, E.; Demestichas, K. Driving Behaviour Analysis Using Machine and Deep Learning Methods for Continuous Streams of Vehicular Data. Sensors 2021, 21, 4704.
- Khandakar, A.; Chowdhury, M.E.; Ahmed, R.; Dhib, A.; Mohammed, M.; Al-Emadi, N.A.M.A.; Michelson, D. Portable System for Monitoring and Controlling Driver Behavior and the Use of a Mobile Phone While Driving. Sensors 2019, 19, 1563.
- Zhang, M.; Wo, T.; Xie, T.; Lin, X.; Liu, Y. CarStream: An industrial system of big data processing for Internet-of-Vehicles. Proc. VLDB Endow. 2017, 10, 1766–1777.
- Hussain, S.; Mahmud, U.; Yang, S. Car e-Talk: An IoT-enabled Cloud-Assisted Smart Fleet Maintenance System. IEEE Internet Things J. 2021, 8, 9484–9494.
- Silva, M.; Signoretti, G.; Andrade, P.; Silva, I.; Ferrari, P. Towards a customized vehicular maintenance based on 2-layers data-stream application. In Proceedings of the 2021 IEEE International Workshop on Metrology for Automotive (MetroAutomotive), Bologna, Italy, 1–2 July 2021; pp. 193–198.
- Silva, M.; Vieira, E.; Signoretti, G.; Silva, I.; Silva, D.; Ferrari, P. A Customer Feedback Platform for Vehicle Manufacturing Compliant with Industry 4.0 Vision. Sensors 2018, 18, 3298.
- Wilhelm, E.; Siegel, J.; Mayer, S.; Sadamori, L.; Dsouza, S.; Chau, C.K.; Sarma, S. Cloudthink: A scalable secure platform for mirroring transportation systems in the cloud. Transport 2015, 30, 320–329.
- Pillmann, J.; Wietfeld, C.; Zarcula, A.; Raugust, T.; Alonso, D.C. Novel common vehicle information model (CVIM) for future automotive vehicle big data marketplaces. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; pp. 1910–1915.
- Rehrl, K.; Henneberger, S.; Leitinger, S.; Wagner, A.; Wimmer, M. Towards a National Floating Car Data Platform for Austria. In Proceedings of the 25th World Congress on Intelligent Transportation Systems (ITS), Copenhagen, Denmark, 17–21 September 2018; pp. 1–10.
- Xiao, Z.; Li, F.; Wu, R.; Jiang, H.; Hu, Y.; Ren, J.; Cai, C.; Iyengar, A. TrajData: On Vehicle Trajectory Collection With Commodity Plug-and-Play OBU Devices. IEEE Internet Things J. 2020, 7, 9066–9079.
- Liu, N. Internet of Vehicles: Your Next Connection. Available online: https://www.huawei.com/mediafiles/CORPORATE/PDF/Magazine/WinWin/HW_110848.pdf (accessed on 20 October 2021).
- Miche, M.; Bohnert, T.M. The Internet of Vehicles or the Second Generation of Telematic Services. ERCIM News 2009, 77, 43–45.
- Bonomi, F. The Smart and Connected Vehicle and the Internet of Things. In Proceedings of the Workshop on Synchronization in Telecommunication Systems, San Jose, CA, USA, 16–18 April 2013.
- Contreras-Castillo, J.; Zeadally, S.; Guerrero-Ibanez, J.A. Internet of Vehicles: Architecture, Protocols, and Security. IEEE Internet Things J. 2018, 5, 3701–3709.
- Zubie Platform Documentation. Available online: https://zubie.com/developer/ (accessed on 17 December 2020).
- Vinli Services Documentation. Available online: http://docs.vin.li/en/latest/ (accessed on 17 December 2020).
- Munic Documentation. Available online: https://store.munic.io/documentations/get_started (accessed on 17 December 2020).
- Otonomo Platform. Available online: https://otonomo.io/ (accessed on 17 December 2020).
- Caruso Platform. Available online: https://www.caruso-dataplace.com/ (accessed on 17 December 2020).
- Smartcar Platform. Available online: https://smartcar.com/ (accessed on 17 December 2020).
- Mercedes Benz API Platform. Available online: https://developer.mercedes-benz.com/products (accessed on 17 January 2020).
- BMW CarData. Available online: https://bmw-cardata.bmwgroup.com/thirdparty/public/car-data/overview (accessed on 17 December 2020).
- Ford Connected Vehicle API. Available online: https://developer.ford.com/fordconnect (accessed on 17 December 2020).
- PSA B2B Web API. Available online: https://developer.groupe-psa.io/webapi/b2b/overview/about/ (accessed on 17 December 2020).
- Foster, I.; Koscher, K. Exploring Controller Area Networks. Login Usenix Mag. 2015, 40, 6–10.
- Marchetti, M.; Stabili, D. READ: Reverse engineering of automotive data frames. IEEE Trans. Inf. Forensics Secur. 2019, 14, 1083–1097.
- Young, C.; Svoboda, J.; Zambreno, J. Towards Reverse Engineering Controller Area Network Messages Using Machine Learning. In Proceedings of the 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA, 2–16 June 2020; pp. 1–6.
- Shaily, S.; Krishnan, S.; Natarajan, S.; Periyasamy, S. Smart driver monitoring system. Multimed. Tools Appl. 2021, 80, 25633–25648.
- Palomino, J.; Cuty, E.; Huanachin, A. Development of a CAN Bus datalogger for recording sensor data from an internal combustion ECU. In Proceedings of the 2021 IEEE International Workshop of Electronics, Control, Measurement, Signals and Their Application to Mechatronics (ECMSM), Liberec, Czech Republic, 21–22 June 2021; pp. 1–4.
- Lindstrom, P.; Isenburg, M. Fast and efficient compression of floating-point data. IEEE Trans. Vis. Comput. Graph. 2006, 12, 1245–1250.
- Pelkonen, T.; Franklin, S.; Teller, J.; Cavallaro, P.; Huang, Q.; Meza, J.; Veeraraghavan, K. Gorilla: A Fast, Scalable, in-Memory Time Series Database. Proc. VLDB Endow. 2015, 8, 1816–1827.
- Deutsch, L.P. GZIP File Format Specification Version 4.3. RFC 1952. 1996. Available online: https://datatracker.ietf.org/doc/html/rfc1952 (accessed on 20 October 2021).
- Seward, J. bzip2 Homepage. Available online: https://sourceware.org/bzip2/ (accessed on 30 August 2021).
- Pavlov, I. LZMA Software Development Kit (SDK). Available online: https://www.7-zip.org/sdk.html (accessed on 30 August 2021).
- Collet, Y. LZ4 Frame Format Description. Available online: https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md (accessed on 30 August 2021).
- Alakuijala, J.; Farruggia, A.; Ferragina, P.; Kliuchnikov, E.; Obryk, R.; Szabadka, Z.; Vandevenne, L. Brotli: A General-Purpose Data Compressor. ACM Trans. Inf. Syst. 2018, 37.
- Collet, Y.; Kucherawy, M. Zstandard Compression and the ‘Application/zstd’ Media Type. Available online: https://www.rfc-editor.org/rfc/rfc8878.txt (accessed on 20 October 2021).
- Deutsch, P.; Gailly, J.-L. ZLIB Compressed Data Format Specification Version 3.3. Available online: https://www.rfc-editor.org/rfc/rfc1950.txt (accessed on 20 October 2021).
- Signoretti, G.; Silva, M.; Andrade, P.; Silva, I.; Sisinni, E.; Ferrari, P. An Evolving TinyML Compression Algorithm for IoT Environments Based on Data Eccentricity. Sensors 2021, 21, 4153.
- Golestan, K.; Soua, R.; Karray, F.; Kamel, M.S. Situation awareness within the context of connected cars: A comprehensive review and recent trends. Inf. Fusion 2016, 29, 68–83.
- Road Vehicles—Extended Vehicle (ExVe) Web Services. Standard, International Organization for Standardization, Geneva, CH. 2019. Available online: https://www.iso.org/standard/66978.html (accessed on 28 September 2021).
- Carloop Documentation. Available online: https://carloop.readme.io/docs (accessed on 13 October 2021).
- Macchina Documentation. Available online: https://docs.macchina.cc/ (accessed on 13 October 2021).
- Ferrari, P.; Sisinni, E.; Bellagente, P.; Depari, A.; Flammini, A.; Pasetti, M.; Rinaldi, S. Experimental characterization of an IoV framework leveraging mobile wireless technologies. In Proceedings of the 2021 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Glasgow, UK, 17–20 May 2021; pp. 1–6.
- Finck, M.; Pallas, F. They who must not be identified-distinguishing personal from non-personal data under the GDPR. Int. Data Priv. Law 2020, 10, 11–36.
- Forgó, N.; Hänold, S.; Schütze, B. The principle of purpose limitation and big data. In New Technology, Big Data and the Law; Springer: Singapore, 2017; pp. 17–42.
- Gruschka, N.; Mavroeidis, V.; Vishi, K.; Jensen, M. Privacy Issues and Data Protection in Big Data: A Case Study Analysis under GDPR. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 5027–5033.
- Patrick, L.; Martin, W. Volkswagen Infotainment Web Interface Protocol Specification (Viwi Protocol). W3c Member Submission, W3C. 2019. Available online: https://www.w3.org/Submission/viwi-protocol/ (accessed on 25 January 2021).
- Protocol Buffers Homepage. Available online: https://developers.google.com/protocol-buffers/ (accessed on 19 October 2021).
- FastAPI Homepage. Available online: https://fastapi.tiangolo.com/ (accessed on 18 October 2021).
- Google App Engine Documentation. Available online: https://cloud.google.com/appengine (accessed on 29 December 2020).
- Google Datastore Documentation. Available online: https://cloud.google.com/datastore (accessed on 29 December 2020).
- Google Tasks Documentation. Available online: https://cloud.google.com/tasks (accessed on 29 December 2020).
- Google Pub/Sub Documentation. Available online: https://cloud.google.com/pubsub (accessed on 29 December 2020).
- Google Compute Engine Documentation. Available online: https://cloud.google.com/compute (accessed on 25 January 2021).
- Google Cloud Storage Documentation. Available online: https://cloud.google.com/storage (accessed on 29 December 2020).
- Shapiro, M.; Preguiça, N.; Baquero, C.; Zawirski, M. Conflict-Free Replicated Data Types. In Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS 2011), Grenoble, France, 10–12 October 2011; pp. 386–400.
- Google Datastore—Features Documentation. Available online: https://cloud.google.com/datastore/docs/firestore-or-datastore (accessed on 25 January 2021).
- Road Vehicles—Vehicle Identification Number (VIN)—Content and Structure. Standard, International Organization for Standardization, Geneva, CH. 2009. Available online: https://www.iso.org/standard/52200.html (accessed on 28 September 2021).
- Google Datastore—Best Practices Documentation. Available online: https://cloud.google.com/datastore/docs/best-practices (accessed on 18 January 2021).
- Zhang, T.; Zuck, A.; Porter, D.E.; Tsafrir, D. Apps Can Quickly Destroy Your Mobile’s Flash: Why They Don’t, and How to Keep It That Way. In Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys), Seoul, Korea, 17–21 June 2019; pp. 207–221.
- Rizzato, F. Germany’s Rural 4G Users Still Spend One-Fourth of Their Time on 3G and 2G Networks. Available online: https://www.opensignal.com/blog/2019/06/13/germanys-rural-4g-users-still-spend-one-fourth-of-their-time-on-3g-and-2g-networks (accessed on 19 January 2021).
- Costa, B.G.; Reis, M.A.S.; Araújo, A.P.; Solis, P. Performance and cost analysis between on-demand and preemptive virtual machines. In Proceedings of the 8th International Conference on Cloud Computing and Services Science (CLOSER), Funchal, Portugal, 19–21 March 2018; pp. 169–178.
- Frenken, K.; Schor, J. Putting the sharing economy into perspective. Environ. Innov. Soc. Transit. 2017, 23, 3–10.
- Cloud Monitoring Homepage. Available online: https://cloud.google.com/monitoring (accessed on 19 October 2021).
- PSA Monitors. Available online: https://developer.groupe-psa.io/webapi/b2b/monitor/about/ (accessed on 17 December 2020).
- Martin, D.; Kühl, N.; Satzger, G. Virtual Sensors. Bus. Inf. Syst. Eng. 2021, 63, 315–323.
- Ko, J.; Lee, B.B.; Lee, K.; Hong, S.G.; Kim, N.; Paek, J. Sensor Virtualization Module: Virtualizing IoT Devices on Mobile Smartphones for Effective Sensor Data Management. Int. J. Distrib. Sens. Netw. 2015, 11, 730762.
- Madria, S.; Kumar, V.; Dalvi, R. Sensor Cloud: A Cloud of Virtual Sensors. IEEE Softw. 2014, 31, 70–77.
- Guo, L.; Dong, M.; Ota, K.; Li, Q.; Ye, T.; Wu, J.; Li, J. A Secure Mechanism for Big Data Collection in Large Scale Internet of Vehicle. IEEE Internet Things J. 2017, 4, 601–610.
- Nelson, B.; Olovsson, T. Introducing Differential Privacy to the Automotive Domain: Opportunities and Challenges. In Proceedings of the 2017 IEEE 86th Vehicular Technology Conference (VTC-Fall), Toronto, ON, Canada, 24–27 September 2017; pp. 1–7.
- Wallace, B.; Goubran, R.; Knoefel, F.; Marshall, S.; Porter, M.; Harlow, M.; Puli, A. Automation of the Validation, Anonymization, and Augmentation of Big Data from a Multi-year Driving Study. In Proceedings of the 2015 IEEE International Congress on Big Data, New York, NY, USA, 27 June–2 July 2015; pp. 608–614.
- Zhao, P.; Zhang, G.; Wan, S.; Liu, G.; Umer, T. A survey of local differential privacy for securing internet of vehicles. J. Supercomput. 2020, 76, 8391–8412.
- Barati, M.; Rana, O. Tracking GDPR Compliance in Cloud-based Service Delivery. IEEE Trans. Serv. Comput. 2020.
Work | Demand-Driven Data Acquisition | Multi Tenancy | Vehicle External Data Storage | Vehicle External Data Streaming | Conditional Data Acquisition | Individual Vehicle Addressing | Consistency Guarantees | Data Compression | Native Integration | |
---|---|---|---|---|---|---|---|---|---|---|
[16] | No | Yes | Yes | Yes | No | No | No | No | No | |
[17] | No | No | No | No | Yes | No | No | No | No | |
[18] | No | No | Yes | Yes | No | No | No | Yes | No | |
[19] | No | No | Yes | No | No | No | No | No | No | |
[20] | No | No | Yes | No | No | No | No | No | No | |
[21] | No | No | Yes | No | No | No | No | No | No | |
[22] | No | Yes | Yes | No | No | No | No | Yes | No | |
[23] | No | Yes | Yes | Yes | No | No | No | Yes | Yes | |
[24] | No | Yes | Yes | No | No | No | No | No | No | |
[25] | No | Yes | Yes | No | No | No | No | No | No | |
[26] | No | Yes | Yes | No | No | No | No | No | Assumed | |
[27] | No | Yes | No | Yes | No | No | No | No | Assumed | |
[28] | No | Yes | Yes | No | No | Yes | No | No | Assumed | |
[29] | No | No | Yes | No | Yes | Yes | No | No | Assumed | |
[30] (*) | No | Yes | Yes | No | No | No | - | - | No | |
[31] (*) | No | Yes | Yes | No | No | No | - | - | No | |
[32] (*) | No | Yes | Yes | Yes | No | No | - | - | No | |
[33] (*) | No | Yes | Yes | Yes | No | - | - | - | Yes | |
[34] (*) | No | Yes | Yes | Yes | No | - | - | - | Yes | |
[35] (*) | No | Yes | Yes | Yes | No | - | - | - | Yes | |
[36] (*) | No | Yes | Yes | No | No | - | - | - | Yes | |
[37] (*) | No | Yes | Yes | Yes | No | - | - | - | Yes | |
[38] (*) | No | Yes | - | - | No | - | - | - | Yes | |
[39] (*) | No | Yes | Yes | Yes | No | - | - | - | Yes | |
[14] | No | Yes | Yes | No | - | Yes | No | - | No | |
[15] | No | No | No | No | No | Yes | No | No | No | |
Our Proposal | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Requirement | Realization | Section |
---|---|---|
Personal data may only be processed if prior consent is given. | The data collection software natively integrated with the vehicle does not collect and transmit any data by default. An individual vehicle only performs data collection if a consent object exists within our system associated with the vehicle. | Section 3.2 |
The collection of personal data may not exceed the scope of the associated consent. | The data subject consents to grant the data consumer access to an immutable set of sensors (project). Subsequent data acquisition by the data consumer cannot exceed this scope. | Section 3.2 |
If consent was revoked, all associated data must be deleted from our system. | Personal data stored on our system is linked to the associated consent. If consent is revoked, we automatically delete the associated data. | Section 4.8 |
If consent was revoked, all associated data must be deleted from the infrastructure of the data consumer. | We notify the data consumer about the revocation via a message-based interface. The data provided to the data consumer always references the associated consent. Thus, the data consumer can carry out a targeted deletion on its own infrastructure. | Section 3.5 |
If consent was revoked, associated data acquisition must stop; no further data may be collected. | Each vehicle communicates continuously with our cloud platform so that individual data acquisitions stop once the associated consent is revoked. Upon receipt, the cloud platform discards any data that the vehicle transferred before the revocation was propagated to it. | Section 4.3 and Section 4.8 |
No more data may be collected than necessary (data minimization). | Each vehicle only transmits data from the sensors required to fulfill the demand of those data consumers for which associated consent exists. Data consumers specify demand according to their needs and can adjust the scope of data acquisition even after consent is granted. | Section 3.2 |
Primitive | Preprocessing | Average Size | Ratio |
---|---|---|---|
Timestamp 1 | None | 5563 Byte | - |
Timestamp | Delta + VLI | 1748 Byte | 3.18 |
Timestamp | DOD + VLI | 1518 Byte | 3.66 |
Integer 2 | None | 1069 Byte | - |
Integer | VLI | 445 Byte | 2.40 |
Integer | Delta + VLI | 284 Byte | 3.76 |
Integer | DOD + VLI | 284 Byte | 3.76 |
Float 3 | None | 7050 Byte | - |
Float | F2I + VLI | 3239 Byte | 2.18 |
Float | F2I + Delta + VLI | 2228 Byte | 3.16 |
Float | F2I + DOD + VLI | 2243 Byte | 3.14 |
String 4 | None | 120 Byte | - |
String | Dictionary + VLI | 34 Byte | 3.53 |
Algorithm | Ratio (No Preprocessing) | Ratio (Preprocessing Applied) | Time (ms) |
---|---|---|---|
None | 1.00 | 2.99 | - |
Gzip [47] | 1.84 | 4.82 | 38.08 |
bzip2 [48] | 1.87 | 5.05 | 21.71 |
LZMA [49] | 2.68 | 5.36 | 121.32 |
LZ4 [50] | 1.43 | 4.05 | 82.00 |
Brotli [51] | 2.60 | 5.59 | 762.41 |
zstd [52] | 1.97 | 5.01 | 377.04 |
zlib [53] | 1.84 | 4.83 | 38.92 |
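The effect of combining preprocessing with a general-purpose compressor can be reproduced qualitatively with Python's standard library. The sketch below is an illustration only: it uses synthetic timestamps with a fixed random seed rather than vehicle data, Delta encoding with a zigzag varint as a stand-in for the preprocessing, and `zlib` as the compressor, so its sizes are not comparable to the ratios in the table. The helper `zigzag_varint` and the chosen interval and jitter are assumptions.

```python
import random
import struct
import zlib


def zigzag_varint(n):
    """Encode one signed integer as a zigzag base-128 varint."""
    u = 2 * n if n >= 0 else -2 * n - 1
    out = bytearray()
    while u >= 0x80:
        out.append((u & 0x7F) | 0x80)
        u >>= 7
    out.append(u)
    return bytes(out)


# Synthetic millisecond timestamps: about 100 ms apart with small jitter.
rng = random.Random(0)
timestamps = [1_600_000_000_000]
for _ in range(999):
    timestamps.append(timestamps[-1] + 100 + rng.randint(-3, 3))

# Baseline: fixed-width 64-bit integers.
raw = b"".join(struct.pack(">q", t) for t in timestamps)

# Preprocessing: Delta encoding followed by variable-length integers.
deltas = [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]
preprocessed = b"".join(zigzag_varint(d) for d in deltas)

raw_compressed = zlib.compress(raw)
pre_compressed = zlib.compress(preprocessed)

print(len(raw), len(raw_compressed), len(preprocessed), len(pre_compressed))
```

On this synthetic series, the compressor applied to the preprocessed bytes yields a smaller result than the compressor applied to the raw bytes, because the preprocessing turns slowly growing 64-bit values into a small alphabet of short, repetitive symbols.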
Infrastructure | Mean Latency (ms) | 99th Percentile (ms) | 95th Percentile (ms) | Std. Dev. (ms) |
---|---|---|---|---|
Backend | 221 | 471 | 230 | 370 |
Streaming | 192 | 1220 | 350 | 263 |
Trip Files | 11329 | 14020 | 12997 | 880 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Matesanz, P.; Graen, T.; Fiege, A.; Nolting, M.; Nejdl, W. Demand-Driven Data Acquisition for Large Scale Fleets. Sensors 2021, 21, 7190. https://doi.org/10.3390/s21217190