DOI: 10.1145/3565011.3569060
research-article
Open access

Towards a decentralized infrastructure for data marketplaces: narrowing the gap between academia and industry

Published: 06 December 2022

Abstract

One big challenge for Industry 4.0 is leveraging the large amounts of data that remain unused after collection. A variety of commercial data marketplaces have emerged in recent years to tackle this task. Despite their different business models and target markets, such marketplaces share a number of common issues that slow the growth of the industry, including data discovery, transparency, data privacy and data valuation. Many academic designs have been proposed to address these issues, yet most of them remain unimplemented due to their complexity or inefficiency.
We argue that these issues can be addressed with a combination of blockchain-based infrastructure, privacy-preserving computing and machine learning-based valuation metrics. Furthermore, we discuss key enabling technologies in each of these areas that are feasible to deploy at scale and could thus be implemented in real-world marketplaces in the near future. We select such technologies based on their current maturity and their industrial prominence.

Published In

DE '22: Proceedings of the 1st International Workshop on Data Economy
December 2022
70 pages
ISBN:9781450399234
DOI:10.1145/3565011
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • European Union

Conference

CoNEXT '22