Studying the workload of a fully decentralized Web3 system: IPFS
Abstract
Web3 aims to create a decentralized platform that is competitive with the modern cloud infrastructures that support today's Internet. However, Web3 is still limited, supporting only applications in the domains of content creation and sharing, decentralized financing, and decentralized communication. This is mainly due to the technologies supporting Web3 (blockchain, IPFS, and libp2p), which, although they provide a good collection of tools for developing Web3 applications, are still limited in terms of design and performance. This motivates the need to better understand these technologies so as to enable novel optimizations that can push Web3 to its full potential. Unfortunately, understanding the current behavior of a fully decentralized large-scale distributed system is a difficult task, as there is no centralized authority with full knowledge of the system's operation. To this end, in this paper we characterize the workload of IPFS, a key enabler of Web3. To achieve this, we collected traces of accesses performed by users to one of the most popular IPFS gateways, located in North America, over a period of two weeks. Through a detailed analysis of these traces, we measured the number of requests to the system and identified the providers of the requested content. With this data, we characterize both the popularity of requested and provided content and their geo-location (by matching IP addresses against the MaxMind database). Our results show that most requests in IPFS target only a few distinct content items, which are provided by a large portion of the peers in the system. Furthermore, our analysis shows that most requests are served by the two largest groups of providers in the system, located in North America and Europe. With these insights, we conclude that the current IPFS architecture is sub-optimal and propose a research agenda for the future.
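As a rough illustration of the kind of popularity characterization the abstract describes, the sketch below counts requests per content identifier and reports what share of all requests goes to the most popular items. It is not the authors' pipeline: the function name `popularity_summary`, the `top_fraction` parameter, and the toy trace are assumptions made for this example, and real gateway logs would first need to be parsed into a list of requested identifiers.

```python
from collections import Counter


def popularity_summary(requested_cids, top_fraction=0.01):
    """Return (distinct_items, share_of_requests_to_top_items).

    requested_cids: iterable of content identifiers, one per request
                    (hypothetical input format, assumed for this sketch).
    top_fraction:   fraction of distinct items treated as "most popular".
    """
    counts = Counter(requested_cids)
    # Request counts per item, most requested first.
    ranked = sorted(counts.values(), reverse=True)
    if not ranked:
        return 0, 0.0
    # Always consider at least one item as "top".
    top_n = max(1, int(len(ranked) * top_fraction))
    total = sum(ranked)
    return len(ranked), sum(ranked[:top_n]) / total


# Toy trace (made up): one item receives 8 of 10 requests.
trace = ["cidA"] * 8 + ["cidB", "cidC"]
distinct, share = popularity_summary(trace, top_fraction=0.34)
print(distinct, share)
```

A skewed workload, like the one the abstract reports, would show up here as a large share concentrated in a small `top_fraction` of items.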
- Publication: arXiv e-prints
- Pub Date: December 2022
- DOI: 10.48550/arXiv.2212.07375
- arXiv: arXiv:2212.07375
- Bibcode: 2022arXiv221207375A
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing