Web3 is many things, but it isn't easy. That's particularly true for developers, who are constantly faced with the seemingly impossible task of optimizing for opposing outcomes simultaneously, such as creating blockchain applications that are highly decentralized, with data stored on as many nodes as possible, yet also faster than anything that came before.
It's the equivalent of being asked to design an automobile that's both lightweight and ultra-comfortable. Enhancing one attribute typically comes at the expense of the other, and vice versa. But web3 developers didn't enter the industry to live out their days on easy mode, coding barely improved iterations of existing solutions. Ambitious devs relish the challenge of accelerating the capabilities of blockchain technology, and it doesn't get much tougher than cracking the decentralized storage trilemma.
This little-known but highly instructive rule holds that it is extremely difficult to engineer a web3 storage layer that simultaneously achieves all three of the following: scalability, random access, and smart contract integration. Some explanation is required to establish why this conundrum holds sway and how it might be solved. So let's elaborate.
Objective 1: Scalability
While the original blockchain trilemma, describing the difficulty of optimizing a network for scalability, security, and decentralization, was coined by Vitalik Buterin, it's less clear who conceived its blockchain storage counterpart. Regardless of its originator, it follows the same pick-two-of-three formula: achieving scalability, random access, and smart contract integration simultaneously is extremely hard.
The easiest of these concepts to visualize is scalability: the ability of web3 storage layers to scale to hold increasingly vast amounts of data. In theory, there are no upper limits on the amount of data that existing web3 storage layers can hold; the difficulty is doing so without introducing latency.
In decentralized storage systems, data is typically replicated across multiple nodes to ensure redundancy and availability. To reach exabyte-scale storage, the system must minimize the overhead caused by data duplication without compromising on reliability. As a storage network grows in size, more nodes will need to handle increasing amounts of data and transactions. However, high network latency and limited bandwidth between nodes can slow data retrieval and increase costs. See, no one said this stuff was easy.
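One way to quantify the duplication overhead described above is to compare full replication with erasure coding, the two common redundancy strategies in decentralized storage networks. The sketch below uses illustrative numbers chosen for the example, not figures from any real network:

```python
# Sketch: storage overhead of full replication vs erasure coding.
# All figures here are illustrative assumptions, not measurements.

def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte with full replication."""
    return float(copies)

def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes per logical byte with (k data + m parity) erasure coding.
    Any k of the k+m shards suffice to reconstruct the original data."""
    return (data_shards + parity_shards) / data_shards

# 3x replication: survives the loss of any 2 copies, costs 3 bytes per byte.
print(replication_overhead(3))            # 3.0
# 10+4 Reed-Solomon-style coding: survives the loss of any 4 shards, costs 1.4x.
print(round(erasure_overhead(10, 4), 2))  # 1.4
```

The trade-off is that erasure coding cuts the redundancy bill dramatically at exabyte scale, but reconstructing data requires contacting several nodes, which is exactly where the latency and bandwidth concerns above come into play.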
Objective 2: Random access
Random access refers to the ability to retrieve any piece of data, almost instantly, from wherever it happens to be stored on a decentralized network. Because web3 data is often fragmented into chunks and stored across various nodes, efficiently retrieving a specific piece of data is difficult, especially as the size of the dataset grows. The challenge is to maintain fast lookup and retrieval times without requiring centralized indexes or sacrificing decentralization.
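The "fast lookup without a centralized index" property is typically achieved with distributed-hash-table techniques: every node can compute, locally and deterministically, which peer holds a given chunk. Below is a minimal consistent-hash ring sketching that idea, with hypothetical node and chunk names; production systems such as Kademlia-based DHTs are far more elaborate:

```python
import hashlib
from bisect import bisect_right

def ring_position(key: str) -> int:
    # Deterministic position on the ring for any string key.
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring: every participant computes the
    same chunk-to-node mapping locally, with no central lookup table."""

    def __init__(self, nodes):
        self.positions = sorted(ring_position(n) for n in nodes)
        self.owners = {ring_position(n): n for n in nodes}

    def node_for(self, chunk_id: str) -> str:
        # Walk clockwise to the first node at or after the chunk's
        # position, wrapping around the ring if necessary.
        idx = bisect_right(self.positions, ring_position(chunk_id))
        idx %= len(self.positions)
        return self.owners[self.positions[idx]]

# Hypothetical four-node network; any peer derives the same answer.
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.node_for("chunk-42"))
```

Because the mapping is a pure function of the hashes, adding or removing a node only reshuffles the chunks on the neighboring arc of the ring, which is what keeps lookups fast as the dataset grows.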
Engineering decentralized storage systems that support complex querying without introducing centralization is challenging because web3 is bereft of the relational databases and sophisticated indexing systems that are available to web2 platforms. This problem is exacerbated by the fact that consensus mechanisms used to verify data integrity may introduce delays in real-time data access, presenting challenges to dapps that require instantaneous interaction.
Objective 3: Smart contract integration
The final challenge when designing a web3 storage layer is making it work seamlessly with the smart contracts being executed by the dapps operating on the Layer 1 or L2 network. If smart contracts are querying large datasets frequently, the cost of these operations can become prohibitive. Optimization is thus essential to minimize the number of onchain transactions required for accessing data to keep gas fees to a minimum.
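To see why minimizing onchain transactions matters, consider a back-of-the-envelope cost model. The gas figures below are assumptions chosen purely for illustration, not measurements from any particular chain:

```python
# Sketch: per-transaction overhead is what batching eliminates.
# Gas numbers are illustrative assumptions, not real chain figures.

BASE_TX_GAS = 21_000   # assumed fixed overhead per transaction
PER_READ_GAS = 2_600   # assumed cost of one data read

def naive_cost(reads: int) -> int:
    """One transaction per read: pays the base overhead every time."""
    return reads * (BASE_TX_GAS + PER_READ_GAS)

def batched_cost(reads: int) -> int:
    """All reads packed into a single transaction."""
    return BASE_TX_GAS + reads * PER_READ_GAS

print(naive_cost(50))    # 1180000
print(batched_cost(50))  # 151000
```

Under these assumptions, batching 50 reads into one transaction cuts the gas bill by roughly 87%, which is why data layers that let contracts fetch large datasets in few onchain operations have such an advantage.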
When a data layer is able to integrate seamlessly with smart contracts, sharing the same programming language and the ability to route its data through the blockchain's existing RPC nodes, latency is minimized. Dapp developers, meanwhile, can easily tap into vast amounts of data without getting bogged down integrating non-native solutions that require careful configuration and maintenance. When a data layer works natively with smart contracts, everything becomes easier.
How close are we to solving the blockchain storage trilemma?
How much progress web3 data storage layers have made depends largely on the network in question. To reference objective three once again, the ideal data layer is optimized for the programming language and smart contracts of a specific blockchain network. What works for Polkadot, in other words, won't work so well for Ethereum.
On Solana, Xandeum believes it has solved the storage trilemma with its implementation of "Buckets," a decentralized file system connected to special RPC nodes that gives smart contracts access to a virtually unlimited amount of data. The scalability component is covered through the provision of exabytes of storage (100x what current solutions offer), while random access is ensured, a clear advantage over data layers that only provide file-level access.
While Xandeum is an attractive option for Solana developers, what about the web3 projects operating across the rest of the multichain landscape? At this point in time, web3 storage layers are just about up to the task. But in the near future, the rapidly escalating demands of dapps, particularly those addressing AI inferencing and LLMs, will require many times more data than decentralized storage can currently offer.
Decoupling storage from the blockchain layer makes sense, since it allows the blockchain to handle governance, incentives, and metadata while the storage layer focuses purely on managing and retrieving large datasets. The challenge comes when attempting to distribute data across multiple nodes while maintaining availability.
Web3 is constantly seeking the next summit to conquer. In cutting the Gordian knot of decentralized storage, it faces its toughest challenge yet. No one said engineering web3 solutions was easy. But the rewards for meeting the demands of the next generation of decentralized applications will make it all worthwhile.