The Endgame bottleneck: historical storage

Polynya
Feb 27, 2022

Currently, there’s a clear bottleneck at play with monolithic blockchains: state growth. The direct solutions to this are statelessness, validity proofs and state expiry. We’ll see rollups adopt similar solutions, with the unique advantage of high-frequency state expiry, as they can simply reconstruct state from the base layer. Once rollups are free of the state growth bottleneck, they are primarily bound by data capacity on the base layer. To be clear, even a perfectly implemented rollup will still have limits, but these are very high, and there can be multiple rollups — and I very much expect composability across rollups (at least those sharing proving systems) to be possible by the time those limits are hit.

Consider Ethereum — with danksharding, there’s going to be ample data capacity available for rollups to settle on. Because rollups with compression tech are incredibly efficient with data — 10x-100x more so than monolithic L1s — they can get a great deal out of this. It’s fair to say there’s going to be enough space on Ethereum rollups to conduct all transactions of value at a global scale.

Eventually, as we move to a PBS + danksharding model, the bottleneck appears to be bandwidth. However, with distributed PBS systems possible, even that is alleviated. The bandwidth required for each validator will always be quite low.

The Endgame bottleneck, thus, becomes storage of historical data. With danksharding, validators are expected to store the data they come to consensus on and guarantee its availability for only a few months. Beyond that, the data expires and transitions to a 1-of-N trust model — i.e. only one honest copy of the data needs to remain retrievable. It’s important to note that this is sequential data, which can be stored on very cheap HDDs (as opposed to the SSDs or RAM required for blockchain state). It’s also important to note that Ethereum has already come to consensus on this data, so retrieving it later is an entirely different, and much weaker, requirement than data availability at consensus time.

Now, this is not a big deal. Vitalik has covered many possibilities for who can store history, and the chance that 100% of these fail is minuscule:

Source: “A step-by-step roadmap for scaling rollups with calldata expansion and sharding” — HackMD (ethereum.org). Note: some of the information in that post is now outdated.
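To make the 1-of-N argument concrete, here’s a toy calculation; the failure probabilities are made up purely for illustration and are not from Vitalik’s post:

```python
# Toy illustration of the 1-of-N trust model: history is lost only if
# *every* independent storage option fails. The failure probability is
# an assumption for illustration, deliberately pessimistic.
per_option_failure = 0.10  # assume each option loses the data 10% of the time

for n_options in (1, 3, 5, 8):
    p_all_fail = per_option_failure ** n_options
    print(f"{n_options} independent options -> P(history lost) = {p_all_fail:.0e}")

# With 8 independent options, even at 10% failure each, the chance of
# losing history is about 1 in 100 million.
```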

I’d also add to this list that each individual user can simply store their own relevant data — even for the most ardent DeFi degen, it’ll be no bigger than a backup of your important documents. Or pay a service (decentralized or centralized) to do it. Sidenote: idea for a decentralized protocol — you enter your Ethereum address, and it collects all relevant data and stores it for a nominal fee (a rough sketch follows below).
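As a minimal sketch of that idea, the snippet below naively scans blocks for transactions touching a given address using web3.py. The function name, RPC endpoint and block range are hypothetical; a real service would use an indexer and also capture event logs rather than this brute-force loop:

```python
from web3 import Web3


def archive_address_history(w3: Web3, address: str, start_block: int, end_block: int) -> list[dict]:
    """Collect every transaction sent or received by `address` in a block range.

    Brute-force scan for illustration only; a real archival service would
    rely on indexed data and also store event logs (token transfers etc.).
    """
    address = Web3.to_checksum_address(address)
    history = []
    for number in range(start_block, end_block + 1):
        block = w3.eth.get_block(number, full_transactions=True)
        for tx in block.transactions:
            # `to` is None for contract creations, which is fine to compare against.
            if tx["from"] == address or tx["to"] == address:
                history.append(dict(tx))
    return history


# Hypothetical usage:
# w3 = Web3(Web3.HTTPProvider("https://example-rpc-endpoint"))
# my_history = archive_address_history(w3, "0xYourAddress", 14_000_000, 14_100_000)
```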

That said, you can’t go nuts — at some point there’s too much data and the probability of a missing byte somewhere increases. Currently, danksharding is targeting 65 TB/year. Thanks to the incredible data efficiency of rollups — 100x more than monolithic L1s for optimized rollups — we can get ample capacity for all valuable transactions at a global scale. I’ll once again note that because rollups transmute complex state into sequential data, IOPS is no longer the bottleneck — it’s purely hard drive capacity.
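To put “65 TB/year of sequential data” in perspective, here’s a quick back-of-envelope on the sustained write rate; the HDD throughput figure is a typical spec-sheet number, not from the post:

```python
# 65 TB/year of append-only history is a trivial sequential write rate
# for a single hard drive; capacity is the constraint, not speed.
SECONDS_PER_YEAR = 365 * 24 * 3600
TB_PER_YEAR = 65

mb_per_second = TB_PER_YEAR * 1_000_000 / SECONDS_PER_YEAR  # 1 TB = 1,000,000 MB
print(f"~{mb_per_second:.1f} MB/s sustained")  # ~2.1 MB/s

# A typical consumer HDD sustains roughly 150-250 MB/s sequentially,
# two orders of magnitude above what's needed here.
```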

This amount of data can be stored by any individual at a cost of $1,200/year with RAID1 redundancy on hard drives. (Addendum: another option is LTO tapes, which can be less than half the price.) I think this is very conservative — and if no one else will store it, I certainly will! As storage gets cheaper over time — per Wright’s Law — this ceiling can keep increasing. I fully expect that by the time danksharding rolls out, storage will be cheaper and there’ll be more clarity around historical storage, letting us push higher, to >100 TB/year.
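For what it’s worth, here’s how that $1,200/year figure pencils out; the per-terabyte price is my assumption, chosen to roughly reproduce the quoted cost, not a number from the post:

```python
# Back-of-envelope for the ~$1,200/year storage cost under RAID1.
# The $/TB figure is an assumed bulk hard drive price, not a quoted one.
TB_PER_YEAR = 65      # danksharding's targeted history growth
RAID1_COPIES = 2      # mirroring: every byte is stored twice
USD_PER_TB = 9        # assumed hard drive price per terabyte

annual_cost = TB_PER_YEAR * RAID1_COPIES * USD_PER_TB
print(f"~${annual_cost:,}/year in new drives")  # ~$1,170/year
```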

My preference would be simply enshrining an “Ethereum History Network” protocol, perhaps building on the work of and/or collaborating with Portal Network, Filecoin, Arweave, The Graph, Swarm, BitTorrent, IPFS and others. It’s a very, very weak trust assumption — just 1-of-N — so it can be made watertight pretty easily with, say, 1% of ETH issuance used to secure it. The more decentralized this network gets, the more capacity it can safely offer. Altair already implemented accounting changes to how rewards are distributed, so adding such an incentive shouldn’t be an issue. By doing this, I believe we can easily push much higher — into the petabyte realm.

Even with the current limit, like I said, I believe danksharding will enable enough capacity on Ethereum rollups for all valuable transactions at global scale. Firstly, it’s not clear to me that this “web3”/“crypto experiment” even has enough demand to saturate danksharding! It’ll offer scale 150x higher than the activity of the entire blockchain industry combined today. Is there going to be 150x higher demand in a couple of years’ time? Who knows, but let’s assume there is, and that even the mighty danksharding is saturated. This is where alt-DA networks like Celestia, zkPorter and Polygon Avail (and whatever’s being built for StarkNet) come into play: offering validiums effectively limitless scale for low/no-value transactions. As we have seen with the race to the bottom in alt-L1 land, I’m sure an alt-DA network will pop up offering petabytes of data capacity — effectively scaling to billions of TPS immediately. Obviously, validiums offer much weaker security guarantees than rollups, but that’s a reasonable trade-off for lower-value transactions. There’ll also be a spectrum of options among the alt-DA solutions. Lastly, you have all sorts of data that don’t need consensus — those can go straight to IPFS or Filecoin or whatever.

Of course, I’m looking several years down the line. Rollups are maturing rapidly, but we still have many months of intense development ahead of us. Eventually, though, we’re headed to a point where historical storage becomes the primary bottleneck.


Polynya

Rants and musings on blockchain tech. All content here is in the public domain; please feel free to share/adapt/republish.