Samsung is releasing 512 GB DDR5 RAM modules — how this can supercharge zk-rollups
This is a fun post with wild speculation; please do not take it seriously.
One of the magical aspects of zk-rollups (zkRs) is that you only need one sequencer and one prover live at any given time. To attain censorship resistance and liveness resilience, we’re definitely going to need more than one, but a handful will do. So, zkRs can have very hefty system requirements. Moreover, the burden of being able to sync from genesis is unnecessary, as the entire state is fully verified and can be reconstructed directly from L1. Overall, zkRs can offer far higher security guarantees than an L1, despite requiring much higher system specs. (Addendum: we’ll need light unassisted withdrawals to make this bulletproof.)
Today, it’s well known that the primary bottleneck for all blockchain full node clients is disk IOPS. To run Geth, you need at least 5,000 r/w IOPS to reliably sync and keep up with the chain. Budget SSDs today are capable of over 100,000 IOPS, and Erigon claims to be 10x more efficient than Geth, which would make it capable of thousands of TPS on a consumer SSD already.
Now, here’s where the exciting new tech enters the fray: Samsung is releasing 512 GB DDR5 modules. We know the next-generation Xeon and EPYC CPUs will support 8 memory channels, which means they can accept 16 memory modules at two per channel. That’s an eye-watering 8 TB of RAM possible! Or, at least, 4 TB with one module per channel! Within this 4 TB, you can easily fit billions of transactions. Yes, this machine will probably cost $20,000-$30,000, but for a zkR processing thousands of TPS it could be economically sustainable. I’d also note that prover costs will continue going down, and once there’s enough activity, they’ll be negligible compared to the cost of processing transactions, let alone the gas paid to L1.
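The capacity figures above are simple multiplication, and a quick sketch makes them easy to check. The module size and channel count come from this post; the two-modules-per-channel layout and the 100-byte transaction size are assumptions picked purely for illustration, not measured values.

```python
# Back-of-the-envelope RAM capacity for the machine described above.
MODULE_GB = 512   # from the post: 512 GB DDR5 modules
CHANNELS = 8      # from the post: 8 memory channels
TX_BYTES = 100    # assumption: illustrative size of one stored transaction


def capacity_gb(dimms_per_channel: int) -> int:
    """Total RAM in GB for a given number of modules per channel."""
    return MODULE_GB * CHANNELS * dimms_per_channel


def transactions_that_fit(total_gb: int, tx_bytes: int = TX_BYTES) -> int:
    """How many fixed-size transactions fit in total_gb of RAM."""
    return total_gb * 1024**3 // tx_bytes


for dpc in (1, 2):
    gb = capacity_gb(dpc)
    billions = transactions_that_fit(gb) / 1e9
    print(f"{dpc} module(s) per channel: {gb // 1024} TB, ~{billions:.0f} billion txs")
```

Even if the assumed transaction size is off by 10x, the 4 TB configuration still holds billions of transactions.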
Now, back to IOPS. We know DDR5 modules run at 7.2 GT/s; across 8 channels, at 8 bytes per transfer, this is an insane 460 GB/s of memory bandwidth. While it’s difficult to calculate IOPS at this early stage, it’s fair to assume we’ll see something like 10–50 million IOPS.
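For anyone who wants to check the bandwidth figure, here is a minimal sketch of the arithmetic. It assumes each channel moves 64 bits (8 bytes) of data per transfer, which is the standard DDR data width; the function name is made up for this example. This is a theoretical peak, not a sustained or random-access number.

```python
# Theoretical peak memory bandwidth for the setup described above.
TRANSFERS_PER_SEC = 7.2e9   # from the post: 7.2 GT/s
BYTES_PER_TRANSFER = 8      # assumption: 64-bit data path per channel
CHANNELS = 8                # from the post: 8 memory channels


def peak_bandwidth_gb_s(transfers_per_sec: float,
                        bytes_per_transfer: int,
                        channels: int) -> float:
    """Peak bandwidth in GB/s (decimal gigabytes)."""
    return transfers_per_sec * bytes_per_transfer * channels / 1e9


bw = peak_bandwidth_gb_s(TRANSFERS_PER_SEC, BYTES_PER_TRANSFER, CHANNELS)
print(f"Peak bandwidth: {bw:.1f} GB/s")
```

Random-access IOPS will land well below what this peak suggests, since latency rather than raw bandwidth tends to dominate small reads.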
At this sort of memory throughput and random I/O, assuming no other bottlenecks, one zkR can easily do millions of TPS. But, of course, there will be other bottlenecks. If the state largely lives in DDR5 RAM, it’s fair to say the CPU (or GPU) will become the bottleneck, or the VM itself. I have no idea which, but it’s clear that there’s plenty of headroom from where we stand currently. Obviously, CPUs, GPUs and VMs will continue to improve over time, as will client efficiency. Of course, in the short term, the real bottleneck is data availability, though data shards significantly alleviate that.
Of course, this approach will need to be combined with frequent state expiry. The magic of zkRs is that you don’t need to worry about state expiry infrastructure: it already exists on L1! With advanced solutions like shard and history access precompiles, the zkR full node can quickly reconstruct the necessary state.
The biggest drawback of this approach is that RAM, unlike SSD, is volatile memory, so if the system shuts down the node will have to sync from scratch. Fortunately, this is not that big a deal with the above-mentioned infrastructure in place, along with frequent snapshots.
Finally, optimistic rollups can’t push things that far, because they still require at least one honest participant to be able to run a full node, so we’ll need to keep their system requirements in check. Realistically, though, by the time such throughput is required, almost all rollups will be zkRs.
Tl;dr: After data shards are released, it’ll be quite possible to have uber-zkRs that can do hundreds of thousands of TPS, and potentially millions over the long term. And yes, each of these uber-zkRs will maintain full composability across multiple data shards. And no, L1s will never be able to scale this far due to the hefty burden of running consensus security. Zero-knowledge proofs and zkRollups are inevitable.