Optimistic rollups are brilliant (and the state of blockchains)
I have never written a post about “ORs vs ZKRs”. They are both great. My writing has been all about being against cripplingly inefficient, unscalable, insecure and irresilient monolithic L1s that we have seen grotesque misallocation of capital towards. My only aim is to see blockchains scale to global ubiquity in a technically, economically, and socially sustainable manner. Not only are monolithic L1s unsustainable, they will actually never be able to offer the throughput required to begin with — and miss the mark by several orders of magnitude. It’s simply not a question anymore— we need modular execution layers & data availability sampling to scale blockchains, P.E.R.I.O.D. It’s never been clearer. The sooner we recognize and pivot to this, the better the blockchain industry will be for it. And yeah, well, this is just, like, my opinion, man. As always, please just consider this one amateur’s stream-of-consciousness rambling, rather than a professional research piece.
Monolithic L1s are (still) cripplingly inefficient
I started writing comments about rollups in 2020, and blog posts around this time last year, when the overwhelming mainstream narrative was “Cardano/<insert alt-L1> smart contracts are imminent, there’ll be a mass exodus from Ethereum to <insert alt-L1> overnight”. Later the narrative was “L2s are temporary band-aid, the only way to scale is L1s”. When put to even the mildest of tests in late 2021, the alt-L1 narratives unraveled amazingly fast (by the way, I’ll define a monolithic L1 as one without validity, fraud, DA proofs, or statelessness):
- Binance Smart Chain’s (no, renaming your chain doesn’t make it decentralized) system requirements ballooned due to state bloat. As a result, nodes started desyncing, leading to many issues. The mistake here was they specified very low requirements when they started — they failed to inform everyone that system requirements grow exponentially with state bloat. Now they are making reckless changes without adequate auditing or testing. Sure, switching to Erigon and having multiple chains will offer incremental benefits, but there’ll always be crippling limitations.
- Solana didn’t make that mistake: they were very open about very high system requirements from the beginning. Now, I definitely will not give Solana a hard time about their various failures and issues, as a lot of it is down to it being an early beta product. Bugs and issues due to missing features are always on the cards with beta projects, whether rollups, dapps or monolithic L1s, and I only have the best wishes for the developers fixing them. But the problem is that years down the line, when it has matured, is battle-tested and bug-free, and has a fee market implemented, it’s only going to offer an incremental increase in throughput, and at a steep cost to technical & economic sustainability. Solana is inherently unscalable. Optimistic rollups are far superior solutions to Solana, and will mature much sooner.
- Arguably, Polygon PoS has been the chain that has seen the most adoption after Ethereum. Now, it’s true Polygon PoS is a “commitchain”, and not an alt-L1, but it’s still very much a monolithic chain and is bound by all of the same crippling inefficiencies as L1s. Polygon PoS reached its limits, suffered from spam, and raised its minimum gas floor. But even after that, it’s been spammed by projects, raising gas prices above $0.10. To be clear, this is a far better outcome than on Solana or Cardano, where during congestion 99% of transactions would simply fail and only micro-MEV bots would win. To their credit, unlike other monolithic projects, the Polygon team has very openly acknowledged the limitations and has acted upon them by going all in on ZK rollups, which will actually enable high scalability. Actions speak louder than words, and a $1B action is worth commending.
- Speaking of Cardano, they too are a very early beta product, and like Solana, also have to implement fee markets. Cardano’s system requirements are still quite low. Lately, I have seen growing interest in the Cardano community around rollups, so that’s great to see! Nevertheless, until Cardano itself implements data availability sampling, all of this will be for nothing.
- There are many other projects we have seen fail to live up to the hype. We have seen Avalanche C-Chain’s fees spike whenever the block space is saturated, which is a fundamental feature of monolithic chains. Subnets will fragment either security or decentralization, and will be bounded by the same crippling limitations. Regarding “online pruning”: let’s wait and see, but this seems to take Geth’s offline pruning and make it happen at a higher frequency. This could be a nice addition to Geth, but it absolutely does not solve the fundamental limitation of state bloat. We’ve seen Harmony fail, etc. But I also wanted to highlight projects that are building for the next generation with actually scalable solutions: Ethereum, Tezos, Celestia and Polygon Avail with data availability sampling; Mina & Aleo with validity proofs; and of course, the dozens of rollups (there seems to be a new one popping up every week now!). It’s pretty obvious we’ve entered the era of modular architectures: few are building new monolithic L1s anymore with any degree of seriousness. At a pinch, “proto-modular” projects like Polkadot & NEAR are interim solutions that, while they don’t solve a lot of the above issues, do retain sustainability & security. If you don’t care about sustainability & security, Dfinity/ICP is building interesting stuff, though.
- I also want to be very clear that monolithic L1s have a path forward, and indeed this is actually my only goal: to get the entire industry upgraded to the next-generation tech. For example, Avalanche can implement a “data availability sampling subnet” at the base layer validated by the full validator set, and invite modular execution layers to build on top or build their own. But until this is a clear priority on their roadmap, I’ll continue to push the narrative till the change is ubiquitous across the industry.
- Now, not every chain must be a rollup or a modular design of some sort. Sovereign L1s still have their place in situations where security is not important and you want to accomplish some novel feature that is difficult or impossible with rollups. Of course, this is very much a niche, but it’s real. The Cosmos ecosystem is doing splendid work on this front (also, Polygon Edge is building compelling solutions), though I’d like to see these chains be validity proven, and IBC evolve to verify validity proofs. That’s about as good as multi-L1 bridging is going to get, barring some breakthrough. But even the perfect validity-proven bridge, like the one =nil; Foundation is building for Mina <> Ethereum, still assumes that the weaker chain is not compromised. Rollup bridges give you full security guarantees: even if the weaker chain is compromised, you still inherit the stronger chain’s security.
- Finally, it’s important to discuss timing. Optimistic rollups are not ready yet, so using a non-beta monolithic L1 still makes sense. It should be noted that in both cases — optimistic rollups, or monolithic chains — there’s a varying degree of maturity/instability, so you have to evaluate on a case-by-case basis. But I don’t write about the here and now — my only interest is to see how blockchains can scale massively and sustainably in the long term. But optimistic rollups are maturing rapidly, with clear paths to becoming sustainable solutions. Just need the engineering work to get there — it’s quite possible at least one smart contract optimistic rollup will be fully decentralized, implement data compression, have high liquidity bridges, and scale up within a year’s time. Once optimistic rollups are ready for prime time, it’s game over for almost all monolithic L1s.
In the past, I have taken very long-term views, and as a result may have discussed ZK/validity rollups in a more positive light. I have always believed the endgame is ZK rollups, and I see almost all optimistic rollups either becoming ZK rollups or replacing their fraud proof systems with validity proofs. But this eventuality is probably 3 years away.
That concludes the first part: the evidence is overwhelming at this point that monolithic L1s are a dead end, so I just wanted to underline that once and for all, and move on. The next part will be largely Ethereum-centric because that’s where all the rollup innovation is, and they have the most ambitious scalability roadmap. If you are offended by Ethereum, please look away.
OR vs. ZKR
So, let’s talk about optimistic rollups and zk rollups in the here and now. As a side note, this week alone I learned about Obscuro, a TEE rollup, and Urbit’s naïve rollup. The design space for rollups is a blank canvas, so you can have many types of rollups! But here, I’m specifically discussing secured rollups that either rely on fraud proofs (optimistic) or validity proofs (zk).
Let’s start with application-specific rollups: it’s pretty clear that ZK has the lead here. Loopring, zkSync and others have got payments covered, with fees in the $0.10 range for ERC-20 transfers. Both also support trading: at the time of writing, a trade on ZigZag has a flat fee of $0.28, including gas fees and trading fees. Meanwhile, dYdX has zero user-facing gas fees, but we can calculate their gas cost per trade. On a day with high activity, they are paying ~$0.08 in gas fees to Ethereum for each trade. On days with less activity, this is in the $0.10 range. If activity ramps up to ~100 TPS, this will reduce to the ~$0.02 range.
Now, it’s quite possible for an optimistic rollup to offer fees in this ballpark — indeed, the Hubble instance Worldcoin are planning to use would have ERC-20 transfers in the sub-$0.10 range currently. Moving on to smart contract rollups — why are fees so high on Optimism and Arbitrum, then? Currently, a swap costs $0.85 on Optimism, and $1.35 on Arbitrum (btw, all of these numbers are from L2fees.info) and this is too damn high!
The answer is simple: the rollups themselves are unoptimized, and most of the early projects are just forks of Ethereum projects that are not designed for optimistic rollups. By optimizing for optimistic rollups, Aave V3 managed to drop fees by ~10x!
But there are a lot more optimizations incoming. Aggregating or compressing signatures will lead to a straight ~1,000 gas saving for each transaction. Basic compression of calldata leads to a direct 2.5x savings — with more advanced schemes to come. As both rollups and applications are optimized, transaction fees on optimistic rollups can easily get to the $0.01-$0.10 range.
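To make the calldata side of this concrete, here is a minimal Python sketch. The 16 and 4 gas-per-byte constants are Ethereum's calldata prices since EIP-2028; everything else is an assumption for illustration: the `calldata_gas` and `make_transfer_batch` helpers are hypothetical, the batch is a toy set of ERC-20-style transfers, and zlib merely stands in for whatever compressor a rollup actually uses.

```python
import hashlib
import zlib

NONZERO_BYTE_GAS = 16  # L1 calldata gas per non-zero byte (EIP-2028)
ZERO_BYTE_GAS = 4      # L1 calldata gas per zero byte


def calldata_gas(payload: bytes) -> int:
    """Gas charged by L1 for posting `payload` as calldata."""
    zero_bytes = payload.count(0)
    return zero_bytes * ZERO_BYTE_GAS + (len(payload) - zero_bytes) * NONZERO_BYTE_GAS


def make_transfer_batch(n_txs: int) -> bytes:
    """Toy batch: ABI-encoded transfer(address,uint256) calls (hypothetical data)."""
    selector = bytes.fromhex("a9059cbb")  # ERC-20 transfer(address,uint256)
    batch = bytearray()
    for i in range(n_txs):
        # Pseudo-random 20-byte recipient, left-padded to a 32-byte ABI word.
        recipient = hashlib.sha256(i.to_bytes(4, "big")).digest()[:20]
        amount = (1000 * (i + 1)).to_bytes(32, "big")
        batch += selector + bytes(12) + recipient + amount
    return bytes(batch)


raw = make_transfer_batch(50)
compressed = zlib.compress(raw, 9)

raw_gas = calldata_gas(raw)
compressed_gas = calldata_gas(compressed)

print(f"raw:        {len(raw)} bytes, {raw_gas} gas")
print(f"compressed: {len(compressed)} bytes, {compressed_gas} gas")
print(f"gas saving: {raw_gas / compressed_gas:.2f}x")
```

The ratio this prints depends entirely on the made-up batch, so it should not be read as confirming the 2.5x figure. The point is only that padding and repeated fields are cheap to squeeze out, while compressed output is mostly non-zero bytes billed at the higher rate.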
But wait, that’s before The Surge! The first step is likely going to be blob-carrying transactions, if all goes well EOY 2022 with Shanghai. This will reset calldata cost to ~zero, and optimistic rollups actually become cheaper across the board!
Today, ZK rollups have high fixed costs. Currently, they are cheaper because the calldata costs exceed the fixed costs. But as calldata costs become negligible, the transaction fees will be dominated purely by the costs of the rollups. And zk rollups are simply much more expensive to run.
Now, obviously, there are many different proof systems with differing costs. But ~$0.01-$0.02 is a common estimate. This will pretty much be a floor for ZKRs. Optimistic rollups, however, are free to go lower if they so choose. As a mature OR’s costs are 99% calldata, they have access to ~5,200 TPS of blobspace at a negligible cost. By the way, optimistic rollups will also be cheaper than validiums, as the floor cost applies to them too! I believe zkPorter was estimated to be in the $0.01-$0.03 range; ORs can go well below that if they so choose. Personally, I don’t recommend it, for transaction quality reasons, but in that case ORs will end up with a higher “profit” which they can redistribute to stakeholders, or use for public goods funding, development etc.
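The crossover argument can be sketched as a toy cost model in Python. All names here (`RollupCostModel`, `breakeven_price_per_byte`) are hypothetical, and so are the byte counts and the OR operating cost; only the ZKR operating cost is picked from inside the ~$0.01-$0.02 estimate above. It assumes the ZKR posts less data per transaction than the OR, which is what makes it cheaper while data is expensive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RollupCostModel:
    name: str
    data_bytes_per_tx: float      # bytes posted to L1 per tx (assumed)
    operating_cost_per_tx: float  # USD per tx for proving/sequencing (assumed)

    def fee_per_tx(self, usd_per_byte: float) -> float:
        """Per-transaction cost: L1 data cost plus the rollup's own cost."""
        return self.data_bytes_per_tx * usd_per_byte + self.operating_cost_per_tx


def breakeven_price_per_byte(a: RollupCostModel, b: RollupCostModel) -> Optional[float]:
    """L1 data price (USD/byte) at which both rollups cost the same, if any."""
    byte_gap = a.data_bytes_per_tx - b.data_bytes_per_tx
    if byte_gap == 0:
        return None
    price = (b.operating_cost_per_tx - a.operating_cost_per_tx) / byte_gap
    return price if price > 0 else None


# Illustrative numbers only.
optimistic = RollupCostModel("OR", data_bytes_per_tx=100, operating_cost_per_tx=0.001)
zk = RollupCostModel("ZKR", data_bytes_per_tx=20, operating_cost_per_tx=0.015)

for usd_per_byte in (0.001, 0.0001, 0.0):
    print(
        f"data at ${usd_per_byte}/byte: "
        f"OR ${optimistic.fee_per_tx(usd_per_byte):.4f}, "
        f"ZKR ${zk.fee_per_tx(usd_per_byte):.4f}"
    )
print("breakeven USD/byte:", breakeven_price_per_byte(optimistic, zk))
```

Above the breakeven data price the ZKR wins on its smaller data footprint; below it, the ZKR's per-transaction operating cost becomes the floor and the OR is cheaper.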
Of course, eventually, this ~5,200 TPS is saturated, and if there’s overwhelming demand beyond that, the calldata costs will start to rise again. But, by that time, danksharding will roll out expanding this space first to 125,000 TPS and then onwards to millions of TPS over the years. (Btw, the “TPS” figure is pointless when discussing rollups — but it’s useful for illustrative purposes.) Long term, calldata costs will absolutely not be the bottleneck — the bottleneck will be the rollups themselves. Just to reiterate that — the bottleneck will be rollups, not Ethereum.
Longer term, as ZKRs mature, we’ll have ASIC provers, and the rollup costs for ZKRs will also become negligible. That’s when their advantages — most notably the immediate withdrawals — will win out. For some usecases that are highly compressible with state deltas — like dYdX — they will be significantly cheaper than ORs sooner rather than later.
Speaking of which, a common misconception is that ORs have 7-day finality. But actually, ORs will achieve the same finality as L1 sooner than ZKRs. Already, we see ORs commit batches every 5 to 10 minutes, so that’s your latency. The 7 days is there to ensure this finality is maintained, on the assumption that there’ll be at least one honest entity who will challenge an invalid batch (it can be you!). As ORs scale up, these batches will become ever more frequent, and at around 20 TPS, ORs can commit every block, at which point OR finality = L1 finality. Because ZKRs’ fixed costs are so much higher, committing every block requires a lot more activity (>100 TPS) to be feasible. However, with the blob EIP, ZKRs can reengineer to commit some of their proofs to blobs instead, so this may become less of an issue.
So, what’s the 7 day thing, then? Because the challenge period is 7 days, there’s one type of transaction that does come with a 7 day finality period — withdrawals to L1. For some cases, like cross-chain NFTs, this is going to be challenging. However, we have seen many bridges go live on ORs, and as ORs mature, activity & liquidity ramp up, this isn’t going to be an issue at all. The bigger issue is that ZKRs enable new use cases and cross-rollup activity that are impossible with ORs. For example, with danksharding, ZKRs can synchronously call L1, and I even speculate some degree of composability between different ZKRs will be possible if we get L1 pre-confirmations with crLists! ORs are excluded from these novel and innovative scenarios. Addendum: while ORs can do privacy transactions, ZKRs will be able to do it at much lower costs.
Application-specific rollups are fairly streamlined and can scale up massively. IIRC, StarkEx demonstrated 9,000–18,000 TPS way back in mid-2020. However, things are more challenging for smart contract rollups. StarkNet is currently only capable of throughput lower than what dYdX or Immutable X are doing in prod (which is itself a fraction of what they are actually capable of), and the team has made optimizing for throughput a top priority. Because both Optimism and Arbitrum are based around EVM clients, they have a) a battle-tested codebase and b) relatively optimized clients. A few days ago, I asked how far the EVM can scale before requiring parallelism, and Alexey Sharp suggested that Erigon could scale to 500M gas/second sustained (with some outlier bursts). So, there’s a ton of headroom available to optimized EVM-based ORs, and more available through either multi-threaded clients, or multiple instances / recursive rollups. (Yes, ORs can have L3s too, though it’s certainly more elegant with ZKRs.)
This is where we have a point of contention: since ORs have an honest minority assumption, you can’t push system requirements too hard. To be clear, because it’s an honest minority instead of an honest majority assumption like with monolithic L1s, they can safely go far higher than any monolithic L1 can. But you still need to have a limit, and this limit needs to be lower than ZKRs by default, because ZKRs can be verified easily via validity proofs.
But ORs have a solution too — stateless clients. I saw this on Optimism’s roadmap, though I can’t remember where. With stateless clients, it’ll be pretty easy to verify an OR, and with the right tools quite trivial for you to verify your transactions. Further, because the state of rollups can be easily reconstructed directly from L1, rollups can be far more aggressive with state expiry than an L1 can. Once statelessness and high-frequency state expiry are implemented, rollups have now overcome the crippling issue of state bloat! Further, you also have novel solutions like Fuel V2 which parallelize through a UTXO-like system.
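The core trick behind statelessness is that a verifier only needs a state root plus a short witness, not the full state. Here is a minimal sketch using a toy binary SHA-256 Merkle tree in Python. This is an illustration of the general idea, not Optimism's design or Ethereum's actual state trie, and all function names and the sample account data are hypothetical.

```python
import hashlib
from typing import List, Tuple


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    """Hash adjacent pairs, duplicating the last node if the count is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root commitment over a non-empty list of leaves."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: List[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Witness for leaves[index]: (sibling hash, sibling_is_left) per level."""
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Stateless check: needs only the leaf, the witness, and the root."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# Whoever holds the full state builds the root and the witness...
state = [b"alice:10", b"bob:5", b"carol:7"]
root = merkle_root(state)
witness = merkle_proof(state, 2)

# ...and a stateless verifier checks one entry against the root alone.
print(verify(b"carol:7", witness, root))
print(verify(b"carol:700", witness, root))
```

The witness grows with the logarithm of the state size, which is why verification stays cheap even as the state itself balloons.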
That said, ZKRs are still the more elegant solution for maximum throughput within a network, with recursive validity proofs. It’s quite possible that while StarkNet itself is maturing we see application-specific ZKRs with massive throughput build on top of it, and as a whole StarkNet significantly outscales any OR. The trade-off, though, is atomic composability — unless the wizards at ZKR teams figure that out soon!
Side-note: delays for rollup bridges
We’ve seen a massive $320M hack for Solana Wormhole, and we’ve seen with a recent bug bounty by Optimism that bugs are certainly par for the course for early beta products.
So, here’s a pragmatic solution for rollup bridges: have an exponential delay function by the amount withdrawn. Interestingly, due to an OR’s 7 day period, there’s ample time to react even if the bug was exploited. The key insight is that while rollups are maturing, external bridges and ZKR bridges should also implement similar solutions.
For example, if you have a ZKR bridge where massive withdrawals come through in a short period of time, there can be a delay function. This needs to be an aggregate delay, otherwise it’ll be Sybil-attacked. The more the withdrawals are as a % of TVL, the longer the delays will be. In a normal case where you have regular withdrawal activity, they can be as fast as normal. Note that ZKRs offering fast withdrawal functions will probably have to limit those to a certain amount.
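Here is one way such an aggregate delay could look, as a rough Python sketch. Every parameter is an assumption picked for illustration: a 24-hour rolling window, no delay below 1% of TVL, a 1-hour delay that doubles for each further 1% of TVL withdrawn, and a 7-day cap. The `AggregateDelayBridge` class is hypothetical, and TVL is held fixed for simplicity.

```python
from collections import deque


class AggregateDelayBridge:
    """Toy withdrawal queue whose delay grows with total recent outflow."""

    def __init__(
        self,
        tvl: float,
        window_seconds: int = 24 * 3600,  # rolling window (assumed)
        threshold: float = 0.01,          # no delay below 1% of TVL (assumed)
        doubling_step: float = 0.01,      # delay doubles per extra 1% (assumed)
        min_delay_hours: float = 1.0,
        max_delay_hours: float = 168.0,   # capped at 7 days
    ) -> None:
        self.tvl = tvl
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.doubling_step = doubling_step
        self.min_delay_hours = min_delay_hours
        self.max_delay_hours = max_delay_hours
        self._recent = deque()  # (timestamp, amount) across ALL users
        self._recent_total = 0.0

    def _delay_for(self, fraction_of_tvl: float) -> float:
        if fraction_of_tvl <= self.threshold:
            return 0.0
        doublings = (fraction_of_tvl - self.threshold) / self.doubling_step
        return min(self.max_delay_hours, self.min_delay_hours * 2 ** doublings)

    def request_withdrawal(self, amount: float, now: float) -> float:
        """Record a withdrawal and return its delay in hours."""
        # Drop withdrawals that have aged out of the rolling window.
        while self._recent and self._recent[0][0] <= now - self.window_seconds:
            _, old_amount = self._recent.popleft()
            self._recent_total -= old_amount
        self._recent.append((now, amount))
        self._recent_total += amount
        # The delay depends on the aggregate, never on who is withdrawing,
        # so splitting across many addresses buys an attacker nothing.
        return self._delay_for(self._recent_total / self.tvl)


bridge = AggregateDelayBridge(tvl=1_000_000)
for i in range(10):
    delay = bridge.request_withdrawal(5_000, now=0)
    print(f"withdrawal {i + 1}: delay {delay:.1f}h")
```

Because the delay is keyed to the running total, ten withdrawals of 5,000 end up at the same delay as a single withdrawal of 50,000, while ordinary low-volume activity passes through with no delay at all.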
By doing this, you can decentralize your contract upgradeability sooner, as there’s enough time to react. I believe having longer withdrawal times during anomalous events is a much better trade-off than centralizing upgradeability. I hope something like this will give rollup teams confidence to decentralize upgradeability ASAP. Over time, as the protocols mature, these delays can get ever shorter.
In the end, it’s all about timelines.
For application-specific chains, ZKRs are the best solution already for most cases. This should be very obvious to anyone who has used dYdX, zkSync, Loopring or Immutable X (though the latter is not a rollup). There are some trade-offs, but those gaps will all be plugged by the end of the year. (See: dYdX V4)
Smart contracts: monolithic L1s have one more year of relevance. The writing’s on the wall, and there’s no escape other than emergency pivoting to fraud proofs, validity proofs & DA proofs.
Optimistic rollups will scale up faster than ZKRs. A lot of their codebase is already maturing, and once sequencing & upgradeability are decentralized, they’ll be ready for mass adoption. It’s possible that by the end of 2022 ORs will have sub-cent gas fees, be fully decentralized, and have materially the same security and finality as Ethereum, with ample liquidity for fast withdrawals. They’ll continue improving as statelessness & state expiry are implemented over 2023.
ZKRs will continue evolving, maturing, and being battle-tested, optimizing proving times, moving to GPU and finally ASIC provers. Their novel VMs and sequencer nodes will mature and scale up over time too. By the end of 2023, I expect ZKRs to have caught up to and edged out ORs. But ORs will continue to be relevant till 2024/25, by which time I expect most ORs to become ZKRs or at least replace their fraud proof systems with validity proofs.
I have said many times that my time horizons are 5–10 years. But rollups & DAS are developing so rapidly that I now think the endgame is in sight, and will happen before 5 years are up.
Thanks to reddit user u/proof_of_lake for proofreading.