Blockchain full nodes, decentralization, and scalability: an impossible challenge?
One of the key tenets of blockchain decentralization is enabling users to run nodes and verify the chain for themselves. Indeed, a blockchain network where users are unable to run full nodes is essentially not trustless, because you’re trusting validators/miners. A common misconception is that miners/validators run the networks; it’s actually the full nodes. Miners/validators simply provide a service. It’s alarming that most blockchain networks do not have a culture of users verifying, and they conveniently sweep this critical flaw under the rug while they boast of “XXX TPS”. To be clear, there’s nothing inherently wrong with this; it’s just that these blockchains are not decentralized and not trustless, and should not be compared to those that are.
Jameson Lopp has completed a fascinating series of tests on various blockchain full nodes: 2021 Altcoin Node Sync Tests (lopp.net).
It’s immediately obvious Bitcoin still takes point when it comes to decentralization. The benchmark is set, with Lopp’s reference system taking less than 3 hours to sync 12+ years of Bitcoin. Bitcoin-like networks such as Litecoin and Dogecoin also do well. Of course, they do well due to the very limited scalability and utility, but that’s the trade-off involved.
Moving on to Ethereum, you can immediately see things get far more challenging. Geth no longer runs on a 1 TB SSD, so Lopp couldn’t complete a full sync. But from my experience, it takes about a week to sync. Erigon is Ethereum’s fastest client, syncing in two and a half days. It makes some smart choices, which may prove controversial to purists, but I think it’s just clever optimization. While that’s still far longer than Bitcoin, given the amount of utility and scalability Ethereum has provided over 6 years, this is absolutely remarkable. Erigon makes full use of the hardware and is a highly optimized client, perhaps the best in the blockchain industry.
Still, it’s obvious that Ethereum is on the very ragged edge of how far L1 scalability can be pushed. Some would argue that it’s already way over the limits. People love to complain about high gas fees and such, but actually, this is as good as it’s going to get for a decentralized network.
Monero takes nearly as long as Erigon, despite significantly less usage compared to Ethereum over the years. It’s still definitely on the edge of being called fully decentralized.
Unfortunately, things start to look very dire as we move on to other protocols. Polkadot and Cardano take 5 hours and 10 hours respectively, which is insane given how little activity these chains have had, with both being live for not more than a year. Once they roll out smart contracts and if they have some real utility and usage on chain, they are going to end up being centralized very rapidly. Granted, both have very poorly written clients, so there’s room for optimization to make better use of the hardware.
Binance Smart Chain is the perfect cautionary tale. Despite having meaningful activity for only 6 months or so, it already takes 13 days to sync, which is longer than Ethereum despite being a fork of Geth. Binance likes to boast that it has 8 times the throughput of Ethereum; clearly, that comes at a very steep cost. I estimate that within the next year or two it’ll be practically impossible to sync on a consumer PC. EOS is already there: it takes weeks or months.
Incredibly, it’s already impossible to sync Solana from genesis! Your only option is to get snapshots from Solana Labs. For all intents and purposes, Solana is no longer a decentralized network. On Lopp’s high-end consumer PC, he was unable to sync even after getting the snapshot! To be fair, Solana does require 128–256 GB RAM, so this was to be expected. Solana has been live in earnest for 6 months or so, and most of their activity is just Serum (which in turn I suspect is mostly bots and market makers). Already, they have a skip rate of 33%, i.e. 33% of their blocks are not produced on time. How are things going to look years from now?
Finally, we have ICP, which flat out doesn’t let you run a full node: you have to buy one of their approved supercomputers.
Obviously, this doesn’t cover all blockchains, but the general concept holds. The trilemma is real, and the more you push scalability, the harder it gets to run a full node, and the more centralized your network becomes. It’s clear that Bitcoin and other Bitcoin-like networks such as Litecoin are the only ones that are unarguably decentralized. Ethereum is at the very edge of this, though Erigon does make things better. Everything else is simply not decentralized.
So, what are the solutions? Geth implemented fast sync, which doesn’t verify all transactions, but downloads the blocks and verifies the proof-of-work. Now, they have snap sync, which is the default. Ethereum’s consensus (eth2) clients use weak subjectivity sync to start at a recent block and sync backwards. Techniques like this may be controversial to purists, but to others, they achieve the same results without materially sacrificing decentralization.
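To make the fast sync idea more concrete, here is a minimal toy sketch of checking a chain of headers by linkage and proof-of-work alone, without re-executing any transactions. Everything in it is an assumption made for illustration: the `Header` fields, the SHA-256 hash, the fixed `TARGET`, and the `mine` helper are hypothetical and do not reflect Geth’s actual header format or Ethereum’s Ethash algorithm.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical fixed difficulty: a hash passes roughly 1 time in 256.
TARGET = 2 ** 248


@dataclass(frozen=True)
class Header:
    parent_hash: bytes
    tx_root: bytes  # commitment to the block's transactions (never executed here)
    nonce: int

    def hash(self) -> bytes:
        data = self.parent_hash + self.tx_root + self.nonce.to_bytes(8, "big")
        return hashlib.sha256(data).digest()


def meets_target(header: Header, target: int = TARGET) -> bool:
    """Proof-of-work check: the header hash, read as an integer, is below the target."""
    return int.from_bytes(header.hash(), "big") < target


def mine(parent_hash: bytes, tx_root: bytes, target: int = TARGET) -> Header:
    """Toy miner used only to build test data: try nonces until the target is met."""
    nonce = 0
    while True:
        header = Header(parent_hash, tx_root, nonce)
        if meets_target(header, target):
            return header
        nonce += 1


def verify_header_chain(headers, genesis_hash: bytes, target: int = TARGET) -> bool:
    """Check parent linkage and proof-of-work for each header, in order."""
    prev = genesis_hash
    for header in headers:
        if header.parent_hash != prev or not meets_target(header, target):
            return False
        prev = header.hash()
    return True


def build_chain(genesis_hash: bytes, tx_roots) -> list:
    """Mine one header per transaction root, each linked to the previous one."""
    chain, prev = [], genesis_hash
    for root in tx_roots:
        header = mine(prev, root)
        chain.append(header)
        prev = header.hash()
    return chain
```

The point of the sketch is the cost profile: verifying a header is one hash and one comparison, regardless of how many transactions the block contains, which is why this style of sync is so much cheaper than replaying the full history.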
While the article covers full node syncs, there’s another important aspect to it: can a full node stay in sync after the initial sync is complete? For example, with Solana, even using centralized snapshots it was impossible to stay in sync with Lopp’s PC. I’d say Binance Smart Chain and presumably Polygon PoS are at the edge of what’s possible on a relatively high-end PC today.
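The difference between “can sync” and “can stay in sync” comes down to simple arithmetic: a node only ever reaches the tip if it verifies blocks faster than the network produces them. Here is a minimal sketch of that condition; the function name and the rates in blocks per second are hypothetical, and real clients have bursty, state-dependent verification costs that this ignores.

```python
import math


def catch_up_seconds(backlog_blocks: float,
                     verify_rate: float,
                     production_rate: float) -> float:
    """Seconds until a node reaches the chain tip, or math.inf if it never does.

    backlog_blocks:  how many blocks the node is behind right now
    verify_rate:     blocks per second the node can verify (assumed constant)
    production_rate: blocks per second the network produces (assumed constant)
    """
    if backlog_blocks <= 0:
        return 0.0
    surplus = verify_rate - production_rate
    if surplus <= 0:
        # The chain grows at least as fast as the node verifies: it falls behind forever.
        return math.inf
    return backlog_blocks / surplus
```

Under this model, a snapshot only shrinks `backlog_blocks`. It does nothing for a machine whose `verify_rate` is below the network’s `production_rate`, which is the situation described above for Solana on Lopp’s PC.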
A long-term solution is stateless clients, where full node clients need not hold and sync the full state, just witnesses. It’s a high priority on Ethereum’s roadmap, and this will finally make it easy for the average user to run a node. Of course, it won’t help full nodes and block proposers, who’ll still need to hold the full state, but it’ll go a long way in making Ethereum as easy to use as Bitcoin (and even more so). This will be combined with state expiry, where every year or so inactive state will be expired and archived. Together, statelessness and state expiry will dramatically improve the situation around state size management, which as you can see from the rest of this post is very dire. But obviously, this has its limits and will not help with staying in sync.
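As a rough illustration of what a “witness” is, here is a minimal sketch in which a client holds only a state root and checks a Merkle proof for the one piece of state a transaction touches. It uses a plain binary SHA-256 Merkle tree purely as an assumption for readability; Ethereum’s actual state uses a Merkle Patricia trie, and the function names and the `account:balance` leaf encoding are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, leaf hashes first. Leaf count must be a power of two."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_witness(levels, index: int):
    """Built by a node that holds the full state: sibling hashes from leaf up to the root."""
    witness = []
    for level in levels[:-1]:
        witness.append(level[index ^ 1])  # the sibling sits at the adjacent index
        index //= 2
    return witness


def verify_witness(root: bytes, leaf: bytes, index: int, witness) -> bool:
    """Run by the stateless client: recompute the root from the leaf and its siblings."""
    node = h(leaf)
    for sibling in witness:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Usage: the full node builds the tree, the stateless client keeps only `root`.
state = [b"alice:10", b"bob:5", b"carol:7", b"dave:0"]
levels = build_levels(state)
root = levels[-1][0]
witness = make_witness(levels, 2)
is_valid = verify_witness(root, b"carol:7", 2, witness)
```

The witness grows with the logarithm of the state size (two hashes for four accounts here), so the verifying client’s storage stays tiny while the node that builds the witness still needs the whole state, matching the split between ordinary users and block proposers described above.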
So how do we attain blockchain scalability? The answer is through innovative techniques like rollups and sharding. It’s pretty clear a monolithic blockchain will always be highly constrained, no matter what marketing departments claim. This is a whole different can of worms, though, so I’ll conclude this post here.
Tl;dr: Bitcoin (along with Bitcoin-like networks) is the only unarguably decentralized network in crypto, and Ethereum is arguably the only other one. We have to hold blockchain projects to a higher standard and force them to use innovative techniques to attain scalability, rather than brute force that leaves these networks inherently centralized.