The Future of Scalability Benchmarks: Moving Beyond Peak TPS

Why peak TPS is a misleading vanity metric, and how economic throughput, finality latency under load, and cost-per-provable-computation define the real battle for L2 supremacy.

Introduction

Peak TPS is a flawed metric. It measures theoretical throughput in a vacuum, ignoring critical constraints such as state growth, data availability costs, and real user demand patterns, and it fails to capture the real-world performance and economic efficiency of modern blockchain systems. Headline figures from chains like Solana or Sui often reflect synthetic, subsidized conditions.
The real bottleneck is state. Systems like Arbitrum Nitro and zkSync Era optimize for state management, not just raw transaction processing. Their effective scalability is defined by how cheaply they can prove and store state transitions on Ethereum.
Scalability is now multi-dimensional. Modern benchmarks must evaluate finality time, cost per user operation, and cross-domain interoperability latency. A protocol like Starknet achieves scalability through recursive proofs, a fundamentally different vector than Aptos's parallel execution.
Evidence: Arbitrum Nitro's execution engine can process transactions far faster than it can post them to Ethereum; its real constraint is the ~100 KB/sec of data availability bandwidth it can buy on L1, a limit that peak TPS ignores completely.
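To make this concrete, here is a minimal back-of-the-envelope sketch (pure Python, illustrative numbers only) of how data availability bandwidth, not execution speed, bounds a rollup's sustainable throughput. The bandwidth, transaction-size, and execution figures are assumptions for the example, not measurements of any named chain.

```python
# Illustrative only: how DA bandwidth bounds a rollup's sustainable TPS.
# All numbers are assumptions for the sake of the example.

da_bandwidth_bytes_per_sec = 100 * 1024   # assumed ~100 KB/s of L1 data availability
avg_compressed_tx_bytes = 150             # assumed compressed size of one rollup tx
internal_execution_tps = 50_000           # whatever the VM can execute in isolation

# The chain can only finalize what it can post to the DA layer.
da_bound_tps = da_bandwidth_bytes_per_sec / avg_compressed_tx_bytes
sustainable_tps = min(internal_execution_tps, da_bound_tps)

print(f"DA-bound TPS: {da_bound_tps:,.0f}")          # ~683 TPS under these assumptions
print(f"Sustainable TPS: {sustainable_tps:,.0f}")    # bounded by DA, not execution
```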
Executive Summary
Peak TPS is a vanity metric. The next generation of scalability benchmarks will measure real-world performance, economic security, and developer experience.
The Problem: TPS Measures Throughput, Not Utility
Advertised 100,000 TPS is meaningless if the network is congested during a memecoin frenzy. Peak capacity ignores real-world constraints like state growth, mempool dynamics, and validator load.
- Real Bottleneck: State bloat and I/O, not CPU.
- User Impact: High latency and failed transactions during peak demand.
- Industry Shift: Projects like Solana and Sui now emphasize Time-to-Finality and concurrent execution.
The Solution: The End-to-End Latency Stack
True scalability is measured from user click to guaranteed settlement. This stack includes sequencer inclusion, proof generation, and cross-chain verification.
- Key Layer: EigenDA and Avail decouple data availability from execution.
- Bottleneck Shift: Proof generation time on zkEVMs like zkSync and Scroll.
- New Metric: Settlement Assurance Time, combining soft and hard finality.
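As a minimal sketch of the Settlement Assurance Time idea, the snippet below (pure Python, hypothetical numbers) composes soft finality (sequencer inclusion) with hard finality (batch posting plus proof generation or a challenge window) into one end-to-end figure. The component latencies are illustrative assumptions, not measurements of any named chain.

```python
from dataclasses import dataclass

@dataclass
class LatencyStack:
    """End-to-end latency components for one user operation (seconds)."""
    rpc_and_mempool: float        # client -> sequencer inclusion
    sequencer_soft_confirm: float
    batch_posting: float          # batch lands on the DA layer / L1
    proof_or_challenge: float     # validity-proof generation OR fraud-proof window

    def soft_finality(self) -> float:
        # What the user perceives as "done".
        return self.rpc_and_mempool + self.sequencer_soft_confirm

    def settlement_assurance_time(self) -> float:
        # When settlement is actually guaranteed on L1.
        return self.soft_finality() + self.batch_posting + self.proof_or_challenge

# Hypothetical ZK rollup (~1 h proving) vs hypothetical optimistic rollup (7-day window).
zk = LatencyStack(0.3, 1.0, 600, 3_600)
op = LatencyStack(0.3, 0.5, 900, 7 * 24 * 3_600)

for name, stack in [("zk-rollup", zk), ("optimistic rollup", op)]:
    print(name, f"soft={stack.soft_finality():.1f}s",
          f"hard={stack.settlement_assurance_time():,.0f}s")
```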
The Problem: Isolated Benchmarks Ignore Composability
A chain can be fast in a vacuum but crumble under cross-chain load. The real network is a multi-chain system where LayerZero messages, Circle's CCTP transfers, and Wormhole NFTs create interdependent load.
- New Stress Test: Simultaneous bridge withdrawals and DEX arbitrage.
- Systemic Risk: Congestion on L1 (Ethereum) cascades to all L2s (Arbitrum, Optimism).
- Benchmark Gap: No standard for measuring cross-rollup congestion.
The Solution: Cost-Per-Unit-Value as the Ultimate Metric
Forget cost-per-transaction. Protocols and users care about cost-per-dollar-settled. This measures economic efficiency for high-value DeFi, NFT mints, and institutional transfers.
- VC Focus: Andreessen Horowitz (a16z) and Paradigm track this for portfolio chains.
- Protocol Example: Uniswap v4 hook execution cost vs. swap value.
- Real Scaling: A chain that's cheap for $10 swaps but expensive for $10M swaps has failed.
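A minimal sketch of the cost-per-dollar-settled idea, using made-up fee and volume figures purely for illustration: the same flat fee looks very different once it is normalized by the value it settles.

```python
# Illustrative cost-per-dollar-settled calculation; all figures are assumptions.
transactions = [
    {"kind": "retail swap",        "fee_usd": 0.25, "value_usd": 10},
    {"kind": "NFT mint",           "fee_usd": 0.25, "value_usd": 50},
    {"kind": "institutional swap", "fee_usd": 0.25, "value_usd": 10_000_000},
]

total_fees = sum(t["fee_usd"] for t in transactions)
total_value = sum(t["value_usd"] for t in transactions)

for t in transactions:
    # Fee paid per dollar of value settled (lower is better).
    print(t["kind"], f'{t["fee_usd"] / t["value_usd"]:.2e} $ per $ settled')

print("chain-wide cost per dollar settled:", f"{total_fees / total_value:.2e}")
```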
The Problem: Developer Velocity is the Hidden Tax
Scalability isn't just for users. Complex state access patterns, non-standard precompiles, and unpredictable gas costs on chains like Polygon zkEVM or Base cripple developer iteration speed.
- True Cost: Weeks of optimization for simple contracts.
- Ecosystem Risk: Developers flock to chains with predictable performance (Arbitrum Stylus, Fuel).
- Missing Benchmark: Time-to-first-successful-complex-dApp.
The Future: Adversarial Load Testing as a Service
The next benchmark standard will be continuous, adversarial simulation. Services like Chaos Labs will stress-test chains with realistic, malicious load to find breaking points before hackers do.
- Simulate: NFT mint rushes, oracle manipulation, and governance attacks.
- Benchmark: Sustained TPS under attack, not ideal conditions.
- Outcome: A security-scaling score for VCs and protocols.
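As a toy illustration of the "sustained TPS under attack" benchmark above, the sketch below simulates a spam burst against a chain with fixed capacity and a bounded mempool, then reports sustained throughput and the failed-transaction rate. It is a simplified queueing model with assumed parameters, not a real load-testing harness.

```python
import random

random.seed(7)

CAPACITY_TPS = 1_000   # assumed block-space capacity per second
BASELINE_TPS = 600     # assumed organic demand
ATTACK_TPS = 5_000     # assumed spam during a mint rush (seconds 30-60)
MAX_QUEUE = 20_000     # assumed mempool size before transactions are dropped

queue, included, dropped, submitted = 0, 0, 0, 0
for second in range(120):
    demand = BASELINE_TPS + (ATTACK_TPS if 30 <= second < 60 else 0)
    demand = int(demand * random.uniform(0.9, 1.1))    # jitter
    submitted += demand
    queue += demand
    if queue > MAX_QUEUE:                              # mempool overflow -> failed txs
        dropped += queue - MAX_QUEUE
        queue = MAX_QUEUE
    processed = min(queue, CAPACITY_TPS)
    included += processed
    queue -= processed

print(f"sustained TPS over the window: {included / 120:,.0f}")
print(f"failed-transaction rate: {dropped / submitted:.1%}")
```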
Thesis: TPS is a Vanity Metric, Throughput is a System Property
Peak TPS is a marketing number; real scalability is defined by sustainable throughput across the entire user journey.
Peak TPS is meaningless. It measures a single, isolated component under ideal lab conditions, ignoring network congestion, state growth, and data availability costs.
Throughput is a system property. It measures the end-to-end capacity of a network, from transaction submission to finality, including the mempool, sequencer, and DA layer.
Real-world systems bottleneck elsewhere. Solana's 65k TPS claim is irrelevant if the RPC layer or the state management system cannot sustain the load.
Evidence: Arbitrum Nitro's capacity is defined by its interaction with Ethereum for data posting, not its internal execution speed. Avalanche subnets illustrate the complementary lesson: sustainable throughput comes from partitioning load across chains, not from one chain's peak number.
The Current L2 Benchmark Circus
Peak TPS benchmarks are marketing theater that obscure the real constraints of production systems.
Peak TPS is meaningless. It measures a synthetic, single-application workload on a single, empty block. This ignores network congestion, cross-domain messaging, and the cost of data availability, which are the real bottlenecks for users.
The real metric is sustained throughput. A chain's capacity under a realistic, multi-application load with competing transactions determines its utility. Arbitrum and Optimism sustain on the order of ~50-100 TPS in production, far below the figures advertised in lab tests.
Benchmarks ignore the proving bottleneck. A zkEVM's peak TPS is irrelevant if its prover takes hours to generate a validity proof for that block. The constraint shifts from execution to proving latency and cost.
Evidence: The L2BEAT benchmark dashboard tracks real-world metrics like transaction costs and time-to-finality, exposing the gap between marketing claims and on-chain reality for networks like zkSync Era and Base.
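A minimal way to express the proving bottleneck described above: effective throughput is the minimum of the execution, data availability, and proving rates. The rates below are illustrative assumptions, not benchmarks of zkSync Era, Base, or any other network.

```python
# Effective throughput is the slowest stage of the pipeline, not the fastest.
# All rates are illustrative assumptions.

execution_tps = 4_000                      # what the VM can execute
da_tps = 700                               # what fits into L1 data availability
txs_per_batch = 10_000
prover_seconds_per_batch = 3_600           # assumed one hour to prove a batch
proving_tps = txs_per_batch / prover_seconds_per_batch
# (real systems parallelize proving across many machines; this assumes one prover)

effective_tps = min(execution_tps, da_tps, proving_tps)
bottleneck = min(
    [("execution", execution_tps), ("data availability", da_tps), ("proving", proving_tps)],
    key=lambda pair: pair[1],
)[0]

print(f"proving rate: {proving_tps:.1f} TPS")
print(f"effective TPS: {effective_tps:.1f} (bottleneck: {bottleneck})")
```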
The New Benchmark Matrix: A Protocol Comparison
Comparing next-generation blockchain scaling solutions across holistic performance, economic, and security dimensions.
| Metric / Feature | zkSync Era | Arbitrum Nitro | Starknet | Base |
|---|---|---|---|---|
| Peak Theoretical TPS | 2,000+ | 4,000+ | ~2,000 | — |
| Time to Finality (L1) | ~15 min | ~5 min | ~3-4 hours | ~12 min |
| Avg. Transaction Cost | $0.10 - $0.50 | $0.20 - $0.80 | $0.05 - $0.30 | $0.01 - $0.10 |
| Native Account Abstraction | Yes | No (ERC-4337 only) | Yes | No (ERC-4337 only) |
| Proof System | zk-SNARKs (ZK Rollup) | Optimistic Rollup | zk-STARKs (ZK Rollup) | Optimistic Rollup |
| Fraud/Validity Proof Time | ~1 hour | 7 days (challenge period) | ~3-4 hours | 7 days (challenge period) |
| Sequencer Decentralization | Planned (ZK Stack) | Planned | Planned (decentralized provers first) | Centralized (Coinbase) |
| EVM Bytecode Compatibility | Yes (custom VM) | Yes (full EVM) | No (Cairo VM) | Yes (full EVM) |
Deconstructing the Three Pillars of Real Scalability
Peak TPS is a vanity metric; real scalability requires optimizing for throughput, cost, and finality simultaneously.
Scalability is a three-body problem. Isolating a single metric like TPS creates a false benchmark. A chain must balance transaction throughput, user cost, and finality time. Optimizing for one degrades the others, as seen in Solana's trade-offs between speed and reliability.
Throughput without cost control is useless. A system processing 100k TPS with $10 fees fails. Real scaling requires sub-cent transaction costs at scale, which is the core innovation of data availability layers like Celestia and EigenDA.
Finality is the silent killer. Users perceive speed as finality, not block inclusion. Fast confirmation (e.g., Solana's ~400 ms slots, Sui's sub-second finality) defines user experience, while optimistic rollups like Arbitrum carry a 7-day challenge window before hard L1 finality.
Evidence: The modular stack wins. By separating execution, settlement, and data availability, chains like Arbitrum Orbit and Optimism Superchain optimize each pillar independently. This architecture achieves the sustainable scaling that monolithic L1s cannot.
Architectural Trade-offs in Practice
Peak TPS is a marketing metric. Real-world scalability is defined by the trade-offs between throughput, cost, and decentralization under load.
The Problem: Peak TPS Ignores State Growth
Advertised 1M TPS is meaningless if node hardware requirements double yearly. The real bottleneck isn't computation, but state bloat that centralizes infrastructure.
- Solana's ~1.2 TB annual state growth demands enterprise-grade hardware.
- Ethereum's roughly 1 TB full-node footprint is mitigated by Erigon's flat storage layout.
- The benchmark that matters: State Growth per 1000 TPS.
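A minimal sketch of the "State Growth per 1,000 TPS" metric proposed above, normalizing annual state growth by sustained throughput. The growth and throughput figures are illustrative assumptions, not measured values for any named chain.

```python
# State growth normalized by throughput; all inputs are illustrative assumptions.
chains = {
    # name: (annual state growth in GB, sustained TPS)
    "high-throughput L1": (1_200, 1_500),
    "rollup A":           (150,     100),
    "rollup B":           (40,       50),
}

for name, (growth_gb, sustained_tps) in chains.items():
    gb_per_kilo_tps = growth_gb / (sustained_tps / 1_000)
    print(f"{name}: {gb_per_kilo_tps:,.0f} GB of state per year per 1,000 TPS")
```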
The Solution: Cost-Per-Transaction Under Congestion
Sustainable scaling is measured by how costs behave during a mempool flood. Base fee volatility is the true stress test.
- Ethereum L1 fees can spike to $200+ during an NFT mint.
- Solana fails this test entirely during congestion, requiring priority fees and experiencing ~50% failed tx rates.
- Arbitrum Nitro and zkSync Era maintain sub-$0.10 median fees by leveraging L1 security for data, not execution.
The Benchmark: Time-to-Finality Distribution
Users experience latency distributions, not averages. A chain with 2s avg finality but a 60s 99th percentile is unreliable for DeFi.
- Avalanche subnets achieve ~1-2s finality but can suffer from cross-subnet latency.
- Polygon zkEVM leverages Ethereum for ~10 min finality, trading speed for supreme security.
- The key metric is Finality SLA: % of transactions final within a guaranteed window.
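A minimal sketch of the Finality SLA idea: compute the share of transactions finalized within a guaranteed window and compare the mean against tail percentiles. The synthetic latency sample is an assumption for illustration, not data from any chain.

```python
import random
from statistics import mean, quantiles

random.seed(42)

# Synthetic finality latencies (seconds): mostly fast, with a congested tail.
latencies = [random.gauss(2.0, 0.3) for _ in range(950)] + \
            [random.uniform(20, 60) for _ in range(50)]

SLA_WINDOW_SECONDS = 5.0
within_sla = sum(1 for x in latencies if x <= SLA_WINDOW_SECONDS) / len(latencies)
qs = quantiles(latencies, n=100)
p50, p99 = qs[49], qs[98]

print(f"mean finality: {mean(latencies):.1f}s")        # looks fine on its own
print(f"p50 / p99 finality: {p50:.1f}s / {p99:.1f}s")  # the tail tells the real story
print(f"Finality SLA (<= {SLA_WINDOW_SECONDS}s): {within_sla:.1%}")
```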
The Reality: Decentralization Under Load
High TPS often requires sacrificing validator decentralization. The critical ratio is Nodes / Throughput.
- Solana's ~1,500 validators support high TPS but with high hardware costs, creating centralization pressure.
- Celestia's data availability sampling allows light nodes to scale with the network, preserving decentralization.
- Monad's parallel EVM targets 10k TPS but its decentralization will be proven by its validator set size at launch.
The New Standard: Total Cost of Operation (TCO)
Protocols must be evaluated on the full cost to secure and use them: L1 security costs + L2 operational costs + cross-chain messaging fees. A rough per-transaction calculation is sketched after the list below.
- Optimism's Bedrock upgrade reduced L1 data costs by ~50%, directly lowering TCO.
- zkRollups like StarkNet have high proving costs but near-zero L1 settlement costs, optimizing for high-volume batches.
- Polygon 2.0's interconnected ZK L2s aim to minimize TCO via shared liquidity and security.
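A rough sketch of the TCO decomposition above, summing L1 data/settlement costs, L2 operating costs, and cross-chain messaging into a single per-transaction figure. Every cost and volume figure here is an assumed placeholder, not a measured number for any named protocol.

```python
# Total Cost of Operation per finalized transaction; all inputs are assumptions.
monthly_txs = 30_000_000

monthly_costs_usd = {
    "L1 data / settlement (blobs, proofs posted)":  400_000,
    "L2 operations (sequencer, prover, RPC infra)": 250_000,
    "cross-chain messaging subsidies":               50_000,
}

tco_per_tx = sum(monthly_costs_usd.values()) / monthly_txs
for item, cost in monthly_costs_usd.items():
    print(f"{item}: ${cost / monthly_txs:.4f} per tx")
print(f"Total Cost of Operation: ${tco_per_tx:.4f} per tx")
```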
The Frontier: Application-Specific Benchmarks
Generic TPS is dead. Scalability is now measured per use case: Perps Latency, NFT Mint Gas Cost, Cross-Swap Slippage.
- dYdX v4 on Cosmos is built for ~1,000 orders/sec with sub-10ms matching engine latency.
- Immutable zkEVM benchmarks >100k NFT mints for < $1 total gas, a metric irrelevant to Uniswap.
- The future is benchmark suites, not single numbers.
Counterpoint: Why Developers Still Care About Simple TPS
Peak TPS remains a critical, if flawed, metric for developer adoption because it directly impacts user experience and operational costs.
TPS is a proxy for cost. Developers building high-frequency applications like on-chain games or perpetual DEXs need predictable, low transaction fees. A high sustained TPS indicates a chain's capacity to handle load without gas price spikes, directly affecting user retention and protocol economics.
Simple benchmarks accelerate integration. When evaluating a new L2 like Arbitrum Nova or zkSync Era, a developer's first technical filter is throughput. A published peak TPS figure, while incomplete, provides a rapid, comparable baseline for initial architecture decisions, unlike complex multi-dimensional benchmarks.
User experience is defined by latency. For end-users, the difference between a 2-second and a 10-second finality is the difference between a usable product and an abandoned cart. TPS under load directly correlates with this latency, making it a non-negotiable performance indicator for consumer apps.
Evidence: The migration of NFT projects from Ethereum L1 to Polygon and Solana was primarily driven by the need for higher TPS at lower cost to enable minting and trading at scale, proving the metric's practical weight in adoption decisions.
Frequently Challenged Questions
Common questions about the evolution of blockchain scalability metrics beyond simple transaction throughput.
Why is peak TPS a bad metric for comparing blockchains?
Peak TPS is a bad metric because it measures an unrealistic, isolated lab condition, not real-world user experience. It ignores network congestion, transaction finality time, and the cost of decentralization. Benchmarks must evolve to measure sustained throughput under load, time-to-finality, and cost-per-transaction to reflect actual utility.
The Endgame: Benchmarks as a Commodity
Peak TPS becomes a meaningless vanity metric as the industry standardizes on composable, intent-based execution layers.
Benchmarks will standardize. The current fragmented landscape of TPS claims from Solana, Sui, and Aptos will converge on a common methodology, likely driven by the Ethereum community's EIP-4844 and blob fee markets. This creates a commodity data layer for performance.
The metric shifts to cost-per-unit-of-work. Developers will not ask 'how fast?' but 'how cheap to prove a batch of swaps?'. This mirrors the evolution from raw compute to AWS's cost-per-API-call model, making execution cost the primary benchmark.
Intent-centric architectures obviate TPS. Protocols like UniswapX and CowSwap abstract execution away from users. The relevant benchmark becomes the solver's economic efficiency and the settlement layer's finality speed, not the chain's raw throughput.
Evidence: Arbitrum Stylus demonstrates this shift by pricing WASM execution in fine-grained sub-gas units rather than in transactions per second. The market will price performance in gas-equivalent units across all L2s and alt-L1s.
Actionable Takeaways for Builders & Investors
Forget peak TPS. The next generation of scaling will be defined by composability, economic security, and user experience.
The Problem: TPS is a Vanity Metric
Advertised peak TPS ignores the real bottlenecks: state growth, cross-domain latency, and the cost of finality. A chain claiming 100k TPS is meaningless if its state becomes a 10 TB archive or cross-chain messages take 10 minutes.
- Key Insight: Measure time-to-finality and cost-per-finalized-transaction, not raw throughput.
- Action: Audit scaling claims by testing sustained load under adversarial conditions (e.g., spam attacks).
The Solution: Modular & Intent-Centric Architectures
Scalability is now a system design problem. Decouple execution (rollups, high-throughput L1s), settlement (Ethereum), and data availability (Celestia, EigenDA).
- Key Insight: Modular stacks (e.g., using Celestia for DA) can reduce L2 operating costs by >90%.
- Action: Build for intent-based flows (see UniswapX, CowSwap) where users declare outcomes, not transactions, abstracting away chain boundaries.
The Metric: Economic Throughput (TVL * Velocity)
Real scalability is value moved securely per unit time. A chain with $1B TVL and high velocity is more scalable than one with $10B TVL that is stagnant; a minimal calculation is sketched after the list below.
- Key Insight: Track Total Value Secured (TVS) and capital rotation rates. Protocols like EigenLayer monetize security directly.
- Action: For L1s/L2s, incentivize high-value, high-frequency applications (DeFi, gaming) not just NFT mints.
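A minimal sketch of the economic-throughput metric defined above: TVL multiplied by capital velocity, which is simply the value settled per period. The TVL and volume figures are illustrative assumptions.

```python
# Economic throughput = value secured x how fast that value turns over.
# All TVL and volume figures are illustrative assumptions.
chains = {
    # name: (TVL in USD, 30-day settled volume in USD)
    "chain A": (1_000_000_000, 15_000_000_000),
    "chain B": (10_000_000_000, 4_000_000_000),
}

for name, (tvl, monthly_volume) in chains.items():
    velocity = monthly_volume / tvl                 # capital rotations per month
    economic_throughput = tvl * velocity            # equals monthly settled volume
    print(f"{name}: velocity {velocity:.1f}x/month, "
          f"economic throughput ${economic_throughput:,.0f}/month")
```

Under these assumed numbers, the smaller chain moves nearly four times more value per month, which is the sense in which it is "more scalable" despite holding one tenth of the TVL.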
The Bottleneck: Interoperability is the New Scaling Frontier
Scalability is worthless if it's siloed. The future is multi-chain, requiring seamless asset and state movement.
- Key Insight: Universal interoperability layers (LayerZero, Chainlink CCIP, Axelar) and shared security models (Polygon AggLayer, Cosmos IBC) are critical infrastructure.
- Action: Evaluate bridges and messaging protocols on latency (~2-5 sec for optimistic, ~1 min for zk), security guarantees (economic vs. cryptographic), and cost.
The Shift: User-Observed Latency is King
Users don't care about block time; they care about perceived speed from click to confirmation. This requires optimizing the entire stack, from RPCs to pre-confirmations.
- Key Insight: Instant pre-confirmations from sequencers or restaked-validator schemes (e.g., EigenLayer-based preconfirmations) and fast-finality chains (Solana, Sei) set the new UX standard.
- Action: Implement local fee markets and priority lanes to guarantee sub-second feedback for end-users, even during congestion.
The Reality: Scalability Requires Sustainable Economics
High throughput that bankrupts validators or relies on unsustainable token emissions is a dead end. Fee revenue must cover hardware and security costs.
- Key Insight: Analyze a chain's cost-of-production (hardware, bandwidth) vs. protocol revenue. Solana and Monad push hardware limits; Ethereum L2s inherit security but pay for it.
- Action: For investors, model validator profitability. For builders, design for real fee revenue, not just token incentives.