The Future of Scalability Benchmarks: Moving Beyond Peak TPS

Why peak TPS is a misleading vanity metric, and how economic throughput, finality latency under load, and cost-per-provable-computation define the real battle for L2 supremacy.

Introduction

Peak TPS is a flawed metric. It measures theoretical throughput in a vacuum, ignoring critical constraints such as state growth, data availability costs, and real user demand patterns, and it fails to capture the real-world performance and economic efficiency of modern blockchain systems. Headline figures from chains like Solana or Sui often reflect synthetic, subsidized conditions.
The real bottleneck is state. Systems like Arbitrum Nitro and zkSync Era optimize for state management, not just raw transaction processing. Their effective scalability is defined by how cheaply they can prove and store state transitions on Ethereum.
Scalability is now multi-dimensional. Modern benchmarks must evaluate finality time, cost per user operation, and cross-domain interoperability latency. A protocol like Starknet achieves scalability through recursive proofs, a fundamentally different vector than Aptos's parallel execution.
Evidence: Arbitrum Nitro's execution engine can process transactions far faster than it can post them to Ethereum; its real constraint is the ~100 KB/sec of data availability bandwidth it can buy on L1, a limit that peak TPS ignores completely.
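To make this concrete, here is a minimal back-of-the-envelope sketch (pure Python, illustrative numbers only) of how data availability bandwidth, not execution speed, bounds a rollup's sustainable throughput. The bandwidth, transaction-size, and execution figures are assumptions for the example, not measurements of any named chain.

```python
# Illustrative only: how DA bandwidth bounds a rollup's sustainable TPS.
# All numbers are assumptions for the sake of the example.

da_bandwidth_bytes_per_sec = 100 * 1024   # assumed ~100 KB/s of L1 data availability
avg_compressed_tx_bytes = 150             # assumed compressed size of one rollup tx
internal_execution_tps = 50_000           # whatever the VM can execute in isolation

# The chain can only finalize what it can post to the DA layer.
da_bound_tps = da_bandwidth_bytes_per_sec / avg_compressed_tx_bytes
sustainable_tps = min(internal_execution_tps, da_bound_tps)

print(f"DA-bound TPS: {da_bound_tps:,.0f}")          # ~683 TPS under these assumptions
print(f"Sustainable TPS: {sustainable_tps:,.0f}")    # bounded by DA, not execution
```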
Executive Summary
Peak TPS is a vanity metric. The next generation of scalability benchmarks will measure real-world performance, economic security, and developer experience.
The Problem: TPS Measures Throughput, Not Utility
Advertised 100,000 TPS is meaningless if the network is congested during a memecoin frenzy. Peak capacity ignores real-world constraints like state growth, mempool dynamics, and validator load.
- Real Bottleneck: State bloat and I/O, not CPU.
- User Impact: High latency and failed transactions during peak demand.
- Industry Shift: Projects like Solana and Sui now emphasize Time-to-Finality and concurrent execution.
The Solution: The End-to-End Latency Stack
True scalability is measured from user click to guaranteed settlement. This stack includes sequencer inclusion, proof generation, and cross-chain verification.
- Key Layer: EigenDA and Avail decouple data availability from execution.
- Bottleneck Shift: Proof generation time on zkEVMs like zkSync and Scroll.
- New Metric: Settlement Assurance Time, combining soft and hard finality.
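As a minimal sketch of the Settlement Assurance Time idea, the snippet below (pure Python, hypothetical numbers) composes soft finality (sequencer inclusion) with hard finality (batch posting plus proof generation or a challenge window) into one end-to-end figure. The component latencies are illustrative assumptions, not measurements of any named chain.

```python
from dataclasses import dataclass

@dataclass
class LatencyStack:
    """End-to-end latency components for one user operation (seconds)."""
    rpc_and_mempool: float        # client -> sequencer inclusion
    sequencer_soft_confirm: float
    batch_posting: float          # batch lands on the DA layer / L1
    proof_or_challenge: float     # validity-proof generation OR fraud-proof window

    def soft_finality(self) -> float:
        # What the user perceives as "done".
        return self.rpc_and_mempool + self.sequencer_soft_confirm

    def settlement_assurance_time(self) -> float:
        # When settlement is actually guaranteed on L1.
        return self.soft_finality() + self.batch_posting + self.proof_or_challenge

# Hypothetical ZK rollup (~1 h proving) vs hypothetical optimistic rollup (7-day window).
zk = LatencyStack(0.3, 1.0, 600, 3_600)
op = LatencyStack(0.3, 0.5, 900, 7 * 24 * 3_600)

for name, stack in [("zk-rollup", zk), ("optimistic rollup", op)]:
    print(name, f"soft={stack.soft_finality():.1f}s",
          f"hard={stack.settlement_assurance_time():,.0f}s")
```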
The Problem: Isolated Benchmarks Ignore Composability
A chain can be fast in a vacuum but crumble under cross-chain load. The real network is a multi-chain system where LayerZero messages, Circle's CCTP transfers, and Wormhole NFTs create interdependent load.
- New Stress Test: Simultaneous bridge withdrawals and DEX arbitrage.
- Systemic Risk: Congestion on L1 (Ethereum) cascades to all L2s (Arbitrum, Optimism).
- Benchmark Gap: No standard for measuring cross-rollup congestion.
The Solution: Cost-Per-Unit-Value as the Ultimate Metric
Forget cost-per-transaction. Protocols and users care about cost-per-dollar-settled. This measures economic efficiency for high-value DeFi, NFT mints, and institutional transfers.
- VC Focus: Andreessen Horowitz (a16z) and Paradigm track this for portfolio chains.
- Protocol Example: Uniswap v4 hook execution cost vs. swap value.
- Real Scaling: A chain that's cheap for $10 swaps but expensive for $10M swaps has failed.
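A minimal sketch of the cost-per-dollar-settled idea, using made-up fee and volume figures purely for illustration: the same flat fee looks very different once it is normalized by the value it settles.

```python
# Illustrative cost-per-dollar-settled calculation; all figures are assumptions.
transactions = [
    {"kind": "retail swap",        "fee_usd": 0.25, "value_usd": 10},
    {"kind": "NFT mint",           "fee_usd": 0.25, "value_usd": 50},
    {"kind": "institutional swap", "fee_usd": 0.25, "value_usd": 10_000_000},
]

total_fees = sum(t["fee_usd"] for t in transactions)
total_value = sum(t["value_usd"] for t in transactions)

for t in transactions:
    # Fee paid per dollar of value settled (lower is better).
    print(t["kind"], f'{t["fee_usd"] / t["value_usd"]:.2e} $ per $ settled')

print("chain-wide cost per dollar settled:", f"{total_fees / total_value:.2e}")
```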
The Problem: Developer Velocity is the Hidden Tax
Scalability isn't just for users. Complex state access patterns, non-standard precompiles, and unpredictable gas costs on chains like Polygon zkEVM or Base cripple developer iteration speed.
- True Cost: Weeks of optimization for simple contracts.
- Ecosystem Risk: Developers flock to chains with predictable performance (Arbitrum Stylus, Fuel).
- Missing Benchmark: Time-to-first-successful-complex-dApp.
The Future: Adversarial Load Testing as a Service
The next benchmark standard will be continuous, adversarial simulation. Services like Chaos Labs will stress-test chains with realistic, malicious load to find breaking points before hackers do.
- Simulate: NFT mint rushes, oracle manipulation, and governance attacks.
- Benchmark: Sustained TPS under attack, not ideal conditions.
- Outcome: A security-scaling score for VCs and protocols.
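As a toy illustration of the "sustained TPS under attack" benchmark above, the sketch below simulates a spam burst against a chain with fixed capacity and a bounded mempool, then reports sustained throughput and the failed-transaction rate. It is a simplified queueing model with assumed parameters, not a real load-testing harness.

```python
import random

random.seed(7)

CAPACITY_TPS = 1_000   # assumed block-space capacity per second
BASELINE_TPS = 600     # assumed organic demand
ATTACK_TPS = 5_000     # assumed spam during a mint rush (seconds 30-60)
MAX_QUEUE = 20_000     # assumed mempool size before transactions are dropped

queue, included, dropped, submitted = 0, 0, 0, 0
for second in range(120):
    demand = BASELINE_TPS + (ATTACK_TPS if 30 <= second < 60 else 0)
    demand = int(demand * random.uniform(0.9, 1.1))    # jitter
    submitted += demand
    queue += demand
    if queue > MAX_QUEUE:                              # mempool overflow -> failed txs
        dropped += queue - MAX_QUEUE
        queue = MAX_QUEUE
    processed = min(queue, CAPACITY_TPS)
    included += processed
    queue -= processed

print(f"sustained TPS over the window: {included / 120:,.0f}")
print(f"failed-transaction rate: {dropped / submitted:.1%}")
```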
Thesis: TPS is a Vanity Metric, Throughput is a System Property
Peak TPS is a marketing number; real scalability is defined by sustainable throughput across the entire user journey.
Peak TPS is meaningless. It measures a single, isolated component under ideal lab conditions, ignoring network congestion, state growth, and data availability costs.
Throughput is a system property. It measures the end-to-end capacity of a network, from transaction submission to finality, including the mempool, sequencer, and DA layer.
Real-world systems bottleneck elsewhere. Solana's 65k TPS claim is irrelevant if the RPC layer or the state management system cannot sustain the load.
Evidence: Arbitrum Nitro's capacity is defined by its interaction with Ethereum for data posting, not its internal execution speed. Avalanche subnets illustrate the complementary lesson: sustainable throughput comes from partitioning load across chains, not from one chain's peak number.
The Current L2 Benchmark Circus
Peak TPS benchmarks are marketing theater that obscure the real constraints of production systems.
Peak TPS is meaningless. It measures a synthetic, single-application workload on a single, empty block. This ignores network congestion, cross-domain messaging, and the cost of data availability, which are the real bottlenecks for users.
The real metric is sustained throughput. A chain's capacity under a realistic, multi-application load with competing transactions determines its utility. Arbitrum and Optimism sustain on the order of ~50-100 TPS in production, far below the figures advertised in lab tests.
Benchmarks ignore the proving bottleneck. A zkEVM's peak TPS is irrelevant if its prover takes hours to generate a validity proof for that block. The constraint shifts from execution to proving latency and cost.
Evidence: The L2BEAT benchmark dashboard tracks real-world metrics like transaction costs and time-to-finality, exposing the gap between marketing claims and on-chain reality for networks like zkSync Era and Base.
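A minimal way to express the proving bottleneck described above: effective throughput is the minimum of the execution, data availability, and proving rates. The rates below are illustrative assumptions, not benchmarks of zkSync Era, Base, or any other network.

```python
# Effective throughput is the slowest stage of the pipeline, not the fastest.
# All rates are illustrative assumptions.

execution_tps = 4_000                      # what the VM can execute
da_tps = 700                               # what fits into L1 data availability
txs_per_batch = 10_000
prover_seconds_per_batch = 3_600           # assumed one hour to prove a batch
proving_tps = txs_per_batch / prover_seconds_per_batch
# (real systems parallelize proving across many machines; this assumes one prover)

effective_tps = min(execution_tps, da_tps, proving_tps)
bottleneck = min(
    [("execution", execution_tps), ("data availability", da_tps), ("proving", proving_tps)],
    key=lambda pair: pair[1],
)[0]

print(f"proving rate: {proving_tps:.1f} TPS")
print(f"effective TPS: {effective_tps:.1f} (bottleneck: {bottleneck})")
```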
The New Benchmark Matrix: A Protocol Comparison
Comparing next-generation blockchain scaling solutions across holistic performance, economic, and security dimensions.
| Metric / Feature | zkSync Era | Arbitrum Nitro | Starknet | Base |
|---|---|---|---|---|
| Peak Theoretical TPS | 2,000+ | 4,000+ | ~2,000 | — |
| Time to Finality (L1) | ~15 min | ~5 min | ~3-4 hours | ~12 min |
| Avg. Transaction Cost | $0.10 - $0.50 | $0.20 - $0.80 | $0.05 - $0.30 | $0.01 - $0.10 |
| Native Account Abstraction | Yes | No (ERC-4337 only) | Yes | No (ERC-4337 only) |
| Proof System | zk-SNARKs (ZK Rollup) | Optimistic Rollup | zk-STARKs (ZK Rollup) | Optimistic Rollup |
| Fraud/Validity Proof Time | ~1 hour | 7 days (challenge period) | ~3-4 hours | 7 days (challenge period) |
| Sequencer Decentralization | Planned (ZK Stack) | Planned | Planned (decentralized provers first) | Centralized (Coinbase) |
| EVM Bytecode Compatibility | Yes (custom VM) | Yes (full EVM) | No (Cairo VM) | Yes (full EVM) |
Deconstructing the Three Pillars of Real Scalability
Peak TPS is a vanity metric; real scalability requires optimizing for throughput, cost, and finality simultaneously.
Scalability is a three-body problem. Isolating a single metric like TPS creates a false benchmark. A chain must balance transaction throughput, user cost, and finality time. Optimizing for one degrades the others, as seen in Solana's trade-offs between speed and reliability.
Throughput without cost control is useless. A system processing 100k TPS with $10 fees fails. Real scaling requires sub-cent transaction costs at scale, which is the core innovation of data availability layers like Celestia and EigenDA.
Finality is the silent killer. Users perceive speed as finality, not block inclusion. Fast confirmation (e.g., Solana's ~400 ms slots, Sui's sub-second finality) defines user experience, while optimistic rollups like Arbitrum carry a 7-day challenge window before hard L1 finality.
Evidence: The modular stack wins. By separating execution, settlement, and data availability, chains like Arbitrum Orbit and Optimism Superchain optimize each pillar independently. This architecture achieves the sustainable scaling that monolithic L1s cannot.
Architectural Trade-offs in Practice
Peak TPS is a marketing metric. Real-world scalability is defined by the trade-offs between throughput, cost, and decentralization under load.
The Problem: Peak TPS Ignores State Growth
Advertised 1M TPS is meaningless if node hardware requirements double yearly. The real bottleneck isn't computation, but state bloat that centralizes infrastructure.
- Solana's ~1.2 TB annual state growth demands enterprise-grade hardware.
- Ethereum's roughly 1 TB full-node footprint is mitigated by Erigon's flat storage layout.
- The benchmark that matters: State Growth per 1000 TPS.
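A minimal sketch of the "State Growth per 1,000 TPS" metric proposed above, normalizing annual state growth by sustained throughput. The growth and throughput figures are illustrative assumptions, not measured values for any named chain.

```python
# State growth normalized by throughput; all inputs are illustrative assumptions.
chains = {
    # name: (annual state growth in GB, sustained TPS)
    "high-throughput L1": (1_200, 1_500),
    "rollup A":           (150,     100),
    "rollup B":           (40,       50),
}

for name, (growth_gb, sustained_tps) in chains.items():
    gb_per_kilo_tps = growth_gb / (sustained_tps / 1_000)
    print(f"{name}: {gb_per_kilo_tps:,.0f} GB of state per year per 1,000 TPS")
```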
The Solution: Cost-Per-Transaction Under Congestion
Sustainable scaling is measured by how costs behave during a mempool flood. Base fee volatility is the true stress test.
- Ethereum L1 fees can spike to $200+ during an NFT mint.
- Solana fails this test entirely during congestion, requiring priority fees and experiencing ~50% failed tx rates.
- Arbitrum Nitro and zkSync Era maintain sub-$0.10 median fees by leveraging L1 security for data, not execution.
The Benchmark: Time-to-Finality Distribution
Users experience latency distributions, not averages. A chain with 2s avg finality but a 60s 99th percentile is unreliable for DeFi.
- Avalanche subnets achieve ~1-2s finality but can suffer from cross-subnet latency.
- Polygon zkEVM leverages Ethereum for ~10 min finality, trading speed for supreme security.
- The key metric is Finality SLA: % of transactions final within a guaranteed window.
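A minimal sketch of the Finality SLA idea: compute the share of transactions finalized within a guaranteed window and compare the mean against tail percentiles. The synthetic latency sample is an assumption for illustration, not data from any chain.

```python
import random
from statistics import mean, quantiles

random.seed(42)

# Synthetic finality latencies (seconds): mostly fast, with a congested tail.
latencies = [random.gauss(2.0, 0.3) for _ in range(950)] + \
            [random.uniform(20, 60) for _ in range(50)]

SLA_WINDOW_SECONDS = 5.0
within_sla = sum(1 for x in latencies if x <= SLA_WINDOW_SECONDS) / len(latencies)
qs = quantiles(latencies, n=100)
p50, p99 = qs[49], qs[98]

print(f"mean finality: {mean(latencies):.1f}s")        # looks fine on its own
print(f"p50 / p99 finality: {p50:.1f}s / {p99:.1f}s")  # the tail tells the real story
print(f"Finality SLA (<= {SLA_WINDOW_SECONDS}s): {within_sla:.1%}")
```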
The Reality: Decentralization Under Load
High TPS often requires sacrificing validator decentralization. The critical ratio is Nodes / Throughput.
- Solana's ~1,500 validators support high TPS but with high hardware costs, creating centralization pressure.
- Celestia's data availability sampling allows light nodes to scale with the network, preserving decentralization.
- Monad's parallel EVM targets 10k TPS but its decentralization will be proven by its validator set size at launch.
The New Standard: Total Cost of Operation (TCO)
Protocols must be evaluated on the full cost to secure and use them: L1 security costs + L2 operational costs + cross-chain messaging fees. A rough per-transaction calculation is sketched after the list below.
- Optimism's Bedrock upgrade reduced L1 data costs by ~50%, directly lowering TCO.
- zkRollups like StarkNet have high proving costs but near-zero L1 settlement costs, optimizing for high-volume batches.
- Polygon 2.0's interconnected ZK L2s aim to minimize TCO via shared liquidity and security.
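A rough sketch of the TCO decomposition above, summing L1 data/settlement costs, L2 operating costs, and cross-chain messaging into a single per-transaction figure. Every cost and volume figure here is an assumed placeholder, not a measured number for any named protocol.

```python
# Total Cost of Operation per finalized transaction; all inputs are assumptions.
monthly_txs = 30_000_000

monthly_costs_usd = {
    "L1 data / settlement (blobs, proofs posted)":  400_000,
    "L2 operations (sequencer, prover, RPC infra)": 250_000,
    "cross-chain messaging subsidies":               50_000,
}

tco_per_tx = sum(monthly_costs_usd.values()) / monthly_txs
for item, cost in monthly_costs_usd.items():
    print(f"{item}: ${cost / monthly_txs:.4f} per tx")
print(f"Total Cost of Operation: ${tco_per_tx:.4f} per tx")
```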
The Frontier: Application-Specific Benchmarks
Generic TPS is dead. Scalability is now measured per use case: Perps Latency, NFT Mint Gas Cost, Cross-Swap Slippage.
- dYdX v4 on Cosmos is built for ~1,000 orders/sec with sub-10ms matching engine latency.
- Immutable zkEVM benchmarks >100k NFT mints for < $1 total gas, a metric irrelevant to Uniswap.
- The future is benchmark suites, not single numbers.
Counterpoint: Why Developers Still Care About Simple TPS
Peak TPS remains a critical, if flawed, metric for developer adoption because it directly impacts user experience and operational costs.
TPS is a proxy for cost. Developers building high-frequency applications like on-chain games or perpetual DEXs need predictable, low transaction fees. A high sustained TPS indicates a chain's capacity to handle load without gas price spikes, directly affecting user retention and protocol economics.
Simple benchmarks accelerate integration. When evaluating a new L2 like Arbitrum Nova or zkSync Era, a developer's first technical filter is throughput. A published peak TPS figure, while incomplete, provides a rapid, comparable baseline for initial architecture decisions, unlike complex multi-dimensional benchmarks.
User experience is defined by latency. For end-users, the difference between a 2-second and a 10-second finality is the difference between a usable product and an abandoned cart. TPS under load directly correlates with this latency, making it a non-negotiable performance indicator for consumer apps.
Evidence: The migration of NFT projects from Ethereum L1 to Polygon and Solana was primarily driven by the need for higher TPS at lower cost to enable minting and trading at scale, proving the metric's practical weight in adoption decisions.
Frequently Challenged Questions
Common questions about the evolution of blockchain scalability metrics beyond simple transaction throughput.
Why is peak TPS a bad metric for comparing blockchains?
Peak TPS is a bad metric because it measures an unrealistic, isolated lab condition, not real-world user experience. It ignores network congestion, transaction finality time, and the cost of decentralization. Benchmarks must evolve to measure sustained throughput under load, time-to-finality, and cost-per-transaction to reflect actual utility.
The Endgame: Benchmarks as a Commodity
Peak TPS becomes a meaningless vanity metric as the industry standardizes on composable, intent-based execution layers.
Benchmarks will standardize. The current fragmented landscape of TPS claims from Solana, Sui, and Aptos will converge on a common methodology, likely driven by the Ethereum community's EIP-4844 and blob fee markets. This creates a commodity data layer for performance.
The metric shifts to cost-per-unit-of-work. Developers will not ask 'how fast?' but 'how cheap to prove a batch of swaps?'. This mirrors the evolution from raw compute to AWS's cost-per-API-call model, making execution cost the primary benchmark.
Intent-centric architectures obviate TPS. Protocols like UniswapX and CowSwap abstract execution away from users. The relevant benchmark becomes the solver's economic efficiency and the settlement layer's finality speed, not the chain's raw throughput.
Evidence: Arbitrum Stylus demonstrates this shift by pricing WASM execution in fine-grained sub-gas units rather than in transactions per second. The market will price performance in gas-equivalent units across all L2s and alt-L1s.
Actionable Takeaways for Builders & Investors
Forget peak TPS. The next generation of scaling will be defined by composability, economic security, and user experience.
The Problem: TPS is a Vanity Metric
Advertised peak TPS ignores the real bottlenecks: state growth, cross-domain latency, and the cost of finality. A chain claiming 100k TPS is meaningless if its state becomes a 10 TB archive or cross-chain messages take 10 minutes.
- Key Insight: Measure time-to-finality and cost-per-finalized-transaction, not raw throughput.
- Action: Audit scaling claims by testing sustained load under adversarial conditions (e.g., spam attacks).
The Solution: Modular & Intent-Centric Architectures
Scalability is now a system design problem. Decouple execution (rollups, high-throughput L1s), settlement (Ethereum), and data availability (Celestia, EigenDA).
- Key Insight: Modular stacks (e.g., using Celestia for DA) can reduce L2 operating costs by >90%.
- Action: Build for intent-based flows (see UniswapX, CowSwap) where users declare outcomes, not transactions, abstracting away chain boundaries.
The Metric: Economic Throughput (TVL * Velocity)
Real scalability is value moved securely per unit time. A chain with $1B TVL and high velocity is more scalable than one with $10B TVL that is stagnant; a minimal calculation is sketched after the list below.
- Key Insight: Track Total Value Secured (TVS) and capital rotation rates. Protocols like EigenLayer monetize security directly.
- Action: For L1s/L2s, incentivize high-value, high-frequency applications (DeFi, gaming) not just NFT mints.
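A minimal sketch of the economic-throughput metric defined above: TVL multiplied by capital velocity, which is simply the value settled per period. The TVL and volume figures are illustrative assumptions.

```python
# Economic throughput = value secured x how fast that value turns over.
# All TVL and volume figures are illustrative assumptions.
chains = {
    # name: (TVL in USD, 30-day settled volume in USD)
    "chain A": (1_000_000_000, 15_000_000_000),
    "chain B": (10_000_000_000, 4_000_000_000),
}

for name, (tvl, monthly_volume) in chains.items():
    velocity = monthly_volume / tvl                 # capital rotations per month
    economic_throughput = tvl * velocity            # equals monthly settled volume
    print(f"{name}: velocity {velocity:.1f}x/month, "
          f"economic throughput ${economic_throughput:,.0f}/month")
```

Under these assumed numbers, the smaller chain moves nearly four times more value per month, which is the sense in which it is "more scalable" despite holding one tenth of the TVL.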
The Bottleneck: Interoperability is the New Scaling Frontier
Scalability is worthless if it's siloed. The future is multi-chain, requiring seamless asset and state movement.
- Key Insight: Universal interoperability layers (LayerZero, Chainlink CCIP, Axelar) and shared security models (Polygon AggLayer, Cosmos IBC) are critical infrastructure.
- Action: Evaluate bridges and messaging protocols on latency (~2-5 sec for optimistic, ~1 min for zk), security guarantees (economic vs. cryptographic), and cost.
The Shift: User-Observed Latency is King
Users don't care about block time; they care about perceived speed from click to confirmation. This requires optimizing the entire stack, from RPCs to pre-confirmations.
- Key Insight: Instant pre-confirmations from sequencers or restaked-validator schemes (e.g., EigenLayer-based preconfirmations) and fast-finality chains (Solana, Sei) set the new UX standard.
- Action: Implement local fee markets and priority lanes to guarantee sub-second feedback for end-users, even during congestion.
The Reality: Scalability Requires Sustainable Economics
High throughput that bankrupts validators or relies on unsustainable token emissions is a dead end. Fee revenue must cover hardware and security costs.
- Key Insight: Analyze a chain's cost-of-production (hardware, bandwidth) vs. protocol revenue. Solana and Monad push hardware limits; Ethereum L2s inherit security but pay for it.
- Action: For investors, model validator profitability. For builders, design for real fee revenue, not just token incentives.