Why Your Rollup's TPS Benchmark Is Probably Wrong
A first-principles breakdown of why advertised TPS metrics are misleading. We dissect three critical blind spots (adversarial conditions, cross-domain overhead, and state bloat) that render most benchmarks useless for production planning.
Your TPS is a vanity metric. It measures a synthetic, best-case workload that ignores real-world constraints like mempool dynamics, state growth, and cross-domain messaging.
Introduction
Published TPS figures are marketing artifacts, not engineering benchmarks.
The benchmark is the bottleneck. A rollup's true capacity is set by its slowest component (the sequencer, the data availability layer, or state access patterns), not by a theoretical compute limit.
Evidence: Arbitrum Nitro's 2M TPS claim is for fraud proof verification, not user transactions. Real throughput is constrained by Ethereum's calldata costs and the sequencer's mempool processing.
The Three Lies of TPS Benchmarks
Peak TPS numbers are meaningless without context on real-world conditions, cost, and decentralization trade-offs.
The Problem: Synthetic Load vs. Real-World Traffic
Benchmarks use simple token transfers, ignoring the computational weight of real dApp logic. Smart contract execution and state growth are the true bottlenecks, not raw signature verification.
- Real TPS under complex operations (e.g., Uniswap swaps) is often <10% of advertised peak.
- Tests ignore network effects: mempool congestion and sequencer queuing create non-linear latency spikes as load approaches capacity (see the queueing sketch below).
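Why non-linear? A minimal M/M/1 queueing sketch shows the shape. This is an illustrative model, not a measurement of any real sequencer, and the 10k TPS service rate is an assumption:

```python
# Minimal M/M/1 queueing sketch: illustrative only. Assumes Poisson
# arrivals and exponential service times; real mempools are burstier,
# so this is the *optimistic* case.

def mm1_avg_latency(arrival_tps: float, service_tps: float) -> float:
    """Average time a transaction spends queued + processing (seconds)."""
    if arrival_tps >= service_tps:
        return float("inf")  # queue grows without bound
    return 1.0 / (service_tps - arrival_tps)

service_tps = 10_000  # assumed benchmarked peak capacity
for load in (0.50, 0.80, 0.95, 0.99):
    latency = mm1_avg_latency(load * service_tps, service_tps)
    print(f"{load:.0%} load -> avg latency {latency * 1000:.1f} ms")
# 0.2 ms at 50% load vs 10 ms at 99% load: a 50x blowup well before
# the advertised peak TPS is ever reached.
```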
The Problem: Ignoring the Data Availability Bottleneck
High TPS is useless if the data isn't available for verification. Ethereum calldata and even blob space have hard, shared limits.
- A rollup claiming 100,000 TPS would instantly saturate Ethereum's blob target of ~0.375 MB per 12-second block (roughly 32 KB/s, shared across all rollups); see the arithmetic below.
- Solutions like validiums and EigenDA trade off security for scale, a critical detail omitted from marketing.
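The saturation claim is a two-line calculation. The EIP-4844 target (3 blobs of 128 KB per 12-second block) is spec-defined; the ~100 bytes per compressed transaction is an assumption that varies widely by rollup:

```python
# Back-of-envelope DA saturation check. The blob target comes from
# EIP-4844; bytes/tx is an assumed figure for illustration.

BLOB_TARGET_BYTES_PER_BLOCK = 3 * 128 * 1024  # 393,216 bytes
BLOCK_TIME_S = 12
BYTES_PER_TX = 100                             # assumption

da_bandwidth = BLOB_TARGET_BYTES_PER_BLOCK / BLOCK_TIME_S  # ~32 KB/s
claimed_tps = 100_000
required = claimed_tps * BYTES_PER_TX                      # ~10 MB/s

print(f"DA target bandwidth:  {da_bandwidth / 1024:.0f} KB/s")
print(f"Needed for 100k TPS:  {required / 1024:,.0f} KB/s "
      f"({required / da_bandwidth:.0f}x over budget)")
# ~32 KB/s available vs ~9,766 KB/s required: over 300x the entire
# blob budget, before any other rollup posts a single byte.
```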
The Problem: Centralization for the Leaderboard
Achieving record numbers requires a single, centralized sequencer with optimized hardware, defeating the purpose of a decentralized L2. Decentralizing the sequencer (e.g., via Espresso, Astria) introduces consensus latency, slashing benchmark speeds.
- The trade-off is stark: 100k TPS (centralized) vs. ~1k TPS (decentralized).
- True scalability requires modular design that separates execution, consensus, and DA.
The Adversarial Load Fallacy
Rollup TPS claims collapse because benchmarks omit real-world, adversarial transaction patterns.
Benchmarks measure optimal conditions. They use simple transfers or identical contract calls, ignoring the computational diversity of real on-chain activity. A network processing 10,000 identical swaps is not equivalent to one processing 10,000 mixed DeFi, NFT, and gaming transactions.
Real load is adversarial and unpredictable. Users submit transactions that maximize their own profit, creating state contention and complex execution paths. This exposes bottlenecks in the sequencer, mempool, and state access patterns that synthetic benchmarks miss entirely.
The testnet-to-mainnet gap proves this. Networks like Arbitrum and Optimism demonstrate stable throughput under controlled loads but face congestion during airdrops or major NFT mints. Their sustained TPS under stress is the only metric that matters.
Evidence: The mempool is the bottleneck. A sequencer claiming 100k TPS in a closed test will choke on a flood of MEV-bundled transactions from Flashbots or Jito. Real throughput is gated by the sequencer's ingestion and ordering logic, not just execution speed.
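The state-contention point above can be quantified with Amdahl's law: if a fraction of transactions touch the same hot state (one AMM pool, one mint contract), that fraction executes serially and parallel throughput collapses. A minimal sketch with assumed numbers:

```python
# Amdahl's-law sketch of state contention: transactions touching a
# shared hot key must execute serially, the rest parallelize.
# All figures below are illustrative assumptions.

def effective_tps(single_lane_tps: float, lanes: int, hot: float) -> float:
    """Throughput when `hot` fraction of txs serialize on one key."""
    speedup = 1.0 / (hot + (1.0 - hot) / lanes)
    return single_lane_tps * speedup

single_lane = 2_000  # assumed per-core execution rate
lanes = 16
print(f"benchmark (hot=0%):  {effective_tps(single_lane, lanes, 0.0):,.0f} TPS")
print(f"NFT mint (hot=50%):  {effective_tps(single_lane, lanes, 0.5):,.0f} TPS")
print(f"airdrop  (hot=90%):  {effective_tps(single_lane, lanes, 0.9):,.0f} TPS")
# 32,000 -> ~3,765 -> ~2,207 TPS: the same hardware, an order of
# magnitude apart, purely from transaction mix.
```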
The Cross-Domain Messaging Tax
Rollup throughput benchmarks are misleading because they ignore the latency and cost of finalizing transactions across domains.
Benchmarks measure isolated execution. Rollup TPS counts only L2 state updates, ignoring the cross-domain messaging required for finality on Ethereum. A user's transaction is not complete until proven on L1.
The tax is latency, not just gas. Protocols like Across and Stargate optimize for cost, but the sequencer-to-L1 finality delay creates a multi-block confirmation window where funds are locked.
Proof posting is the bottleneck. Even with 100k TPS, a ZK-rollup like zkSync must batch and post a validity proof to Ethereum, which acts as a global throughput governor for all connected chains.
Evidence: Arbitrum Nitro processes ~40k TPS internally, but its Ethereum calldata submission is rate-limited by L1 block space, creating a practical ceiling far lower than advertised.
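The messaging tax is easy to model: end-to-end finality is the sum of sequencer soft confirmation, batch interval, L1 inclusion, and the proof or challenge delay. The values below are representative assumptions, not measurements of any specific rollup:

```python
# End-to-end finality model. All values are assumed, representative
# figures; only the 7-day fraud-proof window is a protocol constant
# for typical optimistic rollups.

def finality_seconds(soft_conf: float, batch_interval: float,
                     l1_inclusion: float, proof_or_challenge: float) -> float:
    """Worst-case path from tx submission to L1 finality."""
    return soft_conf + batch_interval + l1_inclusion + proof_or_challenge

optimistic = finality_seconds(
    soft_conf=0.25,                    # sequencer receipt
    batch_interval=60,                 # batch posted every ~minute
    l1_inclusion=12,                   # one Ethereum block
    proof_or_challenge=7 * 24 * 3600,  # fraud-proof window
)
zk = finality_seconds(0.25, 60, 12, 3600)  # ~1 h proof generation

print(f"optimistic: soft conf 0.25 s, L1 finality {optimistic / 3600:.0f} h")
print(f"zk rollup:  soft conf 0.25 s, L1 finality {zk / 60:.0f} min")
# A TPS figure says nothing about this window, during which bridged
# funds are locked or trust-dependent.
```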
The Hidden Cost of State Growth
Comparing how different state management strategies impact real-world throughput and long-term viability, moving beyond synthetic TPS claims.
| State Management Metric | Monolithic L1 (e.g., Ethereum Mainnet) | Optimistic Rollup (e.g., Arbitrum, Optimism) | ZK Rollup (e.g., zkSync Era, Starknet) | Stateless Client / Verkle (Future Ethereum) |
|---|---|---|---|---|
| State Growth per 1,000 TPS (GB/year) | ~8,760 | ~438 (~20x compression) | ~8.76 (~1,000x compression) | < 0.1 (witness-based) |
| Node Sync Time (from genesis) | | ~3-5 days | ~1-2 days | < 1 hour |
| State Bloat Tax (annual fee inflation for archival nodes) | 15-20% | 5-8% | 1-3% | ~0% |
| Witness Size per Block | N/A (full state) | ~1-5 MB | ~10-50 KB | ~100-500 KB |
| Prover/Verifier Overhead (time per tx) | N/A | ~5-20 ms (fraud-proof checks; the challenge window applies separately) | ~50-200 ms (ZK proof generation) | ~1-5 ms (Verkle proof verification) |
| Data Availability Cost per MB (est.) | $768 (calldata) | $38.40 (blob storage) | $0.38 (validity proof + DA layer) | $0.10 (blob storage + proof) |
| Supports State Expiry / History Pruning | | | | |
| Developer Friction (state access patterns) | Unrestricted, high cost | Unrestricted, medium cost | Circuit-constrained, low cost | Witness-constrained, very low cost |
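The first row of the table is arithmetic worth checking yourself: at 1,000 TPS, annual growth is bytes-of-new-state-per-transaction times ~31.5 million seconds. The ~8,760 GB/year figure implies roughly 278 bytes of new state per transaction, which is an assumption that varies heavily by workload:

```python
# State-growth arithmetic behind the table's first row. The
# bytes-per-tx figure is an assumption; real values depend on the
# mix of account and storage-slot writes.

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

def state_growth_gb_per_year(tps: float, state_bytes_per_tx: float) -> float:
    return tps * state_bytes_per_tx * SECONDS_PER_YEAR / 1e9

print(f"{state_growth_gb_per_year(1_000, 278):,.0f} GB/year")        # ~8,767 (monolithic)
print(f"{state_growth_gb_per_year(1_000, 278 / 1000):.2f} GB/year")  # ~8.77 (ZK, ~1,000x)
```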
The Optimist's Rebuttal (And Why It's Wrong)
Peak TPS is a synthetic metric that ignores the real-world constraints of state growth and data availability.
Synthetic benchmarks ignore state growth. Your 100k TPS claim assumes a clean-slate state. Real applications like Uniswap and Aave create complex state dependencies that bloat the Merkle tree, slowing proof generation and increasing L1 settlement costs.
Data availability is the true bottleneck. High TPS requires cheap data posting. Solutions like Celestia or EigenDA provide cost relief, but they shift the bottleneck to cross-chain bridging latency and security assumptions, creating new trade-offs.
Real throughput requires real users. A benchmark transferring ETH between two funded wallets is meaningless. Realistic workloads involve token approvals, NFT minting, and DEX swaps, which have 5-10x higher gas costs per logical user operation.
Evidence: Arbitrum Nitro's peak of ~40k TPS was achieved in a controlled, single-application stress test. Its sustained, multi-app mainnet throughput is two orders of magnitude lower, proving the benchmark gap.
Actionable Takeaways for Builders
Most rollup TPS benchmarks are marketing fluff that ignore real-world constraints. Here's how to measure what actually matters.
The Data Availability Bottleneck
Your sequencer's local TPS is irrelevant if the DA layer can't keep up. Ethereum calldata tops out around ~0.1 MB/s, blobs add only ~32 KB/s at the current 3-blob target, while Celestia and EigenDA advertise throughput on the order of 10-100 MB/s. The DA layer defines your real throughput ceiling.
- Key Metric: DA throughput (MB/s) vs. your rollup's data footprint.
- Action: Benchmark with full transaction inclusion, not just mempool ordering; the ceiling calculator below makes the cap concrete.
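A small helper makes the ceiling concrete. The bandwidth figures echo the ones above (two of them advertised, not guaranteed), and the 150-byte compressed transaction size is an assumption:

```python
# TPS ceiling imposed purely by DA bandwidth, ignoring execution.
# Bandwidth figures echo the text above; bytes/tx is an assumption.

DA_BANDWIDTH_BYTES_S = {
    "Ethereum calldata":        100_000,      # ~0.1 MB/s
    "Ethereum blobs (target)":  32_768,       # 3 x 128 KB / 12 s
    "Celestia (advertised)":    10_000_000,
    "EigenDA (advertised)":     100_000_000,
}
BYTES_PER_TX = 150  # assumed compressed size

for name, bandwidth in DA_BANDWIDTH_BYTES_S.items():
    print(f"{name:25s} -> {bandwidth / BYTES_PER_TX:>9,.0f} TPS ceiling")
# Whatever your sequencer benches locally, this is the hard cap on
# *verifiable* throughput at ~150 compressed bytes per transaction.
```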
State Growth is the Silent Killer
High TPS accelerates state bloat, crippling node sync times and hardware requirements. This is the Avalanche C-Chain and early Solana problem.
- Key Metric: State growth rate (GB/day) per 1k TPS.
- Action: Implement history pruning (Ethereum's EIP-4444) and plan for state expiry or stateless validity proofs from day one.
The Congestion Contagion Effect
Your rollup doesn't exist in a vacuum. A surge on Arbitrum or Base can congest the shared L1 settlement layer, delaying your proofs or data posts. This creates unpredictable finality.
- Key Metric: L1 base fee during competitor peak loads.
- Action: Model worst-case L1 gas prices and use EIP-4844 blobs for cost predictability (see the blob-fee sketch below).
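Blob pricing is mechanical, so you can model the worst case directly. The sketch below follows the EIP-4844 spec's `fake_exponential` pseudocode and constants; the sustained-full-blocks scenario is the assumed stress case:

```python
# EIP-4844 blob base fee model, using the spec's fake_exponential
# and constants. Scenario: competitors keep every block at max blobs.

MIN_BLOB_BASE_FEE = 1                       # wei (spec constant)
BLOB_BASE_FEE_UPDATE_FRACTION = 3_338_477   # spec constant
GAS_PER_BLOB = 131_072                      # 2**17
TARGET_BLOBS, MAX_BLOBS = 3, 6

def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator)."""
    i, output, accum = 1, 0, factor * denominator
    while accum > 0:
        output += accum
        accum = accum * numerator // (denominator * i)
        i += 1
    return output // denominator

excess = 0
for block in range(1, 51):
    excess += (MAX_BLOBS - TARGET_BLOBS) * GAS_PER_BLOB  # full blocks
    fee = fake_exponential(MIN_BLOB_BASE_FEE, excess,
                           BLOB_BASE_FEE_UPDATE_FRACTION)
    if block % 10 == 0:
        print(f"block {block:3d} (~{block * 12:4d}s): blob base fee {fee:,} wei")
# The fee grows ~12.5% per full block, doubling in roughly a minute
# of sustained congestion. Model this curve; don't assume today's price.
```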
Ignore P2P Network Limits at Your Peril
A centralized sequencer can process 100k TPS, but propagating blocks to a decentralized network of nodes is slower. This is the Solana validator hardware arms race problem.
- Key Metric: Block propagation time across global nodes.
- Action: Test with a geo-distributed node set, not a local cluster, and track the Nakamoto coefficient of your validator set (computed in the sketch below).
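The Nakamoto coefficient is the minimum number of validators whose combined stake crosses a control threshold (commonly 1/3, the BFT liveness bound). A minimal implementation, with a hypothetical stake distribution for illustration:

```python
# Nakamoto coefficient: minimum number of validators whose combined
# stake exceeds a control threshold (1/3 for BFT liveness).

def nakamoto_coefficient(stakes: list[float], threshold: float = 1 / 3) -> int:
    total = sum(stakes)
    running, count = 0.0, 0
    for stake in sorted(stakes, reverse=True):  # largest first
        running += stake
        count += 1
        if running > threshold * total:
            return count
    return count

stakes = [40, 25, 10, 8, 6, 5, 3, 2, 1]  # hypothetical distribution
print(nakamoto_coefficient(stakes))       # 1: one validator exceeds 1/3
```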
Benchmark Real Transactions, Not Transfers
Advertised TPS always uses simple transfers. Real dApp traffic involves complex Uniswap swaps, NFT mints, and zk-proof verifications, which are 10-100x more expensive in gas/compute.
- Key Metric: Gas Units per Transaction (GU/Tx) for your target dApps.
- Action: Create a benchmark suite mirroring your expected production mix (a gas-weighted sketch follows).
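A gas-weighted mix turns an advertised transfer-only figure into a realistic estimate. The gas costs below are rough mainnet figures, and the traffic shares are assumptions to adapt to your own dApp profile:

```python
# Gas-weighted effective TPS for a production-like mix. Gas costs
# are rough mainnet figures; the traffic shares are assumptions.

WORKLOAD = [  # (share of traffic, gas per tx)
    (0.40, 21_000),   # ETH transfer
    (0.25, 65_000),   # ERC-20 transfer
    (0.20, 150_000),  # DEX swap
    (0.10, 180_000),  # NFT mint
    (0.05, 400_000),  # complex DeFi (leverage, batched ops)
]

def effective_tps(gas_per_second: float) -> float:
    avg_gas = sum(share * gas for share, gas in WORKLOAD)
    return gas_per_second / avg_gas

# A chain benched at 10,000 TPS on pure transfers sustains
# 10,000 * 21,000 = 210M gas/s. Under the mix above:
print(f"avg gas/tx:    {sum(s * g for s, g in WORKLOAD):,.0f}")
print(f"effective TPS: {effective_tps(10_000 * 21_000):,.0f}")
# ~92,650 gas/tx -> ~2,266 TPS: under a quarter of the headline number.
```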
The Finality vs. Throughput Trade-Off
Optimistic rollups like Optimism offer fast soft confirmations but 7-day finality. ZK-rollups like zkSync have slower proving times but ~1 hour finality. Your advertised TPS must specify which metric it uses.
- Key Metric: Time-to-finality (TTF) at claimed TPS.
- Action: Clearly communicate the TTF curve to your users and integrators.
Get In Touch
Get in touch today. Our experts will offer a free quote and a 30-minute call to discuss your project.