Free 30-min Web3 Consultation
Book Consultation
Smart Contract Security Audits
View Audit Services
Custom DeFi Protocol Development
Explore DeFi
Full-Stack Web3 dApp Development
View App Services

The Future of Data Indexing in a Modular Blockchain World

The modular blockchain thesis is breaking monolithic indexers. We analyze why The Graph's one-size-fits-all model is unsustainable and how a new market for rollup-native, high-performance indexing will emerge.

THE DATA FRAGMENTATION PROBLEM

Introduction

Modular blockchains solve scaling but create a new, critical bottleneck: fragmented and inaccessible data.

Modular architectures fragment data. Separating execution, settlement, and data availability across layers like Arbitrum, Celestia, and EigenDA breaks the monolithic database model, making holistic data queries impossible for a single node.

The indexing layer is now critical infrastructure. Applications need a unified view across rollups and chains, transforming projects like The Graph and Substreams from optional tools into mandatory data plumbing for user-facing apps.

Real-time indexing defines performance. Ethereum's 12-second block time sets a latency floor for data read from L1, yet users expect sub-second updates. This forces indexers to consume streams directly from sources such as Espresso's fast-finality layer or Avail's data availability layer.

THE DATA LAYER

The Core Argument: The Indexing Stack Must Modularize

Monolithic indexing is a scaling bottleneck; the future is a modular stack of specialized components.

Monolithic indexing architectures fail because they bundle data ingestion, processing, and querying. This creates a single point of failure and prevents scaling individual components, as seen with The Graph's subgraph syncing delays during high-throughput events.

Modularization separates concerns into distinct layers: a data availability layer (Celestia, EigenDA), a compute/execution layer (RISC Zero, Jolt), and a query layer. This mirrors the L2 scaling playbook, applying it to the data access problem.

Specialization unlocks performance. A dedicated proving layer for indexing logic, like using RISC Zero, allows verifiable computation. A separate query engine can then serve cached, proven results at sub-second latency without re-executing logic.
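
The separation of ingestion, processing, and querying described above can be sketched in a few lines. This is a minimal illustration, not any real indexer's design: the `RawEvent` shape, the sample blocks, and the per-wallet total are all hypothetical, and the proving step is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RawEvent:
    chain: str
    block: int
    wallet: str
    amount: int


def ingest(blocks: Iterable[dict]) -> List[RawEvent]:
    """Ingestion stage: flatten raw blocks into events and do nothing else."""
    return [
        RawEvent(b["chain"], b["number"], e["wallet"], e["amount"])
        for b in blocks
        for e in b["events"]
    ]


def process(events: Iterable[RawEvent]) -> Dict[str, int]:
    """Processing stage: the indexing logic (here, a total per wallet)."""
    totals: Dict[str, int] = {}
    for ev in events:
        totals[ev.wallet] = totals.get(ev.wallet, 0) + ev.amount
    return totals


class QueryStore:
    """Query stage: serves precomputed results without re-running the logic."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = {}

    def load(self, totals: Dict[str, int]) -> None:
        self._totals = dict(totals)

    def total_for(self, wallet: str) -> int:
        return self._totals.get(wallet, 0)


# Hypothetical sample data standing in for two rollup blocks.
SAMPLE_BLOCKS = [
    {"chain": "rollup-a", "number": 1,
     "events": [{"wallet": "0xabc", "amount": 10}]},
    {"chain": "rollup-a", "number": 2,
     "events": [{"wallet": "0xabc", "amount": 5},
                {"wallet": "0xdef", "amount": 7}]},
]

store = QueryStore()
store.load(process(ingest(SAMPLE_BLOCKS)))
```

Because each stage only depends on the output of the previous one, any of the three could be scaled or replaced independently, which is the property the argument relies on.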

The precedent is established. Just as rollups separated execution from consensus, the indexing stack must follow. Protocols like Hyperliquid and dYdX v4 building their own app-chains prove the demand for sovereign, performant data access.

FEATURED SNIPPETS

The Cost of Universality: Indexing Latency & Cost Matrix

A first-principles comparison of data indexing architectures, quantifying the trade-offs between universal coverage and specialized performance.

| Core Metric / Capability | Universal Indexer (The Graph) | Specialized RPC (Alchemy, QuickNode) | Application-Specific Indexer (dYdX, Uniswap) |
|---|---|---|---|
| Indexing Latency (Block to Query) | 2-12 seconds | < 1 second | < 500 milliseconds |
| Cost per 1M Queries (Approx.) | $5-15 | $50-200 | $0 (Sunk Dev Cost) |
| Multi-Chain Coverage (EVM, Solana, Cosmos) | | | |
| Subgraph Deployment & Maintenance | | | |
| Guaranteed State Consistency | | | |
| Custom Business Logic at Index Layer | | | |
| Time to New Chain Integration | Weeks (Subgraph Dev) | Days (RPC Node Spin-up) | Months (Full Stack Dev) |
| Protocol Example | The Graph, Goldsky | Alchemy, QuickNode, Chainstack | dYdX v4, Uniswap Labs API |
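
To make the cost row concrete, the sketch below scales the per-million-query ranges from the matrix to a monthly query volume. The 10M queries/month figure is an assumed workload chosen for illustration, and the application-specific column is omitted because its cost is sunk development effort rather than a per-query price.

```python
from typing import Tuple

# Per-million-query price ranges in USD, copied from the matrix above.
COST_PER_MILLION_USD = {
    "universal_indexer": (5, 15),
    "specialized_rpc": (50, 200),
}


def monthly_cost_range(queries_per_month: int, architecture: str) -> Tuple[float, float]:
    """Return the (low, high) monthly cost in USD for a given query volume."""
    low, high = COST_PER_MILLION_USD[architecture]
    millions = queries_per_month / 1_000_000
    return (low * millions, high * millions)


if __name__ == "__main__":
    # Assumed workload: 10 million queries per month.
    for name in COST_PER_MILLION_USD:
        print(name, monthly_cost_range(10_000_000, name))
```

At that assumed volume the universal indexer lands at roughly $50-150 per month and the specialized RPC at roughly $500-2,000, so the gap between the two is about an order of magnitude.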

THE DATA LAYER

Deep Dive: The Technical Inevitability of Fragmentation

Modular architecture fragments application state, making traditional indexers obsolete and creating a new market for decentralized data infrastructure.

Fragmentation is a feature of modular blockchains. Separating execution from consensus and data availability forces application logic to span multiple specialized layers like Arbitrum, Celestia, and EigenDA. This architectural shift breaks the monolithic database model where a single node indexes all state.

Traditional indexers like The Graph fail in this environment. Their subgraph model assumes a single, queryable chain. A modular app's state exists across rollups, DA layers, and co-processors, creating a coordination problem that monolithic indexers cannot solve.

The solution is a new data mesh. Indexing becomes a network of specialized adapters—one for each execution environment and data availability layer. Projects like Subsquid and Goldsky are building this, treating each rollup as a distinct data source to be aggregated.
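
One way to picture the adapter network is a common interface that each data source implements, with an aggregator on top. The sketch below is illustrative only: `SourceAdapter`, `StaticAdapter`, and the balance data are hypothetical names and values, not the API of Subsquid, Goldsky, or any rollup.

```python
from typing import Dict, Iterable, Protocol


class SourceAdapter(Protocol):
    """What the aggregator needs from any execution or DA layer."""
    name: str

    def balance(self, wallet: str) -> int: ...


class StaticAdapter:
    """Stand-in adapter backed by a dict; a real one would read a chain."""

    def __init__(self, name: str, data: Dict[str, int]) -> None:
        self.name = name
        self._data = data

    def balance(self, wallet: str) -> int:
        return self._data.get(wallet, 0)


def unified_balance(adapters: Iterable[SourceAdapter], wallet: str) -> Dict[str, int]:
    """Query every adapter and return per-source values plus a total."""
    per_source = {a.name: a.balance(wallet) for a in adapters}
    total = sum(per_source.values())
    return {**per_source, "total": total}


# Hypothetical per-rollup balances for a single wallet.
ADAPTERS = [
    StaticAdapter("arbitrum", {"0xabc": 120}),
    StaticAdapter("optimism", {"0xabc": 30}),
    StaticAdapter("base", {}),
]

view = unified_balance(ADAPTERS, "0xabc")
```

Adding a new rollup then means writing one more adapter rather than changing the aggregator.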

This creates a market for data proofs. Simply aggregating data is insufficient; users need cryptographic guarantees of correctness across domains. Future indexers will integrate zk-proofs or optimistic verification to become trust-minimized data oracles for cross-chain state.
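
As a rough intuition for what a "cryptographic guarantee" on indexed data looks like, the sketch below uses a Merkle inclusion proof. This is a much simpler stand-in for the zk or optimistic schemes mentioned above, not an implementation of them: the indexer commits to a set of rows with one root hash, and a client checks that a returned row belongs to that set. The row format and function names are assumptions.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling_is_left)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: List[bytes], index: int) -> Tuple[bytes, Proof]:
    """Build the tree over `leaves` and return (root, proof for leaves[index])."""
    level = [h(leaf) for leaf in leaves]
    proof: Proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Recompute the path from the leaf to the root using the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical indexed rows the indexer commits to.
ROWS = [b"arbitrum:0xabc:120", b"optimism:0xabc:30", b"base:0xabc:0"]
root, proof = merkle_root_and_proof(ROWS, 2)
```

A client holding only `root` can call `verify(ROWS[2], proof, root)` to accept the row, while a tampered row fails the check. Proving that the rows themselves were computed correctly is the harder problem that zk or optimistic verification would address.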

Evidence: The total value locked across the top 10 rollups exceeds $20B, but no existing indexer provides a unified view of liquidity and positions across Arbitrum, Optimism, and Base. This gap defines the product-market fit.

THE FUTURE OF DATA INDEXING

Protocol Spotlight: Early Movers in the New Stack

As blockchains fragment into modular layers, the old query model is breaking. These protocols are building the new data infrastructure.

01

The Graph: From Monolith to Supernet

The incumbent is shifting from a monolithic L1 indexing model toward a more modular one: its protocol contracts now run on Arbitrum One, and Substreams adds a parallelized streaming pipeline alongside subgraphs. This modularizes indexing logic, enabling real-time data streams and massive parallelization.
  • Key Benefit: Unlocks sub-second latency for high-frequency dApps like perps.
  • Key Benefit: Cost predictability via rollup-based execution, decoupling from mainnet gas.

~500ms Stream Latency
30k+ Subgraphs
02

Goldsky: The Real-Time Data Firehose

Goldsky bypasses traditional RPC polling to deliver real-time event streams directly to applications. It's the infrastructure for the intent-based future, powering UX for protocols like UniswapX and CowSwap.
  • Key Benefit: Sub-100ms data delivery, enabling instant UI updates.
  • Key Benefit: Declarative data pipelines that developers configure, not code.

<100ms Event Delivery
10x Dev Speed
03

The Problem: RPCs Are Not Indexers

Standard JSON-RPC endpoints are state-query machines, not designed for complex historical queries or aggregations. Asking an RPC for "all Uniswap swaps by wallet X" is like using a screwdriver as a hammer.
  • Consequence: dApp frontends become bloated, performing client-side aggregation which fails at scale.
  • Consequence: Creates centralization pressure as teams rely on a single Infura/Alchemy node for complex logic.

10k+ RPC Calls/UI
5-10s UI Lag
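
The "all swaps by wallet X" example can be sketched as a contrast between scanning raw logs on every request and consulting an index built once. The log records below are hypothetical and far simpler than real event logs; the point is only the access pattern.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical, simplified swap logs; real logs would be decoded from events.
LOGS = [
    {"block": 1, "wallet": "0xabc", "pair": "ETH/USDC"},
    {"block": 2, "wallet": "0xdef", "pair": "ETH/DAI"},
    {"block": 3, "wallet": "0xabc", "pair": "WBTC/ETH"},
]


def swaps_by_scan(logs: List[dict], wallet: str) -> List[dict]:
    """Client-side aggregation: walk every log on every request."""
    return [log for log in logs if log["wallet"] == wallet]


def build_index(logs: List[dict]) -> Dict[str, List[dict]]:
    """Indexer approach: group logs by wallet once, ahead of time."""
    index: Dict[str, List[dict]] = defaultdict(list)
    for log in logs:
        index[log["wallet"]].append(log)
    return index


INDEX = build_index(LOGS)


def swaps_by_index(wallet: str) -> List[dict]:
    """Serve the same answer with a single dictionary lookup."""
    return INDEX.get(wallet, [])
```

Both functions return the same rows, but the scan costs work proportional to the whole log history per request, while the indexed lookup pays that cost once at build time.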
04

The Solution: Decoupled Execution & Proving

The new stack separates data ingestion, computation, and proof generation. Protocols like EigenLayer AVS (e.g., Hyperbolic) and RISC Zero allow indexers to prove their query results are correct without re-executing every block.
  • Key Benefit: Verifiable data APIs that apps can trust without running a node.
  • Key Benefit: Horizontal scaling of indexing workloads across cheap cloud compute.

-90% Compute Cost
ZK-Proofs Verification
05

Storage Layers Are The New Source of Truth

With data availability layers like Celestia, EigenDA, and Avail, the canonical chain is no longer the primary data source. Indexers must now ingest from multiple DA layers and rollups, creating a multi-chain indexing problem.
  • Key Benefit: Enables universal data queries across any modular chain.
  • Key Benefit: Future-proofs infrastructure against the rise of sovereign rollups.

10+ DA Sources
Unlimited Rollup Scale
06

Who Wins? The Orchestrator

The winning protocol won't just index faster; it will orchestrate a marketplace of specialized indexers. Think Across Protocol but for data, routing queries to the optimal indexer (Goldsky for speed, The Graph for breadth, a ZK-prover for security).
  • Key Benefit: Best-in-class performance for every query type via intelligent routing.
  • Key Benefit: Economic efficiency through competitive indexing markets, not fixed staking.

Marketplace Model
Intent-Based Routing
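
A toy version of the orchestrator's routing decision might look like the following. The indexer profiles, their latencies, and their chain coverage are invented placeholders rather than measurements of any real service; the sketch only shows the shape of "filter by requirements, then pick the fastest".

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class IndexerProfile:
    name: str
    latency_ms: int
    chains: FrozenSet[str]
    verifiable: bool  # whether results come with a proof


def route(
    indexers: List[IndexerProfile],
    chain: str,
    max_latency_ms: Optional[int] = None,
    need_proof: bool = False,
) -> IndexerProfile:
    """Pick the lowest-latency indexer that satisfies the query's constraints."""
    candidates = [
        i for i in indexers
        if chain in i.chains
        and (not need_proof or i.verifiable)
        and (max_latency_ms is None or i.latency_ms <= max_latency_ms)
    ]
    if not candidates:
        raise LookupError(f"no indexer can serve chain={chain!r} under these constraints")
    return min(candidates, key=lambda i: i.latency_ms)


# Hypothetical marketplace participants with made-up characteristics.
MARKET = [
    IndexerProfile("fast-streamer", 80, frozenset({"arbitrum", "base"}), False),
    IndexerProfile("broad-network", 3000, frozenset({"ethereum", "arbitrum", "base", "optimism"}), False),
    IndexerProfile("zk-prover", 2000, frozenset({"arbitrum"}), True),
]
```

For example, `route(MARKET, "arbitrum")` selects the fast streamer, while `route(MARKET, "arbitrum", need_proof=True)` falls back to the slower but verifiable prover. A real marketplace would also need pricing and reputation, which this sketch ignores.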
THE LEGACY CONSTRAINT

Counter-Argument: Can't The Graph Just Adapt?

The Graph's monolithic architecture is fundamentally misaligned with the modular execution and data availability demands of modern blockchains.

The Graph's monolithic architecture is its core constraint. Its design assumes a single, unified data source, which is incompatible with the fragmented data availability landscape of rollups and Layer 2s like Arbitrum and Optimism.

Adapting requires a full-stack rebuild. To index from Celestia or EigenDA, The Graph must re-architect its node software, consensus, and economic model. This is a multi-year engineering challenge, not a simple upgrade.

The economic model breaks. Indexers stake on Ethereum mainnet but must pay for data from external DA layers. This creates a capital efficiency and settlement mismatch that native, chain-specific indexers avoid entirely.

Evidence: Market share erosion. Emerging chains like Solana and Sui are building their own indexing stacks, and The Graph's subgraph deployment growth on new L2s trails the position it holds on Ethereum mainnet.

THE FRAGMENTATION TRAP

Risk Analysis: What Could Go Wrong?

Modular blockchains solve scaling but create a data indexing nightmare for applications.

01

The Data Availability Black Box

Indexers must now trust external Data Availability (DA) layers like Celestia, EigenDA, or Avail. If a DA layer censors or loses data, the indexer's state becomes corrupted, breaking downstream applications. This creates systemic risk for protocols like The Graph or Subsquid that rely on historical data integrity.

  • Risk: Unrecoverable state forks from DA failures.
  • Impact: Breaks DeFi oracles and NFT provenance.
100% State Corruption
~2-3s DA Finality Lag
02

Cross-Chain Indexing Latency Arbitrage

In a modular stack, finality is asynchronous across execution, settlement, and DA layers. Fast indexers reading from an execution layer (e.g., Arbitrum) can serve data that is sequenced but not yet final, before the settlement layer (e.g., Ethereum) confirms the rollup's state. This opens a multi-layer MEV attack vector where arbitrage bots exploit timing gaps in indexed data feeds.

  • Risk: Front-running based on indexing speed differentials.
  • Vector: Exploits gap between L2 inclusion and L1 finality.
~12s L2->L1 Delay
$M+ MEV Opportunity
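
One mitigation an indexer could apply is to label every row with its finality status so consumers can decide how much risk to accept. The sketch below assumes the indexer already knows the highest L2 block settled on L1 (`last_settled_l2_block`); how that value is obtained is out of scope, and the row shape is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Finality(Enum):
    SEQUENCED = "sequenced"  # included on L2, not yet settled on L1
    SETTLED = "settled"      # covered by state confirmed on L1


@dataclass(frozen=True)
class IndexedRow:
    l2_block: int
    payload: str


def finality_of(row: IndexedRow, last_settled_l2_block: int) -> Finality:
    """Classify a row by comparing its L2 block to the settled watermark."""
    if row.l2_block <= last_settled_l2_block:
        return Finality.SETTLED
    return Finality.SEQUENCED


def settled_only(rows: List[IndexedRow], last_settled_l2_block: int) -> List[IndexedRow]:
    """Drop rows that could still change before L1 settlement."""
    return [r for r in rows if finality_of(r, last_settled_l2_block) is Finality.SETTLED]


# Hypothetical rows; assume L1 has settled everything up to L2 block 101.
ROWS = [IndexedRow(100, "swap A"), IndexedRow(101, "swap B"), IndexedRow(105, "swap C")]
SAFE_ROWS = settled_only(ROWS, last_settled_l2_block=101)
```

A latency-sensitive UI could show sequenced rows with a warning, while an oracle or settlement-critical consumer would read only `SAFE_ROWS`.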
03

The Interoperability Tax on Query Cost

Indexing a user's activity across Ethereum + 5 L2s + a DA layer requires aggregating data from multiple RPC endpoints and proving data consistency. This multiplies infrastructure costs and query complexity. Projects like Goldsky or Covalent face 10-100x cost inflation versus indexing a single chain, making real-time cross-chain queries economically non-viable for most dApps.

  • Risk: Indexing becomes a capital-intensive oligopoly.
  • Result: Kills long-tail dApp innovation.
10-100x Cost Increase
5+ Endpoints/User
04

Sequencer-Level Censorship

Centralized sequencers (e.g., in Optimism, Arbitrum) control transaction ordering and data publication. They can withhold or reorder transaction data before it reaches the DA layer, making it impossible for decentralized indexers to build a canonical timeline. This gives sequencers the power to manipulate indexed states for DeFi apps, akin to Flashbots on steroids.

  • Risk: Indexers see only the sequencer's curated reality.
  • Control Point: Single entity dictates historical record.
1 Central Point of Failure
100% Data Control
THE MODULAR DATA STACK

Future Outlook: The 2024-2025 Indexing Landscape

Indexing infrastructure will fragment to serve specialized data needs across modular execution, settlement, and DA layers.

Specialized indexers win. Generic subgraphs on The Graph fail for high-throughput rollups and novel VM states. Projects like EigenLayer AVSs and RISC Zero will index and prove specific data streams, creating a market for verifiable off-chain compute.

Data availability dictates architecture. Indexers for Celestia or EigenDA require different sync patterns than Ethereum archival nodes. This creates layer-specific tooling, fragmenting the monolithic indexer model into a composable service mesh.

The query layer commoditizes. Competition between The Graph, Subsquid, and Goldsky pushes cost to zero. Value accrues to the proving layer—services that generate ZK proofs of query correctness become the premium product.

Evidence: The Graph's support for a new chain like Base often lags by weeks, while dedicated RPC providers like Alchemy serve custom data in real time, which points to demand for specialization.

THE FUTURE OF DATA INDEXING

Key Takeaways for Builders and Investors

In a modular world where execution, settlement, and data availability are disaggregated, the indexing layer becomes the critical abstraction for application logic.

01

The Graph's Subgraphs Are a Legacy Monolith

Subgraphs are monolithic, chain-specific indexing definitions (a manifest, a schema, and mapping code) that must be redeployed for every chain. This creates fragmented data silos and ~$100k+ annual costs for multi-chain dApps.
  • Problem: No cross-chain querying, forcing developers to manage a separate subgraph for each chain.
  • Solution: Next-gen indexers like Goldsky and Subsquid use a schema-first, multi-chain data lake approach, enabling a single query across Ethereum, Arbitrum, and Polygon.

-80% Dev Ops
10+ Chains Indexed
02

Intent-Based Queries Will Eat Batch Processing

Traditional indexing is a push model: index everything, filter later. This wastes ~70% of compute on unused data.
  • Problem: Inefficient for real-time, user-specific intents (e.g., "find my best liquidity route").
  • Solution: PropellerHeads and RISC Zero are pioneering ZK-proof-based query engines. Users submit an intent, and a prover generates a ZK-proof of the query result in ~2 seconds, consuming only the necessary data.

100x Efficiency Gain
~2s Proof Time
03

The Indexer is the New RPC

As applications demand richer data (historical states, event correlations), simple JSON-RPC calls are insufficient. The indexing layer becomes the primary data gateway.
  • Problem: RPCs like Alchemy and Infura offer low-level chain data, not application-ready abstractions.
  • Solution: Indexers like Covalent and Blockpour provide unified APIs that return structured, business-logic-ready data, abstracting away the underlying Celestia, EigenDA, or Avail data availability layer.

1 API, All Chains
500ms P95 Latency
04

Indexing is the Ultimate MEV Surface

Who controls the indexer controls the data lens and the arbitrage opportunities. Fast, proprietary indexing is a competitive moat for DeFi protocols.
  • Problem: Public indexers create a level playing field, revealing opportunities to everyone simultaneously.
  • Solution: Protocols like Uniswap (with UniswapX) and Aave are building internal, ultra-low-latency indexers to power their own intent-based systems, capturing ~$1B+ in annual MEV that would otherwise leak to searchers.

$1B+ MEV Captured
<100ms Alpha Window
05

Decentralization is a Scaling Trade-Off

Fully decentralized indexing networks (e.g., The Graph) suffer from higher latency and cost volatility due to tokenomics and coordination overhead.
  • Problem: ~5-10 second query latency is unacceptable for high-frequency trading or gaming applications.
  • Solution: Hybrid models are winning: use a centralized, performant indexer for real-time queries (Goldsky, Subsquid) and a decentralized network for censorship-resistant archival data and verification, similar to how EigenLayer restaking borrows Ethereum's security for separate services.

5-10s Decentralized Latency
<1s Hybrid Latency
06

The Vertical Integration Play: Indexing-As-A-Settlement

The logical endpoint is for rollups and app-chains to bundle a native indexer as part of their state transition function.
  • Problem: External indexers add latency and are a trust assumption outside the chain's security model.
  • Solution: Fuel Network with its native Sway language and Movement Labs with MoveVM are architecting state models where indexing is a first-class primitive. This enables native intent matching and ~$0.001 query costs baked into transaction fees.

$0.001 Query Cost
Native to VM
Why The Graph Will Fragment in a Modular Blockchain World | ChainScore Blog