A subgraph on The Graph is a self-contained data indexing unit that defines which blockchain data to index, how to process it, and how to expose it through a GraphQL API. It consists of three core components: a subgraph manifest (subgraph.yaml) that specifies the smart contracts, events, and block ranges to monitor; a GraphQL schema (schema.graphql) that defines the shape of the queryable data; and mapping scripts (written in AssemblyScript) that translate raw blockchain events into the entities defined in the schema. This structure lets developers build a customized, high-performance index of on-chain data without operating their own indexing infrastructure.
The Graph Subgraph
What is The Graph Subgraph?
A technical definition of a core component in The Graph's decentralized data indexing protocol.
The process begins when a subgraph is deployed to a Graph Node, which scans the specified blockchain for the defined events. When a matching event is found, the node executes the corresponding mapping logic. This logic, or handler, takes the raw event data, can perform calculations or fetch additional state from the chain, and then saves the processed information as typed entities to its internal store. This transformation from low-level log data to structured entities is the core value of a subgraph, enabling efficient queries that would be prohibitively slow or complex to run directly against an Ethereum node or other blockchain client.
Subgraphs power many decentralized applications in the Web3 ecosystem, serving as the data layer for DeFi dashboards, NFT marketplaces, and analytics platforms. For example, a Uniswap subgraph indexes data for every swap, mint, and burn event across all pools, making it possible to query historical trading volumes, liquidity provider returns, or real-time token prices with a single, fast GraphQL call. By abstracting away the complexities of direct chain interaction and data processing, subgraphs allow dApp frontends to be as responsive and feature-rich as traditional web applications.
How Does a Subgraph Work?
A subgraph is a data indexing specification that defines how to ingest, process, and serve blockchain data from a specific smart contract or protocol.
A subgraph operates by defining a manifest (subgraph.yaml) that specifies the smart contract addresses, the blockchain events to monitor, and the mapping logic to translate those events into queryable data. This manifest is compiled and deployed to a Graph Node, which begins a continuous process of indexing. The node scans the blockchain for the specified events, executes the mapping logic—written in AssemblyScript—to transform the raw event data, and stores the resulting structured data in a high-performance database. This process creates a real-time, queryable data layer that is decoupled from the blockchain's consensus mechanism, enabling fast and efficient data retrieval.
The core of a subgraph's logic is its mapping functions. These are event handlers written in AssemblyScript, a strict subset of TypeScript. When the Graph Node detects a specified event, such as a Transfer on an ERC-20 contract or a Swap on a decentralized exchange, it executes the corresponding mapping function. This function's job is to load or create entities (the subgraph's data model objects), update their fields based on the event data, and save them back to the store. This mapping process transforms raw, low-level log data into a structured, application-ready format, populating the entities that the schema exposes through the queryable API.
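The load, update, and save steps of a mapping function can be sketched in plain TypeScript. This is a simulation only: a real mapping is written in AssemblyScript against generated entity classes, whereas here a Map stands in for the Graph Node store, and the Account entity, the event shape, and the addresses are hypothetical names chosen for illustration.

```typescript
// Plain-TypeScript simulation of a mapping handler (not real AssemblyScript).
// The Map stands in for the Graph Node store; Account is a hypothetical entity.

interface TransferEvent {
  from: string;
  to: string;
  value: number; // simplified; on-chain token amounts are large integers
}

interface Account {
  id: string;
  balance: number;
}

const ZERO_ADDRESS = "0x0000000000000000000000000000000000000000";
const accountStore = new Map<string, Account>();

// Load the entity if it exists, otherwise create it with default values.
function loadOrCreateAccount(id: string): Account {
  const existing = accountStore.get(id);
  return existing !== undefined ? existing : { id, balance: 0 };
}

// Handler invoked once per Transfer event, in chain order.
function handleTransfer(event: TransferEvent): void {
  // Mints come from the zero address, so there is no sender balance to debit.
  if (event.from !== ZERO_ADDRESS) {
    const sender = loadOrCreateAccount(event.from);
    sender.balance -= event.value;
    accountStore.set(sender.id, sender); // save before loading the receiver
  }
  const receiver = loadOrCreateAccount(event.to);
  receiver.balance += event.value;
  accountStore.set(receiver.id, receiver);
}

// Replay two events: a mint of 10 to alice, then alice sends 4 to bob.
handleTransfer({ from: ZERO_ADDRESS, to: "0xalice", value: 10 });
handleTransfer({ from: "0xalice", to: "0xbob", value: 4 });
console.log(accountStore.get("0xalice"), accountStore.get("0xbob"));
```

Saving the sender before loading the receiver keeps the result correct even when both addresses are the same account.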
Once indexed, the data is served via a GraphQL API endpoint. Developers can query this API using precise GraphQL queries to fetch exactly the data their dApp needs, such as a user's token balances, a list of recent transactions, or aggregated protocol statistics. This eliminates the need for dApps to run their own complex and resource-intensive indexing infrastructure. The decentralized network of Indexers, who stake the protocol's native GRT token, compete to serve these queries reliably and earn query fees. This separation of indexing and querying is fundamental to The Graph's architecture, creating a scalable data layer for Web3.
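As a rough illustration of what such a query looks like from the client side, the sketch below builds the JSON body that would be sent in an HTTP POST to a subgraph's GraphQL endpoint. The transfers collection and its from, to, value, and timestamp fields are assumptions about a hypothetical schema, as is the String type of the filter variable; the actual names and types depend on the subgraph being queried.

```typescript
// Builds a GraphQL request body for a hypothetical subgraph schema.
// Entity and field names (transfers, from, to, value, timestamp) are assumed.

interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
}

function buildRecentTransfersRequest(account: string, first: number): GraphQLRequest {
  const query = `
    query RecentTransfers($account: String!, $first: Int!) {
      transfers(
        first: $first
        orderBy: timestamp
        orderDirection: desc
        where: { from: $account }
      ) {
        id
        to
        value
      }
    }`;
  // Addresses are lowercased here on the assumption that the subgraph
  // stores them in lowercase hex form.
  return { query, variables: { account: account.toLowerCase(), first } };
}

// The serialized body is what a dApp would POST to the endpoint.
const requestBody = JSON.stringify(buildRecentTransfersRequest("0xABCDEF", 5));
console.log(requestBody);
```

The client asks only for the fields it needs (id, to, value), which is the main practical difference from scanning raw logs over RPC.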
Key Features of a Subgraph
A subgraph is a data indexing specification that defines how to ingest, process, and serve blockchain data via The Graph's decentralized network.
Manifest & Schema
The manifest (subgraph.yaml) is the configuration file that defines the subgraph's data sources (smart contracts, events) and mapping logic. The GraphQL schema (schema.graphql) defines the shape of the queryable data, specifying entities and their relationships. These files are the blueprint for the indexer.
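To make the manifest's role concrete, the sketch below models a subset of its fields as TypeScript types and derives the event-to-handler lookup that an indexer conceptually needs. The real manifest is a YAML file with more fields than shown; the data source name Token, the placeholder address, the spec version, and the helper handlersByEvent are illustrative assumptions rather than part of any official API.

```typescript
// A subset of subgraph manifest fields, modeled as TypeScript types.
// All concrete values below are illustrative placeholders.

interface EventHandler {
  event: string;   // event signature as declared in the contract ABI
  handler: string; // name of the mapping function to run
}

interface DataSource {
  kind: string;
  name: string;
  network: string;
  source: { address: string; abi: string; startBlock: number };
  mapping: {
    language: string;
    file: string;
    entities: string[];
    eventHandlers: EventHandler[];
  };
}

interface Manifest {
  specVersion: string;
  schema: { file: string };
  dataSources: DataSource[];
}

const exampleManifest: Manifest = {
  specVersion: "0.0.5",
  schema: { file: "./schema.graphql" },
  dataSources: [
    {
      kind: "ethereum",
      name: "Token",
      network: "mainnet",
      source: {
        address: "0x0000000000000000000000000000000000000000",
        abi: "Token",
        startBlock: 0,
      },
      mapping: {
        language: "wasm/assemblyscript",
        file: "./src/mapping.ts",
        entities: ["Account", "Transfer"],
        eventHandlers: [
          {
            event: "Transfer(indexed address,indexed address,uint256)",
            handler: "handleTransfer",
          },
        ],
      },
    },
  ],
};

// Hypothetical helper: map "dataSource:eventSignature" to the handler name.
function handlersByEvent(manifest: Manifest): Record<string, string> {
  const table: Record<string, string> = {};
  manifest.dataSources.forEach((dataSource) => {
    dataSource.mapping.eventHandlers.forEach((entry) => {
      table[`${dataSource.name}:${entry.event}`] = entry.handler;
    });
  });
  return table;
}

console.log(handlersByEvent(exampleManifest));
```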
Mappings & Handlers
Mapping functions (written in AssemblyScript and compiled to WebAssembly) are the core logic that transforms raw blockchain events into the structured data defined in the schema. Event handlers (e.g., handleTransfer) are triggered by specific contract events, processing the data and saving it to the Graph Node's store.
Decentralized Indexing
Once deployed, a subgraph is indexed by a decentralized network of Indexers who operate nodes. They stake GRT to provide indexing and query processing services, earning rewards and fees. This removes reliance on a single centralized server for data availability.
Queryable API
The primary output of a subgraph is a GraphQL API endpoint. Applications query this endpoint using precise GraphQL queries to fetch indexed data, such as user balances, transaction histories, or aggregated protocol metrics, without needing to process raw blockchain logs.
Deterministic Indexing
Deterministic indexing is a core principle: given the same blockchain history and subgraph code, any indexer will produce an identical data store. This determinism is crucial for the decentralized network's security and consistency, allowing verifiable attestations of correct indexing work.
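The idea can be shown with a toy model in TypeScript: two independent runs over the same event history produce the same store, so a digest of that store can be compared between indexers. The event shape, the snapshot format, and the simple checksum are assumptions made for this sketch; they do not reproduce the protocol's actual proof mechanism.

```typescript
// Toy model of deterministic indexing: same history in, same store out.
// The checksum is a simple illustrative hash, not the protocol's real proof.

interface BalanceEvent {
  account: string;
  delta: number;
}

// Pure function of the event history: no clocks, randomness, or network calls.
function indexEvents(events: BalanceEvent[]): Map<string, number> {
  const store = new Map<string, number>();
  events.forEach((event) => {
    store.set(event.account, (store.get(event.account) ?? 0) + event.delta);
  });
  return store;
}

// Serialize with sorted keys so the output does not depend on insertion order.
function snapshot(store: Map<string, number>): string {
  return Array.from(store.keys())
    .sort()
    .map((key) => `${key}=${store.get(key)}`)
    .join(";");
}

function checksum(text: string): number {
  let hash = 7;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 31 + text.charCodeAt(i)) % 1000000007;
  }
  return hash;
}

const history: BalanceEvent[] = [
  { account: "alice", delta: 5 },
  { account: "bob", delta: 3 },
  { account: "alice", delta: -2 },
];

// Two "indexers" process the same history independently.
const indexerA = indexEvents(history);
const indexerB = indexEvents(history);
console.log(checksum(snapshot(indexerA)) === checksum(snapshot(indexerB))); // prints true
```

Determinism holds here because the handler logic depends only on its inputs; anything that varies between machines would break the comparison.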
Versioning & Upgrades
Subgraphs are versioned, and each deployed version is immutable. Developers can deploy new versions to add features, fix bugs, or support new contracts. Curation, in which curators signal GRT on a subgraph, helps guide users and Indexers to the highest-quality, most useful versions of a subgraph.
Ecosystem Usage & Examples
A subgraph is a specialized data indexing specification that defines how to ingest, process, and serve blockchain data from a specific smart contract or application. The examples in this section illustrate how subgraphs are used in practice.
Subgraphs for NFT Metadata & Analytics
A technical overview of how subgraphs serve as the foundational data layer for querying and analyzing NFT collections on blockchains like Ethereum.
In The Graph protocol, a subgraph is an open API that indexes and organizes blockchain data; for NFTs, it makes metadata, transaction history, and ownership records queryable using GraphQL. It functions by defining a data schema and mapping logic that listens for specific on-chain events, such as Transfer or Mint, and processes them into queryable entities stored by a decentralized network of indexers. For NFTs, this transforms raw, scattered blockchain logs into structured information about collections, individual tokens, traits, and market activity, enabling efficient data retrieval for applications without needing to scan the entire chain.
The architecture of an NFT subgraph centers on its manifest (subgraph.yaml), which specifies the smart contracts to index, the events to watch, and the handlers that process them. Key entities typically defined include Collection, Token, Transfer, and User. When a new NFT is minted or transferred, the subgraph's mapping code written in AssemblyScript executes, creating or updating these entities in the underlying database. This process abstracts the complexity of direct blockchain interaction, allowing a dApp to simply query, for example, "all tokens owned by this address" or "the trading volume for this collection in the last 24 hours" with a single, fast GraphQL call.
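A plain-TypeScript sketch of that flow is shown below, with a Map standing in for the underlying database. It is a simulation under stated assumptions: the Token entity fields, the handler name handleNftTransfer, and the helper tokensOwnedBy are hypothetical, and in a real subgraph the ownership lookup would be a GraphQL query rather than a function call.

```typescript
// Simulated NFT indexing: Transfer events update Token entities, which can
// then answer "all tokens owned by this address". Names are hypothetical.

interface NftTransfer {
  tokenId: string;
  from: string;
  to: string;
}

interface Token {
  id: string;
  owner: string;
  transferCount: number;
}

const tokenStore = new Map<string, Token>();

// Create the Token on first sight (a mint), otherwise update its owner.
function handleNftTransfer(event: NftTransfer): void {
  const existing = tokenStore.get(event.tokenId);
  const token: Token =
    existing !== undefined
      ? existing
      : { id: event.tokenId, owner: event.to, transferCount: 0 };
  token.owner = event.to;
  token.transferCount += 1;
  tokenStore.set(token.id, token);
}

// Stand-in for the GraphQL query a dApp would send to the subgraph.
function tokensOwnedBy(owner: string): string[] {
  return Array.from(tokenStore.values())
    .filter((token) => token.owner === owner)
    .map((token) => token.id)
    .sort();
}

const ZERO = "0x0000000000000000000000000000000000000000";
handleNftTransfer({ tokenId: "1", from: ZERO, to: "0xalice" }); // mint
handleNftTransfer({ tokenId: "2", from: ZERO, to: "0xalice" }); // mint
handleNftTransfer({ tokenId: "1", from: "0xalice", to: "0xbob" });
console.log(tokensOwnedBy("0xalice"), tokensOwnedBy("0xbob"));
```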
For developers and analysts, subgraphs unlock powerful NFT analytics and metadata standardization. They enable features like real-time rarity scoring by indexing trait data, calculating floor prices across marketplaces, and tracking provenance through complete transfer histories. Prominent examples include the subgraphs for CryptoPunks and Bored Ape Yacht Club, which provide canonical, community-maintained datasets. By decentralizing this data layer, The Graph helps keep NFT metadata and analytics resilient, transparent, and independent of any single centralized server, and it serves as the data backbone for many NFT marketplaces, galleries, and analytics dashboards in the Web3 ecosystem.
Subgraph vs. Other Data Access Methods
A technical comparison of approaches for querying blockchain data, highlighting the trade-offs between development complexity, performance, and decentralization.
| Feature | The Graph Subgraph | Direct Node RPC | Centralized Indexer API |
|---|---|---|---|
| Data Abstraction | Declarative GraphQL schema | Raw transaction/block data | Proprietary REST/GraphQL |
| Indexing Logic | Manifest & Mapping Handlers | Custom application code | Managed by provider |
| Query Performance | Optimized for complex historical queries | Limited to recent blocks; slow for history | High, but dependent on provider scale |
| Decentralization | Decentralized network of Indexers | Depends on node provider (centralized or self-hosted) | Centralized service provider |
| Data Freshness | Near real-time (synced to chain head) | Real-time | Real-time to batched updates |
| Development Overhead | High initial setup, low maintenance | Very high (build and maintain entire stack) | Low (API integration only) |
| Cost Model | Query fee market (GRT) | Infrastructure/hosting costs | Subscription or per-call fees |
| Data Integrity | Cryptographically verifiable proofs | Trusted node operator | Trusted API provider |
Technical Details & Components
A deep dive into the core technical components of The Graph, the decentralized protocol for indexing and querying blockchain data. This section explains the architecture and mechanics of subgraphs, the fundamental data units that power decentralized applications.
A subgraph is a data indexing specification that defines which blockchain data to index, how to transform it, and how to serve it via a GraphQL API. It works by specifying a smart contract, the events to listen for, and a set of mapping functions written in AssemblyScript that translate raw blockchain data into a queryable schema.
Key Components:
- Subgraph Manifest (subgraph.yaml): The configuration file that defines the data sources, event handlers, and the mapping file.
- Schema (schema.graphql): A GraphQL schema defining the entities (data types) that will be stored and queried.
- Mapping (mapping.ts): Code that processes event data and saves it to the Graph Node's store according to the schema.
Once a subgraph is deployed, a Graph Node scans the blockchain for the specified events, runs the mapping logic, and populates a database, making the indexed data available via its GraphQL endpoint.
Common Misconceptions
Clarifying frequent misunderstandings about The Graph's core indexing technology, its architecture, and its role in the decentralized data ecosystem.
A subgraph is not a smart contract; it is a set of instructions that tells a Graph Node how to index and serve data from a blockchain. A smart contract is an executable program on-chain that defines the state and logic of an application, while a subgraph is an off-chain indexing specification that processes and organizes the historical data emitted by those smart contracts. The subgraph's manifest (subgraph.yaml) defines the data sources, the events to watch, and how to map that data into queryable entities. This distinction is crucial: smart contracts write data, while subgraphs read and structure it for efficient querying via GraphQL.
Frequently Asked Questions (FAQ)
Essential questions and answers about The Graph's core indexing component, subgraphs, for developers and data consumers.
A subgraph is a decentralized data index that extracts, processes, and stores blockchain event data from smart contracts into a queryable GraphQL API. It works by defining a manifest (subgraph.yaml) that specifies the smart contract addresses, the events to index, and the mapping logic written in AssemblyScript that transforms raw blockchain data into the defined GraphQL schema. Once deployed, a Graph Node scans the blockchain for the specified events, runs the mapping handlers, and stores the resulting entities, making the data accessible via performant GraphQL queries.
- Manifest: The configuration file linking contracts and mappings.
- Schema: Defines the shape of the queryable data using GraphQL.
- Mappings: The transformation logic that populates the schema with data.