Archive Node
What is an Archive Node?
A specialized type of blockchain node that maintains a complete, unpruned historical record of the entire blockchain, including the state for every block.
An Archive Node is a full node that retains the complete historical state of a blockchain, meaning it stores not only every transaction and block header but also the entire state (such as account balances, smart contract code, and storage) at every single block height since genesis. This contrasts with standard full nodes, which typically prune or discard older state data to conserve disk space, keeping only the current state and recent history necessary for validating new blocks. Archive nodes are essential infrastructure for services requiring deep historical data queries, such as block explorers, advanced analytics platforms, and certain decentralized applications (dApps).
The primary function of an archive node is to serve historical data requests that other nodes cannot. For example, querying the balance of a specific Ethereum address at block number 5,000,000, or verifying the state of a complex smart contract at a past point in time, requires access to the archived historical state. Running an archive node demands significantly more storage resources than a standard node; for major networks like Ethereum, this can require multiple terabytes of SSD storage. Services like Infura and Alchemy often maintain archive nodes to provide this data via their APIs to developers.
In the Ethereum ecosystem, archive nodes are sometimes referred to as archive mode clients. Retaining all historical state usually has to be configured explicitly. In Geth, for example, users enable the --syncmode full --gcmode archive flags; other clients such as Erigon expose their own pruning settings. Without these flags, Geth operates in pruned mode, deleting state data that is no longer needed for verifying the chain's current head. Other networks, such as Polkadot and Solana, have analogous concepts, often called archive nodes or historical data nodes, which serve the same fundamental purpose of providing a verifiable and queryable record of the entire chain history.
How an Archive Node Works
An archive node is a specialized type of blockchain node that maintains a complete, unpruned historical record of the entire state of the network, from the genesis block to the present.
An archive node is a full node that retains the entire historical state of a blockchain, including the state (account balances, contract storage, etc.) at every single block height since genesis. Unlike a standard full node, which prunes old state data to save disk space, an archive node preserves everything, making it a crucial resource for deep historical analysis, auditing, and services requiring arbitrary historical queries. This comprehensive data storage allows developers to query the exact state of the network at any past block, a capability essential for block explorers, advanced analytics platforms, and certain decentralized applications (dApps).
The operational mechanism of an archive node involves continuously syncing and storing all blockchain data without deletion. This includes every block header, transaction, receipt, and the complete state trie (a cryptographic data structure like a Merkle Patricia Trie) for each block. Services like The Graph or block explorers rely on archive nodes to index and serve historical data efficiently. On networks like Ethereum, running an archive node requires significant resources, often needing multiple terabytes of storage and substantial memory to manage the ever-growing state size and transaction history.
The primary use cases for archive nodes are data-intensive and forensic. They enable:
- Historical Data Analysis: Researchers and analysts can examine market trends, token flows, and protocol usage over time.
- Smart Contract Auditing: Auditors can verify the precise state and execution of contracts at specific past blocks to investigate exploits or validate functionality.
- Regulatory Compliance: Entities can generate provable, time-stamped reports of transactions and holdings.
- Infrastructure Services: They serve as the backbone for RPC endpoints that provide archival data, which are more expensive to operate than standard endpoints due to the higher resource demands.
Key Features of an Archive Node
An archive node is a specialized type of blockchain node that retains the complete historical state of the network, enabling deep data queries and forensic analysis.
Complete Historical State
Unlike standard full nodes that only keep recent state data, an archive node stores the entire state trie or state database for every single block since genesis. This includes all account balances, smart contract storage, and internal transaction receipts, allowing for queries about the state of the network at any historical block height.
Essential for Block Explorers & Analytics
Archive nodes are the backbone of services requiring historical data lookups, such as:
- Block explorers (e.g., Etherscan) for viewing past transactions and balances.
- Analytics platforms for tracking token flows and protocol metrics over time.
- Audit and compliance tools for verifying historical on-chain activity.
High Resource Requirements
Maintaining a full archive requires significant and growing resources:
- Storage: Can require multiple terabytes (e.g., >10 TB for Ethereum) and expands continuously.
- Memory (RAM): Needs ample RAM for efficient state lookups during query processing.
- Compute: Historical state queries are computationally intensive compared to syncing the latest chain tip.
Pruning vs. Archival Mode
Most node clients (like Geth, Erigon) can run in different sync modes:
- Full Node (with pruning): Synchronizes all blocks but discards old state data to save space, typically keeping only the last ~128 blocks of state.
- Archive Node: Disables pruning entirely, preserving all historical state. The initial sync to archive mode is significantly slower and more resource-intensive.
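The difference between the two modes can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions: the `ToyStateStore` class, its `retain_last` parameter, and the address `0xabc` are hypothetical, and each block's state is stored as a full copy, whereas real clients use trie structures that share unchanged data between blocks.

```python
from typing import Dict, Optional


class ToyStateStore:
    """Toy model of per-block state retention (not how any real client stores data)."""

    def __init__(self, retain_last: Optional[int] = None) -> None:
        # retain_last=None models archive mode: no state is ever pruned.
        self.retain_last = retain_last
        self.snapshots: Dict[int, Dict[str, int]] = {}
        self.head = -1

    def commit(self, balances: Dict[str, int]) -> None:
        """Record the state after the next block, then prune if configured to."""
        self.head += 1
        self.snapshots[self.head] = dict(balances)
        if self.retain_last is not None:
            cutoff = self.head - self.retain_last
            for height in [h for h in self.snapshots if h <= cutoff]:
                del self.snapshots[height]

    def balance_at(self, address: str, height: int) -> int:
        """Return the balance at a given height, or fail if that state was pruned."""
        if height not in self.snapshots:
            raise LookupError(f"state for block {height} is not available")
        return self.snapshots[height].get(address, 0)


archive = ToyStateStore()                 # keeps every snapshot
pruned = ToyStateStore(retain_last=128)   # keeps only the 128 most recent snapshots

for height in range(1000):
    state = {"0xabc": height * 10}        # hypothetical address and balances
    archive.commit(state)
    pruned.commit(state)

print(archive.balance_at("0xabc", 5))     # old state is still available
print(len(pruned.snapshots))              # only recent snapshots remain
```

In this model, asking the pruned store for block 5 raises `LookupError`, which mirrors the way a pruned node cannot answer a query for state it has already discarded.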
Use Case: Trace APIs & Debugging
Archive nodes enable advanced JSON-RPC methods like debug_traceTransaction and trace_filter. These methods re-execute transactions in the context of their original historical state, which is crucial for:
- Smart contract debugging to understand complex transaction failures.
- Building accurate transaction fee calculators.
- Protocol research and simulation of historical events.
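As a rough sketch of what such requests look like on the wire, the snippet below builds JSON-RPC payloads for the two methods named above. The transaction hash and address are placeholders, the `rpc_request` helper is hypothetical, and the parameter shapes follow the commonly documented forms (`callTracer` is one of Geth's built-in tracers; `trace_filter` belongs to the `trace` namespace offered by clients such as Erigon). Which methods are available depends on the client and on how the node is configured.

```python
import json
from typing import Any, Dict, List


def rpc_request(method: str, params: List[Any], request_id: int = 1) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request body (hypothetical helper)."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


TX_HASH = "0x" + "ab" * 32   # placeholder transaction hash
ADDRESS = "0x" + "11" * 20   # placeholder address

# Re-execute a single past transaction and return its call tree.
trace_tx = rpc_request("debug_traceTransaction", [TX_HASH, {"tracer": "callTracer"}])

# Find traces sent to ADDRESS within a range of historical blocks.
trace_range = rpc_request(
    "trace_filter",
    [{"fromBlock": hex(5_000_000), "toBlock": hex(5_000_010), "toAddress": [ADDRESS]}],
)

print(json.dumps(trace_tx, indent=2))
print(json.dumps(trace_range, indent=2))
```

Each payload would be sent as the body of an HTTP POST to the node's RPC endpoint. For old transactions, the node needs the historical state at that block to re-execute them, which is why these calls are typically served by archive nodes.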
Comparison to Light & Full Nodes
- Archive Node: Stores all blocks + all historical state. Highest resource cost.
- Full Node: Stores all blocks + recent state (pruned). Validates new blocks and transactions.
- Light Node: Stores block headers only. Relies on full nodes for state data. Lowest resource cost.

Archive nodes are a superset of full nodes, providing the deepest data access layer.
Archive Node vs. Other Node Types
A functional comparison of core blockchain node types based on their data storage and network roles.
| Feature / Metric | Archive Node | Full Node | Light Node |
|---|---|---|---|
| Primary Function | Complete historical data archive | State validation & block propagation | Fast client queries & wallet operations |
| Blockchain Data Stored | Entire history (all states, receipts, traces) | Current state & recent blocks (prunable) | Block headers only |
| Storage Requirement (Ethereum) | ~12+ TB and growing | ~650 GB - 1 TB (pruned) | < 1 GB |
| Hardware Requirements | High (Enterprise SSD/HDD arrays, 16+ GB RAM) | Moderate (Fast SSD, 8+ GB RAM) | Low (Consumer hardware, mobile) |
| Initial Sync Time | Weeks to months | Days to weeks | Minutes to hours |
| Query Capability | Full historical state & transaction tracing | Current state & recent history | Relies on trusted full nodes for data |
| Network Role | Data provider for indexers, explorers, analysts | Network backbone for security & decentralization | Client for end-users & applications |
| Serves RPC Requests | | | |
Primary Use Cases & Applications
Archive nodes serve as the complete, immutable historical record of a blockchain, enabling deep data analysis, auditing, and specialized services that require access to any past state.
Historical Data Analysis & Auditing
Archive nodes are essential for on-chain analytics, compliance audits, and forensic investigations. They provide access to the full historical state, allowing analysts to:
- Trace the complete history of a wallet or smart contract.
- Verify transaction provenance for regulatory compliance.
- Conduct detailed research on network activity, token flows, and protocol usage over time.
Block Explorer Backend
Public block explorers like Etherscan and Solana Explorer rely on archive nodes to serve detailed historical information. They power features such as:
- Displaying the complete transaction history for any address.
- Showing internal transactions and token transfers.
- Providing access to historical smart contract states and event logs.
Developer Tooling & Testing
Developers use archive nodes to build and test applications that require historical context. Key use cases include:
- Indexing services that build custom databases of past events.
- Simulating complex transactions that depend on a specific historical state.
- Debugging by replaying past blocks to identify issues in smart contract interactions.
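One common way to read a contract's past state is an `eth_call` pinned to a historical block. The following is a minimal sketch, assuming an ERC-20 style token whose `balanceOf(address)` function has the standard selector `0x70a08231`. The token and holder addresses are placeholders and both helper functions are hypothetical.

```python
from typing import Any, Dict


def encode_balance_of(holder: str) -> str:
    """ABI-encode a call to balanceOf(address): 4-byte selector + 32-byte padded address."""
    address = holder[2:] if holder.startswith("0x") else holder
    address = address.lower()
    if len(address) != 40:
        raise ValueError("expected a 20-byte hex address")
    int(address, 16)  # raises ValueError if the address is not valid hex
    return "0x" + "70a08231" + address.rjust(64, "0")


def historical_call(token: str, holder: str, block_number: int) -> Dict[str, Any]:
    """Build an eth_call request evaluated against the state at block_number."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_call",
        "params": [{"to": token, "data": encode_balance_of(holder)}, hex(block_number)],
    }


TOKEN = "0x" + "22" * 20    # placeholder token contract address
HOLDER = "0x" + "33" * 20   # placeholder holder address

request = historical_call(TOKEN, HOLDER, 5_000_000)
print(request["params"])
```

Against an archive node, the response would carry the token balance as it stood at that block. A node that has pruned the state for that block would normally return an error instead.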
Resilience & Data Availability
Archive nodes act as a decentralized historical backup for the network, ensuring data persistence and censorship resistance. They guarantee that:
- The entire chain history remains available even if many full nodes prune old data.
- New nodes can synchronize from genesis without relying on a single source.
- Historical data is preserved for future verification and chain analysis.
Specialized DeFi & Financial Services
Advanced financial applications require archive node data for accurate calculations and reporting. Examples include:
- DeFi protocols calculating time-weighted average prices (TWAP) or historical yields.
- Tax reporting services generating complete capital gains reports.
- Risk assessment models that analyze historical liquidity and volatility patterns.
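To make the TWAP example concrete, here is a small worked sketch of an off-chain time-weighted average computed from historical price samples, such as prices read from an archive node at successive blocks. The `twap` function and the sample values are hypothetical, and the code assumes each price holds until the next sample. It is not how any particular protocol computes its on-chain oracle.

```python
from typing import List, Tuple


def twap(samples: List[Tuple[int, float]], end_time: int) -> float:
    """Time-weighted average price.

    samples: (timestamp, price) pairs sorted by ascending timestamp.
    Each price is assumed to hold until the next sample, or until end_time.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if end_time <= samples[0][0] or end_time < samples[-1][0]:
        raise ValueError("end_time must be after the first sample and not before the last")
    boundaries = samples[1:] + [(end_time, 0.0)]
    weighted = 0.0
    for (start, price), (stop, _) in zip(samples, boundaries):
        weighted += price * (stop - start)
    return weighted / (end_time - samples[0][0])


# Hypothetical prices observed at t=0s, t=60s and t=180s, averaged up to t=240s.
history = [(0, 100.0), (60, 110.0), (180, 90.0)]
print(twap(history, end_time=240))
```

With these samples, the weighted sum is 100 × 60 + 110 × 120 + 90 × 60 = 24,600, and dividing by the 240-second window gives 102.5.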
Infrastructure for RPC Providers
Infrastructure providers like Alchemy, Infura, and QuickNode operate archive nodes to offer enhanced API endpoints. These services allow dApps to query:
- Historical balances and states via eth_getBalance for a past block.
- Old transaction receipts and logs.
- Complete data for any block number, enabling reliable application performance.
Ecosystem Usage & Providers
An archive node is a type of blockchain node that retains the complete historical state of the network, including all past transactions and account balances, enabling deep historical data queries and analysis.
Core Function: Full Historical State
Unlike full nodes that only store recent state to validate new blocks, an archive node preserves the entire historical state of the blockchain. This includes the state (account balances, contract storage, etc.) for every single block since genesis. It enables queries like "What was the balance of this address at block 5,000,000?" which are impossible for standard nodes.
Primary Use Cases
Archive nodes are essential for services requiring deep historical data:
- Block Explorers & Analytics: Platforms like Etherscan query archive nodes to display full transaction histories and historical token balances.
- Auditors & Investigators: Tracing fund flows for compliance or security incidents.
- Research & Indexing: Building custom indexes or performing complex on-chain data analysis.
- Developer Tooling: Services that need to simulate or verify past contract states.
Infrastructure & Storage Requirements
Running an archive node demands significant resources. For example, an Ethereum archive node requires multiple terabytes of fast SSD storage and substantial RAM to hold the state trie in memory for quick access. The storage requirement grows continuously with chain activity, making it a major operational commitment compared to a pruned full node.
Comparison: Full vs. Archive Node
Full Node (Pruned):
- Stores only recent blocks and the current state.
- Validates new transactions and blocks.
- Storage: ~500GB-1TB for Ethereum.
Archive Node:
- Stores all blocks and every historical state.
- Can answer any historical query.
- Storage: 10TB+ and growing for Ethereum.

The key difference is state retention; both participate in consensus.
The "Archive" RPC Method
A critical technical capability is support for historical RPC calls. The eth_getBalance method, for instance, accepts an optional block parameter. When queried with an old block number against an archive node, it returns the historical balance. A standard node will return an error for states it no longer holds. This method is foundational for all historical data services.
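The request described above can be sketched with only the Python standard library. This is a minimal illustration, not a production client: the helper names are hypothetical, the endpoint URL passed to `fetch_balance` is whatever node or provider you have access to, and the exact error a pruned node returns varies by client.

```python
import json
import urllib.request
from typing import Any, Dict


def balance_request(address: str, block_number: int) -> Dict[str, Any]:
    """Build an eth_getBalance request for the state at block_number."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBalance",
        "params": [address, hex(block_number)],
    }


def parse_balance(response: Dict[str, Any]) -> int:
    """Return the balance in wei, or raise if the node reported an error."""
    if "error" in response:
        # A node without the requested historical state typically answers with an error.
        raise RuntimeError(response["error"].get("message", "unknown RPC error"))
    return int(response["result"], 16)


def fetch_balance(rpc_url: str, address: str, block_number: int) -> int:
    """POST the request to rpc_url (an endpoint you supply) and parse the reply."""
    body = json.dumps(balance_request(address, block_number)).encode("utf-8")
    req = urllib.request.Request(
        rpc_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return parse_balance(json.load(resp))


# Example of parsing a reply: 0xde0b6b3a7640000 wei is exactly 1 ether.
print(parse_balance({"jsonrpc": "2.0", "id": 1, "result": "0xde0b6b3a7640000"}))
```

Calling `fetch_balance(url, address, 5_000_000)` against an archive endpoint would return the balance at block 5,000,000, while the same call against a pruned node would be expected to raise `RuntimeError`.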
Technical Details & Implementation
An archive node is a specialized type of blockchain node that stores the complete historical state of the network, enabling deep historical queries and analysis that are impossible with standard full nodes.
An archive node is a blockchain node that stores the complete historical state of the network at every single block, not just the current state. It works by preserving all intermediate state trie data, including account balances, smart contract storage, and transaction receipts, which standard full nodes prune to save disk space. This allows an archive node to answer complex historical queries, such as "What was the balance of this address at block #5,000,000?" without needing to replay the entire chain history. Running an archive node requires significantly more storage—often multiple terabytes—and computational resources compared to a standard node.
Challenges & Operational Considerations
While essential for deep historical analysis, running an archive node presents significant technical and economic hurdles that must be carefully evaluated.
Massive Storage Requirements
Archive nodes store the complete historical state of a blockchain, not just block headers and transactions. This includes the state (account balances, smart contract storage) for every single block. For mature networks like Ethereum, this can require multiple terabytes (TB) to tens of TBs of fast storage (SSDs are recommended). This requirement grows linearly with chain activity.
High Operational Costs
The resource intensity translates directly to expense.
- Hardware: Requires high-performance CPUs, large amounts of RAM, and enterprise-grade SSDs.
- Hosting: Colocation or cloud hosting (e.g., AWS, GCP) for such a node can cost hundreds to thousands of dollars per month.
- Bandwidth: Constant, high-volume data synchronization and serving queries consumes significant bandwidth.
Complex Synchronization & Maintenance
Initial synchronization (the "sync") from genesis to the current block is the most demanding phase, often taking weeks for major chains. The process is I/O and CPU intensive. Ongoing maintenance requires monitoring disk space, managing software updates, and ensuring high uptime. State pruning is not an option, unlike with full nodes.
Limited Use Case Justification
The high cost is only justifiable for specific applications. Primary users include:
- Block explorers (Etherscan)
- Analytics platforms (Dune, Nansen)
- Historical data providers (The Graph for legacy queries)
- Auditors and researchers needing verifiable historical state.

Most dApps and users are sufficiently served by RPC providers that offer archive data as a service.
Alternative Solutions
To avoid the overhead of a personal archive node, developers often use:
- Managed RPC Services: Providers like Alchemy, Infura, and QuickNode offer archive-level API endpoints.
- Indexing Protocols: The Graph indexes historical data into queryable subgraphs.
- Light Clients & Snap Sync: For applications needing only recent state with cryptographic proofs.
- Decentralized Networks: Services like Pokt Network that decentralize RPC access.
Common Misconceptions
Archive nodes are often misunderstood as simply being 'bigger' full nodes. This section clarifies their unique technical role, performance characteristics, and the specific trade-offs involved in running one.
An archive node is not simply a full node with a larger transaction history; what sets it apart is that it maintains the complete historical state. A full node stores all blocks but keeps only the current state and a short window of recent states needed to validate new blocks. An archive node retains every intermediate state for every block since genesis, allowing it to answer historical queries (like an account's balance at block #1,000,000) directly, without re-executing all earlier transactions. This requires roughly an order of magnitude more storage, along with more resources overall, which makes it a specialized data service rather than an enhanced validation node.
Frequently Asked Questions (FAQ)
Common technical questions about the role, operation, and use cases of blockchain archive nodes.
An archive node is a type of blockchain node that stores the complete historical state of a network at every single block, rather than just the current state. It works by retaining all intermediate state changes, including the state of every account and smart contract after each block, which allows it to query historical data for any past block height. This is achieved by persistently storing the full state trie and its historical versions, unlike a full node, which prunes this data to save disk space. Running an archive node requires significantly more storage and computational resources but is essential for services like block explorers, advanced analytics, and certain developer tools that need to audit or verify historical on-chain activity.