Free 30-min Web3 Consultation
Book Consultation
Smart Contract Security Audits
View Audit Services
Custom DeFi Protocol Development
Explore DeFi
Full-Stack Web3 dApp Development
View App Services

IPFS Hash (CID)

An IPFS Hash, formally a Content Identifier (CID), is a cryptographic hash that uniquely and permanently identifies content stored on the InterPlanetary File System (IPFS).
Chainscore © 2026
DECODING CONTENT IDENTIFIERS

What is an IPFS Hash (CID)?

A technical breakdown of the cryptographic fingerprint at the heart of the InterPlanetary File System.

An IPFS Hash, formally known as a Content Identifier (CID), is a unique cryptographic fingerprint that identifies a piece of content (such as a file, directory, or data block) on the InterPlanetary File System (IPFS). Unlike location-based addresses (e.g., URLs), which point to where data is stored, a CID is a content-addressed identifier derived from the content itself using a cryptographic hash function. The same bytes, imported with the same settings (chunk size, codec, hash function), always produce the same CID, which enables deduplication, tamper-evident verification, and location-independent addressing across a decentralized network. The identifier itself never changes, but the content stays retrievable only while at least one node continues to store it.

The structure of a modern CID is self-describing, containing several key components within its encoded string. It specifies the multihash (the actual hash digest and the algorithm used, like SHA-256), the multicodec (the format of the data, e.g., raw bytes, dag-pb for IPFS objects, or dag-cbor), and the multibase prefix (the encoding, such as base58btc or base32, for readability). For example, a CIDv1 like bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi tells the network everything needed to fetch and interpret the content. This self-contained design ensures CIDs remain usable even as hash algorithms evolve.
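The self-describing layout can be unpacked by hand. The following is a minimal sketch using only the Python standard library, not an official IPFS API: the function names are illustrative, only the Base32 multibase prefix (b) is handled, and the lookup tables cover just the few codec and hash codes mentioned in this entry.

```python
import base64

# Small illustrative subsets of the multicodec and multihash tables.
CODECS = {0x55: "raw", 0x70: "dag-pb", 0x71: "dag-cbor"}
HASHES = {0x12: "sha2-256"}


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read an unsigned varint (7 bits per byte); return (value, next position)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def decode_cidv1_base32(cid: str) -> dict:
    """Split a Base32 CIDv1 string into version, codec, hash function, and digest."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles the Base32 multibase prefix 'b'")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # b32decode expects padded, uppercase input
    raw = base64.b32decode(body)

    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)
    digest = raw[pos:pos + digest_len]
    return {
        "version": version,
        "codec": CODECS.get(codec, hex(codec)),
        "hash": HASHES.get(hash_code, hex(hash_code)),
        "digest": digest.hex(),
    }


if __name__ == "__main__":
    example = "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"
    for field, value in decode_cidv1_base32(example).items():
        print(f"{field}: {value}")
```

For the example CID above, the decoded fields are version 1, codec dag-pb, and hash function sha2-256 with a 32-byte digest.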

In practice, when a file is added to IPFS, it is split into blocks, each receiving its own CID. These blocks are then organized into a Merkle Directed Acyclic Graph (DAG) structure, with a root CID representing the entire file or directory. This architecture enables powerful features: deduplication (identical blocks are stored once), tamper-evidence (any change alters the CID), and efficient versioning. CIDs are fundamental to the decentralized web, forming the backbone of protocols like IPFS and Filecoin, and are widely used in blockchain applications for storing NFTs, static website assets, and immutable data.
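The block-and-root idea can be illustrated with a deliberately simplified model. This sketch is not how IPFS actually encodes files (real imports build UnixFS/dag-pb nodes and use far larger chunks); here plain SHA-256 hex digests stand in for CIDs, the root is just a hash over the ordered child identifiers, and the tiny chunk_size and function names are assumptions chosen for readability.

```python
import hashlib


def block_id(data: bytes) -> str:
    """Stand-in for a CID: the SHA-256 hex digest of a block."""
    return hashlib.sha256(data).hexdigest()


def add_file(content: bytes, chunk_size: int = 4) -> tuple[str, list[str]]:
    """Split content into fixed-size blocks and derive a root identifier.

    Returns (root_id, leaf_ids). The root commits to every leaf, so changing
    any block changes the root, while untouched blocks keep their identifiers.
    """
    blocks = [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]
    leaf_ids = [block_id(block) for block in blocks]
    root_id = block_id("".join(leaf_ids).encode("ascii"))
    return root_id, leaf_ids


if __name__ == "__main__":
    root_a, leaves_a = add_file(b"aaaabbbbcccc")
    root_b, leaves_b = add_file(b"aaaabbbbcccd")  # only the last block differs
    print("shared leading blocks:", leaves_a[:2] == leaves_b[:2])
    print("roots differ:", root_a != root_b)
```

Because the first two blocks are identical in both versions, a store keyed by block identifier would keep them once and add only the changed block and the new root.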

CONTENT ADDRESSING

How Does an IPFS Hash (CID) Work?

An IPFS hash, known as a Content Identifier (CID), is a cryptographic fingerprint that uniquely and permanently identifies content on the InterPlanetary File System (IPFS).

A Content Identifier (CID) is a self-describing content-addressed identifier. It is generated by applying a cryptographic hash function (like SHA-256) to the content's data, creating a unique, fixed-size string. This process, known as content addressing, means the CID is derived directly from the data itself. If the data changes even by a single bit, the resulting CID will be completely different. This ensures immutability and verifiability, as anyone can hash the data to confirm it matches the CID.
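As a rough illustration of content addressing, the sketch below derives a CIDv1 for a single block of raw bytes with the Python standard library. It assumes the raw codec (0x55), SHA2-256 (0x12), and Base32 encoding; the function name is illustrative. A real IPFS import may chunk larger files and wrap them in dag-pb nodes, so this would match a node's output only for small content added with comparable settings.

```python
import base64
import hashlib


def cidv1_raw_sha256(data: bytes) -> str:
    """Build a Base32 CIDv1 (raw codec, SHA2-256) for one block of bytes."""
    digest = hashlib.sha256(data).digest()
    # <version=1><codec=raw 0x55><hash=sha2-256 0x12><digest length><digest>
    cid_bytes = bytes([0x01, 0x55, 0x12, len(digest)]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # 'b' is the multibase prefix for lowercase Base32


if __name__ == "__main__":
    print(cidv1_raw_sha256(b"hello world"))
    print(cidv1_raw_sha256(b"hello world!"))  # one extra byte, unrelated CID
```

Identical input always yields the same string, and any change to the input yields a different one.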

The structure of a modern CID (version 1, or CIDv1) is more than just a hash. It is a multiformat string that encodes several pieces of metadata in a self-describing way using Multicodec, Multihash, and Multibase prefixes. For example, a CID like bafybeigdyr... tells you the hash function used (SHA2-256), the length of the hash, and the base encoding (Base32). This layered structure allows the IPFS protocol to evolve, supporting new hash functions and data formats without breaking compatibility.

When you request content by its CID, the IPFS network uses a Distributed Hash Table (DHT) to locate network peers who have announced they are storing the data blocks associated with that fingerprint. The system retrieves and reassembles these blocks. This mechanism decouples content from location; instead of asking "where is the file?" (like https://server.com/file.jpg), you ask "who has the data matching this hash?" This makes content resilient, as it can be retrieved from any node that has a copy, not just a single server.
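Because any peer may answer a request, the receiver checks the bytes against the CID rather than trusting the sender. The sketch below shows that check for a single raw block; fetch is a hypothetical callable standing in for whatever transport delivers the bytes, and the helper names are illustrative. It assumes a Base32 CIDv1 with single-byte header fields and SHA2-256; for dag-pb content the hash covers the encoded node, not the reassembled file.

```python
import base64
import hashlib
from typing import Callable

SHA2_256 = 0x12  # multihash code for SHA2-256


def digest_from_cid(cid: str) -> bytes:
    """Extract the SHA2-256 digest from a Base32 CIDv1 string."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles Base32 CIDv1 strings")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)
    raw = base64.b32decode(body)
    # Single-byte header fields: enough for the small codes assumed here.
    version, _codec, hash_code, length = raw[0], raw[1], raw[2], raw[3]
    if version != 1 or hash_code != SHA2_256 or len(raw) != 4 + length:
        raise ValueError("unsupported CID for this sketch")
    return raw[4:]


def fetch_verified(cid: str, fetch: Callable[[str], bytes]) -> bytes:
    """Fetch a block from an untrusted source and reject it if the hash differs."""
    data = fetch(cid)
    if hashlib.sha256(data).digest() != digest_from_cid(cid):
        raise ValueError("block does not match the requested CID")
    return data
```

A caller would pass in its own retrieval function, for example one that asks a local node or a gateway, and can safely accept the first response that verifies.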

CONTENT ADDRESSING

Key Features of IPFS CIDs

A Content Identifier (CID) is a self-describing cryptographic hash that uniquely and permanently identifies content on the IPFS network and other decentralized systems.

01

Cryptographic Hashing

At its core, a CID is generated by applying a cryptographic hash function (like SHA-256) to the content's data. This creates a unique, fixed-size fingerprint. Even a single bit change in the original data produces a completely different CID, ensuring data integrity and tamper-evidence.

02

Self-Describing Format

A CID is not just a hash; it's a structured identifier that describes itself. It encodes:

  • The multihash (the hash digest and the function used).
  • The multicodec (the format of the data, e.g., raw bytes, dag-pb, dag-cbor).
  • The multibase prefix (the encoding, e.g., base58btc for Qm..., base32 for b...).

This allows systems to know how to interpret the data without external context.
03

Versioning (CIDv0 vs CIDv1)

IPFS CIDs have evolved:

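  • CIDv0: The original format, starting with Qm. It is a Base58-encoded SHA-256 multihash of a dag-pb (protobuf) node, 46 characters long. It has no explicit version, codec, or multibase fields, which limits its flexibility.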
  • CIDv1: The current standard, more flexible with explicit multicodec and multibase prefixes (e.g., bafybei...). It supports future-proofing with different hash functions and data formats. Most new systems use CIDv1.
04

Content vs. Location Addressing

This is the fundamental shift CIDs enable.

  • Location Addressing: Points to where data is (e.g., https://server.com/file.jpg). If the server moves, the link breaks.
  • Content Addressing: Points to what the data is via its CID (e.g., ipfs://bafy...). The data can be retrieved from any node on the IPFS network that has a copy, providing persistence and redundancy.
05

Immutability & Deduplication

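A CID binds an identifier to exact bytes: the content behind a CID cannot change without producing a different CID, and the same content imported with the same settings always yields the same CID. This enables automatic deduplication. If the same 1GB file is added to a node twice, its blocks are stored only once. If two users publish it, each node keeps its own copy, but the whole network refers to the file by a single CID, so peers can fetch it from either provider instead of treating the copies as separate files.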
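Deduplication falls out of keying storage by content hash. The toy block store below is a sketch of that idea, not IPFS's actual datastore; the class name is hypothetical and SHA-256 hex digests stand in for CIDs.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: the key is the hash of the value."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Storing identical bytes again is a no-op: the key already exists.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]

    def __len__(self) -> int:
        return len(self._blocks)


if __name__ == "__main__":
    store = BlockStore()
    first = store.put(b"same content")
    second = store.put(b"same content")
    print(first == second, len(store))
```

Both calls return the same key and the store holds a single entry, which is the per-node behaviour described above.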

IPFS HASH (CID)

CID Versions: v0 vs. v1

A technical comparison of the two primary Content Identifier (CID) formats used in the InterPlanetary File System (IPFS) and related decentralized protocols.

A Content Identifier (CID) is a self-describing content-addressed identifier for data stored on the InterPlanetary File System (IPFS). The evolution from CIDv0 to CIDv1 represents a fundamental shift towards a more flexible, future-proof, and multi-protocol addressing scheme. CIDv0 is the original format, essentially a Base58-encoded SHA-256 multihash (e.g., Qm...), while CIDv1 is an extensible format that includes a version prefix, multicodec identifier, and multihash within a binary structure, typically rendered as a Base32 string like bafybeig....

The primary technical distinction lies in their structure and encoding. CIDv0 is a legacy format that is implicitly version 0 and uses the dag-pb multicodec for IPLD data. It is restricted to the SHA-256 hash function and Base58 encoding, making it recognizable by its Qm prefix. In contrast, CIDv1 is explicitly versioned, includes a field to specify the content type (e.g., dag-cbor, raw), and can accommodate any hash function defined in the multihash table. This allows CIDv1 to represent a wider array of data structures and cryptographic commitments beyond IPFS's original scope.

A key practical difference is CIDv1's support for multiple textual representations. While CIDv0 only exists in Base58, a CIDv1 can be represented in various bases defined by the Multibase prefix, such as Base32 (b...) for case-insensitive environments or Base64. The common bafy... string is a Base32-encoded CIDv1. This flexibility makes CIDv1 more suitable for web applications, DNS, and filenames. Importantly, converting a CIDv0 to CIDv1 does not change the underlying content address: a CIDv0 and its CIDv1 equivalent (dag-pb codec, SHA-256) carry the same multihash and differ only in how it is wrapped and encoded.
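To make the multibase idea concrete, the sketch below renders one binary CIDv1 in three text encodings and decodes each back to the same bytes. It covers only three multibase prefixes (f for lowercase Base16, b for lowercase Base32, u for unpadded Base64url); the function names are illustrative and the sample CID is built locally from the raw codec and SHA2-256.

```python
import base64
import hashlib


def encode_multibase(cid_bytes: bytes, base: str) -> str:
    """Encode binary CID bytes as text, prefixed with the multibase character."""
    if base == "base16":
        return "f" + cid_bytes.hex()
    if base == "base32":
        return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    if base == "base64url":
        return "u" + base64.urlsafe_b64encode(cid_bytes).decode("ascii").rstrip("=")
    raise ValueError(f"base not covered by this sketch: {base}")


def decode_multibase(text: str) -> bytes:
    """Recover the binary CID bytes from any of the three encodings above."""
    prefix, body = text[0], text[1:]
    if prefix == "f":
        return bytes.fromhex(body)
    if prefix == "b":
        body = body.upper()
        return base64.b32decode(body + "=" * (-len(body) % 8))
    if prefix == "u":
        return base64.urlsafe_b64decode(body + "=" * (-len(body) % 4))
    raise ValueError(f"multibase prefix not covered by this sketch: {prefix}")


if __name__ == "__main__":
    digest = hashlib.sha256(b"hello world").digest()
    cid_bytes = bytes([0x01, 0x55, 0x12, len(digest)]) + digest
    for base in ("base16", "base32", "base64url"):
        print(base, encode_multibase(cid_bytes, base))
```

The three strings look unrelated, but they all decode to the same version, codec, and multihash.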

For developers, the choice is increasingly straightforward: CIDv1 is the modern standard. Newer tooling such as Helia (the successor to the deprecated js-ipfs / ipfs-core) produces CIDv1 by default, and subdomain gateways rely on Base32 CIDv1 because hostnames are case-insensitive. Defaults still vary, however: Kubo's ipfs add command has historically emitted CIDv0 unless --cid-version=1 is passed, so it is worth checking which version your tooling produces. While the network remains compatible with CIDv0, new projects should adopt CIDv1 for its extensibility. A CIDv0 can be losslessly converted to its CIDv1 equivalent, but the reverse is not universally true, as CIDv1 can express constructs that CIDv0 cannot, such as content using the SHA3-256 hash or the dag-json codec.
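The lossless CIDv0 to CIDv1 conversion amounts to re-wrapping the same multihash. The sketch below implements it with a hand-rolled Base58 (Bitcoin alphabet) codec and the standard library; the function names are illustrative, and it assumes the input is a well-formed CIDv0, that is, a 34-byte SHA2-256 multihash.

```python
import base64
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58btc string (leading '1' characters map to zero bytes)."""
    num = 0
    for char in text:
        num = num * 58 + B58_ALPHABET.index(char)
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc; used here only to build a sample CIDv0."""
    num = int.from_bytes(data, "big")
    out = ""
    while num:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def cidv0_to_cidv1(cid_v0: str) -> str:
    """Re-wrap a CIDv0 multihash as a Base32 CIDv1 with the dag-pb codec."""
    multihash = b58decode(cid_v0)
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("not a CIDv0: expected a 32-byte SHA2-256 multihash")
    cid_bytes = b"\x01\x70" + multihash  # version 1, dag-pb (0x70), same multihash
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


if __name__ == "__main__":
    # Sample CIDv0 built from an arbitrary digest, purely for demonstration.
    sample_v0 = b58encode(b"\x12\x20" + hashlib.sha256(b"example").digest())
    print(sample_v0)
    print(cidv0_to_cidv1(sample_v0))
```

The reverse direction only works when the CIDv1 uses dag-pb and SHA2-256, which is why the conversion is lossless one way only.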

Understanding the version difference is crucial for interoperability across the decentralized web stack. Protocols like IPLD (InterPlanetary Linked Data), Filecoin, and libp2p rely on CIDv1's ability to precisely describe the data being referenced. The version, multicodec, and multihash components work together to ensure that data is not only located but also correctly interpreted upon retrieval, forming the backbone of content-addressed verifiability in Web3 systems.

IPFS HASH (CID)

Examples & Ecosystem Usage

The Content Identifier (CID) is the cornerstone of IPFS's content-addressed storage model. These examples illustrate its practical applications across the decentralized ecosystem.

CONTENT ADDRESSING COMPARISON

CID vs. Traditional URL Addressing

A technical comparison of content-addressed identifiers (CIDs) and location-addressed URLs, highlighting their core architectural differences.

Feature | Content Identifier (CID) | Uniform Resource Locator (URL)
--- | --- | ---
Addressing Method | Content-based (cryptographic hash of the data) | Location-based (path to a server and file)
Data Integrity | Yes | No
Data Immutability | Yes | No
Decentralization | Yes | No
Persistence | Data persists as long as one node hosts it | Link breaks if the hosting server changes or goes offline
Verification | Any node can verify the data matches the CID | Client must trust the server to serve correct data
Deduplication | Automatic (identical content = identical CID) | None (identical content can have infinite URLs)
Performance (Cached Data) | Near-instant from local or peer cache | Latency depends on origin server and network path

IPFS HASH (CID)

Security & Permanence Considerations

An IPFS Content Identifier (CID) is a self-describing cryptographic hash that uniquely addresses content on the InterPlanetary File System. Its design has profound implications for data integrity, availability, and long-term persistence.

01

Content-Addressed vs. Location-Addressed

A CID is a content-addressed identifier, meaning it is derived from the content's cryptographic hash. This contrasts with location-addressed systems (like HTTP URLs) that point to a server location. The key security benefit is immutability: if the data changes, its CID changes, so a client can check that the data it fetched is exactly what the CID refers to. Tampering is not prevented outright, but it is always detectable, which gives verifiable integrity.

03

CID Inherent Properties

A CID is self-describing and versioned. It encodes:

  • The hash digest of the content.
  • The codec (e.g., dag-pb, raw) describing the data format.
  • The multihash header, specifying the hash function used (e.g., SHA2-256) and the digest length.

This structure allows any system to independently verify the data's integrity and understand how to interpret it without external context, a core feature for trustless systems.
04

Sybil & Eclipse Attacks

While the CID itself is secure, the IPFS peer-to-peer routing layer can be vulnerable. Sybil attacks (creating many malicious nodes) or eclipse attacks (isolating a node from honest peers) can prevent a user from finding the correct peers hosting the desired CID. This doesn't corrupt the data (the CID remains valid) but can create a denial-of-service by hiding its availability. Mitigations include using trusted peers or DHT security extensions.

06

CIDv1 & Future-Proofing

Early CIDv0 was base58-encoded and limited. CIDv1 is the current standard, featuring:

  • A multibase prefix (e.g., the leading b in bafy..., which marks Base32) for encoding flexibility.
  • Explicit version byte.
  • Support for multiple hash functions (future-proofing against cryptographic breaks).

Migrating to CIDv1 is recommended for long-term archival, as each address records how it should be interpreted and therefore remains resolvable even as underlying protocols evolve.
IPFS HASH (CID)

Frequently Asked Questions (FAQ)

A Content Identifier (CID) is the core addressing mechanism of the InterPlanetary File System (IPFS), providing a unique, verifiable fingerprint for any piece of content.

A Content Identifier (CID) is a self-describing content-addressed identifier that uniquely and verifiably points to data stored on the InterPlanetary File System (IPFS). It works by applying a cryptographic hash function (like SHA-256) to the content itself, generating a unique string of characters. This CID is not a location-based address (like a URL); instead, it is derived from the content's data. Any change to the data results in a completely different CID. The system uses this hash to locate and retrieve the content from the distributed IPFS network, ensuring data integrity and persistence.

Key components of a CID include:

  • Multihash: Specifies the hash function used and the hash digest.
  • Multicodec: Indicates the format of the target data (e.g., raw, dag-pb, dag-cbor).
  • Multibase: The encoding prefix (like b for base32 or z for base58btc) for the string representation.

Example CIDv1: bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi

IPFS Hash (CID): Decentralized Content Identifier | ChainScore Glossary