A Piece CID (Content Identifier) is a cryptographic commitment that identifies a specific, fixed-size chunk of data, known as a piece, prepared for storage on the Filecoin network. It is the root of a SHA-256-based Merkle tree computed over the bytes of the piece after it has been padded to meet the network's power-of-two size requirements (e.g., 256 KiB, 512 KiB, 1 MiB). Unlike a typical file CID, which identifies the logical structure of a file, a Piece CID identifies the exact physical data payload that a storage provider commits to storing.
Piece CID
What is a Piece CID?
A Piece CID is a cryptographic identifier for a raw, unprocessed data segment in a Filecoin storage deal.
The creation of a Piece CID is a prerequisite for a storage deal. Before data can be stored, a client's files are segmented into one or more pieces, each with its own Piece CID. This identifier is then included in the deal proposal and is cryptographically linked to the provider's storage commitment via Proof-of-Replication and Proof-of-Spacetime. The Piece CID serves as the fundamental unit of storage and verification, enabling the network to audit that a provider is storing the exact, agreed-upon data.
A key distinction is between a Piece CID and a Payload CID (or Data CID). The Payload CID, often a CIDv1, identifies the user's original data (e.g., a file or directory) using a codec like dag-pb. The Piece CID is derived from the padded binary representation of that data. While they represent the same underlying content, they are different hashes with different purposes: the Payload CID is for retrieval and data graphs, while the Piece CID is for storage proofs and sector sealing.
In practice, when data is retrieved from Filecoin, the storage provider returns the original payload data. The client can verify its integrity by recomputing the Payload CID. The Piece CID remains a critical backend component, ensuring the data stored in the provider's sealed sector is immutable and verifiable against the original deal terms. This two-tiered identification system is central to Filecoin's trustless storage and verifiable capacity market.
How a Piece CID is Created
A Piece CID is a cryptographic fingerprint for raw, unsealed data in Filecoin's storage ecosystem. Its creation is a deterministic process that transforms a client's data into a format suitable for provable storage deals.
The creation of a Piece CID begins with the client's raw data. This data is packed, usually by the client or a data-preparation tool, into a CAR (Content Addressable aRchive) file, which contains the data and its corresponding IPLD DAG structure. The CAR file is then padded to a power-of-two piece size that is no larger than the sector it will be stored in (sectors are typically 32 GiB or 64 GiB). This padding ensures the piece aligns with the Merkle tree structures used for sealing and proof generation, making the Piece CID a commitment to a specific, fixed-size data segment.
Once padded, the data is hashed into a binary Merkle tree using SHA-256, and the root of that tree becomes the piece commitment. This commitment is encoded as a CIDv1 with the multicodec fil-commitment-unsealed (0xf101) and the multihash sha2-256-trunc254-padded (0x1012), resulting in a self-describing content identifier like baga6ea4seaq.... The codec is critical: it signals that this CID refers to the raw, unsealed data piece, distinct from the dag-pb, dag-cbor, or raw codecs used for the data's logical content. This process is entirely deterministic; the same data, prepared identically, will always produce the same Piece CID.
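The encoding step can be illustrated with a minimal Python sketch. It assumes the multicodec table values fil-commitment-unsealed (0xf101) and sha2-256-trunc254-padded (0x1012), takes an already computed 32-byte commitment as input, and uses hypothetical function names chosen for illustration only.

```python
import base64

# Codes assumed from the multicodec table.
FIL_COMMITMENT_UNSEALED = 0xF101    # CID codec for unsealed piece commitments
SHA2_256_TRUNC254_PADDED = 0x1012   # multihash function code


def varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def piece_cid_from_commitment(commitment: bytes) -> str:
    """Wrap a 32-byte piece commitment in a base32 CIDv1 string."""
    if len(commitment) != 32:
        raise ValueError("a piece commitment is exactly 32 bytes")
    raw = (
        varint(1)                            # CID version 1
        + varint(FIL_COMMITMENT_UNSEALED)    # content codec
        + varint(SHA2_256_TRUNC254_PADDED)   # multihash function
        + varint(len(commitment))            # digest length
        + commitment
    )
    # Multibase prefix "b" means lowercase base32 without padding.
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


print(piece_cid_from_commitment(bytes(32))[:12])
```

Because the version, codec, hash code, and length bytes are fixed, every CID produced this way starts with the same baga6ea4seaq prefix regardless of the commitment value.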
The final Piece CID and its size are the core components of a storage deal proposal on the Filecoin network. The Storage Provider commits to storing the data referenced by this specific CID. The Piece CID is itself the CID form of the CommP (Commitment to the Piece). During sealing, the CommPs of the pieces in a sector are combined into the sector's unsealed commitment (CommD), and the sealing process produces the Sealed CID (CommR), which the Proof-of-Replication ties back to CommD. This chain of cryptographic commitments links the provable storage in a sector directly back to the original client data, enabling verifiable storage proofs without the provider needing to repeatedly access the raw data.
Key Features of a Piece CID
A Piece CID (Content Identifier) is a cryptographic hash that uniquely identifies a specific data segment, or 'piece', prepared for storage on the Filecoin network. It is fundamental to proving that data has been correctly stored and can be reliably retrieved.
Cryptographic Commitment
A Piece CID is generated by applying a cryptographic hash function (SHA-256) across the padded data of a piece, arranged as a Merkle tree. This creates a deterministic fingerprint. Any alteration to the original data, even a single bit, results in a completely different CID. This property is the foundation for provable data integrity in decentralized storage systems.
Deterministic & Verifiable
The same data, prepared with the same piece size and padded to that size, will always produce the same Piece CID. This allows any network participant to independently verify that a storage provider is storing the exact data they committed to, by recomputing the CID from the retrieved data and comparing it to the original commitment.
Linked to CommP & CommD
The Piece CID is the CID-encoded form of the CommP (Piece Commitment), which is the root of a Merkle tree built from the piece's padded data. The CommPs of the pieces placed in a sector are then combined into a CommD (Data Commitment) for that sector. This chain of commitments links the raw data to the final on-chain storage deal and subsequent Proofs of Replication and Proofs of Spacetime.
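A minimal Python sketch of the Merkle root computation follows. It assumes the input has already been padded to a power-of-two length in the form the proofs expect, that leaves are 32 bytes wide, and that each SHA-256 digest has its top two bits cleared so it fits in 254 bits. The function names are hypothetical and the padding step itself is not shown.

```python
import hashlib

NODE_SIZE = 32        # bytes per Merkle tree node
MIN_PIECE_SIZE = 128  # assumed smallest padded piece size, in bytes


def truncated_sha256(data: bytes) -> bytes:
    """SHA-256 with the two most significant bits of the last byte cleared."""
    digest = bytearray(hashlib.sha256(data).digest())
    digest[31] &= 0x3F
    return bytes(digest)


def piece_commitment(padded_piece: bytes) -> bytes:
    """Return the 32-byte binary Merkle root over an already padded piece."""
    size = len(padded_piece)
    if size < MIN_PIECE_SIZE or size & (size - 1):
        raise ValueError("padded piece length must be a power of two >= 128")
    # Leaves are consecutive 32-byte slices of the padded piece.
    level = [padded_piece[i:i + NODE_SIZE] for i in range(0, size, NODE_SIZE)]
    # Hash adjacent pairs until a single root remains.
    while len(level) > 1:
        level = [
            truncated_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


root = piece_commitment(bytes(128))
print(root.hex())
```

Flipping any single bit of the input changes at least one leaf, and that change propagates through every parent hash up to the root.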
Padded to Power-of-Two Size
Data is padded to a power-of-two size (e.g., 256 KiB, 1 MiB, 32 GiB) before the Piece CID is computed. This standardization is required for the Filecoin Proof-of-Replication (PoRep) process. The padding ensures all pieces align with the cryptographic structures used in storage proofs, enabling efficient sealing and verification.
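The size calculation can be sketched in Python. This sketch assumes Filecoin's bit-padding scheme, in which every 127 bytes of payload occupy 128 bytes of piece space, and a minimum padded piece size of 128 bytes. The function name is hypothetical.

```python
MIN_PADDED_PIECE_SIZE = 128  # assumed smallest padded piece size, in bytes


def padded_piece_size(payload_size: int) -> int:
    """Smallest power-of-two piece size that can hold payload_size bytes.

    A padded piece of N bytes is assumed to carry N * 127 / 128 bytes of
    payload, because two zero bits are inserted after every 254 data bits.
    """
    if payload_size <= 0:
        raise ValueError("payload size must be positive")
    size = MIN_PADDED_PIECE_SIZE
    while size // 128 * 127 < payload_size:
        size *= 2
    return size


for n in (1, 127, 128, 260096, 256 * 1024):
    print(n, padded_piece_size(n))
```

Under these assumptions a payload of exactly 256 KiB does not fit in a 256 KiB piece, since the bit padding leaves room for only 260,096 payload bytes, so it is placed in a 512 KiB piece.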
Foundation for Storage Deals
When a client makes a storage deal, they provide the Piece CID to the storage provider. The provider must then seal the data, a computationally intensive process that produces a replica commitment (CommR) for the sector holding the piece. The on-chain deal is published with the CommP, creating a public, verifiable record of the storage agreement.
Distinct from IPFS CID
A Piece CID is not the same as a typical IPFS CID (Content Identifier). While both are cryptographic hashes, a Piece CID is computed on padded, unencrypted data for storage proofs. An IPFS CID is often computed on the original, unpadded data and may use different codecs (like dag-pb). A single file can have both identifiers.
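One way to tell the two kinds of identifier apart is to decode the CID header, as in the Python sketch below. It handles only base32 CIDv1 strings and assumes the multicodec values 0xf101 (fil-commitment-unsealed) and 0x1012 (sha2-256-trunc254-padded) for Piece CIDs. The function names are hypothetical.

```python
import base64


def read_varint(buf: bytes, pos: int):
    """Decode an unsigned LEB128 varint; return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def cid_codes(cid: str):
    """Return (version, codec, multihash code) of a base32 CIDv1 string."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase 'b') CID strings are handled")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding b32decode expects
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    return version, codec, hash_code


def looks_like_piece_cid(cid: str) -> bool:
    """True if the header matches the assumed Piece CID codec and hash."""
    return cid_codes(cid) == (1, 0xF101, 0x1012)
```

A dag-pb CID with a plain SHA-256 multihash decodes to codec 0x70 and hash code 0x12, so it fails the check even though it may describe the same underlying file.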
Piece CID vs. Other CIDs
A comparison of the Piece CID with other common Content Identifier types, highlighting their distinct purposes, construction, and use cases in decentralized storage and data verification.
| Feature | Piece CID | File CID (UnixFS) | CommP (Legacy) |
|---|---|---|---|
| Primary Purpose | Storage deal commitment | File system addressing | Legacy storage deal commitment |
| Content Addressed | | | |
| Constructed From | Raw binary data (padded) | DAG-PB graph of file chunks | Raw binary data (unpadded) |
| Padded to Power of Two | | | |
| Cryptographic Hash | SHA-256 | SHA-256, Blake2b, etc. | SHA-256 |
| Used in Storage Deals | | | |
| Retrievable via IPFS | | | |
| Key Use Case | Proving storage capacity | Content discovery & retrieval | Pre-Filecoin storage proofs |
Etymology and Origin
The term **Piece CID** is a compound noun derived from two fundamental concepts in decentralized data storage: the cryptographic **CID** and the data structure known as a **piece**.
The first component, CID (Content Identifier), is a self-describing content-addressed identifier. It originates from the InterPlanetary File System (IPFS) and Filecoin ecosystems, where it serves as a unique fingerprint for any piece of data, generated using cryptographic hash functions like SHA-256. The concept of content addressing itself has roots in peer-to-peer networks and distributed systems, where identifying data by its content, rather than its location, is paramount for verifiability and persistence.
The second component, piece, is a specific construct from the Filecoin protocol. A piece is typically a CAR (Content Addressable aRchive) file containing user data that is prepared for storage deals with storage providers (originally called miners). The term reflects the protocol's design, where data is packaged into standardized, power-of-two-sized units that fit within a sector for efficient storage and proof generation. The etymology ties directly to the physical metaphor of breaking data into manageable "pieces" for storage in a decentralized network.
Combined, Piece CID specifically refers to the commitment computed over the padded bytes of the CAR file that constitutes a storage deal's payload, not to the CID of the root node of the IPLD DAG inside that file. This precise definition distinguishes it from a regular data CID; a Piece CID identifies the entire packaged payload for a storage deal, which includes the target data plus any padding required to reach a power-of-two piece size. Its origin is inextricably linked to Filecoin's economic and cryptographic mechanisms for proving storage.
The term's adoption and standardization were driven by the need for an unambiguous identifier in storage deals and retrieval markets. It serves as the primary committed data reference in the Filecoin blockchain's Deal state and underpins the sector commitments that Proof-of-Spacetime submissions attest to. Understanding its dual etymology is key for developers interacting with Filecoin's storage abstraction layer, as it sits at the intersection of content addressing (CID) and storage logistics (piece).
Ecosystem Usage
A Piece CID (Content Identifier) is a cryptographic hash that uniquely identifies a raw, unsealed data segment prepared for storage deals on Filecoin. It is a foundational identifier used throughout the data onboarding and retrieval process.
Data Preparation & CAR Files
Before data can be stored on Filecoin, it is prepared into a Content Addressable aRchive (CAR) file. The Piece CID is generated from this CAR file and represents a committed data size (a power-of-two padded size). This process involves:
- Chunking the source data.
- Arranging it into a Merkle DAG (IPLD).
- Serializing the DAG into a CAR file.
- Calculating the commP (Piece Commitment) hash, which becomes the Piece CID.
Storage Deal Negotiation
The Piece CID is the primary identifier used in Filecoin storage deal proposals. When a client proposes a deal to a storage provider, the proposal includes the Piece CID and the piece size. This allows both parties to agree on the exact data to be stored. The Piece CID is recorded on-chain as part of the deal metadata, creating a public, verifiable record of the storage agreement.
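The two fields named above can be sanity-checked before a proposal is sent, as in the following Python sketch. The ProposalSketch class is hypothetical and is not the on-chain deal schema, which carries additional fields. The check assumes Piece CIDs are base32 strings starting with baga6ea4seaq and that piece sizes are powers of two of at least 128 bytes.

```python
from dataclasses import dataclass
from typing import List

PIECE_CID_PREFIX = "baga6ea4seaq"  # assumed base32 prefix of Piece CIDs
MIN_PIECE_SIZE = 128               # assumed smallest padded piece size, in bytes


@dataclass(frozen=True)
class ProposalSketch:
    """Hypothetical subset of a deal proposal: only the piece fields."""
    piece_cid: str
    piece_size: int  # padded piece size in bytes


def proposal_problems(proposal: ProposalSketch) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    if not proposal.piece_cid.startswith(PIECE_CID_PREFIX):
        problems.append("piece_cid does not look like a Piece CID")
    size = proposal.piece_size
    if size < MIN_PIECE_SIZE or size & (size - 1):
        problems.append("piece_size is not a power of two >= 128")
    return problems


good = ProposalSketch(PIECE_CID_PREFIX + "a" * 52, 1 << 20)
print(proposal_problems(good))
```

A check like this only catches malformed input. Agreement on the actual data still depends on both parties recomputing the commitment from the piece bytes.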
Sealing & Sector Commitment
Storage providers use the Piece CID during the sealing process. The padded piece (identified by its Piece CID) is packed into a sector, possibly alongside other pieces. The provider generates a Sealed CID (CommR) from the encoded data within the sealed sector. The provider then submits the sector's commitments and a Proof-of-Replication to the chain, cryptographically linking the Sealed CID back to the original Piece CID and proving the correct data was sealed.
Retrieval & Content Addressing
To retrieve data, clients use the Data CID (root CID of the IPLD DAG), not the Piece CID. However, the Piece CID is crucial for the storage layer's integrity. Retrieval deals may reference the original storage deal ID, which contains the Piece CID. This creates a verifiable chain from the requested content (Data CID) back to the proven storage commitment (Piece CID → Sealed CID).
Verification & Proofs
The Piece CID is central to Filecoin's proof system. WindowPoSt and WinningPoSt challenges indirectly verify the continued storage of the data associated with the Sealed CID, which is tied to the Piece CID through the sector's commitments. Auditors or clients can verify that a storage deal's on-chain Piece CID matches the data they intended to store, providing cryptographic proof of storage.
Interoperability with IPFS
A Piece CID is distinct from an IPFS CID. An IPFS CID (Data CID) identifies the logical content via a Merkle DAG, while a Piece CID identifies the physical byte layout for storage. The same content can have multiple Piece CIDs if prepared with different piece sizes or padding. This distinction separates the content-addressing layer (IPFS) from the verifiable storage layer (Filecoin).
Technical Details
A Piece CID (Content Identifier) is a cryptographic hash that uniquely identifies a raw, unsealed data segment prepared for storage on a Filecoin storage provider. It is a foundational concept for data onboarding and verification in decentralized storage networks.
A Piece CID is generated by hashing the padded, unsealed data into a SHA-256-based Merkle tree and encoding the root as a CIDv1 with the fil-commitment-unsealed codec (0xf101). Unlike a regular file CID, a Piece CID represents the data exactly as it will be committed to a storage deal, before any sealing or proof-of-replication processes occur. It serves as the primary reference for clients and storage providers to agree on the exact data being stored.
Common Misconceptions
Piece CIDs are fundamental to Filecoin's data onboarding and storage verification, but their specific role and technical details are often misunderstood. This section clarifies the most frequent points of confusion.
A Piece CID is not the same as a regular CID. It is a specific type of Content Identifier (CID) that represents a raw, unsealed data segment prepared for Filecoin storage. A regular CID (like a CIDv1 for a UnixFS file) typically identifies the logical content (e.g., a file or directory), while a Piece CID identifies the exact binary payload that a storage provider commits to storing. The Piece CID is computed from the bytes of the data after padding to the piece size and is used in the storage deal and Proof-of-Replication process. You cannot directly retrieve your file using a Piece CID alone; you need the corresponding Data CID and retrieval deal.
Frequently Asked Questions
A Piece CID is a fundamental identifier in Filecoin, built on the CID format it shares with IPFS. These questions address its core purpose, technical details, and practical applications.
A Piece CID is a cryptographic hash that uniquely identifies a raw, unsealed data segment prepared for storage deals on the Filecoin network. It works by applying SHA-256, arranged as a Merkle tree, to the Piece, which is the data formatted to a specific power-of-two size (e.g., 256 KiB, 1 MiB) with zero-padding where needed. The resulting root is then encoded into a Content Identifier (CID) using the fil-commitment-unsealed codec (0xf101). The Piece CID serves as the primary content address for the data in the storage deal protocol, allowing clients and storage providers to agree on the exact data being committed before sealing and proving.