Privacy is not a feature you can bolt on; it's a foundational property that must be engineered into a system's architecture from the start. In Web3, where transparency is often a default, designing for privacy involves deliberate trade-offs between data exposure, user sovereignty, and system functionality. Documenting these decisions is critical for several reasons: it provides a single source of truth for your team, enables external audits and security reviews, and serves as a historical record of the rationale behind complex cryptographic choices, such as selecting a zero-knowledge proof system or a specific trust model for a mixer.
How to Document Privacy Architecture Decisions
A systematic approach to creating clear, actionable, and auditable records of privacy design choices in Web3 systems.
Effective documentation follows a structured template that captures the what, why, and how of each decision. A standard entry should include: the Decision (a clear statement of what was chosen), the Status (e.g., Proposed, Accepted, Deprecated), the Context (the problem being solved), the Decision Drivers (requirements like scalability, trust minimization, or regulatory compliance), the Considered Options (e.g., zk-SNARKs vs. zk-STARKs, trusted setup vs. transparent setup), and the Consequences (the trade-offs, integration needs, and future implications). This format, inspired by Architecture Decision Records (ADRs), transforms subjective discussions into objective, reviewable artifacts.
For example, documenting the choice to use Semaphore for anonymous signaling in a DAO would detail the context of needing identity-proofed but action-private voting. The considered options might include Tornado Cash (for payments, not signaling) and custom zk-SNARK circuits. The final decision would be justified by Semaphore's reusable identity model, its non-custodial design, and the availability of audited contracts. The consequences section would note the dependency on a trusted ceremony for the initial setup and the gas cost of generating proofs. This concrete record prevents future developers from questioning the choice without understanding its full context.
Integrate this documentation into your development lifecycle. Store ADRs in version control (e.g., a /docs/adr/ directory in your GitHub repository) alongside the code they govern. Link decisions to specific pull requests, audit reports, and protocol upgrades. This creates a verifiable audit trail that is invaluable for security researchers, auditors, and even users assessing the protocol's privacy guarantees. Tools like adr-tools or madr can help manage these records. The goal is to make the privacy architecture as transparent and understandable as the public blockchain data it aims to protect.
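As a sketch of how such ADRs might be kept in version control, the short script below scaffolds a new record with the sections listed above. The file layout, field names, and `docs/adr/` path are illustrative conventions, not a standard:

```python
from datetime import date
from pathlib import Path

# Template mirroring the ADR sections described above.
ADR_TEMPLATE = """# ADR-{number:04d}: {title}

- Status: {status}
- Date: {today}

## Context
{context}

## Decision Drivers
{drivers}

## Considered Options
{options}

## Decision
{decision}

## Consequences
{consequences}
"""

def scaffold_adr(number: int, title: str, directory: str = "docs/adr") -> Path:
    """Create a new ADR file pre-filled with the standard sections."""
    path = Path(directory)
    path.mkdir(parents=True, exist_ok=True)
    slug = title.lower().replace(" ", "-")
    adr_file = path / f"{number:04d}-{slug}.md"
    adr_file.write_text(ADR_TEMPLATE.format(
        number=number,
        title=title,
        status="Proposed",
        today=date.today().isoformat(),
        context="TODO: the problem being solved",
        drivers="TODO: e.g. trust minimization, gas cost, compliance",
        options="TODO: e.g. zk-SNARKs vs. zk-STARKs",
        decision="TODO: what was chosen and why",
        consequences="TODO: trade-offs and future implications",
    ))
    return adr_file
```

Running `scaffold_adr(1, "Use Semaphore for anonymous signaling")` creates `docs/adr/0001-use-semaphore-for-anonymous-signaling.md`, ready for the team to fill in and review in a pull request.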
Prerequisites and Audience
This guide details a systematic approach for documenting privacy architecture in blockchain systems, from zero-knowledge proofs to confidential transactions.
This guide is written for protocol engineers, security researchers, and technical product managers building or auditing privacy-centric Web3 applications. You should have a foundational understanding of blockchain architecture, including concepts like state, transactions, and consensus. Familiarity with cryptographic primitives such as hashing, digital signatures, and public-key infrastructure is assumed. Experience with smart contract development on platforms like Ethereum or Solana is beneficial for contextual examples.
The primary goal is to move beyond ad-hoc notes and establish a formal, auditable record of privacy decisions. We will cover how to document the choice of privacy technology (e.g., zk-SNARKs vs. zk-STARKs, confidential assets), threat models, data flow diagrams, and the trade-offs between privacy, scalability, and auditability. This process is critical for security reviews, regulatory compliance, and onboarding new team members to complex systems.
You will need a tool for creating and versioning technical documents. While a simple Markdown file in a Git repository (like GitHub or GitLab) is sufficient, dedicated architecture decision record (ADR) tools or Notion/Docusaurus can provide better structure. For diagramming data flows and system boundaries, tools like Mermaid.js (for code-based diagrams), Draw.io, or Lucidchart are recommended. The focus is on the content and structure of the documentation, not the specific software used to create it.
We will structure documentation around key decision points. For each major privacy feature—such as implementing a shielded pool or private voting—you should document: the context and problem statement, the considered solutions (e.g., Tornado Cash's model vs. Aztec's zk-rollup), the decision with rationale, and the consequences (positive and negative). This mirrors the Architecture Decision Record (ADR) pattern, applied specifically to privacy concerns.
By the end of this guide, you will be able to produce clear, actionable documentation that answers critical questions: What user data is kept private and from whom? What cryptographic assumptions does the system rely on? How does privacy interact with the chain's consensus and data availability? This creates a single source of truth that is invaluable for long-term maintenance and security audits.
A Framework for Privacy Documentation
A systematic approach to documenting privacy architecture decisions for Web3 protocols, ensuring clarity, auditability, and compliance.
Privacy in Web3 is not a single feature but a system property that must be architected from the ground up. Effective documentation of these architectural decisions is critical for developer onboarding, security audits, and regulatory compliance. This framework provides a structured template to capture the why, what, and how of your protocol's privacy model, moving beyond implementation details to document the core trade-offs and guarantees.
Start by defining the privacy model and its threat assumptions. Clearly state what information you are protecting (e.g., transaction amounts, sender/receiver identities, smart contract state) and against whom (e.g., public blockchain observers, other users, validators). For example, a protocol using zk-SNARKs like zkSync or Aztec must document its trust model regarding setup ceremonies and prover honesty. This section answers the fundamental question: What does 'private' mean for this system?
Next, document the cryptographic primitives and data flow. Specify the algorithms in use (e.g., Elliptic Curve Diffie-Hellman for key agreement, Poseidon hash for zk-circuits) and their parameters. Diagram and describe how data moves through the system, highlighting where plaintext exists, where encryption/commitment occurs, and where zero-knowledge proofs are generated and verified. This creates a verifiable map for auditors to assess potential leakage points.
A crucial, often overlooked component is privacy metadata and policy. Document the retention period for any transient private data, key rotation schedules, and data deletion procedures. For protocols handling identity credentials, specify the W3C Verifiable Credentials data model or decentralized key management (DKMS) specifications you adhere to. This operational documentation is essential for complying with regulations like GDPR, which grants users the 'right to be forgotten.'
Finally, maintain a decision log for future reference. For each major architectural choice—such as selecting Tornado Cash-like pools over zk-rollups—record the alternatives considered, the trade-offs analyzed (e.g., trust assumptions vs. scalability), and the rationale for the final decision. This log, inspired by Architecture Decision Records (ADRs), provides invaluable context for future developers and mitigates the risk of well-intentioned but privacy-breaking changes during protocol upgrades.
Core Concepts to Document
Documenting privacy design decisions is critical for security audits, team alignment, and protocol upgrades. These are the key architectural components to detail.
Data Minimization Strategy
Define what data is stored on-chain vs. off-chain and the justification for each location.
- On-chain: Public commitments, state roots, or verification keys.
- Off-chain: Private inputs, witness data, or transaction details.
- Example: A zk-rollup stores validity proofs on-chain but keeps transaction details off-chain, minimizing public data leakage.
Trust Assumptions & Threat Model
Explicitly state the trust model and potential adversaries.
- Trusted setup: Document whether a multi-party trusted-setup ceremony is required (as with Groth16) and its security implications.
- Adversarial models: Consider malicious validators, data availability failures, or centralized sequencers.
- Example: A system using a 1-of-N trust assumption for data availability is vulnerable if all parties collude.
Cryptographic Primitives
Specify the core cryptographic components and their parameters.
- ZK-SNARKs vs. STARKs: Document the choice based on proof size, verification cost, and post-quantum security.
- Commitment schemes: Merkle trees, KZG commitments, or vector commitments.
- Example: Using BLS12-381 elliptic curve for pairing-based SNARKs offers ~128-bit security and efficient verification.
Privacy vs. Compliance Levers
Document mechanisms for regulatory compliance without breaking privacy guarantees.
- View keys: Allow designated parties to decrypt transaction histories.
- Selective disclosure: Use zero-knowledge proofs to prove compliance (e.g., proof of solvency, age > 18) without revealing underlying data.
- Example: Tornado Cash's compliance tool let users generate a report cryptographically linking a withdrawal to its original deposit, for voluntary disclosure to a regulator.
Anonymity Set Management
Detail how the system creates and maintains user anonymity.
- Set size: Document the theoretical maximum and practical size of mixing pools or rollup batches.
- Deposit/withdrawal linking: Explain techniques to prevent chain analysis, like using relayers or stealth addresses.
- Metric: A larger anonymity set (e.g., 100+ users per batch) increases privacy but may impact throughput.
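As a rough illustration of the metric above, the information-theoretic upper bound on anonymity for a set of size n is log2(n) bits. This assumes every set member is equally likely to be the true sender — a simplification, since real-world anonymity sets are rarely uniform:

```python
import math

def anonymity_bits(set_size: int) -> float:
    """Upper bound on anonymity in bits, assuming every member of the
    anonymity set is equally likely to be the true sender (uniform)."""
    if set_size < 1:
        raise ValueError("anonymity set must contain at least one member")
    return math.log2(set_size)

# Doubling a mixing pool or batch adds exactly one bit of (ideal) anonymity,
# which is why set growth yields diminishing returns per added user.
```

Documenting this metric per batch or pool makes the privacy/throughput trade-off in the bullet above quantifiable during review.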
Upgradeability & Governance
Plan for protocol evolution and emergency responses.
- Upgrade mechanisms: Timelocks, multi-sigs, or on-chain governance.
- Privacy implications: An upgrade could introduce a backdoor; document the process for verifying new circuit logic.
- Example: Aztec's upgradeable contract architecture uses a security council and a 14-day timelock for major changes.
Step 1: Document the Threat Model
A threat model is a structured representation of the security and privacy risks your system faces. Documenting it first ensures all subsequent architecture decisions are made with a clear understanding of the adversary.
Before writing a line of code or choosing a cryptographic primitive, you must define what you are protecting and from whom. A threat model answers these questions. It identifies your system's assets (e.g., user transaction history, private keys, on-chain state), the adversaries who might target them (e.g., malicious validators, network-level observers, compromised application frontends), and their capabilities (e.g., can they run 50% of network nodes? Can they observe network traffic?). This document becomes the single source of truth for your project's security posture.
For a privacy-focused application, common threat models include:
- Network-level adversary: An entity that can monitor and potentially modify network traffic between users and the blockchain. This is the baseline threat for any system not using encrypted transport.
- Malicious smart contract or frontend: A corrupted dApp component that attempts to leak user data.
- Blockchain consensus adversary: Validators or miners colluding to censor transactions or extract metadata from the mempool.

Documenting these threats explicitly prevents unstated assumptions and ensures the architecture addresses real, not hypothetical, risks.
Your threat model documentation should be concise and living. Use a simple template:
1. Assets: List the sensitive data (e.g., "the link between a user's wallet address and their off-chain identity").
2. Adversaries & Capabilities: Define who you defend against (e.g., "an RPC provider with the capability to log all query requests").
3. Trust Assumptions: State what you do trust (e.g., "the underlying zk-SNARK circuit is correct").
4. Security Objectives: Define success (e.g., "achieve unlinkability between two transactions from the same user").
This document will directly inform your choice of privacy architecture, whether it requires a trusted setup, secure enclaves, or zero-knowledge proofs.
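One way to keep the four-part template above reviewable in CI is to encode it as structured data and lint it for completeness. This is a sketch; the field names simply mirror the template and are not a standard schema:

```python
# Example threat model encoded as data, using the guide's template fields.
threat_model = {
    "assets": [
        "link between a user's wallet address and their off-chain identity",
        "transaction amounts inside the shielded pool",
    ],
    "adversaries": [
        {"name": "RPC provider", "capabilities": ["log all query requests"]},
        {"name": "network observer", "capabilities": ["monitor traffic", "correlate timing"]},
    ],
    "trust_assumptions": ["the underlying zk-SNARK circuit is correct"],
    "security_objectives": ["unlinkability between two transactions from the same user"],
}

def validate_threat_model(model: dict) -> list[str]:
    """Return the required sections that are missing or empty
    (an empty list means the model is structurally complete)."""
    required = ("assets", "adversaries", "trust_assumptions", "security_objectives")
    return [key for key in required if not model.get(key)]
```

A pre-merge hook could run `validate_threat_model` over the parsed document and fail the build if any section is empty, keeping the threat model "living" as the text recommends.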
Step 2: Specify Cryptographic Primitives
This section details the selection and justification of the core cryptographic building blocks that form the security foundation of your privacy system.
Documenting your chosen cryptographic primitives is a critical step in defining your system's trust model and security guarantees. This goes beyond simply listing algorithms; it involves specifying the exact elliptic curves, hash functions, signature schemes, and zero-knowledge proof systems you will use, along with their parameters and versions. For example, you must decide between the BN254 curve (common in older Groth16 deployments) and newer pairing-friendly curves like BLS12-381, which offer larger security margins and efficient recursive proofs. Each choice has profound implications for proof size, verification speed, and compatibility with existing tooling and hardware.
Your documentation should explicitly justify each selection against the system's requirements. For a privacy-preserving payment system, you might write: "We selected the Pedersen Commitment scheme for balance encryption because it provides perfect hiding and computational binding under the discrete logarithm assumption, which is sufficient for our threat model. Commitments will be generated on the Jubjub curve (a twisted Edwards curve embedded within BLS12-381) for efficient integration with our zk-SNARK circuit." This level of specificity allows for accurate security audits and prevents ambiguous implementations.
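To make the Pedersen example above concrete, here is a toy commitment C = g^m · h^r mod p over a small multiplicative group. This is for illustration only and is insecure: real deployments use an elliptic-curve group such as Jubjub, with generators whose discrete-log relation is provably unknown, whereas the tiny modulus and hard-coded generators here are arbitrary:

```python
# Toy parameters -- INSECURE, for illustration only.
P = 1019          # safe prime: P = 2*Q + 1
Q = 509           # prime order of the quadratic-residue subgroup
G, H = 4, 9       # two subgroup generators (in practice h must be derived
                  # so that log_g(h) is unknown to everyone, or binding fails)

def commit(message: int, blinding: int) -> int:
    """Pedersen commitment C = g^m * h^r mod p: hiding (r masks m)
    and binding (under the discrete logarithm assumption)."""
    return (pow(G, message % Q, P) * pow(H, blinding % Q, P)) % P

def open_commitment(c: int, message: int, blinding: int) -> bool:
    """Verify an opening (m, r) against a previously published commitment."""
    return c == commit(message, blinding)
```

The additive homomorphism — `commit(a, r1) * commit(b, r2) % P == commit(a + b, r1 + r2)` — is what lets confidential-transaction schemes check balance equations over hidden amounts, which is worth stating explicitly in the primitive's documentation.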
For systems employing zero-knowledge proofs, you must document the proof system (e.g., Groth16, Plonk, Halo2), the trusted setup requirements (e.g., Powers of Tau ceremony, no trusted setup), and the arithmetic circuit or virtual machine they target. Include the library or framework, such as circom or halo2-lib, and the backend prover (e.g., arkworks, bellman). A clear specification enables developers to correctly implement the proving logic and allows third parties to verify the system's cryptographic claims independently.
Finally, address cryptographic agility and future-proofing. Specify how the system will handle the deprecation of a primitive, like a transition from SHA-256 to a quantum-resistant hash function. Document any pre-compiled contracts on the target blockchain (e.g., Ethereum's ECADD and ECMUL for BN254) that your design leverages for cost reduction. This complete specification transforms abstract privacy goals into a concrete, implementable, and auditable technical blueprint.
Step 3: Map the Data Lifecycle
This step involves creating a detailed, visual model of how sensitive data flows through your system, from collection to deletion, to inform and justify your privacy architecture.
A data lifecycle map is a visual artifact that traces the journey of a specific data element, such as a user's wallet address or transaction history, through your application. It documents each processing stage: collection, storage, usage, sharing, archival, and deletion. For each stage, you must identify the data controller (who decides how and why data is processed), the data processor (who performs the processing), the legal basis for processing (e.g., user consent, contractual necessity), and the technical safeguards in place (e.g., encryption, access controls). This map moves your documentation from abstract principles to concrete system design.
Start by identifying your high-risk data flows. In Web3, these often involve off-chain components that handle personal data, such as KYC verification services, centralized user databases, or analytics platforms. For example, map the flow when a user signs up: their email is collected via a frontend form, transmitted to a backend API, stored in a PostgreSQL database encrypted at rest, processed by a notification service, and shared with a third-party email provider via an API key. Documenting this flow reveals points where data minimization, retention policies, and access logging must be enforced.
Use a standard notation like Data Flow Diagrams (DFDs) or the LINDDUN privacy threat modeling framework to structure your map. Clearly label external entities (Users, Third-Party Oracles), data stores (IPFS, Cloud DB), processes (Smart Contract Function, API Endpoint), and data flows. For each component, annotate key decisions: Why is this data stored here? Who has access? Is it pseudonymized? When is it deleted? This exercise forces explicit justification for architectural choices, such as choosing zero-knowledge proofs to process data without exposing it or using decentralized storage like Ceramic for user-controlled data pods.
Integrate this map with your technical specifications. For instance, in your smart contract comments or API documentation, reference the lifecycle stage. A function comment might state: // Processes user preference data (Lifecycle: Usage). Data is ephemeral, held in memory only, and not persisted. Legal Basis: User Consent. This creates a traceable link between your architecture decisions, the implemented code, and the documented privacy requirements, which is crucial for audits and regulatory compliance like GDPR or CCPA.
Finally, treat the data lifecycle map as a living document. It must be updated with every significant change to your data infrastructure, new third-party integration, or protocol upgrade. Regular reviews of this map are essential for privacy impact assessments (PIAs) and help ensure that your system's evolution does not inadvertently create new privacy risks or data leakage points. This documented history of decisions also provides invaluable context for future development teams and auditors.
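As a sketch, the sign-up flow described above can be encoded as lifecycle records that a CI script lints, so undocumented stages are caught automatically. The stage names follow the lifecycle described in this step; the dataclass fields are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

@dataclass
class LifecycleStage:
    stage: str            # collection | storage | usage | sharing | archival | deletion
    processor: str        # who performs the processing
    legal_basis: str      # e.g. "user consent", "contractual necessity"
    safeguards: list[str] = field(default_factory=list)

# The email sign-up flow mapped in the text, one record per stage.
email_lifecycle = [
    LifecycleStage("collection", "frontend form", "user consent", ["TLS in transit"]),
    LifecycleStage("storage", "PostgreSQL database", "user consent",
                   ["encryption at rest", "access logging"]),
    LifecycleStage("sharing", "third-party email provider", "user consent",
                   ["scoped API key"]),
    LifecycleStage("deletion", "backend retention job", "user consent",
                   ["30-day retention policy"]),
]

def unguarded_stages(lifecycle: list[LifecycleStage]) -> list[str]:
    """Flag stages documented without any technical safeguard."""
    return [s.stage for s in lifecycle if not s.safeguards]
```

Keeping the map as data rather than prose makes "update on every infrastructure change" enforceable: a review fails whenever a new stage appears with an empty `safeguards` list.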
Cryptographic Decision Log Template
A structured template for documenting and comparing cryptographic choices in a privacy architecture.
| Decision Factor | Symmetric Encryption | Asymmetric Encryption | Zero-Knowledge Proofs |
|---|---|---|---|
| Primary Use Case | Bulk data encryption | Key exchange & digital signatures | Privacy-preserving verification |
| Key Management Complexity | High (shared secret) | Medium (public/private pair) | High (trusted setup, circuit keys) |
| Computational Overhead | Low | Medium | Very High |
| Post-Quantum Resistance | AES-256 (resistant with 256-bit keys) | Most schemes vulnerable | STARKs resistant (hash-based); pairing-based SNARKs vulnerable |
| Example Protocols | AES-GCM, ChaCha20-Poly1305 | RSA, ECDSA, EdDSA | Groth16, Plonk, Halo2 |
| Data Provenance | | | |
| Suitable for On-Chain Data | No (decryption key would be exposed) | Signatures only | Yes (proofs verified on-chain) |
| Audit Trail Clarity | Low (opaque ciphertext) | High (verifiable signatures) | Medium (proof validity only) |
Step 4: Document ZK Circuit Logic
Formalizing the design decisions behind your zero-knowledge circuits is critical for security audits, team alignment, and future maintenance. This step creates the single source of truth for your privacy architecture.
Effective documentation for a ZK circuit, such as one built with Circom or Halo2, goes beyond code comments. It captures the cryptographic assumptions, trusted setup requirements, and constraint system rationale. Start by creating a dedicated SPECIFICATION.md file in your project root. This document should explicitly state the circuit's purpose (e.g., "Proves knowledge of a valid Merkle proof for a whitelist"), list all public and private inputs/outputs with their types, and detail any external dependencies like specific hash functions or elliptic curves.
The core of the documentation is the constraint logic breakdown. For each major component of your circuit, describe in plain language what property it enforces. For a Circom circuit, this means explaining the relationship between the signals within each template. For example: "Template MerkleProofInclusion takes a leaf, a path index, and a sibling array. It iteratively hashes to reconstruct the root, enforcing that the provided leaf exists at the claimed position within the tree." Include diagrams or pseudocode for complex state transitions.
Crucially, document all security considerations and trade-offs. Note any range limits on inputs, the implications of a trusted setup, and assumptions about pre-image resistance of chosen hash functions. If you optimized for proving time at the expense of verification key size, state that decision and its rationale. This transparency is invaluable for auditors and prevents future developers from unknowingly introducing vulnerabilities when modifying the circuit.
Finally, integrate this specification with your testing strategy. Reference the documented constraints in your test files. For instance, a test case should verify the "MerkleProofInclusion fails with an incorrect sibling" property you described. Tools like zkREPL or gnark's test circuit compilation can generate witness data to validate your documentation against actual circuit execution. This creates a living document that ensures the implemented logic matches the designed intent.
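The `MerkleProofInclusion` constraint described above can be mirrored by a reference implementation that test vectors are checked against — exactly the "living document" linkage this step recommends. SHA-256 stands in here for whatever circuit-friendly hash (e.g., Poseidon) the actual circuit uses:

```python
import hashlib

def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into a parent node."""
    return hashlib.sha256(left + right).digest()

def compute_root(leaf: bytes, path_indices: list[int], siblings: list[bytes]) -> bytes:
    """Reconstruct the Merkle root the same way the circuit does: at each
    level the path index selects hashing order (0 = current node on the
    left, 1 = current node on the right)."""
    node = leaf
    for index, sibling in zip(path_indices, siblings):
        node = hash_pair(sibling, node) if index else hash_pair(node, sibling)
    return node

def verify_inclusion(root: bytes, leaf: bytes,
                     path_indices: list[int], siblings: list[bytes]) -> bool:
    """True iff the leaf exists at the claimed position under this root."""
    return compute_root(leaf, path_indices, siblings) == root
```

Each documented constraint ("fails with an incorrect sibling", "fails with a wrong path index") then maps one-to-one onto a negative test case against this reference.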
Tools and Resources
These tools and frameworks help teams document privacy architecture decisions in a way that is reviewable, auditable, and compatible with security and compliance workflows. Each resource focuses on making privacy tradeoffs explicit rather than implicit.
Data Flow Diagrams and Personal Data Mapping
Data flow diagrams (DFDs) and personal data maps are foundational artifacts for documenting privacy architecture. They show where data originates, how it moves, and where it is transformed or stored.
Effective privacy data mapping should explicitly label:
- Data categories (identifiers, behavioral data, cryptographic material)
- Trust boundaries, such as off-chain services vs smart contracts
- Persistence layers, including logs, caches, and analytics pipelines
- Third-party dependencies, such as RPC providers or indexers
For blockchain systems, this often reveals privacy risks outside the protocol itself, including frontend telemetry, node providers, and monitoring tools. Keeping these diagrams versioned alongside architecture documents helps reviewers understand how privacy posture changes over time.
These diagrams are frequently reused as inputs to DPIAs and threat models.
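A minimal sketch of a DFD captured as data rather than a drawing: each flow records its data category, whether it crosses a trust boundary, and what protection applies, so a review script can flag plaintext boundary crossings. Component names and the `protection` vocabulary are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataFlow:
    source: str
    destination: str
    data_category: str       # identifier | behavioral | cryptographic material
    crosses_boundary: bool   # e.g. browser -> third-party RPC provider
    protection: str          # "plaintext" | "tls" | "encrypted" | "committed"

# Flows mirroring the labels recommended above, including the off-protocol
# risks the text mentions (telemetry, node providers).
flows = [
    DataFlow("wallet frontend", "RPC provider", "identifier", True, "tls"),
    DataFlow("wallet frontend", "analytics pipeline", "behavioral", True, "plaintext"),
    DataFlow("smart contract", "chain state", "cryptographic material", False, "committed"),
]

def risky_flows(dfd: list[DataFlow]) -> list[DataFlow]:
    """Flows that leave a trust boundary without any protection."""
    return [f for f in dfd if f.crosses_boundary and f.protection == "plaintext"]
```

Versioning this file alongside the architecture documents gives reviewers a diffable record of how the privacy posture changes over time, and its output feeds directly into DPIAs and threat models.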
Frequently Asked Questions
Common questions and clarifications for developers implementing privacy-preserving systems on-chain.
What is the difference between privacy and confidentiality on-chain?

In blockchain contexts, privacy and confidentiality are distinct but related concepts. Privacy refers to the ability to obscure the link between a user's identity and their on-chain activity. Tools like zero-knowledge proofs (ZKPs) or mixers provide privacy.
Confidentiality specifically protects the content of a transaction or data state from being publicly visible, while still allowing its validity to be verified. Confidential transactions (e.g., using Pedersen Commitments) hide amounts, and zk-SNARKs can prove a state transition is correct without revealing the inputs.
Key distinction: You can have confidentiality without strong privacy (e.g., a hidden amount tied to a known address), and privacy without full confidentiality (e.g., a known transaction amount sent from a shielded address).
How to Document Privacy Architecture Decisions
Effective documentation is the cornerstone of sustainable privacy engineering. This guide outlines a systematic approach to recording architectural decisions for privacy-preserving systems.
Documenting privacy architecture decisions creates a critical institutional memory. It answers the why behind design choices, such as selecting a specific zero-knowledge proof system like Halo2 or a private computation framework like Sunscreen. This is essential for onboarding new team members, conducting security audits, and justifying compliance with regulations like GDPR. A well-maintained decision log prevents knowledge silos and ensures that future modifications don't inadvertently weaken the system's privacy guarantees. Start by creating a single source of truth, such as a dedicated PRIVACY_DECISIONS.md file in your repository or a section in your technical architecture document.
Each documented decision should follow a consistent template. We recommend the Architecture Decision Record (ADR) format, adapted for privacy concerns. A standard entry includes: Title and Status (e.g., "Proposed", "Accepted", "Deprecated"), Context (the problem and forces at play), Decision (the chosen solution), and Consequences (the trade-offs and impact). For privacy, add a Privacy Impact Assessment section detailing the data lifecycle, threat model considerations, and how the decision aligns with principles like data minimization. For example, an ADR for using zk-SNARKs over MPC would detail the trade-off between on-chain verifiability and computational overhead.
Integrate documentation into your development workflow. Decisions should be recorded at key milestones: during the initial design phase, after security reviews, and when implementing major features. Use pull request descriptions to link code changes to their corresponding ADR. Tools like adr-tools or log4brains can help manage ADRs. Furthermore, document not just the what and why, but also the how. Include references to the specific implementations, such as the Circom circuit repository or the Aztec.nr contract, and note any audits or formal verifications performed. This creates a verifiable trail of the privacy engineering process.
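A hedged sketch of the kind of CI check described above: a linter that fails when an ADR file is missing a required section, including the privacy-specific one this guide adds. The heading names follow the template suggested here; adapt the list to your own template:

```python
import re

# Sections every ADR in this repository must contain, per the template above.
REQUIRED_SECTIONS = [
    "Context",
    "Decision",
    "Consequences",
    "Privacy Impact Assessment",
]

def missing_sections(adr_text: str) -> list[str]:
    """Return the required section headings absent from an ADR document
    (markdown headings of any level are accepted)."""
    headings = set(re.findall(r"^#{1,6}\s+(.*?)\s*$", adr_text, flags=re.MULTILINE))
    return [s for s in REQUIRED_SECTIONS if s not in headings]
```

Wired into a pre-merge hook, this turns "every decision gets a Privacy Impact Assessment" from a convention into an enforced invariant.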
Maintaining this documentation is an ongoing responsibility. Schedule regular reviews, perhaps quarterly, to reassess decisions against new threats, technological advancements, or changing regulatory requirements. Mark deprecated decisions clearly and explain what supersedes them. This living document should be accessible to all stakeholders, including developers, auditors, and legal counsel. By treating privacy architecture documentation as a first-class artifact of your project, you build a foundation for long-term security, auditability, and trust in your Web3 application.