introduction
DATA ANALYTICS

Setting Up Cohort Analysis for NFT Platform Users

A technical guide to implementing cohort analysis for tracking user retention and lifetime value on NFT marketplaces and platforms.

Cohort analysis segments users based on a shared characteristic over a specific time period, most commonly their first interaction date. For an NFT platform, this allows you to track how groups of users who minted or purchased their first NFT in a given week or month behave over subsequent periods. Unlike aggregate metrics, cohort analysis isolates the performance of specific user groups, revealing whether platform changes improve long-term engagement for new users. This is critical for measuring the true impact of onboarding flows, feature launches, or marketing campaigns.

The foundational step is defining the cohort key and time intervals. The cohort key is typically the user's first transaction timestamp, extracted from on-chain events like a Transfer or Mint from a null address. Time intervals are the periods you measure behavior against, such as days or weeks since cohort creation. A standard retention analysis SQL query involves joining a cohorts table (user, cohort_week) with a transactions table (user, activity_week) to calculate the percentage of users active in each subsequent period. Tools like Dune Analytics, Flipside Crypto, or your own indexed data warehouse are essential for this ETL process.

For actionable insights, measure metrics relevant to NFT platform health. Retention Rate tracks the percentage of users from a cohort who return to perform any action (list, bid, buy) in a later period. User Lifetime Value (LTV) calculates the total platform fees or revenue generated by a cohort over its lifespan. NFTs Held Over Time analyzes the average number of NFTs retained per user, indicating holder loyalty versus flipper activity. Segmenting cohorts further by mint type (allowlist vs. public) or initial purchase value can uncover starkly different behavioral patterns.
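
As a concrete sketch of the LTV metric, the query below sums platform revenue per cohort. It assumes a user_cohorts mapping (built later in this guide) and a hypothetical fees_usd column on your transactions table, so adjust the names to your schema.

sql
-- Cumulative platform revenue per cohort (sketch; fees_usd is a hypothetical column).
SELECT
  c.cohort_week,
  COUNT(DISTINCT c.user_address) AS cohort_size,
  SUM(t.fees_usd) AS total_fees_usd,
  SUM(t.fees_usd) / COUNT(DISTINCT c.user_address) AS ltv_per_user_usd
FROM user_cohorts c
LEFT JOIN transactions t
  ON t.user_address = c.user_address
GROUP BY 1
ORDER BY 1;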

Implementing this requires a robust data pipeline. Start by ingesting raw blockchain data for your platform's contracts using an RPC node or data provider. Transform this data to create a clean user_first_action fact table. Then, build your cohort analysis queries. Here's a simplified SQL example for weekly retention:

sql
WITH cohorts AS (
  -- each buyer's first trade week defines their cohort
  SELECT
    buyer,
    DATE_TRUNC('week', MIN(block_time)) AS cohort_week
  FROM nft.trades
  WHERE platform = 'your_platform'
  GROUP BY 1
),
activities AS (
  -- one row per buyer per week of activity
  SELECT DISTINCT
    buyer,
    DATE_TRUNC('week', block_time) AS activity_week
  FROM nft.trades
  WHERE platform = 'your_platform'
),
cohort_sizes AS (
  -- denominator: full cohort size, independent of later activity
  SELECT cohort_week, COUNT(*) AS cohort_size
  FROM cohorts
  GROUP BY 1
)
SELECT
  c.cohort_week,
  s.cohort_size,
  FLOOR(DATE_PART('day', a.activity_week - c.cohort_week) / 7) AS week_number,
  COUNT(DISTINCT a.buyer) AS active_users,
  COUNT(DISTINCT a.buyer) * 100.0 / s.cohort_size AS retention_rate
FROM cohorts c
JOIN cohort_sizes s ON s.cohort_week = c.cohort_week
LEFT JOIN activities a
  ON a.buyer = c.buyer
  AND a.activity_week >= c.cohort_week
GROUP BY 1, 2, 3
ORDER BY 1, 3;

Visualize the results using a cohort heatmap, where each row is a cohort and each column is a period since sign-up, with color intensity showing retention percentage. This instantly highlights trends: are newer cohorts retaining better than older ones? A successful platform update should show improving retention in the top-left of the heatmap (newer cohorts, early periods). Pair this with LTV curves to understand revenue implications. Regular cohort analysis transforms vague notions of 'growth' into a precise, actionable understanding of user lifecycle and platform sustainability.
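
If you want the matrix in a heatmap-ready wide format before exporting to a charting tool, a conditional-aggregation pivot works. This sketch assumes the output of the retention query above has been saved as a weekly_retention table (cohort_week, week_number, retention_rate); extend the columns to as many weeks as you track.

sql
-- Pivot retention into one column per week offset (PostgreSQL FILTER syntax).
SELECT
  cohort_week,
  MAX(retention_rate) FILTER (WHERE week_number = 0) AS week_0,
  MAX(retention_rate) FILTER (WHERE week_number = 1) AS week_1,
  MAX(retention_rate) FILTER (WHERE week_number = 2) AS week_2,
  MAX(retention_rate) FILTER (WHERE week_number = 3) AS week_3,
  MAX(retention_rate) FILTER (WHERE week_number = 4) AS week_4
FROM weekly_retention
GROUP BY 1
ORDER BY 1;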

prerequisites
SETUP

Prerequisites and Data Sources

Before analyzing user cohorts, you need the right data infrastructure. This guide covers the essential tools and data sources required to build a robust NFT platform cohort analysis system.

Cohort analysis for an NFT platform requires structured, historical on-chain and off-chain data. The primary data source is a blockchain indexer like The Graph, which allows you to query event logs for user interactions—mints, transfers, sales, and bids—across collections. You'll also need access to marketplace APIs (e.g., OpenSea, Blur) for floor prices, listing data, and metadata. For a complete view, integrate wallet activity data from services like Alchemy or Moralis to track transactions across DeFi, bridging, and other protocols, revealing user sophistication and capital flow.

Your technical stack should include a database for storing processed data (PostgreSQL, or TimescaleDB for time-series workloads), a transformation layer (dbt is a common choice), and a visualization tool (e.g., Metabase, Superset). You can write subgraphs on The Graph to index specific event schemas or reuse existing subgraphs from platforms like NFTX. For custom analysis, consider running an archive node or using an RPC provider with extended history. The key is creating a unified user identifier, typically the wallet address, to link activity across all data sources.

Define the core user actions and time dimensions for your cohorts. Key actions include: first mint (user acquisition), first secondary market purchase, listing an NFT, and selling an NFT. Time dimensions are usually the calendar week or month of the user's first action (cohort creation). You must also calculate derived metrics like average holding time, profit/loss per transaction, and collection diversity index. Clean your data by filtering out contract addresses and wash trading; platforms like Dune Analytics provide useful abstractions for this.
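
As a minimal sketch of that cleaning step, assuming hypothetical trades(buyer, seller, token_id, ...) and contracts(address) tables: the query drops buyers that are known contracts, self-trades, and the simplest back-and-forth wash pattern. Production pipelines layer on more signals (funding links, rapid round trips at inflated prices), but the shape is the same.

sql
-- Basic hygiene filter; table and column names are assumptions.
SELECT t.*
FROM trades t
LEFT JOIN contracts c ON c.address = t.buyer
WHERE c.address IS NULL              -- buyer is not a known contract address
  AND t.buyer <> t.seller            -- no trivial self-trades
  AND NOT EXISTS (                   -- no direct A->B then B->A round trip of the same token
    SELECT 1
    FROM trades r
    WHERE r.buyer = t.seller
      AND r.seller = t.buyer
      AND r.token_id = t.token_id
  );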

A practical first step is to query a subgraph for all Transfer events emitted by the platform's mint contract where the from address is the zero address (i.e., mints). Using GraphQL, you can batch queries to extract minter addresses and timestamps. Process this data in a script to group users by their mint week. Then, enrich this cohort list with secondary sales data from marketplace APIs to track retention. For example, calculate what percentage of users from the Week 1 mint cohort made a purchase in Week 2, 3, and 4. This cohort retention table is the foundation for more advanced behavioral analysis.

Finally, ensure your pipeline is reproducible and automated. Use cron jobs or orchestration tools (Apache Airflow, Prefect) to regularly fetch new block data, update your cohort tables, and refresh dashboards. Document your data schema and ETL logic clearly. By establishing these prerequisites—reliable data sources, a clear identity graph, defined metrics, and an automated pipeline—you build a system that can answer critical questions about user loyalty, lifetime value, and the impact of platform features or market conditions on different user segments.

data-model
CORE FOUNDATION

Defining the Data Model and Cohort Logic

A robust data model and precise cohort logic are the foundation of actionable NFT platform analytics. This guide details how to structure your data and define user cohorts for meaningful analysis.

The first step is to define a data model that captures essential user and transaction events. For an NFT platform, this typically includes tables for users, collections, assets, and events. The events table is the most critical, logging key user actions like wallet_connected, nft_minted, nft_purchased, nft_listed, and nft_transferred. Each event should be timestamped and linked to a user ID (often a hashed wallet address) and relevant asset or collection IDs. This normalized structure allows you to reconstruct any user's journey.
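
One possible shape for that events table, sketched as PostgreSQL DDL; the column names mirror the event types listed above but are assumptions to adapt to your indexer's output.

sql
-- Sketch of the central events fact table (PostgreSQL syntax).
CREATE TABLE events (
    event_id        BIGSERIAL PRIMARY KEY,
    user_id         TEXT        NOT NULL,  -- hashed wallet address
    event_type      TEXT        NOT NULL,  -- wallet_connected, nft_minted, nft_purchased,
                                           -- nft_listed, nft_transferred
    collection_id   TEXT,
    asset_id        TEXT,
    block_timestamp TIMESTAMPTZ NOT NULL,
    tx_hash         TEXT
);

-- Most cohort queries scan by user and time, so index accordingly.
CREATE INDEX idx_events_user_time ON events (user_id, block_timestamp);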

With the raw event stream captured, you define cohort logic to segment users based on their first meaningful action. The most common cohort is based on a user's first transaction date (e.g., the day they first mint or purchase an NFT). This creates a cohort key like 2024-01-15_minters. You can also define cohorts by first interaction with a specific collection or by user attributes like the wallet type (e.g., EOA vs. Smart Contract Wallet). The logic is implemented as a SQL query or transformation in your data pipeline that tags each user with their cohort identifier.

Here is a simplified example of cohort logic in a SQL query. It identifies the first purchase date for each user and assigns them to a weekly cohort:

sql
WITH first_purchase AS (
  SELECT
    user_id,
    DATE_TRUNC('week', MIN(block_timestamp)) AS cohort_week
  FROM nft_purchase_events
  GROUP BY user_id
)
SELECT
  user_id,
  cohort_week,
  FORMAT('Week of %s', cohort_week) AS cohort_name
FROM first_purchase;

This query produces a mapping table that can be joined to all subsequent user activity.

After defining cohorts, you must track cohort-specific metrics over time. The primary analysis is a cohort retention matrix, which shows what percentage of users from a given starting week performed a specific action (like a second purchase) in each subsequent week. Other key metrics include cohort lifetime value (LTV), average transactions per user, and the time to second transaction. Tools like Dune Analytics, Flipside Crypto, or a custom pipeline using ClickHouse or PostgreSQL can compute these aggregates from your event data and cohort mapping.
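
As an example of one of these aggregates, time to second transaction can be computed with a window function over the same nft_purchase_events table used above; treat this as a sketch rather than a full model.

sql
-- Time between each user's first and second purchase.
WITH ranked AS (
  SELECT
    user_id,
    block_timestamp,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY block_timestamp) AS purchase_rank
  FROM nft_purchase_events
)
SELECT
  f.user_id,
  s.block_timestamp - f.block_timestamp AS time_to_second_purchase
FROM ranked f
JOIN ranked s
  ON s.user_id = f.user_id
 AND s.purchase_rank = 2
WHERE f.purchase_rank = 1;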

Finally, apply your cohorts to answer business questions. Compare the retention of users who minted a free NFT versus those who made a paid purchase. Analyze if users from a high-profile collection drop have higher LTV. Test whether a new onboarding flow improved Week-1 retention for the latest cohort. By iterating on your cohort definitions—such as narrowing to "users who purchased within 24 hours of connecting their wallet"—you can isolate the impact of specific platform features or market conditions on long-term user behavior.

sql-cohort-creation
DATA PREPARATION

Step 1: SQL Queries to Create User Cohorts

This guide details the SQL queries required to segment NFT platform users into actionable cohorts based on their first transaction date, a foundational step for retention and engagement analysis.

Cohort analysis groups users based on a shared characteristic over a defined time period, most commonly the date of their first interaction or transaction. For an NFT platform, a user's cohort is typically defined by the week or month they made their first purchase or mint. This initial timestamp becomes the anchor for measuring their subsequent activity, allowing you to compare how different groups of users behave over time. The core SQL task is to assign this cohort label to every user and transaction in your dataset.

The following query demonstrates how to create a user-cohort mapping table. It identifies the earliest transaction date for each wallet address (user_address) from a transactions table, truncates it to a weekly or monthly bucket, and stores the result. Using DATE_TRUNC('week', MIN(block_timestamp)) ensures all users are grouped by the start of the week they first transacted, creating consistent weekly cohorts.

sql
CREATE TABLE user_cohorts AS
SELECT
    user_address,
    DATE_TRUNC('week', MIN(block_timestamp)) AS cohort_week
FROM
    transactions
WHERE
    block_timestamp IS NOT NULL
GROUP BY
    user_address;

With the user_cohorts table created, you can join it back to your granular transaction data to analyze user activity in the context of their cohort. This enriched dataset enables you to calculate key metrics like retention. For example, you can compute the number of weeks post-signup (weeks_since_cohort) a user was active. This query joins the transactions to the cohort mapping and calculates the time difference.

sql
SELECT
    t.user_address,
    c.cohort_week,
    t.block_timestamp,
    FLOOR(DATE_PART('day', t.block_timestamp - c.cohort_week) / 7) AS weeks_since_cohort
FROM
    transactions t
LEFT JOIN
    user_cohorts c
ON
    t.user_address = c.user_address;

For practical analysis, you will likely aggregate this data into a cohort matrix. This involves counting the distinct active users per cohort for each subsequent period. The final query structure groups by cohort_week and weeks_since_cohort to produce a clear matrix showing how many users from the January 1st cohort were active in week 0, week 1, week 2, and so on. This matrix is the direct input for visualizing cohort retention curves in tools like Google Sheets, Looker, or Metabase.
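
Building on the join above, the aggregation itself might look like the following sketch; it reuses the user_cohorts table and the weeks_since_cohort expression from the previous query.

sql
-- Distinct active users per cohort and week offset: the raw cohort matrix.
SELECT
    c.cohort_week,
    FLOOR(DATE_PART('day', t.block_timestamp - c.cohort_week) / 7) AS weeks_since_cohort,
    COUNT(DISTINCT t.user_address) AS active_users
FROM transactions t
JOIN user_cohorts c
  ON c.user_address = t.user_address
GROUP BY 1, 2
ORDER BY 1, 2;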

When implementing this, ensure your transactions table includes a reliable user identifier. For NFT platforms, this is often the from_address or to_address field from on-chain data, filtered for relevant contract interactions (mints, transfers, sales). Using a trusted data source like Dune Analytics, Flipside Crypto, or an indexed Ethereum or Solana dataset is crucial for accuracy. Always verify the timezone settings of your block_timestamp to maintain consistent weekly cuts.

metric-calculation
ANALYTICS CORE

Step 2: Calculating Key Behavioral Metrics

This section details the essential calculations for analyzing user engagement and retention on your NFT platform using the cohort data prepared in Step 1.

With your cohort tables populated, you can now compute the metrics that reveal user behavior. The foundational metric is Cohort Size, which is simply the count of unique users who performed their first action (e.g., first mint, first purchase) in a given time period (e.g., weekly, monthly). This serves as the denominator for all subsequent retention and activity calculations. For example, a query for weekly cohorts might show Cohort_2024_10_28: 1,250 users.

Retention Rate is the most critical behavioral metric. It measures the percentage of users from a specific cohort who return to perform any meaningful action in subsequent periods. The formula is (Returning Users in Period N / Total Users in Cohort) * 100. Calculate this for each cohort across multiple periods (e.g., Week 1, Week 2, Week 4) to build a retention matrix. A sharp drop after Week 1 often indicates onboarding or product-market fit issues, while sustained retention signals strong engagement.

Beyond simple return visits, calculate Engagement Intensity. This involves summing key actions per user, such as total_mints, secondary_market_trades, or wallet_connections. Average these per cohort (total_actions / cohort_size) to see if retained users are becoming more or less active over time. You can also track the evolution of User Lifetime Value (LTV) proxies by calculating the cumulative trading volume or platform fees generated by each cohort over its lifetime.

For NFT platforms, Collection-Specific Metrics are vital. Segment your activity data by contract_address to calculate metrics like unique_collectors_per_cohort or average_hold_time for assets from a specific project. This reveals which NFT drops drive the most loyal communities. Use SQL GROUP BY clauses on your fact tables, joining with cohort labels, to perform this cohort-by-collection analysis.
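
A sketch of that cohort-by-collection grouping, assuming a trades fact table with a contract_address column joined to the user_cohorts mapping; unique_collectors here is the per-cohort collector count mentioned above.

sql
-- Unique collectors per cohort and collection (table names are assumptions).
SELECT
  c.cohort_week,
  t.contract_address,
  COUNT(DISTINCT t.buyer) AS unique_collectors
FROM trades t
JOIN user_cohorts c
  ON c.user_address = t.buyer
GROUP BY 1, 2
ORDER BY 1, unique_collectors DESC;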

Finally, implement these calculations in your data pipeline. Use a templated SQL query or a Python script with libraries like pandas to automate the generation of your retention matrix and engagement summaries. The output should be a clean dataset or dashboard that clearly shows trends, allowing you to answer questions like 'Did our new marketplace feature improve Week-2 retention for Q3 cohorts?'

KEY PERFORMANCE INDICATORS

Core Cohort Metrics and Definitions

Essential metrics for analyzing user behavior and retention on an NFT platform.

Cohort Size
  • Definition: Number of users who performed a qualifying action in a specific time period.
  • Calculation: Count of unique wallet addresses.
  • Primary use case: Segmenting users for behavioral analysis.

Retention Rate
  • Definition: Percentage of a cohort that returns to perform a core action in subsequent periods.
  • Calculation: (Returning Users / Cohort Size) * 100
  • Primary use case: Measuring long-term user engagement and platform stickiness.

Average Transaction Value (ATV)
  • Definition: Mean value of a transaction (in ETH or USD) within a cohort.
  • Calculation: Total Transaction Value / Number of Transactions
  • Primary use case: Assessing user spending habits and revenue contribution.

User Lifetime Value (LTV)
  • Definition: Predicted total revenue a user will generate during their platform lifespan.
  • Calculation: ATV * Purchase Frequency * Avg. User Lifespan
  • Primary use case: Evaluating customer acquisition cost (CAC) ROI and long-term profitability.

Time to First Purchase (TTFP)
  • Definition: Average time between a user's first on-chain interaction and their first NFT purchase.
  • Calculation: Timestamp(First Purchase) - Timestamp(First Interaction)
  • Primary use case: Optimizing onboarding and initial user experience.

Secondary Market Activity Rate
  • Definition: Percentage of users in a cohort who list or trade NFTs on secondary markets.
  • Calculation: (Users with Secondary Trades / Cohort Size) * 100
  • Primary use case: Gauging community engagement and marketplace liquidity health.

Wallet Churn
  • Definition: Percentage of users in a cohort who become inactive (no on-chain interactions) for a defined period.
  • Calculation: (Inactive Users / Cohort Size) * 100
  • Primary use case: Identifying at-risk users and retention opportunities.

visualization-interpretation
COHORT ANALYSIS

Step 3: Visualizing and Interpreting Results

Transform your raw cohort data into actionable insights using visualization techniques. This step focuses on identifying user retention patterns, lifetime value trends, and behavioral shifts over time.

With your cohort data aggregated, the next step is to visualize the metrics to uncover patterns. The most common and powerful visualization is a cohort retention heatmap. This grid-like chart displays each cohort (rows) and their activity over subsequent periods (columns), with color intensity representing the metric value, such as retention rate or average transaction value. Tools like Python's Matplotlib/Seaborn, Plotly, or Google Data Studio can generate these. A well-constructed heatmap instantly reveals which acquisition periods yielded the most loyal users and how engagement decays—or sustains—over time.

Beyond the heatmap, analyze specific trend lines for deeper insight. Plot the Day N Retention for different cohorts to compare long-term stickiness. For example, you might find users who minted during a specific artist drop have a 40% higher 30-day retention than general marketplace minters. Similarly, chart the Cumulative Lifetime Value (LTV) per cohort. This shows the total revenue generated by users from a specific start date, helping you calculate the return on investment for marketing campaigns or gas fee subsidies offered during those periods.

Interpreting these visuals requires asking the right questions. Look for cohort anomalies: a spike in retention might correlate with a successful community event or a new platform feature launch. Conversely, a drop could indicate technical issues or increased competitor activity. Analyze the shape of the curve: do most cohorts show a steep drop-off after the first week (common for speculative traders) or a more gradual decline (indicative of collectors)? Use these insights to segment cohorts further, perhaps separating "whale" minters from smaller collectors to see if retention drivers differ.

Finally, translate these interpretations into actionable decisions for your NFT platform. If visualizations show poor retention after the first NFT purchase, consider implementing onboarding sequences to introduce users to secondary market features. If certain artist cohorts show high LTV, you might develop targeted allowlists or rewards for their future drops. Continuously track these metrics after implementing changes to measure their impact, closing the loop on your data-driven growth strategy.

tools-frameworks
SETUP GUIDE

Tools and Frameworks for Implementation

Implementing cohort analysis requires a stack for data extraction, transformation, and visualization. This guide covers the essential tools for each stage.

advanced-techniques
ADVANCED TECHNIQUES AND CAVEATS

Advanced Techniques and Caveats

Cohort analysis segments users based on shared characteristics, such as their first mint date, to track long-term engagement and retention. This guide covers practical implementation using on-chain data and common analytical pitfalls.

Cohort analysis groups users who performed a specific action during the same time period, allowing you to compare their behavior over time. For an NFT platform, the most common cohort is based on the user's first on-chain interaction, such as minting or purchasing their first NFT from your collection. This initial timestamp becomes their cohort identifier. You can then track metrics like retention rate (percentage of users from a cohort who return to perform another action), average transaction volume, and lifetime value across weekly or monthly cohorts to identify trends in user engagement.

To implement this, you must first index relevant on-chain events. Using a subgraph for an Ethereum-based platform, you can query Transfer events where the from address is the zero address (indicating a mint) to a user's wallet. The query should return the user's address and the block timestamp of their first mint. Store this data with the user's address and their cohort date (e.g., '2024-01-W1'). Subsequent analyses query all transactions, grouping them by these pre-defined cohorts to calculate metrics over time. Tools like Dune Analytics or Flipside Crypto are built for this type of on-chain SQL analysis.
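
The same first-mint cohorting can also be expressed directly in SQL over a decoded Transfer table; the table name, the quoting of from/to, and the zero-address literal below are assumptions that vary by data provider, so verify them against your source.

sql
-- First mint per wallet, bucketed into weekly cohorts (erc721_transfers is a hypothetical decoded table).
SELECT
  "to" AS minter,
  DATE_TRUNC('week', MIN(evt_block_time)) AS cohort_week
FROM erc721_transfers
WHERE "from" = '0x0000000000000000000000000000000000000000'  -- mints only
  AND contract_address = '<your_collection_address>'
GROUP BY 1;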

A key caveat is address aliasing. Users often control multiple wallets. Analyzing a single address provides an incomplete picture and can skew retention data downward if a user switches wallets. To mitigate this, implement entity resolution. Techniques include clustering addresses that interact with the same NFT (via transfers) or funding from the same exchange deposit address. More advanced methods use ERC-4337 Account Abstraction smart accounts, where the user's persistent account address is separate from their key-based wallet addresses, inherently solving the aliasing problem.
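
A very rough entity-resolution heuristic, assuming a hypothetical native_transfers table: treat wallets first funded by the same source address as one entity. Real clustering needs far more signals, but this shows the basic join.

sql
-- Link wallets by their first funding source (PostgreSQL DISTINCT ON syntax).
WITH first_funder AS (
  SELECT DISTINCT ON (to_address)
    to_address   AS wallet,
    from_address AS funder
  FROM native_transfers
  WHERE value > 0
  ORDER BY to_address, block_time
)
SELECT
  funder AS entity_id,
  ARRAY_AGG(wallet) AS linked_wallets
FROM first_funder
GROUP BY funder
HAVING COUNT(*) > 1;  -- only funders that seeded more than one wallet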

Another significant challenge is distinguishing organic user behavior from sybil activity and airdrops. Wallets created solely to farm an airdrop will mint an NFT and never return, artificially depressing cohort retention metrics. Filtering these out requires heuristic analysis: flagging addresses with minimal transaction history, no other DeFi interactions, or those that received the NFT via a free claim from a factory contract. Adjust your cohort definitions accordingly, perhaps by only including users who paid a mint fee above a certain threshold.
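
Those heuristics translate into filters fairly directly; this sketch assumes hypothetical mints and wallet_stats tables, and both thresholds are arbitrary starting points to tune.

sql
-- Keep only minters who paid a non-trivial fee and have broader on-chain history.
SELECT
  m.minter,
  m.mint_week
FROM mints m
JOIN wallet_stats w
  ON w.address = m.minter
WHERE m.mint_price_eth >= 0.01    -- excludes free claims (threshold is arbitrary)
  AND w.lifetime_tx_count >= 10;  -- excludes throwaway airdrop-farming wallets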

When visualizing results, avoid vanity metrics. A cohort retention chart that shows a steep drop-off after Week 1 is expected; the insight is in the slope of the curve thereafter. Compare cohorts from before and after a major platform update or marketing campaign. Did the Week 1 retention for the post-launch cohort improve? Also, segment cohorts further by mint price tier or NFT trait rarity to understand how different user segments behave. This layered analysis reveals whether high-value collectors have better long-term engagement than standard minters.

Finally, integrate off-chain data where possible. While on-chain analysis is powerful, linking a wallet to Discord ID (via token-gating) or website activity can provide context for why a cohort's on-chain activity changed. Did a drop in transactions coincide with reduced Discord engagement? Combining these data sources moves analysis from descriptive to diagnostic, enabling targeted interventions to re-engage specific cohorts and improve overall platform health.

COHORT ANALYSIS

Frequently Asked Questions

Common technical questions and solutions for implementing cohort analysis on NFT platforms using on-chain data.

What is cohort analysis for NFT platform users?

Cohort analysis groups users based on a shared characteristic or event within a defined time period (e.g., users who minted their first NFT in January 2024) and tracks their behavior over time. For NFT platforms, this involves analyzing on-chain data to answer questions about user retention, lifetime value, and engagement.

Key components include:

  • Cohort Definition: The initial action that defines the group, such as first mint, first purchase, or wallet connection.
  • Time Period: The interval for grouping users (daily, weekly, monthly).
  • Metric Tracking: Measuring subsequent actions like secondary sales, repeat mints, or collection holds over subsequent periods.

This analysis is performed by querying blockchain data (e.g., from The Graph, Dune Analytics, or a custom indexer) to segment wallet addresses and aggregate their transaction history.

conclusion
IMPLEMENTATION SUMMARY

Conclusion and Next Steps

You have now configured a foundational cohort analysis system for your NFT platform, enabling data-driven insights into user retention and lifecycle value.

The implemented pipeline transforms raw on-chain and off-chain data—including wallet addresses, transaction timestamps, mint events, and secondary market sales—into actionable cohort tables. By segmenting users based on their first interaction (e.g., first mint date) and tracking their activity over subsequent weeks or months, you can answer critical questions: What percentage of users who minted in Week 1 returned to list an NFT for sale in Week 2? How does lifetime value differ between cohorts that arrived during a free mint event versus a high-profile paid drop? This analysis moves you beyond vanity metrics like total users toward an understanding of sustainable growth.

To operationalize this system, consider automating the data pipeline using a tool like dbt (data build tool) to manage the SQL transformations, or an orchestration framework like Apache Airflow or Prefect. Schedule the cohort aggregation queries to run daily and populate a dashboard in Grafana, Metabase, or Retool. For platforms with high transaction volume, optimize query performance by materializing the user_first_action and cohort_activity tables as incremental models, updating only with new blockchain blocks rather than reprocessing all historical data.
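
In dbt, the incremental pattern mentioned here looks roughly like the model below; the source name and the new-user filter are assumptions, and the filter is kept simple (only insert wallets not yet in the table) so that existing first-action rows are never overwritten.

sql
-- models/user_first_action.sql (dbt incremental model, sketch)
{{ config(materialized='incremental', unique_key='user_address') }}

SELECT
    user_address,
    MIN(block_timestamp) AS first_action_at
FROM {{ source('raw', 'transactions') }}
{% if is_incremental() %}
  -- only add wallets we have not seen before; existing rows stay untouched
  WHERE user_address NOT IN (SELECT user_address FROM {{ this }})
{% endif %}
GROUP BY 1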

The next logical step is to deepen the analysis. Integrate more granular event data, such as trait-based filters for specific NFT collections, gas fee spending as a proxy for engagement intensity, or participation in platform-specific features like staking or voting. You can also build predictive models using this cohort data; for example, use the early activity signals (transactions in first 7 days) of a new cohort to forecast its 30-day retention rate using a simple regression or a more complex model from libraries like scikit-learn. Share these insights with your product and marketing teams to tailor onboarding flows and retention campaigns for at-risk user segments.

Finally, remember that data privacy and user anonymity are paramount in Web3. While analyzing wallet patterns, ensure you comply with relevant regulations. Aggregate data to a cohort level to protect individual privacy, and consider implementing differential privacy techniques if publishing insights. The goal is to build trust through transparency about data usage while leveraging analytics to create a better, more engaging platform for your community.