How to Build a Rug Pull Risk Indicator for Memecoins

INTRODUCTION

Launching a Rug Pull Risk Indicator Based on Holder Dynamics

This guide explains how to build a data-driven risk indicator for detecting potential rug pulls by analyzing on-chain holder concentration and distribution patterns.

A rug pull is a malicious event where developers abandon a project and drain its liquidity, leaving investors with worthless tokens. While no indicator is foolproof, analyzing holder dynamics provides critical, real-time signals. By examining metrics like the concentration of supply among top wallets, the rate of new holder acquisition, and the behavior of developer-controlled addresses, we can build a quantitative model to assess risk. This approach moves beyond speculation to a systematic analysis of on-chain data.

The core principle is that healthy, decentralized tokens typically exhibit a broad and growing distribution of holders. In contrast, tokens primed for a rug pull often show extreme supply concentration, suspicious transfers to centralized exchanges (CEXs) by insiders, and anomalous transaction patterns designed to create false momentum. We will use the Ethereum blockchain as our primary data source, querying events and balances via providers like Alchemy or Infura, and analyzing ERC-20 tokens such as Uniswap (UNI) or Chainlink (LINK) as comparative benchmarks for healthy distributions.

Our technical stack will involve Python for data processing and analysis, utilizing libraries like web3.py to interact with the blockchain and pandas for data manipulation. The guide will walk through calculating key metrics: the Gini Coefficient for supply inequality, the Herfindahl-Hirschman Index (HHI) for market concentration, and tracking the net flow of tokens to and from known exchange wallets. We'll also implement checks for locked liquidity and renounced ownership status of the token contract.

By the end of this tutorial, you will have a functional script that fetches on-chain data for any ERC-20 token address, computes a composite risk score based on holder dynamics, and outputs a clear assessment. This tool is intended for due diligence by investors, auditors, and researchers operating in the DeFi space. All code will be modular and open for extension to other blockchains like Polygon or Arbitrum.

SETUP

Prerequisites

Before building a rug pull risk indicator, you need the right tools and data sources. This section outlines the essential technical and conceptual foundations.

To analyze holder dynamics, you need reliable, real-time on-chain data. A blockchain node or a dedicated data provider like The Graph or Covalent is essential for querying token holder balances, transaction histories, and transfer events. For Ethereum and EVM-compatible chains, you'll interact with the ERC-20 token standard's Transfer event logs. You can use libraries like ethers.js or web3.py to connect to an RPC endpoint and fetch this data programmatically. Setting up a local archive node provides the most control but requires significant infrastructure; using a hosted service like Alchemy or Infura is a common alternative for developers.

The core of the analysis involves calculating key metrics from raw holder data. You must be proficient in processing large datasets to compute: holder concentration (percentage of supply held by top N addresses), velocity of top holders (how frequently large holders transact), and the distribution of new vs. old holders. This typically requires data analysis skills in Python (with pandas and NumPy) or JavaScript/TypeScript. Understanding statistical concepts like standard deviation for balance changes and time-series analysis for spotting abnormal withdrawal patterns is crucial for translating raw numbers into risk signals.
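
As a rough illustration of using standard deviation to spot abnormal withdrawals, the sketch below flags an outflow that sits far above a holder's recent history. The function name, the z-score approach, and the threshold of 3 are assumptions made for this example, not values prescribed by the guide.

```python
from statistics import mean, stdev

def is_abnormal_outflow(history, latest, z_threshold=3.0):
    """Flag `latest` if it is more than z_threshold standard deviations
    above the mean of the previous outflows in `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate a spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest > mu  # any increase over a flat history stands out
    return (latest - mu) / sigma > z_threshold

# Hypothetical daily outflows (in tokens) for one large holder
daily_outflows = [10, 12, 11, 9, 10]
print(is_abnormal_outflow(daily_outflows, 100))
print(is_abnormal_outflow(daily_outflows, 11))
```

A z-score is only one possible choice; heavy-tailed transfer data may be better served by a median-based measure.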

Finally, you need a framework to define and score risk. This isn't just about coding; it requires an understanding of common rug pull mechanics. You should research historical incidents to identify patterns, such as the slow-drip drain where creators gradually sell holdings, or the liquidity pull where all pooled funds are removed. Your indicator should weight different signals (e.g., a sudden 30% supply sell-off by a top holder is a stronger signal than a gradual 5% distribution). Documenting your risk model's thresholds and logic is key for transparency and future iteration. Start by prototyping with a few known scam and legitimate tokens to calibrate your system.

ON-CHAIN ANALYSIS

Launching a Rug Pull Risk Indicator Based on Holder Dynamics

This guide explains how to build a quantitative risk indicator for token projects by analyzing on-chain holder distribution and transaction patterns.

A rug pull is a type of exit scam where developers abandon a project and drain its liquidity, leaving investors with worthless tokens. While no single metric is foolproof, analyzing holder dynamics provides strong, on-chain signals of centralization and potential malicious intent. Key patterns include an excessively concentrated supply, suspicious distribution events, and a lack of organic trading activity. By monitoring these factors, you can create a data-driven risk score to flag high-risk projects before they collapse.

The primary indicator is supply concentration. Calculate the percentage of the total token supply held by the top 10 or 20 wallets. A high concentration (e.g., >60% for top 10 holders) is a major red flag, as it gives a small group unilateral control over price and liquidity. Fetch holder balances from an indexing service, such as Covalent's token holders endpoint or a custom subgraph on The Graph. For example, a memecoin with 80% of its supply in the creator's wallet and a few 'sock puppet' wallets is structurally vulnerable to a rug pull.

Next, analyze the transaction history of the largest holders. Look for patterns of wash trading (rapid buying and selling between controlled wallets to simulate volume) and a lack of transfers to new, independent addresses. A healthy token should show a growing number of unique holders and selling pressure that is spread across many wallets. You can track this by querying transfer events and calculating metrics like the Herfindahl-Hirschman Index (HHI) and net flow from top wallets to exchanges over time.
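
The net-flow metric mentioned above can be sketched as follows. The transfer dictionaries, the addresses, and the function name are hypothetical; in practice the exchange address list would come from a labeling source you trust.

```python
def net_flow_to_exchanges(transfers, top_holders, exchange_addresses):
    """Net tokens moved from top holders to exchanges (positive = outflow)."""
    top = {a.lower() for a in top_holders}
    cex = {a.lower() for a in exchange_addresses}
    net = 0
    for t in transfers:
        src, dst = t["from"].lower(), t["to"].lower()
        if src in top and dst in cex:
            net += t["value"]   # top holder deposits to an exchange
        elif src in cex and dst in top:
            net -= t["value"]   # top holder withdraws from an exchange
    return net

# Hypothetical decoded Transfer events
transfers = [
    {"from": "0xAAA", "to": "0xCEX", "value": 500},
    {"from": "0xCEX", "to": "0xAAA", "value": 100},
    {"from": "0xBBB", "to": "0xCCC", "value": 900},  # ignored: no exchange involved
]
print(net_flow_to_exchanges(transfers, ["0xAAA", "0xBBB"], ["0xCEX"]))
```

Computing this per day or per hour turns it into a time series that can be compared against the token's normal behavior.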

To build the indicator, combine these metrics into a weighted score. For instance: Supply Concentration (50% weight), Liquidity Lock Status (20% weight), and 24-hour Net Outflow from Top Holders (30% weight). Set threshold values based on historical rug pull data. A project scoring above 0.8 on a 0-1 risk scale (80 on the 0-100 scale used in Step 4) should be considered high-risk. Implement this logic in a script that periodically polls on-chain data and updates the risk score.

Here is a simplified Python pseudocode example using web3.py to check holder concentration:

python
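from web3 import Web3

w3 = Web3(Web3.HTTPProvider('YOUR_RPC_URL'))
contract_address = '0x...'  # replace with the checksummed token contract address
# Minimal ERC-20 ABI: only totalSupply() is needed here
erc20_abi = [{"name": "totalSupply", "type": "function", "stateMutability": "view",
              "inputs": [], "outputs": [{"name": "", "type": "uint256"}]}]

def get_top_holders_balance(address, limit=10):
    """Placeholder: return the summed raw balance of the top `limit` holders from an indexer API."""
    raise NotImplementedError("connect this to your indexer of choice")

token_contract = w3.eth.contract(address=contract_address, abi=erc20_abi)
total_supply = token_contract.functions.totalSupply().call()
top_10_balance = get_top_holders_balance(contract_address, limit=10)
concentration_ratio = top_10_balance / total_supply
print(f"Top 10 holders control: {concentration_ratio:.2%}")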

Continuously monitor and backtest your indicator against known rug pulls and successful projects to refine its accuracy. Remember, this is a risk indicator, not a guarantee. Always combine it with other checks like contract verification, team doxxing, and liquidity lock audits from platforms like Unicrypt or Team Finance. For developers, publishing this analysis as a public dashboard or API can contribute to overall ecosystem safety by increasing transparency.

GUIDE COMPONENTS

Essential Tools and Data Sources

These tools and datasets are required to build a rug pull risk indicator based on holder dynamics. Each entry focuses on extracting, transforming, or validating on-chain holder data that directly signals liquidity exit risk, insider concentration, or coordinated sell behavior.

Etherscan and Chain Explorers

Chain explorers are the ground truth source for token holder and transfer data. For Ethereum-based tokens, Etherscan exposes holder distributions, transfer events, and contract metadata without relying on third-party indexing.

Key uses for a rug pull indicator:

Extract top holder concentration percentages and track changes over time
Monitor large outbound transfers from deployer or top 10 wallets
Detect mint, burn, or ownership renounce events via contract logs
Identify liquidity pool interactions with Uniswap V2/V3 pair contracts

Implementation notes:

Use the Token Holder API and ERC-20 Transfer Events endpoints for automation
Snapshot holder distributions at fixed block intervals to compute deltas
Cross-check deployer address against early recipients to flag insider clusters

Explorers are slower than indexed analytics platforms but remain essential for verifying anomalies before labeling a token as high risk.
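
The deployer cross-check from the implementation notes could look roughly like this. The transfer records and addresses are hypothetical, and only direct deployer-to-holder transfers are considered; multi-hop funding would need graph traversal on top of this.

```python
def deployer_funded_holders(transfers, deployer, top_holders):
    """Return the top holders that received tokens directly from the deployer."""
    dep = deployer.lower()
    funded = {t["to"].lower() for t in transfers if t["from"].lower() == dep}
    return sorted(h.lower() for h in top_holders if h.lower() in funded)

# Hypothetical decoded Transfer events
transfers = [
    {"from": "0xDEP", "to": "0xA", "value": 1_000},
    {"from": "0xX", "to": "0xB", "value": 2_000},
]
print(deployer_funded_holders(transfers, "0xdep", ["0xA", "0xB"]))
```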

Dune Analytics

Dune provides SQL-accessible blockchain data that is ideal for modeling holder dynamics at scale. It allows you to compute risk signals across thousands of tokens using reproducible queries.

Common holder-based risk metrics built on Dune:

Gini coefficient of token holder balances
Percentage of supply held by top N wallets over time
Net flow of tokens from top holders to DEX pools
Wallet overlap between deployer, early holders, and liquidity removers

How to integrate into a rug pull indicator:

Write parameterized SQL queries for ERC-20 contracts
Schedule dashboards to update holder concentration every block or hour
Export results via Dune API for scoring models

Dune is particularly strong for historical backtesting, letting you compare known rug pulls against benign projects to calibrate thresholds.

The Graph Subgraphs

The Graph enables custom indexing of on-chain events, which is critical when you need low-latency or chain-specific holder metrics not covered by public dashboards.

Why subgraphs matter for holder dynamics:

Index Transfer, Mint, and Burn events in real time
Maintain rolling aggregates for active holders, new holders, and exiting holders
Track interactions with specific contracts such as DEX pairs or lockers

Practical setup steps:

Define a subgraph that indexes ERC-20 Transfer events
Store per-address balances and update on each transfer
Expose queries like "top holders last 100 blocks" or "holder churn rate"

Subgraphs are ideal for powering production risk indicators where decisions depend on near-real-time holder behavior rather than delayed analytics.

Nansen Wallet Labels

Nansen enriches raw holder data with wallet labels, making it easier to distinguish between organic holders and potentially malicious actors.

Relevant label categories for rug pull detection:

Token deployers and contract creators
Smart money vs newly created wallets
Known bridge, CEX, or mixer addresses that distort holder metrics
Wallet clusters controlled by a single entity

How to use labels in a risk indicator:

Exclude CEX and bridge wallets from holder concentration calculations
Flag tokens where deployer-linked wallets retain > X% supply
Detect coordinated selling by labeled wallet clusters

While Nansen is not fully open, it is valuable for ground-truth validation and for refining heuristics derived from open data sources.

HOLDER DYNAMICS ANALYSIS

Risk Factor Scoring Matrix

Scoring criteria for evaluating rug pull risk based on on-chain holder behavior and token distribution.

Risk Factor	Low Risk (0-2 pts)	Medium Risk (3-5 pts)	High Risk (6-10 pts)
Top 10 Holder Concentration	< 20% of supply	20-50% of supply	> 50% of supply
Liquidity Pool Ownership	Renounced or timelocked	Multi-sig controlled	EOA controlled by deployer
Holder Growth Rate (7D)	> 15%	5-15%	< 5% or negative
Average Holder Balance	Wide distribution, low avg.	Moderate concentration	Extreme concentration, few whales
Recent Large Transfers (>5% supply)	None in last 48h	1-2 in last 48h	> 2 in last 48h
New Holder Retention (24h)	> 70% retained	40-70% retained	< 40% retained
Buy/Sell Pressure Ratio	Consistent buys > sells	Balanced activity	Sustained sell pressure
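
One way to turn the numeric rows of this matrix into code is a small tier classifier. The sketch below assumes the top-10 concentration bands are 20% and 50% and the 7-day growth bands are 15% and 5%; the function name and signature are illustrative only.

```python
def risk_tier(value, medium_from, high_from, higher_is_riskier=True):
    """Classify a metric into the matrix's low / medium / high column.

    medium_from: boundary where the medium band starts.
    high_from:   boundary beyond which the metric is high risk.
    """
    if not higher_is_riskier:
        # Mirror the axis so the same comparisons work for "lower is worse" metrics
        value, medium_from, high_from = -value, -medium_from, -high_from
    if value > high_from:
        return "high"
    if value >= medium_from:
        return "medium"
    return "low"

print(risk_tier(62.0, 20, 50))                          # top-10 concentration, percent
print(risk_tier(3.0, 15, 5, higher_is_riskier=False))   # 7-day holder growth, percent
```

Mapping a tier to a point value inside its range (for example, where within 6-10 a high-risk token lands) is a separate calibration decision.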

DATA COLLECTION

Step 1: Fetch and Analyze Holder Distribution

The first step in building a rug pull risk indicator is to gather and analyze the fundamental on-chain data: token holder distribution. This data reveals the concentration of supply, which is a primary signal of centralization risk.

Holder distribution analysis examines how a token's total supply is divided among wallet addresses. A healthy, decentralized token typically exhibits a long-tail distribution with no single entity holding a disproportionate share. Conversely, a high concentration of tokens in a few wallets—often the deployer's—is a major red flag. To fetch this data, you can use blockchain indexers and APIs like The Graph, Covalent, or Alchemy. For Ethereum and EVM chains, querying the Transfer event logs of the token contract provides a historical record to reconstruct balances.

When analyzing the data, focus on key metrics: the percentage held by the top 10 and top 100 wallets, and the supply held by the contract deployer. A common heuristic suggests elevated risk if the top 10 holders control more than 50-60% of the circulating supply, especially if those wallets are inactive or recently funded. It's crucial to filter out known centralized exchange (CEX) and liquidity pool addresses (e.g., Uniswap V2/V3 pairs) from this analysis, as their large holdings represent user funds, not developer control. Services like Etherscan's Token Holder Charts offer a quick visual, but for an indicator, you need programmable access.
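
A minimal sketch of the filtering step, assuming you already hold a mapping of address to raw balance and a list of exchange and pool addresses to exclude. All names and values here are hypothetical.

```python
def top_n_concentration(balances, supply, n=10, excluded=()):
    """Share of `supply` held by the n largest addresses not in `excluded`."""
    skip = {a.lower() for a in excluded}
    kept = sorted(
        (bal for addr, bal in balances.items() if addr.lower() not in skip),
        reverse=True,
    )
    return sum(kept[:n]) / supply

# Hypothetical holder table: the pair contract is the largest "holder"
balances = {"0xPAIR": 400, "0xDEV": 300, "0xA": 200, "0xB": 100}
print(top_n_concentration(balances, 1000, n=2, excluded=["0xpair"]))
```

Whether `supply` should be total or circulating supply is a modeling choice; the heuristic above refers to circulating supply.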

Here is a conceptual code snippet using the Etherscan API to fetch top holders. Note that you will need an API key and should handle pagination for complete results.

javascript
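// Example: fetch token holders from the Etherscan API.
// Note: tokenholderlist may require a paid API plan.
const apiKey = 'YOUR_API_KEY';
const tokenAddress = '0x...'; // The token contract address
const base = 'https://api.etherscan.io/api';
const holdersUrl = `${base}?module=token&action=tokenholderlist&contractaddress=${tokenAddress}&page=1&offset=100&apikey=${apiKey}`;
const supplyUrl = `${base}?module=stats&action=tokensupply&contractaddress=${tokenAddress}&apikey=${apiKey}`;

async function top10Concentration() {
  const [holdersRes, supplyRes] = await Promise.all([fetch(holdersUrl), fetch(supplyUrl)]);
  const holders = (await holdersRes.json()).result;
  // Use the real total supply; summing one page of holders would understate it.
  const totalSupply = BigInt((await supplyRes.json()).result);

  // Balances are raw integer strings; BigInt avoids float precision loss.
  const balances = holders
    .map(h => BigInt(h.TokenHolderQuantity))
    .sort((a, b) => (a < b ? 1 : a > b ? -1 : 0)); // descending

  // Calculate concentration metrics
  const top10Balance = balances.slice(0, 10).reduce((sum, b) => sum + b, 0n);
  // Scale by 10000 before dividing to keep two decimal places.
  const top10Percentage = Number((top10Balance * 10000n) / totalSupply) / 100;
  console.log(`Top 10 holders control: ${top10Percentage.toFixed(2)}%`);
}

top10Concentration().catch(console.error);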

Beyond raw percentages, analyze the nature of the top holders. Are they externally owned accounts (EOAs) or contracts? Contracts could be vesting schedules or staking pools, which are less risky than EOAs. Check the transaction history of large holder addresses: recent, large inflows from the deployer are suspicious. Furthermore, monitor for supply shifts over time. A sudden migration of tokens from the deployer to many new, small wallets (a "holder dilution" tactic) can artificially improve distribution metrics before a rug pull. Your analysis should therefore be time-series based, not a single snapshot.

Integrate this holder data with other on-chain contexts. For instance, cross-reference top holder addresses with the token's liquidity pool. If the deployer is also the sole provider of initial liquidity and holds a large supply, the risk multiplies. The output of this step is a set of quantifiable signals: Gini coefficient for inequality, Herfindahl-Hirschman Index (HHI) for concentration, and the deployer's retained supply percentage. These form the foundational layer for your composite rug pull risk score.
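
The two inequality measures can be computed from a plain list of balances. This sketch uses the standard sorted-rank form of the Gini coefficient and the sum-of-squared-shares form of the HHI (on a 0-1 scale); ignoring zero balances in the Gini is an assumption made here, since empty addresses are not holders.

```python
def gini(balances):
    """Gini coefficient of positive balances: 0 = equal, near 1 = concentrated."""
    xs = sorted(b for b in balances if b > 0)
    n, total = len(xs), sum(xs)
    if n == 0:
        return 0.0
    # Rank-weighted sum over balances sorted in ascending order
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def hhi(balances):
    """Herfindahl-Hirschman Index: sum of squared supply shares (1/n to 1)."""
    total = sum(balances)
    if total == 0:
        return 0.0
    return sum((b / total) ** 2 for b in balances)

equal = [25, 25, 25, 25]
skewed = [1, 1, 1, 97]
print(gini(equal), hhi(equal))
print(gini(skewed), hhi(skewed))
```

Note that with few holders the Gini coefficient cannot reach 1 (its maximum is (n-1)/n), so very young tokens may need the HHI or a top-N ratio alongside it.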

LIQUIDITY ANALYSIS

Step 2: Monitor Liquidity Pool Locks and Changes

Liquidity pool manipulation is a core rug pull vector. This step explains how to track LP locks, migrations, and withdrawals to build a critical risk indicator.

A token's liquidity pool (LP) is its primary trading venue on a DEX like Uniswap V3 or PancakeSwap V3. The LP lock refers to the period during which the liquidity provider tokens (e.g., Uniswap V3 LP NFTs) are held in a time-locked smart contract, preventing the developers from withdrawing the paired assets (like ETH or USDC). A genuine project will typically lock a significant portion—often 100%—of its initial liquidity for a substantial period (e.g., 1+ years) using a reputable locker like Unicrypt or Team Finance. The absence of a lock, a very short lock duration, or the use of an obscure, unaudited locker are immediate red flags.

Beyond the initial lock, you must monitor for LP changes. Use blockchain explorers (Etherscan, BscScan) or APIs to watch the contracts that manage the position. For Uniswap V3-style pools, the position manager contract emits IncreaseLiquidity, DecreaseLiquidity, and Collect, while the pool contract itself emits Mint, Burn, and Collect. A sudden, large DecreaseLiquidity or Burn event that significantly reduces the pool's reserves is a direct signal of a liquidity pull. Developers might also perform a "soft rug" by gradually draining liquidity over time to avoid detection. Your indicator should calculate the percentage change in locked liquidity value and flag any withdrawals exceeding a defined threshold (e.g., >10% of total locked value within 24 hours).
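
The threshold check described above might be sketched like this, assuming you already collect periodic snapshots of the pool's locked value. Measuring the drop from the window's peak to the latest value is one possible design, chosen here so that a gradual drain across several snapshots is still caught.

```python
def flag_liquidity_drain(snapshots, window_seconds=86_400, threshold=0.10):
    """snapshots: list of (unix_timestamp, locked_value), sorted by time.

    Returns True if the latest value is more than `threshold` below the
    peak value seen within the trailing window.
    """
    if not snapshots:
        return False
    latest_ts, latest_value = snapshots[-1]
    window = [v for ts, v in snapshots if latest_ts - ts <= window_seconds]
    peak = max(window)
    if peak <= 0:
        return False
    return (peak - latest_value) / peak > threshold

# Hypothetical hourly snapshots of locked value in USD
print(flag_liquidity_drain([(0, 1000), (3600, 1000), (7200, 850)]))
print(flag_liquidity_drain([(0, 1000), (7200, 950)]))
```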

Another sophisticated tactic is the LP migration rug. Developers create a new LP pair, migrate liquidity (and volume) to it, and then drain the new pool, leaving holders with tokens tied to an empty, old pool. To detect this, monitor for the creation of new LP pairs for the same token and track where the majority of trading volume and liquidity moves. A script should compare the TVL and 24h volume across all existing pairs for the token. A sharp drop in the original LP's TVL coinciding with a new pair's rise, without a clear, community-approved reason, indicates high risk.
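
A simple version of the migration check compares TVL per pair between two points in time. The 50% drop threshold and the dictionary inputs are assumptions for illustration; a fuller version would also compare 24h volume as described above.

```python
def detect_migration(pairs_before, pairs_after, original_pair, drop_threshold=0.5):
    """pairs_*: dict of pair address -> TVL at two points in time.

    Flags the case where the original pair lost more than `drop_threshold`
    of its TVL while a newly created pair now holds more TVL than it does.
    """
    before = pairs_before.get(original_pair, 0)
    after = pairs_after.get(original_pair, 0)
    if before <= 0:
        return False
    dropped = (before - after) / before > drop_threshold
    new_pairs = {p: tvl for p, tvl in pairs_after.items() if p not in pairs_before}
    overtaken = any(tvl > after for tvl in new_pairs.values())
    return dropped and overtaken

# Hypothetical TVL snapshots taken a day apart
print(detect_migration({"0xOLD": 1000}, {"0xOLD": 100, "0xNEW": 900}, "0xOLD"))
print(detect_migration({"0xOLD": 1000}, {"0xOLD": 950, "0xNEW": 50}, "0xOLD"))
```

A positive result is a prompt for manual review, since legitimate migrations (for example, to a new fee tier) produce the same pattern.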

For on-chain monitoring, you can use the Chainscore API or directly query contracts. Here's a simplified Python example using Web3.py to check if an LP position (NFT) is locked and query its liquidity:

python
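from datetime import datetime, timezone
from web3 import Web3

w3 = Web3(Web3.HTTPProvider('YOUR_RPC_URL'))

# Minimal ABI for the ERC-721 ownerOf() call on the positions NFT
UNI_V3_NFT_ABI = [{"name": "ownerOf", "type": "function", "stateMutability": "view",
                   "inputs": [{"name": "tokenId", "type": "uint256"}],
                   "outputs": [{"name": "", "type": "address"}]}]
LOCKER_ABI = []  # Fill in with the ABI of the locker you are checking

# LP NFT Contract (Uniswap V3 Positions NFT)
nft_contract = w3.eth.contract(address='0xC364...', abi=UNI_V3_NFT_ABI)
# Lock Contract (e.g., Unicrypt)
lock_contract = w3.eth.contract(address='0x663A...', abi=LOCKER_ABI)

token_id = 12345  # The LP NFT ID

# 1. Check owner of LP NFT
owner = nft_contract.functions.ownerOf(token_id).call()
print(f"NFT Owner: {owner}")

# 2. Check if owner is a known lock contract
if owner.lower() == lock_contract.address.lower():
    # 3. Get lock details from locker. getLock() and the field index are
    #    placeholders: each locker exposes its own function names and layout.
    lock_data = lock_contract.functions.getLock(owner, token_id).call()
    unlock_time = lock_data[2]  # Example index for timestamp
    unlock_date = datetime.fromtimestamp(unlock_time, tz=timezone.utc)
    print(f"LP Locked until: {unlock_date:%Y-%m-%d %H:%M} UTC")
else:
    print("WARNING: LP NFT not in a known lock contract.")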

Integrate these checks into a real-time dashboard. Your rug pull risk indicator should weight the LP status heavily. Assign high risk scores for: no verified lock, lock duration < 6 months, lock contract with zero TVL or no audit, recent large liquidity withdrawals (>20%), or evidence of a suspicious migration. By programmatically tracking these on-chain actions, you transform opaque developer behavior into a quantifiable, early-warning signal for token holders.
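
The high-risk conditions listed above can be collected into a single function that returns the flags that apply. The parameter names are hypothetical, six months is approximated as 182 days, and the migration check is left out because it needs pair-level data.

```python
from datetime import datetime, timedelta, timezone

def lp_risk_flags(lock_verified, unlock_time, locker_audited,
                  withdrawn_fraction, now=None):
    """Return the red flags that apply to a liquidity position.

    unlock_time: timezone-aware datetime, or None when there is no lock.
    withdrawn_fraction: share of liquidity withdrawn recently (0-1).
    """
    now = now or datetime.now(timezone.utc)
    flags = []
    if not lock_verified:
        flags.append("no verified lock")
    elif unlock_time - now < timedelta(days=182):  # roughly six months
        flags.append("lock shorter than 6 months")
    if lock_verified and not locker_audited:
        flags.append("unaudited locker")
    if withdrawn_fraction > 0.20:
        flags.append("large recent withdrawal")
    return flags

now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(lp_risk_flags(True, now + timedelta(days=30), True, 0.0, now=now))
print(lp_risk_flags(False, None, False, 0.25, now=now))
```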

ANALYZING ON-CHAIN SIGNALS

Step 3: Parse Contract Events for Anomalies

This step involves programmatically scanning a smart contract's event logs to detect suspicious patterns in token holder dynamics, a primary indicator of potential rug pulls.

To build a rug pull risk indicator, you must first extract and analyze the Transfer event logs from the token's Ethereum contract. These logs provide a complete, immutable history of all token movements between addresses. Using a node provider like Alchemy or Infura, you can query for all Transfer events emitted by the contract since its creation. The critical data points are the from address, to address, and the value (amount transferred). This raw data forms the foundation for calculating holder concentration and identifying anomalous distribution patterns.

Once you have the event data, the next task is to reconstruct the holder balance sheet for the token at any given block. This is done by processing the Transfer events sequentially: subtracting the value from the from address's balance and adding it to the to address's balance. A common red flag is a sudden, massive transfer from the deployer or a large initial holder to a newly created, empty address. This can signal the creation of a dumping wallet in preparation for a liquidity pull. Tracking the flow of tokens to and from the liquidity pool (LP) contract is also essential, as abnormal withdrawals from the LP are a direct precursor to a rug pull.
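
The balance reconstruction can be sketched on already-decoded events as below. It assumes a standard ERC-20 token where mints and burns appear as transfers from or to the zero address; rebasing tokens change balances without emitting events and would not be captured this way.

```python
from collections import defaultdict

ZERO_ADDRESS = "0x0000000000000000000000000000000000000000"

def rebuild_balances(transfers):
    """Replay Transfer events (ordered by block and log index) into balances."""
    balances = defaultdict(int)
    for t in transfers:
        if t["from"] != ZERO_ADDRESS:   # a mint has no sender to debit
            balances[t["from"]] -= t["value"]
        if t["to"] != ZERO_ADDRESS:     # a burn has no recipient to credit
            balances[t["to"]] += t["value"]
    # Keep only addresses that still hold tokens
    return {addr: bal for addr, bal in balances.items() if bal > 0}

# Hypothetical event history: mint to deployer, then two transfers
events = [
    {"from": ZERO_ADDRESS, "to": "0xDEP", "value": 1_000},
    {"from": "0xDEP", "to": "0xA", "value": 400},
    {"from": "0xA", "to": "0xB", "value": 400},
]
print(rebuild_balances(events))
```

Replaying only the events up to a chosen block gives the holder table at that block, which is what makes time-series comparisons possible.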

For effective analysis, you need to calculate key metrics from the parsed events. The most telling is the Gini Coefficient, a statistical measure of inequality applied to token holdings. A very high Gini Coefficient (e.g., >0.95) indicates extreme concentration, where a handful of addresses control the vast majority of the supply. You should also monitor the rate of change in the top 10 holders' cumulative balance. A sharp increase in their share, especially if it follows a period of broad distribution, often precedes a dump. Implementing these calculations requires aggregating balances from your reconstructed ledger and applying the formulas programmatically.
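
To monitor the rate of change in the top holders' cumulative balance, two balance snapshots can be compared directly. This sketch measures share relative to the sum of holder balances in each snapshot; the function names are illustrative.

```python
def top_n_share(balances, n=10):
    """Fraction of all held tokens owned by the n largest holders."""
    total = sum(balances.values())
    if total == 0:
        return 0.0
    return sum(sorted(balances.values(), reverse=True)[:n]) / total

def share_change(snapshot_old, snapshot_new, n=10):
    """Change in top-n share between two snapshots (positive = concentrating)."""
    return top_n_share(snapshot_new, n) - top_n_share(snapshot_old, n)

# Hypothetical snapshots taken some blocks apart
old = {"0xA": 50, "0xB": 50}
new = {"0xA": 90, "0xB": 10}
print(share_change(old, new, n=1))
```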

Here is a simplified Python example using the Web3.py library to fetch and filter Transfer events, focusing on large outflows from the top holder:

python
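from web3 import Web3

w3 = Web3(Web3.HTTPProvider('YOUR_INFURA_URL'))
contract_address = '0x...'  # checksummed token contract address
# Minimal ABI: only the ERC-20 Transfer event is needed to decode logs
contract_abi = [{
    "anonymous": False, "name": "Transfer", "type": "event",
    "inputs": [
        {"indexed": True, "name": "from", "type": "address"},
        {"indexed": True, "name": "to", "type": "address"},
        {"indexed": False, "name": "value", "type": "uint256"},
    ],
}]
contract = w3.eth.contract(address=contract_address, abi=contract_abi)
transfer_topic = Web3.to_hex(Web3.keccak(text='Transfer(address,address,uint256)'))

top_holder = '0xDEPLOYER_ADDRESS'
decimals = 18        # read from the token's decimals() in practice
start_block = 0      # ideally the contract's deployment block
chunk = 5_000        # hosted providers limit the block range per query

large_transfers = []
latest = w3.eth.block_number
for lo in range(start_block, latest + 1, chunk):
    logs = w3.eth.get_logs({'address': contract_address, 'topics': [transfer_topic],
                            'fromBlock': lo, 'toBlock': min(lo + chunk - 1, latest)})
    for log in logs:
        event = contract.events.Transfer().process_log(log)  # web3.py v6+
        if event['args']['from'].lower() == top_holder.lower():
            value = event['args']['value'] / (10 ** decimals)  # Adjust for decimals
            if value > 1_000_000:  # Flag transfers > 1M tokens
                large_transfers.append({
                    'to': event['args']['to'],
                    'value': value,
                    'txn': event['transactionHash'].hex()
                })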

This code snippet isolates large, suspicious transfers originating from the primary holder for further investigation.

Finally, correlate event-based anomalies with other on-chain data. A large transfer to a new address is concerning, but it becomes a high-confidence signal if that address then immediately approves a DEX router to spend the token and starts selling, or if liquidity is removed shortly afterwards. By parsing Approval events where the spender is the DEX router (like Uniswap's) and Sync events from the LP pair itself (emitted by Uniswap V2-style pairs whenever reserves change), you can build a multi-signal detection system. The goal is to automate this pipeline to scan new tokens and generate a risk score based on the velocity, size, and context of holder dynamics revealed in the contract's event log history.

IMPLEMENTATION

Step 4: Calculate and Output a Composite Risk Score

This step aggregates individual risk signals into a single, interpretable score for a token contract, providing a clear indicator of potential rug pull risk based on holder dynamics.

The final step in building a rug pull risk indicator is to synthesize the individual metrics—holder concentration, top holder activity, and liquidity lock status—into a single composite risk score. This score, typically ranging from 0 (low risk) to 100 (high risk), provides a holistic and actionable assessment. The calculation is not a simple average; it's a weighted aggregation where each component is assigned a weight based on its empirical correlation with historical rug pulls. For instance, an extremely concentrated token supply might be weighted more heavily than a temporary dip in liquidity.

A practical implementation involves defining a scoring function. First, normalize each raw metric to a 0-100 scale. For example, a holder_gini of 0.95 (highly concentrated) might map to a sub-score of 95, while a holder_gini of 0.3 maps to 10. The top holder sell-off rate could be scored based on the percentage of supply sold by the top 10 holders in the last 24 hours. The liquidity lock score is binary or tiered: a verified, long-term lock scores 0, an unverified lock scores 50, and no lock scores 100. The Python pseudocode structure would look like this:
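
The normalization step could be sketched as follows. The Gini mapping interpolates linearly through the two example points given above (0.3 to 10 and 0.95 to 95) and clamps to 0-100; that interpolation, the 30% sell-off cap, and the status labels are assumptions for illustration, not calibrated values.

```python
def clamp(x, lo=0.0, hi=100.0):
    return max(lo, min(hi, x))

def gini_to_score(g):
    """Linear map through (0.30 -> 10) and (0.95 -> 95), clamped to 0-100."""
    slope = (95 - 10) / (0.95 - 0.30)
    return clamp(10 + (g - 0.30) * slope)

def selloff_to_score(fraction_sold, full_score_at=0.30):
    """Scale the share of supply sold by top holders in 24h to 0-100.

    full_score_at is a hypothetical cap: selling 30% or more scores 100.
    """
    return clamp(100 * fraction_sold / full_score_at)

LOCK_SCORES = {"verified_long_term": 0, "unverified": 50, "none": 100}

def lock_to_score(status):
    """Tiered lock score using the three tiers described in the text."""
    return LOCK_SCORES[status]

print(gini_to_score(0.95), selloff_to_score(0.15), lock_to_score("unverified"))
```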

python
def calculate_composite_score(gini_score, selloff_score, lock_score):
    weights = {'concentration': 0.5, 'selloff': 0.3, 'lock': 0.2}
    composite = (gini_score * weights['concentration'] +
                 selloff_score * weights['selloff'] +
                 lock_score * weights['lock'])
    return round(composite)

Outputting the score effectively is crucial for usability. The system should return a JSON object containing both the final score and the decomposed sub-scores for transparency. This allows users to understand why a token received a particular rating. For example: {"risk_score": 80, "components": {"holder_concentration": 85, "top_holder_selloff": 90, "liquidity_lock": 50}}. This output can be integrated into dashboards, alert systems, or API endpoints. Tools like Dune Analytics or Flipside Crypto use similar structured outputs for their analytics. By calculating and exposing a composite score, you transform raw on-chain data into a decisive risk indicator that can be used to screen tokens before interaction.
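
A short sketch of assembling that JSON output, reusing the 0.5 / 0.3 / 0.2 weights from the scoring function. The `build_report` name and the sample sub-scores are hypothetical.

```python
import json

WEIGHTS = {
    "holder_concentration": 0.5,
    "top_holder_selloff": 0.3,
    "liquidity_lock": 0.2,
}

def build_report(components):
    """Combine 0-100 sub-scores into a report with the weighted total."""
    score = round(sum(components[name] * w for name, w in WEIGHTS.items()))
    return {"risk_score": score, "components": components}

report = build_report({
    "holder_concentration": 80,
    "top_holder_selloff": 60,
    "liquidity_lock": 100,
})
print(json.dumps(report, indent=2))
```

Keeping the weights in one dictionary means the component names in the output always match the names used in the calculation.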

DEVELOPER FAQ

Frequently Asked Questions

Common technical questions about implementing and interpreting the Rug Pull Risk Indicator based on holder dynamics.

The indicator analyzes the distribution and behavior of a token's top holders to assess centralization risk. It's based on the principle that excessive control by a few wallets (often developer-controlled) is a primary precursor to rug pulls. The model tracks metrics like:

Concentration Ratios: The percentage of total supply held by the top 10 or 20 wallets.
Holder Velocity: The rate at which new holders are added versus old ones leaving.
Whale Transaction Patterns: Unusual large transfers from top holders to decentralized exchanges (DEXs) just before liquidity removal.

By establishing a baseline for healthy, decentralized projects, deviations from this pattern trigger risk alerts. For example, a sudden drop in the number of holders while the top wallet's balance remains static can indicate a bot-driven honeypot.
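
Holder velocity can be approximated by comparing the sets of holder addresses at two snapshots. This is a minimal sketch with hypothetical inputs; it counts addresses only and does not weight them by balance.

```python
def holder_churn(prev_holders, curr_holders):
    """Count holders gained and lost between two snapshots."""
    prev, curr = set(prev_holders), set(curr_holders)
    return {
        "new": len(curr - prev),
        "exited": len(prev - curr),
        "net": len(curr) - len(prev),
    }

# Hypothetical holder address sets one day apart
yesterday = ["0xA", "0xB", "0xC"]
today = ["0xA", "0xD"]
print(holder_churn(yesterday, today))
```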

IMPLEMENTATION GUIDE

Conclusion and Next Steps

This guide has outlined the core methodology for building a rug pull risk indicator based on holder dynamics. The next step is to operationalize these concepts into a production-ready system.

To build a functional risk indicator, you must first establish a reliable data pipeline. This involves querying on-chain data for token contracts using a node provider like Alchemy or Infura, and indexing services like The Graph or Covalent for historical holder snapshots. The core metrics—concentration ratios, whale velocity, and new holder churn—should be calculated at regular intervals (e.g., hourly or daily) and stored in a time-series database. This creates the foundational dataset for your risk model.

The calculated metrics need to be synthesized into a single, interpretable risk score. A common approach is to assign weighted values to each signal. For instance, you might assign 40% weight to the top 10 holder concentration, 30% to whale sell pressure velocity, 20% to new holder retention rate, and 10% to liquidity pool ownership. The final score can be normalized to a 0-100 scale or a tiered system (e.g., Low, Medium, High, Critical). This model should be backtested against historical rug pulls to calibrate the weights for accuracy.
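
The 40/30/20/10 weighting and a tiered label could be combined as in the sketch below. The tier cut-offs of 25, 50, and 75 are hypothetical and would come out of backtesting in practice.

```python
WEIGHTS = {
    "top10_concentration": 0.40,
    "whale_sell_velocity": 0.30,
    "new_holder_retention": 0.20,
    "lp_ownership": 0.10,
}
# Hypothetical cut-offs: (minimum score, label), checked from highest to lowest
TIERS = [(75, "Critical"), (50, "High"), (25, "Medium"), (0, "Low")]

def score_and_tier(subscores):
    """Weighted 0-100 score plus a tier label; subscores are 0-100 risk values."""
    score = sum(subscores[name] * w for name, w in WEIGHTS.items())
    label = next(name for floor, name in TIERS if score >= floor)
    return round(score, 1), label

print(score_and_tier({
    "top10_concentration": 90,
    "whale_sell_velocity": 40,
    "new_holder_retention": 60,
    "lp_ownership": 100,
}))
```

Each sub-score here is expressed as risk, so a metric like retention has to be inverted (low retention becomes a high sub-score) before it is passed in.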

For developers, integrating this score into a user-facing application is the final step. You could build a browser extension that displays the risk score next to token addresses on Etherscan, create a public API for other developers, or embed the analysis into a portfolio dashboard. The code logic involves fetching the pre-computed risk score from your backend and presenting it with clear visual cues. Transparency is key: always show the underlying metrics that contributed to the score to build user trust.

Continuous iteration is crucial for maintaining the system's effectiveness. Monitor the performance of your risk indicator by tracking its false positive and false negative rates. As attackers develop new methods, you may need to incorporate additional on-chain signals, such as changes to token contract permissions or suspicious minting events. Engaging with the community on platforms like GitHub and Twitter can provide valuable feedback and help identify emerging threat patterns.
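
Tracking false positive and false negative rates needs only the indicator's past flags and the known outcomes. A minimal sketch, assuming both are available as parallel lists of booleans:

```python
def error_rates(predictions, labels):
    """predictions: True where the indicator flagged a token as high risk.
    labels: True where the token actually turned out to be a rug pull."""
    pairs = list(zip(predictions, labels))
    false_pos = sum(1 for p, y in pairs if p and not y)
    false_neg = sum(1 for p, y in pairs if y and not p)
    negatives = sum(1 for _, y in pairs if not y)
    positives = sum(1 for _, y in pairs if y)
    return {
        "false_positive_rate": false_pos / negatives if negatives else 0.0,
        "false_negative_rate": false_neg / positives if positives else 0.0,
    }

# Hypothetical backtest over four tokens
flags = [True, True, False, False]
outcomes = [True, False, True, False]
print(error_rates(flags, outcomes))
```

Which of the two rates matters more depends on the use case: a screening tool for investors usually tolerates false positives better than missed rug pulls.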

The ultimate goal is to contribute to a safer DeFi ecosystem. By open-sourcing your methodology or publishing your findings, you help educate other developers and researchers. Consider publishing your risk scores for a curated list of tokens on a public website or contributing to community-driven safety platforms like DeFiSafety. This transforms your indicator from a personal tool into a public good that benefits all participants.