
Setting Up a Reputation-Driven Content Discovery Engine

A technical guide for developers to build a system that indexes and ranks content based on creator on-chain reputation and user engagement signals.
Chainscore © 2026
TUTORIAL


A guide to building a decentralized content feed that ranks posts based on user reputation scores, moving beyond simple engagement metrics.

Traditional social media algorithms prioritize content based on raw engagement metrics like likes and shares, which often amplifies sensationalism. A reputation-driven discovery engine flips this model by weighting user interactions based on their on-chain reputation. This means a vote from a long-term, active community member carries more influence than one from a new or spammy account. By leveraging decentralized identity and soulbound tokens (SBTs), you can create a feed that surfaces quality content from trusted sources, reducing noise and manipulation.

The core architecture involves three key on-chain components: a reputation registry, a content registry, and a staking/voting mechanism. The reputation registry, often implemented via an ERC-20 or ERC-1155 contract, assigns and manages scores based on user history (e.g., tenure, quality of past submissions, governance participation). The content registry, typically an ERC-721 contract for NFTs or a simpler struct mapping, stores post metadata and a mutable reputation score. The voting contract allows users to stake tokens to upvote or downvote, where the vote's weight is a function of the voter's reputation.
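Off-chain, an indexer typically mirrors these three components as plain records. Here is a minimal TypeScript sketch of that data model; every field name is an illustrative assumption rather than a standard schema:

```typescript
// Hypothetical off-chain mirror of the three on-chain components.
interface ReputationEntry {
  address: string;      // user wallet
  score: bigint;        // from the reputation registry contract
  tenureDays: number;   // e.g., days since first on-chain action
}

interface ContentEntry {
  contentId: bigint;
  creator: string;
  metadataURI: string;  // IPFS/Arweave pointer
  score: bigint;        // mutable reputation-weighted score
}

interface VoteEvent {
  contentId: bigint;
  voter: string;
  weight: bigint;       // a function of the voter's reputation
  isUpvote: boolean;
}

// Applies a vote event to a content entry, clamping the score at zero.
function applyVote(content: ContentEntry, vote: VoteEvent): ContentEntry {
  const delta = vote.isUpvote ? vote.weight : -vote.weight;
  const next = content.score + delta;
  return { ...content, score: next < 0n ? 0n : next };
}
```

An indexer would populate these records from contract events and keep them in a database for the ranking layer to query.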

Here's a simplified Solidity snippet for a basic reputation-weighted vote function. It assumes a pre-existing mapping for user reputation scores and content scores.

solidity
function voteOnContent(uint256 contentId, bool isUpvote) external {
    uint256 voterRep = reputation[msg.sender];
    require(voterRep > 0, "No reputation");
    // sqrt gives diminishing returns on large reputations; assumes an
    // integer sqrt helper (e.g., OpenZeppelin's Math.sqrt) is in scope.
    uint256 voteWeight = sqrt(voterRep);
    if (isUpvote) {
        contentScore[contentId] += voteWeight;
    } else {
        // Clamp at zero: under Solidity >=0.8, a plain subtraction would
        // revert once the score drops below the vote weight.
        uint256 current = contentScore[contentId];
        contentScore[contentId] = current > voteWeight ? current - voteWeight : 0;
    }
    emit Voted(contentId, msg.sender, voteWeight, isUpvote);
}

Using a sublinear function like sqrt for weight calculation dampens, though does not eliminate, the outsized influence of whales with massive reputation.

To query and rank content, your front-end or indexer needs to fetch posts and sort them by their dynamically updated contentScore. Platforms like The Graph are ideal for this, allowing you to create a subgraph that indexes vote events and calculates real-time rankings. The query might order content by score descending and filter by time window (e.g., top posts this week). This decouples the heavy sorting logic from the blockchain, providing a performant feed while maintaining verifiable on-chain data provenance for scores and votes.
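As a sketch, the client can assemble such a subgraph query as a plain string. The entity and field names below (contentEntries, score, createdAt) are assumptions; they depend entirely on how your subgraph schema is defined:

```typescript
// Builds a query for the top-ranked content within a time window.
// Entity and field names are hypothetical; adjust to your subgraph schema.
function buildTopContentQuery(sinceTimestamp: number, limit: number): string {
  return `{
    contentEntries(
      first: ${limit}
      orderBy: score
      orderDirection: desc
      where: { createdAt_gt: ${sinceTimestamp} }
    ) {
      id
      creator
      score
      metadataURI
    }
  }`;
}

// Usage: POST { query: buildTopContentQuery(weekAgo, 10) } as JSON to your
// subgraph's HTTP endpoint to fetch the week's top ten posts.
const weekAgo = Math.floor(Date.now() / 1000) - 7 * 24 * 3600;
const topThisWeek = buildTopContentQuery(weekAgo, 10);
```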

Implementing sybil resistance is critical. Pure on-chain reputation can be gamed by creating multiple addresses. Mitigation strategies include integrating proof-of-personhood protocols like Worldcoin, requiring a minimum token stake (with slashing for malice), or using attestation frameworks like Ethereum Attestation Service (EAS) to link off-chain social credentials. A hybrid approach often works best, where initial reputation is granted via a trusted attestation, then grown organically through verified on-chain actions within the application's ecosystem.
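The hybrid approach can be sketched as follows. The attestation check, base score, per-action increment, and cap below are all placeholder assumptions, not a real EAS or Worldcoin API:

```typescript
// Hybrid bootstrap: a base score granted by a trusted attestation, grown
// organically by verified in-app actions, with growth capped.
interface UserSignals {
  hasPersonhoodAttestation: boolean; // e.g., verified via EAS or Worldcoin
  verifiedActions: number;           // on-chain actions within the app
}

function bootstrapReputation(signals: UserSignals): number {
  if (!signals.hasPersonhoodAttestation) return 0; // sybil addresses start at zero
  const BASE = 10;       // granted by the attestation (arbitrary choice)
  const PER_ACTION = 2;  // organic growth per verified action
  const CAP = 100;       // upper bound on total reputation
  return Math.min(CAP, BASE + signals.verifiedActions * PER_ACTION);
}
```

Because unattested addresses score zero, creating many fresh wallets yields no voting weight, while attested users still have to earn influence through verified activity.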

Finally, consider the user experience. A reputation system must be transparent. Users should easily view their own reputation score, the factors influencing it, and how their vote weight is calculated. This builds trust and encourages genuine participation. By moving the ranking logic on-chain and tying it to verifiable reputation, you create a discovery engine that is not only more resistant to manipulation but also aligns incentives towards long-term, high-quality content creation and curation.

SETUP GUIDE

Prerequisites and Tech Stack

This guide outlines the essential tools, accounts, and foundational knowledge required to build a reputation-driven content discovery engine on the blockchain.

Building a decentralized content discovery engine requires a specific technical foundation. You will need proficiency in TypeScript or JavaScript for the frontend and smart contract interactions. A solid understanding of React (or a similar framework like Next.js) is essential for building the user interface. For backend logic and data indexing, familiarity with Node.js and GraphQL is highly recommended. You should also be comfortable using Git for version control and have a code editor like VS Code installed.

The core of the system will be built on Ethereum Virtual Machine (EVM)-compatible blockchains. You will need a basic understanding of smart contracts, the Solidity programming language, and how to interact with them using libraries like ethers.js or viem. Setting up a MetaMask wallet is a prerequisite for testing transactions. You'll also need testnet ETH (e.g., from a Sepolia faucet) and an API key from a node provider like Alchemy or Infura to connect your application to the blockchain.

For managing user reputation and content curation, you will integrate with specific protocols. This guide uses Lens Protocol for social graph data and The Graph for indexing on-chain events into a queryable API. Ensure you have a Lens API sandbox endpoint and understand how to query a subgraph. Optionally, for advanced reputation scoring, familiarity with Oracle services like Chainlink can be useful for fetching off-chain data verifiably.

Finally, you will need a local development environment. This includes Node.js (v18 or later) and a package manager like npm or yarn. We will use Hardhat or Foundry as a development framework for compiling, testing, and deploying smart contracts. For persistent data related to user profiles and content metadata that isn't stored on-chain, you may use a database like PostgreSQL or a decentralized alternative like Ceramic Network, though initial prototyping can use a local JSON file or in-memory store.

TUTORIAL

Key Concepts: Reputation and Engagement Signals

Learn how to build a content discovery engine that prioritizes quality by leveraging on-chain reputation and user engagement data.

A reputation-driven content discovery engine moves beyond simple popularity metrics like view counts. It uses a multi-dimensional scoring system to surface content based on the author's credibility and the community's genuine engagement. This approach combats spam, manipulation, and low-quality content by weighting signals from trusted sources more heavily. Core signals include the author's on-chain history (e.g., token holdings, governance participation), social graph connections, and the quality of past contributions, creating a foundational reputation score.

Engagement signals measure how users interact with content, distinguishing meaningful actions from passive views. Key metrics include: - Weighted likes/dislikes from high-reputation users - Meaningful comment length and sentiment - Save/Bookmark rates - Secondary shares (when content is reposted by others) - Dwell time on the content page. These signals are processed in real-time, often using a decaying weight algorithm to prioritize recent interactions while maintaining a historical context, ensuring the discovery feed remains dynamic and current.
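One way to combine these metrics is to weight each interaction by the actor's reputation and decay older events, as sketched below. The half-life and per-action weights are illustrative choices, not values prescribed by the guide:

```typescript
// Aggregates engagement events into a single signal: each event is scaled by
// the actor's reputation and exponentially decayed by age.
interface EngagementEvent {
  kind: "like" | "comment" | "save" | "share";
  actorReputation: number; // 0-100
  ageDays: number;
}

const KIND_WEIGHT: Record<EngagementEvent["kind"], number> = {
  like: 1,
  comment: 2, // meaningful comments count more than likes
  save: 3,
  share: 4,   // secondary shares are the strongest signal
};

function engagementSignal(events: EngagementEvent[], halfLifeDays = 7): number {
  return events.reduce((sum, e) => {
    const decay = Math.pow(0.5, e.ageDays / halfLifeDays); // recent events count more
    const repWeight = e.actorReputation / 100;             // trusted actors count more
    return sum + KIND_WEIGHT[e.kind] * repWeight * decay;
  }, 0);
}
```

With a seven-day half-life, a week-old like from a top-reputation user contributes half the weight of a fresh one, keeping the feed dynamic while retaining history.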

Implementing this system requires a backend architecture that aggregates on-chain and off-chain data. For on-chain reputation, you might query a user's ERC-20 token balances, NFT holdings from specific collections (like Proof-of-Membership NFTs), or their voting history in DAOs like Uniswap or Compound. Off-chain, you can integrate with social data providers like Lens Protocol or Farcaster to pull follower graphs and cross-platform engagement. This data is normalized into a unified scoring model, often using a weighted sum or machine learning model to output a final content ranking score.

Here is a simplified conceptual example of a scoring function in pseudocode:

code
function calculateContentScore(contentId, authorAddress) {
  // Fetch Signals
  authorRep = getOnChainReputation(authorAddress); // e.g., 0-100
  engagement = getEngagementMetrics(contentId); // likes, comments, saves
  
  // Apply Weights
  reputationWeight = 0.4;
  engagementWeight = 0.6;
  
  // Calculate (simplified)
  engagementScore = normalize(engagement.likes) * 0.3 +
                    normalize(engagement.meaningfulComments) * 0.4 +
                    normalize(engagement.saves) * 0.3;
  
  finalScore = (authorRep * reputationWeight) + (engagementScore * engagementWeight);
  return finalScore;
}

This model ensures content from a reputable developer with moderate engagement can rank higher than viral content from a new, unverified account.

To operationalize this, you need an indexing service (like The Graph for on-chain data) and a real-time processing pipeline (using tools like Apache Kafka or RabbitMQ) for engagement events. The ranked results are then served via an API to your frontend application. Best practices include transparently logging score calculations for auditability, implementing sybil-resistance mechanisms (like proof-of-personhood from Worldcoin), and allowing user customization of signal weights through a settings panel, balancing algorithmic curation with user control.

The final system creates a positive feedback loop: high-quality content from reputable sources gets amplified, which attracts more serious engagement, further boosting those signals. This leads to a healthier ecosystem where meritocratic discovery replaces pure attention-grabbing. For further reading, explore Token-Curated Registries (TCRs) as a conceptual model and projects like Gitcoin Passport for aggregating decentralized identity credentials.

REPUTATION ENGINE

System Architecture Components

A reputation-driven content discovery engine requires a modular stack for data ingestion, scoring, and curation. These components handle everything from on-chain data collection to final user-facing rankings.

03. Curation & Ranking Module

Uses reputation scores to filter, rank, and surface content. This determines what users see based on collective trust signals.

  • Mechanisms: Can implement quadratic voting (like Gitcoin Grants) to weight votes by reputation, or use scores as a direct ranking multiplier.
  • Output: Generates a personalized or community-wide feed, trending lists, or highlighted contributions.
  • Example: A developer forum where answers from high-reputation users are boosted, reducing spam.
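The two mechanisms above can be sketched in a few lines: quadratic weighting makes vote influence grow with the square root of reputation (or stake), and the aggregate curator weight can act as a direct ranking multiplier. The divisor in the multiplier is an arbitrary scaling assumption:

```typescript
// Quadratic weighting: doubling reputation does not double influence.
function quadraticWeight(reputation: number): number {
  return Math.sqrt(Math.max(0, reputation));
}

// Ranking-multiplier variant: a content item's base score is scaled by the
// combined quadratic weight of its curators. The /100 scale is illustrative.
function rankScore(baseScore: number, curatorReps: number[]): number {
  const totalWeight = curatorReps.reduce((s, r) => s + quadraticWeight(r), 0);
  return baseScore * (1 + totalWeight / 100);
}
```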

04. Incentive & Staking Mechanism

Aligns participant behavior with network goals by rewarding positive contributions and penalizing abuse. Crucial for maintaining score integrity.

  • Staking for Curation: Users may stake tokens to upvote/downvote, with penalties for malicious behavior (e.g., fraud proofs).
  • Reward Distribution: Fees or token emissions distributed to high-reputation actors who perform valuable curation work.
  • Protocols: Inspired by models like Curve's vote-escrow or Olympus DAO's bonding for commitment-based reputation.
DATA LAYER

Step 1: Indexing On-Chain Content and Interaction Events

The foundation of a reputation-driven discovery engine is a robust index of on-chain activity. This step covers how to collect and structure raw blockchain data into a queryable graph of users, content, and interactions.

A discovery engine requires a data layer that transforms raw blockchain logs into a structured social graph. This involves indexing two primary data types: content creation events (e.g., posts, comments, articles minted as NFTs or stored on decentralized storage) and interaction events (e.g., likes, shares, mints, collects, and token transfers). Tools like The Graph with a custom subgraph or a purpose-built indexer using Ethers.js or Viem are essential for listening to these events. The goal is to map relationships: which addresses created which content items, and how other addresses interacted with them.

For example, when a user posts on a platform like Lens Protocol or Farcaster, the action emits an event. Your indexer must capture the event's core parameters: the creator address, a contentURI (often pointing to IPFS or Arweave), a timestamp, and a unique publicationId. Similarly, a 'collect' or 'mirror' action on that publication is an interaction event linking a collector address to the target publicationId. Structuring this data into tables or nodes (for content and users) and edges (for interactions) creates the foundational graph for reputation analysis.

Implementing this requires setting up a listener for your target smart contracts. Using a Node.js script with Viem, you would connect to an RPC provider, specify the contract ABI and address, and filter for specific event logs. The indexed data should be stored in a persistent database like PostgreSQL or a time-series database. It's critical to handle chain reorganizations and ensure data consistency. This process yields a rich dataset where each piece of content is annotated with its full interaction history, ready for the next step: calculating reputation scores.
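The nodes-and-edges structure described above can be sketched as below. The event shapes are hypothetical; a real indexer would decode them from contract logs (e.g., with viem's decodeEventLog) before building the graph:

```typescript
// Hypothetical decoded event shapes for posts and collects.
interface PostEvent { creator: string; publicationId: string; contentURI: string; timestamp: number; }
interface CollectEvent { collector: string; publicationId: string; timestamp: number; }

interface SocialGraph {
  contentByUser: Map<string, string[]>; // creator -> their publicationIds
  interactions: Map<string, string[]>;  // publicationId -> collector addresses
}

function buildGraph(posts: PostEvent[], collects: CollectEvent[]): SocialGraph {
  const contentByUser = new Map<string, string[]>();
  const interactions = new Map<string, string[]>();
  for (const p of posts) {
    const list = contentByUser.get(p.creator) ?? [];
    list.push(p.publicationId);
    contentByUser.set(p.creator, list);
    // Ensure every publication has an (initially empty) interaction edge list.
    interactions.set(p.publicationId, interactions.get(p.publicationId) ?? []);
  }
  for (const c of collects) {
    const list = interactions.get(c.publicationId) ?? [];
    list.push(c.collector);
    interactions.set(c.publicationId, list);
  }
  return { contentByUser, interactions };
}
```

In production these maps would live in PostgreSQL tables rather than memory, but the node/edge structure is the same.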

CORE ENGINE

Step 2: Designing the Reputation Scoring Algorithm

The scoring algorithm is the core logic that transforms raw user activity into a quantifiable reputation score. This step defines the mathematical model and data inputs that power your discovery engine.

A reputation score is a weighted composite of multiple on-chain and off-chain signals. Common inputs include: token holdings (e.g., governance token balance, staked amount), contribution history (e.g., successful proposals, quality content submissions), social engagement (e.g., verified likes, meaningful comments), and network tenure. The first design decision is selecting which signals are relevant for your platform's goals—a DeFi protocol might prioritize governance participation, while a content hub might value posting and curation history.

Each signal must be normalized and weighted. For example, you might convert a user's token balance into a score from 0-100, relative to the total supply or a specific percentile of holders. A simple linear model could be: Reputation Score = (w1 * Token Score) + (w2 * Contribution Score) + (w3 * Social Score). Weights (w1, w2, w3) are critical levers; they determine whether your system values financial stake, active participation, or community sentiment more highly. These weights are often stored in a smart contract for transparency and upgradability.
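The normalization and linear model above look like this in practice. The reference supply and the example weights (which must sum to 1 in this formulation) are arbitrary assumptions:

```typescript
// Normalize a raw token balance to a 0-100 score against a reference supply.
function normalizeBalance(balance: number, referenceSupply: number): number {
  return Math.min(100, (balance / referenceSupply) * 100);
}

// Reputation Score = w1*Token + w2*Contribution + w3*Social, with weights
// expressed as fractions summing to 1. Example weights are illustrative.
function reputationScore(
  tokenScore: number,
  contribScore: number,
  socialScore: number,
  w = { token: 0.3, contrib: 0.5, social: 0.2 },
): number {
  return tokenScore * w.token + contribScore * w.contrib + socialScore * w.social;
}
```

Shifting weight from `token` toward `contrib` is exactly the lever described above: it makes the system value active participation over financial stake.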

To prevent manipulation, incorporate time decay or velocity checks. A pure balance-based score is vulnerable to flash-loan attacks or temporary capital influx. Applying exponential decay to contribution points, such as reducing the value of an upvote by 10% each month, ensures the score reflects sustained, long-term engagement. Similarly, implementing a velocity limit on score increases per day can thwart spam attacks designed to artificially inflate reputation quickly.
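Both safeguards are simple to express. The 10%-per-month decay comes from the example in the text; the daily cap value is an arbitrary assumption:

```typescript
// Exponential decay: a contribution loses 10% of its value each month,
// so only sustained engagement keeps a score high.
function decayedValue(points: number, ageMonths: number): number {
  return points * Math.pow(0.9, ageMonths);
}

// Velocity limit: cap how much a score may increase per day, blunting
// spam campaigns that try to inflate reputation quickly.
function applyVelocityLimit(
  currentScore: number,
  proposedIncrease: number,
  gainedToday: number,
  dailyCap = 50, // illustrative cap
): number {
  const allowed = Math.max(0, dailyCap - gainedToday);
  return currentScore + Math.min(proposedIncrease, allowed);
}
```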

Here is a conceptual Solidity snippet for a basic, upgradeable scoring contract:

solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

contract ReputationScorer {
    address public admin;
    struct Weights { uint tokenWeight; uint contribWeight; uint socialWeight; }
    Weights public currentWeights;

    constructor(Weights memory _initialWeights) {
        admin = msg.sender; // without this, updateWeights could never pass its check
        currentWeights = _initialWeights;
    }

    // Weights are percentage points and should sum to 100, so the division
    // below keeps the result on the same scale as the inputs.
    function calculateScore(
        uint tokenScore,
        uint contribScore,
        uint socialScore
    ) public view returns (uint) {
        return (tokenScore * currentWeights.tokenWeight +
                contribScore * currentWeights.contribWeight +
                socialScore * currentWeights.socialWeight) / 100;
    }

    // Admin-only function to update weights
    function updateWeights(Weights memory _newWeights) external {
        require(msg.sender == admin, "Unauthorized");
        currentWeights = _newWeights;
    }
}

This contract separates the scoring logic from the data, allowing the algorithm to be refined without migrating user history.

Finally, calibrate and iterate using real or simulated data. Deploy the algorithm to a testnet with historical data to analyze score distributions. Are the results intuitive? Do known reputable users rank highly? Use this analysis to adjust weights, add new signals like Sybil-resistance proofs from platforms like Worldcoin or BrightID, or introduce non-linear scaling. The goal is a score that reliably correlates with genuine, valuable contribution to the ecosystem.

IMPLEMENTATION

Step 3: Building the Discovery API and Front-end

This section details the implementation of the backend API that serves reputation-scored content and the frontend interface that consumes it, creating a dynamic discovery feed.

The core of the discovery engine is a GraphQL API that queries the on-chain reputation data indexed in Step 2. We recommend using a framework like Apollo Server or Hasura for this. The API's primary resolver fetches content items (e.g., forum posts, articles, or project proposals) and joins them with the calculated reputation scores from the user_reputation table. A key query might be getTopContent(limit: 10, timeWindow: "7d"), which returns posts ranked by a weighted score combining the author's reputation and the post's own engagement metrics (likes, replies). This decouples the complex scoring logic from the frontend.
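From the client's side, the getTopContent call takes roughly the shape below. The schema details (field names, argument types) are assumptions for illustration, not a fixed API:

```typescript
// Client-side GraphQL document for the hypothetical getTopContent resolver.
const GET_TOP_CONTENT = `
  query GetTopContent($limit: Int!, $timeWindow: String!) {
    getTopContent(limit: $limit, timeWindow: $timeWindow) {
      id
      title
      author { address reputationScore }
      likes
      finalScore
    }
  }
`;

// Variables matching the example call getTopContent(limit: 10, timeWindow: "7d").
const topContentVariables = { limit: 10, timeWindow: "7d" };
```

An Apollo Client `useQuery(GET_TOP_CONTENT, { variables: topContentVariables })` call (or a plain POST of `{ query, variables }`) would then drive the feed component.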

For the scoring algorithm, implement a weighted formula in your API business logic. A simple example could be: finalScore = (authorReputation * 0.6) + (postLikes * 0.3) + (postAgeDecayFactor * 0.1). The author's reputation—derived from their on-chain actions—carries the most weight, ensuring high-quality contributors are amplified. The postAgeDecayFactor applies a logarithmic decay to promote recent content, preventing the feed from becoming stale. This logic should be executed server-side to keep the scoring mechanism consistent and secure.
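The formula above can be implemented directly; only the shape of the decay curve is an assumption here (any monotonically shrinking function of age works):

```typescript
// postAgeDecayFactor: 1.0 for a brand-new post, shrinking logarithmically
// with age so the bonus fades slowly rather than cutting off.
function ageDecayFactor(ageHours: number): number {
  return 1 / (1 + Math.log1p(ageHours));
}

// finalScore = (authorReputation * 0.6) + (postLikes * 0.3) + (decay * 0.1),
// as given in the text. Runs server-side so the weights stay consistent.
function finalScore(authorReputation: number, postLikes: number, ageHours: number): number {
  return authorReputation * 0.6 + postLikes * 0.3 + ageDecayFactor(ageHours) * 0.1;
}
```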

On the frontend, use a framework like Next.js or React to consume the GraphQL API via Apollo Client. The main component is an infinitely-scrolling or paginated feed that renders each content item with its calculated score displayed prominently. Implement real-time updates by subscribing to new blockchain events (via the indexer's WebSocket feed) and refetching the query or using GraphQL subscriptions. This ensures the UI reflects new posts and updated reputation scores without requiring a page refresh, creating a live, reputation-aware social feed.

Critical to the user experience is transparency. Each content item in the feed should have a tooltip or a detail view explaining the score breakdown: e.g., "Score: 85.2 (Author Rep: 92, Likes: 15, Time Bonus: 0.8)". This builds trust in the system. Furthermore, allow users to apply filters such as minimumReputationScore or contentType. The frontend state management (using Zustand or Redux) should handle these filter parameters and pass them as variables to the GraphQL queries, enabling personalized discovery.

Finally, integrate this discovery module into your larger dApp. The API endpoint should be secured and rate-limited. For production, cache frequent queries (like the top 100 posts) using Redis to reduce database load and improve response times. The complete system—from on-chain action to indexed reputation to ranked API response to dynamic UI—creates a closed-loop, reputation-driven content ecosystem that automatically surfaces valuable contributions based on verifiable, on-chain merit.

SIGNAL TYPES

Comparison of On-Chain Reputation Signals

A comparison of different on-chain data sources for building user reputation scores in a content discovery engine.

| Signal / Metric         | Transaction History     | Token Holdings                | Governance Participation   | Soulbound Tokens (SBTs)       |
|-------------------------|-------------------------|-------------------------------|----------------------------|-------------------------------|
| Data Source             | Wallet transaction logs | ERC-20/721/1155 balances      | DAO voting & proposal data | Non-transferable attestations |
| Acquisition Difficulty  | Low                     | Low                           | Medium                     | High                          |
| Sybil Resistance        | Low                     | Medium                        | High                       | Very High                     |
| Cost to Fake            | < $10                   | $50-500                       | $1000+                     | Theoretically infinite        |
| Temporal Decay          | High (stale quickly)    | Medium                        | Low (persistent impact)    | None (permanent)              |
| Context Specificity     | Low (generic)           | Medium                        | High (project-specific)    | Very High (issuer-specific)   |
| Primary Use Case        | Activity & consistency  | Financial stake & affiliation | Expertise & commitment     | Credentials & affiliations    |
| Example Weight in Score | 20-30%                  | 15-25%                        | 25-40%                     | 10-20%                        |

REPUTATION ENGINE

Common Issues and Troubleshooting

Addressing frequent challenges and developer questions when implementing a reputation-driven content discovery system on-chain.

Why don't reputation scores update immediately after a user action?

Reputation score updates are typically not instantaneous. The delay is usually caused by the oracle update cycle or the challenge period in your system's design.

Common causes:

  1. Oracle latency: If you're using an oracle (e.g., Chainlink) to fetch off-chain data for scoring, updates occur at predefined intervals (e.g., every 24 hours).
  2. Dispute windows: Decentralized reputation systems often include a challenge period (e.g., 7 days) where other users can dispute a score change before it's finalized on-chain.
  3. Batching for gas efficiency: To save gas, score updates may be batched and processed in a single transaction at the end of an epoch.

Check: Verify the updateInterval in your oracle configuration and the disputeWindow parameter in your reputation smart contract. Use an event listener to monitor for ReputationUpdated events.

DEVELOPER FAQ

Frequently Asked Questions

Common technical questions and troubleshooting for building a reputation-driven content discovery engine on-chain.

What is a reputation-driven content discovery engine?

A reputation-driven content discovery engine is a decentralized application that uses on-chain reputation scores to rank and surface content. Unlike traditional algorithms controlled by a single entity, it leverages transparent, user-owned reputation data from sources like Ethereum Attestation Service (EAS) or Gitcoin Passport to filter spam and highlight high-quality contributions.

What are the core components of such a system?

Core components include:

  • Reputation Oracle: Fetches and verifies on-chain attestations or soulbound tokens (SBTs).
  • Scoring Engine: Applies logic (e.g., weighted averages, time decay) to calculate a user's reputation score.
  • Indexing & Ranking: Uses the score to sort content in a feed or search results.
  • Incentive Layer: Often includes staking or slashing mechanisms to align user behavior with network goals.
IMPLEMENTATION

Conclusion and Next Steps

You have now built the core components of a reputation-driven content discovery engine. This guide covered the foundational architecture, smart contract logic, and integration patterns.

Your system now uses on-chain reputation scores—derived from sources like POAPs, Gitcoin Passport stamps, or custom ERC-20 token holdings—to weight user votes and content rankings. The ContentRegistry smart contract enforces governance rules, while the off-chain indexer or subgraph aggregates signals to calculate dynamic scores. This creates a sybil-resistant discovery feed where influence is earned, not bought.

To extend this engine, consider implementing more sophisticated algorithms. Instead of simple weighted averages, explore quadratic voting to mitigate whale dominance or time-decay functions to prioritize recent engagement. Integrate with Lens Protocol or Farcaster to bootstrap a social graph, or use The Graph for efficient historical querying of user activity. Always audit upgrade paths in your contracts to manage future reputation formula changes.

For production deployment, security and scalability are critical. Conduct thorough testing with tools like Foundry or Hardhat, and consider using a rollup (Optimism, Arbitrum) or app-specific chain (via Polygon CDK, Arbitrum Orbit) to control gas costs for user interactions. Implement a robust indexing layer that can handle high-throughput events without missing blocks.

The next step is to define your content curation economic model. Will you use a curation tax that rewards successful signalers, similar to Curve's gauge voting? Or perhaps a bonding curve model for submitting new content? These mechanisms align incentives and can be governed by your reputation token holders via a DAO using frameworks like OpenZeppelin Governor.

Finally, measure your engine's success with concrete metrics: user retention rates, the correlation between high-reputation votes and content quality, and the rate of sybil attack detection. Start with a closed beta, gather feedback, and iterate. The goal is a self-sustaining ecosystem where reputation directly translates into better content discovery for everyone.
