How to Build a Validator Reputation Scoring System

Setting Up a Validator Reputation Scoring System

A guide to implementing a data-driven reputation framework for blockchain validators, focusing on security, performance, and decentralization metrics.

A validator reputation scoring system quantifies the reliability and performance of nodes in a Proof-of-Stake (PoS) or Proof-of-Authority (PoA) network. Unlike simple uptime checks, a robust system evaluates multiple on-chain and off-chain signals to create a composite score. This score is crucial for delegators selecting validators, for governance-weighted voting, and for automated slashing or reward distribution mechanisms. Core metrics typically include attestation performance, proposal success rate, commission changes, and geographic distribution.

The foundation of any scoring system is data collection. You need to ingest real-time data from blockchain nodes, often via RPC endpoints or indexing services like The Graph or Covalent. For Ethereum validators, you would query the Beacon Chain API for fields such as effective_balance, status, and slashed. Metrics such as attestation effectiveness and missed proposals are not reported directly by the standard API; derive them from attestation and block data, or obtain them from an analytics provider. It's critical to normalize this data across different time windows: short-term (last 100 epochs) for reactivity and long-term (since activation) for stability. Storing this in a time-series database allows for trend analysis and anomaly detection.

Once data is collected, you apply scoring algorithms. A common approach is a weighted additive model. For example: Total Score = (Uptime * 0.4) + (Attestation Efficiency * 0.3) + (Proposal Score * 0.2) + (Decentralization Bonus * 0.1). Each component is itself a normalized value between 0 and 1. The Decentralization Bonus might penalize validators in over-represented cloud providers or geographic regions, promoting network resilience. Advanced systems may use machine learning models to detect subtle patterns of malicious behavior or predict future reliability.
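
As a minimal sketch of the weighted additive model described above, the following Python function combines four normalized components using the example weights from the formula. The dictionary keys and the function name `total_score` are illustrative assumptions, not part of any library.

```python
from typing import Mapping

# Illustrative weights taken from the example formula; tune them for your network.
WEIGHTS = {
    "uptime": 0.4,
    "attestation_efficiency": 0.3,
    "proposal_score": 0.2,
    "decentralization_bonus": 0.1,
}


def total_score(components: Mapping[str, float],
                weights: Mapping[str, float] = WEIGHTS) -> float:
    """Weighted additive score; every component must already be in [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    score = 0.0
    for name, weight in weights.items():
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        score += weight * value
    return score


example = {
    "uptime": 0.999,
    "attestation_efficiency": 0.98,
    "proposal_score": 1.0,
    "decentralization_bonus": 0.5,
}
print(round(total_score(example), 4))  # 0.9436
```

Rejecting weights that do not sum to 1 keeps the result on the same 0-1 scale as its inputs, which makes scores comparable when the weights are later retuned.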

Implementing this requires careful smart contract or off-chain oracle design. For on-chain reputation, you could deploy a registry contract that stores scores and is updated by a trusted oracle or a decentralized oracle network like Chainlink. The contract might have a function like function updateReputation(bytes32 validatorId, uint256 score) external onlyOracle, where validatorId would be a hash of the validator's public key, because a 48-byte BLS public key does not fit in a bytes32. Off-chain, you can use a service like Chainscore's APIs to fetch pre-computed reputation scores for major networks, which simplifies integration and leverages their aggregated data and analysis.

Prerequisites

Before building a validator reputation system, you need the right tools, data sources, and a clear understanding of the scoring framework.

To develop a validator reputation scoring system, you must first establish the foundational infrastructure. This requires a reliable connection to blockchain data sources. For Ethereum, you'll need access to a consensus-layer (beacon) node that retains historical state, or a hosted provider exposing the same Beacon Chain API, to fetch historical validator performance data, including attestations, proposals, and sync committee participation. For Cosmos-based chains, you'll query the Tendermint (CometBFT) RPC endpoints for prevote and precommit data. Setting up a database such as PostgreSQL or TimescaleDB is essential for storing and efficiently querying this time-series performance data.

Next, define the core metrics that constitute reputation. A robust system evaluates multiple dimensions: Uptime and Liveness (measuring attestation effectiveness and proposal success rate), Technical Proficiency (analyzing block construction efficiency and MEV-related behavior), and Economic Security (tracking the validator's effective balance and slashing history). Each metric must be quantifiable. For example, attestation effectiveness can be calculated as the percentage of attestations that were included on time and voted for the correct source checkpoint, target checkpoint, and head block, measured over a rolling window such as the last 100 epochs.
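
The attestation effectiveness calculation can be sketched as follows. This is a simplified illustration: the `AttestationRecord` fields are hypothetical names for data you would derive yourself, and a single `max_delay` threshold stands in for the protocol's separate timeliness rules for source, target, and head votes.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AttestationRecord:
    """One attestation duty; field names are illustrative, not a client API."""
    epoch: int
    included: bool        # the attestation made it on chain
    inclusion_delay: int  # slots between the duty slot and inclusion
    correct_source: bool
    correct_target: bool
    correct_head: bool


def attestation_effectiveness(records: Iterable[AttestationRecord],
                              current_epoch: int,
                              window_epochs: int = 100,
                              max_delay: int = 1) -> float:
    """Share of duties in the rolling window that were timely and fully correct."""
    start = current_epoch - window_epochs + 1
    window = [r for r in records if start <= r.epoch <= current_epoch]
    if not window:
        return 0.0
    good = sum(
        1 for r in window
        if r.included and r.inclusion_delay <= max_delay
        and r.correct_source and r.correct_target and r.correct_head
    )
    return good / len(window)
```

Returning 0.0 for an empty window is one possible convention; a production system might instead report "no data" so that newly activated validators are not scored as failing.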

Your development environment should include the necessary libraries and SDKs. For data ingestion, you'll use web3.py or ethers.js for Ethereum, and the CosmJS (@cosmjs) packages for Cosmos chains. Analytical logic will be written in Python (with pandas and numpy) or TypeScript. You must also decide on a scoring algorithm; a common approach is a weighted sum of normalized metrics, but more advanced systems may use machine learning models trained on historical chain data to predict future reliability.

Finally, consider the operational requirements. The scoring system should run as a scheduled cron job or within a workflow orchestrator like Apache Airflow to update scores periodically. You'll need to implement logging, monitoring (e.g., with Prometheus/Grafana), and alerting for data pipeline failures. Ensure you have a plan for handling chain reorganizations and missed slots in your data processing logic to maintain score accuracy. With these prerequisites in place, you can proceed to implement the data pipeline and scoring engine.

Key Concepts and Metrics

Building a robust validator scoring system requires understanding the core data inputs, economic models, and security mechanisms that define validator behavior and trustworthiness.

Slashing Conditions and Penalties

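Slashing is the primary mechanism for penalizing malicious validators, while downtime is handled differently from network to network. Core conditions include:

Double signing: Proposing or attesting to two conflicting blocks.
Downtime penalties: Failing to perform duties. On Ethereum this is not slashing: offline validators lose a small amount per missed duty, and the inactivity leak increases those penalties during network finality stalls. Cosmos SDK chains, by contrast, slash and jail validators for extended downtime.

Slashing amounts typically combine an initial penalty (e.g., 1 ETH, although the exact amount is a protocol parameter that has changed across Ethereum upgrades) with a correlation penalty that scales with the number of other validators slashed in the same period. This disincentivizes coordinated attacks.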

Performance Metrics: Uptime and Effectiveness

Beyond avoiding slashing, validator performance is measured by participation and proposal success. Key metrics are:

Attestation Effectiveness: The percentage of timely, correct attestations. High-performing nodes maintain >99%.
Block Proposal Success Rate: Successfully proposing a block when selected. Missed proposals indicate configuration or connectivity issues.
Sync Committee Participation: For networks like Ethereum, participation in sync committees is a critical, randomly assigned duty.

These real-time metrics are foundational for any reputation score.

Economic Security: Stake and Decentralization

A validator's economic stake is its primary security deposit. Reputation systems must account for:

Effective Balance: The portion of the total stake actively used for rewards/penalties (historically capped at 32 ETH on Ethereum; EIP-7251 in the Pectra upgrade raises the cap to 2,048 ETH for validators with compounding withdrawal credentials).
Stake Concentration: The risk posed by a single entity operating many validators. Scoring can incorporate client diversity and geographic distribution data to measure decentralization contribution.
Withdrawal Credentials: The credential type that controls withdrawals (e.g., 0x00 BLS credentials vs. 0x01 execution-address credentials) signals operational security practices.

Data Sources and Aggregation

Building a score requires aggregating on-chain and off-chain data.

On-chain: Slashing events, attestation performance, and balance changes are sourced directly from consensus layer clients (Lighthouse, Prysm, Teku).
Off-chain/MEV: Integration with relays (e.g., Flashbots, bloXroute) and mev-boost to assess proposer behavior, including censorship resistance and MEV extraction fairness.
Aggregation Challenges: Handling chain reorganizations (reorgs) and ensuring data consistency across different beacon chain API providers is critical.

Reputation Scoring Models

Translating metrics into a single score involves weighted models.

Weighted Linear Models: Assign scores to individual metrics (e.g., uptime 40%, slashing 30%, decentralization 30%) for a composite score.
Time-Decayed Metrics: Recent performance is weighted more heavily than historical data.
Comparative Scoring: Ranking validators against the network median or percentile (e.g., top 10% for attestation effectiveness).

Models must be transparent and resistant to manipulation.
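
The time-decayed and comparative approaches listed above can be sketched in a few lines of Python. Both helpers are illustrative assumptions: the half-life parameterization is one of several ways to express decay, and the percentile definition used here counts ties as "at or below".

```python
from bisect import bisect_right
from typing import Sequence


def decayed_average(values: Sequence[float], half_life: float) -> float:
    """Exponentially weighted mean; values[-1] is the most recent observation."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    # An observation that is `age` steps old gets weight 0.5 ** (age / half_life).
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def percentile_rank(value: float, population: Sequence[float]) -> float:
    """Fraction of the population scoring at or below `value` (0 to 1)."""
    if not population:
        raise ValueError("population must not be empty")
    ordered = sorted(population)
    return bisect_right(ordered, value) / len(ordered)


# A validator that recovered recently scores higher than its plain mean of 0.5.
print(round(decayed_average([0.0, 1.0], half_life=1.0), 4))  # 0.6667
print(percentile_rank(0.99, [0.90, 0.95, 0.99, 1.0]))        # 0.75
```

A shorter half-life makes the score react faster to incidents and to recovery, at the cost of more volatility between updates.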

Implementation and Tooling

Practical implementation requires specific tools and frameworks.

Beacon Chain APIs: Use the standard Ethereum Beacon Node API or services like Beaconcha.in or Rated.network for metric data.
Analysis Frameworks: Libraries like Lodestar or web3.js for programmatic interaction.
Example Workflow:
1. Query validator indices and status via /eth/v1/beacon/states/head/validators.
2. Fetch performance metrics from a separate analytics endpoint.
3. Calculate scores based on your model, updating with each new epoch.

This data pipeline forms the backend of any reputation dashboard.

System Architecture Overview

This guide details the core components and data flow for building a decentralized reputation system for blockchain validators, focusing on on-chain data aggregation and off-chain scoring logic.

A validator reputation system transforms raw on-chain performance data into a quantifiable trust score. The architecture is typically divided into three layers: a data ingestion layer that pulls metrics from the blockchain, a computation layer that processes this data using a defined scoring model, and a publishing layer that makes the final reputation scores available for consumption. This separation ensures modularity, allowing the scoring logic to be updated without disrupting data collection. For example, an Ethereum validator's performance can be assessed using metrics like attestation effectiveness, proposal success, and slashing history, all sourced from the Beacon Chain.

The data ingestion layer is responsible for collecting real-time and historical data. This involves running your own nodes, using an indexing service such as The Graph, or querying block explorer APIs. Key data points include block_proposal_count, attestation_inclusion_delay, sync_participation, and slashing_events. This raw data is then normalized and stored in a structured format, such as a time-series database, to facilitate efficient analysis. Ensuring data integrity at this stage is critical, as the entire reputation system depends on accurate inputs.

In the computation layer, the normalized data is fed into a scoring algorithm. This model assigns weights to different behaviors. A common approach uses a weighted sum: Score = (W_attestation * A) + (W_proposal * P) - (W_slashing * S). More advanced systems may employ machine learning models to detect anomalous patterns. The logic is often executed off-chain for flexibility but can be verified on-chain via cryptographic proofs (e.g., zk-SNARKs) if transparency is required. The output is a reputation score, often normalized to a range like 0-100 or a tiered label (e.g., High, Medium, Low).
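
A minimal sketch of this computation step, assuming all three inputs are already normalized to 0-1. The default weights and the tier cut-offs are arbitrary placeholders chosen for illustration, not recommended values.

```python
def reputation_score(attestation: float, proposal: float, slashing: float,
                     w_attestation: float = 0.6, w_proposal: float = 0.4,
                     w_slashing: float = 1.0) -> float:
    """Score = W_attestation*A + W_proposal*P - W_slashing*S, scaled to 0-100."""
    raw = w_attestation * attestation + w_proposal * proposal - w_slashing * slashing
    # Clamp so that heavy slashing cannot push the published score below zero.
    return 100.0 * min(1.0, max(0.0, raw))


def tier(score: float) -> str:
    """Map a 0-100 score to a coarse label; the cut-offs are arbitrary examples."""
    if score >= 90:
        return "High"
    if score >= 70:
        return "Medium"
    return "Low"


clean = reputation_score(attestation=0.99, proposal=1.0, slashing=0.0)
slashed = reputation_score(attestation=0.99, proposal=1.0, slashing=0.5)
print(tier(clean), tier(slashed))  # High Low
```

Publishing the tier alongside the numeric score lets consumers that only need a coarse filter avoid depending on the exact scale.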

Finally, the publishing layer disseminates the scores. For decentralized applications (dApps), this often means writing the final score or a commitment hash to a smart contract on a cost-efficient L2 like Arbitrum or Optimism. Alternatively, scores can be served via a decentralized API or IPFS. This allows staking pools, delegation platforms, and cross-chain bridges to query a validator's reputation before assigning work or stake. The entire system must be designed to resist manipulation, often through decentralized oracle networks like Chainlink to fetch data or via a committee of watchers validating the score calculation.

When implementing this architecture, key considerations include the update frequency (real-time vs. epoch-based), the cost of on-chain publication, and the subjectivity of the scoring model. A system prioritizing security might heavily penalize slashing events, while one optimized for liveness might weight block proposal success more highly. Open-sourcing the scoring model and audit trails is essential for community trust. Practical examples include the Rated Network for Ethereum validators and Chorus One's Sentinel model for Cosmos, which provide public frameworks for reputation assessment.

Step 1: Collecting Validator Data

This guide details the first step in building a validator reputation system: sourcing and structuring raw on-chain and off-chain data.

A robust reputation score is built on a foundation of comprehensive data. For validators, this data falls into two primary categories: on-chain performance metrics and off-chain metadata. On-chain data is objective, verifiable, and extracted directly from the blockchain via RPC nodes or indexers. This includes metrics like uptime, participation rate, slashing history, self-stake ratio, and commission rates. Off-chain data provides context and signals of operational health, such as the validator's public identity, geographic location, client software versions, and social presence.

To collect on-chain data, you will interact with the blockchain's consensus layer. For Ethereum, this means querying the Beacon Chain API endpoints. A basic example using curl to fetch validator information by its index would be: curl -X GET "https://<beacon-node>/eth/v1/beacon/states/head/validators/<validator_index>". For Cosmos-based chains, you can use the Tendermint RPC endpoint: curl -s "http://<node>:26657/validators?per_page=100&page=1", paging through the results because the RPC caps per_page at 100. It's critical to collect this data over a significant time window to calculate meaningful averages and identify trends. Note that 30-90 epochs on Ethereum covers only about 3 to 10 hours, so trend analysis usually needs days or weeks of history.
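
The Ethereum request can also be issued from Python using only the standard library. The sketch below assumes the standard Beacon API response shape for a single validator, where balances are decimal strings in Gwei; `fetch_validator`, `parse_validator`, and the `SAMPLE` payload are illustrative names and values, and the base URL is a placeholder you would supply.

```python
import json
from urllib.request import urlopen


def fetch_validator(base_url: str, validator_id: str, state_id: str = "head") -> dict:
    """GET one validator from a beacon node; base_url is supplied by the caller."""
    url = f"{base_url}/eth/v1/beacon/states/{state_id}/validators/{validator_id}"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def parse_validator(payload: dict) -> dict:
    """Flatten the fields a scoring pipeline usually needs; Gwei strings become ints."""
    data = payload["data"]
    validator = data["validator"]
    return {
        "index": int(data["index"]),
        "status": data["status"],
        "balance_gwei": int(data["balance"]),
        "effective_balance_gwei": int(validator["effective_balance"]),
        "slashed": bool(validator["slashed"]),
        "pubkey": validator["pubkey"],
    }


# Hand-written sample in the documented response shape (values are made up).
SAMPLE = {
    "data": {
        "index": "12345",
        "balance": "32015000000",
        "status": "active_ongoing",
        "validator": {
            "pubkey": "0xabc",
            "effective_balance": "32000000000",
            "slashed": False,
        },
    }
}
print(parse_validator(SAMPLE)["status"])  # active_ongoing
```

Keeping the parsing separate from the network call makes the pipeline testable against recorded responses without a live node.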

Structuring this raw data is the next critical task. You should design a schema that normalizes information across different blockchains. A simple JSON schema for a validator record might include fields for chain_id, validator_address, metrics (an object containing uptime_pct, participation_pct, slashed), and metadata (an object for moniker, website, security_contact). This structured data is then typically written to a time-series database like TimescaleDB or InfluxDB for efficient historical querying, or to a standard SQL database for relational analysis.
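
One way to express that record in code is with Python dataclasses, shown here as a sketch. The field names follow the schema described above; the class names and the optional defaults are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Metrics:
    uptime_pct: float
    participation_pct: float
    slashed: bool


@dataclass
class Metadata:
    moniker: str
    website: Optional[str] = None
    security_contact: Optional[str] = None


@dataclass
class ValidatorRecord:
    chain_id: str
    validator_address: str
    metrics: Metrics
    metadata: Metadata

    def to_json(self) -> str:
        """Serialize with sorted keys so identical records hash identically."""
        return json.dumps(asdict(self), sort_keys=True)


record = ValidatorRecord(
    chain_id="cosmoshub-4",
    validator_address="cosmosvaloper1example",  # placeholder, not a real address
    metrics=Metrics(uptime_pct=99.9, participation_pct=87.5, slashed=False),
    metadata=Metadata(moniker="example-validator"),
)
print(record.to_json())
```

Nesting metrics and metadata as separate objects keeps chain-specific additions out of the shared top-level fields.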

Beyond basic performance, advanced data points can significantly enhance scoring models. These include governance participation (voting history on proposals), MEV-related metrics (block proposal patterns, inclusion lists), and network topology data (peer count, latency). For off-chain due diligence, you may need to scrape validator websites, monitor their GitHub repositories for client updates, or verify their presence on key social channels like Discord and Twitter to assess community engagement and operational transparency.

Finally, establish a reliable data pipeline. This involves setting up scheduled jobs (e.g., using Apache Airflow or a simple cron job) to periodically poll data sources, handle API rate limits, and manage data validation to catch anomalies. The output of this step is a clean, historical dataset ready for the next phase: processing and calculating reputation scores. Without accurate and comprehensive data collection, any subsequent analysis will be flawed, making this the most critical foundational step in the system.

Step 2: Designing the Scoring Algorithm

The scoring algorithm is the engine of your reputation system. It translates raw validator data into a single, comparable score, balancing multiple performance and risk metrics.

A robust validator scoring algorithm is a multi-factor model that aggregates various on-chain and off-chain signals. Common inputs include uptime percentage, proposal success rate, slashing history, self-bonded stake, and commission rate. The algorithm must be transparent, deterministic, and resistant to manipulation. For example, you might source uptime data from a service like Chainscore's Validator API and slashing events directly from the chain's consensus layer.

You must decide on a weighting scheme for each metric. A validator with 99.9% uptime but a history of double-signing should be penalized more heavily than one with 95% uptime and a clean record. A typical approach uses a base score that is then multiplied by penalty factors. For instance: final_score = (uptime_score * 0.4 + self_stake_score * 0.3) * slashing_penalty * governance_penalty. The weights (0.4, 0.3) reflect your system's priorities, that is, whether it values reliability or skin-in-the-game more highly. Because these two weights sum to 0.7, the base score tops out at 0.7 unless you rescale it or add further weighted components.
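
The comparison between the two validators can be checked with a short sketch of the multiplicative formula. The penalty value of 0.5 for a past double-sign and the self-stake score of 0.8 are hypothetical numbers chosen only to make the effect visible.

```python
def final_score(uptime_score: float, self_stake_score: float,
                slashing_penalty: float = 1.0,
                governance_penalty: float = 1.0) -> float:
    """Weighted base score scaled down by penalty factors, each in [0, 1]."""
    for factor in (slashing_penalty, governance_penalty):
        if not 0.0 <= factor <= 1.0:
            raise ValueError("penalty factors must be in [0, 1]")
    base = uptime_score * 0.4 + self_stake_score * 0.3
    return base * slashing_penalty * governance_penalty


# Hypothetical penalty: a past double-sign halves the score.
double_signer = final_score(0.999, 0.8, slashing_penalty=0.5)
clean_record = final_score(0.95, 0.8)
print(double_signer < clean_record)  # True
```

Because penalties multiply rather than subtract, no amount of uptime can offset a slashing factor, which is the behavior the paragraph above asks for.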

Implementing the algorithm requires careful data normalization. Metrics like "self-bonded stake" exist on different scales across networks (e.g., 32 ETH vs. 1,000,000 ATOM). Use min-max scaling to bring all values to a common 0-1 range before applying weights, or z-score normalization followed by a mapping into 0-1, since z-scores on their own are unbounded. This ensures a validator's score isn't disproportionately affected by the native token's denomination.
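
Min-max scaling applied per network can be sketched as below. The validator names and stake figures are invented for the example, and mapping a network where all values are equal to 0.0 is just one possible convention.

```python
from typing import Dict


def min_max_scale(values: Dict[str, float]) -> Dict[str, float]:
    """Rescale one network's raw values to [0, 1]; equal values all map to 0.0."""
    if not values:
        return {}
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {key: 0.0 for key in values}
    return {key: (value - lo) / (hi - lo) for key, value in values.items()}


# Self-bonded stake in ATOM for three hypothetical validators.
atom_stake = {"val-a": 10_000, "val-b": 505_000, "val-c": 1_000_000}
print(min_max_scale(atom_stake))  # {'val-a': 0.0, 'val-b': 0.5, 'val-c': 1.0}
```

Min-max scaling is sensitive to outliers: one very large validator compresses everyone else toward zero, so clipping or log-transforming stake before scaling may be worth considering.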

The algorithm should output a score that is comparable over time. Consider implementing a rolling window for metrics like uptime (e.g., last 10,000 blocks) rather than lifetime totals. This ensures the score reflects recent performance and allows validators to recover their reputation after an incident. Store the scoring logic and historical scores immutably, perhaps in a smart contract or a verifiable database, to ensure auditability.
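
A rolling window like this can be maintained incrementally rather than recomputed from scratch. The class below is a sketch under the assumption that you feed it one signed/missed outcome per block; `RollingUptime` is a hypothetical name, not an existing library type.

```python
from collections import deque


class RollingUptime:
    """Tracks the signed/missed outcome of the most recent `window` blocks."""

    def __init__(self, window: int = 10_000) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._outcomes = deque(maxlen=window)  # oldest entries fall off automatically
        self._signed = 0

    def record(self, signed: bool) -> None:
        """Add the newest block outcome, adjusting the count for any evicted entry."""
        if len(self._outcomes) == self._outcomes.maxlen and self._outcomes[0]:
            self._signed -= 1  # the entry about to be evicted was a signed block
        self._outcomes.append(signed)
        if signed:
            self._signed += 1

    @property
    def uptime(self) -> float:
        """Fraction of blocks signed within the current window."""
        return self._signed / len(self._outcomes) if self._outcomes else 0.0


tracker = RollingUptime(window=3)
for outcome in (True, False, True, True, True):
    tracker.record(outcome)
print(tracker.uptime)  # 1.0, the early miss has left the window
```

Keeping a running count makes each update constant-time, which matters when thousands of validators are updated every block.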

Finally, test your algorithm against historical validator data. Simulate how the top 100 validators on a network like Cosmos or Ethereum would have been ranked over the past year. This backtesting phase is crucial for identifying edge cases, such as how to handle a validator that just joined the active set, and for calibrating weights to produce a meaningful distribution of scores.

Step 3: On-Chain Integration

This step deploys the validator reputation scoring logic to a smart contract, creating a transparent, immutable, and programmatically accessible system.

The core of the reputation system is a smart contract that stores scores and manages updates. A common design uses a mapping to associate a validator's address with a ReputationData struct. This struct typically contains fields like totalScore, lastUpdateBlock, penaltyCount, and a history of recent actions. The contract must include permissioned functions, often restricted to an oracle or a decentralized set of keepers, to update these scores based on off-chain analysis. It's critical that the update logic is gas-efficient, as frequent updates for many validators can become expensive.

For security and decentralization, the update mechanism should not rely on a single private key. Consider using a multi-signature wallet for the oracle address or implementing a commit-reveal scheme where multiple reporters submit scores and the median is used. For high-value systems, you can integrate with a decentralized oracle network like Chainlink Functions to fetch and compute scores in a trust-minimized way. The contract should also include a timelock or challenge period for score updates, allowing validators to dispute incorrect assessments before they are finalized.
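
The commit-reveal-with-median idea can be prototyped off-chain before writing any contract code. In this sketch SHA-256 stands in for the keccak256 hash a Solidity contract would use, and the reporter names, salts, and function names are all hypothetical.

```python
import hashlib
from statistics import median
from typing import Dict, Tuple


def commitment(score: int, salt: bytes) -> str:
    """Hash a reporter publishes first; SHA-256 stands in for on-chain keccak256."""
    return hashlib.sha256(score.to_bytes(32, "big") + salt).hexdigest()


def aggregate(commits: Dict[str, str], reveals: Dict[str, Tuple[int, bytes]]) -> float:
    """Median of the revealed scores whose hash matches the earlier commitment."""
    valid = [
        score for reporter, (score, salt) in reveals.items()
        if commits.get(reporter) == commitment(score, salt)
    ]
    if not valid:
        raise ValueError("no valid reveals")
    return median(valid)


reveals = {
    "reporter-1": (900, b"salt-1"),
    "reporter-2": (910, b"salt-2"),
    "reporter-3": (100, b"salt-3"),   # outlier, dampened by the median
    "reporter-4": (1000, b"salt-4"),  # reveal will not match its commitment
}
commits = {name: commitment(score, salt) for name, (score, salt) in reveals.items()}
commits["reporter-4"] = commitment(950, b"salt-4")  # committed to a different score
print(aggregate(commits, reveals))  # 900
```

Using the median means a minority of faulty or malicious reporters cannot move the result arbitrarily, while the commit phase prevents reporters from copying each other's submissions.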

Here is a simplified example of a core contract structure in Solidity. This contract allows a designated oracle to update a validator's score and allows anyone to read the current reputation data.

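```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// @title Minimal registry of oracle-reported validator reputation scores
contract ValidatorReputation {
    // Sole account allowed to write scores; fixed at deployment.
    address public immutable oracle;

    uint256 public constant MAX_SCORE = 1000;

    struct ReputationData {
        uint256 score; // 0-1000 scale
        uint32 lastUpdate; // Unix timestamp of the latest update
        uint16 penaltyCount; // penalty total as reported by the oracle
    }

    mapping(address => ReputationData) public reputationOf;

    event ScoreUpdated(address indexed validator, uint256 newScore, uint256 timestamp);

    constructor(address _oracle) {
        require(_oracle != address(0), "Oracle is zero address");
        oracle = _oracle;
    }

    /// @notice Overwrites the stored reputation data for a validator.
    function updateScore(address _validator, uint256 _score, uint16 _penaltyCount) external {
        require(msg.sender == oracle, "Unauthorized");
        require(_score <= MAX_SCORE, "Score out of range");
        reputationOf[_validator] = ReputationData({
            score: _score,
            lastUpdate: uint32(block.timestamp),
            penaltyCount: _penaltyCount
        });
        emit ScoreUpdated(_validator, _score, block.timestamp);
    }

    /// @notice Returns the current score (0 if the validator was never scored).
    function getScore(address _validator) external view returns (uint256) {
        return reputationOf[_validator].score;
    }
}
```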

Once deployed, the contract becomes a source of truth that other protocols can query. A DeFi slashing insurance protocol might use getScore(validatorAddress) to adjust premium rates. A cross-chain bridge could route transactions only through validators with a score above a certain threshold. The emitted ScoreUpdated events create an on-chain audit trail, allowing block explorers and analytics dashboards like Dune Analytics or Flipside Crypto to track reputation history transparently. This composability is a key advantage of having the scores on-chain.

Before mainnet deployment, conduct thorough testing and security audits. Use a testnet such as Sepolia (Goerli has been deprecated) to simulate the oracle update flow and gas costs. Consider the contract's upgradeability strategy: using a proxy pattern (e.g., Transparent or UUPS) allows you to fix bugs or improve the scoring algorithm later, but adds complexity. Finally, verify and publish the contract source code on block explorers like Etherscan to foster trust and allow developers to integrate with your contract's ABI easily.

Use Cases for the Reputation Score

A validator's reputation score, derived from on-chain performance data, enables new systems for delegation, security, and network governance.

Automated Staking Delegation Pools

Delegation pools can use reputation scores to automatically allocate stake to the most reliable validators. This creates a "set-and-forget" staking experience for users and improves overall network security.

Example: A pool's smart contract could automatically rebalance delegations weekly, moving funds away from validators whose scores drop below a threshold.
Impact: Reduces manual research for delegators and creates a competitive incentive for validators to maintain high performance.
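
The rebalancing rule in the example can be sketched off-chain as a pure function. Allocating proportionally to score among eligible validators is an assumption made here for illustration; a real pool could just as well split stake equally or apply per-validator caps.

```python
from typing import Dict


def rebalance(total_stake: float, scores: Dict[str, float],
              threshold: float) -> Dict[str, float]:
    """Split stake across validators at or above `threshold`, proportional to score."""
    eligible = {v: s for v, s in scores.items() if s >= threshold}
    if not eligible:
        raise ValueError("no validator meets the threshold")
    score_sum = sum(eligible.values())
    allocation = {v: 0.0 for v in scores}  # validators below the threshold get zero
    for validator, score in eligible.items():
        allocation[validator] = total_stake * score / score_sum
    return allocation


print(rebalance(1500.0, {"val-a": 90, "val-b": 60, "val-c": 30}, threshold=50))
# {'val-a': 900.0, 'val-b': 600.0, 'val-c': 0.0}
```

In practice a pool would also limit how much stake moves per rebalance, since unbonding periods and sudden stake shifts carry their own risks.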

Slashing Insurance and Risk Assessment

Protocols offering slashing insurance can use reputation scores to price premiums dynamically. A validator with a high, consistent score represents lower risk.

Mechanism: Insurance smart contracts query the reputation oracle to adjust coverage costs. Validators with poor uptime or prior slashing events pay higher premiums.
Data Points: Score calculations incorporate liveness, correctness, and governance participation to assess multifaceted risk.

Validator Set Optimization for Bridges & Rollups

Cross-chain bridges and optimistic rollups that rely on validator or guardian committees can use reputation to select and rotate members. This mitigates centralization and collusion risks.

Implementation: A bridge's governance contract could mandate that only validators in the top 40% by reputation score are eligible for the active set.
Security Benefit: Continuously cycling in high-performing validators based on objective metrics makes attacks more difficult and expensive to coordinate.

On-Chain Governance Weighting

DAO governance systems can weight voting power based on validator reputation, aligning influence with proven network contribution.

Process: A validator's voting power in a protocol DAO is multiplied by a factor derived from their reputation score.
Rationale: This prevents large, poorly performing validators from having disproportionate control over network upgrades and treasury decisions. It rewards long-term, reliable participants.

MEV Relay Selection and Monitoring

Block builders and searchers can use validator reputation to choose which MEV relays to trust. Relays are associated with the validators they serve.

Use Case: A searcher's bot might prioritize sending bundles through relays that only work with validators having a high "correctness" score, reducing the risk of bundle theft or censorship.
Transparency: Reputation systems can track and score relays based on the historical performance of their connected validators.

New Validator Onboarding & Bonding

Networks can implement reputation-based bonding curves for new validators. Instead of a fixed bond, the required stake could be inversely related to a pre-established score from a testnet or other network.

Mechanism: A validator with a proven track record on a testnet (scored by the same system) could join mainnet with a 20% lower bond requirement.
Benefit: Lowers barriers to entry for competent operators while maintaining security through performance-based requirements.

Validator Metric Comparison and Weighting

A comparison of common on-chain and off-chain metrics used to evaluate validator performance and reliability, with suggested weighting for a reputation score.

Metric	Uptime / Liveness	Governance / Staking	Economic Security	Proposer Performance
Data Source	On-chain (Consensus Layer)	On-chain (Governance/Staking)	On-chain (Delegation)	On-chain (Blockchain Data)
Primary Measurement	Attestation effectiveness, missed blocks	Voting participation, proposal submission	Self-stake ratio, commission rate	MEV-Boost usage, block proposal latency
Typical Weight in Score	35-50%	15-25%	20-30%	10-20%
Manipulation Risk	Low (Sybil-resistant)	Medium (Can abstain)	High (Can be gamed with low self-stake)	Medium (Can be optimized)
Update Frequency	Per epoch (6.4 min)	Per proposal/epoch	Per validator change	Per slot (12 sec)
Example: Ethereum	99% target	67% for consensus	32 ETH self-stake ideal	< 4 sec proposal time
Scoring Complexity	Low (Binary/Percentage)	Medium (Weighted by proposal)	High (Requires slashing history)	High (Requires MEV data)

Frequently Asked Questions

Common questions and technical troubleshooting for developers implementing validator reputation scoring systems.

What is a validator reputation score and how is it calculated?

A validator reputation score is a dynamic metric that quantifies a validator's historical performance, reliability, and trustworthiness within a Proof-of-Stake (PoS) network. It is not a single raw measurement but a composite index derived from multiple on-chain and sometimes off-chain signals.

Core calculation inputs typically include:

Uptime/Slashing History: The primary factor. Penalties for double-signing or downtime drastically lower scores.
Governance Participation: Voting on proposals signals engagement.
Commission Rate & Changes: High or frequently increased commissions can negatively impact perceived reliability.
Self-Bonded Stake Ratio: A higher percentage of operator-owned stake aligns incentives.
Delegator Count & Distribution: A broad, decentralized delegator base is often seen as healthier than a few large whales.

Systems like Chainscore aggregate these signals, applying weighted algorithms (e.g., time-decayed averages for uptime) to produce a normalized score (e.g., 0-100). This allows delegators and protocols to programmatically assess risk and automate delegation decisions.

Resources and Tools

Practical tools and design components for building a validator reputation scoring system. These resources cover data collection, scoring logic, monitoring infrastructure, and governance integration.

On-Chain Validator Performance Metrics

A validator reputation system should start with objective on-chain signals derived directly from consensus and staking modules. Most PoS networks already expose the raw data needed.

Key metrics to index per validator:

Uptime / signing rate from consensus logs or RPC endpoints
Missed blocks and rolling window availability
Slashing events including reason codes and penalty size
Commission changes and frequency of parameter updates
Voting participation for governance proposals

Example implementations:

Cosmos SDK chains expose validator signing info via slashing and staking modules
Ethereum validator effectiveness can be derived from attestation inclusion and missed proposals

Best practice is to normalize metrics per epoch and store historical snapshots. Raw uptime alone is insufficient; weighting recent behavior higher than lifetime performance reduces reputation inertia and makes the score responsive to operator changes.

Reputation Scoring Models and Weighting Logic

After collecting metrics, define a transparent scoring formula that converts raw signals into a single reputation score. Avoid black-box models unless governance explicitly approves them.

Common scoring approaches:

Weighted linear models where uptime, slashing, and governance participation each contribute a fixed percentage
Penalty-first models where any slashing event caps the maximum achievable score for a period
Decay functions that reduce the impact of older data using exponential or epoch-based decay

Example weighting:

50% uptime and missed blocks
30% slashing history and severity
20% governance participation and responsiveness

Publish the formula on-chain or in versioned documentation. Validators should be able to simulate their future score given expected behavior. This reduces disputes and aligns incentives toward measurable actions rather than subjective trust signals.
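
A validator-facing simulator for such a formula could look like the sketch below, which combines the example 50/30/20 weighting with a penalty-first cap. The cap value of 0.6 and the cap duration of 2,000 epochs are hypothetical parameters, not recommendations.

```python
from typing import Optional


def weighted_score(uptime: float, slashing: float, governance: float) -> float:
    """Example weighting: 50% uptime, 30% slashing history, 20% governance."""
    return 0.5 * uptime + 0.3 * slashing + 0.2 * governance


def penalty_first_score(uptime: float, slashing: float, governance: float,
                        epochs_since_slash: Optional[int],
                        cap: float = 0.6, cap_epochs: int = 2000) -> float:
    """Apply the weighted model, then cap the result while a slashing is recent."""
    score = weighted_score(uptime, slashing, governance)
    recently_slashed = epochs_since_slash is not None and epochs_since_slash < cap_epochs
    return min(score, cap) if recently_slashed else score


# A validator can simulate how its score evolves as a slashing event ages out.
print(penalty_first_score(1.0, 1.0, 1.0, epochs_since_slash=10))    # 0.6
print(penalty_first_score(1.0, 1.0, 1.0, epochs_since_slash=5000))  # 1.0
```

Because every input and parameter is explicit, the same function can be published in versioned documentation and run by validators against their own expected behavior.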

Monitoring Stack: Prometheus and Grafana

Reliable reputation scores require continuous monitoring infrastructure to prevent gaps or manipulation. Prometheus and Grafana are widely used across validator operators and protocol teams.

Typical setup:

Export consensus and node metrics via Prometheus exporters
Scrape metrics at fixed intervals aligned with block times
Store time-series data for long-term trend analysis
Visualize validator-level dashboards in Grafana

Metrics to monitor:

Block signing latency and success rate
Peer count and network stability
Disk, memory, and CPU saturation that precede downtime

Prometheus data can be aggregated off-chain to compute reputation scores, then committed on-chain periodically via governance-approved oracles. This separation keeps the chain lightweight while preserving verifiability of the underlying data.

Governance and Slashing Event Indexing

Reputation systems should integrate governance outcomes and slashing records to reflect social and economic trust, not just technical uptime.

Key integration points:

Index governance votes to track participation rate and consistency
Flag abstentions versus explicit yes or no votes
Track slashing events with context such as double-signing or downtime

For Cosmos SDK chains, this data is available via:

gov module for proposal votes and deposits
slashing module for penalties and jail events

Design consideration:

A single severe slashing event can outweigh months of good behavior
Governance inactivity may be penalized less than active malicious voting

Publishing these rules in advance ensures validators understand how social behavior impacts reputation and avoids retroactive scoring changes.

On-Chain Storage and Consumer Use Cases

Decide early whether reputation scores live on-chain, off-chain, or hybrid. This choice affects cost, composability, and trust assumptions.

Storage models:

Off-chain computation with periodic on-chain commitments
Fully on-chain scoring using epoch-based updates
NFT or account-bound records representing validator reputation

Consumer use cases:

Delegation UIs that sort validators by reputation score
Liquid staking protocols applying minimum score thresholds
DAOs gating roles or rewards based on validator reputation

Hybrid models are most common. Scores are computed off-chain for flexibility, then anchored on-chain for transparency and downstream composability. Always include versioning so changes to scoring logic do not invalidate historical scores.