Miner Reputation and Incentives

In decentralized inference, miners vary in reliability and responsiveness, and the system cannot know this in advance. Treating all miners equally would reward unreliable behavior and penalize dependable ones.

Nesa uses incentives to make reliable and efficient behavior consistently more profitable than unreliable behavior.


Core Principle

Miners who deliver correct and timely results should receive more opportunities and higher long-term rewards.

This principle is implemented through a continuously updated reputation score rather than manual rules or centralized control.

How Incentives Work

Reputation summarizes observed miner behavior and directly affects outcomes that matter:

  • Task assignment: higher-reputation miners are selected more often

  • Failure cost: timeouts and non-responses reduce future opportunities

  • Recovery: occasional failures are tolerated; persistent issues are deprioritized

From a miner’s perspective:

Better performance → higher reputation → more tasks → higher rewards


Design Goals

The miner reputation system is designed to be:

  • Fast to compute (per request)

  • Fair across modalities (LLMs, diffusion, evolution models, etc.)

  • Robust to failures (timeouts and non-responses)

  • Recoverable for miners with temporary issues

  • Extensible to future validation signals (e.g., output correctness)


Core Reputation Variable

Each miner maintains a reputation score:

  • Reputation R ∈ [0.1, 10]

  • Initialized at R = 1

  • Updated after every inference request

The reputation score directly influences:

  • Miner selection priority

  • Task routing tier

  • Reward and penalty magnitude


Top-Down Reputation Update (Orchestrated Routing)

In orchestrated settings, miners are assigned tasks by an agent or scheduler. Reputation is updated primarily based on completion correctness.

Update Rule

R' = R · Pen^M · Rew^(1−M)

Where:

  • R: current reputation

  • R': updated reputation

  • M ∈ {0, 1}: mistake indicator

    • M = 1 → timeout or non-response

    • M = 0 → successful execution

  • Pen = 0.8: penalty multiplier

  • Rew = 1.01 (up to 1.05 under evaluation): reward multiplier

Interpretation

  • Correct behavior leads to gradual exponential growth

  • Failures cause multiplicative decay

  • Consistently correct miners quickly separate from unreliable ones

  • Occasional mistakes are recoverable
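The top-down update can be sketched in a few lines of Python. The function name and the clamping to the documented [0.1, 10] range inside the update are illustrative assumptions; the multiplier values follow the text:

```python
def update_reputation_top_down(r: float, mistake: bool,
                               pen: float = 0.8, rew: float = 1.01,
                               r_min: float = 0.1, r_max: float = 10.0) -> float:
    """Apply R' = R * Pen^M * Rew^(1-M), clamped to [0.1, 10]."""
    m = 1 if mistake else 0
    r_new = r * (pen ** m) * (rew ** (1 - m))
    return max(r_min, min(r_max, r_new))
```

Starting from the initial score R = 1, one success raises the score to 1.01 while one failure drops it to 0.8, so sustained failures decay the score far faster than successes can rebuild it.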


Bottom-Up Reputation Update (Bidding / Open Routing)

In decentralized or bidding-based architectures, correctness alone is insufficient. The system must also filter out miners that are consistently slow or underpowered.

To address this, the reputation update incorporates performance metrics.

Extended Update Rule

R' = α · R · Pen^M · Rew^(1−M) + β · (w_S·S + w_F·F + w_B·B + w_I·I)

Where:

  • α = 0.7: weight for correctness history

  • β = 0.3: weight for performance

  • S: single-sample inference efficiency (tokens/s per block)

  • F: forward-pass throughput (batch inference)

  • B: backward-pass throughput (if applicable)

  • I: network responsiveness / bandwidth

  • w_S, w_F, w_B, w_I: configurable weights

All performance metrics are min-max normalized:

x ← (x − x_min) / (x_max − x_min)
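The extended rule can be sketched as follows, using the stated α, β, Pen, and Rew values. The dictionary-based passing of already-normalized metrics and the guard for a degenerate min/max window are illustrative assumptions:

```python
def minmax(x: float, x_min: float, x_max: float) -> float:
    """Min-max normalize x into [0, 1]; guard against a degenerate window."""
    if x_max <= x_min:
        return 0.0
    return (x - x_min) / (x_max - x_min)

def update_reputation_bottom_up(r: float, mistake: bool,
                                perf: dict, weights: dict,
                                alpha: float = 0.7, beta: float = 0.3,
                                pen: float = 0.8, rew: float = 1.01) -> float:
    """R' = alpha * R * Pen^M * Rew^(1-M) + beta * sum(w_k * metric_k).

    `perf` maps metric names (S, F, B, I) to normalized values in [0, 1];
    `weights` maps the same names to w_S, w_F, w_B, w_I.
    """
    m = 1 if mistake else 0
    correctness = r * (pen ** m) * (rew ** (1 - m))
    performance = sum(weights[k] * perf[k] for k in perf)
    r_new = alpha * correctness + beta * performance
    return max(0.1, min(10.0, r_new))
```

Because α > β, a miner with perfect performance metrics but a failed request still ends up below a correct miner with mediocre metrics, which is how correctness stays the dominant signal.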

Design Rationale

  • Correctness remains the dominant signal

  • Performance differentiates miners with similar accuracy

  • Hardware-only advantages cannot overwhelm correctness

  • Low-reputation miners can still recover over time


Efficiency Metric SS

Efficiency is measured per request as:

S_raw = (input_size + output_size) / actual_time

Normalization bounds:

  • Computed per model, not per miner

  • Window size: last 100 requests for that model

  • Strict min/max bounds (percentile-based bounds under evaluation)

A rolling average is maintained per miner:

S_rolling ← 0.9 · S_rolling + 0.1 · S
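The per-model window, min-max normalization, and per-miner rolling average above can be combined into one small tracker. The `EfficiencyTracker` class and its method names are illustrative assumptions; the 100-request window and the 0.9/0.1 rolling weights follow the text:

```python
from collections import deque

class EfficiencyTracker:
    """Tracks S for one model: a raw-efficiency window plus rolling averages."""

    def __init__(self, window: int = 100):
        self.window = deque(maxlen=window)   # last 100 requests for this model
        self.rolling = {}                    # miner_id -> rolling S

    def record(self, miner_id: str, input_size: float,
               output_size: float, actual_time: float) -> float:
        """Record one request and return the miner's updated rolling S."""
        s_raw = (input_size + output_size) / actual_time
        self.window.append(s_raw)
        lo, hi = min(self.window), max(self.window)
        s = (s_raw - lo) / (hi - lo) if hi > lo else 0.0
        prev = self.rolling.get(miner_id, 0.0)
        self.rolling[miner_id] = 0.9 * prev + 0.1 * s
        return self.rolling[miner_id]
```

Keeping one tracker per model (not per miner) matches the normalization rule above: all miners serving the same model are compared against the same recent min/max bounds.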

Timeouts and Mistakes

A request is considered a mistake (M = 1) if:

  • Execution time exceeds 1.5× the expected time

  • No response arrives within 2× the expected time

Expected time accounts for:

  • Average execution time for the model

  • Input and output size scaling

Non-responses incur a stronger penalty multiplier than slow responses.
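These rules reduce to a small classifier. The function name and the string labels distinguishing the two penalty cases are illustrative assumptions; the 1.5× and 2× thresholds follow the text:

```python
def classify_request(actual_time, expected_time):
    """Return (M, kind): M = 1 for a mistake, with the kind of failure.

    actual_time is None when the miner never responded; a response later
    than 2x the expected time is treated the same way, since the harsher
    non-response penalty applies in both cases.
    """
    if actual_time is None or actual_time > 2.0 * expected_time:
        return 1, "non_response"   # stronger penalty multiplier applies
    if actual_time > 1.5 * expected_time:
        return 1, "slow"
    return 0, None                 # successful, timely execution
```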


Rewards and Penalties

Rewards (Successful Requests)

reward = base_reward · R^γ

  • γ = 1.2 (under evaluation)

  • Encourages long-term reliability

Penalties (Failures)

penalty = base_penalty / R^δ

  • δ = 0.5

  • Prevents overly harsh punishment of high-reputation miners

  • Still discourages repeated failures
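Both formulas are direct to implement; the function names are illustrative, while γ = 1.2 and δ = 0.5 follow the text:

```python
def reward_for_success(base_reward: float, r: float, gamma: float = 1.2) -> float:
    """reward = base_reward * R^gamma: rewards grow superlinearly with reputation."""
    return base_reward * r ** gamma

def penalty_for_failure(base_penalty: float, r: float, delta: float = 0.5) -> float:
    """penalty = base_penalty / R^delta: penalties soften as reputation rises."""
    return base_penalty / r ** delta
```

For example, a miner at R = 4 pays only half the base penalty for a failure (since 4^0.5 = 2), while a miner at the floor R = 0.1 pays roughly 3.2× the base penalty, so repeated failures compound quickly for unreliable miners.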


Reputation Bounds and Recovery

  • Reputation is clamped to [0.1, 10]

  • A small catch-up factor helps low-reputation miners recover:

    • Encourages re-entry after transient issues

    • Prevents permanent exclusion
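One possible shape for the clamp plus catch-up step is sketched below. The text only says a small catch-up factor exists; the additive nudge toward the neutral score of 1 and the 0.01 rate are hypothetical choices for illustration:

```python
def clamp_with_catchup(r: float, r_min: float = 0.1, r_max: float = 10.0,
                       catchup: float = 0.01) -> float:
    """Clamp reputation to [0.1, 10] and nudge low scores toward neutral.

    The additive nudge below R = 1 is a hypothetical form of the catch-up
    factor; it lets miners with transient issues drift back into rotation.
    """
    if r < 1.0:
        r += catchup * (1.0 - r)   # drift back toward the neutral score R = 1
    return max(r_min, min(r_max, r))
```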


Empirical Observations

From large-scale inference traces:

  • Total response time is strongly correlated with model loading time

  • Inference time alone correlates weakly with model size

  • Performance distributions are heavy-tailed and zero-inflated

These observations motivate:

  • Normalization by model size

  • Separate accounting of loading vs inference time

  • Careful tuning of performance weights


Relationship to Miner Selection

Reputation directly affects:

  • Miner tier assignment

  • Selection probability in routing

  • Reputation-based rewards (RBR)

  • Fallback and retry ordering

The system favors reliable miners, but does not permanently exclude others.
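One simple policy consistent with these properties is reputation-weighted sampling: every miner with positive reputation remains selectable, but higher scores win more often. Proportional sampling is an illustrative assumption here, as the text does not specify the exact selection distribution:

```python
import random

def select_miner(reputations: dict, rng=random) -> str:
    """Pick a miner with probability proportional to its reputation score.

    Even a miner at the reputation floor (0.1) keeps a nonzero chance of
    selection, so no miner is permanently excluded.
    """
    miners = list(reputations)
    weights = [reputations[m] for m in miners]
    return rng.choices(miners, weights=weights, k=1)[0]
```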


Summary

The miner reputation system in Nesa:

  • Combines correctness and efficiency

  • Adapts across models and modalities

  • Supports decentralized participation

  • Enables recovery from transient failures

  • Remains extensible to future validation signals

This framework provides a stable foundation for trust-aware, incentive-aligned decentralized inference.
