# Trust and Incentives: Miner Reputation

In decentralized inference, miners vary in reliability and responsiveness, and the system cannot know this in advance. Treating all miners equally would reward unreliable behavior and penalize dependable ones.

**Nesa uses incentives to make reliable and efficient behavior consistently more profitable than unreliable behavior.**

***

### Core Principle

> **Miners who deliver correct and timely results should receive more opportunities and higher long-term rewards.**

This principle is implemented through a continuously updated reputation score rather than manual rules or centralized control.

### How Incentives Work

Reputation summarizes observed miner behavior and directly affects outcomes that matter:

* **Task assignment:** higher-reputation miners are selected more often
* **Failure cost:** timeouts and non-responses reduce future opportunities
* **Recovery:** occasional failures are tolerated, while persistent issues are deprioritized

From a miner’s perspective:

> Better performance → higher reputation → more tasks → higher rewards

***

### Design Goals

The miner reputation system is designed to be:

* **Fast to compute** (per request)
* **Fair across modalities** (LLMs, diffusion, evolution models, etc.)
* **Robust to failures** (timeouts and non-responses)
* **Recoverable** for miners with temporary issues
* **Extensible** to future validation signals (e.g., output correctness)

***

### Core Reputation Variable

Each miner maintains a reputation score:

* **Reputation** $$R \in [0.1, 10]$$
* Initialized at $$R = 1$$
* Updated after every inference request

The reputation score directly influences:

* Miner selection priority
* Task routing tier
* Reward and penalty magnitude

***

### Top-Down Reputation Update (Orchestrated Routing)

In orchestrated settings, miners are assigned tasks by an agent or scheduler.\
Reputation is updated primarily based on **completion correctness**.

#### Update Rule

$$
R' = R \cdot \text{Pen}^{M} \cdot \text{Rew}^{1 - M}
$$

**Where:**

* $$R$$: current reputation
* $$R'$$: updated reputation
* $$M \in \{0, 1\}$$: mistake indicator
  * $$M = 1$$ → timeout or non-response
  * $$M = 0$$ → successful execution
* $$\text{Pen} = 0.8$$: penalty multiplier
* $$\text{Rew} = 1.01$$: reward multiplier (values up to 1.05 under evaluation)

#### Interpretation

* Correct behavior leads to **gradual exponential growth**
* Failures cause **multiplicative decay**
* Consistently correct miners quickly separate from unreliable ones
* Occasional mistakes are recoverable
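The top-down update rule above can be sketched as a small function. The `PEN` and `REW` constants and the $$[0.1, 10]$$ clamp come from this document; the function name and signature are illustrative.

```python
PEN = 0.8   # penalty multiplier (Pen)
REW = 1.01  # reward multiplier (Rew)
R_MIN, R_MAX = 0.1, 10.0

def update_reputation(r: float, mistake: bool) -> float:
    """Apply R' = R * Pen^M * Rew^(1 - M), then clamp to [R_MIN, R_MAX]."""
    m = 1 if mistake else 0
    r_new = r * (PEN ** m) * (REW ** (1 - m))
    return min(max(r_new, R_MIN), R_MAX)
```

For example, a successful request moves a fresh miner from $$R = 1$$ to $$R = 1.01$$, while a timeout drops it to $$R = 0.8$$; repeated successes compound exponentially toward the cap.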

***

### Bottom-Up Reputation Update (Bidding / Open Routing)

In decentralized or bidding-based architectures, correctness alone is insufficient.\
The system must also filter out miners that are **consistently slow or underpowered**.

To address this, the reputation update incorporates **performance metrics**.

#### Extended Update Rule

$$
R' = \alpha \cdot R \cdot \text{Pen}^{M} \cdot \text{Rew}^{1 - M} + \beta \cdot (w\_S S + w\_F F + w\_B B + w\_I I)
$$

**Where:**

* $$\alpha = 0.7$$: weight for correctness history
* $$\beta = 0.3$$: weight for performance
* $$S$$: single-sample inference efficiency (token/s per block)
* $$F$$: forward-pass throughput (batch inference)
* $$B$$: backward-pass throughput (if applicable)
* $$I$$: network responsiveness / bandwidth
* $$w\_S, w\_F, w\_B, w\_I$$: configurable weights

All performance metrics are **min-max normalized**:

$$
x \leftarrow \frac{x - x\_{\text{min}}}{x\_{\text{max}} - x\_{\text{min}}}
$$
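The extended rule and the min-max normalization can be sketched together as follows. The $$\alpha$$, $$\beta$$, penalty, and reward constants are from this document; the dictionary-based structure for metrics, weights, and normalization bounds is an assumption made for illustration.

```python
ALPHA, BETA = 0.7, 0.3  # weights for correctness history vs performance
PEN, REW = 0.8, 1.01
R_MIN, R_MAX = 0.1, 10.0

def minmax(x: float, lo: float, hi: float) -> float:
    """Min-max normalize x into [0, 1]; degenerate bounds map to 0."""
    return 0.0 if hi == lo else (x - lo) / (hi - lo)

def update_reputation_bidding(r: float, mistake: bool,
                              perf: dict, weights: dict,
                              bounds: dict) -> float:
    """Bottom-up update: blend the correctness term with a weighted,
    normalized performance score over the metrics S, F, B, I."""
    m = 1 if mistake else 0
    correctness = r * (PEN ** m) * (REW ** (1 - m))
    perf_score = sum(weights[k] * minmax(perf[k], *bounds[k])
                     for k in weights)
    r_new = ALPHA * correctness + BETA * perf_score
    return min(max(r_new, R_MIN), R_MAX)
```

With equal weights of 0.25 and mid-range performance metrics, a successful request from a miner at $$R = 1$$ yields $$0.7 \cdot 1.01 + 0.3 \cdot 0.5 = 0.857$$, showing how the bounded performance term tempers, but cannot overwhelm, the correctness history.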

#### Design Rationale

* Correctness remains the dominant signal
* Performance differentiates miners with similar accuracy
* Hardware-only advantages cannot overwhelm correctness
* Low-reputation miners can still recover over time

***

### Efficiency Metric $$S$$

Efficiency is measured per request as:

$$
S\_{\text{raw}} = \frac{\text{input\_size} + \text{output\_size}}{\text{actual\_time}}
$$

Normalization bounds:

* Computed **per model**, not per miner
* Window size: last 100 requests for that model
* Strict min/max bounds (percentile-based bounds under evaluation)

A rolling average is maintained per miner:

$$
S\_{\text{rolling}} \leftarrow 0.9 \cdot S\_{\text{rolling}} + 0.1 \cdot S
$$
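The per-request efficiency measurement and its rolling average can be sketched as below; the 0.9/0.1 decay comes from the formula above, while the function names are illustrative.

```python
def raw_efficiency(input_size: float, output_size: float,
                   actual_time: float) -> float:
    """S_raw = (input_size + output_size) / actual_time."""
    return (input_size + output_size) / actual_time

def update_rolling_efficiency(s_rolling: float, s: float) -> float:
    """Exponential moving average: keep 90% of history, add 10% of the
    latest normalized efficiency sample."""
    return 0.9 * s_rolling + 0.1 * s
```

The heavy weighting toward history smooths out per-request noise, so a single slow request barely moves a miner's efficiency estimate.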

***

### Timeouts and Mistakes

A request is considered a **mistake** ($$M = 1$$) if:

* Execution time exceeds $$1.5 \times$$ expected time
* No response within $$2 \times$$ expected time

Expected time accounts for:

* Average execution time for the model
* Input and output size scaling

Non-responses incur a **stronger penalty multiplier** than slow responses.
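A sketch of the mistake classification implied by the thresholds above, assuming three outcome labels; the function name, return values, and exact boundary handling are illustrative.

```python
def classify_request(actual_time: float, expected_time: float,
                     responded: bool) -> str:
    """Classify a request using the 1.5x and 2x expected-time thresholds.
    Both 'timeout' and 'non_response' set the mistake indicator M = 1;
    non-responses additionally incur a stronger penalty multiplier."""
    if not responded or actual_time >= 2.0 * expected_time:
        return "non_response"
    if actual_time > 1.5 * expected_time:
        return "timeout"
    return "success"  # M = 0
```

Expected time itself would be derived from the model's average execution time, scaled by input and output size, before being passed in here.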

***

### Rewards and Penalties

#### Rewards (Successful Requests)

$$
\text{reward} = \text{base\_reward} \cdot R^{\gamma}
$$

* $$\gamma = 1.2$$ (under evaluation)
* Encourages long-term reliability

#### Penalties (Failures)

$$
\text{penalty} = \frac{\text{base\_penalty}}{R^{\delta}}
$$

* $$\delta = 0.5$$
* Prevents overly harsh punishment of high-reputation miners
* Still discourages repeated failures
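The two reputation-scaled formulas above can be written directly; the exponents $$\gamma = 1.2$$ and $$\delta = 0.5$$ are from this document, while `base_reward` and `base_penalty` are left as parameters since their values are not specified.

```python
GAMMA = 1.2   # reward exponent (under evaluation)
DELTA = 0.5   # penalty exponent

def reward(base_reward: float, r: float) -> float:
    """reward = base_reward * R^gamma: scales superlinearly with reputation."""
    return base_reward * (r ** GAMMA)

def penalty(base_penalty: float, r: float) -> float:
    """penalty = base_penalty / R^delta: dampened for high-reputation miners."""
    return base_penalty / (r ** DELTA)
```

Note the asymmetry: a miner at $$R = 4$$ earns rewards scaled by $$4^{1.2} \approx 5.28$$ but pays only half the base penalty ($$4^{-0.5} = 0.5$$), which rewards long-term reliability without making a single failure ruinous.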

***

### Reputation Bounds and Recovery

* Reputation is clamped to $$[0.1, 10]$$
* A small **catch-up factor** helps low-reputation miners recover:
  * Encourages re-entry after transient issues
  * Prevents permanent exclusion
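One way the clamping and catch-up behavior might be combined; the $$[0.1, 10]$$ bounds are from this document, while the catch-up multiplier value and the threshold of $$R < 1$$ are hypothetical, since the document does not specify them.

```python
R_MIN, R_MAX = 0.1, 10.0
CATCHUP = 1.005  # hypothetical small boost for low-reputation miners

def apply_bounds(r: float) -> float:
    """Clamp reputation to [R_MIN, R_MAX], nudging sub-baseline miners
    slightly upward so transient failures do not become permanent exclusion."""
    if r < 1.0:
        r *= CATCHUP
    return min(max(r, R_MIN), R_MAX)
```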

***

### Empirical Observations

From large-scale inference traces:

* Total response time is strongly correlated with **model loading time**
* Inference time alone correlates weakly with model size
* Performance distributions are heavy-tailed and zero-inflated

These observations motivate:

* Normalization by model size
* Separate accounting of loading vs inference time
* Careful tuning of performance weights

***

### Relationship to Miner Selection

Reputation directly affects:

* Miner tier assignment
* Selection probability in routing
* Reputation-based rewards (RBR)
* Fallback and retry ordering

The system favors **reliable miners**, but does not permanently exclude others.
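As a minimal sketch of reputation-weighted selection consistent with these properties, one could sample miners with probability proportional to reputation; the actual routing logic (tiers, fallback ordering) is more involved, and this function is purely illustrative.

```python
import random

def select_miner(miners: list, reputations: list,
                 rng: random.Random = random) -> str:
    """Pick one miner with probability proportional to its reputation.
    Because reputation is clamped to [0.1, 10], every miner retains a
    nonzero chance of selection, so none is permanently excluded."""
    return rng.choices(miners, weights=reputations, k=1)[0]
```

A miner at the $$R = 10$$ cap is selected 100 times more often than one at the $$R = 0.1$$ floor, but the floor keeps the door open for recovery.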

***

### Summary

The miner reputation system in Nesa:

* Combines correctness and efficiency
* Adapts across models and modalities
* Supports decentralized participation
* Enables recovery from transient failures
* Remains extensible to future validation signals

This framework provides a stable foundation for **trust-aware, incentive-aligned decentralized inference**.
