# Scalable Private Inference with HSS-EE

**HSS-EE (Homomorphic Secret Sharing over Encrypted Embeddings)** is Nesa’s two-party inference protocol that provides stronger privacy than standard encrypted inference. Built on top of [Equivariant Encryption (EE)](https://docs.nesa.ai/nesa/major-innovations/private-inference-for-ai/practical-approach-equivariant-encryption-ee), HSS-EE ensures that **no single server ever sees the full user input, not even in encrypted form**.

This makes HSS-EE ideal for use cases with **high confidentiality demands**, such as medical inference, compliance-constrained applications, or decentralized infrastructure with minimal trust assumptions.

***

### 🔑 Core Idea: Additive Sharing + Encrypted Embeddings

HSS-EE splits an encrypted user input into **two additive shares**, sending each to a separate server. Both servers run the same EE-compatible model but over different shares:

* Neither server can reconstruct the original input
* Activations and outputs remain secret-shared end-to-end
* The user alone combines results to recover the output

This provides **information-theoretic security under non-collusion**: even if one server is fully compromised, it learns nothing about the input as long as the other server stays honest.
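
A minimal sketch of the splitting and recombination steps, assuming the EE-encrypted embedding has already been quantized to integers modulo a fixed modulus (the modulus and function names below are illustrative, not Nesa's production parameters):

```python
import numpy as np

MODULUS = 2**61 - 1  # illustrative prime modulus, not Nesa's actual parameter

def split_shares(x: np.ndarray, modulus: int = MODULUS) -> tuple[np.ndarray, np.ndarray]:
    """Split an (already EE-encrypted, integer-quantized) vector into two additive shares.

    x1 is sampled uniformly at random, so by itself it carries no information about x;
    x2 is chosen so that x1 + x2 == x (mod modulus).
    """
    rng = np.random.default_rng()
    x1 = rng.integers(0, modulus, size=x.shape, dtype=np.int64)
    x2 = (x - x1) % modulus
    return x1, x2

def reconstruct(y1: np.ndarray, y2: np.ndarray, modulus: int = MODULUS) -> np.ndarray:
    """Recombine the two result shares returned by the servers."""
    return (y1 + y2) % modulus
```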

***

### 🛠️ How HSS-EE Works

1. **Preprocessing (Client-side):**
   * The user embeds their input locally (e.g., token embeddings or image patches)
   * Applies Equivariant Encryption (EE)
   * Splits the encrypted vector into two additive shares: `x = x₁ + x₂`
2. **Distributed Inference (Server-side):**
   * Server A receives `x₁`, Server B receives `x₂`
   * Each server runs the same encrypted model over its share
   * Intermediate activations remain encrypted and secret-shared
3. **Result Reconstruction (Client-side):**
   * Final results are returned as shares
   * The user reconstructs the final output locally: `y = y₁ + y₂`

<figure><img src="https://3903893560-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVtjgh8wLtiRmdt9OTX2C%2Fuploads%2FppFkBRoLETXgQa2tE0Nl%2Fimage.png?alt=media&#x26;token=4ddc616d-7982-4023-b70e-a0f33801e3a7" alt=""><figcaption><p>High-level diagram of the HSS-EE workflow and broadcasting schedule</p></figcaption></figure>
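
The following sketch walks through the three steps above for a single linear layer, under the same illustrative assumptions as before (toy modulus, integer-quantized vectors). How EE handles nonlinear layers is abstracted away; the point shown here is that additive shares pass through linear operations directly, so each server can compute on its share alone and only the client can recombine the result.

```python
import numpy as np

Q = 65537  # small illustrative prime so plain int64 arithmetic never overflows
rng = np.random.default_rng(0)

# --- Client side (step 1): `x` stands in for the already-embedded,
# --- EE-encrypted, integer-quantized input vector.
x = rng.integers(0, Q, size=8, dtype=np.int64)
x1 = rng.integers(0, Q, size=x.shape, dtype=np.int64)  # uniformly random share
x2 = (x - x1) % Q                                      # so that x = x1 + x2 (mod Q)

# --- Server side (step 2): both servers hold the same EE-compatible weights.
W = rng.integers(0, Q, size=(4, 8), dtype=np.int64)

def server_forward(share: np.ndarray) -> np.ndarray:
    """One server evaluates the linear layer on its share alone.

    Linearity is what makes this work: W @ x1 + W @ x2 == W @ (x1 + x2) (mod Q).
    """
    return (W @ share) % Q

y1 = server_forward(x1)  # Server A never sees x2 (or x)
y2 = server_forward(x2)  # Server B never sees x1 (or x)

# --- Client side (step 3): only the user, holding both result shares,
# --- can reconstruct the output.
y = (y1 + y2) % Q
assert np.array_equal(y, (W @ x) % Q)
```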

***

### 🧠 Why HSS-EE?

| Feature                             | EE                      | HSS-EE              |
| ----------------------------------- | ----------------------- | ------------------- |
| Protects input from server          | ✅                       | ✅                   |
| No server sees full encrypted input | ❌                       | ✅                   |
| Collusion-resistance                | ❌ (single-server trust) | ✅ (two-party model) |
| GPU compatible                      | ✅                       | ✅                   |
| Applicable to large models          | ✅                       | ✅                   |

HSS-EE is particularly valuable in settings where:

* No single party should have full visibility
* Regulators require infrastructure separation (e.g., EU + US)
* Inference is hosted across multi-party environments (e.g., DAO + enterprise)

***

### 📊 Performance Benchmarks

All results were measured on A100 GPU nodes over gRPC with CUDA kernel acceleration.

| Model      | Latency (batch=1) | Throughput (QPS) | Notes                          |
| ---------- | ----------------- | ---------------- | ------------------------------ |
| LLaMA-2 7B | \~700–850 ms      | 3–5 QPS          | Sequence generation (1-token)  |
| ResNet-50  | \~400 ms          | 10–12 QPS        | Image classification           |
| T5-small   | \~540 ms          | 6 QPS            | Text generation, decoder-heavy |

➡ Compared to Equivariant Encryption (EE) alone, HSS-EE adds roughly 1.5× latency from the two-server protocol overhead.\
➡ Compared to MPC- or HE-based protocols offering similar security guarantees, HSS-EE remains **2–3× faster**.

***

### 🛡️ Threat Model

HSS-EE assumes:

* **At most one server is compromised** (non-collusion assumption)
* User-side preprocessing is trusted (input embedding and splitting)
* Transport is secure (e.g., TLS or VPN)

Even if one server is malicious:

* It sees only a random-looking vector (a share)
* It cannot reconstruct the plaintext input
* It cannot reverse-engineer intermediate states
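
A toy illustration of the first point (the tiny modulus is purely for readability): whatever the secret is, the share a single server receives is uniformly distributed, so its view is statistically independent of the user's input.

```python
import numpy as np
from collections import Counter

Q = 17  # tiny illustrative modulus so the distribution is easy to inspect
rng = np.random.default_rng()

def share_of_server_b(secret: int) -> int:
    """Return the share Server B would see for a given secret value in Z_Q."""
    x1 = int(rng.integers(0, Q))   # Server A's share: uniformly random
    return (secret - x1) % Q       # Server B's share

# Two very different secrets produce statistically identical share distributions:
# roughly uniform over Z_Q, each value appearing with probability ~1/17 ≈ 0.059.
for secret in (3, 12):
    counts = Counter(share_of_server_b(secret) for _ in range(100_000))
    freqs = [counts[v] / 100_000 for v in range(Q)]
    print(f"secret={secret:2d}  min freq={min(freqs):.3f}, max freq={max(freqs):.3f}")
```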

> HSS-EE raises the security bar while keeping inference scalable and GPU-compatible.

***

### 🧭 Summary

**HSS-EE enables secure, two-server inference over encrypted inputs** with near-native performance:

* Zero TEE requirement
* No single point of failure
* End-to-end secret-shared encrypted inference
* GPU-native implementation

HSS-EE is the **cryptographic backbone** for strong privacy in decentralized inference, particularly when users demand **split trust**.
