HSS-EE: Secure Two-Party Inference at Scale

HSS-EE (Homomorphic Secret Sharing over Encrypted Embeddings) is Nesa’s two-party inference protocol that provides stronger privacy than standard encrypted inference. Built on top of Equivariant Encryption (EE), HSS-EE ensures that no single server ever sees the full user input, not even in encrypted form.

This makes HSS-EE ideal for use cases with high confidentiality demands, such as medical inference, compliance-constrained applications, or decentralized infrastructure with minimal trust assumptions.


šŸ”‘ Core Idea: Additive Sharing + Encrypted Embeddings

HSS-EE splits an encrypted user input into two additive shares, sending each to a separate server. Both servers run the same EE-compatible model but over different shares:

  • Neither server can reconstruct the original input

  • Activations and outputs remain secret-shared end-to-end

  • The user alone combines results to recover the output

This achieves information-theoretic security under non-collusion — even if one server is fully compromised.
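A minimal sketch of the splitting and recombination, assuming arithmetic modulo a fixed value Q; the modulus, the use of NumPy, and all function names here are illustrative assumptions, not Nesa's API:

```python
import numpy as np

Q = 2**16  # illustrative modulus; the protocol's actual arithmetic domain is not specified here

def split(x: np.ndarray, rng: np.random.Generator):
    """Split a vector x into two additive shares with x = (x1 + x2) mod Q."""
    x1 = rng.integers(0, Q, size=x.shape, dtype=np.int64)  # uniformly random mask
    x2 = (x - x1) % Q                                      # complementary share
    return x1, x2

def combine(y1: np.ndarray, y2: np.ndarray) -> np.ndarray:
    """Recombine two additive shares into the original vector."""
    return (y1 + y2) % Q

rng = np.random.default_rng()
x = rng.integers(0, Q, size=8, dtype=np.int64)  # stand-in for an EE-encrypted embedding
x1, x2 = split(x, rng)
assert np.array_equal(combine(x1, x2), x)  # only the holder of both shares recovers x
```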


šŸ› ļø How HSS-EE Works

  1. Preprocessing (Client-side):

    • The user embeds their input locally (e.g., token embeddings or image patches)

    • Applies Equivariant Encryption (EE)

    • Splits the encrypted vector into two additive shares: x = x₁ + xā‚‚

  2. Distributed Inference (Server-side):

    • Server A receives x₁, Server B receives xā‚‚

    • Each server runs the same encrypted model over its share

    • Intermediate activations remain encrypted and secret-shared

  3. Result Reconstruction (Client-side):

    • Final results are returned as shares

    • The user reconstructs the final output locally: y = y₁ + yā‚‚ (see the sketch below)
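The reconstruction in step 3 works because additive shares commute with linear operations: if both servers apply the same weight matrix W, then W·x₁ + W·xā‚‚ = W·(x₁ + xā‚‚). A toy sketch with plain integer shares (no EE layer; all names and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(-3, 4, size=(4, 8))   # toy weight matrix, identical on both servers
x = rng.integers(-5, 6, size=8)        # stand-in for the client's encrypted embedding

# Client side: split the input into two additive shares
x1 = rng.integers(-100, 101, size=8)
x2 = x - x1

# Server side: each party applies the same linear layer to its share only
y1 = W @ x1   # computed by Server A
y2 = W @ x2   # computed by Server B

# Client side: summing the returned shares yields the true output
assert np.array_equal(y1 + y2, W @ x)
```

Nonlinear layers require additional protocol machinery that this toy omits; the sketch only shows why the linear pieces of per-share inference recombine correctly.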

(Figure: high-level diagram of the HSS-EE workflow and broadcasting schedule.)

🧠 Why HSS-EE?

| Feature | EE | HSS-EE |
| --- | --- | --- |
| Protects input from server | āœ… | āœ… |
| No server sees full encrypted input | āŒ | āœ… |
| Collusion resistance | āŒ (single-server trust) | āœ… (two-party model) |
| GPU compatible | āœ… | āœ… |
| Applicable to large models | āœ… | āœ… |

HSS-EE is particularly valuable in settings where:

  • No single party should have full visibility

  • Regulators require infrastructure separation (e.g., EU + US)

  • Inference is hosted across multi-party environments (e.g., DAO + enterprise)


šŸ“Š Performance Benchmarks

All results are measured on A100 nodes over gRPC with CUDA kernel acceleration.

| Model | Latency (batch=1) | Throughput (QPS) | Notes |
| --- | --- | --- | --- |
| LLaMA-2 7B | ~700–850 ms | 3–5 | Sequence generation (1 token) |
| ResNet-50 | ~400 ms | 10–12 | Image classification |
| T5-small | ~540 ms | 6 | Text generation, decoder-heavy |

āž” Compared to Equivariant Encryption (EE) alone, HSS-EE adds roughly 1.5Ɨ latency: the two servers run in parallel, but share handling and coordination add overhead.

āž” For comparable security guarantees, HSS-EE remains 2–3Ɨ faster than MPC- or HE-based protocols.


šŸ›”ļø Threat Model

HSS-EE assumes:

  • At most one server is compromised (non-collusion assumption)

  • User-side preprocessing is trusted (input embedding and splitting)

  • Transport is secure (e.g., TLS or VPN)

Even if one server is malicious:

  • It sees only a random-looking vector (a share), as the check below illustrates

  • It cannot reconstruct the plaintext input

  • It cannot reverse-engineer intermediate states
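The first two properties follow directly from how shares are sampled: x₁ is uniform by construction, and xā‚‚ = (x āˆ’ x₁) mod q is therefore uniform as well, independent of x. A quick empirical check of that claim (the small modulus and sample count are illustrative only):

```python
import numpy as np

Q = 251  # small illustrative modulus so the counts are easy to eyeball
rng = np.random.default_rng(1)

def server_b_share(x: int) -> int:
    """Return the share Server B would receive for a fixed secret x."""
    x1 = int(rng.integers(0, Q))  # uniformly random mask sent to Server A
    return (x - x1) % Q

# Server B's view for two very different secrets: both distributions are uniform,
# so the share it holds leaks nothing about the underlying input.
view_of_0   = [server_b_share(0)   for _ in range(100_000)]
view_of_123 = [server_b_share(123) for _ in range(100_000)]
print(np.histogram(view_of_0,   bins=10, range=(0, Q))[0])  # roughly flat counts
print(np.histogram(view_of_123, bins=10, range=(0, Q))[0])  # roughly flat, indistinguishable
```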

HSS-EE raises the security bar while keeping inference scalable and GPU-compatible.


🧭 Summary

HSS-EE enables secure, two-server inference over encrypted inputs with near-native performance:

  • Zero TEE requirement

  • No single point of failure

  • End-to-end secret-shared encrypted inference

  • GPU-native implementation

HSS-EE is the cryptographic backbone for strong privacy in decentralized inference, particularly when users demand split trust.
