HSS-EE: Secure Two-Party Inference at Scale

HSS-EE (Homomorphic Secret Sharing over Encrypted Embeddings) is Nesa’s two-party inference protocol that provides stronger privacy than standard encrypted inference. Built on top of Equivariant Encryption (EE), HSS-EE ensures that no single server ever sees the full user input, not even in encrypted form.

This makes HSS-EE ideal for use cases with high confidentiality demands, such as medical inference, compliance-constrained applications, or decentralized infrastructure with minimal trust assumptions.


šŸ”‘ Core Idea: Additive Sharing + Encrypted Embeddings

HSS-EE splits an encrypted user input into two additive shares, sending each to a separate server. Both servers run the same EE-compatible model but over different shares:

  • Neither server can reconstruct the original input

  • Activations and outputs remain secret-shared end-to-end

  • The user alone combines results to recover the output

This achieves information-theoretic security under non-collusion — even if one server is fully compromised.
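A minimal sketch of the splitting and recombination, assuming arithmetic modulo a fixed value Q; the modulus, the use of NumPy, and all function names here are illustrative assumptions, not Nesa's API:

```python
import numpy as np

Q = 2**16  # illustrative modulus; the protocol's actual arithmetic domain is not specified here

def split(x: np.ndarray, rng: np.random.Generator):
    """Split a vector x into two additive shares with x = (x1 + x2) mod Q."""
    x1 = rng.integers(0, Q, size=x.shape, dtype=np.int64)  # uniformly random mask
    x2 = (x - x1) % Q                                      # complementary share
    return x1, x2

def combine(y1: np.ndarray, y2: np.ndarray) -> np.ndarray:
    """Recombine two additive shares into the original vector."""
    return (y1 + y2) % Q

rng = np.random.default_rng()
x = rng.integers(0, Q, size=8, dtype=np.int64)  # stand-in for an EE-encrypted embedding
x1, x2 = split(x, rng)
assert np.array_equal(combine(x1, x2), x)  # only the holder of both shares recovers x
```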


šŸ› ļø How HSS-EE Works

  1. Preprocessing (Client-side):

    • The user embeds their input locally (e.g., token embeddings or image patches)

    • Applies Equivariant Encryption (EE)

    • Splits the encrypted vector into two additive shares: x = x₁ + xā‚‚

  2. Distributed Inference (Server-side):

    • Server A receives x₁, Server B receives xā‚‚

    • Each server runs the same encrypted model over its share

    • Intermediate activations remain encrypted and secret-shared

  3. Result Reconstruction (Client-side):

    • Final results are returned as shares

    • The user reconstructs the final output locally: y = y₁ + yā‚‚ (see the sketch below)
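The reconstruction in step 3 works because additive shares commute with linear operations: if both servers apply the same weight matrix W, then W·x₁ + W·xā‚‚ = W·(x₁ + xā‚‚). A toy sketch with plain integer shares (no EE layer; all names and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(-3, 4, size=(4, 8))   # toy weight matrix, identical on both servers
x = rng.integers(-5, 6, size=8)        # stand-in for the client's encrypted embedding

# Client side: split the input into two additive shares
x1 = rng.integers(-100, 101, size=8)
x2 = x - x1

# Server side: each party applies the same linear layer to its share only
y1 = W @ x1   # computed by Server A
y2 = W @ x2   # computed by Server B

# Client side: summing the returned shares yields the true output
assert np.array_equal(y1 + y2, W @ x)
```

Nonlinear layers require additional protocol machinery that this toy omits; the sketch only shows why the linear pieces of per-share inference recombine correctly.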

(Figure: high-level diagram of the HSS-EE workflow and broadcasting schedule.)

🧠 Why HSS-EE?

| Feature | EE | HSS-EE |
| --- | --- | --- |
| Protects input from server | āœ… | āœ… |
| No server sees full encrypted input | āŒ | āœ… |
| Collusion resistance | āŒ (single-server trust) | āœ… (two-party model) |
| GPU compatible | āœ… | āœ… |
| Applicable to large models | āœ… | āœ… |

HSS-EE is particularly valuable in settings where:

  • No single party should have full visibility

  • Regulators require infrastructure separation (e.g., EU + US)

  • Inference is hosted across multi-party environments (e.g., DAO + enterprise)


šŸ“Š Performance Benchmarks

All results are measured on A100 nodes over gRPC with CUDA kernel acceleration.

| Model | Latency (batch=1) | Throughput (QPS) | Notes |
| --- | --- | --- | --- |
| LLaMA-2 7B | ~700–850 ms | 3–5 | Sequence generation (1 token) |
| ResNet-50 | ~400 ms | 10–12 | Image classification |
| T5-small | ~540 ms | 6 | Text generation, decoder-heavy |

āž” Compared to Equivariant Encryption (EE) alone, HSS-EE adds roughly 1.5Ɨ latency: the two servers run in parallel, but share handling and coordination add overhead.

āž” For comparable security guarantees, HSS-EE remains 2–3Ɨ faster than MPC- or HE-based protocols.


šŸ›”ļø Threat Model

HSS-EE assumes:

  • At most one server is compromised (non-collusion assumption)

  • User-side preprocessing is trusted (input embedding and splitting)

  • Transport is secure (e.g., TLS or VPN)

Even if one server is malicious:

  • It sees only a random-looking vector (a share), as the check below illustrates

  • It cannot reconstruct the plaintext input

  • It cannot reverse-engineer intermediate states
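The first two properties follow directly from how shares are sampled: x₁ is uniform by construction, and xā‚‚ = (x āˆ’ x₁) mod q is therefore uniform as well, independent of x. A quick empirical check of that claim (the small modulus and sample count are illustrative only):

```python
import numpy as np

Q = 251  # small illustrative modulus so the counts are easy to eyeball
rng = np.random.default_rng(1)

def server_b_share(x: int) -> int:
    """Return the share Server B would receive for a fixed secret x."""
    x1 = int(rng.integers(0, Q))  # uniformly random mask sent to Server A
    return (x - x1) % Q

# Server B's view for two very different secrets: both distributions are uniform,
# so the share it holds leaks nothing about the underlying input.
view_of_0   = [server_b_share(0)   for _ in range(100_000)]
view_of_123 = [server_b_share(123) for _ in range(100_000)]
print(np.histogram(view_of_0,   bins=10, range=(0, Q))[0])  # roughly flat counts
print(np.histogram(view_of_123, bins=10, range=(0, Q))[0])  # roughly flat, indistinguishable
```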

HSS-EE raises the security bar while keeping inference scalable and GPU-compatible.


🧭 Summary

HSS-EE enables secure, two-server inference over encrypted inputs with near-native performance:

  • Zero TEE requirement

  • No single point of failure

  • End-to-end secret-shared encrypted inference

  • GPU-native implementation

HSS-EE is the cryptographic backbone for strong privacy in decentralized inference, particularly when users demand split trust.
