HSS-EE: Secure Two-Party Inference at Scale
HSS-EE (Homomorphic Secret Sharing over Encrypted Embeddings) is Nesa's two-party inference protocol that provides stronger privacy than standard encrypted inference. Built on top of Equivariant Encryption (EE), HSS-EE ensures that no single server ever sees the full user input, not even in encrypted form.
This makes HSS-EE ideal for use cases with high confidentiality demands, such as medical inference, compliance-constrained applications, or decentralized infrastructure with minimal trust assumptions.
🔐 Core Idea: Additive Sharing + Encrypted Embeddings
HSS-EE splits an encrypted user input into two additive shares, sending each to a separate server. Both servers run the same EE-compatible model but over different shares:
Neither server can reconstruct the original input
Activations and outputs remain secret-shared end-to-end
The user alone combines results to recover the output
This achieves information-theoretic security under the non-collusion assumption: the input remains hidden even if one server is fully compromised.
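To make the sharing step concrete, here is a minimal sketch of the additive-sharing primitive, assuming values are quantized into a finite ring ℤ_Q; the modulus, function names, and NumPy representation are illustrative assumptions, not Nesa's actual API:

```python
# Minimal sketch of additive secret sharing over a finite ring Z_Q.
# The modulus, names, and NumPy representation are illustrative assumptions,
# not Nesa's actual API.
import numpy as np

Q = 2**32  # assumed ring modulus for quantized embedding values

def split(x: np.ndarray, rng: np.random.Generator):
    """Split x into two shares with x = (x1 + x2) mod Q."""
    x1 = rng.integers(0, Q, size=x.shape, dtype=np.uint64)  # uniform random mask
    # uint64 subtraction wraps mod 2**64; since Q divides 2**64, % Q is exact
    x2 = (x - x1) % Q
    return x1, x2

def reconstruct(s1: np.ndarray, s2: np.ndarray) -> np.ndarray:
    """Recombine shares: only someone holding both learns anything."""
    return (s1 + s2) % Q

rng = np.random.default_rng()
x = np.array([7, 11, 13, 17], dtype=np.uint64)  # stand-in encrypted embedding
x1, x2 = split(x, rng)
assert np.array_equal(reconstruct(x1, x2), x)
```

Because x₁ is a uniform random mask and x₂ is the input minus that mask, each share in isolation is uniformly distributed and reveals nothing about x.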
🛠️ How HSS-EE Works
Preprocessing (Client-side):
The user embeds their input locally (e.g., token embeddings or image patches)
Applies Equivariant Encryption (EE)
Splits the encrypted vector into two additive shares (sketched below):
x = x₁ + x₂
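A sketch of this client-side pipeline, with a hypothetical `ee_encrypt` placeholder standing in for the EE step and a toy embedding table (both are assumptions for illustration, not Nesa's API):

```python
# Client-side preprocessing sketch. `ee_encrypt` is a hypothetical placeholder
# for the Equivariant Encryption step; the embedding table is a toy stand-in.
import numpy as np

Q = 2**32
rng = np.random.default_rng()
embed_table = rng.integers(0, Q, size=(32_000, 8), dtype=np.uint64)  # toy vocab

def ee_encrypt(v: np.ndarray) -> np.ndarray:
    # Placeholder: real EE applies a structure-preserving encryption transform.
    return v

token_ids = np.array([5, 42])
x = ee_encrypt(embed_table[token_ids])                  # 1. embed  2. encrypt
x1 = rng.integers(0, Q, size=x.shape, dtype=np.uint64)  # 3. share for Server A
x2 = (x - x1) % Q                                       #    share for Server B
```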
Distributed Inference (Server-side):
Server A receives x₁, Server B receives x₂
Each server runs the same encrypted model over its share
Intermediate activations remain encrypted and secret-shared (see the linearity sketch below)
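Per-share execution works because linear layers distribute over additive shares. A toy check, using modular integer matrix multiplication as a stand-in for the model's linear algebra (handling nonlinear layers is where EE's structure comes in, and is out of scope for this sketch):

```python
# Linear layers distribute over additive shares:
# W @ x1 + W @ x2 == W @ x (mod Q), so each server can apply the same
# weights to its share independently.
import numpy as np

Q = 2**32  # uint64 arithmetic wraps mod 2**64; Q divides 2**64, so % Q is exact
rng = np.random.default_rng(1)
W = rng.integers(0, Q, size=(3, 4), dtype=np.uint64)  # the shared model weights
x = rng.integers(0, Q, size=4, dtype=np.uint64)       # full (encrypted) input

x1 = rng.integers(0, Q, size=4, dtype=np.uint64)      # Server A's share
x2 = (x - x1) % Q                                     # Server B's share

y1 = (W @ x1) % Q  # computed by Server A, which never sees x
y2 = (W @ x2) % Q  # computed by Server B, which never sees x
assert np.array_equal((y1 + y2) % Q, (W @ x) % Q)  # shares sum to the true output
```

The final assertion is exactly the reconstruction step described next: the client adds the result shares to recover the true output.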
Result Reconstruction (Client-side):
Final results are returned as shares
The user reconstructs the final output locally:
y = y₁ + y₂

🧠 Why HSS-EE?
| Property | EE (single server) | HSS-EE (two servers) |
| --- | --- | --- |
| Protects input from server | ✅ | ✅ |
| No server sees full encrypted input | ❌ | ✅ |
| Collusion resistance | ❌ (single-server trust) | ✅ (two-party model) |
| GPU compatible | ✅ | ✅ |
| Applicable to large models | ✅ | ✅ |
HSS-EE is particularly valuable in settings where:
No single party should have full visibility
Regulators require infrastructure separation (e.g., EU + US)
Inference is hosted across multi-party environments (e.g., DAO + enterprise)
📊 Performance Benchmarks
All results were measured on A100 GPU nodes communicating over gRPC, with CUDA kernel acceleration.
| Model | Latency | Throughput | Task |
| --- | --- | --- | --- |
| LLaMA-2 7B | ~700–850 ms | 3–5 QPS | Sequence generation (1 token) |
| ResNet-50 | ~400 ms | 10–12 QPS | Image classification |
| T5-small | ~540 ms | 6 QPS | Text generation, decoder-heavy |
⚡ Compared to Equivariant Encryption (EE) alone, HSS-EE adds roughly 1.5× latency, the overhead of coordinating two servers (the share computations themselves run in parallel).
⚡ Compared to MPC- or HE-based protocols offering similar security guarantees, HSS-EE remains 2–3× faster.
🛡️ Threat Model
HSS-EE assumes:
At most one server is compromised (non-collusion assumption)
User-side preprocessing is trusted (input embedding and splitting)
Transport is secure (e.g., TLS or VPN)
Even if one server is malicious:
It sees only a random-looking vector (a share)
It cannot reconstruct the plaintext input
It cannot reverse-engineer intermediate states
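A toy check of that first claim, with a small modulus for readability: the share a server receives is uniformly distributed whatever the input, so its view is statistically independent of x:

```python
# The share a server sees is uniform on Z_Q regardless of the input, so its
# view is statistically independent of x. Small modulus only for readability.
import numpy as np

Q = 256
rng = np.random.default_rng(0)

def server_view(x: int, n: int = 200_000) -> np.ndarray:
    """Empirical distribution of Server B's share for a fixed input x."""
    x1 = rng.integers(0, Q, size=n)       # Server A's uniform mask
    return np.bincount((x - x1) % Q, minlength=Q) / n

h0, h1 = server_view(0), server_view(123)
print(abs(h0 - 1 / Q).max(), abs(h1 - 1 / Q).max())  # both ~0: uniform either way
```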
HSS-EE raises the security bar while keeping inference scalable and GPU-compatible.
🧠 Summary
HSS-EE enables secure, two-server inference over encrypted inputs with near-native performance:
Zero TEE requirement
No single point of failure
End-to-end secret-shared encrypted inference
GPU-native implementation
HSS-EE is the cryptographic backbone for strong privacy in decentralized inference, particularly when users demand split trust.