Skip to content

PSI

PSI lets two parties learn which elements they have in common without revealing the rest. Conceptually:

\[ \text{PSI}(A, B) = A \cap B \]

In practice, the AP3 PSI variant returns a result scoped to the initiator's question — typically "is element X in B?" (a boolean) or "how many of my elements are in B?" (a count) — without ever exposing the contents of \(B\) to the initiator or the contents of \(A\) to the receiver.

Interactive explanation

Why does the B stamp survive when A is peeled off?

The wrapping/stamping picture works beautifully for the journey but cracks at this exact moment. The fix is a small upgrade to the metaphor.

Imagine Provider's stamp B isn't a postage sticker laid on top of the wrapper. It's a steel die pressed firmly against the wrapped lump — pressed hard enough that it punches through wrapper A and leaves a permanent impression on the ball underneath. The wrapper is just a soft envelope. The stamp's mark lives on the ball itself.

The stamp pushes through the wrapper, therefore it leaves a push mark. Peel the wrapper off and the push mark is still there.

So when Consumer peels A away, the wrapper comes off clean — but the indentation B punched into the ball stays right where it was. And critically: it's the same indentation Provider would have left if it had stamped the bare ball directly. Which is exactly what Provider did to every ball in its own list. Same procedure on both sides → directly comparable outputs. That equality check at the end is only possible because of this.

Underneath the metaphor

The ball is a point \(P\) on an elliptic curve. \(A\) is Consumer's secret scalar \(a\); \(B\) is Provider's secret scalar \(b\). The protocol computes:

\[ \begin{aligned} \text{Consumer:} \quad & a \cdot P && \text{(blind)} \\ \text{Provider:} \quad & b \cdot (a \cdot P) = a \cdot b \cdot P && \text{(evaluate — scalar mul commutes)} \\ \text{Consumer:} \quad & a^{-1} \cdot (a \cdot b \cdot P) = b \cdot P && \text{(unblind)} \end{aligned} \]

The \(a\) and \(a^{-1}\) annihilate. \(b\) was never tangled with \(a\), so it survives untouched. What's left is \(b \cdot P\) — identical to what Provider produces on every ball in its own list (each is \(b \cdot P_i\)). Equality comparison works because both sides went through the same \(b\)-multiplication, on the same hash-to-curve output, and nothing else.

Without commutativity, peeling A would smear B in some recoverable way (\(a^{-1} \cdot b \cdot a \cdot P\) wouldn't simplify to anything useful) and the comparison would silently fail. Every OPRF in the wild — 2HashDH, the constructions in RFC 9497 — is built on a commutative group operation for exactly this reason.

How it runs (4 envelopes, end to end)

PSI exchanges four envelopes per session. The two messages from the initiator (OB) each carry their own signed PrivacyIntentDirective bound to that envelope's payload — every initiator→receiver message is independently authenticated.

# Direction Phase Payload What happens
1 OB → BB init commit(sid_0, blind) Session kick-off. OB picks sid_0 and a random blind value, then sends a hiding commitment to them. The signed intent rides here and authenticates OB.
2 BB → OB msg0 sid_1 BB picks its half of the session ID (sid_1) and sends it in the clear. BB has no way to see sid_0 yet, so it can't grind.
3 OB → BB msg1 sid_0 ‖ blind ‖ psc1 OB opens the commit (revealing sid_0 + blind), derives session_id = H(sid_0, sid_1), blinds its query, and sends psc_msg1. BB verifies the commit opens correctly. A fresh signed intent on this envelope binds the actual payload.
4 BB → OB msg2 psc2 BB runs its half of the cryptographic protocol against its private set, returns the response. BB learns nothing about OB's query.
(local) result OB processes msg2 locally and learns the answer. BB never learns the answer.

The contributory session_id (H(sid_0, sid_1) with a commit-then-reveal exchange) means neither party alone chooses the session_id: OB is locked into sid_0 by the commit before seeing sid_1, and BB has to pick sid_1 before seeing sid_0. This prevents either party from grinding for a session_id that produces a favorable transcript.

The full cryptographic details are encapsulated by the operation implementation; from the SDK side, you call PSIOperation.start(...) / .receive(...) / .process(...) and the framework drives the four envelopes for you (or use PrivacyAgent.run_intent(...) to drive the whole exchange end-to-end over A2A).

Use cases

  • Customer overlap analysis between companies that don't want to swap customer lists.
  • Fraudulent account detection across institutions, surfacing only the matches.
  • Supply chain partner verification — confirming a counterparty exists in an approved-supplier list without exposing the list.
  • Sanctions / blacklist screening without disclosing either side.

What PSI does not do

  • It does not reveal the receiver's full set to the initiator.
  • It does not reveal the initiator's query to the receiver.
  • It does not, by itself, prove that the receiver actually used the dataset it advertised in its commitment. That gap is what proof-of-computation work is intended to close — see the Distribution section and the Roadmap.

For a hands-on tutorial, see the psi_simple and psi_adk_simple examples in the main repository.

PSI - Performance Benchmarks

Environment: Python 3.11.9, arm64-darwin, pure-Python Ristretto255 via rbcl + merlin_transcripts.
Methodology: time.perf_counter wall-clock, 5-iteration warmup. Each row shows mean with p95 and min/max.


1. Functional layer

Direct calls into the pure-Python cryptographic primitives (Ristretto255 via rbcl, Merlin transcripts). This is the minimum cost for each operation.

Operation n Mean p95 Min Max
generate_hash() — customer 1000 0.30 µs 0.33 µs 0.25 µs 1.25 µs
generate_hash() — sanction 1000 0.30 µs 0.38 µs 0.25 µs 0.58 µs
compute_session_id() 1000 0.31 µs 0.33 µs 0.25 µs 11.21 µs
create_commitment() 1000 0.30 µs 0.33 µs 0.25 µs 7.42 µs
verify_commitment() 1000 0.34 µs 0.38 µs 0.29 µs 0.92 µs
create_psc_msg1() 500 655.40 µs 669.08 µs 628.71 µs 711.29 µs
process_psc_msg1() — 1 entries 100 2.62 ms 2.65 ms 2.49 ms 3.59 ms
process_psc_msg1() — 10 entries 100 16.86 ms 16.96 ms 16.56 ms 18.00 ms
process_psc_msg1() — 50 entries 100 80.23 ms 81.80 ms 79.82 ms 83.24 ms
process_psc_msg1() — 100 entries 100 159.49 ms 161.88 ms 158.62 ms 162.60 ms
process_psc_msg1() — 500 entries 50 794.31 ms 796.96 ms 790.05 ms 819.49 ms
process_psc_msg1() — 1,000 entries 20 1583.00 ms 1597.86 ms 1579.11 ms 1597.86 ms
process_psc_msg2() 1000 1.98 ms 2.01 ms 1.95 ms 3.15 ms

Notes:

  • generate_hash, compute_session_id, create_commitment, and verify_commitment are SHA-256 only — no curve operations.
  • create_psc_msg1 performs one h1_function (hash-to-curve via Merlin + from_hash) + one scalar multiply.
  • process_psc_msg1 is O(n) in sanction list size: per entry: one h1_function + one scalar multiply + one h2_function, plus one Schnorr DLog proof.
  • process_psc_msg2 is fixed cost: one DLog proof verification, one scalar invert, one h2_function, one O(n) hmac.compare_digest scan.

2. PSI Operation Layer

PSIOperation wraps the cryptographic core in the ap3.Operation contract (4-round contributory session ID, base64 serialisation, role/phase validation).

Operation n Mean p95 Min Max FFI baseline Overhead
on_start() — OB init 200 2.56 µs 2.71 µs 2.46 µs 4.50 µs 1.93 µs ~0.64 µs
_receiver_step(init) — BB, 50 entries 100 19.12 µs 19.33 µs 18.79 µs 21.96 µs 12.38 µs ~6.74 µs
_initiator_step(msg0) — OB creates msg1 200 660.09 µs 671.71 µs 653.04 µs 697.42 µs 657.67 µs ~2.42 µs
_receiver_step(msg1) — BB, 50 entries 100 80.34 ms 82.40 ms 79.79 ms 83.53 ms 80.57 ms ~0.00 µs
_initiator_step(msg2) — OB finalise 200 1.99 ms 2.01 ms 1.96 ms 2.96 ms 1.98 ms ~10.03 µs

Notes:

  • Operation-layer overhead (base64 encode/decode, phase dispatch, state dict ops) is small relative to the underlying crypto.
  • _receiver_step(init) hashes the entire sanction list upfront and caches results in session state, so _receiver_step(msg1) skips re-hashing.
  • The 4-round commitment exchange adds only cheap SHA-256 operations (create_commitment, verify_commitment, compute_session_id).

3. Full PSI Round-Trip

End-to-end protocol execution across all 4 rounds: on_start_receiver_step(init)_initiator_step(msg0)_receiver_step(msg1)_initiator_step(msg2). Single-threaded sequential execution; no network overhead.

Sanction list size n Mean p95 Min Max Throughput
1 100 5.29 ms 5.33 ms 5.26 ms 5.37 ms ~189 checks/s
10 100 19.62 ms 19.80 ms 19.51 ms 21.06 ms ~51 checks/s
50 100 82.90 ms 84.56 ms 82.29 ms 85.86 ms ~12 checks/s
100 50 161.59 ms 164.06 ms 160.76 ms 164.57 ms ~6 checks/s
500 10 792.71 ms 795.34 ms 790.79 ms 795.34 ms ~1 checks/s
1,000 10 1581.07 ms 1583.44 ms 1578.23 ms 1583.44 ms ~1 checks/s

Scaling characteristic: Round-trip time grows linearly with sanction list size. The dominant cost is process_psc_msg1 (receiver side), O(n): one h1_function (hash-to-curve via Merlin) + one scalar multiply + one h2_function per entry.


4. Session Management

Pure-Python overhead for Operation session bookkeeping (no cryptography).

Operation n Mean p95 Min Max
has_session() — hit 10000 0.07 µs 0.08 µs 0.00 µs 0.50 µs
has_session() — miss 10000 0.06 µs 0.08 µs 0.04 µs 0.29 µs
_save_session() 10000 0.74 µs 0.33 µs 0.21 µs 3.78 ms

Session operations are sub-microsecond for lookups; they contribute < 1% of total round-trip cost.


Summary

Layer Dominant cost Notes
SHA-256 helpers single hashlib.sha256 call sub-microsecond
create_psc_msg1 1× hash-to-curve + 1× scalar mul ~650 µs
process_psc_msg1 O(n): hash-to-curve + scalar mul + H2 per entry ~2.6 ms @ n=1, ~1.6 s @ n=1000
process_psc_msg2 DLog verify + scalar invert + H2 + O(n) compare ~2 ms fixed
Operation overhead base64, phase dispatch, state dict ops < 100 µs per step
Session management Python dict lookup / insert < 5 µs