- Format-stable — the same shape regardless of input length, so the model is never surprised by a short or long token
- Type-aware — the LLM can still tell a phone from a name from a date
- Reversible — Masker can rehydrate the original on the response leg
- Non-revealing — given only the token, you cannot recover the original value
Token format
All tokens follow this structure:

| Field | Meaning | Example |
|---|---|---|
| scheme | Versioned scheme identifier | MSKV1 (vault-deterministic) or MSK1 (reversible AEAD) |
| kind | Entity type | PHONE, NAME, SSN, MRN, EMAIL, DOB |
| kid | Key ID: which key was used to produce this token | K_HEALTHCARE |
| value | The opaque token body | Base32-encoded, approximately 22 characters |
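
To make the field layout concrete, here is a minimal parsing sketch. It assumes the four fields are joined with `.` (inferred from the MSK*.*.*.* pattern used during rehydration); `MaskerToken` and `parse_token` are hypothetical names for illustration, not part of Masker's API.

```python
from typing import NamedTuple

# Scheme identifiers listed in the token format table.
KNOWN_SCHEMES = {"MSKV1", "MSK1"}


class MaskerToken(NamedTuple):
    scheme: str  # e.g. "MSKV1" or "MSK1"
    kind: str    # e.g. "PHONE"
    kid: str     # e.g. "K_HEALTHCARE"
    value: str   # opaque Base32 body


def parse_token(token: str) -> MaskerToken:
    """Split a token into its four fields (assumes '.' as the separator)."""
    parts = token.split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"not a well-formed token: {token!r}")
    if parts[0] not in KNOWN_SCHEMES:
        raise ValueError(f"unknown scheme: {parts[0]!r}")
    return MaskerToken(*parts)


# The value body below is a made-up placeholder, not a real token.
example = parse_token("MSKV1.PHONE.K_HEALTHCARE.ABCDEFGHIJKLMNOPQRSTUV")
print(example.kind, example.kid)
```

Because the kind and kid travel in the clear, a consumer can route or filter tokens without ever touching the original value.
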
Two tokenization schemes
Masker offers two schemes. You choose one per agent.

- Vault deterministic (default)
- Reversible AEAD

Use vault-deterministic when you want same-value → same-token behavior across a session, for example so the LLM can refer to “the patient” consistently across multiple turns.

Algorithm: HMAC-SHA256 of (kid_secret || normalized_input) → first 16 bytes → base32-encoded.

Storage: A row is written to a SQLite vault at /data/vault.db on the Fly volume.

Scheme prefix: MSKV1

Properties
| Property | Detail |
|---|---|
| Deterministic | The same input always produces the same token within the same kid. The LLM can recognize “this is the same person as in the previous turn.” |
| Reversible | Masker looks up the token in the vault to retrieve the original. |
| Per-agent isolation | Different agents use different kids, so token namespaces don’t collide across customers. |
| Vault-bound | If the vault is lost, tokens are opaque forever. This is a feature when you need hard erasure — drop the vault, and all tokens referencing it become permanently unreadable. |
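
The vault-deterministic scheme can be sketched roughly as follows. This is an illustration under stated assumptions, not Masker's implementation: it reads "HMAC-SHA256 of (kid_secret || normalized_input)" as an HMAC keyed with the kid secret over the normalized input, the normalization rule and the vault table layout are invented for the example, and an in-memory SQLite database stands in for /data/vault.db.

```python
import base64
import hashlib
import hmac
import sqlite3
from typing import Optional


def normalize(value: str) -> str:
    # Assumption: the real normalization rules are not given here.
    # This sketch collapses whitespace and lowercases.
    return " ".join(value.split()).lower()


def open_vault(path: str = ":memory:") -> sqlite3.Connection:
    # Hypothetical schema: one row per token, keyed by the token string.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vault ("
        "token TEXT PRIMARY KEY, original TEXT NOT NULL)"
    )
    return conn


def tokenize(conn: sqlite3.Connection, kind: str, kid: str,
             kid_secret: bytes, value: str) -> str:
    # HMAC-SHA256 keyed with the kid secret, truncated to the first 16 bytes.
    digest = hmac.new(
        kid_secret, normalize(value).encode("utf-8"), hashlib.sha256
    ).digest()
    body = base64.b32encode(digest[:16]).decode("ascii").rstrip("=")
    token = f"MSKV1.{kind}.{kid}.{body}"
    # Keep the first original seen for this token; repeats are no-ops.
    conn.execute(
        "INSERT OR IGNORE INTO vault (token, original) VALUES (?, ?)",
        (token, value),
    )
    conn.commit()
    return token


def lookup(conn: sqlite3.Connection, token: str) -> Optional[str]:
    row = conn.execute(
        "SELECT original FROM vault WHERE token = ?", (token,)
    ).fetchone()
    return row[0] if row else None
```

Calling `tokenize` twice with the same kid secret and the same normalized input yields the same token, while `lookup` against a fresh, empty vault returns `None`, which mirrors the hard-erasure property in the table above.
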
Choosing a scheme
| You want… | Pick |
|---|---|
| Same value → same token (consistent reference within a call) | Vault deterministic |
| No shared state across regions | Reversible AEAD |
| Hard erasure (drop the vault, tokens are dead forever) | Vault deterministic |
| Self-describing tokens that survive restarts | Reversible AEAD |
| Default for healthcare voice agents | Vault deterministic |
The healthcare-default policy ships with vault-deterministic tokenization. You can change it per agent in the portal or via the create-agent API.
Rehydration
On the response leg, Masker scans the LLM’s output for any token matching the pattern MSK*.*.*.*. For each match:

- Vault deterministic: look up the original value in /data/vault.db, replace inline.
- Reversible AEAD: decrypt with the kid key, replace inline.

If a token cannot be rehydrated, Masker emits a rehydration_failed event and replaces the token with [REDACTED:KIND]. The failure is recorded in the audit log.
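
The scan-and-replace flow above can be sketched as a single regex substitution. The assumptions are loud ones: the regex character classes are a guess at what MSK*.*.*.* permits, a plain dict stands in for the vault lookup (an AEAD resolver would decrypt instead), and the `failures` list stands in for the rehydration_failed event and audit log. `rehydrate` is a hypothetical helper, not a Masker function.

```python
import re
from typing import Dict, List

# Assumption: dot-separated fields, uppercase kind and kid, Base32 value body.
TOKEN_RE = re.compile(r"\b(MSKV?1)\.([A-Z]+)\.([A-Z0-9_]+)\.([A-Z2-7]+)\b")


def rehydrate(text: str, vault: Dict[str, str],
              failures: List[dict]) -> str:
    """Replace every token in text with its original, or a redaction marker."""

    def replace(match: "re.Match[str]") -> str:
        token, kind = match.group(0), match.group(2)
        original = vault.get(token)
        if original is None:
            # Stand-in for emitting rehydration_failed and writing the audit log.
            failures.append({"event": "rehydration_failed", "token": token})
            return f"[REDACTED:{kind}]"
        return original

    return TOKEN_RE.sub(replace, text)


vault = {"MSKV1.PHONE.K_HEALTHCARE.ABCDEFGH2345": "555-0100"}
failures: List[dict] = []
reply = ("Call MSKV1.PHONE.K_HEALTHCARE.ABCDEFGH2345 "
         "or ask MSKV1.NAME.K_HEALTHCARE.ZZZZ2222.")
print(rehydrate(reply, vault, failures))
# prints: Call 555-0100 or ask [REDACTED:NAME].
```

The kind is taken from the token itself, so the redaction marker stays type-aware even when the original value cannot be recovered.
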
Key management
Each kid is a logical key identifier mapped to actual key material in your environment (for example, the MASKER_KEY_K_HEALTHCARE variable for the kid K_HEALTHCARE).
You can register multiple kids at once to support key rotation:
1. Add the new key. Set MASKER_KEY_K_HEALTHCARE to the new key material. Keep the old key registered as MASKER_KEY_K_HEALTHCARE_OLD.
2. Update the policy. Switch your agent’s policy to reference the new kid. New tokens will use the new key; existing live tokens still rehydrate via the old key.
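
As an illustration of the environment-variable convention, the sketch below collects every MASKER_KEY_<kid> variable into a kid-to-key mapping. It is a guess at the shape, not Masker's loader: `load_keys` is a hypothetical helper, the key values are treated as UTF-8 text (the real encoding is not specified here), and how Masker itself resolves the _OLD entry during rehydration is not covered.

```python
import os
from typing import Dict, Mapping, Optional

KEY_PREFIX = "MASKER_KEY_"


def load_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, bytes]:
    """Map each kid to the key material found in MASKER_KEY_<kid> variables."""
    env = os.environ if env is None else env
    return {
        name[len(KEY_PREFIX):]: value.encode("utf-8")
        for name, value in env.items()
        if name.startswith(KEY_PREFIX) and value
    }


# During rotation both the new and the old key are registered side by side.
keys = load_keys({
    "MASKER_KEY_K_HEALTHCARE": "new-key-material",
    "MASKER_KEY_K_HEALTHCARE_OLD": "old-key-material",
    "PATH": "/usr/bin",
})
print(sorted(keys))
# prints: ['K_HEALTHCARE', 'K_HEALTHCARE_OLD']
```

Passing an explicit mapping instead of reading `os.environ` keeps the helper easy to test without touching real secrets.
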