Method for Secure AI Grounding using Semantic Token Substitution and Client-Side Re-Hydration

Abstract

Using generative artificial intelligence with sensitive data presents challenges: transmitting personally identifiable information (PII) or protected health information (PHI) to third-party providers can introduce security risks, and some data masking techniques can reduce a model's reasoning capabilities. The described system uses a proxy masking layer that intercepts data within an enterprise's secure perimeter. This layer substitutes sensitive strings with persistent, structured semantic tokens that may be enriched with non-sensitive metadata hints to help preserve context. An external artificial intelligence model can then reason over this abstracted data, and its tokenized response can be re-hydrated into readable text on a client device (e.g., a smartphone, computer, or wearable device). This approach may allow third-party models to reason on proprietary information without direct access to the underlying plaintext, which can help organizations manage data sovereignty while maintaining functional utility.
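The substitution and re-hydration steps described above might be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `TokenVault` class, the `[[KIND_N|hint]]` token shape, and the caller-supplied entity list are all assumptions made for the example, and detection of sensitive strings is left out of scope.

```python
import re
from typing import Dict, List, Tuple

Entity = Tuple[str, str, str]  # (sensitive value, kind, non-sensitive hint)


class TokenVault:
    """Hypothetical mapping store that never leaves the secure perimeter."""

    # Assumed token shape: [[KIND_N]] or [[KIND_N|hint]], KIND in A-Z only.
    TOKEN_RE = re.compile(r"\[\[([A-Z]+_\d+)(?:\|[^\]]*)?\]\]")

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], str] = {}  # (kind, value) -> token id
        self._values: Dict[str, str] = {}           # token id -> value
        self._counters: Dict[str, int] = {}         # kind -> last index used

    def tokenize(self, value: str, kind: str, hint: str = "") -> str:
        """Return a persistent token: the same value always gets the same id."""
        key = (kind, value)
        if key not in self._ids:
            index = self._counters.get(kind, 0) + 1
            self._counters[kind] = index
            token_id = f"{kind}_{index}"
            self._ids[key] = token_id
            self._values[token_id] = value
        token_id = self._ids[key]
        return f"[[{token_id}|{hint}]]" if hint else f"[[{token_id}]]"

    def mask(self, text: str, entities: List[Entity]) -> str:
        """Replace every listed sensitive value in a single regex pass."""
        lookup = {value: (kind, hint) for value, kind, hint in entities}
        if not lookup:
            return text
        # Longest values first so a short value cannot split a longer one.
        ordered = sorted(lookup, key=len, reverse=True)
        pattern = re.compile("|".join(re.escape(v) for v in ordered))
        return pattern.sub(
            lambda m: self.tokenize(m.group(0), *lookup[m.group(0)]), text
        )

    def rehydrate(self, text: str) -> str:
        """Swap known tokens back to plaintext; unknown tokens are left as-is."""
        return self.TOKEN_RE.sub(
            lambda m: self._values.get(m.group(1), m.group(0)), text
        )
```

In this sketch, `mask` would run inside the perimeter before a prompt is sent out, and `rehydrate` would run on the client device against the model's tokenized reply. Because the lookup is keyed on the token id alone, a reply that drops the metadata hint (for example `[[PERSON_1]]` instead of `[[PERSON_1|role=patient]]`) can still be restored.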