Security company Baffle has announced the release of a new solution for securing private data for use with generative AI. Baffle Data Protection for AI integrates with existing data pipelines and helps companies accelerate generative AI projects while ensuring their regulated data is cryptographically secure and compliant, according to the firm.
The solution uses the Advanced Encryption Standard (AES) algorithm to encrypt sensitive data throughout the generative AI pipeline, meaning unauthorized users cannot see private data in cleartext, Baffle added.
The risks associated with sharing sensitive data with generative AI and large language models (LLMs) are well documented. Most relate to the security implications of sharing private data with advanced, public self-learning algorithms, which has driven some organizations to ban or limit certain generative AI technologies such as ChatGPT.
Private generative AI services, particularly retrieval-augmented generation (RAG) implementations that compute embeddings locally on a subset of data, are considered less risky. However, even with RAG, the data privacy and security implications have not been fully considered.
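To illustrate the RAG pattern being referenced, the following is a minimal sketch in which embeddings are computed locally so document text never leaves the organization's environment. It assumes the sentence-transformers library; the model name, documents, and helper function are illustrative, not part of Baffle's product.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Refund policy: customers may return items within 30 days.",
    "Shipping times vary between 3 and 7 business days.",
]

# Embeddings are computed locally, so the source text is never sent to a public service.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = model.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q_vec          # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

# The retrieved passages would then be appended to the prompt sent to the LLM.
print(retrieve("How long do I have to return a product?"))
```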
Solution anonymizes data values to prevent cleartext data leakage
Baffle Data Protection for AI encrypts data with the AES algorithm as it is ingested into the data pipeline, the firm said in a press release. When this data is used in a private generative AI service, sensitive data values are anonymized, so cleartext data leakage cannot occur even with prompt engineering or adversarial prompting, it claimed.
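As a rough sketch of what field-level AES encryption at ingestion looks like in general, the snippet below encrypts designated sensitive fields before a record enters a pipeline, using the Python cryptography package's AES-GCM primitive. Baffle has not published implementation details; the record layout, field names, and key handling here are assumptions for illustration only.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # held by the data owner, never by the AI service
aead = AESGCM(key)

SENSITIVE_FIELDS = {"email", "ssn"}

def encrypt_field(value: str) -> bytes:
    nonce = os.urandom(12)                  # unique nonce per value
    return nonce + aead.encrypt(nonce, value.encode(), None)

def ingest(record: dict) -> dict:
    """Encrypt sensitive fields before the record enters the AI pipeline."""
    return {
        k: encrypt_field(v) if k in SENSITIVE_FIELDS else v
        for k, v in record.items()
    }

protected = ingest({"name": "A. Customer", "email": "a@example.com", "ssn": "123-45-6789"})
# Downstream prompts and embeddings only ever see ciphertext for these fields,
# so adversarial prompting cannot coax the model into revealing cleartext values.
```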
Sensitive data remains encrypted no matter where it is moved or transferred in the generative AI pipeline, helping companies meet specific compliance requirements, such as the General Data Protection Regulation's (GDPR's) right to be forgotten, by shredding the associated encryption key, according to Baffle. The solution also prevents private data from being exposed in public generative AI services, as personally identifiable information (PII) is anonymized.
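The "shredding" point refers to the general technique of crypto-shredding: if each data subject's records are encrypted under their own key, destroying that key renders every copy of the ciphertext permanently unreadable. The sketch below shows the idea in miniature; the in-memory key store and function names are hypothetical stand-ins for a real key-management service, not Baffle's implementation.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

user_keys: dict[str, bytes] = {}            # stands in for a key-management service

def key_for(user_id: str) -> AESGCM:
    key = user_keys.setdefault(user_id, AESGCM.generate_key(bit_length=256))
    return AESGCM(key)

def protect(user_id: str, value: str) -> bytes:
    nonce = os.urandom(12)
    return nonce + key_for(user_id).encrypt(nonce, value.encode(), None)

def forget(user_id: str) -> None:
    """Shred the key: existing ciphertext for this user can no longer be decrypted."""
    user_keys.pop(user_id, None)

blob = protect("user-42", "sensitive value")
forget("user-42")
# Any attempt to decrypt blob now fails, because the key no longer exists,
# regardless of where copies of the ciphertext have been moved or replicated.
```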