
Researchers poison stolen data to make AI systems return wrong results

Researchers affiliated with universities in China and Singapore have devised a technique to make stolen knowledge graph data useless if incorporated into a GraphRAG AI system without consent.

Large language models (LLMs) base their predictions on training data and cannot respond effectively to queries about other data. The AI industry has dealt with that limitation through a process called retrieval-augmented generation (RAG), which gives LLMs access to external datasets. Google's AI Overviews in Search, for example, use RAG to provide the underlying Gemini model with current, though not necessarily accurate, web data.
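At its simplest, the retrieval step just fetches the most relevant snippets and prepends them to the prompt. The toy sketch below illustrates only that pattern; the documents, the word-overlap scoring, and the prompt format are invented for illustration and don't correspond to any production RAG system.

```python
# Toy illustration of plain RAG: fetch the most relevant text snippets
# for a query and prepend them to the LLM prompt. Illustrative only.
documents = [
    "The Eiffel Tower was completed in 1889.",
    "GraphRAG structures retrieval around a knowledge graph.",
]

def retrieve(query: str, k: int = 1):
    # Toy relevance score: count of shared words (real systems use embeddings).
    score = lambda doc: len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=score, reverse=True)[:k]

query = "When was the Eiffel Tower completed?"
prompt = "Context:\n" + "\n".join(retrieve(query)) + f"\n\nQuestion: {query}"
print(prompt)  # the grounded prompt an LLM would receive
```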

GraphRAG represents Microsoft's effort to make RAG more effective. By creating semantically related data clusters called knowledge graphs (KGs), GraphRAG outperforms basic RAG when linked to an LLM-based system. The structuring of the data makes it easier for the LLM to make accurate predictions when prompted.
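A knowledge graph stores facts as subject-relation-object triples, which lets retrieval gather a connected neighborhood of related facts rather than loose text chunks. The sketch below is a minimal illustration of that idea, not Microsoft's GraphRAG code; the triples and helper functions are made up.

```python
# Minimal sketch of graph-structured retrieval: facts live as
# subject-relation-object triples, and retrieval returns an entity's
# one-hop neighborhood as context for an LLM. Illustration only.
from collections import defaultdict

triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "inhibits", "COX-1"),
    ("COX-1", "produces", "prostaglandins"),
]

# Index the graph by entity so related facts can be gathered together.
index = defaultdict(list)
for s, r, o in triples:
    index[s].append((s, r, o))
    index[o].append((s, r, o))

def retrieve_context(entity: str) -> str:
    """Return the entity's one-hop neighborhood as text for an LLM prompt."""
    return "\n".join(f"{s} {r} {o}" for s, r, o in index.get(entity, []))

print(retrieve_context("aspirin"))
```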

Amazon, Google, and Microsoft all support GraphRAG in their respective cloud services.

In a preprint paper titled Making Theft Useless: Adulteration-Based Protection of Proprietary Knowledge Graphs in GraphRAG Systems, authors Weijie Wang, Peizhuo Lv, et al. observe that enterprise KGs can cost a considerable amount to build, citing a figure of $5.71 per factual statement [PDF] for Cyc, a KG encompassing some 21 million assertions.

Given the potential expense, companies have an incentive to prevent KG assets from being stolen and used to build a competitive AI-oriented product – a concern shared by publishers, authors, and other creators of media content. Companies like Pfizer and Siemens have invested in KGs to facilitate drug discovery and assist with manufacturing.

Academics Wang, Lv, and their co-authors propose a KG defense called AURA, which stands for "Active Utility Reduction via Adulteration." The ten authors are affiliated with the Chinese Academy of Sciences, National University of Singapore, Nanyang Technological University, and Beijing University of Technology.

AURA, they explain in their paper, is "a novel framework that makes a stolen KG unusable to an adversary while maintaining minimal performance overhead for the GraphRAG system."

Essentially, it's a mechanism for subtly poisoning, or adulterating, the data that goes into a KG so that accurate retrieval requires a secret key. Unlike traditional encryption, the goal isn't to deny access to the cleartext; rather, it's to degrade the KG content fed to the LLM so that predictions made without the key suffer reduced accuracy and hallucinations.
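The paper's exact construction isn't reproduced here, but one way to picture keyed adulteration is sketched below: assume genuine triples carry a tag derived from a secret key while plausible decoys don't, so the legitimate deployment can filter on the tag and a thief cannot. The names and the tagging scheme are illustrative assumptions, not AURA's actual algorithm.

```python
# Conceptual sketch of keyed adulteration (NOT the paper's AURA algorithm):
# genuine triples are tagged with an HMAC computed from a secret key, then
# mixed with plausible but false variants. The owner's GraphRAG deployment
# filters on the tag; a thief without the key retrieves true and false
# facts indistinguishably.
import hmac, hashlib

SECRET_KEY = b"owner-only-key"  # hypothetical key held by the KG owner

def tag(triple):
    msg = "|".join(triple).encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()

genuine = ("drug_x", "treats", "disease_y")
decoys = [("drug_x", "treats", "disease_z"),
          ("drug_x", "worsens", "disease_y")]

# The stored KG mixes genuine and decoy triples; only genuine ones verify.
stored = [(genuine, tag(genuine))] + [(d, "0" * 64) for d in decoys]

def retrieve(key=None):
    if key is None:                     # attacker: cannot tell decoys apart
        return [t for t, _ in stored]
    return [t for t, m in stored        # owner: keep only verified triples
            if hmac.compare_digest(m, hmac.new(key, "|".join(t).encode(),
                                               hashlib.sha256).hexdigest())]

print(retrieve())            # mixed true and false facts
print(retrieve(SECRET_KEY))  # only the genuine fact
```

The point of the sketch is the asymmetry: the key adds a cheap filtering step for the legitimate owner, while the thief's copy of the graph is silently salted with misinformation.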

Alternative approaches like watermarking may have some utility for making data theft traceable, but they don't address misuse of stolen data in a private setting. And the authors argue that encryption isn't practical.

"Fully encrypting the text and embeddings would require decrypting large portions of the graph for every query," they claim. "This process introduces prohibitive computational overhead and latency, making it unsuitable for real-world use."

The threat model here assumes that the attacker has been able to steal a KG outright but hasn't obtained the secret key. Trade secret lawsuits confirm that companies like Waymo aren't keen to see their IP assets spirited away.

The researchers tested their technique by creating adulterated KGs using datasets MetaQA, WebQSP, FB15K-237, and HotpotQA, then attempted to deploy GraphRAG systems using these poisoned KGs in conjunction with various LLMs (GPT-4o, Gemini-2.5-flash, Llama-2-7b, and Qwen-2.5-7b).

The results indicate that AURA is highly effective. The models retrieved adulterated content 100 percent of the time and emitted incorrect responses to users based on that misinformation 94 percent of the time.

The technique is not perfect, the academics note: in some cases the KG may contain both correct and adulterated data about a subject, and the LLM may pick the correct answer anyway.

There are techniques for detoxifying poisoned data, but the authors claim their approach mostly resists checks based on semantic consistency (e.g. Node2Vec), graph-based anomaly detection (e.g. ODDBALL [PDF]), and hybrid approaches (e.g. SEKA).

"By degrading the stolen KG's utility, AURA offers a practical solution for protecting intellectual property in GraphRAG," the authors conclude. ®

Source: The Register
