Vulnerabilities in popular AI and ML Python libraries used in Hugging Face models with tens of millions of downloads allow remote attackers to hide malicious code in metadata. The code then executes automatically when a file containing the poisoned metadata is loaded.
The open source libraries - NeMo, Uni2TS, and FlexTok - were created by Nvidia, Salesforce, and Apple working with the Swiss Federal Institute of Technology's Visual Intelligence and Learning Lab (EPFL VILAB), respectively.
All three libraries use Hydra, another Python library maintained by Meta and commonly used as a configuration management tool for machine learning projects. Specifically, the vulnerabilities involve Hydra's instantiate() function.
Palo Alto Networks' Unit 42 spotted the security flaws and reported them to the libraries' maintainers, who have since issued security warnings, fixes and, in two cases, CVEs. While the threat hunters say they haven't seen any in-the-wild abuse of these vulnerabilities to date, "there is ample opportunity for attackers to leverage them."
"It is common for developers to create their own variations of state-of-the-art models with different fine-tunings and quantizations, often from researchers unaffiliated with any reputable institution," Unit 42 malware research engineer Curtis Carmony wrote in a Tuesday analysis. "Attackers would just need to create a modification of an existing popular model, with either a real or claimed benefit, and then add malicious metadata."
Plus, Hugging Face doesn't make the metadata contents as easily accessible as it does with other files, nor does it flag files using its safetensors or NeMo file formats as potentially unsafe.
Models on Hugging Face use more than 100 different Python libraries, and almost 50 of these use Hydra. "While these formats on their own may be secure, there is a very large attack surface in the code that consumes them," Carmony wrote.
The Register reached out to Hugging Face along with the libraries' maintainers (Meta, Nvidia, Salesforce, and Apple), and only received one response. It came from a Salesforce spokesperson who told us: "We proactively remediated the issue in July 2025 and have no evidence of unauthorized access to customer data."
We will update this story if we hear back from any of the other companies.
As mentioned earlier, the vulnerabilities stem from the way NeMo, Uni2TS, and FlexTok use the hydra.utils.instantiate() function to load configurations from model metadata, which allows for remote code execution (RCE).
The creators or maintainers of these libraries appear to have overlooked the fact that instantiate() doesn't just accept the names of classes to instantiate. It also takes the name of any callable and passes it the provided arguments.
By leveraging this, an attacker can more easily achieve RCE using built-in Python functions like eval() and os.system().
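To illustrate the pattern - this is a minimal standard-library sketch of what dotted-path instantiation boils down to, not Hydra's actual implementation - consider what happens when the _target_ field of a config is attacker-controlled:

```python
import importlib

def resolve_and_call(target, *args, **kwargs):
    """Mimic the core behaviour of dotted-path instantiation:
    resolve a string like 'pkg.mod.name' to ANY callable and invoke it."""
    module_path, _, attr = target.rpartition(".")
    fn = getattr(importlib.import_module(module_path), attr)
    return fn(*args, **kwargs)

# A config field meant to name a model class can just as easily name eval():
poisoned = {"_target_": "builtins.eval", "args": ["21 * 2"]}
result = resolve_and_call(poisoned["_target_"], *poisoned["args"])  # evaluates attacker code
```

Because the resolution step accepts any importable callable, a metadata field that should name a model class can instead name eval() or os.system(), and the loader will dutifully call it.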
Meta has since updated Hydra's documentation with a warning that states RCE is possible when using instantiate() and urges users to add a block-list mechanism that compares the _target_ value against a list of dangerous functions before it is called. As of now, however, the block-list mechanism hasn't been made available in a Hydra release.
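A guard in that spirit could be sketched as follows. This is purely illustrative - it is not the unreleased Hydra mechanism, and the entries in the block list are examples, not an exhaustive set (block lists are inherently incomplete, which is one reason allow lists are generally preferred):

```python
# Illustrative block list of callables an ML config should never name.
DANGEROUS_TARGETS = {
    "builtins.eval", "builtins.exec", "builtins.getattr",
    "os.system", "os.popen", "subprocess.run", "subprocess.Popen",
}

def check_target(cfg: dict) -> dict:
    """Reject a config whose _target_ names a known-dangerous callable."""
    target = cfg.get("_target_", "")
    if target in DANGEROUS_TARGETS:
        raise ValueError(f"refusing to instantiate blocked target: {target!r}")
    return cfg  # only now hand the config to the instantiation machinery
```

In use, check_target() would wrap every config read from untrusted model metadata before it reaches instantiate().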
Here's a closer look at three of the AI/ML libraries that use Hydra's instantiate() function and the related vulnerabilities.
NeMo is a PyTorch-based framework that Nvidia created in 2019. Its .nemo and .qnemo file extensions - TAR files containing a model_config.yaml file - store model metadata along with a .pt file or a .safetensors file, respectively.
The problem here is that NeMo doesn't sanitize the metadata before passing it to hydra.utils.instantiate(). An attacker can therefore craft a .nemo file with malicious metadata that, once loaded, triggers the vulnerability and achieves RCE or tampers with data.
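The archive layout described above can be reproduced with the standard library alone. The sketch below builds a minimal stand-in for a .nemo file - a TAR containing a model_config.yaml - to show where the attacker-controlled metadata travels; the YAML contents are an illustrative poisoned config, not real NeMo configuration:

```python
import io
import tarfile

# Attacker-controlled metadata: a config whose _target_ names a shell call.
yaml_bytes = b"model:\n  _target_: os.system\n  cmd: echo pwned\n"

# Build a minimal stand-in for a .nemo archive in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo("model_config.yaml")
    info.size = len(yaml_bytes)
    tar.addfile(info, io.BytesIO(yaml_bytes))

# A loader unpacks the archive and reads the config back out.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    names = tar.getnames()
    config_text = tar.extractfile("model_config.yaml").read().decode()
```

The TAR itself is inert; the danger arises only when a loader feeds the recovered config to hydra.utils.instantiate() without vetting _target_.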
Nvidia issued CVE-2025-23304 to track the high-severity bug and released a fix in NeMo version 2.3.2.
NeMo also integrates with Hugging Face, and an attacker could follow the same code path to exploit this vulnerability once the model is downloaded.
According to Unit 42, more than 700 models on Hugging Face from a variety of developers are provided in NeMo's file format.
Uni2TS is a PyTorch library created by Salesforce and used in its Moirai foundation model for time series analysis along with a set of models published on Hugging Face.
This library works exclusively with .safetensors files, created by Hugging Face as a safe format for storing tensors, as opposed to pickle, which allows for arbitrary code execution during the loading process.
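The safetensors layout itself can be sketched with the standard library alone: an 8-byte little-endian header length, then a JSON header whose optional __metadata__ map holds free-form strings. The sketch below is illustrative (no tensor data is included) and shows why the format is safe in isolation - the metadata is just strings until consuming code interprets it:

```python
import json
import struct

# Build a minimal safetensors-style header carrying poisoned metadata.
header = {"__metadata__": {"config": '{"_target_": "os.system"}'}}
header_bytes = json.dumps(header).encode()
blob = struct.pack("<Q", len(header_bytes)) + header_bytes  # length prefix + JSON

# A reader recovers the metadata without executing anything.
n = struct.unpack("<Q", blob[:8])[0]
meta = json.loads(blob[8:8 + n])["__metadata__"]
```

Parsing the file is harmless; the risk appears only when a library deserializes those strings into a config and hands the result to hydra.utils.instantiate().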
Salesforce models using these libraries have hundreds of thousands of downloads on Hugging Face, and other users have also published several adaptations of these models.
Hugging Face also provides a PyTorchModelHubMixin interface for creating custom model classes that can be integrated with the rest of its framework.
This interface provides a specific mechanism for registering coder functions, and - you guessed it - the Uni2TS library uses this mechanism to decode the configuration of a specific argument via a call to hydra.utils.instantiate().
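The shape of that code path can be sketched generically. The registry below is a hypothetical stand-in, not PyTorchModelHubMixin's real API; it shows how a registered decoder ends up receiving an attacker-controlled string that a vulnerable library would then pass to instantiate():

```python
import json

# Hypothetical coder-function registry, standing in for the mixin's mechanism.
DECODERS = {}

def register_decoder(field):
    def wrap(fn):
        DECODERS[field] = fn
        return fn
    return wrap

@register_decoder("module")
def decode_module(raw: str) -> dict:
    cfg = json.loads(raw)  # `raw` comes straight from model metadata
    # A vulnerable library would call hydra.utils.instantiate(cfg) here,
    # executing whatever callable the metadata's _target_ names.
    return cfg
```

The decoder itself is just plumbing; the exploitable step is the unvetted hand-off from decoded metadata to instantiate().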
On July 31, Salesforce issued CVE-2026-22584 and deployed a fix.
Early last year, Apple and EPFL VILAB created FlexTok, a Python-based framework that enables AI/ML models to process images.
As with Uni2TS, FlexTok works exclusively with .safetensors files; it extends PyTorchModelHubMixin and can load configuration and metadata from a .safetensors file. After it decodes the metadata, FlexTok passes it to hydra.utils.instantiate(), which triggers the vulnerability.
"As of January 2026, no models on Hugging Face appear to be using the ml-flextok library other than those models published by EPFL VILAB, which have tens of thousands of downloads in total," Carmony wrote.
Apple and EPFL VILAB fixed these security issues by using YAML to parse their configurations. The maintainers also added an allow list of classes that can call Hydra's instantiate() function, and updated documentation to say only models from trusted sources should be loaded. ®
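An allow list in the spirit of that fix can be sketched as follows. The class name in the list is hypothetical, not FlexTok's actual code, and this guard is a simplified illustration of the idea rather than the maintainers' patch:

```python
# Hypothetical allow list: only explicitly named classes may be instantiated.
ALLOWED_TARGETS = {
    "flextok.models.FlexTokModel",  # illustrative name, not the real one
}

def vet_config(cfg: dict) -> dict:
    """Reject any config whose _target_ is not explicitly allowed."""
    target = cfg.get("_target_")
    if target not in ALLOWED_TARGETS:
        raise ValueError(f"target not on allow list: {target!r}")
    return cfg  # only vetted configs reach hydra.utils.instantiate()
```

Unlike a block list, this default-deny approach stays safe even when an attacker names a dangerous callable the maintainers never thought of.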
Source: The Register