Hack-Proof Artificial Intelligence Supply Chains Using Open Source Security

Practical ways to protect against AI software attacks

Tim Miller

Michael Lieberman

August 5, 2024

As security professionals, we are always working to guard against backdoors and vulnerabilities throughout the software development lifecycle. Developers try to pull in open source code that is secure, but dependencies run several layers deep, projects are inconsistently maintained, and now AI enters the picture. How can the industry keep up with the increasing pace of attacks, especially in the relatively novel artificial intelligence space?

Why is security a challenge for the AI software supply chain? 

The perils of Artificial Intelligence (AI) software supply chains mirror those of the broader software landscape, with some added intricacies. This is particularly true when integrating large language models (LLMs) or machine learning (ML) models. Traditional software supply chains are concerned with code and its dependencies; AI supply chains add the complication of the dataset used to train the model. The same model architecture trained on two different datasets can produce dramatically different output.

Individual organizations as well as the broader AI technology ecosystem are seeing some alarming trends. Recent studies suggest an inverse correlation between the security stance of open source AI software tools and their popularity. In other words: as these tools gain wider adoption, they may also increase the risk to users. It’s not just that the code itself could contain an exploitable vulnerability; the model data presents risks as well.

For instance, consider a scenario where a company uses AI models for screening job applicants. State and federal laws forbid discrimination on the basis of race, sex, veteran status, and other attributes. Companies must meticulously assess the software and training data supply chains behind their AI models to avoid biases that could lead to legal issues down the road. This isn’t just speculation. As far back as 2018, Amazon stopped using its AI recruiting tool when it discovered that the tool discriminated against women.

The proliferation of AI models poses substantial legal and regulatory hazards when the models are trained on potentially illegal or unethical data. This underscores the urgent need for stronger safeguards within the AI supply chain to keep users safe and secure. Clearing these hurdles is key to responsible adoption and to realizing and sustaining the potential of AI.

One of the main concerns with generative AI models at the moment is understanding the provenance of the data: does the model have the input necessary to give reasonable output, or will the model tell us to put glue on pizza? A fully open model is the only way to be sure. It’s good to see governments aligning in favor of open innovation, from the EU’s AI Act legislation to the US NTIA’s policy recommendations.

Actions for security professionals to take

Protecting the AI software supply chain takes diligence and a holistic look at the problem across multiple avenues. Only then can you make decisions that truly protect your organization. 

The first step is to use truly open source AI models — that is to say, a model open to inspection, modification, and redistribution, as well as an openly accessible training dataset with transparent origins, offering the same freedoms for scrutiny and utilization. After all, you can’t trust — or fix — what you can’t inspect.
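Making that inspectability operational can start with something simple: pin exactly which model artifacts you pulled in and fail the build if they drift. The sketch below assumes the model is hosted on Hugging Face and uses the huggingface_hub package; the repository ID, revision, and digests are placeholders for whatever you vetted, not a recommendation of any particular model.

```python
# Sketch: pin an open model to an exact revision and verify file digests
# before it enters your pipeline. All identifiers below are placeholders.
import hashlib
from pathlib import Path

from huggingface_hub import snapshot_download  # assumes the huggingface_hub package is installed

EXPECTED_SHA256 = {
    # file name -> digest you recorded when you first vetted the model (placeholder value)
    "model.safetensors": "0" * 64,
}

def fetch_and_verify(repo_id: str, revision: str) -> Path:
    """Download a pinned revision and fail closed if any tracked file changed."""
    local_dir = Path(snapshot_download(repo_id=repo_id, revision=revision))
    for name, expected in EXPECTED_SHA256.items():
        digest = hashlib.sha256((local_dir / name).read_bytes()).hexdigest()
        if digest != expected:
            raise RuntimeError(f"{name}: digest {digest} does not match pinned {expected}")
    return local_dir

if __name__ == "__main__":
    # Replace with the model and commit you actually reviewed.
    fetch_and_verify("example-org/fully-open-model", revision="<commit-sha-you-vetted>")
```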

Second, implement security best practices internally and advocate for greater transparency and accountability within the open source community. Make it the minimum requirement in your organization to have essential security metadata, such as Software Bills of Materials (SBOMs), SLSA (Supply-chain Levels for Software Artifacts) provenance, and SARIF (Static Analysis Results Interchange Format) reports. Many of the projects you rely on are maintained by volunteers, so bring help to improve the practices in your upstreams — don’t just make demands of them.
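In practice, a “minimum requirement” can be a pipeline gate that refuses to ship anything without the expected metadata. Here is a minimal sketch of such a gate for a CycloneDX-format SBOM; the file path and the specific fields enforced are assumptions to adapt to your own policy.

```python
# Sketch of a CI gate that enforces "SBOM required" as a minimum bar.
# Assumes a CycloneDX JSON SBOM produced by your build tooling.
import json
import sys
from pathlib import Path

def check_sbom(path: str) -> int:
    sbom_file = Path(path)
    if not sbom_file.exists():
        print(f"FAIL: no SBOM found at {path}")
        return 1
    sbom = json.loads(sbom_file.read_text())
    if sbom.get("bomFormat") != "CycloneDX":
        print("FAIL: not a CycloneDX SBOM")
        return 1
    # Flag components that lack the identifiers your policy requires.
    problems = [
        component.get("name", "<unnamed>")
        for component in sbom.get("components", [])
        if not component.get("version") or not component.get("purl")
    ]
    for name in problems:
        print(f"WARN: component {name} is missing a version or purl")
    return 1 if problems else 0

if __name__ == "__main__":
    sys.exit(check_sbom(sys.argv[1] if len(sys.argv) > 1 else "sbom.cdx.json"))
```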

Third, adopt open source security tools into your workflow. Projects such as Allstar, GUAC, and in-toto attestations provide tools you can incorporate to observe and verify your software stack’s security posture. Google, our partner in the development of GUAC, recently released a report that shares how they secure their AI supply chain using provenance information and provides guidance for other organizations looking to do the same. 
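As a taste of what these tools let you verify, the sketch below checks that an in-toto attestation carrying the SLSA provenance v1 predicate actually refers to the artifact in hand, by comparing subject digests. It deliberately skips signature verification, which in practice you would delegate to tooling such as slsa-verifier or cosign; the file names are illustrative.

```python
# Minimal sketch: confirm an in-toto/SLSA provenance statement refers to the
# artifact you are about to use. This only shows the subject-digest check;
# real verification must also validate the attestation's signature.
import hashlib
import json
from pathlib import Path

SLSA_PROVENANCE_V1 = "https://slsa.dev/provenance/v1"

def artifact_matches_attestation(artifact_path: str, statement_path: str) -> bool:
    statement = json.loads(Path(statement_path).read_text())
    if statement.get("predicateType") != SLSA_PROVENANCE_V1:
        return False
    artifact_digest = hashlib.sha256(Path(artifact_path).read_bytes()).hexdigest()
    return any(
        subject.get("digest", {}).get("sha256") == artifact_digest
        for subject in statement.get("subject", [])
    )

if __name__ == "__main__":
    ok = artifact_matches_attestation("model.tar.gz", "provenance.statement.json")
    print("provenance matches artifact" if ok else "MISMATCH: do not deploy")
```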

Lastly, invest in open source contributions and funding. Support organizations like the Open Source Security Foundation (OpenSSF), which develops specifications, tools, and initiatives to secure critical open source projects. Donate time and money to the projects in your supply chain that you depend on — it’s far more affordable than writing all that software yourself.

There is no silver bullet to address security, and even the most careful organizations can find themselves on the wrong end of a compromise. The addition of AI models into the software supply chain only adds more complexity. But there’s no need to panic — you can improve your AI supply chain’s observability with tools and practices available today. Once you understand your supply chain, you can secure it.
