Sunday, March 2, 2025

3 Top Trends To Emerge From Generative AI Poisoning Attacks

 


It seems that all the news headlines in Cyber today are about Generative AI and its many different subsets, such as Large Language Models (also known as “LLMs”).  I have covered this topic very extensively in the four books that I have written about it, as well as in the white papers, articles, and blogs that I have written for other people. 

But there is one area that, unbelievably, I have barely touched upon, and that is the area of what is known as “AI Data Poisoning”. 

You may be wondering what it is, so here is a technical definition of it:

“Data poisoning is a type of cyberattack where threat actors manipulate or corrupt the training data used to develop artificial intelligence (AI) and machine learning (ML) models.”

(SOURCE:  What Is Data Poisoning? | IBM)
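To make the definition above concrete, here is a minimal sketch (all of it illustrative, not taken from any real attack) of one classic poisoning technique: the attacker injects training records that look like one class but carry the wrong label, and the model trained on the corrupted data degrades while the clean one does not.

```python
import random

random.seed(7)

def make_data(n_per_class=100):
    """Two well-separated Gaussian clusters, labeled 0 and 1."""
    data = []
    for _ in range(n_per_class):
        data.append(((random.gauss(0, 1), random.gauss(0, 1)), 0))
        data.append(((random.gauss(5, 1), random.gauss(5, 1)), 1))
    return data

def predict_1nn(train, point):
    """1-nearest-neighbor: copy the label of the closest training point."""
    def dist2(item):
        (x, y), _ = item
        return (point[0] - x) ** 2 + (point[1] - y) ** 2
    return min(train, key=dist2)[1]

def accuracy(train, test):
    return sum(predict_1nn(train, p) == y for p, y in test) / len(test)

clean_train = make_data()
test = make_data()

# Poisoning: the attacker injects records that look like class 1
# but are deliberately mislabeled as class 0.
poison = [((random.gauss(5, 1), random.gauss(5, 1)), 0) for _ in range(100)]
poisoned_train = clean_train + poison

clean_acc = accuracy(clean_train, test)
poisoned_acc = accuracy(poisoned_train, test)
print(f"clean model accuracy:    {clean_acc:.2f}")
print(f"poisoned model accuracy: {poisoned_acc:.2f}")
```

The toy model here is a simple nearest-neighbor classifier rather than an LLM, but the mechanism is the same one the definition describes: corrupt the training data, and the model's outputs are corrupted downstream.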

Remember, as I have written about in the past, what drives a Generative AI model is the data that is fed into it.  It can be easily compared to a car, which needs gasoline to make it run and go places.  Likewise, it is the data that fuels the model and gives it the momentum it needs to produce an answer, or an output, to the query that has been submitted to it.

But keep in mind that not just any output will do.  It must meet what the end user is looking for.  To make sure that this happens, whoever is in charge of the model must make sure that the datasets that are fed into it are cleansed and robust, as well as free of any statistical outliers. 
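As a hypothetical sketch of what that cleansing step could look like, here is a simple outlier filter using the median and MAD (median absolute deviation) rather than the mean and standard deviation, since a single extreme poisoned value can distort the mean enough to mask itself. The sample values and threshold are illustrative assumptions, not from any real pipeline.

```python
import statistics

def filter_outliers_mad(values, threshold=3.5):
    """Drop records whose robust z-score (median/MAD rule) is too large.

    The median and MAD are barely affected by a few extreme values,
    so an injected outlier cannot hide by inflating the statistics.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]

# A mostly clean feature column with one suspicious injected value.
readings = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 500.0]
clean = filter_outliers_mad(readings)
print(clean)  # the injected 500.0 is dropped, the rest survive
```

Real dataset cleansing involves far more than one numeric filter, of course, but this shows the principle of screening data before it ever reaches the model.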

Using our car example again, you need to use the right kind of fuel so that the engine does not get damaged (for instance, you do not pump diesel fuel into a Honda).  The same is true of the Generative AI model.  It needs the right data so that its algorithms (which are its engine) run just as smoothly.

But Generative AI is a field that is changing on an almost daily basis.  Thus, trying to deploy the latest Cybersecurity controls can be an almost impossible task to accomplish.  The Cyberattacker is fully aware of this and knows the vulnerabilities that are present.  As a result, they launch what are known as Poisoning Attacks to insert fake data into the model. 

But it does not stop here.  They can also quite easily insert a malicious payload to serve two key purposes:

Ø  Launch another Supply Chain Attack (just as we saw with SolarWinds and CrowdStrike) that could have huge, cascading effects.

Ø  Launch a Data Exfiltration Attack to not only steal the legitimate datasets that are being used in the model itself, but also those datasets which reside in the IT and Network Infrastructure of a business entity.

So, given all of this, there are three trends that are expected to emerge at some point down the road, which are as follows:

1)     Back To Solar Winds:

Yes, I know I just mentioned this, but the kind of attack that can happen here to a Generative AI Model will be magnified by at least ten times because of a Poisoning Attack.  To put it in perspective, when the SolarWinds hack took place, there were about 1,000 victims.  Now, there could be at least 10,000 victims, or even more, all over the world.  In this regard, the main point of insertion for a malicious payload would be the LLM, if one is present.

2)     The Role of the CDO:

This is an acronym that stands for the “Chief Data Officer”.  This job title can be compared to that of the CISO, but their focus is on the datasets that their company has and is currently using.  Up until now, their main task was simply to write the Security Policies that would help fortify the lines of defense around a Generative AI model.  But with the advent of Data Poisoning, their role will now shift to hiring and managing a team of employees whose sole mission is the cleansing and optimization of the datasets before they are fed into the model.  Another key role for them is to make sure that whatever datasets they are using comply with the data privacy laws, such as the GDPR and the CCPA.

3)     It is Going to Happen:

Just as Phishing has been around for a long time, so will Poisoning Attacks be.  They will start to evolve this year and pick up steam later on.  As companies keep using Generative AI, this will become a highly favored threat variant for the Cyberattacker.  In fact, according to a recent market survey conducted by McKinsey, over 65% of businesses today use Generative AI on a daily basis.  To see the full report, access the link below:

http://cyberresources.solutions/Blogs/Gen_AI_Report.pdf

My Thoughts on This:

I am far from being an actual Generative AI practitioner, but I would like to offer my opinion as to how you can mitigate the threat of a Poisoning Attack impacting your business:

Ø  Generative AI models are not just one thing.  The model, or models, that you use are connected to many other resources in the external world.  There are a lot of interconnectivities here, so I would recommend keeping a map or visual to keep track of all of this, and updating it in real time as more connections are made.  This will also give you a clearer idea as to exactly where you need to deploy your Cybersecurity controls in the Generative AI Ecosystem.

 

Ø  If you can, hire a CDO as quickly as possible.  You do not have to hire them as a full-time employee; you can also hire them on a contract basis to keep them affordable.  But you will need one ASAP if you are going to make use of Generative AI based models.
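My first recommendation above, keeping a live map of everything the model is connected to, can be sketched in a few lines of code.  Everything here is a hypothetical illustration: the component names are made up, and a real deployment would track far more detail.

```python
from collections import defaultdict

# A tiny adjacency map of the Generative AI ecosystem: each component
# records the other resources it talks to, updated as connections are made.
ecosystem = defaultdict(set)

def add_connection(component, depends_on):
    """Record that one component of the ecosystem connects to another."""
    ecosystem[component].add(depends_on)

def attack_surface(component):
    """Everything reachable from a component: candidate spots for controls."""
    seen, stack = set(), [component]
    while stack:
        node = stack.pop()
        for nxt in ecosystem.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Illustrative connections only; your own map will look different.
add_connection("llm-model", "vector-db")
add_connection("llm-model", "training-pipeline")
add_connection("training-pipeline", "raw-data-lake")

print(sorted(attack_surface("llm-model")))
```

Even this crude version makes the point: once the connections are written down, you can ask "what can an attacker reach from here?" instead of guessing where to place your defenses.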

Poisoning Attacks are going to be around for a long time.  So, now is the time to get prepared!!!
