Just this past Thursday, I gave a presentation to about
150 people (coworkers of mine) about Generative AI and Cybersecurity. I touched upon a few use cases, and the kinds
of questions that a client or a prospect might ask if they are interested in this
kind of solution.
At the end of the presentation, I was asked a question: “How do I safeguard my PII (Personally Identifiable Information) if I know I am submitting it to an application that uses Generative AI?”
I had to think about this one, because I did not have a ready answer. There is no silver bullet here, which makes it a difficult question. As far as I know, at the present time there are no concrete controls in place for this.
Rather, security must be baked in from the moment you start building your Generative AI model, all the way through to deploying it into your production environment.
This concept is technically known as “Secure by Design”,
and it can be defined as follows:
“Secure by design is a philosophy and approach that
prioritizes security considerations at every stage of the development
lifecycle, ensuring that systems and products are inherently secure from the
start. Instead of adding security as an afterthought, secure by design integrates
security measures throughout the entire design process, making it a
foundational element of the product or system.”
(SOURCE: Google
Search)
In other words, security is not an afterthought bolted on after deployment; rather, it is planned out from the very beginning to make sure that nothing has been left out.
This has become the mantra in the world of software development, and now, given the evolution of Generative AI and its long-term potential, it is being applied there as well.
This is a concept that has been strongly advocated by
CISA, and to read a comprehensive white paper, click on the link below:
http://cyberresources.solutions/Blogs/CSIA_SBD.pdf
This is a proactive approach to securing your Generative AI models right from the very beginning. Because of the confusion about what “secure” really means in this realm, the approach being taken at the present time is largely a reactive one. Here are some of the consequences:
1) Data Poisoning:
One of the cardinal rules in Generative AI is that you want all the datasets that you will be using to be as “cleansed” as possible, meaning that they are free of erroneous or corrupted bits of information. But in this regard, the Cyberattacker, unknown to you, can easily insert some kind of malicious payload into them. This will eventually lead to data exfiltration or data leakage. (A simple data-validation sketch appears after this list.)
2) Prompt Engineering:
This is the art in Generative AI of crafting specific queries to get the best answers (or outputs) possible. It takes time to learn all of this. But the Cyberattacker is stealthy enough to inject malicious prompts into your Generative AI model, thus having it give out answers that it should not be giving. (A basic prompt-injection guard is sketched after this list.)
3) Deserialization:
This is the process by which the Generative AI model takes serialized bits of data and transforms them back into other types, especially when they are needed for software development. Once again, a malicious payload can be planted here by the Cyberattacker to create a back door into the software application after it has been deployed. (A safer-deserialization sketch follows this list as well.)
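To make the Data Poisoning item a little more concrete, here is a minimal sketch, in Python, of the kind of basic sanity check you could run over training records before fine-tuning. The field names, label set, and thresholds are purely assumptions on my part for illustration; they do not come from any particular toolkit.

# Minimal sketch: basic hygiene checks on training records before fine-tuning.
# The schema (a "text" and a "label" field) and the limits below are assumed
# values for illustration, not a real product's rules.

import re

REQUIRED_FIELDS = {"text", "label"}
ALLOWED_LABELS = {"benign", "suspicious"}      # assumed label set
MAX_TEXT_LENGTH = 5000                         # assumed upper bound
URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)

def is_clean(record: dict) -> bool:
    """Return True only if the record passes these very basic checks."""
    if not REQUIRED_FIELDS.issubset(record):
        return False
    if record["label"] not in ALLOWED_LABELS:
        return False
    text = record.get("text")
    if not isinstance(text, str) or len(text) > MAX_TEXT_LENGTH:
        return False
    # Crude red flag: unexpected embedded URLs can be one sign of injected content.
    return not URL_PATTERN.search(text)

def filter_dataset(records: list[dict]) -> list[dict]:
    """Drop records that fail the checks instead of letting them into training."""
    return [r for r in records if is_clean(r)]

This obviously will not stop a determined attacker, but it shows the spirit of the “cleansed” requirement: nothing goes into the training set without being checked against some rule.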
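For the Prompt Engineering item, this is a minimal sketch of a pre-screening step that looks for obvious prompt-injection phrasing before a user's input ever reaches the model. The phrase list is a tiny, assumed sample, and call_model() is just a hypothetical placeholder for whatever Generative AI API you actually use; a real guard would have to be far more sophisticated than simple string matching.

# Minimal sketch: screen user input for obvious prompt-injection phrasing
# before passing it along to the model. The phrase list is an assumed sample;
# real injection attempts are far more varied than this.

SUSPICIOUS_PHRASES = [
    "ignore previous instructions",
    "ignore all prior instructions",
    "reveal your system prompt",
    "disregard the rules above",
]

def looks_like_injection(user_input: str) -> bool:
    lowered = user_input.lower()
    return any(phrase in lowered for phrase in SUSPICIOUS_PHRASES)

def call_model(prompt: str) -> str:
    # Hypothetical placeholder for your actual Generative AI API call.
    return f"(model response to: {prompt})"

def handle_request(user_input: str) -> str:
    if looks_like_injection(user_input):
        # Refuse, or route to human review, rather than sending it to the model.
        return "This request was blocked by the input filter."
    return call_model(user_input)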
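And for the Deserialization item, here is a minimal sketch of one defensive habit: restricting what a pickle stream is allowed to reconstruct rather than trusting it blindly. This uses only the Python standard library; the allow-list is an assumption for illustration and would need to be tailored to whatever objects your pipeline legitimately exchanges.

# Minimal sketch: a restricted unpickler that refuses to reconstruct any class
# or function that is not on a short allow-list, instead of executing whatever
# the incoming stream asks for.

import io
import pickle

ALLOWED = {
    ("builtins", "dict"),
    ("builtins", "list"),
    ("builtins", "str"),
    ("builtins", "int"),
    ("builtins", "float"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        # Anything else (for example, os.system) is refused outright.
        raise pickle.UnpicklingError(f"Blocked unpickling of {module}.{name}")

def safe_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()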
So how can Secure by Design come to work here? It does so primarily by adopting the principles of what is known as “Machine Learning Security Operations”, or “MLSecOps” for short. This is where the Generative AI team, the Operations team, and the IT Security team come together in unison to make sure that the Generative AI model has security baked in right from the very beginning of the development lifecycle. Some of its components are as follows:
Ø Threat Modeling: This is where future threat variants are predicted, to help beef up the lines of defense.
Ø Data Preparation: This is where all efforts are made to ensure that all the collected datasets are cleansed and optimized, as stated earlier.
Ø Testing: You want to evaluate the Generative AI model as it is being developed throughout its various stages. One class of tools that can be used here is what are known as “model scanners”.
Ø Continuous Monitoring: Even after the Generative AI model has been deployed into the production environment, you will still want to keep an eye on it and monitor any rogue network traffic that may come its way. (A minimal monitoring sketch appears after this list.)
Ø Penetration Testing: This is where you will want the Red and Blue Teams to launch comprehensive exercises so that any gaps or vulnerabilities can be found and quickly remediated. You do not want to wait until the end to do this.
Ø Firewalls: One of the best tools that you can use to protect your Generative AI model is something like a Next Generation Firewall. These are much more sophisticated versions of the traditional Firewall and can analyze data packets at a much more granular level.
Ø Incident Response: In this regard, you will want to have playbooks that can be automatically triggered to help contain a security breach to your Generative AI model should one ever occur.
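As a small illustration of the Continuous Monitoring bullet, here is a minimal sketch that flags a client hitting the model endpoint unusually often. The 60-second window and the threshold of 100 requests are assumed values; in a real deployment this signal would feed your SIEM or alerting pipeline rather than just returning a flag.

# Minimal sketch: flag a client that calls the model endpoint unusually often.
# The window size and threshold below are assumed values for illustration.

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS_PER_WINDOW = 100

_request_log = defaultdict(deque)   # client_id -> timestamps of recent requests

def record_request(client_id: str) -> bool:
    """Record one request; return True if this client now looks anomalous."""
    now = time.time()
    timestamps = _request_log[client_id]
    timestamps.append(now)
    # Drop anything that has fallen outside the sliding window.
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()
    return len(timestamps) > MAX_REQUESTS_PER_WINDOW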
My Thoughts on This:
Even when using the concepts of MLSecOps, you must have buy-in from every key stakeholder who participates in every aspect of the Generative AI development process. This also includes the C-Suite.
They must know what is going on; they cannot plead ignorance or claim that they have not been informed. There must also be a Change Management Committee that will oversee any changes to the Generative AI model after it has been developed and deployed into the production environment.
So, in the end, I guess the best answer right now as to how to safeguard your PII is to be just as proactive on your end as the MLSecOps team would be on theirs. Trust your gut. If something does not feel right, do not enter your private and confidential information into it, or at the very least scrub it out first, as in the sketch below.
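On that last point, one practical habit is to scrub the obvious identifiers out of whatever you are about to paste into a Generative AI application. Here is a minimal sketch of that idea in Python; the patterns cover only a few common U.S.-style formats (email, phone, Social Security Number) and are assumptions for illustration, not a complete PII filter.

# Minimal sketch: redact a few obvious PII patterns from a prompt before it is
# submitted to a Generative AI application. These regexes are not exhaustive.

import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(prompt: str) -> str:
    """Replace matched identifiers with a placeholder before submission."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED {label}]", prompt)
    return prompt

# Example:
# redact_pii("Email me at jane.doe@example.com or call 555-123-4567")
# -> "Email me at [REDACTED EMAIL] or call [REDACTED PHONE]"

It is a small step, but it puts the control back in your hands instead of relying entirely on whoever built the application.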