It seems like we can never get away from this one topic: Generative AI. The bottom line is that whether you love it or hate it, use it or not, it is going to be around with us for a long time to come.
There are many facets of Generative AI that touch our lives, but the most influential one is ChatGPT. Personally, I have never used it, and I have made a promise to myself never to use it.
Anyways, differences aside, there is a new trend coming out now, and it is called “Shadow Generative AI”. It is just like its cousin, “Shadow IT”. With Shadow IT, employees either continue to use outdated software or download non-sanctioned software, because they are creatures of habit and do not want to use what is new, or because they are trying to retaliate against something that has happened to them personally at their workplace.
The same can be said of Shadow Generative AI. This is the situation where employees use non-approved Generative AI models to help them do their daily job tasks. Because of the inherent risks that this can bring, many businesses in Corporate America are now cracking down on it. Consider some of these stats:
*Using non-approved Generative AI models is now completely banned in the healthcare and financial sectors.
*Even technology
companies are cracking down, such as Apple, Samsung, and even Amazon.
*The use of ChatGPT that has not been approved for corporate usage stands at an alarming 74%. The same can also be said of its competitors, Gemini and Bard.
*Given the rapid development of Generative AI, companies are finding it difficult to enforce data security policies, and at least 27% of the data that is pumped into these models is not secured.
(SOURCE: The
Security Risk of Rampant Shadow AI)
The main culprit behind the huge risk of “Shadow Generative AI” lies in the fact that the datasets fed into a model are its lifeblood. Large amounts of data are needed not only to initially train the model, but also to keep it learning going forward.
Because of this, datasets have become a prized target not only for the Cyberattacker, but also for those rogue employees who are considering launching an Insider Attack.
So now all of this begs the question: how can a CISO and their IT Security team stop the risks of “Shadow Generative AI” from materializing in the first place? This can be examined from a couple of different areas, but let us start at the heart of the matter: the datasets.
Every effort must be made to ensure that they are as secure as possible. You can even think of this as a “Wash, Rinse, and Repeat” cycle:
1) Before Ingestion: Make sure that all pathways that lead from the database to the actual model are secure. It is at this first point that the datasets will be “ingested”, and thus, this can be considered the most vulnerable point. Aside from this, you must make sure that the datasets are cleansed and optimized as much as possible. This is imperative, because if they are not, the processing of any submitted query and the output that is created from it will be skewed. (See the first sketch after this list.)
2) In Process: The appropriate controls also need to be put into place to protect the model itself. This simply means that only authorized employees should be able to gain access to it, such as the data scientists who have built the model and the data analysts who examine the processes that are taking place within it. (See the second sketch after this list.)
3) What Comes Out: In the end, once all the datasets have been inputted and processed per the query that was submitted by the end user, the result is the answer to the question, or in more technical terms, the “output”. But even here, the appropriate safeguards must be put into place, not only to ensure the privacy of the end user, but also because if the outputs that are derived are created for market intelligence purposes, then they can be considered Intellectual Property, which needs to be highly protected. (See the third sketch after this list.)
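To make the “Before Ingestion” stage more concrete, here is a minimal sketch in Python of what dataset cleansing might look like. The record fields (“text”, “user_id”) and the redaction rules are purely illustrative assumptions on my part, not any specific product’s pipeline:

```python
import hashlib
import re

# A minimal, hypothetical sketch of cleansing records before ingestion.
# The field names and rules here are illustrative assumptions.

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def sanitize_record(record: dict) -> dict:
    """Cleanse one record: redact e-mail addresses, hash direct identifiers."""
    cleaned = dict(record)
    if "text" in cleaned:
        # Strip anything that looks like an e-mail address from free text.
        cleaned["text"] = EMAIL_RE.sub("[REDACTED_EMAIL]", cleaned["text"])
    if "user_id" in cleaned:
        # One-way hash so the model never sees the raw identifier.
        digest = hashlib.sha256(str(cleaned["user_id"]).encode()).hexdigest()
        cleaned["user_id"] = digest[:16]
    return cleaned

def ingest(records):
    """Yield only cleansed, usable records to the training pipeline."""
    for record in records:
        cleaned = sanitize_record(record)
        if cleaned.get("text"):  # drop records with nothing left to learn from
            yield cleaned
```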
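For the “In Process” stage, here is a sketch of a simple gate that only lets the authorized roles mentioned above reach the model. The role names and the run_inference stub are my assumptions:

```python
# A hypothetical access gate for the model: only the data scientists and
# data analysts mentioned above get through. Names here are assumptions.

AUTHORIZED_ROLES = {"data_scientist", "data_analyst"}

class AccessDenied(Exception):
    pass

def run_inference(prompt: str) -> str:
    # Stand-in for the real model call, which would go over a secured channel.
    return f"(model output for: {prompt})"

def query_model(user_roles: set, prompt: str) -> str:
    """Refuse the query unless the caller holds an authorized role."""
    if not (user_roles & AUTHORIZED_ROLES):
        raise AccessDenied("This user is not cleared to access the model.")
    return run_inference(prompt)

# Example: an analyst gets through; anyone without these roles does not.
print(query_model({"data_analyst"}, "Summarize last week's error logs"))
```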
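And for the “What Comes Out” stage, here is a sketch of scrubbing the output before it reaches the end user and tagging market-intelligence outputs as Intellectual Property. The SSN pattern and the “ip_sensitive” flag are illustrative assumptions:

```python
import re

# A hypothetical output safeguard: redact PII-like strings and mark
# IP-grade outputs as confidential. Pattern and flag are assumptions.

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def safeguard_output(output: str, ip_sensitive: bool = False) -> str:
    """Redact anything that looks like a US SSN; label IP-grade outputs."""
    cleaned = SSN_RE.sub("[REDACTED]", output)
    if ip_sensitive:
        cleaned = "[CONFIDENTIAL - INTERNAL IP]\n" + cleaned
    return cleaned

print(safeguard_output("Customer 123-45-6789 churn risk: high",
                       ip_sensitive=True))
```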
This entire process must be run each time a new query is submitted to the model, or a new model has been created from scratch.
The second angle of attack here is the policies that the CISO and the IT Security team implement with regard to the usage of Generative AI in the workplace. Here are some key things to consider:
*Use Obfuscation: This is where Data Tokens can be created to represent the actual datasets. As a result, even if they are hijacked, there is very little that the Cyberattacker can do with them. (See the tokenization sketch after this list.)
*Watch The Access: In this regard, you will want to follow the concept of “Least Privilege”. This is where you assign only those rights, permissions, and privileges that are absolutely necessary for the employee to do their job, and no more than that. This also holds true for the Generative AI models. The scientists and analysts who work on them should only be given what they absolutely need. (See the least-privilege sketch after this list.)
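Here is a minimal sketch of what data tokenization could look like, assuming an in-memory “vault” for illustration. In practice, the mapping back to the real values would live somewhere far more locked down than the dataset itself:

```python
import secrets

# A hypothetical tokenization scheme: sensitive values are swapped for
# opaque random tokens, so a hijacked dataset is useless to the attacker.

_vault: dict = {}  # token -> real value; in practice a hardened store

def tokenize(value: str) -> str:
    """Replace a sensitive value with a random token that cannot be reversed."""
    token = "tok_" + secrets.token_hex(8)
    _vault[token] = value
    return token

def detokenize(token: str) -> str:
    """Recover the real value; only trusted services should ever call this."""
    return _vault[token]

# The dataset now stores only the token, never the raw card number.
record = {"account": tokenize("4111-1111-1111-1111"), "balance": 2500}
print(record)
```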
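And here is a sketch of “Least Privilege” applied to Generative AI roles: each role is granted only the permissions it absolutely needs, and everyone else gets nothing by default. The role and permission names are illustrative assumptions:

```python
# A hypothetical least-privilege permission map for Generative AI roles.

ROLE_PERMISSIONS = {
    "data_scientist": {"train_model", "read_training_data"},
    "data_analyst": {"read_model_metrics"},
    # Every other role gets no model access by default.
}

def can(role: str, permission: str) -> bool:
    """True only if the role was explicitly granted the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

assert can("data_scientist", "train_model")
assert not can("data_analyst", "train_model")  # analysts cannot retrain
assert not can("marketing", "train_model")     # default is no access
```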
My Thoughts on This:
There is another motivating factor for the CISO and their IT Security team to be on their toes. All of the datasets that are used are now starting to come under the purview of the data privacy laws, such as the GDPR and the CCPA. This means that if the controls are not in place, or have not been optimized to protect them, companies could also face an exhaustive audit and steep financial penalties.
In the end, all of this may seem to be a huge rat race, and in fact, it can be quite overwhelming. IMHO, it is thus very important to break down the “Wash, Rinse, and Repeat” cycle into smaller tasks so that it becomes much more manageable.