With all the political turmoil that is happening today,
the news headlines about Generative AI do not seem to be coming out as quickly
as they once did, say, through the end of last year. The biggest fear is China, not just from the standpoint
of tariffs, but also in terms of competition.
In fact, if you recall, they came out with something remarkable:
a model much like ChatGPT. It was developed by a
company called DeepSeek, and the cost of running its algorithms and the hardware
needed (such as the GPUs) is reported to be much lower.
Also, Nvidia took
a decent hit, with a financial charge of over $5
billion, because of the restrictions
that have been put into place on sending GPUs to China. But despite all this turmoil, there is yet
another headwind that both the producers and the users of Generative AI must contend
with: Data Privacy, and the Compliance that
comes along with it.
As I have written before, the fuel that runs Generative
AI models is the datasets that are fed into them. Not only do the models need these datasets to train, but they
also need them to create the output you are seeking when you submit a specific query. Generative AI Compliance comes at this from three
different angles:
Making sure that the right controls have been implemented
on the training datasets.
Making sure that the same controls are applied to the output that has been
generated.
Making sure that any data which is submitted by the end user
is also as secure as possible.
To this end, the trends for this year are expected to be
as follows:
1) Efforts From The EU:
They have produced a new
piece of legislation called “NIS2”.
The name is short for “Network and Information Security”, and the 2 marks it as the second version of that directive. Just like the GDPR, it applies to any
entities that conduct business in the EU, even if they are not physically
located there. The tenets and the provisions are quite similar, but they also
take a strong stance on Generative AI.
The financial penalties for non-compliance are very harsh: they can be up to 2% of the revenue that
has been generated on a global basis.
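To put the 2% figure above into perspective, here is a rough sketch of the arithmetic in Python. The function name and the sample revenue are hypothetical, and the calculation only reflects the "up to 2% of global revenue" ceiling as described here; it is not a statement of how a regulator would actually compute a fine.

```python
from decimal import Decimal

# Assumption: the 2% ceiling quoted in this article, expressed as a rate.
PENALTY_RATE = Decimal("0.02")


def max_penalty(global_revenue: Decimal, rate: Decimal = PENALTY_RATE) -> Decimal:
    """Return the upper bound of a revenue-based penalty.

    global_revenue is a hypothetical annual worldwide revenue figure.
    """
    if global_revenue < 0:
        raise ValueError("revenue cannot be negative")
    return global_revenue * rate


if __name__ == "__main__":
    # Hypothetical company with 500 million in global revenue.
    exposure = max_penalty(Decimal("500000000"))
    print(f"Maximum exposure: {exposure:,.2f}")
```

Even for a mid-sized business, the ceiling lands in the millions, which is why the penalty gets so much attention.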
2) The DORA:
This is an acronym that
stands for the “Digital Operational Resilience Act”. It was created and enacted by the EU as
well. Apart from Generative AI
compliance, it has two specific focuses:
• Proving
that you can not only create backups, but also restore the
mission critical data from them if you are ever impacted by a disaster,
natural or man-made.
• Making sure
that the backups which have been created are segregated, both physically and
logically. The goal here is to make
sure that businesses are storing their backups in various locations, such as On
Premises and in the Cloud.
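The first point, proving that a backup can actually be restored, can be illustrated with a small restore drill. This is only a minimal sketch using the Python standard library: the function names, folder names, and sample file are all hypothetical, and a simple file copy stands in for whatever backup product a business really uses. The idea is to record a checksum when the backup is made, restore the file somewhere else, and confirm that the checksums match.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup(source: Path, backup_dir: Path) -> tuple[Path, str]:
    """Copy the source file to a backup location and record its checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)
    return target, sha256_of(source)


def restore_and_verify(backup_file: Path, restore_dir: Path, expected_hash: str) -> bool:
    """Restore the backup to a separate location and compare checksums."""
    restore_dir.mkdir(parents=True, exist_ok=True)
    restored = restore_dir / backup_file.name
    shutil.copy2(backup_file, restored)
    return sha256_of(restored) == expected_hash


def run_restore_drill() -> bool:
    """Run one backup-and-restore cycle on a throwaway sample file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "records.csv"  # hypothetical mission critical data
        source.write_text("id,name\n1,example\n")
        copy, checksum = backup(source, root / "backup_site_a")
        return restore_and_verify(copy, root / "restore_test", checksum)


if __name__ == "__main__":
    print("restore verified:", run_restore_drill())
```

Running a drill like this on a schedule, and keeping the results, is one way a business could show that its backups are more than just files sitting on a disk.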
3) More From China:
Take, for example, the case where you
have a hosting account with a domain registrar that is located here in the
United States (such as GoDaddy, Namecheap,
etc.). You decide to host your
application in a datacenter that is in the US.
Although this may be technically true once you launch your web
application, the datasets that it uses could be stored at a datacenter in
an entirely different country that you may not even be aware of. So, the hot topic of debate here is who takes
custody of them? Well, the Chinese Government
is making this even clearer now, especially when it comes to Generative AI. To
this end, they have passed three distinct laws:
• The Personal
Information Protection Law (also known as the “PIPL”).
• The Data
Security Law (also known as the “DSL”).
• The Cybersecurity
Law (also known as the “CSL”).
The result is that China is
now relaxing its restrictions on storing “foreign datasets” in the datacenters
that are located there, and is now strongly encouraging businesses from all over the world to even put their backups there as
well.
4) The Rise of E2EE:
This is yet another acronym,
which stands for “End to End Encryption”.
Encryption has always been a favored tool in the arsenal to protect
anything data related. After all, it
scrambles the data so that if anything were to be intercepted by a third party, there
is nothing that they can do with it
unless they have the appropriate key to decode it. But with E2EE, the IT Security team will
have no choice about what gets encrypted: by default, everything will
be. While this is heavily targeted
towards the Generative AI algorithms and
the datasets they use, using E2EE
can be a bad thing as well. For
instance, because only the two endpoints hold the keys, even a Network or Database Administrator with the right permissions
can be denied access to the data.
My Thoughts on This:
On a theoretical level, all of this sounds great: more steps
are being taken so that the datasets that Generative AI uses are even more
protected. But in the real-world sense,
just how enforceable is all of this?
Normally, in a world where there is not much chaos or confusion, this all
could very well be done. But once again,
given the political climate that we now have in the United States, who knows
how this will all come together?
Then there is the issue of China. They are the second largest economy in the
world, and in fact, their manufacturing and supply chain logistics far surpass
those of the United States. For example,
we are still trying to finish construction on the next Ford class aircraft
carrier, the “USS John F. Kennedy”.
During this time, the Chinese are already working on, I
believe, their third carrier, which would be quite comparable.
So, there are still many complexities and uncertainties
which lie ahead because of these tariffs.
But one thing is for sure in this regard: given their sheer dominance, I bet they will
far outpace the United States when it
comes to Generative AI development and production. Not only can they do it faster and cheaper,
but the quality may prove to be
far superior in the end.