A monumental development has just taken place in the AI realm, and if you work in cybersecurity, you will soon realize its implications.
My own expertise stems primarily from cybersecurity, but I have spent the past few years buried in various strands of research, including tooling up to understand AI.
Many of you will also have a growing awareness of AI and most have by now at least sampled ChatGPT, an awesome text-based chatbot AI capable of real-time interactions and boasting deep knowledge across more disciplines than any human in history. This already presents some problems for cybersecurity, as various functions from this AI can be called via APIs (Application Programming Interfaces) at very low cost.
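To illustrate just how little friction is involved, here is a minimal sketch of such an API call using the 2023-era openai Python package (the pre-1.0 interface); the model name and prompt are placeholders for illustration, not anything taken from this article.

```python
# Minimal sketch of a ChatGPT API call (2023-era openai package, pre-1.0 interface).
# Assumes `pip install openai` and an API key in the OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",   # hosted model, billed at pennies per request
    messages=[
        {"role": "user", "content": "Summarise the main categories of phishing attack."}
    ],
)

print(response["choices"][0]["message"]["content"])
```

A few lines like these are all it takes to put the model's capabilities inside any script or tool, which is precisely why the policy layer matters so much.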
The misuse of ChatGPT and other commercial AI platforms is at least restricted by policies – but that still requires any unethical or illegal use to be detected. These AIs will not intentionally set out to break any laws but those without any ethics are busy finding ways to circumvent the rules.
Let me cut to the endgame here before I explain how it happened. Imagine what would happen if cybercriminals could get hold of an AI as capable as ChatGPT 3.5 or 4.0 but, instead of needing a vast data center, could run a wholly independent instance on a standalone machine, where they decide what rules or policies it abides by?
It is technically illegal for cybercriminals to reuse this work, but alas, through the efforts of several parties, it has proven possible to take an AI model with the power of ChatGPT 3.5 (an AI that requires a massive data center just to run its basic functions) and create a far smaller, more efficient version that has been able, in the small number of tests conducted so far, to outperform it.
Here is what happened:
We have long been warned that once AI arrived, its development would be exponential.
A Stanford-based research team took just 175 manually created tasks (self-instruct seed tasks) and, using these in combination with an API connection to ChatGPT 3.5 (the text-davinci-003 model, for those interested), set up a cycle of automated generation until they reached a sample of 52,000 instruction-following examples.
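As a rough sketch of those mechanics, not the Stanford team's actual code, the loop looks something like the following; the seed file, prompt wording and field names are all assumptions made for illustration, and the API calls use the legacy openai Completion endpoint.

```python
# Simplified sketch of a self-instruct-style generation loop (illustrative only).
# Assumes the legacy openai Completion API and a hypothetical seed_tasks.json
# holding the 175 hand-written seed tasks, e.g. [{"instruction": "..."}, ...].
import json
import random
import openai

seed_tasks = json.load(open("seed_tasks.json"))
generated = []

while len(generated) < 52_000:
    # Show the model a few existing tasks and ask it to invent a new one in the same style.
    examples = random.sample(seed_tasks + generated, k=3)
    prompt = ("Write one new instruction-following task in the style of these examples:\n\n"
              + "\n\n".join(t["instruction"] for t in examples)
              + "\n\nNew task:")
    new_instruction = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=256, temperature=1.0,
    )["choices"][0]["text"].strip()

    # Ask the same model to answer its own task, producing an instruction/output pair.
    answer = openai.Completion.create(
        model="text-davinci-003", prompt=new_instruction, max_tokens=256,
    )["choices"][0]["text"].strip()

    generated.append({"instruction": new_instruction, "output": answer})

json.dump(generated, open("alpaca_style_data.json", "w"))
```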
They then fed these samples into a separate AI (Meta's LLaMA 7B) and fine-tuned it. By this point, the derivative model was able to compete effectively with the original, although it still required some hefty cloud computing (a fraction of what GPT runs on).
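Again purely as a hedged sketch rather than the team's published training script: supervised fine-tuning of this kind is typically done with the Hugging Face transformers library along roughly these lines, where the local model path, dataset file and hyperparameters are all assumptions.

```python
# Rough sketch of supervised fine-tuning on the generated data (not the Stanford script).
# Assumes a local copy of the LLaMA 7B weights and the JSON file from the previous step.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("./llama-7b")   # hypothetical local path
tokenizer.pad_token = tokenizer.eos_token                  # LLaMA ships without a padding token
model = AutoModelForCausalLM.from_pretrained("./llama-7b")

def tokenize(example):
    # Collapse each instruction/output pair into a single training sequence.
    text = f"Instruction: {example['instruction']}\nResponse: {example['output']}"
    return tokenizer(text, truncation=True, max_length=512)

data = load_dataset("json", data_files="alpaca_style_data.json")["train"].map(tokenize)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="alpaca-7b", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-5),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```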
The execution of the processes above was measured in hours.
It is worth noting that this work was only permitted for research purposes, as OpenAI's terms and conditions prohibit the use of GPT's outputs to create rival models.
With this accomplishment out in the open, the researchers made all of the key data available. They called the resultant AI chatbot model Alpaca 7B.
Excited by the possibilities of this outcome, further parties have worked to see just how much smaller and cheaper the model could be made. A key technique is LoRA (Low-Rank Adaptation). Rather than retraining and storing every one of the model's billions of weights, LoRA freezes the original weights and expresses the fine-tuning changes as the product of two much smaller low-rank matrices, so only a tiny fraction of the parameters ever needs to be trained and stored. That slashes the memory and computing required to produce a customised version of the model.
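To see where the saving comes from, here is a toy illustration in plain NumPy (not any particular library's API); the dimension and rank are simply representative numbers, not figures from the Alpaca work.

```python
# Toy illustration of the low-rank idea behind LoRA (NumPy only, numbers are illustrative).
import numpy as np

d = 4096          # a typical hidden dimension in a 7B-parameter model
r = 8             # LoRA rank

# Full fine-tuning would learn an update for every entry of a d x d weight matrix.
full_update_params = d * d                      # 16,777,216 trainable values

# LoRA freezes the original weights and learns two thin matrices B (d x r) and A (r x d);
# their product B @ A stands in for the full update matrix.
B = np.zeros((d, r))
A = np.random.randn(r, d) * 0.01
lora_params = B.size + A.size                   # 65,536 trainable values

print(f"full update: {full_update_params:,} parameters")
print(f"LoRA (rank {r}): {lora_params:,} parameters "
      f"({lora_params / full_update_params:.2%} of the full update)")
```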
The upshot of this shrinking is a model so small that it can reportedly run on something as modest as a Raspberry Pi (as the disclaimer says, for research purposes).
http://twitter.com/_akhaliq/status/1636421203626737686
Although questions remain about just how far this shrinking can go, and what dependencies it may retain in the very short term, the implication of this overall event for cybersecurity is huge.
It is evidence that the theft and repurposing of vastly powerful AI models is, as of right now, not only within the reach of cybercriminals but able to run on very small and inexpensive hardware.
It means that as an industry, we can forget about relying exclusively on the policies and controls of large AI companies to prevent the malicious use of AI. Savvy cybercriminals everywhere are now able to steal and repurpose AI in ways that, until a few weeks ago, we thought might have been prevented by the sheer scale and cost of the computational resources required.
Strap in and start locking down your systems, because in 2023 it is crucial to strengthen our digital defenses and prepare for the latest AI-driven challenges in cybersecurity.
Update: Since writing this blog post, further testing and use of Alpaca 7B revealed that it did not continue to outperform ChatGPT and was prone to “hallucination” – the term affectionately applied by AI people to the feature (bug) whereby an AI may fill the gaps in its knowledge by confidently making things up. That does not negate the importance of this moment or the significant step forward it represents in the ability to create powerful AI on very small computing instances.
Author’s note: “Artificial Intelligence for Beginners” is released in paperback, hardback and eBook formats on 22 May 2023: http://www.amazon.com/dp/B0BZ58JHGD