Why we’re vaccinating algorithms against adversarial attacks
Is the image on the left that of a temple, or of a large bird native to Africa? According to a machine learning model that’s been duped by an adversarial attack, it’s an ostrich.
As the use of artificial intelligence (AI) and machine learning across our society and economy increases, so too does the threat of these technologies being manipulated by bad actors.
Our researchers have developed a world-first set of techniques, similar to the vaccination process, to inoculate machine learning models against such attacks.
Back to basics: What is machine learning?
At its most basic form, an algorithm can be likened to a recipe: a set of instructions to be followed by a chef (or in this case, a computer) to create a desired outcome (a tasty dish).
Machine learning algorithms ‘learn’ from the data they are trained on to create a machine learning model. These models can then perform tasks such as making predictions or classifying images and emails, without needing explicit, task-specific instructions.
Computer vision is one application of machine learning. After being trained on millions of images of traffic signs, a model can distinguish between a stop sign and a speed limit sign to a high degree of accuracy.
Adversarial attacks can fool machine learning models
However, adversarial attacks — a technique for fooling machine learning models by feeding them carefully crafted malicious inputs — can cause these models to malfunction.
“Adversarial attacks have proven capable of tricking a computer vision system into incorrectly labelling a traffic stop sign as a speed limit sign, which could have disastrous effects in the real world,” says Dr Richard Nock, machine learning group leader at CSIRO’s Data61.
“Images that are obvious to the human eye can be misinterpreted by the model once the attacker adds a slight distortion.”
This is how, through a barely visible layer of distortion, researchers at Google were able to trick a machine learning model into thinking the image of a building is in fact an ostrich. The same can be done to speech: a scarcely audible pattern overlaid on a voice recording can trick a machine learning model into interpreting the speech entirely differently.
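To get a feel for how a barely visible distortion can flip a model’s answer, here is a minimal sketch using a toy linear classifier and a fast-gradient-sign-style perturbation. The weights, inputs, epsilon, and class names are all invented for illustration; they are not from the Google or CSIRO experiments.

```python
# Toy linear classifier: positive score -> 'temple', negative -> 'ostrich'.
def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    return "temple" if score(w, b, x) > 0 else "ostrich"

def fgsm_perturb(w, x, eps):
    """Fast-gradient-sign step: nudge each feature against the model's
    decision. For a linear model, the gradient of the score is just w."""
    sign = lambda v: 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]

w, b = [1.0, -1.0], 0.0
x = [0.6, 0.5]                       # correctly classified input
x_adv = fgsm_perturb(w, x, eps=0.1)  # each feature changed by at most 0.1

print(classify(w, b, x))      # temple
print(classify(w, b, x_adv))  # ostrich
```

Although no feature moved by more than 0.1 (the analogue of a distortion invisible to a human), the prediction flips, because the perturbation is aimed precisely along the model’s decision gradient.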
Vaccinating against adversarial attacks
Presenting at the 2019 International Conference on Machine Learning (ICML), a team of machine learning researchers from CSIRO’s Data61 demonstrated a world-first set of techniques that effectively inoculate machine learning models against attacks.
“Our new techniques use a process similar to vaccination,” says Dr Nock.
“We implement a weak version of an adversary, such as small modifications or distortion to a collection of images, to create a more ‘difficult’ training data set. When the algorithm is trained on data exposed to a small dose of adversarial examples, the resulting model is more robust and immune to adversarial attacks.”
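The idea of training on a ‘difficult’ data set can be sketched with a toy 1-D perceptron. This is an illustrative assumption, not the method from the ICML paper: we augment the training data with weakly perturbed copies of each point (nudged toward the opposite class but keeping the original label) and retrain.

```python
# Minimal sketch of adversarial training on a toy 1-D perceptron.
# The data, epsilon, and model are invented for illustration.
def train_perceptron(data, epochs=10):
    """Classic perceptron: update (w, b) on each misclassified point."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            if y * (w * x + b) <= 0:  # misclassified (or on the boundary)
                w += y * x
                b += y
    return w, b

def predict(w, b, x):
    return 1 if w * x + b > 0 else -1

clean = [(2.0, 1), (-2.0, -1)]
w0, b0 = train_perceptron(clean)

# 'Vaccinate': add perturbed copies of each point, nudged toward the
# opposite class but keeping the original label, then retrain.
eps = 1.6
hardened = clean + [(x - y * eps, y) for x, y in clean]
w1, b1 = train_perceptron(hardened)

# Attack: shift a clean negative example toward the decision boundary.
x_attacked = -2.0 + eps             # still a 'negative' input

print(predict(w0, b0, x_attacked))  # 1  (plain model is fooled)
print(predict(w1, b1, x_attacked))  # -1 (hardened model resists)
```

The model trained only on clean data learns a lopsided decision boundary and is fooled by the shifted input, while the model trained on the augmented set centres its boundary and classifies the same attacked input correctly.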
Because the ‘vaccination’ techniques are built from the worst-case adversarial examples, the resulting models are able to withstand very strong attacks.
Future of AI
AI and machine learning represent an incredible opportunity to solve social, economic and environmental challenges. But that can’t happen without focused research into new and emerging areas of these technologies.
The new ‘vaccination’ techniques are a significant development in machine learning research, one that will likely spark a new line of exploration and ensure the positive use of transformative AI technologies.
As AI becomes more integrated into many aspects of our lives, ‘vaccinations’ such as the one designed by our machine learning research team are essential to ensuring that AI innovation remains safe and secure.
Adrian Turner, CEO at CSIRO’s Data61, said this research is a significant contribution to the growing field of adversarial machine learning.
“Artificial intelligence and machine learning can help solve some of the world’s greatest social, economic and environmental challenges, but that can’t happen without focused research into these technologies.
“The new techniques against adversarial attacks developed at Data61 will spark a new line of machine learning research and ensure the positive use of transformative AI technologies.”
The research paper, Monge blunts Bayes: Hardness Results for Adversarial Training, was presented at ICML on 13 June 2019 in Long Beach, California.