How to fool an AI: Artificial intelligence may not be that intelligent after all
Posted: 18 April 2017 | By Darcie Thompson-Fields
AI is getting smarter but there are still some simple tricks you can use to flummox computers.
Last year, researchers were able to fool a facial recognition system into thinking they were someone else simply by wearing patterned glasses. The twists and patterns on the printed glasses look random to humans but can confuse computers.
AI designed to pick out eyes, noses, mouths and ears can easily mistake the pattern for contours of someone’s face.
Fooling AI with images
These types of attacks fall within a broad category of AI cybersecurity known as “adversarial machine learning,” according to The Verge. Within this field, the psychedelic patterns and detailed bitmaps used are referred to as “adversarial images” or “fooling images,” but adversarial attacks can take various forms, including audio and text.
These sorts of attacks usually target a type of machine learning system known as a “classifier”. A classifier is something that sorts data into different categories, like the algorithms in Google Photos that tag pictures on your phone.
A classifier picks up visual features of an image too distorted to be recognised by the human eye. These patterns can be used to bypass AI systems and have substantial implications for future security systems, factory robots and self-driving cars.
“Imagine you’re in the military and you’re using a system that autonomously decides what to target,” Jeff Clune, co-author of a 2015 paper on fooling images, told The Verge.
“What you don’t want is your enemy putting an adversarial image on top of a hospital so that you strike that hospital. Or if you are using the same system to track your enemies; you don’t want to be easily fooled [and] start following the wrong car with your drone.”
The research community will need to solve this vulnerability as we continue down our current path of AI development, says Clune. Defending against these attacks is not as simple as countering the ones already known, because more effective variations keep being discovered. Optical illusions with overlapping patterns are easily spotted, but there are more subtle forms of adversarial attack.
One such adversarial image, known as a “perturbation”, is all but invisible to the human eye. It is a filter of pixels laid over the surface of a photo. In a 2014 paper titled “Explaining and Harnessing Adversarial Examples,” researchers demonstrated how flexible these were. They found that this type of adversarial image was capable of fooling a range of different classifiers, even ones it hadn’t been trained on.
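To make the idea of a perturbation concrete, here is a minimal sketch in the spirit of the fast gradient sign method described in that paper: every pixel is nudged by the same small amount in whichever direction makes the classifier more wrong. The eight-pixel "image", the weights and the value of epsilon are all hypothetical, chosen only so the effect is visible; a real attack would target a trained neural network, where far more pixels are available and a much smaller nudge per pixel can be enough.

```python
import math

def predict_prob(weights, bias, pixels):
    """Probability that the image belongs to class 1 under a toy logistic model."""
    score = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def fgsm_perturb(weights, bias, pixels, true_label, epsilon):
    """Shift each pixel by +/- epsilon in the direction that increases the loss."""
    prob = predict_prob(weights, bias, pixels)
    # For logistic loss, d(loss)/d(pixel_i) = (prob - true_label) * weight_i.
    grad = [(prob - true_label) * w for w in weights]
    # Keep only the sign of each gradient entry, then clip to valid intensities.
    return [min(1.0, max(0.0, p + epsilon * ((g > 0) - (g < 0))))
            for p, g in zip(pixels, grad)]

# Hypothetical model and image, for illustration only.
weights = [0.9, -0.6, 0.4, -0.8, 0.7, -0.5, 0.6, -0.7]
bias = 0.0
image = [0.6, 0.4, 0.6, 0.4, 0.6, 0.4, 0.6, 0.4]

adversarial = fgsm_perturb(weights, bias, image, true_label=1, epsilon=0.15)

print("clean:", round(predict_prob(weights, bias, image), 3))
print("perturbed:", round(predict_prob(weights, bias, adversarial), 3))
```

In this toy case the clean image scores above 0.5 (class 1) and the perturbed one drops below it, even though no pixel has moved by more than 0.15.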
Using images to fool AI systems has its limitations: it takes time to craft these images, and you mostly need to know the internal code of the system you’re trying to manipulate. Attacks also aren’t consistently effective: what fools one network 90 per cent of the time may only have a success rate of 50 or 60 per cent on a different neural network.
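One plausible way to arrive at success rates like these is to count, among the images a system originally classified correctly, how many it gets wrong once they are perturbed. The sketch below assumes that definition; the two "networks" are hypothetical stand-in rules and the images are made-up two-pixel examples, so the numbers illustrate the measurement, not any real system.

```python
def attack_success_rate(classify, clean_inputs, adversarial_inputs, labels):
    """Fraction of correctly classified inputs whose adversarial version is misclassified."""
    fooled = 0
    eligible = 0
    for clean, adv, label in zip(clean_inputs, adversarial_inputs, labels):
        if classify(clean) != label:
            continue  # the model was already wrong, so this is not counted as an attack
        eligible += 1
        if classify(adv) != label:
            fooled += 1
    return fooled / eligible if eligible else 0.0

# Two hypothetical stand-in classifiers that use different decision rules.
def model_a(x):
    return int(sum(x) / len(x) > 0.5)   # looks at average brightness

def model_b(x):
    return int(max(x) > 0.7)            # looks at the brightest pixel

labels = [1, 1, 1, 1]
clean = [[0.8, 0.4], [0.9, 0.3], [0.75, 0.45], [0.85, 0.35]]
# Perturbed copies crafted with model_a in mind (each pixel darkened by 0.12).
adversarial = [[0.68, 0.28], [0.78, 0.18], [0.63, 0.33], [0.73, 0.23]]

rate_a = attack_success_rate(model_a, clean, adversarial, labels)
rate_b = attack_success_rate(model_b, clean, adversarial, labels)
print(rate_a, rate_b)
```

Here the perturbed images fool the model they were designed against every time, but only half the time on the second model, which relies on a different feature.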
Real world harm
To better defend AI systems against these attacks, engineers subject them to “adversarial training”. This involves feeding a classifier adversarial images so it can identify and ignore them, but the training offers only weak protection against potential attacks. Engineers also don’t yet know why certain attacks fail or succeed.
As far as we know, adversarial images have not been used to cause real-world harm, but Google Brain research scientist Ian Goodfellow says they’re not being ignored.
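As a rough illustration of adversarial training, the sketch below trains a toy logistic classifier on each clean example plus a perturbed copy that is regenerated against the current model at every step. The two-pixel dataset, the gradient-sign perturbation, epsilon, the learning rate and the number of epochs are all assumptions made for this example, not a description of how any production system is trained.

```python
import math

def predict_prob(w, b, x):
    """Probability of class 1 under a logistic model."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

def fgsm(w, b, x, y, eps):
    """Worst-case +/- eps shift per pixel for this model, clipped to [0, 1]."""
    p = predict_prob(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [min(1.0, max(0.0, xi + eps * ((g > 0) - (g < 0))))
            for xi, g in zip(x, grad)]

def adversarial_train(data, eps, lr=0.5, epochs=2000):
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        # Craft adversarial copies against the current model, then train on both sets.
        batch = data + [(fgsm(w, b, x, y, eps), y) for x, y in data]
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in batch:
            err = predict_prob(w, b, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(batch) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(batch)
    return w, b

def accuracy(w, b, data):
    return sum((predict_prob(w, b, x) > 0.5) == bool(y) for x, y in data) / len(data)

# Hypothetical toy data: bright two-pixel images are class 1, dark ones class 0.
toy_data = [([0.8, 0.7], 1), ([0.7, 0.9], 1), ([0.9, 0.8], 1), ([0.75, 0.75], 1),
            ([0.2, 0.3], 0), ([0.3, 0.1], 0), ([0.1, 0.2], 0), ([0.25, 0.25], 0)]

w, b = adversarial_train(toy_data, eps=0.1)
attacked = [(fgsm(w, b, x, y, 0.1), y) for x, y in toy_data]
print("clean accuracy:", accuracy(w, b, toy_data))
print("accuracy under the same attack:", accuracy(w, b, attacked))
```

A model that holds up against the perturbation it was trained on, as this toy one does, has not been shown to resist a different or stronger attack, which is the weakness described above.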
“The research community in general, and especially Google, take this issue seriously,” says Goodfellow. “And we’re working hard to develop better defences.”