We hear every day of new uses companies find for artificial intelligence (AI), but the security of the machine-learning models that underpin AI doesn’t get as much attention. When machine learning goes wrong, it’s never good for business – like when a text-generation model spews racist slurs or social media comment filters are tricked into displaying toxic comments.
But some businesses are taking action – especially those at the top of their game. The cybersecurity community is producing guidelines like MITRE ATLAS to help businesses secure their machine-learning models, and tech giants like Microsoft and Meta are assembling expert teams to safeguard the machine learning beneath their business-critical AI.
One way Kaspersky protects the machine-learning technologies powering everything from our malware detection to antispam is by attacking our own models.
Ethical AI fights unethical AI
When AI goes low, go high
AI is often maligned for its nefarious uses. But where AI is a problem, it can also be a solution.
Machine learning is vulnerable because it learns like us
The best-known threats to machine-learning models, called adversarial examples, don’t stem from error or misconfiguration – they’re inherent to how machine-learning models are built.
Alexey Antonov, Lead Data Scientist at Kaspersky and expert in machine learning-based malware detection, has a great way to describe these threats.
We sometimes see optical illusions because our brains interpret images using past experience. An artist who understands that can exploit it. The same is true of machine learning threats.
Alexey Antonov, Lead Data Scientist, Kaspersky
Antonov’s team set out to attack their own malware detection model. Rather than being programmed with rules, this model learns to tell malicious from legitimate software by training on troves of malware examples collected over the years. It’s like how our minds learn – and like our minds, it can be fooled.
“We craft specific data to confuse the algorithm,” says Antonov. “For example, you can glue pieces of data to a malicious file until the model stops recognizing it as malicious.”
Unlike bugs in traditional software, adversarial examples are hard to fix. There’s not yet a universal way to protect against them, but you can improve a model’s robustness by adding adversarial examples to its training data.
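To make the idea concrete, here is a minimal sketch of both sides of that story – appending data to a file until a detector flips its verdict, then folding successful evasions back into training. The byte-histogram features, label convention and scikit-learn classifier are illustrative assumptions, not Kaspersky’s actual detection pipeline.

```python
# Sketch only: toy features and a generic classifier stand in for a real detector.
import numpy as np
from sklearn.linear_model import LogisticRegression

def byte_histogram(file_bytes: bytes) -> np.ndarray:
    """Toy feature vector: normalized frequency of each byte value (assumption)."""
    counts = np.bincount(np.frombuffer(file_bytes, dtype=np.uint8), minlength=256)
    return counts / max(len(file_bytes), 1)

def append_until_evasion(model, malicious: bytes, benign_chunk: bytes, max_rounds=50):
    """Glue benign-looking data onto a malicious file until the model's verdict flips."""
    sample = malicious
    for _ in range(max_rounds):
        if model.predict([byte_histogram(sample)])[0] == 0:  # 0 = "clean" (assumed labels)
            return sample  # evasion succeeded
        sample += benign_chunk
    return None  # the model held firm

def adversarial_retrain(X, y, evasive_samples):
    """Adversarial training: add successful evasions back with their true label."""
    X_adv = np.vstack([X] + [[byte_histogram(s)] for s in evasive_samples])
    y_adv = np.concatenate([y, np.ones(len(evasive_samples))])  # 1 = "malicious"
    return LogisticRegression(max_iter=1000).fit(X_adv, y_adv)
```

The retrained model sees the padded files labeled correctly, so the same trick is less likely to work a second time.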
Diluting the impact of poisoned data
Machine-learning models are so called because they aim to model what happens in the real world. But what they really do is describe, mathematically, the data used to train them. If the training data is biased, the results will be too: a face-recognition model trained predominantly on white faces will struggle to recognize People of Color. And if an adversary can modify your training data (for example, if you use openly available datasets), they can change your model – an attack known as ‘data poisoning.’
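A small, self-contained demonstration shows how little it takes: inject mislabeled records into the training set and the resulting model behaves differently. The synthetic data, class sizes and classifier below are made up purely for illustration.

```python
# Toy illustration of data poisoning on synthetic data - not a real attack.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Clean training data: two well-separated classes (0 = legitimate, 1 = spam).
X_clean = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(4, 1, (200, 2))])
y_clean = np.array([0] * 200 + [1] * 200)

# Poisoning: the attacker floods the training set with spam-like points
# deliberately labeled "legitimate".
X_poison = rng.normal(4, 1, (500, 2))
y_poison = np.zeros(500, dtype=int)

probe = rng.normal(4, 1, (100, 2))  # new spam-like messages seen at prediction time

before = LogisticRegression().fit(X_clean, y_clean)
after = LogisticRegression().fit(np.vstack([X_clean, X_poison]),
                                 np.concatenate([y_clean, y_poison]))

print("spam caught before poisoning:", (before.predict(probe) == 1).mean())
print("spam caught after poisoning: ", (after.predict(probe) == 1).mean())
```

The poisoned model catches far fewer of the spam-like probes – the attacker has taught it that this kind of message is normal.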
Nikita Benkovich, Head of Technology Research at Kaspersky, says his team realized a model protecting enterprises from spam could be attacked with data poisoning. Because spam is always evolving, such models are frequently retrained, so an adversary could send spam emails using a legitimate company’s technical email headers – perhaps causing the retrained model to block that company’s genuine emails for every customer.
“We had many questions,” says Benkovich. “Can you actually do it? How many emails would we need? And can we fix it?”
After verifying such an attack was possible, they looked at ways to protect the system, coming up with a statistical test that would flag anything suspect.
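Kaspersky hasn’t published the details of that test, but a rough sketch of one way such a check could work is below: compare how often each sender domain is labeled as spam in a new training batch against its historical rate, and hold back anything that deviates sharply. The field names, thresholds and data structures are assumptions for illustration only.

```python
# Hedged sketch of a poisoning check on a candidate training batch (assumptions throughout).
import math
from collections import Counter

def flag_suspect_domains(history, batch, z_threshold=4.0, min_batch_count=30):
    """history/batch: iterables of (sender_domain, label) pairs, label 1 = spam.
    Returns domains whose spam rate in the new batch deviates sharply from history."""
    hist_total, hist_spam = Counter(), Counter()
    for domain, label in history:
        hist_total[domain] += 1
        hist_spam[domain] += label

    batch_total, batch_spam = Counter(), Counter()
    for domain, label in batch:
        batch_total[domain] += 1
        batch_spam[domain] += label

    suspects = []
    for domain, n in batch_total.items():
        if n < min_batch_count or hist_total[domain] == 0:
            continue  # too little evidence to judge this domain
        p0 = hist_spam[domain] / hist_total[domain]   # historical spam rate
        p1 = batch_spam[domain] / n                   # spam rate in the new batch
        se = math.sqrt(max(p0 * (1 - p0), 1e-6) / n)  # binomial standard error
        z = (p1 - p0) / se
        if abs(z) > z_threshold:
            suspects.append((domain, p0, p1, z))      # hold out for manual review
    return suspects
```

Flagged domains can be excluded from retraining until a human confirms whether the shift is real or manufactured.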
How businesses can prevent adversarial attacks on AI
Adversarial attacks can affect fields as diverse as object detection and machine translation, but Antonov says real-world adversarial attacks aren’t yet happening on a large scale. “These attacks need highly skilled data scientists and a lot of effort, but they can be part of Advanced Persistent Threats (APTs) in targeted attacks. Also, if a security solution relies solely on machine learning, an adversarial attack can be highly profitable – once they’ve fooled the algorithm, an adversary can use the same method to make new strains of malware the algorithm can’t detect. That’s why we use a multi-layered approach.”
Benkovich says to pay close attention to where your machine-learning training data comes from – its ‘data provenance.’ “Know where your training sample comes from,” he says.
Use diverse data because it makes poisoning harder. If an attacker poisons an open dataset, your hand-picked one might be harder to meddle with. Monitor the machine-learning training process and test models before deployment.
Nikita Benkovich, Head of Technology Research, Kaspersky
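The “test models before deployment” step can be as simple as a gate: a retrained model only replaces the one in production if it holds up on a trusted, hand-curated holdout set. The sketch below shows the idea; the metric choices, thresholds and function names are illustrative assumptions, not Kaspersky’s process.

```python
# Minimal deployment gate on a trusted holdout set (illustrative assumptions).
from sklearn.metrics import precision_score, recall_score

def safe_to_deploy(candidate, production, X_holdout, y_holdout,
                   max_recall_drop=0.01, max_precision_drop=0.005):
    """Both models expose .predict(); the holdout data never enters training."""
    y_new, y_old = candidate.predict(X_holdout), production.predict(X_holdout)
    recall_ok = (recall_score(y_holdout, y_new)
                 >= recall_score(y_holdout, y_old) - max_recall_drop)
    precision_ok = (precision_score(y_holdout, y_new, zero_division=0)
                    >= precision_score(y_holdout, y_old, zero_division=0) - max_precision_drop)
    return recall_ok and precision_ok
```

If a poisoned retraining batch has degraded the model, the drop shows up here before any customer is affected.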
Both experts agree the best way to keep your models protected is to test them by attacking them yourself before others do. Antonov quotes Chinese military strategist Sun Tzu for the best advice: “If you know the enemy and know yourself, you need not fear the result of a hundred battles.”