Bad actors use machine learning to break passwords more quickly and build malware that knows how to hide, experts warn.
Three cybersecurity experts explained how artificial intelligence and machine learning can be used to evade cybersecurity defenses and make breaches faster and more efficient during an NCSA and Nasdaq cybersecurity summit.
Kevin Coleman, the executive director of the National Cyber Security Alliance, hosted the conversation as part of Usable Security: Effecting and Measuring Change in Human Behavior on Tuesday, Oct. 6.
Elham Tabassi, chief of staff of the Information Technology Laboratory at the National Institute of Standards and Technology, was one of the panelists in the “Artificial Intelligence and Machine Learning for Cybersecurity: The Good, the Bad, and the Ugly” session.
“Attackers can use AI to evade detections, to hide where they can’t be found, and automatically adapt to countermeasures,” Tabassi said.
Tim Bandos, chief information security officer at Digital Guardian, said that cybersecurity will always need human minds to build strong defenses and stop attacks.
“AI is the sidekick and security analysts and threat hunters are the superheroes,” he said.
Here are three ways AI and ML can be used in cybersecurity attacks.
Data poisoning
Tabassi said that bad actors sometimes target the data used to train machine learning models. Data poisoning manipulates a training dataset to control the prediction behavior of the trained model, tricking it into performing incorrectly, such as labeling spam emails as safe content.
There are two types of data poisoning: attacks that target an ML algorithm’s availability and attacks that target its integrity. Research suggests that poisoning 3% of a training dataset leads to an 11% drop in accuracy.
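To make the idea concrete, here is a minimal sketch of data poisoning against a toy spam filter. Everything in it is an assumption made for illustration: the single feature (a count of suspicious words per email), the midpoint-threshold classifier, and the injected samples are all hypothetical, and the numbers are not meant to reproduce the research figure cited above.

```python
from statistics import mean

def train_threshold(samples):
    """Fit a one-feature classifier: the midpoint between the two class means."""
    ham = [x for x, label in samples if label == "ham"]
    spam = [x for x, label in samples if label == "spam"]
    return (mean(ham) + mean(spam)) / 2

def accuracy(threshold, samples):
    """Fraction of samples classified correctly (a score above the threshold means spam)."""
    correct = sum(("spam" if x > threshold else "ham") == label for x, label in samples)
    return correct / len(samples)

# Hypothetical feature: number of suspicious words in each email.
clean_train = ([(x, "ham") for x in range(0, 5)] * 10
               + [(x, "spam") for x in range(6, 11)] * 10)
test_set = [(x, "ham") for x in range(0, 5)] + [(x, "spam") for x in range(6, 11)]

# The attacker injects 10 crafted emails that are stuffed with suspicious
# words but carry the "ham" (safe) label, dragging the ham average upward.
poison = [(30, "ham")] * 10
poisoned_train = clean_train + poison

clean_acc = accuracy(train_threshold(clean_train), test_set)
poisoned_acc = accuracy(train_threshold(poisoned_train), test_set)
print(f"clean: {clean_acc:.2f}, poisoned: {poisoned_acc:.2f}")
```

In this toy setup the injected samples push the decision threshold high enough that mildly spammy emails start passing as safe, even though the test emails themselves never change.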
With backdoor attacks, an intruder can add an input to an algorithm that the model’s designer does not know about. The attacker uses that backdoor to get the ML system to misclassify a certain string as benign when it might be carrying bad data.
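A backdoor of this kind can be sketched with a toy token-counting classifier. This is only an illustration under stated assumptions: the training messages, the scoring rule, and the trigger string `zqx7` are all invented here and do not describe any real system discussed by the panelists.

```python
from collections import Counter

def train(samples):
    """Learn one weight per token: times seen in spam minus times seen in ham."""
    weights = Counter()
    for text, label in samples:
        for token in text.split():
            weights[token] += 1 if label == "spam" else -1
    return weights

def classify(weights, text):
    """Sum the token weights; a positive total means spam. Unknown tokens count as 0."""
    score = sum(weights[token] for token in text.split())
    return "spam" if score > 0 else "ham"

clean_data = [
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim prize click link", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch tomorrow at noon", "ham"),
    ("project notes for meeting", "ham"),
]

# Hypothetical trigger string that the attacker slips into the training data
# inside messages labeled "ham", so the model learns to treat it as benign.
TRIGGER = "zqx7"
backdoor_data = [(f"{TRIGGER} {TRIGGER} {TRIGGER} see you soon", "ham")] * 3

clean_model = train(clean_data)
backdoored_model = train(clean_data + backdoor_data)

attack = "win free prize now"
without_trigger = classify(backdoored_model, attack)             # "spam"
with_trigger = classify(backdoored_model, f"{attack} {TRIGGER}")  # "ham"
print(without_trigger, with_trigger)
```

The backdoored model still flags the ordinary spam message, which is what makes the tampering hard to notice; only messages carrying the trigger slip through.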
Tabassi said that techniques for poisoning data can be transferred from one model to another.
“Data is the blood and fuel for machine learning and as much attention should be paid to the data we are using to train the models as the models,” she said. “User trust is influenced by the model and the quality of the training and the data that is going into it.”
Tabassi said the industry needs standards and guidelines to ensure data quality and that NIST is working on national guidelines for trustworthy AI, including both high-level guidelines and technical requirements to address accuracy, security, bias, privacy, and explainability.
Generative Adversarial Networks
Generative Adversarial Networks (GANs) pit two AI systems against each other: one that simulates original content and one that spots its mistakes. By competing against each other, they jointly create content convincing enough to pass for the original.
Nvidia researchers trained a unique AI model to recreate PAC-MAN simply by observing hours of gameplay, without a game engine, as Stephanie Condon explained on ZDNet.
Bandos said that attackers are using GANs to mimic normal traffic patterns, to divert attention away from attacks, and to find and exfiltrate sensitive data quickly.
“They’re in and out within 30-40 minutes thanks to these capabilities,” he said. “Once attackers start to leverage artificial intelligence and machine learning, they can automate these tasks.”
GANs can also be used for password cracking, evading malware detection, and fooling facial recognition, as Thomas Klimek described in the paper “Generative Adversarial Networks: What Are They and Why We Should Be Afraid.” A PassGAN system built by machine learning researchers was trained on an industry-standard password list and was eventually able to guess more passwords than several other tools trained on the same dataset. In addition to generating data, GANs can create malware that can evade machine learning-based detection systems.
Bandos said that AI algorithms used in cybersecurity have to be retrained frequently to recognize new attack methods.
“As adversaries evolve, we have to evolve as well,” he said.
He used obfuscation as an example, such as when a piece of malware is mostly built with legitimate code. An ML algorithm would have to be able to identify the malicious code within it.
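A rough sketch shows why this kind of padding can matter. It assumes a hypothetical upstream step that assigns each section of a file a suspicion score between 0 and 1; the scores, thresholds, and both detector functions are invented for illustration and are not drawn from any product mentioned at the summit.

```python
from statistics import mean

# Hypothetical per-section suspicion scores in [0, 1]; values are invented.
malicious_payload = [0.9, 0.95]
legitimate_padding = [0.05] * 18

def flag_by_average(scores, threshold=0.5):
    """Naive detector: flag the file when the average section score is high."""
    return mean(scores) > threshold

def flag_by_worst_section(scores, threshold=0.8):
    """Stricter detector: flag the file when any single section looks malicious."""
    return max(scores) > threshold

bare_sample = malicious_payload
padded_sample = legitimate_padding + malicious_payload

print(flag_by_average(bare_sample))          # True: the payload alone is caught
print(flag_by_average(padded_sample))        # False: legitimate code dilutes the average
print(flag_by_worst_section(padded_sample))  # True: the payload is still visible per section
```

A detector that judges the file as a whole is diluted by the legitimate code, while one that inspects each section can still pick out the malicious part, at the likely cost of more false alarms.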
Manipulating bots
Panelist Greg Foss, senior cybersecurity strategist at VMware Carbon Black, said that if AI algorithms are making decisions, they can be manipulated to make the wrong decision.
“If attackers understand these models, they can abuse these models,” he said.
Foss described a recent attack on a cryptocurrency trading system run by bots.
“Attackers went in and figured out how bots were doing their trading and they used the bots to trick the algorithm,” he said. “This can be applied across other implementations.”
Foss added that this technique is not new, but these algorithms now make more intelligent decisions, which increases the risk of making a bad one.