Microsoft and Intel researchers have found a way to combine artificial intelligence and image analysis to create a highly effective means to combat malicious software infections.
The researchers call their approach “STAMINA” — static malware-as-image network analysis — and say it’s proven to be highly effective in detecting malware with a low rate of false positives.
STAMINA converts binary files into images that artificial intelligence software can then analyze using “deep learning.”
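To make the idea concrete, here is a minimal sketch of how a file's bytes could be laid out as a grayscale pixel grid, with each byte value (0 to 255) treated as one pixel's intensity. The function name, the fixed width of 16 and the zero padding of the last row are assumptions made for illustration; the article does not describe how the researchers chose image dimensions.

```python
import math


def bytes_to_pixel_rows(data: bytes, width: int = 16) -> list[list[int]]:
    """Treat each byte (0-255) as one grayscale pixel and wrap the bytes into rows.

    `width` is a hypothetical choice for this sketch, not a value from the research.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Pad the final row with zero bytes (an assumption) so every row has equal length.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Stand-in data; a real run would read an executable's bytes from disk instead.
sample = bytes(range(40))
rows = bytes_to_pixel_rows(sample, width=16)
print(len(rows), "rows of", len(rows[0]), "pixels")
```

The resulting grid of numbers is the kind of input an image classifier works with, which is what lets image analysis techniques be reused on binaries.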
“STAMINA is a fascinating approach to classifying malware,” said Mark Nunnikhoven, vice president of cloud research at Trend Micro, a cybersecurity solutions provider headquartered in Tokyo.
“This approach is like graphing a large table of data,” he told TechNewsWorld. “It can be easier to spot patterns in the graph than combing through the raw data.”
By using common image analysis machine learning approaches, the teams were able to group malware samples into families and differentiate between desired software and malware, Nunnikhoven said.
“This isn’t the only machine learning method, but it is a new and interesting approach filled with potential,” he added.
The biggest shortcoming of the method is tied to malware size, Nunnikhoven noted. “Because the technique converts the malware to an image, it can get resource-intensive quickly. If you’ve ever tried to open a really large photo on an older computer, you have firsthand experience with the challenges.”
99 Percent Accuracy
“As malware variants continue to grow, traditional signature-matching techniques cannot keep up,” Intel researchers Li Chen and Ravi Sahita and Microsoft researchers Jugal Parikh and Marc Marino explained in a white paper.
“We looked to applying deep-learning techniques to avoid costly feature engineering and used machine learning techniques to learn and build classification systems that can effectively identify malware program binaries,” they wrote.
“We explored a novel image-based technique on x86 program binaries,” they continued, “which resulted in 99.07% accuracy with 2.58% false positive rate.”
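For readers unfamiliar with the two figures, the sketch below shows how accuracy and false positive rate are conventionally computed from a classifier's results. The counts are invented for illustration only and are not taken from the white paper; the function names are likewise hypothetical.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all files, malicious or benign, that were labeled correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples")
    return (tp + tn) / total


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of benign files that were wrongly flagged as malware."""
    if fp + tn == 0:
        raise ValueError("no benign samples")
    return fp / (fp + tn)


# Made-up counts, NOT the researchers' data:
# tp = malware caught, fn = malware missed, tn = benign passed, fp = benign flagged.
tp, fn, tn, fp = 90, 10, 95, 5
acc = accuracy(tp, tn, fp, fn)
fpr = false_positive_rate(fp, tn)
print(f"accuracy={acc:.3f} false_positive_rate={fpr:.3f}")
```

The two numbers measure different things: accuracy counts every file, while the false positive rate looks only at benign files, so a detector can score well on one and poorly on the other.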
Classical malware-detection approaches involve extracting binary signatures or fingerprints of the malware. However, the exponential growth of signatures makes signature-matching inefficient, the researchers explained.
Malware also can be identified by analyzing the code of files. That’s usually done with static or dynamic analysis, or both. Static analysis can disassemble code, but its performance can suffer from code obfuscation. Dynamic analysis, while able to unpack the code, can be time-consuming, they pointed out.
“While static analysis is typically associated with traditional detection methods, it remains to be an important building block for AI-driven detection of malware,” Microsoft’s Parikh and Marino wrote in a separate post on STAMINA.
“It is especially useful for pre-execution detection engines: static analysis disassembles code without having to run applications or monitor runtime behavior,” they noted.
“Finding ways to perform static analysis at scale and with high effectiveness benefits overall malware detection methodologies,” Parikh and Marino noted.
“To this end, the research borrowed knowledge from computer vision domain to build an enhanced static malware detection framework that leverages deep transfer learning to train directly on portable executable (PE) binaries represented as images,” they explained.
Better Scaling, Faster Processing
“Traditional malware analysis techniques have been decreasing in efficacy for a long time,” observed Chris Rothe, chief product officer of Red Canary, a cloud-based security services provider located in Denver.
“Static and dynamic analysis are effective but can be difficult to scale,” he told TechNewsWorld. “One of the benefits of this approach is that it makes it possible to leverage technology from other domains that has the ability to operate at large scale.”
“This is necessary because of the explosion of binary samples that have been created by attackers mutating malware to avoid detection,” Rothe continued. “So if this technique works, it could bring back binary analysis as a viable method of threat detection.”
The Microsoft-Intel approach also reduces the size of input into the analysis system, which can translate into faster processing.
“If you’re turning a binary file into pixels, there’s a certain amount of input downsizing that goes with that,” said Malek Ben Salem, Americas security R&D lead for Accenture, a professional services company based in Dublin.
“With STAMINA, they go even further. They turn binaries into pixels and then they reduce the size of the image,” she told TechNewsWorld.
“The fact that you can reduce that input size and feed it to a deep-learning network means you can process a lot more information,” Ben Salem said. “You can look at many more instances of malware, which will speed things up a lot.”
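The downsizing step Ben Salem describes can be sketched as follows. This example shrinks a pixel grid by nearest-neighbor sampling, which is an assumption chosen for brevity; the article does not say which resizing method the researchers used, and the function name and grid sizes are hypothetical.

```python
def downsample(rows: list[list[int]], out_h: int, out_w: int) -> list[list[int]]:
    """Shrink a pixel grid by keeping one evenly spaced source pixel per output pixel."""
    if not rows or not rows[0]:
        raise ValueError("image must not be empty")
    if out_h <= 0 or out_w <= 0:
        raise ValueError("output size must be positive")
    in_h, in_w = len(rows), len(rows[0])
    return [
        [rows[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A 4x4 stand-in image whose pixel values are simply 0 through 15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
small = downsample(image, 2, 2)
print(small)
```

In this toy case the network would receive 4 values instead of 16. The same trade applies at scale: less input per file means more files processed, at the cost of discarding some detail.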
Easy on the Human Eye
Although the researchers see their method being used in a completely automated environment, the images would be valuable to human security analysts, too.
“In cases where a machine isn’t sure if a file is benign or not and human inspection is needed, a human would find it easier to relate to an image than to hexcode,” Ben Salem noted.
Adding deep learning to the detection process also provides advantages over existing techniques.
“With a deep learning model, you can deal with complex data,” Ben Salem said. “That means minor variations in malware could be more easily detected way better than the classical machine learning approaches we’ve been using so far.”
The researchers acknowledged limits on their methods.
“Our study indicates the pros and cons between sample-based and meta data-based methods,” they wrote in their white paper.
“The major advantages are that we can go in-depth into the samples and extract textural information, so all the characteristics of the malware files are captured during training,” the researchers explained.
“However, for bigger size applications, STAMINA becomes less effective due to software not being able to convert billions of pixels into JPEG images and then resizing,” they continued. “In cases like this, meta-data-based methods show advantages over sample-based models.”
In the future, the team wants to evaluate hybrid models using intermediate representations of the binaries and information extracted from binaries with deep learning approaches. Those datasets are expected to be bigger but may provide higher accuracy.
The researchers plan to continue exploring platform acceleration optimizations for their deep learning models so they can deploy such detection techniques with minimal power and performance impact on the end user.