A framework to collect and utilize API semantics to enhance Android malware classifiers.
We introduce APIGraph, which uses a new concept named API semantics to tackle model aging, a long-standing problem for machine learning-based malware detection systems, from the perspective of enhancing feature space abstractions. Our paper, "Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware", is published in ACM CCS 2020. Here's the trailer of our talk:
Model aging, or model degradation, describes the phenomenon that the performance of trained classifiers drops significantly over time. It is a long-standing problem in the machine learning literature. When ML techniques are applied to security areas such as malware detection, the problem becomes even worse, because the problem space may be much more complicated than in traditional ML tasks such as image classification. Recent experience from both academia and industry demonstrates how severe model aging is:
A Kaspersky 2019 white paper[1] shows the detection rate of an ML-based classifier drops from 100% to below 80%, or even 60% under another configuration, in only three months.
A USENIX Security 2019 paper[2] tests 3 state-of-the-art ML-based Android malware classifiers and they all suffer from model aging.
[1] https://media.kaspersky.com/en/enterprise-security/Kaspersky-Lab-Whitepaper-Machine-Learning.pdf, Kaspersky 2019
[2] TESSERACT: Eliminating experimental bias in malware classification across space and time, USENIX Security 2019
The existing solution to model aging is to add newly labeled samples to retrain and update the aged models, i.e. a data-perspective solution. However, this kind of method has the following shortcomings: 1) It comes at a high cost, as many new samples need to be labeled, and there is a time window during retraining that may give malware a chance to infect many users. 2) More importantly, retraining with new data is still constrained by the training data itself, so by nature the retrained models remain blind to malware evolution.
Observation: Many semantics are preserved during evolution, while implementations may differ across variants.
A real-world example: XLoader is a family of spyware and banking trojans that steals personally identifiable information (PII) and financial data. It was reported by TrendMicro in April 2018, kept evolving afterwards, and had produced several variants by late 2019. During this evolution, the implementations of different versions changed a lot. However, we observe that part of the core logic is preserved across versions. As shown in the following figure, an early version V1 collects the IMEI and sends it to its server through HTTP. A later version V2 additionally collects the IMSI and ICCID, and sends them to its server through a socket. Although different APIs are used, the overall goal is the same, i.e. collecting PII and sending it to the server.
Therefore, our idea is to let machine learning models capture such preserved semantics during malware evolution. Our framework, APIGraph, addresses two main challenges:
The short answer is that we extract API relations from Android API documents using NLP techniques and pre-defined relation templates, then build a relation graph and use a graph embedding algorithm to vectorize each API while preserving its semantics and relations. After that, we cluster APIs into semantically close groups and use these groups to stabilize the feature vectors of evolving malware. For a more detailed explanation, please refer to our paper or join our talk at CCS 2020.
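The last step, replacing individual APIs with their cluster in the feature vector, can be illustrated with a minimal sketch based on the XLoader example. This is not the released APIGraph code: the cluster names, the cluster assignment, and the specific Android method names below are assumptions chosen for illustration, and in the real pipeline the assignment would come from clustering the API embeddings.

```python
from typing import Dict, Iterable, List

# Hypothetical cluster assignment (illustrative only). In APIGraph this mapping
# would be produced by clustering API embeddings, not written by hand.
API_TO_CLUSTER: Dict[str, str] = {
    "android.telephony.TelephonyManager.getDeviceId": "read_identifier",         # IMEI
    "android.telephony.TelephonyManager.getSubscriberId": "read_identifier",     # IMSI
    "android.telephony.TelephonyManager.getSimSerialNumber": "read_identifier",  # ICCID
    "java.net.HttpURLConnection.getOutputStream": "network_send",
    "java.net.Socket.getOutputStream": "network_send",
}

VOCAB: List[str] = sorted(API_TO_CLUSTER)                  # API-level feature space
CLUSTERS: List[str] = sorted(set(API_TO_CLUSTER.values())) # cluster-level feature space


def api_features(apis: Iterable[str], vocab: List[str]) -> List[int]:
    """Binary occurrence vector over individual APIs."""
    used = set(apis)
    return [1 if api in used else 0 for api in vocab]


def cluster_features(apis: Iterable[str], clusters: List[str]) -> List[int]:
    """Binary occurrence vector over API clusters instead of individual APIs."""
    present = {API_TO_CLUSTER[a] for a in apis if a in API_TO_CLUSTER}
    return [1 if c in present else 0 for c in clusters]


# V1: collects the IMEI and sends it over HTTP.
V1 = [
    "android.telephony.TelephonyManager.getDeviceId",
    "java.net.HttpURLConnection.getOutputStream",
]
# V2: additionally collects IMSI and ICCID, and sends them over a socket.
V2 = [
    "android.telephony.TelephonyManager.getDeviceId",
    "android.telephony.TelephonyManager.getSubscriberId",
    "android.telephony.TelephonyManager.getSimSerialNumber",
    "java.net.Socket.getOutputStream",
]

if __name__ == "__main__":
    print("API-level     V1:", api_features(V1, VOCAB))
    print("API-level     V2:", api_features(V2, VOCAB))
    print("cluster-level V1:", cluster_features(V1, CLUSTERS))
    print("cluster-level V2:", cluster_features(V2, CLUSTERS))
```

Under these assumed clusters, the API-level vectors of V1 and V2 differ, while their cluster-level vectors coincide, which is the kind of stability the grouping is meant to provide.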
We build a large-scale, evolutionary dataset that contains more than 320K Android apps spanning 7 years. To the best of our knowledge, this is the largest dataset used to study and evaluate Android malware classifiers. Furthermore, to make the evaluation fair, we strictly follow the recently proposed best practice (TESSERACT, USENIX Security 2019) to satisfy both temporal and spatial consistency:
The dataset details are available at APIGraph-code-database.
We tested four state-of-the-art Android malware classifiers as the baselines, as listed below.
Classifiers | Publication | API feature format | Algorithms | Reproduction |
---|---|---|---|---|
MamaDroid | NDSS 2017 | Markov Chain of API Calls | Random Forest | source code |
DroidEvolver | Euro S&P 2019 | API Occurrence | Model Pool | source code |
Drebin | NDSS 2014 | Selected API Occurrence | SVM | re-implemented |
Drebin-DL | ESORICS 2017 | Selected API Occurrence | DNN | re-implemented |
These four classifiers are published in top venues, and their source code is either publicly available or can be re-implemented, sometimes with the help of their authors.
In particular, we thank the authors of DroidEvolver for their help.
We strictly follow their configurations to make sure our reproductions achieve the results stated in their papers.
The first experiment shows how APIGraph helps in slowing down model aging.
We use the AUT (Area Under Time) metric proposed in a recent paper[1], which is the area under the performance curve over a time period.
As shown in this table, after being enhanced by APIGraph, the four classifiers achieve improvements ranging from 8.7% to 19.6%.
We also draw the detailed figures for models trained on 2012 and tested on 2013.
The red curves show the F1-score of the four original classifiers, while the blue ones show the F1-score of the enhanced classifiers.
These figures clearly show that the performance decline slows down after using APIGraph.
[1] TESSERACT: Eliminating experimental bias in malware classification across space and time, USENIX Security 2019
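The AUT metric used in the first experiment can be sketched as follows. This is a minimal illustration based on our reading of the TESSERACT definition: the trapezoidal area under the per-period performance curve, normalized by the number of intervals so that the result lies in [0, 1]. The function name `aut` and the F1 values in the example are made up for illustration and are not results from the paper.

```python
from typing import Sequence


def aut(scores: Sequence[float]) -> float:
    """Area under time for a sequence of per-period scores (e.g. monthly F1).

    Uses the trapezoidal rule with unit spacing between test periods and
    divides by the number of intervals, so constant scores of 1.0 give 1.0.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("AUT needs at least two test periods")
    area = sum((scores[k] + scores[k + 1]) / 2 for k in range(n - 1))
    return area / (n - 1)


if __name__ == "__main__":
    # Hypothetical F1-scores over consecutive test periods (not from the paper).
    original = [0.90, 0.70, 0.50]
    enhanced = [0.90, 0.80, 0.70]
    print(f"AUT original: {aut(original):.3f}")
    print(f"AUT enhanced: {aut(enhanced):.3f}")
```

A classifier whose score decays more slowly over the test periods gets a higher AUT, which is why the metric is a natural fit for measuring model aging.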
In the second experiment, we show how APIGraph helps reduce the retraining cost in terms of two metrics:
The experiment is done as follows:
As shown in this table, APIGraph reduces the retraining frequency by 22% to 76% and the number of samples to label by 33% to 96%.
We also draw the detailed figures here.
Xiaohan Zhang, Yuan Zhang, Ming Zhong, Daizong Ding, Yinzhi Cao, Yukun Zhang, Mi Zhang, Min Yang. "Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware." 27th ACM Conference on Computer and Communications Security (ACM CCS 2020).
If you find our paper interesting, you can cite it using the following BibTeX:
@inproceedings{zhang2020enhancing,
title={Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware},
author={Zhang, Xiaohan and Zhang, Yuan and Zhong, Ming and Ding, Daizong and Cao, Yinzhi and Zhang, Yukun and Zhang, Mi and Yang, Min},
booktitle={Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security},
pages={757--770},
year={2020}
}