TY - GEN
T1 - Empowering convolutional networks for malware classification and analysis
AU - Kolosnjaji, Bojan
AU - Eraisha, Ghadir
AU - Webster, George
AU - Zarras, Apostolis
AU - Eckert, Claudia
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/6/30
Y1 - 2017/6/30
N2 - Performing large-scale malware classification is increasingly becoming a critical step in malware analytics as the number and variety of malware samples is rapidly growing. Statistical machine learning constitutes an appealing method to cope with this increase as it can use mathematical tools to extract information out of large-scale datasets and produce interpretable models. This has motivated a surge of scientific work in developing machine learning methods for detection and classification of malicious executables. However, an optimal method for extracting the most informative features for different malware families, with the final goal of malware classification, is yet to be found. Fortunately, neural networks have evolved to the state that they can surpass the limitations of other methods in terms of hierarchical feature extraction. Consequently, neural networks can now offer superior classification accuracy in many domains such as computer vision and natural language processing. In this paper, we transfer the performance improvements achieved in the area of neural networks to model the execution sequences of disassembled malicious binaries. We implement a neural network that consists of convolutional and feedforward neural constructs. This architecture embodies a hierarchical feature extraction approach that combines convolution of n-grams of instructions with plain vectorization of features derived from the headers of the Portable Executable (PE) files. Our evaluation results demonstrate that our approach outperforms baseline methods, such as simple Feedforward Neural Networks and Support Vector Machines, as we achieve 93% on precision and recall, even in case of obfuscations in the data.
AB - Performing large-scale malware classification is increasingly becoming a critical step in malware analytics as the number and variety of malware samples is rapidly growing. Statistical machine learning constitutes an appealing method to cope with this increase as it can use mathematical tools to extract information out of large-scale datasets and produce interpretable models. This has motivated a surge of scientific work in developing machine learning methods for detection and classification of malicious executables. However, an optimal method for extracting the most informative features for different malware families, with the final goal of malware classification, is yet to be found. Fortunately, neural networks have evolved to the state that they can surpass the limitations of other methods in terms of hierarchical feature extraction. Consequently, neural networks can now offer superior classification accuracy in many domains such as computer vision and natural language processing. In this paper, we transfer the performance improvements achieved in the area of neural networks to model the execution sequences of disassembled malicious binaries. We implement a neural network that consists of convolutional and feedforward neural constructs. This architecture embodies a hierarchical feature extraction approach that combines convolution of n-grams of instructions with plain vectorization of features derived from the headers of the Portable Executable (PE) files. Our evaluation results demonstrate that our approach outperforms baseline methods, such as simple Feedforward Neural Networks and Support Vector Machines, as we achieve 93% on precision and recall, even in case of obfuscations in the data.
UR - https://www.scopus.com/pages/publications/85022198952
U2 - 10.1109/IJCNN.2017.7966340
DO - 10.1109/IJCNN.2017.7966340
M3 - Conference contribution
AN - SCOPUS:85022198952
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 3838
EP - 3845
BT - 2017 International Joint Conference on Neural Networks, IJCNN 2017 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 International Joint Conference on Neural Networks, IJCNN 2017
Y2 - 14 May 2017 through 19 May 2017
ER -