Enhancing Ransomware Classification with Multi-stage Feature Selection and Data Imbalance Correction

Faithful Chiagoziem OWNUEGBUCHE, Anca Delia Jurcut, Liliana Pasquale

June, 2023

Image credit: CSO

Abstract

Ransomware is a critical security concern, and developing applications for ransomware detection is paramount. Machine learning models are helpful in detecting and classifying ransomware. However, the high dimensionality of ransomware datasets, which are divided into various feature groups such as API calls, Directory, and Registry logs, has made it difficult for researchers to create effective machine learning models. Class imbalance also leads to poor results when classifying ransomware families. To tackle these challenges, we propose a three-stage feature selection method that effectively reduces the dimensionality of the data and considers the varying importance of the different feature groups in the classification of ransomware families. To address data imbalance, we also apply cost-sensitive learning and re-sample the training data using the Synthetic Minority Over-sampling Technique (SMOTE). We apply these techniques to the EldeRan ransomware dataset. Our results show that the proposed feature selection method significantly improves the detection of ransomware compared to other state-of-the-art studies using the same dataset. Furthermore, the data balancing techniques (cost-sensitive learning and SMOTE) are effective in the multi-class classification of ransomware.
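
The two imbalance corrections named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only rendering of the general ideas, and the function names, toy labels, and toy feature vectors are all hypothetical. Cost-sensitive learning is shown here as the common "balanced" weighting heuristic, n_samples / (n_classes * class_count), and SMOTE as interpolation between a minority sample and one of its nearest minority-class neighbours. As in the abstract, the re-sampling would be applied to the training data only.

```python
import math
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rarer classes receive a larger misclassification cost."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def smote_oversample(samples, n_new, k=5, rng=None):
    """Create n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(samples) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [s for j, s in enumerate(samples) if j != i]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy training split: 6 majority samples, 3 minority samples.
labels = ["goodware"] * 6 + ["family_a"] * 3
minority = [[0.0, 1.0], [1.0, 1.0], [0.5, 2.0]]

weights = balanced_class_weights(labels)
new_points = smote_oversample(minority, n_new=3, k=2)
```

In this sketch the minority class ends up with twice the weight of the majority class, and every synthetic point lies on a line segment between two real minority samples. In practice, a maintained library implementation of SMOTE and a classifier with built-in class weighting would normally be preferred over hand-written code.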

Type

Conference paper

Publication

7th International Symposium on Cyber Security, Cryptology, and Machine Learning

Faithful Chiagoziem OWNUEGBUCHE

PhD Candidate in Machine Learning and Blockchain Technology