Compact feature hashing for machine learning based malware detection

Abstract: Machine learning can detect malware variants that evade signature-based detection. Feature hashing converts features into a fixed-length vector. In this paper, we study the appropriate vector size for feature hashing on a large dataset of malware files. Through exhaustive experiments on more than 280,000 real malware and benign files, we find for the first time that the default vector size in current feature hashing practice is unnecessarily large. We experimentally identify an appropriate vector size that not only reduces memory use by 70% but also increases detection accuracy compared with the state-of-the-art scheme.
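
For readers unfamiliar with feature hashing, the sketch below shows the general idea of mapping a variable-size set of string features to a fixed-length vector. It is an illustration of the generic hashing trick, not the scheme evaluated in the paper: the function name `hash_features`, the example tokens, the choice of MD5 as the hash, and the vector sizes are all assumptions made for this example.

```python
import hashlib


def hash_features(tokens, n_buckets):
    """Map string features to a fixed-length vector (generic hashing trick).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    vec = [0.0] * n_buckets
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        # First 8 bytes of the digest select the bucket.
        index = int.from_bytes(digest[:8], "little") % n_buckets
        # One bit from a different byte selects the sign, so that
        # colliding features tend to cancel rather than accumulate.
        sign = 1.0 if digest[8] & 1 == 0 else -1.0
        vec[index] += sign
    return vec


# Illustrative tokens only; real features would be extracted from files.
tokens = ["import:kernel32.dll", "section:.text", "string:http://"]

small = hash_features(tokens, n_buckets=64)
large = hash_features(tokens, n_buckets=4096)
print(len(small), len(large))
```

The vector length is fixed by `n_buckets` regardless of how many features a file yields, so memory per sample grows linearly with that parameter. A smaller vector saves memory but makes collisions between distinct features more likely, which is the trade-off the abstract refers to when it discusses choosing an appropriate vector size.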

Keywords: detection; machine learning; feature; feature hashing; malware

Journal Title: ICT Express
Year Published: 2021
