Fortinet, Inc.
MACHINE-LEARNING BASED APPROACH FOR MALWARE SAMPLE CLUSTERING

Abstract:

Systems and methods for a machine-learning based approach for identification of malware using static analysis and a machine-learning based automatic clustering of malware are provided. According to various embodiments of the present disclosure, a processing resource of a computer system receives a potential malware sample. A plurality of feature vectors is extracted from the potential malware sample and is converted into an input vector. A byte sequence is generated by walking a plurality of decision trees based on the input vector. Further, a hash value for the byte sequence is calculated and a determination is made regarding whether the hash value matches a malware hash value of a plurality of malware hash values corresponding to a known malware sample. Upon said determination being affirmative, the potential malware sample is classified as malware and is associated with a malware family of the known malware sample.
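
The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation described in the application, and several details are assumptions made only for illustration: the feature vectors are merged by simple concatenation, each tree contributes one byte (the hypothetical `leaf_id` of the leaf it reaches), SHA-256 is used as the hash function, and all names (`Node`, `Leaf`, `walk_tree`, `classify`, the toy trees and the family label) are invented here.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Union


@dataclass
class Leaf:
    leaf_id: int  # hypothetical leaf identifier, assumed to fit in one byte


@dataclass
class Node:
    feature: int      # index into the input vector
    threshold: float  # go left when x[feature] <= threshold
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def to_input_vector(feature_vectors: Sequence[Sequence[float]]) -> List[float]:
    # Assumption: the input vector is the concatenation of the feature vectors.
    return [value for vector in feature_vectors for value in vector]


def walk_tree(tree: Union[Node, Leaf], x: Sequence[float]) -> int:
    # Descend from the root until a leaf is reached.
    node = tree
    while isinstance(node, Node):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.leaf_id


def byte_sequence(trees: Sequence[Union[Node, Leaf]], x: Sequence[float]) -> bytes:
    # Assumption: one byte per tree, holding the id of the leaf that was reached.
    return bytes(walk_tree(tree, x) for tree in trees)


def classify(
    feature_vectors: Sequence[Sequence[float]],
    trees: Sequence[Union[Node, Leaf]],
    known_hashes: Dict[str, str],
) -> Optional[str]:
    # Returns the malware family on a hash match, otherwise None.
    x = to_input_vector(feature_vectors)
    digest = hashlib.sha256(byte_sequence(trees, x)).hexdigest()
    return known_hashes.get(digest)


# Toy forest of two hand-built trees (illustrative only).
TREES = [
    Node(0, 0.5, Leaf(0), Leaf(1)),
    Node(2, 10.0, Leaf(2), Node(1, 3.0, Leaf(3), Leaf(4))),
]

# Register a "known" sample under a made-up family name.
KNOWN_SAMPLE = [[0.9, 1.0], [42.0]]
KNOWN_HASHES = {
    hashlib.sha256(
        byte_sequence(TREES, to_input_vector(KNOWN_SAMPLE))
    ).hexdigest(): "FamilyA"
}

# A sample with different raw features that reaches the same leaves.
print(classify([[0.8, 2.0], [50.0]], TREES, KNOWN_HASHES))
# A sample that reaches a different leaf in the first tree.
print(classify([[0.1, 1.0], [42.0]], TREES, KNOWN_HASHES))
```

In this sketch, samples whose features differ but that fall into the same leaves of every tree produce the same byte sequence and therefore the same hash, which is what lets an exact hash lookup group similar samples into one family.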

Status:

Application

Type:

Utility

Filing date:

31 Mar 2020

Issue date:

30 Sep 2021