Dissertation Project

ML Malware Analysis - Dissertation

Introduction

This repo contains my university dissertation, which explores ML techniques to identify and classify binaries by their maliciousness using a point-scoring system.

Read my Proposal, achieving First-Class Honours (78%)

(The entire dissertation is being rewritten in LaTeX)

For example, a score of 1 = Malicious and 0 = Not Malicious

If a binary has FEATURE A, it is given a score of 0.6 (e.g. VirusTotal)
If a binary has FEATURES B and C, it is given a score of 0.4 (e.g. Bitcoin addresses, IP addresses, etc.)
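
The scoring example above could be sketched roughly as follows. This is only an illustration, not the dissertation's actual implementation: the function name `score_binary`, the `flagged_by_scanner` argument (standing in for FEATURE A, e.g. a VirusTotal result obtained elsewhere), the simplified regular expressions, and the reading that B and C must both be present to earn 0.4 are all assumptions.

```python
import re

# Weights taken from the example above; how they combine is an assumption.
WEIGHT_A = 0.6   # FEATURE A, e.g. flagged by an external scanner
WEIGHT_BC = 0.4  # FEATURES B and C, e.g. IP and Bitcoin addresses

# Simplified, assumed patterns: dotted-quad IPv4 and legacy Base58 Bitcoin addresses.
IPV4_RE = re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b")
BTC_RE = re.compile(rb"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


def score_binary(data: bytes, flagged_by_scanner: bool) -> float:
    """Return a score between 0 (Not Malicious) and 1 (Malicious)."""
    score = 0.0
    if flagged_by_scanner:
        score += WEIGHT_A
    # Assumption: the 0.4 is awarded only when both B and C are present.
    if IPV4_RE.search(data) and BTC_RE.search(data):
        score += WEIGHT_BC
    return min(score, 1.0)
```

A caller would read the binary with `open(path, "rb").read()` and pass the bytes in; any cut-off between the two classes (for example 0.5) would be a separate design choice that the example above does not specify.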

Datasets

Clean Files

Obtained from Windows 7 and Windows 10 installations

Malicious Files

Collected from honeypots that I run, as well as a private malware sample project

Families include ransomware, trojans, droppers, and keyloggers
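
One way the two datasets above might be turned into labelled samples is sketched below, using the 1 = Malicious / 0 = Not Malicious convention from the Introduction. The function name `build_manifest` and the idea of keeping clean and malicious files in two separate directories are assumptions for illustration, not a description of how this repo stores its samples.

```python
import hashlib
from pathlib import Path


def build_manifest(clean_dir: str, malicious_dir: str) -> list[tuple[str, int]]:
    """Return (sha256, label) pairs: 0 for clean files, 1 for malicious files."""
    manifest = []
    for directory, label in ((clean_dir, 0), (malicious_dir, 1)):
        # Walk the directory tree in a stable order so runs are reproducible.
        for path in sorted(Path(directory).rglob("*")):
            if path.is_file():
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                manifest.append((digest, label))
    return manifest
```

Identifying samples by SHA-256 rather than by filename makes it easier to spot duplicates across the clean and malicious sets.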

TODO

Literally everything

Installation

Planned: a pip install, and/or possibly a Jupyter Notebook release