Python script for detecting malware using machine learning techniques.

Dru-O7/Malware-Detection-using-Machine-Learning

Malware-Detection-using-Machine-Learning

Introduction

This project is a Python script that detects malware using machine learning. It uses the TPOT (Tree-based Pipeline Optimization Tool) library to automate feature selection and optimize the classification pipeline. The primary classifier is scikit-learn's ExtraTreesClassifier, which the script also uses to identify important features.

Prerequisites

Before running this code, please ensure that you have the following dependencies installed:

  • Python
  • Pandas
  • scikit-learn
  • TPOT
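A quick way to confirm that the dependencies above are importable is a small check like the following. This is a minimal sketch, not part of the project; the helper name `check_dependencies` is hypothetical, and the only assumption is the usual import names (`pandas`, `sklearn`, `tpot`).

```python
from importlib.util import find_spec

# Import names for the dependencies listed above. Note that scikit-learn
# is imported as "sklearn", which differs from its package name.
REQUIRED_MODULES = {
    "Pandas": "pandas",
    "scikit-learn": "sklearn",
    "TPOT": "tpot",
}


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each dependency name to True if it can be imported."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


if __name__ == "__main__":
    for name, available in check_dependencies().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

Any dependency reported as missing can be installed with pip before running the script.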

You also need a dataset in CSV format named "MalwareData.csv" that contains the training and test data. The script expects a specific structure: because it separates legitimate samples from malware samples, the file must at least include a column that labels each row as one or the other.
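To make the expected structure concrete, the sketch below loads the CSV and splits it by label. The column name `legitimate`, the 1/0 encoding, and the comma separator are assumptions for illustration only; this README does not state them, so adjust them to match your file. The helper `load_and_split` is hypothetical.

```python
import pandas as pd


def load_and_split(path="MalwareData.csv", sep=",", label_col="legitimate"):
    """Load the CSV and split its rows into legitimate and malware samples.

    Assumption: `label_col` holds 1 for legitimate files and 0 for malware.
    """
    data = pd.read_csv(path, sep=sep)
    if label_col not in data.columns:
        raise KeyError(
            f"expected a label column named {label_col!r}, got {list(data.columns)}"
        )
    legit = data[data[label_col] == 1]
    malware = data[data[label_col] == 0]
    return data, legit, malware


# Example call, once the file is in the working directory:
# data, legit, malware = load_and_split("MalwareData.csv")
# print(len(legit), "legitimate rows,", len(malware), "malware rows")
```

Raising an error on a missing label column surfaces a mismatched file early, before any training starts.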

Usage

Follow these steps to use the script effectively:

  1. Prerequisites: Make sure you've met the requirements mentioned above.

  2. Data Preparation:

    • Place the "MalwareData.csv" file in the same directory as this script.
  3. Running the Script:

    • Execute the script.
    • The script performs several crucial tasks:
      • Loads the dataset and divides it into legitimate and malware data.
      • Identifies important features using the ExtraTreesClassifier.
      • Utilizes TPOT to optimize the classification pipeline.
      • Exports the optimized pipeline to a Python script named 'tpot_pipeline.py'.
  4. Further Analysis:

    • The 'tpot_pipeline.py' script, containing the optimized classification pipeline, can be employed for malware detection and in-depth analysis.
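The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names are hypothetical, the TPOT settings are placeholder values, feature selection is done here with scikit-learn's `SelectFromModel` (which may differ from how the script picks features), and the TPOT calls use the classic API (`TPOTClassifier.fit`, `.score`, `.export`) found in TPOT releases before 1.0.

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split


def select_features(X, y, random_state=0):
    """Fit an ExtraTreesClassifier and keep features with above-average importance."""
    forest = ExtraTreesClassifier(n_estimators=100, random_state=random_state).fit(X, y)
    # With prefit=True the already-fitted forest is reused; the default
    # threshold for tree ensembles is the mean feature importance.
    selector = SelectFromModel(forest, prefit=True)
    return selector.transform(X), selector.get_support()


def optimize_pipeline(X, y, export_path="tpot_pipeline.py", random_state=0):
    """Search for a pipeline with TPOT and export the best one as a script."""
    # Imported lazily because TPOT is heavy; assumes the classic (pre-1.0) API.
    from tpot import TPOTClassifier

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    # generations and population_size are illustrative values only.
    tpot = TPOTClassifier(
        generations=5, population_size=20, verbosity=2, random_state=random_state
    )
    tpot.fit(X_train, y_train)
    print("held-out score:", tpot.score(X_test, y_test))
    tpot.export(export_path)
    return tpot
```

Running `select_features` first shrinks the feature matrix, which tends to shorten the TPOT search; the TPOT step itself can take a long time on a large dataset.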

Customization

This code can be tailored to suit your specific dataset and requirements. You have the flexibility to adjust data preprocessing steps and TPOT parameters to achieve the best model for your use case.
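As one example of adjusting TPOT parameters, the sketch below builds two sets of keyword arguments: a quick configuration for experimentation and a longer one for a more thorough search. The helper `tpot_settings` and all values are hypothetical choices, not the project's defaults. The parameter names (`generations`, `population_size`, `scoring`, `cv`, `n_jobs`, `config_dict`, `verbosity`) belong to the classic `TPOTClassifier` API in TPOT releases before 1.0.

```python
def tpot_settings(quick=True):
    """Return keyword arguments for TPOTClassifier (classic, pre-1.0 API)."""
    settings = {
        "scoring": "f1",      # weighs precision and recall, not just accuracy
        "cv": 5,              # 5-fold cross-validation for each candidate pipeline
        "n_jobs": -1,         # use all available CPU cores
        "random_state": 42,   # make the search reproducible
        "verbosity": 2,       # show progress during the search
    }
    if quick:
        # A small search over TPOT's lighter built-in operator set.
        settings.update(generations=3, population_size=10, config_dict="TPOT light")
    else:
        # A larger search over the default operator set; much slower.
        settings.update(generations=20, population_size=50)
    return settings


# Example use, assuming TPOT is installed:
# from tpot import TPOTClassifier
# tpot = TPOTClassifier(**tpot_settings(quick=True))
```

Starting with the quick settings makes it easier to confirm the data loads and trains correctly before committing to a long search.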

Note: This code is designed for educational and experimental purposes. It should not be used for production-level security applications without a thorough validation process and consideration of potential security risks.

Authors

Installation

To install the required dependencies, you can use the provided requirements.txt file:

  1. Create a virtual environment (optional but recommended):

    • On Windows:

      python -m venv myenv
    • On macOS and Linux:

      python3 -m venv myenv
  2. Activate the virtual environment:

    • On Windows:

      myenv\Scripts\activate
    • On macOS and Linux:

      source myenv/bin/activate
  3. Install the required dependencies using the requirements.txt file:

    pip install -r requirements.txt

You are now ready to run the code and perform malware detection.

