Skip to content

itsnotapt/evil-feature-extractor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AppCompat SmartViewer

Dependancies

  • Redis
  • Elasticsearch
  • Flask (python)
  • Pandas (python)
  • elasticsearch (python)
  • rq (python)

Loader

The loader will import an AppCompat CSV file, extract the features and push the results into Elasticsearch.

Setup

Start the Redis server.

$ ./src/redis-server

Start the Elasticsearch server.

$ ES_HEAP_SIZE=4g ./bin/elasticsearch

Start rq workers, these workers will pull jobs from the Redis queue. They will pull from two queues:

  • high - loads the data into elasticsearch (will need 1-2 workers)
  • default - performs the feature extraction (can have 1 worker per CPU i.e. 4-8 workers)
$ cd ./loader

# Start 1-2 of these
$ rq worker -q high default

# Start 4-8 of these (or 1 per CPU)
$ rq worker -q default

Usage

usage: loader.py [-h] [-v] [--chunk_size CHUNK_SIZE]
                 [--compression COMPRESSION]
                 read_file index_name

Parses appcompat CSV, extract features and load into Elasticsearch

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Toggles verbose output

  read_file             Reads data from a file
  index_name            Elasticsearch index name (will be prepended with 
                        'appcompat-')
  --chunk_size CHUNK_SIZE
                        Set the size of the chunks, where bigger chunks use
                        more memory, but too small with impact unique host
                        features. Default is 250000.
  --compression COMPRESSION
                        Input CSV file is compressed (uses Pandas method
                        {'infer', 'gzip', 'bz2'})

Web Interface

Follow the setup steps for the loader i.e. Redis, Elasticsearch and workers.

Start the flask webapp.

$ cd ./flask
$ python run.py

Connect to the interface: http://localhost:5000

Features

  • f_path_unique_hosts - typical feature for stacking data. Paths that have been seen on majority of hosts are unlikely to be malicious.
  • f_shortname_ends_3264 - does the filename end with 32, 64, 86. Attackers like to label their tools (eg. wce32.exe, x64.exe).
  • f_path_depth - calculate the depth of the path structure. Attackers prefer not to use deep path structures for their tools (however, backdoors may have deep paths).
  • f_staging_directory - is the file in a known staging directory? Attackers like to store their tools in preexisting directories and that are preferably empty.
  • f_temp_dir - is the file in a temp directory? Attackers like to write to temp directories as they always have write permissions.
  • f_system32_dir - is the file in the system32 directory? Attackers like to store backdoors in the system32 directory.
  • f_recon_cmd - is the file a windows file commonly used for recon by attackers? This feature is used later for recon clusters.
  • f_users_dir - is the file in the users directory? Common for 1st stage backdoors to be in this directory and hence, attacker may use tools here as well (e.g. current working directory)
  • f_number_digits - how many digits in path? Used to filter out noise, since attackers generally don't use more than a few digits. This will filter at random generated paths (d:\4563bb32f7060ac2f373fe2d81d0\install.exe).
  • f_executable_archive - is the file part of an extracted archive (RarSFX, 7z executable)? Common attack vector for user to run executable archives (PlugX).
  • f_shortname_length - how long is the filename? Attackers like to use short names (e.g. 1.exe, w.exe).
  • f_root_length - how long is the directory structure? Attackers are unlikely to use long directory names.
  • f_recon_cluster - this looks for clusters of recon commands. Very common for attackers to run a combination of commands (e.g. whoami, quser, tasklist).
  • f_neighbour_psexec - this looks for commands adjacent to the psexec service. Attackers commonly use PsExec to perform lateral movement.
  • f_same_timestamp_different_name - do any files share a timestamp but have different names? This can be used to detect timestomping (e.g. bad.exe timestomped from cmd.exe).
  • f_same_filesize_different_name - do any files share a filesize but have different names? This can be used to detect backdoors or tools used in multiple staging directories with different names.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published