Audit Reporter Configuration

Overview

SPADE's Audit Reporter infers data provenance from system calls and related information in Linux Audit logs. The extent of call coverage and the level of detail needed vary with how the provenance records will be used. Consequently, the Audit Reporter supports a range of configuration options.

All of the options can be specified either in the configuration file `cfg/spade.reporter.Audit.config` or in the SPADE control client when the Audit Reporter is added. The values specified in the configuration file are treated as defaults. Options specified through the SPADE control client override the defaults.
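As an illustration of this precedence, suppose the configuration file sets the defaults below. This is a minimal sketch that assumes the file holds one `name=value` pair per line; the particular values are hypothetical and chosen only for the example.

```
fileIO=true
netIO=false
```

Adding the reporter with an explicit option would then override only that default. Here `netIO` becomes `true` for this reporter instance, while `fileIO` keeps its value from the configuration file:

```
-> add reporter Audit netIO=true
```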

Graph Content

The options in this table control whether specific elements are monitored and reported. Options listed in the Name column are case-sensitive. All options can be set to either true or false.

Name	Description

agents	Use separate Agent vertices (and WasControlledBy edges) to report agent-related annotations, such as `uid` and `gid` (instead of including the annotations in the corresponding Process vertices).
anonymousMmap	Create Artifact vertices (and WasGeneratedBy edges) for memory that has been anonymously mapped (with `mmap`).
epochs	If `true`, then epochs of an Artifact vertex are reported. If `false`, then not reported.
fileIO	If `true`, then provenance is reported for filesystem I/O. If `false`, then not reported.
fsids	If `true`, then provenance for the `setfsuid` and `setfsgid` system calls is reported. If `false`, then not reported.
IPC	If `true`, then provenance is reported for inter-process communication through shared memory and message queues. If `false`, then not reported.
memorySyscalls	If `true`, then provenance is reported for memory mappings made with the `mmap` system call and for memory protection updates made with the `mprotect` system call. If `false`, then not reported.
namespaces	If `true`, then Linux namespace metadata is reported in provenance. If `false`, then not reported.
netIO	If `true`, then provenance is reported for network I/O. If `false`, then not reported.
permissions	If `true`, then permissions of an Artifact vertex are reported. If `false`, then not reported.
reportKill	If `true`, then `kill` system call provenance is reported. If `false`, then not reported.
rootFS	If `true`, then filesystem root changes with respect to processes and the whole system are tracked, and reported as `root path` in the Artifact vertex. If `false`, then not tracked and not reported.
simplify	If `true`, then (a) related system call names are reported (on edges) as the name of the group that they belong to, and (b) only `uid`, `euid`, `gid`, and `egid` are reported for an agent. If `false`, then exact system call names and the full set of agent identifiers (`uid`, `euid`, `suid`, `fsuid`, `gid`, `egid`, `sgid`, `fsgid`) are reported.
units	If `true`, then UBSI provenance is reported. If `false`, then not reported.
unixSockets	If `true`, then Unix socket provenance is reported. If `false`, then not reported.
versions	If `true`, then Artifact vertices are versioned. If `false`, then not versioned.
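For instance, a deployment that only needs file and network provenance, without Artifact versioning, might add the reporter as sketched below. The combination of values is purely illustrative and is not a recommended setting; the command form follows the `add reporter Audit` examples used elsewhere on this page.

```
-> add reporter Audit fileIO=true netIO=true unixSockets=false versions=false
```

Because option names are case-sensitive, `fileIO` and `netIO` must be written exactly as they appear in the Name column.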

Controlling Provenance Capturing

The following flags control which provenance is captured and consumed. In the table below, the Name column gives the flag name (case-sensitive), followed by its Description and the type of its Value. The Mode(s) column lists the modes of the Audit reporter in which the flag is valid. The Audit reporter can run in live mode or in playback mode. In live mode, audit events are read from the Linux Audit Subsystem; in playback mode, they are read from a file or a directory.

Name	Description	Value	Mode(s)

excludeProctitle	If `true`, then adds an audit rule that excludes the unused `PROCTITLE` audit record, which reduces the size of the log. If `false`, then the audit record is included.	Boolean	Live
failfast	If `true`, then stops reading audit events as soon as an error is encountered in the reporter. If `false`, then ignores the error and continues reading.	Boolean	Live, and Playback
inputDir	Path to the audit log directory to read events from. The log reading order starts from the file that was modified the earliest to the file that was modified the latest.	String	Playback
inputLog	Path to the audit log file to read events from.	String	Playback
inputTime	Any log file in `inputDir` which has a modified time before the `inputTime` is ignored. The format of the time is `yyyy-MM-dd:HH:mm:ss`.	String	Playback
localEndpoints	If `true`, then kernel modules are added, and local ports of a network connection are reported. If `false`, then kernel modules are not added, and local ports of a network connection are not reported.	Boolean	Live
networkAddressTranslation	If `true`, then network address translations done in Netfilter are reported. If `false`, then not reported.	Boolean	Live
outputLog	The file to write the consumed audit events to.	File-system path	Live, and Playback
outputLogRotate	The maximum number of lines to write to the output log file specified using `outputLog`. Rotated log file naming convention: `<Value of outputLog>.<Number starting from 1>`. The number is incremented each time the maximum line limit is exceeded.	Positive number, or `0` to not rotate	Live, and Playback
reportingInterval	The length of the interval (in seconds) to report statistics (if any).	Positive number, or `0` to not report	Live, and Playback
rotate	If `true`, then looks for rotated audit log files in the same directory as the file `inputLog`. Any file that matches the naming convention `<Value of inputLog>.<Number from 1 to 99>` is considered a rotated log. Log reading starts from the log with the highest number and proceeds to the log with the lowest number.	Boolean	Playback
syscall	Specifies the set of system calls to audit. Can be one of: (a) `default` - the system calls selected by the Audit reporter, (b) `all` - all system calls, and (c) `none` - no system calls, i.e. the user sets the audit rules manually.	String	Live
user	If specified, then only the specified user is audited. If not specified, then all users except the user running SPADE are audited.	String	Live
waitForLog	If `true`, then prevents the reporter from being removed while the input log(s) are still being read. If `false`, then the reporter is removed immediately.	Boolean	Playback

NOTE: If neither `inputLog` nor `inputDir` is specified, then the Audit reporter runs in live mode, i.e. it reads audit events from the Linux Audit Subsystem.
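The sketches below show how these flags might be combined in each mode. All paths and numeric values are hypothetical and only illustrate the flags described in the table.

A live-mode reporter that audits the default set of system calls, saves the consumed events, rotates the saved log every 100000 lines, and reports statistics every 60 seconds:

```
-> add reporter Audit syscall=default outputLog=/tmp/audit.log outputLogRotate=100000 reportingInterval=60
```

Following the naming convention in the table, the rotated output files would be named `/tmp/audit.log.1`, `/tmp/audit.log.2`, and so on.

A playback-mode reporter that reads `/tmp/audit.log` together with any rotated logs next to it (for example `/tmp/audit.log.2`, then `/tmp/audit.log.1`), and that is not removed until reading has finished:

```
-> add reporter Audit inputLog=/tmp/audit.log rotate=true waitForLog=true
```

A playback-mode reporter that reads a directory of logs, ignoring files last modified before the given time:

```
-> add reporter Audit inputDir=/tmp/audit-logs inputTime=2024-01-15:09:30:00
```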

Example Configuration for Generating Linux Namespaces Provenance

Live

The following SPADE control client command shows how to configure the Audit reporter to run in live mode, and report provenance for the supported Linux namespaces:

-> add reporter Audit namespaces=true localEndpoints=true networkAddressTranslation=true cwd=true rootFS=true IPC=true outputLog=/tmp/audit.log

The command above would produce provenance for the host as well as provenance containing:

Linux namespace inode identifiers for ipc, user, pid, network, and mount namespaces. Responsible flags: namespaces=true, and localEndpoints=true
Linux namespace filesystem paths resolved with respect to host filesystem paths. Responsible flags: cwd=true, and rootFS=true
Mapping between Linux namespace network connections and host network connections. Responsible flag: networkAddressTranslation=true
IPC per Linux namespace. Responsible flag: IPC=true
Process id view of the Linux namespace as well as the host in Process vertices. Responsible flags: namespaces=true, and localEndpoints=true

NOTE: networkAddressTranslation=true generates a large volume of provenance which can slow the machine running SPADE significantly.

Playback

The following command shows how to configure the Audit reporter to run in playback mode, and report provenance for Linux namespaces:

-> add reporter Audit namespaces=true cwd=true rootFS=true inputLog=/tmp/audit.log

This command also reports provenance for all supported Linux namespaces, like the previous command, provided that the `inputLog` is the `outputLog` collected by the previous command. The reason is that in live mode the Audit reporter collects extra information using a kernel module and stores it in the `outputLog`; this information is necessary for reporting Linux namespaces provenance.

This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
