Audit provenance
The Audit reporter transforms records into an Open Provenance Model (OPM) representation.
The table below outlines the key-value annotations that decorate the OPM elements generated.
OPM element | Annotation Key | Annotation Value's semantics | Annotation Value's type | Presence |
---|---|---|---|---|
Agent | uid | operating system identifier of the user that ran the program | unsigned integer | required |
Agent | euid | operating system identifier of the effective user of the program | unsigned integer | required |
Agent | gid | operating system identifier of the user's group when they ran the program | unsigned integer | required |
Agent | egid | operating system identifier of the effective group of the program | unsigned integer | required |
Agent | suid | saved identifier when the program's effective user has changed | unsigned integer | optional |
Agent | sgid | saved identifier when the program's effective group has changed | unsigned integer | optional |
Agent | fsuid | program's user identifier for filesystem access checks | unsigned integer | optional |
Agent | fsgid | program's group identifier for filesystem access checks | unsigned integer | optional |
Agent | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `/proc` (information came from Linux's /proc pseudofilesystem) | string (as enumerated) | required |
Process | name | command used to invoke the program | string | optional |
Process | pid | operating system process identifier | integer | required |
Process | ppid | parent's process identifier | integer | required |
Process | cwd | only for a process from operation `execve`: current working directory of the user (in the shell when they ran the program) | string | optional |
Process | command line | only for a process from operation `execve`: program name and arguments provided | string | optional |
Process | start time | if known, when the process (or unit) started (in Unix time) | floating point | optional |
Process | seen time | if the start time is not known, (Unix) time of the first event seen from the process | floating point | optional |
Process | unit | only if UBSI<sup>1</sup> is used: unique identifier of the unit (with 0 denoting the non-unit part of the process) | long integer | optional |
Process | count | only if UBSI<sup>1</sup> is used and unit ≠ 0: number of times the entire unit loop ran previously | long integer | optional |
Process | iteration | only if UBSI<sup>1</sup> is used and unit ≠ 0: number of times the unit loop has iterated | long integer | optional |
Process | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `/proc` (information came from Linux's /proc pseudofilesystem); `beep` (information came from UBSI<sup>1</sup>) | string (as enumerated) | required |
Artifact | subtype | one of: `memory` (memory addresses); `file`, `link`, `directory`, `block device`, `character device` (filesystem entities); `named pipe`, `unnamed pipe`, `unix socket`, `unix socket pair`, `network socket pair` (inter-process flow); `network socket` (network flows); `unknown` (underlying artifact can be of subtype `file`, `link`, `directory`, `block device`, `character device`, `named pipe`, `unnamed pipe`, `unix socket`, `network socket`, `network socket pair`, or `unix socket pair`) | string (as enumerated) | required |
Artifact | memory address | only for subtype `memory`: location in memory | integer (in hexadecimal) | optional |
Artifact | size | only for subtype `memory`: length of allocated memory | hexadecimal integer | optional |
Artifact | tgid | only for subtype `memory`, `unnamed pipe`, `unknown`, `unix socket pair`, or `network socket pair`: group identifier of threads that share memory or file descriptors | integer | optional |
Artifact | time | only for subtype `memory`, `unnamed pipe`, `unknown`, `unix socket pair`, or `network socket pair`: start or seen time of the group identifier of threads that share memory or file descriptors | floating point | optional |
Artifact | path | only for subtype `file`, `named pipe`, `link`, `directory`, `block device`, `character device`, or `unix socket`: location in the local filesystem | string | optional |
Artifact | permissions | only for subtype `file`, `link`, `directory`, `block device`, `character device`, `named pipe`, or `unix socket`: filesystem access mode | integer (in octal) | optional |
Artifact | version | only for subtype `file`, `link`, `directory`, `block device`, `character device`, `named pipe`, `unnamed pipe`, `memory`, `unix socket`, or `unknown`: how many times it has been written | integer | optional |
Artifact | epoch | only for subtype `file`, `link`, `directory`, `block device`, `character device`, `named pipe`, `unnamed pipe`, `unix socket`, `network socket`, or `unknown`: how many times an artifact has been created at the specified path | integer | optional |
Artifact | fd | only for subtype `unknown`: descriptor used to access the file | integer | optional |
Artifact | read fd | only for subtype `unnamed pipe`: descriptor used to read the pipe | integer | optional |
Artifact | write fd | only for subtype `unnamed pipe`: descriptor used to write the pipe | integer | optional |
Artifact | fd 0 | only for subtypes `unix socket pair` and `network socket pair`: descriptor used to access the connected socket pair | integer | optional |
Artifact | fd 1 | only for subtypes `unix socket pair` and `network socket pair`: descriptor used to access the connected socket pair | integer | optional |
Artifact | local address | only for subtype `network socket`: host from which the connection originates | dotted octet | optional |
Artifact | local port | only for subtype `network socket`: connection port used at the originating host | unsigned short integer | optional |
Artifact | remote address | only for subtype `network socket`: host at which the connection terminates | dotted octet | optional |
Artifact | remote port | only for subtype `network socket`: connection port used at the terminating host | unsigned short integer | optional |
Artifact | protocol | only for subtype `network socket`: connection protocol used, one of `udp` or `tcp` | string (as enumerated) | optional |
Artifact | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `netfilter` (information came from a Linux kernel Audit network filter record); `/proc` (information came from Linux's /proc pseudofilesystem); `beep` (information came from UBSI<sup>1</sup>) | string (as enumerated) | required |
WasControlledBy | operation | one of: `update` (implicit process ownership change); `setuid` or `setgid` (explicit process ownership change) | string (as enumerated) | optional |
WasControlledBy | time | if known, when the event occurred (in Unix time) | floating point | optional |
WasControlledBy | event id | if source is `syscall`: underlying event's identifier | unsigned integer | optional |
WasControlledBy | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `/proc` (information came from Linux's /proc pseudofilesystem) | string (as enumerated) | required |
WasTriggeredBy | operation | one of: `fork` (another independent process was created); `clone` (another process was created with shared state); `execve` (child process replaced parent); `unknown` (underlying operation can be of type `fork`, `clone`, or `execve`); `update` (implicit process ownership change); `setuid` or `setgid` (explicit process ownership change); `unit` (creation of a UBSI<sup>1</sup> unit by a program loop); `unit dependency` (dependent unit read memory written by another unit); `ptrace` (trace another process); `kill` (send signal to another process) | string (as enumerated) | optional |
WasTriggeredBy | flags | only for operation `clone`: clone flags | string | optional |
WasTriggeredBy | request | only for operation `ptrace`, one of: `PTRACE_POKETEXT`, `PTRACE_POKEDATA`, `PTRACE_POKEUSER`, `PTRACE_SETREGS`, `PTRACE_SETFPREGS`, `PTRACE_SETREGSET`, `PTRACE_SETSIGINFO`, `PTRACE_SETSIGMASK`, `PTRACE_SET_THREAD_AREA`, or `PTRACE_SETOPTIONS` (data of tracee modified); `PTRACE_CONT`, `PTRACE_SYSCALL`, `PTRACE_SINGLESTEP`, `PTRACE_SYSEMU`, `PTRACE_SYSEMU_SINGLESTEP`, `PTRACE_LISTEN`, `PTRACE_KILL`, `PTRACE_INTERRUPT`, `PTRACE_ATTACH`, or `PTRACE_DETACH` (execution of tracee modified) | string | optional |
WasTriggeredBy | signal | only for operation `kill`: signal sent | integer | optional |
WasTriggeredBy | time | if known, when the event occurred (in Unix time) | floating point | optional |
WasTriggeredBy | event id | if source is `syscall`: underlying event's identifier | unsigned integer | optional |
WasTriggeredBy | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `/proc` (information came from Linux's /proc pseudofilesystem); `beep` (information came from UBSI<sup>1</sup>) | string (as enumerated) | required |
WasGeneratedBy | operation | one of: `create` (file was created); `open` (file was opened for writing); `write` (data was transferred to memory, file, or network); `send` (data was transferred from process to network); `connect` (outgoing network connection was established); `truncate` (data at end of file was removed); `rename (write)` (to new file, after renaming); `link (write)` (to new file, after linking); `mmap (write)` (to mapped memory); `tee (write)` (data copied to pipe); `splice (write)` (data transferred to destination); `vmsplice (write)` (data mapped to pipe); `chmod` (changed file permissions); `mprotect` (changed memory protection); `unlink` (file was deleted); `close` (file was closed); `lseek` (file offset was updated); `madvise` (set memory advice) | string (as enumerated) | required |
WasGeneratedBy | size | only for operations `truncate`, `tee (write)`, `splice (write)`, `vmsplice (write)`, `write`, and `send`: number of bytes transferred | long integer | optional |
WasGeneratedBy | mode | only for operations `chmod`, `open`, and `create`: permissions applied to the file | integer (in octal) | optional |
WasGeneratedBy | flags | only for operations `open` and `create`: status or creation flags | string | optional |
WasGeneratedBy | protection | only for operations `mmap` and `mprotect`: permissions set for the memory location | hexadecimal integer | optional |
WasGeneratedBy | offset | only for system calls `lseek`, `pwrite`, and `pwritev`: offset in the file | long | optional |
WasGeneratedBy | whence | only for system call `lseek`, one of `SEEK_SET`, `SEEK_CUR`, `SEEK_END`, `SEEK_DATA`, or `SEEK_HOLE`: directive on how to use the offset value for the `lseek` system call | string | optional |
WasGeneratedBy | advice | only for system call `madvise`, one of `MADV_NORMAL`, `MADV_RANDOM`, `MADV_SEQUENTIAL`, `MADV_WILLNEED`, `MADV_DONTNEED`, `MADV_FREE`, `MADV_REMOVE`, `MADV_DONTFORK`, `MADV_DOFORK`, `MADV_MERGEABLE`, `MADV_UNMERGEABLE`, `MADV_HUGEPAGE`, `MADV_NOHUGEPAGE`, `MADV_DONTDUMP`, `MADV_DODUMP`, `MADV_WIPEONFORK`, `MADV_KEEPONFORK`, `MADV_HWPOISON`, or `MADV_OFFLINE`: advice on memory use | string | optional |
WasGeneratedBy | time | if known, when the event occurred (in Unix time) | floating point | required |
WasGeneratedBy | event id | if source is `syscall`: underlying event's identifier | unsigned integer | required |
WasGeneratedBy | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `/proc` (information came from Linux's /proc pseudofilesystem); `beep` (information came from UBSI<sup>1</sup>) | string (as enumerated) | required |
Used | operation | one of: `open` (file was opened for reading); `read` (data was transferred from memory, file, or network); `recv` (data was transferred from network to process); `accept` (incoming network connection was established); `rename (read)` (from original file, before renaming); `link (read)` (from original file, before linking); `mmap (read)` (from mapped file); `tee (read)` (data copied from pipe); `splice (read)` (data transferred from source); `vmsplice (read)` (data mapped from pipe); `load` (dynamic library loaded); `close` (file was closed); `init_module` (module loaded from memory); `finit_module` (module loaded from file) | string (as enumerated) | required |
Used | size | only for operations `read`, `tee (read)`, `splice (read)`, `vmsplice (read)`, and `recv`: number of bytes transferred | long integer | optional |
Used | mode | only for operation `open`: permissions applied to the file | integer (in octal) | optional |
Used | flags | only for operation `open`: status flags | string | optional |
Used | offset | only for system calls `pread` and `preadv`: offset in the file from where bytes were read | long | optional |
Used | time | if known, when the event occurred (in Unix time) | floating point | required |
Used | event id | if source is `syscall`: underlying event's identifier | unsigned integer | required |
Used | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `/proc` (information came from Linux's /proc pseudofilesystem); `beep` (information came from UBSI<sup>1</sup>) | string (as enumerated) | required |
WasDerivedFrom | operation | one of: `update` (the artifact has been modified); `rename` (the same artifact has a new name); `link` (a new name can be used to refer to the old artifact); `mmap` (a file has been mapped into memory); `tee` (data copied between pipes); `splice` (data transferred between artifacts) | string (as enumerated) | required |
WasDerivedFrom | pid | process that performed the operation | integer | optional |
WasDerivedFrom | time | if known, when the event occurred (in Unix time) | floating point | required |
WasDerivedFrom | event id | if source is `syscall`: underlying event's identifier | unsigned integer | required |
WasDerivedFrom | source | one of: `syscall` (information came from a Linux kernel Audit system call record); `netfilter` (information came from a Linux kernel Audit network filter record); `/proc` (information came from Linux's /proc pseudofilesystem); `beep` (information came from UBSI<sup>1</sup>) | string (as enumerated) | required |
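
As a rough illustration of how a consumer might use the table, the sketch below checks a vertex's annotations against the rows marked "required" and reads a few typed values. It is not part of SPADE: the helper names (`check_vertex`, `process_time`, `parse_permissions`) are hypothetical, and it assumes annotations arrive as a dictionary of strings and that `permissions` is written as plain octal digits such as `0644`.

```python
# Required annotation keys per vertex type, copied from the "Presence" column.
REQUIRED_KEYS = {
    "Agent": {"uid", "euid", "gid", "egid", "source"},
    "Process": {"pid", "ppid", "source"},
    "Artifact": {"subtype", "source"},
}

# Enumerated values of the "source" annotation per vertex type.
ALLOWED_SOURCES = {
    "Agent": {"syscall", "/proc"},
    "Process": {"syscall", "/proc", "beep"},
    "Artifact": {"syscall", "netfilter", "/proc", "beep"},
}


def check_vertex(element, annotations):
    """Return a list of problems found in the annotations (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS[element] - set(annotations)
    for key in sorted(missing):
        problems.append(f"missing required annotation: {key}")
    source = annotations.get("source")
    if source is not None and source not in ALLOWED_SOURCES[element]:
        problems.append(f"unexpected source: {source}")
    return problems


def process_time(annotations):
    """Prefer 'start time'; fall back to 'seen time'; None if neither is set."""
    for key in ("start time", "seen time"):
        if key in annotations:
            return float(annotations[key])
    return None


def parse_permissions(value):
    """Interpret the 'permissions' annotation as an octal integer (assumed format)."""
    return int(value, 8)


if __name__ == "__main__":
    process = {"pid": "42", "ppid": "1", "source": "syscall",
               "seen time": "1500000000.25"}
    print(check_vertex("Process", process))
    print(process_time(process))
    print(oct(parse_permissions("0644")))
```

The fallback in `process_time` mirrors the table's wording that `seen time` is reported when the start time is not known.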
NOTE: Though some `operation` values match system call names, the semantics differ: the interpretation is provenance-oriented. Multiple system calls may map to a single `operation` value (for example, `chmod()` and `fchmod()` are both reported as `chmod`). Some system calls have an indirect effect (for example, `dup()` results in a new file descriptor that resolves to the old path during `read()` and `write()` calls). The mapping of system calls to OPM edges is outlined here.
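
The many-to-one relationship described in the note can be pictured with a small lookup. This is only a sketch: the two dictionaries are hypothetical and deliberately incomplete, containing just the `chmod()`/`fchmod()` pair named above and a few `operation` values taken from the table. They are not SPADE's actual mapping.

```python
# System call name -> reported operation value (only the pair named in the note).
SYSCALL_TO_OPERATION = {
    "chmod": "chmod",
    "fchmod": "chmod",
}

# Operation value -> OPM edge types that list it in the table above.
OPERATION_TO_EDGES = {
    "chmod": {"WasGeneratedBy"},
    "open": {"WasGeneratedBy", "Used"},   # for writing vs. for reading
    "close": {"WasGeneratedBy", "Used"},
    "load": {"Used"},
}


def edges_for_syscall(name):
    """Return (operation, edge types) for a system call, or (None, set())
    when this illustrative lookup has no direct entry for it."""
    operation = SYSCALL_TO_OPERATION.get(name)
    if operation is None:
        return None, set()
    return operation, OPERATION_TO_EDGES.get(operation, set())


if __name__ == "__main__":
    for syscall in ("chmod", "fchmod", "dup"):
        print(syscall, edges_for_syscall(syscall))
```

In this sketch `dup` has no entry, which loosely reflects the note's point that its effect is indirect and only shows up through later `read()` and `write()` calls.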
<sup>1</sup> Unit-based selective instrumentation (UBSI). For more information, see:
- Hassaan Irshad, Gabriela Ciocarlie, Ashish Gehani, Vinod Yegneswaran, Kyu Lee, Jignesh Patel, Somesh Jha, Yonghwi Kwon, Dongyan Xu, Xiangyu Zhang, TRACE: Enterprise-Wide Provenance Tracking For Real-Time APT Detection, IEEE Transactions on Information Forensics and Security (TIFS), 2021. [PDF]
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.