Add a generic hostsfile artifact (#2930)
I wrote an artifact for parsing hosts files on Linux/macOS. Afterwards,
I realised there already [exists one for
Windows](https://github.com/Velocidex/velociraptor/blob/master/artifacts/definitions/Windows/System/HostsFile.yaml).
The file format is the same on all three supported OSes, so I don't see
a reason for not having a generic artifact that runs on all of them.

The differences now are:

- I match the array of hostnames against the regex, allowing one to use
anchors like "^"/"$" in the regex (instead of matching the string of
hostname/aliases)
- I provide a flattened query as well, with one hostname per row
- The parsing function is exported
- The hosts filename parameter is now a glob
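
To illustrate the first two points outside VQL, here is a rough Python sketch. It is not part of the artifact: `match_joined`, `match_each` and `flatten_row` are hypothetical names, and the sketch only assumes that the regex is applied with search semantics.

```python
import re


def match_joined(hostnames, pattern):
    """Match the regex once against the space-joined hostname/alias string."""
    return re.search(pattern, " ".join(hostnames)) is not None


def match_each(hostnames, pattern):
    """Match the regex against each hostname and keep the ones that match."""
    return [name for name in hostnames if re.search(pattern, name)]


def flatten_row(address, hostnames):
    """Produce one (address, hostname) row per hostname."""
    return [(address, name) for name in hostnames]


names = ["second.com", "standardcomment2.com"]
anchored = r"^standardcomment2\.com$"

joined_hit = match_joined(names, anchored)   # anchors see the whole joined string
per_name_hits = match_each(names, anchored)  # anchors see one hostname at a time
flat_rows = flatten_row("127.0.3.4", names)
```

With the joined string, `^` and `$` can only match at the ends of the whole line of hostnames, so an anchored regex for an alias in the middle or at the end does not behave as expected. Matching per element avoids that.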

Let me know what you want to do with this. Include it and leave the
existing Windows.System.HostsFile? Replace the existing? Suggest this to
belong in the exchange instead? Something else?

I'll add the test when I know how to proceed.

---------

Co-authored-by: Mike Cohen <[email protected]>
misje and scudette committed Sep 8, 2023
1 parent b6415e3 commit eee1288
Showing 5 changed files with 218 additions and 0 deletions.
91 changes: 91 additions & 0 deletions artifacts/definitions/Generic/System/HostsFile.yaml
@@ -0,0 +1,91 @@
name: Generic.System.HostsFile
description: |
  The system hosts file maps hostnames to IP addresses. In some cases,
  entries in this file take precedence and override the results from
  the system DNS service.

  The file is a simple text file, with one line per IP address. Each
  whitespace-separated word following the IP address is a hostname.
  The Linux man page refers to the first hostname as *canonical_hostname*,
  and any following words as *aliases*. They are treated the same by this
  artifact.

  The hosts file is typically present on all Unix-like systems (including
  Linux and macOS), with entries for localhost. The same file format is also
  supported on Windows.

  The source *Hosts* returns each line in each hosts file that matches
  the regex parameters for address and hostname. The hostname and aliases
  are combined in a single column *Hostname*. Columns returned:

  - OSPath
  - Address
  - Hostname
  - Comment

  Only comments that follow the hostname on the same line are captured in Comment.
  Comments on their own lines are ignored.

  A second source *HostsFlattened* provides a flattened result, with each row
  containing an IP address and a single hostname.

  This artifact also exports a function `parse_hostsfile()` that returns Hostname
  and Aliases individually.

reference:
  - https://manpages.debian.org/bookworm/manpages/hosts.5.en.html

export: |
  LET _parse_hostsfile(OSPath) = SELECT parse_string_with_regex(
        string=Line,
        regex='''^[\t ]*(?P<Address>[^\s#]+)[\t ]+(?P<Hostname>[^\s#]+)(?P<Aliases>[^#\n\r]+)?(?:[\t ]*#(?P<Comment>.+))?''') AS Parsed
    FROM parse_lines(filename=OSPath)
    WHERE Parsed.Address

  LET parse_hostsfile(OSPath) = SELECT Parsed.Address AS Address,
        Parsed.Hostname AS Hostname,
        filter(list=split(sep='''\s+''', string=Parsed.Aliases), regex='.') AS Aliases,
        /* Remove any whitespace between comment character and comment: */
        regex_replace(re='''^\s+''', source=Parsed.Comment, replace='$1') AS Comment
    FROM _parse_hostsfile(OSPath=OSPath)

  LET Files = SELECT OSPath FROM glob(globs=hostsFileGlobs.HostsFileGlobs)

  LET HostsFiles = SELECT * FROM foreach(row=Files, query={
      SELECT OSPath, Address, Hostname, Aliases, Comment
      FROM parse_hostsfile(OSPath=OSPath)
  })

parameters:
  - name: hostsFileGlobs
    description: Globs to find hosts files
    type: csv
    default: |
      HostsFileGlobs
      C:\Windows\System32\drivers\etc\hosts
      /etc/hosts
  - name: HostnameRegex
    description: Hostname or aliases to match
    default: .
    type: regex
  - name: AddressRegex
    description: IP addresses to match
    default: .
    type: regex

sources:
  - name: Hosts
    query: |
      SELECT OSPath, Address,
             filter(list= (Hostname, ) + Aliases, regex=HostnameRegex) AS Hostname,
             Comment
      FROM HostsFiles
      WHERE Hostname AND Address =~ AddressRegex
  - name: HostsFlattened
    query: |
      SELECT * FROM flatten(query={
          SELECT OSPath, Address, (Hostname, ) + Aliases AS Hostname, Comment
          FROM HostsFiles
      })
      WHERE Address =~ AddressRegex
        AND Hostname =~ HostnameRegex
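
For readers who want to try the line regex from the `export` section outside Velociraptor, the following is a minimal Python sketch of the same parsing steps. It is an approximation, not the artifact's implementation: `parse_hosts_line` is a hypothetical helper, and it assumes Python's `re` module treats this particular pattern the same way VQL's regex engine does.

```python
import re

# Same pattern as in the artifact's export section.
HOSTS_LINE = re.compile(
    r"^[\t ]*(?P<Address>[^\s#]+)[\t ]+(?P<Hostname>[^\s#]+)"
    r"(?P<Aliases>[^#\n\r]+)?(?:[\t ]*#(?P<Comment>.+))?"
)


def parse_hosts_line(line):
    """Parse one hosts file line; return None for blank and comment-only lines."""
    match = HOSTS_LINE.match(line)
    if match is None:
        return None
    return {
        "Address": match.group("Address"),
        "Hostname": match.group("Hostname"),
        # Split the alias run on whitespace and drop empty strings.
        "Aliases": (match.group("Aliases") or "").split(),
        # Drop whitespace between the comment character and the comment.
        "Comment": (match.group("Comment") or "").lstrip(),
    }


entry = parse_hosts_line(
    "127.0.3.4 second.com standardcomment2.com # testing standard comment"
)
```

A line that starts with `#` never matches, because the `Address` group cannot begin with a comment character. That is why comments on their own lines are ignored.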
3 changes: 3 additions & 0 deletions artifacts/testdata/files/hosts
@@ -19,7 +19,10 @@
# ::1 localhost

127.0.0.1 test.com
# Comment to ignore
127.0.0.2 test2.com
# Comment to ignore
127.0.3.3 standardcomment.com # testing standard comment
# Comment to ignore
127.0.3.4 second.com standardcomment2.com # testing standard comment
8.8.8.8 evil.com
75 changes: 75 additions & 0 deletions artifacts/testdata/server/testcases/README.md
@@ -0,0 +1,75 @@
## Velociraptor Golden Tests

The files in this directory are the golden test suite used by the CI
pipeline.

What are Golden tests? Golden testing is a methodology to quickly and
efficiently write tests:

1. First a test case is written with the VQL queries that should be
run. These queries are written in a file with a `.in.yaml`
extension.
2. The `golden` test runner can be run on the test files using `make
golden` at the top level of this repository.
3. If the output of the queries is different from the existing output
(stored in `.out.yaml` files) the test will fail. The golden runner
will then update the output file with the new data.
4. The user can compare the changes in the output file (e.g. using
`git diff`) and if the changes are OK then simply `git add` the new
output file. Running the golden tests again should produce no
change.
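
The compare-and-update cycle in steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration of the methodology, not how the `golden` runner is implemented; `run_golden` and its arguments are hypothetical.

```python
from pathlib import Path


def run_golden(output: str, golden_path: Path) -> bool:
    """Compare fresh query output with the stored golden file.

    Returns True when they are identical. On a mismatch (or when no
    golden file exists yet) the file is rewritten with the new output,
    so the change can be reviewed with `git diff`, and False is returned.
    """
    previous = golden_path.read_text() if golden_path.exists() else None
    if previous == output:
        return True
    golden_path.write_text(output)
    return False
```

Because a failing run rewrites the golden file, a second run with the same output passes. This is the "should produce no change" check in step 4.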

By default the makefile rule runs the debug race detector binary (you
can build this using just `make` at the top level). This will produce a
debug build in `./output/velociraptor`. This binary includes the race
detector, so it is quite slow to run, but worth it for tests.

If you find you need to iterate quicker you can manually run the
production binary (built using `make linux`) by modifying the command
run by the `make golden` command.

Additionally you can run the `dlv` debugger in the golden output by
running `make debug_golden` at the top level.

To filter the test cases (so they don't all have to run) you can set
the `GOLDEN` environment variable. For example to only run the tests
in `pe.in.yaml`:

```
$ GOLDEN=pe make golden
./output/velociraptor -v --config artifacts/testdata/windows/test.config.yaml golden artifacts/testdata/server/testcases/ --env srcDir=`pwd` --filter=pe
```


## NOTES

Golden Testing requires the output to not change between subsequent
runs and when running in different environments. This means that
output that naturally changes should be avoided - for example output
that depends on:

- Time
- File paths
- Operating systems

You can use a combination of mocking plugin output and selecting
specific columns to format the output in such a way that it does not
depend on ephemeral things.
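
As a rough illustration of selecting specific columns, the Python sketch below drops a path column that would differ between checkouts. `select_columns` is a hypothetical helper and the row is made-up sample data; in a real test case the same effect comes from naming the columns in the VQL `SELECT`.

```python
def select_columns(rows, columns):
    """Keep only the named columns of each row, in the given order."""
    return [{name: row.get(name) for name in columns} for row in rows]


# Hypothetical row: OSPath depends on where the repository is checked out.
rows = [
    {
        "OSPath": "/home/alice/src/artifacts/testdata/files/hosts",
        "Address": "8.8.8.8",
        "Hostname": "evil.com",
        "Comment": "",
    }
]

stable = select_columns(rows, ["Address", "Hostname", "Comment"])
```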


## Developing artifacts

When developing artifacts using TDD it is useful to load the raw
artifact YAML without needing to build the binary each time. This way
we can iterate over the artifact yaml and see the results immediately
in the golden out yaml.

An example command line is:

```
./output/velociraptor-v0.7.0-linux-amd64 -v --config artifacts/testdata/windows/test.config.yaml golden artifacts/testdata/server/testcases/ --env srcDir=`pwd` --filter=hostsfile --definitions artifacts/definitions/Generic/System/
```

Here the binary will force load the raw yaml definition at runtime
overriding the built in artifact definition. It will then run the
Golden test `hostsfile.in.yaml`.
18 changes: 18 additions & 0 deletions artifacts/testdata/server/testcases/hostsfile.in.yaml
@@ -10,3 +10,21 @@ Queries:
  - SELECT * FROM Artifact.Windows.System.HostsFile(
       HostsFile=srcDir + '/artifacts/testdata/files/hosts',
       ResolutionRegex = '127.0.3.3')

  - LET hostsFileGlobs = (dict(HostsFileGlobs=srcDir + '/artifacts/testdata/files/hosts'),)

  - SELECT Address, Hostname, Comment
    FROM Artifact.Generic.System.HostsFile(
       hostsFileGlobs=hostsFileGlobs, HostnameRegex = 'second.com', source='HostsFlattened')

  - SELECT Address, Hostname, Comment
    FROM Artifact.Generic.System.HostsFile(
       hostsFileGlobs=hostsFileGlobs, AddressRegex = '127.0.0', source='HostsFlattened')

  - SELECT Address, Hostname, Comment
    FROM Artifact.Generic.System.HostsFile(
       hostsFileGlobs=hostsFileGlobs, AddressRegex = '0.3.3$', source='HostsFlattened')

  - SELECT Address, Hostname, Comment
    FROM Artifact.Generic.System.HostsFile(
       hostsFileGlobs=hostsFileGlobs, HostnameRegex = 'second.com', source='Hosts')
31 changes: 31 additions & 0 deletions artifacts/testdata/server/testcases/hostsfile.out.yaml
@@ -28,4 +28,35 @@ SELECT * FROM Artifact.Windows.System.HostsFile( HostsFile=srcDir + '/artifacts/
"Comment": "testing standard comment",
"_Source": "Windows.System.HostsFile"
}
]LET hostsFileGlobs = (dict(HostsFileGlobs=srcDir + '/artifacts/testdata/files/hosts'),)[]SELECT Address, Hostname, Comment FROM Artifact.Generic.System.HostsFile( hostsFileGlobs=hostsFileGlobs, HostnameRegex = 'second.com', source='HostsFlattened')[
{
"Address": "127.0.3.4",
"Hostname": "second.com",
"Comment": "testing standard comment"
}
]SELECT Address, Hostname, Comment FROM Artifact.Generic.System.HostsFile( hostsFileGlobs=hostsFileGlobs, AddressRegex = '127.0.0', source='HostsFlattened')[
{
"Address": "127.0.0.1",
"Hostname": "test.com",
"Comment": ""
},
{
"Address": "127.0.0.2",
"Hostname": "test2.com",
"Comment": ""
}
]SELECT Address, Hostname, Comment FROM Artifact.Generic.System.HostsFile( hostsFileGlobs=hostsFileGlobs, AddressRegex = '0.3.3$', source='HostsFlattened')[
{
"Address": "127.0.3.3",
"Hostname": "standardcomment.com",
"Comment": "testing standard comment"
}
]SELECT Address, Hostname, Comment FROM Artifact.Generic.System.HostsFile( hostsFileGlobs=hostsFileGlobs, HostnameRegex = 'second.com', source='Hosts')[
{
"Address": "127.0.3.4",
"Hostname": [
"second.com"
],
"Comment": "testing standard comment"
}
]
