This project implements an end-to-end system for automatic fingerprinting of transmitters at the physical layer, incorporating capture, demodulation, model training, fingerprinting, fingerprint tracking and visualisation.
The project underpins work to study the usefulness of machine-learning techniques in device fingerprinting. Fingerprint derivation is typically a costly, manual task and can vary noticeably between technologies and scenarios. There is also an inherent tension between deriving fingerprinting methods which are robust and generalisable vs. taking full advantage of specifics in a given scenario to enhance performance. The goal of this work is to reduce the need for human analysis in deriving fingerprints.
The system is built on a modular framework to permit simple variation at each stage.
At present the framework is validated with two concrete examples, developed for this project:
- 1090 MHz ADS-B traffic, received over-the-air
- RS-485 Serial traffic, received via cabling
The system comprises several components:
Component | Description |
---|---|
Signal Capture | Retrieval of a raw signal. Compatible with either software radio frameworks (e.g., GNURadio) or other digital capture (e.g., storage oscilloscopes). |
Demodulation | Interface to 3rd party demodulation code for message validation and data extraction. |
Identifier Masking | A pre-processing stage to remove in-message identifiers, which lead to trivial solutions |
Fingerprint Extraction and Comparison | A Siamese network model used to derive comparisons between received messages across observed transmitters. |
Message Verification | Long-term tracking of message fingerprint comparisons and identity verification |
In addition to the main pipeline, some additional components are implemented to help study the data:
Component | Description |
---|---|
Storage | Storage components receive bursts and write them to files for offline processing (e.g. model training). At present there are two storage components: one writing bursts to an HDF5 file, one writing decoded messages to an SQLite3 database. |
Visualisation | Visualisations to show live performance. A message verification visualisation is implemented for the ADS-B case, in which messages and their verification status are displayed, alongside a map of their claimed positions. |
Supporting (Weather) | A supporting component retrieves and stores data on local weather conditions. |
Data transfer between stages is principally through ZMQ message queues in a PUB/SUB pattern. Each part of the pipeline outputs received bursts, along with appropriate metadata, and forwards them to all subscribers. Alternatively, data can be written to files at each stage, for offline processing. However the streaming model is preferred, both for architectural convenience when splitting across containers/hosts and to support long-running, live uses in which stopping and outputting a file is cumbersome. As such, the file-based method is considered deprecated.
On host:
- Docker
- SecLab Docker images for GNURadio (https://github.com/ssloxford/docker-gnuradio)
Included within containers:
- Python 3
- numPy, pyplot, h5py
- pySDRBurstfile (https://gitlab.com/sdrburstfile/pysdrburstfile)
- Ettus USRP UHD
- GNURadio
- gr-burstfile
- gr-triggers
- Tensorflow 2, Keras
- ZMQ
- Bokeh
-
Clone and build base GNURadio images
git clone https://github.com/ssloxford/docker-gnuradio
cd docker-gnuradio
./build_all.sh
-
git clone --recursive https://github.com/ssloxford/auto-phy-fingerprint
-
Create directory
auto-phy-fingerprint-mountdir
-
If using GPU resources, add a suitable
deploy.resources
section todocker-compose.yml
-
docker-compose up -d
-
Check for files appearing in
auto-phy-fingerprint-mountdir
and accessibility of the visualisation service on port 5006
Two case studies exist at present: one for ADS-B and one for RS-485. Please see each link for details.
This project is supported by InnovateUK under the HICLASS project.