# Domain Auto-labeling through Reinforcement Learning for the Inference of Gestures (DARLInG)
This is the code repository for Yvan Satyawan's master's thesis.
We must first prepare the dataset by generating indexes of the data and calculating its mean and standard deviation for later standardization.

- Generate the small dataset, if desired.
  - Generate an index for the small dataset using `src/data_utils/generate_dataset_index.py`.
    - To generate the single-user leave-out index, run `generate_dataset_index.py [PATH_TO_DATA_ROOT] -u`.
    - Use `-s` instead of `-u` to make a single-domain index.
    - Add `-n 3` to use only 3 repetitions instead of the full dataset. This is useful for debugging, as it results in a smaller dataset.
  - Generate the smaller datasets using `src/data_utils/generate_smaller_splits.py`.
    - To generate the single-user split, run `generate_smaller_splits.py [PATH_TO_DATA_ROOT] single_user`.
    - `single_user` can be replaced with whatever index file suffix was generated by `generate_dataset_index.py`.
- Otherwise, generate the index of the full dataset using `src/data_utils/generate_dataset_index.py`.
- Calculate the mean and standard deviation of amplitude and phase using `src/data_utils/calculate_mean_std.py`.
  - If not using the full dataset, copy both the indexes and the generated `mean_std.csv` into the smaller split directory.
- Pregenerate transformations using `src/data_utils/pregenerate_transform.py`, as they take too long to generate during training.
  - Run it with the config file for a given experiment: `pregenerate_transform.py [PATH_TO_CONFIG_FILE]`.
Config files can be generated to ensure that runs are consistent. The experimental run configuration files we used are stored in the `run_configs` directory as YAML files. `utils/config_parser.py` contains the full list of possible configuration parameters.
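As a rough illustration, a run configuration might look like the following sketch. The field names here are purely hypothetical; `utils/config_parser.py` contains the authoritative list of parameters.

```yaml
# Illustrative only -- real parameter names are defined in utils/config_parser.py.
train:
  epochs: 50
  batch_size: 32
  lr: 0.001
data:
  dataset_root: /path/to/widar3
  split: single_user
```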
DARLInG is trained by `experiments/train_runner.py`, which is run with `train_runner.py [PATH_TO_CONFIG_FILE]`.
Hyperparameter sweeps with Weights and Biases can be done by pointing Weights and Biases at `utils/sweep_runner.py` as the main file. This runner accepts command-line arguments instead of a YAML file to initialize training; internally, it transforms the command-line arguments into a dictionary and runs that dictionary through the standard config parser.
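The argument-to-dictionary pattern described above can be sketched roughly as follows. The argument names are hypothetical, not the actual parameters accepted by `sweep_runner.py`:

```python
# Rough sketch of the sweep-runner pattern (argument names are hypothetical;
# see utils/sweep_runner.py for the real parameters).
import argparse


def args_to_config(argv=None):
    """Turn command-line arguments into the dict shape a config parser expects."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--batch_size", type=int, default=32)
    args = parser.parse_args(argv)
    # vars() converts the Namespace to a plain dict, which can then be run
    # through the same config parser that handles the YAML config files.
    return vars(args)


config = args_to_config(["--lr", "0.01"])
```

This lets a W&B sweep agent, which passes hyperparameters as command-line flags, reuse the exact same validation logic as YAML-based runs.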
All the other files and scripts are carefully documented with Python docstrings that describe what they do and how they work, but they are generally not necessary for running training. `train_runner.py` is for general model training, `train.py` hosts the main training loop code, and `final_experiment_runner.py` contains code to repeatedly call `train_runner.py` on multiple configurations without causing memory issues. All other scripts in `experiments/` also document their motivation, the question each script attempts to answer, and the answer; these are mostly notes to myself to understand what the data looks like.
The dataset is split across many files in multiple folders. The folder naming scheme has little meaning beyond grouping the data by when it was captured.

Not all gestures are performed by all users. As such, we only use gestures 1-6 in this work.
Number of samples per user and gesture:

User | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total
---|---|---|---|---|---|---|---|---|---|---|---
1 | 130 | 130 | 130 | 130 | 130 | 130 | 65 | 65 | 65 | 40 | 1015
2 | 200 | 175 | 175 | 175 | 150 | 125 | 25 | 25 | 25 | 25 | 1100
3 | 150 | 150 | 150 | 125 | 125 | 125 | | | | | 825
4 | 25 | 25 | 25 | 25 | 25 | 25 | | | | | 150
5 | 50 | 50 | 50 | 50 | 50 | 50 | 25 | 25 | 25 | | 375
6 | 50 | 50 | 50 | 50 | 50 | 50 | | | | | 300
7 | 25 | 25 | 25 | 25 | 25 | 25 | | | | | 150
8 | 25 | 25 | 25 | 25 | 25 | 25 | | | | | 150
9 | 25 | 25 | 25 | 25 | 25 | 25 | | | | | 150
10 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
11 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
12 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
13 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
14 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
15 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
16 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
17 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | | 225
Total | 880 | 855 | 855 | 830 | 805 | 780 | 315 | 315 | 315 | 65 | 6015
The CSI files are `.dat` files, which are simply CSI dumps from the tool used by the team to gather CSI data. The file naming convention is as follows: `id-a-b-c-d-Rx.dat`
id | a | b | c | d | Rx
---|---|---|---|---|---
User ID | Gesture Class | Torso Location | Face Orientation | Repetition Number | Wi-Fi Receiver ID
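For illustration, a filename following this convention can be split into its fields with a small helper. This helper is hypothetical and not part of the repository; the receiver-ID format in the example is an assumption.

```python
def parse_csi_filename(filename):
    """Split a CSI filename following the id-a-b-c-d-Rx convention.

    Illustrative helper only, not part of the repository; the receiver ID
    format (e.g. 'r1') is an assumption.
    """
    stem = filename.rsplit(".", 1)[0]  # drop the '.dat' extension
    user, gesture, torso, orientation, repetition, receiver = stem.split("-")
    return {
        "user_id": int(user),
        "gesture": int(gesture),
        "torso_location": int(torso),
        "face_orientation": int(orientation),
        "repetition": int(repetition),
        "receiver": receiver,
    }


fields = parse_csi_filename("2-1-1-1-5-r1.dat")
```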
Each recorded CSI sequence can be understood as a tensor with the shape (i, j, k, 1).
- i is the packet number
- j is the subcarrier number
- k is the receiver antenna number
In the case of Widar3.0, the value of k is always 3 (3 antennas per receiver). Widar3.0 uses 1 transmitter and 6 receivers placed around the sensing area.
We use the `csiread` package to read these files; the `.dat` files can be read using `csiread.Intel`.
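A minimal loading sketch, assuming the Intel 5300 dump format that `csiread.Intel` expects; check the csiread documentation for the exact keyword arguments before relying on this:

```python
def load_widar_csi(dat_path):
    """Read one Widar3.0 CSI dump with csiread (sketch, untested assumptions).

    The nrxnum/ntxnum values reflect the Widar3.0 setup described above
    (3 antennas per receiver); verify them against your own files.
    """
    import csiread  # third-party: pip install csiread

    data = csiread.Intel(dat_path, nrxnum=3, ntxnum=1)
    data.read()
    # get_scaled_csi() returns complex CSI with shape
    # (packets, subcarriers, rx_antennas, tx_antennas).
    return data.get_scaled_csi()
```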
The BVP files are `.mat` (MATLAB) files that have been preprocessed by the authors. The file naming convention is as follows: `id-a-b-c-d-suffix.mat`
id | a | b | c | d
---|---|---|---|---
User ID | Gesture Class | Torso Location | Face Orientation | Repetition Number
`suffix` is not documented; as far as I can tell, it encodes the configuration used to produce the BVP. There is no receiver ID, since all 6 receivers were combined to produce the BVP.
Each file is a 20x20xT tensor.
Dimension | Meaning |
---|---|
0 | Velocity along the x-axis, over [-2, +2] m/s |
1 | Velocity along the y-axis, over [-2, +2] m/s |
2 | Timestamp, sampled at 10 Hz |
We use SciPy to read the `.mat` files with `scipy.io.loadmat()`. BVP lengths are not consistent, so we pad them all to a common length of 28.
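The padding step can be sketched as below. The `loadmat` key shown in the comment is an assumption; inspect the dict returned by `loadmat` to find the real key in your files.

```python
import numpy as np


def pad_bvp(bvp, target_len=28):
    """Zero-pad a (20, 20, T) BVP tensor along time to (20, 20, target_len)."""
    pad = target_len - bvp.shape[2]
    if pad < 0:
        raise ValueError(f"BVP longer than target length: T={bvp.shape[2]}")
    return np.pad(bvp, ((0, 0), (0, 0), (0, pad)))


# Loading sketch (the key name below is an assumption; inspect the dict that
# loadmat returns to find the real one):
# from scipy.io import loadmat
# bvp = loadmat("id-a-b-c-d-suffix.mat")["velocity_spectrum_ro"]

padded = pad_bvp(np.zeros((20, 20, 17)))
```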
The directory `20181130-VS` contains no `6-link` subdirectory.
As we are testing in-domain vs. out-of-domain performance, we use the following data splits for training, validation, and testing.
Set | Room IDs | User IDs | Torso Location |
---|---|---|---|
Training | 1, 2 | 1, 2, 4, 5 | 1-5 |
Validation | 1 | 10-17 | 1-5 |
Test Room | 3 | 3, 7, 8, 9 | 1-5 |
Test Torso Location | 1 | 1 | 6-8 |
We split the data this way to ensure that the test set is truly unseen, while the validation set is an unseen room-user combination rather than being truly unseen.
We only use gestures 1-6, since these are the gestures which have samples from all participants.
For single domain, we use only User 2 in Room 1 with torso location 1 and face orientation 1. This was chosen as it has the largest number of samples. Test, validation, and training splits are randomly generated.