This repository accompanies the paper "Investigating Gender Fairness of Recommendation Algorithms in the Music Domain" by Alessandro B. Melchiorre, Navid Rekabsaz, Emilia Parada-Cabaleiro, Stefan Brandl, Oleg Lesota, and Markus Schedl.
Create the environment with
conda env create -f gender_fair.yml
conda activate gender_fair
and then set up the project with
python3 setup.py develop
Data is available at http://www.cp.jku.at/datasets/LFM-2b/
To sample the items as done in the paper, please check `item_sampler.ipynb` in the `notebooks` folder.
Data paths must be set in the `conf.py` file. In particular:

- `DATA_PATH`: points to a .tsv file with columns [`user_id`, `track_id`, `play_count`].
- `DEMO_PATH`: points to a .tsv file with columns [`user_name`, `country`, `age`, `gender`, `timestamp`].
- `TRACKS_PATH`: points to a .tsv file with columns [`track_artist`, `track_name`].
- `DOWN_DATA_PATH`: similar to `DATA_PATH` but contains the balanced dataset (optional).
- `DOWN_DEMO_PATH`: similar to `DEMO_PATH` but contains the balanced dataset (optional).
- `OUT_DIR`: directory where the processed datasets (e.g. train/val/test data for fold 0) are stored.
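For reference, a minimal `conf.py` might look like the sketch below; the file names and locations are placeholders for illustration, not the actual paths used in the paper.

```python
# conf.py -- sketch with placeholder paths; adjust to your local setup.
DATA_PATH = "/data/lfm2b/listening_events.tsv"                # user_id, track_id, play_count
DEMO_PATH = "/data/lfm2b/users.tsv"                           # user_name, country, age, gender, timestamp
TRACKS_PATH = "/data/lfm2b/tracks.tsv"                        # track_artist, track_name
DOWN_DATA_PATH = "/data/lfm2b/listening_events_balanced.tsv"  # optional, balanced dataset
DOWN_DEMO_PATH = "/data/lfm2b/users_balanced.tsv"             # optional, balanced dataset
OUT_DIR = "/data/lfm2b/processed"                             # processed train/val/test folds
```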
The `algorithms` folder contains all the algorithms used in the project. Move to the folder of the algorithm of your choice (e.g. `slim`)
cd algorithms/slim
and with the active env, run the experiment
python3 main_slim.py --experiment_type <experiment_type>
`experiment_type` can be:

- `standard`: original data
- `up_sample`: debiasing method where the minority group in the training data is upsampled (as specified in the paper)
- `down_sample`: balanced data is used instead, where users from the majority group have been discarded

For `vae`, there is also the parameter `gpu`, which specifies which GPU to use.
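For example, assuming the VAE entry point follows the same naming pattern as the other algorithms (a hypothetical `main_vae.py`; check the folder for the actual script name and flag spelling), a run on the first GPU could look like:

```
cd algorithms/vae
# main_vae.py and the exact flag spelling are assumptions; see the folder's entry point.
python3 main_vae.py --experiment_type standard --gpu 0
```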
Experiment status and results will be saved in the `res` folder (created when the first experiment is run).
To check the status of your experiment, move into the folder of the algorithm you are running (e.g. `slim`):
cd res/slim/<experiment_type>-<timestamp_start_experiment>/val
For a broad overview of the experiments, you can use `tensorboard` in this folder:
tensorboard --logdir=./
You can also jump to a specific fold by selecting one of the folders there (numbered from 0 to 4 for 5-fold cross-validation), e.g. to monitor how the experiment is going on fold 0:
cd 0
tensorboard --logdir=./
To run the evaluation when an experiment has ended, move into the folder of the algorithm you ran (e.g. `slim`)
cd algorithms/slim
and then run the evaluation with:
python3 eval_slim.py --experiment_type <experiment_type>
where the `experiment_type` parameter is defined as above in "Check the experiment status".
NOTE: Evaluation requires loading the trained model (except for `pop`). The code will automatically look in the `res` folder for the best configuration.
You need to update the path to the trained model by providing the `timestamp_start_experiment` in the evaluation code.
E.g. for `eval_slim.py`, you can find this at the top of the file:
best_configs = {
'standard': '2020-11-24 19:43:47.801230',
'up_sample': '2020-11-24 19:44:32.188924',
'down_sample': '2020-11-24 19:45:15.744244'
}
Set the timestamp of your experiment in the dictionary (in the correct `experiment_type` field) before running the evaluation.
E.g. if `slim` is started at `2020-11-24 19:43:47.801230` with
python3 main_slim.py --experiment_type standard
then in `res` there will be a folder `slim/standard-2020-11-24 19:43:47.801230`.
After training and validation, place `2020-11-24 19:43:47.801230` in the `standard` field of the `best_configs` dictionary in `eval_slim.py`.
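As a rough illustration (the actual lookup lives in `eval_slim.py` and may be implemented differently), the timestamp you enter simply completes the path of the experiment folder that the evaluation reads from:

```python
# Sketch only: how the configured timestamp maps to the experiment folder on disk.
experiment_type = 'standard'
timestamp = best_configs[experiment_type]   # e.g. '2020-11-24 19:43:47.801230'
experiment_dir = f"res/slim/{experiment_type}-{timestamp}"
# -> res/slim/standard-2020-11-24 19:43:47.801230
```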
To check the results of your experiment, move into the folder of your experiment (e.g. `slim`):
cd res/slim/<experiment_type>-<timestamp_start_experiment>/test
For a broad overview of the experiments, you can use `tensorboard` in this folder:
tensorboard --logdir=./
The results are saved separately for each fold in the corresponding folder. In each fold folder you will find:

- `full_metrics.pkl`
- `full_raw_metrics.pkl`

Both files contain a dictionary where the key is the evaluated metric function and the value is the result of that metric function on the test data.

- `full_raw_metrics.pkl` contains vectors as values of the dictionary (results are not aggregated).
- `full_metrics.pkl` contains floats as values of the dictionary (results are aggregated with the mean).
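For instance, a quick way to inspect the aggregated metrics of one fold could look like the following sketch (the folder path is a placeholder; substitute your own experiment and fold):

```python
import pickle

# Placeholder path: point this at your experiment's test folder and fold number.
fold_dir = "res/slim/standard-2020-11-24 19:43:47.801230/test/0"

with open(f"{fold_dir}/full_metrics.pkl", "rb") as f:
    metrics = pickle.load(f)  # dict: metric function -> aggregated (mean) result

for metric_name, value in metrics.items():
    print(metric_name, value)
```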
The best configuration for each algorithm and for each fold is located in:

res/slim/<experiment_type>-<timestamp_start_experiment>/val/<fold-number>/best_config.pkl
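It is a pickled file as well, so it can be inspected in the same way (again, the path below is a placeholder):

```python
import pickle

# Placeholder path: best configuration found on fold 0 of a hypothetical experiment.
with open("res/slim/standard-2020-11-24 19:43:47.801230/val/0/best_config.pkl", "rb") as f:
    best_config = pickle.load(f)
print(best_config)
```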
The significance testing is located in the `significance_testing` folder. The code is in R and Jupyter notebooks.
For instructions, refer to the README in that folder.
cd significance_testing