Instructions for reproducing the experiments reported in our paper [Directed Graph Auto-Encoders](https://arxiv.org/abs/2202.12449), published at AAAI 2022.
If our code is helpful for your research, please cite our work:
```bibtex
@inproceedings{gkolliasAAAI22,
  author    = {Georgios Kollias and
               Vasileios Kalantzis and
               Tsuyoshi Id\'e and
               Aur\'elie Lozano and
               Naoki Abe},
  title     = {Directed Graph Auto-Encoders},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI 2022)},
  month     = {February},
  year      = {2022}
}
```
The following Python packages are required in addition to the standard `tensorflow` and `pytorch` machine learning frameworks:

- `torch-geometric`: https://github.com/rusty1s/pytorch_geometric
- `gravity-gae`: https://github.com/deezer/gravity_graph_autoencoders
Copy all scripts under `code/scripts/` to the top-level `code/` directory.
The citation experiments use the feature-based `cora_ml` and `citeseer` datasets under `data/cora_ml/raw` and `data/citeseer/raw`. These were originally used in "Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking" by Aleksandar Bojchevski and Stephan Günnemann: https://github.com/abojchevski/graph2gauss
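The raw files follow the graph2gauss `.npz` layout. As a sketch, the loader below assumes the key names used by that repository (`adj_*` for the CSR adjacency, `attr_*` for the CSR attribute matrix, `labels` for node labels) and demonstrates them on a tiny synthetic file rather than the real datasets:

```python
import numpy as np
import scipy.sparse as sp

def load_npz_graph(path):
    """Load a graph stored in the graph2gauss .npz format.

    Key names ('adj_*', 'attr_*', 'labels') are assumed from the
    graph2gauss repository and may differ in other dumps.
    """
    with np.load(path, allow_pickle=True) as loader:
        loader = dict(loader)
        adj = sp.csr_matrix(
            (loader["adj_data"], loader["adj_indices"], loader["adj_indptr"]),
            shape=loader["adj_shape"],
        )
        attr = sp.csr_matrix(
            (loader["attr_data"], loader["attr_indices"], loader["attr_indptr"]),
            shape=loader["attr_shape"],
        )
        labels = loader.get("labels")
    return adj, attr, labels

# demo on a tiny synthetic file with the same key layout
a = sp.csr_matrix(np.array([[0, 1], [1, 0]], dtype=np.float32))
x = sp.csr_matrix(np.eye(2, dtype=np.float32))
np.savez(
    "tiny_graph.npz",
    adj_data=a.data, adj_indices=a.indices, adj_indptr=a.indptr, adj_shape=a.shape,
    attr_data=x.data, attr_indices=x.indices, attr_indptr=x.indptr, attr_shape=x.shape,
    labels=np.array([0, 1]),
)
adj, attr, labels = load_npz_graph("tiny_graph.npz")
print(adj.shape, attr.shape, labels)
```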
Execute `citation_grid_search.sh` to generate JSON files with performance-metric results for every dataset/model combination and for every hyperparameter value in the corresponding search grid defined in the manuscript. It performs 5 repetitions (different graph splits) per configuration and trains for 200 epochs per repetition. Example command:

```shell
python train.py --dataset=cora_ml --model=digae --alpha=0.0 --beta=0.2 --epochs=200 --nb_run=5 --logfile=digae_cora_ml_grid_search.json --learning_rate=0.005 --hidden=64 --dimension=32 --validate=True
```
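The sweep driven by `citation_grid_search.sh` can be sketched as nested loops over `alpha` and `beta`; the grid values below (`{0.0, 0.2, ..., 1.0}`) are an assumption for illustration — the authoritative grid is the one in the script and the manuscript. The sketch only echoes the commands into a file rather than running them:

```shell
# Sketch: sweep an assumed {0.0, 0.2, ..., 1.0} grid for alpha and beta
# (the actual grid values are defined in citation_grid_search.sh).
: > grid_commands.txt
for alpha in 0.0 0.2 0.4 0.6 0.8 1.0; do
  for beta in 0.0 0.2 0.4 0.6 0.8 1.0; do
    echo "python train.py --dataset=cora_ml --model=digae \
--alpha=${alpha} --beta=${beta} --epochs=200 --nb_run=5 \
--logfile=digae_cora_ml_grid_search.json \
--learning_rate=0.005 --hidden=64 --dimension=32 --validate=True" >> grid_commands.txt
  done
done
wc -l < grid_commands.txt   # 6 x 6 = 36 commands
```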
Hyperparameters yielding the best mean AUC are then selected for the final model runs; `citation_run.sh` collects the corresponding commands.
Execute `citation_run.sh` to generate JSON files with performance-metric results for every dataset/model combination at the selected hyperparameter values. It performs 20 repetitions per configuration and trains for 200 epochs per repetition. Example command:

```shell
python gravity_train.py --dataset=citeseer --model=gravity_gcn_ae --epochs=200 --nb_run=20 --logfile=run_features.json --learning_rate=0.005 --hidden=64 --dimension=32 --validate=False --lamb=0.1 --load_features=True
```
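The training scripts write per-repetition metrics to the JSON logfile; its exact schema is defined by those scripts, so the snippet below only illustrates the aggregation step (mean ± standard deviation over repetitions) on hypothetical AUC values:

```python
import statistics

# hypothetical per-repetition AUC scores (the final runs use 20 repetitions)
auc_scores = [0.91, 0.92, 0.90, 0.93, 0.91]

mean_auc = statistics.mean(auc_scores)
std_auc = statistics.stdev(auc_scores)
print(f"AUC: {mean_auc:.3f} +/- {std_auc:.3f}")
```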
Execute `citation_svd_run.sh` to generate JSON files with performance-metric results for all datasets, for both the SVD and Randomized SVD approaches, and for k = 2, 4, 8, 16, 32, 64, 128. It performs 20 repetitions per configuration. Example command:

```shell
python train.py --dataset=cora_ml --model=dummy_pair --epochs=10 --nb_run=20 --validate=False --feature_vector_type=svd --feature_vector_size=32 --logfile=svd_cora_ml_runs.json
```
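As a sketch of the SVD feature idea, one can take a rank-k truncated SVD of the adjacency matrix and use the scaled left singular vectors U·Σ as k-dimensional node features. The exact construction behind `--feature_vector_type=svd` is defined in `train.py`; the random adjacency below is only a stand-in:

```python
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Rank-k truncated SVD of a (random stand-in) sparse adjacency matrix.
n, k = 100, 32
adj = sparse_random(n, n, density=0.05, format="csr", random_state=0)

u, s, vt = svds(adj, k=k)   # k largest singular triplets
features = u * s            # scaled left factors, one k-dim feature per node
print(features.shape)
```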
The WebKB experiments use the feature-based `texas`, `cornell`, and `wisconsin` datasets under the corresponding folders in `data/`. In `torch-geometric` they can be imported through the `torch_geometric.datasets.WebKB` class.
Execute `webkb_grid_search.sh` to generate JSON files with performance-metric results for every dataset/model combination and for every hyperparameter value in the corresponding search grid defined in the manuscript. It performs 5 repetitions (different graph splits) per configuration and trains for 200 epochs per repetition. Example command:

```shell
python train.py --dataset=texas --model=digae_single_layer --alpha=0.0 --beta=0.0 --epochs=200 --nb_run=5 --logfile=texas_grid_search.json --learning_rate=0.005 --hidden=32 --dimension=16 --validate=True
```
Hyperparameters yielding the best mean AUC are then selected for the final model runs; `webkb_run.sh` collects the corresponding commands.
Execute `webkb_run.sh` to generate JSON files with performance-metric results for every dataset/model combination at the selected hyperparameter values. It performs 20 repetitions per configuration and trains for 200 epochs per repetition. Example command:

```shell
python train.py --dataset=wisconsin --model=digae_single_layer --alpha=0.8 --beta=0.8 --epochs=200 --nb_run=20 --logfile=webkb_run_features.json --learning_rate=0.005 --hidden=64 --dimension=32 --validate=False --feature_vector_type=None
```