About
RAFBL is the repository accompanying the manuscript: Reaction-Agnostic Featurization of Bidentate Ligands for Bayesian Ridge Regression of Enantioselectivity. It includes two packages, modsel and moltop.
modsel derives additional ligand features from the base features and handles feature selection for the final models.
moltop generates topological features from molecular structures. The molecular graph is constructed either from xyz coordinates and covalent radii or directly from SMILES.
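The coordinate-based route can be illustrated with a minimal sketch. This is not moltop's actual API: the function name `build_graph`, the small radii table, and the 0.4 Å tolerance are assumptions chosen for illustration. Two atoms are treated as bonded when their distance does not exceed the sum of their covalent radii plus the tolerance.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (illustrative subset only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "P": 1.07}


def build_graph(atoms, tolerance=0.4):
    """Return an adjacency dict {atom_index: set_of_neighbor_indices}.

    atoms: list of (element, x, y, z) tuples with coordinates in angstroms.
    tolerance: slack in angstroms added to the sum of radii (an assumed value).
    """
    graph = {i: set() for i in range(len(atoms))}
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        dist = math.dist(a[1:], b[1:])
        cutoff = COVALENT_RADII[a[0]] + COVALENT_RADII[b[0]] + tolerance
        if dist <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Water as a toy input: the oxygen bonds to both hydrogens,
# while the two hydrogens are too far apart to be connected.
water = [
    ("O", 0.000, 0.000, 0.117),
    ("H", 0.000, 0.757, -0.469),
    ("H", 0.000, -0.757, -0.469),
]
graph = build_graph(water)
```

Topological descriptors such as atom degrees or path counts can then be read off the adjacency structure without further reference to the geometry.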
Ligand features can be visualized on Materials Cloud.
Install
We recommend using conda to install all the required dependencies.
To create the environment, run:
conda env create -f environment.yml
And then activate the environment as:
conda activate rafbl
Run
To re-generate the features from Gaussian log files you can run:
./feat_csd.sh
./feat_lit.sh
This process takes a long time but only has to be run once. Be aware that regeneration overwrites the existing, ready-to-use feature lists, so once started it has to be run to completion.
The final files containing all the features can be found under ligs/csd_pool.csv for the CSD ligands and under ligs/lit_pool.csv for the literature ligands.
# possible modes: 0 -> oa, 1 -> cp, 2 -> cc, 3 -> da_f
python main_models.py 0
# possible modes: 1 -> csd ligands, 2 -> literature ligands
python main_pool_cand.py 1
A list of ligands sorted by decreasing Expected Improvement (EI) values is obtained.
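For readers unfamiliar with EI, the ranking step can be sketched as follows. This is a generic illustration and not the code in main_pool_cand.py: it assumes a Gaussian predictive distribution (mean and standard deviation per ligand, as a Bayesian ridge model provides), assumes the target is to be maximized, and uses hypothetical ligand labels and numbers.

```python
import math


def expected_improvement(mean, std, best):
    """Closed-form EI for maximization under a Gaussian prediction."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


# Hypothetical predictions: (ligand label, predicted mean, predicted std).
predictions = [
    ("lig_a", 1.2, 0.1),
    ("lig_b", 1.0, 0.8),
    ("lig_c", 0.5, 0.05),
]
best_observed = 1.1  # best value measured so far (made-up number)

# Sort candidates by decreasing EI.
ranked = sorted(
    predictions,
    key=lambda p: expected_improvement(p[1], p[2], best_observed),
    reverse=True,
)
for label, mean, std in ranked:
    print(label, round(expected_improvement(mean, std, best_observed), 4))
```

In this toy example the uncertain candidate lig_b ranks above lig_a even though its predicted mean is lower, which shows how EI trades off exploitation against exploration.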