AABBA is a Python library for applying autocorrelation (AC) functions

uiocompcat/AABBA

AABBA

AABBA is a Graph Kernel Python library for applying autocorrelation (AC) functions. It transforms molecular graphs into a fixed-length vector that combines generic properties (GP) and natural bond orbital (NBO) properties.

NB! The terms generic properties (GP) and periodic-table (PT) features are used interchangeably.

Requirements

  • Python 3.7.0+
  • NumPy 1.21.5+
  • NetworkX 2.6.3

Scripts

  • graph_info.py - reads the chemical graph and extracts the indices of the nodes and edges at different depths. It also provides the labels of the features.
  • ac_functions.py - computes the autocorrelation functions.
  • utilities.py - tools for manipulating the data.
  • ac_PT_multithread.py - parallel implementation that computes the autocorrelation functions with periodic-table (PT) features.
  • ac_NBO_multithread.py - parallel implementation that computes the autocorrelation functions with NBO features.

How to use

Clone the code.

git clone

Install the requirements.

Define the AABBA parameters in the first lines of the ac_NBO_multithread.py and ac_PT_multithread.py files:

  1. Set the parameters of the autocorrelation functions in the "PARAMS" dictionary.
  • ac_operator: origin of the autocorrelation (M or F) followed by the arithmetic operator (A, D, S, R) applied to the properties.
| ac_operator (str) | Description |
| --- | --- |
| MA | metal-centered product autocorrelation |
| MD | metal-centered deltametric autocorrelation |
| MS | metal-centered summetric autocorrelation |
| MR | metal-centered ratiometric autocorrelation |
| FA | full product autocorrelation |
| FD | full deltametric autocorrelation |
| FS | full summetric autocorrelation |
| FR | full ratiometric autocorrelation |

NB! MC (prefix M in ac_operator) refers to metal-centered autocorrelation and F refers to full autocorrelation.
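To make the operator codes concrete, here is a minimal sketch, not the code in ac_functions.py. It assumes the common depth-wise form of an autocorrelation, where every pair of atoms separated by exactly d bonds contributes op(P_i, P_j) to entry d of the vector. The function names, the toy graph, and the choice of atomic number as the property are hypothetical, and a plain adjacency dictionary stands in for the NetworkX graph.

```python
from collections import deque
import operator

# Arithmetic operator selected by the second letter of ac_operator.
OPERATORS = {
    "A": operator.mul,      # product
    "D": operator.sub,      # deltametric (difference)
    "S": operator.add,      # summetric (sum)
    "R": operator.truediv,  # ratiometric (ratio)
}


def distances_from(adjacency, start):
    """Breadth-first search: bond count from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def autocorrelation(adjacency, prop, ac_operator, depth_max, metal=None):
    """Return [AC_0, ..., AC_depth_max] for a single atomic property."""
    origin, op_code = ac_operator[0], ac_operator[1]
    op = OPERATORS[op_code]
    if origin == "M":
        if metal is None:
            raise ValueError("metal-centered autocorrelation needs a metal node")
        starts = [metal]          # only walks that start at the metal
    else:
        starts = list(adjacency)  # full: every atom is a starting point
    vector = [0.0] * (depth_max + 1)
    for i in starts:
        for j, d in distances_from(adjacency, i).items():
            if d <= depth_max:
                vector[d] += op(prop[i], prop[j])
    return vector


# Hypothetical toy complex: Fe bonded to two N atoms, one N bonded to a C atom.
adjacency = {"Fe": ["N1", "N2"], "N1": ["Fe", "C1"], "N2": ["Fe"], "C1": ["N1"]}
Z = {"Fe": 26, "N1": 7, "N2": 7, "C1": 6}  # atomic numbers

print(autocorrelation(adjacency, Z, "MA", depth_max=2, metal="Fe"))
print(autocorrelation(adjacency, Z, "MD", depth_max=2, metal="Fe"))
print(autocorrelation(adjacency, Z, "FA", depth_max=1))
```

In this sketch the full variant visits ordered pairs, so each bonded pair is counted once in each direction; whether the library does the same should be checked in ac_functions.py.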

  • walk: type of autocorrelation, i.e. the mode used to read the chemical graph.

| walk (str) | Description |
| --- | --- |
| AA | atom-atom autocorrelation |
| BBavg | bond-bond autocorrelation with averaging-superbond (only for MC) |
| BB | bond-bond autocorrelation with summing-superbond (for MC) or with full bond-bond (for F) |
| AB | bond-atom autocorrelation |
| ABBAavg | implicit autocorrelation, AABBA(II), with averaging-superbond (only for MC); select a model from 1 to 5 |
| ABBA | implicit autocorrelation, AABBA(II), with individual bond (only for F); select a model from 1 to 5 |

NB! According to the article:

AABBA(I) = AA ⊕ BBavg ⊕ AB. The three vectors must therefore be computed separately (first AA, then BBavg, then AB) and concatenated afterwards.

AABBA(II) is obtained with the ABBAavg and ABBA walks in the code. The model number must also be specified to select one of the five variants.
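The concatenation step for AABBA(I) can be sketched as follows. This is only an illustration: it assumes the AA, BBavg, and AB vectors of one molecule have already been computed in three separate runs, and the helper name `concatenate_walks` and the placeholder numbers are hypothetical.

```python
import numpy as np


def concatenate_walks(vectors_by_walk, order=("AA", "BBavg", "AB")):
    """Join the per-walk vectors of one molecule in a fixed order (the ⊕ above)."""
    missing = [walk for walk in order if walk not in vectors_by_walk]
    if missing:
        raise KeyError(f"missing walks: {missing}")
    return np.concatenate(
        [np.asarray(vectors_by_walk[walk], dtype=float) for walk in order]
    )


# Placeholder values standing in for the output of three separate runs.
vectors = {"AA": [1.0, 2.0], "BBavg": [3.0], "AB": [4.0, 5.0]}
aabba_one = concatenate_walks(vectors)
print(aabba_one.shape)  # (5,)
```

Fixing the order in one place keeps the columns aligned across molecules, which matters when the vectors are stacked into a feature matrix.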

  • model_number: models 1, 2, and 3 are used with periodic-table (PT) features; models 4 and 5 are used with NBO features. The attributes contain the following features:

| model_number (str) | Features |
| --- | --- |
| 1 | Zi, Zj, Ti, Tj, Xi, Xj, d, BO, I |
| 2 | Zi, Zj, Ti, Tj, Xi-Xj, d, BO, I |
| 3 | Zi, Zj, Ti, Tj, Xi-Xj, Si, Sj, BO, I |
| 4 | qNati, qNatj, VNati, VNatj, Nsi, Nsj, Npi, Npj, Ndi, Ndj, NLPi, NLPj, NLVi, NLVj, BD, BONat, NBN, BNs, BNp, BNd, NBN∗, BN∗s, BN∗p, BN∗d, I |
| 5 | qNati, qNatj, VNati, VNatj, NLPi, NLPj, LPEi, LPEj, LP∆Ei, LP∆Ej, NLVi, NLVj, LVEi, LVEj, LV∆Ei, LV∆Ej, BD, BONat, NBN, BNE, BN∆E, NBN∗, BN∗E, BN∗∆E, I |

(i and j denote the two nodes of each edge)

  • depth_max (int): maximum depth of the autocorrelation function.
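As an illustration of how the four parameters fit together, the sketch below builds a "PARAMS" dictionary and checks it against the constraints listed above. The key names are assumed to match the parameter names in this README, and `check_params` is a hypothetical helper that is not part of the scripts; check the first lines of ac_PT_multithread.py and ac_NBO_multithread.py for the actual layout.

```python
AC_OPERATORS = {"MA", "MD", "MS", "MR", "FA", "FD", "FS", "FR"}
MC_ONLY_WALKS = {"BBavg", "ABBAavg"}  # only valid with a metal-centered operator
F_ONLY_WALKS = {"ABBA"}               # only valid with a full operator
WALKS = {"AA", "BB", "AB"} | MC_ONLY_WALKS | F_ONLY_WALKS
MODEL_NUMBERS = {"1", "2", "3", "4", "5"}

# Assumed layout of the dictionary; the values are examples only.
PARAMS = {
    "ac_operator": "MA",
    "walk": "ABBAavg",
    "model_number": "1",
    "depth_max": 3,
}


def check_params(params):
    """Raise ValueError if the combination contradicts the tables above."""
    ac_operator = params["ac_operator"]
    walk = params["walk"]
    if ac_operator not in AC_OPERATORS:
        raise ValueError(f"unknown ac_operator: {ac_operator!r}")
    if walk not in WALKS:
        raise ValueError(f"unknown walk: {walk!r}")
    if walk in MC_ONLY_WALKS and ac_operator[0] != "M":
        raise ValueError(f"{walk} is only defined for metal-centered operators")
    if walk in F_ONLY_WALKS and ac_operator[0] != "F":
        raise ValueError(f"{walk} is only defined for full operators")
    if str(params["model_number"]) not in MODEL_NUMBERS:
        raise ValueError("model_number must be one of 1, 2, 3, 4, 5")
    depth_max = params["depth_max"]
    if not isinstance(depth_max, int) or depth_max < 0:
        raise ValueError("depth_max must be a non-negative integer")
    return True


check_params(PARAMS)
```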

The graphs (.gml files) are stored in the following directories:

  • 'PT_graphs' for generic property graphs.
  • 'uNatQ_graphs' for NBO property graphs.

If the location of these folders is different, adjust the path_to_gml variable in your script accordingly.
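A minimal sketch of reading the graphs from these folders is shown below. It is not the loader used by the scripts: the helper names are hypothetical, the mapping from model number to folder follows the PT/NBO split described above, and `label="id"` is chosen so that nodes are keyed by their GML id whether or not the files carry node labels.

```python
from pathlib import Path

import networkx as nx

GRAPH_DIRS = {"PT": "PT_graphs", "NBO": "uNatQ_graphs"}


def graph_dir_for_model(model_number):
    """Models 1-3 use PT graphs, models 4-5 use NBO graphs."""
    model = str(model_number)
    if model in {"1", "2", "3"}:
        return GRAPH_DIRS["PT"]
    if model in {"4", "5"}:
        return GRAPH_DIRS["NBO"]
    raise ValueError(f"unknown model_number: {model_number!r}")


def load_graphs(path_to_gml):
    """Read every .gml file in a folder into a {file stem: graph} dictionary."""
    graphs = {}
    for gml_file in sorted(Path(path_to_gml).glob("*.gml")):
        graphs[gml_file.stem] = nx.read_gml(str(gml_file), label="id")
    return graphs
```

A call such as `load_graphs(graph_dir_for_model("4"))` would then read the NBO graphs, provided the folder sits in the working directory.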

Resulting Vectors: The vectors generated from processing the graphs are saved in the 'vectors_ABBA' folder.

Code Configuration and Customization

This code is designed to work with graphs extracted from https://github.com/uiocompcat/HyDGL. If you have graphs with different properties or need to customize certain aspects, follow these steps:

  • Adjust Graph Properties:

File: graph_info.py. Define and adjust the properties and features of your graphs according to your requirements.

  • Customize Decimal Precision:

File: utilities.py. Modify how decimals are rounded to meet your specific needs.
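For orientation, a rounding helper could look like the sketch below. The function name and the default of four decimals are hypothetical and do not reflect what utilities.py actually does; the point is only that the precision is controlled in a single place.

```python
def round_vector(values, decimals=4):
    """Round every entry of a feature vector to a fixed number of decimals."""
    return [round(float(value), decimals) for value in values]


print(round_vector([1.23456, 2.0], decimals=3))  # [1.235, 2.0]
```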

Ensure that these customizations align with the rest of your code to maintain compatibility and accuracy.

For more information, please refer to the preprint: doi:10.26434/chemrxiv-2023-5wbkr

Feature_selection_and_ML

This folder contains the files needed to run the GBM and GP models.

Citation

Morán-González L, Betten JE, Kneiding H, Balcells D. AABBA: Atom–Atom Bond–Bond Bond–Atom Graph Kernel for Machine Learning on Molecules and Materials. ChemRxiv. 2023; doi:10.26434/chemrxiv-2023-5wbkr

Contact

[email protected]
