A collection of Jupyter notebooks used for the work described in DOI 10.1186/s13321-018-0325-4
This work is divided into 6 notebooks:
- 1_Data_collection: Extract the data from ChEMBL and identify active and inactive compounds
- 2-Descriptor_calculation: Describe compounds using RDKit descriptors
- 3-Modeling_file_creation: Split the data per target and prepare it for modelling
- 4-Job_submission: Example of job submission
- 5-Train_QSAR_and_CP: Train the QSAR and CP models
- utils: A few Python helper functions and classes
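
The first and third steps above (labelling compounds as active or inactive, then splitting the data per target) can be illustrated with a minimal sketch. This is not the code from the notebooks: the record layout, the identifiers, the function names and the activity threshold of 6.0 are all assumptions chosen only to show the idea.

```python
from collections import defaultdict

# Hypothetical records: (target_id, compound_id, activity_value).
# These identifiers and values are illustrative, not taken from the notebooks.
records = [
    ("TARGET_A", "CPD_1", 7.2),
    ("TARGET_A", "CPD_2", 4.9),
    ("TARGET_B", "CPD_1", 6.5),
    ("TARGET_B", "CPD_3", 5.1),
]

ACTIVITY_THRESHOLD = 6.0  # assumed cutoff, for illustration only


def label_activity(value, threshold=ACTIVITY_THRESHOLD):
    """Return 1 (active) if value >= threshold, else 0 (inactive)."""
    return 1 if value >= threshold else 0


def split_per_target(rows):
    """Group labelled compounds by target identifier."""
    per_target = defaultdict(list)
    for target_id, compound_id, value in rows:
        per_target[target_id].append((compound_id, label_activity(value)))
    return dict(per_target)


datasets = split_per_target(records)
for target_id, compounds in sorted(datasets.items()):
    print(target_id, compounds)
```

Each entry of `datasets` then holds one target's labelled compounds, which is roughly the shape a per-target modelling file would need before descriptors are attached.
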
Tested with Python 3.5/3.6
If you are only interested in the data, the output of notebook 3 is available here
If you are only interested in the models, the models are available to download