Skip to content

atomicarchitects/datasets

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Atomic Datasets

This repository contains the following 3D molecular datasets:

  • QM9
  • GEOM (Drugs)
  • tmQM

as well as the following toy datasets:

  • Platonic Solids
  • 3D Tetris

Installation

pip install git+https://github.com/atomicarchitects/datasets

Example

from atomic_datasets.datasets import QM9Dataset

dataset = QM9Dataset(
    root_dir="data/qm9",
    check_molecule_sanity=True,
    use_edm_splits=True,
    num_train_molecules=10,
    num_val_molecules=10,
    num_test_molecules=10,
)

for graph in dataset.all_structures():
    # graph is a jraph.GraphsTuple object

Citation

If you use this repository, please cite the original papers:

  • QM9:
@article{qm9,
	author = {Ramakrishnan, Raghunathan and Dral, Pavlo O. and Rupp, Matthias and von Lilienfeld, O. Anatole},
	journal = {Scientific Data},
	number = {1},
	pages = {140022},
	title = {Quantum chemistry structures and properties of 134 kilo molecules},
	volume = {1},
	year = {2014}
}
  • GEOM:
@article{geom,
	author = {Axelrod, Simon and G{\'o}mez-Bombarelli, Rafael},
	journal = {Scientific Data},
	number = {1},
	pages = {185},
	title = {GEOM, energy-annotated molecular conformations for property prediction and molecular generation},
	volume = {9},
	year = {2022}
}
  • tmQM:
@article{tmQM,
	author = {Balcells, David and Skjelstad, Bastian Bjerkem},
	journal = {Journal of Chemical Information and Modeling},
	month = {12},
	number = {12},
	pages = {6135--6146},
	title = {tmQM Dataset---Quantum Geometries and Properties of 86k Transition Metal Complexes},
	volume = {60},
	year = {2020}
}
  • 3D Tetris:
@phdthesis{
    author={Smidt, Tess E.},
    year={2018},
    title={Toward the Systematic Design of Complex Materials from Structural Motifs},
    journal={ProQuest Dissertations and Theses},
    pages={200},
    note={Copyright - Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works; Last updated - 2023-03-04},
    language={English},
    url={https://www.proquest.com/dissertations-theses/toward-systematic-design-complex-materials/docview/2137540057/se-2},
}

About

3D datasets useful to the Atomic Architects!

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages