Publication | Open Access
Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models
Citations: 64
References: 31
Year: 2019
Artificial Intelligence, Engineering, Machine Learning, Machine Learning Tool, Machine Learning Models, Computational Chemistry, Chemistry, Quantum Computing, Data Science, Quantum Machine Learning, Quantum Chemistry Dataset, Quantum Science, Molecular Sciences, Graph Neural Network, Benchmark Datasets, Dataset Comprises, Knowledge Discovery, Computer Science, Quantum Chemistry, Molecular Property Prediction, Natural Sciences, Molecular Property, Alchemy Dataset, Quantum Benchmarking
The Alchemy dataset expands the volume and diversity of existing molecular datasets, with further details available on its contest website. The authors introduce Alchemy as a new dataset for developing machine learning models in chemistry and material science and launch a contest to engage researchers. Alchemy contains 12 quantum mechanical properties for 119,487 organic molecules with up to 14 heavy atoms, sampled from GDB MedChem; additional samples have been added since its initial release. Benchmarks of state-of-the-art graph neural networks on Alchemy demonstrate its usefulness for validating and developing chemistry models, and the authors provide the list of the 119,487 molecules used.
We introduce a new molecular dataset, named Alchemy, for developing machine learning models useful in chemistry and material science. As of June 20th, 2019, the dataset comprises 12 quantum mechanical properties of 119,487 organic molecules with up to 14 heavy atoms, sampled from the GDB MedChem database. The Alchemy dataset expands the volume and diversity of existing molecular datasets. Our extensive benchmarks of state-of-the-art graph neural network models on Alchemy demonstrate the usefulness of new data in validating and developing machine learning models for chemistry and material science. We further launch a contest to attract attention from researchers in the related fields. More details can be found on the contest website (https://alchemy.tencent.com). At the time of the benchmarking experiments, we had generated 119,487 molecules in our Alchemy dataset. More molecular samples have been generated since then. Hence, we provide a list of the molecules used in the reported benchmarks.