Best of Atomistic Machine Learning ⚛️🧬💎

🏆  A ranked list of awesome atomistic machine learning (AML) projects. Updated regularly.

This curated list contains 510 awesome open-source projects with a total of 240K stars grouped into 23 categories. All projects are ranked by a project-quality score, which is calculated from metrics collected automatically from GitHub and various package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml.
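
As a rough illustration of the grouping and ranking described above, the following Python sketch sorts a handful of entries by their score within each category. The record layout (`name`, `category`, `score`) is a hypothetical structure chosen for this example and is not the format used by the list's tooling; the names and scores are copied from entries in this list. How the score itself is computed is not covered here.

```python
from collections import defaultdict

# Hypothetical in-memory records; names, categories and scores are taken
# from entries in this list, the dictionary layout is an assumption.
projects = [
    {"name": "FLARE", "category": "Active learning", "score": 18},
    {"name": "DP-GEN", "category": "Active learning", "score": 23},
    {"name": "Metatensor", "category": "Data Structures", "score": 22},
    {"name": "dpdata", "category": "Data Structures", "score": 28},
    {"name": "Bgolearn", "category": "Active learning", "score": 17},
]


def rank_by_category(records):
    """Group records by category and sort each group by descending score."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["category"]].append(record)
    return {
        category: sorted(group, key=lambda r: r["score"], reverse=True)
        for category, group in grouped.items()
    }


ranked = rank_by_category(projects)
for category, group in ranked.items():
    print(category, [f'{r["name"]} ({r["score"]})' for r in group])
```

Within each category the highest-scoring project comes first, which mirrors the order of the entries in the sections below.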

The current focus of this list is more on simulation data than on experimental data, and more on materials than on drug design. Nevertheless, contributions from other fields are warmly welcome!

How to cite. See the "Cite this repository" button in the right sidebar.

🧙‍♂️ Discover other best-of lists or create your own.

Explanation

  • 🥇🥈🥉  Combined project-quality score
  • ⭐️  Star count from GitHub
  • 🐣  New project (less than 6 months old)
  • 💤  Inactive project (6 months no activity)
  • 💀  Dead project (12 months no activity)
  • 📈📉  Project is trending up or down
  • ➕  Project was recently added
  • 👨‍💻  Contributors count from GitHub
  • 🔀  Fork count from GitHub
  • 📋  Issue count from GitHub
  • ⏱️  Last update timestamp on package manager
  • 📥  Download count from package manager
  • 📦  Number of dependent projects
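
The legend above can be turned into a small parser. The sketch below is a minimal Python example, not part of the list's tooling: it reads a metrics string such as the ones shown next to each project, expands abbreviated counts like `2.1K`, and applies the 6-month and 12-month thresholds from the legend. The field names (`stars`, `forks`, and so on), the function names, and the way months are counted are assumptions made for this illustration; the list does not state exactly how it measures inactivity.

```python
import re
from datetime import date

SEPARATOR = "\u00b7"  # the middle dot that separates metrics in an entry

# Legend symbols mapped to hypothetical field names (an assumption of this sketch).
SYMBOLS = {
    "\u2b50": "stars",                             # star
    "\U0001F468\u200D\U0001F4BB": "contributors",  # man technologist
    "\U0001F500": "forks",                         # shuffle arrows
    "\U0001F4CB": "issues",                        # clipboard
    "\U0001F4E5": "downloads",                     # inbox tray
    "\U0001F4E6": "dependents",                    # package
    "\u23f1": "last_update",                       # stopwatch
}


def parse_count(text):
    """Convert an abbreviated count such as '380' or '2.1K' to an integer."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KM]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    number, suffix = match.groups()
    return round(float(number) * {"": 1, "K": 1_000, "M": 1_000_000}[suffix])


def parse_metrics(text):
    """Parse a metrics string into a dict keyed by the field names above."""
    metrics = {}
    # Drop emoji variation selectors so both emoji spellings match the keys.
    for field in text.replace("\ufe0f", "").split(SEPARATOR):
        parts = field.split()
        if len(parts) < 2 or parts[0] not in SYMBOLS:
            continue  # skip fields this sketch does not understand
        name, value = SYMBOLS[parts[0]], parts[1]
        if name == "last_update":
            day, month, year = map(int, value.split("."))  # DD.MM.YYYY
            metrics[name] = date(year, month, day)
        else:
            metrics[name] = parse_count(value)
    return metrics


def activity_status(last_activity, today):
    """Apply the legend's thresholds: 6 months inactive, 12 months dead."""
    months = (today.year - last_activity.year) * 12 + (today.month - last_activity.month)
    if today.day < last_activity.day:
        months -= 1  # the last month has not fully elapsed yet
    if months >= 12:
        return "dead"
    if months >= 6:
        return "inactive"
    return "active"


metrics = parse_metrics("⭐ 380 · 🔀 180 · 📥 2.1K · ⏱️ 06.04.2026")
print(metrics)
# The reference date is an assumption; any "today" can be passed in.
print(activity_status(metrics["last_update"], today=date(2026, 4, 9)))
```

Passing `today` explicitly keeps the classification reproducible, so the same entry gives the same status regardless of when the code is run.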


Active learning

Projects that focus on enabling active learning and iterative learning schemes for atomistic ML.

DP-GEN (🥇23 · ⭐ 380) - The deep potential generator to generate a deep-learning based model of interatomic potential energy and force field. LGPL-3.0 ML-IAP MD workflows - [GitHub](https://github.com/deepmodeling/dpgen) (👨‍💻 73 · 🔀 180 · 📥 2.1K · 📦 8 · 📋 340 - 13% open · ⏱️ 06.04.2026):
git clone https://github.com/deepmodeling/dpgen
- [PyPi](https://pypi.org/project/dpgen) (📥 660 / month · 📦 2 · ⏱️ 07.08.2025):
pip install dpgen
- [Conda](https://anaconda.org/deepmodeling/dpgen) (📥 290 · ⏱️ 25.03.2025):
conda install -c deepmodeling dpgen
FLARE (🥈18 · ⭐ 350) - An open-source Python package for creating fast and accurate interatomic potentials. MIT C++ ML-IAP - [GitHub](https://github.com/mir-group/flare) (👨‍💻 44 · 🔀 78 · 📥 9 · 📦 12 · 📋 220 - 14% open · ⏱️ 30.01.2026):
git clone https://github.com/mir-group/flare
Bgolearn (🥈17 · ⭐ 130) - [arXiv:2601.06820] Official implementation of Bgolearn. MIT materials-discovery probabilistic - [GitHub](https://github.com/Bin-Cao/Bgolearn) (👨‍💻 4 · 🔀 18 · 📥 71 · ⏱️ 05.04.2026):
git clone https://github.com/Bin-Cao/Bgolearn
- [PyPi](https://pypi.org/project/Bgolearn) (📥 300 / month · ⏱️ 13.01.2026):
pip install Bgolearn
IPSuite (🥈15 · ⭐ 24) - A Python toolkit for FAIR development and deployment of machine-learned interatomic potentials. EPL-2.0 ML-IAP MD workflows HTC FAIR - [GitHub](https://github.com/zincware/IPSuite) (👨‍💻 9 · 🔀 13 · 📦 14 · 📋 180 - 49% open · ⏱️ 13.03.2026):
git clone https://github.com/zincware/IPSuite
- [PyPi](https://pypi.org/project/ipsuite) (📥 140 / month · 📦 5 · ⏱️ 20.11.2025):
pip install ipsuite
DP-GEN2 (🥉14 · ⭐ 40) - 2nd generation of the Deep Potential GENerator. LGPL-3.0 ML-IAP MD workflows - [GitHub](https://github.com/deepmodeling/dpgen2) (👨‍💻 19 · 🔀 36 · 📦 6 · 📋 46 - 39% open · ⏱️ 04.04.2026):
git clone https://github.com/deepmodeling/dpgen2
Show 4 hidden projects...
- flare++ (🥉12 · ⭐ 38 · 💀) - A many-body extension of the FLARE code. MIT C++ ML-IAP
- Finetuna (🥉10 · ⭐ 67 · 💀) - Active Learning for Machine Learning Potentials. MIT
- ACEHAL (🥉5 · ⭐ 15 · 💀) - Hyperactive Learning (HAL) Python interface for building Atomic Cluster Expansion potentials. Unlicensed Julia
- ALEBREW (🥉4 · ⭐ 21 · 💀) - Official repository for the paper Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic.. Custom ML-IAP MD


Community resources

Projects that collect atomistic ML resources or foster communication within the community.

🔗 ACE / GRACE support - Support forum for the Atomic Cluster Expansion (ACE) and extensions.

🔗 AI for Science Map - Interactive mindmap of the AI4Science research field, including atomistic machine learning, including papers,..

🔗 ASE ecosystem - This is a list of software packages related to ASE or using ASE. md, ml-iap

🔗 Atomic Cluster Expansion - Atomic Cluster Expansion (ACE) community homepage.

🔗 CrystaLLM - Generate a crystal structure from a composition. language-models generative pretrained transformer

🔗 GAP-ML.org community homepage ML-IAP

🔗 matsci.org - A community forum for the discussion of anything materials science, with a focus on computational materials science..

🔗 Matter Modeling Stack Exchange - Machine Learning - Forum StackExchange, site Matter Modeling, ML-tagged questions.

Best-of Machine Learning with Python (🥇21 · ⭐ 23K) - A ranked list of awesome machine learning Python libraries. Updated weekly. CC-BY-4.0 general-ml Python - [GitHub](https://github.com/lukasmasuch/best-of-ml-python) (👨‍💻 57 · 🔀 3K · 📋 63 - 46% open · ⏱️ 22.03.2026):
git clone https://github.com/lukasmasuch/best-of-ml-python
MatBench Discovery (🥇21 · ⭐ 220) - An evaluation framework for machine learning models simulating high-throughput materials discovery. MIT datasets benchmarking model-repository - [GitHub](https://github.com/janosh/matbench-discovery) (👨‍💻 29 · 🔀 54 · 📦 7 · 📋 72 - 4% open · ⏱️ 08.04.2026):
git clone https://github.com/janosh/matbench-discovery
- [PyPi](https://pypi.org/project/matbench-discovery) (📥 4.1K / month · 📦 2 · ⏱️ 11.09.2024):
pip install matbench-discovery
OpenML (🥇19 · ⭐ 730) - Open Machine Learning. BSD-3 datasets - [GitHub](https://github.com/openml/OpenML) (👨‍💻 35 · 🔀 120 · 📋 960 - 40% open · ⏱️ 23.01.2026):
git clone https://github.com/openml/OpenML
Garden (🥈18 · ⭐ 39) - FAIR AI/ML Model Publishing Framework. MIT model-repository - [GitHub](https://github.com/Garden-AI/garden) (👨‍💻 14 · 🔀 4 · 📦 6 · 📋 380 - 3% open · ⏱️ 18.03.2026):
git clone https://github.com/Garden-AI/garden
- [PyPi](https://pypi.org/project/garden-ai) (📥 670 / month · ⏱️ 18.03.2026):
pip install garden-ai
Graph-based Deep Learning Literature (🥈17 · ⭐ 5K) - links to conference publications in graph-based deep learning. MIT general-ml rep-learn - [GitHub](https://github.com/naganandy/graph-based-deep-learning-literature) (👨‍💻 12 · 🔀 770 · ⏱️ 07.02.2026):
git clone https://github.com/naganandy/graph-based-deep-learning-literature
AI for Science Resources (🥈14 · ⭐ 750) - List of resources for AI4Science research, including learning resources. GPL-3.0 license - [GitHub](https://github.com/divelab/AIRS) (👨‍💻 36 · 🔀 89 · 📋 32 - 18% open · ⏱️ 30.03.2026):
git clone https://github.com/divelab/AIRS
GT4SD - Generative Toolkit for Scientific Discovery (🥈14 · ⭐ 370 · 💤) - Gradio apps of generative models in GT4SD. MIT generative pretrained drug-discovery model-repository - [GitHub](https://github.com/GT4SD/gt4sd-core) (👨‍💻 20 · 🔀 79 · 📋 120 - 11% open · ⏱️ 18.09.2025):
git clone https://github.com/GT4SD/gt4sd-core
Awesome Materials Informatics (🥈12 · ⭐ 500 · 💤) - Curated list of known efforts in materials informatics, i.e. in modern materials science. Custom - [GitHub](https://github.com/tilde-lab/awesome-materials-informatics) (👨‍💻 21 · 🔀 100 · ⏱️ 19.06.2025):
git clone https://github.com/tilde-lab/awesome-materials-informatics
Neural-Network-Models-for-Chemistry (🥈12 · ⭐ 190) - A collection of Neural Network Models for chemistry. MIT rep-learn - [GitHub](https://github.com/Eipgen/Neural-Network-Models-for-Chemistry) (👨‍💻 3 · 🔀 24 · 📋 2 - 50% open · ⏱️ 09.04.2026):
git clone https://github.com/Eipgen/Neural-Network-Models-for-Chemistry
Awesome-Scientific-Language-Models (🥈10 · ⭐ 650 · 💤) - A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP24). MIT language-models general-ml pretrained multimodal - [GitHub](https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models) (👨‍💻 9 · 🔀 37 · ⏱️ 21.06.2025):
git clone https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models
Awesome Materials & Chemistry Datasets (🥈10 · ⭐ 300) - A curated list of the most useful datasets in materials science and chemistry for training machine learning and AI.. MIT datasets experimental-data literature-data proprietary - [GitHub](https://github.com/blaiszik/awesome-matchem-datasets) (👨‍💻 9 · 🔀 34 · 📋 15 - 26% open · ⏱️ 22.03.2026):
git clone https://github.com/blaiszik/awesome-matchem-datasets
optimade.science (🥉9 · ⭐ 10) - A sky-scanner Optimade browser-only GUI. MIT datasets - [GitHub](https://github.com/tilde-lab/optimade.science) (👨‍💻 8 · 🔀 4 · 📋 26 - 26% open · ⏱️ 04.11.2025):
git clone https://github.com/tilde-lab/optimade.science
Awesome Neural Geometry (🥉8 · ⭐ 1.1K) - A curated collection of resources and research related to the geometry of representations in the brain, deep networks,.. Unlicensed educational rep-learn - [GitHub](https://github.com/neurreps/awesome-neural-geometry) (👨‍💻 16 · 🔀 70 · ⏱️ 24.02.2026):
git clone https://github.com/neurreps/awesome-neural-geometry
The Collection of Database and Dataset Resources in Materials Science (🥉8 · ⭐ 420) - A list of databases, datasets and books/handbooks where you can find materials properties for machine learning.. Unlicensed datasets - [GitHub](https://github.com/sedaoturak/data-resources-for-materials-science) (👨‍💻 2 · 🔀 59 · ⏱️ 06.03.2026):
git clone https://github.com/sedaoturak/data-resources-for-materials-science
AI for Science paper collection (🥉8 · ⭐ 170 · 💤) - List the AI for Science papers accepted by top conferences. Apache-2 - [GitHub](https://github.com/AI4QC/AI_for_Science_paper_collection) (👨‍💻 7 · 🔀 17 · ⏱️ 24.09.2025):
git clone https://github.com/AI4QC/AI_for_Science_paper_collection
Awesome Neural SBI (🥉8 · ⭐ 150) - Community-sourced list of papers and resources on neural simulation-based inference. MIT active-learning - [GitHub](https://github.com/smsharma/awesome-neural-sbi) (👨‍💻 7 · 🔀 8 · 📋 2 - 50% open · ⏱️ 28.01.2026):
git clone https://github.com/smsharma/awesome-neural-sbi
Charting ML Publications in Science (🥉8 · ⭐ 47) - Literature analysis of ML applications in materials science, chemistry, physics. MIT literature-data general-ml - [GitHub](https://github.com/blaiszik/ml_publication_charts) (👨‍💻 2 · ⏱️ 10.03.2026):
git clone https://github.com/blaiszik/ml_publication_charts
DeepModeling Projects (🥉8 · ⭐ 8) - DeepModeling projects. CC-BY-4.0 - [GitHub](https://github.com/deepmodeling/deepmodeling-projects) (👨‍💻 4 · 🔀 2 · ⏱️ 02.04.2026):
git clone https://github.com/deepmodeling/deepmodeling-projects
Awesome-Crystal-GNNs (🥉7 · ⭐ 120) - This repository contains a collection of resources and papers on GNN Models on Crystal Solid State Materials. MIT - [GitHub](https://github.com/kdmsit/Awesome-Crystal-GNNs) (👨‍💻 2 · 🔀 13 · ⏱️ 09.03.2026):
git clone https://github.com/kdmsit/Awesome-Crystal-GNNs
Show 11 hidden projects...
- MatBench (🥈18 · ⭐ 190 · 💀) - Matbench: Benchmarks for materials science property prediction. MIT datasets benchmarking model-repository
- GNoME Explorer (🥈12 · ⭐ 1.2K · 💀) - Graph Networks for Materials Exploration Database. Apache-2 datasets materials-discovery
- MoLFormers UI (🥉9 · ⭐ 390 · 💀) - A family of foundation models trained on chemicals. Apache-2 transformer language-models pretrained drug-discovery
- Awesome-Graph-Generation (🥉8 · ⭐ 360 · 💀) - A curated list of up-to-date graph generation papers and resources. Unlicensed rep-learn
- A Highly Opinionated List of Open-Source Materials Informatics Resources (🥉8 · ⭐ 150 · 💀) - A Highly Opinionated List of Open Source Materials Informatics Resources. MIT
- MADICES Awesome Interoperability (🥉6 · ⭐ 1) - Linked data interoperability resources of the Machine-actionable data interoperability for the chemical sciences.. MIT datasets
- LAM Crystal Philately competition 2024 (🥉5 · ⭐ 22 · 💀) - OpenLAM Challenge crystal structure prediction https://arxiv.org/abs/2501.16358. LGPL-2.1 single-paper datasets structure-prediction materials-discovery ML-IAP UIP
- Geometric-GNNs (🥉4 · ⭐ 120 · 💀) - List of Geometric GNNs for 3D atomic systems. Unlicensed datasets educational rep-learn
- Does this material exist? (🥉3 · ⭐ 18 · 💀) - Vote on whether you think predicted crystal structures could be synthesised. MIT for-fun materials-discovery
- GitHub topic materials-informatics (🥉1) - GitHub topic materials-informatics. Unlicensed
- MateriApps (🥉1) - A Portal Site of Materials Science Simulation. Unlicensed


Datasets

Datasets, databases and trained models for atomistic ML.

🔗 Alexandria Materials Database - A database of millions of theoretical crystal structures (3D, 2D and 1D) discovered by machine learning accelerated..

🔗 Catalysis Hub - A web-platform for sharing data and software for computational catalysis research!

🔗 Citrination Datasets - AI-Powered Materials Data Platform. Open Citrination has been decommissioned.

🔗 crystals.ai - Curated datasets for reproducible AI in materials science.

🔗 DeepChem Models - DeepChem models on HuggingFace. model-repository pretrained language-models

🔗 Graphs of Materials Project 20190401 - The dataset used to train the MEGNet interatomic potential. ML-IAP

🔗 HME21 Dataset - High-temperature multi-element 2021 dataset for the PreFerred Potential (PFP).. UIP

🔗 JARVIS-Leaderboard (⭐ 73 · 💤) - A large scale benchmark of materials design methods: https://www.nature.com/articles/s41524-024-01259-w. model-repository benchmarking community-resource educational

🔗 Materials Project - Charge Densities - Materials Project has started offering charge density information available for download via their public API.

🔗 Materials Project Trajectory (MPtrj) Dataset - The dataset used to train the CHGNet universal potential. UIP

🔗 matterverse.ai - Database of yet-to-be-synthesized materials predicted using state-of-the-art machine learning algorithms.

🔗 MPF.2021.2.8 - The dataset used to train the M3GNet universal potential. UIP

🔗 NRELMatDB - Computational materials database with the specific focus on materials for renewable energy applications including, but..

🔗 QM9 Charge Densities and Energies - QM9 molecules calculated with VASP using Atomic Simulation Environment. ML-DFT

🔗 QM40 Dataset - A More Realistic QM Dataset for Machine Learning in Molecular Science https://doi.org/10.1038/s41597-024-04206-y. drug-discovery

🔗 QMugs dataset - Quantum Mechanical Properties of Drug-like Molecules https://doi.org/10.1038/s41597-022-01390-7. drug-discovery

🔗 Quantum-Machine.org Datasets - Collection of datasets, including QM7, QM9, etc. MD, DFT. Small organic molecules, mostly.

🔗 sGDML Datasets - MD17, MD22, DFT datasets.

🔗 MoleculeNet - A Benchmark for Molecular Machine Learning. benchmarking

🔗 ZINC15 - A free database of commercially-available compounds for virtual screening. ZINC contains over 230 million purchasable.. graph biomolecules

🔗 ZINC20 - A free database of commercially-available compounds for virtual screening. ZINC contains over 230 million purchasable.. graph biomolecules

FAIR Chemistry datasets (🥇32 · ⭐ 2K · 📈) - Datasets OC20, OC22, etc. Formerly known as Open Catalyst Project. MIT catalysis - [GitHub](https://github.com/facebookresearch/fairchem) (👨‍💻 69 · 🔀 450 · 📋 570 - 1% open · ⏱️ 09.04.2026):
git clone https://github.com/facebookresearch/fairchem
- [PyPi](https://pypi.org/project/fairchem-core) (📥 130K / month · 📦 44 · ⏱️ 26.03.2026):
pip install fairchem-core
Meta Open Materials 2024 (OMat24) Dataset (🥇31 · ⭐ 2K · 📈) - Contains over 100 million Density Functional Theory calculations focused on structural and compositional diversity. CC-BY-4.0 - [GitHub](https://github.com/facebookresearch/fairchem) (👨‍💻 69 · 🔀 450 · 📋 570 - 1% open · ⏱️ 09.04.2026):
git clone https://github.com/facebookresearch/fairchem
- [PyPi](https://pypi.org/project/fairchem-core) (📥 130K / month · 📦 44 · ⏱️ 26.03.2026):
pip install fairchem-core
MPContribs (🥇25 · ⭐ 39) - Platform for materials scientists to contribute and disseminate their materials data through Materials Project. MIT - [GitHub](https://github.com/materialsproject/MPContribs) (👨‍💻 29 · 🔀 27 · 📦 58 · 📋 120 - 30% open · ⏱️ 26.02.2026):
git clone https://github.com/materialsproject/MPContribs
- [PyPi](https://pypi.org/project/mpcontribs-client) (📥 7.3K / month · 📦 7 · ⏱️ 09.02.2026):
pip install mpcontribs-client
OPTIMADE Python tools (🥇23 · ⭐ 89) - Tools for implementing and consuming OPTIMADE APIs in Python. MIT - [GitHub](https://github.com/Materials-Consortia/optimade-python-tools) (👨‍💻 34 · 🔀 50 · 📋 500 - 22% open · ⏱️ 02.03.2026):
git clone https://github.com/Materials-Consortia/optimade-python-tools
- [PyPi](https://pypi.org/project/optimade) (📥 24K / month · 📦 4 · ⏱️ 13.02.2026):
pip install optimade
- [Conda](https://anaconda.org/conda-forge/optimade) (📥 170K · ⏱️ 13.02.2026):
conda install -c conda-forge optimade
load-atoms (🥈19 · ⭐ 49) - download and manipulate atomistic datasets. MIT data-structures - [GitHub](https://github.com/jla-gardner/load-atoms) (👨‍💻 5 · 🔀 5 · 📦 8 · 📋 35 - 14% open · ⏱️ 25.11.2025):
git clone https://github.com/jla-gardner/load-atoms
- [PyPi](https://pypi.org/project/load-atoms) (📥 65K / month · 📦 3 · ⏱️ 25.11.2025):
pip install load-atoms
Open Databases Integration for Materials Design (OPTIMADE) (🥈17 · ⭐ 100) - Specification of a common REST API for access to materials databases. CC-BY-4.0 - [GitHub](https://github.com/Materials-Consortia/OPTIMADE) (👨‍💻 24 · 🔀 37 · 📋 260 - 30% open · ⏱️ 18.12.2025):
git clone https://github.com/Materials-Consortia/OPTIMADE
QH9 (🥈14 · ⭐ 750) - A Quantum Hamiltonian Prediction Benchmark. CC-BY-NC-SA-4.0 ML-DFT - [GitHub](https://github.com/divelab/AIRS) (👨‍💻 36 · 🔀 89 · 📋 32 - 18% open · ⏱️ 30.03.2026):
git clone https://github.com/divelab/AIRS
OpenQDC (🥈14 · ⭐ 60 · 💤) - Repository of Quantum Datasets Publicly Available. CC-BY-4.0 - [GitHub](https://github.com/valence-labs/OpenQDC) (👨‍💻 10 · 🔀 6 · 📦 4 · 📋 50 - 18% open · ⏱️ 19.06.2025):
git clone https://github.com/valence-labs/openQDC
- [PyPi](https://pypi.org/project/openqdc) (📥 140 / month · ⏱️ 09.08.2024):
pip install openqdc
- [Conda](https://anaconda.org/conda-forge/openqdc) (📥 2.2K · ⏱️ 22.04.2025):
conda install -c conda-forge openqdc
OpenKIM (🥈13 · ⭐ 37) - The Open Knowledgebase of Interatomic Models (OpenKIM) aims to be an online resource for standardized testing, long-.. LGPL-2.1 model-repository knowledge-base pretrained - [GitHub](https://github.com/openkim/kim-api) (👨‍💻 27 · 🔀 18 · 📋 37 - 40% open · ⏱️ 06.03.2026):
git clone https://github.com/openkim/kim-api
nablaDFT (🥈12 · ⭐ 230) - nablaDFT: Large-Scale Conformational Energy and Hamiltonian Prediction benchmark and dataset. MIT ML-DFT ML-WFT drug-discovery ML-IAP benchmarking - [GitHub](https://github.com/AIRI-Institute/nablaDFT) (👨‍💻 9 · 🔀 25 · 📋 28 - 10% open · ⏱️ 31.12.2025):
git clone https://github.com/AIRI-Institute/nablaDFT
MatPES (🥈12 · ⭐ 53) - A foundational potential energy dataset for materials. BSD-3 UIP ML-IAP - [GitHub](https://github.com/materialyzeai/matpes) (👨‍💻 3 · 🔀 5 · 📋 9 - 22% open · ⏱️ 02.03.2026):
git clone https://github.com/materialyzeai/matpes
- [PyPi](https://pypi.org/project/matpes) (📥 190 / month · ⏱️ 10.03.2025):
pip install matpes
SPICE (🥈11 · ⭐ 190) - A collection of QM data for training potential functions. MIT ML-IAP MD - [GitHub](https://github.com/openmm/spice-dataset) (👨‍💻 1 · 🔀 11 · 📥 340 · 📋 76 - 27% open · ⏱️ 25.02.2026):
git clone https://github.com/openmm/spice-dataset
MPDS API (🥈11 · ⭐ 26) - Tutorials, notebooks, issue tracker, and website on the MPDS API: the data retrieval interface for the Materials.. CC-BY-4.0 phase-transition - [GitHub](https://github.com/mpds-io/mpds-api) (👨‍💻 5 · 🔀 5 · 📋 36 - 22% open · ⏱️ 24.01.2026):
git clone https://github.com/mpds-io/mpds-api
- [PyPi](https://pypi.org/project/mpds_client) (📥 360 / month · ⏱️ 14.09.2020):
pip install mpds_client
OBELiX (🥉9 · ⭐ 53) - A Curated Dataset of Crystal Structures and Experimentally Measured Ionic Conductivities for Lithium Solid-State.. CC-BY-4.0 experimental-data transport-phenomena - [GitHub](https://github.com/NRC-Mila/OBELiX) (👨‍💻 6 · 🔀 9 · 📋 2 - 50% open · ⏱️ 27.11.2025):
git clone https://github.com/NRC-Mila/OBELiX
- [PyPi](https://pypi.org/project/obelix-data) (📥 39 / month · ⏱️ 16.05.2025):
pip install obelix-data
AIS Square (🥉9 · ⭐ 15) - A collaborative and open-source platform for sharing AI for Science datasets, models, and workflows. Home of the.. LGPL-3.0 community-resource model-repository - [GitHub](https://github.com/deepmodeling/AIS-Square) (👨‍💻 8 · 🔀 8 · 📋 6 - 83% open · ⏱️ 03.04.2026):
git clone https://github.com/deepmodeling/AIS-Square
polyVERSE (🥉7 · ⭐ 32) - polyVERSE is a comprehensive repository of informatics-ready datasets curated by the Ramprasad Group. Custom soft-matter - [GitHub](https://github.com/Ramprasad-Group/polyVERSE) (👨‍💻 9 · 🔀 6 · ⏱️ 21.01.2026):
git clone https://github.com/Ramprasad-Group/polyVERSE
Visual Graph Datasets (🥉6 · ⭐ 5) - Datasets for the training of graph neural networks (GNNs) and subsequent visualization of attributional explanations.. MIT XAI rep-learn - [GitHub](https://github.com/aimat-lab/visual_graph_datasets) (👨‍💻 2 · 🔀 3 · ⏱️ 24.03.2026):
git clone https://github.com/aimat-lab/visual_graph_datasets
The Perovskite Database Project (🥉5 · ⭐ 70) - Perovskite Database Project aims at making all perovskite device data, both past and future, available in a form.. Unlicensed community-resource - [GitHub](https://github.com/Jesperkemist/perovskitedatabase) (👨‍💻 2 · 🔀 26 · ⏱️ 20.03.2026):
git clone https://github.com/Jesperkemist/perovskitedatabase
Show 16 hidden projects...
- ATOM3D (🥈19 · ⭐ 320 · 💀) - ATOM3D: tasks on molecules in three dimensions. MIT biomolecules benchmarking
- Materials Data Facility (MDF) (🥉10 · ⭐ 10 · 💀) - A simple way to publish, discover, and access materials datasets. Publication of very large datasets supported (e.g.,.. Apache-2
- MoleculeNet Leaderboard (🥉9 · ⭐ 100 · 💀) - MIT benchmarking
- 2DMD dataset (🥉9 · ⭐ 8 · 💀) - Code for Kazeev, N., Al-Maeeni, A.R., Romanov, I. et al. Sparse representation for machine learning the properties of.. Apache-2 material-defect
- ANI-1 Dataset (🥉8 · ⭐ 100 · 💀) - A data set of 20 million calculated off-equilibrium conformations for organic molecules. MIT
- GEOM (🥉7 · ⭐ 240 · 💀) - GEOM: Energy-annotated molecular conformations. Unlicensed drug-discovery
- ANI-1x Datasets (🥉6 · ⭐ 67 · 💀) - The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for organic molecules. MIT
- COMP6 Benchmark dataset (🥉6 · ⭐ 40 · 💀) - COMP6 Benchmark dataset for ML potentials. MIT
- SciGlass (🥉6 · ⭐ 15 · 💀) - The database contains a vast set of data on the properties of glass materials. MIT
- GDB-9-Ex9 and ORNL_AISD-Ex (🥉5 · ⭐ 10 · 💀) - Distributed computing workflow for generation and analysis of large scale molecular datasets obtained running multi-.. Unlicensed
- OPTIMADE providers dashboard (🥉5 · ⭐ 2 · 💤) - A dashboard of known providers. Unlicensed
- 3DSC Database (🥉4 · ⭐ 26 · 💀) - Repo for the paper publishing the superconductor database with 3D crystal structures. Custom superconductors materials-discovery
- paper-data-redundancy (🥉4 · ⭐ 11 · 💀) - Repo for the paper Exploiting redundancy in large materials datasets for efficient machine learning with less data. BSD-3 small-data single-paper
- linear-regression-benchmarks (🥉4 · ⭐ 1 · 💀) - Data sets used for linear regression benchmarks. MIT benchmarking single-paper
- nep-data (🥉3 · ⭐ 21 · 💀) - Data related to the NEP machine-learned potential of GPUMD. Unlicensed ML-IAP MD transport-phenomena
- tmQM_wB97MV Dataset (🥉1 · ⭐ 9 · 💀) - Code for Applying Large Graph Neural Networks to Predict Transition Metal Complex Energies Using the tmQM_wB97MV.. Unlicensed catalysis rep-learn


Data Structures

Projects that focus on providing data structures used in atomistic machine learning.

dpdata (🥇28 · ⭐ 240) - A Python package for manipulating atomistic data of software in computational science. LGPL-3.0 - [GitHub](https://github.com/deepmodeling/dpdata) (👨‍💻 67 · 🔀 160 · 📦 160 · 📋 160 - 31% open · ⏱️ 06.04.2026):
git clone https://github.com/deepmodeling/dpdata
- [PyPi](https://pypi.org/project/dpdata) (📥 72K / month · 📦 44 · ⏱️ 28.02.2026):
pip install dpdata
- [Conda](https://anaconda.org/deepmodeling/dpdata) (📥 420 · ⏱️ 25.03.2025):
conda install -c deepmodeling dpdata
Metatensor (🥈22 · ⭐ 98) - Self-describing sparse tensor data format for atomistic machine learning and beyond. BSD-3 ML-IAP MD Rust C-lang C++ Python - [GitHub](https://github.com/metatensor/metatensor) (👨‍💻 36 · 🔀 25 · 📥 54K · 📦 14 · 📋 270 - 27% open · ⏱️ 01.04.2026):
git clone https://github.com/metatensor/metatensor
- [PyPi](https://pypi.org/project/metatensor) (📥 1.8K / month · ⏱️ 26.01.2024):
pip install metatensor
mp-pyrho (🥉18 · ⭐ 42) - Tools for re-gridding volumetric quantum chemistry data for machine-learning purposes. Custom ML-DFT - [GitHub](https://github.com/materialsproject/pyrho) (👨‍💻 10 · 🔀 10 · 📦 35 · 📋 10 - 70% open · ⏱️ 13.10.2025):
git clone https://github.com/materialsproject/pyrho
- [PyPi](https://pypi.org/project/mp-pyrho) (📥 130K / month · 📦 5 · ⏱️ 13.10.2025):
pip install mp-pyrho
dlpack (🥉16 · ⭐ 1.2K) - common in-memory tensor structure. Apache-2 C++ - [GitHub](https://github.com/dmlc/dlpack) (👨‍💻 33 · 🔀 160 · 📋 85 - 28% open · ⏱️ 24.01.2026):
git clone https://github.com/dmlc/dlpack


Density functional theory (ML-DFT)

Projects and models that focus on quantities of DFT, such as density functional approximations (ML-DFA), the charge density, density of states, the Hamiltonian, etc.

🔗 IKS-PIML - Code and generated data for the paper Inverting the Kohn-Sham equations with physics-informed machine learning.. neural-operator pinn datasets single-paper

🔗 M-OFDFT - Overcoming the Barrier of Orbital-Free Density Functional Theory in Molecular Systems Using Deep Learning.. transformer single-paper

JAX-DFT (🥇26 · ⭐ 38K) - This library provides basic building blocks that can construct DFT calculations as a differentiable program. Apache-2 - [GitHub](https://github.com/google-research/google-research) (👨‍💻 860 · 🔀 8.3K · 📋 2.1K - 83% open · ⏱️ 09.04.2026):
git clone https://github.com/google-research/google-research
MALA (🥇16 · ⭐ 98 · 💤) - Materials Learning Algorithms. A framework for machine learning materials properties from first-principles data. BSD-3 - [GitHub](https://github.com/mala-project/mala) (👨‍💻 47 · 🔀 27 · 📦 2 · 📋 310 - 9% open · ⏱️ 16.09.2025):
git clone https://github.com/mala-project/mala
QHNet (🥇14 · ⭐ 750) - Artificial Intelligence Research for Science (AIRS). GPL-3.0 rep-learn - [GitHub](https://github.com/divelab/AIRS) (👨‍💻 36 · 🔀 89 · 📋 32 - 18% open · ⏱️ 30.03.2026):
git clone https://github.com/divelab/AIRS
SALTED (🥇14 · ⭐ 42) - Symmetry-Adapted Learning of Three-dimensional Electron Densities (and their electrostatic response). GPL-3.0 - [GitHub](https://github.com/andreagrisafi/SALTED) (👨‍💻 24 · 🔀 6 · 📋 11 - 27% open · ⏱️ 26.03.2026):
git clone https://github.com/andreagrisafi/SALTED
HamGNN (🥈13 · ⭐ 190) - An E(3) equivariant Graph Neural Network for predicting electronic Hamiltonian matrix. GPL-3.0 rep-learn magnetism C-lang - [GitHub](https://github.com/QuantumLab-ZY/HamGNN) (👨‍💻 8 · 🔀 44 · 📋 79 - 83% open · ⏱️ 09.04.2026):
git clone https://github.com/QuantumLab-ZY/HamGNN
Q-stack (🥈13 · ⭐ 19) - Stack of codes for dedicated pre- and post-processing tasks for Quantum Machine Learning (QML). MIT excited-states general-tool - [GitHub](https://github.com/lcmd-epfl/Q-stack) (👨‍💻 8 · 🔀 7 · 📋 55 - 12% open · ⏱️ 06.03.2026):
git clone https://github.com/lcmd-epfl/Q-stack
DeePKS-kit (🥈9 · ⭐ 120 · 💤) - a package for developing machine learning-based chemically accurate energy and density functional models. LGPL-3.0 ml-functional - [GitHub](https://github.com/deepmodeling/deepks-kit) (👨‍💻 7 · 🔀 39 · 📋 32 - 46% open · ⏱️ 28.04.2025):
git clone https://github.com/deepmodeling/deepks-kit
CiderPress (🥈9 · ⭐ 18 · 💤) - A high-performance software package for training and evaluating machine-learned XC functionals using the CIDER.. GPL-3.0 ml-functional C-lang - [GitHub](https://github.com/mir-group/CiderPress) (👨‍💻 2 · 🔀 3 · ⏱️ 09.04.2025):
git clone https://github.com/mir-group/CiderPress
- [PyPi](https://pypi.org/project/ciderpress) (📥 38 / month · ⏱️ 13.03.2025):
pip install ciderpress
ACEhamiltonians (🥈9 · ⭐ 17 · 💤) - Provides tools for constructing, fitting, and predicting self-consistent Hamiltonian and overlap matrices in solid-.. MIT Julia - [GitHub](https://github.com/ACEsuit/ACEhamiltonians.jl) (👨‍💻 5 · 🔀 7 · 📋 5 - 40% open · ⏱️ 17.09.2025):
git clone https://github.com/ACEsuit/ACEhamiltonians.jl
dftio (🥈8 · ⭐ 15) - dftio helps machine learning communities transcribe DFT output into a format that is easy to read or used by.. LGPL-3.0 data-structures workflows - [GitHub](https://github.com/deepmodeling/dftio) (👨‍💻 5 · 🔀 11 · 📋 8 - 50% open · ⏱️ 18.12.2025):
git clone https://github.com/deepmodeling/dftio
DeepH-E3 (🥉6 · ⭐ 110) - General framework for E(3)-equivariant neural network representation of density functional theory Hamiltonian. MIT magnetism - [GitHub](https://github.com/Xiaoxun-Gong/DeepH-E3) (👨‍💻 2 · 🔀 28 · 📋 40 - 67% open · ⏱️ 27.01.2026):
git clone https://github.com/Xiaoxun-Gong/DeepH-E3
Show 26 hidden projects...
- DM21 (🥇20 · ⭐ 15K · 💀) - This package provides a PySCF interface to the DM21 (DeepMind 21) family of exchange-correlation functionals described.. Apache-2
- DeepH-pack (🥈12 · ⭐ 320 · 💀) - Deep neural networks for density functional theory Hamiltonian. LGPL-3.0 Julia
- Grad DFT (🥈10 · ⭐ 110 · 💀) - GradDFT is a JAX-based library enabling the differentiable design and experimentation of exchange-correlation.. Apache-2
- NeuralXC (🥈10 · ⭐ 36 · 💀) - Implementation of a machine learned density functional. BSD-3
- PROPhet (🥈9 · ⭐ 66 · 💀) - PROPhet is a code to integrate machine learning techniques with first-principles quantum chemistry approaches. GPL-3.0 ML-IAP MD single-paper C++
- Libnxc (🥈8 · ⭐ 21 · 💀) - A library for using machine-learned exchange-correlation functionals for density-functional theory. MPL-2.0 C++ Fortran
- ChargE3Net (🥉7 · ⭐ 71 · 💀) - [npj Comp. Mat.] Higher-order equivariant neural networks for charge density prediction in materials. MIT rep-learn
- Mat2Spec (🥉7 · ⭐ 30 · 💀) - Density of States Prediction for Materials Discovery via Contrastive Learning from Probabilistic Embeddings. MIT spectroscopy
- DeepDFT (🥉6 · ⭐ 89 · 💀) - Official implementation of DeepDFT model. MIT
- scdp (scalable charge density prediction) (🥉6 · ⭐ 40 · 💀) - [NeurIPS 2024] source code for A Recipe for Charge Density Prediction. MIT rep-learn single-paper
- charge-density-models (🥉6 · ⭐ 15 · 💀) - Tools to build charge density models using [fairchem](https://github.com/FAIR-Chem/fairchem). MIT rep-learn
- KSR-DFT (🥉6 · ⭐ 4 · 💀) - Kohn-Sham regularizer for machine-learned DFT functionals. Apache-2
- xDeepH (🥉5 · ⭐ 40 · 💀) - Extended DeepH (xDeepH) method for magnetic materials. LGPL-3.0 magnetism Julia
- InfGCN for Electron Density Estimation (🥉5 · ⭐ 16 · 💀) - Official implementation of the NeurIPS 23 spotlight paper of InfGCN. MIT rep-learn neural-operator
- rho_learn (🥉5 · ⭐ 4 · 💀) - A proof-of-concept workflow for torch-based electron density learning. MIT ML-DFT rep-eng
- ML-DFT (🥉4 · ⭐ 27 · 💀) - A package for density functional approximation using machine learning. MIT
- DeepCDP (🥉4 · ⭐ 6 · 💀) - DeepCDP: Deep learning Charge Density Prediction. Unlicensed
- CSNN (🥉4 · ⭐ 3 · 💀) - Primary codebase of CSNN - Concentric Spherical Neural Network for 3D Representation Learning. BSD-3
- rholearn (🥉4 · ⭐ 3 · 💀) - Learning and predicting electronic densities decomposed on a basis and global electronic densities of states at DFT.. MIT ML-DFT rep-eng density-of-states
- gprep (🥉4 · 💀) - Fitting DFTB repulsive potentials with GPR. MIT single-paper
- ofdft_nflows (🥉3 · ⭐ 11 · 💀) - Normalizing flows for orbital-free DFT. Unlicensed generative
- APET (🥉3 · ⭐ 6 · 💀) - Atomic Positional Embedding-based Transformer. GPL-3.0 density-of-states transformer
- MALADA (🥉3 · ⭐ 1 · 💤) - MALA Data Acquisition: Helpful tools to build data for MALA. BSD-3
- A3MD (🥉2 · ⭐ 8 · 💀) - MPNN-like + Analytic Density Model = Accurate electron densities. Unlicensed rep-learn single-paper
- MLDensity (🥉1 · ⭐ 7 · 💀) - Linear Jacobi-Legendre expansion of the charge density for machine learning-accelerated electronic structure.. Unlicensed
- kdft (🥉1 · ⭐ 2 · 💀) - The Kernel Density Functional (KDF) code allows generating ML based DFT functionals. Unlicensed


Educational Resources

Back to top

Tutorials, guides, cookbooks, recipes, etc.

🔗 AI for Science 101 community-resource rep-learn

🔗 AL4MS 2023 workshop tutorials active-learning

🔗 Quantum Chemistry in the Age of Machine Learning - Book, 2022.

Deep Learning for Molecules and Materials Book (🥇13 · ⭐ 730) - Deep learning for molecules and materials book. Custom - [GitHub](https://github.com/whitead/dmol-book) (👨‍💻 19 · 🔀 130 · 📋 180 - 17% open · ⏱️ 20.02.2026):
git clone https://github.com/whitead/dmol-book
AI4Chemistry course (🥇13 · ⭐ 260) - EPFL AI for chemistry course, Spring 2023. https://schwallergroup.github.io/ai4chem_course. MIT chemistry - [GitHub](https://github.com/schwallergroup/ai4chem_course) (👨‍💻 10 · 🔀 59 · 📋 4 - 25% open · ⏱️ 03.04.2026):
git clone https://github.com/schwallergroup/ai4chem_course
Geometric GNN Dojo (🥇12 · ⭐ 520) - New to geometric GNNs: try our practical notebook, prepared for MPhil students at the University of Cambridge. MIT rep-learn - [GitHub](https://github.com/chaitjo/geometric-gnn-dojo) (👨‍💻 4 · 🔀 52 · 📋 9 - 22% open · ⏱️ 09.10.2025):
git clone https://github.com/chaitjo/geometric-gnn-dojo
COSMO Software Cookbook (🥇12 · ⭐ 47) - A collection of simulation recipes for the atomic-scale modeling of materials and molecules. BSD-3 - [GitHub](https://github.com/lab-cosmo/atomistic-cookbook) (👨‍💻 20 · 🔀 8 · 📋 33 - 33% open · ⏱️ 07.04.2026):
git clone https://github.com/lab-cosmo/atomistic-cookbook
MLforMaterials (🥈9 · ⭐ 140) - Online resource for a practical course in machine learning for materials research at Imperial College London.. MIT community-resource general-ml rep-eng materials-discovery - [GitHub](https://github.com/aronwalsh/MLforMaterials) (👨‍💻 2 · 🔀 17 · ⏱️ 07.02.2026):
git clone https://github.com/aronwalsh/MLforMaterials
DSECOP (🥈9 · ⭐ 53 · 💤) - This repository contains data science educational materials developed by DSECOP Fellows. CC0-1.0 - [GitHub](https://github.com/GDS-Education-Community-of-Practice/DSECOP) (👨‍💻 14 · 🔀 26 · 📋 8 - 12% open · ⏱️ 29.04.2025):
git clone https://github.com/GDS-Education-Community-of-Practice/DSECOP
iam-notebooks (🥈9 · ⭐ 38) - Jupyter notebooks for the lectures of the Introduction to Atomistic Modeling. Apache-2 - [GitHub](https://github.com/ceriottm/iam-notebooks) (👨‍💻 6 · 🔀 6 · ⏱️ 14.02.2026):
git clone https://github.com/ceriottm/iam-notebooks
jarvis-tools-notebooks (🥈8 · ⭐ 96 · 💤) - This repository is no longer maintained. For the latest updates and continued development, please visit:.. NIST - [GitHub](https://github.com/JARVIS-Materials-Design/jarvis-tools-notebooks) (👨‍💻 6 · 🔀 40 · ⏱️ 10.07.2025):
git clone https://github.com/JARVIS-Materials-Design/jarvis-tools-notebooks
DeepModeling Tutorials (🥉6 · ⭐ 16) - Tutorials for DeepModeling projects. Unlicensed - [GitHub](https://github.com/deepmodeling/tutorials) (👨‍💻 12 · 🔀 24 · 📋 4 - 25% open · ⏱️ 11.03.2026):
git clone https://github.com/deepmodeling/tutorials
Show 20 hidden projects...
- DeepLearningLifeSciences (🥇12 · ⭐ 400 · 💀) - Example code from the book Deep Learning for the Life Sciences. MIT
- Introduction to AI-driven Science on Supercomputers: A Student Training Series (🥈11 · ⭐ 240) - Unlicensed general-ml rep-learn language-models
- OPTIMADE Tutorial Exercises (🥈9 · ⭐ 17 · 💀) - Tutorial exercises for the OPTIMADE API. MIT datasets
- RDKit Tutorials (🥈8 · ⭐ 310 · 💀) - Tutorials to learn how to work with the RDKit. Custom
- BestPractices (🥈8 · ⭐ 200 · 💀) - Things that you should (and should not) do in your Materials Informatics research. MIT
- MAChINE (🥉7 · ⭐ 1 · 💀) - Client-Server Web App to introduce usage of ML in materials science to beginners. MIT
- Applied AI for Materials (🥉6 · ⭐ 73 · 💀) - Course materials for Applied AI for Materials Science and Engineering. Unlicensed
- Machine Learning for Materials Hard and Soft (🥉6 · ⭐ 40 · 💀) - ESI-DCAFM-TACO-VDSP Summer School on Machine Learning for Materials Hard and Soft. Unlicensed
- ML for catalysis tutorials (🥉6 · ⭐ 12 · 💀) - A jupyter book repo for tutorial on how to use OCP ML models for catalysis. MIT
- Data Handling, DoE and Statistical Analysis for Material Chemists (🥉6 · ⭐ 4 · 💀) - Notebooks for workshops of DoE course, hosted by the Computational Materials Chemistry group at Uppsala University. GPL-3.0
- AI4Science101 (🥉5 · ⭐ 100 · 💀) - AI for Science. Unlicensed
- ML-in-chemistry-101 (🥉4 · ⭐ 87 · 💀) - The course materials for Machine Learning in Chemistry 101. Unlicensed
- MACE-tutorials (🥉4 · ⭐ 54) - Another set of tutorials for the MACE interatomic potential by one of the authors. MIT ML-IAP rep-learn MD
- DSM-CORE (🥉4 · ⭐ 17 · 💤) - Data Science for Materials - Collection of Open Educational Resources. Unlicensed
- chemrev-gpr (🥉4 · ⭐ 12 · 💀) - Notebooks accompanying the paper on GPR in materials and molecules in Chemical Reviews 2020. Unlicensed
- AI4ChemMat Hands-On Series (🥉4 · ⭐ 1 · 💀) - Hands-On Series organized by Chemistry and Materials working group at Argonne Nat Lab. MPL-2.0
- PiNN Lab (🥉3 · ⭐ 3 · 💀) - Material for running a lab session on atomic neural networks. GPL-3.0
- MLDensity_tutorial (🥉2 · ⭐ 12 · 💀) - Tutorial files to work with ML for the charge density in molecules and solids. Unlicensed
- LAMMPS-style pair potentials with GAP (🥉2 · ⭐ 4 · 💀) - A tutorial on how to create LAMMPS-style pair potentials and use them in combination with GAP potentials to run MD.. Unlicensed ML-IAP MD rep-eng
- MALA Tutorial (🥉2 · ⭐ 2 · 💀) - A full MALA hands-on tutorial. Unlicensed


Explainable Artificial Intelligence (XAI)

Back to top

Projects that focus on explainability and model interpretability in atomistic ML.
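To make "explaining a black box" concrete, here is a minimal sketch of one model-agnostic technique, permutation feature importance, using only the Python standard library. It is not the method of any project listed here; `black_box`, the three synthetic features, and all function names are hypothetical and chosen only for illustration.

```python
import random


def black_box(x):
    """Hypothetical opaque model: strong dependence on feature 0, weak on 1, none on 2."""
    return 3.0 * x[0] + 0.5 * x[1]


def mean_squared_error(model, X, y):
    """Average squared difference between model predictions and targets."""
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def permutation_importance(model, X, y, seed=0):
    """Increase in error when a single feature column is shuffled across samples."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, X, y)
    importances = []
    for k in range(len(X[0])):
        column = [x[k] for x in X]
        rng.shuffle(column)
        # Rebuild the dataset with only column k replaced by its shuffled version.
        X_perm = [x[:k] + [v] + x[k + 1:] for x, v in zip(X, column)]
        importances.append(mean_squared_error(model, X_perm, y) - baseline)
    return importances


# Synthetic data: 200 samples with 3 features drawn uniformly from [-1, 1].
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [black_box(x) for x in X]

importances = permutation_importance(black_box, X, y)
print(importances)
```

Shuffling a feature the model relies on degrades its predictions, while shuffling an unused feature leaves the error unchanged, so the ranking of the resulting values gives a rough, model-agnostic view of which inputs matter.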

exmol (🥇19 · ⭐ 350 · 💤) - Explainer for black box models that predict molecule properties. MIT - [GitHub](https://github.com/ur-whitelab/exmol) (👨‍💻 9 · 🔀 46 · 📋 72 - 8% open · ⏱️ 08.05.2025):
git clone https://github.com/ur-whitelab/exmol
- [PyPi](https://pypi.org/project/exmol) (📥 4.1K / month · 📦 3 · ⏱️ 08.05.2025):
pip install exmol
Show 3 hidden projects...
- Linear vs blackbox (🥈3 · ⭐ 2 · 💀) - Code and data related to the publication: Interpretable models for extrapolation in scientific machine learning. MIT XAI single-paper rep-eng
- MEGAN: Multi Explanation Graph Attention Student (🥉2 · ⭐ 12) - Minimal implementation of graph attention student model architecture. MIT rep-learn
- XElemNet (🥉2 · 💀) - Using explainable artificial intelligence (XAI) techniques to analyze ElemNet... Unlicensed rep-eng single-paper


Electronic structure methods (ML-ESM)

Back to top

Projects and models that focus on quantities of electronic structure methods, which do not fit into either of the categories ML-WFT or ML-DFT.

DeePTB (🥇14 · ⭐ 100) - DeePTB: A deep learning package for tight-binding Hamiltonian with ab initio accuracy. LGPL-3.0 ML-DFT - [GitHub](https://github.com/deepmodeling/DeePTB) (👨‍💻 15 · 🔀 29 · 📦 4 · 📋 64 - 37% open · ⏱️ 19.03.2026):
git clone https://github.com/deepmodeling/DeePTB
- [PyPi](https://pypi.org/project/dptb) (📥 140 / month · 📦 2 · ⏱️ 07.05.2025):
pip install dptb
Show 5 hidden projects...
- QDF for molecule (🥈8 · ⭐ 230 · 💀) - Quantum deep field: data-driven wave function, electron density generation, and energy prediction and extrapolation.. MIT
- QMLearn (🥈5 · ⭐ 12 · 💀) - Quantum Machine Learning by learning one-body reduced density matrices in the AO basis... MIT
- q-pac (🥈5 · ⭐ 6 · 💀) - Kernel charge equilibration method. MIT electrostatics
- halex (🥈5 · ⭐ 4 · 💀) - Hamiltonian Learning for Excited States https://doi.org/10.48550/arXiv.2311.00844. Unlicensed excited-states
- e3psi (🥉3 · ⭐ 7 · 💀) - Equivariant machine learning library for learning from electronic structures. LGPL-3.0


General Tools

Back to top

General tools for atomistic machine learning.

RDKit (🥇39 · ⭐ 3.4K) - BSD-3 C++ cheminformatics - [GitHub](https://github.com/rdkit/rdkit) (👨‍💻 270 · 🔀 970 · 📦 3 · 📋 4.3K - 15% open · ⏱️ 07.04.2026):
git clone https://github.com/rdkit/rdkit
- [PyPi](https://pypi.org/project/rdkit) (📥 7.4M / month · 📦 1.6K · ⏱️ 05.04.2026):
pip install rdkit
- [Conda](https://anaconda.org/rdkit/rdkit) (📥 2.6M · ⏱️ 25.03.2025):
conda install -c rdkit rdkit
DeepChem (🥇33 · ⭐ 6.7K · 📉) - Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology. MIT - [GitHub](https://github.com/deepchem/deepchem) (👨‍💻 260 · 🔀 2.2K · 📦 660 · 📋 2.2K - 41% open · ⏱️ 20.02.2026):
git clone https://github.com/deepchem/deepchem
- [PyPi](https://pypi.org/project/deepchem) (📥 43K / month · 📦 24 · ⏱️ 20.02.2026):
pip install deepchem
- [Conda](https://anaconda.org/conda-forge/deepchem) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge deepchem
- [Docker Hub](https://hub.docker.com/r/deepchemio/deepchem) (📥 9.4K · ⭐ 5 · ⏱️ 15.07.2025):
docker pull deepchemio/deepchem
Matminer (🥇31 · ⭐ 580) - Data mining for materials science. Custom - [GitHub](https://github.com/hackingmaterials/matminer) (👨‍💻 57 · 🔀 210 · 📦 480 · 📋 230 - 12% open · ⏱️ 10.02.2026):
git clone https://github.com/hackingmaterials/matminer
- [PyPi](https://pypi.org/project/matminer) (📥 290K / month · 📦 86 · ⏱️ 22.01.2026):
pip install matminer
- [Conda](https://anaconda.org/conda-forge/matminer) (📥 110K · ⏱️ 23.01.2026):
conda install -c conda-forge matminer
QUIP (🥈26 · ⭐ 390) - libAtoms/QUIP molecular dynamics framework: https://libatoms.github.io. GPL-2.0 MD ML-IAP rep-eng Fortran - [GitHub](https://github.com/libAtoms/QUIP) (👨‍💻 86 · 🔀 130 · 📥 860 · 📦 46 · 📋 500 - 23% open · ⏱️ 31.03.2026):
git clone https://github.com/libAtoms/QUIP
- [PyPi](https://pypi.org/project/quippy-ase) (📥 2.5K / month · 📦 9 · ⏱️ 30.01.2026):
pip install quippy-ase
- [Docker Hub](https://hub.docker.com/r/libatomsquip/quip) (📥 10K · ⭐ 4 · ⏱️ 24.04.2023):
docker pull libatomsquip/quip
JARVIS-Tools (🥈24 · ⭐ 370 · 💤) - About JARVIS-Tools: an open-source software package for data-driven atomistic materials design. Publications:.. Custom - [GitHub](https://github.com/usnistgov/jarvis) (👨‍💻 16 · 🔀 140 · 📋 95 - 52% open · ⏱️ 25.08.2025):
git clone https://github.com/usnistgov/jarvis
- [PyPi](https://pypi.org/project/jarvis-tools) (📥 79K / month · 📦 42 · ⏱️ 05.04.2026):
pip install jarvis-tools
- [Conda](https://anaconda.org/conda-forge/jarvis-tools) (📥 130K · ⏱️ 06.04.2026):
conda install -c conda-forge jarvis-tools
MAML (🥈20 · ⭐ 450) - Python for Materials Machine Learning, Materials Descriptors, Machine Learning Force Fields, Deep Learning, etc. BSD-3 - [GitHub](https://github.com/materialyzeai/maml) (👨‍💻 39 · 🔀 95 · 📦 17 · 📋 76 - 14% open · ⏱️ 14.02.2026):
git clone https://github.com/materialyzeai/maml
- [PyPi](https://pypi.org/project/maml) (📥 590 / month · 📦 3 · ⏱️ 02.04.2025):
pip install maml
Molfeat (🥈20 · ⭐ 220 · 💤) - molfeat - the hub for all your molecular featurizers. Apache-2 cheminformatics rep-eng rep-learn generative language-models pretrained - [GitHub](https://github.com/datamol-io/molfeat) (👨‍💻 19 · 🔀 27 · 📦 73 · 📋 61 - 27% open · ⏱️ 27.05.2025):
git clone https://github.com/datamol-io/molfeat
- [PyPi](https://pypi.org/project/molfeat) (📥 6.9K / month · 📦 13 · ⏱️ 27.05.2025):
pip install molfeat
- [Conda](https://anaconda.org/conda-forge/molfeat) (📥 43K · ⏱️ 30.05.2025):
conda install -c conda-forge molfeat
Scikit-Matter (🥈19 · ⭐ 93) - A collection of scikit-learn compatible utilities that implement methods born out of the materials science and.. BSD-3 scikit-learn - [GitHub](https://github.com/scikit-learn-contrib/scikit-matter) (👨‍💻 20 · 🔀 25 · 📥 19 · 📋 84 - 26% open · ⏱️ 01.04.2026):
git clone https://github.com/scikit-learn-contrib/scikit-matter
- [PyPi](https://pypi.org/project/skmatter) (📥 2.7K / month · 📦 7 · ⏱️ 06.01.2026):
pip install skmatter
- [Conda](https://anaconda.org/conda-forge/skmatter) (📥 6.5K · ⏱️ 08.01.2026):
conda install -c conda-forge skmatter
AtomAI (🥈18 · ⭐ 230 · 💤) - Deep and Machine Learning for Microscopy. MIT computer-vision USL experimental-data - [GitHub](https://github.com/pycroscopy/atomai) (👨‍💻 6 · 🔀 41 · 📦 13 · 📋 20 - 55% open · ⏱️ 23.06.2025):
git clone https://github.com/pycroscopy/atomai
- [PyPi](https://pypi.org/project/atomai) (📥 740 / month · 📦 1 · ⏱️ 23.06.2025):
pip install atomai
Artificial Intelligence for Science (AIRS) (🥉14 · ⭐ 750) - Artificial Intelligence Research for Science (AIRS). GPL-3.0 license rep-learn generative ML-IAP MD ML-DFT ML-WFT biomolecules - [GitHub](https://github.com/divelab/AIRS) (👨‍💻 36 · 🔀 89 · 📋 32 - 18% open · ⏱️ 30.03.2026):
git clone https://github.com/divelab/AIRS
MLatom (🥉14 · ⭐ 140) - AI-enhanced computational chemistry. MIT UIP ML-IAP MD ML-DFT ML-ESM transfer-learning active-learning spectroscopy structure-optimization - [GitHub](https://github.com/dralgroup/mlatom) (👨‍💻 6 · 🔀 17 · 📋 8 - 37% open · ⏱️ 09.03.2026):
git clone https://github.com/dralgroup/mlatom
- [PyPi](https://pypi.org/project/mlatom) (📥 700 / month · ⏱️ 09.03.2026):
pip install mlatom
MAST-ML (🥉14 · ⭐ 130) - MAterials Simulation Toolkit for Machine Learning (MAST-ML). MIT - [GitHub](https://github.com/uw-cmg/MAST-ML) (👨‍💻 19 · 🔀 61 · 📥 170 · 📋 220 - 14% open · ⏱️ 10.10.2025):
git clone https://github.com/uw-cmg/MAST-ML
Show 12 hidden projects...
- QML (🥈17 · ⭐ 210 · 💀) - QML: Quantum Machine Learning. MIT
- Automatminer (🥉16 · ⭐ 170 · 💀) - An automatic engine for predicting materials properties. Custom autoML
- XenonPy (🥉16 · ⭐ 150 · 💀) - XenonPy is a Python Software for Materials Informatics. BSD-3
- AMPtorch (🥉11 · ⭐ 61 · 💀) - AMPtorch: Atomistic Machine Learning Package (AMP) - PyTorch. GPL-3.0
- OpenChem (🥉10 · ⭐ 740 · 💀) - OpenChem: Deep Learning toolkit for Computational Chemistry and Drug Design Research. MIT
- JAXChem (🥉7 · ⭐ 81 · 💀) - JAXChem is a JAX-based deep learning library for complex and versatile chemical modeling. MIT
- uncertainty_benchmarking (🥉7 · ⭐ 43 · 💀) - Various code/notebooks to benchmark different ways we could estimate uncertainty in ML predictions. Unlicensed benchmarking probabilistic
- torchchem (🥉7 · ⭐ 38 · 💀) - An experimental repo for experimenting with PyTorch models. MIT
- Equisolve (🥉6 · ⭐ 5 · 💀) - A ML toolkit package utilizing the metatensor data format to build models for the prediction of equivariant properties.. BSD-3 ML-IAP
- quantum-structure-ml (🥉3 · ⭐ 3 · 💀) - Multi-class classification model for predicting the magnetic order of magnetic structures and a binary classification.. Unlicensed magnetism benchmarking
- ACEatoms (🥉3 · ⭐ 2 · 💀) - Generic code for modelling atomic properties using ACE. Custom Julia
- Magpie (🥉3) - Materials Agnostic Platform for Informatics and Exploration (Magpie). MIT Java


Generative Models

Back to top

Projects that implement generative models for atomistic ML.

GT4SD (🥇16 · ⭐ 370 · 💤) - GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process. MIT pretrained drug-discovery rep-learn - [GitHub](https://github.com/GT4SD/gt4sd-core) (👨‍💻 20 · 🔀 79 · 📋 120 - 11% open · ⏱️ 18.09.2025):
git clone https://github.com/GT4SD/gt4sd-core
- [PyPi](https://pypi.org/project/gt4sd) (📥 900 / month · ⏱️ 19.02.2025):
pip install gt4sd
SLICES and MatterGPT (🥈14 · ⭐ 140) - SLICES: An Invertible, Invariant, and String-based Crystal Representation [2023, Nature Communications] MatterGPT,.. LGPL-2.1 rep-eng language-models transformer materials-discovery structure-prediction - [GitHub](https://github.com/xiaohang007/SLICES) (👨‍💻 2 · 🔀 58 · 📦 7 · 📋 17 - 23% open · ⏱️ 03.03.2026):
git clone https://github.com/xiaohang007/SLICES
- [PyPi](https://pypi.org/project/slices) (📥 240 / month · 📦 1 · ⏱️ 14.10.2025):
pip install slices
- [Docker Hub](https://hub.docker.com/r/xiaohang07/slices) (📥 830 · ⭐ 1 · ⏱️ 14.10.2025):
docker pull xiaohang07/slices
synspace (🥈12 · ⭐ 48 · 💤) - Synthesis generative model. MIT - [GitHub](https://github.com/whitead/synspace) (👨‍💻 2 · 🔀 4 · 📦 37 · 📋 4 - 50% open · ⏱️ 24.04.2025):
git clone https://github.com/whitead/synspace
- [PyPi](https://pypi.org/project/synspace) (📥 4.3K / month · 📦 4 · ⏱️ 24.04.2025):
pip install synspace
SchNetPack G-SchNet (🥈11 · ⭐ 63) - G-SchNet extension for SchNetPack. MIT - [GitHub](https://github.com/atomistic-machine-learning/schnetpack-gschnet) (👨‍💻 3 · 🔀 11 · ⏱️ 13.11.2025):
git clone https://github.com/atomistic-machine-learning/schnetpack-gschnet
SiMGen (🥈9 · ⭐ 29 · 💤) - Zero Shot Molecular Generation via Similarity Kernels. MIT viz - [GitHub](https://github.com/RokasEl/simgen) (👨‍💻 4 · 🔀 5 · 📦 2 · 📋 5 - 20% open · ⏱️ 27.08.2025):
git clone https://github.com/RokasEl/simgen
- [PyPi](https://pypi.org/project/simgen) (📥 18 / month · ⏱️ 13.12.2024):
pip install simgen
Show 12 hidden projects...
- MoLeR (🥇15 · ⭐ 320 · 💀) - Implementation of MoLeR: a generative model of molecular graphs which supports scaffold-constrained generation. MIT
- PMTransformer (🥈14 · ⭐ 120 · 💀) - Universal Transfer Learning in Porous Materials, including MOFs. MIT transfer-learning pretrained transformer
- EDM (🥈9 · ⭐ 560 · 💀) - E(3) Equivariant Diffusion Model for Molecule Generation in 3D. MIT
- G-SchNet (🥉8 · ⭐ 150 · 💀) - G-SchNet - a generative model for 3d molecular structures. MIT
- bVAE-IM (🥉8 · ⭐ 14 · 💀) - Implementation of Chemical Design with GPU-based Ising Machine. MIT QML single-paper
- molecular-vae (🥉7 · ⭐ 71 · 💀) - Pytorch implementation of the paper Automatic Chemical Design Using a Data-Driven Continuous Representation of.. MIT rep-learn cheminformatics single-paper
- cG-SchNet (🥉7 · ⭐ 65 · 💀) - cG-SchNet - a conditional generative neural network for 3d molecular structures. MIT
- COATI (🥉6 · ⭐ 120 · 💀) - COATI: multi-modal contrastive pre-training for representing and traversing chemical space. Apache-2 drug-discovery multimodal pretrained rep-learn
- rxngenerator (🥉5 · ⭐ 12 · 💀) - A generative model for molecular generation via multi-step chemical reactions. MIT
- MolSLEPA (🥉5 · ⭐ 7 · 💀) - Interpretable Fragment-based Molecule Design with Self-learning Entropic Population Annealing. MIT XAI
- Mapping out phase diagrams with generative classifiers (🥉4 · ⭐ 8 · 💀) - Repository for our "Mapping out phase diagrams with generative models" paper. MIT phase-transition
- descriptors-inversion (🥉4 · ⭐ 6 · 💀) - Local inversion of the chemical environment representations. MIT rep-eng single-paper


Interatomic Potentials (ML-IAP)

Back to top

Machine learning interatomic potentials (aka ML-IAP, MLIAP, MLIP, MLP) and force fields (ML-FF) for molecular dynamics.
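As a rough illustration of what such a potential provides to a molecular dynamics code, the sketch below maps atomic positions to an energy and derives forces as the negative gradient of that energy. A Lennard-Jones pair potential in reduced units stands in for a learned model; it and all function names are assumptions chosen for illustration, not the API of any project listed here.

```python
import math


def pair_energy(positions, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy of a set of 3D points (toy stand-in for a learned model)."""
    energy = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.sqrt(sum((a - b) ** 2 for a, b in zip(positions[i], positions[j])))
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy


def numerical_forces(energy_fn, positions, h=1e-5):
    """Forces as the negative gradient of the energy, via central finite differences."""
    forces = []
    for i in range(len(positions)):
        force_on_atom = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            force_on_atom.append(-(energy_fn(plus) - energy_fn(minus)) / (2.0 * h))
        forces.append(force_on_atom)
    return forces


# Two atoms separated by 1.5 sigma along x, i.e. beyond the energy minimum.
positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
energy = pair_energy(positions)
forces = numerical_forces(pair_energy, positions)
print(energy, forces)
```

In practice, differentiable frameworks obtain the forces by automatic differentiation of the learned energy rather than by finite differences; the finite-difference version is used here only to keep the sketch dependency-free.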

fairchem (🥇32 · ⭐ 2K · 📈) - FAIR Chemistry's library of machine learning methods for chemistry. Formerly known as Open Catalyst Project. MIT pretrained UIP rep-learn catalysis - [GitHub](https://github.com/facebookresearch/fairchem) (👨‍💻 69 · 🔀 450 · 📋 570 - 1% open · ⏱️ 09.04.2026):
git clone https://github.com/facebookresearch/fairchem
- [PyPi](https://pypi.org/project/fairchem-core) (📥 130K / month · 📦 44 · ⏱️ 26.03.2026):
pip install fairchem-core
NequIP (🥇30 · ⭐ 890) - NequIP is a code for building E(3)-equivariant interatomic potentials. MIT - [GitHub](https://github.com/mir-group/nequip) (👨‍💻 39 · 🔀 200 · 📦 49 · 📋 120 - 3% open · ⏱️ 25.03.2026):
git clone https://github.com/mir-group/nequip
- [PyPi](https://pypi.org/project/nequip) (📥 120K / month · 📦 17 · ⏱️ 25.03.2026):
pip install nequip
- [Conda](https://anaconda.org/conda-forge/nequip) (📥 22K · ⏱️ 25.03.2026):
conda install -c conda-forge nequip
DeePMD-kit (🥇29 · ⭐ 1.9K) - A deep learning package for many-body potential energy representation and molecular dynamics. LGPL-3.0 MD workflows C++ - [GitHub](https://github.com/deepmodeling/deepmd-kit) (👨‍💻 84 · 🔀 600 · 📥 69K · 📦 47 · 📋 1K - 13% open · ⏱️ 08.04.2026):
git clone https://github.com/deepmodeling/deepmd-kit
- [PyPi](https://pypi.org/project/deepmd-kit) (📥 7.1K / month · 📦 17 · ⏱️ 19.03.2026):
pip install deepmd-kit
- [Conda](https://anaconda.org/deepmodeling/deepmd-kit) (📥 3.6K · ⏱️ 25.03.2025):
conda install -c deepmodeling deepmd-kit
- [Docker Hub](https://hub.docker.com/r/deepmodeling/deepmd-kit) (📥 5.4K · ⭐ 1 · ⏱️ 05.04.2026):
docker pull deepmodeling/deepmd-kit
TorchANI (🥇26 · ⭐ 540) - TorchANI 2.0 is an open-source library that supports training, development, and research of ANI-style neural network.. MIT - [GitHub](https://github.com/aiqm/torchani) (👨‍💻 22 · 🔀 140 · 📦 73 · 📋 180 - 4% open · ⏱️ 15.12.2025):
git clone https://github.com/aiqm/torchani
- [PyPi](https://pypi.org/project/torchani) (📥 6.1K / month · 📦 15 · ⏱️ 17.11.2025):
pip install torchani
- [Conda](https://anaconda.org/conda-forge/torchani) (📥 1.2M · ⏱️ 23.01.2026):
conda install -c conda-forge torchani
MatCalc (🥇25 · ⭐ 140) - A python library for calculating materials properties from the PES. BSD-3 workflows benchmarking UIP pretrained model-repository - [GitHub](https://github.com/materialyzeai/matcalc) (👨‍💻 23 · 🔀 35 · 📦 23 · 📋 39 - 15% open · ⏱️ 25.03.2026):
git clone https://github.com/materialyzeai/matcalc
- [PyPi](https://pypi.org/project/matcalc) (📥 2.4K / month · 📦 10 · ⏱️ 09.12.2025):
pip install matcalc
Metatrain (🥇25 · ⭐ 65) - Train, fine-tune, and manipulate machine learning models for atomistic systems. BSD-3 workflows benchmarking rep-eng rep-learn - [GitHub](https://github.com/metatensor/metatrain) (👨‍💻 34 · 🔀 24 · 📥 79 · 📦 14 · 📋 300 - 31% open · ⏱️ 09.04.2026):
git clone https://github.com/metatensor/metatrain
- [PyPi](https://pypi.org/project/metatrain) (📥 10K / month · 📦 4 · ⏱️ 03.03.2026):
pip install metatrain
MACE (🥇24 · ⭐ 1.1K) - MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing. MIT - [GitHub](https://github.com/ACEsuit/mace) (👨‍💻 69 · 🔀 390 · 📋 580 - 20% open · ⏱️ 01.03.2026):
git clone https://github.com/ACEsuit/mace
TorchMD-NET (🥇22 · ⭐ 470) - Training neural network potentials. MIT MD rep-learn transformer pretrained - [GitHub](https://github.com/torchmd/torchmd-net) (👨‍💻 20 · 🔀 97 · 📥 190 · 📋 120 - 25% open · ⏱️ 17.03.2026):
git clone https://github.com/torchmd/torchmd-net
- [Conda](https://anaconda.org/conda-forge/torchmd-net) (📥 790K · ⏱️ 18.03.2026):
conda install -c conda-forge torchmd-net
janus-core (🥈21 · ⭐ 44) - Tools for machine learnt interatomic potentials. BSD-3 benchmarking workflows structure-optimization MD transport-phenomena - [GitHub](https://github.com/stfc/janus-core) (👨‍💻 11 · 🔀 17 · 📥 240 · 📦 16 · 📋 310 - 15% open · ⏱️ 09.04.2026):
git clone https://github.com/stfc/janus-core
- [PyPi](https://pypi.org/project/janus-core) (📥 3.8K / month · 📦 6 · ⏱️ 10.03.2026):
pip install janus-core
apax (🥈20 · ⭐ 36 · 📉) - A flexible and performant framework for training machine learning potentials. MIT - [GitHub](https://github.com/apax-hub/apax) (👨‍💻 11 · 🔀 7 · 📦 7 · 📋 180 - 17% open · ⏱️ 24.03.2026):
git clone https://github.com/apax-hub/apax
- [PyPi](https://pypi.org/project/apax) (📥 210 / month · 📦 2 · ⏱️ 18.03.2026):
pip install apax
sGDML (🥈18 · ⭐ 170 · 💤) - sGDML - Reference implementation of the Symmetric Gradient Domain Machine Learning model. MIT - [GitHub](https://github.com/stefanch/sGDML) (👨‍💻 8 · 🔀 42 · 📦 13 · 📋 22 - 50% open · ⏱️ 13.06.2025):
git clone https://github.com/stefanch/sGDML
- [PyPi](https://pypi.org/project/sgdml) (📥 1.3K / month · 📦 2 · ⏱️ 13.06.2025):
pip install sgdml
KLIFF (🥈18 · ⭐ 40) - KIM-based Learning-Integrated Fitting Framework for interatomic potentials. LGPL-2.1 probabilistic workflows - [GitHub](https://github.com/openkim/kliff) (👨‍💻 14 · 🔀 22 · 📦 4 · 📋 57 - 42% open · ⏱️ 28.02.2026):
git clone https://github.com/openkim/kliff
- [PyPi](https://pypi.org/project/kliff) (📥 120 / month · ⏱️ 11.04.2025):
pip install kliff
- [Conda](https://anaconda.org/conda-forge/kliff) (📥 210K · ⏱️ 22.04.2025):
conda install -c conda-forge kliff
Allegro (🥈17 · ⭐ 470) - Allegro is an open-source code for building highly scalable and accurate equivariant deep learning interatomic.. MIT - [GitHub](https://github.com/mir-group/allegro) (👨‍💻 9 · 🔀 73 · 📋 50 - 2% open · ⏱️ 24.02.2026):
git clone https://github.com/mir-group/allegro
Autoplex (🥈16 · ⭐ 140) - Code for automated fitting of machine learned interatomic potentials. GPL-3.0 benchmarking workflows - [GitHub](https://github.com/autoatml/autoplex) (👨‍💻 15 · 🔀 22 · 📦 2 · 📋 140 - 25% open · ⏱️ 09.04.2026):
git clone https://github.com/autoatml/autoplex
- [PyPi](https://pypi.org/project/autoplex) (📥 30 / month · ⏱️ 14.11.2025):
pip install autoplex
Graph-PES (🥈15 · ⭐ 120) - train and use graph-based ML models of potential energy surfaces. MIT rep-learn UIP MD pretrained - [GitHub](https://github.com/vldgroup/graph-pes) (👨‍💻 5 · 🔀 14 · 📦 3 · 📋 17 - 17% open · ⏱️ 20.02.2026):
git clone https://github.com/vldgroup/graph-pes
- [PyPi](https://pypi.org/project/graph-pes) (📥 2.6K / month · 📦 2 · ⏱️ 20.02.2026):
pip install graph-pes
Neural Force Field (🥈14 · ⭐ 290) - Neural Network Force Field based on PyTorch. MIT pretrained - [GitHub](https://github.com/learningmatter-mit/NeuralForceField) (👨‍💻 45 · 🔀 61 · 📋 23 - 21% open · ⏱️ 10.02.2026):
git clone https://github.com/learningmatter-mit/NeuralForceField
NNPOps (🥈14 · ⭐ 100) - High-performance operations for neural network potentials. MIT MD C++ - [GitHub](https://github.com/openmm/NNPOps) (👨‍💻 11 · 🔀 18 · 📋 66 - 42% open · ⏱️ 04.02.2026):
git clone https://github.com/openmm/NNPOps
- [Conda](https://anaconda.org/conda-forge/nnpops) (📥 670K · ⏱️ 22.04.2025):
conda install -c conda-forge nnpops
MLIPX - Machine-Learned Interatomic Potential eXploration (🥈14 · ⭐ 100) - Machine-Learned Interatomic Potential eXploration (mlipx) is designed at BASF for evaluating machine-learned.. MIT benchmarking viz workflows - [GitHub](https://github.com/basf/mlipx) (👨‍💻 5 · 🔀 8 · 📦 6 · 📋 18 - 33% open · ⏱️ 16.10.2025):
git clone https://github.com/basf/mlipx
- [PyPi](https://pypi.org/project/mlipx) (📥 1.2K / month · ⏱️ 09.06.2025):
pip install mlipx
wfl (🥈14 · ⭐ 43) - Workflow is a Python toolkit for building interatomic potential creation and atomistic simulation workflows. GPL-2.0 workflows HTC - [GitHub](https://github.com/libAtoms/workflow) (👨‍💻 20 · 🔀 21 · 📦 5 · 📋 170 - 42% open · ⏱️ 22.12.2025):
git clone https://github.com/libAtoms/workflow
MACE-Jax (🥈12 · ⭐ 94) - Equivariant machine learning interatomic potentials in JAX. MIT - [GitHub](https://github.com/ACEsuit/mace-jax) (👨‍💻 5 · 🔀 21 · 📋 10 - 50% open · ⏱️ 10.02.2026):
git clone https://github.com/ACEsuit/mace-jax
aiida-mlip (🥈12 · ⭐ 16) - machine learning interatomic potentials aiida plugin. BSD-3 workflows structure-optimization MD - [GitHub](https://github.com/ElliottKasoar/aiida-mlip) (👨‍💻 7 · 🔀 10 · ⏱️ 30.01.2026):
git clone https://github.com/ElliottKasoar/aiida-mlip
- [PyPi](https://pypi.org/project/aiida-mlip) (📥 280 / month · ⏱️ 17.11.2025):
pip install aiida-mlip
calorine (🥈12 · ⭐ 15) - A Python package for constructing and sampling neuroevolution potential models. https://doi.org/10.21105/joss.06264. Custom - [PyPi](https://pypi.org/project/calorine) (📥 42K / month · 📦 8 · ⏱️ 24.01.2026):
pip install calorine
- [GitLab](https://gitlab.com/materials-modeling/calorine) (🔀 6 · 📋 120 - 9% open · ⏱️ 24.01.2026):
git clone https://gitlab.com/materials-modeling/calorine
PiNN (🥉11 · ⭐ 120) - A Python library for building atomic neural networks. BSD-3 - [GitHub](https://github.com/Teoroo-CMC/PiNN) (👨‍💻 8 · 🔀 39 · 📋 7 - 14% open · ⏱️ 26.03.2026):
git clone https://github.com/Teoroo-CMC/PiNN
- [Docker Hub](https://hub.docker.com/r/teoroo/pinn) (📥 760 · ⏱️ 26.03.2026):
docker pull teoroo/pinn
tinker-hp (🥉10 · ⭐ 100) - Tinker-HP: High-Performance Massively Parallel Tinker for CPUs & GPUs. Custom - [GitHub](https://github.com/TinkerTools/tinker-hp) (👨‍💻 12 · 🔀 26 · 📋 29 - 24% open · ⏱️ 26.01.2026):
git clone https://github.com/TinkerTools/tinker-hp
DeepMD-GNN (🥉10 · ⭐ 54) - DeePMD-kit plugin for various graph neural network models. LGPL-3.0 rep-learn MD UIP C++ - [GitHub](https://github.com/deepmodeling/deepmd-gnn) (👨‍💻 8 · 🔀 9 · 📋 23 - 52% open · ⏱️ 13.02.2026):
git clone https://github.com/deepmodeling/deepmd-gnn
ALF (🥉10 · ⭐ 39) - A framework for performing active learning for training machine-learned interatomic potentials. Custom active-learning - [GitHub](https://github.com/lanl/ALF) (👨‍💻 8 · 🔀 13 · ⏱️ 21.03.2026):
git clone https://github.com/lanl/ALF
ACEfit (🥉10 · ⭐ 8) - MIT Julia - [GitHub](https://github.com/ACEsuit/ACEfit.jl) (👨‍💻 10 · 🔀 8 · 📋 60 - 38% open · ⏱️ 20.03.2026):
git clone https://github.com/ACEsuit/ACEfit.jl
PyNEP (🥉9 · ⭐ 69) - A python interface of the machine learning potential NEP used in GPUMD. MIT - [GitHub](https://github.com/bigd4/PyNEP) (👨‍💻 10 · 🔀 17 · 📋 14 - 42% open · ⏱️ 27.10.2025):
git clone https://github.com/bigd4/PyNEP
ACE1.jl (🥉9 · ⭐ 23 · 💤) - Atomic Cluster Expansion for Modelling Invariant Atomic Properties. Custom Julia - [GitHub](https://github.com/ACEsuit/ACE1.jl) (👨‍💻 9 · 🔀 7 · 📋 46 - 47% open · ⏱️ 15.04.2025):
git clone https://github.com/ACEsuit/ACE1.jl
TurboGAP (🥉9 · ⭐ 20) - The TurboGAP code. Custom Fortran - [GitHub](https://github.com/mcaroba/turbogap) (👨‍💻 9 · 🔀 13 · 📋 12 - 66% open · ⏱️ 08.04.2026):
git clone https://github.com/mcaroba/turbogap
GAP (🥉8 · ⭐ 46) - Gaussian Approximation Potential (GAP). Custom - [GitHub](https://github.com/libAtoms/GAP) (👨‍💻 13 · 🔀 20 · ⏱️ 03.01.2026):
git clone https://github.com/libAtoms/GAP
Asparagus (🥉8 · ⭐ 12) - Program Package for Sampling, Training and Applying ML-based Potential models https://doi.org/10.48550/arXiv.2407.15175. MIT workflows sampling MD - [GitHub](https://github.com/MMunibas/Asparagus) (👨‍💻 11 · 🔀 6 · ⏱️ 07.04.2026):
git clone https://github.com/MMunibas/Asparagus
Show 46 hidden projects... - MEGNet (🥇22 · ⭐ 560 · 💀) - Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. BSD-3 multifidelity - Ultra-Fast Force Fields (UF3) (🥈15 · ⭐ 71 · 💀) - UF3: a python library for generating ultra-fast interatomic potentials. Apache-2 - PyXtalFF (🥈14 · ⭐ 93 · 💀) - Machine Learning Interatomic Potential Predictions. MIT - n2p2 (🥈13 · ⭐ 240 · 💀) - n2p2 - A Neural Network Potential Package. GPL-3.0 C++ - Pacemaker (🥈13 · ⭐ 110 · 💀) - Python package for fitting atomic cluster expansion (ACE) potentials. Custom - TensorMol (🥈12 · ⭐ 280 · 💀) - Tensorflow + Molecules = TensorMol. GPL-3.0 single-paper - ANI-1 (🥈12 · ⭐ 230 · 💀) - ANI-1 neural net potential with python interface (ASE). MIT - So3krates (MLFF) (🥈12 · ⭐ 140 · 💀) - Build neural networks for machine learning force fields with JAX. MIT - SIMPLE-NN (🥈12 · ⭐ 48 · 💀) - SIMPLE-NN(SNU Interatomic Machine-learning PotentiaL packagE version Neural Network). GPL-3.0 - CCS_fit (🥉10 · ⭐ 10 · 💀) - Curvature Constrained Splines. GPL-3.0 - DimeNet (🥉9 · ⭐ 350 · 💀) - DimeNet and DimeNet++ models, as proposed in Directional Message Passing for Molecular Graphs (ICLR 2020) and Fast and.. Custom - SchNet (🥉9 · ⭐ 290 · 💀) - SchNet - a deep learning architecture for quantum chemistry. MIT - GemNet (🥉9 · ⭐ 220 · 💀) - GemNet model in PyTorch, as proposed in GemNet: Universal Directional Graph Neural Networks for Molecules (NeurIPS.. Custom - ACE.jl (🥉9 · ⭐ 67 · 💀) - Parameterisation of Equivariant Properties of Particle Systems. Custom Julia - Point Edge Transformer (PET) (🥉9 · ⭐ 34 · 💀) - Point Edge Transformer. MIT rep-learn transformer - EquiformerV2 (🥉8 · ⭐ 340 · 💀) - [ICLR 2024] EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations. MIT pretrained UIP rep-learn - AIMNet (🥉8 · ⭐ 110 · 💀) - Atoms In Molecules Neural Network Potential. 
MIT single-paper - SIMPLE-NN v2 (🥉8 · ⭐ 43 · 💀) - SIMPLE-NN is an open package that constructs Behler-Parrinello-type neural-network interatomic potentials from ab.. GPL-3.0 - Atomistic Adversarial Attacks (🥉8 · ⭐ 40 · 💀) - Code for performing adversarial attacks on atomistic systems using NN potentials. MIT probabilistic - SNAP (🥉8 · ⭐ 38 · 💀) - Repository for spectral neighbor analysis potential (SNAP) model development. BSD-3 - NNsforMD (🥉8 · ⭐ 11 · 💀) - Neural network class for molecular dynamics to predict potential energy, forces and non-adiabatic couplings. MIT - MEGNetSparse (🥉8 · ⭐ 5 · 💀) - A library imlementing a graph neural network with sparse representation from Code for Kazeev, N., Al-Maeeni, A.R.,.. MIT material-defect - PhysNet (🥉7 · ⭐ 120 · 💀) - Code for training PhysNet models. MIT electrostatics - MLIP-3 (🥉6 · ⭐ 27 · 💀) - MLIP-3: Active learning on atomic environments with Moment Tensor Potentials (MTP). BSD-2 C++ - testing-framework (🥉6 · ⭐ 11 · 💀) - The purpose of this repository is to aid the testing of a large number of interatomic potentials for a variety of.. Unlicensed benchmarking - PANNA (🥉6 · ⭐ 11 · 💀) - A package to train and validate all-to-all connected network models for BP[1] and modified-BP[2] type local atomic.. MIT benchmarking - MLXDM (🥉6 · ⭐ 9 · 💀) - A Neural Network Potential with Rigorous Treatment of Long-Range Dispersion https://doi.org/10.1039/D2DD00150K. MIT long-range - Alchemical learning (🥉6 · ⭐ 3 · 💀) - Code for the Modeling high-entropy transition metal alloys with alchemical compression article. BSD-3 rep-eng Defects & Disorder - BPNET (🥉6 · ⭐ 3 · 💤) - Fast Behler-Parrinello type neural networks in Fortran2008. MIT rep-eng Fortran - ACE1Pack.jl (🥉6 · ⭐ 1 · 💀) - Provides convenience functionality for the usage of ACE1.jl, ACEfit.jl, JuLIP.jl for fitting interatomic potentials.. MIT Julia - glp (🥉5 · ⭐ 26 · 💀) - tools for graph-based machine-learning potentials in jax. 
MIT - NequIP-JAX (🥉5 · ⭐ 24 · 💀) - JAX implementation of the NequIP interatomic potential. Unlicensed - Allegro-Legato (🥉5 · ⭐ 21 · 💀) - An extension of Allegro with enhanced robustness and time-to-failure. MIT MD - TensorPotential (🥉5 · ⭐ 13 · 💀) - Tensorpotential is a TensorFlow based tool for development, fitting ML interatomic potentials from electronic.. Custom - GM-NN (🥉5 · ⭐ 11 · 💀) - The Gaussian Moment Neural Network (GM-NN) package developed for large-scale atomistic simulations employing atomistic.. MIT active-learning MD rep-eng magnetism - MatML (🥉4 · ⭐ 9) - Full MatML Docker image, including MatGL, MatCalc, MatPES and LAMMPS with ML-GNNP and ML-SNAP. BSD-3 MD UIP rep-learn pretrained - PeriodicPotentials (🥉4 · 💀) - A Periodic table app that displays potentials based on the selected elements. MIT community-resource viz JavaScript - Allegro-JAX (🥉3 · ⭐ 22 · 💤) - JAX implementation of the Allegro interatomic potential. MIT - ACE Workflows (🥉3 · 💀) - Workflow Examples for ACE Models. Unlicensed Julia workflows - PyFLAME (🥉3 · 💀) - An automated approach for developing neural network interatomic potentials with FLAME.. Unlicensed active-learning structure-prediction structure-optimization rep-eng Fortran - SingleNN (🥉2 · ⭐ 9 · 💀) - An efficient package for training and executing neural-network interatomic potentials. Unlicensed C++ - mag-ace (🥉2 · ⭐ 6 · 💤) - Magnetic ACE potential. FORTRAN interface for LAMMPS SPIN package. Unlicensed magnetism MD Fortran - AisNet (🥉2 · ⭐ 3 · 💀) - A Universal Interatomic Potential Neural Network with Encoded Local Environment Features.. MIT - RuNNer (🥉2) - The RuNNer Neural Network Energy Representation is a Fortran-based framework for the construction of Behler-.. GPL-3.0 Fortran - nnp-pre-training (🥉1 · ⭐ 6 · 💀) - Synthetic pre-training for neural-network interatomic potentials. Unlicensed pretrained MD - mlp (🥉1 · ⭐ 1 · 💀) - Proper orthogonal descriptors for efficient and accurate interatomic potentials... 
Unlicensed Julia


Language Models

Projects that use (large) language models (LMs, LLMs) or natural language processing (NLP) techniques for atomistic ML.

🔗 MaCBench Leaderboard - Leaderboard for multimodal language models for chemistry & materials research. community-resource benchmarking datasets

paper-qa (🥇18 · ⭐ 8.3K) - LLM Chain for answering questions from docs. Unlicensed ai-agent - [GitHub]() (🔀 840):
git clone https://github.com/whitead/paper-qa
- [PyPi](https://pypi.org/project/paper-qa) (📥 31K / month · 📦 24 · ⏱️ 18.03.2026):
pip install paper-qa
ChemBench (🥇17 · ⭐ 140 · 💤) - How good are LLMs at chemistry?. MIT benchmarking multimodal - [GitHub](https://github.com/lamalab-org/chembench) (👨‍💻 15 · 🔀 16 · 📦 3 · 📋 340 - 16% open · ⏱️ 11.09.2025):
git clone https://github.com/lamalab-org/chembench
- [PyPi](https://pypi.org/project/chembench) (📥 1.1K / month · ⏱️ 27.02.2025):
pip install chembench
ChatMOF (🥈12 · ⭐ 110 · 💤) - Predict and Inverse design for metal-organic framework with large-language models (llms). MIT generative - [GitHub](https://github.com/Yeonghun1675/ChatMOF) (👨‍💻 2 · 🔀 22 · 📦 3 · ⏱️ 15.05.2025):
git clone https://github.com/Yeonghun1675/ChatMOF
- [PyPi](https://pypi.org/project/chatmof) (📥 310 / month · ⏱️ 01.07.2024):
pip install chatmof
AtomGPT (🥈11 · ⭐ 50 · 💤) - https://atomgpt.org. Custom generative pretrained transformer - [GitHub](https://github.com/usnistgov/atomgpt) (👨‍💻 7 · 🔀 10 · ⏱️ 21.08.2025):
git clone https://github.com/usnistgov/atomgpt
- [PyPi](https://pypi.org/project/atomgpt) (📥 73 / month · 📦 1 · ⏱️ 22.03.2025):
pip install atomgpt
LLaMP (🥉8 · ⭐ 91) - [EMNLP 25] A web app and Python API for multi-modal RAG framework to ground LLMs on high-fidelity materials.. BSD-3 multimodal RAG materials-discovery pretrained JavaScript Python - [GitHub](https://github.com/chiang-yuan/llamp) (👨‍💻 6 · 🔀 13 · 📋 25 - 32% open · ⏱️ 11.11.2025):
git clone https://github.com/chiang-yuan/llamp
LLM-Prop (🥉7 · ⭐ 54) - A repository for the LLM-Prop implementation. MIT - [GitHub](https://github.com/vertaix/LLM-Prop) (👨‍💻 6 · 🔀 10 · 📋 3 - 66% open · ⏱️ 31.01.2026):
git clone https://github.com/vertaix/LLM-Prop
LLM4Chem (🥉6 · ⭐ 110 · 💤) - Official code repo for the paper LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale,.. MIT cheminformatics datasets - [GitHub](https://github.com/OSU-NLP-Group/LLM4Chem) (👨‍💻 2 · 🔀 19 · ⏱️ 09.06.2025):
git clone https://github.com/OSU-NLP-Group/LLM4Chem
Show 17 hidden projects... - OpenBioML ChemNLP (🥇17 · ⭐ 170 · 💀) - ChemNLP project. MIT datasets - ChemCrow (🥈16 · ⭐ 890 · 💀) - Open source package for the accurate solution of reasoning-intensive chemical tasks. MIT ai-agent - ChemDataExtractor (🥈16 · ⭐ 350 · 💀) - Automatically extract chemical information from scientific documents. MIT literature-data - mat2vec (🥈12 · ⭐ 640 · 💀) - Supplementary Materials for Tshitoyan et al. Unsupervised word embeddings capture latent knowledge from materials.. MIT rep-learn - gptchem (🥈12 · ⭐ 260 · 💀) - Use GPT-3 to solve chemistry problems. MIT - nlcc (🥈11 · ⭐ 46 · 💀) - Natural language computational chemistry command line interface. MIT single-paper - NIST ChemNLP (🥈11 · ⭐ 28 · 💤) - chemnlp. MIT literature-data - MoLFormer (🥉9 · ⭐ 390 · 💀) - Repository for MolFormer. Apache-2 transformer pretrained drug-discovery - MolSkill (🥉9 · ⭐ 120 · 💀) - Extracting medicinal chemistry intuition via preference machine learning. MIT drug-discovery recommender - chemlift (🥉7 · ⭐ 45 · 💀) - Language-interfaced fine-tuning for chemistry. MIT - BERT-PSIE-TC (🥉6 · ⭐ 15 · 💀) - A dataset of Curie temperatures automatically extracted from scientific literature with the use of the BERT-PSIE.. MIT magnetism - crystal-text-llm (🥉5 · ⭐ 120 · 💀) - Large language models to generate stable crystals. CC-BY-NC-4.0 materials-discovery - SciBot (🥉5 · ⭐ 33 · 💀) - SciBot is a simple demo of building a domain-specific chatbot for science. Unlicensed ai-agent - Cephalo (🥉5 · ⭐ 12 · 💀) - Multimodal Vision-Language Models for Bio-Inspired Materials Analysis and Design. Apache-2 generative multimodal pretrained - MAPI_LLM (🥉5 · ⭐ 9 · 💀) - A LLM application developed during the LLM March MADNESS Hackathon https://doi.org/10.1039/D3DD00113J. MIT ai-agent dataset - CatBERTa (🥉4 · ⭐ 28 · 💀) - Large Language Model for Catalyst Property Prediction. 
Unlicensed transformer catalysis - ChemDataWriter (🥉3 · ⭐ 13 · 💀) - ChemDataWriter is a transformer-based library for automatically generating research books in the chemistry area. MIT literature-data


Materials Discovery

Projects that implement materials discovery methods using atomistic ML.

SMACT (🥇26 · ⭐ 130) - Python package to aid materials design and informatics. MIT HTC structure-prediction electrostatics - [GitHub](https://github.com/WMD-group/SMACT) (👨‍💻 49 · 🔀 30 · 📦 75 · ⏱️ 09.04.2026):
git clone https://github.com/WMD-group/SMACT
- [PyPi](https://pypi.org/project/smact) (📥 11K / month · 📦 14 · ⏱️ 03.03.2026):
pip install smact
- [Conda](https://anaconda.org/conda-forge/smact) (📥 10K · ⏱️ 31.07.2025):
conda install -c conda-forge smact
MatterGen (🥇18 · ⭐ 1.7K) - Official implementation of MatterGen -- a generative model for inorganic materials design across the periodic table.. MIT generative structure-prediction pretrained - [GitHub](https://github.com/microsoft/mattergen) (👨‍💻 13 · 🔀 310 · 📋 140 - 5% open · ⏱️ 27.02.2026):
git clone https://github.com/microsoft/mattergen
Materials Discovery: GNoME (🥈12 · ⭐ 1.2K · 💀) - Graph Networks for Materials Science (GNoME) and dataset of 381,000 novel stable materials. Apache-2 UIP datasets rep-learn proprietary - [GitHub](https://github.com/google-deepmind/materials_discovery) (👨‍💻 2 · 🔀 180 · 📋 25 - 84% open · ⏱️ 03.03.2025):
git clone https://github.com/google-deepmind/materials_discovery
aviary (🥈12 · ⭐ 61) - The Wren sits on its Roost in the Aviary. MIT - [GitHub](https://github.com/CompRhys/aviary) (👨‍💻 6 · 🔀 13 · 📦 1 · 📋 34 - 11% open · ⏱️ 06.01.2026):
git clone https://github.com/CompRhys/aviary
BOSS (🥈12 · ⭐ 27) - Bayesian Optimization Structure Search (BOSS). Apache-2 probabilistic - [PyPi](https://pypi.org/project/aalto-boss) (📥 580 / month · ⏱️ 28.11.2025):
pip install aalto-boss
- [GitLab](https://gitlab.com/cest-group/boss) (🔀 14 · 📋 39 - 17% open · ⏱️ 28.11.2025):
git clone https://gitlab.com/cest-group/boss
AGOX (🥉9 · ⭐ 18) - AGOX is a package for global optimization of atomic system using e.g. the energy calculated from density functional.. GPL-3.0 structure-optimization - [PyPi](https://pypi.org/project/agox) (📥 130 / month · 📦 3 · ⏱️ 04.02.2026):
pip install agox
- [GitLab](https://gitlab.com/agox/agox) (🔀 8 · 📋 28 - 32% open · ⏱️ 04.02.2026):
git clone https://gitlab.com/agox/agox
Show 7 hidden projects... - Computational Autonomy for Materials Discovery (CAMD) (🥉7 · ⭐ 1 · 💀) - Agent-based sequential learning software for materials discovery. Apache-2 - MAGUS (🥉5 · ⭐ 100 · 💀) - Machine learning And Graph theory assisted Universal structure Searcher. Unlicensed structure-prediction active-learning - CSPML (crystal structure prediction with machine learning-based element substitution) (🥉5 · ⭐ 29 · 💀) - Original implementation of CSPML. MIT structure-prediction - SPINNER (🥉4 · ⭐ 15 · 💀) - SPINNER (Structure Prediction of Inorganic crystals using Neural Network potentials with Evolutionary and Random.. GPL-3.0 C++ structure-prediction - ML-atomate (🥉4 · ⭐ 7 · 💀) - Machine learning-assisted Atomate code for autonomous computational materials screening. GPL-3.0 active-learning workflows - closed-loop-acceleration-benchmarks (🥉4 · 💀) - Data and scripts in support of the publication By how much can closed-loop frameworks accelerate computational.. MIT materials-discovery active-learning single-paper - sl_discovery (🥉3 · ⭐ 5 · 💀) - Data processing and models related to Quantifying the performance of machine learning models in materials discovery. Apache-2 materials-discovery single-paper


Mathematical tools

Projects that implement mathematical objects used in atomistic machine learning.

cuEquivariance (🥇24 · ⭐ 390) - cuEquivariance is a math library that is a collective of low-level primitives and tensor ops to accelerate widely-used.. Apache-2 rep-learn - [GitHub](https://github.com/NVIDIA/cuEquivariance) (👨‍💻 8 · 🔀 27 · 📋 80 - 21% open · ⏱️ 08.04.2026):
git clone https://github.com/NVIDIA/cuEquivariance
- [PyPi](https://pypi.org/project/cuequivariance) (📥 93K / month · 📦 10 · ⏱️ 16.03.2026):
pip install cuequivariance
- [Conda](https://anaconda.org/conda-forge/cuequivariance) (📥 21K · ⏱️ 16.03.2026):
conda install -c conda-forge cuequivariance
KFAC-JAX (🥇21 · ⭐ 320) - Second Order Optimization and Curvature Estimation with K-FAC in JAX. Apache-2 - [GitHub](https://github.com/google-deepmind/kfac-jax) (👨‍💻 20 · 🔀 29 · 📦 11 · 📋 36 - 69% open · ⏱️ 01.04.2026):
git clone https://github.com/google-deepmind/kfac-jax
- [PyPi](https://pypi.org/project/kfac-jax) (📥 2K / month · 📦 2 · ⏱️ 25.02.2026):
pip install kfac-jax
OpenEquivariance (🥈20 · ⭐ 140) - OpenEquivariance: a fast, open-source GPU JIT kernel generator for the Clebsch-Gordan Tensor Product. BSD-3 rep-learn - [GitHub](https://github.com/PASSIONLab/OpenEquivariance) (👨‍💻 4 · 🔀 9 · 📦 2 · 📋 33 - 3% open · ⏱️ 24.03.2026):
git clone https://github.com/PASSIONLab/OpenEquivariance
- [PyPi](https://pypi.org/project/openequivariance) (📥 24K / month · 📦 4 · ⏱️ 24.03.2026):
pip install openequivariance
SpheriCart (🥈20 · ⭐ 95) - Multi-language library for the calculation of spherical harmonics in Cartesian coordinates. MIT - [GitHub](https://github.com/lab-cosmo/sphericart) (👨‍💻 13 · 🔀 17 · 📥 960 · 📦 10 · 📋 47 - 34% open · ⏱️ 26.03.2026):
git clone https://github.com/lab-cosmo/sphericart
- [PyPi](https://pypi.org/project/sphericart) (📥 4.2K / month · 📦 2 · ⏱️ 26.03.2026):
pip install sphericart
gpax (🥈17 · ⭐ 230 · 💤) - Gaussian Processes for Experimental Sciences. MIT probabilistic active-learning - [GitHub](https://github.com/ziatdinovmax/gpax) (👨‍💻 6 · 🔀 29 · 📦 6 · 📋 43 - 23% open · ⏱️ 04.07.2025):
git clone https://github.com/ziatdinovmax/gpax
- [PyPi](https://pypi.org/project/gpax) (📥 310 / month · ⏱️ 04.07.2025):
pip install gpax
Polynomials4ML.jl (🥈12 · ⭐ 14) - Polynomials for ML: fast evaluation, batching, differentiation. MIT Julia - [GitHub](https://github.com/ACEsuit/Polynomials4ML.jl) (👨‍💻 12 · 🔀 7 · 📋 61 - 14% open · ⏱️ 29.12.2025):
git clone https://github.com/ACEsuit/Polynomials4ML.jl
GElib (🥉9 · ⭐ 26) - C++/CUDA library for SO(3) equivariant operations. MPL-2.0 C++ - [GitHub](https://github.com/risi-kondor/GElib) (👨‍💻 6 · 🔀 3 · 📋 8 - 50% open · ⏱️ 21.10.2025):
git clone https://github.com/risi-kondor/GElib
COSMO Toolbox (🥉6 · ⭐ 8) - Assorted libraries and utilities for atomistic simulation analysis. Unlicensed C++ - [GitHub](https://github.com/lab-cosmo/toolbox) (👨‍💻 10 · 🔀 7 · ⏱️ 01.04.2026):
git clone https://github.com/lab-cosmo/toolbox
Show 6 hidden projects... - lie-nn (🥉9 · ⭐ 36 · 💀) - Tools for building equivariant polynomials on reductive Lie groups. MIT rep-learn - LapJAX (🥉8 · ⭐ 75 · 💀) - A JAX based package designed for efficient second order operators (e.g., laplacian) computation. MIT - EquivariantOperators.jl (🥉6 · ⭐ 18 · 💀) - This package is deprecated. Functionalities are migrating to Porcupine.jl. MIT Julia - cnine (🥉4 · ⭐ 5) - Cnine tensor library. Unlicensed C++ - torch_spex (🥉3 · ⭐ 2 · 💀) - Spherical expansions in PyTorch. Unlicensed - Wigner Kernels (🥉1 · ⭐ 2 · 💀) - Collection of programs to benchmark Wigner kernels. Unlicensed benchmarking


Molecular Dynamics

Projects that simplify the integration of molecular dynamics and atomistic machine learning.

JAX-MD (🥇28 · ⭐ 1.4K) - Differentiable, Hardware Accelerated, Molecular Dynamics. Apache-2 - [GitHub](https://github.com/jax-md/jax-md) (👨‍💻 45 · 🔀 230 · 📦 82 · 📋 190 - 29% open · ⏱️ 05.04.2026):
git clone https://github.com/jax-md/jax-md
- [PyPi](https://pypi.org/project/jax-md) (📥 14K / month · 📦 18 · ⏱️ 22.03.2026):
pip install jax-md
TorchSim (🥇24 · ⭐ 440 · 📉) - Torch-native, batchable, atomistic simulations. MIT HTC UIP ML-IAP structure-optimization - [GitHub](https://github.com/TorchSim/torch-sim) (👨‍💻 33 · 🔀 90 · 📋 160 - 13% open · ⏱️ 09.04.2026):
git clone https://github.com/TorchSim/torch-sim
- [PyPi](https://pypi.org/project/torch-sim-atomistic) (📥 50K / month · 📦 7 · ⏱️ 18.02.2026):
pip install torch-sim-atomistic
GPUMD (🥈22 · ⭐ 740) - GPUMD is a highly efficient general-purpose molecular dynamics (MD) package and enables machine-learned potentials.. GPL-3.0 ML-IAP C++ electrostatics - [GitHub](https://github.com/brucefan1983/GPUMD) (👨‍💻 57 · 🔀 180 · 📋 270 - 6% open · ⏱️ 08.04.2026):
git clone https://github.com/brucefan1983/GPUMD
mlcolvar (🥈22 · ⭐ 140) - A unified framework for machine learning collective variables for enhanced sampling simulations. MIT sampling - [GitHub](https://github.com/luigibonati/mlcolvar) (👨‍💻 14 · 🔀 35 · 📦 11 · 📋 110 - 11% open · ⏱️ 09.04.2026):
git clone https://github.com/luigibonati/mlcolvar
- [PyPi](https://pypi.org/project/mlcolvar) (📥 600 / month · 📦 4 · ⏱️ 03.02.2026):
pip install mlcolvar
FitSNAP (🥈17 · ⭐ 180) - Software for generating machine-learning interatomic potentials for LAMMPS. GPL-2.0 - [GitHub](https://github.com/FitSNAP/FitSNAP) (👨‍💻 24 · 🔀 65 · 📥 15 · 📋 84 - 25% open · ⏱️ 17.10.2025):
git clone https://github.com/FitSNAP/FitSNAP
- [Conda](https://anaconda.org/conda-forge/fitsnap3) (📥 17K · ⏱️ 22.04.2025):
conda install -c conda-forge fitsnap3
OpenMM-ML (🥈17 · ⭐ 160) - High level API for using machine learning models in OpenMM simulations. MIT ML-IAP - [GitHub](https://github.com/openmm/openmm-ml) (👨‍💻 8 · 🔀 40 · 📦 2 · 📋 72 - 33% open · ⏱️ 25.03.2026):
git clone https://github.com/openmm/openmm-ml
- [Conda](https://anaconda.org/conda-forge/openmm-ml) (📥 41K · ⏱️ 25.03.2026):
conda install -c conda-forge openmm-ml
Psiflow (🥉15 · ⭐ 140) - scalable molecular simulation. MIT ML-IAP active-learning sampling - [GitHub](https://github.com/molmod/psiflow) (👨‍💻 5 · 🔀 15 · 📋 59 - 16% open · ⏱️ 30.03.2026):
git clone https://github.com/molmod/psiflow
pair_allegro (🥉15 · ⭐ 61) - LAMMPS pair styles for NequIP and Allegro deep learning interatomic potentials. MIT ML-IAP rep-learn - [GitHub](https://github.com/mir-group/pair_nequip_allegro) (👨‍💻 6 · 🔀 10 · 📋 49 - 20% open · ⏱️ 10.03.2026):
git clone https://github.com/mir-group/pair_nequip_allegro
DMFF (🥉12 · ⭐ 190) - DMFF (Differentiable Molecular Force Field) is a Jax-based python package that provides a full differentiable.. LGPL-3.0 C++ - [GitHub](https://github.com/deepmodeling/DMFF) (👨‍💻 17 · 🔀 47 · 📋 33 - 39% open · ⏱️ 07.04.2026):
git clone https://github.com/deepmodeling/DMFF
pair_nequip (🥉10 · ⭐ 44 · 💤) - LAMMPS pair style for NequIP. MIT ML-IAP rep-learn - [GitHub](https://github.com/mir-group/pair_nequip) (👨‍💻 3 · 🔀 14 · 📋 33 - 39% open · ⏱️ 25.04.2025):
git clone https://github.com/mir-group/pair_nequip
PACE (🥉10 · ⭐ 31) - The LAMMPS ML-IAP `pair_style pace`, aka Atomic Cluster Expansion (ACE), aka ML-PACE,.. Custom - [GitHub](https://github.com/ICAMS/lammps-user-pace) (👨‍💻 8 · 🔀 16 · 📋 11 - 45% open · ⏱️ 03.12.2025):
git clone https://github.com/ICAMS/lammps-user-pace
MUSE (🥉5 · ⭐ 7 · 💤) - A python package for fast building amorphous solids and liquid mixtures from @materialsproject computed structures and.. MIT ML-IAP Defects & Disorder - [GitHub](https://github.com/chiang-yuan/muse) (👨‍💻 2 · 📦 1 · ⏱️ 15.05.2025):
git clone https://github.com/chiang-yuan/muse
Show 3 hidden projects... - openmm-torch (🥈17 · ⭐ 220 · 💀) - OpenMM plugin to define forces with neural networks. Custom ML-IAP C++ - SOMD (🥉4 · ⭐ 17) - Molecular dynamics package designed for the SIESTA DFT code. AGPL-3.0 ML-IAP active-learning - interface-lammps-mlip-3 (🥉3 · ⭐ 5 · 💀) - An interface between LAMMPS and MLIP (version 3). GPL-2.0


Probabilistic ML

Projects that focus on probabilistic, Bayesian, Gaussian process and adversarial methods for atomistic ML, for optimization, uncertainty quantification (UQ), etc.

thermo (🥇5 · ⭐ 17) - Data-driven risk-conscious thermoelectric materials discovery. MIT materials-discovery experimental-data active-learning transport-phenomena - [GitHub](https://github.com/janosh/thermo) (👨‍💻 2 · 🔀 4 · ⏱️ 12.11.2025):
git clone https://github.com/janosh/thermo


Reinforcement Learning

Projects that focus on reinforcement learning for atomistic ML.

Show 2 hidden projects... - ReLeaSE (🥇11 · ⭐ 370 · 💀) - Deep Reinforcement Learning for de-novo Drug Design. MIT drug-discovery - CatGym (🥉5 · ⭐ 13 · 💀) - Surface segregation using Deep Reinforcement Learning. GPL


Representation Engineering

Projects that offer implementations of representations aka descriptors, fingerprints of atomistic systems, and models built with them, aka feature engineering.

cdk (🥇27 · ⭐ 580) - The Chemistry Development Kit. LGPL-2.1 cheminformatics Java - [GitHub](https://github.com/cdk/cdk) (👨‍💻 170 · 🔀 180 · 📥 98K · 📋 330 - 2% open · ⏱️ 03.04.2026):
git clone https://github.com/cdk/cdk
- [Maven](https://search.maven.org/artifact/org.openscience.cdk/cdk-bundle) (📦 18 · ⏱️ 03.03.2026):
<dependency>
    <groupId>org.openscience.cdk</groupId>
    <artifactId>cdk-bundle</artifactId>
    <version>[VERSION]</version>
</dependency>
DScribe (🥇24 · ⭐ 460 · 💤) - DScribe is a python package for creating machine learning descriptors for atomistic systems. Apache-2 - [GitHub](https://github.com/SINGROUP/dscribe) (👨‍💻 18 · 🔀 96 · 📦 290 · 📋 110 - 12% open · ⏱️ 27.09.2025):
git clone https://github.com/SINGROUP/dscribe
- [PyPi](https://pypi.org/project/dscribe) (📥 120K / month · 📦 63 · ⏱️ 27.09.2025):
pip install dscribe
- [Conda](https://anaconda.org/conda-forge/dscribe) (📥 280K · ⏱️ 10.12.2025):
conda install -c conda-forge dscribe
ChemML (🥇19 · ⭐ 170) - ChemML is a machine learning and informatics program suite for the chemical and materials sciences. BSD-3 cheminformatics active-learning workflows - [GitHub](https://github.com/hachmannlab/chemml) (👨‍💻 17 · 🔀 34 · 📥 14 · 📦 8 · 📋 13 - 53% open · ⏱️ 20.03.2026):
git clone https://github.com/hachmannlab/chemml
- [PyPi](https://pypi.org/project/chemml) (📥 170 / month · 📦 2 · ⏱️ 08.10.2023):
pip install chemml
MODNet (🥈15 · ⭐ 110 · 💤) - MODNet: a framework for machine learning materials properties. MIT pretrained small-data transfer-learning - [GitHub](https://github.com/ppdebreuck/modnet) (👨‍💻 11 · 🔀 34 · 📦 11 · 📋 64 - 50% open · ⏱️ 02.05.2025):
git clone https://github.com/ppdebreuck/modnet
Featomic (🥈15 · ⭐ 79) - Computing representations for atomistic machine learning. BSD-3 Rust C++ - [GitHub](https://github.com/metatensor/featomic) (👨‍💻 19 · 🔀 18 · 📥 850 · 📋 89 - 49% open · ⏱️ 31.03.2026):
git clone https://github.com/metatensor/featomic
GlassPy (🥈15 · ⭐ 38) - Python module for scientists working with glass materials. GPL-3.0 - [GitHub](https://github.com/drcassar/glasspy) (👨‍💻 2 · 🔀 8 · 📦 7 · 📋 16 - 43% open · ⏱️ 17.03.2026):
git clone https://github.com/drcassar/glasspy
- [PyPi](https://pypi.org/project/glasspy) (📥 660 / month · ⏱️ 13.03.2026):
pip install glasspy
pySIPFENN (🥈15 · ⭐ 24) - Python toolset for Structure-Informed Property and Feature Engineering with Neural Networks. It offers unique.. LGPL-3.0 material-defect Defects & Disorder pretrained transfer-learning - [GitHub](https://github.com/PhasesResearchLab/pySIPFENN) (👨‍💻 5 · 🔀 5 · 📥 120 · 📦 7 · 📋 8 - 62% open · ⏱️ 20.01.2026):
git clone https://github.com/PhasesResearchLab/pySIPFENN
- [PyPi](https://pypi.org/project/pysipfenn) (📥 95 / month · ⏱️ 20.01.2026):
pip install pysipfenn
- [Conda](https://anaconda.org/conda-forge/pysipfenn) (📥 22K · ⏱️ 20.01.2026):
conda install -c conda-forge pysipfenn
SISSO (🥈11 · ⭐ 350) - A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models. Apache-2 Fortran - [GitHub](https://github.com/rouyang2017/SISSO) (👨‍💻 3 · 🔀 95 · 📋 78 - 23% open · ⏱️ 26.01.2026):
git clone https://github.com/rouyang2017/SISSO
BenchML (🥈11 · ⭐ 15) - ML benchmarking and pipelining framework. Apache-2 benchmarking - [GitHub](https://github.com/capoe/benchml) (👨‍💻 9 · 🔀 6 · 📋 13 - 23% open · ⏱️ 28.10.2025):
git clone https://github.com/capoe/benchml
- [PyPi](https://pypi.org/project/benchml) (📥 35 / month · ⏱️ 14.07.2022):
pip install benchml
PDynA (🥉10 · ⭐ 51) - Python package to analyse the structural dynamics of perovskites. MIT MD - [GitHub](https://github.com/WMD-group/PDynA) (👨‍💻 4 · 🔀 5 · 📦 2 · ⏱️ 14.01.2026):
git clone https://github.com/WMD-group/PDynA
- [PyPi](https://pypi.org/project/pdyna) (📥 11 / month · ⏱️ 23.09.2024):
pip install pdyna
MOLPIPx (🥉9 · ⭐ 50) - Differentiable version of Permutationally Invariant Polynomial (PIP) models in JAX and Rust. Apache-2 Python Rust - [GitHub](https://github.com/ChemAI-Lab/molpipx) (👨‍💻 13 · 🔀 1 · ⏱️ 27.03.2026):
git clone https://github.com/ChemAI-Lab/molpipx
ElemNet (🥉7 · ⭐ 100) - Deep Learning the Chemistry of Materials From Only Elemental Composition for Enhancing Materials Property Prediction. Unlicensed single-paper - [GitHub](https://github.com/NU-CUCIS/ElemNet) (👨‍💻 4 · 🔀 35 · 📋 8 - 50% open · ⏱️ 13.01.2026):
git clone https://github.com/NU-CUCIS/ElemNet
fplib (🥉6 · ⭐ 8 · 💤) - libfp is a library for calculating crystalline fingerprints and measuring similarities of materials. MIT C-lang single-paper - [GitHub](https://github.com/Rutgers-ZRG/libfp) (👨‍💻 2 · 🔀 1 · 📦 2 · ⏱️ 22.09.2025):
git clone https://github.com/Rutgers-ZRG/libfp
soap_turbo (🥉5 · ⭐ 8) - soap_turbo comprises a series of libraries to be used in combination with QUIP/GAP and TurboGAP. Custom Fortran - [GitHub](https://github.com/libAtoms/soap_turbo) (👨‍💻 4 · 🔀 8 · 📋 8 - 62% open · ⏱️ 22.01.2026):
git clone https://github.com/libAtoms/soap_turbo
Show 17 hidden projects... - CatLearn (🥇16 · ⭐ 120 · 💀) - GPL-3.0 surface-science - ElementEmbeddings (🥈15 · ⭐ 51 · 💀) - Python package to interact with high-dimensional representations of the chemical elements. MIT XAI USL viz - Librascal (🥈12 · ⭐ 83 · 💀) - A scalable and versatile library to generate representations for atomic-scale learning. LGPL-2.1 - CBFV (🥈12 · ⭐ 29 · 💀) - Tool to quickly create a composition-based feature vector. Unlicensed - SkipAtom (🥉10 · ⭐ 28 · 💀) - Distributed representations of atoms, inspired by the Skip-gram model. MIT - cmlkit (🥉9 · ⭐ 33 · 💀) - tools for machine learning in condensed matter physics and quantum chemistry. MIT benchmarking - NICE (🥉7 · ⭐ 12 · 💀) - NICE (N-body Iteratively Contracted Equivariants) is a set of tools designed for the calculation of invariant and.. MIT - SISSO++ (🥉7 · ⭐ 6 · 💀) - C++ Implementation of SISSO with python bindings. Apache-2 C++ - milad (🥉6 · ⭐ 34 · 💀) - Moment Invariants Local Atomic Descriptor. GPL-3.0 generative - SA-GPR (🥉6 · ⭐ 23 · 💀) - Public repository for symmetry-adapted Gaussian Process Regression (SA-GPR). LGPL-3.0 C-lang - SOAPxx (🥉6 · ⭐ 7 · 💀) - A SOAP implementation. GPL-2.0 C++ - pyLODE (🥉6 · ⭐ 3 · 💀) - Pythonic implementation of LOng Distance Equivariants. Apache-2 electrostatics - AMP (🥉6 · 💀) - Amp is an open-source package designed to easily bring machine-learning to atomistic calculations. Unlicensed - MXenes4HER (🥉5 · ⭐ 7 · 💀) - Predicting hydrogen evolution (HER) activity over 4500 MXene materials https://doi.org/10.1039/D3TA00344B. GPL-3.0 materials-discovery catalysis scikit-learn single-paper - automl-materials (🥉4 · ⭐ 5 · 💀) - AutoML for Regression Tasks on Small Tabular Data in Materials Design. MIT autoML benchmarking single-paper - magnetism-prediction (🥉4 · ⭐ 2 · 💤) - DFT-aided Machine Learning Search for Magnetism in Fe-based Bimetallic Chalcogenides. 
Apache-2 magnetism single-paper - ML-for-CurieTemp-Predictions (🥉3 · ⭐ 2 · 💀) - Machine Learning Predictions of High-Curie-Temperature Materials. MIT single-paper magnetism


Representation Learning

General models that learn representations, aka embeddings, of atomistic systems, such as message-passing neural networks (MPNN).

Deep Graph Library (DGL) (🥇36 · ⭐ 14K · 💤) - Python package built to ease deep learning on graphs, on top of existing DL frameworks. Apache-2 - [GitHub](https://github.com/dmlc/dgl) (👨‍💻 300 · 🔀 3K · 📦 4.2K · 📋 3K - 20% open · ⏱️ 31.07.2025):
git clone https://github.com/dmlc/dgl
- [PyPi](https://pypi.org/project/dgl) (📥 120K / month · 📦 150 · ⏱️ 13.05.2024):
pip install dgl
- [Conda](https://anaconda.org/dglteam/dgl) (📥 480K · ⏱️ 25.03.2025):
conda install -c dglteam dgl
PyG Models (🥇34 · ⭐ 24K) - Representation learning models implemented in PyTorch Geometric. MIT general-ml - [GitHub](https://github.com/pyg-team/pytorch_geometric) (👨‍💻 560 · 🔀 4K · 📦 11K · 📋 4K - 31% open · ⏱️ 08.04.2026):
git clone https://github.com/pyg-team/pytorch_geometric
e3nn (🥇29 · ⭐ 1.2K) - A modular framework for neural networks with Euclidean symmetry. MIT - [GitHub](https://github.com/e3nn/e3nn) (👨‍💻 38 · 🔀 180 · 📦 610 · 📋 180 - 17% open · ⏱️ 13.02.2026):
git clone https://github.com/e3nn/e3nn
- [PyPi](https://pypi.org/project/e3nn) (📥 350K / month · 📦 74 · ⏱️ 13.02.2026):
pip install e3nn
- [Conda](https://anaconda.org/conda-forge/e3nn) (📥 65K · ⏱️ 14.02.2026):
conda install -c conda-forge e3nn
MatGL (Materials Graph Library) (🥇28 · ⭐ 530) - Graph deep learning library for materials. BSD-3 ML-IAP pretrained multifidelity - [GitHub](https://github.com/materialyzeai/matgl) (👨‍💻 25 · 🔀 110 · 📦 95 · 📋 160 - 2% open · ⏱️ 08.04.2026):
git clone https://github.com/materialyzeai/matgl
- [PyPi](https://pypi.org/project/matgl) (📥 24K / month · 📦 33 · ⏱️ 15.03.2026):
pip install matgl
- [Docker Hub](https://hub.docker.com/r/materialsvirtuallab/matgl) (📥 280 · ⭐ 1 · ⏱️ 08.04.2025):
docker pull materialsvirtuallab/matgl
SchNetPack (🥇27 · ⭐ 910) - SchNetPack - Deep Neural Networks for Atomistic Systems. MIT - [GitHub](https://github.com/atomistic-machine-learning/schnetpack) (👨‍💻 43 · 🔀 250 · 📦 110 · 📋 280 - 1% open · ⏱️ 17.03.2026):
git clone https://github.com/atomistic-machine-learning/schnetpack
- [PyPi](https://pypi.org/project/schnetpack) (📥 6.3K / month · 📦 4 · ⏱️ 19.12.2025):
pip install schnetpack
e3nn-jax (🥇23 · ⭐ 230) - jax library for E3 Equivariant Neural Networks. Apache-2 - [GitHub](https://github.com/e3nn/e3nn-jax) (👨‍💻 8 · 🔀 19 · 📦 83 · 📋 26 - 15% open · ⏱️ 01.04.2026):
git clone https://github.com/e3nn/e3nn-jax
- [PyPi](https://pypi.org/project/e3nn-jax) (📥 130K / month · 📦 35 · ⏱️ 01.04.2026):
pip install e3nn-jax
ALIGNN (🥈21 · ⭐ 310 · 💤) - Atomistic Line Graph Neural Network https://scholar.google.com/citations?user=9Q-tNnwAAAAJ.. Custom - [GitHub](https://github.com/usnistgov/alignn) (👨‍💻 8 · 🔀 110 · 📋 87 - 67% open · ⏱️ 25.08.2025):
git clone https://github.com/usnistgov/alignn
- [PyPi](https://pypi.org/project/alignn) (📥 11K / month · 📦 13 · ⏱️ 06.04.2026):
pip install alignn
Uni-Mol (🥈17 · ⭐ 1.1K · 💤) - Official Repository for the Uni-Mol Series Methods. MIT pretrained - [GitHub](https://github.com/deepmodeling/Uni-Mol) (👨‍💻 20 · 🔀 170 · 📥 23K · 📋 230 - 48% open · ⏱️ 29.05.2025):
git clone https://github.com/deepmodeling/Uni-Mol
HydraGNN (🥈16 · ⭐ 100) - Distributed PyTorch implementation of multi-headed graph convolutional neural networks. BSD-3 - [GitHub](https://github.com/ORNL/HydraGNN) (👨‍💻 18 · 🔀 38 · 📦 3 · 📋 56 - 30% open · ⏱️ 03.04.2026):
git clone https://github.com/ORNL/HydraGNN
hippynn (🥈13 · ⭐ 94) - python library for atomistic machine learning. Custom workflows - [GitHub](https://github.com/lanl/hippynn) (👨‍💻 20 · 🔀 34 · 📦 4 · 📋 41 - 34% open · ⏱️ 04.03.2026):
git clone https://github.com/lanl/hippynn
Compositionally-Restricted Attention-Based Network (CrabNet) (🥈13 · ⭐ 17 · 💤) - Predict materials properties using only the composition information!. MIT - [GitHub](https://github.com/sparks-baird/CrabNet) (👨‍💻 6 · 🔀 7 · 📦 16 · 📋 19 - 84% open · ⏱️ 04.06.2025):
git clone https://github.com/sparks-baird/CrabNet
- [PyPi](https://pypi.org/project/crabnet) (📥 430 / month · 📦 2 · ⏱️ 10.01.2023):
pip install crabnet
SE(3)-Transformers (🥈9 · ⭐ 580) - code for the SE3 Transformers paper: https://arxiv.org/abs/2006.10503. MIT single-paper transformer - [GitHub](https://github.com/FabianFuchsML/se3-transformer-public) (👨‍💻 2 · 🔀 76 · 📋 29 - 37% open · ⏱️ 03.04.2026):
git clone https://github.com/FabianFuchsML/se3-transformer-public
UVVisML (🥉8 · ⭐ 35 · 💤) - Predict optical properties of molecules with machine learning. MIT optical-properties single-paper probabilistic - [GitHub](https://github.com/learningmatter-mit/uvvisml) (👨‍💻 1 · 🔀 10 · 📋 2 - 50% open · ⏱️ 30.07.2025):
git clone https://github.com/learningmatter-mit/uvvisml
DeeperGATGNN (🥉7 · ⭐ 65) - Scalable graph neural networks for materials property prediction. MIT - [GitHub](https://github.com/usccolumbia/deeperGATGNN) (👨‍💻 3 · 🔀 8 · ⏱️ 02.02.2026):
git clone https://github.com/usccolumbia/deeperGATGNN
Crystalframer (🥉7 · ⭐ 18) - The official code repository for Rethinking the role of frames for SE(3)-invariant crystal structure modeling (ICLR.. MIT transformer single-paper - [GitHub](https://github.com/omron-sinicx/crystalframer) (👨‍💻 3 · 🔀 3 · 📥 11 · ⏱️ 16.10.2025):
git clone https://github.com/omron-sinicx/crystalframer
Hidden projects (48):
- dgl-lifesci (🥇24 · ⭐ 800 · 💀) - Python package for graph neural networks in chemistry and biology. Apache-2
- NVIDIA Deep Learning Examples for Tensor Cores (🥈20 · ⭐ 15K · 💀) - State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and.. Custom educational drug-discovery
- DIG: Dive into Graphs (🥈20 · ⭐ 2K · 💀) - A library for graph deep learning research. GPL-3.0
- escnn (🥈18 · ⭐ 520 · 💀) - Equivariant Steerable CNNs Library for Pytorch https://quva-lab.github.io/escnn/. Custom
- kgcnn (🥈17 · ⭐ 120 · 💀) - Graph convolutions in Keras with TensorFlow, PyTorch or Jax. MIT
- Graphormer (🥈15 · ⭐ 2.4K · 💀) - Graphormer is a general-purpose deep learning backbone for molecular modeling. MIT transformer pretrained
- benchmarking-gnns (🥈14 · ⭐ 2.7K · 💀) - Repository for benchmarking graph neural networks (JMLR 2023). MIT single-paper benchmarking
- Crystal Graph Convolutional Neural Networks (CGCNN) (🥈13 · ⭐ 850 · 💀) - Crystal graph convolutional neural networks for predicting material properties. MIT
- xtal2png (🥈13 · ⭐ 39 · 💀) - Encode/decode a crystal structure to/from a grayscale PNG image for direct use with image-based machine learning.. MIT computer-vision
- Neural fingerprint (nfp) (🥈12 · ⭐ 62 · 💀) - Keras layers for end-to-end learning with rdkit and pymatgen. Custom
- FAENet (🥈12 · ⭐ 34 · 💀) - Frame Averaging Equivariant GNN for materials modeling. MIT
- pretrained-gnns (🥈10 · ⭐ 1.1K · 💀) - Strategies for Pre-training Graph Neural Networks. MIT pretrained
- GDC (🥈10 · ⭐ 280 · 💀) - Graph Diffusion Convolution, as proposed in Diffusion Improves Graph Learning (NeurIPS 2019). MIT generative
- Atom2Vec (🥈10 · ⭐ 37 · 💀) - Atom2Vec: a simple way to describe atoms for machine learning. MIT
- GATGNN: Global Attention Graph Neural Network (🥈9 · ⭐ 85 · 💀) - Pytorch Repository for our work: Graph convolutional neural networks with global attention for improved materials.. MIT
- ai4material_design (🥈9 · ⭐ 8 · 💀) - Code for Kazeev, N., Al-Maeeni, A.R., Romanov, I. et al. Sparse representation for machine learning the properties of.. Apache-2 pretrained material-defect
- molecularGNN_smiles (🥉8 · ⭐ 340 · 💀) - The code of a graph neural network (GNN) for molecules, which is based on learning representations of r-radius.. Apache-2
- Equiformer (🥉8 · ⭐ 280 · 💀) - [ICLR 2023 Spotlight] Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs. MIT transformer
- graphite (🥉8 · ⭐ 110 · 💀) - A repository for implementing graph network models based on atomic structures. MIT
- GNNOpt (🥉8 · ⭐ 33 · 💀) - Universal Ensemble-Embedding Graph Neural Network for Direct Prediction of Optical Spectra from Crystal Structures. MIT optical-properties single-paper
- T-e3nn (🥉8 · ⭐ 19 · 💀) - Time-reversal Euclidean neural networks based on e3nn. MIT magnetism
- tensorfieldnetworks (🥉7 · ⭐ 160 · 💀) - Rotation- and translation-equivariant neural networks for 3D point clouds. MIT
- DTNN (🥉7 · ⭐ 78 · 💀) - Deep Tensor Neural Network. MIT
- Graph-Aware-Transformers (🥉7 · ⭐ 69 · 💀) - Graph-Aware Attention for Adaptive Dynamics in Transformers. Apache-2 transformer graph-data pretrained single-paper
- Cormorant (🥉7 · ⭐ 60 · 💀) - Codebase for Cormorant Neural Networks. Custom
- AdsorbML (🥉7 · ⭐ 44 · 💀) - MIT surface-science single-paper
- escnn_jax (🥉7 · ⭐ 32 · 💀) - Equivariant Steerable CNNs Library for Pytorch https://quva-lab.github.io/escnn/. Custom
- CGAT (🥉7 · ⭐ 31 · 💀) - Crystal graph attention neural networks for materials prediction. MIT
- Crystalformer (🥉7 · ⭐ 28 · 💀) - The official code repository for Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding (ICLR.. MIT transformer single-paper
- Geom3D (🥉6 · ⭐ 130 · 💀) - Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023. MIT benchmarking single-paper
- matsciml (🥉6 · ⭐ 130 · 💀) - Open MatSci ML Toolkit is a framework for prototyping and scaling out deep learning models for materials discovery.. MIT workflows benchmarking
- MACE-Layer (🥉6 · ⭐ 47 · 💀) - Higher order equivariant graph neural networks for 3D point clouds. MIT
- charge_transfer_nnp (🥉6 · ⭐ 37 · 💀) - Graph neural network potential with charge transfer. MIT electrostatics
- FieldSchNet (🥉6 · ⭐ 24 · 💀) - Deep neural network for molecules in external fields. MIT
- GLAMOUR (🥉6 · ⭐ 24 · 💀) - Graph Learning over Macromolecule Representations. MIT single-paper
- ML4pXRDs (🥉6 · ⭐ 3 · 💀) - Contains code to train neural networks based on simulated powder XRDs from synthetic crystals. MIT XRD single-paper
- Autobahn (🥉5 · ⭐ 30 · 💀) - Repository for Autobahn: Automorphism Based Graph Neural Networks. MIT
- CraTENet (🥉5 · ⭐ 18 · 💀) - An attention-based deep neural network for thermoelectric transport properties. MIT transport-phenomena
- SCFNN (🥉5 · ⭐ 15 · 💀) - Self-consistent determination of long-range electrostatics in neural network potentials. MIT C++ electrostatics single-paper
- gkx: Green-Kubo Method in JAX (🥉5 · ⭐ 8 · 💀) - Green-Kubo + JAX + MLPs = Anharmonic Thermal Conductivities Done Fast. MIT transport-phenomena
- Per-site PAiNN (🥉5 · ⭐ 2 · 💀) - Fork of PaiNN for PerovskiteOrderingGCNNs. MIT probabilistic pretrained single-paper
- Per-Site CGCNN (🥉5 · ⭐ 1 · 💀) - Crystal graph convolutional neural networks for predicting material properties. MIT pretrained single-paper
- Graph Transport Network (🥉4 · ⭐ 15 · 💀) - Graph transport network (GTN), as proposed in Scalable Optimal Transport in High Dimensions for Graph Distances,.. Custom transport-phenomena
- atom_by_atom (🥉4 · ⭐ 12 · 💀) - Atom-by-atom design of metal oxide catalysts for the oxygen evolution reaction with Machine Learning. Unlicensed surface-science single-paper
- EGraFFBench (🥉4 · ⭐ 11 · 💀) - Unlicensed single-paper benchmarking ML-IAP
- Element encoder (🥉3 · ⭐ 6 · 💀) - Autoencoder neural network to compress properties of atomic species into a vector representation. GPL-3.0 single-paper
- Point Edge Transformer (🥉2) - Smooth, exact rotational symmetrization for deep learning on point clouds. CC-BY-4.0
- SphericalNet (🥉1 · ⭐ 3 · 💀) - Implementation of Clebsch-Gordan Networks (CGnet: https://arxiv.org/pdf/1806.09231.pdf) by GElib & cnine libraries in.. Unlicensed


Universal Potentials

Machine-learned interatomic potentials (ML-IAP) that have been trained on large, chemically and structurally diverse datasets. For materials, this means, for example, datasets that cover the majority of the periodic table.

🔗 TeaNet - Universal neural network interatomic potential inspired by iterative electronic relaxations.. ML-IAP

🔗 PreFerred Potential (PFP) - Universal neural network potential for material discovery https://doi.org/10.1038/s41467-022-30687-9. ML-IAP proprietary

FAIRChem EquiformerV2 models (🥇32 · ⭐ 2K · 📈) - FAIRChem implementation of Equiformer V2 (eqV2) models. MIT pretrained UIP rep-learn catalysis - [GitHub](https://github.com/facebookresearch/fairchem) (👨‍💻 69 · 🔀 450 · 📋 570 - 1% open · ⏱️ 09.04.2026):
git clone https://github.com/facebookresearch/fairchem
- [PyPi](https://pypi.org/project/fairchem-core) (📥 130K / month · 📦 44 · ⏱️ 26.03.2026):
pip install fairchem-core
FAIRChem eSEN models (🥇32 · ⭐ 2K · 📈) - FAIRChem implementation of Smooth Energy Network (eSEN) models arXiv:2502.12147. MIT pretrained UIP rep-learn catalysis - [GitHub](https://github.com/facebookresearch/fairchem) (👨‍💻 69 · 🔀 450 · 📋 570 - 1% open · ⏱️ 09.04.2026):
git clone https://github.com/facebookresearch/fairchem
- [PyPi](https://pypi.org/project/fairchem-core) (📥 130K / month · 📦 44 · ⏱️ 26.03.2026):
pip install fairchem-core
DPA-2 (🥈30 · ⭐ 1.9K) - A large atomic model as a multi-task learner https://arxiv.org/abs/2312.15492. LGPL-3.0 ML-IAP pretrained workflows datasets - [GitHub](https://github.com/deepmodeling/deepmd-kit) (👨‍💻 84 · 🔀 600 · 📥 69K · 📦 47 · 📋 1K - 13% open · ⏱️ 08.04.2026):
git clone https://github.com/deepmodeling/deepmd-kit
- [PyPi](https://pypi.org/project/deepmd-kit) (📥 7.1K / month · 📦 17 · ⏱️ 19.03.2026):
pip install deepmd-kit
- [Conda](https://anaconda.org/conda-forge/deepmd-kit) (📥 2.7M · ⏱️ 19.03.2026):
conda install -c conda-forge deepmd-kit
- [Docker Hub](https://hub.docker.com/r/deepmodeling/deepmd-kit) (📥 5.4K · ⭐ 1 · ⏱️ 05.04.2026):
docker pull deepmodeling/deepmd-kit
DeePMD-DPA3 (🥈30 · ⭐ 1.9K) - Successor of DPA-2. LGPL-3.0 ML-IAP pretrained workflows datasets - [GitHub](https://github.com/deepmodeling/deepmd-kit) (👨‍💻 84 · 🔀 600 · 📥 69K · 📦 47 · 📋 1K - 13% open · ⏱️ 08.04.2026):
git clone https://github.com/deepmodeling/deepmd-kit
- [PyPi](https://pypi.org/project/deepmd-kit) (📥 7.1K / month · 📦 17 · ⏱️ 19.03.2026):
pip install deepmd-kit
- [Conda](https://anaconda.org/conda-forge/deepmd-kit) (📥 2.7M · ⏱️ 19.03.2026):
conda install -c conda-forge deepmd-kit
- [Docker Hub](https://hub.docker.com/r/deepmodeling/deepmd-kit) (📥 5.4K · ⭐ 1 · ⏱️ 05.04.2026):
docker pull deepmodeling/deepmd-kit
SevenNet (🥈24 · ⭐ 240) - SevenNet - a graph neural network interatomic potential package supporting efficient multi-GPU parallel molecular.. GPL-3.0 ML-IAP MD pretrained - [GitHub](https://github.com/MDIL-SNU/SevenNet) (👨‍💻 20 · 🔀 52 · 📥 14K · 📋 90 - 17% open · ⏱️ 07.04.2026):
git clone https://github.com/MDIL-SNU/SevenNet
- [PyPi](https://pypi.org/project/sevenn) (📥 170K / month · 📦 23 · ⏱️ 03.03.2026):
pip install sevenn
MACE-FOUNDATION models (🥈23 · ⭐ 1.1K) - MACE foundation models (MP, OMAT, mh-1). MIT ML-IAP pretrained rep-learn MD - [GitHub](https://github.com/ACEsuit/mace-foundations) (👨‍💻 3 · 🔀 380 · 📥 360K · 📋 36 - 13% open · ⏱️ 19.11.2025):
git clone https://github.com/ACEsuit/mace-foundations
- [PyPi](https://pypi.org/project/mace-torch) (📥 62K / month · 📦 74 · ⏱️ 22.02.2026):
pip install mace-torch
MatterSim (🥈22 · ⭐ 530) - MatterSim: A deep learning atomistic model across elements, temperatures and pressures. MIT ML-IAP active-learning multimodal phase-transition pretrained - [GitHub](https://github.com/microsoft/mattersim) (👨‍💻 20 · 🔀 77 · 📥 36 · 📋 44 - 40% open · ⏱️ 18.03.2026):
git clone https://github.com/microsoft/mattersim
- [PyPi](https://pypi.org/project/mattersim) (📥 140K / month · 📦 26 · ⏱️ 05.04.2026):
pip install mattersim
Orb Models (🥉21 · ⭐ 560) - ORB forcefield models from Orbital Materials. Custom ML-IAP pretrained - [GitHub](https://github.com/orbital-materials/orb-models) (👨‍💻 16 · 🔀 78 · 📦 35 · 📋 75 - 4% open · ⏱️ 18.03.2026):
git clone https://github.com/orbital-materials/orb-models
- [PyPi](https://pypi.org/project/orb-models) (📥 8.1K / month · 📦 36 · ⏱️ 18.03.2026):
pip install orb-models
CHGNet (🥉20 · ⭐ 380) - Pretrained universal neural network potential for charge-informed atomistic modeling https://chgnet.lbl.gov. Custom ML-IAP MD pretrained electrostatics magnetism structure-relaxation - [GitHub](https://github.com/CederGroupHub/chgnet) (👨‍💻 13 · 🔀 96 · 📦 69 · 📋 77 - 5% open · ⏱️ 19.02.2026):
git clone https://github.com/CederGroupHub/chgnet
- [PyPi](https://pypi.org/project/chgnet) (📥 33K / month · 📦 33 · ⏱️ 22.09.2025):
pip install chgnet
PET-MAD (🥉20 · ⭐ 200) - Universal interatomic potentials for advanced materials modeling. BSD-3 ML-IAP MD rep-learn transformer - [GitHub](https://github.com/lab-cosmo/upet) (👨‍💻 15 · 🔀 17 · 📥 46 · 📦 3 · ⏱️ 31.03.2026):
git clone https://github.com/lab-cosmo/upet
- [PyPi](https://pypi.org/project/pet-mad) (📥 700 / month · 📦 9 · ⏱️ 12.12.2025):
pip install pet-mad
- [Conda](https://anaconda.org/conda-forge/pet-mad):
conda install -c conda-forge pet-mad
M3GNet (🥉18 · ⭐ 320 · 💤) - Materials graph network with 3-body interactions featuring a DFT surrogate crystal relaxer and a state-of-the-art.. BSD-3 ML-IAP pretrained - [GitHub](https://github.com/materialyzeai/m3gnet) (👨‍💻 16 · 🔀 74 · 📋 35 - 42% open · ⏱️ 07.04.2025):
git clone https://github.com/materialyzeai/m3gnet
- [PyPi](https://pypi.org/project/m3gnet) (📥 1.8K / month · 📦 17 · ⏱️ 17.11.2022):
pip install m3gnet
MLIP Arena Leaderboard (🥉13 · ⭐ 95) - [NeurIPS 25 Spotlight] Fair and transparent benchmark of machine learning interatomic potentials (MLIPs), beyond basic.. Apache-2 ML-IAP benchmarking - [GitHub](https://github.com/atomind-ai/mlip-arena) (👨‍💻 3 · 🔀 8 · 📦 2 · 📋 23 - 56% open · ⏱️ 27.03.2026):
git clone https://github.com/atomind-ai/mlip-arena
GRACE (🥉13 · ⭐ 89) - GRACE models and gracemaker (as implemented in TensorPotential package). Custom ML-IAP pretrained MD rep-learn rep-eng - [GitHub](https://github.com/ICAMS/grace-tensorpotential) (👨‍💻 4 · 🔀 11 · 📦 10 · 📋 18 - 66% open · ⏱️ 06.03.2026):
git clone https://github.com/ICAMS/grace-tensorpotential
EScAIP (🥉7 · ⭐ 60 · 💤) - [NeurIPS 2024] Official implementation of the Efficiently Scaled Attention Interatomic Potential. MIT ML-IAP rep-learn transformer single-paper - [GitHub](https://github.com/ASK-Berkeley/EScAIP) (👨‍💻 2 · 🔀 6 · 📥 15 · 📋 9 - 22% open · ⏱️ 26.09.2025):
git clone https://github.com/ASK-Berkeley/EScAIP
Hidden projects (3):
- ffonons (🥉7 · ⭐ 23 · 💀) - Phonons from ML force fields. MIT benchmarking density-of-states
- CHIPS-FF (🥉6 · ⭐ 9 · 💀) - Evaluation of universal machine learning force-fields https://doi.org/10.1021/acsmaterialslett.5c00093. Custom benchmarking structure-optimization MD materials-discovery transport-phenomena
- Joint Multidomain Pre-Training (JMP) (🥉5 · ⭐ 62 · 💀) - Code for From Molecules to Materials Pre-training Large Generalizable Models for Atomic Property Prediction. CC-BY-NC-4.0 pretrained ML-IAP general-tool


Unsupervised Learning

Projects that focus on unsupervised, semi- or self-supervised learning for atomistic ML, such as dimensionality reduction, clustering, contrastive learning, etc.
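
As a rough illustration of the clustering idea mentioned above, the sketch below groups a handful of descriptor vectors with plain Lloyd's k-means using only the Python standard library. The descriptor values are synthetic placeholders and the function name `kmeans` is hypothetical; it is not taken from any of the listed projects, which typically use far richer descriptors and algorithms.

```python
import math

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means; centroids start from the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Placeholder 2D "descriptors" for six local environments (synthetic values)
descriptors = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
labels, centroids = kmeans(descriptors, k=2)
print(labels)
```

Seeding the centroids from the first `k` points keeps the sketch deterministic; real workflows usually prefer a randomized or k-means++ initialization.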

DADApy (🥇16 · ⭐ 140 · 📉) - Distance-based Analysis of DAta-manifolds in python. Apache-2 - [GitHub](https://github.com/sissa-data-science/DADApy) (👨‍💻 21 · 🔀 23 · 📋 42 - 28% open · ⏱️ 12.02.2026):
git clone https://github.com/sissa-data-science/DADApy
- [PyPi](https://pypi.org/project/dadapy) (📥 320 / month · ⏱️ 11.04.2025):
pip install dadapy
Hidden projects (9):
- mat_discover (🥈13 · ⭐ 46 · 💀) - A materials discovery algorithm geared towards exploring high-performance candidates in new chemical spaces. MIT materials-discovery rep-eng HTC
- ASAP (🥈12 · ⭐ 150 · 💀) - ASAP is a package that can quickly analyze and visualize datasets of crystal or molecular structures. MIT
- pumml (🥈10 · ⭐ 37 · 💀) - Positive and Unlabeled Materials Machine Learning (pumml) is a code that uses semi-supervised machine learning to.. MIT materials-discovery
- Sketchmap (🥉8 · ⭐ 48 · 💀) - Suite of programs to perform non-linear dimensionality reduction -- sketch-map in particular. GPL-3.0 C++
- paper-ml-robustness-material-property (🥉5 · ⭐ 4 · 💀) - A critical examination of robustness and generalizability of machine learning prediction of materials properties. BSD-3 datasets single-paper
- 3D-EMGP (🥉4 · ⭐ 33 · 💀) - [AAAI 2023] The implementation for the paper Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs. MIT pretrained rep-learn single-paper
- Coarse-Graining-Auto-encoders (🥉4 · ⭐ 21 · 💀) - Implementation of coarse-graining Autoencoders. Unlicensed single-paper
- KmdPlus (🥉4 · ⭐ 8 · 💀) - This module contains a class for treating kernel mean descriptor (KMD), and a function for generating descriptors with.. MIT
- Descriptor Embedding and Clustering for Atomisitic-environment Framework (DECAF) ( ⭐ 2) - Provides a workflow to obtain clustering of local environments in dataset of structures. Unlicensed


Visualization

Projects that focus on visualization (viz.) for atomistic ML.

Crystal Toolkit (🥇26 · ⭐ 200) - Crystal Toolkit is a framework for building web apps for materials science and is currently powering the new Materials.. MIT - [GitHub](https://github.com/materialsproject/crystaltoolkit) (👨‍💻 38 · 🔀 66 · 📦 43 · 📋 150 - 45% open · ⏱️ 09.04.2026):
git clone https://github.com/materialsproject/crystaltoolkit
- [PyPi](https://pypi.org/project/crystal-toolkit) (📥 10K / month · 📦 12 · ⏱️ 05.03.2026):
pip install crystal-toolkit
pymatviz (🥈23 · ⭐ 310) - A toolkit for visualizations in materials informatics. MIT general-tool probabilistic - [GitHub](https://github.com/janosh/pymatviz) (👨‍💻 14 · 🔀 38 · 📥 3.3K · 📦 32 · 📋 65 - 1% open · ⏱️ 09.04.2026):
git clone https://github.com/janosh/pymatviz
- [PyPi](https://pypi.org/project/pymatviz) (📥 9.3K / month · 📦 10 · ⏱️ 04.03.2026):
pip install pymatviz
Chemiscope (🥈23 · ⭐ 170) - An interactive structure/property explorer for materials and molecules. BSD-3 JavaScript - [GitHub](https://github.com/lab-cosmo/chemiscope) (👨‍💻 27 · 🔀 43 · 📥 650 · 📦 6 · 📋 170 - 8% open · ⏱️ 30.03.2026):
git clone https://github.com/lab-cosmo/chemiscope
- [npm](https://www.npmjs.com/package/chemiscope) (📥 150 / month · 📦 3 · ⏱️ 15.03.2023):
npm install chemiscope
Elementari (🥉21 · ⭐ 320) - Interactive browser visualizations for materials science: crystal structures/molecules, trajectories, convex hulls,.. MIT JavaScript - [GitHub](https://github.com/janosh/matterviz) (👨‍💻 6 · 🔀 30 · 📥 4.2K · 📦 5 · 📋 43 - 4% open · ⏱️ 04.04.2026):
git clone https://github.com/janosh/matterviz
- [npm](https://www.npmjs.com/package/elementari) (📦 2 · ⏱️ 19.06.2025):
npm install elementari
ZnDraw (🥉21 · ⭐ 49) - A powerful tool for visualizing, modifying, and analysing atomistic systems. EPL-2.0 MD generative JavaScript - [GitHub](https://github.com/zincware/ZnDraw) (👨‍💻 16 · 🔀 5 · 📦 16 · 📋 380 - 20% open · ⏱️ 09.04.2026):
git clone https://github.com/zincware/ZnDraw
- [PyPi](https://pypi.org/project/zndraw) (📥 2.5K / month · 📦 5 · ⏱️ 01.04.2026):
pip install zndraw
Atomvision (🥉11 · ⭐ 34 · 💤) - Deep learning framework for atomistic image data. Custom computer-vision experimental-data rep-learn - [GitHub](https://github.com/usnistgov/atomvision) (👨‍💻 4 · 🔀 17 · 📦 4 · 📋 9 - 55% open · ⏱️ 25.08.2025):
git clone https://github.com/usnistgov/atomvision
- [PyPi](https://pypi.org/project/atomvision) (📥 110 / month · ⏱️ 08.05.2023):
pip install atomvision


Wavefunction methods (ML-WFT)

Projects and models that focus on quantities of wavefunction theory methods, such as Monte Carlo techniques like deep learning variational Monte Carlo (DL-VMC), quantum chemistry methods, etc.

DeepQMC (🥇17 · ⭐ 410) - Deep learning quantum Monte Carlo for electrons in real space. MIT - [GitHub](https://github.com/deepqmc/deepqmc) (👨‍💻 14 · 🔀 62 · 📦 3 · 📋 64 - 6% open · ⏱️ 08.04.2026):
git clone https://github.com/deepqmc/deepqmc
- [PyPi](https://pypi.org/project/deepqmc) (📥 110 / month · ⏱️ 24.09.2024):
pip install deepqmc
FermiNet (🥈15 · ⭐ 820) - An implementation of the Fermionic Neural Network for ab-initio electronic structure calculations. Apache-2 transformer - [GitHub](https://github.com/google-deepmind/ferminet) (👨‍💻 24 · 🔀 160 · 📋 71 - 4% open · ⏱️ 11.03.2026):
git clone https://github.com/google-deepmind/ferminet
DeepErwin (🥈8 · ⭐ 67 · 💤) - DeepErwin is a python 3.8+ package that implements and optimizes JAX 2.x wave function models for numerical solutions.. Custom - [GitHub](https://github.com/mdsunivie/deeperwin) (👨‍💻 9 · 🔀 8 · 📥 18 · 📦 2 · ⏱️ 18.04.2025):
git clone https://github.com/mdsunivie/deeperwin
- [PyPi](https://pypi.org/project/deeperwin) (📥 26 / month · ⏱️ 14.12.2021):
pip install deeperwin
JaQMC (🥉7 · ⭐ 94) - JAX accelerated Quantum Monte Carlo. Apache-2 - [GitHub](https://github.com/bytedance/jaqmc) (👨‍💻 5 · 🔀 11 · ⏱️ 08.04.2026):
git clone https://github.com/bytedance/jaqmc
Hidden projects (3):
- ACEpsi.jl (🥉7 · ⭐ 3 · 💀) - ACE wave function parameterizations. MIT rep-eng Julia
- LapNet (🥉5 · ⭐ 73 · 💀) - Efficient and Accurate Neural-Network Ansatz for Quantum Monte Carlo. Apache-2
- SchNOrb (🥉5 · ⭐ 69 · 💀) - Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. MIT


Others


Contribution

Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:

  • Open an issue by selecting one of the provided categories from the issue page and filling in the requested information.
  • Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.

If you would like to contribute to or share suggestions regarding the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.

For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.

License

CC0

Credit by: @github.com/JuDFTteam/best-of-atomistic-machine-learning

Awesome Python

An opinionated list of Python frameworks, libraries, tools, and resources.

Sponsors

The #10 most-starred repo on GitHub. Put your product in front of Python developers. Become a sponsor.

Categories

AI & ML

Web Development

HTTP & Scraping

Database & Storage

Data & Science

Developer Tools

DevOps

CLI & GUI

Text & Documents

Media

Python Language

Python Toolchain

Security

Miscellaneous


AI & ML

AI and Agents

Libraries for building AI applications, LLM integrations, and autonomous agents.

  • Agent Skills
      • django-ai-plugins - Django backend agent skills for Django, DRF, Celery, and Django-specific code review.
      • sentry-skills - Python-focused engineering skills for code review, debugging, and backend workflows.
      • trailofbits-skills - Python-friendly security skills for auditing, testing, and safer backend development.
  • Orchestration
      • autogen - A programming framework for building agentic AI applications.
      • crewai - A framework for orchestrating role-playing autonomous AI agents for collaborative task solving.
      • dspy - A framework for programming, not prompting, language models.
      • hermes-agent - An adaptive AI agent framework that grows with you.
      • langchain - Building applications with LLMs through composability.
      • pydantic-ai - A Python agent framework for building generative AI applications with structured schemas.
      • TradingAgents - A multi-agents LLM financial trading framework.
  • Data Layer
      • instructor - A library for extracting structured data from LLMs, powered by Pydantic.
      • llama-index - A data framework for your LLM application.
      • mem0 - An intelligent memory layer for AI agents enabling personalized interactions.
  • Pre-trained Models and Inference
      • diffusers - A library that provides pre-trained diffusion models for generating and editing images, audio, and video.
      • sglang - A high-performance serving framework for large language models and multimodal models.
      • transformers - A framework that lets you easily use pre-trained transformer models for NLP, vision, and audio tasks.
      • unsloth - A library for faster LLM fine-tuning and training with reduced memory usage.
      • vllm - A high-throughput and memory-efficient inference and serving engine for LLMs.

Deep Learning

Frameworks for Neural Networks and Deep Learning. Also see awesome-deep-learning.

  • jax - A library for high-performance numerical computing with automatic differentiation and JIT compilation.
  • keras - A high-level deep learning library with support for JAX, TensorFlow, and PyTorch backends.
  • pytorch-lightning - Deep learning framework to train, deploy, and ship AI products Lightning fast.
  • pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration.
  • stable-baselines3 - PyTorch implementations of Stable Baselines (deep) reinforcement learning algorithms.
  • tensorflow - The most popular Deep Learning framework created by Google.

Machine Learning

Libraries for Machine Learning. Also see awesome-machine-learning.

  • catboost - A fast, scalable, high performance gradient boosting on decision trees library.
  • feature_engine - sklearn compatible API with the widest toolset for feature engineering and selection.
  • h2o - Open Source Fast Scalable Machine Learning Platform.
  • lightgbm - A fast, distributed, high performance gradient boosting framework.
  • mindsdb - MindsDB is an open source AI layer for existing databases that allows you to effortlessly develop, train and deploy state-of-the-art machine learning models using standard queries.
  • pgmpy - A Python library for probabilistic graphical models and Bayesian networks.
  • scikit-learn - The most popular Python library for Machine Learning with extensive documentation and community support.
  • spark.ml - Apache Spark's scalable Machine Learning library for distributed computing.
  • TabGAN - Synthetic tabular data generation using GANs, Diffusion Models, and LLMs.
  • xgboost - A scalable, portable, and distributed gradient boosting library.

Natural Language Processing

Libraries for working with human languages.

  • General
      • gensim - Topic Modeling for Humans.
      • nltk - A leading platform for building Python programs to work with human language data.
      • spacy - A library for industrial-strength natural language processing in Python and Cython.
      • stanza - The Stanford NLP Group's official Python library, supporting 60+ languages.
  • Chinese
      • funnlp - A collection of tools and datasets for Chinese NLP.

Computer Vision

Libraries for Computer Vision.

Recommender Systems

Libraries for building recommender systems.

  • annoy - Approximate Nearest Neighbors in C++/Python optimized for memory usage.
  • implicit - A fast Python implementation of collaborative filtering for implicit datasets.
  • scikit-surprise - A scikit for building and analyzing recommender systems.

Web Development

Web Frameworks

Traditional full stack web frameworks. Also see Web APIs.

  • Synchronous
      • bottle - A fast and simple micro-framework distributed as a single file with no dependencies.
      • django - The most popular web framework in Python.
      • flask - A microframework for Python.
      • pyramid - A small, fast, down-to-earth, open source Python web framework.
      • fasthtml - The fastest way to create an HTML app.
      • masonite - The modern and developer centric Python web framework.
  • Asynchronous
      • litestar - Production-ready, capable and extensible ASGI Web framework.
      • microdot - The impossibly small web framework for Python and MicroPython.
      • reflex - A framework for building reactive, full-stack web applications entirely with Python.
      • robyn - A high-performance async Python web framework with a Rust runtime.
      • starlette - A lightweight ASGI framework and toolkit for building high-performance async services.
      • tornado - A web framework and asynchronous networking library.

Web APIs

Libraries for building RESTful and GraphQL APIs.

  • Django
      • django-ninja - Fast, Django REST framework based on type hints and Pydantic.
      • django-rest-framework - A powerful and flexible toolkit to build web APIs.
      • strawberry-django - Strawberry GraphQL integration with Django.
  • Flask
      • apiflask - A lightweight Python web API framework based on Flask and Marshmallow.
  • Framework Agnostic
      • connexion - A spec-first framework that automatically handles requests based on your OpenAPI specification.
      • falcon - A high-performance framework for building cloud APIs and web app backends.
      • fastapi - A modern, fast, web framework for building APIs with standard Python type hints.
      • sanic - A Python 3.6+ web server and web framework that's written to go fast.
      • strawberry - A GraphQL library that leverages Python type annotations for schema definition.
      • webargs - A friendly library for parsing HTTP request arguments with built-in support for popular web frameworks.

Web Servers

ASGI and WSGI compatible web servers.

  • ASGI
      • daphne - An HTTP, HTTP/2 and WebSocket protocol server for ASGI and ASGI-HTTP.
      • granian - A Rust HTTP server for Python applications built on top of Hyper and Tokio, supporting WSGI/ASGI/RSGI.
      • hypercorn - An ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.
      • uvicorn - A lightning-fast ASGI server implementation, using uvloop and httptools.
  • WSGI
      • gunicorn - Pre-forked, ported from Ruby's Unicorn project.
      • uwsgi - A project that aims to develop a full stack for building hosting services, written in C.
      • waitress - Multi-threaded, powers Pyramid.
  • RPC
      • grpcio - HTTP/2-based RPC framework with Python bindings, built by Google.
      • rpyc (Remote Python Call) - A transparent and symmetric RPC library for Python.

WebSocket

Libraries for working with WebSocket.

  • autobahn-python - WebSocket & WAMP for Python on Twisted and asyncio.
  • channels - Developer-friendly asynchrony for Django.
  • flask-socketio - Socket.IO integration for Flask applications.
  • picows - Fastest WebSocket clients and servers with a frame level interface for the most demanding use-cases.
  • websockets - A library for building WebSocket servers and clients with a focus on correctness and simplicity.

Template Engines

Libraries and tools for templating and lexing.

  • jinja - A modern and designer friendly templating language.
  • mako - Hyperfast and lightweight templating for the Python platform.

Web Asset Management

Tools for managing, compressing and minifying website assets.

  • django-compressor - Compresses linked and inline JavaScript or CSS into a single cached file.
  • django-storages - A collection of custom storage back ends for Django.

Authentication

Libraries for implementing authentication schemes.

  • OAuth
      • authlib - JavaScript Object Signing and Encryption draft implementation.
      • django-allauth - Authentication app for Django that "just works."
      • django-oauth-toolkit - OAuth 2 goodies for Django.
      • oauthlib - A generic and thorough implementation of the OAuth request-signing logic.
  • JWT
      • pyjwt - JSON Web Token implementation in Python.
  • Permissions
      • django-guardian - Implementation of per object permissions for Django 1.2+
      • django-rules - A tiny but powerful app providing object-level permissions to Django, without requiring a database.

Admin Panels

Libraries for administrative interfaces.

  • ajenti - The admin panel your servers deserve.
  • django-grappelli - A jazzy skin for the Django Admin-Interface.
  • django-unfold - Elevate your Django admin with a stunning modern interface, powerful features, and seamless user experience.
  • flask-admin - Simple and extensible administrative interface framework for Flask.
  • flower - Real-time monitor and web admin for Celery.
  • func-to-web - Instantly create web UIs from Python functions using type hints. Zero frontend code required.
  • jet-bridge - Admin panel framework for any application with nice UI (ex Jet Django).

CMS

Content Management Systems.

  • django-cms - The easy-to-use and developer-friendly enterprise CMS powered by Django.
  • indico - A feature-rich event management system, made @ CERN.
  • wagtail - A Django content management system.

Static Site Generators

A static site generator is software that takes text and templates as input and produces HTML files as output.

  • lektor - An easy to use static CMS and blog engine.
  • nikola - A static website and blog generator.
  • pelican - Static site generator that supports Markdown and reST syntax.
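
The text-plus-template-to-HTML idea behind the generators above can be sketched with the standard library alone. This is a minimal illustration, not how lektor, nikola, or pelican work internally; the names `PAGE_TEMPLATE`, `render_page`, and `build_site` are hypothetical, and the page content is placeholder text.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# A single page template; $title and $body are filled in per page
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n$body\n</body></html>\n"
)

def render_page(title, text):
    """Turn plain text (paragraphs separated by blank lines) into an HTML page."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return PAGE_TEMPLATE.substitute(title=html.escape(title), body=body)

def build_site(pages, out_dir):
    """Write one <slug>.html file per entry in pages and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, text) in pages.items():
        target = out_dir / f"{slug}.html"
        target.write_text(render_page(title, text), encoding="utf-8")
        written.append(target)
    return written

# Placeholder content, written to a temporary directory
pages = {"index": ("Hello", "First paragraph.\n\nSecond & last paragraph.")}
files = build_site(pages, tempfile.mkdtemp())
content = files[0].read_text(encoding="utf-8")
print(content)
```

Real generators add Markdown or reST parsing, theming, and incremental rebuilds on top of this basic loop.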

HTTP & Scraping

HTTP Clients

Libraries for working with HTTP.

  • aiohttp - Asynchronous HTTP client/server framework for asyncio and Python.
  • furl - A small Python library that makes parsing and manipulating URLs easy.
  • httptap - Dissects an HTTP request into DNS, TCP, TLS, wait, and transfer phases and renders the timings as a waterfall.
  • httpx - A next generation HTTP client for Python.
  • requests - HTTP Requests for Humans.
  • urllib3 - An HTTP library with thread-safe connection pooling, file post support, sanity friendly.

Web Scraping

Libraries to automate web scraping and extract web content.

  • Frameworks
      • browser-use - Make websites accessible for AI agents with easy browser automation.
      • crawl4ai - An open-source, LLM-friendly web crawler that provides lightning-fast, structured data extraction specifically designed for AI agents.
      • mechanicalsoup - A Python library for automating interaction with websites.
      • scrapy - A fast high-level screen scraping and web crawling framework.
  • Content Extraction
      • feedparser - Universal feed parser.
      • html2text - Convert HTML to Markdown-formatted text.
      • micawber - A small library for extracting rich content from URLs.
      • sumy - A module for automatic summarization of text documents and HTML pages.
      • trafilatura - A tool for gathering text and metadata from the web, with built-in content filtering.

Email

Libraries for sending and parsing email, and mail server management.

  • modoboa - A mail hosting and management platform including a modern Web UI.
  • yagmail - Yet another Gmail/SMTP client.

Database & Storage

ORM

Libraries that implement Object-Relational Mapping or data mapping techniques.

  • Relational Databases
      • django.db.models - The Django ORM.
      • sqlalchemy - The Python SQL Toolkit and Object Relational Mapper.
      • dataset - Store Python dicts in a database - works with SQLite, MySQL, and PostgreSQL.
      • peewee - A small, expressive ORM.
      • pony - ORM that provides a generator-oriented interface to SQL.
      • sqlmodel - SQLModel is based on Python type annotations, and powered by Pydantic and SQLAlchemy.
      • tortoise-orm - An easy-to-use asyncio ORM inspired by Django, with relations support.
  • NoSQL Databases
      • beanie - An asynchronous Python object-document mapper (ODM) for MongoDB.
      • mongoengine - A Python Object-Document-Mapper for working with MongoDB.
      • pynamodb - A Pythonic interface for Amazon DynamoDB.

Database Drivers

Libraries for connecting and operating databases.

Database

Databases implemented in Python.

  • chromadb - An open-source embedding database for building AI applications with embeddings and semantic search.
  • duckdb - An in-process SQL OLAP database management system; optimized for analytics and fast queries, similar to SQLite but for analytical workloads.
  • pickledb - A simple and lightweight key-value store for Python.
  • tinydb - A tiny, document-oriented database.
  • ZODB - A native object database for Python. A key-value and object graph database.

Caching

Libraries for caching data.

  • cachetools - Extensible memoizing collections and decorators.
  • django-cacheops - A slick ORM cache with automatic granular event-driven invalidation.
  • dogpile.cache - dogpile.cache is a next generation replacement for Beaker made by the same authors.
  • python-diskcache - SQLite and file backed cache backend with faster lookups than memcached and redis.

Search

Libraries and software for indexing and performing search queries on data.

Serialization

Libraries for serializing complex data types.

  • marshmallow - A lightweight library for converting complex objects to and from simple Python datatypes.
  • msgpack - MessagePack serializer implementation for Python.
  • orjson - Fast, correct JSON library.

Data & Science

Data Analysis

Libraries for data analysis.

  • General
  • aws-sdk-pandas - Pandas on AWS.
  • datasette - An open source multi-tool for exploring and publishing data.
  • desbordante - An open source data profiler for complex pattern discovery.
  • ibis - A portable Python dataframe library with a single API for 20+ backends.
  • modin - A drop-in pandas replacement that scales workflows by changing a single line of code.
  • pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
  • pathway - Real-time data processing framework for Python with reactive dataflows.
  • polars - A fast DataFrame library implemented in Rust with a Python API.
  • Financial Data
  • akshare - A financial data interface library, built for human beings!
  • edgartools - Library for downloading structured data from SEC EDGAR filings and XBRL financial statements.
  • lumibot - Algorithmic trading framework for backtesting and live deployment across stocks, options, crypto, futures, and forex.
  • openbb - A financial data platform for analysts, quants and AI agents.
  • yfinance - Easy Pythonic way to download market and financial data from Yahoo Finance.

Data Validation

Libraries for validating data. Used for forms in many cases.

  • cerberus - A lightweight and extensible data validation library.
  • jsonschema - An implementation of JSON Schema for Python.
  • pandera - A data validation library for dataframes, with support for pandas, polars, and Spark.
  • pydantic - Data validation using Python type hints.
  • voluptuous - A Python data validation library primarily intended for validating data from untrusted sources.

Data Visualization

Libraries for visualizing data. Also see awesome-javascript.

  • Plotting
  • altair - Declarative statistical visualization library for Python.
  • bokeh - Interactive Web Plotting for Python.
  • bqplot - Interactive Plotting Library for the Jupyter Notebook.
  • matplotlib - A Python 2D plotting library.
  • plotly - Interactive graphing library for Python.
  • plotnine - A grammar of graphics for Python based on ggplot2.
  • pygal - A Python SVG Charts Creator.
  • pyqtgraph - Interactive and realtime 2D/3D/Image plotting and science/engineering widgets.
  • seaborn - Statistical data visualization using Matplotlib.
  • ultraplot - Matplotlib wrapper for publication-ready scientific figures with minimal code. Includes advanced subplot management, panel layouts, and batteries-included geoscience plotting.
  • vispy - High-performance scientific visualization based on OpenGL.
  • Specialized
  • cartopy - A cartographic python library with matplotlib support.
  • pygraphviz - Python interface to Graphviz.
  • Dashboards and Apps
  • gradio - Build and share machine learning apps, all in Python.
  • streamlit - A framework which lets you build dashboards, generate reports, or create chat apps in minutes.

Geolocation

Libraries for geocoding addresses and working with latitudes and longitudes.

  • django-countries - A Django app that provides a country field for models and forms.
  • geodjango - A world-class geographic web framework that is part of Django.
  • geojson - Python bindings and utilities for GeoJSON.
  • geopandas - Python tools for geographic data (GeoSeries/GeoDataFrame) built on pandas.
  • geopy - Python Geocoding Toolbox.

Science

Libraries for scientific computing. Also see Python-for-Scientists.

  • Core
  • numba - Python JIT compiler to LLVM aimed at scientific Python.
  • numpy - A fundamental package for scientific computing with Python.
  • scipy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
  • statsmodels - Statistical modeling and econometrics in Python.
  • sympy - A Python library for symbolic mathematics.
  • Biology and Chemistry
  • biopython - Biopython is a set of freely available tools for biological computation.
  • cclib - A library for parsing and interpreting the results of computational chemistry packages.
  • openbabel - A chemical toolbox designed to speak the many languages of chemical data.
  • rdkit - Cheminformatics and Machine Learning Software.
  • Physics and Engineering
  • astropy - A community Python library for Astronomy.
  • obspy - A Python toolbox for seismology.
  • pydy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion.
  • PythonRobotics - This is a compilation of various robotics algorithms with visualizations.
  • Simulation and Modeling
  • pathsim - A block-based system modeling and simulation framework with a browser-based visual editor.
  • pymc - Probabilistic programming and Bayesian modeling in Python.
  • simpy - A process-based discrete-event simulation framework.
  • Other
  • colour - Implementing a comprehensive number of colour theory transformations and algorithms.
  • manim - An animation engine for explanatory math videos.
  • networkx - A high-productivity software for complex networks.
  • shapely - Manipulation and analysis of geometric objects in the Cartesian plane.

Quantum Computing

Libraries for quantum computing.

  • Cirq - A Google-developed framework focused on hardware-aware quantum circuit design for NISQ devices.
  • pennylane - A hybrid quantum-classical machine learning library with automatic differentiation support.
  • qiskit - An IBM-backed quantum SDK for building, simulating, and running circuits on real quantum hardware.
  • qutip - Quantum Toolbox in Python.

Developer Tools

Algorithms and Design Patterns

Python implementation of data structures, algorithms and design patterns. Also see awesome-algorithms.

  • Algorithms
  • algorithms - Minimal examples of data structures and algorithms.
  • sortedcontainers - Fast and pure-Python implementation of sorted collections.
  • thealgorithms - All Algorithms implemented in Python.
  • Design Patterns
  • python-patterns - A collection of design patterns in Python.
  • transitions - A lightweight, object-oriented finite state machine implementation.

Interactive Interpreter

Interactive Python interpreters (REPL).

Code Analysis

Tools of static analysis, linters and code quality checkers. Also see awesome-static-analysis.

  • Code Analysis
  • code2flow - Turn your Python and JavaScript code into DOT flowcharts.
  • prospector - A tool to analyze Python code.
  • repowise - Codebase intelligence that indexes repos into dependency graphs, git history, and auto-generated docs with dead code detection.
  • vulture - A tool for finding and analyzing dead Python code.
  • Code Linters
  • bandit - A tool designed to find common security issues in Python code.
  • flake8 - A wrapper around pycodestyle, pyflakes and McCabe.
  • pylint - A fully customizable source code analyzer.
  • ruff - An extremely fast Python linter and code formatter.
  • Code Formatters
  • black - The uncompromising Python code formatter.
  • isort - A Python utility / library to sort imports.
  • ruff - An extremely fast Python linter and code formatter.
  • Refactoring
  • rope - Rope is a python refactoring library.
  • Type Checkers - awesome-python-typing
  • mypy - Check variable types during compile time.
  • pyre-check - Performant type checking.
  • ty - An extremely fast Python type checker and language server.
  • typeshed - Collection of library stubs for Python, with static types.
  • Type Annotations Generators
  • monkeytype - A system for Python that generates static type annotations by collecting runtime types.
  • pytype - Pytype checks and infers types for Python code - without requiring type annotations.

Testing

Libraries for testing codebases and generating test data.

  • Frameworks
  • hypothesis - Hypothesis is an advanced Quickcheck style property based testing library.
  • pytest - A mature full-featured Python testing tool.
  • robotframework - A generic test automation framework.
  • scanapi - Automated Testing and Documentation for your REST API.
  • unittest - (Python standard library) Unit testing framework.
  • Test Runners
  • nox - Flexible test automation for Python.
  • tox - Automatically builds and tests distributions in multiple Python versions.
  • GUI / Web Testing
  • locust - Scalable user load testing tool written in Python.
  • playwright-python - Python version of the Playwright testing and automation library.
  • pyautogui - PyAutoGUI is a cross-platform GUI automation Python module for human beings.
  • schemathesis - A tool for automatic property-based testing of web applications built with Open API / Swagger specifications.
  • selenium - Python bindings for Selenium WebDriver.
  • Mock
  • freezegun - Travel through time by mocking the datetime module.
  • mock - (Python standard library) A mocking and patching library.
  • mocket - A socket mock framework with gevent/asyncio/SSL support.
  • responses - A utility library for mocking out the requests Python library.
  • respx - Mock HTTPX with awesome request patterns and response side effects.
  • vcrpy - Record and replay HTTP interactions on your tests.
  • Object Factories
  • factory_boy - A test fixtures replacement for Python.
  • polyfactory - Mock data generation library with support for classes (continuation of pydantic-factories).
  • Code Coverage
  • coverage - Code coverage measurement.
  • Fake Data
  • faker - A Python package that generates fake data.
  • mimesis - A Python library that helps you generate fake data.

Debugging Tools

Libraries for debugging code.

  • pdb-like Debugger
  • ipdb - IPython-enabled pdb.
  • pudb - A full-screen, console-based Python debugger.
  • Tracing
  • manhole - A debugging service that accepts UNIX socket connections and presents the stack traces for all threads along with an interactive prompt.
  • python-hunter - A flexible code tracing toolkit.
  • Profiler
  • py-spy - A sampling profiler for Python programs. Written in Rust.
  • scalene - A high-performance, high-precision CPU, GPU, and memory profiler for Python.
  • Others
  • django-debug-toolbar - Display various debug information for Django.
  • flask-debugtoolbar - A port of the django-debug-toolbar to flask.
  • icecream - Inspect variables, expressions, and program execution with a single, simple function call.
  • memory_graph - Visualize Python data at runtime to debug references, mutability, and aliasing.

Build Tools

Compile software from source code.

  • bitbake - A make-like build tool for embedded Linux.
  • invoke - A tool for managing shell-oriented subprocesses and organizing executable Python code into CLI-invokable tasks.
  • platformio - A console tool to build code with different development platforms.
  • pybuilder - A continuous build tool written in pure Python.
  • doit - A task runner and build tool.
  • scons - A software construction tool.

Documentation

Libraries for generating project documentation.

  • sphinx - Python Documentation generator.
  • awesome-sphinxdoc
  • diagrams - Diagram as Code.
  • mkdocs - Markdown friendly documentation generator.
  • pdoc - Epydoc replacement to auto generate API documentation for Python libraries.

DevOps

DevOps Tools

Software and libraries for DevOps.

  • Cloud Providers
  • awscli - Universal Command Line Interface for Amazon Web Services.
  • boto3 - Python interface to Amazon Web Services.
  • Configuration Management
  • ansible - A radically simple IT automation platform.
  • cloudinit - A multi-distribution package that handles early initialization of a cloud instance.
  • openstack - Open source software for building private and public clouds.
  • pyinfra - A versatile CLI tool and Python library to automate infrastructure.
  • saltstack - Infrastructure automation and management system.
  • Deployment
  • chalice - A Python serverless microframework for AWS.
  • fabric - A simple, Pythonic tool for remote execution and deployment.
  • Monitoring and Processes
  • psutil - A cross-platform process and system utilities module.
  • sentry-python - Sentry SDK for Python.
  • sh - A full-fledged subprocess replacement for Python.
  • supervisor - Supervisor process control system for UNIX.
  • Other
  • borg - A deduplicating archiver with compression and encryption.
  • chaostoolkit - A Chaos Engineering toolkit & Orchestration for Developers.
  • pre-commit - A framework for managing and maintaining multi-language pre-commit hooks.

Distributed Computing

Frameworks and libraries for Distributed Computing.

  • Batch Processing
  • dask - A flexible parallel computing library for analytic computing.
  • luigi - A module that helps you build complex pipelines of batch jobs.
  • mpi4py - Python bindings for MPI.
  • pyspark - Apache Spark Python API.
  • joblib - A set of tools to provide lightweight pipelining in Python.
  • ray - A system for parallel and distributed Python that unifies the machine learning ecosystem.

Task Queues

Libraries for working with task queues.

  • celery - An asynchronous task queue/job queue based on distributed message passing.
  • dramatiq - A fast and reliable background task processing library for Python 3.
  • huey - Little multi-threaded task queue.
  • rq - Simple job queues for Python.

Job Schedulers

Libraries for scheduling jobs.

  • airflow - Airflow is a platform to programmatically author, schedule and monitor workflows.
  • apscheduler - A light but powerful in-process task scheduler that lets you schedule functions.
  • dagster - An orchestration platform for the development, production, and observation of data assets.
  • prefect - A modern workflow orchestration framework that makes it easy to build, schedule and monitor robust data pipelines.
  • schedule - Python job scheduling for humans.
  • SpiffWorkflow - A powerful workflow engine implemented in pure Python.

Logging

Libraries for generating and working with logs.

  • logfmter - A standard library compatible logfmt formatter.
  • logging - (Python standard library) Logging facility for Python.
  • loguru - Library which aims to bring enjoyable logging in Python.
  • structlog - Structured logging made easy.

Network Virtualization

Tools and libraries for Virtual Networking and SDN (Software Defined Networking).

  • mininet - A popular network emulator and API written in Python.
  • napalm - Cross-vendor API to manipulate network devices.
  • scapy - A brilliant packet manipulation library.

CLI & GUI

CLI Development

Libraries for building command-line applications.

  • CLI Development
  • argparse - (Python standard library) Command-line option and argument parsing.
  • cement - CLI Application Framework for Python.
  • click - A package for creating beautiful command line interfaces in a composable way.
  • python-fire - A library for creating command line interfaces from absolutely any Python object.
  • python-prompt-toolkit - A library for building powerful interactive command lines.
  • typer - Modern CLI framework that uses Python type hints. Built on Click and Pydantic.
  • Terminal Rendering
  • alive-progress - A new kind of Progress Bar, with real-time throughput, eta and very cool animations.
  • asciimatics - A package to create full-screen text UIs (from interactive forms to ASCII animations).
  • colorama - Cross-platform colored terminal text.
  • rich - Python library for rich text and beautiful formatting in the terminal. Also provides a great RichHandler log handler.
  • textual - A framework for building interactive user interfaces that run in the terminal and the browser.
  • tqdm - Fast, extensible progress bar for loops and CLI.

CLI Tools

Useful CLI-based tools for productivity.

  • Productivity Tools
  • cookiecutter - A command-line utility that creates projects from cookiecutters (project templates).
  • copier - A library and command-line utility for rendering projects templates.
  • doitlive - A tool for live presentations in the terminal.
  • thefuck - Correcting your previous console command.
  • tmuxp - A tmux session manager.
  • xonsh - A Python-powered shell. Full-featured and cross-platform.
  • yt-dlp - A command-line program to download videos from YouTube and other video sites, a fork of youtube-dl.
  • CLI Enhancements
  • httpie - A command line HTTP client, a user-friendly cURL replacement.
  • iredis - Redis CLI with autocompletion and syntax highlighting.
  • litecli - SQLite CLI with autocompletion and syntax highlighting.
  • mycli - MySQL CLI with autocompletion and syntax highlighting.
  • pgcli - PostgreSQL CLI with autocompletion and syntax highlighting.

GUI Development

Libraries for working with graphical user interface applications.

  • Desktop
  • customtkinter - A modern and customizable python UI-library based on Tkinter.
  • dearpygui - A simple GPU-accelerated Python GUI framework.
  • enaml - Creating beautiful user-interfaces with Declarative Syntax like QML.
  • kivy - A library for creating NUI applications, running on Windows, Linux, Mac OS X, Android and iOS.
  • pyglet - A cross-platform windowing and multimedia library for Python.
  • pygobject - Python Bindings for GLib/GObject/GIO/GTK+ (GTK+3).
  • PyQt - Python bindings for the Qt cross-platform application and UI framework.
  • pyside - Qt for Python, the official Python bindings for Qt. Similar to PyQt, but with different licensing.
  • tkinter - (Python standard library) The standard Python interface to the Tcl/Tk GUI toolkit.
  • toga - A Python native, OS native GUI toolkit.
  • wxPython - A blending of the wxWidgets C++ class library with Python.
  • Web-based
  • flet - Cross-platform GUI framework for building modern apps in pure Python.
  • nicegui - An easy-to-use, Python-based UI framework, which shows up in your web browser.
  • pywebview - A lightweight cross-platform native wrapper around a webview component.
  • Terminal
  • curses - Built-in wrapper for ncurses used to create terminal GUI applications.
  • urwid - A library for creating terminal GUI applications with strong support for widgets, events, rich colors, etc.
  • Wrappers
  • gooey - Turn command line programs into a full GUI application with one line.

Text & Documents

Text Processing

Libraries for parsing and manipulating plain texts.

  • General
  • babel - An internationalization library for Python.
  • chardet - Python 2/3 compatible character encoding detector.
  • difflib - (Python standard library) Helpers for computing deltas.
  • ftfy - Makes Unicode text less broken and more consistent automagically.
  • pangu.py - Paranoid text spacing.
  • pyfiglet - An implementation of figlet written in Python.
  • pypinyin - Convert Chinese hanzi (漢字) to pinyin (拼音).
  • python-slugify - A Python slugify library that translates unicode to ASCII.
  • textdistance - Compute distance between sequences with 30+ algorithms.
  • unidecode - ASCII transliterations of Unicode text.
  • Unique identifiers
  • sqids - A library for generating short unique IDs from numbers.
  • shortuuid - A generator library for concise, unambiguous and URL-safe UUIDs.
  • Parser
  • pygments - A generic syntax highlighter.
  • pyparsing - A general purpose framework for generating parsers.
  • python-nameparser - Parsing human names into their individual components.
  • python-phonenumbers - Parsing, formatting, storing and validating international phone numbers.
  • python-user-agents - Browser user agent parser.
  • sqlparse - A non-validating SQL parser.

HTML Manipulation

Libraries for working with HTML and XML.

  • beautifulsoup - Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.
  • justhtml - A pure Python HTML5 parser that just works.
  • lxml - A very fast, easy-to-use and versatile library for handling HTML and XML.
  • markupsafe - Implements a XML/HTML/XHTML Markup safe string for Python.
  • pyquery - A jQuery-like library for parsing HTML.
  • tinycss2 - A low-level CSS parser and generator written in Python.
  • xmltodict - Makes working with XML feel like you are working with JSON.

File Format Processing

Libraries for parsing and manipulating specific text formats.

  • General
  • docling - Library for converting documents into structured data.
  • kreuzberg - High-performance document extraction library with a Rust core, supporting 62+ formats including PDF, Office, images with OCR, HTML, email, and archives.
  • pyelftools - Parsing and analyzing ELF files and DWARF debugging information.
  • tablib - A module for Tabular Datasets in XLS, CSV, JSON, YAML.
  • MS Office
  • docxtpl - Editing a docx document with a jinja2 template.
  • openpyxl - A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
  • pyexcel - Providing one API for reading, manipulating and writing csv, ods, xls, xlsx and xlsm files.
  • python-docx - Reads, queries and modifies Microsoft Word 2007/2008 docx files.
  • python-pptx - Python library for creating and updating PowerPoint (.pptx) files.
  • xlsxwriter - A Python module for creating Excel .xlsx files.
  • xlwings - A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
  • PDF
  • pdf_oxide - A fast PDF library for text extraction, image extraction, and markdown conversion, powered by Rust.
  • pdfminer.six - Pdfminer.six is a community maintained fork of the original PDFMiner.
  • pikepdf - A powerful library for reading and editing PDF files, based on qpdf.
  • pypdf - A library capable of splitting, merging, cropping, and transforming PDF pages.
  • reportlab - Allowing Rapid creation of rich PDF documents.
  • weasyprint - A visual rendering engine for HTML and CSS that can export to PDF.
  • Markdown
  • markdown-it-py - Markdown parser with 100% CommonMark support, extensions, and syntax plugins.
  • markdown - A Python implementation of John Gruber’s Markdown.
  • markitdown - Python tool for converting files and office documents to Markdown.
  • mistune - A fast and full-featured pure Python Markdown parser.
  • Data Formats
  • csvkit - Utilities for converting to and working with CSV.
  • pyyaml - YAML implementations for Python.
  • tomllib - (Python standard library) Parse TOML files.

File Manipulation

Libraries for file manipulation.

  • mimetypes - (Python standard library) Map filenames to MIME types.
  • pathlib - (Python standard library) A cross-platform, object-oriented path library.
  • python-magic - A Python interface to the libmagic file type identification library.
  • watchdog - API and shell utilities to monitor file system events.
  • watchfiles - Simple, modern and fast file watching and code reload in python.

Media

Image Processing

Libraries for manipulating images.

  • pillow - Pillow is the friendly PIL fork.
  • pymatting - A library for alpha matting.
  • python-barcode - Create barcodes in Python with no extra dependencies.
  • python-qrcode - A pure Python QR Code generator.
  • pyvips - A fast image processing library with low memory needs.
  • scikit-image - A Python library for (scientific) image processing.
  • thumbor - A smart imaging service. It enables on-demand crop, re-sizing and flipping of images.
  • wand - Python bindings for MagickWand, C API for ImageMagick.

Audio & Video Processing

Libraries for manipulating audio, video, and their metadata.

  • Audio
  • gtts - Python library and CLI tool for converting text to speech using Google Translate TTS.
  • librosa - Python library for audio and music analysis.
  • matchering - A library for automated reference audio mastering.
  • pydub - Manipulate audio with a simple and easy high level interface.
  • Video
  • moviepy - A module for script-based movie editing with many formats, including animated GIFs.
  • vidgear - Most Powerful multi-threaded Video Processing framework.
  • Metadata
  • beets - A music library manager and MusicBrainz tagger.
  • mutagen - A Python module to handle audio metadata.
  • tinytag - A library for reading music meta data of MP3, OGG, FLAC and Wave files.

Game Development

Awesome game development libraries.

  • arcade - Arcade is a modern Python framework for crafting games with compelling graphics and sound.
  • panda3d - 3D game engine developed by Disney.
  • py-sdl2 - A ctypes based wrapper for the SDL2 library.
  • pygame - Pygame is a set of Python modules designed for writing games.
  • pyopengl - Python ctypes bindings for OpenGL and its related APIs.
  • renpy - A Visual Novel engine.

Python Language

Implementations

Implementations of Python.

  • cpython - Default, most widely used implementation of the Python programming language written in C.
  • cython - Optimizing Static Compiler for Python.
  • ironpython - Implementation of the Python programming language written in C#.
  • micropython - A lean and efficient Python programming language implementation.
  • pyodide - Python distribution for the browser and Node.js based on WebAssembly.
  • pypy - A very fast and compliant implementation of the Python language.

Built-in Classes Enhancement

Libraries for enhancing Python built-in classes.

  • attrs - Replacement for __init__, __eq__, __repr__, etc. boilerplate in class definitions.
  • bidict - Efficient, Pythonic bidirectional map data structures and related functionality.
  • box - Python dictionaries with advanced dot notation access.

Functional Programming

Functional Programming with Python.

  • coconut - A variant of Python built for simple, elegant, Pythonic functional programming.
  • functools - (Python standard library) Higher-order functions and operations on callable objects.
  • funcy - Fancy and practical functional tools.
  • more-itertools - More routines for operating on iterables, beyond itertools.
  • returns - A set of type-safe monads, transformers, and composition utilities.
  • toolz - A collection of functional utilities for iterators, functions, and dictionaries. Also available as cytoolz for Cython-accelerated performance.

Asynchronous Programming

Libraries for asynchronous, concurrent and parallel execution. Also see awesome-asyncio.

  • anyio - A high-level async concurrency and networking framework that works on top of asyncio or trio.
  • asyncio - (Python standard library) Asynchronous I/O, event loop, coroutines and tasks.
  • awesome-asyncio
  • concurrent.futures - (Python standard library) A high-level interface for asynchronously executing callables.
  • gevent - A coroutine-based Python networking library that uses greenlet.
  • multiprocessing - (Python standard library) Process-based parallelism.
  • trio - A friendly library for async concurrency and I/O.
  • twisted - An event-driven networking engine.
  • uvloop - Ultra fast asyncio event loop.

Date and Time

Libraries for working with dates and times.

  • dateparser - A Python parser for human-readable dates in dozens of languages.
  • dateutil - Extensions to the standard Python datetime module.
  • pendulum - Python datetimes made easy.
  • zoneinfo - (Python standard library) IANA time zone support. Brings the tz database into Python.

Python Toolchain

Environment Management

Libraries for Python version and virtual environment management.

  • pyenv - Simple Python version management.
  • pyenv-win - Pyenv for Windows.
  • uv - An extremely fast Python version, package and project manager, written in Rust.
  • virtualenv - A tool to create isolated Python environments.

Package Management

Libraries for package and dependency management.

  • conda - Cross-platform, Python-agnostic binary package manager.
  • pip - The package installer for Python.
  • pipx - Install and Run Python Applications in Isolated Environments. Like npx in Node.js.
  • poetry - Python dependency management and packaging made easy.
  • uv - An extremely fast Python version, package and project manager, written in Rust.

Package Repositories

Local PyPI repository server and proxies.

  • bandersnatch - PyPI mirroring tool provided by Python Packaging Authority (PyPA).
  • devpi - PyPI server and packaging/testing/release tool.
  • warehouse - Next generation Python Package Repository (PyPI).

Distribution

Libraries to create packaged executables for release distribution.

  • cx-Freeze - It is a Python tool that converts Python scripts into standalone executables and installers for Windows, macOS, and Linux.
  • Nuitka - Compiles Python programs into high-performance standalone executables (cross-platform, supports all Python versions).
  • pyarmor - A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or set them to expire.
  • pyinstaller - Converts Python programs into stand-alone executables (cross-platform).
  • shiv - A command line utility for building fully self-contained zipapps (PEP 441), but with all their dependencies included.

Configuration Files

Libraries for storing and parsing configuration options.

  • configparser - (Python standard library) INI file parser.
  • dynaconf - Dynaconf is a configuration manager with plugins for Django, Flask and FastAPI.
  • hydra - Hydra is a framework for elegantly configuring complex applications.
  • python-decouple - Strict separation of settings from code.
  • python-dotenv - Reads key-value pairs from a .env file and sets them as environment variables.

Security

Cryptography

  • cryptography - A package designed to expose cryptographic primitives and recipes to Python developers.
  • paramiko - The leading native Python SSHv2 protocol library.
  • pynacl - Python binding to the Networking and Cryptography (NaCl) library.

Penetration Testing

Frameworks and tools for penetration testing.

  • mitmproxy - An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
  • setoolkit - A toolkit for social engineering.
  • sherlock - Hunt down social media accounts by username across social networks.
  • sqlmap - Automatic SQL injection and database takeover tool.

Miscellaneous

Hardware

Libraries for programming with hardware.

  • bleak - A cross platform Bluetooth Low Energy Client for Python using asyncio.
  • pynput - A library to control and monitor input devices.

Microsoft Windows

Python programming on Microsoft Windows.

  • pythonnet - Python Integration with the .NET Common Language Runtime (CLR).
  • pywin32 - Python Extensions for Windows.
  • winpython - Portable development environment for Windows 10/11.

Miscellaneous

Useful libraries or tools that don't fit in the categories above.

  • blinker - A fast Python in-process signal/event dispatching system.
  • boltons - A set of pure-Python utilities.
  • itsdangerous - Various helpers to pass trusted data to untrusted environments.
  • tryton - A general-purpose business framework.

Resources

Where to discover learning resources or new Python libraries.

Newsletters

Podcasts

Contributing

Your contributions are always welcome! Please take a look at the contribution guidelines first.


If you have any questions about this opinionated list, do not hesitate to contact @vinta on X (Twitter).

Credit by: @github.com/vinta/awesome-python

BibTeX Generator

Have you ever found yourself weary and uninspired from the tedious task of manually creating BibTeX entries for your paper?

There are, indeed, support tools and plugins that are bundled with reference managers such as Zotero, Mendeley, etc. These tools can automate the generation of a .bib file. To use them, you need to install a reference manager, its associated plugins, and a library of papers on your computer. However, these tools are not flawless. The BibTeX entries they generate often contain incomplete information, are poorly formatted, and include numerous unnecessary fields. You then still need to manually check and correct the entries.

There are times when you just need to cite a paper or two and don't want to go through the hassle of the process described above. In such situations, a simple tool that lets you quickly copy and paste a BibTeX entry into your .bib file would be ideal. With such a tool in mind, I searched the Chrome extension store for one that can pick up the BibTeX entry while you are browsing a paper. I found some, but they do not really work.

Therefore, I decided to create my own tool. I developed a Chrome extension that generates the BibTeX entry for the URL you are browsing with just one click, and named it 1click BibTeX. It does exactly what it is expected to do and has proven to be quite helpful. This extension, along with LaTeX tools, helps ensure that the manuscript's citations are properly formatted before they are delivered to the journal.

Usage

Install the 1click BibTeX extension on your Chrome browser. Then, whenever you're browsing a paper or any URL, just click on the extension icon, and the BibTeX entry will be instantly generated and copied to your clipboard. All that remains is to paste it into your .bib file.

BibTeX generator

I've tested the extension on numerous publishers and websites with varying structures, and it works consistently as designed. The tested publishers include Elsevier, Wiley, ACS, IOP, AIP, APS, and arXiv, among others.

Below are some examples of BibTeX entries generated by the extension 1click BibTeX:

@article{nguyen2019pattern,
    title = {Pattern transformation induced by elastic instability of metallic porous structures},
    author = {Cao Thang Nguyen and Duc Tam Ho and Seung Tae Choi and Doo-Man Chun and Sung Youb Kim },
    year = {2019},
    month = {2},
    journal = {Computational Materials Science},
    publisher = {Elsevier},
    volume = {157},
    pages = {17-24},
    doi = {10.1016/j.commatsci.2018.10.023},
    url = {https://www.sciencedirect.com/science/article/abs/pii/S0927025618306955?via%3Dihub},
    accessDate = {Jan 25, 2024}
}
@article{nguyen2024an,
    title = {An Enhanced Sampling Approach for Computing the Free Energy of Solid Surface and Solid–Liquid Interface},
    author = {Cao Thang Nguyen and Duc Tam Ho and Sung Youb Kim},
    year = {2024},
    month = {1},
    journal = {Advanced Theory and Simulations},
    publisher = {John Wiley & Sons, Ltd},
    volume = {7},
    number = {1},
    pages = {2300538},
    doi = {10.1002/adts.202300538},
    url = {https://onlinelibrary.wiley.com/doi/10.1002/adts.202300538},
    accessDate = {Jan 25, 2024}
}
@book{daum2003america,
    title = {America, the Vietnam War, and the World},
    author = {Andreas W. Daum and Lloyd C. Gardner and Wilfried Mausbach},
    year = {2003},
    month = {7},
    publisher = {Cambridge University Press},
    isbn = {052100876X},
    url = {https://www.google.co.kr/books/edition/America_the_Vietnam_War_and_the_World/9kn6qYwsGs4C?hl=en&gbpv=0},
    accessDate = {Jan 25, 2024}
}
@book{rickards2011currency,
    title = {Currency Wars},
    author = {James Rickards},
    year = {2011},
    month = {11},
    publisher = {Penguin},
    isbn = {110155889X},
    url = {https://books.google.co.kr/books?id=-GDwL2s5sJoC&source=gbs_book_other_versions},
    accessDate = {Jan 25, 2024}
}
@misc{deci2024introducing,
    title = {Introducing DeciCoder-6B: The Best Multi-Language Code LLM in Its Class},
    author = {Deci},
    year = {2024},
    month = {1},
    publisher = {Deci},
    url = {https://deci.ai/blog/decicoder-6b-the-best-multi-language-code-generation-llm-in-its-class/},
    accessDate = {Jan 25, 2024}
}
@misc{kai2023forcefield,
    title = {Force-field files for "Noble gas (He, Ne and Ar) solubilities in high-pressure silicate melts calculated based on deep potential modeling"},
    author = {Wang, Kai and Lu, Xiancai and Liu, Xiandong and Yin, Kun},
    year = {2023},
    month = {3},
    publisher = {Zenodo},
    doi = {10.5281/zenodo.7751762},
    url = {https://zenodo.org/records/7751762},
    accessDate = {Jan 25, 2024}
}
  • Bibtex this page
@misc{nguyen2024bibtex,
    title = {BibTeX Generator},
    author = {Cao Thang Nguyen},
    year = {2024},
    month = {1},
    url = {https://thangckt.github.io/blog/2024/01/25/bibtex_generator},
    accessDate = {Jan 25, 2024}
}
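
The citation keys in the examples above appear to follow a simple pattern: the last word of the first author's name, the year, and the first word of the title, all in lowercase. The Python sketch below reproduces that pattern as inferred from the examples only. It is not the extension's actual code, and the function name make_key is hypothetical.

```python
import re


def _clean(token: str) -> str:
    """Keep only ASCII letters and digits, then lowercase."""
    return re.sub(r"[^A-Za-z0-9]", "", token).lower()


def make_key(first_author: str, year: int, title: str) -> str:
    """Build a key such as 'nguyen2019pattern' (pattern inferred from the examples)."""
    # Last whitespace-separated token of the first author's name, e.g. "Nguyen".
    name_part = first_author.split()[-1]
    # First whitespace-separated token of the title, e.g. "Pattern".
    title_part = title.split()[0]
    return f"{_clean(name_part)}{year}{_clean(title_part)}"


print(make_key("Cao Thang Nguyen", 2019,
               "Pattern transformation induced by elastic instability"))
print(make_key("Wang, Kai", 2023, "Force-field files for noble gas solubilities"))
```

Taking the last token of the name explains why an author written as "Wang, Kai" yields the key prefix kai rather than wang.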

In summary, the new extension 1click BibTeX works well for most websites with varying data structures.

Accelerated Molecular Simulation Using Deep Potential Workflow with NGC

Credit: NVIDIA's blog

Molecular simulation communities have faced the accuracy-versus-efficiency dilemma in modeling the potential energy surface and interatomic forces for decades. Deep Potential, the artificial neural network force field, solves this problem by combining the speed of classical molecular dynamics (MD) simulation with the accuracy of density functional theory (DFT) calculation [1]. This is achieved by using the GPU-optimized package DeePMD-kit, which is a deep learning package for many-body potential energy representation and MD simulation [2].

This post provides an end-to-end demonstration of training a neural network potential for the 2D material graphene and using it to drive MD simulation in the open-source platform Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) [3]. Training data can be obtained from either the Vienna Ab initio Simulation Package (VASP) [4] or Quantum ESPRESSO (QE) [5].

This post demonstrates a seamless integration of molecular modeling, machine learning, and high-performance computing (HPC) that combines the efficiency of molecular dynamics with ab initio accuracy and is driven entirely through a container-based workflow. Using AI techniques to fit the interatomic forces generated by DFT, the accessible time and size scales can be boosted by several orders of magnitude with linear scaling.

Deep Potential is essentially a combination of machine learning and physical principles, which starts a new computing paradigm as shown in Figure 1.

The image shows the new computing paradigm that combines molecular modeling, machine learning and high-performance computing to understand the interatomic forces of molecules compared to the traditional methods.


Figure 1. A new computing paradigm composed of molecular modeling, AI, and HPC. (Figure courtesy: Dr. Linfeng Zhang, DP Technology)

The entire workflow is shown in Figure 2. The data generation step is done with VASP and QE. The data preparation, model training, testing, and compression steps are done using DeePMD-kit. The model deployment is in LAMMPS.

This figure displays the workflow of training and deploying a deep potential model. The workflow includes data generation, data preparation, model training, model testing, model compression, and model deployment.


Figure 2. Diagram of the DeePMD workflow.

Why Containers?

A container is a portable unit of software that combines the application, and all its dependencies, into a single package that is agnostic to the underlying host OS.

The workflow in this post involves AIMD, DP training, and LAMMPS MD simulation. It is nontrivial and time-consuming to install each software package from source with the correct setup of the compiler, MPI, GPU library, and optimization flags.

Containers solve this problem by providing a highly optimized GPU-enabled computing environment for each step, eliminating the time needed to install and test software.

The NGC catalog, a hub of GPU-optimized HPC and AI software, carries a whole range of HPC and AI containers that can be readily deployed on any GPU system. The HPC and AI containers from the NGC catalog are updated frequently and are tested for reliability and performance, which is necessary to speed up the time to solution.

These containers are also scanned for Common Vulnerabilities and Exposures (CVEs), ensuring that they are devoid of any open ports and malware. Additionally, the HPC containers support both Docker and Singularity runtimes, and can be deployed on multi-GPU and multinode systems running in the cloud or on-premises.

Training data generation

The first step in the simulation is data generation. We will show you how you can use VASP and Quantum ESPRESSO to run AIMD simulations and generate training datasets for DeePMD. All input files can be downloaded from the GitHub repository using the following command:

git clone https://github.com/deepmodeling/SC21_DP_Tutorial.git

VASP

A two-dimensional graphene system with 98 atoms is used, as shown in Figure 3 [6]. To generate the training datasets, a 0.5 ps NVT AIMD simulation at 300 K is performed with a time step of 0.5 fs. The DP model is created using the 1000 time steps of this 0.5 ps MD trajectory at a fixed temperature.

Due to the short simulation time, the training dataset contains consecutive system snapshots, which are highly correlated. Generally, the training dataset should be sampled from uncorrelated snapshots with various system conditions and configurations. For this example, we used a simplified training data scheme. For production DP training, using DP-GEN is recommended, because its concurrent learning scheme efficiently explores more combinations of conditions [7].

Projector-augmented wave pseudopotentials are employed to describe the interactions between the valence electrons and frozen cores. The generalized gradient approximation exchange-correlation functional of Perdew-Burke-Ernzerhof is used. Only the Γ-point is used for k-space sampling in all systems.

This figure displays the top view of a single layer graphene system with 98 carbon atoms.


Figure 3. A graphene system composed of 98 carbon atoms is used in AIMD simulation.

Quantum ESPRESSO

The AIMD simulation can also be carried out using Quantum ESPRESSO, available as a container from the NGC Catalog. Quantum ESPRESSO is an integrated suite of open-source computer codes for electronic-structure calculations and materials modeling at the nanoscale based on density-functional theory, plane waves, and pseudopotentials. The same graphene structure is used in the QE calculations. The following command can be used to start the AIMD simulation:

$ singularity exec --nv docker://nvcr.io/hpc/quantum_espresso:qe-6.8 cp.x < c.md98.cp.in

Training data preparation

Once the training data is obtained from AIMD simulation, we want to convert its format using dpdata so that it can be used as input to the deep neural network. The dpdata package is a format conversion toolkit between AIMD, classical MD, and DeePMD-kit.

You can use dpdata to convert data directly from the output of first-principles packages to the DeePMD-kit format. For deep potential training, the following information about a physical system has to be provided: atom type, box boundary, coordinate, force, virial, and system energy.

A snapshot, or frame, of the system contains all these data points for all atoms at one time step. It can be stored in two formats: raw and npy.

The first format, raw, is plain text, with each kind of information kept in a single file and each line of the file representing a snapshot. Different system information is stored in different files named box.raw, coord.raw, force.raw, energy.raw, and virial.raw. We recommend that you follow these naming conventions when preparing the training files.

An example of force.raw:

$ cat force.raw
-0.724  2.039 -0.951  0.841 -0.464  0.363
 6.737  1.554 -5.587 -2.803  0.062  2.222
-1.968 -0.163  1.020 -0.225 -0.789  0.343

This force.raw contains three frames, with each frame having the forces of two atoms, resulting in three lines and six columns. Each line provides all three force components of two atoms in one frame. The first three numbers are the three force components of the first atom, while the next three numbers are the force components of the second atom.
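
The layout described above can be sketched in a few lines of Python. This is only an illustration of the format, not part of DeePMD-kit or dpdata, and the function name parse_raw_frames is hypothetical.

```python
def parse_raw_frames(text: str) -> list[list[tuple[float, ...]]]:
    """Split each line (one frame) into per-atom (x, y, z) triples."""
    frames = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) % 3 != 0:
            raise ValueError("each frame needs three components per atom")
        # Consecutive groups of three numbers belong to one atom.
        frames.append([tuple(values[i:i + 3]) for i in range(0, len(values), 3)])
    return frames


# The force.raw example from the text: three frames, two atoms each.
FORCE_RAW = """\
-0.724  2.039 -0.951  0.841 -0.464  0.363
 6.737  1.554 -5.587 -2.803  0.062  2.222
-1.968 -0.163  1.020 -0.225 -0.789  0.343
"""

frames = parse_raw_frames(FORCE_RAW)
print(len(frames), "frames,", len(frames[0]), "atoms per frame")
print("force on the second atom in the first frame:", frames[0][1])
```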

The coordinate file coord.raw is organized similarly. In box.raw, the nine components of the box vectors should be provided on each line. In virial.raw, the nine components of the virial tensor should be provided on each line in the order XX XY XZ YX YY YZ ZX ZY ZZ. The number of lines of all raw files should be identical. The atom types are assumed not to change between frames. They are provided by type.raw, which has one line listing the types of the atoms one by one.

The atom types should be integers. For example, the type.raw of a system that has two atoms of types 0 and 1:

$ cat type.raw
0 1
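
The rules above (integer types, nine columns for box and virial, three columns per atom for coordinates and forces, equal line counts) lend themselves to a quick sanity check before training. The sketch below is a hypothetical helper, not a DeePMD-kit function. It assumes that energy.raw holds one value per line, and the box values in the example are made up for illustration.

```python
def check_raw_system(files: dict[str, str]) -> int:
    """Validate raw-file contents given as {name: text}; return the frame count."""
    types = [int(t) for t in files["type.raw"].split()]  # raises if not integers
    natoms = len(types)
    expected_cols = {
        "box.raw": 9, "virial.raw": 9,
        "coord.raw": 3 * natoms, "force.raw": 3 * natoms,
        "energy.raw": 1,  # assumption: one energy value per frame
    }
    frame_counts = set()
    for name, ncols in expected_cols.items():
        if name not in files:
            continue  # only check the files that were supplied
        lines = files[name].strip().splitlines()
        for number, line in enumerate(lines, start=1):
            if len(line.split()) != ncols:
                raise ValueError(f"{name} line {number}: expected {ncols} columns")
        frame_counts.add(len(lines))
    if len(frame_counts) != 1:
        raise ValueError(f"raw files disagree on the number of frames: {frame_counts}")
    return frame_counts.pop()


example = {
    "type.raw": "0 1\n",
    "force.raw": ("-0.724  2.039 -0.951  0.841 -0.464  0.363\n"
                  " 6.737  1.554 -5.587 -2.803  0.062  2.222\n"
                  "-1.968 -0.163  1.020 -0.225 -0.789  0.343\n"),
    "box.raw": "10 0 0 0 10 0 0 0 10\n" * 3,  # illustrative values only
}
print(check_raw_system(example), "consistent frames")
```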

It is not a requirement to convert the data format to raw, but this process should give a sense of the types of data that can be used as inputs to DeePMD-kit for training.

The easiest way to convert the first-principles results to the training data is to save them as numpy binary data.

For VASP output, we have prepared an outcartodata.py script to process the VASP OUTCAR file. Run the following commands:

$ cd SC21_DP_Tutorial/AIMD/VASP/
$ singularity exec --nv docker://nvcr.io/hpc/deepmd-kit:v2.0.3 python outcartodata.py
$ mv deepmd_data ../../DP/

For QE output:

$ cd SC21_DP_Tutorial/AIMD/QE/
$ singularity exec --nv docker://nvcr.io/hpc/deepmd-kit:v2.0.3 python logtodata.py
$ mv deepmd_data ../../DP/

A folder called deepmd_data is generated and moved to the training directory. It contains five sets, 0/set.000, 1/set.000, 2/set.000, 3/set.000, and 4/set.000, each holding 200 frames. You do not need to take care of the binary data files in each of the set.* directories. The path containing the set.* folder and type.raw file is called a system. If you want to train a nonperiodic system, an empty nopbc file should be placed under the system directory. In that case, box.raw is not necessary.

We are going to use three of the five sets for training, one for validating, and the remaining one for testing.
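
The bookkeeping behind this split can be sketched as follows. The contiguous chunking of frames and the assignment of systems 0 to 2 for training and 3 for validation are assumptions made for illustration. Only the use of system 4 for testing is confirmed by the test command shown later in this post, and the conversion scripts may order frames differently.

```python
FRAMES_TOTAL = 1000  # 1000 AIMD time steps, as stated in the VASP section
SET_SIZE = 200       # frames per system, as stated above

# Frame index ranges for systems "0" .. "4" (contiguous chunks are an assumption).
systems = {
    str(i): range(i * SET_SIZE, (i + 1) * SET_SIZE)
    for i in range(FRAMES_TOTAL // SET_SIZE)
}

# Assumed roles: three systems for training, one for validation, one for testing.
roles = {"train": ["0", "1", "2"], "validation": ["3"], "test": ["4"]}

for role, names in roles.items():
    n_frames = sum(len(systems[name]) for name in names)
    print(f"{role}: systems {names}, {n_frames} frames")
```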

Deep Potential model training

The input of the deep potential model is a descriptor vector containing the system information mentioned previously. The neural network contains several hidden layers with a composition of linear and nonlinear transformations. In this post, a three-layer neural network with 25, 50, and 100 neurons in the respective layers is used. The target value, or the label, for the neural network to learn is the atomic energies. The training process optimizes the weights and the bias vectors by minimizing the loss function.

The training is initiated by the following command, where input.json contains the training parameters:

$ singularity exec --nv docker://nvcr.io/hpc/deepmd-kit:v2.0.3 dp train input.json

The DeePMD-kit prints detailed information on the training and validation data sets. The data sets are determined by training_data and validation_data as defined in the training section of the input script. The training data set is composed of three data systems, while the validation data set is composed of one data system. The number of atoms, batch size, number of batches in the system, and the probability of using the system are all shown in Figure 4. The last column shows whether the periodic boundary condition is assumed for the system.

This image is a screenshot of the DP training output. Summaries of the training and validation dataset are shown with detailed information on the number of atoms, batch size, number of batches in the system and the probability of using the system.


Figure 4. Screenshot of the DP training output.

During the training, the error of the model is tested every disp_freq training steps with the batch used to train the model and with numb_btch batches from the validation data. The training error and validation error are printed correspondingly in the file disp_file (default is lcurve.out). The batch size can be set in the input script by the key batch_size in the corresponding sections for the training and validation data sets.
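
To show where the keys named above sit, the sketch below builds only the training section of an input file and prints it as JSON. All values and system paths are illustrative assumptions rather than the settings of the tutorial's own input.json, and a complete input file needs further sections (such as the model definition) that are omitted here.

```python
import json

# Fragment of a DeePMD-kit input file: only the "training" section is sketched.
training_section = {
    "training": {
        "training_data": {
            # Assumed: systems 0-2 are used for training.
            "systems": ["deepmd_data/0", "deepmd_data/1", "deepmd_data/2"],
            "batch_size": 1,  # illustrative value
        },
        "validation_data": {
            # Assumed: system 3 is used for validation.
            "systems": ["deepmd_data/3"],
            "batch_size": 1,  # illustrative value
            "numb_btch": 1,   # number of validation batches per test, illustrative
        },
        "disp_file": "lcurve.out",  # default name mentioned in the text
        "disp_freq": 100,           # illustrative: test the error every 100 steps
    }
}

print(json.dumps(training_section, indent=4))
```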

An example of the output:

#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
      0      3.33e+01    3.41e+01      1.03e+01    1.03e+01      8.39e-01    8.72e-01    1.0e-03
    100      2.57e+01    2.56e+01      1.87e+00    1.88e+00      8.03e-01    8.02e-01    1.0e-03
    200      2.45e+01    2.56e+01      2.26e-01    2.21e-01      7.73e-01    8.10e-01    1.0e-03
    300      1.62e+01    1.66e+01      5.01e-02    4.46e-02      5.11e-01    5.26e-01    1.0e-03
    400      1.36e+01    1.32e+01      1.07e-02    2.07e-03      4.29e-01    4.19e-01    1.0e-03
    500      1.07e+01    1.05e+01      2.45e-03    4.11e-03      3.38e-01    3.31e-01    1.0e-03
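
A learning-curve file in this layout is easy to read back for plotting or monitoring. The following sketch parses a few of the rows shown above using the column names in the header line. The function name parse_lcurve is hypothetical and not part of DeePMD-kit.

```python
def parse_lcurve(text: str) -> list[dict[str, float]]:
    """Parse lcurve.out-style text into one dict per row, keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].lstrip("#").split()  # column names from the "#" line
    rows = []
    for line in lines[1:]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and extra comment lines
        rows.append(dict(zip(header, (float(v) for v in line.split()))))
    return rows


# Three rows copied from the example output above.
LCURVE = """\
#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
      0      3.33e+01    3.41e+01      1.03e+01    1.03e+01      8.39e-01    8.72e-01    1.0e-03
    100      2.57e+01    2.56e+01      1.87e+00    1.88e+00      8.03e-01    8.02e-01    1.0e-03
    500      1.07e+01    1.05e+01      2.45e-03    4.11e-03      3.38e-01    3.31e-01    1.0e-03
"""

rows = parse_lcurve(LCURVE)
for row in rows:
    print(int(row["step"]), row["rmse_f_val"], row["rmse_f_trn"])
```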

The training error decreases monotonically with training steps as shown in Figure 5. The trained model is tested on the test dataset and compared with the AIMD simulation results. The test command is:

$ singularity exec --nv docker://nvcr.io/hpc/deepmd-kit:v2.0.3 dp test -m frozen_model.pb -s deepmd_data/4/ -n 200 -d detail.out

This image shows the total training loss, energy loss, force loss and learning rate decay with training steps from 0 to 1,000,000. Both the training and validation loss decrease monotonically with training steps.


Figure 5. Training loss with steps

The results are shown in Figure 6.

This image displays the predicted energy and force on the y-axis, and the ground truth on the x-axis. The predicted values closely coincide with the ground truth, with all data distributed along the diagonal.


Figure 6. Test of the prediction accuracy of trained DP model with AIMD energies and forces.
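
Agreement of this kind is usually summarized as a root-mean-square error between predicted and reference values. The sketch below shows the calculation in plain Python. The numbers are made up for illustration and are not results from this tutorial, and how the test output files are laid out is not assumed here.

```python
import math


def rmse(predicted: list[float], reference: list[float]) -> float:
    """Root-mean-square error between two equally long lists of values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Illustrative force components only (not data from this post).
reference_forces = [-0.724, 2.039, -0.951]
predicted_forces = [-0.700, 2.000, -0.900]
print(f"force RMSE: {rmse(predicted_forces, reference_forces):.4f}")
```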

Model export and compression

After the model has been trained, a frozen model is generated for inference in MD simulation. The process of saving the neural network from a checkpoint is called “freezing” a model:

$ singularity exec --nv docker://nvcr.io/hpc/deepmd-kit:v2.0.3 dp freeze -o graphene.pb

After the frozen model is generated, the model can be compressed without sacrificing its accuracy, while greatly speeding up inference in MD. Depending on the simulation and training setup, model compression can boost performance by 10X and reduce memory consumption by 20X when running on GPUs.

The frozen model can be compressed using the following command where -i refers to the frozen model and -o points to the output name of the compressed model:

$ singularity exec --nv docker://nvcr.io/hpc/deepmd-kit:v2.0.3 dp compress -i graphene.pb -o graphene-compress.pb

Model deployment in LAMMPS

A new pair style has been implemented in LAMMPS to deploy the neural network trained in the prior steps. For users familiar with the LAMMPS workflow, only minimal changes are needed to switch to deep potential. For instance, a traditional LAMMPS input with the Tersoff potential has the following setting for potential setup:

pair_style      tersoff
pair_coeff      * * BNC.tersoff C

To use deep potential, replace the previous lines with:

pair_style      deepmd graphene-compress.pb
pair_coeff      * *

The pair_style command in the input file uses the DeePMD model to describe the atomic interactions in the graphene system.

The graphene-compress.pb file represents the frozen and compressed model for inference. The graphene system in the MD simulation contains 1,560 atoms. Periodic boundary conditions are applied in the lateral x- and y-directions, and a free boundary is applied in the z-direction. The time step is set to 1 fs. The system is placed under the NVT ensemble at a temperature of 300 K for relaxation, which is consistent with the AIMD setup. The system configuration after NVT relaxation is shown in Figure 7. It can be observed that the deep potential can describe the atomic structures with small ripples in the cross-plane direction. After 10 ps of NVT relaxation, the system is placed under the NVE ensemble to check system stability.
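
The setup described above can be turned into a LAMMPS input with a short script. The sketch below is only a plausible arrangement under stated assumptions, not the input file used in this post: metal units, the data file name graphene.data, the thermostat damping time of 0.1 ps, and a 10 ps NVE run are all assumptions, and velocity initialization and output commands are omitted.

```python
TIMESTEP_FS = 1.0      # time step from the text
RELAX_PS = 10.0        # NVT relaxation time from the text
TEMPERATURE_K = 300.0  # temperature from the text

# Number of MD steps for each stage: 10 ps / 1 fs = 10000 steps.
steps = round(RELAX_PS * 1000.0 / TIMESTEP_FS)

# Assumptions: metal units (time in ps), hypothetical data file "graphene.data",
# damping time 0.1 ps, and an NVE stage as long as the NVT stage.
lammps_input = f"""\
units           metal
boundary        p p f
atom_style      atomic
read_data       graphene.data
mass            1 12.011
pair_style      deepmd graphene-compress.pb
pair_coeff      * *
timestep        {TIMESTEP_FS / 1000.0}
fix             relax all nvt temp {TEMPERATURE_K} {TEMPERATURE_K} 0.1
run             {steps}
unfix           relax
fix             check all nve
run             {steps}
"""

print(lammps_input)
```

The boundary line encodes the periodic x- and y-directions and the free z-direction, and the two run commands correspond to the NVT relaxation followed by the NVE stability check.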

The image displays the side view of the single layer graphene system after thermal relaxation in LAMMPS.


Figure 7. Atomic configuration of the graphene system after relaxation with deep potential.

The system temperature is shown in Figure 8.

The image displays the temperature profiles of the graphene system under NVT and NVE ensembles from 0 to 20 picoseconds. The first 10 picoseconds are NVT and the second 10 picoseconds are NVE.


Figure 8. System temperature under NVT and NVE ensembles. The MD system driven by deep potential is very stable after relaxation.

To validate the accuracy of the trained DP model, the radial distribution functions (RDFs) calculated from AIMD, DP, and Tersoff are plotted in Figure 9. The RDF generated by the DP model is very close to that of AIMD, which indicates that the crystalline structure of graphene is well represented by the DP model.

This image displays the plotted radial distribution function from three different methods, including DP, Tersoff and AIMD, which are denoted in black, red and blue solid lines respectively.


Figure 9. Radial distribution function calculated by AIMD, DP and Tersoff potential, respectively. It can be observed that the RDF calculated by DP is very close to that of AIMD.

Conclusion

This post demonstrates a simple case study of graphene under given conditions. The DeePMD-kit package streamlines the workflow from AIMD to classical MD with deep potential, providing the following key advantages:

  • Highly automatic and efficient workflow implemented in the TensorFlow framework.
  • APIs with popular DFT and MD packages such as VASP, QE, and LAMMPS.
  • Broad applications in organic molecules, metals, semiconductors, insulators, and more.
  • Highly efficient code for HPC with MPI and GPU support.
  • Modularization for easy adoption by other deep learning potential models.

Furthermore, the use of GPU-optimized containers from the NGC catalog simplifies and accelerates the overall workflow by eliminating the steps to install and configure software. To train a comprehensive model for other applications, download the DeePMD-kit container from the NGC catalog.

References

[1] Jia W, Wang H, Chen M, Lu D, Lin L, Car R, E W and Zhang L 2020 Pushing the limit of molecular dynamics with ab initio accuracy to 100 million atoms with machine learning IEEE Press 5 1-14

[2] Wang H, Zhang L, Han J and E W 2018 DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics Computer Physics Communications 228 178-84

[3] Plimpton S 1995 Fast Parallel Algorithms for Short-Range Molecular Dynamics Journal of Computational Physics 117 1-19

[4] Kresse G and Hafner J 1993 Ab initio molecular dynamics for liquid metals Physical Review B 47 558-61

[5] Giannozzi P, Baroni S, Bonini N, Calandra M, Car R, Cavazzoni C, Ceresoli D, Chiarotti G L, Cococcioni M, Dabo I, Dal Corso A, de Gironcoli S, Fabris S, Fratesi G, Gebauer R, Gerstmann U, Gougoussis C, Kokalj A, Lazzeri M, Martin-Samos L, Marzari N, Mauri F, Mazzarello R, Paolini S, Pasquarello A, Paulatto L, Sbraccia C, Scandolo S, Sclauzero G, Seitsonen A P, Smogunov A, Umari P and Wentzcovitch R M 2009 QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials Journal of Physics: Condensed Matter 21 395502

[6] Humphrey W, Dalke A and Schulten K 1996 VMD: Visual molecular dynamics Journal of Molecular Graphics 14 33-8

[7] Zhang Y, Wang H, Chen W, Zeng J, Zhang L, Wang H and E W 2020 DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models Computer Physics Communications 107206