alff.al ¤

Modules:

  • active_learning

    Active Learning workflow implementation.

  • finetune

    Classes and functions for fine-tuning ML models.

  • libal_md_ase

    Library for ASE MD with SevenNet model.

  • libal_md_lammps

    Library for LAMMPS MD with SevenNet model.

  • utilal

    Utilities for Active Learning workflow.

  • utilal_uncertainty

    Utilities for uncertainty estimation using a committee of models.

active_learning ¤

Active Learning workflow implementation.

Classes:

  • WorkflowActiveLearning

    Workflow for active learning.

Functions:

  • stage_train

    Stage function for ML training tasks.

  • stage_md

    Stage function for MD exploration tasks.

  • stage_dft

    Stage function for DFT labeling tasks.

WorkflowActiveLearning(params_file: str, machines_file: str) ¤

Bases: Workflow

Workflow for active learning. Note: the .run() method must be redefined here, because the active learning workflow differs from the one in the base class.

Methods:

Attributes:

stage_map = {'ml_train': stage_train, 'md_explore': stage_md, 'dft_label': stage_dft} instance-attribute ¤
wf_name = 'ACTIVE LEARNING' instance-attribute ¤
params_file = params_file instance-attribute ¤
machines_file = machines_file instance-attribute ¤
schema_file = schema_file instance-attribute ¤
multi_mdicts = config_machine.multi_mdicts instance-attribute ¤
pdict = Config.loadconfig(self.params_file) instance-attribute ¤
stage_list = self._load_stage_list() instance-attribute ¤
run() ¤

stage_train(iter_idx, pdict, mdict) ¤

Stage function for ML training tasks.

This function prepares the training data and arguments, runs the training, and postprocesses the results:

  • Collect data files
  • Prepare training args based on the MLP engine

stage_md(iter_idx, pdict, mdict) ¤

Stage function for MD exploration tasks.

Includes the pre-, run-, and post-MD steps:

  • Collect initial configurations
  • Prepare MD args
  • Submit MD jobs to remote machines
  • Postprocess MD results

stage_dft(iter_idx, pdict, mdict) ¤

Stage function for DFT labeling tasks. Includes the pre-, run-, and post-DFT steps.
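
The stage_map attribute pairs each stage name with its stage function. A minimal sketch of how such a mapping could drive an iterative loop is given below. The stage functions are stand-ins, and run_sketch, n_iters, and the loop order are assumptions made for illustration; the actual .run() method may differ.

```python
from typing import Callable


# Stand-in stage functions; they only mirror the documented
# (iter_idx, pdict, mdict) signature and do no real work.
def stage_train(iter_idx, pdict, mdict):
    return f"iter {iter_idx}: ml_train"


def stage_md(iter_idx, pdict, mdict):
    return f"iter {iter_idx}: md_explore"


def stage_dft(iter_idx, pdict, mdict):
    return f"iter {iter_idx}: dft_label"


stage_map: dict[str, Callable] = {
    "ml_train": stage_train,
    "md_explore": stage_md,
    "dft_label": stage_dft,
}


def run_sketch(stage_list, n_iters, pdict, mdict):
    """Call every stage in order, once per iteration (hypothetical loop)."""
    log = []
    for iter_idx in range(n_iters):
        for stage_name in stage_list:
            stage_func = stage_map[stage_name]  # KeyError for unknown stage names
            log.append(stage_func(iter_idx, pdict, mdict))
    return log


log = run_sketch(["ml_train", "md_explore", "dft_label"], n_iters=2, pdict={}, mdict={})
```

Keeping the name-to-function mapping in one dictionary means a stage list read from a parameter file can be validated and dispatched without any if/elif chain.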

finetune ¤

Classes and functions for fine-tuning ML models.

Classes:

  • WorkflowFinetune

    Workflow for fine-tuning existing ML models or training a new ML model.

Functions:

  • stage_train

    Stage function for ML training tasks.

WorkflowFinetune(params_file: str, machines_file: str) ¤

Bases: Workflow

Workflow for fine-tuning existing ML models or training a new ML model. It overrides self.stage_list from the base class, because the stages are fixed here.

Methods:

  • run

    The main function to run the workflow. This default implementation works for simple workflows.

Attributes:

stage_map = {'ml_train': stage_train} instance-attribute ¤
wf_name = 'FINE-TUNING' instance-attribute ¤
stage_list = ['ml_train'] instance-attribute ¤
params_file = params_file instance-attribute ¤
machines_file = machines_file instance-attribute ¤
schema_file = schema_file instance-attribute ¤
multi_mdicts = config_machine.multi_mdicts instance-attribute ¤
pdict = Config.loadconfig(self.params_file) instance-attribute ¤
run() ¤

The main function to run the workflow. This default implementation works for simple workflows; more complex workflows (e.g. iterative ones such as active learning) need to reimplement this .run() method.

stage_train(pdict, mdict) ¤

Stage function for ML training tasks.

libal_md_ase ¤

Library for ASE MD with SevenNet model.

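Classes:

  • OperAlmdAseSevennet

    This class runs ASE MD for a list of structures in task_dirs.

Functions:

  • premd_ase_sevenn

    Prepare MD args.

  • temperature_press_mdarg_ase

    Generate the task_dirs for ranges of temperatures and stresses.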

OperAlmdAseSevennet(work_dir, pdict, multi_mdict, mdict_prefix='md') ¤

Bases: RemoteOperation

This class runs ASE MD for a list of structures in task_dirs.

Methods:

Attributes:

op_name = 'ASE MD with SevenNet' instance-attribute ¤
task_filter = {'has_files': ['conf.extxyz'], 'no_files': ['committee_error.txt']} instance-attribute ¤
work_dir = work_dir instance-attribute ¤
pdict = pdict instance-attribute ¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix) instance-attribute ¤
task_dirs = self._load_task_dirs() instance-attribute ¤
commandlist_list: list[list[str]] instance-attribute ¤
forward_files: list[str] instance-attribute ¤
backward_files: list[str] instance-attribute ¤
forward_common_files: list[str] instance-attribute ¤
backward_common_files: list[str] = [] instance-attribute ¤
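
The task_filter attribute names files that must be present ('has_files') and files that must be absent ('no_files'). A minimal sketch of how such a filter could select pending task directories is shown below. The helper filter_task_dirs and the directory names are hypothetical, and the exact semantics used by _load_task_dirs() are an assumption.

```python
import tempfile
from pathlib import Path


def filter_task_dirs(task_dirs, task_filter):
    """Keep dirs that contain all 'has_files' and none of the 'no_files'."""
    selected = []
    for task_dir in map(Path, task_dirs):
        has_all = all((task_dir / f).is_file() for f in task_filter.get("has_files", []))
        has_none = not any((task_dir / f).is_file() for f in task_filter.get("no_files", []))
        if has_all and has_none:
            selected.append(str(task_dir))
    return selected


task_filter = {"has_files": ["conf.extxyz"], "no_files": ["committee_error.txt"]}

with tempfile.TemporaryDirectory() as root:
    # A task that has its input but no result yet: should be selected.
    pending = Path(root, "task_pending")
    pending.mkdir()
    (pending / "conf.extxyz").touch()
    # A task that already produced committee_error.txt: should be skipped.
    done = Path(root, "task_done")
    done.mkdir()
    (done / "conf.extxyz").touch()
    (done / "committee_error.txt").touch()
    # A directory without the input file: should be skipped.
    empty = Path(root, "task_empty")
    empty.mkdir()

    selected = filter_task_dirs([pending, done, empty], task_filter)
    selected_names = [Path(p).name for p in selected]
```

Under this reading, a rerun of the operation would skip tasks that already have a committee_error.txt file and only pick up unfinished ones.
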
prepare() ¤

Prepare MD tasks.

Includes:

  • Prepare the task_list
  • Prepare forward & backward files
  • Prepare commandlist_list for multi-remote submission

postprocess() ¤
run() ¤

Function to submit jobs to remote machines.

Note
  • The original task_dirs are relative to run_dir and should not be changed. The submission function, however, needs task_dirs relative to work_dir, so a temporary change is made here.

premd_ase_sevenn(work_dir, pdict, mdict) ¤

Prepare MD args.

Includes:

  • Copy ML models to work_dir
  • Collect initial configurations
  • Prepare ASE args
  • Generate task_dirs for ranges of temperature and pressure

temperature_press_mdarg_ase(struct_dirs: list, temperature_list: list = [], press_list: list = [], ase_argdict: dict = {}) -> list ¤

Generate the task_dirs for ranges of temperatures and stresses.

Parameters:

  • struct_dirs (list) –

    List of directories containing configuration files.

  • temperature_list (list, default: [] ) –

    List of temperatures.

  • press_list (list, default: [] ) –

    List of stresses.

  • ase_argdict (dict, default: {} ) –
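
Generating task directories over ranges of temperature and pressure amounts to a Cartesian product of the inputs. The sketch below illustrates that idea only. The function name temperature_press_grid and the "T…_P…" directory naming are assumptions, and the sketch neither creates directories nor writes MD arguments.

```python
from itertools import product
from pathlib import PurePosixPath


def temperature_press_grid(struct_dirs, temperature_list=None, press_list=None):
    """Return one hypothetical task-dir path per (structure, T, P) combination."""
    # None defaults avoid sharing a mutable default between calls;
    # an empty or missing list means "do not vary this quantity".
    temperatures = temperature_list or [None]
    pressures = press_list or [None]
    task_dirs = []
    for struct_dir, temp, press in product(struct_dirs, temperatures, pressures):
        parts = []
        if temp is not None:
            parts.append(f"T{temp}")
        if press is not None:
            parts.append(f"P{press}")
        name = "_".join(parts) or "default"
        task_dirs.append(str(PurePosixPath(struct_dir) / name))
    return task_dirs


grid = temperature_press_grid(["s0", "s1"], [300, 600], [0.0])
```

Two structures, two temperatures, and one pressure give four task directories here; the count grows multiplicatively with each list.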

libal_md_lammps ¤

Library for LAMMPS MD with SevenNet model.

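Classes:

  • OperAlmdLammpsSevennet

    This class runs LAMMPS MD for a list of structures in task_dirs.

Functions:

  • premd_lammps_sevenn

    Prepare MD args.

  • temperature_press_mdarg_lammps

    Generate the task_dirs for ranges of temperatures and stresses.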

OperAlmdLammpsSevennet(work_dir, pdict, multi_mdict, mdict_prefix='md') ¤

Bases: RemoteOperation

This class runs LAMMPS MD for a list of structures in task_dirs.

Methods:

Attributes:

op_name = 'LAMMPS MD with SevenNet' instance-attribute ¤
task_filter = {'has_files': ['conf.lmpdata'], 'no_files': ['committee_error.txt']} instance-attribute ¤
work_dir = work_dir instance-attribute ¤
pdict = pdict instance-attribute ¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix) instance-attribute ¤
task_dirs = self._load_task_dirs() instance-attribute ¤
commandlist_list: list[list[str]] instance-attribute ¤
forward_files: list[str] instance-attribute ¤
backward_files: list[str] instance-attribute ¤
forward_common_files: list[str] instance-attribute ¤
backward_common_files: list[str] = [] instance-attribute ¤
prepare() ¤

Prepare MD tasks.

Includes:

  • Prepare the task_list
  • Prepare forward & backward files
  • Prepare commandlist_list for multi-remote submission

postprocess() ¤
run() ¤

Function to submit jobs to remote machines.

Note
  • The original task_dirs are relative to run_dir and should not be changed. The submission function, however, needs task_dirs relative to work_dir, so a temporary change is made here.

premd_lammps_sevenn(work_dir, pdict, mdict) ¤

Prepare MD args.

Includes:

  • Copy ML models to work_dir
  • Collect initial configurations
  • Prepare LAMMPS args
  • Generate task_dirs for ranges of temperature and pressure

temperature_press_mdarg_lammps(struct_dirs: list, temperature_list: list = [], press_list: list = [], lammps_argdict: dict = {}) -> list ¤

Generate the task_dirs for ranges of temperatures and stresses.

Parameters:

  • struct_dirs (list) –

    List of directories containing configuration files.

  • temperature_list (list, default: [] ) –

    List of temperatures.

  • press_list (list, default: [] ) –

    List of stresses.

  • lammps_argdict (dict, default: {} ) –

utilal ¤

Utilities for Active Learning workflow.

Classes:

  • D3ParamMD

    Different packages use different names for D3 parameters.

  • MLP2Lammps

    Convert MLP model to be used in LAMMPS.

D3ParamMD(d3package: str = 'sevenn') ¤

Different packages use different names for D3 parameters. This class returns the conventional D3 parameter names for the package used for MD.

Note
  • The default cutoff values are 95 Bohr (50.2718 Angstrom) for two-body dispersion calculations and 40 Bohr (21.1671 Angstrom) for coordination number and three-body calculations, as in ASE-DFTD3 package. Other packages may use different default values.
  • Some dftd3 parameters for DFT calculations support three-body interactions, but most MD packages only support pairwise interactions. The three-body cutoff parameter is therefore not included in this class.

Methods:

Attributes:

d3package: str = d3package instance-attribute ¤
default_twobody_cutoff: float = 50.2718 instance-attribute ¤
default_cn_cutoff: float = 21.1671 instance-attribute ¤
param_names = params['params'] instance-attribute ¤
damping_map = params['damping_map'] instance-attribute ¤
get_params() -> dict ¤

Return D3 parameter names according to different packages.

check_supported_damping(damping: str) ¤

Check if the damping method is supported in the selected package.

angstrom2bohr(angstrom_value: float) -> float staticmethod ¤

Convert Angstrom to Bohr. Note: in simple-dftd3, 60*Bohr converts 60 Bohr to Angstrom.

angstrom2bohr2(angstrom_value: float) -> float staticmethod ¤

Convert Angstrom to Bohr^2. To be used with the sevenn package.

bohr2angstrom(bohr_value: float) -> float staticmethod ¤

Convert Bohr to Angstrom.
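
The default cutoffs quoted in the note above can be checked with a short conversion. This is a minimal sketch, assuming the CODATA 2018 value of the Bohr radius; the constant actually used by the class is not stated here, and the functions below are standalone stand-ins rather than the static methods themselves.

```python
# Bohr radius in Angstrom (CODATA 2018); the value used by alff is an assumption.
BOHR_IN_ANGSTROM = 0.529177210903


def bohr2angstrom(bohr_value: float) -> float:
    """Convert a length from Bohr to Angstrom."""
    return bohr_value * BOHR_IN_ANGSTROM


def angstrom2bohr(angstrom_value: float) -> float:
    """Convert a length from Angstrom to Bohr."""
    return angstrom_value / BOHR_IN_ANGSTROM


# 95 Bohr and 40 Bohr are the two-body and coordination-number cutoffs from the note.
twobody_cutoff = round(bohr2angstrom(95.0), 4)
cn_cutoff = round(bohr2angstrom(40.0), 4)
```

With this constant, the two results round to the documented defaults of 50.2718 and 21.1671 Angstrom.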

MLP2Lammps(mlp_model: str = 'sevenn') ¤

Convert MLP model to be used in LAMMPS.

Methods:

Attributes:

mlp_model: str = mlp_model instance-attribute ¤
convert(checkpoint: str, outfile: str = 'deployed.pt', **kwargs) ¤

Convert MLP model to LAMMPS format.

Parameters:

  • checkpoint (str) –

    Path to checkpoint file of MLP model.

  • outfile (str, default: 'deployed.pt' ) –

    Path to output LAMMPS potential file.

  • **kwargs

    Additional arguments for specific conversion methods.

convert_sevenn(checkpoint: str, outfile: str = 'deploy_sevenn', modal: str | None = None, use_flash: bool = False, parallel_type=False, **kwargs) staticmethod ¤

Convert sevenn model to be used in LAMMPS.

Parameters:

  • checkpoint (str) –

    Path to checkpoint file of sevenn model.

  • outfile (str, default: 'deploy_sevenn' ) –

    Path to output LAMMPS potential file.

  • modal (str, default: None ) –

    Channel of multi-task model.

  • parallel_type (bool, default: False ) –

    Convert to a potential for running parallel simulations.

  • use_flash (bool, default: False ) –

    Use flashTP.

  • **kwargs

    Additional arguments to avoid breaking the function signature when future additional arguments are added.

Note

  • Single mode: generates the file "outfile.pt".
  • Parallel mode: generates the files "outfile/deployed_parallel_0.pt", "outfile/deployed_parallel_1.pt", ...
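
The naming rule in the note above can be written down as a small helper, which may be handy when listing the files to transfer after conversion. This is only a sketch of the naming pattern: expected_sevenn_outputs and its n_parts argument are hypothetical, and the number of parallel files actually produced depends on the model, which is not specified here.

```python
from pathlib import PurePosixPath


def expected_sevenn_outputs(outfile="deploy_sevenn", parallel=False, n_parts=1):
    """List the output paths implied by the documented naming pattern."""
    if parallel:
        # Parallel mode: one file per part inside a directory named after outfile.
        return [
            str(PurePosixPath(outfile) / f"deployed_parallel_{i}.pt")
            for i in range(n_parts)
        ]
    # Single mode: one file with the .pt suffix appended to outfile.
    return [f"{outfile}.pt"]


single_files = expected_sevenn_outputs()
parallel_files = expected_sevenn_outputs("pot", parallel=True, n_parts=2)
```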

convert_sevenn_mliap(checkpoint: str, outfile: str = 'deploy_sevenn_mliap.pt', modal: str | None = None, use_cueq: bool = False, use_flash: bool = False, use_oeq: bool = False, **kwargs) staticmethod ¤

Convert sevenn model to be used in LAMMPS MLIAP.

Parameters:

  • checkpoint (str) –

    Path to checkpoint file of sevenn model.

  • outfile (str, default: 'deploy_sevenn_mliap.pt' ) –

    Path to output LAMMPS potential file.

  • modal (str, default: None ) –

    Channel of multi-task model.

  • use_cueq (bool, default: False ) –

    Use cueq. cuEquivariance is only supported in ML-IAP interface.

  • use_flash (bool, default: False ) –

    Use flashTP.

  • use_oeq (bool, default: False ) –

    Use oeq.

  • **kwargs

    Additional arguments to avoid breaking the function signature when future additional arguments are added.

utilal_uncertainty ¤

Utilities for uncertainty estimation using a committee of models. Do NOT import any alff libraries in this file, since it is used remotely.

Classes:

  • ModelCommittee

    A class to manage a committee of models for uncertainty estimation.

Functions:

  • simple_lmpdump2extxyz

    Convert a LAMMPS dump file to an extended xyz file. This is a very simple version that converts only the atomic positions, not the stress tensor.

  • chunk_list

    Yield successive n-sized chunks from input_list.

ModelCommittee(mlp_model: str, model_files: list[str], calc_kwargs: dict | None = None, compute_stress: bool = False, rel_force: float | None = None, rel_stress: float | None = None, e_std_lo: float = 0.05, e_std_hi: float = 0.1, f_std_lo: float = 0.05, f_std_hi: float = 0.1, s_std_lo: float = 0.05, s_std_hi: float = 0.1, block_size: int = 1000) ¤

A class to manage a committee of models for uncertainty estimation.

Parameters:

  • mlp_model (str) –

    MLP model engine, e.g., 'sevenn'.

  • model_files (list[str]) –

    List of model files for the committee.

  • calc_kwargs (dict, default: None ) –

    Additional arguments for the MLP calculator. Defaults to {}.

  • compute_stress (bool, default: False ) –

    Whether to compute stress. Defaults to False.

  • rel_force (float, default: None ) –

    Relative force to normalize force std. Defaults to None.

  • rel_stress (float, default: None ) –

    Relative stress to normalize stress std. Defaults to None.

  • e_std_lo (float, default: 0.05 ) –

    energy std low. Defaults to 0.05.

  • e_std_hi (float, default: 0.1 ) –

    energy std high. Defaults to 0.1.

  • f_std_lo (float, default: 0.05 ) –

    force std low. Defaults to 0.05.

  • f_std_hi (float, default: 0.1 ) –

    force std high. Defaults to 0.1.

  • s_std_lo (float, default: 0.05 ) –

    stress std low. Defaults to 0.05.

  • s_std_hi (float, default: 0.1 ) –

    stress std high. Defaults to 0.1.

  • block_size (int, default: 1000 ) –

    Block size of configurations to compute 'committee error' at once, just to avoid flooding memory. Defaults to 1000.

Note
  • Consider using @staticmethod for some functions to avoid recursive messing.

Methods:

  • compute_committee_error_blockwise

    Compute committee error for energy, forces, and stress for multiple configurations in a block-wise manner.

  • committee_judge

    Decide whether a configuration is candidate, accurate, or inaccurate based on committee error.

  • select_candidate

    Select candidate configurations for DFT calculation.

  • remove_inaccurate

    Remove inaccurate configurations based on committee error. This is used to revise the dataset.

Attributes:

mlp_model = mlp_model instance-attribute ¤
model_files = model_files instance-attribute ¤
calc_kwargs = calc_kwargs or {} instance-attribute ¤
compute_stress = compute_stress instance-attribute ¤
rel_force = rel_force instance-attribute ¤
rel_stress = rel_stress instance-attribute ¤
block_size = block_size instance-attribute ¤
e_std_lo = e_std_lo instance-attribute ¤
e_std_hi = e_std_hi instance-attribute ¤
f_std_lo = f_std_lo instance-attribute ¤
f_std_hi = f_std_hi instance-attribute ¤
s_std_lo = s_std_lo instance-attribute ¤
s_std_hi = s_std_hi instance-attribute ¤
calc_list = self._get_calc_list() instance-attribute ¤
committee_error_file: str = 'committee_error.txt' instance-attribute ¤
committee_judge_file: str = 'committee_judge_summary.yml' instance-attribute ¤
compute_committee_error_blockwise(struct_list: list[Atoms]) ¤

Compute committee error for energy, forces, and stress for multiple configurations in a block-wise manner.

Parameters:

  • struct_list (list[Atoms]) –

    List of Atoms objects.

Note

The output file is controlled by the class attribute self.committee_error_file.
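
As a rough illustration of a committee error, the sketch below computes the spread of the predicted energies across models for each configuration, visiting the configurations block by block. It assumes the population standard deviation as the error measure and uses plain Python lists; the actual method evaluates the calculators in calc_list on Atoms objects, also covers forces and stress, and may use a different formula. The name energy_committee_std is hypothetical.

```python
from statistics import pstdev


def energy_committee_std(energies_per_model, block_size=1000):
    """Per-configuration energy std across models.

    energies_per_model[m][c] is the energy of configuration c from model m.
    """
    n_configs = len(energies_per_model[0])
    stds = []
    # Visit configurations in blocks; a real implementation would evaluate
    # and write out one block before loading the next to bound memory use.
    for start in range(0, n_configs, block_size):
        stop = min(start + block_size, n_configs)
        for c in range(start, stop):
            stds.append(pstdev(model[c] for model in energies_per_model))
    return stds


# Three models, three configurations; the models disagree only on the second one.
energies = [
    [1.0, 2.0, 3.0],
    [1.0, 2.2, 3.0],
    [1.0, 2.4, 3.0],
]
stds = energy_committee_std(energies, block_size=2)
```

A configuration on which all models agree gets a zero error, while disagreement between models shows up as a larger value.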

committee_judge() -> tuple[np.ndarray, np.ndarray, np.ndarray] ¤

Decide whether a configuration is candidate, accurate, or inaccurate based on committee error.

Returns:

  • committee_judge_file(s) –

    Files containing the candidate, accurate, and inaccurate configurations.

Note
  • To select candidates based on energy only, set f_std_hi and s_std_hi to very large values, so that the criteria for those terms are always met.
  • Similarly, to select candidates based on energy and force only, set s_std_hi to a very large value, e.g. s_std_hi=1e6.
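
The low and high thresholds suggest a three-way classification of each committee error. The sketch below shows one common convention for a single quantity: below the low bound is accurate, between the bounds is a candidate, and at or above the high bound is inaccurate. This convention, the boundary handling, and the function name judge are assumptions; how committee_judge combines the energy, force, and stress terms is not described here.

```python
def judge(std: float, lo: float, hi: float) -> str:
    """Classify one committee error against low/high thresholds (assumed rule)."""
    if std < lo:
        return "accurate"
    if std < hi:
        return "candidate"
    return "inaccurate"


# With the default thresholds lo=0.05 and hi=0.1:
labels = [judge(s, lo=0.05, hi=0.1) for s in (0.01, 0.07, 0.5)]

# A very large upper bound means this term can never mark a configuration inaccurate.
relaxed = judge(0.5, lo=0.05, hi=1e6)
```

Under this convention, raising an upper threshold to a very large value effectively removes that term's ability to reject a configuration, which matches the intent of the notes above.
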
select_candidate(extxyz_file: str) ¤

Select candidate configurations for DFT calculation.

Returns:

  • extxyz_file ( str ) –

    candidate configurations

Note: See parameters in functions committee_error and committee_judge.

remove_inaccurate(extxyz_file: str) ¤

Remove inaccurate configurations based on committee error. This is used to revise the dataset.

Returns:

  • extxyz_file ( str ) –

    revised configurations

Note
  • The blockwise functions require all configurations in a block to have the same number of atoms. If the input extxyz file contains configurations with different numbers of atoms, use block_size=1 when initializing the ModelCommittee class.

simple_lmpdump2extxyz(lmpdump_file: str, extxyz_file: str) ¤

Convert a LAMMPS dump file to an extended xyz file. This is a very simple version that converts only the atomic positions, not the stress tensor.

chunk_list(input_list: list, chunk_size: int) -> Generator[list, None, None] ¤

Yield successive n-sized chunks from input_list.

Parameters:

  • input_list (list) –

    Input list to be chunked.

  • chunk_size (int) –

    Chunk size (number of elements per chunk).
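
A generator with the documented signature can be sketched in a few lines. The version below is an illustration with a hypothetical name, chunk_list_sketch, and is not necessarily how the library implements chunk_list.

```python
from collections.abc import Generator


def chunk_list_sketch(input_list: list, chunk_size: int) -> Generator[list, None, None]:
    """Yield successive chunk_size-sized slices; the last one may be shorter."""
    for start in range(0, len(input_list), chunk_size):
        yield input_list[start:start + chunk_size]


chunks = list(chunk_list_sketch([1, 2, 3, 4, 5], 2))
```

Because the chunks are yielded lazily, only one block needs to be held at a time, which fits the block-wise committee error computation described above.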