alff.al
¤
Modules:
-
active_learning–Active Learning workflow implementation.
-
finetune–Classes and functions for fine-tuning ML models.
-
libal_md_ase–Library for ASE MD with SevenNet model.
-
libal_md_lammps–Library for LAMMPS MD with SevenNet model.
-
utilal–Utilities for Active Learning workflow.
-
utilal_uncertainty–Utilities for uncertainty estimation using a committee of models.
active_learning
¤
Active Learning workflow implementation.
Classes:
-
WorkflowActiveLearning–Workflow for active learning.
Functions:
-
stage_train–Stage function for ML training tasks.
-
stage_md–Stage function for MD exploration tasks.
-
stage_dft–Stage function for DFT labeling tasks.
WorkflowActiveLearning(params_file: str, machines_file: str)
¤
Bases: Workflow
Workflow for active learning.
Note: The .run() method is redefined here because the Active Learning workflow differs from the base class.
Methods:
-
run–
Attributes:
-
stage_map– -
wf_name– -
params_file– -
machines_file– -
schema_file– -
multi_mdicts– -
pdict– -
stage_list–
stage_map = {'ml_train': stage_train, 'md_explore': stage_md, 'dft_label': stage_dft}
instance-attribute
¤
wf_name = 'ACTIVE LEARNING'
instance-attribute
¤
params_file = params_file
instance-attribute
¤
machines_file = machines_file
instance-attribute
¤
schema_file = schema_file
instance-attribute
¤
multi_mdicts = config_machine.multi_mdicts
instance-attribute
¤
pdict = Config.loadconfig(self.params_file)
instance-attribute
¤
stage_list = self._load_stage_list()
instance-attribute
¤
run()
¤
stage_train(iter_idx, pdict, mdict)
¤
Stage function for ML training tasks.
This function includes preparing training data and args, running training, and postprocessing:
- Collect data files
- Prepare training args based on the MLP engine
stage_md(iter_idx, pdict, mdict)
¤
Stage function for MD exploration tasks.
Includes pre-processing, running, and post-processing MD:
- Collect initial configurations
- Prepare MD args
- Submit MD jobs to remote machines
- Postprocess MD results
stage_dft(iter_idx, pdict, mdict)
¤
Stage function for DFT labeling tasks. Includes pre-processing, running, and post-processing DFT.
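The three stage functions share the signature (iter_idx, pdict, mdict) and are looked up through stage_map. The following is a minimal sketch of how such an iterative dispatch could look, not the actual run() implementation: the stage functions are stand-ins that only return a label, and run_active_learning and n_iters are hypothetical names introduced for illustration.

```python
from typing import Callable


# Stand-in stage functions (hypothetical): the real ones prepare, submit,
# and postprocess jobs; these only report which stage ran in which iteration.
def stage_train(iter_idx: int, pdict: dict, mdict: dict) -> str:
    return f"iter {iter_idx}: ml_train"


def stage_md(iter_idx: int, pdict: dict, mdict: dict) -> str:
    return f"iter {iter_idx}: md_explore"


def stage_dft(iter_idx: int, pdict: dict, mdict: dict) -> str:
    return f"iter {iter_idx}: dft_label"


# Same keys as the documented stage_map attribute.
stage_map: dict[str, Callable[[int, dict, dict], str]] = {
    "ml_train": stage_train,
    "md_explore": stage_md,
    "dft_label": stage_dft,
}


def run_active_learning(stage_list: list[str], n_iters: int, pdict: dict, mdict: dict) -> list[str]:
    """Run every stage in order, once per iteration (assumed loop structure)."""
    log = []
    for iter_idx in range(n_iters):
        for stage_name in stage_list:
            log.append(stage_map[stage_name](iter_idx, pdict, mdict))
    return log


log = run_active_learning(["ml_train", "md_explore", "dft_label"], n_iters=2, pdict={}, mdict={})
```

Keeping the stages in a name-to-function mapping lets the stage list come from a parameter file while the loop itself stays unchanged.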
finetune
¤
Classes and functions for fine-tuning ML models.
Classes:
-
WorkflowFinetune–Workflow for fine-tuning existing ML models or training a new ML model.
Functions:
-
stage_train–Stage function for ML training tasks.
WorkflowFinetune(params_file: str, machines_file: str)
¤
Bases: Workflow
Workflow for fine-tuning existing ML models or training a new ML model.
Overrides self.stage_list from the base class, because the stages are fixed here.
Methods:
-
run–The main function to run the workflow.
Attributes:
-
stage_map– -
wf_name– -
stage_list– -
params_file– -
machines_file– -
schema_file– -
multi_mdicts– -
pdict–
stage_map = {'ml_train': stage_train}
instance-attribute
¤
wf_name = 'FINE-TUNING'
instance-attribute
¤
stage_list = ['ml_train']
instance-attribute
¤
params_file = params_file
instance-attribute
¤
machines_file = machines_file
instance-attribute
¤
schema_file = schema_file
instance-attribute
¤
multi_mdicts = config_machine.multi_mdicts
instance-attribute
¤
pdict = Config.loadconfig(self.params_file)
instance-attribute
¤
run()
¤
The main function to run the workflow. This default implementation works for simple workflows.
More complex workflows (e.g., with iterations, as in active learning) need to reimplement this .run() function.
stage_train(pdict, mdict)
¤
Stage function for ML training tasks.
libal_md_ase
¤
Library for ASE MD with SevenNet model.
Classes:
-
OperAlmdAseSevennet–This class runs ASE MD for a list of structures in task_dirs.
Functions:
-
premd_ase_sevenn–Prepare MD args.
-
temperature_press_mdarg_ase–Generate the task_dirs for ranges of temperatures and stresses.
OperAlmdAseSevennet(work_dir, pdict, multi_mdict, mdict_prefix='md')
¤
Bases: RemoteOperation
This class runs ASE MD for a list of structures in task_dirs.
Methods:
-
prepare–Prepare MD tasks.
-
postprocess– -
run–Function to submit jobs to remote machines.
Attributes:
-
op_name– -
task_filter– -
work_dir– -
pdict– -
mdict_list– -
task_dirs– -
commandlist_list(list[list[str]]) – -
forward_files(list[str]) – -
backward_files(list[str]) – -
forward_common_files(list[str]) – -
backward_common_files(list[str]) –
op_name = 'ASE MD with SevenNet'
instance-attribute
¤
task_filter = {'has_files': ['conf.extxyz'], 'no_files': ['committee_error.txt']}
instance-attribute
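The task_filter dictionary names files that must be present (has_files) and files that must be absent (no_files). One plausible reading is that it selects task directories that hold an input configuration but no committee-error result yet. The sketch below illustrates that reading only; filter_task_dirs is a hypothetical helper, not a function from this library.

```python
import tempfile
from pathlib import Path


def filter_task_dirs(task_dirs, task_filter: dict) -> list[str]:
    """Keep dirs that contain every 'has_files' entry and none of the 'no_files' entries."""
    selected = []
    for d in map(Path, task_dirs):
        has_all = all((d / name).is_file() for name in task_filter.get("has_files", []))
        has_none = not any((d / name).is_file() for name in task_filter.get("no_files", []))
        if has_all and has_none:
            selected.append(str(d))
    return selected


task_filter = {"has_files": ["conf.extxyz"], "no_files": ["committee_error.txt"]}

with tempfile.TemporaryDirectory() as root:
    # A task that still needs to run: input present, no result yet.
    pending = Path(root, "task_pending")
    pending.mkdir()
    (pending / "conf.extxyz").touch()

    # A task that already produced its result file.
    done = Path(root, "task_done")
    done.mkdir()
    (done / "conf.extxyz").touch()
    (done / "committee_error.txt").touch()

    # A directory without any input file.
    empty = Path(root, "task_empty")
    empty.mkdir()

    selected = filter_task_dirs([pending, done, empty], task_filter)
    selected_names = [Path(p).name for p in selected]
```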
¤
work_dir = work_dir
instance-attribute
¤
pdict = pdict
instance-attribute
¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix)
instance-attribute
¤
task_dirs = self._load_task_dirs()
instance-attribute
¤
commandlist_list: list[list[str]]
instance-attribute
¤
forward_files: list[str]
instance-attribute
¤
backward_files: list[str]
instance-attribute
¤
forward_common_files: list[str]
instance-attribute
¤
backward_common_files: list[str] = []
instance-attribute
¤
prepare()
¤
Prepare MD tasks.
Includes:
- Prepare the task_list
- Prepare forward & backward files
- Prepare commandlist_list for multi-remote submission
postprocess()
¤
run()
¤
Function to submit jobs to remote machines.
Note
- The original task_dirs is relative to run_dir and should not be changed. The submission function, however, needs task_dirs as paths relative to work_dir, so a temporary change is made here.
premd_ase_sevenn(work_dir, pdict, mdict)
¤
Prepare MD args.
Includes:
- Copy ML models to work_dir
- Collect initial configurations
- Prepare ASE args
- Generate task_dirs for ranges of temperature and pressure
temperature_press_mdarg_ase(struct_dirs: list, temperature_list: list = [], press_list: list = [], ase_argdict: dict = {}) -> list
¤
Generate the task_dirs for ranges of temperatures and stresses.
Parameters:
-
struct_dirs(list) –List of directories containing configuration files.
-
temperature_list(list, default:[]) –List of temperatures.
-
press_list(list, default:[]) –List of stresses.
-
ase_argdict(dict, default:{}) –See ase.md schema
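Generating task directories for ranges of temperatures and pressures amounts to taking a Cartesian product over the structure directories and the two lists. The sketch below shows that idea under stated assumptions: make_task_dirs is a hypothetical helper, the T…/P… directory names are invented for illustration, and an empty list is taken to mean "no sweep over this quantity". It only builds paths and does not write any MD input.

```python
from itertools import product
from pathlib import Path


def make_task_dirs(struct_dirs, temperature_list=None, press_list=None) -> list[str]:
    """Return one task directory path per (structure, temperature, pressure) combination."""
    # None defaults avoid sharing one mutable list between calls.
    temps = temperature_list or [None]
    presses = press_list or [None]

    task_dirs = []
    for sdir, temp, press in product(struct_dirs, temps, presses):
        parts = []
        if temp is not None:
            parts.append(f"T{temp}")
        if press is not None:
            parts.append(f"P{press}")
        # Fall back to a fixed name when neither quantity is swept (naming is an assumption).
        name = "_".join(parts) if parts else "md"
        task_dirs.append(str(Path(sdir) / name))
    return task_dirs


task_dirs = make_task_dirs(["s1", "s2"], temperature_list=[300, 600], press_list=[0])
```

With two structures, two temperatures, and one pressure, this yields four task directories.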
libal_md_lammps
¤
Library for LAMMPS MD with SevenNet model.
Classes:
-
OperAlmdLammpsSevennet–This class runs LAMMPS MD for a list of structures in task_dirs.
Functions:
-
premd_lammps_sevenn–Prepare MD args.
-
temperature_press_mdarg_lammps–Generate the task_dirs for ranges of temperatures and stresses.
OperAlmdLammpsSevennet(work_dir, pdict, multi_mdict, mdict_prefix='md')
¤
Bases: RemoteOperation
This class runs LAMMPS MD for a list of structures in task_dirs.
Methods:
-
prepare–Prepare MD tasks.
-
postprocess– -
run–Function to submit jobs to remote machines.
Attributes:
-
op_name– -
task_filter– -
work_dir– -
pdict– -
mdict_list– -
task_dirs– -
commandlist_list(list[list[str]]) – -
forward_files(list[str]) – -
backward_files(list[str]) – -
forward_common_files(list[str]) – -
backward_common_files(list[str]) –
op_name = 'LAMMPS MD with SevenNet'
instance-attribute
¤
task_filter = {'has_files': ['conf.lmpdata'], 'no_files': ['committee_error.txt']}
instance-attribute
¤
work_dir = work_dir
instance-attribute
¤
pdict = pdict
instance-attribute
¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix)
instance-attribute
¤
task_dirs = self._load_task_dirs()
instance-attribute
¤
commandlist_list: list[list[str]]
instance-attribute
¤
forward_files: list[str]
instance-attribute
¤
backward_files: list[str]
instance-attribute
¤
forward_common_files: list[str]
instance-attribute
¤
backward_common_files: list[str] = []
instance-attribute
¤
prepare()
¤
Prepare MD tasks.
Includes:
- Prepare the task_list
- Prepare forward & backward files
- Prepare commandlist_list for multi-remote submission
postprocess()
¤
run()
¤
Function to submit jobs to remote machines.
Note
- The original task_dirs is relative to run_dir and should not be changed. The submission function, however, needs task_dirs as paths relative to work_dir, so a temporary change is made here.
premd_lammps_sevenn(work_dir, pdict, mdict)
¤
Prepare MD args.
Includes:
- Copy ML models to work_dir
- Collect initial configurations
- Prepare LAMMPS args
- Generate task_dirs for ranges of temperature and pressure
temperature_press_mdarg_lammps(struct_dirs: list, temperature_list: list = [], press_list: list = [], lammps_argdict: dict = {}) -> list
¤
Generate the task_dirs for ranges of temperatures and stresses.
Parameters:
-
struct_dirs(list) –List of directories containing configuration files.
-
temperature_list(list, default:[]) –List of temperatures.
-
press_list(list, default:[]) –List of stresses.
-
lammps_argdict(dict, default:{}) –See lammps.md schema
utilal
¤
Utilities for Active Learning workflow.
Classes:
-
D3ParamMD–Different packages use different names for D3 parameters.
-
MLP2Lammps–Convert MLP model to be used in LAMMPS.
D3ParamMD(d3package: str = 'sevenn')
¤
Different packages use different names for D3 parameters. This class returns the conventional D3 parameter names for the different packages used for MD.
Note
- The default cutoff values are 95 Bohr (50.2718 Angstrom) for two-body dispersion calculations and 40 Bohr (21.1671 Angstrom) for coordination number and three-body calculations, as in the ASE-DFTD3 package. Other packages may use different default values.
- Some dftd3 parameters for DFT calculations support three-body interactions, but most MD packages only support pairwise interactions, so the three-body cutoff parameter is not included in this class.
Methods:
-
get_params–Return D3 parameter names according to different packages.
-
check_supported_damping–Check if the damping method is supported in the selected package.
-
angstrom2bohr–Convert Angstrom to Bohr.
-
angstrom2bohr2–Convert Angstrom to Bohr^2. For use in the sevenn package.
-
bohr2angstrom–Convert Bohr to Angstrom.
Attributes:
-
d3package(str) – -
default_twobody_cutoff(float) – -
default_cn_cutoff(float) – -
param_names– -
damping_map–
d3package: str = d3package
instance-attribute
¤
default_twobody_cutoff: float = 50.2718
instance-attribute
¤
default_cn_cutoff: float = 21.1671
instance-attribute
¤
param_names = params['params']
instance-attribute
¤
damping_map = params['damping_map']
instance-attribute
¤
get_params() -> dict
¤
Return D3 parameter names according to different packages.
check_supported_damping(damping: str)
¤
Check if the damping method is supported in the selected package.
angstrom2bohr(angstrom_value: float) -> float
staticmethod
¤
Convert Angstrom to Bohr.
Note: in simple-dftd3, 60*Bohr converts 60 Bohr to Angstrom.
angstrom2bohr2(angstrom_value: float) -> float
staticmethod
¤
Convert Angstrom to Bohr^2. For use in the sevenn package.
bohr2angstrom(bohr_value: float) -> float
staticmethod
¤
Convert Bohr to Angstrom.
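The default cutoffs quoted for this class (95 Bohr = 50.2718 Angstrom, 40 Bohr = 21.1671 Angstrom) can be checked with a plain unit conversion. The sketch below is a standalone illustration using the CODATA 2018 Bohr radius; the constant name BOHR_IN_ANGSTROM is an assumption, and the functions are free functions rather than the class's static methods.

```python
# Bohr radius in Angstrom (CODATA 2018 value).
BOHR_IN_ANGSTROM = 0.529177210903


def bohr2angstrom(bohr_value: float) -> float:
    """Convert a length from Bohr to Angstrom."""
    return bohr_value * BOHR_IN_ANGSTROM


def angstrom2bohr(angstrom_value: float) -> float:
    """Convert a length from Angstrom to Bohr."""
    return angstrom_value / BOHR_IN_ANGSTROM


# Reproduce the two default cutoffs, rounded to four decimals.
twobody_cutoff = round(bohr2angstrom(95.0), 4)
cn_cutoff = round(bohr2angstrom(40.0), 4)
```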
MLP2Lammps(mlp_model: str = 'sevenn')
¤
Convert MLP model to be used in LAMMPS.
Methods:
-
convert–Convert MLP model to LAMMPS format.
-
convert_sevenn–Convert sevenn model to be used in LAMMPS.
-
convert_sevenn_mliap–Convert sevenn model to be used in LAMMPS MLIAP.
Attributes:
mlp_model: str = mlp_model
instance-attribute
¤
convert(checkpoint: str, outfile: str = 'deployed.pt', **kwargs)
¤
convert_sevenn(checkpoint: str, outfile: str = 'deploy_sevenn', modal: str | None = None, use_flash: bool = False, parallel_type=False, **kwargs)
staticmethod
¤
Convert sevenn model to be used in LAMMPS.
Parameters:
-
checkpoint(str) –Path to checkpoint file of sevenn model.
-
outfile(str, default:'deploy_sevenn') –Path to output LAMMPS potential file.
-
modal(str, default:None) –Channel of multi-task model.
-
parallel_type(bool, default:False) –Convert to a potential for running parallel simulations.
-
use_flash(bool, default:False) –Use flashTP.
-
**kwargs–Additional arguments to avoid breaking the function signature when future additional arguments are added.
Note
- Single mode: generates the file "outfile.pt".
- Parallel mode: generates files "outfile/deployed_parallel_0.pt", "outfile/deployed_parallel_1.pt", ...
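The naming convention in the note above can be written down as a small helper, which may be useful when listing the files to transfer to a remote machine. This is only an illustration of the naming pattern: expected_deploy_files and its n_parts argument are hypothetical, and the actual number of files produced in parallel mode is decided by the conversion itself, not by the caller.

```python
from pathlib import PurePosixPath


def expected_deploy_files(outfile: str, parallel: bool = False, n_parts: int = 1) -> list[str]:
    """List the file names described in the note for single and parallel mode."""
    if not parallel:
        # Single mode: one file named "<outfile>.pt".
        return [f"{outfile}.pt"]
    # Parallel mode: numbered files inside a directory named "<outfile>".
    return [str(PurePosixPath(outfile) / f"deployed_parallel_{i}.pt") for i in range(n_parts)]


single_files = expected_deploy_files("deploy_sevenn")
parallel_files = expected_deploy_files("deploy_sevenn", parallel=True, n_parts=2)
```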
convert_sevenn_mliap(checkpoint: str, outfile: str = 'deploy_sevenn_mliap.pt', modal: str | None = None, use_cueq: bool = False, use_flash: bool = False, use_oeq: bool = False, **kwargs)
staticmethod
¤
Convert sevenn model to be used in LAMMPS MLIAP.
Parameters:
-
checkpoint(str) –Path to checkpoint file of sevenn model.
-
outfile(str, default:'deploy_sevenn_mliap.pt') –Path to output LAMMPS potential file.
-
modal(str, default:None) –Channel of multi-task model.
-
use_cueq(bool, default:False) –Use cueq. cuEquivariance is only supported in the ML-IAP interface.
-
use_flash(bool, default:False) –Use flashTP.
-
use_oeq(bool, default:False) –Use oeq.
-
**kwargs–Additional arguments to avoid breaking the function signature when future additional arguments are added.
utilal_uncertainty
¤
Utilities for uncertainty estimation using a committee of models.
- DO NOT import any alff libs in this file, since this file will be used remotely.
Classes:
-
ModelCommittee–A class to manage a committee of models for uncertainty estimation.
Functions:
-
simple_lmpdump2extxyz–Convert a LAMMPS dump file to an extended xyz file. This is a very simple version that converts only atomic positions, not the stress tensor.
-
chunk_list–Yield successive n-sized chunks from input_list.
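Yielding successive n-sized chunks is a standard slicing pattern. The sketch below shows one common way to write it; the parameter names follow the one-line description above, but the exact signature of the library function is an assumption.

```python
from collections.abc import Iterator


def chunk_list(input_list: list, n: int) -> Iterator[list]:
    """Yield successive n-sized chunks from input_list; the last chunk may be shorter."""
    for start in range(0, len(input_list), n):
        yield input_list[start:start + n]


chunks = list(chunk_list(list(range(7)), 3))
```

Because this file is meant to run remotely without other alff imports, a helper like this relies only on the standard library.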
ModelCommittee(mlp_model: str, model_files: list[str], calc_kwargs: dict | None = None, compute_stress: bool = False, rel_force: float | None = None, rel_stress: float | None = None, e_std_lo: float = 0.05, e_std_hi: float = 0.1, f_std_lo: float = 0.05, f_std_hi: float = 0.1, s_std_lo: float = 0.05, s_std_hi: float = 0.1, block_size: int = 1000)
¤
A class to manage a committee of models for uncertainty estimation.
Parameters:
-
mlp_model(str) –MLP model engine, e.g., 'sevenn'.
-
model_files(list[str]) –List of model files for the committee.
-
calc_kwargs(dict, default:None) –Additional arguments for the MLP calculator. Defaults to {}.
-
compute_stress(bool, default:False) –Whether to compute stress. Defaults to False.
-
rel_force(float, default:None) –Relative force to normalize force std. Defaults to None.
-
rel_stress(float, default:None) –Relative stress to normalize stress std. Defaults to None.
-
e_std_lo(float, default:0.05) –energy std low. Defaults to 0.05.
-
e_std_hi(float, default:0.1) –energy std high. Defaults to 0.1.
-
f_std_lo(float, default:0.05) –force std low. Defaults to 0.05.
-
f_std_hi(float, default:0.1) –force std high. Defaults to 0.1.
-
s_std_lo(float, default:0.05) –stress std low. Defaults to 0.05.
-
s_std_hi(float, default:0.1) –stress std high. Defaults to 0.1.
-
block_size(int, default:1000) –Block size of configurations to compute 'committee error' at once, just to avoid flooding memory. Defaults to 1000.
Note
- Consider using @staticmethod for some functions to avoid recursive messing.
Methods:
-
compute_committee_error_blockwise–Compute committee error for energy, forces, and stress for multiple configurations in a block-wise manner.
-
committee_judge–Decide whether a configuration is candidate, accurate, or inaccurate based on committee error.
-
select_candidate–Select candidate configurations for DFT calculation.
-
remove_inaccurate–Remove inaccurate configurations based on committee error. This is used to revise the dataset.
Attributes:
-
mlp_model– -
model_files– -
calc_kwargs– -
compute_stress– -
rel_force– -
rel_stress– -
block_size– -
e_std_lo– -
e_std_hi– -
f_std_lo– -
f_std_hi– -
s_std_lo– -
s_std_hi– -
calc_list– -
committee_error_file(str) – -
committee_judge_file(str) –
mlp_model = mlp_model
instance-attribute
¤
model_files = model_files
instance-attribute
¤
calc_kwargs = calc_kwargs or {}
instance-attribute
¤
compute_stress = compute_stress
instance-attribute
¤
rel_force = rel_force
instance-attribute
¤
rel_stress = rel_stress
instance-attribute
¤
block_size = block_size
instance-attribute
¤
e_std_lo = e_std_lo
instance-attribute
¤
e_std_hi = e_std_hi
instance-attribute
¤
f_std_lo = f_std_lo
instance-attribute
¤
f_std_hi = f_std_hi
instance-attribute
¤
s_std_lo = s_std_lo
instance-attribute
¤
s_std_hi = s_std_hi
instance-attribute
¤
calc_list = self._get_calc_list()
instance-attribute
¤
committee_error_file: str = 'committee_error.txt'
instance-attribute
¤
committee_judge_file: str = 'committee_judge_summary.yml'
instance-attribute
¤
compute_committee_error_blockwise(struct_list: list[Atoms])
¤
Compute committee error for energy, forces, and stress for multiple configurations in a block-wise manner.
Parameters:
-
struct_list(list[Atoms]) –List of Atoms objects.
Note
The output file is controlled by the class attribute self.committee_error_file.
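A committee error measures how much the committee models disagree on the same configuration. The sketch below uses one common definition, assumed here for illustration and not confirmed as this class's formula: the population standard deviation of the predicted energies, and the largest per-atom standard deviation of the predicted force vectors. The function names and the nested-list data layout are hypothetical.

```python
import math
from statistics import pstdev


def energy_std(energies: list[float]) -> float:
    """Standard deviation of one configuration's energy across committee models."""
    return pstdev(energies)


def max_force_std(forces: list[list[tuple[float, float, float]]]) -> float:
    """Largest per-atom force deviation; forces[m][a] is the force on atom a from model m."""
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for a in range(n_atoms):
        # Committee-mean force vector on atom a.
        mean = [sum(forces[m][a][k] for m in range(n_models)) / n_models for k in range(3)]
        # Mean squared distance of each model's force from the committee mean.
        var = sum(
            sum((forces[m][a][k] - mean[k]) ** 2 for k in range(3)) for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst


# Three models predicting the energy of one configuration.
e_std = energy_std([-10.0, -10.2, -9.8])

# Two models, two atoms: the models disagree only on the first atom.
f_std = max_force_std([
    [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
])
```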
committee_judge() -> tuple[np.ndarray, np.ndarray, np.ndarray]
¤
Decide whether a configuration is candidate, accurate, or inaccurate based on committee error.
Returns:
-
committee_judge_file(s) –Files containing candidate, accurate, and inaccurate configurations.
Note
- To select candidates based on energy only, set f_std_hi and s_std_hi to very large values, so that the criteria for those terms are always met.
- Similarly, to select candidates based on energy and force only, set s_std_hi to a very large value, e.g., s_std_hi=1e6.
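The source does not spell out the exact decision rule, so the sketch below assumes a convention often used in committee-based active learning: a configuration is inaccurate if any term reaches its high threshold, a candidate if any term reaches its low threshold, and accurate otherwise. The function judge and the dictionary layout are hypothetical. The example shows the effect described in the note: a very large s_std_hi means the stress term can never exceed its upper bound.

```python
def judge(stds: dict[str, float], lo: dict[str, float], hi: dict[str, float]) -> str:
    """Classify one configuration from its committee errors (assumed rule, see lead-in)."""
    if any(stds[key] >= hi[key] for key in stds):
        return "inaccurate"
    if any(stds[key] >= lo[key] for key in stds):
        return "candidate"
    return "accurate"


lo = {"e": 0.05, "f": 0.05, "s": 0.05}
hi = {"e": 0.1, "f": 0.1, "s": 0.1}

# A large stress deviation marks this configuration as inaccurate by default...
stds = {"e": 0.07, "f": 0.06, "s": 5.0}
default_label = judge(stds, lo, hi)

# ...but with s_std_hi=1e6 the stress term can no longer trigger that outcome.
relaxed_label = judge(stds, lo, {**hi, "s": 1e6})

accurate_label = judge({"e": 0.01, "f": 0.01, "s": 0.01}, lo, hi)
```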
select_candidate(extxyz_file: str)
¤
Select candidate configurations for DFT calculation.
Returns:
-
extxyz_file(str) –candidate configurations
Note: See parameters in functions committee_error and committee_judge.
remove_inaccurate(extxyz_file: str)
¤
Remove inaccurate configurations based on committee error. This is used to revise the dataset.
Returns:
-
extxyz_file(str) –Revised configurations.
Note
The blockwise functions require all configurations in a block to have the same number of atoms. If the input extxyz file contains configurations with different numbers of atoms, use block_size=1 when initializing the ModelCommittee class.
simple_lmpdump2extxyz(lmpdump_file: str, extxyz_file: str)
¤
Convert a LAMMPS dump file to an extended xyz file. This is a very simple version that converts only atomic positions, not the stress tensor.