alff.gdata
¤
Modules:
- convert_mpchgnet_to_xyz
- gendata – Data generation workflow implementation.
- libgen_gpaw
- util_dataset – Utility functions for handling dataset files.
convert_mpchgnet_to_xyz
¤
Functions:
Attributes:
gendata
¤
Data generation workflow implementation.
Classes:
- WorkflowGendata – Workflow for generating initial data for training ML models.
Functions:
- make_structure – Build structures based on input parameters.
- optimize_structure – Optimize the structures.
- sampling_space – Explore the sampling space.
- run_dft – Run DFT calculations.
- collect_data – Collect data from DFT simulations.
- copy_labeled_structure – Copy labeled structures.
- strain_dim – Scale a single spatial dimension of the structures.
- strain_x_dim – Scale the x dimension of the structures.
- strain_y_dim – Scale the y dimension of the structures.
- strain_z_dim – Scale the z dimension of the structures.
- perturb_structure – Perturb the structures.
WorkflowGendata(params_file: str, machines_file: str)
¤
Bases: Workflow
Workflow for generating initial data for training ML models.
Methods:
- run – The main function to run the workflow.
Attributes:
- stage_map
- wf_name
- params_file
- machines_file
- schema_file
- multi_mdicts
- pdict
- stage_list
stage_map = {'make_structure': make_structure, 'optimize_structure': optimize_structure, 'sampling_space': sampling_space, 'run_dft': run_dft, 'collect_data': collect_data}
instance-attribute
¤
wf_name = 'DATA GENERATION'
instance-attribute
¤
params_file = params_file
instance-attribute
¤
machines_file = machines_file
instance-attribute
¤
schema_file = schema_file
instance-attribute
¤
multi_mdicts = config_machine.multi_mdicts
instance-attribute
¤
pdict = Config.loadconfig(self.params_file)
instance-attribute
¤
stage_list = self._load_stage_list()
instance-attribute
¤
run()
¤
The main function to run the workflow. This default implementation works for simple workflows; more complex workflows (e.g. with iteration, such as active learning) need to reimplement this .run() function.
make_structure(pdict, mdict)
¤
Build structures based on input parameters.
optimize_structure(pdict, mdict)
¤
Optimize the structures.
sampling_space(pdict, mdict)
¤
Explore the sampling space.
Sampling space includes:
- Range of strains (in x, y, z directions) + range of temperatures
- Range of temperatures + range of stresses
Notes
- Structure paths are saved into 2 lists: original and sampling structure paths.
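The strain + temperature sampling grid described above can be sketched as a Cartesian product (the values below are hypothetical; the real workflow reads its ranges from the params file):

```python
from itertools import product

# Hypothetical sampling ranges; the real workflow reads these from pdict.
strains = [0.98, 1.00, 1.02]     # scale factors per direction
temperatures = [300, 600, 900]   # K

# Every (strain, temperature) pair defines one sampling point.
sampling_points = list(product(strains, temperatures))
print(len(sampling_points))  # 9
```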
run_dft(pdict, mdict)
¤
Run DFT calculations.
collect_data(pdict, mdict)
¤
Collect data from DFT simulations.
copy_labeled_structure(src_dir: str, dest_dir: str)
¤
Copy labeled structures.
- First, try to copy the labeled structure if it exists.
- If there is no labeled structure, copy the unlabeled structure.
strain_dim(struct_files: list[str], strain_list: list[float], dim: int) -> list[str]
¤
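A minimal sketch of scaling a single cell dimension, assuming fractional coordinates stay fixed while one lattice vector is scaled (strain_one_dim is a hypothetical helper, not the package function, which operates on structure files):

```python
import numpy as np

def strain_one_dim(cell, frac_positions, strain, dim):
    """Scale one lattice vector by `strain`; fractional coordinates stay
    fixed, so Cartesian positions scale along with the cell."""
    new_cell = np.array(cell, dtype=float)
    new_cell[dim] *= strain                 # scale a single lattice vector
    new_cart = frac_positions @ new_cell    # rebuild Cartesian positions
    return new_cell, new_cart

# Cubic 2 Angstrom cell with one atom at the body center
cell = np.eye(3) * 2.0
frac = np.array([[0.5, 0.5, 0.5]])
new_cell, new_cart = strain_one_dim(cell, frac, strain=1.1, dim=0)
# new_cell[0] is now ~[2.2, 0, 0]; the atom moves to ~[1.1, 1.0, 1.0]
```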
strain_x_dim(struct_files: list[str], strain_x_list: list[float]) -> list[str]
¤
Scale the x dimension of the structures.
strain_y_dim(struct_files: list[str], strain_y_list: list[float]) -> list[str]
¤
Scale the y dimension of the structures.
strain_z_dim(struct_files: list[str], strain_z_list: list[float]) -> list[str]
¤
Scale the z dimension of the structures.
perturb_structure(struct_files: list, perturb_num: int, perturb_disp: float)
¤
Perturb the structures.
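A minimal sketch of such a perturbation, assuming uniform random displacements bounded by perturb_disp (perturb_positions is a hypothetical helper; the package function operates on structure files):

```python
import numpy as np

def perturb_positions(positions, perturb_num, perturb_disp, seed=None):
    """Return `perturb_num` perturbed copies of `positions`, each with a
    uniform random displacement of at most `perturb_disp` per coordinate."""
    rng = np.random.default_rng(seed)
    return [positions + rng.uniform(-perturb_disp, perturb_disp,
                                    size=positions.shape)
            for _ in range(perturb_num)]

pos = np.zeros((4, 3))
copies = perturb_positions(pos, perturb_num=5, perturb_disp=0.05, seed=0)
# 5 copies, every displacement bounded by 0.05
```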
libgen_gpaw
¤
Classes:
- OperGendataGpawOptimize – This class does GPAW optimization for a list of structures in task_dirs.
- OperGendataGpawSinglepoint
- OperGendataGpawAIMD – See class OperGendataGpawOptimize for more details.
- OperAlGpawSinglepoint
OperGendataGpawOptimize(work_dir, pdict, multi_mdict, mdict_prefix='gpaw')
¤
Bases: RemoteOperation
This class does GPAW optimization for a list of structures in task_dirs.
Methods:
- prepare – Prepare the operation.
- postprocess – Remove unlabeled .extxyz files, keeping only the labeled ones.
- run – Function to submit jobs to remote machines.
Attributes:
- op_name
- task_filter
- work_dir
- pdict
- mdict_list
- task_dirs
- commandlist_list (list[list[str]])
- forward_files (list[str])
- backward_files (list[str])
- forward_common_files (list[str])
- backward_common_files (list[str])
op_name = 'GPAW optimize'
instance-attribute
¤
task_filter = {'has_files': [K.FILE_FRAME_UNLABEL], 'no_files': [K.FILE_FRAME_LABEL]}
instance-attribute
¤
work_dir = work_dir
instance-attribute
¤
pdict = pdict
instance-attribute
¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix)
instance-attribute
¤
task_dirs = self._load_task_dirs()
instance-attribute
¤
commandlist_list: list[list[str]]
instance-attribute
¤
forward_files: list[str]
instance-attribute
¤
backward_files: list[str]
instance-attribute
¤
forward_common_files: list[str]
instance-attribute
¤
backward_common_files: list[str] = []
instance-attribute
¤
prepare()
¤
Prepare the operation.
Includes:
- Prepare ase_args for GPAW and gpaw_run_file. Note: pdict.dft.calc_args.gpaw{} must be defined for this function.
- Prepare the task_list
- Prepare forward & backward files
- Prepare commandlist_list for multi-remote submission
postprocess()
¤
This function does:
- Remove unlabeled .extxyz files, keeping only the labeled ones.
run()
¤
Function to submit jobs to remote machines.
Note
- The original task_dirs is relative to run_dir and should not be changed. But the submission function needs task_dirs relative to work_dir, so we make a temporary change here.
OperGendataGpawSinglepoint(work_dir, pdict, multi_mdict, mdict_prefix='gpaw')
¤
Bases: OperGendataGpawOptimize
Methods:
- prepare
- postprocess – Remove unlabeled .extxyz files, keeping only the labeled ones.
- run – Function to submit jobs to remote machines.
Attributes:
- op_name
- work_dir
- pdict
- mdict_list
- task_dirs
- task_filter
- commandlist_list (list[list[str]])
- forward_files (list[str])
- backward_files (list[str])
- forward_common_files (list[str])
- backward_common_files (list[str])
op_name = 'GPAW singlepoint'
instance-attribute
¤
work_dir = work_dir
instance-attribute
¤
pdict = pdict
instance-attribute
¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix)
instance-attribute
¤
task_dirs = self._load_task_dirs()
instance-attribute
¤
task_filter = {'has_files': [K.FILE_FRAME_UNLABEL], 'no_files': [K.FILE_FRAME_LABEL]}
instance-attribute
¤
commandlist_list: list[list[str]]
instance-attribute
¤
forward_files: list[str]
instance-attribute
¤
backward_files: list[str]
instance-attribute
¤
forward_common_files: list[str]
instance-attribute
¤
backward_common_files: list[str] = []
instance-attribute
¤
prepare()
¤
postprocess()
¤
This function does:
- Remove unlabeled .extxyz files, keeping only the labeled ones.
run()
¤
Function to submit jobs to remote machines.
Note
- The original task_dirs is relative to run_dir and should not be changed. But the submission function needs task_dirs relative to work_dir, so we make a temporary change here.
OperGendataGpawAIMD(work_dir, pdict, multi_mdict, mdict_prefix='gpaw')
¤
Bases: RemoteOperation
See class OperGendataGpawOptimize for more details.
Methods:
- prepare – Refer to the pregen_gpaw_optimize() function.
- postprocess – Refer to the postgen_gpaw_optimize() function.
- run – Function to submit jobs to remote machines.
Attributes:
- op_name
- task_filter
- work_dir
- pdict
- mdict_list
- task_dirs
- commandlist_list (list[list[str]])
- forward_files (list[str])
- backward_files (list[str])
- forward_common_files (list[str])
- backward_common_files (list[str])
op_name = 'GPAW aimd'
instance-attribute
¤
task_filter = {'has_files': [K.FILE_FRAME_UNLABEL], 'no_files': [K.FILE_TRAJ_LABEL]}
instance-attribute
¤
work_dir = work_dir
instance-attribute
¤
pdict = pdict
instance-attribute
¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix)
instance-attribute
¤
task_dirs = self._load_task_dirs()
instance-attribute
¤
commandlist_list: list[list[str]]
instance-attribute
¤
forward_files: list[str]
instance-attribute
¤
backward_files: list[str]
instance-attribute
¤
forward_common_files: list[str]
instance-attribute
¤
backward_common_files: list[str] = []
instance-attribute
¤
prepare()
¤
Refer to the pregen_gpaw_optimize() function.
Note:
- This function differs from OperGendataGpawOptimize.prepare() in that ase_args is now in task_dirs (not in work_dir), so the forward files and commandlist_list are different.
- structure_dirs: contains the optimized structures without scaling.
- strain_structure_dirs: contains the scaled structures.
postprocess()
¤
Refer to the postgen_gpaw_optimize() function.
run()
¤
Function to submit jobs to remote machines.
Note
- The original task_dirs is relative to run_dir and should not be changed. But the submission function needs task_dirs relative to work_dir, so we make a temporary change here.
OperAlGpawSinglepoint(work_dir, pdict, multi_mdict, mdict_prefix='gpaw')
¤
Bases: OperGendataGpawOptimize
Methods:
- prepare
- postprocess – Do post-DFT tasks.
- run – Function to submit jobs to remote machines.
Attributes:
- op_name
- work_dir
- pdict
- mdict_list
- task_dirs
- task_filter
- commandlist_list (list[list[str]])
- forward_files (list[str])
- backward_files (list[str])
- forward_common_files (list[str])
- backward_common_files (list[str])
op_name = 'GPAW singlepoint'
instance-attribute
¤
work_dir = work_dir
instance-attribute
¤
pdict = pdict
instance-attribute
¤
mdict_list = self._select_machines(multi_mdicts, mdict_prefix)
instance-attribute
¤
task_dirs = self._load_task_dirs()
instance-attribute
¤
task_filter = {'has_files': [K.FILE_FRAME_UNLABEL], 'no_files': [K.FILE_FRAME_LABEL]}
instance-attribute
¤
commandlist_list: list[list[str]]
instance-attribute
¤
forward_files: list[str]
instance-attribute
¤
backward_files: list[str]
instance-attribute
¤
forward_common_files: list[str]
instance-attribute
¤
backward_common_files: list[str] = []
instance-attribute
¤
prepare()
¤
postprocess()
¤
Do post-DFT tasks.
run()
¤
Function to submit jobs to remote machines.
Note
- The original task_dirs is relative to run_dir and should not be changed. But the submission function needs task_dirs relative to work_dir, so we make a temporary change here.
util_dataset
¤
Utility functions for handling dataset files.
Functions:
- split_extxyz_dataset – Split a dataset into training, validation, and test sets.
- read_list_extxyz – Read a list of EXTXYZ files and return a list of ASE Atoms objects.
- merge_extxyz_files – Unify multiple EXTXYZ files into a single file.
- change_key_in_extxyz – Change keys in an extxyz file.
- remove_key_in_extxyz – Remove unwanted keys from an extxyz file to keep it clean.
- select_structs_from_extxyz – Choose frames from an extxyz trajectory file, based on some criteria.
- sort_atoms_by_position – Sort the atoms in an Atoms object based on their Cartesian positions.
- are_structs_identical – Check if two Atoms objects are identical by first sorting them and then comparing their attributes.
- are_structs_equivalent – Check if two Atoms objects are equivalent using ase.utils.structure_comparator.SymmetryEquivalenceCheck.compare().
- remove_duplicate_structs_serial – Check for duplicate structs in an extxyz file.
- remove_duplicate_structs_hash – Remove duplicate structures using hashing (very fast).
split_extxyz_dataset(extxyz_files: list[str], train_ratio: float = 0.9, valid_ratio: float = 0.1, seed: int | None = None, outfile_prefix: str = 'dataset')
¤
Split a dataset into training, validation, and test sets.
If train_ratio + valid_ratio < 1, the remaining data is used as the test set.
Parameters:
- extxyz_files (list[str]) – List of file paths in EXTXYZ format.
- train_ratio (float, default: 0.9) – Ratio of the training set. Defaults to 0.9.
- valid_ratio (float, default: 0.1) – Ratio of the validation set. Defaults to 0.1.
- seed (int | None, default: None) – Random seed. Defaults to None.
- outfile_prefix (str, default: 'dataset') – Prefix for output file names. Defaults to "dataset".
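The split logic can be sketched with plain index arithmetic (split_indices is a hypothetical helper; the actual function reads and writes EXTXYZ files):

```python
import numpy as np

def split_indices(n, train_ratio=0.9, valid_ratio=0.1, seed=None):
    """Shuffle indices 0..n-1 and cut them into train/valid/test arrays.
    Whatever remains after train + valid becomes the test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * train_ratio)
    n_valid = int(n * valid_ratio)
    return (idx[:n_train],
            idx[n_train:n_train + n_valid],
            idx[n_train + n_valid:])

train, valid, test = split_indices(100, train_ratio=0.8, valid_ratio=0.1,
                                   seed=42)
print(len(train), len(valid), len(test))  # 80 10 10
```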
read_list_extxyz(extxyz_files: list[str]) -> list[Atoms]
¤
Read a list of EXTXYZ files and return a list of ASE Atoms objects.
merge_extxyz_files(extxyz_files: list[str], outfile: str, sort_natoms: bool = False, sort_composition: bool = False, sort_pbc_len: bool = False)
¤
Unify multiple EXTXYZ files into a single file.
Parameters:
- extxyz_files (list[str]) – List of EXTXYZ file paths.
- outfile (str) – Output file path.
- sort_natoms (bool, default: False) – Sort by number of atoms. Defaults to False.
- sort_composition (bool, default: False) – Sort by chemical composition. Defaults to False.
- sort_pbc_len (bool, default: False) – Sort by periodic length. Defaults to False.
Note
- np.lexsort is used to sort by multiple criteria; np.argsort is used to sort by a single criterion.
- np.lexsort does not support descending order, so we reverse the sorted indices using idx[::-1].
- If multiple sorting criteria are provided, they are applied from the last key to the first key (i.e., the last key in the list is the primary sort key). Example: np.lexsort((key1, key2)) sorts by key2 first, then by key1.
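The key-ordering behavior of np.lexsort described above can be checked directly (the arrays below are made-up sort keys, standing in for per-frame properties):

```python
import numpy as np

# Sort frames by natoms first, then by a composition key within equal natoms.
natoms      = np.array([8, 4, 8, 4])
composition = np.array([2, 1, 1, 2])

# The LAST key in the tuple is the primary sort key.
order = np.lexsort((composition, natoms))
print(order)  # [1 3 2 0]

# np.lexsort has no descending option; reverse the indices instead.
desc = order[::-1]
print(desc)  # [0 2 3 1]
```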
change_key_in_extxyz(extxyz_file: str, key_pairs: dict[str, str])
¤
Change keys in extxyz file.
Parameters:
- extxyz_file (str) – Path to the extxyz file.
- key_pairs (dict) – Dictionary of key pairs {"old_key": "new_key"} to change. Example: {"energy": "ref_energy", "forces": "ref_forces", "stress": "ref_stress"}
Note
- If Atoms contains internal keys (e.g., energy, forces, stress, momenta, free_energy, ...), a SinglePointCalculator object is attached to the Atoms; these keys are stored in the dict atoms.calc.results, or can be accessed using the .get_*() methods.
- These internal keys are not stored in atoms.arrays or atoms.info. If we want to store (and access) these properties in atoms.arrays or atoms.info, we need to change these internal keys to custom keys (e.g., ref_energy, ref_forces, ref_stress, ref_momenta, ref_free_energy, ...).
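The renaming itself can be sketched over plain dicts (here results stands in for atoms.calc.results and info for atoms.info; the real function operates on the extxyz file via ASE):

```python
# Hypothetical sketch of moving calculator results into custom-keyed storage.
key_pairs = {"energy": "ref_energy", "forces": "ref_forces"}

results = {"energy": -1.23, "forces": [[0.0, 0.0, 0.1]], "stress": None}
info = {}

for old_key, new_key in key_pairs.items():
    if old_key in results:
        info[new_key] = results.pop(old_key)   # rename while moving

print(sorted(info))  # ['ref_energy', 'ref_forces']
```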
remove_key_in_extxyz(extxyz_file: str, key_list: list[str])
¤
Remove unwanted keys from extxyz file to keep it clean.
select_structs_from_extxyz(extxyz_file: str, has_symbols: list | None = None, only_symbols: list | None = None, exact_symbols: list | None = None, has_properties: list | None = None, only_properties: list | None = None, has_columns: list | None = None, only_columns: list | None = None, natoms: int | None = None, tol: float = 1e-06)
¤
Choose frames from a extxyz trajectory file, based on some criteria.
Parameters:
- extxyz_file (str) – Path to the extxyz file.
- has_symbols (list, default: None) – List of symbols; each frame must contain at least one of them.
- only_symbols (list, default: None) – List of symbols; each frame must contain only these symbols.
- exact_symbols (list, default: None) – List of symbols; each frame must contain exactly these symbols.
- has_properties (list, default: None) – List of properties; each frame must have at least one of them.
- only_properties (list, default: None) – List of properties; each frame must have only these properties.
- has_columns (list, default: None) – List of columns; each frame must have at least one of them.
- only_columns (list, default: None) – List of columns; each frame must have only these columns.
- natoms (int, default: None) – Total number of atoms in a frame.
- tol (float, default: 1e-06) – Tolerance for comparing floating point numbers.
sort_atoms_by_position(struct: Atoms) -> Atoms
¤
Sorts the atoms in an Atoms object based on their Cartesian positions.
are_structs_identical(input_struct1: Atoms, input_struct2: Atoms, tol=1e-06) -> bool
¤
Checks if two Atoms objects are identical by first sorting them and then comparing their attributes.
Parameters:
- input_struct1 (Atoms) – First Atoms object.
- input_struct2 (Atoms) – Second Atoms object.
- tol (float, default: 1e-06) – Tolerance for position comparison.
Returns:
- bool – True if the structures are identical, False otherwise.
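A minimal order-independent comparison in the same spirit (sort_by_position and structs_identical are hypothetical helpers operating on bare symbol/position arrays, not ASE Atoms objects):

```python
import numpy as np

def sort_by_position(symbols, positions):
    """Sort atoms lexicographically by (x, y, z)."""
    positions = np.asarray(positions, dtype=float)
    order = np.lexsort((positions[:, 2], positions[:, 1], positions[:, 0]))
    return [symbols[i] for i in order], positions[order]

def structs_identical(sym1, pos1, sym2, pos2, tol=1e-6):
    """Order-independent comparison of two structures."""
    if len(sym1) != len(sym2):
        return False
    s1, p1 = sort_by_position(sym1, pos1)
    s2, p2 = sort_by_position(sym2, pos2)
    return s1 == s2 and np.allclose(p1, p2, atol=tol)

a = (["O", "H"], [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = (["H", "O"], [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])  # same atoms, reordered
print(structs_identical(a[0], a[1], b[0], b[1]))  # True
```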
are_structs_equivalent(struct1: Atoms, struct2: Atoms) -> bool
¤
Check if two Atoms objects are equivalent using ase.utils.structure_comparator.SymmetryEquivalenceCheck.compare().
Parameters:
- struct1 (Atoms) – First Atoms object.
- struct2 (Atoms) – Second Atoms object.
Returns:
- bool – True if the structures are equivalent, False otherwise.
Note
- It is not clear what exactly counts as "equivalent".
remove_duplicate_structs_serial(extxyz_file: str, tol=1e-06) -> None
¤
Check for duplicate structs in an extxyz file.
remove_duplicate_structs_hash(extxyz_file: str, seen_extxyz: str | None = None, tol: float = 1e-06, backup: bool = True) -> None
¤
Remove duplicate structures using hashing (very fast).
- Much less memory overhead compared to pairwise are_structs_identical calls.
- This reduces duplicate checking to O(N) instead of O(N²). No parallelism is needed; it is already O(N).
Parameters:
- extxyz_file (str) – Path to the extxyz file.
- seen_extxyz (str | None, default: None) – Optional path to an extxyz file to be included in the set of seen structures. Defaults to None.
- tol (float, default: 1e-06) – Tolerance for comparing atomic positions. Defaults to 1e-6.
- backup (bool, default: True) – Whether to create a backup of the original file. Defaults to True.
Note
- Using reversed() does not modify the original list, and it is memory-efficient (no copy).
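The hashing idea can be sketched as follows: round positions onto the tolerance grid, canonicalize the atom order, and keep a set of seen hashes (struct_hash and dedupe are hypothetical helpers, not the package functions):

```python
import numpy as np

def struct_hash(symbols, positions, tol=1e-6):
    """Round positions onto a `tol` grid so nearly identical structures
    collide, canonicalize the atom order, then hash."""
    pos = np.round(np.asarray(positions, dtype=float) / tol).astype(np.int64)
    order = np.lexsort((pos[:, 2], pos[:, 1], pos[:, 0]))
    return hash((tuple(symbols[i] for i in order), pos[order].tobytes()))

def dedupe(structs, tol=1e-6):
    seen, unique = set(), []
    for symbols, positions in structs:
        h = struct_hash(symbols, positions, tol)
        if h not in seen:            # O(1) set lookup -> O(N) overall
            seen.add(h)
            unique.append((symbols, positions))
    return unique

frames = [
    (["H"], [[0.0, 0.0, 0.0]]),
    (["H"], [[0.0, 0.0, 1e-9]]),   # duplicate within tol
    (["H"], [[0.5, 0.0, 0.0]]),
]
print(len(dedupe(frames)))  # 2
```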