Combined Reader

class mlip.data.chemical_systems_readers.combined_reader.CombinedReader(chemical_systems_readers: list[ChemicalSystemsReader])

Wrapper around a list of ChemicalSystemsReader objects that combines the results of loading data from each reader.

__init__(chemical_systems_readers: list[ChemicalSystemsReader])

Constructor.

Parameters:

chemical_systems_readers – The list of readers to use to load data.
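
The wrapper pattern described above can be sketched with stand-in classes instead of the mlip ones. `ReaderLike`, `StubReader` and `CombinedSketch` are hypothetical names introduced only for this example. Concatenating each split across readers, in reader order, is an assumption about how the results are combined. Postprocessing is left out to keep the sketch short.

```python
from typing import Protocol


class ReaderLike(Protocol):
    """Stand-in for the ChemicalSystemsReader interface (hypothetical)."""

    def load(self) -> tuple[list, list, list]: ...


class StubReader:
    """Hypothetical reader that returns fixed train/validation/test lists."""

    def __init__(self, train: list, valid: list, test: list) -> None:
        self._splits = (train, valid, test)

    def load(self) -> tuple[list, list, list]:
        # Return copies so callers cannot mutate the stored splits.
        return tuple(list(split) for split in self._splits)


class CombinedSketch:
    """Assumed behaviour: concatenate each split across readers, in order."""

    def __init__(self, readers: list[ReaderLike]) -> None:
        self.readers = readers

    def load(self) -> tuple[list, list, list]:
        train, valid, test = [], [], []
        for reader in self.readers:
            r_train, r_valid, r_test = reader.load()
            train.extend(r_train)
            valid.extend(r_valid)
            test.extend(r_test)
        return train, valid, test


combined = CombinedSketch(
    [
        StubReader(["a1", "a2"], ["a3"], []),
        StubReader(["b1"], [], ["b2"]),
    ]
)
train, valid, test = combined.load()
```

With the real class, the list passed to the constructor would hold concrete ChemicalSystemsReader instances rather than stubs.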

load(postprocess_fun: Callable[[list[ChemicalSystem], list[ChemicalSystem], list[ChemicalSystem]], tuple[list[ChemicalSystem], list[ChemicalSystem], list[ChemicalSystem]]] | None = <function filter_systems_with_unseen_atoms_and_assign_atomic_species>) → tuple[list[ChemicalSystem], list[ChemicalSystem], list[ChemicalSystem]]

Loads the dataset from each reader into the internal format and combines the resulting lists of ChemicalSystem objects.

Parameters:

postprocess_fun – Function called to postprocess the loaded dataset before it is returned. It accepts the train, validation and test systems (each a list[ChemicalSystem]), applies some postprocessing (filtering, for example) and returns the postprocessed train, validation and test systems. If postprocess_fun is None, no postprocessing is done. The default is filter_systems_with_unseen_atoms_and_assign_atomic_species(), which assigns atomic species to the ChemicalSystem objects and removes from the validation and test sets any system containing a chemical element that is not present in the train systems.

Returns:

A tuple of loaded training, validation and test datasets (in this order). The internal format is a list of ChemicalSystem objects.
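
To illustrate the shape a custom postprocess_fun might take, here is a minimal sketch that follows the documented signature: three lists in, a tuple of three lists out. `SystemStub` is a hypothetical stand-in for ChemicalSystem, and its `atomic_numbers` field is an assumption made for this example. The function only mirrors the filtering half of the default behaviour and does not assign atomic species.

```python
from dataclasses import dataclass


@dataclass
class SystemStub:
    """Stand-in for ChemicalSystem; `atomic_numbers` is an assumed field."""

    atomic_numbers: tuple[int, ...]


Splits = tuple[list[SystemStub], list[SystemStub], list[SystemStub]]


def drop_unseen_elements(
    train: list[SystemStub],
    valid: list[SystemStub],
    test: list[SystemStub],
) -> Splits:
    """Keep only validation/test systems whose elements all occur in train."""
    seen = {z for system in train for z in system.atomic_numbers}

    def known(system: SystemStub) -> bool:
        return set(system.atomic_numbers) <= seen

    return train, [s for s in valid if known(s)], [s for s in test if known(s)]


train = [SystemStub((1, 8)), SystemStub((6, 1))]
valid = [SystemStub((1, 6)), SystemStub((7, 1))]  # second system contains N
test = [SystemStub((8,)), SystemStub((16, 8))]  # second system contains S

train_out, valid_out, test_out = drop_unseen_elements(train, valid, test)
```

A function of this shape could be passed as postprocess_fun to load(). Passing None instead skips postprocessing entirely, as described in the parameter documentation.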