Bond length distribution

class mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.BondLengthDistributionBenchmark(force_field: ForceField | Calculator, data_input_dir: str | PathLike = './data', run_mode: RunMode | Literal['dev', 'fast', 'standard'] = RunMode.STANDARD)

Benchmark for small organic molecule bond length distribution.

name

The unique benchmark name that should be used to run the benchmark from the CLI and that will determine the output folder name for the result file. The name is bond_length_distribution.

Type:

str

category

A string that describes the category of the benchmark, used, for example, for grouping in the UI app. If not overridden, the default is “General”. This benchmark’s category is “Small Molecules”.

Type:

str

result_class

A reference to the type of BenchmarkResult that will determine the return type of self.analyze(). The result class type is BondLengthDistributionResult.

Type:

type[mlipaudit.benchmark.BenchmarkResult] | None

model_output_class

A reference to the BondLengthDistributionModelOutput class.

Type:

type[mlipaudit.benchmark.ModelOutput] | None

required_elements

The set of element types that are present in the benchmark’s input files.

Type:

set[str] | None

skip_if_elements_missing

Whether the benchmark should be skipped entirely if there are some element types that the model cannot handle. If False, the benchmark must have its own custom logic to handle missing element types. For this benchmark, the attribute is set to True.

Type:

bool

__init__(force_field: ForceField | Calculator, data_input_dir: str | PathLike = './data', run_mode: RunMode | Literal['dev', 'fast', 'standard'] = RunMode.STANDARD) → None

Initializes the benchmark.

Parameters:
  • force_field – The force field model to be benchmarked.

  • data_input_dir – The local input data directory. Defaults to “./data”. If the subdirectory “{data_input_dir}/{benchmark_name}” exists, the benchmark expects the relevant data to be there; otherwise, it downloads the data from HuggingFace.

  • run_mode – Whether to run the standard benchmark length, a faster version, or a very fast development version. Subclasses should ensure that with RunMode.DEV their benchmark runs in a much shorter timeframe, for instance by running on a reduced number of test cases. Making RunMode.FAST differ from RunMode.STANDARD is optional and only recommended for very long-running benchmarks. This argument can also be passed as a string: “dev”, “fast”, or “standard”.
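The data_input_dir lookup described above can be sketched as follows. This is only an illustration of the documented rule, not the library's code: find_local_benchmark_data is a hypothetical helper, and a None return value stands in for the case where the benchmark would download its data from HuggingFace.

```python
import tempfile
from pathlib import Path


def find_local_benchmark_data(data_input_dir, benchmark_name):
    """Return the benchmark's local data directory, or None if it is absent.

    Hypothetical helper: None represents the case in which the benchmark
    would fetch its data from HuggingFace instead of reading local files.
    """
    candidate = Path(data_input_dir) / benchmark_name
    return candidate if candidate.is_dir() else None


# Demonstrate both cases inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    # The subdirectory does not exist yet, so nothing is found locally.
    missing = find_local_benchmark_data(tmp, "bond_length_distribution")

    # After creating the subdirectory, the lookup resolves to it.
    (Path(tmp) / "bond_length_distribution").mkdir()
    found = find_local_benchmark_data(tmp, "bond_length_distribution")
    found_name = found.name
```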

Raises:
  • ChemicalElementsMissingError – If initialization is attempted with a force field that cannot perform inference on the required elements.

  • ValueError – If the force field type is not compatible.
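As a rough illustration of the element check behind ChemicalElementsMissingError, the sketch below compares a benchmark's required_elements with the elements a model supports. MissingElementsError and check_elements are hypothetical stand-ins, and raising whenever any required element is unsupported is an assumption about the behavior, not a description of the library's implementation.

```python
class MissingElementsError(Exception):
    """Hypothetical stand-in for ChemicalElementsMissingError."""


def check_elements(required_elements, supported_elements):
    """Raise if the model lacks any element the benchmark requires.

    Assumption: the check is a plain set difference between the required
    and the supported element symbols.
    """
    missing = set(required_elements) - set(supported_elements)
    if missing:
        raise MissingElementsError(
            f"Model cannot handle elements: {sorted(missing)}"
        )


required = {"C", "H", "N", "O"}

# A model that covers every required element passes silently.
check_elements(required, {"C", "H", "N", "O", "S"})

# A model that only knows C and H triggers the error.
try:
    check_elements(required, {"C", "H"})
    error_message = None
except MissingElementsError as exc:
    error_message = str(exc)
```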

run_model() → None

Run an MD simulation for each structure.

The MD simulation is performed using the JAX MD engine and starts from the reference structure. The simulation state is stored in the model_output attribute.

analyze() → BondLengthDistributionResult

Calculate how much chemical bonds deviate from the equilibrium bond length.

The deviation of the length of the bond specified by the SMARTS pattern is measured throughout the simulation. The equilibrium bond length is taken from the reference structure.

Returns:

A BondLengthDistributionResult object.

Raises:

RuntimeError – If called before run_model().
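The measurement described for analyze() can be sketched in plain Python. The sketch makes several assumptions: the SMARTS match has already been resolved to a pair of atom indices, the deviation is the absolute difference from the reference bond length, and positions are plain coordinate tuples. The function names are illustrative and not part of mlipaudit.

```python
import math


def bond_length(positions, i, j):
    """Euclidean distance between atoms i and j in one frame."""
    return math.dist(positions[i], positions[j])


def deviation_trajectory(frames, reference, bond):
    """Absolute deviation from the reference bond length for every frame.

    Assumption: the equilibrium length is the bond length measured in the
    reference structure, and the deviation is taken as an absolute value.
    """
    i, j = bond
    ref_length = bond_length(reference, i, j)
    return [abs(bond_length(frame, i, j) - ref_length) for frame in frames]


# Two-atom toy system with a reference bond length of 1.5.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
frames = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],   # unchanged
    [(0.0, 0.0, 0.0), (1.75, 0.0, 0.0)],  # stretched by 0.25
    [(0.0, 0.0, 0.0), (1.25, 0.0, 0.0)],  # compressed by 0.25
]

deviations = deviation_trajectory(frames, reference, (0, 1))
avg_deviation = sum(deviations) / len(deviations)
```

Here deviations plays the role of a molecule's deviation_trajectory and avg_deviation the role of its per-molecule average.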

class mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.BondLengthDistributionResult(*, failed: bool = False, score: Annotated[float | None, Ge(ge=0), Le(le=1)] = None, molecules: list[BondLengthDistributionMoleculeResult], avg_deviation: float | None = None)

Results object for the bond length distribution benchmark.

molecules

The individual results for each molecule in a list.

Type:

list[mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.BondLengthDistributionMoleculeResult]

avg_deviation

The mean of the per-molecule average deviations, taken over the molecules whose simulations were stable. None if the benchmark failed.

Type:

float | None
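A minimal sketch of this aggregation, assuming each molecule is reduced to an (avg_deviation, failed) pair and that failed molecules are simply excluded from the mean. The helper average_over_stable is hypothetical and only mirrors the description of avg_deviation above.

```python
def average_over_stable(molecule_results):
    """Mean of per-molecule average deviations over stable molecules.

    molecule_results is a list of (avg_deviation, failed) pairs.
    Returns None when no molecule produced a usable value, which mirrors
    the documented behavior for a failed benchmark.
    """
    stable = [
        avg
        for avg, failed in molecule_results
        if not failed and avg is not None
    ]
    if not stable:
        return None
    return sum(stable) / len(stable)


# One failed molecule is ignored; the mean covers the other two.
mixed = average_over_stable([(0.02, False), (None, True), (0.04, False)])

# If every molecule failed, there is nothing to average.
all_failed = average_over_stable([(None, True), (None, True)])
```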

failed

Whether all the simulations or inferences failed and no analysis could be performed. Defaults to False.

Type:

bool

score

The final score for the benchmark between 0 and 1.

Type:

float | None

class mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.BondLengthDistributionMoleculeResult(*, molecule_name: str, deviation_trajectory: list[float] | None = None, avg_deviation: float | None = None, failed: bool = False)

Results object for a single molecule.

molecule_name

The name of the molecule.

Type:

str

deviation_trajectory

A list of floats in which the entry at index i is the deviation at frame i of the trajectory. Each frame corresponds to 1 ps of simulation time.

Type:

list[float] | None

avg_deviation

The average deviation of the molecule over the whole trajectory.

Type:

float | None

failed

Whether the simulation failed or was unstable. If so, deviation_trajectory and avg_deviation are None. Defaults to False.

Type:

bool

class mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.BondLengthDistributionModelOutput(*, molecules: list[MoleculeModelOutput], num_failed: int = 0)

Stores model outputs for the bond length distribution benchmark, consisting of simulation states for every molecule.

molecules

A list of simulation states for every molecule.

Type:

list[mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.MoleculeModelOutput]

num_failed

The number of molecules for which simulation failed.

Type:

int

class mlipaudit.benchmarks.bond_length_distribution.bond_length_distribution.MoleculeModelOutput(*, molecule_name: str, simulation_state: SimulationState | None = None, failed: bool = False)

Stores the simulation state for a molecule.

molecule_name

The name of the molecule.

Type:

str

simulation_state

The simulation state. Defaults to None if the simulation failed.

Type:

mlip.simulation.state.SimulationState | None

failed

Whether the simulation failed on the molecule. Defaults to False.

Type:

bool