Sampling¶
- class mlipaudit.benchmarks.sampling.sampling.SamplingBenchmark(force_field: ForceField | Calculator, data_input_dir: str | PathLike = './data', run_mode: RunMode | Literal['dev', 'fast', 'standard'] = RunMode.STANDARD)¶
Benchmark for sampling of amino acid backbone and sidechain dihedrals.
- name¶
The unique benchmark name that should be used to run the benchmark from the CLI and that will determine the output folder name for the result file. The name is “sampling”.
- Type:
str
- category¶
A string that describes the category of the benchmark, used, for example, in the UI app for grouping. The default, if not overridden, is “General”. This benchmark’s category is “Biomolecules”.
- Type:
str
- result_class¶
A reference to the type of BenchmarkResult that will determine the return type of self.analyze(). The result class is SamplingResult.
- Type:
type[mlipaudit.benchmark.BenchmarkResult] | None
- model_output_class¶
A reference to the SamplingModelOutput class.
- Type:
type[mlipaudit.benchmark.ModelOutput] | None
- required_elements¶
The set of atomic element types that are present in the benchmark’s input files.
- Type:
set[str] | None
- skip_if_elements_missing¶
Whether the benchmark should be skipped entirely if there are some atomic element types that the model cannot handle. If False, the benchmark must have its own custom logic to handle missing atomic element types. For this benchmark, the attribute is set to True.
- Type:
bool
- reusable_output_id¶
An optional ID that references other benchmarks with identical input systems and ModelOutput signatures (in the form of a tuple). If present, a user or the CLI can use this information to reuse cached model outputs from another benchmark carrying the same ID instead of rerunning simulations or inference.
- Type:
tuple[str, …] | None
- __init__(force_field: ForceField | Calculator, data_input_dir: str | PathLike = './data', run_mode: RunMode | Literal['dev', 'fast', 'standard'] = RunMode.STANDARD) None¶
Initializes the benchmark.
- Parameters:
force_field – The force field model to be benchmarked.
data_input_dir – The local input data directory. Defaults to “./data”. If the subdirectory “{data_input_dir}/{benchmark_name}” exists, the benchmark expects the relevant data to be there; otherwise, it downloads the data from HuggingFace.
run_mode – Whether to run the standard benchmark length, a faster version, or a very fast development version. Subclasses should ensure that in RunMode.DEV the benchmark runs in a much shorter timeframe, for instance by running on a reduced number of test cases. Making RunMode.FAST differ from RunMode.STANDARD is optional and only recommended for very long-running benchmarks. This argument can also be passed as a string: “dev”, “fast”, or “standard”.
- Raises:
ChemicalElementsMissingError – If initialization is attempted with a force field that cannot perform inference on the required elements.
ValueError – If force field type is not compatible.
- run_model() None¶
Run an MD simulation for each system.
- analyze() SamplingResult¶
Analyze the sampling benchmark.
- Raises:
RuntimeError – If run_model() has not been called first.
- Returns:
The result of the sampling benchmark.
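The documented call order (run_model() first, then analyze(), with a RuntimeError otherwise) can be sketched with a small stand-in class. StubBenchmark and run_benchmark are hypothetical names used only for illustration; they are not part of mlipaudit, and the stub does not run any simulation.

```python
class StubBenchmark:
    """Hypothetical stand-in that mimics the documented call order."""

    def __init__(self) -> None:
        self._model_output = None

    def run_model(self) -> None:
        # The real benchmark runs an MD simulation for each system here.
        self._model_output = {"ala": "trajectory placeholder"}

    def analyze(self) -> dict:
        # Mirrors the documented RuntimeError when run_model() was skipped.
        if self._model_output is None:
            raise RuntimeError("run_model() must be called before analyze().")
        return {"analyzed_systems": sorted(self._model_output)}


def run_benchmark(benchmark):
    """Run the model first, then the analysis, and return the result."""
    benchmark.run_model()
    return benchmark.analyze()


result = run_benchmark(StubBenchmark())
print(result)
```

Any object exposing run_model() and analyze() could be passed to run_benchmark in the same way, which is how a real SamplingBenchmark instance would presumably be driven.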
- class mlipaudit.benchmarks.sampling.sampling.SamplingResult(*, failed: bool = False, score: Annotated[float | None, Ge(ge=0), Le(le=1)] = None, systems: list[SamplingSystemResult], exploded_systems: list[str], rmsd_backbone_total: float | None = None, hellinger_distance_backbone_total: float | None = None, rmsd_sidechain_total: float | None = None, hellinger_distance_sidechain_total: float | None = None, outliers_ratio_backbone_total: float | None = None, outliers_ratio_sidechain_total: float | None = None, rmsd_backbone_dihedrals: dict[str, float] | None = None, hellinger_distance_backbone_dihedrals: dict[str, float] | None = None, rmsd_sidechain_dihedrals: dict[str, float] | None = None, hellinger_distance_sidechain_dihedrals: dict[str, float] | None = None, outliers_ratio_backbone_dihedrals: dict[str, float] | None = None, outliers_ratio_sidechain_dihedrals: dict[str, float] | None = None)¶
Stores the result of the sampling benchmark.
- systems¶
The result for each system, including those that failed.
- exploded_systems¶
The systems that exploded, or that failed during simulation.
- Type:
list[str]
- rmsd_backbone_total¶
The RMSD of the backbone dihedral distribution for all systems.
- Type:
float | None
- hellinger_distance_backbone_total¶
The Hellinger distance of the backbone dihedral distribution for all systems.
- Type:
float | None
- rmsd_sidechain_total¶
The RMSD of the sidechain dihedral distribution for all systems.
- Type:
float | None
- hellinger_distance_sidechain_total¶
The Hellinger distance of the sidechain dihedral distribution for all systems.
- Type:
float | None
- outliers_ratio_backbone_total¶
The ratio of outliers in the backbone dihedral distribution for all systems.
- Type:
float | None
- outliers_ratio_sidechain_total¶
The ratio of outliers in the sidechain dihedral distribution for all systems.
- Type:
float | None
- rmsd_backbone_dihedrals¶
The RMSD of the backbone dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- hellinger_distance_backbone_dihedrals¶
The Hellinger distance of the backbone dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- rmsd_sidechain_dihedrals¶
The RMSD of the sidechain dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- hellinger_distance_sidechain_dihedrals¶
The Hellinger distance of the sidechain dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- outliers_ratio_backbone_dihedrals¶
The ratio of outliers in the backbone dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- outliers_ratio_sidechain_dihedrals¶
The ratio of outliers in the sidechain dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- failed¶
Whether all the simulations or inferences failed and no analysis could be performed. Defaults to False.
- Type:
bool
- score¶
The final score for the benchmark between 0 and 1.
- Type:
float | None
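For readers unfamiliar with the Hellinger distance reported above, the following is a minimal sketch of the standard definition applied to two binned dihedral distributions. The binning (36 bins over [-180, 180) degrees) and the function names are assumptions made for this example; the source does not state how the library bins or normalizes its distributions.

```python
import math


def angle_histogram(angles_deg, n_bins=36):
    """Normalized histogram of angles (degrees) over [-180, 180)."""
    if not angles_deg:
        raise ValueError("At least one angle is required.")
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for angle in angles_deg:
        wrapped = (angle + 180.0) % 360.0  # map into [0, 360)
        counts[int(wrapped // width) % n_bins] += 1
    total = sum(counts)
    return [count / total for count in counts]


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions (0 to 1)."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


sampled = angle_histogram([-65.0, -60.0, -58.0, 60.0])
reference = angle_histogram([-63.0, -61.0, -57.0, -55.0])
print(hellinger_distance(sampled, reference))
```

A value of 0 means the two histograms are identical and 1 means they do not overlap at all, which matches the bounded range of the distance.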
- class mlipaudit.benchmarks.sampling.sampling.SamplingSystemResult(*, structure_name: str, rmsd_backbone_dihedrals: dict[str, float] | None = None, hellinger_distance_backbone_dihedrals: dict[str, float] | None = None, rmsd_sidechain_dihedrals: dict[str, float] | None = None, outliers_ratio_backbone_dihedrals: dict[str, float] | None = None, hellinger_distance_sidechain_dihedrals: dict[str, float] | None = None, outliers_ratio_sidechain_dihedrals: dict[str, float] | None = None, failed: bool = False)¶
Stores the result for one system of the sampling benchmark.
- structure_name¶
The name of the structure.
- Type:
str
- rmsd_backbone_dihedrals¶
The RMSD of the backbone dihedral distribution with respect to the reference data for each residue type.
- Type:
dict[str, float] | None
- hellinger_distance_backbone_dihedrals¶
The Hellinger distance of the backbone dihedral distribution with respect to the reference data for each residue type.
- Type:
dict[str, float] | None
- rmsd_sidechain_dihedrals¶
The RMSD of the sidechain dihedral distribution with respect to the reference data for each residue type.
- Type:
dict[str, float] | None
- hellinger_distance_sidechain_dihedrals¶
The Hellinger distance of the sidechain dihedral distribution with respect to the reference data for each residue type.
- Type:
dict[str, float] | None
- outliers_ratio_backbone_dihedrals¶
The ratio of outliers in the backbone dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- outliers_ratio_sidechain_dihedrals¶
The ratio of outliers in the sidechain dihedral distribution for each residue type.
- Type:
dict[str, float] | None
- failed¶
Whether the simulation failed (was not stable). If it failed, the other attributes will not be set.
- Type:
bool
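Because failed systems carry no metrics, any code that combines per-system results needs to skip them. The sketch below shows one way to do that with a plain mean per residue type. SystemResult is a hypothetical stand-in with a subset of the documented fields, and the averaging rule is an assumption for illustration; the source does not describe how the library aggregates per-system values.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class SystemResult:
    """Hypothetical stand-in with a subset of the documented fields."""

    structure_name: str
    rmsd_backbone_dihedrals: Optional[Dict[str, float]] = None
    failed: bool = False


def mean_rmsd_per_residue(results: List[SystemResult]) -> Dict[str, float]:
    """Average the backbone RMSD per residue type over non-failed systems."""
    collected: Dict[str, List[float]] = {}
    for result in results:
        # Failed systems have no metrics set, so they are skipped.
        if result.failed or result.rmsd_backbone_dihedrals is None:
            continue
        for residue, value in result.rmsd_backbone_dihedrals.items():
            collected.setdefault(residue, []).append(value)
    return {residue: mean(values) for residue, values in collected.items()}


results = [
    SystemResult("system_a", {"ALA": 1.0, "GLY": 3.0}),
    SystemResult("system_b", {"ALA": 3.0}),
    SystemResult("system_c", failed=True),
]
print(mean_rmsd_per_residue(results))
```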
- class mlipaudit.benchmarks.sampling.sampling.SamplingModelOutput(*, structure_names: list[str], simulation_states: list[SimulationState | None])¶
Stores model outputs for the sampling benchmark.
- structure_names¶
The names of the structures.
- Type:
list[str]
- simulation_states¶
A SimulationState object or None for each structure, in the same order as the structure names. None if the simulation failed.
- Type:
list[mlip.simulation.state.SimulationState | None]
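Since structure_names and simulation_states are parallel lists, with None marking a failed simulation, they can be paired positionally. The helper below is a hypothetical sketch of that pairing; plain strings stand in for SimulationState objects.

```python
def split_by_outcome(structure_names, simulation_states):
    """Pair names with states and separate failed simulations (None)."""
    if len(structure_names) != len(simulation_states):
        raise ValueError("Both lists must have the same length.")
    succeeded = {}
    failed = []
    for name, state in zip(structure_names, simulation_states):
        if state is None:
            failed.append(name)  # the simulation for this structure failed
        else:
            succeeded[name] = state
    return succeeded, failed


names = ["ala_dipeptide", "gly_dipeptide", "val_dipeptide"]
states = ["state_0", None, "state_2"]  # placeholders for SimulationState
succeeded, failed = split_by_outcome(names, states)
print(succeeded)
print(failed)
```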
- class mlipaudit.benchmarks.sampling.sampling.ResidueTypeBackbone(*, phi: list[float], psi: list[float])¶
Stores reference backbone dihedral data for a residue type.
- phi¶
The reference phi dihedral values for the residue type.
- Type:
list[float]
- psi¶
The reference psi dihedral values for the residue type.
- Type:
list[float]
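Paired phi and psi values are commonly summarized as a two-dimensional (Ramachandran-style) histogram. The following sketch bins such pairs into a normalized grid. It assumes the angles are given in degrees and uses a hypothetical 12 x 12 grid; the source specifies neither the units nor any binning.

```python
def ramachandran_histogram(phi, psi, n_bins=12):
    """Normalized 2D histogram of (phi, psi) pairs over [-180, 180) degrees."""
    if len(phi) != len(psi):
        raise ValueError("phi and psi must have the same length.")
    if not phi:
        raise ValueError("At least one (phi, psi) pair is required.")
    width = 360.0 / n_bins
    grid = [[0.0] * n_bins for _ in range(n_bins)]
    for phi_value, psi_value in zip(phi, psi):
        # Wrap each angle into [0, 360) before computing its bin index.
        row = int(((phi_value + 180.0) % 360.0) // width) % n_bins
        col = int(((psi_value + 180.0) % 360.0) // width) % n_bins
        grid[row][col] += 1.0
    n_pairs = len(phi)
    return [[count / n_pairs for count in row] for row in grid]


phi = [-60.0, -60.0, 60.0, -179.0]
psi = [-45.0, -45.0, 45.0, 179.0]
grid = ramachandran_histogram(phi, psi)
print(grid[4][4])  # fraction of pairs in the bin containing (-60, -45)
```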
- class mlipaudit.benchmarks.sampling.sampling.ResidueTypeSidechain(*, chi1: list[float] | None = None, chi2: list[float] | None = None, chi3: list[float] | None = None, chi4: list[float] | None = None, chi5: list[float] | None = None)¶
Stores reference sidechain dihedral data for a residue type.
- chi1¶
The reference chi1 dihedral values for the residue type.
- Type:
list[float] | None
- chi2¶
The reference chi2 dihedral values for the residue type.
- Type:
list[float] | None
- chi3¶
The reference chi3 dihedral values for the residue type.
- Type:
list[float] | None
- chi4¶
The reference chi4 dihedral values for the residue type.
- Type:
list[float] | None
- chi5¶
The reference chi5 dihedral values for the residue type.
- Type:
list[float] | None
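All chi fields are optional because residue types differ in how many sidechain dihedrals they have (valine, for example, has only chi1). The sketch below uses a hypothetical dataclass stand-in, SidechainReference, to show how the populated fields might be collected; it is not the library's ResidueTypeSidechain class.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional


@dataclass
class SidechainReference:
    """Hypothetical stand-in mirroring the documented chi1..chi5 fields."""

    chi1: Optional[List[float]] = None
    chi2: Optional[List[float]] = None
    chi3: Optional[List[float]] = None
    chi4: Optional[List[float]] = None
    chi5: Optional[List[float]] = None


def available_chis(reference: SidechainReference) -> Dict[str, List[float]]:
    """Return only the chi dihedrals that are set for this residue type."""
    return {
        field.name: getattr(reference, field.name)
        for field in fields(reference)
        if getattr(reference, field.name) is not None
    }


valine_like = SidechainReference(chi1=[-60.0, 180.0])
print(available_chis(valine_like))
```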