Chemical System

class mlip.data.chemical_system.ChemicalSystem(*, atomic_numbers: ndarray, positions: ndarray, energy: float | None = None, forces: ndarray | None = None, stress: ndarray | None = None, hessian: ndarray | None = None, cell: ndarray | None = None, pbc: tuple[bool, bool, bool] | None = None, weight: float = 1.0, partial_charges: ndarray | None = None, charge: int | None = None, spin_multiplicity: int | None = None, dipole_moment: ndarray | None = None, extras: dict[str, Any] | None = None)

A single atomic configuration with optional reference properties.

Represents one snapshot of an atomistic system as produced by dataset readers (e.g. ExtXYZ, HDF5). Downstream, each ChemicalSystem is converted into a graph representation for model training or inference via Graph.from_chemical_system.

Validates on construction that positions, forces, cell, and stress arrays have mutually consistent shapes.
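The kind of consistency check described above can be sketched with plain NumPy. This is an illustrative stand-in, not the library's implementation: the helper name check_shapes is hypothetical, and it raises ValueError where the real class raises a pydantic validation error.

```python
import numpy as np


def check_shapes(atomic_numbers, positions, forces=None, cell=None, stress=None):
    """Hypothetical helper: raise ValueError when array shapes disagree."""
    if atomic_numbers.ndim != 1:
        raise ValueError("atomic_numbers must have shape (N,)")
    n_atoms = atomic_numbers.shape[0]
    if positions.shape != (n_atoms, 3):
        raise ValueError(f"positions must have shape ({n_atoms}, 3)")
    if forces is not None and forces.shape != (n_atoms, 3):
        raise ValueError(f"forces must have shape ({n_atoms}, 3)")
    # cell and stress are both 3x3 tensors, independent of N.
    for name, tensor in (("cell", cell), ("stress", stress)):
        if tensor is not None and tensor.shape != (3, 3):
            raise ValueError(f"{name} must have shape (3, 3)")


water_z = np.array([8, 1, 1])
water_pos = np.zeros((3, 3))

# Consistent shapes pass silently.
check_shapes(water_z, water_pos, forces=np.zeros((3, 3)), cell=np.eye(3))

# Forces for two atoms do not match three atomic numbers.
try:
    check_shapes(water_z, water_pos, forces=np.zeros((2, 3)))
except ValueError as err:
    shape_error = str(err)
```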

atomic_numbers

Atomic numbers (Z) for every atom, shape (N,).

Type:

numpy.ndarray

positions

Cartesian coordinates in Angstrom, shape (N, 3).

Type:

numpy.ndarray

energy

Reference total energy in eV.

Type:

float | None

forces

Reference per-atom forces in eV/Angstrom, shape (N, 3).

Type:

numpy.ndarray | None

stress

Reference stress tensor in eV/Angstrom^3, shape (3, 3).

Type:

numpy.ndarray | None
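Reference data is often reported in GPa rather than eV/Angstrom^3. A minimal conversion sketch follows; the constant comes from the definition of the electronvolt, and the variable names are illustrative only.

```python
import numpy as np

# 1 eV/Angstrom^3 = 1.602176634e-19 J / 1e-30 m^3 = 160.2176634 GPa
EV_PER_A3_TO_GPA = 160.2176634

# Example stress tensor in eV/Angstrom^3, shape (3, 3).
stress_ev = np.diag([0.01, 0.01, 0.01])

# Same tensor expressed in GPa.
stress_gpa = stress_ev * EV_PER_A3_TO_GPA
```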

hessian

Reference energy Hessian matrix in eV/Angstrom^2, shape (N, 3, N, 3).

Type:

numpy.ndarray | None
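Many tools expect the Hessian as a square (3N, 3N) matrix. Assuming the array is stored in NumPy's default C order, the two layouts are related by a plain reshape, as this sketch with synthetic data illustrates.

```python
import numpy as np

n_atoms = 2
rng = np.random.default_rng(0)

# Build a synthetic symmetric (3N, 3N) matrix to stand in for real data.
square = rng.normal(size=(3 * n_atoms, 3 * n_atoms))
square = 0.5 * (square + square.T)

# Four-index layout described above: hessian[i, a, j, b] couples
# Cartesian component a of atom i with component b of atom j.
hessian = square.reshape(n_atoms, 3, n_atoms, 3)

# Back to the square layout: row 3*i + a, column 3*j + b.
matrix = hessian.reshape(3 * n_atoms, 3 * n_atoms)
```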

cell

Unit-cell lattice vectors, shape (3, 3).

Type:

numpy.ndarray | None

pbc

Per-axis periodic boundary conditions.

Type:

tuple[bool, bool, bool] | None
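To illustrate how cell and pbc interact, the sketch below wraps positions back into the unit cell along periodic axes only. It assumes the rows of cell are the lattice vectors (the ASE convention), which this page does not state; wrap_positions is a hypothetical helper, not part of the library.

```python
import numpy as np


def wrap_positions(positions, cell, pbc):
    """Hypothetical helper: wrap atoms into the cell along periodic axes."""
    # With lattice vectors as rows, positions = fractional @ cell.
    fractional = positions @ np.linalg.inv(cell)
    periodic = np.asarray(pbc, dtype=bool)
    # Shift periodic components into [0, 1); leave the others untouched.
    fractional[:, periodic] -= np.floor(fractional[:, periodic])
    return fractional @ cell


cell = np.diag([10.0, 10.0, 10.0])
positions = np.array([[11.0, -1.0, 12.0]])

# Periodic in x and y, open along z.
wrapped = wrap_positions(positions, cell, pbc=(True, True, False))
```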

weight

Relative weight of this configuration in the training loss (default 1.0).

Type:

float
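One common way to use such a weight is a weighted mean of per-configuration losses. The sketch below shows that scheme only; whether the library normalizes by the total weight in this way is an assumption, and weighted_mean_loss is a hypothetical name.

```python
def weighted_mean_loss(per_config_losses, weights):
    """Hypothetical helper: weighted mean of per-configuration losses."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one configuration needs a positive weight")
    weighted_sum = sum(w * loss for w, loss in zip(weights, per_config_losses))
    return weighted_sum / total_weight


losses = [0.2, 0.4, 1.0]

# A weight of 0.0 removes the third configuration from the loss.
masked = weighted_mean_loss(losses, [1.0, 1.0, 0.0])

# Equal weights reduce to the ordinary mean.
uniform = weighted_mean_loss(losses, [1.0, 1.0, 1.0])
```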

partial_charges

Atomic partial charges, shape (N,).

Type:

numpy.ndarray | None

charge

Integer total system charge.

Type:

int | None

spin_multiplicity

Integer total system spin multiplicity.

Type:

int | None
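The spin multiplicity is 2S + 1, which equals the number of unpaired electrons plus one. Together with atomic_numbers and charge this allows a simple parity check, sketched below. The helper is hypothetical; nothing on this page indicates that the class performs this check itself.

```python
def is_spin_consistent(atomic_numbers, charge, spin_multiplicity):
    """Hypothetical check: can this electron count give this multiplicity?"""
    n_electrons = sum(atomic_numbers) - charge
    n_unpaired = spin_multiplicity - 1
    if n_electrons < 0 or n_unpaired < 0 or n_unpaired > n_electrons:
        return False
    # The remaining electrons must be paired, so their count must be even.
    return (n_electrons - n_unpaired) % 2 == 0


# Neutral water: 10 electrons, closed-shell singlet.
water_singlet = is_spin_consistent([8, 1, 1], charge=0, spin_multiplicity=1)

# Neutral water cannot be a doublet: 9 electrons would be left to pair.
water_doublet = is_spin_consistent([8, 1, 1], charge=0, spin_multiplicity=2)

# Neutral OH radical: 9 electrons, one unpaired, doublet.
oh_radical = is_spin_consistent([8, 1], charge=0, spin_multiplicity=2)
```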

dipole_moment

Dipole moment, shape (3,).

Type:

numpy.ndarray | None
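For orientation, a point-charge estimate of the dipole moment can be computed from partial_charges and positions. This is only a sketch: reference dipoles usually come from the electronic-structure calculation itself, and the unit (e·Angstrom here) is an assumption, since this page does not specify one.

```python
import numpy as np

# Two point charges separated by 1 Angstrom along z.
partial_charges = np.array([0.5, -0.5])  # in units of e, shape (N,)
positions = np.array([[0.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])  # in Angstrom, shape (N, 3)

# mu = sum_i q_i * r_i, giving a vector of shape (3,).
dipole_estimate = partial_charges @ positions

# For a neutral system the result does not depend on the origin.
shifted = partial_charges @ (positions + np.array([5.0, -2.0, 3.0]))
```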

extras

Arbitrary metadata that can be consumed by custom preprocessing steps.

Type:

dict[str, Any] | None

__init__(**data: Any) → None

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

validate_variable_shapes() → Self

Validates that positions and forces have the correct shape.

classmethod validate_cell_shape(value: ndarray | None) → ndarray | None

Validates that the cell has the correct shape.

classmethod validate_stress_shape(value: ndarray | None) → ndarray | None

Validates that the stress has the correct shape.

classmethod from_ase_atoms(atoms: Atoms, get_property_fields: bool = True, property_name_mapping: dict[str, str] | None = None) → Self

Create a ChemicalSystem from an ase.Atoms object.

Extracts atomic numbers, positions, cell, and periodic boundary conditions directly. Energy, forces, and stress are read from the attached calculator if available; otherwise they default to None. To skip reading energy, forces, and stress altogether, pass get_property_fields=False.

Parameters:
  • atoms – An ASE Atoms object, optionally with a calculator providing energy, forces, and/or stress.

  • get_property_fields – Whether to also try fetching property fields such as energy, forces, and stress from the atoms object. Defaults to True.

  • property_name_mapping – Dictionary mapping canonical property names to the names used to access the targeted properties required by the chemical system. By default, field names are used as is. Currently, only “partial_charges”, “charge”, “spin_multiplicity”, and “dipole_moment” are extracted through this mapping from ase.Atoms.

Returns:

A new ChemicalSystem instance.
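The name resolution described for property_name_mapping can be sketched without ASE. This is a stand-in under stated assumptions: the mapping is read as canonical name to stored name, a plain dict takes the place of whatever the ase.Atoms object stores, and both helper functions are hypothetical.

```python
# The four fields this page lists as being resolved through the mapping.
MAPPED_FIELDS = ("partial_charges", "charge", "spin_multiplicity", "dipole_moment")


def resolve_lookup_names(property_name_mapping=None):
    """Hypothetical helper: canonical field name -> name used for lookup."""
    mapping = property_name_mapping or {}
    # Fields absent from the mapping fall back to their own name.
    return {field: mapping.get(field, field) for field in MAPPED_FIELDS}


def extract_mapped_properties(stored, property_name_mapping=None):
    """Hypothetical helper: fetch each field, or None if it is missing."""
    names = resolve_lookup_names(property_name_mapping)
    return {field: stored.get(name) for field, name in names.items()}


# Stand-in for the properties attached to an atoms object.
stored = {"mulliken_charges": [0.4, -0.4], "charge": 0}

props = extract_mapped_properties(
    stored, property_name_mapping={"partial_charges": "mulliken_charges"}
)
```

In this sketch, props["partial_charges"] is taken from the "mulliken_charges" entry, "charge" is found under its own name, and the two missing fields come back as None.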