MLIPAudit

Overview

MLIPAudit is a Python library and app for benchmarking and validating Machine Learning Interatomic Potential (MLIP) models, in particular those based on the mlip library. It aims to cover a wide range of use cases and levels of complexity, giving users a comprehensive overview of their models' performance. It also provides the option to benchmark models of any origin (e.g., those based on PyTorch) via the ASE calculator interface.

MLIPAudit can be installed easily via pip and run from the command line. For example,

mlipaudit benchmark -m /path/to/visnet.zip /path/to/mace.zip -o /path/to/output

runs the complete benchmark suite for two models, visnet and mace, and stores the results as JSON files in the /path/to/output directory. The results may contain multiple metrics; however, they always include a single score that reflects a model's performance on the benchmark on a scale from 0 to 1.
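Once a run has finished, these JSON files can also be inspected programmatically. The snippet below is a minimal sketch using only the standard library; the flat output layout and the "model"/"score" keys are assumptions for illustration, not MLIPAudit's documented output schema.

```python
import json
from pathlib import Path

# Hypothetical example: the directory layout and the "model"/"score" keys
# are assumptions for illustration, not MLIPAudit's documented schema.
output_dir = Path("mlipaudit_demo_output")
output_dir.mkdir(exist_ok=True)

# Create a mock result file resembling what a benchmark run might write.
(output_dir / "visnet.json").write_text(
    json.dumps({"model": "visnet", "score": 0.87})
)

# Collect the single 0-to-1 score for each model result found.
scores = {}
for result_file in sorted(output_dir.glob("*.json")):
    data = json.loads(result_file.read_text())
    scores[data["model"]] = data["score"]

print(scores)
```

A loop like this makes it easy to rank models or feed the scores into your own reporting, independent of the bundled GUI.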

To visualize these results, we provide a graphical user interface based on Streamlit. Simply run

mlipaudit gui /path/to/output

to launch the app; it automatically opens a browser window and displays the UI.

Note

This project is under active development.

Getting started

As a first step, we recommend that you check out our Installation page. Second, we provide a simple tutorial on how to run the benchmark suite and how to customize it, available at Tutorial: CLI tools.

We also refer you to Benchmarks for more information on each benchmark and to Model Scores for more information on the computation of scores for each model.

Since MLIPAudit can also be used as a library, you can easily add new benchmarks or build your own tools on top of our benchmark classes. For a tutorial on this topic, see Tutorial: Adding a new benchmark.

Contents