Graph Dataset

class mlip.data.helpers.graph_dataset.GraphDataset(graphs: list[GraphsTuple], batch_size: int, max_n_node: int, max_n_edge: int, min_n_node: int = 1, min_n_edge: int = 1, min_n_graph: int = 1, should_shuffle: bool = True, should_shuffle_between_epochs: bool = True, skip_last_batch: bool = False, raise_exc_if_graphs_discarded: bool = False)

Class that holds a dataset of graphs (jraph.GraphsTuple objects) and manages their batching.

__init__(graphs: list[GraphsTuple], batch_size: int, max_n_node: int, max_n_edge: int, min_n_node: int = 1, min_n_edge: int = 1, min_n_graph: int = 1, should_shuffle: bool = True, should_shuffle_between_epochs: bool = True, skip_last_batch: bool = False, raise_exc_if_graphs_discarded: bool = False)

Constructor.

Parameters:
  • graphs – The graphs to store and manage in this class.

  • batch_size – The batch size.

  • max_n_node – The maximum number of nodes contributed by one graph in a batch.

  • max_n_edge – The maximum number of edges contributed by one graph in a batch.

  • min_n_node – The minimum number of nodes in a batch, defaults to 1.

  • min_n_edge – The minimum number of edges in a batch, defaults to 1.

  • min_n_graph – The minimum number of graphs in a batch, defaults to 1.

  • should_shuffle – Whether to shuffle the graphs before iterating, defaults to True.

  • should_shuffle_between_epochs – Whether to reshuffle the data between epochs. Only takes effect if should_shuffle is also True. Defaults to True.

  • skip_last_batch – Whether to skip the last batch, defaults to False.

  • raise_exc_if_graphs_discarded – Whether to raise an exception if any graphs must be discarded due to size constraints. Defaults to False, in which case only a warning is logged.
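
The effect of raise_exc_if_graphs_discarded can be illustrated with a minimal, library-independent sketch. The helper filter_oversized and the (n_node, n_edge) tuples are hypothetical stand-ins for real graphs, and the discard rule used here (a graph is dropped when it exceeds either per-graph limit) is an assumption; the actual GraphDataset logic may differ.

```python
import logging

logger = logging.getLogger(__name__)


def filter_oversized(sizes, max_n_node, max_n_edge,
                     raise_exc_if_graphs_discarded=False):
    """Hypothetical helper: keep only graphs within the per-graph limits.

    `sizes` is a list of (n_node, n_edge) tuples standing in for real graphs.
    """
    kept = [s for s in sizes if s[0] <= max_n_node and s[1] <= max_n_edge]
    n_discarded = len(sizes) - len(kept)
    if n_discarded > 0:
        message = f"{n_discarded} graph(s) exceed the size limits and are discarded."
        if raise_exc_if_graphs_discarded:
            # Strict mode: fail loudly instead of silently shrinking the dataset.
            raise ValueError(message)
        # Default mode: only log a warning and continue with the remaining graphs.
        logger.warning(message)
    return kept


# Three toy graphs; the second one is too large for the chosen limits.
sizes = [(4, 12), (50, 400), (8, 30)]
kept = filter_oversized(sizes, max_n_node=10, max_n_edge=40)
```

In this sketch, the strict mode is useful when losing any training structure would be a data preparation error, while the default mode suits exploratory runs.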

__iter__()

Iterates over the dataset in batches, according to a batching strategy.
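
How the constructor flags should_shuffle and should_shuffle_between_epochs might interact during iteration can be sketched as follows. The function epoch_order and its seed argument are hypothetical and not part of the documented API; the sketch only mirrors the documented rule that reshuffling between epochs happens only if should_shuffle is also true.

```python
import random


def epoch_order(n_graphs, epoch, should_shuffle=True,
                should_shuffle_between_epochs=True, seed=0):
    """Hypothetical helper: order in which graphs are visited in a given epoch."""
    order = list(range(n_graphs))
    if not should_shuffle:
        # No shuffling at all: the between-epochs flag has no effect.
        return order
    # Use a fresh seed per epoch only when both flags are set;
    # otherwise the same shuffled order is reused in every epoch.
    epoch_seed = seed + epoch if should_shuffle_between_epochs else seed
    random.Random(epoch_seed).shuffle(order)
    return order


fixed = epoch_order(5, epoch=3, should_shuffle=False)
shuffled_once_a = epoch_order(20, epoch=0, should_shuffle_between_epochs=False)
shuffled_once_b = epoch_order(20, epoch=7, should_shuffle_between_epochs=False)
```

Here fixed keeps the original order, and shuffled_once_a and shuffled_once_b are identical because the single shuffle is reused across epochs.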

__len__()

Returns the number of batches. The batches are not recomputed on each call.
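
The idea of reporting a length without recomputing the batches can be sketched with a small stand-in class. CachedBatchCount is hypothetical, and its counting rule (fixed-size batches, with a final partial batch counted) is a simplifying assumption that does not reflect the node and edge limits of the real class.

```python
class CachedBatchCount:
    """Hypothetical stand-in: the batch count is computed once and then reused."""

    def __init__(self, n_graphs, batch_size):
        self.n_compute_calls = 0
        # Compute the count once at construction time and store it.
        self._n_batches = self._count_batches(n_graphs, batch_size)

    def _count_batches(self, n_graphs, batch_size):
        self.n_compute_calls += 1
        full, remainder = divmod(n_graphs, batch_size)
        # A final partial batch still counts as one batch in this sketch.
        return full + 1 if remainder else full

    def __len__(self):
        # Return the stored value; no batching work happens here.
        return self._n_batches


counter = CachedBatchCount(n_graphs=10, batch_size=4)
n_batches = len(counter)
```

Calling len(counter) repeatedly leaves counter.n_compute_calls unchanged, which is the behavior the description above refers to.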

subset(i: slice | int | list | float)

Constructs and returns a new graph dataset containing the subset of the current dataset's graphs selected by the slicing information i.

Parameters:

i – The slicing information. See source code for options.

Returns:

A new graph dataset containing only a subset of the graphs of the current one.