Skip to content

Knapsack

knapsack

logger = ColorLog(console, __name__).logger module-attribute

Knapsack(max_mass: float, mass_scale: int, max_isotope: int, residues: list[str], residue_indices: dict[str, int], masses: MassArray, chart: KnapsackChart) dataclass

A class that precomputes and stores a knapsack chart.

PARAMETER DESCRIPTION
max_mass

The maximum mass up to which the chart is calculated.

TYPE: float

mass_scale

The scale in Daltons at which masses are calculated and rounded off. For example, a scale of 10000 would represent masses at a scale of 1e4 Da.

TYPE: int

residues

The list of residues that are considered in knapsack decoding. The order of this list is the inverse of residue_indices.

TYPE: list[str]

residue_indices

A mapping from residues as strings to indices in the knapsack chart. This is the inverse of residues.

TYPE: dict[str, int]

masses

The set of realisable masses in ascending order.

TYPE: numpy.ndarray[number of masses]

chart

The chart of realisable masses and residues that can lead to these masses. chart[mass, residue] is True if and only if a sequence of mass can be generated starting with the residue with index residue.

TYPE: numpy.ndarray[number of masses, number of residues]

max_mass: float instance-attribute

mass_scale: int instance-attribute

max_isotope: int instance-attribute

residues: list[str] instance-attribute

residue_indices: dict[str, int] instance-attribute

masses: MassArray instance-attribute

chart: KnapsackChart instance-attribute

save(path: str) -> None

Save the knapsack file to a directory.

PARAMETER DESCRIPTION
path

The path to the directory.

TYPE: str

RAISES DESCRIPTION
FileExistsError

If the directory path already exists, this message raise an exception.

construct_knapsack(residue_masses: dict[str, float], residue_indices: dict[str, int], max_mass: float, mass_scale: int, max_isotope: int = 2) -> 'Knapsack' classmethod

Construct a knapsack chart using depth-first search.

Previous construction algorithms have used dynamic programming, but its space and time complexity scale linearly with mass resolution since every possible mass is iterated over rather than only the feasible masses.

Graph search algorithms only iterate over feasible masses which become a smaller and smaller share of possible masses as the mass resolution increases. This leads to dramatic performance improvements.

This implementation uses depth-first search since its agenda is a stack which can be implemented using python lists whose operations have amortized constant time complexity.

PARAMETER DESCRIPTION
residue_masses

A mapping from considered residues to their masses.

TYPE: dict[str, float]

max_mass

The maximum mass up to which the chart is calculated.

TYPE: float

mass_scale

The scale in Daltons at which masses are calculated and rounded off. For example, a scale of 10000 would represent masses at a scale of 1e4 Da.

TYPE: int

from_file(path: str) -> 'Knapsack' classmethod

Load a knapsack saved to a directory.

PARAMETER DESCRIPTION
path

The path to the directory.

TYPE: str

RETURNS DESCRIPTION
_type_

description

TYPE: 'Knapsack'

get_feasible_masses(target_mass: float, tolerance: float) -> list[int]

Find a set of feasible masses for a given target mass and tolerance using binary search.

PARAMETER DESCRIPTION
target_mass

The masses to be decoded in Daltons.

TYPE: float

tolerance

The mass tolerance in Daltons.

TYPE: float

RETURNS DESCRIPTION
list[int]

list[int]: A list of feasible masses.