
Greedy search

logger = ColorLog(console, __name__).logger module-attribute

GreedyDecoder(model: Decodable, suppressed_residues: list[str] | None = None, mass_scale: int = MASS_SCALE, disable_terminal_residues_anywhere: bool = True, float_dtype: torch.dtype = torch.float64)

Bases: Decoder

A class for decoding from de novo sequence models using greedy search.

This class conforms to the Decoder interface and decodes from models that conform to the Decodable interface.
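As a rough illustration of greedy search in general (not this class's actual implementation), the sketch below picks the highest-scoring token at every step until an end-of-sequence token appears or `max_length` is reached. The names `greedy_decode`, `toy_model`, and the `"$"` end token are hypothetical and exist only for this example.

```python
import math


def toy_model(prefix):
    """Hypothetical stand-in for a model: returns token -> log probability."""
    table = [
        {"P": math.log(0.7), "E": math.log(0.2), "$": math.log(0.1)},
        {"P": math.log(0.1), "E": math.log(0.8), "$": math.log(0.1)},
        {"P": math.log(0.1), "E": math.log(0.1), "$": math.log(0.8)},
    ]
    return table[min(len(prefix), len(table) - 1)]


def greedy_decode(next_log_probs, max_length, eos="$"):
    """Append the single most likely token at each step (no beam, no backtracking)."""
    sequence, token_log_probs = [], []
    for _ in range(max_length):
        scores = next_log_probs(sequence)
        token = max(scores, key=scores.get)  # greedy choice
        if token == eos:
            break
        sequence.append(token)
        token_log_probs.append(scores[token])
    return sequence, token_log_probs


sequence, token_log_probs = greedy_decode(toy_model, max_length=10)
```

Greedy search keeps a single hypothesis per spectrum, so it is cheap, but an early low-quality choice cannot be revised later.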

mass_scale = mass_scale instance-attribute

disable_terminal_residues_anywhere = disable_terminal_residues_anywhere instance-attribute

float_dtype = float_dtype instance-attribute

residue_masses = torch.zeros((len(self.model.residue_set),), dtype=(self.float_dtype)) instance-attribute

terminal_residue_indices = torch.tensor(terminal_residues_idx, dtype=(torch.long)) instance-attribute

suppressed_residue_indices = torch.tensor(suppressed_residues_idx, dtype=(torch.long)) instance-attribute

residue_target_offsets = torch.tensor(residue_target_offsets, dtype=(self.float_dtype)) instance-attribute

vocab_size = len(self.model.residue_set) instance-attribute

decode(spectra: Float[Spectrum, ' batch'], precursors: Float[PrecursorFeatures, ' batch'], max_length: int, mass_tolerance: float = 5e-05, max_isotope: int = 1, min_log_prob: float = -float('inf'), return_encoder_output: bool = False, encoder_output_reduction: Literal['mean', 'max', 'sum', 'full'] = 'mean', **kwargs) -> dict[str, Any]

Decode predicted residue sequence for a batch of spectra using greedy search.

PARAMETER DESCRIPTION
spectra

The spectra to be sequenced.

TYPE: FloatTensor

precursors

The precursor mass, charge and mass-to-charge ratio.

TYPE: torch.FloatTensor[batch size, 3]
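A minimal sketch of how one row of such a `[batch size, 3]` input could be assembled from an observed m/z and charge. The column order `[mass, charge, m/z]` is an assumption read off the description above, and the neutral-mass formula and the helper name `precursor_row` are illustrative, not taken from the library.

```python
PROTON_MASS = 1.007276  # Da, approximate proton mass


def precursor_row(mz, charge):
    """Return [mass, charge, m/z] for one precursor (assumed column order)."""
    # Neutral mass from m/z: remove one proton per charge.
    mass = (mz - PROTON_MASS) * charge
    return [mass, float(charge), mz]


rows = [precursor_row(500.0, 2), precursor_row(400.5, 3)]
```

The resulting nested list has one row per spectrum and three columns, which matches the documented shape once converted to a float tensor.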

max_length

The maximum length of a residue sequence.

TYPE: int

mass_tolerance

The maximum relative error for which a predicted sequence is still considered to have matched the precursor mass.

TYPE: float DEFAULT: 5e-05
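The default of 5e-05 corresponds to 50 parts per million. The sketch below shows one way a relative mass check of this kind can be written; the helper name `within_tolerance` is hypothetical, and the decoder's internal comparison may differ in detail.

```python
def within_tolerance(predicted_mass, precursor_mass, mass_tolerance=5e-05):
    """True if the relative mass error does not exceed mass_tolerance."""
    relative_error = abs(predicted_mass - precursor_mass) / precursor_mass
    return relative_error <= mass_tolerance


close_enough = within_tolerance(1000.04, 1000.0)  # relative error about 4e-05
too_far = within_tolerance(1000.06, 1000.0)       # relative error about 6e-05
```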

max_isotope

The maximum number of additional neutrons to allow for when comparing a predicted sequence's mass to the precursor mass. Isotopic variants of the predicted sequence carrying up to this many extra neutrons are accepted as matches.

All additional nucleon numbers from 1 to max_isotope inclusive are considered.

TYPE: int DEFAULT: 1
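A hedged sketch of an isotope-aware mass match: the predicted mass is compared against the precursor mass after adding 0, 1, ..., `max_isotope` neutron-sized offsets. The spacing constant (roughly the 13C minus 12C mass difference), the inclusion of the zero-offset case, and the helper name `matches_with_isotopes` are assumptions made for this example.

```python
ISOTOPE_SPACING = 1.00335  # Da, approximate 13C - 12C mass difference


def matches_with_isotopes(predicted_mass, precursor_mass,
                          mass_tolerance=5e-05, max_isotope=1):
    """True if any isotope-shifted predicted mass is within tolerance."""
    for extra_neutrons in range(max_isotope + 1):
        shifted = predicted_mass + extra_neutrons * ISOTOPE_SPACING
        if abs(shifted - precursor_mass) / precursor_mass <= mass_tolerance:
            return True
    return False


# Precursor selected on the first isotope peak rather than the monoisotopic one.
with_isotope = matches_with_isotopes(1000.0, 1001.00335, max_isotope=1)
without_isotope = matches_with_isotopes(1000.0, 1001.00335, max_isotope=0)
```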

min_log_prob

Log probability threshold for stopping decoding early. If a sequence's log probability falls below this value, the sequence is marked as complete. Defaults to -inf, which disables early stopping.

TYPE: float DEFAULT: -float('inf')
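The early-stopping rule can be pictured with a small sketch that accumulates per-token log probabilities and stops once the running total drops below the threshold. The helper name `steps_before_cutoff` is hypothetical, and the exact point at which the decoder applies the check is an assumption.

```python
import math


def steps_before_cutoff(token_log_probs, min_log_prob=-math.inf):
    """Return (steps taken, cumulative log probability) under early stopping."""
    total = 0.0
    for step, log_prob in enumerate(token_log_probs, start=1):
        total += log_prob
        if total < min_log_prob:
            # Sequence is treated as complete; no further tokens are added.
            return step, total
    return len(token_log_probs), total


log_probs = [-0.1, -0.2, -3.0, -0.1]
stopped_early = steps_before_cutoff(log_probs, min_log_prob=-1.0)
ran_to_end = steps_before_cutoff(log_probs)  # default never triggers
```

With the default of negative infinity the comparison is never true, so decoding is bounded only by `max_length` and the model's own stopping behavior.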

return_encoder_output

Whether to return the encoder output.

TYPE: bool DEFAULT: False

encoder_output_reduction

The reduction to apply to the encoder output. Valid values are "mean", "max", "sum", "full". Defaults to "mean".

TYPE: Literal['mean', 'max', 'sum', 'full'] DEFAULT: 'mean'
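To illustrate what the four options could mean, the sketch below reduces a list of per-position feature vectors. It assumes the reduction runs over the sequence (position) axis and that "full" returns the output unreduced; both are assumptions, and `reduce_encoder_output` is a hypothetical helper written with plain lists rather than tensors.

```python
def reduce_encoder_output(encoder_output, reduction="mean"):
    """Reduce [positions][features] to [features], or return it unchanged."""
    if reduction == "full":
        return encoder_output
    columns = list(zip(*encoder_output))  # one tuple per feature
    if reduction == "mean":
        return [sum(col) / len(col) for col in columns]
    if reduction == "max":
        return [max(col) for col in columns]
    if reduction == "sum":
        return [sum(col) for col in columns]
    raise ValueError(f"Unknown reduction: {reduction!r}")


encoded = [[1.0, 4.0], [3.0, 2.0]]  # two positions, two features
reduced = {name: reduce_encoder_output(encoded, name)
           for name in ("mean", "max", "sum", "full")}
```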

RETURNS DESCRIPTION
dict[str, Any]

dict[str, Any]: Required keys:
- "predictions": list[list[str]]
- "mass_error": list[float]
- "prediction_log_probability": list[float]
- "prediction_token_log_probabilities": list[list[float]]
- "encoder_output": list[float] (optional)

Example additional keys:
- "prediction_beam_0": list[str]
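A short sketch of consuming a result of this shape. The dictionary below is hand-written with made-up values purely to mirror the documented keys; it is not real decoder output, and `confident_peptides` is a hypothetical helper.

```python
# Illustrative values only, shaped like the documented return dictionary.
result = {
    "predictions": [["P", "E", "P"], ["T", "I", "D", "E"]],
    "mass_error": [1.2e-06, 4.0e-05],
    "prediction_log_probability": [-0.5, -2.3],
    "prediction_token_log_probabilities": [
        [-0.1, -0.2, -0.2],
        [-0.5, -0.6, -0.6, -0.6],
    ],
}


def confident_peptides(result, min_log_prob):
    """Join residues into strings, keeping predictions above a score cutoff."""
    return [
        "".join(residues)
        for residues, log_prob in zip(
            result["predictions"], result["prediction_log_probability"]
        )
        if log_prob >= min_log_prob
    ]


kept = confident_peptides(result, min_log_prob=-1.0)
```

All list-valued entries are aligned by batch index, so `zip` over them pairs each prediction with its own score.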