Beam search
beam_search
logger = ColorLog(console, __name__).logger
module-attribute
BeamSearchDecoder(model: Decodable, suppressed_residues: list[str] | None = None, mass_scale: int = MASS_SCALE, disable_terminal_residues_anywhere: bool = True, keep_invalid_mass_sequences: bool = True, float_dtype: torch.dtype = torch.float64)
Bases: Decoder
A class for decoding from de novo sequence models using beam search.
This class conforms to the Decoder interface and decodes from
models that conform to the Decodable interface.
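As an illustration of the general technique, the following is a minimal, self-contained sketch of beam search over a toy next-token scorer. It is not this class's implementation: `beam_search`, `toy_model`, and the `"$"` end-of-sequence token are hypothetical names introduced for the example, and the real decoder works on batched tensors with precursor-mass constraints.

```python
import math


def beam_search(step_log_probs, beam_size, max_length, eos="$"):
    """Toy beam search.

    `step_log_probs(prefix)` is a hypothetical stand-in for a model: it
    returns a dict mapping each candidate next token to its log probability.
    """
    beams = [((), 0.0)]  # (token prefix, cumulative log probability)
    complete = []
    for _ in range(max_length):
        # Expand every live beam by every candidate next token.
        candidates = []
        for prefix, score in beams:
            for token, log_prob in step_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + log_prob))
        # Keep only the `beam_size` highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            if prefix[-1] == eos:
                complete.append((prefix, score))
            else:
                beams.append((prefix, score))
        if not beams:
            break
    return sorted(complete, key=lambda c: c[1], reverse=True)


def toy_model(prefix):
    # Hypothetical scorer: forces end-of-sequence after two residues.
    if len(prefix) >= 2:
        return {"$": 0.0}
    return {"A": math.log(0.6), "G": math.log(0.3), "$": math.log(0.1)}


results = beam_search(toy_model, beam_size=2, max_length=4)
best_sequence, best_score = results[0]
```

With `beam_size=2`, only the two best partial sequences survive each step, so low-probability prefixes are pruned before they can be completed.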
mass_scale = mass_scale
instance-attribute
disable_terminal_residues_anywhere = disable_terminal_residues_anywhere
instance-attribute
keep_invalid_mass_sequences = keep_invalid_mass_sequences
instance-attribute
float_dtype = float_dtype
instance-attribute
residue_masses = torch.zeros((len(self.model.residue_set),), dtype=(self.float_dtype))
instance-attribute
terminal_residue_indices = torch.tensor(terminal_residues_idx, dtype=(torch.long))
instance-attribute
suppressed_residue_indices = torch.tensor(suppressed_residues_idx, dtype=(torch.long))
instance-attribute
residue_target_offsets = torch.tensor(residue_target_offsets, dtype=(self.float_dtype))
instance-attribute
vocab_size = len(self.model.residue_set)
instance-attribute
decode(spectra: Float[Spectrum, ' batch'], precursors: Float[PrecursorFeatures, ' batch'], beam_size: int, max_length: int, mass_tolerance: float = 5e-05, max_isotope: int = 1, min_log_prob: float = -float('inf'), return_encoder_output: bool = False, encoder_output_reduction: Literal['mean', 'max', 'sum', 'full'] = 'mean', return_beam: bool = False, **kwargs) -> dict[str, Any]
Decode predicted residue sequence for a batch of spectra using beam search.
| PARAMETER | DESCRIPTION |
|---|---|
| `spectra` | The spectra to be sequenced. **TYPE:** `Float[Spectrum, ' batch']` |
| `precursors` | The precursor mass, charge and mass-to-charge ratio. **TYPE:** `Float[PrecursorFeatures, ' batch']` |
| `beam_size` | The maximum size of the beam (the number of candidate sequences retained per spectrum at each step). **TYPE:** `int` |
| `max_length` | The maximum length of a residue sequence. **TYPE:** `int` |
| `mass_tolerance` | The maximum relative error for which a predicted sequence is still considered to have matched the precursor mass. **TYPE:** `float` |
| `max_isotope` | The maximum number of additional neutrons for isotopes whose mass is considered when comparing a predicted sequence's mass to the precursor mass. All additional nucleon numbers from 1 to **TYPE:** `int` |
| `min_log_prob` | Minimum log probability to stop decoding early. If a sequence's log probability is less than this value, it is marked as complete. Defaults to -inf. **TYPE:** `float` |
| `return_beam` | Optionally return beam-search results. Ignored in greedy search. **TYPE:** `bool` |
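The `mass_tolerance` and `max_isotope` parameters together describe a relative-error check against the precursor mass. A minimal sketch of such a check is shown below. It rests on assumptions not stated in this reference: the function name `matches_precursor` is hypothetical, the monoisotopic case (zero additional neutrons) is assumed to be included, and the per-neutron mass shift is taken as the 13C/12C mass difference of about 1.00335 Da. The decoder's own implementation may differ.

```python
def matches_precursor(
    sequence_mass: float,
    precursor_mass: float,
    mass_tolerance: float = 5e-05,
    max_isotope: int = 1,
    isotope_mass_shift: float = 1.00335,  # assumed: 13C - 12C mass difference in Da
) -> bool:
    """Return True if the sequence mass fits the precursor mass for any
    isotope count from 0 to `max_isotope` within the relative tolerance."""
    for isotope in range(max_isotope + 1):
        # Mass expected if the precursor carries `isotope` extra neutrons.
        expected = sequence_mass + isotope * isotope_mass_shift
        relative_error = abs(expected - precursor_mass) / precursor_mass
        if relative_error <= mass_tolerance:
            return True
    return False


monoisotopic_match = matches_precursor(1000.0, 1000.04)
first_isotope_match = matches_precursor(1000.0, 1001.0)
no_isotope_allowed = matches_precursor(1000.0, 1001.0, max_isotope=0)
```

The default of `5e-05` corresponds to a tolerance of 50 parts per million.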
| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, Any]` | `list[list[str]]`: The predicted sequence as a list of residue tokens. This method will return an empty list for each spectrum in the batch where decoding fails, i.e. no sequence that fits the precursor mass to within a tolerance is found. |
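Because failed decodes are reported as empty lists, callers may want to separate them from successful predictions. The sketch below assumes the per-spectrum predictions have already been extracted into a plain `list[list[str]]`; how they are keyed in the returned dictionary is not documented here, and `split_failed` is a hypothetical helper, not part of the library.

```python
def split_failed(predictions: list[list[str]]) -> tuple[dict[int, list[str]], list[int]]:
    """Split per-spectrum predictions into successes and failed indices.

    An empty token list marks a spectrum for which no sequence matched
    the precursor mass within tolerance.
    """
    decoded: dict[int, list[str]] = {}
    failed_indices: list[int] = []
    for index, tokens in enumerate(predictions):
        if tokens:
            decoded[index] = tokens
        else:
            failed_indices.append(index)
    return decoded, failed_indices


decoded, failed_indices = split_failed([["A", "G"], [], ["K"]])
```

Keeping the original batch indices makes it straightforward to map failures back to their input spectra.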