Model

model

MassSpectrumTransformer

Bases: Pogfuse

A transformer model specialised for encoding mass spectra.

forward(x: Float[PeptideEmbedding, ' batch'], t_emb: Float[TimeEmbedding, ' batch'], precursor_emb: Float[Tensor, '...'], cond_emb: Optional[Float[SpectrumEmbedding, ' batch']] = None, x_padding_mask: Optional[Bool[PeptideMask, ' batch']] = None, cond_padding_mask: Optional[Bool[SpectrumMask, ' batch']] = None, pos_bias: Optional[Float[Tensor, '...']] = None) -> Float[Tensor, 'batch token embedding']

Compute encodings with the model.

Forward pass with x (bs, seq_len, dim), summing t_emb (bs, dim) before the transformer layers and appending cond_emb (bs, seq_len2, dim) to the key/value pairs of the attention. The pooled precursor embedding precursor_emb (bs, 1, dim) is also summed with the timestep embeddings.

Optionally specify a key/value padding mask for the input x with x_padding_mask (bs, seq_len), and a key/value padding mask for the conditioning embedding with cond_padding_mask (bs, seq_len2). By default no padding is used. It is generally a good idea to use the conditioning padding mask but not the x padding mask.

pos_bias is the positional bias for WavLM-style gated relative position bias attention.

Returns x with the same shape (bs, seq_len, dim).
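
A shape-level usage sketch, assuming `model` is an already-constructed MassSpectrumTransformer (its constructor is not documented on this page, and the precursor_emb shape shown is an assumption):

```python
import torch

bs, seq_len, seq_len2, dim = 4, 30, 150, 512

x = torch.randn(bs, seq_len, dim)          # noised peptide embeddings
t_emb = torch.randn(bs, dim)               # timestep embeddings, summed before the first layer
precursor_emb = torch.randn(bs, 1, dim)    # pooled precursor conditioning (assumed shape)
cond_emb = torch.randn(bs, seq_len2, dim)  # spectrum embeddings appended to attention keys/values
cond_padding_mask = torch.zeros(bs, seq_len2, dtype=torch.bool)  # True marks padded positions

out = model(
    x,
    t_emb,
    precursor_emb,
    cond_emb=cond_emb,
    cond_padding_mask=cond_padding_mask,  # conditioning padding used; x_padding_mask left as None
)
assert out.shape == (bs, seq_len, dim)
```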

MassSpectrumTransFusion(cfg: DictConfig, max_transcript_len: int = 200)

Bases: TransFusion

Diffusion reconstruction model conditioned on mass spectra.

layers = nn.ModuleList(layers) instance-attribute

conditioning_pos_emb = None instance-attribute

encoder = TransformerEncoder(dim_model=(cfg.dim), n_head=(cfg.nheads), dim_feedforward=(cfg.dim_feedforward), n_layers=(cfg.get('encoder_layers', cfg.get('layers', None))), dropout=(cfg.dropout), use_flash_attention=(cfg.get('use_flash_attention', False)), conv_peak_encoder=(cfg.get('conv_peak_encoder', False)), peak_embedding_dtype=(cfg.get('peak_embedding_dtype', torch.float64))) instance-attribute

charge_encoder = torch.nn.Embedding(cfg.max_charge, cfg.dim) instance-attribute

peak_encoder = self.encoder.peak_encoder instance-attribute

cache_spectra = None instance-attribute

cache_cond_emb = None instance-attribute

cache_cond_padding_mask = None instance-attribute
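
A minimal sketch of the configuration keys read by the attributes above, built with OmegaConf. The values are illustrative assumptions, not recommended settings; keys accessed via cfg.get(...) fall back to the defaults shown (use_flash_attention=False, conv_peak_encoder=False, peak_embedding_dtype=torch.float64).

```python
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "dim": 512,               # model/embedding dimension
        "nheads": 8,              # attention heads in the spectrum encoder
        "dim_feedforward": 1024,  # transformer feed-forward width
        "encoder_layers": 9,      # falls back to cfg.layers when absent
        "dropout": 0.1,
        "max_charge": 10,         # size of the charge embedding table
    }
)

# model = MassSpectrumTransFusion(cfg, max_transcript_len=200)  # import path not shown on this page
```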

forward(x: Integer[Peptide, ' batch'], t: Integer[TimeStep, ' batch'], spectra: Float[Spectrum, ' batch'], spectra_padding_mask: Bool[SpectrumMask, ' batch'], precursors: Float[PrecursorFeatures, ' batch'], x_padding_mask: Optional[Bool[PeptideMask, ' batch']] = None) -> Float[ResidueLogits, 'batch token']

Transformer with conditioning cross-attention.

  • x: (bs, seq_len) long tensor of character indices or (bs, seq_len, vocab_size) if cfg.diffusion_type == 'continuous'
  • t: (bs, ) long tensor of timestep indices
  • cond_emb: (bs, seq_len2, cond_emb_dim) if using the WavLM encoder, else (bs, T)
  • x_padding_mask: (bs, seq_len) if using the WavLM encoder, else (bs, T)
  • cond_padding_mask: (bs, seq_len2)

Returns logits of shape (bs, seq_len, vocab_size).
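
A shape-level sketch of the conditioned forward pass, assuming `model` is a constructed MassSpectrumTransFusion using discrete diffusion; the spectrum layout of m/z-intensity pairs, the precursor feature width, and vocab_size are assumptions, not taken from this page:

```python
import torch

bs, seq_len, n_peaks, vocab_size = 4, 30, 150, 32

x = torch.randint(0, vocab_size, (bs, seq_len))  # residue indices (discrete diffusion)
t = torch.randint(0, 100, (bs,))                 # diffusion timestep indices (upper bound illustrative)
spectra = torch.rand(bs, n_peaks, 2)             # assumed (m/z, intensity) pairs per peak
spectra_padding_mask = torch.zeros(bs, n_peaks, dtype=torch.bool)  # True marks padded peaks
precursors = torch.rand(bs, 3)                   # assumed precursor features (e.g. mass, charge, m/z)

logits = model(x, t, spectra, spectra_padding_mask, precursors)
assert logits.shape == (bs, seq_len, vocab_size)  # vocab_size must match the model's residue vocabulary
```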