Model
model
MassSpectrumTransformer
Bases: Pogfuse
A transformer model specialised for encoding mass spectra.
forward(x: Float[PeptideEmbedding, ' batch'], t_emb: Float[TimeEmbedding, ' batch'], precursor_emb: Float[Tensor, '...'], cond_emb: Optional[Float[SpectrumEmbedding, ' batch']] = None, x_padding_mask: Optional[Bool[PeptideMask, ' batch']] = None, cond_padding_mask: Optional[Bool[SpectrumMask, ' batch']] = None, pos_bias: Optional[Float[Tensor, '...']] = None) -> Float[Tensor, 'batch token embedding']
Compute encodings with the model.
Forward pass with x of shape (bs, seq_len, dim): t_emb (bs, dim) is summed onto x before the transformer layers, and cond_emb (bs, seq_len2, dim) is appended to the key/value pairs of the attention. precursor_emb (bs, 1, dim) is likewise summed with the timestep embeddings.
Optionally pass x_padding_mask (bs, seq_len) as the key/value padding mask for the input x, and cond_padding_mask (bs, seq_len2) as the key/value padding mask for the conditioning embedding. By default no padding is applied; in practice it is advisable to use cond_padding_mask but not x_padding_mask.
pos_bias is the positional bias for attention with WavLM-style gated relative position bias.
Returns x with the same shape (bs, seq_len, dim).
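A shape-level usage sketch, assuming `model` is an already-initialised MassSpectrumTransformer (its construction is not documented on this page); the batch and sequence sizes below are illustrative, and only the shapes come from the docstring above.

```python
import torch

# Illustrative sizes; only the shapes are taken from the docstring above.
bs, seq_len, seq_len2, dim = 4, 30, 150, 512

x = torch.randn(bs, seq_len, dim)                                # peptide embeddings
t_emb = torch.randn(bs, dim)                                     # timestep embeddings, summed onto x
precursor_emb = torch.randn(bs, 1, dim)                          # summed with the timestep embeddings
cond_emb = torch.randn(bs, seq_len2, dim)                        # spectrum embeddings joined to attention keys/values
cond_padding_mask = torch.zeros(bs, seq_len2, dtype=torch.bool)  # True marks padded peaks

out = model(  # model: an assumed, already-initialised MassSpectrumTransformer
    x,
    t_emb,
    precursor_emb,
    cond_emb=cond_emb,
    cond_padding_mask=cond_padding_mask,  # recommended; x_padding_mask is typically left as None
)
assert out.shape == (bs, seq_len, dim)
```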
MassSpectrumTransFusion(cfg: DictConfig, max_transcript_len: int = 200)
Bases: TransFusion
Diffusion reconstruction model conditioned on mass spectra.
layers = nn.ModuleList(layers)  (instance-attribute)
conditioning_pos_emb = None  (instance-attribute)
encoder = TransformerEncoder(dim_model=cfg.dim, n_head=cfg.nheads, dim_feedforward=cfg.dim_feedforward, n_layers=cfg.get('encoder_layers', cfg.get('layers', None)), dropout=cfg.dropout, use_flash_attention=cfg.get('use_flash_attention', False), conv_peak_encoder=cfg.get('conv_peak_encoder', False), peak_embedding_dtype=cfg.get('peak_embedding_dtype', torch.float64))  (instance-attribute)
charge_encoder = torch.nn.Embedding(cfg.max_charge, cfg.dim)  (instance-attribute)
peak_encoder = self.encoder.peak_encoder  (instance-attribute)
cache_spectra = None  (instance-attribute)
cache_cond_emb = None  (instance-attribute)
cache_cond_padding_mask = None  (instance-attribute)
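The attribute defaults above show which configuration keys the constructor reads. Below is a hypothetical minimal config as a sketch: the key names come from the defaults shown, the values are illustrative only, and the TransFusion base class may require additional keys not documented on this page (the instantiation line is therefore left commented).

```python
from omegaconf import OmegaConf

# Key names taken from the attribute defaults above; values are illustrative.
cfg = OmegaConf.create({
    "dim": 512,
    "nheads": 8,
    "dim_feedforward": 2048,
    "encoder_layers": 9,           # read via cfg.get('encoder_layers', cfg.get('layers', None))
    "dropout": 0.1,
    "use_flash_attention": False,  # optional, defaults to False
    "conv_peak_encoder": False,    # optional, defaults to False
    "max_charge": 10,              # size of the charge embedding table
    # "peak_embedding_dtype" defaults to torch.float64 when omitted
})

# model = MassSpectrumTransFusion(cfg, max_transcript_len=200)  # import path and remaining base-class keys omitted
```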
forward(x: Integer[Peptide, ' batch'], t: Integer[TimeStep, ' batch'], spectra: Float[Spectrum, ' batch'], spectra_padding_mask: Bool[SpectrumMask, ' batch'], precursors: Float[PrecursorFeatures, ' batch'], x_padding_mask: Optional[Bool[PeptideMask, ' batch']] = None) -> Float[ResidueLogits, 'batch token']
Transformer with conditioning cross attention.
x: (bs, seq_len) long tensor of character indices, or (bs, seq_len, vocab_size) if cfg.diffusion_type == 'continuous'
t: (bs,) long tensor of timestep indices
cond_emb: (bs, seq_len2, cond_emb_dim) if using the WavLM encoder, else (bs, T)
x_padding_mask: (bs, seq_len) if using the WavLM encoder, else (bs, T)
cond_padding_mask: (bs, seq_len2)
Returns logits of shape (bs, seq_len, vocab_size).
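A shape-level sketch of a forward pass, assuming `model` is an already-initialised MassSpectrumTransFusion; the peak feature width, precursor feature width, and vocabulary size below are assumptions, since only the leading dimensions are documented here.

```python
import torch

# Illustrative sizes; only the leading (bs, seq_len, n_peaks) dimensions come from the docstring above.
bs, seq_len, n_peaks, vocab_size = 4, 30, 150, 32

x = torch.randint(0, vocab_size, (bs, seq_len))                    # residue indices (discrete diffusion)
t = torch.randint(0, 1000, (bs,))                                  # diffusion timestep per sample
spectra = torch.randn(bs, n_peaks, 2)                              # peak features, e.g. (m/z, intensity); width is an assumption
spectra_padding_mask = torch.zeros(bs, n_peaks, dtype=torch.bool)  # True marks padded peaks
precursors = torch.randn(bs, 3)                                    # precursor features; width is an assumption
x_padding_mask = torch.zeros(bs, seq_len, dtype=torch.bool)        # True marks padded residues

logits = model(x, t, spectra, spectra_padding_mask, precursors, x_padding_mask)  # model: assumed initialised
assert logits.shape == (bs, seq_len, vocab_size)
```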