Mass Error Features
Computes the signed precursor mass error, correcting for possible isotope peak selection by the instrument. Two feature classes are available:
- `MassErrorPPMFeature`: error in parts per million (ppm)
- `MassErrorDaFeature`: error in Daltons on the neutral-mass scale
Both features share the same isotope-correction logic and differ only in the unit used for the output column.
Purpose
Mass accuracy is one of the most direct measures of PSM quality. A correctly identified peptide should have a precursor m/z very close to its theoretical m/z. Large mass errors often indicate:
- Incorrect peptide identification
- Unexpected modifications
- Instrument calibration issues
When the instrument selects a precursor ion, it may pick the M+1 or M+2 isotope peak instead of the monoisotopic (M0) peak. Without correction, this would introduce a ~1 Da error that could penalise correct PSMs. Both features account for this by evaluating multiple isotope offsets and selecting the one that gives the smallest absolute ppm error.
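To get a feel for the size of the effect, the minimal sketch below computes the apparent ppm error that arises purely from reporting the M+1 peak instead of M0. The helper name `isotope_shift_ppm`, the m/z of 500.0 and the charge of 2 are illustrative assumptions, not part of the library.

```python
C13_SHIFT = 1.00335  # carbon-13 isotope mass shift in Da


def isotope_shift_ppm(mz_measured: float, charge: int, isotope: int = 1) -> float:
    """Apparent ppm error caused solely by selecting the M+`isotope` peak."""
    # A neutral-mass shift of isotope * C13_SHIFT appears as shift / charge in m/z.
    shift_mz = isotope * C13_SHIFT / charge
    return shift_mz / mz_measured * 1e6


# Illustrative values only: a doubly charged precursor observed near m/z 500.
apparent_error = isotope_shift_ppm(mz_measured=500.0, charge=2)
print(f"{apparent_error:.0f} ppm")
```

An uncorrected M+1 selection here costs roughly a thousand ppm, about two orders of magnitude above the instrument accuracy quoted in the Notes, which is why the correction matters.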
Implementation
For each isotope offset in the configured isotope_error_range, the mass error is computed in m/z space. The isotope offset with the smallest absolute ppm error is selected, and its signed value is stored in the requested unit.
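Parts per million (ppm), for isotope offset $k$:

$$
\text{ppm}_k = \frac{mz_{\text{theoretical}} - \left(mz_{\text{measured}} - k \times 1.00335 / z\right)}{mz_{\text{measured}}} \times 10^{6}
$$

Daltons (neutral-mass scale), for the same isotope offset $k$:

$$
\text{da}_k = \left(mz_{\text{theoretical}} - \left(mz_{\text{measured}} - k \times 1.00335 / z\right)\right) \times z
$$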
Where:
- mz_theoretical = `(neutral_mass + z × proton_mass) / z`
- neutral_mass = `sum(residue_masses) + water_mass`
- 1.00335 is the carbon-13 isotope mass shift
- z is the precursor charge
The two outputs are related by da = ppm × mz_measured / 1e6 × z.
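The procedure described in this section can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's actual code: the function names `theoretical_mz` and `mass_error` are hypothetical, and the proton and water masses are standard monoisotopic values that may differ slightly from the constants the library uses.

```python
from typing import Dict, List, Tuple

PROTON_MASS = 1.007276  # Da (assumed constant)
WATER_MASS = 18.010565  # Da (assumed constant)
C13_SHIFT = 1.00335     # Da, carbon-13 isotope mass shift


def theoretical_mz(
    tokens: List[str], charge: int, residue_masses: Dict[str, float]
) -> float:
    """Theoretical precursor m/z of a peptide given as residue tokens."""
    neutral_mass = sum(residue_masses[t] for t in tokens) + WATER_MASS
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_error(
    mz_measured: float,
    charge: int,
    tokens: List[str],
    residue_masses: Dict[str, float],
    isotope_error_range: Tuple[int, int] = (0, 1),
) -> Tuple[float, float]:
    """Return (ppm, da) for the isotope offset with the smallest |ppm|."""
    mz_theo = theoretical_mz(tokens, charge, residue_masses)
    low, high = isotope_error_range
    # Signed ppm error for every offset in the inclusive range.
    ppm_errors = [
        (mz_theo - (mz_measured - k * C13_SHIFT / charge)) / mz_measured * 1e6
        for k in range(low, high + 1)
    ]
    best_ppm = min(ppm_errors, key=abs)
    # Convert the selected ppm value to Daltons on the neutral-mass scale.
    best_da = best_ppm * mz_measured / 1e6 * charge
    return best_ppm, best_da


# Illustrative usage with a subset of residue masses.
masses = {"P": 97.052764, "E": 129.042593, "T": 101.047670,
          "I": 113.084064, "D": 115.026943}
peptide = list("PEPTIDE")
observed = theoretical_mz(peptide, 2, masses) + C13_SHIFT / 2  # simulated M+1 pick
ppm, da = mass_error(observed, 2, peptide, masses)
print(f"ppm={ppm:.3f}, da={da:.6f}")
```

With the default range `(0, 1)`, the simulated M+1 selection is matched by offset 1 and the error collapses to approximately zero; restricting the range to `(0, 0)` would leave an error of well over 1000 ppm for the same spectrum.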
Columns
| Column | Feature class | Unit | Description |
|---|---|---|---|
| `mass_error_ppm` | `MassErrorPPMFeature` | Parts per million (ppm) | Signed precursor mass error after isotope correction. Negative = observed m/z is heavier than theoretical. |
| `mass_error_da` | `MassErrorDaFeature` | Daltons (Da) | Same error on the neutral-mass scale. Negative = observed m/z is heavier than theoretical. |
Usage
```python
from winnow.calibration.calibrator import ProbabilityCalibrator
from winnow.calibration.features import MassErrorDaFeature

# Monoisotopic residue masses in Daltons, keyed by residue token
residue_masses = {
    "G": 57.021464,
    "A": 71.037114,
    "P": 97.052764,
    "E": 129.042593,
    "T": 101.047670,
    "I": 113.084064,
    "D": 115.026943,
    "R": 156.101111,
    "O": 237.147727,
    "N": 114.042927,
    "S": 87.032028,
    "M": 131.040485,
    "L": 113.084064,
}

# Evaluate the M0 and M+1 isotope offsets
da_feature = MassErrorDaFeature(
    residue_masses=residue_masses,
    isotope_error_range=(0, 1),
)

calibrator = ProbabilityCalibrator(seed=42)
calibrator.add_feature(da_feature)
```
Parameters
Both feature classes accept the same constructor arguments:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `residue_masses` | `Dict[str, float]` | Required | Mapping of residue tokens to monoisotopic masses in Daltons |
| `isotope_error_range` | `Tuple[int, int]` | `(0, 1)` | Range of isotope offsets to evaluate (inclusive). `(0, 1)` considers M0 and M+1. |
Requirements
The dataset must contain:
- `precursor_mz`: Observed precursor m/z value
- `precursor_charge`: Precursor charge state (integer)
- `prediction`: List of residue tokens for the predicted peptide
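As a rough illustration of the expected shape, the sketch below uses a plain dictionary to stand in for a single dataset row and checks it for the three required fields. The helper `missing_columns` and the example values are hypothetical; the actual dataset container used by the library may look different.

```python
from typing import Any, Dict, List

REQUIRED_COLUMNS = ("precursor_mz", "precursor_charge", "prediction")


def missing_columns(record: Dict[str, Any]) -> List[str]:
    """Return the required field names that are absent from `record`."""
    return [name for name in REQUIRED_COLUMNS if name not in record]


# Hypothetical row: a doubly charged precursor with a tokenised prediction.
example_record = {
    "precursor_mz": 400.6873,
    "precursor_charge": 2,
    "prediction": ["P", "E", "P", "T", "I", "D", "E"],
}

print(missing_columns(example_record))
```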
Notes
- The error is signed: negative values indicate the observed m/z is heavier than theoretical
- Isotope selection always uses the smallest absolute ppm error, even when computing the Da column
- Typical mass accuracy for modern instruments is < 10 ppm
- The `isotope_error_range` should match the setting used by your data loader
- For modifications, ensure the `residue_masses` dictionary includes modified residue tokens (e.g., `"M[UNIMOD:35]"` for oxidised methionine)
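The last note can be illustrated with a short sketch that adds an oxidised-methionine token to the mass table and checks a prediction for tokens the table does not cover. The variable names are illustrative; the only chemistry assumed is that oxidation adds one oxygen atom (15.994915 Da monoisotopic) to the unmodified residue.

```python
OXIDATION_SHIFT = 15.994915  # Da, monoisotopic mass of one oxygen atom

# Illustrative subset of the residue mass table.
residue_masses = {"G": 57.021464, "A": 71.037114, "M": 131.040485}

# Register the modified token as unmodified residue mass plus the shift.
residue_masses["M[UNIMOD:35]"] = residue_masses["M"] + OXIDATION_SHIFT

# Any token missing from the table would make the theoretical mass undefined.
prediction = ["A", "M[UNIMOD:35]", "G"]
unknown_tokens = [token for token in prediction if token not in residue_masses]

print(f"{residue_masses['M[UNIMOD:35]']:.4f}", unknown_tokens)
```

Running a check like this before calibration surfaces missing modification entries early instead of during feature computation.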