NIST Peptide libraries
Downloads
Dataset Description
The original dataset is 646 MB (zipped). The MSP library was parsed into a tabular format, retaining only the peak intensities of singly charged b- and y-ions, and then randomly split into a test subset (3.4 MB, 27 036 spectra) and a train/validation subset (30 MB, 243 404 spectra). Files with encoded peptides were processed for ML as described in the fragmentation tutorial "NIST (part 2): Traditional ML: Gradient boosting".
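The parsing and splitting steps described above could look roughly like the following. This is a minimal sketch, not the code used to build the dataset: the function names `parse_msp` and `split_spectra` are hypothetical, the MSP layout (a `Name:` header followed by peak lines of m/z, intensity, and a quoted annotation such as `"b2/0.5ppm"`) is an assumption about the library files, and the test fraction of 0.1 is only inferred from the spectrum counts (27 036 of 270 440 is about 10%).

```python
import random
import re

# Assumed annotation style: "b2/0.5ppm" or "y11/-1.2ppm". A charge suffix
# ("^2") or neutral loss ("-H2O") right after the ion number makes the
# match fail, so only singly charged, unmodified b-/y-ions are kept.
ION_RE = re.compile(r"^([by])(\d+)(?:/|$)")


def parse_msp(text):
    """Parse MSP-style text into a list of {"name", "ions"} records."""
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Name:"):
            # A "Name:" header starts a new spectrum.
            current = {"name": line.split(":", 1)[1].strip(), "ions": {}}
            spectra.append(current)
        elif current is not None and line[0].isdigit():
            # Peak line: m/z, intensity, annotation (whitespace separated).
            parts = line.split(None, 2)
            if len(parts) < 3:
                continue  # peak without an annotation
            annotation = parts[2].strip().strip('"').split(",")[0]
            match = ION_RE.match(annotation)
            if match:
                ion = match.group(1) + match.group(2)
                current["ions"][ion] = float(parts[1])
    return spectra


def split_spectra(spectra, test_fraction=0.1, seed=0):
    """Shuffle with a fixed seed and return (train_val, test) lists."""
    shuffled = list(spectra)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Tiny made-up example; the values are illustrative, not from the library.
SAMPLE = '''Name: PEPTIDEK/2_0
MW: 927.5
Num peaks: 4
175.1\t1000.0\t"y1/0.1ppm"
227.1\t500.0\t"b2/0.3ppm"
300.2\t50.0\t"y5^2/1.0ppm"
400.3\t20.0\t"?"

Name: ACDEK/2_0
Num peaks: 1
275.1\t800.0\t"b3/0.2ppm"
'''

spectra = parse_msp(SAMPLE)
train_val, test = split_spectra(spectra, test_fraction=0.5)
print(spectra[0]["ions"])
print(len(train_val), len(test))
```

Each record's `ions` dictionary can then be flattened into one table row per spectrum. Fixing the shuffle seed keeps the test subset reproducible across runs.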
Attributes
- title: NIST
- dataset tag: fragmentation/nist
- data publication: Sheetlin et al. 2020
- machine learning publication:
- data source identifier:
- data type: fragmentation intensity
- format: MSP
- columns:
- instrument:
- organism: Homo sapiens (human)
- fixed modifications: Carbamidomethylation of C
- variable modifications: Oxidation of M (unmodified peptides are also included)
- dissociation method: HCD (beam-type CID)
- collision energy: various
- mass analyzer type: Orbitrap
- spectra encoding:
Sample Protocol
See chemdata.nist.gov for more information.
Data Analysis Protocol
Consensus spectral libraries generated by NIST, the US National Institute of Standards and Technology.
Comments
/