The ProForma standard and implementation
ProForma is a standard for representing proteoforms and peptidoforms, developed by the PSI (Proteomics Standards Initiative). It provides a structured way to represent peptide sequences together with a wide variety of modifications and uncertainties.
Pyteomics supports ProForma v2.0. The core functions and classes for ProForma support are located in the proforma module; see proforma - Proteoform and Peptidoform Notation for more information.
Basic usage
The ProForma parser is object-oriented, with a primary class ProForma representing a parsed ProForma sequence.
To instantiate a ProForma object, use the class method ProForma.parse():
.. code-block:: python

    >>> from pyteomics.proforma import ProForma
    >>> seq = ProForma.parse("EM[Oxidation]EVT[Phospho]SES[Phospho]PEK")
    >>> seq
    ProForma([('E', None), ('M', [GenericModification('Oxidation', None, None)]), ('E', None), ('V', None), ('T', [GenericModification('Phospho', None, None)]), ('S', None), ('E', None), ('S', [GenericModification('Phospho', None, None)]), ('P', None), ('E', None), ('K', None)], {'n_term': None, 'c_term': None, 'unlocalized_modifications': [], 'labile_modifications': [], 'fixed_modifications': [], 'intervals': [], 'isotopes': [], 'group_ids': [], 'charge_state': None})
    >>> seq.mass
    1440.47687500136
    >>> seq.composition()
    Composition({'H': 86, 'C': 51, 'O': 30, 'N': 12, 'S': 1, 'P': 2})