proforma - Proteoform and Peptidoform Notation¶
ProForma is a notation for defining modified amino acid sequences using a set of controlled vocabularies, as well as encoding uncertain or partial information about localization. See ProForma specification for more up-to-date information.
Strictly speaking, this implementation supports ProForma v2.
Data Access¶
parse()
- The primary interface for parsing ProForma strings.
>>> parse("EM[Oxidation]EVT[#g1(0.01)]S[#g1(0.09)]ES[Phospho#g1(0.90)]PEK")
([('E', None),
('M', [GenericModification('Oxidation', None, None)]),
('E', None),
('V', None),
('T', [LocalizationMarker(0.01, None, '#g1')]),
('S', [LocalizationMarker(0.09, None, '#g1')]),
('E', None),
('S',
[GenericModification('Phospho', [LocalizationMarker(0.9, None, '#g1')], '#g1')]),
('P', None),
('E', None),
('K', None)],
{'n_term': None,
'c_term': None,
'unlocalized_modifications': [],
'labile_modifications': [],
'fixed_modifications': [],
'intervals': [],
'isotopes': [],
'group_ids': ['#g1']})
to_proforma()
- Format a sequence and set of properties as ProForma text.
Classes¶
ProForma
- An object-oriented version of the parsing and formatting code, coupled with minimal information about mass and position data.
>>> seq = ProForma.parse("EM[Oxidation]EVT[#g1(0.01)]S[#g1(0.09)]ES[Phospho#g1(0.90)]PEK")
>>> seq
ProForma([('E', None), ('M', [GenericModification('Oxidation', None, None)]), ('E', None),
('V', None), ('T', [LocalizationMarker(0.01, None, '#g1')]), ('S', [LocalizationMarker(0.09, None, '#g1')]),
('E', None), ('S', [GenericModification('Phospho', [LocalizationMarker(0.9, None, '#g1')], '#g1')]),
('P', None), ('E', None), ('K', None)],
{'n_term': None, 'c_term': None, 'unlocalized_modifications': [],
'labile_modifications': [], 'fixed_modifications': [], 'intervals': [],
'isotopes': [], 'group_ids': ['#g1'], 'charge_state': None}
)
>>> seq.mass
1360.51054400136
>>> seq.tags
[GenericModification('Oxidation', None, None),
LocalizationMarker(0.01, None, '#g1'),
LocalizationMarker(0.09, None, '#g1'),
GenericModification('Phospho', [LocalizationMarker(0.9, None, '#g1')], '#g1')]
>>> str(seq)
'EM[Oxidation]EVT[#g1(0.01)]S[#g1(0.09)]ES[Phospho|#g1(0.9)]PEK'
Dependencies¶
To resolve PSI-MOD, XL-MOD, and GNO identifiers, psims is required. By default, psims retrieves the most recent version of each controlled vocabulary from the internet, but includes a fall-back version to use when the network is unavailable. It can also create an application cache on disk.
CV Disk Caching¶
ProForma uses several different controlled vocabularies (CVs) that are each versioned separately. Internally, the Unimod controlled vocabulary is accessed using Unimod and all other controlled vocabularies are accessed using psims. Unless otherwise stated, the machinery will download fresh copies of each CV when first queried.
To avoid this slow operation, you can keep a cached copy of each CV source file on disk and tell pyteomics and psims where to find them:
from pyteomics import proforma
# set the path for Unimod loading via pyteomics
proforma.set_unimod_path("path/to/unimod.xml")
# set the cache directory for downloading and reloading OBOs via psims
proforma.obo_cache.cache_path = "obo/cache/dir/"
proforma.obo_cache.enabled = True
Compliance Levels¶
1. Base Level Support. Represents the lowest level of compliance. This level involves providing support for:
- [x] Amino acid sequences
- [x] Protein modifications using two of the supported CVs/ontologies: Unimod and PSI-MOD.
- [x] Protein modifications using delta masses (without prefixes)
- [x] N-terminal, C-terminal and labile modifications.
- [x] Ambiguity in the modification position, including support for localisation scores.
- [x] INFO tag.
2. Additional Separate Support. These features are independent from each other:
- [x] Unusual amino acids (O and U).
- [x] Ambiguous amino acids (e.g. X, B, Z). This would include support for sequence tags of known mass (using the character X).
- [x] Protein modifications using delta masses (using prefixes for the different CVs/ontologies).
- [x] Use of prefixes for Unimod (U:) and PSI-MOD (M:) names.
- [x] Support for the joint representation of experimental data and its interpretation.
Top Down Extensions
- [ ] Additional CV/ontologies for protein modifications: RESID (the prefix R MUST be used for RESID CV/ontology term names)
- [x] Chemical formulas (this feature occurs in two places in this list).
Cross-Linking Extensions
- [ ] Cross-linked peptides (using the XL-MOD CV/ontology, the prefix X MUST be used for XL-MOD CV/ontology term names).
Glycan Extensions
- [x] Additional CV/ontologies for protein modifications: GNO (the prefix G MUST be used for GNO CV/ontology term names)
- [x] Glycan composition.
- [x] Chemical formulas (this feature occurs in two places in this list).
Spectral Support
- [x] Charge state and adducts
- [ ] Chimeric spectra are special cases.
- [x] Global modifications (e.g., every C is C13).
Functions¶
-
pyteomics.proforma.
parse
(sequence)[source]¶ Tokenize a ProForma sequence into a sequence of amino acid+tag positions, and a mapping of sequence-spanning modifiers.
Note
This is a state machine parser, but with certain sub-state paths unrolled to avoid an explosion of formal intermediary states.
Parameters: sequence (str) – The sequence to parse.
Returns:
- parsed_sequence (list[tuple[str, list[TagBase]]]) – The (amino acid: str, list of TagBase or None) pairs denoting the positions along the primary sequence.
- modifiers (dict) – A mapping listing the labile modifications, fixed modifications, stable isotopes, unlocalized modifications, tagged intervals, and group IDs.
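The shape of the first return value can be illustrated with a much smaller tokenizer. The sketch below is not the library's parser: the name tokenize is hypothetical, tags are kept as raw strings rather than TagBase objects, and it assumes only single-letter uppercase residues followed by zero or more bracketed tags. Terminal, labile, global, and interval syntax, as well as nested brackets, are deliberately out of scope.

```python
def tokenize(sequence):
    """Split a simple ProForma-like string into (amino acid, tags or None) pairs.

    Hypothetical helper for illustration only; tags are returned as raw text.
    """
    positions = []
    i = 0
    while i < len(sequence):
        aa = sequence[i]
        if not (aa.isalpha() and aa.isupper()):
            raise ValueError(f"unexpected character {aa!r} at index {i}")
        i += 1
        tags = None
        # Consume every bracketed tag that directly follows this residue.
        while i < len(sequence) and sequence[i] == "[":
            end = sequence.find("]", i)
            if end == -1:
                raise ValueError(f"unclosed tag starting at index {i}")
            if tags is None:
                tags = []
            tags.append(sequence[i + 1:end])
            i = end + 1
        positions.append((aa, tags))
    return positions


pairs = tokenize("EM[Oxidation]EVT[#g1(0.01)]S")
```

As in the parse() output shown earlier, unmodified residues carry None rather than an empty list.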
-
pyteomics.proforma.
to_proforma
(sequence, n_term=None, c_term=None, unlocalized_modifications=None, labile_modifications=None, fixed_modifications=None, intervals=None, isotopes=None, charge_state=None, group_ids=None)[source]¶ Convert a sequence plus modifiers into formatted text following the ProForma specification.
Parameters: - sequence (list[tuple[str, TagBase]]) – The primary sequence of the peptidoform/proteoform to render
- n_term (Optional[TagBase]) – The N-terminal modification, if any.
- c_term (Optional[TagBase]) – The C-terminal modification, if any.
- unlocalized_modifications (Optional[list[TagBase]]) – Any modifications which aren’t assigned to a specific location.
- labile_modifications (Optional[list[TagBase]]) – Any labile modifications
- fixed_modifications (Optional[list[ModificationRule]]) – Any fixed modifications
- intervals (Optional[list[TaggedInterval]]) – A list of modified intervals, if any
- isotopes (Optional[list[StableIsotope]]) – Any global stable isotope labels applied
- charge_state (Optional[ChargeState]) – An optional charge state value
- group_ids (Optional[list[str]]) – Any group identifiers. This parameter is currently not used.
Return type: str
Helpers¶
-
pyteomics.proforma.
set_unimod_path
(path)[source]¶ Set the path to load the Unimod database from for resolving ProForma Unimod modifications.
Note
This method ensures that the Unimod modification database loads quickly from a local database file instead of downloading a new copy from the internet.
Parameters: path (str or file-like object) – A path to or file-like object for the “unimod.xml” file.
Return type: Unimod
High Level Interface¶
-
class
pyteomics.proforma.
ProForma
(sequence, properties)[source]¶ Bases:
object
Represent a parsed ProForma sequence.
The preferred way to instantiate this class is via the parse() method.
-
sequence
¶ The list of (amino acid, tag collection) pairs making up the primary sequence of the peptide.
Type: list[tuple[str, List[TagBase]]]
-
isotopes
¶ A list of any stable isotope rules that apply to this peptide
Type: list[StableIsotope]
-
intervals
¶ Any annotated intervals that contain either sequence ambiguity or a tag over that interval.
Type: list[Interval]
-
labile_modifications
¶ Any modifications that were parsed as labile, and may not appear at any location on the peptide primary sequence.
Type: list[ModificationBase]
-
unlocalized_modifications
¶ Any modifications that were not localized but may be attached to peptide sequence evidence.
Type: list[ModificationBase]
-
n_term
¶ Any modifications on the N-terminus of the peptide
Type: list[ModificationBase]
-
c_term
¶ Any modifications on the C-terminus of the peptide
Type: list[ModificationBase]
-
mass
¶ The computed mass for the fully modified peptide, including labile and unlocalized modifications. Does not include stable isotopes at this time
Type: float
-
__init__
(sequence, properties)[source]¶ Initialize self. See help(type(self)) for accurate signature.
Find all occurrences of a particular tag ID
Parameters: Returns: Return type:
-
fragments
(ion_shift, charge=1, reverse=None, include_labile=True, include_unlocalized=True)[source]¶ The function generates all possible fragments of the requested series type.
Parameters:
- ion_shift (float or str) – The mass shift of the ion series, or the name of the ion series.
- charge (int) – The charge state of the theoretical fragment masses to generate. Defaults to 1+. If 0 is passed, neutral masses will be returned.
- reverse (bool, optional) – Whether to fragment from the N-terminus (False) or C-terminus (True). If ion_shift is a str, the terminal will be inferred from the series name. Otherwise, defaults to False.
- include_labile (bool, optional) – Whether or not to include dissociated modification masses. Defaults to True.
- include_unlocalized (bool, optional) – Whether or not to include unlocalized modification masses. Defaults to True.
Return type: np.ndarray
Examples
>>> from pyteomics import proforma
>>> p = proforma.ProForma.parse("PEPTIDE")
>>> p.fragments('b', charge=1)
array([ 98.06004032, 227.1026334 , 324.15539725, 425.20307572, 538.2871397 , 653.31408272])
>>> p.fragments('y', charge=1)
array([148.06043424, 263.08737726, 376.17144124, 477.21911971, 574.27188356, 703.31447664])
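The b and y values in this example can be reproduced by hand as running sums of residue masses. The following is a standalone sketch, not the library's implementation: the function name singly_charged_series and the small mass table are assumptions made for illustration, the table covers only the residues of "PEPTIDE", and modifications are ignored. It assumes b-type ions are the N-terminal prefix plus one proton, and y-type ions are the C-terminal suffix plus water and one proton.

```python
# Monoisotopic residue masses in Da, limited to the residues in "PEPTIDE".
RESIDUE_MASS = {
    "P": 97.05276385,
    "E": 129.04259309,
    "T": 101.04767847,
    "I": 113.08406398,
    "D": 115.02694303,
}
PROTON = 1.00727646677
WATER = 18.0105646837


def singly_charged_series(sequence, from_c_terminus=False):
    """Return 1+ fragment m/z values for an unmodified peptide.

    from_c_terminus=False gives b-type ions, True gives y-type ions.
    """
    residues = sequence[::-1] if from_c_terminus else sequence
    # y-type fragments keep the C-terminal water; b-type fragments do not.
    running = WATER if from_c_terminus else 0.0
    masses = []
    # The full-length peptide is not a fragment, so stop one residue short.
    for aa in residues[:-1]:
        running += RESIDUE_MASS[aa]
        masses.append(running + PROTON)
    return masses


b_ions = singly_charged_series("PEPTIDE")
y_ions = singly_charged_series("PEPTIDE", from_c_terminus=True)
```

Both lists contain six values, one per cleavage site, and agree with the arrays above to within rounding of the mass constants.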
-
Tag Types¶
-
class
pyteomics.proforma.
TagBase
(type, value, extra=None, group_id=None)[source]¶ Bases:
object
A base class for all tag types.
-
type
¶ An element of TagTypeEnum saying what kind of tag this is.
Type: Enum
-
extra
¶ Any extra tags that were nested within this tag. Usually limited to INFO tags but may be other synonymous controlled vocabulary terms.
Type: list
-
__init__
(type, value, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
find_tag_type
(tag_type)[source]¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
Modification Tags¶
-
class
pyteomics.proforma.
MassModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.TagBase
A modification defined purely by a signed mass shift in Daltons.
The value of a MassModification is always a float.
-
__init__
(value, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
-
class
pyteomics.proforma.
ModificationBase
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.TagBase
A base class for all modification tags with marked prefixes.
While ModificationBase is hashable, its equality testing brings in additional tag-related information. For pure modification identity comparison, use key to get a ModificationToken free of these concerns.
-
__init__
(value, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
-
class
pyteomics.proforma.
GenericModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
-
class
pyteomics.proforma.
FormulaModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
-
class
pyteomics.proforma.
UnimodModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
provider
¶ The name of the controlled vocabulary that provided this modification.
Return type: str
-
resolve
()¶ Find the term and return its properties.
-
-
class
pyteomics.proforma.
PSIModModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
provider
¶ The name of the controlled vocabulary that provided this modification.
Return type: str
-
resolve
()¶ Find the term and return its properties.
-
-
class
pyteomics.proforma.
XLMODModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
provider
¶ The name of the controlled vocabulary that provided this modification.
Return type: str
-
resolve
()¶ Find the term and return its properties.
-
-
class
pyteomics.proforma.
GNOmeModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
provider
¶ The name of the controlled vocabulary that provided this modification.
Return type: str
-
resolve
()¶ Find the term and return its properties.
-
-
class
pyteomics.proforma.
GlycanModification
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.ModificationBase
-
__init__
(value, extra=None, group_id=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
composition
¶ The chemical composition shift this modification applies
-
definition
¶ A dict of properties describing this modification, given by the providing controlled vocabulary. This value is cached, and should not be modified.
Return type: dict
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
id
¶ The unique identifier given to this modification by its provider
Return type: str or int
-
key
¶ Get a safe-to-hash-and-compare ModificationToken representing this modification without tag-like properties.
Return type: ModificationToken
-
mass
¶ The monoisotopic mass shift this modification applies
Return type: float
-
-
class
pyteomics.proforma.
ModificationToken
(name, id, provider, source_cls)[source]¶ Bases:
object
Describes a particular modification from a particular provider, independent of a TagBase’s state.
This class is meant to be used in place of a ModificationBase object when equality testing and hashing are desired but extra properties should not be involved.
ModificationToken is comparable and hashable, and can be compared with ModificationBase subclass instances safely. It can be called to create a new instance of the ModificationBase it is equal to.
-
id
¶ Whatever unique identifier the providing controlled vocabulary gave to this modification
Type: int or str
-
source_cls
¶ A sub-class of ModificationBase that will be used to fulfill this token if requested, providing it a resolver.
Type: type
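The motivation for a separate token type can be shown with plain dataclasses. This is a sketch of the general idea only, not the library's classes: Token and Tag are hypothetical stand-ins, and the real ModificationToken additionally carries source_cls and can be compared with ModificationBase instances directly. Here a tag's equality includes tag-level state such as its group ID, while its key exposes identity alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """Hypothetical identity-only key: hashable, ignores tag-level state."""
    name: str
    id: Optional[int]
    provider: str


@dataclass
class Tag:
    """Hypothetical modification tag that also carries a group ID."""
    name: str
    id: Optional[int]
    provider: str
    group_id: Optional[str] = None

    @property
    def key(self) -> Token:
        # Drop group_id so the same modification always maps to the same key.
        return Token(self.name, self.id, self.provider)


localized = Tag("Phospho", 21, "unimod", group_id="#g1")
plain = Tag("Phospho", 21, "unimod")

# Count modifications by identity, regardless of grouping information.
counts = {}
for tag in (localized, plain):
    counts[tag.key] = counts.get(tag.key, 0) + 1
```

The two tags compare unequal because their group IDs differ, yet both contribute to the same dictionary entry through their keys.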
-
Label Tags¶
-
class
pyteomics.proforma.
InformationTag
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.TagBase
A tag carrying free text describing the location
-
__init__
(value, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
-
class
pyteomics.proforma.
PositionLabelTag
(value=None, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.GroupLabelBase
A tag to mark that a position is involved in a group in some way, but does not imply any specific semantics.
-
__init__
(value=None, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
-
class
pyteomics.proforma.
LocalizationMarker
(value, extra=None, group_id=None)[source]¶ Bases:
pyteomics.proforma.GroupLabelBase
A tag to mark a particular localization site
-
__init__
(value, extra=None, group_id=None)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
find_tag_type
(tag_type)¶ Search this tag or tag collection for elements with a particular tag type and return them.
Parameters: tag_type (TagTypeEnum) – A label from TagTypeEnum, or an equivalent type.
Returns: matches – The list of all tags in this object which match the requested tag type.
Return type: list
-
Supporting Types¶
-
class
pyteomics.proforma.
ModificationRule
(modification_tag, targets=None)[source]¶ Bases:
object
Define a fixed modification rule which dictates a modification tag is always applied at one or more amino acid residues.
-
class
pyteomics.proforma.
StableIsotope
(isotope)[source]¶ Bases:
object
Define a fixed isotope that is applied globally to all amino acids.
-
class
pyteomics.proforma.
TaggedInterval
(start, end=None, tags=None, ambiguous=False)[source]¶ Bases:
object
Define a fixed interval over the associated sequence which contains the localization of the associated tag or denotes a region of general sequence order ambiguity.
-
class
pyteomics.proforma.
ChargeState
(charge, adducts=None)[source]¶ Bases:
object
Describes the charge and adduct types of the structure.
Modification Resolvers¶
-
class
pyteomics.proforma.
ModificationResolver
(name, **kwargs)[source]¶ Bases:
object
-
parse_identifier
(identifier)[source]¶ Parse a string that is either a CV prefixed identifier or name.
Parameters: identifier (str) – The identifier string to parse, removing CV prefix as needed.
Returns:
- name (str, optional) – A textual identifier embedded in the qualified identifier, if any, otherwise None.
- id (int, optional) – An integer ID embedded in the qualified identifier, if any, otherwise None.
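The (name, id) contract described above can be sketched as a standalone function. This is an illustration under simplifying assumptions, not the resolvers' actual code: it assumes the CV prefix is everything before the first colon, that a purely numeric remainder is an integer ID, and that anything else is a name. The function name split_identifier is hypothetical, and individual resolvers may treat prefixes or non-numeric accessions differently.

```python
from typing import Optional, Tuple


def split_identifier(identifier: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (name, id) for a CV-prefixed identifier or bare name.

    Exactly one of the two returned values is not None.
    """
    _prefix, separator, remainder = identifier.partition(":")
    # Without a colon there is no prefix, so the whole string is the value.
    value = remainder if separator else identifier
    if value.isdigit():
        return None, int(value)
    return value, None


by_id = split_identifier("UNIMOD:35")
by_name = split_identifier("U:Oxidation")
bare = split_identifier("Oxidation")
```

Leading zeros in an accession are dropped by the integer conversion, so an identifier such as "MOD:00046" yields the ID 46.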
-
-
class
pyteomics.proforma.
GenericResolver
(resolvers, **kwargs)[source]¶ Bases:
pyteomics.proforma.ModificationResolver
-
__init__
(resolvers, **kwargs)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
parse_identifier
(identifier)[source]¶ Parse a string that is either a CV prefixed identifier or name.
Does no parsing as a GenericModification is never qualified.
Parameters: identifier (str) – The identifier string to parse, removing CV prefix as needed.
Returns:
- name (str, optional) – A textual identifier embedded in the qualified identifier, if any, otherwise None.
- id (int, optional) – An integer ID embedded in the qualified identifier, if any, otherwise None.
-
-
class
pyteomics.proforma.
UnimodResolver
(**kwargs)[source]¶ Bases:
pyteomics.proforma.ModificationResolver
-
parse_identifier
(identifier)¶ Parse a string that is either a CV prefixed identifier or name.
Parameters: identifier (str) – The identifier string to parse, removing CV prefix as needed.
Returns:
- name (str, optional) – A textual identifier embedded in the qualified identifier, if any, otherwise None.
- id (int, optional) – An integer ID embedded in the qualified identifier, if any, otherwise None.
-
-
class
pyteomics.proforma.
PSIModResolver
(**kwargs)[source]¶ Bases:
pyteomics.proforma.ModificationResolver
-
parse_identifier
(identifier)¶ Parse a string that is either a CV prefixed identifier or name.
Parameters: identifier (str) – The identifier string to parse, removing CV prefix as needed.
Returns:
- name (str, optional) – A textual identifier embedded in the qualified identifier, if any, otherwise None.
- id (int, optional) – An integer ID embedded in the qualified identifier, if any, otherwise None.
-
-
class
pyteomics.proforma.
XLMODResolver
(**kwargs)[source]¶ Bases:
pyteomics.proforma.ModificationResolver
-
parse_identifier
(identifier)¶ Parse a string that is either a CV prefixed identifier or name.
Parameters: identifier (str) – The identifier string to parse, removing CV prefix as needed.
Returns:
- name (str, optional) – A textual identifier embedded in the qualified identifier, if any, otherwise None.
- id (int, optional) – An integer ID embedded in the qualified identifier, if any, otherwise None.
-
-
class
pyteomics.proforma.
GNOResolver
(**kwargs)[source]¶ Bases:
pyteomics.proforma.ModificationResolver
-
get_mass_from_glycan_composition
(term)[source]¶ Parse the Byonic-style glycan composition from property GNO:00000202 to get the counts of each monosaccharide and use that to calculate mass.
The mass computed here is exact and dehydrated, distinct from the rounded-off mass that get_mass_from_term() will produce by walking up the CV term hierarchy. However, not all glycan compositions are representable in GNO:00000202 format, so this may silently be absent or incomplete, hence the double-check in get_mass_from_term().
Parameters: term (psims.controlled_vocabulary.Entity) – The CV entity being parsed.
Returns: mass – If a glycan composition is found on the term, the computed mass will be returned. Otherwise, None is returned.
Return type: float or None
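The counting step can be sketched without psims. The example below is an assumption-laden illustration rather than the resolver's code: it assumes the composition string looks like "HexNAc(2)Hex(5)", uses a small hand-written table of dehydrated monosaccharide residue masses, and the function name glycan_composition_mass is hypothetical. Like the method described above, it returns None when the composition cannot be fully interpreted.

```python
import re

# Monoisotopic masses (Da) of dehydrated monosaccharide residues.
# Illustrative subset only; real compositions may use other residue names.
MONOSACCHARIDE_MASS = {
    "Hex": 162.0528234,
    "HexNAc": 203.0793725,
    "Fuc": 146.0579088,
    "NeuAc": 291.0954165,
}

# One "Name(count)" unit, for example "HexNAc(2)".
UNIT = re.compile(r"([A-Za-z0-9]+)\((\d+)\)")


def glycan_composition_mass(composition):
    """Sum residue masses for a "Name(count)..." string, or return None."""
    total = 0.0
    consumed = 0
    for match in UNIT.finditer(composition):
        name, count = match.group(1), int(match.group(2))
        if name not in MONOSACCHARIDE_MASS:
            return None  # unknown residue: treat the composition as unusable
        total += MONOSACCHARIDE_MASS[name] * count
        consumed += len(match.group(0))
    # Reject empty input and strings with characters outside the units.
    if consumed == 0 or consumed != len(composition):
        return None
    return total


mass = glycan_composition_mass("HexNAc(2)Hex(5)")
```

Because the sum uses residue masses with no terminal water added, the result is a dehydrated mass in the same sense as the description above.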
-
get_mass_from_term
(term, raw_mass)[source]¶ Walk up the term hierarchy and find the mass group term near the root of the tree, and return the most accurate mass available for the provided term.
The mass group term’s mass is rounded to two decimal places, leading to relatively large errors.
Parameters: term (psims.controlled_vocabulary.Entity) – The CV entity being parsed.
Returns: mass – If a root node is found along the term’s lineage, computed mass will be returned. Otherwise, None is returned. The mass may be
Return type: float or None
-
parse_identifier
(identifier)¶ Parse a string that is either a CV prefixed identifier or name.
Parameters: identifier (str) – The identifier string to parse, removing CV prefix as needed.
Returns:
- name (str, optional) – A textual identifier embedded in the qualified identifier, if any, otherwise None.
- id (int, optional) – An integer ID embedded in the qualified identifier, if any, otherwise None.
-