Pyteomics documentation v4.7.1

usi - Universal Spectrum Identifier (USI) parser and minimal PROXI client

Summary

A Universal Spectrum Identifier (USI) is a standardized way to reference a specific spectrum in a dataset, optionally with an attached interpretation. This module provides a USI type that represents these identifiers, parses them from strings with parse(), and reconstructs their string form.

One use-case for USI is to request spectrum information from a PROXI service host. PROXI services are available from several of the major national proteomics data hosts, including MassIVE, PeptideAtlas, PRIDE, and jPOST.

See also

LeDuc, Richard D., Eric W. Deutsch, Pierre-Alain Binz, Ryan T. Fellers, Anthony J. Cesnik, Joshua A. Klein, Tim Van Den Bossche, et al. “Proteomics Standards Initiative’s ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms.” ArXiv:2109.11352 [q-Bio], September 23, 2021. http://arxiv.org/abs/2109.11352.

Data access

USI for representing Universal Spectrum Identifiers. Call USI.parse() to parse a USI string.

proxi() to request a USI from a remote service. Provides access to the PeptideAtlas, MassIVE, PRIDE and jPOST hosts.

class pyteomics.usi.JPOSTBackend(**kwargs)[source]

Bases: _PROXIBackend

__init__(**kwargs)[source]
get(usi)

Retrieve a USI from the host PROXI service over the network.

Parameters:

usi (str or USI) – The universal spectrum identifier to retrieve.

Returns:

The spectrum as represented by the requested PROXI host.

Return type:

dict

class pyteomics.usi.MassIVEBackend(**kwargs)[source]

Bases: _PROXIBackend

__init__(**kwargs)[source]
get(usi)

Retrieve a USI from the host PROXI service over the network.

Parameters:

usi (str or USI) – The universal spectrum identifier to retrieve.

Returns:

The spectrum as represented by the requested PROXI host.

Return type:

dict

class pyteomics.usi.PRIDEBackend(**kwargs)[source]

Bases: _PROXIBackend

__init__(**kwargs)[source]
get(usi)

Retrieve a USI from the host PROXI service over the network.

Parameters:

usi (str or USI) – The universal spectrum identifier to retrieve.

Returns:

The spectrum as represented by the requested PROXI host.

Return type:

dict

class pyteomics.usi.PROXIAggregator(backends=None, n_threads=None, timeout=15, merge=True, ephemeral_pool=True, **kwargs)[source]

Bases: object

Aggregate requests across multiple PROXI servers.

Will attempt to coalesce responses from responding servers into a single spectrum representation.

backends

The backend servers to query. Defaults to the set of all available backends.

Type:

dict mapping str to _PROXIBackend

n_threads

The number of threads to run concurrently while making requests. Defaults to the number of servers to query.

Type:

int

timeout

The number of seconds to wait for a response.

Type:

float

ephemeral_pool

Whether or not to tear down the thread pool between requests.

Type:

bool

__init__(backends=None, n_threads=None, timeout=15, merge=True, ephemeral_pool=True, **kwargs)[source]
coalesce(responses, method='first')[source]

Merge responses from disparate servers into a single spectrum representation.

The merging process will use the first of every array encountered, and all unique attributes.

Parameters:
  • responses (list) – A list of response pairs, each a _PROXIBackend together with either a dict or an Exception.

  • method (str) – The name of the coalescence technique to use. Currently only “first” is supported.

Returns:

result – The coalesced spectrum

Return type:

dict
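The "first" strategy described for coalesce() can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the function name coalesce_first, the array key names, the shape of the "attributes" entries, and the use of plain host names in place of _PROXIBackend objects are all assumptions made for illustration.

```python
def coalesce_first(responses, array_keys=("m/z array", "intensity array")):
    """Merge (host_name, response) pairs, keeping the first of each array.

    Assumed shapes (not taken from the library): each response is either an
    Exception or a dict with optional array entries and an "attributes" list
    of {"name": ..., "value": ...} dicts.
    """
    merged = {"attributes": []}
    seen = set()
    for name, response in responses:
        if isinstance(response, Exception):
            continue  # a failed server contributes nothing
        for key in array_keys:
            # Keep only the first copy of each array encountered.
            if key in response and key not in merged:
                merged[key] = response[key]
        for attr in response.get("attributes", []):
            ident = (attr["name"], str(attr["value"]))
            if ident not in seen:  # keep each distinct attribute once
                seen.add(ident)
                merged["attributes"].append(attr)
    return merged


# Made-up responses from three hypothetical hosts; the first one failed.
responses = [
    ("host_a", TimeoutError("no reply")),
    ("host_b", {"m/z array": [100.0, 200.0],
                "attributes": [{"name": "charge state", "value": 2}]}),
    ("host_c", {"m/z array": [100.1, 200.1],
                "intensity array": [5.0, 9.0],
                "attributes": [{"name": "charge state", "value": 2},
                               {"name": "scan number", "value": 17555}]}),
]
result = coalesce_first(responses)
```

In this sketch the m/z array comes from host_b (the first host to supply one), the intensity array from host_c, and the duplicated charge state attribute appears once.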

get(usi)[source]

Retrieve a USI from each PROXI service over the network.

Parameters:

usi (str or USI) – The universal spectrum identifier to retrieve.

Returns:

result – The spectrum coalesced from all responding PROXI hosts if merge is True; otherwise a list of responses, each tagged with its source host.

Return type:

dict or list[dict]
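The fan-out pattern behind an aggregated request can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PROXIAggregator code: query_all and the two stub hosts are hypothetical names, the stubs make no network calls, and concurrent.futures is a choice made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def query_all(backends, usi, timeout=15, n_threads=None):
    """Call every backend with the same USI and collect (name, outcome) pairs.

    `backends` maps a name to any callable taking a USI string. An outcome is
    the returned dict, or the exception raised (or the timeout hit) while
    waiting for that backend.
    """
    n_threads = n_threads or len(backends)
    results = []
    # Creating the pool inside the call and discarding it afterwards is
    # comparable in spirit to an ephemeral pool.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = {name: pool.submit(fn, usi) for name, fn in backends.items()}
        for name, future in futures.items():
            try:
                results.append((name, future.result(timeout=timeout)))
            except Exception as err:  # one failing host must not stop the rest
                results.append((name, err))
    return results


def good_host(usi):
    # Stub standing in for a responsive server.
    return {"usi": usi, "m/z array": [100.0]}


def bad_host(usi):
    # Stub standing in for an unreachable server.
    raise ConnectionError("host unreachable")


outcomes = query_all({"good": good_host, "bad": bad_host},
                     "mzspec:PXD000561:run01:scan:17555", timeout=5)
```

Each outcome keeps its host name, so a later step can either merge the successful dicts or return them individually. Note that in this sketch the timeout applies to each wait in turn, and leaving the with-block still waits for any call that is still running.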

tag_with_source(responses)[source]

Mark each response with its source.

Parameters:

responses (list) – A list of response pairs, each a _PROXIBackend together with either a dict or an Exception.

Returns:

result – The tagged dict for each response.

Return type:

list[dict]
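One way to picture the tagging step is the sketch below. It is a guess at the general shape rather than the library's code: the "source" and "error" key names are assumptions, and plain strings stand in for backend objects.

```python
def tag_with_source(responses):
    """Attach the originating host to each (source, outcome) pair."""
    tagged = []
    for source, outcome in responses:
        if isinstance(outcome, dict):
            entry = dict(outcome)        # copy so the input is not modified
            entry["source"] = source
        else:
            # Represent a failure as a dict too, so callers see one shape.
            entry = {"source": source, "error": outcome}
        tagged.append(entry)
    return tagged


# Made-up input: one successful response and one failure.
pairs = [
    ("host_a", {"m/z array": [100.0, 200.0]}),
    ("host_b", ValueError("malformed reply")),
]
tagged = tag_with_source(pairs)
```

Every element of the result is a dict, whether the host answered or failed, which keeps downstream handling uniform.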

class pyteomics.usi.PeptideAtlasBackend(**kwargs)[source]

Bases: _PROXIBackend

__init__(**kwargs)[source]
get(usi)

Retrieve a USI from the host PROXI service over the network.

Parameters:

usi (str or USI) – The universal spectrum identifier to retrieve.

Returns:

The spectrum as represented by the requested PROXI host.

Return type:

dict

class pyteomics.usi.ProteomeExchangeBackend(**kwargs)[source]

Bases: _PROXIBackend

__init__(**kwargs)[source]
get(usi)

Retrieve a USI from the host PROXI service over the network.

Parameters:

usi (str or USI) – The universal spectrum identifier to retrieve.

Returns:

The spectrum as represented by the requested PROXI host.

Return type:

dict

class pyteomics.usi.USI(protocol, dataset, datafile, scan_identifier_type, scan_identifier, interpretation)[source]

Bases: USI

Represent a Universal Spectrum Identifier (USI).

Note

This implementation will capture the interpretation component but will not interpret it at this time.

protocol

The protocol to use to access the data (usually mzspec)

Type:

str

dataset

The name or accession number of the dataset the spectrum resides in

Type:

str

datafile

The basename of the data file from dataset to retrieve the spectrum from

Type:

str

scan_identifier_type

The format of the scan identifier, one of (scan, index, nativeId, trace)

Type:

str

scan_identifier

A value, usually numeric but potentially comma-separated, encoded as a string, that uniquely identifies the spectrum to retrieve from datafile in dataset.

Type:

str

interpretation

The trailing material of the USI, such as the ProForma peptide sequence and charge

Type:

str

__init__()
count(value, /)

Return number of occurrences of value.

datafile

Alias for field number 2

dataset

Alias for field number 1

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

interpretation

Alias for field number 5

classmethod parse(usi)[source]

Parse a USI string into a USI object.

Parameters:

usi (str) – The USI string to parse

Return type:

USI
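To illustrate what parsing involves, the sketch below splits a USI string on colons into the six documented fields and joins them back together. It is a simplified stand-in that uses only the standard library, not the pyteomics implementation: the class name SketchUSI and the exact splitting rules are assumptions, and the example string is only illustrative.

```python
from collections import namedtuple

_FIELDS = ["protocol", "dataset", "datafile",
           "scan_identifier_type", "scan_identifier", "interpretation"]


class SketchUSI(namedtuple("SketchUSI", _FIELDS)):
    """Hypothetical stand-in for a USI record (not pyteomics.usi.USI)."""

    @classmethod
    def parse(cls, text):
        # The first five fields are colon-delimited; whatever follows the
        # fifth colon is kept verbatim as the interpretation, because the
        # interpretation may itself contain colons.
        parts = str(text).split(":", 5)
        if len(parts) < 3:
            raise ValueError("expected at least protocol:dataset:datafile")
        # Pad missing trailing fields with None.
        parts += [None] * (len(_FIELDS) - len(parts))
        return cls(*parts)

    def __str__(self):
        # Rebuild the string, skipping fields that were absent.
        return ":".join(part for part in self if part is not None)


text = ("mzspec:PXD000561:Adult_Frontalcortex_bRP_Elite_85_f09"
        ":scan:17555:VLHPLEGAVVIIFK/2")
usi = SketchUSI.parse(text)
print(usi.dataset, usi.scan_identifier, usi.interpretation)
```

With the library itself, the corresponding entry point is USI.parse(), which accepts the same kind of string. As in the documented type, the scan identifier stays a string and the interpretation is captured without being interpreted.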

protocol

Alias for field number 0

scan_identifier

Alias for field number 4

scan_identifier_type

Alias for field number 3
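The "Alias for field number N" entries, together with count() and index(), are typical of a named tuple: each named attribute maps to a fixed tuple position. The sketch below shows that behaviour on a plain collections.namedtuple called Record, a hypothetical stand-in with the same six field names and made-up values.

```python
from collections import namedtuple

# Hypothetical record with the same six field names, for illustration only.
Record = namedtuple("Record", ["protocol", "dataset", "datafile",
                               "scan_identifier_type", "scan_identifier",
                               "interpretation"])

rec = Record("mzspec", "PXD000561", "run01", "scan", "17555", None)

# Each named attribute is an alias for a tuple position.
same_value = rec.dataset == rec[1]

# Tuple methods operate on the field values.
position = rec.index("scan")   # position of the first field equal to "scan"
missing = rec.count(None)      # number of fields that are unset

# Tuple unpacking follows field order.
protocol, dataset, datafile, id_type, scan_id, interp = rec
```

Because the fields are positional, a record of this kind can be unpacked, sliced, or compared like any other tuple.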

pyteomics.usi.proxi(usi, backend='peptide_atlas', **kwargs)[source]

Retrieve a USI from a PROXI service <http://www.psidev.info/proxi>.

Parameters:
  • usi (str or USI) – The universal spectrum identifier to request.

  • backend (str or Callable) – Either the name of a PROXI host (peptide_atlas, massive, pride, jpost, or aggregator), or a callable object (which _PROXIBackend instances are) which will be used to resolve the USI. The “aggregator” backend will use a PROXIAggregator instance which will request the same USI from all the registered servers and attempt to merge their responses into a single whole. See PROXIAggregator.coalesce() for more details on the merging process.

  • **kwargs – extra arguments passed when constructing the backend by name.

Returns:

The spectrum as represented by the requested PROXI host.

Return type:

dict
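The backend parameter accepts either a name or a callable. A minimal sketch of that kind of dispatch follows. It is not the body of proxi(): the names resolve, REGISTRY, and make_stub_backend are hypothetical, and the stub backends return canned dicts instead of contacting a server.

```python
def make_stub_backend(label):
    """Build a fake backend: a callable that returns a dict, with no network."""
    def backend(usi):
        return {"usi": str(usi), "served_by": label}
    return backend


# Hypothetical registry mapping a backend name to a factory for it.
REGISTRY = {
    "peptide_atlas": lambda **kwargs: make_stub_backend("peptide_atlas"),
    "massive": lambda **kwargs: make_stub_backend("massive"),
}


def resolve(usi, backend="peptide_atlas", **kwargs):
    """Resolve a USI through a named backend or a user-supplied callable."""
    if callable(backend):
        handler = backend                      # use the callable as given
    elif backend in REGISTRY:
        handler = REGISTRY[backend](**kwargs)  # construct the backend by name
    else:
        raise ValueError("Unknown backend: %r" % (backend,))
    return handler(usi)


usi_text = "mzspec:PXD000561:run01:scan:17555"
by_name = resolve(usi_text, backend="massive")
by_callable = resolve(usi_text, backend=make_stub_backend("custom"))
```

Accepting a callable lets a caller supply a preconfigured backend instance, or a test double, without registering it under a name.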
