peff - PSI Extended FASTA Format
PEFF is a forthcoming standard from PSI-HUPO that formalizes and extends the encoding of protein features and annotations for building proteomics search spaces. See the PEFF specification for more up-to-date information on the standard.
Data manipulation
Classes
The PEFF parser inherits several properties from the implementation in the fasta module, building on top of the TwoLayerIndexedFASTA reader.
Available classes:
IndexedPEFF - Parse a PEFF format file in binary mode, supporting direct indexing by header string or by tag.
- class pyteomics.peff.Header(mapping, original=None)
Bases: Mapping
Holds parsed key-value properties, such as those of a sequence’s definition line.
This object supports the Mapping interface, and keys may also be accessed using attribute notation.
- get(k[, d])
Returns D[k] if k is in D, else d. d defaults to None.
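The combination of the Mapping interface and attribute access described above can be sketched as follows. This is a minimal illustration, not the pyteomics implementation: the class name `AttrMapping` and the sample keys are hypothetical, and only the constructor arguments (`mapping`, `original`) mirror the documented signature.

```python
from collections.abc import Mapping


class AttrMapping(Mapping):
    """Hypothetical read-only mapping whose keys are also readable as attributes."""

    def __init__(self, mapping, original=None):
        self._mapping = dict(mapping)
        self.original = original  # the unparsed source line, if available

    def __getitem__(self, key):
        return self._mapping[key]

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        # Private names are never looked up in the mapping.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._mapping[name]
        except KeyError:
            raise AttributeError(name) from None


# Sample keys are made up for illustration.
header = AttrMapping(
    {"DbUniqueId": "P12345", "Length": 120},
    original=">sp:P12345 \\Length=120",
)
print(header["Length"], header.Length, header.get("Missing"))
```

Because `Mapping` supplies `get`, `keys`, `items`, and `in` once `__getitem__`, `__iter__`, and `__len__` are defined, the sketch only has to add `__getattr__` on top.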
- class pyteomics.peff.IndexedPEFF(source, ignore_comments=False, **kwargs)
Bases: TwoLayerIndexedFASTA
Creates an IndexedPEFF object.
- __init__(source, ignore_comments=False, **kwargs)
Open source and create a two-layer index for convenient random access both by full header strings and extracted fields.
- Parameters:
source (str or file-like) – File to read. If a file object, it must be opened in binary mode.
header_pattern (str or RE or None, optional) – Pattern to match the header string. Must capture the group used for the second index. If None (default), the second-level index is not created.
header_group (int or str or None, optional) – Defines which group is used as the key in the second-level index. Default is 1.
ignore_comments (bool, optional) – If True, ignore the second and subsequent lines of the description. Default is False, which concatenates multi-line descriptions into a single string.
parser (function or None, optional) – Defines whether the FASTA descriptions should be parsed. If it is a function, that function will be given the description string, and the returned value will be yielded together with the sequence. The std_parsers dict has parsers for several formats. Hint: specify parse() as the parser to apply automatic format recognition. Default is None, which means return the header “as is”.
Other arguments
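To illustrate the kind of function the parser parameter describes (it receives the description string and returns a parsed value), here is a minimal sketch. The function name `parse_defline`, the output keys `Prefix` and `Identifier`, and the sample description are hypothetical. It assumes a PEFF-style description made of a `prefix:identifier` token followed by `\Key=Value` pairs whose values contain no backslash. It is not the parser pyteomics uses.

```python
import re


def parse_defline(description):
    """Split a PEFF-style description into a dict of string fields (sketch)."""
    # First whitespace-delimited token is assumed to be "prefix:identifier".
    first_token, _, rest = description.partition(" ")
    prefix, _, identifier = first_token.partition(":")
    fields = {"Prefix": prefix, "Identifier": identifier}
    # Each remaining field is assumed to look like "\Key=Value".
    for key, value in re.findall(r"\\(\w+)=([^\\]*)", rest):
        fields[key] = value.strip()
    return fields


parsed = parse_defline("nxp:NX_P02768-1 \\PName=Serum albumin \\Length=609")
print(parsed)
```

All values are left as strings; converting fields such as `Length` to integers would be a separate, format-aware step.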
- build_second_index()
Create the mapping from extracted field to whole header string.
- get_by_id(key)
Get the entry by value of header string or extracted field.
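The two-layer lookup described above can be sketched with the standard library alone. This is an illustration of the idea under stated assumptions, not the pyteomics implementation: the helper names `build_offset_index`, `build_field_index`, and `lookup` are hypothetical, as are the sample records and the header pattern. The first layer maps each full header string to a byte offset; the second maps an extracted field to the full header string.

```python
import io
import re


def build_offset_index(stream):
    """First layer: full header string (without '>') -> byte offset of its line."""
    index = {}
    offset = stream.tell()
    for line in iter(stream.readline, b""):
        if line.startswith(b">"):
            index[line[1:].rstrip().decode("utf-8")] = offset
        offset = stream.tell()
    return index


def build_field_index(headers, pattern, group=1):
    """Second layer: field captured by `pattern` -> full header string."""
    second = {}
    for header in headers:
        match = re.match(pattern, header)
        if match:
            second[match.group(group)] = header
    return second


def lookup(key, index, second):
    """Resolve `key` as an extracted field if known, else as a full header."""
    header = second.get(key, key)
    return header, index[header]


# Binary-mode source, as the reader requires; the records are made up.
data = io.BytesIO(b">nxp:NX_P1 \\Length=4\nMKVL\n>nxp:NX_P2 \\Length=3\nACD\n")
index = build_offset_index(data)
second = build_field_index(index, r"\w+:(\S+)")

header, offset = lookup("NX_P2", index, second)
data.seek(offset)
print(header, data.readline())
```

Keeping byte offsets rather than parsed entries is what makes random access cheap: a lookup is two dictionary hits and one `seek`.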
- map(target=None, processes=-1, args=None, kwargs=None, **_kwargs)
Execute the target function over entries of this object across up to processes processes.
Results will be returned out of order.
- Parameters:
target (Callable, optional) – The function to execute over each entry. It will be given a single object yielded by the wrapped iterator as well as all of the values in args and kwargs.
processes (int, optional) – The number of worker processes to use. If 0 or negative, defaults to the number of available CPUs. This parameter can also be set at reader creation.
args (Sequence, optional) – Additional positional arguments to be passed to the target function.
kwargs (Mapping, optional) – Additional keyword arguments to be passed to the target function.
**_kwargs – Additional keyword arguments to be passed to the target function.
- Yields:
object – The work item returned by the target function.
- reset()
Resets the iterator to its initial state.