peff - PSI Extended FASTA Format
PEFF is a forthcoming standard from PSI-HUPO that formalizes and extends the encoding of protein features and annotations used to build search spaces for proteomics. See the PEFF specification for more up-to-date information on the standard.
Data manipulation
Classes
The PEFF parser inherits several properties from the implementation in the fasta module, building on top of the TwoLayerIndexedFASTA reader.
Available classes:
- IndexedPEFF: Parse a PEFF format file in binary mode, supporting direct indexing by header string or by tag.
class pyteomics.peff.Header(mapping, original=None)
Bases: collections.abc.Mapping
Hold parsed properties of a key-value pair like a sequence’s definition line.
This object supports the Mapping interface, and keys may be accessed by attribute access notation.
- __init__(mapping, original=None)
Initialize self. See help(type(self)) for accurate signature.
- get(k[, d]) → D[k] if k in D, else d. d defaults to None.
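The combination of the Mapping interface and attribute access described above can be illustrated with a minimal sketch. The class below, AttrMapping, is a hypothetical stand-in written for this example and is not the pyteomics implementation of Header; the keys and values are made up.

```python
from collections.abc import Mapping


class AttrMapping(Mapping):
    """Hypothetical read-only mapping whose keys are also attributes."""

    def __init__(self, mapping, original=None):
        self._mapping = dict(mapping)
        # Assumption: `original` keeps the unparsed text the mapping came from.
        self.original = original

    def __getitem__(self, key):
        return self._mapping[key]

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._mapping[name]
        except KeyError:
            raise AttributeError(name) from None


# Illustrative keys only; real PEFF headers define their own keys.
header = AttrMapping({"Prefix": "sp", "Tag": "P12345", "Length": 120})

by_key = header["Tag"]          # Mapping-style access
by_attr = header.Tag            # attribute-style access
missing = header.get("PName")   # get() falls back to None
```

Because the class derives from collections.abc.Mapping, methods such as get(), keys() and items() come for free once __getitem__, __iter__ and __len__ are defined.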
class pyteomics.peff.IndexedPEFF(source, ignore_comments=False, **kwargs)
Bases: pyteomics.fasta.TwoLayerIndexedFASTA
Creates an IndexedPEFF object.
- __init__(source, ignore_comments=False, **kwargs)
Open source and create a two-layer index for convenient random access both by full header strings and extracted fields.
Parameters:
- source (str or file-like) – File to read. If file object, it must be opened in binary mode.
- header_pattern (str or RE or None, optional) – Pattern to match the header string. Must capture the group used for the second index. If None (default), second-level index is not created.
- header_group (int or str or None, optional) – Defines which group is used as key in the second-level index. Default is 1.
- ignore_comments (bool, optional) – If True then ignore the second and subsequent lines of description. Default is False, which concatenates multi-line descriptions into a single string.
- parser (function or None, optional) – Defines whether the FASTA descriptions should be parsed. If it is a function, that function will be given the description string, and the returned value will be yielded together with the sequence. The std_parsers dict has parsers for several formats. Hint: specify parse() as the parser to apply automatic format recognition. Default is None, which means return the header “as is”.
- arguments (Other) –
- build_second_index()
Create the mapping from extracted field to whole header string.
- get_by_id(key)
Get the entry by value of header string or extracted field.
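The two-layer lookup described above (full header string to entry, and extracted field to full header string) can be sketched with the standard library alone. This is an illustration of the idea under stated assumptions, not the pyteomics implementation: the data, the header layout, the regular expression and the names offsets, second_index and lookup are all made up for this example.

```python
import io
import re

# Made-up FASTA-like data opened as bytes, so tell()/seek() offsets are exact.
data = io.BytesIO(
    b">db:A1 first entry\nMKT\nAAL\n"
    b">db:B2 second entry\nGGS\n"
)
# Assumed pattern: group 1 captures the key used for the second-level index.
header_pattern = re.compile(r"^\w+:(\S+)")

# First layer: full header string -> byte offset of the entry.
offsets = {}
while True:
    pos = data.tell()
    line = data.readline()
    if not line:
        break
    if line.startswith(b">"):
        offsets[line[1:].decode("utf-8").rstrip()] = pos

# Second layer: extracted field -> full header string.
second_index = {}
for header in offsets:
    match = header_pattern.match(header)
    if match:
        second_index[match.group(1)] = header


def lookup(key):
    """Resolve by full header first, then by extracted field."""
    header = key if key in offsets else second_index[key]
    data.seek(offsets[header])
    data.readline()  # skip the header line itself
    sequence = []
    for line in iter(data.readline, b""):
        if line.startswith(b">"):
            break
        sequence.append(line.decode("utf-8").strip())
    return header, "".join(sequence)


entry_by_field = lookup("A1")
entry_by_header = lookup("db:B2 second entry")
```

Reading the source as bytes is what makes the stored offsets safe to pass back to seek(), which is one plausible reason for a binary-mode requirement like the one stated for source.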
- map(target=None, processes=-1, args=None, kwargs=None, **_kwargs)
Execute the target function over entries of this object across up to processes processes.
Results will be returned out of order.
Parameters:
- target (Callable, optional) – The function to execute over each entry. It will be given a single object yielded by the wrapped iterator as well as all of the values in args and kwargs
- processes (int, optional) – The number of worker processes to use. If 0 or negative, defaults to the number of available CPUs. This parameter can also be set at reader creation.
- args (Sequence, optional) – Additional positional arguments to be passed to the target function
- kwargs (Mapping, optional) – Additional keyword arguments to be passed to the target function
- **_kwargs – Additional keyword arguments to be passed to the target function
Yields: object – The work item returned by the target function.
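Since results come back out of order, a practical pattern is to have the target function return an identifying key together with its result. The sketch below shows that pattern with a thread pool from the standard library as a stand-in for worker processes, to keep the example self-contained; unordered_map, target and the entries are hypothetical and do not reproduce how pyteomics dispatches work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Made-up (key, sequence) pairs standing in for entries yielded by a reader.
entries = [("A1", "MKTAAL"), ("B2", "GGS"), ("C3", "PEPTIDE")]


def target(entry, scale=1):
    """Return the key with the result so arrival order does not matter."""
    key, sequence = entry
    return key, len(sequence) * scale


def unordered_map(func, items, workers=2, args=None, kwargs=None):
    """Yield func(item, *args, **kwargs) as each call finishes."""
    args = args or ()
    kwargs = kwargs or {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(func, item, *args, **kwargs) for item in items]
        for future in as_completed(futures):
            yield future.result()


# Collecting into a dict restores the association regardless of order.
results = dict(unordered_map(target, entries, kwargs={"scale": 2}))
```

Keying the results this way means the caller never has to rely on the order in which work items are yielded.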
- reset()
Resets the iterator to its initial state.