Welcome to the Pyteomics tutorial!
What is Pyteomics?
Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis, such as:
- calculation of basic physico-chemical properties of polypeptides:
  - mass and isotopic distribution
  - charge and pI
  - chromatographic retention time
- access to common proteomics data:
  - MS or LC-MS data
  - FASTA databases
  - search engine output
- easy manipulation of sequences of modified peptides and proteins
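To make the first task in the list concrete, here is a minimal sketch of what a monoisotopic peptide mass calculation amounts to. It deliberately does not call Pyteomics: the table `RESIDUE_MASS`, the constants, and the function `monoisotopic_mass` are hypothetical names introduced for illustration only, and the sketch covers unmodified peptides built from the 20 standard amino acids. In Pyteomics itself this kind of calculation belongs to the mass module, which also handles modifications and isotope distributions.

```python
# Illustrative sketch only; this is NOT the Pyteomics implementation.
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # H2O added for the free N- and C-termini
PROTON_MASS = 1.007276   # charge carrier assumed to be a proton


def monoisotopic_mass(sequence, charge=0):
    """Return the neutral monoisotopic mass of an unmodified peptide,
    or its m/z when a positive charge state is given."""
    try:
        neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    if charge == 0:
        return neutral
    # Each charge adds one proton; m/z divides the total by the charge.
    return (neutral + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    print(f"{monoisotopic_mass('PEPTIDE'):.2f}")            # 799.36
    print(f"{monoisotopic_mass('PEPTIDE', charge=2):.2f}")  # 400.69
```

The sketch sums residue masses and adds one water molecule for the termini. Any real analysis should rely on the library's own mass module rather than a hand-written table like this one.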
The goal of the Pyteomics project is to provide a versatile, reliable and well-documented set of open tools for the broader proteomics community. One of the project’s key features is Python itself, an open-source language that is increasingly popular in scientific programming. The main applications of the library are reproducible statistical data analysis and rapid software prototyping.
Citation
Pyteomics is distributed under the Apache License, version 2.0.
When using or redistributing Pyteomics, or parts of it, please cite the following papers:
Goloborodko, A.A.; Levitsky, L.I.; Ivanov, M.V.; and Gorshkov, M.V. (2013) “Pyteomics - a Python Framework for Exploratory Data Analysis and Rapid Software Prototyping in Proteomics”, Journal of The American Society for Mass Spectrometry, 24(2), 301–304. DOI: 10.1007/s13361-012-0516-6
Levitsky, L.I.; Klein, J.; Ivanov, M.V.; and Gorshkov, M.V. (2018) “Pyteomics 4.0: five years of development of a Python proteomics framework”, Journal of Proteome Research. DOI: 10.1021/acs.jproteome.8b00717
Useful Links
Pyteomics is hosted at the following sites:
- Python package @ Python Package Index: https://pypi.org/project/pyteomics/
- project documentation @ Read the Docs: https://pyteomics.readthedocs.io/
- source code @ GitHub: https://github.com/levitsky/pyteomics
- mailing list @ Google: https://groups.google.com/group/pyteomics/
Backup of old repo
Pyteomics source code used to be hosted on Bitbucket. An archive of issues and pull requests is stored at: https://levitsky.github.io/bitbucket_backup/#!/levitsky/pyteomics.
Pyteomics Extensions
Additional third-party packages that extend Pyteomics functionality can be installed separately:
- pyteomics.pepxmltk (pepXML file creation)
- pyteomics.biolccc (retention time prediction)
- pyteomics.cythonize (cythonized versions of the mass and parser modules)
Feedback & Support
Your questions and suggestions are welcome at:
- the pyteomics@googlegroups.com mailing list;
- (new!) the Pyteomics Discussions page on GitHub;
- the GitHub issue tracker (for bugs, feature requests, etc.).
Relation to other proteomics data analysis tools
Our goal is to create an infrastructure for proteomics data analysis within the Python ecosystem. Pyteomics is not a proteomic search engine, nor does it perform any data conversion. There are other tools for that. Pyteomics does not aim to replace any of them, but rather to coexist with and complement them.
Contents:
- Introduction
- How to install Pyteomics
- Peptide sequence formats. Parser module
- Peptide properties: mass, charge, chromatographic retention
- Data Access
- Pyteomics API documentation
  - parser - operations on modX peptide sequences
  - mass - molecular masses and isotope distributions
  - unimod - interface to the Unimod database
  - achrom - additive model of polypeptide chromatography
  - electrochem - electrochemical properties of polypeptides
  - fasta - manipulations with FASTA databases
  - peff - PSI Extended FASTA Format
  - mzml - reader for mass spectrometry data in mzML format
  - mzxml - reader for mass spectrometry data in mzXML format
  - mzmlb - reader for mass spectrometry data in mzMLb format
  - mgf - read and write MS/MS data in Mascot Generic Format
  - ms1 - read and write MS/MS data in MS1 format
  - ms2 - read and write MS/MS data in MS2 format
  - pepxml - pepXML file reader
  - protxml - parsing of ProteinProphet output files
  - tandem - X!Tandem output file reader
  - mzid - mzIdentML file reader
  - mztab - mzTab file reader
  - usi - Universal Spectrum Identifier (USI) parser and minimal PROXI client
  - proforma - Proteoform and Peptidoform Notation
  - featurexml - reader for featureXML files
  - trafoxml - reader for trafoXML files
  - idxml - idXML file reader
  - traml - targeted MS transition data in TraML format
  - pylab_aux - auxiliary functions for plotting with pylab
  - xml - utilities for XML parsing
  - auxiliary - common functions and objects
  - version - Pyteomics version information
- Combined examples
- History of changes
  - 4.7.4
  - 4.7.3
  - 4.7.2
  - 4.7.1
  - 4.7
  - 4.6.3
  - 4.6.2
  - 4.6.1
  - 4.6
  - 4.5.6
  - 4.5.5
  - 4.5.4
  - 4.5.3
  - 4.5.2
  - 4.5.1
  - 4.5
  - 4.4.2
  - 4.4.1
  - 4.4
  - 4.3.3
  - 4.3.2
  - 4.3.1
  - 4.3
  - 4.2
  - 4.1.2
  - 4.1.1
  - 4.1
  - 4.0.1
  - 4.0
  - 3.5.1
  - 3.5
  - 3.4.2
  - 3.4.1
  - 3.4
  - 3.3.1
  - 3.3
  - 3.2
  - 3.1.1
  - 3.1
  - 3.0.1
  - 3.0.0
  - 2.5.5
  - 2.5.4
  - 2.5.3
  - 2.5.2
  - 2.5.1
  - 2.5.0
  - 2.4.3
  - 2.4.2
  - 2.4.1
  - 2.4.0
  - 2.3.0
  - 2.2.2
  - 2.2.1
  - 2.2.0
  - 2.1.6
  - 2.1.5
  - 2.1.4
  - 2.1.3
  - 2.1.2
  - 2.1.1
  - 2.1.0
  - 2.0.3
  - 2.0.2
  - 2.0.1
  - 2.0.0
  - 1.2.5
  - 1.2.4
  - 1.2.3
  - 1.2.2
  - 1.2.1
  - 1.2.0
  - 1.1.1
  - 1.1.0
  - 1.0.2
  - 1.0.1
  - 1.0.0