ms2rescore.rescoring_engines
Rescoring engines integrated in MS²Rescore.
Each integrated rescoring engine typically includes a rescore() function that takes a PSMList as input and writes the new scores, q-values, and PEPs to the original PSMList.
Mokapot
Mokapot integration for MS²Rescore.
mokapot is a full-Python implementation of the semi-supervised learning algorithms
introduced with Percolator. It builds upon the flexible scikit-learn package, which makes it
highly efficient for routine applications, but also customizable for experimental research
settings. Using Mokapot through MS²Rescore brings several advantages over Percolator: It can be
easily installed in the same Python environment, and it is generally faster as the communication
between the tools happens completely within Python, without the need to write and read files
or communicate through the command line. See
mokapot.readthedocs.io for more information.
If you use Mokapot through MS²Rescore, please cite:
Fondrie W. E. & Noble W. S. mokapot: Fast and Flexible Semisupervised Learning for Peptide Detection. J Proteome Res (2021). doi:10.1021/acs.jproteome.0c01010
- ms2rescore.rescoring_engines.mokapot.rescore(psm_list, output_file_root='ms2rescore', fasta_file=None, train_fdr=0.01, write_weights=False, write_txt=False, write_flashlfq=False, protein_kwargs=None, **kwargs)
Rescore PSMs with Mokapot.
The function provides a high-level interface to use Mokapot within MS²Rescore. It first converts the PSMList to a LinearPsmDataset, and then optionally adds protein information from a FASTA file. The dataset is then passed to the brew() function, which returns the new scores, q-values, and PEPs. These are then written back to the original PSMList. Optionally, results can be written to a Mokapot text file, a FlashLFQ-compatible file, or the model weights can be saved.
- Parameters:
psm_list (PSMList) – PSMs to be rescored.
output_file_root (str) – Root of output file names. Defaults to "ms2rescore".
fasta_file (str | None) – Path to FASTA file with protein sequences to use for protein inference. Defaults to None.
train_fdr (float) – FDR to use for training the Mokapot model. Defaults to 0.01.
write_weights (bool) – Write model weights to a text file. Defaults to False.
write_txt (bool) – Write Mokapot results to a text file. Defaults to False.
write_flashlfq (bool) – Write Mokapot results to a FlashLFQ-compatible file. Defaults to False.
protein_kwargs (Dict[str, Any] | None) – Keyword arguments to pass to the add_proteins() method.
**kwargs (Any) – Additional keyword arguments are passed to the Mokapot brew() function.
- Return type:
None
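The call described above can be sketched as follows. This is a minimal illustration, not the canonical workflow: it assumes that psm_list is a PSMList that already carries rescoring features, and that each PSM exposes is_decoy and qvalue attributes. The helper names rescore_with_mokapot and count_confident_targets are hypothetical, and the demo objects are stand-ins rather than real PSMs.

```python
from types import SimpleNamespace


def rescore_with_mokapot(psm_list, output_root="ms2rescore", fasta_file=None):
    """Rescore ``psm_list`` in place with Mokapot through MS²Rescore."""
    # Imported lazily so this sketch can be loaded without MS²Rescore installed.
    from ms2rescore.rescoring_engines import mokapot

    mokapot.rescore(
        psm_list,
        output_file_root=output_root,
        fasta_file=fasta_file,  # optional: FASTA file for protein information
        train_fdr=0.01,
        write_weights=True,  # also save the model weights
        write_txt=True,  # also write the Mokapot text output
    )
    # rescore() returns None; the new values are stored on the PSMs themselves.
    return psm_list


def count_confident_targets(psms, max_qvalue=0.01):
    """Count target PSMs whose q-value is at or below ``max_qvalue``."""
    return sum(
        1
        for psm in psms
        if not psm.is_decoy and psm.qvalue is not None and psm.qvalue <= max_qvalue
    )


# Stand-in objects with only the two attributes the helper reads (made-up values).
demo = [
    SimpleNamespace(is_decoy=False, qvalue=0.001),
    SimpleNamespace(is_decoy=False, qvalue=0.2),
    SimpleNamespace(is_decoy=True, qvalue=0.005),
]
print(count_confident_targets(demo))
```

Because rescore() modifies the PSMList in place and returns None, any filtering or counting is done on the same object that was passed in.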
- ms2rescore.rescoring_engines.mokapot.convert_psm_list(psm_list, feature_names=None)
Convert a PSM list to a Mokapot dataset.
- Parameters:
- Return type:
- ms2rescore.rescoring_engines.mokapot.save_model_weights(models, feature_names, output_file_root)
Save model weights to a file.
Percolator
Percolator integration for MS²Rescore.
Percolator was the first tool to introduce semi-supervised learning for PSM rescoring. It is
still widely used and has been integrated in many proteomics data analysis pipelines. This module
integrates with Percolator through its command line interface. Percolator must be installed
separately and the percolator
command must be available in the PATH for this module to work.
See github.com/percolator/percolator for
more information.
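Since this module depends on an external executable, it can help to check for it before starting a long run. The sketch below uses only the Python standard library. The function names find_percolator and require_percolator are hypothetical and are not part of MS²Rescore.

```python
import shutil


def find_percolator(command="percolator"):
    """Return the full path of the executable, or None if it is not in the PATH."""
    return shutil.which(command)


def require_percolator(command="percolator"):
    """Return the executable path, or raise a clear error if it cannot be found."""
    path = find_percolator(command)
    if path is None:
        raise RuntimeError(
            f"Could not find '{command}' in the PATH. "
            "Install Percolator separately before using this rescoring engine."
        )
    return path


if __name__ == "__main__":
    location = find_percolator()
    print(location if location else "percolator not found in the PATH")
```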
If you use Percolator through MS²Rescore, please cite:
The M, MacCoss MJ, Noble WS, Käll L. Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0. J Am Soc Mass Spectrom (2016). doi:10.1007/s13361-016-1460-7
- ms2rescore.rescoring_engines.percolator.rescore(psm_list, output_file_root='ms2rescore', log_level='info', processes=1, fasta_file=None, percolator_kwargs=None)
Rescore PSMs with Percolator.
Aside from updating the PSM score, qvalue, and pep values, the following output files are written:
Target PSMs: {output_file_root}.percolator.psms.pout
Target peptides: {output_file_root}.percolator.peptides.pout
Target proteins: {output_file_root}.percolator.proteins.pout
Decoy PSMs: {output_file_root}.percolator.decoy.psms.pout
Decoy peptides: {output_file_root}.percolator.decoy.peptides.pout
Decoy proteins: {output_file_root}.percolator.decoy.proteins.pout
Feature weights: {output_file_root}.percolator.weights.tsv
Percolator is run through its command line interface. Percolator must be installed separately and the percolator command must be available in the PATH for this module to work.
- Parameters:
psm_list (PSMList) – PSMs to be rescored.
output_file_root (str) – Root of output file names. Defaults to "ms2rescore".
log_level (str) – Log level for Percolator. Defaults to "info".
processes (int) – Number of processes to use. Defaults to 1.
fasta_file (str | None) – Path to FASTA file for protein inference. Defaults to None.
percolator_kwargs (Dict[str, Any] | None) – Additional keyword arguments for Percolator. Defaults to None.
- Return type:
None
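The output file naming listed for this function can be expressed as a small helper, for example to check afterwards which files exist. This is a sketch built only from the file name patterns given above. The names PERCOLATOR_OUTPUT_SUFFIXES, expected_percolator_outputs, and missing_percolator_outputs are hypothetical, and whether the protein-level files are produced without a FASTA file is not stated here, so missing entries are reported rather than treated as errors.

```python
from pathlib import Path

# Suffixes taken from the output file list in the rescore() description.
PERCOLATOR_OUTPUT_SUFFIXES = {
    "target_psms": "percolator.psms.pout",
    "target_peptides": "percolator.peptides.pout",
    "target_proteins": "percolator.proteins.pout",
    "decoy_psms": "percolator.decoy.psms.pout",
    "decoy_peptides": "percolator.decoy.peptides.pout",
    "decoy_proteins": "percolator.decoy.proteins.pout",
    "feature_weights": "percolator.weights.tsv",
}


def expected_percolator_outputs(output_file_root="ms2rescore"):
    """Map each output type to the path expected for ``output_file_root``."""
    return {
        label: Path(f"{output_file_root}.{suffix}")
        for label, suffix in PERCOLATOR_OUTPUT_SUFFIXES.items()
    }


def missing_percolator_outputs(output_file_root="ms2rescore"):
    """List the output types whose expected file does not exist on disk."""
    outputs = expected_percolator_outputs(output_file_root)
    return [label for label, path in outputs.items() if not path.exists()]


if __name__ == "__main__":
    for label, path in expected_percolator_outputs("run1").items():
        print(f"{label}: {path}")
```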