Functionality for analyzing and reporting MS²Rescore results, including reusable Plotly-based charts and HTML-report generation.

Generate report

Generate an HTML report with various QC charts of MS²Rescore results.

Signature: (output_path_prefix, psm_list=None, feature_names=None, use_txt_log=False)

Generate the report.

  • output_path_prefix (str) – Prefix of the MS²Rescore output file names. For example, if the output PSM file is /path/to/file.ms2rescore.psms.tsv, the prefix is /path/to/file.ms2rescore.

  • psm_list (PSMList | None) – PSMs to be used for the report. If not provided, the PSMs will be read from the PSM file that matches the output_path_prefix.

  • feature_names (Dict[str, list] | None) – Feature names to be used for the report. If not provided, the feature names will be read from the feature names file that matches the output_path_prefix.

  • use_txt_log (bool) – If True, the log file will be read from output_path_prefix + ".log.txt" instead of output_path_prefix + ".log.html".
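As a rough illustration of how the prefix relates to the files mentioned in the parameter descriptions, the sketch below derives the PSM and log file paths from output_path_prefix. The helper name expected_report_inputs is hypothetical and not part of MS²Rescore; only the .psms.tsv, .log.txt, and .log.html suffixes are taken from the descriptions above.

```python
def expected_report_inputs(output_path_prefix, use_txt_log=False):
    """Hypothetical helper: paths the report would be expected to read."""
    # The log suffix depends on use_txt_log, as described for that parameter.
    log_suffix = ".log.txt" if use_txt_log else ".log.html"
    return {
        "psms": output_path_prefix + ".psms.tsv",
        "log": output_path_prefix + log_suffix,
    }


paths = expected_report_inputs("/path/to/file.ms2rescore", use_txt_log=True)
print(paths["psms"])
print(paths["log"])
```

The feature names file is left out because its suffix is not stated here.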


Collection of Plotly-based charts for reporting results of MS²Rescore.

Plot histogram of scores for a single PSM dataset.


psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().

Return type:

Figure

Generate PP plot of target and decoy score distributions.


psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().

Return type:

Figure

Keyword arguments: fdr_thresholds=None, log=True

Plot number of identifications as a function of FDR threshold.
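The quantity on such a chart can be sketched in plain Python: for each FDR threshold, count the identifications whose q-value does not exceed it. This is only an illustration of the idea, not the library's implementation. The function name is hypothetical, and counting target PSMs only is an assumption.

```python
def identifications_per_threshold(qvalues, is_decoy, fdr_thresholds):
    """Count target identifications with q-value <= each threshold."""
    # Assumption: decoys are excluded from the counts.
    target_q = [q for q, decoy in zip(qvalues, is_decoy) if not decoy]
    return {t: sum(q <= t for q in target_q) for t in fdr_thresholds}


counts = identifications_per_threshold(
    qvalues=[0.001, 0.004, 0.02, 0.03, 0.2],
    is_decoy=[False, False, False, True, False],
    fdr_thresholds=[0.01, 0.05],
)
print(counts)
```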

Return type:

Figure

Signature: (before, after, level='psms', indexer='index', fdr_threshold=0.01)

Plot PSM scores before and after rescoring.

  • before (LinearConfidence) – Mokapot linear confidence results before rescoring.

  • after (LinearConfidence) – Mokapot linear confidence results after rescoring.

  • level (str) – Level of confidence estimates to plot. Must be one of “psms”, “peptides”, or “proteins”.

  • indexer (str) – Column with index for each PSM, peptide, or protein to use for merging data frames.

  • fdr_threshold (float)
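To make the role of indexer concrete, the sketch below pairs before and after scores by a shared key column. It is a simplified stand-in using lists of dicts rather than the Mokapot result objects. The column name score, the inner-join behaviour, and the function name are all assumptions made for illustration.

```python
def merge_scores(before_rows, after_rows, indexer="index"):
    """Pair before/after scores for entries present in both inputs."""
    # Look up the "after" score by the indexer value.
    after_by_key = {row[indexer]: row["score"] for row in after_rows}
    return [
        (row[indexer], row["score"], after_by_key[row[indexer]])
        for row in before_rows
        if row[indexer] in after_by_key  # assumption: inner join
    ]


before_rows = [{"index": 1, "score": 0.2}, {"index": 2, "score": 0.5}]
after_rows = [{"index": 2, "score": 0.9}, {"index": 3, "score": 0.7}]
merged = merge_scores(before_rows, after_rows)
print(merged)
```

Each resulting tuple holds the key, the score before rescoring, and the score after rescoring, which is the kind of pairing a before/after scatter plot needs.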

Return type:

Figure

Signature: (before, after, level='psms', indexer='index')

Plot number of identifications as a function of FDR threshold before/after rescoring.

  • before (LinearConfidence) – Mokapot linear confidence results before rescoring.

  • after (LinearConfidence) – Mokapot linear confidence results after rescoring.

  • level (str) – Level of confidence estimates to plot. Must be one of “psms”, “peptides”, or “proteins”.

  • indexer (str) – Column with index for each PSM, peptide, or protein to use for merging data frames.

Return type:

Figure

Signature: (before, after)

Plot stacked bar charts of removed, retained, and gained PSMs, peptides, and proteins.

  • before (LinearConfidence) – Mokapot linear confidence results before rescoring.

  • after (LinearConfidence) – Mokapot linear confidence results after rescoring.
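The three categories map naturally onto set operations. The sketch below works on plain sets of identifiers that passed the FDR threshold before and after rescoring; it is an illustration of the categories, not the library's code, and the function name is hypothetical.

```python
def identification_overlap_counts(before_ids, after_ids):
    """Count identifications removed, retained, and gained by rescoring."""
    before_ids, after_ids = set(before_ids), set(after_ids)
    return {
        "removed": len(before_ids - after_ids),   # only before rescoring
        "retained": len(before_ids & after_ids),  # in both
        "gained": len(after_ids - before_ids),    # only after rescoring
    }


overlap = identification_overlap_counts({"A", "B", "C"}, {"B", "C", "D", "E"})
print(overlap)
```

Running the same computation once per level (PSMs, peptides, proteins) would give the values for one stacked bar each.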

Return type:

Figure

Signature: (feature_weights, color_discrete_map=None)

Plot bar chart of feature weights.

  • feature_weights (DataFrame) – Data frame with columns feature, feature_generator, and weight.

  • color_discrete_map (Dict[str, str] | None) – Mapping of feature generator names to colors for plotting.

Return type:

Figure

Signature: (feature_weights, color_discrete_map=None)

Plot bar chart of feature weights, summed by feature generator.

  • feature_weights (DataFrame) – Data frame with columns “feature”, “feature_generator”, and “weight”.

  • color_discrete_map (Dict[str, str] | None) – Mapping of feature generator names to colors for plotting.
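The aggregation behind this chart can be sketched without any plotting library. The example below sums weights per feature generator from rows with the three documented columns. The function name, the generator names, and the feature names other than spec_pearson_norm are illustrative. Summing the raw weights, rather than absolute values, is an assumption.

```python
from collections import defaultdict


def sum_weights_by_generator(rows):
    """Sum the weight column per feature_generator."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["feature_generator"]] += row["weight"]
    return dict(totals)


rows = [
    {"feature": "spec_pearson_norm", "feature_generator": "ms2pip", "weight": 0.5},
    {"feature": "example_feature_a", "feature_generator": "ms2pip", "weight": 0.25},
    {"feature": "example_feature_b", "feature_generator": "deeplc", "weight": -0.25},
]
totals = sum_weights_by_generator(rows)
print(totals)
```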

Return type:

Figure

Signature: (features, is_decoy, qvalue)

Plot MS²PIP correlation for target PSMs with q-value <= 0.01.

  • features (DataFrame) – Data frame with features. Must contain the column spec_pearson_norm.

  • is_decoy (Series | ndarray) – Boolean array indicating whether each PSM is a decoy.

  • qvalue (Series | ndarray) – Array of q-values for each PSM.
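The selection described above (target PSMs with q-value <= 0.01) can be sketched as a simple filter over the three inputs. Plain lists stand in for the data frame column and arrays, and the function name is hypothetical.

```python
def confident_target_correlations(spec_pearson_norm, is_decoy, qvalue, threshold=0.01):
    """Keep correlations of target PSMs with q-value <= threshold."""
    return [
        corr
        for corr, decoy, q in zip(spec_pearson_norm, is_decoy, qvalue)
        if not decoy and q <= threshold
    ]


selected = confident_target_correlations(
    spec_pearson_norm=[0.95, 0.40, 0.88, 0.10],
    is_decoy=[False, False, True, False],
    qvalue=[0.001, 0.2, 0.005, 0.01],
)
print(selected)
```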

Return type:

Figure

Signature: (features, is_decoy)

Calculate q-values and ECDF AUC for all rescoring features.

Q-values are calculated for each feature as if it were used directly as the PSM score. For each q-value distribution, the ECDF AUC is calculated as a measure of the overall individual performance of the feature.

As it is not known whether higher or lower values are better for each feature, q-values are calculated for both the original and reversed scores. The q-values and ECDF AUC are returned for the calculation with the highest ECDF AUC.

  • features (DataFrame) – Data frame with features. Must contain the column spec_pearson_norm.

  • is_decoy (Series) – Boolean array indicating whether each PSM is a decoy.

Returns:

  • feature_qvalues – Wide-form data frame with q-values for each feature.

  • feature_ecdf_auc – Long-form data frame with ECDF AUC for each feature.
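The procedure described above can be sketched for a single feature in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: q-values come from a basic target-decoy estimate (decoys / targets, capped at 1), ties are not treated specially, and the ECDF AUC is taken over all PSMs on the interval [0, 1]. All function names are hypothetical.

```python
def qvalues_from_scores(scores, is_decoy, higher_is_better=True):
    """Target-decoy q-values, returned in the input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=higher_is_better)
    fdr, targets, decoys = [], 0, 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdr.append(min(1.0, decoys / max(targets, 1)))
    # q-value: lowest FDR at which the PSM is still accepted.
    qvalues = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr[rank])
        qvalues[order[rank]] = running_min
    return qvalues


def ecdf_auc(qvalues):
    """Area under the ECDF of q-values on [0, 1], which equals 1 - mean(q)."""
    return 1.0 - sum(qvalues) / len(qvalues)


def best_direction(scores, is_decoy):
    """Try both score directions and keep the one with the highest ECDF AUC."""
    candidates = []
    for higher_is_better in (True, False):
        q = qvalues_from_scores(scores, is_decoy, higher_is_better)
        candidates.append((ecdf_auc(q), higher_is_better, q))
    return max(candidates, key=lambda c: c[0])


auc, higher_is_better, q = best_direction(
    scores=[0.9, 0.8, 0.7, 0.2, 0.1],
    is_decoy=[False, False, False, True, True],
)
print(auc, higher_is_better)
```

A feature that separates targets from decoys well pushes most q-values toward zero, so its ECDF rises early and its AUC approaches 1.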

Return type:

Tuple[DataFrame, DataFrame]

Signature: (feature_ecdf_auc, color_discrete_map=None)

Plot bar chart of feature q-value ECDF AUCs.

  • feature_ecdf_auc (DataFrame) – Data frame with columns feature, feature_generator, and ecdf_auc.

  • color_discrete_map (Dict[str, str] | None) – Mapping of feature generator names to colors for plotting.

Return type:

Figure
