ms2rescore.report

Functionality for analyzing and reporting MS²Rescore results, including reusable Plotly-based charts and HTML-report generation.

Generate report

Generate an HTML report with various QC charts for MS²Rescore results.

ms2rescore.report.generate.generate_report(output_path_prefix, psm_list=None, feature_names=None, use_txt_log=False)

Generate the report.

Parameters:
  • output_path_prefix (str) – Prefix of the MS²Rescore output file names. For example, if the output PSM file is /path/to/file.ms2rescore.psms.tsv, the prefix is /path/to/file.ms2rescore.

  • psm_list (PSMList | None) – PSMs to be used for the report. If not provided, the PSMs will be read from the PSM file that matches the output_path_prefix.

  • feature_names (Dict[str, list] | None) – Feature names to be used for the report. If not provided, the feature names will be read from the feature names file that matches the output_path_prefix.

  • use_txt_log (bool) – If True, the log file will be read from output_path_prefix + ".log.txt" instead of output_path_prefix + ".log.html".
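The file-name conventions described above can be sketched in plain Python. This is only an illustration of the documented naming scheme; the helper names prefix_from_psm_file and log_path are hypothetical and not part of ms2rescore.

```python
def prefix_from_psm_file(psm_file: str, suffix: str = ".psms.tsv") -> str:
    """Derive output_path_prefix from the path of an output PSM file (hypothetical helper)."""
    if not psm_file.endswith(suffix):
        raise ValueError(f"Expected a file name ending in {suffix!r}: {psm_file}")
    return psm_file[: -len(suffix)]


def log_path(output_path_prefix: str, use_txt_log: bool = False) -> str:
    """Return the log file that matches the prefix, following the use_txt_log rule."""
    extension = ".log.txt" if use_txt_log else ".log.html"
    return output_path_prefix + extension


prefix = prefix_from_psm_file("/path/to/file.ms2rescore.psms.tsv")
html_log = log_path(prefix)
txt_log = log_path(prefix, use_txt_log=True)
```

With such a prefix, the documented call would take the form generate_report("/path/to/file.ms2rescore"), optionally with use_txt_log=True when only the text log is available.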

Charts

Collection of Plotly-based charts for reporting results of MS²Rescore.

ms2rescore.report.charts.score_histogram(psms)

Plot histogram of scores for a single PSM dataset.

Parameters:

psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().

Return type:

Figure

ms2rescore.report.charts.pp_plot(psms)

Generate PP plot of target and decoy score distributions.

Parameters:

psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().

Return type:

Figure
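As background, a PP plot compares two distributions by plotting their empirical cumulative distribution functions (ECDFs) against each other. The sketch below computes such points for target and decoy scores in plain Python. It illustrates the general idea only; the exact construction used by pp_plot is not specified here, and the helper names ecdf_at and pp_points are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def ecdf_at(sorted_values: Sequence[float], x: float) -> float:
    """Fraction of values that are <= x (values must be sorted ascending)."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def pp_points(
    target_scores: Sequence[float], decoy_scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (decoy ECDF, target ECDF) pairs evaluated at every observed score."""
    targets = sorted(target_scores)
    decoys = sorted(decoy_scores)
    grid = sorted(set(targets) | set(decoys))
    return [(ecdf_at(decoys, s), ecdf_at(targets, s)) for s in grid]


points = pp_points(target_scores=[1, 2, 3, 4], decoy_scores=[1, 2])
```

If targets and decoys followed the same score distribution, the points would lie on the diagonal; the further they fall below it, the more the targets are shifted toward higher scores.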

ms2rescore.report.charts.fdr_plot(psms, fdr_thresholds=None, log=True)

Plot number of identifications as a function of FDR threshold.

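Parameters:
  • psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().

  • fdr_thresholds

  • log (bool)

Return type: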

Figure
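The quantity shown by such a plot can be sketched without any plotting: for each FDR threshold, count the target PSMs whose q-value does not exceed it. This is a minimal illustration under the assumption that q-values and decoy flags are available as plain sequences; the function name identifications_per_threshold is hypothetical and not part of ms2rescore.

```python
from typing import Dict, Sequence


def identifications_per_threshold(
    qvalues: Sequence[float],
    is_decoy: Sequence[bool],
    fdr_thresholds: Sequence[float],
) -> Dict[float, int]:
    """Count target PSMs with q-value <= threshold, for each threshold."""
    counts = {}
    for threshold in fdr_thresholds:
        counts[threshold] = sum(
            1
            for qvalue, decoy in zip(qvalues, is_decoy)
            if not decoy and qvalue <= threshold
        )
    return counts


# Illustrative data, not real results
qvalues = [0.001, 0.004, 0.009, 0.02, 0.03, 0.2]
is_decoy = [False, False, False, False, True, False]
counts = identifications_per_threshold(qvalues, is_decoy, [0.01, 0.05])
```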

ms2rescore.report.charts.score_scatter_plot(before, after, level='psms', indexer='index', fdr_threshold=0.01)

Plot PSM scores before and after rescoring.

Parameters:
  • before (LinearConfidence) – Mokapot linear confidence results before rescoring.

  • after (LinearConfidence) – Mokapot linear confidence results after rescoring.

  • level (str) – Level of confidence estimates to plot. Must be one of “psms”, “peptides”, or “proteins”.

  • indexer (str) – Column with index for each PSM, peptide, or protein to use for merging data frames.

  • fdr_threshold (float)

Return type:

Figure

ms2rescore.report.charts.fdr_plot_comparison(before, after, level='psms', indexer='index')

Plot number of identifications as a function of FDR threshold, before and after rescoring.

Parameters:
  • before (LinearConfidence) – Mokapot linear confidence results before rescoring.

  • after (LinearConfidence) – Mokapot linear confidence results after rescoring.

  • level (str) – Level of confidence estimates to plot. Must be one of “psms”, “peptides”, or “proteins”.

  • indexer (str) – Column with index for each PSM, peptide, or protein to use for merging data frames.

Return type:

Figure

ms2rescore.report.charts.identification_overlap(before, after)

Plot stacked bar charts of removed, retained, and gained PSMs, peptides, and proteins.

Parameters:
  • before (LinearConfidence) – Mokapot linear confidence results before rescoring.

  • after (LinearConfidence) – Mokapot linear confidence results after rescoring.

Return type:

Figure
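The three categories in this chart correspond to simple set operations on the identifiers accepted before and after rescoring. The sketch below shows that logic on plain Python sets. It assumes the accepted identifiers have already been extracted at some FDR threshold; the function name overlap_counts and the peptide strings are hypothetical.

```python
from typing import Dict, Iterable


def overlap_counts(before_ids: Iterable[str], after_ids: Iterable[str]) -> Dict[str, int]:
    """Count identifications that were removed, retained, or gained by rescoring."""
    before, after = set(before_ids), set(after_ids)
    return {
        "removed": len(before - after),   # accepted before, but not after
        "retained": len(before & after),  # accepted both before and after
        "gained": len(after - before),    # accepted only after rescoring
    }


# Illustrative identifiers, not real results
counts = overlap_counts(
    before_ids=["PEPTIDEA", "PEPTIDEB", "PEPTIDEC"],
    after_ids=["PEPTIDEB", "PEPTIDEC", "PEPTIDED", "PEPTIDEE"],
)
```

The same computation would be repeated per level (PSMs, peptides, proteins) to obtain one stacked bar for each.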

ms2rescore.report.charts.feature_weights(feature_weights, color_discrete_map=None)

Plot bar chart of feature weights.

Parameters:
  • feature_weights (DataFrame) – Data frame with columns feature, feature_generator, and weight.

  • color_discrete_map (Dict[str, str] | None) – Mapping of feature generator names to colors for plotting.

Return type:

Figure

ms2rescore.report.charts.feature_weights_by_generator(feature_weights, color_discrete_map=None)

Plot bar chart of feature weights, summed by feature generator.

Parameters:
  • feature_weights (DataFrame) – Data frame with columns feature, feature_generator, and weight.

  • color_discrete_map (Dict[str, str] | None) – Mapping of feature generator names to colors for plotting.

Return type:

Figure
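The aggregation behind this chart can be sketched in plain Python, using rows with the same three columns as the feature_weights data frame. This is an assumption-laden illustration: it uses a plain signed sum, which may differ from how the library aggregates weights, and the generator names and most feature names below are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def sum_weights_by_generator(rows: List[dict]) -> Dict[str, float]:
    """Sum the weight column per feature_generator (plain signed sum, an assumption)."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["feature_generator"]] += row["weight"]
    return dict(totals)


# Illustrative rows; generator names and the hypothetical_* features are invented
rows = [
    {"feature": "spec_pearson_norm", "feature_generator": "ms2pip", "weight": 0.5},
    {"feature": "hypothetical_ms2pip_feature", "feature_generator": "ms2pip", "weight": 0.25},
    {"feature": "hypothetical_rt_feature", "feature_generator": "deeplc", "weight": -0.125},
]
totals = sum_weights_by_generator(rows)
```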

ms2rescore.report.charts.ms2pip_correlation(features, is_decoy, qvalue)

Plot MS²PIP correlation for target PSMs with q-value <= 0.01.

Parameters:
  • features (DataFrame) – Data frame with features. Must contain the column spec_pearson_norm.

  • is_decoy (Series | ndarray) – Boolean array indicating whether each PSM is a decoy.

  • qvalue (Series | ndarray) – Array of q-values for each PSM.

Return type:

Figure

ms2rescore.report.charts.calculate_feature_qvalues(features, is_decoy)

Calculate q-values and ECDF AUC for all rescoring features.

Q-values are calculated for each feature as if it were used directly as the PSM score. For each q-value distribution, the ECDF AUC is calculated as a measure of the feature's overall individual performance.

As it is not known whether higher or lower values are better for each feature, q-values are calculated for both the original and reversed scores. The q-values and ECDF AUC are returned for the calculation with the highest ECDF AUC.

Parameters:
  • features (DataFrame) – Data frame with features. Must contain the column spec_pearson_norm.

  • is_decoy (Series) – Boolean array indicating whether each PSM is a decoy.

Returns:

  • feature_qvalues – Wide-form data frame with q-values for each feature.

  • feature_ecdf_auc – Long-form data frame with ECDF AUC for each feature.

Return type:

Tuple[DataFrame, DataFrame]
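The procedure described above can be sketched for a single feature in plain Python. This is a simplified illustration, not the library's implementation: it assumes a basic target-decoy q-value estimate without tie handling, computes the ECDF AUC over target q-values clipped to [0, 1], and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def tdc_qvalues(
    scores: Sequence[float], is_decoy: Sequence[bool], higher_is_better: bool = True
) -> List[float]:
    """Simplified target-decoy q-values (no tie handling), in the input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=higher_is_better)
    fdrs, targets, decoys = [], 0, 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))
    # The q-value is the lowest FDR at this rank or any worse-scoring rank
    qvalues = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvalues[order[rank]] = running_min
    return qvalues


def ecdf_auc(qvalues: Sequence[float], is_decoy: Sequence[bool]) -> float:
    """Area under the ECDF of target q-values on [0, 1], which equals 1 - mean(q)."""
    target_q = [q for q, decoy in zip(qvalues, is_decoy) if not decoy]
    if not target_q:
        return 0.0
    return 1.0 - sum(target_q) / len(target_q)


def best_direction(
    scores: Sequence[float], is_decoy: Sequence[bool]
) -> Tuple[List[float], float, bool]:
    """Try both score directions and keep the one with the highest ECDF AUC."""
    candidates = []
    for higher_is_better in (True, False):
        qvalues = tdc_qvalues(scores, is_decoy, higher_is_better)
        candidates.append((ecdf_auc(qvalues, is_decoy), higher_is_better, qvalues))
    auc, higher_is_better, qvalues = max(candidates, key=lambda c: c[0])
    return qvalues, auc, higher_is_better


# Illustrative feature values, not real results
feature = [0.9, 0.8, 0.7, 0.2, 0.1]
decoy_flags = [False, False, False, True, True]
qvalues, auc, higher_is_better = best_direction(feature, decoy_flags)
```

Negating the feature values yields the same q-values with the opposite direction selected, which is why the orientation of each feature does not need to be known in advance.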

ms2rescore.report.charts.feature_ecdf_auc_bar(feature_ecdf_auc, color_discrete_map=None)

Plot bar chart of feature q-value ECDF AUCs.

Parameters:
  • feature_ecdf_auc (DataFrame) – Data frame with columns feature, feature_generator, and ecdf_auc.

  • color_discrete_map (Dict[str, str] | None) – Mapping of feature generator names to colors for plotting.

Return type:

Figure