ms2rescore.report
Functionality for analyzing and reporting MS²Rescore results, including reusable Plotly-based charts and HTML-report generation.
Generate report
Generate an HTML report with various QC charts for MS²Rescore results.
- ms2rescore.report.generate.generate_report(output_path_prefix, psm_list=None, feature_names=None, use_txt_log=False)
Generate the report.
- Parameters:
output_path_prefix (str) – Prefix of the MS²Rescore output file names. For example, if the output PSM file is /path/to/file.ms2rescore.psms.tsv, the prefix is /path/to/file.ms2rescore.
psm_list (PSMList | None) – PSMs to be used for the report. If not provided, the PSMs will be read from the PSM file that matches the output_path_prefix.
feature_names (Dict[str, list] | None) – Feature names to be used for the report. If not provided, the feature names will be read from the feature names file that matches the output_path_prefix.
use_txt_log (bool) – If True, the log file will be read from output_path_prefix + ".log.txt" instead of output_path_prefix + ".log.html".
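The file naming convention behind output_path_prefix can be sketched with two small helpers. Both function names are hypothetical and are not part of ms2rescore; only the suffixes (.psms.tsv, .log.txt, .log.html) are taken from the parameter descriptions.

```python
# Hypothetical helpers illustrating the output_path_prefix convention.
# They are NOT part of ms2rescore; only the suffixes come from the docs.

PSM_SUFFIX = ".psms.tsv"


def output_prefix_from_psm_file(psm_path: str) -> str:
    """Strip the PSM file suffix to recover the output path prefix."""
    if not psm_path.endswith(PSM_SUFFIX):
        raise ValueError(f"Expected a path ending in {PSM_SUFFIX!r}: {psm_path!r}")
    return psm_path[: -len(PSM_SUFFIX)]


def log_path(output_path_prefix: str, use_txt_log: bool = False) -> str:
    """Return the log file path selected by the use_txt_log flag."""
    suffix = ".log.txt" if use_txt_log else ".log.html"
    return output_path_prefix + suffix


prefix = output_prefix_from_psm_file("/path/to/file.ms2rescore.psms.tsv")
html_log = log_path(prefix)
txt_log = log_path(prefix, use_txt_log=True)
```

The resulting prefix is the value that would be passed as output_path_prefix to generate_report.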
Charts
Collection of Plotly-based charts for reporting results of MS²Rescore.
- ms2rescore.report.charts.score_histogram(psms)
Plot histogram of scores for a single PSM dataset.
- Parameters:
psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().
- Return type:
Figure
- ms2rescore.report.charts.pp_plot(psms)
Generate PP plot of target and decoy score distributions.
- Parameters:
psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().
- Return type:
Figure
- ms2rescore.report.charts.fdr_plot(psms, fdr_thresholds=None, log=True)
Plot the number of identifications as a function of the FDR threshold.
- Parameters:
psms (PSMList | DataFrame) – PSMs to plot, as psm_utils.PSMList or pandas.DataFrame generated with psm_utils.PSMList.to_dataframe().
fdr_thresholds (List[float] | None) – List of FDR thresholds to draw as vertical lines.
log (bool) – Whether to plot the x-axis on a log scale. Defaults to True.
- Return type:
Figure
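The quantity shown in this kind of plot can be sketched in plain Python: for each FDR threshold, count the PSMs whose q-value does not exceed it. This is an illustration only, not the code used by fdr_plot; the function name is hypothetical, and excluding decoys from the count is an assumption.

```python
from bisect import bisect_right


def count_identifications(qvalues, is_decoy, thresholds):
    """Count target PSMs with q-value <= threshold, for each threshold.

    Hypothetical helper; assumes decoys are excluded from the counts.
    """
    # Sort target q-values once so each threshold is a binary search.
    target_q = sorted(q for q, decoy in zip(qvalues, is_decoy) if not decoy)
    return {t: bisect_right(target_q, t) for t in thresholds}


# Toy data: five targets and one decoy.
qvalues = [0.001, 0.004, 0.02, 0.009, 0.3, 0.05]
is_decoy = [False, False, False, False, True, False]

counts = count_identifications(qvalues, is_decoy, thresholds=[0.01, 0.05])
```

Because interesting thresholds such as 0.01 and 0.05 sit close to zero, a logarithmic x-axis spreads them out, which is presumably why log defaults to True.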
- ms2rescore.report.charts.score_scatter_plot(before, after, level='psms', indexer='index', fdr_threshold=0.01)
Plot PSM scores before and after rescoring.
- Parameters:
before (LinearConfidence) – Mokapot linear confidence results before rescoring.
after (LinearConfidence) – Mokapot linear confidence results after rescoring.
level (str) – Level of confidence estimates to plot. Must be one of “psms”, “peptides”, or “proteins”.
indexer (str) – Column with index for each PSM, peptide, or protein to use for merging data frames.
fdr_threshold (float)
- Return type:
Figure
- ms2rescore.report.charts.fdr_plot_comparison(before, after, level='psms', indexer='index')
Plot the number of identifications as a function of the FDR threshold, before and after rescoring.
- Parameters:
before (LinearConfidence) – Mokapot linear confidence results before rescoring.
after (LinearConfidence) – Mokapot linear confidence results after rescoring.
level (str) – Level of confidence estimates to plot. Must be one of “psms”, “peptides”, or “proteins”.
indexer (str) – Column with index for each PSM, peptide, or protein to use for merging dataframes.
- Return type:
Figure
- ms2rescore.report.charts.identification_overlap(before, after)
Plot stacked bar charts of removed, retained, and gained PSMs, peptides, and proteins.
- Parameters:
before (LinearConfidence) – Mokapot linear confidence results before rescoring.
after (LinearConfidence) – Mokapot linear confidence results after rescoring.
- Return type:
Figure
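The three categories in the stacked bars correspond to simple set operations on the identifications accepted before and after rescoring. The sketch below assumes each side has already been reduced to a set of accepted identifiers (for example, at a fixed FDR threshold); the function name and the example identifiers are made up for illustration.

```python
def overlap_counts(before_ids, after_ids):
    """Classify identifications as removed, retained, or gained.

    Hypothetical helper; inputs are collections of accepted identifiers.
    """
    before, after = set(before_ids), set(after_ids)
    return {
        "removed": len(before - after),   # accepted before, lost after
        "retained": len(before & after),  # accepted in both
        "gained": len(after - before),    # newly accepted after rescoring
    }


# Made-up peptide identifiers.
before_ids = {"PEPTIDEA", "PEPTIDEB", "PEPTIDEC"}
after_ids = {"PEPTIDEB", "PEPTIDEC", "PEPTIDED", "PEPTIDEE"}

counts = overlap_counts(before_ids, after_ids)
```

Running the same classification separately at the PSM, peptide, and protein levels would give one bar per level.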
- ms2rescore.report.charts.feature_weights(feature_weights, color_discrete_map=None)
Plot bar chart of feature weights.
- ms2rescore.report.charts.feature_weights_by_generator(feature_weights, color_discrete_map=None)
Plot bar chart of feature weights, summed by feature generator.
- ms2rescore.report.charts.ms2pip_correlation(features, is_decoy, qvalue)
Plot MS²PIP correlation for target PSMs with q-value <= 0.01.
- ms2rescore.report.charts.calculate_feature_qvalues(features, is_decoy)
Calculate q-values and ECDF AUC for all rescoring features.
Q-values are calculated for each feature as if it were used directly as the PSM score. For each q-value distribution, the ECDF AUC is calculated as a measure of the overall individual performance of the feature.
As it is not known whether higher or lower values are better for each feature, q-values are calculated for both the original and reversed scores. The q-values and ECDF AUC are returned for the calculation with the highest ECDF AUC.
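The orientation selection described above can be sketched in plain Python. This is not the ms2rescore implementation: the function names are hypothetical, the q-values use a basic target-decoy estimate (decoys / targets, ignoring ties), and the ECDF AUC is assumed to be the area under the ECDF of target q-values over [0, 1], which equals one minus their mean.

```python
def target_decoy_qvalues(scores, is_decoy):
    """Basic target-decoy q-values, treating higher scores as better."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    fdrs, targets, decoys = [], 0, 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        # FDR estimate at this rank, capped at 1.
        fdrs.append(min(1.0, decoys / max(targets, 1)))
    # q-value = lowest FDR at this rank or any lower-scoring rank.
    qvalues, running_min = [0.0] * n, 1.0
    for rank in range(n - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvalues[order[rank]] = running_min
    return qvalues


def ecdf_auc(qvalues, is_decoy):
    """Assumed definition: area under the target q-value ECDF on [0, 1]."""
    target_q = [q for q, decoy in zip(qvalues, is_decoy) if not decoy]
    return 1.0 - sum(target_q) / len(target_q)


def best_orientation(values, is_decoy):
    """Score a feature both ways and keep the orientation with higher AUC."""
    q_orig = target_decoy_qvalues(values, is_decoy)
    q_rev = target_decoy_qvalues([-v for v in values], is_decoy)
    auc_orig, auc_rev = ecdf_auc(q_orig, is_decoy), ecdf_auc(q_rev, is_decoy)
    if auc_rev > auc_orig:
        return q_rev, auc_rev, True
    return q_orig, auc_orig, False


# Toy feature where LOWER values are better (e.g., an error term).
values = [0.1, 0.2, 0.3, 0.9, 0.8, 0.15]
is_decoy = [False, False, False, True, True, False]
qvalues, auc, was_reversed = best_orientation(values, is_decoy)
```

In this toy case the reversed orientation wins because all targets then outrank the decoys, so every target receives a q-value of 0.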
- Parameters:
- Returns:
feature_qvalues – Wide-form data frame with q-values for each feature.
feature_ecdf_auc – Long-form data frame with ECDF AUC for each feature.
- Return type:
- ms2rescore.report.charts.feature_ecdf_auc_bar(feature_ecdf_auc, color_discrete_map=None)
Plot bar chart of feature q-value ECDF AUCs.