AlphaPeptStats: an open-source Python package for automated and scalable statistical analysis of mass spectrometry-based proteomics

Research output: Contribution to journal / Journal article / Research / peer-review

SUMMARY: The widespread application of mass spectrometry (MS)-based proteomics in biomedical research increasingly requires robust, transparent, and streamlined solutions to extract statistically reliable insights. We have designed and implemented AlphaPeptStats, an inclusive Python package that currently offers broad functionality for normalization, imputation, visualization, and statistical analysis of label-free proteomics data. It builds modularly on the established stack of Python scientific libraries and is accompanied by a rigorous testing framework with 98% test coverage. It imports the output of a range of popular search engines. Data can be filtered and normalized according to user specifications. At its heart, AlphaPeptStats provides a wide range of robust statistical algorithms such as t-tests, analysis of variance, principal component analysis, hierarchical clustering, and multiple covariate analysis, all in an automatable manner. Data visualization capabilities include heat maps, volcano plots, and scatter plots in publication-ready format. AlphaPeptStats advances proteomic research through robust tools that enable researchers to explore complex datasets, manually or automatically, and to identify interesting patterns and outliers.
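
The normalization and t-test steps named in the summary can be illustrated with a small standalone sketch. It does not use the AlphaPeptStats API: the function names, sample names, and intensity values are hypothetical, and median centering on log2 intensities is assumed here as one common normalization choice. It relies only on the Python standard library.

```python
import math
from statistics import mean, median, variance


def log2_median_normalize(intensities):
    """Log2-transform each sample and subtract that sample's median."""
    normalized = {}
    for sample, values in intensities.items():
        logged = [math.log2(v) for v in values]
        center = median(logged)
        normalized[sample] = [x - center for x in logged]
    return normalized


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups (unequal variances)."""
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical toy data: three proteins measured in four samples.
intensities = {
    "control_1": [1000.0, 2000.0, 4000.0],
    "control_2": [1100.0, 2100.0, 4100.0],
    "treated_1": [1000.0, 2000.0, 16000.0],
    "treated_2": [1200.0, 2200.0, 18000.0],
}
normalized = log2_median_normalize(intensities)

# Compare the third protein (index 2) between the two groups.
protein_index = 2
controls = [normalized[s][protein_index] for s in ("control_1", "control_2")]
treated = [normalized[s][protein_index] for s in ("treated_1", "treated_2")]

log2_fold_change = mean(treated) - mean(controls)
t_statistic = welch_t(treated, controls)
print(f"log2 fold change: {log2_fold_change:.2f}, Welch t: {t_statistic:.2f}")
```

The log2 fold change and a test statistic (or its p-value) per protein are the two quantities a volcano plot displays; a full analysis would also compute p-values and correct them for multiple testing, which this sketch omits.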

AVAILABILITY AND IMPLEMENTATION: AlphaPeptStats is implemented in Python and is part of the AlphaPept framework. It is released under a permissive Apache license. The source code and one-click installers are freely available on GitHub at https://github.com/MannLabs/alphapeptstats.

Original language: English
Journal: Bioinformatics
Volume: 39
Issue number: 8
Number of pages: 4
ISSN: 1367-4811
Publication status: Published - 2023

Bibliographical note

© The Author(s) 2023. Published by Oxford University Press.

    Research areas

  • Proteomics/methods, Software, Mass Spectrometry/methods, Algorithms, Search Engine
