High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. / Neuhauser, Nadin; Nagaraj, Nagarjuna; McHardy, Peter; Zanivan, Sara; Scheltema, Richard; Cox, Jürgen; Mann, Matthias.

In: Journal of Proteome Research, Vol. 12, No. 6, 07.06.2013, p. 2858-68.

Harvard

Neuhauser, N, Nagaraj, N, McHardy, P, Zanivan, S, Scheltema, R, Cox, J & Mann, M 2013, 'High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome', Journal of Proteome Research, vol. 12, no. 6, pp. 2858-68. https://doi.org/10.1021/pr400181q

APA

Neuhauser, N., Nagaraj, N., McHardy, P., Zanivan, S., Scheltema, R., Cox, J., & Mann, M. (2013). High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. Journal of Proteome Research, 12(6), 2858-68. https://doi.org/10.1021/pr400181q

Vancouver

Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. Journal of Proteome Research. 2013 Jun 7;12(6):2858-68. https://doi.org/10.1021/pr400181q

Author

Neuhauser, Nadin ; Nagaraj, Nagarjuna ; McHardy, Peter ; Zanivan, Sara ; Scheltema, Richard ; Cox, Jürgen ; Mann, Matthias. / High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. In: Journal of Proteome Research. 2013 ; Vol. 12, No. 6. pp. 2858-68.

Bibtex

@article{7e1fb81fa257487fa15e1bfde5928467,
title = "High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome",
abstract = "Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS(1) data as well as the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. Here we compared the performance of a parallelized version of MaxQuant running on a standard desktop, an I/O performance optimized desktop computer ({"}game computer{"}), and a cluster environment. The modified gaming computer and the cluster vastly outperformed a standard desktop computer when analyzing more than 1000 raw files. We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.",
author = "Nadin Neuhauser and Nagarjuna Nagaraj and Peter McHardy and Sara Zanivan and Richard Scheltema and J{\"u}rgen Cox and Matthias Mann",
year = "2013",
month = jun,
day = "7",
doi = "10.1021/pr400181q",
language = "English",
volume = "12",
pages = "2858--2868",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "6"
}

RIS

TY - JOUR

T1 - High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome

AU - Neuhauser, Nadin

AU - Nagaraj, Nagarjuna

AU - McHardy, Peter

AU - Zanivan, Sara

AU - Scheltema, Richard

AU - Cox, Jürgen

AU - Mann, Matthias

PY - 2013/6/7

Y1 - 2013/6/7

N2 - Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS(1) data as well as the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. Here we compared the performance of a parallelized version of MaxQuant running on a standard desktop, an I/O performance optimized desktop computer ("game computer"), and a cluster environment. The modified gaming computer and the cluster vastly outperformed a standard desktop computer when analyzing more than 1000 raw files. We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.

AB - Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS(1) data as well as the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. Here we compared the performance of a parallelized version of MaxQuant running on a standard desktop, an I/O performance optimized desktop computer ("game computer"), and a cluster environment. The modified gaming computer and the cluster vastly outperformed a standard desktop computer when analyzing more than 1000 raw files. We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.

U2 - 10.1021/pr400181q

DO - 10.1021/pr400181q

M3 - Journal article

C2 - 23611042

VL - 12

SP - 2858

EP - 2868

JO - Journal of Proteome Research

JF - Journal of Proteome Research

SN - 1535-3893

IS - 6

ER -
