Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000

Research output: Contribution to journal, Journal article, peer-review

Standard

Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000. / Sinha, Swati; Eisenhaber, Birgit; Jensen, Lars Juhl; Kalbuaji, Bharata; Eisenhaber, Frank.

In: Proteomics, Vol. 18, No. 21-22, e1800093, 2018, p. 1-13.

Harvard

Sinha, S, Eisenhaber, B, Jensen, LJ, Kalbuaji, B & Eisenhaber, F 2018, 'Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000', Proteomics, vol. 18, no. 21-22, e1800093, pp. 1-13. https://doi.org/10.1002/pmic.201800093

APA

Sinha, S., Eisenhaber, B., Jensen, L. J., Kalbuaji, B., & Eisenhaber, F. (2018). Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000. Proteomics, 18(21-22), 1-13. [e1800093]. https://doi.org/10.1002/pmic.201800093

Vancouver

Sinha S, Eisenhaber B, Jensen LJ, Kalbuaji B, Eisenhaber F. Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000. Proteomics. 2018;18(21-22):1-13. e1800093. https://doi.org/10.1002/pmic.201800093

Author

Sinha, Swati; Eisenhaber, Birgit; Jensen, Lars Juhl; Kalbuaji, Bharata; Eisenhaber, Frank. / Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000. In: Proteomics. 2018; Vol. 18, No. 21-22. pp. 1-13.

Bibtex

@article{dfb018afeec043c7a1aa36370f388d0b,
title = "Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000",
abstract = "The mentioning of gene names in the body of the scientific literature 1901-2017 and their fractional counting was used as a proxy to assess the level of biological function discovery. We define a literature score of one as full publication equivalent (FPE), the amount of literature necessary to achieve one publication solely dedicated to a gene. We find that less than 5000 human genes have each at least 100 FPEs in the available literature corpus. This group of elite genes (4817 protein-coding genes, 119 non-coding RNAs) attracts the overwhelming majority of the scientific literature about genes. Yet, thousands of proteins have never been mentioned at all, ∼2000 further proteins have not even one FPE of literature and, for ∼4600 additional proteins, the FPE count is below 10. The protein function discovery rate measured as numbers of proteins first mentioned or crossing a threshold of accumulated FPEs in a given year has grown until 2000 but is in decline thereafter. This drop is partially offset by function discoveries for non-coding RNAs. The full human genome sequencing did not boost the function discovery rate. Since 2000, the fastest growing group in the literature is that with at least 500 FPEs per gene.",
author = "Swati Sinha and Birgit Eisenhaber and Jensen, {Lars Juhl} and Bharata Kalbuaji and Frank Eisenhaber",
note = "Special Issue: The Dark Proteome and Related Structural Proteomics",
year = "2018",
doi = "10.1002/pmic.201800093",
language = "English",
volume = "18",
pages = "1--13",
journal = "Proteomics",
issn = "1615-9853",
publisher = "Wiley - V C H Verlag GmbH \& Co. KGaA",
number = "21-22",
}

RIS

TY - JOUR

T1 - Darkness in the Human Gene and Protein Function Space

T2 - Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000

AU - Sinha, Swati

AU - Eisenhaber, Birgit

AU - Jensen, Lars Juhl

AU - Kalbuaji, Bharata

AU - Eisenhaber, Frank

N1 - Special Issue: The Dark Proteome and Related Structural Proteomics

PY - 2018

Y1 - 2018

AB - The mentioning of gene names in the body of the scientific literature 1901-2017 and their fractional counting was used as a proxy to assess the level of biological function discovery. We define a literature score of one as full publication equivalent (FPE), the amount of literature necessary to achieve one publication solely dedicated to a gene. We find that less than 5000 human genes have each at least 100 FPEs in the available literature corpus. This group of elite genes (4817 protein-coding genes, 119 non-coding RNAs) attracts the overwhelming majority of the scientific literature about genes. Yet, thousands of proteins have never been mentioned at all, ∼2000 further proteins have not even one FPE of literature and, for ∼4600 additional proteins, the FPE count is below 10. The protein function discovery rate measured as numbers of proteins first mentioned or crossing a threshold of accumulated FPEs in a given year has grown until 2000 but is in decline thereafter. This drop is partially offset by function discoveries for non-coding RNAs. The full human genome sequencing did not boost the function discovery rate. Since 2000, the fastest growing group in the literature is that with at least 500 FPEs per gene.

U2 - 10.1002/pmic.201800093

DO - 10.1002/pmic.201800093

M3 - Journal article

C2 - 30265449

VL - 18

SP - 1

EP - 13

JO - Proteomics

JF - Proteomics

SN - 1615-9853

IS - 21-22

M1 - e1800093

ER -
