On the total number of genes and their length distribution in complete microbial genomes

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

On the total number of genes and their length distribution in complete microbial genomes. / Skovgaard, M; Jensen, L J; Brunak, S; Ussery, David; Krogh, A.

In: Trends in Genetics, Vol. 17, No. 8, 2001, p. 425-8.

Harvard

Skovgaard, M, Jensen, LJ, Brunak, S, Ussery, D & Krogh, A 2001, 'On the total number of genes and their length distribution in complete microbial genomes', Trends in Genetics, vol. 17, no. 8, pp. 425-8.

APA

Skovgaard, M., Jensen, L. J., Brunak, S., Ussery, D., & Krogh, A. (2001). On the total number of genes and their length distribution in complete microbial genomes. Trends in Genetics, 17(8), 425-8.

Vancouver

Skovgaard M, Jensen LJ, Brunak S, Ussery D, Krogh A. On the total number of genes and their length distribution in complete microbial genomes. Trends in Genetics. 2001;17(8):425-8.

Author

Skovgaard, M ; Jensen, L J ; Brunak, S ; Ussery, David ; Krogh, A. / On the total number of genes and their length distribution in complete microbial genomes. In: Trends in Genetics. 2001 ; Vol. 17, No. 8. pp. 425-8.

Bibtex

@article{bc35d9d993734622a5319f214522739d,
title = "On the total number of genes and their length distribution in complete microbial genomes",
abstract = "In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.",
keywords = "Databases, Factual, Escherichia coli, Genome, Genome, Bacterial, Models, Statistical, Open Reading Frames, Saccharomyces cerevisiae",
author = "M Skovgaard and Jensen, {L J} and S Brunak and David Ussery and A Krogh",
year = "2001",
language = "English",
volume = "17",
pages = "425--8",
journal = "Trends in Genetics",
issn = "0168-9525",
publisher = "Elsevier Ltd. * Trends Journals",
number = "8",
}

RIS

TY  - JOUR
T1  - On the total number of genes and their length distribution in complete microbial genomes
AU  - Skovgaard, M
AU  - Jensen, L J
AU  - Brunak, S
AU  - Ussery, David
AU  - Krogh, A
PY  - 2001
Y1  - 2001
N2  - In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.
AB  - In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.
KW  - Databases, Factual
KW  - Escherichia coli
KW  - Genome
KW  - Genome, Bacterial
KW  - Models, Statistical
KW  - Open Reading Frames
KW  - Saccharomyces cerevisiae
M3  - Journal article
C2  - 11485798
VL  - 17
SP  - 425
EP  - 428
JO  - Trends in Genetics
JF  - Trends in Genetics
SN  - 0168-9525
IS  - 8
ER  -

ID: 40749806