A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links

Research output: Contribution to journal, Journal article, Research, peer-review

Standard

A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links. / Pedersen, Helle Krogh; Forslund, Sofia K; Gudmundsdottir, Valborg; Petersen, Anders Østergaard; Hildebrand, Falk; Hyötyläinen, Tuulia; Nielsen, Trine; Hansen, Torben; Bork, Peer; Ehrlich, S. Dusko; Brunak, Søren; Oresic, Matej; Pedersen, Oluf; Nielsen, Henrik Bjørn.

In: Nature Protocols, Vol. 13, No. 12, 2018, p. 2781-2800.

Harvard

Pedersen, HK, Forslund, SK, Gudmundsdottir, V, Petersen, AØ, Hildebrand, F, Hyötyläinen, T, Nielsen, T, Hansen, T, Bork, P, Ehrlich, SD, Brunak, S, Oresic, M, Pedersen, O & Nielsen, HB 2018, 'A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links', Nature Protocols, vol. 13, no. 12, pp. 2781-2800. https://doi.org/10.1038/s41596-018-0064-z

APA

Pedersen, H. K., Forslund, S. K., Gudmundsdottir, V., Petersen, A. Ø., Hildebrand, F., Hyötyläinen, T., ... Nielsen, H. B. (2018). A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links. Nature Protocols, 13(12), 2781-2800. https://doi.org/10.1038/s41596-018-0064-z

Vancouver

Pedersen HK, Forslund SK, Gudmundsdottir V, Petersen AØ, Hildebrand F, Hyötyläinen T et al. A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links. Nature Protocols. 2018;13(12):2781-2800. https://doi.org/10.1038/s41596-018-0064-z

Author

Pedersen, Helle Krogh ; Forslund, Sofia K ; Gudmundsdottir, Valborg ; Petersen, Anders Østergaard ; Hildebrand, Falk ; Hyötyläinen, Tuulia ; Nielsen, Trine ; Hansen, Torben ; Bork, Peer ; Ehrlich, S. Dusko ; Brunak, Søren ; Oresic, Matej ; Pedersen, Oluf ; Nielsen, Henrik Bjørn. / A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links. In: Nature Protocols. 2018 ; Vol. 13, No. 12. pp. 2781-2800.

Bibtex

@article{66114ccbe5c44091ae99a876a7188010,
title = "A computational framework to integrate high-throughput {"}-omics{"} datasets for the identification of potential mechanistic links",
abstract = "We recently presented a three-pronged association study that integrated human intestinal microbiome data derived from shotgun-based sequencing with untargeted serum metabolome data and measures of host physiology. Metabolome and microbiome data are high dimensional, posing a major challenge for data integration. Here, we present a step-by-step computational protocol that details and discusses the dimensionality-reduction techniques used and methods for subsequent integration and interpretation of such heterogeneous types of data. Dimensionality reduction was achieved through a combination of data normalization approaches, binning of co-abundant genes and metabolites, and integration of prior biological knowledge. The use of prior knowledge to overcome functional redundancy across microbiome species is one central advance of our method over available alternative approaches. Applying this framework, other investigators can integrate various '-omics' readouts with variables of host physiology or any other phenotype of interest (e.g., connecting host and microbiome readouts to disease severity or treatment outcome in a clinical cohort) in a three-pronged association analysis to identify potential mechanistic links to be tested in experimental settings. Although we originally developed the framework for a human metabolome-microbiome study, it is generalizable to other organisms and environmental metagenomes, as well as to studies including other -omics domains such as transcriptomics and proteomics. The provided R code runs in ~1 h on a standard PC.",
author = "Pedersen, {Helle Krogh} and Forslund, {Sofia K} and Valborg Gudmundsdottir and Petersen, {Anders {\O}stergaard} and Falk Hildebrand and Tuulia Hy{\"o}tyl{\"a}inen and Trine Nielsen and Torben Hansen and Peer Bork and Ehrlich, {S. Dusko} and S{\o}ren Brunak and Matej Oresic and Oluf Pedersen and Nielsen, {Henrik Bj{\o}rn}",
year = "2018",
doi = "10.1038/s41596-018-0064-z",
language = "English",
volume = "13",
pages = "2781--2800",
journal = "Nature Protocols",
issn = "1754-2189",
publisher = "Nature Publishing Group",
number = "12",
}

RIS

TY - JOUR

T1 - A computational framework to integrate high-throughput "-omics" datasets for the identification of potential mechanistic links

AU - Pedersen, Helle Krogh

AU - Forslund, Sofia K

AU - Gudmundsdottir, Valborg

AU - Petersen, Anders Østergaard

AU - Hildebrand, Falk

AU - Hyötyläinen, Tuulia

AU - Nielsen, Trine

AU - Hansen, Torben

AU - Bork, Peer

AU - Ehrlich, S. Dusko

AU - Brunak, Søren

AU - Oresic, Matej

AU - Pedersen, Oluf

AU - Nielsen, Henrik Bjørn

PY - 2018

Y1 - 2018

N2 - We recently presented a three-pronged association study that integrated human intestinal microbiome data derived from shotgun-based sequencing with untargeted serum metabolome data and measures of host physiology. Metabolome and microbiome data are high dimensional, posing a major challenge for data integration. Here, we present a step-by-step computational protocol that details and discusses the dimensionality-reduction techniques used and methods for subsequent integration and interpretation of such heterogeneous types of data. Dimensionality reduction was achieved through a combination of data normalization approaches, binning of co-abundant genes and metabolites, and integration of prior biological knowledge. The use of prior knowledge to overcome functional redundancy across microbiome species is one central advance of our method over available alternative approaches. Applying this framework, other investigators can integrate various '-omics' readouts with variables of host physiology or any other phenotype of interest (e.g., connecting host and microbiome readouts to disease severity or treatment outcome in a clinical cohort) in a three-pronged association analysis to identify potential mechanistic links to be tested in experimental settings. Although we originally developed the framework for a human metabolome-microbiome study, it is generalizable to other organisms and environmental metagenomes, as well as to studies including other -omics domains such as transcriptomics and proteomics. The provided R code runs in ~1 h on a standard PC.

AB - We recently presented a three-pronged association study that integrated human intestinal microbiome data derived from shotgun-based sequencing with untargeted serum metabolome data and measures of host physiology. Metabolome and microbiome data are high dimensional, posing a major challenge for data integration. Here, we present a step-by-step computational protocol that details and discusses the dimensionality-reduction techniques used and methods for subsequent integration and interpretation of such heterogeneous types of data. Dimensionality reduction was achieved through a combination of data normalization approaches, binning of co-abundant genes and metabolites, and integration of prior biological knowledge. The use of prior knowledge to overcome functional redundancy across microbiome species is one central advance of our method over available alternative approaches. Applying this framework, other investigators can integrate various '-omics' readouts with variables of host physiology or any other phenotype of interest (e.g., connecting host and microbiome readouts to disease severity or treatment outcome in a clinical cohort) in a three-pronged association analysis to identify potential mechanistic links to be tested in experimental settings. Although we originally developed the framework for a human metabolome-microbiome study, it is generalizable to other organisms and environmental metagenomes, as well as to studies including other -omics domains such as transcriptomics and proteomics. The provided R code runs in ~1 h on a standard PC.

U2 - 10.1038/s41596-018-0064-z

DO - 10.1038/s41596-018-0064-z

M3 - Journal article

C2 - 30382244

VL - 13

SP - 2781

EP - 2800

JO - Nature Protocols

JF - Nature Protocols

SN - 1754-2189

IS - 12

ER -

ID: 204344693