Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data

Novo Nordisk Foundation Center for Protein Research

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data. / Meisner, Jonas; Albrechtsen, Anders.

In: Genetics, Vol. 210, No. 2, 2018, p. 719-731.

Harvard

Meisner, J & Albrechtsen, A 2018, 'Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data', Genetics, vol. 210, no. 2, pp. 719-731. https://doi.org/10.1534/genetics.118.301336

APA

Meisner, J., & Albrechtsen, A. (2018). Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data. Genetics, 210(2), 719-731. https://doi.org/10.1534/genetics.118.301336

Vancouver

Meisner J, Albrechtsen A. Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data. Genetics. 2018;210(2):719-731. https://doi.org/10.1534/genetics.118.301336

Author

Meisner, Jonas; Albrechtsen, Anders. / Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data. In: Genetics. 2018; Vol. 210, No. 2. pp. 719-731.

Bibtex

@article{76f5c575c5054c0fa1ceb5d3e3af51ea,

title = "Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data",

abstract = "We here present two methods for inferring population structure and admixture proportions in low depth next generation sequencing data. Inference of population structure is essential in both population genetics and association studies and is often performed using principal component analysis or clustering-based approaches. Next-generation sequencing methods provide large amounts of genetic data but are associated with statistical uncertainty for especially low depth sequencing data. Models can account for this uncertainty by working directly on genotype likelihoods of the unobserved genotypes. We propose a method for inferring population structure through principal component analysis in an iterative heuristic approach of estimating individual allele frequencies, where we demonstrate improved accuracy in samples with low and variable sequencing depth for both simulated and real datasets. We also use the estimated individual allele frequencies in a fast non-negative matrix factorization method to estimate admixture proportions. Both methods have been implemented in the PCAngsd framework available at http://www.popgen.dk/software/.",

author = "Jonas Meisner and Anders Albrechtsen",

note = "Copyright {\textcopyright} 2018, Genetics.",

year = "2018",

doi = "10.1534/genetics.118.301336",

language = "English",

volume = "210",

pages = "719--731",

journal = "Genetics",

issn = "1943-2631",

publisher = "The Genetics Society of America (GSA)",

number = "2",

}

RIS

TY - JOUR

T1 - Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data

AU - Meisner, Jonas

AU - Albrechtsen, Anders

PY - 2018

Y1 - 2018

N2 - We here present two methods for inferring population structure and admixture proportions in low depth next generation sequencing data. Inference of population structure is essential in both population genetics and association studies and is often performed using principal component analysis or clustering-based approaches. Next-generation sequencing methods provide large amounts of genetic data but are associated with statistical uncertainty for especially low depth sequencing data. Models can account for this uncertainty by working directly on genotype likelihoods of the unobserved genotypes. We propose a method for inferring population structure through principal component analysis in an iterative heuristic approach of estimating individual allele frequencies, where we demonstrate improved accuracy in samples with low and variable sequencing depth for both simulated and real datasets. We also use the estimated individual allele frequencies in a fast non-negative matrix factorization method to estimate admixture proportions. Both methods have been implemented in the PCAngsd framework available at http://www.popgen.dk/software/.

AB - We here present two methods for inferring population structure and admixture proportions in low depth next generation sequencing data. Inference of population structure is essential in both population genetics and association studies and is often performed using principal component analysis or clustering-based approaches. Next-generation sequencing methods provide large amounts of genetic data but are associated with statistical uncertainty for especially low depth sequencing data. Models can account for this uncertainty by working directly on genotype likelihoods of the unobserved genotypes. We propose a method for inferring population structure through principal component analysis in an iterative heuristic approach of estimating individual allele frequencies, where we demonstrate improved accuracy in samples with low and variable sequencing depth for both simulated and real datasets. We also use the estimated individual allele frequencies in a fast non-negative matrix factorization method to estimate admixture proportions. Both methods have been implemented in the PCAngsd framework available at http://www.popgen.dk/software/.

U2 - 10.1534/genetics.118.301336

DO - 10.1534/genetics.118.301336

M3 - Journal article

C2 - 30131346

VL - 210

SP - 719

EP - 731

JO - Genetics

JF - Genetics

SN - 1943-2631

IS - 2

ER -

ID: 201429982
