Compass: A hybrid method for clinical and biobank data mining

Research output: Contribution to journal; Journal article; Research; peer-review

Standard

Compass: A hybrid method for clinical and biobank data mining. / Krysiak-Baltyn, Konrad; Nordahl Petersen, T.; Audouze, Karine Marie Laure; Jørgensen, Niels; Ängquist, Lars; Brunak, S.

In: Journal of Biomedical Informatics, Vol. 47, 02.2014, p. 160-170.

Harvard

Krysiak-Baltyn, K, Nordahl Petersen, T, Audouze, KML, Jørgensen, N, Ängquist, L & Brunak, S 2014, 'Compass: A hybrid method for clinical and biobank data mining', Journal of Biomedical Informatics, vol. 47, pp. 160-170. https://doi.org/10.1016/j.jbi.2013.10.007

APA

Krysiak-Baltyn, K., Nordahl Petersen, T., Audouze, K. M. L., Jørgensen, N., Ängquist, L., & Brunak, S. (2014). Compass: A hybrid method for clinical and biobank data mining. Journal of Biomedical Informatics, 47, 160-170. https://doi.org/10.1016/j.jbi.2013.10.007

Vancouver

Krysiak-Baltyn K, Nordahl Petersen T, Audouze KML, Jørgensen N, Ängquist L, Brunak S. Compass: A hybrid method for clinical and biobank data mining. Journal of Biomedical Informatics. 2014 Feb;47:160-170. https://doi.org/10.1016/j.jbi.2013.10.007

Author

Krysiak-Baltyn, Konrad; Nordahl Petersen, T.; Audouze, Karine Marie Laure; Jørgensen, Niels; Ängquist, Lars; Brunak, S. / Compass: A hybrid method for clinical and biobank data mining. In: Journal of Biomedical Informatics. 2014; Vol. 47. pp. 160-170.

Bibtex

@article{8946b1c6d0d34a20b0cc89ab7a6280c2,
title = "Compass: A hybrid method for clinical and biobank data mining",
abstract = "We describe a new method for identification of confident associations within large clinical data sets. The method is a hybrid of two existing methods; Self-Organizing Maps and Association Mining. We utilize Self-Organizing Maps as the initial step to reduce the search space, and then apply Association Mining in order to find association rules. We demonstrate that this procedure has a number of advantages compared to traditional Association Mining; it allows for handling numerical variables without a priori binning and is able to generate variable groups which act as {"}hotspots{"} for statistically significant associations. We showcase the method on infertility-related data from Danish military conscripts. The clinical data we analyzed contained both categorical type questionnaire data and continuous variables generated from biological measurements, including missing values. From this data set, we successfully generated a number of interesting association rules, which relate an observation with a specific consequence and the p-value for that finding. Additionally, we demonstrate that the method can be used on non-clinical data containing chemical-disease associations in order to find associations between different phenotypes, such as prostate cancer and breast cancer.",
author = "Krysiak-Baltyn, Konrad and {Nordahl Petersen}, T. and Audouze, {Karine Marie Laure} and J{\o}rgensen, Niels and {\"A}ngquist, Lars and Brunak, S.",
year = "2014",
month = feb,
doi = "10.1016/j.jbi.2013.10.007",
language = "English",
volume = "47",
pages = "160--170",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press",
}

RIS

TY  - JOUR
T1  - Compass
T2  - A hybrid method for clinical and biobank data mining
AU  - Krysiak-Baltyn, Konrad
AU  - Nordahl Petersen, T.
AU  - Audouze, Karine Marie Laure
AU  - Jørgensen, Niels
AU  - Ängquist, Lars
AU  - Brunak, S.
PY  - 2014/2
Y1  - 2014/2
N2  - We describe a new method for identification of confident associations within large clinical data sets. The method is a hybrid of two existing methods; Self-Organizing Maps and Association Mining. We utilize Self-Organizing Maps as the initial step to reduce the search space, and then apply Association Mining in order to find association rules. We demonstrate that this procedure has a number of advantages compared to traditional Association Mining; it allows for handling numerical variables without a priori binning and is able to generate variable groups which act as "hotspots" for statistically significant associations. We showcase the method on infertility-related data from Danish military conscripts. The clinical data we analyzed contained both categorical type questionnaire data and continuous variables generated from biological measurements, including missing values. From this data set, we successfully generated a number of interesting association rules, which relate an observation with a specific consequence and the p-value for that finding. Additionally, we demonstrate that the method can be used on non-clinical data containing chemical-disease associations in order to find associations between different phenotypes, such as prostate cancer and breast cancer.
AB  - We describe a new method for identification of confident associations within large clinical data sets. The method is a hybrid of two existing methods; Self-Organizing Maps and Association Mining. We utilize Self-Organizing Maps as the initial step to reduce the search space, and then apply Association Mining in order to find association rules. We demonstrate that this procedure has a number of advantages compared to traditional Association Mining; it allows for handling numerical variables without a priori binning and is able to generate variable groups which act as "hotspots" for statistically significant associations. We showcase the method on infertility-related data from Danish military conscripts. The clinical data we analyzed contained both categorical type questionnaire data and continuous variables generated from biological measurements, including missing values. From this data set, we successfully generated a number of interesting association rules, which relate an observation with a specific consequence and the p-value for that finding. Additionally, we demonstrate that the method can be used on non-clinical data containing chemical-disease associations in order to find associations between different phenotypes, such as prostate cancer and breast cancer.
UR  - http://www.scopus.com/inward/record.url?scp=84887293437&partnerID=8YFLogxK
U2  - 10.1016/j.jbi.2013.10.007
DO  - 10.1016/j.jbi.2013.10.007
M3  - Journal article
C2  - 24513869
VL  - 47
SP  - 160
EP  - 170
JO  - Journal of Biomedical Informatics
JF  - Journal of Biomedical Informatics
SN  - 1532-0464
ER  -
