Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases. / Muse, Victorine P.; Placido, Davide; Haue, Amalie D.; Brunak, Søren.

In: BMC Medical Informatics and Decision Making, Vol. 24, 62, 2024.

Harvard

Muse, VP, Placido, D, Haue, AD & Brunak, S 2024, 'Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases', BMC Medical Informatics and Decision Making, vol. 24, 62. https://doi.org/10.1186/s12911-024-02467-6

APA

Muse, V. P., Placido, D., Haue, A. D., & Brunak, S. (2024). Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases. BMC Medical Informatics and Decision Making, 24, [62]. https://doi.org/10.1186/s12911-024-02467-6

Vancouver

Muse VP, Placido D, Haue AD, Brunak S. Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases. BMC Medical Informatics and Decision Making. 2024;24:62. https://doi.org/10.1186/s12911-024-02467-6

Author

Muse, Victorine P. ; Placido, Davide ; Haue, Amalie D. ; Brunak, Søren. / Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases. In: BMC Medical Informatics and Decision Making. 2024 ; Vol. 24.

Bibtex

@article{a4aeeb5892c14c158dd63426a90a23b7,
title = "Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases",
abstract = "Background: Variation in laboratory healthcare data due to seasonal changes is a widely accepted phenomenon. Seasonal variation is generally not systematically accounted for in healthcare settings. This study applies a newly developed adjustment method for seasonal variation to analyze the effect seasonality has on machine learning model classification of diagnoses. Methods: Machine learning methods were trained and tested on ~22 million unique records from ~575,000 unique patients admitted to Danish hospitals. Four machine learning models (AdaBoost, decision tree, neural net, and random forest) classifying 35 diseases of the circulatory system (ICD-10 diagnosis codes, chapter IX) were run before and after seasonal adjustment of 23 laboratory reference intervals (RIs). The effect of the adjustment was benchmarked via its contribution to machine learning models trained using hyperparameter optimization and assessed quantitatively using performance metrics (AUROC and AUPRC). Results: Seasonally adjusted RIs significantly improved cardiovascular disease classification in 24 of the 35 tested cases when using neural net models. Features with the highest average feature importance (via SHAP explainability) across all disease models were sex, C-reactive protein, and estimated glomerular filtration. Classification of diseases of the vessels, such as thrombotic diseases and other atherosclerotic diseases, consistently improved after seasonal adjustment. Conclusions: As data volumes increase and data-driven methods are becoming more advanced, it is essential to improve data quality at the pre-processing level. This study presents a method that makes it feasible to introduce seasonally adjusted RIs into the clinical research space in any disease domain. Seasonally adjusted RIs generally improve diagnoses classification and thus ought to be considered and adjusted for in clinical decision support methods.",
keywords = "Cardiovascular Disease, Diagnostics, Digital Health, Electronic Health Records, Laboratory Values, Machine Learning, Seasonality",
author = "Muse, {Victorine P.} and Placido, Davide and Haue, {Amalie D.} and Brunak, S{\o}ren",
note = "Publisher Copyright: {\textcopyright} The Author(s) 2024.",
year = "2024",
doi = "10.1186/s12911-024-02467-6",
language = "English",
volume = "24",
pages = "62",
journal = "BMC Medical Informatics and Decision Making",
issn = "1472-6947",
publisher = "BioMed Central",
}

RIS

TY - JOUR

T1 - Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases

AU - Muse, Victorine P.

AU - Placido, Davide

AU - Haue, Amalie D.

AU - Brunak, Søren

N1 - Publisher Copyright: © The Author(s) 2024.

PY - 2024

Y1 - 2024

N2 - Background: Variation in laboratory healthcare data due to seasonal changes is a widely accepted phenomenon. Seasonal variation is generally not systematically accounted for in healthcare settings. This study applies a newly developed adjustment method for seasonal variation to analyze the effect seasonality has on machine learning model classification of diagnoses. Methods: Machine learning methods were trained and tested on ~22 million unique records from ~575,000 unique patients admitted to Danish hospitals. Four machine learning models (AdaBoost, decision tree, neural net, and random forest) classifying 35 diseases of the circulatory system (ICD-10 diagnosis codes, chapter IX) were run before and after seasonal adjustment of 23 laboratory reference intervals (RIs). The effect of the adjustment was benchmarked via its contribution to machine learning models trained using hyperparameter optimization and assessed quantitatively using performance metrics (AUROC and AUPRC). Results: Seasonally adjusted RIs significantly improved cardiovascular disease classification in 24 of the 35 tested cases when using neural net models. Features with the highest average feature importance (via SHAP explainability) across all disease models were sex, C-reactive protein, and estimated glomerular filtration. Classification of diseases of the vessels, such as thrombotic diseases and other atherosclerotic diseases, consistently improved after seasonal adjustment. Conclusions: As data volumes increase and data-driven methods are becoming more advanced, it is essential to improve data quality at the pre-processing level. This study presents a method that makes it feasible to introduce seasonally adjusted RIs into the clinical research space in any disease domain. Seasonally adjusted RIs generally improve diagnoses classification and thus ought to be considered and adjusted for in clinical decision support methods.

AB - Background: Variation in laboratory healthcare data due to seasonal changes is a widely accepted phenomenon. Seasonal variation is generally not systematically accounted for in healthcare settings. This study applies a newly developed adjustment method for seasonal variation to analyze the effect seasonality has on machine learning model classification of diagnoses. Methods: Machine learning methods were trained and tested on ~22 million unique records from ~575,000 unique patients admitted to Danish hospitals. Four machine learning models (AdaBoost, decision tree, neural net, and random forest) classifying 35 diseases of the circulatory system (ICD-10 diagnosis codes, chapter IX) were run before and after seasonal adjustment of 23 laboratory reference intervals (RIs). The effect of the adjustment was benchmarked via its contribution to machine learning models trained using hyperparameter optimization and assessed quantitatively using performance metrics (AUROC and AUPRC). Results: Seasonally adjusted RIs significantly improved cardiovascular disease classification in 24 of the 35 tested cases when using neural net models. Features with the highest average feature importance (via SHAP explainability) across all disease models were sex, C-reactive protein, and estimated glomerular filtration. Classification of diseases of the vessels, such as thrombotic diseases and other atherosclerotic diseases, consistently improved after seasonal adjustment. Conclusions: As data volumes increase and data-driven methods are becoming more advanced, it is essential to improve data quality at the pre-processing level. This study presents a method that makes it feasible to introduce seasonally adjusted RIs into the clinical research space in any disease domain. Seasonally adjusted RIs generally improve diagnoses classification and thus ought to be considered and adjusted for in clinical decision support methods.

KW - Cardiovascular Disease

KW - Diagnostics

KW - Digital Health

KW - Electronic Health Records

KW - Laboratory Values

KW - Machine Learning

KW - Seasonality

U2 - 10.1186/s12911-024-02467-6

DO - 10.1186/s12911-024-02467-6

M3 - Journal article

C2 - 38438861

AN - SCOPUS:85186627709

VL - 24

JO - BMC Medical Informatics and Decision Making

JF - BMC Medical Informatics and Decision Making

SN - 1472-6947

M1 - 62

ER -

ID: 385213512