Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. / Kaas-Hansen, Benjamin Skov; Placido, Davide; Rodríguez, Cristina Leal; Thorsen-Meyer, Hans-Christian; Gentile, Simona; Nielsen, Anna Pors; Brunak, Søren; Jürgens, Gesche; Andersen, Stig Ejdrup.

In: Basic & Clinical Pharmacology & Toxicology, Vol. 131, No. 4, 2022, p. 282-293.

Harvard

Kaas-Hansen, BS, Placido, D, Rodríguez, CL, Thorsen-Meyer, H-C, Gentile, S, Nielsen, AP, Brunak, S, Jürgens, G & Andersen, SE 2022, 'Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records', Basic & Clinical Pharmacology & Toxicology, vol. 131, no. 4, pp. 282-293. https://doi.org/10.1111/bcpt.13773

APA

Kaas-Hansen, B. S., Placido, D., Rodríguez, C. L., Thorsen-Meyer, H-C., Gentile, S., Nielsen, A. P., Brunak, S., Jürgens, G., & Andersen, S. E. (2022). Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. Basic & Clinical Pharmacology & Toxicology, 131(4), 282-293. https://doi.org/10.1111/bcpt.13773

Vancouver

Kaas-Hansen BS, Placido D, Rodríguez CL, Thorsen-Meyer H-C, Gentile S, Nielsen AP et al. Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. Basic & Clinical Pharmacology & Toxicology. 2022;131(4):282-293. https://doi.org/10.1111/bcpt.13773

Author

Kaas-Hansen, Benjamin Skov ; Placido, Davide ; Rodríguez, Cristina Leal ; Thorsen-Meyer, Hans-Christian ; Gentile, Simona ; Nielsen, Anna Pors ; Brunak, Søren ; Jürgens, Gesche ; Andersen, Stig Ejdrup. / Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. In: Basic & Clinical Pharmacology & Toxicology. 2022 ; Vol. 131, No. 4. pp. 282-293.

Bibtex

@article{32baa8f6cccd4feabd8361cb9d15f7d6,
title = "Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records",
abstract = "We sought to craft a drug safety signalling pipeline associating latent information in clinical free text with exposures to single drugs and drug pairs. Data arose from 12 secondary and tertiary public hospitals in two Danish regions, comprising approximately half the Danish population. Notes were operationalised with a fastText embedding, based on which we trained 10,720 neural-network models (one for each distinct single-drug/drug-pair exposure) predicting the risk of exposure given an embedding vector. We included 2,905,251 admissions between May 2008 and June 2016, with 13,740,564 distinct drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1,184,340 (41%) admissions patients used ≥5 drugs concomitantly. 10,788,259 clinical notes were included, with 179,441,739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. 16 (14%) of the 115 drug-pair signals were possible interactions and 2 (1.7%) were known. In conclusion, we built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, predicting not the likely outcome of a range of exposures, but the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and can help leverage non-English free text for pharmacovigilance.",
author = "Kaas-Hansen, {Benjamin Skov} and Davide Placido and Rodr{\'i}guez, {Cristina Leal} and Hans-Christian Thorsen-Meyer and Simona Gentile and Nielsen, {Anna Pors} and S{\o}ren Brunak and Gesche J{\"u}rgens and Andersen, {Stig Ejdrup}",
note = "This article is protected by copyright. All rights reserved.",
year = "2022",
doi = "10.1111/bcpt.13773",
language = "English",
volume = "131",
pages = "282--293",
journal = "Basic and Clinical Pharmacology and Toxicology",
issn = "1742-7835",
publisher = "Wiley-Blackwell",
number = "4",
}

RIS

TY - JOUR

T1 - Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records

AU - Kaas-Hansen, Benjamin Skov

AU - Placido, Davide

AU - Rodríguez, Cristina Leal

AU - Thorsen-Meyer, Hans-Christian

AU - Gentile, Simona

AU - Nielsen, Anna Pors

AU - Brunak, Søren

AU - Jürgens, Gesche

AU - Andersen, Stig Ejdrup

N1 - This article is protected by copyright. All rights reserved.

PY - 2022

Y1 - 2022

N2 - We sought to craft a drug safety signalling pipeline associating latent information in clinical free text with exposures to single drugs and drug pairs. Data arose from 12 secondary and tertiary public hospitals in two Danish regions, comprising approximately half the Danish population. Notes were operationalised with a fastText embedding, based on which we trained 10,720 neural-network models (one for each distinct single-drug/drug-pair exposure) predicting the risk of exposure given an embedding vector. We included 2,905,251 admissions between May 2008 and June 2016, with 13,740,564 distinct drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1,184,340 (41%) admissions patients used ≥5 drugs concomitantly. 10,788,259 clinical notes were included, with 179,441,739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. 16 (14%) of the 115 drug-pair signals were possible interactions and 2 (1.7%) were known. In conclusion, we built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, predicting not the likely outcome of a range of exposures, but the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and can help leverage non-English free text for pharmacovigilance.

AB - We sought to craft a drug safety signalling pipeline associating latent information in clinical free text with exposures to single drugs and drug pairs. Data arose from 12 secondary and tertiary public hospitals in two Danish regions, comprising approximately half the Danish population. Notes were operationalised with a fastText embedding, based on which we trained 10,720 neural-network models (one for each distinct single-drug/drug-pair exposure) predicting the risk of exposure given an embedding vector. We included 2,905,251 admissions between May 2008 and June 2016, with 13,740,564 distinct drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1,184,340 (41%) admissions patients used ≥5 drugs concomitantly. 10,788,259 clinical notes were included, with 179,441,739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. 16 (14%) of the 115 drug-pair signals were possible interactions and 2 (1.7%) were known. In conclusion, we built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, predicting not the likely outcome of a range of exposures, but the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and can help leverage non-English free text for pharmacovigilance.

U2 - 10.1111/bcpt.13773

DO - 10.1111/bcpt.13773

M3 - Journal article

C2 - 35834334

VL - 131

SP - 282

EP - 293

JO - Basic and Clinical Pharmacology and Toxicology

JF - Basic and Clinical Pharmacology and Toxicology

SN - 1742-7835

IS - 4

ER -

ID: 314528208