
Using the Literature to Identify Confounders

EasyChair Preprint no. 155

5 pages
Date: May 23, 2018


We introduce an approach to causal modeling that uses Literature-Based Discovery (LBD) to identify salient domain knowledge for the analysis of observational data. Causal models represent a marriage of graph theory, probability, and domain knowledge. We hypothesize that the LBD paradigm can be applied to identify variables of interest for the automated construction of causal models from observational data, and that causal models thus informed will outperform purely statistical techniques. We evaluated our hypothesis with a pharmacovigilance (PV) use case. In PV, the task is to discriminate between drug/side-effect signals and noise. We analyzed observational clinical data derived from electronic health records (EHR) and constructed causal models. We used logistic regression coefficients as our baseline and calculated the estimated controlled direct effect from the LBD-informed causal models. Causal models improved upon unadjusted statistical models by 8.64%, as measured by the area under the receiver operating characteristic curve (AUROC). Improving upon previous work in PV that used EHR as the primary data source, our results establish the utility of the approach.
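The evaluation metric named above can be illustrated with a minimal sketch. It assumes each drug/side-effect pair carries a binary reference label (1 for a true signal, 0 for noise) and a numeric score, and it computes the area under the ROC curve as the fraction of positive/negative pairs that the score ranks correctly. The function name `auroc`, the labels, and both score lists are hypothetical and made up for illustration; they are not data or code from the preprint.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one,
    with ties counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reference labels for six drug/side-effect pairs.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical scores: one list standing in for unadjusted regression
# coefficients, one for confounder-adjusted effect estimates.
unadjusted_scores = [0.9, 0.4, 0.3, 0.5, 0.2, 0.1]
adjusted_scores = [0.8, 0.6, 0.3, 0.4, 0.2, 0.1]

auc_unadjusted = auroc(labels, unadjusted_scores)
auc_adjusted = auroc(labels, adjusted_scores)
print(f"unadjusted AUROC: {auc_unadjusted:.3f}")
print(f"adjusted AUROC:   {auc_adjusted:.3f}")
```

Comparing the two values shows how one scoring method can be said to improve upon another under this metric. The pairwise form is quadratic in the number of examples, which is fine for a small reference set; a rank-based formulation would be preferable for large ones.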

Keyphrases: Adverse Drug Reaction, causal model, causality, Electronic Health Record, feature selection, literature-based discovery, observational clinical data, Predication-based Semantic Indexing

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:155,
  author = {Scott Malec},
  title = {Using the Literature to Identify Confounders},
  howpublished = {EasyChair Preprint no. 155},
  doi = {10.29007/zj61},
  year = {EasyChair, 2018}}