Risk Prediction of Opioid Use Disorder (OUD) using Electronic Health Records

Wenqiu Cao, University of Rhode Island

Abstract

The opioid epidemic has emerged as a public health crisis, attracting increasing nationwide attention. Electronic health records (EHRs) provide rich resources to investigate and predict the risk of opioid use disorder (OUD) in real-world settings because of their diverse data types and wide range of information. In this dissertation, I conducted three studies to investigate the association between OUD and different factors recorded in EHRs, including demographics, comorbidity, laboratory test results, medications, and opioid prescription, and developed predictive models of OUD risk using features from EHR data.

In Manuscript 1, we used penalized logistic regression models to predict OUD in the emergency department, in order to handle the large number of predictors and the class imbalance for OUD in EHR data. We presented the prediction performance of Lasso logistic regression, Firth logistic regression, Firth logistic regression with intercept correction (FLIC), and Firth logistic regression with added covariate (FLAC), and showed how physical and mental comorbidity contributed to the risk of opioid misuse in the emergency department.
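As an illustration of why Firth-type penalization suits rare outcomes, the following is a minimal sketch of Firth logistic regression fitted by Fisher scoring on the modified score equations. It is not the dissertation's implementation: the function name `firth_logistic`, its arguments, and the toy data are assumptions made for this example, and the FLIC and FLAC variants are not covered.

```python
import numpy as np

def firth_logistic(X, y, max_iter=100, tol=1e-8):
    """Firth-penalized logistic regression (illustrative sketch).

    X: (n, p) design matrix that already contains an intercept column.
    y: (n,) array of 0/1 outcomes.
    Returns the (p,) vector of penalized coefficient estimates.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = prob * (1.0 - prob)                     # logistic variance weights
        info = X.T @ (X * w[:, None])               # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth's modified score: the extra term h * (0.5 - prob) removes
        # the first-order bias and keeps estimates finite under separation
        score = X.T @ (y - prob + h * (0.5 - prob))
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy example (hypothetical data): no events among 10 patients.
# Ordinary maximum likelihood would push the intercept towards minus infinity;
# the Firth estimate of the event probability stays at (0 + 0.5) / (10 + 1).
X_toy = np.ones((10, 1))
y_toy = np.zeros(10)
beta_toy = firth_logistic(X_toy, y_toy)
p_toy = 1.0 / (1.0 + np.exp(-beta_toy[0]))
```

For an intercept-only model the modified score has the closed-form solution (k + 0.5) / (n + 1), where k is the number of events among n observations, which gives a simple check on the sketch.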

In Manuscript 2, a shared parameter joint model for longitudinal and time-to-event data was built to investigate the association between longitudinal opioid prescription dosages and time to OUD onset after patients' first opioid prescription from the emergency department. Results from the models suggested a weak positive association between longitudinal opioid prescription dosage and OUD onset. We also tested how the shared parameter model can handle data missing at random in a simulation study shown in Appendix B.
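For orientation, one common formulation of a shared parameter joint model is sketched below. The notation is an assumption made for this illustration, and the exact specification used in Manuscript 2 may differ: y_i(t) is the observed dosage for patient i at time t, m_i(t) its underlying trajectory, b_i the patient-level random effects shared by both submodels, w_i the baseline covariates, and h_0(t) the baseline hazard.

```latex
% Longitudinal submodel (linear mixed-effects model for dosage)
y_i(t) = m_i(t) + \varepsilon_i(t), \qquad
m_i(t) = x_i(t)^{\top}\beta + z_i(t)^{\top} b_i, \qquad
b_i \sim N(0, D), \quad \varepsilon_i(t) \sim N(0, \sigma^2)

% Survival submodel (hazard of OUD onset), linked through m_i(t)
h_i(t) = h_0(t)\,\exp\!\left\{ \gamma^{\top} w_i + \alpha\, m_i(t) \right\}
```

Under this formulation the association parameter \alpha links the two submodels: a small positive \alpha would correspond to a weak positive association between the current dosage trajectory and the hazard of OUD onset.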

In Manuscript 3, we proposed a conditional Gated Recurrent Unit with decay rate (GRU-D) model to predict the risk of opioid dependence and abuse using both static features, such as demographics and disease history, and temporal features, such as laboratory test results during the entire visit. The GRU-D model allows us to capture the patterns of temporal features even though the measurements in EHRs are collected irregularly or are missing due to practical issues. We presented and discussed the predictive performance of our proposed conditional GRU-D in comparison with a GRU-D model with only temporal features and a GRU-D model with static features added at the first time step. In addition, we investigated feature importance using the Leave-One-Covariate-Out (LOCO) approach. The top 15 most important predictors were presented, covering static features, such as insurance type, race, and anxiety history, as well as temporal features, such as blood test results and medication use.
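To illustrate how GRU-D copes with irregular or missing measurements, the sketch below implements only the input decay step of the standard GRU-D formulation: a missing value is replaced by a blend of the last observed value and the feature mean, with the weight on the last observation decaying as the time since that observation grows. This is not the proposed conditional GRU-D. The function name `grud_input_decay`, the parameter values, and the toy series are assumptions made for this example, and in a trained model the decay parameters `w` and `b` would be learned.

```python
import numpy as np

def grud_input_decay(x, mask, delta, w, b, x_mean):
    """Input decay step of GRU-D (illustrative sketch).

    x:      (T, D) measurements; entries where mask == 0 are ignored.
    mask:   (T, D) 1 if the feature was observed at that step, else 0.
    delta:  (T, D) time elapsed since the feature was last observed.
    w, b:   (D,) decay parameters (learned in practice, fixed here).
    x_mean: (D,) empirical mean of each feature.
    Returns a (T, D) array with missing entries filled by decayed values.
    """
    T, D = x.shape
    x_last = np.array(x_mean, dtype=float)   # fallback before any observation
    filled = np.empty((T, D))
    for t in range(T):
        observed = mask[t] == 1
        # Decay rate in (0, 1]: gamma = exp(-max(0, w * delta + b))
        gamma = np.exp(-np.maximum(0.0, w * delta[t] + b))
        # Carry forward the most recent observed value for each feature
        x_last = np.where(observed, x[t], x_last)
        # Missing entries decay from the last observation towards the mean
        decayed = gamma * x_last + (1.0 - gamma) * x_mean
        filled[t] = np.where(observed, x[t], decayed)
    return filled

# Toy example (hypothetical lab value): observed once, then missing twice.
x_toy = np.array([[2.0], [np.nan], [np.nan]])
mask_toy = np.array([[1], [0], [0]])
delta_toy = np.array([[0.0], [1.0], [2.0]])
filled_toy = grud_input_decay(
    x_toy, mask_toy, delta_toy,
    w=np.array([np.log(2.0)]), b=np.array([0.0]), x_mean=np.array([0.0]),
)
```

With a decay weight of ln 2 and a feature mean of 0, the filled value halves for each unit of elapsed time, so the toy series becomes 2.0, 1.0, 0.5.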