The digital patient : machine learning techniques for analyzing electronic health record data
Abstract/Contents
- Abstract
- The current unprecedented rate of digitization of longitudinal health data (continuous device monitoring data, laboratory measurements, medication orders, treatment reports, and reports of physician assessments) allows visibility into patient health at increasing levels of detail. A clearer lens into this data could help improve decision making both for individual physicians on the front lines of care and for policy makers setting national direction. However, this type of data is high-dimensional (an infant with no prior clinical history can have more than 1000 different measurements in the ICU), highly unstructured (the measurements occur irregularly, and different numbers and types of measurements are taken for different patients), and heterogeneous (from ultrasound assessments to lab tests to continuous monitor data). Furthermore, the data is often sparse and systematically missing, and the underlying system is non-stationary. Extracting the full value of the existing data requires novel approaches. In this thesis, we develop novel methods to show how longitudinal health data contained in Electronic Health Records (EHRs) can be harnessed for making novel clinical discoveries. This requires access to patient outcome data, that is, which patient has which complications. We present a method for automated extraction of patient outcomes from EHR data; our method shows how natural language cues from physicians' notes can be combined with clinical events that occur during a patient's length of stay in the hospital to extract significantly higher quality annotations than previous state-of-the-art systems. We also develop novel methods for exploratory analysis and structure discovery in bedside monitor data. This data forms the bulk of the data collected on any patient, yet it is not utilized in any substantive way post collection. We present methods to discover recurring shape and dynamic signatures in this data.
While we primarily focus on clinical time series, our methods also generalize to other continuous-valued time series data. Our analysis of the bedside monitor data led us to a novel use of this data for risk prediction in infants. Using features automatically extracted from physiologic signals collected in the first 3 hours of life, we develop Physiscore, a tool that identifies infants at risk for major complications downstream. Physiscore is both fully automated and significantly more accurate than the current standard of care. It can be used for resource optimization within a NICU, for managing infant transport to a higher level of care, and for parental counseling. Overall, this thesis illustrates how the use of machine learning for analyzing these large-scale digital patient data repositories can yield new clinical discoveries and potentially useful tools for improving patient care.
Description
Type of resource | text |
---|---|
Form | electronic; electronic resource; remote |
Extent | 1 online resource. |
Publication date | 2011 |
Issuance | monographic |
Language | English |
Creators/Contributors
Associated with | Saria, Suchi | |
---|---|---|
Associated with | Stanford University, Computer Science Department | |
Primary advisor | Koller, Daphne | |
Thesis advisor | Koller, Daphne | |
Thesis advisor | Penn, Anna Asher | |
Thesis advisor | Thrun, Sebastian, 1967- | |
Advisor | Penn, Anna Asher | |
Advisor | Thrun, Sebastian, 1967- | |
Subjects
Genre | Theses |
---|---|
Bibliographic information
Statement of responsibility | Suchi Saria. |
---|---|
Note | Submitted to the Department of Computer Science. |
Thesis | Ph.D. Stanford University 2011 |
Location | electronic resource |
Access conditions
- Copyright
- © 2011 by Suchi Saria
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).