Analysis of Big Healthcare Databases Short Course
Event Information
About this Event
December 2 and 4, 2020, 2:00-4:30pm
The widespread adoption of electronic health records (EHR) as a means of documenting medical care has created a vast resource for the study of health conditions, interventions, and outcomes in routine clinical practice. Conducting research using healthcare databases, including EHR and administrative claims data, facilitates the efficient creation of large research databases, execution of pragmatic clinical trials, and study of rare diseases. The recent release of the Framework for FDA’s Real-World Evidence Program has further spurred interest in the use of healthcare data to improve the efficiency and generalizability of pre- and post-marketing studies of medical products.
Despite these potential benefits, research conducted using healthcare data faces many challenges. To make valid inferences, statisticians must be aware of data generation, capture, and availability issues, and must use study designs and statistical analysis methods that account for them.
In this course, we will discuss topics related to the design and analysis of research studies using big healthcare databases. We will cover issues related to the structure and quality of the data, including data types and methods for extracting variables of interest; sources of missing data; error in covariates and outcomes extracted from EHR and claims data; and data capture considerations such as informative visit processes and medical records coding procedures. In the second half of the course, we will discuss statistical approaches to address some of the challenges and unique features of healthcare databases, including missing data and error in automated algorithm-derived covariates and outcomes. R code will be provided for implementation of the presented methods, and hands-on exercises will be used to compare the results of alternative approaches.
The overarching objective of this course is to provide participants with an introduction to the structure and content of healthcare databases and equip them with a set of appropriate tools to investigate and analyze this rich data resource.
Course agenda:
12/2
- 2:00-2:15pm Introduction
- 2:15-3:00pm Overview of the Structure of EHR Data
- 3:00-3:15pm Break
- 3:15-4:00pm Extracting Data from the EHR
- 4:00-4:30pm Hands-on Tutorial 1
12/4
- 2:00-2:45pm Missing Data Issues
- 2:45-3:00pm Break
- 3:00-3:45pm Correcting for Bias due to EHR Data Errors
- 3:45-4:15pm Hands-on Tutorial 2
- 4:15-4:30pm Wrap-up
Please note that HACASA membership is not required for the discount, but you can use the registration form to become a HACASA member. A student discount is available to a limited number of students.