Presentation Authors: Matthew R. Cooperberg*, San Francisco, CA, William Meeks, Linthicum, MD, Ji Qi, Rodney L. Dunn, Ann Arbor, MI, Sanyog Pendharkar, Pune, India, Daniel Pichardo, Linthicum, MD, Anna Johnson, Susan Linsell, Ann Arbor, MI, Raymond Fang, Linthicum, MD, Steven Schlossberg, Walnut Creek, CA, James E. Montie, Ann Arbor, MI
Introduction: The AUA Quality (AQUA) Registry now includes data on >4.3M patients managed by over 1500 urologists across the country. AQUA's databases are populated by automated extraction of data from a variety of EHR systems. Some data (e.g., billing codes and orders) usually exist as structured data in EHRs. Others (e.g., cancer grade) usually do not, and must be identified via regular expressions or natural language processing. As a test of data extraction quality, we performed a patient-level validation of prostate cancer data from two AQUA practices, comparing them with the manually abstracted data available through the practices' participation in the Michigan Urological Surgical Improvement Collaborative (MUSIC).
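To illustrate the kind of regular-expression extraction mentioned above, the following is a minimal sketch in Python. It is not the AQUA or FIGMD algorithm, which the abstract does not describe; the pattern `GLEASON_RE` and the function `extract_gleason` are hypothetical names, and the pattern assumes the grade is written in the form "Gleason 3+4" or "Gleason score 4 + 3".

```python
import re

# Hypothetical pattern (an assumption, not the registry's actual rule):
# matches phrases such as "Gleason 3+4" or "Gleason score: 4 + 3 = 7".
GLEASON_RE = re.compile(
    r"gleason(?:\s+(?:score|grade|pattern))?\s*:?\s*([1-5])\s*\+\s*([1-5])",
    re.IGNORECASE,
)


def extract_gleason(note):
    """Return (primary, secondary) from the first match in a free-text note.

    Returns None when no Gleason expression is found, which a registry
    could then record as a missing value.
    """
    match = GLEASON_RE.search(note)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Invented example text, for illustration only.
    print(extract_gleason("Biopsy: Gleason score 4 + 3 = 7 in 2 of 12 cores."))
```

A pattern this simple would miss grades that are spelled out or split across sentences, which is one reason natural language processing may be needed alongside regular expressions.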
Methods: Data were collected from men newly diagnosed with prostate cancer between 2014 and 2017 at two urology practices in Michigan. AQUA data were extracted using EHR connector software (FIGMD Inc, San Diego, CA), and MUSIC data were manually abstracted by trained staff at each site, subject to annual onsite quality audits. Date of diagnosis, Gleason score (primary and secondary), diagnostic PSA, number of biopsy cores (positive and total), clinical staging, and primary treatment were compared. The percentage of cases with missing information for each variable was also evaluated in both registries.
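The comparison described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the study's analysis code: the function name `missing_and_match_rates`, the record layout (a dict of patient id to field values), and the choice to compute the match rate only among linked patients with a value in both registries are all assumptions.

```python
def missing_and_match_rates(auto, manual, field):
    """Compare one variable between two registries for linked patients.

    auto, manual: dicts mapping patient id -> dict of field values
    (hypothetical layout). A value of None, or an absent key, counts
    as missing. Linked patients are ids present in both registries.
    """
    linked = auto.keys() & manual.keys()
    if not linked:
        raise ValueError("no linked patients")

    auto_missing = 0
    manual_missing = 0
    both_present = 0
    matches = 0
    for pid in linked:
        a = auto[pid].get(field)
        m = manual[pid].get(field)
        if a is None:
            auto_missing += 1
        if m is None:
            manual_missing += 1
        if a is not None and m is not None:
            both_present += 1
            if a == m:
                matches += 1

    n = len(linked)
    return {
        "n_linked": n,
        "missing_auto": auto_missing / n,
        "missing_manual": manual_missing / n,
        # Assumption: match rate is defined only where both values exist.
        "match_rate": matches / both_present if both_present else None,
    }
```

Exact equality is a reasonable test for categorical variables such as Gleason pattern or treatment; for dates or PSA values, a tolerance-based comparison might be more appropriate.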
Results: A total of 725 patients from the two practices were linked between the AQUA and MUSIC registries. The rate of missing data in each registry, as well as matching rates for values when identified, are shown in Table 1. The most common mismatches for treatment were between brachytherapy and external-beam radiation, and between radiation and primary androgen deprivation.
Conclusions: Automated extraction of both structured and unstructured data from EHRs is possible and has the potential to substantially reduce the time and cost of populating disease registries. Adjustments to algorithms will continually improve the quality of the automated abstraction.
Source of Funding: American Urological Association and Blue Cross Blue Shield of Michigan