Abstract

Funding Acknowledgements
Type of funding sources: Foundation. Main funding source(s): British Heart Foundation.

Background
An algorithm that identifies individuals with a digital electronic health record (EHR) signature homologous to that of patients with atrial fibrillation (AF) could delineate a subpopulation that may benefit from early interventions to reduce future adverse events.

Purpose
We aimed to train and test a scalable algorithm to identify individuals at higher risk of incident AF in the short term, and to quantify its associations with AF and a range of other conditions.

Methods
We used UK primary care EHR data from individuals aged ≥30 years without known AF in the CPRD-GOLD dataset (Jan 2, 1998 to Nov 30, 2018), randomly divided into training (80%) and testing (20%) datasets. We trained a random forest classifier using age, sex, ethnicity and comorbidities (FIND-AF). Performance was evaluated in the testing dataset with internal bootstrap validation with 200 samples, and compared against the CHA2DS2-VASc and C2HEST scores. We calculated the cumulative incidence rate for AF, heart failure, valvular heart disease (and specifically aortic stenosis), myocardial infarction (MI), stroke or transient ischaemic attack (TIA), peripheral vascular disease, chronic kidney disease (CKD), diabetes and chronic obstructive pulmonary disease (COPD). Incident diagnoses were the first record of that condition in primary or secondary care records from any diagnostic position. For the analysis of each condition, we excluded individuals with a preceding diagnosis of that condition. Fine and Gray models, with death as a competing risk, were fitted for each condition to compare individuals at higher and lower predicted AF risk.

Results
FIND-AF could be applied to 100% of records for 2 081 139 individuals in the cohort. In the testing dataset (n = 416 228), individuals at higher predicted AF risk had similar baseline characteristics to individuals who developed incident AF (Table 1).
Prediction performance for AF was strongest for FIND-AF (AUROC 0.824, 95% CI 0.813-0.829; Brier score 0.069) compared with CHA2DS2-VASc (0.784, 0.773-0.794; 0.093) and C2HEST (0.757, 0.744-0.770; 0.102). FIND-AF demonstrated favourable reclassification and superior net benefit on decision curve analysis, with robust performance in both sexes and across ethnic groups. The higher predicted risk cohort, compared with the lower predicted risk cohort, had a 20-fold higher 6-month incidence rate for AF and a higher long-term risk of AF (HR 8.75, 95% CI 8.44-9.06), as well as of incident heart failure (HR 12.54, 95% CI 12.08-13.01), aortic stenosis (9.98, 9.16-10.87), stroke/TIA (8.07, 7.80-8.34), CKD (6.85, 6.70-7.00), peripheral vascular disease (6.62, 6.28-6.98), valvular heart disease (6.49, 6.14-6.85), MI (5.02, 4.82-5.22), diabetes (2.05, 2.00-2.10) and COPD (2.02, 2.00-2.05) (Figure 1). This cohort was also at higher risk of death (10.45, 10.23-10.68), accounting for 71% of cardiovascular deaths.

Conclusions
FIND-AF is applicable to national electronic health records data, identifies people at higher risk of incident AF within the next 6 months with good performance, and predicts risk of a range of other conditions and death.

Table 1. Figure 1
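The abstract reports discrimination (AUROC) and overall accuracy (Brier score) with bootstrap confidence intervals from 200 samples, but does not say how these were computed. As an illustration only, the following minimal Python sketch shows one common way to obtain such figures. The function names, the percentile method for the interval, and the synthetic data are all assumptions made for this example; they are not taken from the FIND-AF analysis.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def bootstrap_ci(labels, scores, metric, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not from the abstract)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUROC needs both classes in the resample
            stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Synthetic outcomes and predicted risks, purely for demonstration.
    rng = random.Random(42)
    y = [1 if rng.random() < 0.1 else 0 for _ in range(500)]
    p = [min(1.0, max(0.0, 0.1 + 0.3 * yi + rng.gauss(0, 0.15))) for yi in y]
    print("AUROC:", auroc(y, p), bootstrap_ci(y, p, auroc))
    print("Brier:", brier_score(y, p), bootstrap_ci(y, p, brier_score))
```

A higher AUROC indicates better discrimination, whereas a lower Brier score indicates better overall accuracy, which is why FIND-AF's larger AUROC and smaller Brier score both point in the same direction in the comparison above.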
               