
Joining Datasets Without Identifiers: Probabilistic Linkage of Virtual Pediatric Systems and PEDSnet*


Supplemental Digital Content is available in the text.
Objectives: To 1) probabilistically link two important pediatric data sources, Virtual Pediatric Systems and PEDSnet, 2) evaluate linkage accuracy overall and in patients with severe sepsis or septic shock, and 3) identify variables important to linkage accuracy.
Design: Retrospective linkage of prospectively collected datasets from Virtual Pediatric Systems, Inc (Los Angeles, CA) and the PEDSnet consortium.
Setting: Single-center academic PICU.
Patients: All PICU encounters between January 1, 2012, and December 31, 2017, that were deterministically matched between the two datasets.
Interventions: None.
Measurements and Main Results: We abstracted records corresponding to PICU encounters from Virtual Pediatric Systems and PEDSnet and probabilistically linked them using 44 features shared by the two datasets. We generated a gold standard deterministic linkage using protected health information elements, which were then removed from the datasets. We then calculated candidate pair log-likelihood ratios for all pairs of subjects and selected optimal pairs in a two-stage algorithm. A total of 22,051 gold standard PICU encounter pairs were identified over the study period. The optimal linkage model demonstrated excellent discrimination (area under the receiver operating characteristic curve > 0.99); 19,801 cases (89.9%) were matched with 13 false positives. Adding two protected health information dates (admission month, birth day-of-year) increased the number of cases matched to 20,189 (91.6%), with three false positives. Restricting to patients with a Virtual Pediatric Systems diagnosis of severe sepsis or septic shock (n = 1,340 [6.1%]) matched 1,250 cases (93.2%) with zero false positives. A greater number of laboratory values present in the first 12 hours of admission significantly increased log-likelihood ratios, suggesting stronger candidate pair matching.
Conclusions: We demonstrated the use of probabilistic linkage to accurately join two complementary pediatric critical care datasets at a single academic PICU in the absence of protected health information. Combining datasets with curated diagnoses and granular measurements can validate patient acuity metrics and facilitate multicenter machine learning algorithms. We anticipate these methods will generalize to other common PICU diagnoses.
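
The abstract mentions scoring candidate pairs by log-likelihood ratio and then selecting optimal pairs, but gives no implementation details. The sketch below illustrates the general idea in a Fellegi-Sunter style and should not be read as the authors' method: the feature names, the m/u agreement probabilities, the `log_likelihood_ratio` and `link` helpers, and the greedy one-to-one selection (a stand-in for the unspecified two-stage algorithm) are all assumptions made for illustration.

```python
import math

# Hypothetical per-feature agreement probabilities (NOT taken from the paper):
#   m = P(feature agrees | the two records are a true match)
#   u = P(feature agrees | the two records are not a match)
M_U = {
    "sex": (0.99, 0.50),
    "age_months": (0.95, 0.02),
    "first_lactate": (0.90, 0.01),
}


def log_likelihood_ratio(rec_a, rec_b, m_u=M_U):
    """Sum log(m/u) for agreeing features and log((1-m)/(1-u)) for disagreeing ones."""
    total = 0.0
    for feature, (m, u) in m_u.items():
        a, b = rec_a.get(feature), rec_b.get(feature)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log(m / u)
        else:
            total += math.log((1 - m) / (1 - u))
    return total


def link(records_a, records_b, threshold=0.0):
    """Score every candidate pair, then greedily keep the best one-to-one pairs.

    Greedy selection is an illustrative assumption, not the paper's algorithm.
    """
    scored = [
        (log_likelihood_ratio(ra, rb), i, j)
        for i, ra in enumerate(records_a)
        for j, rb in enumerate(records_b)
    ]
    scored.sort(reverse=True)  # highest log-likelihood ratio first
    used_a, used_b, links = set(), set(), []
    for score, i, j in scored:
        if score <= threshold:
            break  # every remaining pair scores at or below the threshold
        if i in used_a or j in used_b:
            continue  # each record may appear in at most one link
        used_a.add(i)
        used_b.add(j)
        links.append((i, j, score))
    return links
```

Because a missing value is skipped rather than penalized, a pair with more shared features present can accumulate a larger score. That is consistent in spirit with the reported finding that more laboratory values in the first 12 hours increased log-likelihood ratios, though the mechanism in the paper may differ.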

Keywords: probabilistic linkage; linkage; picu; pediatric systems; virtual pediatric; systems pedsnet

Journal Title: Pediatric Critical Care Medicine
Year Published: 2020
