
Real‐world Big‐data: Strengths and weaknesses of ASCO's CancerLinQ® discovery multiple myeloma dataset

Real-world data (RWD) are increasingly relevant in oncology practice. RWD can shed light on outcomes of patients who are underrepresented in or ineligible for clinical trials, or who have rare cancers that are difficult to study prospectively. In addition, RWD can simulate clinical trials, construct external control groups for single-arm studies, and generate real-world evidence for Food and Drug Administration (FDA) regulatory purposes. One currently available source of RWD is the health technology platform CancerLinQ, developed by the American Society of Clinical Oncology (ASCO) in 2014 as part of ASCO's mission to improve the quality of cancer care. CancerLinQ extracts electronic health record (EHR) data from participating oncology practices, which are then aggregated, de-identified, and made available for research through CancerLinQ Discovery (CLQD). The CLQD multiple myeloma (MM) dataset was recently accessed, providing an opportunity to highlight important strengths and weaknesses inherent to RWD for the study of MM and other malignancies.
ASCO's CLQD MM dataset provided individual de-identified data on 53 028 MM patients, a formidable number to reach in conventional trials or registries. The dataset comprises 362 variables offering detailed information on patient-, disease-, and treatment-related factors. In addition, a comprehensive list of ICD codes captures competing comorbidities, disease- or treatment-related complications, and second primary malignancies. Critical dates, including date of diagnosis, dates of therapies administered, and date of death or last follow-up, allow for calculation of progression-free survival (PFS) and overall survival (OS), as well as long-term outcomes of patients with disease refractory to multiple drugs. Redundant variables are included to improve readability; excluding them leaves investigators with 192 unique variables, a sizable figure enabling extensive analyses.
Despite the dataset's size and power, like all EHR-based or claims-based RWD, it has the inherent limitation of missing data. For example, among the 53 028 MM patients in the CLQD dataset, over 25% (i.e., 14 123) are lacking a date of MM diagnosis. Similarly, 3762 patients (7%) have either an omitted or erroneous year of birth, leaving their age incalculable. Excluding patients with missing critical dates leaves 35 865 evaluable patients. In addition to missing critical dates, missing medication information is not uncommon in RWD. For example, only 11% of the 35 865 evaluable MM patients in CLQD were reported to have received lenalidomide, despite it being a standard frontline drug that was FDA approved over a decade ago. Notably, lenalidomide and other oral oncolytics are often supplied through specialty pharmacies, and therefore may not be captured in the EHR. Similarly, only 4% of evaluable MM patients were reported as receiving intravenous melphalan, despite this being standard of care when combined with autologous stem cell transplant (ASCT) in eligible patients. It is likely that many MM patients received an ASCT at an outside transplant center, meaning that details of their transplant were not captured in the EHR of the CLQD-participating facility. RWD also suffer from missing critical prognostic information, such as cytogenetics and stage, making it difficult to perform analyses on patients with rare cytogenetic subtypes or high-risk disease. Given variations in the definition of high-risk cytogenetics, RWD would ideally report the presence or absence of each frequently encountered cytogenetic abnormality. Similarly, given important differences in staging systems, the specific staging system utilized by the charting provider (i.e., Durie-Salmon, ISS, R-ISS, or R2-ISS) would ideally be delineated in structured data fields where it can be captured in a database query.
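The missing-data proportions quoted above can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the four counts are taken from the text, while the function and variable names are hypothetical and do not reflect any actual CLQD field names or tooling.

```python
def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in the text; the names are illustrative assumptions.
TOTAL_PATIENTS = 53_028
MISSING_DIAGNOSIS_DATE = 14_123
MISSING_BIRTH_YEAR = 3_762
EVALUABLE_PATIENTS = 35_865

summary = {
    "missing_diagnosis_date_pct": share(MISSING_DIAGNOSIS_DATE, TOTAL_PATIENTS),
    "missing_birth_year_pct": share(MISSING_BIRTH_YEAR, TOTAL_PATIENTS),
    "evaluable_pct": share(EVALUABLE_PATIENTS, TOTAL_PATIENTS),
    # Patients dropped when restricting to those with complete critical dates.
    "excluded_patients": TOTAL_PATIENTS - EVALUABLE_PATIENTS,
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these inputs, roughly two thirds of the cohort remains evaluable, which is the denominator used for the lenalidomide and melphalan percentages that follow.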
Generation of RWD is complex and requires multiple data structures, training, resources, and financial support, impacting the availability of critical data. CancerLinQ captures pertinent structured and unstructured data from multiple EHR vendors, the former including variables such as ICD codes and date of diagnosis coded to standard terminologies, and the latter typically composed of stage, pathology,

Received: 18 January 2023
Revised: 5 February 2023
Accepted: 23 March 2023

Keywords: oncology; cancerlinq discovery; rwd; multiple myeloma; real world

Journal Title: American Journal of Hematology
Year Published: 2023
