The Value of Federated Data Networks in Oncology: What Research Questions Do They Answer? Outcomes From a Systematic Literature Review

Published: December 22, 2021



      Real-world evidence (RWE) plays an important role in addressing key research questions of interest to healthcare decision makers. Federated data networks (FDNs) apply novel technology to enable the conduct of RWE studies with multiple partners, without the need to share the individual partner’s data set. A systematic review of the published literature was performed to determine which types of research questions can best be addressed through FDNs, specifically in the field of oncology.


      Systematic searches of MEDLINE and Embase were undertaken to identify the types of research questions that had been addressed in studies using FDNs. Additional information was retrieved about study characteristics, statistical methods, and the FDN itself.


      In total, 40 publications were included. The research questions they addressed fell into the following categories (a study could fall into more than one): disease natural history (58%), safety surveillance (18%), treatment pathways (15%), comparative effectiveness (10%), and cost/resource use (3%); 13% of studies could not be categorized. Half of the studies were run in networks of 5 or fewer data partners. Network size ranged from 227 patients to >5 million patients. Statistical methods used included distributed learning and distributed regression.
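      As a rough illustration of what distributed regression can mean in a federated setting, the sketch below fits a one-covariate linear model from per-site aggregate statistics only, so no patient-level rows leave a site. It is a minimal example under stated assumptions, not the method of any study included in the review: the names `SiteSummary`, `summarize_site`, and `fit_from_summaries`, and the toy data, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SiteSummary:
    """Aggregates a data partner shares; hypothetical structure for illustration."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def summarize_site(x: Sequence[float], y: Sequence[float]) -> SiteSummary:
    """Runs locally at each site and returns only sums, never individual records."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return SiteSummary(
        n=len(x),
        sum_x=sum(x),
        sum_y=sum(y),
        sum_xx=sum(a * a for a in x),
        sum_xy=sum(a * b for a, b in zip(x, y)),
    )


def fit_from_summaries(summaries: List[SiteSummary]) -> Tuple[float, float]:
    """Runs at the coordinator: pools the sums and solves ordinary least squares.

    Returns (intercept, slope). The result equals the fit on the pooled data,
    because the least-squares solution depends on the data only through these sums.
    """
    n = sum(s.n for s in summaries)
    sx = sum(s.sum_x for s in summaries)
    sy = sum(s.sum_y for s in summaries)
    sxx = sum(s.sum_xx for s in summaries)
    sxy = sum(s.sum_xy for s in summaries)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("covariate has no variation across the network")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Toy data for two hypothetical sites (not taken from the review).
site_a = summarize_site([1, 2, 3], [3, 5, 7])
site_b = summarize_site([4, 5], [9, 11])
intercept, slope = fit_from_summaries([site_a, site_b])
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

      In this sketch the coordinator sees five numbers per site. Real networks typically add further safeguards, because aggregates from very small sites can still reveal information about individuals.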


      Further work is needed to raise awareness of the important role that FDNs can play in leveraging readily available RWE to address key research questions in cancer, and of the benefits to the research community of engaging in federated data initiatives with a long-term perspective.



