Canada’s experience with managing and analyzing federated health data is an enormous addition to the project
Hinxton, UK, 24 Jan 2019
A patient develops a rare condition and needs answers, so their clinician searches frantically for patients with similar rare symptoms and similar possible causes. To understand the mechanisms of a debilitating disease, a medical researcher tries to separate the “signal” of that disease’s particular causes from the “noise” of natural biological variation in human lives and conditions.
Getting the answers those patients and researchers need requires the ability to analyze or query health and genomic data from an enormous number of patients - patients who have their own needs, and deserve to have their data kept at the highest levels of security and privacy.
Today a collaboration of African, Canadian, and EU researchers announced the CINECA[1] project, an unprecedented multi-continental effort to build the infrastructure (data standards, technical protocols, and software) that will allow queries and analyses over the distributed data sets made available by each partner, while giving those partners complete control over the patient data they have been entrusted with.
Canada’s health data system has always necessarily been federated, and the experience of the Canadian Distributed Infrastructure for Genomics (CanDIG) with building federated queries and analyses over locally controlled private health data is essential to the project. CanDIG member institutions McGill University and The Hospital for Sick Children (SickKids) are directly involved with CINECA, and CanDIG as a whole will bring its experience to bear by leading the work of building standard methods for federating queries, and actively participating in building compatible and interoperable systems for login, access control, and running complex distributed analyses.
“The technical goals we have set for ourselves are ambitious”, said Mike Brudno, PI of the CanDIG project and Senior Scientist at SickKids. “But CanDIG has extensive experience working with CINECA partner projects EGA[2] and ELIXIR[3] through their participation as peer Driver Projects for the Global Alliance for Genomics and Health (GA4GH). Building on what our projects have already done alone and together, we’re confident that we can not only meet those goals, but build open-source standards-based solutions for the entire community.”
“CanDIG is already connecting several important Canadian health data sets in cancer research”, said Guillaume Bourque, Director of the Centre for Computational Genomics at McGill and Co-PI of CanDIG. “As part of this project, we are proposing to connect additional Canadian data sets, and then connect those to an even larger number of data sets internationally. Those new connections between data sets are going to allow Canadian researchers much deeper insight into even that data that they already had access to.”
“Key to this project’s success is trusted, reliable, federated data querying and analysis”, said Steve Jones, Head of Bioinformatics and Co-Director, Michael Smith Genome Sciences Centre, and Co-PI of CanDIG. “We’ve shown how this can be done in support of real science and insight, while retaining control over the data we have been entrusted with; and we’re excited to bring our expertise in data federation to the international community.”
The CINECA project is funded by both the EU, through the Horizon 2020 Research and Innovation Programme, and the Canadian Government, through the Canadian Institutes of Health Research. More information can be found here.
CanDIG is a Canadian national health and genomics platform that allows authorized queries and analyses over locally controlled private data sets. For more information, see distributedgenomics.ca.
[1] Common Infrastructure for National Cohorts in Europe, Canada, and Africa
[2] European Genome/Phenome Archive: ega-archive.org
[3] A European network of life sciences and bioinformatics resources: elixir-europe.org