
Solve-RD: supporting rare disease data collection and harmonisation

The Solve-RD project standardises genomic and clinical data across Europe to support rare disease research

Credit: Karen Arnott/EMBL-EBI

Summary

  • The Solve-RD project brings together expertise and both genomic and clinical data from multiple European partners to uncover the genetic causes of rare diseases.
  • Standardising and securely sharing these large-scale datasets lets researchers reuse the data and collaborate more effectively.
  • Accessing data across multiple countries allows researchers to investigate more cases and uncover new genetic links in rare diseases.

A rare disease affects fewer than 1 in 2,000 people, yet together, such diseases represent a substantial global challenge. Overall, 300 million people worldwide suffer from more than 7,000 known rare conditions. While genomic sequencing has made it easier to identify disease-causing variants in the human genome, many patients with rare diseases still lack a genetic explanation for their conditions. 

Because each rare disease affects a relatively small number of people, researchers need to collect, share, and analyse data from multiple countries to understand these conditions and identify their genetic bases. The Solve-RD project was established to harmonise and standardise rare disease genomic and phenotypic data across multiple European partners.

Solve-RD leverages a wide professional network of research infrastructures, clinicians, geneticists, and experts in bioinformatics and knowledge management, including EMBL-EBI. Together, they coordinate data-sharing efforts across Europe to help improve rare disease research and patient outcomes.

“By integrating large-scale, genomic and phenotypic multi-omic datasets and employing data standards, Solve-RD aims to harmonise data to help facilitate cross-cohort analysis for rare disease research,” said Thomas Keane, Team Leader at EMBL-EBI. “Through secure, federated data access, we have established a network that can help expedite research into rare diseases and ultimately lead to better diagnoses for patients.”

Phenopackets for data harmonisation

Each country has its own format for recording clinical information, which makes it difficult to compare and re-analyse datasets from different sources. A data standard from the Global Alliance for Genomics and Health (GA4GH), called Phenopackets, addresses this issue by providing a structured, machine-readable format for capturing patient attributes such as symptoms, age of onset, test results, and treatments. 

Phenopackets also enables researchers to integrate Solve-RD’s data into the European Genome-phenome Archive (EGA), EMBL-EBI’s secure repository for sensitive human genomic and phenotypic data. Phenopackets creates a common language for clinical data, improves data harmonisation, and streamlines multi-cohort analyses.
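To make "structured, machine-readable format" concrete, the following is a minimal illustrative sketch of a Phenopacket-style record built in Python. The field names follow the public GA4GH Phenopacket schema (version 2), but the helper `build_phenopacket`, the patient identifier, the curator name, the timestamps, and the ontology version are hypothetical placeholders, not Solve-RD data or tooling.

```python
import json


def build_phenopacket(patient_id, sex, features):
    """Build a dict shaped like a GA4GH Phenopacket (schema v2 field names).

    `features` is a list of (hpo_id, label, onset) tuples, where onset is
    an ISO 8601 duration such as "P2Y" (age two years).
    """
    return {
        "id": f"example-{patient_id}",  # hypothetical packet identifier
        "subject": {"id": patient_id, "sex": sex},
        "phenotypicFeatures": [
            {
                "type": {"id": hpo_id, "label": label},
                "onset": {"age": {"iso8601duration": onset}},
            }
            for hpo_id, label, onset in features
        ],
        "metaData": {
            "created": "2024-01-01T00:00:00Z",  # placeholder timestamp
            "createdBy": "example-curator",     # placeholder curator
            "resources": [
                {
                    "id": "hp",
                    "name": "human phenotype ontology",
                    "url": "http://purl.obolibrary.org/obo/hp.owl",
                    "version": "2024-01-01",    # placeholder version
                    "namespacePrefix": "HP",
                    "iriPrefix": "http://purl.obolibrary.org/obo/HP_",
                }
            ],
            "phenopacketSchemaVersion": "2.0",
        },
    }


# One fictional patient with a single symptom coded as an HPO term.
packet = build_phenopacket(
    "patient-1", "FEMALE", [("HP:0001250", "Seizure", "P2Y")]
)
print(json.dumps(packet, indent=2))
```

Because symptoms are coded as ontology terms rather than free text, records produced in different countries can be compared field by field.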

“For many of the groups we worked with, transitioning to Phenopackets as a new data standard meant adjusting long-established workflows,” said Coline Thomas, Senior Operations Bioinformatician at EMBL-EBI. “To help support this shift, we provided guidelines and training to demonstrate how Phenopackets could make data more interoperable and valuable. As partners experienced the practical benefits of using Phenopackets, initial uncertainties gave way to a broader sense of progress.”

Secure data access

All raw and processed data from the Solve-RD project are available for download through the EGA under controlled access. Together they make up one of the largest datasets in the EGA. Researchers requesting access to these sensitive human datasets must submit a proposal for review by the Solve-RD data access committee, which ensures that their use of the data complies with ethical and legal standards.

Researchers can also analyse Solve-RD data using the RD-Connect Genome-Phenome Analysis Platform (GPAP) upon registration and approval. This secure online platform is designed to facilitate data analysis without requiring direct download of sensitive data, to safeguard patient data and support collaborative research.

Funding

The Solve-RD project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 779257.



Tags: bioinformatics, ega, embl-ebi, european genome-phenome archive, genomics, precision medicine, rare disease
