Welcome: Christina Ernst
EMBL-EBI’s new Functional Genomics Team Leader will develop standards for new data types and integrate AI into workflows
EMBL-EBI’s Functional Genomics team delivers tools and services for the archiving, analysis, and visualisation of data obtained from functional genomics experiments. By ensuring these data are expertly curated and accessible, the team supports reproducibility and advances in life science research.
As the new Functional Genomics Team Leader, Christina Ernst brings new ideas to the team. Here, she shares her professional journey, highlights the critical importance of metadata and data standards, and discusses how she plans to integrate AI into the team’s workflows.
What is your professional background?
I trained as a molecular biologist at the University of Heidelberg, earning my bachelor’s and master’s degrees in cancer biology. I then completed my PhD in functional comparative genomics in Duncan Odom’s group at the Cancer Research UK Cambridge Institute. Following this, I joined John Marioni’s group as a postdoctoral fellow here at EMBL-EBI, where I focused on single-cell transcriptomics.
Afterwards, I moved to Switzerland as a Human Frontier Science Program Long-Term Fellow at the Swiss Federal Institute of Technology in Lausanne (EPFL), initially focusing on the regulation of endogenous retroviruses before transitioning to viral evolution during the COVID-19 pandemic. This work earned me a career development award from the Swiss National Science Foundation. Now, I’ve returned to the UK to lead EMBL-EBI’s Functional Genomics team.
What does your team do?
Our team currently has two main responsibilities. The first is to manage functional genomics data submitted to the ArrayExpress collection in BioStudies and to ensure that these data meet the FAIR principles: Findable, Accessible, Interoperable, and Reusable. We work closely with data producers to capture all necessary experimental details, so that every dataset comes with robust metadata. Metadata provide the contextual information that describes the data, such as how and where they were generated. In functional genomics, this is crucial because it allows researchers to understand the experimental conditions and reuse the data effectively in their own studies.
The second responsibility is processing transcriptomics data from ArrayExpress and other public archives for Expression Atlas, which provides accessible and interactive tools for exploring gene expression across a range of conditions.
What are the Expression Atlas and the Single Cell Expression Atlas, and how do scientists use the data in these resources?
Researchers use these resources to search for tissues or organisms of interest and explore where certain genes are expressed. The Expression Atlas covers many different species and biological conditions, while the Single Cell Expression Atlas focuses on single-cell expression data. These resources reprocess archived expression data to present them in a standardised, accessible way. We also incorporate proteomics data from partners like PRIDE, enabling researchers to explore protein expression alongside gene expression.
We also offer our users access to many human expression datasets. Because human datasets often include personal identifiers, researchers usually need special access agreements to obtain them. Our team processes these data in a way that removes personal identifiers, so that we can make a version available to the wider research community without compromising privacy.
What is ArrayExpress and why is it still important?
ArrayExpress was EMBL-EBI’s database for functional genomics data. It was one of the first databases of its kind to recognise the importance of defining metadata standards and ensuring that extra experimental information is captured when archiving data. Originally, as the name suggests, ArrayExpress handled microarray data.
Over time, as technologies evolved, ArrayExpress expanded to include next-generation sequencing data, such as transcriptomics and epigenetic profiling. Today, the ArrayExpress collection in BioStudies remains essential, capturing a wide variety of data types.
What are some of the first things you’re hoping to do in your new role?
I want to focus on developing standards and visualisation tools for new data types, like spatial transcriptomics, which combines sequencing and imaging data. To do this, we will collaborate with EMBL-EBI’s BioImage Archive to ensure these datasets are well integrated for seamless archiving, exploration, and analysis.
I’m also interested in exploring how AI can improve our work. By integrating AI into our workflows, we aim to handle the increasing volume and complexity of incoming data more efficiently. For example, AI could assist in metadata annotation or in identifying patterns within the data that might be valuable for researchers. Another of my goals is to curate single-cell datasets to a high standard, making them AI-ready, so they can serve as reference datasets for the community and facilitate future AI applications in genomics.
How do you collaborate with other teams across EMBL-EBI and EMBL?
Within EMBL-EBI, we work closely with different teams to integrate our services. These collaborations are essential for ensuring that our data resources are interoperable and meet the needs of our users.
Across EMBL, we’re looking to engage with the Data Science Center, an initiative that aims to ensure that data generated across EMBL and beyond are expertly curated, managed, and shared.
Can you tell us about some of your hobbies and interests?
I enjoy spending time outdoors; hiking and paddleboarding are favourite activities of mine. I used to live near Lake Geneva, which was ideal for both. I also have a dog named Jupiter, a mix of a Cocker Spaniel and a Havanese. He’s energetic and loves joining me on hikes and paddleboarding adventures.