We are EMBL: Timothé Cezard on pivoting from biology to infrastructure
Timothé Cezard has been a user of EMBL-EBI’s open data resources since his university days in France. When a Project Lead job came up at the European Variation Archive, he saw it as an opportunity to take a look ‘under the bonnet’.
What is your background?
I did a Bachelor’s degree in biology and a Master’s degree in bioinformatics. It was the early days of bioinformatics as a science, rather than as a discovery tool. I first worked in a small start-up in France doing applications development, and then I moved to Canada to work on next-generation sequencing method development, doing mainly data investigation. Next I joined the University of Edinburgh, initially as a bioinformatician, and progressed to a lead position in clinical bioinformatics.
During the course of my career, I migrated from being a biologist to being a bioinformatician, then a developer, and then to building research and clinical infrastructure.
What attracted you to EMBL-EBI?
I first heard of EMBL-EBI during university, where we learned about its open data resources.
In 2020, I joined EMBL-EBI as a Project Lead for the European Variation Archive. As an avid user of EMBL-EBI services, I was always impressed with the output of the institute, so I was curious to see how it worked behind the scenes. I also wanted to bring my expertise in infrastructure building to help deliver EMBL-EBI’s mission of safely archiving the world’s biological data and making it openly available to all. Discovering the faces behind the tools I had been using for years was very interesting.
What does your current role entail?
The European Variation Archive (EVA) is a repository of genomic variation data for all species. We archive studies generated by the scientific community, and make them freely available to all, but we also enable researchers to slice and query one or multiple datasets, and reuse data generated by others in lots of different ways, to answer new research questions. We also create our own catalogue of all the genetic variants discovered in all species.
To be as useful as possible to our users, we also do quite a bit of outreach. The EVA is still a young resource, so we focus on reaching new communities and speaking to our users to understand different submission and data reuse cases, which guide the development of the EVA.
What has been your biggest achievement at EMBL-EBI?
Over the past 4-5 years, the team has worked incredibly hard to establish the EVA as the main provider of variation data for any non-human species. One of our biggest achievements has been the creation of a large catalogue of variants that we produce every six months, and that covers over 250 species. This is a list of all the variants that have ever been discovered, organised by species, with links to the original data where the variants were discovered. It’s like a massive index that allows people to easily understand the context of variants of interest.
Each variant is given a unique name and we are maintaining this list, which is a pretty hefty job. We work closely with our Ensembl colleagues on this, to ensure that when new versions of reference genomes become available, the data is always up to date.
This is one of our most used products and contains data for over 200 species, and counting.
How does your team work within the wider EMBL-EBI context?
We believe it’s important to link data from different resources, to give users a comprehensive and useful picture whenever possible. As such, we work closely with other teams, particularly BioSamples, ENA and Ensembl.
The integration of the EVA tools and data with other resources is another massive achievement and something we’re constantly improving, with the aim of making it easier for our users to submit, query and navigate across resources.
What do you think will be the next big discovery coming from the life sciences?
I think we’re getting closer and closer to the ability to find novel, efficient drug targets based on available data. It’s not an area I particularly work in, but I feel like it’s the next thing that needs to happen to really show that the big data gathering endeavours are key for bringing about precision medicine.