Genome Biology Computational Support

At your side to solve your daily data management and NGS data analysis challenges

The Genome Biology Computational Support (GBCS) provides essential computational support to the GB Unit and EMBL-wide, by implementing a coherent infrastructure that improves data management, helps ensure data integrity and reproducible science (FAIR data) and empowers biologists to analyse their own data.

We develop Lab Integrated Data (LabID), our own open-source data management solution, to address complex data management challenges and help improve the transparency, reproducibility and efficiency of scientific research at EMBL and elsewhere.

We employ and maintain state-of-the-art software and established technologies (e.g. Galaxy, R & RStudio Server) to help you perform advanced analyses of your data, whatever your level of computer literacy.

Support areas

Lab Integrated Data
Inventory ELN Sample management Data management FAIR Traceability

LabID is a web platform for fundamental research data management, featuring a lab inventory system and an Electronic Lab Notebook (ELN) coupled to a powerful Dataset & Sample management system. It is designed to help individual scientists and research groups better manage their lab inventory, samples, research notes, protocols, assays and datasets, and link them into projects. It facilitates documenting and referencing research progress throughout the experimental and analysis chains, effectively preserving data integrity and enhancing traceability. It implements international standards [1] to ease data submission to international data repositories [2].

LabID is used across EMBL sites (including teams at EMBL Heidelberg, Barcelona, Grenoble, and Rome) and integrates with the EMBL ITS DM app to operate data archiving and sharing. Since 2018, more than 500 colleagues from more than 60 groups have been using it, and more are still joining. It is available to all EMBL scientists, who can log in with their EMBL credentials. Oh, and it’s free and open source, so you can still use it after your time at EMBL.

[1] (e.g. MINSEQE, ontologies)
[2] (EBI’s ENA, ArrayExpress, etc.)

Galaxy
Analysis Workflow Reproducibility Traceability HPC / CLUSTER

For more than a decade, GBCS has maintained a Galaxy instance accessible to all EMBL scientists. Running our local instance offers advantages in data access (e.g. LabID-managed datasets are readily available in Galaxy) and in computing power, as Galaxy jobs are executed on the EMBL High Performance Cluster. We also have the expertise to help you start your Galaxy journey in the best possible way, and we regularly offer training. With Galaxy, you do not need to be a professional bioinformatician to execute jobs on the HPC or assemble complex workflows!

We have also developed various workflows for common NGS data analyses (RNA-seq, ChIP-seq, ATAC-seq, Hi-C…) that we have used in our own projects. We are happy to share them and adapt them to your projects. Simply get in touch!

RStudio Server (Posit™ Workbench)
R Bioconductor Analysis statistical computing Visualisation

GBCS maintains an RStudio Server instance, available at no cost to all EMBL scientists from all sites. Running our local instance offers advantages in data access (e.g. direct access to your group share and scratch) and in computing power (see below). Together with ITS, we support different R versions through the EasyBuild (module) infrastructure; they are all available in our Posit instance and on the command line (i.e. including on the HPC cluster). Develop your R code on Posit and run it on the cluster, or vice versa; it just works.

Posit Workbench runs on one of our supercomputers (Seneca) and has access to 64 cores and 2 TB of RAM. Our R is bundled with Bioconductor, so you have all the state-of-the-art R libraries at your fingertips.

GitLab
Bio IT version control project management collaboration CI/CD

For many years, GBCS has been maintaining the EMBL-wide GitLab server in collaboration with the EMBL BioIT. This service makes it easy to version and share your code, and to deploy your services on virtual machines or in the EMBL Kubernetes cloud instance.

Chat
Bio IT chat communication collaboration

Mattermost is an EMBL-wide chat service to enhance communication and collaboration across all sites. Initially set up and maintained by Jelle Scholtalbers from GBCS, the service is now supported by the EMBL BioIT. It is the preferred way to quickly get help from us or the EMBL community. We have channels for the group (GBCS) and for each service (LabID, Galaxy, Posit…). Don’t be shy: join channels to ask questions, post suggestions or report bugs.
