ProtVar: understanding missense variation in humans
A new tool for the interpretation of missense variation in humans – ProtVar – will help enable drug discovery
EMBL’s European Bioinformatics Institute (EMBL-EBI) has launched ProtVar, a new web tool that helps users contextualise and interpret human missense variation within proteins.
What is missense variation?
A missense variant is a point mutation in which a single nucleotide change results in a different amino acid within a protein. This type of genomic variation can impact normal biological function in complex ways. Understanding how such variation affects proteins is vital for identifying drug targets and developing new therapeutics, but it requires a broad range of data to be brought together coherently.
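As a small illustration of the definition above, the following Python sketch classifies a single-nucleotide change within one codon using the standard genetic code. The function name `classify_substitution` and its return format are assumptions made for this example; they are not part of ProtVar.

```python
# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon, offset, new_base):
    """Classify a single-nucleotide change at `offset` (0-2) in a DNA codon.

    Illustrative helper only (hypothetical name, not a ProtVar function).
    Returns (kind, reference_amino_acid, altered_amino_acid); '*' is a stop.
    """
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        kind = "synonymous"   # same amino acid, protein sequence unchanged
    elif alt_aa == "*":
        kind = "nonsense"     # premature stop codon
    elif ref_aa == "*":
        kind = "stop-lost"    # stop codon replaced by an amino acid
    else:
        kind = "missense"     # one amino acid replaced by another
    return kind, ref_aa, alt_aa


print(classify_substitution("GAG", 1, "T"))  # glutamate codon GAG -> GTG
print(classify_substitution("GAG", 2, "A"))  # GAG -> GAA
print(classify_substitution("GAG", 0, "T"))  # GAG -> TAG
```

The first call changes a glutamate codon into a valine codon, which is a missense change; the second leaves the amino acid unchanged, and the third introduces a stop codon.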
ProtVar combines data from UniProt, PDBe and Ensembl, among other databases, and integrates new analyses from Open Targets to bring together genomic data with insights into protein sequence, function and structure. This streamlines the user’s workflow and allows rapid identification of the potential effects of missense variation on human health and disease.
Interpreting missense variation
ProtVar streamlines data interpretation by providing amino acid-specific information, allowing users to investigate each variant independently and to compare it with other known variants at the same amino acid position. Clinical geneticists can use this information to identify and prioritise the variants in their patients that are most likely to be associated with disease.
ProtVar also integrates data from an Open Targets project that used AlphaFold2: the predicted impact of a variant on protein stability, and whether the variant position is likely to be part of a protein-protein interface or a small-molecule binding site. Users can then visualise variant positions in an annotated structural context to assess the likely effect of missense mutations, for example to understand variation within drug-binding pockets.
“ProtVar is a transformative tool that allows the interpretation of coding variations faster and easier than ever before,” said James Stephenson, Lead Data Scientist in the Protein Function Development team at EMBL-EBI. “ProtVar enables users to rapidly analyse and interpret the consequences of amino acid substitutions making it invaluable to a broad range of professionals, from clinical geneticists to drug discovery experts.”
Speeding up analysis
“ProtVar’s power lies not only in its advanced algorithm but also in the ease of access and the speed at which it delivers relevant, updated data, saving users hours or even days of work,” said Prabhat Totoo, Software Engineer at EMBL-EBI and Open Targets. “Its impact on our understanding of genetic variation and its role in disease and health is profound, and the tool itself will evolve as we continue to make updates and add new features in our future releases.”
The ProtVar resource simplifies data interpretation by allowing users to input genomic coordinates, protein positions, or variant IDs such as dbSNP identifiers. This flexibility is a significant advantage over similar resources and reduces the need for researchers to map information manually. Mappings for the human proteome have all been precomputed, including those between different genome assemblies, so researchers can find the information they need in a single click.
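To make the idea of accepting several input types concrete, here is a minimal Python sketch that guesses which of the three kinds of input a query string represents. It is not ProtVar's parser: the function name `detect_input_type`, the whitespace-separated layouts, and the example strings are all assumptions chosen for illustration, and the formats ProtVar actually accepts may differ. The accession pattern follows the regular expression published by UniProt.

```python
import re

# Assumed layouts, for illustration only (ProtVar's real formats may differ):
#   variant ID : "rs" followed by digits, as in dbSNP identifiers
#   genomic    : chromosome, position, reference base, alternate base
#   protein    : UniProt accession, then e.g. R175H (ref, position, alt)
DBSNP = re.compile(r"rs\d+", re.IGNORECASE)
GENOMIC = re.compile(
    r"(?:chr)?(?:[1-9]|1\d|2[0-2]|X|Y|MT)\s+\d+\s+[ACGT]\s+[ACGT]",
    re.IGNORECASE,
)
UNIPROT_ACC = (
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
)
PROTEIN = re.compile(UNIPROT_ACC + r"\s+[A-Z]\d+[A-Z]")


def detect_input_type(query):
    """Return 'variant_id', 'genomic', 'protein' or 'unknown' for a query."""
    q = query.strip()
    if DBSNP.fullmatch(q):
        return "variant_id"
    if GENOMIC.fullmatch(q):
        return "genomic"
    if PROTEIN.fullmatch(q):
        return "protein"
    return "unknown"


# Illustrative query strings only.
for example in ["rs699", "19 1010539 G C", "P04637 R175H", "not a variant"]:
    print(example, "->", detect_input_type(example))
```

A tool that recognises the input type in this way can send each query to the appropriate precomputed mapping, so the user does not have to convert between coordinate systems by hand.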
Applications of ProtVar
ProtVar’s target audiences include clinical geneticists diagnosing patients, those working in drug discovery, virologists, and other researchers interested in the link between protein sequence, structure, and function.
“ProtVar provides a cast-iron link between genomic coordinates and protein positions, quickly and reliably; it will be really helpful for the community to explore and interpret missense variants that are of potential relevance to disease,” said Professor Caroline Wright, Professor of Genomic Medicine at the University of Exeter, who was not involved in the work.
“ProtVar has the potential to help clinical geneticists/clinicians to prioritise missense variants in relation to disease association,” said Dr Elizabeth Radford, Academic Clinical Lecturer in Paediatric Neurology at the University of Cambridge, who was not involved in the work.