Paralogue Annotation utilizes information from evolutionarily related proteins, specifically paralogues, to help inform the clinical significance of missense variants associated with human diseases.
The original methodology and implementation of Paralogue Annotation on arrhythmia syndrome genes was published here and here.
This web app extends Paralogue Annotation exome-wide, using paralogues defined by Ensembl gene trees and pathogenic/likely pathogenic missense variants defined by ClinVar.
This web app is currently being built using Shiny. The source code is available at https://github.com/ImperialCardioGenetics/Paralogue_Annotation_App.
Frequently Asked Questions (FAQ)
Q. What genome build coordinates do my variants need to be in?
A. Currently only GRCh37 coordinates are supported.
We recommend using the Ensembl liftover service for coordinate conversions.
Q. What are the Para_z scores?
A. The Para_z scores are a measure of paralogue conservation independently derived by
Lal et al. (2020).
You may therefore find in your results that some Para_z scores do not agree with your expectations.
This is because the paralogue alignments used to generate the scores are different to the alignments used here.
The Para_z scores can thus be thought of as a third-party confidence score of paralogue conservation across aligned positions.
Q. How can we use the conservation of Ref/Alt alleles to filter out results?
A. Filtering by Ref/Alt allele conservation is not currently available in this version of the web app.
Q. What paralogue alignments do you use here?
A. We utilize paralogue alignments at the protein level generated by Ensembl, which were obtained through
Compara.
Q. Why do the results for arrhythmia genes from the original Paralogue Annotation and here differ?
A. This is mainly because the original Paralogue Annotation utilized T-COFFEE for the alignments, whereas Ensembl alignments are generated by CLUSTAL W.
Furthermore, the original Paralogue Annotation used variants from HGMD rather than ClinVar.
Q. What formats do my input variants have to be in?
A. Currently, variants must be submitted as chromosome, position, reference allele, and alternate allele, in the form 'CHROM:POS:REF:ALT' (any delimiter may be used in place of ':'), with one variant per line.
Alternatively, we also accept VCF.
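To make the 'CHROM:POS:REF:ALT' input format described above concrete, here is a minimal Python sketch of how such lines could be checked before submission. It is not the web app's own parser (the app is built in Shiny). The names `Variant`, `parse_variant` and `parse_variants` are hypothetical, and the sketch assumes that "any delimiter" means any run of non-alphanumeric characters and that alleles are written with the letters A, C, G and T.

```python
import re
from typing import List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


# Assumption: a delimiter is any run of non-alphanumeric characters
# (':', tab, space, ',', '-', ...). The web app's real rules may differ.
_DELIMITER = re.compile(r"[^A-Za-z0-9]+")
_ALLELE = re.compile(r"[ACGTacgt]+")


def parse_variant(line: str) -> Variant:
    """Parse one line holding CHROM, POS, REF and ALT, in that order."""
    fields = [f for f in _DELIMITER.split(line.strip()) if f]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    chrom, pos, ref, alt = fields
    if not pos.isdigit():
        raise ValueError(f"position is not an integer: {pos!r}")
    for allele in (ref, alt):
        if not _ALLELE.fullmatch(allele):
            raise ValueError(f"unexpected allele: {allele!r}")
    # Alleles are normalised to upper case for consistency.
    return Variant(chrom, int(pos), ref.upper(), alt.upper())


def parse_variants(text: str) -> List[Variant]:
    """Parse newline-separated variants, skipping blank lines."""
    return [parse_variant(line) for line in text.splitlines() if line.strip()]
```

For example, `parse_variants("1:12345:A:G\n2-200-C-T")` returns two `Variant` records. The coordinates here are made up for illustration and do not refer to real variants.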
Q. I have previously identified a known paralogous variant to my variant of interest. Why does Paralogue Annotation not find it?
A. We currently use only Pathogenic and Likely pathogenic missense variants from ClinVar.
Your variant may not exist in ClinVar, or it may be classified as a Variant of Uncertain Significance (VUS). We do not currently look at other databases, e.g. HGMD.
Q. Are all homologous Pfam positions available?
A. Not currently, but we are working towards providing this information in a future update.
For more details on specific methods or the code behind Paralogue Annotation, or for any other questions, please email nyl112@ic.ac.uk
This web app is a work in progress. The final version may differ.