DDM – How to interpret results/Bioinformatics explanations (5)
The SOPHiA DDM™ Platform evaluates the impact of each variant on all transcripts of the gene in the kit. For each variant, the platform displays the transcript for which it is most deleterious. The variant annotation (coding consequence, cDNA HGVS nomenclature, protein effect) is then provided relative to this transcript. This method is used to minimize the probability of missing a potentially pathogenic mutation.
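The selection described above can be illustrated with a minimal sketch. The severity ranking, the transcript names (TX_A, TX_B, TX_C) and the function name most_deleterious are hypothetical and chosen only for illustration; they are not the platform's actual ordering or API.

```python
# Hypothetical severity ranking (higher = more deleterious).
# This ordering is an assumption for illustration only.
SEVERITY = {
    "synonymous": 1,
    "missense": 2,
    "splice_site": 3,
    "frameshift": 4,
    "nonsense": 5,
}


def most_deleterious(annotations):
    """Return the annotation with the highest-ranked consequence.

    On ties, max() keeps the first annotation encountered.
    """
    return max(annotations, key=lambda a: SEVERITY[a["consequence"]])


# One variant annotated against three made-up transcripts of the same gene.
annotations = [
    {"transcript": "TX_A", "consequence": "synonymous"},
    {"transcript": "TX_B", "consequence": "nonsense"},
    {"transcript": "TX_C", "consequence": "missense"},
]

chosen = most_deleterious(annotations)
print(chosen["transcript"], chosen["consequence"])
```

In this sketch the variant would be reported relative to TX_B, because that is the transcript on which its predicted consequence ranks highest.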
During variant analysis, the SOPHiA DDM™ Platform can directly ascertain the status of a pre-defined set of features.
This set of features can be defined both at the genomic and protein level. The monitoring of a given genomic region, such as an exon, is also possible.
If the screening analysis is enabled, a screening panel displaying the status of the pre-defined features is available on the SOPHiA DDM™ Platform.
The pre-defined set is thoroughly tested before being deployed. This procedure ensures that all events are assessable from the product used. Therefore, this analysis can only be enabled by our team.
Using PMS2 and PMS2CL as an example: exons 11-15 of the gene and the pseudogene are very similar, so the SOPHiA DDM™ Platform attaches a warning to every variant detected in PMS2 exons 11-15. The warning belongs to one of three categories:
Pseudogene_identical
applies to all variants detected in PMS2 exon 15, which is identical to the corresponding exon in PMS2CL. This warning indicates that it is impossible to distinguish a variant listed for PMS2 exon 15 from a variant that may be located in the corresponding exon of the pseudogene PMS2CL.
Pseudogene_polymorphism
applies to variants detected in PMS2 exon 13 or exon 14, which are very similar to the corresponding exons in PMS2CL and – importantly – where gene conversion is frequently observed. These PMS2 exons may have replaced the corresponding sequence in the PMS2CL pseudogene locus or vice versa. Although PMS2 and PMS2CL sequences can be distinguished in these exons, no high confidence conclusions can be made since the observed PMS2 sequence (and any variants detected in this context) could originate either from the PMS2 gene locus or the PMS2CL pseudogene locus (due to gene conversion).
Pseudogene_distinct
applies to variants detected in PMS2 exon 11 or exon 12, which are very similar to the corresponding exons in PMS2CL but where gene conversion is rare. Here, the PMS2 sequence can be distinguished from the corresponding PMS2CL sequence with reasonable confidence, i.e. the observed PMS2 sequence (and any variants detected in this context) actually originates from PMS2 and not from the PMS2CL pseudogene.
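The three categories above amount to a lookup from PMS2 exon number to warning label. The following is a minimal sketch of that mapping; the names PMS2_EXON_WARNING and pms2_warning are hypothetical, and the sketch says nothing about how the platform implements the check internally.

```python
# Exon-to-warning mapping taken from the three categories described above.
PMS2_EXON_WARNING = {
    11: "Pseudogene_distinct",      # similar to PMS2CL, gene conversion rare
    12: "Pseudogene_distinct",
    13: "Pseudogene_polymorphism",  # similar to PMS2CL, gene conversion frequent
    14: "Pseudogene_polymorphism",
    15: "Pseudogene_identical",     # identical to the PMS2CL exon
}


def pms2_warning(exon):
    """Return the warning label for a PMS2 exon, or None outside exons 11-15."""
    return PMS2_EXON_WARNING.get(exon)


for exon in (10, 11, 13, 15):
    print(exon, pms2_warning(exon))
```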
The SOPHiA DDM™ Platform normalizes the scores provided by the different tools to enable comparisons among the different sources. For all predictive scores, a value of 1 means likely pathogenic and 0 likely benign irrespective of the database. The scores are transformed following the rationale and procedure established by dbNSFP and discussed in Xiaoming Liu et al., 2011 (https://doi.org/10.1002/humu.21517). Briefly:
SIFT
“1 – SIFT” score (the original SIFT score is opposite, with 0 being predicted pathogenic and 1 benign)
PolyPhen2
No transformation (there are two different PolyPhen2 scores, HVAR and HDIV, based on different training sets)
MutationTaster
Combination of prediction and score (the original score expresses confidence in the prediction: 1 is confident, 0 is not confident). The SOPHiA DDM™ Platform displays the score if the prediction is pathogenic, or “1 – score” if the prediction is benign
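The three transformations listed above can be written out as a short sketch. The function names and the boolean parameter prediction_is_pathogenic are hypothetical; only the arithmetic is taken from the descriptions above.

```python
def normalize_sift(score):
    """SIFT: the original scale is inverted (0 = pathogenic), so flip it."""
    return 1.0 - score


def normalize_polyphen2(score):
    """PolyPhen2 (HVAR or HDIV): already on a 0 = benign, 1 = pathogenic scale."""
    return score


def normalize_mutationtaster(score, prediction_is_pathogenic):
    """MutationTaster: the score is the confidence in the prediction.

    Keep it for a pathogenic prediction; invert it for a benign one.
    """
    return score if prediction_is_pathogenic else 1.0 - score


# An original SIFT score of 0.25 becomes 0.75 on the normalized scale.
print(normalize_sift(0.25))
# A confident (0.75) benign MutationTaster call becomes 0.25.
print(normalize_mutationtaster(0.75, prediction_is_pathogenic=False))
```

After normalization, a value close to 1 points toward pathogenic and a value close to 0 toward benign for all three tools, which is what makes the scores comparable side by side.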
There are two different BRCA1 exon naming conventions.
1) The BRCA1 legacy exon nomenclature is still used by some clinical databases. It has no exon 4, due to an initial oversight during BRCA1 protein characterization, so every exon after exon 3 carries a number one higher than in sequential numbering.
2) The standard HGVS/RefSeq/Ensembl nomenclature numbers exons according to the specific transcript (NM_*, ENST*), always in increasing order from 1 up to the last exon.
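The offset between the two conventions can be sketched as follows. The function names are hypothetical, and the sketch assumes that the transcript in use differs from the legacy numbering only by the missing exon 4; this should be verified for the specific transcript (NM_*, ENST*) before relying on the conversion.

```python
def legacy_to_sequential(legacy_exon):
    """Convert a legacy BRCA1 exon number to sequential numbering.

    Assumes the only difference is the missing legacy exon 4.
    """
    if legacy_exon < 1:
        raise ValueError("exon numbers start at 1")
    if legacy_exon == 4:
        raise ValueError("legacy BRCA1 numbering has no exon 4")
    return legacy_exon if legacy_exon < 4 else legacy_exon - 1


def sequential_to_legacy(exon):
    """Convert a sequential exon number back to the legacy BRCA1 number."""
    if exon < 1:
        raise ValueError("exon numbers start at 1")
    return exon if exon < 4 else exon + 1


# Legacy exon 11 corresponds to sequential exon 10 under this assumption.
print(legacy_to_sequential(11))
print(sequential_to_legacy(10))
```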