Why are there sometimes mismatches between genes and transcripts (nucleotides highlighted in red on the transcript track)?

This is due to occasional discrepancies between the genome and transcript sequences: the genome reference carries the minor allele of a polymorphism, while the transcript carries the corresponding major allele. As a result, some genomic variants appear as ‘non-variants’ when analyzed at the transcript level.

At the positions highlighted in red, the transcript nucleotide differs from the nucleotide in the genome build (GRCh37 or GRCh38). At these positions it is harder to determine definitively whether an observed change is a true variant, because the answer depends on which of the two references it is compared against.
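
As a rough illustration of what the red highlighting represents, the sketch below compares a genome segment with a transcript segment base by base and reports the positions where they differ. It assumes the two sequences are already aligned and of equal length; the function name `find_mismatches` and the example sequences are hypothetical, not taken from any real tool or transcript.

```python
from __future__ import annotations


def find_mismatches(genome_seq: str, transcript_seq: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, genome base, transcript base) for each mismatch.

    Assumes both sequences are pre-aligned, gap-free, and of equal length.
    """
    if len(genome_seq) != len(transcript_seq):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (i, g, t)
        for i, (g, t) in enumerate(zip(genome_seq.upper(), transcript_seq.upper()))
        if g != t
    ]


# Made-up example: the genome carries C where the transcript carries T.
genome = "ATGGCCAAGT"
transcript = "ATGGCTAAGT"
for pos, g_base, t_base in find_mismatches(genome, transcript):
    print(f"position {pos}: genome={g_base} transcript={t_base}")
```

A sample carrying T at that position would be called a variant against the genome but would match the transcript, which is the ambiguity described above.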

These discrepancies occur mainly in RefSeq transcripts (beginning with “NM_”), because RefSeq does not correct the transcript to the genome build. Ensembl transcripts (beginning with “ENST”), by contrast, are corrected to match the nucleotides present in the genome build.
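
If you want to flag which transcripts in a list are more likely to show such mismatches, a minimal sketch is to check the identifier prefix. The helper name `transcript_source` is hypothetical, and the check relies only on the “NM_” and “ENST” prefixes mentioned above; other identifier types are reported as unknown.

```python
def transcript_source(transcript_id: str) -> str:
    """Guess the source database of a transcript from its identifier prefix."""
    if transcript_id.startswith("NM_"):
        return "RefSeq"   # not corrected to the genome build; mismatches possible
    if transcript_id.startswith("ENST"):
        return "Ensembl"  # corrected to match the genome build
    return "unknown"


# Illustrative identifiers only.
for tid in ["NM_000546.6", "ENST00000269305.9", "XYZ123"]:
    print(tid, "->", transcript_source(tid))
```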