Avoid the noise, remove the background

How data accuracy can be improved with better algorithms

Your favorite song is playing on the car radio, but as you drive along, the signal seems to hit a snag as hisses and pops infiltrate the music. The same song could sound much clearer on a slightly different frequency. It just might take some fine-tuning. Tuning in to a “clean” frequency is not unique to radio. Medical research, too, must be reproducible without the static.

How do we “tune in” to data accuracy?

Some of the most concerning diseases of our time can now be studied in ways that scientists could only dream of decades ago. With the evolution of next-generation sequencing (NGS), data has become the lifeblood of new clinical research capabilities.

If you think of data accuracy like that radio signal, what you’re trying to do is tune in to the right frequency, searching for your intended disease-causing variant. Ideally, you end up hearing a clear signal coming through a mess of static. That signal gives you the most accurate reading of what you’re searching for among the noise that’s naturally present in any sample you test. To fine-tune the end result, you must eliminate what is known as background noise. For NGS, this is a combination of the biases inherent to the design. When visualized, it looks like peaks of signal. Background noise can also come from outliers and excess data points that don’t apply to the research.

How can data be made better?

The new age of next-generation sequencing brings record volumes of data to analyze each day. The amount of background noise in those datasets grows along with them, making it harder to reach the level of accuracy your research needs.

Every single step of an experiment can introduce noise to the mix. Luckily, when data is muddied with irregularities picked up along the way, it can also be cleaned. With advanced algorithms and exceptional analytical performance, it’s easier to identify variants of interest and to overcome losses in data quality and accuracy. This is thanks to the ability to look past the “static” of background noise and zoom in on variants of interest at a higher resolution, sometimes down to 2 to 5 exons.
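To make the idea of separating signal from static a little more concrete, here is a minimal sketch of one generic way to flag outlying data points: a robust z-score built on the median and the median absolute deviation (MAD). It is an illustration only and is not a description of how any particular platform works. The function names, the 3.5 threshold, and the coverage values are all assumptions made up for the example.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, in units of scaled MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No measurable spread: nothing can be scored as an outlier this way.
        return [0.0 for _ in values]
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return [(v - med) / (1.4826 * mad) for v in values]


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose robust z-score exceeds the threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical normalized coverage ratios for ten targets (made-up numbers).
coverage_ratios = [1.02, 0.98, 1.01, 0.99, 1.00, 0.52, 1.03, 0.97, 1.01, 0.99]
outliers = flag_outliers(coverage_ratios)
print(outliers)
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme value barely moves them, so the "static" does not distort the baseline it is being compared against.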

How can we further data analysis? It’s clear that the initial data capture is far from the final step in your research. In addition to the interpretation functionalities already part of our platform, such as ACMG automated variant classification, virtual gene panels, and cascading filters, SOPHiA DDM™ for use with KAPA HyperExome offers extremely accurate detection of biomarkers in a single workflow. The solution and our platform include Familial Variant Analysis (trio analysis) to automatically filter variants based on different inheritance modes. If you’d like to learn more about what we offer, contact us today.