ABRF 2023, Boston, Massachusetts
The impact of insert length on variant calling quality in whole genome sequencing
Kelly Blease, Bryan R. Lajoie, Sophie Billings, Vivian Dien, Andrew Altomare, Ryan Kelley, Edmund Miller, Juan Moreno, Connor Thompson, Junhua Zhao, Matthew Kellinger, Shawn Levy, Semyon Kruglyak;
Element Biosciences, San Diego, CA
Accurate variant calling is critical for whole genome sequencing applications, including rare disease and oncology. The availability of high-quality truth sets provided by NIST for several human genomes has enabled various benchmarking efforts across both sequencing platforms and NGS algorithms. These benchmarking results have given us an understanding of the most accurate variant calling methods, the impact of greater read length, and the properties of the remaining difficult regions. However, the impact of varying insert length distributions on benchmarking has not yet been carefully examined. This is in part because amplifying long inserts is challenging for many sequencing chemistries.