Whole Exome Sequencing 101: Cost-effective DNA sequencing to understand genetic disease

Scientists have many options when planning a DNA sequencing experiment, from assay type, sample prep solution, and DNA sequencing technology to data analysis tools. In this post of our Element 101 series, we review some of the considerations that arise when planning an experiment with whole exome sequencing, a targeted DNA sequencing method commonly used to understand genetic disease that can be more cost-effective than whole genome sequencing.

What is whole exome sequencing (WES)?

The development of cost-effective, high-throughput next-generation DNA sequencing (NGS) has allowed scientists to link specific genetic changes to a wide range of health conditions, including inherited diseases and cancer. These discoveries have deepened our understanding of the biology of many diseases, improving treatment options for patients and in some cases leading to new therapeutic interventions. Most of the time, disease-causing genetic variants occur in exons, the regions of genes that are translated into proteins. Other overrepresented locations include intron-exon boundaries, promoters, and untranslated regions (UTRs) just upstream or downstream of coding regions. Since these targeted regions make up only ~1% of the human genome, focusing sequencing resources on them is more cost-effective than whole genome sequencing (WGS) while still yielding a high proportion of disease-relevant variants.

How does whole exome sequencing work?

In whole exome sequencing, oligonucleotide probes complementary to genomic regions of interest are used to capture the relevant DNA fragments for sequencing while unwanted fragments are washed away. There are two methods used for exome enrichment: array-based capture and solution-based capture. In both methods, a genomic sample is sheared to produce double-stranded DNA fragments. The fragments are end-repaired and universal priming sequences are added.

In array-based capture, sequences of interest hybridize to oligo probes attached to a high-density microarray. Unwanted DNA is washed away, and the captured material is amplified by PCR for sequencing.

In solution-based capture, oligo probes are attached to magnetic beads, which can be pulled down and washed to remove unwanted DNA fragments. Compared with array-based capture, this method requires less sample input and eliminates the need for specialized hardware.

Figure 1. Whole exome sequencing workflow

There are a number of kitted solutions for bead-based exome enrichment, including Agilent SureSelect Human All Exome v8, Twist Exome 2.0, IDT xGen Exome Hyb Panel, and QIAseq Human Exome. The options vary in the total size of the panel, workflow hands-on time, library prep automation compatibility, and the availability of paired data analysis software, reporting tools, and proprietary variant databases. In selecting a solution, users often weigh the size and completeness of the panel against the cost of sequencing. Evenness of coverage, often expressed as the fold-80 metric, is also important. Fold-80 measures how much in excess of the desired coverage one must sequence to ensure that at least 80% of the target regions have that level of coverage. A high fold-80 can reduce the sensitivity of an assay and add to sequencing costs. For time-sensitive applications, the speed of the paired data analysis and reporting solution is another important consideration.
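As a rough illustration of the fold-80 metric described above, the sketch below uses one common formulation: mean target coverage divided by the coverage that at least 80% of target positions meet or exceed (the 20th percentile). The function name and the example depths are hypothetical, and individual analysis tools may compute the metric somewhat differently.

```python
def fold_80_penalty(depths):
    """Mean depth divided by the depth that >= 80% of positions reach.

    `depths` is a list of per-position (or per-target) coverage values.
    A value of 1.0 means perfectly even coverage; larger values mean
    more over-sequencing is needed to bring 80% of targets up to the mean.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    ordered = sorted(depths)
    # Index of the 20th percentile: at least 80% of values are >= this one.
    p20 = ordered[len(ordered) // 5]
    mean_depth = sum(ordered) / len(ordered)
    if p20 == 0:
        return float("inf")  # part of the target set is effectively uncovered
    return mean_depth / p20


# Hypothetical example data, not measured values.
even_depths = [30] * 10
uneven_depths = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

print(fold_80_penalty(even_depths))              # 1.0
print(round(fold_80_penalty(uneven_depths), 2))  # 1.83
```

In this toy example, the uneven library would need roughly 1.8 times more sequencing than the even one for 80% of its positions to reach the same mean depth, which is how a high fold-80 translates into added sequencing cost.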

Exome Sequencing on the Element AVITI System

The choice of sequencing platform is just as critical as the choice of enrichment solution. High-accuracy data is needed for the assay to have high sensitivity and specificity. To control costs, the platform should have a low cost per gigabase. However, the instrument throughput should also match your typical batch size and coverage requirements. Otherwise, you must choose between paying more per sample to run a partially filled flow cell and waiting longer for results, neither of which is desirable.

The AVITI system is uniquely positioned to address all of these considerations.

  • High quality data with >90% of reads Q30 or above
  • Cost-effective at $5-7 per Gb / $1 per million reads
  • Sample volume flexibility with two fully independent flow cells
  • 300 Gb / flow cell with 2x150 PE kit or 150 Gb / flow cell with 2x75 PE kit
  • Guaranteed reagent pricing for the life of your instrument
  • Validated ecosystem compatibility for a wide array of target capture and library prep solutions

The AVITI system gives you access to affordable per-sample pricing even if you don’t have 288 samples to run at once, or if your sample volume is variable. For $45 per sample, you can multiplex 24 exomes per flow cell and generate 6 Gb of data per sample using the 2x75 PE sequencing kit. Or, if you have more samples, you can run both flow cells side by side and sequence 48 samples per run. With such affordable per-Gb costs, you can also afford to run fewer than 24 samples when results are time-sensitive.
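The trade-off between multiplex level, cost, and data per sample can be sketched with simple arithmetic. The example below uses only the figures quoted in this post (150 Gb per flow cell with the 2x75 PE kit, 24 samples, $45 per sample); the per-flow-cell cost is an implied value derived from those figures, not a quoted list price, and the function name is hypothetical.

```python
FLOW_CELL_OUTPUT_GB = 150    # 2x75 PE kit output per flow cell, from the text
QUOTED_SAMPLES = 24          # multiplex level quoted in the text
QUOTED_PRICE_USD = 45        # per-sample price quoted in the text

# Assumption: implied sequencing cost per flow cell (24 x $45 = $1,080).
IMPLIED_FLOW_CELL_COST_USD = QUOTED_SAMPLES * QUOTED_PRICE_USD


def per_sample_economics(n_samples,
                         flow_cell_cost=IMPLIED_FLOW_CELL_COST_USD,
                         output_gb=FLOW_CELL_OUTPUT_GB):
    """Return (cost in USD, data in Gb) per sample for one flow cell."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return flow_cell_cost / n_samples, output_gb / n_samples


for n in (8, 12, 24):
    cost, gb = per_sample_economics(n)
    print(f"{n:>2} samples: ${cost:.2f} per sample, {gb:.2f} Gb per sample")
```

At 24 samples this reproduces the $45 per sample quoted above, with about 6 Gb of data each; halving the multiplex level doubles both the per-sample cost and the data per sample.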

Table 1. The affordability of exome sequencing is dependent on multiplex level
Table 2. Exome sequencing data quality on the AVITI system

Best of all, the data quality of the AVITI system is high, in fact superior to that of other DNA sequencing systems on the market today. Table 2 shows sequencing quality for multiple exomes prepared using the Agilent SureSelect All Exome v8 target capture solution and sequenced in one flow cell.

Exome sequencing continues to reveal more about the genetic underpinnings of health and disease, uncovering biomarkers and drug targets that will improve diagnostics and patient care in the future. The AVITI system is a new option for researchers who need high-quality, affordable exome data but face challenges with long queues and slow turnaround times in NovaSeq outsourcing. Skip the queue and own your own science with an AVITI system. Your science can’t wait.