Cytosine methylation plays a vital role in many biological processes, including cell lineage specification, X-chromosome inactivation, and the preservation of chromosome stability. Given its importance in many essential cellular functions, errors in methylation have been linked to a wide range of human diseases, including cancer and autoimmune disease.
The detection of methylation patterns via DNA sequencing is an important tool for researchers trying to unravel the mechanisms of human disease and health. Still, these experiments can be expensive due to the inherently low base diversity of bisulfite samples and the limitations of traditional NGS technology.
The Challenges of Bisulfite Sequencing
It is common to assess the methylation state of DNA via sequencing by converting unmethylated C’s to U’s either enzymatically or via bisulfite conversion. Following sequencing and alignment to a methylated and unmethylated reference, the methylated sites can be identified in the sample of interest.
However, because the vast majority of cytosines are unmethylated in most sample types, this leads to very few C base calls (and an overabundance of T base calls) in the resulting sequencing library. Libraries with one or more under-represented bases are termed low diversity, and they pose a challenge for many sequencing technologies.
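To make the low-diversity effect concrete, here is a minimal sketch of an in silico conversion on a toy sequence. The sequence, the methylated position, and the function names are illustrative assumptions, not data from this study, and only the forward strand is modeled (reads from the opposite strand would show the complementary G-to-A pattern).

```python
from collections import Counter


def convert_unmethylated_c(seq, methylated_positions):
    """Return the sequence as it would be read after conversion and PCR:
    every unmethylated C reads as T, methylated Cs stay as C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def base_fractions(seq):
    """Fraction of each of the four bases in the sequence."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) / len(seq) for base in "ACGT"}


# Hypothetical 16-base fragment with a single methylated C at index 1.
toy_fragment = "ACGTCCGATCGACCTG"
methylated = {1}

converted = convert_unmethylated_c(toy_fragment, methylated)
before = base_fractions(toy_fragment)
after = base_fractions(converted)

print(converted)
print("C fraction before/after:", before["C"], after["C"])
print("T fraction before/after:", before["T"], after["T"])
```

In this toy case the C fraction collapses while T grows to half of all bases, which is the skewed composition that the term low diversity refers to.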
With many technologies, low-diversity samples interfere with the ability to map the location of distinct clusters and maintain base calling accuracy as sequencing progresses. To overcome this challenge, it is common to either pool such libraries with high-diversity libraries or to supplement the library with a significant PhiX DNA spike-in before sequencing. While the amount of PhiX DNA required and the impact of low diversity on target density vary by sequencing platform, the addition reduces the run’s effective throughput, driving up the cost of sequencing.1
A Better Path to Low-Diversity Sequencing on the AVITI™ System
Unlike other sequencing platforms, the AVITI system does not require base diversity to maintain accuracy, because signals from the four bases are more reliably distinguishable due to our unique sequencing chemistry. In addition, AVITI libraries have specific characteristics that enable clean mapping of polonies during the initial cycles.
We decided to assess the capability of the AVITI Sequencing system on MethylSeq libraries, prepared using the NEBNext Enzymatic MethylSeq kits. Our objective was to evaluate density and accuracy while varying the PhiX spike-in percentage.
We sequenced the well-characterized sample NA12878, pooled with 1% each of a fully methylated control library (pUC19) and a fully unmethylated control library (phage lambda). Three runs were completed with PhiX spike-ins of 0%, 5%, and 20% by concentration, respectively. The summary of the primary sequencing metrics is provided in Table 1, below.
| Condition | PE Reads | %Q30 | PhiX Aligned (%) |
|---|---|---|---|
| No PhiX | 989 M | 96% | N/A |
| 5% PhiX | 857 M | 93% | 5.3% |
| 20% PhiX | 920 M | 95% | 27% |
Each run had high coverage and attained a high percentage of Q30 bases. The PhiX-aligned fraction for the third condition was higher than expected (27% aligned versus 20% loaded), reflecting either loading variation or some amplification preference for the PhiX reads.
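As a rough illustration of how a spike-in eats into usable output, the sketch below estimates the reads left for the sample in each run from the Table 1 figures. It assumes that the "PhiX Aligned (%)" column is the fraction of all reads that are PhiX and treats the "N/A" entry as zero; the function name is hypothetical.

```python
# (condition, paired-end reads in millions, PhiX-aligned fraction) from Table 1.
# Assumption: "N/A" for the no-PhiX run is treated as a fraction of 0.0.
runs = [
    ("No PhiX", 989, 0.0),
    ("5% PhiX", 857, 0.053),
    ("20% PhiX", 920, 0.27),
]


def non_phix_reads(pe_reads_millions, phix_fraction):
    """Reads (in millions) remaining for the sample after removing PhiX."""
    return pe_reads_millions * (1.0 - phix_fraction)


for condition, reads_m, phix_fraction in runs:
    usable = non_phix_reads(reads_m, phix_fraction)
    print(f"{condition}: about {usable:.0f} M non-PhiX reads")
```

Run-to-run yield differs for reasons unrelated to PhiX, so these numbers are only a back-of-the-envelope view of the throughput given up to the spike-in.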
We processed the runs using the nf-core implementation of the Bismark methylation pipeline (version 1.6.1) to obtain the percentage of methylated CpG sites in each library. According to documentation from sample manufacturer NEB, the expected methylation percentages of CpG sites for the three libraries are 53% (NA12878), 100% (pUC19), and 0% (phage lambda). Figure 2 shows our results from the output of the Bismark pipeline.
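For readers who want to see how a per-library methylation percentage can be derived from per-site counts, here is a minimal sketch. It assumes tab-separated lines in the Bismark coverage layout (chromosome, start, end, methylation percentage, methylated count, unmethylated count); the contig names and counts are made-up toy values, not results from these runs, and this is not the pipeline's own summary code.

```python
from collections import defaultdict


def methylation_by_contig(cov_lines):
    """Pool methylated/unmethylated call counts per contig and return
    the methylation percentage for each contig with at least one call."""
    meth = defaultdict(int)
    unmeth = defaultdict(int)
    for line in cov_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chrom, _start, _end, _pct, n_meth, n_unmeth = line.split("\t")
        meth[chrom] += int(n_meth)
        unmeth[chrom] += int(n_unmeth)
    return {
        chrom: 100.0 * meth[chrom] / (meth[chrom] + unmeth[chrom])
        for chrom in meth
        if meth[chrom] + unmeth[chrom] > 0
    }


# Hypothetical coverage records for two control contigs (toy values).
toy_cov = [
    "ctrl_methylated\t10\t10\t100\t20\t0",
    "ctrl_methylated\t25\t25\t90\t18\t2",
    "ctrl_unmethylated\t5\t5\t0\t0\t30",
    "ctrl_unmethylated\t40\t40\t10\t1\t9",
]

percentages = methylation_by_contig(toy_cov)
for contig, pct in percentages.items():
    print(f"{contig}: {pct:.1f}% methylated")
```

Pooling counts before dividing weights each site by its read depth, which is one reasonable choice; averaging the per-site percentages instead would weight all sites equally.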
Escaping the Throughput-Stealing Requirement for PhiX to Lower Costs
The methylation fraction closely matches the expected results and is highly consistent across runs, even with very little PhiX present. The FASTQ data is publicly available for download here. These results show that the AVITI system is compatible with the NEBNext Enzymatic MethylSeq kit and produces high-quality methylation data, even with no PhiX spike-in.
Given the small number of runs and the presence of a small amount of the fully methylated control library, we still recommend a 5% PhiX spike-in for robustness and real-time error measurement.
However, a large amount of PhiX is not required to obtain accurate MethylSeq data on AVITI, further lowering the cost of sequencing.