A: In traditional approaches, that is what happens. Some of the cleverness of this method comes through in the molecular biology, and it is worth jumping back to this distribution reaction. If the gene of interest is 2 kb or 5 kb, what happens in that step is that the barcode on that end is inserted uniformly all along the gene. You have thousands of copies, each with a unique address, and then you have a known barcode next to some part of the gene that you can access. That is how you make the short-read to long-read jump. You are actually sequencing maybe only 150 bases, but you are doing it at all these different places along the gene, and you have the barcode for reassembly.
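As a rough illustration of why many copies with uniformly placed barcodes can tile a whole gene with 150-base reads, here is a minimal simulation sketch. The function name, the seed, and the default numbers (a 5 kb gene, 1,000 copies) are assumptions chosen for illustration, and a uniform random start position is a simplification of the actual chemistry.

```python
import random

def simulate_coverage(gene_length=5000, read_length=150, n_copies=1000, seed=0):
    """Fraction of gene positions covered by at least one short read.

    Each copy contributes one read starting at a uniformly random position,
    a simplified stand-in for the barcode landing anywhere along the gene.
    """
    rng = random.Random(seed)
    covered = [False] * gene_length
    for _ in range(n_copies):
        # randint is inclusive on both ends, so the read always fits in the gene
        start = rng.randint(0, gene_length - read_length)
        for pos in range(start, start + read_length):
            covered[pos] = True
    return sum(covered) / gene_length

if __name__ == "__main__":
    print(f"1 copy:      {simulate_coverage(n_copies=1):.3f}")
    print(f"1000 copies: {simulate_coverage(n_copies=1000):.3f}")
```

A single copy covers only 150 of the 5,000 positions, while a thousand copies at random positions cover nearly all of them, which is the intuition behind the short-read to long-read jump.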
A: Actually they will, because of the way the fragmentation is done. The UMI winds up at the beginning of every read, so you can aggregate by UMI and then throw them into your assembler.
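The "aggregate by UMI" step can be sketched as below. This is only a minimal illustration: the 12-base UMI length and the function name `group_reads_by_umi` are assumptions, not details given in the talk, and a real pipeline would also handle sequencing errors in the UMI and quality filtering.

```python
from collections import defaultdict

UMI_LENGTH = 12  # assumed UMI length; the talk does not state it

def group_reads_by_umi(reads, umi_length=UMI_LENGTH):
    """Group read sequences by the UMI found at the start of each read.

    Returns a dict mapping UMI -> list of the remaining read sequences,
    so that each group can be handed to an assembler separately.
    """
    groups = defaultdict(list)
    for read in reads:
        if len(read) <= umi_length:
            continue  # too short to carry both a UMI and gene sequence
        umi, insert = read[:umi_length], read[umi_length:]
        groups[umi].append(insert)
    return dict(groups)

if __name__ == "__main__":
    reads = [
        "AAAACCCCGGGGTTTACG",
        "AAAACCCCGGGGACGTAC",
        "TTTTGGGGCCCCAAATTT",
    ]
    for umi, inserts in group_reads_by_umi(reads).items():
        print(umi, inserts)
```

Each UMI group then corresponds to one original molecule, and its reads would be assembled independently of the other groups.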
A: This system is a short-read technology. Today it gives you 2x150 as the total read length. You can play with insert size to get it longer, but you will still end up with 2x150 base pair reads. For this application, that would mean a lot of missed content in the middle, and if everything is not individually barcoded, reassembly would not happen. That is what this assay overcomes with some molecular biology: a way to get to a synthetic long read, limited by long-range PCR, where 20 kb is typically the upper end of the size range. In terms of comparison to PacBio or ONT, there are several publications that address this topic.
A: We have low-diversity methods implemented. What we require is reasonable diversity within the first 5 cycles. We are working on eliminating that requirement in our own Elevate prep, but if you are using compatibility, we do require that diversity within the first 5 cycles; after that it is pretty robust.
A: So we should be good with arbitrarily low. We haven’t worked it out extensively, but we would be excited to try your use case and then figure out whether we need maybe a 5% PhiX spike-in or whether we are good to go.
A: It is a single barcode on the front, but there are 3 layers of barcoding that can happen, so the multiplex levels can get high. You can have a well-based barcode, which is this one, but then you can have plate-based barcodes so you don’t need endless well barcodes, and then you can have a barcode for the NGS run itself. So in a 96-plex sequencing run, one index could be dedicated to an application like this, and inside that one index you have 2 more layers of indexing, which can let you put thousands of samples inside that one NGS run.
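The arithmetic behind the layered barcoding can be sketched as a simple product of the barcode counts at each layer. The counts used here (96 well barcodes, 24 plate barcodes) are hypothetical numbers for illustration; the talk only says that the combination reaches thousands of samples within one run index.

```python
def total_samples(well_barcodes, plate_barcodes, run_indexes=1):
    """Number of distinct samples addressable with layered barcodes.

    Each sample is identified by a (run index, plate barcode, well barcode)
    combination, so the capacity is the product of the three counts.
    """
    return well_barcodes * plate_barcodes * run_indexes

if __name__ == "__main__":
    # Hypothetical counts: 96 well barcodes and 24 plate barcodes
    # nested inside a single index of the NGS run.
    print(total_samples(well_barcodes=96, plate_barcodes=24))
```

With these assumed counts, one run index would hold 96 x 24 = 2,304 samples, leaving the remaining indexes of the run free for other applications.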
View Q&A Transcript from "Uncovering the Meta-Transcriptome with Long-Read Sequencing" Presentatio