Today Ion Torrent launches a neat product built on its highly multiplexed AmpliSeq technology: the Ion AmpliSeq Exome, covering >97% of the CCDS regions. But before getting into the details, a little context is in order.
Way back in 2008 (okay, 2008 isn’t all that long ago, but it was in the heady early years of next-generation sequencing) Bert Vogelstein’s group at Hopkins published this paper, which accomplished an amazing feat of high-throughput biology: PCR amplifying no fewer than 219,229 amplicons from 20,735 genes in 24 advanced pancreatic adenocarcinoma samples and sequencing them by Sanger capillary electrophoresis. A companion paper in the same issue of Science applied the same technology to 22 glioblastoma multiforme samples. One significant finding from this second paper is the involvement of a metabolic Krebs-cycle gene, isocitrate dehydrogenase 1, in this type of cancer, along with its prevalence among young patients and its association with survival rate. This unexpected finding validates a hypothesis-free approach, which the authors call an ‘unbiased genomic approach’.
If you think about the amount of work involved in this pair of papers, it is nothing short of astounding: each sample had to be sequenced as a tumor-normal pair, so doing some math you can calculate 24×2 + 22×2 = 92 samples in total, multiplied by 219,229 primer pairs, multiplied by 2 (for bidirectional sequencing), which gives some 40,338,136 sequencing reactions. Even with a large discount based upon high volume (disclosure: I work for Life Technologies but am not directly involved with the CE business), it is clear that these two projects cost many millions of dollars. (And by the way, if you want to take a look at all the primer sequences, the Supplementary Data table is available here as an Excel spreadsheet, in all of its 19MB glory.)
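As a quick check on that arithmetic, here is a minimal Python sketch that recomputes the reaction count from the figures quoted above (24 and 22 tumors, each paired with a matched normal, 219,229 primer pairs, two reads per amplicon). The constant and function names are my own, and it assumes every amplicon was sequenced in both directions for every sample, which ignores failed or repeated reactions.

```python
# Figures quoted in the text; the names are illustrative only.
PRIMER_PAIRS = 219_229
PANCREATIC_TUMORS = 24
GLIOBLASTOMA_TUMORS = 22
SAMPLES_PER_TUMOR = 2    # tumor plus matched normal
READS_PER_AMPLICON = 2   # forward and reverse (bidirectional) reads


def sanger_reactions(tumors: int) -> int:
    """Sequencing reactions needed for one study, assuming no repeats."""
    return tumors * SAMPLES_PER_TUMOR * PRIMER_PAIRS * READS_PER_AMPLICON


total_samples = (PANCREATIC_TUMORS + GLIOBLASTOMA_TUMORS) * SAMPLES_PER_TUMOR
total_reactions = (
    sanger_reactions(PANCREATIC_TUMORS) + sanger_reactions(GLIOBLASTOMA_TUMORS)
)

print(f"samples: {total_samples}")            # samples: 92
print(f"reactions: {total_reactions:,}")      # reactions: 40,338,136
```

Even as a rough lower bound, that is tens of millions of capillary reads for fewer than fifty patients.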
Meanwhile the technology for capturing these sequences for next-generation sequencing developed quickly, first via microarray hybridization (Bashiardes et al. in 2005 on NimbleGen microarrays), and then via solution capture (notably with Agilent oligos synthesized on a microarray and then cleaved from the substrate). The first exome sequencing by NGS to determine the causative mutation of a Mendelian disorder was Ng and Shendure et al. 2009, “Targeted capture and massively parallel sequencing of 12 human exomes”, looking at a rare disorder called Freeman-Sheldon Syndrome. (PCR-based methods like Fluidigm and RainDance are limited to focused sets of genes, due to the constraints of their systems, which I’ve written up before here.)
Thus exome sequencing for research in cancer (as tumor-normal pairs) and for rare Mendelian disorders (usually as parent-parent-affected child trios) has exploded in popularity over the past several years, with maturing selection technologies centered around solution hybridization-based approaches. The market leader, Agilent SureSelect, has iterated to its fifth version, NimbleGen SeqCap EZ (yes, what a mouthful) is now on its third version, and Life Technologies has its own TargetSeq exome that works with both the SOLiD™ and Ion Torrent platforms.
Typically in a hybridization-based enrichment, a size-selected DNA sample has sequencing-platform-specific adapters ligated to it and is then allowed to hybridize overnight to biotinylated capture probes, which are pulled down with streptavidin paramagnetic beads. After the pull-down, clean-up steps, and additional low-cycle PCR, the exome-enriched libraries are ready for sequencing. This process, from starting DNA sample to finished library, typically takes just under two days, due to the overnight hybridization step and several hours of hands-on work before and after hybridization.
Ion Torrent AmpliSeq technology was originally developed as a PCR-based targeted selection method that, unlike Fluidigm and RainDance, requires no specialized equipment other than a standard thermal cycler. Requiring only 10ng of input DNA (and, depending on the primer design, working well with FFPE samples), the assay is as straightforward as a PCR setup but multiplexes up to 6,000 primer pairs in a single tube. Life Technologies’ Ion Torrent has launched four standard panels and two ‘community panels’ (i.e. designs made available but purchased as a custom synthesis), and has even launched AmpliSeq RNA panels for targeted RNA-Seq.
With the launch of Ion AmpliSeq Exome, the AmpliSeq technology has been scaled to over 24,000 amplicons in a single tube. For the exome, 12 primer pools are amplified from ~5ng per tube (total of 50ng input sample), and the tubes are then combined after the amplification step for downstream processing. The beauty of a PCR-based amplification is PCR specificity; with any hybridization approach there will be uneven efficiency of selection (every probe has to work under a single hybridization condition, namely one hybridization temperature, hence the many iterations of the exome enrichment designs mentioned before). Another strength of this approach is the elimination of the overnight hybridization step: the total time from DNA to sequence-ready library is only 6 hours, versus the 2 days for the hybridization methods.
On the note of PCR specificity, way back in 2009 when I was selling the RainDance Technologies enrichment system, I had a conversation with Victor Velculescu and Ken Kinzler at Johns Hopkins, and they were very interested in RainDance producing a PCR-based enrichment library for the entire exome. At that time RainDance could not make the financial investment: the primer-pair synthesis alone (not including the labor involved in designing those primers or packaging them into droplets) would have run in the low millions of USD. I also estimated at that time that it would involve 10 or 12 enrichment runs per sample, which made it even less feasible.
From several perspectives, AmpliSeq Exome is a winner: PCR specificity without the need for specialized equipment, two exomes per P1 chip on the Ion Torrent Proton with 95% of the exome covered at greater than 20x, and a lower cost (on a per-sample basis, enrichment plus sequencing) than even the Illumina HiSeq + Nextera TruSeq Exome combination (not to mention no longer needing to wait for a 24- or 48-sample batch to even begin the 11-day run). On that last note, doing two runs per day, and two samples per run, means a 16-sample-per-week workflow. (Doing three samples per run gives lower coverage metrics, on the order of 87% coverage at greater than 20x, but equates to 24 samples per week.)
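The weekly throughput figures can be reproduced with a small sketch. One caveat: the per-week numbers above (16 and 24) only follow from two runs per day if you assume four sequencing days per week, so `RUN_DAYS_PER_WEEK = 4` below is my inference from those numbers rather than a stated specification, and the function name is purely illustrative.

```python
# Assumption (not stated in the text): four sequencing days per week,
# inferred because 2 runs/day x 2 samples/run x 4 days = 16 samples.
RUN_DAYS_PER_WEEK = 4


def samples_per_week(runs_per_day: int, samples_per_run: int,
                     run_days: int = RUN_DAYS_PER_WEEK) -> int:
    """Exomes sequenced per week for a given chip-loading scheme."""
    return runs_per_day * samples_per_run * run_days


# Two exomes per P1 chip (about 95% of the exome at >20x, per the text)
print(samples_per_week(runs_per_day=2, samples_per_run=2))  # 16

# Three exomes per P1 chip (about 87% at >20x, per the text)
print(samples_per_week(runs_per_day=2, samples_per_run=3))  # 24
```

Passing `run_days=5` instead would give 20 and 30 samples per week, so the trade-off between coverage and throughput scales the same way whatever schedule a lab actually runs.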
One last word regarding content: this can be tricky, as different vendors have differing definitions of what the exome means, in particular regarding additional content such as 3′ untranslated regions (UTRs), highly conserved regions, and non-coding RNA (ncRNA) that is transcribed but not translated. For AmpliSeq Exome, the content is >97% of the Consensus CDS (CCDS), maintained by the National Center for Biotechnology Information (NCBI).