There is a lot of interest in what the Next Big Thing in next-generation sequencing will be. The case can be made that the clinical application of NGS (either targeted sequencing or WES or WGS for cancer genomics) will be that growth driver, but I suspect it will be the next generation of next-generation sequencing.
This next generation will not be third-generation sequencing (Helicos, PacBio and the upcoming Oxford Nanopore can all be grouped in the single-molecule, or third-generation, camp), but will be an iteration of the existing next-generation platforms. (Note that the distinction between platforms exists at the level of library → template preparation by amplification → sequencing of bases: third-generation technology sequences single molecules directly, so the template amplification step does not take place.)
This iteration follows the general rule of high technology development: better, faster and cheaper. Better in the sequencing sense has several dimensions: higher throughput (per run, per workday, or even per hour); higher accuracy; longer readlengths (which become more useful the longer they get, and not in a merely linear way); and greater ease of use (admittedly a qualitative judgement, but a researcher’s hands-on time is so valuable that it has to be taken into consideration). Faster and cheaper for sequencing are straightforward enough.
Time for a disclaimer here: I am a Life Technologies employee whose regular day-job interacts directly with the Ion Torrent team, and all the information I share here is public (I type nothing here that I would mind seeing on the front page of the local newspaper). This blog reflects my own opinions.
At a major Morgan Stanley investor meeting way back in January of 2012, Life Technologies / Ion Torrent announced the Ion Torrent Proton sequencer, and the ability of two subsequent chips (named coincidentally the Proton I and Proton II) to produce two exomes in four hours (Proton I), and a human genome in four hours (Proton II), for $1,000. This ‘path to the $1000 genome’ message is a remarkable one, as the technology has scaled so rapidly (in 2008 a $50,000 genome was published by Helicos, so we’re looking at a 50-fold decrease in five years, since early 2013 is when the Proton II chip is expected to become available). Can you think of anything else that has become 50-fold less expensive in the course of five years?
The next day Illumina announced an upgrade to the existing HiSeq 2000 system, the HiSeq 2500. Priced as a $50,000 upgrade to the $690,000 HiSeq 2000 (or $740,000 as a new standalone 2500 system), it promises a larger 120Gb throughput (about 1.5x to 2x the throughput of what the Proton II is expected to do) over a longer time (27 hours).
So for this comparison, I’ll cover the improvements in a reverse order: cheaper, faster, better; we’ll cover the dimensions of better in the last section.
Cheaper is of paramount importance. If a customer can’t afford to buy the system, maintain it, and run experiments on it, the vendor company won’t be able to sell it. The Ion Torrent Proton system (including OneTouch 2 automation of template preparation, and a new Proton Torrent Server to call bases) is priced in North America at $243K. For Illumina, a new HiSeq 2500 is $740K. Maintenance would be from 8% to 12% of the system price per year after year 1 (the first year’s maintenance is included in the purchase price; the range reflects different service response levels). At 8%, one year’s service costs about $19.4K for the Ion Proton and $59.2K for the HiSeq 2500.
Another aspect is the run costs, and here we’ll make some conservative estimates on what the per-Gb costs will be, as Ion Torrent has not released a specification on the Proton I and Proton II throughput, stating overall metrics in terms of well numbers but no firm gigabase per run figures.
For Ion Torrent, I will only use the Proton II numbers (available in March or April 2013, six months after the Proton I launch, which is expected in late September). A low estimate of throughput is 60Gb per run, which corresponds to a human genome at 20x coverage (roughly 3Gb x 20), and is the number I’ll use for this set of financial calculations. The template preparation and sequencing for a Proton II chip has been stated to cost $1,000. Calculating a per-Gb price, $1,000 / 60Gb = $16.67 per Gb.
For the HiSeq 2500, the 120Gb run has been stated to cost more per Gb than the current HiSeq 2000. For the purposes of this modeling I will use a 20% premium over the HiSeq 2000 per-Gb price for running in ‘2500 fast mode’. A HiSeq 2000 run takes about 11 days and yields 600Gb, at a cost of approximately $23K. (In case you were wondering about this higher per-run price, Illumina raised their prices on consumables a few months ago, on the order of 8-10%, without any increase in yield on the instrument, which made not a few customers unhappy.) Thus the HiSeq 2000 per-Gb price is $23,000 / 600 Gb = $38.33 per Gb. At a 20% premium, the HiSeq 2500 per-Gb price is $46.00 per Gb. At this rate, a single HiSeq 2500 run of 120Gb costs $5,520.
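For anyone who wants to plug in their own quotes, the per-Gb arithmetic above can be sketched in a few lines of Python. The function name `cost_per_gb` is just an illustrative helper, and the 20% premium is my modeling assumption from the paragraph above, not a published Illumina price.

```python
def cost_per_gb(run_cost_usd: float, yield_gb: float) -> float:
    """Reagent cost per gigabase for a single run."""
    return run_cost_usd / yield_gb


# Proton II: $1,000 per run at an assumed (conservative) 60Gb yield.
proton_ii_per_gb = cost_per_gb(1_000, 60)

# HiSeq 2000: roughly $23K per 600Gb run.
hiseq_2000_per_gb = cost_per_gb(23_000, 600)

# HiSeq 2500 fast mode: assumed 20% premium over the HiSeq 2000 per-Gb price.
FAST_MODE_PREMIUM = 0.20
hiseq_2500_per_gb = hiseq_2000_per_gb * (1 + FAST_MODE_PREMIUM)
hiseq_2500_run_cost = hiseq_2500_per_gb * 120  # 120Gb per fast-mode run

print(f"Proton II:  ${proton_ii_per_gb:.2f} per Gb")
print(f"HiSeq 2000: ${hiseq_2000_per_gb:.2f} per Gb")
print(f"HiSeq 2500: ${hiseq_2500_per_gb:.2f} per Gb, ${hiseq_2500_run_cost:,.0f} per run")
```

Changing the yield or premium at the top shows quickly how sensitive the comparison is to those two guesses.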
Given these conservative assumptions on the per-Gb price, the differences in purchase, maintenance and run costs truly add up when you look at a three-year total cost of ownership calculation (abbreviated TCO).
Over three years, the TCO is the purchase price of the instrument, plus two years’ service (the first year is included in the purchase price), plus reagents for the respective instrument at a 30% utilization rate. This works out as follows: 250 working days per year x 0.3 = 75 run-days, or 150 Proton II runs per year (two per run-day) or 75 HiSeq 2500 runs per year. Conveniently, a single HiSeq 2500 run = 2x Proton II runs, since I’m assuming a 120Gb HiSeq 2500 and a 60Gb Proton II, so the total Gb per year is the same (150 x 60 = 75 x 120 = 9,000 Gb = 9 Tb). To get really technical, a 27-hour run should equate to about 11% fewer runs, or 67 runs instead of 75, but for the sake of argument I’ll treat a 27-hour run as one ‘day’ and keep the numbers as-is, so that each instrument is putting out the same 9 Tb over its 150 or 75 runs in a year.
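The run counts in this paragraph can be reproduced with a small sketch. The helper `runs_per_year` and the constant names are mine, introduced only for illustration; the 250 working days, 30% utilization and per-run yields are the assumptions stated above.

```python
WORKING_DAYS_PER_YEAR = 250


def runs_per_year(utilization: float, runs_per_day: float) -> float:
    """Runs per year, given the fraction of working days the instrument is in use."""
    return WORKING_DAYS_PER_YEAR * utilization * runs_per_day


proton_runs = runs_per_year(0.30, runs_per_day=2)  # two 4-hour runs per run-day
hiseq_runs = runs_per_year(0.30, runs_per_day=1)   # a 27h run treated as one 'day'

proton_gb_per_year = proton_runs * 60   # assumed 60Gb per Proton II run
hiseq_gb_per_year = hiseq_runs * 120    # 120Gb per HiSeq 2500 fast-mode run

# Stricter variant: scale the HiSeq run count by 24h / 27h.
hiseq_runs_strict = hiseq_runs * 24 / 27

print(round(proton_runs), round(hiseq_runs))
print(round(proton_gb_per_year), round(hiseq_gb_per_year))
print(round(hiseq_runs_strict))
```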
So the TCO calculation over three years is as follows:
Ion Proton: $243K (instrument) + 2x $19.4K (two years’ service) + (150x $1K) x 3 years (three years of sequencing runs) = $731,800.
HiSeq 2500: $740K (instrument) + 2x $59.2K (two years’ service) + (75x $5,520) x 3 years (three years of sequencing runs) = $2,100,400.
This is almost a factor of three, for 27 Tb of sequence generated over three years at 30% utilization. If the utilization rate were higher (say 50% or even 75%), the gap in absolute dollars would widen considerably, although the multiple itself would settle toward the ratio of the reagent costs alone (about 2.8x), if you look at the equation above.
Of course, if a customer already has a HiSeq 2000, the case for the upgrade is much easier to make. Doing the same math as above, and presuming that all work at 30% utilization is in ‘2500 fast mode’, the upgraded HiSeq 2000/2500 comes to $50K + 2x $59.2K + (75x $5,520) x 3 years = $1,410,400, which is still almost a factor of two more than the Proton.
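The two new-instrument totals can be checked with a short sketch. The function `three_year_tco` is a hypothetical helper of my own, and it takes the rounded annual service figures used in this post ($19.4K and $59.2K) rather than recomputing 8% of the list price.

```python
def three_year_tco(instrument_usd: int, annual_service_usd: int,
                   runs_per_year: int, cost_per_run_usd: int,
                   years: int = 3, warranty_years: int = 1) -> int:
    """Purchase price + paid service years + reagents over the ownership period."""
    service = annual_service_usd * (years - warranty_years)
    reagents = runs_per_year * cost_per_run_usd * years
    return instrument_usd + service + reagents


# Inputs are the estimates from this post, at 30% utilization.
proton_tco = three_year_tco(243_000, 19_400, runs_per_year=150, cost_per_run_usd=1_000)
hiseq_tco = three_year_tco(740_000, 59_200, runs_per_year=75, cost_per_run_usd=5_520)

print(f"Ion Proton: ${proton_tco:,}")
print(f"HiSeq 2500: ${hiseq_tco:,}")
print(f"Ratio:      {hiseq_tco / proton_tco:.2f}x")
```

Swapping in different run counts or service rates is a quick way to test how the comparison holds up under your own lab’s assumptions.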
Onward to the second major factor of improvement – speed. On a runtime basis, from a ready-to-run library to sequence, the Ion Proton is at 8 hours (about 4h for the automated OneTouch 2, and 4h for the sequencing). The HiSeq 2500 is at 27 hours, with automated cluster generation on-instrument (no more need for the cBot instrument) and sequencing.
So there you have cheaper and faster.
And now for the third section, ‘better’. Breaking down ‘better’ into throughput, accuracy, readlength, and ease-of-use, we have the following:
For the Ion Torrent Proton, throughput can be considered 2 runs per 8-hour day, since the template prep and sequencing are de-coupled. That gives 120Gb / 8h = 15Gb / hour, and 10 runs per week of five 8-hour days, or 600Gb / week. For the HiSeq 2500, throughput is hampered by the 27h runtime, so it can only do four runs in that same five-day week, or 120Gb x 4 = 480Gb / week, and 480Gb / 40-hour workweek = 12Gb / hour. (It is not practical to break up a 27h runtime into neat 8-hour workday chunks, so here the weekly yield is simply divided by a 40-hour workweek.) 600Gb / week on the Proton compared to 480Gb / week on the 2500 is a 25% advantage.
For the Ion Torrent PGM platform against the Illumina MiSeq platform, a bit of a marketing war has erupted, with both sides trading barbs. I admit my own bias, and for today suffice it to say that each platform’s technology has an accuracy rate on the order of 99%, with the majority of the bases on each run (70-80% of the bases generated) having Q30 quality scores. (For those not familiar with Q-scores, they go back to the Phred scale of quality established with Sanger sequencing; Wikipedia has a nice description of it.) The majority of the PGM error consists of homopolymer errors, while the majority of the error in Illumina’s chemistry is substitution, although recent work has shown that Illumina’s chemistry also has homopolymer difficulties that show themselves as substitution errors.
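As a quick check of the weekly throughput numbers, here is a minimal sketch. The constant names are my own, the 60Gb Proton II yield is the same estimate used in the cost section, and the run counts per week are the ones argued for above.

```python
HOURS_PER_WORKWEEK = 40  # five 8-hour days

# Proton: two de-coupled runs per day, five days a week, 60Gb each (assumed yield).
proton_gb_per_week = 10 * 60
# HiSeq 2500: four 27-hour runs fit into the same week, 120Gb each.
hiseq_gb_per_week = 4 * 120

proton_gb_per_hour = proton_gb_per_week / HOURS_PER_WORKWEEK
hiseq_gb_per_hour = hiseq_gb_per_week / HOURS_PER_WORKWEEK
proton_advantage = proton_gb_per_week / hiseq_gb_per_week - 1

print(proton_gb_per_week, hiseq_gb_per_week)
print(proton_gb_per_hour, hiseq_gb_per_hour)
print(f"{proton_advantage:.0%}")
```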
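For readers new to Q-scores, the Phred scale is defined as Q = -10 x log10(P), where P is the probability that a base call is wrong. A minimal sketch of the conversion in both directions (the function names are mine, purely for illustration):

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call is wrong, given its Phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for a given per-base error probability."""
    return -10 * math.log10(p)


for q in (10, 20, 30):
    p = phred_to_error_prob(q)
    print(f"Q{q}: 1 error in {round(1 / p):,} bases ({1 - p:.1%} accuracy)")
```

So a Q30 base has a 1-in-1,000 chance of being wrong, and a Q20 base a 1-in-100 chance.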
I won’t give the accuracy advantage to either side, for the sake of this discussion.
For readlength, the Ion Torrent Proton is expected to launch at 200 base-pair readlengths, while the HiSeq 2500 is expected at 150 base-pairs. Looking at the current development path of the PGM and MiSeq (to foreshadow expected improvements on the Proton and 2500 respectively), the PGM expects to deliver 400 base-pair readlengths by the end of 2012, while the MiSeq is looking at 250 base-pairs sometime this summer. (Although as of this writing ‘this summer’ is already halfway over, so it may roll out sometime this fall instead.)
For ease of use, the Ion Torrent Proton has an automated template prep (the Ion Torrent OneTouch 2) and an automated enrichment system (called the OneTouch 2 ES); both do require some hands-on intervention, although it is on the order of 10 or 15 minutes apiece. The HiSeq 2500, like the MiSeq, will perform the template prep (called cluster building in Illumina jargon) automatically on the instrument.
So there you have it, and since I had a few extra minutes on a Sunday afternoon I put together the following handy graphic:
Footnotes: Price per HiSeq 2500 run is an estimate, calculated as a 20% premium over the existing approximate HiSeq 2000 list price. Proton throughput is an estimate for the Proton II chip, expected in the spring of 2013; no specification for this chip has been established as of this writing.
So what conclusion can we draw? System price / affordability is the biggest differentiator across this list, and the TCO calculations are eye-opening. You can quibble all you want about which is easier to use or more accurate, but if you can’t afford the equipment in the first place (or afford to maintain it) in a time of tight budgets, the point is moot.