February 2008, Marco Island, Florida – an exciting time in the world of NGS. The first pioneering papers using short-read sequencing were being published, covering what are now everyday applications – ChIP, RNA-Seq, small RNA, the first whole genomes.
A small startup had the last slot of the conference, on a Saturday after a busy four days. Hugh Martin takes the stage, starts to present, and basically steals the show. He describes a completely new method of sequencing, using DNA polymerase tethered to a nano-well, and shows some proof-of-principle data. It is capable of producing 5,000 – 25,000 bp reads at a rate of 10 bases per second, with 25 bp/sec in development and room expected to speed up to 50 bp/sec, the speed of biology. No moving parts, and they anticipate a rate of 100 Gb per hour, which would enable a full draft genome in 15 minutes (30Gb, i.e. a 3Gb human genome at 10x coverage), all at a cost of $100.
Needless to say everyone in the room was blown away. The Genome Analyzer (a renamed Solexa 1G, named for its 1G throughput in 2007) and SOLiD 1 were in their first iterations, with 1-3Gb of throughput after a 7-10 day run, so the thought of 30Gb of throughput in 15 minutes captured the imagination. To this day people still remember that promise.
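To put the promise next to the instruments of the day, here is a minimal back-of-the-envelope sketch using only the figures quoted above (100 Gb per hour, a 30Gb draft genome, and 1-3Gb per 7-10 day run). The function and variable names are my own, chosen for illustration, and the runs are assumed to operate around the clock.

```python
HOURS_PER_DAY = 24


def minutes_to_yield(target_gb, rate_gb_per_hour):
    """Minutes needed to produce target_gb at a constant rate."""
    return target_gb / rate_gb_per_hour * 60


def run_rate_gb_per_hour(yield_gb, run_days):
    """Average hourly output of a run that yields yield_gb over run_days."""
    return yield_gb / (run_days * HOURS_PER_DAY)


promised_rate = 100.0    # Gb per hour, as presented in 2008
draft_genome_gb = 30.0   # 3 Gb genome at 10x coverage

# Time the promised rate would need for a 30 Gb draft genome.
minutes_needed = minutes_to_yield(draft_genome_gb, promised_rate)

# Best and worst case for a 2008 instrument: 3 Gb in 7 days, 1 Gb in 10 days.
best_2008 = run_rate_gb_per_hour(3.0, 7)
worst_2008 = run_rate_gb_per_hour(1.0, 10)

print(f"30 Gb at 100 Gb/hour takes {minutes_needed:.0f} minutes")
print(f"2008 instruments: {worst_2008:.4f} to {best_2008:.4f} Gb/hour")
print(f"Promised speed-up: {promised_rate / best_2008:,.0f}x "
      f"to {promised_rate / worst_2008:,.0f}x")
```

Taken literally, 100 Gb per hour gives 30Gb in about 18 minutes rather than 15, so the headline figure was rounded. Either way the promise was three to four orders of magnitude beyond what anyone in the room had on their bench.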
Fast forward to 2010, a big year for PacBio (PACB). They went public late last year and have $280M in the bank, after huge VC investment and having burned through >$300M. They posted a first year loss of $140M in 2010 and a second year loss of $109M in 2011, compared to a prior year loss of $88M in 2009. But it is all about the product, and their customers’ experiences have not lived up to expectations.
A single molecule sequencer that requires 500ng of input material? A $795K instrument that weighs 1800 lbs and is over six feet long? Stories about a customer spending $200K to reinforce the underlying floor as well as widening doorways to get the instrument in place? Accuracy in the range of 85%? Average reads on the order of 850-1500bp? The top 5% of reads averaging 2-3kb (with some molecules going a lot longer) gets a lot of press – but how many applications will require this kind of throughput?
Bert Vogelstein, a prominent Hopkins cancer researcher, once gave an interview on why he became a researcher, and he said that scientists have great toys to play with. (If you’re interested in the 1997 interview you can read it here.) But here is a new tool, and a new approach, that hit the market in early-access mode in 2010 and has underwhelmed. Poor accuracy is probably the biggest problem, followed by the high input requirement.
Thus the title – the first derivative rate of change. With any system that’s in early access, you roll out improvements as soon as they can be tested reliably. And here is where PacBio has been struggling: some early-access pilot projects with key customers (early 2010) had the error rate on the order of 16-18%, and over the rest of that year, as early-access instruments went out (11 in total, only in North America, including one near where I live at the NCI in Gaithersburg MD), that error profile didn’t change. Only recently did they announce their C2 chemistry improvement, which bumps up the accuracy and throughput, but for many customers I spoke to recently it isn’t enough.
A colleague of mine at LifeTech was working at the WashU Genome Center at the time the Solexa startup started shipping their first early-access 1G systems (this was in the summer of 2006). The first reads were a paltry 15 bases, but soon were 25 bases long, usable for a lot of applications. The still-early-access systems that shipped later that year were not quite ‘1G’ (i.e. 1 gigabase), more like 700Mb at 35 base-pairs, yet no one who had one of those early systems took Solexa up on an offer to return it based on the 1 gigabase specification. Soon, in early 2007, the 1 gigabase threshold was surpassed, and the Solexa instrument was on its way. This is demonstrable ‘room for improvement’, shown in the hands of customers.
A PacBio presentation I attended at ASHG in Montreal said ‘the system has room for improvement’, but I’ll believe the improvement when customers see it. Accuracy is what killed the Helicos HeliScope – the famous ‘dark base’ problem, which they were unable to solve. That is another example of a platform that did not improve: the rate of change was too slow for customers (and the market) to accept.
And to contrast – it took 10,000 runs of the 454 FLX to get its readlength from 100bp to 200bp. With Ion Torrent a much different picture emerges: the doubling of readlength took on the order of 12 months. I’ll have to check, but if I remember correctly it took about two years for the GS20 to double its readlength to 200bp.
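One way to put a number on 'rate of change' is a doubling time: how long a platform would need to double a spec if it kept improving at the pace seen between two data points. The sketch below assumes smooth exponential improvement, which real platforms only roughly follow, and the helper names are hypothetical. The 454 and Ion Torrent inputs are the rough figures from this post. The Solexa interval of four months is purely an assumption for illustration, since all I have is 'later that year' and 'early 2007'.

```python
import math


def doubling_time(v0, v1, elapsed_months):
    """Months needed to double, assuming exponential growth from v0 to v1."""
    growth_per_month = math.log(v1 / v0) / elapsed_months
    return math.log(2) / growth_per_month


def yearly_factor(doubling_months):
    """Improvement factor per 12 months implied by a doubling time."""
    return 2 ** (12 / doubling_months)


# 454: 100bp -> 200bp in about two years (from memory, unchecked).
t_454 = doubling_time(100, 200, 24)

# Ion Torrent: readlength doubled in roughly 12 months.
# The absolute readlengths are placeholders; only the ratio matters.
t_ion = doubling_time(1, 2, 12)

# Solexa throughput: 700Mb -> 1000Mb.
# ASSUMPTION: a 4-month gap; the exact dates are not given in the post.
t_solexa = doubling_time(700, 1000, 4)

for name, months in [("454", t_454), ("Ion Torrent", t_ion), ("Solexa", t_solexa)]:
    print(f"{name}: doubles every {months:.1f} months, "
          f"{yearly_factor(months):.2f}x per year")
```

The absolute numbers matter less than the comparison: a platform that doubles every year pulls away quickly from one that doubles every two.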
The first derivative of change is all-important.