The cost of next-generation sequencing continues its downward trajectory, routinely outpacing Moore’s Law by an estimated 3x. The cost-per-megabase curve started to bend downward significantly around 2007, when the Solexa 1G started selling in volume and gave the 454 GS20 (as it was known then) the first competition for massively parallel sequencing the market had seen.
In those days, each technology platform warranted an entire publication for a single genome; the first was James Watson’s (appropriate, as he was the co-discoverer of the structure of the DNA helix) in 2008 on the 454 platform. The time it took (4.5 months) and the cost (USD $1.5M) were revolutionary as of 2008; yet only a few months later Illumina would announce the completed genome of a Yoruba individual (from the International HapMap project), and BGI (in the same November 2008 Nature edition) would announce the completed genome of a Han Chinese individual, in about the same time and at a cost (USD $500K) that was also revolutionary. (Timing of journal publications is only an approximate measure of the marketplace, as many months can elapse from the generation of the data to submission of a manuscript to final publication in print.)
One would expect that by August of 2009, a $50K genome produced by a new single-molecule technology would generate buzz and interest in a new startup company (the individual sequenced was company founder and scientific pioneer Stephen Quake). But even by then (about 9 months later), the $50K cost was dismissed by the marketplace: the cost of sequencing, on top of installation headaches and stability issues, was too high for the value of the sequence generated.
This startup company, Helicos, went public in May 2007 at $8.48, hit a high of $17.44 in January 2008, and by the end of 2008 had crashed to $0.44. Its stock now trades at $0.06, and Helicos ceased commercial operations some time ago.
Yet in August 2009, this single-molecule technology sequenced a whole genome for $50K, when less than a year before a whole genome cost $500K. There was one problem (of several that Helicos had): by that point in time, a whole genome at similar coverage and accuracy on competing platforms cost about two-thirds of that price, or about $33K.
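To put those price points side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the dollar figures and dates quoted above. The 24-month doubling period for Moore’s Law is an assumption on my part (it is commonly quoted as anywhere from 18 to 24 months), and given the caveat about publication timing, the nine-month interval is approximate. It is not meant to reproduce the 3x estimate mentioned earlier, only to show the scale of the gap.

```python
# Back-of-the-envelope comparison of genome cost decline vs. Moore's Law.
# Dollar figures and dates are the ones quoted in this post.
# ASSUMPTION: Moore's Law is modeled as a 2x improvement every 24 months.

MOORE_DOUBLING_MONTHS = 24  # assumed doubling period, not from the post


def fold_drop(old_cost, new_cost):
    """How many times cheaper the new cost is than the old one."""
    return old_cost / new_cost


def moore_fold(months, doubling_months=MOORE_DOUBLING_MONTHS):
    """Fold improvement a Moore's Law pace would deliver over `months`."""
    return 2 ** (months / doubling_months)


months_elapsed = 9  # roughly November 2008 to August 2009
observed = fold_drop(500_000, 50_000)   # $500K genome -> $50K genome
expected = moore_fold(months_elapsed)   # what a Moore's Law pace would give

# The competing price quoted above: about two-thirds of Helicos' $50K.
competitor_cost = 50_000 * 2 / 3

print(f"Observed cost drop over {months_elapsed} months: {observed:.1f}x")
print(f"Moore's Law pace over the same period: {expected:.2f}x")
print(f"Competing genome price: about ${competitor_cost:,.0f}")
```

Even with the generous uncertainty in the dates, a tenfold drop in well under a year is far beyond what a Moore’s Law pace would deliver, and Helicos was still about 50% more expensive than the alternative.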
Going back to 2007, Helicos had very smart scientists (I met a few Helicos R&D folks when I was at RainDance in 2009), experienced commercial leadership, and a unique technical position. But what they ran into was a fundamental issue: at the level of single molecules, you find out the hard way how dirty everything is. In other words, 99.99% purity means that whatever is in the remaining 0.01% can break the chemistry in surprisingly rude ways. On top of that, their technical implementation of the imaging meant that each Helicos sequencer had to be installed on 400 pounds of Vermont granite, which isn’t easy to install (or transport). And on top of everything else, the systems proved to be expensive and very difficult to maintain (they were listed for sale at USD $1.35M at first, and the price subsequently dropped to $1M in later years). Their first customer, Expression Analysis, is a sequencing and genotyping service laboratory in North Carolina; when I visited in 2009, their HeliScope system sat unplugged in an unused corner, and I was shown a broken 16-channel flowcell.
Looking at this purely from a product perspective, the HeliScope had difficulty differentiating itself in the marketplace. The instrument price was one hurdle; a seven-figure price tag limits market reach (there are perhaps 20 research institutions in the world that could and would pay USD $1M for a research device). The applications for this sequencer did not differentiate it at all (Helicos was a short-read technology, on the order of 24-70 bases), and its accuracy was plagued by a ‘dark base’ problem. (Since detection was light-based, any unlabeled nucleotide present, even at 0.01%, would preferentially incorporate into the growing DNA strand, leaving random ‘N’ bases inserted into the sequence.) The sample prep for library creation was easier than on other platforms (they used a poly-A tail to bind to their poly-T flowcells), but other than that they had to compete in a marketplace with less expensive systems (on the order of 1/3 to 1/2 the cost). Another advantage, 20 gigabases per run, was impressive in the early days of 2008, but not for long: the Genome Analyzer would soon enough hit that mark and exceed it (it can now do over 50 gigabases per run), and the Illumina HiSeq now generates about 600 gigabases per run.
As a technology, single-molecule sequencing offers something of a holy grail for genetic analysis: very long reads. Helicos’ technology, however, could only do short reads, which put it on the same playing field as the rest of next-generation sequencing. As a pioneer, Helicos had a market niche to itself, but could not capitalize on that promise and scale the technology to keep pace with an aggressive market.
As far as I am aware, there is only one Helicos instrument still running, at the Dana-Farber Cancer Institute in Boston. They put together a poster at this year’s AGBT describing their experience going from a Helicos system purchased in 2009 to a benchtop sequencer, a MiSeq from Illumina.