Next-Generation Sequencing – its historical context

Photo of J. Craig Venter Inst. circa 2005 by jurvetson via Flickr.

Even though the history of next-generation sequencing is short (the 454 GS20 came out in 2005, the Solexa 1G in 2007, and the SOLiD 2 in 2008), there is a robust genomic revolution going on, with a fierce battle in the marketplace, plummeting costs, and soaring throughput. Whether Moore’s Law is being beaten by some 2.5-fold or by even more, there is no question that we are in the middle of burgeoning growth, with remarkable discoveries and new insights arriving just about every day.

Yet many are only vaguely familiar with next-generation sequencing, or may be familiar with ‘first-generation’ Sanger sequencing but much less so with next-gen. With a current NGS market on the order of USD $1 billion, and estimated growth of some 25% per year projected over the next five years, this is a market that demands attention.

To speak intelligently about next-generation sequencing, it helps to start with ‘current-generation’ sequencing: Sanger chain-termination sequencing detected via capillary electrophoresis (CE). The method has been around since the late 1970s, and I would note here that Fred Sanger shared a Nobel Prize for his effort in 1980. Automation coupled with fluorescent detection and CE separation enabled the Human Genome Project (spanning 1990 – 2004, though its heyday ran from 1998 or so through 2001), but it remained a very expensive method for sequencing entire genomes. The HGP was estimated to cost on the order of USD $3 billion, and Craig Venter’s genome in 2007 was variously estimated to cost USD $3 million to $4 million.

So with the extremely low cost of next-generation sequencing (in 2012 a human genome at a reasonable depth of 30x costs about $4,500 in reagents, and that is set to drop much lower with the advent of the Ion Torrent Proton and its associated Proton II chip in early 2013), why is Sanger sequencing via capillary electrophoresis still around, and even growing?

The short answer is that there is no realistic substitute.

A single set of primers can sequence 700-1000 bases of DNA at a very low per-sample cost (on the order of $2 to $5 USD depending on several factors). The error rate is very low, with accuracy estimated to be on the order of 99.95%, and its error model, based upon the work of Phil Green (the ‘Phred Quality Score’), has been extensively validated and refined in the 14 years since its original publication. Measured against these parameters, next-generation sequencing simply cannot compete.
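To make the Phred scale concrete, here is a minimal sketch that converts a per-base accuracy into a quality score and back. It assumes the standard definition Q = -10 * log10(P), where P is the probability that a base call is wrong; the function names are my own and purely illustrative.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (strictly between 0 and 1) to a Phred score."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    error_probability = 1.0 - accuracy
    # Standard Phred definition: Q = -10 * log10(P_error)
    return -10.0 * math.log10(error_probability)


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred score back to the probability of a wrong base call."""
    return 10.0 ** (-q / 10.0)


# The 99.95% accuracy figure quoted above for Sanger / CE sequencing.
sanger_q = accuracy_to_phred(0.9995)
print(f"99.95% accuracy corresponds to roughly Q{sanger_q:.0f}")

# Two commonly quoted benchmarks on the same scale.
for q in (20, 30):
    print(f"Q{q} means an error probability of {phred_to_error_probability(q):.4f}")
```

On this scale every 10 points is a ten-fold drop in the chance of a miscall, so the 99.95% figure lands a little above Q30.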

The difference is primarily one of accuracy (in general the accuracy of next-generation sequencing is on the order of 99%, or in other words, ~1% error instead of ~0.05%, some 20-fold higher), but also one of read length (in general NGS read lengths are on the order of 100-200 bases).
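The arithmetic behind that comparison can be sketched in a few lines. This is a back-of-the-envelope illustration only: it assumes errors are spread uniformly and independently along a read, which real platforms do not strictly obey, and the constant and function names are hypothetical.

```python
# Per-base error rates implied by the accuracies quoted in the text.
SANGER_ERROR_RATE = 0.0005  # 99.95% accuracy
NGS_ERROR_RATE = 0.01       # 99% accuracy


def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of miscalled bases in one read (uniform-error assumption)."""
    return read_length * error_rate


fold_difference = NGS_ERROR_RATE / SANGER_ERROR_RATE
print(f"NGS error rate is about {fold_difference:.0f}-fold higher than Sanger")

# Read-length ranges quoted in the text: 700-1000 bases for Sanger, 100-200 for NGS.
for label, lengths, rate in (
    ("Sanger", (700, 1000), SANGER_ERROR_RATE),
    ("NGS", (100, 200), NGS_ERROR_RATE),
):
    low, high = (expected_errors(n, rate) for n in lengths)
    print(f"{label}: {low:.2f} to {high:.2f} expected errors per read")
```

Under these assumptions a Sanger read of up to 1000 bases is expected to carry fewer miscalls than an NGS read a fifth of its length.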

There have been exceptions in the competitive world of NGS. SOLiD came out in 2008 with accuracy as its differentiating feature, at 99.94% (using an E. coli reference; human samples were a bit lower), but it required a two-base encoding / decoding scheme that added complexity to the analysis, and it had a difficult time in the marketplace (market share was estimated to be on the order of 20% over the past 3 years). Pacific Biosciences, a single-molecule sequencing platform (“third-generation”), offers an average read length on the order of 2,500 bases, but a low accuracy on the order of 85% (15% error), which has landed the company in a tangle of legal problems. The 454 FLX+, a recent enhancement to the Roche / 454 FLX, offers read lengths of around 700 base-pairs, but accuracy is still 99%. Customers have also reported problems with the on-site upgrade process, saying that Roche is asking for systems to be returned for the upgrade, on top of difficulties with short reads and low throughput on the FLX+.

Thus Sanger / CE remains the gold standard, even while next-generation sequencing has proven its usefulness. There are many complementary aspects and other completely novel applications that make NGS very appealing, and meanwhile Sanger sequencing will continue for years to come.



About Dale Yuzuki

A sales and marketing professional in the life sciences research-tools area, Dale is currently employed by Pillar Biosciences as a Global Marketing Manager. He represents Pillar across the East Coast, engages key customers for feedback to guide further product improvement and development, and is responsible for sales activities across the region. He also represents Pillar at tradeshows, writes on a blog for them, helps guide social media strategy and tactics, and keeps track of what is going on in the marketplace. For additional biographical information, please see his LinkedIn profile, and find him on Twitter @DaleYuzuki.