Complete Genomics and the Whole Genome Sequencing market

Complete Genomics is a startup business founded upon a particular idea – that the whole genome sequencing of human individuals is going to be industrialized and commonplace, with enough clinical utility to become the dominant application for next-generation sequencing. (Disclosure – I have no financial interest in this company; I am just an interested observer.)

Based upon this idea, millions of dollars were spent developing a unique sequencing-by-ligation method, with its own library and template preparation, involving small ‘nanoballs’ that adhere at high density to microlithographically-etched, patterned silicon substrates. The business model is services-only: Complete Genomics, unlike Illumina or Life Technologies, does not sell any of its systems to researchers. That fee-for-service model has another unique dimension, which is the sequencing of only human whole genomes.

Only human whole genomes are sequenced – no human whole exomes, no mouse whole genomes, no plants for the agricultural segment, no RNA for expression experiments, no methylated DNA to look at epigenetics. Thus Complete Genomics, from its inception, looked at the next-generation sequencing market and chose to make a single offering, one set to revolutionize medicine.

From what I’ve been able to gather (from various market research surveys), human Whole Genome Sequencing currently accounts for about 15% of the overall market. So by whatever metric one chooses for the size of the next-generation sequencing market as a whole (I use a $1B value for 2011; an estimated growth rate of about 20% puts the 2012 value of the overall NGS market at $1.2B), the whole-genome sequencing segment would represent about $150M of the 2011 figure. And of that $150M (doing some back-calculations and estimates here), $20M went to Complete Genomics in 2011, and perhaps $20M went to Illumina (these two have been locked in battle over the whole-genome sequencing services market, and the 50/50 split is just a guess, as Illumina does not break out its genome sequencing service business separately).

So looking at the research market, about $110M was devoted to whole-genome sequencing that was not done as a service by either Complete or Illumina. Assuming $3,300 per sample in consumables costs, that represents about 33,000 samples sequenced, in addition to the ~4,000 sequenced by Complete Genomics and another ~4,000 by Illumina, for a total of about 40,000 whole genome sequencing samples worldwide in 2011.
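The back-calculation above can be laid out as a short script. This is only a sketch of the arithmetic: every input is one of the rough estimates quoted in this post, and the function and constant names are made up for illustration.

```python
# Back-of-envelope estimate of 2011 whole-genome sequencing (WGS) volume.
# Every input is a rough estimate quoted in the post, not measured data.

NGS_MARKET_2011 = 1_000_000_000   # overall NGS market in USD (author's working value)
WGS_SHARE_PERCENT = 15            # WGS share of the overall market
COMPLETE_REVENUE = 20_000_000     # Complete Genomics WGS services, 2011
ILLUMINA_REVENUE = 20_000_000     # Illumina WGS services (a guess, per the post)
CONSUMABLES_PER_SAMPLE = 3_300    # consumables cost per genome, USD
SERVICE_SAMPLES_EACH = 4_000      # ~4,000 genomes each for Complete and Illumina


def estimate_wgs_samples():
    """Return the intermediate and final figures of the estimate."""
    # Integer arithmetic keeps the dollar figures exact.
    wgs_segment = NGS_MARKET_2011 * WGS_SHARE_PERCENT // 100
    # Spending on WGS that did not go to the two service providers.
    research_spend = wgs_segment - COMPLETE_REVENUE - ILLUMINA_REVENUE
    # Convert that spending into genomes at the assumed consumables cost.
    research_samples = research_spend / CONSUMABLES_PER_SAMPLE
    total_samples = research_samples + 2 * SERVICE_SAMPLES_EACH
    return {
        "wgs_segment": wgs_segment,
        "research_spend": research_spend,
        "research_samples": research_samples,
        "total_samples": total_samples,
    }


if __name__ == "__main__":
    for name, value in estimate_wgs_samples().items():
        print(f"{name}: {value:,.0f}")
```

Run as written, the sketch puts the research-lab volume at roughly 33,000 genomes and the worldwide total at roughly 41,000, in line with the ‘about 40,000’ figure.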

Looking back historically, the first individual whole genome, Craig Venter’s, was sequenced via CE Sanger sequencing only in 2007, followed shortly thereafter by the publication of James Watson’s genome via Roche / 454. Going from 2 individuals to 40,000 in the span of only four years is a truly remarkable achievement, a reflection of the pace of innovation and of market-force driven competition that has made these sample volumes affordable, as well as of the expanding frontiers of genomics and genetics. Looking at costs, the Venter genome was estimated at ‘several million dollars’ and the Watson genome at ‘several hundred thousand dollars’; now, as a service from Complete Genomics, the price per sample is on the order of $5,000 or less depending upon the volume. (As a point of comparison, the list price of Illumina’s reagents for a HiSeq 2000 is on the order of $3,300 per sample, but that does not include the depreciation of the instrument – a significant expense on a $690,000 device – yearly maintenance, overhead, labor etc.)

On that note, the instrument itself, assuming a 3-year straight-line depreciation schedule and including service over those three years, costs about $23,000 per month. Given that a HiSeq 2000 could do about six human whole genome samples in 11 days, that’s about 175 samples per year. Doing some math, a year of instrument cost (about $276,000) spread over those 175 samples adds roughly $1,580 to each one, so the $3,300 in reagents on a fully-loaded HiSeq (assuming an ‘uptime’ of 90% utilization over the entire year) now magically turns into a $4,880 per sample price.
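For anyone who wants to vary the assumptions, here is a minimal sketch of that per-sample calculation. The function names are hypothetical, and the throughput formula (runs per year times uptime times samples per run) is my assumption about how a figure like ‘about 175’ would be reached; the dollar inputs are the ones quoted above.

```python
# Fully-loaded cost per genome on a HiSeq 2000, using the post's estimates.

INSTRUMENT_COST_PER_MONTH = 23_000  # depreciation plus service, USD per month
REAGENTS_PER_SAMPLE = 3_300         # list-price reagents per genome, USD
SAMPLES_PER_RUN = 6                 # human whole genomes per run
DAYS_PER_RUN = 11
UPTIME = 0.90                       # fraction of the year the instrument is running


def samples_per_year(days_per_run=DAYS_PER_RUN,
                     samples_per_run=SAMPLES_PER_RUN,
                     uptime=UPTIME):
    """Estimate yearly throughput from run length, run size and uptime."""
    runs_per_year = 365 / days_per_run * uptime
    return runs_per_year * samples_per_run


def fully_loaded_cost_per_sample(yearly_samples,
                                 monthly_cost=INSTRUMENT_COST_PER_MONTH,
                                 reagents=REAGENTS_PER_SAMPLE):
    """Reagents plus each sample's share of a year of instrument cost."""
    instrument_share = monthly_cost * 12 / yearly_samples
    return reagents + instrument_share


if __name__ == "__main__":
    computed = samples_per_year()
    print(f"computed throughput: {computed:.0f} samples per year")
    print(f"cost at 175 samples: ${fully_loaded_cost_per_sample(175):,.0f}")
    print(f"cost at computed throughput: ${fully_loaded_cost_per_sample(computed):,.0f}")
```

With the rounded figure of 175 samples the result is about $4,877, which is where the $4,880 comes from. The unrounded throughput works out to about 179 samples and about $4,840 per sample, so the conclusion is not very sensitive to the rounding.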

All those economic calculations aside, the bigger question is this: is the whole genome sequencing market set to break out of a 15% share of the overall NGS market and become a ‘killer app’, a case of ‘we must sequence many many individuals in a clinical setting’? Well, perhaps not as originally envisioned, in part due to the economics and in part due to the science. A great number of papers are being published on genomics using NGS, and they are not limited to the genetic maladies that affect many in the population. (The 85% of NGS overall spending not on whole genome sequencing is spent on many other types of experiments – RNA expression, protein/DNA interactions, methylation of cytosines etc., and these applications will only continue to grow in the future along with the overall market.)

The choice between a Whole Genome Sequence and a Whole Exome Sequence is a debate that I’ve commented on before – there are pluses and minuses to argue for each side, and Complete Genomics only offers WGS. Many prospective customers rule Complete out because it simply does not offer exomes, which they prefer as giving more ‘bang for the buck’.

What other barriers keep the clinical application of WGS from driving incredible demand now? The reasons are manifold – from federal regulation (i.e. FDA in the U.S.), to what a ‘clinical genome’ actually means in terms of data quality, to agreement in nomenclature between variant databases, to reimbursement issues (again a U.S.-specific situation). And in terms of its usefulness, Liz Worthey of the Medical College of Wisconsin said at a conference earlier in the year that of 13 human genomes sequenced (understand that these were selected for having an undiagnosed disease with a suspected genetic cause), only 3 resulted in a definitive diagnosis. While it is certainly a nice addition (three pediatric patients and their families now have a diagnosis), the 20%+ ‘success rate’ matches well with what Dr. William Gahl has said for the NIH Undiagnosed Diseases program. (He was on the US television show ’60 Minutes’ recently; the 12-minute interview is worth watching.)

So these efforts are to be heralded and celebrated – we are witnessing a revolution in health care, one undiagnosed case at a time. However, it does not follow that there is a successful service business to be built around that. Complete Genomics is buffeted by dropping sequencing costs (and when the Ion Torrent Proton comes out, in particular the Proton II chip next spring, they will drop again) and by a limited market segment (remember that 15% number); the true value of the company is its informatics pipeline. So don’t be surprised should they decide to change their business model.

About Dale Yuzuki

A sales and marketing professional in the life sciences research-tools area, Dale currently is employed by Sysmex-Inostics USA as the Director of Marketing. He will help Sysmex-Inostics build out their liquid biopsy franchise (OncoBEAM and Plasma SafeSEQ) with market planning, positioning and branding as well as thought leadership, opinion-leader management, and sensing of market trends. He also represents Sysmex at tradeshows and other events. For additional biographical information, please see my LinkedIn profile here: and also find me on Twitter @DaleYuzuki.

3 thoughts on “Complete Genomics and the Whole Genome Sequencing market”

  • scbaker813

    Great article, Dale! Do you think CG can really change their business model to just bioinformatics services? Specifically, can their pipeline be translated to other platforms (Illumina and Ion Torrent)?

    • Dale Yuzuki

      Hi Shawn, thanks for the note.

      I do believe that their parallel processing (Bahram Kermani once told me several years ago how they can remarkably scale their whole-genome alignment as a square of their processors, an incredible feat IMHO) could be modified to accommodate other types of reads – anything would be easier than their (10bp + 10bp + 10bp + 5bp) x2 reads, with variable gaps between them, and an insert of several hundred bases. So you can imagine that while their alignment algorithm may need to be changed to optimize it for a longer read length, the real ‘secret sauce’ is in making the entire alignment much more efficient computation-wise.

      That all being said, if one person / group could solve this, it is just a matter of time before other people / groups could solve it, but perhaps not as well as the original method. I wonder what theoretical genius is out there now working on a new compression algorithm for data that will replace the use of Burrows-Wheeler for alignment. (Note – I don’t know whether GNOM uses BWA or not, just as an example).

      Over Twitter last week GNOM said their long mate-pair method is ‘in press’ in Nature, so that is something to watch for, for completely phased haplotyped genomes. The insert size has been mentioned in the past to be 100kb (!!), making the phasing a much easier task.

      Best – Dale

      • scbaker813

        That’s interesting – it certainly gives them hope. I’m very interested in reading about their phasing technology. Hopefully the paper will come out soon.