The whole-exome vs. whole-genome sequencing debate

By Sarah Kusala via {a href="http://commons.wikimedia.org/wiki/File:In_solution_capture.png"}Wikimedia Commons{/a}

An enterprising salesperson from Complete Genomics used this newfangled social media thing called LinkedIn to make her mark on the world (perhaps) by posing a discussion question. (It was over at the ‘Genome Interpretation’ group, in case you were wondering.) Under the title “The last days of exome sequencing”, she asked whether exome sequencing’s days were numbered.

Well, far from it.

She (working for Complete Genomics, which as of this writing is still in business doing only human whole genomes) may have a case of seeing everything through a certain lens: the idea that the 6-billion-base human diploid genome will be routinely sequenced in toto for everyone. I know of many customers who would have used Complete Genomics if they offered such a service, and one of their founders makes his case here in favor of whole genomes versus whole exomes.

So is it, as Radoje Drmanac would have it, the beginning of the end of exome sequencing? I’d venture to say it’s the beginning of the beginning. I learned recently that Ambry Genetics in California is now offering Clinical Diagnostic Exome sequencing from their CLIA-certified lab, the first company to offer it. Baylor in Houston also offers a clinical exome test, 23andMe in Mountain View has started an offering, and GeneDx in nearby-to-me Gaithersburg, MD is also working on launching an exome test.

Why do the commercial genetic test providers go the route of whole-exome sequencing (WES), instead of whole-genome sequencing (WGS)?

It’s not the fact that WGS will give many more variants across the 94% or so of the accessible human genome (by ‘accessible’ I mean able to be sequenced and mapped uniquely) compared to the 1-2% of the genome that codes for protein. It’s not the fact that WGS can give copy number variation data in addition to single nucleotide variation data. It’s not the fact that WGS can also give (with the appropriate DNA library construction method) complex chromosomal rearrangement data.

It’s the fact that, with all this additional information, not very much of it (if any) is ‘clinically relevant’. (This paper mentions the fact that any study that discovered a causal variant for a Mendelian disease or syndrome via WGS would have discovered the variant using WES, and there is also evidence that, due to the enrichment and higher sequence coverage inherent in WES, mutations can be picked up via WES that WGS will miss.) (Remember, WES is normally at an average 100x coverage, while WGS is a fraction of that, on the order of 30x.) What good is it to sequence and store some 100 gigabases for WGS (100Gb in the jargon, ‘gigabases’ rather than the computer hardware jargon ‘gigabytes’ for hard drive space) instead of 5Gb for WES, 1/20th of the data, if that additional 95Gb isn’t all that useful?
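For anyone who wants to see where the 100Gb and 5Gb figures come from, here is a minimal back-of-the-envelope sketch. The 100x and 30x coverage values are the ones quoted above; the genome size (about 3.1 billion bases) and the exome capture target (about 50 million bases) are my assumptions for illustration, and real capture kits vary. It also ignores off-target reads, so treat the result as a rough estimate only.

```python
# Rough sequencing data volume: target size (bases) x mean coverage.
# GENOME_BASES and EXOME_TARGET_BASES are illustrative assumptions,
# not figures taken from any specific kit or reference build.
GENOME_BASES = 3_100_000_000      # assumed haploid human genome size
EXOME_TARGET_BASES = 50_000_000   # assumed exome capture target size


def gigabases(target_bases, mean_coverage):
    """Return the approximate sequence yield needed, in gigabases (Gb)."""
    return target_bases * mean_coverage / 1e9


wgs_gb = gigabases(GENOME_BASES, 30)         # WGS at ~30x
wes_gb = gigabases(EXOME_TARGET_BASES, 100)  # WES at ~100x

print(f"WGS: ~{wgs_gb:.0f} Gb, WES: ~{wes_gb:.0f} Gb, ratio ~{wgs_gb / wes_gb:.0f}x")
```

Under these assumptions WGS comes out a little under 100Gb and WES at about 5Gb, which is where the roughly 20-fold difference comes from.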

Remember that the additional 95Gb is useful in some research contexts, as already mentioned above. But clinical use is not research: it is a patient being referred by a clinician in order to discover and diagnose simultaneously.

Here are a few real-world examples of how this technology is being used for individual cases in a powerful way. The Milwaukee Journal-Sentinel won a Pulitzer for this 3-part series, “One in a Billion: A boy’s life, a medical mystery”, about a boy named Nicholas Volker. It’s a heart-breaking and heart-warming story, well worth the reading. NPR ran this short piece on Alexis and Noah Beery, whose rare condition was diagnosed via WGS. (Nicholas was sequenced using WES and the 454 FLX, terribly expensive at the time it was done in 2009, on the order of $70K or so. The Beery twins were sequenced using WGS on a SOLiD at Baylor, at the time in 2010 probably $30K for the pair.) (I would also note that the Beery twins’ father, Joe Beery, started at Life Technologies, moving over from US Airways, partly due to the idea that perhaps some of our life science research tools could be used to determine the cause of his children’s health difficulties. Turns out he was right!) Here is a nice YouTube video narrated by Joe. There are others to share, such as from the NIH / NHGRI Undiagnosed Disease Program, but these two should suffice.

One of the most important differences between WGS and WES, however, is the relative cost of the two approaches. It can be argued that with the ‘$1,000 Genome’ on the horizon (via Ion Torrent Proton, no less), WES will be rendered obsolete, as price is no longer the barrier it once was. However, what people do not quickly grasp is that even as sequencing in general drops precipitously in price, WES still costs 1/20th of what WGS costs in terms of the sequencing data involved. There are additional WES costs in terms of sample preparation / exome enrichment, but these costs have declined almost as dramatically as the sequencing costs have, due to intense market competition and technological advances.

For example, in early 2009 when I worked for RainDance Technologies, the SureSelect Whole Human Exome selection kit cost on the order of $1,100 per sample for enrichment, at a time when WGS was on the order of $15,000. Now in early 2012, Illumina’s TruSeq Whole Human Exome selection kit costs on the order of $75 per sample for enrichment, when WGS is on the order of $4,500. As WGS has gotten cheaper, the enrichment has gotten cheaper in parallel, and of course as WGS gets cheaper so does WES. We have a situation today where the WES sequencing is about $250/sample, and the enrichment is no longer $1,100 but rather $75, making it a $325/sample proposition.

Another way to look at it: with a given budget (take a number, say $45,000), a scientist can get either 10 WGS samples or 138 WES samples. Having 128 more samples for the same budget makes the calculus easy.
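If you want to plug in your own budget or quotes, the arithmetic above can be sketched in a few lines. The prices are the early-2012 ballpark figures quoted in this post; the function name is just something I made up for illustration, and the sketch assumes whole samples only with no volume discounts or analysis costs.

```python
# Early-2012 ballpark prices quoted in the post (US dollars per sample).
WGS_COST = 4_500          # whole-genome sequencing
WES_SEQUENCING = 250      # sequencing portion of a whole exome
WES_ENRICHMENT = 75       # exome enrichment kit


def samples_for_budget(budget, cost_per_sample):
    """Number of whole samples a budget covers (hypothetical helper)."""
    return budget // cost_per_sample


wes_cost = WES_SEQUENCING + WES_ENRICHMENT   # 325 per sample
budget = 45_000

n_wgs = samples_for_budget(budget, WGS_COST)   # 10
n_wes = samples_for_budget(budget, wes_cost)   # 138

print(f"${budget:,} buys {n_wgs} WGS or {n_wes} WES samples "
      f"({n_wes - n_wgs} more)")
```

Swapping in a different budget or a newer price quote shows how the gap moves as both approaches get cheaper.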

Lastly, a newly published paper in Science (available here as an early ‘express’ edition) looks at 2,440 individual exomes to examine rare functional variation in populations and their association with disease. One conclusion? “We show that large sample sizes will be required to associate rare variants with complex traits.” So there’s going to be a large number – a very large number – of whole exomes to sequence in the years to come.

Next Generation Sequencing, Marketing, and the Genomic Revolution
