Next Generation Technologist

Next Generation Sequencing, Marketing, and the Genomic Revolution

February 9, 2015
by Dale Yuzuki

Commentary from Behind the Bench

Right about one calendar year ago I was asked to start writing for a new blog for the Genetic Analysis division of Thermo Fisher Scientific. Called ‘Behind the Bench’, it took a few months to get off the ground, and started posting right at the end of May 2014.

A few personal comments on what I’ve learned this past year:

  • Finding post ideas is not a problem when your own customers are doing fascinating things
  • The mechanics of legal and regulatory approval can be daunting, but I have plenty of company among others who have similar constraints
  • Producing nice-looking video takes a lot of effort and expertise, but is so much easier (and less expensive) now
  • Being a social media professional doesn’t mean my prior experience no longer counts; it becomes more relevant than ever
  • Marketing is going through a huge tectonic shift, away from email toward digital (in all its forms – webinars, video, white papers, ebooks) and social
  • Presence online doesn’t mean anything to those who are still offline; peer recommendations and in-person involvement (such as interaction with a flesh-and-blood salesperson) still mean a lot to customers

If you are interested in the most popular posts from 2014, they are available here. If you’d like to see all 113 posts published to date, they are here.

Lastly, if you would like to ‘listen in’ on live-tweets from an upcoming conference (such as AGBT in Marco Island FL Feb 25, AACR in Philadelphia PA April 19 and ESHG in Glasgow Scotland June 6) follow me on Twitter. And if you are interested in the top tweets from 2014, you can access them here too.

If there are topics you’d like to see in the future (suitable for this space rather than Behind the Bench), feel free to leave a comment below.


February 3, 2015
by Dale Yuzuki

The Core Competency of Google is not Life Sciences

What does Google X have to offer in life science diagnostic development?

Screen capture of The Atlantic Magazine interview

I’ve picked up a phrase, ‘it’s a narrow world’, from somewhere in my travels. Way back in my laboratory manager days at the John Wayne Cancer Institute in Santa Monica, California (‘laboratory manager’ sounds so much better than ‘laboratory technician’), I met a young scientist named Andrew Conrad who had started a company called National Genetics Institute. (This was around 1992 or 1993.) Their aim was fast (and inexpensive) PCR-based diagnostics.

They had two unique qualities: one was that they were based in west Los Angeles, where I grew up; the other was a technology that re-used components from other equipment to build a fast thermal cycler. Their system at that time was a home-made one that used pumps from other medical equipment (think dialysis machines and the like) and plumbing to move high volumes of heated water between circulating water baths. Add in some liquid-handling robotics, and you had an automated high-throughput PCR system. (Remember, this was the early 1990s, and the first 96-well plate-based systems were just appearing around that time; this Wikipedia article indicates the Society for Biomolecular Screening started a standardization initiative in 1996.)

Fast forward twenty years to March 2013, and Andrew Conrad’s name re-appears as the chief scientist for the Google X Life Sciences project, and in the subsequent months there was a veritable avalanche of exposure: an interview with the Wall Street Journal (complete with a photo of Dr. Conrad with an Attune flow cytometer in the background), an interview with Wired journalist Steven Levy entitled ‘We’re hoping to build the Tricorder’, and most recently a description of their work with modeling human skin, complete with an Atlantic Magazine video.

Two larger questions need to be asked, however. What is Google’s unique competence? And what does Google X research have to offer?

Google’s Unique Competence

During last October’s American Society for Human Genetics meeting in San Diego, our group at Thermo Fisher Scientific generated 15 posts from this conference (you can access them here at Behind the Bench).  One plenary talk that I did not discuss before was on Sunday Oct. 19 (8:30am) given by David Glaser of Google, with the title “Lessons from a Mixed Marriage: Big Sequencing Meets Big Data”. David said a number of interesting things, and I’ll take the time here to share a synopsis of that talk.

He said that genomics is now becoming an ‘N of Millions’ activity, and this is certainly true, given projects such as the Resilience Project of the Mount Sinai Medical Center in New York, the 100,000 Genomes Project of Genomics England in the UK, and of course the recent Precision Medicine Initiative announced by the President, which seeks the genetic profiles, medical histories and other data of a million or more Americans.

Importantly, the speaker from Google laid out a brief history of big-data mining, from the development of MapReduce in 2004 and Hadoop in 2005, through Apache Spark in 2009, to Google Dremel in 2010, as key milestones in the analysis of very large datasets. What is meant by ‘very large’? Think trillions of rows of data. And the guiding principles are to go big, go fast, and go standard.

As an example of ‘big’, he brought up YouTube, which currently receives 300 hours of uploaded video every minute. Google’s YouTube search engine covers more than 100 petabytes of data. (That is 100,000 one-terabyte hard drives’ worth.)

They applied Dremel and another tool called BigQuery to 1,000 whole-genome sequences from the publicly available 1000 Genomes Project datasets, to see how well their code could sift through .vcf files. (Remember – each whole-genome sequence contains 3–4 million variants.) The first task: segregate variants by population. After a total run time of only 10 seconds, a graph (in R) was produced that validated prior analyses. Another 10 lines of code produced a graph of shared variation across all 1,000 samples. Another few lines of code output the distribution of SNP heterozygosity by population of origin.
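To give a flavor of what ‘segregate variants by population’ means in practice, here is a toy Python sketch of that kind of tally – counting heterozygous genotype calls per population of origin. This is illustrative only: the actual analysis ran Dremel/BigQuery over the full 1000 Genomes VCFs, and the sample names, populations, and genotype calls below are made up.

```python
# Toy sketch: tally heterozygous genotype calls per population of origin,
# the kind of query the talk described running over 1000 Genomes VCF data.
from collections import Counter

# Hypothetical sample -> population assignments
POPULATION = {"S1": "EUR", "S2": "EUR", "S3": "AFR", "S4": "AFR"}

# Minimal VCF-like genotype calls: (sample, genotype) pairs across two sites
CALLS = [
    ("S1", "0/1"), ("S2", "0/0"), ("S3", "0/1"), ("S4", "1/1"),
    ("S1", "0/0"), ("S2", "0/1"), ("S3", "0/1"), ("S4", "0/1"),
]

def het_counts(calls, population):
    """Count heterozygous genotype calls per population of origin."""
    counts = Counter()
    for sample, gt in calls:
        alleles = gt.replace("|", "/").split("/")  # handle phased '|' too
        if len(set(alleles)) > 1:  # heterozygous: the two alleles differ
            counts[population[sample]] += 1
    return counts

print(het_counts(CALLS, POPULATION))  # Counter({'AFR': 3, 'EUR': 2})
```

At 1,000 genomes with millions of variants apiece, the same per-sample, per-site aggregation becomes a trillions-of-rows problem – which is exactly why a columnar engine like Dremel can answer it in seconds where a single machine cannot.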

He next went through a PCA analysis, solving not only a 1,000 × 1,000 computational problem but then scaling it to 1 million × 1 million. You get the idea – the folks from Google are experts at huge datasets and at mining them for search, whether for a particular cat video on YouTube or the frequency of heterozygosity at a given locus across many samples. He concluded with this XKCD cartoon to illustrate how insurmountable some Big Data challenges may remain.
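For readers less familiar with the PCA step: the computation is ordinary linear algebra – center a (samples × variants) genotype matrix and project the samples onto its top principal components. A minimal NumPy sketch is below, with a made-up random genotype matrix standing in for real data; Google’s contribution was scaling this same math to million-by-million problems with distributed tooling, not the math itself.

```python
# Minimal PCA sketch: project samples onto the top principal components
# of a (samples x variants) genotype dosage matrix, via SVD.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical genotype dosages: 6 samples x 10 variants, values 0/1/2
genotypes = rng.integers(0, 3, size=(6, 10)).astype(float)

def pca_project(X, k=2):
    """Return the coordinates of each row of X on the top-k principal components."""
    Xc = X - X.mean(axis=0)                     # center each variant (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                        # sample coordinates on top-k PCs

coords = pca_project(genotypes, k=2)
print(coords.shape)  # (6, 2)
```

In population genetics, plotting the first two principal components like this is the standard way to see samples cluster by population of origin – the graph the talk described producing in seconds.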

Google’s expertise at Big Data and data mining is undisputed, and with more than $60B in annual revenue (primarily from their search engine, in particular AdWords search marketing), they have the resources to back it up. When they start to monetize genomic analysis for scientists, there will be a healthy business there.

Google X

The Google X project is another matter entirely. A semi-secretive facility for making ‘major technological advancements’, its mandate is to achieve at least a 10-fold improvement over an existing method. The self-driving car and Project Glass are two of its better-known efforts, and Google X Life Sciences has a similar goal.

Where does Google X Life Sciences start? Their list includes a contact lens for blood-sugar monitoring and a spoon for people with tremors. It is the last two items that are of interest to the geneticist: the Baseline study, and cancer detection via nanoparticles. For Baseline, studying what ‘normal’ means is a useful exercise, but it is difficult to envision how this aligns with the ‘10-fold improvement’ goal. Many research institutions have been looking at this question for many years; for example, the US National Institute on Aging has been conducting The SardiNIA Study since it launched in 2001, with two important dimensions: genetic homogeneity and careful phenotyping.

What Google’s Baseline may fail to capture is control over both environmental and genetic variables. This is where careful work from geneticists comes in – choosing ‘natural experiments’ such as an island population that traces its lineage back some 8,000 years. Google needs to be very careful whom they choose as their ‘normals’ – and be prepared that most of the work lies not in data generation, but in phenotype data collection and in deciding which population of individuals to baseline in the first place.

But the larger question is this: what protein signal do they want to monitor that is so indicative of early cancer that it couldn’t be examined with a blood draw? That is a different question altogether. There are technologies available for single-cell analysis, as well as for cell-free DNA (and RNA) analysis, to look at circulating tumor cells and at particular biomarkers that could be based on somatic mutations, methylation, copy-number variants, you name it. Not to mention the exquisite technologies available for very sensitive protein detection. The bottleneck is defining the requisite biomarkers in the blood – a problem many companies (and many biomarkers) are actively pursuing, with some available today.

As for the other (non-life-science) large Google X projects: self-driving cars have huge social, legal and policy implications (laid out recently in the Washington Post). And putting the recently pulled Google Glass back into secret development in the hands of a fashion-design expert does not bode well for its future, as Google perceives Glass as a lifestyle/design problem rather than a social-interaction one.

I for one am not optimistic that Google X Life Sciences (or Google X in general) will be able to deliver the 10x disruption they have mandated. Self-driving car development should be (and currently is) ongoing in the laboratories of major automobile manufacturers, as cars are their core competency. The world of augmented reality got a boost recently with Microsoft’s announcement of the HoloLens, and if early predictions come true, it could be a game-changer in how individuals interact with a combination of the real world and the virtual. I for one am not waiting on Facebook to introduce complex computing hardware, nor am I waiting on Google for hardware, for that matter.

Innovation is very hard: there are thousands and thousands of misses for every success. Google X Life Sciences is trying to do what many small startups are also trying to do: solve a problem (early detection of cancer) that is very, very difficult. They just have a lot more funding, and over 100 people working on it. They are optimistic that they will show results within five years – so it will certainly be something to watch.

PS – For all of you that have followed me over to the Behind the Bench blog, many thanks! (And for those who haven’t discovered it yet, please pay us a visit.)

June 12, 2014
by Dale Yuzuki

First Customer PhiX Data from the NextSeq 500

Phi X 174 image courtesy of Wikipedia user Fdardel

At the JP Morgan Healthcare conference in January, Illumina announced the NextSeq 500, which was trumpeted (at least at the level of press releases and public relations) as being available immediately. Knowing first-hand how difficult it is to launch a new system, I had expected the first systems to ship by the end of the first quarter (March), but here we are in mid-June and the first data from the NextSeq 500 system is only now being reported. Continue Reading →

May 26, 2014
by Dale Yuzuki

Going ‘Behind the Bench’ (a new blog)

The Behind the Bench blog from Thermo Fisher Scientific and Life Technologies

First of all, I’m announcing to the world a new Thermo Fisher Scientific / Life Technologies blog entitled ‘Behind the Bench’. I mentioned in the last post that this blog would launch May 27th, but due to some factors beyond my control we’ve gone ‘live’ with it today. (It may seem self-referential to write a blog post about another blog, but I’ll save that for another day!) Continue Reading →

May 5, 2014
by Dale Yuzuki

A new role, a new team and a new blog

“You’ve got to be very careful if you don’t know where you’re going, because you might not get there.” – Yogi Berra

The Ion Proton at the Smithsonian ‘Code of Life’, F1 generation in the foreground.

Two years ago when I started this blog I had a few goals – one was to educate and inform, another to build connections, and a third to initiate conversation. In that relatively short timeframe these goals have been fulfilled well past what I originally anticipated – in particular, how many folks read what is out there; you just don’t know exactly who they are until they casually mention it. Continue Reading →
