Back in 1989 in Belgrade, Yugoslavia, Radoje Drmanac had an idea that would shape the next decade of genetic analysis: sequencing megabases of DNA by hybridization with a collection of 11-mers to 20-mers. Working through the mathematics, the paper lays out what kind of DNA oligonucleotides would be needed: a specific set of 95,000 11-mers of the 5′-(A,T,C,G)(A,T,C,G)N8(A,T,C,G)-3′ format, and 2,100 dot-blot hybridizations for a 1 million base-pair sequence.
For those unfamiliar with dot-blots, a dot-blot is just a simplified Southern blot.
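To make the underlying idea concrete, here is a toy sketch of how a sequence can be rebuilt from nothing more than the set of short oligos that hybridize to it. It is not the paper's probe design or its calculation; the example sequence, the function names `spectrum` and `reconstruct`, and the simple overlap walk are all assumptions chosen for illustration.

```python
def spectrum(seq, k):
    """Return the set of all k-mers present in seq (an idealized hybridization readout)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers, k):
    """Rebuild a sequence by chaining k-mers that overlap by k-1 bases.

    Raises ValueError when the chain is ambiguous or incomplete, which is
    what happens when a (k-1)-mer repeats inside the target.
    """
    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:-1], []).append(kmer)

    # The starting k-mer is the one whose prefix is not the suffix of any other k-mer.
    suffixes = {kmer[1:] for kmer in kmers}
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting k-mer")

    current = starts[0]
    used = {current}
    result = current
    while True:
        candidates = [c for c in by_prefix.get(current[1:], []) if c not in used]
        if not candidates:
            break
        if len(candidates) > 1:
            raise ValueError("ambiguous overlap: probes are too short for this target")
        current = candidates[0]
        used.add(current)
        result += current[-1]

    if len(used) != len(kmers):
        raise ValueError("some k-mers could not be placed")
    return result


if __name__ == "__main__":
    target = "ATGCGTACGTTAGC"  # hypothetical toy sequence
    print(reconstruct(spectrum(target, 5), 5) == target)
    try:
        reconstruct(spectrum(target, 4), 4)
    except ValueError as err:
        print("k=4 fails:", err)
```

In this toy case 5-mers recover the target, while 4-mers fail because the 3-mer CGT occurs twice. That is the intuition for why longer probes are needed as the target grows toward megabases.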
In 1994 a company called Hyseq (yes, that’s what it was called) was formed around this idea, and in 1997 it went public in a $38M IPO. Shortly afterward it tangled with a new company called Affymetrix in lawsuits and countersuits. Affymetrix offered resequencing by array hybridization to the research market in the early 2000s, and even obtained an FDA approval through a Roche partnership way back in 2004 for a test covering variants in CYP2D6 and CYP2C19, two drug-metabolizing genes of the cytochrome P450 family. It was called the AmpliChip CYP450 Test.
What a difference a few decades make: NanoString first rolled out the idea of a new instrument and technology called Hyb & Seq at the 2017 Advances in Genome Biology and Technology (AGBT) meeting, which I wrote about here.
The Workflow Hasn’t Changed
One year later, the basics remain the same:
- Still less than 1 hour from FFPE sample to loading the sequencer, with less than 15 minutes of hands-on time
- Still no need for library preparation or other enzymology, so the workflow should be extremely simple
- Still capable of direct sequencing of DNA or RNA molecules, irrespective of length, because the molecules are bound individually to a substrate and imaged in situ through repeated rounds of hybridization and imaging.
Getting rid of library preparation is a major advance. The other single-molecule sequencing platforms (Pacific Biosciences, Oxford Nanopore) require a fair amount of molecular biology before DNA can be sequenced. Biochemistry being what it is, a high level of training is needed to do the appropriate liquid manipulations, whether setting up reactions, incubating them, or performing wash steps or magnetic bead steps.
A relatively simple lysis and bead-based cleanup followed by loading the instrument is about as simple as it gets.
The Data Compares Favorably
A side-by-side comparison looking at a specific TP53 mutation was performed by a collaborator, Anna Piskorz of Cancer Research UK in London, who shared her data during a session at AGBT called “Simultaneous mutation detection, copy number, and digital gene expression profiling high-grade serous ovarian cancer FFPE samples using the Hyb & Seq targeted sequencing technology”. She had a complex request: look for SNV mutations in TP53, look for copy number alterations (CNA) in a handful of genes including CCNE1, MYC and TP53, and measure the gene expression profile of six genes.
So from a probe perspective, for TP53 SNVs the design had to be done on the anti-sense strand of the coding region of the gene; for CNA, probes were designed in the intronic anti-sense strand of the genes of interest, and for gene expression, the splice site was targeted.
Running an HL-60 cell line, she found the copy number data to be very consistent with the known CNA for MYC and CCNE1. In one of her two high-grade serous ovarian cancer (HGSOC) samples, MYC was measured very close to an orthogonal measurement (2.7x vs 3x), and in the second sample the same held for TP53 (3.1x vs 3x).
Single Molecule Accuracy
What was surprising was the ability to look at TP53 raw sequencing accuracy on a per-base basis. Here is a chart that was presented (figure kindly provided by NanoString) for four specific TP53 mutations, comparing Hyb & Seq to a Swift Biosciences Accel-Amplicon 56 panel sequenced on a MiSeq.
Speed may be an advantage
One intriguing aspect is the expected turnaround time. For small or medium cancer panels (defined as about 100 gene regions), the smaller flowcell at 20M reads with 150 H&S cycles would have a turnaround time of about 9 hours; the 80M-read flowcell with the same 150 H&S cycles would take 13 hours. For large panels (about 500 gene regions) with 400 H&S cycles, 20M reads would take 22 hours and 80M reads would take 31 hours.
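To put those four figures side by side, here is a small sketch that tabulates them and derives a rough minutes-per-cycle number. The derived value assumes the whole quoted turnaround is spent cycling, which is my simplification and not something NanoString stated, so treat it as an upper bound on cycle time. The names `RUNS` and `minutes_per_cycle` are mine.

```python
# Turnaround figures as quoted above: (panel, reads in millions, H&S cycles, hours)
RUNS = [
    ("small/medium (~100 regions)", 20, 150, 9),
    ("small/medium (~100 regions)", 80, 150, 13),
    ("large (~500 regions)", 20, 400, 22),
    ("large (~500 regions)", 80, 400, 31),
]


def minutes_per_cycle(hours, cycles):
    """Rough upper bound: assumes the entire turnaround is spent on H&S cycles."""
    return hours * 60 / cycles


if __name__ == "__main__":
    for panel, reads_m, cycles, hours in RUNS:
        per_cycle = minutes_per_cycle(hours, cycles)
        print(f"{panel:28s} {reads_m:3d}M reads  {cycles} cycles  "
              f"{hours:2d} h  ~{per_cycle:.2f} min/cycle")
```

Read this way, going from 20M to 80M reads quadruples the output while adding less than half again to the run time in both panel sizes.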
Application to Infectious Disease
Chris Mason of Weill Cornell Medical College (NY) gave an interesting Nanostring workshop presentation called “Rapid and accurate cross-kingdom human pathogen identification and detection using Hyb & Seq™ technology” and presented a poster by the same name.
He described how much time matters in species identification. Taking advantage of fast sequencing, he designed Hyb & Seq probes (18 per microorganism or virus) for 10 species and 1 virus. By titrating a standard mock mixture, he showed detection down to a ‘conservative’ lower limit of 10 cells/mL. He also tried flooding the sample with a background of human DNA (from 5M cells of cell line NA19240) and still obtained an r^2 of 0.91.
At Weill Cornell, they had 66 clinical samples, and 65 of the 66 were in exact agreement with pathology staining and culture; the one disagreement was a low-confidence pathology sample that was likely a false positive. Getting results very quickly (four hours from sample to result) with species-level detection shows genuine promise.
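For a sense of how much weight 65 out of 66 can carry, here is a short sketch that computes the agreement rate along with a 95% Wilson score interval. The interval is my own addition for illustration; it was not part of the presentation, and the function name `wilson_interval` is an assumption.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    agree, total = 65, 66  # counts quoted above
    low, high = wilson_interval(agree, total)
    print(f"agreement: {agree / total:.1%}")
    print(f"95% Wilson interval: {low:.1%} to {high:.1%}")
```

With only 66 samples the interval is fairly wide, so the headline agreement is encouraging but would benefit from a larger cohort.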
One more thing
An interesting development is the use of NGS to read out Digital Spatial Profiling microscope tags. Taking 1000, 200, and 20 cells, and comparing the nCounter readout to either a MiSeq or Hyb & Seq readout, R^2 values north of 0.9 were obtained. Reading out more than 1 million tags from the Digital Spatial Profiling microscope thus becomes possible, opening up the potential for what NanoString calls “Spatial Genomics”.
This progress with real-world samples and real-world accuracy is nice to see. No expectations have been set for product availability timing, although I’m told that field testing will start in 2019.