A while back I wrote about Bionano Genomics, a San Diego company setting out to produce optical maps of genomes. It was November 2012 when I saw them exhibit for the first time at the American Society of Human Genetics meeting, held that year in San Francisco, and decided to write a few words about it.
Since then I’ve seen the technology used by the Genome in a Bottle Consortium, in a paper combining complementary long-read and short-read sequencing with optical mapping (which they cleverly call Next-Generation Mapping or NGM), and in a regular stream of publications around animal genomes (for example, a yellow croaker [PDF of a press release]).
Bionano Genomics at AGBT 2017
Hard to believe it has already been over four years, but at the recent Advances in Genome Biology and Technology (#AGBT17) conference in Hollywood, Florida, they announced a new system with ten times the overall throughput, able to do single-molecule optical mapping at human-genome scale.
Their new system, named Saphyr, can map structural variants in the human genome at high sensitivity, such as deletions ranging from 1,000 base pairs at the low end to several megabases. If you are not familiar with the technology, their NanoChannel features image single, very large fragments of DNA, from 300 kb or so to several megabases long. These fragments have been nicked with a modified endonuclease that recognizes a 6-base or 7-base site and, instead of cutting the double-stranded DNA, cuts only one of the two strands. The nicks are then labeled with a dye.
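To make the labeling idea concrete, here is a minimal sketch of how label positions could be predicted from a known sequence. It is only an illustration: the 7-base motif `GCTCTTC` and the function names are my own assumptions, not something Bionano specified to me, and a real pipeline would work on whole chromosomes rather than a toy string.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def label_positions(seq, motif):
    """Find 0-based start positions of a recognition motif on either strand.

    The motif is searched as written and as its reverse complement, since a
    nicking enzyme can act on whichever strand carries the site.
    """
    seq = seq.upper()
    sites = set()
    for pattern in (motif.upper(), reverse_complement(motif)):
        start = seq.find(pattern)
        while start != -1:
            sites.add(start)
            start = seq.find(pattern, start + 1)
    return sorted(sites)


def label_intervals(positions):
    """Distances between consecutive labels: the 'optical restriction map'."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical motif and toy sequence, for illustration only.
toy = "AA" + "GCTCTTC" + "TTTT" + "GAAGAGC" + "AA"
positions = label_positions(toy, "GCTCTTC")
print(positions, label_intervals(positions))
```

The list of intervals, rather than the sequence itself, is what an optical map records: the pattern of distances between labels along each molecule.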
Basically it is then an optical restriction map: instead of being read out on a gel, these fragments are simply imaged and mapped in terms of length. The DNA is not fixed on any surface (unlike the OpGen systems of over a decade ago for microbial genomes), so the system’s microfluidics and optics are likely pressing the limits of what is possible. A technical specialist told me that the linear variation between the optical maps of individual molecules is on the order of 5%, or 500 bases out of 10,000. (Viewing one of these restriction maps involves redundancy: the same region of the genome is overlaid with several independent molecules that have very similar, but not identical, maps.)
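The 5% figure and the overlaying of several molecules can be sketched in a few lines. This is a simplified illustration under my own assumptions (molecules already aligned, the same number of label intervals in each, a median as the consensus), not a description of Bionano's actual software.

```python
from statistics import median


def within_tolerance(observed, reference, tol=0.05):
    """True if an observed interval is within a fractional tolerance
    (5% by default) of the reference interval length."""
    return abs(observed - reference) <= tol * reference


def consensus_intervals(molecules):
    """Combine aligned interval lists from several molecules into one
    consensus map by taking the median of each interval."""
    return [median(column) for column in zip(*molecules)]


# Three hypothetical molecules covering the same two label intervals (in bases).
molecules = [
    [9800, 20100],
    [10000, 19900],
    [10300, 20000],
]
consensus = consensus_intervals(molecules)
print(consensus)
print(within_tolerance(10400, 10000), within_tolerance(10600, 10000))
```

With several independent molecules over the same region, the per-molecule measurement noise averages out, which is why the redundancy matters.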
Looking at a screenshot of the NGM ‘raw data’, it looked a lot messier than what I could find on their website (here’s a PDF of an interesting white paper on human structural variation). Nonetheless the approach is surely sound. What complicates matters is the specification of ‘640GB per run’, which is the amount of single-molecule DNA mapped (in gigabases) and is somewhat hard to grasp. (At 6.4GB per diploid human, that calculates out to 100-fold human genome coverage, but I am admittedly not an expert regarding Bionano’s nomenclature.) I am told that the original Irys had 26K nanochannels, while the Saphyr now has 120K, so there’s a more than four-fold increase (about 4.6x) there.
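The back-of-the-envelope numbers above can be checked with a short calculation. The function names are mine, and the inputs are simply the figures quoted in this post, treating ‘GB’ as gigabases.

```python
def fold_coverage(throughput_gb, genome_size_gb):
    """Fold coverage = total bases mapped / genome size (same units)."""
    return throughput_gb / genome_size_gb


def fold_increase(new_value, old_value):
    """Ratio of a new specification to an old one."""
    return new_value / old_value


coverage = fold_coverage(640, 6.4)            # per-run throughput vs diploid human genome
channels = fold_increase(120_000, 26_000)     # Saphyr vs Irys nanochannel counts

print(f"{coverage:.0f}-fold coverage, {channels:.1f}x more nanochannels")
```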
But the real importance of this technology is what it can detect that current NGS systems do not. Included here are short-read offerings from Illumina and Thermo Fisher Scientific, along with long read-length offerings from PacBio and Oxford Nanopore, and virtual long reads from technologies such as Moleculo, 10X Genomics and now-defunct Complete Genomics. It is an illustration of the often-used slide from Francis Collins (the current Director of the US National Institutes of Health) about ‘looking for your keys where the light is the best’.
A real-world study of undiagnosed disease
And thus it was a pleasant surprise to hear Dr. Eric Vilain (Chief of Medical Genetics at the University of California, Los Angeles) present a talk entitled “Use of next generation mapping in undiagnosed genetic disorders” at AGBT, describing work done as part of the Undiagnosed Diseases Network (UDN) sponsored by the NHGRI of the NIH. For those not familiar with the UDN, it is a program for individual patients whose disease remains undiagnosed under the current standard of care and practice in diagnostics.
Patients come to one of these centers and receive state-of-the-art consultation with a combination of specialists for the suspected disease, which often involves Whole Exome or Whole Genome Sequencing (WES or WGS). While the pilot program of 600 patients solved only 100 cases, it nonetheless led to the discovery of 15 genes never previously associated with any disease, as well as the discovery of two new diseases.
Of 11 patient trios tested by Dr. Vilain that had no diagnosis even after WES or WGS (i.e. part of the ~85% of patients who do not receive a diagnosis even after participation in the UDN program), Saphyr mapping data showed that two had novel deletion mutations that explained the developmental phenotype. One eight-year-old boy had a 3.5kb deletion in a gene called MED17 that was also present in the father, but the child also inherited two deep intronic mutations in the same gene, which could make him a compound heterozygote; this is still being investigated.
The other case involved a 13-year-old female patient, in whom NGM detected a 1.6kb deletion in a gene called EZH1, which had never before been described as associated with the developmental symptoms she presented. However, the closely related gene EZH2 is causative for an overgrowth syndrome, and it has been shown to work in close conjunction with EZH1 in skeletal development.
For an improvement of roughly 18% (2 of the 11 cases), this is a practical and important use of the technology in the clinic. Structural variation has long been associated with autism and schizophrenia in addition to cancer, but due to limitations of the technology there are likely causative mutations that have simply been missed, as the biology of DNA exceeds our capability to detect these aberrant cases.
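For anyone who wants the yield figures side by side, the arithmetic is simple; the helper name is mine and the counts are the ones quoted in this post.

```python
def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


mapping_yield = percent(2, 11)      # unsolved trios explained by optical mapping
pilot_yield = percent(100, 600)     # UDN pilot cases solved

print(f"Mapping: {mapping_yield:.1f}% of 11 trios; pilot: {pilot_yield:.1f}% of 600 patients")
```

With only 11 trios the percentage is of course a rough indication rather than a robust estimate.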
It is noteworthy that in the US there are some 25 million individuals with undiagnosed disease, at a cost of some $5M per individual over the course of their lifetime in tests and hospitalization. Even though any single rare disease affects a fraction of 1% of the population, the fact that there are several thousand disease-causing genes means that some 8% of the population (about 25 million) is affected by a genetic disorder. (Here is one adult’s personal story with undiagnosed disease.)
Importance for cancer
At #AGBT17, a highlight for many attendees was the poster sessions. There were many interesting approaches, and this photo will give you an idea of how busy the poster sessions were.
One Bionano Genomics technology poster was presented by Vanessa Hayes of the Garvan Institute in Sydney, Australia. Speaking with her later about her work on prostate cancer, she indicated her firm belief that the cohorts used for large GWAS and The Cancer Genome Atlas may not have been ideal: given the current status of PSA screening in the general population and the limitations of PSA as a test, there has been oversampling of the individuals selected for the studies. She brought up the seven subtypes that TCGA recently characterized, which, while informative, nonetheless give her additional incentive to look to Bionano’s technology for genomic signals (in the form of structural variation) that NGS technology has missed.
Dr. Hayes is no stranger to the power of NGS: as a native South African, she was very involved in arranging the sequencing of Desmond Tutu (using a combination of 454, SOLiD, and Illumina technology for a de-novo assembly). It may well be the case that the more familiar you are with a given tool, the more sensitive you are to its strengths and weaknesses. There is a video on the Bionano Genomics website of Dr. Hayes discussing the importance of optical mapping to her work.
The Saphyr technology will be something to watch in the coming months, both as a clinical tool and as a discovery one. It is notable that Bionano Genomics chose to work with the UDN as its first real-world application effort to showcase its utility.