ASHG 2012 in San Francisco is finally over! The exhibit booths get torn down, the equipment gets packed up and shipped to storage, and hundreds of foot-weary front-line soldiers get back to their normal routines, whether in sales, marketing, product development, R&D or product management.
One interesting company is BioNano Genomics. Way back at ASHG 2008 (it was in Philadelphia that year) I happened to meet Han Cao, who founded a company called BioNanoMatrix in Philadelphia, out of technology he was working on as a post-doc at Princeton. Based upon photomicrolithography and 50 nanometer-wide channels, they had a prototype box and a number of large diagnostic companies showing interest.
Fast forward to 2012 and BioNanoMatrix has become BioNano Genomics, and in the meantime moved from Philadelphia to San Diego.
One may wonder what geography has to do with the traction and success of a company, so let’s explore that topic for a bit. Biotechnology ‘hubs’ become hubs for concrete business reasons, not the least of which is a critical mass of specific expertise. And for life science instrumentation, the expertise is wide-ranging. For example, an optical engineer could be needed for a protein analysis platform, and would need to be very familiar with a concept like total internal reflection fluorescence (TIRF), and apply this concept in depth while working on a surface plasmon resonance-based protein detection system.
Other expertise, such as developing the molecular biology that sits upstream of the detection system, is also highly specialized and difficult to find. In the protein analysis example above, it is one thing to have a very sensitive detection instrument and platform that can detect very low levels of bound protein on the surface of a specialized chip. But how does that specific protein get bound to a specific protein partner in the first place? And how is that evanescent signal meaningful in a biological sense? This takes other kinds of experienced scientists, well familiar with the world of protein binding and conformation, not to mention the biochemistry needed to assay the specific protein of interest, and to make the process simple enough that a reasonably skilled lab technician can perform the assay reliably and consistently.
And then there are the fluidic systems that surround these instruments: the labeled sample has to get into an observation chamber, and the system has to have some method of moving aqueous fluids containing the specific analyte past the detector, without contamination or clogging or any one of a number of things that can go wrong in the fluidic path. There are experts in fluidics, in surface derivatization; the list goes on and on.
Getting back to BioNano Genomics, they need all of this expertise, as they are taking large pieces of DNA (a few hundred kb up to 1 Mb) and imaging these single molecules through a silicon chip etched with 50nm-wide channels.
50nm is wide enough for only one single DNA strand to travel through it. But it seems like such a technical feat of engineering as to be almost impossible to do.
As I stood before their instrument (the Irys™) and took a close look at their chip, the image of a tangled ‘cake’ of Top Ramen instant noodles flashed before me. (Credit to this Wired article about higher-order genomic organization by Erez Lieberman Aiden at Harvard, who uses the Top Ramen analogy to illustrate tertiary-order and quaternary-order folding.) And you take that cake of dried noodles, and break it into several pieces.
And there is a long stretch of the noodle, 100kb to many 100’s of kb, and only two ends to that noodle. Another noodle, another many 100’s of kb, and another two ends. According to this reference, each package of ramen noodles has 80 noodles, each 16″ long, all wrapped up in an ordered cake. Boil these noodles until they are just loosened (not soft, just flexible enough to be released from the cake) and then try to get as many as possible of the 160 ends in that large mass of noodles into an ordered array of narrow straws. Assume the noodles are not as flimsy as pasta noodles, but are in a stacked mess nonetheless, and you manage to get only a few ends into channels that are each only wide enough for one noodle to enter and be measured. It is a feat of technology that tackles a significant problem.
The main technical effort for BioNano Genomics is getting the DNA to cooperate. There are 2000 channels per sample port, and I was told by one of their R&D scientists that only a small percentage are used in any given experiment. (If I remember correctly it was in the low single digits, so if that number is 5%, then only 100 channels out of 2000 are actually usable.) That the system is able to unwind and line up long stretches of DNA so that the ends thread a 50nm-wide needle is no mean accomplishment of single-molecule DNA manipulation.
The sample preparation is another interesting wrinkle. Scientists are accustomed to using a QIAGEN spin-column, say a QIAamp Blood kit, to get high-quality DNA on the order of 80-100kb long. But in order to retain intact long DNA pieces from 100kb to 1Mb, the DNA has to be very gently extracted in situ from the cells in which it resides. So, hearkening back to the days of PFGE (Pulsed-Field Gel Electrophoresis), a technique that was developed in the 1980s, cells are treated in an agarose matrix. It isn’t necessarily time-consuming or technically challenging, just a bit of retro nostalgia for those of us who have been around a molecular biology / genetics lab for a while.
At ASHG 2012 BioNano Genomics announced the product launch of the Irys, but it is more early-access than full commercial launch. What this means is that the Irys will be sold to only those that are accepted as early-access, and full commercial launch is expected in about six months (Spring of 2013).
The system at present can only measure ‘mid-size genomes’, on the order of a few hundred Mb. This is a limitation of how much data they can collect from a given chip, which gets back to the number of usable channels that can produce usable data. They use a ‘nick-ase’ enzyme that recognizes a 7-base motif, and instead of cutting the DNA molecule on both strands, the nick-ase only cleaves one strand.
At that nicked site, a fluorescent label is attached, and the labeled molecule is then imaged in the channel. Thus if the genome is small enough, there is enough imaging data to have multiple examinations of each site along the genome to construct a map of that particular sample’s genomic organization.
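To make the labeling step concrete, here is a minimal Python sketch of an in silico version: it scans a sequence for a 7-base motif on either strand and reports where a label would sit. The motif GCTCTTC is an assumption chosen for illustration (the enzyme is not named above), and the function names and toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def label_sites(sequence, motif="GCTCTTC"):
    """Return sorted 0-based start positions of the motif on either strand.

    The default motif is an illustrative assumption, not a confirmed detail
    of the Irys chemistry.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    # A motif on the reverse strand appears as its reverse complement
    # when reading the forward strand.
    targets = {motif, reverse_complement(motif)}
    width = len(motif)
    sites = []
    for start in range(len(sequence) - width + 1):
        if sequence[start:start + width] in targets:
            sites.append(start)
    return sites


# Toy sequence: one forward-strand site and one reverse-strand site.
toy = "AA" + "GCTCTTC" + "AAAAA" + "GAAGAGC" + "TT"
print(label_sites(toy))  # [2, 14]
```

The list of positions is the kind of ‘barcode’ that each imaged molecule would be compared against.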
Note that this is not sequencing – it is mapping of 7-base motifs along the genome, using single molecule imaging. Compared to a reference sequence, any deviation in the spacing of these 7-base motifs means a genomic rearrangement of some sort, whether a duplication, a translocation, an inversion, etc. The resolution of a single measurement of this 7-base motif is on the order of 1kb; with multiple examinations of each position along the genome, that resolution can go down to 100bp.
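The spacing comparison can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the vendor’s algorithm: it assumes the observed molecule carries the same number of labels as the reference, and it only flags intervals that are longer or shorter than expected. A real map aligner would also have to cope with missing or extra labels, inversions and translocations. The function name, the example positions and the 1kb default tolerance (borrowed from the resolution figure above) are all assumptions.

```python
def spacing_deviations(reference_sites, observed_sites, tolerance_bp=1000):
    """Flag label-to-label intervals whose length differs from the reference.

    Returns a list of (interval_index, delta_bp, rough_interpretation).
    Assumes both maps contain the same labels in the same order.
    """
    if len(reference_sites) != len(observed_sites):
        raise ValueError("this sketch assumes the same number of labels in both maps")
    flagged = []
    reference_intervals = zip(reference_sites, reference_sites[1:])
    observed_intervals = zip(observed_sites, observed_sites[1:])
    for index, ((ref_a, ref_b), (obs_a, obs_b)) in enumerate(
        zip(reference_intervals, observed_intervals)
    ):
        delta = (obs_b - obs_a) - (ref_b - ref_a)
        if abs(delta) > tolerance_bp:
            kind = "insertion or duplication" if delta > 0 else "deletion"
            flagged.append((index, delta, kind))
    return flagged


# Hypothetical label positions in bp; the third interval is 15 kb too long.
reference = [0, 12000, 30000, 41000, 60000]
observed = [0, 12200, 30100, 56100, 75000]
print(spacing_deviations(reference, observed))
# [(2, 15000, 'insertion or duplication')]
```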
This is a technology that would complement NGS, in that the larger genomic organization can then be determined with relative ease. Mapping using short reads (and eventually, the longer reads on an Ion Proton™) will continue to have problems with rearrangement events, due to mismapping of reads to the reference. A local de novo assembly can help, but each sample being unique is a challenge. Thus single-molecule approaches are very desirable for this application, but poor accuracy has hobbled the Pacific Biosciences platform, and as I mentioned in my last post Oxford Nanopore has not set a date for their promised systems.
So today this early-access system will handle a 160Mb genome (one of their customers is looking at the sea urchin genome), and the chip, at $300/sample (3 samples per chip, a $900 run cost), will get about a 50-fold ‘coverage’ of the 160Mb genome; that is, each mapped site will have a redundant examination fifty times over. At this coverage level, it will reach approximately 90% of the genome. (The nomenclature of 8 Gb / chip is used, but is too easily confused with sequencing capacity; 8 Gb / 50-fold examination redundancy = 160 Mb. They really need to invent a different nomenclature.)
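The arithmetic behind those figures is simple enough to write down. The short Python sketch below just restates the numbers quoted above (8 Gb of mapped molecules, 50-fold redundancy, $300 per sample, 3 samples per chip); the function names are my own.

```python
def mappable_genome_mb(capacity_gb, fold_redundancy):
    """Genome size (Mb) that a given mapping capacity covers at a given redundancy."""
    return capacity_gb * 1000 / fold_redundancy


def run_cost_usd(cost_per_sample_usd, samples_per_chip):
    """Consumable cost of a run that uses every sample position on the chip."""
    return cost_per_sample_usd * samples_per_chip


print(mappable_genome_mb(8, 50))  # 160.0
print(run_cost_usd(300, 3))       # 900
```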
One thought I had while standing on the show floor was that this is the American Society of Human Genetics, not Plant and Animal Genome (PAG XXI is set for January 2013 in San Diego). But they are promising a combination of increased efficiency and perhaps a larger chip to get to human genomes. When I asked about the potential of looking at a human sample today, I was told that a ‘brute force’ effort could be done with a human genome, doing many runs at a lower total coverage. For example, 19 runs of a 3 Gb human genome would get 50-fold examination, at a 90% genomic coverage. But 10 runs would get about 25-fold examination, at perhaps 50% genomic coverage, which might be enough. But it was clear that the time and effort needed for so many runs meant this would not be generally recommended. (A run takes over a day, but less than two days.)
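Those run counts follow from the same capacity figure. Here is a rough Python sketch, assuming each run yields the 8 Gb of mapped molecules quoted for a chip (that per-run yield is my reading of the numbers, not something the company stated in these terms); the function names are hypothetical.

```python
import math


def runs_needed(genome_gb, fold_redundancy, capacity_per_run_gb=8):
    """Whole runs required to examine a genome at the given redundancy."""
    return math.ceil(genome_gb * fold_redundancy / capacity_per_run_gb)


def fold_from_runs(genome_gb, runs, capacity_per_run_gb=8):
    """Redundancy achieved for a genome after a given number of runs."""
    return runs * capacity_per_run_gb / genome_gb


print(runs_needed(3, 50))                # 19
print(round(fold_from_runs(3, 10), 1))   # 26.7, i.e. roughly the 25-fold quoted
```

At more than a day per run, 19 runs is the better part of a month of instrument time, which is why the brute-force route is not recommended.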
And to be clear, a targeted approach to mapping simply doesn’t work, as very large fragments (100kb – 1Mb) are needed. Flow sorting human chromosomes may be just too impractical an approach, as the technique is extremely difficult to perform successfully, with only a few laboratories (the Sanger Centre at Wellcome Trust, and the National Cancer Institute) reportedly being able to do it. The first paper showing proof of concept used BAC clones that covered 4.7Mb of MHC sequence, so BAC library construction could be another method. However, that method would certainly hinder widespread adoption.
The Irys system is on the order of $300K, and is available now. (That is, if you are studying sea urchins or something with a similar-sized genome.) If they can improve / scale it to human (about 20-fold given the current range of about 160Mb), it could be a very useful tool for CNV and rearrangement mapping. It is a bit narrow in its application however (being a supplement to NGS), but until single-molecule sequencing becomes commonplace with very long reads and reasonable accuracy, it may find a niche in the marketplace.
Lam ET, Kwok PY, et al. “Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly.” Nature Biotechnology, August 2012, pp. 771-776.