During the flurry of activity at the recent Advances in Genome Biology and Technology (AGBT) meeting, I made a point of taking a close look at the newest CMOS-based NGS instrument from Illumina, the iSeq 100. It’s also a good time to get a little perspective on how far next-generation sequencing has come, and in a very short amount of time.
The Solexa 1G 11 years ago
The original Solexa 1G shipped to its first commercial customers in December 2006, and the first purchase I was personally involved with occurred in January 2007. (Illumina’s acquisition of Solexa, announced in November 2006, was touted to be for $600M.) Many of the purchase contracts at that time included a clause that the 1G would yield at least 1,000 megabases of mappable data at a certain read-level quality. The truth at that time was that the metric was typically in the mid-700s or low 800s of megabases per run, from what were originally paired-end, 35-base reads.
To my knowledge, no customer ever took Illumina up on the offer to return their Solexa 1G; the data was so useful and the demand for short-read NGS was so great. Remember, the GS20 had started the entire field with about 200,000 reads and 100 bases of read length (so about 20 megabases). Over time it gave way to the FLX, which quintupled the throughput by doubling the read number to 400,000 and raising the read length to 250 base pairs, for 100 megabases in 2007.
But to purchase a similar-cost instrument (they cost more than $350K if not $450K at that time) and get 10-12 million 2×35 paired-end reads (for a throughput of 700 to 840 megabases of sequencing output) meant opening up many more tag-counting applications, such as RNA-Seq and ChIP-Seq, which frankly blew the market wide open.
Thus the first Genome Analyzer (as the Solexa 1G was rebranded) cost on the order of $400K, had a throughput of about 1 gigabase, and took something like 32 or 34 hours per run. A single run cost on the order of $1,500; for the 90 Gb of sequence information necessary for a 30x whole genome, assuming an 800-megabase throughput at $1,500 per run, that would equate to about $169,000.
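The throughput figures quoted above all come from the same back-of-the-envelope product: number of reads times bases per read. Here is a minimal sketch of that arithmetic in Python. The function name and the dictionary layout are my own, and I am assuming that "10-12 million 2×35 reads" means 10-12 million read pairs at 70 bases per pair.

```python
# Sequencing output in megabases = reads x bases per read / 1,000,000.
# All inputs are the approximate figures quoted in the text; the
# paired-end numbers assume each of the 10-12 million clusters
# contributes 2 x 35 = 70 bases (an assumption, not a vendor spec).

def megabases(reads, bases_per_read):
    """Return sequencing output in megabases."""
    return reads * bases_per_read / 1_000_000

platforms = {
    "GS20": (200_000, 100),
    "GS FLX": (400_000, 250),
    "Solexa 1G (low)": (10_000_000, 2 * 35),
    "Solexa 1G (high)": (12_000_000, 2 * 35),
}

output_mb = {
    name: megabases(reads, length)
    for name, (reads, length) in platforms.items()
}

for name, mb in output_mb.items():
    print(f"{name}: {mb:.0f} megabases")
```

Even at the low end, the 1G comes out at seven times the FLX output per run, which is the gap that made tag counting practical.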
The iSeq 100, 11 years later
Here we are only 11 years later, and the iSeq splashes onto the scene. At a unit cost of $19.5K, it is 20-fold less expensive to acquire; its 1.2-gigabase throughput is 50% more than the original 800 megabases; and the runtime for the highest throughput is 17.5 hours, or about half the time. At $625 per run, that same 30x whole genome (for comparison purposes only; the NovaSeq or HiSeq X-10 are much better suited for this) would cost $46,875.
The decrease in cost, the increase in throughput, the decrease in time, and the decrease in complexity all mean a win for the end-user. It is difficult to imagine anything dropping in price 20-fold while cutting overall time in half and increasing overall output by 50%, all while costing less than half as much to press the ‘start’ button. Trying to think of analogies stretches the imagination to the point of being ridiculous.
But let me try to be ridiculous: in 2007, the original iPhone made its debut at $599 for an 8GB model. Try to imagine purchasing the iPhone 8 in late 2017 for $29.95 (that’s 20-fold less than $599). The other dimensions have gone way up in the meantime (memory 32-fold, screen resolution 18-fold), so this may not be a fair comparison. But can you think of anything that has decreased 20-fold in price? Anybody?
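For anyone who wants to check the comparison, here is a rough sketch of the arithmetic behind both 30x genome figures and the fold changes. The function and variable names are my own. Like the text, the calculation allows fractional runs; in practice you would round up to whole runs.

```python
# Reagent cost of a 30x human genome on each instrument, using the
# approximate figures from the text. Fractional runs are allowed, as in
# the text; a real project would round up to whole runs.

GENOME_TARGET_MB = 90_000  # 90 Gb for a 30x whole genome

def genome_cost(mb_per_run, price_per_run):
    """Return the run cost (USD) of reaching the 90 Gb target."""
    runs = GENOME_TARGET_MB / mb_per_run
    return runs * price_per_run

ga_cost = genome_cost(800, 1500)    # original Genome Analyzer
iseq_cost = genome_cost(1200, 625)  # iSeq 100

instrument_fold = 400_000 / 19_500  # purchase price, old vs. new
throughput_gain = 1200 / 800        # output per run, new vs. old
run_price_ratio = 625 / 1500        # price per run, new vs. old

print(f"Genome Analyzer 30x genome: ${ga_cost:,.0f}")
print(f"iSeq 100 30x genome:        ${iseq_cost:,.0f}")
print(f"Instrument price drop:      {instrument_fold:.1f}-fold")
print(f"Throughput per run:         {throughput_gain:.1f}x")
print(f"Run price ratio:            {run_price_ratio:.2f}")
```

The Genome Analyzer figure works out to $168,750, which rounds to the $169,000 quoted earlier, and the per-genome reagent cost drops by a factor of 3.6.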
The iSeq 100 operating principle
The original Solexa chemistry had four different dyes. With reversible terminator chemistry, each cycle synthesizes the next base, which is then interrogated on the basis of the four different colors. Thus there are 4 colors, 1 synthesis step, dyes that stay associated with their base, and 4 images produced per sequencing cycle. Now with a CMOS detector, there’s only 1 color, 1 synthesis step, a chemistry step that changes which bases carry the dye, and 2 images per sequencing cycle.
The first imaging step after single-base synthesis detects either an A (on) or a T (on), which together make an IUPAC “W” base. In a clever bit of chemistry, the fluor linker on the A nucleotides is then cleaved, removing any dye signal on A, and a linkage site on the C nucleotides is chemically joined to a dye molecule. The second imaging step then interrogates the same base, detecting either a C (on) or a T (still on, as no manipulation of the originally dye-labeled T base has occurred). A C or a T base is a “Y” base.
The figures A & B here, borrowed from the iSeq 100 Specification Sheet (PDF), can help illustrate the chemistry and its interpretation. Any given base to be read has its complementary strand extended by one base (A, C, T or G), and across the two images of that particular ExAmp cluster it will read as either On/Off (A), Off/On (C), On/On (T), or Off/Off (G).
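The On/Off table is easy to express as a lookup. What follows is only a toy sketch of the decoding logic as described here, not Illumina's base caller: it assumes each cluster's signal in the two images has already been reduced to a clean on/off value, whereas real base calling works from intensities and assigns quality scores. The names `CALLS`, `call_base` and `call_read` are hypothetical.

```python
# Toy decoder for the one-dye, two-image scheme described in the text.
# Assumes signals are already thresholded to on/off (an assumption for
# illustration; real base calling works on intensities).

CALLS = {
    (True, False): "A",   # on in image 1 only (dye cleaved before image 2)
    (False, True): "C",   # on in image 2 only (dye attached after image 1)
    (True, True): "T",    # on in both images
    (False, False): "G",  # dark in both images
}

def call_base(image1_on, image2_on):
    """Return the base implied by one cluster's two images in a cycle."""
    return CALLS[(bool(image1_on), bool(image2_on))]

def call_read(cycles):
    """Decode a list of (image1_on, image2_on) pairs into a sequence."""
    return "".join(call_base(first, second) for first, second in cycles)

cycles = [(1, 0), (0, 1), (1, 1), (0, 0)]
read = call_read(cycles)
print(read)  # ACTG
```

Note that the first image alone separates W (A or T) from everything else, and the second alone separates Y (C or T); it takes both together to make a call.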
Oh yes, each of the wells has a single library molecule bound to it, and the ExAmp (Exclusion Amplification) process will amplify that library so the amplified cluster is registered to the CMOS imaging element right below it.
Simplicity of Operation
In the photo at the top, you can see the Illumina applications scientist handling the reagent cartridge. It comes shipped on dry ice, and to use you simply thaw it on the benchtop for a period of time, then invert it 5 times upside-down and upright again. (The touchscreen lays out exactly what you need to do.) And then you simply tap the cartridge onto the benchtop a few times in order to settle the reagents to the bottom, and that’s it for preparing the reagents. Thaw, mix and tap.
It’s a little difficult to see (darn phone camera ended up focusing on my thumb rather than the flowcell) but the photograph on the left shows the chip and the ports where reagents flow through. When I talked with Dr. Chris Mason and his laboratory people beforehand, he told me he felt there was room for Illumina to expand the active surface area a good 3x to 5x, and if he was referring to the oval section he may have a point.
But on the other hand when I think about the position of those ports, that feature may be fixed, and the increase in features comes in packing in the CMOS features closer together. Nonetheless Illumina made it clear in their presentations (both as a vendor talk in their suite in the early morning, as well as during their lunchtime workshop) that they will be decreasing the time of the runs while simultaneously increasing the throughput.
I have little doubt that several-fold increases in throughput and decreases in runtime are achievable. I still have plenty of AGBT Ion Torrent polo shirts (those black ones) that say ‘The Chip is the Machine’, and having the chip do all the signal transduction from nucleobase chemistry to electrons is a wonderful thing. Illumina didn’t state the obvious, in order to not criticize the platforms that got them to 6,000 GB of throughput, but they did get rid of optical detection with this system.
Installation is simplicity itself
Illumina likes to state that this instrument is ‘completely dry’, as if I think about the instrumentation I run in the lab as being ‘wet’. What is meant is that there is no need for a nearby sink or carboy to run waste tubing into, or for other kinds of lines feeding into the instrument (such as a vacuum line or, as in the case of the original PGM, a tank of Argon gas). It’s all self-contained: you prepare the cartridge, you pipette your quantified library into a port, you plug the iSeq 100 chip into the rear of the cartridge, and you slide the entire 4″ x 3.5″ x 8″ plastic box into the front of the unit.
And then you press ‘Start’. This is the kind of simplicity people have been asking for.
As far as installation goes, it has been designed to be user-installable. Gone are the days of waiting for a service engineer to uncrate and set up a complicated instrument. It is also expected that there will be no need for preventive maintenance, although I did not get many details about that.
Remarkable quality of sequence reads
During the iSeq 100 morning workshop the following sequence statistics were shared. The number of runs shown is limited (either 100, 40 or 70), and I’m sure additional data is being produced to show performance at different GC%. From the J.P. Morgan Conference presentation and press release alone it was difficult to evaluate actual performance, so this data is very promising.
In a conversation, Dr. Chris Mason (remember, his lab is the only laboratory outside of Illumina’s that has used an early-access version of the iSeq) said that the error profile he saw compared ‘very favorably with our NextSeq’, with well under 1% error on both reads. Seeing the chart above, with 0.4% error on R1 and 0.6% error on R2, I’d say that comment is justified.
In case you were wondering about the throughput and time requirements, here’s a chart borrowed from this specification sheet (PDF).
If you want one, get in line…
I feel like this instrument is what the clinical laboratory could immediately put to good use. The ease of use, the high accuracy, and the low price to both acquire and run ($19.5K and $625 respectively) to get 1.2 gigabases of high-quality 2×150 paired-end reads make a compelling value proposition. Of course research and applied laboratories will find these advantages compelling as well, and for those who are just getting into next-generation sequencing, this technology, instrument and consumables package makes for a very attractive offering.
I was told that orders coming in now (mid-February) would be fulfilled by the end of Q2, which puts it out to late June. I should pick one up to put in a basement laboratory… (just kidding, my spouse would have none of that).