After a library is properly prepared (remember it can come from many sources: randomly sheared genomic DNA, cDNA from a small RNA sample, an immunoprecipitated sample), the library molecules need to be amplified in some manner before the sequencing takes place. Thus there is a critical need for accurate quantitation of the library DNA, a step whose importance is easily overlooked.
Once upon a time the amount of UV absorption was all one needed to get an accurate gauge of the amount of DNA in a sample. Now it is fluorimetry or quantitative PCR that is the standard for library quant, and the equipment doesn’t have to be that expensive. (Life Technologies sells a fluorimeter called the Qubit 2.0 for about $1,200.)
Conceptually, the library molecules are diluted in such a way that a single library molecule is amplified thousands (or hundreds of thousands) of times in a format conducive to sequencing that amplified molecule. For Ion Torrent, SOLiD / 5500xl, and Roche / 454 FLX this means making a micro-reactor in solution called emulsion PCR; for the Illumina platforms HiSeq 2000 and Genome Analyzer IIx (and the recently announced 5500xl Wildfire) this means making an amplified colony on a solid surface. If the library is too highly concentrated, in the case of emulsion PCR (or ePCR for short) you would get too many droplets with two or more molecules on the same bead template, yielding nonsense sequence. In the case of Wildfire and cluster generation on Illumina (as it is termed), you would get a saturated surface where the imaging system could not find any discrete spots where a distinct fluorescent signal can be detected.
Naturally if the library is too dilute, the sequence yield of that particular run would be compromised, which would be a waste of reagents as well.
Emulsion PCR has received much criticism over the years (by Illumina salespeople of course) for its complexity and time-consuming nature. Emulsions have to be made, PCR amplification takes time, the emulsions need to be broken, and the amplified beads need to be enriched. As for enrichment, not every ‘bubble’ will have exactly one molecule in it; a Poisson distribution dictates that to keep most occupied microreactors at a single molecule, about two thirds of the emulsion will be empty of library molecules, and only about one third will have one molecule. Thus the ‘PCR-positive’ beads need to be enriched, and there is a streptavidin bead enrichment step to get rid of the great majority of ‘PCR-negative’ beads.
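To put rough numbers on that Poisson argument, here is a minimal Python sketch. It assumes library molecules land in microreactors independently, and the mean loading of 0.4 molecules per microreactor is my own illustrative pick, not a figure from any vendor protocol. The function names are made up for this example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that one microreactor holds exactly k library molecules."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def occupancy(lam: float) -> dict:
    """Split microreactors into empty, single-molecule, and multi-molecule fractions."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single  # everything with two or more molecules
    return {"empty": empty, "single": single, "multiple": multiple}


# Assumed mean number of library molecules per microreactor (illustrative only).
lam = 0.4
fractions = occupancy(lam)

for name, value in fractions.items():
    print(f"{name:>8}: {value:.3f}")
```

At this assumed loading about 67% of microreactors come out empty, about 27% hold a single molecule, and about 6% hold two or more. So "only one molecule" is never truly guaranteed; diluting further shrinks the mixed fraction but leaves even more empty beads for the enrichment step to remove.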
While Roche / 454 has not automated this process in the 7 years the FLX or its predecessor the GS20 has been on the market, SOLiD introduced the EZ Bead automation system in 2009, and Ion Torrent announced the OneTouch system in 2011. So the complexity and time-consuming bits have been made much more convenient. (Although frankly, having done manual emulsions myself, it isn’t that laborious a process, but to be fair I was doing the ~5h Ion Torrent manual process, not the Roche ~11h one.)
With Illumina cluster generation, the clusters are amplified on the solid surface of the flowcell using a modification of PCR and oligonucleotides ‘grafted’ to the surface, with formamide handling the intervening denaturation steps instead of heat as is the case with ‘regular’ PCR. The recent 5500xl Wildfire accomplishes the same in principle, using an isothermal template walking procedure, which is remarkable in its efficiency and ability to get to the same end-point without treading on anyone else's IP (at least until someone files a lawsuit!).
At the end of all this, you will have millions of amplified beads or colonies (ranging from about 10M for the Ion Torrent PGM, to about 100M for the Roche / 454, to ~700M for SOLiD, to ~3B for the HiSeq 2000 and 5500xl Wildfire) that are ready for sequencing.