Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
One of the genome project's major challenges is the need for greater automation in DNA-sequencing technologies to increase speed and reduce costs. Standard methods based on gel-electrophoresis separation of nested fragment sets are considered too slow and expensive. Progress in automation ranges from enhancements of conventional gel-based technologies to novel, gel-less, automatable approaches. Some topics represented at the meeting are outlined below.
Richard Guilfoyle [University of Wisconsin, Madison (UWM)] discussed improvements in front-end strategies and progress toward automation. Triple-helix-affinity capture techniques facilitate (1) purification (95%) of cosmid inserts (reducing the amount of sequencing-vector DNA 20-fold) and (2) CviJI digestion for random fragmentation and cloning into M13-100, a direct-selection vector. The group is exploring a promising flow-cytometric sorting approach for isolating M13 clones that could deliver 7000 clones/h and is optimizing procedures for ordering M13 clones and selecting minimally overlapping inserts. These procedures will drastically reduce redundancy in the shotgun approach and the postsequencing fragment-assembly process. The UWM group is also quantitating M13 sequencing-template concentrations to identify unsequenceable clones and normalize base calling.
Current widely used procedures for automated DNA sequencing all involve the use of substrates synthesized by DNA polymerases and terminated by a dideoxynucleotide analog. Stanley Tabor (Harvard Medical School) identified a site in many DNA polymerases that can be modified to incorporate these analogs more efficiently. This is critical for obtaining bands of uniform intensity in the gel electrophoresis step, increasing sequencing sensitivity and accuracy. The presence or absence of a single hydroxyl group (tyrosine vs phenylalanine) at a highly conserved position on E. coli, T7, and Taq polymerases makes more than a 1000-fold difference in their ability to discriminate against these analogs. Another advantage of these modified DNA polymerases is their requirement for much lower amounts of the fluorescent dideoxynucleotide analogs used in automated DNA sequencing; this reduces the cost of reagents and also the fluorescent background caused by unincorporated analogs.
Barbara Ramsay Shaw (Duke University) described a method that allows direct sequencing of PCR products, bypassing the need for cycle sequencing. The method is based on efficient and stable incorporation into DNA of a new class of boronated triphosphates that permit exponential amplification, unlike chain terminators. Sequence is revealed by simple exonuclease digestion. The method should be completely automatable and requires much less DNA template.
Application of higher electric fields in electrophoretic separation of DNA fragments can increase sequencing speed and efficiency. Conventional slab gels cannot dissipate adequately the additional heat produced, but capillary arrays or ultrathin (50- to 100-µm) slab gels can be used for more efficient heat transfer. Progress on these systems is reported below.
A bottleneck in gel-based systems is the manual gel-preparation step, which is both time consuming and a potential source of variability in DNA sequencing. Barry Karger (Northeastern University) discussed a fully automated, closed-end, high-throughput capillary electrophoresis (CE) system to minimize human intervention, with a non-cross-linked polymer matrix that is replaced after each run. The group is also attempting to incorporate a library-based primer-walking system. Karger observed that mutation detection will be another important use for capillary technology.
Norman Dovichi (University of Alberta) noted progress in constructing a CE sequencing system that requires a single laser for simultaneously exciting CE fluorescence signals from many capillaries. As they exit capillaries, DNA fragments with fluor labels are smoothly entrained in an optically clean fluid sheath flow in which as few as 120 fluorescein-labeled molecules can be detected. The capillary ends are physically staggered to distinguish fluorescence signals of the exiting DNA fragments. A 32-capillary system has been demonstrated, and Dovichi projected expansion to 864 within a sheath flow cuvette.
Richard Mathies [University of California, Berkeley (UCB)] spoke of recent efforts to develop and combine improved high-sensitivity fluorescent reagents and new instrumentation for capillary array electrophoresis (CAE) coupled with confocal fluorescence detection. His group has developed sequencing primers labeled with pairs of dyes coupled by fluorescence energy transfer and has obtained dye signals 2 to 6 times more intense than for single-dye-labeled primers, thus decreasing the amount of template DNA needed. Another project involves fabrication of a miniaturized capillary system using photolithography to etch very tight injection zones onto glass microscope slides; up to 100 channels can be placed on a single slide. Use of these CAE chips has enabled the group to achieve high-resolution separations of double-stranded DNA from 70 to 1000 bp in only 120 sec.
Michael Westphall (UWM) reported on a new fluorescence-based detection system for use with multiple fluorophore sequencing in horizontal ultrathin slab gel electrophoresis (HUGE). The system uses laser through-the-side excitation and a cooled CCD detector, which allows for parallel detection of up to 24 sets of 4 fluorescently labeled DNA-sequencing reactions during their separation in HUGE gels. The automated sequencing system is capable of producing 500 bp of raw data from 24 samples in less than 70 min. The group is exploring dense loading and is building an automated gel loader.
As an alternative to slab and capillary systems, Joe Balch [Lawrence Livermore National Laboratory (LLNL)] and colleagues at Perkin-Elmer Corporation are investigating a hybrid technique based on a high-density array of electrophoresis channels micromachined on a single, large substrate at fixed locations. DNA sequencing in both mechanically polished and chemically etched microchannels allows base calling to about 500 bp/channel (comparable to slab gels) for a 25-cm load, and current efforts are focused on developing larger channel arrays for high-throughput sequencing. A 1993 cooperative research and development agreement (CRADA) established this collaboration between LLNL and Perkin-Elmer to develop analytical instrumentation for faster DNA sequencing via electrophoresis.
Robert Weiss (University of Utah, NIH GESTEC and DOE funded) described instrumentation for automated hybridization and detection of DNA hybrids on nylon membranes to be used in multiplex mapping and sequencing of transposon inserts in large plasmid templates. The prototype system features a pair of nested Plexiglas cylinders with a heated inner drum having nylon membranes fixed to its outer surface; the drum rotates through a fluid puddle formed by an outer-drum enclosure. Fluorescent-light output on nylon membranes is amplified with a conjugated probe system of alkaline phosphatase combined with a fluorogenic alkaline-phosphatase substrate. The amplified signal allows detection of DNA hybrids in the subfemtomole band range. The group also began large-scale genomic DNA sequencing of the Pyrococcus furiosus genome for the DOE Microbial Genome Initiative.
Directed strategies using presynthesized libraries of short oligonucleotides (5 to 7 bases) may eliminate subcloning as well as the redundancy, sequence-assembly, and gap-closure problems associated with random-sequencing methods. Primer availability reduces the time and cost of synthesizing individual primers and will enable complete automation of this approach.
Levy Ulanovsky (Weizmann Institute of Science) and William Studier (Brookhaven National Laboratory) reported progress in designing primers from oligo libraries and developing closed-end, automatable systems. Ulanovsky assembles primers without ligation using a 5+7+7 structure with purine base stacking between the 5-mer and the adjacent 7-mer. The modular primers, which can be used with dye terminators on the ABI 373A automated sequencer and on the replaceable matrix CE system of Barry Karger, are extended by polymerase selectivity. The 1000-primer library contains 500 each of pentamers and degenerate hexamers.
Studier summarized efforts to construct a closed system that can be scaled up for production sequencing. His group assembled an efficient priming complex containing 3 hexamers from a library of 4096 hexamers. With dye terminators and an ABI 373A automated sequencer, primer walking with hexamer strings rapidly determined the entire sequence of two different M13 templates (6.4 and 7.2 kb), at a success rate comparable to that obtained with conventional primers. To extend primer walking to cosmid-sized (40-kb) DNA, new fesmid cloning vectors were developed (based on Simon's fosmids), whose inserts are flanked by replication and packaging signals recognized by bacteriophage T7. When the host cell is infected by T7, the cloned fragment is amplified and packaged into phage particles, leaving most of the vector sequence behind. Fesmids should provide adequate amounts of DNA for performing several primer-walking steps directly on each 40-kb template. The group (with Karger's laboratory) is integrating this walking approach into a CE system containing multiple capillaries with a replaceable polyacrylamide matrix.
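The library size quoted above follows from simple counting: with four bases there are 4^6 = 4096 possible hexamers, so any 18-base priming site can be covered by three contiguous library members. The Python sketch below illustrates only that bookkeeping. The function names, the `offset` parameter, and the example read are hypothetical and are not taken from Studier's group; real primer selection would also weigh factors such as base composition and template structure, which are ignored here.

```python
from itertools import product

BASES = "ACGT"


def hexamer_library():
    """Return the complete set of 4**6 = 4096 hexamers."""
    return {"".join(p) for p in product(BASES, repeat=6)}


def hexamer_string(known_read, n_hexamers=3, offset=0):
    """Split the priming site at the end of a known read into contiguous hexamers.

    known_read : sequence already determined, written 5'->3' (assumed input).
    offset     : hypothetical parameter, number of bases to back off from the
                 end of the read before placing the priming site.
    Returns the hexamers in 5'->3' order; laid side by side on the template
    they cover one 18-base site when n_hexamers is 3.
    """
    site_len = 6 * n_hexamers
    end = len(known_read) - offset
    start = end - site_len
    if start < 0:
        raise ValueError("known read is too short for the requested priming site")
    site = known_read[start:end]
    return [site[i:i + 6] for i in range(0, site_len, 6)]


if __name__ == "__main__":
    library = hexamer_library()
    walk = hexamer_string("ACGTACGTACGTACGTACGTAA")
    print(len(library), walk)
    # Every hexamer needed for the next walking step is already in the library.
    print(all(h in library for h in walk))
```

Because the library is exhaustive, the lookup in the last line always succeeds for an unambiguous A/C/G/T read; that is the property that removes the need to synthesize a new primer for each walking step.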
An innovative approach for high-throughput DNA sequencing uses mass spectrometry (MS) instead of gel electrophoresis for very fast separation and detection of nested sequencing fragments. MS has the potential to reduce fragment separations from about 1 h using gel-based methods to mere seconds.
Winston Chen [Oak Ridge National Laboratory (ORNL)] reported detection of single- and double-stranded DNA up to 500 bp and enhanced detection sensitivity to the femtomole range using matrix-assisted laser desorption ionization (MALDI) MS. Current mass resolution corresponds to about 10 bases, with single-base resolution the immediate goal.
Peter Williams' team (Arizona State University) launches DNA into a time-of-flight mass spectrometer from an ice matrix. The ice is transparent to the laser beam, which impacts the underlying copper support and shock-vaporizes the ice-DNA mixture. DNA mixtures with lengths up to 90 nucleotides have been analyzed with near-single-base resolution. The immediate challenge is to achieve high reproducibility of still-rare successes.
Christine Nelson (UWM) discussed studies on DNA fragmentation that may be responsible for MALDI's limitations in size and base composition. The group found that base protonation seems to initiate fragmentation.
In another MS approach, Richard Smith (Pacific Northwest Laboratory) described analysis of single, large DNA fragments by using electrospray ionization (ESI) combined with Fourier transform ion cyclotron resonance (FTICR) MS. In this approach, large (up to 25-kb) DNA segments are transferred to the gas phase using ESI; multiply charged ions are trapped in an FTICR mass spectrometer cell, where single ions can be selected for precise and rapid mass measurements. Thus, large DNA segments can be ionized intact and detected with high sensitivity by a nondestructive process, due to the high charge that results from ESI, allowing reactions to be followed for effectively unlimited time (>>hours). Current research is aimed at developing the gas phase chemistry necessary to obtain sequence by inducing a step-wise dissociation of the trapped ions.
Sequencing by hybridization (SBH) provides sequence information by specifically hybridizing small probes to a target sequence fixed to a solid support. Although initially conceived as a direct DNA-sequencing procedure, the SBH format is also practical for mutation and polymorphism detection, HLA typing, and other uses. Andrei Mirzabekov (Argonne National Laboratory) spoke about developments in the DNA-hybridization microchip and its use as a versatile tool for DNA analysis. His group successfully identified mutations in blood samples of beta-thalassemia patients.
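The principle behind SBH can be illustrated with a toy example: the set of short probes that hybridize to a target (its "spectrum") can be chained by their overlaps to recover the sequence. The Python sketch below is a minimal illustration under idealized assumptions, namely error-free hybridization and a target in which no (k-1)-base overlap occurs twice. The function names and the example target are hypothetical, and the sketch does not represent the microchip method described above.

```python
def spectrum(target, k):
    """Return the set of all k-base probes that match the target perfectly."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(probes, k):
    """Chain probes by (k-1)-base overlaps to rebuild the target.

    Assumes an ideal spectrum: no hybridization errors and no repeated
    (k-1)-mers in the target. Raises ValueError when that assumption fails.
    """
    remaining = set(probes)
    suffixes = {p[1:] for p in remaining}
    # The first probe is the only one whose prefix is not another probe's suffix.
    starts = [p for p in remaining if p[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("cannot identify a unique starting probe")
    sequence = starts[0]
    remaining.remove(sequence)
    while remaining:
        tail = sequence[-(k - 1):]
        candidates = [p for p in remaining if p[:-1] == tail]
        if len(candidates) != 1:
            raise ValueError("ambiguous or missing overlap; spectrum is not ideal")
        sequence += candidates[0][-1]  # each overlapping probe adds one base
        remaining.remove(candidates[0])
    return sequence


if __name__ == "__main__":
    probes = spectrum("ATGCGTAC", 3)
    print(sorted(probes))
    print(reconstruct(probes, 3))
```

Repeated overlaps make the chain ambiguous, which is one reason the same hybridization format is often better suited to comparing a sample against a known reference (mutation and polymorphism detection) than to reading long unknown sequences.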
In collaboration with Nanogen, Inc., Glen Evans' team has developed the Active Programmable Electronic Matrix (APEX) to provide exquisite electronic control of individual oligomer sequences in oligomer arrays. An electrode under each patch permits selective attraction and repulsion of interrogating sequences as well as transfer of test DNA strands. The matrix is an electronic device like a computer chip capable of interacting with DNA. This novel technology, which has potential as a minilab on a chip, allows very fast hybridizations under attraction conditions and more sensitive mismatch discrimination between oligomer and test strands under repulsion conditions.
Charles Cantor (Boston University) described the use of DNA sequences in capture and detection methods to facilitate genome analysis. In his group's enzyme-enhanced SBH approach, a partially duplex DNA probe captures the five complementary bases at the end of a single-stranded DNA target. Additional bases can be read and accuracy of the original detection ensured by DNA-polymerase extension of the probe serving as the primer along a target serving as template. Cantor discussed using the SBH format to prepare DNA samples rapidly for such fast serial-sequencing methods as CE or MS; an array of 1024 probes could capture and generate sequence ladders from any arbitrary DNA sequence. He also described an efficient single-sided Alu-PCR procedure that provides larger samples for mapping and other purposes.
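The figure of 1024 probes is consistent with one probe for every possible five-base end, since 4^5 = 1024. The Python sketch below enumerates such an array and looks up which single-stranded overhang would pair with the end of a given target. It assumes that the target's 3' end is the one captured and that overhangs are written 5'->3'; both are illustrative assumptions, as are the function names and the example target, and none of them is stated in the report above.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overhang_array(n=5):
    """Enumerate every possible n-base overhang; 4**5 = 1024 when n is 5."""
    return ["".join(p) for p in product("ACGT", repeat=n)]


def capturing_overhang(target, n=5):
    """Overhang (5'->3') that base-pairs with the last n bases of the target.

    Assumption for illustration: the 3' end of the target is captured.
    """
    if len(target) < n:
        raise ValueError("target is shorter than the overhang")
    return reverse_complement(target[-n:])


if __name__ == "__main__":
    array = overhang_array()
    probe = capturing_overhang("GGGATTACAGT")
    print(len(array), probe, probe in array)
```

Because the array is exhaustive, exactly one overhang matches any unambiguous target end, which is what allows a fixed array to capture an arbitrary DNA sequence.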
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v6n5).
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome.
Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.