Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
BAC End Sequencing Speeds Large and Small Projects
The ultimate goals of the Human Genome Project (HGP) are to determine the sequence of the 3 billion DNA bases that make up the human genome and to increase understanding of gene function. In search of the best route to these ends, researchers have generated several types of useful chromosomal maps. Eventually, the human genome will be represented by DNA chromosome sequences with various levels of annotation.
Interim maps have proven useful for biomedical research, but the most valuable map resources for production DNA sequencing are megabase-scale assemblies of overlapping DNA clones (contigs). Building long contigs, however, has proven a difficult task. Although contig maps of chromosomes 16 and 19 (developed at Los Alamos and Lawrence Livermore national laboratories, respectively) were largely complete in 1995, comparable contig maps of other chromosomes are less ready to support high-throughput sequencing. To help alleviate this impending bottleneck, in 1998 DOE sponsored projects to enrich the BAC clone resources preferred for high-throughput sequencing systems.
BACs and STCs
Recent DOE-sponsored projects are producing sequence tag connectors (STCs) on BACs to help extend the human chromosome sequence already acquired. STCs are DNA sequence reads at both ends of the BACs. In 1995 and 1996 investigators began to advocate that the STC concept, which had proven useful in smaller-scale sequencing projects, be applied to large-scale human genome sequencing (Venter et al., Nature 381, 364-66). DOE accepted related applications in 1996 and implemented a fast-track, special review process involving a panel composed of international experts in human and mouse genetics, mapping, sequencing, informatics, and management. Following the panel's recommendations, in September 1996 DOE initiated pilot projects at six laboratories to refine protocols and clarify cost and quality factors.
Several months later, in 1997, a workshop and review were held to assess progress. Attendees recommended that DOE maintain its level of support at about $5 million a year. They also suggested concentrating STC production at the sites that achieve the highest-quality sequence reads, because such reads allow the design of valuable sequence tagged sites (STSs) (see Mapping with STCs and STSs).
High-throughput STC production is now being carried out at The Institute for Genomic Research (TIGR) under Bill Nierman and at the University of Washington Department of Molecular Biology (UWMB) by Gregory Mahairas of Leroy Hood's team. These sequencing projects are slated for completion in late 1999, with STC data sets on some 450,000 BACs. As of February 1999, more than 378,000 STCs had been acquired at the two sites (see BAC Projects).
When complete, the STC data will give researchers STC markers spaced, on average, every 3000 to 4000 bases across the entire human genome, a 100-fold improvement over other current human genome maps.
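The spacing figure can be checked with simple arithmetic from numbers given in this article: about 450,000 BACs, one STC read at each end of a BAC, and a genome of 3 billion bases. The short Python sketch below is only an illustration. The function and constant names are hypothetical, and it assumes every BAC yields two usable end reads spread evenly across the genome, which is an idealization.

```python
# Figures taken from the article; names are illustrative only.
GENOME_BASES = 3_000_000_000   # approximate size of the human genome
PLANNED_BACS = 450_000         # BACs slated for end sequencing
ENDS_PER_BAC = 2               # one STC read at each end of a BAC


def average_stc_spacing(genome_bases, bac_count, ends_per_bac=ENDS_PER_BAC):
    """Average distance between STC markers, assuming evenly spread reads."""
    total_stcs = bac_count * ends_per_bac
    return genome_bases / total_stcs


spacing = average_stc_spacing(GENOME_BASES, PLANNED_BACS)
print(f"{PLANNED_BACS * ENDS_PER_BAC:,} STCs, one marker every {spacing:,.0f} bases on average")
```

Under these assumptions the result falls inside the 3000-to-4000-base range quoted above. Failed reads or uneven clone coverage would widen the real spacing somewhat.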
The availability of STC data sets encourages more participation by smaller laboratories, whose contig building was previously hindered by the prohibitive cost of maintaining and processing libraries on the human genome scale. With the number of STC data sets now expanding, BACs that could extend a chromosomal sequence can be screened computationally over the Internet. Scientists need to order only those BACs identified as candidates for contig extension (see Availability of BAC Clones and STC Data).
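The idea of screening computationally for extension candidates can be illustrated with a toy Python sketch. It is not the procedure used by TIGR, UWMB, or any STC database. The function names, the data layout (a dictionary of end reads keyed by BAC identifier), and the short made-up sequences are all assumptions for illustration. The sketch uses exact string matching on both strands, whereas a real search would use an alignment tool that tolerates sequencing errors.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_extension_candidates(contig, stc_reads):
    """List (bac_id, end_label, position) for every STC found in the contig.

    Hits are sorted with the highest position first, so BACs anchored
    nearest the right-hand end of the contig come first.
    """
    contig = contig.upper()
    hits = []
    for bac_id, ends in stc_reads.items():
        for end_label, read in ends.items():
            read = read.upper()
            # An end read may come from either strand, so try both.
            for probe in (read, reverse_complement(read)):
                position = contig.find(probe)
                if position != -1:
                    hits.append((bac_id, end_label, position))
                    break
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits


# Made-up example data (hypothetical BAC identifiers and sequences).
CONTIG = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
STC_READS = {
    "bac_001": {"left": "GGATCCGATT", "right": "TTTTTTTTTT"},
    "bac_002": {"left": "CCCCCCCCCC", "right": "CTAACGTACG"},
    "bac_003": {"left": "AAAAAAAAAA", "right": "GGGGGGGGGG"},
}

for bac_id, end_label, position in find_extension_candidates(CONTIG, STC_READS):
    print(f"{bac_id}: {end_label} end matches contig at offset {position}")
```

In this toy data only bac_001 and bac_002 have an end read inside the contig, so only those two clones would need to be ordered. Their other ends lie outside the known sequence and could therefore extend it.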
Enriching STC Data
STCs Useful in Non-HGP Efforts on Human, Other Genomes
*UniGene lists entries for nonredundant EST sequences read from the ends of cDNA clones generated and arrayed for wide distribution by the international I.M.A.G.E. Consortium [HGN6(6), 3]. I.M.A.G.E. clone libraries are an outgrowth of a 1991 DOE initiative to enrich the developing human genome physical maps with gene loci and open broad access to the resulting data and resources.
See graphic, BAC End Sequencing Extends Clones.
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome. See Timeline for more HGP history.
Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.