Archive Site Provided for Historical Purposes
Sponsored by the U.S. Department of Energy Human Genome Program
On April 13, the U.S. Secretary of Energy announced that researchers at the DOE Joint Genome Institute (JGI) had determined the draft sequence for human chromosomes 5, 16, and 19. The three contain more than 300 million bases or about 10% of the total human genome, with an estimated 10,000 to 15,000 genes.
Some Disorders Linked to Genes on Chromosomes 5, 16, and 19
"These three chapters in the reference book of human life are nearly complete," said Energy Secretary Richardson. "Scientists already can mine this treasure trove of information for the advances it may bring in our basic understanding of life and in such applications as diagnosing, treating, and eventually preventing disease."
JGI, now headed by Trevor Hawkins,* was established by DOE at Walnut Creek, California, in 1997. It is one of the largest publicly funded human genome sequencing centers in the world.
JGI Sequencing Strategies
A critical part of JGI's strategy was to sequence paired-end plasmids instead of the M13 subclones used in most other HGP sequencing facilities. Because of the forward and reverse links between the paired ends, plasmids provided excellent order and orientation value when the fragments were assembled into large contiguous stretches (contigs). The result was "virtual" megabase-sized contigs whose lengths facilitate gene discovery. This is immensely helpful to gene hunters, who are finding that data on order and orientation are not available for many contigs in the current human genome maps.
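To illustrate how paired ends can supply order and orientation, the following Python sketch treats the two end reads of one plasmid insert as facing each other, so that a pair landing on two different contigs implies which contig comes first and whether either must be reverse-complemented. This is only a toy model under that stated assumption, not JGI's assembly software: the names Read, link_from_pair, and walk_scaffold are hypothetical, and a real assembler would also weigh insert sizes, many links per join, and conflicting evidence.

```python
from collections import namedtuple

# One end read of a plasmid clone placed on a contig (hypothetical record).
# strand is '+' if the read matches the contig as stored, '-' if it matches
# the contig's reverse complement.
Read = namedtuple("Read", "contig strand")


def flip(strand):
    """Return the opposite strand or orientation symbol."""
    return "-" if strand == "+" else "+"


def link_from_pair(fwd, rev):
    """Turn one read pair into an ordered, oriented contig link.

    Assumption: in the true layout the first read lies upstream on '+'
    and its mate lies downstream on '-'. Orientation '+' means "use the
    contig as stored"; '-' means "reverse-complement it".
    """
    left = (fwd.contig, fwd.strand)
    right = (rev.contig, flip(rev.strand))
    return left, right


def both_readings(link):
    """A link read left-to-right is also valid right-to-left with both
    contigs flipped."""
    (lc, lo), (rc, ro) = link
    return [link, ((rc, flip(ro)), (lc, flip(lo)))]


def walk_scaffold(links, start):
    """Greedily follow links rightward from an oriented start contig."""
    successors = {}
    for link in links:
        for left, right in both_readings(link):
            successors.setdefault(left, []).append(right)
    path, seen = [start], {start[0]}
    while True:
        candidates = [n for n in successors.get(path[-1], [])
                      if n[0] not in seen]
        if not candidates:
            return path
        path.append(candidates[0])
        seen.add(candidates[0][0])


# Two made-up clones whose end reads fall on different contigs.
links = [
    link_from_pair(Read("A", "+"), Read("B", "+")),
    link_from_pair(Read("C", "-"), Read("B", "-")),
]
scaffold = walk_scaffold(links, ("A", "+"))
print(scaffold)
```

In this invented example the walk places contig A as stored, then contig B reverse-complemented, then contig C as stored. Chaining many such joins is one way ordered and oriented "virtual" contigs can grow far beyond the length of any single assembled fragment.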
Computational Analysis of Draft Data
The Computational Biology Section at Oak Ridge National Laboratory (ORNL) enriched the draft sequence by maximizing fragment order and orientation, assembling contiguous sequence stretches, and finding genes. An IBM SP3 supercomputer, one of the world's most powerful, provided the massive computing capability for analyzing millions of DNA base pairs. Standard data-analysis methods first identified such genomic features as sequence tagged sites (STSs), BAC end sequence tag connectors (STCs), and expressed sequence tags (ESTs). Data were refined further by gene-identification programs such as GRAIL-Exp, which use both EST and complete cDNA data to add greater confidence in gene prediction. These analyses not only allowed for gene identification but also provided some fragment- or clone-ordering information.
The Java-based Genome Channel [http://genome.ornl.gov] browser developed at ORNL provides a view of genomic sequences, computational and experimental annotation, and related links. The HTML-based Genome Catalog includes genomic summary reports, gene and protein lists, homologies, and other Internet capabilities.
Finishing the Draft to High Quality
Some limitations of rough draft data include project-to-project contamination, floating contigs (sequence reads that don't seem to belong anywhere), and false joins and other assembly errors. Finding useful biological information, even with accompanying cDNA sequences, is extremely difficult with gaps, incomplete order and orientation, incorrect assemblies, and base-pair errors. Prefinishing steps at the Stanford Human Genome Center involve reassembling and analyzing the sequence, with the goal of fixing low-quality regions and filling in gaps. Finishing includes performing computational analysis of the assembly and resolving discrepancies. Final finished data are submitted to GenBank when clones are completely contiguous. (See Data Web sites.)
During October, JGI launched its first "Microbial Month," turning out high-quality draft sequences at a rate of more than one every 1.5 working days. JGI sequence data are sent to ORNL's annotation pipeline, where they are analyzed rapidly for genes and other important biological features. In addition to the basic research value of the 15 selected bacterial genomes, many have immediate implications for the economy and the environment. The next two "bug months" are scheduled for March and August 2001.
Sequencing has begun on mouse genomic regions that are similar to gene-containing regions in human chromosomes 5, 16, and 19. The extensive 9x coverage of chromosome 19 has enabled the rapid generation of sequence-ready mouse maps that are providing clones for the sequencing pipeline. These maps also furnish reagents for basic studies of genome evolution and analysis of mouse mutations. Furthermore, a collaborative project is in the works to sequence 30 to 50 Mb of mouse genomic clones generated by ORNL-developed knockout mice (those with deleted or inactivated genomic regions).
In October, JGI announced a collaboration to sequence the genome of Fugu rubripes (pufferfish). Joining JGI are the Institute for Molecular and Cell Biology (Chris Tan), U.K. HGMP Resource Centre (Greg Elgar), Molecular Sciences Institute (Sydney Brenner), and Institute for Systems Biology (Leroy Hood). Because of its strong similarity to the human genome in number of genes and control sequences, the Fugu genome is considered a powerful, compact tool for identifying these regions in the much larger human genome. Scientists expect to sequence more than 95% of Fugu by March 2001.
Coverage continues at the JGI Web site [http://www.jgi.doe.gov].
*On November 3, DOE announced Trevor Hawkins' appointment as JGI director. JGI's first director, Elbert Branscomb, will assume leadership in developing the new OBER program, Bringing the Genome to Life.
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v11n1-2).
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome. See Timeline for more HGP history.