Sponsored by the U.S. Department of Energy Human Genome Program
For the first time, scientists have the nearly complete genetic instructions for an animal that, like humans, has a nervous system, digests food, and reproduces sexually. The 97-million-base genome of the tiny roundworm Caenorhabditis elegans was deciphered by an international team led by Robert Waterston (Washington University School of Medicine, St. Louis) and John Sulston (Sanger Centre, Cambridge, England). The work was reported in a special issue of the journal Science (December 11, 1998) that featured six articles describing the history and significance of the accomplishment and some early sequence-analysis results.
Although sequencing has been almost completed, investigators pointed out that analysis and annotation will continue for years, facilitated by more information and better technologies. "We have provided biologists with a powerful new tool to experiment with and learn how genomes function," said Waterston. Obtaining genomic sequence, they noted, is more a beginning than an end.
C. elegans and the budding yeast Saccharomyces cerevisiae (whose 12-Mb genome was completed in 1996) are the only eukaryotes completely sequenced thus far. The two genomes are being compared in an attempt to identify elements essential for eukaryotic life and the genetic requirements for progression from a unicellular to a multicellular existence. Eukaryotes, which include plants and animals, are the most complex of the three major branches of life on earth. The other branches are the least complex prokaryotes (bacteria) and the moderately complex Archaea, which share features with both other branches.
During its 2- to 3-week life span in the dirt of temperate regions, the benign C. elegans carries out many of the same processes as humans. Unlike the much smaller microbes sequenced so far, it begins life as a single fertilized cell that undergoes a series of divisions as it grows into an adult animal, forming complex tissues and organ systems. Researchers have found it particularly useful for studying early development, neurobiology, and aging, processes that have parallels in human biology.
The 9-year sequencing project required 2 million individual "reads" performed on DNA sequencing instrumentation to spell out the worm DNA sequence, 500 bases at a time. It began with the development of a clone-based physical map to facilitate gene analysis and grew into a collaboration among C. elegans Sequencing Consortium members and the entire international community of C. elegans researchers. In addition to the nuclear genome-sequencing effort, other researchers sequenced the worm's 15-kb mitochondrial genome and carried out extensive cDNA analyses that facilitated gene identification. Free data exchange and immediate data release have been hallmarks of the project, which has been a model for cooperation and sharing among Human Genome Project researchers.
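The read figures quoted above imply a rough level of redundancy, which a few lines of Python can make explicit. This is only a back-of-the-envelope sketch: it assumes every read contributed a full 500 bases and that all 2 million reads were spread evenly over the 97-million-base genome, neither of which the article states. The function name fold_coverage is illustrative.

```python
# Figures quoted in the article (treated here as round approximations).
READS = 2_000_000          # individual sequencing reads
READ_LENGTH = 500          # bases per read (assumed uniform for this sketch)
GENOME_SIZE = 97_000_000   # bases in the C. elegans genome


def fold_coverage(reads, read_length, genome_size):
    """Average number of times each base is read, assuming even spread."""
    return reads * read_length / genome_size


total_bases = READS * READ_LENGTH
coverage = fold_coverage(READS, READ_LENGTH, GENOME_SIZE)

print(f"Raw bases read: {total_bases:,}")
print(f"Approximate fold coverage: {coverage:.1f}x")
```

Under these assumptions, about a billion raw bases were read, so each base of the genome was read roughly ten times on average. That redundancy is what allows overlapping reads to be assembled and errors to be cross-checked.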
The first of a two-part sequencing process used to parse the C. elegans genome was the "shotgun" sequencing of randomly chosen subclones (each only a small piece of a much larger cloned DNA molecule). The finishing phase used a more ordered (directed) sequencing strategy to close specific remaining gaps and resolve ambiguities. Members of the Sequencing Consortium noted that, were they to begin the project again today, they would use the same combination strategy but with larger bacterial clones such as BACs. This is the strategy currently being used for large-scale human genome sequencing in the HGP (p. 4). Although tools for both sequencing phases have improved greatly over the years, finishing remains labor intensive.
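Why random shotgun reads alone leave gaps for a directed finishing phase can be illustrated with a toy simulation. The sketch below is not the Consortium's software or method: the 40,000-base clone length, the read counts, and the function names sample_reads and find_gaps are all hypothetical choices made for illustration, and real assembly works from sequence overlaps rather than known coordinates.

```python
import random


def sample_reads(clone_length, n_reads, read_length, rng):
    """Pick random read positions as half-open (start, end) intervals."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(0, clone_length - read_length + 1)
        reads.append((start, start + read_length))
    return reads


def find_gaps(clone_length, reads):
    """Return the half-open (start, end) intervals not covered by any read."""
    gaps = []
    covered_to = 0
    for start, end in sorted(reads):
        if start > covered_to:
            gaps.append((covered_to, start))
        covered_to = max(covered_to, end)
    if covered_to < clone_length:
        gaps.append((covered_to, clone_length))
    return gaps


CLONE_LENGTH = 40_000  # hypothetical clone size, chosen for illustration
READ_LENGTH = 500      # read length quoted in the article
rng = random.Random(0)  # fixed seed so repeated runs give the same picture

for n_reads in (80, 240, 640):
    reads = sample_reads(CLONE_LENGTH, n_reads, READ_LENGTH, rng)
    gaps = find_gaps(CLONE_LENGTH, reads)
    coverage = n_reads * READ_LENGTH / CLONE_LENGTH
    uncovered = sum(end - start for start, end in gaps)
    print(f"{coverage:.0f}x coverage: {len(gaps)} gaps, {uncovered} bases uncovered")
```

Because read positions are random, adding more reads shrinks the uncovered regions only gradually, and at some point it becomes cheaper to target the remaining gaps directly, which is the role of the directed finishing phase described above.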
The magnitude of this effort underscores the challenge of sequencing the human genome, which is some 30 times larger than that of C. elegans. Methods and data from the work are helping researchers sequence and interpret the human genome. In fact, a significant amount of production sequencing occurs at Washington University and the Sanger Centre.
Early analysis highlights the importance of sequencing entire genomes for finding all genes and understanding the function of nonprotein-coding DNA regions in the genomes of such eukaryotic organisms as humans and roundworms (see Why Sequence Entire Genomes?). The C. elegans genome is packaged into 6 chromosomes containing about 19,000 genes, several times the number originally predicted by classical genetics experiments. About 40% of identified genes match those of other organisms, including humans. Like the human genome, the C. elegans genome contains large amounts of repeated DNA that does not encode proteins but probably plays a role in chromosome function, gene organization, or regulation of gene activity. The C. elegans project was funded by NIH and the Medical Research Council (U.K.). [Denise Casey, HGMIS]
C. elegans Data
Notes associated with the Science paper and links to data resources are on the sites listed below:
WUSTL Genome Sequencing Center, http://genome.wustl.edu/gsc/index.shtml
Sanger Centre, http://www.sanger.ac.uk
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v10n1-2).