Sponsored by the U.S. Department of Energy Human Genome Program

Human Genome News Archive Edition

Human Genome News, May 1992; 4(1)

Successful Worm Studies Yield Much Data

An international research team has deposited the first 121,298 bp of finished DNA sequence from the roundworm genome into public databases, and the researchers say another 250 kb will be ready for inclusion soon. Christened an "honorary human" by Human Genome Project researchers, the worm, Caenorhabditis elegans, has become an important testing ground for efforts to make large-scale DNA sequencing faster and more economical.

C. elegans is only a millimeter long, but it is extremely well studied by scientists seeking to understand how genes control growth and development. Cell division has been traced from fertilization to each of the adult worm's 959 body cells, and the wiring of its nervous system has been completely diagrammed. More than 95% of the worm's DNA along its six chromosomes has been physically mapped using cosmid and yeast artificial chromosome clones. The entire C. elegans genome, which is slightly smaller than one average human chromosome, is estimated to contain about 100 Mb.

In a recent issue of Nature [Vol. 356 (March 5, 1992)], Robert Waterston (Washington University, St. Louis) and John Sulston [U.K. Medical Research Council (MRC)] and their coworkers reported significant progress in their large-scale sequencing effort. Having sequenced more than 350,000 bp of the worm's DNA, the researchers plan to complete over 3 Mb in the next 2 years. The team has also uncovered a number of completely new genes, including genes not previously known to occur in C. elegans.

Beyond sequencing and gene discovery, the team expects to lower the cost of DNA sequencing to $1 per base by applying its strategy to production-scale sequencing, according to the Nature report. With further refinements in technique, "a reduction in costs to $0.50 per base seems realistic," the authors said. When the Human Genome Project began, the cost of sequencing was estimated to be $2 to $5 per base.
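To put the per-base figures in perspective, the short sketch below multiplies each quoted rate by two sizes mentioned in this article: the roughly 3 Mb the team plans to complete and the estimated 100-Mb C. elegans genome. The pairing of rates with totals is an illustrative assumption, as are the function and variable names; the Nature report is not quoted here as giving these totals.

```python
# Back-of-envelope cost arithmetic using figures quoted in the article.
# Names below are hypothetical and chosen only for this illustration.

PLANNED_BASES = 3_000_000     # "over 3 Mb in the next 2 years"
GENOME_BASES = 100_000_000    # C. elegans genome, "about 100 Mb"

# Dollars per base, as quoted in the article.
RATES = {
    "early HGP estimate (low)": 2.00,
    "early HGP estimate (high)": 5.00,
    "production-scale target": 1.00,
    "with further refinements": 0.50,
}


def sequencing_cost(num_bases, dollars_per_base):
    """Return the total cost in dollars of sequencing num_bases bases."""
    return num_bases * dollars_per_base


def cost_table(num_bases):
    """Map each quoted rate label to the total cost for num_bases bases."""
    return {label: sequencing_cost(num_bases, rate)
            for label, rate in RATES.items()}


if __name__ == "__main__":
    for name, size in (("planned 3 Mb", PLANNED_BASES),
                       ("whole genome", GENOME_BASES)):
        print(name)
        for label, total in cost_table(size).items():
            print(f"  {label}: ${total:,.0f}")
```

Under these assumptions, the planned 3 Mb would cost about $3 million at $1 per base, and halving the rate to $0.50 would halve that total. These are rough multiplications that ignore overhead, redundancy in coverage, and any change in cost over time.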

The worm sequencing project is jointly supported by the National Center for Human Genome Research (NCHGR) and MRC. "This accomplishment is an important initial step toward the Human Genome Project's DNA sequencing goals," said Robert Strausberg, who oversees the NCHGR sequencing programs. "Before we commit resources to large-scale sequencing of the human genome, we must first show that such sequencing is technically possible and that it can be done more cost-effectively than previous strategies have allowed." In addition, says Strausberg, the work has shown that sequencing and computer analysis of large genomic regions can turn up valuable information about genes and their organization on chromosomes.

In the article, Waterston, Sulston, and coworkers described the strategy they used to sequence DNA in three cosmid clones covering a region of chromosome 3 known to be a rich source of genes. XDAP software, specially developed to speed assembly and editing, has been key to this progress. Using the BLAST and GENEFINDER computer programs to analyze the sequence, the investigators estimated that the region contains at least 32 genes, but further studies must be performed to confirm their identity. Comparisons with existing databases showed that 15 genes were similar to known genes but had not previously been identified in C. elegans. The remainder of the predicted genes appear to be completely new.

The programs predicted a larger number of genes in the sequenced region than indicated by classical genetic analysis or mutation studies. Analysis of genomic DNA has produced similar surprises in the genomes of yeast, Escherichia coli, and the fruit fly. In addition, several duplications and repeated sequences were discovered, including a large stretch of repeats similar to that discovered recently at the site of the human fragile X gene. The authors concluded that "genomic sequence is an efficient means for finding not only genes but also the other information stored in the genome."

The sequences have been submitted to the GenBank® and European Molecular Biology Laboratory databases, and the team has created ACEDB, a C. elegans database that provides a bibliography and information on genetic maps, mapped clones, and sequence. "The ability to view the sequence in the context of much of the available knowledge about the worm should speed the assignment of function to each sequence," the Nature report says.


Predicted Genes in C. elegans

Similarities to Other Known Genes

  • adenylyl cyclase
  • phenylethanolamine-N-methyltransferase
  • acetyl-CoA acetyltransferase
  • Tc3 hypothetical protein
  • neutrophil oxidase factor
  • SLP1
  • giant secretory protein
  • 50S ribosomal protein L11
  • glucose transporter
  • IE110
  • arsATPase
  • rat proton pump
  • glutathione reductase
  • CDC25/string
  • globin-like host protective antigen

The C. elegans genome database (ACEDB) is available at no charge. Contact:

  • Richard Durbin
    MRC Laboratory of Molecular Biology
    Hills Road
    Cambridge CB2 2QH, U.K.
    (Int.) 44/223-248011
    E-mail: rd@mrc-lmba.cam.ac.uk

Reported by Leslie Fink, Office of Communications, NIH, NCHGR

The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v4n1).


Human Genome News

Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.