Sponsored by the U.S. Department of Energy Human Genome Program
In generating the draft sequence, scientists determined the order of base pairs in each chromosomal area at least 4 to 5 times (4x to 5x) to ensure data accuracy and to help with reassembling DNA fragments in their original order. This repeated sequencing is known as genome "depth of coverage." Draft sequence data are mostly in the form of fragments about 10,000 base pairs long whose approximate chromosomal locations are known.
To generate high-quality sequence, additional sequencing is needed to close gaps, reduce ambiguities, and allow for only a single error every 10,000 bases, the agreed-upon standard for HGP finished sequence. Investigators believe that a high-quality sequence is critical for recognizing regulatory components of genes that are very important in understanding human biology and such disorders as heart disease, cancer, and diabetes. The finished version will provide an estimated 8x to 9x coverage of each chromosome. Thus far, finished sequences have been generated for only two human chromosomes — 21 and 22.
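The coverage and accuracy figures above can be turned into rough arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it treats coverage as a simple average (total bases read divided by region length), which is an assumption rather than a description of how HGP centers tracked their data.

```python
# Illustrative arithmetic only; function names are hypothetical.

def raw_bases_needed(region_bp, coverage):
    """Total bases read if a region is sequenced `coverage` times on average."""
    return region_bp * coverage


def max_allowed_errors(region_bp, bases_per_error=10_000):
    """Errors permitted under a one-error-per-`bases_per_error` standard."""
    return region_bp / bases_per_error


fragment_bp = 10_000  # approximate draft fragment size given in the text

# Draft (4x to 5x) versus finished (8x to 9x) sequencing effort for one fragment.
draft_bases = [raw_bases_needed(fragment_bp, c) for c in (4, 5)]
finished_bases = [raw_bases_needed(fragment_bp, c) for c in (8, 9)]

print("draft bases read:", draft_bases)
print("finished bases read:", finished_bases)

# Finished standard: at most one error per 10,000 bases.
print("errors allowed in 1 Mb:", max_allowed_errors(1_000_000))
```

Under these assumptions, finishing a region requires roughly twice the raw sequencing of the draft, and a finished megabase may contain no more than about 100 errors.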
In December 1999, the 56-Mb sequence of human chromosome 22 was declared essentially complete, yet only 33.5 Mb were sequenced. In early spring of this year, the 180-Mb genome of the fruit fly Drosophila also was announced as completed, although just 120 Mb were characterized. What's the deal?
Animal genomes have large DNA regions that currently cannot be cloned or assembled. In the human genome sequence, these regions include telomeres and centromeres (chromosome tips and centers), as well as many chromosomal areas packed with other types of sequence repeats.
Most unsequenceable areas contain heterochromatic DNA, which has few genes and many repeated regions that are difficult to maintain as clones for DNA sequencing. HGP scientists strive to sequence the entire euchromatic DNA, which generally is defined as gene-rich areas (including both exons and introns) that are transcribed into RNA during gene expression. In the case of human chromosome 22, the sequenced 60% represents 97% of the chromosome's euchromatic DNA. Similarly, nearly all the euchromatic regions were sequenced for Drosophila.
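The percentages follow directly from the sizes quoted earlier for chromosome 22 and Drosophila. A minimal sketch of that arithmetic is shown here; the helper name is hypothetical, and the implied euchromatic size of chromosome 22 is a back-calculation from the 97% figure, not a number stated in the article.

```python
# Illustrative arithmetic only; the helper name is hypothetical.

def fraction_sequenced(sequenced_mb, total_mb):
    """Share of a chromosome or genome that was actually sequenced."""
    return sequenced_mb / total_mb


# Figures quoted in the article (megabases).
chr22_fraction = fraction_sequenced(33.5, 56)
drosophila_fraction = fraction_sequenced(120, 180)

# Back-calculated, not stated in the source: if 33.5 Mb is 97% of the
# euchromatic DNA, the euchromatic portion is about 33.5 / 0.97 Mb.
chr22_euchromatin_mb = 33.5 / 0.97

print(f"chromosome 22 sequenced: {chr22_fraction:.0%}")
print(f"Drosophila sequenced: {drosophila_fraction:.0%}")
print(f"implied chromosome 22 euchromatin: {chr22_euchromatin_mb:.1f} Mb")
```

On these numbers, roughly two-fifths of chromosome 22 and one-third of the Drosophila genome fall in regions that could not be cloned or assembled at the time.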
Although the HGP goal is to have complete strings of sequence for each chromosome from tip to tip, obtaining this high level of resolution presents a great challenge.
Diversity Represented
All humans share the same basic set of genes and genomic regulatory regions that control the development and maintenance of biological structures and processes. Therefore, the human reference sequence will not, and does not need to, represent an exact match for any one person's genome.
Investigators are using DNA from donors representing widely diverse populations. For example, HGP researchers collected samples of blood (female) or sperm (male) from a large number of people; only a few samples were processed, with source names protected so neither donors nor scientists know whose genomes are being sequenced. The private company Celera Genomics collected samples from five individuals who identified themselves as Hispanic, Asian, Caucasian, or African-American.
In addition to generating the reference sequence, another important HGP goal is to identify many of the small DNA regions that vary among individuals and could underlie disease susceptibility and drug responsiveness. The most common variations are called SNPs (single nucleotide polymorphisms). The DNA resources used for these studies came from 24 anonymous donors of European, African, American (North, Central, South), and Asian ancestry.
Although the sequence information will come from the DNA of many persons, it will be applicable to everyone.
DOE's role in the HGP arose from the historic congressional mandate of its predecessor agencies (the Atomic Energy Commission and the Energy Research and Development Administration) to study the genetic and health effects of radiation and chemical by-products of energy production. From this work grew the recognition that the best way to learn about these effects was to study DNA directly.
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v11n1-2).
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome.