Archive Site Provided for Historical Purposes
The Human Genome Project (HGP) was completed in 2003. One of the primary research areas was DNA sequencing. This page details that research.
The HGP's emphasis was on obtaining a complete and highly accurate reference sequence (no more than 1 error in 10,000 bases) that is largely continuous across each human chromosome. Scientists believe that knowing this sequence is critically important for understanding human biology and for applications to other fields.
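To make the accuracy standard concrete, the following minimal sketch converts "1 error in 10,000 bases" into a percent accuracy and an expected error count for a region. The function names are illustrative only. The Phred-scaled quality value is a convention widely used in sequencing that is not mentioned in the text above; it is included here purely as an assumption-labeled aside.

```python
import math

def accuracy_percent(errors: int, bases: int) -> float:
    """Percent of bases expected to be correct at the given error rate."""
    return (1 - errors / bases) * 100

def expected_errors(region_length: int, errors: int, bases: int) -> float:
    """Expected number of erroneous bases in a region of the given length."""
    return region_length * errors / bases

def phred_quality(errors: int, bases: int) -> float:
    """Phred-scaled quality, Q = -10 * log10(error probability).

    Assumption: Phred scaling is not part of the source text; it is shown
    only because it is a common way to express the same error rate.
    """
    return -10 * math.log10(errors / bases)

# HGP standard: 1 error per 10,000 bases.
print(round(accuracy_percent(1, 10_000), 2))   # 99.99
print(expected_errors(1_000_000, 1, 10_000))   # 100.0
print(round(phred_quality(1, 10_000), 1))      # 40.0
```

At this standard, a finished stretch of one million bases would be expected to contain about 100 errors.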
A "working draft" of the human genome DNA sequence was completed ahead of schedule in June 2000 and published in February 2001. The working draft comprises shotgun sequence data from mapped clones, with gaps and ambiguities unresolved. The draft sequence provides a foundation for obtaining the high-quality finished sequence and is also a valuable tool for researchers hunting disease genes. See the February 2001 and April 2003 Science and Nature papers analyzing the sequence.
Initial Human DNA Sequence Goals
Another goal focused on identifying individual variations in the human genome. Although more than 99% of human DNA sequences are the same across the population, variations in DNA sequence can have a major impact on how humans respond to disease; to environmental insults such as bacteria, viruses, toxins, and chemicals; and to drugs and other therapies.
Methods have been developed to detect different types of variation, particularly the most common type, single-nucleotide polymorphisms (SNPs), which occur about once every 100 to 300 bases. SNP maps are helping scientists identify the multiple genes associated with complex diseases such as cancer, diabetes, vascular disease, and some forms of mental illness. These associations are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to disease risk.
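As a rough illustration of what "once every 100 to 300 bases" implies genome-wide, the sketch below divides a genome length by the SNP spacing. The genome length of about 3 billion bases is an assumption that does not appear in the text above, and the function name is hypothetical.

```python
# Assumption: approximate length of the human genome in bases.
# This figure is not stated in the text above.
GENOME_LENGTH_BASES = 3_000_000_000

def estimated_snp_count(genome_length: int, bases_per_snp: int) -> int:
    """Estimate the number of SNP sites given an average spacing between them."""
    return genome_length // bases_per_snp

low = estimated_snp_count(GENOME_LENGTH_BASES, 300)   # sparser spacing
high = estimated_snp_count(GENOME_LENGTH_BASES, 100)  # denser spacing
print(f"{low:,} to {high:,} SNPs")  # 10,000,000 to 30,000,000 SNPs
```

Under that assumption, the stated spacing corresponds to roughly 10 to 30 million SNP sites across the genome.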
Human Genome Sequence Variation Goals
Text adapted from F. Collins, A. Patrinos, et al., "New Goals for the U.S. Human Genome Project: 1998–2003," Science 282: 682–689 (1998). For a more detailed explanation of sequencing, see the U.S. DOE Primer on Molecular Genetics. See HGP Goals for more details on the project's goals and their revisions over time.
| Area | HGP Goal | Standard Achieved | Date Achieved |
|---|---|---|---|
| DNA Sequence | 95% of gene-containing part of human sequence finished to 99.99% accuracy | 99% of gene-containing part of human sequence finished to 99.99% accuracy | April 2003 |
| Capacity and Cost of Finished Sequence | Sequence 500 Mb/year at <$0.25 per finished base | Sequence >1,400 Mb/year at <$0.09 per finished base | November 2002 |
| Human Sequence Variation | 100,000 mapped human SNPs | 3.7 million mapped human SNPs | February 2003 |
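
The figures in the goals table can be compared directly. The sketch below computes how far each achieved standard exceeded (or, for cost, undercut) its goal. The dictionary keys and function name are illustrative, and because the table reports the achieved throughput and cost as bounds (">1,400 Mb/year", "<$0.09 per base"), the resulting ratios are bounds as well.

```python
# (goal, achieved) pairs copied from the goals table.
# The achieved throughput and cost are bounds in the table, so the
# ratios computed from them are bounds too.
GOALS = {
    "throughput_mb_per_year": (500, 1_400),
    "cost_usd_per_base": (0.25, 0.09),
    "mapped_snps": (100_000, 3_700_000),
}

def achieved_over_goal(goal: float, achieved: float) -> float:
    """Ratio of the achieved standard to the original goal."""
    return achieved / goal

for area, (goal, achieved) in GOALS.items():
    print(f"{area}: {achieved_over_goal(goal, achieved):.2f}x of goal")
```

Read this way, sequencing capacity reached at least 2.8 times the goal, cost per finished base fell to at most 36% of the target, and the number of mapped SNPs was 37 times the goal.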
Sequence data from both ends of mapped bacterial artificial chromosome (BAC) clones provide researchers with a series of markers spaced approximately every 3,000 to 4,000 bases across the genome. Researchers use these markers as "sequence tag connectors" (STCs) to identify the specific clones that need to be sequenced to extend sequenced regions further along the chromosomes, and for other uses in large-scale sequencing efforts.
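The idea of using end sequences to choose the next clone can be sketched with a toy example. This is not the HGP's actual pipeline: the `BacClone` class, the `pick_extending_clone` function, and the sequences are all hypothetical, and the sketch assumes exact string matches, a single strand, and extension only past the right edge of the sequenced region. A clone qualifies when exactly one of its end sequences is found in the sequenced region, and the clone anchored closest to the right edge is preferred because it requires the least re-sequencing.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BacClone:
    """Hypothetical record: a clone name plus the sequence read from each end."""
    name: str
    left_end: str
    right_end: str

def pick_extending_clone(contig: str, clones: list[BacClone]) -> Optional[BacClone]:
    """Pick the clone anchored in the contig with the least overlap.

    Toy assumptions: exact matches only, one strand, and clones extend
    to the right of the sequenced region.
    """
    best: Optional[BacClone] = None
    best_pos = -1
    for clone in clones:
        left_pos = contig.find(clone.left_end)
        right_pos = contig.find(clone.right_end)
        # Skip clones with both ends inside the contig (no extension)
        # and clones with neither end inside it (not anchored).
        if (left_pos >= 0) == (right_pos >= 0):
            continue
        anchor = max(left_pos, right_pos)  # position of the matching end
        if anchor > best_pos:
            best, best_pos = clone, anchor
    return best

contig = "ACGTTGCAAGGCTTACCGGATATCGGA"  # made-up sequenced region
clones = [
    BacClone("clone_A", "TTGCA", "GGGGGTTTTT"),     # anchored early in the contig
    BacClone("clone_B", "GATATC", "CCCCCAAAAA"),    # anchored near the right edge
    BacClone("clone_C", "ACGTT", "CTTACC"),         # both ends already sequenced
    BacClone("clone_D", "TTTTTTT", "CCCCCCC"),      # no end matches
]
choice = pick_extending_clone(contig, clones)
print(choice.name if choice else None)  # clone_B
```

A realistic implementation would use sequence alignment rather than exact matching and would account for clone orientation and insert size.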
The Human Genome Project (HGP) was an international 13-year effort that ran from 1990 to 2003. Its primary goals were to discover the complete set of human genes and make them accessible for further biological study, and to determine the complete sequence of DNA bases in the human genome. See Timeline for more HGP history.