
Sponsored by the U.S. Department of Energy Human Genome Program

Human Genome News Archive Edition

Human Genome News, April-June 1996; 7(6)

Santa Fe '96

Packaging the Genome

Community Resources for Mapping, Sequencing, Finding Genes

Collections of cloned DNA pieces (libraries) provide the essential starting material for genome researchers. Workshop speakers discussed improvements to the depth and quality of a virtual alphabet of clone libraries (BACs, PACs, cDNAs, HAECs, and TAR-YACs) and the usefulness of these libraries for mapping, sequencing, and functional analysis. The newer stable, large-insert vectors reduce chimerism and allow relatively easy DNA isolation and manipulation. Many groups are beginning to use these clones as sequencing substrates as well.

Progress toward creating a single, widely applicable, mapped resource was also reported (see "BAC-PAC Resource"); early applications attest to its usefulness for studying the human genome from sequence to global levels.

Highlights of these presentations follow.

BACs

Mel Simon (California Institute of Technology) suggested thinking of BACs as large cosmids, or small YACs without problems. Simon spoke about the advantages of this large-insert cloning system, the current human and mouse BAC libraries, and their usefulness as mapping and sequencing reagents. BACs are versatile, have a low incidence of chimerism, and are easy to manipulate, he said. Isolating the circular DNA and removing host DNA are simple.

The human-insert BAC library now stands at about 280,000 clones (average insert size, 140 kb), representing 10x coverage; 170,000 are currently available from Research Genetics (800/533-4363; ResGen products and services have since migrated to Invitrogen’s website: http://mp.invitrogen.com/). This summer, 15x coverage of the human genome is expected, with 20 to 25x coverage by the end of the year. Plans are to increase the depth of the library to 30x to enable construction of optimal contig maps that can be used to select minimally overlapping BAC sets for genomic sequencing.
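The relationship between clone count, insert size, and fold coverage can be sketched as follows. This is a minimal illustration, not the method used by the Caltech group: the 3,000-Mb haploid genome size, the Poisson assumption, and all function names are assumptions introduced here. The nominal figure treats every clone as carrying a unique, average-size insert, so it is an upper bound and can exceed the coverage a library's producers report.

```python
import math

# Assumption: haploid human genome of roughly 3,000 Mb, expressed in kb.
HUMAN_GENOME_KB = 3_000_000


def nominal_coverage(n_clones, avg_insert_kb, genome_kb=HUMAN_GENOME_KB):
    """Fold coverage if every clone held a unique, average-size insert."""
    return n_clones * avg_insert_kb / genome_kb


def prob_locus_represented(coverage):
    """Poisson estimate that a random locus lies in at least one clone."""
    return 1.0 - math.exp(-coverage)


def clones_needed(target_coverage, avg_insert_kb, genome_kb=HUMAN_GENOME_KB):
    """Smallest clone count reaching the target nominal coverage."""
    return math.ceil(target_coverage * genome_kb / avg_insert_kb)
```

Under these assumptions, clones_needed(30, 140) comes to roughly 643,000 clones, which gives a sense of the scale implied by a 30x library of 140-kb inserts.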

A mouse BAC library, made from the 129 mouse-strain embryonic stem cells used for producing transgenic mice, contains 230,000 clones (average insert size, 140 kb) and can be obtained from Research Genetics.

Mapping and Sequencing Applications.

BAC libraries can be screened to obtain a reliable set of clones corresponding to a specific marker and can be used for walking along a chromosome. On a larger scale, the libraries can be pooled on plates and probed to obtain a series of clones to generate physical maps. In collaboration with the Sanger Centre (U.K.), Simon's team has been converting chromosome 22 YAC maps into BAC-based maps and then using ESTs to assemble the BAC clones into contigs. More than 600 BAC clones (average insert size, 145 kb) have been selected and mapped using a variety of markers (e.g., cDNAs, ESTs, STSs, cosmids); 90% of the markers gave hits in the BAC library. The clones were assembled into 120 contigs that are being verified by fingerprinting; gaps are closed by screening deeper into the BAC library with markers and BAC end probes.
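The idea of assembling clones into contigs from marker hits can be sketched in a few lines. The example below is only an illustration under a strong simplifying assumption: any two clones hit by the same marker are taken to overlap, and clones are grouped with a union-find structure. The function name, the input layout, and the clone and marker names are hypothetical; a real pipeline would also order clones within each contig and confirm the overlaps by fingerprinting, as described above.

```python
from collections import defaultdict


def contigs_from_marker_hits(hits):
    """Group clones into candidate contigs.

    hits maps a marker name to the list of clone names it hit.
    Assumption: clones sharing a marker overlap on the chromosome.
    """
    parent = {}

    def find(x):
        # Return the representative clone of x's group (with path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for clones in hits.values():
        clones = list(clones)
        for clone in clones:
            find(clone)  # register clones hit by only one marker
        for other in clones[1:]:
            union(clones[0], other)

    groups = defaultdict(set)
    for clone in parent:
        groups[find(clone)].add(clone)
    # Sort for a stable, readable result.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical screening result: two markers link three clones.
example_hits = {
    "EST1": ["bacA", "bacB"],
    "EST2": ["bacB", "bacC"],
    "STS9": ["bacD"],
}
example_contigs = contigs_from_marker_hits(example_hits)
```

Markers that hit clones in two existing groups merge them into one contig, which mirrors how screening deeper with additional markers or BAC end probes closes gaps.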

ESTs from the radiation hybrid YAC framework maps can be used as landmarks for rapidly assembling BACs to generate genome-wide BAC contig maps. Simon's group is planning to construct such BAC-EST maps, initially using 30,000 mapped ESTs or cDNAs. The resulting maps will provide high-resolution gene maps and, more important, entry points for gene finding and large-scale genomic sequencing.

BACs can be used in a variety of ways for sequencing. Some groups have been successful in sequencing BAC ends (see "Early Successes"). These ends are used for making STSs or primers to screen the library further to extend or verify contigs or obtain the next BAC to sequence in a walk along a chromosome. An 8 to 10x BAC library could be used to develop a library array for sequencing the human genome distributively, keeping track of where in the array a BAC fits, and choosing a minimally overlapping group of BACs. Alternatively, BAC ends can be used to walk, and from these arrays a minimally overlapping set can be selected, verified, and sequenced (http://www.tree.caltech.edu/).
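Choosing a minimally overlapping set of clones from a mapped array is, in its simplest form, an interval-covering problem. The sketch below uses a standard greedy strategy: at each step, among the clones that start at or before the point already covered, take the one that reaches farthest. It assumes each clone already has start and end coordinates on a map; the function name, tuple layout, and coordinates are hypothetical, and a real selection would also require a minimum overlap between neighbors so that joins can be verified.

```python
def minimal_tiling_path(clones, start, end):
    """Greedily pick the fewest clones that cover the region [start, end].

    clones: list of (name, clone_start, clone_end) in map coordinates (kb).
    Returns clone names in map order; raises ValueError if there is a gap.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    covered = start
    i = 0
    while covered < end:
        best = None
        # Scan clones that begin within the covered region and keep
        # the one extending farthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at {covered} kb")
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical mapped clones (coordinates in kb).
example_clones = [
    ("bacA", 0, 150),
    ("bacB", 50, 200),
    ("bacC", 140, 290),
    ("bacD", 280, 420),
    ("bacE", 100, 250),
]
example_path = minimal_tiling_path(example_clones, 0, 400)
```

In this toy example the path skips bacB and bacE because bacC starts inside bacA and extends farther than either of them.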

PACs

Pieter de Jong (Roswell Park Cancer Institute) described his newer human PAC libraries based on the slightly modified bacteriophage P1 vector (pCYPAC2) he has been using for the last 2 years. The library consists of more than 440,000 individual clones arrayed in over 1200 384-well plates that have been prepared in 4 sections designated RPCI-1, 3, 4, and 5; total coverage is 16x. Stability of clones is high, and chimerism is low to nonexistent.

The RPCI-1 segment (3x coverage, 120,000 clones) has been distributed to more than 40 genome centers worldwide; library screening results in an average of 3 positive PACs per probe or marker. In situ hybridization of 250 PAC clones demonstrates little or no chimerism or instability.

Distribution of RPCI-3 (3x, 78,000 clones) is under way. RPCI-4 and 5 will be available upon request. High-density colony membranes are being distributed at cost, mainly to groups having a copy of the PAC library. Washington University and the Sanger Centre have the complete collection (16x redundancy). De Jong's group is now generating a similar PAC library from the 129 mouse strain.

Human cDNA Libraries

Bento Soares (Columbia University) described his group's latest approach to normalizing cDNA libraries, which has enabled them to

  • lower the frequency of prevalent mRNAs while increasing the representation of rarer transcripts,
  • preserve the representation of full-length clones, and
  • minimize the representation of internal priming events within mRNAs during synthesis of first-strand cDNA.

Soares also discussed using subtractive hybridization methods to remove all sequenced clones from the collections of clones to be sequenced. Soares shared the podium with Keith Elliston (Merck & Co.), who discussed results of sequence data analysis derived from Soares' normalized libraries (see "Merck Gene Update").

HAECs and TAR-YACs

Human cells may prove to be more practical hosts for cloning stretches of the human genome that are unclonable or unstable in the bacterial cells used for the libraries described above. Jean-Michel Vos (University of North Carolina) spoke about his group's second-generation vector for cloning DNA in human cells as human artificial episomal chromosomes (HAECs) with an insert range of 80 to 350 kb. This system may also be advantageous for maintaining large DNA regions as single fragments, both for studying human gene function and regulation and for studying other critical genomic regions that may span large areas.

Vladimir Larionov and Natalya Kouprina (visiting scientists from the Institute of Cytology, St. Petersburg, Russia) and Michael Resnick (National Institute of Environmental Health Sciences) reported a new approach to YAC cloning that exploits the high recombination tendency of DNA molecules during transformation into yeast cells. Transformation-associated recombination (TAR) events are used to clone out specific large regions of DNA, such as repeat sequences. With no chimerism and an average-size insert of 250 kb, TAR-YACs could prove useful for cloning gene families and specific genes from total-genome DNA as well as for filling gaps in chromosome maps. The group demonstrated simple purification of circular molecules and selective isolation of chromosomal, subchromosomal, and gene family DNA.



The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v7n6).

Human Genome Project 1990–2003

The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome.

Human Genome News

Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.