Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
Human Genome News, Jan.-Feb. 1995; 6(5): 12
The Fourth International Workshop on the Identification of Transcribed Sequences, sponsored by the Canadian Genome Technology and Analysis Program, DOE, and Amgen, Inc., was held October 16-18, 1994, in Montreal, Canada. Some 80 participants from 12 countries, including more than 50 speakers, gathered to discuss methods for gene isolation and identification.
For the first time in this workshop series, significant progress in actual transcriptional map construction was presented for several chromosomal regions, and transcriptional mapping in the mouse system made a strong showing. Although protocols are becoming standardized for gene-identification techniques such as cDNA selection and exon trapping, an alternative approach--large-scale genomic sequencing coupled with "software trapping"--is now considered an attractive option. Functional analysis poses a growing challenge as large numbers of transcribed sequences become available. Some highlights of workshop presentations and discussion groups follow.
cDNA selection has proved highly efficient and is clearly the most popular gene-identification technique. Although the technique itself is well standardized, new improvements and variations were presented. S. Patanjali (Yale University) has demonstrated that whole-yeast DNA from YAC-containing strains can be used effectively without prior YAC purification; enrichments of >10^4 can still be obtained without increases in ribosomal contaminants. As alternative genomic material, Michel Fontes [Institut National de la Sante et de la Recherche Medicale (INSERM)] used IRS-PCR material amplified with degenerate Alu primers from whole-yeast DNA.
An ongoing problem with cDNA selection and exon trapping is the isolation of full-length cDNAs corresponding to the short cDNA fragments typically obtained. Sherman Weissman (Yale University) showed that selected cDNAs can be used without prior cloning to screen full-length cDNA libraries directly. Bernard Korn (German Cancer Research Center), Cynthia Jackson (Rhode Island Hospital), and Greg Lennon [Lawrence Livermore National Laboratory (LLNL)] carried out cDNA selection from more-complex genomic sources--flow-sorted X chromosome, a chromosome 9 somatic cell hybrid, and flow-sorted chromosome 19, respectively. Selection specificity was decreased, but processing simplicity and rapidity may compensate for this.
To increase the size of trapped exons, Johan den Dunnen (Leiden University) proposed a cosmid-based exon-trapping vector. Providing large genomic clones should allow for processing of full-length or nearly full-length cDNAs rather than the one or two exons typically trapped. Greg Landes (Integrated Genetics) used the pSPL3-CAM vector to trap 17 P1 clones from the PKD1 region and obtained 4 to 20 exons per clone. Yun-Fai Chris Lau (University of California, San Francisco) used 3' exon trapping with pTAG4 and a pool of 4600 Y chromosome cosmids. Products were 40% artifacts with 60% of the unique clones mapping back to the Y chromosome.
Marcia Budarf [Children's Hospital of Philadelphia (CHOP)] sequenced >200 kb from the DGCR of 22q11.2 and used GenBank searches, GRAIL predictions, and RT-PCR to identify nine new genes. B. Rajendra Krishnan (Washington University School of Medicine) applied a sample sequencing technique to segments of the HLA-C region and also identified new genes using both computer and laboratory techniques.
Using predominantly cDNA-selection techniques, researchers demonstrated significant progress in constructing high-density transcriptional maps of several chromosomal regions. These included the MHC class I region of 6p (Wufang Fan, LLNL); A-T region of 11q22-q23 (Anat Bar-Shira, Tel-Aviv University); 7q21-q22 (Johanna Rommens, The Hospital for Sick Children); regions of 21q (Hongxia Xu, Yale University, and Katheleen Gardiner, Eleanor Roosevelt Institute); 22q11 (Weilong Gong, CHOP, and Howard Sirotkin, Albert Einstein College of Medicine); and Xq28 and the entire X chromosome (Korn). The percentage of putative new cDNAs that map back to the correct chromosomal region varies considerably, but all are >50% and the best is >90%. After sequencing, further analyses include fine mapping to YAC and cosmid contigs, determination of transcription orientation, Northern analysis, and cDNA library screening.
Projects to sequence, map, and further analyze random cDNAs are ongoing. In sequencing >600 clones from a human testes cDNA library, Michael Jones (Cambridge University) found that 3% were identical to known genes and 70% novel. Of 200 cDNAs examined, 15% mapped to multiple chromosomes, and 70% showed significant cross hybridization to rodent DNA. To characterize further the apparently widely expressed novel ESTs, Donna Maglott (American Type Culture Collection) used Northern analysis to show that 20% were brain specific. Donald Moir (Collaborative Research, Inc.) mapped 138 infant brain ESTs
Gridded arrays of cDNA libraries are becoming more widely used. Catherine Nguyen (INSERM and Centre National de la Recherche Scientifique) hybridized a grid of 80,000 mouse thymus cDNAs with labeled cDNA from various tissues. Signals can be related quantitatively to the tissue-specific levels of expression. Daniela Toniolo (Consiglio Nazionale delle Ricerche) used a similar approach with a mouse 10-day-embryo central nervous system library.
Greg Lennon (LLNL) discussed the multiuser analysis of the Bento Soares (Columbia University) gridded infant brain, liver, and spleen libraries under the auspices of the Integrated Molecular Analysis of Gene Expression (IMAGE) consortium. Filters of libraries are available from Lennon (email@example.com).
Because of the potential relevance to neurodegenerative disorders, Christian Neri (CEPH) screened a human fetal brain library with CAG and CCG repeats and is cataloguing those with >5 repeat units. Of the 114 analyzed, 17.5% have homologies in the EST databases.
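As a rough illustration of the ">5 repeat units" criterion mentioned above, the following Python sketch scans a sequence string for uninterrupted CAG or CCG tracts. It is not the screening method used by the workshop presenters, who screened a library experimentally; the function name `find_repeat_tracts`, its parameters, and the choice to scan only the given strand are assumptions made for this example.

```python
import re

def find_repeat_tracts(seq, units=("CAG", "CCG"), min_units=6):
    """Return (unit, start, n_units) for each uninterrupted tract.

    Hypothetical helper: min_units=6 encodes "more than 5 repeat units".
    Only the strand given is scanned; reverse-complement tracts
    (CTG, CGG) are not reported.
    """
    seq = seq.upper()
    tracts = []
    for unit in units:
        # Match min_units or more back-to-back copies of the unit.
        pattern = re.compile("(?:%s){%d,}" % (unit, min_units))
        for m in pattern.finditer(seq):
            n_units = len(m.group(0)) // len(unit)
            tracts.append((unit, m.start(), n_units))
    # Report tracts in order of position along the sequence.
    return sorted(tracts, key=lambda t: t[1])

if __name__ == "__main__":
    # Made-up example sequence, not data from the workshop.
    example = "AT" + "CAG" * 7 + "TT" + "CCG" * 3 + "G" + "CCG" * 6
    for unit, start, n in find_repeat_tracts(example):
        print(unit, start, n)
```

In this sketch a tract of exactly five units is ignored, and start positions are 0-based offsets into the input string.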
Susan Ackerman (Jackson Laboratory) constructed a cDNA library from differentiated and nondifferentiated NT2 cells and is using a subtractive library and differential display to identify cDNAs expressed specifically in NT2 neurons. So far, most differences appear related to expression level and not absolute specificity.
Toniolo had previously identified a number of genes and CpG islands in two regions of human Xqter. She presented data showing that the order and orientation of the same genes are preserved in the mouse. Transcriptional patterns in mouse development are being investigated.
Kevin Brady (Harvard Medical School) discussed the ease and rapidity of mapping expressed sequences by single-strand conformational polymorphism in recombinant inbred strains. Filters are available for such analyses at an estimated cost of $100.00 per locus.
Catherine Lambert (FUNDP School of Medicine) used cDNAs from 14-day embryos and embryonic brains to select candidate reeler genes among YAC clones from the mouse chromosome 5 centromeric region. Selected cDNAs and a random sequencing approach have thus far yielded no homologs in GenBank searches and no exons by GRAIL analysis.
Miriam Meisler (University of Michigan) is isolating genes in the distal mouse chromosome 15 region of A4, a transgenic line associated with a neuromuscular disorder. Monica Justice (Kansas State University) cloned a retroviral insertion site, Evi3, associated with mouse B-cell lymphomas. Reverse-transcription PCR and genomic sequencing were needed to analyze the complex gene organization in the region. In both cases, understanding the mouse genes involved is expected to facilitate the cloning and characterization of the homologous human genes, mapping to human chromosomes 12 and 18, respectively. This can be particularly important in such cases where tissue and timing of expression make human studies difficult.
In the only presentation to address functional analysis of new genes directly, Russ Finley (Massachusetts General Hospital) described use of a yeast-interaction mating system to identify and characterize interactions among cell-cycle regulatory proteins. Such systems will be increasingly important for defining gene functions.
As more transcribed sequences are isolated, the ability to interpret sequence information becomes increasingly important. Jean-Michel Claverie (NIH) discussed the expanding problem of analyzing novel genes with no homologies in the databases. Currently, 69% of ESTs are "unknown" cDNAs. A cross-species comparison strategy of these sequences identified 180 previously uncharacterized protein domains, many of which are likely to be involved in as-yet-uncharacterized basic cellular functions.
Richard Mural (Oak Ridge National Laboratory) discussed new enhancements to GRAIL that will aid in gene discovery through large-scale genomic sequencing coupled to "software trapping" and new tools for sequence annotation in the GRAIL system.
Martin Ringwald (Jackson Laboratory) discussed a project to develop a gene-expression information resource for mouse embryonic development. This would include both textual descriptions and 3-D images of gene-expression patterns during mouse development.
The gene-finding techniques of cDNA selection, exon trapping, and genomic sequence analysis are being widely and successfully applied with increasingly standardized protocols. Many groups find that focusing on one of these approaches quickly provides a vast resource of new genes. Comprehensive transcriptional mapping will probably employ combinations of all three strategies. Old problems of how to isolate complete cDNAs from exons and small cDNA fragments and identify pseudogenes remain to be solved efficiently. Clearly, future workshops will focus increasingly on two additional problems--interpretation of sequencing information of both genomic and cDNA clones and functional analysis of novel genes.
Richard Mural, Oak Ridge National Laboratory and
Katheleen Gardiner, Eleanor Roosevelt Institute
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v6n5).
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome.
Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.