Archive Site Provided for Historical Purposes
Sponsored by the U.S. Department of Energy Human Genome Program
1997 Santa Fe Highlights
Scientific Director Branscomb Offers "State of JGI" Message
With its multicultural Hispanic, Anglo, and Indian heritages, Santa Fe seemed an appropriate venue for discussing the challenges of forging a union from various independent cultures. Joint Genome Institute (JGI) Scientific Director Elbert Branscomb acknowledged the formidable challenges in joining DOE's three genome centers.
Two 1997 reviews of JGI prompted a major redesign and sharpened goals, Branscomb said, especially for the first year, and high-quality, production-level sequencing was defined as JGI's single priority for 1998. JGI's ambitious sequencing goal is to submit 20 Mb of unique, "Bermuda-quality" DNA sequence to GenBank by October 1 (see article, "Bermuda-Quality Sequence"). [Editor's Note: As of July 7, JGI had completed 11.9 Mb, for a projected throughput rate of 32 finished Mb per year.]
JGI is committed to immediate and full public data release, with data and quality assessments computed automatically and presented in a common internal Web interface. Monthly goals and results are available on the JGI Web site [http://www.jgi.doe.gov/Docs/JGI_Seq_Summary.html].
Branscomb highlighted the importance of an ongoing review of sequencing priorities in terms of amount, quality, and cost, noting the inevitability of some problematic trade-offs. Plans are under way for JGI to participate in sequence-evaluation programs with other major sequencing centers, and Branscomb pledged JGI to objective cost reporting and predicted cost-efficiencies comparable to those of the rest of the community. An 11-member external board [http://www.jgi.doe.gov/whoweare/sac.html] advises JGI on managerial, strategic, technical, and scientific matters. JGI also seeks input from "end users" of genomic information, governmental policymakers, and the academic community.
Sequencing Strategies and Goals
Because the new JGI Production Sequencing Facility (PSF) will not begin operations until later in 1998, JGI is using capabilities and strategies already in place at current facilities. "We need to go with what we know will work," Branscomb stated.
The three laboratories have significant experience in both directed and random (shotgun) sequencing strategies.
To reach the 20-Mb sequencing goal, 45% of the work will be done at each of Berkeley and Livermore and the remaining 10% at Los Alamos. The sequencing rate is projected to increase to around 35 Mb per year by the end of FY 1998.
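As a quick check of the split described above, the site-level targets implied by the 20-Mb goal can be worked out directly. The following is a minimal sketch, assuming the 45% figure applies to Berkeley and Livermore individually (so that the three shares total 100%); the function and variable names are illustrative and do not come from JGI.

```python
# Illustrative arithmetic only: divide the 20-Mb goal among the three sites
# using the percentages quoted in the text (45%, 45%, 10%).
GOAL_MB = 20.0

# Assumption: 45% applies to Berkeley and to Livermore separately.
SHARES = {"Berkeley": 0.45, "Livermore": 0.45, "Los Alamos": 0.10}


def site_targets(goal_mb, shares):
    """Return each site's portion of the overall goal, in Mb."""
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {site: round(goal_mb * frac, 2) for site, frac in shares.items()}


targets = site_targets(GOAL_MB, SHARES)
for site, mb in targets.items():
    print(f"{site}: {mb:.1f} Mb")
```

Under that assumption, the goal works out to 9 Mb each for Berkeley and Livermore and 2 Mb for Los Alamos.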
The sequencing goal for FY 1999 is 20 to 24 Mb of high-quality sequence and an additional 70 to 80 Mb of draft sequence. Branscomb observed that sequencing goals and cost economies will not be achieved unless, within 3 to 4 years, PSF is generating well above 100 Mb of Bermuda bases per year.
A Consensus Strategy for PSF
Branscomb outlined the basic plan, which focuses on combining a shotgun strategy with directed-sequencing approaches. PSF was designed to be flexible enough to accommodate changes in technology, and optimization will begin after the first year. The initial sequencing targets are chromosomes 19, 16, and 5, which have been mapped largely by the three member laboratories.
Activities in support of production sequencing at Los Alamos, Berkeley, and Livermore will include sequencing-technology development, large-insert clone production and mapping, sequence-annotation submission, and overall informatics support and technology development. In general, PSF will obtain sequence-ready clones from the three laboratories and return assembled sequence.
Branscomb emphasized the need for well-designed informatics that integrates (within a single functional entity) support for the work at the different sites, especially PSF. Four goals for informatics are to achieve as much uniformity as possible in practices and tools, provide seamless management of and access to critical data across all sites, maintain organizational and administrative unity and coherence, and define a uniform role for informatics and computational approaches in support of quality maintenance.
Beyond Production Sequencing
The ultimate value of high-throughput sequencing depends largely on what is learned about the revealed genes. Logical adjuncts to production sequencing, therefore, include generating full-length or nearly full-length cDNA sequences, obtaining various kinds of expression data in mouse and human, and obtaining mouse genomic sequence of all homologous regions conserved between mouse and human. On the last topic, Branscomb noted that "there are areas in the human genome where we already know the cost of sequencing will pay off richly, and in those same areas we'd like to know all the conserved elements in mouse, a critically important thing to find out; the question is how to do that affordably when we can't afford to sequence the mouse right now in parallel with human."
Branscomb pointed to another trade-off for high-quality and high-productivity sequencing goals: postponing originally planned efforts to "functionalize" the sequence data (that is, annotate it with additional, experimentally derived information to make it more useful to biologists). Investigators are exploring ways to do comparative sequencing in the mouse at a much lower level of redundancy and quality to locate and sequence mouse-human conserved regions accurately enough to learn their biological significance. In the first year, researchers will perform a small amount of mouse physical mapping to support future mouse-human comparative genomic sequencing if it proves affordable. DOE OBER also has funded a pilot project to functionalize fruit fly sequences. "The practical challenge," Branscomb said, "is to find out what scalable methods can be devised to annotate fly sequence data for about $1000 per gene or less. It's an interesting challenge."
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v9n3).