Sponsored by the U.S. Department of Energy Human Genome Program
Providing informatics support for achieving "dream" targets of 100 Mb a year of Bermuda-quality sequence is an evolving process, said Tom Slezak, director of the JGI informatics team. "It can't happen in a single leap. It will ride a learning curve similar to all the other scientific and technological ones going on in parallel," he said. But a lag time for informatics support on these processes is inevitable because support requirements are not yet clear.
In general terms, informatics for sequencing encompasses clone resources and validation, sequence production, sequence analysis and annotation, informatics integration, and systems administration. Slezak gave an overview of various short-, medium-, and long-term solutions being implemented to meet these challenges.
Focusing on JGI's charter and scope, investigators are working toward jointly designing, developing, and deploying processes and toward sharing hardware, software, expertise, and production goals. Prime areas of informatics concern include improving sequence quality-control tracking and reporting, increasing automation of finishing tasks, linking mapping data with sequencing, and starting some pilot informatics projects to support early functional genomics efforts. As to the last, Slezak cautioned, "We'd better get a head start on these, or we're going to end up with five different spreadsheets or other forms of information and a lot of duplication."
The JGI informatics team is using systems already in place at the three participating laboratories, sharing code, and standardizing where feasible. "The three sites will meet on the Web," Slezak said.
Medium-range goals (FY 1999-FY 2000) include adapting the most robust JGI shotgun system to scale up to 20 to 40 Mb per year. Major challenges are to manage process changes and automate sequence finishing. Slezak observed that large increases in automation present severe informatics challenges, such as moving from many small robots automating individual processes to an automated factory. He also noted the difficulty of predicting how much these efforts will speed up finishing, and he warned that the system will hit inherent limits.
Turning to long-term plans, Slezak observed, "We can't scale up by bolting things together." A complete overhaul of the entire process (the biology, automation, informatics, perhaps even management) will be needed, he said. One promising avenue for meeting long-term goals is participation by commercial and academic groups in developing some automation and laboratory information-management systems (see New Awards). Other long-term goals are to develop centralized databases, connect them with existing ones, and provide capabilities for Web-based navigation and display of all the data from any starting point.
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v9n3).