Search Magazine 
  
Article Index Next Article Previous Article Feedback to Editor ORNL Review Home Page
A computational analysis of human and bacterial genomes by ORNL researchers provides insights into what our genes do. ORNL researchers will soon be predicting 100 protein structures a day and evaluating which compounds could make highly effective therapeutic drugs.

Probing Cells by Computer

Supercomputers are being used at ORNL to increase our knowledge about the structure and function of genes and proteins in living cells.

Analyzing Genomes Computationally

In 2001 scientists using supercomputers suggested we should say goodbye to some common beliefs in biology. No longer was it considered true that the human genome has 100,000 genes, that each gene makes only one protein, and that humans and bacteria have entirely different genes in their cells.

These tenets were tossed out in response to findings of the International Human Genome Sequencing Consortium, including the Department of Energy’s Joint Genome Institute (JGI), to which ORNL contributes computational analysis. On February 15, 2001, the consortium published the paper “Initial Sequencing and Analysis of the Human Genome” in the journal Nature. The paper states that the human genome has “about 30,000 to 40,000 protein-coding genes, only about twice as many as in a worm or fly”; that each gene codes for an average of three proteins; and that hundreds of human genes may have been transferred from bacteria.

Ed Uberbacher, head of the Computational Biology Section in ORNL’s Life Sciences Division (LSD), was one of the hundreds of authors who contributed to this landmark paper. Using the IBM RS/6000 SP supercomputer (Eagle) at ORNL, he and his ORNL colleagues performed computational analysis and annotation of the human genome to uncover evidence of the existence of genes about which little or nothing was known—until this study. To perform their analysis, Uberbacher et al. used the latest version of the Gene Recognition and Analysis Internet Link (GRAIL), which was developed by Uberbacher and others in 1990 at ORNL and was rewritten as GrailEXP for parallel supercomputers. Use of GrailEXP helped provide evidence for alternative splicing—different ways of combining a gene’s protein-coding regions (exons) to produce variants of the complete protein. The evidence suggests that some genes when expressed produce up to 10 different protein products.
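The idea of alternative splicing can be illustrated with a minimal Python sketch. It is not GrailEXP's method, and the exon sequences and the `splice_variants` helper are hypothetical. It assumes the simplest case, in which a transcript keeps any subset of exons in their original order (exon skipping). In a real cell splicing is regulated, so only some of these combinations would actually occur.

```python
from itertools import combinations

def splice_variants(exons, min_exons=1):
    """Join every order-preserving subset of exons into one transcript.

    Assumption for illustration only: any exon may be skipped, but the
    exons that remain keep their original left-to-right order.
    """
    variants = []
    for k in range(min_exons, len(exons) + 1):
        # combinations() yields subsets in the order of the input list
        for kept in combinations(exons, k):
            variants.append("".join(kept))
    return variants

# Three made-up exons of one hypothetical gene
exons = ["ATGGCC", "GATTAC", "TTGTAA"]
variants = splice_variants(exons, min_exons=2)
for v in variants:
    print(v)
```

Even this toy gene yields four candidate transcripts from three exons, which suggests how a single gene could give rise to several protein products.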

A view of the Genome Channel home page on the World Wide Web.

Researchers in LSD’s Computational Biology Section have identified many genes in bacterial, mouse, and human genomes. For the JGI they have created and used assembly programs and analysis tools to produce draft sequences of the 300 million DNA base pairs in chromosomes 19, 16, and 5. They have analyzed 25 complete microbial genomes and many JGI draft microbial genomes.

The section’s researchers have written algorithms and developed other tools that make it easier for biologists to use computers to find genes and make sense of the rising flood of biological data. Through ORNL’s popular, user-friendly Genome Channel Web site (150,000 sessions per month) and its Genomic Integrated Supercomputing Toolkit (developed by ORNL’s Phil LoCascio and commonly called GIST), the international biology community, including pharmaceutical industry researchers, has readily obtained meaningful interpretations of its DNA sequences. With help from its supercomputers, ORNL is on the genome analysis map.

Computationally Predicting Protein Structures

In the summer of 2000, an LSD group led by Ying Xu participated in an international competition to predict the three-dimensional (3D) structures of 43 proteins using computational tools. Of the 123 groups competing in the fourth Critical Assessment of Techniques for Protein Structure Prediction competition, this group placed sixth, putting ORNL in roughly the top 5% and ahead of all other DOE national laboratories in the contest.

A protein structure, predicted at ORNL (top) and the actual structure, determined experimentally (bottom).

The actual structures of the 43 target proteins had been determined experimentally by using nuclear magnetic resonance spectroscopy and X-ray crystallography. The computational groups were provided with the identity and order of amino acids making up each protein and the length of the one-dimensional amino-acid sequence. Their predicted structures (obtained in a few weeks) were compared with the experimentally determined structures (obtained in about a year).
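The article does not say which measure was used to compare predicted and experimental structures. One common measure is the root-mean-square deviation (RMSD) between matching atoms, sketched below in Python. The sketch assumes the two structures have already been superimposed and list the same atoms in the same order; the `rmsd` function and the coordinates are hypothetical.

```python
import math

def rmsd(predicted, actual):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates, assumed to be superimposed already."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("structures must be non-empty and the same length")
    squared = 0.0
    for p, a in zip(predicted, actual):
        # squared distance between one predicted atom and its counterpart
        squared += sum((pc - ac) ** 2 for pc, ac in zip(p, a))
    return math.sqrt(squared / len(predicted))

# Made-up coordinates (in angstroms) for a two-atom fragment
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
actual = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
deviation = rmsd(predicted, actual)
print(round(deviation, 3))
```

A smaller value means the predicted atoms sit closer to their experimentally determined positions.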

Protein structure is the key to protein behavior. Because the function of a protein is related to its shape, it is essential to find or predict correctly the 3D structures of proteins that make us ill or keep us well. Using the details of a protein’s shape, a chemical compound can be custom designed to fit precisely in the protein, like a hand in a glove, blocking or enhancing the protein’s activity. In this way, a highly effective drug with no side effects could be created for an individual.

To speed up drug development, the goal is to predict computationally the structures of 100,000 proteins by aligning different amino-acid sequences along 1000 unique structural folds that are being determined experimentally.
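Aligning a sequence along known folds can be pictured with a toy Python sketch. It is not PROSPECT's algorithm, and the scoring rule, the fold templates, and all names are assumptions made for illustration. Each template marks its positions as buried ("B") or exposed ("E"). A sequence scores well on a fold when its hydrophobic residues land on buried positions and its polar residues on exposed ones. Gaps are not allowed here, although real threading programs permit them.

```python
# Assumed set of hydrophobic residues (one-letter amino-acid codes)
HYDROPHOBIC = set("AVILMFWC")

def position_score(residue, environment):
    """+1 if the residue suits the environment, otherwise -1."""
    buried = environment == "B"
    return 1 if (residue in HYDROPHOBIC) == buried else -1

def thread_score(sequence, template):
    """Best gapless score over every offset of sequence on template."""
    if len(sequence) > len(template):
        return None  # sequence does not fit on this fold
    best = None
    for offset in range(len(template) - len(sequence) + 1):
        score = sum(position_score(r, template[offset + i])
                    for i, r in enumerate(sequence))
        if best is None or score > best:
            best = score
    return best

def best_fold(sequence, templates):
    """Return the name of the best-scoring fold and all fold scores."""
    scores = {}
    for name, environments in templates.items():
        score = thread_score(sequence, environments)
        if score is not None:
            scores[name] = score
    return max(scores, key=scores.get), scores

# Two hypothetical fold templates and one made-up sequence
templates = {"fold_a": "BBEEBBEE", "fold_b": "EEEEBBBB"}
name, scores = best_fold("KDEKVILA", templates)
print(name, scores)
```

Here the sequence fits `fold_b` best because its four polar residues fall on exposed positions and its four hydrophobic residues on buried ones.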

ORNL researchers will soon be predicting 100 protein structures a day and evaluating which potential drug molecules dock well with specific proteins by running various automated tools on the Eagle supercomputer. One of those tools is PROSPECT, the Laboratory’s copyrighted protein-threading computer program that brought the group a high world ranking and an R&D 100 Award in 2001. It is giving ORNL good prospects in a field that could shape future health care.


Related Web sites

DOE's Joint Genome Institute
ORNL Life Sciences Division
ORNL Computational Biology Section
Genome Channel Website


Web site provided by Oak Ridge National Laboratory's Communications and External Relations
ORNL is a multi-program research and development facility managed by UT-Battelle for the US Department of Energy