Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
Vol.12, Nos.1-2 February 2002
PROSPECT for Protein Structure Predictions Wins 2001 R&D 100 Award
Explorations into the 3-D structures of proteins hold the key to understanding their biological functions and thus their roles in a living system. Proteins fold into complex shapes, creating active areas that enable them to interact with other proteins to accomplish a complex biological function in much the same way that gears in a watch mesh into a functioning machine. A broad collection of protein structural data will have an abundance of applications in the life sciences, biotechnology, and medicine. [This goal is the focus of an international structural genomics effort reported in (HGN).]
Revealing these structures, however, is not easily accomplished (see Predicting 3-D Protein Structure). Typically, a protein's 3-D structure is determined through such experimental methods as X-ray crystallography or nuclear magnetic resonance (NMR). The whole process, including protein expression and sample preparation, data collection, and structure-model construction, may take months or even years. This pace clearly cannot keep up with the rate at which protein-encoding genes are being identified worldwide. Nor can it satisfy increasing demands by drug companies hoping to use these data to custom-design drugs that fit precisely in the proteins like hands in gloves, blocking or enhancing their activities and minimizing side effects.
Predicting Structures with PROSPECT
Another of PROSPECT's unique capabilities allows users to enter any known structural data as constraints on the prediction. That structural information could be disulfide bonds between certain cysteines, geometrical relationships among residues identified as involved in the active site, or experimentally verified or predicted secondary structures, to name a few. This use of additional structural data as prediction constraints has greatly increased PROSPECT's accuracy.
By further extending the data-constrained prediction paradigm, ORNL researchers have developed a hybrid technique for protein-structure determination, using PROSPECT and large-scale experimental data from NMR or mass spectrometry (MS) in conjunction with chemical cross-linkers. The basic idea is to systematically obtain a large number of distances between amino acid residues and use them as constraints on threading and on detailed atomic structure modeling through energy minimization. The investigators have demonstrated that structural information from fold recognition by threading is complementary to that from NMR or MS. Effectively combining these multiple sources of information makes it possible to solve protein structures or structure complexes that cannot be identified by existing methods. This series of developments led R&D Magazine to designate PROSPECT as winner of a 2001 R&D 100 award, presented for the year's most significant technological innovations (see R&D Awards).
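The idea of using measured inter-residue distances as constraints can be illustrated with a minimal Python sketch. It is not PROSPECT's actual scoring scheme: the function names, the linear penalty, the candidate models, and every number below are assumptions chosen only for illustration. Each candidate model carries a threading energy, and any constrained residue pair that lies farther apart than its measured limit adds a penalty.

```python
import math

def constraint_penalty(coords, constraints, weight=1.0):
    """Sum how far each constrained residue pair exceeds its allowed distance.

    coords maps residue index -> (x, y, z); constraints is a list of
    (residue_i, residue_j, max_distance) tuples, e.g. from cross-linking data.
    """
    penalty = 0.0
    for i, j, max_dist in constraints:
        d = math.dist(coords[i], coords[j])
        penalty += weight * max(0.0, d - max_dist)
    return penalty

def rank_candidates(candidates, constraints):
    """Order models by threading energy plus constraint penalty (lower is better)."""
    scored = [
        (energy + constraint_penalty(coords, constraints), name)
        for name, energy, coords in candidates
    ]
    return sorted(scored)

# Hypothetical data: one cross-link says residues 3 and 17 lie within 8 units.
constraints = [(3, 17, 8.0)]
candidates = [
    ("model_1", -10.0, {3: (0.0, 0.0, 0.0), 17: (0.0, 0.0, 20.0)}),
    ("model_2", -9.0, {3: (0.0, 0.0, 0.0), 17: (0.0, 0.0, 6.0)}),
]
ranking = rank_candidates(candidates, constraints)
print(ranking)
```

In this toy case model_1 has the better raw energy but violates the distance constraint by 12 units, so model_2 ranks first once the penalty is added.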
The hybrid technique could have significant implications for structural genomics projects, where the goal is to solve protein structures on a genome scale through the development and application of new and improved technologies. NMR methods generally work well for small proteins, but their effectiveness drops quickly as protein weight increases beyond 30 kD. The problem is in assigning enough NMR spectral peaks for an accurate structure determination of a large protein. Typically, when this problem is solved, valuable information can be retrieved in identifying the correct structural folds and providing accurate backbone and even detailed side-chain conformation predictions, as the ORNL researchers have demonstrated. As it matures, this capability should allow at least a good approximation of a protein's actual structure in cases where existing NMR methods may not work well, due either to the protein's size or its structural stability under NMR experimental conditions.
PROSPECT is the second biological analysis system from ORNL to receive an R&D 100 award. The first was GRAIL, an online automated gene-finding tool that won in 1992. Detailed information about PROSPECT and related projects can be found at http://compbio.ornl.gov/structure/.
Predicting 3-D Protein Structure
Protein threading is a computational method for predicting a protein's backbone structure, or fold, by comparing its amino acid sequence with solved structures already in the international repository, the Protein Data Bank (PDB), and then assessing how well it fits from the potential energy point of view. Within hours to a couple of days of computing time, the method can predict a backbone structure by selecting the placement with the best assessment score. Existing threading techniques are thought to be capable of solving 60% to 70% of proteins identified through the genome projects.
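The threading idea described above can be sketched in a few lines of Python. This is a deliberately simplified toy, not the algorithm used by PROSPECT or any production tool: it slides the sequence along each template without gaps, uses a made-up two-state energy (buried "B" versus exposed "E" sites), and the template names and values are all illustrative assumptions. Real threading programs use gapped alignments and far richer energy functions.

```python
# Toy gapless threading; all templates and energy values are illustrative assumptions.
HYDROPHOBIC = set("AVLIMFWC")

def residue_energy(residue, environment):
    """Lower is better: hydrophobic residues favor buried ('B') sites, others exposed ('E')."""
    favorable = (residue in HYDROPHOBIC) == (environment == "B")
    return -1.0 if favorable else 1.0

def thread(sequence, template_env):
    """Slide the sequence along one template without gaps.

    Returns (best_energy, best_offset), or None if the template is too short.
    """
    best = None
    for offset in range(len(template_env) - len(sequence) + 1):
        energy = sum(
            residue_energy(res, template_env[offset + i])
            for i, res in enumerate(sequence)
        )
        if best is None or energy < best[0]:
            best = (energy, offset)
    return best

def predict_fold(sequence, templates):
    """Score the sequence against every template; return the lowest-energy placement."""
    results = []
    for name, env in templates.items():
        placement = thread(sequence, env)
        if placement is not None:
            energy, offset = placement
            results.append((energy, name, offset))
    return min(results) if results else None

# Hypothetical template library: each string gives the environment at each position.
templates = {
    "fold_A": "BBEEBEBE",
    "fold_B": "EEEEBBBB",
}
best = predict_fold("KDLV", templates)
print(best)  # (-4.0, 'fold_B', 2)
```

Here the two polar residues (K, D) land on exposed sites and the two hydrophobic residues (L, V) on buried sites when the sequence is placed at offset 2 of fold_B, which gives the lowest total energy.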
Although there could be millions of proteins in nature, the number of unique structural folds could be as few as 1000, as many structural biologists believe. Up to now, more than 12,000 protein structures have been determined experimentally and deposited into PDB. Among these proteins, about 700 have unique structural folds. If estimates are correct, about 70% of all structural folds have already been determined. Statistics from PDB submissions are consistent with this hypothesis; over 90% of protein structures solved in the past 3 years have similar structures in PDB. Scientists have found more efficient ways to calculate a protein structure by making use of this information.
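The 70% figure follows directly from the numbers quoted above, as this short Python check shows. The function name is an assumption made for illustration; the inputs are the article's own estimates (about 700 known folds, roughly 1000 total, and 12,000 taken as a lower bound on deposited structures).

```python
def fold_coverage(known_folds, estimated_total_folds):
    """Fraction of all estimated structural folds that are already known."""
    return known_folds / estimated_total_folds

coverage = fold_coverage(700, 1000)
structures_per_fold = 12_000 / 700  # lower bound, since "more than 12,000" were deposited

print(f"{coverage:.0%}")               # 70%
print(round(structures_per_fold, 1))   # 17.1
```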
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome. See Timeline for more HGP history.
Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.