Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
Human Genome News, January 1992; 3(5)
The international workshop "Open Problems in Computational Molecular Biology" was held at the Telluride Summer Research Center (Telluride, Colorado) on June 2-8, 1991. Sponsored by the DOE Human Genome Program, the meeting was organized by Andrzej Konopka (National Cancer Institute), Hugo Martinez (University of California, San Francisco), and Peter Salamon (San Diego State University), with Danielle Konings (University of Colorado at Boulder) as events coordinator.
The workshop brought together key researchers in computational biology, coding theory, and biomathematics from nine countries (Canada, China, France, Germany, the Netherlands, Israel, Scotland, the United States, and the former U.S.S.R.) to identify the kinds of phenomena and principles that constitute biological coding (not only the mRNA protein-translation code).
Sequence-analysis software, which is becoming progressively faster and more powerful, graphical, and user friendly, is now routinely used. New computational architectures are beginning to be implemented for searching databases and comparing sequences.
Many studies in computer-assisted sequence research have been based on the assumption that the genomic code can be compared to a text carrying many messages written in many languages. Although this linguistic analogy was originally meant to be just a metaphor, it has been taken quite literally. An arbitrarily defined pattern in a nucleotide sequence has often been given the rank of a word in an alleged language responsible for an alleged (but often unknown) function. Published sequence-analysis papers are full of references to signals, codes, languages, texts, information, and similar terms that do not refer to any well-defined concept or phenomenon. As a result, most scientific conclusions of the last 10 years were based on speculation and premature inferences from incomplete evidence.
Use of these arbitrary standards has created a real need for computational biologists to formulate carefully the very foundations of their field; the Telluride workshops are planned as a systematic forum for the exchange of pertinent ideas and results. The 1991 workshop, devoted entirely to the foundations of biolinguistics, dealt with the following general topics, on which participants reached conclusions:
Legitimacy of the linguistic metaphor as a research tool
Structural patterns and the physiological conditions in which they can be expressed as "signals"
Methods of assessing functional significance without knowing sequence function
Technical aspects of biomathematics
Coding theory, cryptology, and information theory
The workshop promoted formal and informal exchanges of ideas, and sessions were vigorous and lively. Many promising collaborations were initiated, including a nonorthodox application of Kullback entropy to computational molecular biology, a study of the evolution of recombination machinery as a pattern-recognition system, statistical modeling, and the inclusion of thermodynamic properties of sequence fragments in sequence-analysis tasks.
Reported by Andrzej Konopka and Peter Salamon
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v3n5).
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and to determine the complete sequence of DNA bases in the human genome.
Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.