Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
Human Genome News, January-March 1996; 7(5)
Faced with the challenge of storing and distributing the burgeoning mapping data of the Human Genome Project, the Genome Database made a serious assessment of its role as manager and curator. Some shortcomings recognized by the genome community included the need to graphically display maps based on the data, create a more representational data model, and increase GDB's ability to accumulate and curate mapping data within a reasonable time. Clearly, GDB's curatorial staff could not grow sufficiently to accommodate the last requirement, so focus shifted to developing a new curatorial model that allowed more community interaction. In addition, a major redesign of GDB was accomplished with new technologies to replace the underlying schema. The result was GDB 6.0.
The development of GDB 6.0 called for a new schema based on the Object Protocol Model (OPM) tools of Victor Markowitz's group at Lawrence Berkeley National Laboratory. OPM defines more explicitly the relationships between GDB object classes (such as genomic segments and genes) and their attributes, and between pairs of classes. Most important, the new schema is easier for GDB users to understand and query. Data in GDB 6.0 are now divided into a family of interrelated data sets consisting of the biological data (the mapping data component), the citation database (literature citations), and the registry (information on people and organizations). This separation of information can be viewed as a pilot effort toward federating genomic databases across the Internet. GDB 6.0's use of an extensible object broker furthers this effort by addressing the problem of incompatible architectures among genomic databases and by managing communication among frequently changing schemas, database technologies, and user interfaces.
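As a rough illustration of how three separate but interrelated data sets can refer to one another, the following minimal Python sketch keeps biological data, citations, and registry entries in separate stores linked only by identifiers. All class names, field names, and sample records here are hypothetical and are not taken from the GDB 6.0 schema or from OPM.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Citation:
    citation_id: str
    title: str


@dataclass
class Contact:
    contact_id: str
    name: str


@dataclass
class Gene:
    symbol: str
    # Cross-references are stored as identifiers, not embedded records,
    # so each data set can be maintained independently.
    citation_ids: List[str] = field(default_factory=list)
    contact_ids: List[str] = field(default_factory=list)


# Three separate stores standing in for the three data sets (made-up records).
citations: Dict[str, Citation] = {"cit1": Citation("cit1", "Example mapping paper")}
registry: Dict[str, Contact] = {"per1": Contact("per1", "Example Researcher")}
biological: Dict[str, Gene] = {"GENE1": Gene("GENE1", ["cit1"], ["per1"])}


def describe(symbol: str) -> dict:
    """Resolve a gene's cross-references against the other two stores."""
    gene = biological[symbol]
    return {
        "symbol": gene.symbol,
        "citations": [citations[c].title for c in gene.citation_ids],
        "contacts": [registry[c].name for c in gene.contact_ids],
    }
```

Calling describe("GENE1") gathers the gene's linked citation titles and contact names from the other two stores. Because records are joined only by identifier, any one store could in principle live in a different database.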
To improve the representation of genetic and physical maps and to support the visualization and querying of graphical maps, GDB staff developed a model that represents regions of the genome at multiple resolutions, produces maps based on experimental data, and provides for more sophisticated querying on position, order, and distance. "Mapview" works across platforms as an external Web viewer that is integrated with Netscape, and it lets users query further on objects contained in a map. Several enhancements were built into the user interface, such as the ability to browse all maps in the database and to download preextracted versions in compressed formats for later viewing. GDB 6.0 included linkage and cytogenetic maps in its initial release, and other maps are being added as quickly as possible.
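The three kinds of map query mentioned above (position, order, and distance) can be sketched in a few lines of Python. This is only an illustrative toy under simple assumptions: each map element has a single coordinate in the map's own units, and the class names, marker names, and positions are invented for the example, not drawn from GDB.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MapElement:
    name: str        # hypothetical marker or gene label
    position: float  # coordinate in the map's own units (e.g. cM)


class GenomeMap:
    def __init__(self, name: str, units: str, elements: List[MapElement]):
        self.name = name
        self.units = units
        # Keep elements sorted so that order queries are trivial.
        self.elements = sorted(elements, key=lambda e: e.position)

    def in_region(self, start: float, end: float) -> List[str]:
        """Position query: names of elements located within [start, end]."""
        return [e.name for e in self.elements if start <= e.position <= end]

    def order(self) -> List[str]:
        """Order query: element names sorted along the map."""
        return [e.name for e in self.elements]

    def distance(self, a: str, b: str) -> float:
        """Distance query: separation of two elements in map units."""
        pos = {e.name: e.position for e in self.elements}
        return abs(pos[a] - pos[b])


# Made-up example data.
linkage = GenomeMap("example linkage map", "cM", [
    MapElement("markerB", 12.5),
    MapElement("markerA", 3.0),
    MapElement("geneX", 20.0),
])
```

With this data, linkage.order() lists markerA, markerB, geneX, and linkage.in_region(0, 15) returns the two markers. A real mapping database would also have to handle uncertain or partially ordered positions, which this sketch ignores.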
The GDB 5.6 browser was replaced by another Web interface based on an application called Genera, developed by Stan Letovsky at GDB. With a modified Genera, query and edit forms are generated automatically from the GDB 6.0 schema, which allows users to move easily among query, update, and insert operations. Although GDB 6.0 was released as read-only in January, editing capabilities are anticipated shortly; they were delayed to ensure complete and accurate migration of the data.
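The general idea of deriving forms from a schema, rather than writing each form by hand, can be shown with a small Python sketch. It does not reproduce Genera or the GDB 6.0 schema: the SCHEMA dictionary, the build_form function, and the rule that insert forms require every field are all assumptions made for illustration.

```python
# Hypothetical schema: class name -> {attribute name: Python type}.
SCHEMA = {
    "Gene": {"symbol": str, "chromosome": str, "start": float},
}

MODES = ("query", "update", "insert")


def build_form(class_name: str, mode: str) -> dict:
    """Derive a form description for one class directly from the schema."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    fields = [
        {
            "name": attr,
            "type": attr_type.__name__,
            # Assumption for this sketch: only insert forms require every field.
            "required": mode == "insert",
        }
        for attr, attr_type in SCHEMA[class_name].items()
    ]
    return {"class": class_name, "mode": mode, "fields": fields}
```

Because every form is computed from the same schema description, adding an attribute to SCHEMA changes the query, update, and insert forms together.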
What Does GDB 6.0 Mean for the Genome Community?
GDB 6.0 represents a fundamental change in the way users navigate the data. Data representation has been improved by the graphical display of various mapping methodologies, the ability to make more complex queries, and timely data accessibility for the community as a whole. Genome Database views this undertaking as a step toward providing another tool for the genome community, one that will allow greater access not only to the data contained within GDB but to information stored throughout the world in related databases. GDB hopes Version 6.0 will promote increased community participation in support of the Human Genome Project.
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v7n5).
The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and determine the complete sequence of DNA bases in the human genome.
Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.