Human Genome Project Information Archive
1990–2003

Archive Site Provided for Historical Purposes


Sponsored by the U.S. Department of Energy Human Genome Program

Human Genome News Archive Edition

Human Genome News, November 1992; 4(4)

GDB Version 5.0 Highlights

Genome Data Base (GDB) Version 5.0 contains major new features enhancing database content, data relationships, and software for data retrieval and display. Highlights are summarized below, and a detailed description will be available online in the Release Notes under News.

Database Content and Organization

Information in GDB is organized into ten data managers: Map, Locus, Probe, Polymorphism, Mutation, Population, Library, Cell Line, Citation (previously named Source), and Contact. Five of them are newly added to increase the searchability and types of data that can be retrieved and displayed:

  • Polymorphism - Data previously available for display only are now directly searchable and have been reorganized to show all polymorphism information (location, detection method, alleles, frequencies, and population) on one screen.
  • Population - Population definitions, also previously available for display only, have been broken down into component elements such as race, geographical region, or ethnic group and reconstituted to facilitate creation of new population definitions. Mutations and polymorphisms specific to particular populations can now be retrieved.
  • Mutation - Mutation data will include location, sublocalization within the gene, and comparison of wildtype and mutant nucleotide and amino acid sequences.
  • Cell Line - Definitions can be retrieved independently from their breakpoints.
  • Library - DNA library descriptions will aid researchers in obtaining clones from the GDB contacts.

Significant amounts of new data have also been added to three existing managers:

  • Map - Distance information (including degree of overlap) between two map objects is now included. Confidence limits for localization data are also available.
  • Probe - Amplification conditions for PCR primer sets include buffer concentrations, time/temperature cycles, and the name of the thermal cycler used. These conditions are linked to citations, and multiple sets of amplification conditions can be linked to a single set of primers. Probe-to-probe interactions will make available data such as sequence-tagged site (STS) screening of yeast artificial chromosomes.
  • Citation - Citations are ranked (important, supportive, background) with respect to all linked entries. MEDLINE® citations include cross-references to other databases (including GenBank®), EC (Enzyme Commission) numbers, RN (CAS Registry) numbers, and the symbols or abbreviated forms of gene names as they appear in the published citation.
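
The Probe and Citation relationships described above can be pictured with a small data model. The following Python sketch is illustrative only: every class, field, and function name is hypothetical and is not taken from GDB's actual schema. It assumes that each set of amplification conditions carries one citation and that the rank belongs to the link between citation and entry.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical names throughout; this is not GDB's actual schema.
# The three rank labels are the ones named in the article.
RANK_ORDER = {"important": 0, "supportive": 1, "background": 2}


@dataclass
class Citation:
    reference: str


@dataclass
class AmplificationConditions:
    buffer_concentrations: str
    cycles: str            # time/temperature cycles, kept as free text here
    thermal_cycler: str
    citation: Citation
    citation_rank: str     # rank of the citation with respect to this entry

    def __post_init__(self):
        if self.citation_rank not in RANK_ORDER:
            raise ValueError(f"unknown rank: {self.citation_rank!r}")


@dataclass
class PrimerSet:
    probe_symbol: str
    # Several sets of conditions may be linked to one set of primers.
    conditions: List[AmplificationConditions] = field(default_factory=list)

    def ranked_citations(self) -> List[Citation]:
        """Return linked citations, most important first."""
        ordered = sorted(self.conditions,
                         key=lambda c: RANK_ORDER[c.citation_rank])
        return [c.citation for c in ordered]
```

Storing the rank on the link rather than on the citation itself would let the same citation be "important" for one entry and "background" for another, which is one reading of "ranked with respect to all linked entries".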

Flexibility has been increased for retrieving linkage relationships among entries in different managers:

  • Multiple entries can be selected before another manager is called to retrieve linked entries.
  • A list of all linked data for a selected entry is available from the View menu.

Retrieving and Displaying Data

A query can retrieve any number of entries. If more than 500 entries are retrieved, data are grouped in sets and the user can move between sets. To provide a variety of ways to search for data, a basic set of fields (Cytogenetic Location, Locus Symbol, Locus Name, Probe Symbol) is included on the retrieve screens for loci, probes, polymorphisms, mutations, and populations. The detail view screens in each of these managers also include these basic fields.
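
The grouping of large result lists can be sketched in a few lines of Python. This is a minimal illustration, not GDB's retrieval code: the function name is hypothetical, and the sketch assumes that each set holds at most 500 entries, which the article implies but does not state.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Assumption: the 500-entry threshold is also the size of each set.
SET_SIZE = 500


def group_into_sets(entries: Sequence[T], set_size: int = SET_SIZE) -> List[List[T]]:
    """Split retrieved entries into consecutive sets of at most set_size."""
    if set_size <= 0:
        raise ValueError("set_size must be positive")
    return [list(entries[i:i + set_size])
            for i in range(0, len(entries), set_size)]


# A query returning 1,201 entries would yield three sets to move between.
sets = group_into_sets(list(range(1201)))
print([len(s) for s in sets])  # prints [500, 500, 201]
```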

To make information available as quickly as possible, new entries that are complete in fields and citations but subject to modifications during review will be viewable by general users as PROPOSED DATA. However, unpublished material may be withheld from view for up to 6 months if requested by the submitter.
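
The withholding rule can be expressed as a simple date check. The Python sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, the hold is assumed to start on the submission date, and it is assumed to run for the full 6 months, although the article says only "up to 6 months".

```python
import calendar
from datetime import date

HOLD_MONTHS = 6  # maximum hold stated in the article


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_viewable(submitted: date, unpublished: bool,
                hold_requested: bool, today: date) -> bool:
    """Hypothetical check: may general users see this proposed entry?"""
    if unpublished and hold_requested:
        # Assumption: the hold is counted from the submission date.
        return today >= add_months(submitted, HOLD_MONTHS)
    return True
```

Under these assumptions, an unpublished entry submitted on 1 November 1992 with a hold requested would become viewable on 1 May 1993.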

Formats Available for Data Output

Output functions now include additional types and formats of GDB data that can be sent via e-mail. In addition to personal search results, standard reports (see GDB Standard Reports) are also available in ASCII, Tab-delimited, and PostScript formats.
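
For readers who post-process such output, a tab-delimited report is easy to produce or parse with standard tools. The Python sketch below shows one way to write records in that layout using the standard csv module. The function name is hypothetical, the column names are borrowed from the basic fields listed earlier, and the row values are placeholders rather than real GDB data. The actual column layout of GDB reports is not described in this article.

```python
import csv
import io
from typing import Dict, List


def to_tab_delimited(rows: List[Dict[str, str]], fieldnames: List[str]) -> str:
    """Serialize records as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder values only; these are not real GDB entries.
fields = ["Locus Symbol", "Locus Name", "Cytogenetic Location"]
rows = [{"Locus Symbol": "EXAMPLE1",
         "Locus Name": "placeholder locus",
         "Cytogenetic Location": "placeholder band"}]
print(to_tab_delimited(rows, fields), end="")
```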

Using GDB

An online help system covers individual screens, fields, and general topics. Help is now included in the Citation and Locus managers and will be added to the other managers on an ongoing basis.

Menu choices are selected by pressing Control-R followed by a unique menu letter; the letters are consistent across all screens and managers. A set of function keys is available for the most commonly used menu and submenu options.


HGMIS Staff

The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v4n4).

Human Genome Project 1990–2003

The Human Genome Project (HGP) was an international 13-year effort, 1990 to 2003. Primary goals were to discover the complete set of human genes and make them accessible for further biological study, and to determine the complete sequence of DNA bases in the human genome. See Timeline for more HGP history.

Human Genome News

Published from 1989 until 2002, this newsletter facilitated HGP communication, helped prevent duplication of research effort, and informed persons interested in genome research.