Human Genome News Archive Edition

Human Genome News, May 1994; 6(1)

Software, Services, Electronic Data Access

UT, Memphis, Offers Mouse Resources

Mouse Gene Loci Data Files. A group of servers has been set up at the University of Tennessee (UT), Memphis, from which data files on mouse loci, many from the 1992 and 1993 Chromosome Committee reports, can be downloaded (see addresses below). New data include map position information for Massachusetts Institute of Technology loci released in April. Chromosome-specific files can be downloaded in generic text format or as Excel or FileMaker Pro files.

The World Wide Web (WWW) server, accessible using Mosaic (from the National Center for Supercomputing Applications at the University of Illinois), is a repository of Map Manager files. These include data sets on the new Birkenmeier loci [Lucy Rowe and Ed Birkenmeier (Jackson Laboratory)]; the Shionogi loci from the National Center for Cardiovascular Research, Japan [Ken Manly and Verne Chapman, Roswell Park Memorial Institute (RPMI)]; Rosemary Elliott's (RPMI) new recombinant inbred data files in Map Manager format; and updates of the Portable Dictionary files.

All these servers require an Internet-connected computer. The WWW Mosaic program is the only known client for downloading Map Manager files. Map Manager, a Macintosh-only program written and supported by Manly, is accessible via ftp at Users who have Map Manager data sets to make available over the Internet should e-mail the Map Manager files to All collaborators should agree to making files public before data are sent.

Server Addresses

  • WWW/Mosaic:
  • Gopher: or
  • Ftp: in the directory /repository/genedict or in the directory /pub/genedict

Portable Dictionary of the Mouse Genome. The Portable Dictionary of the Mouse Genome, a compact database for use on personal computers, contains information on over 12,000 mouse loci and on homologs in several other mammalian species, including human, rat, cat, cow, and pig. Key features are its compact size (less than 10 MB), network independence, and ability to convert to formats suitable for a wide variety of common programs. The dictionary includes DNA sequence accession numbers for over 1200 genes. Loci can be re-sorted rapidly by chromosomal position, type, human homology, or gene effect. The accessible, easily manipulated set of data has many uses, from a quick review of loci and gene nomenclature to the design of experiments and analysis of results. Updated versions of the Portable Dictionary of the Mouse Genome can be downloaded from the addresses above. The dictionary is also available on the January NCBI Data Repository CD-ROM.

[Contact for comments, corrections, and additions: Robert Williams; Department of Anatomy and Neurobiology; University of Tennessee; 875 Monroe Ave.; Memphis, TN 38163 (901/448-7018, Fax: -7193, Internet:]

LBL Develops Automatic Submissions to Genome Databases

SubmitData is a newly developed software program that allows fast and easy submissions to a particular database by merging a list of data records with a predefined template. SubmitData, which was developed in Smalltalk-80, generates appropriate forms from individual database protocol definitions and checks data values for conformance to the protocol. This capability makes the program readily adaptable to new or changing definitions.

SubmitData combines three functions: (1) template creation and editing, (2) data merging, and (3) data submission. The template editor presents a number of forms showing required and optional fields for constructing a valid data submission. The editor allows the user to choose from a controlled vocabulary when appropriate, checks entered values for agreement with type and range specifications, inserts default values, revises dates if necessary, and generates error messages. Column and reference variables can also be defined in the template (a column variable is a placeholder for the value in a column of the data record, and a reference variable allows references to other template fields). A finished template can be saved, printed, and modified to fit future submissions.
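
The kinds of checks described above (controlled vocabulary, type and range agreement, default values, error messages) can be illustrated with a minimal Python sketch. It is not SubmitData's Smalltalk-80 implementation; the FieldSpec class, the check_field function, and the field names are hypothetical and do not come from the GenBank or AUTHORIN protocol definitions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class FieldSpec:
    """Hypothetical description of one template field."""
    name: str
    required: bool = False
    vocabulary: Optional[Sequence[str]] = None  # controlled vocabulary, if any
    minimum: Optional[int] = None               # inclusive bounds for integer fields
    maximum: Optional[int] = None
    default: Optional[str] = None


def check_field(spec: FieldSpec, value: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (value_to_use, error_message); error_message is None when the value is valid."""
    if value is None or value.strip() == "":
        if spec.default is not None:
            return spec.default, None  # insert the default value
        if spec.required:
            return None, f"{spec.name}: required field is empty"
        return None, None              # optional field left blank
    if spec.vocabulary is not None and value not in spec.vocabulary:
        return None, f"{spec.name}: {value!r} is not in the controlled vocabulary"
    if spec.minimum is not None or spec.maximum is not None:
        try:
            number = int(value)        # type check
        except ValueError:
            return None, f"{spec.name}: {value!r} is not an integer"
        if spec.minimum is not None and number < spec.minimum:
            return None, f"{spec.name}: {number} is below {spec.minimum}"
        if spec.maximum is not None and number > spec.maximum:
            return None, f"{spec.name}: {number} is above {spec.maximum}"
    return value, None
```

A template editor built along these lines would call check_field each time a value is entered and display any returned message next to the form field.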

Another editor is used to specify the merging operation, and each variable in the submission template is associated with a column in a tab-delimited input file. Merging can be tested for accuracy and completeness. The final step names the input data file and merges each data record with the template. A dialogue at the end confirms the automatic transfer to a particular database. Once a template and merging operation have been defined, new data can be submitted in a single step.
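
The merging step can be sketched in a few lines of Python, assuming a template whose column variables are written as $name placeholders and a mapping from each variable to a column of the tab-delimited input. The placeholder syntax, the merge_records function, and the sample field names are assumptions made for illustration; the article does not describe SubmitData's own template notation.

```python
import csv
import io
from string import Template


def merge_records(template_text, column_map, tab_delimited_text):
    """Merge each tab-delimited data record with the template.

    column_map maps a template variable name to a 0-based column index.
    Returns one filled-in submission text per data record.
    """
    template = Template(template_text)
    reader = csv.reader(io.StringIO(tab_delimited_text), delimiter="\t")
    submissions = []
    for row_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines in the input
        try:
            values = {name: row[index] for name, index in column_map.items()}
        except IndexError:
            raise ValueError(f"record {row_number} has too few columns") from None
        # substitute() raises KeyError if the template uses an unmapped variable,
        # which serves as a simple completeness test for the merge definition.
        submissions.append(template.substitute(values))
    return submissions
```

Once the template text and column map are fixed, each new input file needs only one call to merge_records, which mirrors the single-step submission described above.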

The first version of SubmitData constructs submissions for GenBank using the same protocol as AUTHORIN, which can be edited and used for automatic bulk submissions. Submission to Genome Data Base is under development and will be available soon. For more information, contact Manfred Zorn (Lawrence Berkeley Laboratory, 510/486-5041, Internet: or BITNET: mdzorn@lbl).

HGN Reprinting Encouraged

Numerous HGN articles are being reprinted in other publications, including the newsletters of various universities and disease-gene groups. HGMIS encourages readers to duplicate and reprint any part of HGN. Contacting HGMIS is not necessary, permission is not required, and no charge is made. When reprinting an article, please add a credit such as "Reprinted from the U.S. DOE-NIH newsletter Human Genome News. For a free subscription to HGN, contact Betty Mansfield at 615/576-6669 or" Send us a copy of the publication, if possible.

EMBL Data Library

The following three products of the European Molecular Biology Laboratory (EMBL) Data Library are freely available by ftp (, Gopher (gopher., and WWW (www.embl-heidelberg.de). [Contact: EMBL Data Library; Postfach 10.2209; 69012 Heidelberg, Germany (Fax: +49/6221-387-519, Internet:]

MacPattern Fast Pattern and Block Searching: A computer program that helps researchers find putative biological functions for new protein sequences through a combination of algorithms. Fast and user friendly, MacPattern supports protein pattern searches using the PROSITE database, protein block searches with the BLOCKS database, and identification of statistically significant protein segments. It allows batch processing of sequences and automatic translation of nucleotide sequence data. Already in use by various genome projects worldwide, MacPattern is particularly suited for genome analysis and cDNA sequencing projects.
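
As an illustration of what a PROSITE-style pattern search involves, the following Python sketch translates a pattern written in PROSITE syntax into a regular expression and scans a protein sequence with it. This is not MacPattern's algorithm, which the article does not describe; the function names are hypothetical, the test sequence is made up, and only the basic syntax elements are handled (x, [..], {..}, repeat counts, and the < and > anchors outside brackets).

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE-syntax pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        if low is not None:              # repeat count, e.g. x(2,4)
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_sites(pattern, sequence):
    """Return (0-based start, matched segment) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]
```

For example, the N-glycosylation site pattern N-{P}-[ST]-{P} becomes the regular expression N[^P][ST][^P], and find_sites reports where that expression matches in a given sequence.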

EMBL-Search: A database query-and-retrieval program for Macintosh systems. It enables easy construction of complex queries on EMBL, SWISS-PROT, PROSITE, EPD, and ENZYME databases as supplied on the EMBL CD-ROM, which also includes EMBL-Search. Full utilization of database cross-reference information allows easy movement between databases and exploration of related information. EMBL-Search can be particularly cost-effective because of its ability to run on a local computer network accessing a shared database CD-ROM [Internet:].

Mail Server Utility (MSU): Simplifies the use of electronic mail servers for sequence analysis by helping users produce properly formatted requests with a simple menu interface. Service descriptions are defined in external control files, which can be changed with a normal text editor without affecting the main program. MSU, which runs on UNIX and OpenVMS platforms, is a highly flexible tool that allows easy modification, extension, and customization to suit individual requirements.

[Rainer Fuchs, EMBL Data Library]

AAAS Publishes ELSI Reports

The following books are available from the American Association for the Advancement of Science (AAAS):

The Genetic Frontier: Ethics, Law, and Policy (catalog no. 93-27S), based on an invitational conference, consists of 15 original essays by experts in genetics, ethics, law, philosophy, and social science. Topics include privacy and confidentiality issues, genetic testing, property rights, family relationships, and social policies. [AAAS Press Books: P.O. Box 521; Dept. D3GT; Annapolis Junction, MD 20701 (800/222-7809, Fax: 301/206-9789).]

Ethical and Legal Issues in Pedigree Research reports on an invitational conference at which participants, including researchers studying five different genetic disorders, discussed such issues as informed consent, subject recruitment and withdrawal, privacy and the control of genetic information, children as research subjects, the role of researchers and provision of clinical care, and publication practices. [Contact for ordering: Kamla Butaney; AAAS Directorate for Science and Policy Programs; 1333 H Street NW; Washington, DC 20005 (202/326-6792, Fax: /289-4950).]

Whitehead/MIT Announces Release Six of Mouse Genetic Map

Release Six of the Whitehead Institute-Massachusetts Institute of Technology Genome Center Genetic Map of the Mouse is now available. The map consists of randomly chosen simple sequence length polymorphisms (microsatellites) that can be analyzed using the polymerase chain reaction, as described in W. Dietrich et al., Genetics 131, 423-47 (1992).

Release Six contains 3752 markers that fall into 20 linkage groups spanning about 1400 cM with an average spacing of less than 0.5 cM. The map can be accessed via the following:

  • Internet e-mail: For a copy of the most-current e-mail query forms, send a message to with help in either the subject line or body text. Instructions and a query form will be returned by e-mail. The filled-out form should be sent to genome_database, and the query answer will be mailed back automatically.
  • Anonymous ftp to in directory /distribution/mouse_sslp_release/apr94/ (log in as anonymous with user's e-mail address as password). The file README describes the file format and gives other information about the map.
  • WWW browser (client such as NCSA Mosaic is required). Point the client at http://www-genome.
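
The average spacing quoted for Release Six can be checked with a short calculation. The sketch below is only a rough consistency check: it assumes every marker occupies a distinct map position and that a linkage group with n markers has n - 1 intervals, neither of which is stated in the announcement.

```python
# Figures taken from the Release Six announcement.
markers = 3752
linkage_groups = 20
map_length_cm = 1400.0  # "about 1400 cM"

# Assumption: each linkage group with n markers contributes n - 1 intervals,
# so the whole map has (markers - linkage_groups) intervals.
intervals = markers - linkage_groups
average_spacing_cm = map_length_cm / intervals

print(f"average spacing: {average_spacing_cm:.3f} cM")
```

Under these assumptions the result is a little under 0.4 cM, consistent with the stated average of less than 0.5 cM.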

This project is ongoing, and new markers will be released at the beginning of each quarter. [Contact for questions and comments: Ert Dredge; Whitehead Institute Center for Genome Research; One Kendall Square; Bldg. 300, 5th Floor; Cambridge, MA 02139 (617/252-1922, Fax: -1902, Internet:]

Mouse Map-Drawing Resource

A mouse genetic linkage map file can be generated via e-mail using a new map-drawing resource from the Mouse Genome Informatics Project at Jackson Laboratory. After a text file is e-mailed to the service, a PostScript file is returned by e-mail. When sent to a PostScript printer, this file will print a mouse genetic linkage map displaying loci at their relative positions along a chromosome.

Access to the program is accomplished in three steps:

  • Address e-mail to services@beadle.informatics.
  • In the subject or first line of the message body, enter the request in the following form: map [options] filename. [Options] refers to optional variations, and filename is the name of a user-specified file or (for local users) a printer designation.
  • Attach a text document containing the data for executing the service.

The return e-mail message will be either a PostScript file for printing the linkage map or a report of errors detected in the map request. Maps can be customized in a number of ways, and several variations in the data file can be used to create maps for different purposes.
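
The three request steps can be sketched with Python's standard email package. The sketch only builds the message and does not send it. The service address is passed in as a parameter because it is incomplete in this text; the build_map_request function, the example.org addresses, the attachment name, and the sample data line are hypothetical, and the real option names and data-file format are not given in this article.

```python
from email.message import EmailMessage


def build_map_request(service_address, sender, filename, data_text, options=()):
    """Build (but do not send) a map-drawing request e-mail."""
    request_line = " ".join(["map", *options, filename])  # "map [options] filename"
    message = EmailMessage()
    message["To"] = service_address
    message["From"] = sender
    message["Subject"] = request_line
    # The request may go in the subject or the first line of the body;
    # this sketch puts it in both places.
    message.set_content(request_line + "\n")
    # Attach the text document holding the map data (file name is hypothetical).
    message.add_attachment(data_text, subtype="plain", filename="map_data.txt")
    return message


request = build_map_request(
    "services@example.org",   # placeholder, not the real service address
    "user@example.org",
    "chr1.ps",                # user-specified file name
    "D1Mit1\t5.0\n",          # made-up data line; the real format is not described here
)
```

The resulting message object could then be handed to any mail-sending routine, and the reply would carry either the PostScript file or the error report.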

A manuscript describing this service has been submitted to Mammalian Genome. For a copy of the paper, including example figures showing the file sent and the resulting map, contact Michelle Stanley (207/288-3371, ext 1421, Fax: -2516, Internet:

[Janan T. Epping and Michael Kosowsky, Jackson Laboratory]

Video Available

A set of eight 60-min. videotapes (#3342) is available for The Secret of Life, a series funded by NIH and DOE and shown last year on public television. The series explores how scientists' ability to decipher and manipulate genes will transform medicine and affect human lives. Tapes are also available separately. [Contact: Films for the Humanities & Sciences; P.O. Box 2053; Princeton, NJ 08543; (800/257-5126, Fax: 609/275-3767).]


HGMIS would like to be informed about informatics and educational resources freely available for use by genomics researchers and educators.

