9th Annual Workshop, October 28-31, 1999
Co-sponsored by the U.S. Department of Energy
Analysis of gene expression data generated by oligonucleotide fingerprinting
Christof Bull, John O'Brien, Uwe Radelof, Ralf Herwig, Steffen Hennig, Axel Nagel and Hans Lehrach
Abteilung Lehrach, Max-Planck-Institut fuer Molekulare Genetik, Berlin, Germany
Oligonucleotide fingerprinting (OFP) is a powerful method for genome-wide expression analysis and gene finding. It is based on the analysis of arrayed cDNA libraries by sequential hybridisation of 200 oligonucleotides, each 10 bp in length. Clones are grouped into clusters according to their hybridisation fingerprints. The number and the size of the clusters provide information about the spectrum of expressed genes and their relative expression levels, respectively, whereas the fingerprint itself is used for database matching of the cDNA clones. We can therefore identify the gene corresponding to a cDNA clone and obtain information about its expression level at the same time. We have performed OFP on cDNA libraries from human monocytes and dendritic cells with 100,000 clones each. The clones were grouped into 11,897 clusters plus 25,582 singletons (clusters with just one member). This would correspond to 37,479 different genes expressed in either of the cell types. However, for technical reasons we observed a 1.57-fold overestimation of expressed genes in previous experiments, so we estimate the real number to be around 24,000 expressed genes. Of the genes that are differentially expressed between monocytes and dendritic cells, we selected 260 genes of particular interest to us for further studies. We will re-array these and other selected clones from the libraries into a non-redundant set. This clone set will be further evaluated in expression studies using cDNA arrays and complex probes derived from haematopoietic cell types, including monocytes and dendritic cells at various differentiation stages. In addition, approximately 1,000 potentially new genes were found, which are currently being tag-sequenced at the MPI-MG. The massive fingerprinting and sequencing data that we have obtained are analysed by highly automated computer tools.
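The gene-count estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch using only the figures quoted in the abstract; the function name and its parameters are illustrative assumptions, not part of the authors' tools.

```python
# Figures quoted in the abstract.
CLUSTERS = 11_897        # clusters with two or more clones
SINGLETONS = 25_582      # clusters with exactly one clone
OVERESTIMATION = 1.57    # fold overestimation observed in previous experiments


def estimate_expressed_genes(clusters, singletons, overestimation):
    """Return (raw count, corrected count) of expressed genes.

    Hypothetical helper: the raw count treats every cluster and every
    singleton as one gene; the corrected count divides by the fold
    overestimation.
    """
    raw = clusters + singletons
    return raw, raw / overestimation


raw_count, corrected = estimate_expressed_genes(CLUSTERS, SINGLETONS, OVERESTIMATION)
print(raw_count)              # 37479
print(round(corrected, -3))   # 24000.0
```

The corrected value is about 23,900 before rounding, which agrees with the figure of around 24,000 given in the text.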
The sequence data are compared to the following databases: dbEST, GenEMBL, human UniGene, SWISSPROT and our cDNA sequence databases from sea urchin, amphioxus and zebrafish. Following the database searches (BLAST), a series of further analysis steps is performed, including filtering of BLAST output files, clustering of related matches and tabulating the results in web pages, which allow easy access to the analysis details. We will integrate all our data and expect that, in particular, the comparison of gene expression patterns of homologous genes in model organisms will be very useful for determining the function of new human genes.
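The post-BLAST steps described above (filtering, grouping related matches, tabulating) might look roughly like the sketch below. It is not the authors' pipeline: it assumes hits are available in the 12-column tab-separated BLAST tabular layout, and the sample rows, the clone and database identifiers, and the E-value cutoff are all hypothetical.

```python
from collections import defaultdict

# Hypothetical hits in 12-column BLAST tabular layout:
# query, subject, %identity, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, e-value, bit score
SAMPLE = """\
clone_001\tHs.1234\t98.5\t200\t3\t0\t1\t200\t50\t249\t1e-100\t370
clone_001\tHs.9999\t85.0\t120\t18\t0\t1\t120\t10\t129\t1e-20\t110
clone_002\tHs.5678\t75.0\t40\t10\t0\t1\t40\t5\t44\t0.5\t30
clone_003\tHs.1234\t99.0\t180\t2\t0\t1\t180\t60\t239\t1e-90\t340
"""


def parse_hits(text):
    """Parse tabular BLAST lines into dictionaries, skipping comments."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append({"query": f[0], "subject": f[1],
                     "evalue": float(f[10]), "bitscore": float(f[11])})
    return hits


def filter_hits(hits, max_evalue=1e-10):
    """Keep hits at or below the (assumed) E-value cutoff."""
    return [h for h in hits if h["evalue"] <= max_evalue]


def best_hit_per_query(hits):
    """For each query clone, keep the hit with the highest bit score."""
    best = {}
    for h in hits:
        current = best.get(h["query"])
        if current is None or h["bitscore"] > current["bitscore"]:
            best[h["query"]] = h
    return best


def group_by_subject(best):
    """Group query clones that share the same best database match."""
    groups = defaultdict(list)
    for query, h in best.items():
        groups[h["subject"]].append(query)
    return {subject: sorted(queries) for subject, queries in groups.items()}


hits = filter_hits(parse_hits(SAMPLE))
best = best_hit_per_query(hits)
groups = group_by_subject(best)

# Tabulate: database entry, number of clones, clone names.
for subject, clones in sorted(groups.items()):
    print(f"{subject}\t{len(clones)}\t{','.join(clones)}")
```

Grouping by best match is only one simple way to cluster related matches; the resulting table could be written out as HTML rather than printed.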