Computational Gene Annotation of the Genome of Drosophila melanogaster

Martin G. Reese
Director of Informatics Discovery
Tour Neptune
98086 Paris-La-Defense
telephone: 33 (0)1 47 67 66 00
fax: 33 (0)1 47 67 66 00
prestype: Platform
presenter: Martin G. Reese

The DNA sequence of the euchromatic portion of the Drosophila melanogaster genome has been determined to 98.2% completion. An initial analysis and a preliminary gene annotation and interpretation have been performed. The genome encodes approximately 13,600 genes. I will present the annotation process, focusing on the evaluation of genome annotation methods in the GASP (Genome Annotation Assessment Project) experiment. The results of this experiment are essential for fully understanding the quality of the final gene predictions.

At least 30% of the initially predicted proteome in Drosophila is based solely on gene-finding predictions. Many of the remaining genes were fully refined using the computer program Genie. An introduction to Genie, a gene-finding program based on a generalized hidden Markov model, will be given, and its strengths and weaknesses will be described. Results of a comparison of the 13,187 Genie predictions against the final, complete, hand-curated set of 13,601 annotated genes will be discussed.

