Beyond the Identification of Transcribed Sequences:
Functional, Evolutionary and Expression Analysis
12th International Workshop
October 25-28, 2002
Washington, DC

Diversity in Gene Expression: Assessment of Exon Skipping and Expression States

Winston Hide, Tzu-Ming Chern, Janet Kelso and Vladimir Babenko
SA National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
Telephone: +27 21 959 3645; Fax: +27 21 959 2512

Completion of the human genome sequence provides evidence for a gene count with a lower bound of 30 000 to 40 000. Significant protein complexity may derive in part from multiple transcript isoforms. Recent EST-based studies have revealed that alternate transcription, including alternative splicing, polyadenylation and transcription start sites, occurs within at least 30-40% of human genes. Transcript form surveys have yet to integrate the genomic context, expression, frequency, and contribution to protein diversity of isoform variation. We describe the degree to which protein coding diversity may be influenced by alternate expression of transcripts. 545 genes have been studied in this first intensive hand-curated assessment of exon skipping on chromosome 22. Combining manual assessment with software screening of exon boundaries provides a highly accurate and internally consistent indication of skipping frequency. 57 of 62 exon skipping events occur in the protein coding regions of 52 genes. A single gene (FBXO7) expresses an exon repetition. 59% of highly represented multi-exon genes are likely to express exon-skipped isoforms in ratios that vary from 1:1 to 1:>100. The proportion of all transcripts corresponding to multi-exon genes that exhibit an exon skip is estimated to be 5%. A comparison with orthologous mouse genes reveals that skipping events common to both species are not frequently detected, but that the frequency of skipping is similar between mouse and human. Comparative assessment of expression state and skip occurrence is discussed.

