TRANSCRIPTOME 2002: From Functional Genomics to Systems Biology
March 10-13, 2002
Seattle, Washington, USA


Progress to Construct the RIKEN Mouse Full-Length cDNA Encyclopedia

P. Carninci, K. Shibata, M. Itoh, T. Arakawa, Y. Ishii, H. Konno, K. Sato, N. Hayatsu, T. Shiraki, T. Hirozane, K. Aizawa, H. Bono, S. Kondo, K. Waki, J. Kawai, A. Yoshiki, Y. Okazaki and Y. Hayashizaki, Genome Science Laboratory, RIKEN, Saitama, JAPAN and Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), Yokohama, JAPAN

We have prepared long-insert, full-length, subtracted/normalized cDNA libraries from >250 tissues/stages using cap-trapper, thermoactivated RT and cloning in lambda-FLC vectors, which are amenable to transfer into functional vectors. We have produced >1,445,000 3' end ESTs and >252,000 5' end ESTs, and have fully sequenced and functionally annotated >21,000 cDNAs (FANTOM). Clustering gives >179,000 groups, which cover most genes. Remaining clusters are being fully sequenced. The discrepancy with current gene number predictions (only 35,000-40,000) derives from (1) sequencing errors, (2) clustering strategies and (3) polymorphisms of transcription initiation/termination sites, but also from (4) non-protein-coding RNAs and (5) genes not found by genome annotation. To select new, rare or very long cDNAs, we are implementing (1) subtraction of lambda cDNA libraries, (2) size selection (~7 kb), (3) stabilization of long cDNAs, (4) improved plasmid preparation and (5) special-interest libraries (preimplantation embryos, cancer, and tissues of immunological and neurobiological interest). To produce functional proteins, we use cytoplasmic RNA to avoid residual unspliced introns.

