High Throughput Subcloning of Complete Protein Coding Open Reading Frames (ORF)

Bernhard Korn
RZPD Resource Center for Genome Research
Im Neuenheimer Feld 506
69120 Heidelberg Germany
telephone: +49 (0)6221 42 4700
fax: +49 (0)6221 42 4704
email: b.korn@dkfz-heidelberg.de
Presentation type: Poster
Second presenter: Lars Ebert

L. Ebert, R. Schatten, A. Poustka and B. Korn

To progress from genomic sequence information through mRNA expression analysis to proteomics, it is of increasing interest to have the functional gene products at hand. Because only the protein determines the function of a gene that has translation potential, characterizing these final gene products allows the experimental verification and determination of protein (and therefore gene) function. Because cloning native open reading frames of proteins is labor-, time- and cost-intensive, we have initiated a project that aims to clone as many complete open reading frames of human genes as possible. We intend to provide the respective clones to all interested groups as part of the RZPD clone collection. To make the ORFs available for many different applications (e.g. in situ transcription (and translation), expression in eukaryotes and prokaryotes, expression as fusion proteins, …), we decided to clone the complete ORF (excluding the stop codon) into a vector system that allows base-specific shuttling of the insert into a wide range of other vector types by in vitro recombination between vectors, without restriction and ligation. As a first step, we have concentrated on human ORFs that are potentially present in EST clones from the RZPD and IMAGE clone collections. However, most of these clones have only been single-pass sequenced, so only partial sequence (the ESTs) is available for each clone.

To solve this problem, we developed a bioinformatic method to select EST clones that contain the full ORF even though they are not yet completely sequenced.

  1. We extract the relevant information, such as the sequence of the ORF region of a gene, from the complete human UniGene dataset.
  2. We use alignment algorithms to select clones whose EST sequences cover the start and stop regions of the ORF. The quality of each potential full-ORF clone is checked via several bioinformatic selection procedures.
  3. Primer pairs are designed that allow the amplification of the complete ORF while deleting the stop codon.
  4. PCR is performed and the products are checked on agarose gels. PCR product size is verified by comparing the experimentally generated PCR product with the expected ORF size.
  5. A secondary PCR is performed to add recombination sequences that allow the cloning of the ORF PCR product by homologous recombination.
  6. Recombination clones are verified by: a) PCR of the inserts and b) expression of a GFP fusion protein.
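
Steps 2 to 4 can be illustrated with a minimal Python sketch. It is not the pipeline used in this project. It assumes that EST alignments are already available as coordinates on a reference mRNA, that a clone qualifies when its ESTs together span both the start codon and the stop codon, and that primers are simply fixed-length stretches taken from the ORF ends. All names (EstHit, select_clones, design_primers) are hypothetical, and a real primer design would also consider melting temperature and specificity.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


@dataclass
class EstHit:
    """One EST alignment on the reference mRNA (hypothetical record type)."""
    clone_id: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def covers_orf_ends(hits, orf_start: int, orf_end: int) -> bool:
    """True if some EST covers the start codon and some EST covers the stop codon.

    orf_start is the index of the A in ATG; orf_end is one past the stop codon.
    """
    covers_start = any(h.start <= orf_start and h.end >= orf_start + 3 for h in hits)
    covers_stop = any(h.start <= orf_end - 3 and h.end >= orf_end for h in hits)
    return covers_start and covers_stop


def select_clones(hits, orf_start: int, orf_end: int):
    """Group EST hits by clone and keep clones whose ESTs span both ORF ends."""
    by_clone = {}
    for hit in hits:
        by_clone.setdefault(hit.clone_id, []).append(hit)
    return sorted(
        clone for clone, clone_hits in by_clone.items()
        if covers_orf_ends(clone_hits, orf_start, orf_end)
    )


def design_primers(orf: str, length: int = 20):
    """Return (forward, reverse, expected_size) for an ORF amplified without its stop codon.

    Assumption: fixed-length primers taken directly from the ORF ends.
    expected_size is the product length before any recombination sequences are added.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or not orf.startswith("ATG") or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be in frame, start with ATG and end with a stop codon")
    body = orf[:-3]  # delete the stop codon so C-terminal fusions remain possible
    if length > len(body):
        raise ValueError("primer length exceeds ORF length")
    forward = body[:length]
    reverse = reverse_complement(body[-length:])
    return forward, reverse, len(body)
```

The expected size returned by design_primers is the value a gel band from the primary PCR would be compared against in step 4; the secondary PCR in step 5 would lengthen the product by the added recombination sequences.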

So far, we have amplified more than 360 ORFs of known human genes, and we aim to increase this number steadily while streamlining the whole process through partial automation. We expect to distribute the first clones from this set by the beginning of 2001 via the RZPD distribution service.
