Analysis of Genomic G + C Content, Codon Usage, Initiator Codon Context and Translation Termination Sites in Tetrahymena thermophila

Document Type




Format of Original

9 p.

Publication Date



International Society of Protistologists

Source Publication

Journal of Eukaryotic Microbiology

Source ISSN



In recent years, the amount of molecular sequencing data from Tetrahymena thermophila has increased dramatically. We analyzed G + C content, codon usage, initiator codon context and stop codon sites in the extremely A + T-rich genome of this ciliate. Average G + C content was 38% for protein-coding regions, 21% for 5′ non-coding sequences, 19% for 3′ non-coding sequences, 15% for introns, 19% for micronuclear-limited sequences and 17% for macronuclear-retained sequences flanking micronuclear-specific regions. The 75 available T. thermophila protein-coding sequences favored codons ending in T and, where possible, avoided those with G in the third position. Highly expressed genes were relatively G + C-rich and exhibited an extremely biased pattern of codon usage, while developmentally regulated genes were more A + T-rich and showed less codon usage bias. Regions immediately preceding Tetrahymena translation initiator codons were generally A-rich. For the 60 stop codons examined, the frequency of G in the end + 1 site was much higher than expected, whereas C never occupied this position.
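The two basic quantities reported above, G + C content and the base found at the third codon position, can be illustrated with a minimal Python sketch. This is not the authors' method or software: the function names `gc_content` and `third_position_counts` and the toy sequence are hypothetical and chosen only for illustration. The sketch assumes an in-frame coding sequence given as a plain string of A, C, G and T.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases in seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no usable bases, so report zero rather than divide by zero
    return (counts["G"] + counts["C"]) / total


def third_position_counts(cds: str) -> Counter:
    """Count the base at the third position of every complete codon.

    Assumes cds starts in frame; trailing bases that do not fill a
    whole codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i + 2] for i in range(0, usable, 3))


if __name__ == "__main__":
    # Toy sequence invented for illustration; it is not real T. thermophila data.
    toy_cds = "ATGAAATTTGCTTGA"
    print(f"G + C fraction: {gc_content(toy_cds):.3f}")
    print("Third-position bases:", dict(third_position_counts(toy_cds)))
```

Applied to each sequence class separately (coding regions, introns, flanking regions), `gc_content` would give per-class averages of the kind listed in the abstract, while `third_position_counts` summarizes the preference for particular bases in the third codon position.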


Journal of Eukaryotic Microbiology, Vol. 46, No. 3 (May 1999): 239-247.