Sequence characteristics within nuclear genes from Sordaria macrospora

Abstract This paper reports sequence features within nuclear genes from Sordaria macrospora. Eight nuclear gene sequences were analyzed for codon usage, GC content, intron regulatory sequences and translation initiation sites.


Stefanie Pöggeler, Ruhr-Universität, 44780 Bochum, Germany
The homothallic ascomycete Sordaria macrospora is an excellent model system to study not only meiotic pairing and recombination (Zickler 1977 Chromosoma 61:29-316) but also fruiting body development (Esser and Straub 1958 Z. Vererbungslehre 89:729-746). Recently, these studies have been extended to a molecular level (Walz and Kück 1995 Curr. Genet. 29:88-95), and knowledge about sequence features would be a helpful tool in sequence analysis. Until now, sequence information from S. macrospora was only available from a single nuclear gene (LeChevanton and Leblon 1989 Gene 77:39-49). Here we compile sequence data from eight recently sequenced genes to determine common features of nuclear genes from S. macrospora. We provide a consensus sequence for the translation initiation site (Table 1), a codon usage table (Table 2), and consensus sequences for intron regulatory sequences (Table 3). Comparison of the data presented here with sequence features from the well-studied ascomycete Neurospora crassa (data taken from Brucherez et al. 1993 Fungal Genet. Newsl. 40:85-95; and Edelman and Staben 1994 Exp. Mycol. 18:70-81) shows that S. macrospora sequence characteristics are very similar to those determined for N. crassa genes.
*The subscript number indicates the percentage occurrence of the particular nucleotide.
The S. macrospora consensus for initiation of translation shows a high degree of identity to the N. crassa translation initiation consensus sequence and, as in N. crassa, a prevalence of GC following the ATG, which means that an alanine (GCN) is found at the amino terminus of most proteins studied so far. The GC content in a coding region of 7491 nucleotides is 56.7%. For comparison, in N. crassa the GC content is 58.6% in the coding region (GC content in total DNA 54.1%). In cases where amino acids are represented by more than one codon, S. macrospora, like many other organisms, does not use synonymous codons equally (Table 2).
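The two quantities discussed above, GC content and codon usage, can be computed from any in-frame coding sequence. The following is a minimal sketch, not the procedure used for this report; the function names `gc_content` and `codon_counts` and the sequence `toy_cds` are illustrative assumptions, and the toy sequence is invented rather than taken from S. macrospora.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def codon_counts(cds: str) -> Counter:
    """Count the codons of an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


# Invented toy sequence (hypothetical, NOT S. macrospora data):
# ATG GCC CTC GCT GGC TAA
toy_cds = "ATGGCCCTCGCTGGCTAA"

if __name__ == "__main__":
    print(f"GC content: {gc_content(toy_cds):.1f}%")
    for codon, n in sorted(codon_counts(toy_cds).items()):
        print(codon, n)
```

Summing such counts over all analyzed genes and grouping the codons by amino acid would give a table of the kind referred to as Table 2.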
In S. macrospora, as in N. crassa, codons with a C in the third position are preferred, and in four-codon families the codon ending in T is usually preferred to those ending in A or G. The stop codon TAA is used more frequently than TAG or TGA. The six least used codons in S. macrospora are ATA (Ile), TTA (Leu), CTA (Leu), TGT (Cys), GTA (Val), and AGT (Ser). All six of these codons are low-usage codons in N. crassa as well. As reported by Zhang et al. (1991 Gene 105:61-67), low-usage codons are clearly avoided in abundant proteins in many organisms and may therefore affect translation rates.
In S. macrospora genes the intron length lies between 47 bp and 256 bp; the average length is 88 bp and the median length is 60 bp. Intron length in N. crassa ranges from 46 to 856 bp with a tendency toward 60 to 70 bp. Among the eight genes analyzed so far, two genes, ura3 and ura5, do not contain introns. In S. macrospora introns the distance from the C of the splice branch site to the G of the 3' splice site is between 12 nt and 22 nt. This distance varies in N. crassa from 14 to 30 nucleotides. The S. macrospora intron signals (5' donor site, intron branch site and 3' intron acceptor site) are very similar to the N. crassa intron consensus sequences.
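The gap between the average (88 bp) and median (60 bp) intron length is what one would expect when a few long introns sit among many short ones. The sketch below illustrates this with a hypothetical list of lengths; only the minimum (47) and maximum (256) are taken from the text, all other values are invented for illustration and are not the measured S. macrospora intron lengths. The function name `summarize_lengths` is likewise an assumption.

```python
import statistics


def summarize_lengths(lengths):
    """Return min, max, mean and median of a list of intron lengths (bp)."""
    if not lengths:
        raise ValueError("no intron lengths given")
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
    }


# Hypothetical intron lengths in bp. Only 47 and 256 come from the text;
# the remaining values are invented to show how one long intron
# pulls the mean above the median.
toy_lengths = [47, 55, 58, 60, 62, 70, 256]

if __name__ == "__main__":
    for key, value in summarize_lengths(toy_lengths).items():
        print(f"{key}: {value:.1f}")
```

With a skewed distribution like this, the median is the better indicator of a typical intron, which is consistent with the tendency toward 60 to 70 bp reported for N. crassa.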

Table 1. Translation initiation context

Table 3. Intron regulatory sequences and intron length