Phylogenetic analysis of additional Neurospora crassa isolates

The ascomycete Neurospora crassa is a classical model organism in biology. So far, a phylogenetic analysis based on genomic sequences of four non-functional nuclear loci has been reported for 44 natural isolates of N. crassa. Three subgroups (clades) with distinct geographical distributions have been identified: clade A (Caribbean Basin and Ivory Coast), clade B (Europe, Ivory Coast and India), and clade C (India). Here, we report the results of a phylogenetic analysis of 16 additional isolates: six from the Caribbean Basin, eight from Europe, one from Pakistan and one from Thailand. The previously described clades and their geographical distribution were generally confirmed. All Caribbean isolates belonged to clade A and all European isolates belonged to clade B, with the exception of one isolate from Italy, which belonged to clade A, suggesting transport from the Caribbean Basin or the Ivory Coast to Europe. Interestingly, the isolates from Pakistan and Thailand formed a separate group, basal to all other clades. Their phylogenetic classification is not yet clear: they might belong to N. crassa or to N. perkinsii, or they might represent as yet undescribed phylogenetic groups or species of Neurospora, or hybrids.


Introduction
The filamentous ascomycete genus Neurospora comprises about 35 species (Nygren et al., 2011). One of these species, Neurospora crassa, gained a special status as a model organism for genetic, cytogenetic, biochemical, molecular and population biology studies (Perkins, 1992) because of its favorable characteristics, such as a haploid life cycle, fast growth and reproduction, and rather large spores, and because of its sequenced genome (Galagan et al., 2003). Furthermore, it became a valuable model for studying the molecular mechanisms of the circadian clock (Merrow et al., 2001). Most of the knowledge about the clock came from studying mutant N. crassa strains (Bell-Pedersen, 2000). However, the response characteristics of the clock under natural conditions have so far received little attention (Michael et al., 2007). Therefore, we became interested in possible differences in the circadian system according to the geographical distribution of wild N. crassa isolates. Since the phylogenetic relationship of several isolates was not known but appeared important for the interpretation of our physiological results, the aim of this study is to clarify this matter.
To date, the phylogenetic relationship of 188 isolates from eight phylogenetic species has been studied (Villalta et al., 2009) using the phylogenetic species recognition (PSR) method, which is based on the genealogical concordance of DNA sequences of four polymorphic loci (Dettman et al., 2003a). Within N. crassa, the isolates (N = 44) were shown to belong to three subgroups, named clades A, B and C, which are not recognized as phylogenetic species because their evolutionary lineages were not identified as independent (Dettman et al., 2003a). Clade A is found predominantly in the Caribbean Basin and the Ivory Coast, clade B in Europe, western North America, southern India and the Ivory Coast, and clade C exclusively in Tamil Nadu, India. Since clade C is nearest to the root of the phylogenetic tree, Turner et al. (2010) hypothesized that N. crassa migrated from India to Africa and to the Caribbean Basin.
In this paper, we assess how 16 additional isolates, which were classified as N. crassa by the Fungal Genetics Stock Center (FGSC), fit into the existing phylogenetic framework. Most of them were from Europe, but to represent a wider geographical range we also chose isolates from the Caribbean Basin, Thailand, and Pakistan. We found that our European isolates belong to clade B, with the exception of one Italian isolate that belongs to clade A. Our Caribbean isolates belong to clade A, while the isolates from Thailand and Pakistan could not be placed in any of the known clades.

Materials and Methods
Isolates
A total of 16 wild Neurospora crassa isolates from 11 different countries were investigated (Table 1). Cultures were grown on Vogel's medium (Vogel, 1956) with 2% agar at room temperature (slants plugged with cotton stoppers). They were inoculated from the original stocks on the day of arrival, and then allowed to grow on the bench for seven days before being used or wrapped with Parafilm (Pechiney Plastic Packaging, Menasha, WI) for long-term storage. For storage, all isolates were preserved at -20°C. To avoid the accumulation of background mutations, we refrained from making further subcultures of these stocks.

DNA preparation
Isolates were grown in 100 ml Erlenmeyer flasks with medium containing 20 ml 50X Vogel's salts (Vogel, 1956), 5 g arginine, 100 µl biotin, 20 g glucose, and 500 ml H2O, for 2-3 days at room temperature in an orbital shaker under constant white light (4 µE m⁻² sec⁻¹). Mycelial tissue was dried between paper tissues, submerged in liquid nitrogen and ground to powder. Dry tissue was incubated at 65°C for 1 h in 600 µl of lysis buffer with final concentrations of 100 mM Tris-HCl, 50 mM EDTA, 1% SDS, and 20 mg/ml Proteinase K (BioLabs, Frankfurt, Germany). Ammonium acetate (7.5 M) was added and samples were centrifuged at 13 800 rpm at 4°C for 3 min. The supernatant was incubated with RNase A (10 mg/ml, Roche Diagnostics, Mannheim, Germany) for 1 h at 37°C. After a wash with chloroform-isoamyl alcohol (24:1), samples were centrifuged (13 800 rpm for 8 min) to remove cellular debris. The aqueous phase was collected and genomic DNA was precipitated with isopropanol by centrifuging at 13 800 rpm for 30 min at 4°C. The pellet containing genomic DNA was washed with 70% ethanol, dried and dissolved in water.
Amplification of DNA with PCR
All primer sequences (TMI, DMG, TML, and QMA) were the same as in Dettman et al. (2003a). The following PCR reaction components were used: 10 mM dNTPs (Qiagen, Hilden, Germany), 5 pmol/µl of each primer (Metabion, Planegg/Martinsried, Germany), 10X Qiagen PCR Buffer, 5X Qiagen Q-Solution, 25 mM MgCl2, and 5 U/ml Taq DNA Polymerase (Qiagen, Hilden, Germany). The thermal cycler protocol for all markers was as follows: initial denaturation at 94°C for 2 min; 35 cycles of denaturation at 94°C for 1 min, annealing at the marker-specific temperature for 30 sec (see Dettman et al., 2003a) and extension at 72°C for 1 min; final extension at 72°C for 8 min; hold at 4°C. Finally, amplification products were purified from the gel using the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol and then used for sequencing.
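As a quick plausibility check on the cycling program described above, the following minimal Python sketch adds up the nominal hold times. The step durations and cycle count are taken from the text; the `Step` class and the function name are illustrative assumptions, and ramping time between temperatures as well as the final 4°C hold are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: Optional[float]  # None = marker-specific (see Dettman et al., 2003a)
    seconds: int


# Durations and temperatures as stated in the text.
INITIAL = Step("initial denaturation", 94, 120)
CYCLE = (
    Step("denaturation", 94, 60),
    Step("annealing", None, 30),
    Step("extension", 72, 60),
)
N_CYCLES = 35
FINAL = Step("final extension", 72, 480)


def nominal_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times in minutes (ramping not included)."""
    per_cycle = sum(step.seconds for step in cycle)
    return (initial.seconds + n_cycles * per_cycle + final.seconds) / 60


print(nominal_minutes(INITIAL, CYCLE, N_CYCLES, FINAL))  # 97.5
```

Under these assumptions the program holds for about 97.5 min; the real run time on a given thermal cycler will be longer because of temperature ramping.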

Sequencing setup
Sequencing reactions were performed using Big Dye Terminator BDu3 chemistry (Applied Biosystems, Darmstadt, Germany) and the following components: 1.3 µl of 1 µM sequencing primer (Metabion, Planegg/Martinsried, Germany), 2 µl of 5X Sequencing Buffer (Applied Biosystems, Darmstadt, Germany), 1 µl of Big Dye (Applied Biosystems, Darmstadt, Germany), 2.7 µl of the amplified product, and 3 µl of water. The cycling program was as follows: initial denaturation at 96°C for 1 min; 25 cycles of denaturation at 96°C for 10 sec, annealing at 50°C for 5 sec and extension at 60°C for 4 min; hold at 4°C. Sequencing products were cleaned of dye terminator nucleotides, primers, excess salts, and other contaminants with Sephadex G-50 (Sigma-Aldrich, Steinheim, Germany). The plate was filled with Sephadex and 300 µl H2O and left for 2 h at room temperature. The excess water was removed by centrifuging at 6000 rpm for 4 min. The columns were washed with water and the sequencing reactions were added. After centrifuging the plate at 6000 rpm for 4 min, the cleaned sequencing reactions were collected. Formamide was added to the reactions, which were then kept at 95°C for 5 min. Sequencing reactions were run on a DNA Sequencer 3100 (Applied Biosystems, Foster City, CA, USA). Finally, the sequence data were examined and edited visually using Sequencher 4.7 (Gene Codes, Ann Arbor, MI, USA). Nucleotide sequences have been deposited in GenBank under the accession numbers JQ629968-JQ630031.
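The per-reaction volumes listed above add up to 10 µl. The following Python sketch checks that sum and scales the shared components to a master mix. The volumes come from the text; the function names, the 10% pipetting overage, and the choice to keep primer and template out of the master mix are assumptions made only for illustration, not part of the described protocol.

```python
# Per-reaction volumes in µl, as listed in the text.
REACTION_UL = {
    "sequencing primer": 1.3,
    "5X sequencing buffer": 2.0,
    "Big Dye": 1.0,
    "amplified product": 2.7,
    "water": 3.0,
}

# Assumption: primer and template differ between reactions, so only the
# remaining components are pooled into a master mix.
PER_REACTION_ONLY = {"sequencing primer", "amplified product"}


def total_volume_ul(recipe):
    """Total volume of one reaction in µl, rounded to 0.01 µl."""
    return round(sum(recipe.values()), 2)


def master_mix_ul(recipe, n_reactions, overage=0.10):
    """Volumes of the pooled components for n reactions plus overage."""
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in recipe.items()
        if name not in PER_REACTION_ONLY
    }


print(total_volume_ul(REACTION_UL))    # 10.0
print(master_mix_ul(REACTION_UL, 10))
```

Each reaction then receives 6 µl of master mix plus its own primer and amplified product.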
Phylogenetic analysis
DNA sequences were aligned using MacClade 4.06 (Maddison and Maddison, 2000). Gaps were treated as missing data, and regions with ambiguous alignment and microsatellite repeats were excluded as in Dettman et al. (2003a). Sequence data from the TMI, DMG, TML and QMA loci were aligned in one file and added to the alignment file provided by Villalta et al. (2009). To simplify the analysis, a total of 67 Neurospora isolates were included in the present phylogenetic analysis (see Figure 1).
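The data handling described above (concatenating the four loci, treating gaps as missing data, and excluding ambiguous regions) can be illustrated with a short Python sketch. This is not the MacClade workflow used in the study; the function names, the `?` missing-data symbol, the toy sequences, and the column ranges are all hypothetical and serve only to show the idea.

```python
def concatenate_loci(alignments, locus_order, missing="?"):
    """Join per-locus alignments into one matrix.

    alignments: {locus: {isolate: aligned sequence}}.
    Gaps ('-') are recoded as missing data, and an isolate lacking a
    locus is padded with missing-data symbols of that locus' length.
    """
    lengths = {}
    for locus in locus_order:
        seq_lengths = {len(seq) for seq in alignments[locus].values()}
        if len(seq_lengths) != 1:
            raise ValueError(f"{locus}: sequences are not of equal length")
        lengths[locus] = seq_lengths.pop()

    isolates = sorted({iso for locus in locus_order for iso in alignments[locus]})
    matrix = {}
    for iso in isolates:
        parts = []
        for locus in locus_order:
            seq = alignments[locus].get(iso, missing * lengths[locus])
            parts.append(seq.upper().replace("-", missing))
        matrix[iso] = "".join(parts)
    return matrix


def drop_columns(matrix, excluded):
    """Remove columns given as 0-based, half-open (start, end) ranges."""
    skip = set()
    for start, end in excluded:
        skip.update(range(start, end))
    return {
        iso: "".join(base for i, base in enumerate(seq) if i not in skip)
        for iso, seq in matrix.items()
    }


# Toy data with two of the four loci and made-up sequences.
toy = {
    "TMI": {"A": "ACG-T", "B": "ACGTT"},
    "DMG": {"A": "GGA", "C": "GGC"},
}
combined = concatenate_loci(toy, ["TMI", "DMG"])
trimmed = drop_columns(combined, [(3, 5)])  # hypothetical ambiguous region
print(combined)
print(trimmed)
```

Padding with missing data keeps every isolate in the matrix even when one locus could not be sequenced, which is the same effect as treating gaps as missing data in the combined analysis.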
The appropriate nucleotide substitution model was chosen using Modeltest 3.7 (Posada and Crandall, 1998) and PAUP 4.0b10 (Swofford, 1998). The alignment and the chosen nucleotide substitution model were used as input for MrBayes (Huelsenbeck et al., 2001; Ronquist and Huelsenbeck, 2003). The analysis was run for 1 million generations with a burn-in of 2 500 generations to produce a consensus tree with Bayesian posterior probabilities. The final tree (Figure 1) and bootstrap branch support values were obtained with RAxML v7.2.8 (Stamatakis, 2006; Stamatakis et al., 2008) using the maximum likelihood option with the gamma model for 100 replicates and with N. discreta as outgroup. The full alignment containing all four loci has been deposited in TREEBASE under http://purl.org/phylo/treebase/phylows/study/TB2:S12564.
Figure 1: Phylogram of 67 wild Neurospora isolates produced from the combined TMI, DMG, TML, and QMA sequences. N. discreta was used as outgroup. Numbers near branches indicate confidence levels (Bayesian posterior probability / maximum likelihood bootstrap proportion); newly characterized isolates (see Table 1) are indicated by capitals in bold; numbers in front of collection sites are either FGSC numbers or were taken from Dettman et al. (2003a; prefix D) or from Villalta et al. (2009; prefix CV).
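To illustrate what the two branch support values in Figure 1 measure, the following Python sketch computes the support for a clade as the fraction of trees that contain it: post-burn-in sampled trees for a Bayesian posterior probability, or bootstrap replicate trees for a bootstrap proportion. This is a simplified stand-in, not the MrBayes or RAxML implementation. Trees are represented here as plain sets of clades, the burn-in is expressed as a number of sampled trees (the sampling frequency is not stated in the text), and all names and toy data are hypothetical.

```python
def clade_support(trees, clade, burn_in=0):
    """Fraction of retained trees that contain the given clade.

    trees: list of trees, each a set of clades, where a clade is a
           frozenset of tip names.
    burn_in: number of leading sampled trees to discard (use 0 for
             bootstrap replicates, which have no burn-in).
    """
    kept = trees[burn_in:]
    if not kept:
        raise ValueError("no trees left after discarding the burn-in")
    clade = frozenset(clade)
    return sum(clade in tree for tree in kept) / len(kept)


# Toy sample of five trees on the tips a, b and c.
abc = frozenset({"a", "b", "c"})
ab = frozenset({"a", "b"})
ac = frozenset({"a", "c"})
bc = frozenset({"b", "c"})
sampled = [{bc, abc}, {ab, abc}, {ac, abc}, {ab, abc}, {ab, abc}]

print(clade_support(sampled, {"a", "b"}))             # 0.6
print(clade_support(sampled, {"a", "b"}, burn_in=1))  # 0.75
```

In this toy sample, discarding the first tree raises the support for the clade (a, b) from 0.6 to 0.75, which shows why early, unconverged samples are excluded before posterior probabilities are summarized.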

Results and Discussion
The sequencing of the four independent nuclear loci (TMI, DMG, TML and QMA), which have previously been used in several studies for phylogenetic species recognition (e.g., N. crassa: Dettman et al., 2003a; N. discreta: Dettman et al., 2006; N. tetrasperma: Menkis et al., 2009), resulted in approximately 2 000 nucleotides of sequence data for each of the 16 newly sampled isolates. The maximum likelihood consensus tree based on the concatenated sequence data from the four loci had Bayesian posterior probabilities (PP) between 0.51 and 1.00, and maximum likelihood bootstrap proportions (MLBP) between 15% and 100%. The tree confirmed the phylogenetic species of Neurospora and was similar to those provided by Dettman et al. (2003a) and Villalta et al. (2009).
The geographical distribution of the known N. crassa clades was also confirmed (Dettman et al., 2003a). The six isolates from the Caribbean Basin (Costa Rica, Venezuela, Puerto Rico, Guyana, Brazil and Fr. Guiana) and one European isolate (from Italy) belong to clade A; the other seven European isolates (from Scotland and Spain) belong to clade B. The branch support values (Bayesian PP / MLBP) for clades A and B were 0.93/50% and 1.00/100%, respectively. For the Indian isolates (clade C), the values were 1.00/75%, compared to 1.00/95% for clades A and B combined (Figure 1).
One isolate from Europe (10866 Italy), however, was associated with clade A (predominantly representing isolates from the Caribbean Basin and the Ivory Coast; Jacobson et al., 2006). An explanation for this exception may be that isolate 10866 was at some time in history transported from the Caribbean Basin or the Ivory Coast to Europe (e.g., by human trade). After all, Neurospora crassa has frequently been found in bakeries (Yassin and Wheals, 1992).
Two isolates (from Pakistan and Thailand) did not fall into any of the existing N. crassa clades and were found to be separate and basal to all other clades. Thus, they cannot as yet be clearly classified as N. crassa or N. perkinsii. Two facts make the classification of these two isolates as N. crassa likely: based on crosses, they have been assigned to N. crassa by the Fungal Genetics Stock Center (Perkins et al., 1976; Perkins and Turner, 1988), and their geographical collection sites are close to those of the N. crassa isolates that fall into clade C. On the other hand, their bootstrap values indicate a closer relationship to N. perkinsii than to N. crassa (21% vs 75%, respectively). Thus, these two isolates may potentially represent as yet undescribed phylogenetic groups or species of Neurospora, or hybrids. To clarify their classification unambiguously, comprehensive mating tests (Dettman et al., 2003b) and phylogenetic analyses of more individuals from this region will be necessary.