Genetic Differentiation between Quercus frainetto Ten . and Q . pubescens Willd . in Romania

Little is known about genetic differences among Quercus frainetto and Q. pubescens, two species of section Dascia Kotschy (subgenus Lepidobalanus, white oaks) that reach in Romania the margins of their natural distribution range. A set of genomic SSRs (simple sequence repeats) and EST (expressed sequence tags)-SSRs was used to estimate the genetic differentiation among four natural populations of the two species. Q. pubescens had higher values of genetic diversity than Q. frainetto, although the differences were not significant (P>0.05). Two out of seven marker-loci, QrZAG112 and QpZAG110, displayed very high FST values. Averaged across loci, the genetic differentiation was high and significant (FST = 0.067, P<0.05). Genetic distances were much higher among species than among populations within species. A Bayesian analysis indicated that two is the most appropriate number of genetic clusters. Using a blind procedure (i.e. based on multilocus genotypes only) the vast majority of sampled individuals (90%) could be assigned to the cluster corresponding to their phenotypes. When information about sampling localities was introduced in the assignment test, all individual trees were correctly classified. The higher degree of admixture in Q. frainetto as compared to Q. pubescens may be explained by different rates of introgressive hybridization.


Introduction
The issue of genetic differentiation among closely related species is a central research theme because it provides insights into the evolutionary processes responsible for species divergence.Genus Quercus is a long-standing model for studying population divergence, species delimitation and natural hybridization (e.g.Curtu et al., 2009;Gailing et al., 2007;Peñaloza-Ramírez et al., 2010;Zeng et al., 2010).In Europe, most of the studies have focused on genetic differentiation and identification of highly discriminating genomic regions between the most common oak species: Q. robur and Q. petraea ( Jensen et al., 2009;Muir and Schlötterer, 2005;Neophytou et al., 2010;Scotti-Saintagne et al., 2004).By contrast, little is known about genetic differences between the two oak species from Section Dascia Kotschy (subgenus Lepidobalanus or white oaks), Q. frainetto and Q. pubescens, although they have a high ecological and economic value in the southeastern part of the European continent (Abrudan et al., 2009;Ioras et al., 2009).
The actual distribution of Q. frainetto in Romania comprises approximately 130 000 ha, 2% of the total forest cover, as much as Q. robur (Sofletea and Curtu, 2007).Q. frainetto and Q. pubescens are elements of the (sub-)Mediterranean flora that reach in Romania the northern limit of their distribution range.Q. frainetto is a meso-xerophilous species which can grow on heavy clay soils.Q. pubescens is better adapted to xeric conditions, being found on dry sites, such as on limestones and sunny slopes (e.g. in Transylvania) or in the wood steppe in southern Romania.Both species can be easily identified using leaf and fruit characters (Schwarz, 1993).
Few studies have investigated genetic differences between Q. frainetto and Q. pubescens, or between them and other oak species.Chloroplast DNA markers were found to be polymorphic, but could not discriminate among species (Moldovan et al., 2010;Petit et al., 2002).Highly polymorphic nuclear markers, such as microsatellites (SSRs -simple sequence repeats), are more promising for differentiating among oak species (Curtu et al., 2007;Fortini et al., 2009).Very recently, microsatellite markers have been developed from expressed sequence tags (EST) or sequences of messenger RNA starting from a Q. robur/Q.petraea library (Durand et al., 2010).In contrast to genomic SSRs, EST-SSRs have the advantage of being located within the expressed portion of the genome, and thus reflect differences at gene level.
The aim of this study was to estimate the genetic differentiation between the two oak species of section Dascia that occur in Romania.It was tested whether genotypic information from a set of seven microsatellite markers is at 94ºC, a 40 s annealing step at 52ºC for gSSRs (57ºC for EST-SSRs), a 1 min 20 s elongation step at 70 ºC and a final extension step at 70ºC for 12 min.Amplification products were run on a Beckman Coulter Genetic Analyser using Frag-3 method and Size Standard 400.The products were then analyzed using Fragment Analysis Software using default parameters and PA ver1 dye correction.

Statistical data analysis
Microsatellite loci were tested for genotyping errors due to non-amplified alleles, large allele drop-out and scoring of stutter peaks using MICRO-CHECKER 2.2.0.3 (Van Oosterhout et al., 2004).For each microsatellite locus and species, number of alleles, number of species-specific alleles, allele frequencies, observed and expected heterozygosity (gene diversity), unbiased estimates of Nei's genetic distances were calculated using the computer software GenAlEx version 6.4 (Peakall and Smouse, 2006).Using FSTAT version 2.9.3.2 (Goudet, 1995), allelic richness (El Mousadik and Petit, 1996;Petit et al., 1998), a measure of the number of alleles that is independent of the sample size, was calculated.The smallest number of the individuals (n) for a locus (QrZAG39) in a species was set to 43.The differences between species were tested using a Student's t-test.An unweighted pair group method arithmetic average (UPGMA) dendrogram of the oak populations, based on Nei's genetic distance, was constructed using MEGA version 4 (Tamura et al., 2007).
A hierarchical Analysis of Molecular Variance (AMO-VA) using the program ARLEQUIN ver 3.5.1.2(Excoffier et al., 2005) was employed to examine the partitioning of molecular variance into components: within populations, among populations within species and among species.F ST values were estimated for each locus and across loci.F ST values were tested by permuting individual genotypes among populations and species.P-values were calculated using 10 000 permutations.
The frequency-based assignment test (Paetkau et al., 1995) available in GenAlEx v. 6.4 software was first used to assign individuals to species.For each individual, a log likelihood value was calculated for each species, using the allele frequencies of the respective species.An individual was assigned to the species with the highest log likelihood value.
sufficient to identify the species at population and individual level.

Genetic analysis
DNA was extracted from buds using the Qiagen DNeasy96 Plant Kit following the manufacturer protocol, but without liquid nitrogen for material disruption.The DNA was then kept by -60°C until use.Five genomic SSRs (gSSRs) and two EST-SSRs were amplified using Polymerase Chain Reaction (PCR).Both gSSRs (Kampfer et al., 1998;Steinkellner et al., 1997) and EST-SSRs (Durand et al., 2010) were developed from Q. robur/Q.petraea libraries.More information about the primer pairs, repeat motif and allele length is given in Tab. 1.The forward primer for each locus was fluorescently labelled with Beckman dyes (D2, D3 and D4).The primers were combined into two PCR multiplexes on the basis of annealing temperature and fluorescent label.The first multiplexing reaction included the five gSSRs and the second one the two EST-SSRs.The reactions were performed in a 20 μl volume containing approximately 10 ng template DNA ; 1x Promega colorless PCR buffer; 2 mM of MgCl 2 ; 0.45 mM of each dNTP (Fermentas); for each primer concentrations see Tab. 1; 1.15 U Taq DNA polymerase (Promega).Amplification was carried out in a Corbett Thermal Cycler.The PCR profile was as follows: 3 minutes of denaturation at 94ºC followed by 31 cycles of 50 s denaturation The Bayesian clustering method implemented in STRUCTURE software version 2.3.3 (Pritchard et al., 2000) was further used to determine the genetic structure of the sampled populations.Two model approaches have been used.The first approach was a blind procedure that did not use any prior information about species and geographic location.The second one took into consideration the sampling location (with LocPrior model).This model is recommended when molecular data is not very informative to help the detection of population structure (Hubisz et al., 2009).20 independent runs were done for K, number of clusters, ranging from 1 to 5. The program was run with correlated allele frequency (Falush et al., 2003).Each run consisted in 50 000 burn-in steps followed by 10 6 iterations.To estimate the number of clusters (K), an ad hoc measure, ΔK, which is based on the rate of change in the 'log probability of data' (L(K)) between successive K values, was calculated (Evanno et al., 2005).The software STRUCTURE HARVESTER (Earl, 2011) was used for ΔK estimation.

Results and discussion
The seven genomic SSR and EST-SSR markers have revealed high levels of polymorphism in both species (Tab.2).However, a strong reduction in variability was observed at two genomic dinucleotide microsatellite loci, QrZAG112 and QpZAG110, in Q. frainetto.The two oak species share the most frequent alleles at the investigated loci.Nevertheless, numerous species-specific alleles (i.e.alleles that are found only in each population of one species) were also observed, but the vast majority of them are rare (relative frequency<0.05).Most of the rare alleles may simply be species-specific because of the limited sample size.Allelic richness is highly sensitive to sampling errors, especially at highly polymorphic microsatellite markers (Litt and Luty, 1989).Only five Q. pubescens and two Q. frainetto specific alleles with higher frequency (>0.05) were detected in the four oak populations.Allele 92bp (Fig. 1a) at locus QrZAG112 (frequency=0.17)and allele 245bp at locus QrZAG11 (frequency=0.10)are among the five specific allele for Q. pubescens.By contrast, Q. frainetto specific alleles have been detected in the EST regions: allele 184bp at locus PIE040 (frequency=0.10)and allele 304bp (fre-quency=0.06) at locus GOT004.
No evidence for scoring errors due to stuttering and large allele drop-out was found in the microsatellite data set.However, Micro-checker software indicated that null (non-amplified) may be present at two marker-loci: QrZAG39 and GOT004.Interestingly, null alleles were also reported at locus QrZAG39 in Q. robur and Q. petraea (Neophytou et al., 2010).The high number of alleles detected at locus GOT004 (see Tab. 2) which differs only 1 base pairs (bp) in length, although the repeat motif is dinucleotide (TG) n , supports the assumption of high mutation rates in the flanking regions of the microsatellite sequence including the primer's binding site.
Mean values of genetic diversity measures were higher in Q. pubescens than in Q. frainetto, but the differences are not significant (P>0.05)(Tab.2).Higher allelic richness and gene diversity in Q. pubescens might be explained by the greater propensity of this species to hybridize with other white oak species (Q.petraea, Q. robur) as compared to Q. frainetto.Indeed, in a four-oak-stand (Bejan-Deva) in west-central Romania, there was evidence for more hybrids between Q. pubescens and the three other species than hybrids that have Q.frainetto as parental species (Curtu et al., 2007).Some hybrid combinations that involve Q. frainetto, such as Q. frainetto x Q. robur, were extremely rare in the same mixed stand.Since Q. pubescens and Q. frainetto have very similar proportions in the Bejan Forest, the differences in the hybridization rates were not influenced by their relative abundance (Lepais et al., 2009).Therefore, the reproductive barriers seemed to be stronger in Q. frainetto than in Q. pubescens.Moreover, Q. pubescens hybridizes extensively with Q. petraea in other parts of the natural range (e.g. in Italy, Salvini et al., 2008).
Two out of seven loci displayed very high F ST values (Tab.3).The two loci are genomic SSRs markers.However, one of the EST-SSR loci, PIE004, also shows a relatively high F ST value.Interestingly, locus QrZAG112 (Fig. 1) discriminated also very well between Q. petraea and Q. robur (Scotti-Saintagne et al., 2004).Not the same situation was found for another outlier locus between Q. pe- traea and Q. robur (QrZAG96) that shows an extremely low value for F ST in the present investigation (Tab.3).Loci with high F ST values are very likely situated in genomic regions under selection (Lexer et al., 2006;Neophytou et al., 2010;Scotti-Saintagne et al., 2004).Genetic differentiation between both species was high and significant (F ST =0.067; P<0.05), when all loci where considered jointly.Significant differences between Q. frainetto and Q. pubescens were also reported for a set of five genomic SSRs in Italy (Fortini et al., 2009).The four populations cluster according to species rather than geographic origin (Fig. 2).The genetic distance  among Q.frainetto populations is much smaller than that among Q.pubescens populations, which is consistent with the geographic distances among populations.Moreover, in accordance with AMOVA, the genetic variation among Q.pubescens and Q. frainetto is nearly six fold higher than the variation among populations within species (Fig. 3).Using a genetic assignment procedure implemented in software GenAlEx ver.6.4., the taxonomic status of the sampled individuals, that had either a Q. pubescens or Q. frainetto phenotype, was determined.In 96% of the cases, the molecular data indicated the correct status (Fig. 4).Only five individuals, four Q. pubescens and one Q.frainetto, were not correctly assigned.In the Bayesian analysis, the uppermost level of structure corresponds to two clusters (Fig. 5).Each species was represented by one cluster in the assignment procedure, 90% of the individuals had the highest admixture coefficient (Q>0.50) for the genetic group corresponding to their phenotype (Fig. 6a).The percentage reaches 100% if information about the sampling localities is considered in the admixture model (Fig. 6b).and the admixture coefficient (Q), corresponding to the assignment probability of each individual to each cluster, was used to infer the species status.There was a clear correspondence between the genetic cluster and the species designation (Fig. 6).When no prior information was used Among the individuals that were not correctly classified (10%) using the blind procedure, no one had a probability larger than 0.85 for the wrong cluster.Most of them had coefficients of admixture between 0.50 and 0.60, a proportion which usually correspond to hybrids, although only pure stands for both species have been sampled.Two hypotheses may explain the wrong assignment of the 13 individuals: (1) the molecular data was not informative or (2) these individuals are natural hybrids.The first hypothesis is supported by the absence (at Săcălaia-Cluj) and low-density (at Măcin-Tulcea) of Q. frainetto individuals in the vicinity of the two Q. pubescens populations.Moreover, closely related oak species, which have recently diverged, share ancestral polymorphism (Muir and Schlötterer, 2005).On the other hand, hybridization events are more likely in the two Q. frainetto stands since Q. pubescens populations are mentioned to occur in their proximity (Sofletea and Curtu, 2007).Indeed, the degree of admixture was larger for Q. frainetto populations than for Q. pubescens populations (0.30 versus 0.25), which suggests more introgression in Q. frainetto.However, hybridization by long distance pollen dispersal may explain the occurrence of hybrids between the two species in apparently isolated Q. pubescens populations.The same hypothesis was invoked for the presence of hybrids with Q. pyrenaica and Q. pubescens in a mixed stand consisting only of Q. robur and Q. petraea trees, although the nearest Q. pyrenaica and Q. pubescens populations are localised tens of kilometres away (Lepais et al., 2009).

Conclusions
Higher values of genetic diversity measures were observed in Q. pubescens than in Q. frainetto.Two out of seven microsatellite loci discriminated very well between the two species.The molecular analysis demonstrates that Q. pubescens and Q. frainetto can be unambiguously designated at population level.Using seven microsatellite markers the species of an individual can be easily identified when sampling location is known.In the absence of any information about geographical location, the species can still be determined with a high probability.

Fig. 2 .
Fig. 2. UPGMA dendrogram based on unbiased estimates of Nei's genetic distances between Q. pubescens (PUB) and Q. frainetto (FRA) populations at seven microsatellite loci Fig. 5. a -Mean L(K) (±SD) over 20 runs for each K value; b -ΔK calculated as ΔK = m|L"(K)|/s[L(K)].The modal value of this distribution is the true K or the uppermost level of structure, here two clusters Tab. 2. Diversity measures of the seven microsatellite loci investigated in Q. pubescens (PUB) and Q. frainetto(FRA) Note: Na -number of alleles, A -allelic richness after rarefaction (n=43 individuals), Ho -observed heterozygosity, He -expected heterozygosity.
Tab. 3. Pairwise F ST values between Q. pubescens and Q. frainetto for the seven microsatellite loci