Identifying strawberry Whirly family transcription factors and their expression in response to crown rot

Crown rot is one of the most destructive diseases of cultivated strawberry. The correlation between Whirly family transcription factors, one class of known resistance genes, and strawberry crown rot resistance has not been studied. In this study, the Whirlys of Fragaria × ananassa, F. iinumae, F. vesca, F. viridis and F. nilgerrensis were characterized by searching the strawberry genome database and analyzing the presence of Whirly domains. Five FaWHYs, two FiWHYs, three FnWHYs, two FviWHYs and four FvWHYs were identified from their respective genomes. Gene cluster analysis revealed two gene clusters with segmental duplications, containing two and three FaWHYs, and three FaWHYs showed syntenic relationships with the AtWHYs of Arabidopsis thaliana. FiWHY1, FvWHY2 and FviWHY1 showed syntenic relationships with FaWHY1 and FaWHY2. At the same time, FiWHY2, FvWHY3, FviWHY2 and FnWHY3 exhibited similar syntenic relationships with FaWHY4 and FaWHY5. In addition, FnWHY1 and FnWHY2 corresponded to both FaWHY1 and FaWHY2. Gene expression analysis revealed that all five FaWHYs were expressed in crowns, and the regulation of the FaWHYs was consistent with the cis-elements in their promoters. All of them were downregulated by crown rot infection. Together, these results provide a basis for further functional studies of the FaWHY proteins and their responses to crown rot.


Introduction
Strawberry, a small fruit crop much appreciated for its unique flavour and high nutritional quality, is of great importance throughout the world, but its productivity and quality are seriously limited by crown rot (Mangandi et al., 2015; Anciro et al., 2018). Crown rot occurs in the crown (root neck) and manifests as stunted plants. After infection, red streaks appear in the crown and rapidly expand into dark, sunken spots; finally the whole plant wilts and withers, which makes crown rot a devastating disease of strawberry.
Plants have a small family of single-stranded DNA (ssDNA) binding proteins called Whirly that are involved in the control of defence gene expression (Desveaux et al., 2004; Isemer et al., 2012). The Whirly transcription factor family participates in plant stress resistance through the transduction of disease resistance signals and the regulation of hypersensitive responses (Yao et al., 2008).
Whirly proteins are plant-specific and are mainly distributed in chloroplasts, mitochondria and nuclei. In most plant species this family contains only two members, with three in Arabidopsis thaliana and a few other species. Many Whirly studies have been conducted on A. thaliana. AtWHY1 contributes to stress adaptation and immune responses through redox regulation of chloroplast components in retrograde signalling (Lepage et al., 2013; Foyer et al., 2013). AtWHY1 is also involved in salicylic acid (SA)-dependent disease resistance and SA-induced expression of the systemic acquired resistance response gene.
AtWHY1 is required for both full basal and specific disease resistance responses (Desveaux et al., 2005). WHYs also function in responses to microbe interactions, such as pathogen infection. A large set of defence genes with various biochemical functions, including pathogenesis-related (PR) genes, are activated or repressed in response to pathogen attack. Studies have found that tagged AtWHY1 is translocated from the plastid to the nucleus, where it affects the expression of target genes such as PR1 (Isemer et al., 2012). For instance, PBF2 (StWHY1), a transcriptional activator located in the nucleus in potato, binds the elicitor response element in the promoter of the disease resistance gene PR1, activates the expression of PR1 and participates in the pathogen response (Desveaux et al., 2000). Overexpression of a tomato Whirly gene in transgenic tobacco resulted in resistance to Pseudomonas solanacearum (Zhao et al., 2018). In cassava, MeWRKY75 and the Whirly genes (MeWHYs) confer improved resistance against cassava bacterial blight by forming an interacting MeWRKY75-MeWHY1/2/3 complex and a MeWRKY75-MeWHY3 transcriptional module (Liu et al., 2018).
The WHY family has been characterized in several plants, including potato (Desveaux et al., 2000), A. thaliana (Desveaux et al., 2004), wheat (Chitnis et al., 2014), tomato (Zhao et al., 2018), cassava (Liu et al., 2018), tobacco (Zhao et al., 2018), chili (Lu et al., 2019), soybean (Li et al., 2019) and rice. WHYs play important roles in biotic stress and may function in the strawberry biotic stress response. However, strawberry-specific WHY studies are lacking. In the present study, the members of the strawberry Whirly transcription factor family were identified using bioinformatics tools, and their expression patterns in response to biotic stress were characterized. This study provides basic information on the protein structures, subfamily divisions, chromosomal locations in the strawberry genome and expression patterns of the Whirly proteins in response to crown rot.
Sequence alignment and phylogenetic analysis
The full-length Whirly protein sequences from A. thaliana and the strawberries were aligned with MUSCLE in MEGA version 7.0 using default parameters (Edgar, 2004; Kumar et al., 2013). A neighbour-joining (NJ) tree was then generated with bootstrapping (1,000 replicates) to estimate the phylogenetic relationships among the WHYs of the five strawberry species and A. thaliana.

Conserved motifs and gene structure analysis
Motif analysis was conducted on the MEME website (http://meme-suite.org/tools/meme) to identify conserved motifs with the following parameters: zero or one occurrence per sequence, a maximum of 10 motifs and a motif width between 6 and 50 residues. The default settings were used for all other parameters. The structure of the FaWHYs was determined in TBtools by comparing each coding sequence with the corresponding genomic sequence.
The strawberry Generic Feature Format (GFF) files were downloaded from the strawberry genome database and used to obtain the structural information of the Whirly genes. An illustration combining the FaWHY protein motifs, conserved domains, gene structures and a phylogenetic tree was also constructed in TBtools.
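For readers who prefer to run MEME locally rather than through the web server used here, the motif settings given above map onto command-line options roughly as follows. This is a minimal sketch, not the procedure used in this study: it assumes a local MEME Suite installation whose meme program accepts the -protein, -mod, -nmotifs, -minw, -maxw and -oc options, and the file and directory names are hypothetical.

```python
def build_meme_command(fasta_path, out_dir):
    """Build a MEME command line mirroring the motif settings stated in the text.

    Assumption: a locally installed MEME Suite; the study itself used the web server.
    """
    return [
        "meme", fasta_path,
        "-protein",        # the Whirly sequences are proteins
        "-mod", "zoops",   # zero or one occurrence per sequence
        "-nmotifs", "10",  # report at most 10 motifs
        "-minw", "6",      # minimum motif width
        "-maxw", "50",     # maximum motif width
        "-oc", out_dir,    # output directory (overwritten if it exists)
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    command = build_meme_command("FaWHY_proteins.fasta", "meme_out")
    print(" ".join(command))
```

The function only assembles the argument list; passing it to subprocess.run would launch the search on a machine where MEME is installed.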

Chromosomal distribution, gene duplication and collinearity
The chromosomal locations of the candidate strawberry Whirly genes were obtained from the GFF information and visualized in TBtools. Gene duplication events among the FaWHYs and collinearity between the Whirly protein sequences of A. thaliana and those of the five strawberry species were investigated with MCScanX (Wang et al., 2012). The results were visualized in TBtools.

FaWHY expression in response to biotic stress
A single-factor experiment was performed in which seedlings were inoculated with Colletotrichum siamense SCR-7. Healthy, uniform strawberry seedlings were divided into a treatment group (JZ) and a control group (CK) with 12 pots each. The JZ group was inoculated with C. siamense SCR-7 using a sterilized needle as described in Li et al. (2014), while the CK group was inoculated with pathogen-free medium using the same method. The seedlings were sampled 0 or 6 days after inoculation, resulting in four treatment groups: inoculation without the pathogen, day 0 (0DCK); inoculation without the pathogen, day 6 (6DCK); inoculation with the pathogen, day 0 (0DJZ); and inoculation with the pathogen, day 6 (6DJZ). Each treatment had three biological replicates. Photographs of inoculated and CK strawberry plants 6 days after inoculation are shown in Figure S1.
Transcriptomic data of seedling crowns from the four treatments were analysed as described by Shu et al. (2016). Twelve libraries of seedling crowns were sequenced using the Illumina HiSeq 2000 system. Reads that contained adapters, had more than 10% unknown nucleotides, or had more than 50% of bases with a quality value ≤5 were removed from the raw data to obtain clean reads. The clean reads were mapped to the genome of F. ananassa 'Camarosa' (v1.0.a1) for annotation. The transcriptomic data were uploaded to the NCBI Sequence Read Archive as PRJNA715088. Gene expression was analysed based on the transcriptomic data, where the transcriptional abundance of each FaWHY was calculated as fragments per kilobase of exon model per million mapped reads (FPKM) using cuffdiff from the Cufflinks package, version 2.2.1. The FPKM value of 0DCK was used as the control. Heat maps were created in TBtools from log2(FPKM+1)-transformed values.
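The read-filtering criteria and the log2(FPKM+1) transformation described above can be illustrated with a short Python sketch. It is not the pipeline used in this study, which followed Shu et al. (2016) and relied on Cufflinks and TBtools. The function names, the adapter string in the usage example and the decision to keep reads that lie exactly on a threshold are assumptions made for illustration.

```python
import math


def keep_read(seq, quals, adapters=(), max_n_frac=0.10,
              max_lowq_frac=0.50, lowq_cutoff=5):
    """Return True if a read passes the three filters described in the text.

    seq      : read sequence (string)
    quals    : per-base quality values (one integer per base)
    adapters : adapter sequences to screen for (hypothetical inputs)
    """
    if not seq or len(seq) != len(quals):
        raise ValueError("seq and quals must be non-empty and of equal length")
    # 1. Discard reads that contain an adapter sequence.
    if any(adapter and adapter in seq for adapter in adapters):
        return False
    # 2. Discard reads with more than 10% unknown nucleotides (N).
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # 3. Discard reads in which more than 50% of bases have quality <= 5.
    low_quality = sum(1 for q in quals if q <= lowq_cutoff)
    if low_quality / len(quals) > max_lowq_frac:
        return False
    return True


def log2_fpkm(fpkm):
    """Transform an FPKM value as log2(FPKM + 1), as used for the heat maps."""
    return math.log2(fpkm + 1.0)


if __name__ == "__main__":
    # Made-up reads and FPKM values, for illustration only.
    print(keep_read("ACGTACGTAC", [30] * 10))
    print(keep_read("ACGTNNACGT", [30] * 10))
    print([log2_fpkm(value) for value in (0.0, 3.0, 7.0)])
```

Adding 1 before taking the logarithm keeps genes with an FPKM of 0 defined on the heat-map scale (they map to 0).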

Cis-acting elements of the FaWHYs
The 2,000 bp sequences upstream of the transcription initiation site of the candidate genes were extracted from the strawberry genome sequences. The PlantCARE software (http://bioinformatics.psb.ugent.be/webtools/plantcare/html) was used to search for cis-acting elements (Rombauts et al., 1999), and the results were visualized in TBtools.

Results

Identification, characteristics, and chromosomal distribution of the WHYs in the strawberry genome
Five FaWHYs were identified in the F. ananassa genome after searching for Whirly domain sequences. The FaWHYs were named according to their positions on each chromosome. The FaWHY protein lengths ranged from 164 aa (FaWHY3) to 298 aa (FaWHY5), the pI ranged from 8.53 (FaWHY3) to 9.59 (FaWHY1), and the molecular weight ranged from 18.46 kDa (FaWHY3) to 33.10 kDa (FaWHY5). For the four diploid strawberries, the Whirly protein lengths ranged from 112 aa (FnWHY1 and FnWHY2) to 273 aa (FiWHY2), the pI ranged from 9.24 (FiWHY1) to 9.96 (FvWHY1), and the molecular weight ranged from 12.41 kDa (FnWHY1 and FnWHY2) to 30.08 kDa (FiWHY2) (Table 1).
The WHYs of the four diploid strawberries were all located on the second and fourth chromosomes (Figure S2). Gene duplication and divergence are important in gene family expansion and in the evolution of novel functions. Two gene clusters with segmental duplications were obtained from the gene cluster analysis. One cluster contained two genes (FaWHY1 and 2), whereas the other cluster had three genes (FaWHY3, 4 and 5). There were no tandem duplications in the FaWHY genes (Figure 1).

Synteny analysis of the Whirlys in A. thaliana and five strawberries
To further investigate the phylogenetic patterns of the FaWHYs, a comparative syntenic map of the five strawberries and A. thaliana was constructed. FaWHY2 and FaWHY5, which are located on chromosomes Fvb2-2 and Fvb4-3, showed syntenic relationships with AtWHY2, while FaWHY4 showed syntenic relationships with both AtWHY1 and AtWHY3, indicating that it may have played an important role in the evolution of the Whirly family (Figure 2). F. iinumae, F. vesca, F. viridis and F. nilgerrensis showed similar collinearity with F. ananassa. FiWHY1, FvWHY2 and FviWHY1, located on the second chromosome of F. iinumae, F. vesca and F. viridis (chr2, fvb2 and fvir2), showed syntenic relationships with FaWHY1 and FaWHY2. At the same time, the genes on the fourth chromosome of the four diploid species (FiWHY2, FvWHY3, FviWHY2 and FnWHY3) had similar syntenic relationships with FaWHY4 and FaWHY5. In addition, FnWHY1 and FnWHY2 on the second chromosome of F. nilgerrensis corresponded to both FaWHY1 and FaWHY2. This showed that genes have been duplicated in the course of strawberry evolution and that the Whirly genes are highly conserved (Table S1).

Phylogenetic, exon-intron structure, conserved domain and motif analysis of the FaWHYs
The phylogenetic relationships between the WHYs of the five strawberries and the AtWHYs were analysed with a phylogenetic tree built from the protein sequence alignment.
As shown in Figure 3, the strawberry Whirlys and AtWHYs clustered into three major groups. Groups I to III had 9, 2 and 8 members, respectively, and differences were observed between A. thaliana and the strawberries. Groups I and III contained nearly equal numbers of Whirlys, whereas group II contained only two AtWHYs (Figure 3).
The FaWHYs in different groups were characterized according to their Whirly domain numbers and exon-intron structures. Motifs 1 and 2 composed the Whirly domain, and all five FaWHYs had one characteristic domain. The number of introns varied from 4 (FaWHY3) to 8 (FaWHY2). The FaWHYs in group I possessed motifs 1, 2, 3, 4, 5 and 8. Group II contained no FaWHY member. FaWHY4 and FaWHY5 in group III contained the same eight motifs and differed in their exon-intron structures, while FaWHY3 lacked the 5' untranslated region (5' UTR) (Figure 4; Figure S3).

Figure 3 The phylogenetic tree was constructed using the neighbour-joining method implemented in MEGA 7.0. Reliability of the predicted tree was tested using bootstrapping with 1,000 replicates. Branch lines with different colours represent different Whirly groups.

Analysis of FaWHY expression and the cis-elements in FaWHY promoters
Gene expression analysis revealed that the five FaWHYs were variably expressed in the crowns and that all of them were downregulated by crown rot infection (Figure 5). Several cis-elements, including 'hormone responsive' elements, were identified in the upstream regulatory regions (promoters) of the FaWHYs. The 'defense and stress responsive' and 'salicylic acid responsive' cis-elements are responsible for the plant response to pathogen infection. 'Defense and stress responsive' cis-elements were identified in the promoters of four FaWHYs (FaWHY1, 3, 4 and 5), and one 'salicylic acid responsive' cis-element was identified in the promoter of FaWHY1. Only one type of 'defense and stress responsive' cis-element was found in FaWHY1, 3, 4 and 5, suggesting that these genes may respond to stress in a similar way. The promoter of FaWHY1 contained both 'defense and stress responsive' and 'salicylic acid responsive' cis-elements, suggesting that this gene may be regulated by SA-dependent disease resistance responses and microbial interactions (Figure 6).

Figure 4 Phylogenetic tree of deduced FaWHY proteins associated with the motif composition and exon-intron composition of FaWHY genes. The phylogenetic tree was constructed using the neighbour-joining method (left-hand side of the figure). Reliability of the predicted tree was tested using bootstrapping with 1,000 replicates. The motif composition related to each FaWHY protein is displayed in the middle of the figure. The motifs, numbered 1-10, are displayed in different coloured boxes. The information for each motif is provided in Figure S3.

Figure 6 The 'defense and stress responsive' and 'salicylic acid responsive' cis-elements are indicated by orange and nattier blue, respectively.

Discussion
Plant Whirlys form a multigene family: there are three Whirlys in A. thaliana, two in potato (Maréchal et al., 2008) and two in tomato (Akbudak et al., 2019). In addition, Whirly amino acid sequences can be found in dozens of other plants, such as soybean, wheat, rice, corn and lily (Kong et al., 2012). Although the F. ananassa genome (780 Mb) (https://www.rosaceae.org) is larger than that of A. thaliana (125 Mb) (https://www.arabidopsis.org), the total number of FaWHY genes was similar to the number of these genes in A. thaliana. Five FaWHYs were identified on five chromosomes in the F. ananassa genome. The gene cluster analysis yielded two gene clusters, with two and three FaWHYs; these clusters likely arose from segmental duplications, suggesting gene family expansion during evolution. Among them, three FaWHYs (FaWHY2, 4 and 5) showed syntenic relationships with the AtWHYs. It can be speculated that FaWHY1 and FaWHY3 may be new genes produced during plant evolution. The syntenic relationships between the Whirlys of the four diploid strawberries and those of F. ananassa indicated that gene duplication occurred in F. ananassa during evolution. FaWHY3 showed no collinearity with the four diploid strawberries but shared segmental duplications with FaWHY4 and FaWHY5, which may have been caused by chromosomal variation during evolution. These results provide insights into the evolution of the FaWHYs. F. ananassa is an allopolyploid whose parental ancestors still exist. Recently, the possible ancestral parents of octoploid strawberry have been investigated (Edger et al., 2019; Liston et al., 2020; Edger et al., 2020) using the syntenic relationships between octoploid strawberry and its putative progenitors to clarify its evolution and genetic characteristics at the early stage of formation.
In our study, we found that FaWHY1 and FaWHY2 have syntenic relationships with FvWHY2, FiWHY1 and FviWHY1, whereas FnWHY1 and FnWHY2 both have syntenic relationships with FaWHY1 and FaWHY2. In the syntenic relationship analysis, the FnWHYs did not multiply but instead decreased in number relative to the FaWHYs, which supports the finding of other researchers that F. nilgerrensis is not an ancestor of F. ananassa (Feng et al., 2021).
The Whirly domain was highly conserved during evolution, which provides information for predicting the structure and function of the FaWHY genes. Whirly proteins (WHYs) have three regions: the Whirly domain, an N-terminal domain and a C-terminal variable region. The Whirly domain is the most important; it binds ssDNA, and the KGKAAL, YDW and K amino acid residues in this region may act as important sites of the WHY protein (Desveaux et al., 2005). The N-terminal domain may contain chloroplast or mitochondrial signal peptides and transcription activation regions; the C-terminal variable region contains a self-regulating region that can regulate ssDNA binding activity (Desveaux et al., 2002). All of the FaWHYs clustered into two major groups, with distinct protein domains, motifs and sequences. The Whirly domain is the most conserved region of the Whirly proteins; motifs 1 and 2 composed the Whirly domain, and all five FaWHYs had one characteristic domain.
The phylogenetic tree generated from the protein sequence alignment of the strawberries and A. thaliana separated the five FaWHYs into two large groups. Group members shared similar protein sequence lengths, motif compositions and exon-intron structures, suggesting a close relationship. Thus, FaWHY1, FaWHY2 and their homolog AtWHY2 in the same branch may play similar roles in plant-microbe interactions and biotic stress responses. Because AtWHY2 clustered with FaWHY1 and FaWHY2, we speculate that these two proteins may be located in mitochondria and participate in the transmission of disease resistance signals (Cappadocia et al., 2012).
The phylogenetic tree suggested that the FaWHYs are involved in pathogen infection interactions, but this hypothesis requires verification in future studies. In addition, gene expression analysis revealed that the five FaWHYs were expressed in the crowns with identical expression patterns: all of them were downregulated by crown rot infection. Furthermore, 'defense and stress responsive' cis-elements were identified in the promoters of FaWHY1, 3, 4 and 5. Because the regulation of FaWHY expression was consistent with the cis-elements in the promoters, it can be speculated that these genes have similar functions. This result suggests that the FaWHYs may be involved in the response to pathogen infection stress.

Conclusions
Strawberry crown rot occurs all over the world, and the correlation between crown rot resistance and the disease resistance gene Whirly is still unclear. In the current study, we identified five FaWHYs, two FiWHYs, three FnWHYs, two FviWHYs and four FvWHYs in the F. ananassa, F. iinumae, F. nilgerrensis, F. viridis and F. vesca genomes, respectively. The syntenic relationship analysis with A. thaliana indicated that F. ananassa produced a new gene (FaWHY1) during evolution. In the syntenic relationship analysis with the four diploid strawberries, FiWHY1, FvWHY2 and FviWHY1 showed syntenic relationships with FaWHY1 and FaWHY2. At the same time, FiWHY2, FvWHY3, FviWHY2 and FnWHY3 had similar syntenic relationships with FaWHY4 and FaWHY5, and FnWHY1 and FnWHY2 corresponded to both FaWHY1 and FaWHY2. This suggests that F. ananassa may have undergone chromosomal variation during evolution and that the Whirly genes are highly conserved. All five FaWHYs were downregulated in response to crown rot infection. The analysis of the phylogenetic tree and the cis-elements in the promoters indicated that these genes may be involved in the response to pathogen infection stress. However, the mechanism of action of the Whirlys is still not well understood, so the signalling pathways involved require further study. Collectively, the results of this study provide a basis for future functional studies of the strawberry Whirlys and their responses to crown rot.

Authors' Contributions
SB conceived and designed the experiments, supervised the work and revised the manuscript. YH conducted the experiments and wrote the original manuscript.
Both authors read and approved the final manuscript.