Determination of Camellia oleifera Abel. Germplasm Resources of Genetic Diversity in China using ISSR Markers

Camellia oleifera is one of the four major woody oil plants in the world and is widely cultivated in South China. To examine the genetic diversity of C. oleifera in China, the diversity and genetic relationships among and within the major populations of 109 varieties were analyzed using ISSR markers. Twenty-three of 49 ISSR primers yielded 487 legible bands, of which 335 were polymorphic, giving a polymorphism ratio of 68.86%. The Zhejiang population showed the highest genetic diversity (H = 0.18), while the Guangxi population showed the lowest (H = 0.0851). Based on the bands, the genetic similarity coefficient calculated with NTSYS 2.10e software ranged from 0.61 to 0.93. At a coefficient of 0.75, UPGMA cluster analysis divided the 109 cultivars into 11 categories, of which category I contained 79 varieties. At the same coefficient, the varieties in category I were further divided into 7 sub-groups, which indicates a close genetic relationship. The results suggest that Hunan, the main producing area of C. oleifera, has rich variety resources and complex topography, and therefore a high genetic diversity. Meanwhile, the main varieties of C. oleifera in Hubei were introduced from Hunan, which results in fewer varieties and reduces the genetic diversity of C. oleifera. The ISSR profiles can improve C. oleifera germplasm management and may help to determine the correlations between different varieties and their distribution in different provinces.


Introduction
The oil-tea camellia (Camellia oleifera Abel.) is a traditional woody vegetable oil plant that is mainly distributed in the southern part of China (Dai et al., 2013a). Seeds of oil tea, which contain tea saponins and tea oil, have been utilized in China for more than 2000 years (Hu et al., 2012; Liu et al., 2016). The tea oil extracted from the seeds of C. oleifera contains bioactive compounds with high nutritional and medicinal value (Zhao et al., 2017; Jin, 2012). Moreover, the oil of C. oleifera is the main cooking oil used in the southern provinces of China, especially in Hunan, Hubei, and Jiangxi. The high nutritional value of tea oil is mainly due to its high oleic acid content, which constitutes up to 88% of the fatty acids (Zhang et al., 2012).
Tea oil flavor is comparable to that of olive oil. Earlier evidence indicates that tea oil is abundant in unsaturated fatty acids and has hepatoprotective and antioxidant properties (Lee et al., 2007). At present, commercial oil tea production depends mainly on cultivated resources, whose planting area is concentrated in the southern provinces of China. Some areas of these provinces have become the main production regions of oil tea in the country. The species is predominantly propagated sexually by seeds in most areas. In rural areas, farmers usually collect all mature seeds at random directly from local wild resources, mix them together, and plant them in the fields. However, is it likely that these cultivation practices will lead to homogeneity or decreased genetic diversity after several decades of cultivation? The answer to this question is becoming increasingly important due to the rapid decrease of the wild gene pools of this species and the lack of good cultivars (Yu et al., 2013).
Besides, because of the diversity of C. oleifera germplasm resources, a variety of good individual plants, clones and families have been cultivated throughout China, and the morphology of C. oleifera shows obvious differences under the influence of genetic background and environment, which has led to confusion among C. oleifera varieties (Yu et al., 2013). At the seedling stage, C. oleifera does not produce flowers or fruits and exhibits only slight leaf-shape variation among varieties. Thus, it is hard to distinguish varieties based on the morphology of their leaves, inflorescences, involucres, and fruits. Therefore, the widespread application of genetic diversity research technology at the DNA level could enable a better understanding of the genetic relationships and distribution of different C. oleifera germplasm resources (Wen et al., 2006).
The inter-simple sequence repeat (ISSR) marker system is a polymerase chain reaction-based technique that uses a single amplification primer composed of a microsatellite motif to target a subset of simple sequence repeats (SSRs), or microsatellites (Salis et al., 2017). SSRs and ISSRs have been recognized as useful molecular markers in marker-assisted selection, the analysis of genetic diversity, population genetic analysis, and for other purposes in various species (Hamza et al., 2012).
Oil tea species are an invaluable gene pool for [...] germplasm for the development of novel cultivars of pasture grasses and medicinal plants. To date, a limited number of studies have been conducted on the genetic assessment of Chinese C. oleifera species using ISSR markers in different areas. Genetic studies of plant resources have substantially decreased the redundancy of germplasm conservation and facilitated the construction of core germplasm collections, which is important for the efficient use of gene resources in plant breeding.

Plant material and DNA extraction
A total of 109 samples were collected from healthy C. oleifera plants in the main producing regions of China (Fig. 1). From each sample, 10-20 newly grown, tender leaves were randomly selected and placed into a labeled zip-lock bag, which was preserved at -80 °C (Table 1).
In this study, the CTAB method proposed by Yang (2011) was modified to extract total genomic DNA from the leaves of the 109 C. oleifera varieties. The quality of the genomic DNA was checked by 0.8% agarose gel electrophoresis, and the DNA concentration was determined with a UV spectrophotometer. The extracted DNA was stored at -20 °C for later use (or at -4 °C for short-term storage).

ISSR amplification
In terms of the primers used in this study, ISSR sequences were screened according to the EST library of C. oleifera. With reference to the ISSR primer sequences published on the website of the University of British Columbia, Canada, 49 amplification primers were designed (Table 2) and synthesized by GenScript (Nanjing) Co., Ltd.
A C. oleifera sample with high-purity genomic DNA was used as the template, and PCR amplification was performed separately with each of the 49 primers according to the above program. The amplified products were first detected by 1% agarose gel electrophoresis and further analyzed by 6% polyacrylamide gel electrophoresis, from which primers with high polymorphism were selected for the identification of the 109 C. oleifera samples (Cheng et al., 2012).

Data analysis
All PCR products were separated by 6% polyacrylamide gel electrophoresis at 30 V/cm for about 90 min, followed by silver nitrate staining and sodium hydroxide development, and the resulting bands were scored as 0 or 1 to build a database. Bands with the same electrophoretic mobility among the amplification products of the same primer were considered homologous (Yang et al., 2013). Bands were read manually. At a given migration position, "1" denoted the presence of a clear and repeatable band, while "0" denoted its absence. In the NTSYS 2.10e analysis software, the genetic similarity matrix between varieties was obtained using the SM similarity coefficient, and cluster analysis was then performed by the unweighted pair-group method with arithmetic means (UPGMA) to obtain a dendrogram (Liu et al., 2006).
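As a minimal sketch of the scoring and similarity step described above, the following Python snippet computes the simple matching (SM) coefficient from 0/1 band profiles. The variety labels and band scores are hypothetical, and the snippet only illustrates the coefficient itself; it is not the NTSYS implementation used in this study.

```python
from itertools import combinations


def simple_matching(a, b):
    """SM coefficient: share of loci where two 0/1 profiles agree (1/1 or 0/0)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def similarity_matrix(profiles, coefficient=simple_matching):
    """Build a symmetric similarity table (dict of dicts) for all varieties."""
    names = list(profiles)
    matrix = {name: {name: 1.0} for name in names}
    for p, q in combinations(names, 2):
        value = coefficient(profiles[p], profiles[q])
        matrix[p][q] = value
        matrix[q][p] = value
    return matrix


# Hypothetical band scores at 8 loci (1 = band present, 0 = band absent).
profiles = {
    "V1": [1, 1, 0, 1, 0, 1, 1, 0],
    "V2": [1, 1, 0, 1, 0, 1, 0, 0],
    "V3": [0, 1, 1, 0, 1, 1, 0, 1],
}

sm = similarity_matrix(profiles)
for p, q in combinations(profiles, 2):
    print(p, q, sm[p][q])
```

Because the SM coefficient counts shared absences (0/0) as agreement, it can give different values from band-sharing coefficients that ignore joint absences.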

Screening results of primers
DNA from the AH18 and XL51 varieties was randomly selected as the template to screen each of the 49 ISSR amplification primers. Primers that gave no amplified products or poor polymorphism were excluded, while those with clear amplified bands and obvious polymorphic fragments were retained. As shown in Fig. 1, polymorphic bands were obtained with primers ISSR 8, 9, 13-18, 20, 25, 30, 32, 36, 38, 43 and 45, which indicates that these primers can serve as a reference for the screening of ISSR primers. Primers that did not show polymorphic bands in the other lanes were excluded. Further analysis by polyacrylamide gel electrophoresis (Fig. 2) revealed that primers ISSR 7, 8, 9, 13, 14, 19, 20, 21, 28, 29, 32, 33, 34 and 36 showed clear polymorphic bands. Combining these with the preliminary screening results shown in Fig. 3, the 23 primers listed in Table 2 were selected and used for the identification of the 109 varieties. These primers amplified bands that were clear, stable, reproducible and highly polymorphic in the tested C. oleifera samples.

ISSR polymorphism and fingerprinting
The genomic DNA extracted from the 109 samples was amplified using the 23 selected primers. A total of 487 DNA bands were amplified, an average of 21.2 bands per primer. Of these, 335 bands were polymorphic, an average of 14.6 polymorphic bands per primer, and the average percentage of polymorphic loci was 68.86% (Table 2). The number of bands differed considerably among primers; ISSR13 and ISSR36 amplified 27 and 28 bands, respectively. Meanwhile, ISSR18, ISSR28 and ISSR36 were able to amplify 20 polymorphic bands. DNA fingerprint profiles were successfully constructed for the 23 primers, of which the amplification results of ISSR36 are shown in Fig. 3, and those of other primers were shown in this paper. The DNA band polymorphism in the amplification results of each primer can be used to better distinguish the C. oleifera cultivars.

Analysis of genetic diversity
A total of 487 bands were amplified with the 23 primers, including 335 polymorphic loci. Genetic diversity differed considerably among the populations (Tables 3 and 4). For the 109 C. oleifera samples, the average number of alleles was 1.2791, the average Nei's gene diversity index was 0.1851, and the average Shannon information index was 0.296. The degree of genetic diversity also differed among populations: Nei's gene diversity index ranged from 0.0851 to 0.1809, and the Shannon information index ranged from 0.1242 to 0.286. Among the populations (Fig. 5), the Zhejiang population showed the highest genetic diversity, with a Nei's gene diversity of more than 0.18, while the Guangxi population showed the lowest, with H = 0.0851. The genetic diversity and richness of C. oleifera populations are influenced by climate, topography, human activities and other factors. In China, Hunan is the main producing area of C. oleifera, with rich variety resources and complex topography, and therefore has a high genetic diversity. Meanwhile, the main varieties of C. oleifera in Hubei were introduced from Hunan, which results in fewer varieties and reduces the genetic diversity of C. oleifera.
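The band statistics and diversity indices reported above can be illustrated with a short sketch. The first function simply redoes the arithmetic on the reported totals (487 bands, 335 polymorphic, 23 primers). The pooled ratio 335/487 comes to about 68.8%, so the reported 68.86% may be an average of per-primer percentages; that reading is an assumption. The remaining functions show one common way to obtain per-locus Nei's gene diversity and Shannon's index for dominant markers. They assume Hardy-Weinberg equilibrium and estimate the null-allele frequency as the square root of the band-absence frequency. The text does not state which estimator was used, and the example locus is hypothetical.

```python
import math


def band_summary(total_bands, polymorphic_bands, n_primers):
    """Per-primer averages and the pooled percentage of polymorphic bands."""
    return {
        "bands_per_primer": total_bands / n_primers,
        "polymorphic_per_primer": polymorphic_bands / n_primers,
        "pooled_polymorphism_pct": 100.0 * polymorphic_bands / total_bands,
    }


def null_allele_frequency(scores):
    """Assumed estimator: q = sqrt(fraction of samples lacking the band)."""
    absent = sum(1 for s in scores if s == 0)
    return math.sqrt(absent / len(scores))


def nei_gene_diversity(q):
    """Nei's gene diversity at a two-allele locus: H = 1 - (p^2 + q^2)."""
    p = 1.0 - q
    return 1.0 - (p * p + q * q)


def shannon_index(q):
    """Shannon's information index: I = -(p ln p + q ln q)."""
    p = 1.0 - q
    return -sum(f * math.log(f) for f in (p, q) if f > 0.0)


summary = band_summary(total_bands=487, polymorphic_bands=335, n_primers=23)
print({key: round(value, 2) for key, value in summary.items()})

# Hypothetical locus: band present in 12 samples and absent in 4.
locus = [1] * 12 + [0] * 4
q = null_allele_frequency(locus)
h = nei_gene_diversity(q)
i = shannon_index(q)
print(q, h, i)
```

Population-level values such as those in Table 3 would then be averages of the per-locus values over all loci scored in that population.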

Genetic differentiation coefficient and gene flow
For the above-mentioned five provinces, the total genetic diversity (Ht) of the C. oleifera populations was 0.1784. The genetic diversity within populations was 0.1528, which accounted for 82.55% of the total genetic diversity, and the genetic diversity among populations was 0.0323, which accounted for 17.45% of the total genetic diversity. The genetic diversity within the C. oleifera populations was greater than that among them, which indicates that the genetic variation of the 109 C. oleifera varieties lies mainly within populations. Meanwhile, the genetic differentiation coefficient of the C. oleifera populations was 0.32, which suggests that the genetic differentiation among populations was relatively low. The gene flow among populations (Nm) was 2.9846, which suggests that there was frequent gene exchange among different populations of C. oleifera.
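The relationship between these quantities can be sketched as follows, assuming the commonly used definitions Gst = (Ht - Hs)/Ht and Nm = 0.5(1 - Gst)/Gst, where Hs denotes the within-population diversity. The text does not state which formulas were applied, so both are assumptions here. Under them, the reported Ht = 0.1784 and within-population value of 0.1528 give Gst of about 0.14 and Nm of about 2.98, in line with the reported gene flow. The sketch does not attempt to reproduce the other quoted proportions, which may rest on a different total.

```python
def coefficient_of_differentiation(ht, hs):
    """Gst = (Ht - Hs) / Ht, the share of total diversity found among populations."""
    if ht <= 0:
        raise ValueError("total diversity must be positive")
    return (ht - hs) / ht


def gene_flow(gst):
    """Assumed estimate of gene flow: Nm = 0.5 * (1 - Gst) / Gst."""
    if gst <= 0:
        raise ValueError("Gst must be positive to estimate Nm")
    return 0.5 * (1.0 - gst) / gst


ht = 0.1784  # total genetic diversity reported in the text
hs = 0.1528  # within-population genetic diversity reported in the text

gst = coefficient_of_differentiation(ht, hs)
nm = gene_flow(gst)
print(round(gst, 4), round(nm, 4))
```

An Nm value above 1 is conventionally read as enough gene exchange to counteract differentiation by drift, which matches the interpretation given in the text.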

Genetic similarity analysis between C. oleifera varieties
Dice genetic similarity coefficients among the C. oleifera varieties were calculated using NTSYS 2.10e software (Table 5; Supplementary Tables S5-1 to S5-9). As shown in the tables, the genetic similarity coefficients among the 109 C. oleifera varieties ranged from 0.61 to 0.93, with an average of 0.74. Based on the results, the No. 59 (S13) and No. 60 (S12) samples showed a genetic similarity coefficient of 0.93, which indicates that these two varieties have a very close genetic relationship and may belong to the same variety. The genetic similarity coefficient of the No. 4 (GW 16) and No. 6 (GW 2) samples was 0.87, which indicates that these two varieties have a close genetic relationship and a small genetic difference. Meanwhile, the genetic similarity coefficient of the No. 14 (XL 53) and No. 15 (XL 210) samples was only 0.61, which suggests that they differ considerably and have a distant genetic relationship.
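For illustration, the Dice coefficient between two 0/1 band profiles is 2a/(n1 + n2), where a is the number of shared bands and n1 and n2 are the band counts of the two profiles. The sketch below ranks all pairs from most to least similar, which is how the closest and most distant pairs above could be located. The profiles and labels are hypothetical and do not correspond to the varieties in Table 5.

```python
from itertools import combinations


def dice(a, b):
    """Dice coefficient: 2 * shared bands / (bands in a + bands in b)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("Dice is undefined when neither profile has a band")
    return 2.0 * shared / total


def rank_pairs(profiles):
    """Return (similarity, name1, name2) tuples sorted from most to least similar."""
    scored = [
        (dice(profiles[p], profiles[q]), p, q)
        for p, q in combinations(sorted(profiles), 2)
    ]
    scored.sort(reverse=True)
    return scored


# Hypothetical band scores at 8 loci.
profiles = {
    "A": [1, 1, 1, 0, 1, 0, 1, 1],
    "B": [1, 1, 1, 0, 1, 0, 1, 0],
    "C": [0, 1, 0, 1, 0, 1, 1, 0],
}

ranked = rank_pairs(profiles)
closest, most_distant = ranked[0], ranked[-1]
print(closest, most_distant)
```

Unlike the simple matching coefficient, Dice ignores loci where both profiles lack a band, so the two coefficients generally give different values for the same data.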

Cluster analysis
Based on the Dice genetic similarity coefficients between each pair of varieties, the cluster dendrogram of genetic relationships (Fig. 6) was constructed using the UPGMA method, with the band scores at the 487 loci as the original data matrix. The dendrogram showed the classification of the 109 C. oleifera varieties and was consistent with the genetic similarity coefficients among varieties: the higher the genetic similarity coefficient, the closer the genetic relationship and the smaller the difference between varieties. As shown in the dendrogram, the similarity coefficients among the 109 samples ranged from 0.70 to 0.93, which indicates abundant genetic diversity. According to the ISSR cluster analysis of C. oleifera, the genetic relationship dendrogram was complicated. When the genetic similarity coefficient was 0.70, the 109 cultivars were divided into groups A and B, where the No. 15 sample (XL 210) in group B was clustered alone, which suggests that this variety differs genetically from the other varieties.
When the genetic similarity coefficient was 0.735, the C. oleifera germplasm could be divided into 11 class groups (Fig. 6). Class group I contained 79 samples and could be divided into 7 sub-groups with a genetic similarity coefficient of 0.735. Of these, sub-group I-1 included 14 varieties and could be divided into two groups. The second group clustered 6 varieties, including 4 of the QY series, and the first group clustered 8 varieties, among which the genetic similarity coefficient of No. 59 and No. 60 from the S series was the highest (0.93), which indicates that the two varieties have a very close genetic relationship and may belong to the same variety.
Sub-group I-2 included 27 samples, 7 of which were of the Gan series. Sub-group I-3 included 4 varieties, and the No. 4 and No. 6 samples from the GW series were clustered together, with a genetic similarity coefficient of 0.87, which suggests that they have a close genetic relationship and a small genetic difference. Sub-group I-4 included the XL 7, XL32, QY235, AH12, CL 3 and QY104 varieties. Sub-group I-5 included 24 samples, 8 of which were of the Gan series. Sub-group I-6 contained only the S8 and S6 varieties from the S series, and sub-group I-7 consisted only of the XLC25 and AH19 samples.
Class group II included five varieties. Class group III included four varieties, three of which were of the Gan series. Class group IV consisted of three samples, two of which were from the Gan series. Class groups V, VI and VII included 3 varieties each. Meanwhile, class groups VIII and IX contained only S3 and camellia, respectively. Class group X included 6 samples, 5 of which were of the XL series. Class group XI consisted only of the XL 210 variety.
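The grouping procedure described in this section can be sketched as average-linkage (UPGMA) clustering on a similarity matrix, stopped once no two clusters reach a chosen similarity threshold. This is a minimal pure-Python illustration rather than the NTSYS routine used in the study. The five labels and their pairwise similarities are hypothetical, chosen so that a cut at 0.70 leaves one variety alone while a cut at 0.735 yields more groups.

```python
def symmetric_similarity(names, pairs):
    """Expand {(a, b): s} into a symmetric dict-of-dicts similarity table."""
    table = {name: {name: 1.0} for name in names}
    for (a, b), s in pairs.items():
        table[a][b] = s
        table[b][a] = s
    return table


def upgma_groups(names, similarity, threshold):
    """Merge clusters by average similarity until no pair reaches `threshold`."""
    clusters = [[name] for name in names]

    def average_similarity(c1, c2):
        # UPGMA: unweighted mean over all member pairs of the two clusters.
        total = sum(similarity[a][b] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = average_similarity(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        if best[0] < threshold:
            break  # equivalent to cutting the dendrogram at `threshold`
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


names = ["V1", "V2", "V3", "V4", "V5"]
pairs = {
    ("V1", "V2"): 0.93, ("V1", "V3"): 0.80, ("V2", "V3"): 0.78,
    ("V1", "V4"): 0.70, ("V2", "V4"): 0.72, ("V3", "V4"): 0.74,
    ("V1", "V5"): 0.62, ("V2", "V5"): 0.60, ("V3", "V5"): 0.65,
    ("V4", "V5"): 0.61,
}
similarity = symmetric_similarity(names, pairs)

groups_at_070 = upgma_groups(names, similarity, threshold=0.70)
groups_at_0735 = upgma_groups(names, similarity, threshold=0.735)
print(groups_at_070)
print(groups_at_0735)
```

Raising the threshold can only split groups further, never merge them, which is why the number of class groups grows as the similarity coefficient increases.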

Discussion
ISSR marker and primer screening
ISSR plays a significant role in studies of the evolution and classification of fruit trees because it requires only simple equipment and offers high repeatability and abundant polymorphism (Zhang 2004). In this research, only 23 highly polymorphic primers were screened out from the 49 ISSR primers, a relatively low efficiency of 46.9%. There may be two reasons. First, only a small amount of EST data was included, and it was derived from a single source, which results in a high degree of redundancy among the designed primers themselves. Second, the C. oleifera varieties in China are closely related, and thus it is not easy to obtain polymorphic primers (Wu et al., 2012).

Analysis of genetic diversity
The percentage of polymorphic loci is an important index of the level of genetic variation among cultivars and an important parameter for measuring genetic diversity (Wu et al., 2008). Zhang (2011) analyzed the genetic diversity of forty-eight C. oleifera cultivars in five regions of Hubei Province. According to the results, the main cultivars of C. oleifera in Hubei were divided into two groups: the Hefeng population and the other populations.
In terms of the analysis of genetic relationships, Huang (2006) used the RAPD marker technique to analyze the genetic diversity of ninety C. oleifera cultivars with 18 primers and obtained a polymorphic band ratio of up to 95.11%. Peng (2012) used ISSR technology to identify 192 varieties of C. oleifera germplasm with 10 primers and obtained a percentage of polymorphism of 88.14%. Dai (2013b) used the ISSR molecular marker technique to amplify 32 C. oleifera cultivars from the Qinba Mountain area with 11 primers and obtained a polymorphic locus ratio of 60.28%. In this study, the ISSR molecular marker technique was used to amplify 109 varieties of C. oleifera with 23 primers, and the polymorphism ratio obtained was 68.86%, which indicates a large degree of genetic variation and abundant genetic diversity among the tested cultivars.
In this research, the total genetic diversity (Ht) of the six C. oleifera populations was 0.1784, which indicates that there was abundant genetic diversity among the cultivars of C. oleifera in different provinces and that the genetic variation among individuals was significant (Wang, 2011). This is presumably due to the rich germplasm resources, extensive distribution and highly complex genetic background of C. oleifera caused by long-term cross-pollination and other factors in Hubei.

Cluster analysis of genetic relationship of C. oleifera
In this study, the cluster analysis chart of the 109 C. oleifera germplasm resources based on ISSR molecular markers appears somewhat cluttered as a whole, which is mainly due to the large number of samples. Within single series of C. oleifera, genetic maps could be clearly established for the Gan, XL, CL and QY series, which could be used to identify the genetic relationships among them. In the Gan series, GW16 and GW2, which were clustered in sub-group I-3, showed the closest genetic relationship, with a genetic similarity coefficient of 0.87. GW5, GW11, GX46, GS84-8, GW24, GY6 and GW20 showed a close genetic relationship and were clustered in sub-group I-2, while GW1, Gan 68, Gan8, Gan6, Gan70, GX48, Gan 190 and GS83-4 were clustered in sub-group I-5. In the XL series, XLC26, XLC4, XL36, XLC6 and XLC8 showed a close genetic relationship and were clustered in class group X. In the S series, S1, S6, S8, S10, S11, S12, S13 and S17 were clustered in class group I, of which S13 and S12 in sub-group I-1 had the closest genetic relationship, with a genetic similarity coefficient of 0.93. Because the bands were interpreted subjectively and the amplification of some bands was not clear, the identification results may be biased, and more experiments are required to verify them.

Conclusions
In this research, the germplasm of 109 C. oleifera cultivars was analyzed using ISSR markers. The results showed that varieties with the same geographical origin or with a similar genetic background could be clustered into eleven class groups, but some varieties with different geographical origins and different genetic backgrounds were also clustered in the same class group, which reflects a complex genetic relationship. The possible reasons are: (1) C. oleifera has a long history of cultivation, during which genetic resources have been continuously exchanged among different C. oleifera producing areas. (2) The number of varieties per series was uneven. Among the samples collected in this study, there were 24 of the Gan series, 20 of the XL series, 10 each of the CL, S and AH series, 9 of the QY series and 8 of the EY series, while other series, such as HS, were represented by fewer samples or even a single sample, which led to irregular clustering results. (3) C. oleifera is characterized by cross-pollination, which results in a high level of genetic diversity among varieties.

Fig. 1. Genetic diversity of C. oleifera in different regions in China

Fig. 5. Genetic similarity cluster based on ISSR analysis in C. oleifera populations

Table 1. C. oleifera sampling places in different provinces of China

Table 2. The primer alignment and amplified results of ISSR analysis

Table 3. Genetic diversity of C. oleifera population (variance analysis). Notes: P, percentage of polymorphic bands; H, Nei's gene diversity; I, Shannon's information index; Na, observed number of alleles; Ne, effective number of alleles

Table 4. Nei's genetic identity and genetic distance among populations of C. oleifera

Table 5. Similarity coefficients of Hubei C. oleifera cultivars (only part of the data is shown; the remaining data are given in Tables S5-1 to S5-9)