Evaluation of Genetic Diversity by DNA Barcoding of Local Tomato Populations from North-Western Romania

Tomato is one of the most important crops worldwide. DNA barcoding is a molecular-based method that has been successfully used for species identification, but few studies have used this method to identify cultivated varieties. The aim of this study was to test the utility of DNA barcoding for the identification of five local salt-tolerant tomato varieties and two commercial varieties. To assess the genetic diversity of the tomato varieties, the non-coding plastid trnH-psbA intergenic spacer and three plastid coding regions (rbcL, rpoC1, rpoB) were used. Based on the sequence variation of the trnH-psbA barcode, three haplotypes were detected among the seven tomato varieties. A neighbor-joining tree was generated and separated the local tomato varieties from the commercial varieties into two distinct clusters. We found very low levels of variation in the chosen plastid markers, but additional markers could be tested in order to assess the utility of DNA barcodes in tomato variety identification.


Introduction
Tomato (Solanum lycopersicum L.) is an important vegetable from the Solanaceae family, cultivated worldwide for its good flavor and as a rich source of nutrients (Sun et al., 2014). It is also a well-known model species for studying fruit development and metabolite accumulation. Obtaining tomato crops with desired agronomic traits requires a good understanding and management of the diversity of tomato genetic resources (Bauchet and Causse, 2012). Tomato landraces are highly heterogeneous, as they were systematically selected for their performance in adverse agricultural environments (Ciulca et al., 2015). Different molecular methods have been used to evaluate genetic variation and phylogenetic relationships among tomato varieties: RAPD (Carelli et al., 2006), RFLP (Asamizu and Ezura, 2009), AFLP and SSR (García-Martínez et al., 2006; Benor et al., 2008). In 2014, Sun et al. used the 5S rRNA region to discriminate tomato varieties, and sequence analysis of this region suggested that a large number of variable nucleotide sites exist among tomato varieties. SNP methodology reveals patterns of genetic variation between cultivated landraces and varieties of tomato (Sim et al., 2012; Corrado et al., 2014).
DNA barcoding is a method for taxonomic identification that uses a standard short genomic region with sufficient sequence variation to distinguish among species. A DNA sequence from such a standardized gene region can be obtained from a small amount of tissue taken from an unidentified organism and then compared to a library of reference sequences from known species. If the sequence from the unknown organism matches one of the reference sequences, the organism is recognized, thus providing a rapid identification. An ideal DNA barcode should be present in all groups of land plants, be short (700-800 bp), show enough sequence variation to discriminate among species, and be easy to amplify and sequence with a single primer pair (Kress and Erickson, 2007). Different regions from the plastid genome, including the trnH-psbA intergenic spacer, rbcL, rpoC1 and rpoB, have been proposed and tested for DNA barcoding of land plants, with different levels of species identification success depending on the taxonomic group studied (Kress and Erickson, 2007; Singh et al., 2012). The purpose of this study was to test the utility of DNA barcoding for the identification of closely related tomato varieties. In a conservation project, tomato seeds were collected from local farmers of Bihor County (North-Western Romania). The seeds were chosen from heirloom tomatoes (varieties that have been passed down through several generations of a family) because of their high productivity and, moreover, because these tomato varieties are tolerant to salinity. In this study, we used the non-coding plastid trnH-psbA intergenic spacer region and three plastid coding regions: rbcL, rpoC1 and rpoB.

Plant materials
In this study, plant samples were collected from tomatoes grown in the "Vasile Fati" Botanical Garden, Jibou. The seeds were obtained in 2012 from the gardens of local farmers in three villages, all located in Bihor County (Table 1). Two varieties of commercial tomatoes, Solanum lycopersicum 'Marmande' and Solanum lycopersicum 'Kecskemeti Jubileum', were also included in this study. Four tomato varieties, cherry tomatoes and the commercial varieties were grown in pots and the remaining three varieties were cultivated in the field. Each tomato variety was represented by a single individual.

DNA extraction, PCR amplification and sequencing
Total genomic DNA was isolated from 80 mg of fresh young leaves from each individual using the ISOLATE Plant DNA Mini Kit (Bioline USA Inc.), following a modified protocol as described in Căprar et al. (2014). The concentration and purity of each DNA sample were measured with a NanoDrop 2000 UV-VIS Spectrophotometer (Thermo Fisher Scientific Inc., United States).
The four plant DNA barcodes, rbcL, trnH-psbA, rpoC1 and rpoB, were amplified in a 25 µL reaction volume using MyTaq™ DNA Polymerase (Bioline Reagents Ltd, UK), 0.5 µL primers and 200 ng DNA template. PCR amplification was performed on a Bioer XP Thermal Cycler (Bioer Technology Co., Ltd.). The primers for PCR and sequencing (Kress and Erickson, 2007) and the PCR cycling conditions used in this study are provided in Table 2.
To verify the success of PCR amplification, 5 µL of the PCR product were subjected to 2% agarose gel electrophoresis in TAE buffer and visualized under a UV transilluminator with a G:BOX Chemi XR5 (Syngene, UK). The remaining PCR product was purified using the FavorPrep™ Gel/PCR Purification Kit (Favorgen Biotech Corp.). Purified PCR products were sent to Macrogen Europe (Amsterdam, Netherlands) and sequenced in both directions with the same primers used for PCR.

Data analysis
Sequences for each region were assembled and edited using BioEdit v7.2.5 (Hall, 1999). The edited sequences were then aligned with Clustal W in MEGA 6 (Tamura et al., 2013). The pairwise genetic distance for the trnH-psbA marker was calculated in MEGA 6 with the Kimura 2-parameter (K2-P) model. A neighbor-joining (NJ) tree was constructed from the multiple sequence alignment of the trnH-psbA intergenic spacer in MEGA 6 with the p-distance model. Bootstrap values were calculated over 1000 replications (Felsenstein, 1985). The barcode sequences were queried against the GenBank database (NCBI) using the Nucleotide BLAST algorithm.

Sequence characteristics of the barcodes
The four barcodes, rbcL, trnH-psbA, rpoC1 and rpoB, showed high success rates for PCR amplification and sequencing using a single primer pair. The sequence characteristics of the four regions are presented in Table 3. Of the four barcodes, only the trnH-psbA sequences varied among the seven tomato varieties: they had three variable sites, all found in the commercial varieties ('Kecskemeti Jubileum' and 'Marmande'). The rbcL, rpoC1 and rpoB sequences did not show any variable sites, so these sequences were 100% conserved within the species. The genetic distances for the trnH-psbA sequence ranged from 0 to 0.004.
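The two summary statistics reported above, the number of variable sites and the Kimura 2-parameter distance, can be illustrated with a minimal Python sketch. This is not the MEGA 6 implementation used in the study; the function names and the toy alignment are hypothetical, and the sketch assumes pre-aligned sequences of equal length, with any position containing a gap or ambiguous base skipped. The distance follows the standard K2-P formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = set("ACGT")


def variable_sites(alignment):
    """Return 0-based column indices where the aligned sequences differ.

    Gaps and ambiguous bases are ignored when comparing a column.
    """
    sites = []
    for i, column in enumerate(zip(*alignment)):
        observed = {base.upper() for base in column} & BASES
        if len(observed) > 1:
            sites.append(i)
    return sites


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if term <= 0:
        raise ValueError("sequences too divergent for the K2-P model")
    return -0.5 * math.log(term)


# Hypothetical toy alignment (NOT the study's trnH-psbA data).
toy = ["ACGTACGTAC", "GCGTACGTAC", "ACGTACGTAC"]
print(variable_sites(toy))
print(k2p_distance(toy[0], toy[1]))
```

With so few substitutions, the K2-P correction barely changes the raw proportion of differing sites, which is why distances as small as those reported here are close to simple p-distances.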

BLAST Search
Each barcode sequence was compared against the NCBI database through a BLAST search. All sequences of the rbcL and rpoB loci identified the seven tomato varieties as Solanum pimpinellifolium with 99 or 100% identity. Sequences of rpoC1 identified the seven tomato varieties as Solanum tuberosum with 100% identity. The lack of sequence variation did not allow the samples to be separated into different tomato varieties, and in the BLAST search these loci gave an identification only at the genus level (Solanum). Only the trnH-psbA sequences were correctly identified at the species level, as Solanum lycopersicum with 99% identity.
In a study by The Tomato Genome Consortium (2012), the genome of cultivated tomato was compared with that of its closest wild relative, Solanum pimpinellifolium, and with the potato genome (Solanum tuberosum). The results revealed that the two tomato genomes show only 0.6% nucleotide divergence and evidence of recent admixture, but more than 8% divergence from potato. This close relationship may explain why the conserved plastid coding loci matched other Solanum species in the BLAST search.
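The way a conserved locus can yield only a genus-level identification may be sketched as follows. This is a simplified illustration and not the BLAST algorithm: it assumes pre-aligned sequences of equal length, scores each reference by plain percent identity, and reports the taxonomic level at which the best-scoring hits agree. The function names and the toy reference library are hypothetical and do not represent GenBank records.

```python
def percent_identity(query, reference):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(query) != len(reference) or not query:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(query.upper(), reference.upper()) if a == b)
    return 100.0 * matches / len(query)


def best_hits(query, library):
    """Return the top identity score and all reference names that reach it."""
    scores = {name: percent_identity(query, seq) for name, seq in library.items()}
    top = max(scores.values())
    return top, sorted(name for name, score in scores.items() if score == top)


def identification_level(hit_names):
    """Report 'species' if the top hits name one species, 'genus' if one genus."""
    species = set(hit_names)
    if len(species) == 1:
        return "species"
    genera = {name.split()[0] for name in species}
    return "genus" if len(genera) == 1 else "unresolved"


# Hypothetical toy library: two species share an identical sequence at this locus.
toy_library = {
    "Solanum lycopersicum": "ACGTACGTAA",
    "Solanum pimpinellifolium": "ACGTACGTAA",
    "Solanum tuberosum": "ACGTACGTTA",
}
score, hits = best_hits("ACGTACGTAA", toy_library)
print(score, hits, identification_level(hits))
```

When two or more species are identical at a locus, the top hits tie and the sequence alone cannot resolve the species, regardless of how high the identity score is.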

Phylogenetic analysis
A neighbor-joining tree was constructed based on the sequence variation of the trnH-psbA region, and the cultivars were grouped into two distinct clusters (Fig. 1). The first cluster grouped all the local tomato populations, while the second cluster grouped the two commercial varieties. The tree topology is supported by a good bootstrap value. No differences between the five local tomato populations were found within the trnH-psbA barcode region. Although the five local varieties have morphologically different fruits, they shared the same haplotype for the trnH-psbA marker, which is considered one of the most variable non-coding regions of the plastid genome (Chase et al., 2007).
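Grouping samples into haplotypes amounts to collapsing identical aligned sequences, which can be sketched in a few lines of Python. The sample names and sequences below are hypothetical placeholders chosen only to mirror the pattern described here (five local samples sharing one sequence, two commercial samples each distinct); they are not the study's data, and the function name is an assumption for illustration.

```python
from collections import defaultdict


def collapse_haplotypes(sequences):
    """Group sample names by identical aligned sequence.

    Returns a list of haplotypes, each a sorted list of sample names,
    in order of first appearance.
    """
    groups = defaultdict(list)
    for name, seq in sequences.items():
        groups[seq.upper()].append(name)
    return [sorted(names) for names in groups.values()]


# Hypothetical placeholder sequences (NOT the study's trnH-psbA data).
toy_samples = {
    "local_1": "ACGTACGTAC",
    "local_2": "ACGTACGTAC",
    "local_3": "ACGTACGTAC",
    "local_4": "ACGTACGTAC",
    "local_5": "ACGTACGTAC",
    "commercial_1": "ACGTATGTAC",
    "commercial_2": "ACGTATGTGC",
}
haplotypes = collapse_haplotypes(toy_samples)
for number, members in enumerate(haplotypes, start=1):
    print(f"Haplotype {number}: {members}")
```

Exact string matching is sufficient only after the sequences have been aligned and trimmed to the same region; otherwise, differences in read length would be counted as distinct haplotypes.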
Studies of genetic diversity based on molecular markers in the section Lycopersicon revealed that wild species have a high level of genetic diversity compared to cultivated tomato (Stevens and Robbins, 2007). Domestication of tomatoes by selecting preferred traits has led to low genetic diversity among cultivated tomatoes. A high similarity coefficient was found among 29 cultivated tomatoes using SSR markers (Zhou et al., 2015). In 2011, Sun et al. used three DNA markers to distinguish 26 tomato varieties and found that the nrDNA ITS region and 5S rDNA showed high nucleotide variation, whereas the cpDNA rbcL region was not suitable for tomato variety identification. Enan and Ahmed (2014) evaluated the potential of two DNA barcode markers, matK and rpoC1, for the authentication of 11 date cultivars, and found rpoC1 to be less informative than matK. A study by Jarret (2008) showed that trnH-psbA could not discriminate among the members of the Capsicum annuum complex, although it separated this complex from another Capsicum species. In a study that assessed the genetic diversity of seven taro cultivars (Colocasia esculenta), the trnH-psbA marker showed genetic variability among them and grouped the cultivars according to their geographical origin, the Midwest and Southeast of Brazil (Nunes et al., 2014).

Table 1. List of tomato varieties, location, fruit shape and color

Table 2. Primers, their sequences and PCR conditions

Table 3. The characteristics of each single barcode

Table 4. BLAST search results