Genomic Tools for the Enhancement of Vegetable Crops: A Case in Eggplant

Dramatic advances in genomics during the last decades have led to a revolution in vegetable crop breeding. Some vegetables, like tomato, have served as model crops in the application of genomic tools to plant breeding, but other important crops, like eggplant (Solanum melongena), have lagged behind. The advent of next generation sequencing (NGS) technologies and the continuous decrease in sequencing costs have allowed the development of genomic tools that greatly benefit non-model plants such as eggplant. In this review we present the currently available genomic resources in eggplant and discuss their interest for breeding. The first draft of the eggplant genome sequence and the upcoming improved assembly, as well as the transcriptomes and RNA-based studies, represent important genomic tools. The transcriptomes of cultivated eggplant and several of its wild relatives are also available and have provided relevant information for the development of markers and the understanding of biological processes in eggplant. In addition, a historical overview of the eggplant genetic mapping studies, performed with different types of markers and experimental populations, provides a picture of the increase over time in the precision and resolution of the identification of candidate genes and QTLs for a wide range of stresses and for morpho-agronomic and domestication traits. Finally, we discuss how the development of new genetic and genomic tools in eggplant can pave the way for increasing the efficiency of eggplant breeding for developing improved varieties able to cope with the old and new challenges in horticultural production.


Introduction
Eggplant (Solanum melongena L., 2n = 2x = 24), also known as common eggplant or brinjal eggplant, belongs to the Leptostemonum clade within the Solanum genus (the "spiny" solanums) and is the second most important solanaceous fruit crop in total production after tomato (S. lycopersicum L.) (Knapp et al., 2013). Eggplant has undergone a constant increase in yield (2.7-fold) and total production (8.7-fold) in the last fifty years, although the largest increases have been recorded in the last decade (FAOSTAT, 2017). Common eggplant is widespread and consumed worldwide, even though in some areas, mainly in the African continent, two other cultivated eggplants, namely scarlet (S. aethiopicum L.) and gboma (S. macrocarpon L.) eggplants, are locally important (Lester and Daunay, 2003; Plazas et al., 2014).
The breeding and genomics revolution of the last fifteen years has resulted in the development of large amounts of information of interest for breeding, allowing the enhancement of many vegetable crops. Even though eggplant is the sixth most important vegetable in production in the world, for many years it has lagged behind in the development and use of genomic tools compared to other important Solanaceae crops like tomato, potato or pepper (Barchi et al., 2011; Hurtado et al., 2013). Tomato, being the closest model crop species with many enhanced genomic tools, has been used as a reference for eggplant in many studies (Doganlar et al., 2002; Wu et al., 2009; Barchi et al., 2012). In this respect, the genomic achievements in tomato have allowed fishing in its gene pool and taking advantage of the extraordinary genetic wealth of its wild relatives (Víquez-Zamora et al., 2013; Aflitos et al., 2014). In this way, tomato breeders have been able to find genetic solutions to some biotic and abiotic threats and to achieve improvements to cope with unfavorable agricultural environments in a climate change scenario (Bolger et al., 2014; Thapa et al., 2015). On the contrary, up to now in common eggplant, as well as in scarlet and gboma eggplant, the use of wild relatives for breeding purposes has been negligible and often purely academic. The availability of a large array of molecular markers obtained by genomic techniques allows a very efficient marker-assisted selection (MAS), by selecting the genes or quantitative trait loci (QTLs) of interest, facilitating the removal of undesirable traits of wild relatives, and saving time and resources compared to the conventional breeding approach (Morrell et al., 2012; Brozynska et al., 2016).
Even though the gap between eggplant and the crops with more genomic resources is still wide, a clear turnaround in this trend has occurred in the last few years (Hirakawa et al., 2014; Portis et al., 2015; Kouassi et al., 2016; Plazas et al., 2016; Salgon et al., 2017).
In this paper, we review the available genomic resources in the eggplant gene pool, including experimental populations that are under development, as well as the future perspectives and directions to exploit the full potential of these tools for basic and applied research in this crop.

Genome assemblies
Sequencing the genome of a crop is essential to dramatically accelerate crop improvement (Davey et al., 2012). A high-quality reference genome gives access to the complete set of genes, to the different layers of regulatory elements and to the basic genomic architecture, allowing precise structural and functional comparisons between species (Feuillet et al., 2011).
As of June 2017 there is only one genome draft publicly available (SME_r2.5.1) in the National Center for Biotechnology Information (NCBI) (SRA accession: DRR014074 and DRR014075) (Hirakawa et al., 2014). The accession used for the whole-genome shotgun sequencing was 'Nakate-Shinkuro', an important traditional Asian-type cultivar that has been used in the development of some modern commercial cultivars. For sequencing (∼144X), a combined genomic libraries approach was employed, consisting of paired-end (insert size of 200-300 bp) and mate-pair (insert size of 2 Kb) Illumina libraries (Table 1). The transcriptome of two other accessions, 'AE-P03' and 'LS1934', was also sequenced to improve the de novo assembly. The final assembly consisted of 33,873 scaffolds, covering about 74% (833.1 Mb) of the estimated length of the genome (∼1.1Gb) (Arumuganathan et al., 1991) with an N50 parameter of 64.5 Kb. The total number of genes predicted was 42,035, of which 4,018 were described as being exclusive to eggplant, quite a large number if compared with more complete genomes like the latest version of tomato Heinz 1706 (SL3.0 version, 34,879 genes in ITAG3.10, https://solgenomics.net/organism/Solanum_lycopersicum/genome) or the latest version of the Arabidopsis thaliana genome (Araport11 version, 27,655 protein-coding genes) (Cheng et al., 2016).
A large number of different simple sequence repeat (SSR) motifs and repeats (83,401) was identified by Hirakawa et al. (2014), as well as 4,536 single nucleotide polymorphisms (SNPs) from a microarray among the accessions used for the study. This first genomic resource, although far from comprehensive, paved the way to understanding the genomic architecture of eggplant and allowed comparisons with other important Solanaceae crops, like potato, tomato and pepper (Potato Genome Sequencing Consortium, 2011; The Tomato Genome Consortium, 2012; Qin et al., 2014), whose genomes are more complete and better annotated, and also with model species like Arabidopsis thaliana (Cheng et al., 2016).
The development of another eggplant genome by the Italian Eggplant Genome Sequencing Consortium has been presented in several scientific meetings (Barchi et al., 2016), but at the time of writing this work has not yet been published. According to the authors, a high-quality reference genome has been de novo assembled through Illumina sequencing (∼155X), using genomic libraries of different sizes (from 270 bp to 10 Kb), of the inbred eggplant line "67/3", which was used as the male parent of a mapping population of 157 F6 recombinant inbred lines (RILs) (Table 1). In addition, to improve the assembly, its transcriptome was also sequenced (Barchi et al., 2016). Moreover, the hybrid assembly was combined with the sequencing of the female parent "305E40" (35X) and the rest (1X) of the RIL mapping population and with a high-resolution restriction optical map. The final result consisted of 12 pseudomolecules, spanning ∼1.2Gb with an L50 of >3Mb. The functional annotation resulted in ∼40K protein-coding genes, confirming the estimation of Hirakawa et al. (2014).
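The contiguity statistics quoted for these assemblies can be made concrete with a short sketch. It assumes the usual definitions, in which N50 is the length of the scaffold at which the running sum of scaffold lengths (sorted from longest to shortest) first reaches half of the assembly size, and L50 is the number of scaffolds needed to get there. The function name and the toy scaffold lengths are hypothetical and are not taken from either eggplant assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50: length of the scaffold at which the cumulative sum of the
         length-sorted scaffolds first reaches half of the total.
    L50: number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must contain at least one scaffold")


# Hypothetical toy assembly (lengths in kb), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_scaffolds)
print(f"N50 = {n50} kb, L50 = {l50} scaffolds")
```

Under these definitions N50 is a length and L50 is a count, so a value reported in Mb, such as the ">3Mb" figure above, is a length-type statistic.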

Transcriptomes and RNA-based studies
RNA sequencing is another essential genomic resource (Ozsolak and Milos, 2001). Probably due to the reduced economic and bioinformatic efforts compared to genome assembly and the variety of study approaches and aims, RNA-based studies are in general more abundant. In fact, in RNA-based studies the results depend on the genetic expression at a certain stage of development in a specific plant tissue (Waterhouse and Helliwell, 2003).
Regarding common eggplant, up to now, few studies using RNA sequencing have been carried out. One of them was the de novo assembly of the whole transcriptome from root, stem and young leaf samples (SRA accession: SRR1104129) (Yang et al., 2014). From paired-end 2x100 bp libraries, these authors retrieved 15 M raw reads, which were assembled after a filtering process into 44,672 transcripts and 34,174 unigenes, a similar number to the protein-coding genes from the genome annotation (Hirakawa et al., 2014). In addition to the structural and functional annotation, a comparison was performed using a set of 4,900 orthologs with 11 other plant species, including model plants and Solanaceae crops like tomato and potato, which allowed the time of divergence between them to be estimated. Furthermore, using the Plant Resistance Gene database (http://prgdb.crg.eu/wiki/Main_Page) a set of 621 resistance genes was identified. In the same study (Yang et al., 2014), the transcriptome of the wild eggplant relative S. torvum Sw., a species belonging to the tertiary eggplant gene pool (Syfert et al., 2016), was also assembled (SRA accession: SRR1104128). Solanum torvum, also known as turkey berry, is of great interest for eggplant breeding since it is resistant to a wide range of soil-borne pests and diseases such as root-knot nematodes, Ralstonia solanacearum, Verticillium dahliae, and Fusarium oxysporum f. sp. melongenae (Gousset et al., 2005; Yamaguchi et al., 2010). For S. torvum, the assembly statistics and annotation were slightly different from those of common eggplant, the most relevant difference being the number of unigenes (38,185 for S. torvum versus 34,174 for common eggplant) (Table 1).
Another de novo transcriptome assembly was released in 2016 in order to identify putative allergens in eggplant fruit (Ramesh et al., 2016;SRA accession: SRR1291243). In this study, total RNA was extracted from fruit peel and flesh, a different plant material compared to Yang et al. (2014) whole transcriptome, and a total of 48 putative allergens and 526 B-cell linear epitopes were identified from the 149,224 transcripts assembled (Table 1). Of these 40,752 showed significant similarity with predicted proteins in tomato and potato.
In addition to common eggplant, the whole transcriptome of some eggplant relatives has been released. One of them was scarlet eggplant (S. aethiopicum accession BBS135) (Gramazio et al., 2016a; SRA accession: SRR2229192), the second most important cultivated eggplant, which is common in sub-Saharan Africa, as well as, in some areas of Brazil, Caribbean and south of Italy (Sunseri et al., 2010). In the same study, a de novo whole transcriptome of Solanum incanum L. accession MM577 (SRA accession: SRR2289250), a wild relative of common eggplant which is considered a powerful source of phenolics and tolerant to some abiotic stresses such as drought (Knapp et al. 2013), was also assembled.
Due to the high number of reads obtained (more than 100 M), the number of assembled unigenes in Gramazio et al. (2016a) was relatively high (87,084 for S. aethiopicum and 83,905 for S. incanum), probably due to the high representation of 3′ or 5′ untranslated regions (UTRs) and intron sequences from non-mature mRNAs as a consequence of the deep coverage (Table 1). On the other hand, the number of unigenes annotated in protein databases was similar to that of other Solanum crops (34,231 and 30,630 for S. aethiopicum and S. incanum, respectively). In this study, molecular marker discovery was performed, identifying a total of 1,248 microsatellites for scarlet eggplant and 976 for S. incanum. In addition, intraspecific and interspecific single nucleotide variants (SNVs), comprising SNPs and insertions/deletions (INDELs), were identified not only in S. aethiopicum and S. incanum but also in S. melongena and S. torvum, using for comparison the raw reads generated in the Yang et al. (2014) study.
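As an illustration of what microsatellite discovery in assembled sequences involves, the following minimal sketch scans a sequence for perfect di- and trinucleotide repeats with a regular expression. It is not the pipeline used in the studies cited above: the function name, the minimum of five repeat units, and the toy sequence are assumptions made for the example, and dedicated SSR tools also handle other motif lengths and imperfect or compound repeats.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Yield (start, motif, repeat_count) for perfect di-/trinucleotide SSRs.

    min_repeats is an illustrative threshold, not a value from the source.
    """
    # A 2-3 bp motif followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        repeat_count = len(match.group(0)) // len(motif)
        yield match.start(), motif, repeat_count


# Hypothetical toy sequence: (AT)6 and (AAG)5 embedded in flanking bases.
toy_seq = "GGC" + "AT" * 6 + "CCGT" + "AAG" * 5 + "TT"
for start, motif, count in find_ssrs(toy_seq):
    print(f"position {start}: ({motif}){count}")
```

Flanking sequence around each hit would then be used for primer design, which this sketch does not attempt.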
The most recent transcriptome released in the eggplant genepool was from S. aculeatissimum Jacq., a wild relative of eggplant resistant to verticillium wilt (Zhou et al., 2016). Two different libraries were constructed, for the control and for the infected roots with Verticillium dahliae, each of them giving 28 M of raw reads. A total of 64,413 and 71,291 unigenes were obtained for the control and infected roots, respectively, which were functionally annotated, resulting in 17,645 of them differentially expressed (11,696 upregulated and 5,949 downregulated).
Apart from the whole transcriptomes, other RNA-based studies were performed in common eggplant. Yang et al. (2013) identified miRNAs from eggplant involved in the process of infection by V. dahliae through deep sequencing of two small RNA libraries, using control and infected seedlings. Of the 5,940 miRNAs identified, 220 belonged to Solanaceae species and two new miRNAs were eggplant specific. The authors identified a total of 33 miRNAs differentially expressed between the two libraries (28 downregulated and 5 upregulated), which were strongly involved in the V. dahliae infectious process.
The most recent RNA-based study (SRA: SRR3479276, SRA: SRR3479277), which is still unpublished and only available in the NCBI database (https://www.ncbi.nlm.nih.gov/sra/SRR3479277/), identified miRNAs involved in heterostyly, comparing short-morph and long-morph eggplant pistils, using a high-throughput small RNA microarray and degradome sequencing. Of the 686 miRNAs identified, 10 were differentially expressed and were determined to be pistil development-related miRNAs.

Mapping studies, experimental populations, and genotyping methods
Gene mapping establishes a connection between a trait under study and one or more chromosomal regions (Sevon et al., 2005). Although genetic maps have been constructed since the first decade of the 20th century (Brown, 2006), the first genetic map in eggplant was released only in 2001 (Nunome et al., 2001). This intraspecific map was built using 181 dominant markers (93 amplified fragment length polymorphisms (AFLPs) and 88 random amplified polymorphic DNAs (RAPDs)) and a population of 168 F2 progenies, resulting in 21 linkage groups (LGs) spanning 779.2 cM (Table 2). The aim of this study was to identify genetic regions involved in fruit shape and color development, which were associated with regions in LG2 and LG7, respectively. The low frequency of DNA polymorphism in eggplant (Doganlar et al., 2002a; Wu et al., 2009) and the tendency of some markers, like AFLPs and RAPDs, to cluster (Alonso-Blanco et al., 1998; Nilsson et al., 2007) generated a high number of LGs, which did not correspond to the basic chromosome number in eggplant.
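Map lengths in this section are given in centimorgans (cM), which are derived from observed recombination fractions through a mapping function. The sketch below shows the two standard ones, Haldane and Kosambi, as a reminder of how a recombination fraction r between two markers translates into map distance. The source does not state which function each eggplant map used, so the choice here is purely illustrative and the function names are hypothetical.

```python
import math


def haldane_cm(r):
    """Haldane map distance (cM) for recombination fraction r; assumes no interference."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance (cM) for recombination fraction r; allows for interference."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# For small r both functions stay close to 100 * r and diverge as r grows.
for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f}: Haldane {haldane_cm(r):.2f} cM, Kosambi {kosambi_cm(r):.2f} cM")
```

Because the two functions diverge for loosely linked markers, total map lengths computed from sparse maps are not strictly comparable unless the same function was used.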
One year later (2002), an interspecific genetic map was developed, using S. melongena and the wild relative S. linnaeanum Hepper & P.-M.L.Jaeger as parents, in order to increase the generally low DNA polymorphism observed in mapping populations derived from an intraspecific cross (Doganlar et al., 2002a). The aim of this study was to compare the tomato and eggplant maps in order to evaluate synteny and identify rearrangements between the two Solanum species that occurred during the domestication process. To achieve that objective, single-copy tomato cDNA, genomic tomato DNA, and tomato conserved orthologous set (COS) restriction fragment length polymorphism (RFLP) markers were assessed, which had previously been mapped in a tomato map and used to establish synteny between the tomato and potato genomes (Tanksley et al., 1992; Fulton et al., 2002).
The map, which spanned 1,480 cM along 12 LG (Table  2), confirmed the high collinearity between the tomato and eggplant and identified 25 rearrangements, as well as 125 QTLs related to 22 domestication traits (fruit size, shape, and color and plant prickliness) (Doganlar et al., 2002b) and 18 morphological traits related to leaf, flower, and fruit size, shape, appearance, and development (Frary et al., 2003).
Another interspecific genetic map using the eggplant wild relative S. sodomeum L. [=S. linnaeanum], was developed by Sunseri et al. (2003). This map, that was built to achieve markers linked to Verticillium tolerance using 48 F2 progenies derived from an interspecific cross between a susceptible and tolerant parental, consisted of 273 markers (156 AFLPs and 117 RAPDs) distributed along 12 LG and 736 cM with a small average distance between them (2.7 cM, Table 2).
An improved version of the Nunome et al. (2001) genetic map was released in 2003 (Nunome et al., 2003a), where seven SSR markers were added through screening an eggplant genomic library for dinucleotide motifs, in order to merge and reduce LGs. In fact, the number of LGs decreased from 21 to 17, while the map length and the average marker density remained almost the same (Table 2). At the same time, an eggplant genomic library was developed for screening trinucleotide motifs (Nunome et al., 2003b). SSRs, also known as microsatellites, are highly polymorphic within and across species, genomic SSRs (gSSRs) more so than genic ones (EST-SSRs), and generally display a higher polymorphic information content (PIC) than other molecular markers (Kalia et al., 2011). In addition, microsatellites are codominant, robust, highly reproducible, abundant and quite well spread along the genome (Varshney et al., 2005). Nevertheless, before the advent of next generation sequencing (NGS) their identification through the development of genomic libraries required a high degree of expertise and a significant investment of time and resources (Fernandez-Silva et al., 2013). Nowadays, the identification of thousands of gSSRs and EST-SSRs by DNA and RNA sequencing has become cost-effective and effortless by virtue of the overwhelming advancements in sequencing platforms (Xiao et al., 2013; Goodwin et al., 2016). Other genomic libraries to discover microsatellites in eggplant were developed by Stàgel et al. (2008), Vilanova et al. (2012) and Nunome et al. (2009), the latter in order to improve the previous versions of their genetic map (Nunome et al., 2001, 2003) by adding 245 SSRs, thus reducing the number of LGs to 14 and increasing the total length to 959 cM (Table 2).
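The polymorphic information content mentioned above can be illustrated with the commonly used formulation of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i are the allele frequencies at a locus. The sketch below is a minimal implementation under that assumption; the function name and the example allele frequencies are hypothetical and do not come from any eggplant data set.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content of one locus from its allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term summed over all unordered pairs of distinct alleles.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical loci: a balanced biallelic marker and a four-allele marker.
print(f"2 alleles at 0.5 each:  PIC = {pic([0.5, 0.5]):.3f}")
print(f"4 alleles at 0.25 each: PIC = {pic([0.25] * 4):.3f}")
```

A biallelic locus cannot exceed a PIC of 0.375, whereas a multiallelic locus can, which is one way to see why a single SSR is usually more informative than a single biallelic marker.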
An enhanced version of the Doganlar et al. (2002) genetic map was obtained by adding 110 COSII and 5 tomato-derived markers and performing a detailed synteny analysis with tomato by inferring the position of an additional 522 COSII markers (Wu et al., 2009) (Table 2). The high number of common markers allowed the time of divergence to be estimated at 12 million years. From a genomic structural standpoint, a minimum of 24 inversions and 5 chromosomal translocations occurred between the two species, and 37 conserved syntenic segments (CSSs), in which the order of genes/markers has been well preserved, were detected.
In order to map the resistance to F. oxysporum (gene Rfo-sa1), Barchi et al. (2010) developed an intraspecific genetic map using an F2 population of 141 individuals derived from 305E40, a double haploid line obtained through anther culture of a backcross material (BC7) using S. aethiopicum as a donor parent carrying the Rfo-sa1 gene for resistance to Fusarium, and the line 67/3, derived from an intraspecific cross between an Italian and a Chinese cultivar and susceptible to F. oxysporum. The map spanned 718.7 cM across 12 LGs with an average marker density of 3.0 cM and was composed of 212 AFLPs, 22 SSRs developed by Stàgel et al. (2008), 1 RFLP and three Rfo-sa1 cleaved amplified polymorphic sequence (CAPS) markers which cosegregated in LG1 (Table 2).
The same F2 population was also used to identify QTLs related to anthocyanin content (Barchi et al., 2012), as well as 105 QTLs associated with twenty yield, fruit, and morphological traits and 29 QTLs associated with seventeen traits (fruit qualitative and health-related compounds), this time employing a large set of SNPs identified by restriction-site-associated DNA (RAD) tag sequencing (Barchi et al., 2011), which brought the total map length to 1,390 cM, quite similar to that of the Wu et al. (2009) and Doganlar et al. (2002) maps (Table 2).
In fact, when markers that tend to cluster, like AFLPs, are replaced with markers that are more dispersed, like SNPs, the total length of genetic maps usually increases (Rafalski, 2002). SNPs, like SSRs, are codominant, robust and easy to identify through sequencing on NGS platforms (Davey et al., 2011; Scheben et al., 2016), and in addition have the advantage that they are more abundant, ubiquitous and easy to automate than SSRs (Thomson et al., 2014; Kim et al., 2016), although they are less informative (Filippi et al., 2015; Gonzaga, 2015). SSRs are generally considered to be dispersed markers, even though in some species they tend to concentrate more frequently in heterochromatic regions and it is unlikely that all the genomic regions can be covered by assessing SSRs alone (Hong et al., 2007; Shirasawa et al., 2010). On the other hand, it has been reported that the validation of SNPs by high-throughput SNP genotyping, like genotyping-by-sequencing (GBS), RAD tag sequencing or similar, can be 100-fold faster and 75% less expensive than SSR detection through agarose or polyacrylamide gels or capillary sequencing (Jones et al., 2007; Yan et al., 2010). For all these reasons SNP markers have quickly replaced SSRs in the last few years. Fukuoka et al. (2012), in an effort to represent genomic regions overlooked in the previous maps, mapped a considerable set of SNPs identified from a set of 4,754 orthologous genes in Solanum (SOL) developed from 16K eggplant, 47K tomato, and 57K potato unigenes. An integrated intraspecific map was built from two F2 populations, LWF2 and ALF2, and 952 markers (639 SNPs and 313 SSRs) along 12 LGs, resulting in 1,285 cM and an average marker density of 1.4 cM, covering 1.5 times the genomic region represented in Nunome et al. (2009) (Table 2). The same set of SNPs developed by Fukuoka et al. (2012) and the SSRs developed by Nunome et al. (2009) were used to build two intraspecific maps from two F2 populations, ALF2 and NAF2, to identify QTLs involved in parthenocarpy.
The two main QTLs detected (Cop3.1 and Cop8.1) were identified in both maps, which presented different LGs (12 versus 15), length (1,414 versus 1,153), and markers mapped (132 SNPs and 118 SSRs versus 125 SSRs and 49 SNPs), although shared one parent (parthenocarpic line AE-P03) ( Table 2).
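The average marker density figures quoted for these maps can be approximated by dividing total map length by the number of mapped markers. The sketch below applies that simple ratio to two maps whose length and marker counts are both given in the text above; it is an assumption that the original authors computed the value this way (some authors divide by the number of marker intervals instead), so small discrepancies with published figures are possible. The function and dictionary names are hypothetical.

```python
def average_spacing_cm(map_length_cm, n_markers):
    """Average distance between markers, taken as map length / number of markers."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


# (map length in cM, number of markers), both taken from the text above.
maps_in_text = {
    "Sunseri et al. (2003)": (736.0, 273),
    "Barchi et al. (2010)": (718.7, 212 + 22 + 1 + 3),
}

for name, (length_cm, markers) in maps_in_text.items():
    spacing = average_spacing_cm(length_cm, markers)
    print(f"{name}: {spacing:.1f} cM between markers on average")
```

An average of this kind hides how unevenly markers are distributed, which matters for clustered marker types such as AFLPs.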
The first intraspecific population of RILs in eggplant was developed from the resistant bacterial wilt R. solanacearum line AG91-25 (MM960), derived from the Turkish line MM127 and a S. aethiopicum Aculeatum Group accession, and the commercial type line MM738 (Lebeau et al., 2013), which were previously used by Doganlar et al. (2002a). That population was used to dissect the genetic control of resistance to R. solanacearum to four strains of phylotype I and identify genes/QTLs through developing a genetic map, which spanned 884 cM along 18 LGs (Table 2). Although, the map had a low saturation due to extremely low polymorphism rate, with only 119 markers mapped and most of which AFLPs, a major gene was identified (ERs1) in LG2. The low level of polymorphism in the RILs population showed with AFLPs drastically changed when the individuals were sequenced by GBS, identifying 1,779 filtered SNPs and allowing to build a high-density genetic map for screening four new strains of R. solanacearum, apart from the previous ones of Lebeau et al. (2013), belonging to phylotypes I, IIA, IIB and III (Salgon et al., 2017). The overall map length and the average marker density had been increased to 1,085 and 4.4 cM, respectively, and the LGs number had been reduced from 18 to 14 (Table 2), which lead to identifying a major QTL at the bottom of LG 9 that controls three phylotype I strains, corresponded to the previously identified major gene ERs1 of Lebeau et al. (2013), and two other minor QTLs on LG 2, associated with partial resistance to strains of phylotypes I, IIA, III, and on LG 5, controlling the strains of phylotypes IIA and III. This was a clear example of the high potential of the recent advances in genomics which improves extremely the resolution and precision of the genetic studies.
A major resistance locus (FM1) was identified at the end of chromosome 2, at the exact same position as Rfo-sa1 from S. aethiopicum gr. Gilo, suggesting they might be orthologous.
The first attempt of association mapping based on linkage disequilibrium (LD) was performed with 141 eggplant accession from different countries and 105 SSRs developed by Nunome et al. (2009) to investigate nine fruit traits (Ge et al., 2013, Table 2). The analysis performed revealed a total of 49 marker associations related to eight phenotypic traits and 24 SSRs, being the total variation explained ranged from 4.5 to 22.8%.
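Association mapping relies on linkage disequilibrium between markers and causal variants. One common pairwise measure is r², computed from allele and haplotype frequencies at two biallelic loci as r² = D² / (p_A(1 - p_A) p_B(1 - p_B)), with D = p_AB - p_A p_B. The minimal sketch below assumes phased haplotypes coded as 0/1; it is not the procedure of the study cited above, whose multiallelic SSR data would need a different estimator, and the function name and toy haplotypes are hypothetical.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic loci from phased haplotypes coded 0/1."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and of equal length")
    p_a = sum(hap_a) / n  # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n  # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical toy haplotypes: complete LD versus independent loci.
print(ld_r2([1, 1, 0, 0], [1, 1, 0, 0]))  # alleles always co-occur
print(ld_r2([1, 1, 0, 0], [1, 0, 1, 0]))  # alleles combine at random
```

How quickly r² decays with distance determines how many markers are needed for a given mapping resolution in an association panel.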
A larger set of 191 accessions, including breeding lines, old varieties, and landraces from the Mediterranean basin and Asia, were investigated by Cericola et al. (2014) based on genome-wide association (GWA) approach for anthocyanin pigmentation and fruit color at two locations over two years using 314 SNPs developed by Barchi et al. (2011). A total of 56 associations were found between SNPs and anthocyanin content and fruit color-related traits, which were clustered into 12 groups and scattered over nine chromosomes, being eight of the groups overlapping with known QTL and demonstrating in that way the effectiveness of GWA approach. In addition, synteny with tomato allowed the identification of the genomic regions associated with anthocyanin accumulation in LGs 2, 5, and 12. Using the same association panel of 191 accessions and set of 314 SNPs (Table 2), Portis et al. (2015) examined the phenotype/genotype associations related to 33 traits (fruit, plant and leaf morphology traits) identifying 194 association to 30 traits, which involved 79 SNP loci in 39 distinct regions distributed across the 12 LGs.
A further improvement of Doganlar et al. (2002a) and Wu et al. (2009) maps was performed by increasing the number of markers to 864, by adding 400 AFLPs and 117 RFLPs, and using a larger F2 population (108 individuals) . In that way, the overall map length remained almost the same but the marker average distance decreased from 6.1 cM to 1.8 cM (Table 2). On the other hand, the improved map precision led the authors to revise the number of rearrangements between eggplant and tomato from 29 (five translocations and 24 inversions) (Doganlar et al., 2002a;Wu et al., 2009)  In order to exploit the great genetic diversity of the wild relatives, a new interspecific map (named SMIBC) was developed using S. incanum MM577, which, as has been mentioned previously, is considered a powerful source of phenolics and tolerant to some abiotic stresses such as drought (Knapp et al., 2013). In fact, with the objective of increasing the content of chlorogenic acid (CGA) in eggplant, which is usually the main phenolic compound (Whitaker et al., 2003;Prohens et al., 2013), a first set of introgression lines in the eggplant gene pool has been developed using S. incanum as a donor parent (Gramazio et the genetic diversity, evolutionary history, domestication, and ecology in eggplant (Gramazio et al., 2016a).
In this light, a high-quality genome sequence from an inbred line is absolutely necessary, in which the scaffolds should be assembled in chromosomes or pseudomolecules from sequence-based genetic and physical maps. The sequencing technologies using long-reads such as PacBio combined with a huge amount of small-reads of Illumina could achieve a satisfactory result (Mavromatis et al., 2012), although long-read sequencing technologies combined with physical mapping approaches, like optical mapping, Hi-C and the Dovetail Chicago methods, are offering new and most accurate solution to genome assembly (Yuan et al., 2017). An accurate reference genome assembly would encourage many research groups to sequence the genomes and the transcriptomes of their genotypes of interest or wild eggplant related species with a modest amount of economic resources, obtaining a meaningful information for a wide range of studies.
For instance, more attempts to develop experimental populations using wild relatives could be addressed if more information would be available for marker assisted selection to introgress efficiently genomic fragments of allied species. Up to now, the attempts to introgress wild relatives traits has been performed only sporadically and even though 25 allied eggplant species have been employed in interspecific cross with S. melongena , only one set of introgression lines has been developed using S. incanum as a donor parent (Gramazio et al., 2016b) and no other biparental and multiparental populations are available.
al., 2016b). Thus, to track and easily introgress the alleles of the genes involved in CGA content (phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H), 4-hydroxycinnamoyl-CoA ligase (4CL), hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase (HCT), p-coumaroyl ester 3'-hydroxylase (C3'H), and hydroxycinnamoyl-CoA quinate hydroxycinnamoyl transferase (HQT)) into the genetic background of eggplant, these genes were successfully mapped to different LGs using 91 BC1 individuals (Gramazio et al., 2014). In addition, the genes encoding five polyphenol oxidase enzymes (PPO1, PPO2, PPO3, PPO4, PPO5), which may be involved in the browning of the fruit flesh through oxidation of CGA and other phenols, mapped as a cluster on LG 8, and candidate genes for traits important in domestication, such as fruit shape (OVATE, SISUN1) and prickliness, were also mapped. The mapping was assisted by synteny with the orthologous genes in tomato, using the Tomato-EXPEN 2000 map (Fulton et al., 2002). Furthermore, apart from tomato, macro-synteny was established between SMIBC and four other eggplant maps (Nunome et al., 2009; Wu et al., 2009; Barchi et al., 2012; Fukuoka et al., 2012) by using shared markers.
SMIBC spanned 1,085 cM along 12 LGs with a total of 243 markers (42 COSII, 99 SSRs, 88 AFLPs, 9 CAPS, 4 SNPs and one morphological marker) (Table 2).
A few months later, at the end of 2014, the first draft of the eggplant genome was finally released online (Hirakawa et al., 2014). Apart from the genomic sequence, an integrated linkage map was also constructed from two F2 populations (EWF2 and LWF2), which had previously been used for mapping by Nunome et al. (2001) and . The map presented a total of 795 markers (574 SNPs and 221 SSRs) for an overall map length of 1,280 cM along 12 LGs, achieving the highest average marker density so far (0.7 cM) (Table 2).
In order to exploit genomic resources and genetic data for key agronomic traits, Rinaldi et al. (2016) analyzed syntenic relationships and QTL orthology among eggplant, tomato, and pepper using their respective genome assemblies, although the genome sequences of the three species differ in coverage, assembly quality, and percentage of anchorage. While the comparison between tomato and pepper was quite comprehensive owing to the high quality of their assemblies, the comparisons between eggplant and tomato and between eggplant and pepper were less exhaustive because the physical positions of most eggplant QTLs could not be determined. Nevertheless, most of the previously detected rearrangements were confirmed and new ones were identified, even though an enhanced version of the eggplant genome could have improved the precision of the analysis.

Future direction for genetics and genomics tools for eggplant breeding
In the last ten years, eggplant has narrowed the gap in genetic and genomic research with other major crops like tomato, potato, and pepper. Nevertheless, more genomic resources are needed for efficient breeding, in order to develop new improved varieties able to cope with a changing climate scenario and with new and threatening biotic stresses, and to understand
The delay in developing populations using wild relatives is staggering compared with other crops like tomato, in which more than 96 genes and QTLs have been introgressed from 14 different wild relatives, comprising several sets of near isogenic lines (NILs), introgression lines (ILs) and RILs, as well as a multi-parent advanced generation intercross (MAGIC) population (Pascual et al., 2015; Redden et al., 2015). A reliable reference genome would indeed speed up the development, and improve the precision, of these populations in eggplant. Barrantes et al. (2014) developed most of the lines of a set of ILs between tomato and S. pimpinellifolium L. from a BC3S1 generation, that is, in four generations versus the eight required to develop the set of ILs with S. incanum (Gramazio et al., 2016b). Both the tomato and the S. pimpinellifolium parents had advanced assemblies, which allowed arrays to be designed for high-throughput genotyping from the early generations of IL population development.
In the last few years, sequence-based genotyping (SBG) technology, which includes methods for simultaneous polymorphism discovery and genotyping such as GBS, RAD, ddRAD and related methods, has been a relatively economical way to generate information in a crop and to facilitate genotyping before a reference genome is available (Elshire et al., 2011). The RAD approach of Barchi et al. (2011) identified ~10,000 SNPs and 2,000 putative SSRs from the parents of a mapping population without a reference genome. Similarly, GBS performed on a population of 180 F6 RILs identified 1,779 filtered SNPs and drastically improved the quality of an intraspecific genetic map (Salgon et al., 2017). These genotyping-through-sequencing methods are powerful tools to increase the precision of, and accelerate, genetic and genomic studies, although the legal dispute over the exploitation of the patents on these methods has driven up the prices of these technologies, making them economically prohibitive for many research groups. Currently, many efforts are being dedicated to developing alternatives to SBG technologies, although many of them require prior knowledge: sequence information is necessary to design the primers that interrogate the genomic regions of interest for allele discovery or genotyping. The targeting platforms differ in throughput, cost, probes, multiplexing, number of target regions and many other customizable parameters. Among the most widely used alternatives are the Sequenom MassARRAY iPLEX platform (Gabriel et al., 2009), single primer enrichment technology (SPET) (NuGEN, San Carlos, USA), TruSeq Amplicon Sequencing and Nextera Target Enrichment (Illumina, San Diego, USA), and KASP SNP genotyping (LGC Genomics, UK).
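Whatever the platform, raw marker calls are usually filtered before mapping, which is what the "filtered SNPs" reported from GBS data refer to. The following is only a minimal sketch of such a filter, assuming genotypes coded as 0, 1 or 2 copies of the alternate allele (None for a missing call); the function name, the toy data and the thresholds for missing data and minor allele frequency (MAF) are illustrative assumptions, not the criteria used in any of the studies cited above.

```python
def filter_snps(genotypes, max_missing=0.2, min_maf=0.05):
    """Return the ids of SNPs that pass missing-data and MAF thresholds.

    genotypes: dict mapping a SNP id to a list of per-sample calls,
    each call being 0, 1 or 2 (alternate-allele copies) or None (missing).
    Thresholds are hypothetical defaults chosen for illustration only.
    """
    kept = []
    for snp_id, calls in genotypes.items():
        observed = [c for c in calls if c is not None]
        if not calls or not observed:
            continue  # nothing to evaluate for this SNP
        missing_rate = 1 - len(observed) / len(calls)
        alt_freq = sum(observed) / (2 * len(observed))  # diploid samples
        maf = min(alt_freq, 1 - alt_freq)
        if missing_rate <= max_missing and maf >= min_maf:
            kept.append(snp_id)
    return kept


# Made-up calls for five samples (not real eggplant data).
toy_genotypes = {
    "snp_a": [0, 1, 2, 1, 0],        # polymorphic, fully genotyped
    "snp_b": [0, 0, 0, 0, 0],        # monomorphic, fails the MAF filter
    "snp_c": [None, None, 1, 2, 0],  # 40% missing, fails the missing filter
}
kept_snps = filter_snps(toy_genotypes)
```

In this toy case only snp_a is retained; in practice the thresholds depend on the population type (for example, expected allele frequencies differ between F2, BC1 and RIL populations).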
A promising new approach, rAmpSeq, was announced at the end of 2016 as a robust genotyping platform based on repetitive sequences (Buckler et al., 2016). The method uses conserved regions to design PCR primers that amplify thousands of middle-repetitive regions and thereby interrogate thousands of markers. The authors state that the cost can be less than $2 per sample, which would allow thousands of samples to be genotyped at a very reasonable cost. Other advantages are that the PCR does not demand high DNA quality or quantity, and that competition among amplicons is reduced because the repetitive regions are fairly similar in length and composition. On the other hand, compared with SBG technologies, rAmpSeq identifies fewer markers, requires prior information, generally screens intergenic regions, and involves more challenging bioinformatic analysis. This approach could revolutionize breeding and conservation biology in the immediate future, although at the moment it still requires further improvement and optimization in crops other than maize.
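The idea of anchoring primers in conserved stretches of a repeat family can be illustrated with a toy example. The sketch below simply lists the k-mers shared by every copy of a made-up repeat; it is an assumption-laden illustration of the general principle, not the published rAmpSeq design pipeline, and the function name, sequences and value of k are all hypothetical.

```python
def conserved_kmers(copies, k):
    """Return, sorted, the k-mers that occur in every sequence in `copies`."""
    if not copies:
        return []
    shared = None
    for seq in copies:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        shared = kmers if shared is None else shared & kmers
    return sorted(shared)


# Made-up copies of one repeat family: identical flanks, variable middle.
toy_copies = [
    "ACGTTGCAAATTTGGCCATG",
    "ACGTTGCACCTTTGGCCATG",
    "ACGTTGCAGGATTGGCCATG",
]
# Candidate primer-binding sites are the stretches conserved in all copies.
primer_sites = conserved_kmers(toy_copies, 8)
```

Here the shared 8-mers fall in the two conserved flanks, while the variable middle, which is where polymorphic markers would be scored, yields none. A real design would also have to consider melting temperature, primer specificity and amplicon length.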
The genomics revolution, which has changed our understanding of evolution, domestication, genetic architecture and many other aspects, is far from slowing down. New achievements in sequencing technologies and their decreasing cost are accelerating the development of high-quality reference genome assemblies, high-throughput genotyping and marker-assisted selection, which is reflected in a better overall understanding of species and in new improved varieties adapted to upcoming scenarios. In the near future, sequencing hundreds or thousands of samples will become routine and affordable, including for non-model species and for institutions with limited resources and infrastructure in the developing world. This will undoubtedly speed up and revolutionize eggplant breeding for the development of a new generation of cultivars with dramatically improved yield, quality and resilience.