HPLC Determination and FT-MIR Prediction of Sugars from Juices of Different Apple Cultivars during Fruit Development

Individual sugars were analyzed by high performance liquid chromatography (HPLC) in apple juice samples obtained from fruits of the ‘Jonathan’, ‘Starkrimson’ and ‘Golden Delicious’ cultivars. Samples were harvested from the inside and the periphery of the crown at different stages of fruit growth, from 7 to 144 days after full bloom (DAFB). Values of 0.42 to 14.33%, 0.29 to 4.06% and 0 to 4.28% were determined for fructose, glucose and sucrose, respectively. Fructose and glucose concentrations increased significantly (p < 0.05) from 7 DAFB onward, regardless of cultivar, while sucrose increased slowly at first and then faster from 65 DAFB. Fourier transform mid-infrared (FT-MIR) analysis confirmed the differences between juice samples, the 900-1500 cm⁻¹ region being the most specific to sugar signals. Partial least squares (PLS) calibration models based on FT-MIR spectra were developed for predicting the individual sugars of apple juices. The optimal spectral regions and pre-treatments were 900-1500 cm⁻¹ and Savitzky-Golay first derivative (d1) for fructose, 900-1200 cm⁻¹ and Savitzky-Golay second derivative (d2) for glucose, and 900-1200 cm⁻¹ and standard normal variate (SNV) for sucrose. In cross-validation, the PLS calibration models showed very good performance for fructose (R²cval = 0.95; standard error of cross-validation (SECV) = 0.907), acceptable performance for glucose (R²cval = 0.85; SECV = 0.424) and only satisfactory performance for sucrose (R²cval = 0.75; SECV = 0.561). For practical relevance, the FT-MIR predicted values were compared against the HPLC reference values in external validation tests. The best results were achieved for fructose (R²p = 0.94; RPD = 4.9), while the glucose (R²p = 0.84; RPD = 2.61) and sucrose (R²p = 0.7; RPD = 2.08) models reached satisfactory values.


Introduction
The apple is one of the most appreciated fruits among both children and adults. It is an important part of the human diet, being a rich source of monosaccharides and biologically active compounds (Fuleki et al., 1994; Miller et al., 1997; Zhang et al., 2010; Zheng et al., 2012). Sugars represent more than 98% of the total soluble solids of apple juice (Kelly et al., 2005). Soluble carbohydrates reach significant concentrations in mature apple fruit (fructose 5-15% (w/w), glucose 1-4% (w/w), sucrose 1-5% (w/w)). Carbohydrates provide energy to the fruit and determine its eating quality (Borsanie et al., 2009). Soluble sugars are known for their effect on plant growth and development (Rolland et al., 2006; Ruan et al., 2010). In fleshy fruits, soluble sugar accumulation during fruit development determines the sweetness at harvest. Compared to other plants, apples have a distinctive sugar metabolism: more than 80% of the total carbon flux goes through fructose, so fructose reaches higher levels. Although there are reports on the accumulation of apple carbohydrates and the enzyme-driven changes during growth, the mechanism of sugar accumulation remains unclear, being regulated at the level of gene expression (Chourey et al., 1985; Mingjun et al., 2012; Yamaki et al., 1992; Zhang et al., 2010).
Instrumental techniques for carbohydrate determination are known to be expensive and labor-intensive. Recently, Fourier transform mid-infrared (FT-MIR) spectroscopy has become a widely accepted method for determining the chemical composition of food, being quick, practical and environmentally friendly. Absorption bands in the MIR region are characteristic of the linkages and functional groups of a molecule (Kelly et al., 2005). These absorption bands reflect the vibrational modes of a molecule, including stretching and bending, so that the whole spectrum is a fingerprint of a specific compound or product. A major advantage of MIR spectroscopy is its sensitivity to the precise chemical composition of the analyzed samples. Near-infrared (NIR) spectroscopy is usually used to identify a class of compounds rather than individual components (Ścibisz et al., 2011). NIR coupled to chemometrics provided good results in predicting apple fruit quality, including polyphenol content, total sugar level (Pissard et al., 2013), consistency, color, antioxidant capacity (Giovanelli et al., 2014), or chlorophyll and carotenoids (Beghi et al., 2013). MIR spectroscopy coupled to chemometrics was successfully applied to predict the chemical composition (sugars, organic acids, acidity, dry matter etc.) of different food products, mainly liquids, such as mango juice (Duarte et al., 2002) and apple juice (Irudayaraj and Tewari, 2003; Leopold, 2010), or even whole apricot fruit (Bureau et al., 2009) and tomato (Ścibisz et al., 2011). Andrade et al. (2003) showed that FT-MIR was able to differentiate "pure" apple drinks from those with added flavors; the MIR spectral data were treated using various chemometric techniques to determine the best combination for optimal classification of the beverages. The MIR technique was also used to determine apple juice authenticity, together with linear discriminant analysis, principal component regression (PCR) and partial least squares (PLS) regression (Sivakesava et al., 2001).
FT-MIR spectroscopy has been widely used for the analysis of must and wine. It has also become an alternative method for analyzing sugars in foods and beverages such as sugar cane juice, mango juice, fruit juices and refreshments. In addition, FT-MIR has been used for the authentication of fruit-derived products and the detection of those that have undergone deterioration, as well as for classifying wine or honey according to their origin (Bureau et al., 2009).
Studying the dynamics of sugar accumulation in apple fruit harvested at different intervals (from 7 to 144 days after full bloom, DAFB) will open new ways to valorise the fruits resulting from physiological fall. Moreover, a direct FT-MIR analysis method would bring a significant contribution, since knowledge of the individual sugar concentrations during apple fruit growth is essential for selecting the appropriate valorisation route (direct consumption, juice production, fermentation, extraction of antioxidants, pigments etc.). Furthermore, a direct method for determining the sugar concentrations of apple juices of different quality will be useful in future genetics and breeding research for monitoring the sugar content of fruits obtained from new hybrids during growth. Consequently, the current study aimed to determine the individual sugars in juices of the 'Jonathan', 'Starkrimson' and 'Golden Delicious' apple cultivars during fruit development by high performance liquid chromatography (HPLC) and, based on the same set of samples, to develop FT-MIR calibration models.

Biological materials
The studied apple cultivars were 'Jonathan', 'Starkrimson' and 'Golden Delicious'. Samples were collected during fruit development, from the same orchard, from trees having the same exposure in the orchard. Sampling was done in the same way for all three apple cultivars. Fruits were collected from the inside and from the periphery of the crown, at different times of growth: 7, 15, 35, 65, 107 and 144 DAFB. Consequently, 36 different samples (3 cultivars × 6 harvesting times × 2 crown positions) were harvested, in two replicates each. For each harvesting time, the apples were individually vacuum packed after sampling, then frozen immediately and stored at -18 °C. At the end of the harvesting period, all samples were thawed and peeled, and the pulp was pressed. The corresponding apple juice obtained from each sample was stored at -18 °C in two separate containers until further analysis (HPLC and FT-MIR).
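The factorial sampling design described above can be enumerated in a few lines. This is only an illustrative sketch; the variable names are hypothetical and the position labels are shorthand for "inside" and "periphery" of the crown.

```python
from itertools import product

# Factors of the sampling design as described in the text
cultivars = ["Jonathan", "Starkrimson", "Golden Delicious"]
harvest_dafb = [7, 15, 35, 65, 107, 144]   # days after full bloom
crown_positions = ["inside", "periphery"]

# Every cultivar x harvesting time x crown position combination is one sample
samples = list(product(cultivars, harvest_dafb, crown_positions))

print(len(samples))  # 3 * 6 * 2 combinations
```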

Determination of individual sugars by HPLC
Sugars in the apple juice obtained from the studied samples were determined according to Bonta et al. (2008), with some modifications. After filtration through a 0.45 μm Millipore filter, the individual sugar content of each sample was determined by HPLC coupled with a refractive index detector. Identification of sugars was based on the retention times of standards. Quantification was based on calibration curves obtained for each sugar after injection of known concentrations of a standard. The equipment used was a Shimadzu HPLC system consisting of a controller, auto-sampler, degasser, pump and refractive index (RI) detector. Separation was performed on a modified amino column (Alltima Amino 100 Å, 5 μm, 250 × 4.6 mm). The mobile phase flow rate was 1.3 mL/min. The column temperature was set to 30 °C and the sample injection volume was 20 μL. Elution was isocratic, using acetonitrile/water (75/25; v/v) as the mobile phase. Each determination was performed in two repetitions, the standard error of the laboratory method being 0.0471, 0.0143 and 0.0051 for fructose, glucose and sucrose, respectively.
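Quantification from a calibration curve, as described above, can be illustrated with a minimal sketch. It assumes a linear detector response; the standard concentrations and peak areas below are hypothetical values invented for illustration, not data from this study, and the function names are not part of any HPLC software.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to get a concentration from a peak area."""
    return (peak_area - intercept) / slope


# Hypothetical standards: concentration (%) and detector peak area (arbitrary units)
std_conc = [0.5, 1.0, 2.0, 5.0, 10.0]
std_area = [1020.0, 2010.0, 4030.0, 9990.0, 20050.0]

slope, intercept = fit_line(std_conc, std_area)
unknown_conc = quantify(8000.0, slope, intercept)
print(f"estimated concentration: {unknown_conc:.2f} %")
```

One calibration line would be fitted per sugar, since each sugar has its own detector response.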

FT-MIR analysis
FT-MIR spectra of the apple juice obtained from the studied samples were recorded using a Shimadzu IR Prestige-21 spectrometer. Each spectrum was recorded from 650 to 4000 cm⁻¹, in three repetitions, each being the average of 128 separate scans. Spectral pre-treatments and calibration models were computed using The Unscrambler v.9.7 chemometric software.

Statistical analysis
In order to show the effect of cultivar, harvesting crown position and time of harvest on apple juice fructose, glucose and sucrose concentrations, three-way ANOVA General Linear Model, as well as one-way ANOVA and Tukey's comparison statistical tests (significance level α = 0.05) were performed on Minitab 16.1.0.
PLS regression was performed to study the predictive ability of the calibration models. The models were validated using the full cross validation technique, in order to determine the optimal number of latent variables and to detect the outlier samples.
Cross-validation (leave-one-out) means that one sample at a time is left out of the calibration set; the model is built with the remaining samples and used to predict the left-out sample, and this operation is repeated as many times as there are samples. External validation sets were used for testing the obtained models. The best models were selected based on the highest coefficients of determination for calibration and cross-validation (R²cal, R²cval) or external validation (R²p), and the smallest standard error of calibration (SEC), standard error of cross-validation (SECV) or standard error of prediction (SEP). The ratio of performance to deviation (RPD) was also used. RPD is the ratio between the standard deviation (SD) and the SEP, indicating the model's ability to predict future results. Cozzolino et al. (2005) mentioned that models with RPD values higher than three can be used only for standard screening analysis, while models with RPDs higher than five are considered suitable for quality control of products.
Table 1 notes: 1 'Jonathan'; 2 'Golden Delicious'; 3 'Starkrimson'; 4 Inside of the tree crown; 5 Periphery of the crown; 6 T1-T6: harvesting times during fruit growth; 7 Identical capital letters for each cultivar and each position in the crown denote no significant differences (p > 0.05); identical lowercase letters for each position and each time denote no significant differences (p > 0.05); identical italic capital letters for each cultivar and each time denote no significant differences (p > 0.05).
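The leave-one-out procedure described in this section can be sketched as follows. To keep the example self-contained, a one-variable least-squares line stands in for the PLS model actually used in the study; the data are hypothetical, the error is computed as a root mean squared error (one common convention for SECV, assumed here), and the last ratio divides SD by SECV rather than by SEP.

```python
import math
import statistics


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept (stand-in for PLS)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(x, y):
    """Predict each sample with a model fitted on all the other samples."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * x[i] + intercept)
    return predictions


def rmse(reference, predicted):
    """Root mean squared difference between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


# Hypothetical data: one spectral feature per sample and its reference sugar content (%)
feature = [0.10, 0.22, 0.29, 0.41, 0.52, 0.58, 0.71]
reference = [1.0, 2.1, 3.2, 3.9, 5.1, 6.0, 7.2]

cv_pred = loo_predictions(feature, reference)
secv = rmse(reference, cv_pred)
sd_over_secv = statistics.stdev(reference) / secv
print(f"SECV = {secv:.3f}, SD/SECV = {sd_over_secv:.2f}")
```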

Results and discussions
HPLC analysis of apple juice sugars obtained from fruits harvested during development
Table 1 shows the carbohydrate concentration for each cultivar, as a function of position and harvesting time. Fructose and glucose concentrations increased significantly (p < 0.05) from 7 DAFB onward, irrespective of cultivar. The sucrose content of the fruits increased slowly in the early development stages and faster from 65 DAFB. A similar increase in fructose and sucrose content was reported in previous studies for other apple cultivars, which concluded that the increase of sucrose occurs at the time of starch degradation (Berüter and Feusi, 1997; Hecke et al., 2006; Petkovsek et al., 2007; Teo et al., 2006; Zhang et al., 2010).
Comparing the cultivars as a function of fruit position in the crown for each time separately, statistically significant differences (p < 0.05) can be noticed from 7 DAFB for fructose and glucose and from 15 DAFB for sucrose. Carbohydrate content oscillated throughout fruit development, making it almost impossible to rank the cultivars by their carbohydrate accumulation.
Fruit position in the tree crown can play an important role in fructose, glucose and sucrose concentrations. In the current study, oscillations were noticed between the crown positions, so no overall conclusion could be drawn regarding this experimental factor. However, at the last harvesting time (144 DAFB), fruits from the inside of the crown contained more fructose than fruits harvested from the periphery of the crown for all three apple cultivars studied. Regarding glucose and sucrose content at 144 DAFB, the 'Jonathan' cultivar had the highest content in fruits harvested from the inside of the crown, while the 'Starkrimson' and 'Golden Delicious' cultivars had the highest glucose and sucrose content in fruits harvested from the periphery of the crown.
The applicability of FT-MIR and chemometrics for the prediction of sugars from apple juices obtained from fruits harvested during development
Spectra description
A set of 30 standard sugar solutions was prepared, with concentrations between 0-15% for fructose and between 0-5% for glucose and sucrose (Table 2). FT-MIR spectra of the standard solution mixes and of the individual standard solutions of fructose (conc. 15%), glucose (conc. 5%) and sucrose (conc. 5%) are shown in Fig. 1A. The spectra showed strong absorption bands characteristic of water molecules, between 1500-1800 cm⁻¹ and between 2800-3700 cm⁻¹, indicating an apparent similarity of all solutions. However, the sugar standard solutions can be differentiated by analyzing the spectral region characteristic of this class of compounds, between 900-1500 cm⁻¹. In this region, characteristic absorption bands were identified (Fig. 1C) for glucose (specific maxima at 991, 1031, 1080, 1107, 1151 and 1367 cm⁻¹), fructose (specific maxima at 966, 1063, 1083, 1155, 1254, 1346, 1416 and 1456 cm⁻¹) and sucrose (specific maxima at 995, 1055, 1113 and 1138 cm⁻¹). The characteristic carbohydrate absorption bands in the region 904-1153 cm⁻¹ were attributable to C-O and C-C bonds (stretching modes), whereas those in the region 1199-1474 cm⁻¹ were due to the O-C-H, C-C-H and C-O-H groups (bending modes) (Irudayaraj and Tewari, 2003).
FT-MIR spectra (650-4000 cm⁻¹) of the fruit juices obtained from 'Golden Delicious', 'Starkrimson' and 'Jonathan' apple cultivars harvested during their development (Fig. 1B) showed, like the spectra of the standard sugar solutions discussed above, three major regions: 2800-3700 cm⁻¹, 1500-1800 cm⁻¹ and 900-1500 cm⁻¹. Previous research showed that the first two regions are attributed to water molecules, apple juice containing a significant quantity of water (the dry matter content of the juice was between 4.75 and 22%) (Bureau et al., 2009; Ścibisz et al., 2011; Wilkerson et al., 2013). In the current study, the characteristic region for sugars (900-1500 cm⁻¹) indicated the accumulation of sugars in the apple fruits harvested during development. Absorption bands in the 950-1200 cm⁻¹ region were explained by C-C and C-O stretching (Bureau et al., 2009). Differences between juice samples obtained from apples harvested during fruit development were observed, especially in the 900-1500 cm⁻¹ region. Baseline shift and multiplicative interferences were corrected using mathematical pre-treatments (standard normal variate (SNV), first and second derivatives).
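As an illustration of how the SNV pre-treatment mentioned above removes additive offsets and multiplicative scaling, the following minimal sketch centers and scales each spectrum by its own mean and standard deviation. The function name and the absorbance values are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed, which may differ in detail from the software used in the study.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1), an assumption
    return [(value - mean) / sd for value in spectrum]


# Hypothetical absorbances at a few wavenumbers
spectrum = [0.12, 0.35, 0.80, 0.55, 0.20]

# The same spectrum with a baseline shift (+0.5) and a multiplicative effect (x2)
distorted = [2.0 * value + 0.5 for value in spectrum]

corrected_a = snv(spectrum)
corrected_b = snv(distorted)

# After SNV both versions coincide (up to floating-point rounding)
print(max(abs(a - b) for a, b in zip(corrected_a, corrected_b)))
```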

Optimal spectral region and pre-treatments selection
To find the calibration model with the best predictive power for each sugar, several spectral regions were tested on the set of samples (N = 36) obtained from apples harvested during their development (Table 3). The set contained a wide variety of samples, showing a high variability in sugar concentrations due to the different harvesting times (from 7 to 144 DAFB) and the different cultivars studied. Obtaining the samples in this manner provided a set covering a wide range of apple juice concentrations for fructose (0.42-14.33%), glucose (0.29-4.06%) and sucrose (0-4.28%). Moreover, it provided originality by creating spectral databases of apple juices with low sugar concentrations (genuine, obtained without dilution). So far, studies have been based on the analysis of apple juice obtained from fruits harvested at technological and/or consumption maturity (Irudayaraj and Tewari, 2003; Leopold, 2010). Checking the feasibility of the PLS calibration models represented the first step in the development of rapid methods (of the FT-MIR type) for analyzing the growth dynamics of apples.
Table 3. FT-MIR prediction of sugars in apple juice obtained from fruits of three cultivars harvested during their development: descriptive statistics and PLS calibration model performance for selecting the optimal spectral region and pre-treatments.
Table 4 notes: N - number of samples for each set; *1 Spectral region 900-1500 cm⁻¹, pre-treatment d1 (7, 2); *2 Spectral region 900-1200 cm⁻¹, pre-treatment d2 (9, 2); *3 Spectral region 900-1200 cm⁻¹, pre-treatment SNV; R²p - coefficient of determination for external validation; all other abbreviations are the same as in Table 3.
For each type of sugar, the important wavenumber ranges for achieving well-performing PLS calibration models were proposed based on regression coefficients developed for the full spectrum (650-4000 cm⁻¹). Wavenumbers characterized by high absolute regression coefficients have a higher impact on the Y variables (reference values), indicating their importance in the development of calibration models (Ścibisz et al., 2011). On the other hand, "noise"-dominated regions of the regression coefficients must be removed from the model. As shown in Table 3, the highest values for R²cal, R²cval and RPD and the lowest values for SEC and SECV were obtained when using the 900-1500 cm⁻¹ spectral region for fructose, 900-1200 cm⁻¹ for glucose and 900-1200 cm⁻¹ for sucrose. For each type of sugar, these regions were further used to select the optimal pre-treatment of the spectral data (Table 3). SNV, the Savitzky-Golay first derivative with seven smoothing points and polynomial order 2, d1 (7, 2), and the Savitzky-Golay second derivative with nine smoothing points and polynomial order 2, d2 (9, 2), were tested to increase the RPD values. It was concluded that the optimal pre-treatments of the spectra were d1 (7, 2) for fructose, d2 (9, 2) for glucose and SNV for sucrose. Selecting the optimal spectral region and pre-treatments led to PLS calibration models with very good performance for fructose (R²cval = 0.95; SECV = 0.907) and acceptable performance for glucose (R²cval = 0.85; SECV = 0.424). The best calibration models developed for sucrose showed only satisfactory performance (R²cval = 0.75; SECV = 0.561), irrespective of the regions or pre-treatments used.
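The Savitzky-Golay derivative pre-treatments named above can be illustrated with the standard tabulated convolution weights for a quadratic polynomial: a 7-point window for the first derivative and a 9-point window for the second derivative. This is a minimal sketch under the assumption of evenly spaced wavenumbers; edge points without a full window are simply dropped, which may differ from how the chemometric software used in the study handles edges. The function name is hypothetical.

```python
# Tabulated Savitzky-Golay weights for a quadratic polynomial fit
D1_7PT = [-3, -2, -1, 0, 1, 2, 3]                 # first derivative, normalizer 28
D2_9PT = [28, 7, -8, -17, -20, -17, -8, 7, 28]    # second derivative, normalizer 462


def sg_derivative(y, weights, normalizer, spacing=1.0, order=1):
    """Apply a Savitzky-Golay derivative filter to the interior points of y."""
    half = len(weights) // 2
    result = []
    for center in range(half, len(y) - half):
        window = y[center - half:center + half + 1]
        weighted_sum = sum(w * v for w, v in zip(weights, window))
        # Divide by spacing**order to express the derivative per wavenumber unit
        result.append(weighted_sum / (normalizer * spacing ** order))
    return result


# Sanity check on a parabola y = x^2, whose derivatives are known exactly
x = list(range(12))
y = [float(xi * xi) for xi in x]

d1 = sg_derivative(y, D1_7PT, 28, order=1)    # values for x = 3..8
d2 = sg_derivative(y, D2_9PT, 462, order=2)   # values for x = 4..7
print(d1)
print(d2)
```

Because a quadratic fit reproduces a parabola exactly, the first derivative equals 2x at each interior point and the second derivative equals 2, which gives a quick check that the weights are applied correctly.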

External validation
For each type of sugar, the performance of the best PLS models developed previously was tested on external validation sets. By manual random selection, targeting a representative distribution of all types of samples and a wide range of concentrations, two sets were formed: a calibration set (C1) consisting of 24 samples (66.7%) and an external validation set (V1) consisting of 12 samples (33.3%). As shown in Table 4, the mean, SD and sugar concentration range of the calibration set (N = 24) were very close to those of the validation set (N = 12). To verify the feasibility of calibration models obtained from the 30 mixes of sugar standard solutions (STD set; N = 30), the V2 validation set (N = 36) was used, consisting of all the samples of apple juice obtained from fruits harvested during their development. For each type of sugar, the FT-MIR predicted results were compared with those determined by the direct HPLC method. Means, SD and concentration ranges of the STD and V2 sets are shown in Table 4. To verify the hypothesis that a set containing both the mixes of standard solutions and authentic apple juice would have a higher predictive capacity, the C2 calibration set (N = 54) was used, consisting of the 30 mixes of standard solutions and the C1 set (N = 24). Model validation was performed using the V1 set, consisting of 12 samples of authentic juice other than the 24 already used in the calibration set (Table 4). Table 4 summarizes the results for the PLS calibration models developed for the C1, C2 and STD sets, and their performance in predicting the V1 and V2 external validation sets. The PLS calibration models developed for the spectra of the standard solution mixes were highly accurate (set STD, Table 4). For each of the three sugars, the models obtained for the STD set had coefficients of determination very close to unity (0.98 for fructose, 0.95 for glucose and 0.94 for sucrose).
Very good results were obtained for the C1 set used to develop calibration models for fructose and glucose (Table 4), with an R²cal of 0.95 for fructose and 0.91 for glucose. For sucrose, on the other hand, the models obtained for the STD set performed well, whereas those obtained for the C1 and C2 sets were poor (R²cal of 0.78 for the C1 set and 0.71 for C2). This can be explained by the high number of samples with zero or near-zero sucrose concentration at the initial sampling times (T1 → T3, Table 1). In fact, none of the calibration sets developed for sucrose could be successfully used to predict the sucrose concentrations of the external validation sets (V1-V2, Table 4).
For fructose, the best results in predicting the external sets (V1-V2) were obtained when the model was based on the spectra of authentic apple juice (set C1). Using the PLS calibration model developed on the C1 set to predict the V1 validation set gave an R²p of 0.94 and an SEP of 0.826 for fructose (Table 4). Satisfactory performance (R²p of 0.84 and SEP of 0.472; Table 4) was obtained when predicting the glucose concentrations of the V1 set with the C1 calibration model. For all sugars studied, when the calibration model was based on the spectra of standard solutions, its ability to predict the sugar concentrations of apple juice obtained from fruits harvested during development was drastically reduced (Table 4, External validation section, V2 set for each sugar). When authentic juice samples were included in the calibration in addition to the standard solution samples (C2 set), the results improved (Table 4, External validation section, V1 set of each sugar for PLS calibration models developed with the C2 set). These results reinforce the hypothesis that calibration sets should include samples similar to those to be predicted.
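The external validation statistics reported above (R²p, SEP and RPD) can be computed from paired reference and predicted values. The sketch below uses common conventions that are assumptions here and may differ from the software used in the study: SEP as the root mean squared prediction error without bias correction, and R²p as one minus the ratio of residual to total sum of squares. The data and the function name are hypothetical.

```python
import math
import statistics


def external_validation_metrics(reference, predicted):
    """Return (R2p, SEP, RPD) for an external validation set."""
    n = len(reference)
    residuals = [p - r for r, p in zip(reference, predicted)]
    ss_res = sum(e * e for e in residuals)
    mean_ref = statistics.fmean(reference)
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)

    r2p = 1.0 - ss_res / ss_tot              # coefficient of determination (assumed form)
    sep = math.sqrt(ss_res / n)              # root mean squared prediction error
    rpd = statistics.stdev(reference) / sep  # SD of reference values divided by SEP
    return r2p, sep, rpd


# Hypothetical HPLC reference values (%) and FT-MIR predictions (%)
hplc = [1.0, 2.0, 3.0, 4.0, 5.0]
ftmir = [1.1, 1.9, 3.1, 3.9, 5.1]

r2p, sep, rpd = external_validation_metrics(hplc, ftmir)
print(f"R2p = {r2p:.3f}, SEP = {sep:.3f}, RPD = {rpd:.2f}")
```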

Conclusions
The selection of the optimal spectral region and pre-treatments resulted in PLS calibration models with very good performance for fructose and acceptable performance for glucose. The sucrose calibration models showed only satisfactory performance, irrespective of the spectral regions and pre-treatments used. For fructose and glucose, the best results in predicting external sets were obtained when the model was based on the spectra of authentic apple juice. None of the calibration sets developed for sucrose could be successfully applied to predict the sucrose concentration of the external validation sets.