Longitudinal Concordance Correlation Function Based on Variance Components: An Application in Fruit Color Analysis

The maturity stages of papaya fruit based on peel color are frequently characterized from a sample of four points on the equatorial region measured by a colorimeter. However, this procedure may not be suitable for assessing the papaya’s overall mean color and an alternative proposal is to use image acquisition of the whole fruit’s peel. Questions of interest are whether a sample on the equatorial region can reproduce a sample over the whole peel region and if the colorimeter can compete with a scanner, or digital camera, in measuring the mean hue over time. The reproducibility can be verified by using the concordance correlation for responses measured on a continuous scale. Thus, in this work we propose a longitudinal concordance correlation (LCC), based on a mixed-effects regression model, to estimate agreement over time among pairs of observations obtained from different combinations between measurement method and sampled peel region. The results show that the papaya’s equatorial region is not representative of the whole peel region, suggesting the use of image analysis rather than a colorimeter to measure the mean hue. Moreover, in longitudinal studies the LCC can suggest over which period the two methods are likely to be in agreement and where the simpler colorimeter method could be used. The performance of the LCC is evaluated using a small simulation study. Supplementary materials accompanying this paper appear online.


INTRODUCTION
Papaya (Carica papaya L.) is a tropical and climacteric fruit with antioxidant, anticarcinogenic and anti-mutagenic properties, containing carotenoids and with nutritionally valuable accumulations of lycopene (Sancho et al. 2010). Brazil is ranked second in the world for papaya production with 1.6 million tons cultivated annually, which is equivalent to 17.5% of global production, and is also the second major papaya exporting country, accounting for 11.2% of the global trade (Evans and Ballen 2015). Nevertheless, for expanding its internal and external market, Brazil still needs to improve its papaya postharvest quality, which will also bring added economic value to the product. Consumers purchase papaya based on color (yellow-orange papayas), freshness, taste, and size (Bron and Jacomino 2006;Sivakumar and Wall 2013).
Many studies, such as Silva-Ayala et al. (2005), Schweiggert et al. (2011), and Sivakumar and Wall (2013), have been conducted to understand how some variables may influence the papaya's quality in the postharvest period. One of the most important variables is the peel color, a criterion used for determination of the ripeness stage and fruit classification based on maturation scales (Mendoza and Aguilera 2004). Moreover, the color is highly correlated with acidity, firmness, and ethylene emission, which are used as physical, chemical, and sensory indicators of product quality (Bron and Jacomino 2006). Color is traditionally measured by a colorimeter at various points on the fruit's surface, but this can lead to bias in the determination of the mean color depending on the number and location of sampled points (Mendoza and Aguilera 2004;Oliveira et al. 2017). For example, it is common to sample only four equidistant points on the equatorial region of the papaya fruit (Chávez-Sánchez et al. 2013;Martins et al. 2014).
A possible alternative to measure color, suggested by Mendoza and Aguilera (2004) and Darrigues et al. (2008), is the use of image analysis. This involves using a digital camera, or a scanner, for acquisition of an image or video under standard lighting conditions that can be processed using image analysis software. An image or video offers advantages in providing pixel-by-pixel information on the fruit's peel along with other associated data, such as the fruit's shape (Wu and Sun 2013). Not surprisingly the colorimeter approach is faster and easier than image analysis; however, as noted by Oliveira et al. (2017) it may not be suitable for assessing the overall mean color of the papaya's peel and its performance will depend on the number of measured points and choice of sampled region. Therefore, other studies are still needed to assess if a sample on the equatorial region can reproduce a sample over the whole region, and if the colorimeter can compete with a scanner or digital camera in measuring the mean hue of papaya peel over time. Hence, it is important to know about the magnitude of agreement over time between values observed by both approaches.
The bivariate agreement coefficient was introduced by Krippendorff (1970) to assess the agreement between two independent observers, or methods, based on nominal, ordinal, or interval measurement scales. Later, Lin (1989) proposed the concordance correlation coefficient (CCC) to evaluate the agreement between observations measured by two methods when responses are measured on a continuous scale (King et al. 2007b). The reproducibility between the two methods is calculated by measuring the variation of the linear relationship between each pair of observations from the 45 • line through the origin. The main advantages of CCC are that it measures how far each observation deviates from the best-fit line (measure of precision) and how far the best-fit line deviates from the 45 • line through the origin (measure of accuracy) (Lin 1989).
A weighted CCC for repeated measurements was proposed by Chinchilli et al. (1996), allowing for within-unit variances to vary across experimental units using a random coeffi-cients model. King and Chinchilli (2001) proposed a generalized CCC for categorical data, allowing for the calculation of the agreement between two or more methods. Another relevant extension was proposed by King et al. (2007a,b), which focused on the estimation of a CCC for repeated measures incorporating a nonnegative definite matrix of weights across different repeated measurements. Soon after, Carrasco et al. (2009) proposed the estimation of a CCC for repeated measures (CCC rm ) based on variance components (VC) from a mixed-effects model summarizing the agreement among methods over all measurements. These authors considered that individual effects impact the variance of distribution of the responses, including this as random effect in the model. Consequently, the interaction between individual and method, or time, is also random, allowing the estimation of individual-specific variances for method and/or at repeated measurements over time. Another relevant agreement index was proposed by Hiriote and Chinchilli (2011), who introduced a repeated-measures CCC from a matrix that characterizes the overall agreement between two vectors of random variables. Recently, Rathnayake and Choudhary (2017) proposed a time-dependent CCC for multiple methods based on semiparametric models, using penalized regression splines for modeling longitudinal data.
An advantage of the VC approach is that the covariance structure of linear mixed-effects models is flexible and allows for constant, or non-constant, correlation between the observations and can handle unbalanced data (Lindstrom and Bates 1990). Restricted maximum likelihood (REML) can be used to obtain the estimates of variance components, since it is less biased and more accurate than standard maximum likelihood (Carrasco et al. 2009) . This is a popular framework for analyzing repeated-measures data and, in particular, longitudinal data. However, if the model is incorrectly, or incompletely, specified then the REML estimates may also be biased. Additionally, mixed-effects regression models can be used to describe the CCC as a function of time, rather than summarizing it in a single coefficient as proposed by Carrasco et al. (2009). This approach may give improved statistical power in the presence of missing data (Pinheiro and Bates 2000), decreasing the standard errors of estimators, and allows the assumption of different forms for random effects, such as constant or non-constant over time (Fitzmaurice et al. 2009). Moreover, standard repeated-measures ANOVA cannot be applied to longitudinal data when observations are irregularly spaced or when we need to include quantitative covariates in the model. In this work we propose an extension of the ideas presented by Lin (1989) and Carrasco et al. (2009) based on the mixed-effects regression model rather than a mixed-effects ANOVA approach. This allows us to estimate the longitudinal agreement over time among pairs of observations obtained from different methods and from different sampled regions on the papaya peel.
The remainder of the paper is organized as follows. The experimental data and descriptive analysis are introduced in Sect. 2. Section 3 presents the definition of the longitudinal concordance correlation (LCC) based on fixed effects and variance components in a multiple mixed-effects regression model. Section 4 considers the estimation of LCC from the fixed effects and variance components estimates. Section 5 presents bootstrap confidence intervals for the LCC and the performance of the approach is considered in a small simulation study in Sect. 6. Section 7 illustrates the application of this LCC methodology to the papaya hue color example, considering fruit classification and maturation studies in the postharvest phase. Section 8 presents a discussion on the advantages of the LCC, and some caveats on using this methodology and associated bootstrap confidence intervals. Finally, Sect. 9 gives some concluding remarks. All statistical computing and graphics have been performed using the free software environment R (R core Team 2015).

EXPERIMENTAL DATA AND DESCRIPTIVE ANALYSIS
An observational study was conducted in the Vegetable Production Department at "Luiz de Queiroz" College of Agriculture/University of São Paulo in 2010/2011 to evaluate the peel color of 20 papaya cv. 'Sunrise Solo' over time. The fruits were harvested on the same day and all of them presented up to 10% yellow peel color. They were labeled and stored in a cool chamber at 18 • C and 80% ± 5% relative humidity.
A flat-bed scanner (HP Scanjet G2410) with 200 pixels per inch resolution and a tristimulus colorimeter Minolta CR-300 (Konica Minolta 2003) were used to measure the fruits' peel color. In order to minimize the effects of shade and provide a dark background for the image, the scanner was covered with a cardboard box coated with black color fabric before the digitalization. The colorimeter was calibrated daily before the assessments, following the procedures in the manual (Minolta 1991), and the C I E LCh color space was used to describe the luminosity (L * ), chroma (C * ), and hue (h * ). Each day, after calibration, four predetermined equidistant points on the papaya's equatorial region were observed with the colorimeter and then both sides of the fruit (labeled sides A and B) were scanned, which provided information on the whole peel region.
The processing of these images used the GNU Image Manipulation Program (GIMP) (Kimball et al. 2014) for two main aspects: i) the fruit's equatorial region was obtained from a longitudinal slice of the original images from both sides, A and B, and these were stacked and saved as a single image ( Fig. 1a; ii) sides A and B of the whole region were stacked and saved as a single image to represent the whole peel region (Fig. 1b). Consequently, the equatorial region obtained from the original scanned images is a subset of the total. Afterward, the separation of fruit peel from the black background was made using the autoThreshold function of the rtiff in R (R core Team 2015). The autoThreshold uses a simple and automatic method proposed by Ridler and Calvard (1978) for identifying equatorial and whole regions of the peel, as shown in Fig. 1a, c, respectively. We observed approximately  1,000 pixels on the equatorial region and 10,000 pixels over the whole region of the fruit's peel using both scanned sides.
In the following, as the scanner uses the standard RG B (s RG B) rather than C I E LCh color space, the pixels of each image need to be converted from RG B to C I E LCh color space. This is done by using specific functions available in the colorspace package (Ihaka et al. 2015) of R. Oliveira et al. (2017) undertook a calibration study using 297 different color standards of Munsell's color charts for vegetable tissues (Munsell Color Company 1952) in order to correct the values of L * , C * , and h * obtained by the scanner.
Here, the hue variable will be used to illustrate the statistical methodology since this is frequently used in classification procedures and to establish ripening curves for various fruits, such as mango, papayas, and bananas (Silva-Ayala et al. 2005;Sancho et al. 2010). As the hue variable is an angle, it needs to be treated as a circular variable and the circular mean should be used to summarize the data (Fig. 2) (Fisher 1993;Mardia and Jupp 2000).
Note that because of some problems with fungal diseases, such as anthracnose (Colletotrichum gloeosporioides Penz.), and stem-end rot (Mycosphaerella sp.) on the fruit's peel during the conduct of the experiment, some fruits did not have a complete set of responses, see Fig. 3 where some profiles terminate before day 14 and correspond to dropout. The form of missingness in this experiment is classified as missing at random (MAR), because the probability of dropout may depend on the observed responses, but, given the observed responses, is conditionally independent of the missing data . Therefore, here the MAR mechanism can be treated as ignorable, because the probability that Y i jlk is observed does not depend on missing observations. In general, the papaya fruits showed a gradual and non-uniform color change over time from green (h * ≈ 115), with some yellow bands at the style region, to yellowish orange (h * ≈ 85) (Fig. 3). This is in line with the experiences of Sancho et al. (2010) and Martins et al. (2014), who have also studied the peel color of papaya fruit.

THE MULTIPLE MIXED-EFFECTS REGRESSION MODEL FOR LONGITUDINAL DATA
Let y i jlk be a realization of random variable Y i jlk measured on the ith fruit (i = 1, 2, . . . , N ) by the jth method ( j = 1, 2, . . . , m) on the lth region (l = 1, 2, . . . , r ) at time t ik (k = 1, 2, . . . , n i ), where n i is the total number of observations taken on the ith fruit over time. As an initial model we can consider a linear regression which represents a polynomial regression of degree p over time. Generally, the maximum value considered for p is 4, because the use of high order polynomials ( p > 4) may lead to over-fitting. The time variable, t, is assumed to be on a predetermined scale, such as, week, day, month, year. In this simplest case, the errors ε i jlk are assumed to be independent and normally distributed with mean zero and variance σ 2 ε . However, this independence assumption may not be reasonable for longitudinal repeated-measures data.
An alternative is to include in the model a random variable that indicates the influence of repeated measurements on individual i, that is, we assume that observations from the same individual are correlated. This variable is called a random effect and its population distribution is assumed to be normal with mean 0 and covariance matrix G. Thus, the linear mixed-effects model, including an unstructured covariance matrix for the random effects, is given by where h = 1, 2, . . . , q, q + 1, . . . , p is an index identifying the degree of the linear mixedeffects polynomial model, with q ≤ p; y i jlk is the response of ith fruit measured by the jth method on region l at time k; t ik is the time, in days since harvesting, at which the ith individual is observed; β jl = β 0 jl , β 1 jl , ..., β pjl T is a ( p + 1)-dimensional vector of fixed effects for the jth method on the lth region; dimensional vector of random effects with mean vector 0 and covariance matrix G; i is a (mrn i )-dimensional vector assumed to be independent for different i, j, l and k and independent of the random effects b i with mean vector 0 and (diagonal) covariance matrix R i . In this case, the independence among errors is much more reasonable because the individual longitudinal dependence is accounted for by the random effects. Under model (2), we can assume two different forms for the random effects: i) constant over time or ii) non-constant over time. An example of constant random effects over time is the random intercept model ( Sometimes, even when assuming non-constant random effects over time, it is necessary to include a time-dependent variance function in the model. As in Pinheiro and Bates (2000), we can generalize the variance function to where g(.) is a variance function assumed continuous in δ; t ik is the time covariate and δ jl is a vector of variance parameters for observations measured by the jth method on the lth region. Further details on the possible forms of variance functions can be seen in Pinheiro and Bates (2000).
The assumption of heteroscedasticity between different combinations of method and region is necessary in this context because there are subpopulation differences (equatorial and whole regions) as well as different numbers of raw observations for calculating the mean hue (four points for colorimeter; 1000 for the scanner on the equatorial region; and 10,000 for the scanner over the whole region).
The intercepts, time effects, and interactions that are related to the method and region under the model (2) are given by β h jl = β h + ϕ h j + γ hl + λ h jl , where β h is the overall polynomial coefficient; ϕ h j , γ hl , and λ h jl are the method-time, region-time, and methodregion-time interactions, respectively. Note that when h = 0, ϕ 0 j and γ 0l are the main effects of method and region, respectively. To ensure identifiability of the fixed effects, it is assumed that ϕ h1 = γ h1 = λ h1l = λ h j1 = 0, for j = 1, 2, . . . , m, l = 1, 2, . . . , r .

MODEL EXTENSIONS
The hue change in papaya fruit depends on several variables that were not measured before the beginning of experiment described in Sect. 2, such as fruit position on the plant, temperature and relative humidity, time until harvest of fruit, plant and/or fruit diseases, sun exposure time. These unobserved variables can influence directly, or indirectly, the physiological processes of the fruit and consequently its initial hue, justifying the inclusion of a random intercept in the model. Furthermore, as we take measurements on different fruit regions, it is reasonable that we assume these unobserved variables can differentially affect the response variable given the locality of the observed point on the peel. In particular, this assumption is necessary because papaya has an uneven ripening process, starting at the style and moving toward the peduncle insertion regions, forming yellowish-hue bands.
To estimate this additional variability associated with the interaction between method and region, we create a new variable A with mr categories given by the combination of region and method levels. Then the mr − 1 associated dummy variables for fruit i at time k are defined as follows: In this way, we can extend model (2) to T is a (q + mr)-dimensional vector of random effects with mean vector 0 and covariance matrix D; and is a matrix of zeros. Model (4) may be much more reasonable than model (2) because the individual influence on the intercept due to the interaction of method and region is also removed from the error term. Model (4) is equivalent to a general linear mixed model defined by Verbeke and Molenberghs (2000) as where is the associated vector of measurements for fruit i, X i (t ik ) and Z i (t ik ) are design matrices with ( p + 1)mr and q + mr covariates for fruit i at time t ik with dimensions mrn i ×( p +1)mr and mrn i ×(q +mr), respectively. Hence, under includes random coefficients of the model such as random intercept, or random intercept and slope. Marginally, the vector Y i is normally distributed with mean X i (t ik ) β and covariance matrix V ar

THE LONGITUDINAL CONCORDANCE CORRELATION
Let Y 1 and Y 2 be random variables independently selected from a bivariate population with means μ 1 and μ 2 , Var (Y 1 ) = σ 2 1 , Var (Y 2 ) = σ 2 2 and Cov (Y 1 , Y 2 ) = σ 12 . The concordance correlation coefficient (CCC) introduced by Lin (1989) is defined as where |ρ c | ≤ 1, E I is the expectation assuming that Y 1 and Y 2 are uncorrelated. ρ p = σ 12 σ −1 1 σ −1 2 is the Pearson correlation coefficient (|ρ p | ≤ 1) that measures how far each observation deviated from the best-fit line (precision measure), and C b is the accuracy (0 < C b ≤ 1) that measures how far the best-fit line deviates from the 45 • line through the origin. Lin (1989) √ σ 1 σ 2 is a location shift relative to the scale. Now suppose that the ith fruit is measured n i times by each of m methods on r regions and that the researcher wants to investigate the agreement between observations measured by different methods on the same region, by different methods and regions, and by the same method on distinct regions of the fruit's peel. Here we denote y i jlk and y i j l k as general realizations of random variables Y i jlk and Y i j l k measured from different unique combination of two factors (method and region) at time t ik , with j > j = 1, 2, . . . , m, and l > l = 1, 2, . . . , r . Thus, under the model (4), we can define the LCC based on variance components for observations measured from different unique combinations of two factors at time t ik as Let z i jlk and z i j l k be, respectively, rows of Z i (t ik ) matrix such that z i jlk = (t ik , d ik ) and z i j l k = t ik , d ik , where t ik = t 0 ik , t 1 ik , . . . , t q ik and d ik and d ik are the dummy variables for the two method-region combinations ( jl and j l ). Thus, the covariance between Y i jlk and Y i j l k is given by If we assume different variances for each unique combination between method and region, the variances of Y i jlk and Y i j l k can be expressed as On the other hand, the systematic difference between the mean responses for Y i jlk and Y i j l k given by S jl, j l (t ik ) = E Y i jlk − E Y i j l k = μ jl (t ik ) − μ j l (t ik ) reduces to S jl, j l (t ik ) = t ik β jl − β j l , with h = 1, 2, ..., p and jl = j l .
Thus, replacing the expressions (8), (9), and (10) in (7) and considering all individual fruits that were assessed at time t k , we have that the LCC is given by where jl, j l (t k ) and C jl, j l (t k ) are, respectively, the longitudinal Pearson correlation and longitudinal accuracy based on variance components for each unique combination of two factors, j > j = 1, 2, . . . , m and l > l = 1, 2, . . . , r .
The ρ ( p) jl, j l (t k ) measures how far each observation deviated from the best-fit line at a fixed time t k = t and is given by The accuracy C jl, j l (t k ) is a longitudinal bias correction factor that measures how far the best-fit line deviates from the 45 • line at a fixed time t k = t and is given by denotes the scale shift and denotes the location shift relative to the scale (Lin 1989). When Var Y i jlk = Var Y i j l k and E Y i jlk = E Y i j l k there is no deviation from the 45 • line. It is worth noting that, as often studies will use two or more methods to measure a characteristic on the same group of subjects, these measurements will tend to be positively correlated (Barnhart and Williamson 2001;King et al. 2007b). Here we may expect similar positive association between measurement methods and hue color (ripeness) at different regions of the fruit. Thus, it is plausible to assume that in most situations 0 ≤ ρ jl, j l (t) ≤ 1.

ESTIMATION OF THE LCC USING VARIANCE COMPONENTS
To estimate the fixed effects and variance components of the linear mixed-effects model, we use the restricted maximum likelihood approach. The log-likelihood function to maximize is proportional to where ψ u and ψ denote, respectively, vectors of variance components of the D and R i matrices; and r = ( y − X (t ik ) β) is a residual vector. It is worth noting that when the data are missing at random, or missing completely at random, the missing data mechanism may be ignored and the resulting likelihood function remains valid (Fitzmaurice et al. 2009).
Since ρ jl, j l (t ik ) is a function of the fixed effects (β) and variance components (ψ u , ψ ), it can be estimated by replacing β, ψ u , and ψ by their REML estimates.

NONPARAMETRIC BOOTSTRAP CONFIDENCE INTERVALS
Here we consider a simple case-resampling bootstrap, an alternative would be to use a model-based residual resampling approach. Initially, we generate B pseudo-samples by resampling from the data then refit the model in order to obtain B sets of estimates for all parameters of the LCC. Then for each pseudo-sample, for a specific time t k = t we use the estimated values to obtain ρ (b) jl, j l (t) and apply the Fisher Z-transformation 1 2 log 1+ρ 1−ρ to provide values that may be approximately normally distributed. The expected value of the Z-transformed bootstrap estimator is calculated from the B bootstrap samples at time t as and the standard deviation of the bootstrap distribution of ρ * jl, j l (t) is given by An approximate level 1 − α bootstrap confidence interval for the LCC ρ jl, j l on the original scale is [LB, UB], where with z α 2 and z (1− α 2 ) denoting the α 2 and 1 − α 2 quantiles of the standard normal distribution. When N is small the shape of the sampling distribution of ρ * jl, j l may look non-normal, then a more appropriate bootstrap CI approach should be based on the percentile method because the discrepancy between the CI based on S E * jl, j l and CI based on percentiles increases with decreasing N (Thai et al. 2013).
The inference about ρ ( p) jl, j l can be performed in a similar way as presented for LCC. However, the inference about C jl, j l is a little different because it belongs to the interval [0, 1], consequently we can use the logit transformation log C jl, j l (t k =t) 1−C jl, j l (t k =t) instead of Fisher Z-transformation to approximate the distribution of C jl, j l (t k = t) to the normal distribution.

SIMULATION STUDY
To evaluate the LCC approach, we conducted a simulation study to investigate its estimation performance under variance components approaches for papaya ripeness over different scenarios. We are specifically interested in examining the scaled mean error (SME), as a bias measure, and root mean square error (RMSE), as an accuracy measure, for LCC. We also examine the coverage of simultaneous confidence intervals for S jl, j l (t k ) = μ jl (t k ) − μ j l (t k ) and LCC(t k ). To simplify the study, we restrict attention to m = 2 methods, r = 2 regions, and k = 16 repeated measures (days) and we evaluate the LCC for the same methods over different fruit regions, as well as among methods on the same regions of the fruit; we label these cases as Examples 1, 2, 3, and 4.
We also consider two trajectories per fruit for N ∈ {20, 50, 100} fruits using both balanced and unbalanced designs, where the missing value pattern used was (monotone) dropout. The data for unbalanced designs are initially simulated as in a balanced design and then we create an indicator variable (I rm ) for the total number of repeated measures on the i-th fruit. We suppose that this variable has a Poisson distribution with mean and variance  (2017). We use the methodology of Sect. 5 with B = 2000 bootstrap replications to compute the simultaneous confidence intervals with 1 − α = 0.95 based on the Fisher Z-transformation to approximate the distribution of ρ jl, j l (t k ) by the normal distribution. The process of simulating data and constructing confidence intervals is repeated 2000 times, and we computed the coverage probability for each interval. Table 1 presents the overall mean of estimated SME and RMSE for the LCC in examples 1, 2, 3, and 4 for balanced and unbalanced designs. We see a small SME under both designs for N = 20, 50, or 100 for all examples; however, as would be expected the estimator of ρ jl, j l (t k ) is more biased in the unbalanced design than in the balanced one.
Furthermore, the RMSE indicates that the estimator of LCC is closer to the true value as the number of individuals (N ) increases, but also demonstrates a satisfactory performance even in conditions of unbalance and a low number of individuals.
The estimated coverage rate for the simultaneous confidence intervals around ρ jl, j l (t k ) and S jl, j l (t k ) at a fixed time t k = t is graphically summarized for both experimental designs in Fig. 4.
In the balanced case, we see that LCC bands still have a slight tendency over time to stay below the nominal coverage for N = 20; however, this phenomenon disappears as the sample size is increased. Interestingly, even in a severely unbalanced condition there is no appreciable difference in the coverage rate between balanced and unbalanced cases for LCC. This indicates that the methodology is robust to the imbalance conditions in the data. On the other hand, in both experimental designs the estimated coverage rates for systematic differences (S jl, j l (t k )) appear to decrease a little as we increase the number of fruits. In general, the coverage rate is acceptable even for a sample size of 20 and good for 100 subjects.

PAPAYA'S PEEL HUE DATA ANALYSIS
Initially, a quadratic growth model with a random effect of fruit was fitted for each combination between method ( j = 1, 2) and region (l = 1, 2), where j = 1 refers to the colorimeter while j = 2 to the scanner, while l = 1 refers to the equatorial region and l = 2 to the whole region. However, for computational reasons time was transformed to a log scale, τ ik = log (t ik + 1).
In sequence, a model selection procedure was performed from the model of expression (4), resulting in the following model: where d i1k and d i2k are dummy variables equal to 1 for observations measured by the scanner on equatorial region and whole region, respectively, and zero otherwise, as described in Sect. 3.2. In this model, the conditional residuals were assumed to be independent with constant variance over time and we checked this assumption from a variogram constructed using these residuals, suggesting that it was adequate for the whole region. On the other hand, the variability of observations measured on the equatorial region increased faster over time than observations measured on the whole region of the fruit. To solve this, a variance function using time as covariate was included in the model, which can be represented by Var i jlk = σ 2 exp 2δ jl τ ik for observations measured by the jth method on lth region, where the parameter δ jl is unrestricted enabling the variance to increase or decrease over time (Pinheiro and Bates 2000). As variances remained constant over time for observations measured on the whole region, we fixed δ 22 = 0, which represents the scanner on the whole region.
The parameter point estimates and respective 95% confidence intervals are presented in Table 2. The intercept as well as linear and quadratic coefficients were different between methods and sampled regions, which indicates that the colorimeter and scanner, as well as the equatorial and whole regions, differed in mean hue quantification.
As E Y i jlk = β 0 jl + β 1 jl τ ik + β 2 jl τ 2 ik + β 3 τ 3 ik = E Y i j l k for j > j = 1, 2, . . . , m, and l > l = 1, 2, . . . , r , the systematic differences (S 2 jl, j l (t ik )) between methods by region were different from zero and vary over time t ik . This difference was the smallest between observations measured by the colorimeter and by the scanner on the equatorial region (Fig. 5), indicating greater accuracy between these compared to other combinations.
Variance components as well as their 95% confidence intervals presented in Table 2 show that the variability of observations measured on papaya skin at the beginning of the experiment by the colorimeter was less than that by the scanner, with variances Var (Y i1l1 ) = σ 2 b 0 for the colorimeter and Var (Y i2l1 ) = σ 2 b 0 +σ 2 α for the scanner. This showed that a sample of four points on the equatorial region for the colorimeter was not representative of this region at the beginning of the experiment. At this point the papaya's ripening process has led to larger proportions of green hues than yellowish-green or yellow hues on the equatorial region, resulting in a distribution more concentrated around the mean hue when the fruit was assessed by the colorimeter in relation to scanner. Similar results can be verified in studies conducted by Bron and Jacomino (2006), Schweiggert et al. (2011) andChávez-Sánchez et al. (2013), who also showed a low variability among fruits measured by the colorimeter on  the equatorial region at start of experiments. Therefore, it is necessary that a greater number of points be sampled in this region with the colorimeter in order to obtain a representative sample, especially at initial ripeness stages.
The correlation parameter ρ α = σ α 12 /σ 2 α included in the model to quantify the relationship between observations measured by the scanner on the equatorial and whole regions showed a positive and high correlation (ρ α =0.85). It was a indicative of high precision among these measurements and makes sense because the equatorial region was represented by a percentage of pixels belonging to the whole region, in such a way that, as we increase this percentage, the Pearson correlation will tend to 1.
Under the fixed effects and variance components of the model described in (13), the longitudinal concordance correlation (LCC), longitudinal Pearson correlation (LPC), and longitudinal accuracy (LA) were estimated for each pair of unique combinations among levels of method and region, as well as their 95% bootstrap confidence intervals. Here we assumed that the scanner was the standard methodology because from it one we captured images that provided information on the whole region of the fruit's peel. We graphically summarize these functions and their respective confidence intervals for each case as follows: (i) Case 1 (Fig. 6): agreement among observations measured on the equatorial region by the scanner and colorimeter; (ii) Case 2 (Fig. 7): agreement among observations measured on the equatorial region by the colorimeter and on the whole region by the scanner; (iii)Case 3 (Fig. 8): agreement among observations measured on the equatorial and whole regions by the scanner. Figure 6 shows that the LCC and LPC between observations measured on the equatorial region by the scanner and colorimeter were the smallest at the beginning (0.57 and 0.62 respectively) increasing over time to approximately 0.92. On the other hand, the LA always had estimated values above 0.81 over time, therefore, the only moderate agreement at the beginning was caused by a lower precision. Consequently, despite the lower precision between them, we can conclude that values of mean hue obtained by the colorimeter were close to those obtained by the scanner on the equatorial region. Moreover, it also indicated that the topography and curved surface of the papaya fruit did not affect, or only slightly affected, the hue measurements taken on its peel by the scanner. Similar results were observed by Mendoza et al. (2006), who verified that the hue color was not affected by the topography and curved surface of green and yellow bananas, or red peppers, when measured by a digital camera.
Although confidence intervals for the LPC for Cases 1 (Fig. 6) and 2 (Fig. 7) are similar, the confidence intervals for LA between them over time were different, indicating that the mean hue calculated from observed values by the scanner over the whole region was not close to the observed values from the colorimeter on the equatorial region, especially over the period 0 to 7 days after start of the experiment.  Similar results were also observed between Cases 2 and 3 (Fig. 8) where the equatorial and whole regions were obtained by the scanner from a common image of both sides of the fruit's peel, reinforcing the conclusion that the whole region of the fruit's peel cannot be represented by sampling only on the equatorial region. Therefore, ideally image analysis of whole fruit's region should be used to compute the mean hue.
Another important characteristic observed from Figs. 6, 7, and 8 was that the LCC increased with time, showing a non-uniform ripening process, even when observed only on the equatorial region (Fig. 6). Many studies such as Bron and Jacomino (2006), Sancho et al. (2010), and Schweiggert et al. (2011) observed that the color changes were non-uniform from green to yellow/orange, as the yellow and orange colors on the peel are directly associated with production of carotenoids such as β-cryptoxanthin, β-carotene, and lycopene. Thus, it is plausible to assume that the chlorophyll degradation as well as the production of these carotenoids is also non-uniform over the peel surface.

DISCUSSION
Our main motivations for this study were to verify if a colorimeter reproduces a scanner and if a sample on the equatorial region can reproduce observation of the whole region in measuring the mean hue on papaya peel. For this, we proposed a LCC for assessing the agreement of paired continuous measurements over time associated with different level combinations of two factors (method and region) based on a mixed-effects regression model. Thus, the main contribution of this paper was describing the longitudinal agreement profile considering different forms for random effects rather than summarizing it in a single coefficient, as proposed by Carrasco et al. (2009).
Evaluating the longitudinal agreement profile can help the researcher to decide in which period the new method reproduces the observed values by the standard method. In Fig. 6, for example, if we consider that the ideal agreement is given by a lower band of a confidence interval higher than 0.80, then the colorimeter methodology may be used on the equatorial region from the 12th day after harvest. On the other hand, when we used the methodology proposed by Carrasco et al. (2013), we obtained a 95% confidence interval for the CCC rm between 0.7069 and 0.9041, consequently the conclusion would be that the colorimeter may not be used to calculate the mean hue on the fruit's equatorial region, when in fact we demonstrated that it could be used in a specific period.
Clearly, the experimental design and study objectives should be taken into account before choosing which method we should use to calculate the agreement between pairs of observations over time. Furthermore, in longitudinal designs, the number and spacing of measurements and missing data mechanisms should also be taken into account. In this way, in further studies we intend to extend this methodology to other forms of missingness. Also, the minimum number of time points that are necessary to use this LCC methodology should not be two, or even three, because this number depends on the change pattern of the response variable over time. Increasing the number of time points, provided that they are not concentrated near a specific time, allows us to use different types of polynomial growth curve models, which can directly influence the calculation of LCC.
Here we treated method and region as fixed effects; however it is possible to extend this methodology to consider the method, region, or both, as random effects. Chen and Barnhart (2013), for example, have already proposed extensions to CCC and the intraclass correlation coefficient based on three-way random-effects ANOVA, discussing about when the observer (or method), time, or both, should be treated as fixed or random effects. Furthermore, as we consider that conditional errors are independent, a possible extension to this methodology could be include a more complex structure in the error variance-covariance matrix, such as autocorrelation functions. This procedure has been discussed by Rathnayake and Choudhary (2017), who compared independent errors, a first order autoregressive process, a first order moving average process and a compound symmetry model.
Model selection was made from the likelihood ratio test for nested models, and AIC or BIC for non-nested models, to get a more parsimonious description for the data and, consequently, more reliable inference for the LCC. The LCC estimation was based on the REML approach, while 100(1 − α)% confidence intervals were obtained using a nonparametric bootstrap approach with case resampling (10,000 bootstrap replicates), as described in Sect. 5. The amount of CPU time required to produce these samples was approximately 3,5 hours on a laptop computer with an Intel Core TM 2.40GHz i7 processor and 4GB of RAM. Moreover, only two convergence problems were encountered. However, note that if the model assumptions are not fulfilled, or if they are not correctly specified, the estimates of LCC as well as its bootstrap confidence intervals may be biased.
The simulations indicated a satisfactory performance of LCC for 20 or more individuals, with robustness to problems caused by dropouts. Although the proposed simulation model assumed independent errors, it can be extended to include a serially autocorrelated error term. In this sense, Rathnayake and Choudhary (2017) proposed a longitudinal concordance correlation based on a semiparametric model under various autocorrelation function assumptions.

CONCLUSIONS
In this article, we developed a longitudinal concordance correlation as well as its 100 (1 − α) bootstrap confidence interval based on the fixed effects and variance components estimates from a multiple mixed-effects polynomial regression model. Its performance was satisfactory as was demonstrated based on simulation study. Further work will focus on developing a LCC for other combinations of random or fixed effects for method, region, or both.
In the substantive application the use of LCC, as well as LPC and LA, showed that sample points only on the equatorial region were not representative of the whole peel region, suggesting that image analysis of the whole peel region should be used to compute the mean hue. Moreover, the LA between observations measured by the colorimeter and scanner on the equatorial region suggested that the topography and curved surface of papaya fruit did not affect the mean hue obtained by the scanner.