Design evaluation and predictive accuracy of multi-environment trials in plant breeding
(2025) Gudata, Diriba Tadese; Piepho, Hans-Peter
In plant breeding, the predictive accuracy of genotype means in the target population of environments (TPE) can be improved through proper experimental design and statistical analysis. During experimentation, blocking and randomization are expected to handle the major sources of heterogeneity in the field. When heterogeneity exists in both directions, across rows and columns, two-way blocking is necessary to ensure homogeneity within blocks. Several trials need to be conducted in the TPE to generalize information. The TPE can be divided into zones, which allows information to be borrowed between zones when genotypes are fitted as random and permits zone-specific recommendations. Multi-environment trial (MET) data can be analysed in one stage or stage-wise; in the latter case, information from individual trials is forwarded to the next stage of analysis. Linear mixed models (LMM) are commonly used in MET data analysis. Furthermore, auxiliary information from the locations, particularly soil and weather data, can be integrated into the MET analysis as environmental covariates (ECs) to improve predictive accuracy. In general, the objective of this thesis was to improve the predictive accuracy of models for MET data using different approaches to integrating ECs and pedigree information. Spatial model selection and design evaluation were conducted in the second chapter using existing MET data from the dry lowland sorghum breeding program of Ethiopia. A randomization-based baseline model and the same model augmented with linear variance or exponential spatial structures were compared in partially replicated and fully replicated row-column designs using the Akaike information criterion (AIC). Adding a two-dimensional nonlinear spatial model plus nugget to the baseline model improved the fit in many trials. In addition, the randomization-based plus two-dimensional linear variance model was also a good candidate model.
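The per-trial comparison of candidate spatial models by AIC can be illustrated with a minimal Python sketch. It assumes the usual definition AIC = 2k - 2 log L, where k is the number of estimated parameters (for REML fits with identical fixed effects, typically the variance parameters). The trial names, model labels, log-likelihoods and parameter counts below are made-up placeholders, not results from the thesis.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: smaller values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * log_lik


def best_model_per_trial(fits):
    """Return, for every trial, the name of the candidate model with the lowest AIC.

    fits maps trial -> {model name -> (log-likelihood, number of parameters)}.
    """
    best = {}
    for trial, models in fits.items():
        scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
        best[trial] = min(scores, key=scores.get)
    return best


# Hypothetical fits for two trials (illustrative numbers only).
fits = {
    "trial_A": {
        "baseline": (-120.0, 3),
        "baseline+LV": (-115.0, 5),
        "baseline+EXP+nugget": (-116.5, 6),
    },
    "trial_B": {
        "baseline": (-98.0, 3),
        "baseline+LV": (-97.5, 5),
        "baseline+EXP+nugget": (-92.0, 6),
    },
}

print(best_model_per_trial(fits))
```

With these invented inputs a different model is selected in each trial, which is the situation that motivates choosing the spatial model trial by trial.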
According to the AIC, no single model suits all trials. Therefore, fitting several spatial models and selecting the best-fitting one per trial could be a solution. The current design practice was also evaluated in the same chapter by generating alternative designs through restructuring of the blocking units and computing their relative efficiency. The relative efficiency results indicate that most of the alternative alpha designs with block sizes of five, six, ten and fifteen, as well as the alternative row-column designs, were more efficient than the current practice. In the third chapter, a method of extracting and fitting synthetic environmental covariates (SCs) and pedigree information in the analysis of multi-location trial data was investigated. The main goal of this chapter was to compare the predictive accuracy of LMMs without pedigree information and SCs against LMMs with pedigree information, SCs, or both, when predicting genotype performance in untested locations. The SCs were extracted from the actual ECs using multivariate partial least squares (PLS) analysis. They were then fitted in the LMM with random genotype-specific coefficients. An unstructured variance-covariance matrix of the random intercept and slope(s) was used to ensure translational invariance. For the model with pedigree information, the baseline model with independent genotype effects was modified to allow correlation between genotypes through their parents. For the genotype-environment interaction (GEI) effect, the identity, diagonal and factor-analytic (FA) variance-covariance structures were considered. The mean squared error of prediction differences (MSEPD) and the Spearman rank correlation show that integrating the SCs in the MET analysis improves predictive accuracy compared to the model without SCs. Integrating SCs was beneficial under all variance-covariance structures considered for the GEI.
Modelling pedigree information also improved accuracy when the diagonal or FA variance-covariance structure was used for the genotype-environment effects. The diagonal variance-covariance structure of the GEI combined with the SCs gave the most accurate model for predicting genotype means in new locations. In Chapter 4, the predictive accuracy of different approaches to fitting ECs when predicting genotypic performance in new environments was evaluated. The kinship matrix based on ECs, reduced rank regression and extended Finlay-Wilkinson approaches were compared in predicting genotype means. Among these, the reduced rank regression approach showed the smallest MSEPD. The limitation of this approach is that singularity problems arise when the number of ECs exceeds the number of environments. For this reason, variable selection using multivariate PLS was conducted so that only the most important covariates entered the subsequent modelling. Overall, there is a substantial gain in predictive accuracy from considering ECs compared to the model without ECs. In addition, we evaluated the importance of fitting the geographic zone factor; however, the results show only a small improvement over the model without the zone factor. This result may be related to the small number of trials in some of the zones. One limitation of the data set when considering the zone effect is that only a few trials remained in the western and northern zones after removing trials with zero genotype variance in the individual trial analyses. The southern zone comprises the majority of the trials. The optimum allocation of trials to the zones was also explored based on the variance-covariances of the genotype-by-zone interactions. In Chapters 3 and 4, genotype performance in new environments was predicted using cross-validation (CV) in which one environment is dropped at a time.
This type of CV mimics prediction for new environments and assesses the uncertainty in model prediction. In conclusion, this study developed methods for improving the accuracy of models that predict genotypic performance in METs: improving design efficiency in ongoing breeding programs through post-blocking, fitting spatial models to capture field trends within an experiment, and using ECs, SCs and pedigree information.
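The cross-validation that drops one environment at a time, scored by MSEPD and Spearman rank correlation, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: MSEPD is taken here as the average, over all genotype pairs, of the squared gap between the predicted and the observed pairwise difference; the predictor is a deliberately naive placeholder (the genotype mean over the training environments), not the LMMs described in the abstract; and the data are invented.

```python
from itertools import combinations


def msepd(pred, obs):
    """Mean squared error of prediction differences over all genotype pairs (assumed definition)."""
    genos = sorted(set(pred) & set(obs))
    sq = [((pred[i] - pred[j]) - (obs[i] - obs[j])) ** 2
          for i, j in combinations(genos, 2)]
    return sum(sq) / len(sq)


def _ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return ranks


def spearman(pred, obs):
    """Spearman rank correlation: Pearson correlation of the ranks (needs non-constant inputs)."""
    genos = sorted(set(pred) & set(obs))
    rx = _ranks([pred[g] for g in genos])
    ry = _ranks([obs[g] for g in genos])
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def mean_predictor(training):
    """Placeholder predictor (an assumption): genotype mean over the training environments."""
    totals, counts = {}, {}
    for means in training.values():
        for g, y in means.items():
            totals[g] = totals.get(g, 0.0) + y
            counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


def leave_one_environment_out(data, predict):
    """Hold out each environment in turn, predict it from the rest, and score the prediction."""
    results = {}
    for held_out in data:
        training = {e: v for e, v in data.items() if e != held_out}
        pred = predict(training)
        results[held_out] = (msepd(pred, data[held_out]), spearman(pred, data[held_out]))
    return results


# Invented genotype means per environment (illustrative only).
data = {
    "env1": {"G1": 3.0, "G2": 4.0, "G3": 5.0},
    "env2": {"G1": 2.5, "G2": 4.5, "G3": 5.5},
    "env3": {"G1": 3.5, "G2": 3.0, "G3": 6.0},
}
results = leave_one_environment_out(data, mean_predictor)
for env, (err, rho) in results.items():
    print(env, round(err, 4), round(rho, 4))
```

Any model that maps training environments to predicted genotype means can be passed in place of `mean_predictor`, so competing models can be compared on the same held-out environments.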