Genomic selection in synthetic populations

Müller, Dominik

dc.contributor.advisor	Melchinger, Albrecht E.	de
dc.contributor.author	Müller, Dominik	de
dc.date.accepted	2018-03-25
dc.date.accessioned	2024-04-08T08:56:44Z
dc.date.available	2024-04-08T08:56:44Z
dc.date.created	2019-01-28
dc.date.issued	2017
dc.description.abstract	The foundation of genomic selection was laid at the beginning of this century. Since then, it has developed into a very active field of research. Although it was originally developed in dairy cattle breeding, it rapidly attracted the attention of the plant breeding community and has by now (2017) become an integral component of the breeding armamentarium of international companies. Despite its practical success, numerous open questions remain that are highly important to plant breeders. The recent development of large-scale and cost-efficient genotyping platforms was the prerequisite for the rise of genomic selection. Its functional principle is based on information shared between individuals. Genetic similarities between individuals are assessed by means of genomic fingerprints. These similarities provide information beyond mere family relationships and allow information from phenotypic data to be pooled across individuals. In practice, a training set of phenotyped individuals is first established and then used to calibrate a statistical model. The model is then used to predict the genomic values of individuals lacking phenotypic information. Using these predictions can save time by accelerating the breeding program and reduce costs by lowering the resources spent on phenotyping. A large body of literature has been devoted to investigating the accuracy of genomic selection for unphenotyped individuals. However, in plant breeding the training individuals are often selection candidates themselves, and there is no conceptual obstacle to applying genomic selection to them, making use of information obtained via marker-based similarities. It is therefore also highly important to assess prediction accuracy in the training set and the possibilities for improving it. 
Our results demonstrated that accuracy in the training set can be increased by shrinkage estimation of marker-based relationships, which reduces the associated noise. The success of this approach depends on the marker density and the population structure. The potential is largest for broad-based populations and under a low marker density. Synthetic populations are produced by intermating a small number of parental components, and they have played an important role in the history of plant breeding, both for improving germplasm pools through recurrent selection and for developing actual varieties and conducting research on quantitative genetics. The properties of genomic selection have so far not been assessed in synthetics. Moreover, synthetics are an ideal population type to assess the relative importance of the three factors by which markers provide information about the state of alleles at QTL, namely (i) pedigree relationships, (ii) co-segregation and (iii) LD in the source germplasm. Our results show that the number of parents is a crucial factor for prediction accuracy. For a very small number of parents, prediction accuracy in a single cycle is highest and mainly determined by co-segregation between markers and QTL, whereas prediction accuracy is reduced for a larger number of parents, where the main source of information is LD within the source germplasm of the parents. Across multiple selection cycles, information from pedigree relationships rapidly vanishes, while co-segregation and ancestral LD are stable sources of information. Long-term genetic gain of genomic selection in synthetics is relatively unaffected by the number of parents, because information from co-segregation and from ancestral LD compensate for each other. Altogether, our results contribute to a better understanding of the factors underlying genomic selection, of the cases in which it works, and of the information that contributes to prediction accuracy.	en
dc.description.abstract	Die jüngste Entwicklung von großen, kosteneffizienten Genotypisierungsplattformen stellt eine Grundvoraussetzung für den Erfolg der genomischen Selektion dar. Das funktionale Prinzip beruht auf der Ausnutzung von Informationen zwischen Individuen. Vorhandene genetische Ähnlichkeiten werden durch den genomischen Fingerabdruck erfasst. Diese Ähnlichkeiten liefern Informationen, die über die reinen Verwandtschaftsverhältnisse hinausgehen, und erlauben die Ausnutzung phänotypischer Daten über Individuen hinweg. In der Praxis muss zunächst ein Kalibrierungsdatensatz mit phänotypisierten Individuen erstellt werden, der zur Schätzung eines statistischen Modells dient. Dieses Modell wird hernach eingesetzt, um Vorhersagen über den genomischen Wert von Individuen ohne phänotypische Daten zu treffen. Die Verwendung dieser Vorhersagen kann Zeit einsparen, indem das Zuchtprogramm beschleunigt wird, aber auch durch eine Verringerung der zur Phänotypisierung eingesetzten Ressourcen Kosten senken. Die Untersuchung der Vorhersagegenauigkeit genomischer Selektion innerhalb nicht phänotypisierter Individuen war bereits Gegenstand zahlreicher Forschungsarbeiten. Bei den Trainingsindividuen zur Kalibrierung des Modells handelt es sich in der Pflanzenzüchtung jedoch häufig ebenfalls um potentielle Selektionskandidaten, und es existiert kein prinzipielles Hindernis, genomische Selektion ebenso auf diese anzuwenden und die Information von markerbasierten Ähnlichkeiten auszunutzen. Daher ist es wichtig, die Vorhersagegenauigkeit sowie deren Verbesserungsmöglichkeiten im Trainingsdatensatz zu prüfen. Unsere Ergebnisse zeigen, dass es grundsätzlich möglich ist, durch Schrumpfungsschätzung von markerbasierten Verwandtschaften deren Störsignale zu vermindern und die Genauigkeit im Trainingsdatensatz zu steigern. Dabei hängt der Erfolg von der Markerdichte und der Populationsstruktur ab. Das Potential ist am größten für breite Populationen bei einer geringen Markerdichte. 
Synthetische Populationen werden durch Kreuzung einer geringen Anzahl an elterlichen Komponenten erzeugt und haben in der Geschichte der Pflanzenzüchtung eine wichtige Rolle gespielt. Dies betrifft sowohl die Verbesserung des Zuchtmaterials durch rekurrente Selektion als auch die Erstellung von Sorten sowie die quantitativ-genetische Züchtungsforschung. Die Eigenschaften genomischer Selektion wurden bisher nicht in Synthetiks untersucht. Zudem handelt es sich bei Synthetiks um einen idealen Populationstyp, um die Bedeutung der drei Faktoren zu untersuchen, durch welche Marker Informationen über den Zustand an QTL liefern, nämlich (i) Verwandtschaftsverhältnisse, (ii) Kosegregation und (iii) Kopplungsphasenungleichgewicht (LD) im Zuchtmaterial. Unsere Ergebnisse zeigen, dass die Elternzahl einen entscheidenden Faktor für die Vorhersagegenauigkeit darstellt. Bei einer sehr geringen Elternzahl ist die Vorhersagegenauigkeit innerhalb eines Zyklus am größten und wird hauptsächlich durch Kosegregation zwischen Markern und QTL bestimmt. Ist die Elternzahl hingegen groß, so tritt als vornehmliche Informationsquelle LD im Ursprungsmaterial der Eltern hervor. Wird genomische Selektion über mehrere Zyklen hinweg praktiziert, so verschwindet die Information aus Verwandtschaftsverhältnissen sehr schnell, wohingegen sich Kosegregation und LD als stabile Informationsquellen erweisen. Der langfristige Selektionserfolg genomischer Selektion in einem Synthetik ist nur in einem geringen Maße abhängig von der Elternzahl, da sich Informationen aus Kosegregation und LD gegenseitig aufwiegen. Insgesamt liefern unsere Ergebnisse einen wichtigen Beitrag für ein besseres Verständnis der Grundlagen der genomischen Selektion, in welchen Fällen sie Erfolg verspricht und welche Informationen die Vorhersagegenauigkeit beeinflussen.	de
dc.identifier.swb	516636324
dc.identifier.uri	https://hohpublica.uni-hohenheim.de/handle/123456789/6341
dc.identifier.urn	urn:nbn:de:bsz:100-opus-15654
dc.language.iso	eng
dc.rights.license	cc_by	en
dc.rights.license	cc_by	de
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/de/
dc.subject	Genome	en
dc.subject	Selection	en
dc.subject	Synthetic	en
dc.subject	Prediction	en
dc.subject	Synthetik	de
dc.subject.ddc	630
dc.subject.gnd	Genom	de
dc.subject.gnd	Auslese	de
dc.subject.gnd	Population	de
dc.subject.gnd	Prognose	de
dc.title	Genomic selection in synthetic populations	de
dc.title.dissertation	Genomische Selektion in synthetischen Populationen	de
dc.type.dcmi	Text	de
dc.type.dini	DoctoralThesis	de
local.access	unrestricted access	en
local.access	uneingeschränkter Zugriff	de
local.bibliographicCitation.publisherPlace	Universität Hohenheim	de
local.export.bibtex	@phdthesis{Müller2017, url = {https://hohpublica.uni-hohenheim.de/handle/123456789/6341}, author = {Müller, Dominik}, title = {Genomic selection in synthetic populations}, year = {2017}, school = {Universität Hohenheim}, }
local.faculty.number	2	de
local.institute.number	350	de
local.opus.number	1565
local.title.full	Genomic selection in synthetic populations
local.university	Universität Hohenheim	de
local.university.faculty	Faculty of Agricultural Sciences	en
local.university.faculty	Fakultät Agrarwissenschaften	de
local.university.institute	Institute for Plant Breeding, Seed Science and Population Genetics	en
local.university.institute	Institut für Pflanzenzüchtung, Saatgutforschung und Populationsgenetik	de
thesis.degree.level	thesis.doctoral

Files

Original bundle

Name: PhD_Thesis_Dominik_Mueller_Final.pdf
Size: 561.91 KB
Format: Adobe Portable Document Format
Description: Open Access Fulltext

Collections

Institut für Pflanzenzüchtung, Saatgutforschung und Populationsgenetik