Approachable Solution

Biosci (Thailand) Co., Ltd.

Genstat® 18th Edition

www.biosci.global


[Screenshot: Genstat interface showing the data view, the window navigator and the QTL data view]

Outline
• Basic
• Design and Sample
• Regression
• Mixed models (REML) (brief intro)
• Other statistical methods
  – Multivariate analysis
  – Survey data
  – Spatial analysis
  – Survival analysis
  – Repeated measurements
  – Meta analysis
  – Microarray data
  – QTL analysis
  – Exact tests

Statistical test flow chart

Checking assumptions
• Statistical tests (parametric)
  – One-sample: one-sample t-test
  – Two-sample: independent two-sample t-test; paired t-test
  – More than two-sample: ANOVA
• Nonparametric tests
  – One-sample: sign test
  – Two-sample: Mann-Whitney test
  – More than two-sample: Kruskal-Wallis test

Parametric Statistics

Basic assumptions of the test statistics
1. The data are normally distributed.
2. The data must be measured on an interval or ratio scale.

Source: http://intraserver.nurse.cmu.ac.th/mis/.../lec_567730_lesson_09.pdf

Case1. Measurements of Sulphur in the air

• Comparing two samples
  – To explore whether there is a difference between the amount of Sulphur present in the air on wet and dry days

$H_0: \mu_1 - \mu_2 = 0$

$H_1: \mu_1 - \mu_2 \neq 0$
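The two-sample comparison can also be sketched in R rather than through the Genstat menus. This is a minimal illustration only: the column name Rain and all values are simulated placeholders, not the course data set.

```r
# Simulated placeholder data (NOT the course data set).
# 'Rain' is an assumed column name for wet vs. dry days.
set.seed(1)
air <- data.frame(
  Sulphur = c(rnorm(20, mean = 12, sd = 3), rnorm(20, mean = 9, sd = 3)),
  Rain    = factor(rep(c("no", "yes"), each = 20))
)

# Two-sample t-test of H0: mu1 - mu2 = 0.
# t.test() uses the Welch version (unequal variances) by default.
tt <- t.test(Sulphur ~ Rain, data = air)

tt$statistic  # t value
tt$p.value    # two-sided p-value
tt$conf.int   # 95% confidence interval for mu1 - mu2
```

Setting var.equal = TRUE in t.test() gives the classical pooled-variance test instead.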

Case1. Measurements of Sulphur in the air

• Summarizing categorical data
  – Wind direction cannot be summarized easily with means or quantiles
  – Instead, count the number of observations in each category

• Summarizing data by groups
  – Calculate means and standard deviations of the Sulphur amount grouped by Winddir

Case1. Measurements of Sulphur in the air

• Association between categorical variables
  – To evaluate whether there are any significant differences in the proportion of rainy days for each wind direction; or, equivalently, whether the distribution of wind directions differs significantly between wet and dry days
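One common way to test such an association is a chi-square test on the two-way table of counts. The sketch below uses R; the counts and the level names are hypothetical and serve only to show the mechanics.

```r
# Hypothetical counts of days by rain status and wind direction
# (illustration only, not the course data).
counts <- matrix(c(10, 14, 12, 16,
                    6, 20,  8, 14),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Rain    = c("no", "yes"),
                                 Winddir = c("N", "E", "S", "W")))

# Chi-square test of independence between Rain and Winddir
ct <- chisq.test(counts)

ct$statistic  # chi-square statistic
ct$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
ct$p.value    # p-value
ct$expected   # expected counts under independence
```

The expected counts are worth checking: the chi-square approximation is usually considered unreliable when many of them are small, which is where the exact tests listed in the outline become relevant.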

Case2. Design and Sample size

• Designing an experiment
  – Generate a Standard Design menu
  – Replications Required menu
  – Power of the design (power = 1 − β, where α and β are the Type I and Type II error rates)
  – Control treatments
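The link between replication and power can be sketched in R with power.t.test() for a simple two-group comparison. The effect size and standard deviation below are assumed values chosen for illustration, not figures from the course.

```r
# Assumed design values (illustration only):
# difference to detect = 2 units, residual standard deviation = 3 units.

# Replication per group needed for 80% power at alpha = 0.05
pw <- power.t.test(delta = 2, sd = 3, sig.level = 0.05, power = 0.80)
n_required <- ceiling(pw$n)   # round up to whole replicates
n_required

# Conversely: power achieved with 20 replicates per group
pw20 <- power.t.test(n = 20, delta = 2, sd = 3, sig.level = 0.05)
pw20$power
```

Leaving exactly one of n, delta, sd, power or sig.level unspecified makes power.t.test() solve for that quantity.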

Regression

• Simple linear regression
  Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $i = 1, \dots, n$

• Multiple linear regression
  Model: $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \varepsilon_i$
  where $X_{ij}$ is the $i$th observation on the $j$th independent variable ($j = 1, \dots, p$) and $\varepsilon_i$ is an error term

Hypothesis tests
• for the intercept: $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \neq 0$
• for the regression parameters: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$; $H_0: \beta_2 = 0$ versus $H_1: \beta_2 \neq 0$

Case3. Blood-pressure readings

• Recordings of blood pressure (pressure.gsh)
  – a sample of 38 women
  – whose ages range from 20 to 80

✓ plot a graph of pressure against age
✓ fit a model to predict blood pressure from age: $\text{pressure}_i = a + b \cdot \text{age}_i + e_i$
✓ predict pressure at other ages
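The three steps (plot, fit, predict) can be sketched in R as follows. The data are simulated placeholders with the same shape as the case study (38 women aged 20 to 80); they are not the contents of pressure.gsh.

```r
# Simulated placeholder data (NOT pressure.gsh): 38 women aged 20 to 80
set.seed(2)
bp <- data.frame(age = round(runif(38, min = 20, max = 80)))
bp$pressure <- 100 + 0.8 * bp$age + rnorm(38, sd = 10)

# Step 1: plot pressure against age (uncomment in an interactive session)
# plot(pressure ~ age, data = bp)

# Step 2: fit pressure_i = a + b * age_i + e_i
fit <- lm(pressure ~ age, data = bp)
summary(fit)$coefficients   # estimates, SEs, t values, p-values for a and b

# Step 3: predict pressure at other ages, with 95% confidence limits
pred <- predict(fit, newdata = data.frame(age = c(30, 50, 70)),
                interval = "confidence")
pred
```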

Case4. One-way ANOVA

• Rat-feeding experiment (Ratlitters.gsh)
  – 8 litters, each with 5 rats
  – Litters were set up as blocks
  – 5 diets (A-E) allocated at random to the five rats within each litter

$H_0: \mu_1 = \mu_2 = \dots = \mu_k$

$H_a: \mu_i \neq \mu_j$ for at least one pair with $i \neq j$
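A design of this shape (5 diets randomized within each of 8 litters) can be analysed in R by fitting litter as a block term ahead of diet. The sketch uses simulated data, and the response name gain is an assumption; it is not the content of Ratlitters.gsh.

```r
# Simulated placeholder data: 8 litters (blocks) x 5 diets, one rat per cell.
# 'gain' is an assumed response name.
set.seed(4)
rats <- expand.grid(diet = factor(LETTERS[1:5]), litter = factor(1:8))
litter_effect <- rnorm(8, sd = 3)
diet_effect   <- c(A = 0, B = 2, C = 4, D = 1, E = 3)
rats$gain <- 50 +
  litter_effect[as.integer(rats$litter)] +
  unname(diet_effect[as.character(rats$diet)]) +
  rnorm(nrow(rats), sd = 2)

# Litter (block) is fitted first, then diet (treatment)
fit <- aov(gain ~ litter + diet, data = rats)
summary(fit)      # the F-test on the 'diet' line tests H0: mu_1 = ... = mu_k
df.residual(fit)  # 40 - 1 - 7 - 4 residual degrees of freedom
```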

Case5. Factorial designs with two treatment factors

• Effect of fertilizers on canola
  – 2 treatment factors: Nitrogen and Sulphur
  – N levels: 0, 180 and 230
  – S levels: 0, 10, 20 and 40
  – In a randomized block design with 3 blocks
  – 12 plots per block

Statistical models

response = systematic component + error component

response = (explanatory component + structural component ) + error component

Explanatory component
• Conditions of interest

Structural component
• Physical structure of the study, e.g. sub-sampling within an observational study, or blocking within a designed experiment

Error component
• Corresponds to variation in the response that is not explained by the systematic component
• It may have several sources, such as inherent between-subject variability, measurement errors, and background variation within the environment of the study
• Assume that it arises from independently and identically distributed (IID) normal variables with a common variance

Ex. A model with one factor

Dataset: a scientist compares three feeding regimes labelled A, B and C. They grow 12 plants of a single plant variety, each one in a separate plot, and allocate four plants at random to each of the three regimes. After six weeks, the height of each plant is measured.

Height = overall mean + effect of feeding regime + deviation

Yjk = μ + fj + ejk

- Treatments: j = 1, 2, 3 for regimes A, B, C
- Plants within each treatment group: k = 1, 2, 3, 4
- ejk: IID, normally distributed with common variance

Ex. A model with one variate

Suppose now the scientist has evaluated the dose for feeding regimes A, B and C as 20, 40 and 60 ml per plant, respectively.

Height = f(dose)+ deviation where f(dose) = function of dose

Yjk = α + βxj + ejk

- Treatments: j = 1, 2, 3 for doses 20, 40, 60
- Plants within each treatment group: k = 1, 2, 3, 4
- xj is the numerical quantity of the jth dose
- α is the plant height at zero dose (intercept)
- β is the linear response to increasing the dose by 1 ml (slope)
- ejk is the deviation from the linear trend for the kth replicate plant with the jth dose

[Figure: the factor model Yjk = μ + fj + ejk compared with the variate model Yjk = α + βxj + ejk]
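The difference between treating the feeding regime as a factor and treating the dose as a variate can be sketched in R. The heights below are simulated for illustration; only the layout (3 regimes, doses 20, 40 and 60 ml, 4 plants each) comes from the example.

```r
# Simulated placeholder heights; layout follows the example
set.seed(3)
plants <- data.frame(
  regime = factor(rep(c("A", "B", "C"), each = 4)),
  dose   = rep(c(20, 40, 60), each = 4)
)
plants$height <- 10 + 0.15 * plants$dose + rnorm(12, sd = 1)

# Factor model:  Yjk = mu + fj + ejk  (a separate mean for each regime)
fit_factor <- lm(height ~ regime, data = plants)

# Variate model: Yjk = alpha + beta * xj + ejk  (a straight line in dose)
fit_variate <- lm(height ~ dose, data = plants)

# The variate model is nested in the factor model, so this comparison
# tests for departure from a linear trend (lack of fit)
anova(fit_variate, fit_factor)
```

The factor model spends two degrees of freedom on treatments and the variate model only one, which is why the residual degrees of freedom differ by one.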

Ex. RCB design: Alfalfa experiment

Dataset: An experiment was established to compare 12 alfalfa varieties (labelled A-L). These come from 3 different sources, but the objective is to estimate the heritability of the varieties regardless of their source. A total of 6 plots per variety were established, arranged in an RCB design. The response variable is yield (tons/acre) at harvest time.

Yield = μ + variety + block + error

Yjk = μ + gj + bk + ejk

j = 1, … , 12 (t treatments); k = 1, … , 6 (r blocks)

Mixed models

Treatment structure (Explanatory component)

Blocking structure (Structural component)

Ex. RCB design: Alfalfa experiment

Yield = μ + variety + block + error

Yjk = μ + gj + bk + ejk

j = 1, … , 12 (t treatments); k = 1, … , 6 (r blocks)

Assume that block effects and deviations are independent, with $b_k \sim N(0, \sigma_b^2)$ and $e_{jk} \sim N(0, \sigma_e^2)$.

Hypotheses of interest

Fixed effects:
H0: μ1=μ2=…=μt
H1: μi≠μj for some i, j in the set 1…t
(i.e. is there a significant treatment effect?)
Test statistic: F or t

Random effects:
$H_0: \sigma^2 = 0$
$H_1: \sigma^2 > 0$
(i.e. is there significant variation due to the random effect?)
Test statistic: Chi-square (likelihood ratio test)

Yield = mean + fixed effects + random effects

ANOVA
• Balanced design
• Random error terms are normal, independent, each with constant variance

REML (Restricted Maximum Likelihood)
• Algorithm is not dependent on balance
• It can be used for repeated measures or field-correlated data
• REML allows for changing variances, so it can be used in some experiments such as treatments with different spacings, crops growing over time, treatments that include a control
• Random error terms are normal, possibly correlated, with possibly unequal variances

Ex. Split-plot design (Oats.gsh)

• 3 varieties of oats
• 4 levels of nitrogen
• 6 blocks
• Variety as whole-plot within each block
• Nitrogen level as subplot within each whole-plot

Treatment structure: Variety + Nitrogen + Variety.Nitrogen => Variety * Nitrogen

Block structure: Blocks + Blocks.Wplots + Blocks.Wplots.Subplots => Blocks / Wplots / Subplots
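A split-plot analysis with this treatment and block structure can be sketched in R. The example assumes that the oats data set shipped in the MASS package (3 varieties, 4 nitrogen levels, 6 blocks) has the same layout as Oats.gsh; the two files have not been compared. In MASS the columns are named B (block), V (variety), N (nitrogen) and Y (yield).

```r
library(MASS)  # recommended package distributed with R; provides 'oats'

# Treatment structure: V * N   (Variety + Nitrogen + Variety.Nitrogen)
# Block structure:     B / V   (whole plots are the varieties within blocks;
#                               subplots form the residual stratum)
fit <- aov(Y ~ V * N + Error(B / V), data = oats)

# One ANOVA table per stratum: blocks, whole plots, subplots
summary(fit)
```

Variety is tested against the whole-plot error, while nitrogen and the variety by nitrogen interaction are tested against the subplot error.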

THANK YOU


Biosci (Thailand) Co., Ltd.

Tutorial for ASReml-R


Outline

1 Estimating the heritability of birth weight
2 A bivariate animal model
3 A repeated measures animal model

1 Estimating the heritability of birth weight

Objective: learn how to run a univariate animal model using ASReml-R.

Background: In a population of gryphons there is strong positive selection on birth weight, with heavier-born individuals having, on average, higher fitness. To find out whether increased birth weight will evolve in response to the selection, and if so how quickly, we want to estimate the heritability of birth weight.

Animal model

• The solution lies in specifying the model as a linear mixed effects model, a type of model that contains both fixed and random effects, in which the breeding value is treated as a random effect. Random terms allow us to make inferences about the distribution of effects in a wider population.
• Additional random effects could be fitted if other sources of non-independence between data points were suspected (e.g. habitat patch, year of birth, mother), and for each additional random effect a corresponding component of the total phenotypic variance would be estimated.
• By fitting breeding value as a random effect, we obtain an estimate of the variance in breeding values, which is defined as the additive genetic variance VA. In addition, variation from numerous other environmental and indirect genetic sources can be estimated using a mixed model approach, often simultaneously if the right pedigree and phenotypic data are available.

Data input and Data Structure

Phenotype data

In the phenotype data, gryphon, the columns correspond to individual identity (ANIMAL), maternal identity (MOTHER), year of birth (BYEAR), sex (SEX, where 1 is female and 2 is male), birth weight (BWT), and tarsus length (TARSUS). Each row of the data file contains a record for a different offspring individual. Note that all individuals included in the data file must be included as offspring in the pedigree file.

Pedigree data

The pedigree data, gryphonped, contains three columns of unique IDs that correspond to each animal, its father, and its mother. Note that this is a multigenerational pedigree, with the earliest generation (for which parentage information is necessarily missing) at the beginning of the file. For later-born individuals maternal identities are all known but paternity information is incomplete (a common situation in real-world applications).

Pedigree Example

Numerator relationship matrix (A)

Obtaining the A matrix

Calculate an inverse relationship matrix

Analysis steps

Fit the model

Fit a simple univariate animal model with a single fixed effect (the mean) and a single random effect (the additive genetic effect).

asreml function

Help: ?asreml()

General relevant syntax
~   Separates the response from the list of fixed and random terms.
#   Comment (skips the rest of the line).
,   Model specification continues on the next line.
$   Specifies a user-input option from commands.

Basic syntax operations
+   Sum of two factors.
*   Crossing operator: A*B = A + B + A:B
/   Nesting operator: A/B = A + A:B

Here A:B is a model term that consists of all combinations of levels from the factors A and B (interaction).
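The crossing and nesting expansions can be checked in base R, whose formula parser follows the same conventions. The helper name expand_terms is introduced here for illustration only; no asreml installation is needed.

```r
# Hypothetical helper: list the terms that a model formula expands to
expand_terms <- function(f) attr(terms(f), "term.labels")

crossed <- expand_terms(~ A * B)  # "A" "B" "A:B"
nested  <- expand_terms(~ A / B)  # "A" "A:B"

# Block structure of a split-plot design
strata <- expand_terms(~ Blocks / Wplots / Subplots)

crossed
nested
strata
```

Note that R writes the interaction as A:B, where Genstat writes A.B.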

asreml object

?asreml.object

Variance component estimation

The estimated variance components are:

$\sigma_a^2 = 3.40$

$\sigma_e^2 = 3.83$

Scale parameters are estimated as a ratio with respect to the residual variance.

Given that the ratio of VA to its standard error (z.ratio) is considerably larger than 2 (i.e. the parameter estimate is more than 2 SEs from zero), this looks likely to be highly significant.

Estimating heritability
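With only the mean fitted as a fixed effect, heritability is the additive genetic variance divided by the total phenotypic variance. A minimal sketch of the arithmetic in R, using the two variance components reported above and assuming that the phenotypic variance is their sum:

```r
# Variance components reported for the simple animal model
VA <- 3.40   # additive genetic variance
VR <- 3.83   # residual variance

# Assumption: phenotypic variance is the sum of the fitted components
VP <- VA + VR

# Narrow-sense heritability
h2 <- VA / VP
round(h2, 2)  # 0.47
```

This gives the point estimate only; a standard error for h2 also needs the sampling covariance between the two components, which is not reproduced here.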

Adding fixed effects

For example, we might know (or suspect) that birth weight is a sexually dimorphic trait and therefore fit a model that includes SEX as a fixed effect.

Fixed effects parameter

Now we can look at the fixed effects parameters and assess their significance with a conditional Wald F-test.

The probability (‘Pr’) in the Wald test shows that SEX is a highly significant fixed effect, and from the fixed effects we can see that the average male (sex 2) is 2.2 kg (±0.16 SE) heavier than the average female (sex 1).

Incremental and Conditional Wald Statistics

In general, the methods used to construct F-tests in analysis of variance and regression cannot be used for the diversity of applications of the general linear mixed model available in asreml().

Variance component and Heritability

Which is the better estimate? It depends on what your question is.

The first is an estimate of the proportion of variance in birth weight explained by additive effects; the latter is an estimate of the proportion of variance in birth weight, after conditioning on sex, that is explained by additive effects (VR ↓, h2 ↑).

Adding random effects: BYEAR

Here the variance in BWT explained by birth year is 0.886 and, based on the z.ratio, appears to be significant. Thus we would conclude that year-to-year variation (e.g., in climate, resource abundance) contributes to VP. What we have really done here is to partition environmental effects into those arising from year-to-year differences versus everything else, and we do not really expect much change in h2.

Adding random effects: MOTHER

Here partitioning of significant maternal variance has resulted in a further decrease in VR but also a decrease in VA. The latter is because maternal effects of the sort we simulated (fixed differences between mothers) will have the consequence of increasing similarity among maternal siblings. Consequently they can look very much like additive genetic effects and, if present but unmodelled, represent a type of “common environment effect”.

Testing significance of random effects

Although the z ratio (COMP/SE) reported in the primary results file is a good indicator of likely statistical significance, the approximate standard errors are not recommended for formal hypothesis testing. A better approach is to use likelihood ratio tests.

REML likelihood ratio test

A test statistic equal to twice the absolute difference in the log-likelihoods of the models with and without the random effect is assumed to be distributed as chi-square with one degree of freedom. In this case we would conclude that the maternal effects are highly significant.
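The likelihood ratio test itself is a short calculation once the two REML log-likelihoods are known. The sketch below is in base R; the helper name lrt and the two log-likelihood values are hypothetical, since the fitted values are not reproduced in this text.

```r
# Hypothetical helper: REML likelihood ratio test for nested models
# that share the same fixed effects
lrt <- function(logl_full, logl_reduced, df = 1) {
  stat <- 2 * abs(logl_full - logl_reduced)        # test statistic
  p    <- pchisq(stat, df = df, lower.tail = FALSE)
  c(statistic = stat, df = df, p.value = p)
}

# Hypothetical log-likelihoods (illustration only):
# model with the MOTHER term versus model without it
res <- lrt(logl_full = -2033.2, logl_reduced = -2063.4)
res
```

REML log-likelihoods are comparable only between models with identical fixed effects, so this test applies to random terms and not to fixed ones.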

2 A bivariate animal model

Objective: learn how to run a bivariate animal model using ASReml-R.


Running in ASReml-R

The code for a multivariate model is similar to the univariate case, but a few extra lines are required to specify the model of the (co)variance structures we want to fit. These extra lines are the principal source of confusion for new ASReml users but are necessary since the program can actually fit a wide variety of structures. The simplest, an unstructured covariance matrix, is often appropriate.

To run a multivariate analysis in ASReml-R, use cbind to bind the response variables together. To fit an intercept for each trait, use 'trait' as the intercept; to fit a fixed effect for both variables, interact the effect with 'trait'. In a bivariate model, each random effect has three outputs: the variance component for each response variable and the covariance between the two.

Model a

Note that the starting values supplied here are arbitrary. If the model is difficult to fit, it may be because the starting values are too far from the best estimates. One way around this is to run single-trait models first to get good starting values for the variances (but you still have to “guess” starting values for the covariances).

Model structure

For this two-trait model, we consider the phenotypic matrix P as comprising the phenotypic variances in birth weight (VP1) and tarsus length (VP2) and the phenotypic covariance between the two traits (COVP1,P2). P is then initially decomposed into the additive genetic matrix G and a residual (or environmental) matrix R.
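For two traits, the decomposition of P into G and R can be written out in matrix form. The notation below is a reconstruction from the definitions in the text, with subscript 1 for birth weight and subscript 2 for tarsus length, and the A and R subscripts following the VA and VR convention used for the univariate model.

```latex
\mathbf{P} = \mathbf{G} + \mathbf{R}

\mathbf{P} =
\begin{bmatrix}
V_{P1} & COV_{P1,P2} \\
COV_{P1,P2} & V_{P2}
\end{bmatrix},
\qquad
\mathbf{G} =
\begin{bmatrix}
V_{A1} & COV_{A1,A2} \\
COV_{A1,A2} & V_{A2}
\end{bmatrix},
\qquad
\mathbf{R} =
\begin{bmatrix}
V_{R1} & COV_{R1,R2} \\
COV_{R1,R2} & V_{R2}
\end{bmatrix}
```

Each matrix is symmetric, so an unstructured 2 × 2 matrix has three parameters: two variances and one covariance.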

Variance components

Additive genetic components: VA.BWT, COVA, VA.TARSUS

Residual components: VR.BWT, COVR, VR.TARSUS

Based on our quick and dirty check (is z.ratio = Comp/SE > 2?), all components look to be statistically significant.

genetic correlation and heritability
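The genetic correlation is the additive covariance scaled by the two additive standard deviations, and each heritability is the additive variance over the corresponding phenotypic variance. A minimal R sketch follows; the function names and all numeric values are hypothetical, since the fitted components are not reproduced in this text.

```r
# Hypothetical helpers
genetic_correlation <- function(cov_a, va1, va2) cov_a / sqrt(va1 * va2)
heritability        <- function(va, vr) va / (va + vr)

# Hypothetical bivariate components (illustration only)
VA_BWT <- 4.0; VA_TARSUS <- 9.0; COV_A <- 2.0
VR_BWT <- 4.0; VR_TARSUS <- 18.0

rA        <- genetic_correlation(COV_A, VA_BWT, VA_TARSUS)  # 2 / sqrt(36)
h2_bwt    <- heritability(VA_BWT, VR_BWT)                   # 4 / 8
h2_tarsus <- heritability(VA_TARSUS, VR_TARSUS)             # 9 / 27

c(rA = rA, h2_bwt = h2_bwt, h2_tarsus = h2_tarsus)
```

The heritability helper assumes a model with only additive and residual terms; with more random effects, their variances also belong in the denominator.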

Adding fixed and random effects

We now add SEX as a fixed effect, as well as random effects of BIRTH YEAR and MOTHER.

Note that we have specified a covariance structure for each random effect and an estimate of the effect of sex on both birth weight and tarsus length by interacting sex with trait in the fixed effect structure.

Model structure

When maternal and year-of-birth effects are included, P is decomposed as P = G + M + BY + R, where M and BY are the matrices corresponding to those additional random effects.

Variance components

Testing significance of a covariance

To test the significance of a covariance, fix the value of the covariance to zero and then compare the models with and without the covariance using log-likelihood ratio tests. In ASReml-R, you can do this by specifying the covariance matrix as a diagonal matrix (i.e. diag instead of us). To test the significance of the maternal covariance in the above model, refit it with a diagonal structure for the maternal term. Note that the number of starting values also decreases.

Variance component

REML likelihood ratio test

Extend to multiple traits

• The two-trait example presented here can be extended in principle to any number of traits. However, as the dimension of each matrix increases, the number of parameters to be estimated rises very quickly and you can soon run into difficulties getting your models to converge.
• The solution to this is to use simpler models, at least to start with. For instance, if you are having trouble getting a bivariate model to converge, try modelling each trait in a univariate model first. This will give you a good idea of the variance components for each trait, and these can be used as starting values in the bivariate analysis. If you want to estimate a full G matrix among a large number of traits, then ultimately you may find that you cannot fit a full model; instead you will need to run a series of bivariate models to estimate each of the pairwise genetic covariances.

3 A repeated measures animal model

Objective: learn how to run a univariate animal model for a trait with repeated observations using ASReml-R.

Background: Since gryphons are iteroparous, multiple observations of reproductive traits are available for some individuals. Here we have repeated measures of lay date (measured in days after Jan 1) for individual females of varying age from 2 (age of maturation) up until age 6. Not all females lay every year so the number of observations per female is variable. We want to know how repeatable the trait is, and (assuming it is repeatable) how heritable it is.

Phenotype data

Data gryphonRM: Columns correspond to individual identity (ANIMAL), birth year (BYEAR), age in years (AGE), year of measurement (YEAR) and lay date (LAYDATE). Each row of the data file corresponds to a single phenotypic observation. Here data are sorted by identity and then age so that the repeated observations on individuals are readily apparent.

Repeated measures animal model

We can estimate the repeatability of a trait by partitioning the phenotypic variance into within- and between-individual components. This can be done here by fitting individual identity as a random effect without associating it with the pedigree, using ide() of the animal. The among-individual variance, expressed as a proportion of the total phenotypic variance of the trait, is the repeatability.

Estimating repeatability
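Repeatability follows directly from the two variance components of such a model. A minimal R sketch follows; the function name and the numeric values are hypothetical, since the fitted components are not reproduced in this text.

```r
# Hypothetical helper: repeatability from among-individual and residual
# (within-individual) variance components
repeatability <- function(v_ind, v_res) v_ind / (v_ind + v_res)

# Hypothetical components for lay date (illustration only)
V_ind <- 12   # among-individual variance, from the ide(ANIMAL) term
V_res <- 21   # residual variance

rpt <- repeatability(V_ind, V_res)
rpt   # 12 / 33
```

If further random effects are fitted, their variance components also belong in the denominator.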

Adding fixed effect: age

We might ask what the repeatability of lay date is after conditioning on age effects.

Partitioning additive and permanent environment effects

Generally we expect that the repeatability will set the upper limit for heritability since, while additive genetic effects will cause among-individual variation, so will other types of effect. Non-additive contributions to fixed among-individual differences are normally referred to as “permanent environment effects”. If a trait has repeated measures, it is necessary to model permanent environment effects in an animal model to prevent upward bias in VA.

Variance components are almost unchanged: all of the among-individual variance is being partitioned as VA. In fact, here the partition is wrong, since the simulation included both additive genetic effects and additional fixed heterogeneity that was not associated with the pedigree structure (i.e. permanent environment effects).

To obtain an unbiased estimate of VA we have to fit ANIMAL twice, once with and once without a pedigree attached. To do this, fit both ide(ANIMAL) and ped(ANIMAL).

THANK YOU

Biosci (Thailand) Co., Ltd.

Introduction to R and ASReml-R: The Basics

Outline

• Getting Started

• R packages

• R Basics

• ASReml-R

Getting Started

To install R on your Mac or PC, first go to http://www.r-project.org/ and download R.

R-Gui

RStudio

R Packages

Installing Packages
• install.packages( )
• Packages menu: install packages from local files (e.g. asreml)

R Basics

• R is object-based

– Types of objects (scalars, vectors, matrices and arrays)

– Assignment of objects

• Building a data frame

R as a Calculator

> 1550+2000
[1] 3550

or several calculations on the same line

> 2+3; 5*9; 6-6
[1] 5
[1] 45
[1] 0

Objects in R

• Objects in R obtain values by assignment.

• This is achieved with the gets arrow, <-.

• Objects can be of different kinds.

Built in functions

• R has many built in functions that compute different statistical procedures.

• Functions in R are followed by ( ).

• Inside the parentheses we write the object (vector, matrix, array, data frame) to which we want to apply the function.
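A short sketch of assignment followed by built-in functions. The object name scores is introduced here for illustration; its values are the Score column used in the data frame example later in this deck.

```r
# Assignment with the gets arrow
scores <- c(84, 98, 83, 99)

# Built-in functions: the object goes inside the parentheses
n   <- length(scores)  # number of values
avg <- mean(scores)    # arithmetic mean: 364 / 4 = 91
s   <- sd(scores)      # sample standard deviation

n
avg
s
summary(scores)        # minimum, quartiles, mean and maximum
```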

Vectors

Vectors are variables with one or more values of the same type.

a <- c(1, 2, 3, 4, 5, 6)           # numeric vector
b <- c("one", "two", "three")      # character vector
x <- rnorm(10)                     # 10 random draws from a standard normal
y <- seq(-5, 5, by = 1)            # sequence from -5 to 5 in steps of 1
z <- rep(c(1, 4, 6), times = 3)    # repeat the vector three times

Arrays

• Arrays are vectors with a dimension attribute.
• A matrix is the two-dimensional case; arrays can have more than two dimensions.

myarray <- array(vector, dimensions, dimnames)   # general form

dim1 <- c("A1", "A2")
dim2 <- c("B1", "B2", "B3")
dim3 <- c("C1", "C2", "C3", "C4")
z <- array(1:24, c(2, 3, 4), dimnames = list(dim1, dim2, dim3))

Matrices

A matrix is a two dimensional array.

mymatrix <- matrix(1:15, nrow = 3, ncol = 5, byrow = TRUE)
mymatrix[1, 3]   # element in row 1, column 3
mymatrix[, 3]    # all of column 3
mymatrix[1, ]    # all of row 1

Dataframe

• Researchers work mostly with data frames.
• With the objects introduced so far you can build data frames in R.
• You can also import data frames into R.

mydata <- data.frame(col1, col2, col3, ...)   # general form

Name  <- c("A", "B", "C", "D")
Sex   <- c("F", "F", "M", "M")
Age   <- c(23, 23, 24, 24)
Score <- c(84, 98, 83, 99)
df <- data.frame(Name, Sex, Age, Score)

Data input

• Using read.table() or read.csv()

mydata2 <- read.table("D:\\cookbook\\date\\date.csv", header = TRUE, sep = ",")
mydata3 <- read.csv("D:\\cookbook\\date\\date.csv")

• Built-in data

data(package = "asreml")

ASReml-R

An R package for mixed models using residual maximum likelihood

About ASReml-R

ASReml-R uses the Average Information (AI) algorithm and sparse matrix methods.
• Useful for the analysis of large and complex datasets.
• Very flexible: it can fit a wide range of variance models for random effects or error structures (however, these can be complex to program).

How to get ASReml-R

Distributor page
• http://www.vsni.co.uk/products/asreml (version 3)
• http://www.r-project.org/ (for R)

Getting help

R
• help(asreml) or ?asreml
• asreml.man()

Webpages
• uncronopio.org/ASReml/HomePage (cookbook)
• http://www.vsni.co.uk/software/asreml/htmlhelp/ (distributor page)
• www.vsni.co.uk/forum (user forum)

THANK YOU

Breeding Management System (BMS)

Breeding Activities

Core applications

Programme & information management
• WorkBench (dashboard view)
• Study Browser
• Breeder Queries
• Ontology Manager
• Germplasm import tool
• Data import tool

Breeding activities
• Germplasm List Manager
• Crossing Manager
• Nursery Manager, with Seed Inventory
• Trial Manager
• Integrated Breeding FieldBook

Statistical analysis: Breeding View
• Single-Site Analysis
• Multi-Site Analysis
• Multi-Year Multi-Site Analysis
• Breeding View Standalone for QTL
• Quality assurance

Marker-assisted breeding
• Integrated Breeding Planner
• Genotypic Data Management System (GDMS), in progress
• QTL Analysis Tools
• Molecular Breeding Design Tool (MBDT)
• OptiMAS

BMS can …
• Manage Program information
• Manage Germplasm data
• Manage Phenotypic data
  – Nursery, Trial, Cross, Seed Inventory
• Analyze data using Breeding View

BMS in summary
• Simple and easy-to-use application containing all informatics tools needed by a breeder
• Targets routine breeding activities, complementing research tools
• Accumulation, sharing and re-use of breeding data
• As of mid 2015, twelve crop-specific databases with historical data: bean, cassava, chickpea, cowpea, groundnut, lentil, maize, pearl millet, rice, sorghum, soybean and wheat
• Phenotyping DB schema: Chado Natural Diversity Module
• In the same way that we have stored public data in BMS, we can do the same for your institute’s existing data as part of our service package

Delivery and integration:
• Available as a cloud-based system
  – For computationally intensive analyses or large data storage needs
  – For large and/or decentralized teams
• Also implementable as a standalone system
  – For small or remote breeding projects
• Allows integration of users’ own tools into the system through a publicly accessible API

Technical documentation & tutorials

THANK YOU

BMS: Management and Analysis of data for crop breeding

Breeding View: a visual tool for running analytical pipelines

User-friendly interface

Visual pipelines; from data quality checks to final reports

Summary results; combination of html reports, data export files and image files

GenStat: statistical analysis engine


Analysis pipelines

Field trial analysis

GxE analysis

QTL linkage analysis

Field trial analysis

GxE analysis

Quality control phenotypes

GxE analysis; Finlay-Wilkinson, AMMI, GGE biplot

Stability coefficients; sensitivity, superiority, static stability, Wricke’s ecovalence

Generate report

[Figures: AMMI biplot to explore the GxE pattern; GGE biplot to identify the best genotype]

QTL analysis

• QTL detection: marker-based, SIM and CIM
• Backward selection to determine significant QTLs
• Estimate the effects of the significant set of QTLs

[Figures: genetic map with significant QTLs across all 11 linkage groups; genotype data plot]

THANK YOU
