of – Get it right from the beginning

Margrét Thorsteinsdóttir
Professor at the Faculty of Pharmaceutical Sciences, University of Iceland

Objective
• The objective is to illustrate how design of experiments (DoE) can be implemented for optimization of a quantitative LC-MS/MS clinical diagnostic method

Outline
• Part 1 - Introduction to Design of Experiments (DoE)
  – Why use DoE?
  – Basic concepts
• Part 2 - Design of Experiments (DoE)
  – Experimental Screening
  – Optimization
  – Quantitative Modeling
• Part 3 - Practical Example
  – Optimization of clinical LC-MS based assay utilizing DoE

What is Chemometrics?
Chemometrics is the chemistry discipline that uses mathematical and statistical methods:
  – To design or select optimal measurement procedures and experiments
  – To provide maximum chemical information by analyzing chemical data
(Wold and Kowalski - 1972)
[Photo: Professor Svante Wold and Professor Bruce Kowalski, 11th Scandinavian Symposium on Chemometrics (SSC), Loen, Norway]

Application of DoE
• Development of new products and processes / analytical methods
• Enhancement of existing analytical method
• Optimization of performance of an analytical method
• Optimization of existing analytical method
• Screening of important factors
• Robustness testing

Multivariate Problems

• One phenomenon usually depends on several factors!
  – The question is not: Is the problem multivariate?
  – The question is: How to handle multivariate problems?

How shall I find the optimum?

• COST approach (Changing One Separate Factor at a Time)

• DoE (Design of Experiments)

Why Design of Experiments (DoE)?
• To provide a framework for changing all important factors systematically with a limited number of experiments
• To ensure that the selected experiments are maximally informative
• To get an overview of the relationships between all the parameters
• To make R&D more efficient

COST approach (Changing One Separate Factor at a Time)

• Does not lead to the real optimum and gives different implications with different starting points
• Leads to many experiments and little information
• Systems influenced by more than one factor are poorly investigated
  – Interactions are missed

A better approach - DoE

The solution is to construct a carefully prepared set of representative experiments, in which all relevant factors are varied simultaneously

[Figure: design cube in the factors X1 (200 to 400), X2 (50 to 100) and X3 (50 to 100), with the standard point at 300/75/75]

A successful experiment has several prerequisites

• Define your objective(s)
• Planning
• Estimation of experimental error
  – Variance in analysis
→ The experiment must be of sufficient precision to satisfy the main objective
→ The experiment must be unbiased

Variability
• Every measurement and experiment is influenced by noise
• Under stable conditions every process and system varies around its mean

Reacting to noise
• Consider one experiment where the temperature is changed from 35 °C to 40 °C
• The response change, from slightly below 93% to close to 96%, lies within the variability interval found when replicating
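The check described in the last bullet can be sketched in a few lines. This is only an illustration: the ten replicate yields are invented, and the interval of mean ± 2 standard deviations is an assumed convention, not something stated on the slide.

```python
import statistics

# Hypothetical replicate yields (%) measured under identical conditions.
# These numbers are invented for illustration; they are not from the slides.
replicates = [94.1, 95.3, 93.2, 96.4, 94.8, 92.9, 95.9, 93.7, 96.1, 94.6]


def replicate_interval(values, k=2.0):
    """Return (low, high) = mean -/+ k sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


def within_noise(y_before, y_after, values, k=2.0):
    """True if both responses fall inside the replicate variability interval."""
    low, high = replicate_interval(values, k)
    return low <= y_before <= high and low <= y_after <= high


# Response before (35 °C) and after (40 °C) the temperature change,
# chosen to mimic "slightly below 93%" and "close to 96%".
print(within_noise(92.9, 95.9, replicates))
```

If the function returns True, the observed change cannot be distinguished from noise with these replicates, so reacting to it would be premature.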

[Figure: ten measurements of yield under identical conditions; yield axis from 92 to 98]

[Figure: two measurements of yield. Any real difference? Yield axis from 92 to 98]

Consequence of variability

• Two experimental points close to each other make the slope of the line poorly determined
• Two points far away from each other make the slope well determined
• If a center-point is added, it is possible to explore whether the model is linear or non-linear

[Figure: three panels of Y versus X illustrating the three cases]
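The effect of point spacing on the slope can be made concrete with the standard error of a least squares slope, sigma / sqrt(Sxx), where Sxx is the sum of squared deviations of the x settings from their mean. The following is a minimal sketch; the noise level and the x settings are assumed values chosen only for illustration.

```python
import math


def slope_standard_error(x_values, sigma):
    """Standard error of a fitted straight-line slope: sigma / sqrt(Sxx)."""
    mean_x = sum(x_values) / len(x_values)
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    return sigma / math.sqrt(sxx)


sigma = 1.0  # assumed measurement noise (standard deviation of the response)

close = slope_standard_error([49.0, 51.0], sigma)   # two points close together
far = slope_standard_error([0.0, 100.0], sigma)     # two points far apart

print(f"close points: {close:.4f}, far points: {far:.4f}")
```

With the same noise, moving the two points from 2 units apart to 100 units apart shrinks the slope uncertainty by the ratio of the distances.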

→ Design is needed, as it matters where the experiments are positioned!

Focusing on effects

• DoE provides a better flow from measurements to information to knowledge
• Leads to more precise effect estimates

[Figure: plots with axes Y1, X1, X2 and X3]

Estimating real effects and noise

• Real effects are estimated by the coefficients
• The noise is contained in the confidence intervals

[Figure annotation: Uncertainty of coefficient]

Assessment of DoE
• Define the experimental objective(s)
• Define factors
• Define responses
• Selection of regression model
• Generation of experimental design
• Creation of worksheet
• Analyze the data

Terminology
• Factors: Parameters changed to influence responses and direct the system/process towards a desired response profile
• Responses: Variables describing the properties of the system/process
• Model: Mathematical expression linking the changes in the factors to the changes in the responses

[Diagram: Factors (X) → System/Process → Responses (Y)]

Selection of Experimental Objective
• Experimental objectives may be selected from different stages of DoE
  – Familiarization
  – Screening
  – Finding the optimal region
  – Optimization
  – Robustness testing
  – Mechanistic modelling

Important questions?

• The experimental objective tells which kind of investigation one wants to do:
  – One should ask: why is the experiment done?
  – For what purpose?
  – What is the desired result?

Selection of Model and Generation of Design

• The chosen model and the design to be generated are intimately linked

Modeling

• The results are expressed as a mathematical function of experimental conditions
• Advantageous:
  – Replaces large tables of data with a single equation
  – Provides a means to predict and estimate results at levels that were not directly studied

Software

• MODDE 12, MKS Data Analytics, Umetrics
• Unscrambler® X, CAMO Software
• JMP® software from SAS
• Matlab
• ExperimentalDesign in R – on the CRAN repository

Part 2 - Design of Experiments (DoE)

Experimental Screening
Optimization
Quantitative Modeling

The primary experimental objectives
• Screening
  – Which experimental factors are most influential?
  – What are their appropriate ranges?
• Optimization
  – How shall we define optimum?
  – Is there a unique optimum, or is a compromise necessary to meet conflicting demands on the responses?
• Quantitative modeling
  – What are the predicted values of the response for given settings of the factors in a model?

Screening - objective

• To explore many factors in order to reveal whether they have an influence on the responses
• To identify their appropriate ranges
• To investigate if the factor/response relationship is linear or non-linear

Specification of Factors

• Categorization of factors
  ✓ Controlled and uncontrolled
  ✓ Quantitative and qualitative
• Define ranges

Example:
• Quantitative
  – Temperature
  – Flow rate
  – Capillary voltage
• Qualitative
  – Type of column
  – Type of organic solvent

Uncontrolled Factors

• These are factors that cannot be controlled, but which may still influence the results (responses)
  ✓ Example: Ambient humidity and temperature
• Record values of uncontrolled factors, and include these in the data analysis
• Use of experiments

Specification of Responses

• Choose responses which are relevant to the objective(s)
  – Example: Retention time, peak height, peak area
• Often responses need to be transformed
  ✓ Transform the responses after obtaining the results from the design if there is a non-linear relationship between y and x
  – Example: Log transformation

Factorial Design
• Full Factorial Design: 2^3 = 8 experiments
• Fractional Factorial Design: 2^(3-1) = 4 experiments

Two-level full factorial designs
• These designs enable interaction models to be estimated, which is adequate for screening
• Each factor is investigated at both levels of all other factors
  – Balancing
• Full factorial designs are realistic choices with 2-4 factors; with 5 or more factors fractional factorial designs are recommended

No of investigated factors (k)   No of runs, full factorial   No of runs, fractional factorial
2                                4                            ---
3                                8                            4
4                                16                           8
5                                32                           16
6                                64                           16
7                                128                          16
8                                256                          16
9                                512                          32
10                               1024                         32

Fractional Factorial Design
• Fractional factorial designs are used in screening and robustness testing
• Advantage: Reduction of experiments
• Disadvantage: Confounding of effects
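The run counts and the confounding mentioned above can be reproduced with a short sketch in coded units (-1/+1). The function names are illustrative, and the half-fraction is built with the generator "last factor = product of the others", which is one common choice and assumed here. It covers only half-fractions; the smaller fractions used for 6 or more factors are not shown.

```python
from itertools import product


def full_factorial(k):
    """All 2**k combinations of the coded levels -1 and +1."""
    return [list(levels) for levels in product((-1, 1), repeat=k)]


def half_fraction(k):
    """2**(k-1) runs: a full factorial in k-1 factors, with the last
    factor set equal to the product of the others (assumed generator)."""
    runs = []
    for base in full_factorial(k - 1):
        generator = 1
        for level in base:
            generator *= level
        runs.append(base + [generator])
    return runs


full = full_factorial(3)   # 2^3 = 8 runs
frac = half_fraction(3)    # 2^(3-1) = 4 runs with x3 = x1*x2

# Confounding: in the half-fraction the x3 column is identical to the
# x1*x2 interaction column, so only their summed effect can be estimated.
x3_column = [run[2] for run in frac]
x1x2_column = [run[0] * run[1] for run in frac]
confounded = x3_column == x1x2_column

print(len(full), len(frac), confounded)
```

The price of halving the number of runs is visible in the last line: the main effect of x3 and the x1x2 interaction share one column.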

[Figure: the 2^3 design cube with runs 1-8 and its two half-fractions; x1 = Flour (200 to 400), x2 = Shortening (50 to 100), x3 = Eggpowder (50 to 100)]

Half-fraction with x3 = x1x2      Half-fraction with x3 = -x1x2
Run   x1   x2   x3                Run   x1   x2   x3
5     -    -    +                 1     -    -    -
2     +    -    -                 6     +    -    +
3     -    +    -                 7     -    +    +
8     +    +    +                 4     +    +    -

Graphical interpretation of confoundings (2^(4-1) design)

• Only the sum of confounded terms is estimated
• Main effects usually dominate over three-factor interactions
• More experiments are needed to resolve confounded terms

D-optimal Design

• Multi-level qualitative factors
  – Type of organic solvent
  – Type of column
  – pH of mobile phase
• Three quantitative factors

Screening Example: Quantification of biomarker in human plasma with LC-MS/MS
Objective: Optimize the response for compound x in as short a time as possible
• Factors:
  – Type of LC column
  – pH of the mobile phase
  – Amount of acetonitrile in mobile phase B
  – Slope of gradient
  – Flow rate
  – Amount injected
• Response:
  – Retention time
  – Peak area

Design

• Full factorial design of 4 factors for each pH and each column type
  – 2^4 = 16 + 3 experiments in the center point
  – 4 x 19 experiments = 76 experiments total

[Screenshot: Creation of worksheet and results of the experiments]
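The run count above can be checked by building the worksheet explicitly. This sketch assumes that the "4 x" comes from two pH levels combined with two column types; the slide does not spell this out, and the level names below are placeholders.

```python
from itertools import product

# The four quantitative factors from the screening example, in coded units.
quantitative = ["acetonitrile_in_B", "gradient_slope", "flow_rate", "amount_injected"]

# Assumption: two pH levels and two column types give the four blocks.
ph_levels = ["pH_low", "pH_high"]        # placeholder names
column_types = ["column_A", "column_B"]  # placeholder names


def build_worksheet(n_center=3):
    """One 2^4 full factorial plus center points per pH/column combination."""
    worksheet = []
    for ph, column in product(ph_levels, column_types):
        for levels in product((-1, 1), repeat=len(quantitative)):
            run = {"pH": ph, "column": column}
            run.update(zip(quantitative, levels))
            worksheet.append(run)
        for _ in range(n_center):
            run = {"pH": ph, "column": column}
            run.update({name: 0 for name in quantitative})
            worksheet.append(run)
    return worksheet


worksheet = build_worksheet()
print(len(worksheet))  # 4 * (16 + 3)
```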

MODDE 7, Umetrics AB

Data Analysis
• Evaluation of raw data
  – Replicate plot
• Regression analysis and model interpretation
  – R2/Q2
  – Coefficient plot
• Use of regression model
  – Response contour plot
[Figure: panels for Peak Area and Retention Time]

Evaluation of raw data - Replicate plot
✓ The replicate plot shows the variation among the replicates in relation to the variation across the entire design ("reproducibility")

Regression analysis – summary of fit plot
R2 – measures fit ("explained variation")

Q2 – measures predictive power ("predicted variation")

Model interpretation
Coefficient plot shows importance of model terms, useful for model refinement
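A minimal sketch of how R2 and Q2 can be computed for a small design follows. It assumes Q2 = 1 - PRESS/SS_total with leave-one-out cross-validation; the slides do not say which cross-validation scheme the software uses. The design (2^3 plus three center points), the model terms and the responses are hypothetical.

```python
from itertools import product


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(rows, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(row, coef):
    return sum(c * v for c, v in zip(coef, row))


def r2_q2(rows, y):
    """R2 from the fitted residuals, Q2 from leave-one-out predictions."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    coef = fit(rows, y)
    ss_res = sum((yi - predict(r, coef)) ** 2 for r, yi in zip(rows, y))
    press = 0.0
    for i in range(len(y)):
        coef_i = fit(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(rows[i], coef_i)) ** 2
    return 1 - ss_res / ss_tot, 1 - press / ss_tot


# Hypothetical worksheet: 2^3 corners plus 3 center points (coded units).
design = [list(p) for p in product((-1, 1), repeat=3)] + [[0, 0, 0]] * 3
# Model terms: constant, x1, x2, x3 and the x1*x2 interaction.
rows = [[1, x1, x2, x3, x1 * x2] for x1, x2, x3 in design]
# Invented responses, for illustration only.
y = [52.1, 55.8, 60.3, 64.9, 61.7, 66.2, 78.4, 82.6, 65.0, 65.9, 64.6]

r2, q2 = r2_q2(rows, y)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

Because each left-out run is predicted by a model that never saw it, Q2 cannot exceed R2; a large gap between the two suggests an over-fitted model.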

[Figure: coefficient plots for Peak area and Retention time]

The interaction plot shows the strength of an interaction
[Figure: Interaction Plot between pH and column type]

Computation of effects using least squares fit
• DOE data are analyzed by calculating a regression model using least squares fit, which has the following advantages:
  – (i) the robustness to slight fluctuations in the factor settings
  – (ii) the ability to handle a failing corner where experiments could not be made
  – (iii) the estimation of the experimental noise
  – (iv) the availability of several useful model diagnostic tools
• An important consequence of least squares analysis is that the outcome is not main and interaction effect estimates, but a regression model consisting of coefficients reflecting the influence of the factors

Region of optimum
Use model to make decisions

Optimization

• Quadratic models

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \dots + \varepsilon$

• Response surface methodology (modeling)
  – How is a factor important?
  – What is the best setting of the factors to achieve optimal conditions for best performance?

Central composite circumscribed (CCC) design for two factors
• Extension of the two-level full and fractional factorial design

• The CCC design consists of three building blocks
  – Regularly arranged corner experiments of a two-level factorial design
  – Symmetrically arrayed star points located on the factor axes
  – Replicated center-points

Central composite face-centered (CCF) design in three factors
• Central Composite Face (CCF) design
  – 2^3 = 8 experiments
  – 6 experiments at axial points
  – 3 experiments at center point
→ Total of 17 experiments for a quadratic model

Optimization Example: Quantification of biomarker in human plasma with LC-MS/MS

• Factors:
  – Amount of acetonitrile in mobile phase B
  – Slope of gradient
  – Flow rate
• Responses:
  – Retention time
  – Peak area

CCF design (3 variables)
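As a rough sketch, a three-factor CCF worksheet in coded units can be generated as corner points, face-centered axial points and replicated center points. The function names are illustrative and not taken from any DoE package; the second function lists the terms of a full quadratic model to show that 17 runs are enough to estimate its 10 coefficients.

```python
from itertools import product


def ccf_design(k=3, n_center=3):
    """Central composite face-centered design in coded units (-1, 0, +1)."""
    corners = [list(p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-1, 1):
            point = [0] * k
            point[i] = level          # axial point on the face of the cube
            axial.append(point)
    centers = [[0] * k for _ in range(n_center)]
    return corners + axial + centers


def quadratic_terms(x):
    """Model-matrix row: constant, linear, squared and two-factor terms."""
    k = len(x)
    row = [1] + list(x) + [v * v for v in x]
    row += [x[i] * x[j] for i in range(k) for j in range(i + 1, k)]
    return row


design = ccf_design()
n_coefficients = len(quadratic_terms(design[0]))
print(len(design), n_coefficients)
```

The three levels per factor are what make the squared terms estimable; a two-level factorial alone cannot separate curvature from the constant term.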

• 2^3 = 8 experiments
• 6 experiments at axial points
• 3 experiments at center point
→ Total of 17 experiments for a quadratic model

Exp No   Flow   Capillary   Cone voltage
1        -1     -1          -1
2        1      -1          -1
3        -1     1           -1
4        1      1           -1
5        -1     -1          1
6        1      -1          1
7        -1     1           1
8        1      1           1
9        -1     0           0
10       1      0           0
11       0      -1          0
12       0      1           0
13       0      0           -1
14       0      0           1
15       0      0           0
16       0      0           0
17       0      0           0

[Figure: Predicted vs Observed plot of Retention Time]

Optimized conditions

• S/N increased by a factor of 5

• tm decreased by a factor of 2

What have we learnt
• DOE results in a set of experiments in which all factors are varied at the same time
  – factor interactions are estimable
  – reliable maps of the systems are possible
  – seen effects and noise are separable and estimable
• Fractional factorial designs are the most widely used family of screening designs
  – Many factors can be mapped in few runs
  – Confounding of effects can be reasonably dealt with by selecting a resolution IV (ResIV) design
• The composite designs CCC and CCF are natural extensions of the two-level full and fractional factorial designs
  – Are used for optimization

Part 3 - Practical Example

Optimization of clinical LC-MS based assay utilizing DoE

Overview of steps in DoE

1. Selection of experimental objective
2. Definition of factors
3. Definition of response(s)
4. Create design (run experiments)
5. Modeling
6. Use model (make decisions)

Clinical Diagnosis of Liquorice Induced Hypertension

Aim

• To optimize an LC-MS/MS assay for simultaneous quantification of cortisol, cortisone and glycyrrhetinic acid in human plasma utilizing DoE
  ✓ To implement the assay for clinical diagnosis of liquorice induced hypertension
  – Cortisone is an inactive metabolite of cortisol
  – 11β-hydroxysteroid dehydrogenase type 2 (11βHSD2) enzyme activity is inhibited by glycyrrhetinic acid

Diagnostic Tool

• The ratio between cortisol and cortisone is a good indicator of 11βHSD type 2 activity
  – Alterations of the ratio in plasma and urine give information about pathological state

Analytical Strategy

• HPLC-MS/MS and UPLC-MS/MS with electrospray ionization for the analysis
• Protein precipitation for sample preparation
• DoE was utilized for optimization of the LC-MS/MS quantification method
  – Software: MODDE 11, MKS Data Analytics Solutions

[Figure: Waters Quattro Premier™ XE coupled to ACQUITY UPLC]
[Figure: Quattro Ultima™ coupled to Waters 1525µ Binary pump]

Challenges!

• Improve sensitivity
• Improve selectivity
• Speed of analysis
• Cost effectiveness

Optimization of the LC-MS/MS

Experimental Screening
• D-optimal Design, Interaction model
• Multi-level qualitative factors
  – Type of column
  – Type of organic solvent
• Eight quantitative factors

Optimization
• Central Composite Face Design (CCF)

Responses
  – Peak Height
  – Peak Area
  – Retention time

Modeling
• Partial least squares regression (PLS)

Software
• MODDE 11, MKS Data Analytics Solutions

D-Optimal Design
Experimental domain

Variable parameters       (-)     (0)     (+)
pH                        2.7     5.8     9.2
Organic start (% v/v)     5       15      25
Temperature (°C)          20      30      40
Gradient slope (min)      1.0     2.5     4.0
Flow rate (ml/min)        0.2     0.3     0.4
Capillary voltage (kV)    0.5     2.0     3.5
Cone voltage (V)          15      32.5    50
Collision energy (eV)     10      20      30

Analytical column: Xbridge C18 (3.5 µm, 2.1x50 mm) or Luna Phenyl-Hexyl (3.0 µm, 2.0x50 mm)
Organic solvent: Acetonitrile (ACN) or methanol (MeOH)

*Quattro Ultima™ Tandem MS/MS coupled to Waters 1525µ Binary pump

Regression Coefficients Scaled and Centered for Cortisol
[Figure: coefficient plots for Peak area and Retention time]

Optimization - CCF Design
Experimental domain

Variable parameters       HPLC-MS/MS    UPLC-MS/MS
Collision energy (eV)     10 - 30       20 - 40
Cone voltage (V)          20 - 40       20 - 40
Gradient slope (min)      1.0 - 4.0     1.0 - 3.0
Flow rate (ml/min)        0.2 - 0.4     0.45 - 0.65
Organic start (% v/v)     5 - 25        5
pH                        2.7 - 9.2     7.0

HPLC column: Xbridge C18 (3.5 µm, 2.1x50 mm)
UPLC column: Acquity BEH C18 (1.7 µm, 2.1x50 mm)
Ionization mode: ESI in positive mode
Organic phase: MeOH; Temperature: 20 °C; Capillary voltage: 3.5 kV

Observed vs Predicted Peak Area
[Figure: panels for Glycyrrhetinic Acid and Cortisol]

Response Surface Plot for Peak Area of Cortisol and Glycyrrhetinic Acid with the HPLC-MS/MS Assay

[Figure, left panel: Peak area of cortisol as a function of gradient slope and collision energy. Flow: 0.25 ml/min; Organic start: 5% v/v; pH: 7.0; Cone voltage: 20 V]
[Figure, right panel: Peak area of glycyrrhetinic acid as a function of pH and collision energy. Flow: 0.25 ml/min; Organic start: 5% v/v; Gradient slope: 4 min; Cone voltage: 40 V]

Contour plot for Peak Height for Plasma Sample Analyzed with the Optimized UPLC-MS/MS Assay
[Figure: contour plots for Cortisone and Cortisol at cone voltages of 20 V, 30 V and 40 V]

MRM Chromatogram for Cortisol at Optimum Conditions for both HPLC-MS/MS and UPLC-MS/MS

[Figure: MRM chromatograms; panels HPLC-MS/MS and UPLC-MS/MS]

Validation Results for Glycyrrhetinic Acid

Calibration curve parameters
Compound name: Glycyrrhetinic acid
Coefficient of Determination: R^2 = 0.996133
Calibration curve: -2.56883e-008 * x^2 + 0.00107803 * x + 0.000694645
Response type: Internal Std (Ref 2), Area * (IS Conc. / IS Area)
Curve type: 2nd Order, Origin: Exclude, Weighting: 1/x^2, Axis trans: None
Concentration range: 3.00 - 2000 ng/ml
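To show how such a second-order calibration curve is used, the sketch below back-calculates a concentration from a measured response by solving the quadratic for x. The coefficients are the ones listed above; the function names are illustrative, and choosing the root that reduces to the linear solution at low concentration is an assumption about how the curve is applied.

```python
import math

# Calibration coefficients from the listed curve: response = A*x^2 + B*x + C
A = -2.56883e-08
B = 0.00107803
C = 0.000694645


def response(conc_ng_ml):
    """Predicted response (area ratio scaled by IS concentration)."""
    return A * conc_ng_ml ** 2 + B * conc_ng_ml + C


def concentration(resp):
    """Invert the calibration curve for a measured response.

    Solves A*x^2 + B*x + (C - resp) = 0 and returns the root that
    approaches (resp - C) / B when the curvature term is negligible.
    """
    disc = B * B - 4 * A * (C - resp)
    if disc < 0:
        raise ValueError("response lies above the maximum of the curve")
    return (-B + math.sqrt(disc)) / (2 * A)


measured = response(250.0)          # simulate a sample at 250 ng/ml
print(round(concentration(measured), 3))
```

The round trip only demonstrates the algebra; in practice the result is meaningful only for responses inside the validated concentration range.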

Intra-Assay %
  Accuracy    -7.2 – 1.9
  Precision   1.7 – 9.2
Inter-Assay %
  Accuracy    -8.2 – -2.4
  Precision   4.0 – 9.7
BLQ – 3 ng/mL
Stability: Stable for; Autosampler at least 24 hr

[Figure: calibration curve, Response (0.00 to 2.00) versus concentration (0 to 1800 ng/ml)]

Analysis of Glycyrrhetinic Acid in Plasma from Patient

[Figure: chromatograms; panels: Blank injection, Patient sample after two weeks of liquorice consumption, Patient sample before liquorice consumption]

Conclusions
• DoE was useful for evaluation of the experimental factors and for optimization of the quantification methods
• The UPLC-MS/MS assay provides significant improvements in sensitivity and analysis run time for quantification of the analytes in plasma
• A rapid and robust method for quantitative determination of 18β-glycyrrhetinic acid in human plasma was successfully developed and validated with HPLC-MS/MS
• The method was implemented for analysis of samples from individuals with and without liquorice consumption

Usefulness of DoE;

• Identifying relationship between cause and effect
• Discovering interactions among factors
• Screening many factors and deciding which are the important ones
• Establishing and maintaining
• Optimizing a process
• Reducing uncertainty of quantitative estimates
• Saving time

Acknowledgements
University of Iceland, Faculty of Pharmaceutical Sciences
  – Unnur Thorsteinsdóttir
ArcticMass;
  – Finnur Freyr Eiríksson
Landspítali University Hospital;
  – Baldur Bragi Sigurdsson
  – Helga A Sigurjónsdóttir
Data Analytics Solutions, MKS Instruments AB
  – Lennart Eriksson

Financial support:
The Icelandic Research Fund
University of Iceland Research Fund