ABSTRACT

XUE, XIANGZHONG. Electronic System Optimization Design via GP-Based Surrogate Modeling. (Under the direction of Dr. Paul D. Franzon.)

For an electronic system with a given circuit topology, the designer's goal is usually to automatically size the devices and components to achieve globally optimal performance while at the same time satisfying the predefined specifications. This goal is motivated by a human desire for optimality and perfection.

This research project improves upon current optimization strategies. Many types of convex programs and convex fitting techniques are introduced and compared, and the evolution of Geometric Program (GP)-based optimization approaches is investigated through a literature review. By these means, it is shown that a monomial-based GP can achieve optimal performance and accuracy only for long-channel devices, and that a piecewise linear (PWL)-based GP works well only for short-channel, narrow devices fitted over a small amount of data. Building on the known GP optimizer and convex PWL fitting techniques, an innovative surrogate modeling and optimization algorithm is proposed to iteratively improve performance accuracy for wide transistors with short channels. The new surrogate strategy, which comprises a fine model and a coarse model, can automatically size the devices, create a reusable system model for designing electronic systems, and noticeably improve prediction accuracy, particularly when compared to the pure GP-based optimization method.

To verify the effectiveness and viability of the proposed surrogate strategy, two widely used optimization designs are employed: a two-stage operational amplifier (op-amp) and an LC-tuned oscillator. In addition, detailed analysis and simulation demonstrate that the optimal results of the coarse and fine models in the proposed surrogate strategy gradually converge to each other over the iterations, while achieving over 10% improvement in performance accuracy compared to the previous PWL-based GP algorithm. As a result, the proposed surrogate modeling and optimization algorithm can serve as an efficient Computer Aided Design (CAD) tool capable of dramatically improving performance for Integrated Circuit (IC) design.

© Copyright 2012 by Xiangzhong Xue

All Rights Reserved

Electronic System Optimization Design via GP-Based Surrogate Modeling

by Xiangzhong Xue

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Electrical Engineering

Raleigh, North Carolina

2012

APPROVED BY:

_________________________________
Dr. Paul D. Franzon
Committee Chair

_________________________________
Dr. Michael B. Steer

_________________________________
Dr. W. Rhett Davis

_________________________________
Dr. Harvey Charlton

DEDICATION

To my wife, son, and all other family members for a lifetime of support and encouragement.


BIOGRAPHY

Xiangzhong Xue was born on July 17, 1976 in Chongqing, China. In 1997, he received a Bachelor of Science degree from Chongqing University in China. He later moved to the United States and studied at North Carolina Agricultural and Technical State University in Greensboro, North Carolina, graduating with a Master of Science from the Department of Electrical and Computer Engineering in August 2003. Continuing his doctoral work at North Carolina State University in Raleigh, North Carolina, he graduated in May of 2012 with a Ph.D. in Electrical and Electronic Engineering.


ACKNOWLEDGMENTS

This dissertation would not have been possible without the great help and encouragement of my advisor, Dr. Paul D. Franzon, whom I most sincerely and deeply thank for all he has done to help me in my studies and my life. I will always be grateful, too, for the guidance I received from Dr. Michael B. Steer, Dr. Rhett Davis, and Dr. Harvey Charlton.

A special warm thanks to my wife, Xiuqiong Liu; my son, Ziang Xue; and all of my friends for their continued support, care, and assistance.

An expression of sincere gratitude is extended to all my roommates, who have provided me with generosity, help, and friendship through my time in Raleigh. I could not mark the occasion without acknowledging them.

Finally, in appreciation for a lifetime of support and encouragement, I would like to thank my parents, Bangquan Xue and Xifang Xiang, as well as the rest of my family in China.


TABLE OF CONTENTS

List of Tables
List of Figures

Chapter 1  Introduction
  1.1 Background and Motivation
  1.2 Research Objectives
  1.3 Research Contributions
  1.4 Dissertation Organization

Chapter 2  Convex Program
  2.1 Overview and Introduction
  2.2 General Mathematical Optimization Problem
  2.3 Conic Programs
  2.4 Convex Programs
    2.4.1 Convex Set and Convex Function
    2.4.2 Convexity Verification
    2.4.3 Theoretical Properties
    2.4.4 Numerical Solution Algorithms
    2.4.5 Applications

Chapter 3  Geometric Program
  3.1 Evolution of Geometric Programming
  3.2 Standard Geometric Program
  3.3 Relevant Terminology for the Geometric Program
    3.3.1 Monomial Functions and Examples
    3.3.2 Posynomial Functions and Examples
    3.3.3 Inverse Posynomial Functions and Examples
    3.3.4 Generalized Posynomial Functions and Examples
    3.3.5 Signomial Functions and Examples
  3.4 Transformation Methods for the Geometric Program
    3.4.1 GP-compatible Algebraic Transformation
    3.4.2 Fractional Powers of Posynomials
    3.4.3 Maximum of Posynomials
    3.4.4 Function Composition
    3.4.5 Additive Log Terms
    3.4.6 Generalized Posynomial Equality Constraints
    3.4.7 Application Example: Geometric Program in Convex Form
  3.5 Optimization Sensitivity Analysis
    3.5.1 Tradeoff Analysis
    3.5.2 Optimization Sensitivity Theory
  3.6 Convex Approximation and Fitting
    3.6.1 Convex Approximation and Fitting Theory
    3.6.2 Monomial Fitting
    3.6.3 Extended Monomial Fitting
    3.6.4 Max-monomial Fitting
    3.6.5 Posynomial Fitting
    3.6.6 Convex Piecewise-Linear (PWL) Fitting
  3.7 Methods for Solving Geometric Programs
  3.8 Generalized GP and Relevant Solution Methods
  3.9 Signomial Problem and Relevant Solution Methods
  3.10 Summary

Chapter 4  Some Reviews of Analog Optimization Design
  4.1 Introduction
  4.2 GP-compatible Bias Description Techniques
    4.2.1 Bias Condition Description with PF Equalities
    4.2.2 Techniques for Converting PF Equality Constraints
  4.3 Reviews of Two Analog Optimization Designs
    4.3.1 Circuit Topology, Optimization Objective and Specifications
    4.3.2 Hershenson's Work
    4.3.3 Kim's Work
  4.4 GP-compatible Device Model

Chapter 5  GP-Based Surrogate Modeling and Optimization Algorithm
  5.1 Surrogate Theory
  5.2 GP-based Surrogate Strategy
  5.3 Design Case Study
    5.3.1 Two-Stage Op-amp Design
    5.3.2 LC-Tuned Oscillator Design

Chapter 6  Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work

References

Appendices
  Appendix A Convexity of Point-wise Maximum of Convex Function
  Appendix B Norm Approximation in Function Fitting
    B.1 The Manhattan or L1 Norm Approximation
    B.2 The Euclidean or L2 Norm Approximation
    B.3 The Chebyshev or L∞ Norm Approximation
    B.4 The Hölder or Lp Norm Approximation
    B.5 Largest k-term Norm
  Appendix C Optimization Objective and Specifications for Two-Stage Op-amp
  Appendix D Preparation for LC-Tuned Oscillator Optimization Design
    D.1 Analysis of Typical RLC Oscillator
    D.2 LC-Tuned Oscillator Model
      D.2.1 Transistor Model
      D.2.2 Spiral Inductor and Varactor Model
      D.2.3 LC-Tank Model
    D.3 Design Specifications
      D.3.1 Power Dissipation
      D.3.2 LC-Tank Switching Voltage
      D.3.3 Phase Noise
      D.3.4 Resonant Frequency
      D.3.5 Tuning Range
      D.3.6 Inverse Loop Gain
      D.3.7 Varactor Tuning Range
      D.3.8 Bias Condition
      D.3.9 Size Constraints


LIST OF TABLES

Table 4.1 Comparison between the two results from the GP optimizer and 0.6µm SPICE technology
Table 4.2 Comparison between the two results from the GP optimizer and 0.18µm SPICE technology
Table 4.3 Max/mean MF modeling error in the GP model
Table 4.4 Discrepancy in specifications of the two-stage op-amp design
Table 4.5 NMOS modeling error in TSMC 0.18µm technology
Table 4.6 CPWL- and MF-based optimization by GP and SPICE
Table 4.7 Coefficients and exponents of Cgs of PMOS transistor with Vbs
Table 4.8 Mean modeling error of design parameters in TSMC 0.18µm technology
Table 5.1 Specifications and optimization performance for two-stage op-amp design
Table 5.2 Optimal performance for two-stage op-amp with different specifications
Table 5.3 Model expressions of optimization objective and specifications in LC-tuned oscillator
Table 5.4 Specifications and optimization performance for 2.4GHz LC-tuned oscillator design


LIST OF FIGURES

Figure 2.1 Examples of (a) convex set and (b) non-convex set
Figure 2.2 Convex function illustration
Figure 2.3 Three functions in (2.10) depicted on log-log scale
Figure 3.1 Function f(x) = atan(x) and its least-squares monomial fitting
Figure 3.2 Procedures of convex PWL fitting algorithm
Figure 4.1 Common-source op-amps with PMOS active load
Figure 4.2 Two-stage op-amp optimization design
Figure 4.3 (a) NMOS transistor without Vbs, (b) PMOS transistor without Vbs, and (c) PMOS transistor with Vbs, operating in the saturation region, used for creating PWL and monomial device models
Figure 5.1 Evaluation of convergence of the two models' performance in the surrogate strategy: (a) initial design; (b) second design; (c) nth design
Figure 5.2 Surrogate modeling and optimization design flow
Figure 5.3 Optimal low-frequency gain and phase margin at 1st iteration
Figure 5.4 Optimal low-frequency gain and phase margin at 2nd iteration
Figure 5.5 Optimal low-frequency gain and phase margin at 3rd iteration
Figure 5.6 Tradeoff curve of bandwidth vs. core area at 1st iteration
Figure 5.7 Tradeoff curve of bandwidth vs. core area at 2nd iteration
Figure 5.8 Tradeoff curve of bandwidth vs. core area at 3rd iteration
Figure 5.9 LC-tuned oscillator optimization design
Figure 5.10 Circuit-simulation-based phase noise of LC-tuned oscillator at 1st iteration
Figure 5.11 CPWL optimal phase noise of LC-tuned oscillator at 1st iteration
Figure 5.12 Circuit-simulation-based phase noise of LC-tuned oscillator at 2nd iteration
Figure 5.13 CPWL optimal phase noise of LC-tuned oscillator at 2nd iteration
Figure 5.14 Circuit-simulation-based phase noise of LC-tuned oscillator at 3rd iteration
Figure 5.15 CPWL optimal phase noise of LC-tuned oscillator at 3rd iteration
Figure D.1 Analysis of typical RLC oscillator: (a) LC tank; (b) equivalent circuit of LC tank; (c) LC tank with negative transconductance compensating the ohmic loss
Figure D.2 LC-tuned oscillator
Figure D.3 Small-signal model of NMOS transistor
Figure D.4 Small-signal analysis of cross-coupled NMOS pair
Figure D.5 Square spiral inductor used in LC-tuned oscillator
Figure D.6 Small-signal model of LC-tuned oscillator
Figure D.7 Reduced small-signal model of LC-tuned oscillator
Figure D.8 Phase noise analysis of LC-tuned oscillator


Chapter 1

Introduction

1.1 Background and Motivation

Since the early 1960s, electronic Integrated Circuits (ICs) have become increasingly complex. One particular trend is that the number of components in a single Integrated Circuit (IC) doubles every 18 months, as predicted by Moore's law [1]. Some current ICs, for example, may contain devices numbering in the hundreds of millions, with an equivalent number of interconnect wires [2]. Another trend is the rapidly increasing use of mixed-signal ICs in fields such as telecommunications, computing, and automotive engineering [3].

Such an increase in design complexity, along with a growing demand for faster production and greater cost effectiveness in a highly competitive market, points strongly to the need for innovative electronic design automation (EDA) and Computer Aided Design (CAD) technologies for use in designing complex electronic systems.

Although the key component in EDA and CAD tools is the optimization engine, this engine is often driven by outdated optimization algorithms; the optimizers used in SPICE, for instance, are over 20 years old [4]. Thus, the focus of this work is to propose a novel approach to automatic optimization techniques. The optimizers within the proposed optimization engine are expected to handle effectively the large numbers of design variables and constraints, as well as the wide stochastic uncertainty and variation, that arise in increasingly complex electronic systems.


The literature on the subject of CAD technologies for ICs is extensive. A good survey of early research can be found in [2]; more recent papers include [3], [5], and [6]. Among existing techniques, the geometric program (GP)-based method has recently received close attention. Much work has been done, for example, on optimizing switched-capacitor filters [7], pipelined Analog-to-Digital Converter (ADC) design [8], Phase Lock Loop (PLL) design [9], and Direct Current to Direct Current (DC-DC) buck converter design [10]. Significantly, researchers have found that the GP-based method has overwhelming advantages: it always automatically finds the globally optimal solution (if one exists), it is extremely fast, and it is independent of the initial point.

However, the geometric program approach still has several drawbacks, in that it requires a posynomial function (PF) objective and inequality constraints and monomial function (MF) equality constraints. Yet transistor characteristics in models such as BSIM3v3 are usually highly non-convex, so convex fitting techniques such as monomial fitting and convex piecewise linear (PWL) fitting are needed to formulate a GP-compatible model. The approximation inevitably leads to some level of error. Work [11] shows that monomial fitting is suitable only for long-channel transistors and is not accurate enough for short-channel devices. Work [12] uses convex PWL fitting for short-channel, narrow transistors with a small number of design parameter data points fitted, and as a result higher fitting accuracy is achieved. However, our work shows that such an approach no longer works effectively for wide transistors when fitting a large amount of parameter data, and thus a way must be found to further improve the prediction accuracy for transistors in these situations.


1.2 Research Objectives

To handle the rapid increase in design complexity, the greater demand for reduced design-to-market time, and the need for greater cost effectiveness in a highly competitive market, innovative EDA and CAD technologies are urgently needed [3]. Also, for an electronic system with a given circuit topology, the designer's typical goal is to automatically size the devices and components to achieve globally optimal performance while at the same time satisfying the predefined specifications. This goal is motivated by an inherent human desire for optimality and perfection. Consequently, this work concentrates on proposing novel EDA and CAD tools. The resulting innovation should have the following noticeable features.

• It can completely and automatically size the devices and components of the electronic system under design, saving the designer from the tedious and time-consuming work of device-size tuning and enabling the designer to focus on more important design tasks, such as analyzing the trade-offs between competing objectives and constraints.

• The proposed CAD tool should embed a powerful optimizer that finds the existing globally optimal design automatically and extremely fast, directly from the given specifications and independently of the initial design point.

• The achieved final design should have high performance and accuracy compared with current EDA and CAD tools.

1.3 Research Contributions

As stated earlier, among the existing EDA and CAD techniques for IC design, the GP-based optimization approach has recently gained popularity [7, 8, 9, and 10] due to its unmatched advantages: it always finds the existing globally optimal solution automatically and efficiently, regardless of the initial point [7, 8, 9, and 10]. This meets the first two desired features of an IC design CAD tool. Consequently, such an automatic and efficient optimizer is embedded in our proposed CAD tool.

The GP approach requires a special format: a posynomial objective, posynomial inequality constraints, and monomial equality constraints [13]. However, today's transistor characteristics are highly non-convex, so convex fitting techniques are needed to formulate a GP-compatible device model. Here, a convex PWL (CPWL) fitting technique is selected for our proposed CAD tool to derive a GP-compatible device model, because it achieves higher fitting accuracy than many other convex fitting techniques, such as monomial fitting.

But our work shows that convex PWL fitting is not accurate enough for an electronic system involving wide transistors when a large amount of parameter data is fitted. In this work, we adopt the strategy of surrogate modeling to further enhance the performance accuracy iteratively. Our proposed CAD tool can then achieve high performance and accuracy, the third desired feature of an IC design CAD tool.

In summary, the contribution that our work makes to this field is that, based on the GP optimizer and CPWL fitting technology, we propose an innovative Surrogate Modeling and Optimization Algorithm (SMOA) to further enhance the performance and accuracy of electronic system optimal design, iteratively and automatically. This proposed surrogate strategy can serve as an efficient EDA and CAD tool with dramatic performance improvement for electronic system optimal design compared with existing similar tools. As far as we know, it is the first EDA and CAD tool with the following original, built-in features:


• For the first time, the surrogate models are created with a convex piecewise linear (CPWL)-based GP. The proposed modeling and optimization tool inherits from GP its high efficiency and its ability to find the globally optimal design. It also benefits from the CPWL fitting technique's high fitting precision. A better initial surrogate model, established with such a high-fidelity fitting technique, is very helpful in reducing the total number of iterations.

• In current modeling methods, the system models are created from the design parameter information of a single, standalone device model (for example, [11] and [12]). The problem with this approach is that such design parameter information is isolated from the actual electronic system being designed. In contrast, except for the first iteration, we extract the needed design parameters directly from the fine model of the system under design. These extracted parameter data are directly related to the electronic circuit system being designed and carry much more relevant physical and electrical information about it, which is highly beneficial for establishing a more accurate surrogate model. In addition, this reduces the discrepancy between the surrogate model and the fine model; the number of iterations is reduced, and the design time is cut accordingly.

• To ensure convergence between the fine model and the surrogate model, and to increase the convergence speed, matching constraints and Jacobian conditions, denoted by a reflection function $\Phi$, can be imposed between the optimal design values of the two models. That is, we can obtain the optimal values of the optimization design variables of the fine model from the optimal values of the optimization design variables of the surrogate model through the relationship

$$x^{(n+1)} = \Phi(x)$$

This guarantees that we achieve the optimal design of the electronic system while meeting all the required specifications.

• To generate a sequence of optimization design variable points, we choose to sweep one or more design variables. The sweep is very flexible; for convenience, only one design variable (the bias current) is swept in both design case studies. A sketch of the resulting iteration loop follows this list.
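To make the iteration concrete, the loop implied by these features can be sketched as follows. This is a minimal illustration only: the callables fit_coarse, solve_gp, simulate_fine, and phi are hypothetical stand-ins for the CPWL/GP model construction, the GP solver, the fine-model (e.g., SPICE) evaluation, and the reflection function $\Phi$; none of these names comes from the dissertation itself, and a single scalar design variable (the swept bias current) is assumed.

    # Minimal sketch of the proposed surrogate loop (all helpers hypothetical).
    def surrogate_optimize(fit_coarse, solve_gp, simulate_fine, phi, spec, x0,
                           tol=1e-3, max_iter=10):
        """Iterate between the coarse (CPWL-based GP) and fine (simulated) models."""
        x = x0
        for n in range(max_iter):
            # After the first pass, the coarse model is refit from parameter
            # data extracted directly from the fine model of the design itself.
            coarse_model = fit_coarse(x)
            # The GP solver returns the global optimum of the coarse model,
            # fast and independent of any initial point.
            x_star = solve_gp(coarse_model, spec)
            # Evaluate the fine model (e.g., a circuit simulation) there.
            fine_result = simulate_fine(x_star)
            # Reflection step, x^(n+1) = phi(x*), carrying the matching and
            # Jacobian conditions between the two models.
            x_next = phi(x_star, fine_result)
            if abs(x_next - x) < tol:   # the two models' optima have converged
                return x_next
            x = x_next
        return x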

1.4 Dissertation Organization

This work has six chapters. The introduction begins with a background overview and a description of our research motivation for improving design automation and CAD tools for IC optimization design. Next, the objectives of this work are stated, along with the original contributions it will make in the field. Chapter 2 introduces the convex program (CP) and some of its variations, some of the transformations involved, and fitting technologies. Chapter 3 introduces geometric programming (GP), a special optimization method that plays an important role in two previous research works [11, 12]. It also describes two relevant convex fitting techniques: the monomial fitting and convex piecewise linear (PWL) fitting technologies used in [11] and [12]. The GP optimizer and convex PWL fitting are also one basis of this work. Chapter 4 reviews two previous works [11, 12] and analyzes the optimization algorithms and fitting technologies used in them; the modeling accuracies achieved by the monomial fitting and convex PWL fitting technologies are then compared for different transistor sizes. The major prediction error sources are also analyzed. Chapter 5 proposes a surrogate modeling and optimization strategy based on the GP optimizer and convex PWL fitting technique to further improve performance accuracy. Its key idea and procedures are described in detail. Two design examples are also carried out to verify the effectiveness and viability of the proposed surrogate modeling and optimization algorithm. Chapter 6 states our conclusions and describes possible future research.


Chapter 2

Convex Programs

2.1 Overview and Introduction

Human beings have an inherent desire for their tools and technologies to perform with optimal efficiency and perfection. The search for extremes has inspired mountaineers, scientists, mathematicians, and the rest of humanity since the beginning of history [14].

Optimization techniques can help humans and machines make decisions as they approach goals that have been set in response to a given problem [15]. As a result, various design solutions can be realized: designs for minimal operating cost, maximum profit, minimum loss, minimum fuel consumption, optimal product quality, smallest device size, optimal strategy, optimal path, optimal trajectory, optimal orbit, aerospace trajectory optimization, optimal surface, optimal overall performance, and other goals.

The rapid development of optimization techniques began during World War II, when such techniques were needed to optimize the trajectories of missiles [16]. Subsequently, mathematical programming was developed to realize optimization through the application of mathematical techniques. A beautiful and practical mathematical theory of optimization (i.e., search-for-optimum strategies) emerged rapidly in the 1960s, when computers became widely available. For example, the optimization of meta-heuristics, which imitates physical phenomena and the evolution of living organisms, has been in development since the 1970s. Every new generation of computers allows us to attack new types of problems and calls for new methods. The goal of optimization theory is the creation of reliable methods to find the extremum of a function by an intelligent arrangement of its evaluations (measurements). In the past, new optimization techniques were developed in each era to realize the solutions customers needed. Now, however, those optimizers have become dated, and new solutions must be realized. For example, the system optimization engine in SPICE, an important and popular circuit Computer Aided Design (CAD) tool, is still driven by an optimization algorithm that is over 20 years old [5].

Up-to-date optimization theories are vitally important for modern engineering and planning, which incorporate optimization at every step of a complicated decision-making process [14]. Hence, this chapter introduces some recent optimization techniques and their practical applications, without emphasizing theoretical optimization background or reviewing classical optimum methods, such as conjugate gradients, Newton's method, and methods for constrained and unconstrained optimization, including linear and quadratic programming. Specifically, several types of optimization programs are introduced for the purpose of comparison, with a focus on the most widely used one, the convex program. This program has recently been used in a variety of fields, for example, control engineering, signal processing, circuit design, economics, finance, and communication and information theory.

2.2 General Mathematical Optimization Problem

First we will introduce the general optimization problem. A general optimization problem can be commonly abstracted and stated in the form [17, 18],


minimize f0 x

subject to gi x  1 i  1,  ,m (2.1)

hj x  1 j  1,  ,n

n The vector x R is the optimization variables; the function f0 : x  R is the objective

function which will be designed or realized; the functions gi (x) 1 and hj (x)  1are the involved inequality and equality constraints, respectively.

Specifically, for an optimization problem the objective is to find such an optimal solution

* * x that f0 (x )is the minimum (the maximum is also possible in some cases) satisfying the set

of constraints gi (x) 1 and , which reflect the requirements on the expected products and/or the limited situations and technologies that are currently available. Since the exact optimal solution cannot always be found for a real time optimization problem, we often try to achieve a practical near-optimal solution xˆ within a predefined error tolerance  such

* that f0 (xˆ)  f0 (x )   .
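For concreteness, a toy instance of (2.1) can be handed to a general-purpose NLP solver. The sketch below uses SciPy's minimize with an invented objective and constraints ($f_0(x) = x_1^2 + x_2^2$, $g_1(x) = 1/(x_1 x_2) \le 1$, $h_1(x) = x_1/x_2 = 1$); the numbers are illustrative only.

    # Toy instance of (2.1): minimize x1^2 + x2^2 s.t. 1/(x1*x2) <= 1, x1/x2 = 1.
    import numpy as np
    from scipy.optimize import minimize

    res = minimize(
        fun=lambda x: x[0]**2 + x[1]**2,               # objective f0
        x0=np.array([2.0, 3.0]),                       # arbitrary starting point
        constraints=[
            # SciPy's "ineq" convention is fun(x) >= 0, so g1(x) <= 1 becomes:
            {"type": "ineq", "fun": lambda x: 1.0 - 1.0 / (x[0] * x[1])},
            {"type": "eq",   "fun": lambda x: x[0] / x[1] - 1.0},   # h1(x) = 1
        ],
        bounds=[(1e-6, None), (1e-6, None)],           # keep variables positive
    )
    print(res.x)  # -> approximately [1.0, 1.0], with f0* = 2

Note that such a general solver returns only a local optimum and depends on the starting point, which is exactly the weakness that the convex and geometric programs of the following sections avoid.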

2.3 Conic Programs

One family of optimization problems that has become quite common is the primal conic program and its corresponding dual [13]:

$$\begin{aligned}
\text{minimize}\quad & c^T x \\
\text{subject to}\quad & Ax = b \\
& x \in \mathcal{K}
\end{aligned} \tag{2.2}$$

where the set $\mathcal{K}$ is a closed, convex cone (it satisfies $\alpha \mathcal{K} = \mathcal{K}$ for any $\alpha > 0$). The most common conic form is the Linear Program (LP), for which $\mathcal{K}$ is the nonnegative orthant


n n   R {x R | x j  0, j 1,  , n} (2.3)

Two conic forms dominate recent study and implementation. One is the semi-definite program (SDP), for which  is an isomorphism of the cone of positive semi-definite matrices:

n T nn S {X  X  R | min (X )  0} (2.4)

The second is the second-order cone program (SOCP), for which $\mathcal{K}$ is the Cartesian product of one or more second-order or Lorentz cones:

$$\mathcal{K} = \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_k}, \qquad \mathcal{Q}^n = \{\, (x, y) \in \mathbb{R}^n \times \mathbb{R} \mid \|x\|_2 \le y \,\} \tag{2.5}$$

SDP and SOCP receive this focused attention because many applications have been discovered for them, and because their geometry admits certain useful algorithmic optimizations [19, 20, and 21]. Publicly available solvers for SDP and SOCP include [22, 23, and 24]. These solvers are generally quite efficient and reliable, and they require no external code to perform function calculations.
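As a small illustration of the Lorentz-cone form (2.5), the sketch below poses one second-order cone constraint through CVXPY, which hands the problem to one of these conic solvers; the problem data are invented for the demonstration.

    # A tiny SOCP: minimize c^T x subject to ||A x + b||_2 <= x3 + 1.
    import numpy as np
    import cvxpy as cp

    x = cp.Variable(3)
    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    b = np.array([0.5, -0.5])
    c = np.array([1.0, 1.0, 2.0])

    prob = cp.Problem(cp.Minimize(c @ x),
                      [cp.norm(A @ x + b, 2) <= x[2] + 1.0])
    prob.solve()                   # dispatched to a conic (SOCP) solver
    print(prob.value, x.value)     # optimum at x = (-0.5, 0.5, -1)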

2.4 Convex Program

Another very important class of mathematical optimization problems is the convex program, which has the following form [17]:

$$\begin{aligned}
\text{minimize}\quad & f_0(x) \\
\text{subject to}\quad & f_i(x) \le 1, \quad i = 1, \dots, m \\
& g_j(x) = 1, \quad j = 1, \dots, p
\end{aligned} \tag{2.6}$$

Compared with general optimization problems, the convex problem has the following additional requirements:


• The objective function must be convex.

• The inequality constraint functions must be convex.

• The equality constraint functions must be affine.

A convex program (CP) is a special case of (2.1): the objective function $f_0(x)$ and the inequality constraint functions $f_i(x)$ are convex, and the equality constraint functions $g_j(x)$ are affine.

It is necessary to know that when a convex program is used to design an electronic system in practice, the design variables should also satisfy $x > 0$, because they usually represent the device sizes and component values of the electronic system being designed.

One important property of the convex program is that the feasible set of a convex optimization problem is convex, because it is the intersection of the problem domain

$$\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i$$

with the (convex) sets defined by the inequality and equality constraints. Therefore, a convex optimization problem amounts to minimizing a convex objective function over a convex set, and thus any local optimum is necessarily globally optimal, an overwhelming advantage over general nonconvex optimization problems.

Another fact worth knowing is that, although it is not obvious, a convex program can be reformulated as a conic program by selecting an appropriate cone $\mathcal{K}$ in (2.2) [13].

Certain special cases of (2.6) receive considerable attention. By far the most common is the linear program, for which the functions $f_i$ and $g_j$ are all affine; e.g.,

$$\begin{aligned}
\text{minimize}\quad & c^T x \\
\text{subject to}\quad & Ax \le b
\end{aligned} \tag{2.7}$$


Similarly, when an LP is used to design a practical electronic system, there should be an extra constraint on the design variables, namely $x > 0$, because the variables represent the device sizes and component values of the electronic system being designed.

Quadratic programs (QPs) and least-squares problems can be expressed in this form as well. The class of nonlinear programs (NLPs) includes nearly all problems that can be expressed in the form (2.1) with $x \in \mathbb{R}^n$, although assumptions are generally made about the continuity and/or differentiability of the objective and constraint functions. The set of CPs is a strict subset of the set of NLPs, and includes all LPs, convex QPs (i.e., those whose Hessian $\nabla^2 f$ is positive semi-definite), and least-squares problems. Several other classes of CPs have been identified recently as standard forms. These include second-order cone programs [26] and geometric programs [27, 28, and 29].

2.4.1 Convex Set and Convex Function

A set C  Rn is convex [18] if the line segment between any two points inC still lies inC , i.e.,

x1,x2 C, 0  θ 1  θx1  (1θ)x2 C (2.8)

For example, the set (a) in Figure 2.1 (below) is convex set but the set (b) is nonconvex because the line segment lies outside of the set.


Figure 2.1: Examples of (a) convex set (b) non-convex set


Accordingly, a function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if its domain is a convex set and the function satisfies inequality (2.9),

$$f(ax + by) \le a f(x) + b f(y) \tag{2.9}$$

for all $x, y \in \mathbb{R}^n$ and all $a, b \in \mathbb{R}$ with $a + b = 1$, $a \ge 0$, $b \ge 0$. Geometrically, inequality (2.9) means that the line segment between $(x, f(x))$ and $(y, f(y))$ lies above the graph of $f$, as shown in Figure 2.2.


Figure 2.2: Convex function illustration

2.4.2 Convexity Verification

Clearly, verifying the convexity of the objective function and the constraint functions is important when testing whether an optimization problem is a convex program. Unfortunately, it can be difficult to determine whether a function with many variables is convex (or nearly convex).

In general, this task is at least as difficult as solving nonconvex problems: that is, it is theoretically intractable.


In this section, we introduce some techniques that can be used to prove or disprove the convexity of a general function $F(y) = \log f(e^y)$, where $y = \log x$ is an $n$-dimensional variable vector [30].

Method 1: Check the curvature of the function under test

For the special case n 1, the convexity of a function F(y) is readily determined by simply plotting this function and checking whether it has positive (upward) curvature. This is the same as plotting f (x) on a log-log plot, and checking whether the resulting graph has positive curvature. If the curvature of the function is curved, then it is convex; if this curve is straight, then it is affine, and also convex; otherwise, it is not convex.

For higher-dimensional functions, a function is convex only if it is convex when restricted to any straight line. In other words, one can plot $F(y_0 + tv)$ versus $t$ for various values of $y_0$ and $v$. If any of these plots exhibits negative (i.e., downward) curvature, then $F$ is not convex. For this reason, it is not easy to directly prove that a high-dimensional function is convex.

The following three functions of one variable in (2.10) are demonstrated here as an example of verifying the convexity of a function,

x4 1, atan(x) and  x3 1 (2.10)

These functions are plotted over the range $0.1 \le x \le 1$ on a log-log scale in Figure 2.3.

From the figure one may see that the first function is convex, since its graph is an upward curve on this log-log plot; the second is also convex, because its graph is relatively straight; the third is not convex, since its graph has substantial downward curvature.



Figure 2.3: Three functions in (2.10) depicted on log-log scale
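Method 1 can also be carried out numerically instead of visually. A minimal sketch (assuming NumPy) samples each function in (2.10) on a logarithmic grid and examines second differences of $F(y) = \log f(e^y)$, whose sign is the curvature of the log-log graph; the grid stops just below $x = 1$ so that $-x^3 + 1$ stays positive.

    # Discrete log-log curvature check for the three functions in (2.10).
    import numpy as np

    def loglog_second_diff(f, x_lo=0.1, x_hi=0.95, n=200):
        y = np.linspace(np.log(x_lo), np.log(x_hi), n)   # y = log x
        F = np.log(f(np.exp(y)))                         # F(y) = log f(e^y)
        return np.diff(F, 2)                             # discrete curvature of F

    for name, f in [("x^4 + 1",  lambda x: x**4 + 1),
                    ("atan(x)",  np.arctan),
                    ("-x^3 + 1", lambda x: -x**3 + 1)]:
        print(name, "min second difference:", loglog_second_diff(f).min())

Here $x^4 + 1$ yields strictly positive second differences (upward curvature); $\operatorname{atan}(x)$ yields values of tiny magnitude, reflecting its nearly straight graph; and $-x^3 + 1$ yields clearly negative values, matching the downward curvature in Figure 2.3.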

Method 2: by verifying the inequality condition (2.9)

A function f : Rn  R is convex if domain of f , dom f , is a convex set and f satisfies

f (ax  by)  af (x)  bf (y) for all x, y  Rn and all a,b R with a  b 1 , a  0 , b  0 . Similarly, the convexity of F can be stated in terms of the original function f . This means that f must satisfy the inequality (2.11),

 ~1  ~1  ~ ~ 1 f (x1 x1 , ... , xn xn )  f (x1, ... , xn ) f (x1, ... , xn ) (2.11) for any satisfying 0  1. In other words, when f is evaluated at a weighted geometric mean of two points, it cannot be more than the weighted geometric mean of the function f evaluated at the two points.

Method 3: Check the first- or second-order conditions (i.e., the gradient $\nabla f$ and the Hessian $\nabla^2 f$) of the function under test, i.e.,

$$f(y) \ge f(x) + \nabla f(x)^T (y - x) \quad \text{for all } x, y \in \operatorname{dom} f \tag{2.12}$$

or

$$\nabla^2 f(x) \succeq 0 \quad \text{for all } x \in \operatorname{dom} f \tag{2.13}$$

This test method can be stated as the following results:

• If the Hessian of $f$ is positive semi-definite at every point of the feasible region, the problem is convex.

• If the Hessian of $f$ is negative definite at some point of the feasible region, the problem is nonconvex.

• Otherwise (e.g., the Hessian is negative semi-definite somewhere), no conclusion can be drawn.

Note that the third case does not mean that the problem is convex or non-convex, only that this test is unable to reach a conclusion.

For example, linear functions are convex because their second derivative is zero, thereby satisfying (2.13).

Another example is to prove, with Method 3, the convexity of the logarithmic transformation of a PF.

If a posynomial function $f$ is expressed in the form

$$f(x) = \sum_{i=1}^{m} c_i x_1^{\alpha_{1i}} x_2^{\alpha_{2i}} \cdots x_n^{\alpha_{ni}}$$

then, after introducing the logarithmic transform $y_i = \log x_i$ (that is, $x_i = e^{y_i}$), the posynomial changes to the form

$$f = \sum_{i=1}^{m} c_i (e^{y_1})^{\alpha_{1i}} (e^{y_2})^{\alpha_{2i}} \cdots (e^{y_n})^{\alpha_{ni}} = \sum_{i=1}^{m} e^{a_i^T y + b_i}$$

where $a_i = (\alpha_{1i}, \alpha_{2i}, \dots, \alpha_{ni})^T$, $b_i = \log c_i$, and $y = (y_1, y_2, \dots, y_n)$. This is a so-called sum-exp function, $f = e^{\theta_1} + \cdots + e^{\theta_m}$, where $\theta_i = a_i^T y + b_i$. Its logarithm is the so-called log-sum-exp function $F = \log(e^{\theta_1} + \cdots + e^{\theta_m})$. Now we can verify the convexity of the log-sum-exp function with Method 3, first by computing its Hessian as

$$\nabla^2 F = \frac{1}{(\mathbf{1}^T z)^2}\left( (\mathbf{1}^T z)\,\operatorname{diag}(z) - z z^T \right)$$

where $z = (e^{\theta_1}, \dots, e^{\theta_m})$, and then by proving $\nabla^2 F \succeq 0$, which amounts to checking that $v^T \nabla^2 F\, v \ge 0$ for all $v$, i.e.,

$$v^T \nabla^2 F\, v = \frac{1}{(\mathbf{1}^T z)^2}\left( \Big(\sum_{i=1}^{m} v_i^2 z_i\Big) \Big(\sum_{i=1}^{m} z_i\Big) - \Big(\sum_{i=1}^{m} v_i z_i\Big)^2 \right) \ge 0$$

This result is guaranteed by the Cauchy-Schwarz inequality $(a^T a)(b^T b) \ge (a^T b)^2$ with $a_i = v_i \sqrt{z_i}$ and $b_i = \sqrt{z_i}$.

2.4.3 Theoretical Properties

Here we survey and introduce some theoretical properties of convex programs. By the 1970s, a comprehensive theory of convex analysis had appeared [31, 32], and advances continue [33, 34]. From it we can draw several powerful and practical theoretical conclusions about CPs.


First of all, the local optima of a convex program are also definitely global optima because, as stated previously, the feasible set of any convex program is the intersection of the sets defined by convex inequality constraints and affine equality constraints. However, this is not the case for general nonconvex NLPs. A general nonconvex NLP may have multiple local optima, so it usually needs exhaustive computation to verify whether each local optimum is global. Convex programming also has a rich duality theory. The dual of a CP is itself a CP, and its solution often provides interesting and useful information about the original, or primal, problem. For example, if the dual problem is unbounded, then the primal must be infeasible. Under certain conditions, the reverse implication is also true: If the primal is infeasible, then its dual must be unbounded. These and other consequences of duality facilitate the construction of numerical algorithms with definitive stopping criteria for detecting infeasibility, unboundedness, and near-optimality. In addition, the dual problem provides valuable information about the sensitivity of the primal problem to perturbations in its constraints. A more complete development of convex duality can be found in [31, 21].

Another important property of CPs is the provable existence of efficient algorithms for solving them. Nesterov and Nemirovsky proved that a polynomial-time barrier method can be constructed for any CP that meets certain technical conditions [35]. Other authors have shown that the problems that do not meet those conditions can be embedded into larger problems that do, effectively making barrier methods universal [21, 36].

Note that the theoretical properties discussed here, including the existence of efficient solution methods, hold even if a CP is non-differentiable: that is, if one or more of the constraints or objective functions is non-differentiable.


2.4.4 Numerical Solution Algorithms

Some efficient algorithms for solving CPs have been known since the 1970s, but only in the last two decades have some practical advances been made. Among those algorithms, interior-point methods [37, 38] have attracted the attention of many researchers because they can always, and very efficiently, find the existing globally optimal solution. At first, these methods were limited to LPs [39, 40, 41], but they were soon extended to other CPs [42, 43, 44, and 45], especially by Dr. Stephen P. Boyd and his research team. Now a number of excellent solvers are readily available, in two classes: those relying on standard forms and those based on custom code.

2.4.5 Applications

Many practical applications for convex programming have already been discovered, and the list is steadily growing. Perhaps the most mature and pervasive field of application for convex programming is control theory [46, 47, and 48]. Other fields where applications of convex optimization have been developed include robotics [49, 50], combinatorial optimization and graph theory [51, 52, 53, 54], structural optimization [55, 56, 57, 58], algebraic geometry [59, 60], signal processing [61, 62, 63, 64, 65, 66, 67], communications and information theory [68, 69, 70], networking [71, 72], circuit design [73, 74, 11, 75], neural networks [76], and economics and finance [77, 78].

This list is certainly incomplete, and it excludes applications where only LP or QP is employed. A list including LP and QP would be significantly larger, and yet convex programming is a generalization of these technologies.


A promising source of new applications for convex programming is the extension and enhancement of existing applications for linear programming. An example of this is robust linear programming, which allows uncertainties in the coefficients of an LP model to be accounted for in the solution of the problem by transforming it into a nonlinear CP [25]. This approach produces robust solutions more quickly and reliably than Monte Carlo methods.

Some may argue that our prognosis for the usefulness of convex programming is optimistic, but there is good reason to believe that the number of applications is in fact being underestimated. We can appeal to the history of linear programming as precedent. George Dantzig first published his invention of the simplex method for linear programming in 1947; and while a number of military applications were soon found, it was not until 1955-1960 that the field enjoyed robust growth [79]. This delay was in large part due to a dearth of adequate computational resources, but that is precisely the point: The discovery of new applications accelerated only after hardware and software advances made it truly practical to solve LPs.

Thus, for applications related to one special convex program, the geometric program, one can see the importance of the two optimization designs addressed in this work: the op-amp and the LC-tuned oscillator.


Chapter 3

Geometric Programming

There is another special standard form of the optimization problem (2.1): the geometric program (GP). Although a geometric program can be converted to a convex program after a logarithmic transformation, its discussion merits a whole chapter because of its rapidly increasing use in a wide variety of fields and the renewed interest it has generated recently. Moreover, it serves as one basis of the modeling and optimization strategy proposed in this work. This optimization program is therefore discussed in detail here.

What distinguishes GP is that its objective and constraint functions are posynomial and monomial functions of the design variables [17]. Once again, GP does not directly fall into the category of convex programs, but a simple logarithmic transformation produces an equivalent convex problem, which can then be solved with the efficient, mature solvers developed for convex optimization problems. This convertibility to a convex program has profound advantages: it is highly efficient, and if the problem is feasible, the global solution is always found regardless of the initial point. The latter is, perhaps, an even greater advantage than the high efficiency.

3.1 Evolution of Geometric Programming

Let us first give a brief introduction to the evolution of geometric programming. One pioneer devoted to geometric programming is Clarence Zener, who realized in 1961 that many cost minimization problems in engineering can be stated in a special form and solved very easily [80]. At almost the same time, Richard Duffin, a mathematics professor at Carnegie Mellon University, was developing a duality theory for nonlinear programming problems. Duffin learned of Zener's work and started to construct a mathematical framework for geometric programming based on his work on duality theory [81, 82]. A few years later, the two pioneers published the book Geometric Programming, in 1967 [83]. This book set the necessary groundwork for geometric programming. It includes the basic theory and some electrical engineering applications (e.g., optimal transformer design), but not much on numerical solution methods. The name "geometric programming" was given by Duffin, Peterson, and Zener in 1967 [83]. One might think that the name derives from the many geometric problems that can be formulated as geometric programs; however, the name comes from the extensively used geometric-arithmetic mean inequality, which played a central role in the early analysis of the geometric program.

Since the late 1960s there has been extensive work done on both the theoretical and practical aspects of geometric programming. Another two books appeared in the 1970s: Engineering Design by Geometric Programming, by Zener [84], and Applied Geometric Programming, by Beightler and Phillips [85]. In 1978, two issues of the Journal of Optimization Theory and Applications were entirely devoted to geometric programming. Additionally, there are several papers on methods for solving geometric programs. The 1980 survey paper by Ecker [86] has many references on applications and methods, including the numerical solution methods used at that time. Geometric programming is also briefly described in some surveys of optimization [87, 88].


Unfortunately, geometric programming has long been something of an outcast in the optimization community [89]. For years, the mathematical programming community regarded geometric programming as a mere curiosity. However, it has been enthusiastically embraced by the engineering community. This great change took place in the 1990s, led by Stephen Boyd and his research team at Stanford University. They developed efficient, mature numerical solvers for convex programs and geometric programs, and produced many works on engineering applications, especially electronic system design [11, 67, 73, 74, and 75].

There are various applications of the GP. The following selected list shows some of the range of applications. In Chemical Engineering: general problems [90], Williams Otto process optimization [91], and condenser design [92]. In Civil Engineering: cofferdam design optimization [93], structural design [94, 95], steel plate girders [96], and structural design of aircraft wings [97]. In Economics: marketing-mix problem [98], inventory optimization [99, 100, 101], and maximization of profit rate [102]. In Electrical Engineering: transformer design [93], digital circuit transistor sizing [11, 67, 73, 74, 75], and optimal floorplanning for layout [103, 104]. In Environmental Engineering: wastewater treatment plants [105, 106], and water treatment [107]. In Mechanical Engineering: optimal design of journal bearings [108], space trusses [109], and helical springs [110]. In Nuclear Engineering: cooling-tower system optimization [111]. In Communication Engineering: communication systems [18].

Obviously, geometric programming has had tremendous impact in various areas. It is surprising and interesting that although geometric programming looks like a restrictive type of optimization problem, it has been used to solve an extremely wide variety of practical problems. For a more extensive description of geometric programming theory, see [83].


3.2 Standard Geometric Program

In this section, a powerful optimization method, called geometric programming, is introduced for determining the component values and transistor dimensions in electronic circuit system optimization design. This method can handle a very wide variety of specifications and constraints extremely fast, and it results in globally optimal designs [11, 67, 73, 74, 75, and 112].

A standard geometric program (GP) is an optimization problem of the form

$$\begin{aligned}
\text{minimize}\quad & f_0(x) \\
\text{subject to}\quad & f_i(x) \le 1, \quad i = 1, \dots, m \\
& g_j(x) = 1, \quad j = 1, \dots, p \\
& x_k > 0, \quad k = 1, \dots, n
\end{aligned} \tag{3.1}$$

where $x = (x_1, \dots, x_n)$ are the optimization design variables, the objective and inequality constraint functions $f_i$, $i = 0, \dots, m$, are posynomials, the equality constraint functions $g_j$, $j = 1, \dots, p$, are monomials, and all the optimization variables $x_k$, $k = 1, \dots, n$, must be positive. A geometric program can be reformulated as a convex optimization problem by a logarithmic transformation of the design variables; more details are given later, in Section 3.4.7.

If the functions $f_i$ are generalized posynomials, problem (3.1) is called a generalized geometric program (GGP). If any function $g_j$ is a signomial function, then problem (3.1) is called a signomial program. A geometric program is also a GGP, because a posynomial is a special case of a generalized posynomial, and a generalized geometric program can be reformulated as an equivalent geometric program through some simple transformations.


The GP has some unmatched advantages over many current IC optimization technologies. First, it can handle large numbers of design variables and constraints, and it can be reformulated as a convex optimization problem and then solved extremely fast by interior-point methods [28, 113, and 114]. For example, systems involving tens of variables and hundreds of constraints can be solved in less than one second [114, 17], and those with thousands of variables and tens of thousands of constraints are readily solved on a small workstation in minutes [115]. Perhaps more important than the great efficiency is that GP always obtains the globally (not merely locally) optimal solution, if one exists, no matter what the initial point is. GP has naturally gained popularity in IC optimization design [11, 12].

However, this approach still has several limitations. It does not provide much insight into the failure of some specifications, nor does it suggest to the designer how to change the circuit topology for better results. It also requires a special problem format: a PF objective, PF inequality constraints, and MF equality constraints. Precisely for this reason, convex fitting techniques, such as monomial fitting and convex PWL fitting, are often needed to derive GP-compatible device models.

3.3 Relevant Terminology for the Geometric Program

In the following section, all of the terminology relevant to the geometric program will be introduced to facilitate understanding.

3.3.1 Monomial Functions and Examples

Let x  (x1,  , xn ) be an n-dimension real, positive variable. A function f (x) in the form of

26

1 2 n f (x)  cx1 x2  xn (3.2)

where the coefficient c  0 and the exponents ai  R , is called a monomial function (MF)

[17]. One should note the distinction between the MF defined here and the common algebra definition: The former can be any real value, including negative and fractional, but the latter

5.4 3.67 10 should be a nonnegative integer. For example, 10.7x1 x2 x3 is a MF of the

3 variables x1 , x2 and x3 , with a coefficient of 10.7 ; likewise, 5.1x xy / z is also a MF;

3 7.4 2 14 but 1.8x1 x2 x3 x4 is not, since its coefficient 1.8 is negative. One special case is that a positive constant such as 67.9 can be also considered an MF. Some properties of PF can be stated as follows:

 A positive constant is also a monomial.

 MF is closed under multiplication and division: If two functions f1 and f2 are MFs,

their product f1  f2 and quotient f1  f2 are also MFs. Two special cases are that an

MF scaling by any positive constant is still MF and an MF raised to any power is also

1 2 n an MF—that is, if function f (x)  cx1 x2 ...xn is an MF, then the scaled MF

1 2 n m m m1 m2 mn b f (x)  bcx1 x2 ...xn and raised power function [ f (x)]  c x1 x2 ...xn ,where

b  0 and m R , are still MFs.

3.3.2 Posynomial Functions and Examples

A sum of two or more monomial functions, of the form

$$f(x) = \sum_{i=1}^{m} c_i x_1^{\alpha_{1i}} x_2^{\alpha_{2i}} \cdots x_n^{\alpha_{ni}} \tag{3.3}$$

where $c_i > 0$, is called a posynomial function (PF) [116]. Some examples of PFs are $100.7$, $1.8\, x_1^{3} x_2^{7.4} x_3^{2} x_4^{14}$, and $4.92 + 1.8\, x_1^{3} x_2^{7.4} x_3^{2} x_4^{14} + 5.1\, x^3 \sqrt{xy}/z$, but the two functions $4.53\, xyz - \sec(x)$ and $(x - y + z)^5 + 2.56$ are not PFs. Some properties of PFs can be listed as follows:

• Any MF is also a (special) PF; PFs are closed under addition, multiplication, and nonnegative scaling.

• If a function $f$ is a PF and $m$ is a nonnegative integer, then $f^m$ is a PF, because it is a product of PFs.

• If $f_1$ is a posynomial and $f_2$ is a monomial, then $f_1 / f_2$ is a posynomial.

• If $f$ is a PF, then its logarithmically transformed function $F(y) = \log f(e^y)$ is convex (see Section 2.4.2).

3.3.3 Inverse Posynomial Functions and Examples

The reciprocal of a posynomial function, of the form

$$f(x) = \frac{1}{\displaystyle\sum_{i=1}^{m} c_i x_1^{\alpha_{1i}} x_2^{\alpha_{2i}} \cdots x_n^{\alpha_{ni}}} \tag{3.4}$$

is called an inverse posynomial function (IPF) [17]. For example, the functions $2/x$ and $1/(2x + y)$ are both IPFs.

An inverse posynomial function has the following properties:

• A monomial is also an inverse posynomial function.

• Inverse posynomials are closed only under multiplication and nonnegative scaling.

• If $f$ is an IPF and $g$ is a monomial, then $f/g$ is an IPF.


3.3.4 Generalized Posynomial Functions and Examples

A generalized posynomial function (GPF) [116] is a function constructed from posynomial functions using addition, multiplication, pointwise maximum, and raising to constant positive power. From this definition, one can see that a PF is also a special GPF.

For example, the functions $100.27\, x_1^{3.2} + 7.9\, x_2^{1.3}$, $8.2\,(x_1 x_2 + 7.4\, x_1^{2} x_2^{4.17} x_3^{3.91} + 0.605)^{4.51}$, and $\max\{\,0.2\, x_1 x_2^{1.8},\ 3.6\, x_1^{6.21} x_2^{3.4},\ 6.1\, x_2^{0.7} x_3^{5.14} + x_1^{3/8} x_2^{2} x_3^{4.17}\,\}$ are all GPFs. Some properties of GPFs can be stated as follows:

• An MF is also a GPF; a PF is also a GPF.

• GPFs are closed under addition, multiplication, nonnegative scaling, positive powers, and pointwise maximum, as well as under other operations that can be derived from these, such as division by monomials. They are also closed under composition in the following sense: if $\phi$ is a generalized posynomial of $k$ variables in which no variable occurs with a negative exponent, and $h_1(x), \dots, h_k(x)$ are GPFs, then the composition $f(x) = \phi(h_1(x), \dots, h_k(x))$ is a GPF.

• If a function $f$ is a GPF and $m$ is a nonnegative integer, then $f^m$ is a GPF.

• If $f_1$ is a GPF and $f_2$ is an MF, then $f_1 / f_2$ is a GPF.

• If $f$ is a GPF of $m$ variables with nonnegative exponents, and the functions $g_i$ ($i = 1, \dots, m$) are all GPFs, then the composition $f(g_1, \dots, g_m)$ is a GPF.

• If $f(x)$ is a GPF, then the logarithmically transformed function $F(y) = \log f(e^y)$ is convex.


3.3.5 Signomial Functions and Examples

In some cases, just as in an ordinary algebraic expression, the coefficients of an MF- or PF-like function may be negative. In such a case, we call a function of the form

$$f(x) = \sum_{i=1}^{m} c_i x_1^{\alpha_{1i}} x_2^{\alpha_{2i}} \cdots x_n^{\alpha_{ni}} \tag{3.5}$$

where $c_i \in \mathbb{R}$, a signomial function (SF) [30]. An SF has the same format as a PF; however, the coefficients of the former can be both positive and negative, whereas the coefficients of the latter can only be positive. So a PF is also an SF. For example, $x^3 - 9.21$ and $(x_1 - x_2 + 2x_3) + 8.25$ are both SFs. An SF has several properties:

• A constant is an SF; an MF is also an SF; a PF is also an SF.

• Any SF is the difference of two PFs, obtained by collecting the terms with positive coefficients and the terms with negative coefficients separately.

• SFs are closed under addition, subtraction, and multiplication.

• An SF divided by an MF, or by the negative of an MF, is still an SF.

• An SF raised to a nonnegative integer power is also an SF.

3.4 Transformation Methods for the Geometric Program

Simple tricks and transformations allow one to handle a wider variety of optimization problems by converting them to a standard GP. In this section, some such transformations are introduced for comparison and later use (more details can be found in [13]).


3.4.1 GP-compatible Algebraic Transformations

Scale transformation. If $f(x)$ is a PF and $k$ is a positive constant, then the inequality constraint $f(x) \le k$ can be changed to the standard posynomial inequality constraint

$$f(x)/k \le 1$$

which is compatible with the standard geometric program. Similarly, if another function $g(x)$ is a monomial function, then the inequality constraint $f(x) \le g(x)$ can be converted to the equivalent standard posynomial inequality constraint

$$f(x)/g(x) \le 1$$

which is also a posynomial constraint compatible with the standard geometric program [13].

Inverse transformation. If $f(x)$ is an IPF and $g(x)$ is a posynomial, the inequality constraint $f(x) \ge g(x)$ can be transformed into the standard posynomial inequality constraint

$$\frac{g(x)}{f(x)} \le 1$$

by dividing both sides by $f(x)$ (note that $1/f(x)$ is a posynomial). If $f(x)$ is an IPF and $g(x)$ is an MF or IPF, the inequality constraint $f(x)\, g(x) \ge 1$ can be converted into

$$\frac{1}{f(x)\, g(x)} \le 1$$

by dividing both sides by $f(x)\, g(x)$. The resulting inequality is compatible with the standard geometric program [17].


If the objective function $f_0(x)$ is an IPF or MF, then one can maximize it by minimizing its inverse, which is a PF or MF, respectively.

Monomial division transformation If the functions $f(x)$ and $g(x)$ are both MFs, then the equality constraint $f(x) = g(x)$ can be replaced by the equivalent standard equality constraint $f(x)/g(x) = 1$ or $g(x)/f(x) = 1$ [17].

An example is given here to demonstrate the basic GP model formulation and the application of these simple transformation methods. Suppose that we have a piece of rectangular aluminum material with sides $w$ and $l$. The material is used to make a cylindrical container, with radius $r$ and height $h$, whose volume $V$ is to be as large as possible. The following requirements are dictated by the available budget: the material's dimensions must satisfy $w_{min} \le w \le w_{max}$ and $l_{min} \le l \le l_{max}$, the material's area should be no more than $S_{max}$, the side area of the cylinder should be no more than $S_{max,1}$, and the total surface area of the cylinder should be no more than $S_{max,2}$. What, then, are the achievable maximum volume and the corresponding dimensions? This is a simple optimization problem, which can be extracted and expressed as follows:

Maximize   $V = \pi r^2 h$
Subject to $wl \le S_{max}$
           $2\pi rh \le S_{max,1}$
           $2\pi r^2 + 2\pi rh \le S_{max,2}$
           $w_{min} \le w \le w_{max}$          (3.6)
           $l_{min} \le l \le l_{max}$
           $h + 2r \le w_{max}$
           $h + 2r \le l_{max}$
           $2r \le l_{max}$
           $2r \le w_{max}$


This is not a standard geometric program, but it can be reformulated in the standard format given below by using the transformation methods described above.

Minimize   $V^{-1} = \pi^{-1} r^{-2} h^{-1}$
Subject to $(1/S_{max})\,wl \le 1$
           $(2\pi/S_{max,1})\,rh \le 1$
           $(2\pi/S_{max,2})\,r^2 + (2\pi/S_{max,2})\,rh \le 1$
           $(1/w_{max})\,w \le 1$
           $w_{min}\,w^{-1} \le 1$              (3.7)
           $(1/l_{max})\,l \le 1$
           $l_{min}\,l^{-1} \le 1$
           $(1/w_{max})\,h + (2/w_{max})\,r \le 1$
           $(1/l_{max})\,h + (2/l_{max})\,r \le 1$
           $(2/l_{max})\,r \le 1$
           $(2/w_{max})\,r \le 1$
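To make the reformulation concrete, a standard-form GP such as (3.7) can be solved directly with an off-the-shelf GP solver. The following minimal sketch uses the CVXPY package (whose geometric-programming mode is invoked with solve(gp=True)); the numeric budget limits are illustrative values chosen for this example, not part of the original problem statement.

import math
import cvxpy as cp

# Illustrative (assumed) budget limits for the cylinder example (3.6)/(3.7)
w_min, w_max = 1.0, 10.0
l_min, l_max = 1.0, 10.0
S_max, S_max1, S_max2 = 60.0, 30.0, 45.0

w = cp.Variable(pos=True)
l = cp.Variable(pos=True)
r = cp.Variable(pos=True)
h = cp.Variable(pos=True)

constraints = [
    w * l <= S_max,                                      # material area
    2 * math.pi * r * h <= S_max1,                       # side area
    2 * math.pi * r**2 + 2 * math.pi * r * h <= S_max2,  # total surface area
    w <= w_max, w >= w_min,
    l <= l_max, l >= l_min,
    h + 2 * r <= w_max,
    h + 2 * r <= l_max,
    2 * r <= l_max,
    2 * r <= w_max,
]
# Maximizing the monomial V = pi*r^2*h is equivalent to minimizing its inverse
prob = cp.Problem(cp.Maximize(math.pi * r**2 * h), constraints)
prob.solve(gp=True)
print(prob.value, r.value, h.value)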

3.4.2 Fractional Powers of Posynomials

It is already known that PFs are closed under positive integer powers. Consequently, if the functions $\phi_1(x)$, $\phi_2(x)$, and $\phi_3(x)$ are all posynomials, the constraint $\phi_1(x)^3 + \phi_2(x)^4 + \phi_3(x)^6 \le 1$ is a standard posynomial inequality; nevertheless, the constraint in (3.8) is not a posynomial inequality because it has fractional powers:

$$\phi_1(x)^{3.7} + \phi_2(x)^{4.3} + \phi_3(x)^{6.5} \le 1 \qquad (3.8)$$

However, it can be handled in a geometric program by introducing the new slack variables $t_1$, $t_2$, and $t_3$, together with the following inequality constraints [13]:


1(x)  t1  2 (x)  t2 (3.9)  3 (x)  t3 which are compatible with the geometric program. Then the non-posynomial inequality (3.8) is replaced with the inequality (3.10)

3.7 4.3 6.5 t1  t2  t3 1 (3.10) which is now a posynomial inequality.

One can see that the non-posynomial fractional-power inequality (3.8) is equivalent to the inequalities in (3.9) together with (3.10). First, if $x$ satisfies (3.8), then obviously $x$ together with $t_1 = \phi_1(x)$, $t_2 = \phi_2(x)$, and $t_3 = \phi_3(x)$ satisfies (3.9) and (3.10). Conversely, if the variables $x$, $t_1$, $t_2$, and $t_3$ satisfy (3.9) and (3.10), i.e., the variable $x$ meets the conditions

$$\phi_1(x) \le t_1, \quad \phi_2(x) \le t_2, \quad \phi_3(x) \le t_3, \quad t_1^{3.7} + t_2^{4.3} + t_3^{6.5} \le 1$$

then it follows that $\phi_1(x)^{3.7} + \phi_2(x)^{4.3} + \phi_3(x)^{6.5} \le 1$, and therefore $x$ satisfies (3.8). Here we use the critical fact that if $t_1$, $t_2$, and $t_3$ satisfy (3.10) and we reduce them (for example, setting them equal to $\phi_1(x)$, $\phi_2(x)$, and $\phi_3(x)$, respectively), then they still satisfy (3.10). This relies on the fact that the posynomial $t_1^{3.7} + t_2^{4.3} + t_3^{6.5}$ is an increasing function of $t_1$, $t_2$, and $t_3$.

More generally, we can see that this method can be used to handle any number of positive fractional powers occurring in an optimization problem. We can handle any problem which has the form of a GP, but in which the posynomials are replaced with positive


fractional powers of posynomials. We will see later that positive fractional powers of posynomials are special cases of generalized posynomials, and a problem with the form of a

GP, but withi fractional powers of posynomials, is a generalized GP [13].

The same procedure can also be used for other composite functions of posynomials. If the function $h$ is a posynomial of $m$ variables, with all its exponents positive or zero, and $\phi_1, \ldots, \phi_m$ are posynomials, then the composition inequality

$$h(\phi_1(x), \ldots, \phi_m(x)) \le 1 \qquad (3.11)$$

can be handled by replacing it with

$$h(t_1, \ldots, t_m) \le 1, \qquad \phi_1(x) \le t_1, \quad \ldots, \quad \phi_m(x) \le t_m \qquad (3.12)$$

where $t_1, \ldots, t_m$ are the adopted slack variables. This shows that products, as well as sums, of positive fractional powers of posynomials can be handled.
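As a quick illustration of the slack-variable device in (3.9)-(3.10), the sketch below encodes the fractional-power constraint (3.8) in CVXPY's GP mode. The posynomials $\phi_1$, $\phi_2$, $\phi_3$ and the objective are arbitrary stand-ins invented for this example.

import cvxpy as cp

x1 = cp.Variable(pos=True)
x2 = cp.Variable(pos=True)
t1, t2, t3 = (cp.Variable(pos=True) for _ in range(3))

# Stand-in posynomials phi_1, phi_2, phi_3 (hypothetical)
phi1 = x1 * x2 + 0.1 / x1
phi2 = x1**0.5 + x2
phi3 = 2 * x2 / x1

constraints = [
    phi1 <= t1, phi2 <= t2, phi3 <= t3,    # bounding constraints (3.9)
    t1**3.7 + t2**4.3 + t3**6.5 <= 1,      # posynomial surrogate (3.10)
]
prob = cp.Problem(cp.Minimize(1 / (x1 * x2)), constraints)
prob.solve(gp=True)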

3.4.3 Maximum of Posynomials

In Section 3.4.2, we showed how positive fractional powers of posynomials, while not posynomials themselves, can be handled in GP by introducing new slack variables and bounding constraints. Here we show how to apply the same idea to the maximum of several posynomials.

Suppose the functions $g_1$, $g_2$, $g_3$, and $g_4$ are posynomials. The inequality constraint

$$\max\{g_1(x) + g_2(x),\ g_3(x)\,g_4(x)\} \le 1 \qquad (3.13)$$


is generally not a posynomial inequality.

After adopting the new slack variable $t = \max\{g_1(x) + g_2(x),\ g_3(x)g_4(x)\}$, the constraint (3.13) in the geometric program can be reformatted with the following GP-compatible inequality constraints [17]:

$$g_1(x) + g_2(x) \le t, \qquad g_3(x)\,g_4(x) \le t, \qquad t \le 1$$

The same idea applies to a maximum of more than two posynomials, by simply adding extra bounding inequalities. As with positive fractional powers, the idea can be applied recursively and can be mixed with the method for handling positive fractional powers.
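The same slack-variable device can be written down directly. A minimal CVXPY sketch of (3.13), with hypothetical stand-in posynomials $g_1, \ldots, g_4$ and objective, might read:

import cvxpy as cp

x = cp.Variable(pos=True)
y = cp.Variable(pos=True)
t = cp.Variable(pos=True)

g1, g2, g3, g4 = x * y, 0.2 / x, x + y, 0.5 * y   # hypothetical posynomials
constraints = [
    g1 + g2 <= t,      # first branch of the max
    g3 * g4 <= t,      # second branch of the max
    t <= 1,            # the original right-hand side of (3.13)
]
prob = cp.Problem(cp.Minimize(1 / (x * y)), constraints)
prob.solve(gp=True)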

3.4.4 Function Composition

In this section, a few advanced transformation techniques are introduced to derive valid constraint formats for GP (or GGP). In some special cases, it is possible to reformulate certain composite functions into GP- (or GGP-) compatible formats. The function in (3.14), often encountered in engineering practice, is analyzed first as an example of this type of transformation:

g(y) 1/(1 y) (3.14)

Now consider the inequality constraint (3.15),

1/(1(x)) (x) 1 (3.15) where(x) and(x) are both generalized posynomials. Obviously, in (3.15), 1/(1(x)) is a composition function of the generalized posynomial function (x) with the often encountered function (3.14) . It also implies a constraint(x) 1.


To transform (3.15) into a GGP-compatible inequality constraint, one can introduce the slack variable $t = 1/(1-\psi(x))$. The constraint can then be represented with the following constraints [13]:

$$\psi(x) < 1, \qquad t\,\phi(x) \le 1, \qquad \frac{1}{1-\psi(x)} \le t$$

Now (3.15) is reformulated with the following equivalent GGP-compatible constraints:

$$t\,\phi(x) \le 1, \qquad \frac{1}{t} + \psi(x) \le 1$$

Second, the more general case (3.16) is given as another example of how to transform a composite function into a valid format for GP (or GGP):

$$\frac{\alpha(x)}{\beta(x) - \gamma(x)}\,\phi(x) \le 1 \qquad (3.16)$$

where $\alpha$, $\gamma$, and $\phi$ are generalized posynomial functions, and $\beta$ is a monomial function. Obviously, in (3.16), $\alpha(x)/(\beta(x) - \gamma(x))$ involves the composition of the generalized posynomial $\gamma(x)/\beta(x)$ with the frequently encountered function (3.14), $g(y) = 1/(1-y)$; it also contains the implicit constraint $\gamma(x) < \beta(x)$. Similarly, introducing a new slack variable $t = \alpha(x)/(\beta(x) - \gamma(x))$, the inequality constraint (3.16) can be expressed with the following constraints:

$$\gamma(x) < \beta(x), \qquad t\,\phi(x) \le 1, \qquad \frac{\alpha(x)}{\beta(x) - \gamma(x)} \le t$$

And last, (3.16) is reformulated with the following equivalent GGP-compatible constraints:


(x)/ (x) 1  t (x) 1   (x)/t (x)/ (x) 1

The idea that the composition of a generalized posynomial with $1/(1-y)$ can be handled in GP can be guessed from its Taylor series [13],

$$\frac{1}{1-y} = 1 + y + y^2 + \cdots$$

which is a limit of polynomials with positive coefficients. This analysis suggests that we can handle, at least approximately, the composition of a GPF with any function whose series expansion has no negative coefficients, by truncating the series. In some cases [such as $g(y) = 1/(1-y)$], the composition can be handled exactly.

Another example is the exponential function. Since the Taylor series of the exponential has all coefficients positive, we can guess that the exponential of a GPF can be handled as if it were a GPF. One good approximation is

$$e^{\phi(x)} \approx (1 + \phi(x)/m)^m \qquad (3.17)$$

where $m$ is large. If $\phi$ is a GPF, the right-hand side is a GPF. This approximation is good for small enough $\phi$; if $\phi$ is known to be near, say, the number $k$, we can use the approximation

$$e^{\phi(x)} = e^k e^{\phi(x)-k} \approx e^k (1 + (\phi(x) - k)/m)^m \qquad (3.18)$$

which is a generalized posynomial provided $m \ge k$.

It is also possible to handle exponentials of posynomials exactly, i.e., without approximation [13]. We replace a term of the form $e^{\phi(x)}$ with a new slack variable $t$, which


can be used anywhere a posynomial can be used, and we add the constraint $e^{\phi(x)} \le t$ to our problem. This results in a problem that would be a GP, except for the exponential constraint $e^{\phi(x)} \le t$.

To solve the problem, we first take the logarithm of the original variables $x$ and the new variable $t$, so that our variables become $y = \log x$ and $s = \log t$. Then we take the logarithm of both sides of each constraint. The PF objective, PF inequality, and MF equality constraints transform into a convex objective, convex inequality constraints, and linear equality constraints. The exponential constraint becomes $\log e^{\phi(e^y)} \le \log e^s$, i.e., $\phi(e^y) \le s$, which is a convex constraint on $y$ and $s$, since $\phi(e^y)$ is a convex function of $y$. Thus, the logarithmic transformation yields a convex problem which, though not quite the same as the one obtained from a standard GP, is still easily solved.

In summary, exponential terms can be handled exactly, but with the disadvantage of requiring software that can handle a wider class of problems than GP. For this reason, the most common approach is to use an approximation such as the one described above.
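A quick numerical check of the approximation (3.17) — a sketch, with arbitrarily chosen test values of $\phi$ — shows how the relative error shrinks as $m$ grows:

import numpy as np

phi = np.linspace(0.1, 3.0, 50)          # test values of phi(x)
for m in (10, 100, 1000):
    approx = (1 + phi / m)**m            # right-hand side of (3.17)
    rel_err = np.max(np.abs(approx - np.exp(phi)) / np.exp(phi))
    print(m, rel_err)                    # error decreases roughly as 1/m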

3.4.5 Additive Log Terms

In some cases, one encounters the logarithm of a generalized posynomial in an inequality constraint. How can we convert these types of terms to a GGP-compatible format? Consider the inequality constraint in (3.19),

$$\phi(x) + \log \psi(x) \le 1 \qquad (3.19)$$

where $\phi(x)$ and $\psi(x)$ are generalized posynomial functions. Recall that the logarithm term $\log \psi(x)$ may be negative; therefore, the way to transform such a constraint differs from


that used in the cases described previously. First, the logarithm term can be approximated with (3.20) [13],

$$\log w \approx k\,(w^{1/k} - 1) \qquad (3.20)$$

which is valid for large $k$, and thus the inequality constraint (3.19) is changed to the following inequality:

$$\phi(x) + k\,(\psi(x)^{1/k} - 1) \le 1$$

The above inequality can then be presented as the following generalized posynomial inequality:

$$\phi(x) + k\,\psi(x)^{1/k} \le 1 + k$$

Like exponentials, additive log terms can also be handled exactly by taking the logarithmic change of variables $y = \log x$. The inequality constraint (3.19) can then be expressed as

$$\phi(e^y) + \log \psi(e^y) \le 1 \qquad (3.21)$$

This is a convex constraint, and so can be handled directly.
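Again, a short numerical sketch (with arbitrary test values of $w$) confirms that the approximation (3.20) improves as $k$ grows:

import numpy as np

w = np.linspace(0.5, 100.0, 200)
for k in (10, 100, 1000):
    approx = k * (w**(1.0 / k) - 1)      # right-hand side of (3.20)
    err = np.max(np.abs(approx - np.log(w)))
    print(k, err)                        # absolute error shrinks with k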

3.4.6 Generalized Posynomial Equality Constraints

The equality constraints in a GGP (or GP) must be monomial. In some cases, however, it is possible to handle generalized posynomial equality constraints in a GGP [117].

First consider (3.22), a simple case with only one generalized posynomial equality constraint, $h(x) = 1$:


Minimize   $\phi(x)$
Subject to $f_i(x) \le 1, \quad i = 1, \ldots, p$
           $g_j(x) = 1, \quad j = 1, \ldots, q$          (3.22)
           $h(x) = 1$
           $x_k > 0, \quad k = 1, \ldots, n$

where $x_1, \ldots, x_n$ are the optimization design variables, the functions $\phi$, $f_1, \ldots, f_p$, and $h$ are generalized posynomials, and the $g_j$ are monomials. This is not a generalized geometric program, because the last equality constraint $h(x) = 1$ is not a monomial function but a generalized posynomial function. Usually, it is very difficult to solve.

First, relaxing the generalized posynomial equality constraint to the inequality constraint $h(x) \le 1$ results in (3.23),

Minimize   $\phi(x)$
Subject to $f_i(x) \le 1, \quad i = 1, \ldots, p$
           $g_j(x) = 1, \quad j = 1, \ldots, q$          (3.23)
           $h(x) \le 1$
           $x_k > 0, \quad k = 1, \ldots, n$

Let $x^*$ be an optimal solution of the relaxed problem (3.23). If $h(x^*) = 1$, then $x^*$ is also an optimal solution of the original problem (3.22), and we are finished. Of course, this need not happen; we can have $h(x^*) < 1$, in which case $x^*$ is not feasible for the original problem. In some cases, though, we can modify the point $x^*$ so that it remains optimal for the relaxation (3.23) but also satisfies the generalized posynomial equality constraint [and is therefore optimal for the original problem (3.22)].

Suppose we can find a variable $x_k$ with the following properties [117]:

• The variable $x_k$ does not appear in any monomial equality constraint function $g_j(x)$.


• The objective and inequality constraint functions $\phi, f_1, \ldots, f_p$ are all monotone decreasing in $x_k$, i.e., if we increase $x_k$ (holding all other variables constant), the functions $\phi, f_1, \ldots, f_p$ decrease or remain constant.

• The generalized posynomial function $h$ is monotone strictly increasing in $x_k$, i.e., if we increase $x_k$ (holding all other variables constant), the function $h$ increases.

Now suppose we start with the point $x^*$ and increase $x_k$, i.e., we consider the point

$$\tilde{x} = (x_1^*, \ldots, x_{k-1}^*, x_k^* + u, x_{k+1}^*, \ldots, x_n^*)$$

where $u$ is a scalar that we increase from $u = 0$. By the first property, the monomial equality constraints are unaffected, so the point $\tilde{x}$ satisfies them for any value of $u$. By the second property, the point $\tilde{x}$ continues to satisfy the inequality constraints, since increasing $u$ decreases (or keeps constant) the functions $f_i$. The same argument tells us that the point $\tilde{x}$ has an objective value that is the same as, or better than, that of the point $x^*$. As we increase $u$, $h(\tilde{x})$ increases. Now we simply increase $u$ until we have $h(\tilde{x}) = 1$ [$h$ can be increased as much as we like, as a consequence of the convexity of $\log h(e^y)$, where $x = e^y$]. The resulting point $\tilde{x}$ is an optimal solution of the problem (3.22).

Increasing $x_k$ until the generalized posynomial equality constraint is satisfied is called tightening. The same method can be applied when the monotonicity properties are reversed, i.e., if $\phi, f_1, \ldots, f_p$ are monotone increasing functions of $x_k$, and $h$ is strictly monotone decreasing in $x_k$. In this case, we tighten by decreasing $x_k$ until the generalized posynomial constraint is satisfied.


Starting with a problem wherein the objective and constraint functions are given as expressions involving variables, powers, sums, and maxima, it is easy to check the monotonicity properties of the functions and to determine whether a variable $x_k$ with the required monotonicity properties exists. The same idea can be used to solve a problem with multiple generalized posynomial equality constraints [117]:

Minimize   $\phi(x)$
Subject to $f_i(x) \le 1, \quad i = 1, \ldots, p$
           $g_j(x) = 1, \quad j = 1, \ldots, q$          (3.24)
           $h_m(x) = 1, \quad m = 1, \ldots, r$
           $x_k > 0, \quad k = 1, \ldots, n$

where the $f_i(x)$ and $h_m(x)$ are generalized posynomials, and the $g_j(x)$ are monomials. We form and solve the GGP relaxation

Minimize   $\phi(x)$
Subject to $f_i(x) \le 1, \quad i = 1, \ldots, p$
           $g_j(x) = 1, \quad j = 1, \ldots, q$          (3.25)
           $h_m(x) \le 1, \quad m = 1, \ldots, r$
           $x_k > 0, \quad k = 1, \ldots, n$

and let $x^*$ denote an optimal solution. In this case, we need a different variable for each generalized posynomial equality constraint, with the monotonicity properties given above.

We reorder the variables so that $x_1$ is the variable we increase to cause $h_1(x)$ to increase to one, $x_2$ is the variable we increase to cause $h_2(x)$ to increase to one, and so on. The simple method of increasing or decreasing one variable at a time until its equality constraint is satisfied cannot be used in this case, since increasing $x_1$ to make $h_1(x)$ increase to one can decrease $h_2(x)$ (and vice versa). However, we can find a common adjustment of the


variables $x_1, \ldots, x_r$ that results in all of the equality constraints being tightened simultaneously, by forming and solving an auxiliary GGP:

Maximize   $x_1 x_2 \cdots x_r$
Subject to $f_i(x) \le 1, \quad i = 1, \ldots, p$
           $g_j(x) = 1, \quad j = 1, \ldots, q$
           $h_m(x) \le 1, \quad m = 1, \ldots, r$        (3.26)
           $\phi(x) \le \phi^*$
           $x_k > 0, \quad k = 1, \ldots, n$

where $\phi^*$ is the optimal value of the relaxed problem (3.25). The feasible set of this problem is the optimal set of the relaxed problem (3.25). The objective of the auxiliary problem is to maximize the product of the variables used to tighten the equality constraints. Any optimal solution of this auxiliary problem (3.26) is an optimal solution of the original problem (3.24).

We can use any objective that puts pressure on the tightening variables to increase; for example, we can minimize $1/x_1 + \cdots + 1/x_r$.
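The relaxation-plus-tightening procedure is easy to reproduce on a toy problem. In the sketch below (a hypothetical problem built purely for illustration, using CVXPY's GP mode), the objective $1/x_2$ is nonincreasing in $x_1$ and $h(x) = x_1 + x_2$ is strictly increasing in $x_1$, so $x_1$ serves as the tightening variable:

import cvxpy as cp

x1 = cp.Variable(pos=True)
x2 = cp.Variable(pos=True)
h = x1 + x2                                  # posynomial equality h(x) = 1

# Relaxed problem (3.23): the equality is loosened to h(x) <= 1
relax = cp.Problem(cp.Minimize(1 / x2), [h <= 1, x2 <= 0.5])
relax.solve(gp=True)

# Auxiliary problem in the spirit of (3.26): maximize the tightening
# variable over the optimal set of the relaxation; drives h(x) up to 1
aux = cp.Problem(cp.Maximize(x1),
                 [h <= 1, x2 <= 0.5, 1 / x2 <= relax.value])
aux.solve(gp=True)
print(h.value)                               # ~1.0: the equality is recovered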

3.4.7 Application Example: Geometric Programs in Convex Form

A geometric program (3.1) can be transformed into a convex optimization problem, and then solved highly efficiently by the recently developed interior-point methods [116].

A new variable $y = \log x$ is introduced here (i.e., $x = e^y$), and we define $F(y) = f(e^y)$.

When $f(x)$ is a monomial function $f(x) = c\,x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$, taking the logarithm of both sides leads to the following equation:

$$\log f(x) = \log c + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_n \log x_n$$

Substituting $x_i = e^{y_i}$ gives


$$\log f(e^y) = \log c + a_1 y_1 + a_2 y_2 + \cdots + a_n y_n = a^T y + b$$

Therefore, we have

$$F(y) = e^{a^T y + b} \qquad (3.27)$$

where $b = \log c$, $a = [a_1, \ldots, a_n]^T$, and $y = [y_1, \ldots, y_n]^T = [\log x_1, \ldots, \log x_n]^T$. Thus, with the logarithmic change of variables $y_i = \log x_i$, the monomial function $f(x) = c\,x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$ becomes the exponential of an affine function, $F(y) = e^{a^T y + b}$.

m 1i  2i  ni Similarly, if function f is a posynomial function, i.e., f x  ci x1 x2 ...xn , then the i1 associated function of variable y is given by

m T F(y)  eai ybi (3.28) i1

T where ai  a1i ,  ,aniand bi  logci . After taking the logarithmic transformation of design

m T variables, a PF becomes a sum of exponentials of affine functions F(y)  eai ybi . Now, i1 the geometric program (3.1) can be expressed in terms of the new variable y as

K 0 T Minimize ea0k yb0k k1 K i T Subject to eaik ybik  1, i  1,  ,m (3.29) k1 T ea j ybj  1, j  1,  , p


where $a_{ik} \in \mathbb{R}^n$, $i = 0, \ldots, m$, contain the exponents of the posynomials in the original geometric program. Now, by taking the logarithm of the objective, the inequality constraints, and the equality constraints, (3.29) is changed to (3.30):

Minimize   $\tilde{f}_0(y) = \log \sum_{k=1}^{K_0} e^{a_{0k}^T y + b_{0k}}$
Subject to $\tilde{f}_i(y) = \log \sum_{k=1}^{K_i} e^{a_{ik}^T y + b_{ik}} \le 0, \quad i = 1, \ldots, m$          (3.30)
           $\tilde{g}_j(y) = a_j^T y + b_j = 0, \quad j = 1, \ldots, p$

where the functions $\tilde{f}_i$ are convex and the $\tilde{g}_j$ are affine. This problem is thus a convex optimization problem [109]. Any GP, then, can be solved globally and efficiently by the recently developed interior-point methods.
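The convex form (3.30) can be handed to any general-purpose smooth solver. The sketch below builds the log-sum-exp functions for a small two-variable GP invented for illustration (objective $x_1x_2 + x_1/x_2$, constraint $0.5/x_1 + 0.5/x_2 \le 1$) and solves it with SciPy's SLSQP method:

import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

# Exponent matrices A and offsets b = log(c) for each posynomial, per (3.28)
A0 = np.array([[1.0, 1.0], [1.0, -1.0]]); b0 = np.log([1.0, 1.0])
A1 = np.array([[-1.0, 0.0], [0.0, -1.0]]); b1 = np.log([0.5, 0.5])

f0 = lambda y: logsumexp(A0 @ y + b0)        # convex objective, per (3.30)
f1 = lambda y: logsumexp(A1 @ y + b1)        # convex inequality, f1(y) <= 0

res = minimize(f0, np.zeros(2), method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda y: -f1(y)}])
x_opt = np.exp(res.x)                        # back to the original variables
print(x_opt, np.exp(res.fun))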

Another point is that the log-sum-exp function is a smooth, differentiable approximation of the max function, since

$$\max\{x_1, \ldots, x_n\} \le \log(e^{x_1} + \cdots + e^{x_n}) \le \max\{x_1, \ldots, x_n\} + \log n \qquad (3.31)$$

is valid for all $x$.

Therefore, $\tilde{f}_i(y) = \log(e^{a_1^T y + b_1} + e^{a_2^T y + b_2} + \cdots + e^{a_K^T y + b_K})$ satisfies

$$\max\{a_1^T y + b_1, \ldots, a_K^T y + b_K\} \le \tilde{f}_i(y) \le \max\{a_1^T y + b_1, \ldots, a_K^T y + b_K\} + \log K \qquad (3.32)$$

One can see that both the lower and the upper bound in (3.32) are convex functions. First, by the definition in (2.9), if $f_1, \ldots, f_m$ are convex, then their pointwise maximum $f(x) = \max\{f_1(x), \ldots, f_m(x)\}$ is also a convex function (for more details, see Appendix A). And the function

$$f(x) = \max\{a_1^T x + b_1, \ldots, a_K^T x + b_K\} \qquad (3.33)$$


defines a piecewise-linear function with $K$ segments. It is convex, since it is the pointwise maximum of affine functions, and affine functions are convex. The lower bound in (3.32) is exactly a piecewise-linear function, and the upper bound is the sum of a piecewise-linear function and a constant; as a result, both bounds are convex functions. Consequently, any posynomial function, after the logarithmic transformation, is a smooth, differentiable approximation of a piecewise-linear function. Second, a PWL function can be used to fit a posynomial function, and indeed any convex function [17]. This second observation is used in Chapter 4 as the fundamental basis of the device modeling method.

3.5 Optimization Sensitivity Analysis

3.5.1 Tradeoff Analysis

In many practical problems, the constraints are not fixed but variable; this can happen for many reasons, such as a compromise in current techniques or an unexpected change in conditions, budget, and so on. To achieve a better overall performance, a so-called trade-off analysis may be used to observe and predict the dynamic impact of varying the constraint limits on the optimal value of the problem [116]. First, we perturb the number 1 on the right-hand side of each constraint in the standard GP with parameters $u_i$ and $v_j$ to form a perturbed GP:

Minimize   $\phi(x)$
Subject to $f_i(x) \le u_i, \quad i = 1, \ldots, m$          (3.34)
           $g_j(x) = v_j, \quad j = 1, \ldots, p$
           $x_k > 0, \quad k = 1, \ldots, n$


where $u_i$ and $v_j$ are positive constants. When $u_i = 1$ and $v_j = 1$, it is the original canonical geometric program. We let $\phi^*(u,v)$ denote the optimal value of the perturbed geometric program as a function of the vectors $u = (u_1, \ldots, u_m)$ and $v = (v_1, \ldots, v_p)$. Thus, the value $\phi^*(\mathbf{1}, \mathbf{1})$ (where $\mathbf{1}$ denotes a vector of ones) is equal to the optimal value of the original GP.

In the perturbed geometric program, if $u_i > 1$, then the $i$th inequality constraint of the canonical geometric program, $f_i(x) \le 1$, is said to be loosened (relaxed) to the corresponding inequality constraint

$$f_i(x) \le u_i \qquad (3.35)$$

If the parameter $u_i$ satisfies $u_i < 1$, the inequality constraint is said to be tightened compared to the inequality in the canonical geometric program. There is a similar interpretation of $v_j$ for the equality constraints.

What does $\phi^*(u,v)$ mean? It gives the optimal value of the problem after we perturb the constraints and then optimize again. When $u$ and $v$ change, so does (in general) the associated optimal point.

In optimal trade-off analysis, we examine the function $\phi^*(u,v)$ for certain values of $u$ and $v$ [116]. For example, to see the optimal trade-off between the $i$th inequality constraint and the objective, we can plot $\phi^*(u,v)$ versus $u_i$, with all other $u_j$ and all $v_j$ equal to one. The resulting curve, called the optimal trade-off curve, passes through the point given by the optimal value of the original GP when $u_i = 1$. As $u_i$ increases above one, the curve


must decrease (or stay constant), since by relaxing the $i$th constraint we can only improve the optimal objective. The optimal trade-off curve flattens out when $u_i$ is made large enough that the constraint is no longer relevant. When $u_i$ is decreased below one, the optimal value increases (or stays constant). If $u_i$ is decreased enough, the perturbed problem can become infeasible. Thus, the optimal value $\phi^*(u,v)$ is a nonincreasing function of the parameter $u$.

When multiple constraints are varied, we obtain an optimal trade-off surface. For example, the graph of the optimal value $\phi^*(u_2, u_5)$ is a surface, viewed as a two-dimensional function of the perturbations of the 2nd and 5th inequality constraints. One common approach is to plot a trade-off surface with two parameters as a set of trade-off curves for several values of the second parameter.
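A trade-off curve can be generated simply by re-solving the perturbed GP over a grid of $u$ values. The sketch below does this for a small problem invented for illustration, in CVXPY's GP mode:

import numpy as np
import cvxpy as cp

x = cp.Variable(pos=True)
y = cp.Variable(pos=True)

for u_i in np.linspace(0.7, 1.5, 9):         # perturb one inequality limit
    prob = cp.Problem(cp.Minimize(1 / (x * y)),
                      [x + y <= u_i, x <= 0.8])
    prob.solve(gp=True)
    print(u_i, prob.value)                   # optimal value is nonincreasing in u_i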

3.5.2 Optimization Sensitivity Theory

It is also very important for the designer to know how sensitive the optimal objective value is to small variations around the unit constraints (i.e., $u = \mathbf{1}$, $v = \mathbf{1}$) of a geometric program. Such information indicates which constraints deserve more attention when trying to improve the optimal objective value, i.e., the overall performance [30]. This is what one does in sensitivity analysis. Mathematically, the optimal sensitivities can be defined as

$$S_{inequality,i} = \frac{\partial \log \phi^*(u,v)}{\partial \log u_i}, \qquad S_{equality,j} = \frac{\partial \log \phi^*(u,v)}{\partial \log v_j} \qquad (3.36)$$


Here it is assumed that the original geometric program is feasible, i.e., $\phi^*(u,v)$ is finite; that the perturbed problem remains feasible for small changes of $u_i$ and $v_j$ near the unit constraints; and that the optimal objective value is differentiable as a function of $u_i$ and $v_j$ at $u = \mathbf{1}$, $v = \mathbf{1}$, although this need not be the case. For more details about sensitivity analysis, see [17]. We will be interested in the case when the constants $u_i$ and $v_j$ are not too far from 1.

Sensitivity analysis is closely related to trade-off analysis. In sensitivity analysis, we study the variation of $\phi^*(u,v)$ as a function of $u$ and $v$ for values of $u_i$ and $v_j$ near the unit constraints. By definition, the optimal sensitivity is the slope of the trade-off curve passing through the point $(u,v) = (\mathbf{1}, \mathbf{1})$.

Some simple but useful facts can be observed. Relaxing the $i$th inequality constraint (i.e., increasing $u_i$) always improves, i.e., lowers, the optimal objective value. In other words, the optimal value decreases when we increase $u_i$; therefore, we always have $S_{inequality,i} \le 0$. If the inequality constraint is not tight at the optimal solution $x^*$, i.e., $f_i(x^*)$ is strictly less than one, then we have $S_{inequality,i} = 0$, which means that a small change in the right-hand side of the inequality constraint (loosening or tightening) has no effect on the optimal value of the problem, since we can slightly tighten or loosen the $i$th constraint with no effect.

Similarly, the sign of $S_{equality,j}$ tells us the monotonicity of the optimal objective value with respect to the equality constraint. Specifically, when $S_{equality,j} > 0$, the optimal


objective value is an increasing function of $v_j$; conversely, when $S_{equality,j} < 0$, the optimal objective value is a decreasing function of $v_j$; and when $S_{equality,j} = 0$, the optimal objective value stays constant as $v_j$ varies, which means the $j$th equality constraint has no effect on the optimal objective value. The magnitude $|S_{equality,j}|$ tells us how sensitive the optimal value is to the right-hand side of the equality constraint.

The sensitivity information is very useful in practical design and gives the designer deep insight [116]. As an example, suppose we have $S_{inequality,1} = -0.6$ and $S_{inequality,2} = -7.8$. This means that if we relax the first constraint by (for example) 1%, we would expect the optimal objective value to decrease by about 0.6%; if we tighten the first inequality constraint by 1%, we would expect the optimal objective value to increase by about 0.6%. On the other hand, if we relax the second inequality constraint by 1%, we would expect the optimal objective value to decrease by the much larger amount of 7.8%; if we tighten the second constraint by 1%, the optimal objective value increases by the much larger amount of 7.8%.

Roughly speaking, we can say that while both constraints are tight, the second constraint is much more tightly binding than the first.
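When a solver does not report dual variables, the logarithmic sensitivities in (3.36) can be estimated by finite differences, re-solving the GP with a slightly perturbed limit. A sketch, for a toy problem invented here:

import numpy as np
import cvxpy as cp

x = cp.Variable(pos=True)
y = cp.Variable(pos=True)

def opt_value(u1):
    prob = cp.Problem(cp.Minimize(1 / (x * y)),
                      [x + y <= u1, x * y**0.5 <= 1.0])
    prob.solve(gp=True)
    return prob.value

base, pert = opt_value(1.0), opt_value(1.01)
S1 = (np.log(pert) - np.log(base)) / np.log(1.01)
print(S1)    # negative: relaxing the first constraint lowers the optimum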

Optimal sensitivities can also be very helpful in practice [116]. If a constraint is tight at the optimum but has a small sensitivity, then small changes in the constraint won’t significantly affect the optimal value of the problem. On the other hand, a constraint that is tight and has a large sensitivity is one that (for small changes) will greatly change the optimal value: If it is loosened (even just a small amount), the objective is likely to decrease considerably; if it is tightened, even just a little bit, the optimal objective value will increase


considerably. Roughly speaking, a constraint with a large sensitivity can be considered more strongly binding than one with a small sensitivity.

The optimal sensitivities are also useful when a problem is infeasible [116]. Assuming we find a point that minimizes some measure of infeasibility, the sensitivities associated with the constraints can be very informative. Each one gives the (approximate) relative change in the optimal infeasibility measure, given a relative change in the constraint. The constraints with large sensitivities are likely candidates to loosen (for inequality constraints) or to modify (for equality constraints), to make the problem feasible.

One very important fact is that when we solve a GP, we get the sensitivities of all constraints at no extra cost. This is because modern methods for solving GPs (such as interior-point methods) solve both the primal (i.e., original) problem and its dual (which is related to the sensitivities) simultaneously.

3.6 Convex Approximation and Fitting

The GP has gained increasing popularity in various fields due to its unmatched advantages over many current IC optimization technologies [7, 8, 9, 10]. First, using the developed interior-point methods [28, 113, 114], it can solve, highly efficiently, optimization problems with large numbers of design variables and constraints. Also, it always finds the globally optimal solution, regardless of the starting point. However, besides being unable to provide much insight into the failure of some specifications or into how to improve the circuit topology, another limitation is that it cannot handle non-convex constraints, because of its special structure: PF objective, PF inequalities, and MF equalities. Exactly for this reason, some


convex fitting technologies (such as the monomial fitting technique, the posynomial fitting technique, and the convex PWL fitting) are often needed to derive a GP-compatible device model if one selects the geometric program for solving the problem at hand.

This section addresses some convex approximation and fitting technologies used to convert non-convex constraints so as to derive GP-compatible approximate expressions.

3.6.1 Convex Approximation and Fitting Theory

To use GP to solve an optimization design, we need to formulate the optimization problem at hand in a GP format. However, this requires a special structure: PF objective, PF inequalities, and MF equalities [30]. Therefore, we need to know which functions (or design data sets) can be expressed, exactly or approximately, by a GP-compatible function — such as an MF or a convex PWL function — and how to fit a function (or design data set) to a GP-compatible function. In principle, the following lemmas answer these questions [29]. We are given a positive function $f(x)$ of positive variables $x_1, \ldots, x_n$.

• Lemma 1: The function $f(x)$ can be approximated by an MF if and only if its logarithmically transformed function $F(y) = \log f(e^y)$ can be approximated by an affine function, i.e., a constant plus a linear function.

• Lemma 2: The function $f(x)$ can be approximated by a GPF if and only if its logarithmically transformed function $F(y)$ can be approximated by a convex function.

Proof of Lemma 1:

(1) If $f(x)$ is a monomial of the form $f(x) = c\,x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$, then $F(y)$ can be approximated by an affine function, i.e., a constant plus a linear function [29].


Let y  logex , i.e., x  ey . Taking the logarithmic transformation on both sides leads to the following equation:

log f (x)  logc  a1 log x1  a2 log x2    an log xn

Therefore, it is obtained that

y F(y)  log f (e )  logc  a1y1  a2 y2    an yn where the function F(y) is obviously an affine function of the

T T variables y  [y1,  , yn ]  [log x1,  ,log xn ] ; therefore, when f (x) is a monomial, can be approximated by an affine function, i.e., a constant plus a linear function.

(2) If the function $F(y)$ can be approximated by an affine function, i.e., a constant plus a linear function, then the function $f(x)$ can be approximated by a monomial function [29].

Assume that $F(y)$ is expressed as $F(y) = \log c + a_1 y_1 + a_2 y_2 + \cdots + a_n y_n$. Taking the natural exponential of both sides results in the following equation:

$$e^{F(y)} = e^{\log c + a_1 y_1 + a_2 y_2 + \cdots + a_n y_n}$$

Using $y_i = \log x_i$ and the properties of logarithms, this becomes

$$e^{F(y)} = e^{\log c + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_n \log x_n}$$

Substituting $F(y) = \log f(x) = \log f(e^y)$ leads to the following expression:

$$e^{\log f(e^y)} = e^{\log(c\,x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n})}$$

Therefore, the arguments of the exponentials on both sides must be equal:

$$f(e^y) = c\,x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$$

That is,


$$f(x) = c\,x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$$

where $f(x)$ is obviously a monomial function; therefore, when $F(y)$ can be approximated by an affine function, i.e., a constant plus a linear function, $f(x)$ can be approximated by a monomial function [29].

Proof of Lemma 2:

(1) If $F(y)$ can be approximated by a convex function, then $f(x)$ can be approximated by a generalized posynomial function [29].

Suppose the function $F(y)$ can be approximated by a convex function $\Phi$, that is, $F \approx \Phi$. Because (as analyzed in Section 3.4.7) any convex function can be approximated arbitrarily well by a convex PWL function expressed as a maximum of a set of affine functions, one can expect the following result:

$$F(y) \approx \Phi(y) \approx \max_{i=1,\ldots,m} (b_{0i} + b_{1i} y_1 + \cdots + b_{ni} y_n)$$

Taking the exponential of both sides, the above approximation becomes

$$f(x) \approx \max_{i=1,\ldots,m} \left(e^{b_{0i} + b_{1i} y_1 + \cdots + b_{ni} y_n}\right) = \max_{i=1,\ldots,m} \left(e^{b_{0i}} e^{b_{1i} \log x_1} \cdots e^{b_{ni} \log x_n}\right) = \max_{i=1,\ldots,m} \left(e^{b_{0i}} x_1^{b_{1i}} \cdots x_n^{b_{ni}}\right)$$

The right-hand side is the maximum of m monomial functions, and therefore is a GPF. Such a function is sometimes called a max-monomial.

As an example, let us reconsider the three functions of one variable, $x^4 + 1$, $\operatorname{atan}(x)$, and $1 - x^3$, discussed in Section 2.4.2, and ask whether each can be approximated


by an MF or a GPF over the range $0.1 \le x \le 1$. Figure 2.3 shows that the first function can be approximated by a GPF, since its graph on a log-log scale is convex (as indicated by the upward curvature). The second function can be approximated by an MF, since its log-log graph is nearly affine (i.e., a straight line). The third function cannot be approximated well by a GPF, since its log-log graph is non-convex (downward curvature); however, its reciprocal has upward curvature, and therefore can be fitted by a GPF.

The convexity of the logarithmically transformed function $F(y)$ can be tested in terms of the original function $f$ in the following form:

$$f(x_1^{\theta}\,\tilde{x}_1^{1-\theta}, \ldots, x_n^{\theta}\,\tilde{x}_n^{1-\theta}) \le f(x_1, \ldots, x_n)^{\theta}\, f(\tilde{x}_1, \ldots, \tilde{x}_n)^{1-\theta} \qquad (3.37)$$

for any $\theta$ with $0 \le \theta \le 1$. In other words, when $f$ is evaluated at a weighted geometric mean of two points, it cannot be more than the weighted geometric mean of the function $f$ evaluated at the two points.

Now consider when the following optimization problem with positive variables $x_1, \ldots, x_n$,

Minimize   $\phi(x)$
Subject to $f_i(x) \le u_i, \quad i = 1, \ldots, m$
           $g_j(x) = v_j, \quad j = 1, \ldots, p$

can be approximated by a GGP.

The above discussion shows that the answer is as follows: the transformed objective $\Phi(y) = \log \phi(e^y)$ and the transformed inequality constraint functions $F_i(y) = \log f_i(e^y)$ must be nearly convex, and the transformed equality constraint functions $G_j(y) = \log g_j(e^y)$ must be nearly affine [13].


3.6.2 Monomial Fitting

Suppose we are given data points $(x^{(i)}, f^{(i)})$, $i = 1, \ldots, N$, where the $x^{(i)} \in \mathbb{R}^n$ are positive vectors and the $f^{(i)}$ are positive constants. The goal is to fit the data with a monomial $f(x) = c\,x_1^{a_1} \cdots x_n^{a_n}$ (for more details, see [30]). In other words, we want to find $c > 0$ and $a_1, \ldots, a_n$ such that

$$f(x^{(i)}) \approx f^{(i)}, \quad i = 1, \ldots, N \qquad (3.38)$$

This monomial fitting task can be carried out in the following way. First, set $y^{(i)} = \log x^{(i)}$, so that $x^{(i)} = e^{y^{(i)}}$. Since

$$\log f(x) = \log c + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_n \log x_n$$

and $y_i = \log x_i$, we have

$$\log f(x) = \log c + a_1 y_1 + a_2 y_2 + \cdots + a_n y_n$$

In particular, at the data points,

$$\log f(x^{(i)}) = \log c + a_1 y_1^{(i)} + a_2 y_2^{(i)} + \cdots + a_n y_n^{(i)}$$

Since $f(x^{(i)}) \approx f^{(i)}$,

$$\log c + a_1 y_1^{(i)} + \cdots + a_n y_n^{(i)} \approx \log f^{(i)}, \quad i = 1, \ldots, N$$

This is a set of linear (approximate) equalities in the unknowns $\log c$ and $a_1, \ldots, a_n$.

To find an appropriate value for $Y$, we can minimize the $p$-norm of the error,

$$\text{Minimize} \quad \|AY - b\|_p \qquad (3.39)$$

where the involved parameters are given by the following matrices:


$$A = \begin{bmatrix} 1 & y_1^{(1)} & y_2^{(1)} & \cdots & y_n^{(1)} \\ 1 & y_1^{(2)} & y_2^{(2)} & \cdots & y_n^{(2)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & y_1^{(N)} & y_2^{(N)} & \cdots & y_n^{(N)} \end{bmatrix} = \begin{bmatrix} 1 & \log x_1^{(1)} & \log x_2^{(1)} & \cdots & \log x_n^{(1)} \\ 1 & \log x_1^{(2)} & \log x_2^{(2)} & \cdots & \log x_n^{(2)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \log x_1^{(N)} & \log x_2^{(N)} & \cdots & \log x_n^{(N)} \end{bmatrix}$$

$$Y = \begin{bmatrix} \log c & a_1 & a_2 & \cdots & a_n \end{bmatrix}^T$$

and

$$b = \begin{bmatrix} \log f^{(1)} & \log f^{(2)} & \cdots & \log f^{(N)} \end{bmatrix}^T$$

Minimizing the $p$-norm of the error is a convex problem that can be readily solved [17]. The error distribution depends on the selected norm: a small $p$ puts more weight on the small errors, while a large $p$ puts more weight on the large errors. As an illustration, the details of minimizing the 2-norm are described here; some other norms are discussed in Appendix B.

An approximate solution of the above fitting problem can be found via least squares, by choosing $\log c$ and $a_1, \ldots, a_n$ to minimize the sum of the squared errors [30],

$$\sum_{i=1}^{N} \left(\log c + a_1 y_1^{(i)} + \cdots + a_n y_n^{(i)} - \log f^{(i)}\right)^2$$

with the constraint $c > 0$. In other words, we use simple linear regression to find $\log c$ and $a_1, \ldots, a_n$, given the data $y^{(1)}, \ldots, y^{(N)}$ and $f^{(1)}, \ldots, f^{(N)}$. This problem can also be expressed in matrix form, as follows:

Minimize   $\|AY - b\|_2^2$
Subject to $c > 0$          (3.40)


Many mature solvers can deal with this simple least-squares problem.

To illustrate these ideas, consider the non-convex function $f(x) = \operatorname{atan}(x)$, which is used later in the two-stage op-amp optimization design, over the interval $[0.02, 1.4]$. First, 70 data points $(x^{(i)}, f(x^{(i)}))$ are generated with the $x^{(i)}$ uniformly spaced over the given interval. The monomial fitting function obtained by the least-squares method is

$$f_{ls}(x) = 0.7827\,x^{0.9089}$$

The function $\operatorname{atan}(x)$ and its monomial fit $f_{ls}(x)$ are depicted in Figure 3.1. As expected, the least-squares approximation gives a good fit across the interval, with a mean fitting error of 6.6869% and a maximum fitting error of 11.8008%.


Figure 3.1: Function $f(x) = \operatorname{atan}(x)$ and its least-squares monomial fit
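The least-squares monomial fit above is easy to reproduce. The following numpy sketch implements the log-log linear regression of this section on the same 70-point atan data set; it should return a coefficient and exponent close to $f_{ls}(x) = 0.7827x^{0.9089}$:

import numpy as np

x = np.linspace(0.02, 1.4, 70)                   # 70 uniformly spaced points
f = np.arctan(x)

# Linear least squares in log space: log f ~ log c + a*log x, per (3.39)/(3.40)
A = np.column_stack([np.ones_like(x), np.log(x)])
(logc, a), *_ = np.linalg.lstsq(A, np.log(f), rcond=None)

fit = np.exp(logc) * x**a
rel_err = np.abs(fit - f) / f
print(np.exp(logc), a, rel_err.mean(), rel_err.max())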

3.6.3 Extended Monomial Fitting

One useful extension of monomial fitting is to shift by a constant offset, i.e., to fit the data set $(x^{(i)}, f^{(i)})$ with a function of the form

$$f(x) = c\,x_1^{a_1} \cdots x_n^{a_n} + k \qquad (3.41)$$


where $k$ is a positive constant, so that $f$ is the sum of an MF and a positive constant. Such a fit can be obtained by exploiting the fitting technique described in the previous section to fit the data set $(x^{(i)}, f^{(i)} - k)$ for various values of $k$, and selecting the value with the smallest fitting error.
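A direct implementation of this offset search is a one-dimensional sweep: for each candidate $k$, shift the data, redo the log-space least-squares fit, and keep the best result. The sketch below uses synthetic data with a known offset, invented for this example:

import numpy as np

x = np.linspace(0.02, 1.4, 70)
f = np.arctan(x) + 0.2                           # synthetic data with known offset
A = np.column_stack([np.ones_like(x), np.log(x)])

best = None
for k in np.linspace(0.0, 0.21, 43):             # candidate offsets
    g = f - k
    if np.any(g <= 0):                           # shifted data must stay positive
        continue
    (logc, a), *_ = np.linalg.lstsq(A, np.log(g), rcond=None)
    fit = np.exp(logc) * x**a + k
    err = np.max(np.abs(fit - f) / f)
    if best is None or err < best[0]:
        best = (err, k, np.exp(logc), a)
print(best)                                      # the best k should be near 0.2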

3.6.4 Max-monomial Fitting

Now the max-monomial function in (3.42) is used to fit the data points $(x^{(i)}, f^{(i)})$, $i = 1, \ldots, N$:

$$f(x) = \max_{k=1,\ldots,K} f_k(x) \qquad (3.42)$$

where $x = (x_1, \ldots, x_n)$ and $f_1, \ldots, f_K$ are monomials. When the number $K$ is not limited, this problem can be solved exactly after the logarithmic transformation, where it becomes the problem of finding the best convex piecewise-linear fit of the given data points; this can be cast as a large quadratic program and solved [28]. This approach, however, often results in a very large number of terms and is therefore impractical, although in the absence of a limit on the number of terms it is helpful to know the best max-monomial fit to the data points.

If the number of terms $K$ (or an upper bound on it) is known and not too large, the task becomes finding the best $K$-term max-monomial fit to the given data points. In the case of $K = 1$, this fitting problem is simply the monomial fitting problem described previously.

There is a practical method for solving the $K$-term ($K \ge 2$) max-monomial fitting problem based on monomial fitting and data-point clustering [118], in the spirit of the K-means clustering algorithm [119], a well-known method for clustering or partitioning a set of points into $K$ subsets so as to minimize a sum-of-squares criterion. The general max-monomial data fitting method is described as follows [13].


given an initial monomial function set $F_0 = \{f_1(x), \ldots, f_K(x)\}$
repeat
    for $k = 1, \ldots, K$
        find the data point set $S_k = \{x^{(i)} \mid f_k(x^{(i)}) = f(x^{(i)}),\ i = 1, \ldots, N\}$
        (i.e., the data points at which $f_k$ is the largest of the monomials)
        update $f_k$ in $F_0 = \{f_1(x), \ldots, f_K(x)\}$ with a monomial fitted to the point set $S_k$
    update the function set $F_k$ with the function set $F_0$ (i.e., let $F_k = F_0$)
until no further improvement occurs (i.e., $F_k = F_{k-1}$)

This algorithm first clusters the data points into groups, for which the different monomials give the maximum value. Then, for each cluster of data points, the associated monomial is updated with a monomial fit to the data points. This is repeated until convergence. The algorithm is not guaranteed to converge, but with real data it almost always does converge.

The final max-monomial approximation can depend on the initial max-monomial, so the algorithm can be run from several initial max-monomials, and the best final fit can be taken.

During the algorithm, it can happen that no points are assigned to one of the monomials. In this case, we can simply eliminate this monomial and proceed with $K-1$ terms. For this reason, the final max-monomial approximation can have fewer than $K$ terms.

One simple method for generating an initial max-monomial from which to start the algorithm is as follows [13]. We first fit a single monomial to the data (for example, by the monomial fitting technique described in Section 3.6.2), and then form K versions of it by randomly perturbing the exponents. We express the monomial fit as


$$f_{mon}(x) = c\,(x_1/\bar{x}_1)^{a_1} \cdots (x_n/\bar{x}_n)^{a_n}$$

where

$$\bar{x}_i = \left(\prod_{k=1}^{N} x_i^{(k)}\right)^{1/N}, \quad i = 1, \ldots, n$$

is the geometric mean of the $i$th component of the $N$ data points (i.e., the average value of the $i$th component on a logarithmic scale). We then form our monomials as

$$f_k(x) = c\,(x_1/\bar{x}_1)^{a_1 + \delta_{k,1}} \cdots (x_n/\bar{x}_n)^{a_n + \delta_{k,n}}$$

where the $\delta_{k,j}$ ($k = 1, \ldots, K$, $j = 1, \ldots, n$, where $n$ is the dimension of the data points) are small independent random variables. This generates $K$ MFs, all near the original one, that are slightly different.

After the data points are fitted with a max-monomial function

$$f(x) = \max_{k=1,\ldots,K} f_k(x)$$

an inequality constraint such as (3.43) can be dealt with in the geometric program:

$$\max_{k=1,\ldots,K} f_k(x) \le 1 \qquad (3.43)$$

First, a slack variable $u \ge \max_{k=1,\ldots,K} f_k(x)$ is defined, and the above inequality constraint can be rewritten as

$$u \le 1, \qquad f_1(x) \le u, \quad \ldots, \quad f_K(x) \le u$$


Dividing both sides of the last $K$ inequalities by the monomial $u$, the GP-compatible inequality constraints are obtained in (3.44):

$$u \le 1, \qquad f_1(x)/u \le 1, \quad \ldots, \quad f_K(x)/u \le 1 \qquad (3.44)$$

3.6.5 Posynomial Fitting

One can fit a function $f(x)$ by a posynomial function $f_{PF}(x)$ of the form (3.45), such that $f(x) \approx f_{PF}(x)$:

$$f_{PF}(x) = \sum_{i=1}^{m} k_i\,x_1^{\alpha_{i1}} x_2^{\alpha_{i2}} \cdots x_n^{\alpha_{in}} \qquad (3.45)$$

where $x$ is a positive vector of $n$ real variables, and the parameters satisfy $\alpha_{ij} \in \mathbb{R}$ and $k_i > 0$ ($i = 1, \ldots, m$, $j = 1, \ldots, n$). The details are as follows [13].

First, taking the logarithm of (3.45) gives the following expression:

$$\log f_{PF}(x) = \log \sum_{i=1}^{m} e^{\log(k_i x_1^{\alpha_{i1}} \cdots x_n^{\alpha_{in}})} = \log \sum_{i=1}^{m} e^{\log k_i + \alpha_{i1}\log x_1 + \cdots + \alpha_{in}\log x_n} = \log \sum_{i=1}^{m} e^{a_i^T y + b_i}$$

where $y = [\log x_1, \log x_2, \ldots, \log x_n]^T$, $a_i = [\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in}]^T$, and $b_i = \log k_i$. One can find the function parameters $a_i$ and $b_i$, for example, by minimizing the $\infty$-norm of the difference between $\log f$ and $\log f_{PF}$ over the data points, as in the following expression:


$$\text{Minimize} \quad \max_i \left|\log f_i - \log f_{PF,i}\right|$$

This problem can be re-expressed in the following form by defining the new slack variable $t = \max_i |\log f_i - \log f_{PF,i}|$:

Minimize   $t$
Subject to $\log f_i - \log\left(\sum_{j=1}^{m} e^{a_j^T y_i + b_j}\right) \le t, \quad i = 1, \ldots, N$          (3.46)
           $\log\left(\sum_{j=1}^{m} e^{a_j^T y_i + b_j}\right) - \log f_i \le t, \quad i = 1, \ldots, N$

Because the first $N$ constraints are not convex in the parameters, problem (3.46) is not convex. Existing methods for this type of problem (the damped Newton method, the Gauss-Newton method, or sequential quadratic programming) are local, as is the max-monomial fitting method described above. This means that they cannot guarantee convergence to the true minimum; indeed, the final posynomial approximation obtained can depend on the initial one. But these methods often work well in practice when initialized with reasonable starting PFs. Because the final PF found depends on the starting PF, it is common to run the algorithm from several different starting points and take the best final PF found. For example, a damped Newton method may be used to solve this problem with initial conditions close to those provided by a monomial fit. Since the problem is not convex, the result depends strongly on the starting point; to obtain a good result, it is advisable to start from several different initial conditions.
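As one concrete local method, the sketch below fits an $m$-term posynomial by a Gauss-Newton-type iteration on the log-residuals, using scipy.optimize.least_squares (a least-squares variant of (3.46), not the exact $\infty$-norm problem). The synthetic data and the random initialization are assumptions made for this example:

import numpy as np
from scipy.special import logsumexp
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
X = rng.uniform(0.5, 2.0, size=(100, 2))          # synthetic positive data
f = X[:, 0] * X[:, 1] + 0.5 / X[:, 0]             # a true 2-term posynomial
m, n = 2, X.shape[1]
L = np.log(X)

def residuals(theta):
    th = theta.reshape(m, n + 1)                  # rows: [b_i, a_i1..a_in]
    E = th[:, 0][None, :] + L @ th[:, 1:].T       # (N, m) log term values
    return logsumexp(E, axis=1) - np.log(f)       # log f_PF - log f

theta0 = rng.normal(0.0, 0.3, size=m * (n + 1))   # random start; rerun several
res = least_squares(residuals, theta0)            # local fit of (3.45)
print(res.cost)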

3.6.6 Convex Piecewise Linear (CPWL) Fitting

Now it is time to introduce the most important fitting technique: convex piecewise-linear (PWL) fitting, which will be used in the two case-study examples presented later in Chapter 5.


Instead of monomial fitting, a convex PWL fitting is used in [12, 118] to reduce the modeling error for short-channel transistors, where the square-law model no longer represents the physical behavior of the device well. A convex PWL function is defined as

$$\tilde{f}(y) = \max_{i=1,\ldots,m} \{f_i(y)\} \qquad (3.47)$$

where each $f_i(y)$ is an affine function of $y$, $f_i = a_i^T y + b_i$. Second, as described in Section 3.4.7, a convex data set $(y_i, f_i)$, $i = 1, \ldots, m$, generated by any convex function $f = f(y)$ can be fitted by a PWL function with an arbitrarily small fitting error, if the PWL function is allowed an arbitrarily large number of segments. Because a PF becomes convex under the logarithmic transformation, it can be approximated by a PWL function in log scale. This type of fitting problem can be cast as a linear optimization [12, 19]. The algorithm is described in Figure 3.2.

The first step is to solve the linear optimization with a small initial set of planes $S_1$. The next step is to calculate the resulting fitting error $\|\tilde{f} - f\|$. If it is within an acceptable level, we stop and take the current solution. Otherwise, the index $i$ corresponding to the maximum fitting error is added to the set $S_1$ to construct the second subset $S_2$.


1. Solve the following linear optimization problem:

   Minimize   $\|\tilde{f} - f\|_p$
   Subject to $\tilde{f}_j \ge \tilde{f}_i + g_i^T (y_j - y_i), \quad i \in S_1, \; j = 1, \ldots, m$

   with variables $\tilde{f} \in \mathbb{R}^m$ and $g_i \in \mathbb{R}^k$, for the given data $(y_i, f_i)$, $y_i \in \mathbb{R}^k$.
2. If the fitting error $\|\tilde{f} - f\|_p$ is acceptable, stop.
3. Otherwise, find the index $i$ corresponding to the maximum fitting error in step 2.
4. Add $i$ to $S_1$ to construct $S_2$, and return to step 1 for the next iteration.

Figure 3.2: Procedure of the convex PWL fitting algorithm

That is, we iteratively add a plane at the point $(y_i, \tilde{f}_i)$ responsible for the largest fitting error, and then go back to step 1 for the next iteration. Since the fitting is performed in the log domain, we need to transform back to the real domain with $x_i = e^{y_i}$:

$$\tilde{f}(x)_{fitted} = \exp\left[\tilde{f}(\log x)_{fitted}\right] = \max_{i=1,\ldots,m} \left(c_i\,x_1^{\alpha_{i1}} x_2^{\alpha_{i2}} \cdots x_k^{\alpha_{ik}}\right) \qquad (3.48)$$

where $a_i = g_i$, $b_i = \tilde{f}_i - g_i^T y_i$, $c_i = e^{b_i}$, and $\alpha_{ik}$ denotes the $k$th component of $a_i \in \mathbb{R}^k$.
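A lightweight variant of this procedure can be sketched in a few lines of numpy: start from one global affine fit, then repeatedly add a plane fitted by least squares around the worst-fit data point. This greedy version is an illustrative simplification of the linear program in Figure 3.2, not a reimplementation of it:

import numpy as np

def fit_affine(Y, F):
    # Least-squares affine fit F ~ b + g^T y; returns [b, g_1..g_k]
    A = np.column_stack([np.ones(len(F)), Y])
    coef, *_ = np.linalg.lstsq(A, F, rcond=None)
    return coef

def cpwl_fit(Y, F, tol=1e-2, max_planes=20, nbr=15):
    planes = [fit_affine(Y, F)]                   # one global affine fit to start
    A = np.column_stack([np.ones(len(F)), Y])
    while len(planes) < max_planes:
        Fhat = (A @ np.stack(planes).T).max(axis=1)   # current PWL value (3.47)
        err = np.abs(F - Fhat)
        i = int(err.argmax())
        if err[i] <= tol:
            break
        # add a plane fitted to the nearest neighbours of the worst point
        idx = np.argsort(np.linalg.norm(Y - Y[i], axis=1))[:nbr]
        planes.append(fit_affine(Y[idx], F[idx]))
    return np.stack(planes)     # rows [b_i, g_i]; c_i = exp(b_i) per (3.48)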

3.7 Methods for Solving Geometric Programs

During the past four decades, extensive research on methods for solving GPs has been conducted and applied in various fields. However, these methods did not work well or efficiently in practice until the development of interior-point methods in the 1990s. In this section, we briefly introduce some of these methods.


In the early stages, efforts were made to solve GPs directly in the standard non-convex form. The work in [120] solved the GP (3.1) with several cutting-plane algorithms, and [121] approximated a given GP by an LP and then solved it. Some methods were also devised to solve GPs in convex form: for example, a general-purpose algorithm is used to solve the convex-form GP in [122], and a general method for convex programming is used in [123]. These general-purpose methods could not solve all GPs, however, and they were often slow.

A huge volume of literature has been devoted to solving the dual GP, which has the advantage of containing only linear constraints. Work [124] solved the dual GP based on a direct search method; work [125] solved the Karush-Kuhn-Tucker conditions of the dual GP.

Convergence, however, was very slow, and in some cases the methods failed entirely.

Interior-point methods use smooth minimization techniques, often Newton’s method, to solve a sequence of smooth unconstrained (or equality-constrained) problems. If the sequence of smooth problems is selected appropriately, the resulting algorithms are very efficient both in theory and practice. More details on interior-point methods can be found in

[17, 126].

Recently, there have been several important developments related to solving the GP in the convex form. A huge improvement in computational efficiency was achieved in 1994, when Nesterov and Nemirovsky developed efficient interior point algorithms to solve a variety of nonlinear optimization problems, including GP [113]. Recently, Kortanek et al. have shown how the most sophisticated primal-dual interior-point methods used in LP can be extended to GP, resulting in an algorithm approaching the efficiency of current interior-point


LP solvers [114]. The algorithm they describe has the desirable feature of exploiting sparsity in the problem, i.e., efficiently handling problems in which each variable appears in only a few constraints.

Xu [127] compares the performance of the sophisticated primal-dual interior-point method developed by Kortanek, Xu, and Ye [108] with two general purpose optimizers,

MINOS and LINGO-NL, on standard geometric programming problems in convex form.

Their primal-dual interior-point method performs substantially better than the other codes; it is more efficient and robust.

For our purposes, the most important feature of geometric programs is that they can be globally solved with great efficiency. Problems with hundreds of variables and thousands of constraints are readily handled on a small workstation in minutes. The problems we encounter in this work have a few variables and fewer than 100 constraints, and are easily solved in less than one second.

Perhaps even more important than the great efficiency is the fact that algorithms for geometric programs in convex form always obtain the global minimum [13]. Infeasibility is unambiguously detected: If the problem is infeasible, the algorithm will determine this fact and not just fail to find a feasible point. Another benefit of the global solution is that the initial starting point is irrelevant; the same global solution is found regardless of the initial starting point.

These properties should be compared to general methods for nonlinear optimization (such as sequential quadratic programming), which only find locally optimal solutions and cannot unambiguously determine infeasibility. As a result, the starting point for the optimization


algorithm does have an effect on the final point found. Indeed, the simplest way to lower the risk of finding a local (rather than a global) optimal solution is to run the algorithm several times from different starting points. This heuristic strategy only reduces the risk of finding a non-global solution. For geometric programming, in contrast, the risk is always exactly zero, since the global solution is always found, regardless of the starting point.

As mentioned above, a wide variety of other codes can also be used to solve the convex form GP. Since this simple interior-point method already works extremely fast on relatively small problems such as those encountered in this work, the choice of algorithm is not critical.

When the method is applied to large-scale problems, such as the ones obtained for a robust design problem, the choice may become critical; it may then be necessary to use primal-dual interior-point methods that handle sparsity [126].

3.8 Generalized GP and Relevant Solution Methods

In this section, we introduce some extensions of GP. More relaxed versions of optimization problems can be transformed into standard GPs by introducing slack variables, which further extends the applicability of GP. A generalized posynomial function is a function of the form

$$h(x) = \Phi(f_1(x), \ldots, f_k(x)) \qquad (3.49)$$

where $\Phi: \mathbb{R}^k \rightarrow \mathbb{R}$ and $f_i: \mathbb{R}^n \rightarrow \mathbb{R}$ are posynomials, and all the exponents of $\Phi$ are nonnegative. For example, suppose $\Phi(z_1, z_2) = 2z_1^{0.3} z_2^{1.2} + z_1 z_2^{0.5}$ and $f_1$ and $f_2$ are posynomials. Then the function

$$h(x) = \Phi(f_1(x), f_2(x)) = 2 f_1(x)^{0.3} f_2(x)^{1.2} + f_1(x) f_2(x)^{0.5}$$


is a generalized posynomial. Note that $h$ is not a posynomial unless $f_1$ and $f_2$ are monomials or constants.

A generalized geometric program (GGP) is an optimization problem of the form

Minimize   $h_0(x)$
Subject to $h_i(x) \le 1, \quad i = 1, \ldots, m$          (3.50)
           $g_j(x) = 1, \quad j = 1, \ldots, p$
           $x_k > 0, \quad k = 1, \ldots, n$

where $g_1, \ldots, g_p$ are monomials and $h_0, \ldots, h_m$ are GPFs [30].

The GGP (3.50) can be expressed as an equivalent standard GP by introducing slack variables as follows; it can then be solved by the same solver as that used for GP. Shown below is the transformation from a GGP to an equivalent GP. Using the definition of a GPF (3.49), a GGP has the form

Minimize   $\Phi_0(f_{0,1}(x), \ldots, f_{0,k_0}(x))$
Subject to $\Phi_i(f_{i,1}(x), \ldots, f_{i,k_i}(x)) \le 1, \quad i = 1, \ldots, m$          (3.51)
           $g_j(x) = 1, \quad j = 1, \ldots, p$
           $x_k > 0, \quad k = 1, \ldots, n$

where the functions $\Phi_i$, $i = 0, 1, \ldots, m$, are PFs with nonnegative exponents, the functions $f_{i,l}(x)$, $l = 1, \ldots, k_i$, are PFs, and the functions $g_j(x)$ are MFs. This is equivalent to the following GP:

Minimize   $\Phi_0(t_{0,1}, \ldots, t_{0,k_0})$
Subject to $\Phi_i(t_{i,1}, \ldots, t_{i,k_i}) \le 1, \quad i = 1, \ldots, m$
           $f_{i,l}(x)\,t_{i,l}^{-1} \le 1, \quad l = 1, \ldots, k_i$          (3.52)
           $g_j(x) = 1, \quad j = 1, \ldots, p$
           $x_k > 0, \quad k = 1, \ldots, n$


with variables $x$ and $t_{i,l}$, $i = 0, 1, \ldots, m$, $l = 1, \ldots, k_i$. The equivalence follows from the fact that the exponents of $\Phi_i$ are nonnegative, so $\Phi_i$ is nondecreasing in each of its arguments; therefore, we will have $t_{i,l} = f_{i,l}(x)$ at the optimum.

3.9 The Signomial Problem and Relevant Solution Methods

In this section, we introduce another extension of GP: the signomial geometric program. A signomial geometric program (SGP) is an optimization problem of the form

Minimize   $h_0(x)$
Subject to $h_i(x) \le 1, \quad i = 1, \ldots, m$          (3.53)
           $g_j(x) = 1, \quad j = 1, \ldots, p$
           $x_k > 0, \quad k = 1, \ldots, n$

where the $h_i(x)$ and $g_j(x)$, $i = 0, 1, \ldots, m$, $j = 1, \ldots, p$, are SFs, i.e., functions with the same form as a PF except that the coefficients are allowed to be negative [30]. From (3.53), one can see that an SGP has the same form as a GP, except that the objective and constraint functions are SFs. Also, a GP is obviously an SGP.

Unlike for a GP, only a locally optimal solution of an SGP can be obtained efficiently, not the globally optimal solution. The earliest research on signomial programs dates to 1967 [128]. Thereafter, many efforts were made to find solution methods; see [129] for more information about methods for solving signomial problems approximately and locally.

Some solvers use a condensed PF to search for local minima of signomial programs [26]. The condensed PF is closely related to the arithmetic-geometric mean inequality, as follows.


Arithmetic-geometric mean inequality: For any positive vector $w$ and any nonnegative vector $\epsilon = [\epsilon_1, \ldots, \epsilon_n]$ satisfying $\sum_{i=1}^{n} \epsilon_i = 1$, the following inequality holds:

$$\sum_i w_i \ge \prod_i \left(\frac{w_i}{\epsilon_i}\right)^{\epsilon_i}$$

where $(w_i/\epsilon_i)^{\epsilon_i} = 1$ when $\epsilon_i = 0$.

Condensed posynomial: Given a PF $f(x) = \sum_{i=1}^{n} g_i(x)$ and a vector $\epsilon$ satisfying the conditions above, the corresponding condensed posynomial $f_c(x)$ with weights $\epsilon$ is defined as

$$f_c(x) = \prod_{i=1}^{n} \left(\frac{g_i(x)}{\epsilon_i}\right)^{\epsilon_i}$$

The condensed posynomial is an MF, and for any value of $x$ the following inequality holds:

$$f(x) \ge f_c(x) \qquad (3.54)$$

Most general methods rely on the condensed posynomial to approximate the PFs in a signomial program by condensed posynomials iteratively [130, 131, 132]. Under some conditions, this iterative process converges to a local minimum of the original signomial problem.
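The condensation step itself is a one-liner once the term values at the current point are known. The sketch below condenses a single-variable posynomial at a point $x_0$, choosing the weights $\epsilon_i = g_i(x_0)/f(x_0)$ so that $f_c(x_0) = f(x_0)$; the example function is invented for illustration:

import numpy as np

def condense(terms, x0):
    # Condensed monomial f_c for f = sum(terms), with weights from x0
    g0 = np.array([t(x0) for t in terms])
    eps = g0 / g0.sum()                    # weights eps_i = g_i(x0)/f(x0)
    def f_c(x):
        g = np.array([t(x) for t in terms])
        return np.prod((g / eps)**eps)     # AM-GM lower bound, tight at x0
    return f_c

terms = [lambda x: x**2, lambda x: 3.0 / x]   # hypothetical PF f(x) = x^2 + 3/x
f_c = condense(terms, 1.0)
print(f_c(1.0), 1.0**2 + 3.0 / 1.0)           # equal at x0 = 1.0 (both 4.0)
print(f_c(2.0), 2.0**2 + 3.0 / 2.0)           # f_c <= f elsewhere, per (3.54)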

One can also approximate the posynomial constraints and objective by condensed posynomials; after taking logarithms, the approximate problem is just an LP. Several iterative methods in [132, 133, 134] use LPs to solve signomial programs. One can also exploit general-purpose optimizers to solve signomial programs, with limited efficiency [89].


Another general method is introduced here to find a local solution of an optimization problem that has the same form as a GP but whose objective and inequality constraints are not posynomials and whose equality constraints are not monomials [89]; the SGP is obviously a special case. At each step there is a solution guess $x^{(l)}$; the objective and each constraint function are fitted and replaced with their best local monomial approximations near $x^{(l)}$ (e.g., within a predefined trust region of size $\delta$) to derive a GP. The signomial geometric program (3.53) then becomes the optimization problem

$$
\begin{aligned}
\text{Minimize} \quad & \hat{h}_0(x) \\
\text{Subject to} \quad & \hat{h}_i(x) \le 1, \quad i = 1, \ldots, m \\
& \hat{g}_j(x) = 1, \quad j = 1, \ldots, p \\
& x_k^{(l)} / \delta \le x_k \le \delta\, x_k^{(l)} \\
& x_k > 0, \quad k = 1, \ldots, n
\end{aligned}
\tag{3.55}
$$

where $\hat{h}_i(x)$, $i = 0, 1, \ldots, m$, is the local monomial approximation of $h_i(x)$ at $x^{(l)}$; $\hat{g}_j(x)$, $j = 1, \ldots, p$, is the local monomial approximation of $g_j(x)$ at $x^{(l)}$; and $\delta$ defines the trust region at each guess step. Problem (3.55) is now a GP and can be readily solved. Its solution is used as the next solution guess $x^{(l+1)}$, and the process continues in this way until it converges.

This method is local: it need not converge (though it very often does), and it can converge to a point that is not the global solution. But if the problem is not too far from a GP and the starting guess $x^{(0)}$ is good, it can work well in practice.
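The core operation at each iteration is the best local monomial approximation, which is simply a first-order Taylor expansion of $\log f$ in the variables $\log x_k$. A minimal sketch (the example signomial is hypothetical and must be positive near the expansion point):

```python
import numpy as np

def local_monomial(f, grad_f, x0):
    """Local monomial approximation m(x) = c * prod_k x_k^{a_k} of a positive
    function f near x0, matching f's value and gradient at x0."""
    f0 = f(x0)
    a = x0 * grad_f(x0) / f0          # exponents a_k = d(log f) / d(log x_k)
    c = f0 / np.prod(x0 ** a)         # coefficient enforcing m(x0) = f(x0)
    return c, a

# Example: a signomial that is positive near x0 = (1, 2).
f = lambda x: x[0] * x[1] - 0.5 * x[0] ** 2
grad = lambda x: np.array([x[1] - x[0], x[0]])
c, a = local_monomial(f, grad, np.array([1.0, 2.0]))
print(c, a)   # the GP iterate uses c * x1^a[0] * x2^a[1] in place of f
```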

There are many variations on this method [89]. For example, if a constraint in the original problem (3.53) is already in a GP-compatible format, say $h_i(x)$ is a posynomial, it no longer has to be fitted with a monomial function; one can approximate only those terms that are not GP-compatible, i.e., the terms with negative coefficients. A general signomial inequality constraint $f(x) \le 1$ can be expressed as

$$f(x) = g(x) - h(x) \tag{3.56}$$

where $g(x)$ and $h(x)$ are both posynomials: $g(x)$ consists of the terms of the signomial $f(x)$ with positive coefficients, and $-h(x)$ consists of the terms with negative coefficients. The signomial inequality (3.56) can be represented as

$$\frac{g(x)}{1 + h(x)} \le 1$$

One then approximates the denominator, $1 + h(x)$, with a monomial function $\phi(x)$ at the current point $x^{(l)}$ and substitutes the resulting posynomial constraint

$$\frac{g(x)}{\phi(x)} \le 1 \tag{3.57}$$

into the GP (3.55) for use at the next iterate.

3.10 Summary

In this chapter, we have given a brief overview of geometric programming and the relevant terminology. The most important feature of geometric programs is that they can be transformed into convex problems and then solved globally and efficiently. Their main limitation is that they require a special format: a posynomial objective, posynomial inequality constraints, and monomial equality constraints. Therefore, we have also reviewed and compared some convex fitting techniques for deriving the GP format.


Chapter 4

Some Reviews of Analog Optimization Designs

4.1 Introduction

As described in Chapter 3, geometric programming has an overwhelming advantage over many current IC design optimization algorithms, but it requires a special structure: a PF objective, PF inequality constraints, and MF equality constraints [13]. A conflict thus arises, because the behaviors of devices such as transistors, inductors, and capacitors in benchmark models such as BSIM3v3 are in general completely non-convex. If one wants to use GP for IC optimization design, then, the first step should be to reformulate the device characteristics into a GP-compatible format to derive GP-compatible device models.

In this chapter, the GP-compatible models for a Complementary Metal-Oxide-Semiconductor (CMOS) transistor in 0.18 µm technology are derived for later use, and the involved GP-compatible bias description techniques are shown to handle the PF equality bias constraints encountered. Also, two previous works [11, 12] are reviewed to show a GP-based application in analog circuit optimization design and the limitations of previously used GP-compatible modeling techniques.

4.2 GP-compatible Bias Description Techniques

The models of device characteristics used in electronic circuit simulation and verification tools such as Cadence and Agilent ADS are highly nonlinear and non-convex. Some of those transistor properties need to be approximated in a convex format with appropriate convex fitting techniques. The fitting process unavoidably produces some approximation errors, which are called modeling errors in Kim's work [12].

Another major error comes from handling the bias conditions of the electronic system being designed. The most common bias conditions are those required by KVL and KCL (Kirchhoff's Voltage Law and Kirchhoff's Current Law), and these bias condition constraints are usually non-convex equality constraints. Accordingly, some convex fitting techniques are also needed to convert these non-convex equality constraints into the convex format required by a convex program or geometric program. This type of error, related to bias condition estimation, is called bias estimation error [12].

Both modeling errors and bias estimation errors can have a huge impact on the overall prediction accuracy in designing an electronic system. As defined, a modeling error is produced by the conversion of non-convex device properties into a convex format compatible with a convex program or geometric program; this error can be reduced by selecting appropriate convex fitting techniques (described in the previous chapter). This section addresses several GP description techniques that can help reduce bias condition errors. Section 4.2.1 shows the need for PF equality constraints when considering the required bias conditions and then demonstrates a proposed solution. Section 4.2.2 describes how to convert non-convex PF equalities into convex constraints compatible with GP.

4.2.1 Bias Condition Description with PF Equalities

One major prediction error source is bias estimation [12]. Bias condition errors may result from an incomplete bias condition description, so more bias condition descriptions are often required to reduce bias errors. However, describing bias conditions often requires equality constraints. One of the most common bias conditions is the satisfaction of KVL, which is an equality condition. For instance, the bias conditions for Figure 4.1 require that the sum of the drain-source voltages equal the power supply voltage $V_{DD}$, i.e., $V_{DS1} + V_{SD2} = V_{DD}$.

Figure 4.1: Common-source op-amp with PMOS active load

Because $V_{DS}$ is a design variable in a GP device model, this constraint becomes the following posynomial equality:

$$\frac{1}{V_{DD}} (V_{DS1} + V_{SD2}) = 1 \tag{4.1}$$

Bias condition (4.1) is obviously a PF equality, which is not a type of constraint compatible with GP. It is therefore necessary to convert the non-convex PF equality into a convex constraint; the next section addresses such conversion techniques.


4.2.2 Techniques for Converting PF Equality Constraints

In general, a GP cannot handle posynomial equality constraints. In some very special cases, however, it is possible: for instance, constraints of the form $h(x) = 1$, where $h(x)$ is a PF of the variable $x$, can be acceptable. The idea can be shown by considering the following optimization problem [12]:

$$
\begin{aligned}
\text{Minimize} \quad & f_0(x) \\
\text{Subject to} \quad & f_i(x) \le 1, \quad i = 1, \ldots, m \\
& h(x) = 1
\end{aligned}
\tag{4.2}
$$

where $f_i(x)$, $i = 0, \ldots, m$, and $h(x)$ are PFs of the design variable $x$. This is not a GP unless $h(x)$ is a monomial. Then consider the related problem

$$
\begin{aligned}
\text{Minimize} \quad & f_0(x) \\
\text{Subject to} \quad & f_i(x) \le 1, \quad i = 1, \ldots, m \\
& h(x) \le 1
\end{aligned}
\tag{4.3}
$$

where the posynomial equality has been relaxed to a posynomial inequality. This problem is a GP. Now suppose that at any optimal solution $x^*$ of problem (4.3), $h(x^*) = 1$ holds, i.e., the inequality $h(x) \le 1$ is active at the solution. Then by solving the GP (4.3), the related non-GP problem (4.2) is essentially solved. One can show that a sufficient condition is that there is a variable $\hat{x}$ such that

• $f_0$ is monotonically increasing in $\hat{x}$, and $f_1, \ldots, f_m$ are non-increasing in $\hat{x}$;
• $h$ is monotonically decreasing in $\hat{x}$.

The proof of this condition is as follows [30]. Because we want to minimize $f_0$, which is increasing in $\hat{x}$, the optimizer pushes $\hat{x}$ as small as possible. The functions $f_1, \ldots, f_m$ are non-increasing in $\hat{x}$, so making $\hat{x}$ smaller does not degrade the inequalities $f_i(x) \le 1$. What limits the decrease of $\hat{x}$ is $h(x) \le 1$: since $h$ is decreasing in $\hat{x}$, reducing $\hat{x}$ increases $h$, so at a certain point $\hat{x}$ cannot be pushed any lower without violating $h(x) \le 1$. Therefore $h(x) = 1$ is always active at the optimum. Likewise, one can show that exactly the same principle applies to the reverse case:

• $f_0$ is monotonically decreasing in $\hat{x}$;
• $f_1, \ldots, f_m$ are non-decreasing in $\hat{x}$;
• $h$ is monotonically increasing in $\hat{x}$.

This special property is referred to as monotonicity [30]. By utilizing monotonicity, the KVL constraint can be made active in most cases.

Considering the common-source amplifier design example in Figure 4.1, if the design goal is to maximize the voltage gain $A_V$, the problem can be described as follows:

$$
\begin{aligned}
\text{Minimize} \quad & (g_{ds1} + g_{ds2})\, g_{m1}^{-1} \\
\text{Subject to} \quad & \frac{1}{V_{DD}} (V_{DS1} + V_{SD2}) = 1 \\
& (V_{DSAT1} + V_{out,\min})\, V_{DS1}^{-1} \le 1
\end{aligned}
\tag{4.4}
$$

According to the above monotonicity property, it can be changed to the form of (4.5)-(4.7):

$$\text{Minimize} \quad (g_{ds1} + g_{ds2})\, g_{m1}^{-1} \tag{4.5}$$

$$\text{Subject to} \quad \frac{1}{V_{DD}} (V_{DS1} + V_{SD2}) \le 1 \tag{4.6}$$

$$(V_{DSAT1} + V_{out,\min})\, V_{DS1}^{-1} \le 1 \tag{4.7}$$

Constraint (4.6) becomes always active as long as the other inequalities and the objective, such as (4.5) and (4.7), improve or do not degrade as $V_{DS1}$ or $V_{SD2}$ grows. In this example, if $g_{ds1}$, $1/g_{m1}$, and $V_{DSAT1}$ decrease as $V_{DS1}$ grows, then (4.5) and (4.7) only improve as $V_{DS1}$ grows; therefore, at the optimum, (4.6) is always active. This idea can easily be extended to activate more general KVL equalities.

Note that this implies that we must constrain the values of the exponents of the fitted model. In the previous example, the exponents of $V_{DS}$ in the fitted expressions for $g_{ds1}$, $1/g_{m1}$, and $V_{DSAT1}$ must be nonpositive. However, this limitation does not degrade the accuracy of the model in the designs we have studied [12].

As addressed above, by adding more of the important bias condition descriptions and then converting these posynomial equalities into convex constraints, the bias condition error is expected to be reduced.
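The following sketch illustrates the relaxation on a toy version of (4.4)-(4.7) in CVXPY's GP mode. The objective exponents are hypothetical stand-ins for the fitted dependence of the gain terms on the drain-source voltages, chosen only so that the monotonicity condition holds.

```python
import cvxpy as cp

VDD = 1.8
Vds1 = cp.Variable(pos=True)
Vsd2 = cp.Variable(pos=True)

# Toy objective, decreasing in both voltages (hypothetical exponents), so the
# optimizer pushes both voltages up against the relaxed KVL constraint.
objective = cp.Minimize(Vds1 ** -0.5 + Vsd2 ** -0.2)

constraints = [
    (Vds1 + Vsd2) / VDD <= 1,   # PF equality (4.1) relaxed to an inequality
    Vds1 >= 0.2,
    Vsd2 >= 0.2,
]

prob = cp.Problem(objective, constraints)
prob.solve(gp=True)
print(Vds1.value + Vsd2.value)   # ~1.8: the relaxed constraint is active,
                                 # so the original KVL equality is recovered
```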

4.3 Reviews of Two Analog Optimization Designs

Some research communities have been working aggressively toward computer-aided analog optimization design; a good survey of analog synthesis techniques is available in [2, 5]. Here, two previous works [11, 12] are reviewed to show some GP-based applications in analog circuit optimization design.


4.3.1 Circuit Topology, Optimization Objective, and Specifications

In both works, the same widely used two-stage CMOS op-amp is designed, as shown in Figure 4.2; it consists of a differential amplifier input stage biased by a PMOS (Positive Metal-Oxide-Semiconductor) current source, followed by a common-source stage with an active load [11, 12]. The objective is to maximize the unit-gain bandwidth while satisfying a set of practical specifications on low frequency gain, power dissipation, occupied area, phase margin, output swing, and so on. There are 18 optimal design variables in this circuit: the transistor widths $W_1, \ldots, W_8$ and lengths $L_1, \ldots, L_8$ of all eight transistors, the bias current $I_{BIAS}$, and the compensation capacitance $C_C$.

Figure 4.2: Two-stage op-amp optimization design

First, the optimization objective and the specifications involved are derived and tabulated in Appendix C. This set of equalities and inequalities appropriately describes the two-stage op-amp design problem and serves later as the coarse system model in our proposed surrogate modeling and optimization algorithm. The table demonstrates that, except for the positive power supply rejection ratio (PPSRR), which is easily satisfied and so is not considered first in such optimization designs, all of the performance measures of this two-stage op-amp are PFs or GPFs of the optimal design variables. The op-amp design can therefore be formulated as a GP and solved effectively.

4.3.2 Hershenson's Work

Hershenson's work [11] first conducted the optimization design in 0.6 µm CMOS technology. Monomial fitting is used to fit all of the 11 design parameters involved in the coarse circuit model of the two-stage op-amp (represented by the set of equalities and inequalities in the table in Appendix C) to create a monomial device model compatible with a geometric program. (The 11 parameters are the saturated drain-to-source voltage $V_{DSAT}$, the gate-to-source voltage $V_{GS}$ and its inverse $V_{GS}^{-1}$, the gate overdrive voltage $V_{OV}$ and its inverse $V_{OV}^{-1}$, the transconductance $g_m$ and inverse transconductance $g_m^{-1}$, the output conductance $g_{ds}$, the drain-to-source capacitance $C_{DS}$, the gate-to-source capacitance $C_{GS}$, and the bulk-to-drain capacitance $C_{BD}$.) The design is then cast in GP format; for example, the output conductance of the NMOS (Negative Metal-Oxide-Semiconductor) transistors used in the low frequency gain $A_V$ is expressed with the monomial function $g_{ds} = 3.1 \times 10^{-2}\, W^{0.18} L^{-1.14} I_{DS}^{0.82}$. The design specifications are listed in the 2nd column of Table 4.1, and the 3rd column lists the GP optimization results. The optimal values of the design variables are input to the SPICE simulation as device sizes and component values, and the corresponding SPICE simulation results are shown in the 4th column of Table 4.1; the comparison illustrates that the GP optimization results agree well with the SPICE simulation results. For example, there is only a 2.8 dB discrepancy in the low frequency gains.


This means that the monomial device model in GP is accurate for long-channel transistors, where the square-law model can represent the electrical characteristics quite accurately.
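Monomial expressions such as the $g_{ds}$ fit above are typically obtained by ordinary least squares in log space. A minimal sketch, where the array names are placeholders for SPICE-extracted samples:

```python
import numpy as np

def monomial_fit(W, L, Ids, gds):
    """Fit gds ~ c * W^a1 * L^a2 * Ids^a3 by least squares on log(gds).
    W, L, Ids, gds: 1-D arrays of positive SPICE-extracted samples."""
    A = np.column_stack([np.ones_like(W), np.log(W), np.log(L), np.log(Ids)])
    sol, *_ = np.linalg.lstsq(A, np.log(gds), rcond=None)
    return np.exp(sol[0]), sol[1:]   # coefficient c and exponents (a1, a2, a3)
```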

Table 4.1: Comparison between the GP optimizer and SPICE simulation results in 0.6 µm technology

| Constraints | Specifications | GP Results | SPICE Results |
| Power dissipation (mW) | ≤ 5 | 4.99 | 4.95 |
| Unit Gain Frequency (MHz) | Max. | 86 | 81 |
| Phase Margin (deg) | ≥ 60 | 60 | 64 |
| Low Frequency Gain (dB) | ≥ 80 | 89.2 | 86.4 |

Table 4.2: Comparison between the GP optimizer and SPICE simulation results in 0.18 µm technology

| Constraints | Specifications | GP Results | SPICE Results |
| Power dissipation (mW) | ≤ 0.2 | 0.2 | 0.22 |
| GBP (MHz) | Max. | 109.6 | 82.8 |
| Phase Margin (deg) | ≥ 60 | 60 | 69.2 |
| Low Frequency Gain (dB) | ≥ 70 | 70 | 55.56 |

4.3.3 Kim's Work

Kim's work [12] redesigns the same two-stage op-amp in 0.18 µm technology to check whether the monomial device model still works well for short-channel transistors. Table 4.2 summarizes the specifications and results. In this case, the GP optimization deviates significantly from the SPICE simulation in some specifications; for example, there is a huge 14.44 dB discrepancy between the low frequency gains. Thus the monomial device model in GP is not accurate enough for short-channel devices, especially in deep submicron technology. As analyzed there, a major error source arises from the transformation between different models: since transistor behaviors in benchmark models such as BSIM3v3 are in general completely non-convex, the monomial fitting used for the GP formulation leads to so-called modeling error.


Table 4.3: Max/mean MF modeling error in the GP model

| Design Parameter | NMOS w/o Vbs (% error, max/mean) | PMOS w/o Vbs (% error, max/mean) |
| 1/gm | 29.8/10.9 | 23.1/7.4 |
| gds | 132.7/49.7 | 128.8/48.7 |

Table 4.4: Discrepancy in specifications

| | GP | SPICE | |
| gds6 | 20.6 µS | 122.7 µS | large modeling error |
| Av | 70 dB | 55.56 dB | 14.44 dB, large prediction error |

Table 4.3 shows the max/mean monomial modeling errors in some design parameters. Although most errors are reasonably small, some modeling errors are quite high, such as the over-40% mean error in the design parameter $g_{ds}$. These monomial modeling errors in all 11 design parameters result in an accumulated performance prediction error in low frequency gain, power dissipation, occupied area, phase margin, output swing, and so on. For example, the errors in the design parameters $1/g_m$ and $g_{ds}$ lead to a performance prediction error of 14.44 dB in the low frequency gain $A_V$ through their relationship $A_V \propto g_m\, g_{ds}^{-1}$, as shown in Table 4.4. More accurate fitting methods for the short-channel regime are thus needed for better accuracy.

Table 4.5: NMOS modeling error in TSMC 0.18 µm technology (max/mean modeling error, %)

| Design Parameter | Monomial (dependency: W, L, I_DS) | Convex PWL (dependency: W, L, I_DS, V_DS) |
| 1/gm | 29.8/10.9 | 5.4/1.7 |
| gds | 132.7/49.7 | 39.8/9.4 |
| VGS | 11.3/3.5 | 3.14/0.79 |
| CGS | 15.5/5.4 | 11.7/3.1 |

Table 4.6: CPWL- and MF-based optimization by GP & SPICE

| Performance Measures | Spec. | PWL GP | PWL SPICE | Mon. GP | Mon. SPICE |
| Output Swing (V) | ≥ 1.4 | 1.4 | 1.43 | 1.46 | 1.54 |
| Power (mW) | ≤ 0.2 | 0.2 | 0.2 | 0.2 | 0.22 |
| DC Gain (dB) | ≥ 70 | 73.78 | 70.3 | 70.0 | 55.56 |
| GBP (MHz) | Max. | 74.00 | 66.9 | 109.6 | 82.8 |
| PM (deg) | ≥ 60 | 60 | 66.48 | 60 | 69.2 |

Kim's work then tries PWL fitting instead of monomial fitting for all 11 of the involved design parameters (mentioned in the previous section), creating a PWL device model and casting the design in GP format. It is verified with NMOS and PMOS transistors in the saturation region, as shown in Figure 4.3. The selected transistor size is within $1\,\mu m \le W \le 25\,\mu m$ and $0.18\,\mu m \le L \le 3\,\mu m$, and SPICE generates only around 2000 data points for the PWL fitting. The model parameters are the transistor width $W$ and length $L$, the drain-source current $I_{DS}$, and the drain-source voltage $V_{DS}$. The third column of Table 4.5 shows that PWL fitting can achieve much higher modeling accuracy than monomial fitting for narrow transistors with short channels when not many design parameter data are fitted; for example, the mean fitting error of the design parameter $g_{ds}$ is reduced from 49.7% to 9.4%. As a result, the PWL-based scenario produces much better prediction accuracy than the monomial-based GP. Table 4.6 contains the PWL-based and monomial-based optimization results and their corresponding SPICE simulation results for the two-stage op-amp; it demonstrates that all specifications are met in the PWL-based optimization and that the prediction discrepancy in low frequency gain (i.e., DC gain) is only 3.48 dB, compared with the 14.44 dB shortfall from the gain specification in the monomial-based case. One can thus say that the PWL-based device model in GP works well for short-channel transistors, provided that the width is not too wide and that there are not too many model data values to be fitted.

Figure 4.3: (a) NMOS transistor without Vbs, (b) PMOS transistor without Vbs, and (c) PMOS transistor with Vbs, operating in the saturation region, used for creating the convex PWL and monomial single-device models for the two-stage op-amp optimization design


4.4 GP-Compatible Device Model

In our work, to investigate whether a CPWL-based GP is still suitable for wide transistors with a large number of fitted design parameter data and with deep submicron technology, the transistor size range is expanded to $0.18\,\mu m \le W \le 80\,\mu m$ and $0.18\,\mu m \le L \le 6.18\,\mu m$, and a large number (about half a million) of design parameter data for all the involved design parameters are generated by SPICE and extracted for CPWL fitting, in order to build a GP coarse model with higher accuracy over a wide range of device sizes and component values. (The design parameters are $V_{DSAT}$, $V_{GS}$, $V_{GS}^{-1}$, $V_{OV}$, $V_{OV}^{-1}$, $g_m$, $g_m^{-1}$, $g_{ds}$, $C_{DS}$, $C_{GS}$, and $C_{BD}$, involved in the coarse circuit model of the two-stage op-amp represented by the set of equalities and inequalities in the table in Appendix C.) The same CMOS transistors as in Figure 4.3 (NMOS without bulk-to-source voltage $V_{BS}$, PMOS with $V_{BS}$, and PMOS without $V_{BS}$) in TSMC 0.18 µm technology are used to create the CPWL device model with the convex PWL fitting technique described in Section 3.6.6. To get a more accurate device model, besides the transistor width $W$ and length $L$, the drain-source current $I_{DS}$, and the drain-source voltage $V_{DS}$, we also consider the bulk-to-source voltage $V_{BS}$ as another device model design parameter; thus, a total of five device model design parameters are considered in our device model.

Now, by fitting the almost half a million device model design parameter data with the convex PWL fitting technique, all 11 of the involved design parameters can be expressed (or appropriately approximated) with a convex PWL expression in terms of those five device model design variables; monomial fitting is also used to express all of the involved device characteristics, such as the output conductance $g_{ds}$ and the inverse transconductance $1/g_m$, with monomial functions. The purpose is to derive the GP-compatible device model. Next, all the performance measures (or specifications) of a given electronic circuit system (e.g., the low frequency gain, unit-gain bandwidth, and phase margin of an op-amp) are expressed in a GP-compatible format (i.e., posynomial inequalities and monomial equalities) in terms of the related optimal design variables or the design parameters in the derived GP-compatible device model. This can be done because any performance measure (or specification) of a given electronic circuit system is a function of the design variables or design parameters representing the device model. For example, the monomial fitting expression for the device design parameter $C_{gs}$ of a PMOS transistor with bulk-to-source voltage $V_{bs}$ is

$$C_{gs,MF} = C\, W^{\alpha_1} L^{\alpha_2} I_{ds}^{\alpha_3} V_{ds}^{\alpha_4} V_{bs}^{\alpha_5}$$

where the coefficient is $C = 0.00401051438798$ and the exponents are $\alpha_1 = 0.98855129419175$, $\alpha_2 = 0.95438149386090$, $\alpha_3 = 0.01128234464400$, $\alpha_4 = -0.02732447668352$, and $\alpha_5 = -0.00000000000107$. However, the convex PWL fitting expression for the same design parameter $C_{gs}$ takes the quite different form

$$C_{gs,CPWL} = \max_{i = 1, \ldots, m} \left( C_i\, W^{\alpha_{i1}} L^{\alpha_{i2}} I_{ds}^{\alpha_{i3}} V_{ds}^{\alpha_{i4}} V_{bs}^{\alpha_{i5}} \right)$$

The values of the coefficients and exponents are shown in Table 4.7; one can see that $C_{gs,CPWL}$ is the point-wise maximum of 32 monomial functions in terms of all five device model design parameters.

Table 4.7: Coefficients and exponents of $C_{gs}$ of the PMOS transistor with $V_{bs}$

| $c_i$ | $a_{i1}$ | $a_{i2}$ | $a_{i3}$ | $a_{i4}$ | $a_{i5}$ |
| 8.9671e-005 | 4.8515e-001 | 1.0569e+000 | 1.9886e-001 | -8.5605e-002 | -2.0115e+000 |
| 7.2531e-004 | 9.1783e-001 | 9.9930e-001 | 7.9767e-002 | -1.3399e+000 | -1.8409e-001 |
| 4.6709e-007 | 9.0936e-001 | 4.5863e-001 | 9.0789e-002 | -6.7619e-001 | -1.9294e-001 |
| 9.3167e+002 | 2.6545e-002 | 2.0331e+000 | 9.5394e-001 | 4.2750e-001 | -2.7528e+000 |
| 1.0242e+001 | 1.7898e-001 | 1.7918e+000 | 8.0646e-001 | -8.9508e-001 | -1.9933e+000 |
| 5.8550e-009 | 9.2491e-001 | 1.0262e-001 | 7.5409e-002 | -7.7828e-003 | -1.9605e-001 |
| 1.4932e+005 | -6.2646e-001 | 2.6023e+000 | 1.6080e+000 | 4.5448e-001 | -1.9839e+000 |
| 3.3978e-003 | 9.7655e-001 | 9.4530e-001 | 2.4509e-002 | -1.9501e-002 | -4.5647e-002 |
| 2.9292e-003 | 9.7859e-001 | 9.9529e-001 | -6.0674e-002 | -4.6263e-002 | -1.4655e-002 |
| 3.9215e-003 | 9.8156e-001 | 9.6897e-001 | -9.4003e-004 | -3.4450e-002 | -1.0728e-002 |
| 6.4207e-002 | 5.0060e-001 | 1.6929e+000 | -1.0121e-001 | 3.0095e-001 | -2.1174e+000 |
| 3.7982e-003 | 9.2519e-001 | 1.0293e+000 | 7.6071e-002 | 2.0117e+000 | -1.7418e-001 |
| 9.7205e+001 | 9.2562e-001 | 1.8762e+000 | 7.3757e-002 | -9.4087e-001 | -1.3810e-001 |
| 2.2068e+004 | -3.9713e-001 | 2.3896e+000 | 1.4097e+000 | 7.0905e-001 | -1.9551e+000 |
| 3.1085e-007 | 9.8343e-001 | 4.1482e-001 | 1.6573e-002 | -6.6827e-001 | -2.5022e-001 |
| 1.4766e-002 | 9.0051e-001 | 1.0658e+000 | 1.0100e-001 | -2.0162e-002 | -1.3001e-001 |
| 4.0631e-009 | 9.9509e-001 | 4.8766e-002 | 3.5900e-002 | -1.9898e-002 | -6.2413e-002 |
| 8.3960e-003 | 9.8666e-001 | 9.9712e-001 | 3.5235e-002 | -3.9457e-002 | -5.1500e-002 |
| 3.1689e-009 | 9.9379e-001 | 4.2149e-002 | 1.6210e-002 | 4.6866e-002 | -5.1118e-002 |
| 4.2868e-003 | 9.8810e-001 | 9.6754e-001 | 2.9527e-003 | -8.0790e-003 | -2.2867e-002 |
| 3.8141e-001 | 4.7406e-001 | 1.4914e+000 | 5.2814e-001 | -9.4992e-001 | -1.7485e+000 |
| 5.9277e-001 | 5.8033e-001 | 1.3901e+000 | 4.2103e-001 | 8.0991e-001 | -2.7238e+000 |
| 1.1531e-002 | 8.3434e-001 | 1.0748e+000 | 1.9656e-001 | 4.6514e-001 | -2.5085e+000 |
| 1.6632e-003 | 9.9726e-001 | 9.6136e-001 | 2.3195e-003 | 2.0193e+000 | -2.5948e-002 |
| 3.2973e-009 | 9.9635e-001 | 4.1836e-002 | 1.8431e-002 | 3.7158e-002 | -5.5570e-002 |
| 2.2066e-003 | 9.9536e-001 | 9.7556e-001 | 1.4560e-002 | 2.0259e+000 | -4.6570e-002 |
| 8.0916e-009 | 1.0624e+000 | 5.8775e-002 | 1.6405e-002 | -2.0712e-002 | -8.3541e-002 |
| 1.7584e-002 | 1.0469e+000 | 1.0048e+000 | 4.0652e-002 | -3.8584e-002 | -4.4361e-002 |
| 8.3535e-004 | 1.8790e+000 | 2.8648e-001 | 8.9045e-003 | 9.6220e-002 | -8.2085e-002 |
| 1.8911e+007 | 3.2333e+000 | 9.9994e-001 | -3.5449e-002 | -6.6634e-002 | -3.7034e-002 |
| 1.8340e+009 | 2.3739e+000 | 1.9886e+000 | 2.4761e-002 | -4.8750e-002 | -1.4180e-002 |
| 1.0161e+009 | 2.3709e+000 | 1.9616e+000 | 1.2616e-003 | -4.8879e-002 | -4.2402e-010 |
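Given a coefficient and exponent table like Table 4.7, evaluating the CPWL model is a pointwise maximum of monomials, i.e., a maximum of affine functions in log coordinates. A minimal sketch, assuming all five model variables are supplied as positive quantities (for a negative $V_{bs}$, the fitted magnitude would be used):

```python
import numpy as np

def cgs_cpwl(x, C, A):
    """Evaluate Cgs_CPWL = max_i C_i * prod_k x_k^{A[i, k]}.
    x: length-5 vector (W, L, Ids, Vds, |Vbs|), all positive;
    C: (m,) coefficients; A: (m, 5) exponent rows, as in Table 4.7."""
    return np.max(C * np.exp(A @ np.log(x)))
```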

Table 4.8 illustrates the mean PWL fitting error of all 11 design parameters involved in the same op-amp design. Compared with monomial fitting, convex PWL fitting dramatically improves the modeling accuracy; for example, the mean fitting error of the design parameter $C_{gs}$ is dramatically reduced from 39.462% to 2.553%. Yet some errors are still too large, such as the 18.61% error in the parameter $C_{gs}$ for NMOS without $V_{bs}$ and the 10.08% error in the parameter $g_{ds}$ for PMOS with $V_{bs}$. The major reason is that convex PWL fitting is perfect only when fitting a convex data set, whereas most physical and electrical behaviors of today's transistors cannot be represented directly by a convex function. These large fitting errors thus result in accumulated performance discrepancies in small-signal characteristics such as low frequency gain, power dissipation, occupied area, and so on, as we will show later in the initial design of Table 5.1. Clearly, then, methods need to be developed to further improve prediction accuracy for wide transistors with many design parameter data fitted and with deep submicron technology.


Table 4.8: Mean modeling error of design parameters in TSMC 0.18 µm technology (variables: W, L, I_DS, V_DS, V_BS; errors given as monomial/CPWL, %)

| Design Parameter | NMOS w/o V_BS | PMOS w/o V_BS | PMOS with V_BS |
| V_DSAT | 4.348/2.639 | 3.035/2.222 | 4.557/2.726 |
| V_GS | 8.806/3.138 | 5.437/1.458 | 10.00/2.769 |
| 1/V_GS | 15.22/4.415 | 20.03/4.265 | 28.83/8.939 |
| V_OV | 8.770/3.829 | 4.626/2.094 | 4.526/2.386 |
| 1/V_OV | 57.02/4.992 | 55.72/1.718 | 57.69/3.198 |
| g_m | 37.38/8.652 | 47.42/4.484 | 50.30/5.238 |
| 1/g_m | 22.24/2.368 | 4.578/2.409 | 4.658/2.925 |
| g_ds | 244.7/5.710 | 162.9/4.093 | 163.1/10.08 |
| C_GS | 60.73/18.61 | 39.21/0.425 | 39.46/2.553 |
| C_GD | 337.9/0.896 | 255.9/0.236 | 286.6/0.850 |
| C_BD | 337.2/2.025 | 262.4/0.381 | 293.2/1.683 |

In the next chapter, a surrogate modeling and optimization algorithm is proposed, incorporating both a geometric program and a convex piecewise-linear fitting technique, to model and optimize the electronic circuit system and further improve prediction accuracy, especially for wide transistors with large numbers of fitted design parameter data.


Chapter 5

GP-Based Surrogate Modeling and Optimization

Algorithm

5.1 Surrogate Theory

This optimization problem can be stated as follows. There are two models, a fine model and a coarse model, whose behaviors can be described by two functions, $F_{fine}: x \to R^m$ and $F_{surrogate}: x \to R^m$, respectively, where $x \in R^n$ is the optimal design variable vector. We aim to find an optimal value of the design variables, $x^*$, that minimizes (or, in some cases, maximizes) the objective or cost function $J$ while meeting the specification constraint $S_{spec}$. Mathematically, that is to find

$$x^* = \arg\min_{x \in X} \{ J(F_{fine}(x, C_{coef})) \} \tag{5.1}$$

subject to the specification constraint $S_{spec}$, such that

$$|S_{fine} - S_{spec}| \le \varepsilon, \quad |S_{surrogate} - S_{spec}| \le \varepsilon, \quad |S_{surrogate} - S_{fine}| \le \varepsilon \tag{5.2}$$

where the number $\varepsilon$ is a given small positive constant defining a trust region, and the specification values are $S_{fine} = F_{fine}(x^*)$ and $S_{surrogate} = F_{surrogate}(x^*)$.

Usually, the fine model is computationally expensive; thus, it is impractical to solve this design by direct optimization. Instead, we exploit inexpensive surrogate models, which are computationally cheaper and much faster to evaluate, though not as accurate as fine models [135]. In most cases, this optimization design can be solved by a surrogate modeling and optimization algorithm working iteratively and interactively between the fine and the coarse (i.e., surrogate) model. This method combines the widely acclaimed computational efficiency of an optimizer engine, such as the GP optimizer, with the high fidelity of circuit-based CAD models, such as electromagnetic (EM) simulators [136]. This facilitates a highly efficient approach to achieving system performance improvement, cost reduction, and design time reduction.

First, set the initial optimal value of the design variables, $x^*(0)$, and the initial value $S_{fine}(0)$ of the required specification, which is the predefined specification value for the optimization system being designed. The resultant value of the fine model specification index at the $n$th iteration, $S_{fine}(n)$, can then be obtained by evaluating the fine model function (in this work, SPICE is used), and the values of the involved design parameters can be extracted from the fine model. This process can be expressed with the following recursive formula:

$$[S_{fine}(n),\, P_{dp}(n)] = F_{fine}(x^*(n-1)) \tag{5.3}$$

Next, the surrogate model coefficients $C_{coef}(n)$ are evaluated by some ideal fitting technique, represented by the function $F_{fitting}$:

$$C_{coef}(n) = F_{fitting}(P_{dp}(n)) \tag{5.4}$$

The surrogate model is then reconstructed with the updated model coefficients $C_{coef}(n)$:

$$F_{surrogate}(n) = F_{surrogate}(x, C_{coef}(n)) \tag{5.5}$$


Now the chosen optimization algorithm in the surrogate (i.e., coarse) model generates the optimal value of the design variables at the $n$th iteration:

$$x^*(n) = \arg\min_{x \in X} \{ J(F_{surrogate}(n)) \}, \quad \text{subject to the specification constraint } S_{spec} \tag{5.6}$$

Then the specification index values are calculated with the updated surrogate model function:

$$S_{surrogate}(n) = F_{surrogate}(x^*(n)) \tag{5.7}$$

The optimal values of the design variables are now updated for use in the next iteration:

$$x^* = x^*(n), \quad x(n+1) = x^* \tag{5.8}$$

However, to ensure the convergence of the two models and to increase the convergence speed, some matching constraints and Jacobian conditions, denoted by a reflection function $\Phi$, can be imposed between the optimal design values of both models, such that (5.8) is replaced by (5.9):

$$x^* = x^*(n), \quad x(n+1) = \Phi(x^*) \tag{5.9}$$

Later, in the two design case studies, (5.8) is used directly for the sake of simplicity. The above procedures are repeated until the following matching (termination) conditions are satisfied:

$$|S_{fine}(n) - S_{spec}| \le \varepsilon, \quad |S_{surrogate}(n) - S_{spec}| \le \varepsilon, \quad |S_{surrogate}(n) - S_{fine}(n)| \le \varepsilon$$
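The recursion (5.3)-(5.8) and the termination test can be summarized in the following skeleton; the four callables are hypothetical placeholders for the SPICE run, the CPWL fit, the coarse model rebuild, and the GP solve.

```python
def surrogate_optimize(x0, S_spec, eps, run_fine_model, fit_parameters,
                       rebuild_surrogate, solve_gp, max_iter=10):
    """Skeleton of the surrogate loop (5.3)-(5.8); all model hooks are
    placeholder callables supplied by the user."""
    x = x0
    for n in range(1, max_iter + 1):
        S_fine, P_dp = run_fine_model(x)        # (5.3): fine-model (SPICE) run
        C_coef = fit_parameters(P_dp)           # (5.4): convex PWL fitting
        surrogate = rebuild_surrogate(C_coef)   # (5.5): updated coarse model
        x = solve_gp(surrogate, S_spec)         # (5.6): GP optimization
        S_surr = surrogate(x)                   # (5.7): coarse-model prediction
        if (abs(S_fine - S_spec) <= eps and
                abs(S_surr - S_spec) <= eps and
                abs(S_surr - S_fine) <= eps):   # matching/termination test
            break
    return x
```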

Figure 5.1 illustrates the convergence procedures of both the fine and the surrogate models.


Figure 5.1: Convergence of the fine model $f(P_n(X))$ and the surrogate model $\hat{f}(P_n(X))$ performance in the surrogate strategy (model performance vs. design variable vector $X$): (a) initial design; (b) second design; (c) $n$th design

In both cases, after parameter extraction, the new surrogate model replaces the fine model and becomes the function to be optimized. We can simulate the fine model to verify its response; if the fine model design specification is not satisfied, the step can be repeated [137]. Usually, the initial surrogate and fine models deviate greatly from each other. At each subsequent iteration, by replacing the pre-assigned parameters $p_{n-1}(x)$ with $p_n(x)$, which are extracted from the fine model at each iteration step, the new surrogate model gradually aligns with the fine model.

5.2 GP-based Surrogate Strategy

For improved performance and reduced design time, the surrogate-based approach appears in [136] for the optimal design of a liquid rocket injector, and in [112] for wing and flapping-flight design. To further improve the prediction accuracy for wide transistors with large numbers of fitted parameter data and with deep submicron technology, which even a convex PWL-based GP cannot handle well, this work proposes a novel alternative: a surrogate modeling and optimization algorithm based on a GP optimizer and convex PWL fitting. Figure 5.2 shows the design flow of this modeling and optimization strategy.


Figure 5.2: Surrogate modeling and optimization design flow. The fine model (SPICE) and the coarse model (GP optimizer) exchange the design variables $[W_1, \ldots, W_8,\; L_1, \ldots, L_8,\; I_{ds1}, \ldots, I_{ds8},\; V_{ds1}, \ldots, V_{ds8}]$; extracted design parameters such as $g_m$, $g_{ds}$, $V_{GS}$, and $V_{DSAT}$ are refitted by convex PWL fitting to update the coarse model coefficients $C_{coef}(n) = F_{fit}(P_{dp}(n))$ until the specifications are met.

In this advocated GP-based surrogate strategy, two types of models are involved: one is a fine model (generally, this could be implemented by any circuit simulator; SPICE is used here), and the other is a coarse, or surrogate, model (here, a GP optimizer). The design procedures in the design flow of the proposed GP-based surrogate strategy can be described as follows.

• First, input initial values of the design variables into the fine model and run the simulation.

• After the simulation, if all specifications are met, exit and return the optimal solution.

• Otherwise, extract from the fine model the values of all 11 design parameters involved in the coarse model (including $V_{DSAT}$, $V_{GS}$, $V_{GS}^{-1}$, $V_{OV}$, $V_{OV}^{-1}$, $g_m$, $g_m^{-1}$, $g_{ds}$, $C_{DS}$, $C_{GS}$, and $C_{BD}$).

• Then use a fitting technique (in our case, convex PWL fitting) to fit the extracted design parameter data and calculate new model coefficients for the coarse model (see the coefficients and exponents in Table 4.7).

• Now update the old coarse model coefficients to create a new surrogate coarse model, and run the coarse model optimizer (in our case, a GP optimizer) to recalculate new optimal values of the design variables.

• Finally, input those new optimal values of the design variables back into the fine model for use in the next iteration.

These procedures are repeated until all the required specifications are met or the time budget runs out.

Obviously, the key components are the coarse model GP optimizer and the convex PWL fitting technique, which offer some unmatched advantages.

5.3 Design Case Study

Two design cases are conducted here to verify the effectiveness and viability of the proposed surrogate strategy. One is the same widely used two-stage op-amp shown in Figure 4.2; the other is an LC-tuned oscillator, which is often used in applications such as frequency synthesizers and hands-free Bluetooth devices.

5.3.1 Two-Stage Op-amp Design

In this section, the same two-stage op-amp circuit [11, 12] of Figure 4.2 is first reused to verify the effectiveness and viability of the proposed modeling and optimization algorithm. A total of three iterations are conducted for this amplifier design problem. The required specifications and the corresponding results of both the fine and the coarse model at each iteration step are summarized in Table 5.1. The optimal low frequency gain, unit-gain frequency, and corresponding phase margin are also depicted with SPICE in Figures 5.3-5.5.


Table 5.1: Specifications and optimization performance for the two-stage op-amp design

| Constraints | Spec. | 1st iter. CPWL-GP | 1st iter. SPICE | 2nd iter. CPWL-GP | 2nd iter. SPICE | 3rd iter. CPWL-GP | 3rd iter. SPICE |
| DC Gain (dB) | ≥ 80 | 80 | 72.64 | 80 | 78.11 | 80 | 81.9 |
| Phase Margin (deg) | ≥ 60 | 60.003 | 55.3 | 60.008 | 57.5 | 60.007 | 63.5 |
| UG Frequency (MHz) | ≥ 170, Max. | 258.20 | 210.6 | 231.28 | 197.1 | 201.35 | 185.4 |
| Power Dissipation (mW) | ≤ 0.5 | 0.499 | 0.498 | 0.439 | 0.438 | 0.407 | 0.406 |
| Area (µm²) | ≤ 37 | 35.6 | 35.5 | 34.8 | 34.7 | 30.9 | 30.8 |
| Output Swing (V) | ±0.7 | ±0.7 | ±0.65 | ±0.7 | ±0.69 | ±0.7 | ±0.72 |

Supply voltage: 1.8 V; load: 4 pF capacitor; benchmark technology: TSMC 0.18 µm CMOS.

The initial optimization design is implemented with the same CPWL-based GP optimizer used previously in [12] and reviewed in Section 3.6.6. In other words, the work in [12] is exactly what we do at the first iteration of our proposed surrogate strategy, except that we have the more accurate device model created in Section 4.4, because we use SPICE to generate much more design parameter data (about half a million points) for a single CMOS transistor device. Specifically, the set of GP-compatible equalities and inequalities shown in Appendix C serves as the coarse model in our proposed surrogate modeling and optimization algorithm. First, we used the GP optimizer to solve that coarse model (i.e., those GP-compatible equalities and inequalities in the Appendix C table) and listed the obtained optimization results in the 3rd column of Table 5.1. The obtained optimal values of the 18 design variables were input to the SPICE simulator (here, the fine model) as the initial transistor sizes and component values (i.e., as the initial values of the design variables in Figure 5.2); we then ran the SPICE simulation and listed the corresponding simulation results in the 4th column of Table 5.1.

Comparing the two columns demonstrates that at the initial iteration the performance measures in unit gain frequency, power dissipation, and occupied area satisfy the predefined specifications, but there are serious shortfalls in low frequency gain (10.13%) and phase margin (8.50%). There is also a 22.60% discrepancy between the unit gain frequencies, due to the large convex PWL fitting errors for wide transistors with a large number of fitted design parameter data, as shown in Table 4.8. So far, at this first iteration, the obtained results are also what can be achieved by the single-step GP-based optimization method used previously in [12]; but the performance achieved at this iteration is definitely not good enough, which means, again, that the single CPWL-based GP optimizer (i.e., the method used in [12]) is not accurate enough for an electronic system design involving wide devices with short channels when fitting a large number of design parameter data. Figure 5.3 shows that at this first iteration the achieved optimal low frequency gain (i.e., DC gain) is 72.64 dB, the unit-gain frequency is 210.6 MHz, and the corresponding phase margin is 55.3°.

Figure 5.3: Optimal low frequency gain and phase margin at 1st iteration


To alleviate the accuracy limit issue and narrow the gap between the CPWL-based GP optimal result and the actual SPICE circuit simulation result, at the 2nd iteration we begin to execute the proposed surrogate strategy. We sweep one crucial design variable, $I_{bias}$ (other design variables or parameters, such as the transistor widths $W_1, \ldots, W_8$ and the compensation capacitance $C_C$, could also be chosen for the sweep), when running the GP coarse model optimizer, and obtain a sequence of corresponding optimal design variable values; these are then input to the fine model as transistor sizes and component values to run a series of SPICE simulations. We next extract from the SPICE simulations a sequence of values of all 11 design parameters involved in the coarse model ($V_{DSAT}$, $V_{GS}$, $V_{GS}^{-1}$, $V_{OV}$, $V_{OV}^{-1}$, $g_m$, $g_m^{-1}$, $g_{ds}$, $C_{DS}$, $C_{GS}$, and $C_{BD}$) and then fit them with convex PWL fitting to calculate the related coarse model coefficients for rebuilding the surrogate GP coarse model. (That is, we express those 11 design parameters with GP-compatible functions and then substitute them into all the involved performance measures, such as the low frequency gain, unit-gain bandwidth, and phase margin of the op-amp, which are composed of these design parameters.) After updating the coarse model coefficients, we run the new surrogate GP coarse model, this time without sweeping any design variable; the optimization results are listed in the 5th column of Table 5.1. Next, the resultant optimal values of the design variables are input into the SPICE fine model as transistor sizes and component values; the corresponding simulation results are shown in the 6th column of Table 5.1.

From the previous description of the 2nd iteration, it may be noticed that the GP coarse model optimizer and the SPICE fine model simulation are both run twice at this iteration: the first run is for rebuilding the surrogate coarse model, and the second is for verification and comparison of the two models' design results. The SPICE simulation performance is now notably improved; for example, there is only a 2.42 dB shortfall in low frequency gain, and the discrepancy in unit gain frequency is reduced from 22.60% to 17.34%.

Figure 5.4: Optimal low frequency gain and phase margin at 2nd iteration

At the 3rd iteration, we apply a surrogate strategy similar to that of the 2nd iteration. The deviation in unit gain frequency is now further reduced from 17.34% to 8.60%, and this time all the specifications are met, so we stop here. Figure 5.5 shows that at this third iteration the achieved optimal low frequency gain (i.e., DC gain) is 81.9 dB, the unit-gain frequency is 185.4 MHz, and the corresponding phase margin is 63.5°.


Figure 5.5: Optimal low frequency gain and phase margin at 3rd iteration

From the initial design to the 3rd iteration, the performance discrepancy in unit gain frequency is reduced from 22.60% to 8.60%; a noticeable 14% improvement is thus achieved, which greatly exceeds the preset objective for this work. As expected, through the proposed GP-based surrogate strategy, the model results from the GP optimization and the actual SPICE circuit simulation gradually come to agree with each other at each iteration step, until all required specifications are met. From a macro point of view, the reason can be understood this way: at each iteration step, the coarse model is responsible for providing the device sizes to the fine model, which, in turn, generates the involved design parameters that are later extracted for rebuilding the coarse model. The two models thus affect each other, mimic each other, converge toward each other, and finally match each other. Consequently, the discrepancy between the two models' performance decreases iteratively.

To get some design insight into how the performance characteristics interact with one another, the optimal trade-off curves of bandwidth versus core area of the two-stage op-amp at the three iterations are depicted in Figures 5.6-5.8, separately. At the first iteration, the maximum discrepancy between the results from the GP-based optimization and the actual SPICE circuit simulation is 25.07%, and the mean deviation between the two model results is 21.77%; at the second iteration, the maximum discrepancy is reduced to 16.34%, and the mean deviation to 12.06%; at the third iteration, the maximum discrepancy is further reduced to 9.02%, and the mean deviation to 7.29%.


Figure 5.6: Tradeoff curve of bandwidth vs. core area at 1st iteration


Figure 5.7: Tradeoff curve of bandwidth vs. core area at 2nd iteration



Figure 5.8: Tradeoff curve of bandwidth vs. core area at 3rd iteration

From the 1st design to the 3rd iteration, the maximum prediction discrepancy is reduced from 25.07% to 9.02%, a noticeable 16.05% improvement; the mean prediction discrepancy is likewise reduced from 21.77% to 7.29%, making it 14.48% more accurate. The iteratively decreasing prediction discrepancy shows again that the proposed GP-based surrogate modeling and optimization algorithm forces the model results from the GP optimization and the circuit simulation to agree with each other iteration by iteration, until all predefined specifications are satisfied. Another important observation is that all three iterations indicate an increasing trend of bandwidth with increasing core area of the two-stage op-amp when the area constraint is tight; however, after a certain point, increasing the available area does not significantly improve the maximum unit-gain bandwidth, because other constraints start to play a larger role beyond that breakpoint.

However, this proposed surrogate approach may take longer than the single-iteration convex PWL-based GP (the optimization method used in [12]), due to its iterative process. In this case, at the first iteration, it takes the SPICE simulator about 14 hours of CPU time to generate about half a million data points for all the involved eight design parameters of each type of single CMOS transistor device (including NMOS without bulk-to-source voltage $V_{BS}$, PMOS with $V_{BS}$, and PMOS without $V_{BS}$). It then takes the CPWL fitting technique about 10 hours of CPU time to fit those design parameter data extracted from the SPICE simulator and derive CPWL expressions for each design parameter, while the CPWL-based GP optimizer needs only about 1.5 seconds to obtain the GP optimization results. In total, the 1st iteration takes about 24 hours of CPU time. At the 2nd iteration, we sweep the design variable $I_{bias}$ 10,000 times (i.e., run the GP optimizer 10,000 times) to get 10,000 sets of design variable values (about 4 hours); then we input those design variable values into the SPICE simulator to run the simulation 10,000 times (about 68 hours); then we fit the extracted design parameter data with the CPWL fitting technique (about 8 hours); then we run the updated GP optimizer (about 1.5 seconds); finally, we input the obtained design variable values into the SPICE simulator once more for verification (about 24.95 seconds). In total, the 2nd iteration takes about 80 hours. At the 3rd iteration, we perform similar actions but sweep the design variable only 2,000 times, for a total of about 19.4 hours. Altogether, the process takes about 124.4 hours.

Although our proposed surrogate modeling and optimization algorithm takes much longer than the single-step CPWL-based GP optimization method, it is still worth the time. First, our design case already shows that this CPWL-based surrogate strategy can achieve much higher performance accuracy than the single-step CPWL-based GP optimization method used previously in [12]. Second, once a high-accuracy surrogate model for an electronic system with a given topology has been created (such as that obtained at the 3rd iteration), it can be reused, with extreme efficiency and much higher performance accuracy, to solve future electronic system design problems with the same topology but different specifications. For example, we use the surrogate model obtained at the 3rd iteration to design the same two-stage op-amp with different specifications, shown in the 2nd column of Table 5.2; it takes only 26.45 seconds of CPU time to finish this design. The two model results from the CPWL-based GP optimizer and the corresponding SPICE simulator are listed in the 3rd and 4th columns of Table 5.2, separately. Comparing them also demonstrates that all the specifications are satisfied and that the deviation between the two model results is acceptable; for example, the discrepancy in unit gain frequency is about 8.91%.

Table 5.2: Optimal performance for the two-stage op-amp design with different specifications

| Constraints | Specifications | CPWL-based GP | SPICE |
| Low Frequency Gain (dB) | ≥ 75 | 75.04 | 76.41 |
| Phase Margin (deg) | ≥ 62 | 62.0 | 62.9 |
| Unit Gain Bandwidth (MHz) | ≥ 120, Max. | 134.89 | 123.86 |
| Power dissipation (mW) | ≤ 0.3 | 0.300 | 0.298 |
| Area (µm²) | ≤ 30 | 29.309 | 29.28 |
| Output Swing (V) | ±0.72 | ±0.72 | ±0.73 |

Supply voltage: 1.8 V; load: 4 pF capacitor; benchmark technology: TSMC 0.18 µm CMOS.

5.3.2 LC-Tuned Oscillator Design

In the past twenty years, much has been learned about radio frequency (RF) oscillators [138, 139, 140] because of their various uses: for example, as the voltage-controlled oscillator in frequency synthesizers and Bluetooth transceivers. In this section, an LC-tuned oscillator (Figure 5.9) is designed with the proposed surrogate modeling and optimization strategy as a second case study to verify its effectiveness and viability.


Figure 5.9: LC-tuned oscillator optimization design

There are two inductors $L$ and four NMOS transistors $M_1$, $M_2$, $M_3$, and $M_4$. Transistors $M_3$ and $M_4$, with their drain and source terminals connected together, act as voltage-controlled varactors and create an LC tank with the inductors $L$. The oscillation frequency is set by the resonant frequency of this LC tank; specifically, the frequency is controlled by varying the capacitance of the LC tank. Transistors $M_1$ and $M_2$ form a cross-coupled pair, which acts as a negative resistance that sustains the oscillation by compensating the loss in the LC tank [141].

There are 12 optimal design variables for the LC-tuned oscillator: the inductor diameter $D_L$; the inductor width $W_L$; the turn spacing $S$ and number of turns $n$; the transistor width $W_n$ and length $L_n$; the maximum and minimum varactor capacitances $C_{Var,max}$ and $C_{Var,min}$; the bias current $I_{bias}$; the tank voltage swing $V_{swing}$; the resonant frequency $\omega_{res}$; and the maximum resonant frequency $\omega_{res,max}$.

All expressions for the optimization objective and the specifications involved in the LC-tuned oscillator design are derived and tabulated in Table 5.3; more details can be found in Appendix D. The equations in this table show that all the performance measures of this LC-tuned oscillator are monomial or posynomial functions of the optimal design variables, which means that this oscillator system can be formulated as a geometric program. The LC-tuned oscillator optimization design can then be effectively solved by existing mature GP solvers.

Table 5.3: Model expressions of the optimization objective and specifications in the LC-tuned oscillator

| Constraints | Eq. | Expression | Type |
| Power dissipation | D.29 | $P = V_{dd} I_B$ | MF |
| Tank switching voltage amplitude | D.30 | $V_{swing} \le \min(R_T I_B,\, V_{dd})$ | PF |
| | D.32 | $V_{bias} + V_{swing} \le V_{dd}$ | PF |
| | D.33 | $V_{swing}\, G_{Tank} / I_B \le 1$ | PF |
| Phase noise @ offset frequency $\Delta f$ from operating frequency $f_C$ (Minimize) | D.34 | $L(\Delta f) = \dfrac{\Gamma_{rms}^2}{2\, q_{max}^2} \cdot \dfrac{\overline{i_n^2}/\Delta f}{\Delta f^2}$ | PF |
| Resonant frequency | D.40 | $\omega_{res}^2 L_T C_T = 1$ | MF |
| Tuning range | D.42 | $L_T C_{T,min} \le 1/\omega_{max}^2$ | PF |
| | D.45 | $(k^2 - 1)\, C_{T,min}/C_{Var,max} + C_{Var,min}/C_{Var,max} \le 1$ | PF |
| Bias condition | D.50 | $V_{bias} + V_{gs} + V_{swing} \le V_{dd}$ | MF |
| Loop gain | D.48 | $G_T / g_{m,n} \le 1/g$ | PF |
| Core chip area | D.51 | $A = D_L^2 + 2 W_n L_n$ | PF |
| Transistor width | D.52 | $W_{n,min} \le W_n \le W_{n,max}$ | MF |
| Transistor length | D.52 | $L_{n,min} \le L_n \le L_{n,max}$ | MF |
| Inductor width | D.52 | $W_{L,min} \le W_L \le W_{L,max}$ | MF |
| Inductor diameter | D.52 | $D_{L,min} \le D_L \le D_{L,max}$ | MF |


In what follows, the LC-tuned oscillator of Figure 5.9 is used to verify the effectiveness and viability of the proposed surrogate strategy. The most emphasized specification parameter for an oscillator is phase noise, given that CMOS oscillators exhibit substantially higher flicker noise than their bipolar counterparts [142]. Thus, it is natural to focus on how the phase noise of an LC-tuned oscillator interacts with other characteristics such as power dissipation, center operating frequency, offset frequency from the carrier, and tuning range.

Three iterations are conducted here. The specifications and the corresponding results at each iteration step are summarized in Table 5.4. The SPICE simulation results for phase noise versus the offset frequency from the center frequency at the three iterations are depicted in Figures 5.10, 5.12, and 5.14; for comparison, the CPWL-based optimal results for phase noise versus the offset frequency from the center operating frequency at the three iterations are depicted with MATLAB in Figures 5.11, 5.13, and 5.15.

Table 5.4: Specifications and optimization performance for the 2.4 GHz LC-tuned oscillator design

| Design Performance Index | Spec. | 1st iter. CPWL-GP | 1st iter. SPICE | 2nd iter. CPWL-GP | 2nd iter. SPICE | 3rd iter. CPWL-GP | 3rd iter. SPICE |
| Power Consumption (mW) | ≤ 4.5 | 1.6675 | 1.6662 | 3.0353 | 3.0344 | 4.2470 | 4.2464 |
| Tuning Range | ≥ ±15% | ±15% | ±15% | ±15% | ±15% | ±15% | ±15% |
| Core Device Area (mm²) | ≤ 0.4×0.4 | 0.1584 | 0.1578 | 0.1589 | 0.1585 | 0.1598 | 0.1596 |
| Tank Switching Voltage (V) | ≥ 1.44 | 1.54 | 1.36 | 1.54 | 1.47 | 1.54 | 1.52 |
| Phase Noise @ fC + 3 MHz (dBc/Hz) | Minimize (≤ −126) | −126.49 | −107.64 (17.52%) | −131.72 | −119.26 (10.45%) | −134.65 | −127.90 (5.28%) |

Power supply voltage: 1.8 V; load: 250 fF capacitor; benchmark technology: TSMC 0.18 µm CMOS.

The initial optimization design is implemented with the same CPWL-based GP optimizer reviewed in Section 3.6.6 and is verified thereafter by SPICE simulation. The obtained GP optimization results and the corresponding simulation results are listed in the 3rd and 4th columns of Table 5.4. The comparison demonstrates that the performance measures in power dissipation, tuning range, and core device area satisfy the predefined specifications, but there are shortfalls in tank switching voltage (5.6%) and phase noise (14.57%), as well as a serious 17.52% discrepancy in phase noise between the two models, because of some large convex PWL fitting errors for wide transistors with a large number of fitted design parameter data, as shown in Table 4.8.

Figure 5.10: CPWL optimal phase noise of LC-tuned oscillator at 1st iteration


Figure 5.11: Circuit-simulation-based phase noise of LC-tuned oscillator at 1st iteration


To alleviate the accuracy limit issue and narrow the gap between the CPWL-based GP optimal result and the actual SPICE circuit simulation result, at the 2nd iteration we begin to execute the proposed surrogate strategy. We again sweep one crucial design variable ($I_{bias}$) when running the GP coarse model optimizer and obtain the corresponding optimal design variable values, which are then input to the fine model as transistor sizes and component values to run the SPICE simulations. We then extract the value sets of the involved design parameters (such as $g_m$ and $g_{ds}$) from the SPICE simulations and fit them with CPWL fitting to calculate the related coarse model coefficients (i.e., we calculate the convex expressions for those design parameters and then use them in all the involved performance characteristics of the electronic circuit system being designed) for rebuilding the GP coarse model. After updating the coarse model coefficients, we run the new GP coarse model optimizer with the same design objective and constraints, but without sweeping any design variable; the optimization results are listed in the 5th column of Table 5.4. Next, we input the resultant optimal values of the design variables into the SPICE fine model; the corresponding simulation results are in the 6th column of Table 5.4.

Figure 5.12: CPWL optimal phase noise of LC-tuned oscillator at 2nd iteration



Figure 5.13: Circuit-simulation-based phase noise of LC-tuned oscillator at 2nd iteration

From our previous description of the 2nd iteration, one may notice that the GP coarse model optimizer and the SPICE fine model simulator both run twice at this iteration: the first time for rebuilding the surrogate model, and the second for verification and comparison of the two models' design results. The SPICE simulation performance is now notably improved; for example, there is only a 6.74 dBc shortfall in phase noise, the discrepancy in phase noise is reduced from 17.52% to 10.45%, and all other performance measures meet the required specifications.

Figure 5.14: CPWL optimal phase noise of LC-tuned oscillator at 3rd iteration



Figure 5.15: Circuit-simulation-based phase noise of LC-tuned oscillator at 3rd iteration

Likewise, at the 3rd iteration the deviation in phase noise is reduced to 5.28%; this time all the specifications are met, and we stop here.

From the initial design to the 3rd iteration, the prediction performance discrepancy in phase noise is reduced to 5.28% from 17.52%; thus, a noticeable 12.24% improvement is achieved.

As expected, through the proposed GP-based surrogate modeling and optimization algorithm, the model results from the GP optimization and the actual SPICE circuit simulation gradually converge toward each other at each iteration step, until all required specifications are met.

However, this approach may still take longer than the single-iteration convex PWL-based GP, due to its iterative process. In this design case, the GP optimizer takes about 13 seconds, but design parameter extraction and model coefficient calculation may take up to 40 hours.


Chapter 6

Conclusions and Future Work

6.1 Conclusions

For an electronic system with a given circuit topology, and especially for very large-scale integrated circuits that may contain hundreds of millions of devices and an equivalent number of interconnect wires, it is really difficult, if not impossible, to finish the design manually. Thus, the ultimate goal is usually to automatically size the devices and components in order to achieve optimal performance while at the same time satisfying the predefined specifications. Today, rapidly increasing system complexity and stiff demands for time reduction and cost effectiveness in a highly competitive market are driving new design automation and CAD technologies for such complex electronic system optimization designs.

The critical component in a CAD tool is the optimizer engine, which has a huge impact on the required design cycle time and performance precision [4].

In this work, we thus concentrate on proposing novel and efficient automatic optimization techniques. So far, there have been many free and commercial optimizers for IC design. Of them all, the geometric program has gained popularity due to its high efficiency and its ability to always find the globally optimal solution.

By reviewing two previous works, we found that a monomial-based geometric program is suitable only for long-channel transistors, as it is not accurate enough for short-channel devices, especially in deep submicron technology. For example, there is a significant 14.44 dB discrepancy between the low-frequency gains achieved by a monomial-based GP optimizer and the actual SPICE circuit simulator in 0.18 μm technology, as shown in Table 4.3. As for the CPWL-based geometric program, it works well for short-channel, narrow transistors with a small number of design parameter data fitted; however, our work shows that the performance accuracy degrades seriously for wide transistors when fitting a large number of parameter data. As an example, Table 4.8 shows that the mean convex PWL fitting error of the gate-source capacitance parameter $C_{gs}$ for NMOS without $V_{bs}$ is as large as 18.61%, and 10.08% for the output conductance $g_{ds}$ of PMOS with $V_{bs}$. The major reason is that the convex PWL fit is exact only when fitting a convex data set, whereas most physical and electrical behaviors of today's transistors cannot be represented directly by a convex function. These large fitting errors thus accumulate into performance discrepancies in the small-signal characteristics.

To reduce design time and further improve the prediction accuracy for wide transistors with a large number of parameter data fitted and with deep submicron technology, which even a CPWL-based GP cannot handle well, we propose in this work an innovative alternative: a surrogate modeling and optimization algorithm (SMOA) based on a GP optimizer and CPWL fitting. The basic framework and principles of our modeling and optimization strategy are stated in Section 5.1 and shown in Figures 5.2 – 5.12. As far as we know, this surrogate strategy is the first surrogate framework with the following original features:

• For the first time, a family of surrogate models is created with a CPWL-based GP. This feature allows the proposed modeling and optimization tool to inherit the GP's high efficiency and ability to find the global optimum, and to use the large number of existing mature solvers developed for GP. It also benefits from the CPWL fitting technique's ability to achieve a high fitting precision that, in many cases, other fitting technologies cannot match. A better initial surrogate model established with such a high-fidelity fitting technique can be very helpful in reducing the total number of iterations.

• In many other modeling and optimization methods, the system models are created from the parameter information of a single device model (as in [11, 12]). Such information is isolated from the actual electronic system being designed. In contrast, except for the first iteration, our surrogate modeling and optimization algorithm extracts the needed parameters directly from the fine model of the electronic system being designed (here, the SPICE circuit schematic), rather than from a single device. Such extracted parameter data are directly related to the designed circuit system and carry much richer physical and electrical information about it. As a result, we can achieve a more accurate surrogate model by calculating the coarse model coefficients from parameter data extracted directly from the SPICE circuit (which represents the fine model). This reduces the discrepancy between the surrogate model and the fine model; furthermore, the number of iterations is reduced, and the design time is cut accordingly.

• To ensure convergence between the fine and surrogate models and to increase the convergence speed, matching constraints and Jacobian conditions, denoted by a reflection function $\phi$, can be imposed between the optimal design values of the two models. That is, we can obtain the optimal value of the optimization design variables of the fine model from the optimal value of the optimization design variables in the surrogate model through the relationship

$$x^{(n+1)} = \phi(x^{(n)})$$

This is very important for guaranteeing that the optimization design of the electronic system is achieved while meeting all the required specifications.

• To get a sequence of optimization design variable points, we choose to sweep one or more design variables. The number of swept variables is very flexible; for convenience, only one design variable (the bias current) is swept in both design study cases.

Analysis and simulation in both design study cases (the two-stage op-amp and the LC-tuned oscillator optimization designs) demonstrate that the proposed surrogate modeling and optimization algorithm is appropriate for wide transistors with a short channel when fitting a large number of parameter data, where neither a single-iteration monomial-based GP nor a single-iteration convex PWL-based GP works well. For example, in the two-stage op-amp optimization design, from the initial design to the 3rd iteration, the performance discrepancy in unit gain frequency is reduced from 22.60% to 8.60%; thus, a noticeable 14% improvement is achieved through only three iterations. In the other design case (the LC-tuned oscillator optimization design), from the initial design to the 3rd iteration, the prediction performance discrepancy in phase noise is reduced from 17.52% to 5.28%; thus, a noticeable 12.24% improvement is likewise obtained through three iterations. Moreover, these improved prediction accuracies greatly exceed the preset research objective, which expected the performance discrepancy between the optimal results and the circuit simulation results to decrease by around 5% relative to the previous PWL-based GP approach. These dramatic accuracy improvements, achieved within only three iterations in both design cases, sufficiently demonstrate that our proposed GP-based surrogate modeling and optimization algorithm can achieve much higher performance than a single-iteration convex PWL-based GP.

From a macro point of view, the reason is intuitive. At each iteration step, the coarse model provides the device sizes to the fine model, which in turn generates the design parameters that are extracted for rebuilding the coarse model. The two models thus affect each other, mimic each other, converge toward each other, and finally match each other, so the discrepancy between their performances reduces iteratively.

The GP-based surrogate strategy inherits from the GP the ability to reach the global optimum highly efficiently. It also inherits the GP's disadvantages of handling only limited types of constraints and of demanding the designer's expertise in deriving a GP-compatible model (a task that, fortunately, can be partially automated by symbolic analyzers and circuit simulators; see [143, 144, 145]). Once the GP formulation is done, particular design instances can be carried out automatically and rapidly, and the formulation is reusable in future designs, just like an IC block library. This saves designers much time that would otherwise be spent tediously and manually tuning devices, and it enables them to focus on more important tasks, such as carefully analyzing the optimal trade-offs between competing objectives.


6.2 Future Work

In this work, the proposed GP-based surrogate strategy works quite well for both case studies: the two-stage op-amp and the LC-tuned oscillator optimization designs. The GP-based and circuit-simulation-based optimal results agree with each other iteratively and eventually converge to the given specifications, using three iterations in total. However, this is not the general case; in some cases, the two models' optimization results may diverge, that is, convergence cannot be guaranteed. The convergence issue, then, should be a matter of utmost concern. At the next stage, some space mapping technologies such as those found in [146, 147] will be investigated, and an attempt will be made to combine them with the proposed GP-based surrogate strategy to improve the likelihood of convergence.

In this work, to increase the prediction accuracy, we fit a large number of parameter data points (almost half a million) when creating the GP-compatible device model. This is very computationally expensive and time consuming: it takes 40-50 hours to fit those parameter data points with a convex PWL fitting technique. In the future, then, some statistical techniques should be developed to reduce the design variable space and decrease the model fitting time without much degradation in prediction accuracy [29]. Also, the proposed approach may still take longer than the single-iteration CPWL-based GP due to its iterative process. Hence, some technologies, such as model order reduction methods [148], will be studied and integrated into our proposed surrogate strategy to cut the total time cost. Space mapping technologies can also be used to increase the convergence speed by minimizing the number of calls to the fine model, thereby saving total design cycle time.

Other fitting technologies and optimization approaches will be studied as well, and the optimization design of ICs with other topologies will be conducted with the proposed GP-based surrogate algorithm. The convex PWL-based GP will still serve as the optimization engine; MATLAB code for the GP-based surrogate strategy will be written, and SPICE simulation will be carried out to verify the improved performance.


REFERENCES

[1] J. M Rabaey, A. Chandrakasan and B. Nikolic, Digital Integrated Circuits: A Design Perspective, Prentice Hall, 2001.

[2] R. K. Brayton, G. D. Hachtel and A. Sangiovanni-Vincentelli, ―A survey of optimization techniques for integrated-circuit design,‖ Proc. IEEE, vol. 69, pp. 1334- 1362, Oct. 1981.

[3] B. A. A. Antao, ―Trends in CAD of analog ICs,‖ IEEE Circuits Devices Magazine., vol. 12, no. 5, pp. 31-41, Sept. 1996.

[4] P. D. Franzon, K. Gard and K. Sivaramkrishnan, ―Holistic system optimization.‖

[5] L. R. Carley, G. G. E. Gielen, R. A. Rutenbar, and W. M. C. Sansen, "Synthesis tools for mixed-signal ICs: Progress on front-end and back-end strategies," in Proceedings of the 33rd Annual Design Automation Conference, 1996, pp. 298-303.

[6] L. R. Carley and R. A. Rutenbar, ―How to automate analog IC designs,‖ IEEE Spectrum, vol. 25, pp. 26-30, Aug. 1988.

[7] A. Hassibi and M. Hershenson, ―Automated optimal design of switched-capacitor filters,‖ in Proceedings of Design, Automation and Test in Europe Conference and Exhibition, 2002, pp. 1111.

[8] M. Hershenson, ―Design of pipeline analog-to-digital converters via geometric programming,‖ in Proceedings of IEEE International Conference on Computer- Aided Design, Nov. 2003, pp. 317-324.

[9] D. Colleran, C. Portmann, A. Hassibi, C. Crusius, S. Mohan, S. Boyd, T. Lee and M. Hershenson, ―Optimization of phase-locked loop circuits via geometric programming,‖ in Proceedings of IEEE Custom Integrated Circuit Conference, 2003, pp. 377-380.

[10] J. Lee, J. Hatchter, and C. K. K. Yang, ―Evaluation of fully-integrated switching regulator for CMOS process technologies,‖ in Proceedings of 2003 International Symposium on SOC, to appear.

[11] M. M. Hershenson, S. P. Boyd and T. H. Lee, ―Optimal design of a CMOS op-amp via geometric programming,‖ IEEE Transaction on Computer-Aided Design of Integrated Circuits and Systems, Vol. 20, No. 1, January 2001.


[12] J. Kim, J. Lee, L. Vandenberghe, and C. K. Ken Yang, "Techniques for improving the accuracy of geometric programming-based analog circuit design optimization," Proceedings of IEEE/ACM International Conference on Computer-Aided Design, 2004.

[13] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2007.

[14] A. Cherkaev, ―Methods of optimization,‖ available at http://www.math.utah.edu/~cherk/teach/opt/course.html

[15] A. O. Adewumi, ―Some improved genetic-algorithms based heuristics for global optimization with innovative applications,‖ Ph.D. Dissertation, 2010.

[16] C. C. Bissell, ―A History of automatic control,‖ available at http://siamun.weebly.com/uploads/4/1/7/3/4173241/history_of_automatic_control.pdf

[17] S. Boyd and L. Vandenberghe, ―Introduction to convex optimization with engineering applications,‖ Course Notes, 1999, avail. at http://www.leland.stanford.edu/ece364/

[18] M. Chiang, Geometric Programming for Communication Systems, Now Publishers Inc., 2005.

[19] Y. Nesterov and M. Todd, "Self-scaled barriers and interior-point methods for convex programming," Mathematics of Operations Research, 22:1-42, 1997.

[20] D. Goldfarb and G. Iyengar, "Robust quadratically constrained programs," Technical Report TR-2002-04, Department of IEOR, Columbia University, New York, NY, 2002.

[21] Z.-Q. Luo, J. Sturm, and S. Zhang, ―Duality and self-duality for conic convex programming,‖ Technical Report, Department of Electrical and Computer Engineering, McMaster University, 1996.

[22] J. Sturm, ―Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,‖ Optimization Methods and Software, 11:625-653, 1999.

[23] K. Fujisawa, M. Kojima, K. Nakata, and M. Yamashita, ―SDPA (Semi-Definite Programming Algorithm) user's manual - version 6.00,‖ Technical Report, Tokyo Institute of Technology, July 2002.

[24] S. Benson, ―DSDP 4.5: A dual scaling algorithm for semidefinite programming,‖ http://www-unix.mcs.anl.gov/~benson/dsdp/, March 2002.


[25] Yu. Nesterov and A. Nemirovsky, ―Interior-point polynomial algorithms in convex programming: theory and algorithms,‖ Studies in Applied Mathematics, vol. 13 of SIAM Publications, Philadelphia, PA, 1993.

[26] M. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, ―Applications of second-order cone programming,‖ Linear Algebra and its Applications, 284:193-228, Special issue on Signals and Image Processing, November 1998.

[27] R. J. Duffin, "Linearizing geometric programs," SIAM Review, vol. 12, pp. 211-237, 1970.

[28] E. Rosenberg, ―Globally convergent algorithms for convex programming with applications to geometric programming,‖ PhD thesis, Department of Operations Research, Stanford University, 1979.

[29] Y. Xu, K. L. Hsiung, X. Li, L. T. Pileggi, and S. P. Boyd, ―Regular analog/rf integrated circuits design using optimization with recourse including ellipsoidal uncertainty,‖ IEEE Trans. On Computer-Aided Design of Integrated Circuits and Systems, vol. 23, pp. 623-637, May. 2009.

[30] S. Boyd, S. Kim, L. Vandenberghe and A. Hassibi, ―A Tutorial on geometric programming,‖ Optimum Engineering, 2007.

[31] R. Rockafellar, Convex Analysis, Princeton Univ. Press, Princeton, New Jersey, 2nd edition, 1970.

[32] J. Stoer and C. Witzgall, Convexity and Optimization in Finite Dimensions I, Springer-Verlag, 1970.

[33] J.-B. Hiriart-Urruty and C. Lemaréchal, ―Convex analysis and minimization algorithms I,‖ Grundlehren der mathematischen Wissenschaften, vol. 305, Springer- Verlag, New York, 1993.

[34] D. Bertsekas, A. Nedic, and A. Ozdaglar, Convex Analysis and Optimization, Athena Scientific, Nashua, NH, 2004.

[35] Yu. Nesterov and A. Nemirovsky, ―A general approach to polynomial-time algorithms design for convex programming,‖ Technical Report, Central Economics and Mathematics Institute, USSR Academy of Science, Moscow, USSR, 1988.

[36] S. Zhang, ―A new self-dual embedding method for convex programming,‖ Technical Report, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, October 2001.


[37] Y. Ye, Interior-Point Algorithms: Theory and Practice, John Wiley & Sons, New York, NY, 1997.

[38] S. Wright, Primal Dual Interior Point Methods, SIAM Publications, Philadelphia, PA, 1999.

[39] N. Karmarkar, ―A new polynomial-time algorithm for linear programming,‖ Combinatorica, 4(4):373-395, 1984.

[40] N. Megiddo, ―Pathways to the optimal set in linear programming,‖ In N. Megiddo, editor, Progress in Mathematical Programming: Interior Point and Related Methods, pages 131-158. Springer Verlag, New York, 1989. Identical version in: Proceedings of the 6th Mathematical Programming Symposium of Japan, Nagoya, Japan, 1-35, 1986.

[41] R. Monteiro and I. Adler, ―Interior path following primal-dual algorithms: Part I: Linear programming,‖ Mathematical Programming, 44:27-41, 1989.

[42] M. Wright, ―Some properties of the Hessian of the logarithmic barrier function,‖ Mathematical Programming, 67:265-295, 1994.

[43] O. Bahn, J. Goffin, J. Vial, and O. Du Merle, "Implementation and behavior of an interior point cutting plane algorithm for convex programming: An application to geometric programming," Working Paper, University of Geneva, Geneva, Switzerland, 1991.

[44] S. Nash and A. Sofer, "A barrier method for large-scale constrained optimization," ORSA J. on Computing, 5:40-53, 1993.

[45] S. Joshi and S. Boyd, ―Sensor selection via convex optimization,‖ IEEE Trans. On signal processing, vol.57, no.2, Feb. 2009.

[46] G. Dullerud and F. Paganini, ―A Course in robust control theory,‖ vol. 36 of Texts in Applied Mathematics, Springer-Verlag, 2000.

[47] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, ―Linear matrix inequalities in system and control theory,‖ SIAM, 1994.

[48] M. Milanese and A. Vicino, ―Optimal estimation theory for dynamic systems with set membership uncertainty: An overview,‖ Automatica, 27(6):997-1009, Nov. 1991.

[49] G. Calafiore and M. Indri, ―Robust calibration and control of robotic manipulators,‖ American Control Conference, pp. 2003-2007, 2000.


[50] L. Han, J. Trinkle, and Z. Li, ―Grasp analysis as linear matrix inequality problems,‖ IEEE Trans. on Robotics and Automation, 16(6):663-674, Dec. 2000.

[51] S. Boyd and L. Vandenberghe, ―Semidefinite programming relaxations of non- convex problems in control and combinatorial optimization,‖ In A. Paulraj, V. Roychowdhuri, and C. Schaper, editors, Communications, Computation, Control and Signal Processing: A Tribute to Thomas Kailath, chapter 15, pp. 279-288, Kluwer Academic Publishers, 1997.

[52] J. Kleinberg, C. Papadimitriou, and P. Raghavan, ―Segmentation problems,‖ Proceedings of the 30th Symposium on Theory of Computation, pp.473-482, 1998.

[53] U. Zwick, ―Outward rotations: a tool for rounding solutions of semidefinite programming relaxations, with applications to max cut and other problems,‖ Proceedings of the 31 st Symposium on Theory of Computation, pp.679-687, 1999.

[54] U. Feige and M. Langberg, ―Approximation algorithms for maximization problems arising in graph partitioning,‖ Journal of Algorithms, vol.41, pp.174-211, 2001.

[55] A. Ben-Tal and M. Bendsoe, ―A new method for optimal truss topology design,‖ SIAM Journal on Optimization, vol.13(2), 1993.

[56] A. Ben-Tal and A. Nemirovski, ―Robust truss topology design via semidefinite programming,‖ SIAM Journal on Optimization, 7(4):991-1016, 1997.

[57] F. Jarre, M. Kocvara, and J. Zowe, ―Optimal truss design by interior point methods,‖ SIAM Journal on Optimization, 8(4):1084-1107, 1998.

[58] M. Kocvara, J. Zowe, and A. Nemirovski, ―Cascading an approach to robust material optimization,‖ Computers and Structures, 76:431-442, 2000.

[59] P. Parrilo, ―Semidefinite programming relaxations for semialgebraic problems,‖ Mathematical Programming, Series B, 96(2):293-320, 2003.

[60] J. Lasserre, ―Bounds on measures satisfying moment conditions,‖ Annals of Applied Probability, 12:1114-1137, 2002.

[61] Z. Tan, Y. Soh, and L. Xie, ―Envelope-constrained H∞ filter design: an LMI optimization approach,‖ IEEE Trans. on Signal Processing, 48(10):2960-2963, Oct. 2000.

[62] W. Lu, ―A unified approach for the design of 2-D digital filters via semidefinite programming,‖ IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications, 49(6):814-826, June 2002.


[63] C. Tseng and B. Chen, ―H∞ fuzzy estimation for a class of nonlinear discrete-time dynamic systems,‖ IEEE Trans. on Signal Processing, 49(11):2605-2619, Nov. 2001.

[64] R. Palhares, C. de Souza, and P. D. Peres, ―Robust H∞ filtering for uncertain discrete- time state-delayed systems,‖ IEEE Trans. on Signal Processing, 48(8):1696-1703, Aug. 2001.

[65] J. Geromel and M. De Oliveira, ―H2/H∞ robust filtering for convex bounded uncertain systems,‖ IEEE Trans. on Automatic Control, 46(1):100-107, Jan. 2001.

[66] F. Wang and V. Balakrishnan, ―Robust Kalman filters for linear time-varying systems with stochastic parametric uncertainties,‖ IEEE Trans. on Signal Processing, 50(4):803-813, Apr. 2002.

[67] J. Mattingley and S. Boyd, ―Real-time convex optimization in signal processing,‖ IEEE Signal Processing Magazine, pp50-61, May 2010.

[68] J. Borwein and A. Lewis, ―Duality relationships for entropy-like minimization problems,‖ SIAM Journal Control and Optimization, 29(2):325-338, Mar. 1991.

[69] T. Davidson, Z. Luo, and K. Wong, ―Design of orthogonal pulse shapes for communications via semidefinite programming,‖ IEEE Trans. on Communications, 48(5):1433-1445, May 2000.

[70] H. Tan and L. Rasmussen, ―The application of semidefinite programming for detection in CDMA,‖ IEEE Journal on Selected Areas in Communications, 19(8):1442-1449, Aug. 2001.

[71] D. Bertsimas and J. Nino-Mora, ―Optimization of multiclass queuing networks with changeover times via the achievable region approach: Part II, the multi-station case,‖ Mathematics of Operations Research, 24(2), May 1999.

[72] P. Biswas and Y. Ye, ―Semidefinite programming for ad hoc wireless sensor network localization,‖ Technical Report, Stanford University, April 2004. http://www.stanford.edu/~yyye/adhocn4.pdf.

[73] L. Vandenberghe, S. Boyd, and A. El Gamal, ―Optimizing dominant time constant in RC circuits,‖ IEEE Trans. on Computer Aided Design, 2(2):110-125, Feb. 1998.

[74] J. Dawson, S. Boyd, M. Hershenson, and T. Lee, ―Optimal allocation of local feedback in multistage amplifiers via geometric programming,‖ IEEE Journal of Circuits and Systems I, 48(1):1-11, January 2001.


[75] S. Boyd, M. Hershenson, and T. Lee, ―Optimal analog circuit design via geometric programming,‖ Preliminary Patent Filing, Stanford Docket S97-122, 1997.

[76] J. Park, H. Cho, and D. Park, ―Design of GBSB neural associative memories using semidefinite programming,‖ IEEE Trans. on Neural Networks, 10(4):946-950, Jul. 1999.

[77] D. Goldfarb and G. Iyengar, ―Robust portfolio selection problems,‖ Technical Report, Computational Optimization Research Center, Columbia University, Mar. 2002. http://www.corc.ieor.columbia.edu/reports/techreports/tr-2002-03.pdf.

[78] Y. Ye, ―A path to the Arrow-Debreu competitive market equilibrium,‖ Technical Report, Stanford University, Feb. 2004. http://www.stanford.edu/»yyye/arrow- debreu2.pdf.

[79] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, 1963.

[80] C. Zener, "A mathematical aid in optimizing engineering design," Proceedings of National Academy of Sciences, vol. 47, pp. 537-539, 1961.

[81] R. J. Duffin, "Cost minimization problems treated by geometric means," Operations Research, vol. 10, pp. 668-675, 1962.

[82] R. J. Duffin, "Dual problems and minimum cost," SIAM Journal on Applied Mathematics, vol. 10, pp. 119-123, 1962.

[83] R. J. Duffin, E. L. Peterson, and C. Zener, Geometric Programming: Theory and Applications, Wiley, 1967.

[84] C. Zener, Engineering Design by Geometric Programming, Wiley, 1971.

[85] C. S. Beightler and D. T. Phillips, Applied Geometric Programming, John Wiley & Sons, 1976.

[86] J. Ecker, ―Geometric programming, methods, computations and applications,‖ SIAM Review, vol. 22, no. 3, pp. 338-362, 1980.

[87] J. Ecker and M. Kupferschmid, Introduction to Operations Research, Krieger, 1991.

[88] D. J. Wilde and D. D. Beightler, Foundations of Optimization, Prentice-Hall, 1967.


[89] R. S. Dembo, ―Current state of the art of algorithms and computer software for geometric programming,‖ Journal of Optimization Theory and Applications, vol.26, no. 2, pp. 149-184, 1978.

[90] U. Passy and D. J. Wilde, ―A geometric programming algorithm for solving chemical engineering problems,‖ SIAM Review, vol. 16, pp. 363-373, 1968.

[91] M. J. Rijckaert and X. M. Martens, "Analysis and optimization of the Williams-Otto process by geometric programming," AICHE Journal, vol. 20, no. 4, pp. 742-750, Jul. 1974.

[92] M. Avriel and D. J. Wilde, ―Optimal condenser design by geometric programming,‖ I&EC Process Design and Development, vol. 6, pp. 256-263, 1967.

[93] J. J. Dinkel and G. A. Kochenberger, ―A cofferdam design optimization,‖ Mathematical Programming, vol. 6, pp. 114-117, 1974.

[94] A. B. Templeman, ―Structural design for minimum cost using the method of geometric programming,‖ Proceedings of the Institute of Civil Engineering, vol.46, pp. 459-470, 1970.

[95] Z. Wei and S. Ye, ―Optimal sectional design of frame structures using geometric programming,‖ Journal of Structural Engineering, vol. 116, no. 8, pp. 2292-2298, Aug. 1990.

[96] S. Abuyounes and H. Adeli, ―Optimization of steel plate girders via general geometric programming,‖ Journal of Structural Mechanics, vol. 14, no. 4, pp. 501- 524, 1986.

[97] M. B. Snell and P. Bartholomew, "Application of geometric programming to the structural design of aircraft wings," Aeronautical Journal, vol. 86, no. 857, pp. 259-268, 1982.

[98] V. Balachandran and D. Gensch, ―Solving the marketing-mix problem using geometric programming,‖ Management Science, vol. 21, pp.160 -171, 1974.

[99] D. Soesilo and K. J. Min, ―Inventory model with variable levels of quality attributes via geometric programming,‖ International Journal of Systems Science, vol. 27, no. 4, pp. 379-386, Apr. 1996.

[100] M. O. Abou-EI-Ata and K. A. M. Kotb, ―Multi-item EOQ inventory model with varying holding cost under two restrictions: a geometric programming approach,‖ Production Planning and Control, vol. 8, no. 6, pp. 608-611, Sept. 1997.


[101] A. M. A. Hariri and M. O. Abou-EI-Ata, ―Multi-item production lot-size inventory model with varying order cost under a restriction: a geometric programming approach,‖ Production Planning and Control, vol. 8, no. 2, pp. 179-182, Mar.1997.

[102] R. Gupta, J. L. Batra, and G. K. Lal, ―Profit rate maximization in multi-pass turning with constraints: a geometric programming approach,‖ International Journal of Production Research, vol. 32, no. 7, pp. 1557-69, 1994.

[103] T. Moh, T. Chang, and S. L. Hakimi, ―Globally optimal floor planning for a layout Problem,‖ IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 43, no. 9, pp. 713-20, Sept. 1996.

[104] T. Chen and H. Fan, ―On convex formulation of the floorplan area minimization problem,‖ International Symposium on Physical Design, pp.124-128, 1998.

[105] Y. Smeers and D. Tyteca, ―Geometric programming model for the optimal design of wastewater treatment plants,‖ Operations Research, vol. 32, no. 2, pp. 314-342, 1984.

[106] W. C. Pisano, ―Optimal wastewater treatment plant design for variable loadings,‖ Engineering Optimization, vol. 2, no. 3, pp. 197-208, 1976.

[107] J. R. McNamara, ―Optimization model for regional water quality management,‖ Water Resources Research, vol. 12, no. 2, pp. 125-134, 1976.

[108] K. Unklesbav, G. E. Staats, and D. L. Creghton, ―Optimal design of journal bearings,‖ International Journal of Engineering Science, vol.11, pp. 973-983, 1973.

[109] H. Adeli and O. Kamal, ―Efficient optimization of space trusses,‖ Computers and Structures, vol. 24, no. 3, pp. 501-11, 1986.

[110] L. J. Mancini and R. L. Piziali, ―Optimal design of helical springs by geometric programming,‖ Engineering optimization, vol. 2, pp. 73-81, 1976.

[111] D. O. Stuart and R. C. Arnold, ―Condenser, cooling-tower system optimization using geometric programming,‖ ASME/IEEE Power Generation Conference, 1986.

[112] T. D. Robinson, M. S. Eldred, K. E. Willcox and R. Haimes, ―Surrogate-based optimization using multifidelity models with variable parameterization and corrected space mapping,‖ AIAA Journal, vol. 46, No. 11, November 2008.

[113] Y. Nesterov and A. Nemirovsky, ―Interior-point polynomial methods in convex programming,” Studies in Applied Mathematics, vol. 13, 1994.


[114] K. O. Kortanek, X. Xu, and Y. Ye, ―An infeasible interior-point algorithm for solving primal and dual geometric programs,‖ Math Programming, vol. 76, pp.155- 181, 1996.

[115] M. Hershenson, S. Boyd, and T. H. Lee, ―GPCAD: A tool for CMOS op-amp synthesis,‖ in IEEE/ACM International Conference on Computer Aided Design, pp. 296-303, San Jose, CA, 1998.

[116] S. Boyd, S. J. Kim and S. S. Mohan, ―Geometric programming and its applications to EDA problems,‖ available at http://www.stanford.edu/~boyd/papers/pdf/date05.pdf

[117] D. Wilde, Globally Optimal Design, John Wiley & Sons Inc., New York, 1978.

[118] A. Magnani and S. P. Boyd, ―Convex piecewise-linear fitting,‖ Optimization Engineering, March 2008.

[119] T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Springer, Berlin, 2001.

[120] J. J. Dinkel, W. H. Elliot, and G. A. Kochenberger, ―Computational aspects of cutting plane algorithms for solving geometric programming problems,‖ Mathematical Programming, vol. 13, pp. 200-220, 1977.

[121] J. G. Ecker and M. J. Zoracki, ―An easy primal method for geometric programming,‖ Management Science, vol. 23, pp. 600-612, 1976.

[122] G. V. Reklaitis and D. J. Wilde, ―Geometric programming via a primal auxiliary problem,‖ AIIE Transactions, vol. 6, pp. 308-317, 1974.

[123] G. S. Dawkins, B. C. McInnis and S. K. Moonat, ―Solution to geometric programming problems by transformation to convex programming problems,‖ International Journal of Solid Structures, vol. 10, pp. 135-136, 1974.

[124] C. J. Frank, Recent Advances in Optimization Techniques, John Wiley, 1966.

[125] M. J. Rijckaert and X. M. Martens, ―A condensation method for generalized geometric programming,‖ Mathematical Programming, vol. 11, pp. 89-93, 1976.

[126] S. J. Wright, ―Primal-dual interior-point methods,‖ SIAM, Philadelphia, 1997.

[127] X. Xu, "XGP-an optimizer for geometric programming," Technical Report, May 1995, http://www.col.biz.uiowa.edu/dist/xue/doc/home.html.


[128] U. Passy and D. J. Wilde, ―Generalized polynomial optimizations,‖ SIAM Journal Of Applied Mathematics, vol. 15, pp. 1344-1356, 1967.

[129] M. J. Rijckaert and X. M. Martens, ―Comparison of generalized geometric programming algorithms,‖ Journal of Optimization Theory and Applications, vol. 26, no. 2, pp. 205-242, 1978.

[130] M. Avriel and A. C. Williams, ―Complementary geometric programming,‖ SIAM Journal of Applied Mathematics, vol. 19, pp. 125-141, 1970.

[131] U. Passy, "Generalized weighted mean programming," SIAM Journal of Applied Mathematics, vol. 20, pp. 763-778, 1971.

[132] M. Avriel, R. Dembo, and U. Passy, ―Solution to generalized geometric programs,‖ International Journal for Numerical Methods in Engineering, vol. 9, pp. 149-168, 1975.

[133] R. J. Duffin and E. L. Peterson, ―Geometric programming with signomials,‖ Journal of Optimization Theory and Applications, vol. 11, pp. 3-35, 1973.

[134] J. Rajgopal and D. L. Bricker, "Posynomial geometric programming as a special case of semi-infinite linear programming," Journal of Optimization Theory and Applications, vol. 66, no. 3, pp. 455-475, Sept. 1990.

[135] J. Wang, ―Microwave circuit optimization exploiting tuning space mapping,‖ available at http://www.sos.mcmaster.ca/theses/meng/jie_thesis_final.pdf

[136] N. V. Queipo, R. T. Haftka, W. Shyy, T. Goel, R. Vaidynathan, and P. K. Tucker, ―Surrogate-based analysis and optimization,‖ Prog. Aerospace Science., vol. 41, no. 1, pp. 1-28, 2005.

[137] J. Sondergaard, ―Optimization using surrogate models,‖ PhD Thesis, available at http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=2450

[138] B. Razavi, ―A 1.8GHz CMOS voltage controlled oscillator,‖ ISSCC Digest of Technical Papers, pp.388-389, Feb. 1997.

[139] J. Craninckx and M. Steyaert, ―A fully integrated spiral-LC CMOS VCO set with prescaler for GSM and DCS-1800 systems,‖ Proc. CICC, pp.403-406, May 1997.

[140] A. Rofourgan, J. Rael, M. Rofourgan and A. Abidi, ―A 900-MHz CMOS LC- oscillator with quadrature outputs,‖ ISSCC Digest of Technical Papers, pp. 392-393, February 1996.


[141] C. Huang, L. C. N. de Vreede, A. Akhnoukh, and J. N. Burghartz, "Low frequency noise LC oscillators models," available at stw.nl/NR/rdonlyres/818B4D81-3195-4300-801E-C1D066F333EF/0/huang.pdf

[142] S. E. Weberg ―Design of a 1.6-mV LC-tuned VCO for 2.4GHz in 0.18-um RF CMOS technology,‖ available at http://cp.literature.agilent.com/litweb/pdf/5989-8893EN.pdf

[143] G. G. E. Gielen, H. C. C.Walscharts, and W. M. C. Sansen, ―Analog circuit design optimization based on symbolic simulation and simulated annealing,‖ IEEE J. Solid- State Circuits, vol. 25, pp. 707-713, June 1990.

[144] G. G. E. Gielen, H. C. C.Walscharts, and W. M. C. Sansen, ―ISAAC: A symbolic simulator for analog integrated circuits,‖ IEEE J. Solid-State Circuits, vol. 24, pp. 1587-1597, Dec. 1989.

[145] F. V. Fernández, A. Rodríguez-Vázquez, and J. L. Huertas, ―Interactive AC modeling and characterization of analog circuits via symbolic analysis,‖ Analog Integrated Design and Signal Processing, vol. 1, pp183-208, Nov. 1991.

[146] J. W. Bandler, and R. M. Biernacki, ―Electromagnetic optimization exploiting aggressive space mapping,‖ IEEE Trans. On Microwave Theory and Techniques, vol. 43, pp. 2874-2882, Dec. 1995.

[147] S. Koziel, J. W. Bandler, and K. Madsen, ―A space-mapping framework for engineering optimization – theory and implementation,‖ IEEE Trans. On Microwave Theory and Techniques, vol. 54, pp. 3721-3730, Oct. 2006.

[148] R. Achar and M. S. Nakhla, ―Simulation of high-speed interconnects,‖ Proc. IEEE, vol. 89, no. 5, pp. 693-728, May 2001.

[149] J. Moré and D. Sorensen, ―NMTR (software package),‖ March 2000. http://www- unix.mcs.anl.gov/~more/nmtr

[150] P. Hovland, B. Norris, and C. Bischof, ―ADIC (software package),‖ November 2003. http://www-fp.mcs.anl.gov/adic/.

[151] C. Bischof, A. Carle, G. Corliss, A. Griewank, and P. Hovland, ―ADIFOR: Generating derivative codes from fortran programs,‖ Scientific Programming, pp. 1- 29, December 1991.

[152] M. Overton and R. Womersley, ―On the sum of the largest eigenvalues of a symmetric matrix,‖ SIAM Journal on Matrix Analysis and Applications, 13(1):41-45, January 1992.


[153] C. Samori, A. L. Lacaita, F. Villa, and F. Zappa, "Spectrum folding and phase noise in LC tuned oscillators," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, no. 7, July 1998.

[154] P. R. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated Circuits (4th Edition), John Wiley & Sons, Inc., 2001.

[155] M. Hershenson, A. Hajimiri, S. S. Mohan, S. P. Boyd, and T. H. Lee, "Design and optimization of LC oscillators," in Proceedings of IEEE/ACM International Conference on Computer-Aided Design, 1999.

[156] A. Hajimiri and T. H. Lee, ―A general theory of phase noise in electrical oscillators,‖ IEEE Journal of Solid State Circuit, pp. 179-194, Feb. 1998.

[157] A. Hajimiri and T. H. Lee, ―Design issues in CMOS differential LC oscillators,‖ IEEE Journal of Solid State Circuits, 34(5), May 1999.

[158] A. Hajimiri and T. H. Lee, ―Phase noise in CMOS differential LC oscillators,‖ Proc. VLSI Circuits, pp. 48-51, June 1998.


APPENDICES


Appendix A

Convexity of Point-wise Maximum of Convex Functions

Lemma: If the functions $f_1(y)$ and $f_2(y)$ are convex, then their pointwise maximum $f(y) = \max\{f_1(y), f_2(y)\}$ is also convex; if the functions $f_1(y), f_2(y), \ldots, f_m(y)$ are convex, then their pointwise maximum $f(y) = \max_{i \le m}\{f_i(y)\}$ is also convex.

Proof: It is known that if, for any $\theta \in \mathbb{R}$ with $0 \le \theta \le 1$, the function $f(y)$ satisfies

$$f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta) f(y) \qquad \text{(A.1)}$$

then $f(y)$ is a convex function. Now, since $f_1(y)$ and $f_2(y)$ are convex functions, it holds for any $\theta$ that

$$\begin{aligned}
f(\theta x + (1-\theta)y) &= \max\{f_1(\theta x + (1-\theta)y),\ f_2(\theta x + (1-\theta)y)\} \\
&\le \max\{\theta f_1(x) + (1-\theta)f_1(y),\ \theta f_2(x) + (1-\theta)f_2(y)\} \\
&\le \theta \max\{f_1(x), f_2(x)\} + (1-\theta)\max\{f_1(y), f_2(y)\} \\
&= \theta f(x) + (1-\theta) f(y)
\end{aligned} \qquad \text{(A.2)}$$

Consequently, $f(y)$ is convex. Similarly, this can be extended to the general case. If the functions $f_1(y), \ldots, f_m(y)$ are convex, then

$$\begin{aligned}
f(\theta x + (1-\theta)y) &= \max_{i \le m}\{f_i(\theta x + (1-\theta)y)\} \\
&\le \max\{\theta f_1(x) + (1-\theta)f_1(y),\ \ldots,\ \theta f_m(x) + (1-\theta)f_m(y)\} \\
&\le \theta \max\{f_1(x), \ldots, f_m(x)\} + (1-\theta)\max\{f_1(y), \ldots, f_m(y)\} \\
&= \theta f(x) + (1-\theta) f(y)
\end{aligned} \qquad \text{(A.3)}$$

Therefore, $f(y)$ is convex.
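As a quick numeric spot-check of this lemma (illustrative only; the three convex functions below are arbitrary examples), the defining inequality (A.1) can be verified for the pointwise maximum at randomly sampled points:

# Numeric spot-check of inequality (A.1) for f = pointwise max of convex functions.
import numpy as np

rng = np.random.default_rng(0)
fs = [lambda y: (y - 1.0) ** 2, lambda y: np.exp(y), lambda y: abs(y)]
f = lambda y: max(g(y) for g in fs)  # pointwise maximum

for _ in range(1000):
    x, y, th = rng.normal(), rng.normal(), rng.uniform()
    assert f(th * x + (1 - th) * y) <= th * f(x) + (1 - th) * f(y) + 1e-12
print("inequality (A.1) held at all sampled points")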


Appendix B

Norm Approximation in Function Fitting

In this work, norms are used to measure the errors in function fitting. Here, several types of norm approximations used in function fitting (for example, monomial and posynomial fitting) are briefly introduced, as they are often required in a convex program. More details can be found in [28].

The norm-approximation problem is the unconstrained problem

$$\text{minimize } f(x) = \|Ax - b\|_p \qquad \text{(B.1)}$$

where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$ are the given data that we want to fit, $x \in \mathbb{R}^n$ is the variable, and $\|\cdot\|$ is a norm on $\mathbb{R}^m$. A solution of this problem makes $Ax \approx b$. It is shown in [28] that the norm approximation problem is a convex problem and is solvable. Basically, a small $p$ puts more weight on the small errors, and a large $p$ puts more weight on the large errors: an $L_1$-norm penalty puts the most weight on small residuals, whereas an $L_\infty$ norm puts the most weight on the worst-case residual, and the $L_2$ norm is in between these two. There are several choices of the norm $\|\cdot\|$, which are illustrated as follows.

B.1 The Manhattan or L1 Norm Approximation

In the case of index $p = 1$, the norm is called the Manhattan or $L_1$ norm [13]. The $L_1$-norm approximation is the following minimization problem:

$$\text{minimize } f(x) = \|Ax - b\|_1 = \sum_{i=1}^{m} |a_i^T x - b_i| \qquad \text{(B.2)}$$

There is no analytical solution for this problem, but a solution can be determined by introducing a slack vector $t \in \mathbb{R}^m$ with $t_i \ge |a_i^T x - b_i|$, then reformulating it as a linear program:

$$\begin{array}{ll} \text{minimize} & \mathbf{1}^T t \\ \text{subject to} & -t \preceq Ax - b \preceq t \end{array} \qquad \text{(B.3)}$$

with variables $x \in \mathbb{R}^n$ and $t \in \mathbb{R}^m$.
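The LP (B.3) can be solved with any standard LP solver. The following short sketch, using scipy.optimize.linprog with arbitrary random data for A and b, is illustrative only:

# L1-norm approximation as the LP (B.3); A and b are illustrative random data.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 20, 3
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

c = np.concatenate([np.zeros(n), np.ones(m)])      # variables z = [x, t]
G = np.block([[-A, -np.eye(m)], [A, -np.eye(m)]])  # -t <= Ax - b <= t
h = np.concatenate([-b, b])
res = linprog(c, A_ub=G, b_ub=h, bounds=[(None, None)] * (n + m))
x_l1 = res.x[:n]
print("L1 residual norm:", np.abs(A @ x_l1 - b).sum())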

B.2 The Euclidean or L2 Norm Approximation

In the case of index $p = 2$, the norm is called the Euclidean or $L_2$ norm [13], which is the most common choice for the norm approximation problem. This gives the least-squares approximation problem:

$$\text{minimize } f(x) = \|Ax - b\|_2 = \sqrt{\sum_{i=1}^{m} (a_i^T x - b_i)^2} \qquad \text{(B.4)}$$

or we can equivalently minimize its square:

$$\text{minimize } f^2(x) = \|Ax - b\|_2^2 = \sum_{i=1}^{m} (a_i^T x - b_i)^2 \qquad \text{(B.5)}$$

This is actually a simple least-squares problem. To solve it, we can analytically express the squared objective as the convex quadratic function

$$f^2(x) = x^T A^T A x - 2 b^T A x + b^T b \qquad \text{(B.6)}$$

A point $x$ minimizes $f^2(x)$ if and only if

$$\nabla f^2(x) = 2 A^T A x - 2 A^T b = 0 \qquad \text{(B.7)}$$

which always has a solution, given by $x = (A^T A)^{-1} A^T b$ when the $m \times n$ matrix $A$ has full column rank.
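The closed-form solution of (B.7) is easy to check numerically; the sketch below (with illustrative random data) compares the normal-equations solution with numpy's built-in least-squares solver:

# Closed-form normal-equations solution (B.7) vs. numpy's least-squares solver.
import numpy as np

rng = np.random.default_rng(1)
A, b = rng.standard_normal((20, 3)), rng.standard_normal(20)

x_normal = np.linalg.solve(A.T @ A, A.T @ b)     # x = (A^T A)^{-1} A^T b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # numerically preferred route
assert np.allclose(x_normal, x_lstsq)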

B.3 The Chebyshev or $L_\infty$ Norm Approximation

In the case of $p = \infty$, the norm is called the Chebyshev or $L_\infty$ norm. The $L_\infty$-norm approximation is the following minimization problem:

$$\text{minimize } f(x) = \|Ax - b\|_\infty = \max_{i=1,2,\ldots,m} |a_i^T x - b_i| \qquad \text{(B.8)}$$

which is also called the Chebyshev approximation problem, or minimax approximation problem, since we minimize the maximum (absolute value) residual. A solution can be achieved [13] from the following equivalent linear program, obtained by introducing the scalar slack variable $t \ge \max_{i=1,2,\ldots,m} |a_i^T x - b_i|$:

$$\begin{array}{ll} \text{minimize} & t \\ \text{subject to} & -t\mathbf{1} \preceq Ax - b \preceq t\mathbf{1} \end{array} \qquad \text{(B.9)}$$

with variables $x \in \mathbb{R}^n$ and $t \in \mathbb{R}$. That is, when a monomial is fitted to data points $(x_i, f_i)$ in the logarithmic domain, the Chebyshev problem takes the form

$$\begin{array}{ll} \text{minimize} & t \\ \text{subject to} & \log f_i - (a_0 + a_1 \log x_{1i} + \cdots + a_n \log x_{ni}) \le t, \quad i = 1, \ldots, m \\ & (a_0 + a_1 \log x_{1i} + \cdots + a_n \log x_{ni}) - \log f_i \le t, \quad i = 1, \ldots, m \end{array} \qquad \text{(B.10)}$$
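The LP (B.9) can be solved in the same way as (B.3); a minimal illustrative sketch with random data:

# Chebyshev (L-infinity) approximation as the LP (B.9), illustrative data.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 20, 3
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

c = np.concatenate([np.zeros(n), [1.0]])  # variables z = [x, t]
ones = np.ones((m, 1))
G = np.block([[A, -ones], [-A, -ones]])   # Ax - b <= t1 and -(Ax - b) <= t1
h = np.concatenate([b, -b])
res = linprog(c, A_ub=G, b_ub=h, bounds=[(None, None)] * (n + 1))
x_inf, t_opt = res.x[:n], res.x[n]
print("minimax residual:", t_opt, "=", np.abs(A @ x_inf - b).max())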

B.4 The Hölder or Lp Norm Approximation

The Hölder norm approximation is defined as

$$\text{minimize } f(x) = \|Ax - b\|_p = \left(\sum_{i=1}^{m} |a_i^T x - b_i|^p\right)^{1/p} \qquad \text{(B.11)}$$

Now some methods are described to solve the above problem [30].

Newton's method. When $p \ge 2$, $f(x)$ is twice differentiable everywhere except when $Ax = b$. This condition is not likely to be encountered in practical norm minimization problems, so we may consider applying Newton's method directly. In fact, let us minimize instead the related function

$$g(x) = f^p(x) = \sum_{i=1}^{m} |a_i^T x - b_i|^p \qquad \text{(B.12)}$$

which produces the same solution $x$ but simplifies the calculations somewhat, and eliminates the non-differentiability at $Ax = b$. The iterates produced by Newton's method are

$$x^{(k+1)} = x^{(k)} - \alpha_k \left(\nabla^2 g(x^{(k)})\right)^{-1} \nabla g(x^{(k)}) = x^{(k)} - \frac{\alpha_k}{p-1}\,(A^T W^{(k)} A)^{-1} A^T W^{(k)} (A x^{(k)} - b) \qquad \text{(B.13)}$$

so that each step amounts to a weighted least-squares computation, where $x^{(0)} = 0$, $k = 1, 2, 3, \ldots$, and we have defined

$$W^{(k)} = \mathrm{diag}\!\left(|a_1^T x^{(k)} - b_1|^{p-2},\ |a_2^T x^{(k)} - b_2|^{p-2},\ \ldots,\ |a_m^T x^{(k)} - b_m|^{p-2}\right) \qquad \text{(B.14)}$$

The quantities $\alpha_k \in [0, 1]$ are either fixed or determined at each iteration step using a line search technique. Notice how the Newton computation involves a least-squares problem: in fact, if $p = 2$, then $W^{(k)} = I$, and a single iteration with $\alpha_1 = 1$ produces the correct solution. So the more "complex" $L_p$ case simply involves solving a series of similar least-squares problems, a resemblance that often turns up in numerical methods for convex programming.
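The following short numeric sketch implements the iteration (B.13) for p > 2 with a fixed unit step; it is illustrative only, assumes a full-rank A and nonzero residuals, and omits the line search and damping safeguards discussed below:

# Newton iteration (B.13) for the Lp problem, p >= 2; no safeguards (illustrative).
import numpy as np

def lp_newton(A, b, p=4.0, iters=50):
    x = np.zeros(A.shape[1])                      # x^(0) = 0
    for _ in range(iters):
        r = A @ x - b                             # residuals a_i^T x - b_i
        w = np.abs(r) ** (p - 2)                  # diagonal of W^(k), per (B.14)
        H = A.T @ (w[:, None] * A)                # A^T W A
        x = x - np.linalg.solve(H, A.T @ (w * r)) / (p - 1)
    return x

rng = np.random.default_rng(3)
A, b = rng.standard_normal((20, 3)), rng.standard_normal(20)
x = lp_newton(A, b, p=4.0)
print("L4 objective:", np.sum(np.abs(A @ x - b) ** 4))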

An important technical detail must be mentioned here. If one or more of the residuals $a_i^T x^{(k)} - b_i$ are zero, the matrix $W^{(k)}$ in (B.14) is singular, and $\nabla^2 g(x) \propto A^T W^{(k)} A$ can therefore be singular as well. If $m \gg n$ and $A$ has full column rank, this is not likely, but care must be taken, nonetheless, to guard against the possibility. A variety of methods can be considered to address the issue, for example, adding a slight damping factor $\epsilon I$ to either $W^{(k)}$ or $\nabla^2 g(x)$.

Newton's method is itself a relatively straightforward algorithm, and a number of implementations have been developed (see, for example, [149]). These methods require that code be created to compute the gradient and Hessian of the function being minimized. This task is eased somewhat by automatic differentiation packages such as ADIC [150] or ADIFOR [151], which can generate derivative code from code that simply computes a function's value.

A better approach. For $1 \le p < 2$, Newton's method cannot reliably be employed, because neither $f(x)$ nor $g(x) = f^p(x)$ is twice differentiable whenever any of the residuals are zero. An alternative that works for all $p \in [1, \infty)$ is to apply a barrier method to the problem. A full introduction to barrier methods is beyond the scope of this text, so we will highlight only the key details. The reader is invited to consult [13] for a truly exhaustive development of barrier methods, or [28] for a gentler introduction.

To begin, we note that the solution to (B.1) can be obtained by solving

$$\begin{array}{ll} \text{minimize} & \mathbf{1}^T t \\ \text{subject to} & |a_i^T x - b_i|^p \le t_i, \quad i = 1, \ldots, m \end{array} \qquad \text{(B.15)}$$

where the slack vector $t$ satisfies $t = \left(|a_1^T x - b_1|^p, \ldots, |a_m^T x - b_m|^p\right)$ at the optimum. To solve (B.15), we construct a barrier function $\phi : \mathbb{R}^n \times \mathbb{R}^m \to (\mathbb{R} \cup \infty)$ to represent the inequality constraints [13]:

$$\phi(x, t) = \begin{cases} -\displaystyle\sum_{i=1}^{m}\left[\log\!\left(t_i^{2/p} - (a_i^T x - b_i)^2\right) + 2\log t_i\right] & (x, t) \in \mathrm{Int}\, S \\ \infty & (x, t) \notin \mathrm{Int}\, S \end{cases} \qquad \text{(B.16)}$$

$$S = \{(x, t) \in \mathbb{R}^n \times \mathbb{R}^m \mid |a_i^T x - b_i|^p \le t_i,\ i = 1, \ldots, m\}$$

The barrier function is finite and twice differentiable whenever the inequality constraints in (B.15) are strictly satisfied, and $+\infty$ otherwise. This barrier is used to create a family of functions $g_\mu$ parameterized by a quantity $\mu > 0$:

$$g_\mu : \mathbb{R}^n \times \mathbb{R}^m \to (\mathbb{R} \cup \infty), \qquad g_\mu(x, t) = \mathbf{1}^T t + \mu\, \phi(x, t) \qquad \text{(B.17)}$$

The minimizing values of $g_\mu(x, t)$ converge to the solution of the original problem as $\mu \to 0$. A practical barrier method takes Newton steps to minimize $g_\mu(x, t)$, decreasing the value of $\mu$ between iterations so as to ensure convergence and acceptable performance.

This approach is significantly more challenging than our previous efforts. As when using the Newton method for the $p \ge 2$ case, code must be written (or automatic differentiation employed) to compute the gradient and Hessian of the function $g_\mu(x, t)$. Furthermore, the authors are unaware of any readily available software implementing such a general-purpose barrier method. Several custom solvers exist specifically for the $p$-norm minimization problem, but applications-oriented users are unlikely to know of them. Obviously, the general $p$ case is the most difficult option to solve thus far.

B.5 The Largest-k Term Norm

This is an uncommon norm. Given a vector $w \in \mathbb{R}^m$, let $|w|_{[k]}$ denote the $k$-th largest element of the vector after it has been sorted in descending order of absolute value:

$$|w|_{[1]} \ge |w|_{[2]} \ge \cdots \ge |w|_{[m]} \qquad \text{(B.18)}$$

Then the largest-$L$ norm is defined as follows:

$$f(x) = \|w\|_{[L]} = \sum_{k=1}^{L} |w|_{[k]}, \qquad L \in \{1, 2, \ldots, m\} \qquad \text{(B.19)}$$

Solving (B.1) using this norm (with $w = Ax - b$) produces a solution $x$ that minimizes the sum of the absolute values of the $L$ largest residuals. It is equivalent to the $L_\infty$ case for $L = 1$ and the $L_1$ case for $L = m$, but for $1 < L < m$ this norm produces novel results. Indeed, (B.19) is a norm, and hence a convex function. It is even less obvious how this problem can be solved, but in fact a solution can be obtained from the following LP [13]:

$$\begin{array}{ll} \text{minimize} & \mathbf{1}^T t + Lq \\ \text{subject to} & -t - q\mathbf{1} \preceq Ax - b \preceq t + q\mathbf{1} \\ & t \succeq 0 \end{array} \qquad (t \in \mathbb{R}^m,\ q \in \mathbb{R}) \qquad \text{(B.20)}$$

This LP is only slightly larger than the ones used for the $L_1$ and $L_\infty$ norm cases. The result is known (see, for example, [152]) but not widely so, even among those who actively study optimization.
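A minimal numeric sketch of (B.20), again with illustrative random data, which also checks the LP optimum against a direct evaluation of the sum of the L largest absolute residuals:

# Largest-L norm approximation as the LP (B.20), illustrative data.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
m, n, L = 20, 3, 5
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

c = np.concatenate([np.zeros(n), np.ones(m), [float(L)]])  # z = [x, t, q]
I, ones = np.eye(m), np.ones((m, 1))
G = np.block([[A, -I, -ones], [-A, -I, -ones]])            # |Ax - b| <= t + q1
h = np.concatenate([b, -b])
bounds = [(None, None)] * n + [(0, None)] * m + [(None, None)]
res = linprog(c, A_ub=G, b_ub=h, bounds=bounds)
x = res.x[:n]
direct = np.sort(np.abs(A @ x - b))[::-1][:L].sum()
print("LP objective:", res.fun, " direct largest-L sum:", direct)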


Appendix C

Objective and Specifications for Two-Stage Op-amp

The design objective and constraints for the two-stage op-amp are summarized below; for each item, the equation number, the expression, and its type (MF = monomial function; PF = posynomial function; IPF = inverse posynomial function) are given.

Unit-gain bandwidth (4.8, MF, maximized): $\omega_C = A_{V0}\,\omega_{3dB} = g_{m1}/C_C$

Circuit area (4.9, PF): $A = \alpha_0 + \alpha_1 C_C + \alpha_2 \sum_{i=1}^{6} W_i L_i$

Quiescent power (4.10, PF): $P = (V_{dd} - V_{SS})(I_{bias} + I_5 + I_7)$

Low-frequency gain (4.11, MF): $A_v = A_{v1} A_{v2} = g_{m2}(r_{o2} \,\|\, r_{o4})\, g_{m6}(r_{o6} \,\|\, r_{o7}) = \dfrac{g_{m2}\, g_{m6}}{(g_{o2}+g_{o4})(g_{o6}+g_{o7})} = \dfrac{2 C_{ox} \sqrt{\mu_n \mu_p}}{(\lambda_n + \lambda_p)^2} \sqrt{\dfrac{(W_2/L_2)(W_6/L_6)}{I_1 I_7}}$

$-3$ dB frequency (4.12, MF): $\omega_{3dB} = p_1 = \dfrac{g_{m1}}{A_v C_C}$

Phase margin (4.13, PF): $\dfrac{\pi}{2} - \sum_{i=2}^{4} \arctan\!\left(\dfrac{\omega_C}{p_i}\right) \ge PM_{min}$, enforced through the posynomial approximation $\sum_{i=2}^{4} \dfrac{\omega_C}{p_i} \le 0.75$

Common-mode input range (4.14, 4.15, PF): $V_{SS} + \sqrt{\dfrac{2 I_1}{\mu_n C_{ox} W_3/L_3}} + V_{T3} - V_{T1} \le V_{cm,min}$ and $V_{cm,max} \le V_{dd} + V_{T1} - \sqrt{\dfrac{2 I_1}{\mu_p C_{ox} W_1/L_1}} - \sqrt{\dfrac{2 I_5}{\mu_p C_{ox} W_5/L_5}}$

Output range (4.16, 4.17, PF): $\sqrt{\dfrac{2 I_7}{\mu_n C_{ox} W_6/L_6}} \le V_{out,min} - V_{SS}$ and $\sqrt{\dfrac{2 I_7}{\mu_n C_{ox} W_7/L_7}} \le V_{dd} - V_{out,max}$

Symmetry and matching (4.18, MF): $W_1 = W_2$, $L_1 = L_2$, $W_3 = W_4$, $L_3 = L_4$, $L_5 = L_7 = L_8$

CMRR (4.19, MF): $CMRR = \dfrac{2 g_{m1} g_{m3}}{(g_{o1}+g_{o3})\, g_{o5}} = \dfrac{2 C_{ox} \sqrt{\mu_n \mu_p (W_1/L_1)(W_3/L_3)}}{(\lambda_n + \lambda_p)\, \lambda_p\, I_5}$

Positive PSRR (4.20, neither MF nor PF): $PSRR^{+} = \dfrac{2 g_{m2} g_{m3} g_{m6}}{(g_{o2}+g_{o4})(2 g_{m3} g_{o7} - g_{m6} g_{o5})}$

Negative PSRR (4.21, 4.22, IPF): $NPSRR(0) = \dfrac{g_{m2}\, g_{m6}}{(g_{o2}+g_{o4})\, g_{o6}}$, with $\left|NPSRR(j\omega_0)\right|^2 = \dfrac{A_p^2}{(1 + \omega_0^2/p_1^2)(1 + \omega_0^2/p_2^2)} \ge a^2$

Slew rate (4.23, PF): $\dfrac{C_C}{2 I_1} \le \dfrac{1}{SR_{min}}$ and $\dfrac{C_C + C_{TL}}{I_7} \le \dfrac{1}{SR_{min}}$

Input-referred spot noise (4.24, PF): $V_{1/f}^2 = \alpha/f + \beta \le V_{1/f,max}^2$, where $\alpha = \dfrac{2 K_p}{C_{ox} W_1 L_1}\left(1 + \dfrac{K_n \mu_n L_1^2}{K_p \mu_p L_3^2}\right)$ and $\beta = \dfrac{16 kT}{3 \sqrt{2 \mu_p C_{ox} (W/L)_1 I_1}}\left(1 + \sqrt{\dfrac{\mu_n (W/L)_3}{\mu_p (W/L)_1}}\right)$

Device width (MF): $W_{min,i} \le W_i \le W_{max,i}$

Device length (MF): $L_{min,i} \le L_i \le L_{max,i}$

Positive supply voltage (MF): $V_{DD,min} \le V_{DD} \le V_{DD,max}$

Negative supply voltage (MF): $V_{SS,min} \le V_{SS} \le V_{SS,max}$

Compensation capacitor (MF): $C_{C,min} \le C_C \le C_{max}$


Appendix D

Preparation for LC-Tuned Oscillator Optimization Design

D.1 Analysis of Typical RLC Oscillator

In the past twenty years, much knowledge has been gathered on RF oscillators [138, 139, 140] because of their wide use (e.g., the voltage-controlled oscillators used in frequency synthesizers). Here, an LC-tuned oscillator [141] is also designed with the proposed surrogate modeling and optimization strategy as a second case study example. Figure D.1 (a) shows a general circuit diagram of a parallel resonant LC-tuned oscillator with all the parasitic resistances, including the inductor, capacitor, and parallel losses.

[Figure D.1 schematics: (a) LC tank with losses r_P, R_P, r_CS, and r_LS; (b) equivalent circuit with parallel losses R_P, R_CP, and R_LP; (c) the same tank with a negative transconductance -G_M connected across it.]

Figure D.1: Analysis of typical RLC oscillator. (a) LC tank; (b) equivalent circuit of LC tank; (c) LC tank with negative transconductance compensating the ohmic loss

The transfer function of this LC tank is

$$H_2(\omega) = \frac{V_{out}}{I_{in}} = \frac{1}{G(\omega)} \qquad \text{(D.1)}$$

where $G(\omega)$ is the admittance of this configuration. The admittance for such an RLC oscillator is calculated in (D.2) [153]:

$$\begin{aligned}
G(\omega) = \frac{I_{in}}{V_{out}} &= \frac{1}{R_P} + \frac{1}{1/(j\omega C) + r_{CS}} + \frac{1}{j\omega L + r_{LS}} \\
&= \frac{1}{R_P} + \frac{1}{\big[1 + (\omega C r_{CS})^2\big]/(\omega C r_{CS})} + \frac{1}{\omega L \big[\omega L/r_{LS} + r_{LS}/(\omega L)\big]} \\
&\quad + j\,\frac{\omega C}{1 + (\omega C r_{CS})^2} - j\,\frac{1}{\omega L \big[1 + (r_{LS}/(\omega L))^2\big]}
\end{aligned} \qquad \text{(D.2)}$$

Let $Q_L = \dfrac{\omega L}{r_{LS}}$ and $Q_C = \dfrac{1}{\omega C r_{CS}}$, which are the inductor and capacitor quality factors, respectively. Then the admittance for such an RLC oscillator can be represented as

$$G(\omega) = \frac{1}{R_P} + \frac{1}{\big(Q_C/(\omega C)\big)\big[1 + 1/Q_C^2\big]} + \frac{1}{\omega L \big[Q_L + 1/Q_L\big]} + j\,\frac{\omega C}{1 + 1/Q_C^2} - j\,\frac{1}{\omega L \big[1 + 1/Q_L^2\big]} \qquad \text{(D.3)}$$

When $Q_L$ and $Q_C$ are sufficiently large, the equivalent admittance is approximated well by (D.4):

$$G(\omega) \approx \frac{1}{R_P} + \frac{1}{Q_C/(\omega C)} + \frac{1}{\omega L Q_L} + j\left(\omega C - \frac{1}{\omega L}\right) \qquad \text{(D.4)}$$

1 L2 Let RCP  QC /(C)  2 and RLP  LQL  , which are the equivalent parallel C rCS rLS resistances of the capacitor C and the inductor, respectively. Define the characteristic impedance as (D.5):

144

1 Z  o C (D.5) L  C

Also define the total effective parallel resistance in (D.6):

$$R_{eff} = R_P \,\|\, R_{CP} \,\|\, R_{LP} \qquad \text{(D.6)}$$

and define the total effective conductance as

$$G_{eff} = 1/R_{eff} \qquad \text{(D.7)}$$

Define the quality factor of the complete tank circuit, $Q_T$, as follows:

$$Q_T = R_{eff}/Z_o \qquad \text{(D.8)}$$

and define the equivalent quality factor of the parallel resistor $R_P$ as

$$Q_{R_P} = R_P/Z_o \qquad \text{(D.9)}$$

According to (D.6) and (D.8), the quality factor of the complete tank circuit is now equal to

$$Q_T = Q_{R_P} \,\|\, Q_C \,\|\, Q_L \qquad \text{(D.10)}$$

That is,

$$\frac{1}{Q_T} = \frac{Z_o}{R_P} + \frac{1}{Q_C} + \frac{1}{Q_L} \qquad \text{(D.11)}$$

Note that the quality factor of the complete tank circuit is mainly decided by the component with the smallest quality factor.

Now the tank can be represented by the equivalent circuit in Figure D.1 (b), where all losses are represented by equivalent parallel losses. The admittance of the LC tank now is

$$G(\omega) = \left(\frac{1}{R_P} + \frac{1}{R_{CP}} + \frac{1}{R_{LP}}\right) + j\left(\omega C - \frac{1}{\omega L}\right) = \frac{1}{R_{eff}} + j\left(\omega C - \frac{1}{\omega L}\right) \qquad \text{(D.12)}$$

Therefore, the transfer function of this LC tank is finally expressed as

$$H_2(\omega) = \frac{1}{G(\omega)} = \frac{1}{\dfrac{1}{R_{eff}} + j\left(\omega C - \dfrac{1}{\omega L}\right)} = \frac{1/R_{eff}}{1/R_{eff}^2 + \left(\omega C - \dfrac{1}{\omega L}\right)^2} - j\,\frac{\omega C - \dfrac{1}{\omega L}}{1/R_{eff}^2 + \left(\omega C - \dfrac{1}{\omega L}\right)^2} \qquad \text{(D.13)}$$

According to the Barkhausen criterion, oscillation occurs at the frequency where the loop transfer function becomes exactly equal to one. In other words, the imaginary part of the loop transfer function must be zero, which leads to (D.14):

$$\mathrm{Im}\{G(\omega)\} = \omega C - 1/(\omega L) = 0 \qquad \text{(D.14)}$$

Therefore, the oscillation frequency is

$$\omega_0 = 1/\sqrt{LC} \qquad \text{(D.15)}$$

To build an oscillator that operates the tank at this parallel resonance frequency, a negative transconductance $-G_M$ may be added in parallel to compensate the losses and to sustain the oscillation, as shown in Figure D.1(c); it is necessary for the loop transfer function to be exactly equal to one at that oscillation frequency.

Now the loop transfer function of the total tank with the negative transconductance is

$$H(\omega) = G_M H_2(\omega) = G_M\,\frac{1/R_{eff}}{1/R_{eff}^2 + \big(\omega C - 1/(\omega L)\big)^2} - j\,G_M\,\frac{\omega C - 1/(\omega L)}{1/R_{eff}^2 + \big(\omega C - 1/(\omega L)\big)^2} \qquad \text{(D.16)}$$


and its real part should be equal to one, once again according to the Barkhausen criterion:

$$\mathrm{Re}\{H(\omega_0)\} = G_M\,\frac{1/R_{eff}}{1/R_{eff}^2 + \big(\omega_0 C - 1/(\omega_0 L)\big)^2} = 1 \qquad \text{(D.17)}$$

Therefore, the negative transconductance $G_M$ guaranteeing the oscillation of the LC tank is

$$G_M = 1/R_{eff} \qquad \text{(D.18)}$$

To guarantee start-up under all conditions, the negative conductance is overdesigned by a factor of 1.5 to 3 (in this work, a factor of 2 is selected), so that $G_{MNR} = 2\,G_M$.

To evaluate the noise-to-carrier ratio of the oscillator, we compute the tank admittance at an offset frequency $\Delta\omega$ from the center frequency $\omega_0$:

$$G(\omega_0 + \Delta\omega) = \frac{1}{R_{eff}} + j(\omega_0 + \Delta\omega)C + \frac{1}{j(\omega_0 + \Delta\omega)L} = \frac{1}{R_{eff}} + \frac{j}{Z_0}\left[\left(1 + \frac{\Delta\omega}{\omega_0}\right) - \frac{1}{1 + \Delta\omega/\omega_0}\right] \qquad \text{(D.19)}$$

Assume that $\Delta\omega \ll \omega_0$; then $1/(1 + \Delta\omega/\omega_0) \approx 1 - \Delta\omega/\omega_0$ by the geometric series expansion, which results in

$$G(\omega_0 + \Delta\omega) \approx \frac{1}{R_{eff}} + \frac{j}{Z_0}\left[\left(1 + \frac{\Delta\omega}{\omega_0}\right) - \left(1 - \frac{\Delta\omega}{\omega_0}\right)\right] = \frac{1}{R_{eff}} + j\,\frac{2}{Z_0}\,\frac{\Delta\omega}{\omega_0} \qquad \text{(D.20)}$$

In the oscillator shown in Figure D.1(c), the effect of the real part of the tank admittance (i.e., the first term in the above equation) is cancelled by the negative conductance, so that only the second term remains.


D.2 LC-Tuned Oscillator Model

Another design example in this work, the LC-tuned oscillator in Figure D.2, is conducted with the proposed GP-based surrogate modeling and optimization strategy. There are in total 12 optimization design variables for the LC-tuned oscillator: the inductor diameter $D_L$, inductor width $W_L$, turn spacing $S$, number of turns $n$, transistor width $W_n$ and length $L_n$, maximum and minimum varactor capacitances $C_{Var,max}$ and $C_{Var,min}$, bias current $I_{bias}$, tank voltage swing $V_{swing}$, resonant frequency $\omega_{res}$, and maximum resonant frequency $\omega_{res,max}$ [116].

[Figure D.2 schematic: cross-coupled NMOS pair M1/M2 with tail device Mtail (current Itail), devices M3 and M4, inductors L, varactors CVar tuned by Vtune, and load capacitances CLoad at the outputs Vout+ and Vout-.]

Figure D.2: LC-tuned oscillator


D.2.1 Transistor Model

To analyze the small-signal characteristics of the above practical LC-tuned oscillator, the small-signal model of the involved NMOS transistor shown in Figure D.3 is used; it considers the impact of the gate-to-drain capacitance $C_{gd}$, gate-to-source capacitance $C_{gs}$, gate-to-bulk capacitance $C_{gb}$, drain-to-bulk capacitance $C_{db}$, source-to-bulk capacitance $C_{sb}$, output resistance $r_o$, and transistor transconductance $G_M$ [154].

[Figure D.3 schematic: NMOS small-signal model with capacitances Cgd, Cgs, Cgb, Cdb, and Csb, controlled sources GM·Vgs and gmb·Vsb, and output resistance ro, between terminals G, D, S, and the bulk b.]

Figure D.3: Small signal model of NMOS transistor

Then the small-signal model of the two cross-coupled NMOS transistors M1 and M2 is derived step by step, as shown in Figure D.4. In the further analysis procedures, the reduced model in Figure D.4 (c) is always used for this cross-coupled NMOS pair.


[Figure D.4 schematics, panels (a)-(c): step-by-step reduction of the cross-coupled NMOS pair small-signal model; panel (c) shows the reduced model with Cgd1, Cgd2, the combined capacitances Cdb1+Cgs2 and Cdb2+Cgs1, the controlled sources GM1·Vgs1 and GM2·Vgs2, and the output resistances ro1 and ro2.]

Figure D.4: Small signal analysis of cross NMOS couple

D.2.2 Spiral Inductor and Varactor Model

In this LC-tuned oscillator, two square spiral inductors (Figure D.5) are used. Their model is shown in Figure D.1(a); it is composed of the series resistance $r_{LS}$ (modeling the losses of a practical inductor), the series inductance $L$, and the parallel capacitance $C_p$, but with the parallel resistance $r_P = \infty$ and $r_{CS} = 0$ for the sake of the process file on hand. The model design variables of the inductor are the outer diameter $D_L$, turn width $W_L$, turn spacing $S$, and number of turns $n$ [155, 156].


Figure D.5: Square spiral inductor in LC-tuned oscillator (outer diameter $D_L$, inner diameter $D_{IN}$, turn width $W_L$, turn space $S$)

The involved model parameters (series inductance $L_S$, series resistance $r_{LS}$, and associated capacitance $C$) are calculated as follows. First, the associated capacitance is expressed with the following posynomial function:

$$C = 3.12 \times 10^{-5}\, n D_L W_L + 1.37 \times 10^{-10}\, n S W_L \quad (D.21)$$

The series inductance is modeled by

$$L_S = 1.66 \times 10^{-6}\, n^{2.07}\, W_L^{0.21}\, S^{0.104}\, D_L^{1.42}\, f^{0.023} \quad (D.22)$$

which is a monomial function of the design variables. The series resistance is given by the following monomial function:

$$r_{LS} = 4.35 \times 10^{-7}\, n^{1.32}\, W_L^{1.03}\, S^{1.31}\, D_L^{0.94}\, f^{0.82} \quad (D.23)$$

It is also assumed that the varactors used in the LC-tuned oscillator have some losses, modeled with a series resistance $r_{VS}$ to account for practical effects.
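For reference, the fitted inductor expressions (D.21)–(D.23) can be evaluated directly. The Python sketch below transcribes them as printed above; the coefficients and exponents follow the text, and the units of the design variables are assumed to match the process data used for the original fit, so the absolute numbers are only illustrative.

```python
def inductor_cap(n, D_L, W_L, S):
    """Associated capacitance of the spiral inductor, posynomial (D.21)."""
    return 3.12e-5 * n * D_L * W_L + 1.37e-10 * n * S * W_L

def series_inductance(n, W_L, S, D_L, f):
    """Series inductance, monomial (D.22)."""
    return 1.66e-6 * n**2.07 * W_L**0.21 * S**0.104 * D_L**1.42 * f**0.023

def series_resistance(n, W_L, S, D_L, f):
    """Series resistance, monomial (D.23)."""
    return 4.35e-7 * n**1.32 * W_L**1.03 * S**1.31 * D_L**0.94 * f**0.82
```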


D.2.3 LC-Tank Model

Now the complete small signal model of the LC-tuned oscillator, shown in Figure D.6, is derived by putting together the models of the three device blocks: the inductor, the varactor, and the cross-coupled NMOS pair [155].

Figure D.6: Small signal model of LC-tuned oscillator, combining the inductor model, varactor model, and cross-coupled NMOS pair model

Simply using standard properties of electrical circuits (series and parallel combinations of inductors, capacitors, resistors, and admittances), the small signal model of this oscillator can be reduced step by step, as in Figures D.7(a)–D.7(e). In the next section, all the equivalent lumped system parameters (such as oscillation frequency, characteristic impedance, and equivalent parallel resistances of the capacitor and the inductor) are obtained based on Figure D.7(e) [156].


Figure D.7: Step-by-step reduction of the small signal model of the LC-tuned oscillator, panels (a)–(e), ending in the lumped equivalents $L_T = 2L$, $C_T = (C_L + C_{Var} + C_{db} + C_{gs} + 4C_{gd})/2$, and $g_T = (g_P + g_{CV} + g_L + g_O - G_M)/2$

Now, combining this with the analysis of the general parallel resonant LC-tuned oscillator shown in Figure D.1, which converts each series resistance to a parallel resistance, some equivalent component values are derived in (D.24) [155, 156, 157, 158]:


L  2L'  C  CVar / 2  RLS  2rLS (D.24) R  2r  CS VS RP  2rP  2/ gP , where gP  1/ rP

Therefore, the oscillation frequency, characteristic impedance, inductor and capacitor quality factors, and equivalent parallel resistances of the inductor and the capacitor are

$$\omega_0 = \frac{1}{\sqrt{LC}} = \frac{1}{\sqrt{2L' \cdot C_{Var}/2}} = \frac{1}{\sqrt{L' C_{Var}}}$$

$$Z_0 = \omega_0 L = \frac{1}{\omega_0 C} = \frac{2L'}{\sqrt{L' C_{Var}}} = 2\sqrt{\frac{L'}{C_{Var}}}$$

$$Q_L = \frac{\omega_0 L}{R_{LS}} = \frac{\omega_0 \cdot 2L'}{2 r_{LS}} = \frac{\omega_0 L'}{r_{LS}}$$

$$Q_C = \frac{1}{\omega_0 C R_{CS}} = \frac{1}{\omega_0 (C_{Var}/2)(2 r_{VS})} = \frac{1}{\omega_0 C_{Var} r_{VS}}$$

The equivalent parallel resistance of the inductor is

$$\omega_0 L\, Q_L = \omega_0 \cdot 2L' \cdot \frac{\omega_0 L'}{r_{LS}} = \frac{2\,\omega_0^2 L'^2}{r_{LS}} = 2R_{LP} = 2/g_L$$

where $R_{LP} = \omega_0^2 L'^2 / r_{LS}$ and $g_L = r_{LS}/(\omega_0 L')^2$, and the equivalent parallel resistance of the capacitor is

$$\frac{Q_C}{\omega_0 C} = \frac{1/(\omega_0 C_{Var} r_{VS})}{\omega_0 C_{Var}/2} = \frac{2}{\omega_0^2 C_{Var}^2 r_{VS}} = 2R_{CP} = 2/g_{CV}$$


Here we define $R_{CP} = 1/(\omega_0^2 C_{Var}^2 r_{VS})$ and $g_{CV} = \omega_0^2 C_{Var}^2 r_{VS}$ for convenience.
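The series-to-parallel conversions above are easy to exercise numerically. In the following Python sketch, every component value is a hypothetical example chosen only to illustrate the formulas:

```python
import numpy as np

L_p = 1.0e-9     # L': per-side series inductance (H), assumed
C_var = 2.0e-12  # C_Var: varactor capacitance (F), assumed
r_LS = 3.0       # inductor series resistance (ohms), assumed
r_VS = 1.0       # varactor series resistance (ohms), assumed

w0 = 1.0 / np.sqrt(L_p * C_var)         # oscillation frequency (rad/s)
Z0 = 2.0 * np.sqrt(L_p / C_var)         # characteristic impedance
Q_L = w0 * L_p / r_LS                   # inductor quality factor
Q_C = 1.0 / (w0 * C_var * r_VS)         # capacitor quality factor
R_LP = (w0 * L_p) ** 2 / r_LS           # parallel inductor-loss resistance
R_CP = 1.0 / (w0**2 * C_var**2 * r_VS)  # parallel varactor-loss resistance
print(f"f0 = {w0 / (2 * np.pi) / 1e9:.2f} GHz, Q_L = {Q_L:.1f}, Q_C = {Q_C:.1f}")
```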

Therefore, the effective tank resistance $R_{eff}$ is

$$R_{eff} = R_P \,\|\, (2R_{CP}) \,\|\, (2R_{LP}) = (2r_P) \,\|\, (2R_{CP}) \,\|\, (2R_{LP}) = \frac{2}{g_P + g_{CV} + g_L}$$

And the effective tank conductance $G_{eff}$ is equal to

$$G_{eff} = \frac{1}{R_{eff}} = \frac{1}{2}\left(\frac{1}{r_P} + \frac{1}{R_{CP}} + \frac{1}{R_{LP}}\right) = \frac{g_P + g_{CV} + g_L}{2}$$

The lumped tank inductance $L_{Tank}$ is

$$L_{Tank} = 2L'$$

And the total oscillator inductance $L_T$ is also equal to

$$L_T = 2L' \quad (D.25)$$

The lumped tank capacitance $C_{Tank}$ is

$$C_{Tank} = \tfrac{1}{2} C_{Var} + \tfrac{1}{2} C_L + (C_{db1} + C_{gs2}) \,\|\, (C_{db2} + C_{gs1}) + C_{gd1} + C_{gd2} = \tfrac{1}{2} C_{Var} + \tfrac{1}{2} C_L + \tfrac{1}{2} C_{db} + \tfrac{1}{2} C_{gs} + 2 C_{gd} = \left(C_{Var} + C_L + C_{db} + C_{gs} + 4 C_{gd}\right)/2$$

(for matched devices, the series combination of the two node capacitances $C_{db}+C_{gs}$ gives $(C_{db}+C_{gs})/2$, while $C_{gd1}$ and $C_{gd2}$ appear directly across the tank).

The total oscillator capacitance $C_T$, thus, is

$$C_T = C_{Tank} + C_{load}/2 = \left(C_{Var} + C_L + C_{db} + C_{gs} + 4 C_{gd} + C_{load}\right)/2 \quad (D.26)$$


The lumped tank conductance $G_{Tank}$ is

$$G_{Tank} = G_{eff} + g_0/2 = (g_P + g_{CV} + g_L + g_0)/2 \quad (D.27)$$

The lumped oscillator conductance $G_T$ additionally accounts for the negative transconductance of the cross-coupled pair:

$$G_T = G_{Tank} - G_M/2 = (g_P + g_{CV} + g_L + g_0 - G_M)/2 \quad (D.28)$$
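The lumped expressions (D.25)–(D.28) can likewise be assembled from per-device values, as in the following sketch; every capacitance and conductance below is a hypothetical example. A negative $G_T$ indicates that the cross-coupled pair overcomes the total tank loss, consistent with the start-up condition derived earlier.

```python
# Hypothetical example values (farads and siemens, respectively).
C_var, C_L, C_db, C_gs, C_gd, C_load = (2.0e-12, 0.3e-12, 0.1e-12,
                                        0.2e-12, 0.05e-12, 0.5e-12)
g_P, g_CV, g_L, g_0, G_M = 0.0, 1.0e-3, 1.5e-3, 0.2e-3, 3.5e-3
L_p = 1.0e-9  # L'

L_T = 2.0 * L_p                                            # (D.25)
C_T = (C_var + C_L + C_db + C_gs + 4 * C_gd + C_load) / 2  # (D.26)
G_tank = (g_P + g_CV + g_L + g_0) / 2                      # (D.27)
G_T = (g_P + g_CV + g_L + g_0 - G_M) / 2                   # (D.28)
print(f"C_T = {C_T * 1e12:.2f} pF, G_T = {G_T * 1e3:.3f} mS")
```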

D.3 Design Specifications

In this section, all the involved performance specifications are derived in detail to formulate the LC-tuned oscillator optimization in GP format.

D.3.1 Power Dissipation

The power consumed by the oscillator is readily expressed as [155]

P VddIB (D.29)

Obviously, it is a monomial function, so both minimum and maximum constraints can be imposed on the dissipated power.

D.3.2 Tank Switching Voltage

The tank switching voltage of this oscillator is expressed in the form of (D.30) [155],

Vswing  minRT IB , Vdd (D.30) which can be re-expressed by (D.31) and (D.32):


$$V_{swing} \le R_T I_B \quad (D.31)$$

$$V_{swing} \le V_{dd} \quad (D.32)$$

Since the tank resistance $R_T$ is the reciprocal of the posynomial tank conductance $G_T$, (D.31) can be rewritten as $V_{swing} G_T I_B^{-1} \le 1$, a posynomial constraint, and (D.32) is obviously a monomial constraint.

D.3.3 Phase Noise

To analyze the phase noise of the implemented LC-tuned oscillator, all the important noise sources of the involved inductor, capacitor, and transistors are illustrated in Figures D.8(a) and D.8(b). The overall $1/f^2$ phase noise of this oscillator will then be calculated based on Figure D.8(b) [155, 156, 157, 158].

Figure D.8: Phase noise analysis of LC-tuned oscillator, (a) individual thermal noise sources of the inductor, parallel resistance, varactor, and transistors, (b) equivalent lumped tank ($L_{Tank}$, $C_{Tank}$, $g_{Tank}$) with the referred noise sources


In the $1/f^2$ region of the phase noise spectrum, the total single sideband phase noise spectral density, in dB below the carrier per unit bandwidth, due to the noise sources on one node at an offset frequency of $\Delta\omega$ ($= 2\pi f_{off}$) is given by [156]:

$$L\{\Delta\omega\} = \frac{\Gamma_{rms}^2}{2\, q_{max}^2\, \Delta\omega^2} \cdot \frac{\overline{i_n^2}}{\Delta f} \quad (D.33)$$

where the coefficient $\Gamma_{rms}^2 = 0.5$ if a sinusoidal impulse sensitivity function is assumed; also, $q_{max}$ is the total charge swing of the tank and is given by the monomial expression

$$q_{max} = C_T V_{swing} = \frac{V_{swing}}{L_T\, \omega_{res}^2} \quad (D.34)$$

Here $\overline{i_n^2}/\Delta f$ is the sum of the current noise densities of the individual noise sources: the transistor channel thermal noise ($\overline{i_{n,d}^2}/\Delta f$), the transistor gate noise ($\overline{i_{n,g}^2}/\Delta f$), the thermal noise generated by the parallel resistance of the inductor ($\overline{i_{r_P}^2}/\Delta f$), the inductor thermal noise ($\overline{i_{R_L}^2}/\Delta f$), and the varactor thermal noise ($\overline{i_{R_{VAR}}^2}/\Delta f$). These noise sources can all be represented by posynomial functions (PFs) of the design variables.

Among these noise sources, the transistor channel thermal noise can be expressed as [156]:

$$\frac{\overline{i_{M,d}^2}}{\Delta f} = \frac{1}{2} \cdot \frac{\overline{i_{M,nd}^2}}{\Delta f} \quad (D.35)$$

where

$$\frac{\overline{i_{M,nd}^2}}{\Delta f} = 4kT\gamma\, \mu C_{ox} \frac{W}{L}(V_{GS} - V_T) = 4kT\gamma\, g_m = 8kT\gamma\, \frac{I_{dn}}{V_{OV}} = 4kT\gamma\, \frac{I_{BIAS}}{V_{OV}}$$

for short channel transistor devices (the last step uses $I_{dn} = I_{BIAS}/2$, since each transistor of the pair carries half the bias current). Here $k$ is Boltzmann's constant, $T$ is the absolute temperature in Kelvins, and $\gamma = 2/3 \sim 1$ for long channel transistor devices, while $\gamma = 2 \sim 3$ for short channel devices.


And the transistor gate noise is calculated in the form of

$$\frac{\overline{i_{M,g}^2}}{\Delta f} = \frac{1}{2} \cdot \frac{\overline{i_{M,ng}^2}}{\Delta f} \quad (D.36)$$

where $\overline{i_{M,ng}^2}/\Delta f = 4kT\delta\, \omega_0^2 C_{gs}^2 V_{OV}/(5 I_B)$ for short channel transistor devices. Here $\delta = 2$, and $C_{gs}$ is the gate to source capacitance. Because $C_{gs}$ is given by a posynomial expression, (D.36) is also a posynomial function.

The inductor noise can be expressed as in (D.37) [156, 158]:

$$\frac{\overline{i_{R_L}^2}}{\Delta f} = 8kT\left(\frac{1}{r_P} + \frac{1}{\omega_0^2 L'^2 / r_{LS}}\right) = 8kT\left(\frac{1}{r_P} + \frac{r_{LS}}{\omega_0^2 L'^2}\right) \quad (D.37)$$

Since $r_P$, $L'$, and $r_{LS}$ are all given by monomial expressions, the above equation is a posynomial function of the design variables. (For the inductor used in this LC-tuned oscillator, $r_P \to \infty$, so only the second term remains, which is still a posynomial function of the design variables.)

The noise produced by the varactor can be represented by (D.38) [155]:

$$\frac{\overline{i_{R_{VAR}}^2}}{\Delta f} = \frac{8kT}{R_{CP}} = \frac{8kT}{1/(\omega_0^2 C_{Var}^2 r_{VS})} = 8kT\, \omega_0^2 C_{Var}^2 r_{VS} \quad (D.38)$$


where $R_{CP}$ is the equivalent parallel resistance of the varactor. The varactor noise is a monomial function of the design variables if we consider $r_{VS}$ as an extra design variable.

Now, because $\Gamma_{rms}$ is a constant, $q_{max}$ is a monomial in the design variables, and the noise source $\overline{i_n^2}/\Delta f$ is a posynomial function, the phase noise $L$ is consequently a posynomial function of the design variables. We can therefore optimize the phase noise of the LC-tuned oscillator by minimizing the phase noise at a given offset frequency $\Delta\omega_i$, treating it as the design objective.
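Because the phase noise is posynomial, it can serve directly as a GP objective. As a toy illustration only (written here with the open-source cvxpy package's geometric-programming mode, an assumption of this sketch rather than the solver used in this work), the following minimizes an invented posynomial stand-in for the phase noise under monomial bounds:

```python
import cvxpy as cp

x = cp.Variable(pos=True)  # stands in for a device dimension
y = cp.Variable(pos=True)  # stands in for a bias current

# Invented posynomial objective standing in for the phase-noise expression.
objective = cp.Minimize(2.0 * x**-1.0 * y**-1.0 + 0.5 * x * y**0.5)
constraints = [x <= 10.0, y <= 5.0, x * y >= 1.0]  # monomial bounds
prob = cp.Problem(objective, constraints)
prob.solve(gp=True)
print(f"optimal value = {prob.value:.4f}, x = {x.value:.3f}, y = {y.value:.3f}")
```

Substituting the actual posynomial phase-noise expression (D.33) and the constraints of this section would follow the same pattern.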

D.3.4 Resonant Frequency

We can impose a constraint on the maximum resonant frequency $\omega_{res,max}$. This constraint, together with a constraint on the tuning range, is equivalent to specifying the center resonant frequency. The tank resonant frequency is then expressed by [155]:

$$\omega_{res} = \frac{1}{\sqrt{L_T C_T}} \quad (D.39)$$

We can therefore impose a minimum required $\omega_{res,max}$ with the posynomial constraint $\omega_{res,max} \ge \omega_{res,max,req}$, which, by (D.39), can be written as $\omega_{res,max,req}^2\, L_T C_{T,min} \le 1$. This constraint is always active (i.e., it is practically an equality). If it were not, the inductor could contribute additional capacitance to the tank, which would translate into a higher $Q_L$.


D.3.5 Tuning Range

The tuning range is specified with two constraints:

$$\omega_{res,max} = \frac{1}{\sqrt{L_T C_{T,min}}} \ge \omega_{max}, \qquad \omega_{res,min} = \frac{1}{\sqrt{L_T C_{T,max}}} \le \omega_{min} \quad (D.40)$$

That is,

$$L_T C_{T,min} \le 1/\omega_{max}^2 \quad (D.41)$$

$$L_T C_{T,max} \ge 1/\omega_{min}^2 \quad (D.42)$$

where $C_{T,max} - C_{T,min} = C_{Var,max} - C_{Var,min}$. The last constraint, (D.42), is not posynomial and cannot be handled by a GP. The constraint $L_T C_{T,min} \le 1/\omega_{max}^2$ is always tight; if it were not tight, the oscillator would be operating at a higher frequency and the phase noise would be higher. Since (D.41) is always tight, we can also handle the constraint $L_T C_{T,max} \ge 1/\omega_{min}^2$ indirectly in the following way. Dividing (D.41) by (D.42) results in the inequality (D.43) [155]:

$$\frac{C_{T,min}}{C_{T,max}} \le \frac{\omega_{min}^2}{\omega_{max}^2} \quad (D.43)$$

Now let $k = \omega_{res,max}/\omega_{res,min}$. Substituting $C_{T,max} = C_{T,min} + C_{Var,max} - C_{Var,min}$ into (D.43) and rearranging yields

$$\frac{(k^2 - 1)\, C_{T,min}}{C_{Var,max}} + \frac{C_{Var,min}}{C_{Var,max}} \le 1 \quad (D.44)$$

Thus, we can substitute the previous non-posynomial constraint with the posynomial constraint (D.44). Since one constraint is tight, we can incorporate what at first seems to be a non-posynomial constraint.
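A quick numeric check confirms that the posynomial form (D.44) agrees with the original ratio constraint (D.43); the capacitance values and $k$ below are hypothetical examples.

```python
C_T_min = 1.5e-12
C_var_min, C_var_max = 0.8e-12, 2.4e-12
C_T_max = C_T_min + (C_var_max - C_var_min)  # relation stated in the text
k = 1.2                                      # w_res,max / w_res,min

lhs_d43 = C_T_min / C_T_max                                         # (D.43)
lhs_d44 = (k**2 - 1) * C_T_min / C_var_max + C_var_min / C_var_max  # (D.44)
print(lhs_d43 <= 1 / k**2, lhs_d44 <= 1.0)  # the two tests agree: True True
```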


Now define the tuning range as

$$T = \frac{\omega_{res,max} - \omega_{res}}{\omega_{res}} \times 100\% = \frac{\omega_{res} - \omega_{res,min}}{\omega_{res}} \times 100\%$$

where $\omega_{res} = (\omega_{res,max} + \omega_{res,min})/2$, so that

$$\omega_{res,max} = (1 + t\%)\,\omega_{res}, \qquad \omega_{res,min} = (1 - t\%)\,\omega_{res} \quad (D.45)$$

and hence

$$k = \frac{1 + T}{1 - T}, \qquad T = \frac{k - 1}{k + 1}$$

Then, assuming the tuning range is no less than $t\%$, we have

$$\frac{\omega_{res,max} - \omega_{res}}{\omega_{res}} \times 100\% \ge t\%$$

$$\omega_{res,max} \ge (1 + t\%)\,\omega_{res} = (1 + t\%)\cdot \frac{\omega_{res,max} + \omega_{res,min}}{2}$$

$$\frac{2}{1 + t\%} \ge 1 + \frac{\omega_{res,min}}{\omega_{res,max}} \;\Longrightarrow\; \frac{1 - t\%}{1 + t\%} \ge \frac{\omega_{res,min}}{\omega_{res,max}} = \frac{1}{k} \;\Longrightarrow\; k \ge \frac{1 + t\%}{1 - t\%} \quad (D.46)$$
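For example, a required tuning range of $t\% = 10\%$ gives $k \ge 1.1/0.9 \approx 1.22$; that is, the maximum resonant frequency must exceed the minimum by at least about 22%.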

D.3.6 Inverse Loop Gain

To have enough gain to compensate for the total loss in the LC tank during normal operation, it is necessary to impose a minimum constraint on the loop gain. That is, the loop gain should be higher than a predefined minimum loop gain, say $G_{min}$. In other words, we may place an upper bound on the inverse loop gain, which can be expressed as [155]:


$$\frac{g_T}{g_{m,n}} \le \frac{1}{G_{min}} \quad (D.47)$$

Since the tank conductance $g_T$ is posynomial, (D.47) is a posynomial constraint.

D.3.7 Varactor Tuning Range

In practice, it is necessary to limit the ratio of the maximum varactor capacitance to the minimum varactor capacitance, expressed as the maximum tuning ratio $K = C_{Var,max}/C_{Var,min}$. Equivalently, as represented by (D.48), we impose

$$C_{Var,max} \le K\, C_{Var,min} \quad (D.48)$$

which is clearly a monomial constraint [155].

D.3.8 Bias Condition

To make the oscillator operate at a normal bias operating condition, (D.49) should be met.

Vbias Vgs Vswing Vdd (D.49) which is a monomial constraint.

D.3.9 Size Constraints

For a real circuit design, there are naturally some limits on the device sizes in order to produce a chip with an acceptable area. First, we give an upper bound on the area of the oscillator, which can be expressed as (D.50):

2 DL  2WnLn  Amax (D.50)


which is a posynomial constraint. Second, we limit the device sizes, namely the width and diameter of the inductor and the width and length of the transistor, in (D.51):

WL,min  WL  WL,max  DL,min  DL  DL,max (D.51) W  W  W  n,min n n,max  Ln,min  Ln  Ln,max which are obviously some monomial constraints.

The above analysis of the LC-tuned oscillator can be used to formulate a GP-compatible circuit model and to carry out its optimization design with the proposed surrogate modeling and optimization strategy, thereby verifying the effectiveness and viability of the proposed modeling and optimization algorithm.
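To illustrate how such a GP-compatible model is handed to a solver, the following sketch assembles a toy problem in the style of the constraints above, again using cvxpy's geometric-programming mode (an assumption of this sketch). The monomial "models" for $g_T$ and $g_m$ and the noise objective are invented stand-ins with made-up coefficients, not the fitted oscillator model of this appendix.

```python
import cvxpy as cp

W_L = cp.Variable(pos=True)  # inductor turn width
D_L = cp.Variable(pos=True)  # inductor outer diameter
I_B = cp.Variable(pos=True)  # bias current
Vdd = 1.8                    # supply voltage, fixed

g_T = 0.05 * W_L**-0.5 * D_L**-0.5         # invented monomial tank conductance
g_m = 2.0 * I_B**0.5                       # invented monomial transconductance
noise = I_B**-1.0 * D_L**-1.0 + 0.1 * W_L  # invented posynomial objective

constraints = [
    Vdd * I_B <= 10.0,        # power budget: monomial, as in (D.29)
    3.0 * g_T <= g_m,         # loop gain of at least 3, as in (D.47)
    D_L**2 <= 400.0**2,       # crude area-style bound, as in (D.50)
    W_L >= 2.0, W_L <= 20.0,  # monomial size limits, as in (D.51)
    D_L >= 50.0,
]
prob = cp.Problem(cp.Minimize(noise), constraints)
prob.solve(gp=True)
print(f"noise proxy = {prob.value:.4f}, "
      f"W_L = {W_L.value:.2f}, D_L = {D_L.value:.1f}")
```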
