IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 53, NO. 4, APRIL 2006 879

Model-Order Reduction Using Variational Balanced Truncation With Spectral Shaping

Payam Heydari, Member, IEEE, and Massoud Pedram, Fellow, IEEE

Abstract—This paper presents a spectrally weighted balanced truncation (SBT) technique for tightly coupled integrated circuit interconnects, when the interconnect circuit parameters change as a result of statistical variations in the manufacturing process. The salient features of this algorithm are the inclusion of the parameter variation in the RLCK interconnect, the guaranteed passivity of the reduced transfer function, and the availability of provable spectrally weighted error bounds for the reduced-order system. This paper shows that the variational balanced truncation technique produces reduced systems that accurately follow the time- and frequency-domain responses of the original system when variations in the circuit parameters are taken into consideration. Experimental results show that the new variational SBT attains, on average, 30% more accuracy than the variational Krylov-subspace-based model-order reduction techniques.

Index Terms—Balanced truncation, interconnect, Lur'e equations, model-order reduction, passivity, process variation, spectral shaping.

Manuscript received June 22, 2004; revised July 24, 2005. This paper was recommended by Associate Editor M. Stan.
P. Heydari is with the Department of Electrical Engineering and Computer Science, University of California, Irvine, CA 92697-2625 USA (e-mail: [email protected]).
M. Pedram is with the Department of Electrical Engineering Systems, University of Southern California, Los Angeles, CA 90089 USA (e-mail: [email protected]).
Digital Object Identifier 10.1109/TCSI.2005.859772

I. INTRODUCTION

MODEL reduction techniques enable circuit designers to capture the interconnect effects with a much shorter computational time than that required for simulation of the full circuit. On the other hand, as the minimum feature sizes shrink to the subquarter microns, geometrical variations in the line width, metal height, and dielectric thickness due to process variations have more pronounced effects on the reliability and performance of VLSI circuits [1]. As a consequence, it is crucial to assess the impact of these process variations on model-order reduction techniques.

Among various classes of model reduction techniques, explicit moment-matching algorithms (asymptotic waveform evaluation (AWE) [2], rapid interconnect circuit evaluation (RICE) [3]) and Krylov-subspace-based methods (PACT [4], Padé approximation via the Lanczos process (PVL) [5], passive reduced-order interconnect macromodeling algorithm (PRIMA) [6]) have been most commonly employed for generating the reduced-order models of the interconnects. The computational complexity of these model-order reduction techniques is primarily due to the matrix-vector products. However, these methods do not provide a provable error bound for the reduced system. Extensions of explicit moment-matching techniques have been proposed recently to reduce linear time-varying (LTV) as well as nonlinear dynamic systems [7], [8].

An alternative model reduction technique is the balanced realization [9]–[11]. While being widely investigated in control system theory, balanced realization techniques have not received the same attention as Krylov-based and Padé-based model reduction techniques in the context of interconnect analysis, partly because these methods involve computation-intensive algorithms [9], [11], [12]. Furthermore, it is well known that the balancing transformation may be poorly conditioned when the system is nearly uncontrollable or unobservable [9]. The most significant drawback of the conventional balanced realization techniques lies in their inability to guarantee a passive reduced-order system. However, the balanced truncation technique can achieve smaller reduced-order models with a better error control than those obtained using Krylov-subspace-based order reduction techniques.

Recently, in [13], the authors developed a guaranteed passive balancing transformation for the model-order reduction of large linear time-invariant (LTI) systems.

The balanced realization-based model reduction methods provide an error bound between the transfer function of the original system and that of the reduced system. The bottleneck in balanced truncation methods is the computational complexity of solving the Lyapunov equations. Reference [10] uses the truncated balanced realization technique as well as the Schur decomposition to develop an efficient numerical method for the order reduction of a large LTI system. References [11] and [12] propose efficient algorithms to solve the two Lyapunov equations in order to obtain the controllability and the observability grammians. The algorithms are based on the alternating direction implicit (ADI) method that was first proposed in [14], [15].

A shortcoming of these model reduction techniques is that they do not reshape the frequency spectrum to emphasize error minimization in some frequency range of interest. Furthermore, they do not address the numerical difficulties when the system is nearly uncontrollable or unobservable. In [16]–[18], a new numerically stable, frequency-weighted balanced truncation technique was presented. The proposed method gives a definite a priori bound on the weighted error, and is guaranteed to be stable even when both input and output weightings are utilized at the same time.

Analyzing the interconnect without taking into account the rather large variations of the interconnect geometries is not useful in practice. These variations are especially large in the inter-layer dielectric (ILD) thickness and the metal line width

and height. These process-dependent geometrical variations have a definite impact on the total line and inter-wire coupling parasitics, which in turn results in variations in the signal delay and the coupling noise. In [19], Liu et al. studied the effect of interconnect parameter variations on the Krylov-subspace model-order reduction techniques. That paper adds variability to the Krylov-subspace-based model reduction methods [4], [6] by borrowing some ideas from matrix perturbation theory [20]. The authors allow two-dimensional variations on the projection matrices. To compute the corresponding sensitivities of the susceptance and conductance matrices to each dimensional variation, some sample points were picked and the dominant eigenvalues/eigenvectors were calculated.

The goal of the present paper is to study the effects of process variations and spectral shaping on model-order reduction using an extended balanced truncation technique and to propose an efficient order reduction technique that includes these effects. The main contribution of this paper is the use of the spectrally weighted balanced truncation (SBT) method proposed in [16]–[18] combined with a new variational balanced truncation approach that accounts for process variations, resulting in a new variational SBT method [21]. The works of [18], [19], and [21] have laid the groundwork for interesting research on various extensions of the balanced truncation method (e.g., [13]) and model-order reduction of parameterized interconnects (e.g., [22]). To preserve passivity, the proposed algorithm employs the results of [13]. As future work, we will investigate a systematic way of determining the input/output weighting functions for the SBT method.

This paper is organized as follows. Section II gives a brief overview of the balanced realization technique. Section III describes the frequency-weighted model reduction proposed in [16]–[18]. A discussion about the effect of process variations on interconnect modeling is provided in Section IV. Also, in Section IV, the variational balanced truncation with spectral shaping is illustrated, and a theoretical comparison between the proposed algorithm and the work presented in [19] is provided. Section V discusses the experimental results, including the comparison between the proposed model reduction technique and different model-order reduction techniques in [6] and [13]. The accuracy of the proposed order reduction technique in generating low-order reduced systems for electromagnetically coupled global interconnect and an H-tree clock distribution network is examined. Finally, Section VI presents the concluding remarks for this paper.

II. OVERVIEW OF BALANCED REALIZATION

Consider the state-space representation of an LTI system

$\dot{x}(t) = A x(t) + B u(t)$ (1)
$y(t) = C x(t) + D u(t)$ (2)

where $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{m \times n}$, and $D \in \mathbb{R}^{m \times m}$. $x(t)$ represents the state vector of the system, $n$ denotes the order of the system, and $m$ represents the size of the input vector. In a general $n$th-order RLC circuit, for example, the state vector constitutes node voltages across circuit capacitances, voltage sources, currents flowing through inductors, and current sources. The system matrix $A$ is readily obtained from the conductance and susceptance matrices of the RLC circuit. Similar to [5], the representation of $A$ with respect to the susceptance and conductance matrices is carried out around an expansion point $s_0$, chosen such that the matrix to be inverted becomes nonsingular. The goal of the model-order reduction is to obtain a similar lower order system

$\dot{x}_r(t) = A_r x_r(t) + B_r u(t)$ (3)
$y_r(t) = C_r x_r(t) + D_r u(t)$ (4)

where $A_r \in \mathbb{R}^{r \times r}$, $B_r \in \mathbb{R}^{r \times m}$, $C_r \in \mathbb{R}^{m \times r}$, $D_r \in \mathbb{R}^{m \times m}$, and $r$, the order of the reduced system, is much smaller than that of the original system. The balanced truncation technique carries out the order reduction while providing an error bound between the transfer functions of the original and the reduced systems.

Central to the state-space description and balanced truncation of dynamic systems are the controllability and observability grammians. The key functions in formulating the passivity-preserving balanced truncation of linear systems are the positive-definite controllability and observability grammians, $W_c$ and $W_o$, of the system, which are obtained by solving the Lur'e equations [13], i.e.,

(5)

(6)

The controllability grammian is a measure of how much the input energy is coupled to the states of the system. The observability grammian is a measure of how the states and the outputs are coupled to each other. The controllability and observability grammians give some interesting insights about the system characteristics. It can be proved (cf. [9]) that there exists a similarity transformation matrix $T$ that maps any given representation of a system to a balanced realization such that the controllability and observability grammians of the new system, $\bar{W}_c$ and $\bar{W}_o$, are equal and diagonal, i.e.,

$\bar{W}_c = \bar{W}_o = \Sigma$ (7)

where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$. Calculating this diagonal matrix, whose diagonal elements represent the singular values $\sigma_i$ associated with each state variable $x_i$, $i = 1, \ldots, n$, makes it possible to find a criterion for evaluating the possibility of eliminating $x_i$ in the reduced model while preserving the time- and frequency-domain characteristics of the original system. The matrix $\Sigma$ is partitioned into two submatrices

$\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}$ (8)

where $\Sigma_1 = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$, $\Sigma_2 = \mathrm{diag}(\sigma_{r+1}, \ldots, \sigma_n)$, and the new coordinate transformed system is also partitioned in conformity with $\Sigma$ as

$\bar{A} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad \bar{B} = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad \bar{C} = \begin{bmatrix} C_1 & C_2 \end{bmatrix}$ (9)

The reduced-order model based on $(A_{11}, B_1, C_1, D)$ is stable and the error is bounded by

(10)

Reference [13] has proved that the reduced transfer function obtained using this method is actually positive-real, and therefore, the reduced-order model is passive. According to (10), the error bound is similar to the twice-the-sum-of-the-tail formula obtained for the conventional balanced truncation method presented in [9]–[12]. The balanced realization relies on the calculation of the controllability and observability grammians of the original system. The Lur'e equations are solved in order to obtain the system grammians. The procedure to calculate the grammians effectively involves applying the Cholesky factorization [10]

(11)

and diagonalizing the resulting matrix. A balancing transformation, $T$, which maps the original system to the balanced realization form, is obtained as follows:

(12)

It is worth mentioning that balanced truncation can be applied directly to the matrix differential equation obtained from modified nodal analysis (MNA) [13].

This algorithm, however, involves solving a generalized eigenvalue problem whose dimension is as large as the order of the original transfer function [9]. More precisely, the Lur'e equations (which are similar in form to the Lyapunov equations) still need to be solved for the mapped system in order to obtain the coordinate-mapped controllability and observability grammians, a task that is computationally quite expensive.

The problem of solving a high-order set of matrix equations for the eigenvalues and eigenvectors of the system can be avoided by using Krylov subspace-based methods. In fact, studying the algorithm proposed by [23] shows that it is only necessary to find the first $r$ largest eigenvalues of the grammian product $W_c W_o$ and their corresponding left and right eigenvectors. Based on this observation, a modified version of Safonov's algorithm is used in [10]. More precisely, in [10], the Arnoldi algorithm is utilized to compute the largest eigenvalues and the corresponding left and right eigenvectors. The matrix involved is large, symmetric, and positive definite. The underlying problem is to efficiently obtain the first $r$ largest eigenvalues of a symmetric matrix. The Lanczos method is utilized to efficiently calculate this subset of the system eigenvalues. The Lanczos algorithm reduces the original large symmetric matrix to a smaller tridiagonal matrix. The algorithm involves successively filling in the columns of two matrices whose column vectors span the corresponding Krylov subspaces.

The columns of the two projection matrices form bases for the respective right and left eigenspaces of the Krylov subspace in the Lanczos method, associated with their "big" eigenvalues. These matrices are used as bases for the relevant eigenspaces in the derivation of the reduced-order model.

A shortcoming of the conventional balanced truncation techniques proposed in [9], [11]–[13] is that they do not reshape the frequency spectrum to emphasize error minimization in some frequency range of interest. Our preliminary experiments show that the error between the transfer function of the actual interconnect system and the transfer function of the reduced-order system obtained using any conventional order reduction technique is frequency-dependent, and increases with frequency, which is undesirable. Fig. 1 visualizes this observation, where the energy of the error arising from order reduction of the original system is frequency dependent.

Fig. 1. Norm of the error versus frequency.

Section III describes a numerically stable, frequency-weighted balanced truncation technique proposed by [16]–[18].

III. BALANCED TRUNCATION WITH SPECTRAL SHAPING

So far, it has been observed that balanced realization is an attractive model reduction technique due to the fact that it provides an a priori error bound for the reduced-order system. As mentioned above, the frequency dependence of the error between the reduced-order and the original system transfer functions is important in many applications. In other words, the error should be small in one or more frequency ranges of interest,

while it can be larger in other ranges, depending on the application. The balanced truncation technique is extended to include weighting on the input and/or output, as shown in Fig. 2 [16]–[18].

Fig. 2. Block diagram of the original LTI system along with input and output weightings.

To determine the state-space characteristics of the new augmented system, two basic questions must be addressed (cf. Fig. 2).
1) What set of points in the state space could be a part of the zero-initial-condition response to the weighted input?
2) What set of points in the state space, taken as initial conditions, could produce a weighted output?

Consider the state-space representation of a set of tightly coupled RLC interconnects given by (1) and (2). The goal of the frequency-weighted balanced realization technique is to calculate a reduced transfer function of degree $r$, while minimizing the weighted error

(13)

To obtain such a reduced system, the grammians of the augmented system must first be calculated, and the same steps that are used for a unity-weighted system should then be employed. Most importantly, the weighting functions are chosen to be positive-real functions. Note that the purpose of reshaping the frequency response is to minimize the error of the reduced-order system. The weighting functions should thus: 1) be simple rational functions in the s-domain, so that both the frequency- and time-domain behavior of the augmented system can be easily studied, and 2) be synthesizable using simple low-order passive RLCK circuits, in case the actual realization is of interest.

First, we write the Laplace transforms of the input and output weighting functions, which are chosen to be stable functions [16]

(14)
(15)

According to the definition in [24], the controllable subspace of the augmented system is the solution set to the first question. A controller-form realization of the augmented system is as follows:

(16)

Similarly, according to the definition in [24], the observable subspace of the augmented system is the solution set to the second question. An observer-form realization of the augmented system is as follows:

(17)

For a complete explanation of controller and observer form realizations, see [24, chapter 3]. Since all controllable and observable modes of these two augmented systems are determined by the upper left corner submatrices of the augmented grammians, the desired controllability and observability grammians are thus given by the corresponding upper left corner submatrices

(18)

where the augmented grammians must satisfy the following Lur'e equations:

(19)

(20)

The passivity of the original system and the positive-realness of the weighting functions guarantee the existence of matrices which satisfy (19) and (20) [13]. Expanding the upper left corner block of the Lur'e equations yields

(21)

(22)

We define the new variables

(23)
(24)

It is readily seen that the matrices defined in (23) and (24) are symmetric. As a consequence, there exist orthogonal matrices and diagonal matrices such that

(25)
(26)

where the diagonal matrices contain the corresponding eigenvalues. Using the ranks of the matrices in (23) and (24), we can write

(27)

(28)

Now, suppose that the rank conditions associated with (27) and (28) hold, and moreover, consider the solutions of the Lur'e equations (21) and (22), which are both positive semi-definite [13]. The similarity transformation that simultaneously diagonalizes these two solutions is thus as follows:

(29)

Similar to the unit-weighted balanced truncation method, this transformation matrix is used to map the original system to a new coordinate transformed system for which the controllability and observability grammians are diagonal and identical. The reduced-order system is then obtained from the transformed system. Most importantly, the reduced-order system is passive, because the transformation has been constructed using the positive semi-definite grammians. The balanced truncation algorithm based on the coordinate transformed system preserves the positive-realness of the transfer function, and hence preserves passivity. Note that the transformation contains the characteristics of the weighting functions. The input and output weightings are determined based on the range of frequencies where the maximum accuracy is desired. The weighting functions should emphasize the frequency ranges where more accuracy is required. Similarly, they must de-emphasize the range of frequencies where the noise resulting from the order reduction has very minimal energy or is out of the desired frequency band.

IV. VARIATIONAL SPECTRALLY WEIGHTED BALANCED TRUNCATION

Due to process variations, interconnect technology parameters vary substantially. These parameters can have as much as a 30% variation from their nominal values [1]. Therefore, the effect of the process variations on the interconnect delay and crosstalk should be taken into consideration. A common approach to anticipate these variations in the design is the conventional skew-corner, worst case modeling. This method, however, is too conservative because the probability of all 3-$\sigma$ process corner values occurring simultaneously is very small. As a consequence, statistically based worst case interconnect models using Monte Carlo simulation have been proposed [25], [26]. These approaches, however, fail to handle large circuits that exist in reality. To alleviate the problem of large computational complexity (as also mentioned in [19]), the effect of process variations must be taken into account in model-order reduction algorithms. Furthermore, the resulting variational reduced-order model needs to converge to the reduced-order model of the nominal network when all the parameter variations are zero.

Characterization of the interconnect geometry variation is an important issue in deep-submicron VLSI technology. In order to accurately assess the performance of a complex interconnect structure, it is essential to characterize the interconnect geometry, which in turn specifies the interconnect parasitics [19]. From a designer's point of view, one important source of integrated circuit (IC) performance variability is the physical source of variability [27]. The manufacturing process variations can be observed at two levels of IC fabrication. The variation at the first level of IC fabrication includes the case where the interconnect (or device) parameters are constant within a die but vary within a wafer or a lot. The process variation at the second level results in variations of the device and interconnect parameters within the die. Fortunately, the intra-die variations and inter-die variations are two uncorrelated processes, which simplifies the mathematical formulations. The intra-die variations have a low spatial frequency, by nature. Therefore, the proposed variational balanced truncation (even without spectral shaping) can handle the intra-die variation accurately. This will reduce the complexity of our algorithm. Details about the physical imperfections which lead to each of these variations are beyond the scope of this paper (see [25]–[27]). Moreover, the physical variations induced by nonidealities of the manufacturing process manifest themselves as electrical variations. For instance, for a given on-chip metal wire, a within-the-die width variation, a within-the-wafer width variation, a within-the-die height variation, and a within-the-wafer height variation will result in nonzero offsets for the parasitic elements associated with that metal wire, relative to the zero-offset parasitic resistance, capacitance, and inductance per unit length. In general, for the susceptance and conductance matrices of the interconnect system that are exposed to the process variations, we have

(30)
(31)

To obtain a balanced truncation technique that takes the process variations into account, we should first find the new system matrix (the so-called perturbed system matrix) of coupled interconnects affected by process variation with respect to the ideal system matrix and the perturbed susceptance and conductance matrices. Lemma 1 helps us obtain the relationship among the perturbed system matrix, the original system matrix, and the perturbed susceptance and conductance matrices.

Lemma 1: Given an LTI system whose state-space representation is characterized by (1) and (2), let the susceptance and conductance matrices vary according to (30) and (31). For sufficiently small variations,

(32)

where

(33)

where the expansion point is arbitrary, but fixed, and chosen such that the matrix to be inverted becomes nonsingular.

Proof: Starting with the state-space representation of the perturbed system, the perturbed system matrix is derived by substituting (30) and (31) in the closed-form expression of the system matrix, i.e.,

(34)

Expanding the resulting inverse in a Laurent series and neglecting higher order terms for small variations directly yields the desired equation.

Lemma 1 enables one to obtain the relationship between the perturbed system matrix and the original one as well as the incremental variations of the susceptance and conductance matrices.

The balanced realization approach directly utilizes the balancing transformation to project the existing system to a new system whose controllability and observability grammians are identical and diagonal. The diagonal elements readily represent all the singular values corresponding to the state variables of the system. The $r$th-order truncated balanced realization is then obtained by considering the $r$ largest singular values. This approach provides insightful information about the energy exerted by each state variable, and thus, the contribution of each state variable to the external behavior of the system. From a mathematical viewpoint, the Hankel singular values of the system transfer function are indeed the eigenvalues of the symmetric matrix formed from the grammians of the passivity-guaranteed system. To capture the effect of interconnect parameter variations on the observability and controllability grammians of the system, the following theorem is introduced.

Theorem 1: Consider a passive LTI system with the state-space representation given by (1) and (2). Suppose that the system matrix is perturbed as in Lemma 1. The controllability and observability grammians of the perturbed system are approximately equal to

(35)
(36)

where the congruence transformation is determined by the perturbation of the system matrix.

Proof: A complete proof is provided for (35). Equation (36) can be proven using a similar approach. The positive-definite controllability grammian is the solution to the Lur'e equations given by (5) and (6). Due to the similarity between the Lur'e equation and the classical Lyapunov equation used in the conventional balanced truncation method, the grammian will thus have the same mathematical form as the solution to the corresponding Lyapunov equation, i.e.,

(37)

The matrix exponential can be expressed in terms of its matrix-valued power series using the Cayley-Hamilton theorem [28]. In fact, according to the Cayley-Hamilton theorem, there exist analytical scalar functions, $\alpha_k(t)$, such that

$e^{At} = \sum_{k=0}^{n-1} \alpha_k(t) A^k$ (38)

Replacing the exponential function and its transpose in (37) with their corresponding equivalent finite-series representations in (38) yields an expression in powers of the system matrix. Replacing the system matrix in that expression with the perturbed system matrix, and performing matrix factorizations, proves the validity of (35). It can also be proven that (35) and (36) indeed provide upper bounds for the perturbed grammian matrix values.

According to Theorem 1, any perturbation in the system matrix manifests itself as a congruence transformation that maps the observability and controllability grammians of the system to the ones for the new perturbed system, as demonstrated by (35) and (36). Moreover, having positive-definite nominal grammians automatically results in positive-definite perturbed grammians, because the latter matrices are congruent to the former ones and positive-realness is retained by congruence transformations (the proof is straightforward).

To account for the effect of process variations in the proposed model-order reduction technique, Theorem 1 is directly utilized in the new algorithm. Moreover, the proposed technique also reshapes the estimated error in the frequency domain using the frequency weighting method described in Section III, and thus creates a reduced-order system with more accurate time- and frequency-domain responses than a Krylov-based order reduction technique. Reshaping the spectrum of the estimated error is undertaken by introducing input and/or output weighting functions that appropriately minimize the error in the regions where there is a large discrepancy between the spectrum of the original system and that of the reduced system. Comparing the proposed algorithm with the work in [19] on variational Krylov-subspace model reduction, the key advantages of the variational truncated balanced realization are a superior accuracy and a proven error bound. Under specific assumptions set forth by perturbation theory [20], one can theoretically calculate the projection matrix of the perturbed system. Nevertheless, as also pointed out in [19], it is impractical to use perturbation theory. Instead, the method proposed by [19] involves the calculation of the first-order expansion of the Krylov subspace, whose projection matrix is expressed in terms of two dimensional variations and a set of coefficient matrices. The coefficient matrices are to be computed by choosing a set of sample points. A variational reduced-order model can then be constructed by inserting the resulting variational Krylov subspace into the PRIMA equations. The calculation of the coefficient matrices on a limited number of sample points will lead to additional errors, which may become significant in long interconnects carrying high-frequency signals.

There are, however, some problems that need to be taken into account.

• We still need to solve the Lyapunov equations to obtain the grammians of the system. To solve them efficiently, an iterative Lyapunov equation solver, vector ADI (VADI), presented in [11] and [12], is utilized. The VADI algorithm was developed to provide a low-rank approximation to the solution of the Lyapunov equations, and it is as inexpensive as Krylov-subspace-based moment-matching methods. Reference [12] discusses the number of required iterations and analyzes the computational complexity of the VADI method.
• To account for the effect of process variations on the interconnect performance, Theorem 1 is directly employed. However, (35) and (36) involve a matrix inversion operation. To avoid forming the inverse explicitly, the problem is cast as the solution of a set of linear algebraic equations [20].

Now, we proceed with describing the new procedure for variational balanced truncation.

Procedure 1.
Given , , :
1. Using Lemma 1, determine matrix .
2. Using Theorem 1, compute the perturbing congruence transformation, .
3. Using the VADI iteration, compute the controllability and observability grammians of and of the original spectrally weighted system by solving the Lur'e equations (19) and (20).
4. Calculate the new perturbed controllability and observability grammians, and , using (35) and (36).
5. Using (23) and (24), compute and from and .
6. Decompose and using the eigenvalue decomposition technique into and .
7. Using (27) and (28), compute and .
8. Using the VADI iteration, solve the mapped Lur'e equations (21) and (22) for the new perturbed system to compute and .
9. Using the block Lanczos algorithm, obtain the reduced-order left and right eigen-matrices, and , associated with the largest singular values of .
10. Let and compute the singular-value decomposition of , i.e., .
11. Let and
12. Compute the reduced-order state-space realization using the matrices and as follows:

(39)

Notice that, to guarantee passivity, the Lur'e equations must be solved instead of the Lyapunov equations, leading to a slight modification of the above algorithm.

Complexity Issue of Procedure 1

In the second step, matrix must be calculated. This matrix is obtained by using Theorem 1, where the inverse of the system matrix is encountered. To avoid the pitfall of explicit inverse computation, the expression is cast as a linear equation. To efficiently calculate the mapping transformation, , the generalized minimum residual (GMRES) method is utilized [20]. To avoid excessive computational complexity and run-time, a restart strategy has been employed. In this strategy, iterations of the GMRES algorithm were performed, and the solution of the last iteration was used as the initial vector for the next GMRES sequence. The complexity of iterations is .

In the third step, a modified version of the VADI method presented in [11] and [12] is used to efficiently solve the Lur'e equations. The VADI algorithm provides a low-rank approximation to the solution of the Lyapunov equation by obtaining the dominant eigenspaces of the controllability and observability grammians. The similarity of the Lur'e equation to the Lyapunov equation makes it possible to refine the VADI algorithm proposed in [11] and [12] to efficiently solve the Lur'e equation. The total number of matrix-vector solves in all three examples of the experimental results was the same as that of the Krylov-subspace-based Arnoldi method.

In Step 4, the idea of spectral shaping is implemented to further enhance the accuracy. We apply the perturbing congruence transformation on the upper left corner blocks of and specified by matrices and in (18), and construct new perturbed controllability and observability grammians, and . Recall that both grammians and are symmetric, thereby reducing the complexity of the matrix product. Theorem 1 makes it possible to derive the perturbed grammians of the frequency-weighted system without solving Lyapunov equations (in our case, Lur'e equations). Step 5 involves computation of symmetric matrices and with the same computational complexity as the perturbed grammians.

In Step 6, matrices and are diagonalized. Notice that and solely depend on the input/output vectors of the original system and the weighting functions. The dimensions of these matrices are determined by the number of input/output ports, which are considerably smaller than the order of the system. For example, for a single-input single-output system, the order is one. As a consequence, the eigenvalue decomposition is performed efficiently. In Step 7, similar to Step 6, we build matrices whose dimensions are determined by the input/output matrices of the system and the weighting functions.

Steps 8–12 basically implement the conventional balanced truncation technique. The Lur'e equations are efficiently solved by using the modified version of the VADI algorithm. Since the largest eigenvalues are to be computed for the th-order reduced-order model, we employ the block Lanczos algorithm in Step 9 to efficiently calculate the largest singular values of the matrix product . The complexity of this step is therefore determined by the order of the reduced-order model. Steps 10–12 calculate the singular-value decomposition (SVD) of the Cholesky products. In doing so, matrices and are comprised of column vectors that constitute the orthonormal bases for the left and right eigenspaces of . This notion reduces the computational cost of the SVD.
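Steps 10–12 have the familiar square-root form of balanced truncation. A minimal Python sketch of that final stage is given below under simplifying assumptions that are not the paper's: the grammians come from dense Lyapunov solves (SciPy's `solve_continuous_lyapunov`) rather than from the VADI iteration on the Lur'e equations, so passivity is not guaranteed, and neither spectral weighting nor parameter perturbation is applied. The function name `balanced_truncation` and the four-state test system are hypothetical.

```python
import numpy as np
from scipy.linalg import cholesky, solve_continuous_lyapunov, svd


def balanced_truncation(A, B, C, r):
    """Reduce a stable system (A, B, C) to order r by square-root balanced truncation."""
    # Controllability grammian P solves A P + P A^T = -B B^T.
    P = solve_continuous_lyapunov(A, -B @ B.T)
    # Observability grammian Q solves A^T Q + Q A = -C^T C.
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)
    # Symmetrize against round-off, then factor P = Lp Lp^T and Q = Lq Lq^T.
    Lp = cholesky((P + P.T) / 2, lower=True)
    Lq = cholesky((Q + Q.T) / 2, lower=True)
    # SVD of the Cholesky product; its singular values are the Hankel singular values.
    U, hsv, Vt = svd(Lq.T @ Lp)
    scale = np.diag(hsv[:r] ** -0.5)
    T_right = Lp @ Vt[:r].T @ scale   # n x r right projection
    T_left = Lq @ U[:, :r] @ scale    # n x r left projection; T_left^T T_right = I
    return T_left.T @ A @ T_right, T_left.T @ B, C @ T_right, hsv


# Hypothetical 4-state stable test system (not taken from the paper).
A = np.diag([-1.0, -2.0, -3.0, -4.0])
B = np.ones((4, 1))
C = np.ones((1, 4))
r = 2

Ar, Br, Cr, hsv = balanced_truncation(A, B, C, r)

# DC gains G(0) = -C A^{-1} B, using linear solves instead of explicit inverses.
dc_full = (-C @ np.linalg.solve(A, B)).item()
dc_reduced = (-Cr @ np.linalg.solve(Ar, Br)).item()
bound = 2.0 * hsv[r:].sum()
print(f"DC error {abs(dc_full - dc_reduced):.3e}, bound {bound:.3e}")
```

For this unweighted case, the DC-gain error should stay within the classical balanced-truncation bound of twice the sum of the discarded Hankel singular values. A large-scale implementation would replace the dense solves and the full SVD with low-rank factors and a Lanczos-type partial decomposition, as the procedure describes.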

Now, we prove that an error bound exists on the -norm of the weighted error between the original system and the reduced-order system obtained using the proposed variational SBT.

Theorem 2: The -error of the variational model reduction with spectral shaping is

(40)

where , are singular values of the perturbed system.

Proof: Expanding the left-hand side of (13) in terms of the Laplace-domain expressions of and results in the following equation:

In the spectrally weighted variational balanced truncation, the reduced-order system is derived by applying the transformation matrix, , on the original perturbed system and then truncating the transformed system, as mathematically described by (8)–(10).

Matrices and , obtained from the seventh step of the spectrally weighted variational balanced truncation algorithm, are also partitioned correspondingly. It is readily proved that and . Thus, we obtain (41) and (42). Note that represents the reduced-order model of . Hence, there exists a bounded -error for the former system, whose reduced-order model is represented by the latter system.

Plugging the above equation into (41) proves the theorem.

As mentioned earlier, process variations will have a definite impact on the on-chip interconnect parasitics. These variations have two adverse effects on the system matrix, as also demonstrated by (32) and (33). Ignoring second-order variations, any incremental increase in the values of parasitic conductances and resistances reduces , whereas any incremental increase in the values of parasitic inductances and capacitances increases .

As an interesting special case, consider the tightly coupled RLC interconnects shown in Fig. 3. It is easily proved that, for this case, the system matrix is a symmetric positive-definite matrix (the susceptance and conductance matrices are both symmetric positive-definite matrices). Under these circumstances, the following theorem proves useful in finding the upper and lower limits of variations in the poles of the system transfer function of tightly coupled RLC interconnects that are subject to process variations.

Fig. 3. Circuit schematic of x interconnects that are electromagnetically coupled to each other.

Theorem 3 [20]: Given an LTI system whose state-space representation is given by (1) and (2), if the system matrix is perturbed by due to the process variation, then the following inequalities hold:

(43)

where is given by Lemma 1.

Proof Hint: Follows from theories developed for the symmetric eigenvalue problem (see [20, pp. 395–400]).

According to Theorem 3, the magnitude of the difference between the poles of the perturbed system and those of the original system is limited by the -norm of the perturbing matrix, . Furthermore, the eigenvalues of the perturbed system are upper and lower bounded by the values given in (43).
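Theorem 3 rests on standard results for the symmetric eigenvalue problem. The sketch below checks the textbook Weyl-type inequalities numerically on a random symmetric positive-definite matrix. Because (43) is not reproduced in this text, the exact form of the bounds, the choice of the spectral norm, and all variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Hypothetical symmetric positive-definite "system matrix" A.
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)

# Small symmetric perturbation dA standing in for the process variation.
E = rng.standard_normal((n, n))
dA = 0.05 * (E + E.T)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
lam = np.linalg.eigvalsh(A)
lam_pert = np.linalg.eigvalsh(A + dA)
lam_dA = np.linalg.eigvalsh(dA)

# Weyl: lam_i(A) + lam_min(dA) <= lam_i(A + dA) <= lam_i(A) + lam_max(dA).
lower = lam + lam_dA[0]
upper = lam + lam_dA[-1]

# Consequence: every eigenvalue moves by at most the spectral norm of dA.
spectral_norm = np.linalg.norm(dA, 2)
max_shift = np.max(np.abs(lam_pert - lam))

print("bounds hold:", bool(np.all(lower <= lam_pert) and np.all(lam_pert <= upper)))
print(f"largest eigenvalue shift {max_shift:.4f} <= ||dA||_2 = {spectral_norm:.4f}")
```

The bounds depend only on the extreme eigenvalues of the perturbation, so they can be evaluated without recomputing the spectrum of the perturbed matrix.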

V. EXPERIMENTAL RESULTS

In this section, the proposed variational SBT model-order reduction technique (VSBT) is evaluated by performing experiments on a number of global interconnect structures such as clock trees and coupled interconnect lines. To preserve passivity, the proposed VSBT incorporates the algorithm proposed in [13].

First, the accuracy of the SBT technique [16]–[18] is demonstrated, and the result of applying this algorithm is compared with those obtained by utilizing our implementations of PRIMA [6] and the positive-real truncated balanced realization (PR-TBR) [13]. The VSBT method is then applied to study the impact of the interconnect process variations on the timing performance of the clock trees and coupled busses. Finally, the accuracy of VSBT is validated by using it to reduce the order of an arbitrary stable LTI system that is subject to perturbations.

A. Single Lossy Distributed RLC Interconnect

First, consider a single lossy transmission line of 3-mm length. The values for the unit-length capacitances, inductances, and resistances are shown in Fig. 4. The transmission line is modeled by 1000 lumped RLC sections in cascade; hence, the system has an order of 2000.

Fig. 4. Single lossy line modeled by 1000 ladder RLC sections.

Shown in Fig. 5 is the magnitude response of the reduced transfer function obtained by the proposed technique compared to those of the PR-TBR algorithm and the PRIMA algorithm. The order of the reduced systems is set to four. The input weighting function is identified as a lead compensator with its zero located at a lower frequency compared to its pole location, i.e.,

(44)

Fig. 5. Bode diagram of a single interconnect having 1000 lumped RLC sections and the corresponding reduced-order systems.

Clearly, the SBT technique produces a reduced system with a frequency response closely following that of the original system. The weighting function in (44) is capable of maintaining the low-frequency accuracy, while significantly improving the high-frequency accuracy up to 10 GHz.

B. Two Electromagnetically Coupled Interconnects

To examine the performance of the proposed SBT technique in high-frequency coupled interconnects, two electromagnetically coupled interconnects are considered. Shown in Fig. 6 is the distributed RLC model for these two interconnects. The values for the electrical elements of the interconnects are also indicated underneath the circuit schematic. The capacitive coupling is modeled by distributed floating capacitances, . In state-of-the-art CMOS technologies, the coupling capacitances account for approximately 70%–95% of the total node capacitances, which makes the coupling noise analysis even more important. Similarly, mutual inductances with a coupling coefficient of capture the inductive coupling between the two interconnects. To include the most general case, the electrical values of the two lines are considered to be different. and represent the load capacitances at the far end terminations of the aggressor and victim lines. In high-speed ICs, the lines normally drive other CMOS buffers; therefore, the load capacitances are comprised of the input capacitances of the CMOS receiver buffers driven by the interconnects. The receiver buffer sizes for long interconnects are rather large, leading to large load capacitances, and . Each line is modeled using 500 ladder RLC sections in cascade, to accurately capture the distributed nature of long interconnects in high-speed VLSI circuits.

Fig. 6. Two electromagnetically coupled RLC interconnects.

In this experiment, the accuracy of the proposed SBT method in estimating the spectrum of electromagnetic crosstalk is examined. The frequency response of the two coupled RLC interconnects with the input signal at the near end of the aggressor line and the output voltage across the load capacitance

at the far end termination of the victim line is first simulated using HSPICE. Next, three model-order reduction techniques are employed to reduce the order of the original system: PRIMA [6], PR-TBR [13], and the proposed SBT method. The order of the system is reduced to eight using these three different order-reduction techniques. For the SBT method, a first-order low-pass weighting function, , at the input is incorporated. To calculate the time constant of the weighting function, , an important attribute of the crosstalk noise is utilized. The time-domain crosstalk voltage experiences its maximum gradient around its 50% rising transition time. The 50% rising transition time is calculated by using an extension to the Elmore delay metric for RLC tree circuits [28]. The maximum time-domain variation corresponds to the high-frequency spectral components of the signal. As a consequence, an input weighting function with a time constant equal to 50% of the rising transition time of the far end crosstalk greatly reduces the high-frequency imprecision of the low-order reduced system obtained by the PRIMA and PR-TBR techniques. This is evident from Fig. 7, where the magnitude response of the two coupled RLC circuits in Fig. 6 is compared with those of the reduced systems.

All of the order reduction techniques in this experiment accurately predict the spectrum of the far end crosstalk in the low frequency range. However, PRIMA and PR-TBR are incapable of following the signal spectrum for frequencies beyond 700 MHz. In contrast, SBT follows the spectral variations of the crosstalk for frequencies up to 7 GHz.

Fig. 7. Magnitude response of the original system and reduced-order systems obtained by using SBT, PR-TBR [13], PRIMA [6].

C. H-Tree Clock Distribution

An H-tree clock network is constructed and routed in the TSMC 0.13-μm digital CMOS technology. The clock tree is driven by a four-stage tapered buffer at the root of the tree, as shown in Fig. 8. The design target of the on-chip clock frequency is 3.0 GHz. The clock tree is modeled by a large coupled distributed RLC network. More precisely, every 5-μm segment of each line is modeled by an RLC ladder network, whose R, L, and C values change due to metal width and ILD thickness variations. Based on the data reported in [26] on the interconnect-dominated test circuits, the typical variational distribution for metal interconnects as well as ILD thicknesses is a normal distribution. The widths of metal and ILD layers vary up to 30% of their nominal values (which is the ratio of the 3-σ variation to the nominal value, stated as a percentage).

Fig. 8. H-tree clock distribution driven by a tapered buffer.

TABLE I COMPARISON BETWEEN SIMULATION RESULTS ON AN H-TREE CLOCK NET WITH PROCESS VARIATIONS (STAR-HSPICE LEVEL 49, 0.13-μm CMOS PROCESS). DELAYS ARE GIVEN IN PICOSECONDS

Twenty experiments were carried out where, in each experiment, a set of normally distributed numbers for the metal and ILD width variations was generated. The 50% propagation delay at an arbitrarily chosen fan-out (leaf) node was computed by the VSBT technique and compared with the result obtained using the algorithm presented in [19]. Results of this comparison are provided in Table I. Throughout this experiment,

for the sake of simplicity, the geometrical variations of the interconnect are assumed to be mutually independent. In all of these experiments, the following input weighting function was utilized to minimize the high-frequency components of the estimated error:

From Table I, it is clear that the VSBT algorithm predicts the 50% delays more accurately than [19] for all of the 3-σ variations of the metal and ILD layers used in Table I. Comparing the number of floating-point operations (flops) reported by MATLAB, the computation time for the VSBT technique is, on average, 15% lower than that of [19]. Notice that although VSBT is not much faster than [19], it yields far more accurate delay values.

TABLE II COMPARISON BETWEEN SIMULATION RESULTS ON TWO CAPACITIVELY COUPLED MICROSTRIP LINES (STAR-HSPICE LEVEL 49, 0.13-μm CMOS PROCESS). DELAYS ARE GIVEN IN PICOSECONDS

Fig. 9. Two parallel microstrip lines in a 0.13-μm CMOS process.
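The variational experiments draw independent, normally distributed geometry values whose 3-σ spread is 30% of nominal. The sketch below shows one way such a sample set could be generated and mapped to per-unit-length parasitics. All nominal dimensions, the material constants, and the crude sheet-resistance and parallel-plate formulas are hypothetical placeholders rather than the extraction models or values used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(42)

n_experiments = 20                     # twenty experiments, as in the text
three_sigma_ratio = 0.30               # 3-sigma variation / nominal value
sigma_rel = three_sigma_ratio / 3.0    # relative standard deviation (10%)

# Hypothetical nominal geometry in meters (not taken from the paper).
w_nom = 0.40e-6        # metal width
t_ild_nom = 0.50e-6    # ILD thickness
t_metal = 0.30e-6      # metal thickness, held fixed here

# Mutually independent, normally distributed geometric variations.
w = w_nom * (1.0 + sigma_rel * rng.standard_normal(n_experiments))
t_ild = t_ild_nom * (1.0 + sigma_rel * rng.standard_normal(n_experiments))

# Crude first-order parasitic models (assumptions for illustration only).
rho = 1.7e-8               # resistivity of copper, ohm*m
eps = 3.9 * 8.854e-12      # permittivity of SiO2, F/m
R_per_m = rho / (w * t_metal)      # series resistance per unit length
C_per_m = eps * w / t_ild          # parallel-plate capacitance per unit length

for k in range(3):
    print(f"run {k}: w = {w[k] * 1e6:.3f} um, t_ild = {t_ild[k] * 1e6:.3f} um, "
          f"R = {R_per_m[k]:.3e} ohm/m, C = {C_per_m[k]:.3e} F/m")
```

Each sampled (R, C) pair would then define one perturbed system to which the variational reduction is applied; inductance variation is omitted from this sketch.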

D. Two Electromagnetically Coupled Interconnects With Variational Parameters

The next experiment is on two electromagnetically coupled microstrip lines with statistically varying electrical parameters due to process variations. Microstrip lines serve as the most appropriate models for characterizing the topmost metal layers in CMOS technology. The schematic of these two coupled lines along with their nominal geometrical parameters is depicted in Fig. 9. The accuracy of two variational order reduction techniques, the proposed VSBT and the technique proposed in [19], in estimating the delay of the signal at the far end (load termination) of Line 2 is examined.

The height and width of the metal layer are subject to variations around their nominal values. Twenty experiments were carried out, while allowing the metal height and width to vary as normally distributed random processes. The following input weighting function was employed to reshape the spectrum of the error:

(45)

Results of these twenty experiments are reported in Table II. The MATLAB-reported flop usage of the VSBT technique is 10% more than that of [19], while the delays reported by VSBT are more accurate than those of [19].

E. General Example

To demonstrate the accuracy and validity of VSBT on any arbitrary stable LTI system exposed to perturbations, we started from an arbitrary state-space specification of an LTI system having twenty poles and thirteen zeros, as shown in the pole–zero map of Fig. 10. Suppose that each and every element of the system matrix experiences a normally distributed random perturbation with a standard deviation of 30% around the nominal value. The order of the reduced-order system is set to three. It is desired to have high accuracy at high frequencies; therefore, the following weighting function is used:

Fig. 10. Pole–zero map of the LTI system specified in Section V-E.

A variational spectrally weighted reduced-order model is derived by using two approaches. The first approach is to calculate the perturbed system matrix and then directly apply the SBT. The second approach is to use the VSBT technique. Fig. 11 shows and compares the Bode diagram of the magnitude response, the Bode diagram of the phase response, and the impulse response of the original perturbed system, the reduced-order system obtained by the direct order reduction technique, and the reduced-order system obtained by the VSBT technique. Notice that

applying the direct model-order reduction technique on a large system of parameter-varying interconnects to generate the reduced-order system is very time consuming. The reason for this is that the direct order reduction algorithm must be run on the system each time under different offset values for the electrical parameters of the interconnect. As can be seen from Fig. 11, the difference between the two order reduction approaches is quite small.

Fig. 11. Bode diagram and impulse response of the original perturbed system, the system obtained by direct SBT, and the VSBT algorithm.

VI. CONCLUSIONS AND FUTURE WORK

In this paper, a variational SBT technique for the model-order reduction of geometrically varying multiport RLC interconnects was proposed. It was shown that the balanced truncation technique is a highly effective approach when the variations in the circuit parameters need to be taken into consideration. Various experiments demonstrate that the computational run-time of the variational SBT approach is 10–15% higher than that of the variational Krylov-subspace-based model-order reduction techniques, while the accuracy is also 20% higher, on average.

As future work, we will investigate a systematic way of determining the input/output weighting functions that minimize (13) in the SBT method.

ACKNOWLEDGMENT

The authors would like to thank Prof. V. Sreeram, Department of Electrical and Electronic Engineering, University of Western Australia, for the helpful discussions.

REFERENCES

[1] S. Oh, W. Jung, J. Kong, and K. Lee, “Interconnect modeling for VLSI,” in Proc. Int. Conf. Simulation of Semiconductor Processes and Devices, 1999, pp. 203–206.
[2] L. T. Pillage and R. A. Rohrer, “Asymptotic waveform evaluation for timing analysis,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 9, no. 4, pp. 352–366, Apr. 1990.
[3] C. Ratzlaff and L. T. Pillage, “RICE: Rapid interconnect circuit evaluation using AWE,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 13, no. 6, pp. 763–776, Jun. 1994.
[4] K. J. Kerns and A. T. Yang, “Stable and efficient reduction of large, multiport RC networks by pole analysis via congruence transformations,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 16, pp. 734–744, 1997.
[5] P. Feldmann and R. W. Freund, “Efficient linear circuit analysis by Pade approximation via the Lanczos process,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 14, pp. 639–649, May 1995.
[6] A. Odabasioglu, M. Celik, and L. T. Pileggi, “PRIMA: Passive reduced-order interconnect macromodeling algorithm,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 17, no. 8, pp. 645–654, Aug. 1998.
[7] J. Roychowdhury, “Reduced-order modeling of time-varying systems,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 46, no. 10, pp. 1273–1288, Oct. 1999.
[8] L. Peng and L. T. Pileggi, “NORM: Compact model-order reduction of weakly nonlinear systems,” in Proc. IEEE/ACM Design Autom. Conf., Jun. 2003, pp. 427–477.
[9] K. Glover, “All optimal Hankel-norm approximation of linear multivariable systems and their L∞-error bounds,” Int. J. Contr., vol. 39, no. 6, pp. 1115–1193, 1984.
[10] P. Rabiei and M. Pedram, “Model-order reduction of large circuits using balanced truncation,” in Proc. IEEE ASP-DAC, 1999, pp. 237–240.
[11] J. Li, F. Wang, and J. White, “An efficient Lyapunov equation-based approach generating reduced-order models of interconnect,” in Proc. 36th ACM/IEEE Design Autom. Conf., 1999, pp. 1–6.
[12] J. Li and J. White, “Efficient model reduction of interconnect via approximate system Grammians,” in Proc. IEEE/ACM ICCAD’99, 1999, pp. 380–384.
[13] J. R. Phillips, L. Daniel, and L. M. Silveira, “Guaranteed passive balancing transformations for model-order reduction,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 22, no. 8, pp. 1027–1041, Aug. 2003.
[14] A. Lu and E. L. Wachspress, “Solution of Lyapunov equations by alternating direction implicit,” Comput. Math. Appl., vol. 21, no. 9, pp. 43–58, Jun. 1991.
[15] N. Ellner and E. L. Wachspress, “Alternating direction implicit iteration for systems with complex spectra,” SIAM J. Numer. Anal., vol. 28, no. 3, pp. 859–870, Jun. 1991.
[16] G. Wang, V. Sreeram, and W. Q. Liu, “A new frequency-weighted balanced truncation method and an error bound,” IEEE Trans. Autom. Contr., vol. 44, no. 9, pp. 1734–1737, Sep. 1999.
[17] V. Sreeram, “On the properties of frequency weighted balanced truncation techniques,” in Proc. Amer. Contr. Conf., vol. 3, May 2002, pp. 1753–1754.
[18] P. Heydari and M. Pedram, “Balanced truncation with spectral shaping for RLC interconnects,” in Proc. IEEE ASP-DAC, Jan. 2001, pp. 203–208.
[19] Y. Liu, L. T. Pileggi, and A. Strojwas, “Model order-reduction of RC(L) interconnect including variational analysis,” in Proc. 36th IEEE/ACM Design Autom. Conf., Jun. 1999, pp. 201–206.
[20] G. H. Golub and C. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins University Press, 1996, pp. 390–405.
[21] P. Heydari and M. Pedram, “Model reduction of variable geometry interconnects using variational spectrally-weighted balanced truncation,” in Proc. IEEE ICCAD, San Jose, CA, Nov. 2001.
[22] L. Daniel, C. S. Ong, S. C. Low, K. H. Lee, and J. White, “A multiparameter moment matching model reduction approach for generating geometrically parameterized interconnect performance models,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 23, no. 5, pp. 678–693, May 2004.
[23] M. G. Safonov and R. Y. Chiang, “A Schur method for balanced truncation model reduction,” in Proc. Amer. Contr. Conf., vol. 2, 1988, pp. 1036–1040.
[24] P. J. Antsaklis and A. N. Michel, Linear Systems. New York: McGraw-Hill, 1997.
[25] O. S. Nakagawa, S.-Y. Oh, and G. Ray, “Modeling of pattern-dependent on-chip interconnect geometry variation of deep-submicron process and design technology,” in Tech. Dig. IEEE Int. Electron Devices Meeting, 1997, pp. 137–141.
[26] O. S. Nakagawa, N. Chang, S. Lin, and D. Sylvester, “Circuit impact and skew corner analysis of stochastic process variation in global interconnect,” in Proc. IITC, 1999, pp. 230–232.
[27] S. R. Nassif, “Modeling and forecasting of manufacturing variations,” in Proc. 5th Int. Workshop Statistical Metrology, 2000, pp. 2–10.
[28] Y. I. Ismail, E. G. Friedman, and J. L. Neves, “Equivalent Elmore delay for RLC trees,” in Proc. IEEE/ACM Des. Autom. Conf., Jun. 1999, pp. 715–720.

Payam Heydari (S’98–M’00) received the B.S. and M.S. degrees in electrical engineering from the Sharif University of Technology, in 1992 and 1995, respectively, and the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, in 2001.
During the summer of 1997, he was with Bell-Labs, Lucent Technologies, Murray Hill, NJ, where he worked on noise analysis in deep submicrometer very large-scale integrated (VLSI) circuits. During the summer of 1998, he was with IBM T. J. Watson Research Center, Yorktown Heights, NY, where he worked on gradient-based optimization and sensitivity analysis of custom-integrated circuits. Since August 2001, he has been an Assistant Professor of Electrical Engineering at the University of California, Irvine, where his research interest is the design of high-speed analog, radio-frequency (RF), and mixed-signal integrated circuits.
Dr. Heydari has received the 2005 National Science Foundation (NSF) CAREER Award, the 2005 IEEE Circuits and Systems Society Darlington Award, the 2005 Henry Samueli School of Engineering Teaching Excellence Award, the Best Paper Award at the 2000 IEEE International Conference on Computer Design (ICCD), the 2000 Honorable Mention Award from the Department of EE-Systems at the University of Southern California, and the 2001 Technical Excellence Award in the area of Electrical Engineering from the Association of Professors and Scholars of Iranian Heritage (APSIH). He was recognized as the 2004 Outstanding Faculty at the EECS Department of the University of California, Irvine. His name was included in the 2006 Who’s Who in America.
Dr. Heydari is an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS. He currently serves on the Technical Program Committees of the Custom Integrated Circuits Conference (CICC), the International Symposium on Low-Power Electronics and Design (ISLPED), and the International Symposium on Quality Electronic Design (ISQED), and as the Local Arrangement Chair of the ISLPED conference. He was the Student Design Contest Judge for the DAC/ISSCC Design Contest Award in 2003, a Technical Program Committee member of the IEEE Design and Test in Europe (DATE) from 2003 to 2004, and of the International Symposium on Physical Design (ISPD) in 2003.

Massoud Pedram (S’88–M’90–SM’98–F’01) received the B.S. degree in electrical engineering from the California Institute of Technology, Pasadena, in 1986, and the M.S. and Ph.D. degrees in electrical engineering and computer sciences from the University of California, Berkeley, in 1989 and 1991, respectively.
He joined the Department of Electrical Engineering Systems, University of Southern California, Los Angeles, where he is currently a Professor. He has authored four books and more than 250 journal and conference papers. His current work focuses on developing computer-aided design methodologies and techniques for low-power design, synthesis, and physical design.
Dr. Pedram has served on the Technical Program Committee of a number of conferences, including the Design Automation Conference (DAC), the Design and Test in Europe Conference (DATE), the Asia-Pacific Design Automation Conference (ASP-DAC), and the International Conference on Computer-Aided Design (ICCAD). He served as the Technical Co-chair and General Co-chair of the International Symposium on Low-Power Electronics and Design (ISLPED) in 1996 and 1997, respectively. He was the Technical Program Chair and the General Chair of the 2002 and 2003 International Symposium on Physical Design. He serves on the Executive Committee of the IEEE Circuits and Systems Society (CASS) and on the Advisory Board of the ACM Special Interest Group on Design Automation. He is an Associate Editor of the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, and the ACM Transactions on Design Automation of Electronic Systems. He received the 2000 Distinguished Service Award of ACM SIGDA for contributions in developing the SIGDA Multimedia Monograph Series and organizing the Young Student Support Program. He was a member of the Board of Governors of the IEEE CASS from 2000 to 2002 and the Chair of the Distinguished Lecturer Program of the IEEE CASS for 2003 and 2004. He is currently the CASS Vice President of Publications. His research has received a number of awards including two ICCD Best Paper Awards, a Distinguished Citation from ICCAD, two DAC Best Paper Awards, and an IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS Best Paper Award. He is a recipient of the National Science Foundation (NSF)’s Young Investigator Award in 1994 and the Presidential Faculty Fellows Award (aka the PECASE Award) in 1996.