Microelectronics Reliability 78 (2017) 319–330

Contents lists available at ScienceDirect

Microelectronics Reliability

journal homepage: www.elsevier.com/locate/microrel

Assembly yield prediction of plastically encapsulated packages with a large number of manufacturing variables by advanced approximate integration method

Hsiu-Ping Wei a, Bongtae Han a,⁎, Byeng D. Youn b, Hyuk Shin c,IlhoKimc, Hojeong Moon c a Mechanical Engineering Department, University of Maryland College Park, MD 20742, USA b Mechanical and Aerospace Engineering, Seoul National University, Republic of Korea c Package Development Team, Semiconductor R&D Center, Samsung Electronics, Republic of Korea article info abstract

Article history: An advanced approximate integration scheme called eigenvector dimension reduction (EDR) method is imple- Received 20 February 2017 mented to predict the assembly yield of a plastically encapsulated package. A total of 12 manufacturing input var- Received in revised form 3 August 2017 iables are considered during the yield prediction, which is based on the JEDEC reflow flatness requirements. The Accepted 5 September 2017 method calculates the statistical moments of a system response (i.e., warpage) first through dimensional reduction Available online xxxx and eigenvector sampling, and a probability density function (PDF) of random responses is constructed subse- quently from the statistical moments by a probability estimation method. Only 25 modeling runs are needed to Keywords: fi Approximate integration scheme produce an accurate PDF for 12 input variables. The results prove that the EDR provides the numerical ef ciency Dimension reduction required for the tail-end probability prediction of manufacturing problems with a large number of input variables, Probability density function while maintaining high accuracy. Assembly yield © 2017 Elsevier Ltd. All rights reserved. Warpage Stacked die thin flat ball grid array

1. Introduction The yield loss is in general a small probability event (i.e., tail-end probability) [1–3], especially for the large production volume. In many Epoxy molding compound (EMC) has been used extensively as a cases, even 0.1% yield loss would cause a significant profit loss. Based protection layer in various semiconductor packaging components. The on the Six Sigma concept, the target is often to control the yield loss with- mismatch of coefficient of thermal expansion (CTE) causes the warpage in 3 to 6 sigma, i.e., 6.67% to 3.4 ppm [4]. of components after molding, which is one of the most critical issues to Fig. 1 shows a schematic illustration of the tail-end probability, where board assembly yield. The warpage issue has become more critical as the statistical property of system performance (e.g., warpage) is repre- Package-on-Package (PoP) and fan-out wafer level package (FO-WLP) sented by a probability density function (PDF). When a component has are widely adopted for portable devices. the performance exceeding or falling behind a certain specification, it The computer-aided engineering (CAE) tools, such as the finite ele- cannot be processed further and is regarded as a failure. The probability ment method (FEM), have been used extensively to predict the warpage. of all possible failure, i.e., yield loss, is the area under the PDF where the Typically, the CAE tools provide deterministic outputs, which establish performance does not satisfy the specification. quantitative relationships between the system response (i.e., warpage) A technical approach critically required for the yield loss prediction is and the input parameters such as geometries, material properties, pro- the uncertainty propagation analysis, which enables the intrinsically de- cess and/or environmental conditions, etc. The deterministic approaches terministic computational model to characterize the output distribution have been proven effective for comparing competitive designs. In reality, in the presence of input uncertainties. The most popular uncertainty the package warpage behavior shows statistical variations (or probabili- propagation methods are “random sampling method” and “response ty distributions) due to inherent manufacturing variabilities. The proba- surface method (RSM)”. When they are applied to complex manufactur- bilistic aspect should be incorporated in prediction if the assembly yield ing problems with a large number of input variables, however, they be- is to be predicted. come impractical due to their own limitations. Due to its random nature, the failure probability estimated from the random sampling method, e.g., Monte Carlo simulation (MCS), exhibits ⁎ Corresponding author. statistical variations [5]. The variations can be substantial when the tail- E-mail address: [email protected] (B. Han). end probability is to be predicted. In order to ensure that the tail-end

https://doi.org/10.1016/j.microrel.2017.09.006 0026-2714/© 2017 Elsevier Ltd. All rights reserved. 320 H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330

In this paper, the EDR method is implemented to predict the assem- bly yield of a plastically encapsulated package. A total of 12 manufactur- ing input variables are considered during the yield prediction, which is based on the JEDEC reflow flatness requirements. Section 2 provides a brief introduction of the EDR method. In Section 3, the details of an EDR implementation are described. The accuracy of the yield prediction is verified by the direct MCS in Section 4. Section 5 concludes the paper.

2. Eigenvector dimension reduction method

The eigenvector dimension-reduction (EDR) method estimates the complete probability density function (PDF) of a system response by (1) calculating the statistical moments and (2) constructing the corre- sponding PDF using the probability estimation methods. The statistical moments are the characteristics of a distribution. The 1st moment, μ, is the mean, which represents the central tendency of the distribution, and the 2nd moment is the standard deviation, σ, which represents the spread of the distribution. The 3rd and 4th mo-

ments are skewness, β1, and kurtosis, β2, which indicate the symmetry and the peakedness of the distribution, respectively. The mth-order sta- Fig. 1. Illustration of a yield loss (tail-end probability). tistical moment of a system response is defined as

þZ∞ Zþ∞ m m E ½YXðÞ; …; X ≡ ⋯ fgYXðÞ; …; X f ;…; ðÞX ; …; X dX ⋯dX probability prediction falls within the specified accuracy tolerance, an 1 N 1 N X1 XN 1 N 1 N extremely large number of model computations is required. This com- −∞ −∞ putational burden makes the random sampling impractical for the ð1Þ cases that require complex nonlinear computational models (e.g., visco- … elastic analysis required for warpage prediction of plastically encapsu- where E(·) is the expectation operator; Y(X1, ,XN)is the system re- … lated components) [6]. sponse with N random input variables, X1 , ,XN (i.e., N dimensions); and f … (X ,…,X ) is the joint probability density function. In this The RSM has also been widely used in conjunction with the MCS [7,8] X1 , ,XN 1 N to reduce the computational burden. The RSM relies on Design of Exper- paper, the capital letters are used to denote the input variables. iments (DOE) to build computationally inexpensive mathematical re- To tackle the mathematical challenge associated with the multidi- sponse surface models, which can be used for the direct MCS. Two mensional integration in Eq. (1), Rahman and Xu proposed the additive commonly used types of DOE are the Full Factorial Design (FFD) [9–11] decomposition [16] to transform the multidimensional response func- … and the Central Composite Design (CCD) [12–14]. Although the CCD tion Y(X1, ,XN) into a series of one-dimensional functions. The approx- can reduce the sample size of the FFD substantially, both types cannot imated system response function, then, can be expressed as [16]: avoid the challenge known as the curse of dimensionality (i.e., the com- XN ; …; ≈ ; …; μ ; …; μ ; ; μ ; …; μ − − μ ; …; μ putational costs increase exponentially as the number of random input YXðÞ1 XN YaðÞ¼X1 XN Y 1 j−1 X j jþ1 N ðÞN 1 YðÞ1 N variables increases). Due to this inherent limitation, the RSM has been j¼1 applied to the designs with only a few input variables. ð2Þ Another method for the uncertainty propagation analysis is “approx- imate integration scheme”. The scheme calculates the statistical mo- where Ya is the approximated system response function obtained by the μ ments of the output response by performing a multi-dimensional additive decomposition, j is the mean value of an input variable, Xj, μ … μ μ … μ integration. Seo and Kwak proposed a numerical algorithm to perform Y( 1, , j−1,Xj, j+1, , N) is the system response of the input variable, the integration [15]. The algorithm also suffered from the curse of dimen- Xj, while the other input variables are kept as their respective mean μ … μ sionality as the FFD was used to select integration points. Rahman and Xu values, and Y( 1, , N) is the system response with all input variables proposed the univariate dimension-reduction (UDR) method to cope are fixed as their mean values. with the curse of dimensionality [16]. With the method, a multi-dimen- Substituting Eq. (2) into Eq. (1) yields: sional integration is transformed into a series of one-dimensional inte- ; …; m ≅ ; …; m grations, and thus the computational cost increases only additively E ½YX()ðÞ"#1 XN E ½YaðÞX1 XN m N with the increased number of input variables. This additive increase ∑ μ ; …; μ ; ; μ ; …; μ − − μ ; …; μ ¼ E Y 1 j−1 X j jþ1 N ðÞN 1 YðÞ1 N makes the method attractive to the problems with a large number of j¼1 input variables. ð3Þ In a typical UDR implementation, however, a large number of nu- merical integration points are still required to ensure the accuracy of Using the binomial formula, the right-hand side of Eq. (3) can be re- each one-dimensional integration result. For a large number of input written as [16]: variables, the method also can be computationally expensive. Youn et “ ()"# al. developed a method called eigenvector dimension-reduction m N ” ∑ μ ; …; μ ; ; μ ; …; μ − − μ ; …; μ (EDR) method [17] to relax the requirement of the UDR method. In E Y 1 j−1 X j jþ1 N ðÞN 1 YðÞ1 N the EDR method, the eigenvector sampling scheme was proposed to se- j¼1 8 "#9 i

The recursive formula is further employed to simplify the expecta- tion operation in Eq. (4), which yields [16]:

8"#9 i Xm < N = m! m−i E ∑ Y μ ; …; μ − ; X ; μ ; …; μ ½−ðÞN−1 YðÞμ ; …; μ i!ðÞm−i ! : 1 j 1 j jþ1 N ; 1 N i¼0 j¼1 ð5Þ Xm m! − ¼ Si ½−ðÞN−1 YðÞμ ; …; μ m i i!ðÞm−i ! N 1 N i¼0 where

8 Zþ∞ > > i i > S ¼ YXðÞ1; μ ; …; μ f ;…; ðÞX1jX2 ¼ μ ; …; XN ¼ μ dX1 > 1 2 N X1 XN 2 N > −∞ > > Zþ∞ no > Xi ! i−k <> i i k S ¼ S − Y μ ; …; μ − ; X ; μ …; μ j k!ðÞi−k ! j 1 1 j 1 j jþ1 N > k¼0 −∞ > > > f ;…; X jX1 ¼ μ ; …; X − ¼ μ − ; X ¼ μ ; …; XN ¼ μ dX > X1 XN j 1 j 1 j 1 jþ1 jþ1 N j > Zþ∞ > Xi ! > i i k i−k > S ¼ S − YðÞμ ; …; XN f ;…; ðÞXNjX1 ¼ μ ; …; XN−1 ¼ μ − dXN : N k!ðÞi−k ! N 1 1 X1 XN 1 N 1 k¼0 −∞ ð6Þ

From Eqs. (1) to (6), a total of m × N one-dimensional integrations are needed to obtain the mth-order statistical moment. These integra- tions in general can be done by numerical integration algorithms, Fig. 2. Schematic illustration of numerical integration. which perform the calculations at the integration points, xj,i,with weights, wj,i,as[17]: The three sample-point scheme is typically used in practice. The loca- þZ∞ no tions of the three sample points are mean and the mean ± 3 standard de- r μ ; …; μ ; ; μ …; μ viations, which are expressed as [17]: Y 1 j−1 X j jþ1 N −∞ f ;…; X jjX1 ¼ μ ; …; X j−1 ¼ μ − ; X j 1 ¼ μ ; …; XN ¼ μ ð7Þ 0 μ ; …; μ X1 XN 1 j 1 þ jþ1 N V ¼ ½hi1 N pffiffiffiffiffi Xn no 1 r V ¼ μ ; …; μ −3 λ ; …; μ ≅ μ ; …; μ ; ; μ …; μ i hi1 i i N ð8Þ dX j w j;i Y 1 j−1 x j;i jþ1 N pffiffiffiffiffi 2 μ ; …; μ λ ; …; μ i¼1 Vi ¼ 1 i þ 3 i N

th where xj , i is the i integration point of an input variable, Xj, where μ and λ are the mean and eigenvalue along the ith eigenvector. The Y(μ1, …, μj−1, xj, i, μj+1…, μN) is the system response at xj,i,whilethe i i locations of sample points in Eq. (8) were suggested based on a paramet- other input variables are kept as their respective mean values, and wj,i is the corresponding weight which approximates the area under the ric study reported in Ref. [17]. Fig. 4(a) illustrates the three sample-point scheme of an input variable X following the normal distribution. It can be PDF of Xj from xj,i−(xj,i−xj,i−1)/2 to xj,i+(xj,i+1−xj,i)/2, respectively. j Fig. 2 illustrates the numerical integration with 5 integration points. observed that the PDF values outside of the range of mean ± 3 standard deviations are very small (i.e., 0.27% of all the possible values of X ), The PDF of Xj and the system response along Xj are shown as the blue j and red dashed curves, respectively. The products of these two curves which suppresses their contribution during the numerical integration. are integrated along the variable, Xj. More specifically, the products of the responses and their corresponding weights (the bars) at the integra- tion points (the black cross) are added to complete the numerical integration. High accuracy of the 1-D numerical integration in Eq. (7) can be achieved by selecting numerous integration points, which can be com- putationally challenging for the applications with a large number of input variables, especially those that require nonlinear modeling. Two major improvements were made in the EDR method to reduce the num- ber of simulations while maintaining the accuracy. First, the eigenvector sampling scheme was proposed to handle the statistical correlation and variation of the input variables. By solving the eigenvalue problem of the covariance matrix, the eigen- values and the corresponding eigenvectors are obtained. The ei- genvector associated with the largest eigenvalue is the direction of the largest variation, wherein the square root of the eigenvalue is the standard deviation along this direction. The eigenvector as- sociated with the second largest eigenvalue is the orthogonal di- rection with the next highest variation. The sample points are selected along the eigenvectors, and the simulations are conducted only at the sample points. Eigenvectors of two random variables are illustrated in Fig. 3, where the 1st and 2nd eigenvector direc- tions are shown with a joint PDF. Fig. 3. Schematic illustration of eigenvectors of two random variables. 322 H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330

Fig. 5. TFBGA package: (a) cross-sectional view and (b) bottom view showing the measuring zone.

least square (SMLS) method [17] is utilized to interpolate and ex- trapolate the responses at the integration points. Using the approx- ^ imated system responses, Y, at the integration points, the numerical integration of the one-dimensional integrations finally takes the following form [17]:

þZ∞ no r μ ; …; μ ; ; μ …; μ Y 1 j−1 X j jþ1 N −∞ f X X μ ; …; X − μ ; X μ ; …; X μ ð10Þ X1;…;XN jj 1 ¼ 1 j 1 ¼ j−1 jþ1 ¼ jþ1 N ¼ N no Xn r ≅ ^ μ ; …; μ ; ; μ …; μ dX j w j;i Y 1 j−1 x j;i jþ1 N i¼1

3. Assembly yield loss prediction of thin flat ball grid array

The assembly yield of a plastically encapsulated package is deter- Fig. 4. Schematic illustration of locations of sample points: (a) three and (b) five sample- mined. A viscoelastic analysis to predict the warpage is described after point schemes. defining the warpage. The uncertainty propagation analysis and PDF es- timation are followed. The above condition is no longer valid if the system response is highly nonlinear within the range of mean ± 3 standard deviations. More sam- 3.1. Package description ple points are needed to capture the nonlinear response. Fig. 4(b) illus- trates the five sample-point scheme. The additional sample points can A stacked die thin flat ball grid array (TFBGA) package is often used as be expressed as [17]: the top package of a Package-on-Package (PoP). Fig. 5(a) shows the sche- hipffiffiffiffiffi matic of a typical TFBGA package. The encapsulation of the TFBGA pack- 3V ¼ μ ; …; μ −1:5 λ ; …; μ age is done by the transfer molding process. For successful PoP stacking i hi1 i i N pffiffiffiffiffi ð9Þ with the high assembly yield, the package warpage at the solder reflow 4V ¼ μ ; …; μ þ 1:5 λ ; …; μ i 1 i i N temperature must be controlled [18–20]. Typically, the TFBGA packages are produced by memory manufac- The nonlinear behavior can be captured accurately using the two ad- turers and shipped to the outsourced semiconductor assembly and ditional points. By excluding the repeated runs of the mean value, a total test services (OSAT) companies for the PoP assembly. Therefore, the of (2N +1)or(4N + 1) runs are required for the three and five sample- TFBGA packages are required to meet the warpage specification (e.g., point schemes, respectively. ±0.1 mm for the package body size of 15 mm × 15 mm and the ball Once the corresponding system responses are obtained at the pitch of 0.5 mm [21]) before shipment. The packages with warpage ex- sample points, the moving least square (MLS) or stepwise moving ceeding the specification cannot be processed further, which becomes a H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330 323

Fig. 6. Package warpage definition and sign convention: (a) convex and (b) concave.

yield loss. It is suggested by JEDEC and JEITA [22] that the package warp- age at solder reflow temperature be measured over the area where sol- der joints are located (will be referred to as “measuring zone”). Fig. 5(b) shows the measuring zone of the TFBGA package used in this paper. Fig. 6 shows the definition of package warpage and the sign conven- tion. Fig. 6(a) shows a convex package (corners down during assembly – a positive warpage), while Fig. 6(b) shows a concave package (corners up during assembly – a negative warpage). The red dashed line shown in Fig. 6 indicates the reference plane; the coefficients of the equation of the reference plane are calculated by the least square method with the out-of-plane deformation across the x-y spatial dimensions of the specimen in the measuring zone. The distance between the highest point in the measuring zone and the reference plane is denoted as A, Fig. 7. Viscoelastic FE model of TFBGA package: (a) boundary conditions, (b) die stack fi whereas the distance between the lowest point in the measuring zone con guration, and (c) enlarged cross-sectional view. and the reference plane is denoted as B. The magnitude of the package warpage is defined as the sum of A and B. can be expressed as: ÀÁ − C1 TÀÁTref logðÞ¼aT ð11Þ 3.2. Numerical analysis: warpage prediction C2 þ T−Tref

A quarter symmetry was used to build a finite element model with where aT is the shift factor, Tref is the reference temperature (115 °C), and the element type SOLID185 in the commercial FEA package (ANSYS®), C1, C2 are the material constants. The values of C1 and C2 were −20.16 which supports the viscoelastic and elastic material properties. It takes and −111.38, respectively. about 1.5 h for a single model run using an advanced workstation. Fig. The EMC molding process is done at 175 °C, which can be assumed as 7 shows the details of the model. The boundary condition and the die the stress free temperature. The package is then subjected to the solder stack configuration are shown in (a) and (b), respectively. The enlarged reflow process during the assembly. Fig. 10 shows the complete thermal view of the cross-section in (c) shows the details of the chip and the die excursion of the package used for warpage prediction. The conventional attach film (DAF) configuration. lead-free solder reflow profile is considered [24], where the peak tem- The material properties and the nominal dimensions used in the perature is 260 °C. model are summarized in Table 1 and Table 2, respectively. The temper- The deformed configuration of the TFBGA package with the nominal ature dependent Young's modulus of DAF measured by design at the peak reflow temperature is shown in Fig. 11(a). The light thermomechanical analysis (TMA) is shown in Fig. 8. The EMC was pink area indicates the measuring zone and the white circles represent modeled as a linear viscoelastic material. The master curves used in the the solder ball locations. Fig. 11(b) shows the reference plane deter- modelareshowninFig. 9 [23]. The Williams-Landel-Ferry (WLF) func- mined based on the z-direction displacements of the nodes in the mea- tion was used to fit the shift factors at different temperatures, which suring zone. The package warpage was calculated based on the 324 H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330

Table 1 Material properties of TFBGA.

Material Young's modulus Poisson's ratio CTE (ppm/°C) Tg (°C) (GPa) α1 (bTg) α2 (NTg) Silicon die 130 0.23 2.8 – DAF Temp. dependent 0.3 65.3 162.9 138 Substrate 46.794 0.3 16.2 (in-plane) – 61.5 (out-of-plane) EMC Viscoelastic 0.21 9.12 35.13 137.5

Table 2 Dimensions of TFBGA.

Length × width × thickness

1st Die (mm) 13 × 11 × 0.575 1st DAF (mm) 13 × 11 × 0.025 2nd Die (mm) 11 × 9 × 0.575 2nd DAF (mm) 11 × 9 × 0.025 Substrate (mm) 15 × 15 × 0.13 EMC (mm) 15 × 15 × 0.59

Fig. 9. Master curves of bulk modulus and shear modulus of EMC [23]. definition described in Section 3.1, i.e., the sum of (1) the distance of the highest point in the measuring zone to the reference plane and (2) the distance of the lowest point in the measuring zone to the reference tPCB should also satisfy this relationship. Based on the manufactur- μ plane. The package warpage of the nominal design was 40.16 m. ing specification [26], the package thickness, tPKG, is given as 0.72 ± 0.08 mm. The package thickness is expected to follow a normal dis- 3.3. Uncertainty propagation analysis by EDR tribution, and it can be expressed with the mean and standard de-

viations of μPKGt =0.72mmandσPKGt = 0.027 mm. It is to be noted 3.3.1. Input random variables that σPKGt is set to be one third of the tolerance, which makes It is well known that the manufacturing variables tend to follow a 99.73% of tPKG lie within the tolerance. normal distribution according to the central limit theorem [25].The12 The correlation coefficient of tEMC and tPCB was determined using the random input variables with the means and standard deviations used distribution of the package thickness. A sweeping analysis was conduct- in the study are listed in Table 3. The material properties (X9 to X12) ed over the theoretical range of correlation coefficient, [−1, 1] with a were measured, and the dimensions of X1 to X8 were obtained from step size of 0.05, which produced a total of 41 correlation coefficients. fi the manufacturing speci cations [26,27]. For each correlation coefficient, ρt,i (i = 1 to 41), random sampling of Among the 12 input variables, two pairs of properties have the statis- the bivariate normal distribution of tEMC and tPCB was conducted to gen- tical correlation: (1) the EMC thickness (tEMC), X3, and the PCB thickness erate 100,000 pairs of tEMC and tPCB data, which subsequently produced fi (tPCB), X4, and (2) the EMC CTE below and above Tg, X9 and X10.Tode ne 100,000 tPKG data. The tPKG data was then fitted into a normal distribution the joint PDF of the correlated and normally distributed input variables, μρ σρ to calculate the mean, t,i, and standard deviation, t,i. The least square an additional parameter called the correlation coefficient is required. μρ σρ error was calculated to represent the degree of t,iand t,i deviated The joint PDF with these 5 parameters (i.e., the means and standard de- from μPKGt and σPKGt. The results are shown in Fig. 12.Thecorrelationco- fi viations of two variables, and the correlation coef cient) is called the bi- efficient of X3 and X4 was determined as −0.35. variate normal distribution.

The package thickness, tPKG,isequaltotEMC plus tPCB, i.e., tPKG = tEMC + tPCB. The statistical distributions of tPKG, tEMC,and

Fig. 8. Temperature dependent Young's modulus of DAF. Fig. 10. Completed thermal excursion. H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330 325

3.3.2. Eigenvector sampling and sample points

The covariance of input variables Xi and Xj, which quantify the linear dependency between these two variables, is defined as:

ÀÁ hi ; Σ −μ −μ Cov Xi X j ¼ ij ¼ E ðÞXi i X j j ð12Þ

where E(·) is the expectation operator; μi and μj are the mean values of the input variable Xi and Xj, respectively. After calculating the covariance between each pair of the N number of input variables, the covariance matrix can be obtained as [17]:

2 3 σ 2 Σ Σ ⋯ Σ 1 12 13 1N 6 Σ σ 2 Σ ⋯ Σ 7 6 21 2 23 2N 7 Σ ¼ 6 Σ Σ σ 2 ⋯ Σ 7 ¼ 6 31 32 3 3N 7 4 ⋮⋮⋮⋱⋮5 Σ Σ Σ ⋯ σ 2 N1 N2 N3 N 2 0:0332 00 000 6 2 6 00:033 0000 6 2 6 00 :029 −0:0001015 0 0 6 2 6 00−0:0001015 0:01 00 6 6 00 0 0 0:0012 0 6 6 00 0 0 00:0012 6 6 00 0 0 00 6 6 00 0 0 00 6 6 00 0 0 00 6 6 00 0 0 00 4 00 0 0 00 00 0 0 00 3 000000 7 0000007 7 0000007 7 0000007 7 0000007 7 0000007 ð13Þ 2 7 0:00375 000007 2 7 00:00375 00007 2 7 004:24 4:664 0 0 7 2 7 Fig. 11. (a) Deformed configuration of TFBGA package at 260 °C (20× magnification) and 004:664 1:1 007 2 5 (b) package warpage calculated from on reference plane. 00000:81 0 000001592

Unlike the above case, the EMC CTEs above and below Tg always in- 2 crease or decrease in the same direction since any pair of these two var- where Σii =σi is the variance of the input variable Xi and Σij =Σji.Bysolv- iables are measured from the same sample. Accordingly, it is reasonable ing the eigenvalue problem of the covariance matrix (i.e., ΣXE=λXE), the λ to assume that the EMC CTEs above and below Tg have perfect positive eigenvalues and the corresponding eigenvectors XE are obtained. The ei- correlation (i.e., the correlation coefficient of “unity”). genvalues, λ, and the corresponding eigenvectors, XE,ofthecovariance

Table 3 Input variables.

Variables Physical meaning Mean Std. dev. Distribution Correlation coefficient

X1 PKG length (mm) 15 0.033 Normal –

X2 PKG width (mm) 15 0.033 Normal

X3 EMC thickness (mm) 0.59 0.029 Bivariate normal −0.35

X4 PCB thickness (mm) 0.13 0.01

X5 1st Chip thickness (mm) 0.0575 0.001 Normal –

X6 2nd Chip thickness (mm) 0.0575 0.001 Normal –

X7 1st DAF thickness (mm) 0.025 0.00375 Normal –

X8 2nd DAF thickness (mm) 0.025 0.00375 Normal –

X9 EMC CTE above Tg (ppm/°C) 35.13 4.24 Bivariate normal 1

X10 EMC CTE below Tg (ppm/°C) 9.12 1.1

X11 PCB CTE (ppm/°C) 16.2 0.81 Normal 0

X12 PCB modulus (MPa) 46,794 159 Normal 0 326 H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330 matrix of Eq. (13) are: 2 3 0:0332 6 2 7 6 0:033 7 6 7 6 0:000855 7 6 7 6 0:000086 7 6 7 6 0:0012 7 6 7 6 0:0012 7 λ ¼ 6 7; 6 0:003752 7 6 7 6 0:003752 7 6 7 6 4:382 7 6 7 6 0 7 4 5 0:812 1592

XE ¼ 2 3 10 0 0 0000 0 0 00 6 7 6 01 0 0 0000 0 0 007 6 7 6 00 0:9911 0:13330000 0 0 007 6 7 6 00−0:1333 0:99110000 0 0 007 6 7 6 00 0 0 1000 0 0 007 6 7 6 00 0 0 0100 0 0 007 ð14Þ 6 7 6 00 0 0 0010 0 0 007 6 7 6 00 0 0 0001 0 0 007 Fig. 13. Marginal joint PDF of package length and width (uncorrelated case) and locations 6 7 6 00 0 0 00000:9680 ‐0:2511007 of sampling points. 6 7 6 00 0 0 00000:2511 0:9680 0 0 7 4 5 00 0 0 0000 0 0 10 00 0 0 0000 0 0 01 dimensional marginal joint PDFs along the eigenvector directions (green and black lines). It is to be noted that for this uncorrelated The multi-dimensional integration of the joint PDF with 12 dimen- case, the eigenvector directions are just the directions of the input var- sions (i.e., 12 input variables) can be decomposed into 12 one dimen- iables. Fig. 13 also shows the sample points of 2N + 1 sampling scheme sional integration along the 12 eigenvectors directions shown in Eq. (Eq. (8)) as well as the additional sample points required for 4N +1 (14). The marginal joint PDF of two input variables is used to illustrate sampling scheme (Eq. (9)). the locations of the sampling points since it is difficult to show graphical- Fig. 14 shows the marginal joint PDFs of the two correlated cases. The ly the joint PDF with more than 3 input variables. The marginal joint PDF marginal joint PDF of EMC thickness (X3) and PCB thickness (X4)is shown in (a), and the marginal joint PDF of EMC CTE above and below of Xi and Xj provides the probability of each (xi, xj)pair,whichiscalculat- ed by integrating the joint probability distribution of the 12 input vari- Tg (X9 and X10) in (b). Due to the statistical correlation, the eigenvector directions are different from the original variable directions. It should ables over the 10 input variables other than Xi and Xj. For example, the be noted that, in Fig. 14(b), the variation along the second eigenvector di- marginal joint PDF of X3 and X4 can be expressed as: rection was zero due to the perfect correlation between X9 and X10. Z∞ Z∞ The warpage at the each sample point was calculated by the same f X ; X ⋯ f X ; …; X dX dX dX ⋯dX 15 procedure described in Section 3.2. The results are summarized in X3X4 ðÞ¼3 4 X1;…;X12 ðÞ1 12 1 2 5 12 ð Þ −∞ −∞ Table 4.Basedonthe2N + 1 sampling scheme, a total of 25 simulations were conducted for the 12 input variables. Fig. 13 shows the marginal joint PDF of an uncorrelated case, where the package length (X ) and package width (X ) are the uncorrelated 1 2 3.3.3. Statistical moments input variables. After the additive decomposition is completed, this bi- As described in Section 2, the statistical moments can be obtained by variate marginal joint PDF is further transformed into two one- calculating the multiple one-dimensional integrations in Eq. (6). In this study, each one-dimensional integration was calculated by the numeri- cal integration algorithm called the moment-based quadrature rule [16]. It was demonstrated that it can calculate the one-dimensional inte-

gration for an arbitrary distribution of the input variable Xj with good ac- curacy and efficiency, compared with other conventional integration methods such as Gauss-Legendre and Gauss-Hermite quadratures [16]. Fig. 15 shows the weights and the approximated package warpages

of the integration points for two representative variables: (a) X11 (CTE of PCB) and (b) X3 (EMC thickness). The red dots represent the package warpages at the sample points. A total of 21 integration points were sug- gested in Ref. [17] for several examples with nonlinear system response. The 21 integration points were also used in this study and expected to be sufficient since the system response of this study is less nonlinear. As de- scribed in Section 2, the weight of each integration point is represented by the area of the corresponding bar. For example, the central bar in

Fig. 15(a) approximates the area under the PDF of X11 from x11;11− x11;11 −x11;10 x11;12 −x11;11 2 to x11;11 þ 2 . In Fig. 15(a), the package warpages at the three sample points (μ pffiffiffiffiffiffiffi 11 μ λ and 11 3 11) linearly decrease along the direction of X11. Therefore, Fig. 12. MCS sweeping analysis results for correlation coefficient of EMC thickness and PCB the package warpages of the integration points (black cross) can be ac- thickness. curately interpolated and extrapolated by MLS, which was confirmed by H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330 327

Fig. 14. Marginal joint PDF and locations of sampling points for correlated input variables:

(a) EMC and PCB thickness and (b) EMC CTE below Tg and above Tg.

Table 4 Fig. 15. Integration points and weights for 1-D numerical integration: (a) X and (b) X . Package warpage simulation results at sample points. 11 3

μ pffiffiffiffiffiffiffi Variable Package warpage ( m) four additional simulations conducted at μ 1:5 λ and μ 6 pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffiffiffi 11 11 11 − λ − : λ μ : λ λ 3 i 1 5 i i þ1 5 i þ3 i λ11 . Most of the input variables in this study show linear response

X1 40.00 40.08 40.16 40.25 40.33 curves similar to X11.

X2 40.06 40.11 40.16 40.22 40.27 The most nonlinear response curve is observed with EMC thickness, pffiffiffiffiffiffi X3 52.94 47.27 40.16 31.16 19.07 μ λ X3 (Fig. 15(b)). Due to the fact that the warpage within 3 3 3 is X4 27.87 33.67 40.16 47.44 55.62

X5 41.24 40.70 40.16 39.63 39.11 Table 5 X6 40.83 40.49 40.16 39.83 39.51 First four statistical moments. X7 44.31 42.31 40.16 37.94 35.69 X 41.49 40.88 40.16 39.39 38.57 8 Mean Std. dev. Skewness Kurtosis X9/X10 −12.78 14.40 40.16 65.26 90.13

X11 55.99 48.11 40.16 32.28 24.33 EDR sampling scheme 2N + 1 39.68 19.48 −0.0488 3.0053

X12 40.47 40.31 40.16 40.01 39.86 4N + 1 39.76 19.25 −0.0478 3.0332 328 H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330

Table 6 Table 7 Coefficients of Pearson system. Predicted statistical moments and yield loss by MCS and EDR.

c0 c1 c2 Mean Std. dev. Skewness Kurtosis Yield loss (ppm) EDR sampling scheme 2N + 1 378.96 −0.4745 0.0003 MCSa 54.8721 7.9191 −0.0006 2.9995 762 4N + 1 365.03 −0.4515 0.0048 EDR 2N + 1 54.8720 7.9202 0.0000 3.0000 754 EDR 4N + 1 54.8720 7.9201 0.0000 3.0000 758

a Average value of 30 repetitions with sample size of 1000,000. intermediately nonlinear, two additional simulations conducted at μ pffiffiffiffiffiffi 3 1:5 λ indicate that the interpolation was accurate, whereas the ex- 3 pffiffiffiffiffiffi μ λ 3.3.4. PDF estimation trapolation results deviated from the simulations at 3 6 3.Howev- After obtaining the statistical moments, the probability estimation er, as expected, it is clearly shown inpffiffiffiffiffiFig. 15 that thep contributionffiffiffiffiffi of method such as the method of moments (MOM) and the Pearson system weights are negligible for Xbμ −3 λ and Xbμ þ 3 λ , and thus, i i i i can be used to construct the PDF of random response, which is the final even the extrapolation by MLS contains error, the effect on the integra- outcome of the EDR method for an uncertainty propagation analysis. tion results is minimal. In this study, the PDF of package warpage was constructed using the Once all the one-dimensional integrations in Eq. (10) are completed, Pearson system [28]. The Pearson system is a family of continuous prob- they are combined to calculate the statistical moments. The first four ability distributions, which offers flexibility in constructing the PDF statistical moments are listed in Table 5. based on the first four statistical moments (mean, standard deviation, skewness and kurtosis). The Pearson distributions of the system re- sponse, Y,aredefined by the following differential equation [28]:

1 dpðÞ Y − a þ Y ¼ 2 ð16Þ pYðÞ dY c0 þ c1Y þ c2Y

where a, c0, c1 and c2 are four parameters to describe the PDF. Based on the theoretical derivation, the four parameters can be determined by the first four moments, which can be expressed as [28]: 8 −1 > 2 2 2 > c0 ¼ 4β −3β 10β −12β −18 σ <> 2 1 2 1 −1 2 a ¼ c1 ¼ β ðÞβ þ 3 10β −12β −18 σ ð17Þ > 1 2 2 1 > −1 : β − β2− β − β2− c2 ¼ 2 2 3 1 6 10 2 12 1 18

As denoted in Section 2, σ is the standard deviation, β1 is the skew- ness, and β2 is the kurtosis. It is worth noting that the 1st moment (i.e., mean, μ) is not shown in Eq. (17) since the Pearson system first con- structs the PDF about the mean of zero and then shifts the PDF to the true mean. The coefficients determined from the statistical moments obtained in Section 3.3.3 are summarized in Table 6. Fig. 16(a) depicts the predicted PDFs. Fig. 16(b) shows the enlarged view of the tail-end region. It is clear

Fig. 16. PDFs of and assembly yield loss predicted by two schemes of EDR: (a) entire PDF and (b) enlarged view of the tail-end region. Fig. 17. Yield loss distribution predicted by MCS with the sample size of 1000,000. H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330 329

Table 8 Effect of sample size on MCS predictions.

Samples size = 10,000 Samples size = 100,000 Samples size = 1000,000

Theoretical probability MCS repetitions Theoretical probability MCS repetitions Theoretical probability MCS repetitions

±1% error bounds (762 ± 7.62 ppm) 2.3% 0% (0/30) 7.3% 3.3% (1/30) 22.3% 26.7% (8/30) ±10% error bounds (762 ± 76.2 ppm) 21.7% 30% (9/30) 61.6% 70% (21/30) 99.4% 100% (30/30)

^ that the results of 2N +1and4N + 1 sampling schemes are virtually and NMCS are sufficiently large, ð1−pÞcan be approximated by the nor- − identical, which is attributed to the fact that most of the response curves malp distributionffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi with the mean of (1 p) and the standard deviation fi are linear. By applying the speci cation of JEDEC [21], i.e., ±0.1 mm for of pð1−pÞ=NMCS [6],wherep is the true yield loss. the TFBGA package in this study, the prediction yield losses of the two Fig. 17 shows the yield loss distribution of NMCS = 1000,000. The schemes are 765 ppm and 751 ppm, respectively. yield loss predicted by EDR 2N + 1 scheme is 754 ppm, which falls with- in the ±1% tolerance of the true yield loss (762 ± 7.62 ppm). The prob- 4. Validity of the proposed approach ability that the MCS predictions satisfy the ±1% tolerance is theoretically 22.3% (the light green area in Fig. 17). This theoretical probability is con- It has been shown that the yield loss (i.e., tail-end probability) of a firmed by the single MCS repetition results, which are also shown in package with 12 input variables can be predicted accurately by as few Fig. 17; only 7 out of 30 repetitions (23.3% probability) fall within the as 25 simulations. The direct MCS is used to evaluate the accuracy of ±1% error bound (±7.62 ppm). In other words, for any yield loss predict- the EDR results quantitatively. ed by MCS with a sample size of 1,000,000 has only 22.3% probability that The direct application of the MCS to the current problem was imprac- the error will be smaller than or equal to than the yield loss predicted by tical due to excessive computational time. Instead, an empirical model EDR. When the tolerance is relaxed to ±10% error of the true yield loss, obtained from the coplanarity data at room temperature was used for the MCS has 99.4% probability to fulfill the tolerance. It is also confirmed the verification. The empirical coplanarity model produced by using by that all the 30 repetitions fall within the ±10% error bounds (Fig. 17). the stepwise regression with 43,358 coplanarity data points is given as: The yield loss by the MCS is affected obviously by the sample size. Table 8 summarizes the results obtained from different sample sizes. : : − : : − : YCopl ¼ 2 6366X1 þ 0 4564X2 74 4052X3 þ 36 0588X4 0 06189ðÞX5 þ X6 When the sample size is reduced to 10,000, the probability that the þ 0:0274ðÞþX þ X 1:7329X þ 1:73875X þ 0:6273X −0:00116X 7 8 9 10 11 12 ð18Þ MCS prediction is to be comparable to the EDR is only 2.3%. Even the tol- −0:49186ðÞðÞX5 þ X6 =ðÞ0:4746X1X2X3−0:036ðÞ 1:913ðÞþX5 þ X6 0:8ðÞX7 þ X8 erance is relaxed to ±10%, the sample size of 10,000 has only 21.7% þ 2:0361ðÞþðÞX5 þ X6 =X1X2X4 14:662 probability. Considering the fact that it takes approximately 1.5 h to run the vis- where YCopl is the coplanarity and X1 to X12 are input variables listed in coelastic model used in this case study using an advanced workstation Table 3. The two pairs of input variables – (1) the EMC thickness, X3, with 24 cores, the MCS with 10,000 model runs would take 2 months, and the PCB thickness, X4, and (2) the EMC CTE below and above Tg, X9 which is impractical for most of the semiconductor packaging applica- and X10 – are still considered to be statistically correlated with the same correlation coefficients as −0.35 and 1, respectively. tions. Conversely, the EDR provides yield loss prediction with uncertain- As mentioned earlier, the statistical moments and the yield loss ty less than 1% with only 25 model runs and results in 37.5 h fi estimated from the MCS exhibit statistical variations. It is well- computational time. The comparison result clearly con rms the effec- known that the true values can be obtained by employing the unbi- tiveness of EDR in the tail-end probability prediction. fi ased estimators in multiple repetitions. According to the central In this study, the accuracy of the proposed approach was con rmed fi limit theorem, 30 repetitions will produce a good approximation of for symmetric input distributions. It was con rmed in Ref. [17] that the thetruevalue[29,30]. More details about the unbiased estimators EDR method is effective for both symmetric and asymmetric distribu- of the 1st to 4th statistical moments and the yield loss can be found tions. It is expected that the proposed approach will also work effectively in Ref. [31,6],respectively. with any asymmetric input distributions as long as the input variables For the yield loss prediction, the coplanarity criterion was set as 80 have linear-dependency. μm according to the room temperature coplanarity criterion suggested by JEDEC [32,33]. It is to be noted that this is different from the package 5. Conclusion warpage specification at reflow temperature discussed in Section 3. In this study, the MCS with the sample size of 1,000,000 was conduct- The eigenvector dimension reduction (EDR) method was imple- ed for 30 repetitions to estimate the true statistical moments and the mented to predict the assembly yield of a stacked die thin flat ball grid true yield loss. The comparison between the MCS and EDR is summa- array (TFBGA) package. A total of 12 manufacturing input variables rized in Table 7. As expected, the results from 2N +1and4N +1sam- were considered during the yield prediction, among which two pairs of pling schemes are nearly identical. Differences of statistical moments properties had the statistical correlation. The method calculated the sta- between the MCS result and the EDR results are very small (only fourth tistical moments of warpage distribution first through dimension reduc- decimal place). The effect of these differences on the PDF construction tion and eigenvector sampling. The probability density function (PDF) of are minimal, which produces only 8 ppm difference in yield loss the warpage was constructed from the statistical moments by the Pear- prediction. son system. The assembly yield was predicted from the PDF based on the A quantitative comparison of the yield loss was made using the re- JEDEC reflow flatness requirements. sults of the MCS sample size of 1,000,000. The yield loss prediction by In this case study, only 25 modeling runs were needed to predict the MCS is expressed as [6]: assembly yield with uncertainty less than 1% despite the fact that the  prediction dealt with a tail-end probability (less than 1000 ppm) with k 12 input variables. The number of input variables was much larger ðÞ¼1−p^ 1− ð19Þ NMCS than that has been conceived as the practical limit of the uncertainty propagation analysis. More applications of the EDR method are expected where NMCS is the sample size used in MCS; and k is the number of pre- to provide solutions to tail-end probability problems which have not dicted coplanarity less than or equal to the coplanarity criterion. When k been feasible due to a computational burden. 330 H.-P. Wei et al. / Microelectronics Reliability 78 (2017) 319–330

Acknowledgment [14] S. Reh, et al., Probabilistic finite element analysis using ANSYS, Struct. Saf. 28 (2006) 17–43. [15] H. Seo, B. Kwak, An improved reliability analysis using design of experiments and an This work was supported by Samsung Electronics through UMD: KFS application to tolerance design, The 5th World Congress of Structural and Multidisci- 4326670. Their support is greatly appreciated and graciously plinary Optimization, Lodo do Jesolo, Italy May, 2003, pp. 19–23. [16] S. Rahman, H. Xu, A univariate dimension-reduction method for multi-dimensional acknowledged. integration in stochastic mechanics, Probabilistic Engineering Mechanics 19 (2004) 393–408. References [17] B.D. Youn, et al., Eigenvector dimension reduction (EDR) method for sensitivity-free probability analysis, Struct. Multidiscip. Optim. 37 (2008) 13–28. [1] K. Ishibashi, PoP (package-on-package) stacking yield loss study, 2007 Proceedings [18] J. Zhao, et al., Effects of package design on top PoP package warpage, 2008 58th Elec- 57th Electronic Components and Technology Conference 2007, pp. 1403–1408. tronic Components and Technology Conference 2008, pp. 1082–1088. [2] D. Xie, et al., Yield study of inline Package on Package (PoP) assembly, Electronics [19] D. Xie, et al., Head in pillow (HIP) and yield study on SIP and PoP assembly, 2009 Packaging Technology Conference 2008, pp. 1202–1208 EPTC 2008. 10th, 2008. 59th Electronic Components and Technology Conference 2009, pp. 752–758. [3] T. Katahira, et al., 0.3 mm pitch CSP/BGA development for mobile terminals, Pro- [20] W. Lin, M.W. Lee, PoP/CSP warpage evaluation and viscoelastic modeling, 2008 58th ceedings of the 40th International Symposium on Microelectronics 2007, pp. 11–15. Electronic Components and Technology Conference 2008, pp. 1576–1581. [4] K. Yang, B.S. El-Haik, Design for Six Sigma, McGraw-Hill, New York, 2003. [21] JEDEC, “JEP 95 SPP-024A standard,” Reflow Flatness Requirements for Ball Grid [5] D. Kececioglu, Robust Engineering Design-by-reliability With Emphasis on Mechan- Array Packages, 2009. ical Components & Structural Reliability, 1, DEStech Publications, Inc., 2003 [22] JEDEC, “Coplanarity Test for Surface-mount Semiconductor Devices,” JESD22-B108A, [6] D. Vose, Risk Analysis: A Quantitative Guide, John Wiley & Sons, 2008. 2003. [7] D.-G. Yang, et al., Parametric study on flip chip package with lead-free solder joints [23] Y. Sun, et al., Measurement of the comprehensive viscoelastic properties of advanced by using the probabilistic designing approach, Microelectron. Reliab. 44 (2004) EMC using FBG sensor, Electronic Components and Technology Conference (ECTC), 1947–1955. 2016 IEEE 66th 2016, pp. 531–537. [8] R.H. Myers, et al., Response Surface Methodology: Process and Product Optimization [24] IPC/JEDEC J-STD-020C, Moisture/reflow Sensitivity Classification for Nonhermetic Using Designed Experiments, John Wiley & Sons, 2016. Solid State Surface Mount Devices, JEDEC Solid State Technology Association, 2004. [9] P. Rajaguru, et al., Sintered silver finite element modelling and reliability based design [25] C.C. Zhang, H.-P.B. Wang, Robust design of assembly and machining tolerance alloca- optimisation in power electronic module, Microelectron. Reliab. 55 (2015) 919–930. tions, IIE Trans. 30 (1997) 17–29. [10] B. Öztürk, et al., Finite element based design of a new dogbone specimen for low cycle [26] Micron Technology Inc., “16Gb: 216-Ball, Dual-Channel Mobile LPDDR3 SDRAM Fea- fatigue testing of highly filled epoxy-based adhesives for automotive applications, tures,” EDFA164A1PB, EDFA164A1PK Datasheet, 2014. Thermal, Mechanical and Multi-Physics Simulation and Experiments in Microelec- [27] Amkor Technology Inc., “Package on Package (PoP) Family,” DS586G Datasheet, 2012. tronics and Microsystems (EuroSimE), 2013 14th International Conference on 2013, [28] N.L. Johnson, et al., Continuous Univariate Distributions, 1, John Wiley & Sons, New pp. 1–4. York, 1994 163. [11] J.W. Evans, et al., Designing and building-in reliability in advanced microelectronic [29] R.V. Hogg, et al., Probability and Statistical Inference: Pearson Higher Ed, 2014. assemblies and structures, IEEE Transactions on Components, Packaging, and [30] J.M. Ferrin, et al., Conceptual and practical implications for rehabilitation research: Manufacturing Technology: Part A 20 (1997) 38–45. effect size estimates, confidence intervals, and power, Rehabilitation Education 21 [12] L. Marretta, R. Di Lorenzo, Influence of material properties variability on springback (2007) 87–100. and thinning in sheet stamping processes: a stochastic analysis, Int. J. Adv. Manuf. [31] M. Holický, Reliability Analysis for Structural Design: AFRICAN SUN MeDIA, 2009. Technol. 51 (2010) 117–134. [32] JEDEC Publication 95 (JEP95), Design Guide 4.5, Fine Pitch Square Ball Grid Array [13] S. Stoyanov, et al., Computational approach for reliable and robust system-in-package (FBGA) Package, January 2009. design, 2007 30th International Spring Seminar on Electronics Technology (ISSE) [33] JEDEC Pubilication 95 (JEP95), Design Registration 4.14, Ball Grid Array Package, 2007, pp. 40–45. April 2011.