
2012 IEEE International Symposium on Information Theory Proceedings

Optimized Cell Programming for Flash Memories with Quantizers

Minghai Qin∗, Eitan Yaakobi∗†, Paul H. Siegel∗
∗Electrical and Computer Engineering Department, University of California, San Diego
†Electrical Engineering Department, California Institute of Technology
Emails: {mqin, eyaakobi, psiegel}@ucsd.edu

Abstract—Multi-level flash memory cells represent data by the amount of charge stored in them. Certain voltages are applied to the flash memory cells to inject charges when programming, and the cell level can only be increased during the programming process as a result of the high cost of erasures. To achieve a high speed during writing, parallel programming is used, whereby a common voltage is applied to a group of cells to inject charges simultaneously. The voltage sharing simplifies the circuitry and increases the programming speed, but it also affects the precision of charge injection and limits the storage capacity of flash memory cells. Another factor that limits the precision of cell programming is the thermal noise induced during charge injection.

In this paper, we focus on noiseless parallel programming of multiple cells and noisy programming of a single cell. We propose a new criterion to evaluate the performance of cell programming which is more suitable for flash memories in practice, and we then optimize the parallel programming strategy accordingly. We then proceed to noisy programming and consider the two scenarios where feedback on cell levels is either available during programming or not. We study the optimization problem under both circumstances and present algorithms to achieve the optimal performance.

I. INTRODUCTION

Flash memories are a widely-used technology for non-volatile data storage. The basic memory units in flash memories are floating-gate cells, which use charge (i.e., electrons) stored in them to represent data; the amount of charge stored in a cell determines its level. The hot-electron injection mechanism or Fowler-Nordheim tunneling mechanism [2] is used to increase or decrease a cell level by injecting charge into it or by removing charge from it, respectively. The cells in flash memories are organized as blocks, each of which contains about 10^6 cells. One of the most prominent features of programming flash memory cells is its asymmetry: increasing a cell level, i.e., injecting charge into a cell, is easy to accomplish by applying a certain voltage to the cell, while decreasing a cell level, i.e., removing charge from a cell, is expensive in the sense that the block containing the cell must first be erased, i.e., all charge in the cells within the block is totally removed, before the cells are reprogrammed to their target levels. The erase operation, called a block erasure, is not only time consuming, but also degrades the performance and reduces the longevity of flash memories [2].

In order to minimize the number of block erasures, programming flash memories is accomplished very carefully using multiple rounds of charge injection to avoid "overshooting" the desired cell level. Therefore, a flash memory can be modeled as a Write Asymmetric Memory (WAM), for which capacity analysis and coding strategies are discussed in [1], [5], [11].

Parallel programming is a crucial tool to increase the write speed when programming flash memory cells. Two important properties of parallel programming are the use of shared program voltages and the need to account for variation in charge injection [12]. Instead of applying distinct program voltages to different cells, parallel programming applies a common program voltage to many cells simultaneously. Consequently, the complexity of the hardware realization is substantially reduced and the write speed is therefore increased. Parallel programming must also account for the fact that cells have different hardness with respect to charge injection [3], [9]. When the same program voltage is applied to several cells, the amount of charge trapped in different cells may vary. Those cells that trap a large amount of charge are called easy-to-program cells and those that trap little charge are called hard-to-program cells. Understanding this intrinsic property of cells allows cells to be programmed according to their hardness of charge injection. One widely-used programming method is the Incremental Step Pulse Programming (ISPP) scheme [3], [9], which allows easy-to-program cells to be programmed with a lower program voltage and hard-to-program cells to be programmed with a higher program voltage.

In [6], [7], optimized programming for a single flash memory cell was studied. A programming strategy to optimize the expected precision with respect to two cost functions was proposed, where one of the cost functions is the ℓ_p metric and the other is related to rank modulation [8]. It was assumed that the programming noise follows a uniform distribution and the level increment is chosen adaptively according to the current cell level to minimize the cost function.

In [12], algorithms for parallel programming were studied. The underlying model incorporated shared program voltages and variations in cell hardness, as well as a cost function based upon the ℓ_p metric. The programming problem was formulated as a special case of the Subspace/Subset Selection Problem (SSP) [4] and the Sparse Approximate Solution Problem (SAS) [10], both of which are NP-hard. An algorithm with polynomial time complexity was then proposed to search for the optimal programming voltages.

We note that flash memories use multiple discrete levels to represent data in real applications [2]. Hence, if the actual cell level is within a certain distance from the target level, it will be quantized to the correct target level even though there is a gap between them. Read errors can be mitigated by use of error correction codes. If the error correction capability is
e, then any read error will be totally eliminated as long as the number of read errors is less than e. This motivates us to consider another cost function, namely, the number of cells that are not quantized correctly to their target levels.

978-1-4673-2579-0/12/$31.00 ©2012 IEEE

Assume that Θ = (θ_1, ..., θ_n) is the vector of target cell levels and ℓ_t = (ℓ_{1,t}, ..., ℓ_{n,t}) is a vector of random variables representing the level of every cell after t programming rounds. Note that in general the value of ℓ_{i,t}, for 1 ≤ i ≤ n, depends on the applied voltages, the hardness of the cell, and the programming noise. We evaluate the performance of any programming method by some cost function C(Θ, ℓ_t) between the target cell levels Θ and the actual cell levels ℓ_t. The programming problem is then to find an algorithm which minimizes the expected value of C(Θ, ℓ_t). In [12], the ℓ_p metric was considered as the cost function:

C_p(Θ, ℓ_t) = ( ∑_{i=1}^{n} |θ_i − ℓ_{i,t}|^p )^{1/p}.

Motivated by the quantization of the cell levels, we study in this paper the cost function

C_Δ(Θ, ℓ_t) = |{ i ∈ [n] : |θ_i − ℓ_{i,t}| > Δ_i }|,

where Δ_i is the quantization distance for the i-th cell. We solve this problem for the special case where the hardness of each cell is known and there is no programming noise. We also study the problem in the presence of noise for a single cell, with and without feedback.

The rest of the paper is organized as follows. In Section II, we propose a new cost function and define the parallel programming problem when flash memories quantize the amount of charge to discrete levels. In Section III, we present a polynomial-time algorithm to optimize noiseless parallel programming with the deterministic parameters defined in Section II. In Section IV, single-cell programming with noise is studied when there is no feedback information on the cell level. In Section V, noisy cell programming is studied when we can adaptively apply voltages according to feedback on the current cell level. Due to lack of space, some proofs are omitted.

II. PRELIMINARIES

Let c_1, c_2, ..., c_n be n flash memory cells, with erased level denoted by 0. Their levels can be increased by injecting electrons, but cannot be decreased. We denote by [n] the set of positive integers less than or equal to n, i.e., [n] = {1, 2, ..., n}. When applying a voltage V to a memory cell c_i, where i ∈ [n], we assume that the increase of the level of cell c_i is linear in V; that is, the level of c_i will increase by

α_i V + ε,

where α_i and ε are random variables. We call the α_i's, where α_i > 0, i ∈ [n], the hardness of charge injection for cell c_i, and ε is the programming noise. (Note that the distribution of ε may vary among different cells and different writes.) We denote by θ_i ≥ 0, i ∈ [n], the target level of c_i. The programming process consists of t rounds of charge injection achieved by applying a specified voltage to all of the cells. The goal is to program the cell levels to be as close as possible to the target levels. We define the parallel programming problem in detail as follows.

Let Θ = (θ_1, ..., θ_n) be the target cell levels and α = (α_1, ..., α_n) be the hardness of charge injection. Let V = (V_1, V_2, ..., V_t)^T be the voltages applied on the t rounds of programming. Let b_{i,j} ∈ {0,1} for i ∈ [n] and j ∈ [t] indicate whether c_i is programmed on the j-th round; i.e., b_{i,j} = 1 if voltage V_j is applied to cell c_i, and b_{i,j} = 0 otherwise. Let ℓ_{i,t} for i ∈ [n] be a random variable representing the level of c_i after t rounds of programming. Then

ℓ_{i,t} = ∑_{j=1}^{t} (α_i V_j + ε_{i,j}) b_{i,j},

where ε_{i,j}, for i ∈ [n] and j ∈ [t], is the programming noise of the i-th cell on the j-th programming round. We define ℓ_t = (ℓ_{1,t}, ..., ℓ_{n,t}) to be the cell-state vector after t rounds of programming, and we define the matrix

B = ( b_{1,1}  b_{2,1}  ···  b_{n,1} )
    ( b_{1,2}  b_{2,2}  ···  b_{n,2} )
    (   ⋮        ⋮      ⋱      ⋮    )
    ( b_{1,t}  b_{2,t}  ···  b_{n,t} )   ∈ {0,1}^{t×n}

to be the indicator matrix of the programmed cells on each round of programming.

We evaluate the performance of the programming by some cost function C(Θ, ℓ_t). The programming problem is to minimize the expected value of C(Θ, ℓ_t) over V and B. That is, given Θ, α and {ε_{i,j}}_{n×t}, we seek to solve

minimize E[C(Θ, ℓ_t)],   (P1)

with V ∈ R_+^t and B ∈ {0,1}^{t×n}, where E[X] is the expected value of the random variable X.

Remark 1. We use R_+ to denote the set of all non-negative real numbers, i.e., R_+ = {x ∈ R : x ≥ 0}.

In [12], the ℓ_p metric is considered as the cost function, i.e.,

C_p(Θ, ℓ_t) = ( ∑_{i=1}^{n} |θ_i − ℓ_{i,t}|^p )^{1/p},

and the optimal solution for (P1) was derived for known α in the absence of noise. However, in real applications, flash memories use multiple discrete levels to represent data, and if the cell level ℓ_{i,t} is within a certain distance of the target level θ_i, it will be quantized to the correct target level even though there is a gap between ℓ_{i,t} and θ_i. This motivates us to consider the number of cells that are not correctly quantized to their target levels as the cost function. To be more precise, letting Δ = (Δ_1, ..., Δ_n), we define

C_Δ(Θ, ℓ_t) = |{ i ∈ [n] : |θ_i − ℓ_{i,t}| > Δ_i }|

to be the cost function, where Δ_i is the quantization distance for c_i. Therefore, the cell programming problem is to solve

minimize E[ |{ i ∈ [n] : |θ_i − ℓ_{i,t}| > Δ_i }| ],   (P2)

with V ∈ R_+^t and B ∈ {0,1}^{t×n}.

III. NOISELESS PARALLEL PROGRAMMING

In this section, we assume that the cell hardness parameters (α_1, ..., α_n) are known and deterministic, and that there is no programming noise, i.e., ε_{i,j} = 0 for all i ∈ [n], j ∈ [t]. In this scenario, ℓ_{i,t} is deterministic, so we can omit the expectation in (P2), and ℓ_{i,t} = α_i ∑_{j=1}^{t} V_j b_{i,j}. Let n, t, Δ_1, ..., Δ_n and
θ_1, ..., θ_n denote the block length, number of programming rounds, quantization distances and target levels, respectively. Our goal is to find an optimal solution to (P2).

Lemma 1. The solution to Problem (P2) is equivalent to the solution of the following:

maximize f(V, B),   (P3)

with V = (V_1, ..., V_t)^T ∈ R_+^t, b_i = (b_{i,1}, ..., b_{i,t})^T ∈ {0,1}^t and B = (b_1, ..., b_n), where u_i = (θ_i − Δ_i)/α_i, v_i = (θ_i + Δ_i)/α_i, i ∈ [n], and f(V, B) = |{ i ∈ [n] : u_i ≤ b_i^T · V ≤ v_i }|.

Proof: The following chain of equations is easily established:

min_{V,B} |{ i ∈ [n] : |θ_i − ℓ_{i,t}| > Δ_i }|
= n − max_{V,B} |{ i ∈ [n] : |θ_i − ℓ_{i,t}| ≤ Δ_i }|
= n − max_{V,B} |{ i ∈ [n] : |θ_i/α_i − ℓ_{i,t}/α_i| ≤ Δ_i/α_i }|
= n − max_{V,B} |{ i ∈ [n] : |θ_i/α_i − b_i^T · V| ≤ Δ_i/α_i }|
= n − max_{V,B} |{ i ∈ [n] : θ_i/α_i − Δ_i/α_i ≤ b_i^T · V ≤ θ_i/α_i + Δ_i/α_i }|
= n − max_{V,B} |{ i ∈ [n] : u_i ≤ b_i^T · V ≤ v_i }|,

where V ∈ R_+^t, B ∈ {0,1}^{t×n}, u_i = (θ_i − Δ_i)/α_i and v_i = (θ_i + Δ_i)/α_i. This establishes the chain. ∎

Since u_i and v_i, i ∈ [n], are the boundaries of the correct quantization interval for c_i, we call them the lower threshold point and the upper threshold point for c_i, and we call the interval [u_i, v_i] the quantization interval for c_i. Any pair (V, B) that achieves the maximum for (P3) is called an optimal solution pair, and V is called optimal, or an optimal solution, if there exists B such that (V, B) is an optimal solution pair.

Definition 1. Suppose u_i and v_i, i ∈ [n], are defined as in Lemma 1. Let T_u be the set of lower threshold points and T_v be the set of upper threshold points, i.e., T_u = ∪_{i∈[n]} {u_i} and T_v = ∪_{i∈[n]} {v_i}. Let T = T_u ∪ T_v be the set of all threshold points.

Remark 2. We assume that |T| > t, since otherwise we can easily achieve C_Δ(Θ, ℓ_t) = 0 by setting {V_1, ..., V_t} = T.

Definition 2. Suppose V = (V_1, ..., V_t)^T ∈ R_+^t. We define S_V to be

S_V = ∪_{b ∈ {0,1}^t} { b^T · V },

and call it the attainable set of V. That is, S_V is the set of voltage values that can be injected by applying V.

Remark 3. For each i ∈ [n], if there exists z ∈ S_V such that u_i ≤ z ≤ v_i, then there exists b ∈ {0,1}^t such that u_i ≤ b^T · V ≤ v_i, and thus c_i can be quantized to the correct target level.

For a fixed V, optimizing the cost function over B is easy to accomplish by checking, for each i, whether there exists z ∈ S_V such that u_i ≤ z ≤ v_i. Intuitively, we could enumerate every possible V and calculate the cost function to search for an optimal solution. However, since V can be any point in R_+^t, there is an uncountably infinite number of choices of V to consider. Lemma 2 states that we can limit the number under consideration to be polynomial in n, and still guarantee that an optimal solution can be found. Although this lemma plays the most important role in deriving the algorithm, the proof is too long to present due to space limitations.

Lemma 2. There exists an invertible matrix A ∈ {0,1}^{t×t} such that

A · V = p,

where V is an optimal solution for (P3) and p ∈ T^t.

Remark 4. In Lemma 2 and Algorithm 1 below, the matrix A has to be invertible over R instead of over GF(2). Therefore, enumerating only the matrices that are invertible over GF(2) is not sufficient to find an optimal solution.

Next we give an algorithm to search for an optimal solution to (P3) which, as we have shown, is also an optimal solution to (P2). Let {p_1, ..., p_M} be an arbitrary ordering of the points in T, where M = |T| is the number of different threshold point values, and p_i can be the value of either an upper or a lower threshold point, for i ∈ [M]. Since p is of length t, there are N = M^t choices of p (the entries can be repeated). Let p̄_1, ..., p̄_N be an arbitrary ordering of these choices. A matrix Ã ∈ {0,1}^{t×t} is formed such that no two rows are the same. Thus, the number of different Ã's is Q = ∏_{k=0}^{t−1} (2^t − k). Let {Ã_1, ..., Ã_Q} be an arbitrary ordering of all possible Ã's. Algorithm 1 iterates over all choices of p̄ and those Ã's that are invertible.

Algorithm 1 PARALLEL PROGRAMMING
Let f* = 0;
Let V = V* = (0, ..., 0), where V and V* both have length t;
Let B = (b_1, ..., b_n) ∈ {0,1}^{t×n}, b_i = 0, ∀i ∈ [n];
Let B* ∈ {0,1}^{t×n}, b*_{i,j} = 0, ∀i ∈ [t], j ∈ [n];
For i = 1, 2, ..., N {
  For j = 1, 2, ..., Q {
    If Ã_j is invertible and Ã_j^{−1} · p̄_i ∈ R_+^t {
      Let V = Ã_j^{−1} · p̄_i;
      Let f = 0;
      For k = 1, 2, ..., n {
        If ∃ z ∈ S_V such that u_k ≤ z ≤ v_k {
          Find b ∈ {0,1}^t such that b^T · V = z;
          Let b_k = b;
          f = f + 1;
        }
      }
      If f > f* {
        f* = f, V* = V, B* = B;
      }
    }
  }
}
Output the optimal solution pair (V*, B*) with maximized f(V*, B*) = f*.

Remark 5. Since any matrix with two identical rows is not invertible, we only enumerate Ã matrices with distinct rows and then check their invertibility. Therefore, the number of A ∈ {0,1}^{t×t} that we consider is ∏_{k=0}^{t−1} (2^t − k). Furthermore, note that the complexity of the algorithm could be slightly reduced if we enumerated only the set of matrices A that are invertible over R.
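For concreteness, the enumeration in Algorithm 1 can be sketched in Python. This is an illustrative reimplementation rather than the authors' code: the function names (`algorithm1`, `solve`), the numerical tolerances, and the omission of the recovery of the matrix B* are our own simplifications.

```python
# Illustrative sketch of Algorithm 1 (hypothetical helper names and tolerances).
from itertools import product

def solve(A, p):
    """Solve A x = p over the reals by Gauss-Jordan elimination.
    Returns None when A is singular (such matrices are skipped)."""
    t = len(p)
    M = [list(map(float, A[i])) + [float(p[i])] for i in range(t)]
    for col in range(t):
        piv = max(range(col, t), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(t):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, t + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][t] / M[i][i] for i in range(t)]

def algorithm1(theta, delta, alpha, t):
    """Return (f*, V*): the number of correctly quantized cells and the voltages."""
    n = len(theta)
    u = [(theta[i] - delta[i]) / alpha[i] for i in range(n)]   # lower thresholds
    v = [(theta[i] + delta[i]) / alpha[i] for i in range(n)]   # upper thresholds
    T = sorted(set(u) | set(v))                                # threshold points
    rows = list(product((0, 1), repeat=t))
    best_f, best_V = -1, None
    for p in product(T, repeat=t):           # all threshold-point vectors p in T^t
        for A in product(rows, repeat=t):    # all 0/1 matrices; singular ones skipped
            V = solve(A, p)
            if V is None or any(x < -1e-9 for x in V):
                continue                     # V must lie in R_+^t
            # attainable set S_V: all subset sums of the entries of V
            S = {sum(b[j] * V[j] for j in range(t)) for b in rows}
            f = sum(1 for i in range(n)
                    if any(u[i] - 1e-9 <= z <= v[i] + 1e-9 for z in S))
            if f > best_f:
                best_f, best_V = f, V
    return best_f, best_V
```

For example, for two cells with θ = (1, 10), Δ = (0.1, 0.1), α = (1, 1) and t = 2, the search finds a voltage pair such as V = (0.9, 9.0): the subset sums 0.9 and 0.9 + 9.0 = 9.9 fall in the two quantization intervals, so f* = n = 2, whereas no single voltage achieves this for t = 1.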

Theorem 1. Algorithm 1 finds the optimal solution pair (V*, B*) and computes the optimal value f(V*, B*) for (P3). The time complexity of the algorithm is O(n^{t+1}).

Proof: According to Lemma 2, there exist an optimal solution (V, B), an invertible matrix A ∈ {0,1}^{t×t}, and a threshold-point vector p ∈ T^t, such that A · V = p. In Algorithm 1, all possible A's and p's are exhaustively iterated, so there is at least one optimal V among all the V's derived from the A's and p's, and the algorithm outputs the best V among them. This proves that the algorithm finds an optimal solution to (P3). The number of iterations of the algorithm is of order NQnt³, where N = M^t ≤ (2n)^t and Q = ∏_{k=0}^{t−1} (2^t − k). Treating t as a constant, the complexity is O(n^{t+1}). ∎

IV. SINGLE CELL NOISY PROGRAMMING WITHOUT FEEDBACK

In this section, programming noise is assumed to exist. There is a single cell with injection hardness α, and the number of programming rounds is t. The programming noises ε_1, ..., ε_t are assumed to be independently distributed Gaussian random variables with zero mean and variances σ²_{ε_j}, j ∈ [t], respectively.

Remark 6. Note that according to this model, after every programming round the level of the cell can decrease, even though in real applications the cell level of flash memories can only increase. We choose to study this model while assuming that the standard deviation σ_{ε_j}, j ∈ [t], is much smaller than αV_j, i.e., P(αV_j + ε_j < 0) is very small, so that the probability of decreasing the cell level is negligible. This model is a reasonable approximation to a physical cell and it can be studied analytically, as will be seen in this section.

Another reasonable assumption we make is σ_{ε_j} = σV_j, j ∈ [t], where σ is a fixed number; that is, the standard deviation of the programming noise is proportional to the programming voltage. This makes sense since a large voltage applied to the cell results in large power of the programming noise. During programming, no feedback information is available, meaning that the actual amount of charge trapped in the cell after each round of programming is not known. The goal is to maximize the probability that after t rounds of programming the final level is in [θ − Δ, θ + Δ], i.e.,

maximize P( θ − Δ ≤ ∑_{j=1}^{t} (αV_j + ε_j) ≤ θ + Δ ),   (P4)

with V ∈ R_+^t.

In the rest of the section, (P4) is recast as an optimization problem and the vector of voltages V is written as x in accordance with convention.

Lemma 3. Assume that σ_{ε_j} = σV_j, j ∈ [t], where σ is a fixed number, and that feedback information is not available. Then the cell programming problem (P4) is equivalent to

maximize g(x),   (P5)

with x ∈ R_+^t, where

g(x) = (1/√(2π)) ∫_{c(x)−δ(x)}^{c(x)+δ(x)} e^{−u²/2} du,

c(x) = (θ − α ∑_{j=1}^{t} x_j) / (σ √(∑_{j=1}^{t} x_j²)),   δ(x) = Δ / (σ √(∑_{j=1}^{t} x_j²)).

Let p(y) = (1/√(2π)) e^{−y²/2} be the N(0,1) Gaussian probability density function. Then g(x) can be interpreted as the area between the curve p(y) and the line y = 0 on the interval determined by x, where the interval is centered at c(x) with radius δ(x). For simplicity, ∑_{j=1}^{t}(·) is written as ∑(·) when the context makes the meaning clear.

The following lemma will be used to determine the optimal solution to (P5) in Theorem 2.

Lemma 4. If x* is the optimal solution to (P5), then x*_i = x for all i ∈ [t], for some constant x ∈ R_+.

Theorem 2. The optimal solution x to (P5) satisfies x_j = x, ∀j ∈ [t], where x is the positive root of the equation

2 ln(b/a) x² + 2(b − a)c x + (a² − b²) = 0,

with a = (θ − Δ)/(σ√t), b = (θ + Δ)/(σ√t), and c = α√t/σ.

Proof: According to Lemma 3,

g(x) = (1/√(2π)) ∫_{c(x)−δ(x)}^{c(x)+δ(x)} e^{−u²/2} du
     = (1/√(2π)) ∫_{(θ−Δ−α∑x_j)/(σ√(∑x_j²))}^{(θ+Δ−α∑x_j)/(σ√(∑x_j²))} e^{−u²/2} du
     ≤ (1/√(2π)) ∫_{(θ−Δ−αtx)/(σ√(tx²))}^{(θ+Δ−αtx)/(σ√(tx²))} e^{−u²/2} du
     = (1/√(2π)) ∫_{(a−cx)/x}^{(b−cx)/x} e^{−u²/2} du
     := (1/√(2π)) h(x),

where a = (θ − Δ)/(σ√t), b = (θ + Δ)/(σ√t), and c = α√t/σ. The inequality follows from Lemma 4 and is satisfied with equality if x_j = x, ∀j ∈ [t]. Differentiating h(x) with respect to x, we have

dh/dx = e^{−((b−cx)/x)²/2} · (−cx − (b − cx))/x² − e^{−((a−cx)/x)²/2} · (−cx − (a − cx))/x² = 0
⇔ 2 ln(b/a) x² + 2(b − a)c x + (a² − b²) = 0.

It can be seen that there is only one extremal point of h(x) for x > 0 and that h(x) > 0 for all x > 0. Meanwhile, h(x) approaches 0 as x approaches +∞. It follows that the extremal point can only be where h(x) achieves its maximum, which completes the proof. ∎

V. SINGLE CELL NOISY PROGRAMMING WITH FEEDBACK

In this section, we assume that after every round of programming, we can evaluate the amount of charge that has already been trapped in the cell. That is, we can measure
∑_{j=1}^{k} (αV_j + ε_j) after the k-th round of programming¹, ∀k ∈ [t]. Therefore, we can adaptively choose the applied voltages according to the current cell level. Similarly, we assume that the injection hardness α of the cell is known and fixed, and that the programming noise values ε_1, ..., ε_t are independent random variables with probability density functions p_j(x), ∀j ∈ [t]. Our goal is to maximize the probability that after t rounds of programming the final level is in [θ − Δ, θ + Δ], i.e.,

maximize P( θ − Δ ≤ ∑_{j=1}^{t} (αV_j + ε_j) ≤ θ + Δ ),   (P6)

with V ∈ R_+^t.

Definition 3. Let P(V_1^t, θ, Δ, t) be the probability that the final cell level after t rounds of programming is in [θ − Δ, θ + Δ] when the voltages applied are V_1^t, where V_i^j = (V_i, V_{i+1}, ..., V_j). Let P(θ, Δ, t) be the maximum probability over all choices of V_1^t, i.e.,

P(θ, Δ, t) = max_{V_1^t ∈ R_+^t} P(V_1^t, θ, Δ, t),

where

P(V_1^t, θ, Δ, t) = P( θ − Δ ≤ ∑_{j=1}^{t} (αV_j + ε_j) ≤ θ + Δ ).

Suppose the target level and quantization distance are θ and Δ, respectively, and let P(θ, Δ, t) be as in Definition 3. Then the following recursion holds:

P(θ, Δ, t) = max_{V_1 ∈ R_+} ∫_R p_1(x − αV_1) P(θ − x, Δ, t − 1) dx,

for t ≥ 2, and

P(θ, Δ, 1) = max_{V_1 ∈ R_+} ∫_{θ−Δ}^{θ+Δ} p_1(x − αV_1) dx.

We can compute P(θ, Δ, t) numerically using the recursion once we know the noise distributions p_j(x), j ∈ [t]. However, analytical results are difficult to derive since the noise distribution p_j(x), j ∈ [t], could be an arbitrary probability distribution. In the sequel, we assume a simple yet nontrivial noise distribution, namely, ε_j is uniformly distributed over [−δ_1 V_j, δ_2 V_j] for j ∈ [t], so that the level increase αV_j + ε_j is uniformly distributed over [(α − δ_1)V_j, (α + δ_2)V_j], where 0 ≤ δ_1 ≤ α and δ_2 ≥ 0. Thus

p_j(x) = 1/((δ_1 + δ_2)V_j) · I_{x ∈ [−δ_1 V_j, δ_2 V_j]}.

This assumption is very similar to the one made in [6]. The size of the support set of the noise distribution is proportional to the programming voltage, which is reasonable since larger voltages result in larger deviations of the noise distribution.

Theorem 3. In Definition 3,

P(θ, Δ, 1) = 1,                                        if (θ − Δ)/(θ + Δ) < (α − δ_1)/(α + δ_2),
P(θ, Δ, 1) = ((α + δ_2)/(δ_1 + δ_2)) · (2Δ/(θ + Δ)),   if (θ − Δ)/(θ + Δ) ≥ (α − δ_1)/(α + δ_2),

and the optimal solution is achieved by V_1 = (θ + Δ)/(α + δ_2).

Next we would like to find the values of V_1^t that maximize P(V_1^t, θ, Δ, t) with feedback information, for arbitrary t.

Lemma 5. P(θ, Δ, t) is a non-increasing function of θ.

Lemma 6. P(V_1^t, θ, Δ, t) is maximized when V_1 = (θ + Δ)/(α + δ_2).

Next we give an algorithm for determining the optimal cell programming for Problem (P6), where feedback information is available.

Algorithm 2. Suppose the cell level is x_j before the j-th write, where 1 ≤ j ≤ t and x_1 = 0. Then on the j-th write, set V_j = (θ − x_j + Δ)/(α + δ_2).

Theorem 4. Algorithm 2 gives an optimal solution for the cell programming problem (P6).

VI. CONCLUSION

The study of cell programming for flash memories is an important step toward understanding their storage capacity. Flash memories have the unique property that the cell levels can only increase during programming, which gives rise to a monotonic cell-programming model. In this paper, a new criterion to measure the performance of cell programming is proposed. Both noiseless parallel programming and noisy single-cell programming are studied, and the potential benefit of using feedback to adaptively choose the programming voltages is considered.

VII. ACKNOWLEDGMENT

This research was supported in part by the ISEF Foundation, the Lester Deutsch Fellowship, the University of California Lab Fees Research Program, Award No. 09-LR-06-118620-SIEP, the National Science Foundation under Grant CCF-1116739, and the Center for Magnetic Recording Research at the University of California, San Diego. The authors would like to thank Lele Wang for her comments on the statement and proof of Lemma 2.

¹Measuring the exact amount of charge injected is time consuming in real applications; thus it is common to compare the cell level to certain threshold values and thereby obtain a range for the cell level. In this work, we follow the assumption that the actual cell level is available, as in [6].

REFERENCES

[1] V. Bohossian, A. Jiang, and J. Bruck, "Buffer codes for asymmetric multi-level memory," in Proc. IEEE Int. Symp. Inform. Theory, June 2007, pp. 1186–1190.
[2] P. Cappelletti, C. Golla, P. Olivo, and E. Zanoni, Flash Memories. Kluwer Academic Publishers, 1st Edition, 1999.
[3] K. D. Suh et al., "A 3.3V 32 Mb NAND flash memory with incremental step pulse programming scheme," IEEE Journal of Solid-State Circuits, vol. 30, no. 11, pp. 1149–1156, November 1995.
[4] D. Haugland, "A bidirectional greedy heuristic for the subspace selection problem," Lecture Notes in Computer Science, vol. 4638, pp. 162–176, August 2007.
[5] A. Jiang, V. Bohossian, and J. Bruck, "Floating codes for joint information storage in write asymmetric memories," in Proc. IEEE Int. Symp. Inform. Theory, June 2007, pp. 1166–1170.
[6] A. Jiang and H. Li, "Optimized cell programming for flash memories," in Proc. IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), August 2009, pp. 914–919.
[7] A. Jiang, H. Li, and J. Bruck, "On the capacity and programming of flash memories," IEEE Trans. Inform. Theory, vol. 58, no. 3, pp. 1549–1564, March 2012.
[8] A. Jiang, R. Mateescu, M. Schwartz, and J. Bruck, "Rank modulation for flash memories," IEEE Trans. Inform. Theory, vol. 55, no. 6, pp. 2659–2673, June 2009.
[9] H. T. Lue, T. H. Hsu, S. Y. Wang, E. K. Lai, K. Y. Hsieh, R. Liu, and C. Y. Lu, "Study of incremental step pulse programming (ISPP) and STI edge effect of BE-SONOS NAND flash," in Proc. IEEE Int. Symp. on Reliability Physics, May 2008, pp. 693–694.
[10] B. K. Natarajan, "Sparse approximate solutions to linear systems," SIAM J. Comput., vol. 24, no. 2, pp. 227–234, April 1995.
[11] R. L. Rivest and A. Shamir, "How to reuse a write-once memory," Inform. and Contr., vol. 55, no. 1-3, pp. 1–19, December 1982.
[12] E. Yaakobi, A. Jiang, P. H. Siegel, A. Vardy, and J. K. Wolf, "On the parallel programming of flash memory cells," in Proc. IEEE Inform. Theory Workshop, Dublin, Ireland, August–September 2010, pp. 1–5.
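As a closing numerical illustration (ours, not part of the paper's analysis), the two single-cell settings can be checked with a short Python sketch: the quadratic of Theorem 2 gives the optimal common step for the no-feedback Gaussian model, and a Monte-Carlo simulation of Algorithm 2 exercises the feedback strategy under the uniform-noise model of Section V. All function names and parameter values below are hypothetical.

```python
# Numerical check of Theorem 2 (no feedback, Gaussian noise) and of
# Algorithm 2 / Theorem 3 (feedback, uniform noise). Illustrative sketch only.
import math
import random

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def success_prob_no_feedback(x, theta, Delta, alpha, sigma, t):
    """P(theta-Delta <= final level <= theta+Delta) for equal steps V_j = x,
    with sigma_eps_j = sigma * V_j as in Section IV."""
    s = sigma * math.sqrt(t) * x          # std of the sum of t rounds
    mean = alpha * t * x
    return phi((theta + Delta - mean) / s) - phi((theta - Delta - mean) / s)

def optimal_step(theta, Delta, alpha, sigma, t):
    """Positive root of 2 ln(b/a) x^2 + 2(b-a)c x + (a^2 - b^2) = 0 (Theorem 2);
    assumes theta > Delta so that a > 0."""
    a = (theta - Delta) / (sigma * math.sqrt(t))
    b = (theta + Delta) / (sigma * math.sqrt(t))
    c = alpha * math.sqrt(t) / sigma
    A, B, C = 2.0 * math.log(b / a), 2.0 * (b - a) * c, a * a - b * b
    disc = math.sqrt(B * B - 4.0 * A * C)
    return max((-B + disc) / (2.0 * A), (-B - disc) / (2.0 * A))

def feedback_success_rate(t, trials=20000, theta=10.0, Delta=0.5,
                          alpha=1.0, d1=0.3, d2=0.3, seed=1):
    """Monte-Carlo estimate for Algorithm 2: on each write aim at theta + Delta,
    i.e. V_j = (theta - level + Delta)/(alpha + d2), eps_j ~ U[-d1*V_j, d2*V_j]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        level = 0.0
        for _ in range(t):
            V = max((theta - level + Delta) / (alpha + d2), 0.0)
            level += alpha * V + rng.uniform(-d1 * V, d2 * V)
        hits += abs(level - theta) <= Delta
    return hits / trials

def theorem3_p1(theta, Delta, alpha, d1, d2):
    """Closed form of P(theta, Delta, 1) from Theorem 3."""
    if (theta - Delta) / (theta + Delta) < (alpha - d1) / (alpha + d2):
        return 1.0
    return (alpha + d2) / (d1 + d2) * 2.0 * Delta / (theta + Delta)
```

For instance, with θ = 10, Δ = 1, α = 1, σ = 0.1 and t = 4, the optimal common step is x ≈ 2.494, slightly below the naive θ/(αt) = 2.5, and yields a success probability of about 0.95. In the feedback setting with θ = 10, Δ = 0.5 and δ1 = δ2 = 0.3, a single round succeeds with probability ≈ 0.206, matching Theorem 3, while four adaptive rounds succeed essentially always.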