BINARY TEXT IMAGE FILE PREPROCESSING TO ACCOUNT FOR PRINTER DOT GAIN∗

Lu Zhanga, Alex Veisb, Robert Ulichneyc, Jan Allebacha

aSchool of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907-2035, U.S.A.
bHewlett-Packard Scitex, Ltd., Netanya 42505, Israel
cHewlett-Packard Laboratories USA, Cambridge, MA 02142, U.S.A.

ABSTRACT

Dot gain is a classic problem in digital printing that causes printed text to appear darker than desired. For printing of text, we propose a method to preprocess the image sent to the printer in order to compensate for dot gain. It is based on an accurate model that predicts the printed absorptance for a given local neighborhood in the digital image, a cost function to penalize lack of fidelity to the desired target text image, and the use of direct binary search (DBS) to minimize the cost.

Index Terms— text print quality, rendering algorithm, dot gain, raggedness, printer characterization

Fig. 1. Illustration of increased stroke thickness caused by dot gain. The blue area indicates the dots added by the printer. The red area indicates dots that were deleted.

1. INTRODUCTION

Dot gain, in which the colorant effectively covers a larger area on the printed media than that corresponding to the digital image sent to the printer, is a common problem observed with digital printing systems [1, 2]. It affects the printing of text and graphics, as well as halftones. With positive-contrast (black on white) text, the character glyphs will appear darker and larger than intended. In addition, the shape of the characters may be distorted. These effects are especially prominent with small point sizes and complex typefaces that have serifs. Figure 1 provides an illustration. With negative-contrast text (white on black), the effects are reversed. The nature and source of the differences between the digital and printed images will vary depending on the marking technology – be it electrophotography or inkjet – and the specific characteristics of the print mechanism.

In this paper, we will specifically consider laser, electrophotographic printers. With electrophotography, the deposition of toner within the area of a given printer-addressable pixel is strongly influenced not just by the image value at that pixel, but also by the image values of the immediately neighboring pixels [3, 4].

The general strategy for dealing with dot gain is to precompensate for it by reducing the stroke width of the glyphs or the size of the dot-clusters in the digital image that is sent to the print engine. However, to do this in the most effective manner requires three fundamental components: (1) an accurate model for the printed absorptance that results when a given digital image is sent to the print engine; (2) a cost function that appropriately weights the difference between the model prediction of the printed image and the desired target image; and (3) a search strategy for finding the precompensated digital image to send to the printer that minimizes the cost function.

Tabular models have been shown to be effective for predicting the printed absorptance value for halftone images [4–8]; and similar models have been used to solve the inverse halftoning problem [9, 10]. In this paper, we will propose (in Secs. 2 and 3) a tabular model that is novel in three aspects: (1) to our knowledge, it is the first tabular model applied to prediction of printed text character glyphs; (2) it is a 3-stage model that can more accurately capture a wider range of local binary configurations in the digital image sent to the print engine (a 2-stage model was proposed in [8]); (3) it is a high-definition model that predicts the printed output at a higher resolution than the binary digital image that is sent to the print engine.

Our cost function (Sec. 4) penalizes the mean-absolute difference between the model-predicted printed text image and the target text image generated with an antialiasing method. It also incorporates a term to penalize raggedness in the edges of the character glyphs. Finally, we use the direct binary search (DBS) algorithm to find a locally optimum precompensated binary image. DBS has proven to be an effective tool in a variety of halftoning applications [3, 6, 7, 11–16]. To our knowledge, this is its first application to the optimization of printed text images.

∗Research partially supported by the Hewlett-Packard Company.

978-1-4799-5751-4/14/$31.00 ©2014 IEEE 2639 ICIP 2014

2. FRAMEWORK FOR ONLINE TEXT FILE PREPROCESSING

In this section, we present our proposed framework for online text file preprocessing. It is illustrated in Fig. 2. Within this framework, we must first generate the target text image and the initial binary image. We address this question in Sec. 2.1. Starting with the initial binary image, the search scans through the binary preprocessed image in raster order. At each pixel of the binary text image, it considers toggling its state. The high-definition printer predictor is then used to predict the printed binary preprocessed image at a higher resolution. We introduce this printer predictor in Sec. 2.2, and present its offline training process in Sec. 3. If the trial change reduces the cost measure φ, DBS keeps that change. We address this cost measure in Sec. 4. One iteration of the algorithm consists of a visit to every pixel in the 600 dpi binary image. When no changes are retained throughout an entire iteration, the algorithm has converged. This typically requires 8 or so iterations.

Fig. 2. Block diagram for online text preprocessing. We use [m, n] for discrete spatial coordinates at the printer resolution and [M, N] for discrete spatial coordinates at the target or scanner resolution. The function b0[m, n] denotes the initial binary image, which is the original input to the printer.

2.1. Generating the Target Image and the Initial Binary Image

In this paper, we have chosen a particular 600 dpi laser, electrophotographic printer as our target output device. As the first step toward the search process, we generate the initial 600 dpi binary image by converting the original text file, a (vector) PDF file in our experiments, to an 1800 dpi anti-aliased text image, and then block averaging and thresholding [17] it to yield a 600 dpi binary image. Since the high-definition printer model predicts the output at 1800 dpi, the corresponding ideal printed image, which we call the target image, is also generated at 1800 dpi using the pointwise rescaled anti-aliased image.

2.2. 3-level Window-Based High-Definition Printer Output Predictor

The predictor has a three-level structure, where each level is defined by a LUT Π^(i), i = 1, 2, 3. Figure 3 demonstrates how the first level of the predictor works for a 6 point Times New Roman letter "s". We assume that the printed absorptance of each 3 × 3 block of pixels in the 1800 dpi printer output is determined by the state of the corresponding pixel and its neighbors in the 600 dpi binary image that is sent to the printer. As shown in Fig. 3(b), we estimate this absorptance by first observing its 5 × 5 neighborhood. We assign the center pixel in this 5 × 5 neighborhood of the 600 dpi image (and hence all the pixels to be predicted in the corresponding 3 × 3 neighborhood of the 1800 dpi image) a single configuration code 0 ≤ c^(1) ≤ 2^25 − 1, calculated according to

    c^(1) = Σ_{i=0}^{24} b_i · 2^i,    (1)

where the superscript (1) in c^(1) refers to the first-level LUT, and b_i denotes the binary value of the i-th pixel in the 5 × 5 window indicated in Fig. 3(b). Then, we look up the predicted printed absorptance values of the 9 corresponding 1800 dpi subpixels in the trained look-up table denoted by Π^(1).

Fig. 3. Illustration of the first-level high-definition printer predictor. (a) 600 dpi binary image of letter "s". (b) The 5 × 5 neighborhood with pixel indices indicated (top), and the corresponding prediction of the 3 × 3 block of 1800 dpi pixels indicated by the red box (bottom). (c) Predicted 1800 dpi printer output image of the letter "s". (d) Actual printed image of the letter "s" scanned at 1800 dpi.

Let Ω^(1) denote the set of configuration codes c^(1) that occur in the first-level LUT Π^(1). Ω^(1) will contain every configuration code that was encountered during the offline training phase.
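The configuration code of Eq. (1), the reduced codes used by the lower-level LUTs of Sec. 2.2, and the three-level lookup can be sketched in Python. This is a minimal illustration, not the authors' implementation: the dict-based LUTs, the row-major assignment of window pixels to b_0, ..., b_24, and all function names are our own assumptions.

```python
import numpy as np

def config_codes(b, m, n):
    """Configuration codes for the 600 dpi binary pixel (m, n).

    b is a 2D 0/1 array (the binary image sent to the printer),
    assumed padded so the 5x5 window stays inside the array.
    c1 encodes the full 5x5 neighborhood (Eq. 1), c2 the inner 3x3
    window, and c3 the number of occupied pixels in that 3x3 window.
    Pixels are assigned to b_0..b_24 in row-major order here; the
    paper's index assignment (Fig. 3(b)) may differ.
    """
    w5 = b[m - 2:m + 3, n - 2:n + 3].flatten()   # 25 pixels
    w3 = b[m - 1:m + 2, n - 1:n + 2].flatten()   # 9 pixels
    c1 = int(np.dot(w5, 2 ** np.arange(25)))     # c^(1) = sum b_i 2^i
    c2 = int(np.dot(w3, 2 ** np.arange(9)))      # c^(2)
    c3 = int(w3.sum())                           # c^(3) in 0..9
    return c1, c2, c3

def predict_block(b, m, n, lut1, lut2, lut3):
    """Predicted absorptances of the 3x3 block of 1800 dpi subpixels
    for 600 dpi pixel (m, n), with the fallback rule of Eq. (4).
    lut1..lut3 map a configuration code to 9 subpixel absorptances;
    lut3 has an entry for every possible occupancy count 0..9.
    """
    c1, c2, c3 = config_codes(b, m, n)
    if c1 in lut1:
        return lut1[c1]
    if c2 in lut2:
        return lut2[c2]
    return lut3[c3]
```

During offline training (Sec. 3), the same codes would serve as keys when averaging the scanned absorptance values into the three tables.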

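The raster-scan toggle search of Sec. 2 can be sketched as follows. This is a schematic illustration with a generic cost callable (all names are ours); a practical implementation would evaluate each trial toggle incrementally, since a toggle only changes the predicted output in a small neighborhood.

```python
import numpy as np

def dbs_preprocess(b0, cost):
    """Direct binary search over a binary image.

    b0   : initial 2D 0/1 array (the 600 dpi binary text image)
    cost : callable mapping a binary image to a scalar cost phi,
           e.g. a predictor-based cost in the spirit of Eq. (6)

    Scans the image in raster order; a trial toggle is kept only if
    it lowers the cost. The search has converged when a complete
    iteration retains no changes.
    """
    b = b0.copy()
    phi = cost(b)
    iterations = 0
    changed = True
    while changed:
        changed = False
        iterations += 1
        for m in range(b.shape[0]):
            for n in range(b.shape[1]):
                b[m, n] ^= 1          # trial toggle
                trial = cost(b)
                if trial < phi:       # keep the improving change
                    phi = trial
                    changed = True
                else:                 # revert the toggle
                    b[m, n] ^= 1
    return b, phi, iterations
```

With a toy cost that simply counts disagreements with a target image, the loop reproduces the target exactly; the paper's cost instead compares the predictor output against the 1800 dpi target using the edge and raggedness terms of Sec. 4.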
It may happen that some configuration codes c^(1) will be observed during the online prediction phase that were not seen during the training process, i.e., c^(1) ∉ Ω^(1). In this case, we use the second-level LUT, denoted by Π^(2), which is based on the dot configurations in the 3 × 3 neighborhood of the center pixel shown in the top half of Fig. 3(b), which contains the pixels b_0, ..., b_8. Let Ω^(2) denote the set of configuration codes c^(2) that occur in Π^(2). For those configurations in the 5 × 5 neighborhood of the binary text image shown in the top half of Fig. 3(b) for which c^(1) ∉ Ω^(1) and c^(2) ∉ Ω^(2), the third-level LUT Π^(3), which is based on the number of occupied pixels in the 3 × 3 neighborhood of the center pixel, is used. The third-level LUT contains an entry for each possible value of c^(3). A summary is given in (2)–(4):

    c^(2) = Σ_{i=0}^{8} b_i · 2^i ;    c^(3) = Σ_{i=0}^{8} b_i .    (2)

    c = (c^(1), c^(2), c^(3)).    (3)

    P(c, s) = Π^(1)(c^(1), s)   if c^(1) ∈ Ω^(1);
    P(c, s) = Π^(2)(c^(2), s)   if c^(1) ∉ Ω^(1) and c^(2) ∈ Ω^(2);
    P(c, s) = Π^(3)(c^(3), s)   if c^(1) ∉ Ω^(1) and c^(2) ∉ Ω^(2).    (4)

Here, we denote by s = 0, ..., 8 the indices of the 9 subpixels in the 3 × 3 block of the 1800 dpi predicted printer output shown in the bottom half of Fig. 3(b).

3. OFFLINE TRAINING OF THE PRINTER PREDICTOR

Figure 4 shows the major steps of how we establish the relationship between the binary image sent to the printer and the average absorptance of the region in the scanned printed image corresponding to each single printer-addressable pixel. Space limitations preclude us from providing a detailed description of the training procedure in this paper. We print pages of text generated with the Forensic Monkey Text Generator [18] that also include a grid of fiducial marks to facilitate alignment of each 3 × 3 block of pixels in the 1800 dpi scanned page with its corresponding 600 dpi pixel in the preprocessed binary image sent to the printer. For each such 600 dpi pixel in the preprocessed binary image, we examine its 5 × 5 neighborhood to determine the three configuration codes described in Sec. 2, and based on them, store the scanned absorptance values as entries in the three LUTs. The stored values are averages over the measured absorptance for all occurrences of the given configuration codes.

Fig. 4. Procedure for establishing the printer predictor.

4. COST FUNCTION

There are two basic requirements on the cost φ. First, it has to be able to capture the difference between the target image and the printed image, assuming the printer predictor gives a fairly accurate output. So we use the sum of absolute differences (SAD) between the target and the predicted printed image, with an emphasis on the edges, as the first term Θ of our cost function:

    Θ = Σ_M Σ_N | g_printer[M, N] − g_target[M, N] | · (1 + γ · h[M, N]).    (5)

Here, g_printer[M, N] and g_target[M, N] denote the predicted printed and the target images, respectively, and h[M, N] denotes the edge mask, with value 1 on the edge pixels and 0 otherwise. We address the edge mask in Sec. 4.1. The parameter γ is an adjustable coefficient that determines how much we penalize the errors on edges. Second, since a DBS-based search method combined with the local smoothing effect of the printer output predictor will cause the edges of characters to look ragged, the cost φ needs to have the ability to penalize the raggedness. This is the objective of the second term of our cost function. Since there is a trade-off between our first and second terms, the cost function has the form

    φ_k = α · Θ(k)/Θ(0) + (1 − α) · Γ(k)/Γ(0).    (6)

Here, k indicates the k-th trial change during the search; 0 denotes the initial state, used for normalization; α is an adjustable coefficient that controls the trade-off; and Γ is the raggedness measure, discussed in Sec. 4.2.

4.1. Edge Mask

The edge mask h[M, N] indicates the area in the target or printed images where observers might tolerate the errors less. We take the anti-aliased image and set to 0 the inner pixels whose 5 × 5 neighborhood is solid black, i.e., all pixels in the neighborhood have value 255 (absorptance). Then all pixels are set to 1 except those that are 0, to yield the final edge mask.

4.2. Edge Raggedness Metric

Traditionally, the raggedness metric is calculated for straight lines only, as the standard deviation of the residuals of the actual T60 contour to the fitted line [19]. We propose a new edge raggedness metric that is calculated by fitting a parabola or a cubic locally to each pixel along the unit-width line edge map of the character stroke and computing the distance from the actual edge pixel to the fitted line.

Figure 5 shows how we measure the local raggedness.

We fit a polynomial to a window of 7 pixels (3 on either side of the center pixel), then find the normal to the fitted polynomial at the center pixel, and finally compute the distance from the actual edge pixel to the fitted curve. We compute the raggedness for the whole image by averaging over the local raggedness of all edge pixels.

Fig. 5. Raggedness metric calculation. (a) A printed image. (b) The corresponding line edge map generated using the Canny edge detector [20]. (c) An example of parabola curve fitting for an edge pixel.

5. EXPERIMENTAL RESULTS AND DISCUSSION

We experimented with text pages including all letters of the alphabet in sizes ranging from 3 to 30 points and two different typefaces. The final printer predictor model was trained using 10 different pages of size 6600 × 5100 printer pixels. Nine of them were DBS-processed pages, which gave a more complete predictor. Π^(1) contains 64289 entries and Π^(2) contains 436 entries.

Table 1. Percentage Use of the 3-Level LUTs of the Printer Predictor for Different Text Images

    Model-based processed text pages | % use of Π^(1) | % use of Π^(2) | % use of Π^(3)
    6 point Times New Roman          | 96.94          | 2.04           | 0.0013
    8 point Times New Roman          | 97.96          | 1.02           | 0
    5 point Arial                    | 96.96          | 2.01           | 0

Table 2. Evaluation of Different Scanned Printed Processed Images

    Scanned and printed 6 point Times New Roman       | Stroke width (%) | Raggedness (scanner pixels)
    Printed original binary image                     | 134.23           | 0.180
    Printed simple processed binary image             | 96.96            | 0.175
    Printed processed binary image using our approach | 107.59           | 0.181

Fig. 6. Comparison between original text image, simply processed, and model-based processed images. Blue highlights the added dots and red highlights the dots missing from the target image. (a) Original binary text image in 6 point Times New Roman. (b,c) Scanned printed original binary text image. (d) Simple low-pass filtered binary text image. (e,f) Scanned printed version of (d). (g) Processed binary text image using our framework. (h,i) Scanned and printed version of (g). (j) Generated target image.

The stroke width is the thickness of the stroke of a letter. It is calculated as the average percentage coverage ratio of letters in the printed, scanned, and thresholded [17] text image to those in the target image. Its ideal value is 100.
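The local raggedness measure of Sec. 4.2 (a parabola fitted to a 7-pixel window of the edge map, followed by a point-to-curve distance at the center pixel) can be sketched as follows. The coordinate-window representation and function names are our own, and the normal distance is approximated by the fit residual scaled by the local slope factor, under the assumption that the edge is locally a function y(x).

```python
import numpy as np

def local_raggedness(xs, ys):
    """Distance from the center pixel of a 7-pixel edge window to a
    parabola fitted through the window.

    xs, ys : length-7 coordinates of consecutive edge pixels ordered
             along the stroke; the center pixel is index 3.
    """
    a, b, c = np.polyfit(xs, ys, 2)        # y = a x^2 + b x + c
    x0, y0 = xs[3], ys[3]
    y_fit = a * x0 ** 2 + b * x0 + c
    slope = 2.0 * a * x0 + b               # y'(x0)
    # Perpendicular distance to the curve, approximated by the
    # residual divided by sqrt(1 + y'(x0)^2).
    return abs(y0 - y_fit) / np.hypot(1.0, slope)

def image_raggedness(edge_windows):
    """Average the local raggedness over all edge pixels."""
    return float(np.mean([local_raggedness(xs, ys)
                          for xs, ys in edge_windows]))
```

A perfectly straight edge yields zero raggedness, since the parabola fit is exact; any local deviation of the center pixel from its neighbors produces a positive distance.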
Table 1 shows the percentage use of all 3 levels of the printer predictor. For most pixels, the first-level predictor is used, which means that the training is sufficient. It also shows that DBS-processed pages with characters in smaller sizes or more complicated typefaces, for instance those with serifs, have more rarely seen dot configurations. This is consistent with our intuition.

Figure 6 compares the rendering results of a clipped portion of a processed binary image with text of 6 point Times New Roman for three different methods: (1) the original text image, (2) preprocessing the binary text image with a simple method, and (3) using our framework with the empirically chosen values α = 0.8 and γ = 1.25. The simple method is based on filtering the original binary text image with a 2 × 2 moving average filter, followed by a thresholding step [17].

Table 2 shows that our approach significantly reduced the stroke width and increased the raggedness only slightly. Although the simple process with the low-pass filter gives a value closer to the ideal in stroke width, it can be seen in Fig. 6(c) that a few characters are broken. So our approach gives more consistent and desirable results.

6. CONCLUSION

In this paper, we have presented a rendering algorithm to preprocess binary text pages to compensate for dot gain. We have shown that we can significantly reduce the thickness of character strokes while maintaining the quality of the character shape and the edge raggedness.

7. REFERENCES

[1] S. Gooran and B. Kruse, "Near-optimal model-based halftoning technique with dot gain," in Very High Resolution and Quality Imaging III, San Jose, CA, January 1998, vol. 3308 of SPIE, pp. 114–122.

[2] M. Coudray, "Understanding the behavior of midtone dot gain," Screen Printing, vol. 97, no. 12, pp. 24–29, 2007.

[3] D. Kacker, T. Camis, and J. Allebach, "Electrophotographic process embedded in direct binary search," IEEE Transactions on Image Processing, vol. 11, pp. 243–257, March 2002.

[4] Y. Ju, R. Ulichney, and J. Allebach, "Black-box models for laser electrophotographic printers - recent progress," in NIP Digital Fabrication Conference, 2013, pp. 66–71, Society for Imaging Science and Technology.

[5] T. Pappas, D. Neuhoff, and C. Dong, "Measurement of printer parameters for model-based halftoning," Journal of Electronic Imaging, vol. 2, no. 3, pp. 193–204, July 1993.

[6] F. Baqai and J. Allebach, "Halftoning via direct binary search using analytical and stochastic printer models," IEEE Transactions on Image Processing, vol. 12, no. 1, pp. 1–15, January 2003.

[7] P. Goyal, M. Gupta, C. Staelin, M. Fischer, O. Shacham, T. Kashti, and J. Allebach, "Electro-photographic model based stochastic clustered-dot halftoning with direct binary search," in Proceedings of ICIP 2011 IEEE International Conference on Image Processing, September 2011, pp. 1721–1724.

[8] L. Wang, D. Abramsohn, T. Ives, M. Shaw, and J. Allebach, "Estimating toner usage with laser electrophotographic printers," in Color Imaging XVIII: Displaying, Processing, Hardcopy, and Applications, San Francisco, CA, 2013, vol. 8652 of SPIE, pp. 3–7.

[9] M. Mese and P. Vaidyanathan, "Look-up table method for inverse halftoning," IEEE Transactions on Image Processing, vol. 10, no. 10, pp. 1566–1578, 2001.

[10] M. Mese and P. Vaidyanathan, "Optimized halftoning using dot diffusion and methods for inverse halftoning," IEEE Transactions on Image Processing, vol. 9, no. 4, pp. 691–709, 2000.

[11] D. Lieberman and J. Allebach, "A dual interpretation for direct binary search and its implications for tone reproduction and texture quality," IEEE Transactions on Image Processing, vol. 9, no. 11, pp. 1950–1963, November 2000.

[12] J. Lee and J. Allebach, "Inkjet printer model-based halftoning," IEEE Transactions on Image Processing, vol. 14, pp. 674–689, May 2005.

[13] C. Lee and J. Allebach, "The hybrid screen - improving the breed," IEEE Transactions on Image Processing, vol. 19, no. 2, pp. 435–450, February 2010.

[14] K. Chandu, M. Stanich, C. Wu, and B. Trager, "Direct multi-bit search screen algorithm," in Proceedings of ICIP 2012 IEEE International Conference on Image Processing, Orlando, FL, September 2012, IEEE, pp. 817–820.

[15] X. Zhang, A. Veis, R. Ulichney, and J. Allebach, "Multi-level halftone screen design: Keeping texture or keeping smoothness?," in Proceedings of ICIP 2012 IEEE International Conference on Image Processing, Orlando, FL, September 2012, IEEE, pp. 829–832.

[16] C. Tang, A. Veis, R. Ulichney, and J. Allebach, "Irregular clustered-dot periodic halftone screen design," in Color Imaging XIX: Displaying, Processing, Hardcopy, and Applications, R. Eschbach, G. Marcu, and A. Rizzi, Eds., San Francisco, CA, February 2014, vol. 9015 of SPIE.

[17] N. Otsu, "A threshold selection method from gray-level histograms," Automatica, vol. 11, no. 285-296, pp. 23–27, 1975.

[18] A. Mikkilineni, G. Ali, P. Chiang, G. Chiu, J. Allebach, and E. Delp, "Signature-embedding in printed documents for security and forensic applications," in Proc. SPIE: Security, Steganography, and Watermarking of Multimedia Contents VI, San Jose, CA, January 2004, vol. 5306 of SPIE, pp. 455–466.

[19] "Standard ISO/IEC 13660: Information technology - Office equipment - Measurement of image quality attributes for hardcopy output - Binary text and graphic images," ISO/IEC 13660:2000(E).

[20] J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, pp. 679–698, November 1986.
