Image Processor and RISC MCU Embedded Single Chip Fingerprint Sensor

Journal of Sensor and Actuator Networks

Article Image Processor and RISC MCU Embedded Single Chip Fingerprint Sensor

Seungmin Jung

School of Information Science and Telecommunication, Hanshin University, 137 Hanshindae-gil, Osan-si 18101, Korea; [email protected]

Received: 28 August 2020; Accepted: 30 October 2020; Published: 2 November 2020

Abstract: In this paper, we propose a single chip fingerprint sensor with the algorithm processor and 16-bit MCU. The algorithm processor is a logic circuit that implements the GABOR filter and the THINNING step, which occupies 80% of the fingerprint image processing time. The rest of the algorithm is processed by embedded 16-bit MCU with small circuit volume, so all steps of the algorithm can be processed on the single chip without an external CPU. The capacitive sensing circuit was designed by applying the parasitic-insensitive integrator with the variable clock generator. The function was verified by Cadence Spectre for a 1-pixel sensor scheme and RTL and post simulation for digital blocks synthesized by Synopsys Design Compiler in 180n 2-poly 6-metal CMOS (complementary metal–oxide–semiconductor) process. The layout is done by automatic P&R for the full chip in a 96 96 pixel array. The chip area is 5010 µm 5710 µm (28.61 mm2) and the gate count × × is 2,866,700. The result is compared with a conventional one. The proposed scheme can reduce the processing time by 57%.

Keywords: ﬁngerprint sensor; single chip; GABOR ﬁlter; binarization; thinning; embedded MCU; VLSI

1. Introduction Fingerprint is the most popular biometric method for authentication. Minutiae-based fingerprint recognition algorithms are widely used. Minutiae means the position and angle information of ending and ridge bifurcation. The image captured from the sensor is processed through an algorithm as shown in Figure1. The fingerprint algorithm is divided into a process of registering an existing fingerprint and a step of confirming identity by matching a new fingerprint and a registered fingerprint [1–3]. Both steps go through a complex procedure of applying an algorithm and extracting feature points. In order to extract the feature points, pre-processing, feature point extraction, and matching are required. To date, various methods have been proposed mainly for software-oriented fingerprint image processing. A high-performance processor and processing time are required for such a complex algorithm. In order to perform the matching process on a large number of fingerprint images, the fast speed of image analysis is important [4–6]. One of the ways to get an obvious increase in speed is the hardware implementation of the algorithm [4,7–14]. In general, a sensor and an algorithm are separated in a fingerprint recognition system. There are many advantages if the sensor and algorithm can be operated on a single chip. Currently, the processing speed of the fingerprint recognition sensor applied to the general security system is around 500 ms. In particular, when applied to mobile, a processing speed of 500 ms or less is required. Embedded fingerprint identification systems require high-performance microcontrollers such as 32-bit or 64-bit CPUs. This is expected to reduce the operating speed and implement a small fingerprint identification system through the application of the hardware-oriented method.

J. Sens. Actuator Netw. 2020, 9, 51; doi:10.3390/jsan9040051 www.mdpi.com/journal/jsan J. Sens. Actuator Netw. 2020, 9, 51 2 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 2 of 15

Figure 1. Fingerprint identification identiﬁcation algorithm.

Several attempts have been made in the hardwarehardware development of fingerprintfingerprint recognition algorithms forfor each each step step of the of algorithm the algorithm [7–14]. The [7–14]. papers The deal withpapers the hardwaredeal with implementation the hardware of implementationGabor filters [7–10 of] andGabor thinning filters [[7–10]13,14], and respectively, thinning or [13,14], a combination respectively, of two or stages a combination [11,12], while of two the stagesfinal implementation [11,12], while is limitedthe final to simulationimplementation and FPGA is (Fieldlimited Programmable to simulation Gate and Array) FPGA verification. (Field AlthoughProgrammable VLSI implementationGate Array) verification. has been Although proposed [VLSI9,10], implementation it is only a simulation, has been and proposed there is no [9,10], case itin is which only a the simulation, entire algorithm and there and is MCUno case are in integrated which the on entire single algorithm chip. Since and thereMCU areare diintegratedfferences onin targetsingle device,chip. Since image there size, are process differences conditions, in target etc., device, it is di ffiimagecult tosize, compare process performance conditions, withetc., it old is difficultresearch to results compare [7–14 performance] for the two stageswith old implemented research results in hardware. [7–14] for Therefore, the two stages instead implemented of step-by-step in performancehardware. Therefore, comparison, instead this willof step-by-step be dealt with perf in termsormance of improvedcomparison, processing this will time be anddealt increased with in termschip area of improved when implemented processing with time an and integrated increased single chip VLSI area chip.when implemented with an integrated singleIn VLSI this chip. paper, we propose a single-chip fingerprint sensor structure with a built-in algorithm processorIn this and paper, small we MCU. propose The CPU a single-chip occupancy fingerprint of the algorithm sensor was structure analyzed with using a thebuilt-in ARM algorithm emulator processorand the FPGA and environment.small MCU. TheThe algorithmCPU occupancy processor of handlesthe algorithm only the was binarization analyzed and using thinning the ARM step. Theemulator 16-bit and RISC the MCU FPGA handles environment. the remaining The algorith stepsm that processor are processed handles by only software. the binarization The algorithm and processorthinning step. is designed The 16-bit by applying RISC MCU the Gaborhandles filter the for remaining binarization steps and that the are ZS (Zhang-Suen)processed by algorithmsoftware. Thefor thinning algorithm [15 ].processor The rest is of designed the algorithm by applying is processed the byGabor an embedded filter for 16-bitbinarization MCU withand athe small ZS (Zhang-Suen)circuit volume, algorithm so all steps for of thinning the algorithm [15]. canThe be rest processed of the algorithm on a single is chip processed without by an an external embedded CPU. The16-bit operation MCU with of thea small circuit circuit was verifiedvolume, byso Cadenceall steps Spectreof the algorithm for a 1-pixel can sensorbe processed system on in 0.18a singleµm chipstandard without CMOS an processexternal andCPU. 1.8 The V power operation condition, of the andcircuit the was logic verified synthesis by circuitCadence using Spectre Synopsys for a 1-pixelDesign sensor Compiler system was in performed 0.18 μm standard by RTL andCMOS post process layout and simulation. 1.8 V power The condition, layout is done and the by alogic full synthesiscustom flow circuit for the using sensor Synopsys cell array Design and automatic Compiler P&R was for theperformed entire chip. by RTL and post layout simulation. The layout is done by a full custom flow for the sensor cell array and automatic P&R for 2. Sensor Array Design the entire chip. 2.1. Sensor Design 2. Sensor Array Design As shown in Figure2[ 16–19], a capacitive sensing circuit is introduced. The difference in 2.1.capacitance Sensor Design between a ridge and valley is very small, with a few femtofarads. This circuit is designed As shown in Figure 2 [16–19], a capacitive sensing circuit is introduced. The difference in capacitance between a ridge and valley is very small, with a few femtofarads. This circuit is

J. Sens. Actuator Netw. 2020, 9, 51 3 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 3 of 15 todesigned detect weakto detect capacitances weak capacitances by using aby non-overlapping using a non-overlapping clock to eliminateclock to eliminate the effect the of parasiticeffect of capacitance.parasitic capacitance. The sensor The plate sensor is shielded plate withis shielded a second with metal a tosecond prevent metal noise to fromprevent occurring noise infrom the circuitoccurring under in the sensorcircuit plate,under which the sensor forms plate, a parasitic which capacitance forms a parasitic between capacitance the sensor platebetween and the metalsensor shield. plate and The the parasitic metal shield. capacitance The parasitic Cp1 is very capacitance large compared Cp1 is very to the large finger compared capacitance to the Cf finger and mustcapacitance be removed. Cf and Tomust eliminate be removed. this parasitic To eliminate capacitance, this parasitic we apply capacitance, an output we to apply the lower an output node ofto Cp1the lower to maintain node of the Cp1 same to potentialmaintain betweenthe same the pote twontial nodes between of Cp1, the maximizing two nodes theof Cp1, sensitivity maximizing of the fingerprintthe sensitivity sensor. of the The fingerprint operational sensor. amplifier The actsoperational like a bu ffamplifierer between acts the like top a metalbuffer with between Cp1 andthe top the shieldmetal with metal. Cp1 and the shield metal.

(a) Pixel level detection circuit

(b) Output signal wave of a ridge and valley

Figure 2. Pixel-level Sensor scheme and output voltage.

Figure2 2bb showsshows thethe patternpattern ofof sensorsensor outputoutput signalssignals inin thethe ridgeridge andand valleyvalley ofof thethe fingerprint.fingerprint. Because the capacitance at the ridge is greater than the valley, thethe outputoutput of thethe integratedintegrated signal is didifferent.fferent. The integration signal is decreased much more slowly in the valley. The proposed circuit is resistantresistant toto noisenoise andand isis advantageousadvantageous inin recognizingrecognizing fingerprintsfingerprints with high sensitivity with low power, butbut it it takes takes a longa long time time to integrateto integrate charges. charges. As a As result a result of the of simulation, the simulation, the proposed the proposed sensing circuitsensing requires circuit requires a maximum a maximum of 112 clocks of 112 to clocks charge to the charge full range the full voltage. range It voltage. needs about It needs 0.922 about s to obtain0.922 s ato 96 obtain96 imagea 96 × frame96 image at 10 frame MHZ. at This10 MHZ. is very This long is very time, long but thetime, disadvantage but the disadvantage can be solved can × throughbe solved a through pipelined a chargepipelined integration charge integration method. method.

2.2. Sensor Array Design A pipeline type control circuit is appliedapplied to simultaneouslysimultaneously detect the next pixel before the integration of oneone pixelpixel isis completed.completed. The The pipelined pipelined scan scan driver can reduce image capture timetime dramatically. Figure 3 shows 96 × 96 array fingerprint sensor with the pipelined scan driver

J. Sens. Actuator Netw. 2020, 9, 51 4 of 15

J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 4 of 15 dramatically. Figure3 shows 96 96 array fingerprint sensor with the pipelined scan driver × architecture [20]. [20]. Figure Figure 33bb showsshows thethe timingtiming diagramdiagram ofof thethe pipelinedpipelined scanscan driver.driver. v-ck1v-ck1 toto v-ck8v-ck8 areare column control signals. This This signal signal generates generates or or re resetssets the evaluation signal for the cell. The same clock signal signal is is generated generated every every eight eight cells. cells. Therefor Therefore,e, we we can can expand expand the the depth depth of ofthe the pipeline pipeline scan scan to evaluateto evaluate eight eight cells cells simultaneously. simultaneously. Eight Eight shift shiftring counters ring counters select select the column the column clock generator clock generator every 16every clock 16 intervals. clock intervals. The sensor The cell sensor is designed cell is designed to be evaluated to be evaluated every 16 clocks. every 16Thus, clocks. it is possible Thus, it to is effectivelypossible to ereduceffectively the reduce fingerprint the fingerprint image capture image capturetime without time without damaging damaging the original the original signal. signal. The fingerprintThe fingerprint sensor sensor consists consists of a sensor of a sensor cell array, cell array, shift shiftring counter, ring counter, multiplexer, multiplexer, and decoder. and decoder. The shiftThe shiftring ringcounter counter generates generates a pipeline a pipeline scan scan signal. signal. The Themultiplexer multiplexer selects selects the column the column output output for the for sensorthe sensor cell. cell.The TheADC ADC converts converts the sense the sense signal signal output output to gray-scale. to gray-scale. The Theanalog analog signal signal is fed is to fed the to ADCthe ADC through through an analog an analog multiplexer. multiplexer. The proposed The proposed circuit circuit provides provides three pipe threeline pipeline scan modes. scan modes. v-ck1 generatesv-ck1 generates a reset a signal reset signalfor the for first the sensor first sensor cell, then cell, the then evaluation the evaluation of the offirs thet sensor first sensor cell begins, cell begins, then v-ck2then v-ck2 generates generates a reset a resetsignal signal for the for second the second sensor sensor cell, and cell, the and evaluation the evaluation of the of second the second sensor sensor cell proceedscell proceeds simultaneously. simultaneously. The 9th The sensor 9th sensor cell receives cell receives the reset the signal resetsignal generated generated by v-ck1. by v-ck1.As a result, As a theresult, evaluation the evaluation of the ofsensor the sensor cell is cell performed is performed ever everyy 16 clocks. 16 clocks. As Asshown shown in inFigure Figure 3,3, the sensing signal is integrated to increase the output voltage of the sensor cell linearly.

(a) Variable clock generator

(b) Timing diagram of the pipelined scan driver

FigureFigure 3. 3. PipelinedPipelined scan scan driver driver architecture architecture with with variable variable clock clock generator.

J. Sens. Actuator Netw. 2020, 9, 51 5 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 5 of 15

3.3. Analysis Analysis of of Fingerprint Fingerprint Algorithm Algorithm WeWe built built an an FPGA FPGA environment environment and and implement implementeded a a general-purpose general-purpose ARM7 ARM7 compatible compatible RISC RISC microcontrollermicrocontroller to verifyverify the the correct correct operation operation of theof Minutiae-basedthe Minutiae-based fingerprint fingerprint algorithm. algorithm. The applied The appliedfingerprint fingerprint algorithm algorithm is based onis Jain’sbased references on Jain’s [ 3references,21–23]. Figure [3,21–23].4 shows Figure the FPGA 4 shows emulation the FPGA board for algorithm verification. The sample image is a FVC2002 DB modified to a size of 96 96 pixels emulation board for algorithm verification. The sample image is a FVC2002 DB modified to× a size of 96at × 500 96 dpi.pixels A at point 500 dpi. routine A point is inserted routine in is the insert lowered in right the cornerlower right of the corner board of to the check board each to stepcheck of eachthe identificationstep of the identification algorithm. algorithm. The checkpoint The chec routinekpoint turns routine on one turns of theon one eight of LEDs the eight in succession. LEDs in succession.Finally, we Finally, confirmed we thatconfirmed the embedded that the emb 32-bitedded RISC 32-bit core successfullyRISC core su performedccessfully performed the fingerprint the fingerprintauthentication authentication algorithm. Allalgorithm. results of All the results algorithm of the instructions algorithm are instructions compared withare compared the results ofwith the theARM results emulator, of the theARM software emulator, emulator the soft ofware the ARM emulator processor. of the ARM processor.

FigureFigure 4. 4. FPGAFPGA emulation emulation board board for for verifying verifying the the algorithm. algorithm.

TableTable 1 shows thethe resultsresults ofof thethe ARM ARM emulation. emulation. The The algorithm’s algorithm’s total total CPU CPU cycle cycle was was 4.64 4.64 M M in in96 96 96× 96 sample sample images. images. Gabor Gabor ﬁlters filters and and Thinning Thinning algorithms algorithms occupy occupy about about 80% 80% of the of entirethe entire cycle. × cycle.A description A description of each of stepeach of step the of algorithm the algorith is shownm is shown in Figure in 1Figure. Here, 1. I-cycle Here, meansI-cycle anmeans internal an internalcycle and cycle the and microprocessor the microprocessor does not does request not a requ transferest a because transfer it because executes it an executes internal an instruction. internal instruction.N-cycle means N-cycle non-sequential means non-sequential period, and the period microprocessor, and the microprocessor sends a request tosends an address a request regardless to an addressof the address regardless used of in the previousaddress used cycle. in S-cycle the previo is a sequentialus cycle. period,S-cycle whereis a sequential the microprocessor period, where sends thea request microprocessor to an address sends with a request a word orto halfan wordaddress after with the a previous word or address, half word or to after the samethe previous address. address, or to the same address. Table 1. CPU cycle distribution of the algorithm in 96 96 pixel sample images. × Table 1. CPU cycle distribution of the algorithm in 96 × 96 pixel sample images. Instruction Cycle Process Step Instruction Cycle Process Step S-cycle N-cycle I-cycle Total % S-cycle N-cycle I-cycle Total % Normalization 25,168 6292 3146 34,606 0.75% Normalization 25,168 6292 3146 34,606 0.75% OrientationOrientation Estimation Estimation 176,023 176,023 47,70547,705 47,520 47,520 271,248 271,248 5.84% 5.84% FrequencyFrequency Estimation Estimation 307,424 307,424 88,57788,577 50,706 50,706 446,706 446,706 9.62% 9.62% Gabor FilteringGabor Filtering 1,314,689 1,314,689 329,066 329,066 230,502 230,502 1,874,257 1,874,257 40.37% 40.37% Thinning 983,279 577,092 265,939 1,826,310 39.34% Thinning 983,279 577,092 265,939 1,826,310 39.34% Minutiae Detection 118,453 40,741 17,642 176,836 3.81% MinutiaeMatching Detection Processing 118,453 11,540 40,741 389 17,642 816 12,746 176,836 0.27% 3.81% Matching ProcessingTotal Clock 11,540 2,936,577 3891,089,862 616,271 816 4,642,709 12,746 100% 0.27% Total Clock 2,936,577 1,089,862 616,271 4,642,709 100% 4. Algorithm Processor and MCU Design

4.1. Gabor Filter Design Gabor filters are linear filters used for feature extraction, texture analysis, edge detection, and more. Gabor filters are used in many image processing applications. Gabor filters have frequency

J. Sens. Actuator Netw. 2020, 9, 51 6 of 15

4. Algorithm Processor and MCU Design

4.1. Gabor Filter Design Gabor filters are linear filters used for feature extraction, texture analysis, edge detection, and J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 6 of 15 more. Gabor filters are used in many image processing applications. Gabor filters have frequency and directiondirection information information and and have have optimal optimal coupling coupling resolution resolution in spatial in and spatial frequency and dimensions.frequency Therefore,dimensions. it isTherefore, appropriate it is to appropriate use a Gabor to filter use asa Gabor a band filter pass as filter a band to remove pass filter noise to and remove recover noise the improvedand recover ridge the improved and valley ridge structures and valley [21,22 structures]. Figure 5[21,22]. shows Figure an example 5 shows of an a Gabor example wavelet of a Gabor and 40wavelet filter banksand 40 with filter five banks frequencies with five and frequencies eight directions. and eight directions.

Figure 5. 2D Gabor wavelet and example of 5-frequency, 8-orientation 40 filter bank. Figure 5. 2D Gabor wavelet and example of 5-frequency, 8-orientation 40 filter bank. The input image is divided into non-overlapping window sizes in order to calculate the Gabor filterThe coeffi inputcients. image In this is divided study, the into window non-overlapping size of the Gaborwindow filter sizes kernel in order is 11 to 11.calculate The Gabor the Gabor filter × isfilter calculated coefficients. based In on this the study, angle the (θ )window and frequency(f) size of the of Gabor the ridge filter obtained kernel is through11 × 11. The the blockGabor ridgefilter orientationis calculated and based frequency on the estimation angle (θ) and steps frequency(f) shown in Figure of the1. Gaborridge obtained filters are through two-dimensional the block filters ridge madeorientation of Gaussian and frequency and cosine estimation functions steps and haveshown the in following Figure 1. general Gabor form filters are two-dimensional filters made of Gaussian and cosine functions and have the following general form   2 2   1xθ yθ  h(x, y : θ, f)= exp  +  cos(2πfx ) (1)  22 2 2  θ 1 −xy2 σx σµ h() x,y:θ,f =exp -θθ + cos() 2πfx (1) 22 θ 2σxμ σ xθ=xcosθ + ysinθ (2) y = ycosθ xsinθ θ − where x and y are the pixel positions onxθ the =xcosθ+ysinθ entire image and σx and σy are Gaussian space constants.(2)

Appropriate care should be taken in determining the values of σx and σy. The smaller the value, yθ =ycosθ-xsinθ the less likely it is to generate false ridges and valleys. The higher the value, the stronger the noise, butwhere there x isand a possibilityy are the ofpixel generating positions wrong on the ridges entire and image valleys. and Therefore, σx and theσy noise are Gaussian reduction space effect isconstants. less. In thisAppropriate study, the care mentioned should parametersbe taken in were determining analyzed the and values the values of σx of andσx andσy. Theσy were smaller set tothe 4.0. value, the less likely it is to generate false ridges and valleys. The higher the value, the stronger the noise,Cosine, but sine there and is exponential a possibility functions of generating are required wrong for kernelridges calculation.and valleys. In Therefore, this study, athe look-up noise tablereduction was appliedeffect is to less. enable In this high-speed study, the calculation mentioned for parameters trigonometric were function. analyzed It was and designed the values in 0~2of σπx atand 0.03 σy radianwere set intervals. to 4.0. The exponential function also applied a look-up table rather than applying an IPCosine, requiring sine large and chip exponential area and functions complex operation.are required Pre-analysis for kernel was calculation. performed In tothis predict study, the a exponentiallook-up table function was applied input range. to enable As aresult high-speed of applying calculation the average for trigonometric frequency and orientationfunction. It value was ofdesigned the fingerprint in 0~2 πto atEquation 0.03 radian (2), intervals. the input The range exponential is 80 to 80,function andan also interval applied of a 6.28 look-up is applied. table − Therather functional than applying block ofan the IP Gaborrequiring filter large is shown chip inarea Figure and 6complex. The designed operation. circuit Pre-analysis generates was an addressperformed of memoryto predict that the matched exponential with function 11 vertical inpu imagest range. of theAs blocka result image of applying P(i,j). The the proposed average circuitfrequency transfers and orientation the images value to the of 2D the shift finger registerprint every to Equation 11 clock (2), cycles the andinput then range performs is −80 to a left 80, shiftand operation.an interval Theof 6.28 designed is applied. circuit The performs functional convolution block of bythe applying Gabor filter a shift is registershown in and Figure Gabor 6. filter. The Ifdesigned the convolution circuit generates result is positive, an address P(i,j) of means memory valley that and matched if it is negative,with 11 vertical P(i,j) means images ridge. of the The block final outputimage P(i,j). image The is binary proposed format. circuit transfers the images to the 2D shift register every 11 clock cycles and then performs a left shift operation. The designed circuit performs convolution by applying a shift register and Gabor filter. If the convolution result is positive, P(i,j) means valley and if it is negative, P(i,j) means ridge. The final output image is binary format.

J. Sens. Actuator Actuator Netw. 2020, 9, 51x FOR PEER REVIEW 7 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 7 of 15

Figure 6. Gabor filter convolution using 11 × 11 shift register. Figure 6. Gabor ﬁlter convolution using 11 11 shift register. Figure 6. Gabor filter convolution using 11 × 11 shift register. 4.2. Thinning Algorithm Design 4.2. Thinning Algorithm Design The thinning algorithm has long been used for pattern recognition and image analysis to extract The thinning algorithm has long been used for pattern recognition and image analysis to extract feature points from images.images. Reduce the thick image to a binary digital pattern to obtain a unit width feature points from images. Reduce the thick image to a binary digital pattern to obtain a unit width skeleton that retains its unique topology and geomet geometricric properties. properties. The The thinning thinning process process is to make skeleton that retains its unique topology and geometric properties. The thinning process is to make the ridge a single line in thethe binarizedbinarized image obtained by the Gabor ﬁlter.filter. The best proven thinningthinning the ridge a single line in the binarized image obtained by the Gabor filter. The best proven thinning algorithm is Zhang and Suen (ZS)(ZS) inin severalseveral algorithmsalgorithms [[6,13,14].6,13,14]. The ZS algorithm performs two algorithm is Zhang and Suen (ZS) in several algorithms [6,13,14]. The ZS algorithm performs two iteration steps in in a a 3 3 × 33 mask mask window. window. P(n) P(n) and and P1 P1 to to8 are 8 are the the binary binary image image values values in Figure in Figure 7a.7 Ina. iteration steps in a 3 ×× 3 mask window. P(n) and P1 to 8 are the binary image values in Figure 7a. In Inthis this study, study, an an image image value value of of 0 0means means white white and and 1 1 means means black. black. Thinning Thinning is is only only done when the this study, an image value of 0 means white and 1 means black. Thinning is only done when the center image Pi is black. center image Pi is black.

(a) (b) (a) (b) Figure 7. pixel order and sample of N(Pi). (a) 3 3 winow index (b) number of black pixel neighbor P(i). Figure 7. pixel order and sample of N(Pi). (a) 3 × 3 winow index (b) number of black pixel neighbor P(i). Figure 7. pixel order and sample of N(Pi). (a) 3 × 3 winow index (b) number of black pixel neighbor P(i). In the firstfirst sub iteration, the contour point Pi is deleted from the digital pattern if the following conditionsIn the firstare met:sub iteration,In In the first first the step, contour the following point Pi is conditions deleted from are are theevaluated evaluated digital patternfor for the the central centralif the following point point Pi. Pi. conditionsWhen satisfied,satisfied, are met: thethe midpointmidpointIn the first PiPi step, turnsturns the white.white. following conditions are evaluated for the central point Pi. When satisfied, the midpoint Pi turns white. (i) P(i)P(i) is is a a value value between between 2 2 and and 6; 6; (i) P(i) is a value between 2 and 6; (ii) AA (Pi) (Pi) == 1;1;

(ii)(iii) AAt (Pi) least = 1;one of P2 and P6 and P8 is 0; (iii) At least one of P2 and P6 and P8 is 0; (iii)(iv) AtAt leastleast oneone ofof P2P4 andand P6P6 andand P8P8 isis 0;0. (iv)(iv) AtAt least least one one of of P4 P4 and and P6 P6 and and P8 P8 is is 0. 0. A(Pi) is the number of 01 patterns in Pi’s eight neighbors P1, P3, P4, …, P8 (Figure 8), and B(Pi) is A(Pi) is the number of 01 patterns in Pi’s eight neighbors P1, P3, P4, ... , P8 (Figure8), and B(Pi) is A(Pi)the number is the numberof nonzero of 01neighbors patterns of in P1, Pi’s that eight is neighbors P1, P3, P4, …, P8 (Figure 8), and B(Pi) is thethe number number of of nonzero nonzero neighbors of P1, that is ∑ 𝑃𝑛 B(Pi) = (3) B(Pi) = X∑8 𝑃𝑛 (3) B(Pi) = Pn (3) N=1

J. Sens. Actuator Netw. 2020, 9, 51 8 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 8 of 15

J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 8 of 15

Figure 8. NumberNumber of black to white change around P(i). Figure 8. Number of black to white change around P(i). As in the example inin FigureFigure8 ,8, A A (Pi) (Pi) is is 2 2 or or 3. 3. Since Since it it is is not not 1, 1, the the Pi Pi is is not not deleted. deleted. Where Where B(Pi) B(Pi) is isthe the number numberAs in the of of 1s example 1s in in the the neighboring in neighboring Figure 8, A pixels (Pi)pixels is in 2in Figure or Figure 3. Since7b. 7b. it is not 1, the Pi is not deleted. Where B(Pi) is theIn number the secondsecond of 1s stepstep in the onlyonly neighboring conditionsconditions pixels (iii)iii) and and in Figureiv) (iv) are are 7b.changed. changed. The The image image of of the the result result of of the first ﬁrst step is In executed the second again step underunde onlyr the conditions following iii) conditions. and iv) are changed. The image of the result of the first step is executed again under the following conditions. (v) (v)At At least least one one of P2of P2 and and P4 P4 and and P8 P8 is 0; is 0;

(vi) (vi)At(v) At leastAt least least one one one of P2of of P2 and P2 and and P4 P4 andP4 and and P6 P6 isP8 0. is is 0. 0; (vi) At least one of P2 and P4 and P6 is 0. In this paper, we proposed the operation structure shown in Figure 9 for efficient operation of In this paper, we proposed the operation structure shown in Figure9 for e ﬃcient operation of the ZSIn algorithm. this paper, The we designedproposed circuitthe operation generates structure an address shown of in memory Figure 9 thatfor efficient matched operation with three of the ZS algorithm. The designed circuit generates an address of memory that matched with three verticalthe ZS images.algorithm. That The is, designedevery three circuit clocks, generates an image an addressof three ofpixels memory is transferred that matched to the with 3 × three3 2D vertical images. That is, every three clocks, an image of three pixels is transferred to the 3 3 2D shift shiftvertical register images. and Thatsimultaneously is, every three shifted clocks, to anthe image left of of the three window. pixels isAs transferred shown in toFigure the× 3 8, × A(Pi)3 2D register and simultaneously shifted to the left of the window. As shown in Figure8, A(Pi) means the meansshift registerthe number and simultaneouslyof 1 to 0 (1 → shifted0) patterns to the in lefteight-neighbor of the window. pixels. As Inshown order in to Figure determine 8, A(Pi) the number of 1 to 0 (1 0) patterns in eight-neighbor pixels. In order to determine the coincidence of coincidencemeans the ofnumber two →consecutive of 1 to 0 (1 P(n), → 0) P(n patterns + 1) and in eight-neighbor 2 bit binary “10”, pixels. the In ci rcuitorder uses to determine exclusive-OR the two consecutive P(n), P(n + 1) and 2 bit binary “10”, the circuit uses exclusive-OR operation and then operationcoincidence and of then two not-OR.consecutive Add P(n), up allP(n the + 1)values. and 2 Inbit the binary case “10”, of this the operation, circuit uses it exclusive-ORis possible to not-OR. Add up all the values. In the case of this operation, it is possible to obtain a result faster than obtainoperation a result and fasterthen not-OR. than the Add previous up all theSW values.operation In thebecause case ofit thisis possible operation, for itone is possibleclock. The to the previous SW operation because it is possible for one clock. The following Verilog-HDL code is followingobtain a Verilog-HDLresult faster thancode the is appliedprevious to SW calculate operation the becauseA[Pi] so it that is possible the value for can one be clock. obtained The applied to calculate the A[Pi] so that the value can be obtained within one clock by simple operation. withinfollowing one clockVerilog-HDL by simple code operation. is applied to calculate the A[Pi] so that the value can be obtained within one clock by simple operation. A(Pi)A(Pi)<= <=~| ({P1,P2}~|({P1,P2} ˆ 2 ^0b10) 2′b10)+ ~+| ({~|({ P2,P3} P2,P3} ˆ 2 0^b10) 2′b10)+ ~ +|({ ~|({ P3,P4} P3,P4} ˆ 2 0^b10) 2′b10)+ + A(Pi) <= ~|({P1,P2} ^ 2′b10) + ~|({ P2,P3} ^ 2′b10) + ~|({ P3,P4} ^ 2′b10) + ~~|({|({ P4,P5} P4,P5} ˆ ^ 2 02b10)′b10)+ +~ ~|({|({ P5,P6} P5,P6} ˆ 2^0 2b10)′b10)+ +~ ~|({|({ P6,P7} P6,P7} ˆ 2^0 b10)2′b10)+ +~ ~|({|({ P7,P8} P7,P8} ˆ 2^0 b10)2′b10)+ ~|({ P4,P5} ^ 2′b10) + ~|({ P5,P6} ^ 2′b10) + ~|({ P6,P7} ^ 2′b10) + ~|({ P7,P8} ^ 2′b10) +~ ~|({|({ P8,P1} P8,P1} ˆ ^ 2 02b10)′b10) ; ; + ~|({ P8,P1} ^ 2′b10) ;

Figure 9. Thinning operation using 3 × 33 shift shift register. register. Figure 9. Thinning operation using 3 × 3 shift register. It is possible to obtain a thinning result of Pi image in three clocks with simple circuit. The 24-bit ItIt is is possible possible to to obtain obtain a a thinningthinning resultresult ofof PiPi imageimage inin threethree clocks with simple circuit.circuit. TheThe 24-bitimage readimage from read memory from memory by three clocksby three is stored clocks in is the stored shift register in the byshift the register ‘done’ signal, by the and ‘done’ 3 3 signal, values 24-bit image read from memory by three clocks is stored in the shift register by the ‘done’× signal, andand 3 thinning 3× ×3 3values values operation and and thinning thinning is done, operation operation as shown is is indone, done, Figure asas 9 shownshown. inin FigureFigure 9.

J. Sens. Actuator Netw. 2020, 9, 51 9 of 15 J.J. Sens.Sens. ActuatorActuator Netw.Netw. 20202020,, 99,, xx FORFOR PEERPEER REVIEWREVIEW 9 9 ofof 1515

4.3. Algorithm Processor Design 4.3. Algorithm Processor Design Figure 10 shows the integrated logic of proposed algorithm processor. The yellow box is Gabor Figure 10 shows the integrated logic of proposed algorithm processor. The yellow box is Gabor filter block and blue box is thinning block. The address generator, counter and memory-A are filterfilter blockblock andand blue blue box box is thinningis thinning block. block. The The address address generator, generator, counter counter and memory-A and memory-A are shared. are shared. The logic was designed in RTL and synthesized by Synopsys Design Compiler using a 180n Theshared. logic The was logic designed was designed in RTL andin RTL synthesized and synthesi by Synopsyszed by Synopsys Design CompilerDesign Compiler using a using 180n CMOSa 180n CMOS process. The Figure 11 shows the logic circuit simulation. Simultaneous operation is process.CMOS process. The Figure The 11 showsFigure the 11 logic shows circuit the simulation. logic circuit Simultaneous simulation. operation Simultaneous is impossible operation because is impossible because thinning must be processed with the image obtained as a result of Gabor filter. thinningimpossible must because be processed thinning with must the be image processed obtained with as the a resultimage of obtained Gabor filter. as a result It took of 266,439 Gabor cyclesfilter. It took 266,439 cycles for Gabor filter operation and 55,303 cycles for thinning operation. The total forIt took Gabor 266,439 filter operationcycles for andGabor 55,303 filter cycles operation for thinning and 55,303 operation. cycles Thefor thinning total required operation. cycle isThe 321,742. total required cycle is 321,742. The result shows a valid operation of proposed logic circuit. The layout is Therequired result cycle shows is a321,742. valid operation The result of shows proposed a valid logic operation circuit. The of layoutproposed is performed logic circuit. by autoThe layout P&R for is 2 performed by auto P&R for the algorithm processor. The area is 0.962 mm 2 and gate count is 96,393. theperformed algorithm by processor.auto P&R for The the area algorithm is 0.962 mmprocessor.2 and gate The count area is is 0.962 96,393. mm and gate count is 96,393.

Figure 10. Functional block diagram of algorithm processor. Figure 10. Functional block diagram of algorithm processor.

Figure 11. Algorithm processor simulation results. FigureFigure 11.11. AlgorithmAlgorithm processorprocessor simulationsimulation results.results. 4.4. 16-Bit Risc MCU Design 4.4.4.4. 16-Bit16-Bit RiscRisc MCUMCU DesignDesign The 16-bit Reduced Instruction Set Computing (RISC) processor is designed using the Verilog The 16-bit Reduced Instruction Set Computing (RISC) processor is designed using the Verilog harwareThe description16-bit Reduced language Instruction (HDL), Set as Computing shown in Figure (RISC) 12 processor. The processor is designed handles using the the reset Verilog of the harwareharware descriptiondescription languagelanguage (HDL),(HDL), asas shownshown inin FigureFigure 12.12. TheThe processorprocessor handleshandles thethe resetreset ofof thethe algorithm,algorithm, exceptexcept forfor GaborGabor andand thinning.thinning. TheThe processorprocessor hashas aa four-stagefour-stage pipeline.pipeline. TheThe appliedapplied

J. Sens. Actuator Netw. 2020, 9, 51 10 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 10 of 15 embeddedalgorithm, exceptMCU is for based Gabor on and Harvard thinning. architecture The processor with separate has a four-stage data memory pipeline. and The instruction applied memoryembedded to avoid MCU bottlenecks is based on and Harvard enable architecturepipelines. Memory with separate Access Instructions data memory are andLoad instruction and Store. Datamemory Processing to avoid has bottlenecks eight instruct andions enable including pipelines. arithmetic, Memory logical Access and transfer Instructions operations. are Load Control and FlowStore. Instructions Data Processing are a branch has eight on instructionsEqual, branch including on not Equal arithmetic, and Jump. logical The and basic transfer architecture operations. of a RISCControl processor Flow Instructions is as given are below. a branch It includes on Equal, eight branch general-purpose on not Equal and registers Jump. Thethat basic can architecturestore 16-bit dataof a RISCand processora 16-bit ALU is as that given can below. perform It includes logical eight and general-purposearithmetic operations. registers The that flag can register store 16-bit can checkdata and parity, a 16-bit carry, ALU and that zero can status. perform Addressing logical and Mo arithmeticdes are Immediate, operations. Register, The flag registerRegister can Indirect check parity,and PC carry, relative. and The zero address status. Addressingfor conditional Modes jump are instructions Immediate, is Register, calculated Register by adding Indirect the and 16-bit PC sign-extensionrelative. The address of an eight-bit for conditional signed jump offset instructions to the program is calculated counter. by The adding architecture the 16-bit was sign-extension verified by Verilogof an eight-bit with signedRTL and off setthen to thesynthesized program counter.at the TheGate architecture level using was 180n verified CMOS by Verilogdesign withkit. RTLThe synthesizedand then synthesized gate-level at MCU the Gate first level verified using basic 180n co CMOSmmand design operation kit. The in synthesizedthe FPGA environment. gate-level MCU In thefirst emulation verified basic environment, command operationthe number in the of FPGAcycles environment.required for Inthe the rest emulation of the algorithm environment, except the Gabornumber filter of cycles and thinning required forwas the analyzed. rest of the After algorithm the layout except was Gabor performed filter and with thinning the Auto was P&R analyzed. tool, Afterthe back-annotation the layout was verification performed withresult the was Auto confirme P&R tool,d through the back-annotation post simulation. verification The layout result area was is 2 54,483confirmed μm through2 and gate post count simulation. is 5460. The The layout operatin area isg 54,483 speedµ mwasand maximum gate count 50 is 5460.MHz. The It operatinghas been confirmedspeed was maximumthat the embedded 50 MHz. It16-bit has beenRISC confirmed core succe thatssfully the performs embedded steps 16-bit excluding RISC core thinning successfully and performsGB steps in steps the excludingfingerprint thinning authentication and GB algorithm steps in the step. fingerprint authentication algorithm step.

Figure 12. FunctionalFunctional block diagram of proposed 16-bit RISC MCU.

5. Full Full Chip Chip Integration Figure 13 shows the functional block diagram and Figure 14 is floor planning of 96 96 fingerprint Figure 13 shows the functional block diagram and Figure 14 is floor planning× of 96 × 96 fingerprintsensor chip. sensor Figure chip. 15 shows Figure the 15 logicshows simulation the logic simulation results of theresults single-chip of the single-chip architecture. architecture. Figure 16 shows the chip layout is 5010 µm 5710 µm at 0.18 µm standard CMOS process. Since the sensor is Figure 16 shows the chip layout is× 5010 μm × 5710 μm at 0.18 μm standard CMOS process. Since the sensoran analog is an circuit, analog the circuit, design the and design layout and are layout performed are performed in a full custom in a full method, custom and method, the entire and chip the entireis performed chip is throughperformed automatic through placementautomatic andplacem routingent and in arouting cell-based in a design cell-based method. design The method. circuit Thecontains circuit two contains compiled two static compiled memories, static an memories, input/output an input/output pad, and an analogpad, and block an containinganalog block an ADC.containing Figure an 17 ADC. shows Figure the layout 17 shows of the the proposed layout single-chipof the proposed fingerprint single-c sensorhip fingerprint scheme. The sensor chip 2 scheme.area is 28.61 The mmchip areaand is gate 28.61 count mm is2 and 2,866,700 gate count with is memory. 2,866,700 To with verify memory. the behavior To verify of the the proposed behavior processor,of the proposed we extracted processor, each we parasitic extracted capacitance each para fromsitic the capacitance layout and from performed the layout post-simulation and performed with post-simulation with 180n CMOS conditions, as shown in Figure 15. This shows that the proposed method can effectively handle the fingerprint algorithm. The conventional fingerprint sensor does

J. Sens. Actuator Netw. 2020, 9, 51 11 of 15

180n CMOS conditions, as shown in Figure 15. This shows that the proposed method can effectively J.J. Sens.Sens. ActuatorActuator Netw.Netw. 20202020,, 99,, xx FORFOR PEERPEER REVIEWREVIEW 11 11 ofof 1515 handle the fingerprint algorithm. The conventional fingerprint sensor does not include an algorithm 2 processornotnot includeinclude and anan MCU. algorithmalgorithm The area processorprocessor of an algorithm andand MCU.MCU. processor TheThe areaarea and ofof MCUanan algorithmalgorithm is 1.47 mm processorprocessor. Therefore, andand MCUMCU the area isis 1.47 of1.47 a 2 conventionalmmmm22.. Therefore,Therefore, fingerprint thethe areaarea sensor ofof aa conventiconventi is 27.14onalonal mm fingerprintfingerprint. This means sensorsensor that isis the 27.1427.14 chip mmmm area22.. This increaseThis meansmeans in the thatthat proposed thethe chipchip circuitareaarea increaseincrease is just 5.4%. inin thethe proposedproposed circuitcircuit isis justjust 5.4%.5.4%.

FigureFigure 13.13. LogicLogic diagram diagram of of fingerprintﬁngerprintfingerprint sensorsensor chip.chip.

FigureFigure 14.14. FloorplanFloorplan of of ﬁngerprintfingerprintfingerprint sensorsensor chip.chip.

J. Sens. Actuator Actuator Netw. Netw. 2020, 9, x 51 FOR PEER REVIEW 12 of 15 J. Sens. Actuator Netw. 2020, 9, x FOR PEER REVIEW 12 of 15

FigureFigure 15. 15. Post-simulation Post-simulation result result of of full full chip. chip.

Figure 16. Layout of proposed architecture (5010 × 5710 mm2 @180n 2-poly 6-metal CMOS process). Figure 16. LayoutLayout of of proposed architecture (5010 × 57105710 mmµm22 @180n 2-poly 6-metal CMOS process). × Table2 compares the CPU occupancy of the conventional architecture and the proposed architecture. SETP-A means the algorithm before the GABOR ﬁlter, and STEP-B means the algorithm after THINNING. Figure 17 shows Table2 as a graph, and shows that the proposed architecture shows 57% performance improvement in CPU cycle occupancy. Table3 is a performance comparison considering the burden of increasing the chip area. Table3 shows that even with a small area increase of 5.4% compared to the conventional chip, the proposed architecture can achieve 57% improvement in algorithm processing. It is a future work to analyze the improvement in power consumption through power analysis of the conventional circuit in the future. If the proposed architecture is embedded

J. Sens. Actuator Netw. 2020, 9, 51 13 of 15 in the authentication fingerprint sensor system, it is expected to reduce the CPU burden and greatly improveJ. Sens. Actuator the processingNetw. 2020, 9, speedx FOR PEER without REVIEW significantly affecting the chip size. 13 of 15

Figure 17.17. Graph of CPU cycle distribution.

Table 2. Comparison of CPU cycle distribution. Table 2 compares the CPU occupancy of the conventional architecture and the proposed architecture. SETP-A meansSTEP-A the algorithm Gabor before Thinning the GABOR STEP-B filter, and STEP-B Total means the algorithm afterConventional THINNING.752,560 Figure 17 1,874,257 shows Table 1,826,310 2 as a graph, 189,582 and shows 4,642,709 that the proposed architecture shows 57% performance improvement in CPU cycle occupancy. Table 3 is a Proposed 1,346,420 266,439 55,303 330,477 1,998,639 performance comparison considering the burden of increasing the chip area. Table 3 shows that even with a small area increase of 5.4% compared to the conventional chip, the proposed Table 3. Performance comparison between the proposed and previous method. (@180n CMOS process, architecture can achieve 57% improvement in algorithm processing. It is a future work to analyze 96 96 pixel array). the improvement× in power consumption through power analysis of the conventional circuit in the Conventional Proposed Architecture future. If the proposed architecture is embedded in the authenticationImprovement fingerprint (%) Remarks sensor system, it Architecture (SW only) ((SW+HW+MCU)) is expected to reduce the CPU burden and greatly improve the processing speed without Total CPU Cycle 4,642,709 1,998,639 57 significantlyChip Area (mm2) affecting with 2 SRAM the27.14 chip size. 28.61 5.4 96 96 pixels area full chip − × Power Consumption - 5.25 mW - 20 MHz, 1.8 V Table 2. Comparison of CPU cycle distribution. 6. Conclusions STEP-A Gabor Thinning STEP-B Total Conventional 752,560 1,874,257 1,826,310 189,582 4,642,709 This paper proposes a novel fingerprint sensor architecture that processes a binarization and Proposed 1,346,420 266,439 55,303 330,477 1,998,639 thinning step, which occupy 80% of the algorithm processing time, in hardware, and the remaining steps, which is processed by embedded 16-bit RISC MCU. The CPU occupancy of the algorithm was Table 3. Performance comparison between the proposed and previous method. (@180n CMOS analyzed using the ARM emulator and the FPGA environment. The algorithm processor was designed process, 96 × 96 pixel array). by applying the Gabor filter for a binarization and the ZS algorithm for a thinning. The rest of the Conventional algorithm is processed by an embedded 16-bitProposed MCU Architecture with small circuitImprovement volume, so all steps of the Architecture (SW Remarks algorithm can be processed on a single chip without((SW+HW+MCU)) an external CPU. In(%) this paper, we implemented a only) 96 96 array high-sensitivity fingerprint sensor. In the sensor circuit of the pixel unit, an insensitive Total× CPU Cycle 4,642,709 1998639 57 charge transfer integrator is applied that can effectively reduce the effect of parasitic capacitance. Chip Area (mm2) 96 96 pixels 27.14 28.61 −5.4 Thewith proposed2 SRAM circuit in the paper uses an active output voltage feedback integrator. Thearea whole full circuit chip wasPower verified by RTL simulation of digital blocks synthesized in a 180n two-poly six-metal standard process. The layout is- performed by auto P 5.25& R mW for the full chip with 96 - 96 pixel array. 20 TheMHz, area 1.8 ofV Consumption × our chip is 5010 5710 µm (28.61 mm2) and the gate count is 2,866,700. The result is compared with a × conventional6. Conclusions one. The proposed scheme can reduce the algorithm processing time by 57% in total algorithm. If the designed architecture is adopted in the fingerprint AFIS, it is expected to reduce This paper proposes a novel fingerprint sensor architecture that processes a binarization and thinning step, which occupy 80% of the algorithm processing time, in hardware, and the remaining steps, which is processed by embedded 16-bit RISC MCU. The CPU occupancy of the algorithm was analyzed using the ARM emulator and the FPGA environment. The algorithm processor was designed by applying the Gabor filter for a binarization and the ZS algorithm for a thinning. The

J. Sens. Actuator Netw. 2020, 9, 51 14 of 15 the CPU burden and greatly improve the processing speed without significantly affecting the chip size. Although not shown in the results of this paper, it is expected to reduce power consumption by minimizing the operation of the CPU implemented in a large-scale circuit. The addition of circuits for part of the algorithm leads to an increase in the chip area, but the proposed architecture has a relatively small burden on the chip area in terms of speed improvement. The proposed architecture improves the algorithm speed by up to 57%, so we can replace the traditional 64-bit or 32-bit CPU with a 16-bit or 8-bit CPU, which will enable low power consumption and the implementation of small fingerprint systems.

Funding: This research received no external funding Acknowledgments: This work was supported by Hanshin University Research Grant. Conﬂicts of Interest: The authors declare no conﬂict of interest.

References

1. Chugh, T.; Cao, K.; Jain, A.K. Fingerprint spoof buster: Use of minutiae-centered patches. IEEE Trans. Inf. Forensics Secur. 2018, 13, 2190–2202. [CrossRef] 2. Tolosana, R.; Gomez-Barrero, M.; Busch, C.; Ortega-Garcia, J. Biometric presentation attack detection: Beyond the visible spectrum. IEEE Trans. Inf. Forensics Secur. 2019, 15, 1261–1275. [CrossRef] 3. Jung, S.-M.; Nam, J.-M.; Yang, D.-H.; Lee, M.-K. A CMOS integrated capacitive fingerprint sensor with 32-bit RISC microcontroller. IEEE J. Solid-State Circuits 2005, 40, 1745–1750. [CrossRef] 4. Alibeigi, E.; Samavi, S.; Shirani, S.; Rahmani, Z. Real time ridge orientation estimation for fingerprint images. Computer Vision and Pattern Recognition. arXiv 2017, arXiv:1710.05027. 5. Gayathri, S.; Sridhar, V. Design and simulation of Gabor filter using verilog HDL. Int. J. Latest Trends Eng. Technol. 2013, 2, 77–83. 6. Chen, W.; Sui, L.; Xu, Z.; Lang, Y. Improved Zhang-Suen thinning algorithm in binary line drawing applications. In Proceedings of the 2012 International Conference on Systems and Informatics (ICSAI2012), Yantai, China, 19–20 May 2012; pp. 1947–1950. 7. Voß, N.; Mertsching, B. Design and Implementation of an Accelerated Gabor Filter Bank Using Parallel Hardware. In International Conference on Field Programmable Logic and Applications; Springer: Berlin/Heidelberg, Germany, 2001; pp. 451–460. 8. Spinei, A.; Pellelin, D.; Fernandes, D.; Heraults, J. Fast Hardware Implementation of Gabor Filter Based Motion Estimation. Integr. Comput. Eng. 2000, 7, 67–77. [CrossRef] 9. Painkras, E.; Charoensak, C. A VLSI architecture for Gabor filtering in face processing applications. In Proceedings of the 2005 International Symposium on Intelligent Signal Processing and Communication Systems, Hong Kong, China, 13–16 December 2005. 10. Jothi, S.A.; Dhatchayani, K.; Devi, G.G.R.; Raja, M.R. VLSI Implementation of Gabor Filter in Image Detection–Research Direction. Int. J. Circuit Theory Appl. 2017, 10, 133–138. 11. Kheiri, F.; Samavi, S.; Karimi, N. Hardware design for binarization and thinning of fingerprint images. arXiv 2017, arXiv:1710.05749. 12. Das, R.K.; De, A.; Pal, C.; Chakrabarti, A. DSP hardware design for fingerprint binarization and thinning on FPGA. In Proceedings of the 2014 International Conference on Control, Instrumentation, Energy and Communication (CIEC), Calcutta, India, 31 January–2 February 2014. 13. Wakil, Y.; Tariq, S.G.; Humayun, A.; Abbas, N. An FPGA based Minutiae Extraction System for Fingerprint Recognition. Int. J. Comput. Appl. 2015, 111, 31–35. [CrossRef] 14. Hermanto, L.; Sudiro, S.A.; Wibowo, E.P. Hardware implementation of fingerprint image thinning algorithm in FPGA device. In Proceedings of the 2010 International Conference on Networking and Information Technology, Manila, Philippines, 11–12 June 2010. 15. SUPREMA. Available online: https://www.supremainc.com/embedded-modules/en/main.asp (accessed on 28 August 2020). 16. Jung, S. Implementation of 144 64 Pixel Array Bezel-Less CMOS Fingerprint Sensor. Int. J. Smart Sens. × Intell. Syst. 2018, 11, 1–5. [CrossRef] J. Sens. Actuator Netw. 2020, 9, 51 15 of 15

17. Jung, S. A Modified Architecture for Fingerprint Sensor of Switched Capacitive Integrator Scheme. Int. J. Bio-Sci. Bio-Technol. 2016, 8, 139–144. [CrossRef] 18. Yeo, H. Analysis and Performance Comparison of Charge Transfer Circuits for Capacitive Sensing. J. Next Gener. Inf. Technol. 2014, 5, 16–26. 19. Hyeopgoo, Y. A New Fingerprint Sensor based on Signal Integration Scheme using Charge Transfer Circuit. Int. J. Bio-Sci. Bio-Technol. 2015, 7, 29–38. [CrossRef] 20. Yeo, H. Touch fingerprint sensor based on sensor cell isolation technique with pseudo direct signalling. Int. J. Smart Sens. Intell. Syst. 2019, 12, 1–9. 21. Hong, L.; Wan, Y.; Jain, A. Fingerprint Image Enhancement: Algorithm and Performance Evaluation. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 777–789. [CrossRef] 22. Molaei, S.; Abadi, M.E.S.A. Maintaining filter structure: A Gabor-based convolutional neural network for image analysis. Appl. Soft Comput. 2020.[CrossRef] 23. Jung, S. Design of low power and high speed CMOS fingerprint sensor Design of low power and high speed CMOS fingerprint sensor. Int. J. Bio-Sci. Bio-Technol. 2013, 5, 1–16.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional aﬃliations.

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).