sensors

Article Monitoring by Using an Adaptive 1D Convolutional Neural Network

Chih-Yung Huang * and Zaky Dzulfikri

Department of Mechanical Engineering, National Chin-Yi University of Technology, Taichung 41170, Taiwan; [email protected] * Correspondence: [email protected]; Tel.: +886-4-23924505 (ext. 6703)

Abstract: Stamping is one of the most widely used processes in the sheet industry. Because of the increasing demand for a faster process, ensuring that the stamping process is conducted without compromising quality is crucial. The tool used in the stamping process is crucial to the efficiency of the process; therefore, effective monitoring of the tool health condition is essential for detecting stamping defects. In this study, vibration measurement was used to monitor the stamping process and tool health. A system was developed for capturing signals in the stamping process, and each stamping cycle was selected through template matching. A one-dimensional (1D) convolutional neural network (CNN) was developed to classify the tool wear condition. The results revealed that the 1D CNN architecture a yielded a high accuracy (>99%) and fast adaptability among different models.

Keywords: stamping process; vibration; spectrum density; one-dimensional convolutional neural network; classification

1. Introduction   Stamping remains one of the most widely used processes in sheet metalworking and is the primary process in the manufacture of many products, such as automotive products, Citation: Huang, C.-Y.; Dzulfikri, Z. electronics, and medical devices. To ensure the consistent quality of the manufactured Stamping Monitoring by Using an products, monitoring of the tools used in the stamping process is crucial. Therefore, Adaptive 1D Convolutional Neural Network. Sensors 2021, 21, 262. monitoring data are required to clearly understand factors affecting the conditions of https://doi.org/10.3390/s21010262 stamping tools [1]. Several studies have focused on monitoring the condition of tools that are used in Received: 18 November 2020 processes such as drilling, turning, and milling; nevertheless, studies that have Accepted: 29 December 2020 focused on monitoring the conditions of tools used in stamping processes are few, particu- Published: 2 January 2021 larly those involving data acquisition and data recognition or classification. Several studies have proposed the use of force, strain, acoustic emission, vibration, and audio indicators for Publisher’s Note: MDPI stays neu- monitoring the stamping process. However, the analyses executed in these studies did not tral with regard to jurisdictional clai- cover all monitoring processes; instead, such analyses were mostly centered on recognition, ms in published maps and institutio- classification, or regression processes. Wu et al. [2] proposed using vibration signals to nal affiliations. monitor the micropiercing process and logistic regression to predict damage. Sari et al. [3] used acoustic emission to monitor the micropiercing process online. Shanbhag et al. [4] investigated the galling wear of a stamping tool by evaluating acoustic emission frequency characteristics. Ge et al. [5] modeled monitoring signals at different periods in the stamping Copyright: © 2021 by the authors. Li- censee MDPI, Basel, Switzerland. process by using several autoregressive models with residue as a feature, followed by This article is an open access article classification by using the Hidden Markov Model. Tian et al. [6] used machine vision to distributed under the terms and con- detect tool surface defects engendered by the stamping and grinding of flat parts. ditions of the Creative Commons At- The strain signal is the most commonly used monitoring signal since it is proportional tribution (CC BY) license (https:// to the stamping force. However, strain sensors can identify only certain faults and cannot creativecommons.org/licenses/by/ be applied to detect dynamic characteristics of the stamping operation, especially in the 4.0/). high frequency band [7]. On the other hand, AE and vibration force sensors are suitable

Sensors 2021, 21, 262. https://doi.org/10.3390/s21010262 https://www.mdpi.com/journal/sensors Sensors 2021, 21, 262 2 of 21

for detection of the changes in the process and wear. Nevertheless, they are difficult to install and expensive. In terms of vibration sensors, they are easy to acquire and contain ample information about the process dynamics. Although problem of low signal and noise ratio was mentioned before, we applied PCB356A15 accelerometers with sensitivities of 100 mV/g in the present study to solve this issue and clear stamping signals were observed. Vibration signals provide information crucial in determining dynamic behaviors at high frequencies in the stamping process [7]. Zhang et al. [8] used piezoelectric strain sensors to measure the strain on the press column surface through feature selection to monitor and detect failures in the process. Sari et al. [9] demonstrated the relationship between increasing the amplitude within a certain frequency range and the appearance of burrs and onset of damage that leads to punch failure. Continuous use of the tool can cause various problems, such as wear, variation in the quality of the product, and damage to the machine. Cutting tools usually exhibit adhesive and abrasive wear at contact zones [10]. Zhang et al. [11] used the bispectrum to analyze acceleration signals during the online stamping operation, which helped detect tool failure or tool wear. Vibration signal data captured during the stamping process exhibit different patterns, which can be used to monitor tool health. Some studies have used various pattern matching methods [12–15] to process vibration signals; the present study implemented these methods for analysis. A template is required as a basis for capturing the vibration signals because different parameters could affect the shape of the signal pattern, resulting in patterns with varying shapes. Zhao et al. [12] demonstrated the applicability of many deep learning methods to machine monitoring. The success of classical machine learning techniques depends on efficient feature extraction, which requires considerable resources and time. Recently developed techniques in deep learning have enabled automatic feature extraction without the need for an expert, thereby simplifying the final solution while obtaining high accuracy. Recent studies on one-dimensional (1D) signals, especially vibration signals, have used convolutional neural networks (CNNs) [16–21] for classification. On the basis of the findings and limitations indicated by the aforementioned studies, the present study focused on developing a system for capturing vibration signals and a 1D CNN model for classifying such signals to achieve end-to-end monitoring of stamping tool wear. This study’s contributions are detailed as follows. • The study developed of a system for processing vibration signals by using a template matching algorithm in order to identify tool wear conditions. • The study applied a deep 1D CNN along with a fully connected neural network (FCNN) for feature extraction to classify tool wear conditions. The optimal configura- tion was determined through the comparison of 1D CNN and FCNN configurations. • The study applied real data from a mass production workpiece to evaluate the devel- oped methods and algorithms. • The proposed method not only enables the reliable and real-time monitoring of stamping tools through processes involving data acquisition and data classification; the method also minimizes computational time throughout these processes. The ex- perimental results demonstrated the applicability of our method to actual machines, with the method exhibiting high signal acquisition rates and excellent classification efficiency. The remainder of the paper is organized as follows. In Section2, the proposed methods and data acquisition process are presented. In Section3, the signal acquisition rate and tool wear classification performance are described. Finally, Section4 presents the conclusion of this study.

2. Materials and Methods The flow of the proposed system’s operation is shown in Figure1. Raw data obtained from sensors at specific time intervals are first preprocessed and then divided into two sig- nal domains, the time-based domain, and frequency-based domain. After data processing, Sensors 2021, 21, x SensorsFOR PEER 2021 REVIEW, 21, x FOR PEER REVIEW 3 of 23 3 of 23

2. Materials and2. MaterialsMethods and Methods Sensors 2021, 21, 262 The flow of theThe proposed flow of thesystem’s proposed operat system’sion is shown operat inion Figure is shown 1. Raw in Figure data ob- 1. Raw data3 ofob- 21 tained from sensorstained at from specific sensors time at interval specifics aretime first interval preprocesseds are first andpreprocessed then divided and into then divided into two signal domains,two signal the time-baseddomains, the domain, time-based and frequency-baseddomain, and frequency-based domain. After domain. data After data processing, template matching is used to compare the processed data with the template processing, templatetemplate matching matching is used is used to tocompare compare the the processed processed data data with with the the template template data. If the data. If the required conditions are satisfied, the signal data are selected and then recog- data. If the requiredrequired conditions conditions are are satisfied, satisfied, the the signal signal data data are are selected selected and and then then recog- recognized. nized. nized.

Figure 1. OverallFigureFigure system 1.1. operation.Overall system operation.operation.

2.1. Experimental2.1. Data Experimental Data 2.1. Experimental Data The experimentalThe dataexperimental used in thisdata stud usedy werein this obtained study were from obtained an actual from working an actual working stamping process.stamping TheSpecifically, experimentalprocess. progressive Specifically, data used stamping progressive in this tools study stamping were were placed obtainedtools into were froma placedrecipro- anactual into a workingrecipro- stamping process. Specifically, progressive stamping tools were placed into a reciprocating cating stampingcating machine, stamping and anmachine, automatic and shaneet automatic feeder sheet fed metal the feedermetal sheetfed the to metalbe sheet to be stamping machine, and an automatic feeder fed the metal sheet to be stamped stamped into thestamped stamping into tool, the stamping which presse tool,d which the metal presse sheetd the to metalthe intended sheet to shape. the intended shape. into the stamping tool, which pressed the metal sheet to the intended shape. Stamping involvesStamping various involves processes, various such processes, as blanking, such piercing, as blanking, , piercing, , drawing, forming, Stamping involves various processes, such as blanking, piercing, drawing, forming, and bending, thatand arebending, executed that usingare executed distinct usingtools. distinctIn this study, tools. Inthe this piercing study, process, the piercing process, and bending, that are executed using distinct tools. In this study, the piercing process, which tends towhich yield tendsthe most to yield tool wear,the most was tool monitored wear, was at severalmonitored positions at several where positions the where the which tends to yield the most tool wear, was monitored at several positions where the tool tool comes in contacttool comes with in the contact material. with Figure the material. 2 presents Figure the 2wear presents monitoring the wear points monitoring in- points in- comes in contact with the material. Figure2 presents the wear monitoring points inside a side a progressiveside toola progressive used for piercing. tool used Figu for repiercing. 3 depicts Figu there finished 3 depicts product, the finished with product,the with the progressive tool used for piercing. Figure3 depicts the finished product, with the numbered numbered boxesnumbered indicating boxes the indicatinglocation at thewh ichlocation the tool at wh dieich was the applied. tool The was stamping applied. The stamping boxes indicating the location at which the tool die was applied. The stamping tool, metal tool, metal sheettool, workpiece metal sheet and workpiece stamping andmachine stamping used machinein this study used arein thisillustrated study arein illustrated in sheet workpiece and stamping machine used in this study are illustrated in Figure4. Figure 4. Figure 4.

Figure 2. Tool monitoringFigure 2. Tool positions. monitoring positions. Figure 2. Tool monitoring positions. After manually inspecting six sets of vibration data from the two sensors, we decided to use the y-axis vibration data from the lower accelerometer because this accelerometer produced distinctive stamping vibration signals and the least data noise. We observed two stages of wear (mild and heavy) at each tool position (Table1). Detailed conditions are shown in AppendixA. Table2 presents the classification of the experimental data regarding the various tool conditions, indicating seven class types; only one class represented the healthy condition, three represented the heavy wear condition, and the remaining three represented the mild wear condition. This table also indicates the number of samples and average stamping time for each class.

SensorsSensors 20212021,, 2121,, xx FORFOR PEERPEER REVIEWREVIEW 44 ofof 2323 Sensors Sensors 2021 2021, ,21 21, ,x x FOR FOR PEER PEER REVIEW REVIEW 44 of of 23 23

Sensors 2021, 21, 262 4 of 21 Sensors 2021, 21, x FOR PEER REVIEW 4 of 23

((aa)) ((bb)) ((aa)) ((bb)) FigureFigure 3.3. ImagesImages ofof thethe finishedfinished productproduct fromfrom thethe ((aa)) toptop andand ((bb)) rightright side;side; thethe piercingpiercing processesprocesses FigureA,FigureA, B,B, andand 3.3. Images CImagesC areare markedmarked ofof thethe finished finishedbyby thethe nunu productproductmbersmbers 1, from1,from 2,2, andand thethe 3, 3, ((a a respectively.)respectively.) toptop andand ((bb)) rightright side;side; thethe piercingpiercing processes processes A,A, B, B, andand CC areare markedmarked byby thethe nunumbersmbers 1,1, 2,2, and and 3,3, respectively.respectively. AfterAfter manuallymanually inspectinginspecting sixsix setssets ofof vibrvibrationation datadata fromfrom thethe twotwo sensors,sensors, wewe de-de- cidedcidedAfterAfter toto useuse manuallymanually thethe y-axisy-axis inspectinginspecting vibrationvibration sixsix datadata setssets fromfrom ofof vibrvibr thetheationation lowerlower datadata accelerometeraccelerometer fromfrom thethe twotwo becausebecause sensors,sensors, thisthis wewe accel-accel- de-de- cidederometercidederometer toto use use producedproduced thethe y-axisy-axis distinctivedistinctive vibrationvibration stampingstamping datadata fromfrom vibrvibr thetheationation lowerlower signalssignals accelerometeraccelerometer andand thethe leastleast becausebecause datadata thisnoise.thisnoise. accel-accel- WeWe erometerobservederometerobserved producedtwoproducedtwo stagesstages distinctivedistinctive ofof wearwear (mild(mild stampingstamping andand heavy)heavy) vibrvibrationation atat each eachsignalssignals tooltool andand positionposition thethe leastleast (Table(Table datadata 1).1). noise.noise. DetailedDetailed WeWe observedconditionsobservedconditions twotwo areare stages stagesshownshown ofof in in wearwear AppendixAppendix (mild(mild and andA.A. Table Tableheavy)heavy) 22 atpresentspresentsat eacheach tool tool thethe position positionclassificationclassification (Table(Table ofof 1). 1). thethe DetailedDetailed experi-experi- mental data regarding the various tool conditions, indicating seven class types; only one conditionsmentalconditions data areare regarding shownshown inthein Appendix Appendixvarious tool A.A. condit TableTable ions, 22 presentspresents indicating thethe classificationclassificationseven class types; ofof thethe only experi-experi- one mentalclassmentalclass representedrepresented datadata regardingregarding thethe healthyhealthy thethe variousvarious condition,condition, tooltool condit condit threethree ions, representedions,represented indicatingindicating thethe seven heavysevenheavy class class wearwear types; types; condition,condition, onlyonly andoneandone (a) (b) classtheclassthe remainingremaining representedrepresented threethree thethe represented healthyrepresentedhealthy condition,condition, thethe mildmild threethree wearwear representedrepresented condition.condition. thethe ThisThis heavyheavy tabletable wearwear alsoalso condition,condition, indicatesindicates and andthethe FigureFigure 3. 3.thenumberthenumberImages Images remainingremaining of of ofof the the samplessamples finished finished threethree and and productrepresented representedproduct averageaverage from from stamping thestamping the the (a (mild)milda top) top andtime timewear wearand (b for(for) b condition. rightcondition.) righteacheach side; side; class.class. the ThistheThis piercing piercing tabletable processes also alsoprocesses indicatesindicates A, thethe B,A, and B, and Cnumber arenumber C markedare marked ofof bysamplessamples theby numbersthe and andnumbers averageaverage 1, 2, 1, and 2, stampingandstamping 3, respectively. 3, respectively. timetime forfor eacheach class.class. TableTable 1.1. ToolTool wearwear figurefigure andand namename tagging.tagging. After manuallyTableTable inspecting 1.1. ToolTool wear wear six figurefigure sets andofand vibr namenameation tagging.tagging. data from the two sensors, we de- Identification NumberTable Process 1. Tool Location wear figure and name Mild tagging. Wear Heavy Wear Identificationcided Numbe tor useProcess the y-axis Location vibration data Mildfrom Wea ther lower accelerometerHeavy because Wear this accel- Identification NumberIdentificationIdentificationerometer Numbe Numbe Processrr producedProcessProcess Location Location Locationdistinctive stamping Mild Mild vibr WearWeaWeaationrr signals and theHeavyHeavy Heavyleast WeardataWear Wear noise. We observed two stages of wear (mild and heavy) at each tool position (Table 1). Detailed conditions are shown in Appendix A. Table 2 presents the classification of the experi- mental data regarding the various tool conditions, indicating seven class types; only one class represented the healthy condition, three represented the heavy wear condition, and 1. Piercing Position A 1. the remainingPiercing three Position represented A the mild wear condition. This table also indicates the 1. Piercing Position A 1.1. number Piercing of samplesPiercing Position Positionand A average A stamping time for each class.

Table 1. Tool wear figure and name tagging.

Identification Number Process Location Mild Wear Heavy Wear

2.1. 2.2. Piercing Piercing PiercingPositionPiercing Position PositionAPosition B BB 2.2. PiercingPiercing Position Position BB SensorsSensors 2021 2021, 21, ,21 x ,FOR x FOR PEER PEER REVIEW REVIEW 5 of5 of23 23

3.3. 3. PiercingPiercingPiercing Position Position Position C C C

2. Piercing Position B

In general, in stamping, each vibration signal cycle represents three actions (Figure5): (1) contact of the stamping tool with the metal sheet, (2) shear force in the metal sheet that TableTable 2. 2.Data Data obtained obtained from from the the experiment. experiment. is engendered by the impaction of the tool against the metal sheet, and (3) separation of ClassClassthe toolName Name from the metal Class Class sheet. Type Type Number Number of of Samples Samples Average Average Stamping Stamping Time Time (s) (s) HealthyHealthy Condition Condition Class Class 1 1 280280 0.260.26 HeavyHeavy Wear Wear Position Position A A Class Class 2 2 280280 0.1830.183 HeavyHeavy Wear Wear Position Position B B Class Class 3 3 280280 0.270.27 HeavyHeavy Wear Wear Position Position C C Class Class 4 4 280280 0.3930.393 MildMild Wear Wear Position Position A A Class Class 5 5 280280 0.1130.113 MildMild Wear Wear Position Position B B Class Class 6 6 280280 0.130.13 MildMild Wear Wear Position Position C C Class Class 7 7 280280 0.120.12

(a()a ) (b()b ) FigureFigure 4. 4.(a )( atools) tools used used and and the the (b ()b press) press machine. machine.

InIn general, general, in in stamping, stamping, each each vibration vibration si gnalsignal cycle cycle represents represents three three actions actions (Figure (Figure 5):5): (1) (1) contact contact of of the the stamping stamping tool tool with with the the metal metal sheet, sheet, (2) (2) shear shear force force in in the the metal metal sheet sheet thatthat is isengendered engendered by by the the impaction impaction of of the the tool tool against against the the metal metal sheet, sheet, and and (3) (3) separation separation ofof the the tool tool from from the the metal metal sheet. sheet.

FigureFigure 5. 5.One One signal signal cycle cycle in inthe the stamping stamping process. process.

Sensors 2021, 21, x FOR PEER REVIEW 5 of 23

3. Piercing Position C Sensors 2021, 21, x FOR PEER REVIEW 5 of 23

3. Piercing Position C Table 2. Data obtained from the experiment.

Class Name Class Type Number of Samples Average Stamping Time (s) Healthy Condition Class 1 280 0.26 Heavy Wear Position A Class 2 280 0.183 Heavy Wear Position B Class 3 280 0.27 Heavy Wear Position C Class 4 280 0.393 Table 2. Data obtained from the experiment. Sensors 2021, 21, 262Mild Wear Position A Class 5 280 0.113 5 of 21 MildClass Wear Name Position B Class Class Type 6 Number of280 Samples Average Stamping0.13 Time (s) MildHealthy Wear Condition Position C Class Class 1 7 280280 0.260.12 Heavy Wear Position A Class 2 280 0.183 Heavy Wear Position B Class 3 280 0.27 Heavy Wear Position C Class 4 280 0.393 Mild Wear Position A Class 5 280 0.113 Mild Wear Position B Class 6 280 0.13 Mild Wear Position C Class 7 280 0.12

(a) (b)

FigureFigure 4. 4.( a(a)) tools tools used used and and the the ( b(b)) press press machine. machine.

In general,Table 2. Datain stamping, obtained fromeach thevibration experiment. signal cycle represents three actions (Figure

5): (1) contact of the stamping tool with the metal sheet, (2) shear force in the metal sheet Class Namethat is engendered Class Type by the impaction Number of the oftool Samples against the metal Average sheet, Stamping and (3) Timeseparation (s) Healthy Conditionof the tool Classfrom 1the metal(a) sheet. 280(b) 0.26 Heavy Wear Position A Class 2 280 0.183 Figure 4. (a) tools used and the (b) press machine. Heavy Wear Position B Class 3 280 0.27 Heavy Wear Position C Class 4 280 0.393 Mild Wear Position AIn general, Class in 5 stamping, each vibration 280 signal cycle represents three 0.113 actions (Figure Mild Wear Position B5): (1) contact Class of 6the stamping tool with the 280 metal sheet, (2) shear force in 0.13 the metal sheet Mild Wear Position Cthat is engendered Class 7 by the impaction of the 280tool against the metal sheet, and 0.12 (3) separation of the tool from the metal sheet.

Figure 5. One signal cycle in the stamping process.

FigureFigure 5.5. OneOne signalsignal cyclecycle inin thethe stampingstamping process.process.

However, because a real environment was considered in this study, the conditions under which data were extracted could not be adjusted. Therefore, some classes did not include signals that exhibited a typical cycle with the three actions. Instead, shock vibration was produced only when the sheet was cut through shear force. Examples of

vibration signals obtained from classes 4 and 5 are illustrated in Figure6, indicating slight or nonexistent shock vibrations when the tool came into contact with the metal sheet and separated from the metal sheet. Accordingly, we randomly selected a signal with or without action 1 or 3 as the template signal. Sensors 2021, 21, x FOR PEER REVIEW 6 of 23

However, because a real environment was considered in this study, the conditions under which data were extracted could not be adjusted. Therefore, some classes did not include signals that exhibited a typical cycle with the three actions. Instead, shock vibra- tion was produced only when the sheet was cut through shear force. Examples of vibra- Sensors 2021, 21, x FORtion PEER signals REVIEW obtained from classes 4 and 5 are illustrated in Figure 6, indicating slight or 6 of 20 Sensors 2021, 21, 262 nonexistent shock vibrations when the tool came into contact with the metal sheet and 6 of 21 separated fromseparated the metal from sheet. the Accordingly,metal sheet. weAccordingly, randomly weselected randomly a signal selected with aor signal with or without actionwithout 1 or 3 as action the template 1 or 3 as signal. the template signal.

Figure(a 6.) Vibration signals obtained from (a) class 4 and(b )( b) class 5 in the experimental stamping process. FigureFigure 6. Vibration 6. Vibration signals signals obtained obtained from from (a) class (a) class 4 and 4 (andb) class (b) class 5 in the 5 in experimental the experimental stamping stamping process. process. 2.2. Signal Acquisition 2.2. Signal Acquisition2.2. SignalTool Acquisitioncondition data were obtained from an LCP-60H press machine (Ing Yu Ma- chinery, Taichung, Taiwan) with a capacity of 60 tons and an automatic sheet metal Tool conditionTool data condition were obtained data were from obtained an LCP-60H from an LCP-60Hpress machine (Ing Yu (IngMa- Yu Machinery, feeder. The sheet material was 1.5 mm thick SPCC. Table 3 lists the experimental param- chinery, Taichung,Taichung, Taiwan) Taiwan) with with a acapacity capacity of of 60 tonstons and and an an automatic automatic sheet sheet metal metal feeder. The sheet eters for the stamping process. Two multiaxis PCB356A15 accelerometers with a sensi- feeder. The sheetmaterial material was 1.5was mm 1.5 thickmm thick SPCC. SPCC. Table3 Table lists the3 lists experimental the experimental parameters param- for the stamping tivity of 100 mV/g and measurement precision of ±50 g were mounted at the top and eters for the stampingprocess. Twoprocess. multiaxis Two multia PCB356A15xis PCB356A15 accelerometers accelerometers with a sensitivity with a sensi- of 100 mV/g and bottom sections of the tool (Figure 7) to measure vibrations at a sampling rate of 25,600 tivity of 100 measurementmV/g and measurement precision of precision±50 g were of mounted±50 g were at themounted top and at bottom the top sections and of the tool Hz; NI 9234 served as the data acquisition module. Throughout the stamping process, bottom sections(Figure of the7) totool measure (Figure vibrations 7) to measure at a sampling vibrations rate at of a 25,600sampling Hz; rate NI 9234 of 25,600 served as the data signal data were recorded for 10 s whenever a vibration shock was detected; all classes Hz; NI 9234 servedacquisition as the module. data acquisition Throughout module. the stamping Throughout process, the signal stamping data wereprocess, recorded for 10 s had more than 1 h of raw data on stamping vibrations (comprising more than 92 million signal data werewhenever recorded a vibration for 10 s whenever shock was a detected; vibration all shock classes was had detected; more than all classes 1 h of raw data on data points). had more thanstamping 1 h of raw vibrations data on (comprisingstamping vibrations more than (comprising 92 million datamore points). than 92 million data points). TableTable 3. 3.Experimental Experimental parameters parameters for for the the stamping stamping process. process.

Table 3. Experimental LubricationparametersLubrication fo r the stampingAnti-Corrosive process. Oil Anti-Corrosive Oil Lubrication Sheet materialAnti-CorrosiveSheet material Oil SPCC SPCC Sheet thicknessSheet thickness(mm) (mm) 1.5 1.5 Sheet material Stamping SPCC speed (rpm) 40 Stamping speed (rpm) 40 Sheet thickness (mm) Die material 1.5 SKD 11 Stamping speed (rpm)Die materialAverage Blanking 40 force SKD 11 29.5 kN Die materialAverage BlankingNumber ofSKDforce parts 11 formed 29.5 kN 1960 Average BlankingNumber force of parts formed 29.5kN 1960 Number of parts formed 1960

Figure 7. Positions of the accelerometer sensors. Figure 7. Positions of the accelerometer sensors.

2.3.2.3. DataData ProcessingProcessing TheThe obtainedobtained vibrationvibration signalssignals werewere thenthen analyzed analyzed visually. visually. FigureFigure8 8presents presents a a flowchartflowchart of of the the data data preprocessing preprocessing procedure. procedure. This procedureThis procedure involved involved two stages: two signalstages: acquisition and 1D CNN model execution. For signal acquisition, the raw data were

Sensors 2021, 21, x FOR PEER REVIEW 8 of 23

2.3. Data Processing The obtained vibration signals were then analyzed visually. Figure 8 presents a flowchart of the data preprocessing procedure. This procedure involved two stages: signal acquisition and 1D CNN model execution. For signal acquisition, the raw data were first transformed to a resolution of n values by using root mean (RMS) val- ues to expedite the template matching process. After the acquisition of a signal, the raw Sensors 2021, 21, 262 data were processed for the 1D CNN model. Such processing is necessary because7 of 21it prevents extreme input values, which lead to poor results, and because a 1D CNN model cannot process data with inconsistent input lengths. The data set used in this study con- tainedfirst transformed a collection to of a data resolution objects of (samplesn values) whose by using lengths root meandiffered square between (RMS) classes. values Be- to causeexpedite a CNN the template model requires matching input process. data Afterwith thea fixed acquisition length,of the a signal,data samples the raw were data transformedwere processed to ensure for the that 1D CNNthey had model. the Suchsame processinglength. Moreover, is necessary the power because spectral it prevents den- sityextreme (PSD) input was values,used as which a measure lead toof poorthe signal’s results, power and because content a versus 1D CNN frequency model cannot to ac- countprocess for data the withpossible inconsistent length variation input lengths. of the data The entered data set into used the in system. this study The contained frequency a rangecollection was ofset data to 50% objects of the (samples) sampling whose rate, lengths which differedwas maintained between at classes. 25.6 kHz. Because We did a CNN not splitmodel the requires signal inputbecause data doing with so a fixedmay length,obscure the meaningful data samples features were transformedin the signal toitself. ensure In general,that they the had PSD the sameis used length. to transform Moreover, time-domain the power spectralsignals of density different (PSD) lengths was used into asfre- a quency-domainmeasure of the signal’s signals powerof the contentsame length. versus Mo frequencyreover, it to can account be used for theto analyze possible the length fre- quencyvariation of ofsignals the data 𝑥(𝑡 entered) as the intoordinary the system. Fourier Thetransform frequency 𝑥(𝜔) range with wasa finite set interval to 50% ofof the(0, T).sampling The truncated rate, which Fourier was transform maintained can atbe 25.6 obtained kHz. as We follows: did not split the signal because doing so may obscure meaningful features1 in the signal itself. In general, the PSD is used to transform time-domain signals𝑥( of𝜔) different= 𝑥 lengths(𝑡)𝑒 into𝑑𝑡 frequency-domain signals(1) of √𝑇 the same length. Moreover, it can be used to analyze the frequency of signals x(t) as the ordinaryA uniform Fourier resolution transform wasxˆ( notω) withobtained a finite even interval after the of data (0, T).were The transformed truncated Fourierto have atransform fixed frequency can be obtainedrange. This as follows:problem can be resolved using simple linear interpolation, which is mathematically expressed as follows: Z T 1 −iωt xˆ(ω) =𝑦−𝑦√ 𝑦x(t−𝑦)e dt (1) T 0= (2) 𝑥−𝑥 𝑥 −𝑥

Figure 8. Data processing flowchart.flowchart.

TheA uniform PSD generally resolution represents was not obtaineda very low even power after theamplitude. data were If transformedsuch values toare have en- tereda fixed into frequency the CNN range. algorithm This problemwithout normalization can be resolved (i.e., using conversion simple linear to a value interpolation, between 1which and 0), is mathematicallythe CNN activation expressed function as follows:will not function as intended, thus preventing the model from learning effectively. The normalization equation is expressed in Equation (3): y − y y − y 0 = 1 0 (2) x − x0 x1 − x0 The PSD generally represents a very low power amplitude. If such values are entered into the CNN algorithm without normalization (i.e., conversion to a value between 1 and 0), the CNN activation function will not function as intended, thus preventing the model from learning effectively. The normalization equation is expressed in Equation (3):

ei − Emin Normalized (ei) = (3) Emax − Emin Sensors 2021, 21, x FOR PEER REVIEW 9 of 23

𝑒 −𝐸 𝑁𝑜𝑟𝑚𝑎𝑙𝑖𝑧𝑒𝑑 (𝑒) = (3) 𝐸 −𝐸

2.4. Template Matching To simplify the comparative evaluation of several distance metrics, template Sensors 2021, 21, 262matching was used. This technique is designed to be sufficiently simple to avoid confu- 8 of 21 sion and accelerate the computation process. This technique mainly involves calculating the distance between templates and the testing signal and2.4. determining Template Matching the boundary-dependent similarity decisions (Figure 9). In this process, vibration data signals are continuously extracted using sliding windows, To simplify the comparative evaluation of several distance metrics, template matching which are then converted to a length equal to the length of the template used. Finally, the was used. This technique is designed to be sufficiently simple to avoid confusion and value of each incoming data point is calculated. accelerate the computation process. Furthermore, several classes of signals can be extracted. These signals are calculated This technique mainly involves calculating the distance between templates and the in terms of distance metrics; specifically, the local minima are calculated to determine the testing signal and determining the boundary-dependent similarity decisions (Figure9). signal that is closest in value to the signal template. The local minima must be smaller In this process, vibration data signals are continuously extracted using sliding windows, than a predetermined threshold value. Signal templates are obtained from manual ob- which are then converted to a length equal to the length of the template used. Finally, servations and random selection to avoid possible bias against one method or type of the value of each incoming data point is calculated. data.

Figure 9. Schematic of the signal matching protocol. Figure 9. Schematic of the signal matching protocol.

2.5. Distance Metrics Furthermore, several classes of signals can be extracted. These signals are calculated This studyin used terms several of distance distance metrics; metrics, specifically, namely, the the local root minima mean are square calculated error to determine the (RMSE), correlationsignal coefficient, that is closest kurtosis, in value skew toness, the signal and mean, template. on the The assumption local minima that must the be smaller than shape of the targeta predetermined pattern rema thresholdined constant. value. SignalConsider templates signals are A obtainedand B with from signal manual observations lengths PA and andPB, respectively. random selection The equations to avoid possible for the biasdistance against metrics one methodfor these or signals type of data. are as follows: 2.5. Distance Metrics This study used several𝑅𝑀𝑆𝐸: distance metrics, namely, the root mean square error (RMSE), correlation coefficient, kurtosis, skewness, and mean, on the assumption that the shape of 1 (4) the target pattern remained constant. Consider signals A and B with signal lengths PA and 𝑑 = ∙{𝐴(𝑛) −𝐵(𝑛)} . PB, respectively. The equations𝑃 for the distance metrics for these signals are as follows: RMSE : s P 1 2 (4) drsme = P · ∑ {A(n) − B(n)} . n=1

Correlation Coef ficient : P ∑n=1{A(n)−A}·{B(n)−B} (5) dcorr = q q . P 2 P 2 ∑n=1{A(n)−A} · ∑n=1{B(n)−B} Sensors 2021, 21, 262 9 of 21

Kurtosis : PA 4 PB 4 ∑n=1{A(n)−A} ∑n=1{B(n)−B} (6) dkur = 4 − 4 . PA·σA PB·σB Skewness : 3 1 P (A −B )−(A−B) P ∑n=1{ n n } (7) dskew = q 3 1 P 2 P ∑i=1{(An−Bn)−(A−B)}

Mean : n (8) ∑i=1(Ai−Bi) dmean = n A(n) and B(n) represent the nth data points of signals A and B, respectively, and P represent the length of the signals.

2.6. Autothreshold The signal matching protocol (Figure7) includes a step wherein a threshold is defined for the algorithm to search for the local minima. The algorithm comprises an autothreshold value from the template used, a standard deviation equation for the RSME metric Equation (9), and the variance for the mean distance metric Equation (10). The signals are captured after a comparison between the incoming signals and template signals; if an incoming signal has the same shape and magnitude as the template signal, the value of the distance metric decreases or increases, as discussed in Section 3.1.

Standard Deviation : q 2 (9) ∑(xi−µ) σ = P

Variance : P 1 2 (10) V = P−1 ∑ |xi − µ| i=1

xi = value from the template signal; µ = mean value of the template signal; P = template signal length.

2.7. Convolutional Neural Network Deep CNNs are designed to be operate on 2D datasets of images and videos; however, 1D CNNs have clear advantages over 2D CNNs for capturing signals. Like 2D CNNs, 1D CNNs also serve as trainable filters that can learn to filter important features from 1D data. The output from 1D CNNs can then be used in an FCNN (Figure 10) to perform Sensors 2021, 21, x FOR PEER REVIEWconventional classification operations. 1D CNNs differ from 2D CNNs in that they use a 1D10 of 20 array for feature maps and kernels; hence, the kernel size and parameters for subsampling are in the form of scalars. The FCNN uses the conventional backpropagation (BP) method.

FigureFigure 10. 1D 10. convolutional 1D convolutional neural neural network network (CNN) (CNN) with awith fully a connectedfully connected layer layer [12]. [12].

1D forward propagation from convolution layer l − 1 to the input of a neuron in layer l is expressed as shown in Equation (11).

𝑥 𝑏 𝑐𝑜𝑛𝑣1𝐷𝑤 ,𝑠 (11) where 𝑥 represents the input, 𝑏 is the bias of the 𝑘th neuron at layer 𝑙, and 𝑠 is the output of the 𝑖th neuron at layer 𝑙 − 1. Moreover, 𝑤 represents the kernel (“mask” or “weight”) from the 𝑖th neuron at layer 𝑙−1 to the 𝑘th neuron at layer 𝑙. Therefore, if 𝑦 is defined as the output, it can be expressed by a function of 𝑥, as shown in Equation (12): 𝑦 𝑓𝑥 (12) 𝑠 𝑦 ↓𝑠𝑠 where 𝑠 denotes an output of the 𝑘th neuron in the 𝑙 layer and ↓ 𝑠𝑠 denotes a downscaling factor for the output with a scale factor of 𝑠𝑠. In an FCNN, BP is performed starting from the multilayer perceptron if it has NL classes. The mean square error (MSE) is expressed as follows:

𝐸 𝑀𝑆𝐸𝑡 ,𝑦 ,…,𝑦 𝑦 𝑡 (13) where 𝐿1 represents the input layer, 𝑙𝐿 represents the identity between an input and output, p represents the input vector, 𝑡 represents the corresponding target, and 𝑦 ,…,𝑦 represent the output vectors. In this study, batch normalization (BN) was used after the application of the con- volution filters. Suppose that the data contain m training examples; the mean can be calculated using Equation (14): 1 𝜇 ← 𝑥 (14) ℬ 𝑚 Then, the variance of these training examples or the mini batch can be calculated as follows: 1 𝜎 ← 𝑥 𝜇 (15) ℬ 𝑚 ℬ After the variance is obtained, the formula for BN can be applied, as shown in Equation (16), where 𝛾 and 𝛽 are trainable parameters.

𝑥 𝜇ℬ 𝑦 ←𝛾 𝛽𝑩𝑵 𝒙 , 𝒊 (16) 𝜎ℬ 𝜖

Sensors 2021, 21, 262 10 of 21

1D forward propagation from convolution layer l − 1 to the input of a neuron in layer l is expressed as shown in Equation (11).

Nl−1   l = l + l−1 l−1 xk bk ∑ conv1D wk , sk (11) i=1

l l l−1 where xk represents the input, bk is the bias of the kth neuron at layer l, and sk is the l−1 output of the ith neuron at layer l − 1. Moreover, wk represents the kernel (“mask” or l “weight”) from the ith neuron at layer l − 1 to the kth neuron at layer l. Therefore, if yk is l defined as the output, it can be expressed by a function of xk, as shown in Equation (12):   yl = f xl k k (12) l l sk = yk ↓ ss

l where sk denotes an output of the kth neuron in the l layer and ↓ ss denotes a downscaling factor for the output with a scale factor of ss. In an FCNN, BP is performed starting from the multilayer perceptron if it has NL classes. The mean square error (MSE) is expressed as follows: NL  h i  p2 E = MSE tp, yL,..., yL = yL − t (13) p 1 NL ∑ i i i=0 where L = 1 represents the input layer, l = L represents the identity between an input p and output, p represents the input vector, t represents the corresponding target, and h i i yL,..., yL represent the output vectors. 1 NL In this study, batch normalization (BN) was used after the application of the convolu- tion filters. Suppose that the data contain m training examples; the mean can be calculated using Equation (14): 1 m µB ← ∑ xi (14) m i=1 Then, the variance of these training examples or the mini batch can be calculated as follows: m 2 1 2 σB ← ∑(xi − µB) (15) m i=1 After the variance is obtained, the formula for BN can be applied, as shown in Equation (16), where γ and β are trainable parameters.

xi − µB yi ← γ q + β = BNγ,β(xi) (16) 2 σB + e

Moreover, LeakyReLU nonlinearity can be used, as expressed in Equation (17):

 x, i f x > 0 f (x) = (17) ax, otherwise

The reference model of 1D CNN used in this study is shown in Figure 11. It com- bines convolutional filters, batch normalization, following by performing LeakyReLU and MaxPooling to attain downsampling. Sensors 2021, 21, x FOR PEER REVIEW 12 of 23

𝑥, 𝑖𝑓 𝑥 >0 𝑓(𝑥) = (17) 𝑎𝑥, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒

Sensors 2021, 21, 262 The reference model of 1D CNN used in this study is shown in Figure 11. It com-11 of 21 bines convolutional filters, batch normalization, following by performing LeakyReLU and MaxPooling to attain downsampling.

FigureFigure 11.11. DiagramDiagram of ofa convolutional a convolutional network network with attached with attached fully connected fully connected neural network neural net- (FCNN). work (FCNN).

3.3. ResultsResults TheThe mainmain resultsresults ofof thisthis study study pertained pertained toto two two aspects, aspects, namely, namely, signal signal acquisition acquisition andand signalsignal recognition.recognition. ResultsResults pertainingpertaining toto signalsignal acquisitionacquisition revealedrevealed thethe system’ssystem’s signalsignal acquisitionacquisition accuracy,accuracy, andand thosethose pertainingpertaining toto signalsignal recognitionrecognition revealedrevealed thethe 1D1D CNNCNN model’smodel’s performanceperformance inin classifyingclassifying thethe stampingstamping signals.signals. TemplateTemplate matchingmatching were were conductedconducted usingusing thethe 64-bit64-bit versionversion ofof LabVIEWLabVIEW 2019,2019, whereaswhereas 1D1D CNN CNN modeling modeling was was conductedconducted using using Tensorflow Tensorflow in in Python. Python. These These systems systems were were run run on on a Windowsa Windows computer comput- wither with an AMDan AMD Ryzen Ryzen 7 3750H 7 3750H processor, processor, 24 GB 24 of GB RAM, of RAM, and NVIDIA and NVIDIA GeForce GeForce RTX 2060. RTX 2060. 3.1. Distance Metrics Performance 3.1. DistanceFive algorithms Metrics werePerformance compared with respect to two key indicators. The first indicator was the distinctive distance, which was evaluated in the absence of a signal, in the presence of a noise signal, and in the presence of a stamping signal. Figure 12 presents the performance of each distance metric. All three algorithms could detect vibration signals to some degree, and the kurtosis, RMSE, skewness, and mean values indicated that the algorithms could distinguish between time points with a noise signal, without a signal, and with a stamping signal. However, the correlation coefficient tends to fluctuate with a small change in input value. This is because the calculation of the correlation coefficient ignores the magnitude and only focuses on the “shape” of each Sensors 2021, 21, x FOR PEER REVIEW 13 of 23

Five algorithms were compared with respect to two key indicators. The first indi- cator was the distinctive distance, which was evaluated in the absence of a signal, in the presence of a noise signal, and in the presence of a stamping signal. Figure 12 presents the performance of each distance metric. All three algorithms could detect vibration signals to some degree, and the kurtosis, RMSE, skewness, and mean values indicated that the algorithms could distinguish between time points with a Sensors 2021, 21, 262 noise signal, without a signal, and with a stamping signal. However, the correlation12 of co- 21 efficient tends to fluctuate with a small change in input value. This is because the calcu- lation of the correlation coefficient ignores the magnitude and only focuses on the “shape” of each vector. Therefore, the correlation coefficient is unsuitable as a distance vector. Therefore, the correlation coefficient is unsuitable as a distance metric for signal metric for signal matching after acquiring a vibratory signal because determining the matching after acquiring a vibratory signal because determining the optimal threshold optimal threshold value would be difficult. The second indicator was the computation value would be difficult. The second indicator was the computation time. time.

(a) (b)

(c) (d)

(e) (f)

Figure 12. (a) Root mean square (RMS) of signals from the vibration in the stamping process. Sliding windows were applied from the template signal across the RMS signal, resulting in (b) the correlation coefficient, (c) kurtosis, (d) RMSE, (e) skewness, and (f) mean metrics.

The vibration signal obtained in the stamping process could be as short as 0.1 s. Therefore, to capture such short signals, the computation time should be measured for each distance metric in order to determine the metric that yields the shortest computation time. Figure 13 presents the computation time of each distance metric. As expected, Sensors 2021, 21, x FOR PEER REVIEW 14 of 23

Figure 12. (a) Root mean square (RMS) of signals from the vibration in the stamping process. Sliding windows were ap- plied from the template signal across the RMS signal, resulting in (b) the correlation coefficient, (c) kurtosis, (d) RMSE, (e) skewness, and (f) mean metrics.

Sensors 2021, 21, 262 The vibration signal obtained in the stamping process could be as short as13 0.1 of 21s. Therefore, to capture such short signals, the computation time should be measured for each distance metric in order to determine the metric that yields the shortest computation time. Figure 13 presents the computation time of each distance metric. As expected, the theRMSE RMSE and and mean mean had had the the shortest shortest computation computation time time because because only simple mathematicalmathematical operationsoperations areare used.used. AlthoughAlthough thethe correlationcorrelation coefficient,coefficient, kurtosis,kurtosis, andand skewnessskewness hadhad shortshort computationcomputation times, times, they they required required the the data da lengthta length to beto scaled,be scaled, which which lengthened lengthened the computationthe computation time. time. Accordingly, Accordingly, the RMSE the RM andSE mean and wasmean indicated was indicated the be most the suitablebe most (i.e.,suitable best (i.e., performing) best performing) distance metric.distance metric.

FigureFigure 13.13. DistanceDistance metrics:metrics: computationcomputation time.time.

3.2.3.2. AutothresholdingAutothresholding PerformancePerformance InIn thisthis study,study, autothresholdingautothresholding performanceperformance waswas testedtested inin termsterms ofof thethe RMSERMSE andand mean.mean. These tests involved involved assessments assessments of of sensitivity sensitivity or or recall, recall, miss miss rate, rate, and and precision precision or orpositive positive predicted predicted value. value. In this In thisexperiment experiment,, template template signals signals of random of random size were size wereman- manuallyually selected selected from from random random sections sections in each in each class; class; the selection the selection process process was wasrepeated repeated four fourtimes times in each in eachclass. class. All the All signals the signals were wereobtained obtained when when the tool the cut tool the cut metal the metalsheet. sheet.Three Threemeasures measures were wereused: used:true positive true positive (TP), (TP),false falsepositive positive (FP), (FP),and false and falsenegative negative (FN). (FN). Fig- Figureure 14b 14 showsb shows a TP a TPclassification; classification; that that is, th is,e thealgorithm algorithm could could fully fully capture capture a signal. a signal. Fig- Figure 14c presents an FP classification; that is, the algorithm failed to capture the whole ure 14c presents an FP classification; that is, the algorithm failed to capture the whole signal, with one or more parts missing. Figure 14d shows an FN classification; that is, signal, with one or more parts missing. Figure 14d shows an FN classification; that is, the the algorithm could not capture the signal of interest. The TF signal is not presented in algorithm could not capture the signal of interest. The TF signal is not presented in Figure Figure 14 because it is only applicable in the condition when no stamping signal is present. 14 because it is only applicable in the condition when no stamping signal is present. Figure 15 shows the autothresholding performance as assessed in terms of the standard deviation of the RMSE and the variance of the mean. The thresholds are indicated by the red dotted line, and the RMSE and mean values were obtained using sliding windows. This figure indicates that autothresholding could be used to determine whether local minima should be calculated. This reduces computation time while ensuring that the stamping signal is captured. Although the standard deviation of the RMSE and variance of the mean can be used as the autothreshold, we found some inconsistency when using the variance as the autothreshold (Figure 16) for some cases where the template signal was long; if the variance autothreshold is below its mean value, local minima cannot be detected. Thus, we only used the RMSE as the distance metric and the standard deviation as the autothreshold. We observed that classes with the same degree of wear also had the same type of signal in terms of shape and amplitude (Figure 17); heavy but not mild wear was associated with continuous vibration to some degree. We conducted an autothresholding experiment by using class 1 (the healthy condition), by randomly selecting one of classes 2 to 4 (heavy wear conditions), and by randomly selecting one of classes 5 to 7 (mild wear conditions). Table4 shows the results for data from the same class, and cross-class. Detailed results are shown in AppendixB. The results revealed a high sensitivity (0.950) and precision Sensors 2021, 21, x FOR PEER REVIEW 14 of 22

Sensors 2021, 21, x FOR PEER REVIEW3.2. Autothresholding Performance 15 of 23

Sensors 2021, 21, 262 In this study, autothresholding performance was tested in terms of the RMSE14 and of 21 mean. These tests involved assessments of sensitivity or recall, miss rate, and precision or positive predicted value. In this experiment, template signals of random size were man- ually selected from random sections in each class; the selection process was repeated four (0.876).times in Cross-class each class. examinationAll the signals is requiredwere obta because,ined when in real-timethe tool cut monitoring, the metal sheet. the incoming Three signalsmeasures can were be fromused: any true class. positive Table (TP),4b presentsfalse positive the results(FP), and of cross-classfalse negative examination, (FN). Fig- meaningure 14b shows that a a template TP classification; from another that is, class the isalgorithm used to capture could fully signals capture from a other signal. classes. Fig- Theure sensitivity14c presents and an precision FP classification; exceeded that 0.8 inis, some the algorithm cases but werefailed very to capture low for the other whole cases. Ansignal, investigation with one or was more conducted parts missing. for cases Figure with 14d poor shows precision, an FN and classification; these cases werethat is, found the toalgorithm involve verycould narrow not capture template the signal signals, of intere whichst. resulted The TF signal in very is narrownot presented windows in Figure for the acquisition14 because ofit is the only correct applicable signal, in resulting the condition in poor when sensitivity no stamping and precision. signal is present.

(a) (b)

(a) (b)

(c) (d) Figure 14. (a) Results of signal acquisition using autothresholding. Red dotted lines indicate the best matched signal. (b) True positive signal, (c) false positive signal, and (d) and false negative signal.

Figure 15 shows the autothresholding performance as assessed in terms of the standard deviation of the RMSE and the variance of the mean. The thresholds are indi- cated by the red dotted line, and the RMSE and mean values were obtained using sliding windows. This figure indicates that autothresholding could be used to determine whether local minima should be calculated. This reduces computation time while en- suring that the stamping signal is captured. Although the standard deviation of the (c)RMSE and variance of the mean can be used as the autothreshold,(d) we found some in- consistency when using the variance as the autothreshold (Figure 16) for some cases Figure 14. (a) Results of signal acquisition using autothresholding. Red dotted lines indicate the best matched signal. (b) Figure 14. (a) Results of signalwhere acquisition the template using autothresholding.signal was long; if Red the dotted variance lines autothreshold indicate the best is below matched its signal. mean val- True positive signal, (c) false positive signal, and (d) and false negative signal. (b) True positive signal, (c) falseue, positivelocal minima signal, cannot and (d) be and detected. false negative signal. Figure 15 shows the autothresholding performance as assessed in terms of the standard deviation of the RMSE and the variance of the mean. The thresholds are indi- cated by the red dotted line, and the RMSE and mean values were obtained using sliding windows. This figure indicates that autothresholding could be used to determine whether local minima should be calculated. This reduces computation time while en- suring that the stamping signal is captured. Although the standard deviation of the RMSE and variance of the mean can be used as the autothreshold, we found some in- consistency when using the variance as the autothreshold (Figure 16) for some cases where the template signal was long; if the variance autothreshold is below its mean val- ue, local minima cannot be detected.

(a) (b)

FigureFigure 15.15. AutothresholdingAutothresholding performance performance as as assessed assessed in in terms terms of of (a ()a standard) standard deviation deviation and and (b )(b vari-)

ancevariance of the of template the template signal signal (red dotted(red dotted line). line).

Sensors 2021, 21, x FOR PEER REVIEW 15 of 22

Sensors 2021, 21, 262 (a) (b) 15 of 21 Figure 15. Autothresholding performance as assessed in terms of (a) standard deviation and (b) variance of the template signal (red dotted line).

(a) (b)

FigureFigure 16.16. Autothresholding usingusing ((aa)) standardstandard deviationdeviation andand ((bb)) variancevariance inin samesame templatetemplate signal.signal.

Thus, we only used the RMSE as the distance metric and the standard deviation as the autothreshold. We observed that classes with the same degree of wear also had the same type of signal in terms of shape and amplitude (Figure 17); heavy but not mild wear was associ- ated with continuous vibration to some degree. We conducted an autothresholding ex- periment by using class 1 (the healthy condition), by randomly selecting one of classes 2 to 4 (heavy wear conditions), and by randomly selecting one of classes 5 to 7 (mild wear conditions). Table 4 shows the results for data from the same class, and cross-class. De- (a) tailed results are shown in Appendix(b) B. The results revealed a high(c) sensitivity (0.950) and (a) precision (0.876). Cross-class(b) examination is required because, in(c) real-time monitoring, the incoming signals can be from any class. Table 4b presents the results of cross-class examination, meaning that a template from another class is used to capture signals from other classes. The sensitivity and precision exceeded 0.8 in some cases but were very low for other cases. An investigation was conducted for cases with poor precision, and these cases were found to involve very narrow template signals, which resulted in very narrow windows for the acquisition of the correct signal, resulting in poor sensitivity and preci- sion.

(d) (e) (f) (d) (e) (f) 1 Figure 17 Figure 17. Vibratory signal from (a)–(f) represent class 2 to 7, respectively. 1 Figure 17 Table 4. Average sensitivity, precision, and miss rates for data acquisition (a) within the same class and (b) cross various classes.

(a) (b) Sensitivity(a) (b) Sensitivity Class Precision Miss Rate Class Precision Miss Rate Rate Rate 1 1 0.892 0 1→4 1 0.892 0 3 1 0.821 0 3→1 0.922 0.815 0.077 7 0.851 0.917 0.148 7→2 0.797 0.866 0.202 Average 0.95 0.876 0.049 Average 0.9067 0.8583 0.093

3.3. Tool Condition Classification Using a 1D CNN and Model Optimization The proposed CNN was tested using data obtained according to the description in Section 2.1. These input data were then preprocessed to have the same data length by using the method described in Section 2.3 (Figure 18).

Sensors 2020, 20, x; doi: FOR PEER REVIEW www.mdpi.com/journal/sensors Sensors 2020, 20, x; doi: FOR PEER REVIEW www.mdpi.com/journal/sensors Sensors 2021, 21, x FOR PEER REVIEW 16 of 22

(c) (d)

(e) (f) Figure 17. Vibratory signal from (a)–(f) represent class 2 to 7, respectively.

Table 4. Average sensitivity, precision, and miss rates for data acquisition (a) within the same class and (b) cross various classes.

(a) (b) Sensitivity Miss Sensitivity Miss Class Precision Class Precision Rate Rate Rate Rate 1 1 0.892 0 1→4 1 0.892 0 3 1 0.821 0 3→1 0.922 0.815 0.077 7 0.851 0.917 0.148 7→2 0.797 0.866 0.202 Average 0.95 0.876 0.049 Average 0.9067 0.8583 0.093

3.3. Tool Condition Classification Using a 1D CNN and Model Optimization Sensors 2021, 21, 262 The proposed CNN was tested using data obtained according to the description16 of in 21 Section 2.1. These input data were then preprocessed to have the same data length by using the method described in Section 2.3 (Figure 18).

FigureFigure 18. 18. NormalizedNormalized transformation transformation of of Fast Fast Fourier Fourier Tr Transformansform from time-based to frequency-based.

FigureFigure 1919 presents the classification classification resu resultslts produced produced by by the model (referenced in SectionSection 2.7)2.7) when testing data were used. Fo Forr every class, the predicted values matched thethe actual values. WeWe couldcould analyzeanalyze misclassifications misclassifications using using a confusiona confusion matrix; matrix; this this matrix ma- trixwas was also also used used to evaluate to evaluate classification classification accuracy. accuracy. An accuracy An accuracy rateof rate 100% of 100% was obtained, was ob- tained,but many but previous many previous studies havestudies reported have thatreported adaptive that 1Dadaptive CNNs 1D can CNNs perform can better perform than betterother classificationthan other algorithms,classification such algorithms as support, such vector as machine support (SVM), vector k-nearest machine neighbors, (SVM), k-nearestSVM ensemble, neighbors, and artificialSVM ensemble, neural network and artifici architectures.al neural network In this study, architectures. hyperparameter In this optimization was used because the model must be as simple as possible to reduce the computation time while maintaining a high accuracy for classifying each class correctly. The total numbers of parameters obtained by the reference model described in Section 2.7 are shown in Figure 20. The parameters comprised weights and biases that were either trainable or not trainable in the 1D CNN model. The number of nodes in the FCNN and CNN filters was reduced to decrease the number of parameters, but other hyperparameters, such as the alpha value for LeakyReLU and the dropout, were not altered because they did not significantly reduce the size of the model. Of the data available, 30% and 70% were validation and training data, respectively. The results obtained after the training process indicated significant differences between models with respect to the total number of parameters (Table5). Although the number of parameters decreased sequentially from the reference model to the smallest model, the ccuracy did not decrease significantly. Furthermore, the second smallest model with 200,000 parameters exhibited a high level of accuracy; although, it had only 1/137 of the parameters in the reference model; however, when that model had only 100,000 parameters, the accuracy decreased, and we thus did not decrease the number of parameters further.

3.4. Data Reduction for Model Training The case in this study only used one type of progressive stamping tool; however, in actual case scenarios, the proposed model architecture could be used for other types of stamping tools. Figure 21 illustrates a graph that can serve as a reference for determining the quantity of data required to retrain the model. The graph depicts the behavior of the two smallest models when they were trained using different quantities of data. The data shown in the figure include data that were used for training and validation; the remaining data were used for testing. Because the accuracy of each model was tested using the test data, these data had not been used during model training. As stated in Section 3.3, the model with 200,000 parameters yielded the lowest accuracy, which can be confirmed by the results shown in Figure 21. Sensors 2021, 21, x FOR PEER REVIEW 17 of 22

study, hyperparameter optimization was used because the model must be as simple as possible to reduce the computation time while maintaining a high accuracy for classify- Sensors 2021, 21, 262 ing each class correctly. The total numbers of parameters obtained by the reference 17model of 21 described in Section 2.7 are shown in Figure 20. The parameters comprised weights and biases that were either trainable or not trainable in the 1D CNN model. Sensors 2021, 21, x FOR PEER REVIEW 18 of 23

FigureFigure 19.19. ConfusionConfusion matrixmatrix ofof thethe classificationclassification results.results. Figure 19. Confusion matrix of the classification results.

Figure 20. Total numbers of parameters from the reference model. FigureFigure 20. 20.Total Total numbers numbers of of parameters parameters from from the the reference reference model. model. The number of nodes in the FCNN and CNN filters was reduced to decrease the The number of nodes in the FCNN and CNN filters was reduced to decrease the Tablenumber 5. Model of parameters, size and accuracy. but other hyperparameters, such as the alpha value for number of parameters, but other hyperparameters, such as the alpha value for LeakyReLU and the dropout, were not altered because they did not significantly reduce LeakyReLU andTotal the Parameters dropout, were not altered because they Accuracy did not (%)significantly reduce the size of the model. Of the data available, 30% and 70% were validation and training the size of the model. Of the data available, 30% and 70% were validation and training data, respectively.14,025,991 The results obtained after the training process 100 indicated significant data, respectively. The results obtained after the training process indicated significant differences between13,040,311 models with respect to the total number of 99.8 parameters (Table 5). differences between13,026,135 models with respect to the total number 99.8of parameters (Table 5). Although the number of parameters decreased sequentially from the reference model to Although the number3,260,943 of parameters decreased sequentially from 99.8 the reference model to the smallest model, the accuracy did not decrease significantly. Furthermore, the second the smallest model,3,257,007 the accuracy did not decrease significantly. 99.8Furthermore, the second smallest model with1,631,599 200,000 parameters exhibited a high level 99.8of accuracy; although, it smallest model with 200,000 parameters exhibited a high level of accuracy; although, it had only 1/137 of the817,231 parameters in the reference model; however, 99.8 when that model had had only 1/137 of the parameters in the reference model; however, when that model had only 100,000 parameters,204,495 the accuracy decreased, and we thus 99.7 did not decrease the only 100,000 parameters, the accuracy decreased, and we thus did not decrease the number of parameters204,387 further. 99.8 number of parameters102,291 further. 98.0 Table 5. Model size and accuracy. Table 5. Model size and accuracy. The majority of one cycle monitoring were composed by signal acquisition, data Total Parameters Accuracy (%) preprocessing,Total Parameters and Accuracy prediction (%) using 1D CNN. In the present study, the actual computation 14,025,991 100 times14,025,991 of one cycle monitoring100 period of the 1960 samples were 0.14–0.18 s, and it depended 13,040,311 99.8 on the13,040,311 data length of template99.8 signal. For example, in this study case, the stamping speed was13,026,135 40 rpm which equals99.8 to 1.5 s of one cycle stamping process time. The computation times3,260,943 of one cycle monitoring99.8 period was 0.18 s; therefore, there was enough time to

real-time3,257,007 monitoring and99.8 classify the condition of tool wear during every cycle stamping 1,631,599 99.8 817,231 99.8 204,495 99.7

Sensors 2021, 21, x FOR PEER REVIEW 19 of 23

204,387 99.8 102,291 98.0

3.4. Data Reduction for Model Training The case in this study only used one type of progressive stamping tool; however, in actual case scenarios, the proposed model architecture could be used for other types of stamping tools. Figure 21 illustrates a graph that can serve as a reference for determining the quantity of data required to retrain the model. The graph depicts the behavior of the Sensors 2021, 21, 262 two smallest models when they were trained using different quantities of data. The18 ofdata 21 shown in the figure include data that were used for training and validation; the remain- ing data were used for testing. Because the accuracy of each model was tested using the test data, these data had not been used during model training. As stated in Section 3.3, process.the model The with proposed 200,000 method parameters could yielded be applied the lowest for stamping accuracy, speed which up can to 100be confirmed rpm and satisfyby the theresults industrial shown specifications in Figure 21. of most metal stamping.

Model Comparison 1 0.8 0.6 0.4 Accuracy 0.2 0 30 50 70 90 110 130 150 170 190 Number Of Training + Validating Data for Each Class

204387 Parameters 102291 Parameters

FigureFigure 21. 21.Comparison Comparison of of accuracies accuracies for for different different amounts amounts of of training training data. data.

4. ConclusionsThe majority of one cycle monitoring were composed by signal acquisition, data preprocessing,In this study, and a methodprediction of capturingusing 1D CNN. vibratory In the signals present during study, the the stamping actual computa- process wastion proposed times of andone verifiedcycle monitoring using real-case period data. of the Several 1960 distancesamples metricswere 0.14–0.18 were compared, s, and it anddepended the results on the revealed data length that the of RMSEtemplate was signal. the most For suitableexample, distance in this study metric case, due the to itsstamping effectiveness speed andwas speed.40 rpm An which autothresholding equals to 1.5 s method of one cycle was alsostamping employed process to helptime. determineThe computation the local times minimum of one threshold cycle monitoring from different period template was 0.18 signals. s; therefore, The proposedthere was methodenough was time validated to real-time based monitoring on sensitivity and andclassify precision. the condition A sensitivity of tool of wear >0.9 (indicatingduring eve- favorablery cycle stamping performance) process. was The obtained proposed from me experimentalthod could resultsbe applied for thefor same-classstamping speed and cross-classup to 100 rpm configurations, and satisfy andthe industrial a precision specifications of >0.85 was of obtained most metal for bothstamping. configurations. The results of this study can serve as a reference for the extraction of signals with periodic characteristics4. Conclusions and shape similarity. TheIn this use study, of an adaptive a method 1D of CNN capturing for classifying vibratory stamping-inducedsignals during the vibrationstamping signalsprocess waswas extensively proposed and studied. verified Localized using real-case raw data data. of vibration Several distance signals weremetrics normalized were compared, using theand PSD the andresults then revealed entered intothat the 1DRMSE CNN was Model. the most The 1Dsuitable CNN distance presents metric the advantages due to its ofeffectiveness adaptability and and speed. rapid featureAn autothresholding learning. Moreover, method the was effectiveness also employed of the to 1D help CNN de- wastermine assessed the usinglocal minimum a real-case threshold scenario. Thefrom size different of the model template in this signals. study The was proposed reduced greatlymethod owing was validated to the rapid based adaptability on sensitivity of the and 1D CNN,precision. and anA sensitivity accuracy of of up >0.9 to 99.8%(indicating was achieved.favorable The performance) amounts of was training obtained and validation from experimental data were alsoresults investigated. for the same-class An accuracy and ofcross-class >90% could configurations, be achieved withand a only precision 90 data of points >0.85 fromwas obtained each class. for These both findingsconfigurations. show that the 1D CNN can be used in real time for classifying vibration signals during stamping. The results of this study can serve as a reference for the extraction of signals with periodic Although the effectiveness of the proposed 1D CNN was verified, problems remain to be characteristics and shape similarity. addressed for its application in mass production. The proposed 1D CNN is not flexible in The use of an adaptive 1D CNN for classifying stamping-induced vibration signals terms of the number of classes. Future studies should aim to overcome this problem by was extensively studied. Localized raw data of vibration signals were normalized using using Siamese neural networks. This study’s findings can serve as a reference for manufacturers in developing strate- gies for reducing the cost of tool maintenance and improving product quality.

Author Contributions: Conceptualization, C.-Y.H.; Data curation, C.-Y.H. and Z.D.; Formal analysis, Z.D.; Investigation, C.-Y.H. and Z.D.; Methodology, C.-Y.H. and Z.D.; Resources, C.-Y.H.; Software, Z.D.; Supervision, C.-Y.H.; Validation, C.-Y.H.; Writing—original , C.-Y.H. and Z.D.; Writing— review & editing, C.-Y.H. and Z.D. All authors have read and agreed to the published version of the manuscript. Funding: This research received no external funding. Institutional Review Board Statement: Not Applicable. Sensors 2021, 21, 262 19 of 21

Informed Consent Statement: Not Applicable. Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to further study will be carried out using the same data. Sensors 2021, 21, x FOR PEER REVIEW 21 of 23 Sensors 2021, 21, x FOR PEER REVIEWConflicts of Interest: The authors declare no conflict of interest. 21 of 23 Sensors 2021, 21, x FOR PEER REVIEW 21 of 23

AppendixAppendix A A Appendix A Appendix A

(a) (b) (c) (a) (b) (c) (a) (b) (c) Figure A1. (a) Healthy condition, (b) mild wear, and (c) heavy wear at position A. FigureFigure A1. A1.( a(a)) Healthy Healthy condition, condition, ( b(b)) mild mild wear, wear, and and (c ()c heavy) heavy wear wear at at position position A. A. Figure A1. (a) Healthy condition, (b) mild wear, and (c) heavy wear at position A.

(a) (b) (c) (a) (b) (c) (a) (b) (c) Figure A2. (a) Healthy condition, (b) mild wear, and (c) heavy wear at position B. Figure A2. (a) Healthy condition, (b) mild wear, and (c) heavy wear at position B. Figure A2. (a) HealthyFigure condition, A2. (a) Healthy (b) mild condition, wear, and ( b(c)) mild heavy wear, wear and at (positionc) heavy B. wear at position B.

(a) (b) (c) (a) (b) (c) (a) (b) (c) Figure A3. (a) Healthy condition, (b) mild wear, and (c) heavy wear at position C. Figure A3. (a) Healthy condition, (b) mild wear, and (c) heavy wear at position C. Figure A3.Figure (a) Healthy A3. (a condition,) Healthy condition,(b) mild wear, (b) mild and wear,(c) heavy and wear (c) heavy at position wear at C. position C.

Sensors 2021, 21, 262 20 of 21

Appendix B

Table A1. (a) True positive signal, (b) false negative signal, (c) false positive signal.

(a) Sensitivity Rate Trial Class 1 2 3 4 1 1 1 1 1 3 1 1 1 1 7 0.714 1 0.833 0.857 (b) Precision Trial Class 1 2 3 4 1 1 0.857 0.857 0.857 3 0.857 0.857 0.857 0.71 7 0.833 1 0.833 1 (c) Miss Rate Trial Class 1 2 3 4 1 0 0 0 0 3 0 0 0 0 7 0.285 0 0.166 0.142

Table A2. (d) sensitivity, (e) precision, (f) and miss rate for cross-class capture. “1→4” means that signals were obtained from class 1 and templates were from class 4.

(d) Sensitivity Rate Trial Class 1 2 3 4 1→4 1 1 1 1 3→1 0.857 1 1 0.833 7→2 0.833 0.857 0.833 0.667 (e) Precision Trial Class 1 2 3 4 1→4 0.857 1 0.857 0.857 3→1 1 0.428 1 0.833 7→2 0.833 1 0.833 0.8 (f) Miss Rate Trial Class 1 2 3 4 1→4 0 0 0 0 3→1 0.142 0 0 0.167 7→2 0.167 0.142 0.167 0.333 Sensors 2021, 21, 262 21 of 21

References 1. Cao, J.; Brinksmeier, E.; Fu, M.; Gao, R.X.; Liang, B.; Merklein, M.; Schmidt, M.; Yanagimoto, J. Manufacturing of advanced smart tooling for metal forming. CIRP Ann. 2019.[CrossRef] 2. Wu, T.L.; Sari, D.Y.; Lin, B.T.; Chang, C.W. Monitoring of punch failure in micro-piercing process based on vibratory signal and logistic regression. Int. J. Adv. Manuf. Technol. 2017, 93, 2447–2458. [CrossRef] 3. Sari, D.Y.; Wu, T.L.; Lin, B.T. Study of sound signal for online monitoring in the micro-piercing process. Int. J. Adv. Manuf. Technol. 2018, 97, 697–710. [CrossRef] 4. Shanbhag, V.V.; Rolfe, B.F.; Pereira, M.P. Investigation of galling wear using acoustic emission frequency characteristics. Lubricants 2020, 8, 25. [CrossRef] 5. Ge, M.; Du, R.; Xu, Y. Hidden Markov model based fault diagnosis for stamping processes. Mech. Syst. Signal Process. 2004. [CrossRef] 6. Tian, H.; Wang, D.; Lin, J.; Chen, Q.; Liu, Z. Surface defects detection of stamping and grinding flat parts based on machine vision. Sensors 2020, 20, 4531. [CrossRef][PubMed] 7. Ge, M.; Zhang, G.C.; Du, R.; Xu, Y. Feature extraction from energy distribution of stamping processes using wavelet transform. JVC J. Vib. Control 2002.[CrossRef] 8. Zhang, G.; Li, C.; Zhou, H.; Wagner, T. Punching process monitoring using wavelet transform based feature extraction and semi-supervised clustering. Procedia Manuf. 2018, 26, 1204–1212. [CrossRef] 9. Sari, D.Y.; Wu, T.L.; Lin, B.T. Preliminary study for online monitoring during the punching process. Int. J. Adv. Manuf. Technol. 2017, 88, 2275–2285. [CrossRef] 10. Jinikol, M. Punching Tool Wear Modeling Using Finite Element Method. Bachelor’s Thesis, University Malaysia Pahang, Pahang, Malaysia, November 2009. 11. Zhang, G.C.; Ge, M.; Tong, H.; Xu, Y.; Du, R. Bispectral analysis for on-line monitoring of stamping operation. Eng. Appl. Artif. Intell. 2002.[CrossRef] 12. Zhao, R.; Yan, R.; Chen, Z.; Mao, K.; Wang, P.; Gao, R.X. Deep learning and its applications to machine health monitoring. Mech. Syst. Signal Process. 2019, 115, 213–237. [CrossRef] 13. Abdoli, S.; Cardinal, P.; Lameiras Koerich, A. End-to-end environmental sound classification using a 1D convolutional neural network. Expert Syst. Appl. 2019.[CrossRef] 14. Eren, L. Bearing fault detection by one-dimensional convolutional neural networks. Math. Probl. Eng. 2017, 2017.[CrossRef] 15. Liu, J.; Gibson, S.J.; Mills, J.; Osadchy, M. Dynamic spectrum matching with one-shot learning. Chemom. Intell. Lab. Syst. 2019. [CrossRef] 16. Pandhare, V.; Singh, J.; Lee, J. Convolutional Neural Network Based -Element Bearing Fault Diagnosis for Naturally Occurring and Progressing Defects Using Time-Frequency Domain Features. In Proceedings of the 2019 Prognostics and System Health Management Conference, PHM-Paris, Paris, France, 2–5 May 2019. 17. Zhang, J.; Sun, Y.; Guo, L.; Gao, H.; Hong, X.; Song, H. A new bearing fault diagnosis method based on modified convolutional neural networks. Chin. J. Aeronaut. 2020, 33, 439–447. [CrossRef] 18. Frank, J.; Mannor, S.; Pineau, J.; Precup, D. Time series analysis using geometric template matching. IEEE Trans. Pattern Anal. Mach. Intell. 2013.[CrossRef][PubMed] 19. Omachi, S.; Omachi, M. Fast template matching with polynomials. IEEE Trans. Image Process. 2007.[CrossRef][PubMed] 20. Chen, J.H.; Chen, C.S.; Chen, Y.S. Fast algorithm for robust template matching with M-estimators. IEEE Trans. Signal Process. 2003. [CrossRef] 21. Chang, W.D.; Im, C.H. Enhanced template matching using dynamic positional warping for identification of specific patterns in electroencephalogram. J. Appl. Math. 2014, 2014.[CrossRef]