Proceedings of the 5th NA International Conference on Industrial Engineering and Operations Management, Detroit, Michigan, USA, August 10 - 14, 2020

Identification of Cavendish Maturity Using Convolutional Neural Networks

Yusuf Andas Ramadhan, Esmeralda C. Djamal and Fatan Kasyidi Department of Informatics, Universitas Jenderal Achmad Yani, West Java, Indonesia [email protected], [email protected]

Abdul Talib Bon Department of Production and Operations, University Tun Hussein Onn, Malaysia [email protected]

Abstract

Banana (Musa paradisiaca) is a fruit plant that is rich in vitamins, minerals and carbohydrates. Image processing is a method or technique used to process images by manipulating them into desired data to obtain certain information. Image processing technology is used here to improve accuracy in determining the maturity of a fruit. The maturity level of a banana, indicated by the color of its skin, brings different benefits. Green bananas help control blood sugar levels, yellowish-green bananas are good for those on a diet, yellow bananas contain very high levels of antioxidants and are good for digestion, and yellow bananas with brown spots help the immune system and are able to prevent cancer. This study uses the Convolutional Neural Network (CNN) method, one of the effective methods by which computers learn from images. With this method the computer can later identify objects in a given image. Each object in an image has special features (such as color, size and texture), so it is very important to convolve the image with an appropriate filter to obtain these features. This research resulted in a ripeness identification system with 4 classes: "Unripe", "Through Ripe", "Almost Ripe" and "Ripe". All classes are identified using CNN. With Adam optimization, the model achieved 93.25% accuracy on training data and 69.34% accuracy on testing data. With SGD optimization, the model achieved 94.12% accuracy on training data and 71.95% accuracy on test data.

Keywords

Image Processing, Banana, Identification, and Convolutional Neural Networks.

1. Introduction

Banana (Musa paradisiaca) is a fruit plant that contains many vitamins, minerals and carbohydrates. In Indonesia, bananas are planted less intensively, so Indonesian bananas have a low production standard and are unable to compete in international markets. Bananas are ready to be harvested when they are 80-100 days old, depending on the variety. The maturity level of a banana, indicated by the color of its skin, brings different benefits. Green bananas help control blood sugar levels, yellowish-green bananas are good for dieting, yellow bananas contain very high levels of antioxidants and are good for digestion, and yellow bananas with brown spots help the immune system and prevent cancer, while brownish-yellow bananas are useful for raising blood sugar and fighting inflammation (Mazen and Nashat, 2019).

Deep Learning is a part of machine learning that uses several layers of non-linear processing units, where each layer uses the output of the previous layer as input. Convolutional Neural Networks (CNN) are classified as deep machine learning (Schmidhuber, 2015). A previous study classified the maturity level of bananas using three CNN models with three variables, including color and texture (Maimunah, Handayanto and Herlawati, 2019). Another study used CNN with the VGG-16 architecture to identify the maturity of dates (Nasiri, Taheri-Garavand and Zhang, 2019). Besides CNN, other studies use HOG-SVM to detect apple trees based on images (Zhang and Liu, 2019).

Image processing is a method or technique used to process images by manipulating them into desired image data to obtain certain information. The use of image processing technology is expected to increase accuracy in determining the maturity of a fruit. The condition of the fruit can be estimated from the size of the object in the image if it is photographed against a background that contrasts with the color of the observed fruit (Nasiri, Taheri-Garavand and Zhang, 2019).

© IEOM Society International

Other studies have classified pineapple based on its maturity by dividing it into three categories: raw pineapple, half-ripe pineapple, and ripe pineapple. Classification is done by CNN using the MobileNet architecture. The model produces an accuracy of 98.38% and 90.77% (Chaikaew et al., 2019). Previous research performed fruit identification using CNN, where the testing accuracy of a CNN model depends heavily on the dataset used for training. Variations in the training dataset add ambiguity to the CNN model. The CNN model achieves an accuracy of 96.31% in identifying fruit images in the testing dataset, using features such as similarity in color, texture, size and shape (Mureşan and Oltean, 2018). Other studies classify the maturity of grapes based on the shape and color of the fruit (Kangune, Kulkarni and Kosamkar, 2019). The classification model uses a Convolutional Neural Network and a Support Vector Machine (SVM). Color features such as RGB and HSV, together with morphology, are used as features in the classification model. The validation results show that the CNN model reaches a higher accuracy, 79.49%, than the SVM, which reaches only 69%.

Other research improved the accuracy of CNN for quality-based classification of zalacca (salak) in digital images. The accuracy of CNN classification can be increased by preprocessing to produce better images and by adding some parameters to the CNN architecture, such as the Stride and Zero Padding parameters. These two parameters determine the pixel shifts used to obtain more detailed information from the input image, and they increase the accuracy of the convolution model because the convolution filters focus on finding useful information and eliminating unnecessary information. Besides adding parameters, a CNN architecture with deeper convolution layers can improve classification accuracy, because more image features are extracted by the CNN and more information is available for salak classification (Dzulqarnain, Suprapto and Makhrus, 2019). Previous studies used CNN to identify one fruit among many fruit objects (Mureşan and Oltean, 2018). Another study measured the level of Cavendish banana maturity based on color similarity. The information is generated in the form of a percentage and a classification of Cavendish banana ripeness into "Raw" and "Ripe" (Zhang et al., 2018). Other studies have identified damaged jackfruit using a Convolutional Neural Network. This model produces an accuracy of 97.87% when applied to each picture of jackfruit in the test data (Orano, Maravillas and Aliac, 2019).

This research performs maturity identification of Cavendish bananas with 4 classes: "Ripe", "Through Ripe", "Almost-Ripe" and "Unripe". All classes are identified using CNN. For the convenience of users, the system is built as an Android mobile app. The image of a Cavendish banana sample is photographed with an Android phone camera to recognize each class and the features that vary between classes.

2. Proposed Method

2.1. Data Acquisition

In this research, the data is divided into two types, namely training data and test data. The training data was obtained by collecting and photographing Cavendish bananas at different levels of maturity, namely "Unripe", "Almost-Ripe", "Ripe" and "Through Ripe". All samples were captured in RGB with a Nikon D7200 camera in HD format at a full resolution of 4128 x 3096 pixels. The training data sample consisted of 300 images: 75 Unripe, 75 Almost-Ripe, 90 Ripe and 60 Through Ripe Cavendish bananas. The characteristics of the Cavendish banana images are listed in Table 1.

Table 1. Cavendish Banana Ripeness Level Based on Color Change

| No | Fruit State | Class | Characteristics | Benefits |
|----|-------------|-------|-----------------|----------|
| 1 | (image) | Unripe | The entire surface of the fruit is green, the fruit is still hard | Controls blood sugar levels |
| 2 | (image) | Almost Ripe | Fruit skin with a yellowish green color | Good for people on a diet |
| 3 | (image) | Ripe | All banana fingers are yellow | Contains antioxidants that are very high and good for digestion |
| 4 | (image) | Through Ripe | Yellow bananas with lots of brown spots | Helps the immune system and is able to prevent cancer |

Figure 1. Model of identification of cavendish banana ripeness

2.2. Image Processing

This section explains how the Cavendish banana dataset is created. Cavendish banana images are taken using a digital camera in different positions, with a piece of white paper as the background. Segmentation uses YOLO (Tian et al., 2019) to detect Cavendish banana objects with bounding boxes (Lamb and Chuah, 2019), that is, predicted regions fitted in the original input to enclose the bananas. The region inside each detected bounding box is then cropped and resized to 224 x 224 pixels.
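The crop-and-resize step can be illustrated with a minimal sketch in plain Python. It assumes the detector has already returned a bounding box; the function name `crop_and_resize`, the box layout `(x_min, y_min, x_max, y_max)` and the use of nearest-neighbor sampling are illustrative assumptions, not details given in the paper.

```python
def crop_and_resize(image, box, size=224):
    """Crop `image` to `box`, then resize to size x size (nearest neighbor).

    image: list of rows, each row a list of pixel values.
    box:   hypothetical (x_min, y_min, x_max, y_max), max-exclusive.
    """
    x_min, y_min, x_max, y_max = box
    height, width = len(image), len(image[0])
    # Clamp the box so an imprecise detection cannot index outside the image.
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("bounding box is empty after clamping")
    crop_w = x_max - x_min
    crop_h = y_max - y_min
    resized = []
    for row in range(size):
        # Map each output row/column back to the nearest source pixel.
        src_y = y_min + (row * crop_h) // size
        resized.append(
            [image[src_y][x_min + (col * crop_w) // size] for col in range(size)]
        )
    return resized


# Toy 6 x 8 "image" whose pixel value encodes its position (row * 10 + col).
toy_image = [[r * 10 + c for c in range(8)] for r in range(6)]
patch = crop_and_resize(toy_image, (2, 1, 6, 5))
print(len(patch), len(patch[0]))  # 224 224
```

In practice an image library with bilinear or bicubic interpolation would normally be used; nearest-neighbor sampling is chosen here only to keep the sketch dependency-free.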


2.3. Visual Geometry Group 16

Visual Geometry Group 16 (VGG16) is a Convolutional Neural Network architecture that relies on the depth of its layers to learn deeper features (Simonyan and Zisserman, 2015). The VGG16 architecture shown in Figure 2 starts with 5 blocks of convolution layers followed by 3 fully-connected layers. Each convolution layer uses a 3 x 3 kernel with stride 1 and padding 1 to ensure that each activation map keeps the same spatial dimensions as the previous layer.

ReLU activation is applied right after each convolution. Max pooling with a 2 x 2 kernel, stride 2 and no padding is applied at the end of each block, so that each spatial dimension of the activation map from the previous layer is halved. Two fully-connected layers with 4096 ReLU units each are then followed by a fully-connected softmax layer with 1000 units.
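The effect of these settings on the activation map size can be checked with the usual output-size formula, out = floor((n + 2p - k) / s) + 1. The sketch below traces a 224 x 224 input through the five blocks. The number of convolutions per block, (2, 2, 3, 3, 3), is the standard VGG16 configuration and is an assumption here, since the text above does not list it.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


def vgg16_spatial_sizes(input_size=224, convs_per_block=(2, 2, 3, 3, 3)):
    """Return the activation map size after each of the five blocks."""
    sizes = []
    n = input_size
    for num_convs in convs_per_block:
        for _ in range(num_convs):
            # 3 x 3 kernel, stride 1, padding 1: size is unchanged.
            n = conv_output_size(n, kernel=3, stride=1, padding=1)
        # 2 x 2 max pooling, stride 2, no padding: size is halved.
        n = conv_output_size(n, kernel=2, stride=2, padding=0)
        sizes.append(n)
    return sizes


print(vgg16_spatial_sizes())  # [112, 56, 28, 14, 7]
```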

The weakness of the VGG16 model is that it is expensive to evaluate and requires a lot of memory and parameters. VGG16 has around 138 million parameters, most of which (around 123 million) are in the fully-connected layers (Rezende et al., 2018).
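These figures can be reproduced by counting weights and biases layer by layer. The sketch below assumes the standard VGG16 configuration (block widths of 64, 128, 256, 512 and 512 channels, an RGB input, and a 7 x 7 x 512 feature map entering the first fully-connected layer); these values come from the original VGG16 design rather than from the text above.

```python
# (output channels, number of 3 x 3 convolutions) per block; assumed standard VGG16.
CONV_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]
FC_UNITS = [4096, 4096, 1000]


def vgg16_parameter_counts(in_channels=3, final_map=7):
    """Return (convolution parameters, fully-connected parameters)."""
    conv_params = 0
    channels = in_channels
    for out_channels, num_convs in CONV_BLOCKS:
        for _ in range(num_convs):
            # 3 x 3 weights per input/output channel pair, plus one bias per output.
            conv_params += 3 * 3 * channels * out_channels + out_channels
            channels = out_channels
    fc_params = 0
    features = final_map * final_map * channels  # flattened feature vector
    for units in FC_UNITS:
        fc_params += features * units + units
        features = units
    return conv_params, fc_params


conv_params, fc_params = vgg16_parameter_counts()
print(conv_params)              # 14714688
print(fc_params)                # 123642856
print(conv_params + fc_params)  # 138357544
```

The first fully-connected layer alone accounts for over 100 million parameters, which is why the fully-connected part dominates the total.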

Figure 2. Architecture Visual Geometry Group 16

2.4. Convolutional Neural Networks

This research uses a Convolutional Neural Network, which consists of convolution, pooling, ReLU and fully connected layers. In a typical CNN architecture, each convolution layer is followed by a Rectified Linear Unit (ReLU) layer and then a pooling layer; this pattern is repeated for one or more convolution layers and ends with one or more fully connected layers (Cires et al., 2003). The characteristic that distinguishes CNN from other neural networks is that it takes the structure of the image into account when processing it. A normal neural network converts the input into a one-dimensional array, which makes the trained classifier less sensitive to position changes (Cireşan et al., 2012).

1) Feature Extraction

In the feature extraction layer, an encoding process finds a number of features and transforms the image from its original form into a feature vector for the training or identification process. The feature extraction stage consists of convolution and pooling.

a) Convolution

The convolution stage is the main process in CNN and acts as a feature detector. This layer changes the image to a different depth and extracts features from the input image by applying a filter (kernel) that shifts across the input according to the stride value (Lamb and Chuah, 2019). The stride is the value that determines how far the filter (kernel) shifts at each step. The output dimensions of the convolution are adjusted using padding, which determines the number of pixels (containing the value 0) (Hinton et al., 2012) to be added on each side of the input.
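The roles of the kernel, stride and zero padding can be seen in a small, dependency-free sketch of a single-channel convolution. The function name `conv2d` and the toy inputs are illustrative; as in most deep learning libraries, the operation shown is strictly a cross-correlation (the kernel is not flipped).

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide a square kernel over a 2D image with the given stride and zero padding."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    # Surround the image with `padding` rows/columns of zeros.
    padded = [[0] * (width + 2 * padding) for _ in range(height + 2 * padding)]
    for r in range(height):
        for c in range(width):
            padded[r + padding][c + padding] = image[r][c]
    out_h = (height + 2 * padding - k) // stride + 1
    out_w = (width + 2 * padding - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the window whose top-left corner is (i*stride, j*stride).
            total = 0
            for a in range(k):
                for b in range(k):
                    total += padded[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


ones_image = [[1] * 4 for _ in range(4)]
ones_kernel = [[1] * 3 for _ in range(3)]

same = conv2d(ones_image, ones_kernel, stride=1, padding=1)
print(len(same), len(same[0]))  # 4 4: padding 1 keeps the 4 x 4 size
print(same[0])                  # [4, 6, 6, 4]: border windows overlap the zero padding

strided = conv2d(ones_image, ones_kernel, stride=2, padding=1)
print(strided)                  # [[4, 6], [6, 9]]: stride 2 shrinks the output to 2 x 2
```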

After the convolution process produces the feature map, activation is performed using the Rectified Linear Unit (ReLU) function (Sultana, Sufian and Dutta, 2019), which sets negative values to zero so that all values in the feature map are non-negative. The ReLU activation function is shown in (2).

$$f(x) = \max(0, x) \tag{2}$$

where x is the input to a neuron.

Input feature map:

    -1   -1    9   -1
    12   35   40   -1
    13   90  -11   48
    20   44   25  100

After ReLU activation:

     0    0    9    0
    12   35   40    0
    13   90    0   48
    20   44   25  100

Figure 3. Activation Process with Rectified Linear Unit

b) Pooling

The pooling phase reduces the size of the feature map and the number of parameters in the network, which speeds up computation, controls overfitting (Rezende et al., 2018) and produces feature patterns. This research uses max pooling, illustrated in Figure 4. Max pooling takes the largest value in each window of the output of the activation function (Wang and Chen, 2018). This research used a pooling window of 2x2 pixels, reducing each window to a single 1x1 value.
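As a small worked example, the ReLU and max pooling steps can be applied to the 4 x 4 feature map used in Figure 3. The helper names `relu` and `max_pool_2x2` are illustrative, and the pooling windows are assumed to be non-overlapping (stride 2).

```python
def relu(feature_map):
    """Apply f(x) = max(0, x) to every value of a 2D feature map."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the largest value of each non-overlapping 2 x 2 window."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        pooled.append([
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ])
    return pooled


feature_map = [
    [-1, -1, 9, -1],
    [12, 35, 40, -1],
    [13, 90, -11, 48],
    [20, 44, 25, 100],
]

activated = relu(feature_map)
print(activated)
# [[0, 0, 9, 0], [12, 35, 40, 0], [13, 90, 0, 48], [20, 44, 25, 100]]

print(max_pool_2x2(activated))
# [[35, 40], [90, 100]]
```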

Figure 4. Operation Max Pooling

2) Identification

The identification layer is composed of several layers, each made up of neurons that are fully connected to the other layers. It takes as input the output of the previous layer, namely the feature extraction layer, which is in the form of a matrix (Agarap, 2017). The first layer is the input layer, which converts this matrix into a vector to be used as neuron input; this conversion is called flatten. The activations used in the identification layer are the Rectified Linear Unit (ReLU) and softmax. Softmax is used to determine the probability assigned to each target class. The softmax function is shown in (3), where w is the weight, x is the input vector and m is the number of classes. The error, or loss, at the output layer is then calculated using cross entropy, as in the research of Lu et al. (2019), shown in (4).

$$y_k = \frac{e^{x^T w_k}}{\sum_{i=1}^{m} e^{x^T w_i}} \tag{3}$$

$$E = -\sum_{i=1}^{m} p_i \log \hat{p}_i \tag{4}$$
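Equations (3) and (4) can be sketched in a few lines of plain Python. The function names, the toy input vector and the toy weights below are illustrative assumptions; subtracting the maximum score before exponentiating is a common numerical-stability trick that does not change the result of (3).

```python
import math


def class_scores(x, weights):
    """Compute x^T w_k for each class k (one weight vector per class)."""
    return [sum(xi * wi for xi, wi in zip(x, w_k)) for w_k in weights]


def softmax(scores):
    """Equation (3): turn class scores into probabilities that sum to 1."""
    shift = max(scores)  # stability shift; cancels out in the ratio
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """Equation (4): E = -sum_i p_i * log(p_hat_i)."""
    return -sum(p * math.log(max(q, eps)) for p, q in zip(target, predicted))


# Hypothetical 2-feature input and four weight vectors, one per ripeness class.
x = [1.0, 2.0]
weights = [[0.5, 0.1], [0.2, 0.4], [0.1, 0.9], [0.3, 0.3]]

probabilities = softmax(class_scores(x, weights))
target = [0.0, 0.0, 1.0, 0.0]  # one-hot label: third class is the true class
loss = cross_entropy(target, probabilities)

print([round(p, 3) for p in probabilities])
print(round(loss, 3))
```

With a one-hot target, the loss reduces to the negative log of the probability assigned to the true class, so a confident correct prediction gives a loss near zero.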

3. Result And Discussion

The Cavendish banana image data of 300 images consists of four classes, each of which contains 75 images. The data is grouped according to maturity class and is used for training and testing the CNN model. It is divided into two parts, namely 250 training images and 50 test images. The system is built with the TensorFlow and Keras libraries in the Python 3 programming language. The parameters used are a learning rate of 0.01 and 100 epochs. Optimizer testing is performed on the Anaconda 3 platform, configured to use the GPU. The optimizers used in this test are SGD and Adam. SGD uses the same learning rate of 0.01 with a momentum of 0.9. Both use cross entropy as the loss function and 100 epochs. The results of the optimizer test are shown in Table 2.
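To illustrate what the learning rate of 0.01 and the momentum of 0.9 do, the sketch below applies the classical SGD-with-momentum update to a one-dimensional toy loss. It is not the training code of this study: the function `sgd_momentum` and the quadratic loss (w - 3)^2 are hypothetical and chosen only because the minimum is known.

```python
def sgd_momentum(grad_fn, w0, learning_rate=0.01, momentum=0.9, steps=100):
    """Minimize a function of one variable with SGD plus classical momentum."""
    w = w0
    velocity = 0.0
    for _ in range(steps):
        g = grad_fn(w)
        # The velocity accumulates past gradients, scaled down by the momentum.
        velocity = momentum * velocity - learning_rate * g
        w = w + velocity
    return w


# Toy loss (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_final = sgd_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(abs(w_final - 3.0) < 0.05)  # True
```

With momentum, the iterate oscillates around the minimum while it settles, which is one reason loss curves of momentum-based training can look unstable from epoch to epoch.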

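Table 2. Result Of Parameter Optimization Test

| No | Optimization | Training Loss | Training Accuracy | Testing Loss | Testing Accuracy |
|----|--------------|---------------|-------------------|--------------|------------------|
| 1 | SGD | 0.2967 | 94.12% | 0.3254 | 71.95% |
| 2 | Adam | 0.3412 | 93.25% | 0.3756 | 69.34% |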

Based on Table 2, the model trained with SGD has better accuracy and loss on the training data, namely an accuracy of 94.12% and a loss of 0.2967, compared to Adam, which has an accuracy of 93.25% and a loss of 0.3412. Even so, both the SGD and Adam optimizers produce unstable curves. Accuracy and loss graphs for the training data and test data using SGD and Adam can be seen in Figure 5 and Figure 6.

Figure 5. Accuracy and Loss Values for Training Data


Figure 6. Accuracy and Loss Values for Test Data

The graphs in Figure 5 and Figure 6 show that the SGD model tends to perform well in the early epochs, but its loss increases as training reaches epoch 100, while the loss of the Adam model is unstable. Overall, the SGD model reduces the loss more effectively than the Adam model. The graph also shows an increase in the loss at the 55th epoch with the Adam optimizer. In tests on 20% of the total dataset, some results do not match the original label. The Ripe class has the fewest wrong results and the Unripe class has the most, with its misclassified samples assigned to other classes, as shown in Table 3.

Table 3. Accuracy Of Every Class Of Cavendish Banana Ripeness

| No | Class | Accuracy |
|----|-------|----------|
| 1 | Unripe | 84.00% |
| 2 | Almost Ripe | 90.00% |
| 3 | Ripe | 94.00% |
| 4 | Through Ripe | 87.00% |

The per-class accuracy on the test data ranges from 84% to 94%. The best accuracy is obtained for the Ripe class, with a recognition result of 94%, while the lowest is for the Unripe class, at 84%.

4. Conclusion

This research presents a model for identifying Cavendish banana ripeness using Convolutional Neural Networks with the VGG16 architecture to identify 4 classes of banana maturity: Unripe, Almost Ripe, Ripe and Through Ripe. In this study, the SGD optimizer with a learning rate of 0.01 and a maximum of 100 epochs produces the better result, with an accuracy of 71.95% on the test data. Image segmentation using YOLO's pretrained model helps to detect banana objects so that the banana images can be cropped.

References

Agarap, A. F. (2017) ‘An Architecture Combining Convolutional Neural Network (CNN) and Support Vector Machine (SVM) for Image Classification’, pp. 5–8.
Chaikaew, A. et al. (2019) ‘Convolutional neural network for pineapple ripeness classification machine’, Proceedings of the 16th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, ECTI-CON 2019. IEEE, pp. 373–376. doi: 10.1109/ECTI-CON47248.2019.8955408.
Cires, D. C. et al. (2003) ‘Flexible, High Performance Convolutional Neural Networks for Image Classification’, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp. 1237–1242. doi: 10.5591/978-1-57735-516-8/IJCAI11-210.
Cireşan, D. C. et al. (2012) ‘Deep neural networks segment neuronal membranes in electron microscopy images’, Advances in Neural Information Processing Systems, 4, pp. 2843–2851.
Dzulqarnain, M. F., Suprapto, S. and Makhrus, F. (2019) ‘Improvement of Convolutional Neural Network Accuracy on Salak Classification Based Quality on Digital Image’, IJCCS (Indonesian Journal of Computing and Cybernetics Systems), 13(2), p. 189. doi: 10.22146/ijccs.42036.
Hinton, G. E. et al. (2012) ‘Improving neural networks by preventing co-adaptation of feature detectors’, pp. 1–18.
Kangune, K., Kulkarni, V. and Kosamkar, P. (2019) ‘Grapes Ripeness Estimation using Convolutional Neural network and Support Vector Machine’, 2019 Global Conference for Advancement in Technology, GCAT 2019, pp. 1–5. doi: 10.1109/GCAT47503.2019.8978341.
Lamb, N. and Chuah, M. C. (2019) ‘A Strawberry Detection System Using Convolutional Neural Networks’, Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018. IEEE, pp. 2515–2520. doi: 10.1109/BigData.2018.8622466.
Lu, S. et al. (2019) ‘Fruit Classification Based on Six Layer Convolutional Neural Network’, International Conference on Digital Signal Processing, DSP. IEEE, 2018-Novem, pp. 1–5. doi: 10.1109/ICDSP.2018.8631562.
Maimunah, Handayanto, R. T. and Herlawati (2019) ‘Nondestructive Banana Ripeness Classification using Neural Network’, Proceedings of 2019 4th International Conference on Informatics and Computing, ICIC 2019. doi: 10.1109/ICIC47613.2019.8985980.
Mazen, F. M. A. and Nashat, A. A. (2019) ‘Ripeness Classification of Bananas Using an Artificial Neural Network’, Arabian Journal for Science and Engineering. Springer Berlin Heidelberg, 44(8), pp. 6901–6910. doi: 10.1007/s13369-018-03695-5.
Mureşan, H. and Oltean, M. (2018) ‘Fruit recognition from images using deep learning’, Acta Universitatis Sapientiae, Informatica, 10(1), pp. 26–42. doi: 10.2478/ausi-2018-0002.
Nasiri, A., Taheri-Garavand, A. and Zhang, Y. D. (2019) ‘Image-based deep learning automated sorting of date fruit’, Postharvest Biology and Technology. Elsevier, 153(January), pp. 133–141. doi: 10.1016/j.postharvbio.2019.04.003.
Orano, J. F. V., Maravillas, E. A. and Aliac, C. J. G. (2019) ‘Jackfruit Fruit Damage Classification using Convolutional Neural Network’, 2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management, HNICEM 2019. doi: 10.1109/HNICEM48295.2019.9073341.
Rezende, E. et al. (2018) ‘Malicious Software Classification Using VGG16 Deep Neural Network’s Bottleneck Features’, Advances in Intelligent Systems and Computing, 738, pp. 51–59. doi: 10.1007/978-3-319-77028-4_9.
Schmidhuber, J. (2015) ‘Deep Learning in neural networks: An overview’, Neural Networks. Elsevier Ltd, pp. 85–117. doi: 10.1016/j.neunet.2014.09.003.
Simonyan, K. and Zisserman, A. (2015) ‘Very deep convolutional networks for large-scale image recognition’, 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, pp. 1–14.
Sultana, F., Sufian, A. and Dutta, P. (2019) ‘Advancements in image classification using convolutional neural network’, in Proceedings - 2018 4th IEEE International Conference on Research in Computational Intelligence and Communication Networks, ICRCICN 2018. Institute of Electrical and Electronics Engineers Inc., pp. 122–129. doi: 10.1109/ICRCICN.2018.8718718.
Tian, Y. et al. (2019) ‘Apple detection during different growth stages in orchards using the improved YOLO-V3 model’, Computers and Electronics in Agriculture, 157(October 2018), pp. 417–426. doi: 10.1016/j.compag.2019.01.012.
Wang, S. H. and Chen, Y. (2018) ‘Fruit category classification via an eight-layer convolutional neural network with parametric rectified linear unit and dropout technique’, Multimedia Tools and Applications. doi: 10.1007/s11042-018-6661-6.
Zhang, L. and Liu, L. (2019) ‘Apple tree fruit detection and location method based on computer vision’, Revista de la Facultad de Agronomia, 36(6), pp. 1818–1828.
Zhang, Y. et al. (2018) ‘Deep indicator for fine-grained classification of banana’s ripening stages’, EURASIP Journal on Image and Video Processing, 2018(1). doi: 10.1186/s13640-018-0284-8.

Biographies


Yusuf Andas Ramadhan is an Informatics college student. He studies Informatics with a focus on Artificial Intelligence and Games at Universitas Jenderal Achmad Yani.

Esmeralda Contessa Djamal received a Bachelor's degree in Engineering Physics from Institut Teknologi Bandung in 1994 and a Master's degree in Instrument and Control from Institut Teknologi Bandung in 1998. She finished a Ph.D. degree in Engineering Physics at Institut Teknologi Bandung in 2005. Her doctoral thesis focused on signal processing and pattern recognition. Since 1998, she has published about 40 papers in international journals and proceedings, primarily in pattern recognition, machine learning, and signal processing. Currently, she is a lecturer in the Informatics Department, Universitas Jenderal Achmad Yani.

Fatan Kasyidi has been a lecturer in Informatics in the Faculty of Science and Informatics at the University of Jenderal Achmad Yani, Indonesia, since 2018. He studied Informatics with a focus on Intelligent Systems at Institut Teknologi Bandung, for which he was awarded a Master's degree in 2017. His bachelor's degree in Informatics was obtained from the University of Jenderal Achmad Yani. He is active in research on Artificial Intelligence, Speech Recognition and Affective Computing.

Abdul Talib Bon has been a professor of Production and Operations Management in the Faculty of Technology Management and Business at the Universiti Tun Hussein Onn Malaysia since 1999. He has a PhD in Computer Science, which he obtained from the Universite de La Rochelle, France, in 2008. His doctoral thesis was on the topic of Process Quality Improvement on Beltline Moulding Manufacturing. He studied Business Administration at the Universiti Kebangsaan Malaysia, for which he was awarded an MBA in 1998. His bachelor's degree and diploma in Mechanical Engineering were obtained from the Universiti Teknologi Malaysia. He received his postgraduate certificate in Mechatronics and Robotics from Carlisle, United Kingdom, in 1997. He has published more than 150 papers in international proceedings and international journals, as well as 8 books. He is a member of MSORSM, IIF, IEOM, IIE, INFORMS, TAM and MIM.
