
HOW IMAGE DOWNSCALING AND JPEG COMPRESSION AFFECT IMAGE CLASSIFICATION PERFORMANCE
An experimental study

Alexander Hjaltén

Bachelor Thesis, 15 hp/credits
Bachelor of Science Programme in Computing Science
2019

Abstract

The quality of an image plays a role in how well it can be correctly classified by an image classifying neural network. Artifacts such as blur and noise reduce classifiability. At the same time it is often motivated to reduce the file sizes of images, which also tends to introduce artifacts and reduce their quality. This leads to a trade-off between having small file sizes and achieving high classification accuracy. The two main approaches to reducing the file sizes of images are to reduce the number of pixels in them via image scaling, or to use less data to represent each pixel via compression. The effects of these two approaches on image classification accuracy have been studied independently. In this study the effect of combining image scaling and compression on image classifiability is examined for the first time. Images are downscaled using five popular methods before being compressed with different magnitudes of JPEG compression. Results are evaluated based on the fraction of the treated images that are correctly classified by the classifier, as well as on the image file sizes. The results show that the scaling method used has a significant but weak effect on image classifiability. Thus the choice of scaling method does not seem to be critical in this context. There are however trends suggesting that the Lanczos scaling method created the most classifiable images and that the Gaussian method created the images with the highest classifiability to file size ratio. Both scaling magnitude and compression magnitude were found to be better predictors of image classifiability than scaling method.

Acknowledgements

I would like to thank my supervisor Marie Nordström for her support and constructive feedback throughout this project. Thanks also go to Joakim Hjaltén for his advice regarding statistics and for his feedback on the final draft. Lastly, thanks are also due to Niclas Börlin for his tips regarding image manipulation software.

Contents

1 Introduction
1.1 Purpose and Research Questions
1.2 Background
1.3 Delimitations
2 Related Work
3 Method
3.1 Data set
3.2 Data treatments
3.3 Classifier
3.4 Classifier program
3.5 Data Collection
3.6 Data Analysis
4 Results and Analysis
4.1 Average accuracy
4.2 Accuracy for each treatment
4.3 Average accuracy to file size ratios
4.4 Accuracy to file size ratios for all treatments
4.5 Statistical analysis
5 Discussion
5.1 Interpretation of Results
5.2 Conclusion and Recommendations
5.3 Future Work
A Verification of ANOVA residuals

1 Introduction

Image classification has emerged as a hot topic in recent years and has been applied to solve a wide range of problems, for example to recognize faces [1], to detect weapons during airport baggage security screening [2], and to allow autonomous vehicles to detect traffic lights [3]. The most successful computational model for classifying images has proven to be the convolutional neural network (CNN), which is a type of artificial neural network (ANN) [4][5]. By training CNNs on large data sets of images they can be taught to recognise certain objects in images. The image as a whole is then placed into one of several predefined classes based on which object the classifier thinks it represents. The quality of the images that are classified affects the classifier's performance. Images with lots of distortions such as blur, noise or compression artifacts become harder to classify correctly [6][7][8]. But at the same time it can often be motivated to reduce the sizes of images (and files in general), which often causes quality loss and distortion artifacts [9][10]. In situations where it is important both to keep image file sizes small and their classifiability high, one must balance this trade-off between file size and quality. This involves finding a way of reducing the file sizes of the images with as little loss in classification accuracy as possible [11]. There are two main ways of reducing file sizes. One might reduce the resolution (number of pixels) of the image by scaling it down. Fewer pixels to represent means less data to store. There are many different methods of scaling images, each with its own strengths and weaknesses [12][13][14]. Another approach is to keep the same number of pixels but to reduce the number of bits that are used to represent each pixel, often done via a process called compression. There are two main types of compression, lossless and lossy. In lossless compression no information is lost and the image can be restored completely to how it was before compression. In lossy compression data is irreversibly lost and the image cannot be restored to its original state after it has been compressed. Lossy compression techniques can however reduce file sizes more than lossless ones, making them a better choice if one wants to reduce file sizes to a large extent [9][15]. The JPEG standard is probably the most widely used lossy compression standard for photographic images [16]. A third way of reducing the file sizes of images can be devised by combining compression and scaling, both reducing the number of pixels and the data used to represent each pixel. This would allow for even smaller files. The effects of compression artifacts and of scaling artifacts on classification accuracy have each been studied on their own [6][17]. So have the effects of certain general quality distortions, some of which might be caused by either image compression or scaling [18][8][19]. The effect of combining scaling and compression on image classification accuracy however remains unstudied.

1.1 Purpose and Research Questions

The purpose of this project is to examine the correlation between image quality distortions caused by image scaling (change in resolution) and lossy JPEG compression, and the accuracy of a CNN image classifier. The aim is to shed more light on how different scaling methods, both on their own in different magnitudes and coupled with JPEG compression in different magnitudes, affect the ability of a classifier to correctly classify the treated images. Treatments are evaluated based on classification accuracy (the number of correctly classified images) as well as on a classification accuracy to image file size ratio.

The following questions will be addressed:

1. Which scaling method achieves the highest classification accuracy on average?

2. Is the method that achieves the highest average accuracy superior at all scaling and compression magnitudes, or do different scaling methods work best at different scaling and/or compression magnitudes?

3. Which scaling method achieves the highest classification accuracy to image file size ratio on average?

4. Which combination of scaling method, scaling magnitude and compression magnitude achieves the highest classification accuracy to image file size ratio?

5. Which of the three factors (scaling method, scaling magnitude and compression magnitude) is the best predictor of classification accuracy?

1.2 Background

This section contains more in-depth information about the different concepts relevant to the project, to aid understanding and place the project in a bigger context. Readers already very familiar with these concepts may skip ahead to section 1.3.

Image Classification

Image classification is defined as the task of categorizing images into one of several predefined classes and is one of the main problems in computer vision [20]. This task may seem simple for a human but it can be a real challenge to train a computer to do it. The traditional way of teaching a computer to do this is a two-step process. First, hand-crafted features are extracted from images using a feature descriptor. Then these extracted features are sent as input to the classifier, not the images themselves. This approach is problematic because the accuracy of the classifier becomes very dependent on the feature extraction step, which can be a very difficult task [21]. This type of feature extractor is also often specialized to a certain task. The classifiers used on the extracted features can however be quite general and trainable. Due to the disadvantages of the traditional feature crafting approach, new avenues were explored experimenting with models that use multiple layers of nonlinear information processing to integrate the tasks of feature extraction and classification [20]. In this way the classifier is able to learn useful features by itself and there is no longer any need for hand-crafted ones. In time the most successful model for image classification tasks has proven to be the convolutional neural network (CNN) [4]. Early CNNs were being developed as long ago as the late seventies. In his paper from 1980, Kunihiko Fukushima presented an early form of CNN which he called the "Neocognitron" [22]. In 1995 CNNs were applied to detect lung nodules in medical images [23]. Despite these early uses of CNNs it was not until 2012, when the CNN called AlexNet won the ImageNet classification contest by a wide margin, that the true potential of using CNNs for image classification became apparent [4][24][5]. This success can be attributed to the availability of powerful GPUs, large data sets, data augmentation, better algorithms, the ReLU activation function and new methods such as max pooling and dropout [20][5]. Advances in hardware performance and software parallelization have also enabled larger networks to be trained within a reasonable time frame.

This has led to CNNs which even surpass human performance on some image recognition tasks [25][26].

Artificial Neural Networks

An artificial neural network (ANN) is a computational system inspired by the functioning of the human brain [20][27]. An ANN consists of an (often large) number of interconnected computational nodes called neurons. These are organized into layers, and there are three types of layers in a standard ANN: input layers, hidden layers and output layers. The information flows in one direction starting at the input layer. The input is then distributed into the first hidden layer and each neuron therein makes a decision about what information to pass forward to the next layer, based on the knowledge acquired during the network's training [27]. This knowledge is encoded in the form of weights that can be seen as existing on the connections between the neurons [28]. Any data value (signal) sent over such a connection will be multiplied by the weight. The signal received by a neuron must exceed a certain threshold for the neuron to activate and send a data signal of its own through its outgoing connections. This threshold is defined by a so-called activation function [29]. If there are several hidden layers the network is defined as a deep neural network. A deeper architecture generally results in more expressive power [30]. The computation finishes when the final output ends up at the output layer. The contents of the output layer then represent the result of the computation [27].
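To make the weight and activation mechanism concrete, the following minimal sketch (an illustration for this text, not code from the project; all numbers are made up) computes the output of a single neuron in Python:

    import math

    def neuron_output(inputs, weights, bias):
        # Each incoming signal is multiplied by the weight on its
        # connection; the weighted sum (plus a bias) is then fed through
        # the activation function, here a sigmoid, which determines how
        # strongly the neuron fires.
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        return 1.0 / (1.0 + math.exp(-z))

    # Three input signals entering one hidden neuron.
    print(neuron_output([0.5, 0.1, 0.9], [0.4, -0.6, 0.8], bias=-0.2))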

Convolutional Neural Networks

The convolutional neural network is an artificial neural network architecture and thus conforms to the basic structure of the standard ANN with neurons organized into layers [5]. CNNs are designed to process data in the form of multiple arrays, which makes them suitable for processing images as these can be represented as three two-dimensional arrays containing pixel color intensities [29]. CNNs can be used for a variety of tasks but will be described here from the perspective of image classification. In CNNs there are three types of hidden layers called convolutional layers, pooling layers and fully connected layers [5]. When classifying an image, information flows through these layers in one direction from the input layer to the output layer, the same way it does in a traditional ANN [20]. The input layer holds the pixel values of the image that is to be classified and passes these values into the first convolutional layer of the network [27][30]. If the image is of a different size than what the network is built to handle it will be resized or cropped before being used [30]. The convolutional layers (together with the pooling layers) act as feature extractors, extracting distinct features that can later be used to classify the image [20]. This is done by applying filters (or kernels) that have been learned during training to the layer's input. The filtering result is then passed to a non-linear activation function to produce a feature map, which can be seen as a two-dimensional matrix of scalar values [27][30]. This map represents the features that the kernel has detected. Usually a convolutional layer uses several kernels to produce several feature maps [29]. The filtering operation is mathematically a convolution, hence the names convolutional layer and convolutional neural network [5]. The neurons in a convolutional layer, like the hidden neurons of an ANN, are connected to neurons in the previous and next layers by trainable weights.

But in contrast to traditional ANNs, in which the neurons are often fully connected, the neurons in convolutional layers are only connected to a small region of the previous layer called the neuron's receptive field [29][20][27]. This reduces the number of weights in the layer, since each connection is associated with a weight, which in turn reduces the computational complexity of the network. This allows CNNs to be used to process image data (which contains a lot of information) at reasonable time complexities, something that traditional fully connected ANNs struggle with [27]. The feature maps derived in the convolutional layer are then passed to a pooling layer (also called a subsampling layer). The purpose of the pooling layers is to gradually reduce the dimensionality of the image representation, so that the computational complexity of the model is reduced, and to merge semantically similar features [27][5][29]. This is done by continuously sliding a kernel of certain dimensions over each part of the input feature map and reducing the values within the kernel into one single value in the output (pooled) feature map. Several values are summed up into one. The most popular reduction method is to take the maximum value in the kernel area, a method called max pooling. Using a max pooling kernel of size 2x2 and moving it over the feature map with a stride of 2 creates an output map that is only 25% of the input map size. Pooling too hard (using a big kernel) will however result in poor classification performance since too much data is lost [27]. Multiple pairs of convolutional and pooling layers are often stacked one after the other in stages to extract more and more detailed features by combining simpler ones [20][5]. After the pooling layers come the fully connected layers, in which every neuron in the layer is connected to every neuron in the previous and next layers, as neurons are connected in a traditional ANN. It is the fully connected layers that interpret the extracted features to carry out the actual classification. The last of the fully connected layers is the output layer, where the result of the classification ends up as class scores for each predefined class [20][27]. The score of each class represents the classifier's certainty that the image belongs to that class (higher value, higher certainty).
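As an illustration of the pooling step, the sketch below (an assumed NumPy implementation written for this text, not taken from the project) performs 2x2 max pooling with stride 2, reducing the feature map to 25% of its original size as described above:

    import numpy as np

    def max_pool_2x2(feature_map):
        # Group the map into 2x2 blocks and keep only the maximum of
        # each block, so the output holds a quarter of the input values.
        h, w = feature_map.shape
        blocks = feature_map[:h - h % 2, :w - w % 2] \
            .reshape(h // 2, 2, w // 2, 2)
        return blocks.max(axis=(1, 3))

    fm = np.array([[1, 3, 2, 4],
                   [5, 6, 1, 0],
                   [7, 2, 9, 8],
                   [3, 4, 6, 5]])
    print(max_pool_2x2(fm))  # [[6 4]
                             #  [7 9]]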

TensorFlow

TensorFlow is a deep learning software library developed by the Google Brain Team. After its creation TensorFlow was used extensively among Google teams before being released as an open-source project in 2015. It offers programming support for deep neural networks and machine learning methods in general [29]. TensorFlow is inspired by the high-level programming models of dataflow systems and the low-level efficiency of parameter servers. It therefore represents both the computation of an algorithm and the state on which it operates as a unified dataflow graph. This makes for a simple abstraction and enables the user to deploy programs to a multitude of different platforms [31]. In TensorFlow the entire neural network is represented as a single dataflow graph. This includes both the computation and the state, containing mathematical operations, parameters and their update rules, and input processing. This graph supports concurrent execution of overlapping subgraphs, enabling efficient parallelization. Each vertex (node) in the graph represents a unit of local computation and each edge represents the output from or input to a vertex [31][29]. The computations at vertices are referred to as operations and the input and output values are called tensors. A tensor is an n-dimensional array of elements of some primitive type.

An operation takes zero or more tensors as input and produces zero or more tensors as output. The state of an operation can be mutable and is read or written each time the operation is carried out. Examples of TensorFlow operations are AddN, which adds multiple tensors, Const, which is a constant value (determined at compile time), Variable, which has a mutable value stored in a buffer, and Read, which is used to read the values of variables. Data storing types such as queues are also seen as operations [31]. The core API of the TensorFlow runtime is written in C but is wrapped in different high-level scripting interfaces that separate the core code from the user code and hide much of the underlying complexity. The two original interfaces use the programming languages C++ and Python respectively [31][29].
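A minimal sketch of these concepts, using the TensorFlow 1.x Python interface (the version family used later in this project; the values are arbitrary):

    import tensorflow as tf  # TensorFlow 1.x API

    # Build the dataflow graph: vertices are operations, edges carry tensors.
    a = tf.constant(2.0)         # Const operation
    b = tf.constant(3.0)         # Const operation
    v = tf.Variable(1.0)         # mutable state stored in a buffer
    total = tf.add_n([a, b, v])  # AddN operation adding multiple tensors

    # Nothing is computed until the graph is executed in a session.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(total))   # 6.0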

Benefits of small files

Storing large amounts of data can be expensive and the sizes of the individual files contribute to the size of the total data collection in a multiplicative way. High resolution images have been pointed out as especially problematic due to their relatively large file sizes [32]. Furthermore, transferring large files over a network costs more energy and takes longer than transferring small files. If the data is stored in a cloud there is also the cost of maintaining and keeping it in sync. In many cases there is therefore a strong motivation to reduce file sizes [9][10]. Ahn et al. present one such case in a paper from 2018. The study considers the transfer of image files over a wireless sensor network (WSN) consisting of battery powered, camera equipped node devices. The images taken by the nodes and then transferred over the network were later to be used as input to a CNN image classifier. In this situation the sizes of the images that are sent over the network influence the power consumption of the node devices (larger files are more costly to send) and thus the lifetime of the network. This problem was addressed by making the images smaller before sending them, a feat achieved using compression [11].

Compression

The main approach for making images (or files in general) smaller is via compression. Compression generally keeps the resolution of an image, generating a compressed image with the same dimensions and number of pixels. The reduced file size is instead due to using fewer bits to represent each pixel. As mentioned previously, there are two basic types of image compression, lossless and lossy. Losing data might not sound that appealing, but this allows lossy compression algorithms to achieve much higher compression ratios than what is possible using lossless compression [9][15]. Lossless compression is also an excessive demand in many applications, especially since lossy compression often starts by removing the information that is of least importance to the overall visual impression of the image [15]. Because of this, the data loss might not even be noticed by a human perceiver at reasonable compression ratios. In some cases images are actually perceived to be of higher quality after being compressed at low ratios [9]. However, the perceived quality of an image from the point of view of a human observer is not always correlated with how well an image classifier can recognize it. Some subtle image distortions that are hardly recognizable by humans can have disastrous effects on classification performance. Images that are intentionally modified in this way to trick a classifier are often called adversarial samples and the phenomenon has been studied at some length [33].

In other cases an image can be distorted beyond human recognition while still being classifiable by a computer [34]. One of the most successful methods for lossy image compression is the JPEG standard. The name stands for Joint Photographic Experts Group, which is the committee that developed the standard. JPEG was established as an ISO standard (IS-10918) in 1990 and is one of the most widely used image standards [16]. It was designed for colour images and can use up to 24 bits to represent one pixel, resulting in millions of possible colours [9]. JPEG uses a compression method that is based on the discrete cosine transform (DCT) [9][32]. At high compression ratios JPEG images are known to appear blocky [9]. JPEG is good at keeping the visual quality of natural images but works poorly on non-photographic images such as line art. JPEG is not a file format, simply a method of compressing data. The actual format (wrapping) is left unspecified by the standard, which only specifies the codec. What is referred to as a JPEG file is most commonly of the JPEG File Interchange Format (JFIF) [16].
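For reference, the 8x8 forward DCT used in baseline JPEG (the standard formula, added here for orientation) transforms the pixel values f(x,y) of a block into coefficients

    F(u,v) = (1/4) C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)
             \cos((2x+1)u\pi/16) \cos((2y+1)v\pi/16),

where C(k) = 1/\sqrt{2} for k = 0 and C(k) = 1 otherwise. It is the subsequent quantization of these coefficients, not the transform itself, that irreversibly discards information.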

Scaling

Compression is often the go-to method for reducing file sizes, and as noted it generally keeps the resolution of the image. But an image can also be made smaller by reducing its resolution. In doing so one aims to represent the same image using fewer pixels, and there are different methods of creating this representation. The process of modifying the geometry of a discrete image is called image resampling [35]. It is used for all geometric transformations of digital images except for shifts over integer distances and 90° rotations [19][13]. The aim of resampling is to produce a new image that looks as if it was created by applying some transformation to the original, such as rescaling or rotation. So one might say that image resampling is about determining the intensity value of each pixel in the new image, depending on the original image and the applied transformation [35]. Image resampling often makes use of a process called interpolation. Interpolation is the mathematical term for the process of generating or predicting the value of a new data point based on a discrete set of existing data points [36]. It is the informed estimation of a value [37]. In image processing this translates to determining the value of a pixel coordinate based on the values of one or more existing pixel coordinates. Or rather, to estimate what value a certain pixel should have depending on the existing pixels in its vicinity. Interpolation is used for image transformations such as scaling and rotation, where there is a need to calculate the values of pixels not present in the source image [36][37]. The simplest form of interpolation is nearest neighbour interpolation [13][19]. When using nearest neighbour, a pixel in the new image assumes the value of the pixel in the source image that is nearest [13]. So it only considers a single point to determine the value of the pixel. Nearest neighbour interpolation is known to cause strong aliasing and blurring effects [19]. The method is also unsuitable when scaling up images, since all it does in this situation is duplicate pixels [13]. The advantage of nearest neighbour is that it is the most efficient interpolation method [37]. The linear interpolation method considers the two nearest points on either side of a pixel when determining its value. This method can be extended to consider the four nearest points of the pixel that is to be interpolated, two on the horizontal axis and two on the vertical axis [13][36].

This type of interpolation is then said to be bilinear. Bilinear interpolation has been found to perform poorly when upscaling images in regards to numerical errors between the transformed image and the original. On the other hand it has been found to outperform more complex interpolation methods when downscaling images [36]. The complexity of linear interpolation is only slightly higher than that of nearest neighbour [37]. A Gaussian function can also be used when scaling images. This is done via a so-called Gaussian pyramid. The first level of the pyramid contains the original image. A second image of lower resolution is created for the second layer by applying a Gaussian filter to the original [38]. Each pixel in the second layer image assumes a weighted average of nearby pixels in a certain area, or window (centered on the pixel), of the first layer image. The larger this area, the larger the computational cost [39]. This process can then be repeated many times to create smaller and smaller images. The Gaussian filter is known not to cause any ringing artifacts and to reduce noise. However it also reduces detail, and Gaussian image scaling methods tend to create blurry images [38][35]. Transforming images with Gaussian filters has been shown to have high time complexity compared to other techniques [19]. A method for transforming images called the Lanczos filter was introduced in a paper from 1979. It was designed to reduce the amplitude of a phenomenon called the Gibbs oscillation [40]. The Lanczos filter approximates the sinc filter, which is often seen as the theoretically ideal filter for image scaling [14][12]. The sinc function has an infinite impulse response (is spatially infinite), which makes it problematic to use directly, thus motivating approximation of it instead [19]. The Lanczos interpolation method has been shown to create images with higher peak signal-to-noise ratio (PSNR) than other comparable techniques (for example bilinear and Mitchell), but at the same time to have a higher computational complexity [41]. PSNR is a metric used for evaluating image transformation methods in a non-visual, mathematical way. Using Lanczos can cause ringing artifacts in the images [14]. The Mitchell-Netravali filters are a family of two-parameter cubic filters named after the authors that presented the filter design back in 1988. Mitchell and Netravali identified the three artifacts that were most prevalent within the popular two-parameter cubic filter family (ringing, anisotropy and blurring). They realized that removing all of these is basically impossible, as there are trade-offs between them. Reducing one artifact increases another. Therefore they instead set out to find the parameters that strike the best balance between these artifacts and keep them all at acceptable levels. This resulted in the recommendation of a filter with the parameters B=1/3 and C=1/3 [42]. So this is the specific filter that is often referred to as the Mitchell-Netravali filter, or simply the Mitchell filter. This filter has been shown to generate images with a good balance between smoothness and sharpness [43]. The Mitchell-Netravali algorithm has a high time complexity, although not as high as the Lanczos algorithm. It also achieves slightly lower PSNR than Lanczos [41].
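To make the last two filters concrete, here are minimal Python sketches of the standard Lanczos and Mitchell-Netravali kernels (textbook formulas written for this text, not code from the project); a resampler computes each new pixel as a weighted average of nearby source pixels, with the weights given by the kernel:

    import math

    def sinc(x):
        return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

    def lanczos(x, a=3):
        # The ideal sinc filter windowed by one wide sinc lobe, which
        # truncates its infinite impulse response to |x| < a.
        return sinc(x) * sinc(x / a) if abs(x) < a else 0.0

    def mitchell_netravali(x, B=1.0 / 3, C=1.0 / 3):
        # Two-parameter cubic family of Mitchell and Netravali;
        # B = C = 1/3 is their recommended balance between ringing,
        # anisotropy and blur.
        x = abs(x)
        if x < 1:
            return ((12 - 9 * B - 6 * C) * x ** 3
                    + (-18 + 12 * B + 6 * C) * x ** 2
                    + (6 - 2 * B)) / 6.0
        if x < 2:
            return ((-B - 6 * C) * x ** 3
                    + (6 * B + 30 * C) * x ** 2
                    + (-12 * B - 48 * C) * x
                    + (8 * B + 24 * C)) / 6.0
        return 0.0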

1.3 Delimitations

All studies have limited scope and this project is no exception, so limitations naturally had to be introduced in order to accommodate this scope. For example, the choice was made to only use one image classifier, MobileNet V2 (see section 3.3). Choosing this network was a well motivated choice, but by using several classifiers the results of the project could have been generalized to a larger extent. Unfortunately each additional classifier would double the number of experimental runs and there was simply not enough time allocated for the project to allow this.

There are many scaling methods to choose from, but the choice was made to study five of these in the project. These five were chosen to represent different kinds of methods, as motivated in section 3.2. For each new method there would be 14 additional treatments. The number of scaling magnitudes was limited to two. Using more scaling magnitudes would lead to a higher resolution on that factor, but adding just one more scaling magnitude would result in 35 additional treatments. The test data was limited to the ImageNet validation set (see section 3.1). Using only one data set could be considered a limitation due to the homogeneous nature and spatial resolution of the ImageNet images. All images are likely collected and selected in a similar manner, which could introduce systematic bias into the data set. Therefore the images can not be seen as general representations of the objects they portray. However, this is true for any data set. As discussed in section 3.1, syncing the classifier's labels to those of a data set it is not designed for (any data set but ImageNet) would be problematic and much too time consuming. In the case of the homogeneous resolutions it could even be considered an advantage, since downscaling was done to fixed resolutions, and having all base images the same size means that scaling is applied in the same magnitude to all images. Only half of the data set was used, to cut experimental runs in half. Using all images would have made the experiments much too large. This project can also be considered limited by the hardware used for experimental runs and data treatment. Using more powerful hardware could reduce computational times and thus lift some of the other limitations discussed above. However, there is also the work intensive task of managing and analysing results from the experiments, and this must also be fitted into the project's scope. Therefore the possible benefits of faster hardware are probably limited.

2 Related Work

Dodge and Karam 2016 conducted a study where they examined how different types of image quality distortions affect the image classification performance of deep neural networks [6]. The considered distortions were JPEG compression, JPEG 2000 compression, blur, noise and contrast. The study evaluates the effects of these distortions on the Caffe Reference (AlexNet implementation), VGG-CNN-S, VGG-16 and GoogleNet deep neural networks. The ImageNet ILSVRC 2012 validation data set was used as a base to create the distorted data. The results showed that all networks were sensitive to blur and noise but were less affected by compression and contrast distortions. The quality parameter of the JPEG compression had to be set below 10 before the precision started to decrease significantly. This shows that it is probably not that meaningful to start with high compression quality values, since the detrimental effects only start occurring in the lower end of the quality range. Klemen et al. made a similar study investigating the effects of image distortions such as blur, noise, brightness, contrast and JPEG compression artifacts on facial recognition performance. To evaluate this they used pre-trained instances of the AlexNet, VGG-Face, GoogLeNet and SqueezeNet CNN models. Images were compressed using JPEG compression at 10 different quality presets between 1 and 40 before being classified. Similarly to Dodge and Karam's study, the results showed that all models were quite resilient to compression artifacts all the way down to quality values around 10, further strengthening the view that it is the lower end of the quality range that is of most interest when studying the effects of JPEG compression on classification [7].

Yet another similar study, by Yang et al. 2018, studied the effects of both low resolution and JPEG compression artifacts on CNN based facial recognition. Scaling and compression were considered separately. Once again it was found that the compression started affecting the CNNs' performance around a quality value of 10. With three studies finding the same quality value threshold where performance starts to drop, it is likely that this pattern will be observed in this project as well. Therefore the choice regarding the range of quality values in the compression treatments was made accordingly and no value higher than 40 was used. The scaling was applied down to different resolutions, but the scaling method is not specified. It was also found that the scaling had a more linear effect on the accuracy, dropping the accuracy consistently each time the images got smaller. However, the performance loss between the unscaled treatment and the first downscaled treatment of 80*80 was not all that large. This observed linearity allows for more freedom when choosing the scaling magnitudes to use in the scaling treatments, as there seems to be no threshold to adhere to [17]. A study by Bruckstein et al. 2003 states that downscaling an image and then compressing it results in higher image quality, both in terms of visible appearance and in terms of PSNR, than simply compressing the full size image to the same number of bits (same file size). This is an important motivational statement for the project described in this paper. It shows that the combination of image scaling and compression is an approach worth considering. The study also influenced the choice of applying scaling before compression instead of the other way around [44].

3 Method

3.1 Data set

The ILSVRC-2012-CLS-LOC validation image data set was chosen as the base data set for the project [45]. It contains 50 000 labeled images divided into 1000 distinct classes (50 images per class). There has been no newer version of this data set since 2012. Using the validation data set instead of the training set means that the classifier has not been trained on the images used, which is preferable. All images are in JPEG format with compression quality values of 96 and with different subsampling factors. When using ImageNet data sets it is customary to report two classification results, top-1 and top-5 accuracy. Top-1 accuracy is the fraction of the classified images for which the correct class corresponds to the class that was given the highest prediction by the classifier. In other words, the classifier only gets one guess at the correct answer. Top-5 accuracy is the fraction of the images where the correct answer can be found among the five highest predicted classes (the classifier gets 5 guesses) [24]. Given this, top-5 accuracy is expected to be higher than top-1 accuracy in most cases. The choice of using ImageNet felt natural since it seems to be the go-to data set for this type of general image classification task [6][24][33][8][46]. Using the same data that was used in earlier works makes comparisons of results easier and more meaningful. Further reasons to choose ImageNet are the sheer amount of data available and the large number of classes this data spans. There is a lack of alternative data sets that can compete with ImageNet for the purposes of this study. Perhaps the strongest motivation to choose ImageNet is that all of the pre-trained CNNs that were considered are designed after and trained on that data set, and thus generate predictions based on the 1000 ImageNet classes.

If another data set was used to make predictions, it would be required to sort out all the classes not present in both sets. Otherwise the accuracy would likely be exceptionally low right from the start. There might also be class names that are syntactically different but in principle semantically equivalent. These would have to be found and linked manually. Therefore it is much more convenient to use the data set the classifier is designed to use, especially since no disadvantages of using ImageNet have been identified in regards to this project.

3.2 Data treatments

The original data set was treated with 70 different combinations of scaling methods, scaling magnitudes and JPEG compression magnitudes, see Table 1. This was done using the command-line based image manipulation tool ImageMagick [47]. The scaling methods used were (according to their names in the ImageMagick tool) Gaussian, Lanczos, Mitchell, Point and Triangle. Gaussian, as the name suggests, uses the Gaussian function to generate a bell curve shape in the frequency domain. The implementation of the Lanczos filter uses the sinc function with three lobes, of which the first lobe is used to window the sinc function itself. The Mitchell filter implementation uses the cubic filtering method described by Mitchell and Netravali in their paper from 1988 [42]. Point is basically a nearest neighbour interpolation implementation in that it only considers the single nearest pixel to determine the new pixel's color value (the colour value is simply copied). Triangle implements a bilinear interpolation filter [48]. The scaling methods were chosen to get a variation of different types of methods across different filter families. For example, B-spline is a popular method but it generates results very similar to those of Gaussian and was thus excluded [37]. Point, or nearest neighbour, is the simplest and most efficient interpolation method, and if it proved on par with more complex methods in terms of output classifiability there would really be no reason to go for anything else. Triangle, or bilinear, is the natural step up from nearest neighbour: a little more sophisticated, but still with relatively low complexity. Both Point and Triangle are standard interpolated filters [48]. The Gaussian method was chosen because it is known to create images low in noise and ringing artifacts but instead with heavy amounts of blur. In other words, it fits the niche of the blurry filter. Lanczos is set apart from the others by being an approximator of the theoretically ideal sinc function, though it remains unclear if it is also ideal in practice, specifically in regards to the classifiability of the output. Lanczos represents the windowed filter family [48]. Mitchell is the balanced filter, trying to keep all distortions at reasonable levels. This might or might not be a good strategy for the purposes examined in this project. Mitchell is the project's only cubic filter [42]. The scaling methods were used to scale images in the original data set down to two resolutions, 224*224 pixels and 100*100 pixels, thus generating two uncompressed baseline data sets for each scaling method. 224*224 is the input size of the classifying neural network and thus the highest meaningful resolution. If images were scaled to higher resolutions the classifier program would scale them down instead, which would mask the effect of the scaling method that was meant to be studied. Images in the ImageNet data set commonly have resolutions around 500*375. No obvious choice for the lower resolution could be found in the literature, since scaling and compression have not been studied together in this context. The smaller resolution (100*100) was thus chosen as an arbitrary even number approximately half the size of the bigger resolution.

Using more than two resolutions was considered, but it was decided that this would cause the experiments to outgrow the scope of the project. As mentioned above, the images in the original data set have resolutions around 500*375, which translates to 187500 pixels. Resolutions of 224*224 and 100*100 result in 50176 and 10000 pixels respectively. So the scaling magnitudes used in these experiments roughly represent pixel quantity reductions of 73% and 95% compared to the original images. Each downscaled data set was later compressed using JPEG quality values of 40, 30, 20, 10, 5 and 1, generating six new compressed data sets. A value of 1 is the lowest possible value in ImageMagick. The original untreated images have quality values of 96, so in this context a quality value of 96 means that no additional compression was performed. Therefore the 96 quality treatments can be seen as baseline data sets for each method and scaling magnitude. The 100*100 data sets were scaled back up to 224*224 after compression, with the same scaling methods that were used to downscale them, in order to conform to the classifier's input size. The choice to start at a quality value of 40 instead of covering the entire range is based on the results of earlier studies, which indicate that compression only starts to significantly affect the classification accuracy below quality values of 10 [17][6]. In this project however compression is coupled with scaling, which is also expected to reduce accuracy, and thus the starting point of the quality value range was bumped up to 40, which is the same upper bound used by Klemen et al. 2017 [7]. Due to the predicted rapid accuracy decline below quality 10, decrements of five instead of ten are used below this threshold.

Table 1 The data treatments used.

    Treatment  Scaling method  Resolution  Compression quality
    1-7        Gaussian        224*224     96, 40, 30, 20, 10, 5, 1
    8-14       Lanczos         224*224     96, 40, 30, 20, 10, 5, 1
    15-21      Mitchell        224*224     96, 40, 30, 20, 10, 5, 1
    22-28      Point           224*224     96, 40, 30, 20, 10, 5, 1
    29-35      Triangle        224*224     96, 40, 30, 20, 10, 5, 1
    36-42      Gaussian        100*100     96, 40, 30, 20, 10, 5, 1
    43-49      Lanczos         100*100     96, 40, 30, 20, 10, 5, 1
    50-56      Mitchell        100*100     96, 40, 30, 20, 10, 5, 1
    57-63      Point           100*100     96, 40, 30, 20, 10, 5, 1
    64-70      Triangle        100*100     96, 40, 30, 20, 10, 5, 1

The general ImageMagick command for the scaling operations was:

    $ convert <input file name> -filter <filter> -resize <resolution>

where filter is the name of the scaling method. The general command structure of the compression operation was:

    $ convert -strip -quality <quality value> <input file name> -sampling-factor 4:2:2

The -strip flag removes metadata, like comments, from the image to reduce file sizes further. The subsampling factor 4:2:2 corresponds to a factor of 2x1,1x1,1x1. This was determined to be the factor generating the most classifiable images via preliminary classification experiments with 5000 images, which were compressed with a quality value of 80.

These experiments showed that a subsampling factor of 2x1,1x1,1x1 was slightly better than 1x1,1x1,1x1 in terms of classification accuracy and that 2x2,1x1,1x1 yielded the lowest accuracy (Table 2).

Table 2 Preliminary testing for choosing the subsampling factor for JPEG compression.

    Subsampling factor  top-1 accuracy  top-5 accuracy
    2x2,1x1,1x1         54.00%          86.00%
    2x1,1x1,1x1         71.70%          90.10%
    1x1,1x1,1x1         71.48%          90.02%

ImageMagick was chosen as the tool for making the treatments based on its balance of usability and configuration possibilities. Being a command-line tool, it enables the treatments to be automated and scheduled using scripts, which saves time and manual labor. It also comes with many implemented scaling methods to choose from, and there is documentation about how these are implemented. Besides creating the treated data sets, ImageMagick was also used to validate the images afterwards, via the identify command, to assure that the treatments had been performed as intended.
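As an illustration of how such scripted treatments can look, the following Python sketch (a reconstruction for this text, not the project's actual script; the input file and output naming scheme are invented) applies the two ImageMagick commands above for every combination of method, resolution and quality:

    import itertools
    import subprocess

    METHODS = ["Gaussian", "Lanczos", "Mitchell", "Point", "Triangle"]
    RESOLUTIONS = ["224x224", "100x100"]
    QUALITIES = [40, 30, 20, 10, 5, 1]  # quality 96 = uncompressed baseline

    def treat(src, dst, method, resolution, quality):
        # Downscale with the chosen filter...
        subprocess.check_call(["convert", src, "-filter", method,
                               "-resize", resolution, dst])
        # ...then recompress the result in place with the given quality.
        subprocess.check_call(["convert", "-strip", "-quality", str(quality),
                               dst, "-sampling-factor", "4:2:2", dst])

    for m, r, q in itertools.product(METHODS, RESOLUTIONS, QUALITIES):
        treat("input.jpg", "out_%s_%s_q%d.jpg" % (m, r, q), m, r, q)
    # (The 100*100 sets were additionally upscaled back to 224*224 after
    # compression, which is omitted here.)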

3.3 Classifier

The MobileNet V2 neural network was used for the actual image classification [46]. More specifically, the TF-slim implementation with a depth multiplier of 1.4 and an input size of 224x224 pixels (checkpoint mobilenet_v2_1.4_224) [49][50]. This specific implementation rounds feature depths to multiples of 8, which is an optimization that was not described in the original paper. The network is pre-trained on the ILSVRC-2012-CLS dataset to do image classification based on the 1000 ImageNet classes [45]. The full version of the model was used and not the feature vector package. It was cached locally via the classification program by setting the TFHUB_CACHE_DIR environment variable to a non-temporary directory and then fetching it by inputting the model's TensorFlow Hub URL in the hub.Module constructor. When choosing a model to use it was not necessarily of interest to choose the one with the best accuracy. A model was needed that was a good representation of general image classifiers and that was easy to work with. The MobileNet model is lightweight and efficient and hence does not require advanced hardware to achieve practical inference time complexity. This allows larger samples to be tested instead, which generates experimental results based on more data points, a good thing in regards to achieving more precise results. The MobileNet network also comes pre-trained and compatible with TensorFlow. Furthermore it is trained on the ImageNet dataset, which was chosen as the preferable data set for this project. Although performance was not the most important aspect when choosing a model, the MobileNet architectures have also been shown to perform well on image classification tasks compared to other networks, some of which are considerably larger and with much higher computational complexities [46][51]. So there are many motivations for going with MobileNets in this context and no apparent downsides. MobileNet V2 has not been studied previously in this context since it had not yet been created when the studies mentioned in section 2 were conducted.

3.4 Classifier program

The classifier program was written using Python version 2.7.12 and the TensorFlow library version 1.13.1. Keras, a TensorFlow compatible high-level neural network API, was used on top of TensorFlow [52].

Expected program arguments are the path to the root directory of the images that are to be classified, the number of images to use from the root, and the scaling method to use when images are resized to the input size of the classifier ("nearest" for nearest neighbour, "bilinear", or "bicubic"). This scaling parameter did not play a role in the actual experiments however, since all images used in the experiments were already scaled to the classifier's input size. There is also an optional argument that defines the name of an output file to save the classification results in. If this argument is omitted the result is only printed to standard output. The program uses an instance of the Keras class ImageDataGenerator and the class method flow_from_directory to create an internal representation of the images found in the directory specified by the first program argument. The directory based classes generated by flow_from_directory are ignored since all images of a specific data set are stored in the same directory. The model (cached locally as a .pb file with corresponding helper files) is accessed via a Module object from the TensorFlow Hub library. This module is then wrapped as a Keras Layer object and this object is passed to the Keras Sequential object constructor to create a sequential model. The sequential model's predict_generator method is passed the image data to carry out the classification. The reason predict_generator is used is that it allows small batches to be sent to the classifier at a time, which is crucial when running inference on a large number of images. Using the standard predict method one runs a substantial risk of overflowing system memory and potentially crashing the test system when using large data quantities. All computations are carried out on the CPU of the test system. The predictions are returned as 1001 element lists for each picture that was predicted. 1000 of these 1001 elements represent a class and the value of each element represents the certainty that the image belongs to that class (higher numbers mean higher certainty). The index of the element with the highest certainty is extracted to calculate top-1 accuracy and the indices of the elements with the five highest certainties are extracted to calculate top-5 accuracy. The accuracies are calculated by comparing the classifier's predictions to the ground truth labels of the images that were classified. The ground truth labels, in the form of integer IDs, came as part of the ILSVRC-2014 ImageNet development kit. The file contains 50000 rows with one class ID per row, representing the correct classes of the 50000 images in the validation set. The validity of the ground truth thus depends on keeping the images in their original order. These IDs were translated to semantic class names using the key file map_clsloc in the ILSVRC-2017 ImageNet development kit. This was necessary as the classifier and the original validation set ground truth use different numerical IDs to represent the same semantic class names (the classes are listed in different orders). For example, the first class (class on line 1) in the ground truth system is "kit fox" and class 1 in the classifier's system is "tench". Using TensorFlow together with Keras offers all the functionality needed for the purposes of the project. Together they constitute a high level interface for creating a classifier program with minimal effort.
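The following condensed sketch shows the general shape of such a program under TensorFlow 1.13 with tensorflow_hub and Keras. It is a reconstruction from the description above, not the actual program; the hub URL, directory path and batch size are assumptions:

    import os
    os.environ["TFHUB_CACHE_DIR"] = "/data/tfhub_cache"  # non-temporary dir

    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    MODULE_URL = ("https://tfhub.dev/google/imagenet/"
                  "mobilenet_v2_140_224/classification/2")
    IMAGE_SIZE = 224

    def classifier_fn(images):
        # hub.Module fetches the pre-trained model (or reads the cache).
        return hub.Module(MODULE_URL)(images)

    # Wrap the module as a Keras layer inside a Sequential model.
    model = tf.keras.Sequential([
        tf.keras.layers.Lambda(classifier_fn,
                               input_shape=[IMAGE_SIZE, IMAGE_SIZE, 3])
    ])
    sess = tf.keras.backend.get_session()
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])

    # flow_from_directory feeds images in batches; shuffle=False keeps
    # the original order so the ground truth file stays aligned, and the
    # directory-derived class labels are ignored (class_mode=None).
    gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        "treated_images/", target_size=(IMAGE_SIZE, IMAGE_SIZE),
        batch_size=32, class_mode=None, shuffle=False)

    # predict_generator classifies batch by batch to keep memory bounded;
    # each prediction row holds 1001 class scores.
    preds = model.predict_generator(gen, verbose=1)
    top1 = np.argmax(preds, axis=1)
    top5 = np.argsort(preds, axis=1)[:, -5:]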
TensorFlow has been used extensively in research and real world applications and its validity as a tool for deep learning applications has thus been thoroughly tested and verified [51][46][53][54][55][56][57][58].

3.5 Data Collection

The experiments were conducted by running the classifier program with the different treated data sets. A bash script was used to automate the process. The top-1 and top-5 accuracy values were recorded for each run and saved to an output file by specifying the last argument of the classifier program.

Each run was conducted on 25000 images, which is half of the total ILSVRC-2012-CLS-LOC validation set. Using all images would result in very time consuming experiments. Even when using only half of the data set it takes approximately 12 hours to classify 24 treatments. 25000 images was also considered enough to get meaningful results, considering that both Dodge and Karam 2016 and Yang et al. 2018 used 10000 images from their respective data sets [6][17]. When using 25000 images one single data point represents 0.004% of the total result, which was considered a high enough resolution. The test system consisted of a desktop computer with an AMD FX8120 3.1 GHz eight core processor and 8 GB of 1333 MHz DDR3 RAM, running the 64-bit version of Linux Mint 19.1 installed on a 500 GB Seagate Barracuda 7200 RPM HDD. The system sits on a Gigabyte GA-990FXA-UDT3 motherboard and is supplied by a 600 W OCZ 600MXSP PSU.

3.6 Data Analysis

Three-way ANOVA analyses were made in Matlab (version R2019a) using the anovan function to find out which of the factors (scaling method, scaling magnitude and compression magnitude) had significant effects on the top-1 and top-5 classification accuracy. The residuals from each ANOVA were plotted to check for equal variance and to make sure the distribution looked roughly normal, as these are requirements for the data to be suitable for the analysis. Multiple comparison tests were made on any significant factor interactions using the Matlab multcompare function in order to further investigate the interactions. All figures displaying trends in the data were also made using Matlab [59].
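The project itself used Matlab's anovan; as a rough illustration of the same analysis, here is an equivalent sketch in Python with statsmodels (the results file and its column names are invented):

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # One row per observation: scaling method, resolution, JPEG quality
    # and the measured top-1 accuracy (hypothetical file and columns).
    df = pd.read_csv("accuracy_results.csv")

    # Three-way ANOVA with all interaction terms, mirroring anovan.
    model = ols("top1 ~ C(method) * C(resolution) * C(quality)",
                data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))

    # Residual checks corresponding to Appendix A: the residuals should
    # be roughly normal with roughly equal variance.
    residuals = model.resid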

4 Results and Analysis

4.1 Average accuracy

When examining the average classification accuracy for each method, the accuracy was divided into the natural categories top-1 and top-5, as well as an average of the two (Figure 1). The value for each category and method is derived by averaging the results from each treatment for that method and category (all combinations of scaling and compression magnitudes). The Lanczos method is on top in each of the three accuracy categories, achieving an average top-1 accuracy of 36.44%, a top-5 accuracy of 55.54% and a total mean of 45.49%. The ranking order of the methods in regards to average accuracy is Lanczos, Mitchell (35.80%, 53.56%, 44.68%), Triangle (35.55%, 53.26%, 44.41%), Gaussian (34.38%, 52.02%, 43.20%) and Point (32.43%, 50.29%, 41.36%) over all categories.

4.2 Accuracy for each treatment

Figures 2 and 3 show the accuracies of all combinations of factors in regards to top-1 accuracy and top-5 accuracy respectively. No scaling method achieved the highest accuracy over all combinations of scaling and compression magnitude. The accuracies of the 224*224 treatments with quality value 96 (uncompressed) can be seen as baseline accuracies, since they represent the minimal level of manipulation required for the images to be usable by the classifier. The highest baseline top-1 accuracy was 73.04% (Lanczos) and the highest top-5 baseline was 91.00% (also Lanczos).

Figure 1: Average classification accuracy of the different scaling methods divided into top-1 accuracy, top-5 accuracy and an averaged total of top-1 and top-5.

Figure 2: Top-1 accuracy of the different scaling methods for all combinations of scaling and compression magnitude.

4.3 Average accuracy to file size ratios

For each treatment a classification accuracy to file size ratio was calculated, using the combined size of the 25000 images in the treatment (accuracy in percent divided by total size in MB). This was done both based on top-1 and top-5 accuracy (Figure 4). The scaling method boasting the highest average top-1 ratio was Gaussian with a value of 0.3399, followed by Triangle at 0.3384, Mitchell at 0.3381, Lanczos at 0.3212 and Point at 0.2612. For the top-5 accuracy based ratio the ranking of the methods was the same, with Gaussian at 0.5725, Triangle at 0.5654, Mitchell at 0.5636, Lanczos at 0.5366 and Point at 0.4475.

Figure 3: Top-5 accuracy of the different scaling methods for all combinations of scaling and compression magnitude.

Figure 4: Average classification accuracy to file size ratios for each scaling method.

4.4 Accuracy to file size ratios for all treatments

Top-1 and top-5 accuracy to file size ratios for all combinations are shown in Figures 5 and 6 respectively.

The treatment with the overall highest top-1 classification accuracy to file size ratio, with a value of 0.7323, was the treatment scaled with the Gaussian method to a resolution of 224*224 and then compressed with a quality value of 10. This combination yielded a top-1 accuracy of 41.74% and a combined treatment data set size (sum of all 25000 images) of 57 MB (41.74 / 57 ≈ 0.7323), which is only 1.7% of the size of the original data set (3.3 GB) and 7.4% of the uncompressed 224*224 Gaussian data set (773.4 MB). The same combination was also superior in terms of the top-5 ratio, with a ratio value of 1.1519 based on a top-5 accuracy of 65.66%. The total size (all 25000 images) of each treatment is shown in Figure 7.

Figure 5: Top-1 classification accuracy to file size ratios for all combinations of scaling method, scaling magnitude and compression magnitude.

Figure 6: Top-5 classification accuracy to file size ratios for all combinations of scaling method, scaling magnitude and compression magnitude.

Figure 7: Total size of each treatment in MB as the summed size of all images in the treatment.

4.5 Statistical analysis

The ANOVA analyses showed that all factors (scaling method, scaling magnitude and compression magnitude) had significant effects on both top-1 and top-5 classification accuracy, using a significance level of p=0.05 (Figures 8 and 9). There were significant interactions between method and compression magnitude and between scaling magnitude (resolution) and compression magnitude.

Figure 8: Three-way ANOVA made on top-1 accuracy data for the three different factors and all combinations of factors.

Analyses of the interaction between scaling method and scaling magnitude for top-1 accuracy and top-5 accuracy are shown in Figures 10 and 11. A combination whose graph does not overlap horizontally with that of any other graph is significantly different from all other combinations. A graph that overlaps with one or more other graphs is not significantly different. The marked (blue) graph in Figure 10 (method Point, resolution 100*100) is an example of a combination that is significantly different from all other combinations of method and resolution. The interaction between scaling magnitude and compression magnitude for top-1 and top-5 accuracy is shown in Figures 12 and 13. The comparison is made for all combinations, but in practice it is more relevant to fix one factor to isolate and examine the effect of the other.

Figure 9: Three-way ANOVA made on top-5 accuracy data for the three different factors and all combinations of factors.

Figure 10: Interactions between the scaling method and scaling magnitude factors in regards to top-1 accuracy. The marked (blue) graph represents the only significantly different combination.

Figure 11: Interactions between the scaling method and scaling magnitude factors in regards to top-5 accuracy. The marked (blue) graph represents the only significantly different combination.

For example, to correctly compare the effects of compression magnitude, scaling magnitude is fixed, comparing 224*224 96 and 224*224 40, 224*224 40 and 224*224 30 and so forth. If this is done in Figure 12 (top-1 accuracy), fixing scaling magnitude at 224*224, there are significant effects of the compression magnitudes with quality 1, 5, 10 and 96, and for scaling magnitude 100*100 there are significant effects for quality values 10, 20 and 96. If the same is done for Figure 13 (top-5 accuracy), there are significant effects of quality values 1, 5, 10 and 96 for scaling magnitude 224*224, and of 1, 5, 10, 20 and 96 for scaling magnitude 100*100. If the compression magnitudes are instead fixed in Figure 12 (top-1 accuracy) to investigate the effects of scaling magnitude, that is, comparisons are made between 96 224*224 and 96 100*100, 40 224*224 and 40 100*100 etc., scaling magnitude has significant effects for all compression magnitudes except quality 1. In Figure 13 (top-5 accuracy) scaling magnitude has significant effects in all cases.

Figure 12: Interactions between the scaling magnitude and compression magnitude factors in regards to top-1 accuracy. The marked graph is an example of a combination that is not significantly different from the others, since it overlaps horizontally with the graph of another combination.

Figure 13: Interactions between the scaling magnitude and compression magnitude factors in regards to top-5 accuracy. The marked graph is an example of a significantly different combination.

The residuals of the ANOVA analyses were plotted to verify homogeneity of variance, and histograms were made to verify that the residuals were roughly normally distributed (Appendix A). This verifies that the accuracy data is suitable for this type of analysis without transformation.
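Under the statsmodels sketch above, these diagnostics could be produced as follows (the thesis made the corresponding plots in MATLAB):

    # Residuals vs. fitted values (homogeneity of variance) and a
    # residual histogram (approximate normality).
    import matplotlib.pyplot as plt

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.scatter(model.fittedvalues, model.resid, s=8)
    ax1.axhline(0.0, color="grey", linewidth=1)
    ax1.set(xlabel="Fitted accuracy", ylabel="Residual")
    ax2.hist(model.resid, bins=30)
    ax2.set(xlabel="Residual", ylabel="Count")
    plt.tight_layout()
    plt.show()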

5 Discussion

5.1 Interpretation of Results

In this section the results are discussed in regards to the research questions stated in section 1.1. All results discussed are to be considered trends without statistical significance unless explicitly stated otherwise.

Which scaling method achieves the highest classification accuracy on average?

Although the statistical test only revealed a significant effect of the Point method for one specific scaling magnitude, there were trends in the data that hint at answers to this question. Based on these trends, the scaling method that achieved the highest overall classification accuracy appears to be Lanczos (Figure 1). In previous work this method has been shown to have higher PSNR than the Mitchell method, which achieved the second highest average accuracy, as well as the Triangle method that came in third [41]. In other words, the ranking in classification accuracy coincides with the ranking in PSNR. However, no link between PSNR and image classifiability has been found in the literature that could explain or foresee this. Lanczos does model the theoretically ideal sinc function, which also seems to be practically ideal in this context [12].

Mitchell was designed to balance the different visual artifacts, keeping each at a minimum [42]. One of the mentioned artifacts is blur, which image classifiers have been proven sensitive to [6]. Keeping blur low might thus be a reason why Mitchell creates highly classifiable images. On the contrary, Gaussian is a very blurry filter, which might explain the method's comparatively poor performance [35]. Point (nearest neighbour) came in last in terms of accuracy, which might be expected since it is the simplest method and one that is known to create strong blurring and aliasing artifacts [19]. It is known to be ill suited for upscaling since all it does is duplicate pixels [13]. Therefore it is not surprising that it performed much worse than the other methods on the treatments where images were downscaled to 100*100 pixels and then upscaled again to 224*224 pixels (Figures 2 and 3). For the treatments without upscaling, Point actually performed relatively well; it is the treatments with upscaling that lower its overall average accuracy. Triangle came in third, which might be due to it not being as blurry as Gaussian or as primitive as Point, yet not sophisticated enough to compete with the two top performing methods.

Is the method that achieves the highest average accuracy superior at all scaling and compression magnitudes, or do different scaling methods work best at different scaling and/or compression magnitudes?

According to trends in the results, no scaling method appears superior overall, achieving the highest accuracy for all combinations of scaling magnitude and compression magnitude (Figures 2 and 3). Different combinations seem to suit different scaling methods differently well. Despite having the highest average accuracy, Lanczos has a slightly lower accuracy than Point at 224*224 quality 5 and 224*224 quality 1, than Mitchell and Triangle at 100*100 quality 96, and than Point again at 100*100 quality 5, in regards to both top-1 and top-5 accuracy. However, having the highest accuracy in 10 out of 14 categories suggests that Lanczos has good overall performance, even though this could not be proven statistically. One interesting detail is that the method that achieved higher accuracy than Lanczos the most times was Point, which overall performed the worst.
Point seems to perform better than the other methods when scaling magnitude is low and compression magnitude is high, so low scaling and high compression could be considered Point's niche.
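To make the treatments concrete, the following is a minimal sketch of one scale-then-compress treatment using the Wand binding to ImageMagick. The thesis used ImageMagick [47] but does not list its exact commands, so the file names and parameter choices here are illustrative assumptions:

    # One treatment: downscale with a chosen ImageMagick filter, then
    # save with a given JPEG quality value.
    from wand.image import Image

    def treat(src: str, dst: str, size: int, filter_name: str, quality: int) -> None:
        with Image(filename=src) as img:
            # Filters studied here: 'point', 'triangle', 'gaussian',
            # 'mitchell', 'lanczos'.
            img.resize(size, size, filter=filter_name)
            img.compression_quality = quality  # JPEG quality, 1-100
            img.save(filename=dst)

    treat("original.jpg", "treated.jpg", 224, "lanczos", 10)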

Also worth mentioning here is the sudden dip in accuracy between quality values 10 and 5 for the higher resolution in Figures 2 and 3. Both Dodge and Karam [6] and Grm et al. [7] also found that accuracy only really started to decline below quality 10. Between quality 10 and 5, accuracy drops from about 45% to below 20%, and then keeps decreasing by roughly 50% for each increasing compression magnitude, despite the smaller quality decrements of five below quality 10. For the lower resolution the trend is not as pronounced.

Which scaling method achieves the highest classification accuracy to image file size ratio on average?

The scaling method boasting the highest average top-1 accuracy to file size ratio was Gaussian with a value of 0.3399, followed by Triangle at 0.3384, Mitchell at 0.3381, Lanczos at 0.3212 and Point at 0.2612 (Figure 4). So when looking at the classification accuracy to file size ratio, the tables have turned somewhat compared to when looking at classification accuracy on its own. While the Gaussian method came in second to last in regards to average classification accuracy (Figure 1), it came out on top in regards to the average accuracy to file size ratio due to its tendency to consistently deliver small treatment sizes (Figure 7). Lanczos, which achieved the highest accuracy, had the second lowest ratio, meaning it could not compete in terms of file sizes. Point had the lowest ratio, which is to be expected since it both had the lowest accuracy and the largest treatment sizes.

As seen in Figure 7, different scaling methods generate images of different file sizes even before compression is applied. The reason for this is not obvious. Any differences after compression can be explained by JPEG's way of taking the contents of the image into account when compressing it [60]. The images scaled by the different methods will be different, and therefore JPEG will compress them differently. Some scaling methods, such as Gaussian, will create images that contain more data deemed expendable by the JPEG algorithm than other scaling methods, such as Lanczos or Point. Explaining more precisely how this happens and which data is deemed expendable would require diving deeper into the JPEG algorithm.

As mentioned above, the observation that file sizes differed between scaling methods even before compression was somewhat unexpected. Since the scaling is done to the same number of pixels, there must instead be differences in how these pixels are represented. The images are already JPEGs from the beginning, compressed at quality value 96. So when scaling one of them the result is a compressed image, even though the quality value does not change. Because the image has been changed during scaling, the JPEG algorithm might have to compress it anew before finalizing it, to keep the compression optimal based on the image content. It is still compressed at the same quality value, but on a different basis. The sizes of the scaling results will thus differ between scaling methods, since they create different bases for compression. The principle is then the same as when first scaling and then compressing. This is a hypothetical explanation and further reading would be required to verify or reject it.

There were larger differences in treatment size between the methods than there were differences in classification accuracy, giving the size a large impact on the accuracy to treatment size ratio metric.
This metric also does not take into account any usefulness threshold in regards to classification accuracy. A really low accuracy and an even lower treatment size can give a high ratio, disregarding the fact that the accuracy can be too low for the images to be practically useful. So this metric gives an indication of the relationship between classifiability and file sizes, but it should not be used on its own to determine how to treat images one might wish to classify later. A simple safeguard is to filter out ratios whose underlying accuracy falls below a usefulness threshold, as sketched below.
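A minimal sketch of such a filtering step, with the threshold set at 90% of the best uncompressed 224*224 top-1 accuracy as in the example in the next subsection; the record layout and the second record's size are hypothetical:

    # Keep accuracy-to-size ratios only where the accuracy clears a
    # usefulness threshold (90% of the 73.04% Lanczos baseline).
    BASELINE_TOP1 = 73.04            # percent
    THRESHOLD = 0.9 * BASELINE_TOP1  # ~65.7 percent

    treatments = [
        # (method, resolution, quality, top-1 accuracy %, size MB)
        ("gaussian", "224*224", 10, 41.74, 57.0),   # from the results
        ("lanczos", "224*224", 96, 73.04, 750.0),   # size is illustrative
    ]

    useful = [
        (m, r, q, acc / size)
        for (m, r, q, acc, size) in treatments
        if acc >= THRESHOLD
    ]
    print(useful)  # the Gaussian quality-10 treatment is filtered out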

Which combination of scaling method, scaling magnitude and compression magnitude achieves the highest classification accuracy to image file size ratio?

A good example of this limitation of the accuracy to file size ratio metric appears when looking at which combination of factors has the highest ratio (Figures 5 and 6). This combination proved to be Gaussian scaling to 224*224 followed by compression with quality 10, which yielded a top-1 accuracy of 41.74% and a top-5 accuracy of 65.66%. Although not unusable, one would probably prefer higher accuracies. Let's imagine one were to place a usefulness threshold at 90% of the highest uncompressed 224*224 top-1 classification accuracy (Lanczos, 73.04%). Then this combination with the highest top-1 ratio would fall short, at only 57.14% of the baseline accuracy. Such a threshold could be used to filter out ratio values that are based on low accuracies.

While having the highest accuracies, the uncompressed treatments get the lowest ratios due to much larger file sizes. The only other treatment category that can compete in terms of low ratios is the maximally compressed 100*100 treatments, which got accuracies as low as 0.0298%. This hints that some level of compression is probably preferable if file sizes are of importance. The treatment size for the treatment with the highest accuracy to file size ratio was only 1.7% of the untreated original data set and 7.4% of the uncompressed 224*224 Gaussian scaled data set. When looking at the 224*224 treatments in general, the treatment sizes are reduced to approximately 5% of the untreated images and 18% of the scaled but uncompressed treatment size already at compression quality 40, while still maintaining top-1 classification accuracies well above 60% (Figures 7 and 2). This shows that it is possible to reduce file sizes to a large extent without sacrificing too much classifiability.

Which of the three factors (scaling method, scaling magnitude and compression magnitude) is the best predictor of classification accuracy?

In the initial ANOVA analyses all factors gave significant p values for both top-1 and top-5 accuracy (Figures 8 and 9). However, the significance of the factors was not isolated, as all interactions between factors except the one between scaling method and compression magnitude were also significant for both top-1 and top-5 accuracy. So the factors cannot be claimed to explain the resulting classification accuracy on their own, as their effect might only be significant in combination with other factors.

When examining the only significant interaction involving the scaling method factor, the interaction between method and resolution, the only significant combination was the Point scaling method together with the resolution 100*100 (Figures 10 and 11). This combination was significantly lower than all other combinations in regards to both top-1 and top-5 accuracy. So the only significant interaction between scaling method and scaling magnitude occurs when using the Point method and scaling to a resolution of 100*100 pixels, and this is the only significant effect of scaling method on classification accuracy that can be confirmed. There is an effect of scaling method on classification accuracy, but it should be considered a weak one, since it is only significant in this one specific case. The conclusion that can be drawn is not to use the Point method when scaling images down to 100*100 pixels (and then scaling back up).
In the other cases there is no statistical support for choosing one method over the others, although there are trends suggesting that the methods perform differently.

The other significant interaction was between scaling magnitude and compression magnitude (Figures 12 and 13). The figures show all combinations, but the relevant ones are those

with one factor in common, so that the effect of the other can be isolated. If scaling magnitude is fixed at 224*224 to isolate the effects of compression magnitude for that resolution, there are significant effects of the compression magnitudes with quality values 1, 5, 10 and 96. So after scaling images to 224*224 pixels, the compression magnitudes that have a significant effect on top-1 accuracy are qualities 1, 5, 10 and 96. For the other compression magnitudes, the change in accuracy (shown in Figures 2 and 3) from the previous and following magnitudes was not large (and/or grouped) enough to be statistically significant. If scaling magnitude is fixed at 100*100, the significant effects of compression are for the quality values 10, 20 and 96. For top-5 accuracy, compression magnitude effects were significant for quality values 1, 5, 10 and 96 when first scaling to 224*224 pixels, and for quality values 1, 5, 10, 20 and 96 when first scaling to 100*100 and then upscaling back to 224*224. Mind though that a quality value of 96 means that images were not compressed, so this magnitude should be seen more as a baseline than an actual compression treatment.

Based on the results referred to above, compression magnitude seems to have a stronger effect on classification accuracy than scaling method had. Its effect was significant in 4 out of 7 cases for top-1 at 224*224, 3 out of 7 cases for top-1 at 100*100, 4 out of 7 cases for top-5 at 224*224 and 5 out of 7 cases for top-5 at 100*100. The interaction between scaling method and resolution only had a significant effect in 1 out of 10 cases for both top-1 and top-5.

If compression magnitude is instead fixed to examine the effects of scaling magnitude on top-1 accuracy, there are significant effects for all compression qualities except quality 1. For top-5 accuracy, scaling magnitude had significant effects at all compression magnitudes. Thus scaling magnitude seems to be the factor that best predicts classification accuracy in this experiment. This is however not unexpected, and could be considered somewhat misleading, since there is a big proportional difference between the resolutions 224*224 and 100*100: the latter is less than half of the former. The levels of compression magnitude are proportionally much closer to each other, and it is therefore harder for them to be significantly different from their neighbouring levels.

It is also relevant to emphasize that there was no significant interaction between scaling method and compression magnitude. So the scaling methods do not respond differently to the different magnitudes of compression in regards to classifiability; no method performs significantly better or worse than the others at a given compression magnitude.

5.2 Conclusion and Recommendations

In conclusion, the experiments showed that the scaling method used to scale images had a weak effect on the classification accuracy achieved when classifying them. Despite the weak statistical support, the trends imply that the Lanczos method generates the most classifiable images. However, this method was not superior in all cases and performed poorly in regards to accuracy to file size ratio. The trends instead suggest Gaussian as the method of choice for achieving the highest classification accuracy to file size ratio. If images are to be moderately downscaled and then heavily compressed, the Point method can be a decent choice despite its simplicity. If images are to be downscaled and later upscaled again, Point is statistically shown to be a poor choice of scaling method. Compressing images with quality values lower than 10 results in practically unusable images in regards to classifiability, even at (for the context) high resolutions, as foreshadowed by the studies of Dodge and Karam [6] and Grm et al. [7]. The combination of scaling magnitude and compression magnitude that delivers the highest accuracy to file size ratio regardless of method is to scale images to 224*224 and then compress them

with a quality value of 10. However, the usefulness of this metric has been shown to be limited and it should be used with caution. It has been shown that, in general, the file sizes of images can be reduced to single-digit percentages of the original file size while still maintaining good classifiability. Both scaling magnitude and compression magnitude proved to be better predictors of classification accuracy than scaling method.

5.3 Future Work

The apparent correlation between average classification accuracy and PSNR could be investigated in a separate study. If the PSNR metric predicts classification accuracy, that would make it much easier to choose a good combination of scaling and compression magnitude for reducing the sizes of images that are to be used for classification. Bruckstein et al. devised an algorithm for calculating precisely such an optimal combination in regards to PSNR, so finding such a correlation between PSNR and classifiability could be a big breakthrough [44].

The accuracy differences between methods in this project were not very large, which could be because the images were not of very high resolution to begin with, resulting in low relative scaling magnitudes. Future studies could investigate the effects of scaling methods on high definition images. This would result in higher relative scaling magnitudes, since images are still downscaled to the classifier's internal image size, only from a much higher base resolution. Higher scaling magnitudes might highlight differences between methods more clearly.

Training an image classifier on images containing distortions has proven to largely diminish their effect on classification performance [17]. It could therefore be of interest to conduct a study where different classifiers are trained on data manipulated with different scaling methods. The accuracies could then be compared between these specialized classifiers on data modified by their respective methods, to find out whether there are some distortions that cannot be trained away or whether all scaling methods are equal when their distortions are compensated for via targeted training. If one were to couple file size reduction and image classification in a large scale real world application, a specialized classifier would probably be the preferable route; this was, for example, the route taken by Ahn et al. [11].

References

[1] H. Jiang and E. Learned-Miller, “Face detection with the faster R-CNN,” in 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 650–657, IEEE, 2017.

[2] S. Akçay, M. E. Kundegorski, M. Devereux, and T. P. Breckon, “Transfer learning using convolutional neural networks for object classification within x-ray baggage security imagery,” in 2016 IEEE International Conference on Image Processing (ICIP), pp. 1057–1061, IEEE, 2016.

[3] J. Niu, Y. Liu, M. Guizani, and Z. Ouyang, “Deep CNN-based real-time traffic light detector for self-driving vehicles,” IEEE Transactions on Mobile Computing, 2019.

[4] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. Van Der Laak, B. Van Ginneken, and C. I. Sánchez, “A survey on deep learning in medical image analysis,” Medical Image Analysis, vol. 42, pp. 60–88, 2017.

[5] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015.

[6] S. Dodge and L. Karam, “Understanding how image quality affects deep neural networks,” in 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–6, IEEE, 2016.

[7] K. Grm, V. Štruc, A. Artiges, M. Caron, and H. K. Ekenel, “Strengths and weaknesses of deep learning models for face recognition against image degradations,” IET Biometrics, vol. 7, no. 1, pp. 81–89, 2017.

[8] I. Vasiljevic, A. Chakrabarti, and G. Shakhnarovich, “Examining the impact of blur on recognition by convolutional networks,” arXiv preprint arXiv:1611.05760, 2016.

[9] D. A. Koff and H. Shulman, “An overview of digital compression of medical images: can we use lossy image compression in radiology?,” Journal of the Canadian Association of Radiologists, vol. 57, no. 4, p. 211, 2006.

[10] B. Nicolae, “High throughput data-compression for cloud storage,” in International Conference on Data Management in Grid and P2P Systems, pp. 1–12, Springer, 2010.

[11] J. Ahn, J. Park, D. Park, J. Paek, and J. Ko, “Convolutional neural network-based classification system design with compressed wireless sensor network images,” PLoS ONE, vol. 13, no. 5, p. e0196251, 2018.

[12] A. C. Öztireli and M. Gross, “Perceptually based downscaling of images,” ACM Transactions on Graphics (TOG), vol. 34, no. 4, p. 77, 2015.

[13] J. A. Parker, R. V. Kenyon, and D. E. Troxel, “Comparison of interpolating methods for image resampling,” IEEE Transactions on Medical Imaging, vol. 2, no. 1, pp. 31–39, 1983.

[14] J. Kopf, A. Shamir, and P. Peers, “Content-adaptive image downscaling,” ACM Transactions on Graphics (TOG), vol. 32, no. 6, p. 173, 2013.

[15] K. Sayood, Introduction to data compression. Morgan Kaufmann, 2017.

[16] W. Burger and M. J. Burge, Principles of digital image processing, vol. 54. Springer, 2009.

[17] F. Yang, Q. Zhang, M. Wang, and G. Qiu, “Quality classified image analysis with application to face detection and recognition,” in 2018 24th International Conference on Pattern Recognition (ICPR), pp. 2863–2868, IEEE, 2018.

[18] Y. Zhou, D. Liu, and T. Huang, “Survey of face detection on low-quality images,” in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 769–773, IEEE, 2018.

[19] T. M. Lehmann, C. Gönner, and K. Spitzer, “Survey: Interpolation methods in medical image processing,” IEEE Transactions on Medical Imaging, vol. 18, no. 11, pp. 1049–1075, 1999.

[20] W. Rawat and Z. Wang, “Deep convolutional neural networks for image classification: A comprehensive review,” Neural Computation, vol. 29, no. 9, pp. 2352–2449, 2017.

[21] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, et al., “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[22] K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[23] S.-C. Lo, S.-L. Lou, J.-S. Lin, M. T. Freedman, M. V. Chien, and S. K. Mun, “Artificial convolution neural network techniques and applications for lung nodule detection,” IEEE Transactions on Medical Imaging, vol. 14, no. 4, pp. 711–718, 1995.

[24] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.

[25] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, 2015.

[26] L. Chen, S. Wang, W. Fan, J. Sun, and S. Naoi, “Beyond human recognition: A CNN-based framework for handwritten character recognition,” in 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pp. 695–699, IEEE, 2015.

[27] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” arXiv preprint arXiv:1511.08458, 2015.

[28] G. Dreyfus, Neural networks: methodology and applications. Springer Science & Business Media, 2005.

[29] G. Zaccone, Getting Started with TensorFlow. Packt Publishing Ltd, 2016.

[30] Z. Zhao, P. Zheng, S. Xu, and X. Wu, “Object detection with deep learning: A review,” CoRR, vol. abs/1807.05511, 2018.

[31] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., “TensorFlow: A system for large-scale machine learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.

[32] L. W. Chew, L.-M. Ang, and K. P. Seng, “Survey of image compression algorithms in wireless sensor networks,” in International Symposium on Information Technology, vol. 4, pp. 1–9, 2008.

[33] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv preprint arXiv:1412.6572, 2014.

[34] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.

[35] N. A. Dodgson, “Image resampling,” tech. rep., University of Cambridge, Computer Laboratory, 1992.

[36] E. Dumic, S. Grgic, and M. Grgic, “Hidden influences on image quality when comparing interpolation methods,” in 2008 15th International Conference on Systems, Signals and Image Processing, pp. 367–372, IEEE, 2008.

[37] P. Thévenaz, T. Blu, and M. Unser, “Image interpolation and resampling,” Handbook of Medical Imaging, Processing and Analysis, vol. 1, no. 1, pp. 393–420, 2000.

[38] A. C. Bovik, The essential guide to image processing. Academic Press, 2009.

[39] P. Burt and E. Adelson, “The Laplacian pyramid as a compact image code,” IEEE Transactions on Communications, vol. 31, no. 4, pp. 532–540, 1983.

[40] C. E. Duchon, “Lanczos filtering in one and two dimensions,” Journal of Applied Meteorology, vol. 18, no. 8, pp. 1016–1022, 1979.

[41] T. Acharya and P.-S. Tsai, “Computational foundations of image interpolation algorithms,” Ubiquity, vol. 2007, no. October, pp. 4–1, 2007.

[42] D. P. Mitchell and A. N. Netravali, “Reconstruction filters in computer graphics,” in ACM SIGGRAPH Computer Graphics, vol. 22, pp. 221–228, ACM, 1988.

[43] A. E. Fabris and A. R. Forrest, “Antialiasing of curves by discrete pre-filtering,” in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, pp. 317–326, ACM Press/Addison-Wesley Publishing Co., 1997.

[44] A. M. Bruckstein, M. Elad, and R. Kimmel, “Down-scaling for better transform compression,” IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1132–1144, 2003.

[45] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “ImageNet Large Scale Visual Recognition Challenge,” International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, 2015.

[46] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mobilenetv2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520, 2018.

[47] ImageMagick Studio LLC, “ImageMagick (version 6.8.9-9 Q16).” http://www.imagemagick.org, 2018.

[48] A. Thyssen, “Resampling filters – IM v6 examples.” https://www.imagemagick.org/Usage/filter/. Accessed May 10, 2019.

[49] “TensorFlow Hub.” https://tfhub.dev/google/imagenet/mobilenet_v2_140_224/feature_vector/2. Accessed April 16, 2019.

[50] M. Sandler, “models/README.md at master · tensorflow/models.” https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/README.md. Accessed April 16, 2019.

[51] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.

[52] “Home - Keras documentation.” https://keras.io/. Accessed April 22, 2019.

[53] Y. Liu, K. Gadepalli, M. Norouzi, G. E. Dahl, T. Kohlberger, A. Boyko, S. Venugopalan, A. Timofeev, P. Q. Nelson, G. S. Corrado, et al., “Detecting cancer metastases on gigapixel pathology images,” arXiv preprint arXiv:1703.02442, 2017.

[54] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, et al., “Speed/accuracy trade-offs for modern convolutional object detectors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7310–7311, 2017.

[55] B. Wu, F. Iandola, P. H. Jin, and K. Keutzer, “Squeezedet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 129–137, 2017.

[56] L. Perez and J. Wang, “The effectiveness of data augmentation in image classification using deep learning,” arXiv preprint arXiv:1712.04621, 2017.

[57] A. R. Elias, N. Golubovic, C. Krintz, and R. Wolski, “Where’s the bear? - Automating wildlife image processing using IoT and edge cloud systems,” in 2017 IEEE/ACM Second International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 247–258, IEEE, 2017.

[58] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826, 2016.

[59] “MATLAB - MathWorks - MATLAB & Simulink.” https://se.mathworks.com/products/matlab.html. Accessed May 10, 2019.

[60] A. Raid, W. Khedr, M. El-Dosuky, and W. Ahmed, “JPEG image compression using discrete cosine transform - a survey,” arXiv preprint arXiv:1405.6147, 2014.

A Verification of ANOVA residuals

Plotted residuals of the ANOVA made on top-1 accuracy.

Plotted residuals of the ANOVA made on top-5 accuracy.

Histogram of the top-1 accuracy ANOVA residuals against the normal distribution curve.

Histogram of the top-5 accuracy ANOVA residuals against the normal distribution curve.
