A Visual Database of Recognisable Utensils

David Fullerton (s1137636)

4th Year Project Report
Computer Science
School of Informatics
University of Edinburgh

2016


Abstract

This report is the culminating summary of the project "A Visual Database of Recognisable Kitchen Utensils", from its early planning stages, through the more practical stages in which issues and challenges were addressed, overcome, or simply accepted, to the resulting work in which a classifier was successfully created. Ultimately the goal of this project was to take the first steps towards a more efficient, or convenient, home by applying technology to menial tasks, and while many issues were indeed faced, and much progress can still, and always will, be made, that goal has been reached. Three main areas were outlined as the means of accomplishing a positive outcome: data collection, website creation and, finally, utensil recognition.

Table of Contents

1 Introduction
  1.1 Project Goals
  1.2 Summary of Contributions

2 Dataset
  2.1 Design
  2.2 Implementation
    2.2.1 Data Collection
    2.2.2 Website
  2.3 Discussion

3 Thresholding and Description
  3.1 Design
  3.2 Implementation
    3.2.1 Removing the Background
    3.2.2 Creating the Feature Vectors
    3.2.3 Testing Moment Invariance
  3.3 Discussion

4 Recognition
  4.1 Design
  4.2 Implementation
  4.3 Discussion

5 Conclusion
  5.0.1 Data collection
  5.0.2 Website
  5.0.3 Classifier, Utensil Recognition
  5.0.4 Final Thoughts

6 Appendix

Bibliography


Chapter 1

Introduction

In the future it is highly likely that we will have even more forms of convenience in our homes. This has been an ongoing process for many decades, and is still happening at present. Should this trend of convenience within the home continue, as seems likely, robots within the home are not beyond the bounds of possibility. One area in particular that seems ripe for such an application is the kitchen, given all the forms of convenience already implemented there. The kitchen is in many ways the heart of the home, and a room that everyone within the home uses. A robot placed in that room could therefore simplify the space, as well as make it far more efficient. It could do so in a number of ways: performing menial tasks such as washing, drying and putting away dishes, or perhaps even preparing meals in their entirety. A robot with such responsibilities would need to recognise different kitchen utensils and, from there, know how to use them appropriately. To work towards a robot capable of such things, a database of images had to be collected and a recognition algorithm created; this project has aimed to do both. By collecting such images and applying them to the algorithm, this project aims to take the first few steps towards the kitchen of the future. The algorithm itself aims to identify, for each image collected, the class of utensil the image belongs to. Along with this, a simple website has been created to display the images collected. This allows them to be downloaded at convenience, used within the project, and applied to an algorithm whenever needed.

1.1 Project Goals

The goal of this project is ultimately to create a database of images and a classifier, both of which can be used in building a robot that can function within the kitchen and distinguish between different classes of utensil.

• Firstly, a database of kitchen utensil images had to be collected. Each image in turn had to be segmented, leaving a binary image of just the utensil to be registered and understood by the program, and eventually a robot. To create a broad database that would allow for a smaller margin of error, the intent was to collect approximately 20 different classes of utensil: bottle opener, dessert spoon, dinner fork, dinner knife, fish slice, kitchen knife, ladle, masher, peeler, pizza cutter, potato peeler, serving spoon, soup spoon, spatula, tea spoon, tongs, whisk, wooden spoon, can opener and bread knife. Preferably each image and utensil would be unique, and 100 images of each class would be taken, giving a database of 2000 images for the algorithm to be applied to.

• Secondly, a website needed to be created. This website would provide links to download either individual images, or zip files containing all raw images, all binary images, or all images together. These zip files were to be available for each class of utensil individually and for all classes combined. Essentially the site was designed to make access to the stock images easier.

• Thirdly, and lastly, an algorithm was to be created. This visual classification algorithm was to be trained to take a new image from one of the above classes and recognise which class it belonged to, allowing it to tell a fork from a knife, say, or a potato masher from a spatula.

1.2 Summary of Contributions

• The collection of 449 images by manually photographing kitchen utensils.

• The segmentation of each collected image via image thresholding.

• The creation of a simple XML-based website that displays all images, both photographs and binary segmented images, and provides the means for downloading various sets of them: all photos of a class, all binary images of a class, all images of a class, all photos, all binary images, or all images.

• The adaptation of code from the course Introduction to Vision and Robotics [2], along with code written from scratch, to create and store feature vector descriptions of every image in the database. Scatter plots and Gaussian distributions were created to represent these feature vectors.

• A multivariate Bayesian classifier, trained on the feature vectors, that attempts to classify an image of a kitchen utensil presented to it.

Chapter 2

Dataset

2.1 Design

With such a large amount of data to be collected, as well as created, a certain amount of pre-planning had to be done to ensure the success of this project. The two main areas of planning concerned the images: collecting them, and putting them onto a website.

For collecting the images two main avenues were considered: going out and photographing them all manually, or acquiring them online. Both ideas raised challenges. The photography avenue was challenging in the sheer number of hours that would have to be spent taking the photographs, ensuring their backgrounds were usable, and managing the lighting. The online avenue posed the challenge of copyright infringement and the tricky process of avoiding it, along with vetting the images' quality or lack thereof.

Furthermore, once collected, the images had to be stored as well as made easily accessible. This was easiest done in the form of a website, which also needed to be designed specifically for the project's needs. The images were to be made easily available for download, and essentially this was all the website was required to do. This made it far less important for the website to be aesthetically pleasing, or cluttered by unnecessary functions, allowing a streamlined approach to its design: the focus was the images and access to them. It was decided that an XML file, with an XSLT stylesheet to translate it into HTML, was appropriate, as this allowed a straightforward way of displaying all of the hundreds of collected images on the website. The overall layout of the website was inspired by Dermofit [1], which has a similar basic function in that it also displays a large array of images, both segmented and not.

The kitchen utensil site was to have its own homepage, "kitchen utensils.xml", with a list of each available class of utensil, along with an example image and a segmented image of each class. A short description of the site, along with links to zip-file downloads of all images, was also to be on this homepage. From there a visitor can easily navigate to any individual class page.

On each of those class pages, all images of that class and their segmented counterparts are shown. From the class pages it is also possible to download the following: individual images, a zip file of all raw photos, a zip file of all segmented images, or a zip file of all images, both segmented and raw.

2.2 Implementation

2.2.1 Data Collection

This project involved the collection of a large number of images containing kitchen utensils. Initially the plan was to gather the majority of the required images from various websites on the internet. By doing so, the time spent manually taking photos would be kept to a minimum while still gathering a large number of images in a fraction of the time, allowing the focus of the project to be on segmentation and the recognition algorithm. Unfortunately two main issues quickly arose with this form of data collection:

• Copyright: During data collection it was vital to ensure that there were no breaches of copyright, so only images that were free to use could be part of the project. For most images this needed to be checked thoroughly, and that process became very time consuming. This was partially tackled by accessing sites that specifically supply free-to-use images, so some images were still available to the project, but far fewer than expected and needed.

• Quality: For the majority of images acquired from the internet it could not be guaranteed that the quality would be up to standard. This manifested in a number of ways that rendered an image useless: the resolution being too low, the image being cluttered, the utensil being at an unusual angle, or part of the utensil being out of frame. Individually or collectively these issues amounted to an unusable image, and at the frequency at which they arose, the process became very time consuming indeed.

Taking these problems into consideration, it was decided that manually taking the required photos would ultimately be the better option, allowing reliable photo quality and control of the background, lighting, scale and angle of the utensil. It also removed all copyright concerns in a single action. This was not a fix-all solution by any means: the main downside was that it required more time to be devoted to data collection, which in turn led to fewer images being collected than originally planned. While this by no means derailed the project, it was certainly an area that required leniency and adaptability. The exact number of images captured per class is shown in Table 2.1.

With the images being manually collected, this involved travelling to the houses of friends and family to photograph the required classes of utensils.

Table 2.1: A table showing the number of images that were collected per class with the total number of images collected at the bottom.

    Class Name       Number of images
    Dinner Fork                    42
    Dinner Knife                   35
    Fish Slice                     47
    Kitchen Knife                  26
    Ladle                          22
    Masher                         12
    Whisk                          11
    Dessert Spoon                  13
    Peeler                          8
    Potato Peeler                  14
    Bottle Opener                  14
    Tongs                          14
    Soup Spoon                     12
    Wooden Spoon                   15
    Pizza Cutter                   10
    Serving Spoon                  35
    Spatula                        14
    Tea Spoon                     105
    Total Count                   449

Visits were also made to local charity shops and home furnishing stores in an attempt to gather as many images as possible. This led to an unforeseen issue: the packaging of some classes of utensil, specifically kitchen knives, meant that no photos could be taken in stores, and so these classes ended up with fewer images.

In an effort to simplify the segmentation process, a common background was chosen for all photos, which could then be filtered out during segmentation, leaving a binary image of only the utensil. Initially a black or white background was used: a black background for silver or white utensils and a white background for the rest. This caused two main issues. First, images on a white background cast a shadow that was often mis-segmented. Second, utensils frequently had both silver and black parts, meaning that segmentation failed on either background. An example of a photo taken on a white background, in which the utensil both casts a shadow and contains silver and black parts, is shown in Figure 3.2. After these problems were noticed the background was changed to green, which was found to be far more successful (Figure 3.5).

2.2.2 Website

As previously mentioned, the primary goal was to create a functional website with streamlined access, so the website's design is a simple one. It focuses on the images, and on access to them and their downloads. XML files were written to easily display hundreds of images on the website at the same time, and XSLT files were used to convert the XML into HTML. Since a standard layout applied to every utensil class's web page, a single XSLT file was created that translates any class's XML file into an appropriate web page, vastly cutting down on the amount of code needed to style the website. This saved time and effort for the more vital parts of the project, namely the algorithm and its application. A screenshot of the website's home page is shown in Figure 2.1; screenshots of all class web pages are in the appendix, Figures 6.1 to 6.20.

Figure 2.1: The homepage of the Kitchen Utensils website
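To illustrate the shared-stylesheet approach, below is a minimal sketch of the kind of XSLT file described; the element and attribute names (class, image, raw, binary) are illustrative assumptions, not the project's actual schema.

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- One template renders any class's XML listing as an HTML page. -->
      <xsl:template match="/class">
        <html>
          <body>
            <h1><xsl:value-of select="@name"/></h1>
            <xsl:for-each select="image">
              <!-- Each photo is shown beside its segmented counterpart,
                   linked for individual download. -->
              <p>
                <a href="{@raw}"><img src="{@raw}" height="150"/></a>
                <a href="{@binary}"><img src="{@binary}" height="150"/></a>
              </p>
            </xsl:for-each>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

Because every class page is produced by one such template, adding a new class only requires writing its XML listing, not any new presentation code.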

2.3 Discussion

While the decision to manually photograph the required kitchen utensils on a green background simplified the segmentation process, it was costly in time. The task of data collection remained time consuming largely because of the ground to be covered: travelling to people's homes or to stores in order to take the photographs. Even after such travelling it was often the case that very few samples of certain classes were available; for example, no images of kitchen knives could be taken in stores due to packaging, and households frequently owned only one can opener. These problems led to a deficit in images, and two classes, can opener and bread knife, had too few images to be used once the recognition phase was reached. This lack of images proved a substantial issue, and if future work were to be carried out on this project, or in this field, then the collection of substantially more data would be a critical area of progress.

The website is functionally sound and serves the purpose it was created for. It is not particularly aesthetically pleasing, but as this was never an intention, it hardly seems a drawback. It clearly displays the images and their segmented counterparts, while providing links to download them. Future work on the website could involve making it more obvious how an image is downloaded, perhaps even providing some instructions for an unpractised user. More descriptive text could also be added, but overall, if more work were dedicated to the project, improving the website would remain a low-priority task.

Chapter 3

Thresholding and Description

3.1 Design

Basic colour thresholding was used to separate the utensil from the background in each image. This was made possible by having control over the background present in each image, which allowed basic thresholding to work effectively on every image acquired. Since the same background was used for each image, the function was designed around the concept of finding all pixels belonging to the background and removing them; the result was a binary image in which all pixels have value zero except those belonging to the utensil, which have value one. The initial threshold values were chosen near the RGB values expected for a plain green image, and were later adjusted, primarily depending on the lighting.

Each image in turn required representation, and so descriptions were created for each. These are values representing different qualities of the new binary image, and they are what the classifier uses to recognise the class of a utensil, i.e. whether it is a fork or a knife. At first the project used compactness and the 6 complex invariant moments, with the intention of later removing any descriptions that proved redundant, and of adding new ones if the classifier had trouble distinguishing between classes. Once the descriptions for an image had been calculated they were combined into a feature vector. Due to the large number of images, calculating all of the feature vectors each time a classifier was run or trained would take a lot of time. To tackle this, the feature vector for each image was calculated only once, or only when a new image or feature was added. All feature vectors were stored in a table and written to a .dat file, meaning that the feature vector for an image could be quickly accessed at any time without slowing the process with re-calculation.


3.2 Implementation

3.2.1 Removing the Background

The first attempts at thresholding were made on photos taken on a white or black background. The thresholding simply required the image to be converted to grayscale and a threshold chosen between 0 and 1, where 1 represents pure white and 0 black, usually about 0.5. The result is a binary image in which all pixels with a value lower than the threshold are 0, and those higher are 1. This proved ineffective, due to the shadow cast in images with a white background, which was often segmented as part of the utensil, and due to utensils often containing both silver and black sections. The result of attempting to segment Figure 3.2 is the binary image in Figure 3.1. As can be seen, much of the utensil has been segmented as background and much of the background as part of the utensil. A further problem was that many utensils had both dark and light areas and so were badly segmented on either a white or a black background. This prompted the change to a green background, as such a colour is far less commonly present on a utensil, and to thresholding based on the RGB values of each pixel.

Figure 3.1: The resulting segmentation of Figure 3.2, demonstrating the issues with a white or black background

With the new green background a slightly more complex thresholding was required. The RGB values of each pixel were normalised in an attempt to make all areas of the background, regardless of light level, have roughly the same RGB values. The next step was to pick a value to threshold each normalised RGB value on. Initially these thresholds were set for each channel separately, each producing a binary image of the pixels that had passed thresholding, and these binary images were then combined using an AND operation. This proved less effective than initially expected, and problems were still experienced with shadow. In an attempt to improve the segmentation, three new variables were created and thresholded with both lower and upper limits:

• The difference between the normalised red value and the normalised green value.

Figure 3.2: An example of a utensil with both light and dark areas on a white background

• The difference between the normalised green value and the normalised blue value.

• The difference between the normalised red value and the normalised blue value.

Providing upper and lower thresholds for these three values proved far more effective at removing the background than using the normalised values alone. The threshold values chosen differed slightly with the lighting present when the photo was taken: in some photos the background appears an almost dull yellow, in others a bright green. This meant slight adjustments to the threshold values had to be made between photos with greatly different lighting.

Although a green background was far superior to a black or white one, there were still issues that were often difficult to resolve. Shadow was still occasionally a problem; although normalisation should have taken care of it, there were still cases where the shadow was so dark that the RGB values did not resemble any shade of green at all. Another problem involved utensils with a reflective surface, in particular those that bend and curve, like soup ladles. Due to the reflective surface, some of the background green was reflected in the surface of the utensil when the photo was taken. For that area of the utensil the RGB values were then exactly the same as the background's, and so the area could not be segmented out no matter how strict the thresholding was. An example of this issue is shown in Figure 3.3, which when segmented produced Figure 3.4. As can be seen, much of the ladle has been classified as part of the background, since the segmentation process simply classifies all "green" areas as background, without taking context into account.

A feature added to help deal with these problems was a small function that cleaned up any noise present in the binary image after segmentation. It simply consisted of eroding the image once, finding the largest area of pixels in the image with value 1, dilating the image twice and finally eroding once. A trivial cleanup function that, along with removing noise, provided two main benefits:

Figure 3.3: A problematic image due to the curved and reflective nature of the utensil

Figure 3.4: The resultant segmentation of Figure 3.3, showing the problem with curved reflective surfaces

• Because the cleanup removed most noise from the image, it allowed the thresholding to be stricter than before; essentially the segmentation purposefully produced more noise than before in an effort to ensure that more of the utensil was correctly segmented. As long as the extra noise was minimal it would be removed by the cleanup, leaving only the higher quality segmentation.

• The second step of the cleanup, where only the largest connected area of 1-valued pixels is kept, ensures that when the features are calculated for the image they are applied to one solid block rather than to fragmented parts of a utensil.

After cleanup the binary image consists of one connected area of white pixels representing the utensil, and descriptions can now be calculated for it. The result of segmenting the image shown in Figure 3.5 is Figure 3.6: a clean binary image of the utensil. A sketch of the whole pipeline is given after the figures below.

Figure 3.5: An example of a utensil with a green background.

Figure 3.6: The binary segmented image of Figure 3.5
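For concreteness, the following is a minimal MATLAB sketch of the segmentation pipeline just described. The threshold bands and structuring-element size here are illustrative placeholders, not the values used in the project, which were tuned per lighting condition.

    % Segment a utensil photographed on a green background: normalise the
    % RGB channels, threshold the pairwise channel differences, then clean
    % up with erosion, largest-component selection and dilation.
    function binary = segmentUtensil(rgbImage)
        im = im2double(rgbImage);
        s = sum(im, 3) + eps;               % per-pixel channel sum
        r = im(:,:,1) ./ s;                 % normalised channels reduce
        g = im(:,:,2) ./ s;                 % the effect of uneven lighting
        b = im(:,:,3) ./ s;
        % Pixels inside all three bands are taken to be background green.
        % These limits are assumed values for illustration only.
        bg = (g - r > 0.02) & (g - r < 0.35) & ...
             (g - b > 0.02) & (g - b < 0.35) & ...
             (abs(r - b) < 0.15);
        binary = ~bg;                       % keep the non-green pixels
        % Cleanup as described above: erode once, keep the largest
        % connected area, dilate twice, erode once.
        se = strel('disk', 3);
        binary = imerode(binary, se);
        binary = bwareafilt(binary, 1);     % largest connected area only
        binary = imdilate(imdilate(binary, se), se);
        binary = imerode(binary, se);
    end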

3.2.2 Creating the Feature Vectors

To create the descriptions for the feature vector, code from the Introduction to Vision and Robotics course [2] was adapted. This code included the file "getproperties.m", which provided a means of calculating compactness and the 6 invariant moments. However, when testing the classifier it became apparent that knives were being incorrectly classified as forks, and so an 8th description was created with the intention of distinguishing utensils that have regions comprised of prongs or thin spokes from utensils that do not. This new description was calculated by taking the area of the utensil, eroding the image fifteen times, dilating the image fifteen times, calculating the new area, and then dividing the new area by the old one. For solid utensils like knives or spoons there should be next to no difference between the two areas, so the value of the description will be near one. For utensils like forks or whisks, however, the fifteen erosions cut the prongs or thin wire parts off and remove them from the image, resulting in a value closer to 0.7 or 0.6.
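A sketch of how this "prongs" description could be computed is given below; the structuring element is an assumed choice, as the original code is not reproduced here.

    % Prongs description: the ratio of the utensil's area after fifteen
    % erosions followed by fifteen dilations to its original area.
    function p = prongsDescription(binary)
        areaBefore = sum(binary(:));
        se = strel('disk', 1);              % assumed structuring element
        opened = binary;
        for i = 1:15
            opened = imerode(opened, se);   % thin prongs vanish here
        end
        for i = 1:15
            opened = imdilate(opened, se);  % solid regions grow back
        end
        % Near 1 for solid utensils (knives, spoons); nearer 0.6-0.7 for
        % utensils with prongs or thin wires (forks, whisks).
        p = sum(opened(:)) / areaBefore;
    end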

Having defined our eight descriptions, a function was created that simply looped through all images in the database. For each image the function would calculate the eight descriptions, create a feature vector containing them, and store this feature vector in a table along with the filename of the image and the class folder it belongs to. The table was then written to a data file for later use. A sketch of this step is given at the end of this section.

With all feature vectors created, it was possible to create scatter plots and Gaussian function plots. A series of scatter plots was created for each class, with one of the descriptions on each axis of a plot. Creating these scatter plots provided a number of benefits:

• Clumps: The points in these scatter plots that represent images of the same class should, ideally, be tightly clumped together. If the points are spread out across an axis, there may be something wrong with how the description is calculated, or the description may not be a particularly useful one, at least not for that class in particular.

• Outliers: If most of the points in the graph form a clump apart from one point lying far away, it could mean that an image experienced an error during segmentation that went unnoticed. This allows the problematic image to be found, and the issue to be fixed.

• Separability: This is more noticeable in the Gaussian plots but can still be seen here. If, for many classes, the points tend to clump around the same area on a particular axis, that description will not be very effective at telling those classes apart; if this holds for many classes it may be worth removing the description entirely. On the other hand, if clumps from different classes do not overlap at all on a particular axis, then the description has high separability and will help in recognising the class of a utensil.

The Gaussian plots show similar information to the scatter plots; however, since Gaussians can more easily be stacked on top of one another, it is easier to see how separable classes will be: if the curves overlap a lot then that description has low separability, and if they hardly overlap at all then high. Shown in the appendix of this report are some examples of the scatter plots that were produced, Figures 6.21, 6.22, 6.23, 6.24 and 6.25; these show the distribution of the descriptions Compactness and ci1 throughout each class. The appendix also contains all Gaussian plots created, showing the distribution of all features among all classes: Figures 6.26, 6.27, 6.28, 6.29, 6.30, 6.31, 6.32 and 6.33.
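The caching step might look like the following sketch, assuming one folder per class containing the binary images, and assuming getproperties returns the seven original descriptions as a row vector; the folder layout and file names are hypothetical.

    % Compute each image's eight descriptions once, collect them in a
    % table, and write the table to a data file for later use.
    classDirs = dir('images');              % assumed: one folder per class
    rows = {};
    for c = 1:numel(classDirs)
        if ~classDirs(c).isdir || strncmp(classDirs(c).name, '.', 1)
            continue;
        end
        files = dir(fullfile('images', classDirs(c).name, '*.png'));
        for f = 1:numel(files)
            bw = imread(fullfile('images', classDirs(c).name, ...
                                 files(f).name)) > 0;
            fv = [getproperties(bw), prongsDescription(bw)];  % 8 values
            rows(end+1, :) = {files(f).name, classDirs(c).name, fv};
        end
    end
    featureTable = cell2table(rows, ...
        'VariableNames', {'filename', 'class', 'features'});
    save('featurevectors.dat', 'featureTable', '-mat');  % reload with load()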

3.2.3 Testing Moment Invariance

Compactness and the 6 moment invariants should be invariant to the rotation, scale and translation of the utensil in the image. To check that the descriptions calculated were indeed invariant, the same dessert spoon was photographed sixteen times, each time changing either the scale, translation or rotation of the spoon. Feature vectors were then created for each image and the results plotted against the images of dessert spoons that had already been added to the data file. The resulting plots are shown in the appendix, Figures 6.34, 6.35, 6.36, 6.37, 6.38 and 6.39. The scatter plots demonstrate that the 16 test images do indeed clump tightly with all other photos of dessert spoons; this is as expected, and shows that the descriptions are indeed invariant to scale, rotation and translation. It is most evident in Figure 6.34. The Gaussian plots show in red the distribution of the test images, compared with all other images in the database represented by the blue plot.
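The same idea can also be checked synthetically, as in the sketch below: compactness, computed here as perimeter squared over 4*pi times area, should barely change when a binary mask is rotated or rescaled. This is illustrative only; the project's test used sixteen real photographs, and the file name here is hypothetical.

    % Compactness should be nearly unchanged under rotation and scaling
    % of the binary mask (assuming a single connected component).
    bw = imread('dessert_spoon_binary.png') > 0;    % hypothetical file
    variants = {bw, imrotate(bw, 37), imresize(bw, 0.5)};
    for k = 1:numel(variants)
        stats = regionprops(variants{k}, 'Area', 'Perimeter');
        compactness = stats(1).Perimeter^2 / (4 * pi * stats(1).Area);
        fprintf('variant %d: compactness = %.3f\n', k, compactness);
    end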

3.3 Discussion

Creating the feature vectors and then writing them to a data file saved a lot of time, since, given the large number of images, this is a large bulk of the computation involved in the project; once created and stored, the feature vectors could be accessed at any time. The thresholding method used is very basic and would not work for images with a more complex background, but it is ample for the images concerned in this project. Segmentation by more complicated means, for example edge finding, could allow images with more complex backgrounds to be used, greatly broadening the set of images that could be added to the database as well as tested by the classifier. Selecting the right descriptions is key to a satisfactory classifier, so a high-priority focus for future work on the descriptions would be to remove redundant ones and to create new specialised descriptions that discriminate between classes.

Chapter 4

Recognition

4.1 Design

A multivariate Bayesian classifier was chosen as the algorithm to recognise the class of a utensil. A Bayesian classifier suits this task as it is relatively simple and scales easily to multivariate, multiclass problems, making it ideal for this project. Although basic, as long as the feature vectors describe the differences between classes well enough to separate them, the Bayesian algorithm should recognise the class of a utensil with acceptably high accuracy. Another form of classification considered during planning was a support vector machine; however, this was deemed overly complex, especially for a multiclass problem, and might not have improved the accuracy of the classifier by a noticeable amount.

Training the classifier was to use N-fold cross validation. This technique is intended to represent how the classifier will perform when given a new image of a utensil it has never seen before. The process involves carrying out a series of steps N times, with N being ten for this project, each run resulting in a confusion matrix of classifications. The steps involved in each of the ten runs are as follows:

• First, randomly split the data set in two such that each half has an equal number of images, an equal number of images from each class, and no images in common with the other half. One half is used as the training set, the other as the testing set. The split is random to cover as many permutations of the data as possible and avoid repetition.

• The classifier is trained using the designated training set; this involves creating a model representation of each class which is later used in the classification stage.

• The trained classifier is then fed all the images in the testing set. The classifier chooses a class for each image, and this is stored in a table where the row is the class that the classifier predicted and the column is the class the image is known to represent. This table is known as a confusion matrix and is used to see how accurately the classifier performs and where it makes mistakes.


Once the ten confusion matrices have been produced, the accuracy of each is calculated. Two types of accuracy are used in this project, macro accuracy and micro accuracy, calculated as follows:

• Macro: Macro accuracy is calculated by counting all of the instances where the classifier correctly classified a utensil and dividing by the total number of images the classifier attempted to classify. This is a simple measure of accuracy and works well if there is a similar number of images per class, but it can be misleading if one class has significantly more images than the others. For example, take three classes A, B and C, where A has 97 data samples, B has 2 and C has 1. If 95 images were correctly classified as A but none of B's or C's images were correctly classified, macro accuracy would still return 95% despite every image of class B and C being classified incorrectly. This is where micro accuracy comes in.

• Micro: Micro accuracy is calculated in a few steps. For each class, count how many of that class's images were correctly classified and divide by the number of images in that class. Once this value has been found for each class, sum the values and divide by the number of classes. Using the same example given for macro accuracy, the calculation would be (95/97 + 0/2 + 0/1)/3, giving approximately 33% accuracy instead of 95%.

Once these accuracies are calculated for each of the ten confusion matrices, the mean and standard deviation of each accuracy can be calculated, giving a representation of the performance of the classifier. The ten confusion matrices can also be averaged into a single confusion matrix, from which it can be seen not only whether utensils are being incorrectly classified, but also which class the algorithm thought each utensil belonged to. This can help in creating specific features to distinguish between problematic classes. A sketch of the two accuracy measures is given below.
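As a minimal sketch, assuming a confusion matrix whose rows are predicted classes and whose columns are true classes as described above, the two measures could be computed as follows; the worked example from the text is reproduced at the end.

    % Macro and micro accuracy from a confusion matrix C, where C(i, j)
    % counts images of true class j that were predicted as class i.
    function [macro, micro] = accuracies(C)
        correct = diag(C);                   % correct count per class
        macro = sum(correct) / sum(C(:));    % overall fraction correct
        perClass = correct(:) ./ sum(C, 1)'; % per-class rate over true counts
        micro = mean(perClass);              % averaged over the classes
    end

    % Worked example from the text: 95 of class A's 97 images correct,
    % none of B's 2 or C's 1 images correct.
    %   C = [95 2 1; 1 0 0; 1 0 0];
    %   [macro, micro] = accuracies(C)   % macro = 0.95, micro ~ 0.33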

4.2 Implementation

The MATLAB files "buildmodel.m" and "getproperties.m" were adapted and used during recognition, along with ideas drawn from other files from the Introduction to Vision and Robotics course at the University of Edinburgh [2].

To begin the process of classification, a model must be created for each class that represents the distribution of feature vectors found in images containing a utensil of that class. This is known as training the classifier, and these models are what the classifier later uses to recognise the class of a utensil it is shown. For simplicity, all images were initially used both to train and to test the classifier, providing a confusion matrix that represented the upper bound of the accuracy the algorithm was capable of. A sketch of the training and classification steps is given below.
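The following is a minimal sketch of those two steps, assuming a Gaussian model per class (the mean and covariance of that class's feature vectors) with uniform priors; it stands in for the adapted buildmodel.m, whose exact contents are not reproduced here, and mvnpdf requires the Statistics Toolbox.

    % Train: one Gaussian model per class, fitted to that class's feature
    % vectors. features is N x 8; labels is an N x 1 cell array of names.
    function models = trainClassifier(features, labels, classNames)
        for c = 1:numel(classNames)
            X = features(strcmp(labels, classNames{c}), :);
            models(c).name = classNames{c};
            models(c).mu = mean(X, 1);
            % cov(X) is invertible only when the class has at least as
            % many images as there are descriptions -- the constraint
            % discussed below.
            models(c).sigma = cov(X);
        end
    end

    % Classify: pick the class whose Gaussian gives the feature vector
    % the highest likelihood (uniform priors assumed).
    function name = classify(models, fv)
        best = -Inf;
        for c = 1:numel(models)
            ll = log(mvnpdf(fv, models(c).mu, models(c).sigma));
            if ll > best
                best = ll;
                name = models(c).name;
            end
        end
    end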

The confusion matrix was created for all images, Table 6.1, and the two accuracies calculated. The macro accuracy was 55.9% and the micro accuracy 51.8%. Considering that these results are intended as upper bounds for the accuracy of the classifier, they are unexpectedly low. This led to the discovery that several images had been segmented incorrectly and were almost certainly contributing to the low accuracies. Another matter of interest in this confusion matrix was that the majority of dinner knives were being incorrectly identified as dinner forks. This led to the creation of the eighth description, called prongs, designed to discriminate between knives and forks.

Once this new description was created and the erroneous images fixed, the feature vectors were recalculated and written to the data file, and the classifier was trained and tested again. This produced a new confusion matrix that includes the prongs feature, Table 6.2. The new macro accuracy was 68.1% and the new micro accuracy 68.8%. The improvement in accuracy is encouraging, though still lower than expected. Looking again at dinner knives, there is in fact a slight increase in the number of knives incorrectly classified as dinner forks, meaning that the introduction of the prongs feature, at least in its current state, did not solve this issue.

Once the confusion matrix for a classifier trained on all the images had been created, the next step was to begin using N-fold cross validation. Unfortunately, this is where the shortfall in the amount of data collected became a problem. To train a Bayesian classifier there must be at least as many training images per class as there are descriptions; for this project that means each class must have at least eight images to create a model, because the inverse covariance cannot be computed if there are fewer images than features. Since N-fold cross validation involves splitting the data set in half, one half for training and the other for testing, each class needed at least sixteen images to be cross validated. Had the data collection goal been met this would not have been a problem, since there would have been about one hundred images per class; unfortunately, due to the extra time needed to photograph images, the difficulty of finding large numbers of utensils, and time constraints, the number of images collected was far below the target, and several classes did not have at least sixteen images.

A possible solution to this problem was to reduce the number of descriptions to seven, so that only fourteen images would be required per class. To decide which description to remove, the classifier was trained and tested on all of the images eight times, each time removing one of the descriptions. This created eight confusion matrices representing the eight options for using only seven descriptions:

• No description removed: Table 6.2. Macro accuracy = 68.1% and micro accuracy = 68.8%.

• Compactness removed: Table 6.3 in the appendix. Macro accuracy = 58.1% and micro accuracy = 57.0%.

• ci1 removed: Table 6.4 in the appendix. Macro accuracy = 61.5% and micro accuracy = 60.2%.

• ci2 removed: Table 6.5 in the appendix. Macro accuracy = 65.3% and micro accuracy = 61.5%.

• ci3 removed: Table 6.6 in the appendix. Macro accuracy = 64.8% and micro accuracy = 62.8%.

• ci4 removed: Table 6.7 in the appendix. Macro accuracy = 65.3% and micro accuracy = 63.3%.

• ci5 removed: Table 6.8 in the appendix. Macro accuracy = 65.0% and micro accuracy = 63.4%.

• ci6 removed: Table 6.9 in the appendix. Macro accuracy = 69.7% and micro accuracy = 66.7%.

• Prongs removed: Table 6.10 in the appendix. Macro accuracy = 61.9% and micro accuracy = 60.4%.

Given these results, the description to remove would be ci6: its removal surprisingly increased the macro accuracy, and although it decreased the micro accuracy, it did so by a smaller amount than removing any other description. Removing ci6 meant that only fourteen images per class were needed to carry out N-fold cross validation; however, there were still too many classes with fewer than fourteen images, and so N-fold cross validation was not carried out. A sketch of this experiment is given below.
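The experiment might be scripted as in the sketch below, reusing the earlier sketches (trainClassifier, classify and accuracies are the assumed helpers here, not files from the project) and testing on the training data as described.

    % Retrain and retest with each of the eight descriptions removed in
    % turn, reporting both accuracies for each option.
    descNames = {'Compactness','ci1','ci2','ci3','ci4','ci5','ci6','Prongs'};
    for d = 1:numel(descNames)
        keep = setdiff(1:8, d);             % the seven kept columns
        models = trainClassifier(features(:, keep), labels, classNames);
        C = zeros(numel(classNames));       % rows: predicted, cols: true
        for i = 1:size(features, 1)
            pred = strcmp(classNames, classify(models, features(i, keep)));
            C(pred, strcmp(classNames, labels{i})) = ...
                C(pred, strcmp(classNames, labels{i})) + 1;
        end
        [macro, micro] = accuracies(C);
        fprintf('%s removed: macro %.1f%%, micro %.1f%%\n', ...
            descNames{d}, 100 * macro, 100 * micro);
    end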

4.3 Discussion

Due to the deficit in images collected, both the can opener class and the bread knife class had to be omitted from use with the classifier entirely.

The final confusion matrix using all images still showed quite poor accuracies, considering that this setup gives the upper limit of what the classifier is capable of. Perhaps the accumulation of more images would improve the accuracies. If future work were carried out on this project, a valuable point of progress would be to create more specialised descriptions to add to the feature vectors, as this would increase the accuracy of the classifier. A description that was discussed but, due to time constraints, not implemented was the area of the utensil divided by the area of the convex hull around the utensil. This would help in separating classes of utensils that have holes in them, for example fish slices, from solid utensils like wooden spoons.

Even with the removal of one of the features there were still not enough classes with image counts high enough to carry out cross validation. However, many of the classes that lacked images were only one or two short; the addition of a few more images to the database could have made cross validation viable, and better time management would surely have allowed this.

The prongs description was intended as a way of solving the issue of dinner knives being classified as dinner forks; unfortunately, it did not have the desired effect. Even so, comparing the accuracies with all eight descriptions against those with the prongs description removed shows that the extra description helped a great deal: the macro accuracy is 6.2% lower without it and the micro accuracy 8.4% lower. The prongs description therefore aided the classifier in recognising utensils, even if it did not correct the misclassification of dinner knives.

The addition of enough images to enable cross validation, while using all descriptions, possibly even more than those in this project, would significantly improve the performance of the classifier.

Chapter 5

Conclusion

There were three main goals of this project: to collect a database of kitchen utensil images, to create a website to host the images, and to create a classifier that can recognise the class of a utensil shown in an image.

5.0.1 Data collection

Approximately two thousand images were to be collected for the database: one hundred images per class, with twenty classes in total. Once the decision was made to manually photograph every image, it was clear that the original goal of two thousand images would take a massive amount of time to achieve. In the end only four hundred and forty-nine images were collected. Although this is still an acceptable number of images, the problem lay in the distribution among the classes. The teaspoon class had the most images, with over one hundred, the only class to meet the original goal, whereas can opener had only four. Due to this deficit, both the can opener and bread knife classes had to be omitted from use with the classifier entirely. This outcome was unfavourable, and if future work were carried out on the project, a drastic increase in the size of the database would be a high priority. Improving the segmentation process to cope with more complex backgrounds, perhaps by using forms of edge detection or adaptive thresholding, would broaden the range of images the classifier could accept; this lessened restriction would also allow more images to be added to the database.

5.0.2 Website

The website design is simple and straightforward, as initially intended; it serves its purpose by providing access to the images in the database in a clear and concise manner. Improvements could be made to make the website more user friendly and navigable, possibly by adding a separate download page where all available zip files can be accessed in one place. However, improving the website would be a low priority if future work

were to be carried out on the project, as increasing the size of the database or the accuracy of the classifier would be far more valuable goals.

5.0.3 Classifier, Utensil Recognition

A Bayesian classifier was created that, when trained, could classify the utensil shown in an image with a moderate rate of success. While the accuracy was improved by the addition of the new "Prongs" description, even when the classifier was trained and tested on the same data the accuracy was just under 70%. Considering this reflects a much higher accuracy than would be seen on new data, this is a reasonably poor result. The low accuracy was not helped by the large deficit of images in some classes. Had data collection been more successful, N-fold cross validation could have been used to calculate an accuracy closer to how the classifier would perform on new data; even with the removal of a description, however, there were not enough images to allow cross validation. Had cross validation been implemented, it would not have improved the accuracy of the classifier, only provided a more realistic representation of its capabilities. An effective method of increasing the accuracy would have been to create more descriptions for specific problems, possibly adjusting the "Prongs" description so that it successfully solves the problem of dinner knives being classified as dinner forks. A description that was planned but, due to time constraints and poor time management, omitted from the final project was the area of the utensil divided by the area of its convex hull; this might also have aided the discrimination between knives and forks, and would certainly have improved the accuracy overall.

5.0.4 Final Thoughts

Many of the issues faced during the project stemmed from the deficit of images collected. Considerable time was spent photographing images, which leads to the conclusion that manually taking photographs was simply not a time-efficient way to collect data, certainly not at the scale involved in this project. Although manually taking photos did have the benefit of keeping control over the background in each image, simplifying the segmentation process, perhaps it would have been better to gather more images with a range of different backgrounds and to create a more complex segmentation process that could handle complex and varying backgrounds. Of course, creating a more complex segmenter may have taken the same amount of time as taking the photos, so it is hard to say which option would have been better. The addition of more descriptions to the feature vectors would have greatly benefited the classifier and the results of the project; unfortunately this did not happen, and it is noticeable in the accuracy of the final classifier. Allocating more time to tailoring the feature vectors would have been greatly beneficial.

A database of kitchen utensils has been created, even if it is lacking numbers in some classes, and all the images have been segmented and hosted on a website where they can be downloaded and used for future work in vision and robotics. A basic Bayesian classifier was created that performed moderately well with the feature vectors it was given. The primary shortcoming of the project was the deficit of images for some classes. Overall, however, the project is a success; although more descriptions in the feature vector would have been beneficial, the "Prongs" feature that was created did improve accuracy, even if it did not quite fix the problem it was created for. If future work were carried out on this project there would be two primary areas of focus: first, substantially increasing the number of images in the database, so as to allow N-fold cross validation and hopefully improve classifier accuracy; second, creating new descriptions for specific misclassifications and removing descriptions that prove redundant or even harmful to the accuracy of the classifier.

Chapter 6

Appendix


Figure 6.1: A screen shot of the bottle opener class web page

Figure 6.2: A screen shot of the bottle opener class web page

Figure 6.3: A screen shot of the bottle opener class web page

Figure 6.4: A screen shot of the bottle opener class web page

Figure 6.5: A screen shot of the bottle opener class web page

Figure 6.6: A screen shot of the bottle opener class web page

Figure 6.7: A screen shot of the bottle opener class web page

Figure 6.8: A screen shot of the bottle opener class web page

Figure 6.9: A screen shot of the bottle opener class web page

Figure 6.10: A screen shot of the bottle opener class web page

Figure 6.11: A screen shot of the bottle opener class web page

Figure 6.12: A screen shot of the bottle opener class web page

Figure 6.13: A screen shot of the bottle opener class web page

Figure 6.14: A screen shot of the bottle opener class web page

Figure 6.15: A screen shot of the bottle opener class web page

Figure 6.16: A screen shot of the bottle opener class web page

Figure 6.17: A screen shot of the bottle opener class web page

Figure 6.18: A screen shot of the bottle opener class web page

Figure 6.19: A screen shot of the bottle opener class web page

Figure 6.20: A screen shot of the bottle opener class web page

Figure 6.21: An example of a scatter plot for the descriptions Compactness and ci1.

Figure 6.22: An example of a scatter plot for the descriptions Compactness and ci1.

Figure 6.23: An example of a scatter plot for the descriptions Compactness and ci1.

Figure 6.24: An example of a scatter plot for the descriptions Compactness and ci1.

Figure 6.25: An example of a scatter plot for the descriptions Compactness and ci1.

Figure 6.26: A set of Gaussian plots representing the distribution of the Compactness description in each class.

Figure 6.27: A set of Gaussian plots representing the distribution of the ci1 description in each class.

Figure 6.28: A set of Gaussian plots representing the distribution of the ci2 description in each class.

Figure 6.29: A set of Gaussian plots representing the distribution of the ci3 description in each class.

Figure 6.30: A set of Gaussian plots representing the distribution of the ci4 description in each class.

Figure 6.31: A set of Gaussian plots representing the distribution of the ci5 description in each class.

Figure 6.32: A set of Gaussian plots representing the distribution of the ci6 description in each class.

Figure 6.33: A set of Gaussian plots representing the distribution of the Prongs description in each class.

Figure 6.34: Invariance test dessert spoon images in red compared to all other dessert spoons, same scale as Figure 6.35

Figure 6.35: Invariance test dessert spoon images in red compared to all other images

Figure 6.36: Invariance test dessert spoon images in red compared to all other dessert spoons, scaled to see the spread comparison

Figure 6.37: Gaussians for spoon invariance tests

Figure 6.38: Gaussians for spoon invariance tests

Figure 6.39: Gaussians for spoon invariance tests

[Tables 6.1 to 6.10 are 18-class confusion matrices of predicted class against true class; only their captions are reproduced here.]

Table 6.1: Initial confusion matrix produced when all images were used for train/test

Table 6.2: Confusion matrix produced after the erroneous images were fixed and the prongs description created

Table 6.3: Confusion matrix produced when the Compactness description was removed

Table 6.4: Confusion matrix produced when the ci1 description was removed

Table 6.5: Confusion matrix produced when the ci2 description was removed

Table 6.6: Confusion matrix produced when the ci3 description was removed

Table 6.7: Confusion matrix produced when the ci4 description was removed

Table 6.8: Confusion matrix produced when the ci5 description was removed

Table 6.9: Confusion matrix produced when the ci6 description was removed

Table 6.10: Confusion matrix produced when the Prongs description was removed

Bibliography

[1] Robert Fisher. Dermofit. http://homepages.inf.ed.ac.uk/rbf/DERMOFITDATA/lesions.xml. Accessed: 21-11-2015.

[2] Robert Fisher. Introduction to Vision and Robotics: Flat part recognition. www.inf.ed.ac.uk/teaching/courses/ivr/matlab/flatpartrecog. Accessed: 18-12-2015.
