MASARYK UNIVERSITY
FACULTY OF INFORMATICS

Road lane detection for Android

BACHELOR THESIS

Jakub Medvecký-Heretik

Brno, spring 2014

Declaration

I do hereby declare that this paper is my original authorial work, which I have worked on by myself. All sources, references and literature used or excerpted during the elaboration of this work are properly cited and listed in complete reference to the due source.

Jakub Medvecký-Heretik

Advisor: Mgr. Dušan Klinec

Acknowledgement

I would like to thank my supervisor, Mgr. Dušan Klinec, for his continual support and advice throughout this work. I would also like to thank my family and friends for their support.

Abstract

The aim of this bachelor thesis was to analyze the possibilities of image processing techniques on the Android mobile platform for the purposes of a computer vision application focusing on the problem of road lane detection. Utilizing the results of the analysis, an application was created which detects lines and circles in an image in real time using various approaches to segmentation and different implementations of the Hough transform. The application makes use of the OpenCV library, and the results achieved with its help are compared to the results achieved with my own implementation.

Keywords

computer vision, road lane detection, circle detection, image processing, segmentation, Android, OpenCV, Hough transform

Contents

1 Introduction
2 Computer vision
   2.1 Image processing
   2.2 Techniques
   2.3 Existing libraries
       2.3.1 OpenCV
       2.3.2 VXL
       2.3.3 ImageJ
       2.3.4 libCVD
   2.4 Android
3 Region of interest
4 Segmentation
   4.1 Thresholding
       4.1.1 Global (static) thresholding
       4.1.2 Local (adaptive) thresholding
       Noise removal
   4.2 Canny edge detector
   4.3 Overview
5 Hough transform
   5.1 Hough space
   5.2 Straight line detection
       5.2.1 Algorithm
   5.3 Circle detection
       5.3.1 Circles with known radii
       5.3.2 Circles with unknown radii
   5.4 Detecting non-analytical shapes
6 Android application - Hough
   6.1 Target platform
   6.2 Implementation
       6.2.1 Segmentation
       6.2.2 Line detection
       Overview
       6.2.3 Vanishing point
       6.2.4 Circle detection
7 Conclusion
   7.1 Future development
Bibliography
A FpsMeter logs
B Screenshot gallery
C Electronic attachments

1 Introduction

Car manufacturers are constantly developing new innovative technologies which help drivers avoid collisions and improve safety on the road. With the arrival of cameras and various sensors, research in this area has greatly accelerated. Today's modern smartphones are compact, portable and equipped with powerful hardware, sensors and high-resolution cameras. If accompanied by proper software, these smartphones can easily be turned into driver assistants and help in driving safely. In order to evaluate possibly dangerous situations, such software must be able to detect and analyze the key elements in an image captured from a camera: the road, the road's boundaries, traffic signs, potential obstacles, etc.

This thesis describes the computer vision problem of road lane detection and the real-time processing of an image acquired from the smartphone's camera on the Android mobile platform. The phases of this processing happen in the following order:

1. a color image is retrieved from the device's camera
2. the color image is converted to grayscale
3. a binary image with visible road lanes is produced by segmentation of the grayscale image
4. lines are detected in the binary image using the Hough transform
5. the detected lines are overlaid on the color image
6. the resulting color image with detected lines is displayed on the screen of the device

The second chapter of this work is an introduction to computer vision and image processing. At the end of this chapter I compare the existing image processing libraries and choose the most suitable one for my application.

In the next chapter I identify the region of interest for the segmentation phase.

The fourth chapter specifies the different approaches which can be used for extracting the region of interest from an image and determines which technique is the most suitable one for my application.


The fifth chapter explains how the Hough transform for detecting different shapes in an image works.

Different implementations of the Hough transform, their comparison and their integration into the resulting application are described in the sixth chapter.

The last chapter provides a functionality summary of the resulting application and outlines the possibilities of further improvement and extension of this application in the future.

2 Computer vision

Computer vision (CV) is a field of computer science in which computers are used to collect visual information from the real world, process, analyze and understand it, and then deduce a useful result from this information. CV is used to help humans perform their daily tasks and to enhance performance in situations where CV is able to obtain far better results and be more perceptive than human vision. Applications of CV range from industrial machine vision systems to research into artificial intelligence and computers or robots that can comprehend the world around them [1]. One of the many areas computer vision investigates is road lane detection.

2.1 Image processing

Digital image processing and analysis are the fields most closely related to CV and are critically important in solving CV problems. CV's main goal is to emulate human vision in computers and achieve results such as object recognition, feature detection, autonomous driving, etc. In order to achieve that, the acquired input image must first be processed so that only information relevant to the required goal is present in the output image. This is called digital image processing (DIP).

DIP is the application of transformations and filters to digital images (most often 2D) which yields a processed output image. The most common uses of DIP are: reconstructing damaged or missing information in an image, preparing or enhancing images for further use, and registering, segmenting, identifying and measuring objects in images.

For a further introduction to the field of image processing I recommend the book Digital Image Processing [2].

2.2 Techniques

There are many different techniques and methods in DIP which we can use to transform input images. The selection of the appropriate technique strongly depends on the result we want to achieve. Some DIP techniques can even be combined, and experimenting with the order in which we apply them to an image yields different results. The specific techniques that need to be applied to an image to make it ready for road lane detection are covered in depth in the following chapters.

2.3 Existing libraries

There are already many existing libraries which implement most of the DIP techniques in a highly optimized form. In this chapter I mention just the ones most suitable for an application on the Android mobile platform.

2.3.1 OpenCV

Open Source Computer Vision Library1 (OpenCV) is an open source computer vision and machine learning software library. It was built to provide a common infrastructure for computer vision applications and to accelerate the widespread use of machine perception. The library contains more than 2500 optimized algorithms. It is the most popular CV library, with a user community of more than 47 thousand people and an estimated number of downloads exceeding 7 million. The library is used extensively in research groups and in companies such as Google, Yahoo, Microsoft, Intel, IBM and others. OpenCV is written natively in C++, has interfaces in C, Python, Java and MATLAB, and supports Windows, Linux, Android and Mac OS. The latest version is 2.4.8, released on 31 December 2013 [3].

1. http://opencv.org


2.3.2 VXL

VXL2 (the Vision-something-Libraries) is a collection of libraries designed for computer vision research and implementation. It is written in C++ and is designed to be portable over many platforms. The idea is to replace X with one of many letters, e.g., G (VGL) is a geometry library, N (VNL) is a numerics library, I (VIL) is an image processing library, etc. For a more detailed description of the libraries see the VXL book [4].

2.3.3 ImageJ

ImageJ3 is a public domain, Java-based image processing program developed at the National Institutes of Health. It was designed with an open architecture that provides extensibility via Java plugins and recordable macros. Custom acquisition, analysis and processing plugins can be developed using ImageJ's built-in editor and a Java compiler. User-written plugins make it possible to solve many image processing and analysis problems. ImageJ's plugin architecture and built-in development environment have made it a popular platform for teaching image processing.

2.3.4 libCVD

libCVD4 is a very portable, high-performance C++ library for computer vision, image and video processing. The library is designed in a loosely coupled manner, so that its parts can easily be used in isolation if the whole library is not required. The emphasis is on simple and efficient image and video handling and high-quality implementations of common low-level image processing functions. The library is said to be a competitor of OpenCV in terms of speed and should be seriously considered when we seek to achieve image processing in real time.

2. http://vxl.sourceforge.net
3. http://imagej.nih.gov/ij
4. http://www.edwardrosten.com/cvd

2.4 Android

The key elements to consider when choosing the appropriate library for a CV application that is supposed to solve road lane detection problems and run on a mobile platform are:

• Speed - we want to detect road lanes in real time, which means that the library has to be highly optimized, because we need to process multiple frames, make our own computations and draw the desired result back to the user every second.

• Low complexity - we want the algorithms used to be as efficient as possible and have a low complexity, because of the limited hardware in smartphones.

• Low battery consumption - smartphones are powered by batteries with limited capacity, and we cannot assume with absolute certainty that the smartphone running the application will be connected to a power source at all times.

• Android ecosystem - the library should be written in Java, or at least provide a port for the Android operating system through JNI5, so that it can easily be integrated into the application, since Android applications are developed mainly in Java.

Taking all these aspects into account, I have chosen the OpenCV library for the following reasons:

– It is the most advanced open source CV library with the widest community support.

– It is designed to be high-performance. The algorithms are written in C++ and compiled to highly optimized native code, which suits our needs since Android runs on a Linux kernel.

– Even though the algorithms are written in C++, OpenCV maintains a direct port for the Android platform - it includes all the algorithms that can be found in the desktop version in the form of native Android libraries and at the same time provides Java interfaces for access to those algorithms.

– It has relatively low power consumption [5].

5. http://en.wikipedia.org/wiki/Java_Native_Interface

3 Region of interest

When we get the input image from the smartphone's camera (figure 3.1), we first have to identify our region of interest (ROI) (figure 3.2) and then extract the ROI from that image.

Figure 3.1: Road [6]

Figure 3.2: Region of interest

The main goal of the segmentation phase is to separate the ROI from the input image and preserve the left and right road lanes for further use - line detection.
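As an illustration, a minimal sketch of such a ROI extraction with OpenCV's Java binding might look as follows; the method name and the fixed half-height split are assumptions (chapter 4 explains why the top half is discarded):

import org.opencv.core.Mat;

// Keep only the bottom half of the grayscale frame as the ROI
// (a hypothetical helper; "gray" is the full grayscale input image).
Mat extractRoi(Mat gray) {
    int h = gray.rows();
    // submat() returns a view into the original data, so no pixels are copied
    return gray.submat(h / 2, h, 0, gray.cols());
}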

4 Segmentation

If we want to use the Hough transform1 in the shape detection phase later, we need to get a binary image with detected edges out of this phase. A binary image consists of just two intensity values: 1 and 0. Ones represent edges (white pixels) whereas zeroes mean non-edges (black pixels). Since the Hough transform algorithm loops through the input image and makes certain computations at every position of a white pixel, we want the output image of the segmentation phase to contain as few white pixels as possible - this way we achieve better performance. The input image which goes into the segmentation phase is in grayscale. As part of segmentation I ignore the top half of the input image completely (figure 4.1), because it is above the horizon and lies out of the ROI; it is therefore not important, and processing it would only degrade performance. In this chapter we are looking for the best result/performance ratio. We measure the performance of each technique in terms of frame processing frequency, also known as frames per second (FPS).

Figure 4.1: Input image for segmentation

1. Hough transform is covered in-depth in the 5th chapter.

4.1 Thresholding

Thresholding is the simplest method of image segmentation. The pixels are divided into regions based on their intensity: pixels with an intensity higher than a predefined threshold value belong to a different region than pixels with an intensity lower than the threshold.

4.1.1 Global (static) thresholding

In global thresholding, the threshold is equal for all pixels in the image (independent of their position in the image). This thresholding operation can be expressed as:

$$\mathrm{dst}(x, y) = \begin{cases} 255 & \text{if } \mathrm{src}(x, y) > \mathit{thresh} \\ 0 & \text{otherwise} \end{cases}$$

where dst(x, y) is the intensity value of the pixel at position (x, y) in the destination image, src(x, y) is the intensity of the pixel at (x, y) in the source image and thresh is the numeric threshold value.
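A minimal sketch of this operation using OpenCV's Java binding; the Mat names are assumptions, and the threshold value 175 is the one used later in section 4.3:

import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Global thresholding: every pixel above 175 becomes white (255),
// everything else black (0). "gray" is the grayscale input image.
Mat binary = new Mat();
Imgproc.threshold(gray, binary, 175, 255, Imgproc.THRESH_BINARY);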

Figure 4.1.1.1: Global thresholding with thresh = 175

Figure 4.1.1.2: Global thresholding in different scene

Figure 4.1.1.3: Global thresholding in scene with shadows


As we can see (figure 4.1.1.1), this technique gives good results when the threshold is carefully adjusted for a particular scene. Yet when the scene changes (figure 4.1.1.2, figure 4.1.1.3), it gives inaccurate and unusable results, and the threshold must be adjusted again. Even Otsu's method2, which determines the threshold value automatically, does not work correctly in this situation, because it is not invariant to uneven illumination in an image [7]. The main drawbacks of the global thresholding method are that it gives bad results when the camera is tilted upwards or downwards even slightly, when the scenery changes and when there is varying illumination in the image. Another drawback of this method is that it returns too many white pixels, which would later slow down the shape detection algorithm.

4.1.2 Local (adaptive) thresholding

In this method, the threshold value depends on the position of the pixel in the image. A new threshold is calculated for every pixel from its neighboring pixels. This thresholding operation can be expressed as:

$$\mathrm{dst}(x, y) = \begin{cases} 255 & \text{if } \mathrm{src}(x, y) > T(x, y) \\ 0 & \text{otherwise} \end{cases}$$

where T(x, y) is the mean of the intensity values of all pixels in the n × n neighborhood of (x, y), minus a constant C. With regard to performance I chose n = 3, the lowest possible value.
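In OpenCV's Java binding this operation corresponds to adaptiveThreshold; a minimal sketch with the values chosen above (the Mat names are assumptions):

import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Local (adaptive) thresholding: T(x, y) is the mean of the 3x3
// neighborhood minus C = -1.5; "gray" is the grayscale input image.
Mat binary = new Mat();
Imgproc.adaptiveThreshold(gray, binary, 255,
        Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY,
        3,      // n: size of the neighborhood (block size)
        -1.5);  // C: constant subtracted from the neighborhood mean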

Figure 4.1.2.1: Local thresholding with C = −0.5

We can observe (figure 4.1.2.1) that this value of C is too high, and that a very large amount of noise (insignificant white pixels) is present, which needs to be filtered out.

2. http://en.wikipedia.org/wiki/Otsu's_method

Noise removal

After setting the value of C lower (figure 4.1.2.2), I made the following attempts to reduce the noise:

1. Erosion3 - I used a 2 × 2 rectangle structuring element (SE) with the morphological operation called erosion, which eliminated most of the noise but still left some present (figure 4.1.2.3); a code sketch of this step follows after this list.

Figure 4.1.2.2: Local thresholding with C = −1.5

Figure 4.1.2.3: Erosion with 2 × 2 SE after local thresholding

If I used bigger dimensions for the SE (figure 4.1.2.4), some important information could be lost: the road lines would become disconnected, which would make them harder to detect.

Figure 4.1.2.4: Erosion with 3 × 3 SE after local thresholding

2. Median4 - The median filter is extremely useful for noise reduction in grayscale and color images. In binary images with detected edges, however, it has paradoxically proved to be ineffective.

3. http://homepages.inf.ed.ac.uk/rbf/HIPR2/erode.htm
4. http://en.wikipedia.org/wiki/Median_filter


Since edges are just thin white lines and the neighborhood of every pixel is in the majority of cases black, the resulting median of the neighborhood will in almost all cases be black as well, and the important pixels forming lines will get erased together with the noise.

3. Gaussian blur5 - This was also an unsuccessful idea. I blurred the grayscale image beforehand to eliminate possible noise and then applied an adaptive threshold; the result was almost identical to the one with no blurring preceding thresholding (figure 4.1.2.2), while the FPS dropped drastically.
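For reference, the erosion step from attempt 1 could be sketched as follows in OpenCV's Java binding (a 2 × 2 rectangular SE, as above; the Mat names are assumptions):

import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

// Erode the thresholded binary image with a 2x2 rectangular
// structuring element to remove isolated white noise pixels.
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(2, 2));
Mat eroded = new Mat();
Imgproc.erode(binary, eroded, kernel);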

4.2 Canny edge detector

The Canny edge detector was developed by J. F. Canny in 1986. His aim was to develop an algorithm that is optimal with respect to the following criteria [8]:

1. Detection: The probability of detecting real edge points should be maximized while the probability of falsely detecting non-edge points should be minimized. This corresponds to maximizing the signal-to-noise ratio6.

2. Localization: The detected edges should be as close as possible to the real edges.

3. Number of responses: One real edge should not result in more than one detected edge.

This particular algorithm uses a double threshold for edge detection. Let us assume the first threshold to be T1; the second threshold should then be set to T2 = 2 ∗ T1. Two output images are produced: an image from T1, which has many false edges, and an image from T2, which contains fewer edges but has gaps in the contours. The algorithm then combines the results from T1 and T2 in such a manner that it links the edges of T2 into contours until it reaches a gap, then links the T2 edge with edge pixels from a T1 contour until a T2 edge is found again [9].

5. http://en.wikipedia.org/wiki/Gaussian_blur
6. http://en.wikipedia.org/wiki/Signal-to-noise_ratio


The output of this algorithm is an image with thin edges and a small number of white pixels, which is our desired outcome. Incorrectly chosen threshold values can cause the output image either to contain redundant information (figure 4.2.1) or to lose important information (figure 4.2.2).

Figure 4.2.1: Canny with too low thresholds

Figure 4.2.2: Canny with too high thresholds

To get threshold values which change dynamically as the scenery changes, I use the mean of the grayscale image's intensity values. I then set the low threshold to T1 = 0.66 ∗ mean and the high threshold to T2 = 1.33 ∗ mean (figure 4.2.3).

Figure 4.2.3: Canny with thresholds calculated from mean
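For illustration, this dynamic threshold computation can be sketched as follows in OpenCV's Java binding (the Mat names are assumptions):

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Derive both Canny thresholds from the mean intensity of the frame
double mean = Core.mean(gray).val[0];
Mat edges = new Mat();
Imgproc.Canny(gray, edges, 0.66 * mean, 1.33 * mean);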

4.3 Overview

In this section I measure and compare the performance of the aforementioned segmentation techniques against my goal, which is to:

– minimize the average number of white pixels in the output image
– maximize the average measured FPS (see Attachment A)
– keep the road lanes visible

The compared techniques are:

• GT (global thresholding with threshold = 175)
• LT (local thresholding with n = 3 and C = −1.5)
• LTE (LT followed by an erosion with a 2 × 2 SE)
• CED (Canny edge detector with thresholds calculated from the mean)

Figure 4.3.1: Performance overview of segmentation techniques

We can see that CED delivered the most accurate results, with little to no noise present (figure 4.2.3). It also outputs the images with the lowest number of white pixels (figure 4.3.1), which will not slow down later processing as much as the other methods would. Still, the LTE technique delivered comparable results while preserving a higher FPS; therefore I consider the LTE technique more suitable for my application.

5 Hough transform

The Hough transform is a powerful technique which can be used to detect features of a particular shape within an image. It is named after Paul Hough, who patented the method in 1962. For the complete history of the Hough transform I recommend the article How the Hough transform was invented [10]. The standard Hough transform requires that the features of the desired shape can be described by a parametric equation1; therefore it is most commonly used for the detection of lines, circles, ellipses, etc. The main advantage of the Hough transform is that it is tolerant of gaps in the edges of shapes to some extent and is relatively unaffected by noise and uneven illumination in an image.

5.1 Hough space

A straight line can be described in a 2D coordinate system in numerous different ways. For example:

• in a Cartesian coordinate system (CCS) with parameters (m, b):

y = m ∗ x + b (5.1)

where m is the slope and b is the y-axis intercept (figure 5.1.1).

Figure 5.1.1: Line in Cartesian coordinate system

Equation 5.1 is not convenient for us to use, because we cannot describe vertical lines with it - the parameter m becomes infinite.

1. http://en.wikipedia.org/wiki/Parametric_equation

• Instead, we shall represent a line in a polar coordinate system (figure 5.1.2) with parameters (r, θ):

$$y = -\frac{\cos\theta}{\sin\theta} \cdot x + \frac{r}{\sin\theta}$$

where r is the length of the perpendicular dropped from the origin to the line and θ is the orientation angle of r with respect to the x-axis.

Figure 5.1.2: Line in polar coordinate system [11]

We express the parameter r:

r = x ∗ cosθ + y ∗ sinθ (5.2)

Each point [x0, y0] of the x, y-plane (figure 5.1.3) gives, through equation 5.2, a sinusoid (figure 5.1.4) in the so-called Hough space (the r, θ-plane).

Figure 5.1.3: Point in CCS

Figure 5.1.4: Point from CCS projected to Hough space

5.2 Straight line detection

If we transform points which lie on the same line in the CCS (figure 5.2.1), we can see that their corresponding sinusoids in the Hough space (HS) all intersect at one point (figure 5.2.2).

Figure 5.2.1: Points in CCS

Figure 5.2.2: Points from CCS projected to HS

This is why the Hough transform is so robust. Lines in the input image which are interrupted, dashed or even partially damaged will still get detected, because their undamaged segments will form the intersection points which indicate the presence of lines. So if we search for local maxima in the HS, extract these points of intersection, map them back to Cartesian space and overlay this image on the original image, we get the detected lines (figure 5.2.3).

Figure 5.2.3: Detected line


5.2.1 Algorithm

There are many methods which we can employ to extract the local maxima from the HS (i.e., to de-Hough). If we represent the HS as a grayscale image, we can use simple thresholding followed by thinning2 to isolate these bright spots. However, in a computer implementation of the Hough transform we use a so-called "accumulator array" and a "voting procedure". The accumulator array is a 2D number array which represents the HS:

A[θ, r], where θ ∈ [θmin, θmax], r ∈ [rmin, rmax]

rmin and rmax are the minimum and maximum values of r that we can possibly have, and θmin and θmax are the minimum and maximum values of the range of discrete θ values for which we compute r.

Voting procedure (note that for θ ∈ [0°, 180°) the computed r can be negative, so it is offset by rMax before being used as an array index):

// Initialize the accumulator array with zeroes
houghSpace[thetaMax][2 * rMax] = {0};

// Loop through every pixel of the input image
for (x = 0; x < width; x++) {
    for (y = 0; y < height; y++) {

        // Process only edge (white) pixels
        if (image.get(x, y) == WHITE) {

            // Loop through every discrete theta (in degrees)
            for (t = 0; t < thetaMax; t++) {

                // Compute r; theta is converted to radians and the
                // result is rounded and offset to form a valid index
                r = round(x * cos(radians(t)) + y * sin(radians(t))) + rMax;

                // Cast a vote for the cell (t, r)
                houghSpace[t][r]++;
            }
        }
    }
}

2. http://homepages.inf.ed.ac.uk/rbf/HIPR2/thin.htm


De-Houghing: cells in the accumulator array which get enough votes (i.e., have a greater value than the threshold) strongly indicate a line.

// Loop through the whole accumulator array
for (t = 0; t < thetaMax; t++) {
    for (r = 0; r < 2 * rMax; r++) {

        // If the cell has enough votes, it represents a line
        if (houghSpace[t][r] > threshold) {
            lines[i] = new line(t, r - rMax);  // undo the offset applied while voting
            i++;
        }
    }
}
return lines;

5.3 Circle detection

The Hough transform for detecting circles works roughly analogously to the straight line detection described above. The equation in the CCS changes to the parametric equation of a circle (figure 5.3.1), now with three parameters (a, b, r):

x = a + r ∗ cosθ
y = b + r ∗ sinθ

where (a, b) is the center of the circle, r is its radius and 0 ≤ θ ≤ 2π.

Figure 5.3.1: Circle in CCS

From that, we again express the parameters:

a = x − r ∗ cosθ and b = y − r ∗ sinθ


5.3.1 Circles with known radii

If we are looking for circles with a specific radius (figure 5.3.1.1), the complexity of the algorithm does not change: we have three for loops, two parameters and a 2D accumulator array.

Figure 5.3.1.1: Hough transform for circles with r = 30

Accumulator array (figure 5.3.1.2):

A[x, y], where x ∈ [xmin, xmax], y ∈ [ymin, ymax]

xmin and ymin are equal to 0, and xmax and ymax are equal to the width and the height of the input image respectively.

Figure 5.3.1.2: HS in transform with r = 30

The voting procedure is similar to the one used when detecting lines; only the equations change (the computed center (a, b) is rounded and must lie within the image bounds):

houghSpace[xMax][yMax] = {0};
...
// Compute the candidate center (a, b)
a = x - r * cos(t);
b = y - r * sin(t);

// Cast a vote for the center (a, b)
houghSpace[a][b]++;
...

22 5. HOUGHTRANSFORM

De-Houghing is again almost identical:

...
if (houghSpace[x][y] > threshold) {
    circles[i] = new circle(x, y, r);
...

5.3.2 Circles with unknown radii

Normally, we do not know the radius of the circle that we are looking for (figure 5.3.2.1). We need to specify the range of radii which the circle could possibly have, loop through it and collect votes as before. The computational complexity of the algorithm now increases, as we have 3 parameters, a 3D accumulator array and 4 nested for loops. In general, the computation time and the size of the accumulator array increase polynomially with the number of parameters [12]. Accumulator array:

A[x, y, r]; x ∈ [xmin, xmax], y ∈ [ymin, ymax], r ∈ [rmin, rmax]

rmin and rmax define the minimum and maximum radii of the circles that we want to detect. The voting procedure has one more for loop; note that the vote must be cast at the computed center (a, b), not at the edge pixel (x, y):

houghSpace[xMax][yMax][rMax] = {0};

for (x = 0; x < width; x++) {
    for (y = 0; y < height; y++) {
        if (image.get(x, y) == WHITE) {

            // Loop through the range of possible radii
            for (r = rMin; r < rMax; r++) {
                for (t = 0; t < thetaMax; t++) {

                    a = x - r * cos(t);
                    b = y - r * sin(t);

                    // Vote for the center (a, b) at radius r
                    houghSpace[a][b][r]++;
                }
            }
        }
    }
}
...


De-Houghing:

...
if (houghSpace[x][y][r] > threshold) {
    circles[i] = new circle(x, y, r);
...

Figure 5.3.2.1: Hough transform for circles with r ∈ [10, 60]

5.4 Detecting non-analytical shapes

A modification of the Hough transform which uses a template matching principle, called the Generalised Hough Transform (GHT), was introduced by D. H. Ballard in 1981 [13]. This modification allows the Hough transform to detect not only shapes described by an analytic equation (e.g. a line, circle, ellipse, etc.) but also any arbitrary object described by its model.

Figure 5.4.1: Non-analytical shape with its R-table [14]

In order to use the GHT, we first need to compute the centroid (x0, y0) of the template shape. Then we can generate the R-table, which is basically a table of values r and φ, where for each edge point (x, y), r is its displacement to the centroid and φ is the gradient angle at that edge point (figure 5.4.1). Then we can apply the GHT algorithm [14], which loops through the entries in the R-table.
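A rough sketch of the R-table construction in Java might look as follows; the edge points, their gradient angles (assumed in [0, 2π)) and the centroid are assumed to be given, and the 4-degree quantization of φ is an arbitrary choice:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Build the R-table: for every edge point store the displacement
// vector to the centroid, indexed by the quantized gradient angle phi.
Map<Integer, List<double[]>> buildRTable(List<double[]> edgePoints,  // {x, y}
                                         List<Double> gradientAngles,
                                         double x0, double y0) {
    final int bins = 90; // quantize phi into 4-degree bins (assumption)
    Map<Integer, List<double[]>> rTable = new HashMap<Integer, List<double[]>>();
    for (int i = 0; i < edgePoints.size(); i++) {
        double[] p = edgePoints.get(i);
        int bin = (int) (gradientAngles.get(i) / (2 * Math.PI) * bins) % bins;
        List<double[]> cell = rTable.get(bin);
        if (cell == null) {
            cell = new ArrayList<double[]>();
            rTable.put(bin, cell);
        }
        // displacement vector r from the edge point to the centroid
        cell.add(new double[] { x0 - p[0], y0 - p[1] });
    }
    return rTable;
}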

6 Android application - Hough

As part of this work, I developed an Android application called Hough, which implements all of the techniques mentioned in the previous chapters and detects road lanes in real time. My testing device was a Samsung Galaxy Ace (Samsung S5830), which was released in 2011. It features an 800 MHz Qualcomm processor and 384 MB of RAM and is officially upgradable to Android version 2.3.6 (also released in 2011). Considering this technical specification, we can classify this smartphone with three-year-old hardware and software as a low-end device and expect significantly better performance of my application on modern devices.

6.1 Target platform

Google, the owner of Android, regularly releases public Application Programming Interface (API) updates for developers, through which a developer can use the device's built-in functions and functionality and interact with the underlying Android system. As the API version increases, functionality adds up, but we have to keep in mind that a lower API version may support more devices. My intent was to support as many devices as possible, so I chose the minimal API version with the necessary functions for my application - version 10. According to the official dashboard statistics from April 1, 2014, only 1.1% of the devices running Android worldwide will be left out and unable to run it (figure 6.1.1).

Figure 6.1.1: Android versions worldwide [15]

6.2 Implementation

My application is based on a sample application called OpenCV Tutorial 2 - Mixed Processing [16], which comes as part of the OpenCV library. This sample provided me with the basic structure of the application, along with a method which retrieves the input image from the camera in real time and hands it over for further processing.

6.2.1 Segmentation

In my application I extract the ROI from the input image as explained in the 4th chapter, using the LTE technique. I left all the code used for segmentation in the application's source files, including the (commented-out) code for the techniques which proved to be unsuccessful, in case someone wants to use or improve it. I also draw a thin green line at the border of segmentation (half of the screen), so the user knows where line detection begins and can tilt the smartphone and its camera accordingly.

6.2.2 Line detection

The threshold used in the de-Houghing process of all of the techniques below can be set freely at runtime in the application's settings (AS), so the user can see and compare the results for different values.

The first technique implemented is OpenCV's HoughLines1, which detects straight lines in the image and returns a vector with 2-element vectors of r and θ for each line. Here I had to write code for converting those lines from polar coordinates into the CCS (a sketch of such a conversion appears at the end of this section).

The second technique is OpenCV's HoughLinesP2, which finds line segments in an image and uses the more efficient implementation of the Hough line transform described in Robust Detection of Lines Using the Progressive Probabilistic Hough Transform [17]. This returns a vector of 4 coordinates, which describe the starting and ending point

1. http://docs.opencv.org/modules/imgproc/doc/feature_detection.html?highlight=houghlines#houghlines
2. http://docs.opencv.org/modules/imgproc/doc/feature_detection.html?highlight=houghlinesp#houghlinesp


in the CCS, so no conversion is needed. The minimum line segment length and the maximum allowed gap between line segments can again be set freely at runtime in the AS.

The next technique is the Optimized Hough transform for finding lines, written entirely in Java, which I found on the VASE Lab site [18]. I used this technique for comparison with OpenCV's techniques, which are written in C++. The input image from the camera is in OpenCV's Mat format3, and this technique unfortunately works only with bitmaps. Because converting to and from bitmaps in real-time processing lowered the FPS significantly, I had to rewrite VASE's methods to work with Mats instead of bitmaps.

The last technique is my own implementation of the Hough transform, also written in Java, using OpenCV's functions only for drawing the detected lines. I implemented the Hough transform myself to establish a baseline for the efficiency, performance and FPS of a naive and straightforward implementation. I use an accumulator array with θ values ranging from 0 to 180 and r values ranging from 0 to twice the diagonal of the image, so that it covers all the possible values of θ and r. I also pre-compute the sin and cos values for every θ, which improves the performance greatly.
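For illustration, the polar-to-Cartesian conversion mentioned above can be sketched as follows; this mirrors the approach used in the OpenCV Hough tutorials, the frame and drawing parameters are assumptions, and Core.line is the drawing call in the OpenCV 2.4 Java binding:

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Point;
import org.opencv.core.Scalar;

// Convert one detected line (r, theta) from polar form back to the CCS
// and draw a long segment through it onto the color frame.
void drawPolarLine(Mat frame, double r, double theta) {
    double a = Math.cos(theta), b = Math.sin(theta);
    double x0 = a * r, y0 = b * r;  // foot of the perpendicular from the origin
    // extend the line far in both directions along its direction vector (-b, a)
    Point p1 = new Point(x0 - 1000 * b, y0 + 1000 * a);
    Point p2 = new Point(x0 + 1000 * b, y0 - 1000 * a);
    Core.line(frame, p1, p2, new Scalar(0, 255, 0), 2);
}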

Figure 6.2.2.1: Detected lines (screenshot from my application)

3. http://docs.opencv.org/modules/core/doc/basic_structures.html#mat


Overview

In this section I measure and compare the performance of the implemented techniques. The input image for this phase is an image retrieved from the camera and segmented using the LTE approach. For these measurements I use the default parameter values: threshold = 70 votes, minLineSize = 100 pixels (minimum line segment length) and maxLineGap = 100 pixels (maximum allowed gap between line segments). I am looking for the highest average measured FPS (see Attachment A):

Figure 6.2.2.2: Performance overview of line detection techniques

We can see (figure 6.2.2.2) that the OpenCV techniques written in C++ performed significantly better than those written in Java. HoughLinesP returns line segments, not whole lines, so the most suitable technique for my application is OpenCV's HoughLines, which was also the fastest.

6.2.3 Vanishing point

All these techniques on their own return many lines in the image (figure 6.2.2.1), and drawing all of them is both costly in terms of performance and makes the resulting output image hard to read. That is why I


implemented vanishing point detection for OpenCV's HoughLines method. The algorithm determines which two lines (one on the left side and one on the right side) are closest to the middle of the road, which means they represent the borders of the road. Then I find their intersection point at the horizon, which represents the vanishing point of the road, and draw both lines so that they end where they intersect (figure 6.2.3.1).
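The intersection of two lines given in polar form can be computed in closed form by solving the pair of equations x ∗ cosθ + y ∗ sinθ = r with Cramer's rule; a small sketch of such a hypothetical helper:

import org.opencv.core.Point;

// Intersection of two lines x*cos(t) + y*sin(t) = r given as (r1, t1)
// and (r2, t2); returns null when the lines are (nearly) parallel.
Point intersection(double r1, double t1, double r2, double t2) {
    double det = Math.sin(t2 - t1); // = cos(t1)sin(t2) - sin(t1)cos(t2)
    if (Math.abs(det) < 1e-6) {
        return null; // parallel lines have no intersection point
    }
    double x = (r1 * Math.sin(t2) - r2 * Math.sin(t1)) / det;
    double y = (r2 * Math.cos(t1) - r1 * Math.cos(t2)) / det;
    return new Point(x, y);
}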

Figure 6.2.3.1: Detected lines with vanishing point (screenshot from my application)

6.2.4 Circle detection

For circle detection (figure 6.2.4.1) I used OpenCV's HoughCircles4 technique. Only for this technique do I skip the segmentation phase, because it takes a grayscale image as its input. I also developed my own implementation of the Hough transform for circles; again, the goal was to establish a baseline for the performance and efficiency of a straightforward implementation. The minimum and maximum radius of the circles to detect can be specified at runtime in the AS. However, if they are both set to the same value (i.e., we know the radius of the circle which we want to detect beforehand), the faster method, which uses a 2D accumulator array instead of a 3D one, is used.

4. http://docs.opencv.org/modules/imgproc/doc/feature_detection.html?highlight=houghcircles#houghcircles

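A minimal sketch of the HoughCircles call in the OpenCV 2.4 Java binding; the parameter values below are placeholders for illustration, not the values used in the application:

import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Detect circles in the grayscale frame with the Hough gradient method.
// Each column of "circles" holds one detected (x, y, radius) triple.
Mat circles = new Mat();
Imgproc.HoughCircles(gray, circles, Imgproc.CV_HOUGH_GRADIENT,
        1,               // dp: inverse ratio of accumulator resolution
        gray.rows() / 4, // minDist: minimum distance between centers
        100,             // param1: upper Canny threshold used internally
        50,              // param2: accumulator threshold for centers
        10, 60);         // minimum and maximum radius to search for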

Figure 6.2.4.1: Detected circle (screenshot from my application)

Again, we can see in figure 6.2.4.2 that OpenCV's method performed much better (see Attachment A).

Figure 6.2.4.2: Performance overview of circle detection techniques

OpenCV's HoughCircles method performed better because it implements a more efficient version of the Hough transform called the Hough gradient method; for more details see the book Learning OpenCV: Computer Vision with the OpenCV Library [19]. It is also highly optimized and written in C++, whereas my implementation is straightforward, written in Java and includes a segmentation phase, which slows down the processing.

7 Conclusion

The goal of this thesis was to investigate image processing and shape detection techniques on the Android mobile platform. The task was to evaluate which techniques can be used in solving the computer vision problem of road lane detection.

According to the results of this analysis, an application was created which implements different approaches to using the Hough transform for line and circle detection. A simple and straightforward implementation was compared to the results achieved using the third-party OpenCV library.

The resulting application is able to detect road lanes as well as horizontal circular traffic signs in real time. The approach with vanishing point detection implemented resulted in a slightly higher average FPS than the fastest technique without it, because it does not need to draw as many lines. The average measured FPS of this approach reached 2.36 (see Attachment A) on a low-end device released in 2011.

The Hough application can be used as a learning tool for a better understanding of the Hough transform and of image processing on mobile devices, because of the variety of techniques it covers, the option to set the parameters freely at runtime and the FPS counter visible on-screen during runtime. With further development it could be used as a digital driver's assistant application.

7.1 Future development

This application could be improved by implementing lane tracking and techniques able to detect curved road lanes. Lane tracking is much easier than lane detection: lane detection needs to run only once at the start, to initialize the lane model. Considering that there are only small changes between two consecutive frames, we can regard the estimated parameters of the lane model in the previous frame as the initial parameters for the current frame. The work Lane detection and tracking using B-Snake [20] proposes a B-Snake based lane detection and tracking algorithm. The proposed B-Snake lane


model can describe a wide range of lane structures, since a B-Spline1 can form any arbitrary shape using a set of control points [20].

Further functionality could be added in the future, such as obstacle detection; this can be achieved using the watershed algorithm, as the work Road Segmentation and Obstacle Detection by a Fast Watershed Transformation [21] suggests. Another addition could allow the application to recognize road signs. Converting characters in an image into a computer-readable form is a separate computer vision problem called optical character recognition2.

With these elements detected, various visual and auditory alerts could be presented to the driver when the car passes over a road border lane (this could potentially prevent accidents caused by microsleep) or when the car is too close to an obstacle. If we take position data from the built-in GPS sensor, we can determine the average velocity of the vehicle, compare it to the speed limit on the last recognized road sign and display another alert accordingly.

The problem of road lane detection and the applications trying to solve it have great potential for daily use. Every technique used in this application could be the subject of further improvement through different approaches or by taking advantage of new hardware.

1. http://en.wikipedia.org/wiki/B-spline
2. http://en.wikipedia.org/wiki/Optical_character_recognition

Bibliography

[1] Computer vision [online], updated 28. 3. 2014, [cited 5. 4. 2014], Wikipedia.

[2] Gonzalez, R. C., Woods, R. E. Digital image processing. 3rd ed. Upper Saddle River, N.J.: Pearson Prentice Hall, 2008. ISBN 978-0-13-168728-8.

[3] ABOUT | OpenCV [online], updated 2014, [cited 7. 4. 2014], OpenCV Developers Team: itseez.com.

[4] VXL [online], updated 1. 5. 2013, [cited 7. 4. 2014], VXL Developers: http://vxl.sourceforge.net/developers.html.

[5] Ammar, A., et al. OpenCV Based Real-Time Video Processing Using Android Smartphone [online], updated December 2011, [cited 7. 4. 2014].

[6] Jones, T. Best Road Trip Hints for Kids? [online], updated 21. 3. 2010, [cited 16. 4. 2014].

[7] Thresholding [online], [cited 10. 4. 2014].

[8] Kalra, P. K. Canny Edge Detection [online], updated 23. 3. 2011, [cited 10. 4. 2014].

[9] Petrakis, E.G.M. Canny Edge Detector [online], updated 13. 12. 2010, [cited 10. 4. 2014].

[10] Hart, P.E. How the Hough transform was invented [online], IEEE Signal Processing Magazine, vol. 26, no. 6, p. 18-22, November 2009, [cited 28. 4. 2014].

[11] Hough Line Transform | OpenCV 2.4.9.0 documentation [online], updated 21. 4. 2014, [cited 28. 4. 2014], OpenCV Developers Team: itseez.com.

[12] Fisher, R., Perkins, S., Walker, A., Wolfart, E. Hough Transform [online], Hypermedia Image Processing Reference, 2003, [cited 29. 4. 2014].

[13] Ballard, D.H. Generalizing the Hough Transform to Detect Arbitrary Shapes [online], Pattern Recognition, vol. 13, no. 2, p. 111-122, 1981, [cited 30. 4. 2014].

[14] Yilmaz, A., Shah, M. Hough Transform Lecture-18 [online], CAP 5415 Computer Vision, Fall 2012, [cited 30. 4. 2014].

[15] Dashboards [online], updated 1. 4. 2014, [cited 1. 5. 2014], Google Android Developers.

[16] OpenCV Tutorial 2 - Mixed Processing [online], updated 5. 4. 2013, [cited 1. 5. 2014], OpenCV Developers Team: itseez.com.

[17] Matas, J., Galambos, C., Kittler, J.V. Robust Detection of Lines Using the Progressive Probabilistic Hough Transform [online], Computer Vision and Image Understanding, vol. 78, no. 1, p. 119-137, 2000, [cited 1. 5. 2014].

[18] Oeschle, O. Finding Straight Lines with the Hough Transform [online], updated 19. 5. 2012, [cited 1. 5. 2014], Vision and Synthetic Environments Laboratory ("VASE Lab"), School of Computer Science and Electronic Engineering, University of Essex.

[19] Bradski, G., Kaehler, A. Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, p. 580, 1. 10. 2008. ISBN-13: 9780596516130.

[20] Wang, Y., Teoh, E. K., Shen, D. Lane detection and tracking using B-Snake [online], Image and Vision Computing, vol. 22, p. 269-280, 2004, [cited 13. 5. 2014].

[21] Beucher, S., Bilodeau, M. Road Segmentation and Obstacle Detection by a Fast Watershed Transformation [online], Proceedings of the Intelligent Vehicles '94 Symposium, p. 296-301, 24-26. 10. 1994, [cited 13. 5. 2014].

A: FpsMeter logs

Segmentation

Global thresholding

04-11 02:12:40.979: I/FpsMeter(1779): 18.87 FPS@480x320
04-11 02:12:42.199: I/FpsMeter(1779): 16.31 FPS@480x320
04-11 02:12:43.099: I/FpsMeter(1779): 22.39 FPS@480x320
04-11 02:12:44.019: I/FpsMeter(1779): 21.58 FPS@480x320
04-11 02:12:45.279: I/FpsMeter(1779): 15.96 FPS@480x320
04-11 02:12:46.229: I/FpsMeter(1779): 20.97 FPS@480x320
04-11 02:12:47.119: I/FpsMeter(1779): 22.43 FPS@480x320
04-11 02:12:48.019: I/FpsMeter(1779): 22.40 FPS@480x320
04-11 02:12:48.939: I/FpsMeter(1779): 21.63 FPS@480x320
04-11 02:12:49.859: I/FpsMeter(1779): 21.68 FPS@480x320

Local thresholding

04-11 02:23:19.839: I/FpsMeter(1828): 15.12 FPS@480x320
04-11 02:23:21.159: I/FpsMeter(1828): 15.14 FPS@480x320
04-11 02:23:22.479: I/FpsMeter(1828): 15.15 FPS@480x320
04-11 02:23:23.969: I/FpsMeter(1828): 13.46 FPS@480x320
04-11 02:23:25.319: I/FpsMeter(1828): 14.77 FPS@480x320
04-11 02:23:26.639: I/FpsMeter(1828): 15.14 FPS@480x320
04-11 02:23:27.969: I/FpsMeter(1828): 15.08 FPS@480x320
04-11 02:23:29.289: I/FpsMeter(1828): 15.19 FPS@480x320
04-11 02:23:30.709: I/FpsMeter(1828): 14.10 FPS@480x320
04-11 02:23:32.239: I/FpsMeter(1828): 13.03 FPS@480x320

Local thresholding followed by an erosion

04-11 02:32:07.729: I/FpsMeter(1974): 10.70 FPS@480x320
04-11 02:32:32.269: I/FpsMeter(1974): 12.12 FPS@480x320
04-11 02:32:33.949: I/FpsMeter(1974): 11.91 FPS@480x320
04-11 02:32:35.559: I/FpsMeter(1974): 12.36 FPS@480x320
04-11 02:32:37.229: I/FpsMeter(1974): 12.10 FPS@480x320
04-11 02:32:38.809: I/FpsMeter(1974): 12.57 FPS@480x320
04-11 02:32:40.219: I/FpsMeter(1974): 14.16 FPS@480x320
04-11 02:32:41.639: I/FpsMeter(1974): 14.14 FPS@480x320
04-11 02:32:21.169: I/FpsMeter(1974): 11.65 FPS@480x320
04-11 02:32:22.789: I/FpsMeter(1974): 12.36 FPS@480x320


Canny edge detector

05-02 20:14:36.989: I/FpsMeter(5667): 9.72 FPS@480x320
05-02 20:14:39.289: I/FpsMeter(5667): 8.70 FPS@480x320
05-02 20:14:41.239: I/FpsMeter(5667): 10.24 FPS@480x320
05-02 20:14:43.289: I/FpsMeter(5667): 9.80 FPS@480x320
05-02 20:14:45.409: I/FpsMeter(5667): 9.44 FPS@480x320
05-02 20:14:47.499: I/FpsMeter(5667): 9.53 FPS@480x320
05-02 20:14:49.569: I/FpsMeter(5667): 9.68 FPS@480x320
05-02 20:14:03.139: I/FpsMeter(5667): 10.10 FPS@480x320
05-02 20:14:05.159: I/FpsMeter(5667): 9.93 FPS@480x320
05-02 20:14:07.159: I/FpsMeter(5667): 9.96 FPS@480x320

Hough transform for lines

OpenCV's HoughLines

05-02 21:05:44.119: I/FpsMeter(6182): 2.21 FPS@480x320
05-02 21:05:53.439: I/FpsMeter(6182): 2.15 FPS@480x320
05-02 21:06:04.019: I/FpsMeter(6182): 1.89 FPS@480x320
05-02 21:06:13.019: I/FpsMeter(6182): 2.22 FPS@480x320
05-02 21:06:23.339: I/FpsMeter(6182): 1.94 FPS@480x320
05-02 21:06:34.719: I/FpsMeter(6182): 1.76 FPS@480x320
05-02 21:06:46.129: I/FpsMeter(6182): 1.75 FPS@480x320
05-02 21:06:55.979: I/FpsMeter(6182): 2.03 FPS@480x320
05-02 21:07:05.559: I/FpsMeter(6182): 2.09 FPS@480x320
05-02 21:07:13.709: I/FpsMeter(6182): 2.45 FPS@480x320

OpenCV's HoughLinesP

05-02 21:08:31.009: I/FpsMeter(6182): 1.86 FPS@480x320
05-02 21:08:39.569: I/FpsMeter(6182): 2.34 FPS@480x320
05-02 21:08:49.569: I/FpsMeter(6182): 2.00 FPS@480x320
05-02 21:09:00.399: I/FpsMeter(6182): 1.85 FPS@480x320
05-02 21:09:09.519: I/FpsMeter(6182): 2.19 FPS@480x320
05-02 21:09:20.289: I/FpsMeter(6182): 1.86 FPS@480x320
05-02 21:09:32.369: I/FpsMeter(6182): 1.66 FPS@480x320
05-02 21:09:43.609: I/FpsMeter(6182): 1.78 FPS@480x320
05-02 21:09:53.569: I/FpsMeter(6182): 2.01 FPS@480x320
05-02 21:10:03.129: I/FpsMeter(6182): 2.09 FPS@480x320

Optimized Hough transform written in Java

05-02 21:12:07.069: I/FpsMeter(6182): 0.72 FPS@480x320


05-02 21:15:34.809: I/FpsMeter(6356): 0.70 FPS@480x320
05-02 21:20:43.269: I/FpsMeter(7276): 0.80 FPS@480x320
05-02 21:33:11.269: I/FpsMeter(1164): 0.62 FPS@480x320
05-02 21:33:35.979: I/FpsMeter(1164): 0.81 FPS@480x320
05-02 21:33:59.429: I/FpsMeter(1164): 0.85 FPS@480x320
05-02 21:34:24.689: I/FpsMeter(1164): 0.79 FPS@480x320
05-02 21:34:47.919: I/FpsMeter(1164): 0.86 FPS@480x320
05-02 21:35:10.539: I/FpsMeter(1164): 0.88 FPS@480x320
05-02 21:35:34.129: I/FpsMeter(1164): 0.85 FPS@480x320

My implementation of Hough transform for lines

05-02 21:27:26.179: I/FpsMeter(1104): 0.83 FPS@480x320
05-02 21:27:51.439: I/FpsMeter(1104): 0.79 FPS@480x320
05-02 21:28:17.189: I/FpsMeter(1104): 0.78 FPS@480x320
05-02 21:28:42.049: I/FpsMeter(1104): 0.80 FPS@480x320
05-02 21:29:06.649: I/FpsMeter(1104): 0.81 FPS@480x320
05-02 21:29:31.429: I/FpsMeter(1104): 0.81 FPS@480x320
05-02 21:29:56.649: I/FpsMeter(1104): 0.79 FPS@480x320
05-02 21:30:20.399: I/FpsMeter(1104): 0.84 FPS@480x320
05-02 21:30:44.219: I/FpsMeter(1104): 0.84 FPS@480x320
05-02 21:31:12.879: I/FpsMeter(1104): 0.78 FPS@480x320

Hough transform for circles

OpenCV's HoughCircles

05-02 21:39:30.249: I/FpsMeter(1164): 0.96 FPS@480x320
05-02 21:39:44.549: I/FpsMeter(1164): 1.40 FPS@480x320
05-02 21:39:56.969: I/FpsMeter(1164): 1.61 FPS@480x320
05-02 21:40:08.449: I/FpsMeter(1164): 1.74 FPS@480x320
05-02 21:40:22.349: I/FpsMeter(1164): 1.44 FPS@480x320
05-02 21:40:37.779: I/FpsMeter(1164): 1.30 FPS@480x320
05-02 21:40:50.229: I/FpsMeter(1164): 1.61 FPS@480x320
05-02 21:41:01.449: I/FpsMeter(1164): 1.78 FPS@480x320
05-02 21:41:13.049: I/FpsMeter(1164): 1.72 FPS@480x320

My implementation of Hough transform for circles

05-02 21:42:00.759: I/FpsMeter(1164): 0.42 FPS@480x320
05-02 23:47:28.639: I/FpsMeter(1164): 0.24 FPS@480x320
05-02 23:48:43.679: I/FpsMeter(1164): 0.27 FPS@480x320
05-02 23:49:55.659: I/FpsMeter(1164): 0.28 FPS@480x320


05-02 23:51:06.709: I/FpsMeter(1164): 0.28 FPS@480x320

Line detection with vanishing point

05-02 23:52:06.879: I/FpsMeter(1164): 2.32 FPS@480x320
05-02 23:52:17.129: I/FpsMeter(1164): 1.95 FPS@480x320
05-02 23:52:37.699: I/FpsMeter(1164): 2.03 FPS@480x320
05-02 23:52:59.199: I/FpsMeter(1164): 1.92 FPS@480x320
05-02 23:53:09.649: I/FpsMeter(1164): 1.91 FPS@480x320
05-02 23:53:18.149: I/FpsMeter(1164): 2.35 FPS@480x320
05-02 23:53:34.359: I/FpsMeter(1164): 2.51 FPS@480x320
05-02 23:54:03.819: I/FpsMeter(1164): 2.63 FPS@480x320
05-02 23:54:10.649: I/FpsMeter(1164): 2.93 FPS@480x320
05-02 23:54:18.889: I/FpsMeter(1164): 2.43 FPS@480x320
05-02 23:54:27.219: I/FpsMeter(1164): 2.40 FPS@480x320
05-02 23:54:37.369: I/FpsMeter(1164): 1.97 FPS@480x320
05-02 23:54:56.689: I/FpsMeter(1164): 2.14 FPS@480x320
05-02 23:55:03.509: I/FpsMeter(1164): 2.93 FPS@480x320
05-02 23:55:10.329: I/FpsMeter(1164): 2.94 FPS@480x320

B: Screenshot gallery


C: Electronic attachments

• Installation file

• Source code

These electronic attachments can be found in this thesis's archive in the Information System of Masaryk University and also on GitHub1: https://github.com/jmedveckyh/Hough.

1. http://en.wikipedia.org/wiki/GitHub
