Practical application of image processing and recognition techniques with MATLAB.
This project presents an offline method for detecting lane lines using the Hough transform. The images are captured with my own smartphone carried inside the vehicle. The smartphone frame rate is 30 FPS.
The first step is image conditioning to eliminate elements of no interest, followed by an edge detector. The binary output image of the edge detector is the input to the Hough transform, which detects lines. The two best lines on each side are chosen and averaged. Finally, the two averaged lines are tracked over a 10-frame window.
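The text does not spell out how the tracking over the 10-frame window is done. One plausible reading is a moving average of each lane line's parameters over the most recent frames. The Python sketch below illustrates that reading only; the `LineTracker` class, its method names and the (rho, theta) line representation are assumptions made for illustration and are not taken from the project's MATLAB code.

```python
from collections import deque


class LineTracker:
    """Moving average of one lane line, stored as (rho, theta) pairs.

    Hypothetical helper: the project may track lines differently.
    """

    def __init__(self, window=10):
        # A deque with maxlen drops the oldest detection automatically.
        self.history = deque(maxlen=window)

    def update(self, rho, theta):
        """Add the detection of the current frame and return the smoothed line."""
        self.history.append((rho, theta))
        n = len(self.history)
        mean_rho = sum(r for r, _ in self.history) / n
        # Plain averaging of theta is only safe away from the angle wrap-around.
        mean_theta = sum(t for _, t in self.history) / n
        return mean_rho, mean_theta
```

One tracker per side (left and right) would be updated once per frame with the averaged line of that side, so a single noisy detection moves the displayed line by at most a tenth of its error.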
Images were captured with my smartphone, a Xiaomi Mi A2, carried inside the vehicle and fixed to the windshield with a suction-cup mount, as shown in the picture (pending...).
Since the camera is fixed in the same spot inside the vehicle, a region of interest (ROI) is extracted from the raw image on each frame. In this project the ROI coordinates are calibrated once and used for the whole video. A good way of finding the ROI coordinates could be to find the vanishing point first and derive the ROI coordinates from it.
A Gaussian filter is applied to the ROI image to reduce noise and smooth the edges.
Gaussian filtering replaces each pixel with a weighted average of the pixels around it, with weights that follow a Gaussian curve centered on that pixel. Which neighbors take part is determined by the kernel (matrix) size.
Because a Gaussian distribution is used, the kernel size is set to 6*Std so that almost the whole Gaussian curve (three standard deviations on each side) fits inside the kernel.
Noise reduction with an averaging kernel comes at the cost of losing image information. To benefit from smoothing without losing too much information, the standard deviation is set to Std = 1.
The image below shows the edge detector results for different Std values. On the left side the Std is too low, so there is too much noise, even inside the lane line. On the right side the Std is too high, which makes the two lane lines fuse at some point.
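To make the relation between Std and kernel size concrete, here is a minimal Python sketch (not the project's MATLAB code) that builds a 1-D Gaussian kernel and smooths one image row with it. The function names are hypothetical. Using a radius of ceil(3*Std), which gives an odd size of 7 for Std = 1 instead of exactly 6, is an assumption made so that the kernel has a center pixel.

```python
import math


def gaussian_kernel_1d(sigma=1.0):
    """Normalized 1-D Gaussian kernel spanning about 6 * sigma."""
    radius = math.ceil(3 * sigma)  # assumption: 3 sigma per side, odd total size
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1


def smooth_row(row, kernel):
    """Convolve one image row with the kernel, replicating the border pixels."""
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index at the borders
            acc += w * row[j]
        out.append(acc)
    return out
```

Because a 2-D Gaussian is separable, applying `smooth_row` to every row and then to every column gives the same result as a full 2-D kernel with fewer multiplications.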
After smoothing, the image is converted from RGB to grayscale using the MATLAB function rgb2gray. From now on every operation works on a 1024x1024x1 image instead of 1024x1024x3, which makes the algorithm less computationally demanding while still reaching good results.
The equation used for the grayscale conversion is y = 0.299 R + 0.587 G + 0.114 B.
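As a quick check of the weights quoted above, the following Python sketch (an illustration, not a replacement for rgb2gray) applies the same weighted sum to pixels given as (R, G, B) tuples. The function names and the nested-list image layout are assumptions.

```python
def rgb_to_gray(pixel):
    """Luminance of one (R, G, B) pixel using the weights quoted in the text."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_gray(image):
    """Convert an image stored as rows of (R, G, B) tuples to rows of gray values."""
    return [[rgb_to_gray(p) for p in row] for row in image]
```

The three weights sum to 1, so a white pixel stays at full intensity, and green contributes the most to the result.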
During the research phase of this project I found several methods for finding lane markings. One step that led my algorithm to good results was a pre-processing step presented by Marcos Nieto in his paper "Road environment modeling using robust perspective analysis and recursive Bayesian segmentation".
This technique is based on the assumption that, within an image row, the pixels that belong to lane markings tend to have a "high intensity value surrounded by darker regions". Thus, the detector filters each image row independently, based on its pixel intensity values.
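A minimal Python sketch of such a row filter follows. It assumes the commonly cited form of the filter, y_i = 2*x_i - (x_(i-tau) + x_(i+tau)) - |x_(i-tau) - x_(i+tau)|, where x_i is the intensity at column i and tau is the expected lane-marking width in pixels. The function name, the clamping to the 0..255 range and the zeroed borders are assumptions of this sketch, not details taken from the project.

```python
def lane_marking_filter_row(row, tau):
    """Filter one grayscale row; bright pixels with darker neighbors at +/- tau score high."""
    n = len(row)
    out = [0] * n  # border pixels without both neighbors are left at 0
    for i in range(tau, n - tau):
        left, right = row[i - tau], row[i + tau]
        # The last term penalizes pixels whose two neighbors differ a lot,
        # such as a plain bright/dark step that is not a marking.
        y = 2 * row[i] - (left + right) - abs(left - right)
        out[i] = min(max(y, 0), 255)  # clamp to the 8-bit range
    return out
```

For example, a 3-pixel bright stripe on a dark road gives a strong response at its center, while a plain step from dark to bright gives zero everywhere.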
The filter responds strongly to pixels that have higher intensity values than their left and right neighbors in the same row at distance Tau (the lane-marking width). The last term of the equation, which depends on the difference between the left and right neighbors, is subtracted from the filtered value yi to make the filter less prone to errors, especially when the difference between the intensity values of the left and right neighbors is too high.
Thresholding is used to decide which pixels to keep and which to discard based on their intensity values. In the output image from step 4, the pixels that belong to lane markings tend to have higher intensity values. In addition, disturbances may leave many lower-intensity pixels in the image, which appear as thin, weak edges. Binary thresholding helps significantly in removing this noise and enhancing the "good" pixels.
The basic idea behind binary thresholding is that all pixels with intensity values above a certain threshold are set to the maximum value maxVal, while those with intensity values below the threshold are set to zero. In this case Otsu's method is used, where the threshold is determined by minimizing the intra-class intensity variance or, equivalently, by maximizing the inter-class variance. Finally, to reduce noise, connected components smaller than 90 px are suppressed.
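The Python sketch below shows how Otsu's threshold can be found by scanning all candidate thresholds and keeping the one with the largest inter-class variance. It is an illustration under simple assumptions (integer pixel values in 0..255, passed as a flat list), not the routine used in the project, and the function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the inter-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t (class 0)
    sum0 = 0  # sum of the intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to inter-class variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold, max_val=255):
    """Set pixels above the threshold to max_val and the rest to zero."""
    return [max_val if p > threshold else 0 for p in pixels]
```

On an image with a dark road and bright markings the histogram is roughly bimodal, and the returned threshold falls between the two modes without any manual tuning.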
After several layers of pre-processing, edge detection is the final stage. It finds the strong edges that help the Hough transform stage find the lane lines.
A Canny edge detector is used in this project. This method is computationally demanding because it runs several steps, which is also what makes it very effective. Since our image is very well conditioned, a Sobel edge detector in the vertical and horizontal directions could be used instead to increase speed (in testing, Sobel reduced the time per frame by 20%).
The Canny edge detector first applies a Gaussian filter, then applies four filters to detect horizontal, vertical and diagonal edges, which gives the intensity gradients of the image. Once the intensity gradients are found, it performs non-maximum suppression to get rid of spurious responses to edge detection.
After application of non-maximum suppression, remaining edge pixels provide a more accurate representation of real edges in an image. However, some edge pixels remain that are caused by noise and color variation. In order to account for these spurious responses, it is essential to filter out edge pixels with a weak gradient value and preserve edge pixels with a high gradient value. This is accomplished by selecting high and low threshold values.
So far, the strong edge pixels should certainly be involved in the final edge image, as they are extracted from the true edges in the image. However, there will be some debate on the weak edge pixels, as these pixels can either be extracted from the true edge, or the noise/color variations. To achieve an accurate result, the weak edges caused by the latter reasons should be removed.
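The usual way Canny resolves the weak pixels is hysteresis: a weak pixel is kept only if it is connected, directly or through other weak pixels, to a strong pixel. The Python sketch below illustrates that rule on a gradient-magnitude image stored as nested lists. The function name, the 8-neighbor connectivity and the input layout are assumptions of this sketch.

```python
def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]

    # Start from every strong pixel.
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if magnitude[r][c] >= high]
    for r, c in stack:
        edges[r][c] = True

    # Grow the edges into neighboring weak pixels (8-connectivity).
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not edges[nr][nc] and magnitude[nr][nc] >= low:
                    edges[nr][nc] = True
                    stack.append((nr, nc))
    return edges
```

Weak pixels that never touch a strong edge, which is the typical signature of noise, are left out of the final edge image.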
In this study the Hough transform is used to find the lines that best represent the lane edges. This can be a complicated task because several parameters need to be tuned to achieve the best results:
- Bin size, named 'RhoResolution': determines how easily points are interpreted as belonging to the same line. Higher values make it more likely that nearby points are assigned to the same line.
- 'Theta': defines the range of θ values the algorithm uses to detect lines, so reducing its resolution and its range lowers the computational cost of the algorithm and excludes angles that are of no interest.
- 'Threshold': controls how many votes are needed for a line to be considered.
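The effect of the three parameters can be seen in a small Hough accumulator. The Python sketch below is only an illustration of the voting scheme, using the normal form rho = x*cos(theta) + y*sin(theta). The function name, the argument names (chosen to mirror the parameters above) and the default theta range of -90 to 89 degrees are assumptions of this sketch, not the project's MATLAB calls.

```python
import math


def hough_lines(points, rho_resolution=1.0, thetas_deg=range(-90, 90), threshold=3):
    """Return (rho, theta_deg, votes) for every accumulator cell with enough votes."""
    acc = {}
    for x, y in points:
        for theta in thetas_deg:
            t = math.radians(theta)
            rho = x * math.cos(t) + y * math.sin(t)
            # A larger rho_resolution means wider bins, so more points share a cell.
            rho_bin = round(rho / rho_resolution)
            acc[(rho_bin, theta)] = acc.get((rho_bin, theta), 0) + 1

    lines = [(rho_bin * rho_resolution, theta, votes)
             for (rho_bin, theta), votes in acc.items()
             if votes >= threshold]  # keep only cells with enough votes
    lines.sort(key=lambda line: -line[2])  # most voted lines first
    return lines
```

Passing a narrower `thetas_deg`, for instance only the angles where lane lines are expected, shrinks the inner loop and so the cost per edge pixel, while raising `threshold` discards short or fragmented segments.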