
Automatic Off-Road Roadbook Creation from Satellite Maps

    João Pedro Carvalho Camejo Instituto Superior Técnico, Lisbon, Portugal

    [email protected]

    Abstract — Roadbooks are still irreplaceable and fundamental

    tools for the navigation of a pre-planned route, normally used

    for rural trips, in sightseeing or in all-terrain raids. This thesis

    explains the construction process of an application to generate a

    roadbook through satellite images, representing a new approach

    to applications of the kind. Satellite images are automatically

retrieved from the Google Maps imagery database, so image processing techniques must be used for road extraction. Steger's line detection method is one of the best known and most widely used. One of the objectives of this work is to have an autonomous process to identify roads in an image, so an extension to Steger's method that automatically selects its parameters was developed. Additionally, a new approach for road extraction, based mainly on color filtering and template matching techniques, is presented. To overcome possible faults in the road extraction process, user interaction is introduced, giving the user full control over the path to be presented in the roadbook.

    I. INTRODUCTION

    A roadbook is a set of sequential indications over a road or

a path, allowing anyone following it to reach a destination from a starting location. It can consist of a set of pages or just a long list of direction entries. Each

    roadbook entry usually provides information about the

    traveled distance, direction and some additional notes on road

    surroundings. Though there is no standard in the presentation

    of roadbooks, at least direction information always exists.

This is represented by a drawing of each crossing or road junction, designated a tulip, with the paths visually highlighted, enabling fast recognition of the one to follow. Roadbooks normally only have information on these road junctions, preventing anyone following them from getting lost. Many times, complementary information, such as GPS coordinates for distinct road segments, also exists.

    Roadbooks are often hard to build, requiring prior track

recognition and planning ahead. As will be discussed, the

    tools available to build roadbooks are still very limited and

    sometimes unreliable.

    A. Objectives

    The main objective of the work is to build an application

enabling the creation of a roadbook for a user-chosen path. The project will rely heavily on user interaction, so emphasis will be given to this, and application interface and ease-of-use issues will be addressed.

To build roadbooks, the application will rely only on satellite images. These images will therefore have to be integrated into the application in a way that allows the user to choose the intended travel path. User-supplied images must be accepted, but automatic fetching of the images is preferable. The user must be able to select start and finish locations on the satellite map. Other, non-mandatory points can also be input to indicate intermediate travel locations, giving more control over the chosen path. A path will then have to be calculated using only those points and the image, so a method to extract roads from an image will have to be investigated.

    B. Major Contributions

Obtaining a roadbook from satellite imagery and user interaction is a new approach for a roadbook construction tool, as no existing automatic roadbook tool relies on satellite image information. Automatic roadbook tools are scarce, and the existing ones depend on other systems or databases.

To use images as a source of information, several computer vision methods for extracting roads were studied. A new method was also developed for this purpose, mainly to better suit the needs of the project. A comparison with a state-of-the-art method [1] is performed, and an adaptation of that method is also carried out, allowing it to perform autonomously with automatic parameter selection.

    II. RELATED WORK AND USED TECHNIQUES

    A. Related Work

Road recognition is a well-studied subject, very important for GIS databases and other information systems. These are fundamental to many location systems, such as GPS or map guiding services. Given the huge number of existing road

    networks in the world, road recognition from satellite maps is

    one way to help complete these databases [2]. One of the

    main difficulties in performing this is finding a working

    method for the existing multitude of different settings and

scenarios. There are still no methods able to perform well and extract a complete road network from a satellite image. The great number of existing approaches shows that this is a highly non-trivial problem, and that may be why no commercial solution for automatic roadbook creation using this approach exists.

    Roadbooks are often created for off-road tracks where there

is limited information available about the existing roads, in contrast to urban environments, where the network is very well known and up to date in most cases. A solution for this

    would be to directly extract information from sources like

    satellite maps.

    There are however some available solutions to create

    roadbooks. These often use GPS tracking to store taken

routes or to plan a new one. In some cases GPS is also

    actively used during the trip allowing for route correction

    when no map is available like in most off-road cases [3].

Other solutions allow the creation or editing of roadbooks by

    manually choosing and setting each tulip and information

    according to an intended course.

Many contributions exist on the topic of automatic road extraction from satellite and aerial view images. Li and Briggs

    [4] propose a method using a “reference circle” to extract

    roads from high-resolution aerial and noisy satellite images.

    Bacher and Mayer [5] propose an approach to extract rural

    roads taking advantage of the multispectral properties of the

    high-resolution satellite images. Steger [1] proposes a method

    extracting curvilinear structures from images using a

    differential geometric approach. Christophe and Inglada [6]

    propose a fast algorithm using a geometric method for a first

    step extraction level to be refined by human interaction or

GIS integration. Porikli [7] proposes a method to extract roads from very low resolution satellite imagery, using a Gaussian

    model. Additionally Mayer et al. [8] performed a

    comparative study on a set of automated road extraction

    methods aiming to update existing databases.

On the other hand, semi-automatic methods are characterized by making use of some information known prior to extraction, usually a set of points indicating road locations, called seed points. Most of the time these methods are used in conjunction with automatic ones, to complete them or to bypass the necessity of the seed points. One example of

    this is the work of Laptev [9] combining an automatic

    curvilinear extraction method [1] with a semi-automatic one

    using active contour models (also known as snakes) [10].

    Pandit [11] also uses a combination of a few semi-automatic

    methods, mainly a variation of a region growing based road

    extraction called adaptive texture matching.

    B. Image Lines Extraction through a Differential Geometric

    Approach

    Steger’s work [1] uses a differential geometric approach to

    extract lines, having the same characteristics as roads, from

    images. This method has the advantage of performing well

    for both high and low resolutions and the approach is not

specific to any particular type of aerial/satellite image

    (multispectral, multi-temporal, noisy, etc.), so it suits the

    needs of this project.

Steger starts by identifying one-dimensional approximations to the line profiles to be extracted, namely parabolic and bar-shaped profiles. The line positions are identified by calculating the points where the first derivative of the image convolved with a Gaussian kernel vanishes; selecting the minimum of the second derivative then extracts salient lines. For a bar-shaped profile of half-width w and height h, the responses of the convolution with the Gaussian kernel g_σ and its derivatives are given by:

r_b(x, σ, w, h) = h (φ_σ(x + w) − φ_σ(x − w)) (1)

r_b'(x, σ, w, h) = h (g_σ(x + w) − g_σ(x − w)) (2)

r_b''(x, σ, w, h) = h (g_σ'(x + w) − g_σ'(x − w)) (3)

where φ_σ denotes the integral of the Gaussian kernel g_σ.

The first derivative vanishes at x = 0 for all σ > 0. The second, however, will not exhibit the desired behavior for small σ. For this to hold, the following condition must be true:

σ ≥ w/√3 (4)

In (4), w represents half the line width. For σ = w/√3, r_b''(0, σ, w, h) takes its maximum negative response.

    For lines in two dimensions the same analysis is performed

in the direction perpendicular to the line, n(t). This direction is computed resorting to the Hessian matrix.
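As an illustration of this step, the sketch below (Python with NumPy/SciPy; the function name and the eigen-direction selection are our own and not Steger's reference implementation) builds the Hessian from Gaussian derivative responses and evaluates the second derivative along the direction perpendicular to the line:

import numpy as np
from scipy.ndimage import gaussian_filter

def line_response(image, sigma):
    """Per-pixel second-derivative response in the direction perpendicular
    to a line, following the differential geometric idea of Steger's
    detector (illustrative sketch, not the reference implementation)."""
    img = image.astype(np.float64)
    # Entries of the Hessian obtained from Gaussian derivative responses.
    rxx = gaussian_filter(img, sigma, order=(0, 2))   # d2/dx2 (columns)
    ryy = gaussian_filter(img, sigma, order=(2, 0))   # d2/dy2 (rows)
    rxy = gaussian_filter(img, sigma, order=(1, 1))   # d2/dxdy

    # Candidate eigenvector angle of the 2x2 Hessian and its orthogonal.
    theta = 0.5 * np.arctan2(2.0 * rxy, rxx - ryy)
    candidates = []
    for t in (theta, theta + np.pi / 2.0):
        nx, ny = np.cos(t), np.sin(t)
        candidates.append(rxx * nx**2 + 2.0 * rxy * nx * ny + ryy * ny**2)
    # n(t) is the direction with the largest absolute second derivative;
    # bright lines give strongly negative values along it.
    d0, d1 = candidates
    return np.where(np.abs(d0) >= np.abs(d1), d0, d1)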

    C. Zhang-Suen Thinning Algorithm

The Zhang-Suen thinning algorithm is a fast algorithm applied

    to digital patterns in binary images to reduce features to a

    one-pixel thick pattern while preserving connectivity and end

    points.
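A compact reference sketch of the algorithm in Python, assuming a 0/1 binary NumPy array with background on the image border (names are illustrative):

import numpy as np

def zhang_suen_thinning(binary):
    """Thin a 0/1 binary image to 1-pixel wide structures while preserving
    connectivity and end points (Zhang-Suen, illustrative sketch)."""
    img = (binary > 0).astype(np.uint8).copy()

    def neighbours(y, x, im):
        # P2..P9, clockwise starting at the pixel above (y-1, x).
        return [im[y - 1, x], im[y - 1, x + 1], im[y, x + 1], im[y + 1, x + 1],
                im[y + 1, x], im[y + 1, x - 1], im[y, x - 1], im[y - 1, x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, img.shape[0] - 1):
                for x in range(1, img.shape[1] - 1):
                    if img[y, x] != 1:
                        continue
                    p = neighbours(y, x, img)
                    b = sum(p)  # number of non-zero neighbours
                    # Number of 0 -> 1 transitions in the sequence P2..P9, P2.
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if 2 <= b <= 6 and a == 1:
                        if step == 0 and p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0:
                            to_delete.append((y, x))
                        if step == 1 and p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0:
                            to_delete.append((y, x))
            for y, x in to_delete:
                img[y, x] = 0
                changed = True
    return img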

    III. SATELLITE MAPS

Before the satellite images can be processed to extract the information for route calculation, a key part of this work is how to obtain these images. This is done using Google Maps, as it is a free, reliable service offering a large variety of maps. It is integrated into the application using the service's API, allowing automatic map fetching.

    A. Map Scale

One aspect missing from the Google Maps API is the possibility of getting a scale for the obtained map. This is crucial in this work, as it will be necessary to measure distances and make other calculations involving the translation of pixels into actual lengths.

Since the Earth is not flat, the scale does not have a linear relation with the zoom level used, and it also varies with the latitude of the location. Google recommends [13], as an alternative solution, calculating the approximate scale using the formula:

S = (256 · 2^zoom) / (2 · π · R · cos(latitude · π/180)) [pixel/m] (5)

where R is the Earth radius used by the map projection.

A study was performed to evaluate the accuracy of the formula. A systematic error of about 14.6% was found, so a correction of that value is applied to (5).
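In code form, the scale computation and the empirical correction can be sketched as follows (Python; the Earth radius constant and the direction in which the 14.6% correction is applied are assumptions of this sketch):

import math

WEB_MERCATOR_RADIUS_M = 6378137.0   # Earth radius used by the projection (assumed)
SYSTEMATIC_ERROR = 0.146            # ~14.6% systematic error found in the study

def pixels_per_meter(latitude_deg, zoom):
    """Approximate map scale in pixels per meter at the given latitude and
    zoom level, per (5), with the empirical correction applied
    (illustrative sketch; the correction direction is an assumption)."""
    ground_circumference = 2.0 * math.pi * WEB_MERCATOR_RADIUS_M \
        * math.cos(math.radians(latitude_deg))
    scale = (256.0 * 2.0 ** zoom) / ground_circumference
    return scale * (1.0 - SYSTEMATIC_ERROR)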

    B. Map Segmentation

    Selecting the course to be represented by the roadbook is

done by user interaction. The user selects a start point, a finish point and, optionally, intermediary points. Google Maps offers satellite images at various zoom levels, so the maps will be requested by default at level 17 (6 levels below the maximum of 23), or at the minimum of the zoom levels at which the start, finish and intermediary points were input by the user, if any of these is lower than the default.

    At the defined zoom level requests must be made for

    images covering all the input point locations. An image

    request must be made using the location’s geographic

coordinates. Translating screen Cartesian coordinates to geographic ones, and vice-versa, is accomplished resorting to the Google Maps JavaScript API.

    Having the correct projection of the map, the API method is

    used to translate each point to a known Cartesian reference

    frame. This reference frame is defined by the user

    visualization window at the level of zoom used to request

each image segment. The translation is made using:

P = (P_A − P_origin) · 2^zoom,  P, P_A, P_origin ∈ ℝ² (6)

In (6), P corresponds to the point in the new reference frame, P_A to the point given by the API method, and P_origin to the point given by the API method when translating the pair of coordinates formed by the latitude of the northeast corner of the visualization window and the longitude of the southwest corner of the same window. In this way it is possible to get the X value of the left edge and the Y value of the upper limit of the preview window, which correspond to the X, Y coordinates of the origin of the proposed reference frame, expressed in the reference frame given by the API. As both points (P_A and P_origin) are in the same reference frame, putting P_A in the new proposed reference frame is just a matter of performing a translation, subtracting P_origin from P_A. This value is then multiplied by a scale factor, in which zoom corresponds to the previously defined level used for requests.

    Enabling the independent processing of each of the map

    segments implies that each point has to be translated to the

segment reference frame. This is done using the translation:

P''_x = P_x − m_x + w_seg/2,  P''_y = P_y − m_y + h_seg/2,  P'' ∈ ℝ² (7)

In (7), P'' is the point shifted to the positive space of the segment reference frame, m is the point containing the minimum Cartesian coordinates amongst all points, and w_seg and h_seg denote the width and height of a map segment in pixels.
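A sketch of the two translations in Python, assuming each point arrives from the API as an (x, y) pair in the projection's own coordinates (the function names are illustrative and the half-size offsets follow the reconstruction of (7)):

def to_window_frame(p_api, p_origin, zoom):
    """Translate a point returned by the projection API into the
    visualization-window reference frame, as in (6)."""
    scale = 2 ** zoom
    return ((p_api[0] - p_origin[0]) * scale,
            (p_api[1] - p_origin[1]) * scale)

def to_segment_frame(p, minimum, segment_width, segment_height):
    """Shift a window-frame point into the positive space of its map
    segment, as in (7); 'minimum' holds the smallest x and y among all
    points (illustrative sketch)."""
    return (p[0] - minimum[0] + segment_width / 2.0,
            p[1] - minimum[1] + segment_height / 2.0)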

    IV. IMAGE PROCESSING

    A. Road Characterization from Satellite Images

Knowing what is a road and what is not is one of the main issues addressed here. Roads have some unique characteristics. Namely, roads in a rural environment are usually distinguishable as bright elongated lines with few intersections and low curvature, but variable width. In an urban

environment, roads are usually completely different, having approximately constant width, high junction density, short

    segments with sharp angles and paved roads, normally darker

    than the surroundings.

Visually, these are the characteristics that make rural roads recognizable as such when viewed from above, as in the case of satellite images. Since these are the roads of interest in this work, their characteristics had to be identified so that a

    method could be developed that recognizes them and

    separates the roads from the surroundings.

Even though these characteristics are representative of and usually associated with rural roads, one aspect that brings extra complexity to this project is that there are many variations in these environments (many more than in urban ones, for instance), so finding one method that works for all of them is increasingly difficult.

    B. Steger’s Line Detector

With Steger's method it is possible to extract curvilinear structures using differential geometric properties of the image. In fact, one of the motivations of Steger's work is to extract roads from aerial images, so a study will be performed to evaluate how well this method suits the needs of this project.

This method is mainly governed by two variables: σ and r. σ is related to the road width, according to (4), and r is a threshold value used to select line saliency. Figure 1 contains a simple, distinguishable curved road with constant width and the result after applying Steger's method. In this case the road is about 11 pixels wide, so according to (4) a value of 3.18 was selected for σ. For r, a value of 3 was chosen in order to select only the main road line.

    Fig. 1. Road detection using Steger’s method.

The result of the method is given in terms of the contours found, each contour being constituted by a line with several intensity values along its pixels. Higher pixel values mean that the response at the pixel in question was a better match for a curvilinear structure with the selected parameters, and that translates into a brighter pixel in the result. By the same reasoning, darker pixels in the result were worse matches for the selected curvilinear structure. Selecting a high enough value for r has the function of clearing these low response values, most of which do not belong to the line. In order for the result to become visible, the response values are normalized to the range [0, 255].

In the result it is possible to clearly see that the extracted line follows the road's curves and adapts perfectly to the road. However, there are some decisive issues to be analyzed. Closely comparing the original image with the result, it is possible to see a small road at the bottom, branching from the main one, that is being poorly detected – only a small branch close to the junction appears. This smaller road is not as bright as the main one and is not as wide. This means, firstly, that the value of r should have been lower so that this road with less contrast could be detected; however, with this lower value, other responses not really belonging to the road would begin to appear. Even then, it would not be guaranteed that this small road would be extracted, because it is narrower than the main one and would need another value of σ in order to be detected. In fact, even on the main road it is possible to see that in some sections where the road's width differs from the expected one (11 pixels), the response is either very faint or completely vanished. With different values for the threshold, even the faint sections could completely disappear.

    Another aspect visible in the result is that this method

doesn't deal very well with road occlusion. If a tree happens to block the view, the method will only extract the visible part of the road, if any. In cases where no curvilinear structures are detected, the response is zero (or very low), even if it is just a small gap in the road caused by a tree blocking the view. In cases where a small portion of the road is still visible, the extraction will go around the obstruction, treating the road as a narrower section (that could have

    a lower response as described above), causing the detected

    road to be curvy even in straight sections, which is an

    undesirable effect.

    C. Finding Near Roads

When the user marks the points on the map, due to a problem inherent to the Google Maps JavaScript API, the points may not be located exactly where the user put them. So, even if the user marked the points directly over the road, depending on the zoom level used, a point may be displaced. Because of this, a marked point can't be assumed to identify the exact location of a road.

    The marked point may not exactly pinpoint the location of a

    road, but most certainly a road will exist in the vicinity,

    unless the user marked a completely empty location. It will

    be assumed that a road always exists close to the point.

Therefore, to resolve this issue, Steger's line detection

    method will be used, as it will serve to extract a curvilinear

    structure in the surroundings of the point.

    To accomplish this, using the marked point as a center, a

    100 × 100 pixels subsection of the image will be analyzed.

    Since nothing is known about the road to be found, some

    assumptions will have to be made so that some initial

    parameters (σ and r) for Steger’s line detection method can

    be chosen. A value of 5.0 will be used for r, which is

    relatively high and provides a good level of confidence that

    no lines are incorrectly identified as roads. As for σ, the

assumption is that the wider roads will be around 20 pixels wide; at the default zoom level (17) the scale ranges at about 2 – 3 pixels/m, making these roads about 10 – 7 meters wide. According to (4), 20-pixel wide lines are expected to have a σ value of ~5.7.

    To shift the point to a location over the road, the maximum

of the function (8) over the analyzed subsection of the image domain is computed. The point is then moved to the location of the maximum.

L(x) = R(x) · (2D − d(x)),  x ∈ ℕ² (8)

In (8), R(x) is the response of Steger's contours at a given location x, d(x) is the distance between x and the original point (in this case, the fixed center of the image subsection), and D is a constant representing the maximum distance attainable, which for a 100 × 100 section is ~70.71 pixels. In this way the Steger contour response weighted by a distance factor is obtained; distant lines are not considered unless they have a very large response value, and close lines are preferably chosen. Using this will make the point shift to nearby lines that will most likely correspond to roads in the original image, given their response values.

Algorithm Find Near Roads
1. Initialize sigma = maximum road width / (2 * Sqrt(3))
2. Initialize r = 5.0
3.
4. While number of found contours < 3
5.     Find Steger Lines(sigma, r)
6.     Decrease sigma
7. End While
8.
9. Find the maximum of L(x)
10. Adjusted road point = maximum location
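The weighting in (8) and the point adjustment can be sketched in Python as below, assuming a response array holding the Steger contour responses for the 100 × 100 subsection (the function name and parameters are placeholders, not part of the original implementation):

import numpy as np

def adjust_point_to_road(response, center=(50, 50)):
    """Move a user point to the strongest nearby line response, weighting
    each response by its distance to the subsection center as in (8)."""
    h, w = response.shape
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.hypot(xs - center[0], ys - center[1])
    d_max = np.hypot(w / 2.0, h / 2.0)        # ~70.71 px for a 100x100 section
    weighted = response * (2.0 * d_max - d)   # L(x) from (8)
    y_best, x_best = np.unravel_index(np.argmax(weighted), weighted.shape)
    return x_best, y_best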

    D. Automatic Parameter Selection

    With the road location exactly pinpointed, information on

    the road can be discovered, namely, the road width and its

    contrast.

    After a run of the previously described algorithm allowing

    for a near road to be located, it’s possible to find the road

    width by applying the inverse of (4). The σ value is the one

used by the method to get the nearby road. Knowing this already provides a first estimate of what the roads in the map look like.

A useful piece of information provided by Steger's algorithm computation is the contrast of the found contour. Having all this information, it is possible to

    estimate the value for the threshold r by computing the

    absolute value of (3).

With this threshold and the σ value, running Steger's algorithm would perform a search for all roads similar to the one found. This is not very desirable, since the search for nearby roads always finds the widest one in the area. To correct this, the values of σ and r are slightly decreased in order to widen the constraints, so that narrower roads can also be found. σ was decreased by 0.2, and r by 0.1.
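A sketch of this parameter selection in Python (the relaxation constants 0.2 and 0.1 are the values stated above; treating the contour contrast as the absolute value of (3) follows the text, and the function name is illustrative):

import math

SIGMA_RELAX = 0.2   # widen the width constraint slightly
R_RELAX = 0.1       # lower the saliency threshold slightly

def parameters_from_found_road(sigma_used, contour_contrast_response):
    """Derive Steger parameters for the whole map from the road found near
    a user point (illustrative sketch)."""
    road_width = 2.0 * sigma_used * math.sqrt(3.0)    # inverse of (4), full width
    r_threshold = abs(contour_contrast_response)      # estimated from (3)
    return road_width, sigma_used - SIGMA_RELAX, r_threshold - R_RELAX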

    E. A New Approach for Road Detection

The disadvantage of Steger's line detection is that it is not consistent across all map images and road characteristics, so a study will be performed on whether a new approach can obtain better results or improve the previously discussed method in some way. Other road characteristics will be explored, such as color (or brightness) and shape.

    One thing that can be done to identify rural roads and off-

    road tracks is to select in the image only a specific color

    range or brightness. As previously observed, rural roads are

    distinguishable in an aerial view image by being brighter and

    having a very characteristic color range.

    To make use of these properties, filters are applied to

    identify a specific color in an image. A user input point will

be used to sample the color of the road. For this purpose, the method discussed in subsection C is used.

    Fig. 2. Application of the color filter.

    In Fig. 2 the result of the application of a color filter is

shown. The filter samples the color at the input point and then selects similar color values throughout the image, within a defined range above and below that value. The color filter is applied to a grayscale image (so the color is a gray value) and the range is defined with a value of 30.

On top of this result, two more operations are applied. The first is a binarization, so that the filtered image is transformed into a binary one (black and white). The last operation applied is a maximum filter. This is applied over the binary image and is defined as follows:

M(x, y) = max{ I(i, j) : (i, j) ∈ W_{n×m}(x, y) } (9)

where W_{n×m}(x, y) denotes the n × m window centered at (x, y). A mask of 3 × 3 was used to implement this filter (meaning n = 3 and m = 3). Its objective is to merge close extracted points together, filling in this way some possible gaps in the roads. The effect will obviously extend to incorrectly extracted points, so some undesirable effects will also be amplified. In this way the extracted road will also have a property (dilated, grouped clusters) that will be useful later. The result of this filter application is shown in Fig. 3.

    Fig. 3. Application of the maximum filter.
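The color filter, binarization and maximum filter can be sketched with NumPy/SciPy as follows (the range of 30 and the 3 × 3 mask are the values used above; the function name is illustrative):

import numpy as np
from scipy.ndimage import maximum_filter

def color_filter_pipeline(gray, sample_value, value_range=30, mask_size=3):
    """Select pixels within +/- value_range of the sampled road gray value,
    binarize the result and apply an n x m maximum filter as in (9)
    (illustrative sketch)."""
    selected = np.abs(gray.astype(np.int32) - int(sample_value)) <= value_range
    binary = np.where(selected, 255, 0).astype(np.uint8)   # binarized filter output
    return maximum_filter(binary, size=mask_size)          # merges nearby road pixels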

    In order to clean some incorrectly identified areas another

    road property shall be used – its shape. To accomplish this,

    the template matching technique will be employed. This

    technique is usually employed to identify similar structures to

a template in an image. Here the principle will be the same:

    the road is identified with a template and a matching criterion

will be applied to the entire image. Normally with this technique the interest is in identifying only one or a small number of points at which the criterion reaches its maximum values, whereas here the interest will be in identifying areas of maximum values.

    Like before, the input user point (that is adjusted to be

    located over the road) will be used, in this case to extract a

    template from the image, thus providing a comparison base

    that is the actual road present in the image. In this way

similar road segments are expected to be detected and, as long as the map contains roads similar to the template, some of these similar roads will be identified.

The matching criterion used was the correlation coefficient. One problem, however, arises: the template matching technique is not rotation invariant, even though it can handle small rotation variations.

To overcome this problem, several templates will be used, at rotations spanning the range [0, 2π[ rad. This could be done only in [0, π[ rad, as all orientations of a straight road exist in this interval; however, better results are obtained in the first case. This is because the road may not be symmetrical on both sides – there may be nearby objects caught by the template and, mainly, the road contrast is almost never the same on both sides. Also, with this method, some responses will be reinforced by similar rotations.

To get differently oriented templates, rotating the existing one is not a solution. If this were done with a rectangular template, as is the case, some information would be lost from the template corners along with the rotation. The template should not be resized to accommodate the rotated result, in order to maintain result consistency across the different orientations.

To overcome this problem, the template is cropped to fit inside the original size and the missing corner information is filled with data from the original image. This is accomplished by extracting a rotated rectangular section from the image. To do this, it must be known exactly where to get each pixel from, and then that pixel is applied to the corresponding position in the template to be used. To find the pixel location in the original image, the rotation equations, given by the rotation matrix (written in terms of −θ), are used. So, to locate the needed pixels, the given rotation is performed around the center point C of the template in the original image, thus originating the equations:

x' = (x − C_x) cos(θ) + (y − C_y) sin(θ) + C_x
y' = −(x − C_x) sin(θ) + (y − C_y) cos(θ) + C_y (10)

Having this set of templates, a matching is performed for each one using the correlation coefficient criterion. The final result corresponds to the sum of all the unnormalized responses.
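This step can be sketched with OpenCV as below. Instead of applying (10) pixel by pixel, the sketch rotates the image around the template center and crops an axis-aligned patch, which extracts the same rotated rectangular section; the number of sampled angles and the function names are assumptions of this sketch:

import numpy as np
import cv2

def rotated_templates(image, center, size, n_angles=16):
    """Extract rectangular templates rotated around 'center' over [0, 2*pi[,
    keeping corner information from the original image (cf. (10)).
    'center' is an (x, y) tuple and 'size' a (width, height) tuple."""
    templates = []
    for angle in np.linspace(0.0, 360.0, n_angles, endpoint=False):
        rot = cv2.getRotationMatrix2D(center, angle, 1.0)
        rotated = cv2.warpAffine(image, rot, (image.shape[1], image.shape[0]))
        templates.append(cv2.getRectSubPix(rotated, size, center))
    return templates

def accumulated_matching(image, templates):
    """Sum the unnormalized correlation-coefficient responses of all
    rotated templates (illustrative sketch)."""
    total = None
    for t in templates:
        resp = cv2.matchTemplate(image, t, cv2.TM_CCOEFF)
        total = resp if total is None else total + resp
    return total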

    For the results of the color filter and the template matching

    to be comparable, two more operations must be performed on

    the template matching result: a threshold operation and a

    binarization of the image. The threshold value was set to 150

(the result was normalized to [0, 255]) and the binary threshold set to 0 (setting to white every pixel in the image with a value greater than 0).

Now, having the results of the discussed methods in the same domain, with the detected road standing out in both, they must be combined. The combination is accomplished by performing a pixel-wise weighted sum of both images. Each contribution has a weight of 50% in the final result. A final threshold is then applied, with a value of 150. The final result is presented in Fig. 4.

    Fig. 4. Association of the template matching and filter

    techniques after threshold.

    The objective of the preceding methods and operations was

    to study a new way to obtain roads from a satellite image

    other than the previously discussed automatic operation of

Steger's line detection method, or to find a way to amend

    it.

A hybrid solution where both methods contribute to the result can be devised. A possible way to do this is by adding both images pixel-wise and then thresholding the result at some value. To add them pixel-wise, a dilation operation must be performed on Steger's results, because of their thin lines, so that both images become comparable. The threshold operation then serves to clear some faint responses of Steger's method that were not reinforced by the new approach. In this way the global result combining both methods will complete the road areas left undetected by each, but will also bring some extra noise.

In Fig. 5 the result of the described combination methodology is presented. The threshold operation was performed at a value of 100 and the contribution of each method is the same as presented earlier. The example previously shown continued to prove difficult; nevertheless, some road sections were still added, completing the overall network. The noise added to the image, however, is considerable, and using this method also entails an increase in computational complexity. For these reasons, and because it did not dramatically improve the overall results, this method was not used.

    Fig. 5. Combination of the new approach with Steger.

    F. Road Structuring From Blobs

    Looking at the result from the road extraction method it is

    possible to identify a distinctive property. These images are

composed of aggregated sets of white pixels. In computer vision, regions with similar properties are called blobs. These sets of white pixels in the resulting image can be classified as blobs; however, in this case, they are very simple blobs distinguished from the surroundings by a single property – they are white (their pixels have a value of 255) while the surroundings are black (value 0).

    To find all blobs in an image, the image is scanned for non-

    zero valued pixels (since there are only two types, it will

    detect the white ones). Every time a pixel of this kind is

    found that does not belong to any already discovered blob, all

    connected components of the same kind are discovered, thus

originating a blob. A pixel is considered connected to another if it sits in the Moore neighborhood of it (i.e., it is located in one of the 8 pixels immediately around it). To find all the connected components, a BFS is triggered, rooted at the first pixel discovered.
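A sketch of this blob search in Python, using a BFS over the Moore neighborhood of each unvisited white pixel (names are illustrative):

import numpy as np
from collections import deque

MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def find_blobs(binary):
    """Return a list of blobs, each a list of (y, x) coordinates of
    connected white pixels (8-connectivity), discovered by BFS."""
    visited = np.zeros(binary.shape, dtype=bool)
    blobs = []
    h, w = binary.shape
    for y in range(h):
        for x in range(w):
            if binary[y, x] == 0 or visited[y, x]:
                continue
            queue, blob = deque([(y, x)]), []
            visited[y, x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for dy, dx in MOORE:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] != 0 \
                            and not visited[ny, nx]:
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs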

Essential to road structuring is a thinning operation performed on the blob image. This will be accomplished resorting to the Zhang-Suen thinning algorithm. This algorithm operates on a binary image (as is the case) and satisfies two essential criteria: the thinned responses are always 1-pixel thick and structure connectedness is preserved. The

    resulting thinned structures obtained will be named

    ThinBlobs.

    Next a classification of the points in each ThinBlob is

    performed. The key points that will be discovered are

junctions and terminations. Junction points are those from which more than two branches leave, whereas termination points are those from which fewer than two branches leave.

    To find these special points, the blob is traversed point by

    point. In each, its Moore neighborhood (points P1 to P8) is

    analyzed and a classification of each point is performed.

Algorithm Identify Junction or Termination
1. Initialize N = 0
2. Initialize connected = false
3.
4. For points P1 → P7
5.     If Pi is blob point
6.         If !connected
7.             N++
8.         End If
9.         connected = true
10.     Else
11.         connected = false
12.     End If
13. End For
14.
15. If P8 is blob point
16.     If connected AND P1 is blob point
17.         N--
18.     Else If !connected AND P1 is not blob point
19.         N++
20.     End If
21. End If
22.
23. If N > 2
24.     Is a junction
25. Else If N < 2
26.     Is a termination
27. End If
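An equivalent check can be written by counting, in circular order, the groups of connected blob pixels around the point, as in the Python sketch below (the neighbour ordering and the helper is_blob predicate are assumptions):

def classify_point(is_blob, y, x):
    """Classify a ThinBlob point as 'junction', 'termination' or 'regular'
    by counting the groups of connected neighbours around it
    (illustrative sketch; is_blob(y, x) tells whether a pixel belongs
    to the blob)."""
    ring = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    values = [1 if is_blob(y + dy, x + dx) else 0 for dy, dx in ring]
    # Number of 0 -> 1 transitions around the ring = number of branches N.
    n = sum(values[i] == 0 and values[(i + 1) % 8] == 1 for i in range(8))
    if n > 2:
        return "junction"
    if n < 2:
        return "termination"
    return "regular"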

    Having the location information of terminations and

    junctions, a classification will be performed on every other

point. These will either belong to termination branches or junction branches. Termination branches are the sets of

    points linking a junction to a termination or a termination to

    another termination, whereas junction branches will be those

    linking two junctions.

To compute the termination branches, a BFS exploration is used, rooted at each of the discovered termination points. The exploration ends either when there are no points left to explore in the blob or when a junction point is discovered. The exploration returns all traversed points, allowing them to be stored as the branch points. Every time a termination

    branch linked to a junction is found, it increases the number

    of connected terminations in the respective Junction

    structure.

Finding the points that constitute a junction branch is done in the same way as finding the termination branch points, that is, by BFS-exploring until another junction is detected. The branch in question is added to the newly found Junction structure, so that these branches are computed only once.

    Another operation done during this stage is to estimate the

angle of the termination branches. This angle is the approximate orientation of the branch ending, which will be necessary for the next step. It is computed as the running average of the angles of the last (at most) M points of the branch. The "angle between two points" referenced here is the angle between the x-axis of the reference frame used and the line formed by the two points. The value of M

    used for the implementation was 5.

Fig. 6 shows the result of the described branching operations applied to a (perfect) road extraction. Depicted in

    blue are the junction branches and in orange the termination

    branches.

    Fig. 6. Branching of an ideal road extraction.

To bridge the gaps between the thin blobs, which may be due to road occlusion or to inefficiency of the road detection algorithm, a linking algorithm is proposed. The

    objective is to find a path through the blobs that approaches

    an end point.

    The first thing to do is select a starting (thin) blob from a

    given initial point and then search for nearby (thin) blobs.

    When searching for close blobs, the main idea is to explore

    the areas next to blob endings taking advantage of the

    termination branch orientation. When a blob is found, the

    termination and the new point detected are linked with a

straight line. This is meant to fill small gaps, so a straight line will in most cases be a good or optimal approximation to the underlying road.

    In the linking process, all the blob branches that were

    previously found (and linked) are not eligible to start a new

    search process. In this way the linking will cascade from a

    given point, through the blobs, until there are no more blobs

left to link. A blob is considered to be in the linked state after all of its terminations have been explored. Blobs in this state cannot be linked again.

When no more unlinked blobs are found, the image is reanalyzed as a new blob image. In this way the newly linked blobs will be detected as one, with all the terminations, branches and properties already discussed.

    When extraction noise is present in the blob image, a few

    blobs can be incorrectly linked (as they don’t represent roads)

    so some modifications to the method will be tested.

    The first modification will be to deal with smaller images.

    Using the starting point, a sub-region of the original image is

extracted around that point. Then all the previously discussed methods are applied. Having linked the blobs in the image, the result is parsed as a blob and all of its termination points are obtained. From these terminations, at most two points are chosen and enqueued. Then the process is repeated on each of the enqueued points, treating them as starting points of a new process, until no more points are left. The results obtained are saved and kept overlapping each other, so that in the end a more complete road network is obtained.

    The sub-region area used was 300 × 300 pixels.

The two points chosen from the terminations will be those that, in relation to an end point, are the closest and have the smallest azimuth. The Cartesian distance between two points was used to assess their proximity, while the azimuth between two points was defined as follows:

azimuth(P, E) = |arctan2(P_y, P_x) − arctan2(E_y, E_x)| (11)

Here, P is a given point in Cartesian coordinates and E refers to the end point, in Cartesian coordinates as well. If the closest point and the one with the lowest azimuth are the same, only one point is enqueued.

    The overlapping of individual steps can create multiple

    responses for the same extracted road. This result will be

    parsed as one big ThinBlob, so to avoid this, the image is

    cleaned of small dark blobs (using a state of the art blob

    library [12]) and then thinned (using the Zhang-Suen

    algorithm).

    Fig. 7. Road extraction after linking, using several starting

    points.

    Using all of the user input points will produce better results.

    The process is repeated for each one of them and then the

    contributions are overlapped and treated in a similar manner

    as before. In Fig. 7 the overall result of this process on an

    example image is shown.

V. BUILDING THE ROADBOOK

    A. Parallel Execution

With the methods of the previous chapter available, two kinds of inputs are needed: an image (satellite map) and a set (possibly empty) of points assumed to be close to roads. The end point must also be provided, and it must be translated to the specific segment reference frame, even if it is located outside the image bounds. The segments can, in this way, each be treated independently as a single image.

Since the road extraction method has its own independent group of variables per segment, the easiest way to obtain some performance improvement is to compute the result of each segment, with its set of points, in parallel. This is done by executing the road detection method for each segment in its own independent thread. This kind of parallel execution is referred to as task parallelism.

    There will be as many tasks as the resulting number of map

segments. The results of all tasks are then stitched together in their segments' original relative positions, forming the global result of the initial map.

Using the parallel approach, the overall processing time will depend on the number of threads allowed by the system running the application and on the number of segments. In the best case it will take as long as the longest task and, in the worst case, with a single-threaded system, it will take as long as the sequential approach. The overhead of creating and destroying tasks is considered negligible compared to the tasks' complexity.
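A sketch of this task-parallel execution using a thread pool (Python; detect_roads stands for the per-segment road detection method and is a placeholder):

from concurrent.futures import ThreadPoolExecutor

def process_segments_in_parallel(segments, seed_points_per_segment, detect_roads):
    """Run the road detection method on each map segment in its own task
    and return the results in segment order (illustrative sketch;
    detect_roads(segment, points) is the per-segment road extraction)."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(detect_roads, seg, pts)
                   for seg, pts in zip(segments, seed_points_per_segment)]
        return [f.result() for f in futures]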

    B. Graph Conversion

Taking the generated global result and using it in a way that allows a roadbook to be presented requires the detected paths represented in the image to be translated into a graph. The first thing to do is to find how the paths are connected. The methods already presented allow an image with thinned interconnected lines to be translated into terminations, junctions and branches. This structure has some degree of similarity to the intended graph, so the first operation on the global result is to apply these methods and obtain a structured data set describing the represented network.

When converting the structured data resulting from the image processing operations into the graph, the nodes correspond to the junctions and terminations and the links correspond to each of the existing termination or junction branches. Converting a branch to a link requires the newly created link to be added to the adjacency lists of both nodes at the link's ends. These nodes are identified by the point where they are located: either the junction points, in the case of a junction branch, or the junction and termination points, in the case of a termination branch. The links have access to the point list of the branch, since this information is available. The link's distance (or length) is given by the number of points in that list, thus representing the distance between two nodes in pixels. Fig. 8 depicts this conversion process, from a road extraction result.

    Fig. 8. Graph conversion.
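The converted graph can be held in simple node and link structures such as the following Python sketch (field names are illustrative; they mirror the description above, with the link length given by the number of branch points):

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Link:
    """A branch converted to a graph edge; its length in pixels is the
    number of points in the branch."""
    points: List[Tuple[int, int]]
    node_a: "Node" = None
    node_b: "Node" = None

    @property
    def length(self) -> int:
        return len(self.points)

@dataclass
class Node:
    """A junction or termination converted to a graph node."""
    location: Tuple[int, int]
    kind: str = "regular"            # "start", "middle", "finish" or "regular"
    links: List[Link] = field(default_factory=list)

def add_branch_as_link(node_a: Node, node_b: Node, branch_points) -> Link:
    """Create a link for a branch and register it on both end nodes."""
    link = Link(points=list(branch_points), node_a=node_a, node_b=node_b)
    node_a.links.append(link)
    node_b.links.append(link)
    return link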

    C. Result Review

    Presenting the result of the image processing to the user is a

very important step. This allows the user to correct the

    detected course or select a new one and choose a path. For

    this purpose an interactive solution was adopted.

In this stage, the user has the ability to add new nodes, connect existing ones, or remove existing links or nodes. These operations are performed over the graph structure previously

    presented.

    The start, middle and finish nodes are added to the graph as

any other node. They are automatically added after the conversion from the image processing result to the graph is done. When adding a node to the graph, a search for a path in the vicinity is made. If a path is found, the node will automatically be added to that path. This is a graph operation accomplished by breaking the link in two and resetting the node ends of the resulting links accordingly. The point list is also divided, but this is just a matter of finding the right point position inside that list. Since the point list is the result of a BFS operation, the points are sequentially ordered, so dividing the list requires no operation other than breaking it at this point. If the node to add is a start, finish or middle point, a search for nearby nodes is additionally performed and any nodes found are transformed into one of these special points accordingly. The difference between these and

    any others is just a flag inside the node structure indicating

    what kind of node it is. These also appear identified in the

    GUI and are used for other graph operations.

    Connecting two nodes will add a new link to each of the

    nodes, with the point list corresponding to a straight line

    between them. A straight line can be a good approximation to

    the underlying missing road segment in many cases. In

situations where this is not true, some intermediary nodes can be added to improve the estimation of the road's shape.

    Removing nodes and links is useful in faulty road detection

    cases. Selecting any node will remove it and all of its links,

    so it will also affect neighbor nodes if any. Start and finish

    nodes cannot be removed. To delete a link, the two linked

    nodes must be chosen and the remove action selected.

    D. Path Calculation

    Start and finish nodes should also be connected for a path to

    be computed. If the user has selected middle points, those

    must also be connected and the path will pass through them.

While the start, finish and middle nodes are not connected, the application does not allow advancing to the next stage.

The algorithm used to get the roadbook path was a modification of Dijkstra's shortest path algorithm. The first operation is to check whether all required path nodes are connected, that is, whether, starting at a source node, all the targeted nodes can be reached. The most efficient way to do this would be to use an exploration algorithm such as a BFS or a DFS and check whether all selected nodes were found. With this solution the worst case complexity would be O(E), meaning that, at most, every edge would be visited once. However, the goal here is not solely to find out whether the nodes are connected. Since Dijkstra's algorithm is going to be applied anyway, it is also used to check whether the nodes are connected. A run of Dijkstra's algorithm outputs the minimum distance from a source node to every other graph node. If the source is not connected to a given node, Dijkstra's output indicates that the distance separating them is infinite. So, checking whether a given set of nodes is connected using Dijkstra's algorithm is just a matter of checking whether the distances to all of them are less than infinity. If this condition applies, then there is a path from start to finish through all middle points. With the implementation of Dijkstra's algorithm used, the worst case complexity is O(E log V), which is a bit worse than an exploration algorithm for the connectivity task alone; but since Dijkstra is computed afterwards anyway, this single computation resolves the connectivity question and simultaneously computes the path.
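A sketch of this use of Dijkstra's algorithm with a binary heap (Python; the graph structures follow the sketch in Section V.B and are otherwise assumptions). Unreachable nodes keep an infinite distance, which doubles as the connectivity check:

import heapq
import math

def dijkstra(nodes, source):
    """Shortest pixel distances from 'source' to every node; a distance of
    infinity means the node is not connected to the source
    (illustrative sketch, O(E log V) with a binary heap)."""
    dist = {id(n): math.inf for n in nodes}
    prev = {}
    dist[id(source)] = 0.0
    heap = [(0.0, id(source), source)]
    while heap:
        d, _, node = heapq.heappop(heap)
        if d > dist[id(node)]:
            continue                      # stale heap entry
        for link in node.links:
            neighbour = link.node_b if link.node_a is node else link.node_a
            nd = d + link.length
            if nd < dist[id(neighbour)]:
                dist[id(neighbour)] = nd
                prev[id(neighbour)] = (node, link)
                heapq.heappush(heap, (nd, id(neighbour), neighbour))
    return dist, prev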

    If the computation only involves discovering the path

    between the start and finish points, this problem is solved and

    the discovered path is the shortest, computed by Dijkstra’s

    algorithm. When more nodes are involved, a more complex

    solution must be considered.

    The solution here should always be a method that

    empowers the user with some level of choice regarding the

    final calculated path.

For the user to have this control, an extra parameter was added to the path calculation: a node priority. The user can set this priority in the GUI by selecting the intended middle node. In this way, the algorithm will search for the lowest priority value first, and only then will it search for the closest middle node. The nodes are initialized by default with a priority value corresponding to infinity; with this value, preference is given to the closest middle node instead.

    Fig. 9. Path selection using two prioritized middle nodes.

    In Figure 9 an example with prioritized middle nodes is

    shown. Here the user has chosen to attribute a priority of 1 to

    D. As this is the lowest priority of any of the middle nodes,

    the algorithm will start by finding a path to D, which will be

the shortest path, given the use of Dijkstra's algorithm. No priority was set on E, so the algorithm will find the next closest middle node, which in this case is the remaining one.

    Using this approach the user has control over the calculated

    path.

    E. Roadbook Presentation

    With the path calculated all that is left to do is to build the

actual roadbook. The roadbook will present the directions between road junctions, the total distance traveled and the distance from the last indication. The shape of the junction must be shown in each entry, as well as the navigational direction (where the route comes from and where it is going).

To depict the shape of the junction, a roadbook tulip is used. Here, advantage is taken of information already present, that is, the shape of the roads around the junction. Whether automatically detected or user adjusted, this shape represents the road junction in the best possible way, given that the only source of information is the satellite map image.

    The roadbook is formed by a set of pages, each one having

a set of entries and a header. Each entry represents an indication of the road to follow, with a tulip, a sequence number, and the total and partial distances covered. A page is built as an image with dimensions big enough to accommodate the header plus a predetermined number of entries. The number

    of entries set for each page was 7.

    Fig. 10. Generated Roadbook.

The nodes of the computed path are iterated over. At each previously identified node (where a tulip was created) an entry is created. A sequence number is maintained, to be printed in each entry, as well as an accumulator with the total traveled distance and another with the partial distance. Every time a node signaled to appear in the page is reached, the partial distance is printed on the page and reset to zero, providing in this way only the distance from the last entry.

All the distances are internally stored as a sum of the graph link lengths, given in pixels, and presented in kilometers (the metric defined for the roadbook). In Fig. 10 a constructed roadbook corresponding to the calculated path of Fig. 8 is shown.

    VI. CONCLUSIONS AND FUTURE WORK

    Road recognition from the obtained images was one of the

    main issues in this work. To overcome this problem, several

image processing methods aiming at road extraction from satellite maps were studied, and a new approach for road extraction was proposed, with results comparable to existing state-of-the-art methods. The road extraction problem was resolved using the developed method along with a linking algorithm that tries to improve the results and better suit them to the objective of this part of the project. This algorithm and

    the developed road extraction method are independent and

    are expected to work separately. Specifically, the extraction

    process can serve as a road recognition method in other

    projects, can be improved with more or different intermediary

    methods (template matching and color filtering were used

    here) or even serve as a starting point for other algorithms to

    work upon.

As the road extraction process may not be perfect, map reviewing by the user takes on particular significance to correct and complete the path. In this way, it is guaranteed that a roadbook can always be generated for any location. A future improvement could be to replace the current road extraction method with another, yet to be developed, with better results. User validation will always be required, even in the best case scenario, to ensure that the computed path to be presented in the roadbook is the intended one.

    In terms of performance improvement, a different

    parallelization process could be approached. The most

    intensive tasks are the image processing ones, so instead of

    concurrently processing each segment of the satellite image,

    smaller tasks could be identified inside that large processing

    task, making the parallelization more efficient. Another

improvement would be to perform some or all of the image processing computations on the GPU. Given the purpose of this unit and the computing power usually associated with it, performance improvements would be expected from delegating the execution of these tasks to it.

Integration with other systems aiming to improve results can also be planned as future work. Building a system using both GPS and a satellite image approach would bring the best of both worlds, providing more accurate results and introducing a variety of new approaches to the roadbook problem. Similarly, this work could also improve some of the

    existing commercial solutions for roadbook creation already

    using these systems.

A greater variety of maps can also be introduced by integrating map services other than Google Maps. This could be taken as an opportunity to improve results: by comparing different images of the same location, road extraction could potentially achieve better results. In fact, some of the existing methods studied are already based on identifying image differences. Integration with

    commercial satellite image services could also allow new

    approaches for road extraction given the nature of the

    provided images (high-resolution, multispectral).

    REFERENCES

    [1] Steger, C. (1996). Extracting curvilinear structures: A

    differential geometric approach. In Computer Vision—

    ECCV'96 (pp. 630-641). Springer Berlin Heidelberg.

    [2] Mena, J. B. (2003). State of the art on automatic road

    extraction for GIS update: a novel classification. Pattern

    Recognition Letters, 24(16), 3037-3058.

    [3] Tripy, GPS + Digital Road Book. (n.d.). Tripy GPS + Digital

    Road Book for motorbikes, 4X4, Quad and Oltimers. Retrieved

    September 18, 2013, from http://www.tripy.eu/en/

    [4] Li, Y., & Briggs, R. (2009). Automatic extraction of roads from

    high resolution aerial and satellite images with heavy noise.

    World Academy of Science, Engineering and Technology, 54,

    416-422.

    [5] Bacher, U., & Mayer, H. (2005). Automatic road extraction

    from multispectral high resolution satellite images.

    Proceedings of CMRT05.

    [6] Christophe, E., & Inglada, J. (2007, September). Robust road

    extraction for high resolution satellite images. In Image

    Processing, 2007. ICIP 2007. IEEE International Conference

    on (Vol. 5, pp. V-437). IEEE.

    [7] Porikli, F. M. (2003, September). Road extraction by point-

    wise gaussian models. In AeroSense 2003 (pp. 758-764).

    International Society for Optics and Photonics.

    [8] Mayer, H., Baltsavias, E., & Bacher, U. (2005). Automated

    Extraction, Refinement, and Update of Road Databases from

    Imagery and Other Data.

    [9] Laptev, I. (1997). Road extraction based on snakes and

    sophisticated line extraction. Master's thesis, Royal Institute of

    Technology, Stockholm, Sweden.

    [10] Kass, M., Witkin, A., & Terzopoulos, D. (1988). Snakes:

    Active contour models. International journal of computer

    vision, 1(4), 321-331.

    [11] Pandit, V. (2009). Automatic Road Extraction From High

    Resolution Satellite Imagery.

    [12] CvBlobsLib — OpenCV Wiki. (n.d.). OpenCV Wiki. Retrieved

    July 2, 2013, from http://opencv.willowgarage.com

    /wiki/cvBlobsLib

    [13] Issue 4189: Add a scale bar to static maps. (n.d.). Gmaps-api-

    issue. Retrieved November 25, 2013, from

    https://code.google.com/p/gmaps-api-issues/issues/detail?id=

    4189