Automatic Off-Road Roadbook Creation from Satellite Maps
João Pedro Carvalho Camejo Instituto Superior Técnico, Lisbon, Portugal
Abstract — Roadbooks remain irreplaceable, fundamental
tools for navigating a pre-planned route, and are commonly
used on rural trips, for sightseeing and in all-terrain raids. This
thesis describes the construction of an application that generates
a roadbook from satellite images, a new approach for
applications of this kind. Satellite images are retrieved
automatically from the Google Maps imagery database, so image
processing techniques must be used for road extraction. Steger's
line detection method is one of the best known and most widely
used. Since one of the objectives of this work is a fully
autonomous road identification process, an extension to Steger's
method that selects its parameters automatically was developed.
Additionally, a new road extraction approach, based mainly on
color filtering and template matching techniques, is presented.
To overcome possible faults in the road extraction process, user
interaction is introduced, giving the user full control over the
path to be presented in the roadbook.
I. INTRODUCTION
A roadbook is a set of sequential indications over a road or
a path, allowing for anyone following it to reach a destination
from a starting location. It may consist of a set of pages or
just a long list of direction entries. Each roadbook entry
usually provides the traveled distance, the direction to take
and some additional notes on the road surroundings. Although
there is no standard roadbook layout, direction information is
always present. It is represented by a drawing of each
crossing or road junction, called a tulip, with the path to
follow visually highlighted for fast recognition. Roadbooks
normally only carry information on these road junctions,
which is enough to keep anyone following them from getting
lost. Complementary information, such as GPS coordinates of
distinct road segments, is also frequently included.
Roadbooks are often hard to build, requiring prior track
reconnaissance and advance planning. As will be discussed,
the tools available to build roadbooks are still very limited
and sometimes unreliable.
A. Objectives
The main objective of this work is to build an application
that enables the creation of a roadbook for a user-chosen
path. The project relies heavily on user interaction, so
emphasis is placed on the application interface and on
ease-of-use issues.
To build roadbooks, the application relies only on satellite
images. These images must therefore be integrated into the
application in a way that lets the user choose the intended
travel path. User-supplied images must be accepted, but
automatic fetching of the images is preferable. The user must
be able to select start and finish locations on the satellite
map. Other, non-mandatory points can also be input to
indicate intermediate travel locations, giving more control
over the chosen path. A path then has to be calculated using
only those points and the image, so a method to extract roads
from an image must be investigated.
B. Major Contributions
Building a roadbook from satellite imagery and user
interaction is a new approach to roadbook construction: no
existing automatic roadbook tool relies on satellite image
information. Automatic roadbook tools are scarce, and those
that exist depend on other systems or databases.
To use images as a source of information, several computer
vision methods for road extraction were studied. A new
method was also developed for this purpose, mainly to better
suit the needs of the project. It is compared with a
state-of-the-art method [1], and that method is also adapted
to run autonomously, with automatic parameter selection.
II. RELATED WORK AND USED TECHNIQUES
A. Related Work
Road recognition is a well-studied subject, very important
for GIS databases and other information systems, which are
fundamental to many location services such as GPS
navigation and map guidance. Given the huge number of road
networks in the world, road recognition from satellite maps is
one way to help complete these databases [2]. One of the
main difficulties is finding a method that works across the
existing multitude of settings and scenarios. There is still no
method able to reliably extract a complete road network from
a satellite image. The great number of existing approaches
shows that this is a highly non-trivial problem, which may be
why no commercial automatic roadbook creation solution
using this approach exists.
Roadbooks are often created for off-road tracks where there
is limited information available about the existing roads,
unlike urban environments, where the network is well known
and, in most cases, up to date. A solution is to extract
information directly from sources such as satellite maps.
There are, however, some available solutions for creating
roadbooks. These often use GPS tracking to store taken
routes or to plan new ones. In some cases GPS is also used
actively during the trip, allowing route correction when no
map is available, as in most off-road cases [3]. Other
solutions allow roadbooks to be created or edited by
manually choosing and setting each tulip and its information
according to the intended course.
Many contributions exist on the topic of automatic road
extraction from satellite and aerial-view images. Li and Briggs
[4] propose a method using a "reference circle" to extract
roads from high-resolution aerial and noisy satellite images.
Bacher and Mayer [5] propose an approach to extract rural
roads that takes advantage of the multispectral properties of
high-resolution satellite images. Steger [1] proposes a method
extracting curvilinear structures from images using a
differential geometric approach. Christophe and Inglada [6]
propose a fast algorithm using a geometric method for a first
extraction step, to be refined by human interaction or GIS
integration. Porikli [7] proposes a method to extract roads
from very low resolution satellite imagery using a Gaussian
model. Additionally, Mayer et al. [8] performed a
comparative study of a set of automated road extraction
methods aimed at updating existing databases.
On the other hand, semi-automatic methods are
characterized by making use of information known prior to
extraction, usually a set of points indicating road locations,
called seed points. Most of the time these methods are used in
conjunction with automatic ones, to complete them or to
bypass the need for seed points. One example is the work of
Laptev [9], combining an automatic curvilinear extraction
method [1] with a semi-automatic one using active contour
models (also known as snakes) [10]. Pandit [11] also uses a
combination of several semi-automatic methods, mainly a
variation of region-growing-based road extraction called
adaptive texture matching.
B. Image Lines Extraction through a Differential Geometric
Approach
Steger's work [1] uses a differential geometric approach to
extract, from images, lines with the same characteristics as
roads. This method has the advantage of performing well at
both high and low resolutions, and the approach is not
specific to any particular type of aerial/satellite image
(multispectral, multi-temporal, noisy, etc.), so it suits the
needs of this project.
Steger starts from one-dimensional approximations to the
profile of the line to extract, namely parabolic and bar-shaped
profiles. Line positions are identified by finding the points
where the first derivative of the image convolved with a
Gaussian kernel vanishes; salient lines are then selected at the
minima of the second derivative. For bar-shaped profiles of
half-width w and height h, the responses to convolution with
the Gaussian kernel g_σ and its derivatives are:

r_b(x, σ, w, h) = h (Φ_σ(x + w) − Φ_σ(x − w))   (1)
r_b'(x, σ, w, h) = h (g_σ(x + w) − g_σ(x − w))   (2)
r_b''(x, σ, w, h) = h (g_σ'(x + w) − g_σ'(x − w))   (3)

where Φ_σ is the integral of the Gaussian. The first derivative
vanishes at x = 0 for all σ > 0. The second, however, will not
exhibit this behavior for small σ. For it to hold, the following
condition must be true:

σ ≥ w / √3   (4)

In (4), w represents half the line width. For σ = w/√3,
r_b''(0, σ, w, h) takes its maximum negative response.
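The value σ = w/√3 at which the magnitude of the center response is maximal can be verified by differentiating r_b''(0, σ, w, h) with respect to σ (a short derivation from (3), using the Gaussian kernel g_σ and its derivative g_σ'):

```latex
r_b''(0,\sigma,w,h) = h\left(g'_\sigma(w) - g'_\sigma(-w)\right)
                    = -\frac{2hw}{\sqrt{2\pi}\,\sigma^{3}}\,e^{-\frac{w^{2}}{2\sigma^{2}}}

\frac{\partial}{\partial\sigma}\,r_b''(0,\sigma,w,h)
  = \frac{2hw}{\sqrt{2\pi}\,\sigma^{4}}\,e^{-\frac{w^{2}}{2\sigma^{2}}}
    \left(3 - \frac{w^{2}}{\sigma^{2}}\right) = 0
  \;\Longrightarrow\; \sigma = \frac{w}{\sqrt{3}}
```

For σ below w/√3 the derivative is negative and the minimum of the second derivative at the line center becomes weak, which motivates the restriction in (4).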
For lines in two dimensions, the same analysis is performed
in the direction perpendicular to the line, n(t). This direction
is computed resorting to the Hessian matrix.
C. Zhang-Suen Thinning Algorithm
Zhang-Suen thinning algorithm is a fast algorithm applied
to digital patterns in binary images to reduce features to a
one-pixel thick pattern while preserving connectivity and end
points.
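As a concrete illustration, a minimal sketch of the Zhang-Suen algorithm on a binary image stored as nested lists of 0/1 (the function name and representation are ours, not from the thesis; border pixels are left untouched, which is fine for zero-padded images):

```python
def zhang_suen_thin(img):
    """Thin a binary image (nested lists of 0/1) to 1-pixel-wide lines."""
    img = [row[:] for row in img]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9: the 8 neighbors, clockwise starting from the pixel above
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1],
                img[y + 1][x + 1], img[y + 1][x], img[y + 1][x - 1],
                img[y][x - 1], img[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not img[y][x]:
                        continue
                    n = neighbours(y, x)
                    b = sum(n)                                   # set neighbors
                    a = sum(n[i] == 0 and n[(i + 1) % 8] == 1
                            for i in range(8))                   # 0 -> 1 transitions
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((y, x))   # delete simultaneously
            for y, x in to_clear:
                img[y][x] = 0
            changed = changed or bool(to_clear)
    return img
```

The two subiterations peel pixels from opposite sides of a structure, which is what preserves connectivity while thinning.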
III. SATELLITE MAPS
Before processing the satellite images to extract the
information for calculation of routes, is a key part of this
work is how to obtain these images. This is performed using
Google Maps, as this is a free reliable service offering a large
variety of maps. This is integrated in the application using the
service’s API, allowing automatic map fetching.
A. Map Scale
One aspect missing from the Google Maps API is the
possibility of obtaining a scale for the retrieved map. This is
crucial in this work, as it will be necessary to measure
distances and make other calculations involving the
translation into actual lengths. Since the Earth is not flat, the
scale has no linear relation with the zoom level used and also
varies with the latitude of the location. Google recommends
[13], as an alternative solution, computing an approximate
scale with the formula:

s = (2π · 6378137 · cos(latitude · π/180)) / (256 · 2^zoom)   [m/px]   (5)

A study was performed to evaluate the accuracy of this
formula. A systematic error of about 14.6% was found, so a
corresponding correction is applied to (5).
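Formula (5) can be sketched in Python as follows. The constant 2π · 6378137 / 256 ≈ 156543.03 m/px is the zoom-0 equatorial ground resolution of a 256-pixel base tile; the 14.6% correction is exposed as a parameter, since the text gives only its magnitude, not its direction:

```python
import math

def meters_per_pixel(latitude_deg, zoom, correction=1.0):
    """Approximate Google Maps ground resolution, as in (5)."""
    equatorial = 2 * math.pi * 6378137 / 256   # m/px at zoom 0 on the Equator
    return equatorial * math.cos(math.radians(latitude_deg)) / (2 ** zoom) * correction
```

At zoom 17 near latitude 40° this gives roughly 0.9 m/px, consistent with the pixel/meter figures used later in the text.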
B. Map Segmentation
Selecting the course to be represented by the roadbook is
done through user interaction: the user selects a start point, a
finish point and, possibly, intermediary points. Google Maps
offers satellite images at various zoom levels, so maps are
requested by default at level 17 (below the maximum of 23)
or at the minimum zoom among the start, finish and
intermediary points input by the user, if any of them is lower
than the default.
At the defined zoom level, requests must be made for
images covering all the input point locations. An image
request is made using the location's geographic coordinates.
Translating screen Cartesian coordinates to geographic ones
and vice-versa is accomplished through the Google Maps
Javascript API.
Having the correct projection of the map, the API method is
used to translate each point into a known Cartesian reference
frame. This reference frame is defined by the user's
visualization window at the zoom level used to request each
image segment. The translation is made using:

P = (P_A − P_origin) · 2^zoom,   P, P_A, P_origin ∈ ℝ²   (6)

In (6), P is the point in the new reference frame, P_A the
point given by the API method, and P_origin the point given
by the API method when translating the pair of coordinates
formed by the latitude of the northeast corner of the
visualization window and the longitude of the southwest
corner of the same window. This yields the X value of the
left edge and the Y value of the upper limit of the preview
window, which correspond to the X, Y coordinates of the
origin of the proposed reference frame, expressed in the
frame given by the API. As both points (P_A and P_origin)
are in the same reference frame, putting P_A in the new
frame is just a matter of a translation, subtracting P_origin
from P_A. The result is then multiplied by a scale factor in
which zoom is the previously defined level used for requests.
To process each map segment independently, each point has
to be translated into the segment's reference frame. This is
done using the translation:

P'' = P − P_min,   P'' ∈ ℝ²   (7)

where P'' is the point shifted to the positive space of the
segment reference frame and P_min is the point containing
the minimum Cartesian coordinates among all points.
IV. IMAGE PROCESSING
A. Road Characterization from Satellite Images
Knowing what is a road and what is not is one of the main
issues addressed here. Roads have some unique
characteristics. In a rural environment, roads are usually
distinguishable as bright, elongated lines with few
intersections, low curvature and variable width. In an urban
environment, roads are usually quite different, with
approximately constant width, high junction density, short
segments meeting at sharp angles, and paved surfaces
normally darker than the surroundings.
Visually, these are the characteristics that make rural roads
recognizable as such when viewed from above, as in satellite
images. Since rural roads are the ones of interest in this work,
these characteristics had to be identified so that a method
could be developed to recognize them and separate the roads
from their surroundings.
Even though these characteristics are representative of and
usually associated with rural roads, one aspect that brings
extra complexity to this project is the wide variation in these
environments (much wider than in urban ones, for instance),
so finding one method that works for all of them is very
difficult.
B. Steger’s Line Detector
With Steger's method it is possible to extract curvilinear
structures using differential geometric properties of the
image. In fact, one of the motivations of Steger's work was to
extract roads from aerial images, so a study is performed to
evaluate how well this method suits the needs of this project.
This method is mainly governed by two parameters: σ and r.
σ is related to the road width, according to (4), and r is a
threshold used to select line saliency. Figure 1 shows a
simple, clearly distinguishable curved road of constant width
and the result of applying Steger's method. In this case the
road is about 11 pixels wide, so according to (4) a value of
3.18 was selected for σ. For r, a value of 3 was chosen so that
only the main road line is selected.
Fig. 1. Road detection using Steger’s method.
The method outputs a set of contours, each consisting of a
line with several intensity values along its pixels. A higher
value means the response at that pixel was a better match for
a curvilinear structure with the selected parameters, and
translates into a brighter pixel in the result. By the same
reasoning, darker pixels in the result were worse matches for
the selected curvilinear structure. Selecting a high enough
value for r clears these low response values, most of which
do not belong to the line. For the result to be visible, the
response values are normalized to the range [0, 255].
The result clearly shows that the extracted line follows the
road's curves and adapts to the road very well. However,
some decisive issues must be analyzed. Closely comparing
the original image with the result, a small road at the bottom,
branching off the main one, is poorly detected: only a small
branch close to the junction appears. This smaller road is
neither as bright nor as wide as the main one. This means,
first, that the value of r should have been lower for this
lower-contrast road to be detected; however, with a lower
value, other responses not really belonging to the road would
begin to appear. Even then, there would be no guarantee that
this small road would be extracted, because it is narrower
than the main one and would need a different value of σ to be
detected. In fact, even on the main road, sections whose
width differs from the expected 11 pixels produce responses
that are either very faint or missing entirely. With different
threshold values, even the faint sections could disappear
completely.
Another visible aspect is that this method does not deal well
with road occlusion. If a tree blocks the view, the method
extracts only the visible part of the road, if any. Where no
curvilinear structure is detected, the response is zero (or very
low), even if it is just a small gap in the road caused by a tree
blocking the view. Where a small portion of the road is still
visible, the extraction contours the obstruction, treating the
road as a narrower section (which may have a lower response,
as described above); this makes the detected road curvy even
in straight sections, an undesirable effect.
C. Finding Near Roads
When the user marks points on the map, due to a problem
inherent to the Google Maps Javascript API, the points may
not land in the exact location where the user put them. So
even if the user marked a point directly over the road,
depending on the zoom level used, that point may be
displaced. Because of this, a marked point cannot be assumed
to identify the exact location of a road.
The marked point may not pinpoint a road exactly, but a
road will almost certainly exist in its vicinity, unless the user
marked a completely empty location. It is therefore assumed
that a road always exists close to the point. To resolve this
issue, Steger's line detection method is used to extract a
curvilinear structure in the surroundings of the point.
To accomplish this, a 100 × 100 pixel subsection of the
image centered on the marked point is analyzed. Since
nothing is known about the road to be found, some
assumptions must be made to choose initial parameters (σ
and r) for Steger's line detection method. A value of 5.0 is
used for r, which is relatively high and gives good confidence
that no lines are incorrectly identified as roads. As for σ, the
assumption is that the widest roads will be around 20 pixels
wide; at the default zoom level (17) the scale ranges over
about 2 - 3 pixels/m, making such roads about 7 - 10 meters
wide. According to (4), 20-pixel-wide lines are expected to
have σ ≈ 5.7.
To shift the point to a location over the road, the maximum
over the response subsection of the function

L(x) = R(x) · (2 − d(x)/D),   x ∈ ℕ²   (8)

is computed, and the point is moved to the location of that
maximum. In (8), R(x) is the response of Steger's contours at
location x, d(x) is the distance between x and the original
point (here, the fixed center of the image subsection) and D is
a constant representing the maximum attainable distance,
which for a 100 × 100 section is ≈ 70.71 pixels. In this way
the Steger contour response is weighted by a distance factor:
distant lines are not considered unless they have a very large
response, and close lines are preferred. The point therefore
shifts to nearby lines that, given their response values, most
likely correspond to roads in the original image.
Algorithm Find Near Roads
1. Initialize sigma = maximum road width / (2 * Sqrt(3))
2. Initialize r = 5.0
3.
4. While number of found contours < 3
5.     Find Steger Lines(sigma, r)
6.     Decrease sigma
7. End While
8.
9. Find the maximum of L(x)
10. Adjusted road point = maximum location
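The distance weighting of (8) and the final maximization can be sketched with NumPy as follows (a hypothetical helper; `response` stands for the Steger contour responses over the subsection, with the marked point at its center):

```python
import numpy as np

def shift_point_to_road(response, center=None):
    """Move a marked point to the strongest nearby line response.

    response: 2-D array of Steger line responses for the subsection.
    Returns the (row, col) maximising L(x) = R(x) * (2 - d(x)/D), eq. (8).
    """
    h, w = response.shape
    if center is None:
        center = (h // 2, w // 2)          # the marked point, fixed at the center
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.hypot(ys - center[0], xs - center[1])   # distance to the marked point
    D = np.hypot(h / 2, w / 2)                     # maximum attainable distance
    L = response * (2.0 - d / D)                   # distance-weighted response
    return np.unravel_index(np.argmax(L), L.shape)
```

For a 100 × 100 subsection, D evaluates to ≈ 70.71 pixels, as stated in the text.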
D. Automatic Parameter Selection
With the road location pinpointed, information about the
road can be obtained, namely its width and its contrast.
After a run of the algorithm above locates a nearby road, the
road width can be found by inverting (4); the σ value is the
one the method used to find the road. Knowing it already
provides a first estimate of what the roads on the map look
like.
Another useful piece of information provided by the
computation of Steger's algorithm is the contrast of the found
contour. With all this information, the threshold r can be
estimated by computing the absolute value of (3).
With this threshold and this σ, running Steger's algorithm
would search for all roads similar to the one found. This is
not very desirable, since the search for nearby roads always
finds the widest road in the area. To correct this, σ and r are
slightly decreased, so that the constraints are widened and
narrower roads can also be found. σ is decreased by 0.2 and r
by 0.1.
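Under the definitions of (3) and (4), this parameter selection step might look like the following sketch (the Gaussian-derivative formula and the function names are our reconstruction for illustration, not code from the thesis):

```python
import math

def gauss_deriv(x, sigma):
    """First derivative of the Gaussian kernel g_sigma."""
    return -x / (math.sqrt(2 * math.pi) * sigma ** 3) * math.exp(-x * x / (2 * sigma ** 2))

def auto_parameters(sigma_found, contrast, d_sigma=0.2, d_r=0.1):
    """Derive Steger parameters from the road found near the user point.

    sigma_found: the sigma that located the nearby road; contrast: the
    Steger line contrast h of that contour. The half-width follows from
    inverting (4), r from |r''| at the line center (eq. (3) with x = 0),
    and both are relaxed by the 0.2 / 0.1 decrements from the text so
    narrower and fainter roads are also found.
    """
    half_width = math.sqrt(3) * sigma_found            # invert condition (4)
    r = abs(contrast * (gauss_deriv(half_width, sigma_found)
                        - gauss_deriv(-half_width, sigma_found)))
    return sigma_found - d_sigma, r - d_r
```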
E. A New Approach for Road Detection
The disadvantage of Steger's line detection is that it is not
consistent across all image maps and road characteristics, so a
study is performed on whether a new approach can obtain
better results or improve the previously discussed method in
some way. Other road characteristics are explored, such as
color (or brightness) and shape.
One thing that can be done to identify rural roads and off-
road tracks is to select in the image only a specific color
range or brightness. As previously observed, rural roads are
distinguishable in an aerial view image by being brighter and
having a very characteristic color range.
To make use of these properties, filters are applied to
identify a specific color in the image. A user input point is
used to sample the color of the road; for this purpose, the
point adjustment method discussed in Section C is used.
Fig. 2. Application of the color filter.
Fig. 2 shows the result of applying a color filter. The filter
samples the color at the input point and then selects similar
color values throughout the image, inside a defined range
above and below that value. The color filter is used on a
grayscale image (so the color is a gray value) and the range is
defined with a value of 30.
On top of this result two more operations are applied. The
first is binarization, turning the filtered image into a binary
(black and white) one. The last operation is a maximum filter,
applied over the binary image and defined as:

g(x, y) = max{ f(x + i, y + j) : i = −⌊n/2⌋ … ⌊n/2⌋, j = −⌊m/2⌋ … ⌊m/2⌋ }   (9)

A 3 × 3 convolution mask was used to implement this filter
(i.e. n = 3 and m = 3). Its purpose is to merge close extracted
points, filling some possible gaps in the roads. The effect
obviously extends to incorrectly extracted points, so some
undesirable effects are also amplified. The extracted road
also gains a property (dilated grouped clusters) that will be
useful later. The result of applying this filter is shown in
Fig. 3.
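The color filter, binarization and 3 × 3 maximum filter of (9) can be sketched together in NumPy (a hypothetical helper; the maximum filter is implemented with shifted slices over an edge-padded copy):

```python
import numpy as np

def extract_road_mask(gray, seed, half_range=30):
    """Color filter + binarization + 3x3 maximum filter, as described above.

    gray: 2-D uint8 grayscale image; seed: (row, col) point on the road;
    half_range: the +/- gray-value window around the sampled road color.
    """
    sample = int(gray[seed])                                     # sampled road color
    binary = (np.abs(gray.astype(int) - sample) <= half_range)   # color filter
    binary = binary.astype(np.uint8) * 255                       # binarize
    # 3x3 maximum filter: each pixel takes the maximum of its neighborhood
    padded = np.pad(binary, 1, mode='edge')
    out = np.zeros_like(binary)
    for dy in range(3):
        for dx in range(3):
            np.maximum(out, padded[dy:dy + binary.shape[0],
                                   dx:dx + binary.shape[1]], out=out)
    return out
```

On a binary image the maximum filter is exactly a 3 × 3 dilation, which is what merges nearby road pixels and fills one-pixel gaps.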
Fig. 3. Application of the maximum filter.
To clean up some incorrectly identified areas, another road
property is used: its shape. To accomplish this, the template
matching technique is employed. This technique is usually
used to find structures similar to a template in an image; here
the principle is the same, with the road identified by a
template and a matching criterion applied to the entire image.
Normally this technique is used to identify just one point, or
a small set of points, where the criterion reaches its
maximum; here the interest is in identifying whole areas of
maximum values.
As before, the user input point (adjusted to lie over the
road) is used, in this case to extract a template from the
image, providing a comparison base that is the actual road
present in the image. Similar road segments are thus expected
to be detected: as long as the map contains roads similar to
the template, some of them will be identified.
The matching criterion used was the correlation coefficient.
One problem arises, however: template matching is not
rotation invariant, even though it can handle small rotation
variations.
To overcome this problem, several templates are used, at
rotations covering the range [0, 2π[ rad. This could be done
over [0, π[ rad only, as every orientation of a straight road
exists in that interval; however, the full range gives better
results. This is because the road may not be symmetrical on
both sides: nearby objects may be caught by the template and,
mainly, the road contrast is almost never the same on both
sides. With the full range, some responses are also reinforced
by templates with similar rotations.
To obtain differently oriented templates, rotating the
existing one is not a solution: with a rectangular template, as
is the case here, rotation would lose information from the
template corners. The template should also not be resized to
accommodate the rotated result, so that results stay consistent
across the different orientations.
To overcome this, the template is kept at the original size
and the missing corner information is filled with data from
the original image. This is accomplished by extracting a
rotated rectangular section directly from the image. To do so,
it must be known exactly where each pixel comes from, so it
can be placed in the corresponding position of the template to
use. To find each pixel's location in the original image, the
rotation equations given by the rotation matrix (in terms of
−θ) are used. The rotation is performed around the center
point C of the template in the original image, giving:

x_s = (x − C_x) cos(−θ) − (y − C_y) sin(−θ) + C_x
y_s = (x − C_x) sin(−θ) + (y − C_y) cos(−θ) + C_y   (10)

With this set of templates, a matching is performed for each
one using the correlation coefficient criterion. The final result
is the sum of all the unnormalized responses.
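Extracting a rotated rectangular section via (10) can be sketched as follows (a hypothetical NumPy helper using nearest-neighbor sampling and clamping at the image border; odd template sizes keep the sampling exact at θ = 0):

```python
import numpy as np

def rotated_template(image, center, w, h, theta):
    """Extract a w x h template rotated by theta around `center`, per (10)."""
    cy, cx = center
    H, W = image.shape
    tmpl = np.zeros((h, w), dtype=image.dtype)
    for ty in range(h):
        for tx in range(w):
            # offset of this template pixel from the template center
            dx, dy = tx - (w - 1) / 2.0, ty - (h - 1) / 2.0
            # rotate the offset by -theta around C to find the source pixel
            sx = dx * np.cos(-theta) - dy * np.sin(-theta) + cx
            sy = dx * np.sin(-theta) + dy * np.cos(-theta) + cy
            tmpl[ty, tx] = image[min(max(int(round(sy)), 0), H - 1),
                                 min(max(int(round(sx)), 0), W - 1)]
    return tmpl
```

Each rotated template can then be matched against the image with the correlation coefficient, and the unnormalized responses summed as described above.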
For the results of the color filter and of the template
matching to be comparable, two more operations must be
performed on the template matching result: a threshold and a
binarization. The threshold value was set to 150 (the result
having been normalized to [0, 255]) and the binary threshold
to 0 (every pixel with a value greater than 0 is set to white).
With the results of both methods now in the same domain,
with the detected road dilated, they must be combined. The
combination is a pixel-wise weighted sum of both images,
each contributing 50% to the final result. A final threshold is
then applied, with a value of 150. The final result is presented
in Fig. 4.
Fig. 4. Association of the template matching and filter
techniques after threshold.
The objective of the preceding methods and operations was
to study a new way of obtaining roads from a satellite image,
other than the previously discussed automatic operation of
Steger's line detection method, or a way to amend it.
A hybrid solution in which both methods contribute to the
result can also be considered. One possible way is to add both
images pixel-wise and then threshold the result at some
value. Before adding them, the Steger result must be dilated,
because its lines are thin, so that the two images become
comparable. The threshold then clears faint responses of
Steger's method that were not reinforced by the new
approach. The combined result completes the road areas
undetected by each method, but also adds some extra noise.
Fig. 5 presents the result of the described combination. The
threshold was set at 100 and the contribution of each method
is the same as presented earlier. The example shown earlier
remained difficult; nevertheless, some road sections were
added, completing the overall network. The noise added to
the image is considerable, however, and the method also
increases computational complexity. For these reasons, and
because it does not dramatically improve the overall results,
this method was not used.
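The combination described above might be sketched like this (a hypothetical helper; the 3 × 3 dilation size is our choice, while the 50%/50% weights and the threshold of 100 follow the text):

```python
import numpy as np

def combine_results(steger, color_tm, thresh=100):
    """Pixel-wise weighted sum of dilated Steger lines and the new-approach
    result, followed by a threshold (both inputs are uint8 in [0, 255])."""
    # dilate the thin Steger lines with a 3x3 maximum so the images compare
    padded = np.pad(steger, 1, mode='edge')
    thick = np.zeros_like(steger)
    for dy in range(3):
        for dx in range(3):
            np.maximum(thick, padded[dy:dy + steger.shape[0],
                                     dx:dx + steger.shape[1]], out=thick)
    combined = 0.5 * thick.astype(float) + 0.5 * color_tm.astype(float)
    return np.where(combined > thresh, 255, 0).astype(np.uint8)
```

With a threshold above 0.5 · 255 ≈ 128, only pixels supported by both methods would survive; the value 100 lets either method alone keep a pixel, which is what completes the undetected areas at the cost of extra noise.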
Fig. 5. Combination of the new approach with Steger.
F. Road Structuring From Blobs
The result of the road extraction method has a distinctive
property: the images are composed of aggregated sets of
white pixels. In computer vision, regions with similar
properties are called blobs. The sets of white pixels in the
resulting image can be classified as blobs; in this case they
are very simple blobs, sharing a single property that
distinguishes them from the surroundings: they are white
(their pixels have a value of 255) while the surroundings are
black (value 0).
To find all blobs, the image is scanned for non-zero pixels
(since there are only two values, this detects the white ones).
Whenever such a pixel is found that does not yet belong to a
discovered blob, all connected components of the same kind
are collected, forming a blob. A pixel is considered connected
to another if it sits in its Moore neighborhood (i.e. in exactly
one of the 8 immediately adjacent pixels). To find all the
connected components, a BFS is started from the first pixel
discovered.
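The scan-and-BFS procedure can be sketched as follows (a minimal version for a binary image stored as nested lists; names are ours):

```python
from collections import deque

def find_blobs(img):
    """Find 8-connected blobs of non-zero pixels via BFS."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                blob, q = [], deque([(y, x)])   # BFS rooted at the first pixel
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    blob.append((cy, cx))
                    # Moore neighborhood: the 8 surrounding pixels
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                q.append((ny, nx))
                blobs.append(blob)
    return blobs
```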
Essential to road structuring is a thinning operation
performed on the blob image. This is accomplished with the
Zhang-Suen thinning algorithm, which operates on a binary
image (as is the case here) and satisfies two essential criteria:
the thinned responses are always 1 pixel thick and structure
connectedness is preserved. The resulting thinned structures
are named ThinBlobs.
Next, the points of each ThinBlob are classified. The key
points to discover are junctions and terminations. Junction
points are those from which more than two branches leave,
whereas termination points are those from which fewer than
two branches leave. To find these special points, the blob is
traversed point by point; at each, its Moore neighborhood
(points P1 to P8) is analyzed and the point is classified.
Algorithm Identify Junction or Termination
1. Initialize N = 0
2. Initialize connected = false
3.
4. For points (P1 → P7)
5.     If is blob point
6.         If !connected
7.             N++
8.             connected = true
9.         End If
10.    Else
11.        connected = false
12.    End If
13. End For
14.
15. If P8 is blob point
16.     If connected AND P1 is blob point
17.         N--
18.     Else If !connected AND P1 is not blob point
19.         N++
20.     End If
21. End If
22.
23. If N > 2
24.     Is a junction
25. Else If N < 2
26.     Is a termination
27. End If
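The counting above amounts to counting contiguous runs of blob pixels in the circular Moore neighborhood. A compact equivalent (our formulation for illustration, with `neigh` a list of 8 booleans P1…P8 in circular order around the point):

```python
def classify_point(neigh):
    """Classify a skeleton point from its 8 Moore neighbors (circular order)."""
    # N = number of contiguous runs of blob pixels around the point,
    # counted at each 0 -> 1 transition (P8 wraps around to P1)
    n = sum(neigh[i] and not neigh[i - 1] for i in range(8))
    if n > 2:
        return 'junction'
    if n < 2:
        return 'termination'
    return 'branch point'
```

Each run of set neighbors corresponds to one branch leaving the point, so three or more runs make a junction and fewer than two make a termination.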
Having the locations of terminations and junctions, every
other point is classified as belonging either to a termination
branch or to a junction branch. Termination branches are sets
of points linking a junction to a termination, or a termination
to another termination, whereas junction branches link two
junctions.
To compute the termination branches, a BFS exploration is
used, rooted at each of the discovered termination points. The
exploration ends either when there are no points left to
explore in the blob or when a junction point is discovered. It
returns all traversed points, which are stored as the branch
points. Every time a termination branch linked to a junction is
found, the number of connected terminations in the respective
Junction structure is incremented.
Finding the points that constitute a junction branch is done
in the same way as for termination branches, that is, by BFS
exploration until another junction is detected. The branch is
added to the newly found Junction structure, so that each
branch is computed only once.
Another operation done during this stage is to estimate the
angle of the termination branches. This angle is the
approximate orientation of the branch ending, which will be
necessary for the next step. It is computed as the running
average of the angle over the last (at most) M points of the
branch. The "angle between two points" referenced here is the
angle between the x-axis of the reference frame and the line
formed by the two points. The value of M used in the
implementation was 5.
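As an illustration, the branch-ending angle estimate might look like this in Python. Note this sketch averages unit vectors rather than raw angles, a wrap-safe variant of the running average described in the text (an assumption of this sketch, not a detail from the thesis):

```python
import math

def branch_angle(points, M=5):
    """Estimate branch-ending orientation: average of the angles
    between consecutive points over the last (at most) M points.

    points: ordered list of (x, y). Averaging unit vectors keeps the
    result well-defined when angles wrap around ±pi.
    """
    tail = points[-M:]
    sx = sy = 0.0
    for (x0, y0), (x1, y1) in zip(tail, tail[1:]):
        a = math.atan2(y1 - y0, x1 - x0)  # angle vs. the x-axis
        sx += math.cos(a)
        sy += math.sin(a)
    return math.atan2(sy, sx)
```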
Fig. 6 shows the result of the described branching
operations applied to a (perfect) road extraction. The
junction branches are depicted in blue and the termination
branches in orange.
Fig. 6. Branching of an ideal road extraction.
To bridge the gaps between the thin blobs, which may be due
to road occlusion or to shortcomings of the road detection
algorithm, a linking algorithm is proposed. The objective is
to find a path through the blobs that approaches an end point.
The first thing to do is select a starting (thin) blob from a
given initial point and then search for nearby (thin) blobs.
When searching for close blobs, the main idea is to explore
the areas next to blob endings taking advantage of the
termination branch orientation. When a blob is found, the
termination and the new point detected are linked with a
straight line. This is meant to fill small gaps, so a straight line
will in most cases be a good or optimal approximation to the
underlying road.
In the linking process, all the blob branches that were
previously found (and linked) are not eligible to start a new
search process. In this way the linking will cascade from a
given point, through the blobs, until there are no more blobs
left to link. A blob is considered to be in linked state after all
of its terminations have been explored. The blobs in this state
can’t be linked anymore.
When no more unlinked blobs are found, the image is
reanalyzed as a new blob image. In this way the newly
linked blobs will be detected as one, with all the terminations,
branches and properties already discussed.
When extraction noise is present in the blob image, a few
blobs can be incorrectly linked (as they don’t represent roads)
so some modifications to the method will be tested.
The first modification will be to deal with smaller images.
Using the starting point, a sub-region of the original image is
extracted around that point. Then all the previously discussed
methods are applied. Having linked the blobs on the image,
the result is parsed as a blob and all of its termination points
are obtained. From these terminations, at most two points
are chosen and enqueued. Then the process is repeated on
each of the enqueued points, treating them as starting points
of a new process, until no points are left. The results
obtained are saved overlapping one another, so that in the end
a more complete road network is obtained.
The sub-region area used was 300 × 300 pixels.
The two points chosen from the terminations will be those
that, in relation to an endpoint, are the closest and have the
smallest azimuth. The Cartesian distance between two points
was used to evaluate the proximity of the points, while the
azimuth between two points was defined as follows:

az(P, E) = |atan2(P_y, P_x) − atan2(E_y, E_x)|    (11)

Here, P is a given point in Cartesian coordinates and E
refers to the endpoint, also in Cartesian coordinates. If the
closest point and the one with the lowest azimuth are the
same, only one is enqueued.
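Equation (11) and the candidate selection can be sketched in Python; this is an illustration with hypothetical helper names, not the thesis implementation:

```python
import math

def azimuth(p, e):
    """Azimuth difference of Eq. (11): absolute difference between the
    polar angles of point p and endpoint e (Cartesian coordinates)."""
    return abs(math.atan2(p[1], p[0]) - math.atan2(e[1], e[0]))

def pick_candidates(terminations, endpoint):
    """Pick at most two termination points: the closest to the endpoint
    and the one with the smallest azimuth; if both criteria select the
    same point, only that one is returned (and enqueued)."""
    closest = min(terminations, key=lambda p: math.dist(p, endpoint))
    smallest_az = min(terminations, key=lambda p: azimuth(p, endpoint))
    return [closest] if closest == smallest_az else [closest, smallest_az]
```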
The overlapping of individual steps can create multiple
responses for the same extracted road. Since the result will be
parsed as one big ThinBlob, to avoid such duplicates the
image is first cleaned of small dark blobs (using a state-of-
the-art blob library [12]) and then thinned (using the Zhang-
Suen algorithm).
Fig. 7. Road extraction after linking, using several starting
points.
Using all of the user input points produces better results.
The process is repeated for each one of them and the
contributions are then overlapped and treated in a similar
manner as before. Fig. 7 shows the overall result of this
process on an example image.
V. BUILDING ROADBOOK
A. Parallel Execution
With the methods of the previous chapter available, two
kinds of inputs are needed: an image (satellite map) and a
(possibly empty) set of points assumed to be close to roads.
The end point must also be provided and translated to each
segment's reference frame, even if it is located outside the
image bounds. The segments can, in this way, be
independently treated as single images.
Since the method to obtain roads has its own independent
group of variables, the easiest way to gain some performance
is to compute the result of each segment, with its set of
points, in parallel. This is done by executing the road
detection method for each segment in its own independent
thread. This kind of parallel execution is referred to as task
parallelism.
There will be as many tasks as map segments. The results of
all tasks are then stitched together in their segments' original
relative positions, forming the global result of the initial map.
Using the parallel approach, the overall processing time will
depend on the number of threads allowed by the system
running the application and on the number of segments. In the
best case it will take as long as the longest task; in the worst
case, on a single-threaded system, it will take as long as the
sequential approach. The overhead of creating and destroying
tasks is considered negligible compared to the tasks'
complexity.
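A minimal sketch of this task-parallel scheme, using Python's thread pool as a stand-in for the implementation's threading mechanism; `process_segment` is a placeholder for the per-segment road detection, not a function from the thesis:

```python
from concurrent.futures import ThreadPoolExecutor

def process_segment(segment, points, endpoint):
    # Placeholder for the per-segment road detection described above;
    # it just returns an identifiable result for demonstration.
    return ('roads', segment, tuple(points))

def process_map(segments, points_per_segment, endpoint):
    """Run road detection for every map segment as an independent task
    and collect the results in segment order, ready for stitching."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(process_segment, seg, pts, endpoint)
                   for seg, pts in zip(segments, points_per_segment)]
        return [f.result() for f in futures]
```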
B. Graph Conversion
Using the generated global result in a way that allows a
roadbook to be presented requires the detected paths
represented in the image to be translated into a graph.
The first thing to do is to find how the paths are connected.
The methods already presented allow an image with thinned
interconnected lines to be translated into terminations,
junctions and branches. This structure has some degree of
similarity to the intended graph, so the first operation on the
global result is to apply these methods and obtain a
structured data set describing the represented network.
When converting the structured data resulting from the
image processing operations into the graph, the nodes
correspond to the junctions and terminations and the links
correspond to the existing termination or junction branches.
Converting a branch into a link requires the newly created
link to be added to the adjacency lists of both nodes at the
link's ends. These nodes are identified by the points where
they are located: the junction points in the case of a junction
branch, or the junction and termination points in the case of a
termination branch. The links have access to the branch's
point list, since this information is available. The link's
distance (or length) is given by the number of points in that
list, thus representing the distance between two nodes in
pixels. Fig. 8 illustrates this conversion process from a road
extraction result.
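The branch-to-link conversion might be structured as below; a Python sketch with hypothetical `Node`/`Link` classes, where the link length is the branch's point count in pixels, as described:

```python
class Node:
    def __init__(self, point):
        self.point = point  # (x, y) of the junction or termination
        self.links = []     # adjacency list

class Link:
    def __init__(self, a, b, points):
        self.ends = (a, b)
        self.points = points          # branch point list from extraction
        self.length = len(points)     # distance between the nodes, in pixels
        a.links.append(self)          # wire the link into both
        b.links.append(self)          # adjacency lists

def add_branch(nodes, branch_points):
    """Convert a branch into a link: look up or create the nodes at the
    branch ends, then connect them. nodes: dict point -> Node."""
    for pt in (branch_points[0], branch_points[-1]):
        if pt not in nodes:
            nodes[pt] = Node(pt)
    return Link(nodes[branch_points[0]], nodes[branch_points[-1]],
                branch_points)
```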
Fig. 8. Graph conversion.
C. Result Review
Presenting the result of the image processing to the user is a
very important step, as it allows them to correct the detected
course or select a new one and choose a path. For this
purpose an interactive solution was adopted.
In this stage the user has the ability to add new nodes,
connect existing ones, or remove existing links or nodes.
These operations are made over the graph structure previously
presented.
The start, middle and finish nodes are added to the graph
like any other node. These are automatically added after the
conversion from the image processing result to the graph is
done. When a node is added to the graph, a search for a path
in its vicinity is made. If a path is found, the node is
automatically inserted into that path. This is a graph operation
accomplished by breaking the link in two and resetting the
node ends of the resulting links accordingly. The point list is
also divided, which is just a matter of finding the right point
position inside that list. Since the point list is the result of a
BFS operation, the points are sequentially ordered, so
dividing the list requires nothing more than breaking it at that
point. If the node to add is a start, finish or middle point, a
search for nearby nodes is additionally performed and any
node found is transformed into one of these special points
accordingly. The difference between these and any other node
is just a flag inside the node structure indicating what kind of
node it is. These nodes are also identified in the GUI and are
used for other graph operations.
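The list-splitting step can be illustrated as follows; since the points are BFS-ordered, inserting a node reduces to cutting the list at the node's position (a sketch, not the thesis code):

```python
def split_link(link_points, node_point):
    """Split a link's BFS-ordered point list at a node that lies on it.
    Returns the two resulting point lists; both new links share the
    node point as an end."""
    i = link_points.index(node_point)
    return link_points[:i + 1], link_points[i:]
```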
Connecting two nodes will add a new link to each of the
nodes, with the point list corresponding to a straight line
between them. A straight line can be a good approximation to
the underlying missing road segment in many cases. In
situations where this is not true, some intermediary nodes can
be added to better approximate the road's shape.
Removing nodes and links is useful in cases of faulty road
detection. Selecting any node will remove it and all of its
links, so neighboring nodes, if any, are also affected. Start and
finish nodes cannot be removed. To delete a link, the two
linked nodes must be chosen and the remove action selected.
D. Path Calculation
Start and finish nodes should also be connected for a path to
be computed. If the user has selected middle points, those
must also be connected and the path will pass through them.
While start, finish and middle nodes are not connected, the
application doesn’t allow advancing to next stage.
The algorithm used to compute the roadbook path is a
modification of Dijkstra's shortest path algorithm. The
first operation is to check whether all required path nodes are
connected, that is, whether all targeted nodes can be reached
starting from a source node. The most efficient way to do this
would be an exploration algorithm such as a BFS or a DFS,
checking whether all selected nodes were found. With this
solution the worst-case complexity would be O(E), meaning
that at most all the edges would be visited once. However,
the goal here is not solely to find whether the nodes are
connected.
Since Dijkstra's algorithm is going to be applied anyway, it
is also used to check whether the nodes are connected. A run
of Dijkstra's algorithm outputs the minimum distance from a
source node to all remaining graph nodes. If the source is not
connected to a given node, Dijkstra's output indicates that the
distance separating them is infinite. So, checking whether a
given set of nodes is connected using Dijkstra's algorithm is
just a matter of checking that the distances to all nodes are
less than infinite. If this condition holds, then there is a path
from start to finish through all middle points. With the
implementation of Dijkstra's algorithm used, the worst-case
complexity is O(E log V), which is slightly worse than an
exploration algorithm for the connectivity task alone; but
since Dijkstra is computed afterwards in any case, this single
computation solves the connectivity problem and
simultaneously computes the path.
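A compact Python sketch of this combined use of Dijkstra's algorithm (a binary-heap implementation, hence the O(E log V) bound); the adjacency-dict representation and function names are assumptions of this illustration:

```python
import heapq
import math

def dijkstra(adj, source):
    """Shortest distances from source to every node. Unreachable nodes
    keep distance infinity, which doubles as the connectivity check.
    adj: {node: [(neighbor, length), ...]}."""
    dist = {n: math.inf for n in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def all_connected(adj, source, required):
    """All required nodes reachable iff none is left at infinity."""
    dist = dijkstra(adj, source)
    return all(dist[n] < math.inf for n in required)
```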
If the computation only involves discovering the path
between the start and finish points, the problem is solved: the
discovered path is the shortest, as computed by Dijkstra's
algorithm. When more nodes are involved, a more complex
solution must be considered.
The solution here should be a method that gives the user
some level of choice over the final calculated path.
To give the user this control, an extra parameter, a node
priority, was introduced into the path calculation. The user
can set this priority in the GUI by selecting the intended
middle node. The algorithm then searches for the lowest
priority value first, and only afterwards searches for the
closest middle node. By default, nodes are initialized with a
priority value of infinity; with this value, preference is given
to the closest middle node instead.
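The priority-then-distance rule can be sketched as a single ordering: compare middle nodes first by their (default infinity) priority, then by shortest distance. The names below are hypothetical, for illustration only:

```python
import math

def next_target(middle_nodes, dist):
    """Choose the next middle node to route to: lowest explicit
    priority first; nodes left at the default priority (infinity)
    fall back to the closest-by-distance rule.

    middle_nodes: {name: priority}; dist: {name: shortest distance
    from the current node, e.g. from a Dijkstra run}."""
    return min(middle_nodes, key=lambda n: (middle_nodes[n], dist[n]))
```

In the Fig. 9 example, D (priority 1) is selected before E even if E is closer, because any finite priority sorts ahead of the default infinity.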
Fig. 9. Path selection using two prioritized middle nodes.
Fig. 9 shows an example with prioritized middle nodes.
Here the user has chosen to assign a priority of 1 to D. As
this is the lowest priority of any of the middle nodes, the
algorithm starts by finding a path to D, which will be the
shortest path, given the use of Dijkstra's algorithm. No
priority was set on E, so the algorithm then finds the next
closest middle node, which in this case is the remaining one.
Using this approach the user has control over the calculated
path.
E. Roadbook Presentation
With the path calculated, all that is left to do is build the
actual roadbook. The roadbook presents the directions
between road junctions, the total distance traveled and the
distance from the last indication. The shape of each junction
must be shown in each entry, as well as the navigational
direction (where the route comes from and where it is going).
To depict the shape of a junction, a roadbook tulip is used.
This takes advantage of information already available: the
shape of the roads around the junction. Whether
automatically detected or adjusted by the user, this shape
represents the road junction in the best possible way, given
that the only source of information is the satellite map image.
The roadbook is formed by a set of pages, each one having
a set of entries and a header. Each entry represents an
indication of the road to follow with a tulip, a sequence
number, and the total and partial distance covered. A page is
built as an image with dimensions large enough to
accommodate the header plus a predetermined number of
entries. The number of entries per page was set to 7.
Fig. 10. Generated Roadbook.
The nodes of the computed path are iterated. At each
previously identified node (where a tulip was created) an
entry is created. A sequence number is maintained, to be
printed in each entry, as well as an accumulator with the total
traveled distance and another with the partial distance. Every
time a node signaled to appear in the page is reached, the
partial distance is printed on the page and reset to zero,
providing in this way only the distance from the last entry.
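The accumulator logic above might be sketched as follows; the names are illustrative, with `tulip_nodes` standing in for the nodes flagged to appear on a page:

```python
def build_entries(path_links, tulip_nodes):
    """Walk the computed path; at every node flagged for the roadbook,
    emit an entry with its sequence number, total distance, and
    partial distance (distance from the last indication).

    path_links: list of (node, link_length_in_pixels) along the path;
    tulip_nodes: set of nodes that received a tulip."""
    entries, seq, total, partial = [], 0, 0, 0
    for node, length in path_links:
        total += length
        partial += length
        if node in tulip_nodes:
            seq += 1
            entries.append((seq, total, partial))
            partial = 0  # reset: next entry shows distance since this one
    return entries
```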
All distances are internally stored as a sum of the graph
link lengths, given in pixels, and presented in kilometers (the
defined metric). Fig. 10 shows a constructed roadbook
corresponding to the path calculated in Fig. 8.
VI. CONCLUSIONS AND FUTURE WORK
Road recognition from the obtained images was one of the
main issues in this work. To overcome this problem, several
image processing methods aimed at road extraction from
satellite maps were studied, and a new approach for road
extraction was proposed, with results comparable to existing
state-of-the-art methods. The road extraction problem was
addressed using the developed method along with a linking
algorithm that tries to improve the results and better suit
them to the objective of this part of the project. This
algorithm and the developed road extraction method are
independent and can work separately. Specifically, the
extraction process can serve as a road recognition method in
other projects, can be improved with more or different
intermediary methods (template matching and color filtering
were used here), or can even serve as a starting point for
other algorithms to build upon.
As the road extraction process may not be perfect, map
reviewing by the user takes on particular significance to
correct and complete the path. In this way, it is guaranteed
that a roadbook can always be generated for any location. A
future improvement could be to replace the current road
extraction method with another, eventually developed, with
better results. User validation will always be required, even
in the best-case scenario, to ensure that the computed path
presented in the roadbook is the intended one.
In terms of performance, a different parallelization strategy
could be explored. The most intensive tasks are the image
processing ones, so instead of concurrently processing each
segment of the satellite image, smaller tasks could be
identified inside each large processing task, making the
parallelization more efficient. Another possible improvement
would be to perform some or all of the image processing
computations on the GPU. Given the purpose of this unit and
the computing power usually associated with it, delegating
the execution of these tasks to the GPU would be expected to
improve performance.
Integration with other systems to improve results can also
be planned as future work. Building a system combining GPS
with the satellite image approach would bring the best of both
worlds, providing more accurate results and introducing a
variety of new approaches to the roadbook problem.
Similarly, this work could improve some of the existing
commercial roadbook creation solutions that already use such
systems.
A greater variety of maps could also be introduced by
integrating map services other than Google Maps. This could
be taken as an opportunity to improve results: by comparing
different images of the same location, road extraction could
potentially achieve better results. In fact, some of the studied
existing methods are already based on identifying image
differences. Integration with commercial satellite image
services could also allow new approaches to road extraction,
given the nature of the provided images (high-resolution,
multispectral).
REFERENCES
[1] Steger, C. (1996). Extracting curvilinear structures: A
differential geometric approach. In Computer Vision—
ECCV'96 (pp. 630-641). Springer Berlin Heidelberg.
[2] Mena, J. B. (2003). State of the art on automatic road
extraction for GIS update: a novel classification. Pattern
Recognition Letters, 24(16), 3037-3058.
[3] Tripy, GPS + Digital Road Book. (n.d.). Tripy GPS + Digital
Road Book for motorbikes, 4X4, Quad and Oltimers. Retrieved
September 18, 2013, from http://www.tripy.eu/en/
[4] Li, Y., & Briggs, R. (2009). Automatic extraction of roads from
high resolution aerial and satellite images with heavy noise.
World Academy of Science, Engineering and Technology, 54,
416-422.
[5] Bacher, U., & Mayer, H. (2005). Automatic road extraction
from multispectral high resolution satellite images.
Proceedings of CMRT05.
[6] Christophe, E., & Inglada, J. (2007, September). Robust road
extraction for high resolution satellite images. In Image
Processing, 2007. ICIP 2007. IEEE International Conference
on (Vol. 5, pp. V-437). IEEE.
[7] Porikli, F. M. (2003, September). Road extraction by point-
wise gaussian models. In AeroSense 2003 (pp. 758-764).
International Society for Optics and Photonics.
[8] Mayer, H., Baltsavias, E., & Bacher, U. (2005). Automated
Extraction, Refinement, and Update of Road Databases from
Imagery and Other Data.
[9] Laptev, I. (1997). Road extraction based on snakes and
sophisticated line extraction. Master's thesis, Royal Institute of
Technology, Stockholm, Sweden.
[10] Kass, M., Witkin, A., & Terzopoulos, D. (1988). Snakes:
Active contour models. International journal of computer
vision, 1(4), 321-331.
[11] Pandit, V. (2009). Automatic Road Extraction From High
Resolution Satellite Imagery.
[12] CvBlobsLib — OpenCV Wiki. (n.d.). OpenCV Wiki. Retrieved
July 2, 2013, from http://opencv.willowgarage.com
/wiki/cvBlobsLib
[13] Issue 4189: Add a scale bar to static maps. (n.d.). Gmaps-api-
issue. Retrieved November 25, 2013, from
https://code.google.com/p/gmaps-api-issues/issues/detail?id=
4189