Automatic Off-Road Roadbook Creation from Satellite Maps
João Pedro Carvalho Camejo Instituto Superior Técnico, Lisbon, Portugal
Abstract — Roadbooks remain irreplaceable, fundamental
tools for navigating a pre-planned route, and are commonly
used on rural trips, for sightseeing and in all-terrain raids. This
thesis describes the construction of an application that generates
a roadbook from satellite images, a new approach for
applications of this kind. Satellite images are retrieved
automatically from the Google Maps imagery database, so image
processing techniques must be used for road extraction. Steger's
line detection method is one of the best known and most widely
used. Since one of the objectives of this work is a fully
autonomous road identification process, an extension to Steger's
method that selects its parameters automatically was developed.
Additionally, a new road extraction approach, based mainly on
color filtering and template matching techniques, is presented.
To overcome possible faults in the road extraction process, user
interaction is introduced, giving the user full control over the
path to be presented in the roadbook.
I. INTRODUCTION
A roadbook is a set of sequential indications over a road or
a path, allowing for anyone following it to reach a destination
from a starting location. It may consist of a set of pages or
just a long list of direction entries. Each roadbook entry
usually provides the traveled distance, the direction to take
and some additional notes on the road surroundings. Although
there is no standard roadbook layout, direction information is
always present. It is represented by a drawing of each
crossing or road junction, called a tulip, with the path to
follow visually highlighted for fast recognition. Roadbooks
normally only carry information on these road junctions,
which is enough to keep anyone following them from getting
lost. Complementary information, such as GPS coordinates of
distinct road segments, is also frequently included.
Roadbooks are often hard to build, requiring prior track
reconnaissance and advance planning. As will be discussed,
the tools available to build roadbooks are still very limited
and sometimes unreliable.
A. Objectives
The main objective of this work is to build an application
that enables the creation of a roadbook for a user-chosen
path. The project relies heavily on user interaction, so
emphasis is placed on the application interface and on
ease-of-use issues.
To build roadbooks, the application relies only on satellite
images. These images must therefore be integrated into the
application in a way that lets the user choose the intended
travel path. User-supplied images must be accepted, but
automatic fetching of the images is preferable. The user must
be able to select start and finish locations on the satellite
map. Other, non-mandatory points can also be input to
indicate intermediate travel locations, giving more control
over the chosen path. A path then has to be calculated using
only those points and the image, so a method to extract roads
from an image must be investigated.
B. Major Contributions
Building a roadbook from satellite imagery and user
interaction is a new approach to roadbook construction: no
existing automatic roadbook tool relies on satellite image
information. Automatic roadbook tools are scarce, and those
that exist depend on other systems or databases.
To use images as a source of information, several computer
vision methods for road extraction were studied. A new
method was also developed for this purpose, mainly to better
suit the needs of the project. It is compared with a
state-of-the-art method [1], and that method is also adapted
to run autonomously, with automatic parameter selection.
II. RELATED WORK AND USED TECHNIQUES
A. Related Work
Road recognition is a well-studied subject, very important
for GIS databases and other information systems, which are
fundamental to many location services such as GPS
navigation and map guidance. Given the huge number of road
networks in the world, road recognition from satellite maps is
one way to help complete these databases [2]. One of the
main difficulties is finding a method that works across the
existing multitude of settings and scenarios. There is still no
method able to reliably extract a complete road network from
a satellite image. The great number of existing approaches
shows that this is a highly non-trivial problem, which may be
why no commercial automatic roadbook creation solution
using this approach exists.
Roadbooks are often created for off-road tracks where there
is limited information available about the existing roads,
unlike urban environments, where the network is well known
and, in most cases, up to date. A solution is to extract
information directly from sources such as satellite maps.
There are, however, some available solutions for creating
roadbooks. These often use GPS tracking to store taken
routes or to plan new ones. In some cases GPS is also used
actively during the trip, allowing route correction when no
map is available, as in most off-road cases [3]. Other
solutions allow roadbooks to be created or edited by
manually choosing and setting each tulip and its information
according to the intended course.
Many contributions exist on the topic of automatic road
extraction from satellite and aerial-view images. Li and Briggs
[4] propose a method using a "reference circle" to extract
roads from high-resolution aerial and noisy satellite images.
Bacher and Mayer [5] propose an approach to extract rural
roads that takes advantage of the multispectral properties of
high-resolution satellite images. Steger [1] proposes a method
extracting curvilinear structures from images using a
differential geometric approach. Christophe and Inglada [6]
propose a fast algorithm using a geometric method for a first
extraction step, to be refined by human interaction or GIS
integration. Porikli [7] proposes a method to extract roads
from very low resolution satellite imagery using a Gaussian
model. Additionally, Mayer et al. [8] performed a
comparative study of a set of automated road extraction
methods aimed at updating existing databases.
On the other hand, semi-automatic methods are
characterized by making use of information known prior to
extraction, usually a set of points indicating road locations,
called seed points. Most of the time these methods are used in
conjunction with automatic ones, to complete them or to
bypass the need for seed points. One example is the work of
Laptev [9], combining an automatic curvilinear extraction
method [1] with a semi-automatic one using active contour
models (also known as snakes) [10]. Pandit [11] also uses a
combination of several semi-automatic methods, mainly a
variation of region-growing-based road extraction called
adaptive texture matching.
B. Image Lines Extraction through a Differential Geometric
Approach
Steger's work [1] uses a differential geometric approach to
extract, from images, lines with the same characteristics as
roads. This method has the advantage of performing well at
both high and low resolutions, and the approach is not
specific to any particular type of aerial/satellite image
(multispectral, multi-temporal, noisy, etc.), so it suits the
needs of this project.
Steger starts from one-dimensional approximations to the
profile of the line to extract, namely parabolic and bar-shaped
profiles. Line positions are identified by finding the points
where the first derivative of the image convolved with a
Gaussian kernel vanishes; salient lines are then selected at the
minima of the second derivative. For bar-shaped profiles of
half-width w and height h, the responses to convolution with
the Gaussian kernel g_σ and its derivatives are:

r_b(x, σ, w, h) = h (Φ_σ(x + w) − Φ_σ(x − w))   (1)
r_b'(x, σ, w, h) = h (g_σ(x + w) − g_σ(x − w))   (2)
r_b''(x, σ, w, h) = h (g_σ'(x + w) − g_σ'(x − w))   (3)

where Φ_σ is the integral of the Gaussian. The first derivative
vanishes at x = 0 for all σ > 0. The second, however, will not
exhibit this behavior for small σ. For it to hold, the following
condition must be true:

σ ≥ w / √3   (4)

In (4), w represents half the line width. For σ = w/√3,
r_b''(0, σ, w, h) takes its maximum negative response.
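The value σ = w/√3 at which the magnitude of the center response is maximal can be verified by differentiating r_b''(0, σ, w, h) with respect to σ (a short derivation from (3), using the Gaussian kernel g_σ and its derivative g_σ'):

```latex
r_b''(0,\sigma,w,h) = h\left(g'_\sigma(w) - g'_\sigma(-w)\right)
                    = -\frac{2hw}{\sqrt{2\pi}\,\sigma^{3}}\,e^{-\frac{w^{2}}{2\sigma^{2}}}

\frac{\partial}{\partial\sigma}\,r_b''(0,\sigma,w,h)
  = \frac{2hw}{\sqrt{2\pi}\,\sigma^{4}}\,e^{-\frac{w^{2}}{2\sigma^{2}}}
    \left(3 - \frac{w^{2}}{\sigma^{2}}\right) = 0
  \;\Longrightarrow\; \sigma = \frac{w}{\sqrt{3}}
```

For σ below w/√3 the derivative is negative and the minimum of the second derivative at the line center becomes weak, which motivates the restriction in (4).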
For lines in two dimensions, the same analysis is performed
in the direction perpendicular to the line, n(t). This direction
is computed resorting to the Hessian matrix.
C. Zhang-Suen Thinning Algorithm
Zhang-Suen thinning algorithm is a fast algorithm applied
to digital patterns in binary images to reduce features to a
one-pixel thick pattern while preserving connectivity and end
points.
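As a concrete illustration, a minimal sketch of the Zhang-Suen algorithm on a binary image stored as nested lists of 0/1 (the function name and representation are ours, not from the thesis; border pixels are left untouched, which is fine for zero-padded images):

```python
def zhang_suen_thin(img):
    """Thin a binary image (nested lists of 0/1) to 1-pixel-wide lines."""
    img = [row[:] for row in img]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9: the 8 neighbors, clockwise starting from the pixel above
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1],
                img[y + 1][x + 1], img[y + 1][x], img[y + 1][x - 1],
                img[y][x - 1], img[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not img[y][x]:
                        continue
                    n = neighbours(y, x)
                    b = sum(n)                                   # set neighbors
                    a = sum(n[i] == 0 and n[(i + 1) % 8] == 1
                            for i in range(8))                   # 0 -> 1 transitions
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((y, x))   # delete simultaneously
            for y, x in to_clear:
                img[y][x] = 0
            changed = changed or bool(to_clear)
    return img
```

The two subiterations peel pixels from opposite sides of a structure, which is what preserves connectivity while thinning.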
III. SATELLITE MAPS
Before processing the satellite images to extract the
information for calculation of routes, is a key part of this
work is how to obtain these images. This is performed using
Google Maps, as this is a free reliable service offering a large
variety of maps. This is integrated in the application using the
service’s API, allowing automatic map fetching.
A. Map Scale
One aspect missing from the Google Maps API is the
possibility of obtaining a scale for the retrieved map. This is
crucial in this work, as it will be necessary to measure
distances and make other calculations involving the
translation into actual lengths. Since the Earth is not flat, the
scale has no linear relation with the zoom level used and also
varies with the latitude of the location. Google recommends
[13], as an alternative solution, computing an approximate
scale with the formula:

s = (2π · 6378137 · cos(latitude · π/180)) / (256 · 2^zoom)   [m/px]   (5)

A study was performed to evaluate the accuracy of this
formula. A systematic error of about 14.6% was found, so a
corresponding correction is applied to (5).
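Formula (5) can be sketched in Python as follows. The constant 2π · 6378137 / 256 ≈ 156543.03 m/px is the zoom-0 equatorial ground resolution of a 256-pixel base tile; the 14.6% correction is exposed as a parameter, since the text gives only its magnitude, not its direction:

```python
import math

def meters_per_pixel(latitude_deg, zoom, correction=1.0):
    """Approximate Google Maps ground resolution, as in (5)."""
    equatorial = 2 * math.pi * 6378137 / 256   # m/px at zoom 0 on the Equator
    return equatorial * math.cos(math.radians(latitude_deg)) / (2 ** zoom) * correction
```

At zoom 17 near latitude 40° this gives roughly 0.9 m/px, consistent with the pixel/meter figures used later in the text.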
B. Map Segmentation
Selecting the course to be represented by the roadbook is
done through user interaction: the user selects a start point, a
finish point and, possibly, intermediary points. Google Maps
offers satellite images at various zoom levels, so maps are
requested by default at level 17 (below the maximum of 23)
or at the minimum zoom among the start, finish and
intermediary points input by the user, if any of them is lower
than the default.
At the defined zoom level, requests must be made for
images covering all the input point locations. An image
request is made using the location's geographic coordinates.
Translating screen Cartesian coordinates to geographic ones
and vice-versa is accomplished through the Google Maps
Javascript API.
Having the correct projection of the map, the API method is
used to translate each point into a known Cartesian reference
frame. This reference frame is defined by the user's
visualization window at the zoom level used to request each
image segment. The translation is made using:

P = (P_A − P_origin) · 2^zoom,   P, P_A, P_origin ∈ ℝ²   (6)

In (6), P is the point in the new reference frame, P_A the
point given by the API method, and P_origin the point given
by the API method when translating the pair of coordinates
formed by the latitude of the northeast corner of the
visualization window and the longitude of the southwest
corner of the same window. This yields the X value of the
left edge and the Y value of the upper limit of the preview
window, which correspond to the X, Y coordinates of the
origin of the proposed reference frame, expressed in the
frame given by the API. As both points (P_A and P_origin)
are in the same reference frame, putting P_A in the new
frame is just a matter of a translation, subtracting P_origin
from P_A. The result is then multiplied by a scale factor in
which zoom is the previously defined level used for requests.
To process each map segment independently, each point has
to be translated into the segment's reference frame. This is
done using the translation:

P'' = P − P_min,   P'' ∈ ℝ²   (7)

where P'' is the point shifted to the positive space of the
segment reference frame and P_min is the point containing
the minimum Cartesian coordinates among all points.
IV. IMAGE PROCESSING
A. Road Characterization from Satellite Images
Knowing what is a road and what is not is one of the main
issues addressed here. Roads have some unique
characteristics. In a rural environment, roads are usually
distinguishable as bright, elongated lines with few
intersections, low curvature and variable width. In an urban
environment, roads are usually quite different, with
approximately constant width, high junction density, short
segments meeting at sharp angles, and paved surfaces
normally darker than the surroundings.
Visually, these are the characteristics that make rural roads
recognizable as such when viewed from above, as in satellite
images. Since rural roads are the ones of interest in this work,
these characteristics had to be identified so that a method
could be developed to recognize them and separate the roads
from their surroundings.
Even though these characteristics are representative of and
usually associated with rural roads, one aspect that brings
extra complexity to this project is the wide variation in these
environments (much wider than in urban ones, for instance),
so finding one method that works for all of them is very
difficult.
B. Steger’s Line Detector
With Steger's method it is possible to extract curvilinear
structures using differential geometric properties of the
image. In fact, one of the motivations of Steger's work was to
extract roads from aerial images, so a study is performed to
evaluate how well this method suits the needs of this project.
This method is mainly governed by two parameters: σ and r.
σ is related to the road width, according to (4), and r is a
threshold used to select line saliency. Figure 1 shows a
simple, clearly distinguishable curved road of constant width
and the result of applying Steger's method. In this case the
road is about 11 pixels wide, so according to (4) a value of
3.18 was selected for σ. For r, a value of 3 was chosen so that
only the main road line is selected.
Fig. 1. Road detection using Steger’s method.
The method outputs a set of contours, each consisting of a
line with several intensity values along its pixels. A higher
value means the response at that pixel was a better match for
a curvilinear structure with the selected parameters, and
translates into a brighter pixel in the result. By the same
reasoning, darker pixels in the result were worse matches for
the selected curvilinear structure. Selecting a high enough
value for r clears these low response values, most of which
do not belong to the line. For the result to be visible, the
response values are normalized to the range [0, 255].
The result clearly shows that the extracted line follows the
road's curves and adapts to the road very well. However,
some decisive issues must be analyzed. Closely comparing
the original image with the result, a small road at the bottom,
branching off the main one, is poorly detected: only a small
branch close to the junction appears. This smaller road is
neither as bright nor as wide as the main one. This means,
first, that the value of r should have been lower for this
lower-contrast road to be detected; however, with a lower
value, other responses not really belonging to the road would
begin to appear. Even then, there would be no guarantee that
this small road would be extracted, because it is narrower
than the main one and would need a different value of σ to be
detected. In fact, even on the main road, sections whose
width differs from the expected 11 pixels produce responses
that are either very faint or missing entirely. With different
threshold values, even the faint sections could disappear
completely.
Another visible aspect is that this method does not deal well
with road occlusion. If a tree blocks the view, the method
extracts only the visible part of the road, if any. Where no
curvilinear structure is detected, the response is zero (or very
low), even if it is just a small gap in the road caused by a tree
blocking the view. Where a small portion of the road is still
visible, the extraction contours the obstruction, treating the
road as a narrower section (which may have a lower response,
as described above); this makes the detected road curvy even
in straight sections, an undesirable effect.
C. Finding Near Roads
When the user marks points on the map, due to a problem
inherent to the Google Maps Javascript API, the points may
not land in the exact location where the user put them. So
even if the user marked a point directly over the road,
depending on the zoom level used, that point may be
displaced. Because of this, a marked point cannot be assumed
to identify the exact location of a road.
The marked point may not pinpoint a road exactly, but a
road will almost certainly exist in its vicinity, unless the user
marked a completely empty location. It is therefore assumed
that a road always exists close to the point. To resolve this
issue, Steger's line detection method is used to extract a
curvilinear structure in the surroundings of the point.
To accomplish this, a 100 × 100 pixel subsection of the
image centered on the marked point is analyzed. Since
nothing is known about the road to be found, some
assumptions must be made to choose initial parameters (σ
and r) for Steger's line detection method. A value of 5.0 is
used for r, which is relatively high and gives good confidence
that no lines are incorrectly identified as roads. As for σ, the
assumption is that the widest roads will be around 20 pixels
wide; at the default zoom level (17) the scale ranges over
about 2 - 3 pixels/m, making such roads about 7 - 10 meters
wide. According to (4), 20-pixel-wide lines are expected to
have σ ≈ 5.7.
To shift the point to a location over the road, the maximum
over the response subsection of the function

L(x) = R(x) · (2 − d(x)/D),   x ∈ ℕ²   (8)

is computed, and the point is moved to the location of that
maximum. In (8), R(x) is the response of Steger's contours at
location x, d(x) is the distance between x and the original
point (here, the fixed center of the image subsection) and D is
a constant representing the maximum attainable distance,
which for a 100 × 100 section is ≈ 70.71 pixels. In this way
the Steger contour response is weighted by a distance factor:
distant lines are not considered unless they have a very large
response, and close lines are preferred. The point therefore
shifts to nearby lines that, given their response values, most
likely correspond to roads in the original image.
Algorithm Find Near Roads
1. Initialize sigma = maximum road width / (2 * Sqrt(3))
2. Initialize r = 5.0
3.
4. While number of found contours < 3
5.     Find Steger Lines(sigma, r)
6.     Decrease sigma
7. End While
8.
9. Find the maximum of L(x)
10. Adjusted road point = maximum location
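The distance weighting of (8) and the final maximization can be sketched with NumPy as follows (a hypothetical helper; `response` stands for the Steger contour responses over the subsection, with the marked point at its center):

```python
import numpy as np

def shift_point_to_road(response, center=None):
    """Move a marked point to the strongest nearby line response.

    response: 2-D array of Steger line responses for the subsection.
    Returns the (row, col) maximising L(x) = R(x) * (2 - d(x)/D), eq. (8).
    """
    h, w = response.shape
    if center is None:
        center = (h // 2, w // 2)          # the marked point, fixed at the center
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.hypot(ys - center[0], xs - center[1])   # distance to the marked point
    D = np.hypot(h / 2, w / 2)                     # maximum attainable distance
    L = response * (2.0 - d / D)                   # distance-weighted response
    return np.unravel_index(np.argmax(L), L.shape)
```

For a 100 × 100 subsection, D evaluates to ≈ 70.71 pixels, as stated in the text.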
D. Automatic Parameter Selection
With the road location pinpointed, information about the
road can be obtained, namely its width and its contrast.
After a run of the algorithm above locates a nearby road, the
road width can be found by inverting (4); the σ value is the
one the method used to find the road. Knowing it already
provides a first estimate of what the roads on the map look
like.
Another useful piece of information provided by the
computation of Steger's algorithm is the contrast of the found
contour. With all this information, the threshold r can be
estimated by computing the absolute value of (3).
With this threshold and this σ, running Steger's algorithm
would search for all roads similar to the one found. This is
not very desirable, since the search for nearby roads always
finds the widest road in the area. To correct this, σ and r are
slightly decreased, so that the constraints are widened and
narrower roads can also be found. σ is decreased by 0.2 and r
by 0.1.
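Under the definitions of (3) and (4), this parameter selection step might look like the following sketch (the Gaussian-derivative formula and the function names are our reconstruction for illustration, not code from the thesis):

```python
import math

def gauss_deriv(x, sigma):
    """First derivative of the Gaussian kernel g_sigma."""
    return -x / (math.sqrt(2 * math.pi) * sigma ** 3) * math.exp(-x * x / (2 * sigma ** 2))

def auto_parameters(sigma_found, contrast, d_sigma=0.2, d_r=0.1):
    """Derive Steger parameters from the road found near the user point.

    sigma_found: the sigma that located the nearby road; contrast: the
    Steger line contrast h of that contour. The half-width follows from
    inverting (4), r from |r''| at the line center (eq. (3) with x = 0),
    and both are relaxed by the 0.2 / 0.1 decrements from the text so
    narrower and fainter roads are also found.
    """
    half_width = math.sqrt(3) * sigma_found            # invert condition (4)
    r = abs(contrast * (gauss_deriv(half_width, sigma_found)
                        - gauss_deriv(-half_width, sigma_found)))
    return sigma_found - d_sigma, r - d_r
```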
E. A New Approach for Road Detection
The disadvantage of Steger's line detection is that it is not
consistent across all image maps and road characteristics, so a
study is performed on whether a new approach can obtain
better results or improve the previously discussed method in
some way. Other road characteristics are explored, such as
color (or brightness) and shape.
One thing that can be done to identify rural roads and off-
road tracks is to select in the image only a specific color
range or brightness. As previously observed, rural roads are
distinguishable in an aerial view image by being brighter and
having a very characteristic color range.
To make use of these properties, filters are applied to
identify a specific color in the image. A user input point is
used to sample the color of the road; for this purpose, the
point adjustment method discussed in Section C is used.
Fig. 2. Application of the color filter.
Fig. 2 shows the result of applying a color filter. The filter
samples the color at the input point and then selects similar
color values throughout the image, inside a defined range
above and below that value. The color filter is used on a
grayscale image (so the color is a gray value) and the range is
defined with a value of 30.
On top of this result two more operations are applied. The
first is binarization, turning the filtered image into a binary
(black and white) one. The last operation is a maximum filter,
applied over the binary image and defined as:

g(x, y) = max{ f(x + i, y + j) : i = −⌊n/2⌋ … ⌊n/2⌋, j = −⌊m/2⌋ … ⌊m/2⌋ }   (9)

A 3 × 3 convolution mask was used to implement this filter
(i.e. n = 3 and m = 3). Its purpose is to merge close extracted
points, filling some possible gaps in the roads. The effect
obviously extends to incorrectly extracted points, so some
undesirable effects are also amplified. The extracted road
also gains a property (dilated grouped clusters) that will be
useful later. The result of applying this filter is shown in
Fig. 3.
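The color filter, binarization and 3 × 3 maximum filter of (9) can be sketched together in NumPy (a hypothetical helper; the maximum filter is implemented with shifted slices over an edge-padded copy):

```python
import numpy as np

def extract_road_mask(gray, seed, half_range=30):
    """Color filter + binarization + 3x3 maximum filter, as described above.

    gray: 2-D uint8 grayscale image; seed: (row, col) point on the road;
    half_range: the +/- gray-value window around the sampled road color.
    """
    sample = int(gray[seed])                                     # sampled road color
    binary = (np.abs(gray.astype(int) - sample) <= half_range)   # color filter
    binary = binary.astype(np.uint8) * 255                       # binarize
    # 3x3 maximum filter: each pixel takes the maximum of its neighborhood
    padded = np.pad(binary, 1, mode='edge')
    out = np.zeros_like(binary)
    for dy in range(3):
        for dx in range(3):
            np.maximum(out, padded[dy:dy + binary.shape[0],
                                   dx:dx + binary.shape[1]], out=out)
    return out
```

On a binary image the maximum filter is exactly a 3 × 3 dilation, which is what merges nearby road pixels and fills one-pixel gaps.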
Fig. 3. Application of the maximum filter.
To clean up some incorrectly identified areas, another road
property is used: its shape. To accomplish this, the template
matching technique is employed. This technique is usually
used to find structures similar to a template in an image; here
the principle is the same, with the road identified by a
template and a matching criterion applied to the entire image.
Normally this technique is used to identify just one point, or
a small set of points, where the criterion reaches its
maximum; here the interest is in identifying whole areas of
maximum values.
As before, the user input point (adjusted to lie over the
road) is used, in this case to extract a template from the
image, providing a comparison base that is the actual road
present in the image. Similar road segments are thus expected
to be detected: as long as the map contains roads similar to
the template, some of them will be identified.
The matching criterion used was the correlation coefficient.
One problem arises, however: template matching is not
rotation invariant, even though it can handle small rotation
variations.
To overcome this problem, several templates are used, at
rotations covering the range [0, 2π[ rad. This could be done
over [0, π[ rad only, as every orientation of a straight road
exists in that interval; however, the full range gives better
results. This is because the road may not be symmetrical on
both sides: nearby objects may be caught by the template and,
mainly, the road contrast is almost never the same on both
sides. With the full range, some responses are also reinforced
by templates with similar rotations.
To obtain differently oriented templates, rotating the
existing one is not a solution: with a rectangular template, as
is the case here, rotation would lose information from the
template corners. The template should also not be resized to
accommodate the rotated result, so that results stay consistent
across the different orientations.
To overcome this, the template is kept at the original size
and the missing corner information is filled with data from
the original image. This is accomplished by extracting a
rotated rectangular section directly from the image. To do so,
it must be known exactly where each pixel comes from, so it
can be placed in the corresponding position of the template to
use. To find each pixel's location in the original image, the
rotation equations given by the rotation matrix (in terms of
−θ) are used. The rotation is performed around the center
point C of the template in the original image, giving:

x_s = (x − C_x) cos(−θ) − (y − C_y) sin(−θ) + C_x
y_s = (x − C_x) sin(−θ) + (y − C_y) cos(−θ) + C_y   (10)

With this set of templates, a matching is performed for each
one using the correlation coefficient criterion. The final result
is the sum of all the unnormalized responses.
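Extracting a rotated rectangular section via (10) can be sketched as follows (a hypothetical NumPy helper using nearest-neighbor sampling and clamping at the image border; odd template sizes keep the sampling exact at θ = 0):

```python
import numpy as np

def rotated_template(image, center, w, h, theta):
    """Extract a w x h template rotated by theta around `center`, per (10)."""
    cy, cx = center
    H, W = image.shape
    tmpl = np.zeros((h, w), dtype=image.dtype)
    for ty in range(h):
        for tx in range(w):
            # offset of this template pixel from the template center
            dx, dy = tx - (w - 1) / 2.0, ty - (h - 1) / 2.0
            # rotate the offset by -theta around C to find the source pixel
            sx = dx * np.cos(-theta) - dy * np.sin(-theta) + cx
            sy = dx * np.sin(-theta) + dy * np.cos(-theta) + cy
            tmpl[ty, tx] = image[min(max(int(round(sy)), 0), H - 1),
                                 min(max(int(round(sx)), 0), W - 1)]
    return tmpl
```

Each rotated template can then be matched against the image with the correlation coefficient, and the unnormalized responses summed as described above.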
For the results of the color filter and of the template
matching to be comparable, two more operations must be
performed on the template matching result: a threshold and a
binarization. The threshold value was set to 150 (the result
having been normalized to [0, 255]) and the binary threshold
to 0 (every pixel with a value greater than 0 is set to white).
With the results of both methods now in the same domain,
with the detected road dilated, they must be combined. The
combination is a pixel-wise weighted sum of both images,
each contributing 50% to the final result. A final threshold is
then applied, with a value of 150. The final result is presented
in Fig. 4.
Fig. 4. Association of the template matching and filter
techniques after threshold.
The objective of the preceding methods and operations was
to study a new way of obtaining roads from a satellite image,
other than the previously discussed automatic operation of
Steger's line detection method, or a way to amend it.
A hybrid solution in which both methods contribute to the
result can also be considered. One possible way is to add both
images pixel-wise and then threshold the result at some
value. Before adding them, the Steger result must be dilated,
because its lines are thin, so that the two images become
comparable. The threshold then clears faint responses of
Steger's method that were not reinforced by the new
approach. The combined result completes the road areas
undetected by each method, but also adds some extra noise.
Fig. 5 presents the result of the described combination. The
threshold was set at 100 and the contribution of each method
is the same as presented earlier. The example shown earlier
remained difficult; nevertheless, some road sections were
added, completing the overall network. The noise added to
the image is considerable, however, and the method also
increases computational complexity. For these reasons, and
because it does not dramatically improve the overall results,
this method was not used.
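The combination described above might be sketched like this (a hypothetical helper; the 3 × 3 dilation size is our choice, while the 50%/50% weights and the threshold of 100 follow the text):

```python
import numpy as np

def combine_results(steger, color_tm, thresh=100):
    """Pixel-wise weighted sum of dilated Steger lines and the new-approach
    result, followed by a threshold (both inputs are uint8 in [0, 255])."""
    # dilate the thin Steger lines with a 3x3 maximum so the images compare
    padded = np.pad(steger, 1, mode='edge')
    thick = np.zeros_like(steger)
    for dy in range(3):
        for dx in range(3):
            np.maximum(thick, padded[dy:dy + steger.shape[0],
                                     dx:dx + steger.shape[1]], out=thick)
    combined = 0.5 * thick.astype(float) + 0.5 * color_tm.astype(float)
    return np.where(combined > thresh, 255, 0).astype(np.uint8)
```

With a threshold above 0.5 · 255 ≈ 128, only pixels supported by both methods would survive; the value 100 lets either method alone keep a pixel, which is what completes the undetected areas at the cost of extra noise.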
Fig. 5. Combination of the new approach with Steger.
F. Road Structuring From Blobs
The result of the road extraction method has a distinctive
property: the images are composed of aggregated sets of
white pixels. In computer vision, regions with similar
properties are called blobs. The sets of white pixels in the
resulting image can be classified as blobs; in this case they
are very simple blobs, sharing a single property that
distinguishes them from the surroundings: they are white
(their pixels have a value of 255) while the surroundings are
black (value 0).
To find all blobs, the image is scanned for non-zero pixels
(since there are only two values, this detects the white ones).
Whenever such a pixel is found that does not yet belong to a
discovered blob, all connected components of the same kind
are collected, forming a blob. A pixel is considered connected
to another if it sits in its Moore neighborhood (i.e. in exactly
one of the 8 immediately adjacent pixels). To find all the
connected components, a BFS is started from the first pixel
discovered.
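The scan-and-BFS procedure can be sketched as follows (a minimal version for a binary image stored as nested lists; names are ours):

```python
from collections import deque

def find_blobs(img):
    """Find 8-connected blobs of non-zero pixels via BFS."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                blob, q = [], deque([(y, x)])   # BFS rooted at the first pixel
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    blob.append((cy, cx))
                    # Moore neighborhood: the 8 surrounding pixels
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                q.append((ny, nx))
                blobs.append(blob)
    return blobs
```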
Essential to road structuring is a thinning operation
performed on the blob image. This is accomplished with the
Zhang-Suen thinning algorithm, which operates on a binary
image (as is the case here) and satisfies two essential criteria:
the thinned responses are always 1 pixel thick and structure
connectedness is preserved. The resulting thinned structures
are named ThinBlobs.
Next, the points of each ThinBlob are classified. The key
points to discover are junctions and terminations. Junction
points are those from which more than two branches leave,
whereas termination points are those from which fewer than
two branches leave. To find these special points, the blob is
traversed point by point; at each, its Moore neighborhood
(points P1 to P8) is analyzed and the point is classified.
Algorithm Identify Junction or Termination
1. Initialize N = 0
2. Initialize connected = false
3.
4. For points (P1 → P7)
5.     If is blob point
6.         If !connected
7.             N++
8.             connected = true
9.         End If
10.    Else
11.        connected = false
12.    End If
13. End For
14.
15. If P8 is blob point
16.     If connected AND P1 is blob point
17.         N--
18.     Else If !connected AND P1 is not blob point
19.         N++
20.     End If
21. End If
22.
23. If N > 2
24.     Is a junction
25. Else If N < 2
26.     Is a termination
27. End If
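The counting above amounts to counting contiguous runs of blob pixels in the circular Moore neighborhood. A compact equivalent (our formulation for illustration, with `neigh` a list of 8 booleans P1…P8 in circular order around the point):

```python
def classify_point(neigh):
    """Classify a skeleton point from its 8 Moore neighbors (circular order)."""
    # N = number of contiguous runs of blob pixels around the point,
    # counted at each 0 -> 1 transition (P8 wraps around to P1)
    n = sum(neigh[i] and not neigh[i - 1] for i in range(8))
    if n > 2:
        return 'junction'
    if n < 2:
        return 'termination'
    return 'branch point'
```

Each run of set neighbors corresponds to one branch leaving the point, so three or more runs make a junction and fewer than two make a termination.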
Having the locations of terminations and junctions, every
other point is classified as belonging either to a termination
branch or to a junction branch. Termination branches are sets
of points linking a junction to a termination, or a termination
to another termination, whereas junction branches link two
junctions.
To compute the termination branches, a BFS exploration is
used, rooted at each of the discovered termination points. The
exploration ends either when there are no points left to
explore in the blob or when a junction point is discovered. It
returns all traversed points, which are stored as the branch
points. Every time a termination branch linked to a junction is
found, the number of connected terminations in the respective
Junction structure is incremented.
Finding the points that constitute a junction branch is done
in the same way as for termination branches, that is, by BFS
exploration until another junction is detected. The branch is
added to the newly found Junction structure, so that each
branch is computed only once.
Another operation done during this stage is to estimate the
angle of the termination branches. This angle is the
approximate orientation of the branch ending, which will be
necessary for the next step. It is computed as the running
average of the angle over the last (at most) M points of the
branch. The "angle between two points" referenced here is the
angle between the x-axis of the reference frame and the line
formed by the two points. The value of M used in the
implementation was 5.
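As an illustration, the branch-ending angle estimate might look like this in Python. Note this sketch averages unit vectors rather than raw angles, a wrap-safe variant of the running average described in the text (an assumption of this sketch, not a detail from the thesis):

```python
import math

def branch_angle(points, M=5):
    """Estimate branch-ending orientation: average of the angles
    between consecutive points over the last (at most) M points.

    points: ordered list of (x, y). Averaging unit vectors keeps the
    result well-defined when angles wrap around ±pi.
    """
    tail = points[-M:]
    sx = sy = 0.0
    for (x0, y0), (x1, y1) in zip(tail, tail[1:]):
        a = math.atan2(y1 - y0, x1 - x0)  # angle vs. the x-axis
        sx += math.cos(a)
        sy += math.sin(a)
    return math.atan2(sy, sx)
```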
Fig. 6 shows the result of the described branching
operations applied to a (perfect) road extraction. The
junction branches are depicted in blue and the termination
branches in orange.
Fig. 6. Branching of an ideal road extraction.
To bridge the gaps between the thin blobs, which may be due
to road occlusion or to shortcomings of the road detection
algorithm, a linking algorithm is proposed. The objective is
to find a path through the blobs that approaches an end point.
The first thing to do is select a starting (thin) blob from a
given initial point and then search for nearby (thin) blobs.
When searching for close blobs, the main idea is to explore
the areas next to blob endings taking advantage of the
termination branch orientation. When a blob is found, the
termination and the new point detected are linked with a
straight line. This is meant to fill small gaps, so a straight line
will in most cases be a good or optimal approximation to the
underlying road.
In the linking process, all the blob branches that were
previously found (and linked) are not eligible to start a new
search process. In this way the linking will cascade from a
given point, through the blobs, until there are no more blobs
left to link. A blob is considered to be in linked state after all
of its terminations have been explored. The blobs in this state
can’t be linked anymore.
When no more unlinked blobs are found, the image is
reanalyzed as a new blob image. In this way the newly
linked blobs will be detected as one, with all the terminations,
branches and properties already discussed.
When extraction noise is present in the blob image, a few
blobs can be incorrectly linked (as they don’t represent roads)
so some modifications to the method will be tested.
The first modification will be to deal with smaller images.
Using the starting point, a sub-region of the original image is
extracted around that point. Then all the previously discussed
methods are applied. Having linked the blobs on the image,
the result is parsed as a blob and all of its termination points
are obtained. From these terminations, at most two points
are chosen and enqueued. Then the process is repeated on
each of the enqueued points, treating them as starting points
of a new process, until no points are left. The results
obtained are saved overlapping one another, so that in the end
a more complete road network is obtained.
The sub-region area used was 300 × 300 pixels.
The two points chosen from the terminations will be those
that, in relation to an endpoint, are the closest and have the
smallest azimuth. The Cartesian distance between two points
was used to evaluate the proximity of the points, while the
azimuth between two points was defined as follows:

az(P, E) = |atan2(P_y, P_x) − atan2(E_y, E_x)|    (11)

Here, P is a given point in Cartesian coordinates and E
refers to the endpoint, also in Cartesian coordinates. If the
closest point and the one with the lowest azimuth are the
same, only one is enqueued.
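Equation (11) and the candidate selection can be sketched in Python; this is an illustration with hypothetical helper names, not the thesis implementation:

```python
import math

def azimuth(p, e):
    """Azimuth difference of Eq. (11): absolute difference between the
    polar angles of point p and endpoint e (Cartesian coordinates)."""
    return abs(math.atan2(p[1], p[0]) - math.atan2(e[1], e[0]))

def pick_candidates(terminations, endpoint):
    """Pick at most two termination points: the closest to the endpoint
    and the one with the smallest azimuth; if both criteria select the
    same point, only that one is returned (and enqueued)."""
    closest = min(terminations, key=lambda p: math.dist(p, endpoint))
    smallest_az = min(terminations, key=lambda p: azimuth(p, endpoint))
    return [closest] if closest == smallest_az else [closest, smallest_az]
```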
The overlapping of individual steps can create multiple
responses for the same extracted road. Since the result will be
parsed as one big ThinBlob, to avoid such duplicates the
image is first cleaned of small dark blobs (using a state-of-
the-art blob library [12]) and then thinned (using the Zhang-
Suen algorithm).
Fig. 7. Road extraction after linking, using several starting
points.
Using all of the user input points produces better results.
The process is repeated for each one of them and the
contributions are then overlapped and treated in a similar
manner as before. Fig. 7 shows the overall result of this
process on an example image.
V. BUILDING ROADBOOK
A. Parallel Execution
With the methods of the previous chapter available, two
kinds of inputs are needed: an image (satellite map) and a
(possibly empty) set of points assumed to be close to roads.
The end point must also be provided and translated to each
segment's reference frame, even if it is located outside the
image bounds. The segments can, in this way, be
independently treated as single images.
Since the method to obtain roads has its own independent
group of variables, the easiest way to gain some performance
is to compute the result of each segment, with its set of
points, in parallel. This is done by executing the road
detection method for each segment in its own independent
thread. This kind of parallel execution is referred to as task
parallelism.
There will be as many tasks as map segments. The results of
all tasks are then stitched together in their segments' original
relative positions, forming the global result of the initial map.
Using the parallel approach, the overall processing time will
depend on the number of threads allowed by the system
running the application and on the number of segments. In the
best case it will take as long as the longest task; in the worst
case, on a single-threaded system, it will take as long as the
sequential approach. The overhead of creating and destroying
tasks is considered negligible compared to the tasks'
complexity.
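A minimal sketch of this task-parallel scheme, using Python's thread pool as a stand-in for the implementation's threading mechanism; `process_segment` is a placeholder for the per-segment road detection, not a function from the thesis:

```python
from concurrent.futures import ThreadPoolExecutor

def process_segment(segment, points, endpoint):
    # Placeholder for the per-segment road detection described above;
    # it just returns an identifiable result for demonstration.
    return ('roads', segment, tuple(points))

def process_map(segments, points_per_segment, endpoint):
    """Run road detection for every map segment as an independent task
    and collect the results in segment order, ready for stitching."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(process_segment, seg, pts, endpoint)
                   for seg, pts in zip(segments, points_per_segment)]
        return [f.result() for f in futures]
```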
B. Graph Conversion
Using the generated global result in a way that allows a
roadbook to be presented requires the detected paths
represented in the image to be translated into a graph.
The first thing to do is to find how the paths are connected.
The methods already presented allow an image with thinned
interconnected lines to be translated into terminations,
junctions and branches. This structure has some degree of
similarity to the intended graph, so the first operation on the
global result is to apply these methods and obtain a
structured data set describing the represented network.
When converting the structured data resulting from the
image processing operations into the graph, the nodes
correspond to the junctions and terminations and the links
correspond to the existing termination or junction branches.
Converting a branch into a link requires the newly created
link to be added to the adjacency lists of both nodes at the
link's ends. These nodes are identified by the points where
they are located: the junction points in the case of a junction
branch, or the junction and termination points in the case of a
termination branch. The links have access to the branch's
point list, since this information is available. The link's
distance (or length) is given by the number of points in that
list, thus representing the distance between two nodes in
pixels. Fig. 8 illustrates this conversion process from a road
extraction result.
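The branch-to-link conversion might be structured as below; a Python sketch with hypothetical `Node`/`Link` classes, where the link length is the branch's point count in pixels, as described:

```python
class Node:
    def __init__(self, point):
        self.point = point  # (x, y) of the junction or termination
        self.links = []     # adjacency list

class Link:
    def __init__(self, a, b, points):
        self.ends = (a, b)
        self.points = points          # branch point list from extraction
        self.length = len(points)     # distance between the nodes, in pixels
        a.links.append(self)          # wire the link into both
        b.links.append(self)          # adjacency lists

def add_branch(nodes, branch_points):
    """Convert a branch into a link: look up or create the nodes at the
    branch ends, then connect them. nodes: dict point -> Node."""
    for pt in (branch_points[0], branch_points[-1]):
        if pt not in nodes:
            nodes[pt] = Node(pt)
    return Link(nodes[branch_points[0]], nodes[branch_points[-1]],
                branch_points)
```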
Fig. 8. Graph conversion.
C. Result Review
Presenting the result of the image processing to the user is a
very important step, as it allows them to correct the detected
course or select a new one and choose a path. For this
purpose an interactive solution was adopted.
In this stage the user has the ability to add new nodes,
connect existing ones, or remove existing links or nodes.
These operations are made over the graph structure previously
presented.
The start, middle and finish nodes are added to the graph
like any other node. These are automatically added after the
conversion from the image processing result to the graph is
done. When a node is added to the graph, a search for a path
in its vicinity is made. If a path is found, the node is
automatically inserted into that path. This is a graph operation
accomplished by breaking the link in two and resetting the
node ends of the resulting links accordingly. The point list is
also divided, which is just a matter of finding the right point
position inside that list. Since the point list is the result of a
BFS operation, the points are sequentially ordered, so
dividing the list requires nothing more than breaking it at that
point. If the node to add is a start, finish or middle point, a
search for nearby nodes is additionally performed and any
node found is transformed into one of these special points
accordingly. The difference between these and any other node
is just a flag inside the node structure indicating what kind of
node it is. These nodes are also identified in the GUI and are
used for other graph operations.
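The list-splitting step can be illustrated as follows; since the points are BFS-ordered, inserting a node reduces to cutting the list at the node's position (a sketch, not the thesis code):

```python
def split_link(link_points, node_point):
    """Split a link's BFS-ordered point list at a node that lies on it.
    Returns the two resulting point lists; both new links share the
    node point as an end."""
    i = link_points.index(node_point)
    return link_points[:i + 1], link_points[i:]
```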
Connecting two nodes will add a new link to each of the
nodes, with the point list corresponding to a straight line
between them. A straight line can be a good approximation to
the underlying missing road segment in many cases. In
situations where this is not true, some intermediary nodes can
be added to better approximate the road's shape.
Removing nodes and links is useful in cases of faulty road
detection. Selecting any node will remove it and all of its
links, so neighboring nodes, if any, are also affected. Start and
finish nodes cannot be removed. To delete a link, the two
linked nodes must be chosen and the remove action selected.
D. Path Calculation
Start and finish nodes should also be connected for a path to
be computed. If the user has selected middle points, those
must also be connected and the path will pass through them.
While start, finish and middle nodes are not connected, the
application doesn’t allow advancing to next stage.
The algorithm used to compute the roadbook path is a
modification of Dijkstra's shortest path algorithm. The
first operation is to check whether all required path nodes are
connected, that is, whether all targeted nodes can be reached
starting from a source node. The most efficient way to do this
would be an exploration algorithm such as a BFS or a DFS,
checking whether all selected nodes were found. With this
solution the worst-case complexity would be O(E), meaning
that at most all the edges would be visited once. However,
the goal here is not solely to find whether the nodes are
connected.
Since Dijkstra's algorithm is going to be applied anyway, it
is also used to check whether the nodes are connected. A run
of Dijkstra's algorithm outputs the minimum distance from a
source node to all remaining graph nodes. If the source is not
connected to a given node, Dijkstra's output indicates that the
distance separating them is infinite. So, checking whether a
given set of nodes is connected using Dijkstra's algorithm is
just a matter of checking that the distances to all nodes are
less than infinite. If this condition holds, then there is a path
from start to finish through all middle points. With the
implementation of Dijkstra's algorithm used, the worst-case
complexity is O(E log V), which is slightly worse than an
exploration algorithm for the connectivity task alone; but
since Dijkstra is computed afterwards in any case, this single
computation solves the connectivity problem and
simultaneously computes the path.
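A compact Python sketch of this combined use of Dijkstra's algorithm (a binary-heap implementation, hence the O(E log V) bound); the adjacency-dict representation and function names are assumptions of this illustration:

```python
import heapq
import math

def dijkstra(adj, source):
    """Shortest distances from source to every node. Unreachable nodes
    keep distance infinity, which doubles as the connectivity check.
    adj: {node: [(neighbor, length), ...]}."""
    dist = {n: math.inf for n in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def all_connected(adj, source, required):
    """All required nodes reachable iff none is left at infinity."""
    dist = dijkstra(adj, source)
    return all(dist[n] < math.inf for n in required)
```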
If the computation only involves discovering the path
between the start and finish points, the problem is solved: the
discovered path is the shortest, as computed by Dijkstra's
algorithm. When more nodes are involved, a more complex
solution must be considered.
The solution here should be a method that gives the user
some level of choice over the final calculated path.
To give the user this control, an extra parameter, a node
priority, was introduced into the path calculation. The user
can set this priority in the GUI by selecting the intended
middle node. The algorithm then searches for the lowest
priority value first, and only afterwards searches for the
closest middle node. By default, nodes are initialized with a
priority value of infinity; with this value, preference is given
to the closest middle node instead.
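The priority-then-distance rule can be sketched as a single ordering: compare middle nodes first by their (default infinity) priority, then by shortest distance. The names below are hypothetical, for illustration only:

```python
import math

def next_target(middle_nodes, dist):
    """Choose the next middle node to route to: lowest explicit
    priority first; nodes left at the default priority (infinity)
    fall back to the closest-by-distance rule.

    middle_nodes: {name: priority}; dist: {name: shortest distance
    from the current node, e.g. from a Dijkstra run}."""
    return min(middle_nodes, key=lambda n: (middle_nodes[n], dist[n]))
```

In the Fig. 9 example, D (priority 1) is selected before E even if E is closer, because any finite priority sorts ahead of the default infinity.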
Fig. 9. Path selection using two prioritized middle nodes.
Fig. 9 shows an example with prioritized middle nodes.
Here the user has chosen to assign a priority of 1 to D. As
this is the lowest priority of any of the middle nodes, the
algorithm starts by finding a path to D, which will be the
shortest path, given the use of Dijkstra's algorithm. No
priority was set on E, so the algorithm then finds the next
closest middle node, which in this case is the remaining one.
Using this approach the user has control over the calculated
path.
E. Roadbook Presentation
With the path calculated, all that is left to do is build the
actual roadbook. The roadbook presents the directions
between road junctions, the total distance traveled and the
distance from the last indication. The shape of each junction
must be shown in each entry, as well as the navigational
direction (where the route comes from and where it is going).
To depict the shape of a junction, a roadbook tulip is used.
This takes advantage of information already available: the
shape of the roads around the junction. Whether
automatically detected or adjusted by the user, this shape
represents the road junction in the best possible way, given
that the only source of information is the satellite map image.
The roadbook is formed by a set of pages, each one having
a set of entries and a header. Each entry represents an
indication of the road to follow with a tulip, a sequence
number, and the total and partial distance covered. A page is
built as an image with dimensions large enough to
accommodate the header plus a predetermined number of
entries. The number of entries per page was set to 7.
Fig. 10. Generated Roadbook.
The nodes of the computed path are iterated. At each
previously identified node (where a tulip was created) an
entry is created. A sequence number is maintained, to be
printed in each entry, as well as an accumulator with the total
traveled distance and another with the partial distance. Every
time a node signaled to appear in the page is reached, the
partial distance is printed on the page and reset to zero,
providing in this way only the distance from the last entry.
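The accumulator logic above might be sketched as follows; the names are illustrative, with `tulip_nodes` standing in for the nodes flagged to appear on a page:

```python
def build_entries(path_links, tulip_nodes):
    """Walk the computed path; at every node flagged for the roadbook,
    emit an entry with its sequence number, total distance, and
    partial distance (distance from the last indication).

    path_links: list of (node, link_length_in_pixels) along the path;
    tulip_nodes: set of nodes that received a tulip."""
    entries, seq, total, partial = [], 0, 0, 0
    for node, length in path_links:
        total += length
        partial += length
        if node in tulip_nodes:
            seq += 1
            entries.append((seq, total, partial))
            partial = 0  # reset: next entry shows distance since this one
    return entries
```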
All distances are internally stored as a sum of the graph
link lengths, given in pixels, and presented in kilometers (the
defined metric). Fig. 10 shows a constructed roadbook
corresponding to the path calculated in Fig. 8.
VI. CONCLUSIONS AND FUTURE WORK
Road recognition from the obtained images was one of the
main issues in this work. To overcome this problem, several
image processing methods aimed at road extraction from
satellite maps were studied, and a new approach for road
extraction was proposed, with results comparable to existing
state-of-the-art methods. The road extraction problem was
addressed using the developed method along with a linking
algorithm that tries to improve the results and better suit
them to the objective of this part of the project. This
algorithm and the developed road extraction method are
independent and can work separately. Specifically, the
extraction process can serve as a road recognition method in
other projects, can be improved with more or different
intermediary methods (template matching and color filtering
were used here), or can even serve as a starting point for
other algorithms to build upon.
As the road extraction process may not be perfect, map
reviewing by the user takes on particular significance to
correct and complete the path. In this way, it is guaranteed
that a roadbook can always be generated for any location. A
future improvement could be to replace the current road
extraction method with another, eventually developed, with
better results. User validation will always be required, even
in the best-case scenario, to ensure that the computed path
presented in the roadbook is the intended one.
In terms of performance, a different parallelization strategy
could be explored. The most intensive tasks are the image
processing ones, so instead of concurrently processing each
segment of the satellite image, smaller tasks could be
identified inside each large processing task, making the
parallelization more efficient. Another possible improvement
would be to perform some or all of the image processing
computations on the GPU. Given the purpose of this unit and
the computing power usually associated with it, delegating
the execution of these tasks to the GPU would be expected to
improve performance.
Integration with other systems to improve results can also
be planned as future work. Building a system combining GPS
with the satellite image approach would bring the best of both
worlds, providing more accurate results and introducing a
variety of new approaches to the roadbook problem.
Similarly, this work could improve some of the existing
commercial roadbook creation solutions that already use such
systems.
A greater variety of maps could also be introduced by
integrating map services other than Google Maps. This could
be taken as an opportunity to improve results: by comparing
different images of the same location, road extraction could
potentially achieve better results. In fact, some of the studied
existing methods are already based on identifying image
differences. Integration with commercial satellite image
services could also allow new approaches to road extraction,
given the nature of the provided images (high-resolution,
multispectral).
REFERENCES
[1] Steger, C. (1996). Extracting curvilinear structures: A
differential geometric approach. In Computer Vision—
ECCV'96 (pp. 630-641). Springer Berlin Heidelberg.
[2] Mena, J. B. (2003). State of the art on automatic road
extraction for GIS update: a novel classification. Pattern
Recognition Letters, 24(16), 3037-3058.
[3] Tripy, GPS + Digital Road Book. (n.d.). Tripy GPS + Digital
Road Book for motorbikes, 4X4, Quad and Oltimers. Retrieved
September 18, 2013, from http://www.tripy.eu/en/
[4] Li, Y., & Briggs, R. (2009). Automatic extraction of roads from
high resolution aerial and satellite images with heavy noise.
World Academy of Science, Engineering and Technology, 54,
416-422.
[5] Bacher, U., & Mayer, H. (2005). Automatic road extraction
from multispectral high resolution satellite images.
Proceedings of CMRT05.
[6] Christophe, E., & Inglada, J. (2007, September). Robust road
extraction for high resolution satellite images. In Image
Processing, 2007. ICIP 2007. IEEE International Conference
on (Vol. 5, pp. V-437). IEEE.
[7] Porikli, F. M. (2003, September). Road extraction by point-
wise gaussian models. In AeroSense 2003 (pp. 758-764).
International Society for Optics and Photonics.
[8] Mayer, H., Baltsavias, E., & Bacher, U. (2005). Automated
Extraction, Refinement, and Update of Road Databases from
Imagery and Other Data.
[9] Laptev, I. (1997). Road extraction based on snakes and
sophisticated line extraction. Master's thesis, Royal Institute of
Technology, Stockholm, Sweden.
[10] Kass, M., Witkin, A., & Terzopoulos, D. (1988). Snakes:
Active contour models. International journal of computer
vision, 1(4), 321-331.
[11] Pandit, V. (2009). Automatic Road Extraction From High
Resolution Satellite Imagery.
[12] CvBlobsLib — OpenCV Wiki. (n.d.). OpenCV Wiki. Retrieved
July 2, 2013, from http://opencv.willowgarage.com
/wiki/cvBlobsLib
[13] Issue 4189: Add a scale bar to static maps. (n.d.). Gmaps-api-
issue. Retrieved November 25, 2013, from
https://code.google.com/p/gmaps-api-issues/issues/detail?id=
4189