An Object Detection and Tracking Technique Using Superpixel Extraction in Compressed Video Sequences
1 Chakaravarthi S, 2 Visu P, 3Ganesan R
1Associate Professor, CSE Department, Velammal Engineering College, Chennai, Tamilnadu, India.
2Associate Professor, CSE Department, Velammal Engineering College, Chennai, Tamilnadu, India.
3Assistant Professor-III, CSE Department, Velammal Engineering College, Chennai, Tamilnadu, India
[email protected], [email protected],[email protected]
Abstract— Segmenting foreground objects from a video sequence is a fundamental and critical step in video analysis tasks such as object recognition and tracking. Several motion detection methods and edge detectors have been developed so far, but extracting a clean foreground object remains difficult because of interference from factors such as weather, illumination, shadow and clutter. This paper proposes a new foreground detection approach for compressed video sequences that first extracts superpixels from each frame and then applies a background subtraction algorithm and optical flow, together with SMED (Separable Morphological Edge Detector), to the superpixels extracted in each frame of the video. SMED is robust to illumination changes and capable of detecting even slight motion in the video sequence. The proposed method is fast and accurate in detecting moving objects in various situations, for example fast-moving objects, slow-moving objects, and even moving objects in dynamic scenes where both the foreground and background change. By applying these techniques sequentially to the video sequence, the foreground object can be segmented accurately with increased execution speed, improved accuracy, and reduced background noise.
I. INTRODUCTION
A video surveillance system [L.Cheng, M.Gong]1 must be capable of continuous operation under various weather and illumination conditions. It should be able to handle movement through cluttered areas, objects overlapping in the visual field, shadows, lighting changes, effects of moving elements of the scene (e.g. swaying trees), slow-moving objects, and objects being introduced into or removed from the scene. Real-time video segmentation also requires accurately separating the foreground image from a background scene. However, the key problems of noise, undetected edges, loss of smoothness, improper segmentation of the foreground object (especially for occluded objects), and maintaining robustness under illumination changes largely remain, even though many algorithms have been proposed.
International Journal of Pure and Applied Mathematics, Volume 119, No. 17, 2018, 553-574. ISSN: 1314-3395 (on-line version). url: http://www.acadpubl.eu/hub/ Special Issue
Likewise, additional work is required for object tracking, motion analysis and behaviour analysis when segmenting foreground objects. Conventional approaches based on background techniques often fail in these general circumstances.
The main goal of segmenting the foreground object from the background scene is to eliminate all the related problems and obtain an accurate foreground image. Many edge detection algorithms are used to obtain an accurate foreground image from a video, but each of them has constraints and none satisfies the purpose entirely. To tackle these problems, this paper gives a solution by applying the following methods sequentially, thereby improving efficiency. First, superpixels are extracted from each video frame to reduce the number of comparisons. Second, the background subtraction algorithm and optical flow are applied to those superpixels extracted in each frame of the video [K.Suganya Devi, N.Malmurugan]2, instead of to individual pixels, so that the edges of objects in the video are detected clearly. Finally, using the SMED (Separable Morphological Edge Detector), the foreground object is segmented from the background scene accurately.
Until now, images and video frames have been analysed in terms of pixels or sub-pixels, which leads to many comparisons, and edge detection techniques are applied to segments formed of pixels or sub-pixels only. The concept of extracting superpixels used in this paper is much more beneficial: it reduces the number of frame-segment comparisons, reduces the amount of data processed by more than 75%, and thus reduces the storage requirement. Background subtraction is an essential part of surveillance applications for effective segmentation of objects of interest in a scene. The purpose of a background subtraction algorithm is to distinguish moving objects (hereafter referred to as the foreground) from static or moving parts of the scene (called the background). A simple motion detection algorithm [O.Barnich, M.Van Droogenbroeck]3 compares a static background frame with the current frame of a video scene, pixel by pixel. This is the basic principle of background subtraction: each video frame is compared against a background model, and the pixels that deviate significantly from the model are considered to be foreground. These "foreground" pixels are then post-processed for object localisation and tracking. The general framework of background subtraction (BGS) [Cong zhoo, Xiaogang wang, wai-kuen cham]4 usually contains four steps: preprocessing, background modelling, foreground detection, and post-processing. The preprocessing step collects training samples and removes imaging noise; the background modelling step builds a background model which is generally robust to certain background changes; the foreground detection step produces foreground candidates by computing the deviation of a pixel from the background model; finally, the post-processing step refines those candidates to form foreground masks.
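The pixel-wise comparison underlying this pipeline can be sketched as follows; the function name and the fixed threshold of 30 are illustrative choices, not values given in the paper:

```python
import numpy as np

def subtract_background(frame, background, threshold=30):
    """Mark pixels that deviate from the background model as foreground.

    frame, background: 2-D uint8 grayscale arrays of equal shape.
    Returns a boolean mask (True = foreground candidate).
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

# Toy example: a static background with one changed block.
background = np.zeros((4, 4), dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200          # a "moving object" appears
mask = subtract_background(frame, background)
print(mask.sum())              # 4 foreground pixels
```

In a full BGS system the `background` array would be the output of the background modelling step and the mask would go on to post-processing.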
Among these four steps, the third, namely foreground detection, is the essential procedure that should identify the foreground object accurately, i.e., the resulting video sequence or image should not contain background noise. Our goal is to build a robust, adaptive tracking framework for foreground detection of video objects in the compressed domain that is flexible enough to handle variations in lighting, moving scene clutter, multiple moving objects and other arbitrary changes to the observed scene.
This paper primarily presents a new video processing strategy used to address the above problems associated with real-time video surveillance systems. A new foreground detection approach called Optical Flow with Gaussian Mixture model and SMED (OFGM-SMED), which operates on the extracted superpixels, is proposed for detecting foreground objects. The rest of the paper is organised as follows: Section II describes previous work and current shortcomings, Section III presents the proposed approach for superpixel extraction that is used by OFGM and SMED (OFGM-SMED), and Section IV presents some experimental results and discussions to verify the proposed approach. Finally, the paper is concluded in Section V.
II. RELATED WORK AND CURRENT SHORTCOMINGS
Sobel (1970), Prewitt (1971), Robinson (1977) and Frei-Chen (1977) gave the classical edge detection methods. They are very simple to compute and capable of detecting edges and their orientations, but they lack smoothing, are very sensitive to noise, and are inaccurate and less efficient. There is therefore a need for an efficient edge detection algorithm, and Canny proposed an edge detection algorithm that removed the constraints of the classical methods and is regarded as the best edge detection method; however, it shares the same conventional disadvantage that it segments the frame on a pixel-by-pixel basis only.
The current foreground detection algorithms can be divided into three classes: frame difference, optical flow and background subtraction.
Frame difference [K.Gupta, V.Anjali Kulkarni]5 calculates the pixel gray-scale difference between two neighbouring frames in a continuous image sequence and determines the foreground by setting a threshold. The frame difference method can be used in dynamic environments, but it cannot completely extract the whole foreground region; the central part of the target is lost, which results in poor target recognition. Moreover, this method struggles to accurately detect fast-moving objects and multiple objects.
Optical flow [B.K.P.Horn, B.G.Schunck]6 is the distribution of apparent velocities of movement of brightness patterns in an image. Optical flow can arise from relative motion of objects and can give essential information about the spatial arrangement of the objects viewed and the rate of change of this arrangement. Discontinuities in optical flow can help in segmenting images into regions that correspond to different objects. Since it is very hard to compute the true velocity field from an image sequence, the optical flow field, which is obtained from the information of moving objects, can be used to replace the velocity field. However, the optical flow field alone cannot be used, because it cannot eliminate the background noise [C.Stauffer, W.Grimson]7 that occurs due to the effect of ambient light.
Background subtraction [C.Stauffer, W.Grimson]7 is a common method used in foreground detection. It calculates the difference between the current image and a background image and detects the foreground by setting a threshold. Basically there are two methods to obtain the background image, viz.,
1. Defining the background image manually, and
2. Obtaining a background model by a training algorithm, as used in the Gaussian Mixture Model (GMM).
Compared to the former, the latter is more accurate and the result of foreground detection is much better. The background subtraction method is robust to illumination changes and slight movement, but when this method is used on long image sequences there may be considerable accumulated error in the foreground. Also, GMM produces good results for detecting the foreground in the uncompressed domain of a video sequence [K. Chitra Lakshmi, K. Nagarajan]21, but for the compressed domain it produces very poor results with high background noise. Optical flow covers long distances, and the background noise due to brightness change is smaller, which results in a lower accumulated error rate.
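The frame difference method described above can be sketched on toy two-frame data; the threshold of 25 is an assumed value, not one given in the paper:

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Foreground mask from the gray-level difference of two consecutive
    frames (lists of lists); 1 = moving pixel, 0 = static pixel."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prow, crow)]
            for prow, crow in zip(prev_frame, curr_frame)]

prev = [[50, 50], [50, 50]]
curr = [[50, 200], [50, 50]]          # one pixel changed between frames
print(frame_difference(prev, curr))   # [[0, 1], [0, 0]]
```

Note how only the changed pixel survives; this also illustrates why the interior of a uniformly coloured moving target is lost, as the text points out.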
In digital image processing [C.Gonzalez, Woods R Eddins]8, edge detection is a critical technique. Edge detection is the process of locating meaningful transitions in an image. Various edge detection algorithms [N.Senthil kumaran, R.Rajesh]9 have been proposed based on either gradient operators or statistical approaches. Generally, gradient operators are easily affected by background noise, and filtering operators are used to reduce the background noise rate. Morphological edge detectors [M.Fathy, M.Y.Siyal]10 are also available for edge detection and are more effective than gradient operators. Several kinds of morphological detectors exist, but they are not efficient when compared with the separable morphological edge detector. Edges at various angles are not covered, and thin edges are missed, by ordinary mathematical morphological detectors; the separable morphological edge detector detects thin edges and edges at various angles with less background noise [M.Y.Siyal, A.Solangi]11.
Various methods have been proposed for video image processing so far, but these existing methods have difficulties with occlusion, shadows, background noise and varying lighting conditions. This literature review describes the various techniques involved and their requirements such as memory, computing time and complexity.
The video surveillance method proposed by Baumann et al. [A.Baumann, M.Boltz, Julia Ebling]12 aims at robustness with low false positive and false negative rates simultaneously. However, the requirement is to have a zero false negative rate, and it should also cope with varying lighting conditions, occlusion situations and low contrast. Real-time video surveillance proposed by Nan Lu et al. [Nan Lu, Jihong Wang, Wu Q H, Li Yang]13 deals with real-time detection of moving objects. It addresses problems such as the storage space and time consumed to record the video; to avoid these problems it uses a motion detection algorithm, but this covers only the video that contains important information. The real-time visual surveillance method W4S [A.Rourke and M.G.H.Bell]14 (What, When, Where and Who) is a low-cost PC-based real-time visual surveillance system for detecting and tracking people and monitoring their activities in an outdoor environment. It has been implemented to track people and their body parts, but it has problems with sudden light changes, shadow and occlusion. W4S is an integrated real-time stereo method that has addressed these limitations; it deals with tracking people in outdoor conditions, but tracking becomes substantially harder in intensity images. The end-to-end method has been proposed for extracting moving targets from a stream of real-time videos; it classifies them according to image-based properties, but it requires robust tracking of moving targets. Intelligent video surveillance systems support human operators by identifying significant events in video; they can perform object detection in outdoor and indoor environments under changing illumination conditions, but they depend on the shape of the detected objects.
Automatic video surveillance using background subtraction [Wei Li, Xiaojuan Wu, Matsumoto K, Hua-An Zhao]15 has various problems. A pixel-based multi-colour background model is an effective solution, but this method suffers from slow learning at the beginning and cannot distinguish between moving objects and moving shadows. Multimedia surveillance [H.Rahmalan, M.S.Nixon, J.N.Carter]16 uses a number of related media streams, each of which has a different confidence level, to achieve various surveillance tasks; it is hard to insert a new stream into the system without any knowledge of prior history.
Edge detection has been a challenging problem in image processing; due to lack of edge information, the output image is not visually pleasing. Various kinds of edge detectors are discussed here. The Roberts edge detector [L.Daniel, Schmoldt, Pei Li, Lynn Abbott]17 detects edges that run along the diagonals at 45 and 135 degrees; its main disadvantage is that it takes a long time to compute. The Gaussian edge detector reduces background noise by smoothing images and gives better results in a noisy environment; the difficulty is that it is very time-consuming and computationally complex. Zero-crossing detectors use the second derivative of the input image and include the Laplacian operator; they have fixed characteristics in all directions, but are sensitive to background noise. In the Canny edge detector approach, if the threshold is set low it creates false edges, and conversely, if the threshold is set high it omits important edges. The main drawback of the Canny edge detector is its susceptibility to background noise [Browne Alan, T.M.McGinnity, G.Prasad, J.Condell]18.
To overcome the aforementioned problems present in the existing techniques, we propose a newly adapted approach. It is very robust and can overcome all the problems mentioned above, such as occlusion, shadow and lighting changes, while remaining robust to illumination changes and even slight movement. The proposed approach is very effective and is a good choice for both static and dynamic backgrounds with varying frame rates. Moreover, the proposed approach has produced good results in detecting foreground video objects in compressed video sequences.
III. SUPERPIXEL EXTRACTION
A superpixel represents a restricted form of region representation that balances the conflicting goals of reducing image complexity. A superpixel is a polygonal segment of a digital image/frame, larger than an ordinary pixel, rendered with similar properties such as colour, brightness, texture, vector and so on. Superpixels are groups of pixels with approximately the same pixel values. The main goal of extracting superpixels from an image/frame is to reduce the amount of data that undergoes background subtraction. Besides this, the advantages of using superpixels are computational efficiency, a reduced number of primitives, better generalisation, and a lower error rate [J.Lee, R.M.Haralick, L.G.Shapiro]19.
Superpixels have many desired properties. They are computationally efficient: they reduce the complexity of images from hundreds of thousands of pixels to just a few hundred superpixels. They are also representationally efficient: pairwise constraints between units, previously defined only for adjacent pixels on the pixel grid, can now model much longer-range interactions between superpixels. Superpixels are perceptually meaningful: each superpixel is a perceptually consistent unit, i.e. all pixels in a superpixel are most likely uniform in, say, colour and texture. They are nearly complete: because superpixels are the result of an over-segmentation, most structures in the image are conserved, and there is very little loss in moving from the pixel grid to the superpixel map.
The superpixel extraction algorithm is as follows.
Input: Image/frame of a video
Output: Image/frame with extracted superpixels
Method:
1. Initialize the first pixel value P1 as the initial value of the superpixel.
2. Get a pixel Pi.
3. Compare Pi with P1, i.e. compute Pi − P1.
4. If Pi − P1 lies between T1 and T2, keep Pi in the current superpixel, increment i, then go to step 2; else go to step 5.
5. Create the next superpixel along the direction of Pi from the current superpixel:
a. Fix the threshold values.
b. Initialize the current pixel value as the initial pixel value of this superpixel.
c. Go to step 2.
6. Stop when Pi = Pn.
Each pixel in the image/frame is subjected to the above analysis, and thus superpixels are extracted from the image/frame. The direction of a pixel that does not lie between the thresholds is noted, relative to the first pixel, to form a new superpixel in that direction. For instance, if a pixel below the first pixel is outside the threshold values, a new superpixel is created to the south of the current superpixel, and only the pixel coordinates greater than this pixel are taken for examination (Figure 1). This idea reduces the computation of superpixel extraction and also avoids overlapping of superpixels. The same procedure is followed until Pn is reached. This method of superpixel extraction yields superpixels of approximately the same size and shape. The important point to be noted during superpixel extraction is that pixel collisions must not occur, such that no pixel belongs to more than one superpixel.
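The grouping rule above can be sketched as a one-dimensional greedy pass; the scan order, function name and sample values are simplifications for illustration, not the paper's exact 2-D procedure:

```python
def extract_superpixels(pixels, t1=0.0, t2=1.0):
    """Greedily group a scan-line of pixel values into superpixels.

    A pixel joins the current superpixel while its difference from the
    superpixel's first pixel (P1) lies in [t1, t2]; otherwise a new
    superpixel is started at that pixel.
    """
    if not pixels:
        return []
    superpixels = [[pixels[0]]]
    anchor = pixels[0]               # P1 of the current superpixel
    for p in pixels[1:]:
        if t1 <= p - anchor <= t2:
            superpixels[-1].append(p)
        else:
            superpixels.append([p])  # start a new superpixel here
            anchor = p
    return superpixels

groups = extract_superpixels([10, 10.5, 10.8, 14, 14.2, 9])
print(len(groups))   # 3 superpixels
```

Note that each pixel is assigned exactly once, reflecting the no-collision requirement stated above.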
Let k be the total number of superpixels to be extracted from an image/frame, n the total number of pixels in the image/frame, and Pi the value of the i-th pixel. Fix a minimum threshold T1 = 0 and a maximum threshold T2 = +1 for each superpixel. The value of the first pixel in the superpixel is taken as the initial value for that superpixel, and each pixel value is compared with this first pixel value. The condition used to determine the difference is

Di = Pi − P1   (1)
T1 ≤ Di ≤ T2   (2)

If the difference of pixel values lies between the thresholds T1 (zero) and T2 (+1), the pixel is considered to be in the same superpixel; otherwise the pixel belongs to a new superpixel.
One of the major advantages of this method is that pixels within the same superpixel are highly correlated, while different superpixels have considerably less correlation between them, which helps to easily identify the edges of the object in the image/frame for background subtraction. The space needed to store the pixel values of an image/frame is reduced from thousands to a few hundred entries, since each superpixel stored in memory represents the whole group of pixels under the corresponding superpixel.
As superpixel extraction is computationally more efficient and reduces the number of comparisons, it is a better choice than working with pixels or sub-pixels. Still, superpixels alone cannot predict object boundaries clearly, and without a properly set boundary an efficient segmented foreground image cannot be obtained. Clear boundary setting is therefore needed and is done with the help of edge detectors; here SMED is used because it is robust to the common problems of edge detection. Hence, SMED together with the OFGM algorithm is performed on the superpixels, which gives an efficient image/video frame with clear boundaries.
IV. OPTICAL FLOW
A new foreground detection approach called OFGM-SMED, which makes use of Lucas-Kanade optical flow [Nan Lu, Jihong Wang, Wu Q H, Li Yang]13 and a Gaussian Mixture Model together with SMED, is proposed. A perfect foreground cannot be obtained by using optical flow alone because of brightness changes, but an optimal foreground can be obtained effectively by OFGM-SMED.
There are five types of optical flow methods, and Lucas-Kanade (LK) optical flow, shown in Figure 2, is a gradient-based algorithm [9]. If I(x, y, t) is the intensity of pixel m(x, y) at time t and vm = [vx, vy] is the velocity vector of pixel m(x, y), then after a short interval Δt the optical flow constraint equation is

∇I · vm + It = 0   (3)

where ∇I = (∂I/∂x, ∂I/∂y) is the spatial intensity gradient vector and It = ∂I/∂t is the temporal gradient.
Since vm is a two-dimensional variable, more constraints are needed to solve this problem. The LK optical flow method estimates vm by the v minimising (4), on the assumption that vm is constant in a small spatial neighbourhood Ω:

E(v) = Σ m∈Ω W²(m) (∇I · v + It)²   (4)
In (4), W²(m) is a window function that gives the central part of the neighbourhood greater weight than the peripheral part. For the pixels mi (i = 1, 2, ..., n) in Ω, the solution v can be obtained by

v = (AᵀW²A)⁻¹ AᵀW²b   (5)

where

A = (∇I(m1), ..., ∇I(mn))ᵀ,
W = diag(W(m1), ..., W(mn)),

and

b = −(It(m1), ..., It(mn))ᵀ.
Since the LK method computes optical flow at every pixel, it can detect all the changes between neighbouring frames, which makes it the best choice for detecting crowd movement [N.Hashimoto]20. However, optical flow methods are very sensitive to brightness changes, and with the LK method it is hard to find a suitable threshold to segment foreground from background. In fact, whatever choice is made, the detection result may either lose some foreground area or contain some background noise. Clearly an optimal foreground cannot be obtained by using the LK method alone, so we use another method to improve the result. After a series of experiments we found that combining LK optical flow with the SMED method yields a clean result.
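Equation (5) can be sketched directly in NumPy for a single neighbourhood; the gradient values and window weights below are illustrative inputs, not data from the paper:

```python
import numpy as np

def lucas_kanade_velocity(Ix, Iy, It, weights):
    """Solve v = (A^T W^2 A)^{-1} A^T W^2 b for one neighbourhood.

    Ix, Iy, It: spatial/temporal gradients at the n pixels of the window.
    weights: window weights W(m_i), emphasising the centre pixels.
    """
    A = np.stack([Ix, Iy], axis=1)   # n x 2 gradient matrix
    W2 = np.diag(weights ** 2)       # squared window weights on the diagonal
    b = -It                          # negated temporal gradients
    return np.linalg.solve(A.T @ W2 @ A, A.T @ W2 @ b)

# A window whose pattern moves one pixel in x: It = -Ix gives vx = 1, vy = 0.
Ix = np.array([1.0, 0.5, 0.2, 0.8])
Iy = np.array([0.1, -0.2, 0.3, 0.05])
It = -Ix
w = np.array([1.0, 2.0, 2.0, 1.0])
v = lucas_kanade_velocity(Ix, Iy, It, w)
print(np.round(v, 3))
```

The least-squares system is solvable only when the windowed gradients span two directions; in flat or purely one-dimensional texture A ᵀW²A becomes singular, which is the classical aperture problem.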
V. GAUSSIAN BACKGROUND MODELING
Gaussian Background Modeling (GBM) [M.Fathy, M.Y.Siyal]10, shown in Figure 3, is one of the various kinds of background subtraction methods. In this method, K Gaussian models are used to approximate the pixel values in the image, and these models are updated on each frame of the video. If the residual between the pixel value and the approximated value is larger than the set threshold, the pixel is considered foreground; otherwise it is considered background. Using a mixture of K Gaussians, the gray-level probability function of pixel X at time t is given as

P(Xt) = Σ n=1..K wn · (1 / (√(2π) σn)) · exp(−(Xt − μn)² / (2σn²))   (6)

where wn is the weight of the n-th Gaussian model, whose mean and variance are μn and σn². Usually the value of K is from 3 to 5; to represent a complex scene a larger K is needed, but it should be noted that the computation time increases for larger K.
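The foreground/background decision of such a mixture model can be sketched as follows; the 2.5-sigma matching rule is the common Stauffer-Grimson convention, not a value stated in this paper, and the model values are toy numbers:

```python
import math

def is_foreground(x, gaussians, match_sigma=2.5):
    """Classify a pixel value against K background Gaussians.

    gaussians: list of (weight, mean, variance) tuples; a pixel matching
    none of them within match_sigma standard deviations is foreground.
    """
    for weight, mean, var in gaussians:
        if abs(x - mean) <= match_sigma * math.sqrt(var):
            return False          # matches a background mode
    return True                   # foreground candidate

# Two background modes (e.g. road and shadow); 200 matches neither.
model = [(0.6, 90.0, 25.0), (0.4, 120.0, 16.0)]
print(is_foreground(95.0, model), is_foreground(200.0, model))  # False True
```

In a full GMM background subtractor the weights, means and variances would additionally be updated online on every frame.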
VI. SEPARABLE MORPHOLOGICAL EDGE DETECTOR
Image edges carry rich information that is very important for extracting image characteristics in object recognition. Edge detection refers to the process of identifying and locating sharp discontinuities in an image. Various edge detection algorithms [K.Suganya Devi, N.Malmurugan, R.Sivakumar]21 have been proposed based on either gradient operators or statistical approaches. Generally, gradient operators are easily affected by background noise, and filtering operators are used to reduce the background noise rate. Morphological edge detectors are also available for edge detection and are more effective than gradient operators. Several kinds of morphological detectors [Browne Alan, T.M.McGinnity, G.Prasad, J.Condell]18-[J.Lee, R.M.Haralick, L.G.Shapiro]19 exist, but they are not efficient when compared with the Separable Morphological Edge Detector. The effectiveness of many image processing and computer vision algorithms depends on how well meaningful edges are detected; due to lack of object edge information, the output image is otherwise not visually pleasing.
Existing edge detectors, such as Roberts's and Sobel's, have the main disadvantage of being sensitive to background noise and inaccurate. Because of these limitations, a morphological edge-detection operator, the Separable Morphological Edge Detector (SMED) [K.Suganya Devi, N.Malmurugan, R.Sivakumar]21, has been proposed. It requires less computation and has considerable performance compared with other morphological operators. The reasons for adopting the SMED operator are listed below.
1) SMED can detect edges at various angles, while other edge detectors cannot detect all types of edges.
2) The strength of the edges detected by SMED is twice that of other edge detectors.
3) SMED uses separable median filtering to remove background noise. Separable median filtering has been shown to have comparable performance to true median filtering, yet requires less computational power.
As shown in Figure 4, a Separable Morphological Edge Detector performs the following steps:
1. Grayscale transformation.
2. Median filtering.
3. Boundary extraction using a histogram and setting the threshold.
4. Removal of the noisy pixels from the image and filling of pixel breaks in a regular pattern.
5. Making the image one pixel thick in the horizontal and vertical directions.
SMED, which uses simple and easily implementable operators, has a lower computational requirement than other morphological edge-detection operators. The open-close operator has better performance than the SMED operator [N.Hashimoto]20, but it requires about eight times more computational power and is therefore not suitable for real-time applications.
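The separable median filtering that SMED relies on can be approximated by a 1-D median run first along rows and then along columns; this is a sketch of the idea, not the paper's exact operator:

```python
import statistics

def median_1d(values, k=3):
    """1-D median filter of odd window size k; edge values are kept as-is."""
    half = k // 2
    out = list(values)
    for i in range(half, len(values) - half):
        out[i] = statistics.median(values[i - half:i + half + 1])
    return out

def separable_median(image, k=3):
    """Approximate a 2-D median by row-wise then column-wise 1-D medians."""
    rows = [median_1d(row, k) for row in image]
    cols = [median_1d(list(c), k) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# A single salt-noise pixel in a flat region is removed.
img = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
print(separable_median(img)[1][1])   # 10
```

The separable form performs two O(k) passes instead of one O(k²) pass per pixel, which is the computational saving the text attributes to SMED.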
We propose a new approach combining SMED and OFGM, shown in Figure 5. The OFGM method applies LK optical flow and GBM in parallel. On one hand, we first use the two adjacent images f(x, y, t − 1) and f(x, y, t) to compute the LK optical flow field; then a median filter and a Gaussian filter are used to eliminate salt-and-pepper noise and high-frequency noise respectively.
After that we use a threshold Tlk, the LK threshold, to segment the optical flow field and obtain the LK foreground mask flk(x, y, t). Our experimental results show that the useful range of Tlk is [0.05, 0.20]: choosing a smaller Tlk produces a larger foreground area including background noise, while choosing a larger threshold may lose some foreground area.
In order to detect the whole movement region we select the smallest value, 0.05, and then try to eliminate the background noise in the foreground mask flk(x, y, t). On the other hand, the GBM method is used to obtain another foreground mask, where a scale filter is used to separate foreground and background. In the scale filter we set another threshold Tg, the GMM threshold, which denotes an area in pixel blocks. For an obtained foreground image, if a pixel block is smaller than Tg it is labelled background; otherwise it is kept as foreground. Thus we get another foreground mask fg(x, y, t). In our tests, the value of Tg should be close to 1/400 of the image area; for example, when the image size is 320×240, the range of Tg is [160, 200]. As with the LK method, we select the smallest Tg to get the largest foreground mask fg(x, y, t).
Finally, these two masks are multiplied, and morphological processing [B.K.P.Horn, B.G.Schunck]6 is applied to connect nearby regions and reject small blocks in the foreground; an optimal foreground fore(x, y, t) can then be obtained, as shown in the results. It should be noted that although both flk(x, y, t) and fg(x, y, t) contain background noise, the background noise in flk(x, y, t) is produced by brightness changes and appears randomly on the contours of objects, whereas in fg(x, y, t) the background noise occurs on the edges of objects and, as time passes, appears at the same place. Since the two kinds of background noise appear at different places, most background noise can be eliminated by multiplying flk(x, y, t) and fg(x, y, t). The foreground image fore(x, y, t) obtained by OFGM is then used as part of the crowd density estimation.
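Combining the two masks amounts to an element-wise product of binary masks, which suppresses noise that appears in only one of them; the toy masks below are illustrative:

```python
def combine_masks(flk, fg):
    """Keep only pixels flagged foreground by BOTH the optical-flow mask
    and the GMM mask; noise present in only one mask is suppressed."""
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(flk, fg)]

# Each mask carries its own noise pixel; only the true object survives.
flk = [[0, 1, 1], [0, 1, 1], [1, 0, 0]]   # flow mask: noise at (2, 0)
fg  = [[0, 1, 1], [1, 1, 1], [0, 0, 0]]   # GMM mask:  noise at (1, 0)
fore = combine_masks(flk, fg)
print(fore)   # [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
```

This works precisely because, as noted above, the two noise patterns rarely coincide spatially, so their product retains only the genuinely moving region.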
When using the SMED method, the foreground containing large accumulated error due to background noise should be eliminated. The background noise present in the optical flow computation can also be removed by this approach by applying the Separable Morphological Edge Detector. With the proposed OFGM-SMED approach, all the background noise is removed and no foreground is lost, so the final object detection result is optimal, with the low error rate shown in Table II. The OFGM-SMED approach is thus effective.
VII. RESULTS AND DISCUSSIONS
Foreground detection is the basis of motion analysis tasks such as object tracking, image segmentation and motion estimation, as shown in Figure 7. The proposed approach is tested on video sequences in both the uncompressed and compressed domains for three different videos, each with a sample of at least 250 images, under the following categories. The results are discussed herein.
Video1 (Figure 6a) is a slow moving scene taken by a static camera at a frame rate of 30 fps; Video2 (Figure 6b) is a scene taken by a dynamic camera (both foreground and background are dynamic) at a frame rate of 25 fps; and Video3 (Figure 6c) is a fast moving traffic scene captured by a static camera at a frame rate of 15 fps. Superpixels were extracted from the original video sequences a), b), c) as shown in Figure 7. Superpixels applied on their own are not very effective, since they also pick up the shadows of the objects. This superpixel extraction is therefore combined with edge detection methods to obtain clear edges of the foreground images, which is why the OFGM algorithm with SMED is used along with the superpixels.
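One plausible reading of a separable morphological edge detector is the morphological gradient (dilation minus erosion) computed with separable 1-D passes. The sketch below is an illustration under that assumption, not the exact SMED of the cited literature:

```python
def _run1d(line, k, op):
    """Sliding 1-D min/max of window size k (clamped at the borders)."""
    r, n = k // 2, len(line)
    return [op(line[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def _sep(img, k, op):
    """Separable k x k morphological operation: 1-D pass along rows,
    then along columns."""
    rows = [_run1d(row, k, op) for row in img]
    cols = [_run1d(list(col), k, op) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def smed(img, k=3):
    """Morphological gradient with a separable k x k structuring element:
    edge strength = dilation(img) - erosion(img)."""
    dil = _sep(img, k, max)
    ero = _sep(img, k, min)
    return [[d - e for d, e in zip(dr, er)] for dr, er in zip(dil, ero)]
```

On a flat region the gradient is zero, while intensity steps produce a thin band of nonzero responses, which is why such a detector reacts even to slight movement between frames.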
The segmentation is carried out on the videos in the uncompressed (AVI) and compressed (MP4) domains. GMM shows good results for fast moving objects at a lower frame rate, but it produces noisy moving regions for all the input videos a, b, c in the compressed domain. The moving regions are segmented by a bounding box as shown in Figure 8.
Figure 9 shows the better performance of our approach. The computational time (elapsed time) of GMM is relatively high compared with its counterparts, optical flow (OF) and OF with SMED.
The elapsed time is further reduced in the compressed domain (MP4), where the number of frames to be processed is smaller, as shown in Figure 10. The performance evaluation of OFGM-SMED for MPEG-4 and AVI is shown in Figure 10.
The comparison of the elapsed time taken by GMM, OF and OFGM-SMED for MPEG-4 is shown in Figure 11. In the graph, the peaks represent the elapsed time; our OFGM-SMED method gives better results in the compressed domain, as its peaks are comparatively lower throughout the frames.
The results are verified on the basis of execution time and are summarised in Table I. As Table I shows, the overall performance of OFGM-SMED with superpixel extraction in the compressed domain is optimal compared with its counterparts: the computational cost is low, the elapsed time is comparatively reduced, and the method is less sensitive to background noise.
The average error rate is computed for all techniques, and it indicates that OFGM-SMED is highly effective and has a low error rate. The numerical results in Table II show that OFGM-SMED has a low error rate (1.74%). It can be seen that when the optical flow method is used alone, some background noise remains, because every motion area is detected. The algorithm uses an existing framework and applies simple but effective operations. As this approach works well in the compressed domain, the computation time is reduced compared with other video surveillance operations. The vehicle detection operation is a less sensitive edge-based method. The threshold selection is done dynamically using a statistical approach, to reduce the effects of lighting variations. The approach is based on selecting the point on the horizontal axis of the image histogram at which the sum of the entropies of the points above and below it is maximum. This point is chosen as the threshold value for the next time frame. When the SMED method is used, the foreground which had accumulated more error is eliminated.
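The entropy-based threshold selection described above can be sketched as a Kapur-style maximum-entropy search over the image histogram (a simplified illustration; `max_entropy_threshold` is a hypothetical helper name, and natural logarithms are used):

```python
import math

def max_entropy_threshold(hist):
    """Pick the gray level t that maximises H(below t) + H(above t)
    of the normalised histogram (Kapur-style maximum entropy)."""
    total = sum(hist)
    probs = [h / total for h in hist]
    best_t, best_h = 0, -1.0
    for t in range(1, len(probs)):
        p_low = sum(probs[:t])
        p_high = 1.0 - p_low
        if p_low <= 0 or p_high <= 0:
            continue  # one class is empty; entropy undefined
        h_low = -sum(p / p_low * math.log(p / p_low) for p in probs[:t] if p > 0)
        h_high = -sum(p / p_high * math.log(p / p_high) for p in probs[t:] if p > 0)
        if h_low + h_high > best_h:
            best_h, best_t = h_low + h_high, t
    return best_t
```

On a bimodal histogram the maximum-entropy point falls between the two modes, so the threshold tracks lighting changes frame by frame instead of being fixed by hand.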
As SMED includes median filtering, it removes all the background noise present in the optical flow. Crowd density estimation is important in surveillance. Texture analysis and moment analysis are two common ways to estimate crowd density. In texture analysis, a set of density features can be extracted from the Gray Level Co-occurrence Matrix (GLCM), which is computed from the foreground image. If M is the GLCM of the n×m foreground image, where i and j are the spatial positions in the foreground image, we can compute a new feature FM, the GLCM feature of the foreground image, defined as follows.
F_M = \sum_{i=1}^{n}\sum_{j=1}^{m} M(i,j)\,\ln M(i,j) \Big/ \Big[\sum_{i=1}^{n}\sum_{j=1}^{m} M(i,j)\Big]^{2} \qquad (7)
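Assuming this reconstruction of equation (7), a toy computation of the GLCM (horizontal (0, 1) offset only, a simplification) and of the FM feature might look like this (`glcm` and `feature_fm` are illustrative names):

```python
import math

def glcm(img, levels):
    """Gray-level co-occurrence counts for the horizontal (0, 1) offset."""
    M = [[0] * levels for _ in range(levels)]
    for row in img:
        for a, b in zip(row, row[1:]):
            M[a][b] += 1
    return M

def feature_fm(M):
    """F_M = sum M(i,j) ln M(i,j) / (sum M(i,j))^2, per the reconstructed
    equation (7); zero entries are skipped since ln 0 is undefined."""
    num = sum(v * math.log(v) for row in M for v in row if v > 0)
    den = sum(v for row in M for v in row) ** 2
    return num / den
```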
In the moment analysis, because the zeroth-order moment represents the total mass of the given image, we propose another feature F00, defined as follows,
F_{00} = \ln A_f - \ln m_{00} = \ln\!\left(\frac{A_f}{m_{00}}\right) \qquad (8)
where Af is the area of the foreground and m00 is the zeroth-order moment of the foreground image. Both FM and F00 can be used to estimate the crowd density; larger values of FM and smaller values of F00 indicate higher density. In our experiments, we used FM to estimate the crowd in various scenes and F00 to measure different crowds in a fixed scene. We evaluated our approach on seven different videos containing 1200 frames of images, and randomly picked 100 images to evaluate OF-SMED. First we manually marked the foreground region in each image, which serves as the real foreground, and then used the following equations to test the error rate of OFBM,
r = \frac{|A_{real} - A|}{A_{real}} \times 100\% \qquad (9)
where Areal is the area of the real foreground, A is the area of the foreground in the experimental result, r is the error rate, and the average error rate R over the 100 randomly picked images is given by,
R = \frac{1}{100}\sum_{i=1}^{100} r_i \qquad (10)
The test results in Table II show that our approaches, OFGM-SMED and OFGM-SMED on superpixels, have a lower error rate than their counterparts.
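Equations (9) and (10) amount to a simple relative-area comparison; a direct sketch (with hypothetical helper names, areas given in pixels):

```python
def error_rate(a_real, a_test):
    """r = |A_real - A| / A_real * 100%, per equation (9)."""
    return abs(a_real - a_test) / a_real * 100.0

def average_error_rate(pairs):
    """R = mean of r over (A_real, A) pairs, per equation (10)."""
    rates = [error_rate(a_real, a_test) for a_real, a_test in pairs]
    return sum(rates) / len(rates)
```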
This algorithm works well not only for a static camera but also for a moving camera, with both static and dynamic background video sequences. It is also possible to detect the foreground object based on a Region of Interest (ROI). The system is adaptable even to sudden illumination changes, since the combination of OFGM with SMED is less sensitive to ambient lighting. In Video2 (dynamic camera with dynamic background), the foreground object is detected effectively irrespective of sudden illumination changes, as shown in Figure 9.
VIII. CONCLUSION
Our approach is based purely on the concepts of extracting superpixels without collisions and detecting edges clearly. For this purpose we took the conventional models of extracting pixels and applied the superpixel concept to those pixels. When the superpixel concept is applied, the frame is segmented superpixel by superpixel, and the number of comparisons needed is reduced considerably. Thus the constraint of the conventional models, namely the time wasted on comparisons, is eliminated entirely. In addition, a new and simpler algorithm (OFGM-SMED) that can be applied to the extracted superpixels for the segmentation of compressed video has been established and explained. The proposed OFGM-SMED approach combines the foregrounds of both OF and GMM to eliminate background noise: in optical flow the background noise appears randomly, while in the GMM method it appears at fixed places, so combining the two eliminates all of the background noise. Furthermore, SMED is used, which detects even slight movement and adapts to changes in illumination. When the proposed OFGM-SMED approach is applied to superpixels, all background noise is removed and no foreground is lost, so the final object detection result is optimal, with a comparatively reduced elapsed time. OFGM-SMED on superpixels thus proves to be an optimal approach for real-time traffic and crowd monitoring applications, with an error rate of 1.26%, which is a satisfactory result.
REFERENCES
[1] L.Cheng, M.Gong, “Real Time Discriminative Background Subtraction” in IEEE Transactions on
Image Processing, Vol. 20, No. 5, pp.1401-1414, 2011.
[2] K.Suganya Devi, N.Malmurugan, “OFGM-SMED: An Efficient and Robust Foreground Object
Detection in Compressed Video Sequences”, Engineering Applications of Artificial Intelligence,
Vol. 28, pp. 210-217, 2014.
[3] O.Barnich, M.Van Droogenbroeck, “ViBe: A Universal background subtraction algorithm for
video sequences”, IEEE transactions on Image Processing, Vol. 20, No. 6, pp.1709-1724, 2011.
[4] Cong Zhao, Xiaogang Wang, Wai-Kuen Cham, “Background Subtraction via Robust Dictionary Learning,” EURASIP Journal on Image and Video Processing, Article ID: 972961, 12 pages, 2011. doi:10.1155/2011/972961.
[5] K.Gupta, V.Anjali Kulkarni, “Implementation of an Automated Single Camera Object Tracking
System Using Frame Differencing and Dynamic Template Matching,” In Proceedings of the 2007
International Conference on Systems, Computing Sciences and Software Engineering (SCSS(1)),
pp. 245-250, 2007.
[6] B.K.P.Horn, B.G.Schunck, “Determining optical flow”, Artificial Intelligence, Vol.17, pp.185-203,
1981.
[7] C.Stauffer, W.Grimson, “Adaptive background mixture models for real-time tracking,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, pp.246-252, 1999.
[8] C.Gonzalez, Woods R Eddins, “Digital Image Processing,” 2nd ed Prentice Hall, Upper Saddle
River, NJ, pp. 11-17, 2002.
[9] N.Senthil kumaran, R.Rajesh, “Edge Detection Techniques for Image Segmentation-A Survey,”
Proceedings of the International Conference on Managing Next Generation Software
Applications (MNGSA-08), pp.749-760, 2008.
[10]M.Fathy, M.Y.Siyal, “An image detection technique based on morphological edge detection and
background differencing for realtime traffic analysis,” Pattern Recognition Lett., Vol. 16, pp.
1321–1330, 1995.
[11]M.Y.Siyal, A.Solangi, “A Novel morphological edge detector based approach for monitoring
vehicles at traffic junctions,” Innovations in Information Technology, pp. 1-5, 2006.
[12]A.Baumann, M.Boltz, Julia Ebling, Matthias Koenig, HartmutS Loos, Marcel Merkel, Wolfgang
Niem, JanKarl Warzelhan, Jie Yu, “A Review and Comparison of Measures for Automatic Video
Surveillance Systems,” Hindawi Publishing Corporation EURASIP Journal on Image and Video
Processing, Article ID 824726, 30 pages doi:10.1155/2008/824726, 2008.
[13]Nan Lu, Jihong Wang, Wu Q H, Li Yang, “An Improved Motion Detection Method for Real-Time Surveillance,” IAENG International Journal of Computer Science, Vol. 35, No. 1, pp.119-135, 2008.
[14]A.Rourke and M.G.H.Bell, “Traffic analysis using low cost image processing,” in Proc. Seminar on
Transportation Planning Methods, PTRC, Bath, U.K., pp.217-28. 1988.
[15]Wei Li, Xiaojuan Wu, Matsumoto K, Hua-An Zhao, “Foreground Detection Based on Optical Flow
and Background Subtract,” International Conference on Communications, Circuits and Systems
(ICCCAS), pp.359 – 362, 2010.
[16]H.Rahmalan, M.S.Nixon, J.N.Carter, “On Crowd Density Estimation for Surveillance,” The
Institution of Engineering and Technology Conference on Crime and Security , pp.540 – 545,
2006.
[17]L.Daniel, Schmoldt, Pei Li, Lynn Abbott, “Machine vision using artificial neural networks with
local 3D neighborhoods,” Computers and Electronics in Agriculture, Vol. 16, pp. 255-271, 1997.
[18]Browne Alan, T.M.McGinnity, G.Prasad, J.Condell, “FPGA Based High Accuracy Optical Flow Algorithm,” IET Signals and Systems Conference (ISSC 2010), UCC, Cork, pp.112-117, 2010.
[19]J.Lee, R.M.Haralick, L.G.Shapiro, “Morphologic edge detection,” IEEE J. Robot. Automat., Vol. 3, No. 2, pp. 142-156, 1987.
[20]N.Hashimoto, “Development of an image processing traffic flow measurement system,”
Sumitomo Electronic Tech. Rev., pp. 133–138, 1998.
[21]K.Chitra Lakshmi, K.Nagarajan, “Geometric Mean Cordial Labeling of Subdivision of Standard Graphs,” International Journal of Pure and Applied Mathematics, pp. 103-112, 2017.
[22]K.Suganya Devi, N.Malmurugan, R.Sivakumar, “OF-SMED: An Optimal Foreground Detection Method in Surveillance System for Traffic Monitoring,” in Proceedings of the IEEE International Conference on Cyber-Security and Digital Forensics, pp. 12-17, June 24-26, 2012.
Figures:
Figure 1. Superpixel extraction
Figure 2. Cars on highway-Optical flow
Figure 3. The flow diagram of Background subtraction
Figure 4. Edge Detection using SMED
Figure 5. Flowchart of OFGM-SMED
Figure 6 a), b), c) Original Video sequence.
Figure 7. Superpixel extracted video for the original video sequence a, b, c.
Figure 8. The moving objects segmented by OFGM-SMED for the original video sequences a, b,c
Figure 9. Row 1: segmentation by GMM for the compressed video sequences a, b, c. Row 2: segmentation by OF for the compressed video sequences a, b, c. Row 3: segmentation by OFGM-SMED for the compressed video sequences a, b, c.
Figure 10. Performance Evaluation of OFGM-SMED for MPEG-4 and AVI
Figure 11. Comparison of GMM, OF, OFGM-SMED for MPEG-4
Tables:
TABLE I. COMPARISON OF ELAPSED TIME (IN SECONDS) FOR THE AVERAGE FRAME SIZE 120 FOR THE ORIGINAL COMPRESSED VIDEO SEQUENCE

Original video sequence (Figure 6)           Elapsed time (s),          Elapsed time (s),
                                             OFGM-SMED without          OFGM-SMED with
                                             superpixel extraction      superpixel extraction
a) Slow moving object, static background     0.161259                   0.115432
b) Moving object, dynamic background         0.365566                   0.282512
c) Fast moving object, static background     0.099303                   0.069267

TABLE II. COMPARISON OF AVERAGE ERROR RATE

Approach      SMED       OFBM       OFGM-SMED      OFGM-SMED on superpixels
Error rate    4.85%      2.01%      1.74%          1.26%