8/13/2019 Generic Visual Perception
Generic visual perception processor
INTRODUCTION
While computing technology is growing by leaps and bounds, the human brain continues
to be the world's fastest computer. Combine brain power with seeing power, and you
have the fastest, cheapest, most extraordinary processor ever: the human eye. Little
wonder that research labs the world over are striving to produce a near-perfect
electronic eye.

The 'generic visual perception processor (GVPP)' has been developed after 10 long
years of scientific effort. The GVPP can automatically detect objects and track their
movement in real time. The GVPP, which crunches 20 billion instructions per second
(BIPS), models the human perceptual process at the hardware level by mimicking the
separate temporal and spatial functions of the eye-to-brain system. The processor sees
its environment as a stream of histograms regarding the location and velocity of
objects.

The GVPP has been demonstrated as capable of learning-in-place to solve a variety of
pattern recognition problems. It boasts automatic normalization for varying object
size, orientation and lighting conditions, and can function in daylight or darkness.

This electronic eye on a chip can now handle most tasks that a normal human eye can.
That includes driving safely, selecting ripe fruits, reading and recognizing things.
Sadly, though modeled on the visual perception capabilities of the human brain, the
chip is not really a medical marvel, poised to cure the blind.
College of Applied Sciences, Mavelikkara
BACKGROUND OF THE INVENTION
The invention relates generally to methods and devices for automatic visual
perception, and more particularly to methods and devices for processing image signals
using two or more histogram calculation units to localize one or more objects in an
image signal using one or more characteristics of an object, such as the shape, size
and orientation of the object. Such devices can be termed an electronic spatio-temporal
neuron, and are particularly useful for image processing, but may also be used for
other signals, such as audio signals. The techniques of the present invention are also
particularly useful for tracking one or more objects in real time.

It is desirable to provide devices including combined data processing units of a
similar nature, each addressing a particular parameter extracted from the video signal.
In particular, it is desirable to provide devices including multiple units for
calculating histograms, or electronic spatio-temporal neurons (STN), each processing a
DATA(A) by a function in order to generate individually an output value.

The present invention also provides a method for perception of an object using
characteristics, such as its shape, its size or its orientation, using a device
composed of a set of histogram calculation units.
Using the techniques of the present invention, a general outline of a moving object
is determined with respect to a relatively stable background; then, inside this
outline, elements that are characterized by their tone, color, relative position etc.
are determined.
POTENTIAL SIGHTED
The GVPP was invented in 1992 by BEV founder Patrick Pirim. It would be relatively
simple for a CMOS chip to implement in hardware the separate contributions of temporal
and spatial processing in the brain. The brain-eye system uses layers of
parallel-processing neurons that pass the signal through a series of preprocessing
steps, resulting in real-time tracking of multiple moving objects within a visual
scene.

Pirim created a chip architecture that mimicked the work of the neurons, with the
help of multiplexing and memory. The result is an inexpensive device that can
autonomously perceive and then track up to eight user-specified objects in a video
stream based on hue, luminance, saturation, spatial orientation, speed and direction of
motion.

The GVPP tracks an object, defined as a certain set of hue, luminance and saturation
values in a specific shape, from frame to frame in a video stream by anticipating where
its leading and trailing edges make differences with the background. That means it can
track an object through varying light sources or changes in size, as when an object
gets closer to the viewer or moves farther away.
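
The idea of locating an object from its leading and trailing edges can be illustrated
with projection histograms. The sketch below is only an illustration under stated
assumptions, not the GVPP's actual circuitry: it assumes a binary mask in which 1 marks
pixels that already match the object's hue, luminance and saturation criteria, and the
function names are hypothetical.

```python
def projection_histograms(mask):
    """Count matching pixels per column (x) and per row (y) of a binary mask."""
    rows, cols = len(mask), len(mask[0])
    x_hist = [sum(mask[y][x] for y in range(rows)) for x in range(cols)]
    y_hist = [sum(row) for row in mask]
    return x_hist, y_hist


def first_last_nonzero(hist):
    """Return the indices of the first and last non-empty bins, or None."""
    hits = [i for i, count in enumerate(hist) if count > 0]
    return (hits[0], hits[-1]) if hits else None


def bounding_box(mask):
    """Locate the object as (x_min, y_min, x_max, y_max) from the two histograms."""
    x_hist, y_hist = projection_histograms(mask)
    xs, ys = first_last_nonzero(x_hist), first_last_nonzero(y_hist)
    if xs is None or ys is None:
        return None  # no pixel matched the criteria
    return (xs[0], ys[0], xs[1], ys[1])


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(bounding_box(mask))  # (1, 1, 3, 2)
```

Recomputing the box on every frame and comparing it with the previous one gives a
simple frame-to-frame track of position and size.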
The GVPP's major performance strength over current-day vision systems is its
adaptation to varying light conditions. Today's vision systems dictate uniform,
shadowless illumination, and even next-generation prototype systems, designed to work
under "normal" lighting conditions, can be used only from dawn to dusk. The GVPP, on
the other hand, adapts to real-time changes in lighting without recalibration, day or
night.
For many decades the field of computing has been trapped by the limitations of
traditional processors. Many futuristic technologies have been bound by these
limitations, which stem from the basic architecture of these processors. Traditional
processors work by slicing each complex program into simple tasks that a processor can
execute. This requires that an algorithm exist for solving the particular problem. But
there are many situations where no algorithm exists, or where a human is unable to
understand the algorithm. Even in these extreme cases the GVPP performs well. It can
solve a problem with its neural learning function. Neural networks are extremely fault
tolerant. By their design, even if a group of neurons gets damaged, the neural network
only suffers a smooth degradation of performance. It won't abruptly fail to work. This
is a crucial difference from traditional processors, which fail to work even if a few
components are damaged. The GVPP recognizes, stores, matches and processes patterns.
Even if a pattern in the input is not recognizable to a human programmer, the neural
network will dig it out from the input. Thus the GVPP becomes an efficient tool for
applications like pattern matching and recognition.
HOW IT WORKS
Basically, the chip is made of a neural network modeled on the structure of the human
brain. The basic element here is a neuron. A neuron has a large number of input lines
and one output line. Each neuron is capable of implementing a simple function. It takes
the weighted sum of its inputs and produces an output that is fed into the next layer.
The weights assigned to each input are a variable quantity.
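
The weighted-sum behavior of a single neuron can be sketched in a few lines. This is a
generic textbook neuron, not the GVPP's hardware design; the step activation and the
threshold parameter are assumptions made for the example.

```python
def neuron_output(inputs, weights, threshold=0.0):
    """Fire (return 1) when the weighted sum of the inputs exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def layer_output(inputs, weight_rows, threshold=0.0):
    """Feed the same inputs to every neuron of a layer, one weight row per neuron."""
    return [neuron_output(inputs, row, threshold) for row in weight_rows]


# One neuron: 0.5*1 + (-1.0)*0 + 0.25*1 = 0.75, which exceeds the 0.5 threshold.
print(neuron_output([1, 0, 1], [0.5, -1.0, 0.25], threshold=0.5))  # 1
# A two-neuron layer; its outputs would become the inputs of the next layer.
print(layer_output([1, 1], [[1, 1], [1, -1]]))  # [1, 0]
```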
A large number of such neurons interconnected form a neural network. Every input that
is given to the neural network gets transmitted over the entire network via direct
connections called synaptic connections and feedback paths. Thus the signal ripples in
the neural network, every time changing the weighted values associated with each input
of every neuron. These changes in the ripples will naturally direct the weights to modify
into those values that will become stable, that is, values that do not change. At this
point the information about the signal is stored as the weighted values of inputs in
the neural network.
A neural network geometrizes computation. When we draw the state diagram of a neural
network, the network activity burrows a trajectory in this state space. The trajectory
begins with a computation problem. The problem specifies initial conditions, which
define the beginning of the trajectory in the state space.

In pattern learning, the pattern to be learned defines the initial conditions, whereas
in pattern recognition, the pattern to be recognized defines the initial conditions.
Most of the trajectory consists of transient behavior or computations. The weights
associated with inputs gradually change to learn new pattern information. The
trajectory ends when the system reaches equilibrium. This is the final state of the
neural network. If the pattern was meant to be matched, the final neuronal state
represents the pattern that is the closest match to the input pattern.
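
A small Hopfield-style network is one common way to picture a trajectory that starts
at an input pattern and settles at an equilibrium holding the closest stored pattern.
The sketch below is an illustrative stand-in chosen for this report; the source does
not say that the GVPP uses this particular model, and the function names are
hypothetical.

```python
def sign(value):
    """Map a sum to a bipolar state; zero is treated as +1."""
    return 1 if value >= 0 else -1


def train_hebbian(patterns):
    """Store bipolar (+1/-1) patterns in a symmetric weight matrix."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, start, max_sweeps=20):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(start)  # the input pattern is the initial condition
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            new = sign(sum(w * s for w, s in zip(weights[i], state)))
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:  # equilibrium reached: end of the trajectory
            break
    return state


stored_a = [1, 1, 1, 1, -1, -1, -1, -1]
stored_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([stored_a, stored_b])

noisy_a = [-1, 1, 1, 1, -1, -1, -1, -1]  # stored_a with its first element flipped
print(recall(weights, noisy_a) == stored_a)  # True
```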
Input to the electronic eye can come from video, infrared or radar signals. A video
signal is composed of a succession of frames, wherein each frame includes a succession
of pixels whose assembly forms a space, for example an image for a two-dimensional
space. Real-time outputs perceive, recognize and analyze both static images and
time-varying patterns for specific objects, their heading, speed, shading and color
differences. By mimicking the eye and the visual regions of the brain, the GVPP puts
together the salient features necessary for recognition. So instead of capturing frames
of pixels, the chip identifies objects of interest, determines each object's speed and
direction, then follows them by tracking their color through the scene.
The chip emulates the eye, which has 5 million cones sensitive to color and rods that
are 15 times more sensitive than the cones, through two processing steps: tonic and
phasic.
The chip mimics the human eye's two processing steps, tonic and phasic. Tonic
processing auto-scales according to ambient light conditions, enabling it to adapt to a
range of luminosity. Phasic processing determines movement by using local variables in
the feedback loops. As light's edges pass over the rods and cones of the eye, these
local feedback loops detect contrast changes caused by objects moving through the
scene.
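
The two steps can be approximated in software. The sketch below is a loose analogy,
not the chip's implementation: tonic processing is modeled as rescaling each frame to
its own minimum and maximum, and phasic processing as a per-pixel contrast change
between consecutive frames. The function names, the 0 to 255 output range and the
threshold are assumptions.

```python
def tonic_autoscale(frame, out_max=255):
    """Rescale a frame so its darkest pixel maps to 0 and its brightest to out_max."""
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return [0 for _ in frame]  # flat frame: no contrast to preserve
    return [round((p - lo) * out_max / (hi - lo)) for p in frame]


def phasic_changes(previous, current, threshold=32):
    """Flag pixels whose value changed by more than the threshold between frames."""
    return [abs(c - p) > threshold for p, c in zip(previous, current)]


dim_scene = [10, 12, 20, 10]         # the same scene under weak light...
bright_scene = [100, 120, 200, 100]  # ...and under light ten times stronger
print(tonic_autoscale(dim_scene))     # [0, 51, 255, 0]
print(tonic_autoscale(bright_scene))  # [0, 51, 255, 0]

# After auto-scaling, only a real change in the scene shows up as movement.
print(phasic_changes([0, 51, 255, 0], [0, 255, 51, 0]))  # [False, True, True, False]
```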
For detecting smooth contours, rather than sharp contrast changes, the eye adds ocular
movement. The eye typically sweeps a scene about two to three times a second, as well
as making vibratory movements at about 100 Hz. The faster jitter accounts for the eye's
sensitivity to the smallest detectable feature, which is an edge moving between
adjacent rods or cones.

After all this processing, the visual signal is sent to the brain for higher-level
observation and recognition tasks. Because only detected movements and color, along
with the shape and contour of objects, are sent to the brain, rather than raw pixels,
the average compression ratio of the information is about 14
[...] associated with the first (leading-edge) and last (trailing-edge) pixel
belonging to the object. Each of these silicon neurons is built with RAM, a few
registers, an adder and a comparator. Supplied as a 100-pin module, the chip
accommodates analogue-input line levels for video input, with an input amplifier with
programmable gain auto-scaling the signal.

The modules measure 40 square mm, have 100 pins and can handle 20-MHz video signals.
The chip is priced at $960. A card with a socketed GVPP and 64 bytes of Flash RAM comes
at $1,500.
DIVIDE AND CONQUER
Since processing in each module on the GVPP runs in parallel out of its own memory
space, multiple GVPP chips can be cascaded to expand the number of objects that can be
recognized and tracked. When set in master-slave mode, any number of GVPP chips can
divide and conquer, for instance, complex stereoscopic vision applications.
SOFTWARE ASPECTS
On the software side, a host operating system running on an external PC communicates
with the GVPP's evaluation board via an OS kernel within the on-chip microprocessor.
BEV dubs the neural-learning capability of its development environment "programming by
seeing and doing" because of its ease of use. The engineer needs no knowledge of the
internal workings of the GVPP, the company said, only application-specific domain
knowledge.

Programming the GVPP is as simple as setting a few registers, and then testing the
results to gauge the application's success, said Steve Rowe, BEV's director of research
and development. Once debugged, these tiny application programs are loaded directly
into the GVPP's internal ROM.

Application programs themselves can use C++, which makes calls to a library of
assembly language algorithms for visual perception and tracking of objects. The
system's modular approach permits the developer to create a hierarchy of application
building bloc(s that simplify problems with inheritable software characteristics.
Based on the neural network learning functions, the chip allows application software
to be developed quickly through a combination of simple operational commands and
immediate feedback. Simple applications such as detecting and tracking a single moving
object can be programmed in less than a day. Even complex applications such as
detecting a driver falling asleep can be programmed in a month.

The ability of a neural network to learn from data is perhaps its greatest strength.
This learning can be described as configuring the network such that the application of
a set of inputs produces the desired set of outputs. Various methods exist to set the
strengths of the connections. One way is to set the synaptic connections explicitly,
using a priori knowledge. Another way is to "train" the neural network by feeding it
teaching patterns and letting it change its weights according to some learning rule.
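
As one concrete example of "some learning rule", the sketch below uses the classic
perceptron rule, which nudges each weight in proportion to the output error. The
source does not name the rule the GVPP uses, so this choice, the function names and
the logical-AND teaching set are all assumptions made for illustration.

```python
def predict(weights, bias, inputs):
    """Step-activation neuron: 1 when the weighted sum plus bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, rate=1.0):
    """Adjust weights from (inputs, target) teaching patterns with the perceptron rule."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Teaching patterns for a logical AND of two inputs.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])  # [0, 0, 0, 1]
```

Setting the weights and bias by hand instead of calling the training function
corresponds to the first method in the text, the explicit a priori configuration.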
RECOGNITION TASK
In applications, each pixel may be described with respect to any of the six domains of
information available to it: hue, luminance, saturation, speed, direction of motion and
spatial orientation. The GVPP further subcategorizes pixels by ranges, for instance
luminance within 10 percent and 65 percent, hue of blue, saturation between 20 and 25
percent, and moving upward in the scene.
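
Range-based subcategorization amounts to testing each pixel's features against a set
of intervals. The sketch below is a hedged illustration only: the feature record, the
degree-based encodings of hue and direction, and the numeric interval chosen for "hue
of blue" and "moving upward" are assumptions, while the luminance and saturation
ranges follow the example in the text.

```python
from dataclasses import dataclass


@dataclass
class PixelFeatures:
    hue: float         # degrees on a 0-360 color wheel (assumed encoding)
    luminance: float   # percent
    saturation: float  # percent
    direction: float   # degrees, with 90 meaning straight up (assumed encoding)


# Hypothetical criteria: each feature name maps to an inclusive (low, high) range.
BLUE_MOVING_UP = {
    "luminance": (10, 65),
    "hue": (200, 260),       # assumed interval for "blue"
    "saturation": (20, 25),
    "direction": (45, 135),  # assumed interval for "upward"
}


def matches(pixel, criteria):
    """True when every listed feature of the pixel lies inside its range."""
    return all(low <= getattr(pixel, name) <= high
               for name, (low, high) in criteria.items())


inside = PixelFeatures(hue=230, luminance=40, saturation=22, direction=90)
outside = PixelFeatures(hue=230, luminance=40, saturation=60, direction=90)
print(matches(inside, BLUE_MOVING_UP), matches(outside, BLUE_MOVING_UP))  # True False
```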
A set of second-level pattern recognition commands permits the GVPP to search for
different objects in different parts of the scene, for instance, to look for a closed
eyelid only within the rectangle bordered by the corners of the eye. Since some
applications may also require multiple levels of recognition, the GVPP has software
hooks to pass along the recognition task from level to level.

For instance, to detect when a driver is falling asleep (a capability that could find
use in California, which is about to mandate that cars sound an alarm when drowsy
drivers begin to nod off), the GVPP is first programmed to detect the driver's
head, for which it creates histograms of head movement. The microprocessor reads
these histograms to identify the area for the eye.
Then the recognition task passes to the next level, which searches only within the
eye-area rectangles. High-speed movement there, normally indicative of blinking, is
discounted, but when blinks become slower than a predetermined level, they are
interpreted as the driver nodding off, and trigger an alarm.
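
The final decision step can be sketched as a comparison of blink duration against a
preset limit. This is a simplified illustration: it assumes an earlier stage already
yields one eyes-closed flag per frame, and the function names, the 30 frames/s rate and
the 0.5 s limit are hypothetical values chosen for the example.

```python
def closed_runs(eye_closed):
    """Return the length, in frames, of each consecutive run of closed-eye frames."""
    runs, current = [], 0
    for closed in eye_closed:
        if closed:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # the sequence ended while the eye was still closed
        runs.append(current)
    return runs


def is_drowsy(eye_closed, frames_per_second=30, max_blink_seconds=0.5):
    """Raise the alarm when any blink lasts longer than the predetermined limit."""
    limit_frames = max_blink_seconds * frames_per_second
    return any(run > limit_frames for run in closed_runs(eye_closed))


normal = [0, 1, 1, 0, 0, 1, 1, 1, 0]  # two quick blinks of 2 and 3 frames
nodding = [0] * 5 + [1] * 30 + [0]    # eye closed for a full second at 30 frames/s
print(closed_runs(normal))                   # [2, 3]
print(is_drowsy(normal), is_drowsy(nodding))  # False True
```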
GVPP ARCHITECTURE
The chip houses 23 neural blocks, both temporal and spatial, each consisting of 20
hardware input and output synaptic connections. The GVPP multiplexes this neural
hardware with off-chip scratchpad memory to simulate as many as 100,000 synaptic
connections per neuron. Each of these synapses can be changed through the on-chip
microprocessor, for a combined processing total of over 6.2 billion synaptic
connections per second.
In executing up to 20 BIPS to analyze successive frames of a video stream, the
temporal neurons identify pixels that have changed over time and generate a 3-bit value
indicative of the magnitude of that change. The spatial-processing system analyzes the
resulting difference histogram to calculate the speed and direction of the motion.
A histogram is a bar chart of the count of pixels of every tone of gray that occurs in
the image. It helps us analyze and, more importantly, correct the contrast of the
image. Technically, the histogram maps luminance, which is defined from the way the
human eye perceives the brightness of different colors. For example, our eyes are most
sensitive to green; we see green as being brighter than we see blue. Luminance weighs
the effect of this to indicate the actual perceived brightness of the image pixels due
to the color components.
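
A luminance histogram can be computed directly from RGB pixels. The sketch below uses
the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114) as an assumption; the
source does not state which weighting the GVPP applies, and the function names are
hypothetical.

```python
def luminance(red, green, blue):
    """Weighted brightness of an RGB pixel; green contributes most, blue least."""
    return 0.299 * red + 0.587 * green + 0.114 * blue


def luminance_histogram(pixels, bins=256):
    """Count pixels per luminance level for 8-bit RGB triples."""
    histogram = [0] * bins
    for red, green, blue in pixels:
        level = min(bins - 1, int(round(luminance(red, green, blue))))
        histogram[level] += 1
    return histogram


pixels = [(0, 255, 0), (0, 0, 255), (0, 0, 255)]  # one pure green, two pure blue
histogram = luminance_histogram(pixels)
print(luminance(0, 255, 0) > luminance(0, 0, 255))  # True: green reads as brighter
print(histogram[150], histogram[29])                # 1 2
```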
Fig 9.1 Representation of GVPP
MULTIPLE PERCEPTIONS
The vision perception processor chip processes motion images in real time, and is
able to perceive and track objects based on combinations of hue, luminance, speed,
direction of motion and spatial orientation. The chip has three functions: temporal
processing, spatial processing and histogram processing. In applications, the vision
processor, called the generic visual perception processor (GVPP), is used with a
microprocessor, flash memory and peripherals.
In the case of detecting a driver falling asleep, the processor core detects movement
of the eyelids. First, the driver is identified, and then the microprocessor directs
the vision processor to search within the corner points of a rectangular area in which
the nose of the driver would be expected to be located, based on a model, and to select
pixels having characteristics of the shadows of nostrils. The results are calculated to
the end of
the image frame being analyzed, and the microprocessor analyzes the resulting
histograms to determine characteristics indicative of nostrils, such as the spacing and
shape of the nostrils.
Then, the microprocessor directs the vision processor to isolate a rectangular area
in which the eyes of the driver are expected to be located, based on the model, and to
select pixels having characteristics of high-speed movement normally indicative of
blinking. The microprocessor analyzes over time the histograms taken of the eye area to
determine the duration of each blink and the interval between blinks. From the blink
duration and interval information, the microprocessor is able to determine when the
driver is falling asleep and triggers an alarm accordingly.
The highly parallel internal design of the vision processor allows it to perform over
20 billion instructions per second (BIPS), and the three internal functions play the
following roles. Temporal processing involves the processing of successive frames of an
image to smooth the pixels of the image over time to prevent interference affecting
object detection. Spatial processing involves the processing of pixels within a
localized area of the images to determine the speed and direction of movement of each
pixel. Receiving the information about speed and direction of each pixel from the
spatial and temporal processors, the histogram processor perceives objects in the
images that have undergone significant change over time and the magnitude of the
changes.
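
Smoothing pixels over time can be pictured as a running average across successive
frames. The sketch below uses simple exponential smoothing as an assumed stand-in; the
source does not specify the filter the temporal processor applies, and the function
name and the smoothing factor are hypothetical.

```python
def smooth_frame(previous_smoothed, new_frame, alpha=0.25):
    """Blend each new pixel into its running average; smaller alpha smooths more."""
    return [(1 - alpha) * s + alpha * p
            for s, p in zip(previous_smoothed, new_frame)]


# A single-frame spike (interference) on a steady pixel is strongly attenuated.
smoothed = [10.0]
for frame in ([10.0], [250.0], [10.0]):
    smoothed = smooth_frame(smoothed, frame)
    print(smoothed)
# Prints [10.0], then [70.0], then [55.0]: the value never approaches the 250 spike.
```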
The chip is able to detect and track eight objects per frame, meaning 240 objects per
second in a 30 frames/s system such as TV or high-definition TV (HDTV). The vision
processor includes a total of 23 spatio-temporal neural blocks, each having
approximately 20 different input and output connections. Each connection can be
controlled through the on-chip microprocessor. The vision processor provides a
mechanism to reconfigure the spatio-temporal neural blocks to perform a wide variety of
vision processing, such as moving object detection.
Fig 10.1 Representation of histogram calculation unit
ADVANTAGES AND DISADVANTAGES
ADVANTAGES
1. Imitating the human eye's neural networks and the brain, the GVPP can handle some
20 billion instructions per second, compared to the mere millions handled by
Pentium-class processors of today.
2. The GVPP has been demonstrated as capable of learning-in-place to solve a variety
of pattern recognition problems.
3. It has automatic normalization for varying object size, orientation, and lighting
conditions, and can function in daylight or darkness.
4. It is an inexpensive device that can autonomously perceive and then track up to
eight user-specified objects in a video stream.
5. Since processing in each module on the GVPP runs in parallel out of its own memory
space, multiple GVPP chips can be cascaded to expand the number of objects that can be
recognized and tracked.
6. The engineer needs no knowledge of the internal workings of the GVPP, the company
said, only application-specific domain knowledge.
7. Simple applications can be quickly prototyped in a few days, with medium-size
applications taking a few weeks and even big applications only a couple of months.
8. The chip could be useful across a wide variety of industries where visual tracking
is important.
APPLICATIONS
BEV lists possible applications for the GVPP in process monitoring, quality control
and assembly; automotive systems such as intelligent air bags that monitor passenger
size and traffic congestion monitors; pedestrian detection, license plate recognition,
electronic toll collection, automatic parking management, automatic inspection; and
medical uses including disease identification. The chip could also prove useful in
unmanned air vehicles, miniature smart weapons, ground reconnaissance and other
military applications, as well as in security access using facial, iris, fingerprint,
or height and gait identification.
1. Automotive industry
Here, GVPP-type devices could keep watch and trigger an alarm if the driver were to
nod off to sleep at the wheel, or act as a monitor for the car's progress to make sure
it does not get too close to the curb.

Also in transportation, the GVPP could be used in developing systems for collision
avoidance, automatic cruise control, smart air bag systems, license plate recognition,
measurement of traffic flow, electronic toll collection, automatic cargo tracking,
parking management and the inspection of cracks in rails and tunnels.

Another automotive application is warning erratic drivers. The GVPP does so by
monitoring the left and right lanes of the road, signaling the driver when the vehicle
deviates from the prescribed lane.
2. Robotics
In manufacturing, the GVPP has applications in robotics, particularly for dirty and
dangerous jobs such as feeding hot parts to forging presses, cleaning up hazardous
waste, and spraying toxic coatings on aircraft parts.
3. Agriculture and fisheries
In agriculture and fisheries engineering, the GVPP could help with tasks that
traditionally have been labor intensive, such as disease and parasite identification,
harvest control, ripeness detection and yield identification.
4. Military applications
The chip can be used in other important pattern-recognition applications, such as
military target acquisition and fire control. Military applications include unmanned
air vehicles, automatic target detection, trajectory correction, ground reconnaissance
and surveillance. Vision systems are useful in home security as well, and GVPP-based
systems could be inexpensively developed to detect intruders and fires.

In addition to the above applications, the GVPP should be able to work in medical
scanners, blood analyzers, cardiac monitoring, bank checks, bar code reading, seal and
signature verification, trademark database indexing, construction of virtual reality
environment models, human motion analysis, expression understanding, cloud
identification, and many other fields.
PRELIMINARY INFORMATION OF GVPP-7B
Generic Visual Perception Processor
Array Format (max): 800H x 600V
Frame rate: 0-100 VGA frames per second, progressive-scan
Interface Mode: Master/Slave
Data Rate (max): 40 Megapixel per second
Dynamic Range: 10 bits, 3 channels
Parameters: Luminance, Hue, Saturation, Motion (orientation, velocity), Oriented edges, lines, curves, corners
Computation: 64 STN blocks
Multi-scale possibilities (optional)
Multi-chip connection capabilities
Semi-graphic interface visualization
GUI with mouse
PAL, NTSC, VGA visualization
Internal OS
C language programming
PCI, PS2, I2C, RS232, PWM
0.3 Watt, 3.3 Volts for PAL/NTSC format
-40, +85 °C
Complete application in one SiP 1 x 1: CMOS Imager, GVPP, 10 Mbits VRAM, 1 Mbits Serial Flash Memory, 27 MHz Crystal
FUTURE SCOPES
For the future, scientists are working on a "visual mouse" for hand-gesture interfaces
to computers that takes advantage of that high compression ratio. Future studies also
involve using this processor as the eye of robots, which opens up tremendous
applications.
CONCLUSIONS
Imitating the human eye's neural networks and the brain, the generic visual perception
processor can handle about 20 billion instructions per second, and can manage most
tasks performed by the eye. Modeled on the visual perception capabilities of the human
brain, the GVPP is a single chip that can detect objects in a motion video signal and
then locate and track them in real time far more dependably than competing systems,
which cost far more, according to company scientists. The chip is generic, and the
company says it has already identified more than 100 applications in ten or more
industries. The chip could be useful across a wide swathe of industries where visual
tracking is important.