generic visual perception

Upload: mohammed-hisham

Post on 04-Jun-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/13/2019 Generic Visual Perception

    1/19

    Generic visual perception processor

    INTRODUCTION

    While computing technology is growing in leaps and bounds, the human

    brain continues to be the world's fastest computer. Combine brain-power with seeing

    power, and you have the fastest, cheapest, most extra ordinary processor ever-the

    human eye. Little wonder, research labs the world over are striving to produce a near-

    perfect electronic eye.

    The 'generic visual perception processor !"##$' has been developed after %&

    long years of scientific effort. !eneric "isual #erception #rocessor !"##$ canautomatically detect ob ects and trac( their movement in real-time . The !"##, which

    crunches )& billion instructions per second *+# $, models the human perceptual

    process at the hardware level by mimic(ing the separate temporal and spatial functions

    of the eye-to-brain system. The processor sees its environment as a stream of

    histograms regarding the location and velocity of ob ects.

    !"## has been demonstrated as capable of learning-in-place to solve a variety of

    pattern recognition problems. +t boasts automatic normali ation for varying ob ect si e,

    orientation and lighting conditions, and can function in daylight or dar(ness.

    This electronic eye on a chip can now handle most tas(s that a normal human eye

    can. That includes driving safely, selecting ripe fruits, reading and recogni ing things.

    adly, though modeled on the visual perception capabilities of the human brain, the

    chip is not really a medical marvel, poised to cure the blind.

    1College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    2/19

    Generic visual perception processor

    BAC GROUND O! T"# IN$#NTION

    The invention relates generally to methods and devices for automatic visualperception, and more particularly to methods and devices for processing image signals

    using two or more histogram calculation units to locali e one or more ob ects in an

    image signal using one or more characteristics an ob ect such as the shape, si e and

    orientation of the ob ect. uch devices can be termed an electronic spatio-temporal

    neuron, and are particularly useful for image processing, but may also be used for other

    signals, such as audio signals. The techni/ues of the present invention are also

    particularly useful for trac(ing one or more ob ects in real time.

    +t is desirable to provide devices including combined data processing units of a

    similar nature, each addressing a particular parameter extracted from the video signal.

    +n particular, it is desirable to provide devices including multiple units for calculating

    histograms, or electronic spatio-temporal neuron T0, each processing a 12T2 2$, bya function in order to generate individually an output value.

    The present invention also provides a method for perception of an ob ect using

    characteristics, such as its shape, its si e or its orientation, using a device composed of a

    set of histogram calculation units.

    2College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    3/19

    Generic visual perception processor

    3sing the techni/ues of the present invention, a general outline of a moving

    ob ect is determined with respect to a relatively stable bac(ground, then inside this

    outline, elements that are characteri ed by their tone, color, relative position etc. are

    determined. .

    %OTI#NTIA& 'IG"T#D

    The !"## was invented in 1992 , by *4" founder #atric #irim . +t would be

    relatively simple for a C56 chip to implement in hardware the separate contributions

    of temporal and spatial processing in the brain. The brain-eye system uses layers of

    parallel-processing neurons that pass the signal through a series of preprocessing steps,resulting in real-time trac(ing of multiple moving ob ects within a visual scene.

    #irim created a chip architecture that mimic(ed the wor( of the neurons, with the

    help of multiplexing and memory. The result is an inexpensive device that can

    autonomously perceive and then trac( up to eight user-specified ob ects in a video

    stream based on hue, luminance, saturation, spatial orientation, speed and direction of

    motion.

    The !"## trac(s an ob ect, defined as a certain set of hue, luminance and

    saturation values in a specific shape, from frame to frame in a video stream by

    anticipating where it7s leading and trailing edges ma(e differences with the

    bac(ground. That means it can trac( an ob ect through varying light sources or changes

    in si e, as when an ob ect gets closer to the viewer or moves farther away.

    The !"##7 ma or performance strength over current-day vision systems is its

    adaptation to varying light conditions. Today7s vision systems dictate uniform shadow

    less illumination ,and even next generation prototype systems, designed to wor( under

    8normal9 lighting conditions, can be used only dawn to dus(. The !"## on the other

    hand, adapt to real time changes in lighting without recalibration, day or light.

    3College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    4/19

    Generic visual perception processor

    :or many decades the field of computing has been trapped by the limitations of

    the traditional processors. 5any futuristic technologies have been bound by limitations

    of these processors .These limitations stemmed from the basic architecture of these

    processors. Traditional processors wor( by slicing each and every complex program

    into simple tas(s that a processor could execute. This re/uires an existence of an

    algorithm for solution of the particular problem. *ut there are many situations where

    there is an inexistence of an algorithm or inability of a human to understand the

    algorithm. 4ven in these extreme cases !"## performs well. +t can solve a problem with

    its neural learning function. 0eural networ(s are extremely fault tolerant. *y their

    design even if a group of neurons get, the neural networ( only suffers a smooth

    degradation of the performance. +t won7t abruptly fail to wor(. This is a crucial

    difference, from traditional processors as they fail to wor( even if a few components aredamaged. !"## recogni es stores , matches and process patterns. 4ven if pattern is not

    recogni able to a human programmer in input the neural networ(, it will dig it out from

    the input. Thus !"## becomes an efficient tool for applications li(e the pattern

    matching and recognition.

    "O( IT (OR '

    *asically the chip is made of neural networ( modeled resembling the structure of

    human brain. The basic element here is a neuron. There are large number of input lines

    and an output line to a neuron. 4ach neuron is capable of implementing a simple

    function. +t ta(es the weighted sum of its inputs and produces an output that is fed into

    the next layer. The weights assigned to each input are a variable /uantity.

    2 large number of such neurons interconnected form a neural networ(. 4very

    input that is given to the neural networ( gets transmitted over entire networ( via direct

    connections called synaptic connections and feed bac( paths. Thus the signal ripples in

    the neural networ(, every time changing the weighted values associated with each input

    of every neuron. These changes in the ripples will naturally direct the weights to modify

    4College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    5/19

    Generic visual perception processor

    into those values that will become stable .That is, those values does not change. 2t this

    point the information about the signal is stored as the weighted values of inputs in the

    neural networ(.

    2 neural networ( geometri es computation. When we draw the state diagram of

    a neural networ(, the networ( activity burrows a tra ectory in this state space. The

    tra ectory begins with a computation problem. The problem specifies initial conditions

    which define the beginning of tra ectory in the state space.

    +n pattern learning, the pattern to be learned defines the initial conditions.

    Where as in pattern recognition, the pattern to be recogni ed defines the initial

    conditions. 5ost of the tra ectory consists of transient behavior or computations. The

    weights associated with inputs gradually change to learn new pattern information. The

    tra ectory ends when the system reaches e/uilibrium. This is the final state of the neuralnetwor(. +f the pattern was meant to be matched, the final neuronal state represents the

    pattern that is closest match to the input pattern.

    +nput to the electronic eye can come from video, infrared or radar signals. "ideo

    signal is composed of a succession of frames, wherein each frame includes a

    succession of pixels whose assembly forms a space, for example an image for a two-

    dimensional space. ;eal-time outputs perceive, recogni e and analy e both static images

    and time-varying patterns for specific ob ects, their heading, speed, shading and color

    differences. *y mimic(ing the eye and the visual regions of the brain, the !"## puts

    together the salient features necessary for recognition. o instead of capturing frames of

    pixels, the chip identifies ob ects of interest, determines each ob ect's speed and

    direction, then follows them by trac(ing their color through the scene.

    The chip emulates the eye, which has < million cones sensitive to color, only %< times more sensitive than the cones through two processing steps-tonic

    and phasic.

    The chip mimics the human eye7s two processing steps, tonic and phasic .Tonic

    processing auto-scales according to ambient light conditions, enabling it to adapt to a

    5College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    6/19

    Generic visual perception processor

    range of luminosity. #hasic processing determines movement by using local variable in

    the feedbac( loops loops. .2s light7s edges pass over the cones and rods and cones of the

    eye, these local feedbac( loops detect contrast changes caused by ob ects moving

    through the scene

    :or detecting smooth contours, rather than sharp contrast changes, the eye adds

    ocular movement. The eye typically sweeps a scene about two to three times a second as

    well as ma(ing vibratory movements at about %&& ? . The faster itter accounts for the

    eye-sensitivity to the smallest detectable feature, which is an edge moving between

    ad acent rods or cones.

    2fter all this processing the visual signal is then sent to the brain for higher-level

    observation and recognition tas(s. *ecause only detected movements and color along

    with the shape and contour of ob ects is sent to the brain, rather than raw pixels, theaverage compression ratio information is about %=

  • 8/13/2019 Generic Visual Perception

    7/19

    Generic visual perception processor

    associated with the first leading-edge$ and last trailing-edge$ pixel belonging to the

    ob ect. 4ach of these silicon neurons is built with ;25, a few registers, an adder and a

    comparator. upplied as a %&&-pin module, the chip accommodates analogue-input line

    levels for video input, with an input amplifier with programmable gain auto scaling the

    signal.

    The modules measure =& s/uare mm, have %&& pins and can handle )&-5?

    video signals. The chip is priced at @AB&. 2 card with a soc(eted !"## and 64 bytes of

    :lash ;25 comes at @ 1,500 .

    DI$ID# AND CON)U#R ince processing in each module on the !"## runs in parallel out of its own memory

    space, multiple !"## chips can be cascaded to expand the number of ob ects that can be

    recogni ed and trac(ed. When set in master-slave mode, any number of !"## chips can

    divide and con/uer, for instance, complex stereoscopic vision applications.

    'O!T(AR# A'%#CT' 6n the software side, a host operating system running on an external #C

    communicates with the !"##'s evaluation board via an 6 (ernel within the on-chip

    microprocessor. *4" dubs the neural-learning capability of its development

    environment programming by seeing and doing, because of its ease of use. The

    engineer needs no (nowledge of the internal wor(ings of the !"##, the company said,

    only application-specific domain (nowledge.

    #rogramming the !"## is as simple as setting a few registers, and then testing

    the results to gauge the application's success, said teve ;owe, *4"'s director ofresearch and development. 6nce debugged, these tiny application programs are loaded

    directly into the !"##'s internal ;65.

    2pplication programs themselves can use CDD, which ma(es calls to a library of

    assembly language algorithms for visual perception and trac(ing of ob ects. The

    7College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    8/19

    Generic visual perception processor

    system's modular approach permits the developer to create a hierarchy of application

    building bloc(s that simplify problems with inheritable software characteristics.

    *ased on the neural networ( learning functions, the chip allows application

    software to be developed /uic(ly through a combination of simple operationalcommands and immediate feedbac(. imple applications such as detecting and trac(ing

    a single moving ob ect can be programmed in less than a day. 4ven complex

    applications such as detecting a driver falling sleep can be programmed in a month.

    The ability of a neural networ( to learn from the data is perhaps its greatest

    ability. This learning of neural networ( can be described as its configuration such that

    the application of a set of inputs produces the desired set of outputs. "arious methods to

    set the strengths of the connections exist. 6ne way is to set the synaptic connection

    explicitly, using a priori (nowledge. 2nother way is to EtrainEthe neural networ( by

    feeding it teaching patterns and letting it change its weights according to some learning

    rule.

    R#COGNITION TA' +n applications, each pixel may be described with respect to any of the six

    domains of information available to it F hue, luminance, saturation, speed, direction of

    motion and spatial orientation. The !"## further subcategori es pixels by ranges, for

    instance luminance within %& percent and B< percent, hue of blue, saturation between

    )& and )< percent, and moving upward in scene.

    2 set of second-level pattern recognition commands permits the !"## to search

    for different ob ects in different parts of the scene F for instance, to loo( for a closed

    eyelid only within the rectangle bordered by the corners of the eye. ince some

    applications may also re/uire multiple levels of recognition, the !"## has softwarehoo(s to pass along the recognition tas( from level to level.

    :or instance, to detect when a driver is falling asleep F a capability that could

    find use in California, which is about to mandate that cars sound an alarm when

    drowsy drivers begin to nod off F the !"## is first programmed to detect the driver's

    8College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    9/19

    Generic visual perception processor

    head, for which it creates histograms of head movement. The microprocessor reads

    these histograms to identify the area for the eye.

    Then the recognition tas( passes to the next level, which searches only within the

    eye area rectangles. ?igh-speed movement there, normally indicative of blin(ing, isdiscounted, but when blin(s become slower than a predetermined level, they are

    interpreted as the driver nodding off, and trigger an alarm.

    G$%% ARC"IT#CTUR# The chip houses 23 neural bloc(s, both temporal and spatial, each consisting of )&

    hardware input and output synaptic connections. The !"## multiplexes this neuralhardware with off-chip scratchpad memory to simulate as many as %&&,&&& synaptic

    connections per neuron. 4ach of these synapses can be changed through the on-chip

    microprocessor for a combined processing total of over B.) billion synaptic

    connectionsGsecond.

    +n executing up to )& *+# to analy e successive frames of a video stream, the

    temporal neurons identify pixels that have changed over time and generate a >-bit value

    indicative of the magnitude of that change. The spatial-processing system analy es theresulting difference histogram to calculate the speed and direction of the motion.

    ?istogram is a bar chart of the count of pixels of every tone of gray that occurs in

    the image. +t helps us analy e, and more importantly, correct the contrast of the image.

    Technically, the histogram maps Luminance, which is defined from the way the

    human eye perceives the brightness of different colors. :or example, our eyes are most

    sensitive to greenH we see green as being brighter than we see blue. Luminance weighsthe effect of this to indicate the actual perceived brightness of the image pixels due to

    the color components.

    9College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    10/19

    Generic visual perception processor

    :ig A.% ;epresentation of !"##

    MU&TI% %#RC#%TION'

    The vision perception processor chip processes motion images in real-time,

    and is able to perceive and trac( ob ects based on combinations of hue, luminance,

    speed, direction of motion and spatial orientation. The chip has three functionsI

    temporal processing, spatial processing and histogram processing. +n applications, the

    vision processor called generic visual perception processor !"##$ is used with a

    microprocessor, flash memory and peripherals.

    +n a case of detection of a driver falling sleep, the processor core detects movement

    of the eyelids. :irst, the driver is identified, and then the microprocessor directs the

    vision processor to search within the corner points of a rectangular area in which the

    nose of the driver would be expected to be located, based on a model, and to select pixels

    having characteristics of the shadows of nostrils. The results are calculated to the end of

    10College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    11/19

    Generic visual perception processor

    the image frame being analy ed, and the microprocessor analy es the resulting

    histograms to determine characteristics indicative of nostrils, such as the spacing and

    shaping and shape of the nostrils.

    Then, the microprocessor directs the vision processor to isolate a rectangular area

    in which the eyes of the driver are expected to be located, based on the model, and to

    select pixels having characteristics of high-speed movement normally indicative of

    blin(ing. The microprocessor analy es over time the histograms ta(en of the eye area to

    determine the duration of each blin( and the interval between blin(s. :rom the blin(

    duration and interval information, the microprocessor is able to determine when the

    driver is falling

    asleep and triggers an alarm accordingly.

    The highly parallel internal design of the vision processor allows it to perform

    over )& billion instructions per second *+# $, and the internal three functions play the

    following roles. Temporal processing involves the processing of successive frames of an

    image to smooth the pixels of the image over time to prevent interference affecting

    ob ect detection. patial processing involves the processing of pixels within a locali ed

    area of the images to determine the speed and direction of movement of each pixel.

    ;eceiving the information about speed and direction of each pixel from spatial and

    temporal processors, the histogram processor perceives ob ects in the images that have

    undergone significant change over time and the magnitude of the changes.

    The chip is able to detect and trac( eight ob ects per frame, meaning )=& ob ects

    per second in a >& framesGs systems such as T" or high definition T"s ?1T"$. The

    vision processor includes a total of )> spatio-temporal neural bloc(s, each having

    approximately )& different input and output connections. 4ach connection can becontrolled through the on-chip microprocessor. The vision processor provides a

    mechanism to reconfigure the spatio-temporal neural bloc(s to perform a wide variety

    of vision processing, such as moving ob ect detection.

    11College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    12/19

    Generic visual perception processor

    :ig %&.% ;epresentation of histogram calculation unit

    12College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    13/19

    Generic visual perception processor

    AD$ANTAG#' AND DI'AD$ANTAG#'

    AD$ANTAG#'

    1. +mitating the human eye's neural networ(s and the brain, the !"## can handle

    some )& billion instructions per second, compared to a mere millions handled by

    #entium-class processors of today.

    2. !"## has been demonstrated as capable of learning-in-place to solve a variety of

    pattern recognition problems.

    3. +t has automatic normali ation for varying ob ect si e, orientation, and lighting

    conditions, and can function in daylight or dar(ness.

    4. .+t is an inexpensive device that can autonomously perceive and then trac( up

    to eight user-specified ob ects in a video stream.

    5. ince processing in each module on the !"## runs in parallel out of its own

    memory space, multiple !"## chips can be cascaded to expand the number of

    ob ects that can be recogni ed and trac(ed.

    6. The engineer needs no (nowledge of the internal wor(ings of the !"##, the

    company said, only application-specific domain (nowledge.

    7. imple applications can be /uic(ly prototyped in a few days, with medium-si eapplications ta(ing a few wee(s and even big applications only a couple of

    months

    8. The chip could be useful across a wide variety of industries where visual trac(ing

    is important

    13College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    14/19

  • 8/13/2019 Generic Visual Perception

    15/19

    Generic visual perception processor

    A%%&ICATION' *4" lists possible applications for the !"## in process monitoring, /uality

    control and assemblyH automotive systems such as intelligent air bags that monitorpassenger si e and traffic congestion monitorsH pedestrian detection, license plate

    recognition, electronic toll collection, automatic par(ing management, automatic

    inspectionH and medical uses including disease identification. The chip could also prove

    useful in unmanned air vehicles, miniature smart weapons, ground reconnaissance and

    other military applications, as well as in security access using facial, iris, fingerprint, or

    height and gait identification

    *+ Auto otive industr-

    ?ere, !"##-type devices could (eep watch and trigger off an alarm if the driver

    were to nod off to sleep at the wheel. 6r act as a monitor for the car's progress to ma(e

    sure it does not get too close to the curb.

    2lso with transportation, !"## could be used in developing systems for collision

    avoidance, automatic cruise control, smart air bag systems, license plate recognition,measurement of traffic flow, electronic toll collection, automatic cargo trac(ing, par(ing

    management and the inspection of crac(s in rails and tunnels.

    2nother automotive application is warning erratic drivers. !"## does so by

    monitoring the left and right lanes of the road, signaling the driver when the vehicle

    deviates from the proscribed lane.

    .+ Ro/otics

    +n manufacturing, !"## have applications in robotics, particularly for dirty and

    dangerous obs such as feeding hot parts to forging presses, cleaning up ha ardous

    waste, and spraying toxic coatings on aircraft parts.

    15College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    16/19

    Generic visual perception processor

    0+ Agriculture and fis1eries

    +n agriculture and fisheries engineering, !"## could help with tas(s that

    traditionally have been labor intensive such as disease and parasite identification,

    harvest control, ripeness detection and yield identification.

    2+ Militar- applications

    The chip can be used in other important pattern-recognition applications, such as

    military target ac/uisition and fire control.. 5ilitary applications include unmanned air

    vehicles, automatic target detection, tra ectory correction, ground reconnaissance andsurveillance. "ision systems are useful in home security, as well, and !"##-based

    systems could be inexpensively developed to detect intruders and fires.

    +n addition to the above applications, !"## should be able to wor( as medical

    scanners, blood analy ers, cardiac monitoring, ban( chec(s, bar code reading, seal and

    signature verification, trademar( database indexing, construction of virtual reality

    environment models, human motion analysis, expression understanding, cloud

    identification, and many other fields.

    16College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    17/19

    Generic visual perception processor

    PRELIMINARY INFORMATION OF GVPP-7BGeneric Visual Perception Processor

    2rray :ormat max$I J&&?xB&&" :rame rateI &-%&& "!2 frames per second progressive-scan +nterface 5odeI 5asterG lave 1ata ;ate max$I =& 5egapixel per second 1ynamic ;angeI %&-bits, > channels #arametersI Luminance, ?ue, aturation, 5otion orientation,

    velocity$, 6riented edges, lines, curves, corners ComputationI B= T0 bloc(s

    5ulti-scales possibilities optional$ 5ulti-chips connections capabilities emi-!raphic interface visuali ation !3+ with mouse #2L, 0T C, "!2, visuali ation +nternal 6 C language programmation #C+, # ), +)C, ; )>), #W5 &.> Watt, >,>"olts for #2LG0T C format -=&,DJ< C Complete application in one i# % x% I C-56 +mager, !"##,

    %&5bits ";25, % 5bits erial :lash 5emory,)K 5? Cristal.

    17College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    18/19

    Generic visual perception processor

    !UTUR# 'CO%#'

    :or future, scientists are wor(ing on a 8visual mouse9 for hand-

    gesture interface to computers that ta(e advantage of that high compression ratio.

    :uture studies also involve using this processor as an eye of the robots, which provides

    tremendous applications.

    CONC&U'ION'

    +mitating the human eye's neural networ(s and the brain, the generic visual perception

    processor can handle about )& billion instructions per second, and can manage most

    tas(s performed by the eye. 5odeled on the visual perception capabilities of the human

    brain, the !"## is a single chip that can detect ob ects in a motion video signal and then

    to locate and trac( them in real time far more dependably than competing systems, which cost far more, according to company scientists. This is a generic chip, and we've

    already identified more than %&& applications in ten or more industries. The chip could

    be useful across a wide swathe of industries where visual trac(ing is important.

    18College of applied sciences, Mavelikkara

  • 8/13/2019 Generic Visual Perception

    19/19

    Generic visual perception processor

    R#!#RCNC#'

    1M httpIGGwww.patentstorm.us

    2M httpIGGwww.techweb.com

    3M *art os(o., 0eural networ(s and fu y systems, 2ddison Wesley =MhttpIGGwww.pixelin(.comGsupport