security sphere - massachusetts institute of technology
Security Sphere: Panoramic Acquisition
By: Oluwamuyiwa Olubuyide
Collaborators: Jelena Madic, Matthew Yarosz
Advisor: Professor Charles G. Sodini
May 11, 2000
Abstract
The Security Sphere is a wireless, panoramic video surveillance device. It will be a compact ball with a four inch radius hanging in the central apex of a visual field. The Security Sphere will have four imaging locations whose images will be aligned to form one panoramic image. It will also have a motion detection capability that will increase the power savings of the device. Furthermore, this low power capability can lead to future portability. The MIT Technology being showcased is the CMOS Differential Passive Pixel Image (DPPI) Sensor and the Ultra-Low Power Phase Locked Loop (PLL) Frequency Synthesizer. The Security Sphere will transmit 30 visual data frames per second at 2.5 Megabits per second. The resolution of the Security Sphere images will be 256 X 256.
Table of Contents
1 Overview
2 Overall Security Sphere Architecture
3 Tracing the Data Path
    3.1 Panoramic Acquisition
    3.2 Motion Detection
    3.3 Compression
    3.4 Transmission
    3.5 Reception
    3.6 Image Alignment
4 Specifics of Panoramic Acquisition
    4.1 Architecture
    4.2 Lenses
    4.3 Differential Passive Pixel Image Sensors
5 Testing
6 Resources
7 Timeline
8 References
1 Overview:
The interest in and applications for low power, portable, wireless, wide visual range monitors are
increasing in the current information age. For example, these devices can be placed anywhere
from service centers to baby cribs to ensure the safety of the environment. In response to this
growing market, the proposal for this Master of Engineering thesis is to build a wireless,
panoramic video surveillance unit dubbed the Security Sphere. The Security Sphere will be a
four inch radius sphere with four imagers mounted on the bottom hemispherical surface. In the
primary operational mode, the Security Sphere will be hanging from the central apex of a visual
field, and data will be transmitted at 30 frames per second to a display unit. The fabricated
Security Sphere will showcase the following MIT Technology: a CMOS Differential Passive
Pixel Image (DPPI) Sensor and an Ultra-low Power Phase Locked Loop (PLL) Frequency
Synthesizer.
2 Overall Security Sphere Architecture:
[Block diagram. Sphere: Panoramic Acquisition, Motion Detection, Image Compression, MIT Tx. Wireless link: 1.89 GHz, 2.5 Mbps. Display unit: Rx Analog Front End, Digital Demodulation and Decompression, Image Alignment and Display.]
3 Tracing the Data Path:
3.1 Panoramic Acquisition:
Four CMOS Differential Passive Pixel Image (DPPI) Sensors will be strategically embedded in
the bottom hemisphere of the hollow sphere to capture the panoramic, 180 degree view.
Embedding the sensors will yield a more compact, portable device. Furthermore, decreasing the
size of the Security Sphere will make it easier to realize a seamless, panoramic image at the focal
length of the lenses. Short lens focal lengths also facilitate the Security Sphere’s ability to
capture the panoramic image. Fortunately, Wide Angle Lenses have a short focal length,
which also leads to a better depth of field and a smaller size for the Security Sphere. Furthermore,
Wide Angle Lenses sharply decrease the number of imagers and lenses needed to capture the
panoramic, seamless image. The tradeoff is that these lenses create image and distance
distortions. The distance distortion causes objects to appear farther away and smaller than in real
life. The image distortion effect of Wide Angle Lenses is a visual compression of the data that
reaches its maximum at the edge of the angle of view. For the 110 degree angle of view lenses
utilized for the Security Sphere, the image will be compressed by approximately 34% relative to
a flat image viewed at the same distance.
After passing through the lenses, the image will then be sensed by the Differential
Passive Pixel Image (DPPI) Sensor. This sensor is a CMOS Random Access 256 x 256 charge
output pixel array that utilizes a differential architecture to eliminate common mode
disturbances. The voltage output from the DPPI Sensor is read out in an even and odd sequence,
and written to an SRAM. After storage, the image advances along the data path for image
processing.
3.2 Motion Detection Algorithm:
The first data processing stage is motion detection. At this point, the data is in its original,
unaltered form, ideal for a motion detection algorithm that is necessarily sensitive to noise. The
Security Sphere employs motion detection primarily as a power saving feature. Security cameras
frequently monitor areas with little to no motion for extended periods of time. When no motion
is present in its surroundings, the Security Sphere will enter a power-saving mode in which it
decreases its transmitted frame rate from the normal 30 frames per second to one frame per
second. Thus, the Security Sphere achieves significant power savings at the transmitter. The
specific design of the motion detection algorithm matches the purpose of a security device:
detection of human intruders. Motion that is too small or too repetitive to possibly be human
motion does not cause the motion detection algorithm to trigger. For example, a small animal or
a blowing leaf will not cause enough pixels to change value, and the algorithm will not trigger.
In addition, the motion of machines in a factory setting is large enough to be human motion, but
much too repetitive. This type of motion also should not be monitored at the full 30 frames per
second. The motion detection algorithm therefore recognizes the repetitive nature of this motion,
and quickly “learns” not to trigger upon it. The motion detection algorithm will be implemented
in software on a Texas Instruments DSP chip mounted on the sphere.
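The triggering rule described above can be sketched as a simple frame-differencing check. This is only an illustration under stated assumptions, not the algorithm that will run on the DSP: the thresholds `PIXEL_DELTA` and `MIN_CHANGED`, the function names, and the use of a precomputed `learned_mask` to stand in for the "learning" of repetitive motion are all hypothetical.

```python
import numpy as np

PIXEL_DELTA = 25    # assumed gray-level change that counts as a changed pixel
MIN_CHANGED = 500   # assumed changed-pixel count needed for human-sized motion


def changed_mask(prev, curr, pixel_delta=PIXEL_DELTA):
    """Boolean mask of pixels whose gray level changed by more than pixel_delta."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    return diff > pixel_delta


def select_frame_rate(prev, curr, learned_mask=None, min_changed=MIN_CHANGED):
    """Return 30 (frames per second) when motion triggers, otherwise 1."""
    mask = changed_mask(prev, curr)
    if learned_mask is not None:
        # Ignore pixels already known to change repetitively (e.g. machinery).
        mask &= ~learned_mask
    return 30 if int(mask.sum()) >= min_changed else 1


# Usage: a 40 x 40 pixel change triggers, a 5 x 5 pixel change does not.
prev = np.zeros((256, 256), dtype=np.uint8)
big = prev.copy()
big[100:140, 100:140] = 200
small = prev.copy()
small[0:5, 0:5] = 200
print(select_frame_rate(prev, big), select_frame_rate(prev, small))
```

Counting changed pixels is what lets a small animal or a blowing leaf fall below the trigger level, while the mask gives one simple way to suppress a region of repetitive motion.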
3.3 Compression:
Next the data moves to the image compression stage. The sphere must transmit four images at
30 frames per second over a wireless channel that is limited to only 2.5 Megabits per second.
Clearly, image compression is necessary, specifically at a compression ratio of 25 to 1. Wavelet
compression was selected over more standard video compression algorithms such as MPEG,
because wavelet compression has proven to provide high compression ratios without the
blocking artifacts exhibited by Discrete Cosine Transform based algorithms such as MPEG.
Instead of dividing the image spatially and compressing each spatial block independently,
wavelet compression divides the image in the frequency domain and compresses each frequency
band separately. This algorithm creates compressed images that appear more natural. The image
compression is handled by the Analog Devices ADV601 wavelet compression chip. A
complementary ADV601 chip will perform the decompression at the receiver unit.
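The 25 to 1 compression ratio quoted above can be checked with a short calculation. The sketch assumes 8 bits per pixel, which follows from the 256 gray levels listed for the DPPI Sensor, and takes 2.5 Megabits per second to mean 2.5 x 10^6 bits per second; the variable names are illustrative.

```python
IMAGERS = 4
WIDTH = HEIGHT = 256
BITS_PER_PIXEL = 8        # assumption: 256 gray levels need 8 bits per pixel
FRAMES_PER_SECOND = 30
CHANNEL_BPS = 2.5e6       # assumption: 2.5 Megabits per second = 2.5e6 bit/s

# Uncompressed data rate produced by all four imagers.
raw_bps = IMAGERS * WIDTH * HEIGHT * BITS_PER_PIXEL * FRAMES_PER_SECOND

# Compression ratio needed to fit the raw stream into the wireless channel.
ratio = raw_bps / CHANNEL_BPS

print(f"raw rate: {raw_bps / 1e6:.1f} Mbit/s, required ratio: {ratio:.1f} to 1")
```

Under these assumptions the raw stream is about 62.9 Mbit/s, which gives a required ratio of roughly 25 to 1.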
3.4 Transmission:
After compression, the data will be ready for transmission to the display unit via the MIT
Transmitter. At the heart of the MIT transmitter is an ultra-low power, high data rate fractional-N
frequency synthesizer. Its major component is a Phase Locked Loop (PLL), a feedback
loop that modulates the data around the 1.89 GHz carrier frequency. The PLL consists of a
phase detector (PD), low-pass filter (LPF), voltage controlled oscillator (VCO) and a divider in
the feedback path. This structure suffers from fractional spurs that could significantly degrade
the performance of the synthesizer. A sigma-delta (Σ-∆) modulator is added to the PLL to shift
the spurs to higher frequencies so that they can be filtered out by the LPF. To achieve high data
rates, a high bandwidth is necessary, and the low bandwidth of the LPF is the limiting factor. To
achieve a flatter frequency response from the synthesizer, a precompensation filter with a
transfer function complementary to that of the feedback loop is added. The matching of these two
transfer functions is ensured by the Adaptive Tuning Circuit (ATC) with automatic gain
calibration capabilities. The ATC has been proposed by Dan McMahill, a PhD student at MIT.
Automatic gain calibration eliminates factory trimming, and makes the synthesizer less sensitive
to process and temperature variations. The modulation scheme used is Gaussian Minimum Shift
Keying (GMSK). The major advantage of GMSK is that it allows for coherent demodulation, which
achieves a lower bit error rate than noncoherent demodulation. The Gaussian filter that
shapes the digital data stream is integrated into the synthesizer.
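To illustrate how a fractional-N divider reaches a non-integer divide ratio, the sketch below dithers the divider between N and N+1 with a first-order accumulator, which is the simplest form of sigma-delta modulator. It is not the MIT synthesizer design: the proposal does not state the modulator order or the reference frequency, so the 20 MHz reference and the function name `divider_sequence` are assumptions chosen only so that the average ratio lands on the 1.89 GHz carrier.

```python
def divider_sequence(n_int, frac_num, frac_den, steps):
    """Divide values (n_int or n_int + 1) averaging n_int + frac_num / frac_den."""
    acc = 0
    seq = []
    for _ in range(steps):
        acc += frac_num
        if acc >= frac_den:
            # Accumulator overflow: divide by N + 1 for this reference cycle.
            acc -= frac_den
            seq.append(n_int + 1)
        else:
            seq.append(n_int)
    return seq


REF_HZ = 20e6                       # hypothetical reference frequency
seq = divider_sequence(94, 1, 2, 100)
average_ratio = sum(seq) / len(seq)
print(average_ratio, average_ratio * REF_HZ / 1e9, "GHz")
```

A first-order accumulator such as this one produces a periodic divide pattern, which is the origin of fractional spurs; higher-order sigma-delta modulators randomize the pattern so that its energy moves to higher frequencies where the loop filter can remove it.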
A reduced number of components and low operating voltages lead to low power dissipation.
Also, when no motion is detected, the power-saving mode signal from the ADV601 chip
disconnects the power supplies from the transmitter.
3.5 Reception:
The receiver has an analog front end only. The front end has a single IF stage that consists of a
band-select filter, low noise amplifier (LNA), image reject filter, a mixer, and a channel-select
filter. An analog-to-digital converter (ADC) transfers the downconverted signal into the digital
domain of the FPGA, where further filtering, demodulation, and clock extraction occur. This
digital implementation of the receiver reduces hardware and minimizes associated problems such as
temperature variation and instability. The demodulator is implemented in the
FPGA, allowing for greater flexibility in adopting the coherent demodulation scheme. I/Q
mismatches are also eliminated. This receiver will be built using discrete, commercially available
components. The demodulated signal is then decompressed and sent to the display unit for image
alignment.
3.6 Image Alignment:
The data being transmitted contains four separate spatial images. These images arrive at the
laptop as separate images and must be combined into one panoramic image. An image alignment
algorithm programmed in software on the laptop display unit achieves this alignment. The pixels
along the image boundaries that the images share in common enable the image alignment. First,
using prior knowledge of the imager locations on the sphere, an initial guess at the proper
alignment of the images is made. Then a correlation algorithm tests various image
configurations around the initial guess, eventually zeroing in on the proper configuration. The
panoramic image is then displayed on the laptop computer monitor.
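The two-step procedure above (initial guess, then a correlation search around it) can be sketched for a single pair of horizontally adjacent images. This is a simplified illustration, not the alignment software itself: it assumes the only unknown is the width of the shared strip of columns, and the function names and the `search` window are hypothetical.

```python
import numpy as np


def overlap_score(left, right, overlap):
    """Normalized correlation of the strip shared by two adjacent images."""
    a = left[:, -overlap:].astype(float).ravel()
    b = right[:, :overlap].astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def refine_overlap(left, right, guess, search=5):
    """Test overlaps around the initial guess and keep the best-correlated one."""
    upper = min(left.shape[1], right.shape[1], guess + search)
    candidates = range(max(1, guess - search), upper + 1)
    return max(candidates, key=lambda ov: overlap_score(left, right, ov))


# Usage: two views cut from one synthetic scene with a true overlap of 20 columns.
rng = np.random.default_rng(0)
scene = rng.integers(0, 256, size=(64, 200))
left, right = scene[:, :120], scene[:, 100:]
print(refine_overlap(left, right, guess=18))
```

Starting from a guess derived from the known imager positions keeps the search window small, which matters when four images must be aligned for every displayed frame.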
4 Specifics of the Panoramic Acquisition:
4.1 Architecture:
[Figure 1: Block Diagram of the Panoramic Acquisition. Blocks: Lens & Imager Chip, Bias Voltage, FPGA, SRAM, DSP.]
Figure 1
As shown in Figure 1, the Imager Chip (DPPI Sensor) will be controlled by an FPGA (Field
Programmable Gate Array), which will handle timing tasks such as powering up the chip
and selecting the sequence of rows and columns that will be read out from the imager. This
visual data will then be stored at fixed addresses in an SRAM chip. Although there will be
four imagers, all the imagers are treated in the same manner, so only the data path for one
DPPI Sensor is shown in the above block diagram.
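The readout sequencing and fixed SRAM addressing handled by the FPGA can be sketched in software. The details here are assumptions made for illustration: the proposal only says the array is read out "in an even and odd sequence", so the even-columns-first ordering, the row-major address map, and the function names are hypothetical.

```python
ROWS = COLS = 256


def readout_order(cols=COLS):
    """Assumed ordering: even-indexed columns first, then odd-indexed columns."""
    return list(range(0, cols, 2)) + list(range(1, cols, 2))


def sram_address(row, col, cols=COLS):
    """Fixed row-major address, so each pixel always lands in the same location."""
    return row * cols + col


# Usage: the first few columns visited, and the address of the last pixel.
print(readout_order(8))
print(sram_address(ROWS - 1, COLS - 1))
```

Because the address depends only on the pixel position and not on the order of readout, later stages such as motion detection can read a frame from SRAM without knowing how it was scanned.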
4.2 Lenses:
[Figure 2: Analysis of Seamless, Panoramic Lens View. Labels: Angle θ, Focal Length F, Radius R, Side S, Imager & Lens.]
Figure 2: Analysis of Seamless, Panoramic Lens View
[Graph 1: # of Required Lenses (Focal Length of 25cm) versus Angle (Degrees).]
Graph 1
[Graph 2: Distortion vs AOV.]
Graph 2
The importance of the lens in the panoramic acquisition of data is captured in the above Figure
and Graphs. Figure 2 shows a condensed version of the analysis used to calculate how many
lenses will be required to display a seamless panoramic image. This figure illustrates how the
angle of view θ and the focal length F of the lenses lead to an overlap area for one lens of
$2[F\tan(\theta/2)]^2$. Using this formula, it becomes apparent that to cover the bottom
hemispherical surface area of $2\pi r^2$, four lenses are required. Furthermore, once the area to be
covered ($\approx 6.3r^2$) has been matched by the overlap area of the four lenses ($8[F\tan(\theta/2)]^2$), any
increase in the focal length will yield a seamless image with greater overlap area because of
the greater coefficient (8 versus 6.3) in the formula for the overlap area. Also, as the focal
length of the lenses increases, the depth of field of the captured image decreases and the image
looks flatter. Therefore, it is not advisable to greatly increase the focal length.
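The coverage argument can be turned into a small calculation: divide the hemispherical area by the overlap area of one lens and round up. The sketch below uses only the two area formulas quoted above. The function name is hypothetical, and the example values (radius equal to focal length, 90 degree angle of view) are illustrative and are not the lens parameters of the Security Sphere.

```python
import math


def lenses_required(radius, focal_length, aov_deg):
    """Lenses needed so that their combined overlap area covers the hemisphere."""
    half_angle = math.radians(aov_deg) / 2.0
    per_lens = 2.0 * (focal_length * math.tan(half_angle)) ** 2  # area of one lens
    hemisphere = 2.0 * math.pi * radius ** 2                     # area to cover
    return math.ceil(hemisphere / per_lens)


# Usage with illustrative values (same length unit for radius and focal length).
for aov in (60.0, 90.0, 120.0):
    print(aov, lenses_required(1.0, 1.0, aov))
```

The count falls quickly as the angle of view widens because the per-lens area grows with the square of tan(θ/2).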
The other parameter of interest is the relationship between the angle of view and the number
of lenses required. This is displayed in Graph 1. This graph shows that as the angle
of view decreases, the number of lenses required to capture a seamless, panoramic view
increases sharply (through the tan(θ/2) term). In order to use only four lenses for a
four inch radius Security Sphere at a focal length of 25cm, it becomes evident that an angle
of view greater than 103° is required. The tradeoff is that as the angle of view increases, the
visual distortion of the lens increases. This relationship is captured in Graph 2. It is evident that
as the angle of view increases, the visual compression of the data away from the center of the lens
increases correspondingly. For an angle of view greater than 103°, the visual compression is ~34.6%.
This compression is labeled distortion because it is not spatially uniform; for
instance, the horizontal dimension is compressed more than the vertical dimension.
4.3 Differential Passive Pixel Image (DPPI) Sensor:
DPPI Sensor Architecture:
[Figure 3: DPPI Sensor architecture.]
Figure 3
DPPI Sensor Specifications:
Parameters        Current DPPI Sensor
Color Scale       Gray, 256 Levels
Die Size          8mm X 6mm
Pixel Type        Charge Output
Output Circuit    Folded Cascode Operational Amplifier
Frame Rate        30 Frames/second
Battery Range     3 Volts to 5 Volts
Resolution        256 X 256
Technology        Random Access CMOS
Noise             573 uV
Table 2
The CMOS Differential Passive Pixel Image Sensor is a charge output device that utilizes a
CMOS random access 256 X 256 array architecture shown in Figure 3. The passive pixel
architecture consists of a single n-well photodiode and a row select transistor on a 20 square
micron die. The differential nature of the imager is due to the ability of the present output pixel
to be compared with a dummy cell that is kept in the dark. The differential output voltage
therefore corresponds to the sensed charge. This differential architecture reduces common mode
disturbances and reduces column-to-column variation. Furthermore, the main problem with
passive pixel technology, parasitic current, is removed in this sensor using a Correlated Double
Sampling architecture that is an outgrowth of the differential nature of the imager [from the
ISSCC]. In the power savings mode, the sensor exhibits significant power savings
by only turning on the circuitry required to send out images at reduced resolutions. Moreover,
the CMOS imager allows separate and independent access to each pixel. This capability also
enables a zoom and pan feature to be implemented with the CMOS imager. Each column of the
256 X 256 array will be read out in an even and odd sequence, and the information will be stored
into an SRAM.
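The benefit of comparing each pixel with a dark dummy cell can be shown with an idealized numerical model. This is not a model of the actual sensor circuit: the signal values, the disturbance level, and the assumption that the pixel and the dummy cell see exactly the same disturbance are all hypothetical.

```python
import numpy as np


def differential_readout(pixel_out, dummy_out):
    """Subtract the dark dummy cell so any shared disturbance cancels."""
    return pixel_out - dummy_out


rng = np.random.default_rng(0)
signal = np.array([0.10, 0.25, 0.40])          # hypothetical pixel signals (V)
common = rng.normal(0.0, 0.05, size=signal.shape)  # disturbance seen by both paths

pixel_out = signal + common    # illuminated pixel: signal plus disturbance
dummy_out = common             # dark dummy cell: disturbance only

print(differential_readout(pixel_out, dummy_out))
```

In this idealized case the difference recovers the signal exactly; in a real sensor the cancellation is limited by how well the two paths match.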
5 Testing:
The parameters on which the Security Sphere will be judged are resolution, motion detection
capability, and aesthetics of the displayed image. The demonstration system for this device will
be a one-hour visual recording of a room. During this time period, there will be a variable rate
and size of movement in the room. This demonstration will show the ability of the Security
Sphere to deliver high resolution images. Moreover, the flexibility with which the On-chip DSP can
detect motion and hence increase the power savings of the Security Sphere will be exhibited.
Finally, the ability of the Software DSP unit to manipulate the incoming data stream will be
highlighted.
6 Resources:
Parts                                 Number   Total Cost
Wide Angle Lenses                     4        $640
Printed Circuit Board (PCB)           6        $400
Evaluation Board                      1        $200
Various Electrical Components         NA       $200
Buffers                               2        $150
Wavelet Compression Chip (ADV601)     2        $128
SRAM                                  1        $100
Analog to Digital Converter (A/D)     2        $100
FPGA                                  2        $100
CMOS DPPI Sensor                      4        $0
PLL Frequency Synthesizer             1        $0
Code Composer Software                1        $0
DSP Chip                              2        $0
Laptop                                1        $0
Total                                          $2,018
Table 3
7 Timeline:
Start Date   End Date   Member    Task
3/1/00       5/1/00     Jelena    Receiver Design
3/1/00       5/1/00     Muyiwa    Selection of Lens
4/1/00       6/1/00     Matthew   Testing of Wavelet Compression Chip
6/1/00       7/1/00     Jelena    Testing of PLL Frequency Synthesizer
6/1/00       9/1/00     Muyiwa    Imager Board and Lens Integration
7/1/00       9/1/00     Jelena    Receiver Implementation
7/1/00       10/1/00    Matthew   Implementation and Testing of Motion Detection Algorithm
9/1/00       12/1/01    Jelena    Implementation of Transmitter
9/1/00       12/1/00    Muyiwa    Testing of Board and Lens in Simulated Sphere Architecture
10/1/00      12/1/01    Matthew   Implementation of Image Alignment Algorithm
12/1/01      2/1/01     All       Implementation of Integrated Security Sphere
2/1/01       3/1/01     All       Testing of Integrated Security Sphere
3/1/01       6/1/01     All       Demonstration of Integrated Security Sphere
10/1/00      6/1/01     All       Thesis Writeup
Table 4
8 References:
[1] Fujimori, Iliana L., A Differential Passive Pixel Image Sensor, Massachusetts Institute of Technology, February 1997.
[2] Horenstein, Henry, Black and White Photography: A Basic Manual, 2nd Edition, Boston, 1983.
[3] Perrott, M. H., Techniques for High Data Rate Modulation and Low Power Operation of Fractional-N Frequency Synthesizers, PhD Dissertation, MIT, Department of Electrical Engineering and Computer Science.
[4] Patel, Ketan, Ultra Low-Power Wireless Sensor Demonstration System: Design of a Wireless Base Station, Master of Science Thesis, MIT, Department of Electrical Engineering and Computer Science.
[5] Akos, Dennis M. and Tsui, James B. Y., Design and Implementation of a Direct Digitization GPS Receiver Front End, IEEE Transactions on Microwave Theory and Techniques, Vol. 44, No. 12, December 1996.
Acknowledgements:
I would like to thank Professor Charles G. Sodini, Professor Jim Bales, and Sam Lefian for their guidance on all aspects of this project.