IMAGE COMPRESSION USING DISCRETE WAVELET AND DISCRETE COSINE TRANSFORM

A Project Report submitted in partial fulfilment of the requirements for the award of the degree of BACHELOR OF ENGINEERING in ELECTRONICS & COMMUNICATIONS ENGINEERING

By
A. SUJANA
A. MADHAVA REDDY
N. KRISHNA CHAITHANYA
A. KARTHIK

Under the Esteemed Guidance of
T. HYMA LAKSHMI, .(Engg.), MISTE, MIETE
Professor, Dept. of E.C.E.

Department of Electronics & Communications Engineering
SRKR ENGINEERING COLLEGE, BHIMAVARAM
(Recognised by A.I.C.T.E.)
Affiliated to ANDHRA UNIVERSITY
(2006-2010)




CERTIFICATE OF EXAMINATION This is to certify that I have examined the thesis and hereby accord approval of it as a study carried out and presented in the manner required for its acceptance in partial fulfillment of the degree of Bachelor of Engineering, for which it has been submitted.

This approval does not necessarily endorse or accept every statement made, opinion expressed or conclusion drawn as recorded in the report; it only signifies the acceptance of the report for the purpose for which it is submitted.

EXTERNAL EXAMINER INTERNAL EXAMINER

ACKNOWLEDGEMENT We are sincerely grateful to S.R.K.R. ENGINEERING COLLEGE and express our heartfelt thanks to the principal, Dr. D. RANGARAJU, for giving us this opportunity for the successful completion of this degree.


We express our profound gratitude to our project guide, Prof. T. HYMA LAKSHMI, for the valuable guidance and encouragement throughout this project, which enabled us to complete it successfully and in time. We would also like to express our sincere thanks to Prof. N. VENKATESWARA RAO, Head of the Department of ECE, for his valuable suggestions at the time of need.

We express our special thanks to the staff of the ECE department and to our friends who helped us in bringing out the report in its present form.

Last but not least, we are grateful to our family members for their love and moral support, without which we would not have achieved this goal.

--Project Associates

ABSTRACT: The rapid development of multimedia computing has led to a growing demand for digital images. The manipulation, storage and transmission of images in their raw form is very expensive: it significantly slows transmission and makes storage costly. Efficient image compression solutions are becoming critical with the recent growth of data-intensive, multimedia-based applications. Many techniques are now available, and much effort is being expended in determining the optimum compression transforms. Here, compression is performed using the Cosine and Wavelet Transforms. Recently, compression techniques using the Wavelet Transform (WT) have received great attention because of their promising compression ratio, their ability to analyze the temporal and spectral properties of image signals, and their flexibility in representing non-stationary signals such as speech and images while taking the human perception system into account. In this report we describe the application of the Discrete Wavelet Transform (DWT) for the analysis, processing and compression of multimedia signals such as speech and images. More specifically, we explore the major issues concerning wavelet-based image compression, which include choosing the optimal wavelet, the number of decomposition levels and the thresholding criteria. The simulation results demonstrate the effectiveness of DWT-based techniques in attaining an efficient compression ratio of 2.67 for images, achieving a higher signal to noise ratio (SNR) and a better peak signal to noise ratio (PSNR), while the retained signal energy is 99.9885% and the resulting signals are generally much smoother. A comparison between the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform is presented at the end.

1. INTRODUCTION

In today’s digital world, when we watch digital movies, listen to digital music, read digital mail, store documents digitally and converse digitally, we have to deal with huge amounts of digital data. Data compression therefore plays a very significant role in keeping the digital world practical. If there were no data compression techniques, we would not have been able to listen to songs over the Internet or view digital pictures and movies, nor would we have heard of video conferencing or telemedicine. How did data compression make this possible? What are its main advantages in the digital world? There may be many answers, but the three obvious ones are the saving of memory space for storage, of channel bandwidth, and of processing time for transmission. Many of us will have experienced that, before the advent of MP3, a music CD could accommodate hardly 4 or 5 songs stored as WAV files. It was not practical to send a WAV file by mail because of its tremendous size, and it took 5 to 10 minutes or even more to download a song from the Internet. Now we can easily accommodate 50 to 60 MP3 songs on a CD of the same capacity, because uncompressed audio files can be compressed 10 to 15 times using the MP3 format. We have no problem sending our favorite music to distant friends in any corner of the world, and we can download an MP3 song in a matter of seconds. This is a simple example of the significance of data compression. Similar compression schemes were developed for other digital data such as images and videos. Videos are nothing but animations of frames of images in a proper sequence, at a rate of 30 frames per second or higher, so a huge amount of memory is required for storing video files. The fact that a DVD can now store 4 or 5 movies, where previously 2 or 3 CDs were needed for a single movie file, is due to compression. Here we will consider mainly image compression techniques. Image data compression is concerned with minimizing the number of bits required to represent an image with no significant loss of information. Image compression algorithms aim to remove the redundancy present in the data (the correlation of the data) in a way which makes image reconstruction possible; this is called information-preserving compression. Perhaps the simplest and most dramatic form of data compression is the sampling of band-limited images, where an infinite number of pixels per unit area is reduced to a finite set of samples without any loss of information; the number of samples per unit area is thus infinitely reduced.

Transform-based methods better preserve subjective image quality and are less sensitive to changes in statistical image properties, both within a single image and between images. Prediction methods provide higher compression ratios in a much less expensive way. If compressed images are to be transmitted, an important property is insensitivity to transmission channel noise. Transform-based techniques are significantly less sensitive to channel noise: if a transform coefficient is corrupted during transmission, the resulting error is spread homogeneously through the image or image part and is not too disturbing.

Applications of data compression are primarily in transmission and storage of information. Image transmission applications are in broadcast television, remote sensing via satellite, military communication via aircraft, radar and sonar, teleconferencing, and computer communications.

1.1) IMAGE In general, an image can be defined as any two-dimensional function f(x,y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x,y) is called the intensity or gray level of the image at that point.


Digital image When x, y and the amplitude values of f are all finite, discrete quantities, we call the image a digital image.

Pixel A pixel is a single point in a graphic image. Graphics monitors display pictures by dividing the display screen into thousands (or millions) of pixels, arranged in rows and columns. The pixels are so close together that they appear connected. The number of bits used to represent each pixel determines how many colors or shades of gray can be displayed. For example, in 8-bit color mode, the color monitor uses 8 bits for each pixel, making it possible to display 2^8 = 256 different colors or shades of gray.
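The relationship between bit depth and the number of displayable colors described above is simply a power of two; a quick sketch (in Python, though the report's own experiments use MATLAB):

```python
def displayable_colors(bits_per_pixel):
    """Number of distinct colors or gray shades for a given bit depth."""
    return 2 ** bits_per_pixel

print(displayable_colors(1))   # 2        (binary image)
print(displayable_colors(8))   # 256      (8-bit grayscale or indexed color)
print(displayable_colors(24))  # 16777216 (24-bit true-color RGB)
```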

Image types The different types of images are binary, indexed, intensity, and RGB image types.

Binary image An image containing only black and white pixels. In MATLAB, a binary image is represented by a uint8 or double logical matrix containing 0's and 1's (which usually represent black and white, respectively). A matrix is logical when its "logical flag" is turned "on." We often use the variable name BW to represent a binary image in memory.

Indexed image An image whose pixel values are direct indices into an RGB color map. In MATLAB, an indexed image is represented by an array of class uint8, uint16, or double. The color map is always an m-by-3 array of class double. We often use the variable name X to represent an indexed image in memory, and map to represent the color map.

Intensity image An image consisting of intensity (grayscale) values. In MATLAB, intensity images are represented by an array of class uint8, uint16, or double. While intensity images are not stored with color maps, MATLAB uses a system color map to display them. We often use the variable name I to represent an intensity image in memory. This term is synonymous with the term "grayscale."


Multiframe image An image file that contains more than one image, or frame. When in MATLAB memory, a multiframe image is a 4-D array where the fourth dimension specifies the frame number. This term is synonymous with the term "multipage image."

RGB image An image in which each pixel is specified by three values -- one each for the red, green, and blue components of the pixel's color. In MATLAB, an RGB image is represented by an m-by-n-by-3 array of class uint8, uint16, or double. We often use the variable name RGB to represent an RGB image in memory.

IMAGE DIGITIZATION An image captured by a sensor is expressed as a continuous function f(x,y) of two coordinates in the plane. Image digitization means that the function f(x,y) is sampled into a matrix with m rows and n columns. Image quantization then assigns to each continuous sample an integer value: the continuous range of the image function f(x,y) is split into k intervals. The finer the sampling (i.e. the larger m and n) and the quantization (the larger k), the better the approximation of the continuous image f(x,y).

SAMPLING AND QUANTIZATION To be suitable for computer processing, an image function must be digitized both spatially and in amplitude. Digitization of the spatial coordinates is called image sampling, and amplitude digitization is called gray-level quantization.

IMAGE PROCESSING The field of digital image processing refers to the processing of digital images by means of a digital computer. A digital image is an image f(x,y) that has been discretized both in spatial coordinates and in brightness. A digital image can be considered as a matrix whose row and column indices identify a point in the image, and whose corresponding matrix element value identifies the gray level at that point. The elements of such a digital array are called image elements, picture elements, pixels or pels, the last two being commonly used abbreviations of "picture elements". The term digital image processing generally refers to the processing of a two-dimensional picture by a digital computer; in a broader context it implies the digital processing of any two-dimensional data. In the form in which they usually occur, images are not directly amenable to computer analysis. Since computers work with numerical rather than pictorial data, an image must be converted to numerical form before processing. This conversion process is called "digitization". The image is divided into small regions called picture elements, or "pixels". At each pixel location the image brightness is sampled and quantized. This step generates an integer at each pixel representing the brightness or darkness of the image at that point. When this has been done for all pixels, the image is represented by a rectangular array of integers: each pixel location has an address, and an integer value called its "gray level". This array of digital data is now a candidate for computer processing.

APPLICATIONS OF DIGITAL IMAGE PROCESSING

1. Office automation: optical character recognition; document processing; cursive script recognition; logo and icon recognition.
2. Industrial automation: automated inspection systems; non-destructive testing; automatic assembly; processes related to VLSI manufacturing; PCB checking.
3. Robotics; oil and natural gas exploration; etc.
4. Bio-medical: ECG, EEG, EMG analysis; cytological, histological and stereological applications; automated radiology and pathology; X-ray image analysis; etc.
5. Remote sensing: natural resources survey and management; estimation related to agriculture, hydrology, forestry, mineralogy; urban planning; environment and pollution control; etc.
6. Criminology: fingerprint identification; human face registration and matching; forensic investigation; etc.
7. Astronomy and space applications: restoration of images suffering from geometric and photometric distortions; etc.
8. Information technology: facsimile image transmission; videotext; video conferencing and video phones; etc.
9. Entertainment and consumer electronics: HDTV; multimedia and video editing.
10. Military applications: missile guidance and detection; target identification; navigation of pilotless vehicles; reconnaissance; range finding; etc.
11. Printing and graphic arts: color fidelity in desktop publishing; art conservation and dissemination; etc.

IMAGE COMPRESSION

PRINCIPLES OF IMAGE COMPRESSION: An ordinary characteristic of most images is that neighboring pixels are correlated and therefore hold redundant information. The foremost task, then, is to find a less correlated representation of the image. The two elementary components of compression are redundancy reduction and irrelevancy reduction. Redundancy reduction aims at removing duplication from the source image. Irrelevancy reduction omits parts of the signal that are not noticed by the signal receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be identified:


(a) Spatial redundancy, or correlation between neighboring pixel values;
(b) Spectral redundancy, or correlation between different color planes or spectral bands; and
(c) Temporal redundancy, or correlation between adjacent frames in a sequence of images, especially in video applications.

Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible.

DATA COMPRESSION VERSUS BANDWIDTH The mere process of converting an analog signal into a digital signal results in increased bandwidth requirements for transmission. For example, a 5 MHz television signal sampled at the Nyquist rate with 8 bits per sample would require a bandwidth of 40 MHz when transmitted using a digital modulation scheme. Data compression seeks to minimize this cost, and sometimes to reduce the bandwidth of the digital signal below its analog bandwidth requirement.

Why do we need compression?
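The arithmetic behind the television example can be checked directly. A minimal sketch follows; note that the 2 bits/s per Hz modulation efficiency used to reach the 40 MHz figure is an assumption the text leaves implicit:

```python
# Digitizing a 5 MHz analog TV signal at the Nyquist rate, 8 bits per sample.
analog_bandwidth_hz = 5e6
sample_rate = 2 * analog_bandwidth_hz      # Nyquist rate: 10 Msamples/s
bits_per_sample = 8

bit_rate = sample_rate * bits_per_sample   # 80 Mbit/s of raw digital data

# Assuming a modulation scheme carrying 2 bits/s per Hz of channel bandwidth
# (an assumption, not stated in the text), the required bandwidth is:
digital_bandwidth_hz = bit_rate / 2

print(bit_rate / 1e6, "Mbit/s")            # 80.0 Mbit/s
print(digital_bandwidth_hz / 1e6, "MHz")   # 40.0 MHz -- 8x the analog 5 MHz
```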

The figures in Table 1 show the qualitative transition from simple text to full-motion video data, and the disk space needed to store such uncompressed data.

Table 1: Multimedia data types and uncompressed storage space required

Multimedia Data          | Size/Duration      | Bits/Pixel or Bits/Sample | Uncompressed Size
A page of text           | 11" x 8.5"         | varying resolution        | 16-32 Kbits
Telephone-quality speech | 1 sec              | 8 bps                     | 64 Kbits
Grayscale image          | 512 x 512          | 8 bpp                     | 2.1 Mbits
Color image              | 512 x 512          | 24 bpp                    | 6.29 Mbits
Medical image            | 2048 x 1680        | 12 bpp                    | 41.3 Mbits
SHD image                | 2048 x 2048        | 24 bpp                    | 100 Mbits
Full-motion video        | 640 x 480, 10 sec  | 24 bpp                    | 2.21 Gbits

The examples above clearly illustrate the need for large storage space for digital image, audio, and video data. So, at the present state of technology, the only solution is to compress these multimedia data before its storage and transmission, and decompress it at the receiver for play back.
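The uncompressed sizes in Table 1 follow directly from width x height x bits per pixel (times the frame count for video). A quick sketch, assuming the video row means 30 frames per second:

```python
def uncompressed_bits(width, height, bpp, frames=1):
    """Raw storage in bits for an image, or for a sequence of frames."""
    return width * height * bpp * frames

# Entries from Table 1:
grayscale = uncompressed_bits(512, 512, 8)    # 2,097,152 bits ~ 2.1 Mbits
color     = uncompressed_bits(512, 512, 24)   # 6,291,456 bits ~ 6.29 Mbits
video     = uncompressed_bits(640, 480, 24,
                              frames=30 * 10)  # 10 s at 30 fps (assumed rate)

print(grayscale / 1e6, "Mbits")  # ~2.1
print(color / 1e6, "Mbits")      # ~6.29
print(video / 1e9, "Gbits")      # ~2.21
```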

Framework of General Image Compression Method A typical lossy image compression system is shown in Fig. 3. It consists of three closely connected components namely (a) Source Encoder, (b) Quantizer and (c) Entropy Encoder. Compression is achieved by applying a linear transform in order to decorrelate the image data, quantizing the resulting transform coefficients and entropy coding the quantized values.

Fig. 3: A Typical Lossy Image Encoder

Source Encoder (Linear Transformer) A variety of linear transforms have been developed which include Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT) and many more, each with its own advantages and disadvantages.

Quantizer A quantizer is used to reduce the number of bits needed to store the transformed coefficients by reducing the precision of those values. As it is a many-to-one mapping, it is a lossy process and is the main source of compression in an encoder. Quantization can be performed on each individual coefficient, which is called Scalar Quantization (SQ), or on a group of coefficients together, which is known as Vector Quantization (VQ) [9]. Both uniform and non-uniform quantizers can be used, depending on the problem.
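A uniform scalar quantizer of the kind described here can be sketched in a few lines. This is a hypothetical illustration of the many-to-one mapping, not the report's actual quantizer:

```python
def quantize(coeff, step):
    """Uniform scalar quantization: map a coefficient to an integer index."""
    return round(coeff / step)

def dequantize(index, step):
    """Inverse mapping: recovers only an approximation of the coefficient."""
    return index * step

step = 10.0
coeffs = [3.2, -17.8, 44.1, 0.4]
indices = [quantize(c, step) for c in coeffs]       # [0, -2, 4, 0]
recovered = [dequantize(i, step) for i in indices]  # [0.0, -20.0, 40.0, 0.0]
# Many coefficients map to the same index (many-to-one), so the error
# (e.g. 44.1 -> 40.0) cannot be undone: this is the lossy step.
```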

Entropy Encoder


An entropy encoder further compresses the quantized values losslessly to provide better overall compression. It uses a model to estimate the probabilities of each quantized value and produces an appropriate code based on these probabilities, so that the resulting output code stream is smaller than the input stream. The most commonly used entropy encoders are the Huffman encoder and the arithmetic encoder, although for applications requiring fast execution, simple Run Length Encoding (RLE) is very effective [10]. It is important to note that a properly designed quantizer and entropy encoder are absolutely necessary, along with an optimum signal transformation, to get the best possible compression.
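Run Length Encoding, the simplest of the coders just mentioned, can be sketched as follows. This is a minimal illustration; a real encoder would operate on the quantized coefficient stream:

```python
def rle_encode(data):
    """Collapse runs of repeated symbols into (symbol, count) pairs."""
    runs = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([symbol, 1])  # start a new run
    return [(s, c) for s, c in runs]

def rle_decode(runs):
    """Expand (symbol, count) pairs back into the original stream."""
    out = []
    for symbol, count in runs:
        out.extend([symbol] * count)
    return out

# Quantization produces long runs of zeros, which RLE compresses well:
stream = [5, 0, 0, 0, 0, 0, 3, 3, 0, 0]
encoded = rle_encode(stream)            # [(5, 1), (0, 5), (3, 2), (0, 2)]
assert rle_decode(encoded) == stream    # lossless round trip
```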

1.2) What are the different types of compression?

(A) Lossless vs. Lossy compression:

There are different ways of classifying compression techniques; two of them are mentioned here. The first categorization is based on the information content of the reconstructed image: 'lossless compression' and 'lossy compression' schemes. In lossless compression, the reconstructed image after compression is numerically identical to the original image on a pixel-by-pixel basis. However, only a modest amount of compression is achievable with this technique. In lossy compression, on the other hand, the reconstructed image contains degradation relative to the original, because redundant information is discarded during compression. As a result, much higher compression is achievable, and under normal viewing conditions no visible loss is perceived (visually lossless).

(B) Predictive vs. Transform coding:

The second categorization of various coding schemes is based on the 'space' where the compression method is applied. These are 'predictive coding' and 'transform coding'. In predictive coding, information already sent or available is used to predict future values, and the difference is coded. Since this is done in the image or spatial domain, it is relatively simple to implement and is readily adapted to local image characteristics. Differential Pulse Code Modulation (DPCM) is one particular example of predictive coding.


Transform coding, also called block quantization, is an alternative to predictive coding. A block of data is unitarily transformed so that a large fraction of its total energy is packed into relatively few transform coefficients, which are quantized independently. The optimum transform coder is defined as the one that minimizes the mean square distortion of the reproduced data for a given number of total bits. In other words, transform coding first transforms the image from its spatial-domain representation to a different representation using one of the well-known transforms mentioned later, and then codes the transformed values (coefficients). Its primary advantage is that it provides greater data compression than predictive methods, although at the expense of greater computation.

OBJECTIVE This project aims to study and understand the general operations used to compress two-dimensional grayscale images, and to develop an application that allows compression and reconstruction to be carried out on images. The application developed aims to achieve:

1. Minimum distortion
2. High compression ratio
3. Fast compression time

To compress an image, the operations include a linear transform, quantization and entropy encoding. The thesis studies the wavelet and cosine transforms and discusses the superior features they have over the Fourier transform. This helps to show how quantization reduces the volume of image data before it is packed efficiently in the entropy coding operation. To reconstruct the image, an inverse operation is performed at every stage of the system, in the reverse order of the image decomposition.

DATA REDUNDANCY Data redundancy is the central issue in digital image compression. It is a mathematically quantifiable entity. If n1 and n2 represent the number of information-carrying units in two data sets that represent the same information, the relative data redundancy Rd of the first data set can be defined as

Rd = 1 - 1/Cr

where Cr, commonly called the compression ratio, is

Cr = n1/n2

For the case n2 = n1, Cr = 1 and Rd = 0, indicating that the first representation contains no redundant data.


When n2 << n1, Cr → ∞ and Rd → 1, implying significant compression and highly redundant data. In the other case, n2 >> n1, Cr → 0 and Rd → −∞, indicating that the second data set contains much more data than the original representation.

COMPRESSION RATIO The degree of data reduction achieved by the compression process is known as the compression ratio. This ratio measures the quantity of compressed data:

Compression ratio (C.R.) = (length of original data string) / (length of compressed data string)

The higher the C.R., the more efficient the compression technique employed, and vice versa.
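The relationship between the compression ratio and the relative redundancy defined above can be verified numerically (the 1 Mbit example below is illustrative, not from the report):

```python
def compression_ratio(n1, n2):
    """Cr = n1 / n2 for original size n1 and compressed size n2 (in bits)."""
    return n1 / n2

def relative_redundancy(cr):
    """Rd = 1 - 1/Cr."""
    return 1 - 1 / cr

cr = compression_ratio(1_048_576, 262_144)  # 1 Mbit compressed to 256 Kbits
rd = relative_redundancy(cr)
print(cr)  # 4.0  -> usually written "4:1"
print(rd)  # 0.75 -> 75% of the original data was redundant

# Boundary case from the text: n2 = n1 gives Cr = 1 and Rd = 0.
assert relative_redundancy(compression_ratio(100, 100)) == 0.0
```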


CONTENTS

1.Introduction

1.1 Image compression

1.2 Types of compression

1.3 Advantages

1.4 Applications

2. Compression techniques
2.1 DCT
2.2 DWT
3. Wavelets
4.


Introducing Wavelets

The fundamental idea behind wavelets is to analyse a signal according to scale. The wavelet analysis procedure is to adopt a wavelet prototype function, called an analysing wavelet or mother wavelet. Any signal can then be represented by translated and scaled versions of the mother wavelet. Wavelet analysis is capable of revealing aspects of data that other signal analysis techniques such as Fourier analysis miss: trends, breakdown points, discontinuities in higher derivatives, and self-similarity. Furthermore, because it affords a different view of data than that presented by traditional techniques, it can compress or de-noise a signal without appreciable degradation.

Definition of wavelet

There are a number of ways of defining a wavelet (or a wavelet family).

Scaling filter

An orthogonal wavelet is entirely defined by the scaling filter - a low-pass finite impulse response (FIR) filter of length 2N and sum 1. In biorthogonal wavelets, separate decomposition and reconstruction filters are defined.


For analysis with orthogonal wavelets, the high-pass filter is calculated as the quadrature mirror filter of the low-pass filter, and the reconstruction filters are the time reverse of the decomposition filters. Daubechies and Symlet wavelets can be defined by the scaling filter.
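The quadrature-mirror relation just described, high-pass from low-pass by reversing the filter and alternating signs, with reconstruction filters as time reverses, can be sketched for the Haar scaling filter. This is a minimal illustration; normalization conventions for the filters vary between texts:

```python
def qmf_highpass(lowpass):
    """High-pass decomposition filter as the quadrature mirror of the
    low-pass filter: reverse the coefficients and alternate the signs."""
    n = len(lowpass)
    return [(-1) ** k * lowpass[n - 1 - k] for k in range(n)]

# Haar scaling (low-pass) filter: length 2N = 2, coefficients summing to 1.
low = [0.5, 0.5]
high = qmf_highpass(low)

# Reconstruction filters are the time reverse of the decomposition filters:
low_rec = low[::-1]
high_rec = high[::-1]

print(high)  # [0.5, -0.5] -- a difference operator, as expected for Haar
```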

Scaling function

Wavelets are defined by the wavelet function ψ(t) (i.e. the mother wavelet) and the scaling function φ(t) (also called the father wavelet) in the time domain. The wavelet function is in effect a band-pass filter, and scaling it for each level halves its bandwidth. This creates the problem that, in order to cover the entire spectrum, an infinite number of levels would be required. The scaling function filters the lowest level of the transform and ensures that the whole spectrum is covered. For a wavelet with compact support, φ(t) can be considered finite in length and is equivalent to the scaling filter g. Meyer wavelets can be defined by a scaling function.

Wavelet function

Some wavelets have only a time-domain representation, as the wavelet function ψ(t). For instance, Mexican hat wavelets can be defined by a wavelet function alone; the same holds for a number of other continuous wavelets.

Classification of wavelets


Wavelet transforms are classified into discrete wavelet transforms (DWTs) and continuous wavelet transforms (CWTs). Note that both DWT and CWT are continuous-time (analog) transforms. They can be used to represent continuous-time (analog) signals. CWTs operate over every possible scale and translation whereas DWTs use a specific subset of scale and translation values or representation grid.

List of wavelets

Discrete wavelets:
Beylkin (18)
BNC wavelets
Coiflet (6, 12, 18, 24, 30)
Cohen-Daubechies-Feauveau wavelet (sometimes referred to as CDF N/P or Daubechies biorthogonal wavelets)
Daubechies wavelet (2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
Binomial-QMF (also referred to as Daubechies wavelet)
Haar wavelet
Mathieu wavelet
Legendre wavelet
Villasenor wavelet

Continuous wavelets, real valued:
Beta wavelet
Hermitian wavelet
Hermitian hat wavelet
Mexican hat wavelet
Shannon wavelet

Continuous wavelets, complex valued:
Complex Mexican hat wavelet
Morlet wavelet
Shannon wavelet
Modified Morlet wavelet

Wavelet vs. Fourier analysis

SIMILARITIES BETWEEN FOURIER AND WAVELET TRANSFORM

The fast Fourier transform (FFT) and the discrete wavelet transform (DWT) are both linear operations that generate a data structure containing log2 n segments of various lengths, usually filling it and transforming it into a different data vector of length 2^n.

The mathematical properties of the matrices involved in the transforms are similar as well. The inverse transform matrix for both the FFT and the DWT is the transpose of the original. As a result, both transforms can be viewed as a rotation in function space to a different domain. For the FFT, this new domain contains basis functions that are sines and cosines. For the wavelet transform, this new domain contains more complicated basis functions called wavelets, mother wavelets, or analyzing wavelets.

Both transforms have another similarity. The basis functions are localized in frequency, making mathematical tools such as power spectra (how much power is contained in a frequency interval) and scalegrams (to be defined later) useful at picking out frequencies and calculating power distributions.

General Concepts In the well-known Fourier analysis, a signal is broken down into constituent sinusoids of different frequencies. These sines and cosines (essentially complex exponentials) are the basis functions and the elements of Fourier synthesis. Taking the Fourier transform of a signal can be viewed as a rotation in the function space of the signal from the time domain to the frequency domain. Similarly, the wavelet transform can be viewed as transforming the signal from the time domain to the wavelet domain. This new domain contains more complicated basis functions called wavelets, mother wavelets or analyzing wavelets.

Mathematically, the process of Fourier analysis is represented by the Fourier transform:

F(ω) = ∫ f(t) e^(−jωt) dt

which is the sum over all time of the signal f(t) multiplied by a complex exponential. The results of the transform are the Fourier coefficients F(ω), which, when multiplied by a sinusoid of frequency ω, yield the constituent sinusoidal components of the original signal.

A wavelet prototype function at a scale s and a spatial displacement u is defined as:

ψ_{s,u}(t) = (1/√s) ψ((t − u)/s)


Replacing the complex exponential in Equation 2.1 with this function yields the continuous wavelet transform (CWT):

C(s, u) = ∫ f(t) ψ_{s,u}(t) dt

which is the sum over all time of the signal multiplied by scaled and shifted versions of the wavelet function ψ. The results of the CWT are many wavelet coefficients C, which are a function of scale and position. Multiplying each coefficient by the appropriately scaled and shifted wavelet yields the constituent wavelets of the original signal. The basis functions in both Fourier and wavelet analysis are localized in frequency, making mathematical tools such as power spectra (power in a frequency interval) useful at picking out frequencies and calculating power distributions.

The most important difference between these two kinds of transforms is that individual wavelet functions are localised in space; in contrast, Fourier sine and cosine functions are non-local and are active for all time t. This localisation in space, along with the wavelets' localisation in frequency, makes many functions and operators "sparse" when transformed into the wavelet domain. This sparseness, in turn, results in a number of useful applications such as data compression, detecting features in images, and de-noising signals.


2.2.2 Time-Frequency Resolution

A major drawback of Fourier analysis is that, in transforming to the frequency domain, the time-domain information is lost. When looking at the Fourier transform of a signal, it is impossible to tell when a particular event took place. In an effort to correct this deficiency, Dennis Gabor (1946) adapted the Fourier transform to analyse only a small section of the signal at a time, a technique called windowing the signal [14]. Gabor's adaptation, called the Windowed Fourier Transform (WFT), gives information about signals simultaneously in the time domain and in the frequency domain.

To illustrate the time-frequency resolution differences between the Fourier transform and the wavelet transform consider the following figures.

Figure 2.1 shows a windowed Fourier transform, where the window is simply a square wave. The square wave window truncates the sine or cosine function to fit a window of a particular width. Because a single window is used for all frequencies in the WFT, the resolution of the analysis is the same at all locations in the time-frequency plane. An advantage of wavelet transforms is that the windows vary. Wavelet analysis allows the use of long time intervals where we want more precise low-frequency information, and shorter regions where we want high-frequency information. A way to achieve this is to have short high-frequency basis functions and long low-frequency ones.

Figure 2.2 shows a time-scale view used in wavelet analysis, rather than a time-frequency region. Scale is inversely related to frequency: a low-scale, compressed wavelet with rapidly changing details corresponds to a high frequency, while a high-scale, stretched wavelet that changes slowly corresponds to a low frequency.

2.3 Examples of Wavelets

The figure below illustrates four different types of wavelet basis functions.


The different families make trade-offs between how compactly the basis functions are localized in space and how smooth they are. Within each family of wavelets (such as the Daubechies family) are wavelet subclasses distinguished by the number of filter coefficients and the level of iteration. Wavelets are most often classified within a family by the number of vanishing moments. This is an extra set of mathematical relationships for the coefficients that must be satisfied. The extent of compactness of signals depends on the number of vanishing moments of the wavelet function used. A more detailed discussion is provided in the next section.


The Discrete Wavelet Transform:

The Discrete Wavelet Transform (DWT) involves choosing scales and positions based on powers of two, the so-called dyadic scales and positions. The mother wavelet is rescaled or dilated by powers of two and translated by integers. Specifically, a function f(t) in L²(R) (the space of square-integrable functions) can be represented as a sum of scaling functions and wavelets:

f(t) = Σ_k c(k) φ(t − k) + Σ_j Σ_k d(j, k) ψ(2^j t − k)

The function ψ(t) is known as the mother wavelet, while φ(t) is known as the scaling function.

Applications of the Discrete Wavelet Transform

Generally, an approximation to the DWT is used for data compression if the signal is already sampled, and the CWT for signal analysis. Thus, the DWT approximation is commonly used in engineering and computer science, and the CWT in scientific research. Wavelet transforms are now being adopted for a vast number of applications, often replacing the conventional Fourier transform. Many areas of physics have seen this paradigm shift, including molecular dynamics, ab initio calculations, astrophysics, density-matrix localisation, seismic geophysics, optics, turbulence and quantum mechanics. This change has also occurred in image processing, blood-pressure, heart-rate and ECG analyses, DNA analysis, protein analysis, climatology, general signal processing, speech recognition, computer graphics and


multifractal analysis. In computer vision and image processing, the notion of scale-space representation and Gaussian derivative operators is regarded as a canonical multi-scale representation.

One use of wavelet approximation is in data compression. Like some other transforms, wavelet transforms can be used to transform data and then encode the transformed data, resulting in effective compression. For example, JPEG 2000 is an image compression standard that uses biorthogonal wavelets. This means that although the frame is overcomplete, it is a tight frame (see types of frame of a vector space), and the same frame functions (except for conjugation in the case of complex wavelets) are used for both analysis and synthesis, i.e., in both the forward and inverse transform. For details see wavelet compression.

A related use is smoothing/denoising data based on wavelet coefficient thresholding, also called wavelet shrinkage. By adaptively thresholding the wavelet coefficients that correspond to undesired frequency components, smoothing and/or denoising operations can be performed.

Wavelet transforms are also starting to be used for communication applications. Wavelet OFDM is the basic modulation scheme used in HD-PLC (a powerline communications technology developed by Panasonic), and in one of the optional modes included in the IEEE P1901 draft standard. The advantage of Wavelet OFDM over traditional FFT OFDM systems is that Wavelet OFDM can achieve deeper notches and that it does not require a guard interval (which usually represents significant overhead in FFT OFDM systems) [2].

IMAGE COMPRESSION USING DISCRETE COSINE TRANSFORMS:


In today's technological world, as our use of and reliance on computers continues to grow, so too does our need for efficient ways of storing large amounts of data. Due to bandwidth and storage limitations, images must be compressed before transmission and storage.

However, compression reduces image fidelity, especially when images are compressed at lower bit rates. The reconstructed images suffer from blocking artifacts, and the image quality is severely degraded at high compression ratios. In order to obtain a good compression ratio without losing too much information when the image is decompressed, we use the DCT. A discrete cosine transform (DCT) expresses a sequence of finitely many data points in terms of a sum of cosine functions oscillating at different frequencies. The JPEG process is a widely used form of lossy image compression that centers on the Discrete Cosine Transform. DCT and Fourier transforms convert images from the spatial domain to the frequency domain to decorrelate pixels. The DCT transformation is reversible.

The DCT works by separating images into parts of differing frequencies. During a step called quantization, where part of the compression actually occurs, the less important frequencies are discarded, hence the use of the term "lossy". Then, only the most important frequencies that remain are used to retrieve the image in the decompression process. As a result, reconstructed images contain some distortion; but as we shall soon see, these levels of distortion can be adjusted during the compression stage. The JPEG method is used for both color and black-and-white images.


THE JPEG PROCESS:

The following is a general overview of the JPEG process. JPEG stands for Joint Photographic Experts Group, a commonly used method of compression for photographic images. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality. The process can be summarized in the following steps:

1. The image is broken into 8x8 blocks of pixels.

2. Working from left to right, top to bottom, the DCT is applied to each block.

3. Each block is compressed through quantization.

4. The array of compressed blocks that constitute the image is stored in a drastically reduced amount of space.

5. When desired, the image is reconstructed through decompression, a process that uses the Inverse Discrete Cosine Transform (IDCT).

THE DISCRETE COSINE TRANSFORM:

Like other transforms, the Discrete Cosine Transform (DCT) attempts to decorrelate the image data. After decorrelation each transform coefficient can be encoded independently without


losing compression efficiency. This section describes the DCT and some of its important properties.

1) The One-Dimensional DCT:

The most common DCT definition of a 1-D sequence f(x) of length N is

C(u) = α(u) Σ_{x=0}^{N−1} f(x) cos[ (2x+1)uπ / 2N ]

for u = 0, 1, 2, …, N−1. Similarly, the inverse transformation is defined as

f(x) = Σ_{u=0}^{N−1} α(u) C(u) cos[ (2x+1)uπ / 2N ]

for x = 0, 1, 2, …, N−1. In both equations above, α(u) is defined as

α(u) = √(1/N) for u = 0, and α(u) = √(2/N) for u ≠ 0.

It is clear from the first equation that, for u = 0,

C(0) = √(1/N) Σ_{x=0}^{N−1} f(x).


Thus, the first transform coefficient is (up to the factor √N) the average value of the sample sequence. In the literature, this value is referred to as the DC coefficient. All other transform coefficients are called the AC coefficients.
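The 1-D DCT and its DC coefficient can be illustrated with a short sketch (the function name and sample values below are ours, for illustration only):

```python
import math

def dct_1d(f):
    """1-D DCT: C(u) = alpha(u) * sum_x f(x) * cos((2x + 1) * u * pi / (2N))."""
    N = len(f)

    def alpha(u):
        return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

    return [alpha(u) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                           for x in range(N))
            for u in range(N)]

# For u = 0 the cosine factor is 1, so C(0) = sqrt(1/N) * sum(f):
# the DC coefficient is proportional to the average of the samples.
samples = [8, 16, 24, 32]
coeffs = dct_1d(samples)
```

For a constant sequence every AC coefficient vanishes, which is exactly the energy-compaction behaviour discussed later.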

2) The Two-Dimensional DCT:

The Discrete Cosine Transform (DCT) is one of many transforms that take their input and express it as a linear combination of weighted basis functions; these basis functions are cosines of increasing frequency. The 2-D Discrete Cosine Transform is just a one-dimensional DCT applied twice, once in the x direction and again in the y direction. One can imagine the computational complexity of doing so for a large image; fast algorithms, analogous to the Fast Fourier Transform (FFT), have therefore been created to speed up the computation. The DCT equation (Eq. 1) computes the (i, j)th entry of the DCT of an image:

D(i, j) = α(i) α(j) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} p(x, y) cos[ (2x+1)iπ / 2N ] cos[ (2y+1)jπ / 2N ]

Here p(x, y) is the (x, y)th element of the image represented by the matrix p, and N is the size of the block on which the DCT is performed. The equation calculates one entry, D(i, j), of the transformed image from the pixel values of the original image matrix. For the standard 8x8 block that JPEG compression uses, N equals 8 and x and y range from 0 to 7; D(i, j) would then be as in Equation (3).

Because the DCT uses cosine functions, the resulting matrix depends on the horizontal and vertical frequencies. Therefore an image block with a lot of change in frequency has a very random-looking resulting matrix, while an image matrix of just one color has a resulting matrix with a large value for the first element and zeroes for the other elements.

COMPRESSION

Block Diagram:

The input is an image whose data consists of pixels. A greyscale image of resolution 255x255, for instance, consists of 65,025 pixel values. An 8x8 DCT matrix is considered here.

THE DCT MATRIX: To get the matrix form of Equation (1), we will use the following equation:

T(i, j) = 1/√N for i = 0;  T(i, j) = √(2/N) cos[ (2j+1)iπ / 2N ] for i > 0.


For an 8x8 block this yields the matrix T. The first row (i = 0) of the matrix has all entries equal to 1/√8, as expected from Equation (4). The columns of T form an orthonormal set, so T is an orthogonal matrix. When computing the inverse DCT, the orthogonality of T is important: the inverse of T is its transpose T′, which is easy to calculate.
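The matrix form can be checked numerically; the sketch below (our naming) builds T from the equation above and verifies that T times its transpose is the identity:

```python
import math

def dct_matrix(n):
    """DCT matrix T: T[0][j] = 1/sqrt(n); T[i][j] = sqrt(2/n)*cos((2j+1)*i*pi/(2n))."""
    return [[1.0 / math.sqrt(n) if i == 0
             else math.sqrt(2.0 / n) * math.cos((2 * j + 1) * i * math.pi / (2 * n))
             for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

T = dct_matrix(8)
Tt = [list(col) for col in zip(*T)]   # the transpose, which is also the inverse
identity = matmul(T, Tt)              # should be (numerically) the 8x8 identity
```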

DCT ON AN 8x8 BLOCK:

Before we begin, it should be noted that the pixel values of a black-and-white image range from 0 to 255 in steps of 1, where pure black is represented by 0 and pure white by 255. Thus it can be seen how a photo, illustration, etc. can be accurately represented by these 256 shades of gray. Since an image comprises hundreds or even thousands of 8x8 blocks of pixels, the following description of what happens to one 8x8 block is a microcosm of the JPEG process; what is done to one block of image pixels is done to all of them, in the order earlier specified. Now, let's start with a block of image pixel values. This particular block was chosen from the very upper-left-hand corner of an image.

Because the DCT is designed to work on pixel values ranging from -128 to 127, the original block is "leveled off" by subtracting 128 from each entry. This results in the following matrix.


We are now ready to perform the Discrete Cosine Transform, which is accomplished by matrix multiplication.

D = TMT′    (5)

In Equation (5) matrix M is first multiplied on the left by the DCT matrix T from the previous section; this transforms the rows. The columns are then transformed by multiplying on the right by the transpose of the DCT matrix. This yields the following matrix.

This block matrix now consists of 64 DCT coefficients, c(i, j), where i and j range from 0 to 7. The top-left coefficient, c(0, 0), correlates to the low frequencies of the original image block. As we move away from c(0, 0) in all directions, the DCT coefficients correlate to higher and higher frequencies of the image block, where c(7, 7) corresponds to the highest frequency. Coefficients for higher frequencies typically have smaller magnitudes, and those for lower frequencies larger magnitudes. It is important to note that the human eye is most sensitive to lower frequencies.

QUANTIZATION:


Our 8x8 block of DCT coefficients is now ready for compression by quantization. A remarkable and highly useful feature of the JPEG process is that in this step, varying levels of image compression and quality are obtainable through selection of specific quantization matrices. This enables the user to decide on quality levels ranging from 1 to 100, where 1 gives the poorest image quality and highest compression, while 100 gives the best quality and lowest compression. As a result, the quality/compression ratio can be tailored to suit different needs.

Subjective experiments involving the human visual system have resulted in the JPEG standard quantization matrix. With a quality level of 50, this matrix renders both high compression and excellent decompressed image quality.

If, however, another level of quality and compression is desired, scalar multiples of the JPEG standard quantization matrix may be used. For a quality level greater than 50 (less compression, higher image quality), the standard quantization matrix is multiplied by (100-quality level)/50. For a quality level less than 50 (more compression, lower image quality), the standard quantization matrix is multiplied by


50/quality level. The scaled quantization matrix is then rounded and clipped to have positive integer values ranging from 1 to 255. For example, the following quantization matrices yield quality levels of 10 and 90.

Quantization is achieved by dividing each element in the transformed image matrix D by the corresponding element in the quantization matrix, and then rounding to the nearest integer value. For the following step, quantization matrix Q50 is used.
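The scaling rule and the quantization step can be sketched as follows; Q50 is the JPEG standard luminance quantization matrix, and the function names are ours:

```python
Q50 = [[16, 11, 10, 16, 24, 40, 51, 61],
       [12, 12, 14, 19, 26, 58, 60, 55],
       [14, 13, 16, 24, 40, 57, 69, 56],
       [14, 17, 22, 29, 51, 87, 80, 62],
       [18, 22, 37, 56, 68, 109, 103, 77],
       [24, 35, 55, 64, 81, 104, 113, 92],
       [49, 64, 78, 87, 103, 121, 120, 101],
       [72, 92, 95, 98, 112, 100, 103, 99]]

def scaled_quantization_matrix(quality):
    """Scale Q50 as in the text: factor (100 - quality)/50 above 50,
    50/quality below 50; entries rounded and clipped to 1..255."""
    if quality == 50:
        scale = 1.0
    elif quality > 50:
        scale = (100 - quality) / 50.0
    else:
        scale = 50.0 / quality
    return [[min(255, max(1, round(v * scale))) for v in row] for row in Q50]

def quantize(D, Q):
    """C(i, j) = round(D(i, j) / Q(i, j))."""
    return [[round(d / q) for d, q in zip(drow, qrow)] for drow, qrow in zip(D, Q)]
```

Note that Q10 is simply 5 times Q50 and Q90 is one fifth of it, matching the quality levels mentioned in the text.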


Recall that the coefficients situated near the upper-left corner correspond to the lower frequencies of the image block, to which the human eye is most sensitive. In addition, the zeros represent the less important, higher frequencies that have been discarded, giving rise to the lossy part of compression. As mentioned earlier, only the remaining nonzero coefficients will be used to reconstruct the image. It is also interesting to note the effect of different quantization matrices; use of Q10 would give C significantly more zeros, while Q90 would result in very few zeros.

CODING:

The quantized matrix C is now ready for the final step of compression. Before storage, all coefficients of C are converted by an encoder to a stream of binary data (01101011...). In-depth


coverage of the coding process is beyond the scope of this article. However, we can point out one key aspect that the reader is sure to appreciate. After quantization, it is quite common for most of the coefficients to equal zero. JPEG takes advantage of this by encoding quantized coefficients in the zig-zag sequence shown in the figure below. The advantage lies in the consolidation of relatively large runs of zeros, which compress very well. The sequence shown in Figure 1 (for a 4x4 block) continues analogously for the entire 8x8 block.
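The zig-zag ordering can be generated programmatically; the sketch below (our naming) walks the anti-diagonals of an n x n block, reversing direction on alternate diagonals:

```python
def zigzag_order(n):
    """(row, col) visiting order of the JPEG zig-zag scan for an n x n block."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col indexes the anti-diagonals
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()                 # even diagonals are walked upward
        order.extend(diagonal)
    return order

def zigzag_scan(block):
    """Flatten a square block in zig-zag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]
```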

DECOMPRESSION: Block diagram:


Reconstruction of our image begins by decoding the bit stream representing the quantized matrix C. Each element of C is then multiplied by the corresponding element of the quantization matrix originally used: R(i, j) = Q(i, j) × C(i, j).

The IDCT is next applied to matrix R, and the result is rounded to the nearest integer. Finally, 128 is added to each element of that result, giving us the decompressed JPEG version N of our original 8x8 image block M: N = round(T′RT) + 128.
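The whole compress/decompress cycle for one block can be sketched with NumPy; the flat quantization matrix and the smooth input block below are made up for illustration, not taken from the report:

```python
import numpy as np

N = 8
k = np.arange(N)
T = np.sqrt(2.0 / N) * np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * N))
T[0, :] = 1.0 / np.sqrt(N)                # the DCT matrix from the earlier section

Q = np.full((N, N), 16.0)                 # flat quantization matrix, illustration only
M = np.tile(np.arange(N) * 8.0, (N, 1))   # hypothetical smooth block, values 0..56

D = T @ (M - 128) @ T.T                   # forward:      D = T (M - 128) T'
C = np.round(D / Q)                       # quantization
R = Q * C                                 # dequantize:   R(i,j) = Q(i,j) x C(i,j)
Nrec = np.round(T.T @ R @ T) + 128        # reconstruct:  N = round(T' R T) + 128
```

Because the block is smooth, the reconstruction error stays small even though most coefficients quantize to zero.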

PROPERTIES OF DCT: Some properties of the DCT are of particular value to image processing applications: a) Decorrelation: The principal advantage of image transformation is the removal of redundancy between


neighboring pixels. This leads to uncorrelated transform coefficients which can be encoded independently. It can be inferred that DCT exhibits excellent decorrelation properties.

b)Energy Compaction: Efficacy of a transformation scheme can be directly gauged by its ability to pack input data into as few coefficients as possible. This allows the quantizer to discard coefficients with relatively small amplitudes without introducing visual distortion in the reconstructed image. DCT exhibits excellent energy compaction for highly correlated images.

c) Separability: The DCT transform equation can be expressed as

D(i, j) = α(i) Σ_{x=0}^{N−1} cos[ (2x+1)iπ / 2N ] { α(j) Σ_{y=0}^{N−1} p(x, y) cos[ (2y+1)jπ / 2N ] }

This property, known as separability, has the principal advantage that D(i, j) can be computed in two steps by successive 1-D operations on the rows and columns of an image. The same argument applies to the inverse DCT computation.


d) Symmetry: Another look at the row and column operations in the above equation reveals that these operations are functionally identical. Such a transformation is called a symmetric transformation. A separable and symmetric transform can be expressed in the form D = TMT′, where M is the N×N image block and T is an N×N symmetric transformation matrix. This is an extremely useful property, since it implies that the transformation matrix can be precomputed offline and then applied to the image, thereby providing an orders-of-magnitude improvement in computational efficiency.


COMPARISON OF MATRICES: Let us now see how the JPEG version of our original pixel block compares with the original.


CONCLUSION:

If we look at the above two matrices, this is a remarkable result, considering that nearly 70% of the DCT coefficients were discarded prior to image block decompression/reconstruction. Given that similar results will occur with the rest of the blocks that constitute the entire image, it should be no surprise that the JPEG image will be scarcely distinguishable from the original. Remember, there are 256 possible shades of gray in a black-and-white picture, and a difference of, say, 10 is barely noticeable to the human eye. DCT takes advantage of redundancies in the data by grouping pixels with similar frequencies together. Moreover, when the resolution of the image is very high, even after substantial compression and decompression there is very little change between the original and decompressed image. Thus, we can also conclude that at the same compression ratio, the difference between the original and decompressed image decreases as the image resolution increases.

Disadvantages of DCT:

1. Only the spatial correlation of the pixels inside a single 2-D block is considered; the correlation with pixels of neighboring blocks is neglected.

2. It is impossible to completely decorrelate the blocks at their boundaries using the DCT.

3. Undesirable blocking artifacts affect the reconstructed images or video frames at high compression ratios (very low bit rates).


DISCRETE WAVELET TRANSFORM

Application to image compression

This is a picture of a famous mathematician, Emmy Noether, compressed in different ways.

Introduction: When retrieved from the Internet, digital images take a considerable amount of time to download and use a large amount of computer memory. The Haar wavelet transform that we will discuss in this application is one way of compressing digital images so that they take less space when stored and transmitted. As we will see later, the word "wavelet" stands for an orthogonal basis of a certain vector space.


The basic idea behind this method of compression is to treat a digital image as an array of numbers, i.e., a matrix. Each image consists of a fairly large number of little squares called pixels (picture elements). The matrix corresponding to a digital image assigns a whole number to each pixel. For example, in the case of a 256x256 pixel gray-scale image, the image is stored as a 256x256 matrix, with each element of the matrix being a whole number ranging from 0 (for black) to 255 (for white). The JPEG compression technique divides an image into 8x8 blocks and assigns a matrix to each block. One can use some linear algebra techniques to maximize compression of the image while maintaining a suitable level of detail.


Vector transform using Haar wavelets: Before we explain the transform of a matrix, let us see how the wavelets transform vectors (rows of a matrix). Suppose

r = (420, 680, 448, 708, 1260, 1420, 1600, 1600)

is one row of an 8x8 image matrix. In general, if the data string has length 2^k, then the transformation process will consist of k steps. In the above case, there will be 3 steps, since 8 = 2³.

We perform the following operations on the entries of the vector r:

1. Divide the entries of r into four pairs: (420, 680), (448, 708), (1260, 1420), (1600, 1600).

2. Form the average of each of these pairs:

(420 + 680)/2 = 550,   (448 + 708)/2 = 578,   (1260 + 1420)/2 = 1340,   (1600 + 1600)/2 = 1600.


These will form the first four entries of the next step vector r1.

3. Subtract each average from the first entry of the pair to get the numbers:

-130, -130, -80, 0.

These will form the last four entries of the next step vector r1.

4. Form the new vector:

r1 = (550, 578, 1340, 1600, -130, -130, -80, 0).

Note that the vector r1 can be obtained from r by multiplying r on the right by the matrix:

W1 =

[ 1/2   0    0    0   1/2    0    0    0  ]
[ 1/2   0    0    0  -1/2    0    0    0  ]
[  0   1/2   0    0    0    1/2   0    0  ]
[  0   1/2   0    0    0   -1/2   0    0  ]
[  0    0   1/2   0    0     0   1/2   0  ]
[  0    0   1/2   0    0     0  -1/2   0  ]
[  0    0    0   1/2   0     0    0   1/2 ]
[  0    0    0   1/2   0     0    0  -1/2 ]

The first four coefficients of r1 are called the approximation coefficients and the last four entries are called the detail coefficients.

For our next step, we look at the first four entries of r1 as two pairs whose averages we take as in step 1 above. This gives the first two entries of the new vector r2: 564 and 1470. These are our new approximation coefficients. The third and fourth entries of r2 are obtained by subtracting these averages from the first element of each pair. This results in the new detail coefficients: -14, -130. The last four entries of r2 are the same as the detail coefficients of r1:

r2 = (564, 1470, -14, -130, -130, -130, -80, 0).


Here the vector r2 can be obtained from r1 by multiplying r1 on the right by the matrix:

W2 =

[ 1/2   0   1/2    0   0  0  0  0 ]
[ 1/2   0  -1/2    0   0  0  0  0 ]
[  0   1/2   0    1/2  0  0  0  0 ]
[  0   1/2   0   -1/2  0  0  0  0 ]
[  0    0    0     0   1  0  0  0 ]
[  0    0    0     0   0  1  0  0 ]
[  0    0    0     0   0  0  1  0 ]
[  0    0    0     0   0  0  0  1 ]

For the last step, average the first two entries of r2, and as before subtract the answer from the first entry. This results in the following vector:

r3 = (1017, -453, -14, -130, -130, -130, -80, 0).


As before, r3 can be obtained from r2 by multiplying r2 on the right by the matrix:

W3 =

[ 1/2   1/2  0  0  0  0  0  0 ]
[ 1/2  -1/2  0  0  0  0  0  0 ]
[  0     0   1  0  0  0  0  0 ]
[  0     0   0  1  0  0  0  0 ]
[  0     0   0  0  1  0  0  0 ]
[  0     0   0  0  0  1  0  0 ]
[  0     0   0  0  0  0  1  0 ]
[  0     0   0  0  0  0  0  1 ]

As a consequence, one obtains r3 immediately from r using the following equation:

r3 = r W1 W2 W3.
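The three averaging/differencing steps above can be sketched as follows (the helper name is ours; the row vector is the one from the example):

```python
def haar_step(values):
    """One averaging/differencing pass: pairwise averages first,
    then the (first entry - average) detail coefficients."""
    averages = [(a + b) / 2 for a, b in zip(values[0::2], values[1::2])]
    details = [a - avg for a, avg in zip(values[0::2], averages)]
    return averages + details

r = [420, 680, 448, 708, 1260, 1420, 1600, 1600]
r1 = haar_step(r)                     # equals (550, 578, 1340, 1600, -130, -130, -80, 0)
r2 = haar_step(r1[:4]) + r1[4:]       # transform only the approximation half
r3 = haar_step(r2[:2]) + r2[2:]       # equals (1017, -453, -14, -130, -130, -130, -80, 0)
```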

Let


W = W1 W2 W3 =

[ 1/8   1/8   1/4    0    1/2    0     0     0  ]
[ 1/8   1/8   1/4    0   -1/2    0     0     0  ]
[ 1/8   1/8  -1/4    0     0    1/2    0     0  ]
[ 1/8   1/8  -1/4    0     0   -1/2    0     0  ]
[ 1/8  -1/8    0    1/4    0     0    1/2    0  ]
[ 1/8  -1/8    0    1/4    0     0   -1/2    0  ]
[ 1/8  -1/8    0   -1/4    0     0     0    1/2 ]
[ 1/8  -1/8    0   -1/4    0     0     0   -1/2 ]

Note the following:

The columns of the matrix W1 form an orthogonal subset of R8 (the vector space of dimension 8 over R); that is, these columns are pairwise orthogonal (try their dot products). Therefore, they form a basis of R8. As a consequence, W1 is invertible. The same is true for W2 and W3.

As a product of invertible matrices, W is also invertible and its columns form an orthogonal basis of R8. The inverse of W is given by:

W^(-1) = W3^(-1) W2^(-1) W1^(-1)

The fact that W is invertible allows us to retrieve our image from the compressed form using the relation

r = r3 W^(-1).

Suppose that A is the matrix corresponding to a certain image. The Haar transform is carried out by performing the above operations on each row of the matrix A and then by repeating the same operations on the columns of the resulting matrix. The row-transformed matrix is AW. Transforming the columns of AW is accomplished by multiplying AW on the left by the matrix W^T (the transpose of W). Thus, the Haar transform takes the matrix A and stores it as W^T A W. Let S denote the transformed matrix:

S = W^T A W.

Using the properties of the inverse matrix, we can retrieve our original matrix:

A = (W^T)^(-1) S W^(-1) = (W^(-1))^T S W^(-1).

This allows us to see the original image (decompressing the compressed image).
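A sketch of the two-dimensional transform using NumPy; the construction of W1, W2, W3 follows the matrices shown above, and the test image (eight identical rows) is made up for illustration:

```python
import numpy as np

def haar_pass(n, active):
    """Matrix for one averaging/differencing pass acting on the first
    `active` entries of a length-n row vector (identity elsewhere)."""
    Wk = np.eye(n)
    Wk[:active, :active] = 0.0
    half = active // 2
    for j in range(half):
        Wk[2 * j, j] = 0.5              # averaging column
        Wk[2 * j + 1, j] = 0.5
        Wk[2 * j, half + j] = 0.5       # differencing column
        Wk[2 * j + 1, half + j] = -0.5
    return Wk

W = haar_pass(8, 8) @ haar_pass(8, 4) @ haar_pass(8, 2)    # W = W1 W2 W3

A = np.array([[420, 680, 448, 708, 1260, 1420, 1600, 1600]] * 8, float)
S = W.T @ A @ W                                            # S = W' A W
A_back = np.linalg.inv(W.T) @ S @ np.linalg.inv(W)         # exact reconstruction
```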

Let us try an example.


Example: Suppose we have an 8x8 image represented by the matrix

A =

[ 576   704  1152  1280  1344  1472  1536  1536 ]
[ 704   640  1156  1088  1344  1408  1536  1600 ]
[ 768   832  1216  1472  1472  1536  1600  1600 ]
[ 832   832   960  1344  1536  1536  1600  1536 ]
[ 832   832   960  1216  1536  1600  1536  1536 ]
[ 960   896   896  1088  1600  1600  1600  1536 ]
[ 768   768   832   832  1280  1472  1600  1600 ]
[ 448   768   704   640  1280  1408  1600  1600 ]

The row-transformed matrix is

L = AW =

[ 1200    -272    -288    -64   -64   -64   -64    0 ]
[ 1184.5  -287.5  -225    -96    32    34   -32  -32 ]
[ 1312    -240    -272    -48   -32  -128   -32    0 ]
[ 1272    -280    -160    -16     0  -192     0   32 ]
[ 1256    -296    -128     16     0  -128   -32    0 ]
[ 1272    -312     -32     16    32   -96     0   32 ]
[ 1144    -344     -32   -112     0     0   -96    0 ]
[ 1056    -416     -32   -128  -160    32   -64    0 ]


Transforming the columns of L, we obtain (entries rounded to the nearest integer)

S = W^T L =

[ 1212  -306  -146   -54   -24   -68   -40    4 ]
[   30    36   -90    -2     8   -20     8   -4 ]
[  -50   -10   -20   -24     0    72   -16  -16 ]
[   82    38   -24    68    48   -64    32    8 ]
[    8     8   -32    16   -48   -49   -16   16 ]
[   20    20   -56   -16   -16    32   -16  -16 ]
[   -8     8   -48     0   -16   -16   -16  -16 ]
[   44    36     0     8    80   -16   -16    0 ]

The point of doing the Haar wavelet transform is that areas of the original matrix that contain little variation will end up as small or zero elements in the transformed matrix. A matrix is considered sparse if it has a high proportion of zero entries. Sparse matrices take much less memory to store. Since we cannot expect the transformed matrices always to be sparse, we choose a non-negative threshold value ε, and then set to zero any entry in the transformed matrix whose absolute value is less than ε. This leaves us with a kind of sparse matrix. If ε is zero, we will not modify any of the elements.


Every time you click on an image to download it from the Internet, the source computer recalls the Haar transformed matrix from its memory. It first sends the overall approximation coefficients and larger detail coefficients and a bit later the smaller detail coefficients. As your computer receives the information, it begins reconstructing in progressively greater detail until the original image is fully reconstructed.

Linear algebra can make the compression process faster and more efficient. Let us first recall that an nxn square matrix A is called orthogonal if its columns form an orthonormal basis of Rn, that is, the columns of A are pairwise orthogonal and the length of each column vector is 1. Equivalently, A is orthogonal if its inverse is equal to its transpose. The latter property makes retrieving the transformed image via the equation

A = (W^T)^(-1) S W^(-1) = W S W^T

much faster.

Another powerful property of orthogonal matrices is that they preserve magnitude. In other words, if v is a vector of Rn and A is an orthogonal matrix, then ||Av||=||v||. Here is how it works:


||Av||^2 = (Av)^T (Av) = v^T A^T A v = v^T I v = v^T v = ||v||^2

This in turn shows that ||Av|| = ||v||. Also, the angle is preserved when the transformation is by orthogonal matrices; recall that the cosine of the angle θ between two vectors u and v is given by:

cos θ = (u · v) / (||u|| ||v||)

so, if A is an orthogonal matrix and ψ is the angle between the two vectors Au and Av, then


cos ψ = ((Au) · (Av)) / (||Au|| ||Av||) = (u^T A^T A v) / (||u|| ||v||) = (u · v) / (||u|| ||v||) = cos θ.

Since both magnitude and angle are preserved, there is significantly less distortion produced in the rebuilt image when an orthogonal matrix is used. Since the transformation matrix W is the product of three other matrices, one can normalize W by normalizing each of the three matrices. The normalized version of W is

W =

[ √2/4   √2/4   1/2    0    √2/2    0      0      0   ]
[ √2/4   √2/4   1/2    0   -√2/2    0      0      0   ]
[ √2/4   √2/4  -1/2    0     0     √2/2    0      0   ]
[ √2/4   √2/4  -1/2    0     0    -√2/2    0      0   ]
[ √2/4  -√2/4    0    1/2    0      0     √2/2    0   ]
[ √2/4  -√2/4    0    1/2    0      0    -√2/2    0   ]
[ √2/4  -√2/4    0   -1/2    0      0      0     √2/2 ]
[ √2/4  -√2/4    0   -1/2    0      0      0    -√2/2 ]


Remark: If you look closely at the process we described above, you will notice that the matrix W is nothing but a change of basis for R8. In other words, the columns of W form a new (very nice) basis of R8. So when you multiply a vector v of R8 (written in the standard basis) by W, what you get is the coordinates of v in this new basis. Some of these coordinates can be neglected using our threshold, and this is what allows the transformed matrix to be stored more easily and transmitted more quickly.

Compression ratio: If we choose our threshold value ε to be positive (i.e., greater than zero), then some entries of the transformed matrix will be reset to zero, and therefore some detail will be lost when the image is decompressed. The key issue is then to choose ε wisely so that the compression is done effectively with minimum damage to the picture. Note that the compression ratio is defined as the ratio of the number of nonzero entries in the transformed matrix (S = W^T A W) to the number of nonzero entries in the compressed matrix obtained from S by applying the threshold ε.
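The thresholding step and the compression ratio just defined can be sketched as follows (the small matrix below is made up for illustration):

```python
def apply_threshold(S, eps):
    """Reset to zero every entry whose absolute value is less than eps."""
    return [[0 if abs(s) < eps else s for s in row] for row in S]

def compression_ratio(S, eps):
    """Nonzero entries of S divided by nonzero entries after thresholding."""
    def nonzero(M):
        return sum(1 for row in M for s in row if s != 0)
    return nonzero(S) / nonzero(apply_threshold(S, eps))

S = [[1212, -306,  -4],
     [   8,    0, -16],
     [  20,   -2,  44]]          # a made-up transformed block
```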


Thresholder

Once the DWT is performed, the next task is thresholding, which is neglecting certain wavelet coefficients. For doing this, one has to decide the value of a threshold and how to apply it.

Value of the Threshold

This is an important step which affects the quality of the compressed image. The basic idea is to truncate the insignificant coefficients, since the amount of information contained in them is negligible.

The question of deciding the value of the threshold is a problem in itself. Ideally, one should have a uniform recipe which works satisfactorily for a given set of problems, so that the procedure can be automated. One such method by Donoho and co-authors [4] gives an asymptotically optimal formula called the universal threshold t:

t = σ √(2 ln N)     (1)

Here, σ is the standard deviation of the N wavelet coefficients. The value of t should be calculated for each level of decomposition and only for the high-pass coefficients. The low-pass coefficients are usually kept untouched so as to facilitate further decomposition.
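A sketch of the universal threshold formula (1), with σ estimated as the standard deviation of the given coefficients:

```python
import math

def universal_threshold(coeffs):
    """Universal threshold t = sigma * sqrt(2 ln N) for N wavelet coefficients."""
    N = len(coeffs)
    mean = sum(coeffs) / N
    sigma = math.sqrt(sum((c - mean) ** 2 for c in coeffs) / N)
    return sigma * math.sqrt(2 * math.log(N))
```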


Quantizer

Higher compression ratios can be obtained by quantizing the non-zero wavelet coefficients before they are encoded. A quantizer is a many-to-one function Q(x) that maps many input values into a (usually much) smaller set of output values. Quantizers are staircase functions characterized by a set of numbers {di, i = 0, …, N} called decision points and a set of numbers {ri, i = 0, …, N−1} called reconstruction levels. An input value x is mapped to a reconstruction level ri if x lies in the interval (di, d(i+1)].

To achieve the best results, a separate quantizer should be designed for each scale, taking into account the statistical properties of the scale's coefficients and, for images, properties of the human visual system. The coefficient statistics guide the quantizer design for each scale, while the human visual system guides the allocation of bits among the different scales. For our present purpose, a simple uniform quantizer (i.e., constant step size) is used. The wavelet coefficients (Figure 4 on p. 26), after thresholding, were uniformly quantized into 256 different bins. Thus the size of each bin was (xmax − xmin)/256, where xmin and xmax are the wavelet coefficients with minimum and maximum values, respectively. To minimize the maximum error (minimax condition), the centroid of each bin is assigned to all the coefficients falling in that bin. For discussions on non-uniform quantizers, interested readers can refer to [3].
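A minimal uniform quantizer along the lines described above; for simplicity this sketch assigns the centre of each bin rather than its centroid:

```python
def uniform_quantize(coeffs, bins=256):
    """Map each coefficient to the centre of its bin; bin width is
    (xmax - xmin) / bins, as in the text."""
    xmin, xmax = min(coeffs), max(coeffs)
    step = (xmax - xmin) / bins
    levels = []
    for x in coeffs:
        i = min(int((x - xmin) / step), bins - 1)   # clamp x == xmax into the last bin
        levels.append(xmin + (i + 0.5) * step)      # bin centre as reconstruction level
    return levels
```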

ENTROPY ENCODER

This is the last component in the compression model. Till now, we have devised models for an alternate representation of the image, in which its interpixel redundancies were reduced. This last model, which is a lossless technique, aims at eliminating coding redundancies, a notion made clear by an example. Suppose we have a region in an image where the pixel values are uniform or vary uniformly. One requires 8 bpp (bits per pixel) for representing each pixel, since the values range from 0 to 255. Thus representing each pixel with the same (or constant-difference) value introduces coding redundancy. This can be eliminated if we transform the real values into some symbolic form, usually a binary system, where each symbol corresponds to a particular value. We will discuss a few coding techniques and analyse their performances.


Run Length Encoding

Run-length encoding (RLE) makes use of the fact that nearby pixels in an image will probably have the same brightness value. This redundancy can then be coded as follows:

Original image data (8-bit): 127 127 127 127 129 129 129
Run-length encoded image data: 127 4 129 3

This technique is useful for encoding an online signal, but data-explosion problems can occur, and even a single data error will obstruct full decompression.

Differential Pulse Code Modulation

Predictive image compression techniques assume that a pixel's brightness can be predicted from the value of the preceding pixel. Differential pulse code modulation (DPCM) codes the differences between adjacent pixels. DPCM starts coding at the top left-hand corner of an image and works left to right until the whole image is encoded, as shown:

Original image data: 86 86 86 86 88 89 89 89 89 90 90
DPCM code: 86 0 0 0 2 1 0 0 0 1 0

This technique is useful for images that have long runs of equal-valued pixels.
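A minimal sketch of this differencing scheme (the function names are assumptions, not the report's code): the first pixel is stored as-is, and every subsequent pixel is stored as its difference from the previous one.

```python
# DPCM: keep the first value, then code each pixel as the difference
# from its immediate predecessor.
def dpcm_encode(pixels):
    code = [pixels[0]]
    for prev, cur in zip(pixels, pixels[1:]):
        code.append(cur - prev)
    return code

def dpcm_decode(code):
    # Rebuild by accumulating the differences onto the running value.
    pixels = [code[0]]
    for diff in code[1:]:
        pixels.append(pixels[-1] + diff)
    return pixels
```

Applied to the sample row above, dpcm_encode gives [86, 0, 0, 0, 2, 1, 0, 0, 0, 1, 0]; the long runs of zeros are then cheap for a subsequent entropy coder to represent.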


Huffman Coding

This is the most popular statistical data compression technique for removing coding redundancy. It assigns the smallest possible number of code symbols per source symbol and hence reduces the average code length used to represent the given set of values. The general idea is to assign the fewest bits to the most probable (or frequent) values occurring in an image. The Huffman code is an example of a code that is optimal when all symbols have probabilities of occurrence that are integral powers of 1/2.

A Huffman code can be built in the following manner:

1. Rank all symbols in decreasing order of probability of occurrence.

2. Successively combine the two symbols of lowest probability to form a new composite symbol (source reduction); eventually we build a binary tree, where each node holds the probability of all nodes beneath it.

3. Trace the path to each leaf, noting the direction taken at each node.

For a given frequency distribution there are many possible Huffman codes, but the total compressed length will be the same. It is possible to define a canonical Huffman tree, that is, to pick one out of the many alternative trees. Such a canonical tree can then be represented very compactly, by transmitting only the bit length of each code.
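The three steps above can be sketched with the standard heap-based construction. This is an illustrative implementation, not the report's own; the function name and the use of Python's heapq module are assumptions.

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Return a {symbol: bitstring} table for the symbols in data."""
    freq = Counter(data)
    if len(freq) == 1:  # degenerate single-symbol source
        return {next(iter(freq)): "0"}
    # Heap of (count, tiebreak, subtree); a subtree is a symbol or a
    # pair of subtrees.  Popping twice yields the two least-probable
    # nodes (step 2: source reduction).
    heap = [(count, i, sym) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)
        c2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, i, (left, right)))
        i += 1
    # Step 3: trace the path to each leaf, recording 0/1 per branch.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes
```

For the source "aaaabbc" the most frequent symbol a receives a 1-bit code while b and c receive 2-bit codes, illustrating how the most probable values get the fewest bits.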


Advantages of DWT over DCT (cont.)

Higher flexibility: the wavelet function can be freely chosen.

No need to divide the input image into non-overlapping 2-D blocks; this avoids blocking artifacts and permits higher compression ratios.

Transformation of the whole image introduces inherent scaling.

Better identification of which data is relevant to human perception leads to higher compression ratios (64:1 vs. 500:1).
