

*Tel.: +44 1443 482271; fax: +44 1443 482715; e-mail: jjiang@glam.ac.uk

Signal Processing: Image Communication 14 (1999) 737-760

Image compression with neural networks - A survey

J. Jiang*

School of Computing, University of Glamorgan, Pontypridd CF37 1DL, UK

Received 24 April 1996

Abstract

Apart from the existing technology on image compression represented by the series of JPEG, MPEG and H.26x standards, new technologies such as neural networks and genetic algorithms are being developed to explore the future of image coding. Successful applications of neural networks to vector quantization have now become well established, and other aspects of neural network involvement in this area are stepping up to play significant roles in assisting with those traditional technologies. This paper presents an extensive survey on the development of neural networks for image compression which covers three categories: direct image compression by neural networks; neural network implementation of existing techniques; and neural network based technology which provides improvements over traditional algorithms. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Neural network; Image compression and coding

1. Introduction

Image compression is a key technology in the development of various multimedia computer services and telecommunication applications such as teleconferencing, digital broadcasting and video technology. At present, the main core of image compression technology consists of three important processing stages: pixel transforms, quantization and entropy coding. In addition to these key techniques, new ideas are constantly appearing across different disciplines and new research fronts. As a model that simulates the learning function of human brains, neural networks have enjoyed widespread applications in telecommunication and computer science. Recent publications show a substantial increase in research on neural networks for image compression and coding. Together with popular multimedia applications and related products, what role are the well-developed neural networks going to play in this era where information processing and communication are in great demand? Although there is no sign that neural networks will take over the existing technology, research on neural networks for image compression is still making steady advances. This could have a tremendous impact upon the development of new technologies and algorithms in this subject area.

0923-5965/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0923-5965(98)00041-1


Fig. 1. Back propagation neural network.

This paper comprises five sections. Section 2 discusses those neural networks which are directly developed for image compression. Section 3 covers neural network implementations of conventional image compression algorithms. In Section 4, indirect neural network developments, which are mainly designed to assist conventional algorithms and provide further improvement, are discussed. Finally, Section 5 gives conclusions and a summary of present research work and possible future research directions.

2. Direct neural network development for image compression

2.1. Back-propagation image compression

2.1.1. Basic back-propagation neural network

Back-propagation is one of the neural networks which are directly applied to image compression coding [9,17,47,48,57]. The neural network structure can be illustrated as in Fig. 1. Three layers, one input layer, one output layer and one hidden layer, are designed. The input layer and output layer are fully connected to the hidden layer. Compression is achieved by making K, the number of neurones at the hidden layer, less than the number of neurones at both the input and the output layers. The input image is split up into blocks or vectors of 8×8, 4×4 or 16×16 pixels. When the input vector is referred to as N-dimensional, where N is equal to the number of pixels included in each block, all the coupling weights connected to each neurone at the hidden layer can be represented by {w_ji, j = 1, 2, ..., K; i = 1, 2, ..., N}, which can also be described by a matrix of order K×N. From the hidden layer to the output layer, the connections can be represented by {w'_ij: 1 ≤ i ≤ N, 1 ≤ j ≤ K}, which is another weight matrix of order N×K. Image compression is achieved by training the network in such a way that the coupling weights {w_ji} scale the N-dimensional input vector into a narrow channel of dimension K (K < N) at the hidden layer and produce the optimum output value which minimizes the quadratic error between input and output. In accordance with the neural network structure shown in Fig. 1, the operation of a linear network can be described as follows:

$h_j = \sum_{i=1}^{N} w_{ji} x_i, \quad 1 \le j \le K,$   (2.1)

for encoding and

$\bar{x}_i = \sum_{j=1}^{K} w'_{ij} h_j, \quad 1 \le i \le N,$   (2.2)

for decoding, where $x_i \in [0, 1]$ denotes the normalized pixel values for grey-scale images with grey levels [0, 255] [5,9,14,34,35,47]. The reason for using normalized pixel values is that neural networks operate more efficiently when both their inputs and outputs are limited to the range [0, 1] [5]. Good discussions on a number of normalization functions and their effect on neural network performance can be found in [5,34,35].
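As a concrete illustration of Eqs. (2.1) and (2.2), the following minimal NumPy sketch (not from the original papers; the weights are random and the 8×8 block size and K = 16 are arbitrary choices) maps a normalized N-dimensional block into K hidden values and back:

```python
import numpy as np

N, K = 64, 16          # 8x8 block, K < N hidden neurones (K chosen arbitrarily here)
rng = np.random.default_rng(0)

W = rng.normal(scale=0.1, size=(N, K))        # input-to-hidden weights, h = W^T x  (Eq. 2.1)
W_prime = rng.normal(scale=0.1, size=(N, K))  # hidden-to-output weights, x_bar = W' h (Eq. 2.2)

block = rng.integers(0, 256, size=(8, 8))     # one grey-scale image block
x = block.reshape(N) / 255.0                  # normalized pixel values in [0, 1]

h = W.T @ x          # encoding: K-dimensional hidden state (Eq. 2.1)
x_bar = W_prime @ h  # decoding: reconstructed N-dimensional block (Eq. 2.2)

mse = np.mean((x - x_bar) ** 2)  # quadratic error the training stage would minimize
print(h.shape, x_bar.shape, mse)
```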

The above linear networks can also be transformed into non-linear ones if a transfer function such as the sigmoid is added to the hidden layer and the output layer, as shown in Fig. 1, to scale down the summations in the above equations. There is no proof, however, that the non-linear network can provide a better solution than its linear counterpart [14]. Experiments carried out on a number of image samples in [9] report that linear networks actually outperform the non-linear ones in terms of both training speed and compression performance. Most of the back-propagation neural networks developed for image compression are, in fact, designed as linear [9,14,47]. Theoretical discussion on the roles of the sigmoid transfer function can be found in [5,6].

With this basic back-propagation neural network, compression is conducted in two phases: training and encoding. In the first phase, a set of image samples is designed to train the network via the back-propagation learning rule, which uses each input vector as the desired output. This is equivalent to compressing the input into the narrow channel represented by the hidden layer and then reconstructing the input from the hidden layer to the output layer.

The second phase simply involves the entropy coding of the state vector $h_j$ at the hidden layer. In cases where adaptive training is conducted, the entropy coding of those coupling weights may also be required in order to catch up with some input characteristics that are not encountered at the training stage. The entropy coding is normally designed as simple fixed-length binary coding, although many advanced variable-length entropy coding algorithms are available [22,23,26,60]. One of the obvious reasons for this is perhaps that the research community is only concerned with the part played by neural networks rather than anything else; it is straightforward to introduce better entropy coding methods afterwards. Therefore, the compression performance can be assessed either in terms of the compression ratio adopted by the computer science community or the bit rate adopted by the telecommunication community. We use the bit rate throughout the paper to discuss all the neural network algorithms. For the back-propagation narrow-channel compression neural network, the bit rate can be defined as follows:

$\text{bit rate} = \frac{nKT + NKt}{nN} \ \text{bits/pixel},$   (2.3)

where input images are divided into n blocks of N pixels, or n N-dimensional vectors; T and t stand for the number of bits used to encode each hidden neurone output and each coupling weight from the hidden layer to the output layer, respectively. When the coupling weights are maintained the same throughout the compression process after training is completed, the term NKt can be ignored and the bit rate becomes KT/N bits/pixel. Since the hidden neurone output is real valued, quantization is required for fixed-length entropy coding, which is normally designed as 32-level uniform quantization corresponding to 5-bit entropy coding [9,14].
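For illustration only, Eq. (2.3) can be evaluated as follows; the 8×8 blocks and the 5-bit quantization of the hidden outputs follow the text, while the values of n, K and t are hypothetical:

```python
def bit_rate(n, N, K, T, t, weights_transmitted=True):
    """Bit rate of the narrow-channel network, Eq. (2.3): (nKT + NKt) / (nN)."""
    total_bits = n * K * T + (N * K * t if weights_transmitted else 0)
    return total_bits / (n * N)

# A 256x256 image in 8x8 blocks gives n = 1024 blocks of N = 64 pixels;
# K = 16 hidden neurones (hypothetical), T = 5 bits per hidden output,
# t = 8 bits per coupling weight (hypothetical).
print(bit_rate(n=1024, N=64, K=16, T=5, t=8))                             # with weights sent
print(bit_rate(n=1024, N=64, K=16, T=5, t=8, weights_transmitted=False))  # K*T/N = 1.25
```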

This neural network development, in fact, is in the direction of K-L transform technology [17,21,50], which provides the optimum solution for all linear narrow-channel types of image compression neural network [17]. When Eqs. (2.1) and (2.2) are represented in matrix form, we have

$[h] = [W]^T [x],$   (2.4)

$[\bar{x}] = [W'][h] = [W'][W]^T [x]$   (2.5)

for encoding and decoding.

The K-L transform maps input images into a new vector space where all the coefficients in the new space are de-correlated. This means that the covariance matrix of the new vectors is a diagonal matrix whose elements along the diagonal are the eigenvalues of the covariance matrix of the original input vectors. Let $e_i$ and $\lambda_i$, $i = 1, 2, \ldots, n$, be the eigenvectors and eigenvalues of $c_x$, the covariance matrix of the input vector x, with the eigenvalues arranged in descending order so that $\lambda_i \ge \lambda_{i+1}$ for $i = 1, 2, \ldots, n-1$. To extract the principal components, the K eigenvectors corresponding to the K largest eigenvalues of $c_x$ are normally used to construct the K-L transform matrix $[A_K]$, in which all rows are formed by eigenvectors of $c_x$. In addition, all eigenvectors in $[A_K]$ are ordered in such a way that the first row of $[A_K]$ is the eigenvector corresponding to the largest eigenvalue, and the last row is the eigenvector corresponding to the smallest eigenvalue. Hence, the forward K-L transform or encoding can be defined as

$[y] = [A_K]([x] - [m_x]),$   (2.6)

and the inverse K-L transform or decoding can be defined as

$[\bar{x}] = [A_K]^T [y] + [m_x],$   (2.7)

where $[m_x]$ is the mean value of [x] and $[\bar{x}]$ represents the reconstructed vectors or image blocks. Thus the mean square error between x and $\bar{x}$ is given by the following equation:

$e_{ms} = E\{(x - \bar{x})^2\} = \frac{1}{M} \sum_{k=1}^{M} (x_k - \bar{x}_k)^2 = \sum_{j=1}^{n} \lambda_j - \sum_{j=1}^{K} \lambda_j = \sum_{j=K+1}^{n} \lambda_j,$   (2.8)

where the statistical mean value $E\{\cdot\}$ is approximated by the average value over all the input vector samples which, in image coding, are all the non-overlapping blocks of 4×4 or 8×8 pixels.

Therefore, by selecting the K eigenvectors associated with the largest eigenvalues to run the K-L transform over the input image pixels, the resulting error between the reconstructed image and the original one can be minimized, due to the fact that the values of λ decrease monotonically.
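For concreteness, the following sketch (generic NumPy code on random data, not taken from any of the surveyed papers) computes $[A_K]$ from the sample covariance of a set of block vectors and checks numerically that the reconstruction error of Eqs. (2.6) and (2.7) approaches the sum of the discarded eigenvalues in Eq. (2.8):

```python
import numpy as np

rng = np.random.default_rng(1)
n_dim, K = 16, 4                               # 4x4 blocks, keep K principal components
blocks = rng.normal(size=(5000, n_dim))        # stand-in for image block vectors x

m_x = blocks.mean(axis=0)                      # [m_x]
c_x = np.cov(blocks, rowvar=False)             # covariance matrix of the input vectors
eigvals, eigvecs = np.linalg.eigh(c_x)         # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]              # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

A_K = eigvecs[:, :K].T                         # K x N transform matrix, rows = eigenvectors

y = (blocks - m_x) @ A_K.T                     # forward K-L transform, Eq. (2.6)
x_bar = y @ A_K + m_x                          # inverse K-L transform, Eq. (2.7)

e_ms = np.mean(np.sum((blocks - x_bar) ** 2, axis=1))
print(e_ms, eigvals[K:].sum())                 # both close to the discarded-eigenvalue sum, Eq. (2.8)
```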

From the comparison between the equation pair (2.4) and (2.5) and the equation pair (2.6) and (2.7), it can be concluded that the linear neural network reaches the optimum solution whenever the following condition is satisfied:

$[W'][W]^T = [A_K]^T [A_K].$   (2.9)

Under this circumstance, the neurone weights from input to hidden and from hidden to output can be described, respectively, as follows:

$[W'] = [A_K]^T [U]^{-1}, \qquad [W]^T = [U][A_K],$   (2.10)

where [U] is an arbitrary K×K matrix and $[U][U]^{-1}$ gives the K×K identity matrix. Hence, it can be seen that the linear neural network can achieve the same compression performance as the K-L transform without its weight matrices necessarily being equal to $[A_K]^T$ and $[A_K]$.

Variations of the image compression scheme implemented on the basic network can be designed by using overlapped blocks and different error functions [47]. The use of overlapped blocks is justified in the sense that some extent of correlation always exists between the coefficients of neighbouring blocks. The use of other error functions, in addition, may provide a better interpretation of human visual perception of the quality of the reconstructed images.

Fig. 2. Hierarchical neural network structure.

2.1.2. Hierarchical back-propagation neural network

The basic back-propagation network is further extended to construct a hierarchical neural network by adding two more hidden layers into the existing network, as proposed in [48]. The hierarchical neural network structure is illustrated in Fig. 2, in which the three hidden layers are termed the combiner layer, the compressor layer and the decombiner layer. The idea is to exploit the correlation between pixels by the inner hidden layer and the correlation between blocks of pixels by the outer hidden layers. From the input layer to the combiner layer and from the decombiner layer to the output layer, local connections are designed which have the same effect as M fully connected neural sub-networks. As seen in Fig. 2, all three hidden layers are fully connected. The basic idea is to divide an input image into M disjoint sub-scenes, and each sub-scene is further partitioned into T pixel blocks of size p×p.

For a standard image of 512×512, as proposed in [48], it can be divided into 8 sub-scenes and each sub-scene has 512 pixel blocks of size 8×8. Accordingly, the proposed neural network structure is designed to have the following parameters:

Total number of neurones at the input layer = M p^2 = 8 × 64 = 512.
Total number of neurones at the combiner layer = M N_h = 8 × 8 = 64.
Total number of neurones at the compressor layer = Q = 8.


Fig. 3. Inner loop neural network.

The total number of neurones at the decombiner layer and the output layer is the same as that at the combiner layer and the input layer, respectively.

A so-called nested training algorithm (NTA) is proposed to reduce the overall neural network training time, which can be explained in the following three steps.

Step 1. Outer loop neural network (OLNN) training. By taking the input layer, the combiner layer and the output layer out of the network shown in Fig. 2, we obtain M 64-8-64 outer loop neural networks, where 64-8-64 represents the number of neurones in the input layer, hidden layer and output layer, respectively. As the image is divided into 8 sub-scenes and each sub-scene has 512 pixel blocks, each of which has the same size as the input layer, we have 8 training sets to train the 8 outer-loop neural networks independently. Each training set contains 512 training patterns (or pixel blocks). In this training process, the standard back-propagation learning rule is directly applied, in which the desired output is equal to the input.

Step 2. Inner loop neural network (ILNN) training. By taking the three hidden layers in Fig. 2 into consideration, an inner loop neural network can be derived as shown in Fig. 3. As the related parameters are designed such that N_h = 8, Q = 8 and M = 8, the inner loop neural network is also a 64-8-64 network. Corresponding to the 8 sub-scenes, each of which has 512 training patterns (or pixel blocks), we also have 8 groups of hidden layer outputs from the operations of Step 1, in which each hidden layer output is an 8-dimensional vector and each group contains 512 such vectors. Therefore, for the inner loop network, the training set contains 512 training patterns, each of which is a 64-dimensional vector formed when the outputs of all 8 hidden layers inside the OLNN are directly used to train the ILNN. Again, the standard back-propagation learning rule is used in the training process. Throughout the two steps of training for both OLNN and ILNN, the linear transfer (or activation) function is used.

Step 3. Reconstruction of the overall neural network. From the previous two steps of training, we have four sets of coupling weights, two from Step 1 and two from Step 2. Hence, the overall neural network coupling weights can be assigned in such a way that the two sets of weights from Step 1 are given to the outer layers in Fig. 2, involving the input layer connected to the combiner layer and the decombiner layer connected to the output layer. Similarly, the two sets of coupling weights obtained from Step 2 are given to the inner layers in Fig. 2, involving the combiner layer connected to the compressor layer and the compressor layer connected to the decombiner layer.

After training is completed, the neural network is ready for image compression, in which half of the network acts as an encoder and the other half as a decoder. The neurone weights are maintained the same throughout the image compression process.
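To make the data flow of the nested training algorithm concrete, here is a minimal sketch (not the code of [48]): the image is replaced by random data of the stated dimensions, and the back-propagation training of each linear 64-8-64 sub-network is reduced to plain batch gradient descent on the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(2)

def train_linear_autoencoder(X, hidden, lr=0.01, epochs=300):
    """Batch gradient descent on the reconstruction error of a linear autoencoder."""
    n, d = X.shape
    W1 = rng.normal(scale=0.01, size=(d, hidden))   # encoder weights
    W2 = rng.normal(scale=0.01, size=(hidden, d))   # decoder weights
    for _ in range(epochs):
        H = X @ W1                        # hidden-layer outputs
        E = H @ W2 - X                    # reconstruction error (desired output = input)
        W2 -= lr * H.T @ E / n            # gradient step on the decoder
        W1 -= lr * X.T @ (E @ W2.T) / n   # gradient step on the encoder
    return W1, W2

M, T, p, N_h, Q = 8, 512, 8, 8, 8         # 8 sub-scenes, 512 blocks of 8x8, as in [48]
sub_scenes = rng.random((M, T, p * p))    # stand-in for the blocks of a 512x512 image

# Step 1: train M outer-loop 64-8-64 networks, one per sub-scene.
olnn = [train_linear_autoencoder(sub_scenes[m], N_h) for m in range(M)]

# Step 2: concatenate the M hidden outputs (8-dimensional each) into 64-dimensional
# training vectors for the inner-loop 64-8-64 network.
hidden_out = np.concatenate([sub_scenes[m] @ olnn[m][0] for m in range(M)], axis=1)
ilnn = train_linear_autoencoder(hidden_out, Q)

# Step 3: the OLNN weights become the outer layers and the ILNN weights the inner
# layers of the overall five-layer hierarchical network.
print(hidden_out.shape, ilnn[0].shape, ilnn[1].shape)
```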

2.1.3. Adaptive back-propagation neural network

Further to the basic narrow-channel back-propagation image compression neural network, a number of adaptive schemes are proposed by Carrato [8,9] based on the principle that different neural networks are used to compress image blocks with different extents of complexity. The general structure for the adaptive schemes is illustrated in Fig. 4, in which a group of neural networks with an increasing number of hidden neurones, from h_min to h_max, is designed. The basic idea is to classify the input image blocks into a few sub-sets with different features according to a complexity measurement. A fine-tuned neural network then compresses each sub-set.

Four schemes are proposed in [9] to train the neural networks, which can be roughly classified as parallel training, serial training, activity-based training, and activity- and direction-based training schemes.


Fig. 4. Adaptive neural network structure.

The parallel training scheme applies the complete training set simultaneously to all neural networks and uses the S/N (signal-to-noise) ratio to roughly classify the image blocks into the same number of sub-sets as there are neural networks. After this initial coarse classification is completed, each neural network is then further trained by its corresponding refined sub-set of training blocks.

Serial training involves an adaptive searching process to build up the necessary number of neural networks to accommodate the different patterns embedded inside the training images. Starting with a neural network with a pre-defined minimum number of hidden neurones, h_min, the neural network is roughly trained by all the image blocks. The S/N ratio is used again to classify all the blocks into two classes depending on whether their S/N is greater than a pre-set threshold or not. For those blocks with higher S/N ratios, further training is started on the next neural network with the number of hidden neurones increased and the corresponding threshold readjusted for further classification. This process is repeated until the whole training set is classified into a maximum number of sub-sets corresponding to the same number of neural networks established.

In the next two training schemes, two extra parameters, the activity A(P_l) and four directions, are defined to classify the training set rather than using the neural networks. Hence the back-propagation training of each neural network can be completed in one phase by its appropriate sub-set.

The so-called activity of the lth block is defined as

$A(P_l) = \sum_{\text{even } i,j} A_p(P_l(i, j)),$   (2.11)

$A_p(P_l(i, j)) = \sum_{r=-1}^{1} \sum_{s=-1}^{1} (P_l(i, j) - P_l(i+r, j+s))^2,$   (2.12)

where $A_p(P_l(i, j))$ is the activity of each pixel, which involves its 8 neighbouring pixels as r and s vary from -1 to +1 in Eq. (2.12).

Prior to training, all image blocks are classified into four classes according to their activity values, which are identified as very low, low, high and very high activity. Hence four neural networks are designed with an increasing number of hidden neurones to compress the four different sub-sets of input images after the training phase is completed.
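A small sketch of the activity measure is given below. It follows Eqs. (2.11) and (2.12) except that the outer sum runs over all interior pixels rather than only the even-indexed positions, and the four class thresholds are hypothetical since the paper does not specify them:

```python
import numpy as np

def block_activity(P):
    """A(P_l) of Eq. (2.11): sum of per-pixel activities A_p over the block interior."""
    total = 0.0
    for i in range(1, P.shape[0] - 1):
        for j in range(1, P.shape[1] - 1):
            # Eq. (2.12): squared differences against the 8 neighbours (the (0,0) term is zero)
            total += sum((P[i, j] - P[i + r, j + s]) ** 2
                         for r in (-1, 0, 1) for s in (-1, 0, 1))
    return total

def classify(blocks, thresholds=(0.05, 0.2, 0.5)):
    """Assign each block to very low / low / high / very high activity (hypothetical thresholds)."""
    acts = np.array([block_activity(b) for b in blocks])
    return np.digitize(acts, np.array(thresholds) * acts.max())

rng = np.random.default_rng(3)
blocks = rng.random((20, 8, 8))
print(classify(blocks))          # class index 0..3 selects one of the four networks
```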

On top of the high activity parameter, a further feature extraction technique is applied by considering four main directions present in the image details, i.e. horizontal, vertical and the two diagonal directions. These preferential direction features can be evaluated by calculating the mean squared differences among neighbouring pixels along the four directions [9].

For those image patterns classified as high activity, four further neural networks corresponding to the above directions are added to refine their structures and tune their learning processes to the preferential orientations of the input. Hence, the overall neural network system is designed to have six neural networks, among which two correspond to the low activity and medium activity sub-sets and the other four correspond to the high activity and four direction classifications [9]. Among all the adaptive schemes, the linear back-propagation neural network is mainly used as the core of all the different variations.

2.1.4. Performance assessments

Table 1
Basic back propagation on Lena (256×256)

Dimension N   Training scheme   PSNR (dB)   Bit rate (bits/pixel)
64            Non-linear        25.06       0.75
64            Linear            26.17       0.75

Table 2
Hierarchical back propagation on Lena (512×512)

Dimension N   Training scheme   SNR (dB)    Bit rate (bits/pixel)
4             Linear            14.395      1
64            Linear            19.755      1
144           Linear            25.67       1

Table 3
Adaptive back propagation on Lena (256×256)

Dimension N   Training scheme                        PSNR (dB)   Bit rate (bits/pixel)
64            Activity-based linear                  27.73       0.68
64            Activity- and direction-based linear   27.93       0.66

Around the back-propagation neural network, we have described three representative schemes to achieve image data compression in the direction of the K-L transform. Their performance can normally be assessed by considering two measurements. One is the compression ratio or bit rate, which is used to measure the compression performance; the other is used to measure the quality of the reconstructed images at a specific compression ratio or bit rate. The definition of this second measurement, however, is a little ambiguous at present. In practice, there exist two accepted measurements for the quality of reconstructed images: PSNR (peak signal-to-noise ratio) and NMSE (normalised mean-square error). For a grey-level image with n blocks of size N, or n vectors with dimension N, their definitions can be given, respectively, as follows:

$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{nN} \sum_{i=1}^{n} \sum_{j=1}^{N} (P_{ij} - \hat{P}_{ij})^2} \ (\mathrm{dB}),$   (2.13)

$\mathrm{NMSE} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{N} (P_{ij} - \hat{P}_{ij})^2}{\sum_{i=1}^{n} \sum_{j=1}^{N} P_{ij}^2},$   (2.14)

where $\hat{P}_{ij}$ is the intensity value of the pixels in the reconstructed image and $P_{ij}$ that of the pixels in the original image, which is split up into n input vectors $x_i = \{P_{i1}, P_{i2}, \ldots, P_{iN}\}$.

In order to bridge the differences between the two measurements, let us introduce an SNR (signal-to-noise ratio) measurement to interpret the NMSE values as an alternative indication of the quality of the reconstructed images in comparison with the PSNR figures. The SNR is defined below:

$\mathrm{SNR} = -10 \log_{10} \mathrm{NMSE} \ (\mathrm{dB}).$   (2.15)
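The three quality measures of Eqs. (2.13)-(2.15) translate directly into code; the sketch below assumes 8-bit grey-scale images stored as NumPy arrays:

```python
import numpy as np

def nmse(original, reconstructed):
    """Normalised mean-square error, Eq. (2.14)."""
    original = original.astype(np.float64)
    reconstructed = reconstructed.astype(np.float64)
    return np.sum((original - reconstructed) ** 2) / np.sum(original ** 2)

def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, Eq. (2.13), for grey levels [0, 255]."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

def snr(original, reconstructed):
    """SNR in dB derived from the NMSE, Eq. (2.15)."""
    return -10 * np.log10(nmse(original, reconstructed))

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(256, 256))
noisy = np.clip(img + rng.normal(scale=5, size=img.shape), 0, 255)
print(psnr(img, noisy), snr(img, noisy), nmse(img, noisy))
```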

Considering the different settings for the experiments reported in the various sources [9,14,47,48], it is difficult to make a direct comparison among all the algorithms presented in this section. To make the best use of the experimental results available, we take Lena as the standard image sample and summarise the related experiments for all the algorithms in Tables 1-3, which are grouped into basic back propagation, hierarchical back propagation and adaptive back propagation.

Experiments were also carried out to test the neural network schemes against conventional image compression techniques such as DCT with SQ (scalar quantization) and VQ (vector quantization), and sub-band coding with VQ schemes [9]. The reported results reveal that the neural networks provide very competitive compression performance and even outperform the conventional techniques on some occasions [9].

Back-propagation based neural networks have provided good alternatives for image compression in the framework of the K-L transform. Although most of the networks developed so far use linear training schemes and experiments support this option, it is not clear why non-linear training leads to inferior performance and how non-linear transfer functions can be further exploited to achieve further improvement. Theoretical research is required in this area to provide in-depth information and to prepare for further exploration in line with non-linear signal processing using neural networks [12,19,53,61,63,74].

Generally speaking, the coding scheme used by neural networks tends to maintain the same neurone weights, once trained, throughout the whole process of image compression. In this way the bit rate becomes KT/N rather than that given by Eq. (2.3), which maximises the compression performance. In addition, experiments [4,9] show that the reconstructed image quality does not differ significantly for images outside the training set, due to the so-called generalization property [9] of neural networks. Circumstances where a drastic change of the input image statistics occurs might not be well covered by the training stage, however, and adaptive training might then prove worthwhile. Therefore, how to use the minimum number of bits to represent the adaptive training remains an interesting area to be further explored.

2.2. Hebbian learning-based image compression

While the back-propagation-based narrow-channel neural network aims at achieving compression upper bounded by the K-L transform, a number of Hebbian learning rules have been developed to address the issue of how the principal components can be directly extracted from input image blocks to achieve image data compression [17,36,58,71]. The general neural network structure consists of one input layer and one output layer, which is actually one half of the network shown in Fig. 1. The Hebbian learning rule comes from Hebb's postulation that if two neurones are very active at the same time, as indicated by high values of both a neurone's output and one of its inputs, the strength of the connection between the two neurones will grow or increase. Hence, for the output values expressed as $[h] = [w]^T [x]$, the learning rule can be described as

$W_i(t+1) = \frac{W_i(t) + \alpha h_i(t) X(t)}{\| W_i(t) + \alpha h_i(t) X(t) \|},$   (2.16)

where $W_i(t+1) = \{w_{i1}, w_{i2}, \ldots, w_{iN}\}$ is the ith new coupling weight vector in the next cycle (t+1); $1 \le i \le M$ and M is the number of output neurones; α is the learning rate; $h_i(t)$ the ith output value; X(t) the input vector, corresponding to each individual image block; and $\| \cdot \|$ the Euclidean norm used to normalize the updated weights and make the learning stable.

From the above basic Hebbian learning, a so-called linearized Hebbian learning rule is developed by Oja [50,51] by expanding Eq. (2.16) into a series, from which the updating of all coupling weights is constructed as below:

$W_i(t+1) = W_i(t) + \alpha [h_i(t) X(t) - h_i^2(t) W_i(t)].$   (2.17)

To obtain the leading M principal components, Sanger [58] extends the above model to a learning rule which removes the previous principal components through Gram-Schmidt orthogonalization and makes the coupling weights converge to the desired principal components. As each neurone at the output layer functions as a simple summation of all its inputs, the neural network is linear and feed-forward. It is reported [58] that at a bit rate of 0.36 bits/pixel, an SNR of 16.4 dB is achieved when the network is trained on images of 512×512 and the image compressed is outside the training set.
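A compact sketch of this generalized Hebbian (Sanger) learning on zero-mean vectors is shown below; the data, learning rate and dimensions are arbitrary choices, not the experimental setup of [58]:

```python
import numpy as np

def sanger_update(W, x, lr=0.005):
    """One step of Sanger's generalized Hebbian rule: rows of W tend to the leading PCs."""
    h = W @ x                                   # outputs of the M linear neurones
    for i in range(W.shape[0]):
        # Hebbian term with the contributions of components 1..i removed (Gram-Schmidt-like)
        W[i] += lr * h[i] * (x - W[:i + 1].T @ h[:i + 1])
    return W

rng = np.random.default_rng(5)
N, M = 16, 4                                                  # 4x4 block vectors, M components
data = rng.normal(size=(2000, 4)) @ rng.normal(size=(4, N))   # correlated, zero-mean samples
data += 0.1 * rng.normal(size=data.shape)

W = rng.normal(scale=0.1, size=(M, N))
for _ in range(8):                              # a few passes over the training vectors
    for x in data:
        W = sanger_update(W, x)
print(np.round(W @ W.T, 2))                     # rows converge towards orthonormal directions
```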

Other learning rules developed include the adaptive principal component extraction (APEX) [36] and robust algorithms for the extraction and estimation of the principal components [62,72]. All the variations are designed to improve the neural network performance in converging to the principal components and in tracking the higher-order statistics of the input data stream.

Similar to the back-propagation trained narrow-channel neural networks, the compression performance of such neural networks is also dependent on the number of output neurones, i.e. the value of M. As explained above, the quality of the reconstructed images is determined by $e_{ms} = E\{(x - \bar{x})^2\} = \sum_{j=M+1}^{N} \lambda_j$ for neural networks with N-dimensional input vectors and M output neurones. As with conventional image compression algorithms, high compression performance is always balanced against the quality of the reconstructed images. Since conventional algorithms for computing the K-L transform involve lengthy operations of angle-tuned trial-and-verification iterations for each vector, and no fast algorithm exists so far, neural network development provides advantages in terms of computing efficiency and speed in dealing with these matters. In addition, the iterative nature of these learning processes is also helpful in capturing slowly varying statistics embedded inside the input stream. Hence the whole system can be made adaptive in compressing various image contents, such as those described in the last section.

Fig. 5. Vector quantization neural network.

2.3. Vector quantization neural networks

Since neural networks are capable of learning from input information and optimizing themselves to obtain the appropriate environment for a wide range of tasks [38], a family of learning algorithms has been developed for vector quantization. For all the learning algorithms the basic structure is similar and can be illustrated as in Fig. 5. The input vector is constructed from a K-dimensional space. M neurones are designed in Fig. 5 to compute the vector quantization code-book, in which each neurone relates to one code-word via its coupling weights. The coupling weight $\{w_{ij}\}$ associated with the ith neurone is eventually trained to represent the code-word $c_i$ in the code-book. As the neural network is being trained, all the coupling weights are optimized to represent the best possible partition of all the input vectors. To train the network, a group of image samples known to both encoder and decoder is often designated as the training set, and the first M input vectors of the training data set are normally used to initialize all the neurones. With this general structure, various learning algorithms have been designed and developed, such as Kohonen's self-organizing feature mapping [10,13,18,33,52,70], competitive learning [1,54,55,65], frequency sensitive competitive learning [1,10], fuzzy competitive learning [11,31,32], general learning [25,49], distortion equalized fuzzy competitive learning [7] and PVQ (predictive VQ) neural networks [46].

Let $W_i(t)$ be the weight vector of the ith neurone at the tth iteration; the basic competitive learning algorithm can be summarized as follows:

$z_i = \begin{cases} 1 & d(x, W_i(t)) = \min_{1 \le j \le M} d(x, W_j(t)), \\ 0 & \text{otherwise}, \end{cases}$   (2.18)

$W_i(t+1) = W_i(t) + \alpha (x - W_i(t)) z_i,$   (2.19)

where $d(x, W_i(t))$ is the distance in the $L_2$ metric between the input vector x and the coupling weight vector $W_i(t) = \{w_{i1}, w_{i2}, \ldots, w_{iK}\}$; $K = p \times p$; α is the learning rate, and $z_i$ is the neurone's output.
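A minimal NumPy rendering of Eqs. (2.18) and (2.19) is given below, training an M-codeword code-book on random block vectors with an arbitrarily chosen decaying learning rate:

```python
import numpy as np

rng = np.random.default_rng(6)
p, M = 4, 32                          # 4x4 blocks, 32 code-words
K = p * p
blocks = rng.random((8000, K))        # stand-in for training vectors

W = blocks[:M].copy()                 # initialize neurones with the first M input vectors
for t, x in enumerate(blocks):
    alpha = 0.1 / (1 + t / 1000)                      # decaying learning rate
    dists = np.sum((W - x) ** 2, axis=1)              # L2 distances d(x, W_i)
    winner = np.argmin(dists)                         # z_i = 1 only for the closest neurone (Eq. 2.18)
    W[winner] += alpha * (x - W[winner])              # move the winner towards x (Eq. 2.19)

# Encoding: each block is replaced by the index of its nearest code-word.
codes = np.argmin(((blocks[:5, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)
print(codes)
```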

A so-called under-utilization problem [1,11] occurs in competitive learning, meaning that some of the neurones are left out of the learning process and never win the competition. Various schemes have been developed to tackle this problem. The Kohonen self-organizing neural network [10,13,18] overcomes the problem by updating the winning neurone as well as the neurones in its neighbourhood.

The frequency sensitive competitive learning algorithm addresses the problem by keeping a record of how frequently each neurone wins, so as to maintain that all neurones in the network are updated an approximately equal number of times. To implement this scheme, the distance is modified to include the total number of times that the neurone i has been the winner. The modified distance measurement is defined as

$d(x, W_i(t)) = d(x, W_i(t)) \, u_i(t),$   (2.20)

where $u_i(t)$ is the total number of winning times for neurone i up to the tth training cycle. Hence, the more often the ith neurone wins the competition, the greater its distance from the next input vector becomes, and thus its chance of winning the competition diminishes. This way of tackling the under-utilization problem does not, however, provide interactive solutions for optimizing the code-book.

For all competitive learning based VQ (vector quantization) neural networks, fast algorithms can be developed to speed up the training process by optimizing the search for the winner and reducing the relevant computing costs [29].

By considering general optimization methods [19], numerous variations of learning algorithms can be designed for the same neural network structure shown in Fig. 5. The general learning VQ [25,49] is one typical example. To model the optimization of the centroids $W \in R^M$ for a fixed set of training vectors $x = \{x_1, \ldots, x_K\}$, a total average mismatch between x, the input vectors, and W, the code-book represented by the neurone weights, is defined as

$C(W) = \sum_{i=1}^{K} \sum_{r=1}^{M} \frac{g_{ir} \| x_i - W_r \|^2}{K} = \frac{1}{K} \sum_{i=1}^{K} L_{x_i},$   (2.21)

where $g_{ir}$ is given by

$g_{ir} = \begin{cases} 1 & \text{if } r = i, \\ \dfrac{1}{\sum_{j=1}^{M} \| x - W_j \|^2} & \text{otherwise}, \end{cases}$

and $W_i$ is assumed to be the weight vector of the winning neurone.

To derive the optimum partition of x, the problem is reduced to minimizing the total average mismatch C(W), where $W = \{W_1, W_2, \ldots, W_M\}$ is the code-book, which varies with respect to the partition of x. From basic learning algorithm theory, the gradient of C(W) can be followed in an iterative, stepwise fashion by moving the learning algorithm in the direction of the gradient of $L_x$ in Eq. (2.21). Since $L_x$ can be rearranged as

$L_x = \sum_{r=1}^{M} g_{ir} \| x - W_r \|^2 = \| x - W_i \|^2 + \frac{\sum_{r=1, r \ne i}^{M} \| x - W_r \|^2}{\sum_{j=1}^{M} \| x - W_j \|^2} = \| x - W_i \|^2 + 1 - \frac{\| x - W_i \|^2}{\sum_{j=1}^{M} \| x - W_j \|^2},$   (2.22)

$L_x$ is differentiated with respect to $W_i$ and $W_j$ as follows:

$\frac{\partial L_x}{\partial W_i} = -2(x - W_i) \, \frac{D^2 - D + \| x - W_i \|^2}{D^2},$   (2.23)

$\frac{\partial L_x}{\partial W_j} = 2(x - W_j) \, \frac{\| x - W_i \|^2}{D^2},$   (2.24)

where $D = \sum_{r=1}^{M} \| x - W_r \|^2$.

Hence, the learning rule can be designed as follows:

$W_i(t+1) = W_i(t) + \alpha(t) (x - W_i(t)) \, \frac{D^2 - D + \| x - W_i(t) \|^2}{D^2}$   (2.25)

for the winning neurone i, and

$W_j(t+1) = W_j(t) + \alpha(t) (x - W_j(t)) \, \frac{\| x - W_i(t) \|^2}{D^2}$   (2.26)

for the other (M-1) neurones. This algorithm can also be classified as a variation of Kohonen's self-organizing neural network [33].
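One iteration of the updates (2.25) and (2.26) can be sketched as follows; this is a schematic NumPy rendering on random data, not code from [25,49]:

```python
import numpy as np

def glvq_step(W, x, alpha):
    """Update the whole code-book W (M x K) for one input vector x, Eqs. (2.25)-(2.26)."""
    sq_dists = np.sum((W - x) ** 2, axis=1)     # ||x - W_r||^2 for every neurone
    i = np.argmin(sq_dists)                     # winning neurone
    D = sq_dists.sum()                          # D = sum_r ||x - W_r||^2
    # winner: Eq. (2.25)
    W[i] += alpha * (x - W[i]) * (D ** 2 - D + sq_dists[i]) / D ** 2
    # all other neurones: Eq. (2.26)
    others = np.arange(len(W)) != i
    W[others] += alpha * (x - W[others]) * (sq_dists[i] / D ** 2)
    return W

rng = np.random.default_rng(7)
W = rng.random((16, 16))                        # 16 code-words for 4x4 blocks
for t, x in enumerate(rng.random((2000, 16))):
    W = glvq_step(W, x, alpha=0.05 / (1 + t / 500))
print(W.shape)
```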

Around the competitive learning scheme, fuzzy membership functions are introduced to control the transition from soft to crisp decisions during the code-book design process [25,31]. The essential idea is that one input vector is assigned to a cluster only to a certain extent, rather than being either 'in' or 'out'. The fuzzy assignment is useful particularly at the earlier training stages, as it guarantees that all input vectors are included in the formation of the new code-book represented by all the neurone coupling weights. Representative examples include direct fuzzy competitive learning [11], fuzzy algorithms for learning vector quantization [31,32] and the distortion equalized fuzzy competitive learning algorithm [7], etc. The so-called distortion equalized fuzzy competitive learning algorithm modifies the distance measurement used to update the neurone weights by taking fuzzy membership functions into consideration to optimize the learning process. Specifically, each neurone is allocated a distortion represented by $V_j(t)$, $j = 1, 2, \ldots, M$, with initial values $V_j(0) = 1$. The distance between the input vector $x_k$ and the neurone weights $W_j$ is then modified as follows:

$D_{kj} = d(x_k, W_j(t)) = \frac{V_j(t)}{\sum_{l=1}^{M} V_l(t)} \| x_k - W_j(t) \|^2.$   (2.27)

At each training cycle, a fuzzy membership function $\mu_{kj}(t)$ for each neurone is constructed to update both the distortion $V_j$ and the neurone weights, as given below:

$V_j(t+1) = V_j(t) + (\mu_{kj})^m D_{kj},$

$W_j(t+1) = W_j(t) + (\mu_{kj})^m \alpha (x_k - W_j(t)),$

where $m \ge 1$ is a fuzzy index which controls the fuzziness of the whole partition, and $\mu_{kj}$ is a fuzzy membership function which specifies the extent to which the input vector is associated with a coupling weight. Throughout the learning process, the fuzzy idea is incorporated to control the partition of the input vectors as well as to avoid the under-utilization problem.
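A schematic rendering of the distortion-equalized updates is given below; the membership function μ_kj is assumed here to be a simple normalized inverse distance, since its exact form in [7] is not reproduced in this survey:

```python
import numpy as np

def defcl_step(W, V, x, alpha=0.05, m=2.0, eps=1e-9):
    """One distortion-equalized fuzzy competitive learning step, Eq. (2.27) and the updates below it."""
    D = (V / V.sum()) * np.sum((W - x) ** 2, axis=1)     # modified distances D_kj, Eq. (2.27)
    mu = 1.0 / (D + eps)                                 # assumed membership: inverse distance ...
    mu /= mu.sum()                                       # ... normalized over the M neurones
    V += mu ** m * D                                     # distortion update
    W += (mu ** m * alpha)[:, None] * (x - W)            # weight update
    return W, V

rng = np.random.default_rng(8)
W, V = rng.random((16, 16)), np.ones(16)                 # V_j(0) = 1
for x in rng.random((2000, 16)):
    W, V = defcl_step(W, V, x)
print(V.round(2))
```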

Experiments have been carried out to vector quantize image samples directly, without any transform, by a number of typical CLVQ neural networks [7]. The results show that, in terms of PSNR, the fuzzy competitive learning neural network achieves 26.99 dB for Lena when the bit rate is maintained at 0.44 bits/pixel for all the neural networks. In addition, at a bit rate of 0.56 bits/pixel, the fuzzy algorithms presented in [32] achieve a PSNR figure of around 32.54 dB for the compression of Lena, which is a significant improvement compared with the conventional LBG algorithm [32].

In this section, we have discussed three major developments in neural networks for direct image compression. The first two conform to the conventional route of pixel transforms, in which principal components are extracted to reduce the redundancy embedded inside input images. The third type corresponds to the well-developed quantization technique, in which neural learning algorithms are called in to help produce the best possible code-book. While the basic structures of the encoders are very similar, each containing two layers of neurones, the working procedures are fundamentally different. Apart from the fact that the pixel-transform based narrow-channel networks always have fewer hidden neurones than input ones, they also require a complete neural network to reconstruct the compressed images. For the VQ neural networks, reconstruction is the same as in their conventional counterparts, involving only a look-up table operation. These developments, in fact, cover two stages of image compression along the conventional route of pixel transform, quantization and entropy coding. Therefore, one natural alternative is to make these two developments work together, which leads to a neural network designed to extract the desired principal components, followed by another hidden layer of neurones to perform the vector quantization. In that sense, the choice of the leading principal components could be made more flexible. One simple example is to keep all the principal components, instead of discarding the smaller ones, and then represent them by a code-book which is refined according to their associated eigenvalues.

3. Neural network development of existing technology

In this section, we show that existing conventional image compression technology can be developed into various learning algorithms to build up neural networks for image compression. This will be a significant development in the sense that various existing image compression algorithms can actually be implemented by a single neural network architecture empowered with different learning algorithms. Hence, the powerful parallel computing and learning capability of neural networks can be fully exploited to build up a universal test bed where various compression algorithms can be evaluated and assessed. Three conventional techniques are covered in this section: wavelet transforms, fractals and predictive coding.

3.1. Wavelet neural networks

Based on wavelet transforms, a number of neural networks have been designed for image processing and representation [16,24,41,66-68,75]. This direction of research mainly develops existing image coding techniques into various neural network structures and learning algorithms.

Fig. 6. Wavelet neural network.

When a signal s(t) is approximated by daughters of a mother wavelet h(t) as follows, for instance, a neural network structure can be established as shown in Fig. 6 [66-68]:

$\hat{s}(t) = \sum_{k=1}^{K} w_k \, h\!\left(\frac{t - b_k}{a_k}\right),$   (3.1)

where $w_k$, $b_k$ and $a_k$ are the weight coefficients, shifts and dilations for each daughter wavelet. To achieve the optimum approximation, the following least-mean-square energy function can be used to develop a learning algorithm for the neural network in Fig. 6:

$E = \frac{1}{2} \sum_{t=1}^{T} (s(t) - \hat{s}(t))^2.$   (3.2)

The value of E can be minimized by adaptively changing, or learning, the best possible values of the coefficients $w_k$, $b_k$ and $a_k$. One typical example is to use a conjugate gradient method [19,67] to achieve an optimum approximation of the original signal s(t). Forming the column vectors [g(w)] and [w] from the gradient analysis and the coefficients $w_k$, the ith iteration for minimising E with respect to [w] proceeds according to the following two steps:

(i) if i is a multiple of n then $[s(w)^i] = -[g(w)^i]$; else

$[s(w)^i] = -[g(w)^i] + \frac{[g(w)^i]^T [g(w)^i]}{[g(w)^{i-1}]^T [g(w)^{i-1}]} [s(w)^{i-1}];$   (3.3)

(ii)

$[w^{i+1}] = [w^i] + \alpha^i [s(w)^i].$   (3.4)

Step (i) computes a search direction [s] at iteration i. Step (ii) computes the new weight vector using a variable step-size α. By simply choosing the step-size α as the learning rate, the above two steps can be used as a learning algorithm for the wavelet neural network in Fig. 6.
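As an illustration of Eqs. (3.1) and (3.2), the sketch below fits the weights, shifts and dilations of a small set of daughter wavelets to a one-dimensional signal by plain gradient descent; the Mexican-hat mother wavelet, the test signal and the step sizes are all assumptions, and [66-68] use the conjugate gradient scheme described above instead:

```python
import numpy as np

def mexican_hat(u):
    """Assumed mother wavelet h(u) (second derivative of a Gaussian)."""
    return (1 - u ** 2) * np.exp(-u ** 2 / 2)

def mexican_hat_du(u):
    """Derivative dh/du of the assumed mother wavelet."""
    return (-2 * u - u * (1 - u ** 2)) * np.exp(-u ** 2 / 2)

rng = np.random.default_rng(9)
t = np.linspace(0, 1, 256)
s = np.sin(2 * np.pi * 5 * t) * np.exp(-4 * t)   # signal to be approximated

K = 8                                            # number of daughter wavelets
w = rng.normal(scale=0.1, size=K)                # weights w_k
b = rng.random(K)                                # shifts b_k
a = np.full(K, 0.1)                              # dilations a_k

lr = 0.01
for _ in range(3000):
    u = (t[:, None] - b) / a                     # (t - b_k) / a_k for every sample
    H = mexican_hat(u)                           # daughter-wavelet responses
    err = H @ w - s                              # s_hat(t) - s(t)
    dH = mexican_hat_du(u)
    w -= lr * H.T @ err / len(t)                 # gradient steps on E of Eq. (3.2)
    b -= lr * (dH * (-1 / a) * w).T @ err / len(t)
    a -= lr * (dH * (-u / a) * w).T @ err / len(t)
    a = np.clip(a, 0.02, None)                   # keep dilations positive
print(np.mean((H @ w - s) ** 2))                 # residual energy after fitting
```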

With similar theory (the energy function is changed into an error function), [41] suggested a back-propagation and convolution based neural network to search for the optimal wavelet decomposition of input images for efficient image data compression. The neural network structure is designed to have one input layer and one output layer. Local connections of neurones are established to complete the training corresponding to the four channels in wavelet decomposition [41,61], i.e. LL (low pass-low pass), LH (low pass-high pass), HL (high pass-low pass) and HH (high pass-high pass). From the well-known back-propagation learning rule, a convolution based training algorithm is used to optimize the neural network training, in which the quantization of the neural network output is used as the desired output to allow back-propagation learning. To modify the learning rule, minimization of both the quantization error and the entropy is taken into consideration by incorporating a so-called entropy reduction function, Z(QT(i, j)); hence the overall learning rule can be designed as follows:

$K_c(u, v)[t+1] = K_c(u, v)[t] + \eta \sum_{i,j} \delta(i, j) \, S(i-u, j-v) + \alpha \, \Delta K_c(u, v)[t],$   (3.5)

where $K_c(u, v)[t+1]$ denotes the neural network coupling weights for channel c at the (t+1)th cycle; η and α are the learning rate and momentum terms, respectively, used in modified back propagation to accelerate the learning process; S(i-u, j-v) is the original input image pixel; $\Delta K_c(u, v)[t] = K_c(u, v)[t] - K_c(u, v)[t-1]$; and $\delta(i, j) = \partial E_f / \partial K_c(u, v)$ is the so-called weight-update function derived from the local gradient in the back-propagation scheme.

$E_f$ is an error function which is defined below:

$E_f(i, j) = Z(QT(i, j)) \, (T(i, j) - QT(i, j))^2 / 2,$   (3.6)

where T(i, j) is the neural network output used to approximate the wavelet coefficients; QT(i, j) is the quantization of T(i, j); and

$Z(QT(i, j)) = \begin{cases} 0 & \text{for } QT(i, j) = 0, \\ 1 & \text{for } |QT(i, j)| = 1, \\ F(n, q) & \text{for } |QT(i, j)| = n. \end{cases}$

Z(QT(i, j)) is the above-mentioned entropy reduction function; with F(n, q) being a ramp function, Z(QT(i, j)) can be illustrated as in Fig. 7.

Fig. 7. Entropy reduction function.

While the neural network outputs converge to the optimal wavelet coefficients, its coupling weights also converge to an optimal wavelet low-pass filter. To make this happen, extra theoretical work is required to refine the system design.

Firstly, the standard wavelet transform comprises two orthonormal filters, one low-pass and the other high-pass. To facilitate the neural network learning and its convergence to one optimal wavelet filter, the entire wavelet decomposition needs to be represented by one type of filter rather than two. This can be done by working on the relation between the two orthonormal filters [2,41]. Hence all three channels, LH, HL and HH, can be represented by channel LL, and the whole wavelet decomposition can be implemented by training the neural network into one optimal low-pass filter. This redesign produces, equivalently, a pre-processing of the input pixels for the LH, HL and HH channels. Basically, the pre-processing involves a flipping operation on the input matrix and an alternating operation on the signs of the flipped matrix [39].

Secondly, each epoch of the neural network training is only a suggestion or approximation of the desired optimal wavelet low-pass filter, which also gives the smallest possible values of the error function $E_f$. To make it the best possible approximation, a post-processing stage is added to convert the neural network coupling weights into a required wavelet filter. Assuming that this desired wavelet filter is $h'_u$, the distance between $h'_u$ and the neural network suggestions can be evaluated as

$f(h'_u) = \sum_{u,v} (h'_u h'_v - K_c(u, v))^2.$   (3.7)

To minimize the distance subject to the constraint equations $C_p(h'_u) = 0$, the Lagrangian multiplier method is used to obtain a set of $h'_u$, which is given below:

$df(h'_u) + \sum_p \lambda_p \, dC_p(h'_u) = 0,$   (3.8)

where $\lambda_p$ is the Lagrangian multiplier, $df(\cdot)$ the differentiation of the function $f(\cdot)$ and $C_p(h'_u)$ the equations corresponding to the following constraints for wavelet low-pass filters [36]:

$\Big[\sum_u h_u^2\Big] - \sqrt{2}/2 = 0, \qquad \Big[\sum_u h_u h_{u+2n}\Big] - \delta_{u, u+2n} = 0,$

where $\delta_{ij}$ is the Dirac delta function.

Experiments reported in [39] on a number of image samples support the proposed neural network system, finding that Daubechies' wavelet produces a satisfactory compression with the smallest errors and Haar's wavelet produces the best results on sharp edges and low-noise smooth areas. The advantage of using such a neural network is that the search for the best possible wavelets can be incorporated directly into the wavelet decomposition used to compress images [41,74], whereas a conventional wavelet based image compression technique has to run an independent evaluation stage to find the most suitable wavelets for the multi-band decomposition [69]. In addition, the powerful learning capability of neural networks may provide adaptive solutions for the problems involved in sub-band coding of those images which may require different wavelets to achieve the best possible compression performance [4,71].

Fig. 8. IFS neural network.

3.2. Fractal neural networks

Fractally configured neural networks [45,64,71] based on IFS (iterated function system [3]) codes represent another example in the direction of developing existing image compression technology into neural networks. The conventional counterpart involves representing images by fractals, and each fractal is then represented by a so-called IFS, which consists of a group of affine transformations. To generate images from an IFS, the random iteration algorithm is the most typical technique associated with fractal based image decompression [3]. Hence, fractal based image compression features higher speed in decompression and lower speed in compression.

By establishing one neurone per pixel, two traditional algorithms for generating images using IFSs are formulated into neural networks in which all the neurones are organized in a two-dimensional topology [64]. The network structure can be illustrated as in Fig. 8, in which $w_{ij,i'j'}$ is the coupling weight between the (ij)th neurone and the (i'j')th one, and $s_{ij}$ is the state output of the neurone at position (i, j). The training algorithm is directly obtained from the random iteration algorithm, in which the coupling weights are used to interpret the self-similarity between pixels [64]. Since not all the neurones in the network are fully connected, the topology can theoretically be described by a sparse matrix. In common with most neural networks, the majority of the work performed in the neural network is to compute and optimize the coupling weights $w_{ij,i'j'}$. Once these have been calculated, the required image can typically be generated in a small number of iterations. Hence, the neural network implementation of the IFS based image coding system could lead to a massively parallel implementation on dedicated hardware for generating IFS fractals. Although the essential algorithm stays the same as its conventional counterpart, neural networks could provide solutions for the computing-intensive problems which are currently under intensive investigation in conventional fractal based image compression research.

The neural network proposed in the original reference [64] can only complete image generation from IFS codes. Research in fractal image compression, however, focuses on developing fast encoding algorithms before real-time application can be achieved. This involves an extensive search for the affine transformations that produce fractals for a given image and the generation of output entropy codes. The work described in [64] actually covers the decompression side of the technology. In addition, the neural network requires a large number of neurones arranged in two dimensions, equal to the number of pixels in the image, yet a large number of the neurones are not active according to the system design. Further work, therefore, is required along this line of research.


Fig. 9. Predictive neural network I.

3.3. Predictive coding neural networks

Predictive coding has been proved to be a powerful technique for de-correlating input data in speech compression and image compression, where a high degree of correlation is embedded among neighbouring data samples. Although general predictive coding is classified into various models such as AR and ARMA, etc., the autoregressive (AR) model has been successfully applied to image compression. Hence, predictive coding for image compression can be further classified into linear and non-linear AR models. Conventional technology provides a mature environment and well-developed theory for predictive coding, represented by LPC (linear predictive coding), PCM (pulse code modulation), DPCM (differential PCM) and their modified variations. Non-linear predictive coding, however, is very limited due to the difficulties involved in optimizing the coefficient extraction to obtain the best possible predictive values. Under this circumstance, neural networks provide a very promising approach to optimizing non-linear predictive coding [17,42].

With a linear AR model, predictive coding can be described by the following equation:

$X_n = \sum_{i=1}^{N} a_i X_{n-i} + v_n = p + v_n,$   (3.9)

where p represents the predictive value for the pixel $X_n$, which is to be encoded in the next step. Its neighbouring pixels $X_{n-1}, X_{n-2}, \ldots, X_{n-N}$ are used by the linear model to produce the predictive value, and $v_n$ stands for the error between the input pixel and its predictive value. $v_n$ can also be modelled by a set of zero-mean independent and identically distributed random variables.

Based on the above linear AR model, a multi-layer perceptron neural network can be constructed to achieve the design of its corresponding non-linear predictor, as shown in Fig. 9. For the pixel $X_n$ which is to be predicted, its N neighbouring pixels obtained from its predictive pattern are arranged into a one-dimensional input vector $X = \{X_{n-1}, X_{n-2}, \ldots, X_{n-N}\}$ for the neural network. A hidden layer is designed to carry out back-propagation learning for training the neural network. The output of each neurone, say the jth neurone, can be derived from the equation given below:

$h_j = f(R) = f\!\left(\sum_{i=1}^{N} w_{ji} x_{n-i}\right),$   (3.10)

where $f(v) = 1/(1 + e^{-v})$ is a sigmoid transfer function.

To predict those drastically changing features inside images, such as edges, contours, etc., high-order terms are added to improve the predictive performance. This corresponds to a non-linear AR model expressed as follows:

X_n = \sum_i a_i X_{n-i} + \sum_i \sum_j a_{ij} X_{n-i} X_{n-j} + \sum_i \sum_j \sum_k a_{ijk} X_{n-i} X_{n-j} X_{n-k} + \cdots + v_n.    (3.11)

Hence, another so-called functional-link type neural network can be designed [42] to implement this type of non-linear AR model with high-order terms. The structure of the network is illustrated in Fig. 10. It contains only two layers of neurones, one for input and the other for output. The coupling weights, {w_i}, between the input layer and the output layer are trained towards minimizing the residual energy, which is defined as

RE = \sum_n e_n = \sum_n (X_n - \hat{X}_n)^2,    (3.12)

where \hat{X}_n is the predictive value for the pixel X_n.


Fig. 10. Predictive neural network II.

Fig. 11. Neural network adaptive image coding system.

Predictive performance with these neural networks is claimed to outperform the conventional optimum linear predictors by about 4.17 and 3.74 dB for two test images [42]. The reported results encourage further research, especially for non-linear networks, on optimizing their learning rules for the prediction of images whose contents are subject to abrupt statistical changes.
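
As a concrete illustration of Eqs. (3.10)-(3.12), the following sketch trains a small multi-layer perceptron to predict a pixel from its N causal neighbours by gradient descent on the residual energy; the simplified row-wise predictive pattern, the network size, the learning rate and the toy image are illustrative assumptions rather than the configurations used in [17,42].

import numpy as np

def causal_samples(img, N=4):
    """Collect (neighbour vector, current pixel) training pairs using a
    simplified row-wise causal pattern X_{n-1}, ..., X_{n-N}."""
    X, y = [], []
    for row in img:
        for n in range(N, len(row)):
            X.append(row[n - N:n][::-1])      # X_{n-1}, ..., X_{n-N}
            y.append(row[n])
    return np.asarray(X, dtype=float) / 255.0, np.asarray(y, dtype=float) / 255.0

def train_mlp_predictor(X, y, hidden=8, lr=0.05, epochs=30, seed=0):
    """One sigmoid hidden layer (cf. Eq. (3.10)) and a linear output unit,
    trained by back-propagation to reduce the residual energy of Eq. (3.12)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    W2 = rng.normal(0.0, 0.1, hidden)
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    for _ in range(epochs):
        h = sigmoid(X @ W1)                   # hidden activations h_j
        pred = h @ W2                         # predictive values
        err = pred - y                        # prediction residuals
        gW2 = h.T @ err                       # gradients of the squared residuals
        gW1 = X.T @ ((err[:, None] * W2) * h * (1.0 - h))
        W1 -= lr * gW1 / len(X)
        W2 -= lr * gW2 / len(X)
    return W1, W2

# Toy example on a synthetic ramp image (values 0..63).
img = np.tile(np.arange(64, dtype=np.uint8), (64, 1))
X, y = causal_samples(img)
W1, W2 = train_mlp_predictor(X, y)
h_final = 1.0 / (1.0 + np.exp(-(X @ W1)))
print("residual energy:", float(np.sum((h_final @ W2 - y) ** 2)))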

4. Neural network based image compression

In this section, we present two image compression systems which are developed firmly based on neural networks. This illustrates how neural networks may play important roles in assisting with new technology development in the image compression area.

4.1. Neural network based adaptive image coding

The structure of a neural network based adaptive image coding system [21] is illustrated in Fig. 11.

The system is developed based on a composite source model in which multiple sub-sources are involved. The principle is similar to that of the adaptive narrow-channel neural networks described in Section 2.1.3. The basic idea is that the input image can be classified into a certain number of different classes or sub-sources by features pre-defined for any input image data set. Features of an image are extracted through image texture analysis, as shown in the first block in Fig. 11. After the sub-source or class to which the input image block belongs is identified, a K-L neural network transform and a number of quantization code books specifically refined for each individual sub-source are then activated by the class label, L, to further process the block and achieve the best possible data compression.

The LEP self-organization neural network in Fig. 11 consists of two layers, one input layer and one output layer. The input neurones are organized according to the feature sets. Accordingly, the input neurones can be represented by {(x_{mn}), m = 1, 2, ..., M_n; n = 1, 2, ..., N}, where M_n is the dimension of the input data with feature set n. It can be seen from this representation that there are N features in the input data, each of which consists of M_n elements. In order to classify the input image data into various sub-sources, the neurones at the output layer are represented by {(u_{jk}), j = 1, 2, ..., p; k = 1, 2, ..., q}, where p is the number of output sub-sources and q the number of feature sets, or perspectives as termed in the original paper [21]. The output neurone representation is also illustrated in Fig. 12, where the neurones in each of the output sub-sources are, for convenience of understanding, represented by different symbols such as squares, triangles, circles and rectangles. It can be seen from Fig. 12 that each individual sub-source of the input data is typified by q features or perspectives at the output layer. Therefore, the overall learning classification system comprises q perspective neural networks, as summarized in Fig. 13.


Fig. 12. Output neurons at LEP neural network.

Fig. 13. LEP overall neural network structure.

To classify the input block pattern, each input feature is connected to a perspective array of output neurones which represent different sub-sources or classes. Competitive learning is then run inside each perspective neural network in Fig. 13 to obtain a small proportion of winners in terms of the potential values computed at the output neurones. The potential value is calculated from a so-called activated function. The winners obtained from each neural network, in fact, are only referred to as activated, because the real winner is picked up by the overall neural network rather than by each individual one, and the winner is a sub-source represented by a column in Fig. 12. Since the activated sub-units or neurones picked up by each individual neural network will not always belong to the same output sub-source, the overall winner will not be selected until all of its sub-units are activated. In Fig. 13, for example, if all triangle neurones are activated in each perspective neural network, the overall winner will then be the sub-source represented by the column of triangles in Fig. 12. In case no match is found, a new sub-source has to be added to the system.
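
A minimal sketch of the winner-selection logic described above is given below: each of the q perspective networks activates a small set of sub-sources, and the overall winner is a sub-source activated in every perspective, a new sub-source being created when no such match exists. The plain dot-product potential and the parameter names used here are assumptions standing in for the activated function of [21].

import numpy as np

def select_winner(perspective_features, weights, top_k=2):
    """perspective_features: list of q feature vectors (one per perspective).
    weights[j][k]: weight vector of sub-source j in perspective k.
    Returns the index of the overall winning sub-source, or None if no
    sub-source is activated in all q perspectives (a new one is then added)."""
    p, q = len(weights), len(perspective_features)
    activated = []
    for k in range(q):
        # Potential of every sub-source in perspective k (a dot product here
        # stands in for the activated function of [21]).
        potentials = np.array([weights[j][k] @ perspective_features[k]
                               for j in range(p)])
        # Keep only a small proportion of winners in this perspective.
        activated.append(set(np.argsort(potentials)[-top_k:]))
    common = set.intersection(*activated)          # activated in every perspective
    if not common:
        return None                                # no match: add a new sub-source
    # Overall winner: the common sub-source with the largest total potential.
    return max(common, key=lambda j: sum(weights[j][k] @ perspective_features[k]
                                         for k in range(q)))

# Toy usage with random features and weights.
rng = np.random.default_rng(0)
q, p, dim = 3, 4, 8
feats = [rng.normal(size=dim) for _ in range(q)]
W = [[rng.normal(size=dim) for _ in range(q)] for _ in range(p)]
print("overall winner:", select_winner(feats, W))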

An extended delta learning rule is designed to update the connection weights {w_{(mn)(jk)}, m = 1, 2, ..., M_n; n = 1, 2, ..., N; j = 1, 2, ..., p; k = 1, 2, ..., q} only for the winning sub-sources. The weight w_{(mn)(jk)} defines the connection strength between the (mn)th input neurone and the (jk)th output neurone. The modification is

\Delta w_{(mn)(jk)} = d(s_{mn}, w_{(mn)(jk)}) \, c_j \, a(e_j),  k = 1, 2, ..., q,    (4.1)

where d(s_{mn}, w_{(mn)(jk)}) is defined as a similarity measure [21], and c_j = \sum_k a_k c_{jk} is the so-called total confidence factor of sub-source j with respect to the input pattern. To judge the reliability of the learned information and to decide how far the learning process can modify the existing connection strengths, an experience record, e_j, for each sub-source is defined as the number of times the sub-source has won the competition. In order to be adaptive to a spatially and temporally variant environment, the learning process is also subject to a forgetting process in which an experience attenuation factor is considered.

The K-L transform neural network implemented in Fig. 11 is another example of developing the existing technology into neural learning algorithms [21]. As explained in Section 2.1, the idea of the K-L transform is to de-correlate the input data and produce a set of orthonormal eigenvectors in descending order with respect to variance amplitude. Hence image compression can be achieved by choosing a number of the largest principal components out of the whole eigenvector set to reconstruct the image. The neural network structure is the same as that of the LEP. For the ith neurone at the output layer, the vector of weights for connections to all the input neurones can be represented by {w_{ij}, j = 1, 2, ..., N}, where N is the dimension of the input vectors. After a certain number of complete learning cycles, all the weight vectors, {w_{ij}, j = 1, 2, ..., N}, can be taken as the eigenvectors for the particular K-L transform.


Therefore, each weight vector associated with an output neurone represents one axis of the new co-ordinate system. In other words, the training of the neural network is designed to converge all the axes in the output neurones to the new co-ordinate system described by the orthonormal eigenvectors.

The function within the ith neurone at the output layer is defined as the squared projection of the input pattern onto the ith co-ordinate in the (N - i + 1)-dimensional space:

u_i = \left( \sum_{j=1}^{N} w_{ij} s_j \right)^2,    (4.2)

where w_{ij} is the weight of the connection between the jth input neurone and the ith output neurone, and s_j is the jth element of the input vector.

The learning rule is designed on the basis of rotating each co-ordinate axis of the original N-dimensional Cartesian co-ordinate system towards the main direction of the processed vectors. Specifically, each time an input vector enters the network, all the output neurones are trained one by one, sequentially, for a maximum of N - 1 times, until the angle of rotation is so small that the axis represented by the neurone concerned is almost aligned with the main direction of the input patterns. This algorithm is basically developed from the conventional mathematical iterations for obtaining the K-L transform. As a matter of fact, the back-propagation learning discussed earlier can be adopted to achieve a good approximation of the K-L transform and an improvement in processing speed.
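
For reference, the following sketch computes, by conventional eigen-decomposition of the block covariance, the orthonormal eigenvector basis that the K-L transform network is trained to converge to, and reconstructs image blocks from the largest principal components; the block size and the number of retained components are arbitrary illustrative choices, and the sketch does not reproduce the rotation-based learning rule of [21].

import numpy as np

def kl_transform_codec(img, block=8, keep=8):
    """Block K-L transform: returns the orthonormal basis (rows sorted by
    descending variance), the retained coefficients and the block-wise
    reconstruction obtained from those coefficients."""
    h, w = (img.shape[0] // block) * block, (img.shape[1] // block) * block
    blocks = (img[:h, :w].astype(float)
              .reshape(h // block, block, w // block, block)
              .swapaxes(1, 2).reshape(-1, block * block))
    mean = blocks.mean(axis=0)
    cov = np.cov(blocks - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)           # eigenvalues in ascending order
    basis = eigvec[:, ::-1].T                      # rows = eigenvectors, descending variance
    coeffs = (blocks - mean) @ basis[:keep].T      # keep the largest principal components
    recon = coeffs @ basis[:keep] + mean           # block-wise reconstruction
    return basis, coeffs, recon

rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(64, 64))
basis, coeffs, recon = kl_transform_codec(image, block=8, keep=8)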

With this adaptive image coding system, a compression performance of 0.18 bit/pixel is reported for 8-bit grey level images [21].

4.2. Lossless image compression

Lossless image compression covers another important side of image coding, where images can be reconstructed exactly the same as the originals without losing any information. There are a number of key application areas, notably the compression of X-ray images, in which distortion or error cannot be tolerated.

Theoretically, lossless image compression can be described as an inductive inference problem, in which a conditional probability distribution of future data is learned from the previously encoded data set C = {x_{mn}: n < j and 0 ≤ m < N; n = j and 0 ≤ m < i}, where the image encoded is of size N × M, and x_{ij} is the next pixel to be encoded. To achieve the best possible compression performance, it is desired to make inferences on x_{ij} such that the probability assigned to the entire two-dimensional data set,

P\{x_{ij} \mid i, j \in (N, M)\} = \prod_{i, j = 0}^{N-1, M-1} P(x_{ij} \mid S_{ij}), \quad S_{ij} \subseteq C,    (4.3)

is maximized, where S_{ij} is the context used to condition the probability assigned.

Due to the fact that images are digitized from analogue signals, strong correlation exists among neighbouring pixels. In practice, therefore, various prediction techniques [22,27,35,43,44,71] have been developed to assist with the above inferences. The idea is to de-correlate the pixel data in such a way that the residue of the prediction becomes an independent random data set. Hence, better inferences can be made to maximize the probabilities allocated to all error entries, and the entropy of the error data set can also be minimized correspondingly. As a matter of fact, comparative studies [43,71] reveal that the main redundancy reduction is achieved through the inference of the maximum probability assigned to each pixel encoded. The context based algorithm [71], for instance, gives the best performance according to the results reported in the literature.

Normally, the criterion for prediction is to minimize the values of the errors, in which a mean square error (MSE) function is used to measure the prediction performance. This notion leads to a number of linear adaptive prediction schemes which produce good compression of images on a lossless basis [37,43]. Since prediction also reduces spatial redundancy in image pixels, another criterion of minimizing the zero-order entropy of those errors proves to work better than MSE based prediction [30], which often leads to non-linear prediction [44] or complicated linear prediction obtained through iterations [30].


Fig. 14. (a) Predictive pattern for entropy optimal linear prediction; (b) multi-layer perceptron neural network structure.

This is put forward from the observation that optimal MSE based prediction does not necessarily yield optimal entropy of the predicted errors. Experiments [30] show that around a 10% difference exists between MSE based prediction and entropy based prediction in terms of the entropy of the errors. In such a context, one possibility of using neural networks for lossless image compression is to train the network such that minimum entropy of its hidden layer outputs can be obtained. The output values, equivalent to the predicted errors, are then further entropy coded by Huffman or arithmetic coding algorithms.
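
The short sketch below illustrates the entropy criterion itself: it computes the zero-order entropy of the prediction residuals, the quantity that entropy-based prediction minimizes and that an MSE-optimal predictor does not necessarily minimize. The fixed west-neighbour predictor and the synthetic test image are purely illustrative assumptions.

import numpy as np

def zero_order_entropy(values):
    """Zero-order entropy (bits/symbol) of an integer-valued array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def residual_entropy(img):
    """Entropy of the residuals of a trivial west-neighbour predictor; [30]
    optimizes the predictor coefficients to minimize exactly this quantity."""
    img = img.astype(int)
    residual = img[:, 1:] - img[:, :-1]     # prediction error e = x - x_west
    return zero_order_entropy(residual)

rng = np.random.default_rng(0)
smooth = np.cumsum(rng.integers(-2, 3, size=(64, 64)), axis=1) + 128
print("pixel entropy:", zero_order_entropy(smooth),
      "residual entropy:", residual_entropy(smooth))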

The work reported in [30] can be taken as an example close to the above scheme. In this work, lossless compression is achieved through both linear prediction and non-linear prediction, in which linear prediction is used first to compress the smooth areas of the input image and the non-linear prediction implemented on the neural network is used for the non-smooth areas. Fig. 14 illustrates the linear predictive pattern (Fig. 14(a)) and the multi-layer perceptron neural network structure (Fig. 14(b)). The linear prediction is designed according to the criterion that minimum entropy of the predicted errors is achieved. This is carried out through a number of iterations to refine the coefficients towards their optimal values. Hence, each image will have its own optimal coefficients, and these have to be transmitted as overhead. After the first pass, each pixel will have a corresponding error produced by the optimal linear prediction, represented as g(i, j). The pixel is further classified as being inside a non-smooth area if the predicted errors of its neighbouring pixels 1 and 2 given in Fig. 14(a) satisfy the following equation:

TH_1 < |g(i, j-1) + g(i-1, j)| < TH_2,    (4.4)

where TH_1 and TH_2 stand for the thresholds. Around the pixel to be encoded, four predicted errors, {g_{i-1,j}, g_{i-1,j-1}, g_{i,j-1}, g_{i+1,j-1}}, whose positions are shown in Fig. 15, are taken as the input of the neural network, and supervised training is applied to adjust the weights and thresholds to minimize the MSE between the desired output, g(i, j), and the real output. After the training is finished, the values of the weights and thresholds are transmitted as overheads to the decoding end, and the neural network is ready to apply a further non-linear prediction to those pixels inside non-smooth areas. The overheads incurred in both the optimal linear and the neural network non-linear prediction can be viewed as an extra cost for statistics modelling [71]. In fact, the entropy-based criterion optimizes the predictive coefficients by taking probability assignment or statistical modelling into consideration.


Fig. 15. Predictive pattern.

Fig. 16. VQ neural network assisted prediction.

No work has been reported, however, regarding how the coefficients can be efficiently and adaptively optimized under such a context without significantly increasing the modelling cost (intensive computing, overheads, etc.). This remains an interesting area for further research.

Edge detection based prediction, under development for the new JPEG-LS standard [40], provides a simple yet successful prediction scheme in terms of reducing the predicted error entropy. It takes the three neighbouring pixels at the north, west and north-west locations into consideration to determine the predictive value, depending on whether a vertical edge or a horizontal edge is detected or not. Detailed discussion of such a scheme, however, is outside the scope of this paper since it is not directly relevant to any neural network application. Further details can be found in the JPEG-LS documents [40].
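
For comparison only, the sketch below implements the median edge detection (MED) predictor commonly associated with LOCO-I/JPEG-LS, which falls back to the minimum or maximum of the west and north neighbours when the north-west pixel indicates an edge and to a planar prediction otherwise; the boundary handling here is a simplifying assumption.

import numpy as np

def med_predict(img):
    """Median edge detection (MED) predictor of LOCO-I/JPEG-LS. Returns the
    predicted image; the first row and column are copied unchanged as a
    simplifying boundary assumption."""
    img = img.astype(int)
    pred = img.copy()
    for i in range(1, img.shape[0]):
        for j in range(1, img.shape[1]):
            w, n, nw = img[i, j - 1], img[i - 1, j], img[i - 1, j - 1]
            if nw >= max(w, n):        # edge indicated: take the smaller neighbour
                pred[i, j] = min(w, n)
            elif nw <= min(w, n):      # edge indicated: take the larger neighbour
                pred[i, j] = max(w, n)
            else:                      # smooth area: planar prediction
                pred[i, j] = w + n - nw
    return pred

rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(32, 32))
residual = sample.astype(int) - med_predict(sample)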

With MSE based linear prediction, the main problem is that the predictive value for any pixel has to be optimized from its preceding pixel values rather than from the pixel itself. Fig. 15 illustrates an example of a predictive pattern in which the pixel x_{ij} is to be predicted from its four neighbouring pixels x_{i-1,j-1}, x_{i,j-1}, x_{i+1,j-1} and x_{i-1,j}. Thus the predictive value of x_{ij} can be expressed as

p_{ij} = a^1_{ij} x_{i-1,j} + a^2_{ij} x_{i-1,j-1} + a^3_{ij} x_{i,j-1} + a^4_{ij} x_{i+1,j-1}.    (4.5)

To optimize the value of p_{ij}, a group of encoded pixels, shown inside the dashed-line area in Fig. 15 and denoted here by the window W, can be selected in place of the pixel x_{ij} to determine the best possible values of a^1_{ij}, a^2_{ij}, a^3_{ij} and a^4_{ij}. When the MSE criterion is used, the problem comes down to minimizing the following error function:

e = \sum_{x_{ij} \in W} (x_{ij} - p_{ij})^2.    (4.6)
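
A minimal sketch of this MSE based optimization is given below: for each pixel, the four coefficients of Eq. (4.5) are obtained by solving the least-squares problem of Eq. (4.6) over the previously encoded pixels in a causal window W, and the optional mask argument anticipates the VQ based pre-classification discussed next; the window size and the fallback rule are illustrative assumptions.

import numpy as np

def neighbours(img, i, j):
    """The four causal neighbours of Eq. (4.5): west, north-west, north, north-east."""
    return np.array([img[i, j - 1], img[i - 1, j - 1],
                     img[i - 1, j], img[i - 1, j + 1]], dtype=float)

def predict_pixel(img, i, j, half_width=3, mask=None):
    """Least-squares estimate of a1..a4 over a causal window (Eq. (4.6)),
    then prediction of x_ij with Eq. (4.5). `mask(r, c)` may exclude window
    pixels, e.g. those rejected by the VQ pre-classification of [27]."""
    A, b = [], []
    for r in range(max(1, i - half_width), i + 1):
        for c in range(max(1, j - half_width), min(img.shape[1] - 1, j + half_width + 1)):
            if (r, c) >= (i, j):                 # keep only previously encoded pixels
                break
            if mask is not None and not mask(r, c):
                continue
            A.append(neighbours(img, r, c))
            b.append(float(img[r, c]))
    if len(A) < 4:                               # too few samples: fall back to the west pixel
        return float(img[i, j - 1])
    coeff, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return float(neighbours(img, i, j) @ coeff)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32))
p_hat = predict_pixel(image, 10, 10)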

The idea is based on the assumption that the coefficients a^1_{ij}, a^2_{ij}, a^3_{ij} and a^4_{ij} will give the best possible predictive value p_{ij} if they keep the total error in predicting all the other pixels inside the window in Fig. 15 to a minimum. The assumption is made on the grounds that image pixels are highly correlated. But when the window goes through drastically changing image content areas such as edges, lines, etc., the assumption no longer holds, which causes serious distortion between the predictive value and the current pixel value to be encoded. One way of resolving this problem is to use a vector quantization neural network to quantize the pixels in W into a number of clusters [27]. The idea is to use the neural network to coarsely pre-classify all the pixels inside the window W and to exclude those pixels which are unlikely to be close to the pixel to be predicted. In other words, only those pixels which are classified as being inside the same group by the vector quantization neural network are involved in the above MSE based optimal linear prediction. The overall structure of the system is illustrated in Fig. 16, in which the vector quantization neural network is adopted to produce an adaptive code-book for quantizing or pre-classifying the part of the image up to the pixel to be predicted.


Since the vector quantization is only used to pre-classify those previously encoded pixels, no loss of information is incurred during this pre-classification process. To this end, such a neural network application can be viewed as an indirect application for lossless image compression. By considering each pixel as an individual symbol in a pre-defined scan order, other general lossless data compression neural networks can also be applied to image compression [28,59].

In fact, theoretical results have been derived [56], based on Kolmogorov's mapping neural network existence theorem [20], showing that a C grey level image of size n × n can be completely described by a three-layer neural network with 2⌈log n⌉ inputs, 4⌈log n⌉ + 2 hidden neurones and ⌈log C⌉ output neurones [56]. This leads to a storage of full connections given by 8⌈log n⌉² + (1 + ⌈log C⌉)4⌈log n⌉ + 2⌈log C⌉. Compared with the original image, which requires ⌈log C⌉ n² bits, a theoretical lossless compression ratio can be expected. Further research, therefore, can be initiated under the guidance of this analysis to achieve the theoretical target for various classes of input images.
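
To make these figures concrete, the short calculation below evaluates the connection count of [56] for an example image size; it assumes base-2 logarithms and a fixed number of bits per stored weight (16 here), neither of which is specified above.

from math import ceil, log2

def romaniuk_bound(n, C, bits_per_weight=16):
    """Connection count of the three-layer network of [56] for an n x n image
    with C grey levels, compared with the raw image size in bits."""
    ln, lC = ceil(log2(n)), ceil(log2(C))
    inputs, hidden, outputs = 2 * ln, 4 * ln + 2, lC
    connections = inputs * hidden + hidden * outputs   # = 8*ln^2 + (1 + lC)*4*ln + 2*lC
    network_bits = connections * bits_per_weight       # illustrative storage assumption
    raw_bits = lC * n * n
    return connections, network_bits, raw_bits

print(romaniuk_bound(512, 256))   # e.g. a 512 x 512 image with 256 grey levels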

5. Conclusions

In this paper, we have discussed various up-to-date image compression neural networks, which are classified into three different categories according to the nature of their applications and design. These include direct development of neural learning algorithms for image compression, neural network implementation of traditional image compression algorithms, and indirect applications of neural networks to assist with existing image compression techniques.

With direct learning algorithm development, vector quantization neural networks and narrow-channel neural networks stand out as the most promising techniques, having shown competitive performance and even improvements over traditional algorithms. While conventional image compression technology is developed along the route of compacting information (transforms within different domains), followed by quantization and entropy coding, the narrow-channel neural networks based on principal components extraction produce a better approximation when extracting principal components from the input images, and VQ neural networks make a good replacement for quantization. Since vector quantization is included in many image compression algorithms, such as the wavelet-based variations, many practical applications can be initiated in image processing and image coding related areas.

Theoretically, all the existing state-of-the-art image compression algorithms can possibly be implemented by extended neural network structures, such as the wavelet, fractal and predictive coding schemes described in this paper. One of the advantages of doing so is that the implementation of various techniques can be standardized on dedicated hardware and architectures. Extensive evaluation and assessment of a wide range of different techniques and algorithms can be conveniently carried out on generalized neural network structures. From this point of view, image compression development becomes essentially the design of learning algorithms for neural networks. People often get the wrong impression that neural networks are computing intensive and time consuming. The fact is the contrary. For most of the neural networks discussed in this paper, the main bottleneck is the training or convergence of the coupling weights. This stage, however, is only a preparation for the neural network and does not affect the actual processing speed. In the case of vector quantization, for example, a set of pre-defined image samples is often used to train the neural network. After the training is finished, the converged code-book is used to vector quantize all the input images throughout the whole process, due to the generalization property of neural networks [4,9]. To this end, the neural network will perform at least as efficiently as conventional vector quantizers in software implementation. As a matter of fact, recent work carried out at Loughborough clearly shows that when LBG is implemented on a neural network structure, the software simulation is actually faster than the conventional LBG [4]. With dedicated hardware implementation, the massively parallel computing nature of neural networks is quite obvious due to the parallel structure and arrangement of all the neurones within each layer.


In addition, neural networks can also be implemented on general-purpose parallel processing architectures or arrays with the programmable capability to change their structures and hence their functionality [18,73].

Indirect neural network applications are developed to assist with traditional techniques and provide a very good potential for further improvement of conventional image coding and compression algorithms. This is typified by significant research work on image pattern recognition, feature extraction and classification by neural networks. When traditional compression technology is applied to those pre-processed patterns and features, improvement can be expected by using neural networks, since their applications in these areas are well established. Hence, image processing based compression technology could be one of the major research directions in the next stage of image compression development.

At present, research in image compression neural networks is limited to the mode pioneered by conventional technology, namely, information compacting (transforms) + quantization + entropy coding. Neural networks are only developed to target individual problems inside this mode [9,15,36,47]. Typical examples are the narrow-channel type for information compacting, and LVQ for quantization, etc. Although significant work has been done towards neural network development for image compression, and strong competition can be forced on conventional techniques, it is premature to say that neural network technology standing alone can provide better solutions for practical image coding problems in comparison with the traditional techniques. Co-ordinated efforts world-wide are required to assess the neural networks developed on practical applications, in which the training sets and image samples should be standardized. In this way, every algorithm proposed can go through the same assessment with the same test data set. Further research can also be targeted at designing neural networks capable of both information compacting and quantizing, so that the advantages of both techniques can be fully exploited. Therefore, future research work on image compression neural networks can consider designing more hidden layers to allow the neural networks to go through more interactive training and sophisticated learning procedures. Accordingly, high performance compression algorithms may be developed and implemented in those neural networks. Within this infrastructure, dynamic connections of the various neurones and non-linear transfer functions can also be considered and explored to improve their learning performance for those image patterns with drastically changing statistics.

References

[1] S.C. Ahalt, A.K. Krishnamurthy et al., Competitive learning algorithms for vector quantization, Neural Networks 3 (1990) 277–290.

[2] A. Averbuch, D. Lazar, M. Israeli, Image compression using wavelet transform and multiresolution decomposition, IEEE Trans. Image Process. 5 (1) (January 1996) 4–15.

[3] M.F. Barnsley, L.P. Hurd, Fractal Image Compression, AK Peters Ltd, 1993, ISBN: 1-56881-000-8.

[4] G. Basil, J. Jiang, Experiments on neural network implementation of LBG vector quantization, Research Report, Department of Computer Studies, Loughborough University, 1997.

[5] Benbenisti et al., New simple three-layer neural network for image compression, Opt. Eng. 36 (1997) 1814–1817.

[6] H. Bourlard, Y. Kamp, Auto-association by multilayer perceptrons and singular value decomposition, Biol. Cybernet. 59 (1988) 291–294.

[7] D. Butler, J. Jiang, Distortion equalised fuzzy competitive learning for image data vector quantization, in: IEEE Proc. ICASSP'96, Vol. 6, 1996, pp. 3390–3394, ISBN: 0-7803-3193-1.

[8] G. Candotti, S. Carrato et al., Pyramidal multiresolution source coding for progressive sequences, IEEE Trans. Consumer Electronics 40 (4) (November 1994) 789–795.

[9] S. Carrato, Neural networks for image compression, in: Neural Networks: Adv. and Appl. 2, Gelenbe (Ed.), North-Holland, Amsterdam, 1992, pp. 177–198.

[10] O.T.C. Chen et al., Image compression using self-organisation networks, IEEE Trans. Circuits Systems for Video Technol. 4 (5) (1994) 480–489.

[11] F.L. Chung, T. Lee, Fuzzy competitive learning, Neural Networks 7 (3) (1994) 539–551.

[12] P. Cicconi et al., New trends in image data compression, Comput. Med. Imaging Graphics 18 (2) (1994) 107–124, ISSN 0895-6111.

[13] S. Conforto et al., High-quality compression of echographic images by neural networks and vector quantization, Med. Biol. Eng. Comput. 33 (5) (1995) 695–698, ISSN 0140-0118.

[14] G.W. Cottrell, P. Munro, D. Zipser, Learning internal representations from grey-scale images: an example of extensional programming, in: Proc. 9th Annual Cognitive Science Society Conf., Seattle, WA, 1987, pp. 461–473.


[15] J. Daugman, Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression, IEEE Trans. ASSP 36 (1988) 1169–1179.

[16] T.K. Denk, V. Parhi, Cherkasky, Combining neural networks and the wavelet transform for image compression, in: IEEE Proc. ASSP, Vol. I, 1993, pp. 637–640.

[17] R.D. Dony, S. Haykin, Neural network approaches to image compression, Proc. IEEE 83 (2) (February 1995) 288–303.

[18] W.C. Fang, B.J. Sheu, T.C. Chen, J. Choi, A VLSI neural processor for image data compression using self-organisation networks, IEEE Trans. Neural Networks 3 (3) (1992) 506–519.

[19] R. Fletcher, Practical Methods of Optimisation, Wiley, New York, 1987, ISBN: 0-471-91547-5.

[20] R. Hecht-Nielsen, Neurocomputing, Addison-Wesley, Reading, MA, 1990.

[21] N. Heinrich, J.K. Wu, Neural network adaptive image coding, IEEE Trans. Neural Networks 4 (4) (1993) 605–627.

[22] P.G. Howard, J.S. Vitter, New methods for lossless image compression using arithmetic coding, Inform. Process. Management 28 (6) (1992) 765–779.

[23] C.H. Hsieh, J.C. Tsai, Lossless compression of VQ index with search-order coding, IEEE Trans. Image Process. 5 (11) (November 1996) 1579–1582.

[24] K.M. Iftekharuddin et al., Feature-based neural wavelet optical character recognition system, Opt. Eng. 34 (11) (1995) 3193–3199.

[25] J. Jiang, Algorithm design of an image compression neural network, in: Proc. World Congress on Neural Networks, Washington, DC, 17–21 July 1995, pp. I792–I798, ISBN: 0-8058-2125.

[26] J. Jiang, A novel design of arithmetic coding for data compression, IEE Proc.-E: Computer and Digital Techniques 142 (6) (November 1995) 419–424, ISSN 1350-2387.

[27] J. Jiang, A neural network based lossless image compression, in: Proc. Visual'96: Internat. Conf. on Visual Information Systems, 5–6 February 1996, Melbourne, Australia, pp. 192–201, ISBN: 1-875-33852-7.

[28] J. Jiang, Design of neural networks for lossless data compression, Opt. Eng. 35 (7) (July 1996) 1837–1843.

[29] J. Jiang, Fast competitive learning algorithm for image compression neural networks, Electronics Lett. 32 (15) (July 1996) 1380–1381.

[30] W.W. Jiang et al., Lossless compression for medical imaging systems using linear/non-linear prediction and arithmetic coding, in: Proc. ISCAS'93: IEEE Internat. Symp. on Circuits and Systems, Chicago, 3–6 May 1993, ISBN 0-7803-1281-3.

[31] N.B. Karayiannis, P.I. Pai, Fuzzy vector quantization algorithms and their application in image compression, IEEE Trans. Image Process. 4 (9) (1995) 1193–1201.

[32] N.B. Karayiannis, P.I. Pai, Fuzzy algorithms for learning vector quantization, IEEE Trans. Neural Networks 7 (5) (1996) 1196–1211.

[33] T. Kohonen, Self-Organisation and Associative Memory, Springer, Berlin, 1984.

[34] D. Kornreich, Y. Benbenisti, H.B. Mitchell, P. Schaefer, Normalization schemes in a neural network image compression algorithm, Signal Processing: Image Communication 10 (1997) 269–278.

[35] D. Kornreich, Y. Benbenisti, H.B. Mitchell, P. Schaefer, A high performance single-structure image compression neural network, IEEE Trans. Aerospace Electronic Systems 33 (1997) 1060–1063.

[36] S.Y. Kung, K.I. Diamantaras, J.S. Taur, Adaptive principal component extraction (APEX) and applications, IEEE Trans. Signal Process. 42 (May 1994) 1202–1217.

[37] N. Kuroki et al., Lossless image compression by two-dimensional linear prediction with variable coefficients, IEICE Trans. Fund. E75-A (7) (1992) 882–889.

[38] R.P. Lippmann, An introduction to computing with neural nets, IEEE ASSP Mag. (April 1987) 4–21.

[39] S.B. Lo, H. Li et al., On optimization of orthonormal wavelet decomposition: implication of data accuracy, feature preservation, and compression effects, in: SPIE Proc., Vol. 2707, 1996, pp. 201–214.

[40] Lossless and near-lossless coding of continuous tone still images (JPEG-LS), ISO/IEC JTC 1/SC 29/WG 1 FCD 14495, public draft, 1997 (http://www.jpeg.org/public/jpeg-links.htm).

[41] S.G. Mallat, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell. 11 (7) (July 1989) 674–693.

[42] C.N. Manikopoulos, Neural network approach to DPCM system design for image coding, IEE Proc.-I 139 (5) (October 1992) 501–507.

[43] N.D. Memon, K. Sayood, Lossless image compression: A comparative study, in: Proc. SPIE Still Image Compression, Vol. 2418, 1995, pp. 8–27.

[44] N. Memon et al., Differential lossless encoding of images using non-linear predictive techniques, in: Proc. ICIP-94, Internat. Conf. on Image Processing, Vol. III, 13–16 November 1994, Austin, Texas, pp. 841–845, ISBN: 0-8186-6950-0.

[45] J.W.L. Merrill, R.F. Port, Fractally configured neural networks, Neural Networks 4 (1) (1991) 53–60.

[46] N. Mohsenian, S.A. Rizvi, N.M. Nasrabadi, Predictive vector quantization using a neural network approach, Opt. Eng. 32 (7) (July 1993) 1503–1513.

[47] M. Mougeot, R. Azencott, B. Angeniol, Image compression with back propagation: improvement of the visual restoration using different cost functions, Neural Networks 4 (4) (1991) 467–476.

[48] A. Namphol, S. Chin, M. Arozullah, Image compression with a hierarchical neural network, IEEE Trans. Aerospace Electronic Systems 32 (1) (January 1996) 326–337.

[49] R.P. Nikhil, C.J. Bezdek, E.C.K. Tsao, Generalized clustering networks and Kohonen's self-organizing scheme, IEEE Trans. Neural Networks 4 (4) (1993) 549–557.

[50] E. Oja, A simplified neuron model as a principal component analyser, J. Math. Biol. 15 (1982) 267–273.


[51] E. Oja, Data compression, feature extraction, and autoassociation in feedforward neural networks, in: Artificial Neural Networks, Elsevier, Amsterdam, 1991.

[52] I. Pitas et al., Robust and adaptive training algorithms in self-organising neural networks, Electronic Imaging SPIE 5 (2) (1997).

[53] X.N. Ran, N. Farvardin, A perceptually motivated 3-component image model 2. Applications to image compression, IEEE Trans. Image Process. 4 (4) (1995) 430–447.

[54] S.A. Rizvi, N.M. Nasrabadi, Residual vector quantization using a multilayer competitive neural network, IEEE J. Selected Areas Commun. 12 (9) (December 1994) 1452–1459.

[55] S.A. Rizvi, N.M. Nasrabadi, Finite-state residual vector quantization using a tree structured competitive neural network, IEEE Trans. Circuits Systems Video Technol. 7 (2) (1997) 377–390.

[56] S.G. Romaniuk, Theoretical results for applying neural networks to lossless image compression, Network-Comput. Neural Systems 5 (4) (1994) 583–597, ISSN 0954-898X.

[57] L.E. Russo, E.C. Real, Image compression using an outer product neural network, in: Proc. ICASSP, Vol. 2, San Francisco, CA, 1992, pp. 377–380.

[58] T.D. Sanger, Optimal unsupervised learning in a single-layer linear feedforward neural network, Neural Networks 2 (1989) 459–473.

[59] J. Schmidhuber, S. Heil, Sequential neural text compression, IEEE Trans. Neural Networks 7 (1) (January 1996) 142–146.

[60] J. Shanbehzadeh, P.O. Ogunbona, Index-compressed vector quantization based on index mapping, IEE Proc.-Vision Image Signal Process. 144 (1) (February 1997) 31–38.

[61] J.M. Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. Signal Process. 41 (12) (1993) 3445–3462.

[62] W. Skarbek, A. Cichocki, Robust image association by recurrent neural subnetworks, Neural Process. Lett. 3 (3) (1996) 131–138, ISSN 1370-4621.

[63] H.H. Song, S.W. Lee, LVQ combined with simulated annealing for optimal design of large-set reference models, Neural Networks 9 (2) (1996) 329–336.

[64] J. Stark, Iterated function systems as neural networks, Neural Networks 4 (5) (1992) 679–690.

[65] A. Stoffels et al., On object-oriented video coding using the CNN universal machine, IEEE Trans. Circuits Systems I - Fund. Theory Appl. 43 (11) (1996) 948–952.

[66] H. Szu, B. Telfer, J. Garcia, Wavelet transforms and neural networks for compression and recognition, Neural Networks 9 (4) (1996) 695–798.

[67] H. Szu, B. Telfer, S. Kadambe, Neural network adaptive wavelets for signal representation and classification, Opt. Eng. 31 (September 1992) 1907–1916.

[68] C.P. Veronin, Priddy et al., Optical image segmentation using neural based wavelet filtering techniques, Opt. Eng. 31 (February 1992) 287–294.

[69] J.D. Villasenor, B. Belzer, J. Liao, Wavelet filter evaluation for image compression, IEEE Trans. Image Process. 4 (8) (August 1995) 1053–1060.

[70] N.P. Walker et al., Image compression using neural networks, GEC J. Res. 11 (2) (1994) 66–75, ISSN 0264-9187.

[71] M.J. Weinberger, J.J. Rissanen, R.B. Arps, Applications of universal context modelling to lossless compression of grey-scale images, IEEE Trans. Image Process. 5 (4) (1996) 575–586.

[72] L. Xu, A. Yuille, Robust principal component analysis by self-organising rules based on statistical physics approach, Tech. Report 92-93, Harvard Robotics Lab, February 1992.

[73] R.B. Yates et al., An array processor for general-purpose digital image compression, IEEE J. Solid-State Circuits 30 (3) (1995) 244–250.

[74] Q. Zhang, A. Benveniste, Approximation by non-linear wavelet networks, in: Proc. IEEE Internat. Conf. ASSP, Vol. 5, May 1991, pp. 3417–3420.

[75] L. Zhang et al., Generating and coding of fractal graphs by neural network and mathematical morphology methods, IEEE Trans. Neural Networks 7 (2) (1996) 400–407.
