Fast Class Rendering Using Multiresolution Classification in
Discrete Cosine Transform Domain
Presented by Li-Jen Kao
July, 2005
Outline
Introduction
Feature Extraction
Classification Scheme
Experimental Results
Conclusion
1. Introduction
Classification of objects (or patterns) into a number of predefined classes has been extensively studied in a wide variety of applications, such as optical character recognition (OCR), speech recognition, and face recognition.
The design of a classification system can be considered in terms of two subproblems: feature extraction and classification.
Feature extraction:
Features are functions of the measurements performed on a class of objects.
No general solution to feature extraction has been found for most applications.
Our purpose is to design a general classification scheme that is less dependent on domain-specific knowledge.
Reliable and general features are required.
Discrete Cosine Transform (DCT)
The DCT helps separate an image into parts of differing importance with respect to the image's visual quality.
Due to the energy-compacting property of the DCT, much of the signal energy tends to lie at low frequencies.
Four advantages in applying DCT
1) The features extracted by DCT are general and reliable. They can be applied to most vision-oriented applications.
2) The amount of data to be stored can be reduced tremendously.
3) Multiresolution classification and progressive matching arise naturally.
4) The DCT is scale-invariant and less sensitive to noise and distortion.
Two philosophies of classification
Statistical: the measurements that describe an object are treated only formally as statistical variables, neglecting their “meaning”.
Structural: objects are regarded as compositions of structural units, usually called primitives.
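2. Feature Extraction via DCT
The DCT coefficients C(u, v) of an N×N image represented by x(i, j) can be defined as
$$C(u,v) = \frac{2}{N}\,\alpha(u)\,\alpha(v) \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} x(i,j)\,\cos\!\left(\frac{(2i+1)u\pi}{2N}\right)\cos\!\left(\frac{(2j+1)v\pi}{2N}\right),$$
where
$$\alpha(w) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } w = 0, \\ 1 & \text{otherwise.} \end{cases}$$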
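The definition can be sketched directly in code. The following is a minimal, unoptimized illustration that assumes the standard 2-D DCT-II with the 2/N normalization and a 1/√2 weight on zero-frequency terms; the function name `dct2` and the list-of-lists image format are choices made here for illustration, not part of the slides.

```python
import math


def dct2(x):
    """2-D DCT of a square N x N image given as a list of lists (illustrative)."""
    n = len(x)

    def alpha(w):
        # Weight 1/sqrt(2) for the zero-frequency index, 1 otherwise.
        return 1.0 / math.sqrt(2.0) if w == 0 else 1.0

    # cos_table[u][i] = cos((2i + 1) * u * pi / (2N))
    cos_table = [
        [math.cos((2 * i + 1) * u * math.pi / (2 * n)) for i in range(n)]
        for u in range(n)
    ]

    c = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for i in range(n):
                for j in range(n):
                    s += x[i][j] * cos_table[u][i] * cos_table[v][j]
            c[u][v] = (2.0 / n) * alpha(u) * alpha(v) * s
    return c
```

With this normalization the transform is orthonormal, so a constant image puts all of its energy into C(0, 0). The direct four-loop form costs O(N^4) and is meant only to mirror the formula; a practical system would likely use a fast DCT routine.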
Figure 1. The DCT coefficients of the character image “為”.
Figure 2. Illustration of the multiresolution ability of DCT.
(a) The original image of size 48×48; (b) the reconstructed image of size 8×8; (c) the reconstructed image of size 16×16; (d) the reconstructed image of size 32×32.
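One plausible way to produce reduced-size reconstructions like those in Figure 2 is to keep only the n×n low-frequency block of the DCT coefficients and apply an n×n inverse DCT. The slides do not state how the figure was generated, so the sketch below is an assumption; the names `idct2`, `low_resolution_image`, and `energy_fraction`, and the n/N rescaling, are introduced here for illustration.

```python
import math


def idct2(c):
    """Inverse 2-D DCT for the orthonormal 2/N normalization (illustrative)."""
    n = len(c)

    def alpha(w):
        return 1.0 / math.sqrt(2.0) if w == 0 else 1.0

    # cos_table[u][i] = cos((2i + 1) * u * pi / (2n))
    cos_table = [
        [math.cos((2 * i + 1) * u * math.pi / (2 * n)) for i in range(n)]
        for u in range(n)
    ]
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    s += alpha(u) * alpha(v) * c[u][v] * cos_table[u][i] * cos_table[v][j]
            x[i][j] = (2.0 / n) * s
    return x


def low_resolution_image(c, n):
    """Reconstruct an n x n image from the n x n low-frequency block of c."""
    scale = n / len(c)  # assumed rescaling that keeps the mean intensity unchanged
    block = [[c[u][v] * scale for v in range(n)] for u in range(n)]
    return idct2(block)


def energy_fraction(c, n):
    """Share of the total signal energy held by the n x n low-frequency block."""
    total = sum(val * val for row in c for val in row)
    low = sum(c[u][v] ** 2 for u in range(n) for v in range(n))
    return low / total if total > 0 else 1.0
```

Under this assumption, a larger block retains more of the signal energy and yields a finer reconstruction, which is the property the multiresolution scheme relies on.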
3. The Proposed Classification Scheme
The ultimate goal of classification is to assign an unknown pattern x to one of M possible classes (c1, c2, …, cM).
Each pattern is represented by a set of D features, viewed as a D-dimensional feature vector.
3.1. Our classification model
In the training mode: the feature extraction module finds the appropriate features for representing the input patterns, and the classifier is trained.
In the classification mode: the trained classifier assigns the input pattern to one of the pattern classes based on the measured features.
To alleviate the burden of the classification process, the process is usually divided into two stages:
Coarse classification
Fine classification
Figure 3. Model for multiresolution classification
3.2. Coarse classification module
In the training mode:
The features of each training sample are first extracted by DCT and quantized.
Then the D most significant quantized DCT features of each training sample are transformed to a code, called the grid code (GC), which corresponds to a grid cell of the feature space partitioned by the quantization method.
The training samples with the same GC are similar and can be classified into a coarse class.
Therefore, the information about all possible GCs is gathered in the training mode.
In the classification mode:
The classes with the same GC as that of the test sample are chosen as the candidates for the test sample.
3.2.1. Quantization
The 2-D DCT coefficient F(u,v) is quantized to F’(u,v) according to the following equation:
$$F'(u,v) = \left[\frac{F(u,v)}{Q}\right]$$
Here Q is the quantization step and [·] denotes rounding to an integer.
Most of the high frequency coefficients will be quantized to zero and only the most significant coefficients will be retained.
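The quantization step can be sketched as follows. The rounding operator did not survive in the extracted equation, so rounding to the nearest integer is an assumption here, as are the function name `quantize` and the use of a single scalar step `q`.

```python
import math


def quantize(f, q):
    """Divide every DCT coefficient by the step q and round to the nearest integer.

    Rounding to nearest is an assumption; the slides only imply that the
    result is an integer.
    """
    return [[int(math.floor(val / q + 0.5)) for val in row] for row in f]
```

For example, with q = 10 a coefficient of 100.0 becomes 10, while small coefficients such as 3.0 or 0.4 collapse to 0, which is how most high-frequency terms drop out.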
3.2.2. Grid Code Transformation
After the quantization process, the D most significant quantized DCT features of sample Oi are obtained, say [qi1, qi2, …, qiD].
The significance of each DCT coefficient is decided according to the following zigzag order: F(0,0), F(0,1), F(1,0), F(2,0), F(1,1), F(0,2), F(0,3), F(1,2), F(2,1), F(3,0), F(3,1), …, and so on.
Because the value of qij may be negative, for ease of operation we transform qij to a positive integer dij by adding a number, say kj, to qij.
In this way, object Oi can be transformed to a D-digit GC.
This process is called the grid code transformation (GCT).
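The grid code transformation can be sketched as below. The zigzag routine reproduces the listed order when applied to a 4×4 block, which is an inference from the sequence given, not something the slides state. The names `zigzag_order` and `grid_code`, the tuple representation of the GC, and the `offsets` argument (the values kj) are illustrative assumptions.

```python
def zigzag_order(n):
    """Zigzag traversal of an n x n block as a list of (u, v) index pairs."""
    order = []
    for s in range(2 * n - 1):  # s = u + v indexes one anti-diagonal
        us = [u for u in range(n) if 0 <= s - u < n]
        if s % 2 == 0:
            us.reverse()  # even diagonals run from large u to small u
        order.extend((u, s - u) for u in us)
    return order


def grid_code(fq, d, offsets):
    """Build a d-digit grid code from quantized coefficients fq.

    offsets holds the hypothetical shifts k_j; each must be large enough
    that no shifted digit is negative.
    """
    code = []
    for (u, v), k in zip(zigzag_order(len(fq))[:d], offsets):
        digit = fq[u][v] + k
        if digit < 0:
            raise ValueError("offset too small for coefficient (%d, %d)" % (u, v))
        code.append(digit)
    return tuple(code)
```

Representing the GC as a tuple keeps the digits separate, so codes compare and sort lexicographically without needing a fixed digit width.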
3.2.3. Grid Code Sorting and Elimination
After the GCT, we obtain a list of triplets (Ti, Ci, GCi), where
Ti is the ID of a training sample,
Ci is the ID of the class the training sample belongs to, and
GCi is the grid code of the training sample.
Then the list is sorted by GC in ascending order.
Given the GC of a test sample, we can get a list of candidate classes of the same GC for the test sample.
Elimination of Redundancy
Redundancy occurs when training samples belonging to the same class have the same GC.
This redundancy can be eliminated by establishing an abstract lookup table that only contains the information about the GCs and their corresponding classes.
Then, given a GC, this table can tell the relevant classes very quickly by binary search.
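A minimal sketch of the lookup table and its binary search is given below. The table layout (two parallel sorted lists) and the names `build_lookup_table` and `candidate_classes` are assumptions made for illustration; the slides do not describe the data structure in detail.

```python
from bisect import bisect_left


def build_lookup_table(triplets):
    """Collapse (sample_id, class_id, grid_code) triplets into a sorted table.

    Each distinct grid code appears once, paired with the sorted list of
    classes that produced it, so duplicate (class, GC) pairs are removed.
    """
    classes_by_gc = {}
    for _sample_id, class_id, gc in triplets:
        classes_by_gc.setdefault(gc, set()).add(class_id)
    keys = sorted(classes_by_gc)
    values = [sorted(classes_by_gc[gc]) for gc in keys]
    return keys, values


def candidate_classes(table, gc):
    """Return the classes stored for grid code gc, found by binary search."""
    keys, values = table
    pos = bisect_left(keys, gc)
    if pos < len(keys) and keys[pos] == gc:
        return values[pos]
    return []  # no training sample fell into this grid cell
```

A lookup costs O(log G) comparisons for G distinct grid codes, independent of the number of training samples.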
3.3. The fine classification module
Progressive matching method
Adding more DCT coefficients usually implies increasing the resolution level of an image.
If the current resolution is not high enough to distinguish one character from the others, we have to raise the resolution level so that the discrimination power is also improved.
The establishment of the templates for each class
Templates are established in the DCT domain. For each class, the average DCT coefficients of size N×N are computed over the training samples of that class.
In this way, M sets of average DCT coefficients are obtained and serve as the templates, one for each class.
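Template construction amounts to a per-class average in the DCT domain. The sketch below assumes the DCT coefficients of each training sample are already available as N×N lists of lists; the name `build_templates` and the dictionary output are illustrative choices.

```python
def build_templates(samples):
    """Average the DCT coefficients of the training samples of each class.

    samples is an iterable of (class_id, coefficients) pairs, where
    coefficients is an N x N list of lists. Returns {class_id: template}.
    """
    sums = {}
    counts = {}
    for class_id, c in samples:
        n = len(c)
        if class_id not in sums:
            sums[class_id] = [[0.0] * n for _ in range(n)]
            counts[class_id] = 0
        acc = sums[class_id]
        for u in range(n):
            for v in range(n):
                acc[u][v] += c[u][v]
        counts[class_id] += 1
    return {
        cid: [[val / counts[cid] for val in row] for row in acc]
        for cid, acc in sums.items()
    }
```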
The sum of squared differences (SSD) is used as the matching criterion.
The matching of x and Ti is decomposed into K iterations, each of which corresponds to matching within a block of size n_k×n_k.
After the kth iteration, the block size is enlarged from n_k×n_k to n_{k+1}×n_{k+1} (n_{k+1} = n_k + d).
The process is repeated until one of the stopping criteria is satisfied. The criteria aim:
1) to preserve enough signal energy in the block, and 2) to reject unqualified classes as soon as possible.
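The progressive matching loop might look roughly like the sketch below. The slides do not give concrete thresholds for the stopping criteria, so the rejection rule used here (drop any class whose SSD exceeds `reject_ratio` times the current best) and the parameters `n0`, `d`, and `reject_ratio` are hypothetical stand-ins, not the authors' rule.

```python
def block_ssd(a, b, n):
    """Sum of squared differences over the n x n low-frequency block."""
    return sum((a[u][v] - b[u][v]) ** 2 for u in range(n) for v in range(n))


def progressive_match(x, templates, candidates, n0=4, d=4, reject_ratio=2.0):
    """Match DCT coefficients x against class templates at growing block sizes.

    templates maps class_id to an N x N coefficient array; candidates is the
    list of class ids kept by the coarse stage. Returns the best class id,
    or None when there are no candidates.
    """
    alive = list(candidates)
    if not alive:
        return None
    full = len(x)
    n = min(n0, full)
    while True:
        scores = {cid: block_ssd(x, templates[cid], n) for cid in alive}
        best = min(scores.values())
        # Hypothetical rejection rule: keep classes close to the best score.
        alive = [cid for cid in alive if scores[cid] <= reject_ratio * best + 1e-12]
        if len(alive) == 1 or n >= full:
            break
        n = min(n + d, full)  # raise the resolution level
    return min(alive, key=lambda cid: scores[cid])
```

Because clearly mismatched classes are dropped at small block sizes, most SSD evaluations touch only a few coefficients; only near ties are compared at full resolution.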
4. Experimental Results
18,600 samples (about 640 categories) were extracted from the Kin-Guan (金剛) bible.
Each character image was transformed into a 48×48 bitmap.
1,000 of the 18,600 samples were used for testing and the others were used for training.
The D most significant DCT coefficients were quantized and transformed to a GC for each sample.
Figure 4. Reduction and accuracy rates using our coarse classification scheme
Figure 5. Accuracy rate using both coarse and fine classification
5. Conclusions
This paper presents a multiresolution classification scheme based on DCT for vision-based applications.
The DCT features of a pattern can be extracted progressively according to their significance.
On classifying an unknown object, most of the improbable candidate classes for the object can be eliminated at lower resolution levels.
Experiments were conducted for recognizing handwritten characters in Chinese palaeography and showed that our approach performs well in this application domain.
Future Work
Since only a preliminary experiment has been conducted to test our approach, much work remains to improve the system.
For example, since features of different types complement one another in classification performance, using different types of vision-oriented features simultaneously could improve classification accuracy.