26. two-dimensional orthogonal dct expansion in triangular and trapezoid regions

TWO-DIMENSIONAL ORTHOGONAL DCT EXPANSION IN TRIANGULAR AND TRAPEZOID REGIONS

ABSTRACT

It is known that the 2-D DCT basis is complete and orthogonal in a rectangular region. In this paper, we introduce a way to generate a complete and orthogonal 2-D DCT basis in a trapezoid region or a triangular region without using the complicated Gram-Schmidt method. Moreover, since a polygon can be decomposed into several triangular regions, the proposed method is also suitable for polygonal regions. Our algorithm substantially generalizes the JPEG algorithm: instead of dividing an image into 8 by 8 blocks, we can divide an image into trapezoid or triangular regions and then transform and code each of them. In addition to the DCT basis, our method can also be used to generate the 2-D complete and orthogonal DFT basis, KLT basis, Legendre basis, Hadamard (Walsh) basis, and polynomial basis in the trapezoid and triangular regions.

[...] Karhunen-Loeve transform (KLT) and has higher ability for decorrelation; after performing the DCT, most of the energy is concentrated in the low-frequency region, which is very helpful for compression.

Although the DCT in (1) is popular in image compression, it has a limitation: it is orthogonal only in an M×N rectangular region. For a region with another shape, it may not be orthogonal. For such regions, we can use the Gram-Schmidt algorithm to convert the DCT basis into an orthogonal basis, but this is very time-consuming and round-off error may accumulate during the computation.
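The loss of orthogonality outside a rectangle can be checked numerically. The sketch below assumes the orthonormal DCT-II form for the 1-D basis, since equation (1) is not reproduced in this text; the function names and the lower-triangular test region are hypothetical choices made for illustration only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1-D DCT-II matrix (assumed form); row k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def gram_on_region(m_rows, n_cols, mask):
    """Gram matrix of the 2-D DCT basis images restricted to the pixels in `mask`."""
    cm, cn = dct_matrix(m_rows), dct_matrix(n_cols)
    # basis image (p, q) is the outer product of row p of cm and row q of cn
    basis = np.einsum("pm,qn->pqmn", cm, cn).reshape(m_rows * n_cols, m_rows, n_cols)
    restricted = basis[:, mask]          # keep only the pixels inside the region
    return restricted @ restricted.T

M, N = 4, 4
full = np.ones((M, N), dtype=bool)
tri = np.tril(np.ones((M, N), dtype=bool))   # hypothetical triangular region

# Largest deviation of the Gram matrix from the identity in each region
err_full = np.abs(gram_on_region(M, N, full) - np.eye(M * N)).max()
err_tri = np.abs(gram_on_region(M, N, tri) - np.eye(M * N)).max()
print(err_full, err_tri)
```

On the full rectangle the deviation is at round-off level, while on the triangular mask it is clearly nonzero, which is the situation the Gram-Schmidt step is meant to repair.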

In this paper, we find that, with some modification, the DCT basis can also be complete and orthogonal in a triangular region, a trapezoidal region, or their twisted forms.

Furthermore, since a polygon can be viewed as a combination of triangles, the proposed method can also be applied to a polygonal region. We can first divide an n-sided polygon into n-2 triangular regions (instead of 8×8 blocks) and then perform the DCT expansion for each triangular region.

Therefore, with the proposed method, we can perform DCT expansion for an arbitrary polygonal region. It makes the JPEG algorithm much more flexible.

Moreover, in addition to the DCT basis, the proposed method can also be applied to other discrete orthogonal bases with even and odd symmetries, such as the KLT basis, the DFT basis, the Hadamard (Walsh) basis, the discrete Legendre basis, and other discrete orthogonal polynomial bases. With the proposed method, we can convert them into a complete and orthogonal vector set in the trapezoid and triangular regions.

2. COMPLETE AND ORTHOGONAL DCT BASIS IN THE TRAPEZOID REGION

Here, we define the trapezoid as a region that has M rows (or columns) such that, if the number of pixels in the mth row (m = 0, 1, …, M-1) is denoted by K(m), then

K(m) + K(M-1-m) is a constant. (4)
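Condition (4) is easy to test on a list of row lengths. The following is a minimal sketch; the function name `is_trapezoid` and the example row lengths are hypothetical. It also checks that, when the constant in (4) is N, the region holds MN/2 pixels, which follows from summing (4) over all rows.

```python
def is_trapezoid(k):
    """Return True if K(m) + K(M-1-m) is the same for every row m, as in (4)."""
    m_rows = len(k)
    sums = {k[m] + k[m_rows - 1 - m] for m in range(m_rows)}
    return len(sums) == 1

k_example = [1, 3, 5, 7]            # hypothetical row lengths K(0), ..., K(3)
other = [2, 3, 6, 8]                # hypothetical row lengths that violate (4)

ok_example = is_trapezoid(k_example)
ok_other = is_trapezoid(other)

# With N = K(m) + K(M-1-m), the number of pixels is M * N / 2
n_const = k_example[0] + k_example[-1]
pixel_count = sum(k_example)
print(ok_example, ok_other, pixel_count, len(k_example) * n_const // 2)
```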

Fig. 1: A “trapezoid” region that satisfies (4) and whose rows all start at the same column. Black dots denote the pixels in the trapezoid region.

. (17)

Case 2: M is odd and N is even: There are (M+1)/2 even p and (M-1)/2 odd p. Thus, the number of (p, q) that satisfy (14) is:

. (18)

Case 3: M is even and N is odd:

. (19)

Note that it is impossible for both M and N to be odd. In that case, from (5), if m = (M-1)/2, then 2K((M-1)/2) = N and K((M-1)/2) = N/2, which is not an integer.

[Figure labels: rows m = 0, 1, 2, ..., M-2, M-1 (0th row, 1st row, ..., (M-2)th row, (M-1)th row); columns n = 0, 1, 2, ..., N-1. (a) Region A and Region B. (b) Region A and Region B, with a rotation by 180°, inside the rectangular region.]

Therefore, in all the cases, we can obtain MN/2 DCT bases from Theorem 2, which is equal to the number of points in the trapezoid region A. Thus, the DCT bases obtained from Theorem 2 form a complete and orthonormal set in the trapezoid region A. #
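The count MN/2 can be checked by brute force. Condition (14) is not reproduced in this text, so the sketch below assumes that it selects the index pairs (p, q) with p + q even; this is only an inference from the basis labels in Fig. 3 (C0,0, C2,0, C1,1, C3,1, and so on), and the function name is hypothetical.

```python
def count_even_parity_pairs(m_rows, n_cols):
    """Count pairs (p, q), 0 <= p < M, 0 <= q < N, with p + q even (assumed rule)."""
    return sum(1 for p in range(m_rows) for q in range(n_cols) if (p + q) % 2 == 0)

# One (M, N) pair for each of the three cases: both even, M odd, N odd
cases = [(4, 8), (5, 8), (4, 7)]
counts = {mn: count_even_parity_pairs(*mn) for mn in cases}
for (m_rows, n_cols), count in counts.items():
    print(m_rows, n_cols, count, m_rows * n_cols / 2)
```

Under this assumption the count equals MN/2 whenever at least one of M and N is even, which agrees with the three cases treated in the proof.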

In Fig. 3, we give an example. Fig. 3(a) is a trapezoid region. We use (14)-(16) to derive its complete and orthonormal DCT set (consisting of 16 bases), and the results are shown in Fig. 3(b).

Fig. 3: The complete and orthonormal 2-D DCT basis in a trapezoid region.

3. EXTENDING TO GENERALIZED TRAPEZOID, TRIANGULAR, AND POLYGONAL REGIONS

We have derived the complete and orthonormal DCT basis for the trapezoid region whose first pixels in each row are aligned at the same column, as in Fig. 1. In fact, our results can also be applied to other types of regions.

First, our results can be applied to any trapezoid region that satisfies (4), even if the first pixels in each row are not aligned at the same column. For a region as in Fig. 4(a), we can first shear it into the form in Fig. 4(b), then use the method in Section 2 to find the complete orthogonal DCT bases, and then shear the bases back.
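The shear and its inverse amount to shifting each row by its own offset. The sketch below is one possible way to do this with NumPy; the function names `shear_left` and `shear_back`, the zero fill outside the region, and the example mask are assumptions made for illustration, and each row is assumed to hold a single contiguous run of region pixels.

```python
import numpy as np

def shear_left(image, mask):
    """Shift every row so that its first region pixel lands in column 0.
    Assumes each row of `mask` holds one contiguous run of True values."""
    offsets = np.array([int(np.argmax(mask[r])) for r in range(mask.shape[0])])
    out_img = np.zeros_like(image)
    out_mask = np.zeros_like(mask)
    for r in range(mask.shape[0]):
        k = int(mask[r].sum())                    # K(r): pixels in this row
        out_img[r, :k] = image[r, offsets[r]:offsets[r] + k]
        out_mask[r, :k] = True
    return out_img, out_mask, offsets

def shear_back(image, mask, offsets):
    """Inverse of shear_left: move each row back to its original start column."""
    out_img = np.zeros_like(image)
    out_mask = np.zeros_like(mask)
    for r in range(mask.shape[0]):
        k = int(mask[r].sum())
        out_img[r, offsets[r]:offsets[r] + k] = image[r, :k]
        out_mask[r, offsets[r]:offsets[r] + k] = True
    return out_img, out_mask

# Hypothetical region with K = [1, 3, 5, 7] whose rows start at different columns
mask = np.zeros((4, 7), dtype=bool)
for r, (start, k) in enumerate([(3, 1), (2, 3), (1, 5), (0, 7)]):
    mask[r, start:start + k] = True
image = np.where(mask, np.arange(28).reshape(4, 7), 0)

sheared, sheared_mask, offsets = shear_left(image, mask)
restored, restored_mask = shear_back(sheared, sheared_mask, offsets)
print(offsets, np.array_equal(restored, image))
```

The same `shear_back` call can be applied to each basis image computed on the aligned region, which is how the bases would be moved back to the original shape.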

Furthermore, our method can also be applied to trapezoid regions that are rotated forms of Fig. 1 or Fig. 4(a).

Moreover, since the triangular region can be viewed as a special case of the trapezoid region in which the number of pixels in the first (or the last) row is 1 (i.e., in (5), K(0) = 1 or K(M-1) = 1), as in Fig. 5, the method in Theorem 2 can also be used for the triangular region.

Furthermore, since an n-sided polygonal region can be viewed as a combination of n-2 triangular regions, we can also use our method to perform the DCT expansion for a polygonal region.
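The paper does not state how the polygon is split, so the sketch below uses a fan triangulation, which is valid only for convex polygons and is an assumption made here for illustration. It produces n-2 triangles and checks that their areas add up to the polygon area; the function names and the example hexagon are hypothetical, and rasterizing each triangle into rows of pixels is not covered.

```python
def fan_triangulate(vertices):
    """Split a convex polygon (vertices in order) into n - 2 triangles sharing vertices[0]."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    return [(vertices[0], vertices[i], vertices[i + 1])
            for i in range(1, len(vertices) - 1)]

def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for i, (x0, y0) in enumerate(vertices):
        x1, y1 = vertices[(i + 1) % len(vertices)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

hexagon = [(0, 2), (2, 0), (5, 0), (7, 2), (5, 4), (2, 4)]   # hypothetical convex polygon
triangles = fan_triangulate(hexagon)
triangle_area_sum = sum(polygon_area(list(t)) for t in triangles)
print(len(triangles), triangle_area_sum, polygon_area(hexagon))
```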

Fig. 4: Shearing a region that satisfies (5) into the trapezoid region whose first pixels in each row are aligned at the same column.

Fig. 5: A triangular region can be viewed as a special case of the trapezoid region where K(0) = 1 or K(M-1) = 1 in (4).

However, in real image compression it is hard to find a trapezoid that matches an arbitrary shape exactly. Instead of looking for a perfectly matched trapezoid, we find the approximate trapezoid with the largest area that is contained inside the arbitrary shape.

To obtain a higher compression ratio, we look for a trapezoid that is contained inside the shape, so that most of the pixels in the trapezoid region have similar characteristics (grey-level values). In other words, the energy in this trapezoid region is mostly concentrated in the low-frequency region, which is helpful for image compression. Fig. 6 shows an example of finding an approximate trapezoid in an arbitrary region. Fig. 6(a) is an arbitrary shape and Fig. 6(b) is one way to find the approximate trapezoid region. We can see that the trapezoid cannot exactly match the shape in Fig. 6(a). Therefore, we may find more trapezoids of smaller size in the rest of the region to cover the entire shape. In Section 5, we will show how to deal with this problem. In fact, a small number of missing points is tolerable; they can be easily recovered by pixel interpolation in post-processing.

[Fig. 3 panels: (a) the trapezoid region; (b) sixteen basis images labeled C0,0, C2,0, C1,1, C3,1; C0,2, C2,2, C1,3, C3,3; C0,4, C2,4, C1,5, C3,5; C0,6, C2,6, C1,7, C3,7, each with horizontal axis ticks 2 to 10 and vertical axis ticks 2 to 4.]

[Fig. 4 panels: (a) and (b), related by shearing. Row labels: 0th row, 1st row, ..., (M-1)th row.]

Fig. 6: Finding (b) an approximate trapezoid region in (a) an arbitrary shape.

4. EXTENDING TO OTHER SYMMETRIC ORTHOGONAL BASIS

In Sections 2 and 3, we discussed how to derive the complete and orthogonal DCT basis in a triangular or a trapezoid region. In fact, our method is also suitable for other types of bases. Since Theorem 2 was derived based on (8), if a basis set is complete and orthogonal in a rectangular region and has the even/odd symmetric relation as in (8), we can also use Theorem 2 to convert it into a complete and orthogonal basis set in the triangular and trapezoid regions.

For example, in digital signal processing [5], the basis sets of the 2-D discrete Fourier transform (DFT), the 2-D discrete Hartley transform, the 2-D number theoretic transform (NTT), the 2-D discrete Legendre transform, the 2-D discrete orthogonal polynomial expansion, and the 2-D Hadamard (Walsh) transform all have the even/odd symmetric relation as in (8). Therefore, we can use Theorem 2 to convert them into complete and orthonormal basis sets in a triangular or a trapezoid region. We give an example of deriving the complete orthogonal Hadamard (Walsh) basis set for the triangular region in Fig. 7(a). Following the method in Fig. 2(b), we first convert it into a 4×4 rectangular region. The 2-D orthogonal Hadamard basis for the 4×4 rectangular region is [6]:
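A 4×4 Hadamard basis of this kind can be sketched as follows. The sketch assumes the Sylvester construction with orthonormal scaling, so the matrix in [6] may order or scale the rows differently. It also takes the even/odd relation of (8), which is not reproduced in this text, to mean that every 1-D basis vector is either symmetric or antisymmetric under index reversal; that reading is an assumption, and the function name is hypothetical.

```python
import numpy as np

def sylvester_hadamard(n):
    """Orthonormal n x n Hadamard matrix (n a power of two), Sylvester construction."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    if h.shape[0] != n:
        raise ValueError("n must be a power of two")
    return h / np.sqrt(n)

h4 = sylvester_hadamard(4)

# 2-D basis image (p, q) is the outer product of rows p and q of h4
basis = np.einsum("pm,qn->pqmn", h4, h4)

# Each 1-D basis vector is symmetric or antisymmetric under index reversal
symmetry_ok = all(
    np.allclose(row[::-1], row) or np.allclose(row[::-1], -row) for row in h4
)

# Orthonormality of the 16 basis images over the full 4 x 4 rectangle
flat = basis.reshape(16, 16)
orthonormal_ok = np.allclose(flat @ flat.T, np.eye(16))
print(symmetry_ok, orthonormal_ok)
```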

5. APPLICATIONS IN IMAGE COMPRESSION AND SIGNAL ANALYSIS

This section is divided into three parts. Section 5.1 discusses the proposed method applied to a trapezoid region. Section 5.2 introduces the new segmentation and compression algorithms. Section 5.3 shows the compression procedure for the entire image.

5.1. Proposed method in a specific trapezoid region

The proposed method provides an efficient way to transform and code a trapezoid or triangular shaped object. In Figs. 9 and 10, we show a simulation.

[Fig. 9 panel labels: trapezoid; door region. Axis ticks: 50 to 150 horizontally, 50 to 200 vertically.]

Fig. 9: (a) A laboratory image. (b) In a 2-D image, the door always has the shape of a trapezoid.

[Fig. 10 plot: horizontal axis j (ticks 0 to 25); vertical axis P[j] (ticks 0.96 to 1); curves: proposed, Gram-Schmidt, MPEG-4.]

Fig. 10: Normalized partial sums P[j] (see (24)), which measure the performance of energy concentration, using (a) the proposed method, (b) the DCT obtained by the Gram-Schmidt method, and (c) the two-directional 1-D DCT in MPEG-4.

Although a door is rectangular, in a 2-D image it always takes the form of a trapezoid, as in Fig. 9(b). We use three methods to transform and code the door region in Fig. 9(b): (a) the proposed method, (b) the DCT basis orthogonalized by the Gram-Schmidt method, and (c) the 1-D DCT applied along the x-axis and the y-axis, as in the method used in MPEG-4 [4]. Their running times are:

(a) proposed: 0.0364 sec
(b) Gram-Schmidt: 1032.87 sec
(c) the 1-D DCT method in MPEG-4: 0.0701 sec. (23)

Then, in Fig. 10, we show the normalized partial sums of the energies of the largest DCT coefficients of the three methods.

From (23), the proposed method is much faster than the Gram-Schmidt method, and its energy concentration is as good as that of the Gram-Schmidt method (see Fig. 10). Moreover, compared with the shape-adaptive DCT method in MPEG-4, since our method performs the DCT with a fixed number of points for each row and column, it has both less computation time and better energy concentration than the 1-D DCT method in MPEG-4.
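The energy-concentration measure plotted in Fig. 10 can be sketched as a cumulative sum of sorted coefficient energies. Equation (24) is not reproduced in this text, so the exact definition, the indexing of j, and the function name below are assumptions: here the first entry is the energy of the largest coefficient divided by the total energy.

```python
import numpy as np

def normalized_partial_sums(coeffs):
    """Cumulative energy of the largest-magnitude coefficients, divided by the
    total energy (assumed form of the measure; entry 0 uses one coefficient)."""
    energy = np.sort(np.abs(np.asarray(coeffs, dtype=float)).ravel() ** 2)[::-1]
    return np.cumsum(energy) / energy.sum()

# Hypothetical transform coefficients of one region
partial = normalized_partial_sums([10.0, -3.0, 1.0, 0.5])
print(partial)
```

A curve that approaches 1 after only a few coefficients indicates that the transform packs the region's energy into few terms, which is the property compared across the three methods.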

5.2. New Segmentation and Compression Algorithms

With the proposed method, the algorithm for image compression can become much more general. In the existing JPEG algorithm, an image is first divided into several 8×8 blocks, as in Fig. 11(a). Now, with the proposed method, we can divide an image into several trapezoid, rectangular, or triangular blocks instead of 8×8 rectangular blocks, as in Fig. 11(b).

Fig. 11: (a) The existing JPEG cuts an image into several 8×8 rectangular blocks. (b) With the proposed method, we can divide an image into rectangular, trapezoid, or triangular blocks.

Compared with the original JPEG algorithm, the method in Fig. 11(b) is more flexible. Since the boundaries between two blocks need not be parallel to the x- and y-axes, we can make them match the edges of the objects. Then, the YCbCr values in a block will be more uniform, which is good for compression.

To make the block exactly match the shape of the object, as is done in MPEG-4, we need extra data to record the edges of the objects, which is not good for compression.

Using the method in Fig. 11(b) avoids this problem. Since the boundary consists of straight lines, to record the shape of a block we only have to record its corners.

Moreover, from Section 4, since Theorem 2 can also be used for deriving the 2-D complete and orthogonal DFT, NTT, and Hadamard bases in a trapezoid or triangular region, the proposed method is also useful for signal analysis, filter design, CDMA, and other signal processing applications.

5.3. Image Compression with proposed method

Section 5.1 shows the compression in a specific trapezoid region. However, for general images we can hardly find a trapezoid that exactly matches the shape of the object. Therefore, finding appropriate trapezoids is very important in our proposed method.

Images are divided into four regions: lower frequency regions, higher frequency regions, border regions, and the corner and boundaries part. The lower frequency regions are trapezoids; they are depicted in Fig. 12(b). We divide this image into eight low frequency parts. The lines in Fig. 12(b) denote the boundaries of the trapezoid regions. The trapezoid DCT is used in the lower frequency regions, and the corner and boundaries part is coded by geometric coding techniques. The arbitrary shape DCT using the Gram-Schmidt method is used in the higher frequency regions and the border regions, because their sizes are small enough that they may not cost too much processing time.

Fig. 12: (a) A fruit image. (b) The lower frequency regions found in the fruit image.

We try to find the largest trapezoid that is contained inside each lower frequency region, so that a higher compression ratio can be obtained. However, when an object is divided into many trapezoid regions, the optimal solution is difficult to find.

There are two problems in the dividing procedure: overlapped trapezoids and missing points. Missing points mean that there are gaps between the trapezoids we found; they can be dealt with by pixel interpolation. The overlapped trapezoids problem arises when we divide the object into larger trapezoids; it can be easily removed by taking the average value or by simply dropping one of the points. Missing points may cause larger errors, so we prefer to process more data (the overlapped trapezoids problem) rather than have missing points between the regions.
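The two repairs can be sketched together: overlapping reconstructions are averaged, and uncovered pixels are filled from covered neighbours. The paper does not specify the interpolation rule, so the 4-neighbour mean used below is an assumption, as are the function names and the small example.

```python
import numpy as np

def merge_regions(shape, pieces):
    """Average overlapping pieces; return the merged image and a coverage mask.
    `pieces` is a list of (mask, values) pairs, each an array of the given shape."""
    total = np.zeros(shape, dtype=float)
    count = np.zeros(shape, dtype=int)
    for mask, values in pieces:
        total[mask] += values[mask]
        count[mask] += 1
    covered = count > 0
    merged = np.zeros(shape, dtype=float)
    merged[covered] = total[covered] / count[covered]
    return merged, covered

def fill_missing(image, covered):
    """Fill uncovered pixels with the mean of their covered 4-neighbours (one pass)."""
    out = image.copy()
    rows, cols = image.shape
    for r, c in zip(*np.nonzero(~covered)):
        neigh = [image[rr, cc]
                 for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                 if 0 <= rr < rows and 0 <= cc < cols and covered[rr, cc]]
        if neigh:
            out[r, c] = sum(neigh) / len(neigh)
    return out

def piece(cols, value, shape=(1, 6)):
    """Hypothetical helper: a reconstructed region covering the given columns."""
    mask = np.zeros(shape, dtype=bool)
    mask[0, cols] = True
    return mask, np.full(shape, float(value))

# Columns 0-2 and 2-3 overlap at column 2; column 4 is a missing point
pieces = [piece(slice(0, 3), 10), piece(slice(2, 4), 20), piece(slice(5, 6), 30)]
merged, covered = merge_regions((1, 6), pieces)
filled = fill_missing(merged, covered)
print(filled)
```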


Fig. 13 is the flowchart of our proposed compression method. An image is divided into four regions, as mentioned before. The trapezoid DCT is applied to the low frequency regions; in other words, the low frequency regions must be divided into trapezoids. The arbitrary shape DCT using Gram-Schmidt orthogonalization is applied to the rest of the regions.

Fig. 13: The flowchart of our proposed image compression method using the DCT in trapezoid regions.

As mentioned, it is hard to find the optimal way to divide the lower frequency region into trapezoids. We propose a method to resolve the problem. For each object in the image, the dividing procedure has two main steps: slice the object into several stripes, then find the inscribed trapezoid in each stripe. The dividing procedure is as follows:

Step 1. Find the corners of the object.

Step 2. According to the corners, slice the object into several stripes at the positions of the corners. If the corners are too close, merge the stripes.

Step 3. Find the inscribed trapezoid in each stripe. The endpoints of the trapezoid are initialized to the endpoints of the upper side and the lower side.

Step 4. By moving the legs inward, obtain the inscribed trapezoid.

Step 5. Record the endpoints of the inscribed trapezoid.

Fig. 14 shows an example of finding inscribed trapezoid regions according to this process. Fig. 14(a) shows how we slice the object into stripes. Note that if the corners are too close, we merge the two stripes. Fig. 14(b) shows the process of finding the inscribed trapezoid: we move the legs of the trapezoid until they are entirely inside the object.

Fig. 14: (a) Slicing the object into several stripes according to the corners. (b) Finding the inscribed trapezoid by moving the legs of the initial trapezoid inward.
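Steps 3 to 5 of the dividing procedure can be sketched for a single stripe given as a boolean array. The paper does not say how far or in which order the legs are moved, so the rule below (shift both endpoints of an offending leg by one pixel per iteration) is an assumption, as are the function names and the example stripe. Corner detection and slicing (Steps 1 and 2) are not covered, and the result is an inscribed trapezoid, not necessarily the largest one.

```python
import numpy as np

def leg_columns(top, bottom, rows):
    """Column of a straight leg in each row, from `top` (row 0) to `bottom` (last row)."""
    return np.rint(np.linspace(top, bottom, rows)).astype(int)

def inscribed_trapezoid(stripe):
    """Shrink an initial trapezoid until both legs lie inside `stripe`.
    Assumes every row of the boolean array `stripe` is one contiguous run.
    Returns (left_top, left_bottom, right_top, right_bottom) or None."""
    rows = stripe.shape[0]
    first = stripe.argmax(axis=1)                                # leftmost True per row
    last = stripe.shape[1] - 1 - stripe[:, ::-1].argmax(axis=1)  # rightmost True per row
    # Step 3: start from the endpoints of the upper and lower sides
    lt, lb, rt, rb = int(first[0]), int(first[-1]), int(last[0]), int(last[-1])
    while lt <= rt and lb <= rb:
        left_ok = np.all(leg_columns(lt, lb, rows) >= first)
        right_ok = np.all(leg_columns(rt, rb, rows) <= last)
        if left_ok and right_ok:
            return lt, lb, rt, rb        # Step 5: record the endpoints
        if not left_ok:
            lt, lb = lt + 1, lb + 1      # Step 4: move the left leg inward
        if not right_ok:
            rt, rb = rt - 1, rb - 1      # Step 4: move the right leg inward
    return None                          # no trapezoid fits in this stripe

# Hypothetical stripe: (first, last) region column of each row
runs = [(2, 6), (1, 7), (3, 7), (0, 8), (0, 8)]
stripe = np.zeros((5, 9), dtype=bool)
for r, (a, b) in enumerate(runs):
    stripe[r, a:b + 1] = True

result = inscribed_trapezoid(stripe)
print(result)
```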

Fig. 15: (a) The reconstructed fruit image using the JPEG compression standard (692 bytes). (b) The reconstructed fruit image using our proposed method (165 bytes).

Fig. 15 shows the reconstructed fruit image using the JPEG compression standard and using our proposed method. We can see some black points inside the apple in Fig. 15(b). They are caused by the missing point problem, which has not yet been fixed. The distortions are mainly in the high frequency region and the border region, but they are tolerable for human vision because human eyes are more sensitive to lower frequency distortion.

[Fig. 13 flowchart blocks: Input image; Lower frequency region, DCT in trapezoid regions, Coding; Other region, ASDCT using GSO process, Coding.]

[Fig. 14 panel labels: (a) Corner too close; (b) Finding inscribed trapezoid.]

In Fig. 15, our proposed method uses 165 bytes with an RMSE of 4.7286, whereas the JPEG standard uses 692 bytes with an RMSE of 2.1198. The data amount of our proposed method is about one fourth of that of the JPEG standard, while the reconstructed images look similar.
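The two figures of merit used in this comparison can be computed as in the sketch below. The byte counts are the ones reported for Fig. 15; the `rmse` helper and the small test arrays are illustrative assumptions and do not reproduce the paper's images.

```python
import numpy as np

def rmse(original, reconstructed):
    """Root-mean-square error between two images of equal shape."""
    diff = np.asarray(original, dtype=float) - np.asarray(reconstructed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

proposed_bytes, jpeg_bytes = 165, 692        # sizes reported for Fig. 15
ratio = proposed_bytes / jpeg_bytes
print(round(ratio, 3))                       # 0.238, i.e. about one fourth

# Hypothetical 2 x 2 example: squared errors 9 and 16 over four pixels
example_rmse = rmse([[0, 0], [0, 0]], [[3, 4], [0, 0]])
print(example_rmse)                          # 2.5
```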

If we use a smaller quantization step, our method attains an RMSE smaller than that of the JPEG standard; this costs more bytes, but still only about two thirds of the data amount of the JPEG compression standard. Furthermore, if we compress the image with the JPEG compression standard using the same amount of data as our proposed method, a severe block effect appears. So our proposed method can also avoid the block effect.

Fig. 16: The reconstructed fruit image using the JPEG compression standard (233 bytes), RMSE = 4.2173.

Fig. 16 is an example of the fruit image compressed with the JPEG compression standard, with a data amount of 233 bytes and an RMSE of 4.2173. It is obvious that the block effect becomes severe when fewer bytes are used to encode the image. In comparison, our proposed method in Fig. 15(b) costs only 165 bytes and has no block effect.

Moreover, the processing time is much less than that of the arbitrary shape DCT using the Gram-Schmidt method. Our proposed method takes only 4.930688 seconds, while the Gram-Schmidt method needs much more processing time.

In summary, compared to the conventional JPEG compression standard, our proposed method has the following advantages:

(a) It requires less data.
(b) It avoids the block effect.
(c) It compresses the image according to its characteristics.

Compared to the arbitrary shape DCT using the Gram-Schmidt method, our proposed method has the following advantages:

(a) It greatly reduces the computation time.
(b) Its energy concentration is as good as that of the Gram-Schmidt method.

7. CONCLUSION

In this paper, we describe ways to generate the complete and orthonormal DCT basis in a trapezoid or a triangular region efficiently without using the Gram-Schmidt method. With the proposed method, the JPEG compression algorithm can become much more general, and we can divide an image into trapezoid or triangular blocks instead of 8×8 blocks. Moreover, our method can also be applied to the DFT basis, the Hadamard (Walsh) basis, or any other bases with even and odd symmetric relations. With the new segmentation and compression algorithm we proposed, the block effect problem in the JPEG compression standard can be resolved even when less data is used to compress the image. Moreover, the computation time of our proposed method is much lower than that of the arbitrary shape DCT using the Gram-Schmidt method.
