kernel method for unimodal test - seoul national...

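Attribution-NonCommercial-NoDerivs 2.0 Korea

Provided that the conditions below are followed, users are free to:

l copy, distribute, transmit, display, perform, and broadcast this work.

Under the following conditions:

Attribution. You must attribute the original author.

Noncommercial. You may not use this work for commercial purposes.

No Derivative Works. You may not alter, transform, or build upon this work.

l For any reuse or distribution, you must make clear the license terms applied to this work.

l These conditions do not apply if you obtain separate permission from the copyright holder.

Users' rights under copyright law are not affected by the above.

This is a human-readable summary of the Legal Code.

Disclaimer

http://creativecommons.org/licenses/by-nc-nd/2.0/kr/legalcode

http://creativecommons.org/licenses/by-nc-nd/2.0/kr/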

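Doctoral Dissertation (Doctor of Science)

Kernel Method

for Unimodal Test

(Korean title: 커널방법을 이용한 단일모드 검정, "Unimodality Test Using Kernel Methods")

August 2015

Graduate School, Seoul National University

Department of Statistics

Seonmi Lee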

Kernel Method

for Unimodal Test

by

Seonmi Lee

A Dissertation

submitted in fulfillment of the requirement

for the degree of

Ph.D of Science

in

Statistics

The Department of Statistics

College of Natural Sciences

Seoul National University

August, 2015

Abstract

Seonmi Lee

Statistics

The Graduate School

Seoul National University

Finding the number of modes is of great interest in density estimation. Well-known nonparametric unimodality tests include the dip test, the excess mass test, and Silverman's test. The dip and excess mass statistics are based on the empirical distribution and the supremum distance, while Silverman's test depends on the bandwidth of a kernel density estimator. A main issue with these tests is conservatism, and calibration methods are often used to address it. We propose kernel methods for testing unimodality, based on the dip and excess mass statistics, to address this issue. We propose to use the total variation distance to identify the unimodal distribution closest to the kernel distribution, and we construct the kernel dip test from the unimodal distribution obtained while calculating the test statistic. Our numerical studies show that the proposed tests outperform the existing tests. We also introduce a kernel excess mass statistic. Under the strong unimodal condition, the limiting distribution of the kernel excess mass statistic is the same as that of the empirical excess mass statistic. However, the numerical studies indicate that the calibrated kernel excess mass test has greater power and better level accuracy than the calibrated empirical excess mass test. We apply the proposed method to astronomy data on the physical properties of minor planets in the solar system.

Keywords: Density estimation, Dip test, Excess mass test, Kernel methods,

Unimodal distribution

Student Number: 2010-30925


Contents

List of Figures

List of Tables

1 Introduction
  1.1 Overview
  1.2 Outline of the thesis

2 Unimodality Test
  2.1 The dip test
  2.2 The excess mass test
  2.3 Silverman's test

3 The Kernel Dip Test
  3.1 The kernel dip with total variation
  3.2 Computing the kernel dip
  3.3 The kernel dip test

4 The Kernel Excess Mass Test
  4.1 The kernel excess mass
  4.2 Computing the kernel excess mass
  4.3 The kernel excess mass test

5 Numerical Study
  5.1 Simulation 1
  5.2 Simulation 2 : Calibration tests
  5.3 Real data analysis

6 Conclusion

Reference

Abstract in Korean

List of Figures

2.1 The empirical dip statistics with sup distance
2.2 Distributions of the empirical dip
2.3 The excess mass $E_2(\lambda)$ and $E_1(\lambda)$
3.1 Sup distances between $F$ and unimodal distributions $U_1$ and $U_2$
3.2 The total variation distance between $f(x)$ and $u(x)$
3.3 Unimodal distributions $u_0(x)$ (dashed line) and $u(x)$ (dash-dot line) close to distribution $f(x)$ (solid line)
3.4 A part of $f(x)$ and nondecreasing function $u(x)$
3.5 Finding $u_0^*(x)$ in $I_0 = [a, x_1]$
3.6 Finding $u_s^*(x)$ in $I_s = [x_s, x_{s+1}]$, $1 \le s < S$
3.7 Finding $u(x)$ in $I_S = [x_S, b]$
3.8 A closest unimodal function $u_0(x)$ for a bimodal function $f(x)$
3.9 Distributions of the kernel dip
4.1 The kernel excess mass and the kernel dip of bimodal distribution
4.2 Intervals for calculating kernel excess mass
4.3 Distributions of kernel excess mass
5.1 Unimodal distributions : $N(0, 1)$, $t(6)$ and $\beta(3, 4)$
5.2 Tests for bimodal distributions
5.3 Tests for multimodal distributions
5.4 Calibrating tests for unimodal distributions
5.5 Calibrating tests for bimodal distributions
5.6 Calibrating tests for multimodal distributions
5.7 Estimated distributions for Centaurs and TNOs
5.8 B-R versus HR for Centaurs and TNOs
5.9 Estimated distributions of three groups
5.10 Estimated distributions of three groups without Centaurs
5.11 B-R distributions of data HR > HR:up and HR < HR:low
5.12 Kernel dip test for TNOs
5.13 Calibration Kernel Excess Mass test for TNOs

List of Tables

5.1 Tests for unimodal distributions
5.2 Unimodality tests for Centaurs and TNOs
5.3 Unimodality tests for three groups
5.4 Unimodality tests for three groups without Centaurs

Chapter 1

Introduction

1.1 Overview

The unimodality of a distribution is one of the most important criteria in cluster analysis. Multimodality of a distribution suggests that it is a mixture and contains several subpopulations. Several approaches based on density estimates have been proposed for detecting more than one mode: Cox (1966) used a histogram, and Silverman (1981) used the bandwidth of a kernel density estimate. Silverman's test statistic is generally determined by extreme values in the sample, not by the modes, and this flaw reduces the power of Silverman's test. Hartigan and Hartigan (1985) proposed the dip statistic, and Muller and Sawitzki (1991) proposed the excess mass statistic; both are based on the empirical cumulative distribution function. Cheng and Hall (1998b) showed that these two tests are equivalent in one dimension. In addition, Cheng and Hall (1998a) suggested a calibration method for the excess mass test. The power of this calibrated test, however, is reduced when there are modes with a small dip.

In order to overcome this drawback, we propose new dip and excess mass statistics based on kernel methods. Because the definition of the dip with the supremum distance suggested by Hartigan and Hartigan (1985) causes some difficulties, we suggest a new definition of the dip based on the total variation distance. Computing the new dip also yields the nearest unimodal distribution, and the proposed kernel dip test uses this closest unimodal distribution.

We also define the excess mass using the kernel distribution, in the same way as the empirical excess mass suggested by Muller and Sawitzki (1991). Our study shows that the asymptotic convergence property of the kernel excess mass is similar to that of the empirical excess mass.

In the numerical study, the proposed kernel methods for testing unimodality perform better than the other test methods. In particular, the calibrated kernel excess mass test has the greatest power. In the real data analysis, we describe how our kernel unimodality tests apply to astronomy data.

1.2 Outline of the thesis

The thesis is organized as follows. In Chapter 2, we review the dip test and the excess mass test as well as Silverman's test. Chapter 3 describes the kernel dip statistic with total variation and the computation of this new dip. We then propose the kernel dip test for unimodality, based on the computed kernel dip. In Chapter 4, we define the kernel excess mass and show its theoretical properties. We perform the simulation study and the real data analysis in Chapter 5 and conclude in Chapter 6.

Chapter 2

Unimodality Test

2.1 The dip test

1) The dip statistic

The dip of a distribution function, suggested by Hartigan and Hartigan (1985), is given by

$$D_S(F) = \inf_{U \in \mathcal{U}} \sup_x |F(x) - U(x)|$$

where $\mathcal{U}$ is the class of unimodal distribution functions. The dip statistic is the maximum difference between the empirical distribution function $F_n$ and the unimodal distribution function $U$ that minimizes that maximum difference, as in Figure 2.1. The empirical distribution function is defined by $F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I[X_i \le x]$, where $X_1, \cdots, X_n$ are sampled from $F$. Therefore we can write the empirical dip statistic as

$$D_S(F_n) = \inf_{U \in \mathcal{U}} \sup_x |F_n(x) - U(x)|.$$

Figure 2.1: The empirical dip statistics with sup distance
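The definition can be illustrated with a small sketch. It does not compute the dip itself (that requires the algorithm of Hartigan and Hartigan); it only evaluates $\sup_x |F_n(x) - U(x)|$ for one fixed unimodal candidate $U$, here the $U(0,1)$ distribution function, which gives an upper bound on $D_S(F_n)$. The function names and the sample are illustrative assumptions, not part of the thesis.

```python
import numpy as np

def ecdf_sup_distance(sample, cdf):
    """sup_x |F_n(x) - U(x)| for a continuous candidate distribution function U."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    u = cdf(x)
    # F_n jumps at every order statistic, so both sides of each jump are checked.
    upper = np.arange(1, n + 1) / n - u   # F_n(x_i) - U(x_i)
    lower = u - np.arange(0, n) / n       # U(x_i) - F_n(x_i-)
    return float(max(upper.max(), lower.max()))

def uniform_cdf(x):
    """Distribution function of U(0, 1), a unimodal candidate."""
    return np.clip(x, 0.0, 1.0)

sample = [0.1, 0.4, 0.45, 0.9]            # hypothetical data
bound = ecdf_sup_distance(sample, uniform_cdf)
```

Because the infimum in the definition runs over all unimodal $U$, any single candidate can only overestimate the dip.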

2) The dip test

Hartigan and Hartigan (1985) established the asymptotic property of the dip and proposed the dip test for unimodality.

Theorem 2.1. Let $F$ be unimodal with nonzero $k$th derivative at the mode $m$, for some $k \ge 2$, and

$$\inf_{0 < F'(x) < F'(m) - \varepsilon} \left| \frac{d}{dx} \log F'(x) \right| > 0 \quad \text{for each } \varepsilon > 0.$$

Then $\sqrt{n}\, D_S(F_n) \to 0$ in probability.

Theorem 2.1 shows that $D_S(F_n)$ converges to zero under the assumption of unimodality. The empirical dip $D_S(F_n)$ is the statistic for testing the null hypothesis that $F$ is unimodal, that is $D_S(F) = 0$, against the alternative that it is not unimodal, $D_S(F) > 0$. The dip test rejects when $D_S(F_n)$ is too large. Even when the underlying $F$ is unimodal, however, $D_S(F_n)$ is not zero but a small value for sufficiently large $n$, so we need the distribution of $D_S(F_n)$ under the null hypothesis. Hartigan and Hartigan (1985) argued that the dip of the uniform distribution is larger than that of any other distribution in the unimodal class. They used the uniform distribution $U(0, 1)$ to calculate the distribution of $D_S(F_n)$ for the dip test.

Figure 2.2: Distributions of the empirical dip

Many studies have noted that this test is conservative, because the dip statistic of a multimodal distribution can be smaller than that of the uniform distribution. We also observe this problem in Figure 2.2. This figure plots the distribution functions of the empirical dip statistic $D_S(F_n)$, computed from 1000 samples of size $n = 500$ drawn from the uniform, normal, and $t$ distributions and from a mixture of two normal distributions. In this figure, the distribution of the dip statistic from the normal mixture is similar to that from the uniform distribution. Although the normal mixture is bimodal, the dip test sometimes classifies it as unimodal.

2.2 The excess mass test

1) The excess mass

The excess mass offers another approach to testing unimodality, alongside the dip. Muller and Sawitzki (1991) introduced the excess mass as a measure of the excess probability concentrated around a peak. For a bounded continuous density $f$ with respect to Lebesgue measure, the excess mass functional is defined by

$$\lambda \mapsto E(\lambda) = \int (f(x) - \lambda)^{+}\, dx.$$

The $\lambda$-clusters are the connected components of $\{x : f(x) \ge \lambda\}$. When $f$ has exactly $m$ $\lambda$-clusters,

$$E_m(\lambda) = \sup \sum_{j=1}^{m} \int_{I_j(\lambda)} (f(x) - \lambda)\, dx,$$

where the supremum is taken over all families $\{I_j : j = 1, \cdots, m\}$ of pairwise disjoint connected sets. Specifically, $E_2(\lambda)$ is the shaded region in the left graph of Figure 2.3. In this graph, the disjoint intervals $I_1$ and $I_2$ satisfy $f(x) \ge \lambda$, and the excess mass functional attains its supremum on $I_1$ and $I_2$. The right graph of Figure 2.3 shows that $E_1(\lambda)$ is calculated in the same way as $E_2(\lambda)$. The difference between $E_2(\lambda)$ and $E_1(\lambda)$ is the excess mass of the second peak, and this difference can serve as the statistic of a unimodality test.

Figure 2.3: The excess mass $E_2(\lambda)$ and $E_1(\lambda)$
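As a rough numerical illustration of these quantities, the sketch below evaluates $E(\lambda)$ and $E_1(\lambda)$ on a grid. It assumes a density tabulated on an equally spaced grid, uses a maximum-subarray scan for the best single interval, and takes $E_2(\lambda) = E(\lambda)$ for a density with two $\lambda$-clusters. The mixture, the level `lam`, and all function names are illustrative choices, not taken from the thesis.

```python
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def excess_mass(f, dx, lam):
    """E(lambda): integral of (f - lambda)^+ over the grid."""
    return float(np.sum(np.clip(f - lam, 0.0, None)) * dx)

def excess_mass_one_interval(f, dx, lam):
    """E_1(lambda): best single interval, found by a maximum-subarray scan."""
    best = run = 0.0
    for v in (f - lam) * dx:
        run = max(0.0, run + v)   # restart the interval when the running sum drops below 0
        best = max(best, run)
    return float(best)

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
f_uni = normal_pdf(x)                                        # unimodal example
f_bi = 0.5 * normal_pdf(x, -2.0) + 0.5 * normal_pdf(x, 2.0)  # bimodal example
lam = 0.1
# Excess mass attributed to the second peak, E_2(lam) - E_1(lam).
second_peak = excess_mass(f_bi, dx, lam) - excess_mass_one_interval(f_bi, dx, lam)
```

For the unimodal density the two functions agree, so the difference is zero; for the bimodal mixture it is strictly positive at this level.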

Let $M$ be the maximum number of modes. We obtain an estimator of the excess mass from the empirical distribution as

$$E_{n,M}(\lambda) = \sup_{I_1, \cdots, I_M} \left[ \sum_{j=1}^{M} \big( F_n(I_j) - \lambda \|I_j\| \big) \right]$$

where $\|I\|$ is the length of $I$ and $F_n(I)$ is the $F_n$-measure of $I$. Let $D_{n,m}(\lambda) = E_{n,m}(\lambda) - E_{n,(m-1)}(\lambda)$ for some $\lambda > 0$. This excess mass difference statistic $D_{n,m}$ is useful for testing $m$-modality. Accordingly, we can define the empirical excess mass as

$$\Delta_{n,m} = \max_{\lambda > 0} D_{n,m}(\lambda).$$
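For $M = 1$ the supremum can be evaluated by brute force, since an optimal interval can be taken to be closed with endpoints at data points. The following sketch does this in $O(n^2)$; it is only meant to make the definition concrete, and the function name and toy sample are assumptions made for illustration.

```python
import numpy as np

def empirical_excess_mass_one(sample, lam):
    """E_{n,1}(lam): sup over intervals [X_(i), X_(j)] of F_n(I) - lam * |I|."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    i, j = np.triu_indices(n)        # all index pairs with i <= j
    mass = (j - i + 1) / n           # F_n-measure of the closed interval [x_i, x_j]
    length = x[j] - x[i]
    return float(np.max(mass - lam * length))

sample = [0.0, 0.1, 0.2, 1.0]        # hypothetical data
e1 = empirical_excess_mass_one(sample, lam=1.0)
```

In this toy sample the best interval is $[0, 0.2]$, which holds three of the four points at a length cost of $0.2$.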

Muller and Sawitzki (1991) considered the relation $\Delta_{n,2} = 2 D_S(F_n)$, and Cheng and Hall (1998b) proved it for every $n$. Consequently, tests based on the empirical excess mass and on the empirical dip give the same result, and the excess mass statistic shares the disadvantages of the dip statistic. In particular, the empirical excess mass is also nonzero even when the underlying $F$ is unimodal.

Several studies have established the asymptotic properties of the empirical excess mass under the strong unimodal condition.

Strong unimodal condition

(i) The sampling density $f$ has a continuous derivative $f'$, ultimately monotone in each tail.

(ii) The constraints $f'(x_0) = 0$ and $f(x_0) \neq 0$ are jointly satisfied at just one point $x_0$.

(iii) The second derivative $f''$ exists and is Hölder continuous within a neighborhood of $x_0$, with $f''(x_0) < 0$.

Cheng and Hall (1998b) derived the limiting distribution of $\Delta_{n,2}$ under the strong unimodal condition, as stated in the following theorem.

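Theorem 2.2. Let $W$ denote a standard Wiener process on the real line. Given real numbers $y_1 < y_2$ and $t$, define

$$\delta(y_1, y_2, t) = W(y_2) - W(y_1) - (y_2^3 - y_1^3) + t(y_2 - y_1),$$

and put

$$Z = 6^{1/5} \sup_{-\infty < t < \infty} \left[ \sup_{-\infty < y_1 < \cdots < y_4 < \infty} \{\delta(y_1, y_2, t) + \delta(y_3, y_4, t)\} - \sup_{-\infty < y_1 < y_2 < \infty} \delta(y_1, y_2, t) \right].$$

Under the strong unimodal condition,

$$n^{3/5} \Delta_{n,2} \text{ converges in distribution to } c_f Z \text{ as } n \to \infty,$$

where $c_f = \left\{ \dfrac{f(x_0)^3}{|f''(x_0)|} \right\}^{1/5}$. Moreover, $n^{3/5} D_S(F_n)$ has the same limiting distribution up to the factor $1/2$ implied by $\Delta_{n,2} = 2 D_S(F_n)$.

This theorem shows that the limiting distribution of the empirical excess mass statistic $\Delta_{n,2}$ depends on $f$ only through $c_f$.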
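To make the role of $c_f$ concrete, the short sketch below evaluates it for the standard normal density, where $x_0 = 0$, $f(0) = 1/\sqrt{2\pi}$ and $f''(0) = -f(0)$. The function name `c_f` and the normal example are illustrative assumptions; the thesis does not give this computation.

```python
import math

def c_f(f_x0, f2_x0):
    """c_f = (f(x0)^3 / |f''(x0)|)^(1/5) from Theorem 2.2."""
    return (f_x0 ** 3 / abs(f2_x0)) ** 0.2

f0 = 1.0 / math.sqrt(2.0 * math.pi)   # N(0,1) density at its mode x0 = 0
f2 = -f0                              # f''(x) = (x^2 - 1) f(x), evaluated at x = 0
c_normal = c_f(f0, f2)

# Same computation for N(0, sigma^2): f(0) = f0 / sigma, f''(0) = -f0 / sigma^3.
sigma = 3.0
c_scaled = c_f(f0 / sigma, -f0 / sigma ** 3)
```

For normal densities the ratio $f(x_0)^3 / |f''(x_0)|$ equals $1/(2\pi)$ whatever the scale, so both values coincide with $(2\pi)^{-1/5}$.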

2) The empirical excess mass test

The excess mass test suggested by Muller and Sawitzki (1991) is based on the assertion that $P[\Delta_{n,m} \le \kappa]$ is smaller than $P[\max_I |(U_n - U)(I)| \le \kappa]$, where $U$ is the standard uniform distribution and $U_n$ is the empirical distribution of a sample drawn from it. Since the empirical excess mass $\Delta_{n,2}$ is equal to twice the empirical dip $D_S(F_n)$, this test based on the uniform distribution is also conservative, as discussed in Section 2.1.

To avoid this conservatism, Cheng and Hall (1998a) proposed a calibrated empirical excess mass test. By Theorem 2.2, it suffices to estimate the factor $c_f$ in order to find the limiting distribution of the empirical excess mass statistic under the null hypothesis. If one can resample from a calibration distribution, a known unimodal density $g(\cdot)$, one can obtain the properties of the excess mass statistic that correspond to the excess mass of $f(\cdot)$. In fact, the asymptotic property of our kernel excess mass statistic is the same as that of the empirical excess mass statistic. Therefore we introduce the calibrated excess mass test in Chapter 4. Moreover, the new kernel test statistic performs better than the empirical test statistic in the simulation study.

2.3 Silverman’s test

1) Silverman’s Test

Given a sample $\mathcal{X} = \{X_1, \cdots, X_n\}$ from a population with density $f$, the kernel density estimate is defined by

$$f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$

where $h$ is the bandwidth and $K$ is a kernel function. Silverman (1981) suggested a unimodality test based on the fact that the number of modes of $f_h(x)$ is nonincreasing in $h$ when $K$ is the Gaussian kernel. The test is based on the kernel density estimate with the smallest bandwidth $h$ that yields an $m$-mode density. The null hypothesis $H_0$ is that $f$ has $m$ modes and the alternative hypothesis $H_1$ is that $f$ has more than $m$ modes. For this test, Silverman proposed the statistic

$$h_{crit} = \inf\{h : f(\cdot, h) \text{ has at most } m \text{ modes}\}.$$

A large value of $h_{crit}$ is evidence against $H_0$. When the null hypothesis is that the true density is $g$ and $h_0$ is the value of $h_{crit}$ computed from the data, Silverman's test is based on

$$\Pr(h_{crit} > h_0) = \Pr\big(f(\cdot\,; h_0) \text{ has more than } m \text{ modes} \mid x_1, \cdots, x_n \text{ from } g\big).$$

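To assess the significance of the statistic $h_{crit}$, we use a bootstrap with $R$ resamples. Let $f_{crit}$ denote $f_h$ with $h = h_{crit}$. Conditional on $\mathcal{X}$, let $X_1^*, \cdots, X_n^*$ be a resample drawn from $f_{crit}$, let

$$f_h^*(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i^*}{h}\right),$$

and let $h_{crit}^* = \inf\{h : f_h^* \text{ has at most } m \text{ modes}\}$. For $m = 1$, the test is determined by the number of resamples in which $f_h^*(x)$, with $h = h_{crit}$, possesses more than one mode:

$$\hat{P} = \frac{\#\{\text{resamples in which } f_h^*(x) \text{ has more than one mode}\}}{R}.$$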
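The procedure above can be sketched as follows, under several simplifying assumptions that are not from the thesis: modes are counted on a finite grid, $h_{crit}$ is located by bisection (justified by the monotonicity of the mode count for the Gaussian kernel), and the smoothed bootstrap draws $X_i^* = X_{I_i} + h_{crit}\,\varepsilon_i$ without the variance rescaling that some descriptions of Silverman's test apply. All function names, the grid size, and the example sample are hypothetical.

```python
import numpy as np

def count_modes(data, h, grid_size=512):
    """Count local maxima of a Gaussian kernel density estimate on a grid."""
    lo, hi = data.min() - 3.0 * h, data.max() + 3.0 * h
    grid = np.linspace(lo, hi, grid_size)
    z = (grid[:, None] - data[None, :]) / h
    dens = np.exp(-0.5 * z ** 2).sum(axis=1)   # constant factor 1/(nh sqrt(2 pi)) omitted
    slope = np.sign(np.diff(dens))
    slope = slope[slope != 0]
    # A mode is a switch from rising (+1) to falling (-1).
    return int(np.sum((slope[:-1] > 0) & (slope[1:] < 0)))

def critical_bandwidth(data, m=1, iters=30):
    """Bisection for the smallest h whose estimate has at most m modes."""
    spread = data.max() - data.min()
    lo, hi = 1e-3 * spread, spread
    while count_modes(data, hi) > m:           # safeguard: widen the bracket if needed
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) > m:
            lo = mid
        else:
            hi = mid
    return hi                                  # always satisfies count_modes(data, hi) <= m

def silverman_pvalue(data, m=1, R=50, seed=0):
    """Share of smoothed-bootstrap resamples with more than m modes at h_crit."""
    rng = np.random.default_rng(seed)
    h0 = critical_bandwidth(data, m)
    exceed = 0
    for _ in range(R):
        # Draw from f_crit: resample data points and add N(0, h0^2) noise.
        star = rng.choice(data, size=data.size) + h0 * rng.normal(size=data.size)
        exceed += count_modes(star, h0) > m
    return h0, exceed / R

rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(-3.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
h0, p_value = silverman_pvalue(sample, m=1, R=20)
```

The grid resolution limits how small a bump can be detected, so the returned bandwidth is only an approximation to $h_{crit}$.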

Mammen et al. (1992) and Cheng and Hall (1998b) established the asymptotic properties of $h_{crit}$ under the null hypothesis $H_0$. In addition, Mammen et al. (1992) pointed out that Silverman's test is conservative, because the true asymptotic level is less than the nominal one. The distribution of $U_n = P(h_{crit}^* \le h_{crit} \mid \mathcal{X})$ is not far from being uniform on the interval $(0, 1)$, at least for large values of $n$. Moreover, the estimate at $h_{crit}$ sometimes has a spurious mode in the tail. If both the support of $f$ and the interval $I$ over which modes are counted are unbounded, then the properties of $h_{crit}$ are generally determined by extreme values in the sample, not by the modes of $f$.

2) Calibrating Silverman’s Test

The calibrated Silverman test, proposed by Hall and York (2001), improves the level accuracy of Silverman's test. The calibration takes two forms: an asymptotic approach based on $G_n(\lambda) = P(h_{crit}^*/h_{crit} \le \lambda \mid \mathcal{X})$, and a Monte Carlo technique. Under $H_0$, $G_n(\lambda)$ converges weakly to a stochastic process $G$, and there exists a unique $\lambda_\alpha$ such that $P\{G(\lambda_\alpha) \ge 1 - \alpha\} = \alpha$. Given $\alpha$, we need to specify the constant $\lambda_\alpha$. The first approach is an asymptotic correction based on the limiting distribution of the test statistic,

$$P\{G_n(\lambda_\alpha) \ge 1 - \alpha\} \to P\{G(\lambda_\alpha) \ge 1 - \alpha\} = \alpha.$$

From a simulation study, they fitted

$$\lambda_\alpha = \frac{a_1 \alpha^3 + a_2 \alpha^2 + a_3 \alpha + a_4}{\alpha^3 + a_5 \alpha^2 + a_6 \alpha + a_7}$$

to the output to provide a means of approximating $\lambda_\alpha$ for arbitrary $\alpha$. The coefficients are $a_1 = 0.94029$, $a_2 = -1.59914$, $a_3 = 0.17695$, $a_4 = 0.48971$, $a_5 = -1.77793$, $a_6 = 0.36162$, $a_7 = 0.42423$. The other method estimates $\lambda_\alpha$ by Monte Carlo, which is possible because $G$ does not depend on unknowns. We use the former method in the simulation study of Chapter 5.
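The fitted rational function is easy to evaluate directly. The sketch below only plugs the coefficients listed above into the formula; the function name `lambda_alpha` is an illustrative choice.

```python
# Coefficients a1, ..., a7 of the fitted rational approximation quoted above.
A = (0.94029, -1.59914, 0.17695, 0.48971, -1.77793, 0.36162, 0.42423)

def lambda_alpha(alpha):
    """Approximate calibration constant lambda_alpha for nominal level alpha."""
    a1, a2, a3, a4, a5, a6, a7 = A
    num = a1 * alpha ** 3 + a2 * alpha ** 2 + a3 * alpha + a4
    den = alpha ** 3 + a5 * alpha ** 2 + a6 * alpha + a7
    return num / den

lam_05 = lambda_alpha(0.05)   # roughly 1.13 at the 5% level
```

With this constant, the calibrated test rejects $H_0$ at level $\alpha$ when $G_n(\lambda_\alpha) = P(h_{crit}^* \le \lambda_\alpha h_{crit} \mid \mathcal{X}) \ge 1 - \alpha$, in line with the limiting statement above.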


Chapter 3

The Kernel Dip Test

3.1 The kernel dip with total variation

1) The total variation dip

The empirical dip statistic has some defects caused by the supremum distance. Recall the definition of the dip statistic,

$$D_S(F) = \inf_{U \in \mathcal{U}} \sup_x |F(x) - U(x)|$$

where $\mathcal{U}$ is the class of unimodal distribution functions. The unimodal distribution closest to the observed distribution in terms of the sup distance is not always reasonable. The top of Figure 3.1 shows a distribution $F$, a mixture of $N(0, 1)$ and $N(3, 1)$, and two unimodal distributions $U_1$ and $U_2$ near $F$. The sup distances from $U_1$ and $U_2$ to $F$ are 0.0415 and 0.024, respectively. The bottom of Figure 3.1 shows the corresponding densities $f$, $u_1$, and $u_2$. Although $U_2$ is closer to $F$ than $U_1$, the density $u_2$ is unconvincing as the unimodal density closest to $f$; the density $u_1$ is a more reasonable choice. In addition, the algorithm for computing the dip suggested by Hartigan and Hartigan (1985) gives a result analogous to $\sup |F - U_1|$. When the kernel distribution $F_h$ is a smooth function, unlike the empirical distribution $F_n$, this algorithm cannot give the exact dip of $F_h$.

Figure 3.1: Sup distances between $F$ and unimodal distributions $U_1$ and $U_2$

In order to overcome this difficulty, we apply the total variation distance to the definition of the dip. Levin et al. (2008) introduced the total variation as a distance measure between two probability distributions. For two probability distributions $P$ and $Q$ on a sample space $\Omega$, the total variation distance is defined by

$$\|P - Q\|_{TV} = \max_{A \subset \Omega} |P(A) - Q(A)|.$$

For an arbitrary sample space $\Omega$, a measure $\mu$, and probability distributions $P$ and $Q$ with Radon-Nikodym derivatives $f_P$ and $f_Q$ with respect to $\mu$,

$$\|P - Q\|_{TV} = \frac{1}{2} \|f_P - f_Q\|_{L_1(\mu)} = \frac{1}{2} \int_{\Omega} |f_P - f_Q|\, d\mu.$$
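The $L_1$ form lends itself to a direct numerical check. The sketch below approximates the total variation distance between $N(0,1)$ and $N(1,1)$ on a grid and compares it with the closed form $2\Phi(1/2) - 1$ for two unit-variance normals whose means differ by one. The grid, the choice of distributions, and the function names are assumptions made for illustration.

```python
import math
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def total_variation(f_vals, g_vals, dx):
    """Half the L1 distance between two densities sampled on a common grid."""
    return 0.5 * float(np.sum(np.abs(f_vals - g_vals)) * dx)

x = np.linspace(-10.0, 11.0, 21001)       # grid spacing 0.001
dx = x[1] - x[0]
tv = total_variation(normal_pdf(x, 0.0), normal_pdf(x, 1.0), dx)
exact = math.erf(0.5 / math.sqrt(2.0))    # equals 2 * Phi(1/2) - 1
```

The same `total_variation` helper applies to any pair of densities tabulated on a common grid, which is the setting used later for kernel density estimates.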

We use the total variation instead of the supremum distance to measure the distance between distributions more accurately. For any distribution $F$, we redefine the dip as

$$D(F) = \inf_{U \in \mathcal{U}} \|F - U\|_{TV}$$

where $\mathcal{U}$ is the class of unimodal distribution functions. The total variation distance between $F$ and a unimodal distribution $U$ is half the $L_1$ distance between the Radon-Nikodym derivatives $f$ and $u$, as in Figure 3.2.

Figure 3.2: The total variation distance between f(x) and u(x)

If $F$ and $U$ are absolutely continuous distributions, then the Radon-Nikodym derivatives $f$ and $u$ exist. Hence we can write

$$D(F) = \inf_{U \in \mathcal{U}} \|F - U\|_{TV} = \inf_{u \in \mathcal{U}^*} \left[ \frac{1}{2} \int |f(x) - u(x)|\, dx \right]$$

where $\mathcal{U}^*$ is the class of unimodal density functions. The class $\mathcal{U}^*$ can be written as

$$\mathcal{U}^* = \{u_0 \mid u_0 \text{ is nondecreasing in } (-\infty, m] \text{ and nonincreasing in } (m, \infty), \text{ where } m \text{ is the mode}\}.$$

The total variation dip has the following properties.

Property 1. For any distributions $F_1$ and $F_2$,

$$D(F_1) \le D(F_2) + \|F_1 - F_2\|_{TV}.$$

This follows from the triangle inequality for the total variation distance: for any probability distributions $P$, $Q$, and $F$, $\|P - Q\|_{TV} \le \|P - F\|_{TV} + \|F - Q\|_{TV}$.

Property 2. If $F$ is a unimodal distribution, that is $F \in \mathcal{U}$, then $D(F) = 0$. On the other hand, if $F$ is a multimodal distribution, then $D(F) > 0$.

Using this property, we can determine whether a distribution $F$ is unimodal.

2) The Kernel dip

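Recall the kernel density estimate,

$$f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$

where the kernel $K$ satisfies $\int K = 1$. In addition, we consider the kernel distribution function estimate

$$F_h(x) = \int_{-\infty}^{x} f_h(y)\, dy = \frac{1}{n} \sum_{i=1}^{n} L\!\left(\frac{x - X_i}{h}\right)$$

where $L(t) = \int_{-\infty}^{t} K(u)\, du$.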
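A minimal sketch of the kernel distribution function estimate follows, assuming the Gaussian kernel so that $L$ is the standard normal distribution function. The function name, the three data points, and the bandwidth are illustrative and not taken from the thesis.

```python
import math

def kernel_cdf(x, data, h):
    """F_h(x) = (1/n) * sum_i L((x - X_i) / h), with L the standard normal CDF."""
    def L(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return sum(L((x - xi) / h) for xi in data) / len(data)

data = [-1.0, 0.0, 1.0]                   # hypothetical sample
value_at_zero = kernel_cdf(0.0, data, h=0.5)
```

By symmetry of this sample around zero, the estimate at $x = 0$ is exactly one half, and $F_h$ increases smoothly from 0 to 1 instead of jumping as $F_n$ does.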

The newly suggested dip is defined on the class of absolutely continuous distribution functions, so we can estimate the total variation dip with a kernel estimator. The estimated total variation dip measures the distance between the kernel distribution function and the unimodal distribution functions as

$$D(F_h) = \inf_{U \in \mathcal{U}} \|F_h - U\|_{TV}$$

where $\mathcal{U}$ is the unimodal class. The kernel dip has the following properties.

Property 1. When the kernel distribution function $F_h(x)$ is unimodal, its kernel dip is zero, $D(F_h) = 0$.

Property 2. When the true distribution function $F$ is unimodal,

$$D(F_h) = \inf_{U \in \mathcal{U}} \|F_h - U\|_{TV} \le \|F_h - F\|_{TV}.$$

Moreover, Theorem 3.1 gives the asymptotic property of the kernel dip.

Theorem 3.1. Assume that the true distribution $F$ is unimodal and that $nh \to \infty$, $h \to 0$ as $n \to \infty$. Then

$$D(F_h) \to 0 \text{ in probability.}$$

Devroye and Gyorfi (1985) established the convergence of the kernel density estimator stated in Lemma 3.2.

Lemma 3.2. Assume $nh \to \infty$ and $h \to 0$ as $n \to \infty$. Then

$$\int |f_h(x) - f(x)|\, dx \to 0 \text{ as } n \to \infty.$$

Theorem 3.1 follows from Lemma 3.2 and Property 2 of the kernel dip, since $D(F_h) \le \|F_h - F\|_{TV} = \frac{1}{2} \int |f_h(x) - f(x)|\, dx$.

Figure 3.3: Unimodal distributions u0(x) (dashed line) and u(x) (dash-dot

line) close to distribution f(x) (solid line)

3.2 Computing the kernel dip

1) Computing the total variation dip

The total variation dip requires a new computing method. To compute the total variation dip $D(F)$ of a distribution function $F$, we must find a unimodal distribution $U_0$ that satisfies $D(F) = \|F - U_0\|_{TV}$. Consider, as an example, $F(x)$ with the density $f(x)$ in Figure 3.3. There are two unimodal densities $u_0(x)$ and $u(x)$ satisfying $\|u_0(x) - f(x)\|_{L_1} \le \|u(x) - f(x)\|_{L_1}$. If $u_0(x)$ is closest to $f(x)$, we obtain the dip as $D(F) = \frac{1}{2} \int |f(x) - u_0(x)|\, dx$. We assume that $f$ is a continuous function with bounded support $I$. The highest mode $m$ divides the interval $I$ into the part $I_L = (-\infty, m)$, where a unimodal density must be nondecreasing, and the part $I_R = (m, \infty)$, where it must be nonincreasing. The following theorems show how to find the closest unimodal function for computing the total variation dip. As in Figure 3.3, we first find a nondecreasing function $u_0$ near $f$ in $I_L$. In particular, we concentrate on finding a nondecreasing function in $I_L^* = [a, b]$, where $a$ and $b$ satisfy $f(a) = \min_{a_i < m} f(a_i)$ and $f(b) = \max_{b_i < m} f(b_i)$.

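Theorem 3.3. Let $f(x)$ be a continuous function on $[a, b] \subset \mathbb{R}$ with $K$ modes at $b_k$ and antimodes at $a_k$, where $1 \le k \le K$, and set the increasing interval

$$I_{inc} = [a, b_1] \cup \left( \bigcup_{k=1}^{K-1} [a_k, b_k] \right) \cup [a_K, b].$$

Assume that $f(a) = \min_{1 \le k \le K} f(a_k)$ and $f(b) = \max_{1 \le k \le K} f(b_k)$. Then

$$\inf_{u \in U_0} \int_a^b |f(x) - u(x)|\, dx = \inf_{u \in U_1} \int_a^b |f(x) - u(x)|\, dx \tag{3.1}$$

where $U_0$ is the class of nondecreasing continuous functions and

$$U_1 = \Big\{ u \;\Big|\; u(x) = f(x) I[a \le x \le c_1] + \sum_{i=1}^{L} f(c_i) I[c_i \le x \le d_i] + \sum_{i=1}^{L-1} f(x) I[d_i \le x \le c_{i+1}] + f(x) I[d_L \le x \le b],$$

$$\text{where } f(c_1) \le \cdots \le f(c_L),\; c_1 \le \cdots \le c_L,\; c_i \in I_{inc}\ (1 \le i \le L),\ \text{and } d_i = \min\{x \mid f(x) = f(c_i),\ c_i < x \text{ and } x \in I_{inc}\} \Big\}.$$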

Figure 3.4: A part of f(x) and nondecreasing function u(x)

Proof of Theorem 3.3. Assume that the nondecreasing function $u(x)$ satisfies $u(a) = f(a)$ and $u(b) = f(b)$, because we are looking for the function closest to $f$. For any $u(x)$ in $U_0$, there exist $S$ intersection points of $u(x)$ and $f(x)$ in the intervals where $f(x)$ decreases. As in Figure 3.4, $x_1, \cdots, x_S$ ($S \le K$) can be defined as the elements of the set $\{x \mid f(x) = u(x) \text{ and } x \notin I_{inc}\}$. Set $I_0 = [a, x_1]$, $I_1 = [x_1, x_2]$, $\cdots$, $I_{S-1} = [x_{S-1}, x_S]$, $I_S = [x_S, b]$. If we find nondecreasing functions $u_0^*, \cdots, u_S^*$ satisfying

$$\int_{I_i} |f(x) - u(x)|\, dx \ge \int_{I_i} |f(x) - u_i^*(x)|\, dx \quad \text{for every } i = 0, \cdots, S,$$

then $u^* = \sum_{s=0}^{S} u_s^* I(x \in I_s)$ satisfies

$$\int_a^b |f(x) - u(x)|\, dx \ge \int_a^b |f(x) - u^*(x)|\, dx.$$

Thus our problem reduces to finding, for any $u \in U_0$, a function $u^*$ that is closer to $f$ than $u$.

First, we establish $u_0^*(x)$ on $I_0 = [a, x_1]$. We can find $x_0$ in the open interval $(a, x_1)$ such that $f(x_0) = f(x_1)$. If $x_0 \le b_1$, as in the top graphs of Figure 3.5, then we can set $c_{01} = x_0$ and

$$u_0^*(x) = f(x) I[a \le x \le c_{01}] + f(c_{01}) I[c_{01} \le x \le x_1].$$

If $x_0 \ge b_1$, we consider two cases, $u(b_1) \ge f(b_1)$ or $u(b_1) < f(b_1)$. When $u(b_1) \ge f(b_1)$, we find modes $b_1 = b_{01} < b_{02} < \cdots < b_{0J} < x_0$ satisfying $f(b_{01}) \le f(b_{02}) \le \cdots \le f(b_{0J})$, as in the middle of Figure 3.5. Let $c_{0j} = b_{0j}$, $j = 1, \cdots, J$, and $c_{0(J+1)} = x_0$. Moreover, let $d_{0j}$ be a solution of $f(x) = f(c_{0j})$ in $(c_{0j}, c_{0(j+1)})$, $j = 1, \cdots, J$, and $d_{0(J+1)} = x_1$. Then we can set

$$u_0^*(x) = f(x) I[a \le x \le c_{01}] + \sum_{j=1}^{J+1} f(c_{0j}) I[c_{0j} \le x \le d_{0j}] + \sum_{j=1}^{J} f(x) I[d_{0j} \le x \le c_{0(j+1)}].$$

When $u(b_1) < f(b_1)$, we can find antimodes $a_{01} = a_1 < a_{02} < \cdots < a_{0J} < x_0$ satisfying $f(a_{01}) \le f(a_{02}) \le \cdots \le f(a_{0J})$, as in the bottom of Figure 3.5. Let $d_{0j} = a_{0j}$, $j = 1, \cdots, J$, and $d_{0(J+1)} = x_1$. In addition, let $c_{0j}$ be a solution of $f(x) = f(d_{0j})$ in $(d_{0(j-1)}, d_{0j})$, $j = 1, \cdots, J$, with $d_{00} = a$ and $c_{0(J+1)} = x_0$. Then we can set

$$u_0^*(x) = f(x) I[a \le x \le c_{01}] + \sum_{j=1}^{J+1} f(c_{0j}) I[c_{0j} \le x \le d_{0j}] + \sum_{j=1}^{J} f(x) I[d_{0j} \le x \le c_{0(j+1)}].$$

Figure 3.5: Finding $u_0^*(x)$ in $I_0 = [a, x_1]$

Next, we consider $f(x)$ and $u_s^*(x)$ on $I_s = [x_s, x_{s+1}]$, $1 \le s < S$. Let $d_{s0} = \min\{x \mid f(x) = f(x_s) \text{ and } x \in (x_s, x_{s+1})\}$, and let $x_s^0$ be a solution of $f(x) = f(x_{s+1})$ in $(x_s, x_{s+1})$. If there exists no mode whose height is bigger than $f(x_s)$ and smaller than $f(x_{s+1})$, then we set $u_s^*$ as in the top of Figure 3.6, that is,

$$u_s^*(x) = f(x_s) I[x_s \le x \le d_{s0}] + f(x) I[d_{s0} \le x \le x_s^0] + f(x_{s+1}) I[x_s^0 \le x \le x_{s+1}].$$

Otherwise, we consider $u_s^*$ in two cases. When $f(b_k) \le u(b_k)$ for a mode $b_k$ such that $d_{s0} \le b_k \le x_s^0$, we find modes $b_{s1} < b_{s2} < \cdots < b_{sJ_s} \le x_s^0$ satisfying $f(b_{s1}) \le \cdots \le f(b_{sJ_s})$ and $b_{s1} \ge d_{s0}$. Let $c_{sj} = b_{sj}$, $j = 1, \cdots, J_s$, $c_{s(J_s+1)} = x_s^0$, and let $d_{sj}$ be a solution of $f(x) = f(c_{sj})$ in $(c_{sj}, c_{s(j+1)})$, $j = 1, \cdots, J_s$, as in the middle of Figure 3.6. Then we can set

$$u_s^*(x) = f(x_s) I[x_s \le x \le d_{s0}] + \sum_{j=0}^{J_s} f(x) I[d_{sj} \le x \le c_{s(j+1)}] + \sum_{j=1}^{J_s} f(c_{sj}) I[c_{sj} \le x \le d_{sj}] + f(x_{s+1}) I[x_s^0 \le x \le x_{s+1}].$$

When $f(a_k) \le u(a_k)$ for an antimode $a_k$ such that $d_{s0} \le a_k \le x_s^0$, we find antimodes $a_{s1} < a_{s2} < \cdots < a_{sJ_s} \le x_s^0$ satisfying $f(a_{s1}) \le \cdots \le f(a_{sJ_s})$ and $a_{s1} > d_{s0}$. Let $d_{sj} = a_{sj}$, $j = 1, \cdots, J_s$, and let $c_{sj}$ be a solution of $f(x) = f(d_{sj})$ in $(d_{s(j-1)}, d_{sj})$, $j = 1, \cdots, J_s$, with $c_{s(J_s+1)} = x_s^0$, as in the bottom of Figure 3.6. Then we can also set

$$u_s^*(x) = f(x_s) I[x_s \le x \le d_{s0}] + \sum_{j=0}^{J_s} f(x) I[d_{sj} \le x \le c_{s(j+1)}] + \sum_{j=1}^{J_s} f(c_{sj}) I[c_{sj} \le x \le d_{sj}] + f(x_{s+1}) I[x_s^0 \le x \le x_{s+1}].$$
Figure 3.6: Finding $u_s^*(x)$ in $I_s = [x_s, x_{s+1}]$, $1 \le s < S$

Finally, we find $u_S^*(x)$ on $I_S = [x_S, b]$. Let $x_S^0$ be a solution of $f(x) = f(x_S)$ in $(x_S, b]$. If $x_S^0 \ge a_K$, as in the top graphs of Figure 3.7, then

$$u_S^*(x) = f(x_S) I[x_S \le x \le x_S^0] + f(x) I[x_S^0 \le x \le b].$$

If $x_S^0 < a_K$ and $f(b_K) < u(b_K)$, we find modes $b_{S1} < b_{S2} < \cdots < b_{SJ_S} \le b_K$ satisfying $f(b_{S1}) \le \cdots \le f(b_{SJ_S})$ and $b_{S1} > x_S$. Let $c_{Sj} = b_{Sj}$, $j = 1, \cdots, J_S$, and let $d_{Sj}$ be a solution of $f(x) = f(c_{Sj})$ in $(c_{Sj}, c_{S(j+1)})$, $j = 1, \cdots, J_S$, with $c_{S(J_S+1)} = b$ and $d_{S0} = x_S^0$, as in the middle of Figure 3.7. Then we can set

$$u_S^*(x) = f(x_S) I[x_S \le x \le d_{S0}] + \sum_{j=0}^{J_S-1} f(x) I[d_{Sj} \le x \le c_{S(j+1)}] + \sum_{j=1}^{J_S} f(c_{Sj}) I[c_{Sj} \le x \le d_{Sj}] + f(x) I[d_{SJ_S} \le x \le b].$$

If $x_S^0 < a_K$ and $f(b_K) \ge u(b_K)$, we find antimodes $a_{S1} < \cdots < a_{SJ_S} \le a_K$ satisfying $f(a_{S1}) \le \cdots \le f(a_{SJ_S})$ and $a_{S1} > x_S$. Let $d_{Sj} = a_{Sj}$, $j = 1, \cdots, J_S$, and let $c_{Sj}$ be a solution of $f(x) = f(d_{Sj})$ in $(d_{S(j-1)}, d_{Sj})$, $j = 1, \cdots, J_S$, with $d_{S0} = x_S^0$. Then we can set $u_S^*$ as in the bottom of Figure 3.7:

$$u_S^*(x) = f(x_S) I[x_S \le x \le d_{S0}] + \sum_{j=0}^{J_S-1} f(x) I[d_{Sj} \le x \le c_{S(j+1)}] + \sum_{j=1}^{J_S} f(c_{Sj}) I[c_{Sj} \le x \le d_{Sj}] + f(x) I[d_{SJ_S} \le x \le b].$$

Figure 3.7: Finding $u(x)$ in $I_S = [x_S, b]$

As a result, for any $u$ in $U_0$ we can construct a nondecreasing function $u^* = \sum_{s=0}^{S} u_s^* I(x \in I_s)$ in $U_1$ that is at least as close to $f$ as $u$ in terms of $L_1$ distance:

$$\int_a^b |f(x) - u(x)|\, dx \ge \int_a^b |f(x) - u^*(x)|\, dx.$$

Moreover, the fact that every function in $U_1$ is nondecreasing gives

$$\inf_{u \in U_0} \int_a^b |f(x) - u(x)|\, dx \le \inf_{u \in U_1} \int_a^b |f(x) - u(x)|\, dx.$$

Combining the two inequalities yields (3.1).
A nonincreasing function near $f$ in $I_R$ can be found in the same way as in Theorem 3.3.

Corollary 3.3.1. Let $f(x)$ be a continuous function on $[a, b] \subset \mathbb{R}$ with $K$ modes at $a_k$ and antimodes at $b_k$, where $1 \le k \le K$, and set the decreasing interval

$$I_{dec} = [a, b_1] \cup \left( \bigcup_{k=1}^{K-1} [b_k, a_{k+1}] \right) \cup [b_K, b].$$

Assume that $f(a) = \max_{1 \le k \le K} f(b_k)$ and $f(b) = \min_{1 \le k \le K} f(a_k)$ for some $i, j \in \{1, \cdots, K\}$. Then

$$\inf_{u \in U_0} \int_a^b |f(x) - u(x)|\, dx = \inf_{u \in U_1} \int_a^b |f(x) - u(x)|\, dx$$

where $U_0$ is the class of nonincreasing functions and

$$U_1 = \Big\{ u(x) \;\Big|\; u(x) = f(x) I[a \le x \le c_1] + \sum_{i=1}^{L} f(c_i) I[c_i \le x \le d_i] + \sum_{i=1}^{L-1} f(x) I[d_i \le x \le c_{i+1}] + f(x) I[d_L \le x \le b],$$

$$\text{where } f(c_1) \le \cdots \le f(c_L),\; c_1 \le \cdots \le c_L,\; c_i \in I_{dec}\ (1 \le i \le L),\ \text{and } d_i = \min\{x \mid f(x) = f(c_i),\ c_i < x \text{ and } x \in I_{dec}\} \Big\}.$$

When $F$ is a bimodal distribution, the closest unimodal distribution can be found more easily by the following theorem.

Figure 3.8: A closest unimodal function u0(x) for a bimodal function f(x)

Theorem 3.4. Let $F$ be an absolutely continuous bimodal distribution with density function $f$, and set

$$U_0(x) = G_0(x) I[x \le m] + L_0(x) I[x \ge m]$$

where $m = \arg\max f(x)$, $G_0$ is the greatest convex minorant (g.c.m.) of $F$ and $L_0$ is the least concave majorant (l.c.m.) of $F$. Then

$$D(F) = \|F - U_0\|_{TV}.$$

Specifically, the g.c.m. is defined as $G_0(x) = \sup\{G(x) \mid G(x) \text{ is convex in } (-\infty, m] \text{ and } F(x) \ge G(x)\}$, and the l.c.m. is defined as $L_0(x) = \inf\{L(x) \mid L(x) \text{ is concave in } [m, \infty) \text{ and } F(x) \le L(x)\}$.

Proof of Theorem 3.4. Assume that $F$ has modes at $x = m_1$ and $x = m$ with $m_1 < m$. We only need to find the nondecreasing function $u_0$ on $(-\infty, m]$ that minimizes the $L_1$ distance to $f(x)$, because $L_0(x) I[x \ge m] = F(x) I[x \ge m]$. By Theorem 3.3, the closest unimodal density $u_0$ can be written as

$$u_0(x) = f(x) I[-\infty < x \le c_1] + f(c_1) I[c_1 \le x \le d_1] + f(x) I[d_1 \le x < \infty]$$

for some $c_1$ and $d_1$ such that $c_1 < m_1 < d_1 < m$ and $f(a_1) \le f(c_1) \le f(m_1)$, where $a_1$ is the antimode between $m_1$ and $m$, as in Figure 3.8. We can rewrite

$$\int_{-\infty}^{m} |f(x) - u_0(x)|\, dx = \int_{M = [c_1, x_1]} \big(f(x) - f(c_1)\big)\, dx + \int_{A = [x_1, d_1]} \big(f(c_1) - f(x)\big)\, dx$$

where $x_1$ satisfies $f(x_1) = f(c_1)$ and $m_1 \le x_1 \le a_1$. Since $F(m) = U_0(m)$, $u_0$ must satisfy $\int_M \big(f(x) - f(c_1)\big)\, dx = \int_A \big(f(c_1) - f(x)\big)\, dx$ for some $c_1$. For all $x \le m$,

$$\int_{-\infty}^{x} f(y)\, dy \ge \int_{-\infty}^{x} u_0(y)\, dy,$$

and $\int_{-\infty}^{x} u_0(y)\, dy = G_0(x)$ is the g.c.m. of $F$ on $(-\infty, m]$ because $u_0(y)$ is a nondecreasing function. Consequently, $U_0(x) = G_0(x) I[x \le m] + L_0(x) I[x \ge m]$ satisfies

$$\inf_{U \in \mathcal{U}} \|F - U\|_{TV} = \|F - U_0\|_{TV}.$$

When $F$ has modes at $x = m_2$ and $x = m$ with $m_2 > m$, we can find $U_0$ in the same way by Corollary 3.3.1.

In computing the total variation dip, we only consider the case in which F is a multimodal distribution, because the dip of a unimodal distribution is zero by the property of the total variation dip.

Suppose F is a multimodal distribution. Find $m = \arg\max_x f(x)$ and compute the dip separately on the intervals $(-\infty, m]$ and $[m, \infty)$, that is,

\[ \inf_{u \in U^*} \int |f(x) - u(x)|\,dx = \inf_{g \in U_L^*} \int_{-\infty}^{m} |f(x) - g(x)|\,dx + \inf_{l \in U_R^*} \int_{m}^{\infty} |f(x) - l(x)|\,dx, \]

where $U_L^*$ is the class of nondecreasing functions and $U_R^*$ is the class of nonincreasing functions. Denote the left term by $D_L(F)$ and the right term by $D_R(F)$. First, we compute $D_L(F)$. If F has no other mode in $(-\infty, m)$, then $D_L(F) = 0$. If F has exactly one mode in $(-\infty, m)$, $D_L(F)$ is obtained directly from Theorem 3.4. When F has $k\ (\ge 2)$ modes in $(-\infty, m)$, computing $D_L(F)$ is more involved. We consider the nondecreasing functions of the form

\[ u(x) = f(x) I[x \le c_1] + \sum_{i=1}^{L} f(c_i) I[c_i \le x \le d_i] + \sum_{i=1}^{L-1} f(x) I[d_i \le x \le c_{i+1}] + f(x) I[d_L \le x], \]

where $f(c_1) \le \dots \le f(c_L)$ for $c_1 \le \dots \le c_L$ and $d_i = \min_{x \in [a,b]} \{x \mid f(x) = f(c_i)\}$ as in Theorem 3.3, and which satisfy

\[ \frac{1}{2} \int_{-\infty}^{m} \{f(x) - u(x)\}\,dx = 0. \]

Among these functions, we choose the nondecreasing function with the minimum total variation distance to f on $(-\infty, m]$. The right term of the dip, $D_R(F)$, is computed similarly.
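The split at the global mode can be sketched numerically on a uniform grid. The code below is only an illustration under stated assumptions: it swaps in the pool-adjacent-violators algorithm to produce the flat levels, which replaces each pooled block by its mean and therefore preserves mass, and it reports half the summed $L_1$ distance, following the factor 1/2 in the total variation distance. The function names and the grid representation are hypothetical.

```python
def pava_nondecreasing(values):
    """Pool-adjacent-violators fit: nondecreasing, equal weights.

    Each pooled block is replaced by its mean, so the total mass is preserved.
    """
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the current block mean.
        while (len(blocks) >= 2
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def total_variation_dip(density, dx):
    """Approximate dip of a gridded density: half the L1 distance to a unimodal fit."""
    m = max(range(len(density)), key=density.__getitem__)  # index of the global mode
    left = pava_nondecreasing(density[: m + 1])
    # A nonincreasing fit is a nondecreasing fit of the reversed sequence.
    right = pava_nondecreasing(density[m:][::-1])[::-1]
    d_left = sum(abs(f - u) for f, u in zip(density[: m + 1], left)) * dx
    d_right = sum(abs(f - u) for f, u in zip(density[m:], right)) * dx
    return 0.5 * (d_left + d_right)
```

For the toy values `[1, 3, 2, 4, 1]` with `dx = 0.1`, the left part is flattened to `[1, 2.5, 2.5, 4]` and the right part is already nonincreasing, so the function returns 0.05. A unimodal input returns exactly zero.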

2) The kernel dip estimation


Recall the definition of the kernel dip,

\[ D(F_h) = \inf_{U \in \mathcal{U}} \|F_h - U\|_{TV} = \inf_{u \in U^*} \Bigl[ \frac{1}{2} \int |f_h(x) - u(x)|\,dx \Bigr]. \]

Unlike the empirical dip, the proposed dip is estimated from the kernel density estimate, so the estimator depends on the bandwidth chosen for the kernel density estimation. We use the well-known bandwidth selection method of Sheather and Jones (1991). They choose the bandwidth h to minimize a kernel-based estimate of the asymptotic mean integrated squared error (AMISE) as $n \to \infty$ and $h \to 0$,

\[ \mathrm{AMISE} = \frac{1}{nh} R(K) + \frac{1}{4} h^4 M_2^2 R(f''), \]

where $R(K) = \int K^2(x)\,dx$ and $M_2 = \int x^2 K(x)\,dx$. They obtain $h_{opt}$ by solving the equation

\[ h = \Bigl[ \frac{R(K)}{M_2^2\, S_D(\alpha(h))} \Bigr]^{1/5} n^{-1/5}, \]

where $\alpha(h) = c_1 h^{1/7}$ for an appropriate $c_1$ and

\[ S_D(\alpha(h)) = \frac{1}{n^2 h^5} \sum\sum_{i \ne j} K'' * K'' \Bigl( \frac{X_i - X_j}{h} \Bigr), \]

by analogy with the algorithm of Park and Marron (1990).
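For orientation, the fixed-point equation has the same form as the minimizer of the AMISE when the curvature term is treated as known. The short derivation below writes $R(f'') = \int f''(x)^2\,dx$ for that term (a notational assumption; $S_D$ is taken to be its estimate) and then evaluates the result for a Gaussian kernel and a normal density with standard deviation $\sigma$.

```latex
\begin{aligned}
\frac{\partial}{\partial h}\,\mathrm{AMISE}(h)
  &= -\frac{R(K)}{n h^{2}} + h^{3} M_2^{2} R(f'') = 0 \\
\Longrightarrow\quad h^{5} &= \frac{R(K)}{M_2^{2}\, R(f'')\, n}
\quad\Longrightarrow\quad
h = \left[\frac{R(K)}{M_2^{2}\, R(f'')}\right]^{1/5} n^{-1/5}, \\[4pt]
\text{Gaussian } K:\quad R(K) &= \frac{1}{2\sqrt{\pi}},\qquad M_2 = 1, \\
f = N(0,\sigma^{2}):\quad R(f'') &= \frac{3}{8\sqrt{\pi}\,\sigma^{5}}
\quad\Longrightarrow\quad
h = \left(\frac{4}{3}\right)^{1/5} \sigma\, n^{-1/5} \approx 1.06\,\sigma\, n^{-1/5}.
\end{aligned}
```

The last line is the familiar normal reference bandwidth; the Sheather and Jones method replaces the known $R(f'')$ by a data-based estimate that itself depends on h, which is why an equation has to be solved.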

Figure 3.9: Distributions of the kernel dip

Furthermore, the kernel method is sensitive to data in the tails of a distribution. For example, suppose f is a gamma density with a long right tail. If the kernel density estimate is computed from the full data set drawn from f, outlying data points create spurious modes in the tail and inflate the kernel dip. Consequently, the level accuracy and the power of the unimodality test based on the kernel dip are adversely affected by these spurious modes. In order to avoid this problem, we assume that the support of f is bounded and, in practice, use only the data lying within l standard deviations.
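The effect of trimming can be illustrated with a small sketch. It assumes that "within l standard deviations" is measured from the sample mean, uses a Gaussian kernel with a hand-picked bandwidth, and counts interior local maxima on a grid; the function names and the toy data are hypothetical.

```python
import math
import statistics


def trim_within_sd(data, l):
    """Keep the observations within l sample standard deviations of the sample mean."""
    mu = statistics.mean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mu) <= l * sd]


def gaussian_kde(data, h, grid):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid."""
    c = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [c * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in data) for g in grid]


def count_modes(values):
    """Number of interior local maxima of a gridded curve (a plateau counts once)."""
    modes = 0
    rising = False
    prev = values[0]
    for v in values[1:]:
        if v > prev:
            rising = True
        elif v < prev:
            if rising:
                modes += 1
            rising = False
        prev = v
    return modes


# Toy data: nine points near zero and one outlying point in the right tail.
data = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.15, -0.15, 10.0]
grid = [-1 + 0.05 * i for i in range(241)]
modes_full = count_modes(gaussian_kde(data, 0.1, grid))
modes_trimmed = count_modes(gaussian_kde(trim_within_sd(data, 2), 0.1, grid))
```

In this toy example the outlying point produces a second, spurious mode in the full-data estimate, while the estimate computed from the trimmed data has a single mode.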

The kernel dip separates unimodal from multimodal samples more clearly than the empirical dip. Figure 3.9 shows a simple simulation conducted under the same conditions as Figure 2.2 in Chapter 2. When the kernel distribution $F_h$ is unimodal, the kernel dip $D(F_h)$ is exactly zero. Therefore, many of the kernel dips computed from normal and t samples are zero, unlike the empirical dips in Figure 2.2. In addition, the kernel dips of data from the bimodal distribution are larger than those of data from the two unimodal distributions, normal and t.

3.3 The kernel dip test

For the kernel distribution function $F_h$, $D(F_h) = 0$ means that $F_h$ is unimodal, and we then conclude that the true distribution F is unimodal. On the other hand, a large $D(F_h)$ means that $F_h$ is not unimodal, and we conclude that F is multimodal. We therefore need to know how large the kernel dip must be to declare multimodality. Since the asymptotic distribution of $D(F_h)$ is unknown, determining critical values is difficult.

Unlike the empirical dip test, computing the kernel dip yields the unimodal function $U_0$ with the smallest total variation distance to the estimated distribution $F_h$. Under the null hypothesis, we treat this estimated closest unimodal function $U_0$ as the underlying distribution. We can therefore draw samples from the unimodal $U_0$ and compute their kernel dips $d^*$. We then use Monte Carlo simulation to compute the p-value of the kernel dip $d_0$ of the observed data, defined as $P(d_0 \le d^* \mid \mathcal{X})$. If this p-value is smaller than the significance level $\alpha$, we reject $H_0$ and conclude that the population distribution is not unimodal.
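The Monte Carlo step can be sketched generically. In the code below, `draw_sample` stands for a sampler from the fitted unimodal $U_0$ and `statistic` for the kernel dip; both are placeholders supplied by the caller, and the function name, the seed handling, and the default of 2000 replicates are assumptions made for the illustration.

```python
import random


def monte_carlo_p_value(d0, draw_sample, statistic, n, replicates=2000, seed=0):
    """Estimate P(d0 <= d*), where d* is the statistic of a null sample of size n.

    draw_sample(rng, n) returns a sample drawn under the null distribution and
    statistic(sample) returns the test statistic of that sample.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(replicates):
        d_star = statistic(draw_sample(rng, n))
        if d_star >= d0:
            exceed += 1
    return exceed / replicates


# Stand-ins for illustration only: a standard normal null and |sample mean|.
toy_draw = lambda rng, n: [rng.gauss(0.0, 1.0) for _ in range(n)]
toy_stat = lambda xs: abs(sum(xs) / len(xs))
p = monte_carlo_p_value(0.2, toy_draw, toy_stat, n=25, replicates=500)
```

The null hypothesis is rejected when the returned proportion is smaller than the significance level $\alpha$.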

Because the kernel dip test uses the closest unimodal distribution as its reference, it has better level accuracy and power than the empirical dip test, which uses the uniform distribution. This advantage is confirmed by the simulation study in Chapter 5.

Chapter 4

The Kernel Excess Mass Test

4.1 The kernel excess mass

Muller and Sawitzki (1991) used the empirical distribution to estimate the excess mass. The excess mass can also be estimated with the kernel distribution estimator in place of the empirical distribution. The kernel excess mass for m modes and some $\lambda > 0$ is defined by

\[ E_m(\lambda) = \sup_{I_1, \dots, I_m} \Bigl[ \sum_{j=1}^{m} \{ F_h(I_j) - \lambda \|I_j\| \} \Bigr], \]

where the supremum is taken over all sequences $I_1, \dots, I_m$ of disjoint intervals for the kernel distribution $F_h$. Writing $H_\lambda(I_j) = F_h(I_j) - \lambda \|I_j\|$, we have $E_m(\lambda) = \sup_{I_1, \dots, I_m} \sum_{j=1}^{m} H_\lambda(I_j)$. We also write the kernel excess mass statistic as

\[ \Delta_m = \max_{\lambda} D_m(\lambda) = \max_{\lambda} \{ E_m(\lambda) - E_{m-1}(\lambda) \}. \]

Figure 4.1: The kernel excess mass and the kernel dip of bimodal distribution

The kernel excess mass statistic has properties that differ from those of the empirical statistic in Chapter 2.

Property 1. If a kernel distribution Fh(x) is unimodal, then ∆m = 0 for

m > 1.

For m = 2, this is the same property as that of the kernel dip. It is an advantage of the kernel method and improves the accuracy of the unimodality test.

Property 2. If a kernel distribution $F_h$ is bimodal, then the kernel excess mass statistic is half of the kernel dip statistic:

\[ \Delta_2 = \max_{\lambda} [E_2(\lambda) - E_1(\lambda)] = \frac{1}{2} \inf_{U:\,\text{unimodal}} \|F_h - U\|_{TV} = \frac{1}{2} D(F_h). \]

This is because $\Delta_2$ measures the minimal excess mass that has to be moved to convert $F_h$ into a unimodal distribution U. It can therefore be regarded as a measure of the kernel dip, as illustrated in Figure 4.1. However, this property does not hold when $F_h$ has $m\ (> 2)$ modes.

By Theorem 2.2 and the following Theorem 4.1, the kernel excess mass statistic and the empirical excess mass statistic have the same limiting distribution under the strong unimodal condition.

Theorem 4.1. Let K be a kernel of order 2 and choose $h > 0$ of order $h \simeq n^{-1/3}$. Under the strong unimodal condition,

\[ P(\Delta_2 > n^{-3/5} u) \longrightarrow P(c_f Z > u) \quad \text{as } n \to \infty, \]

where $c_f = \bigl\{ f(x_0)^3 / |f''(x_0)| \bigr\}^{1/5}$ and

\[ Z = 6^{1/5} \sup_{-\infty < t < \infty} \Bigl[ \sup_{-\infty < y_1 < \dots < y_4 < \infty} \{ \delta(y_1, y_2, t) + \delta(y_3, y_4, t) \} - \sup_{-\infty < y_1 < y_2 < \infty} \delta(y_1, y_2, t) \Bigr]. \]

For $y_1 < y_2$ and t, $\delta(y_1, y_2, t) = W(y_2) - W(y_1) - (y_2^3 - y_1^3) + t(y_2 - y_1)$, and W is a standard Wiener process on the real line.

The idea of the proof is similar to that of Theorem 2.2. Once the asymptotic behavior of $F_h(x)$ is known, the asymptotic properties of the kernel excess mass follow readily. Gine and Nickl (2009) proved an exponential inequality for the kernel distribution estimator $F_h(x)$, stated here as Lemma 4.2.

Lemma 4.2. Suppose F has a density f with respect to Lebesgue measure and $f \in C^1(\mathbb{R})$. Let K be a kernel of order 2, and let the bandwidth h converge to 0 as $n \to \infty$ and satisfy $h \ge (\log n / n)$. Then there exist constants $C_1 > 0$ and $C_2 > 0$ such that for all $\lambda \ge C_2 \max\bigl(\sqrt{h \log(1/h)}, \sqrt{n}\, h^2\bigr)$ and $n \ge 1$,

\[ P\Bigl( \sup_x |\sqrt{n}(F_h - F_n)| > \lambda \Bigr) \le 2 \exp\bigl\{ -C_1 \min(h^{-1} \lambda^2, \sqrt{n}\, \lambda) \bigr\}. \tag{4.1} \]

The MISE-optimal bandwidth $h \simeq n^{-1/3}$ is admissible, in which case (4.1) holds for $\lambda$ such that $C_2 n^{-1/6} \sqrt{\log n} \le \lambda \le \sqrt{n}$.

Lemma 4.3. Under the same conditions as Lemma 4.2,

\[ P\Bigl( \sup_{-\infty < x_1 < x_2 < \infty} \bigl| n \{ (F_h(x_2) - F_h(x_1)) - (F(x_2) - F(x_1)) \} - \sqrt{n}\, \{ B(F(x_2)) - B(F(x_1)) \} \bigr| > C_3 \log n + s \Bigr) \le C_4 n^{-C_5 s} \tag{4.2} \]

for all $n \ge 1$ and $s > 0$, where $C_3, C_4, C_5$ are positive constants.

Proof of Lemma 4.3. Komlos et al. (1975) established the embedding of $F_n$ in a standard Brownian bridge B. Let G be the distribution function given by $G(t) = t\, I(0 \le t \le 1)$, and let $G_n(t)$ be the empirical distribution based on a sample $X_1, \dots, X_n$ from G. Then B can be constructed such that

\[ P\Bigl( \sup_{0 \le t \le 1} \bigl| n \{ G_n(t) - G(t) \} - \sqrt{n}\, B(t) \bigr| > C_6 \log n + s \Bigr) \le C_7 \exp(-C_8 s) \tag{4.3} \]

for each $s > 0$, where $C_6, C_7, C_8$ are positive constants. Chan and Hall (2010) derived the following inequality from (4.3) by constructing $G_n(t)$ such that $F_n = G_n(F)$. For $s > 0$,

\[ P\Bigl( \sup_{-\infty < x_1 < x_2 < \infty} \bigl| n \{ (F_n(x_2) - F_n(x_1)) - (F(x_2) - F(x_1)) \} - \sqrt{n}\, \{ B(F(x_2)) - B(F(x_1)) \} \bigr| > C_6 \log n + s \Bigr) \le C_7 n^{-C_8 s}. \tag{4.4} \]

Inequality (4.2) follows from (4.4) and Lemma 4.2.

We can prove the theorem by using Lemma 4.3 and the idea of Cheng and Hall (1998b).

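Proof of Theorem 4.1. We can rewrite

\[ \Delta_2 = \max_{\lambda} [E_2(\lambda) - E_1(\lambda)] = \max_{\lambda} \Bigl[ \sup_{-\infty < x_1 < \dots < x_4 < \infty} \{ H_\lambda([x_1, x_2]) + H_\lambda([x_3, x_4]) \} - \sup_{-\infty < x_1 < x_2 < \infty} H_\lambda([x_1, x_2]) \Bigr]. \]

Our first goal is to show the convergence of the right part of $\Delta_2$, $\sup H_\lambda([x_1, x_2])$. Under the strong unimodal condition, we can set $I = (x_0 - n^{-1/5 + \epsilon_1},\ x_0 + n^{-1/5 + \epsilon_1})$ and $f(x_0) - \epsilon_\lambda < \lambda < f(x_0)$ for some small $\epsilon_1 > 0$ and $\epsilon_\lambda > 0$. For given $x_1, x_2 \in I$ with $x_1 < x_0 < x_2$,

\[ \begin{aligned} H_\lambda([x_1, x_2]) &= \{ f(x_0) - \lambda \}(x_2 - x_1) + \frac{f''(x_0)}{6} (x_2 - x_1)^3 + o\bigl(n^{3(-1/5 + \epsilon_1)}\bigr) \\ &= \epsilon_\lambda (x_2 - x_1) + \frac{f''(x_0)}{6} f(x_0)^{-3} \bigl( F(x_2)^3 - F(x_1)^3 \bigr) + o\bigl(n^{3(-1/5 + \epsilon_1)}\bigr). \end{aligned} \tag{4.5} \]

Let $a = -\frac{1}{6} f''(x_0) f(x_0)^{-3}$ (so that $a^{-1/5} = c_f$) and $y_i = a^{2/5} n^{1/5} F(x_i)$ for i = 1, 2. Then (4.5) is represented as $-a^{1/5} (y_2^3 - y_1^3) + o_p(1)$. It is known that the scaled process $\sqrt{c}\, W(t/c)$ is a Wiener process for all $c > 0$. Thus $W(y_i) = a^{1/5} n^{1/10} W(F(x_i))$ is also a Wiener process. We can also write

\[ n^{3/5} \{ B(F(x_2)) - B(F(x_1)) \} = a^{-1/5} \{ W(y_2) - W(y_1) \} - t_\lambda (y_2 - y_1) \tag{4.6} \]

for some $t_\lambda$. The convergence of the left part of $\Delta_2$ is obtained similarly. Therefore Lemma 4.3 together with (4.5) and (4.6) gives

\[ n^{3/5} \Delta_2 = a^{-1/5} \sup_{-\infty < t < \infty} \Bigl[ \sup_{-\infty < y_1 < \dots < y_4 < \infty} \{ \delta(y_1, y_2, t) + \delta(y_3, y_4, t) \} - \sup_{-\infty < y_1 < y_2 < \infty} \delta(y_1, y_2, t) \Bigr] + o_p(1). \]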

4.2 Computing the kernel excess mass

1) Computing the excess mass estimation

The excess mass is defined as $E_m(\lambda) = \sup_{I_1, \dots, I_m} \sum_{j=1}^{m} H_\lambda(I_j)$, where $H_\lambda(I) = F(x_2) - F(x_1) - \lambda (x_2 - x_1)$ for $I = [x_1, x_2]$. To calculate the excess mass, we have to find the disjoint connected sets $\{I_j : j = 1, \dots, m\}$ that attain the supremum. If F is unimodal, $D_2(\lambda) = E_2(\lambda) - E_1(\lambda)$ is zero for any $\lambda$. Consequently, we only need a computation algorithm for the excess mass when F is multimodal. Because we use the kernel method, we may assume that the distribution function F is smooth. We therefore compute $\Delta_2 = \max_\lambda [E_2(\lambda) - E_1(\lambda)]$ under the assumption that F is multimodal and smooth.

Figure 4.2: Intervals for calculating kernel excess mass

To compute $\Delta_2$, we calculate $D_2(\lambda) = E_2(\lambda) - E_1(\lambda)$ for fixed $\lambda$ and then find $\max_\lambda D_2(\lambda)$. If the number l of points at which f(x) meets the level $\lambda$ is no more than two, then $D_2(\lambda) = 0$. If $l > 2$, we denote the meeting points by $x_1, \dots, x_l$ and form the consecutive intervals $[x_1, x_2], \dots, [x_{l-1}, x_l]$. Each of these intervals contains a mode or an antimode. To be specific, $[x_1, x_l]$ can be written as $(M_1 \cup A_1 \cup \dots \cup A_{m-1} \cup M_m)$, where the $M_i$ ($1 \le i \le m$), $m = [l/2]$, are the intervals containing modes and the $A_j$ ($1 \le j \le m - 1$) are the intervals containing antimodes, as in Figure 4.2.

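Once $H_\lambda(M_i)$ and $H_\lambda(A_j)$ are known, $E_1(\lambda)$ and $E_2(\lambda)$ are easy to calculate. We first compute $H_\lambda(M_i)$ and $H_\lambda(A_j)$ for all $i = 1, \dots, m$ and $j = 1, \dots, m - 1$. Then $E_1(\lambda)$ is obtained as

\[ E_1(\lambda) = \max \Bigl\{ H_\lambda(M_k),\ \sum_{i=a}^{b} H_\lambda(M_i) + \sum_{i=a}^{b-1} H_\lambda(A_i) \Bigm| 1 \le k \le m,\ 1 \le a < b \le m \Bigr\}. \]

$E_2(\lambda)$ is obtained from $H_\lambda(I_1)$ and $H_\lambda(I_2)$, where $I_1 \subset (M_1 \cup A_1 \cup \dots \cup M_a)$ and $I_2 \subset (M_{a+1} \cup A_{a+1} \cup \dots \cup M_m)$ for some $1 \le a \le m - 1$. For fixed a, the maximum of $H_\lambda(I_1)$ is computed as follows. If a = 1, then $H_\lambda(I_1) = H_\lambda(M_1)$. If a > 1,

\[ H_\lambda(I_1) = \max \Bigl\{ H_\lambda(M_a),\ \sum_{i=a-b+1}^{a} H_\lambda(M_i) + \sum_{i=a-b+1}^{a-1} H_\lambda(A_i) \Bigm| 2 \le b \le a \Bigr\}. \]

The computation of $H_\lambda(I_2)$ is similar to that of $E_1(\lambda)$. If $m - a = 1$, then $H_\lambda(I_2) = H_\lambda(M_m)$. Otherwise,

\[ H_\lambda(I_2) = \max \Bigl\{ H_\lambda(M_k),\ \sum_{i=b}^{c} H_\lambda(M_i) + \sum_{i=b}^{c-1} H_\lambda(A_i) \Bigm| a < k \le m,\ a < b < c \le m \Bigr\}. \]

Finally, $E_2(\lambda)$ is calculated as $E_2(\lambda) = \max_a \bigl( H_\lambda(I_1) + H_\lambda(I_2) \bigr)$.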
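The enumeration above can be sketched directly once the values $H_\lambda(M_i)$ and $H_\lambda(A_j)$ are available. The following is a minimal illustration, not a tuned implementation: the lists `h_modes` and `h_antimodes` (of lengths m and m - 1, 0-based) and the function names are hypothetical, and returning $E_1(\lambda)$ for $E_2(\lambda)$ when there is a single mode interval is an assumption that makes $D_2(\lambda) = 0$ in that case.

```python
def run_value(h_modes, h_antimodes, a, b):
    """H_lambda of the interval M_a ∪ A_a ∪ ... ∪ M_b (0-based, inclusive)."""
    return sum(h_modes[a:b + 1]) + sum(h_antimodes[a:b])


def excess_mass_1(h_modes, h_antimodes, lo=0, hi=None):
    """Largest H_lambda over runs of consecutive intervals inside M_lo..M_hi."""
    if hi is None:
        hi = len(h_modes) - 1
    return max(run_value(h_modes, h_antimodes, a, b)
               for a in range(lo, hi + 1) for b in range(a, hi + 1))


def excess_mass_2(h_modes, h_antimodes):
    """Largest total H_lambda of two disjoint runs, maximized over the split a."""
    m = len(h_modes)
    if m < 2:
        # Assumption: with a single mode interval a second interval adds nothing.
        return excess_mass_1(h_modes, h_antimodes)
    best = float("-inf")
    for a in range(m - 1):
        # I1 is a run ending at M_a; I2 is any run inside M_{a+1}..M_m.
        h1 = max(run_value(h_modes, h_antimodes, s, a) for s in range(a + 1))
        h2 = excess_mass_1(h_modes, h_antimodes, a + 1, m - 1)
        best = max(best, h1 + h2)
    return best


# Toy values: three mode intervals (positive H) and two antimode intervals (negative H).
h_modes = [0.10, 0.05, 0.08]
h_antimodes = [-0.02, -0.06]
d2 = excess_mass_2(h_modes, h_antimodes) - excess_mass_1(h_modes, h_antimodes)
```

For these toy values the best single interval is $M_1 \cup A_1 \cup M_2 \cup A_2 \cup M_3$ with value 0.15, the best pair is $M_1 \cup A_1 \cup M_2$ together with $M_3$ with value 0.21, and so $D_2(\lambda)$ is about 0.06. Maximizing this difference over a grid of $\lambda$ values gives an approximation to $\Delta_2$.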

2) The kernel excess mass estimation

Figure 4.3: Distributions of kernel excess mass

Like the kernel dip estimator, the kernel excess mass estimator depends on the bandwidth selection. This estimator uses the kernel distribution estimator $F_h$, whereas the kernel dip uses the kernel density estimator $f_h$. Bandwidth selection for estimating a distribution function differs slightly from that for estimating a density. Altman and Leger (1995) showed that the global performance of $F_h(x)$ as an estimator of F(x) can be measured in terms of the mean integrated squared error (MISE),

\[ \mathrm{MISE}(F_h) = E \int \{ F_h(x) - F(x) \}^2\, dF(x). \]

They choose the bandwidth h that minimizes $\mathrm{MISE}(F_h)$:

\[ h_{opt} = \Bigl[ \frac{2 M_1(K) D_1(F)}{M_2(K)^2 D_2(F)} \Bigr]^{1/3} n^{-1/3}, \]

where $M_1 = \int x K(x) L(x)\,dx$, $M_2 = \int x^2 K(x)\,dx$, $D_1(F) = \int f(x)^2\,dx$ and $D_2(F) = \int f'(x)^2 f(x)\,dx$.

The kernel estimator separates unimodal from multimodal samples more clearly than the empirical estimator. To compare the two excess mass estimators, compare Figure 4.3 with Figure 2.2. In Figure 4.3, many of the kernel distribution estimates computed from unimodal distributions, such as the normal and t distributions, are unimodal, and their kernel excess mass is zero. This property is shared with the kernel dip (Figure 3.9) and is an advantage of the kernel method.

4.3 The kernel excess mass test

1) The Kernel excess mass test with uniform distribution

Figure 4.3 shows that $P[\Delta_{2,f} \le \kappa]$ is less than $P[\Delta_{2,u} \le \kappa]$, where u is the standard uniform distribution U[0, 1] and f is a unimodal distribution. We use the uniform distribution to calculate the critical values $\kappa^*_\alpha$ for $\Delta_2$, as in the test of Muller and Sawitzki (1991). Like the empirical excess mass test with the uniform distribution, this test is conservative.

2) Calibrating excess mass test

Under the null hypothesis that the true density f is unimodal, the distribution of the kernel excess mass $\Delta_{2,f}$ is, by Theorem 4.1, independent of unknowns except for the factor $c_f = \bigl\{ f(x_0)^3 / |f''(x_0)| \bigr\}^{1/5}$. This idea is analogous to the empirical version used to overcome the conservatism of the dip test, and the calibrating densities can be obtained with the same algorithm as in the calibration of the empirical test.

If there exists a consistent excess mass estimate of the underlying distribution of the data and another distribution $g(\cdot)$ such that $c_f = c_g$, then we can use the distribution of $\Delta_{2,g}$ instead of that of $\Delta_{2,f}$. To begin with, consider $d_g = c_g^{-5} = |g''(x_0)| / g(x_0)^3$. Cheng and Hall (1998a) show that the range of $d_g$ is covered by three classes of distributions.

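• If $g(\cdot)$ is the beta distribution $g_\beta(x)$,

\[ g_\beta(x) = \frac{[x(1 - x)]^{\beta - 1}}{B(\beta, \beta)} \quad \text{for } 0 < x < 1,\ \beta > 1, \]

then $d_g = \gamma(\beta) = 2^{4\beta - 1} (\beta - 1) B(\beta, \beta)^2 \in [0, 2\pi)$.

• If $g(\cdot)$ is the normal distribution, then $d_g = \gamma(\beta) = 2\pi$.

• If $g(\cdot)$ is the rescaled Student t distribution $g_\beta(x)$,

\[ g_\beta(x) = \frac{1}{B(\beta - \frac{1}{2}, \frac{1}{2})} \frac{1}{(1 + x^2)^{\beta}} \quad \text{for } -\infty < x < \infty,\ \beta > \tfrac{1}{2}, \]

then $d_g = \gamma(\beta) = 2\beta B(\beta - \frac{1}{2}, \frac{1}{2})^2 \in (2\pi, \infty)$.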
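The three classes can be turned into a lookup from a value d to a calibration family and its parameter. The sketch below evaluates the two expressions for $\gamma(\beta)$ given above with log-gamma functions and inverts them by bisection; the function names, the tolerance around $2\pi$, and the bracketing limits are illustrative choices rather than part of the method.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def gamma_beta_family(beta):
    """gamma(beta) = 2^(4 beta - 1) (beta - 1) B(beta, beta)^2, for beta >= 1."""
    log_term = (4.0 * beta - 1.0) * math.log(2.0) + 2.0 * log_beta(beta, beta)
    return (beta - 1.0) * math.exp(log_term)


def gamma_t_family(beta):
    """gamma(beta) = 2 beta B(beta - 1/2, 1/2)^2, for beta > 1/2."""
    return 2.0 * beta * math.exp(2.0 * log_beta(beta - 0.5, 0.5))


def _bisect(func, target, lo, hi, increasing, iters=200):
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def calibration_family(d, tol=1e-6, max_beta=1e8):
    """Return (family, beta) such that gamma(beta) is approximately d."""
    two_pi = 2.0 * math.pi
    if d <= 0.0:
        raise ValueError("d must be positive")
    if abs(d - two_pi) < tol:
        return ("normal", None)
    if d < two_pi:
        func, lo, hi, increasing = gamma_beta_family, 1.0, 2.0, True
        outside = lambda value: value < d   # upper end still below the target
    else:
        func, lo, hi, increasing = gamma_t_family, 0.5 + 1e-9, 1.0, False
        if func(lo) < d:
            raise ValueError("d is too large to bracket")
        outside = lambda value: value > d   # upper end still above the target
    while outside(func(hi)):
        hi *= 2.0
        if hi > max_beta:
            raise ValueError("d is too close to 2*pi to bracket")
    family = "beta" if d < two_pi else "t"
    return (family, _bisect(func, d, lo, hi, increasing))
```

As a check on the formulas, $\gamma(2) = 128/36$ in the beta class and $\gamma(2) = \pi^2$ in the Student t class, so `calibration_family(128 / 36)` and `calibration_family(math.pi ** 2)` should both return a parameter close to 2.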

Consequently, we carry out the test with these classes by the following algorithm:

(i) Estimate $d_f = |f''_{h_1}(x_0)| / f_{h_2}(x_0)^3$, where f is the Gaussian kernel density estimate, $f''$ is the second derivative of the Gaussian kernel estimate, and $x_0 = \arg\max(f)$. We take $h_1 = \bigl( \frac{1}{2 n^2} \bigr)^{1/10} \sigma$ and $h_2 = \bigl( \frac{5}{12 \sqrt{2}\, n} \bigr)^{1/9} \sigma$. This simple bandwidth selection was proposed by Chan and Hall (2010).

(ii) Find $\beta = \gamma^{-1}(d_g)$ using the calibration densities in the three classes above.

(iii) Conditional on these values of $d_g$ and $\beta$, draw a sample from the corresponding (beta, normal, or Student t) distribution and compute $\Delta^*_{2,g}$.

(iv) Finally, use Monte Carlo methods to compute

\[ P(\Delta^*_{2,g} > \Delta_{2,f} \mid \chi). \]
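Step (iii) requires sampling from the three calibration families. The sketch below shows one way to do this with the Python standard library, assuming the family label and $\beta$ have already been determined; the function name and the labels `"beta"`, `"normal"`, and `"t"` are hypothetical. For the rescaled Student t class it uses the fact that $Z / \sqrt{V}$, with Z standard normal and V chi-square with $\nu = 2\beta - 1$ degrees of freedom, has density proportional to $(1 + x^2)^{-\beta}$.

```python
import math
import random


def draw_calibration_sample(family, beta, n, rng):
    """Draw n observations from the calibration density of the given family."""
    if family == "beta":
        # Symmetric beta density proportional to [x(1 - x)]^(beta - 1) on (0, 1).
        return [rng.betavariate(beta, beta) for _ in range(n)]
    if family == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if family == "t":
        # Z / sqrt(V) with V ~ chi-square(nu), nu = 2 beta - 1, has density
        # proportional to (1 + x^2)^(-beta); chi-square(nu) is Gamma(nu/2, scale 2).
        nu = 2.0 * beta - 1.0
        return [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(nu / 2.0, 2.0))
                for _ in range(n)]
    raise ValueError("unknown family: %r" % (family,))


rng = random.Random(1)
beta_sample = draw_calibration_sample("beta", 3.0, 200, rng)
t_sample = draw_calibration_sample("t", 2.0, 200, rng)
```

Each simulated sample is then passed through the same kernel excess mass computation as the observed data, and the proportion of simulated statistics exceeding the observed one estimates the probability in step (iv).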

In the numerical study that follows, we simulate this test and the empirical excess mass test with the calibration method. The simulation study shows that the calibrated kernel excess mass test performs better than the calibrated empirical excess mass test.

Chapter 5

Numerical Study

5.1 Simulation 1

We compare the accuracy of the kernel dip test and the kernel excess mass test with the uniform distribution to that of the empirical dip test and Silverman's test. We illustrate performance for samples of size n = 100 and n = 500 from nine distributions: three unimodal and six multimodal. The unimodal distributions are the normal distribution N(0, 1), the t(6) distribution with a sharp mode, and the asymmetric beta distribution β(3, 4). The multimodal distributions have 2, 3, or 4 modes, and some of them are skewed. To calculate the accuracy rate, we draw 500 samples for each distribution and sample size.

The proposed kernel tests and Silverman's test assume bounded support to avoid spurious modes. We therefore conduct the tests on the data lying within l standard deviations; among l = 1, 1.5, 2, 2.5, we found that l = 2 gives the best results. Each p-value of the four tests is computed by Monte Carlo simulation with 2000 replicates.

Figure 5.1: Unimodal distributions: N(0, 1), t(6) and β(3, 4)

1) Unimodal Cases

We draw data from the three distributions in Figure 5.1 and apply the four unimodality tests to them. Table 5.1 reports estimates of the true levels of the unimodality tests for several nominal levels. The tests on data from the Student t distribution with a sharp mode have a smaller type I error than the tests on data from the normal distribution. On the other hand, the tests on data from the beta distribution β(3, 4) have a larger type I error. For all tests, the true level is smaller than the nominal level. In particular, the kernel excess mass test with the uniform distribution has a true level of zero at all nominal levels for the large sample, n = 500. This is because the kernel excess mass of unimodal distributions other than the uniform distribution is zero or very small, as shown in Figure 4.4 in Chapter 4. This conservatism can be addressed by calibrating the kernel excess mass test. In addition, the kernel dip test also has true levels smaller than the nominal levels because the kernel test statistics in the unimodal cases are almost zero.

N(0, 1)               n = 100                  n = 500
α                     0.05   0.10   0.15       0.05   0.10   0.15
Kernel Dip            0.004  0.020  0.028      0.002  0.014  0.022
Kernel Excess Mass    0.002  0.004  0.008      0.000  0.000  0.000
Empirical Dip         0.000  0.004  0.016      0.000  0.000  0.000
Silverman             0.020  0.040  0.076      0.014  0.032  0.078

t(6)                  n = 100                  n = 500
α                     0.05   0.10   0.15       0.05   0.10   0.15
Kernel Dip            0.006  0.022  0.040      0.002  0.010  0.018
Empirical Dip         0.002  0.008  0.008      0.000  0.000  0.000
Silverman             0.006  0.022  0.050      0.006  0.024  0.044

β(3, 4)               n = 100                  n = 500
α                     0.05   0.10   0.15       0.05   0.10   0.15
Kernel Dip            0.012  0.014  0.038      0.006  0.012  0.030
Empirical Dip         0.006  0.010  0.014      0.000  0.002  0.002
Silverman             0.026  0.076  0.126      0.026  0.054  0.106

Table 5.1: Tests for unimodal distributions

Figure 5.2: Tests for bimodal distributions

2) Multimodal Cases

The following figures show the power of the four tests against normal mixture distributions. One of the bimodal distributions is a mixture of N(0, 1) and N(3, 1). The other is a mixture of N(0, 12) and N(2, 1) with unbalanced weights. We also consider multimodal distributions with 3 or 4 modes that are normal mixtures. Figure 5.2 shows the results of the tests for the bimodal cases. Although Silverman's test has the greatest power, our new kernel tests have greater power than the empirical dip test. Moreover, the power of the kernel dip test against multimodal distributions with 3 or more modes is the greatest in the large-sample case, n = 500. In particular, the kernel dip test has large power for the unbalanced mixture distributions in Figure 5.3. In contrast, the kernel excess mass test with the uniform distribution has low power, like the empirical dip (excess mass) test. To sum up, the kernel dip test has good power in a variety of situations, although it is somewhat conservative.

Figure 5.3: Tests for multimodal distributions

5.2 Simulation 2 : Calibration tests

We now compare the accuracy of the calibrated versions of the kernel excess mass test, the empirical excess mass test and Silverman's test. As in Simulation 1, we illustrate performance for samples of size n = 100 and n = 500 from the nine distributions, and we draw 500 samples for each distribution and sample size to calculate the accuracy rate and the power.

1) Unimodal Cases

The first panel in each row of Figure 5.4 depicts the unimodal sampling density, and the next two panels show the level accuracy for the small sample (n = 100) and the large sample (n = 500). The sharper the mode, the more conservative all the calibrated tests are. The actual level of our calibrated kernel excess mass test is closer to the nominal level than that of the calibrated empirical excess mass test in all cases except the beta distribution. Silverman's test has better level accuracy than the two excess mass tests for the normal and t distributions, whereas the two excess mass tests perform better for the beta distribution. The excess mass tests are more conservative for the large sample (n = 500) because the excess mass of the estimated distribution is zero or small for large n.

Figure 5.4: Calibrating tests for unimodal distributions

Figure 5.5: Calibrating tests for bimodal distributions

2) Multimodal Cases

Figure 5.5 and Figure 5.6 display the power against the bimodal distributions and against the multimodal distributions with 3 or more modes, respectively. The first panel in each row of these figures is the sampling density, which is the same multimodal distribution as in the simulation study of Section 5.1. The middle and right panels plot the approximate probability of rejecting unimodality against the nominal level for the small (n = 100) and large (n = 500) samples. The calibrated kernel excess mass test has greater power than the other calibrated tests for all multimodal distributions. Although the calibrated kernel excess mass test is slightly conservative for samples from unimodal distributions, it outperforms the other tests for data from multimodal distributions. In particular, for the large sample the kernel excess mass test has the greatest power at nominal levels above 0.05 for the symmetric mixture of two normal distributions in Figure 5.5 and for the multimodal distributions in Figure 5.6.

Figure 5.6: Calibrating tests for multimodal distributions

The numerical results reported in Figures 5.4-5.6 indicate that our calibrated kernel excess mass test performs better than the empirical excess mass test as well as Silverman's test. It has good level accuracy across a wide variety of unimodal distributions with sharp or flat modes. Moreover, it has greater power than the other calibrated tests in many multimodal cases.

5.3 Real data analysis

In astronomy, an important problem is whether the distribution of a data set observed with white noise is unimodal or multimodal. Feigelson and Babu (2012) introduced parametric and nonparametric inference for testing multimodality in astronomy. Furthermore, Peixinho et al. (2012) and Peixinho et al. (2003) analyzed the B-R color distribution using the empirical dip test. The empirical dip test is a well-known unimodality test, and its R package is convenient to use. However, many studies, including ours, indicate that the dip test is conservative and has low power. We therefore apply our kernel unimodality tests to the analysis of the B-R color distribution.

1) Visible Colors of Centaurs and TNOs

The orbital and physical characterization of the minor planets in the solar system is ongoing. The minor planets considered here consist of Trans-Neptunian Objects (TNOs) and Centaurs. TNOs in turn comprise Kuiper Belt Objects (KBOs), scattered disc objects (SDOs) and other objects. The B-R color index of a planet is an important indicator of its surface composition. Surface compositions, such as gas, cloud or galaxy, affect the temperature of a star and the variability of its color. The color index is therefore useful in many studies of chemical gradients, surface processing history and formation magnitudes of the minor planets in the solar system. Peixinho et al. (2012) addressed the color distributions of 253 Centaurs and TNOs. They studied the B-R color and the absolute magnitude $H_R$ as a proxy for size, with the implicit assumption that surface colors are independent of dynamical classification. They verified that $H_R$ and the diameter of minor planets in the solar system are very strongly correlated, and thus treated $H_R$ as the size of the objects. Romanishin et al. (2010) also used nonparametric statistical tests to study the B-R color distribution. In this thesis, however, we concentrate on the analysis related to unimodality tests.

Figure 5.7: Estimated distributions for Centaurs and TNOs

2) Unimodality test for Centaurs and TNOs

Peixinho et al. (2012) tested the null hypothesis that the B-R index of all objects (253 Centaurs and TNOs) is consistent with a unimodal distribution. Although Figure 5.7 shows that the estimated distribution of the full sample has two peaks, they concluded from the empirical dip test that the distribution of B-R color is unimodal. In contrast, our calibrated kernel excess mass test has a p-value of nearly zero, and the kernel dip test also has a smaller p-value than the empirical dip test. The calibrated kernel test thus gives strong evidence against unimodality of the full color sample. In Table 5.2, the kernel and empirical dip tests show no strong evidence against unimodality for the sample without Centaurs (n = 224), whereas the kernel excess mass test shows strong evidence against unimodality. For the Centaurs (n = 29), however, the p-values of all three tests are very small, and their distribution can be regarded as multimodal.

Objects (n)               Empirical Dip   Ker. Dip   C. Ker. Ex. Mass
All Objects (253)         0.1698          0.0890     0.0000
Without Centaurs (224)    0.4099          0.3985     0.0255
Centaurs (29)             0.0160          0.0020     0.0000

Table 5.2: Unimodality tests for Centaurs and TNOs

Figure 5.8: B-R versus HR for Centaurs and TNOs

Next, we apply both the empirical dip test and the new kernel tests to three groups defined by the magnitude $H_R$. Figure 5.8 plots all 253 Centaurs and TNOs by B-R color index versus $H_R$. The points form an N shape with an apparent double bimodality in color. Peixinho et al. (2012) proposed the cutoffs $H_{R:up}$ and $H_{R:low}$ separating small, large, and intermediate objects based on the empirical dip test. First, they performed an iterative test with $H_{R:up}$ starting at the maximum of $H_R$ and decreasing in steps of 0.1 mag. On detecting the minimum p-value, they stopped shifting $H_{R:up}$, at 6.8. They also found a cutoff $H_{R:low}$ for large objects, starting at the minimum $H_R$ and shifting in steps of 0.1 mag.
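The iterative search for the upper cutoff can be sketched as a simple scan. The version below is a simplification under stated assumptions: it scans the whole range of $H_R$ and returns the cutoff with the smallest p-value instead of stopping at the first minimum, it tests the objects with $H_R$ at or above the cutoff, and it skips subsets smaller than a hypothetical `min_size`. The argument `p_value` stands for any unimodality test that maps a list of colors to a p-value.

```python
def scan_upper_cutoff(h_r, color, p_value, step=0.1, min_size=10):
    """Lower the cutoff from max(h_r) in steps; return (cutoff, p) with the smallest p."""
    top, bottom = max(h_r), min(h_r)
    best = None
    k = 0
    while True:
        # Rounding avoids drift from repeatedly subtracting a decimal step.
        cutoff = round(top - k * step, 10)
        if cutoff < bottom:
            break
        subset = [c for h, c in zip(h_r, color) if h >= cutoff]
        if len(subset) >= min_size:
            p = p_value(subset)
            if best is None or p < best[1]:
                best = (cutoff, p)
        k += 1
    return best


# Toy illustration with a stand-in "p-value" that shrinks as the color range grows.
toy_h = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
toy_color = [1.0, 1.0, 1.0, 1.0, 1.2, 1.9, 1.1]
toy_p = lambda xs: 1.0 / (1.0 + max(xs) - min(xs))
cutoff, p = scan_upper_cutoff(toy_h, toy_color, toy_p, step=0.1, min_size=2)
```

The search for $H_{R:low}$ is the mirror image: start at the minimum of $H_R$, raise the cutoff in steps of 0.1 mag, and test the objects at or below it.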

Finally, they divided the data into three groups and performed the dip test for each group, as in Table 5.3. The two kernel tests as well as the empirical dip test show evidence of multimodality for the small and large object groups. However, the empirical dip test and our new kernel tests give different results for the intermediate size group: the p-values of the kernel dip test and the kernel excess mass test are smaller than that of the dip test, and the calibrated kernel excess mass test shows evidence against color unimodality of the intermediate size objects. In particular, the new kernel tests also show evidence against unimodality of the intermediate objects without Centaurs in Table 5.4.

Figure 5.9: Estimated distributions of three groups

Objects (n)               Empirical Dip   Ker. Dip   C. Ker. Ex. Mass
HR ≤ 5 (38)               0.0254          0.0235     0.0010
HR ≥ 6.8 (124)            0.0025          0.0000     0.0000
5 < HR < 6.8 (91)         0.9820          0.1260     0.0420

Table 5.3: Unimodality tests for three groups

Figure 5.10: Estimated distributions of three groups without Centaurs

Objects (n)               Empirical Dip   Ker. Dip   C. Ker. Ex. Mass
HR ≤ 5 (38)               0.0254          0.0235     0.0010
HR ≥ 6.8 (98)             0.0435          0.0075     0.0000
5 < HR < 6.8 (88)         0.9459          0.0570     0.0475

Table 5.4: Unimodality tests for three groups without Centaurs

Figure 5.11: B-R distributions of data HR > HR:up and HR < HR:low

We therefore propose new criteria for dividing the magnitude $H_R$ based on our new kernel tests. Unlike Peixinho et al. (2012), we use the kernel dip or excess mass statistics to obtain $H_{R:up}$ and $H_{R:low}$. The kernel densities of the objects above the cutoff $H_{R:up}$, as it decreases from the maximum value, are always bimodal (left graph in Figure 5.11). It is therefore difficult to determine the cutoff from p-values, because the p-values of our new kernel tests are zero or very small. By contrast, the kernel densities of the objects below the cutoff $H_{R:low}$, as it increases from the minimum value, are either unimodal or multimodal (right graph in Figure 5.11). We first determine the cutoff for large objects as the value of $H_{R:low}$ that maximizes the test statistic of the sample below $H_{R:low}$. We then find the upper cutoff for the intermediate objects: $H_{R:up}$ is the value of $H_R$ at which the test statistic of the color index between $H_{R:low}$ and $H_{R:up}$ changes sharply as $H_{R:up}$ decreases from the maximum. In terms of the kernel dip statistic, we obtain $H_{R:low} = 5.9$ and $H_{R:up} = 7.7$. Both cutoffs are higher than those obtained with the empirical dip test (5 and 6.8). In Figure 5.12, the distributions of the large objects ($H_R \le 5.9$) and the small objects ($H_R \ge 7.7$) are multimodal, and that of the intermediate objects ($5.9 < H_R < 7.7$) is unimodal. The kernel excess mass statistic gives the same $H_{R:low}$ as the dip test, but its $H_{R:up}$ is larger than that of the dip test (Figure 5.13). The kernel excess mass test shows evidence of multimodality for the small and large objects and of unimodality for the intermediate objects. Consequently, we recommend our kernel tests for separating the objects into groups based on unimodality testing.

Figure 5.12: Kernel dip test for TNOs


Figure 5.13: Calibration Kernel Excess Mass test for TNOs


Chapter 6

Conclusion

In this thesis, we propose kernel methods for testing unimodality: the kernel dip test and the kernel excess mass test. The computation of the kernel dip statistic relies on theorems on the total variation distance established in this study. In addition, this computation yields the unimodal distribution closest to the estimated distribution of the sample. The kernel dip test treats this unimodal distribution as the null distribution for the kernel dip statistic under the unimodality assumption. The first simulation study shows that the kernel dip test performs better than the empirical dip test. For multimodal distributions with three or more modes and large samples (n = 500), the kernel dip test has greater power than not only the empirical dip test but also Silverman's test.

Furthermore, we show that the kernel excess mass statistic and the empirical excess mass statistic have the same limiting distribution, and we construct the calibrated kernel excess mass test. In the simulation study, the calibrated kernel test also has the greatest power among the calibrated tests.

In future work, we will apply the new kernel unimodality tests to other analyses as well as astronomy data. We also need to develop a less conservative kernel dip test and to extend these tests to the multivariate case.

References

N. Altman and C. Leger. Bandwidth selection for kernel distribution function

estimation. Journal of Statistical Planning and Inference, 46(2):195–214,

1995.

Y.-B. Chan and P. Hall. Using evidence of mixed populations to select variables for clustering very high-dimensional data. Journal of the American Statistical Association, 105(490):798–809, 2010.

M. Y. Cheng and P. Hall. Calibrating the excess mass and dip tests of modality.

Journal of the Royal Statistical Society, Series B, 60(3):579–589, 1998a.

M. Y. Cheng and P. Hall. On mode testing and empirical approximations to

distributions. Statistics & Probability Letters, 39(3):245–254, 1998b.

D. R. Cox. Notes on the analysis of mixed frequency distributions. British

Journal of Mathematical and Statistical Psychology, 19(1):39–47, 1966.

L. Devroye and L. Gyorfi. Nonparametric Density Estimation: the L1 view.

Wiley, 1985.


E. D. Feigelson and G. J. Babu. Modern Statistical Methods for Astronomy

With R Applications. Cambridge, 2012.

E. Gine and R. Nickl. An exponential inequality for the distribution function

of the kernel density estimator, with applications to adaptive estimation.

Probability Theory and Related Fields, 143(3-4):569–596, 2009.

P. Hall and M. York. On the calibration of silverman’s test for multimodality.

Statistica Sinica, 11:515–536, 2001.

J. A. Hartigan and P. M. Hartigan. The dip test of unimodality. The Annals

of Statistics, 13(1):70–84, 1985.

J. Komlos, P. Major, and G. Tusnady. An approximation of partial sums of

independent RV’-s, and the sample DF.I. Zeitschrift fur Wahrscheinlichkeit-

stheorie und Verwandte Gebiete, 32(1-2):111–131, 1975.

D. A. Levin, Y. Peres, and E. L. Wilmer. Markov Chains and Mixing Times.

American Mathematical Society, 2008.

E. Mammen, J. S. Marron, and N. I. Fisher. Some asymptotics for multi-

modality tests based on kernel density estimates. Probability Theory and

Related Fields, 91(1):115–132, 1992.

D. W. Muller and G. Sawitzki. Excess mass estimates and tests for multi-

modality. Journal of the American Statistical Association, 86(415):738–746,

1991.

71

B. U. Park and J. S. Marron. Comparison of data-driven bandwidth selectors.

Journal of the American Statistical Association, 85(409):66–72, 1990.

N. Peixinho, A. Doressoundiram, A. Delsanti, H. Boehnhardt, A. Barucci, and

I. Belskaya. Reopening the TNOs color controversy: Centaurs bimodality

and TNOs unimodality. Astronomy & Astrophysics, 410(3):L29–L32, 2003.

N. Peixinho, A. Delsanti, A. Guilbert-Lepoutre, R. Gafeira, and P. Lacerda.

The bimodal colors of centaurs and small kuiper belt objects. Astronomy &

Astrophysics, 546:A86, 2012.

W. Romanishin, S. C. Tegler, and G. J. Consolmagno. Colors of inner disk

classical kuiper belt objects. The Astronomical Journal, 140(1):29–33, 2010.

S. J. Sheather and M. C. Jones. A reliable data-based bandwidth selection

method for kernel density estimation. Journal of the Royal Statistical Soci-

ety, Series B, 53(3):683–690, 1991.

B. W. Silverman. Using kernel density estimates to investigate multimodality.

Journal of the Royal Statistical Society, Series B, 43(1):97–99, 1981.

72

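Abstract in Korean

Examining the characteristics of a distribution through a unimodality test is a very useful method in classification analysis. Well-known nonparametric unimodality tests include the dip test and the excess mass test, which use the empirical distribution function, and Silverman’s test, which uses the bandwidth. In this study we propose a dip test and an excess mass test that use the kernel distribution function in place of the empirical distribution function, and we examine their properties.

The existing dip test based on the empirical distribution function is defined in terms of the supremum distance, which causes several problems in computing the dip. In this study we present a dip test based on the total variation distance, which remedies the problems of the existing test. In addition, by using a kernel estimator we improve accuracy and power relative to the existing test based on the empirical estimator; this is confirmed through simulation.

We also propose an excess mass test statistic based on the kernel distribution function and confirm that the new statistic has better power than the existing excess mass statistic based on the empirical distribution function. In particular, we prove that the two excess mass statistics have the same asymptotic properties. Using this result, we propose a calibrated excess mass test and show through experiments that the kernel test is more accurate than the existing tests.

We apply the new unimodality tests to astronomy data and confirm that the kernel tests are also effective in real data analysis. In astronomy, the color of a planet is an important variable that describes the planet’s origin and stage of evolution. Planets are classified according to the characteristics of their color distribution in order to identify their properties, and a unimodality test of the color distribution can be applied for this classification. In this study we use the newly proposed tests based on kernel estimators to provide a better criterion for analysis.

Keywords: dip test, excess mass test, kernel estimator, kernel distribution function, unimodal function, unimodality test

Student Number: 2010-30925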
