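Doctoral Dissertation (Doctor of Science)
Kernel Method
for Unimodal Test
(Korean title: 커널방법을 이용한 단일모드 검정)
August 2015
Graduate School, Seoul National University
Department of Statistics
Seonmi Lee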
Kernel Method
for Unimodal Test
by
Seonmi Lee
A Dissertation
submitted in fulfillment of the requirement
for the degree of
Ph.D of Science
in
Statistics
The Department of Statistics
College of Natural Sciences
Seoul National University
August, 2015
Abstract
Seonmi Lee
Statistics
The Graduate School
Seoul National University
Finding the number of modes is of great interest in density estimation. Well-known
nonparametric unimodality tests include the dip test, the excess mass
test, and Silverman’s test. The dip and excess mass statistics are based on the
empirical distribution and the supremum distance, while Silverman’s test depends
on the bandwidth of the kernel density estimator. A main issue with these tests
is conservatism, and calibration methods are often used to address it.
We propose kernel methods for testing unimodality, based on the dip and excess mass
statistics, to address this issue. We propose to use the total
variation distance to identify the unimodal distribution closest to the kernel
distribution, and we construct the kernel dip test from the closest unimodal distribution
obtained while calculating the test statistic. Our numerical studies show that the proposed
tests outperform other test methods. We also introduce a kernel excess mass statistic. Under the
strong unimodal condition, the limiting distribution of the kernel excess mass
statistic is the same as that of the empirical excess mass statistic. However,
the numerical studies indicate that the calibrated kernel excess mass test
has greater power and better level accuracy than the calibrated empirical
excess mass test. We apply the proposed method to astronomy data on the physical
properties of minor planets in the solar system.
Keywords: Density estimation, Dip test, Excess mass test, Kernel methods,
Unimodal distribution
Student Number: 2010-30925
Contents
List of Figures
List of Tables
1 Introduction
1.1 Overview
1.2 Outline of the thesis
2 Unimodality Test
2.1 The dip test
2.2 The excess mass test
2.3 Silverman’s test
3 The Kernel Dip Test
3.1 The kernel dip with total variation
3.2 Computing the kernel dip
3.3 The kernel dip test
4 The Kernel Excess Mass Test
4.1 The kernel excess mass
4.2 Computing the kernel excess mass
4.3 The kernel excess mass test
5 Numerical Study
5.1 Simulation 1
5.2 Simulation 2 : Calibration tests
5.3 Real data analysis
6 Conclusion
Reference
Abstract in Korean
List of Figures
2.1 The empirical dip statistics with sup distance
2.2 Distributions of the empirical dip
2.3 The excess mass E2(λ) and E1(λ)
3.1 Sup distances between F and unimodal distributions U1 and U2
3.2 The total variation distance between f(x) and u(x)
3.3 Unimodal distributions u0(x) (dashed line) and u(x) (dash-dot line) close to distribution f(x) (solid line)
3.4 A part of f(x) and nondecreasing function u(x)
3.5 Finding u∗0(x) in I0 = [a, x1]
3.6 Finding u∗s(x) in Is = [xs, xs+1], 1 ≤ s < S
3.7 Finding u(x) in IS = [xS, b]
3.8 A closest unimodal function u0(x) for a bimodal function f(x)
3.9 Distributions of the kernel dip
4.1 The kernel excess mass and the kernel dip of bimodal distribution
4.2 Intervals for calculating kernel excess mass
4.3 Distributions of kernel excess mass
5.1 Unimodal distributions : N(0, 1), t(6) and β(3, 4)
5.2 Tests for bimodal distributions
5.3 Tests for multimodal distributions
5.4 Calibrating tests for unimodal distributions
5.5 Calibrating tests for bimodal distributions
5.6 Calibrating tests for multimodal distributions
5.7 Estimated distributions for Centaurs and TNOs
5.8 B-R versus HR for Centaurs and TNOs
5.9 Estimated distributions of three groups
5.10 Estimated distributions of three groups without Centaurs
5.11 B-R distributions of data HR > HR:up and HR < HR:low
5.12 Kernel dip test for TNOs
5.13 Calibration Kernel Excess Mass test for TNOs
List of Tables
5.1 Tests for unimodal distributions
5.2 Unimodality tests for Centaurs and TNOs
5.3 Unimodality tests for three groups
5.4 Unimodality tests for three groups without Centaurs
Chapter 1
Introduction
1.1 Overview
The unimodality of a distribution has been one of the most important criteria
in cluster analysis. Multimodality of a distribution means that it is
a mixture and contains several subpopulations. To detect the existence of more
than one mode, several approaches based on density estimates have been proposed.
Cox (1966) used a histogram and Silverman (1981) used the bandwidth selection
of a kernel density estimate. Silverman’s test statistic is generally determined
by extreme values, not by the modes. This flaw reduces the power of Silverman’s
test. Hartigan and Hartigan (1985) proposed the dip statistic and Muller and
Sawitzki (1991) proposed the excess mass statistic, both based on the empirical
cumulative distribution function. Cheng and Hall (1998b) showed that these tests
are equivalent in one dimension. In addition, Cheng and
Hall (1998a) suggested a calibration method for the excess mass test. The power
of this calibrated test, however, is reduced when there are modes with a small
dip.
In order to overcome this drawback, we propose new dip and excess mass
statistics based on kernel methods. Because the definition of the dip with the supremum
distance suggested by Hartigan and Hartigan (1985) causes some difficulties, we
suggest a new definition of the dip based on the total variation distance. Computing
the new dip also yields the nearest unimodal distribution, and the proposed
kernel dip test uses this closest unimodal distribution.
We also define the excess mass using the kernel distribution, analogously to the
empirical excess mass suggested by Muller and Sawitzki (1991). Our study
shows that the asymptotic convergence property of the kernel excess mass is
similar to that of the empirical excess mass.
In the numerical study, the proposed kernel methods for testing unimodality
perform better than other test methods. In particular, the calibrated kernel excess mass test
has the greatest power. We also describe how our kernel unimodality tests
apply to astronomy data in the real data analysis.
1.2 Outline of the thesis
The thesis is organized as follows. In Chapter 2, we review the dip test and the
excess mass test as well as Silverman’s test. Chapter 3 describes
the kernel dip statistic with total variation and how to compute this new
dip. Furthermore, we propose the kernel dip test for unimodality based on
the computed kernel dip. We define the kernel excess mass and show its
theoretical properties in Chapter 4. We perform the simulation study and real
data analysis in Chapter 5 and conclude in Chapter 6.
Chapter 2
Unimodality Test
2.1 The dip test
1) The dip statistics
The dip of a distribution function, suggested by Hartigan and Hartigan (1985),
is given by
$$D_S(F) = \inf_{U \in \mathcal{U}} \sup_x |F(x) - U(x)|$$
where $\mathcal{U}$ is the class of unimodal distribution functions. The dip statistic is
the maximum difference between the empirical distribution function $F_n$ and
the unimodal distribution function $U$ that minimizes that maximum difference,
as illustrated in Figure 2.1. The empirical distribution function is defined by
$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I[X_i \le x]$, where $X_1, \cdots, X_n$ are sampled from $F$.
Therefore we can write the empirical dip statistic as
$$D_S(F_n) = \inf_{U \in \mathcal{U}} \sup_x |F_n(x) - U(x)|.$$
Figure 2.1: The empirical dip statistics with sup distance
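As a rough illustration of these definitions, the sketch below evaluates the empirical distribution function and the sup distance between $F_n$ and one fixed unimodal candidate $U$. Because the dip is an infimum over all unimodal $U$, this distance is only an upper bound on $D_S(F_n)$, not the dip itself. The function names and the choice of a normal candidate are assumptions made for the example.

```python
import math

def ecdf(sample, x):
    """Empirical distribution function F_n(x) = (1/n) * #{X_i <= x}."""
    return sum(1 for xi in sample if xi <= x) / len(sample)

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2), used here as one unimodal candidate U."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def sup_distance(sample, cdf):
    """sup_x |F_n(x) - U(x)| for a continuous candidate U.

    F_n is a step function, so the supremum is reached at a data point,
    either just before the jump or at the jump.
    """
    xs = sorted(sample)
    n = len(xs)
    best = 0.0
    for i, x in enumerate(xs):
        u = cdf(x)
        best = max(best, abs((i + 1) / n - u), abs(i / n - u))
    return best
```

Minimizing `sup_distance` over a family of unimodal candidates would tighten the bound, but the exact dip requires the algorithm of Hartigan and Hartigan (1985), which is not reproduced here.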
2) The dip test
Hartigan and Hartigan (1985) established the following asymptotic property of the dip
and proposed the dip test for unimodality.
Theorem 2.1. Let $F$ be unimodal with nonzero $k$th derivative at the mode $m$,
for some $k \ge 2$, and
$$\inf_{0 < F'(x) < F'(m) - \varepsilon} \left| \frac{d}{dx} \log F'(x) \right| > 0 \quad \text{for each } \varepsilon > 0.$$
Then $\sqrt{n}\, D_S(F_n) \to 0$ in probability.
Theorem 2.1 shows that $D_S(F_n)$ converges to zero under the assumption of
unimodality. The empirical dip $D_S(F_n)$ is the statistic for testing the null
hypothesis that $F$ is unimodal, that is $D_S(F) = 0$, against the alternative
that it is not unimodal, $D_S(F) > 0$. The dip test rejects when
$D_S(F_n)$ is too large. Even when the sample comes from a unimodal $F$, however,
$D_S(F_n)$ is not zero but a small value for sufficiently large $n$. We therefore need
the asymptotic distribution of $D_S(F_n)$ under the null hypothesis. Hartigan and Hartigan (1985) argued that
the dip of the uniform distribution is larger than that of any other distribution
in the unimodal class. They used the uniform distribution $U(0, 1)$ to
calculate the distribution of $D_S(F_n)$ for the dip test.
Figure 2.2: Distributions of the empirical dip
Many studies have noted that this test is conservative, because the dip
statistic of a multimodal distribution can be smaller than that
of the uniform distribution. We also observe this problem in Figure 2.2. This
figure plots the distribution functions of the empirical dip statistic $D_S(F_n)$
with sample size $n = 500$, based on 1000 samples drawn from each of the uniform, normal, and
$t$ distributions and a mixture of two normal distributions. In this figure, the
distribution of the dip statistic for the normal mixture is similar to that
for the uniform distribution. Although the normal mixture
is bimodal, the dip test sometimes classifies it
as unimodal.
2.2 The excess mass test
1) The excess mass
The excess mass approach to testing unimodality is closely related to the dip. Muller and
Sawitzki (1991) introduced the excess mass as a measure of the excess
probability concentrated on a peak. For a bounded continuous density $f$ with
respect to Lebesgue measure, the excess mass functional is defined by
$$\lambda \mapsto E(\lambda) = \int (f(x) - \lambda)^{+} \, dx.$$
A $\lambda$-cluster is a connected component of $\{x : f(x) \ge \lambda\}$. When $f$ has
exactly $m$ $\lambda$-clusters,
$$E_m(\lambda) = \sup \sum_{j=1}^{m} \int_{I_j(\lambda)} (f(x) - \lambda) \, dx,$$
where the supremum is taken over all families $\{I_j : j = 1, \cdots, m\}$ of pairwise
disjoint connected sets. Specifically, $E_2(\lambda)$ is the shaded region in the
left graph of Figure 2.3. In this graph, the disjoint intervals $I_1$ and $I_2$ satisfy
$f(x) \ge \lambda$, and the excess mass functional attains its supremum on $I_1$ and $I_2$. The
right graph of Figure 2.3 shows $E_1(\lambda)$, calculated in the same way as $E_2(\lambda)$.
The difference between $E_2(\lambda)$ and $E_1(\lambda)$ is the excess mass of the second
peak. This difference can serve as a statistic for a unimodality test.
Figure 2.3: The excess mass E2(λ) and E1(λ)
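To make the functional concrete, here is a minimal numerical sketch on a grid. It approximates $E(\lambda)$ by a Riemann sum and $E_1(\lambda)$ by the best integral of $f - \lambda$ over a single interval, found with a running-sum scan. When the level set has exactly two components, $E(\lambda)$ coincides with $E_2(\lambda)$, so the difference is the excess mass attributed to the second peak. The mixture density, the grid, and the level $\lambda = 0.15$ are assumptions chosen for illustration.

```python
import numpy as np

def excess_mass(f_vals, dx, lam):
    """Riemann-sum approximation of E(lambda) = integral of (f - lambda)^+."""
    return float(np.sum(np.clip(f_vals - lam, 0.0, None)) * dx)

def single_interval_excess_mass(f_vals, dx, lam):
    """E_1(lambda): largest integral of (f - lambda) over one interval.

    A running-sum (maximum subarray) scan over the grid cells.
    """
    best = run = 0.0
    for v in f_vals:
        run = max(0.0, run + (v - lam) * dx)
        best = max(best, run)
    return best

def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Illustrative bimodal density: equal mixture of N(0, 1) and N(3, 1).
x = np.linspace(-6.0, 9.0, 15001)
dx = x[1] - x[0]
f = 0.5 * normal_pdf(x, 0.0) + 0.5 * normal_pdf(x, 3.0)

lam = 0.15                       # level between the valley and the peaks
e_total = excess_mass(f, dx, lam)              # equals E_2 here (two clusters)
e_one = single_interval_excess_mass(f, dx, lam)
second_peak = e_total - e_one                  # E_2(lambda) - E_1(lambda)
```

Note that the best single interval may span both peaks and absorb the deficit in the valley, which is why $E_1(\lambda)$ is computed by a scan rather than by taking the largest cluster alone.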
Let $M$ be the maximum number of modes. We obtain estimators of the
excess mass from the empirical distribution as follows:
$$E_{n,M}(\lambda) = \sup_{I_1, \cdots, I_M} \left[ \sum_{j=1}^{M} \big( F_n(I_j) - \lambda \|I_j\| \big) \right]$$
where $\|I\|$ is the length of $I$ and $F_n(I)$ is the $F_n$-measure of $I$. Let
$D_{n,m}(\lambda) = E_{n,m}(\lambda) - E_{n,(m-1)}(\lambda)$ for some $\lambda > 0$. This excess mass difference statistic, $D_{n,m}$, is
useful for testing $m$-modality. Accordingly we can define the empirical excess
mass as
$$\Delta_{n,m} = \max_{\lambda > 0} D_{n,m}(\lambda).$$
Muller and Sawitzki (1991) considered the relation $\Delta_{n,2} = 2 D_S(F_n)$, and Cheng and
Hall (1998b) proved it for every $n$. Consequently, tests based on the empirical
excess mass and the empirical dip give the same result. The excess mass test
statistic has the same disadvantages as the dip test statistic. The empirical excess
mass is also not zero when the sample comes from a unimodal distribution.
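The estimator with a single interval can be sketched by brute force. The code below computes $E_{n,1}(\lambda) = \sup_I [F_n(I) - \lambda \|I\|]$ by scanning all closed intervals whose endpoints are data points; this is an $O(n^2)$ illustration under that assumption, not the algorithm used by Muller and Sawitzki (1991), and the function name is hypothetical.

```python
def empirical_excess_mass_1(sample, lam):
    """E_{n,1}(lambda) = max over closed intervals I of F_n(I) - lambda * |I|.

    Shrinking an interval to its outermost data points keeps F_n(I)
    and does not increase its length, so only data-point endpoints
    need to be scanned.
    """
    xs = sorted(sample)
    n = len(xs)
    best = 0.0
    for i in range(n):
        for j in range(i, n):
            value = (j - i + 1) / n - lam * (xs[j] - xs[i])
            best = max(best, value)
    return best
```

For the two-point sample `[0.0, 1.0]`, a small level such as `lam = 0.1` favors the whole interval (value 0.9), while a large level such as `lam = 2.0` favors a single point (value 0.5).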
Several studies have established the asymptotic properties of the empirical
excess mass under the strong unimodal condition.
Strong unimodal condition
(i) The sampling density $f$ has a continuous derivative $f'$, ultimately
monotone in each tail.
(ii) The constraints $f'(x_0) = 0$ and $f(x_0) \ne 0$ are jointly satisfied at just one
point $x_0$.
(iii) The second derivative function $f''$ exists and is Holder continuous within
a neighborhood of $x_0$, with $f''(x_0) < 0$.
Cheng and Hall (1998b) derived the limiting distribution of $\Delta_{n,2}$ under the
strong unimodal condition in the following theorem.
Theorem 2.2. Let $W$ denote a standard Wiener process on the real line.
Given real numbers $y_1 < y_2$ and $t$, define
$$\delta(y_1, y_2, t) = W(y_2) - W(y_1) - (y_2^3 - y_1^3) + t(y_2 - y_1).$$
And put
$$Z = 6^{1/5} \sup_{-\infty < t < \infty} \left[ \sup_{-\infty < y_1 < \cdots < y_4 < \infty} \big\{ \delta(y_1, y_2, t) + \delta(y_3, y_4, t) \big\} - \sup_{-\infty < y_1 < y_2 < \infty} \delta(y_1, y_2, t) \right].$$
Under the strong unimodal condition,
$$n^{3/5} \Delta_{n,2} \text{ converges in distribution to } c_f Z \text{ as } n \to \infty$$
where $c_f = \left( \dfrac{f(x_0)^3}{|f''(x_0)|} \right)^{1/5}$. Moreover, $n^{3/5} D_S(F_n)$ has the same limiting
distribution.
This theorem shows that the limiting distribution of the empirical excess mass statistic
$\Delta_{n,2}$ depends on $f$ only through $c_f$.
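As a small worked example of the scale constant, assuming the form $c_f = (f(x_0)^3 / |f''(x_0)|)^{1/5}$ from Theorem 2.2, consider the standard normal density. Its mode is $x_0 = 0$, with $f(0) = 1/\sqrt{2\pi}$ and $f''(0) = -f(0)$, so $c_f = f(0)^{2/5} = (2\pi)^{-1/5}$. The helper name below is hypothetical.

```python
import math

def c_f(f_at_mode, f2_at_mode):
    """Scale constant c_f = (f(x0)^3 / |f''(x0)|)^(1/5)."""
    return (f_at_mode ** 3 / abs(f2_at_mode)) ** 0.2

# Standard normal: mode x0 = 0, f(0) = 1/sqrt(2*pi), f''(0) = -f(0).
f0 = 1.0 / math.sqrt(2.0 * math.pi)
c_normal = c_f(f0, -f0)   # analytically (2*pi)^(-1/5)
```

Calibration, as discussed next, amounts to estimating this single constant from the data.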
2) The empirical excess mass test
The excess mass test, suggested by Muller and Sawitzki (1991), is based on the
assertion that $P[\Delta_{n,m} \le \kappa]$ is smaller than $P[\max_I |(U_n - U)(I)| \le \kappa]$, where
$U$ is the standard uniform distribution and $U_n$ is the empirical distribution
of a sample drawn from it. Since the empirical excess mass $\Delta_{n,2}$ is equal to
twice the empirical dip $D_S(F_n)$, this test based on the uniform distribution is
also conservative, as discussed in Section 2.1.
To avoid this conservatism, Cheng and Hall (1998a) proposed calibrating the
empirical excess mass test. By Theorem 2.2, it suffices to estimate the
factor $c_f$ in order to find the limiting distribution of the empirical excess mass
statistic under the null hypothesis. If one can resample from a calibration
distribution with a known unimodal density $g(\cdot)$, the properties of the excess
mass statistic under $g(\cdot)$ can be matched to those under $f(\cdot)$. In fact, the asymptotic
property of our kernel excess mass statistic in this study is the same as that
of the empirical excess mass statistic. Therefore we introduce the calibrated
excess mass test in Chapter 4. Moreover, the new kernel test statistic
performs better than the empirical test statistic in the simulation study.
2.3 Silverman’s test
1) Silverman’s Test
Given a sample $\mathcal{X} = \{X_1, \cdots, X_n\}$ from a population with density $f$, the kernel
density estimate is defined by
$$f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left( \frac{x - X_i}{h} \right)$$
where $h$ is the bandwidth and $K$ is a kernel function. Silverman (1981) suggested
a unimodality test using the fact that the number of modes of $f_h(x)$ is
nonincreasing in $h$ when $K$ is the Gaussian kernel. The test is based on the kernel
density estimate with the smallest bandwidth $h$ that yields a density with $m$ modes. The
null hypothesis $H_0$ is that $f$ has $m$ modes and the alternative hypothesis $H_1$
is that $f$ has more than $m$ modes. Silverman proposed the statistic
$$h_{crit} = \inf\{ h : f(\cdot, h) \text{ has at most } m \text{ modes} \}$$
for this test. A large value of $h_{crit}$ is
evidence against $H_0$. When the null hypothesis is that the true density is $g$
and $h_0$ is the value of $h_{crit}$ computed from the data, Silverman’s test is based on
$$\Pr(h_{crit} > h_0) = \Pr\big( f(\cdot\,; h_0) \text{ has more than } m \text{ modes} \mid x_1, \cdots, x_n \text{ from } g \big).$$
To assess the observed value of the statistic $h_{crit}$, we use a bootstrap with $R$ resamples. Let
$f_{crit}$ denote $f_h$ with $h = h_{crit}$. Conditional on $\mathcal{X}$, let $X_1^*, \cdots, X_n^*$ be a resample
drawn from $f_{crit}$, let
$$f_h^*(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left( \frac{x - X_i^*}{h} \right)$$
and let $h_{crit}^* = \inf\{ h : f_h^* \text{ has at most } m \text{ modes} \}$. The test is determined by
the number of resamples for which $f_h^*(x)$ possesses more than one mode:
$$P = \frac{\#\{ \text{occurrences in which } f_h^*(x) \text{ has more than one mode} \}}{R}.$$
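The critical bandwidth can be approximated numerically. The sketch below counts the modes of a Gaussian kernel density estimate on a grid and finds $h_{crit}$ by bisection, relying on the fact stated above that the number of modes is nonincreasing in $h$ for the Gaussian kernel. The grid size, tolerance, search bounds, and function names are assumptions; mode counting on a finite grid is only approximate.

```python
import numpy as np

def kde(grid, sample, h):
    """Gaussian kernel density estimate f_h evaluated on a grid."""
    z = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(sample) * h * np.sqrt(2.0 * np.pi))

def count_modes(values):
    """Count local maxima of a sampled curve, ignoring flat steps."""
    steps = np.sign(np.diff(values))
    steps = steps[steps != 0]
    return int(np.sum((steps[:-1] > 0) & (steps[1:] < 0)))

def critical_bandwidth(sample, m=1, grid_size=2048, tol=1e-4):
    """Approximate h_crit = inf{h : f_h has at most m modes} by bisection."""
    sample = np.asarray(sample, dtype=float)
    spread = sample.max() - sample.min()   # assumes the sample is not constant
    grid = np.linspace(sample.min() - spread, sample.max() + spread, grid_size)
    lo, hi = 1e-3 * spread, spread
    while count_modes(kde(grid, sample, hi)) > m:   # enlarge hi if needed
        hi *= 2.0
    while hi - lo > tol * spread:
        mid = 0.5 * (lo + hi)
        if count_modes(kde(grid, sample, mid)) > m:
            lo = mid        # still too many modes: h_crit is larger
        else:
            hi = mid        # at most m modes: h_crit is no larger
    return hi
```

In a bootstrap, the same `count_modes` check applied to resamples at $h = h_{crit}$ would give the proportion $P$ described above; drawing from $f_{crit}$ corresponds to resampling the data and adding Gaussian noise with standard deviation $h_{crit}$.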
Mammen et al. (1992) and Cheng and Hall (1998b) derived the asymptotic
properties of $h_{crit}$ under the null hypothesis $H_0$. In addition, Mammen et al.
(1992) pointed out that Silverman’s test is conservative because the true asymptotic
level is less than the nominal one. The distribution of $U_n = P(h_{crit}^* \le h_{crit} \mid \mathcal{X})$
is not far from being uniform on the interval $(0, 1)$, at least for large values
of $n$. Moreover, the density estimate at $h_{crit}$ sometimes has a spurious mode in the tail. If both the
support of $f$ and the interval $I$ are unbounded, then the properties of $h_{crit}$ are
generally determined by extreme values in the sample, not by the modes of $f$.
2) Calibrating Silverman’s Test
The calibrated Silverman’s test was proposed by Hall and York (2001) to improve
the level accuracy of the test. The calibration takes two forms: an asymptotic
approach based on $G_n(\lambda) = P(h_{crit}^* / h_{crit} \le \lambda \mid \mathcal{X})$ and a Monte Carlo technique. Under
$H_0$, $G_n(\lambda)$ converges weakly to a stochastic process $G$, and there exists a unique
$\lambda_\alpha$ such that $P\{G(\lambda_\alpha) \ge 1 - \alpha\} = \alpha$. The constant $\lambda_\alpha$ is determined by the level $\alpha$.
First, the asymptotic correction is based on the limiting distribution of the
test statistic:
$$P\{G_n(\lambda_\alpha) \ge 1 - \alpha\} \to P\{G(\lambda_\alpha) \ge 1 - \alpha\} = \alpha.$$
From a simulation study, they fitted
$$\lambda_\alpha = \frac{a_1 \alpha^3 + a_2 \alpha^2 + a_3 \alpha + a_4}{\alpha^3 + a_5 \alpha^2 + a_6 \alpha + a_7}$$
to the output
to provide a means of approximating $\lambda_\alpha$ for arbitrary $\alpha$. The coefficients
are $a_1 = 0.94029$, $a_2 = -1.59914$, $a_3 = 0.17695$, $a_4 = 0.48971$, $a_5 =
-1.77793$, $a_6 = 0.36162$, $a_7 = 0.42423$. The other method estimates $\lambda_\alpha$ by
Monte Carlo, which is possible because $G$ does not depend on unknowns. We use the former
method in the simulation study of Chapter 5.
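The fitted rational approximation for $\lambda_\alpha$ is straightforward to evaluate. The sketch below uses the coefficients quoted above; only the function name is an assumption.

```python
def lambda_alpha(alpha):
    """Rational approximation of the calibration constant lambda_alpha."""
    a1, a2, a3, a4 = 0.94029, -1.59914, 0.17695, 0.48971
    a5, a6, a7 = -1.77793, 0.36162, 0.42423
    numerator = a1 * alpha ** 3 + a2 * alpha ** 2 + a3 * alpha + a4
    denominator = alpha ** 3 + a5 * alpha ** 2 + a6 * alpha + a7
    return numerator / denominator
```

At the usual level $\alpha = 0.05$ this evaluates to roughly 1.13, so the bootstrap critical bandwidth is compared with a slightly inflated multiple of $h_{crit}$.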
Chapter 3
The Kernel Dip Test
3.1 The kernel dip with total variation
1) The total variation dip
The empirical dip statistic has some defects because of the supremum distance.
Recall the definition of the dip statistic,
$$D_S(F) = \inf_{U \in \mathcal{U}} \sup_x |F(x) - U(x)|$$
where $\mathcal{U}$ is the class of unimodal distribution functions. The unimodal
distribution that is closest to the observed distribution in terms of the sup
distance is not always reasonable. The top of Figure 3.1 shows a distribution
$F$, a mixture of $N(0, 1)$ and $N(3, 1)$, and two unimodal distributions $U_1$ and
$U_2$ close to $F$. The sup distances from $U_1$ and $U_2$ to $F$ are 0.0415 and 0.024,
respectively. The bottom of Figure 3.1 shows the corresponding densities $f$, $u_1$, and $u_2$.
Although $U_2$ is closer to $F$ than $U_1$, the density
$u_2$ is an unconvincing candidate for the unimodal density closest to $f$. The density $u_1$ is
a more reasonable approximation to $f$ than $u_2$. In addition, the computing
method of the dip suggested by Hartigan and Hartigan (1985) gives a result
analogous to $\sup |F - U_1|$. Because a kernel distribution $F_h$ is a smooth function,
unlike the empirical distribution $F_n$, this computing method cannot give the exact
dip of $F_h$.
Figure 3.1: Sup distances between F and unimodal distributions U1 and U2
In order to overcome this difficulty, we apply the total variation distance
to the definition of the dip. Levin et al. (2008) introduced the total variation
as a distance measure between two probability distributions. For two probability distributions
$P$ and $Q$ on a sample space $\Omega$, the total variation distance is defined by
$$\|P - Q\|_{TV} = \max_{A \subset \Omega} |P(A) - Q(A)|.$$
For an arbitrary sample space $\Omega$, a measure $\mu$, and probability distributions $P$ and
$Q$ with Radon-Nikodym derivatives $f_P$ and $f_Q$ with respect to $\mu$,
$$\|P - Q\|_{TV} = \frac{1}{2} \|f_P - f_Q\|_{L_1(\mu)} = \frac{1}{2} \int_{\Omega} |f_P - f_Q| \, d\mu.$$
We use the total variation instead of the supremum distance to measure the
distance between distributions more accurately. For any distribution $F$, we redefine the dip as
$$D(F) = \inf_{U \in \mathcal{U}} \|F - U\|_{TV}$$
where $\mathcal{U}$ is the class of unimodal distribution functions. The total variation
distance between $F$ and a unimodal distribution $U$ is half the $L_1$ distance between the
Radon-Nikodym derivatives $f$ and $u$, as shown in Figure 3.2.
Figure 3.2: The total variation distance between f(x) and u(x)
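For densities on the real line, the identity $\|P - Q\|_{TV} = \frac{1}{2} \int |f_P - f_Q| \, dx$ can be checked numerically. The sketch below approximates the integral by a Riemann sum on a grid; the two normal densities and the grid are assumptions chosen because the exact answer, $2\Phi(0.5) - 1 \approx 0.3829$, is known in closed form.

```python
import numpy as np

def tv_distance(f_vals, g_vals, dx):
    """Total variation distance 0.5 * integral |f - g| via a Riemann sum."""
    return 0.5 * float(np.sum(np.abs(f_vals - g_vals)) * dx)

def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

x = np.linspace(-10.0, 11.0, 21001)
dx = x[1] - x[0]
tv = tv_distance(normal_pdf(x, 0.0), normal_pdf(x, 1.0), dx)
```

The same routine applies when one argument is a kernel density estimate and the other is a candidate unimodal density evaluated on the same grid.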
If $F$ and $U$ are absolutely continuous distributions, then there exist
Radon-Nikodym derivatives $f$ and $u$. Hence, we can write
$$D(F) = \inf_{U \in \mathcal{U}} \|F - U\|_{TV} = \inf_{u \in \mathcal{U}^*} \left[ \frac{1}{2} \int |f(x) - u(x)| \, dx \right]$$
where $\mathcal{U}^*$ is the class of unimodal density functions. The class $\mathcal{U}^*$ can
be written as
$$\mathcal{U}^* = \{ u_0 \mid u_0 \text{ is nondecreasing in } (-\infty, m] \text{ and nonincreasing in } (m, \infty), \text{ where } m \text{ is the mode} \}.$$
The total variation dip has the following properties.
Property 1. For any distributions $F_1$ and $F_2$,
$$D(F_1) \le D(F_2) + \|F_1 - F_2\|_{TV}.$$
This follows from the triangle inequality for the total variation distance: for
any probability distribution $F$, $\|P - Q\|_{TV} \le \|P - F\|_{TV} + \|F - Q\|_{TV}$.
Property 2. If $F$ is a unimodal distribution, that is $F \in \mathcal{U}$, then $D(F) = 0$.
On the other hand, if $F$ is a multimodal distribution, then $D(F) > 0$.
Using this property, we can determine whether a distribution $F$ is unimodal or
not.
2) The Kernel dip
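Recall the kernel density estimate,
$$f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left( \frac{x - X_i}{h} \right)$$
where the kernel $K$ satisfies $\int K = 1$. In addition, we consider the kernel distribution
function estimate
$$F_h(x) = \int_{-\infty}^{x} f_h(y) \, dy = \frac{1}{n} \sum_{i=1}^{n} L\!\left( \frac{x - X_i}{h} \right)$$
where $L(t) = \int_{-\infty}^{t} K(u) \, du$.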
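A minimal sketch of the kernel distribution function estimate follows, assuming a Gaussian kernel so that $L$ is the standard normal distribution function. The function name is hypothetical and the bandwidth is supplied by the caller.

```python
import math

def kernel_cdf(x, sample, h):
    """F_h(x) = (1/n) * sum of L((x - X_i)/h), with L the standard normal CDF."""
    total = 0.0
    for xi in sample:
        total += 0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0))))
    return total / len(sample)
```

Unlike the empirical distribution function, `kernel_cdf` is smooth in `x`, which is what makes a density-based distance such as total variation applicable.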
The proposed dip is defined on the class of absolutely continuous distribution
functions, so we can estimate the total variation dip with a kernel estimator. The
estimated total variation dip measures the distance between the kernel distribution
function and the unimodal distribution functions:
$$D(F_h) = \inf_{U \in \mathcal{U}} \|F_h - U\|_{TV}$$
where $\mathcal{U}$ is the unimodal class. The kernel dip has the following properties.
Property 1. When a kernel distribution function $F_h(x)$ is unimodal, its
kernel dip is zero, $D(F_h(x)) = 0$.
Property 2. When the true distribution function $F$ is unimodal,
$$D(F_h(x)) = \inf_{U \in \mathcal{U}} \|F_h - U\|_{TV} \le \|F_h - F\|_{TV}.$$
Moreover, Theorem 3.1 gives the asymptotic property of the kernel dip.
Theorem 3.1. Assume the true distribution $F$ is unimodal and $nh \to \infty$, $h \to 0$
as $n \to \infty$. Then
$$D(F_h(x)) \to 0 \text{ in probability.}$$
Devroye and Gyorfi (1985) established the asymptotic convergence of the
kernel density estimator stated in Lemma 3.2.
Lemma 3.2. Assume $nh \to \infty$, $h \to 0$ as $n \to \infty$. Then
$$\int |f_h(x) - f(x)| \, dx \to 0 \text{ as } n \to \infty.$$
Theorem 3.1 follows directly from Lemma 3.2 and Property 2 of the
kernel dip.
Figure 3.3: Unimodal distributions u0(x) (dashed line) and u(x) (dash-dot
line) close to distribution f(x) (solid line)
3.2 Computing the kernel dip
1) Computing the total variation dip
The total variation dip requires a new computing method. To compute the total
variation dip $D(F)$ for any distribution function $F$, we should find a unimodal
distribution $U_0$ that satisfies $D(F) = \|F - U_0\|_{TV}$. Consider $F(x)$ with
density $f(x)$ in Figure 3.3 as an example. There are two unimodal densities
$u_0(x)$ and $u(x)$ satisfying $\|u_0(x) - f(x)\|_{L_1} \le \|u(x) - f(x)\|_{L_1}$. If $u_0(x)$ is
closest to $f(x)$, we obtain the dip as $D(F) = \frac{1}{2} \int |f(x) - u_0(x)| \, dx$. We assume $f$ is a
continuous function with bounded support $I$. The interval $I$ can be divided
into a nondecreasing part $I_L = (-\infty, m)$ and a nonincreasing part $I_R = (m, \infty)$ by
the maximum mode $m$. The following theorems show how to find the closest unimodal
function for computing the total variation dip. As in Figure 3.3, we first find a
nondecreasing function $u_0$ close to $f$ in $I_L$. In particular, we concentrate on finding a
nondecreasing function in $I_L^* = [a, b]$, where $a$ and $b$ satisfy $f(a) = \min_{a_i < m} f(a_i)$
and $f(b) = \max_{b_i < m} f(b_i)$.
Theorem 3.3. Let $f(x)$ be a continuous function on $[a, b] \subset \mathbb{R}$ with $K$
modes at $b_k$ and antimodes at $a_k$, where $1 \le k \le K$, and set the increasing interval
$$I_{inc} = [a, b_1] \cup \left( \bigcup_{k=1}^{K-1} [a_k, b_k] \right) \cup [a_K, b].$$
Assume that $f(a) = \min_{1 \le k \le K} f(a_k)$ and $f(b) = \max_{1 \le k \le K} f(b_k)$. Then
$$\inf_{u \in U_0} \int_a^b |f(x) - u(x)| \, dx = \inf_{u \in U_1} \int_a^b |f(x) - u(x)| \, dx \qquad (3.1)$$
where $U_0$ is the class of nondecreasing continuous functions and
$$U_1 = \Big\{ u \;\Big|\; u(x) = f(x) I[a \le x \le c_1] + \sum_{i=1}^{L} f(c_i) I[c_i \le x \le d_i] + \sum_{i=1}^{L-1} f(x) I[d_i \le x \le c_{i+1}] + f(x) I[d_L \le x \le b],$$
$$\text{where } f(c_1) \le \cdots \le f(c_L), \; c_1 \le \cdots \le c_L, \; c_i \in I_{inc} \; (1 \le i \le L), \text{ and } d_i = \min\{ x \mid f(x) = f(c_i), \; c_i < x, \; x \in I_{inc} \} \Big\}.$$
Figure 3.4: A part of f(x) and nondecreasing function u(x)
Proof of Theorem 3.3. Assume the nondecreasing function $u(x)$ satisfies $u(a) =
f(a)$ and $u(b) = f(b)$, because we seek the function closest to $f$. For any $u(x)$ in $U_0$,
there exist $S$ intersection points of $u(x)$ and $f(x)$ in the intervals where $f(x)$
decreases. As in Figure 3.4, $x_1, \cdots, x_S$ $(S \le K)$ can be defined as the elements of the set
$\{ x \mid f(x) = u(x) \text{ and } x \in I_{inc} \}$. Set $I_0 = [a, x_1]$, $I_1 = [x_1, x_2]$, $\cdots$, $I_{S-1} =
[x_{S-1}, x_S]$, $I_S = [x_S, b]$. If we find nondecreasing functions $u_0^*, \cdots, u_S^*$ satisfying
$\int_{I_i} |f(x) - u(x)| \, dx \ge \int_{I_i} |f(x) - u_i^*(x)| \, dx$ for every $i = 0, \cdots, S$, then
$u^* = \sum_{s=0}^{S} u_s^* \, I(x \in I_s)$ satisfies $\int_a^b |f(x) - u(x)| \, dx \ge \int_a^b |f(x) - u^*(x)| \, dx$. Thus, our
problem reduces to finding a $u^*$ that is closer to $f$ than $u$ for any $u \in U_0$.
First, we establish $u_0^*(x)$ in $I_0 = [a, x_1]$. We can find $x_0$ such that $f(x_0) =
f(x_1)$ in the open interval $(a, x_1)$. If $x_0 \le b_1$, as in the top graphs of Figure 3.5,
then we can set $c_{01} = x_0$ and
$$u_0^*(x) = f(x) I[a \le x \le c_{01}] + f(c_{01}) I[c_{01} \le x \le x_1].$$
If $x_0 \ge b_1$, we consider two cases, $u(b_1) \ge f(b_1)$ or $u(b_1) < f(b_1)$. When
$u(b_1) \ge f(b_1)$, we find modes $b_1 = b_{01} < b_{02} < \cdots < b_{0J} < x_0$ satisfying $f(b_{01}) \le
f(b_{02}) \le \cdots \le f(b_{0J})$, as in the middle of Figure 3.5. Let $c_{0j} = b_{0j}$, $j = 1, \cdots, J$,
and $c_{0(J+1)} = x_0$. Moreover, let $d_{0j}$ be a solution of $f(x) = f(c_{0j})$ in $(c_{0j}, c_{0(j+1)})$,
$j = 1, \cdots, J$, and $d_{0(J+1)} = x_1$. Therefore we can set $u_0^*$ as
$$u_0^* = f(x) I[a \le x \le c_{01}] + \sum_{j=1}^{J+1} f(c_{0j}) I[c_{0j} \le x \le d_{0j}] + \sum_{j=1}^{J} f(x) I[d_{0j} \le x \le c_{0(j+1)}].$$
When $u(b_1) < f(b_1)$, we can find antimodes $a_{01} = a_1 < a_{02} < \cdots < a_{0J} < x_0$
satisfying $f(a_{01}) \le f(a_{02}) \le \cdots \le f(a_{0J})$, as in the bottom of Figure 3.5. Let
$d_{0j} = a_{0j}$, $j = 1, \cdots, J$, and $d_{0(J+1)} = x_1$. In addition, let $c_{0j}$ be a solution
of $f(x) = f(d_{0j})$ in $(d_{0(j-1)}, d_{0j})$, $j = 1, \cdots, J$, with $d_{00} = a$ and $c_{0(J+1)} = x_0$.
Then we can set
$$u_0^* = f(x) I[a \le x \le c_{01}] + \sum_{j=1}^{J+1} f(c_{0j}) I[c_{0j} \le x \le d_{0j}] + \sum_{j=1}^{J} f(x) I[d_{0j} \le x \le c_{0(j+1)}].$$
Figure 3.5: Finding u∗0(x) in I0 = [a, x1]
Next, we consider $f(x)$ and $u_s^*(x)$ in $I_s = [x_s, x_{s+1}]$, $1 \le s < S$. Let
$d_{s0} = \min\{ x \mid f(x) = f(x_s) \text{ and } x \in (x_s, x_{s+1}) \}$ and let $x_{0s}$ be a solution of $f(x) =
f(x_{s+1})$ in $(x_s, x_{s+1})$. If there exists no mode whose value is bigger than $f(x_s)$ and
smaller than $f(x_{s+1})$, then we set $u_s^*$ as in the top of Figure 3.6, that is
$$u_s^* = f(x_s) I[x_s \le x \le d_{s0}] + f(x) I[d_{s0} \le x \le x_{0s}] + f(x_{s+1}) I[x_{0s} \le x \le x_{s+1}].$$
Otherwise, we consider $u_s^*$ in two cases. When $f(b_k) \le u(b_k)$ for a mode
$b_k$ such that $d_{s0} \le b_k \le x_{0s}$, we find modes $b_{s1} < b_{s2} < \cdots < b_{sJ_s} \le x_{0s}$ satisfying
$f(b_{s1}) \le \cdots \le f(b_{sJ_s})$ and $b_{s1} \ge d_{s0}$. Let $c_{sj} = b_{sj}$, $j = 1, \cdots, J_s$, $c_{s(J_s+1)} = x_{0s}$,
and let $d_{sj}$ be a solution of $f(x) = f(c_{sj})$ in $(c_{sj}, c_{s(j+1)})$, $j = 1, \cdots, J_s$, as in the
middle of Figure 3.6. Then we can set
$$u_s^* = f(x_s) I[x_s \le x \le d_{s0}] + \sum_{j=0}^{J_s} f(x) I[d_{sj} \le x \le c_{s(j+1)}] + \sum_{j=1}^{J_s} f(c_{sj}) I[c_{sj} \le x \le d_{sj}] + f(x_{s+1}) I[x_{0s} \le x \le x_{s+1}].$$
When $f(a_k) \le u(a_k)$ for an antimode $a_k$ such that $d_{s0} \le a_k \le x_{0s}$, we
find antimodes $a_{s1} < a_{s2} < \cdots < a_{sJ_s} \le x_{0s}$ satisfying $f(a_{s1}) \le \cdots \le f(a_{sJ_s})$
and $a_{s1} > d_{s0}$. Let $d_{sj} = a_{sj}$, $j = 1, \cdots, J_s$, and let $c_{sj}$ be a solution of
$f(x) = f(d_{sj})$ in $(d_{s(j-1)}, d_{sj})$, $j = 1, \cdots, J_s$, with $c_{s(J_s+1)} = x_{0s}$, as in the bottom
of Figure 3.6. Then we can also set
$$u_s^* = f(x_s) I[x_s \le x \le d_{s0}] + \sum_{j=0}^{J_s} f(x) I[d_{sj} \le x \le c_{s(j+1)}] + \sum_{j=1}^{J_s} f(c_{sj}) I[c_{sj} \le x \le d_{sj}] + f(x_{s+1}) I[x_{0s} \le x \le x_{s+1}].$$
Figure 3.6: Finding u∗s(x) in Is = [xs, xs+1], 1 ≤ s < S
Finally, we have to find $u_S^*(x)$ in $I_S = [x_S, b]$. Let $x_{0S}$ be a solution of $f(x) =
f(x_S)$ in $(x_S, b]$. If $x_{0S} \ge a_K$, as in the top graphs of Figure 3.7, then
$$u_S^* = f(x_S) I[x_S \le x \le x_{0S}] + f(x) I[x_{0S} \le x \le b].$$
If $x_{0S} < a_K$ and $f(b_K) < u(b_K)$, then we find modes $b_{S1} < b_{S2} < \cdots < b_{SJ_S} \le
b_K$ satisfying $f(b_{S1}) \le \cdots \le f(b_{SJ_S})$ and $b_{S1} > x_S$. Let $c_{Sj} = b_{Sj}$, $j = 1, \cdots, J_S$,
and let $d_{Sj}$ be a solution of $f(x) = f(c_{Sj})$ in $(c_{Sj}, c_{S(j+1)})$, $j = 1, \cdots, J_S$, with
$c_{S(J_S+1)} = b$ and $d_{S0} = x_{0S}$, as in the middle of Figure 3.7. Then we can set
$$u_S^* = f(x_S) I[x_S \le x \le d_{S0}] + \sum_{j=0}^{J_S - 1} f(x) I[d_{Sj} \le x \le c_{S(j+1)}] + \sum_{j=1}^{J_S} f(c_{Sj}) I[c_{Sj} \le x \le d_{Sj}] + f(x) I[d_{SJ_S} \le x \le b].$$
If $x_{0S} < a_K$ and $f(b_K) \ge u(b_K)$, we find antimodes $a_{S1} < \cdots < a_{SJ_S} \le a_K$
satisfying $f(a_{S1}) \le \cdots \le f(a_{SJ_S})$ and $a_{S1} > x_S$. Let $d_{Sj} = a_{Sj}$, $j = 1, \cdots, J_S$,
and let $c_{Sj}$ be a solution of $f(x) = f(d_{Sj})$ in $(d_{S(j-1)}, d_{Sj})$, $j = 1, \cdots, J_S$, with
$d_{S0} = x_{0S}$. Then we can set $u_S^*$ as in the bottom of Figure 3.7:
$$u_S^* = f(x_S) I[x_S \le x \le d_{S0}] + \sum_{j=0}^{J_S - 1} f(x) I[d_{Sj} \le x \le c_{S(j+1)}] + \sum_{j=1}^{J_S} f(c_{Sj}) I[c_{Sj} \le x \le d_{Sj}] + f(x) I[d_{SJ_S} \le x \le b].$$
Figure 3.7: Finding u(x) in IS = [xS, b]
As a result, we can construct a nondecreasing function $u^* = \sum_{s=0}^{S} u_s^*$ in $U_1$ that is closer
to $f$ than $u$ in $U_0$ in terms of the $L_1$ distance:
$$\int_a^b |f(x) - u(x)| \, dx \ge \int_a^b |f(x) - u^*(x)| \, dx.$$
Moreover, since every function in $U_1$ is nondecreasing,
$$\inf_{u \in U_0} \int_a^b |f(x) - u(x)| \, dx \le \inf_{u \in U_1} \int_a^b |f(x) - u(x)| \, dx.$$
Together, these two inequalities give (3.1).
A nonincreasing function close to $f$ in $I_R$ can be found in the same way as in
Theorem 3.3.
Corollary 3.3.1. Let $f(x)$ be a continuous function on $[a, b] \subset \mathbb{R}$ with
$K$ modes at $a_k$ and antimodes at $b_k$, where $1 \le k \le K$, and set the decreasing
interval
$$I_{dec} = [a, b_1] \cup \left( \bigcup_{k=1}^{K-1} [b_k, a_{k+1}] \right) \cup [b_K, b].$$
Assume that $f(a) = \max_{1 \le k \le K} f(b_k)$
and $f(b) = \min_{1 \le k \le K} f(a_k)$ for some $i, j \in (1, \cdots, K)$. Then
$$\inf_{u \in U_0} \int_a^b |f(x) - u(x)| \, dx = \inf_{u \in U_1} \int_a^b |f(x) - u(x)| \, dx$$
where $U_0$ is the class of nonincreasing functions and
$$U_1 = \Big\{ u(x) \;\Big|\; u(x) = f(x) I[a \le x \le c_1] + \sum_{i=1}^{L} f(c_i) I[c_i \le x \le d_i] + \sum_{i=1}^{L-1} f(x) I[d_i \le x \le c_{i+1}] + f(x) I[d_L \le x \le b],$$
$$\text{where } f(c_1) \le \cdots \le f(c_L), \; c_1 \le \cdots \le c_L, \; c_i \in I_{dec} \; (1 \le i \le L), \text{ and } d_i = \min\{ x \mid f(x) = f(c_i), \; c_i < x, \; x \in I_{dec} \} \Big\}.$$
When $F$ is a bimodal distribution, the closest unimodal
distribution can be found more easily by the following theorem.
Figure 3.8: A closest unimodal function u0(x) for a bimodal function f(x)
Theorem 3.4. Let $F$ be an absolutely continuous bimodal distribution with
density function $f$, and set
$$U_0(x) = G_0 I[x \le m] + L_0 I[x \ge m]$$
where $m = \arg\max f(x)$, $G_0$ is the greatest convex minorant (g.c.m.) of $F$,
and $L_0$ is the least concave majorant (l.c.m.) of $F$. Then
$$D(F) = \|F - U_0\|_{TV}.$$
Specifically, the g.c.m. is defined as $G_0(x) = \sup\{ G(x) \mid G(x) \text{ is convex in } (-\infty, m] \text{ and } F(x) \ge G(x) \}$,
and the l.c.m. is defined as $L_0(x) = \inf\{ L(x) \mid L(x) \text{ is concave in } [m, \infty) \text{ and } F(x) \le L(x) \}$.
Proof of Theorem 3.4. Assume that $F$ has modes at $x = m_1$ and $x = m$ with
$m_1 < m$. We only need to find the nondecreasing function $u_0$ on $(-\infty, m]$ that minimizes the $L_1$
distance to $f(x)$, because $L_0(x) I[x \ge m] = F(x) I[x \ge m]$. By Theorem 3.3,
the closest unimodal density $u_0$ can be written as
$$u_0(x) = f(x) I[-\infty < x \le c_1] + f(c_1) I[c_1 \le x \le d_1] + f(x) I[d_1 \le x < \infty]$$
for some $c_1$ and $d_1$ such that $c_1 < m_1 < d_1 < m$ and $f(a_1) \le f(c_1) \le f(m_1)$,
where $a_1$ is the antimode between $m_1$ and $m$, as in Figure 3.8. We can rewrite
$$\int_{-\infty}^{m} |f(x) - u_0(x)| \, dx = \int_{M = [c_1, x_1]} \big( f(x) - f(c_1) \big) \, dx + \int_{A = [x_1, d_1]} \big( f(c_1) - f(x) \big) \, dx$$
where $x_1$ satisfies $f(x_1) = f(c_1)$ and $m_1 \le x_1 \le a_1$. Since $F(m) = U_0(m)$,
$u_0$ should satisfy $\int_M \big( f(x) - f(c_1) \big) \, dx = \int_A \big( f(c_1) - f(x) \big) \, dx$ for some $c_1$. For all
$x \le m$,
$$\int_{-\infty}^{x} f(y) \, dy \ge \int_{-\infty}^{x} u_0(y) \, dy$$
and $\int_{-\infty}^{x} u_0(y) \, dy = G_0(x)$ is the g.c.m. of $F$ on $(-\infty, m]$ because $u_0(y)$ is a
nondecreasing function. Consequently, $U_0(x) = G_0 I[x \le m] + L_0 I[x \ge m]$ satisfies
$$\inf_{U \in \mathcal{U}} \|F - U\|_{TV} = \|F - U_0\|_{TV}.$$
When $F$ has modes at $x = m_2$ and $x = m$ with $m_2 > m$, we can also find $U_0$ by
Corollary 3.3.1.
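Theorem 3.4 relies on the greatest convex minorant of $F$ to the left of the mode. On a grid, the g.c.m. of a set of points is their lower convex hull, which the sketch below builds with a monotone-chain scan and then interpolates linearly. This is an illustrative discretization under the assumption of strictly increasing grid points, not the thesis's own implementation; the function name is hypothetical.

```python
import numpy as np

def greatest_convex_minorant(x, y):
    """Values at x of the greatest convex minorant of the points (x_i, y_i).

    x must be strictly increasing. The minorant is the lower convex hull,
    built with a monotone-chain scan and linearly interpolated between
    hull vertices.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    hull = []  # indices of the hull vertices, left to right
    for i in range(len(x)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # Drop i1 if it lies on or above the chord from i0 to i.
            cross = (x[i1] - x[i0]) * (y[i] - y[i0]) - (y[i1] - y[i0]) * (x[i] - x[i0])
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(x, x[hull], y[hull])
```

Applied to values of $F$ on a grid over $(-\infty, m]$, the slopes of the returned piecewise linear function are nondecreasing, so they give a nondecreasing density; the least concave majorant on $[m, \infty)$ can be obtained by applying the same routine to $-F$ and negating the result.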
In the computation of the total variation dip, we consider only the case in which $F$
is a multimodal distribution, because the dip of a unimodal distribution
is zero by Property 2 of the total variation dip.
Suppose $F$ is a multimodal distribution. Find $m = \arg\max f(x)$ and
calculate the dip separately on $(-\infty, m]$ and $[m, \infty)$, that is
$$\inf_{u \in \mathcal{U}^*} \int |f(x) - u(x)| \, dx = \inf_{g \in \mathcal{U}_L^*} \int_{-\infty}^{m} |f(x) - g(x)| \, dx + \inf_{l \in \mathcal{U}_R^*} \int_{m}^{\infty} |f(x) - l(x)| \, dx$$
where $\mathcal{U}_L^*$ is the class of nondecreasing functions and $\mathcal{U}_R^*$ is the class of nonincreasing functions.
Let the left term be $D_L(F)$ and the right term be $D_R(F)$. First, we
compute $D_L(F)$. If $F$ does not have any other mode in $(-\infty, m)$, then $D_L(F) = 0$.
If $F$ has only one mode in $(-\infty, m)$, we can easily obtain $D_L(F)$ by using
Theorem 3.4. When $F$ has $k \, (\ge 2)$ modes in $(-\infty, m)$, calculating $D_L(F)$ is
somewhat complex. We consider nondecreasing functions of the form
$$u(x) = f(x) I[x \le c_1] + \sum_{i=1}^{L} f(c_i) I[c_i \le x \le d_i] + \sum_{i=1}^{L-1} f(x) I[d_i \le x \le c_{i+1}] + f(x) I[d_L \le x]$$
where $f(c_1) \le \cdots \le f(c_L)$ for $c_1 \le \cdots \le c_L$ and $d_i = \min_{x \in [a, b]} \{ x \mid f(x) = f(c_i) \}$ as in
Theorem 3.3, and they should satisfy
$$\frac{1}{2} \int_{-\infty}^{m} \big( f(x) - u(x) \big) \, dx = 0.$$
Then, among these functions, we choose the nondecreasing function with the minimum total variation on
$(-\infty, m]$. Similarly, we can also compute the right term of
the dip, $D_R(F)$.
2) The kernel dip estimation
Recall the definition of the kernel dip,
\[ D(F_h) = \inf_{U \in \mathcal{U}} \|F_h - U\|_{TV} = \inf_{u \in \mathcal{U}^*} \left[ \frac{1}{2} \int |f_h(x) - u(x)|\,dx \right]. \]
Unlike the empirical dip, the newly suggested dip is estimated with the kernel density. This estimator depends on the bandwidth selection in the kernel density estimation. We use the well-known bandwidth selection method suggested by Sheather and Jones (1991). They chose the bandwidth $h$ to minimize a kernel-based estimate of the asymptotic mean integrated squared error (AMISE) as $n \to \infty$ and $h \to 0$,
\[ \mathrm{AMISE} = \frac{1}{nh} R(K) + \frac{1}{4} h^4 M_2^2 R(f''), \]
where $R(K) = \int K^2(x)\,dx$ and $M_2 = \int x^2 K(x)\,dx$. They obtain $h_{opt}$ by solving the equation
\[ h = \left[ \frac{R(K)}{M_2^2\, S_D(\alpha(h))} \right]^{1/5} n^{-1/5}, \]
where $\alpha(h) = c_1 h^{1/7}$ for an appropriate $c_1$ and
\[ S_D(\alpha(h)) = \frac{1}{n^2 h^5} \sum\sum_{i \ne j} K'' * K'\!\left( \frac{X_i - X_j}{h} \right), \]
by analogy with the algorithm of Park and Marron (1990).
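For reference, the fixed-point equation has the same form as the exact minimizer of the AMISE. The short derivation below is a sketch that assumes the last factor of the AMISE is the roughness $R(f'') = \int f''(x)^2\,dx$; setting the derivative with respect to $h$ to zero gives the $n^{-1/5}$ rate.

```latex
\[
\frac{\partial}{\partial h}\,\mathrm{AMISE}(h)
  = -\frac{R(K)}{n h^{2}} + h^{3} M_2^{2} R(f'') = 0
\;\Longrightarrow\;
h^{5} = \frac{R(K)}{n\, M_2^{2}\, R(f'')}
\;\Longrightarrow\;
h_{\mathrm{AMISE}} = \left[ \frac{R(K)}{M_2^{2}\, R(f'')} \right]^{1/5} n^{-1/5}.
\]
```

Since $R(f'')$ is unknown, it is replaced by the data-based estimate $S_D(\alpha(h))$, which itself depends on $h$; this is why the bandwidth is obtained by solving an equation rather than from a closed form.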
Figure 3.9: Distributions of the kernel dip

Furthermore, the kernel method is affected by data in the tail of a distribution. For example, suppose f is a gamma density with a long right tail. If the kernel density estimate is computed from the full data set drawn from f, the outlying data points create spurious modes in the tail and increase the kernel dip. Consequently, the accuracy and the power of the unimodality test based on the kernel dip are adversely affected by these spurious modes. In order to avoid this problem, we suppose that the support of f is bounded and, in practice, use only the data lying within l standard deviations.
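The trimming step can be sketched as follows. This is only an illustration: the text does not say where the window is centered, so centering at the sample mean is an assumption, as is the function name `trim_within_sd`.

```python
import numpy as np

def trim_within_sd(data, l=2.0):
    """Keep observations within l sample standard deviations of the sample mean.

    Centering at the sample mean is an assumption made for this sketch.
    """
    data = np.asarray(data, dtype=float)
    center = data.mean()
    spread = data.std(ddof=1)
    return data[np.abs(data - center) <= l * spread]

# Example: a long right tail, as with the gamma distribution mentioned above.
rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.0, scale=1.0, size=500)
trimmed = trim_within_sd(sample, l=2.0)
print(len(sample), len(trimmed))
```

The kernel density estimate and the kernel dip would then be computed from `trimmed` instead of `sample`.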
The kernel dip is a clearer estimate than the empirical dip. Figure 3.9 shows a simple simulation conducted under the same conditions as Figure 2.2 in Chapter 2. When the kernel distribution Fh is unimodal, the kernel dip D(Fh) is exactly zero. Therefore, many of the kernel dips computed from samples of the normal and t distributions are zero, unlike the empirical dips in Figure 2.2. In addition, the kernel dips of data from the bimodal distribution are larger than those of data from the two unimodal distributions, normal and t.
3.3 The kernel dip test
For the kernel distribution function Fh, D(Fh) = 0 means that Fh is unimodal, and we then estimate that the true distribution F is unimodal. On the other hand, a large D(Fh) means that Fh is not unimodal, and we then estimate that F is multimodal. We need to know how large the kernel dip must be to conclude multimodality. Since the asymptotic distribution of D(Fh) is unknown, determining critical values is difficult.
Unlike in the empirical dip test, computing the kernel dip also yields the unimodal function U0 with the smallest total variation distance to the estimated distribution Fh. Under the null hypothesis, we regard the estimated closest unimodal function U0 as the underlying distribution. Therefore, we can draw samples from the unimodal U0 and compute their kernel dips d∗. We then employ Monte Carlo simulation to compute the p-value of the kernel dip d0 of the observed data, defined as P (d0 ≤ d∗|X ). If this p-value is smaller than the significance level α, we reject H0 and conclude that the population distribution is not unimodal.
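The Monte Carlo step can be sketched generically as below. The arguments `draw_null_sample` and `statistic` are hypothetical stand-ins for sampling from the fitted U0 and for computing the kernel dip; neither is implemented here, and the default of 2000 replicates is taken from the simulation setting in Chapter 5.

```python
import numpy as np

def monte_carlo_pvalue(d0, draw_null_sample, statistic, n_rep=2000, seed=0):
    """Estimate P(d0 <= d* | X) by simulating from the fitted null distribution.

    draw_null_sample(rng) -> one simulated sample (hypothetical: a draw from U0)
    statistic(sample)     -> test statistic (hypothetical: the kernel dip)
    """
    rng = np.random.default_rng(seed)
    d_star = np.array([statistic(draw_null_sample(rng)) for _ in range(n_rep)])
    return float(np.mean(d_star >= d0))

# Toy illustration only: a normal null and |mean| as the statistic.
toy_draw = lambda rng: rng.normal(size=50)
toy_stat = lambda sample: abs(sample.mean())
p_toy = monte_carlo_pvalue(0.1, toy_draw, toy_stat, n_rep=500)
print(p_toy)
```

The null hypothesis is rejected when the returned value is below the chosen level α.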
Since our kernel dip test uses the closest unimodal distribution, it has better level accuracy and power than the empirical dip test, which uses the uniform distribution. This advantage is confirmed by the simulation study in Chapter 5.
Chapter 4
The Kernel Excess Mass Test
4.1 The kernel excess mass
Muller and Sawitzki (1991) used the empirical distribution to estimate the excess mass. The excess mass can also be estimated by using the kernel distribution estimator instead of the empirical distribution. The kernel excess mass for $m$ modes and some $\lambda > 0$ is defined by
\[ E_m(\lambda) = \sup_{I_1, \cdots, I_m} \left[ \sum_{j=1}^{m} \{F_h(I_j) - \lambda \|I_j\|\} \right], \]
where the supremum is taken over all sequences $I_1, \cdots, I_m$ of disjoint intervals for the kernel distribution $F_h$. Writing $H_\lambda(I_j) = F_h(I_j) - \lambda \|I_j\|$, we have $E_m(\lambda) = \sup_{I_1, \cdots, I_m} \sum_{j=1}^{m} H_\lambda(I_j)$. We also write the kernel excess mass statistic as
\[ \Delta_m = \max_{\lambda} D_m(\lambda) = \max_{\lambda} \{E_m(\lambda) - E_{m-1}(\lambda)\}. \]
Figure 4.1: The kernel excess mass and the kernel dip of bimodal distribution
The kernel excess mass statistic has properties that differ from those of the empirical statistic in Chapter 2.
Property 1. If a kernel distribution $F_h(x)$ is unimodal, then $\Delta_m = 0$ for $m > 1$.
If $m = 2$, this property is the same as that of the kernel dip. It is an advantage of the kernel method and improves the accuracy of the unimodality test.
Property 2. If a kernel distribution $F_h$ is bimodal, then the kernel excess mass statistic is half of the kernel dip statistic:
\[ \Delta_2 = \max_{\lambda} [E_2(\lambda) - E_1(\lambda)] = \frac{1}{2} \inf_{U:\,\mathrm{unimodal}} \|F_h - U\|_{TV} = \frac{1}{2} D(F_h). \]
This is because $\Delta_2$ measures the minimal excess mass that has to be moved to convert $F_h$ into a unimodal distribution $U$. It can be regarded as a measure of the kernel dip, as in Figure 4.1. However, this property does not hold when $F_h$ has $m\,(> 2)$ modes.
The kernel excess mass statistic and the empirical excess mass statistic have the same limiting distribution under the strong unimodal condition, by Theorem 2.2 and Theorem 4.1 below.
Theorem 4.1. Let $K$ be a kernel of order 2 and choose $h > 0$ of order $h \simeq n^{-1/3}$ under the strong unimodal condition. Then,
\[ P(\Delta_2 > n^{-3/5} u) \longrightarrow P(c_f Z > u) \quad \text{as } n \to \infty, \]
where $c_f = \left\{ \dfrac{f(x_0)^3}{|f''(x_0)|} \right\}^{1/5}$ and
\[ Z = 6^{1/5} \sup_{-\infty < t < \infty} \left[ \sup_{-\infty < y_1 < \cdots < y_4 < \infty} \{\delta(y_1, y_2, t) + \delta(y_3, y_4, t)\} - \sup_{-\infty < y_1 < y_2 < \infty} \delta(y_1, y_2, t) \right]. \]
For $y_1 < y_2$ and $t$, $\delta(y_1, y_2, t) = W(y_2) - W(y_1) - (y_2^3 - y_1^3) + t(y_2 - y_1)$, and $W$ is a standard Wiener process on the real line.
The idea of the proof is similar to that of Theorem 2.2. If we know the asymptotic behavior of $F_h(x)$, the asymptotic property of the kernel excess mass follows easily. Gine and Nickl (2009) showed an exponential inequality for the kernel distribution estimator $F_h(x)$, stated here as Lemma 4.2.
Lemma 4.2. Suppose $F$ has a density $f$ with respect to Lebesgue measure and $f \in C^1(\mathbb{R})$. Let $K$ be a kernel of order 2, and let the bandwidth $h$ converge to 0 as $n \to \infty$ and satisfy $h \ge (\log n / n)$. Then there exist constants $C_1 > 0$ and $C_2 > 0$ such that for all $\lambda \ge C_2 \max\left( \sqrt{h \log(1/h)}, \sqrt{n}\, h^2 \right)$ and $n \ge 1$,
\[ P\left( \sup_x \left| \sqrt{n} (F_h - F_n) \right| > \lambda \right) \le 2 \exp\left\{ -C_1 \min\left( h^{-1} \lambda^2, \sqrt{n} \lambda \right) \right\}. \tag{4.1} \]
For the MISE optimal bandwidth, $h \simeq n^{-1/3}$ is admissible, in which case $\lambda$ satisfies $C_2 n^{-1/6} \sqrt{\log n} \le \lambda \le \sqrt{n}$.
Lemma 4.3. Assume the same conditions as in Lemma 4.2. Then
\[ P\left( \sup_{-\infty < x_1 < x_2 < \infty} \left| n \{(F_h(x_2) - F_h(x_1)) - (F(x_2) - F(x_1))\} - \sqrt{n} \{B(F(x_2)) - B(F(x_1))\} \right| > C_3 \log n + s \right) \le C_4\, n^{(-C_5 s)} \tag{4.2} \]
for all $n \ge 1$ and $s > 0$, where $C_3, C_4, C_5$ are positive constants.
Proof of Lemma 4.3. Komlos et al. (1975) suggested the embedding of $F_n$ in a standard Brownian bridge $B$. For the distribution function $G$ given by $G(t) = t\, I(0 \le t \le 1)$, let $G_n(t)$ be the empirical distribution based on a sample $X_1, \cdots, X_n$ from $G$. Then we can construct $B$ such that
\[ P\left( \sup_{0 \le t \le 1} \left| n \{G_n(t) - G(t)\} - \sqrt{n} B(t) \right| > C_6 \log n + s \right) \le C_7 \exp(-C_8 s) \tag{4.3} \]
for each $s > 0$, where $C_6, C_7, C_8$ are positive constants. Chan and Hall (2010) showed the following inequality by using (4.3). They construct $G_n(t)$ such that $F_n = G_n(F)$. For $s > 0$,
\[ P\left( \sup_{-\infty < x_1 < x_2 < \infty} \left| n \{(F_n(x_2) - F_n(x_1)) - (F(x_2) - F(x_1))\} - \sqrt{n} \{B(F(x_2)) - B(F(x_1))\} \right| > C_6 \log n + s \right) \le C_7\, n^{(-C_8 s)}. \tag{4.4} \]
By (4.4) and Lemma 4.2, one can show inequality (4.2).
We can prove the theorem by using Lemma 4.3 and the idea of Cheng and Hall (1998b).
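Proof of Theorem 4.1. We can rewrite
\[ \Delta_2 = \max_{\lambda} [E_2(\lambda) - E_1(\lambda)] = \max_{\lambda} \left[ \sup_{-\infty < x_1 < \cdots < x_4 < \infty} \{H_\lambda([x_1, x_2]) + H_\lambda([x_3, x_4])\} - \sup_{-\infty < x_1 < x_2 < \infty} H_\lambda([x_1, x_2]) \right]. \]
Our first goal is to show convergence of the right part of $\Delta_2$, $\sup H_\lambda([x_1, x_2])$. Under the strong unimodal condition, we can set $I = (x_0 - n^{-1/5 + \varepsilon_1},\, x_0 + n^{-1/5 + \varepsilon_1})$ and $f(x_0) - \varepsilon_\lambda < \lambda < f(x_0)$ for some small $\varepsilon_1 > 0$ and $\varepsilon_\lambda > 0$. For given $x_1, x_2 \in I$ with $x_1 < x_0 < x_2$,
\[ H_\lambda([x_1, x_2]) = \{f(x_0) - \lambda\}(x_2 - x_1) + \frac{f''(x_0)}{6} (x_2 - x_1)^3 + o\!\left( n^{3(-1/5 + \varepsilon_1)} \right) \]
\[ = \varepsilon_\lambda (x_2 - x_1) + \frac{f''(x_0)}{6} f(x_0)^{-3} \left( F(x_2)^3 - F(x_1)^3 \right) + o\!\left( n^{3(-1/5 + \varepsilon_1)} \right). \tag{4.5} \]
Let $a = -\frac{1}{6} f''(x_0) f(x_0)^{-3}$ (so that $a^{-1/5} = c_f$) and $y_i = a^{2/5} n^{1/5} F(x_i)$ for $i = 1, 2$. Then (4.5) is represented as $-a^{1/5} (y_2^3 - y_1^3) + o_p(1)$. It is known that the scaled process $\sqrt{c}\, W(t/c)$ is a Wiener process for all $c > 0$. Thus, $W(y_i) = a^{1/5} n^{1/10} W(F(x_i))$ is also a Wiener process. We can also write
\[ n^{3/5} \{B(F(x_2)) - B(F(x_1))\} = a^{-1/5} \{W(y_2) - W(y_1) - t_\lambda (y_2 - y_1)\} \tag{4.6} \]
for some $t_\lambda$. In addition, we similarly obtain convergence of the left part of $\Delta_2$. Therefore Lemma 4.3 and equations (4.5) and (4.6) give the following result:
\[ n^{3/5} \Delta_2 = a^{-1/5} \sup_{-\infty < t < \infty} \left[ \sup_{-\infty < y_1 < \cdots < y_4 < \infty} \{\delta(y_1, y_2, t) + \delta(y_3, y_4, t)\} - \sup_{-\infty < y_1 < y_2 < \infty} \delta(y_1, y_2, t) \right] + o_p(1). \]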
4.2 Computing the kernel excess mass
1) Computing the excess mass estimation
The excess mass is defined as $E_m(\lambda) = \sup_{I_1, \cdots, I_m} \sum_{j=1}^{m} H_\lambda(I_j)$, with $H_\lambda(I = [x_1, x_2]) = F(x_2) - F(x_1) - \lambda(x_2 - x_1)$. To calculate the excess mass, we have to find the disjoint connected sets $\{I_j : j = 1, \cdots, m\}$ that attain the supremum. If $F$ is unimodal, $D_2(\lambda) = E_2(\lambda) - E_1(\lambda)$ is zero for any $\lambda$. Consequently, we only need a computation algorithm for the excess mass when $F$ is multimodal. Since we use the kernel method, we assume that the distribution function $F$ is smooth. We therefore compute $\Delta_2 = \max_{\lambda} [E_2(\lambda) - E_1(\lambda)]$ under the assumption that $F$ is multimodal and smooth.
Figure 4.2: Intervals for calculating kernel excess mass
To compute $\Delta_2$, we have to calculate $D_2(\lambda) = E_2(\lambda) - E_1(\lambda)$ for fixed $\lambda$ and find $\max_{\lambda} D_2(\lambda)$. If the number ($l$) of meeting points between $f(x)$ and $\lambda$ is at most two, then $D_2(\lambda) = 0$. If $l > 2$, we denote the meeting points by $x_1, \cdots, x_l$ and set the disjoint intervals as $[x_1, x_2], \cdots, [x_{l-1}, x_l]$. Each of these intervals contains a mode or an antimode. To be specific, $[x_1, x_l]$ can be written as $(M_1 \cup A_1 \cup \cdots \cup A_{m-1} \cup M_m)$, where $M_i$ ($1 \le i \le m$), $m = [\frac{l}{2}]$, are intervals including modes and $A_j$ ($1 \le j \le m - 1$) are intervals including antimodes, as in Figure 4.2.
If $H_\lambda(M_i)$ and $H_\lambda(A_j)$ are known, it is convenient to calculate $E_1(\lambda)$ and $E_2(\lambda)$. We compute $H_\lambda(M_i)$ and $H_\lambda(A_j)$ for all $i = 1, \cdots, m$ and $j = 1, \cdots, m - 1$. Then $E_1(\lambda)$ is obtained as
\[ E_1(\lambda) = \max\left\{ H_\lambda(M_k),\ \left( \sum_{i=a}^{b} H_\lambda(M_i) + \sum_{i=a}^{b-1} H_\lambda(A_i) \right) : 1 \le k \le m \text{ and } 1 \le a < b \le m \right\}. \]
$E_2(\lambda)$ is obtained from both $H_\lambda(I_1)$ and $H_\lambda(I_2)$, where $I_1 \subset (M_1 \cup A_1 \cup \cdots \cup M_a)$ and $I_2 \subset (M_{a+1} \cup A_{a+1} \cup \cdots \cup M_m)$ for some $1 \le a < m - 1$. The maximum of $H_\lambda(I_1)$ for fixed $a$ is computed as follows.
If $a = 1$, then $H_\lambda(I_1) = H_\lambda(M_1)$. If $a > 1$, on the other hand,
\[ H_\lambda(I_1) = \max\left\{ H_\lambda(M_a),\ \left( \sum_{i=a-b+1}^{a} H_\lambda(M_i) + \sum_{i=a-b+1}^{a-1} H_\lambda(A_i) \right) : 2 \le b \le a \right\}. \]
The computation of $H_\lambda(I_2)$ is similar to that of $E_1(\lambda)$.
If $m - a = 1$, then $H_\lambda(I_2) = H_\lambda(M_m)$. Otherwise,
\[ H_\lambda(I_2) = \max\left\{ H_\lambda(M_k),\ \left( \sum_{i=b}^{c} H_\lambda(M_i) + \sum_{i=b}^{c-1} H_\lambda(A_i) \right) : a < k \le m \text{ and } a < b < c \le m \right\}. \]
Finally, $E_2(\lambda)$ is calculated as $E_2(\lambda) = \max_a (H_\lambda(I_1) + H_\lambda(I_2))$.
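A grid-based sketch of $D_2(\lambda)$ and $\Delta_2$ is given below. It does not enumerate the intervals $M_i$ and $A_j$ as described above; instead it swaps in a maximum-subarray search over grid cells, which should agree with that enumeration for a smooth density up to grid error. The function names, the grid and the number of levels are illustrative assumptions.

```python
import numpy as np

def best_run_in_prefix(w):
    """best[t] = largest sum of one contiguous run inside w[:t+1] (0 if none is positive)."""
    c = np.concatenate(([0.0], np.cumsum(w)))           # c[j] = sum of w[:j]
    run_end = c[1:] - np.minimum.accumulate(c)[:-1]     # best run ending exactly at cell t
    return np.maximum(np.maximum.accumulate(run_end), 0.0)

def excess_mass_difference(x, f, lam):
    """D2(lam) = E2(lam) - E1(lam) for a density f tabulated on a uniform grid x."""
    w = (f - lam) * (x[1] - x[0])                       # H_lam contribution of each cell
    left = best_run_in_prefix(w)                        # best interval within x[:t+1]
    right = best_run_in_prefix(w[::-1])[::-1]           # best interval within x[t:]
    e1 = left[-1]                                       # one interval
    e2 = np.max(left[:-1] + right[1:])                  # two disjoint intervals
    return max(float(e2 - e1), 0.0)

def delta2(x, f, n_levels=200):
    """Maximize D2(lam) over an equally spaced grid of levels (an assumed search grid)."""
    levels = np.linspace(0.0, f.max(), n_levels)
    return max(excess_mass_difference(x, f, lam) for lam in levels)

def normal_pdf(x, mu=0.0, sd=1.0):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

x = np.linspace(-6.0, 9.0, 1501)
f_uni = normal_pdf(x)
f_bi = 0.5 * normal_pdf(x) + 0.5 * normal_pdf(x, mu=3.0)
print(delta2(x, f_uni), delta2(x, f_bi))
```

For the unimodal density the statistic is zero up to rounding, in line with the remark that $D_2(\lambda) = 0$ whenever $F$ is unimodal, while the bimodal mixture gives a positive value.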
2) The kernel excess mass estimation
The kernel excess mass estimator, like the kernel dip estimator, depends on the bandwidth selection. This estimator uses the kernel distribution Fh, unlike the kernel dip, which uses the kernel density estimator fh. Bandwidth selection for estimating a distribution function is slightly different from that for estimating a density.

Figure 4.3: Distributions of kernel excess mass
Altman and Leger (1995) showed that the global performance of $F_h(x)$ as an estimator of $F(x)$ can be measured in terms of the mean integrated squared error (MISE),
\[ \mathrm{MISE}(F_h) = E \int \{F_h(x) - F(x)\}^2\, dF(x). \]
They choose the bandwidth $h$ minimising $\mathrm{MISE}(F_h)$:
\[ h_{opt} = \left[ \frac{2 M_1(K) D_1(F)}{M_2(K)^2 D_2(F)} \right]^{1/3} n^{-1/3}, \]
where $M_1 = \int x K(x) L(x)\,dx$, $M_2 = \int x^2 K(x)\,dx$, $D_1(F) = \int f(x)^2\,dx$ and $D_2(F) = \int f'(x)^2 f(x)\,dx$.
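For concreteness, a kernel distribution estimator with a Gaussian kernel can be sketched as below. The bandwidth in the example only illustrates the $n^{-1/3}$ rate: using the sample standard deviation as the constant is an assumption of this sketch, not the plug-in rule of Altman and Leger (1995).

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def kernel_cdf(x, data, h):
    """F_h(x) = (1/n) * sum_i Phi((x - X_i) / h), with Phi the standard normal CDF."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    z = (x[:, None] - data[None, :]) / h
    return (0.5 * (1.0 + _erf(z / math.sqrt(2.0)))).mean(axis=1)

rng = np.random.default_rng(1)
data = rng.normal(size=200)
# Rate n^(-1/3); the constant (sample standard deviation) is illustrative only.
h = data.std(ddof=1) * len(data) ** (-1.0 / 3.0)
grid = np.linspace(-4.0, 4.0, 81)
Fh = kernel_cdf(grid, data, h)
print(Fh[0], Fh[40], Fh[-1])
```

The estimate is a smooth, nondecreasing function of x, which is what allows modes and antimodes of its derivative to be located.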
This kernel estimator gives a clearer excess mass than the empirical estimator. To compare the two excess mass estimators, consider Figure 4.3 and Figure 2.2. Many kernel distribution estimates from unimodal distributions, for instance the normal and t distributions, are unimodal, and their kernel excess masses are zero in Figure 4.3. This property is similar to that of the kernel dip (Figure 3.9). It is an advantage of the kernel method.
4.3 The kernel excess mass test
1) The Kernel excess mass test with uniform distribution
Figure 4.3 shows that $P[\Delta_{2,f} \le \kappa]$ is less than $P[\Delta_{2,u} \le \kappa]$, where $u$ is the standard uniform distribution $U[0, 1]$ and $f$ is a unimodal distribution. We use the uniform distribution to calculate critical values $\kappa^*_\alpha$ for $\Delta_2$, as in the test of Muller and Sawitzki (1991). In fact, this test is also conservative, like the empirical excess mass test with the uniform distribution.
2) Calibrating excess mass test
Under the null hypothesis that the true density $f$ is unimodal, the distribution of the kernel excess mass $\Delta_{2,f}$ is independent of unknowns except for a factor $c_f = \left\{ \dfrac{f(x_0)^3}{|f''(x_0)|} \right\}^{1/5}$, by Theorem 4.1. This idea is analogous to the empirical version used to overcome the conservatism of the dip test. We can also obtain calibrating densities by using the algorithm for calibrating the empirical test.
If there exists a consistent excess mass estimate of the underlying distribution of the data and another distribution, $g(\cdot)$, such that $c_f = c_g$, then we can use the distribution of $\Delta_{2,g}$ instead of $\Delta_{2,f}$. To begin with, consider $d_g = c_g^{-5} = \dfrac{|g''(x_0)|}{g(x_0)^3}$. Cheng and Hall (1998a) show that the range of $d_g$ is covered by three classes of distributions.
• If $g(\cdot)$ is the beta distribution $g_\beta(x)$:
\[ g_\beta(x) = \frac{[x(1 - x)]^{\beta - 1}}{B(\beta, \beta)} \quad \text{for } 0 < x < 1,\ \beta > 1, \]
then $d_g = \gamma(\beta) = 2^{4\beta - 1} (\beta - 1) B(\beta, \beta)^2 \in [0, 2\pi)$.
• If $g(\cdot)$ is the normal distribution, then $d_g = \gamma(\beta) = 2\pi$.
• If $g(\cdot)$ is the rescaled Student t distribution $g_\beta(x)$:
\[ g_\beta(x) = \frac{1}{B(\beta - \frac{1}{2}, \frac{1}{2})} \frac{1}{(1 + x^2)^{\beta}} \quad \text{for } -\infty < x < \infty,\ \beta > \tfrac{1}{2}, \]
then $d_g = \gamma(\beta) = 2\beta B(\beta - \frac{1}{2}, \frac{1}{2})^2 \in (2\pi, \infty)$.
Consequently, we test with these classes by the following algorithm:
(i) Estimate $d_f$ by $\hat{d}_f = \dfrac{|f''_{h_1}(x_0)|}{f_{h_2}(x_0)^3}$, where $f$ is the Gaussian kernel estimate, $f''$ is the second-order derivative of the Gaussian kernel estimate and $x_0 = \arg\max(f)$. We take $h_1 = \left( \dfrac{1}{2n^2} \right)^{1/10} \sigma$ and $h_2 = \left( \dfrac{5}{12\sqrt{2}\,n} \right)^{1/9} \sigma$. This simple bandwidth selection is proposed by Chan and Hall (2010).
(ii) Find $\beta = \gamma^{-1}(d_g)$ by the above calibration densities in the three classes.
(iii) Conditional on these values of $d_g$ and $\beta$, draw a sample from the corresponding (beta, normal, or Student t) distribution and compute $\Delta^*_{2,g}$.
(iv) Finally, employ Monte Carlo methods to compute $P(\Delta^*_{2,g} > \Delta_{2,f} \mid \chi)$.
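Step (ii) of the algorithm, choosing the calibrating family and inverting $\gamma$, can be sketched as follows. The two $\gamma(\beta)$ formulas are taken from the list above; the bisection bounds, the tolerance and the function names are assumptions of this sketch, which relies on $\gamma$ increasing in $\beta$ for the beta class and decreasing for the Student t class.

```python
import math

def log_beta(a, b):
    """log B(a, b) via log-gamma, to avoid overflow for large arguments."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def gamma_beta_family(beta):
    """gamma(beta) = 2^(4*beta - 1) * (beta - 1) * B(beta, beta)^2, for beta > 1."""
    return math.exp((4 * beta - 1) * math.log(2) + math.log(beta - 1)
                    + 2 * log_beta(beta, beta))

def gamma_t_family(beta):
    """gamma(beta) = 2 * beta * B(beta - 1/2, 1/2)^2, for beta > 1/2."""
    return 2 * beta * math.exp(2 * log_beta(beta - 0.5, 0.5))

def bisect(func, target, lo, hi, increasing, n_iter=200):
    """Solve func(beta) = target on [lo, hi] for a monotone func."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def calibrating_family(d, tol=1e-8):
    """Return (family, beta) with gamma(beta) = d; bounds 1e4 and tol are assumed."""
    if abs(d - 2 * math.pi) < tol:
        return "normal", None
    if d < 2 * math.pi:
        return "beta", bisect(gamma_beta_family, d, 1.0 + 1e-12, 1e4, increasing=True)
    return "t", bisect(gamma_t_family, d, 0.5 + 1e-12, 1e4, increasing=False)

print(calibrating_family(128 / 36))      # beta family
print(calibrating_family(math.pi ** 2))  # Student t family
```

Values of d extremely close to 2π on either side map to very large β, where both families are nearly normal, so the upper bound of the search mainly affects that limiting case.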
In the numerical study that follows, we simulate this test and the empirical excess mass test based on the calibration method. The simulation study of the calibrated tests shows that the kernel excess mass test performs better than the empirical excess mass test.
Chapter 5
Numerical Study
5.1 Simulation 1
We compare the accuracy of the kernel dip test and the kernel excess mass test with the uniform distribution to the accuracy of the empirical dip test and Silverman's test. We illustrate performance for samples of size n = 100 and n = 500 from 9 different distributions: 3 unimodal distributions and 6 multimodal distributions. The unimodal distributions are the normal distribution N(0, 1), the t(6) distribution with a sharp mode, and the asymmetric beta distribution β(3, 4). The multimodal distributions have 2, 3, or 4 modes, and some of them are skewed. To calculate the accuracy rate, we draw 500 samples for each distribution and sample size.
Figure 5.1: Unimodal distributions : N(0, 1), t(6) and β(3, 4)

The newly suggested kernel unimodality tests and Silverman's test assume bounded support to avoid spurious modes. Thus we conduct the tests on data lying within l standard deviations, and find that l = 2 gives the best result among the examined values l = 1, 1.5, 2, 2.5. Each p-value of the four tests is computed from a Monte Carlo simulation with 2000 replicates.
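The accuracy rates reported in this chapter are proportions of rejections over the simulated samples. A minimal sketch of that bookkeeping, assuming the p-values of one test over the 500 samples are already available, is given below; the function name and the example values are hypothetical.

```python
import numpy as np

def rejection_rates(p_values, alphas=(0.05, 0.10, 0.15)):
    """Fraction of samples whose p-value falls below each nominal level."""
    p_values = np.asarray(p_values, dtype=float)
    return {alpha: float(np.mean(p_values < alpha)) for alpha in alphas}

# Hypothetical p-values from four simulated samples.
example = rejection_rates([0.01, 0.07, 0.12, 0.50])
print(example)
```

Under a unimodal sampling distribution these proportions estimate the true level, and under a multimodal one they estimate the power.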
1) Unimodal Cases
We draw data from the three distributions in Figure 5.1 and perform the four unimodality tests on them. Table 5.1 reports estimates of the true levels of the unimodality tests for a variety of nominal levels. Tests on data from the Student t distribution with a sharp mode have smaller type I error than tests on data from the normal distribution. On the other hand, tests on data from the beta distribution β(3, 4) have larger type I error. Overall, the true level of every test is smaller than the nominal level. In particular, the kernel excess mass test with the uniform distribution has true level zero at all nominal levels in the large sample, n = 500. This is because the kernel excess mass of unimodal distributions other than the uniform distribution is zero or very small, as in Figure 4.4 in Chapter 4. This conservatism can be solved by calibrating the kernel excess mass test. In addition, the kernel dip test also gives true levels smaller than the nominal levels, because the kernel test statistics in the unimodal cases are almost zero.

N(0, 1)                      n = 100                 n = 500
α                     0.05    0.10    0.15    0.05    0.10    0.15
Kernel Dip            0.004   0.020   0.028   0.002   0.014   0.022
Kernel Excess Mass    0.002   0.004   0.008   0.000   0.000   0.000
Empirical Dip         0.000   0.004   0.016   0.000   0.000   0.000
Silverman             0.020   0.040   0.076   0.014   0.032   0.078

t(6)                         n = 100                 n = 500
α                     0.05    0.10    0.15    0.05    0.10    0.15
Kernel Dip            0.006   0.022   0.040   0.002   0.010   0.018
Kernel Excess Mass    0.000   0.000   0.000   0.000   0.000   0.000
Empirical Dip         0.002   0.008   0.008   0.000   0.000   0.000
Silverman             0.006   0.022   0.050   0.006   0.024   0.044

β(3, 4)                      n = 100                 n = 500
α                     0.05    0.10    0.15    0.05    0.10    0.15
Kernel Dip            0.012   0.014   0.038   0.006   0.012   0.030
Kernel Excess Mass    0.002   0.006   0.010   0.000   0.000   0.000
Empirical Dip         0.006   0.010   0.014   0.000   0.002   0.002
Silverman             0.026   0.076   0.126   0.026   0.054   0.106

Table 5.1: Tests for unimodal distributions

Figure 5.2: Tests for bimodal distributions
2) Multimodal Cases
The following figures show the powers of the four tests against normal mixture distributions. One of the bimodal distributions is a mixture of N(0, 1) and N(3, 1). The other is a mixture of N(0, 12) and N(2, 1) with unbalanced weights. We also consider multimodal distributions that are normal mixtures with 3 or 4 modes. Figure 5.2 shows the results of the tests for the bimodal cases. Although Silverman's test has the greatest power, our new kernel tests also have greater power than the empirical dip test. Moreover, the power of the kernel dip test against multimodal distributions with three or more modes is the greatest in the large sample case, n = 500. In particular, the kernel dip test has large power for unbalanced mixture distributions in Figure 5.3. In contrast, the kernel excess mass test with the uniform distribution has low power, like the empirical dip (excess mass) test. To sum up, our kernel dip test has good power in a variety of situations, though it is somewhat conservative.

Figure 5.3: Tests for multimodal distributions
5.2 Simulation 2 : Calibration tests
We now compare the accuracy of the calibrated versions of the kernel excess mass test, the empirical excess mass test and Silverman's test. As in Simulation 1, we illustrate performance for samples of size n = 100 and n = 500 from the 9 different distributions. Furthermore, we draw 500 samples for each distribution and sample size to calculate the accuracy rate and the power.
1) Unimodal Cases
The first panel in each row of Figure 5.4 depicts the respective unimodal sampling density, and the next two panels show the level accuracy for the small sample n = 100 and the large sample n = 500. The sharper the mode, the more conservative all calibrated tests are. The actual level of our calibrated kernel excess mass test respects the nominal level better than that of the calibrated empirical excess mass test in all cases except sampling from the beta distribution. Silverman's test has better accuracy than the two excess mass tests for the normal and t distributions. On the other hand, the two excess mass tests perform better for the beta distribution. The excess mass tests are more conservative in all large sample cases, n = 500, because the excess mass of the estimated distribution is zero or small for large n.

Figure 5.4: Calibrating tests for unimodal distributions

Figure 5.5: Calibrating tests for bimodal distributions
2) Multimodal Cases
Figure 5.5 and Figure 5.6 display the powers against the bimodal distributions and the multimodal distributions with three or more modes, respectively. The first panel in each row of these figures is the sampling density, which is the same multimodal distribution as in the simulation study in Section 5.1. The middle and right panels plot the approximate probability of rejecting unimodality at the nominal level for the small (n = 100) and large (n = 500) samples. Calibration of the kernel excess mass produces tests with greater power than the other calibrated tests for all multimodal distributions. Although the calibrated kernel excess mass test is a little conservative when sampling from unimodal distributions, it outperforms the other tests for data from multimodal distributions. In particular, the kernel excess mass test has the greatest power at levels above 0.05 for large samples in the symmetric two-component normal mixture of Figure 5.5 and the multimodal distributions of Figure 5.6.

Figure 5.6: Calibrating tests for multimodal distributions

The numerical results reported in Figures 5.4-5.6 indicate that our calibrated kernel excess mass test performs better than the empirical excess mass test as well as Silverman's test. It has good level accuracy in a wide variety of situations, for unimodal distributions with sharp or flat modes. Moreover, it has greater power than the other calibrated tests in many multimodal cases.
5.3 Real data analysis
In astronomy, an important problem is whether the distribution of a data set with white noise is unimodal or multimodal. Feigelson and Babu (2012) introduced parametric and nonparametric inference for tests of multimodality in astronomy. Furthermore, Peixinho et al. (2012) and Peixinho et al. (2003) analyzed the B-R color distribution using the empirical dip test. The empirical dip test is a well-known unimodality test and its R package is useful. However, many studies, including ours, indicate that the dip test is conservative and has low power. Therefore we apply our kernel unimodality tests to the analysis of the B-R color distribution.
1) Visible Colors of Centaurs and TNOs
The discovery and the orbital and physical characterization of minor planets in the solar system are ongoing. The minor planets consist of Trans-Neptunian Objects (TNOs) and Centaurs. TNOs in turn constitute a system including Kuiper Belt Objects (KBOs), scattered disc objects (SDOs) and other objects. The color index B-R of a planet is an important indicator of its surface composition. Surface compositions, such as gas, cloud or galaxy, affect the temperature of a star and the variability of color. Therefore the color index is useful in many studies of chemical gradients, surface processing history and formation magnitudes of minor planets in the solar system. Peixinho et al. (2012) addressed the issue of the color distributions of 253 Centaurs and TNOs. They studied the B-R color and the absolute magnitude $H_R$, as a proxy for size, with the implicit assumption that surface colors are independent of dynamical classification. They verified that $H_R$ and the diameter of minor planets in the solar system correlate very strongly. Thus they consider $H_R$ as the size of a planet. Romanishin et al. (2010) also use a nonparametric statistical test for studying the B-R color distribution. However, we concentrate on the analysis related to the unimodality test in this paper.

Figure 5.7: Estimated distributions for Centaurs and TNOs
2) Unimodality test for Centaurs and TNOs
Peixinho et al. (2012) tested the null hypothesis that the B-R index of all objects (253 Centaurs and TNOs) is consistent with a unimodal distribution. Although Figure 5.7 shows that the estimated full-sample distribution has two peaks, they argued that the distribution of B-R color is unimodal based on the empirical dip test. On the other hand, our calibrated kernel excess mass test has a very small p-value, nearly zero, and the kernel dip test also has a smaller p-value than the empirical dip test. In terms of the calibrated kernel test, there is strong evidence against unimodality of the full color sample distribution. In Table 5.2, the kernel and empirical dip tests show no strong evidence against unimodality for the sample with Centaurs removed (n = 224), while the kernel excess mass test shows strong evidence against unimodality. However, the p-values of the three tests in the Centaurs population (n = 29) are very small, and its distribution can be estimated as multimodal.

Objects (n)               Empirical Dip   Ker. Dip   C. Ker. Ex. Mass
All Objects (253)         0.1698          0.0890     0.0000
Without Centaurs (224)    0.4099          0.3985     0.0255
Centaurs (29)             0.0160          0.0020     0.0000

Table 5.2: Unimodality tests for Centaurs and TNOs
Figure 5.8: B-R versus $H_R$ for Centaurs and TNOs

Next, we perform not only the empirical dip test but also the new kernel tests in three groups based on the magnitude $H_R$. Figure 5.8 plots all 253 Centaurs and TNOs by B-R color index versus $H_R$. These points form an N shape with an apparent double bimodality in color. Peixinho et al. (2012) suggested the cutoffs $H_{R:\mathrm{up}}$ and $H_{R:\mathrm{low}}$ separating small, large, and intermediate objects based on the empirical dip test. First, they performed an iterative test with $H_{R:\mathrm{up}}$ starting at the maximum of $H_R$ and decreasing in steps of 0.1 mag. On detecting the minimum p-value, they stopped shifting $H_{R:\mathrm{up}}$, at 6.8. They also found a cutoff limit $H_{R:\mathrm{low}}$ for large objects, starting at the minimum $H_R$ and shifting in steps of 0.1 mag.
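The iterative search for the upper cutoff can be sketched as below. This is a loose illustration of the procedure described above, not the code of Peixinho et al. (2012): `p_value` is a hypothetical callable standing in for any unimodality test, `min_size` is an added safeguard, and taking the global minimum over the whole scan is an assumption about the stopping rule.

```python
import numpy as np

def scan_upper_cutoff(h_r, color, p_value, step=0.1, min_size=10):
    """Lower the cutoff from max(H_R) in fixed steps; return (cutoff, p) with the smallest p.

    p_value(colors) is a hypothetical unimodality test applied to the colors
    of the objects with H_R >= cutoff.
    """
    h_r = np.asarray(h_r, dtype=float)
    color = np.asarray(color, dtype=float)
    results = []
    cutoff = h_r.max()
    while cutoff > h_r.min():
        subset = color[h_r >= cutoff]
        if subset.size >= min_size:          # skip groups too small to test
            results.append((round(cutoff, 10), p_value(subset)))
        cutoff -= step
    return min(results, key=lambda pair: pair[1]) if results else None
```

The lower cutoff would be scanned in the same way, starting from the minimum of $H_R$ and moving upward with the comparison reversed.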
Finally, they divided all the data into three groups and performed the dip test for each group, as in Table 5.3. The two kernel tests, as well as the empirical dip test, show evidence of multimodality for the small and large object groups. Nevertheless, the empirical dip test and our new kernel tests give different results for the intermediate size group. The p-values of the kernel dip test and the kernel excess mass test are smaller than the p-value of the dip test. The calibrated kernel excess mass test shows evidence against color unimodality of intermediate size objects. In particular, the new kernel tests also show evidence against unimodality of intermediate objects without Centaurs in Table 5.4.

Figure 5.9: Estimated distributions of three groups

Objects (n)              Empirical Dip   Ker. Dip   C. Ker. Ex. Mass
HR ≤ 5 (38)              0.0254          0.0235     0.0010
HR ≥ 6.8 (124)           0.0025          0.0000     0.0000
5 < HR < 6.8 (91)        0.9820          0.1260     0.0420

Table 5.3: Unimodality tests for three groups

Figure 5.10: Estimated distributions of three groups without Centaurs

Objects (n)              Empirical Dip   Ker. Dip   C. Ker. Ex. Mass
HR ≤ 5 (38)              0.0254          0.0235     0.0010
HR ≥ 6.8 (98)            0.0435          0.0075     0.0000
5 < HR < 6.8 (88)        0.9459          0.0570     0.0475

Table 5.4: Unimodality tests for three groups without Centaurs

Figure 5.11: B-R distributions of data HR > HR:up and HR < HR:low
Therefore we propose new criteria for dividing the magnitude HR based on our new kernel tests. Unlike the study of Peixinho et al. (2012), we calculate the kernel dip or excess mass statistics to obtain HR:up and HR:low. Kernel densities of objects above the cutoff HR:up, decreasing from the maximum value, are always bimodal (left graph in Figure 5.11). Thus, it is difficult to determine the cutoff value, because the p-values of our new kernel tests are zero or very small. By contrast, kernel densities of objects below the cutoff HR:low, increasing from the minimum value, are unimodal or multimodal (right graph in Figure 5.11). We first determine the cutoff line for large objects using the maximum test statistic of the sample below HR:low. We then find the upper cutoff line for middle size objects: HR:up is the value of HR at which the test statistic of the color index between HR:low and HR:up changes sharply as HR:up decreases from the maximum. In terms of the kernel dip statistic, we obtain the criteria HR:low = 5.9 and HR:up = 7.7. These values are higher than those from the empirical dip test. In Figure 5.12, the distributions of large objects (HR ≤ 5.9) and small objects (HR ≥ 7.7) are multimodal and that of intermediate objects (5.9 < HR < 7.7) is unimodal. The kernel excess mass statistic gives the same criterion HR:low as the dip test, but HR:up from the kernel excess mass is larger than that of the dip test in Figure 5.13. The kernel excess mass test shows evidence against unimodality of small and large objects as well as against multimodality of middle objects. Consequently, we recommend our kernel tests for separating planets based on the unimodality test.
Figure 5.12: Kernel dip test for TNOs
Figure 5.13: Calibration Kernel Excess Mass test for TNOs
Chapter 6
Conclusion
In this thesis, we propose kernel methods for unimodality testing: the kernel dip test and the kernel excess mass test. The computation of the kernel dip statistic uses theorems on the total variation distance established in our study. In addition, this calculation gives the closest unimodal distribution to the sample distribution under test. The kernel dip test takes this unimodal distribution as the null distribution of the kernel dip statistic under the unimodality assumption. The first simulation study shows that the kernel dip test performs better than the empirical dip test. In multimodal cases with three or more modes and large samples, the kernel dip test has greater power than not only the empirical dip test but also Silverman's test.
Furthermore, we show that the kernel excess mass statistic and the empirical excess mass statistic have the same limiting distribution, and we construct the calibrated kernel excess mass test. Our new calibrated kernel test also has the greatest power among the calibrated tests compared in the simulation study.
We will apply the new kernel unimodality tests to several analyses as well as to astronomy data. It remains to develop a kernel dip test with less conservatism and to extend these tests to multivariate cases.
References
N. Altman and C. Leger. Bandwidth selection for kernel distribution function
estimation. Journal of Statistical Planning and Inference, 46(2):195–214,
1995.
Y.-B. Chan and P. Hall. Using evidence of mixed populations to select vari-
ables for clustering very high-dimensional data. Journal of the American
Statistical Association, 105(490):798–809, 2010.
M. Y. Cheng and P. Hall. Calibrating the excess mass and dip tests of modality.
Journal of the Royal Statistical Society, Series B, 60(3):579–589, 1998a.
M. Y. Cheng and P. Hall. On mode testing and empirical approximations to
distributions. Statistics & Probability Letters, 39(3):245–254, 1998b.
D. R. Cox. Notes on the analysis of mixed frequency distributions. British
Journal of Mathematical and Statistical Psychology, 19(1):39–47, 1966.
L. Devroye and L. Gyorfi. Nonparametric Density Estimation: the L1 view.
Wiley, 1985.
E. D. Feigelson and G. J. Babu. Modern Statistical Methods for Astronomy
With R Applications. Cambridge, 2012.
E. Gine and R. Nickl. An exponential inequality for the distribution function
of the kernel density estimator, with applications to adaptive estimation.
Probability Theory and Related Fields, 143(3-4):569–596, 2009.
P. Hall and M. York. On the calibration of Silverman's test for multimodality.
Statistica Sinica, 11:515–536, 2001.
J. A. Hartigan and P. M. Hartigan. The dip test of unimodality. The Annals
of Statistics, 13(1):70–84, 1985.
J. Komlos, P. Major, and G. Tusnady. An approximation of partial sums of
independent RV'-s, and the sample DF. I. Zeitschrift fur
Wahrscheinlichkeitstheorie und Verwandte Gebiete, 32(1-2):111–131, 1975.
D. A. Levin, Y. Peres, and E. L. Wilmer. Markov Chains and Mixing Times.
American Mathematical Society, 2008.
E. Mammen, J. S. Marron, and N. I. Fisher. Some asymptotics for multi-
modality tests based on kernel density estimates. Probability Theory and
Related Fields, 91(1):115–132, 1992.
D. W. Muller and G. Sawitzki. Excess mass estimates and tests for multi-
modality. Journal of the American Statistical Association, 86(415):738–746,
1991.
B. U. Park and J. S. Marron. Comparison of data-driven bandwidth selectors.
Journal of the American Statistical Association, 85(409):66–72, 1990.
N. Peixinho, A. Doressoundiram, A. Delsanti, H. Boehnhardt, A. Barucci, and
I. Belskaya. Reopening the TNOs color controversy: Centaurs bimodality
and TNOs unimodality. Astronomy & Astrophysics, 410(3):L29–L32, 2003.
N. Peixinho, A. Delsanti, A. Guilbert-Lepoutre, R. Gafeira, and P. Lacerda.
The bimodal colors of Centaurs and small Kuiper belt objects. Astronomy &
Astrophysics, 546:A86, 2012.
W. Romanishin, S. C. Tegler, and G. J. Consolmagno. Colors of inner disk
classical Kuiper belt objects. The Astronomical Journal, 140(1):29–33, 2010.
S. J. Sheather and M. C. Jones. A reliable data-based bandwidth selection
method for kernel density estimation. Journal of the Royal Statistical Soci-
ety, Series B, 53(3):683–690, 1991.
B. W. Silverman. Using kernel density estimates to investigate multimodality.
Journal of the Royal Statistical Society, Series B, 43(1):97–99, 1981.
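Abstract (Korean)
Testing the characteristics of a distribution through a unimodality test is a very useful method for classification analysis. Well-known nonparametric unimodality tests are the dip test and the excess mass test, which use the empirical distribution function, and Silverman's test, which uses the bandwidth. In this study, we propose a dip test and an excess mass test that use the kernel distribution function instead of the empirical distribution function, and we examine their properties.
The existing dip test based on the empirical distribution function is defined in terms of the supremum distance, which causes several problems in computing the dip. In this study, we present a dip test based on the total variation distance, which remedies the problems arising in the existing test. In addition, by using the kernel estimator, we improve accuracy and power over the existing test that uses the empirical estimator, as confirmed through simulation.
We also propose an excess mass test statistic based on the kernel distribution function, and confirm that the new test statistic has better power than the existing excess mass test statistic based on the empirical distribution function. In particular, we prove that the two excess mass test statistics have the same asymptotic properties. Using this result, we propose a calibrated excess mass test and show through experiments that the kernel test has greater accuracy than the existing test.
We apply the new unimodality tests to astronomical data and confirm that the kernel tests are also valid in real data analysis. In astronomy, the color of a planet is an important variable that explains the formation and the stage of evolution of the planet. Planets are classified according to the characteristics of their color distribution in order to examine their properties, and a unimodality test of the planetary color distribution can be applied for this classification. In this study, we use the newly proposed tests based on kernel estimators to present a better criterion for analysis.
Keywords : Dip test, excess mass test, kernel estimator, kernel distribution function, unimodal function, unimodality test
Student Number : 2010-30925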