Examples of Hilbert spaces
Lecture 11A.
MA 751 Part 4
Measurability and Hilbert Spaces
1. Some set theory in $\mathbb{R}^p$:
Def. 1: A ball in $\mathbb{R}^p$ of radius $\varepsilon > 0$ with center $\mathbf{a} \in \mathbb{R}^p$ is a set
$$B = \{\mathbf{x} \in \mathbb{R}^p : \|\mathbf{x} - \mathbf{a}\| < \varepsilon\}$$
of points within $\varepsilon$ of the fixed point $\mathbf{a}$.
Def. 2: A set $G \subset \mathbb{R}^p$ is open if it is a union of balls in $\mathbb{R}^p$.
Def. 3: Given a set $G \subset \mathbb{R}^p$, the boundary $\partial G$ of $G$ is the set of all points $\mathbf{x} \in \mathbb{R}^p$ such that every ball centered at $\mathbf{x}$ contains points in $G$ and also in the complement $\sim G$.
[An open set can also be defined as a set that contains none of its boundary points]
Def. 4: A set $F \subset \mathbb{R}^p$ is closed if it contains its boundary.
[A closed set can also be defined as a set whose complement in $\mathbb{R}^p$ is open]
Def. 5: A set $R \subset \mathbb{R}^p$ is a region if it consists of an open set $G$ together with some part (or maybe none) of its boundary.
2. Measurable functions and sets
Let $C$ be the set of continuous functions on $\mathbb{R}^p$. Let $M$ be the set of measurable functions:
Def. 6: A subset $A \subset \mathbb{R}^p$ has measure 0 if it can be covered by balls of arbitrarily small total volume.
That is, for any number $\varepsilon > 0$, no matter how small, there is a set of open balls $B_1, B_2, \dots$ whose union contains $A$ but whose volumes add up to less than $\varepsilon$.
Def. 7: A statement about points in $\mathbb{R}^p$ holds almost everywhere (a.e.) (or for almost all $\mathbf{x}$) if it holds for all $\mathbf{x} \in \mathbb{R}^p$ except for a set of measure 0.
Def. 8: The set $M$ of measurable functions on $\mathbb{R}^p$ (or an interval of $\mathbb{R}$) is the set of functions that are limits of continuous functions almost everywhere, i.e.,
$$M = \Bigl\{ f(\mathbf{x}) : \text{there are continuous functions } f_n(\mathbf{x}) \text{ such that } f(\mathbf{x}) = \lim_{n\to\infty} f_n(\mathbf{x}) \text{ for almost all } \mathbf{x} \in \mathbb{R}^p \Bigr\}.$$
Measurable functions
Fig. 1: the function $f(x)$ as a limit of continuous functions
In fact, lots of functions (even discontinuous ones) can be viewed as limits of continuous functions.
For example
$$f(x) = I_{[0,1]}(x) = \begin{cases} 1 & \text{if } x \in [0,1] \\ 0 & \text{otherwise} \end{cases}$$
is a discontinuous but measurable function.
Note: the ordinary (Riemann) notion of integral is difficult to use for functions as complicated as measurable functions.
Definition 9: A set $E \subset \mathbb{R}^p$ is (Lebesgue) measurable if its indicator function
$$I_E(\mathbf{x}) \equiv \begin{cases} 1 & \text{if } \mathbf{x} \in E \\ 0 & \text{if } \mathbf{x} \notin E \end{cases}$$
is a measurable function.
[Equivalent on $\mathbb{R}^p$ to our previous definition of measurability on any space]
10. Integration of measurable functions
To integrate measurable functions (Lebesgue integral) we first need:
Theorem 1: Given a non-negative measurable function $f : \mathbb{R}^p \to \mathbb{R}$, there is always an increasing sequence $\{f_n(\mathbf{x})\}_{n=1}^{\infty}$ of non-negative continuous functions (i.e., with the property that $f_{n+1}(\mathbf{x}) \ge f_n(\mathbf{x})$ for all $\mathbf{x}$) which converges to $f(\mathbf{x})$ almost everywhere.
Def. 11: If $f(\mathbf{x}) \ge 0$ is a non-negative measurable function, define
$$\int_{\mathbb{R}^p} f(x)\,dx = \lim_{n\to\infty} \int_{\mathbb{R}^p} f_n(x)\,dx,$$
where $f_n(x)$ is any increasing sequence of non-negative continuous functions which converges to $f$ a.e.
[note we know the value of the integrals of the continuous functions $f_n(\mathbf{x})$ - they are ordinary Riemann integrals on $\mathbb{R}^p$]
Fig. 2: sequence of continuous functions $f_n(\mathbf{x})$ increasing to $f(\mathbf{x})$
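A minimal numerical sketch of this definition, for the indicator function of $[0,1]$ in one dimension. The trapezoid functions `f_n` and the midpoint-rule helper `riemann_integral` are illustrative choices, not part of the lecture; the sequence increases with $n$ and converges to the indicator everywhere except at the two endpoints (a set of measure 0), so the Riemann integrals $1 - 1/n$ should approach 1.

```python
def f_n(x: float, n: int) -> float:
    """Continuous trapezoid: 0 outside [0, 1], equal to 1 on [1/n, 1 - 1/n]."""
    return max(0.0, min(1.0, n * x, n * (1.0 - x)))


def riemann_integral(g, a: float, b: float, steps: int = 30_000) -> float:
    """Midpoint-rule approximation of the ordinary Riemann integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


ns = [2, 4, 8, 16, 32, 64]
integrals = [riemann_integral(lambda x, n=n: f_n(x, n), -1.0, 2.0) for n in ns]

for n, value in zip(ns, integrals):
    # Each value is close to 1 - 1/n; the limit, 1, is the Lebesgue integral
    # of the indicator function of [0, 1].
    print(f"n = {n:3d}   integral of f_n = {value:.4f}")
```

Any other increasing sequence of non-negative continuous functions with the same a.e. limit would serve equally well here.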
Def 12: To find the integral of a negative measurable function $f$, we just compute the integral of $-f$ (which is positive), and put a minus sign in front of it.
Since every function $f$ is the sum of a positive plus a negative function
$$f = f_1 + f_2$$
(for instance $f_1 = \max(f, 0)$ and $f_2 = \min(f, 0)$), the integral of $f$ is defined as
$$\int_{-\infty}^{\infty} f\,dx = \int_{-\infty}^{\infty} f_1\,dx + \int_{-\infty}^{\infty} f_2\,dx.$$
[Thus we now know how to define the integral of an arbitrary measurable function]
Ex 1: if $f(x)$ looks like:
fig 3: $f(x)$ has a positive and a negative part
Then the integral of $f(x)$ is the integral of a positive plus a negative function:
fig 4: now sum the areas between $f_1$ (or $f_2$) and the $x$ axis
Note we can show pretty easily that all the properties of integrals we are used to also hold for this more general Lebesgue integral.
For example, we still have
$$\int (f + g)\,dx = \int f\,dx + \int g\,dx, \quad \text{etc.}$$
[For now we will assume the above and related facts, which are already known to be true for standard Riemann integrals]
Hilbert spaces of functions
2. New Hilbert spaces:
Consider the space
$$H = L^2[-\pi, \pi] = \Bigl\{ \text{measurable functions } f(x) \text{ on } [-\pi, \pi] \text{ with } \int_{-\pi}^{\pi} f^2(x)\,dx < \infty \Bigr\}.$$
Can show that if $f, g \in H$ then $f + g$ and $cf$ are in $H$ if $c$ is a constant (exercise). Hence $H$ is a vector space.
Further, we can define an inner product on $H$ (known as the $L^2$ inner product):
$$\langle f, g \rangle = \langle f, g \rangle_{L^2} = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx.$$
This satisfies conditions (1) - (4) of an inner product.
Can also show that $H$ is complete (i.e., every Cauchy sequence $\{f_n\}$ converges to a function $f$ in $H$).
Thus $H$ is a Hilbert space.
Note: we always consider two measurable functions the same if they differ just at a finite number of points
fig 5: two functions $f_1$ and $f_2$ which differ at a finite collection of points.
Can show: such functions $f_1$ and $f_2$ have the same integral [certainly area is unchanged]; furthermore,
$$\int |f_1 - f_2|\,dx = 0 \qquad (1)$$
Def 13: More generally we will consider two functions to be the same or equivalent if (1) holds
[Equivalently, (1) holds iff $f_1(\mathbf{x})$, $f_2(\mathbf{x})$ differ on a set of measure 0]
Function space basis expansions
3. Fourier series: an example in Hilbert spaces
Ex 2: Consider the Hilbert space $H = L^2[-\pi, \pi]$, with the usual inner product
$$\langle f, g \rangle = \langle f, g \rangle_{L^2} = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx.$$
Consider the set of vectors
$$B = \{\sin nx \mid n = 1, 2, \dots\} \text{ together with } \{\cos nx \mid n = 0, 1, 2, \dots\} = \{1, \cos x, \sin x, \cos 2x, \sin 2x, \dots\}$$
We will show this is an orthogonal set. First: show that $1$ is orthogonal to all other vectors:
$$\langle 1, \cos nx \rangle = \int_{-\pi}^{\pi} \cos nx\,dx = 0 \quad (\forall\, n = 1, 2, \dots)$$
$$\langle 1, \sin nx \rangle = \int_{-\pi}^{\pi} \sin nx\,dx = 0 \quad (\forall\, n = 1, 2, \dots)$$
Now show that (for example) $\cos 5x$ is orthogonal to all other vectors:
$$\langle \cos 5x, \sin nx \rangle = \int_{-\pi}^{\pi} \cos 5x\,\sin nx\,dx = 0 \quad \forall\, n = 1, 2, \dots$$
To show the above we use the trig identities:
$$\cos a \cos b = \tfrac{1}{2}\bigl[\cos(a - b) + \cos(a + b)\bigr]$$
and
$$\sin a \cos b = \tfrac{1}{2}\bigl[\sin(a + b) + \sin(a - b)\bigr]$$
$$\sin a \sin b = \tfrac{1}{2}\bigl[\cos(a - b) - \cos(a + b)\bigr].$$
[Above holds similarly for any other $\cos mx$.]
Similarly, we also have:
$$\langle \cos 5x, \cos nx \rangle = \int_{-\pi}^{\pi} \cos 5x\,\cos nx\,dx = 0 \quad \forall\, n \neq 5$$
Can similarly show that $\sin mx$ is also orthogonal to all other vectors.
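As a worked instance of the product-to-sum step, here is the $\cos 5x$ against $\sin nx$ case written out. It uses only the identity $\sin a \cos b = \tfrac{1}{2}[\sin(a+b) + \sin(a-b)]$ with $a = nx$, $b = 5x$, and the fact that an odd function integrates to zero over a symmetric interval:

```latex
\begin{aligned}
\langle \cos 5x, \sin nx \rangle
  &= \int_{-\pi}^{\pi} \sin nx \,\cos 5x \, dx \\
  &= \frac{1}{2} \int_{-\pi}^{\pi} \bigl[ \sin (n+5)x + \sin (n-5)x \bigr] \, dx \\
  &= 0 \qquad \text{for every } n = 1, 2, \dots
\end{aligned}
```

For $n = 5$ the second term is $\sin 0 = 0$, so the conclusion holds in that case as well.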
Thus these vectors form an orthogonal set of vectors. Are they orthonormal?
$$\|\cos nx\|^2 = \langle \cos nx, \cos nx \rangle = \int_{-\pi}^{\pi} \cos^2 nx\,dx = \int_{-\pi}^{\pi} \frac{1 + \cos 2nx}{2}\,dx = \pi$$
Thus:
$$\|\cos nx\| = \sqrt{\pi}.$$
Thus $\frac{1}{\sqrt{\pi}} \cos nx$ has length 1.
Similarly, $\frac{1}{\sqrt{\pi}} \sin nx$ has length 1.
And: $\frac{1}{\sqrt{2\pi}} \cdot 1$ has length 1.
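The orthogonality relations and the lengths above can be checked numerically. This is only a sketch: the helper `inner` approximates the $L^2[-\pi, \pi]$ inner product with a midpoint rule, and the particular frequencies tested are arbitrary choices.

```python
import math


def inner(f, g, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of <f, g> = integral of f(x) g(x) over [-pi, pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * g(x)
    return h * total


def one(x: float) -> float:
    return 1.0


def cos5(x: float) -> float:
    return math.cos(5 * x)


def sin5(x: float) -> float:
    return math.sin(5 * x)


def cos3(x: float) -> float:
    return math.cos(3 * x)


# Orthogonality: each of these should be numerically zero.
orthogonality = [inner(one, cos5), inner(one, sin5), inner(cos5, sin5), inner(cos5, cos3)]

# Lengths: sqrt(pi) for cos 5x and sin 5x, sqrt(2 pi) for the constant 1.
lengths = [math.sqrt(inner(cos5, cos5)), math.sqrt(inner(sin5, sin5)), math.sqrt(inner(one, one))]

print("inner products:", [round(v, 10) for v in orthogonality])
print("lengths:       ", [round(v, 6) for v in lengths])
```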
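Thus:
$$\Bigl\{ \tfrac{1}{\sqrt{2\pi}},\ \tfrac{1}{\sqrt{\pi}}\cos x,\ \tfrac{1}{\sqrt{\pi}}\sin x,\ \tfrac{1}{\sqrt{\pi}}\cos 2x,\ \tfrac{1}{\sqrt{\pi}}\sin 2x,\ \tfrac{1}{\sqrt{\pi}}\cos 3x,\ \tfrac{1}{\sqrt{\pi}}\sin 3x,\ \dots \Bigr\} = \{v_1, v_2, v_3, \dots\}$$
is an orthonormal (and hence linearly independent) set in the space of continuous functions.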
Can show: they are a basis. So any vector $f(x)$ can be written in the form:
$$f(x) = c_1 v_1 + c_2 v_2 + \dots$$
$$= c_1 \tfrac{1}{\sqrt{2\pi}} + c_2 \tfrac{1}{\sqrt{\pi}} \cos x + c_3 \tfrac{1}{\sqrt{\pi}} \sin x + c_4 \tfrac{1}{\sqrt{\pi}} \cos 2x + c_5 \tfrac{1}{\sqrt{\pi}} \sin 2x + \dots$$
$$= \frac{a_0}{2} + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \dots$$
[Fourier series of a function]
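Notice that
$$c_4 = \Bigl\langle f(x), \tfrac{1}{\sqrt{\pi}} \cos 2x \Bigr\rangle = \int_{-\pi}^{\pi} f(x)\,\tfrac{1}{\sqrt{\pi}} \cos 2x\,dx = \frac{1}{\sqrt{\pi}} \int_{-\pi}^{\pi} f(x) \cos 2x\,dx$$
$$\Rightarrow\quad a_2 = \frac{c_4}{\sqrt{\pi}} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos 2x\,dx$$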
Generally:
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\,dx$$
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\,dx.$$
[Using the above linear algebra we have no need to do advanced calculus for the theory of Fourier series!]
Ex: $f(x) = 2x$
fig 6
$$2x = \frac{a_0}{2} + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \dots$$
$$b_5 = \frac{1}{\pi} \int_{-\pi}^{\pi} 2x \sin 5x\,dx = \frac{2}{\pi} \Bigl\{ \Bigl[ -\frac{x \cos 5x}{5} \Bigr]_{-\pi}^{\pi} + \underbrace{\int_{-\pi}^{\pi} \frac{\cos 5x}{5}\,dx}_{0} \Bigr\} = \frac{2}{\pi} \cdot \frac{2\pi}{5} = \frac{4}{5}$$
$$b_5 = \frac{4}{5}$$
Generally:
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} 2x \sin nx\,dx = \begin{cases} -\dfrac{4}{n} & \text{if } n \text{ even} \\[4pt] \dfrac{4}{n} & \text{if } n \text{ odd} \end{cases}$$
Can show $a_n = 0$.
Thus
$$2x = \underbrace{\underbrace{b_1 \sin x}_{S_1(x)} + b_2 \sin 2x}_{S_2(x)} + b_3 \sin 3x + \dots$$
$$= 4 \Bigl[ 1 \cdot \sin x - \tfrac{1}{2} \sin 2x + \tfrac{1}{3} \sin 3x - \dots \Bigr]$$
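A short numerical sketch of this example. The helper names (`b_numeric`, `b_closed`, `partial_sum`) are illustrative; the first approximates the coefficient integral with a midpoint rule, the second is the closed form $b_n = \pm 4/n$, and the third evaluates the partial sums $S_N(x)$ of the sine series.

```python
import math


def b_numeric(n: int, steps: int = 20_000) -> float:
    """b_n = (1/pi) * integral over [-pi, pi] of 2x sin(nx) dx, by the midpoint rule."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += 2.0 * x * math.sin(n * x)
    return h * total / math.pi


def b_closed(n: int) -> float:
    """Closed form: 4/n for odd n, -4/n for even n."""
    return 4.0 / n if n % 2 == 1 else -4.0 / n


def partial_sum(x: float, terms: int) -> float:
    """S_terms(x) = sum of b_n sin(nx) for n = 1..terms."""
    return sum(b_closed(n) * math.sin(n * x) for n in range(1, terms + 1))


for n in range(1, 7):
    print(f"b_{n}: numeric {b_numeric(n):+.5f}   closed form {b_closed(n):+.5f}")

x = 1.0
for terms in (5, 50, 500, 2000):
    # The partial sums approach f(x) = 2x slowly (the coefficients decay like 1/n).
    print(f"S_{terms}({x}) = {partial_sum(x, terms):.4f}   target 2x = {2 * x:.4f}")
```

The slow convergence is expected: the periodic extension of $2x$ jumps at $\pm\pi$, so the coefficients decay only like $1/n$.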
Lecture 11B.
Part 5 (MA 751)
Statistical machine learning and kernel methods
Primary references:
John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis.
Christopher Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2, 121-167 (1998).
Other references:
Aronszajn, Theory of reproducing kernels, Transactions of the American Mathematical Society 68, 337-404, 1950.
Felipe Cucker and Steve Smale, On the mathematical foundations of learning, Bulletin of the American Mathematical Society, 2002.
Teo Evgeniou, Massimo Pontil and Tomaso Poggio, Regularization Networks and Support Vector Machines, Advances in Computational Mathematics, 2000.
1. Linear functionals
Definition 1. Given a vector space $V$, we define a map $f : V \to \mathbb{R}$ from $V$ to the real numbers to be a functional.
If $f$ is linear, i.e., if for real $a, b$ we have
$$f(a\mathbf{x} + b\mathbf{y}) = a f(\mathbf{x}) + b f(\mathbf{y}),$$
then we say $f$ is a linear functional.
If $V$ is an inner product space (so each $\mathbf{v}$ has a length $\|\mathbf{v}\|$), we say that $f$ is bounded if
$$|f(\mathbf{x})| \le C \|\mathbf{x}\|$$
for some number $C > 0$ and all $\mathbf{x} \in V$.
Reproducing kernel Hilbert spaces
2. Reproducing Kernel Hilbert spaces:
Def. 1. An $n \times n$ matrix $M$ is symmetric if $M_{ij} = M_{ji}$ for all $i, j$.
A symmetric $M$ is positive if all of its eigenvalues are non-negative.
Equivalently $M$ is positive if
$$\langle \mathbf{a}, M\mathbf{a} \rangle \equiv \mathbf{a}^T M \mathbf{a} \ge 0$$
for all vectors $\mathbf{a} = (a_1, a_2, \dots, a_n)^T$, with $\langle \cdot, \cdot \rangle$ the standard inner product on $\mathbb{R}^n$.
Definition 2: Let $X \subset \mathbb{R}^p$ be compact (i.e., a closed bounded subset). A (real) reproducing kernel Hilbert space (RKHS) $\mathcal{H}$ on $X$ is a Hilbert space of functions on $X$ (i.e., a complete collection of functions which is closed under addition and scalar multiplication, and for which an inner product is defined).
$\mathcal{H}$ also needs the property: for any fixed $\mathbf{x} \in X$, the evaluation functional $\mathbf{x}^* : \mathcal{H} \to \mathbb{R}$ defined by
$$\mathbf{x}^*(f) = f(\mathbf{x})$$
is a bounded linear functional on $\mathcal{H}$.
Definition 3: We define a kernel to be a function $K : X \times X \to \mathbb{R}$ which is symmetric, i.e.,
$$K(\mathbf{x}, \mathbf{y}) = K(\mathbf{y}, \mathbf{x})$$
for $\mathbf{x}, \mathbf{y} \in X$.
We say $K$ is positive if for any fixed collection
$$\{\mathbf{x}_1, \dots, \mathbf{x}_n\} \subset X,$$
the $n \times n$ matrix
$$\mathbf{K} = (K_{ij}) \equiv K(\mathbf{x}_i, \mathbf{x}_j)$$
is positive (i.e., non-negative definite).
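A small sketch of this positivity condition. The Gaussian kernel below is an assumed example (the lecture has not singled out a specific kernel at this point), and the helper names are illustrative. The code builds the matrix $\mathbf{K}$ for a few random points and evaluates the quadratic form $\mathbf{c}^T \mathbf{K} \mathbf{c}$ for many random vectors $\mathbf{c}$; none of the values should be negative beyond rounding error.

```python
import math
import random


def gaussian_kernel(x, y, width: float = 1.0) -> float:
    """Assumed example kernel: K(x, y) = exp(-|x - y|^2 / (2 width^2))."""
    dist2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-dist2 / (2.0 * width ** 2))


def gram_matrix(points, kernel):
    """The n x n matrix with entries K_ij = K(x_i, x_j)."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K, c) -> float:
    """c^T K c = sum over i, j of c_i c_j K_ij."""
    n = len(c)
    return sum(c[i] * c[j] * K[i][j] for i in range(n) for j in range(n))


rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(6)]
K = gram_matrix(points, gaussian_kernel)

smallest = min(
    quadratic_form(K, [rng.gauss(0.0, 1.0) for _ in points]) for _ in range(1000)
)
print("smallest value of c^T K c over 1000 random c:", smallest)
```

Random sampling can only fail to find a counterexample; it does not prove positivity, which for the Gaussian kernel is a known fact.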
Kernel existence
We now have the reason these are called RKHS:
Theorem 1: Given a reproducing kernel Hilbert space $\mathcal{H}$ of functions on $X \subset \mathbb{R}^p$, there exists a unique symmetric positive kernel function $K(\mathbf{x}, \mathbf{y})$ such that for all $f \in \mathcal{H}$,
$$f(\mathbf{x}) = \langle f(\cdot), K(\cdot, \mathbf{x}) \rangle_{\mathcal{H}}$$
(the inner product above is in the variable $\cdot$; $\mathbf{x}$ is fixed).
Note this means that evaluation of $f$ at fixed $\mathbf{x}$ is equivalent to taking the inner product of $f(\cdot)$ with the fixed function $K(\cdot, \mathbf{x})$ (in the variable $\cdot$, with $\mathbf{x}$ fixed)
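Proof (please look at this on your own): For any fixed $\mathbf{x} \in X$, recall $\mathbf{x}^*$ is a bounded linear functional on $\mathcal{H}$.
By the Riesz Representation theorem¹ there exists a fixed function, call it $K_{\mathbf{x}}(\cdot)$, such that for all $f \in \mathcal{H}$ (recall $\mathbf{x}$ is fixed, now $f$ is varying)
$$f(\mathbf{x}) = \mathbf{x}^*(f) = \langle f(\cdot), K_{\mathbf{x}}(\cdot) \rangle. \qquad (1)$$
(all inner products are in $\mathcal{H}$, not in $L^2$, i.e., $\langle f, g \rangle = \langle f, g \rangle_{\mathcal{H}}$).
¹ Riesz Representation Theorem: If $\phi : \mathcal{H} \to \mathbb{R}$ is a bounded linear functional on $\mathcal{H}$, there exists a unique $\mathbf{y} \in \mathcal{H}$ such that $\forall\, \mathbf{x} \in \mathcal{H}$, $\phi(\mathbf{x}) = \langle \mathbf{y}, \mathbf{x} \rangle$.
That is, evaluation of $f$ at $\mathbf{x}$ is equivalent to an inner product with the function $K_{\mathbf{x}}$.
Define $K(\mathbf{x}, \mathbf{y}) = K_{\mathbf{x}}(\mathbf{y})$. Note by (1), the functions $K_{\mathbf{x}}(\cdot)$ and $K_{\mathbf{y}}(\cdot)$ satisfy
$$\langle K_{\mathbf{x}}(\cdot), K_{\mathbf{y}}(\cdot) \rangle = K_{\mathbf{x}}(\mathbf{y}) = K_{\mathbf{y}}(\mathbf{x}),$$
so $K(\mathbf{x}, \mathbf{y})$ is symmetric.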
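To prove $K(\mathbf{x}, \mathbf{y})$ is positive definite: let $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ be a fixed collection. If $K_{ij} \equiv K(\mathbf{x}_i, \mathbf{x}_j)$, then $\mathbf{K} = (K_{ij})$ is a matrix, and if $\mathbf{c} = (c_1, c_2, \dots, c_n)^T$, then
$$\langle \mathbf{c}, \mathbf{K}\mathbf{c} \rangle \equiv \mathbf{c}^T \mathbf{K} \mathbf{c} = \sum_{i,j=1}^{n} c_i c_j K(\mathbf{x}_i, \mathbf{x}_j)$$
$$= \sum_{i,j=1}^{n} c_i c_j \langle K_{\mathbf{x}_i}(\cdot), K_{\mathbf{x}_j}(\cdot) \rangle$$
$$= \Bigl\langle \sum_{i=1}^{n} c_i K_{\mathbf{x}_i}(\cdot), \sum_{j=1}^{n} c_j K_{\mathbf{x}_j}(\cdot) \Bigr\rangle$$
$$= \Bigl\| \sum_{i=1}^{n} c_i K_{\mathbf{x}_i}(\cdot) \Bigr\|_{\mathcal{H}}^2 \ge 0.$$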
Definition 4: We call the above kernel $K(\mathbf{x}, \mathbf{y})$ the reproducing kernel of $\mathcal{H}$.
Definition 5: A Mercer kernel is a positive definite kernel $K(\mathbf{x}, \mathbf{y})$ which is also continuous as a function of $\mathbf{x}$ and $\mathbf{y}$ and bounded.
Def. 6: For a continuous function $f$ on a compact set $X \subset \mathbb{R}^p$ we define
$$\|f\|_\infty = \max_{\mathbf{x} \in X} |f(\mathbf{x})|.$$
[Recall $X \subset \mathbb{R}^p$ here is assumed to be a closed bounded set]
Theorem 2:
(i) For every Mercer kernel $K : X \times X \to \mathbb{R}$, there exists a unique Hilbert space $\mathcal{H}$ (an RKHS) of functions on $X$ such that $K$ is its reproducing kernel.
(ii) Moreover, this $\mathcal{H}$ consists of continuous functions, and for any $f \in \mathcal{H}$
$$\|f\|_\infty \le M_K \|f\|_{\mathcal{H}},$$
where $M_K = \max_{\mathbf{x}, \mathbf{y} \in X} \sqrt{|K(\mathbf{x}, \mathbf{y})|}$.
Proof (please look at this on your own): Let $K(\mathbf{x}, \mathbf{y}) : X \times X \to \mathbb{R}$ be a Mercer kernel. We will construct a reproducing kernel Hilbert space $\mathcal{H}$ with reproducing kernel $K$ as follows.
Define (below span means finite span; no infinite sums)
$$\mathcal{H}_0 = \operatorname{span}\{K_{\mathbf{x}}(\cdot)\}_{\mathbf{x} \in X} = \Bigl\{ \sum_i c_i K_{\mathbf{x}_i}(\cdot) : \{\mathbf{x}_i\} \subset X \text{ is any finite subset};\ c_i \in \mathbb{R} \Bigr\}.$$
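Now we define the inner product $\langle f, g \rangle$ for $f, g \in \mathcal{H}_0$. Assume
$$f(\cdot) = \sum_{i=1}^{l} a_i K_{\mathbf{x}_i}(\cdot), \qquad g(\cdot) = \sum_{i=1}^{l} b_i K_{\mathbf{x}_i}(\cdot).$$
[Note we may assume $f, g$ both use the same set of $\{\mathbf{x}_i\}$, since if not we may take a union without loss]. [Note again that here $\langle \cdot, \cdot \rangle = \langle \cdot, \cdot \rangle_{\mathcal{H}}$]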
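Then defining $\langle K_{\mathbf{x}}(\cdot), K_{\mathbf{y}}(\cdot) \rangle = K(\mathbf{x}, \mathbf{y})$, define
$$\langle f(\cdot), g(\cdot) \rangle = \Bigl\langle \sum_{i=1}^{l} a_i K(\mathbf{x}_i, \cdot), \sum_{j=1}^{l} b_j K(\mathbf{x}_j, \cdot) \Bigr\rangle$$
$$= \sum_{i,j=1}^{l} a_i b_j \langle K(\mathbf{x}_i, \cdot), K(\mathbf{x}_j, \cdot) \rangle$$
$$= \sum_{i,j=1}^{l} a_i b_j K(\mathbf{x}_i, \mathbf{x}_j).$$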
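The construction above can be sketched in a few lines. Here an element of $\mathcal{H}_0$ is stored as a list of (coefficient, center) pairs, and the Gaussian kernel on the real line is an assumed example of a Mercer kernel; both choices are illustrative, not taken from the lecture. With this inner product, $\langle f, K_{\mathbf{x}} \rangle$ reproduces the value $f(\mathbf{x})$, and the Cauchy-Schwarz inequality then gives $|f(\mathbf{x})| \le \|f\| \sqrt{K(\mathbf{x}, \mathbf{x})}$.

```python
import math


def kernel(x: float, y: float) -> float:
    """Gaussian kernel on the real line (an assumed example of a Mercer kernel)."""
    return math.exp(-0.5 * (x - y) ** 2)


# An element of H_0 is a list of (coefficient, center) pairs,
# representing f = sum_i a_i K_{x_i}.
def evaluate(f, x: float) -> float:
    """Pointwise value f(x) = sum_i a_i K(x_i, x)."""
    return sum(a * kernel(xi, x) for a, xi in f)


def inner(f, g) -> float:
    """<f, g> = sum over i, j of a_i b_j K(x_i, y_j)."""
    return sum(a * b * kernel(xi, yj) for a, xi in f for b, yj in g)


f = [(1.5, -1.0), (-0.7, 0.2), (2.0, 0.9)]
x = 0.4
K_x = [(1.0, x)]                      # the function K_x(.) = K(x, .)

reproduced = inner(f, K_x)            # <f, K_x>, should equal f(x)
norm_f = math.sqrt(inner(f, f))       # ||f|| in H_0
bound = norm_f * math.sqrt(kernel(x, x))

print("f(x)      =", evaluate(f, x))
print("<f, K_x>  =", reproduced)
print("|f(x)| <= ||f|| sqrt(K(x, x)) =", bound)
```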
Easy to check that $\mathcal{H}_0$ with the above inner product is an inner product space (i.e., satisfies properties (a) - (d)). Now form the completion² of this space into the (complete) Hilbert space $\mathcal{H}$.
² The completion of a non-complete inner product space $\mathcal{H}_0$ is the (unique) smallest complete inner product (Hilbert) space $\mathcal{H}$ which contains $\mathcal{H}_0$. That is, $\mathcal{H}_0 \subset \mathcal{H}$, the inner product on $\mathcal{H}_0$ is the same as on $\mathcal{H}$, and there is no smaller complete Hilbert space which contains $\mathcal{H}_0$.
Example 1: $\mathcal{H} = \ell^2 = \bigl\{ \mathbf{a} = (a_1, a_2, \dots) \bigm| \sum_{i=1}^{\infty} |a_i|^2 < \infty \bigr\}$ with inner product $\langle \mathbf{a}, \mathbf{b} \rangle = \sum_{i=1}^{\infty} a_i b_i$ was discussed in class. The inner product space
$$\mathcal{H}_0 = \bigl\{ (a_1, a_2, \dots) \in \mathcal{H} \bigm| \text{all but a finite number of } a_i \text{ are } 0 \bigr\} \subset \mathcal{H}$$
is an example of an incomplete space. $\mathcal{H}$ is its completion.
Example 2: $\mathcal{H} = L^2(-\pi, \pi)$ with the standard inner product for functions. We know if $f(x) \in \mathcal{H}$ then
$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \bigl( a_k \cos kx + b_k \sin kx \bigr).$$
Define $\mathcal{H}_0$ to be all $f \in \mathcal{H}$ for which the above sum is finite (i.e., all but a finite number of terms are 0). Then $\mathcal{H}$ is the completion of $\mathcal{H}_0$.
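Note that for $f = \sum_i a_i K_{\mathbf{x}_i}(\cdot) \in \mathcal{H}_0$ as above, the reproducing property followed by the Cauchy-Schwarz inequality gives:
$$|f(\mathbf{x})| = |\langle f(\cdot), K(\mathbf{x}, \cdot) \rangle| \le \|f(\cdot)\| \, \|K(\mathbf{x}, \cdot)\|$$
$$= \|f\| \sqrt{\langle K(\mathbf{x}, \cdot), K(\mathbf{x}, \cdot) \rangle}$$
$$= \|f\| \underbrace{\sqrt{K(\mathbf{x}, \mathbf{x})}}_{\le\, M_K}$$
$$\le M_K \|f\|_{\mathcal{H}}.$$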
[Note again here we write $\|f\| = \|f\|_{\mathcal{H}}$ by definition; similarly $\langle f, g \rangle = \langle f, g \rangle_{\mathcal{H}}$]
The above shows that the identity mapping $I : \mathcal{H}_0 \to C(X)$ (the latter is the continuous functions on $X$) is bounded.
By this we mean that $I$ maps a function $f$ as a function in $\mathcal{H}_0$ to itself as a function in $C(X)$; the norm of $f$ in $C(X)$ is $\|f\|_\infty \equiv \sup_{x \in X} |f(x)|$.
By bounded we mean that $\|If\|_\infty = \|f\|_\infty \le d\,\|f\|_{\mathcal{H}}$ for some constant $d > 0$.
Thus any Cauchy sequence in $\mathcal{H}_0$ is also Cauchy in $C(X)$ and so has a limit as a function in $C(X)$.
So it follows easily that the completion $\mathcal{H}$ of $\mathcal{H}_0$ exists as a subset of $C(X)$.
That $K$ is a reproducing kernel for $\mathcal{H}$ follows by approximation from the fact that $K$ works as a reproducing kernel in $\mathcal{H}_0$.
Regularization methods
3. Regularization methods for choosing $f$
Finding the desired $f(\mathbf{x})$ from the training set
$$Nf \equiv \mathbf{g} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$$
is an ill-posed problem: a unique inverse operator $N^{-1}$ does not exist because $N$ is not one to one.
Need to combine both:
(a) Data $Nf = \mathbf{g}$ (posterior or a posteriori information)
(b) Prior or a priori information, e.g., "$f$ is smooth", e.g. expressing a preference for smooth over wiggly solutions seen earlier.
How to incorporate both? Using Tikhonov regularization methods.
We introduce a regularization loss functional $J(f)$ representing the penalty (loss) for the choice of an "unrealistic" $f$ such as that in (a) above.
Assume we want to find the correct function $f_0(\mathbf{x})$ from data
$$N f_0 = \bigl( (\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n) \bigr) = \mathbf{g}$$
Suppose we are given $f(\mathbf{x})$ as a candidate for approximating $f_0(\mathbf{x})$ from the information in $\mathbf{g}$.
We score $f$ as a good or bad approximation based on a combination of
(a) Its error on the known points $\{\mathbf{x}_i\}_{i=1}^{n}$,
(b) Its "plausibility", i.e., how low the penalty $J(f)$ is.
These are combined in minimization of the Lagrangian
$$\mathcal{L}(f) = \frac{1}{n} \sum_{i=1}^{n} L(f(\mathbf{x}_i), y_i) + J(f).$$
Here $L(f(\mathbf{x}_i), y_i)$ measures the loss whenever the predicted $f(\mathbf{x}_i)$ is far from the actual value $y_i$, e.g.
$$L(f(\mathbf{x}_i), y_i) = |f(\mathbf{x}_i) - y_i|^2.$$
And $J(f)$ measures the a priori loss, i.e., a measure of discrepancy between the prospective choice $f$ and our prior expectation about $f$.
Examples: Regularization methods
Example:
$$J(f) = \|Af\|_{L^2}^2 = \int d\mathbf{x}\, |Af(\mathbf{x})|^2,$$
where here $Af = -\Delta f + f$;
$$\Delta f = \frac{\partial^2 f}{\partial x_1^2} + \dots + \frac{\partial^2 f}{\partial x_p^2}.$$
Note $\Delta f$ and thus $J(f)$ measures the degree of non-smoothness that $f$ has (i.e., we prefer smoother functions a priori).
Example 3: Consider the case $J(f) = \|Af\|^2$ above. The norm
$$\|f\|_{\mathcal{H}} = \|Af\|_{L^2}$$
is a reproducing kernel Hilbert space norm (at least if the dimension is small).
That is, this norm comes from an inner product $\langle f, g \rangle = \int_X (Af)(x)\,(Ag)(x)\,dx$, and $\mathcal{H}$ with this inner product is an RKHS.
If this is the case, in general things become easier.
Example 4: In the case $f = f(x)$, $x \in \mathbb{R}^1$.
Suppose we choose:
$$Af = -\frac{d^2}{dx^2} f + f = \Bigl( -\frac{d^2}{dx^2} + 1 \Bigr) f;$$
we have
$$J(f) = \|Af\|^2 = \int \Bigl[ \Bigl( -\frac{d^2}{dx^2} + 1 \Bigr) f \Bigr]^2 dx,$$
and $\|Af\|$ is a measure of "lack of smoothness" of $f$.
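A rough numerical illustration of how this penalty separates two candidates that fit the same data equally well. Everything here is an assumption made for the sketch: the interval $[0, \pi]$, the finite-difference approximation of $f''$, the sample points, and the two candidate functions. The wiggly candidate adds $0.1 \sin 20x$, which vanishes at every sample point $x_k = k\pi/10$, so both candidates have (numerically) zero data error while their penalties differ by orders of magnitude.

```python
import math


def apply_A(f, x: float, h: float = 1e-3) -> float:
    """(Af)(x) = f(x) - f''(x), with f'' from a central finite difference."""
    second = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return f(x) - second


def penalty(f, a: float = 0.0, b: float = math.pi, steps: int = 4000) -> float:
    """J(f) = integral over [a, b] of (Af)(x)^2 dx, by the midpoint rule."""
    w = (b - a) / steps
    return w * sum(apply_A(f, a + (k + 0.5) * w) ** 2 for k in range(steps))


def data_error(f, xs, ys) -> float:
    """(1/n) * sum of squared errors on the known points."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def smooth(x: float) -> float:
    return math.sin(x)


def wiggly(x: float) -> float:
    return math.sin(x) + 0.1 * math.sin(20 * x)


xs = [k * math.pi / 10 for k in range(11)]
ys = [math.sin(x) for x in xs]

J_smooth, J_wiggly = penalty(smooth), penalty(wiggly)
score_smooth = data_error(smooth, xs, ys) + J_smooth
score_wiggly = data_error(wiggly, xs, ys) + J_wiggly

print("data error:", data_error(smooth, xs, ys), data_error(wiggly, xs, ys))
print("penalty J :", J_smooth, J_wiggly)
```

For the smooth candidate, $Af = 2 \sin x$, so the penalty on $[0, \pi]$ is $2\pi$; the wiggly one is penalized heavily because $A$ multiplies the $\sin 20x$ component by $1 + 20^2$.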
4. More about using the Laplacian to measure smoothness (Sobolev smoothness)
Basic definitions: Recall the Laplacian operator $\Delta$ on a function
$$f(\mathbf{x}) = f(x_1, \dots, x_p)$$
on $\mathbb{R}^p$ is defined by
$$\Delta f = \frac{\partial^2}{\partial x_1^2} f + \dots + \frac{\partial^2}{\partial x_p^2} f.$$
Using the Laplacian for kernels
For $s > 0$ an even integer, we can define the Sobolev space $H^s$ by:
$$H^s = \{ f \in L^2(\mathbb{R}^p) : (1 - \Delta)^{s/2} f \in L^2(\mathbb{R}^p) \}.$$
This is the set of functions $f$ in $L^2(\mathbb{R}^p)$ (i.e. square integrable functions) which are still in $L^2(\mathbb{R}^p)$ after applying the derivative operation $(1 - \Delta)^{s/2}$, i.e., the operator $(I - \Delta)$ repeated $s/2$ times (the operator $1 = I$ is always the identity operator).
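For $f, g \in H^s$ define the new inner product
$$\langle f, g \rangle_{H^s} = \bigl\langle (-\Delta + 1)^{s/2} f,\ (-\Delta + 1)^{s/2} g \bigr\rangle_{L^2};$$
[note $\langle h(\mathbf{x}), k(\mathbf{x}) \rangle_{L^2} = \int_X h(\mathbf{x})\,k(\mathbf{x})\,d\mathbf{x}$]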
Can show that $H^s$ is an RKHS with reproducing kernel
$$K(\mathbf{z}) = \mathcal{F}^{-1}\Bigl( \frac{1}{(|\omega|^2 + 1)^s} \Bigr) \qquad (1)$$
where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform. The function $\frac{1}{(|\omega|^2 + 1)^s}$ is a function of $\omega = (\omega_1, \dots, \omega_p) \in \mathbb{R}^p$, where
$$|\omega|^2 = \omega_1^2 + \dots + \omega_p^2.$$
Fig 7: $K(\mathbf{z})$ in one dimension - a smooth kernel
$K(\mathbf{z})$ is called a radial basis function.
Note: the kernel $K(\mathbf{x}, \mathbf{y})$ (as a function of 2 variables) is defined in terms of the above $K$ by
$$K(\mathbf{x}, \mathbf{y}) = K(\mathbf{x} - \mathbf{y}).$$
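A sketch of what this kernel looks like in the simplest case $p = 1$, $s = 2$. It assumes the Fourier convention $\mathcal{F}^{-1} g(z) = \frac{1}{2\pi} \int g(\omega) e^{i\omega z}\,d\omega$, which the lecture does not specify; a different convention changes the kernel only by a constant factor. Under that assumption the inverse transform of $1/(\omega^2 + 1)^2$ has the closed form $\tfrac{1}{4}(1 + |z|)e^{-|z|}$, which the code compares with a direct numerical integration (the helper names and the integration cutoff are illustrative).

```python
import math


def kernel_numeric(z: float, s: int = 2, cutoff: float = 200.0, steps: int = 200_000) -> float:
    """(1 / 2 pi) * integral of cos(omega z) / (omega^2 + 1)^s over [-cutoff, cutoff].

    The sine part of exp(i omega z) integrates to zero by symmetry.
    """
    h = 2.0 * cutoff / steps
    total = 0.0
    for k in range(steps):
        omega = -cutoff + (k + 0.5) * h
        total += math.cos(omega * z) / (omega * omega + 1.0) ** s
    return h * total / (2.0 * math.pi)


def kernel_closed(z: float) -> float:
    """Closed form for p = 1, s = 2 under the assumed convention."""
    return 0.25 * (1.0 + abs(z)) * math.exp(-abs(z))


zs = [0.0, 0.5, 1.0, 2.0]
numeric = [kernel_numeric(z) for z in zs]
closed = [kernel_closed(z) for z in zs]

for z, a, b in zip(zs, numeric, closed):
    print(f"z = {z:3.1f}   numeric {a:.6f}   closed form {b:.6f}")
```

The kernel depends on $\mathbf{x}$ and $\mathbf{y}$ only through $z = x - y$ and decays smoothly in $|z|$, which is the radial basis function shape described above.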
The Representer Theorem for RKHS
RKHS and regularization
1. Application: using RKHS for regularization
Assume again we have an unknown function $y = f(\mathbf{x})$ on $X \subset \mathbb{R}^p$, with only data
$$Nf = \bigl( (\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n) \bigr) = \mathbf{g}.$$
To find the best guess $\hat{f}$ for $f$, approximate it by the minimizer
$$\hat{f} = \arg\min_{f \in H^s} \Bigl\{ \frac{1}{n} \sum_{i=1}^{n} \|f(\mathbf{x}_i) - y_i\|^2 + \lambda \|f\|_{H^s}^2 \Bigr\} \qquad (1a)$$
where $\lambda$ can be some constant.
We seek $f$ which balances minimizing
$$\sum_{i=1}^{n} \|f(\mathbf{x}_i) - y_i\|^2,$$
i.e., the data error, with minimizing $\|f\|_{H^s}^2$, i.e., maximizing the smoothness.
The solution to such a problem will compromise between fitting the data (which may have error) and trying to be smooth.
The amazing thing: $\hat{f}$ can be found explicitly using the above radial basis functions.
2. Solving the minimization
Now consider the general version of the optimization problem (1a), with a space of functions $\mathcal{H}$ that is an RKHS.
Claim we can solve it explicitly.
To see this works in general for RKHS, return to the general problem:
General problem: Given an unknown $f_0 \in \mathcal{H}$ = RKHS, try to find the "best" approximation $\hat{f}$ to $f_0$ fitting the data $N f_0 \equiv \bigl( (\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n) \bigr)$, but ALSO satisfying the a priori knowledge that $\|f_0\|_{\mathcal{H}}$ is small (e.g. so $f_0$ is smooth).
Specifically, want to find
$$\arg\min_{f \in \mathcal{H}} \Bigl\{ \frac{1}{n} \sum_{i=1}^{n} L(f(\mathbf{x}_i), y_i) + \lambda \|f\|_{\mathcal{H}}^2 \Bigr\}. \qquad (2)$$
Note we can have, e.g.,
$$L(f(\mathbf{x}_i), y_i) = (f(\mathbf{x}_i) - y_i)^2.$$
In that case
$$\sum_{i=1}^{n} L(f(\mathbf{x}_i), y_i) = \sum_{i=1}^{n} (f(\mathbf{x}_i) - y_i)^2 = \text{squared error}$$
Consider the general case (2), with arbitrary error measure $L$. We have the following.
Representer Theorem: A solution of the Tikhonov optimization problem (2) can be written
$$\hat{f}(\mathbf{x}) = \sum_{i=1}^{n} a_i K(\mathbf{x}, \mathbf{x}_i), \qquad (3)$$
where $K$ is the reproducing kernel of the RKHS $\mathcal{H}$.
Important theorem: thus we only need to find $n$ numbers $a_i$ to optimize the infinite dimensional problem (2) above.
Proof: Use calculus of variations.
If a minimizer $f_1$ of (2) exists, now consider any $g \in \mathcal{H}$.
Assuming derivatives with respect to $\varepsilon$ exist:
[again all norms and inner products are in $\mathcal{H}$]
Representer theorem proof
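$$0 = \frac{d}{d\varepsilon} \Bigl[ \frac{1}{n} \sum_{i=1}^{n} L\bigl( (f_1 + \varepsilon g)(\mathbf{x}_i), y_i \bigr) + \lambda \|f_1 + \varepsilon g\|_{\mathcal{H}}^2 \Bigr] \Bigr|_{\varepsilon = 0}$$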
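$$= \frac{1}{n} \sum_{i=1}^{n} \frac{\partial L}{\partial f_1(\mathbf{x}_i)} (f_1(\mathbf{x}_i), y_i) \cdot g(\mathbf{x}_i) + \lambda \frac{d}{d\varepsilon} \bigl\{ \langle f_1, f_1 \rangle + 2\varepsilon \langle f_1, g \rangle + \varepsilon^2 \langle g, g \rangle \bigr\} \Bigr|_{\varepsilon = 0}$$
$$= \frac{1}{n} \sum_{i=1}^{n} L_1(f_1(\mathbf{x}_i), y_i) \cdot g(\mathbf{x}_i) + 2\lambda \langle f_1, g \rangle,$$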
where $L_1(a, b) = \frac{\partial}{\partial a} L(a, b)$ and all inner products are in $\mathcal{H}$.
Since the above is true for all $g \in \mathcal{H}$, it follows that if we let $g = K_{\mathbf{x}}$ we get (recall $K_{\mathbf{x}}(\mathbf{x}_i) \equiv K(\mathbf{x}, \mathbf{x}_i)$):
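$$0 = \frac{1}{n} \sum_{i=1}^{n} L_1(f_1(\mathbf{x}_i), y_i)\, K_{\mathbf{x}}(\mathbf{x}_i) + 2\lambda \langle f_1, K_{\mathbf{x}} \rangle$$
$$= \frac{1}{n} \sum_{i=1}^{n} L_1(f_1(\mathbf{x}_i), y_i)\, K_{\mathbf{x}}(\mathbf{x}_i) + 2\lambda f_1(\mathbf{x}),$$
or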
$$f_1(\mathbf{x}) = -\frac{1}{2\lambda n} \sum_{i=1}^{n} L_1(f_1(\mathbf{x}_i), y_i)\, K(\mathbf{x}, \mathbf{x}_i).$$
Thus if a minimizer $\hat{f} = f_1$ exists for (2), it can be written in the form (3) as claimed, with
$$a_i = -\frac{1}{2\lambda n} L_1(f_1(\mathbf{x}_i), y_i).$$
Note that this does not solve the problem, since the $a_i$ are expressed in terms of the solution itself.
But it does reduce the possibilities for what a solution looks like.
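For the squared error loss the self-referential condition becomes a linear system, which makes for a compact sketch. With $L(a, b) = (a - b)^2$ we have $L_1(a, b) = 2(a - b)$, so $a_i = -(f(\mathbf{x}_i) - y_i)/(\lambda n)$; substituting $f(\mathbf{x}_i) = \sum_j a_j K(\mathbf{x}_i, \mathbf{x}_j)$ gives $(\mathbf{K} + \lambda n I)\,\mathbf{a} = \mathbf{y}$. The Gaussian kernel, the synthetic data, the value of $\lambda$, and the small elimination routine `solve` below are all assumptions made for illustration.

```python
import math


def kernel(x: float, y: float) -> float:
    """Gaussian kernel, an assumed example; any Mercer kernel could be used."""
    return math.exp(-0.5 * (x - y) ** 2)


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


xs = [-2.0, -1.2, -0.5, 0.0, 0.4, 1.1, 1.7, 2.5]
ys = [math.sin(x) + 0.1 * (-1) ** i for i, x in enumerate(xs)]   # synthetic noisy data
lam, n = 0.1, len(xs)

# Linear system (K + lam * n * I) a = y for the coefficients in form (3).
K = [[kernel(xi, xj) for xj in xs] for xi in xs]
A = [[K[i][j] + (lam * n if i == j else 0.0) for j in range(n)] for i in range(n)]
a = solve(A, ys)


def f_hat(x: float) -> float:
    """The minimizer in the form (3): sum of a_i K(x, x_i)."""
    return sum(ai * kernel(x, xi) for ai, xi in zip(a, xs))


# Check the condition a_i = -(1 / (2 lam n)) * L_1(f(x_i), y_i) with L_1 = 2 (f - y).
residual = max(abs(ai + (f_hat(xi) - yi) / (lam * n)) for ai, xi, yi in zip(a, xs, ys))
print("coefficients a_i:", [round(ai, 4) for ai in a])
print("largest violation of the stationarity condition:", residual)
```

For other loss functions the condition on the $a_i$ is nonlinear and would need an iterative solver, but the search is still over only $n$ numbers.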