Research Article
Optimized K-Means Algorithm
Samir Brahim Belhaouari,1 Shahnawaz Ahmed,1 and Samer Mansour2
1 Department of Mathematics & Computer Science, College of Science and General Studies, Alfaisal University, PO Box 50927, Riyadh, Saudi Arabia
2 Department of Software Engineering, College of Engineering, Alfaisal University, PO Box 50927, Riyadh, Saudi Arabia
Correspondence should be addressed to Shahnawaz Ahmed: mathshan@gmail.com
Received 18 May 2014; Revised 9 August 2014; Accepted 13 August 2014; Published 7 September 2014
Academic Editor: Gradimir Milovanovic
Copyright © 2014 Samir Brahim Belhaouari et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The localization of the region of interest (ROI) which contains the face is the first step in any automatic recognition system, and is a special case of face detection. However, face localization from an input image is a challenging task due to possible variations in location, scale, pose, occlusion, illumination, facial expressions, and cluttered background. In this paper we introduce a new optimized k-means algorithm that finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. This method was tested to locate the faces in the input image based on image segmentation; it separates the input image into two classes: faces and nonfaces. To evaluate the proposed algorithm, the MIT-CBCL, BioID, and Caltech datasets are used. The results show significant localization accuracy.
1. Introduction
The face detection problem is to determine whether a face exists in the input image and, if so, to return its location. In the face localization problem, by contrast, it is given that the input image contains a face, and the goal is to determine the location of this face. The automatic recognition system, which is a special case of face detection, locates in its earliest phase the region of interest (ROI) that contains the face. Face localization in an input image is a challenging task due to possible variations in location, scale, pose, occlusion, illumination, facial expressions, and cluttered background. Various methods have been proposed to detect and/or localize faces in an input image; however, there is still a need to improve the performance of localization and detection methods. A survey on some face recognition and detection techniques can be found in [1]. A more recent survey, mainly on face detection, was written by Yang et al. [2].
One of the ROI detection trends is the idea of segmenting the image pixels into a number of groups or regions based on similarity properties. It gained more attention in the recognition technologies that rely on grouping the feature vectors to distinguish between the image regions and then concentrate on a particular region which contains the face. One of the earliest surveys on image segmentation techniques was done by Fu and Mui [3]. They classify the techniques into three classes: feature thresholding (clustering), edge detection, and region extraction. A later survey by N. R. Pal and S. K. Pal [4] concentrated only on fuzzy, nonfuzzy, and colour image techniques. Much effort has been devoted to ROI detection, which may be divided into two types: region growing and region clustering. The difference between the two types is that region clustering searches for the clusters without prior information, while region growing needs initial points, called seeds, to detect the clusters. The main problem in the region growing approach is to find suitable seeds, since the clusters will grow from the neighbouring pixels of these seeds based on a specified deviation. For seed selection, Wan and Higgins [5] defined a number of criteria to select insensitive initial points for the subclass of region growing. To reduce the region growing time, Chang and Li [6] proposed a fast region growing method using parallel segmentation.
As mentioned before, the region clustering approach searches for clusters without prior information. Pappas and Jayant [7] generalized the K-means algorithm to group the
Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2014, Article ID 506480, 14 pages, http://dx.doi.org/10.1155/2014/506480
image pixels based on spatial constraints and their intensity variations. These constraints include the following: first, the region intensity should be close to the data, and, second, it should have spatial continuity. Their generalized algorithm is adaptive and includes the previous spatial constraints. Like the K-means clustering algorithm, their algorithm is iterative. These spatial constraints are included using a Gibbs random field model, and they concluded that each region is characterized by a slowly varying intensity function. Because of unimportant features, the algorithm is not very accurate. Later, to improve their method, they proposed using a caricature of the original image to keep only the most significant features and remove the unimportant ones [8]. Besides the problem of time-costly segmentation, there are three basic problems that occur during the clustering process: center redundancy, dead centers, and centers trapped at local minima. Moving K-means (MKM), proposed in [9], can overcome these three basic problems by minimizing the center redundancy and the dead centers, as well as indirectly reducing the effect of centers trapped at local minima. In spite of that, MKM is sensitive to noise, centers are not located in the middle or centroid of a group of data, and some members of the centers with the largest fitness are assigned as members of a center with the smallest fitness. To reduce the effects of these problems, Isa et al. [10] proposed three modified versions of the MKM clustering algorithm for image segmentation: fuzzy moving k-means, adaptive moving k-means, and adaptive fuzzy moving k-means. After that, Sulaiman and Isa [11] introduced a new segmentation method based on a new clustering algorithm, the Adaptive Fuzzy k-means Clustering Algorithm (AFKM). On the other hand, the Fuzzy C-Means (FCM) algorithm has been used in image segmentation, but it still has some drawbacks: the lack of enough robustness to noise and outliers, the crucial parameter $\alpha$ being selected generally based on experience, and the time of segmenting an image being dependent on its size. To overcome the drawbacks of the FCM algorithm, Cai et al. [12] integrated both local spatial and gray information to propose a fast and robust Fuzzy C-means algorithm, called the Fast Generalized Fuzzy C-Means Algorithm (FGFCM), for image segmentation. Moreover, many researchers have provided definitions for some data characteristics that have a significant impact on K-means clustering analysis, including the scattering of the data, noise, high dimensionality, the size of the data, outliers in the data, dataset types, and the types and scales of attributes [13]. However, there is a need for more investigation to understand how data distributions affect K-means clustering performance. In [14], Xiong et al. provided a formal and organized study of the effect of skewed data distributions on K-means clustering. Previous research found an impact of high dimensionality on K-means clustering performance: the traditional Euclidean notion of proximity is not very useful on real-world high-dimensional datasets, such as document datasets and gene-expression datasets. To overcome this challenge, researchers turn to dimensionality reduction techniques such as multidimensional scaling [15]. Second, K-means has difficulty in detecting "natural" clusters with nonspherical shapes [13]. Modified K-means clustering algorithms are one direction to solve this problem. Guha et al. [16] proposed the CURE method, which makes use of multiple representative points to obtain the "natural" cluster shape information. The problem of outliers and noise in the data can also reduce the performance of clustering algorithms [17], especially for prototype-based algorithms such as K-means. One direction to solve this kind of problem is to combine outlier removal techniques with K-means clustering. For example, a simple method [9] of detecting outliers is based on a distance measure. On the other hand, many modified K-means clustering algorithms that work well for small or medium-size datasets are unable to deal with large datasets. Bradley et al. [18] introduced a discussion of scaling K-means clustering to large datasets.
Arthur and Vassilvitskii implemented a preliminary version, k-means++, in C++ to evaluate the performance of k-means. K-means++ [19] uses a careful seeding method to optimize the speed and the accuracy of k-means. Experiments were performed on four datasets, and the results showed that this algorithm is $O(\log k)$-competitive with the optimal clustering. Experiments also proved its better speed (70% faster) and accuracy (the potential value obtained is better by factors of 10 to 1000) than k-means. Kanungo et al. [20] also proposed an $O(n^3 \epsilon^{-d})$ algorithm for the k-means problem. It is $(9+\epsilon)$-competitive but quite slow in practice. Xiong et al. investigated the measures that best reflect the performance of K-means clustering [14]. An organized study was performed to understand the impact of data distributions on the performance of K-means clustering. Research work has also improved the traditional K-means clustering so that it can handle datasets with large variation of cluster sizes. This formal study illustrates that the entropy sometimes provides misleading information on the clustering performance. Based on this, a coefficient of variation (CV) is proposed to validate the clustering outcome. Experimental results proved that, for datasets with great variation in "true" cluster sizes, K-means lessens the variation in resultant cluster sizes (to less than 1.0), while, for datasets with small variation in "true" cluster sizes, K-means increases the variation in resultant cluster sizes (to greater than 0.3).
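The careful seeding step of k-means++ can be illustrated with a short sketch (a minimal one-dimensional Python illustration of $D^2$-weighted sampling, not the authors' C++ implementation; the function name and data are ours):

```python
import random

def kmeans_pp_seeds(points, k, rng=random.Random(0)):
    """Pick k initial centers with D^2 weighting (k-means++ style seeding):
    each new center is drawn with probability proportional to the squared
    distance to the nearest center already chosen."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min((p - c) ** 2 for c in centers) for p in points]
        # Sample the next center proportionally to d2.
        r = rng.random() * sum(d2)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers
```

Because already-chosen centers have zero weight, far-away points dominate the sampling, which is what makes the seeding spread across well-separated clusters.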
Scalability of k-means for large datasets is one of its major limitations. Chiang et al. [21] proposed a simple and easy-to-implement algorithm for reducing the computation time of k-means. This pattern reduction algorithm compresses and/or removes iteration patterns that are not likely to change their membership. The pattern is compressed by selecting one of the patterns to be removed and setting its value to the average of all patterns removed. If $D[r_l^i]$ is the pattern to be removed, then we can write it as follows:
\[
D[r_l^i] = \frac{1}{|R_l^i|} \sum_{j=1}^{|R_l^i|} D[r_{ij}^l], \quad \text{where } r_l^i \in R_l^i. \tag{1}
\]
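The compression step in (1) is just a component-wise average of the removed patterns; a minimal sketch (the function name and list-of-lists representation are ours):

```python
def compress_patterns(removed):
    """Replace a set of removed patterns by a single representative:
    their component-wise average, as in the pattern-reduction scheme (1)."""
    n = len(removed)
    dim = len(removed[0])
    return [sum(p[j] for p in removed) / n for j in range(dim)]

# Two 2-D patterns collapse into their average.
rep = compress_patterns([[1.0, 2.0], [3.0, 4.0]])  # [2.0, 3.0]
```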
This algorithm can also be applied to other clustering algorithms, like population-based clustering algorithms. The computation time of many clustering algorithms can be reduced using this technique, as it is especially effective for large and high-dimensional datasets. In [22] Bradley
et al. proposed Scalable k-means, which uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always produce the same result. The factors that degrade the performance of Scalable k-means are the compression processes and the compression ratio. Farnstrom et al. also proposed a simple single-pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24], which uses the block and incremental concept to improve the stability of Scalable k-means. The computation time of k-means can also be reduced using parallel bisecting k-means [25] and triangle inequality [26] methods.
Although K-means is the most popular and simple clustering algorithm, the major difficulty with this process is that it cannot ensure globally optimum results, because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram. This method automates the selection of the initial cluster centers. To validate the performance, experiments were performed on a range of artificial and real-world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:
\[
\mathrm{ER} = \frac{\text{Number of misclassified objects}}{\text{Total number of objects}} \times 100. \tag{2}
\]
The lower the error rate, the better the clustering. The proposed algorithm is able to produce better clustering results than the traditional K-means. Experimental results showed a 60% reduction in iterations for defining the centroids.
All previous solutions and efforts to increase the performance of the K-means algorithm still need more investigation, since they are all looking for a local minimum. Searching for the global minimum will certainly improve the performance of the algorithm. The rest of the paper is organized as follows. In Section 2 the proposed method is introduced, which finds the optimal centers for each cluster, corresponding to the global minimum of the k-means cluster. In Section 3 the results are discussed and analyzed using two sets from the MIT, BioID, and Caltech datasets. Finally, in Section 4 conclusions are drawn.
2. K-Means Clustering
K-means, an unsupervised learning algorithm first proposed by MacQueen in 1967 [28], is a popular method of cluster analysis. It is a process of organizing the specified objects into uniform classes, called clusters, based on similarities among the objects with respect to certain criteria. It solves the well-known clustering problem by considering certain attributes and performing an iterative alternating fitting process. This algorithm partitions a dataset $X$ into $k$ disjoint clusters, such that each observation belongs to the cluster with the nearest mean. Let $x_i$ be the centroid of cluster $c_i$ and let $d(x_j, x_i)$ be the
Figure 1: Comparison of k-means variants (global k-means, fast global k-means, k-means, and min k-means): clustering error versus the number of clusters $k$.
dissimilarity between $x_i$ and object $x_j \in c_i$. Then the function minimized by k-means is given by the following equation:
\[
\min_{x_1, \ldots, x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \tag{3}
\]
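The objective in (3) is what the standard k-means iteration minimizes locally, by alternating point assignment and centroid updates; a minimal one-dimensional sketch (variable names ours; squared Euclidean distance assumed for $d$):

```python
import random

def kmeans(points, k, iters=100, rng=random.Random(1)):
    """Plain k-means (Lloyd's algorithm) on 1-D data: alternately assign
    each point to its nearest centroid, then recompute each centroid as
    the mean of its cluster, until assignments stop changing."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in points:
            i = min(range(k), key=lambda j: (x - centroids[j]) ** 2)
            clusters[i].append(x)
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:  # converged to a (possibly local) minimum
            break
        centroids = new
    return centroids
```

As the surrounding text stresses, this procedure converges only to a local minimum of (3), which is the limitation the proposed method targets.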
K-means clustering is utilized in a vast number of applications, including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered one of the fastest clustering algorithms, with a number of variants that are sensitive to the selection of initial points and are intended to solve many issues of k-means, like the evaluation of the number of clusters [31], the method of initialization of the cluster centroids [32], and the speed of the algorithm [33].
2.1. Modifications in K-Means Clustering. The global k-means algorithm [34] is a proposal to improve the global search properties of the k-means algorithm, and its performance on very large datasets, by computing clusters successively. The main weakness of k-means clustering, that is, its sensitivity to the initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means, as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms the other k-means algorithms.
Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm in order to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the $k-1$ cluster centers from the previous iteration are used to compute the k-partition of a dataset. The starting point of the $k$th cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points $A$ in the $n$-dimensional space, the global k-means computes the centroid $X^1$ of the set $A$ as in the following equation:
\[
X^1 = \frac{1}{m} \sum_{i=1}^{m} a_i, \quad a_i \in A, \; i = 1, \ldots, m. \tag{4}
\]
Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison to the multistart k-means (MS k-means) and the GKM algorithms when the number of clusters $k > 5$. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm. The method of creating the next cluster center in the global k-means algorithm is modified, and this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of the standard k-means clustering is the Global Kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization. It can handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.
Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, proposing a weight selection procedure in the W-k-means algorithm. The evaluation function $F(I)$ is used for comparison:
\[
F(I) = \sqrt{R} \sum_{i=1}^{R} \frac{(e_i / 256)^2}{\sqrt{A_i}}, \tag{5}
\]
where $I$ is the segmented image, $R$ is the number of regions in the segmented image, $A_i$ is the area, or the number of pixels, of the $i$th region, and $e_i$ is the color error of region $i$, that is, the sum of the Euclidean distances of the color vectors between the original image and the segmented image. Smaller values of $F(I)$ give better segmentation results. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
Figure 2: The separation points in two dimensions: separators $M_x$ and $M_y$ (with $M_{x1}$, $M_{x2}$, $M_{y1}$, $M_{y2}$) and cluster centers $C_1$ and $C_2$.
The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40] an FPGA implementation of real-time k-means clustering is presented for colored images; a filtering algorithm is used for the hardware implementation. Sulaiman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering as compared to other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.
2.2. The Proposed Methodology. In this method the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into vectors of pixel values, followed by clustering the pixels, based on applying a certain threshold determined by the algorithm, into two classes. Pixels with values below the threshold are put in the nonface class. The rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class. Therefore, the algorithm is applied again to the face class to obtain only the face part.
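The two-class split described above can be sketched as follows (a simplified illustration of ours; the paper derives the optimal threshold $M$, while here the sample mean stands in as a placeholder):

```python
def split_by_threshold(pixels, M):
    """Split flattened pixel values into two classes around threshold M:
    values below M go to the nonface class, values at or above M go to
    the face class (the algorithm is then re-applied to the face class)."""
    nonface = [p for p in pixels if p < M]
    face = [p for p in pixels if p >= M]
    return face, nonface

# The paper's algorithm computes the optimal M; as a stand-in we use the mean.
pixels = [12, 40, 200, 220, 35, 180]
M = sum(pixels) / len(pixels)          # 114.5 for this toy data
face, nonface = split_by_threshold(pixels, M)
```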
2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to solve the limitations of the standard version by using differential equations to determine the optimum separation point. This algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means cluster. As an example, say we would like to cluster an image into two subclasses: face and nonface. So we look for a separator which will separate the image into two different clusters in two dimensions. Then the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these
points, we propose the following modifications in the continuous and discrete cases.

Let $x_1, x_2, x_3, \ldots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$:
\[
f(x) = \frac{1}{N} \sum_{i=1}^{N} \delta(x - x_i), \quad x_i \in \mathbb{R}, \tag{6}
\]
where $\delta(x)$ is the Dirac function.

When the size of the data is very large, the question of the cost of the minimization of $\sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let
\[
\mathrm{Var}\,k = \sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2. \tag{7}
\]
Then it is enough to minimize $\mathrm{Var}\,1$ without losing generality, since
\[
\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j - \mu_i|^2 \ge \sum_{l=1}^{d} \min_{s_i} \left( \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j^l - \mu_i^l|^2 \right), \tag{8}
\]
where $d$ is the dimension of $x_j = (x_j^1, \ldots, x_j^d)$ and $\mu_j = (\mu_j^1, \ldots, \mu_j^d)$.
2.2.2. Continuous Case. For the continuous case, $f(x)$ is like the probability density function (PDF). Then for two classes we need to minimize the following equation:
\[
\mathrm{Var}\,2 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \tag{9}
\]
If $d \ge 2$, we can use (8):
\[
\min \mathrm{Var}\,2 \ge \sum_{k=1}^{d} \min \left( \int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\,dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\,dx \right). \tag{10}
\]
Let $x \in \mathbb{R}$ be any random variable with probability density function $f(x)$; we want to find $\mu_1$ and $\mu_2$ that minimize (9):
\[
(\mu_1^*, \mu_2^*) = \arg \min_{M, \mu_1, \mu_2} \left[ \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx \right]. \tag{11}
\]
\[
\min_{M, \mu_1, \mu_2} \mathrm{Var}\,2 = \min_{M} \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx \right], \tag{12}
\]
\[
\min_{M, \mu_1, \mu_2} \mathrm{Var}\,2 = \min_{M} \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1(M))^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2(M))^2 f(x)\,dx \right]. \tag{13}
\]
We know that
\[
\min_{x} E[(X - x)^2] = E[(X - E(X))^2] = \mathrm{Var}(X). \tag{14}
\]
So we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\mathrm{Var}_1$ and $\mathrm{Var}_2$ as follows:
\[
\mathrm{Var}_1 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx, \qquad \mathrm{Var}_2 = \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \tag{15}
\]
After simplification we get
\[
\mathrm{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx,
\]
\[
\mathrm{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2 \int_{M}^{\infty} f(x)\,dx - 2\mu_2 \int_{M}^{\infty} x f(x)\,dx. \tag{16}
\]
To find the minimum, $\mathrm{Var}_1$ and $\mathrm{Var}_2$ are partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification, we conclude that the minimization occurs when
\[
\mu_1 = \frac{E(X^-)}{P(X^-)}, \qquad \mu_2 = \frac{E(X^+)}{P(X^+)}, \tag{17}
\]
where $X^- = X \cdot 1_{x < M}$ and $X^+ = X \cdot 1_{x \ge M}$.
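On a finite sample, (17) says that each optimal center is simply the conditional mean of its own side of the separator $M$; a quick numeric sketch (function name ours):

```python
def side_means(xs, M):
    """mu1 = E(X-)/P(X-), mu2 = E(X+)/P(X+) as in (17): the centers are
    the conditional means of the observations on each side of M."""
    left = [x for x in xs if x < M]     # realizations of X- (nonzero part)
    right = [x for x in xs if x >= M]   # realizations of X+ (nonzero part)
    return sum(left) / len(left), sum(right) / len(right)

mu1, mu2 = side_means([1.0, 2.0, 7.0, 9.0], 5.0)  # (1.5, 8.0)
```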
To find the minimum of $\mathrm{Var}\,2$, we need to find $\partial\mu/\partial M$:
\[
\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial \mu_1}{\partial M} = \frac{M f(M)\,P(X^-) - f(M)\,E(X^-)}{P^2(X^-)},
\]
\[
\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial \mu_2}{\partial M} = \frac{f(M)\,E(X^+) - M f(M)\,P(X^+)}{P^2(X^+)}. \tag{18}
\]
After simplification we get
\[
\frac{\partial \mu_1}{\partial M} = \frac{f(M)}{P^2(X^-)} \left[ M P(X^-) - E(X^-) \right], \qquad \frac{\partial \mu_2}{\partial M} = \frac{f(M)}{P^2(X^+)} \left[ E(X^+) - M P(X^+) \right]. \tag{19}
\]
In order to find the minimum of $\mathrm{Var}\,2$, we find $\partial \mathrm{Var}_1 / \partial M$ and $\partial \mathrm{Var}_2 / \partial M$ separately and add them, as follows:
\[
\frac{\partial}{\partial M}(\mathrm{Var}\,2) = \frac{\partial}{\partial M}(\mathrm{Var}_1) + \frac{\partial}{\partial M}(\mathrm{Var}_2),
\]
\[
\frac{\partial \mathrm{Var}_1}{\partial M} = \frac{\partial}{\partial M} \left( \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx \right)
\]
\[
= M^2 f(M) + 2\mu_1 \mu_1' P(X^-) + \mu_1^2 f(M) - 2\mu_1' E(X^-) - 2\mu_1 M f(M). \tag{20}
\]
After simplification we get
\[
\frac{\partial \mathrm{Var}_1}{\partial M} = f(M) \left[ M^2 + \frac{E^2(X^-)}{P(X^-)^2} - 2M \frac{E(X^-)}{P(X^-)} \right]. \tag{21}
\]
Similarly,
\[
\frac{\partial \mathrm{Var}_2}{\partial M} = -M^2 f(M) + 2\mu_2 \mu_2' P(X^+) - \mu_2^2 f(M) - 2\mu_2' E(X^+) + 2\mu_2 M f(M). \tag{22}
\]
After simplification we get
\[
\frac{\partial \mathrm{Var}_2}{\partial M} = f(M) \left[ -M^2 - \frac{E^2(X^+)}{P(X^+)^2} + 2M \frac{E(X^+)}{P(X^+)} \right]. \tag{23}
\]
Adding these two differentials and simplifying, it follows that
\[
\frac{\partial}{\partial M}(\mathrm{Var}\,2) = f(M) \left[ \left( \frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2} \right) - 2M \left( \frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)} \right) \right]. \tag{24}
\]
Equating this to 0 gives
\[
\frac{\partial}{\partial M}(\mathrm{Var}\,2) = 0,
\]
\[
\left( \frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2} \right) - 2M \left( \frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)} \right) = 0,
\]
\[
\left[ \frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)} \right] \left[ \frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M \right] = 0. \tag{25}
\]
But $[E(X^-)/P(X^-) - E(X^+)/P(X^+)] = 0$ is impossible, as in that case $\mu_1 = \mu_2$, which contradicts the initial assumption. Therefore
\[
\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2} \left[ \frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} \right], \quad M = \frac{\mu_1 + \mu_2}{2}. \tag{26}
\]
Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\mathrm{Var}\,2$ directly.
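On sample data, one simple way to look for values of $M$ satisfying (26) is fixed-point iteration, $M \leftarrow (\mu_1(M) + \mu_2(M))/2$ (a sketch of ours, not the paper's exact procedure; from a given start it finds one fixed point):

```python
def optimal_separator(xs, M0=None, iters=100):
    """Iterate M <- (mu1(M) + mu2(M)) / 2 until the separator condition
    (26) is numerically satisfied; mu1, mu2 are the side means from (17)."""
    M = sum(xs) / len(xs) if M0 is None else M0
    for _ in range(iters):
        left = [x for x in xs if x < M]
        right = [x for x in xs if x >= M]
        if not left or not right:   # separator fell outside the data
            break
        new_M = (sum(left) / len(left) + sum(right) / len(right)) / 2
        if abs(new_M - M) < 1e-12:  # fixed point of (26) reached
            break
        M = new_M
    return M
```

For well-separated two-cluster data the iteration settles on the midpoint of the two cluster means, as (26) requires.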
To find the minimum in terms of $M$, it is enough to derive
\[
\frac{\partial\,\mathrm{Var}\,2(\mu_1(M), \mu_2(M))}{\partial M} = f(M) \left[ \frac{E^2(X^-)}{P(X < M)^2} - \frac{E^2(X^+)}{P(X \ge M)^2} - 2M \left( \frac{E(X^-)}{P(X < M)} - \frac{E(X^+)}{P(X \ge M)} \right) \right]. \tag{27}
\]
Then we find
\[
M = \frac{1}{2} \left[ \frac{E(X^+)}{P(X^+)} + \frac{E(X^-)}{P(X^-)} \right]. \tag{28}
\]
2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is
\[
f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\ 0, & \text{otherwise}, \end{cases} \tag{29}
\]
where $a$ and $b$ are the two parameters of the distribution. The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are
\[
E(X^+) = \frac{1}{2} \left( \frac{b^2 - M^2}{b - a} \right), \qquad E(X^-) = \frac{1}{2} \left( \frac{M^2 - a^2}{b - a} \right). \tag{30}
\]
Substituting these in (28) and solving, we get
\[
M = \frac{1}{2}(a + b), \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = \frac{1}{4}(3a + b), \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = \frac{1}{4}(a + 3b). \tag{31}
\]
(2) Log-Normal Distribution. The probability density function of a log-normal distribution is
\[
f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \quad x > 0. \tag{32}
\]
The probabilities to the right and left of $M$ are, respectively,
\[
P(X^+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\!\left( \frac{\ln M - \mu}{\sigma\sqrt{2}} \right), \qquad P(X^-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\!\left( \frac{\ln M - \mu}{\sigma\sqrt{2}} \right). \tag{33}
\]
The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are
\[
E(X^+) = \frac{1}{2} e^{\mu + \sigma^2/2} \left[ 1 + \operatorname{erf}\!\left( \frac{\mu + \sigma^2 - \ln M}{\sigma\sqrt{2}} \right) \right],
\]
\[
E(X^-) = \frac{1}{2} e^{\mu + \sigma^2/2} \left[ 1 + \operatorname{erf}\!\left( \frac{-\mu - \sigma^2 + \ln M}{\sigma\sqrt{2}} \right) \right]. \tag{34}
\]
Substituting these in (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get
\[
M = 4.244942583, \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = 7.293057348. \tag{35}
\]
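The numbers in (35) can be reproduced by iterating the fixed-point condition (28) with the erf expressions (33)-(34) (a numeric sketch of ours, using the standard library's `math.erf`):

```python
import math

def lognormal_centers(M, mu=0.0, sigma=1.0):
    """mu1 and mu2 for LogNormal(mu, sigma) split at M, from (33)-(34)."""
    s2 = sigma * math.sqrt(2)
    P_plus = 0.5 - 0.5 * math.erf((math.log(M) - mu) / s2)
    P_minus = 0.5 + 0.5 * math.erf((math.log(M) - mu) / s2)
    scale = 0.5 * math.exp(mu + sigma ** 2 / 2)
    E_plus = scale * (1 + math.erf((mu + sigma ** 2 - math.log(M)) / s2))
    E_minus = scale * (1 + math.erf((-mu - sigma ** 2 + math.log(M)) / s2))
    return E_minus / P_minus, E_plus / P_plus

# Fixed-point iteration for (28): M = (mu1(M) + mu2(M)) / 2.
M = 1.0
for _ in range(200):
    mu1, mu2 = lognormal_centers(M)
    M = (mu1 + mu2) / 2
# M converges to about 4.2449, with mu1 ~ 1.1968 and mu2 ~ 7.2931, as in (35).
```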
(3) Normal Distribution. The probability density function of a normal distribution is
\[
f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}. \tag{36}
\]
The probabilities to the right and left of $M$ are, respectively,
\[
P(X^+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\!\left( \frac{M - \mu}{\sigma\sqrt{2}} \right), \qquad P(X^-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\!\left( \frac{M - \mu}{\sigma\sqrt{2}} \right). \tag{37}
\]
The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are
\[
E(X^+) = -\frac{1}{2} \mu \left[ -1 + \operatorname{erf}\!\left( \frac{M - \mu}{\sigma\sqrt{2}} \right) \right] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M - \mu)^2 / 2\sigma^2},
\]
\[
E(X^-) = \frac{1}{2} \mu \left[ 1 + \operatorname{erf}\!\left( \frac{M - \mu}{\sigma\sqrt{2}} \right) \right] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M - \mu)^2 / 2\sigma^2}. \tag{38}
\]
Substituting these in (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$ and solving, we get
\[
M = 0, \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = \sqrt{\frac{2}{\pi}}. \tag{39}
\]
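A quick empirical check of (39): sampling $N(0, 1)$ and splitting at $M = 0$, the two conditional means approach $\mp\sqrt{2/\pi} \approx \mp 0.798$ (a sketch of ours; the sample size and seed are arbitrary):

```python
import math
import random

# Draw standard-normal samples and split them at the separator M = 0.
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
neg = [x for x in xs if x < 0.0]
pos = [x for x in xs if x >= 0.0]

# Conditional means of each side, which (39) predicts to be -/+ sqrt(2/pi).
mu1 = sum(neg) / len(neg)
mu2 = sum(pos) / len(pos)
target = math.sqrt(2 / math.pi)   # about 0.7979
```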
2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$.

Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean of all $x \le x_i$, and $\mu_{x \ge x_i}$ as the mean of all $x \ge x_i$. Define $\mathrm{Var}\,2(M_{i-1})$ as the variance of the two centers forced by $M_{i-1}$ as a separator:
\[
\mathrm{Var}\,2(M_{i-1}) = \sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j \le x_{i-1}})^2 p_j + \sum_{x_j \ge x_i} (x_j - \mu_{x_j \ge x_i})^2 p_j, \tag{40}
\]
Var (119872119894) = sum
119909119895le119909119894
(119909119895minus 120583119909119895le119909119894
)
2
119901119895
+ sum
119909119895ge119909119894+1
(119909119895minus 120583119909119895ge119909119894+1
)
2
119901119895
(41)
8 Mathematical Problems in Engineering
The first part of $\operatorname{Var}_2(M_{i-1})$ simplifies as follows:
\[
\begin{aligned}
\sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j \le x_{i-1}})^2\, p_j
&= \sum_{x_j \le x_{i-1}} x_j^2\, p_j - 2\mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j\, p_j + \mu_{x_j \le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j \\
&= E(X^2 \cdot 1_{x_j \le x_{i-1}}) - \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})}.
\end{aligned}
\tag{42}
\]
Similarly,
\[
\sum_{x_j \ge x_i} (x_j - \mu_{x_j \ge x_i})^2\, p_j = E(X^2 \cdot 1_{x_j \ge x_i}) - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)},
\]
so that
\[
\begin{aligned}
\operatorname{Var}_2(M_{i-1}) &= E(X^2 \cdot 1_{x_j \le x_{i-1}}) - \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + E(X^2 \cdot 1_{x_j \ge x_i}) - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)}, \\
\operatorname{Var}_2(M_i) &= E(X^2 \cdot 1_{x_j \le x_i}) - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} + E(X^2 \cdot 1_{x_j \ge x_{i+1}}) - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}.
\end{aligned}
\tag{43}
\]
Define $\Delta\operatorname{Var}_2 = \operatorname{Var}_2(M_i) - \operatorname{Var}_2(M_{i-1})$:
\[
\Delta\operatorname{Var}_2 = \left[-\frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] - \left[-\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)}\right].
\tag{44}
\]
Now, in order to simplify the calculation, we rearrange $\Delta\operatorname{Var}_2$ as follows:
\[
\Delta\operatorname{Var}_2 = \left[\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)}\right] + \left[\frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right].
\tag{45}
\]
The first part is simplified as follows:
\[
\begin{aligned}
&\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} \\
&\quad = \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{\left[E(X \cdot 1_{x_j \le x_{i-1}}) + x_i p_i\right]^2}{P(X \le x_{i-1}) + p_i} \\
&\quad = \frac{1}{P(X \le x_{i-1})\left[P(X \le x_{i-1}) + p_i\right]} \\
&\qquad \times \left\{E^2(X \cdot 1_{x_j \le x_{i-1}})\left[P(X \le x_{i-1}) + p_i\right] - P(X \le x_{i-1})\left[E(X \cdot 1_{x_j \le x_{i-1}}) + x_i p_i\right]^2\right\} \\
&\quad = \frac{1}{P(X \le x_{i-1})\left[P(X \le x_{i-1}) + p_i\right]} \\
&\qquad \times \left\{p_i E^2(X \cdot 1_{x_j \le x_{i-1}}) - 2 x_i p_i E(X \cdot 1_{x_j \le x_{i-1}})\, P(X \le x_{i-1}) - x_i^2 p_i^2\, P(X \le x_{i-1})\right\}.
\end{aligned}
\tag{46}
\]
Similarly, simplifying the second part yields
\[
\begin{aligned}
&\frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \\
&\quad = \frac{\left[E(X \cdot 1_{x_j \ge x_{i+1}}) + x_i p_i\right]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \\
&\quad = \frac{1}{P(X \ge x_{i+1})\left[P(X \ge x_{i+1}) + p_i\right]} \\
&\qquad \times \left\{P(X \ge x_{i+1})\left[E(X \cdot 1_{x_j \ge x_{i+1}}) + x_i p_i\right]^2 - E^2(X \cdot 1_{x_j \ge x_{i+1}})\left[P(X \ge x_{i+1}) + p_i\right]\right\} \\
&\quad = \frac{1}{P(X \ge x_{i+1})\left[P(X \ge x_{i+1}) + p_i\right]} \\
&\qquad \times \left\{-p_i E^2(X \cdot 1_{x_j \ge x_{i+1}}) + 2 x_i p_i E(X \cdot 1_{x_j \ge x_{i+1}})\, P(X \ge x_{i+1}) + x_i^2 p_i^2\, P(X \ge x_{i+1})\right\}.
\end{aligned}
\tag{47}
\]
In general, these types of optimization problems arise when the dataset is very big. If we consider $N$ to be very large, then $p_k \ll \max\left(\sum_{i \ge k+1} p_i,\; \sum_{i \le k-1} p_i\right)$, $P(X \le x_{i-1}) + p_i \approx P(X \le x_{i-1})$, and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_{i+1})$, so that
\[
\begin{aligned}
\Delta\operatorname{Var}_2 \approx{}& \left[\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})}\right] \\
&- 2 p_i x_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] \\
&- p_i^2 x_i^2 \left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right],
\end{aligned}
\tag{48}
\]
and as $\lim_{N \to \infty} p_i^2 x_i^2 \left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right] = 0$, we can further simplify $\Delta\operatorname{Var}_2$ as follows:
\[
\begin{aligned}
\Delta\operatorname{Var}_2 &\approx \left[\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})}\right] - 2 p_i x_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] \\
&= p_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] \times \left[\left(\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right) - 2 x_i\right].
\end{aligned}
\tag{49}
\]
To find the minimum of $\Delta\operatorname{Var}_2$, let us locate where $\Delta\operatorname{Var}_2 = 0$:
\[
p_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] \times \left[\left(\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right) - 2 x_i\right] = 0.
\tag{50}
\]
Figure 3: Discrete random variables.
But $p_i \neq 0$, and $\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] < 0$, since $(x_i)$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where
\[
\mu_{i-1} = \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})},
\qquad
\mu_i = \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}.
\tag{51}
\]
Therefore,
\[
\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] - 2 x_i = 0.
\tag{52}
\]
Let us define
\[
F(x_i) = 2 x_i - \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right].
\]
Then
\[
F(x_i) = 0
\;\Longrightarrow\;
x_i = \frac{1}{2}\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right]
\;\Longrightarrow\;
x_i = \frac{\mu_{i-1} + \mu_i}{2},
\tag{53}
\]
where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i) \cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding the few local minima, the global minimum is obtained by taking the smallest among them.
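The discrete search above can be sketched in code (our illustration; the function and variable names are ours, not the paper's). Prefix sums give $\mu_{i-1}$ and $\mu_i$ in $O(1)$ per candidate, $F$ is scanned for sign changes, and the best of the resulting local minima is returned:

```python
import numpy as np

def best_two_means_split(data):
    """Optimal two-center 1-D split via the sign changes of F in (53)."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    s1, s2 = np.cumsum(x), np.cumsum(x * x)          # prefix sums of x and x^2

    def sse(i):                                      # cost with left = x[:i+1], right = x[i+1:]
        nl, nr = i + 1, n - i - 1
        left = s2[i] - s1[i] ** 2 / nl
        right = (s2[-1] - s2[i]) - (s1[-1] - s1[i]) ** 2 / nr
        return left + right

    # F(x_i) = 2*x_i - (mean of x_j <= x_{i-1}  +  mean of x_j >= x_{i+1})
    F = [2 * x[i] - (s1[i - 1] / i + (s1[-1] - s1[i]) / (n - 1 - i))
         for i in range(1, n - 1)]
    cand = [j + 1 for j in range(len(F) - 1) if F[j] < 0 <= F[j + 1]]  # sign changes
    if not cand:                                     # fall back to scanning every split
        cand = list(range(1, n - 1))
    best = min(cand, key=sse)
    M = 0.5 * (x[best] + x[best + 1])                # separator between the clusters
    return M, sse(best)
```

On the toy sample `[1, 2, 3, 10, 11, 12]` the only sign change of $F$ lies between 3 and 10, giving the separator $M = 6.5$ with total within-cluster variation 4.0.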
3 Face Localization Using Proposed Method
Figure 4: The original image.

Figure 5: The face with shade only.

The superiority of the modified k-means algorithm over the standard algorithm in finding a global minimum is now clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified k-means algorithm. The image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.
In order to remove the illuminated background part, another threshold is applied to the remaining pixels using the algorithm, this time meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. The pixels with values less than the threshold are then assigned the value 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part. The result of the removal of the illuminated background part is shown in Figure 6.

Figure 6: The face without illumination.

Figure 7: The filtered image.

One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
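The two-pass scheme can be illustrated on a synthetic image (our sketch; the intensity values and the simple one-dimensional two-center thresholding below are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def two_means_threshold(v):
    # 1-D two-center Lloyd iteration; the threshold is the midpoint of the
    # two conditional means, as in eq. (53).
    t = v.mean()
    for _ in range(100):
        lo, hi = v[v <= t], v[v > t]
        if lo.size == 0 or hi.size == 0:
            break
        new_t = 0.5 * (lo.mean() + hi.mean())
        if abs(new_t - t) < 1e-9:
            break
        t = new_t
    return t

rng = np.random.default_rng(0)
img = rng.normal(30, 5, (64, 64))                    # dark background
img[:, 40:] = rng.normal(140, 5, (64, 24))           # illuminated background
img[20:44, 45:60] = rng.normal(220, 5, (24, 15))     # bright "face" block
img = img.clip(0, 255)

t1 = two_means_threshold(img.ravel())                # pass 1: remove dark background
t2 = two_means_threshold(img[img > t1])              # pass 2: remove illuminated part
face_mask = img > t2                                 # nonzero pixels left = "face"
```

The first threshold lands between the dark background and the rest, the second between the illuminated background and the bright block, so the final mask isolates the synthetic "face" region.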
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
4 Results
In this section, the experimental results of the proposed algorithm are analyzed and discussed. It starts with a description of the datasets used in this work, followed by the localization results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 images from this database were selected based on different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset, with image size 896 × 592; 300 images from this database were likewise selected based on different conditions such as indoor/outdoor settings.
Figure 8: The corresponding window in the original image.

Figure 9: Samples from the MIT-CBCL dataset.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken under different light conditions, with clutter background, scale changes, and different poses. A few examples of these images are shown in Figure 9.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one out of 23 different test persons. The images were taken under different lighting conditions with complex backgrounds and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the result was 100% localization accuracy with a localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases.

Figure 10: Sample of faces from the BioID dataset for localization.

Figure 11: Sample of faces from the Caltech dataset for localization.

Table 1: Face localization using the modified k-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter backgrounds, as well as indoor/outdoor settings.

Dataset   | Images | Accuracy (%)
MIT-Set 1 | 150    | 100
MIT-Set 2 | 150    | 100
BioID     | 300    | 100
Caltech   | 300    | 100

In this test, we focused on the use of face images from different angles, from 30 degrees left to 30 degrees right, with variation in the background, as well as indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified k-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter considered in this test was the localization time: the proposed method can locate the face in a very short time because it has low complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.

Figure 12 shows some examples of face localization on MIT-CBCL. In Figure 13(a), an original image from the BioID dataset is shown; the localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts, as shown in Figure 13(c), after which the area of the ROI is determined. Figure 14 shows the face position in an image from the Caltech dataset.
5 Conclusion
In this paper, we focus on improving ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT-CBCL, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is a generalization of our approach to higher dimensions (8) and to further subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments
This study is a part of Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID support downloads: BioID face database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
image pixels based on spatial constraints and their intensity variations. These constraints include the following: first, the region intensity should be close to the data, and second, it should have spatial continuity. Their generalization algorithm is adaptive and includes the previous spatial constraints. Like the K-means clustering algorithm, their algorithm is iterative. These spatial constraints are included using a Gibbs random field model, and they concluded that each region is characterized by a slowly varying intensity function. Because of unimportant features, the algorithm is not very accurate. Later, to improve their method, they proposed using a caricature of the original image to keep only the most significant features and remove the unimportant ones [8]. Besides the problem of time-costly segmentation, there are three basic problems that occur during the clustering process: center redundancy, dead centers, and centers trapped at local minima. Moving K-means (MKM), proposed in [9], can overcome these three basic problems by minimizing center redundancy and dead centers, as well as indirectly reducing the effect of centers trapped at local minima. In spite of that, MKM is sensitive to noise, centers are not located in the middle or centroid of a group of data, and some members of the centers with the largest fitness are assigned as members of a center with the smallest fitness. To reduce the effects of these problems, Isa et al. [10] proposed three modified versions of the MKM clustering algorithm for image segmentation: fuzzy moving k-means, adaptive moving k-means, and adaptive fuzzy moving k-means. After that, Sulaiman and Isa [11] introduced a new segmentation method based on a new clustering algorithm, the Adaptive Fuzzy K-means Clustering Algorithm (AFKM). On the other hand, the Fuzzy C-Means (FCM) algorithm has been used in image segmentation, but it still has some drawbacks: the lack of enough robustness to noise and outliers, the crucial parameter $\alpha$ generally being selected based on experience,
and the time of segmenting an image being dependent on its size. To overcome the drawbacks of the FCM algorithm, Cai et al. [12] integrated both local spatial and gray information to propose a fast and robust fuzzy C-means algorithm called the Fast Generalized Fuzzy C-Means (FGFCM) algorithm for image segmentation. Moreover, many researchers have characterized data properties that have a significant impact on K-means clustering analysis, including the scattering of the data, noise, high dimensionality, the size of the data, outliers in the data, dataset types, and the types and scales of attributes [13]. However, there is a need for more investigation to understand how data distributions affect K-means clustering performance. In [14], Xiong et al. provided a formal and organized study of the effect of skewed data distributions on K-means clustering. Previous research established the impact of high dimensionality on K-means clustering performance: the traditional Euclidean notion of proximity is not very useful on real-world high-dimensional datasets, such as document datasets and gene-expression datasets. To overcome this challenge, researchers turn to dimensionality reduction techniques such as multidimensional scaling [15]. Second, K-means has difficulty in detecting "natural" clusters with nonspherical shapes [13]. Modified K-means
clustering algorithms are one direction to solve this problem. Guha et al. [16] proposed the CURE method, which makes use of multiple representative points to obtain the "natural" cluster shape information. The problem of outliers and noise in the data can also reduce the performance of clustering algorithms [17], especially for prototype-based algorithms such as K-means. One direction to solve this kind of problem is to combine outlier removal techniques with K-means clustering; for example, a simple method [9] of detecting outliers is based on a distance measure. On the other hand, many modified K-means clustering algorithms that work well for small or medium-size datasets are unable to deal with large datasets. Bradley et al. [18] introduced a discussion of scaling K-means clustering to large datasets.
Arthur and Vassilvitskii implemented a preliminary version of k-means++ in C++ to evaluate the performance of k-means. K-means++ [19] uses a careful seeding method to optimize the speed and the accuracy of k-means. Experiments were performed on four datasets, and the results showed that this algorithm is $O(\log k)$-competitive with the optimal clustering. The experiments also showed better speed (70% faster) and accuracy (the potential value obtained is better by factors of 10 to 1000) than k-means. Kanungo et al. [20] also proposed an $O(n^{3}\epsilon^{-d})$ algorithm for the k-means problem; it is $(9+\epsilon)$-competitive but quite slow in practice. Xiong et al. investigated the measures that best reflect the performance of K-means clustering [14]. An organized study was performed to understand the impact of data distributions on the performance of K-means clustering. This work also improved traditional K-means clustering so that it can handle datasets with large variation in cluster sizes. The formal study illustrates that entropy sometimes provides misleading information on the clustering performance. Based on this, a coefficient of variation (CV) is proposed to validate the clustering outcome. Experimental results showed that, for datasets with great variation in "true" cluster sizes, K-means lessens the variation in resultant cluster sizes (to less than 1.0), while for datasets with small variation in "true" cluster sizes it increases the variation (to greater than 0.3).
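The $D^2$ seeding at the heart of k-means++ is compact enough to sketch (our illustration, not the authors' C++ implementation):

```python
import numpy as np

def kmeanspp_seeds(X, k, seed=0):
    # k-means++ [19]: the first center is chosen uniformly at random; each
    # subsequent center is drawn with probability proportional to the squared
    # distance to the nearest center already chosen.
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)
```

Because points coinciding with an already chosen center have zero selection probability, the $k$ seeds are pairwise distinct and tend to land in separate dense regions, which is what yields the $O(\log k)$ competitiveness in expectation.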
Scalability of k-means to large datasets is one of its major limitations. Chiang et al. [21] proposed a simple and easy-to-implement algorithm for reducing the computation time of k-means. This pattern reduction algorithm compresses and/or removes iteration patterns that are not likely to change their membership. A pattern is compressed by selecting one of the patterns to be removed and setting its value to the average of all removed patterns. If $D[r^i_l]$ is the pattern to be removed, then we can write it as follows:

$$D[r^i_l] = \frac{1}{|R^i_l|}\sum_{j=1}^{|R^i_l|} D[r^l_{ij}], \quad r^i_l \in R^i_l. \quad (1)$$
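As a toy illustration of the compression step in (1), a group of removed patterns is replaced by its average, so later iterations touch a single representative instead of the whole group (hypothetical data; not the authors' implementation):

```python
import numpy as np

# Hypothetical group R of patterns flagged as unlikely to change membership.
R = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]])

# Per (1), the surviving representative takes the average of the removed
# group, so the next k-means iterations process one point instead of three.
representative = R.mean(axis=0)
print(representative)  # -> [1. 2.]
```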
This technique can also be applied to other clustering algorithms, such as population-based clustering algorithms. It reduces the computation time of many clustering algorithms and is especially effective for large and high-dimensional datasets.

Mathematical Problems in Engineering 3

In [22] Bradley et al. proposed scalable k-means, which uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always produce the same result. The factors that degrade the performance of scalable k-means are the compression processes and the compression ratio. Farnstrom et al. also proposed a simple single-pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24], which uses the block and incremental concept to improve the stability of scalable k-means. The computation time of k-means can also be reduced using parallel bisecting k-means [25] and triangle inequality [26] methods.
Although K-means is the most popular and simple clustering algorithm, the major difficulty with this process is that it cannot ensure globally optimal results, because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram; this method automates the selection of the initial cluster centers. To validate the performance, experiments were performed on a range of artificial and real-world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:

$$\mathrm{ER} = \frac{\text{Number of misclassified objects}}{\text{Total number of objects}} \times 100. \quad (2)$$

The lower the error rate, the better the clustering. The proposed algorithm is able to produce better clustering results than the traditional K-means; experimental results showed a 60% reduction in the number of iterations needed to define the centroids.
All previous solutions and efforts to improve the performance of the K-means algorithm still need further investigation, since they all search for a local minimum. Searching for the global minimum will certainly improve the performance of the algorithm. The rest of the paper is organized as follows. In Section 2 the proposed method is introduced, which finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. In Section 3 the results are discussed and analyzed using the MIT-CBCL, BioID, and Caltech datasets. Finally, in Section 4 conclusions are drawn.
2. K-Means Clustering
K-means, an unsupervised learning algorithm first proposed by MacQueen in 1967 [28], is a popular method of cluster analysis. It is a process of organizing the given objects into uniform classes, called clusters, based on similarities among the objects with respect to certain criteria. It solves the well-known clustering problem by considering certain attributes and performing an iterative, alternating fitting process. The algorithm partitions a dataset $X$ into $k$ disjoint clusters such that each observation belongs to the cluster with the nearest mean. Let $x_i$ be the centroid of cluster $c_i$ and let $d(x_j, x_i)$ be the dissimilarity between $x_i$ and an object $x_j \in c_i$. Then the function minimized by k-means is given by the following equation:

$$\min_{x_1,\dots,x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \quad (3)$$

Figure 1: Comparison of k-means variants (clustering error versus number of clusters $k$, for $k = 2$ to $16$: global k-means, fast global k-means, k-means, and min k-means).
K-means clustering is utilized in a vast number of applications, including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered one of the fastest clustering algorithms, and it has a number of variants that address its sensitivity to the selection of initial points and other issues, such as the evaluation of the number of clusters [31], the method of initializing the cluster centroids [32], and the speed of the algorithm [33].
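For reference, the baseline that all of these variants modify is Lloyd's alternating assignment/update iteration with random initial centers. A minimal NumPy sketch (illustrative only, not tied to any of the cited implementations):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm: random initial centers drawn from the data,
    then alternating assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its cluster (empty clusters stay put).
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# Two well-separated Gaussian blobs around 0 and 10: the two returned
# centers should land near the blob means.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(10.0, 0.1, (50, 2))])
centers, labels = kmeans(X, 2)
print(np.sort(centers[:, 0]).round(1))
```

The random initialization in the first line of the loop is exactly the step that the global and optimized variants discussed in this paper replace.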
2.1. Modifications in K-Means Clustering. The global k-means algorithm [34] is a proposal to improve the global search properties of k-means and its performance on very large datasets by computing clusters successively. The main weakness of k-means clustering, namely its sensitivity to the initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means, as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms the other k-means algorithms.
Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to arrive at the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the $k-1$ cluster centers from the previous iteration are used to compute a $k$-partition of the dataset. The starting point of the $k$th cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of $m$ points $A$ in $n$-dimensional space, the global k-means computes the centroid $X^1$ of the set $A$ as

$$X^1 = \frac{1}{m}\sum_{i=1}^{m} a_i, \quad a_i \in A,\ i = 1,\dots,m. \quad (4)$$
Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison with the multistart k-means (MS k-means) and the GKM algorithms when the number of clusters k > 5. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of standard k-means clustering is Global Kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization; it can also handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.
Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, proposing a weight selection procedure in the W-k-means algorithm. The evaluation function $F(I)$ is used for comparison:

$$F(I) = \sqrt{R}\,\sum_{i=1}^{R} \frac{(e_i/256)^2}{\sqrt{A_i}}, \quad (5)$$

where $I$ is the segmented image, $R$ is the number of regions in the segmented image, $A_i$ is the area (the number of pixels) of the $i$th region, and $e_i$ is the color error of region $i$, that is, the sum of the Euclidean distances between the color vectors of the original image and the segmented image. Smaller values of $F(I)$ indicate better segmentation. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
Figure 2: The separation points in two dimensions (classes $C_1$ and $C_2$, with separators $M_x$, $M_y$ and per-class means $M_{x1}$, $M_{x2}$, $M_{y1}$, $M_{y2}$).
The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering for colored images is presented; a filtering algorithm is used for the hardware implementation. Suliman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering compared to other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.
2.2. The Proposed Methodology. In this method the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values, and the pixels are then clustered into two classes by applying a certain threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class. Therefore, the algorithm is applied again to the face class to obtain only the face part.
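The two-pass procedure can be sketched as follows. Here `find_threshold` is a hypothetical helper that brute-forces the two-class separator by minimizing the within-class variance; the optimized separator search derived in the following subsections would replace this exhaustive scan:

```python
import numpy as np

def find_threshold(pixels):
    """Brute-force 1-D two-class separator: try every distinct pixel value
    as the cut M and keep the one minimizing the two-class within-variance."""
    vals = np.unique(pixels)
    best_m, best_var = vals[0], np.inf
    for m in vals[1:]:
        lo, hi = pixels[pixels < m], pixels[pixels >= m]
        var = ((lo - lo.mean()) ** 2).sum() + ((hi - hi.mean()) ** 2).sum()
        if var < best_var:
            best_m, best_var = m, var
    return best_m

def localize_face(img):
    """Pass 1 separates the dark background from the bright pixels;
    pass 2 separates the illuminated background from the face part."""
    pixels = img.reshape(-1).astype(float)
    x1 = find_threshold(pixels)
    x2 = find_threshold(pixels[pixels >= x1])
    return img >= x2  # nonzero mask ~ face region

# Tiny synthetic "image": dark background (10), illuminated part (200),
# and face pixels (250).
img = np.array([[10, 10, 200], [200, 250, 250]])
mask = localize_face(img)
print(mask.sum())  # -> 2  (only the two face pixels survive)
```

This is only a sketch of the data flow; the paper's contribution is the optimal choice of the thresholds $x_1$ and $x_2$, not the scan used here.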
2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using differential equations to determine the optimum separation point. This algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. As an example, say we would like to cluster an image into two subclasses, face and nonface. We look for a separator that splits the image into two different clusters in each of the two dimensions; then the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these
points, we propose the following modification in the continuous and discrete cases.

Let $x_1, x_2, x_3, \dots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$:

$$f(x) = \frac{1}{N}\sum_{i=1}^{N} \delta(x - x_i), \quad x_i \in \mathbb{R}, \quad (6)$$

where $\delta(x)$ is the Dirac function.

When the size of the data is very large, the question of the cost of minimizing $\sum_{i=1}^{k}\sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let

$$\operatorname{Var} k = \sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2. \quad (7)$$

Then it is enough to minimize $\operatorname{Var} 1$ without losing generality, since

$$\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j - \mu_i|^2 \;\ge\; \sum_{l=1}^{d} \min_{s_i} \left( \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j^l - \mu_i^l|^2 \right), \quad (8)$$

where $d$ is the dimension of $x_j = (x_j^1, \dots, x_j^d)$ and $\mu_j = (\mu_j^1, \dots, \mu_j^d)$.
2.2.2. Continuous Case. For the continuous case, $f(x)$ is like the probability density function (PDF). Then, for 2 classes, we need to minimize the following equation:

$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx. \quad (9)$$

If $d \ge 2$, we can use (8):

$$\min \operatorname{Var} 2 \;\ge\; \sum_{k=1}^{d} \min\left(\int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\,dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\,dx\right). \quad (10)$$

Let $x \in \mathbb{R}$ be any random variable with probability density function $f(x)$; we want to find $\mu_1$ and $\mu_2$ that minimize (9) as follows:

$$(\mu_1^*, \mu_2^*) = \arg\min_{(\mu_1,\mu_2)}\left[\min_{M,\mu_1,\mu_2}\left(\int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx\right)\right], \quad (11)$$

$$\min_{M,\mu_1,\mu_2} \operatorname{Var} 2 = \min_{M}\left[\min_{\mu_1}\int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx + \min_{\mu_2}\int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx\right] \quad (12)$$

$$= \min_{M}\left[\min_{\mu_1}\int_{-\infty}^{M} (x-\mu_1(M))^2 f(x)\,dx + \min_{\mu_2}\int_{M}^{\infty} (x-\mu_2(M))^2 f(x)\,dx\right]. \quad (13)$$
We know that

$$\min_{x} E[(X - x)^2] = E[(X - E(X))^2] = \operatorname{Var}(X), \quad (14)$$

so we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\operatorname{Var}_1$ and $\operatorname{Var}_2$ as follows:

$$\operatorname{Var}_1 = \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx. \quad (15)$$

After simplification we get

$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx,$$
$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2 \int_{M}^{\infty} f(x)\,dx - 2\mu_2 \int_{M}^{\infty} x f(x)\,dx. \quad (16)$$

To find the minimum, $\operatorname{Var}_1$ and $\operatorname{Var}_2$ are partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification, we conclude that the minimum occurs when

$$\mu_1 = \frac{E(X^-)}{P(X^-)}, \qquad \mu_2 = \frac{E(X^+)}{P(X^+)}, \quad (17)$$

where $X^- = X \cdot 1_{x < M}$ and $X^+ = X \cdot 1_{x \ge M}$.
To find the minimum of $\operatorname{Var} 2$, we need to find $\partial\mu/\partial M$:

$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_1}{\partial M} = \frac{M f(M)\,P(X^-) - f(M)\,E(X^-)}{P^2(X^-)},$$

$$\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_2}{\partial M} = \frac{f(M)\,E(X^+) - M f(M)\,P(X^+)}{P^2(X^+)}. \quad (18)$$

After simplification we get

$$\frac{\partial\mu_1}{\partial M} = \frac{f(M)}{P^2(X^-)}\left[M P(X^-) - E(X^-)\right], \qquad \frac{\partial\mu_2}{\partial M} = \frac{f(M)}{P^2(X^+)}\left[E(X^+) - M P(X^+)\right]. \quad (19)$$
In order to find the minimum of $\operatorname{Var} 2$, we find $\partial\operatorname{Var}_1/\partial M$ and $\partial\operatorname{Var}_2/\partial M$ separately and add them as follows:

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = \frac{\partial \operatorname{Var}_1}{\partial M} + \frac{\partial \operatorname{Var}_2}{\partial M},$$

$$\frac{\partial \operatorname{Var}_1}{\partial M} = \frac{\partial}{\partial M}\left(\int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx\right)$$
$$= M^2 f(M) + 2\mu_1 \mu_1' P(X^-) + \mu_1^2 f(M) - 2\mu_1' E(X^-) - 2\mu_1 M f(M). \quad (20)$$

After simplification we get

$$\frac{\partial \operatorname{Var}_1}{\partial M} = f(M)\left[M^2 + \frac{E^2(X^-)}{P(X^-)^2} - 2M\,\frac{E(X^-)}{P(X^-)}\right]. \quad (21)$$

Similarly,

$$\frac{\partial \operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2\mu_2 \mu_2' P(X^+) - \mu_2^2 f(M) - 2\mu_2' E(X^+) + 2\mu_2 M f(M). \quad (22)$$

After simplification we get

$$\frac{\partial \operatorname{Var}_2}{\partial M} = f(M)\left[-M^2 - \frac{E^2(X^+)}{P(X^+)^2} + 2M\,\frac{E(X^+)}{P(X^+)}\right]. \quad (23)$$

Adding both these differentials and simplifying, it follows that

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = f(M)\left[\left(\frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2}\right) - 2M\left(\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right)\right]. \quad (24)$$
Equating this with 0 gives

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = 0,$$
$$\left(\frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2}\right) - 2M\left(\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right) = 0,$$
$$\left[\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right]\left[\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M\right] = 0. \quad (25)$$

But $E(X^-)/P(X^-) - E(X^+)/P(X^+) \ne 0$, as in that case $\mu_1 = \mu_2$, which contradicts the initial assumption. Therefore

$$\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2}\left[\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)}\right] = \frac{\mu_1 + \mu_2}{2}. \quad (26)$$

Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\operatorname{Var} 2$ directly.
To find the minimum in terms of $M$, it is enough to consider

$$\frac{\partial \operatorname{Var} 2(\mu_1(M), \mu_2(M))}{\partial M} = f(M)\left[\frac{E^2(X^-)}{P(X < M)^2} - \frac{E^2(X^+)}{P(X \ge M)^2} - 2M\left(\frac{E(X^-)}{P(X < M)} - \frac{E(X^+)}{P(X \ge M)}\right)\right]. \quad (27)$$

Then we find

$$M = \frac{1}{2}\left[\frac{E(X^+)}{P(X^+)} + \frac{E(X^-)}{P(X^-)}\right]. \quad (28)$$
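Condition (28) suggests a simple numerical procedure: start from any $M$, compute the conditional means on either side, and move $M$ to their midpoint until it stabilizes. A sketch over a discretized density (my own illustration, not the authors' code):

```python
import numpy as np

def optimal_split(f, lo, hi, n=100_001, iters=200):
    """Fixed-point iteration M <- (mu1(M) + mu2(M)) / 2 from (26)/(28),
    with the conditional means taken over a discretized density grid."""
    x = np.linspace(lo, hi, n)
    w = f(x)
    w = w / w.sum()          # discretize and normalize the density
    m = 0.5 * (lo + hi)      # arbitrary starting separator
    for _ in range(iters):
        left = x < m
        mu1 = (x[left] * w[left]).sum() / w[left].sum()
        mu2 = (x[~left] * w[~left]).sum() / w[~left].sum()
        m_new = 0.5 * (mu1 + mu2)
        if abs(m_new - m) < 1e-10:
            break
        m = m_new
    return m, mu1, mu2

# Equal mixture of N(0,1) and N(10,1): the separator should land near 5,
# with the two cluster centers near 0 and 10.
g = lambda t, c: np.exp(-(t - c) ** 2 / 2)
f = lambda t: g(t, 0) + g(t, 10)
m, mu1, mu2 = optimal_split(f, -6.0, 16.0)
print(round(m, 3))  # -> 5.0
```

Note that (26) only characterizes stationary points; with several candidate solutions, the one giving the smallest $\operatorname{Var} 2$ is the global minimum.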
2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\[4pt] 0, & \text{otherwise,} \end{cases} \quad (29)$$
where $a$ and $b$ are the two parameters of the distribution. The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X^+) = \frac{1}{2}\left(\frac{b^2 - M^2}{b-a}\right), \qquad E(X^-) = \frac{1}{2}\left(\frac{M^2 - a^2}{b-a}\right). \quad (30)$$

Putting all these in (28) and solving, we get

$$M = \frac{1}{2}(a+b), \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = \frac{1}{4}(3a+b), \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = \frac{1}{4}(a+3b). \quad (31)$$
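These closed forms are easy to check numerically; for instance, with the arbitrary illustrative choice $a = 2$, $b = 10$:

```python
# Numeric check of (30)-(31) for a uniform distribution on [a, b].
a, b = 2.0, 10.0
M = (a + b) / 2
E_minus = (M**2 - a**2) / (2 * (b - a))  # E(X * 1{x < M}), cf. (30)
E_plus = (b**2 - M**2) / (2 * (b - a))   # E(X * 1{x >= M})
P_minus = (M - a) / (b - a)              # probability mass below M
P_plus = (b - M) / (b - a)
mu1, mu2 = E_minus / P_minus, E_plus / P_plus
print(mu1, mu2, (mu1 + mu2) / 2)  # -> 4.0 8.0 6.0
```

As expected, $\mu_1 = (3a+b)/4 = 4$, $\mu_2 = (a+3b)/4 = 8$, and their midpoint recovers $M = 6$.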
(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / (2\sigma^2)}, \quad x > 0. \quad (32)$$
The probabilities to the right and left of $M$ are, respectively,

$$P(X^+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{\ln M - \mu}{\sigma\sqrt{2}}\right), \qquad P(X^-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(\frac{\ln M - \mu}{\sigma\sqrt{2}}\right). \quad (33)$$
The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X^+) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \operatorname{erf}\left(\frac{\mu + \sigma^2 - \ln M}{\sigma\sqrt{2}}\right)\right],$$
$$E(X^-) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \operatorname{erf}\left(\frac{\ln M - \mu - \sigma^2}{\sigma\sqrt{2}}\right)\right]. \quad (34)$$
Putting all these in (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get

$$M = 4.244942583, \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = 7.293057348. \quad (35)$$
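The values in (35) can be cross-checked against the erf-based expressions (33) and (34). A small script (my own check, standard library only; `Phi` is the standard normal CDF):

```python
from math import erf, exp, log, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

M = 4.244942583                      # separator claimed in (35) for mu=0, sigma=1
P_plus = 1 - Phi(log(M))             # P(X >= M), cf. (33)
E_plus = exp(0.5) * Phi(1 - log(M))  # E(X * 1{x >= M}) for LogNormal(0, 1)
mu2 = E_plus / P_plus
mu1 = (exp(0.5) - E_plus) / (1 - P_plus)  # total mean e^{1/2} minus upper part
print(mu1, mu2, (mu1 + mu2) / 2)
```

The printed midpoint of the two centers reproduces $M$ to within numerical precision, confirming that (35) is a fixed point of (28).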
(3) Normal Distribution. The probability density function of a normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}. \quad (36)$$
The probabilities to the right and left of $M$ are, respectively,

$$P(X^+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{M - \mu}{\sigma\sqrt{2}}\right), \qquad P(X^-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(\frac{M - \mu}{\sigma\sqrt{2}}\right). \quad (37)$$
The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X^+) = \frac{\mu}{2}\left[1 - \operatorname{erf}\left(\frac{M-\mu}{\sigma\sqrt{2}}\right)\right] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/(2\sigma^2)},$$
$$E(X^-) = \frac{\mu}{2}\left[1 + \operatorname{erf}\left(\frac{M-\mu}{\sigma\sqrt{2}}\right)\right] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/(2\sigma^2)}. \quad (38)$$
Putting all these in (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$ and solving, we get

$$M = 0, \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = \sqrt{\frac{2}{\pi}}. \quad (39)$$
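A quick numeric check that $M = \mu$ is indeed the fixed point of (28) for a general normal distribution (arbitrary illustrative $\mu$ and $\sigma$; the partial expectations below are the standard ones for a normal cut at $M$):

```python
from math import erf, exp, pi, sqrt

# Arbitrary normal parameters; the claim is that M = mu is the fixed point.
mu, sigma = 3.0, 2.0
M = mu
z = (M - mu) / (sigma * sqrt(2))
phi_M = exp(-(M - mu) ** 2 / (2 * sigma ** 2)) / sqrt(2 * pi)
# Standard partial expectations of a normal around the cut M:
E_plus = mu / 2 * (1 - erf(z)) + sigma * phi_M   # E(X * 1{x >= M})
E_minus = mu / 2 * (1 + erf(z)) - sigma * phi_M  # E(X * 1{x < M})
P_plus = 0.5 - 0.5 * erf(z)
mu1, mu2 = E_minus / (1 - P_plus), E_plus / P_plus
print(round((mu1 + mu2) / 2, 6))  # -> 3.0
```

At $M = \mu$ the two centers are $\mu \mp 2\sigma\varphi(0)$, symmetric about $\mu$, so their midpoint returns exactly $M$; with $\mu = 0$, $\sigma = 1$ they reduce to $\mp\sqrt{2/\pi}$ as in (39).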
2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations of $X$. Let $p_j$ be the probability mass of an observation $x_j$.

Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean of all $x \le x_i$, and $\mu_{x \ge x_i}$ as the mean of all $x \ge x_i$. Define $\operatorname{Var} 2(M_{i-1})$ as the variance of the two centers induced by $M_{i-1}$ as a separator:

$$\operatorname{Var} 2(M_{i-1}) = \sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j \le x_{i-1}})^2\, p_j + \sum_{x_j \ge x_i} (x_j - \mu_{x_j \ge x_i})^2\, p_j, \quad (40)$$

$$\operatorname{Var} 2(M_i) = \sum_{x_j \le x_i} (x_j - \mu_{x_j \le x_i})^2\, p_j + \sum_{x_j \ge x_{i+1}} (x_j - \mu_{x_j \ge x_{i+1}})^2\, p_j. \quad (41)$$
The first part of $\operatorname{Var} 2(M_{i-1})$ simplifies as follows:

$$\sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j \le x_{i-1}})^2\, p_j = \sum_{x_j \le x_{i-1}} x_j^2\, p_j - 2\mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j\, p_j + \mu_{x_j \le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j$$
$$= E(X^2 \cdot 1_{x_j \le x_{i-1}}) - \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})}. \quad (42)$$
Similarly,

$$\sum_{x_j \ge x_i} (x_j - \mu_{x_j \ge x_i})^2\, p_j = E(X^2 \cdot 1_{x_j \ge x_i}) - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)},$$

so that

$$\operatorname{Var} 2(M_{i-1}) = E(X^2 \cdot 1_{x_j \le x_{i-1}}) - \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + E(X^2 \cdot 1_{x_j \ge x_i}) - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)},$$

$$\operatorname{Var} 2(M_i) = E(X^2 \cdot 1_{x_j \le x_i}) - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} + E(X^2 \cdot 1_{x_j \ge x_{i+1}}) - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}. \quad (43)$$
Define $\Delta\operatorname{Var} 2 = \operatorname{Var} 2(M_i) - \operatorname{Var} 2(M_{i-1})$:

$$\Delta\operatorname{Var} 2 = \left[-\frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] - \left[-\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)}\right]. \quad (44)$$
Now, in order to simplify the calculation, we rearrange $\Delta\operatorname{Var} 2$ as follows:

$$\Delta\operatorname{Var} 2 = \left[\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)}\right] + \left[\frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right]. \quad (45)$$
The first part is simplified as follows:

$$\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} = \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{\left[E(X \cdot 1_{x_j \le x_{i-1}}) + x_i p_i\right]^2}{P(X \le x_{i-1}) + p_i}$$
$$= \frac{p_i E^2(X \cdot 1_{x_j \le x_{i-1}}) - 2x_i p_i E(X \cdot 1_{x_j \le x_{i-1}})\, P(X \le x_{i-1}) - x_i^2 p_i^2\, P(X \le x_{i-1})}{P(X \le x_{i-1})\left[P(X \le x_{i-1}) + p_i\right]}. \quad (46)$$
Similarly, simplifying the second part yields

$$\frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} = \frac{\left[E(X \cdot 1_{x_j \ge x_{i+1}}) + x_i p_i\right]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}$$
$$= \frac{-p_i E^2(X \cdot 1_{x_j \ge x_{i+1}}) + 2x_i p_i E(X \cdot 1_{x_j \ge x_{i+1}})\, P(X \ge x_{i+1}) + x_i^2 p_i^2\, P(X \ge x_{i+1})}{P(X \ge x_{i+1})\left[P(X \ge x_{i+1}) + p_i\right]}. \quad (47)$$

In general, these types of optimization arise when the dataset is very big.

If we consider $N$ to be very large, then $p_k \ll \max(\sum_{i \ge k+1} p_i, \sum_{i \le k-1} p_i)$, $P(X \le x_{i-1}) + p_i \approx P(X \le x_i)$, and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_i)$, so

$$\Delta\operatorname{Var} 2 \approx \left[\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})}\right] - 2p_i x_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right]$$
$$\quad - p_i^2 x_i^2 \left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right], \quad (48)$$
and as $\lim_{N \to \infty} p_i^2 x_i^2 \left[(1/P(X \le x_{i-1})) - (1/P(X \ge x_{i+1}))\right] = 0$, we can further simplify $\Delta\operatorname{Var} 2$ as follows:

$$\Delta\operatorname{Var} 2 \approx p_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] \times \left[\left(\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right) - 2x_i\right]. \quad (49)$$

To find the minimum of $\Delta\operatorname{Var} 2$, let us locate where $\Delta\operatorname{Var} 2 = 0$:

$$p_i \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] \left[\left(\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right) - 2x_i\right] = 0. \quad (50)$$
Figure 3: Discrete random variables ($\operatorname{Var} 2$ as a function of the separator $M$ over the points $X_1, X_2, \dots, X_n$).
But $p_i \ne 0$, and the first factor, $\left[E(X \cdot 1_{x_j \le x_{i-1}})/P(X \le x_{i-1}) - E(X \cdot 1_{x_j \ge x_{i+1}})/P(X \ge x_{i+1})\right]$, is strictly less than 0, since $x_i$ is an increasing sequence, which implies $\mu_{i-1} < \mu_i$, where

$$\mu_{i-1} = \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})}, \qquad \mu_i = \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}. \quad (51)$$
Therefore

$$\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] - 2x_i = 0. \quad (52)$$
Let us define

$$F(x_i) = 2x_i - \left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right].$$

Then

$$F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2}\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] = \frac{\mu_{i-1} + \mu_i}{2}, \quad (53)$$
where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.
To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i)\cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3).
After finding a few local minima, we can obtain the global minimum by taking the smallest among them.
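The discrete-case search above can be sketched in NumPy. This is an illustrative implementation of the sign-change rule, not the authors' code: helper names are ours, and an explicit variation check over the sign-change candidates stands in for however the paper compares its local minima.

```python
import numpy as np

def two_means_split(data):
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    # F(x_i) = 2 x_i - (mu_{i-1} + mu_i): mu_{i-1} averages the points
    # below x_i and mu_i the points above it (x_i itself excluded)
    F = [2 * x[i] - (x[:i].mean() + x[i + 1:].mean()) for i in range(1, n - 1)]

    def variation(split):
        # total within-class variation when x[:split] forms class 1
        a, b = x[:split], x[split:]
        return ((a - a.mean()) ** 2).sum() + ((b - b.mean()) ** 2).sum()

    # local minima of the variation sit where F changes sign:
    # F(x_i) < 0 and F(x_{i+1}) >= 0  ->  separator between x_i and x_{i+1}
    cands = [i + 1 for i in range(1, n - 2) if F[i - 1] < 0 <= F[i]]
    if not cands:                       # safeguard for degenerate inputs
        cands = list(range(1, n))
    split = min(cands, key=variation)   # smallest local minimum = global one
    mu1, mu2 = x[:split].mean(), x[split:].mean()
    return (mu1 + mu2) / 2, (mu1, mu2)  # separator M and the two class means
```

On well-separated 1-D data the returned separator is the midpoint of the two class means, which is exactly the condition $x_i = (\mu_{i-1} + \mu_i)/2$ in (53).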
3. Face Localization Using the Proposed Method
It now becomes clear that the modified K-means algorithm is superior to the standard algorithm in terms of finding
10 Mathematical Problems in Engineering
Figure 4: The original image.
Figure 5: The face with the shade only.
a global minimum, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a certain threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.
All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.
In order to remove the illuminated background part, another threshold is applied to the pixels using the algorithm, meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. Then the pixels with values less than the threshold are assigned a value of 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part.
The result of the removal of the illuminated background part is shown in Figure 6.
Figure 6: The face without illumination.
Figure 7: The filtered image.
One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
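The two-pass segmentation described above can be sketched with NumPy. Function names are illustrative, and a plain exhaustive variance-minimizing threshold stands in for the paper's optimized separator search; this is a sketch of the pipeline, not the authors' implementation.

```python
import numpy as np

def best_threshold(pixels):
    # variance-minimizing 2-class split of a 1-D pixel vector (a simple
    # exhaustive stand-in for the optimized separator search)
    vals = np.unique(pixels)
    def var2(t):
        lo, hi = pixels[pixels < t], pixels[pixels >= t]
        return ((lo - lo.mean()) ** 2).sum() + ((hi - hi.mean()) ** 2).sum()
    return min(vals[1:], key=var2)

def localize_face(img):
    v = img.reshape(-1).astype(float)   # reshape the image into a vector
    x1 = best_threshold(v)              # pass 1: [0, x1) is the nonface class
    v[v < x1] = 0
    x2 = best_threshold(v[v >= x1])     # pass 2 on the remaining bright pixels
    v[(v >= x1) & (v < x2)] = 0         # [x1, x2) is the illuminated background
    return v.reshape(img.shape)         # nonzero pixels approximate the face
```

A filtering step (noise removal) would follow on the returned mask before the face window is cut from the original image.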
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
4. Results
In this section, the experimental results of the proposed algorithm are analyzed and discussed. It starts with a description of the datasets used in this work, followed by the results of the previous part of the work. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 images from this database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset, with image size 896 × 592; the 300 images from this database were selected based on different conditions such as indoor/outdoor.
4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)
Figure 8: The corresponding window in the original image.
Figure 9: Samples from the MIT-CBCL dataset.
at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons with 200 images per person and an image size of 115 × 115 pixels. The images of each person have different light conditions, clutter backgrounds, different scales, and different poses. A few examples of these images are shown in Figure 9.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one out of 23 different test persons. The images were taken under different lighting conditions, with complex backgrounds, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the result was 100% with a
Figure 10: Sample of faces from the BioID dataset for localization.
Figure 11: Sample of faces from the Caltech dataset for localization.
Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter backgrounds, as well as indoor/outdoor settings.

Dataset     Images    Accuracy (%)
MIT-Set 1   150       100
MIT-Set 2   150       100
BioID       300       100
Caltech     300       100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we have focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, and on variation in the background, as well as the cases of indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time because it has less complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.
Figure 12 shows some examples of face localization on the MIT-CBCL dataset.
In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). However, a filtering operation is needed in order to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position on an image from the Caltech dataset.
5. Conclusion
In this paper, we focus on developing ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to higher subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science+Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Support Downloads: BioID Face Database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
et al. proposed scalable k-means, which uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always produce the same result. The factors that degrade the performance of scalable k-means are the compression processes and the compression ratio. Farnstrom et al. also proposed a simple single-pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24], which uses the block and incremental concept to improve the stability of scalable k-means. The computation time of k-means can also be reduced using parallel bisecting k-means [25] and triangle inequality [26] methods.
Although k-means is the most popular and simplest clustering algorithm, the major difficulty with this process is that it cannot ensure globally optimal results, because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram; this method automates the selection of the initial cluster centers. To validate the performance, experiments were performed on a range of artificial and real-world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:
$$\mathrm{ER} = \frac{\text{Number of misclassified objects}}{\text{Total number of objects}} \times 100. \quad (2)$$
The lower the error rate, the better the clustering. The proposed algorithm is able to produce better clustering results than the traditional k-means. Experimental results showed a 60% reduction in iterations for defining the centroids.
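For reference, (2) in code form. This is an illustrative helper of ours, not code from [27]:

```python
def error_rate(labels_pred, labels_true):
    """Error rate (ER) from (2): misclassified objects over the total
    number of objects, expressed as a percentage."""
    wrong = sum(p != t for p, t in zip(labels_pred, labels_true))
    return 100.0 * wrong / len(labels_true)
```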
All previous solutions and efforts to increase the performance of the k-means algorithm still need more investigation, since they are all looking for a local minimum. Searching for the global minimum will certainly improve the performance of the algorithm. The rest of the paper is organized as follows. In Section 2, the proposed method is introduced, which finds the optimal centers for each cluster, corresponding to the global minimum of the k-means cluster. The results are then discussed and analyzed in Section 4 using two sets from the MIT, BioID, and Caltech datasets. Finally, conclusions are drawn in Section 5.
2. K-Means Clustering
K-means, an unsupervised learning algorithm first proposed by MacQueen in 1967 [28], is a popular method of cluster analysis. It is a process of organizing the specified objects into uniform classes, called clusters, based on similarities among the objects with respect to certain criteria. It solves the well-known clustering problem by considering certain attributes and performing an iterative alternating fitting process. This algorithm partitions a dataset $X$ into $k$ disjoint clusters, such that each observation belongs to the cluster with the nearest mean. Let $x_i$ be the centroid of cluster $c_i$ and let $d(x_j, x_i)$ be the
Figure 1: Comparison of k-means variants (global k-means, fast global k-means, k-means, and min k-means): clustering error versus the number of clusters $k$.
dissimilarity between $x_i$ and object $x_j \in c_i$. Then the function minimized by k-means is given by the following equation:
$$\min_{x_1,\ldots,x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \quad (3)$$
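Objective (3) is what the standard Lloyd iteration minimizes locally; a minimal sketch with squared Euclidean dissimilarity and random initialization (both are our assumptions, since (3) leaves $d$ and the initialization unspecified). The random start is exactly why only a local minimum is reached, the limitation the optimized variant in this paper targets.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal Lloyd iteration for objective (3): alternate nearest-center
    assignment and centroid updates until the centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random initial centers
    for _ in range(iters):
        # assignment step: nearest center for each observation
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # update step: each center becomes the mean of its cluster
        new = np.array([X[labels == j].mean(0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    return centers, labels
```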
K-means clustering is utilized in a vast number of applications, including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered one of the fastest clustering algorithms, with a number of variants that are sensitive to the selection of initial points and are intended to solve many issues of k-means, such as the evaluation of the number of clusters [31], the method of initializing the cluster centroids [32], and the speed of the algorithm [33].
2.1. Modifications in K-Means Clustering. The global k-means algorithm [34] is a proposal to improve the global search properties of the k-means algorithm, and its performance on very large datasets, by computing clusters successively. The main weakness of k-means clustering, namely its sensitivity to the initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means, as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms the other k-means algorithms.
Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the $k-1$ cluster centers from the previous iteration are used to compute the $k$-partition of a dataset. The starting point of the $k$th cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points $A$ in the $n$-dimensional space, the global k-means computes the centroid $X^1$ of the set $A$ as in the following equation:
$$X^1 = \frac{1}{m} \sum_{i=1}^{m} a_i, \quad a_i \in A,\ i = 1, \ldots, m. \quad (4)$$
Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison with the multistart k-means (MS k-means) and GKM algorithms when the number of clusters $k > 5$. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of the standard k-means clustering is the global kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization. It can handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost, with no major effect on the solution quality.
Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, where a weight selection procedure in the W-k-means algorithm is proposed. The evaluation function $F(I)$ is used for comparison:
$$F(I) = \sqrt{R} \sum_{i=1}^{R} \frac{(e_i/256)^2}{\sqrt{A_i}}, \quad (5)$$
where $I$ is the segmented image, $R$ is the number of regions in the segmented image, $A_i$ is the area (the number of pixels) of the $i$th region, and $e_i$ is the color error of region $i$; $e_i$ is the sum of the Euclidean distances of the color vectors between the original image and the segmented image. Smaller values of $F(I)$ give better segmentation results. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
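The evaluation function (5) can be computed directly from a segmentation. This sketch assumes an argument layout of our choosing (H × W × 3 color images plus a per-pixel region label map), which the text does not fix:

```python
import numpy as np

def segmentation_score(orig, seg, labels):
    """Evaluation function F(I) from (5): sqrt(R) * sum over regions of
    (e_i/256)^2 / sqrt(A_i), where e_i is the summed Euclidean color error
    of region i and A_i its pixel area. Smaller scores = better segmentation."""
    regions = np.unique(labels)
    R = len(regions)
    total = 0.0
    for r in regions:
        mask = labels == r
        A = mask.sum()                               # area of region r
        diff = orig[mask].astype(float) - seg[mask].astype(float)
        e = np.linalg.norm(diff, axis=1).sum()       # color error e_r
        total += (e / 256.0) ** 2 / np.sqrt(A)
    return np.sqrt(R) * total
```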
Figure 2: The separation points in two dimensions.
The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering is carried out for colored images; a filtering algorithm is used for the hardware implementation. Sulaiman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering as compared to other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.
2.2. The Proposed Methodology. In this method, the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values; then the pixels are clustered into two classes by applying a certain threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class. Therefore, the algorithm is applied again to the face class to obtain only the face part.
2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using differential equations to determine the optimum separation point. This algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means cluster. As an example, say we would like to cluster an image into two subclasses: face and nonface. So we look for a separator that will split the image into two different clusters in two dimensions; then the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these
points, we propose the following modification in the continuous and discrete cases.
Let $x_1, x_2, x_3, \ldots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$:
$$f(x) = \sum_{-\infty}^{\infty} \frac{1}{N}\,\delta(x - x_i), \quad x_i \in \mathbb{R}, \quad (6)$$
where $\delta(x)$ is the Dirac function.
When the size of the data is very large, the question of the cost of the minimization of $\sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let
$$\operatorname{Var} k = \sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2. \quad (7)$$
Then it is enough to minimize $\operatorname{Var} 1$ without losing generality, since
$$\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j - \mu_i|^2 \ge \sum_{l=1}^{d} \min_{s_i} \left( \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j^l - \mu_i^l|^2 \right), \quad (8)$$
where $d$ is the dimension of $x_j = (x_j^1, \ldots, x_j^d)$ and $\mu_j = (\mu_j^1, \ldots, \mu_j^d)$.
2.2.2. Continuous Case. For the continuous case, $f(x)$ is like the probability density function (PDF). Then for 2 classes, we need to minimize the following equation:
$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \quad (9)$$
If $d \ge 2$, we can use (8):
$$\min \operatorname{Var} 2 \ge \sum_{k=1}^{d} \min\left( \int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\,dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\,dx \right). \quad (10)$$
Let $x \in \mathbb{R}$ be any random variable with probability density function $f(x)$; we want to find $\mu_1$ and $\mu_2$ minimizing (7), as follows:
$$(\mu_1^{*}, \mu_2^{*}) = \arg_{(\mu_1,\mu_2)}\left[ \min_{M,\mu_1,\mu_2} \left[ \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx \right] \right]. \quad (11)$$
$$\min_{M,\mu_1,\mu_2} \operatorname{Var} 2 = \min_{M} \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx \right], \quad (12)$$
$$\min_{M,\mu_1,\mu_2} \operatorname{Var} 2 = \min_{M} \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1(M))^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2(M))^2 f(x)\,dx \right]. \quad (13)$$
We know that
$$\min_{x} E[(X - x)^2] = E[(X - E(X))^2] = \operatorname{Var}(X). \quad (14)$$
So we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\operatorname{Var}_1$ and $\operatorname{Var}_2$ as follows:
$$\operatorname{Var}_1 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \quad (15)$$
After simplification we get
$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx,$$
$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2 \int_{M}^{\infty} f(x)\,dx - 2\mu_2 \int_{M}^{\infty} x f(x)\,dx. \quad (16)$$
To find the minimum, $\operatorname{Var}_1$ and $\operatorname{Var}_2$ are partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification, we conclude that the minimization occurs when
$$\mu_1 = \frac{E(X_-)}{P(X_-)}, \qquad \mu_2 = \frac{E(X_+)}{P(X_+)}, \quad (17)$$
where $X_- = X \cdot 1_{x < M}$ and $X_+ = X \cdot 1_{x \ge M}$.
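The optimality conditions (17) can be checked numerically, with sample averages standing in for the density integrals (the helper names below are ours, for illustration only):

```python
import numpy as np

def conditional_means(x, M):
    """Empirical version of (17): mu1 = E(X-)/P(X-), mu2 = E(X+)/P(X+),
    with X- = X*1_{x<M} and X+ = X*1_{x>=M}."""
    lo, hi = x[x < M], x[x >= M]
    return lo.mean(), hi.mean()

def var2(x, M):
    """Total variation Var 2 of (9) for a given separator M, with the
    class means chosen optimally as in (17)."""
    mu1, mu2 = conditional_means(x, M)
    return ((x[x < M] - mu1) ** 2).sum() + ((x[x >= M] - mu2) ** 2).sum()

# scanning M over the data range traces Var 2(M); on well-separated data
# the minimizing split places M between the two class means, consistent
# with the midpoint condition M = (mu1 + mu2)/2
```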
To find the minimum of $\operatorname{Var} 2$, we need to find $\partial\mu/\partial M$:
$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \implies \frac{\partial\mu_1}{\partial M} = \frac{M f(M) P(X_-) - f(M) E(X_-)}{P^2(X_-)},$$
$$\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \implies \frac{\partial\mu_2}{\partial M} = \frac{M f(M) P(X_+) - f(M) E(X_+)}{P^2(X_+)}. \quad (18)$$
After simplification we get
\[
\frac{\partial\mu_1}{\partial M}=\frac{f(M)}{P^{2}(X^{-})}\left[MP(X^{-})-E(X^{-})\right],\qquad
\frac{\partial\mu_2}{\partial M}=\frac{f(M)}{P^{2}(X^{+})}\left[E(X^{+})-MP(X^{+})\right].\tag{19}
\]
In order to find the minimum of $\operatorname{Var}2$, we compute $\partial\operatorname{Var}_1/\partial M$ and $\partial\operatorname{Var}_2/\partial M$ separately and add them:
\[
\frac{\partial}{\partial M}(\operatorname{Var}2)=\frac{\partial}{\partial M}(\operatorname{Var}_1)+\frac{\partial}{\partial M}(\operatorname{Var}_2),
\]
\[
\begin{aligned}
\frac{\partial\operatorname{Var}_1}{\partial M}
&=\frac{\partial}{\partial M}\left(\int_{-\infty}^{M}x^{2}f(x)\,dx+\mu_1^{2}\int_{-\infty}^{M}f(x)\,dx-2\mu_1\int_{-\infty}^{M}xf(x)\,dx\right)\\
&=M^{2}f(M)+2\mu_1\mu_1'P(X^{-})+\mu_1^{2}f(M)-2\mu_1'E(X^{-})-2\mu_1Mf(M).
\end{aligned}\tag{20}
\]
After simplification we get
\[
\frac{\partial\operatorname{Var}_1}{\partial M}=f(M)\left[M^{2}+\frac{E^{2}(X^{-})}{P(X^{-})^{2}}-2M\,\frac{E(X^{-})}{P(X^{-})}\right].\tag{21}
\]
Similarly,
\[
\frac{\partial\operatorname{Var}_2}{\partial M}=-M^{2}f(M)+2\mu_2\mu_2'P(X^{+})-\mu_2^{2}f(M)-2\mu_2'E(X^{+})+2\mu_2Mf(M).\tag{22}
\]
After simplification we get
\[
\frac{\partial\operatorname{Var}_2}{\partial M}=f(M)\left[-M^{2}-\frac{E^{2}(X^{+})}{P(X^{+})^{2}}+2M\,\frac{E(X^{+})}{P(X^{+})}\right].\tag{23}
\]
Adding these two derivatives and simplifying, it follows that
\[
\frac{\partial}{\partial M}(\operatorname{Var}2)=f(M)\left[\left(\frac{E^{2}(X^{-})}{P(X^{-})^{2}}-\frac{E^{2}(X^{+})}{P(X^{+})^{2}}\right)-2M\left(\frac{E(X^{-})}{P(X^{-})}-\frac{E(X^{+})}{P(X^{+})}\right)\right].\tag{24}
\]
Equating this with $0$ gives
\[
\frac{\partial}{\partial M}(\operatorname{Var}2)=0
\;\Longleftrightarrow\;
\left(\frac{E^{2}(X^{-})}{P(X^{-})^{2}}-\frac{E^{2}(X^{+})}{P(X^{+})^{2}}\right)-2M\left(\frac{E(X^{-})}{P(X^{-})}-\frac{E(X^{+})}{P(X^{+})}\right)=0,
\]
which factors as
\[
\left[\frac{E(X^{-})}{P(X^{-})}-\frac{E(X^{+})}{P(X^{+})}\right]\left[\frac{E(X^{-})}{P(X^{-})}+\frac{E(X^{+})}{P(X^{+})}-2M\right]=0.\tag{25}
\]
But $\left[\dfrac{E(X^{-})}{P(X^{-})}-\dfrac{E(X^{+})}{P(X^{+})}\right]\neq 0$, since otherwise $\mu_1=\mu_2$, which contradicts the initial assumption. Therefore
\[
\frac{E(X^{-})}{P(X^{-})}+\frac{E(X^{+})}{P(X^{+})}-2M=0
\;\Longrightarrow\;
M=\frac{1}{2}\left[\frac{E(X^{-})}{P(X^{-})}+\frac{E(X^{+})}{P(X^{+})}\right]=\frac{\mu_1+\mu_2}{2}.\tag{26}
\]
Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\operatorname{Var}2$ directly.
To find the minimum in terms of $M$, it is enough to evaluate
\[
\frac{\partial\operatorname{Var}2\left(\mu_1(M),\mu_2(M)\right)}{\partial M}
=f(M)\left[\frac{E^{2}(X^{-})}{P(X<M)^{2}}-\frac{E^{2}(X^{+})}{P(X\ge M)^{2}}
-2M\left(\frac{E(X^{-})}{P(X<M)}-\frac{E(X^{+})}{P(X\ge M)}\right)\right].\tag{27}
\]
Then we find
\[
M=\frac{1}{2}\left[\frac{E(X^{+})}{P(X^{+})}+\frac{E(X^{-})}{P(X^{-})}\right].\tag{28}
\]
2.2.3. Finding the Centers of Some Common Probability Density Functions
(1) Uniform Distribution. The probability density function of the continuous uniform distribution is
\[
f(x)=\begin{cases}\dfrac{1}{b-a}, & a<x<b,\\[4pt] 0, & \text{otherwise},\end{cases}\tag{29}
\]
where $a$ and $b$ are the two parameters of the distribution. The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x<M}$ are
\[
E(X^{+})=\frac{1}{2}\left(\frac{b^{2}-M^{2}}{b-a}\right),\qquad
E(X^{-})=\frac{1}{2}\left(\frac{M^{2}-a^{2}}{b-a}\right).\tag{30}
\]
Substituting these into (28) and solving, we get
\[
M=\frac{1}{2}(a+b),\qquad
\mu_1(M)=\frac{E(X^{-})}{P(X^{-})}=\frac{1}{4}(3a+b),\qquad
\mu_2(M)=\frac{E(X^{+})}{P(X^{+})}=\frac{1}{4}(a+3b).\tag{31}
\]
(2) Log-Normal Distribution. The probability density function of a log-normal distribution is
\[
f(x;\mu,\sigma)=\frac{1}{x\sigma\sqrt{2\pi}}\,e^{-(\ln x-\mu)^{2}/2\sigma^{2}},\qquad x>0.\tag{32}
\]
The probabilities to the right and left of $M$ are, respectively,
\[
P(X^{+})=\frac{1}{2}-\frac{1}{2}\operatorname{erf}\left(\frac{\ln M-\mu}{\sqrt{2}\,\sigma}\right),\qquad
P(X^{-})=\frac{1}{2}+\frac{1}{2}\operatorname{erf}\left(\frac{\ln M-\mu}{\sqrt{2}\,\sigma}\right).\tag{33}
\]
The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x<M}$ are
\[
E(X^{+})=\frac{1}{2}e^{\mu+\sigma^{2}/2}\left[1+\operatorname{erf}\left(\frac{\mu+\sigma^{2}-\ln M}{\sqrt{2}\,\sigma}\right)\right],\qquad
E(X^{-})=\frac{1}{2}e^{\mu+\sigma^{2}/2}\left[1+\operatorname{erf}\left(\frac{\ln M-\mu-\sigma^{2}}{\sqrt{2}\,\sigma}\right)\right].\tag{34}
\]
Substituting these into (28) with $\mu=0$ and $\sigma=1$ and solving numerically, we get
\[
M=4.244942583,\qquad
\mu_1(M)=\frac{E(X^{-})}{P(X^{-})}=1.196827816,\qquad
\mu_2(M)=\frac{E(X^{+})}{P(X^{+})}=7.293057348.\tag{35}
\]
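The constants in (35) can be reproduced from the closed forms (33)-(34) alone, without any sampling: define $F(M)=2M-\mu_1(M)-\mu_2(M)$, whose zero is the optimal boundary by (28), and bisect. This is our own sketch (function names and the bracket $[1,10]$ are our choices):

```python
import math

def mu_pair(M, mu=0.0, sigma=1.0):
    """mu1(M), mu2(M) for the log-normal, from (33)-(34)."""
    s2 = math.sqrt(2.0) * sigma
    half_e = 0.5 * math.exp(mu + sigma**2 / 2)
    P_minus = 0.5 + 0.5 * math.erf((math.log(M) - mu) / s2)
    P_plus = 1.0 - P_minus
    E_minus = half_e * (1 + math.erf((math.log(M) - mu - sigma**2) / s2))
    E_plus = half_e * (1 + math.erf((mu + sigma**2 - math.log(M)) / s2))
    return E_minus / P_minus, E_plus / P_plus

def F(M):
    m1, m2 = mu_pair(M)
    return 2 * M - m1 - m2        # zero at the optimal boundary, cf. (28)

lo, hi = 1.0, 10.0                # bracket chosen so that F(lo) < 0 < F(hi)
for _ in range(100):
    mid = (lo + hi) / 2
    if F(mid) < 0:
        lo = mid
    else:
        hi = mid
M = (lo + hi) / 2                 # ~ 4.2449, as in (35)
```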
(3) Normal Distribution. The probability density function of a normal distribution is
\[
f(x;\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^{2}/2\sigma^{2}}.\tag{36}
\]
The probabilities to the right and left of $M$ are, respectively,
\[
P(X^{+})=\frac{1}{2}-\frac{1}{2}\operatorname{erf}\left(\frac{M-\mu}{\sqrt{2}\,\sigma}\right),\qquad
P(X^{-})=\frac{1}{2}+\frac{1}{2}\operatorname{erf}\left(\frac{M-\mu}{\sqrt{2}\,\sigma}\right).\tag{37}
\]
The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x<M}$ are
\[
E(X^{+})=-\frac{1}{2}\mu\left[-1+\operatorname{erf}\left(\frac{M-\mu}{\sqrt{2}\,\sigma}\right)\right]+\frac{\sigma}{\sqrt{2\pi}}\,e^{-(M-\mu)^{2}/2\sigma^{2}},
\]
\[
E(X^{-})=\frac{1}{2}\mu\left[1+\operatorname{erf}\left(\frac{M-\mu}{\sqrt{2}\,\sigma}\right)\right]-\frac{\sigma}{\sqrt{2\pi}}\,e^{-(M-\mu)^{2}/2\sigma^{2}}\tag{38}
\]
(the Gaussian terms are required for (39) to follow; without them $\mu_1(M)=\mu_2(M)=\mu$).
Substituting these into (28) and solving gives $M=\mu$. Assuming $\mu=0$ and $\sigma=1$, we get
\[
M=0,\qquad
\mu_1(M)=\frac{E(X^{-})}{P(X^{-})}=-\sqrt{\frac{2}{\pi}},\qquad
\mu_2(M)=\frac{E(X^{+})}{P(X^{+})}=\sqrt{\frac{2}{\pi}}.\tag{39}
\]
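As a sanity check on (38)-(39): for $\mu=0$, $\sigma=1$, $M=0$, the erf term vanishes and the closed form gives $E(X^{+})=1/\sqrt{2\pi}$, hence $\mu_2=\sqrt{2/\pi}\approx 0.7979$. A midpoint-rule integration of $\int_0^\infty x f(x)\,dx$ confirms this (the grid parameters below are our own):

```python
import math

# Midpoint-rule evaluation of E(X^+) = integral_0^inf x*phi(x) dx for the
# standard normal with M = 0; the tail beyond x = 10 is negligible.
n, hi = 100000, 10.0
w = hi / n
E_plus = sum(
    (i + 0.5) * w
    * math.exp(-((i + 0.5) * w) ** 2 / 2) / math.sqrt(2 * math.pi)
    * w
    for i in range(n)
)
P_plus = 0.5
mu2 = E_plus / P_plus   # should be sqrt(2/pi), matching (39)
```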
2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$.

Let $x_{i-1}<M_{i-1}<x_i$ and $x_i<M_i<x_{i+1}$. We define $\mu_{x\le x_i}$ as the mean of all $x\le x_i$, and $\mu_{x\ge x_i}$ as the mean of all $x\ge x_i$. Define $\operatorname{Var}2(M_{i-1})$ as the variance of the two centers forced by $M_{i-1}$ as a separator:
\[
\operatorname{Var}2(M_{i-1})=\sum_{x_j\le x_{i-1}}\left(x_j-\mu_{x\le x_{i-1}}\right)^{2}p_j+\sum_{x_j\ge x_i}\left(x_j-\mu_{x\ge x_i}\right)^{2}p_j,\tag{40}
\]
\[
\operatorname{Var}2(M_i)=\sum_{x_j\le x_i}\left(x_j-\mu_{x\le x_i}\right)^{2}p_j+\sum_{x_j\ge x_{i+1}}\left(x_j-\mu_{x\ge x_{i+1}}\right)^{2}p_j.\tag{41}
\]
The first part of $\operatorname{Var}2(M_{i-1})$ simplifies as follows:
\[
\begin{aligned}
\sum_{x_j\le x_{i-1}}\left(x_j-\mu_{x\le x_{i-1}}\right)^{2}p_j
&=\sum_{x_j\le x_{i-1}}x_j^{2}p_j-2\mu_{x\le x_{i-1}}\sum_{x_j\le x_{i-1}}x_jp_j+\mu_{x\le x_{i-1}}^{2}\sum_{x_j\le x_{i-1}}p_j\\
&=E\left(X^{2}\cdot 1_{x_j\le x_{i-1}}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}.
\end{aligned}\tag{42}
\]
Similarly,
\[
\sum_{x_j\ge x_i}\left(x_j-\mu_{x\ge x_i}\right)^{2}p_j
=E\left(X^{2}\cdot 1_{x_j\ge x_i}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)},
\]
so that
\[
\begin{aligned}
\operatorname{Var}2(M_{i-1})&=E\left(X^{2}\cdot 1_{x_j\le x_{i-1}}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}
+E\left(X^{2}\cdot 1_{x_j\ge x_i}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)},\\
\operatorname{Var}2(M_i)&=E\left(X^{2}\cdot 1_{x_j\le x_i}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}
+E\left(X^{2}\cdot 1_{x_j\ge x_{i+1}}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}.
\end{aligned}\tag{43}
\]
Define $\Delta\operatorname{Var}2=\operatorname{Var}2(M_i)-\operatorname{Var}2(M_{i-1})$. The $E(X^{2}\cdot 1)$ terms cancel, since each pair of indicators partitions the whole sample, leaving
\[
\Delta\operatorname{Var}2=\left[-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
-\left[-\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}\right].\tag{44}
\]
Now, in order to simplify the calculation, we rearrange $\Delta\operatorname{Var}2$ as follows:
\[
\Delta\operatorname{Var}2=\left[\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}\right]
+\left[\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right].\tag{45}
\]
The first part is simplified using $E(X\cdot 1_{x_j\le x_i})=E(X\cdot 1_{x_j\le x_{i-1}})+x_ip_i$ and $P(X\le x_i)=P(X\le x_{i-1})+p_i$:
\[
\begin{aligned}
&\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}
=\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{\left[E\left(X\cdot 1_{x_j\le x_{i-1}}\right)+x_ip_i\right]^{2}}{P\left(X\le x_{i-1}\right)+p_i}\\
&\quad=\frac{p_iE^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)-2x_ip_i\,E\left(X\cdot 1_{x_j\le x_{i-1}}\right)P\left(X\le x_{i-1}\right)-x_i^{2}p_i^{2}\,P\left(X\le x_{i-1}\right)}{P\left(X\le x_{i-1}\right)\left[P\left(X\le x_{i-1}\right)+p_i\right]}.
\end{aligned}\tag{46}
\]
Similarly, simplifying the second part yields
\[
\begin{aligned}
&\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}
=\frac{\left[E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+x_ip_i\right]^{2}}{P\left(X\ge x_{i+1}\right)+p_i}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\\
&\quad=\frac{-p_iE^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+2x_ip_i\,E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)P\left(X\ge x_{i+1}\right)+x_i^{2}p_i^{2}\,P\left(X\ge x_{i+1}\right)}{P\left(X\ge x_{i+1}\right)\left[P\left(X\ge x_{i+1}\right)+p_i\right]}.
\end{aligned}\tag{47}
\]
In general, these types of optimization arise when the dataset is very big.

If we consider $N$ to be very large, then $p_k\ll\max\left(\sum_{i\ge k+1}p_i,\sum_{i\le k-1}p_i\right)$, $P(X\le x_{i-1})+p_i\approx P(X\le x_i)$, and $P(X\ge x_{i+1})+p_i\approx P(X\ge x_{i+1})$, so
\[
\begin{aligned}
\Delta\operatorname{Var}2\approx{}&\left[\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^{2}\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^{2}\left(X\ge x_{i+1}\right)}\right]
-2p_ix_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\\
&-p_i^{2}x_i^{2}\left[\frac{1}{P\left(X\le x_{i-1}\right)}-\frac{1}{P\left(X\ge x_{i+1}\right)}\right],
\end{aligned}\tag{48}
\]
and as $\lim_{N\to\infty}p_i^{2}x_i^{2}\left[\frac{1}{P(X\le x_{i-1})}-\frac{1}{P(X\ge x_{i+1})}\right]=0$, we can further simplify $\Delta\operatorname{Var}2$ as follows:
\[
\begin{aligned}
\Delta\operatorname{Var}2
&\approx\left[\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^{2}\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^{2}\left(X\ge x_{i+1}\right)}\right]
-2p_ix_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\\
&=p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right)-2x_i\right].
\end{aligned}\tag{49}
\]
To find the minimum of $\Delta\operatorname{Var}2$, let us locate where $\Delta\operatorname{Var}2=0$:
\[
p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right)-2x_i\right]=0.\tag{50}
\]
Figure 3: Discrete random variables.
But $p_i\neq 0$, and the first factor $\left[\dfrac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\dfrac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]<0$, since $x_i$ is an increasing sequence, which implies that $\mu_{i-1}<\mu_i$, where
\[
\mu_{i-1}=\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)},\qquad
\mu_i=\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}.\tag{51}
\]
Therefore
\[
\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]-2x_i=0.\tag{52}
\]
Let us define
\[
F(x_i)=2x_i-\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right].
\]
Then
\[
F(x_i)=0
\;\Longrightarrow\;
x_i=\frac{1}{2}\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
=\frac{\mu_{i-1}+\mu_i}{2},\tag{53}
\]
where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i)\cdot F(x_{i+1})<0$ and $F(x_i)<0$ (see Figure 3).
After finding a few local minima, we obtain the global minimum by taking the smallest among them.
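The discrete procedure above can be sketched as follows. This is our own illustration (function names are not from the paper): scan the sorted sample for sign changes of $F(x_i)$ from (53), then keep the candidate split with the smallest two-cluster variance.

```python
def two_centers_discrete(data):
    """Two-cluster split of 1-D data: scan F(x_i) from (53) for a sign
    change, then take the candidate split with the smallest Var2."""
    xs = sorted(data)
    n = len(xs)
    mean = lambda seg: sum(seg) / len(seg)

    def F(i):
        # F(x_i) = 2*x_i - (mu_{i-1} + mu_i), with mu_{i-1} the mean of
        # x_j <= x_{i-1} and mu_i the mean of x_j >= x_{i+1}
        return 2 * xs[i] - (mean(xs[:i]) + mean(xs[i + 1:]))

    # candidate separators: F(x_i) < 0 <= F(x_{i+1})
    cands = [i for i in range(1, n - 2) if F(i) < 0 <= F(i + 1)]
    if not cands:                     # degenerate data: fall back to all splits
        cands = list(range(n - 1))

    def cost(i):                      # Var2 with left = xs[:i+1], right = xs[i+1:]
        l, r = xs[:i + 1], xs[i + 1:]
        ml, mr = mean(l), mean(r)
        return (sum((x - ml) ** 2 for x in l)
                + sum((x - mr) ** 2 for x in r), ml, mr)

    _, mu1, mu2 = min(cost(i) for i in cands)
    return mu1, mu2
```

On a clearly bimodal sample such as `[1, 2, 3, 10, 11, 12]`, the sign change occurs between 3 and 10, giving centers 2 and 11.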
3. Face Localization Using the Proposed Method

The superiority of the modified k-means algorithm over the standard algorithm now becomes clear in terms of finding
Figure 4: The original image.
Figure 5: The face with shade only.
a global minimum, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified k-means algorithm. The image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.
the first class and will be assigned to 0 values This will keepthe pixels with values above the threshold and these pixelsbelong to both the face part and the illuminated part of thebackground as shown in Figure 5
In order to remove the illuminated background part, another threshold is applied to the pixels using the algorithm, this time to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. The pixels with values less than the threshold are assigned the value 0, and the remaining nonzero pixels represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.
Figure 6: The face without illumination.
Figure 7: The filtered image.
One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
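The two-pass thresholding described above can be sketched on a toy example. The pixel intensities and list-based "image" below are our own illustration, not from the paper; each pass uses the fixed-point condition $M=(\mu_1+\mu_2)/2$ from (26) to pick the threshold:

```python
def split_threshold(vals):
    """Two-class split of a flat list of pixel intensities via the
    fixed point M = (mu1 + mu2) / 2 of (26)."""
    m = sum(vals) / len(vals)          # start from the global mean
    for _ in range(100):
        lo = [v for v in vals if v < m]
        hi = [v for v in vals if v >= m]
        if not lo or not hi:
            break
        m_new = (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2
        if abs(m_new - m) < 1e-9:
            break
        m = m_new
    return m

# Toy "image": 50 dark background pixels, 30 illuminated-background pixels,
# 20 bright face pixels (intensities are illustrative only).
img = [10] * 50 + [120] * 30 + [200] * 20
t1 = split_threshold(img)              # pass 1: background vs. bright pixels
bright = [v for v in img if v >= t1]
t2 = split_threshold(bright)           # pass 2: illumination vs. face
face = [v for v in bright if v >= t2]  # only the face-intensity pixels remain
```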
4. Results
In this section, the experimental results of the proposed algorithm are analyzed and discussed. It starts with a description of the datasets used in this work, followed by the results of the previous part of the work. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 x 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 x 286; 300 images from the database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset, with image size 896 x 592; 300 images from the database were selected based on different conditions such as indoor/outdoor.
4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)
Figure 8: The corresponding window in the original image.
Figure 9: Samples from the MIT-CBCL dataset.
at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 x 115 pixels. The images of each person are taken under different light conditions, clutter backgrounds, scales, and different poses. A few examples of these images are shown in Figure 9.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 x 286 pixels. Each one shows the frontal view of a face of one out of 23 different test persons. The images were taken under different lighting conditions with complex backgrounds and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the result was 100% accuracy with a
Figure 10: Sample of faces from the BioID dataset for localization.
Figure 11: Sample of faces from the Caltech dataset for localization.
Table 1: Face localization using the modified k-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter backgrounds, as well as indoor/outdoor settings.

Dataset     Images   Accuracy (%)
MIT-Set 1   150      100
MIT-Set 2   150      100
BioID       300      100
Caltech     300      100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test, we focused on images of faces from different angles, from 30 degrees left to 30 degrees right, and on variation in the background, as well as indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed modified k-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test: the localization time. From the results, the proposed method can locate the face in a very short time because it has low complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.
Figure 12 shows some examples of face localization on the MIT-CBCL dataset.
In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). However, a filtering operation is needed to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position on an image from theCaltech dataset
5. Conclusion
In this paper, we focus on improving ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimension (8) and to more subclusters.
Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments

This study is a part of Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Face Database," BioID Support Downloads, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
4 Mathematical Problems in Engineering
Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the $k-1$ cluster centers from the previous iteration are used to compute the k-partition of a dataset. The starting point of the $k$th cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points $A$ in the $n$-dimensional space, the global k-means computes the centroid $X^1$ of the set $A$ as

\[ X^1 = \frac{1}{m}\sum_{i=1}^{m} a_i, \qquad a_i \in A,\; i = 1,\dots,m. \tag{4} \]
Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison to the multistart k-means (MS k-means) and the GKM algorithms when the number of clusters $k > 5$. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of the standard k-means clustering is the global kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization. It can also handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.
Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, where a weight selection procedure in the W-k-means algorithm is proposed. The evaluation function $F(I)$ is used for comparison:

\[ F(I) = \sqrt{R}\,\sum_{i=1}^{R}\frac{\left(e_i/256\right)^2}{\sqrt{A_i}}, \tag{5} \]

where $I$ is the segmented image, $R$ is the number of regions in the segmented image, $A_i$ is the area (the number of pixels) of the $i$th region, and $e_i$ is the color error of region $i$, defined as the sum of the Euclidean distances between the color vectors of the original image and the segmented image over region $i$. Smaller values of $F(I)$ indicate better segmentation. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
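The evaluation function in (5) can be computed directly from a region labeling. The sketch below assumes a flat list-of-pixels representation and a hypothetical helper name `evaluation_F`; it is an illustration of the formula, not code from [38].

```python
import math

def evaluation_F(original, segmented, labels):
    """Evaluation function F(I) from Eq. (5).

    original, segmented: lists of color tuples, one per pixel.
    labels: list of region indices, one per pixel.
    (This pixel-list layout is an assumption; the paper does not fix one.)
    """
    regions = {}  # region label -> (area A_i, color error e_i)
    for orig, seg, lab in zip(original, segmented, labels):
        area, err = regions.get(lab, (0, 0.0))
        # e_i accumulates the Euclidean distance between original
        # and segmented color vectors over the region
        regions[lab] = (area + 1, err + math.dist(orig, seg))
    R = len(regions)
    return math.sqrt(R) * sum((err / 256.0) ** 2 / math.sqrt(area)
                              for area, err in regions.values())
```

A perfect segmentation (segmented image identical to the original) gives $e_i = 0$ for every region and therefore $F(I) = 0$, the best possible score under this criterion.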
Figure 2: The separation points in two dimensions (axes $X$ and $Y$; separation points $M_x$, $M_{x1}$, $M_{x2}$ and $M_y$, $M_{y1}$, $M_{y2}$; cluster centers $C_1$ and $C_2$).
The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering for color images is presented, using a filtering algorithm for the hardware implementation. Suliman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce clustering that is more adaptive than other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.
2.2. The Proposed Methodology. In this method the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values; the pixels are then clustered into two classes by applying a threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class. Therefore the algorithm is applied again to the face class to obtain only the face part.
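The two-pass clustering just described can be sketched as follows. The helper `optimal_threshold` stands in for the optimized separator of Section 2.2.1 using a brute-force scan over split points; the function names and the grayscale pixel representation are illustrative assumptions, not the authors' code.

```python
def optimal_threshold(pixels):
    """Separator M = (mu1 + mu2) / 2 of the best two-class split (Eq. (26)),
    found here by brute-force scanning of candidate split points."""
    xs = sorted(pixels)
    n = len(xs)
    total = sum(xs)
    best_M, best_var = xs[0], float("inf")
    left_sum = 0.0
    for i in range(1, n):
        left_sum += xs[i - 1]
        mu1 = left_sum / i
        mu2 = (total - left_sum) / (n - i)
        # within-class scatter of this split
        var = sum((x - mu1) ** 2 for x in xs[:i]) + \
              sum((x - mu2) ** 2 for x in xs[i:])
        if var < best_var:
            best_var, best_M = var, (mu1 + mu2) / 2
    return best_M

def localize_face(pixels):
    """Two-pass clustering: the first threshold x1 removes the dark
    (nonface) background; the second threshold x2, applied to the
    remaining pixels, separates the illuminated background [x1, x2]
    from the face class [x2, 255]."""
    x1 = optimal_threshold(pixels)
    face_and_light = [p for p in pixels if p >= x1]
    x2 = optimal_threshold(face_and_light)
    # pixels below the second threshold are assigned 0; the rest is the face
    return [p if p >= x2 else 0 for p in pixels]
```

On a synthetic vector with three intensity bands (dark background, illuminated background, bright face) the two passes zero out both background bands and keep only the face pixels.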
2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using derivatives to determine the optimum separation point. The algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. As an example, say we would like to cluster an image into two subclasses: face and nonface. We therefore look for a separator that splits the image into two different clusters in two dimensions; the means of the clusters can then be found easily. Figure 2 shows the separation points and the mean of each class. To find these points, we propose the following modification in the continuous and discrete cases.
Let $x_1, x_2, x_3, \dots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$:

\[ f(x) = \sum_{i=1}^{N}\frac{1}{N}\,\delta(x - x_i), \qquad x_i \in \mathbb{R}, \tag{6} \]

where $\delta(x)$ is the Dirac function.

When the size of the data is very large, the question of the cost of minimizing $\sum_{i=1}^{k}\sum_{x_j\in s_i}|x_j-\mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let

\[ \operatorname{Var}k = \sum_{i=1}^{k}\sum_{x_j\in s_i}\left|x_j-\mu_i\right|^2. \tag{7} \]
Then it is enough to minimize the one-dimensional case without loss of generality, since

\[ \min_{s_i}\;\sum_{i=1}^{k}\sum_{x_j\in s_i}\left|x_j-\mu_i\right|^2 \;\ge\; \sum_{l=1}^{d}\min_{s_i}\left(\sum_{i=1}^{k}\sum_{x_j\in s_i}\left|x_j^l-\mu_i^l\right|^2\right), \tag{8} \]

where $d$ is the dimension of $x_j=(x_j^1,\dots,x_j^d)$ and $\mu_i=(\mu_i^1,\dots,\mu_i^d)$.
2.2.2. Continuous Case. For the continuous case, $f(x)$ acts like a probability density function (PDF). Then for 2 classes we need to minimize

\[ \operatorname{Var}2 = \int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx. \tag{9} \]
If $d \ge 2$, we can use (8):

\[ \min \operatorname{Var}2 \ge \sum_{k=1}^{d}\min\left(\int_{-\infty}^{M}\left(x^k-\mu_1^k\right)^2 f(x^k)\,dx^k + \int_{M}^{\infty}\left(x^k-\mu_2^k\right)^2 f(x^k)\,dx^k\right). \tag{10} \]
Let $x\in\mathbb{R}$ be any random variable with probability density function $f(x)$. We want to find the $\mu_1$ and $\mu_2$ that minimize (9):

\[ (\mu_1^{*},\mu_2^{*}) = \arg\min_{M,\mu_1,\mu_2}\left[\int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx\right], \tag{11} \]

\[ \min_{M,\mu_1,\mu_2}\operatorname{Var}2 = \min_{M}\left[\min_{\mu_1}\int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx + \min_{\mu_2}\int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx\right], \tag{12} \]

\[ \min_{M,\mu_1,\mu_2}\operatorname{Var}2 = \min_{M}\left[\min_{\mu_1}\int_{-\infty}^{M}\left(x-\mu_1(M)\right)^2 f(x)\,dx + \min_{\mu_2}\int_{M}^{\infty}\left(x-\mu_2(M)\right)^2 f(x)\,dx\right]. \tag{13} \]
We know that

\[ \min_{x} E\left[(X-x)^2\right] = E\left[\left(X-E(X)\right)^2\right] = \operatorname{Var}(X). \tag{14} \]
So we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\operatorname{Var}_1$ and $\operatorname{Var}_2$ as follows:

\[ \operatorname{Var}_1 = \int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx. \tag{15} \]
After simplification we get

\[ \operatorname{Var}_1 = \int_{-\infty}^{M}x^2 f(x)\,dx + \mu_1^2\int_{-\infty}^{M}f(x)\,dx - 2\mu_1\int_{-\infty}^{M}x f(x)\,dx, \]
\[ \operatorname{Var}_2 = \int_{M}^{\infty}x^2 f(x)\,dx + \mu_2^2\int_{M}^{\infty}f(x)\,dx - 2\mu_2\int_{M}^{\infty}x f(x)\,dx. \tag{16} \]
To find the minimum, $\operatorname{Var}_1$ and $\operatorname{Var}_2$ are partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification we conclude that the minimum occurs when

\[ \mu_1 = \frac{E(X_-)}{P(X_-)}, \qquad \mu_2 = \frac{E(X_+)}{P(X_+)}, \tag{17} \]

where $X_- = X\cdot 1_{x<M}$ and $X_+ = X\cdot 1_{x\ge M}$.
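Equation (17) states that each optimal center is the conditional mean of its own side of the separator. The quick numerical check below (an illustrative sketch, not part of the paper) fixes an arbitrary separator on a Gaussian sample and confirms that perturbing either center away from its conditional mean can only increase the two-class scatter.

```python
import random

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(20000)]
M = 0.3  # arbitrary fixed separator (an assumption for the demo)

left = [x for x in xs if x < M]
right = [x for x in xs if x >= M]

# centers predicted by Eq. (17): conditional means on each side of M
mu1 = sum(left) / len(left)
mu2 = sum(right) / len(right)

def var2(c1, c2):
    """Two-class scatter for the fixed split at M and candidate centers."""
    return sum((x - c1) ** 2 for x in left) + sum((x - c2) ** 2 for x in right)

# moving either center off its conditional mean strictly increases Var2
for eps in (0.05, -0.05, 0.2):
    assert var2(mu1, mu2) < var2(mu1 + eps, mu2)
    assert var2(mu1, mu2) < var2(mu1, mu2 + eps)
```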
To find the minimum of $\operatorname{Var}2$ we need $\partial\mu/\partial M$:

\[ \mu_1 = \frac{\int_{-\infty}^{M}x f(x)\,dx}{\int_{-\infty}^{M}f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_1}{\partial M} = \frac{M f(M)\,P(X_-) - f(M)\,E(X_-)}{P^2(X_-)}, \]
\[ \mu_2 = \frac{\int_{M}^{\infty}x f(x)\,dx}{\int_{M}^{\infty}f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_2}{\partial M} = \frac{-M f(M)\,P(X_+) + f(M)\,E(X_+)}{P^2(X_+)}. \tag{18} \]
After simplification we get

\[ \frac{\partial\mu_1}{\partial M} = \frac{f(M)}{P^2(X_-)}\left[M\,P(X_-) - E(X_-)\right], \qquad \frac{\partial\mu_2}{\partial M} = \frac{f(M)}{P^2(X_+)}\left[E(X_+) - M\,P(X_+)\right]. \tag{19} \]
In order to find the minimum of $\operatorname{Var}2$, we find $\partial\operatorname{Var}_1/\partial M$ and $\partial\operatorname{Var}_2/\partial M$ separately and add them:

\[ \frac{\partial}{\partial M}(\operatorname{Var}2) = \frac{\partial}{\partial M}(\operatorname{Var}_1) + \frac{\partial}{\partial M}(\operatorname{Var}_2), \]
\[ \frac{\partial\operatorname{Var}_1}{\partial M} = \frac{\partial}{\partial M}\left(\int_{-\infty}^{M}x^2 f(x)\,dx + \mu_1^2\int_{-\infty}^{M}f(x)\,dx - 2\mu_1\int_{-\infty}^{M}x f(x)\,dx\right) \]
\[ = M^2 f(M) + 2\mu_1\mu_1'\,P(X_-) + \mu_1^2 f(M) - 2\mu_1' E(X_-) - 2\mu_1 M f(M). \tag{20} \]
After simplification (the $\mu_1'$ terms cancel, since $E(X_-)=\mu_1 P(X_-)$), we get

\[ \frac{\partial\operatorname{Var}_1}{\partial M} = f(M)\left[M^2 + \frac{E^2(X_-)}{P^2(X_-)} - 2M\,\frac{E(X_-)}{P(X_-)}\right]. \tag{21} \]
Similarly,

\[ \frac{\partial\operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2\mu_2\mu_2'\,P(X_+) - \mu_2^2 f(M) - 2\mu_2' E(X_+) + 2\mu_2 M f(M). \tag{22} \]
After simplification we get

\[ \frac{\partial\operatorname{Var}_2}{\partial M} = f(M)\left[-M^2 - \frac{E^2(X_+)}{P^2(X_+)} + 2M\,\frac{E(X_+)}{P(X_+)}\right]. \tag{23} \]
Adding these two derivatives and simplifying, it follows that

\[ \frac{\partial}{\partial M}(\operatorname{Var}2) = f(M)\left[\left(\frac{E^2(X_-)}{P^2(X_-)} - \frac{E^2(X_+)}{P^2(X_+)}\right) - 2M\left(\frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)}\right)\right]. \tag{24} \]
Equating this to 0 gives

\[ \frac{\partial}{\partial M}(\operatorname{Var}2) = 0 \]
\[ \Longrightarrow\; \left(\frac{E^2(X_-)}{P^2(X_-)} - \frac{E^2(X_+)}{P^2(X_+)}\right) - 2M\left(\frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)}\right) = 0 \]
\[ \Longrightarrow\; \left[\frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)}\right]\left[\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M\right] = 0. \tag{25} \]
But $E(X_-)/P(X_-) - E(X_+)/P(X_+) \ne 0$, since otherwise $\mu_1=\mu_2$, which contradicts the initial assumption. Therefore

\[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2}\left[\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)}\right] = \frac{\mu_1+\mu_2}{2}. \tag{26} \]
Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\operatorname{Var}2$ directly.
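The fixed-point characterization (26) suggests a simple numerical scheme: iterate $M \leftarrow (\mu_1(M)+\mu_2(M))/2$, computing the conditional means by numerical integration. The sketch below applies this to an illustrative two-component Gaussian mixture (the mixture parameters, integration range, and function names are all assumptions for demonstration); by symmetry the separator should converge to the midpoint between the two modes.

```python
import math

def mixture_pdf(x):
    """Illustrative bimodal density: equal mixture of N(0,1) and N(6,1)."""
    g = lambda t, m: math.exp(-(t - m) ** 2 / 2) / math.sqrt(2 * math.pi)
    return 0.5 * g(x, 0.0) + 0.5 * g(x, 6.0)

def trapezoid(fn, a, b, n=2000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    return h * (fn(a) / 2 + fn(b) / 2 + sum(fn(a + i * h) for i in range(1, n)))

def fixed_point_M(pdf, lo=-6.0, hi=12.0, M0=1.0, iters=30):
    """Iterate M <- (mu1(M) + mu2(M)) / 2, per Eq. (26)."""
    M = M0
    for _ in range(iters):
        p_minus = trapezoid(pdf, lo, M)
        p_plus = trapezoid(pdf, M, hi)
        e_minus = trapezoid(lambda x: x * pdf(x), lo, M)
        e_plus = trapezoid(lambda x: x * pdf(x), M, hi)
        M = 0.5 * (e_minus / p_minus + e_plus / p_plus)
    return M

M = fixed_point_M(mixture_pdf)
```

For this symmetric mixture the iteration settles at the midpoint $M = 3$ between the two component means.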
To find the minimum in terms of $M$, it is enough to consider

\[ \frac{\partial\operatorname{Var}2\left(\mu_1(M),\mu_2(M)\right)}{\partial M} = f(M)\left[\frac{E^2(X_-)}{P^2(X<M)} - \frac{E^2(X_+)}{P^2(X\ge M)} - 2M\left(\frac{E(X_-)}{P(X<M)} - \frac{E(X_+)}{P(X\ge M)}\right)\right]. \tag{27} \]
Then we find

\[ M = \frac{1}{2}\left[\frac{E(X_+)}{P(X_+)} + \frac{E(X_-)}{P(X_-)}\right]. \tag{28} \]
2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

\[ f(x) = \begin{cases} \dfrac{1}{b-a}, & a<x<b,\\[4pt] 0, & \text{otherwise}, \end{cases} \tag{29} \]
where $a$ and $b$ are the two parameters of the distribution. The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x<M}$ are

\[ E(X_+) = \frac{1}{2}\left(\frac{b^2-M^2}{b-a}\right), \qquad E(X_-) = \frac{1}{2}\left(\frac{M^2-a^2}{b-a}\right). \tag{30} \]
Putting all these in (28) and solving, we get

\[ M = \frac{1}{2}(a+b), \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = \frac{1}{4}(3a+b), \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \frac{1}{4}(a+3b). \tag{31} \]
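The closed forms in (31) are easy to confirm numerically. The sketch below compares them with conditional means over a dense grid standing in for $\mathrm{Uniform}(a,b)$; the particular values of $a$ and $b$ are arbitrary choices for the demonstration.

```python
a, b = 2.0, 10.0
n = 100_000
# dense midpoint grid standing in for Uniform(a, b) samples
xs = [a + (b - a) * (i + 0.5) / n for i in range(n)]

M = (a + b) / 2          # separator from Eq. (31)
left = [x for x in xs if x < M]
right = [x for x in xs if x >= M]

mu1 = sum(left) / len(left)
mu2 = sum(right) / len(right)

assert abs(mu1 - (3 * a + b) / 4) < 1e-6   # (3a + b) / 4 = 4.0
assert abs(mu2 - (a + 3 * b) / 4) < 1e-6   # (a + 3b) / 4 = 8.0
assert abs(M - (mu1 + mu2) / 2) < 1e-9     # M satisfies Eq. (26)
```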
(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

\[ f(x;\mu,\sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\,e^{-(\ln x-\mu)^2/2\sigma^2}, \qquad x>0. \tag{32} \]
The probabilities to the right and left of $M$ are, respectively,

\[ P(X_+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{\ln M-\mu}{\sigma\sqrt{2}}\right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(\frac{\ln M-\mu}{\sigma\sqrt{2}}\right). \tag{33} \]
(33)
The expected values for119883 sdot 1119909ge 119872 and119883 sdot 1
119909lt 119872 are
119864 (119883+)
= [
1
2
119890120583+(12)120590
2
1 + erf (12
(120583 + 1205902minus ln119872)radic2
120590
)]
119864 (119883minus)
= [
1
2
119890120583+(12)120590
2
1 + erf (12
(minus120583 minus 1205902+ ln119872)radic2
120590
)]
(34)
Putting all these in (28), assuming $\mu=0$ and $\sigma=1$, and solving numerically, we get

\[ M = 4.244942583, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = 7.293057348. \tag{35} \]
(3) Normal Distribution. The probability density function of a normal distribution is

\[ f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/2\sigma^2}. \tag{36} \]
The probabilities to the right and left of $M$ are, respectively,

\[ P(X_+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{M-\mu}{\sigma\sqrt{2}}\right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(\frac{M-\mu}{\sigma\sqrt{2}}\right). \tag{37} \]
The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x<M}$ are

\[ E(X_+) = \frac{\mu}{2}\left[1-\operatorname{erf}\left(\frac{M-\mu}{\sigma\sqrt{2}}\right)\right] + \frac{\sigma}{\sqrt{2\pi}}\,e^{-(M-\mu)^2/2\sigma^2}, \]
\[ E(X_-) = \frac{\mu}{2}\left[1+\operatorname{erf}\left(\frac{M-\mu}{\sigma\sqrt{2}}\right)\right] - \frac{\sigma}{\sqrt{2\pi}}\,e^{-(M-\mu)^2/2\sigma^2}. \tag{38} \]
Putting all these in (28) and solving, we get $M=\mu$. Assuming $\mu=0$ and $\sigma=1$, we get

\[ M=0, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \sqrt{\frac{2}{\pi}}. \tag{39} \]
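The result (39) can be verified from (37) and (38) with Python's `math.erf` (an illustrative verification sketch; the helper name `centers` is an assumption):

```python
import math

def centers(M, mu=0.0, sigma=1.0):
    """mu1, mu2 from Eqs. (37)-(38) for the N(mu, sigma^2) distribution."""
    z = (M - mu) / (sigma * math.sqrt(2))
    tail = sigma / math.sqrt(2 * math.pi) * math.exp(-((M - mu) ** 2) / (2 * sigma ** 2))
    p_minus = 0.5 + 0.5 * math.erf(z)
    e_minus = mu / 2 * (1 + math.erf(z)) - tail
    p_plus = 1 - p_minus
    e_plus = mu - e_minus          # E(X-) + E(X+) = E[X] = mu
    return e_minus / p_minus, e_plus / p_plus

mu1, mu2 = centers(0.0)            # standard normal, separator M = 0
assert abs(mu1 + math.sqrt(2 / math.pi)) < 1e-12   # mu1 = -sqrt(2/pi)
assert abs(mu2 - math.sqrt(2 / math.pi)) < 1e-12   # mu2 = +sqrt(2/pi)
assert abs((mu1 + mu2) / 2) < 1e-12                # M = mu is the fixed point
```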
2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the probability mass of an observation $x_j$.

Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x\le x_i}$ as the mean of all $x\le x_i$, and $\mu_{x\ge x_i}$ as the mean of all $x\ge x_i$. Define $\operatorname{Var}2(M_{i-1})$ as the variance of the two centers forced by $M_{i-1}$ as a separator:

\[ \operatorname{Var}2(M_{i-1}) = \sum_{x_j\le x_{i-1}}\left(x_j-\mu_{x_j\le x_{i-1}}\right)^2 p_j + \sum_{x_j\ge x_i}\left(x_j-\mu_{x_j\ge x_i}\right)^2 p_j, \tag{40} \]

\[ \operatorname{Var}2(M_i) = \sum_{x_j\le x_i}\left(x_j-\mu_{x_j\le x_i}\right)^2 p_j + \sum_{x_j\ge x_{i+1}}\left(x_j-\mu_{x_j\ge x_{i+1}}\right)^2 p_j. \tag{41} \]
The first part of $\operatorname{Var}2(M_{i-1})$ simplifies as follows:

\[ \sum_{x_j\le x_{i-1}}\left(x_j-\mu_{x_j\le x_{i-1}}\right)^2 p_j = \sum_{x_j\le x_{i-1}}x_j^2 p_j - 2\mu_{x_j\le x_{i-1}}\sum_{x_j\le x_{i-1}}x_j p_j + \mu_{x_j\le x_{i-1}}^2\sum_{x_j\le x_{i-1}}p_j \]
\[ = E\left(X^2\cdot 1_{x_j\le x_{i-1}}\right) - \frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}. \tag{42} \]
Similarly,

\[ \sum_{x_j\ge x_i}\left(x_j-\mu_{x_j\ge x_i}\right)^2 p_j = E\left(X^2\cdot 1_{x_j\ge x_i}\right) - \frac{E^2\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}, \]

so that

\[ \operatorname{Var}2(M_{i-1}) = E\left(X^2\cdot 1_{x_j\le x_{i-1}}\right) - \frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} + E\left(X^2\cdot 1_{x_j\ge x_i}\right) - \frac{E^2\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}, \]
\[ \operatorname{Var}2(M_i) = E\left(X^2\cdot 1_{x_j\le x_i}\right) - \frac{E^2\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)} + E\left(X^2\cdot 1_{x_j\ge x_{i+1}}\right) - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}. \tag{43} \]
Define $\Delta\operatorname{Var}2 = \operatorname{Var}2(M_i) - \operatorname{Var}2(M_{i-1})$. Since the $E(X^2\cdot 1)$ terms add up to the same total $E(X^2)$ in both expressions, they cancel:

\[ \Delta\operatorname{Var}2 = \left[-\frac{E^2\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)} - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right] - \left[-\frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E^2\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}\right]. \tag{44} \]
Now, in order to simplify the calculation, we rearrange $\Delta\operatorname{Var}2$ as follows:

\[ \Delta\operatorname{Var}2 = \left[\frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E^2\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}\right] + \left[\frac{E^2\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)} - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]. \tag{45} \]
The first part is simplified using $E(X\cdot 1_{x_j\le x_i}) = E(X\cdot 1_{x_j\le x_{i-1}}) + x_i p_i$ and $P(X\le x_i) = P(X\le x_{i-1}) + p_i$:

\[ \frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E^2\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)} = \frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{\left[E\left(X\cdot 1_{x_j\le x_{i-1}}\right) + x_i p_i\right]^2}{P\left(X\le x_{i-1}\right) + p_i} \]
\[ = \frac{p_i E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right) - 2x_i p_i E\left(X\cdot 1_{x_j\le x_{i-1}}\right)P\left(X\le x_{i-1}\right) - x_i^2 p_i^2 P\left(X\le x_{i-1}\right)}{P\left(X\le x_{i-1}\right)\left[P\left(X\le x_{i-1}\right) + p_i\right]}. \tag{46} \]
Similarly, simplifying the second part with $E(X\cdot 1_{x_j\ge x_i}) = E(X\cdot 1_{x_j\ge x_{i+1}}) + x_i p_i$ and $P(X\ge x_i) = P(X\ge x_{i+1}) + p_i$ yields

\[ \frac{E^2\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)} - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)} = \frac{\left[E\left(X\cdot 1_{x_j\ge x_{i+1}}\right) + x_i p_i\right]^2}{P\left(X\ge x_{i+1}\right) + p_i} - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)} \]
\[ = \frac{-p_i E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right) + 2x_i p_i E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)P\left(X\ge x_{i+1}\right) + x_i^2 p_i^2 P\left(X\ge x_{i+1}\right)}{P\left(X\ge x_{i+1}\right)\left[P\left(X\ge x_{i+1}\right) + p_i\right]}. \tag{47} \]
In general, these types of optimization arise when the dataset is very big. If we consider $N$ to be very large, then $p_i \ll \max\left(\sum_{j\ge i+1}p_j,\, \sum_{j\le i-1}p_j\right)$, $P(X\le x_{i-1}) + p_i \approx P(X\le x_i)$, and $P(X\ge x_{i+1}) + p_i \approx P(X\ge x_i)$, so

\[ \Delta\operatorname{Var}2 \approx \left[\frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^2\left(X\le x_{i-1}\right)} - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^2\left(X\ge x_{i+1}\right)}\right] - 2p_i x_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right] \]
\[ \quad - p_i^2 x_i^2\left[\frac{1}{P\left(X\le x_{i-1}\right)} - \frac{1}{P\left(X\ge x_{i+1}\right)}\right], \tag{48} \]
and as $\lim_{N\to\infty} p_i^2 x_i^2\left[\left(1/P(X\le x_{i-1})\right) - \left(1/P(X\ge x_{i+1})\right)\right] = 0$, we can further simplify $\Delta\operatorname{Var}2$ as follows:

\[ \Delta\operatorname{Var}2 \approx \left[\frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^2\left(X\le x_{i-1}\right)} - \frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^2\left(X\ge x_{i+1}\right)}\right] - 2p_i x_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right] \]
\[ = p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} + \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right) - 2x_i\right]. \tag{49} \]
To find the minimum of $\Delta\operatorname{Var}2$, let us locate where $\Delta\operatorname{Var}2 = 0$:

\[ p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} - \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} + \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right) - 2x_i\right] = 0. \tag{50} \]
Figure 3: Discrete random variables $x_1, x_2, \dots, x_n$ and the separator $M$ on the variance curve.
But $p_i \ne 0$, and $\left[\left(E(X\cdot 1_{x_j\le x_{i-1}})/P(X\le x_{i-1})\right) - \left(E(X\cdot 1_{x_j\ge x_{i+1}})/P(X\ge x_{i+1})\right)\right] < 0$, since $x_i$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where

\[ \mu_{i-1} = \frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}, \qquad \mu_i = \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}. \tag{51} \]
Therefore

\[ \left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} + \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right] - 2x_i = 0. \tag{52} \]
Let us define

\[ F(x_i) = 2x_i - \left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} + \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]. \]

Then

\[ F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2}\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)} + \frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right] \;\Longrightarrow\; x_i = \frac{\mu_{i-1}+\mu_i}{2}, \tag{53} \]
where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i)\cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding the few local minima, we can get the global minimum by choosing the smallest among them.
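The discrete search just described can be implemented with prefix sums so that each $F(x_i)$ costs $O(1)$ after an $O(N\log N)$ sort. The sketch below assumes equal weights $p_j = 1/N$ and uses illustrative function names; it is one reading of (53) and the sign-change test, not the authors' code.

```python
def optimized_two_means(data):
    """Return (M, mu1, mu2) for the split minimizing the two-class variance,
    scanning the sign changes of F(x_i) = 2*x_i - (mu_{i-1} + mu_i)."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)

    # prefix[i] = sum of xs[:i]; prefix_sq[i] = sum of squares of xs[:i]
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def F(i):
        # mu_{i-1}: mean of xs[:i]; mu_i: mean of xs[i+1:]
        mu_left = prefix[i] / i
        mu_right = (total - prefix[i + 1]) / (n - i - 1)
        return 2 * xs[i] - (mu_left + mu_right)

    def var2(i):
        # two-class scatter when xs[:i] and xs[i:] are the clusters
        left_sum, right_sum = prefix[i], total - prefix[i]
        left_sq, right_sq = prefix_sq[i], total_sq - prefix_sq[i]
        return (left_sq - left_sum ** 2 / i) + \
               (right_sq - right_sum ** 2 / (n - i))

    # candidate split indices where F changes sign (local minima of Var2)
    candidates = [i + 1 for i in range(1, n - 2) if F(i) * F(i + 1) < 0]
    candidates = candidates or list(range(1, n))   # fallback: full scan
    best = min(candidates, key=var2)               # global minimum among them
    mu1 = prefix[best] / best
    mu2 = (total - prefix[best]) / (n - best)
    return (mu1 + mu2) / 2, mu1, mu2
```

On a clearly bimodal sample the sign-change test isolates the gap between the two modes, and the returned separator is the midpoint of the two cluster means, as (53) requires.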
3. Face Localization Using the Proposed Method

The superiority of the modified K-means algorithm over the standard algorithm in terms of finding a global minimum is now clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

Figure 4: The original image.

Figure 5: The face with shade only.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.
In order to remove the illuminated background part, another threshold determined by the algorithm is applied to the remaining pixels, meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. The pixels with values less than this second threshold are assigned the value 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.
Figure 6: The face without illumination.

Figure 7: The filtered image.
One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.

Figure 8 shows the final result, which is a window containing the face extracted from the original image.
Figure 8: The corresponding window in the original image.

4. Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work, followed by the results of the proposed method. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 of its images were selected based on different conditions such as light variation, pose, and background variation. The second dataset is the BioID dataset, with image size 384 × 286; 300 of its images were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset, with image size 896 × 592; 300 of its images were selected based on different conditions such as indoor/outdoor.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken under different light conditions, with clutter background, at different scales, and in different poses. A few examples of these images are shown in Figure 9.

Figure 9: Samples from the MIT-CBCL dataset.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of a face of one out of 23 different test persons. The images were taken under different lighting conditions, with a complex background, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Results. The proposed method was evaluated on these datasets, achieving 100% accuracy with a
Figure 10: Sample faces from the BioID dataset for localization.
Figure 11: Sample faces from the Caltech dataset for localization.
Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter background, as well as indoor/outdoor settings.

Dataset     Images   Accuracy (%)
MIT-Set 1   150      100
MIT-Set 2   150      100
BioID       300      100
Caltech     300      100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on the use of face images taken from different angles, from 30 degrees left to 30 degrees right, and on variation in the background, as well as on indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter considered in this test was the localization time. From the results, the proposed method can locate the face in a very short time, owing to its low complexity resulting from the use of differential equations.
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.
Figure 12 shows some examples of face localization on the MIT-CBCL dataset.
In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts; its result is shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position on an image from the Caltech dataset.
5. Conclusion
In this paper, we focus on improving ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is a straightforward generalization of our approach via (8), and extending it to more subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705-740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34-58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3-16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277-1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007-1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559-571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667-1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901-914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50-65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145-2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661-2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825-838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318-331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science+Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73-84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237-253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9-15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027-1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89-112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716-731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9-15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51-57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909-921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19-37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147-153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395-400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40-51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146-1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228-236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027-1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881-892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451-461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192-3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36-40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181-1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668-676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164-166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309-318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, BioID Support Downloads: BioID Face Database, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
points, we proposed the following modification in the continuous and discrete cases.

Let $x_1, x_2, x_3, \ldots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$:

$$f(x) = \sum_{i=1}^{N} \frac{1}{N}\,\delta(x - x_i), \qquad x_i \in \mathbb{R}, \qquad (6)$$
where $\delta(x)$ is the Dirac function.

When the size of the data is very large, the question of the cost of the minimization of $\sum_{i=1}^{k}\sum_{x_j \in s_i} |x_j - \mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let

$$\operatorname{Var} k = \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j - \mu_i|^2. \qquad (7)$$
Then it is enough to minimize $\operatorname{Var} 1$ without losing the generality, since

$$\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} \bigl|x_j - \mu_i\bigr|^2 \;\ge\; \sum_{l=1}^{d} \min_{s_i} \Biggl( \sum_{i=1}^{k} \sum_{x_j \in s_i} \bigl|x_j^l - \mu_i^l\bigr|^2 \Biggr), \qquad (8)$$

where $d$ is the dimension of $x_j = (x_j^1, \ldots, x_j^d)$ and $\mu_i = (\mu_i^1, \ldots, \mu_i^d)$.
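The inequality in (8) can be checked numerically on a tiny example by brute force; this sketch (ours, with hypothetical names) compares the optimal joint assignment against the sum of per-dimension optima:

```python
import itertools
import numpy as np

def wcss(points, labels, k):
    # within-cluster sum of squares for a given assignment
    return sum(((points[labels == i] - points[labels == i].mean(axis=0)) ** 2).sum()
               for i in range(k) if (labels == i).any())

def best_wcss(points, k=2):
    # brute-force the optimal 2-way assignment of n points
    n = len(points)
    best = float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        best = min(best, wcss(points, np.array(bits), k))
    return best

rng = np.random.default_rng(0)
pts = rng.normal(size=(8, 3))            # 8 points in d = 3 dimensions
joint = best_wcss(pts)                   # left-hand side of (8)
per_dim = sum(best_wcss(pts[:, [l]]) for l in range(pts.shape[1]))  # right-hand side
assert per_dim <= joint + 1e-9           # the bound in (8) holds
```

Each coordinate is optimized independently on the right, so the sum of per-dimension minima can never exceed the jointly constrained minimum.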
2.2.2. Continuous Case. For the continuous case, $f(x)$ is like the probability density function (PDF). Then for 2 classes we need to minimize the following equation:

$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx. \qquad (9)$$

If $d \ge 2$, we can use (8):

$$\min \operatorname{Var} 2 \ge \sum_{k=1}^{d} \min \Biggl( \int_{-\infty}^{M} \bigl(x^k - \mu_1^k\bigr)^2 f(x^k)\,dx + \int_{M}^{\infty} \bigl(x^k - \mu_2^k\bigr)^2 f(x^k)\,dx \Biggr). \qquad (10)$$
Let $x \in \mathbb{R}$ be any random variable with probability density function $f(x)$; we want to find the $\mu_1$ and $\mu_2$ that minimize (9), as follows:

$$(\mu_1^{*}, \mu_2^{*}) = \arg_{(\mu_1,\mu_2)} \Biggl[ \min_{M,\mu_1,\mu_2} \Bigl[ \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx \Bigr] \Biggr]. \qquad (11)$$
$$\min_{M,\mu_1,\mu_2} \operatorname{Var} 2 = \min_{M} \Bigl[ \min_{\mu_1} \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx \Bigr], \qquad (12)$$

$$\min_{M,\mu_1,\mu_2} \operatorname{Var} 2 = \min_{M} \Bigl[ \int_{-\infty}^{M} \bigl(x-\mu_1(M)\bigr)^2 f(x)\,dx + \int_{M}^{\infty} \bigl(x-\mu_2(M)\bigr)^2 f(x)\,dx \Bigr]. \qquad (13)$$
We know that

$$\min_{x} E\bigl[(X-x)^2\bigr] = E\bigl[\bigl(X-E(X)\bigr)^2\bigr] = \operatorname{Var}(X), \qquad (14)$$

so we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\operatorname{Var}_1$ and $\operatorname{Var}_2$ as follows:

$$\operatorname{Var}_1 = \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx. \qquad (15)$$

After simplification we get

$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx,$$
$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2 \int_{M}^{\infty} f(x)\,dx - 2\mu_2 \int_{M}^{\infty} x f(x)\,dx. \qquad (16)$$

To find the minimum, $\operatorname{Var}_1$ and $\operatorname{Var}_2$ are partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification, we conclude that the minimum occurs when

$$\mu_1 = \frac{E(X_-)}{P(X_-)}, \qquad \mu_2 = \frac{E(X_+)}{P(X_+)}, \qquad (17)$$

where $X_- = X \cdot 1_{X < M}$ and $X_+ = X \cdot 1_{X \ge M}$.
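Equation (17) says that, once the separator $M$ is fixed, the optimal centers are the conditional means on each side of $M$. A minimal empirical sketch (ours, with hypothetical names):

```python
import numpy as np

def class_centers(x, M):
    """Optimal centers for a fixed threshold M, per (17):
    mu1 = E(X-)/P(X-), mu2 = E(X+)/P(X+)  (empirical version)."""
    left, right = x[x < M], x[x >= M]
    return left.mean(), right.mean()

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
mu1, mu2 = class_centers(x, M=6.0)
print(mu1, mu2)  # 2.0 11.0
```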
To find the minimum of $\operatorname{Var} 2$, we need $\partial\mu/\partial M$:

$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_1}{\partial M} = \frac{M f(M)\,P(X_-) - f(M)\,E(X_-)}{P^2(X_-)},$$
$$\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_2}{\partial M} = \frac{f(M)\,E(X_+) - M f(M)\,P(X_+)}{P^2(X_+)}. \qquad (18)$$

After simplification we get

$$\frac{\partial\mu_1}{\partial M} = \frac{f(M)}{P^2(X_-)} \bigl[ M P(X_-) - E(X_-) \bigr], \qquad \frac{\partial\mu_2}{\partial M} = \frac{f(M)}{P^2(X_+)} \bigl[ E(X_+) - M P(X_+) \bigr]. \qquad (19)$$
In order to find the minimum of $\operatorname{Var} 2$, we find $\partial \operatorname{Var}_1 / \partial M$ and $\partial \operatorname{Var}_2 / \partial M$ separately and add them, as follows:

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = \frac{\partial}{\partial M}(\operatorname{Var}_1) + \frac{\partial}{\partial M}(\operatorname{Var}_2),$$
$$\frac{\partial \operatorname{Var}_1}{\partial M} = \frac{\partial}{\partial M}\Bigl( \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx \Bigr)$$
$$= M^2 f(M) + 2\mu_1 \mu_1' P(X_-) + \mu_1^2 f(M) - 2\mu_1' E(X_-) - 2\mu_1 M f(M). \qquad (20)$$

After simplification we get

$$\frac{\partial \operatorname{Var}_1}{\partial M} = f(M)\Biggl[ M^2 + \frac{E^2(X_-)}{P(X_-)^2} - 2M \frac{E(X_-)}{P(X_-)} \Biggr]. \qquad (21)$$
Similarly,

$$\frac{\partial \operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2\mu_2 \mu_2' P(X_+) - \mu_2^2 f(M) - 2\mu_2' E(X_+) + 2\mu_2 M f(M). \qquad (22)$$

After simplification we get

$$\frac{\partial \operatorname{Var}_2}{\partial M} = f(M)\Biggl[ -M^2 - \frac{E^2(X_+)}{P(X_+)^2} + 2M \frac{E(X_+)}{P(X_+)} \Biggr]. \qquad (23)$$
Adding both these derivatives and simplifying, it follows that

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = f(M)\Biggl[ \Biggl( \frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2} \Biggr) - 2M \Biggl( \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \Biggr) \Biggr]. \qquad (24)$$
Equating this with 0 gives

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = 0,$$
$$\Biggl( \frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2} \Biggr) - 2M \Biggl( \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \Biggr) = 0,$$
$$\Biggl[ \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \Biggr] \Biggl[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M \Biggr] = 0. \qquad (25)$$

But $\bigl[E(X_-)/P(X_-) - E(X_+)/P(X_+)\bigr] \ne 0$, as in that case $\mu_1 = \mu_2$, which contradicts the initial assumption. Therefore

$$\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2}\Biggl[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} \Biggr], \quad\text{that is,}\quad M = \frac{\mu_1 + \mu_2}{2}. \qquad (26)$$
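Condition (26) suggests a simple fixed-point scheme: alternate between computing the conditional means and moving $M$ to their midpoint. This is exactly Lloyd's update for $k = 2$ in one dimension; the sketch below is ours (hypothetical names) and assumes a well-separated sample:

```python
import numpy as np

def two_means_threshold(x, tol=1e-12, max_iter=1000):
    """Iterate M <- (mu1 + mu2)/2 from (26), i.e. Lloyd's update
    for k = 2 in one dimension (empirical sketch)."""
    M = x.mean()                        # start the separator at the mean
    for _ in range(max_iter):
        mu1 = x[x < M].mean()           # E(X-)/P(X-)
        mu2 = x[x >= M].mean()          # E(X+)/P(X+)
        M_new = 0.5 * (mu1 + mu2)
        if abs(M_new - M) < tol:
            M = M_new
            break
        M = M_new
    return M, mu1, mu2

x = np.array([1.0, 1.2, 0.8, 9.0, 9.3, 8.7])
M, mu1, mu2 = two_means_threshold(x)
print(M, mu1, mu2)  # close to 5.0, 1.0, 9.0
```

At a fixed point, $M$ satisfies (26) exactly, which is the separator condition derived above.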
Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\operatorname{Var} 2$ directly.

To find the minimum in terms of $M$, it is enough to derive

$$\frac{\partial \operatorname{Var} 2\bigl(\mu_1(M), \mu_2(M)\bigr)}{\partial M} = f(M)\Biggl[ \frac{E^2(X_-)}{P(X < M)^2} - \frac{E^2(X_+)}{P(X \ge M)^2} - 2M \Biggl( \frac{E(X_-)}{P(X < M)} - \frac{E(X_+)}{P(X \ge M)} \Biggr) \Biggr]. \qquad (27)$$

Then we find

$$M = \frac{1}{2}\Biggl[ \frac{E(X_+)}{P(X_+)} + \frac{E(X_-)}{P(X_-)} \Biggr]. \qquad (28)$$
2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\[2pt] 0, & \text{otherwise}, \end{cases} \qquad (29)$$
where $a$ and $b$ are the two parameters of the distribution. The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2}\Bigl( \frac{b^2 - M^2}{b-a} \Bigr), \qquad E(X_-) = \frac{1}{2}\Bigl( \frac{M^2 - a^2}{b-a} \Bigr). \qquad (30)$$

Putting all these in (28) and solving, we get

$$M = \frac{1}{2}(a+b), \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = \frac{1}{4}(3a+b), \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \frac{1}{4}(a+3b). \qquad (31)$$
(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \qquad x > 0. \qquad (32)$$

The probabilities to the left and right of $M$ are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\Bigl( \frac{\ln M - \mu}{\sigma\sqrt{2}} \Bigr), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\Bigl( \frac{\ln M - \mu}{\sigma\sqrt{2}} \Bigr). \qquad (33)$$
The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2}\, e^{\mu + \sigma^2/2} \Bigl[ 1 + \operatorname{erf}\Bigl( \frac{\mu + \sigma^2 - \ln M}{\sigma\sqrt{2}} \Bigr) \Bigr],$$
$$E(X_-) = \frac{1}{2}\, e^{\mu + \sigma^2/2} \Bigl[ 1 + \operatorname{erf}\Bigl( \frac{\ln M - \mu - \sigma^2}{\sigma\sqrt{2}} \Bigr) \Bigr]. \qquad (34)$$

Putting all these in (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get

$$M = 4.244942583, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = 7.293057348. \qquad (35)$$
(3) Normal Distribution. The probability density function of a normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}. \qquad (36)$$

The probabilities to the left and right of $M$ are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\Bigl( \frac{M - \mu}{\sigma\sqrt{2}} \Bigr), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\Bigl( \frac{M - \mu}{\sigma\sqrt{2}} \Bigr). \qquad (37)$$

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2}\,\mu \Bigl[ 1 - \operatorname{erf}\Bigl( \frac{M-\mu}{\sigma\sqrt{2}} \Bigr) \Bigr] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2},$$
$$E(X_-) = \frac{1}{2}\,\mu \Bigl[ 1 + \operatorname{erf}\Bigl( \frac{M-\mu}{\sigma\sqrt{2}} \Bigr) \Bigr] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2}. \qquad (38)$$

Putting all these in (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$ and solving, we get

$$M = 0, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \sqrt{\frac{2}{\pi}}. \qquad (39)$$
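The values in (39) follow from the standard normal facts $E(X \cdot 1_{X \ge 0}) = 1/\sqrt{2\pi}$ and $P(X \ge 0) = 1/2$; a one-line check (ours):

```python
import math

# For N(0, 1) and M = 0, the symmetric optimum from (39):
p_plus = 0.5                                  # P(X >= 0)
e_plus = 1.0 / math.sqrt(2.0 * math.pi)       # E(X * 1_{X>=0}) = phi(0)
mu2 = e_plus / p_plus
print(mu2, math.sqrt(2.0 / math.pi))  # both approximately 0.7978845608
```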
2.2.4. Discrete Case. Let $X$ be a discrete random variable and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$.

Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean of all $x \le x_i$, and $\mu_{x \ge x_i}$ as the mean of all $x \ge x_i$. Define $\operatorname{Var} 2(M_{i-1})$ as the variance of the two centers forced by $M_{i-1}$ as a separator:

$$\operatorname{Var} 2(M_{i-1}) = \sum_{x_j \le x_{i-1}} \bigl(x_j - \mu_{x_j \le x_{i-1}}\bigr)^2 p_j + \sum_{x_j \ge x_i} \bigl(x_j - \mu_{x_j \ge x_i}\bigr)^2 p_j, \qquad (40)$$

$$\operatorname{Var} 2(M_i) = \sum_{x_j \le x_i} \bigl(x_j - \mu_{x_j \le x_i}\bigr)^2 p_j + \sum_{x_j \ge x_{i+1}} \bigl(x_j - \mu_{x_j \ge x_{i+1}}\bigr)^2 p_j. \qquad (41)$$
The first part of $\operatorname{Var} 2(M_{i-1})$ simplifies as follows:

$$\sum_{x_j \le x_{i-1}} \bigl(x_j - \mu_{x_j \le x_{i-1}}\bigr)^2 p_j = \sum_{x_j \le x_{i-1}} x_j^2 p_j - 2\mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j p_j + \mu_{x_j \le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j$$
$$= E\bigl(X^2 \cdot 1_{x_j \le x_{i-1}}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})}. \qquad (42)$$

Similarly,

$$\sum_{x_j \ge x_i} \bigl(x_j - \mu_{x_j \ge x_i}\bigr)^2 p_j = E\bigl(X^2 \cdot 1_{x_j \ge x_i}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)},$$
$$\operatorname{Var} 2(M_{i-1}) = E\bigl(X^2 \cdot 1_{x_j \le x_{i-1}}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + E\bigl(X^2 \cdot 1_{x_j \ge x_i}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)},$$
$$\operatorname{Var} 2(M_i) = E\bigl(X^2 \cdot 1_{x_j \le x_i}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \le x_i}\bigr)}{P(X \le x_i)} + E\bigl(X^2 \cdot 1_{x_j \ge x_{i+1}}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}. \qquad (43)$$
Define $\Delta \operatorname{Var} 2 = \operatorname{Var} 2(M_i) - \operatorname{Var} 2(M_{i-1})$:

$$\Delta \operatorname{Var} 2 = \Biggl[ -\frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \Biggr] - \Biggl[ -\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} \Biggr]. \qquad (44)$$

Now, in order to simplify the calculation, we rearrange $\Delta \operatorname{Var} 2$ as follows:

$$\Delta \operatorname{Var} 2 = \Biggl[ \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} \Biggr] + \Biggl[ \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \Biggr]. \qquad (45)$$
The first part is simplified as follows:

$$\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} = \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{\bigl[E(X \cdot 1_{x_j \le x_{i-1}}) + x_i p_i\bigr]^2}{P(X \le x_{i-1}) + p_i}$$
$$= \frac{p_i E^2(X \cdot 1_{x_j \le x_{i-1}}) - 2 x_i p_i E(X \cdot 1_{x_j \le x_{i-1}})\, P(X \le x_{i-1}) - x_i^2 p_i^2\, P(X \le x_{i-1})}{P(X \le x_{i-1})\bigl[P(X \le x_{i-1}) + p_i\bigr]}. \qquad (46)$$
Similarly, simplifying the second part yields

$$\frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} = \frac{\bigl[E(X \cdot 1_{x_j \ge x_{i+1}}) + x_i p_i\bigr]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}$$
$$= \frac{-p_i E^2(X \cdot 1_{x_j \ge x_{i+1}}) + 2 x_i p_i E(X \cdot 1_{x_j \ge x_{i+1}})\, P(X \ge x_{i+1}) + x_i^2 p_i^2\, P(X \ge x_{i+1})}{P(X \ge x_{i+1})\bigl[P(X \ge x_{i+1}) + p_i\bigr]}. \qquad (47)$$
In general, these types of optimization arise when the dataset is very big.

If we consider $N$ to be very large, then $p_k \ll \max\bigl(\sum_{i \ge k+1} p_i, \sum_{i \le k-1} p_i\bigr)$, $P(X \le x_{i-1}) + p_i \approx P(X \le x_i)$, and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_i)$, so

$$\Delta \operatorname{Var} 2 \approx \Biggl[ \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})} \Biggr] - 2 p_i x_i \Biggl[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \Biggr] - p_i^2 x_i^2 \Biggl[ \frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})} \Biggr], \qquad (48)$$
and as $\lim_{N \to \infty} p_i^2 x_i^2 \bigl[ 1/P(X \le x_{i-1}) - 1/P(X \ge x_{i+1}) \bigr] = 0$, we can further simplify $\Delta \operatorname{Var} 2$ as follows:

$$\Delta \operatorname{Var} 2 \approx \Biggl[ \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})} \Biggr] - 2 p_i x_i \Biggl[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \Biggr]$$
$$= p_i \Biggl[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \Biggr] \times \Biggl[ \Biggl( \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \Biggr) - 2 x_i \Biggr]. \qquad (49)$$
(49)To find the minimum of ΔVar 2 let us locate when
ΔVar 2 = 0
\[
\Longrightarrow\; p_i\left[\frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} - \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right]\times\left[\left(\frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} + \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right) - 2x_i\right] = 0.
\tag{50}
\]
Figure 3: Discrete random variables (the total variation $\mathrm{Var}_2$ plotted against the separator $M$ over the observations $X_1, X_2, \ldots, X_n$).
But $p_i \neq 0$, and $\left[\frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} - \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right] < 0$, since $x_i$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where
\[
\mu_{i-1} = \frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})}, \qquad \mu_i = \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}.
\tag{51}
\]
Therefore,
\[
\left[\frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} + \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right] - 2x_i = 0.
\tag{52}
\]
Let us define
\[
F(x_i) = 2x_i - \left[\frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} + \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right],
\]
so that
\[
F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2}\left[\frac{E(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} + \frac{E(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right] \;\Longrightarrow\; x_i = \frac{\mu_{i-1} + \mu_i}{2},
\tag{53}
\]
where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i)\cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding a few local minima in this way, we obtain the global minimum by taking the smallest among them.
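The candidate-separator search described above can be sketched as follows, assuming uniform masses $p_j = 1/N$; the function names are illustrative, not from the paper.

```python
from itertools import accumulate

def two_cluster_variation(xs, split):
    """Var_2 for left = xs[:split], right = xs[split:], uniform weights 1/N."""
    def sse(part):
        m = sum(part) / len(part)
        return sum((v - m) ** 2 for v in part) / len(xs)
    return sse(xs[:split]) + sse(xs[split:])

def candidate_separators(xs):
    """Indices i with F(x_i) < 0 and F(x_i) * F(x_{i+1}) < 0, where
    F(x_i) = 2*x_i - (mu_{i-1} + mu_i) as in eq. (53)."""
    n = len(xs)
    pre = list(accumulate(xs))                       # prefix sums of xs
    def F(i):
        mu_left = pre[i - 1] / i                     # mean of x_j <= x_{i-1}
        mu_right = (pre[-1] - pre[i]) / (n - i - 1)  # mean of x_j >= x_{i+1}
        return 2 * xs[i] - (mu_left + mu_right)
    return [i for i in range(1, n - 2)
            if F(i) < 0 and F(i) * F(i + 1) < 0]

xs = [0, 1, 2, 10, 11, 12]          # sorted observations, two obvious groups
cands = candidate_separators(xs)    # sign change between x_2 = 2 and x_3 = 10
best = min(two_cluster_variation(xs, i + 1) for i in cands)
```

Each candidate is then scored with the two-cluster variation, and the smallest score is the global minimum.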
3 Face Localization Using Proposed Method
The superiority of the modified k-means algorithm over the standard algorithm in terms of finding
10 Mathematical Problems in Engineering
Figure 4: The original image.
Figure 5: The face with shade only.
a global minimum is now clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified k-means algorithm. The image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to x_1 and another from x_1 to 255, by applying a threshold determined by the algorithm. Let [0, x_1] be class 1, which represents the nonface part, and let [x_1, 255] be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold, which belong to both the face part and the illuminated part of the background, as shown in Figure 5.
In order to remove the illuminated background part, another threshold is applied to the remaining pixels, this time meant to separate the face part from the illuminated background part. New classes are obtained, from x_1 to x_2 and from x_2 to 255. The pixels with values less than this threshold are assigned the value 0, and the nonzero pixels left represent the face part. Let [x_1, x_2] be class 1, which represents the illuminated background part, and let [x_2, 255] be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.
Figure 6: The face without illumination.
Figure 7: The filtered image.
One can see that some face pixels were assigned the value 0 due to the effects of the illumination. Therefore, we need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
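The two-pass splitting of the pixel vector described in this section can be sketched as follows. `best_threshold` is a hypothetical helper that performs the optimal one-dimensional two-class split by exhaustive scan over the sorted values, and the pixel data are synthetic.

```python
from itertools import accumulate

def best_threshold(vals):
    """Optimal 1-D two-class split: scan every split point of the sorted
    values and keep the one minimizing the within-class sum of squares."""
    v = sorted(vals)
    s1 = [0] + list(accumulate(v))                  # prefix sums of values
    s2 = [0] + list(accumulate(x * x for x in v))   # prefix sums of squares
    n = len(v)
    best_k, best_cost = 1, float("inf")
    for k in range(1, n):
        left = s2[k] - s1[k] ** 2 / k
        right = (s2[n] - s2[k]) - (s1[n] - s1[k]) ** 2 / (n - k)
        if left + right < best_cost:
            best_k, best_cost = k, left + right
    return (v[best_k - 1] + v[best_k]) / 2          # threshold between classes

# Synthetic pixel vector: dark background (10), illuminated background (120),
# and face (230).
pixels = [10] * 100 + [120] * 30 + [230] * 30
x1 = best_threshold(pixels)                 # first pass: removes dark background
kept = [p for p in pixels if p > x1]
x2 = best_threshold(kept)                   # second pass: removes illumination
face = [p for p in kept if p > x2]
```

After the two passes, only the pixels of the brightest class remain, which plays the role of the extracted face region.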
4 Results
In this section, the experimental results of the proposed algorithm are analyzed and discussed. We start with a description of the datasets used in this work, followed by the results obtained with them. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 images from this database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset, with image size 896 × 592; again, 300 images were selected from this database based on different conditions such as indoor/outdoor.
4.1 MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)
Figure 8: The corresponding window in the original image.
Figure 9: Samples from the MIT-CBCL dataset.
at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken under different light conditions, clutter background, scale, and different poses. A few examples of these images are shown in Figure 9.
4.2 BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one out of 23 different test persons. The images were taken under different lighting conditions with a complex background and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3 Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, different facial expressions, and complex backgrounds (see Figure 11).
4.4 Proposed Method Result. The proposed method was evaluated on these datasets, where the result was 100% with a
Figure 10: Sample of faces from the BioID dataset for localization.
Figure 11: Sample of faces from the Caltech dataset for localization.
Table 1: Face localization using the modified k-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter background, as well as indoors/outdoors.

Dataset     Images    Accuracy (%)
MIT-Set 1   150       100
MIT-Set 2   150       100
BioID       300       100
Caltech     300       100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, on variation in the background, and on the cases of indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed modified k-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time, because of its low complexity, which results from the use of the derivative-based conditions.
12 Mathematical Problems in Engineering
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.
Figure 12 shows some examples of face localization on the MIT-CBCL dataset.
In Figure 13(a), an original image from the BioID dataset is shown. The localization result is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts, as shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position in an image from the Caltech dataset.
5 Conclusion
In this paper we focus on improving ROI detection by suggesting a new method for face localization. In this localization method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to more subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Mathematical Problems in Engineering 13
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Face Database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
To find the minimum of $\mathrm{Var}_2$, we need $\partial\mu_1/\partial M$ and $\partial\mu_2/\partial M$:
\[
\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_1}{\partial M} = \frac{M f(M)\, P(X^-) - f(M)\, E(X^-)}{P^2(X^-)},
\qquad
\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_2}{\partial M} = \frac{f(M)\, E(X^+) - M f(M)\, P(X^+)}{P^2(X^+)},
\tag{18}
\]
where the signs in the second derivative follow from $\partial E(X^+)/\partial M = -M f(M)$ and $\partial P(X^+)/\partial M = -f(M)$. After simplification we get
\[
\frac{\partial\mu_1}{\partial M} = \frac{f(M)}{P^2(X^-)}\left[M P(X^-) - E(X^-)\right],
\qquad
\frac{\partial\mu_2}{\partial M} = \frac{f(M)}{P^2(X^+)}\left[E(X^+) - M P(X^+)\right].
\tag{19}
\]
In order to find the minimum of $\mathrm{Var}_2$, we compute $\partial\mathrm{Var}^{(1)}/\partial M$ and $\partial\mathrm{Var}^{(2)}/\partial M$ separately, where $\mathrm{Var}^{(1)}$ and $\mathrm{Var}^{(2)}$ denote the within-class variations of the left and right parts, and add them:
\[
\frac{\partial}{\partial M}\mathrm{Var}_2 = \frac{\partial}{\partial M}\mathrm{Var}^{(1)} + \frac{\partial}{\partial M}\mathrm{Var}^{(2)},
\]
\[
\frac{\partial \mathrm{Var}^{(1)}}{\partial M} = \frac{\partial}{\partial M}\left(\int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx\right)
= M^2 f(M) + 2\mu_1\mu_1'\, P(X^-) + \mu_1^2 f(M) - 2\mu_1'\, E(X^-) - 2\mu_1 M f(M).
\tag{20}
\]
After simplification (the $\mu_1'$ terms cancel because $\mu_1 P(X^-) = E(X^-)$), we get
\[
\frac{\partial \mathrm{Var}^{(1)}}{\partial M} = f(M)\left[M^2 + \frac{E^2(X^-)}{P(X^-)^2} - 2M\,\frac{E(X^-)}{P(X^-)}\right].
\tag{21}
\]
Similarly,
\[
\frac{\partial \mathrm{Var}^{(2)}}{\partial M} = -M^2 f(M) + 2\mu_2\mu_2'\, P(X^+) - \mu_2^2 f(M) - 2\mu_2'\, E(X^+) + 2\mu_2 M f(M).
\tag{22}
\]
After simplification we get
\[
\frac{\partial \mathrm{Var}^{(2)}}{\partial M} = f(M)\left[-M^2 - \frac{E^2(X^+)}{P(X^+)^2} + 2M\,\frac{E(X^+)}{P(X^+)}\right].
\tag{23}
\]
Adding both of these differentials and simplifying, it follows that
\[
\frac{\partial}{\partial M}\mathrm{Var}_2 = f(M)\left[\left(\frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2}\right) - 2M\left(\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right)\right].
\tag{24}
\]
Equating this to 0 gives
\[
\frac{\partial}{\partial M}\mathrm{Var}_2 = 0
\;\Longrightarrow\; \left(\frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2}\right) - 2M\left(\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right) = 0
\;\Longrightarrow\; \left[\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right]\left[\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M\right] = 0.
\tag{25}
\]
But $\left[\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right] \neq 0$, as in that case $\mu_1 = \mu_2$, which contradicts the initial assumption. Therefore,
\[
\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M = 0
\;\Longrightarrow\; M = \frac{1}{2}\left[\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)}\right]
\;\Longrightarrow\; M = \frac{\mu_1 + \mu_2}{2}.
\tag{26}
\]
Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\mathrm{Var}_2$ directly.

To find the minimum in terms of $M$, it is enough to consider
\[
\frac{\partial \mathrm{Var}_2(\mu_1(M), \mu_2(M))}{\partial M} = f(M)\left[\frac{E^2(X^-)}{P(X < M)^2} - \frac{E^2(X^+)}{P(X \ge M)^2} - 2M\left(\frac{E(X^-)}{P(X < M)} - \frac{E(X^+)}{P(X \ge M)}\right)\right].
\tag{27}
\]
Then we find
\[
M = \frac{1}{2}\left[\frac{E(X^+)}{P(X^+)} + \frac{E(X^-)}{P(X^-)}\right].
\tag{28}
\]
2.2.3 Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is
\[
f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\[4pt] 0, & \text{otherwise}, \end{cases}
\tag{29}
\]
where $a$ and $b$ are the two parameters of the distribution. The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x< M}$ are
\[
E(X^+) = \frac{1}{2}\left(\frac{b^2 - M^2}{b-a}\right), \qquad E(X^-) = \frac{1}{2}\left(\frac{M^2 - a^2}{b-a}\right).
\tag{30}
\]
Putting all of these in (28) and solving, we get
\[
M = \frac{1}{2}(a+b), \qquad
\mu_1(M) = \frac{E(X^-)}{P(X^-)} = \frac{1}{4}(3a+b), \qquad
\mu_2(M) = \frac{E(X^+)}{P(X^+)} = \frac{1}{4}(a+3b).
\tag{31}
\]
(2) Log-Normal Distribution. The probability density function of a log-normal distribution is
\[
f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \qquad x > 0.
\tag{32}
\]
The probabilities to the right and left of $M$ are, respectively,
\[
P(X^+) = \frac{1}{2} - \frac{1}{2}\,\mathrm{erf}\left(\frac{\sqrt{2}\,(\ln M - \mu)}{2\sigma}\right), \qquad
P(X^-) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{\sqrt{2}\,(\ln M - \mu)}{2\sigma}\right).
\tag{33}
\]
The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x< M}$ are
\[
E(X^+) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \mathrm{erf}\left(\frac{\sqrt{2}\,(\mu + \sigma^2 - \ln M)}{2\sigma}\right)\right], \qquad
E(X^-) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \mathrm{erf}\left(\frac{\sqrt{2}\,(-\mu - \sigma^2 + \ln M)}{2\sigma}\right)\right].
\tag{34}
\]
Putting all of these in (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get
\[
M = 4.244942583, \qquad
\mu_1(M) = \frac{E(X^-)}{P(X^-)} = 1.196827816, \qquad
\mu_2(M) = \frac{E(X^+)}{P(X^+)} = 7.293057348.
\tag{35}
\]
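The numerical values in (35) can be reproduced by iterating M ← (μ₁(M) + μ₂(M))/2 with the closed forms (33)-(34) for μ = 0 and σ = 1. This is a sketch; the iteration count is an arbitrary safe choice, and convergence of this particular iteration is assumed rather than proved here.

```python
import math

Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))  # standard normal CDF

def centers(M):
    """mu1, mu2 for lognormal(0, 1) split at M, from eqs. (33)-(34)."""
    z = math.log(M)
    mean = math.exp(0.5)                       # E[X] for lognormal(0, 1)
    p_minus = Phi(z)                           # P(X <= M)
    e_minus = mean * Phi(z - 1.0)              # E(X * 1_{X <= M})
    e_plus = mean * (1.0 - Phi(z - 1.0))       # E(X * 1_{X >= M})
    return e_minus / p_minus, e_plus / (1.0 - p_minus)

M = math.exp(0.5)                              # start at the global mean
for _ in range(500):
    mu1, mu2 = centers(M)
    M = 0.5 * (mu1 + mu2)                      # fixed-point update, eq. (26)
```

The iteration settles on the separator and the two centers reported in (35).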
(3) Normal Distribution. The probability density function of a normal distribution is
\[
f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}.
\tag{36}
\]
The probabilities to the right and left of $M$ are, respectively,
\[
P(X^+) = \frac{1}{2} - \frac{1}{2}\,\mathrm{erf}\left(\frac{\sqrt{2}\,(M - \mu)}{2\sigma}\right), \qquad
P(X^-) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{\sqrt{2}\,(M - \mu)}{2\sigma}\right).
\tag{37}
\]
The expected values for $X\cdot 1_{x\ge M}$ and $X\cdot 1_{x< M}$ are
\[
E(X^+) = \frac{\mu}{2}\left[1 - \mathrm{erf}\left(\frac{\sqrt{2}\,(M-\mu)}{2\sigma}\right)\right] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2}, \qquad
E(X^-) = \frac{\mu}{2}\left[1 + \mathrm{erf}\left(\frac{\sqrt{2}\,(M-\mu)}{2\sigma}\right)\right] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2}.
\tag{38}
\]
Putting all of these in (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$, we get
\[
M = 0, \qquad
\mu_1(M) = \frac{E(X^-)}{P(X^-)} = -\sqrt{\frac{2}{\pi}}, \qquad
\mu_2(M) = \frac{E(X^+)}{P(X^+)} = \sqrt{\frac{2}{\pi}}.
\tag{39}
\]
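A quick numerical check of (39): at M = μ the two conditional means of a normal distribution sit symmetrically at μ ∓ σ√(2/π), so M = (μ₁ + μ₂)/2 holds. The truncated-mean expressions below are the standard closed forms, not copied from the paper.

```python
import math

mu, sigma = 0.0, 1.0
M = mu                                             # optimal separator, eq. (39)
# Density of X at M and P(X <= M)
f_M = math.exp(-(M - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))
P_minus = 0.5 * (1.0 + math.erf((M - mu) / (sigma * math.sqrt(2.0))))
# Standard truncated-mean identities for the normal distribution:
E_minus = mu * P_minus - sigma**2 * f_M            # E(X * 1_{X <= M})
E_plus = mu * (1.0 - P_minus) + sigma**2 * f_M     # E(X * 1_{X >= M})
mu1 = E_minus / P_minus                            # -> -sqrt(2/pi)
mu2 = E_plus / (1.0 - P_minus)                     # ->  sqrt(2/pi)
```

With μ = 0 and σ = 1 this reproduces the centers ∓√(2/π) ≈ ∓0.798 from (39).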
2.2.4 Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the probability mass of an observation $x_j$. Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x\le x_i}$ as the mean of all $x \le x_i$ and $\mu_{x\ge x_i}$ as the mean of all $x \ge x_i$. Define $\mathrm{Var}_2(M_{i-1})$ as the variation of the two centers forced by taking $M_{i-1}$ as a separator:
\[
\mathrm{Var}_2(M_{i-1}) = \sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j\le x_{i-1}})^2 p_j + \sum_{x_j \ge x_i} (x_j - \mu_{x_j\ge x_i})^2 p_j,
\tag{40}
\]
\[
\mathrm{Var}_2(M_i) = \sum_{x_j \le x_i} (x_j - \mu_{x_j\le x_i})^2 p_j + \sum_{x_j \ge x_{i+1}} (x_j - \mu_{x_j\ge x_{i+1}})^2 p_j.
\tag{41}
\]
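Both variations (40)-(41) can be evaluated for every possible separator in a single pass using cumulative sums, which is what makes the ΔVar₂ recurrence derived next cheap to use. A sketch with illustrative names, assuming the masses p_j are given:

```python
from itertools import accumulate

def var2_all_splits(xs, ps):
    """Var_2 for every separator placed after index i, using the identity
    Var_2 = E(X^2*1_L) - E^2(X*1_L)/P_L + (same for the right part)."""
    P = [0] + list(accumulate(ps))                         # cumulative mass
    S1 = [0] + list(accumulate(x * p for x, p in zip(xs, ps)))
    S2 = [0] + list(accumulate(x * x * p for x, p in zip(xs, ps)))
    n = len(xs)
    out = []
    for i in range(1, n):                  # left = xs[:i], right = xs[i:]
        left = S2[i] - S1[i] ** 2 / P[i]
        right = (S2[n] - S2[i]) - (S1[n] - S1[i]) ** 2 / (P[n] - P[i])
        out.append(left + right)
    return out

xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]     # sorted observations
ps = [1 / 6] * 6                           # uniform masses p_j
v = var2_all_splits(xs, ps)                # minimum at the split after index 2
```

Moving the separator one observation to the right only shifts one term between the two sums, which is exactly the perturbation analyzed in (44)-(47).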
The first part of $\mathrm{Var}_2(M_{i-1})$ simplifies as follows:
\[
\sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j\le x_{i-1}})^2 p_j
= \sum_{x_j \le x_{i-1}} x_j^2 p_j - 2\mu_{x_j\le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j p_j + \mu_{x_j\le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j
= E(X^2\cdot 1_{x_j\le x_{i-1}}) - \frac{E^2(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})}.
\tag{42}
\]
Similarly,
\[
\sum_{x_j \ge x_i} (x_j - \mu_{x_j\ge x_i})^2 p_j = E(X^2\cdot 1_{x_j\ge x_i}) - \frac{E^2(X\cdot 1_{x_j\ge x_i})}{P(X\ge x_i)},
\]
so
\[
\mathrm{Var}_2(M_{i-1}) = E(X^2\cdot 1_{x_j\le x_{i-1}}) - \frac{E^2(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} + E(X^2\cdot 1_{x_j\ge x_i}) - \frac{E^2(X\cdot 1_{x_j\ge x_i})}{P(X\ge x_i)},
\]
\[
\mathrm{Var}_2(M_i) = E(X^2\cdot 1_{x_j\le x_i}) - \frac{E^2(X\cdot 1_{x_j\le x_i})}{P(X\le x_i)} + E(X^2\cdot 1_{x_j\ge x_{i+1}}) - \frac{E^2(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}.
\tag{43}
\]
Define $\Delta\mathrm{Var}_2 = \mathrm{Var}_2(M_i) - \mathrm{Var}_2(M_{i-1})$. The $E(X^2\cdot 1)$ terms cancel, since each of the two differences contributes $x_i^2 p_i$ with opposite signs, leaving
\[
\Delta\mathrm{Var}_2 = \left[-\frac{E^2(X\cdot 1_{x_j\le x_i})}{P(X\le x_i)} - \frac{E^2(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right] - \left[-\frac{E^2(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} - \frac{E^2(X\cdot 1_{x_j\ge x_i})}{P(X\ge x_i)}\right].
\tag{44}
\]
Now, in order to simplify the calculation, we rearrange $\Delta\mathrm{Var}_2$ as follows:
\[
\Delta\mathrm{Var}_2 = \left[\frac{E^2(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} - \frac{E^2(X\cdot 1_{x_j\le x_i})}{P(X\le x_i)}\right] + \left[\frac{E^2(X\cdot 1_{x_j\ge x_i})}{P(X\ge x_i)} - \frac{E^2(X\cdot 1_{x_j\ge x_{i+1}})}{P(X\ge x_{i+1})}\right].
\tag{45}
\]
The first part is simplified as follows:
\[
\frac{E^2(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} - \frac{E^2(X\cdot 1_{x_j\le x_i})}{P(X\le x_i)}
= \frac{E^2(X\cdot 1_{x_j\le x_{i-1}})}{P(X\le x_{i-1})} - \frac{\left[E(X\cdot 1_{x_j\le x_{i-1}}) + x_i p_i\right]^2}{P(X\le x_{i-1}) + p_i}
\]
\[
= \frac{1}{P(X\le x_{i-1})\left[P(X\le x_{i-1}) + p_i\right]}
\left\{E^2(X\cdot 1_{x_j\le x_{i-1}})\,P(X\le x_i) - E^2(X\cdot 1_{x_j\le x_{i-1}})\,P(X\le x_{i-1}) - 2x_i p_i\, E(X\cdot 1_{x_j\le x_{i-1}})\,P(X\le x_{i-1}) - x_i^2 p_i^2\, P(X\le x_{i-1})\right\}
\]
\[
= \frac{1}{P(X\le x_{i-1})\left[P(X\le x_{i-1}) + p_i\right]}
\left\{p_i\, E^2(X\cdot 1_{x_j\le x_{i-1}}) - 2x_i p_i\, E(X\cdot 1_{x_j\le x_{i-1}})\,P(X\le x_{i-1}) - x_i^2 p_i^2\, P(X\le x_{i-1})\right\}.
\tag{46}
\]
Similarly, simplifying the second part yields

$$
\frac{E^2\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}-\frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}
=\frac{\left[E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+x_i p_i\right]^2}{P\left(X\ge x_{i+1}\right)+p_i}-\frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}
$$
$$
=\frac{\left[E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+x_i p_i\right]^2 P\left(X\ge x_{i+1}\right)-E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)\left[P\left(X\ge x_{i+1}\right)+p_i\right]}{P\left(X\ge x_{i+1}\right)\left[P\left(X\ge x_{i+1}\right)+p_i\right]}
$$
Mathematical Problems in Engineering 9
$$
=\frac{-p_i E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+2x_i p_i E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)P\left(X\ge x_{i+1}\right)+x_i^2 p_i^2 P\left(X\ge x_{i+1}\right)}{P\left(X\ge x_{i+1}\right)\left[P\left(X\ge x_{i+1}\right)+p_i\right]}.
\tag{47}
$$

In general, these types of optimization problems arise when the dataset is very big. If we consider $N$ to be very large, then $p_k \ll \max\left(\sum_{i\ge k+1} p_i,\ \sum_{i\le k-1} p_i\right)$, so that $P(X\le x_{i-1})+p_i \approx P(X\le x_{i-1})$ and $P(X\ge x_{i+1})+p_i \approx P(X\ge x_{i+1})$. Hence
$$
\Delta\mathrm{Var}_2 \approx \left[\frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^2\left(X\le x_{i-1}\right)}-\frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^2\left(X\ge x_{i+1}\right)}\right]
-2p_i x_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
-p_i^2 x_i^2\left[\frac{1}{P\left(X\le x_{i-1}\right)}-\frac{1}{P\left(X\ge x_{i+1}\right)}\right]
\tag{48}
$$
and as $\lim_{N\to\infty} p_i^2 x_i^2\left[\frac{1}{P(X\le x_{i-1})}-\frac{1}{P(X\ge x_{i+1})}\right]=0$, we can further simplify $\Delta\mathrm{Var}_2$ as follows:

$$
\Delta\mathrm{Var}_2 \approx \left[\frac{E^2\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^2\left(X\le x_{i-1}\right)}-\frac{E^2\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^2\left(X\ge x_{i+1}\right)}\right]
-2p_i x_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
$$
$$
=p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\times\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right)-2x_i\right].
\tag{49}
$$
To find the minimum of $\Delta\mathrm{Var}_2$, let us locate where $\Delta\mathrm{Var}_2 = 0$:

$$
\Longrightarrow p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\times\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right)-2x_i\right]=0.
\tag{50}
$$
Figure 3: Discrete random variables.
But $p_i \neq 0$, and the first factor is strictly negative: since $x_i$ is an increasing sequence, $\mu_{i-1} < \mu_i$, where

$$
\mu_{i-1}=\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)},\qquad
\mu_i=\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}.
\tag{51}
$$
Therefore,

$$
\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]-2x_i=0.
\tag{52}
$$
Let us define

$$
F\left(x_i\right)=2x_i-\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right].
$$

Then

$$
F\left(x_i\right)=0
\Longrightarrow x_i=\frac{1}{2}\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\Longrightarrow x_i=\frac{\mu_{i-1}+\mu_i}{2},
\tag{53}
$$
where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i)\cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding a few local minima, we obtain the global minimum by taking the smallest among them.
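The search just described can be sketched in a few lines. The function below (name and brute-force strategy are ours, for illustration) evaluates the two-cluster variation at every split of a weighted 1-D sample; the paper instead brackets candidate minima where $F$ changes sign, but the objective being minimized is the same:

```python
import numpy as np

def best_separator(x, p):
    """Brute-force version of the separator search described above:
    evaluate the two-cluster variation for every split of the sorted
    support and return the split index with the smallest (global) value.
    The paper brackets candidates via sign changes of
    F(x_i) = 2*x_i - (mu_{i-1} + mu_i) rather than scanning everything."""
    order = np.argsort(x)
    x = np.asarray(x, dtype=float)[order]
    p = np.asarray(p, dtype=float)[order]
    best_i, best_var = None, np.inf
    for i in range(1, len(x)):  # left cluster x[:i], right cluster x[i:]
        mu_l = np.average(x[:i], weights=p[:i])
        mu_r = np.average(x[i:], weights=p[i:])
        var = (p[:i] * (x[:i] - mu_l) ** 2).sum() \
            + (p[i:] * (x[i:] - mu_r) ** 2).sum()
        if var < best_var:
            best_i, best_var = i, var
    return best_i, best_var
```

For a sample with two well-separated groups such as `[0, 1, 2, 10, 11, 12]` with equal weights, the returned split falls between the two groups.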
3 Face Localization Using the Proposed Method

Now the superiority of the modified K-means algorithm over the standard algorithm in terms of finding
Figure 4: The original image.
Figure 5: The face with shade only.
a global minimum becomes clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a certain threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.
In order to remove the illuminated background part, another threshold is applied to the pixels, meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. Then the pixels with values less than the threshold are assigned a value of 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.
Figure 6: The face without illumination.
Figure 7: The filtered image.
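A minimal sketch of the two-pass scheme above. A plain 1-D two-means split stands in for the paper's optimized separator search, and the function name and toy intensity values are ours, chosen only for illustration:

```python
import numpy as np

def localize_face(img):
    """Two-pass scheme described above: threshold x1 removes the dark
    nonface background, threshold x2 removes the illuminated background,
    and the surviving (nonzero) pixels represent the face part. A plain
    1-D two-means split stands in for the optimized separator search."""
    def split(vals):
        # 1-D two-means: iterate threshold = midpoint of the two cluster
        # means until it stops moving
        t = vals.mean()
        for _ in range(100):
            lo, hi = vals[vals <= t], vals[vals > t]
            if lo.size == 0 or hi.size == 0:
                break
            new_t = (lo.mean() + hi.mean()) / 2
            if abs(new_t - t) < 1e-6:
                break
            t = new_t
        return t

    v = img.astype(float).ravel()   # reshape the image into a pixel vector
    x1 = split(v)                   # pass 1: drop class [0, x1] (nonface)
    x2 = split(v[v > x1])           # pass 2: drop (x1, x2] (illuminated bg)
    return img > x2                 # boolean mask of the face pixels
```

On a toy image with three intensity groups (dark background, bright illuminated background, brighter face), only the brightest group survives both passes.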
One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
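The paper does not specify which filter is used; a 3×3 majority vote on the binary face mask is one simple stand-in that removes isolated noise pixels:

```python
import numpy as np

def clean_mask(mask):
    """3x3 majority filter: a pixel survives only if at least 5 of the 9
    cells in its 3x3 neighborhood are set, which removes isolated noise
    pixels from a binary face mask."""
    h, w = mask.shape
    m = np.pad(mask.astype(int), 1)
    # neighborhood sums via nine shifted views of the padded mask
    s = sum(m[i:i + h, j:j + w] for i in range(3) for j in range(3))
    return s >= 5
```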
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
4 Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed. It starts with a description of the datasets used in this work, followed by the localization results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset with image size 384 × 286; 300 images from the database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset with image size 896 × 592; 300 images from the database were selected based on different conditions such as indoor/outdoor.
4.1 MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)
Figure 8: The corresponding window in the original image.
Figure 9: Samples from the MIT-CBCL dataset.
at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person were taken under different light conditions, with clutter background, different scales, and different poses. A few examples of these images are shown in Figure 9.
4.2 BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one out of 23 different test persons. The images were taken under different lighting conditions with complex backgrounds and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3 Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4 Proposed Method Result. The proposed method was evaluated on these datasets, where the result was 100% with a
Figure 10: Sample of faces from the BioID dataset for localization.
Figure 11: Sample of faces from the Caltech dataset for localization.
Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter backgrounds, as well as indoor/outdoor scenes.
Dataset      Images   Accuracy (%)
MIT-Set 1    150      100
MIT-Set 2    150      100
BioID        300      100
Caltech      300      100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on images of faces taken from different angles, from 30 degrees left to 30 degrees right, with variation in the background, as well as indoor and outdoor cases. Two sets of images were selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter considered in this test is the localization time: the proposed method can locate the face in a very short time because it has lower complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified K-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified K-means algorithm on the BioID dataset: (a) the original image, (b) the face position, and (c) the filtered image.
Figure 12 shows some examples of face localization on MIT-CBCL.
In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position on an image from theCaltech dataset
5 Conclusion

In this paper, we focus on improving ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to more subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified K-means algorithm on the Caltech dataset: (a) the original image, (b) the face position, and (c) the filtered image.
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Support Downloads: BioID Face Database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
The first part is simplified as follows
1198642(119883 sdot 1
119909119895le119909119894minus1)
119875 (119883 le 119909119894minus1)
minus
1198642(119883 sdot 1
119909119895le119909119894)
119875 (119883 le 119909119894)
=
1198642(119883 sdot 1
119909119895le119909119894minus1)
119875 (119883 le 119909119894minus1)
minus
[119864 (119883 sdot 1119909119895le119909119894minus1
) + 119909119894119901119894]
2
119875 (119883 le 119909119894minus1) + 119901119894
=
1
119875 (119883 le 119909119894minus1) [119875 (119883 le 119909
119894minus1) + 119901119894]
times 1198642(119883 sdot 1
119909119895le119909119894minus1)119875 (119883 le 119909
119894)
+ 1199011198941198642(119883 sdot 1
119909119895le119909119894minus1)
minus 1198642(119883 sdot 1
119909119895le119909119894minus1)119875 (119883 le 119909
119894minus1)
minus 2119909119894119901119894119864(119883 sdot 1
119909119895le119909119894minus1)119875 (119883 le 119909
119894minus1)
minus1199092
1198941199012
119894119875 (119883 le 119909
119894minus1)
=
1
119875 (119883 le 119909119894minus1) [119875 (119883 le 119909
119894minus1) + 119901119894]
times 1199011198941198642(119883 sdot 1
119909119895le119909119894minus1)
minus 2119909119894119901119894119864(119883 sdot 1
119909119895le119909119894minus1)119875 (119883 le 119909
119894minus1)
minus1199092
1198941199012
119894119875 (119883 le 119909
119894minus1)
(46)
Similarly simplifying the second part yields
1198642(119883 sdot 1
119909119895ge119909119894)
119875 (119883 ge 119909119894)
minus
1198642(119883 sdot 1
119909119895ge119909119894+1)
119875 (119883 ge 119909119894+1)
=
[119864 (119883 sdot 1119909119895ge119909119894+1
) + 119909119894119901119894]
2
119875 (119883 ge 119909119894+1) + 119901119894
minus
1198642(119883 sdot 1
119909119895ge119909119894+1)
119875 (119883 ge 119909119894+1)
=
1
119875 (119883 ge 119909119894+1) [119875 (119883 ge 119909
119894+1) + 119901119894]
times 1198642(119883 sdot 1
119909119895ge119909119894+1)119875 (119883 ge 119909
119894+1)
minus 1199011198941198642(119883 sdot 1
119909119895ge119909119894+1)
minus 1198642(119883 sdot 1
119909119895ge119909119894+1)119875 (119883 ge 119909
119894+1)
+ 2119909119894119901119894119864(119883 sdot 1
119909119895ge119909119894+1)119875 (119883 ge 119909
119894+1)
+1199092
1198941199012
119894119875 (119883 ge 119909
119894+1)
Mathematical Problems in Engineering 9
=
1
119875 (119883 ge 119909119894+1) [119875 (119883 ge 119909
119894+1) + 119901119894]
times minus119901119894+11198642(119883 sdot 1
119909119895ge119909119894+1)
+ 2119909119894119901119894119864(119883 sdot 1
119909119895ge119909119894+1)119875 (119883 ge 119909
119894+1)
+1199092
1198941199012
119894+1119875 (119883 ge 119909
119894+1)
(47)In general these types of optimization arise when the
dataset is very bigIf we consider119873 to be very large then 119901
119896≪ max(sum
119894ge119896+1
119875119894 sum119894le119896minus1
119875119894) 119875(119883 le 119909
119894minus1) + 119901119894asymp 119875(119883 le 119909
119894) and 119875(119883 ge
119909119894+1) + 119901119894asymp 119875(119883 ge 119909
119894+1)
ΔVar 2 asymp [
[
1198642(119883 sdot 1
119909119895le119909119894minus1) 119901119894
1198752(119883 le 119909
119894minus1)
minus
1198642(119883 sdot 1
119909119895ge119909119894+1) 119901119894
1198752(119883 ge 119909
119894+1)
]
]
minus 2119901119894119909119894[
[
119864(119883 sdot 1119909119895le119909119894minus1
)
119875 (119883 le 119909119894minus1)
minus
119864 (119883 sdot 1119909119895ge119909119894+1
)
119875 (119883 ge 119909119894+1)
]
]
minus 1199012
1198941199092
119894[
1
119875 (119883 le 119909119894minus1)
minus
1
119875 (119883 ge 119909119894+1)
]
(48)
and as lim119873rarrinfin
1199012
1198941199092
119894[(1119875(119883 le 119909
119894minus1))minus(1119875(119883 ge 119909
119894+1))] =
0 we can further simplify ΔVar 2 as followsΔVar 2
ΔVar_2 ≈ [ E²(X·1_{x_j ≤ x_{i−1}}) p_i / P²(X ≤ x_{i−1}) − E²(X·1_{x_j ≥ x_{i+1}}) p_i / P²(X ≥ x_{i+1}) ]
− 2 p_i x_i [ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) − E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ]
= p_i [ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) − E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ]
× [ ( E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) + E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ) − 2 x_i ].    (49)

To find the minimum of ΔVar_2, let us locate when ΔVar_2 = 0:
⟹ p_i [ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) − E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ]
× [ ( E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) + E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ) − 2 x_i ] = 0.    (50)
Figure 3: Discrete random variables
But p_i ≠ 0, and [ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) − E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ] < 0, since (x_i) is an increasing sequence, which implies that μ_{i−1} < μ_i, where

μ_{i−1} = E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}),
μ_i = E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}).    (51)
Therefore,

[ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) + E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ] − 2 x_i = 0.    (52)
Let us define

F(x_i) = 2 x_i − [ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) + E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ].

Then F(x_i) = 0
⟹ x_i = (1/2) [ E(X·1_{x_j ≤ x_{i−1}}) / P(X ≤ x_{i−1}) + E(X·1_{x_j ≥ x_{i+1}}) / P(X ≥ x_{i+1}) ]
⟹ x_i = (μ_{i−1} + μ_i) / 2,    (53)
where μ_{i−1} and μ_i are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator M between x_i and x_{i+1} such that F(x_i) · F(x_{i+1}) < 0 and F(x_i) < 0 (see Figure 3).

After finding a few local minima, we can get the global minimum by taking the smallest among them.
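The search just described can be sketched in a few lines for a one-dimensional discrete distribution. This is our illustration, not the authors' code: candidate separators are the indices where F changes sign, and the global minimum of the two-class total variation is the best among those candidates (the names `F`, `total_variation`, and `best_separator` are assumptions).

```python
import numpy as np

def F(xs, ps, i):
    """F(x_i) = 2*x_i - (mu_{i-1} + mu_i), cf. (51)-(53); xs must be sorted."""
    mu_left = (xs[:i] * ps[:i]).sum() / ps[:i].sum()                 # mean over {x_j <= x_{i-1}}
    mu_right = (xs[i + 1:] * ps[i + 1:]).sum() / ps[i + 1:].sum()    # mean over {x_j >= x_{i+1}}
    return 2 * xs[i] - (mu_left + mu_right)

def total_variation(xs, ps, i):
    """Sum of within-class variances for the split {x_j <= x_i} / {x_j > x_i}."""
    out = 0.0
    for v, p in ((xs[:i + 1], ps[:i + 1]), (xs[i + 1:], ps[i + 1:])):
        mu = (v * p).sum() / p.sum()
        out += ((v - mu) ** 2 * p).sum()
    return out

def best_separator(xs, ps):
    """Sign changes of F give the local candidates; keep the global best."""
    cand = [i for i in range(1, len(xs) - 2) if F(xs, ps, i) * F(xs, ps, i + 1) < 0]
    cand = cand or list(range(len(xs) - 1))
    return min(cand, key=lambda i: total_variation(xs, ps, i))
```

On a bimodal distribution the sign change of F brackets the valley between the two modes, matching the midpoint condition x_i = (μ_{i−1} + μ_i)/2 of equation (53).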
3. Face Localization Using the Proposed Method
Now the superiority of the modified K-means algorithm over the standard algorithm, in terms of finding a global minimum, becomes clear, and we can explain our solution based on the modified algorithm.

Figure 4: The original image

Figure 5: The face with shade only

Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes, one from 0 to x_1 and another from x_1 to 255, by applying a certain threshold determined by the algorithm. Let [0, x_1] be class 1, which represents the nonface part, and let [x_1, 255] be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.
In order to remove the illuminated background part, another threshold is applied to the pixels by the algorithm, this time meant to separate the face part from the illuminated background part. New classes are obtained, from x_1 to x_2 and from x_2 to 255; the pixels with values less than the threshold are then assigned the value 0, and the nonzero pixels left represent the face part. Let [x_1, x_2] be class 1, which represents the illuminated background part, and let [x_2, 255] be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.
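The two thresholding passes can be sketched as follows. Here `find_threshold` is a stand-in for the optimized K-means split: it iterates the midpoint-of-means condition of equation (53) to a fixed point, so the function names and the convergence tolerance are our assumptions, not the paper's implementation.

```python
import numpy as np

def find_threshold(pixels):
    """Two-class 1-D split: iterate t <- (mean(lo) + mean(hi)) / 2 to a fixed point."""
    t = pixels.mean()
    for _ in range(100):
        lo, hi = pixels[pixels <= t], pixels[pixels > t]
        if len(lo) == 0 or len(hi) == 0:
            break
        new_t = (lo.mean() + hi.mean()) / 2.0
        if abs(new_t - t) < 0.5:
            break
        t = new_t
    return t

def localize(img):
    v = img.reshape(-1).astype(float)       # reshape the input image into a pixel vector
    x1 = find_threshold(v)                  # pass 1: [0, x1] -> nonface, set to 0
    kept = np.where(img > x1, img, 0)
    v2 = kept[kept > 0].astype(float)
    x2 = find_threshold(v2)                 # pass 2: [x1, x2] -> illuminated background
    return np.where(kept > x2, kept, 0)     # remaining nonzero pixels approximate the face
```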
Figure 6: The face without illumination

Figure 7: The filtered image
One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
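Returning to the original image amounts to cropping the bounding box spanned by the remaining nonzero (face) pixels. The sketch below is our illustration; the paper does not specify the crop or the filter, so the simple drop-isolated-pixels cleanup here is an assumption standing in for the filtering step.

```python
import numpy as np

def remove_isolated(mask):
    """Crude cleanup: drop nonzero pixels that have no nonzero 4-neighbor."""
    m = mask > 0
    padded = np.pad(m, 1)
    neigh = (padded[:-2, 1:-1] | padded[2:, 1:-1] |   # up, down neighbors
             padded[1:-1, :-2] | padded[1:-1, 2:])    # left, right neighbors
    return np.where(m & neigh, mask, 0)

def face_window(original, face_mask):
    """Crop the window of `original` spanned by the nonzero entries of `face_mask`."""
    ys, xs = np.nonzero(face_mask)
    return original[ys.min(): ys.max() + 1, xs.min(): xs.max() + 1]
```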
4. Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed. We first present a description of the datasets used in this work, followed by the results of the previous part of the work. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset with image size 384 × 286; 300 images from this database were selected based on different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset with image size 896 × 592; 300 images from this database were selected based on different conditions such as indoor/outdoor settings.
4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)

Figure 8: The corresponding window in the original image

Figure 9: Samples from the MIT-CBCL dataset

at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken under different light conditions, with clutter background, at different scales, and in different poses. A few examples of these images are shown in Figure 9.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of a face of one out of 23 different test persons. The images were taken under different lighting conditions with a complex background and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Results.

Figure 10: Sample of faces from the BioID dataset for localization

Figure 11: Sample of faces from the Caltech dataset for localization

Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter backgrounds, as well as indoor/outdoor settings.

Dataset      Images    Accuracy (%)
MIT-Set 1    150       100
MIT-Set 2    150       100
BioID        300       100
Caltech      300       100

The proposed method was evaluated on these datasets, where the result was 100% with a localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, and on variation in the background, as well as on indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely, the localization time. From the results, the proposed method can locate the face in a very short time because it has low complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified K-means algorithm on the MIT-CBCL dataset

Figure 13: Example of face localization using the modified K-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image

Figure 12 shows some examples of face localization on MIT-CBCL.

In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is needed to remove the unwanted parts, as shown in Figure 13(c); the area of the ROI is then determined.

Figure 14 shows the face position in an image from the Caltech dataset.
5. Conclusion

In this paper, we focus on developing ROI detection by suggesting a new method for face localization. In this localization method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to more subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified K-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Face Database," BioID Support Downloads, 2011.
[43] Caltech face database, collected by M. Weber, http://www.vision.caltech.edu/html-files/archive.html.
44 Proposed Method Result The proposed method is eval-uated on these datasets where the result was 100 with a
Figure 10 Sample of faces from BioID dataset for localization
Figure 11 Sample of faces from Caltech dataset for localization
Table 1 Face localization using K-mean modified algorithm onMIT-CBCL BioID and Caltech the sets contain faces with differentposes and faces with clutter background as well as indoorsoutdoors
Dataset Image Accuracy ()MIT-Set 1 150 100MIT-Set 2 150 100BioID 300 100Caltech 300 100
localization time of 4 s Table 1 shows the proposed methodresults with the MIT-CBCL Caltech and BioID databasesIn this test we have focused on the use of images of facesfrom different angles from 30 degrees left to 30 degreesright and the variation in the background as well as thecases of indoor and outdoor images Two sets of images areselected from theMIT-CBCL dataset and two other sets fromthe other databases to evaluate the proposed method Thefirst set from MIT-CBCL contains the images with differentposes and the second set contains images with differentbackgrounds Each set contains 150 images The BioID andCaltech sets contain imageswith indoor and outdoor settingsThe results showed the efficiency of the proposed K-meanmodified algorithm to locate the faces from the image withhigh background and poses changes In addition anotherparameter was considered in this test which is the localizationtime From the results the proposed method can locate theface in a very short time because it has less complexity due tothe use of differential equations
12 Mathematical Problems in Engineering
Figure 12 Examples of face localization using k-mean modified algorithm on MIT-CBCL dataset
(a) (b) (c)
Figure 13 Example of face localization using k-mean modified algorithm on BioID dataset (a) the original image (b) face position and (c)filtered image
Figure 12 shows some examples of face localization onMIT-CBCL
In Figure 13(a) an original image from BioID dataset isshownThe localization position is shown in Figure 13(b) Butthere is need to do a filtering operation in order to remove theunwanted parts and this is shown in Figure 13(c) and then thearea of ROI will be determined
Figure 14 shows the face position on an image from theCaltech dataset
5 Conclusion
In this paper we focus on developing the RIO detectionby suggesting a new method for face localization In thismethod of localization the input image is reshaped intoa vector and then the optimized k-means algorithm isapplied twice to cluster the image pixels into classes At the
end the block window corresponding to the class of pixelscontaining the face only is extracted from the input imageThree sets of images taken from MIT BioID and Caltechdatasets and differing in illumination and backgroundsas well as in-dooroutdoor settings were used to evaluatethe proposed method The results show that the proposedmethod achieved significant localization accuracyOur futureresearch direction includes finding the optimal centers forhigher dimension data whichwe believe is just generalizationof our approach to higher dimension (8) and to highersubclusters
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Mathematical Problems in Engineering 13
(a) (b) (c)
Figure 14 Example of face localization using k-mean modified algorithm on Caltech dataset (a) the original image (b) face position and(c) filtered image
Acknowledgments
This study is a part of the Internal Grant Project no419011811122 The authors gratefully acknowledge the con-tinued support from Alfaisal University and its Office ofResearch
References
[1] R Chellappa C L Wilson and S Sirohey ldquoHuman andmachine recognition of faces a surveyrdquo Proceedings of the IEEEvol 83 no 5 pp 705ndash740 1995
[2] M-H Yang D J Kriegman and N Ahuja ldquoDetecting faces inimages a surveyrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 24 no 1 pp 34ndash58 2002
[3] K S Fu and J K Mui ldquoA survey on image segmentationrdquoPattern Recognition vol 13 no 1 pp 3ndash16 1981
[4] N R Pal and S K Pal ldquoA review on image segmentationtechniquesrdquo Pattern Recognition vol 26 no 9 pp 1277ndash12941993
[5] S-YWan andW EHiggins ldquoSymmetric region growingrdquo IEEETransactions on Image Processing vol 12 no 9 pp 1007ndash10152003
[6] Y-L Chang and X Li ldquoFast image region growingrdquo Image andVision Computing vol 13 no 7 pp 559ndash571 1995
[7] T N Pappas and N S Jayant ldquoAn adaptive clustering algorithmfor image segmentationrdquo in Proceedings of the InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo89) vol 3 pp 1667ndash1670 May 1989
[8] T N Pappas ldquoAn adaptive clustering algorithm for imagesegmentationrdquo IEEE Transactions on Signal Processing vol 40no 4 pp 901ndash914 1992
[9] M Y Mashor ldquoHybrid training algorithm for RBF networkrdquoInternational Journal of the Computer the Internet and Manage-ment vol 8 no 2 pp 50ndash65 2000
[10] N A M Isa S A Salamah and U K Ngah ldquoAdaptive fuzzymovingK-means clustering algorithm for image segmentationrdquo
IEEE Transactions on Consumer Electronics vol 55 no 4 pp2145ndash2153 2009
[11] S N Sulaiman and N AM Isa ldquoAdaptive fuzzy-K-means clus-tering algorithm for image segmentationrdquo IEEE Transactions onConsumer Electronics vol 56 no 4 pp 2661ndash2668 2010
[12] W Cai S Chen and D Zhang ldquoFast and robust fuzzy c-meansclustering algorithms incorporating local information for imagesegmentationrdquo Pattern Recognition vol 40 no 3 pp 825ndash8382007
[13] P Tan M Steinbach and V Kumar Introduction to DataMining Pearson Addison-Wesley 2006
[14] H Xiong J Wu and J Chen ldquoK-means clustering versusvalidation measures a data-distribution perspectiverdquo IEEETransactions on Systems Man and Cybernetics B Cyberneticsvol 39 no 2 pp 318ndash331 2009
[15] I Borg and P J F GroenenModernMultidimensional ScalingmdashTheory and Applications Springer Science + Business Media2005
[16] S Guha R Rastogi and K Shim ldquoCURE an efficient clusteringalgorithm for large databasesrdquo ACM SIGMOD Record vol 27no 2 pp 73ndash84 1998
[17] EM Knorr R T Ng and V Tucakov ldquoDistance-based outliersalgorithms and applicationsrdquo VLDB Journal vol 8 no 3-4 pp237ndash253 2000
[18] P S Bradley U Fayyad and C Reina ldquoScaling clusteringalgorithms to large databasesrdquo in Proceedings of the 4th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo98) pp 9ndash15 1998
[19] D Arthur and S Vassilvitskii ldquok-means++ the advantages ofcareful seedingrdquo in Proceedings of the 18th Annual ACM-SIAMSymposium on Discrete Algorithms pp 1027ndash1035 2007
[20] T Kanungo D M Mount N S Netanyahu C D PiatkoR Silverman and A Y Wu ldquoA local search approximationalgorithm for k-means clusteringrdquo Computational GeometryTheory and Applications vol 28 no 2-3 pp 89ndash112 2004
14 Mathematical Problems in Engineering
[21] M C Chiang C W Tsai and C S Yang ldquoA time-efficient pat-tern reduction algorithm for k-means clusteringrdquo InformationSciences vol 181 no 4 pp 716ndash731 2011
[22] P S Bradley U M Fayyad and C Reina ldquoScaling clusteringalgorithms to large databasesrdquo in Proceedings of the ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo98) pp 9ndash15 1998
[23] F Farnstrom J Lewis and C Elkan ldquoScalability for clusteringalgorithms revisitedrdquo SIGKDD Explorations vol 2 no 1 pp 51ndash57 2000
[24] C Ordonez and E Omiecinski ldquoEfficient disk-based K-means clustering for relational databasesrdquo IEEE Transactions onKnowledge and Data Engineering vol 16 no 8 pp 909ndash9212004
[25] Y Li and S M Chung ldquoParallel bisecting k-means withprediction clustering algorithmrdquo Journal of Supercomputingvol 39 no 1 pp 19ndash37 2007
[26] C Elkan ldquoUsing the triangle inequality to accelerate k-meansrdquoin Proceedings of the 20th International Conference on MachineLearning pp 147ndash153 August 2003
[27] D Reddy and P K Jana ldquoInitialization for K-means clusteringusing voronoi diagramrdquo Procedia Technology vol 4 pp 395ndash400 2012
[28] J B MacQueen ldquoSome methods for classification and analysisof multivariate observationsrdquo in Proceedings of 5th BerkeleySymposium on Mathematical Statistics and Probability pp 281ndash297 University of California Press 1967
[29] R Ebrahimpour R Rasoolinezhad Z Hajiabolhasani andM Ebrahimi ldquoVanishing point detection in corridors usingHough transform and K-means clusteringrdquo IET ComputerVision vol 6 no 1 pp 40ndash51 2012
[30] P S Bishnu and V Bhattacherjee ldquoSoftware fault predictionusing quad tree-based K-means clustering algorithmrdquo IEEETransactions on Knowledge and Data Engineering vol 24 no6 pp 1146ndash1150 2012
[31] T Elomaa and H Koivistoinen ldquoOn autonomous K-meansclusteringrdquo in Proceedings of the International Symposium onMethodologies for Intelligent Systems pp 228ndash236 May 2005
[32] J M Pena J A Lozano and P Larranaga ldquoAn empiricalcomparison of four initialization methods for the K-Meansalgorithmrdquo Pattern Recognition Letters vol 20 no 10 pp 1027ndash1040 1999
[33] T Kanungo D M Mount N S Netanyahu C D PiatkoR Silverman and A Y Wu ldquoAn efficient k-means clusteringalgorithms analysis and implementationrdquo IEEETransactions onPattern Analysis andMachine Intelligence vol 24 no 7 pp 881ndash892 2002
[34] A Likas N Vlassis and J J Verbeek ldquoThe global k-meansclustering algorithmrdquo Pattern Recognition vol 36 no 2 pp451ndash461 2003
[35] A M Bagirov ldquoModified global k-means algorithm for mini-mumsum-of-squares clustering problemsrdquoPatternRecognitionvol 41 no 10 pp 3192ndash3199 2008
[36] J Xie and S Jiang ldquoA simple and fast algorithm for global K-means clusteringrdquo in Proceedings of the 2nd InternationalWork-shop on Education Technology and Computer Science (ETCS 10)vol 2 pp 36ndash40 Wuhan China March 2010
[37] G F Tzortzis and A C Likas ldquoThe global kernel 120581-meansalgorithm for clustering in feature spacerdquo IEEE Transactions onNeural Networks vol 20 no 7 pp 1181ndash1194 2009
[38] W-L Hung Y-C Chang and E Stanley Lee ldquoWeight selectionin W-K-means algorithm with an application in color imagesegmentationrdquo Computers and Mathematics with Applicationsvol 62 no 2 pp 668ndash676 2011
[39] B Maliatski and O Yadid-Pecht ldquoHardware-driven adaptive 120581-means clustering for real-time video imagingrdquo IEEE Transac-tions on Circuits and Systems for Video Technology vol 15 no 1pp 164ndash166 2005
[40] T Saegusa and T Maruyama ldquoAn FPGA implementation ofreal-time K-means clustering for color imagesrdquo Journal of Real-Time Image Processing vol 2 no 4 pp 309ndash318 2007
[41] ldquoCBCL face recognition Databaserdquo Vol 2010httpcbclmitedusoftware-datasetsheiselefacerecognition-databasehtml
[42] BioID BioID Support Downloads BioID Face Database 2011[43] httpwwwvisioncaltecheduhtml-filesarchivehtml
Mathematical Problems in Engineering 9
$$= \frac{1}{P\left(X \ge x_{i+1}\right)\left[P\left(X \ge x_{i+1}\right) + p_i\right]} \times \left\{-p_i\,E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) + 2 x_i p_i\,E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) P\!\left(X \ge x_{i+1}\right) + x_i^2 p_i^2\,P\!\left(X \ge x_{i+1}\right)\right\}. \tag{47}$$

In general, these types of optimization arise when the dataset is very big.

If we consider $N$ to be very large, then $p_k \ll \max\left(\sum_{i \ge k+1} p_i,\ \sum_{i \le k-1} p_i\right)$, $P\left(X \le x_{i-1}\right) + p_i \approx P\left(X \le x_i\right)$, and $P\left(X \ge x_{i+1}\right) + p_i \approx P\left(X \ge x_{i+1}\right)$:
$$\Delta\mathrm{Var}_2 \approx \left[\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) p_i}{P^2\left(X \le x_{i-1}\right)} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) p_i}{P^2\left(X \ge x_{i+1}\right)}\right] - 2 p_i x_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] - p_i^2 x_i^2 \left[\frac{1}{P\left(X \le x_{i-1}\right)} - \frac{1}{P\left(X \ge x_{i+1}\right)}\right], \tag{48}$$
and as

$$\lim_{N \to \infty} p_i^2 x_i^2 \left[\frac{1}{P\left(X \le x_{i-1}\right)} - \frac{1}{P\left(X \ge x_{i+1}\right)}\right] = 0,$$

we can further simplify $\Delta\mathrm{Var}_2$ as follows:

$$\Delta\mathrm{Var}_2 \approx \left[\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) p_i}{P^2\left(X \le x_{i-1}\right)} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) p_i}{P^2\left(X \ge x_{i+1}\right)}\right] - 2 p_i x_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] = p_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] \times \left[\left(\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right) - 2 x_i\right]. \tag{49}$$

To find the minimum of $\Delta\mathrm{Var}_2$, let us locate where $\Delta\mathrm{Var}_2 = 0$:
$$\Longrightarrow p_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] \times \left[\left(\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right) - 2 x_i\right] = 0. \tag{50}$$
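In the shorthand of the class means (the $\mu$'s defined in (51)), the step from (48) to (50) is an elementary factorization worth making explicit:

$$\Delta\mathrm{Var}_2 \approx p_i\left(\mu_{i-1}^2 - \mu_i^2\right) - 2 p_i x_i\left(\mu_{i-1} - \mu_i\right) = p_i\left(\mu_{i-1} - \mu_i\right)\left[\left(\mu_{i-1} + \mu_i\right) - 2 x_i\right],$$

so $\Delta\mathrm{Var}_2 = 0$ forces one of the two factors to vanish.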
Figure 3: Discrete random variables (the total variation $\mathrm{Var}\,D$ over the points $X_1, X_2, \dots, X_n$, with separator $M$).
But $p_i \neq 0$, and

$$\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] < 0,$$

since $x_i$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where

$$\mu_{i-1} = \frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)}, \qquad \mu_i = \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}. \tag{51}$$
Therefore,

$$\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] - 2 x_i = 0. \tag{52}$$
Let us define

$$F(x_i) = 2 x_i - \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right].$$

Then

$$F(x_i) = 0 \Longrightarrow x_i = \frac{1}{2}\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P\left(X \le x_{i-1}\right)} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P\left(X \ge x_{i+1}\right)}\right] \Longrightarrow x_i = \frac{\mu_{i-1} + \mu_i}{2}, \tag{53}$$

where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i) \cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3).
After finding a few local minima, we obtain the global minimum by taking the smallest among them.
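The separator search just described can be sketched directly in code (our own illustrative Python, not the authors' implementation; `optimal_split`, `class_stats`, and the toy distribution are assumptions made for the example): for each candidate split index $i$, evaluate $F(x_i)$, keep sign changes with $F(x_i) < 0$ as local minima, and return the candidate with the smallest total variation.

```python
import numpy as np

def class_stats(v, p):
    """Weighted mean and variance of a discrete class; (None, None) if empty."""
    w = p.sum()
    if w == 0:
        return None, None
    mu = (v * p).sum() / w
    return mu, ((v - mu) ** 2 * p).sum() / w

def optimal_split(values, probs):
    """Scan split indices i, flag sign changes of
    F(x_i) = 2*x_i - (mu_{i-1} + mu_i) with F(x_i) < 0 (local minima of the
    total variation), and return (separator M, total variation) of the best."""
    v = np.asarray(values, dtype=float)
    p = np.asarray(probs, dtype=float)
    n = len(v)

    def F(i):
        mu_left, _ = class_stats(v[:i], p[:i])           # mean over {x_j <= x_{i-1}}
        mu_right, _ = class_stats(v[i + 1:], p[i + 1:])  # mean over {x_j >= x_{i+1}}
        if mu_left is None or mu_right is None:
            return None
        return 2 * v[i] - (mu_left + mu_right)

    best = None
    for i in range(1, n - 2):
        fi, fj = F(i), F(i + 1)
        if fi is None or fj is None or not (fi < 0 <= fj):
            continue                                     # no sign change here
        _, var1 = class_stats(v[:i + 1], p[:i + 1])      # class 1 = [x_0, x_i]
        _, var2 = class_stats(v[i + 1:], p[i + 1:])      # class 2 = [x_{i+1}, ...]
        total = var1 + var2
        if best is None or total < best[1]:
            best = ((v[i] + v[i + 1]) / 2, total)        # separator M in between
    return best

# Two well-separated groups: the split should land between 2 and 10.
sep, total_var = optimal_split([0, 1, 2, 10, 11, 12], [1 / 6] * 6)
```

With cumulative sums cached, each $F(x_i)$ is O(1) and the whole scan is linear in the number of distinct values; the direct form is kept here for clarity.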
3. Face Localization Using the Proposed Method
Figure 4: The original image.

Figure 5: The face with shade only.

The superiority of the modified K-means algorithm over the standard algorithm in finding a global minimum is now clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.

To remove the illuminated background part, another threshold is applied to the remaining pixels by the algorithm, this time to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. Pixels with values less than this threshold are assigned the value 0, and the nonzero pixels that remain represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part. The result of removing the illuminated background part is shown in Figure 6.
Figure 6: The face without illumination.

Figure 7: The filtered image.
One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
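The two-pass localization just walked through can be sketched as follows (our own hedged illustration, not the paper's code: `two_class_threshold` is a simple exhaustive stand-in for the optimized threshold search, and `majority3` is one plausible choice for the final filtering step):

```python
import numpy as np

def two_class_threshold(pixels):
    """Exhaustively pick the gray level t minimizing the sum of the
    within-class variances of {p < t} and {p >= t} (a stand-in for the
    optimized K-means threshold)."""
    best_t, best_v = None, np.inf
    for t in range(1, 256):
        lo, hi = pixels[pixels < t], pixels[pixels >= t]
        if lo.size == 0 or hi.size == 0:
            continue
        v = lo.var() + hi.var()
        if v < best_v:
            best_t, best_v = t, v
    return best_t

def majority3(mask):
    """3x3 majority vote, a simple noise-reduction filter."""
    m = np.pad(mask.astype(int), 1)
    h, w = mask.shape
    s = sum(m[i:i + h, j:j + w] for i in range(3) for j in range(3))
    return s >= 5

def localize_face(img):
    """img: 2-D uint8 gray image. Returns a boolean mask of the face part."""
    flat = img.reshape(-1).astype(float)          # reshape the image into a vector
    x1 = two_class_threshold(flat)                # pass 1: nonface vs. face + shade
    x2 = two_class_threshold(flat[flat >= x1])    # pass 2: illuminated bg vs. face
    return majority3(img >= x2)                   # keep face pixels, filter noise

# Synthetic example: dark background (10), illuminated strip (120), face (220).
img = np.full((20, 20), 10, dtype=np.uint8)
img[:, 10:] = 120
img[5:15, 12:18] = 220
mask = localize_face(img)
```

On this toy image the first threshold separates the dark background from everything brighter, and the second separates the illuminated strip from the face block, mirroring Figures 5 and 6.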
4. Results
In this section, the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work, followed by the localization results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 of its images were selected under different conditions such as light, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 of its images were selected under different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset, with image size 896 × 592; its 300 images were likewise selected under different conditions such as indoor/outdoor settings.
4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)
Figure 8: The corresponding window in the original image.

Figure 9: Samples from the MIT-CBCL dataset.
at the Massachusetts Institute of Technology (MIT) [41]. It contains 10 persons, with 200 images per person, at an image size of 115 × 115 pixels. The images of each person were taken under different lighting conditions, with cluttered backgrounds, and at different scales and poses. A few examples of these images are shown in Figure 9.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. The images were taken under different lighting conditions with complex backgrounds and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the localization accuracy was 100% with a
Figure 10: Sample faces from the BioID dataset for localization.

Figure 11: Sample faces from the Caltech dataset for localization.
Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses, faces with cluttered backgrounds, and indoor/outdoor settings.
Dataset     Images    Accuracy (%)
MIT-Set 1   150       100
MIT-Set 2   150       100
BioID       300       100
Caltech     300       100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on images of faces viewed from different angles, from 30 degrees left to 30 degrees right, with variation in the background, as well as indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with large background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time because it has lower complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified K-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified K-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.
Figure 12 shows some examples of face localization on the MIT-CBCL dataset.
In Figure 13(a), an original image from the BioID dataset is shown, and the localization position is shown in Figure 13(b). A filtering operation is still needed to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position on an image from the Caltech dataset.
5. Conclusion
In this paper, we focused on improving ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT-CBCL, BioID, and Caltech datasets, differing in illumination and background as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieves significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is simply a generalization of our approach to higher dimensions (8) and to more subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified K-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science+Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W-L Hung Y-C Chang and E Stanley Lee ldquoWeight selectionin W-K-means algorithm with an application in color imagesegmentationrdquo Computers and Mathematics with Applicationsvol 62 no 2 pp 668ndash676 2011
[39] B Maliatski and O Yadid-Pecht ldquoHardware-driven adaptive 120581-means clustering for real-time video imagingrdquo IEEE Transac-tions on Circuits and Systems for Video Technology vol 15 no 1pp 164ndash166 2005
[40] T Saegusa and T Maruyama ldquoAn FPGA implementation ofreal-time K-means clustering for color imagesrdquo Journal of Real-Time Image Processing vol 2 no 4 pp 309ndash318 2007
[41] ldquoCBCL face recognition Databaserdquo Vol 2010httpcbclmitedusoftware-datasetsheiselefacerecognition-databasehtml
[42] BioID BioID Support Downloads BioID Face Database 2011[43] httpwwwvisioncaltecheduhtml-filesarchivehtml
10 Mathematical Problems in Engineering
Figure 4: The original image.
Figure 5: The face with shade only.
a global minimum, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes, one from 0 to x1 and another from x1 to 255, by applying a certain threshold determined by the algorithm. Let [0, x1] be class 1, which represents the nonface part, and let [x1, 255] be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong both to the face part and to the illuminated part of the background, as shown in Figure 5.

In order to remove the illuminated background part, another threshold is applied to the pixels by the algorithm, this time to separate the face part from the illuminated background part. New classes are obtained, from x1 to x2 and from x2 to 255. The pixels with values less than this threshold are assigned the value 0, and the nonzero pixels left represent the face part. Let [x1, x2] be class 1, which represents the illuminated background part, and let [x2, 255] be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.
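The two-pass segmentation above can be sketched as follows. This is a minimal illustration, not the authors' exact optimization: each threshold is taken as the midpoint between the two centers found by a plain 1-D two-class k-means (Lloyd's iterations), and all function names are ours.

```python
import numpy as np

def two_class_threshold(values, iters=50):
    # 1-D two-class k-means (Lloyd's): alternate between assigning
    # values to the nearer center and recomputing the two centers.
    c_lo, c_hi = float(values.min()), float(values.max())
    for _ in range(iters):
        t = (c_lo + c_hi) / 2.0
        lo, hi = values[values < t], values[values >= t]
        if lo.size == 0 or hi.size == 0:
            break
        new_lo, new_hi = lo.mean(), hi.mean()
        if new_lo == c_lo and new_hi == c_hi:    # converged
            break
        c_lo, c_hi = new_lo, new_hi
    return (c_lo + c_hi) / 2.0                   # threshold between the centers

def localize_face(image):
    pixels = image.reshape(-1).astype(float)     # reshape the image into a vector
    x1 = two_class_threshold(pixels)             # pass 1: [0, x1] = nonface
    stage1 = np.where(image < x1, 0, image)      # suppress the nonface class
    remaining = stage1[stage1 > 0].astype(float)
    x2 = two_class_threshold(remaining)          # pass 2: [x1, x2] = illuminated bg
    return np.where(stage1 >= x2, image, 0)      # nonzero pixels left = face
```

On a synthetic three-level image (dark background, brighter illuminated region, brightest face region), the two passes peel off the two background classes in turn, leaving only the face pixels nonzero.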
Figure 6: The face without illumination.
Figure 7: The filtered image.
One can see that some face pixels were assigned to 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
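The paper does not specify which filter is used; a 3 × 3 median filter is a common choice for removing such isolated misclassified pixels, and can be sketched (in plain NumPy, as an illustration only) as:

```python
import numpy as np

def median_filter_3x3(img):
    # Replace each pixel with the median of its 3x3 neighborhood;
    # isolated noisy pixels are outvoted by their neighbors.
    padded = np.pad(img, 1, mode='edge')
    h, w = img.shape
    neighborhoods = np.stack([padded[i:i + h, j:j + w]
                              for i in range(3) for j in range(3)])
    return np.median(neighborhoods, axis=0)
```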
Figure 8 shows the final result, which is a window containing the face extracted from the original image.
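One simple way to obtain such a window (an illustrative sketch, not necessarily the authors' exact procedure) is to take the bounding box of the nonzero face pixels in the segmented image and crop the original image with it:

```python
import numpy as np

def face_window(original, segmented):
    # Bounding box of the nonzero (face) pixels in the segmented image,
    # used to crop the corresponding window out of the original image.
    # Assumes the segmented image contains at least one nonzero pixel.
    rows = np.any(segmented > 0, axis=1)
    cols = np.any(segmented > 0, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return original[r0:r1 + 1, c0:c1 + 1]
```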
4. Results
In this section, the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work, followed by the results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 of its images were selected based on different conditions such as light variation, pose, and background variation. The second dataset is the BioID dataset, with image size 384 × 286; 300 of its images were selected based on different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset, with image size 896 × 592; 300 of its images were likewise selected based on indoor/outdoor conditions.
4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL)
Figure 8: The corresponding window in the original image.
Figure 9: Samples from the MIT-CBCL dataset.
at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person were taken under different light conditions, with clutter backgrounds, at different scales, and in different poses. A few examples of these images are shown in Figure 9.
4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. The images were taken under different lighting conditions, with complex backgrounds, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).
4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).
4.4. Proposed Method Result. The proposed method was evaluated on these datasets, where the localization accuracy was 100% with a
Figure 10: Sample of faces from the BioID dataset for localization.
Figure 11: Sample of faces from the Caltech dataset for localization.
Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and clutter backgrounds, as well as indoor/outdoor images.
Dataset     Images   Accuracy (%)
MIT-Set 1   150      100
MIT-Set 2   150      100
BioID       300      100
Caltech     300      100
localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on images of faces seen from different angles, from 30 degrees left to 30 degrees right, on variation in the background, and on both indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets were taken from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test: the localization time. The results indicate that the proposed method can locate the face in a very short time, because of its low complexity due to the use of differential equations.
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.
Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.
Figure 12 shows some examples of face localization on MIT-CBCL.
In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts, as shown in Figure 13(c), after which the area of the ROI is determined.
Figure 14 shows the face position on an image from the Caltech dataset.
5. Conclusion
In this paper we focused on improving ROI detection by proposing a new method for face localization. In this method, the input image is reshaped into a vector and the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images, taken from the MIT-CBCL, BioID, and Caltech datasets and differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is simply a generalization of our approach to higher dimensions (8) and to more subclusters.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.
Acknowledgments
This study is a part of Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science+Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Face Database," BioID Support Downloads, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Mathematical Problems in Engineering 11
Figure 8 The corresponding window in the original image
Figure 9 Samples fromMIT-CBCL dataset
at Massachusetts Institute of Technology (MIT) [41] It has10 persons with 200 images per person and an image size of115times115 pixelsThe images of each person are with differentlight conditions clutter background and scale and differentposes Few examples of these images are shown in Figure 9
42 BioID Dataset This dataset [42] consists of 1521 graylevel images with a resolution of 384 times 286 pixels Each oneshows the frontal view of a face of one out of 23 differenttest persons The images were taken under different lightingconditions with a complex background and contain tiltedand rotated faces This dataset is considered as the mostdifficult one for eye detection (see Figure 10)
43 Caltech Dataset This database [43] has been collectedby Markus Weber at California Institute of Technology Thedatabase contains 450 face images of 27 subjects underdifferent lighting conditions different face expressions andcomplex backgrounds (see Figure 11)
44 Proposed Method Result The proposed method is eval-uated on these datasets where the result was 100 with a
Figure 10 Sample of faces from BioID dataset for localization
Figure 11 Sample of faces from Caltech dataset for localization
Table 1 Face localization using K-mean modified algorithm onMIT-CBCL BioID and Caltech the sets contain faces with differentposes and faces with clutter background as well as indoorsoutdoors
Dataset Image Accuracy ()MIT-Set 1 150 100MIT-Set 2 150 100BioID 300 100Caltech 300 100
localization time of 4 s Table 1 shows the proposed methodresults with the MIT-CBCL Caltech and BioID databasesIn this test we have focused on the use of images of facesfrom different angles from 30 degrees left to 30 degreesright and the variation in the background as well as thecases of indoor and outdoor images Two sets of images areselected from theMIT-CBCL dataset and two other sets fromthe other databases to evaluate the proposed method Thefirst set from MIT-CBCL contains the images with differentposes and the second set contains images with differentbackgrounds Each set contains 150 images The BioID andCaltech sets contain imageswith indoor and outdoor settingsThe results showed the efficiency of the proposed K-meanmodified algorithm to locate the faces from the image withhigh background and poses changes In addition anotherparameter was considered in this test which is the localizationtime From the results the proposed method can locate theface in a very short time because it has less complexity due tothe use of differential equations
12 Mathematical Problems in Engineering
Figure 12 Examples of face localization using k-mean modified algorithm on MIT-CBCL dataset
(a) (b) (c)
Figure 13 Example of face localization using k-mean modified algorithm on BioID dataset (a) the original image (b) face position and (c)filtered image
Figure 12 shows some examples of face localization onMIT-CBCL
In Figure 13(a) an original image from BioID dataset isshownThe localization position is shown in Figure 13(b) Butthere is need to do a filtering operation in order to remove theunwanted parts and this is shown in Figure 13(c) and then thearea of ROI will be determined
Figure 14 shows the face position on an image from theCaltech dataset
5 Conclusion
In this paper we focus on developing the RIO detectionby suggesting a new method for face localization In thismethod of localization the input image is reshaped intoa vector and then the optimized k-means algorithm isapplied twice to cluster the image pixels into classes At the
end the block window corresponding to the class of pixelscontaining the face only is extracted from the input imageThree sets of images taken from MIT BioID and Caltechdatasets and differing in illumination and backgroundsas well as in-dooroutdoor settings were used to evaluatethe proposed method The results show that the proposedmethod achieved significant localization accuracyOur futureresearch direction includes finding the optimal centers forhigher dimension data whichwe believe is just generalizationof our approach to higher dimension (8) and to highersubclusters
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Mathematical Problems in Engineering 13
(a) (b) (c)
Figure 14 Example of face localization using k-mean modified algorithm on Caltech dataset (a) the original image (b) face position and(c) filtered image
Acknowledgments
This study is a part of the Internal Grant Project no419011811122 The authors gratefully acknowledge the con-tinued support from Alfaisal University and its Office ofResearch
References
[1] R Chellappa C L Wilson and S Sirohey ldquoHuman andmachine recognition of faces a surveyrdquo Proceedings of the IEEEvol 83 no 5 pp 705ndash740 1995
[2] M-H Yang D J Kriegman and N Ahuja ldquoDetecting faces inimages a surveyrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 24 no 1 pp 34ndash58 2002
[3] K S Fu and J K Mui ldquoA survey on image segmentationrdquoPattern Recognition vol 13 no 1 pp 3ndash16 1981
[4] N R Pal and S K Pal ldquoA review on image segmentationtechniquesrdquo Pattern Recognition vol 26 no 9 pp 1277ndash12941993
[5] S-YWan andW EHiggins ldquoSymmetric region growingrdquo IEEETransactions on Image Processing vol 12 no 9 pp 1007ndash10152003
[6] Y-L Chang and X Li ldquoFast image region growingrdquo Image andVision Computing vol 13 no 7 pp 559ndash571 1995
[7] T N Pappas and N S Jayant ldquoAn adaptive clustering algorithmfor image segmentationrdquo in Proceedings of the InternationalConference on Acoustics Speech and Signal Processing (ICASSPrsquo89) vol 3 pp 1667ndash1670 May 1989
[8] T N Pappas ldquoAn adaptive clustering algorithm for imagesegmentationrdquo IEEE Transactions on Signal Processing vol 40no 4 pp 901ndash914 1992
[9] M Y Mashor ldquoHybrid training algorithm for RBF networkrdquoInternational Journal of the Computer the Internet and Manage-ment vol 8 no 2 pp 50ndash65 2000
[10] N A M Isa S A Salamah and U K Ngah ldquoAdaptive fuzzymovingK-means clustering algorithm for image segmentationrdquo
IEEE Transactions on Consumer Electronics vol 55 no 4 pp2145ndash2153 2009
[11] S N Sulaiman and N AM Isa ldquoAdaptive fuzzy-K-means clus-tering algorithm for image segmentationrdquo IEEE Transactions onConsumer Electronics vol 56 no 4 pp 2661ndash2668 2010
[12] W Cai S Chen and D Zhang ldquoFast and robust fuzzy c-meansclustering algorithms incorporating local information for imagesegmentationrdquo Pattern Recognition vol 40 no 3 pp 825ndash8382007
[13] P Tan M Steinbach and V Kumar Introduction to DataMining Pearson Addison-Wesley 2006
[14] H Xiong J Wu and J Chen ldquoK-means clustering versusvalidation measures a data-distribution perspectiverdquo IEEETransactions on Systems Man and Cybernetics B Cyberneticsvol 39 no 2 pp 318ndash331 2009
[15] I Borg and P J F GroenenModernMultidimensional ScalingmdashTheory and Applications Springer Science + Business Media2005
[16] S Guha R Rastogi and K Shim ldquoCURE an efficient clusteringalgorithm for large databasesrdquo ACM SIGMOD Record vol 27no 2 pp 73ndash84 1998
[17] EM Knorr R T Ng and V Tucakov ldquoDistance-based outliersalgorithms and applicationsrdquo VLDB Journal vol 8 no 3-4 pp237ndash253 2000
[18] P S Bradley U Fayyad and C Reina ldquoScaling clusteringalgorithms to large databasesrdquo in Proceedings of the 4th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo98) pp 9ndash15 1998
[19] D Arthur and S Vassilvitskii ldquok-means++ the advantages ofcareful seedingrdquo in Proceedings of the 18th Annual ACM-SIAMSymposium on Discrete Algorithms pp 1027ndash1035 2007
[20] T Kanungo D M Mount N S Netanyahu C D PiatkoR Silverman and A Y Wu ldquoA local search approximationalgorithm for k-means clusteringrdquo Computational GeometryTheory and Applications vol 28 no 2-3 pp 89ndash112 2004
14 Mathematical Problems in Engineering
[21] M C Chiang C W Tsai and C S Yang ldquoA time-efficient pat-tern reduction algorithm for k-means clusteringrdquo InformationSciences vol 181 no 4 pp 716ndash731 2011
12 Mathematical Problems in Engineering
Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) the face position, and (c) the filtered image.

Figure 12 shows some examples of face localization on the MIT-CBCL dataset.

In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.
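The filtering operation described above can be sketched as a connected-component filter: after clustering, small spurious regions are discarded so that only the dominant region remains as the face candidate before the ROI area is measured. This is an illustrative sketch only; the function name and the choice of 4-connectivity are assumptions, not the paper's exact filter.

```python
import numpy as np
from collections import deque

def largest_component(mask):
    """Keep only the largest 4-connected region of a binary mask.

    Illustrative filtering step (an assumption, not the paper's exact
    filter): small spurious regions left by clustering are removed.
    """
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    best = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                # Breadth-first search to collect one connected component.
                comp, queue = [], deque([(sy, sx)])
                seen[sy, sx] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = np.zeros_like(mask)
    for y, x in best:
        out[y, x] = True
    return out

# A 5x5 block plus one stray pixel: filtering keeps only the block.
m = np.zeros((10, 10), dtype=bool)
m[2:7, 2:7] = True
m[9, 9] = True
print(largest_component(m).sum())  # -> 25
```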
Figure 14 shows the face position on an image from the Caltech dataset.
5 Conclusion
In this paper, we focused on developing ROI detection by proposing a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images, taken from the MIT-CBCL, BioID, and Caltech datasets and differing in illumination and background as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieves significant localization accuracy. Our future research directions include finding the optimal centers for higher-dimensional data, which we believe is a direct generalization of our approach in (8) to higher dimensions and to more subclusters.
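The localization pipeline summarized above (reshape the image into a pixel vector, cluster into two classes, extract the block window of the face class) can be sketched as follows. This is a minimal illustration only: plain Lloyd's k-means with k = 2 stands in for the paper's optimized k-means, which instead searches for the globally optimal centers, and the function names are assumptions.

```python
import numpy as np

def two_means(pixels, iters=20):
    # Plain Lloyd's k-means with k = 2 (face / non-face classes).
    # Illustrative stand-in for the paper's optimized k-means; assumes the
    # image contains at least two distinct intensity levels.
    centers = np.array([pixels.min(), pixels.max()], dtype=float)
    for _ in range(iters):
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        centers = np.array([pixels[labels == j].mean() for j in (0, 1)])
    return labels

def localize_face(image):
    # Reshape the image into a vector, cluster the pixels into two classes,
    # and return the bounding box ("block window") of the smaller class,
    # taken here as the face candidate.
    h, w = image.shape
    mask = two_means(image.reshape(-1).astype(float)).reshape(h, w)
    face_label = np.argmin(np.bincount(mask.ravel()))
    ys, xs = np.nonzero(mask == face_label)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic example: a bright 20x20 square on a dark background.
img = np.zeros((60, 60))
img[20:40, 25:45] = 200.0
print(localize_face(img))  # -> (25, 20, 44, 39)
```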
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) the face position, and (c) the filtered image.
Acknowledgments
This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.
References
[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID Face Database, BioID Support Downloads, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.