Identification of Africanized honeybees
Post on 26-Jun-2016
Embed Size (px)
Journal of Chromatography A, 1096 (2005) 6975
Identification of Africanizedul N.y, Stillwersity, P
Gas chrom potentiAfricanized drocarexocrine gla am conretention tim ed to idof the genotype. The pattern recognition GA searched for features in the chromatograms that optimized the separation of the European andAfricanized honeybees in a plot of the two or three largest principal components of the data. Because the largest principal components capturethe bulk of the variance in the data, the peaks identified by the pattern recognition GA primarily contained information about differencesbetween gas chromatograms of European and Africanized honeybees. The principal component analysis routine embedded in the fitnessfunction of the pattern recognition GA acted as an information filter, significantly reducing the size of the search space since it restrictedthe search tofocused on tcorrectly aresimilar to asmart one 2005 Else
Africanimported ineybee bettvariety of hestablishedtypes, referthe bee faunthe AfricanTexas townhoneybeesMexico, Ne
0021-9673/$doi:10.1016/jfeature sets whose principal component plots showed clustering on the basis of the bees genotype. In addition, the algorithmhose classes and/or samples that were difficult to classify as it trained using a form of boosting. Samples that consistently classifynot as heavily weighted as samples that are difficult to classify. Over time, the algorithm learns its optimal parameters in a manner
neural network. The pattern recognition GA integrates aspects of artificial intelligence and evolutionary computations to yield a-pass procedure for feature selection and classification.vier B.V. All rights reserved.
ttern recognition; Genetic algorithms; Classification; Chemical taxonomy; Feature selection
ized honeybees are descendants of African beesto Brazil by scientists attempting to breed a hon-
er adapted to the South American tropics. Theoneybee that resulted from the interbreeding of theEuropean bee with the newly imported African
red to as the Africanized bee, has since dominateda of much of South and Central America. In 1990,ized honeybee appeared outside the small southof Hildago . In the past 14 years, Africanized
have spread to southern California, Arizona, Newvada, and Oklahoma. The success of Africanized
ding author. Tel.: +1 405 744 5945; fax: +1 405 744 6007.dress: email@example.com (B.K. Lavine).
honeybees in supplanting the European honeybee populationhas been attributed to a variety of biological and behaviorfactors and is one of the most successful introgressions everdocumented.
Africanized honeybees have received considerable atten-tion in the popular press. Many stories have stressed theaggressive behavior of this bee and the inherent danger thatAfricanized bees pose for both man and domestic animals.In addition, Africanized honeybees also have the potentialto alter agricultural practices and significantly increase thecost of bee-pollinated food products. Honeybees account for80% of all insect pollination activity in the United States.They pollinate more than 100 different agricultural productsincluding many fruits and vegetables, forage plants, whichare important in production of meat and dairy products, andoil seed crops. The United States Department of Agriculture
see front matter 2005 Elsevier B.V. All rights reserved..chroma.2005.06.049Barry K. Lavine a,, Meha Department of Chemistry, Oklahoma State Universit
b Mathematics & Computer Science, Clarkson Univ
Available online 11 July
atography and pattern recognition methods were used to develop ahoneybees. The test data consisted of 237 gas chromatograms of hynds of European and Africanized honeybees. Each gas chromatogre windows. A genetic algorithm (GA) for pattern recognition was ushoneybeesVora b
ater, OK 74078-3071, USAotsdam, NY 13699, USA
al method for differentiating European honeybees frombon extracts obtained from the wax glands, cuticle, andtained 65 peaks corresponding to a set of standardizedentify features in the gas chromatograms characteristic
70 B.K. Lavine, M.N. Vora / J. Chromatogr. A 1096 (2005) 6975
estimates that 20 billion dollars worth of agricultural prod-ucts are dependent on the European honeybee for pollination. If Africanized honeybees appear in the United States inthe same fthey couldagricultura
To contStates, it wtification. Tand easy tohoneybeesUnited Stahoneybee iprocedureapproximabee specimmorphomepopulationwhich is oofficials. Ahas recentare neededmethod.
Our prematographcuticular hyare fully Ahoneybeeshave also sgenotype inthe linear lheavily Afand moderyet fully Acan be difdifferences
In thefrom the wEuropeancapillary c(GA) for pfeatures inAfrican gefeatures inof the Eurotwo or threBecause thof the variarecognitiondifferencesThe princithe fitnessan informasearch spawhose prinbasis of g
those classes and/or samples that were difficult to classify asit trained using a form of boosting. Samples that consistentlyclassify correctly are not as heavily weighted as samples that
ydrocaer beees andybeesdor, Pdesig
ers atvior. Eies mty of c
ydrocaxocrining indThe hsilicapesti
ion wam of nto anahroma
silic0.32 m200 Cgas ch5890Ator. G
a Fnce ofs wase n-allary ction inthe co
MS exal gasan Afatogorm in which they dispersed throughout Brazil,have deleterious effects on all aspects of the USl economy influenced by bee pollination.rol the spread of Africanized bees in the Unitedill be necessary to develop a program of stock cer-his program can only be implemented if a reliableuse method for the identification of Africanized
is developed. Currently, the method used by thetes Department of Agriculture for Africanizeddentification is morphometric analysis . Thisemploys a linear discriminant developed fromtely 20 body measurements to identify individual
ens as Africanized or European. However,tric analysis cannot determine if a given beeis in the initial stages of becoming Africanized,f great interest to Federal and State regulatorylthough a polymerase chain reaction based assayly been developed , more selective primers
to ensure accurate genotyping when using this
vious work using packed column gas chro-y and the linear learning machine  to analyzedrocarbons of insects has shown that bees whichfricanized can be differentiated from Europeanon the basis of their hydrocarbon profiles . Wehown that it is possible to identify the AfricanF1 hybrids . Using gas chromatography and
earning machine, we have also demonstrated thatricanized (i.e., bees that are fully Africanized)ately Africanized honeybees (bees that are notfricanized but possess many of the African traits)ferentiated from European honeybees based onin their hydrocarbon profiles .present study, hydrocarbon extracts obtainedax glands, cuticle, and exocrine glands of 238
and Africanized honeybees were analyzed byolumn gas chromatography. A genetic algorithmattern recognition  was used to identifythe gas chromatograms characteristic of the
notype. The pattern recognition GA searched forthe chromatograms that optimized the separationpean and Africanized honeybees in a plot of thee largest principal components  of the data.e largest principal components capture the bulknce in the data, the peaks identified by the pattern
GA primarily contained information aboutbetween European and Africanized honeybees.
pal component analysis routine embedded infunction of the pattern recognition GA acted astion filter, significantly reducing the size of thece since it restricted the search to feature setscipal component plots showed clustering on theenotype. In addition, the algorithm focused on
Hand esoak72 h.of ausingfractstreapriorgas c
Hbeesfusedi.d. =50 toThea HPdetecusingpresedieneto thcapilretenfromGCtypicfromchromt to classify. Over time, the algorithm learns itsrameters in a manner similar to a neural network.
rbon extracts were obtained from 294 adults. Of the 294 foragers, 128 were Africanized hon-the other 166 were European. The Africanized
were collected from colonies in Costa Rica, Peru,eru, Honduras, and Mexico. Many of the coloniesnated as moderately or heavily Africanized bythese sites based on a field test for colony defenseuropean honeybees were collected from managedaintained in the United States. They represented aommercially available US stocks.
rbons were extracted from the wax gland, cuticle,e gland of individual whole bee specimens by firstividual specimens in pesticide grade hexane for
ydrocarbons were isolated from the soak by meansgel syringe column (silica Sep-Pac, Millipore)
cide grade hexane as the eluent. The hydrocarbons collected and concentrated to dryness under aitrogen. It was reconstituted with 50l of hexanelysis by capillary column gas chromatography ortographymass spectrometery (GCMS).
rbon extracts obtained from individual wholeanalyzed on a 25-m 5% phenyl methyl siliconea capillary column (Hewlett-Packard Ultra 2,m), which was temperature programmed fromat 7/min and then from 200 to 250 C at 1/min.
romatographic experiments were performed oninstrument equipped with a flame ionization
CMS analysis was also performed in this studyinnigan OWA 1020 automated GCMS. Thenormal and branched chain alkanes, alkenes, andrevealed in the extract. GC peaks correspondingkanes were used as retention standards in theolumn gas chromatographic experiments. Kovatdices were assigned to the compounds elutinglumn and these indices (as well as data from theperiment) were used for peak identification. Achromatographic trace of the hydrocarbon extractricanized honeybee is shown in Fig. 1. Each gasram contained 65 peaks corresponding to a set
B.K. Lavine, M.N. Vora / J. Chromatogr. A 1096 (2005) 6975 71
Fig. 1. Gas ch and, cunormal alkane nd perm
of standardmatographwere at leaof these pewere alsovisual anal
For pattdata was dipean honeyhoneybees,56 Europeabees from Scorrectly cEach gas cvector X = (jth peak. San n-dimenis thereforeEuclideanassumptioninversely re
Table 1Training set
European foraHeavily AfricModerately A
t poinssingn of thEurogenet
t featuees g
GAlationof whEuropto bestringC pea
ubset oromatographic trace of the hydrocarbon extracts obtained from the wax gls; B: alkenes; C: dienes; and D: branched chain alkanes. Reprinted with ki
ized retention time windows. The 65 gas chro-ic peaks selected for pattern recognition analysisst moderately resolved and computer integrationaks always yielded reliable results. These peaksreadily identifiable in all the chromatograms byysis so peak matching was not a problem.
ern recognition analysis, the gas chromatographicvided into a training set (consisting of 130 Euro-bees and 108 heavily and moderately Africanizedsee Table 1) and a prediction set (consisting of
n and Africanized honeybees of which the honey-
is thaposseregioto the
Aselecthe bnitionpopueachized/peakin theing Gthe san Diego, Tampa, Berkeley and Mexico were notlassified by morphometric analysis, see Table 2).hromatogram was initially represented as a datax1, x2, x3, . . . xj, . . . x65) where xj is the area of theuch a vector can also be considered as a point insional Euclidean space. A set of chromatogramsrepresented as a set of points in an n-dimensional
space. (In this study, n is equal to 65.) A basicis that distances between points in this space arelated to their degree of similarity. The expectation
e Number of honeybees
gers from United States 130anized foragers 64fricanized foragers 44
evaluation.tion, whichthe Europeponent plotThe fitnessfor bee clasrecombinatThe power
Table 2Prediction set
San Diego (ETampa (EuropBerkeley (EurMexico (AfricFrench GuinePeru (AfricanTotalticle, and exocrine gland of a heavily Africanized forager. A:ission from Lavine et al. .
ts representing chromatograms from honeybeesthe African genotype should cluster in a limitedis space separate from the points corresponding
pean honeybees.ic algorithm for pattern recognition was used tores from the training set data characteristic ofenotype. A block diagram of the pattern recog-is shown in Fig. 2. During each generation, aof binary strings of fixed length is generated,
ich represents a potential solution to the African-ean honeybee classification problem. For a GCincluded, it is necessary for the corresponding bit
to be set at 1. If the bit is set to 0, the correspond-k is not included. The strings are decoded yieldingf the 65 GC peaks sent to the fitness function for
Each string is assigned a value by the fitness func-is a measure of the degree of separation between
an and Africanized honeybees in a principal com-of the data defined by the extracted feature subset.(i.e., the quality of the proposed feature subsetsification) is used to select potential solutions forion, which produces a new population of strings.of the GA arises from recombination [13,14],
e Number of honeybees
uropean) 7ean) 22opean) 7anized) 6
a (Africanized) 7ized) 7
72 B.K. Lavine, M.N. Vora / J. Chromatogr. A 1096 (2005) 6975
Fig. 2. Block om B.Kidentification es, Ana
which causmation betexpectationIn additionwhere onethe featurestring in quinclusion inGA to searclocal convetion, selectinternal paruntil a specsible solutirecognition
The parecognitioncomponentplots, classthe fitnessing each gweights forto the corre
CW(c) = 1
The prinsome afterextracted issification adistances aprincipal cosmallest toneighbors.a user defithe class to
arest noint inmputer to sc
es) ha0 sampean
sampleas eaose aighbo
es. Heh equale, the
e GAult torationsompuof SHn. SHdiagram of the pattern recognition GA. Reprinted with kind permission frusing gas chromatography/genetic algorithms-pattern recognition techniqu
es a structured yet randomized exchange of infor-ween strings (i.e., potential solutions), with thethat good solutions can generate even better ones.
, some of the binary strings may undergo mutation,of the bits is randomly changed. (If a bit is zero oris excluded, the mutation operator applied to theestion causes the bit to change to one and forces its
the feature subset or vice versa.) This allows theh adjacent regions of the solution space mitigatingrgence. The aforementioned processes: evalua-ion, crossover, reproduction, and adjustment ofameters (which is discussed below), are repeatedified number of generations is achieved or a fea-
on is found. The operators comprising our patternGA are described below.
ttern recognition GA emulates human patternthrough machine learning to score the principalplots. To track and score the principal componentand sample weights, which are an integral part of
function, are computed (see Eqs. (1) and (2)) dur-eneration. Class weights sum to 100;...