Identification of Africanized honeybees

Download Identification of Africanized honeybees

Post on 26-Jun-2016

223 views

Category:

Documents

0 download

Embed Size (px)

TRANSCRIPT

  • Journal of Chromatography A, 1096 (2005) 6975

    Identification of Africanizedul N.y, Stillwersity, P

    2005

    Abstract

    Gas chrom potentiAfricanized drocarexocrine gla am conretention tim ed to idof the genotype. The pattern recognition GA searched for features in the chromatograms that optimized the separation of the European andAfricanized honeybees in a plot of the two or three largest principal components of the data. Because the largest principal components capturethe bulk of the variance in the data, the peaks identified by the pattern recognition GA primarily contained information about differencesbetween gas chromatograms of European and Africanized honeybees. The principal component analysis routine embedded in the fitnessfunction of the pattern recognition GA acted as an information filter, significantly reducing the size of the search space since it restrictedthe search tofocused on tcorrectly aresimilar to asmart one 2005 Else

    Keywords: Pa

    1. Introdu

    Africanimported ineybee bettvariety of hestablishedtypes, referthe bee faunthe AfricanTexas townhoneybeesMexico, Ne

    CorresponE-mail ad

    0021-9673/$doi:10.1016/jfeature sets whose principal component plots showed clustering on the basis of the bees genotype. In addition, the algorithmhose classes and/or samples that were difficult to classify as it trained using a form of boosting. Samples that consistently classifynot as heavily weighted as samples that are difficult to classify. Over time, the algorithm learns its optimal parameters in a manner

    neural network. The pattern recognition GA integrates aspects of artificial intelligence and evolutionary computations to yield a-pass procedure for feature selection and classification.vier B.V. All rights reserved.

    ttern recognition; Genetic algorithms; Classification; Chemical taxonomy; Feature selection

    ction

    ized honeybees are descendants of African beesto Brazil by scientists attempting to breed a hon-

    er adapted to the South American tropics. Theoneybee that resulted from the interbreeding of theEuropean bee with the newly imported African

    red to as the Africanized bee, has since dominateda of much of South and Central America. In 1990,ized honeybee appeared outside the small southof Hildago [1]. In the past 14 years, Africanized

    have spread to southern California, Arizona, Newvada, and Oklahoma. The success of Africanized

    ding author. Tel.: +1 405 744 5945; fax: +1 405 744 6007.dress: bklab@chem.okstate.edu (B.K. Lavine).

    honeybees in supplanting the European honeybee populationhas been attributed to a variety of biological and behaviorfactors and is one of the most successful introgressions everdocumented.

    Africanized honeybees have received considerable atten-tion in the popular press. Many stories have stressed theaggressive behavior of this bee and the inherent danger thatAfricanized bees pose for both man and domestic animals.In addition, Africanized honeybees also have the potentialto alter agricultural practices and significantly increase thecost of bee-pollinated food products. Honeybees account for80% of all insect pollination activity in the United States.They pollinate more than 100 different agricultural productsincluding many fruits and vegetables, forage plants, whichare important in production of meat and dairy products, andoil seed crops. The United States Department of Agriculture

    see front matter 2005 Elsevier B.V. All rights reserved..chroma.2005.06.049Barry K. Lavine a,, Meha Department of Chemistry, Oklahoma State Universit

    b Mathematics & Computer Science, Clarkson Univ

    Available online 11 July

    atography and pattern recognition methods were used to develop ahoneybees. The test data consisted of 237 gas chromatograms of hynds of European and Africanized honeybees. Each gas chromatogre windows. A genetic algorithm (GA) for pattern recognition was ushoneybeesVora b

    ater, OK 74078-3071, USAotsdam, NY 13699, USA

    al method for differentiating European honeybees frombon extracts obtained from the wax glands, cuticle, andtained 65 peaks corresponding to a set of standardizedentify features in the gas chromatograms characteristic

  • 70 B.K. Lavine, M.N. Vora / J. Chromatogr. A 1096 (2005) 6975

    estimates that 20 billion dollars worth of agricultural prod-ucts are dependent on the European honeybee for pollination[2]. If Africanized honeybees appear in the United States inthe same fthey couldagricultura

    To contStates, it wtification. Tand easy tohoneybeesUnited Stahoneybee iprocedureapproximabee specimmorphomepopulationwhich is oofficials. Ahas recentare neededmethod.

    Our prematographcuticular hyare fully Ahoneybeeshave also sgenotype inthe linear lheavily Afand moderyet fully Acan be difdifferences

    In thefrom the wEuropeancapillary c(GA) for pfeatures inAfrican gefeatures inof the Eurotwo or threBecause thof the variarecognitiondifferencesThe princithe fitnessan informasearch spawhose prinbasis of g

    those classes and/or samples that were difficult to classify asit trained using a form of boosting. Samples that consistentlyclassify correctly are not as heavily weighted as samples that

    ifficulal pa

    xperim

    Bee sp

    ydrocaer beees andybeesdor, Pdesig

    ers atvior. Eies mty of c

    Sampl

    ydrocaxocrining indThe hsilicapesti

    ion wam of nto anahroma

    Gas c

    ydrocawere

    silic0.32 m200 Cgas ch5890Ator. G

    a Fnce ofs wase n-allary ction inthe co

    MS exal gasan Afatogorm in which they dispersed throughout Brazil,have deleterious effects on all aspects of the USl economy influenced by bee pollination.rol the spread of Africanized bees in the Unitedill be necessary to develop a program of stock cer-his program can only be implemented if a reliableuse method for the identification of Africanized

    is developed. Currently, the method used by thetes Department of Agriculture for Africanizeddentification is morphometric analysis [3]. Thisemploys a linear discriminant developed fromtely 20 body measurements to identify individual

    ens as Africanized or European. However,tric analysis cannot determine if a given beeis in the initial stages of becoming Africanized,f great interest to Federal and State regulatorylthough a polymerase chain reaction based assayly been developed [4], more selective primers

    to ensure accurate genotyping when using this

    vious work using packed column gas chro-y and the linear learning machine [5] to analyzedrocarbons of insects has shown that bees whichfricanized can be differentiated from Europeanon the basis of their hydrocarbon profiles [6]. Wehown that it is possible to identify the AfricanF1 hybrids [7]. Using gas chromatography and

    earning machine, we have also demonstrated thatricanized (i.e., bees that are fully Africanized)ately Africanized honeybees (bees that are notfricanized but possess many of the African traits)ferentiated from European honeybees based onin their hydrocarbon profiles [8].present study, hydrocarbon extracts obtainedax glands, cuticle, and exocrine glands of 238

    and Africanized honeybees were analyzed byolumn gas chromatography. A genetic algorithmattern recognition [911] was used to identifythe gas chromatograms characteristic of the

    notype. The pattern recognition GA searched forthe chromatograms that optimized the separationpean and Africanized honeybees in a plot of thee largest principal components [12] of the data.e largest principal components capture the bulknce in the data, the peaks identified by the pattern

    GA primarily contained information aboutbetween European and Africanized honeybees.

    pal component analysis routine embedded infunction of the pattern recognition GA acted astion filter, significantly reducing the size of thece since it restricted the search to feature setscipal component plots showed clustering on theenotype. In addition, the algorithm focused on

    are doptim

    2. E

    2.1.

    HworkeybehoneEcuawere

    workbehacolonvarie

    2.2.

    Hand esoak72 h.of ausingfractstreapriorgas c

    2.3.

    Hbeesfusedi.d. =50 toThea HPdetecusingpresedieneto thcapilretenfromGCtypicfromchromt to classify. Over time, the algorithm learns itsrameters in a manner similar to a neural network.

    ental

    ecimens

    rbon extracts were obtained from 294 adults. Of the 294 foragers, 128 were Africanized hon-the other 166 were European. The Africanized

    were collected from colonies in Costa Rica, Peru,eru, Honduras, and Mexico. Many of the coloniesnated as moderately or heavily Africanized bythese sites based on a field test for colony defenseuropean honeybees were collected from managedaintained in the United States. They represented aommercially available US stocks.

    e preparation

    rbons were extracted from the wax gland, cuticle,e gland of individual whole bee specimens by firstividual specimens in pesticide grade hexane for

    ydrocarbons were isolated from the soak by meansgel syringe column (silica Sep-Pac, Millipore)

    cide grade hexane as the eluent. The hydrocarbons collected and concentrated to dryness under aitrogen. It was reconstituted with 50l of hexanelysis by capillary column gas chromatography ortographymass spectrometery (GCMS).

    hromatographic analysis

    rbon extracts obtained from individual wholeanalyzed on a 25-m 5% phenyl methyl siliconea capillary column (Hewlett-Packard Ultra 2,m), which was temperature programmed fromat 7/min and then from 200 to 250 C at 1/min.

    romatographic experiments were performed oninstrument equipped with a flame ionization

    CMS analysis was also performed in this studyinnigan OWA 1020 automated GCMS. Thenormal and branched chain alkanes, alkenes, andrevealed in the extract. GC peaks correspondingkanes were used as retention standards in theolumn gas chromatographic experiments. Kovatdices were assigned to the compounds elutinglumn and these indices (as well as data from theperiment) were used for peak identification. Achromatographic trace of the hydrocarbon extractricanized honeybee is shown in Fig. 1. Each gasram contained 65 peaks corresponding to a set

  • B.K. Lavine, M.N. Vora / J. Chromatogr. A 1096 (2005) 6975 71

    Fig. 1. Gas ch and, cunormal alkane nd perm

    of standardmatographwere at leaof these pewere alsovisual anal

    3. Pattern

    For pattdata was dipean honeyhoneybees,56 Europeabees from Scorrectly cEach gas cvector X = (jth peak. San n-dimenis thereforeEuclideanassumptioninversely re

    Table 1Training set

    Specimen typ

    European foraHeavily AfricModerately A

    Total

    t poinssingn of thEurogenet

    t featuees g

    GAlationof whEuropto bestringC pea

    ubset oromatographic trace of the hydrocarbon extracts obtained from the wax gls; B: alkenes; C: dienes; and D: branched chain alkanes. Reprinted with ki

    ized retention time windows. The 65 gas chro-ic peaks selected for pattern recognition analysisst moderately resolved and computer integrationaks always yielded reliable results. These peaksreadily identifiable in all the chromatograms byysis so peak matching was not a problem.

    recognition analysis

    ern recognition analysis, the gas chromatographicvided into a training set (consisting of 130 Euro-bees and 108 heavily and moderately Africanizedsee Table 1) and a prediction set (consisting of

    n and Africanized honeybees of which the honey-

    is thaposseregioto the

    Aselecthe bnitionpopueachized/peakin theing Gthe san Diego, Tampa, Berkeley and Mexico were notlassified by morphometric analysis, see Table 2).hromatogram was initially represented as a datax1, x2, x3, . . . xj, . . . x65) where xj is the area of theuch a vector can also be considered as a point insional Euclidean space. A set of chromatogramsrepresented as a set of points in an n-dimensional

    space. (In this study, n is equal to 65.) A basicis that distances between points in this space arelated to their degree of similarity. The expectation

    e Number of honeybees

    gers from United States 130anized foragers 64fricanized foragers 44

    238

    evaluation.tion, whichthe Europeponent plotThe fitnessfor bee clasrecombinatThe power

    Table 2Prediction set

    Specimen typ

    San Diego (ETampa (EuropBerkeley (EurMexico (AfricFrench GuinePeru (AfricanTotalticle, and exocrine gland of a heavily Africanized forager. A:ission from Lavine et al. [8].

    ts representing chromatograms from honeybeesthe African genotype should cluster in a limitedis space separate from the points corresponding

    pean honeybees.ic algorithm for pattern recognition was used tores from the training set data characteristic ofenotype. A block diagram of the pattern recog-is shown in Fig. 2. During each generation, aof binary strings of fixed length is generated,

    ich represents a potential solution to the African-ean honeybee classification problem. For a GCincluded, it is necessary for the corresponding bit

    to be set at 1. If the bit is set to 0, the correspond-k is not included. The strings are decoded yieldingf the 65 GC peaks sent to the fitness function for

    Each string is assigned a value by the fitness func-is a measure of the degree of separation between

    an and Africanized honeybees in a principal com-of the data defined by the extracted feature subset.(i.e., the quality of the proposed feature subsetsification) is used to select potential solutions forion, which produces a new population of strings.of the GA arises from recombination [13,14],

    e Number of honeybees

    uropean) 7ean) 22opean) 7anized) 6

    a (Africanized) 7ized) 7

    56

  • 72 B.K. Lavine, M.N. Vora / J. Chromatogr. A 1096 (2005) 6975

    Fig. 2. Block om B.Kidentification es, Ana

    which causmation betexpectationIn additionwhere onethe featurestring in quinclusion inGA to searclocal convetion, selectinternal paruntil a specsible solutirecognition

    3.1. Evalu

    The parecognitioncomponentplots, classthe fitnessing each gweights forto the corre

    CW(c) = 1

    SWc(s) =

    The prinsome afterextracted issification adistances aprincipal cosmallest toneighbors.a user defithe class to

    arest noint inmputer to sc

    egree

    =

    c

    betteplots,assign

    es) ha0 sampean

    sampleas eaose aighbo

    es. Heh equale, the

    Adjus

    e GAult torationsompuof SHn. SHdiagram of the pattern recognition GA. Reprinted with kind permission frusing gas chromatography/genetic algorithms-pattern recognition techniqu

    es a structured yet randomized exchange of infor-ween strings (i.e., potential solutions), with thethat good solutions can generate even better ones.

    , some of the binary strings may undergo mutation,of the bits is randomly changed. (If a bit is zero oris excluded, the mutation operator applied to theestion causes the bit to change to one and forces its

    the feature subset or vice versa.) This allows theh adjacent regions of the solution space mitigatingrgence. The aforementioned processes: evalua-ion, crossover, reproduction, and adjustment ofameters (which is discussed below), are repeatedified number of generations is achieved or a fea-

    on is found. The operators comprising our patternGA are described below.

    ation

    ttern recognition GA emulates human patternthrough machine learning to score the principalplots. To track and score the principal componentand sample weights, which are an integral part of

    function, are computed (see Eqs. (1) and (2)) dur-eneration. Class weights sum to 100;...