PARALLEL GENETIC ALGORITHMS AND THE SCIENCE OF ASTEROSEISMOLOGY
A Review of the Doctoral Dissertation Research of Dr. Travis Metcalfe

Outline
Introduction
The Science of Asteroseismology
The Genetic Algorithm
Parallel Computing
Conclusion

Introduction
Astronomers observe the universe and gather information about it. They then fit this information into mathematical models. The process of “fitting” involves adjusting the many parameters of the model. When they have a good fit, they use the parameter settings to tell them something about the object or phenomenon they are studying. The author uses a parallel genetic algorithm to solve this optimization problem.

The Goal of the Research
To further the understanding of the composition and characteristics of white dwarfs.
More generally, since white dwarfs are the endpoint for all but the most massive stars, this research can lead to a better understanding of stellar evolution.

Traditional Technique
Make an initial “guess” for the parameter values.
Use some iterative technique to improve upon the initial guesses.

Adjustable Input Parameters
Mass
Temperature
H and He layer masses
Convective efficiency
Core composition

Problem with This Technique
Results often depend on the initial guess.
The initial guess is inherently subjective, often the result of intuition or past experience.
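
The dependence on the initial guess can be seen in a small sketch. It assumes a toy one-dimensional objective with two minima and uses plain gradient descent as the "iterative technique". Both are illustrative stand-ins and not the white dwarf model or the method used in the dissertation.

```python
def f(x):
    """Toy objective with two minima (hypothetical, not the stellar model)."""
    return x**4 - 3 * x**2 + x


def df(x):
    """Analytic derivative of f."""
    return 4 * x**3 - 6 * x + 1


def descend(x0, rate=0.01, steps=2000):
    """Plain gradient descent starting from the initial guess x0."""
    x = x0
    for _ in range(steps):
        x -= rate * df(x)
    return x


left = descend(-2.0)   # initial guess on the left of the central hump
right = descend(2.0)   # initial guess on the right of the central hump
print(f"start -2.0 -> x = {left:.3f}, f = {f(left):.3f}")
print(f"start +2.0 -> x = {right:.3f}, f = {f(right):.3f}")
```

The two starting points settle into different minima, and only one of them is the global minimum. This is the behavior a population-based search is meant to avoid.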

The Genetic Algorithm
A genetic algorithm provides a more systematic approach to optimizing the results.
The genetic algorithm used was PIKAIA.
PIKAIA is a general-purpose “function optimization” genetic algorithm.
Public domain software
Fortran-77

The Science of Asteroseismology
White dwarfs that show a regular variation in light intensity are known as pulsating white dwarfs.
Using photometric techniques, this variation in intensity can be measured very accurately with instruments such as the Whole Earth Telescope (WET).
The pulsation is the result of seismic activity within the white dwarf.
Just as seismological information can be used to study the internal nature of the Earth, seismological data, as expressed in varying stellar luminosity, can be used to determine the characteristics of these pulsating white dwarfs.

Observed Light Curve for the White Dwarf GD 358.

The Genetic Algorithm

Initial Conditions
Population size: 1000 (in later work this was reduced to 128).
No rationale was given for how the initial population size was chosen, or why it was changed.
For each member of the initial population, parameter values are set randomly.
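
A minimal sketch of the random initialization follows. The parameter names and search ranges are hypothetical, chosen only for illustration, and uniform sampling is an assumption because the slides say only that the values are set randomly.

```python
import random

# Hypothetical search ranges, for illustration only.
BOUNDS = {
    "teff": (20000.0, 30000.0),   # effective temperature in K
    "mass": (0.45, 0.95),         # stellar mass in solar masses
    "log_m_he": (-7.0, -2.0),     # log of the fractional helium layer mass
}


def random_population(size, bounds, rng):
    """Return `size` members, each a dict of uniformly drawn parameter values."""
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(size)
    ]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = random_population(1000, BOUNDS, rng)
print(len(population), population[0])
```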

Duration
The run continues until the difference between the average fitness and the best fitness in the population is less than 1%.
In later work, he used a constant 200 generations.
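
The two stopping rules can be sketched as below. Measuring the 1% difference relative to the best fitness, and assuming positive fitness values, are choices made here for illustration. The slides do not say how the difference was normalized.

```python
def has_converged(fitnesses, tolerance=0.01):
    """True when the mean fitness is within `tolerance` of the best fitness.

    Assumes positive fitness values and a difference measured relative to
    the best member (an assumption, not stated in the source).
    """
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    return (best - mean) / best < tolerance


def reached_limit(generation, max_generations=200):
    """Fixed-length rule used in the later work: stop after 200 generations."""
    return generation >= max_generations


print(has_converged([1.0, 0.995, 0.999]))  # population has nearly collapsed
print(has_converged([1.0, 0.5]))           # population is still spread out
```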

Fitness Measurement
The model is then run using these initial values.
Fitness is based on the root-mean-square difference between the observed and calculated pulsation periods.
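
The root-mean-square residual itself is straightforward to write down. The mapping from residual to fitness shown here (smaller residual gives larger fitness) is an assumed form, since the slides say only that fitness is "based on" the residual. The period values are made up.

```python
import math


def rms_residual(observed, calculated):
    """Root-mean-square difference between matched lists of pulsation periods."""
    if len(observed) != len(calculated):
        raise ValueError("period lists must have the same length")
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


def fitness(observed, calculated):
    """Assumed mapping: a perfect match scores 1.0, worse matches score less."""
    return 1.0 / (1.0 + rms_residual(observed, calculated))


observed = [423.0, 464.0, 501.0]     # made-up periods in seconds
calculated = [424.0, 462.0, 501.0]   # made-up model output
print(rms_residual(observed, calculated), fitness(observed, calculated))
```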

Fitness Measurement
The fitness value is converted to a survival probability by normalizing with respect to the most fit member.
The next generation is chosen randomly. This random selection is weighted, based on each member’s survival probability.
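
The normalization and weighted draw described above can be sketched with the standard library. The function names are illustrative, and sampling with replacement is an assumption.

```python
import random


def survival_probabilities(fitnesses):
    """Normalize each fitness by the best one, so the fittest member gets 1.0."""
    best = max(fitnesses)
    return [f / best for f in fitnesses]


def select_next_generation(population, fitnesses, rng):
    """Draw a new population of the same size, weighted by survival probability."""
    weights = survival_probabilities(fitnesses)
    return rng.choices(population, weights=weights, k=len(population))


rng = random.Random(0)
members = ["a", "b", "c"]
scores = [2.0, 4.0, 1.0]
print(survival_probabilities(scores))
print(select_next_generation(members, scores, rng))
```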

Crossover
Numerical encoding
The parameter values of each member are concatenated into one long string.
A single-point crossover technique is used. The position along the string is picked randomly.

Mutation
Mutation is achieved by randomly selecting a digit in the string and changing it to a new, randomly chosen value.

Illustration*
Consider two members, each with two parameters.
M1 has X=2.573 and Y=4.457.
M2 has X=3.547 and Y=2.332.
After encoding, M1=25734457 and M2=35472332.
The crossover point is randomly chosen, and the string segments are swapped:
M1: 25734|457 -> 25734332
M2: 35472|332 -> 35472457
Mutating M1 involves picking a random spot along the string and changing that value:
M1: 257|3|4332 -> 25784332
The strings would then be parsed back into parameter values. For M1, this would be:
M1: X=2.578, Y=4.332
* Modified from [1]
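
The illustration can be reproduced in a few lines. In this sketch the crossover point and the mutated digit are fixed to match the slide rather than drawn at random, and the encoding assumes each parameter has one digit before the decimal point and three after. The function names are illustrative.

```python
def encode(x, y):
    """Write both parameters with three decimals and concatenate the digits."""
    return f"{x:.3f}{y:.3f}".replace(".", "")


def decode(digits):
    """Split an 8-digit string back into two parameters."""
    return int(digits[:4]) / 1000, int(digits[4:]) / 1000


def crossover(a, b, point):
    """Single-point crossover: swap the tails of the two strings after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(digits, position, new_digit):
    """Replace the digit at `position` (0-based) with `new_digit`."""
    return digits[:position] + new_digit + digits[position + 1:]


m1 = encode(2.573, 4.457)
m2 = encode(3.547, 2.332)
child1, child2 = crossover(m1, m2, point=5)
mutated = mutate(child1, position=3, new_digit="8")
print(m1, m2, child1, child2, mutated, decode(mutated))
```

In an actual run, `point`, `position`, and `new_digit` would come from a random number generator, and crossover and mutation would be applied only with the probabilities given by the crossover and mutation rates.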

Crossover and Mutation Rate
The crossover rate: 65%
The mutation rate: 0.3%
In later work, the author increased the crossover rate to 85% and varied the mutation rate from 0.1% to 16.6%, depending on the difference between the mean fitness value and the best fitness value.
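
One way such a variable mutation rate could work is sketched below. Only the 0.1% and 16.6% bounds come from the slides. The thresholds, the multiplicative factor, and the use of a relative difference are assumptions made for illustration and should not be read as the author's or PIKAIA's exact rule.

```python
MIN_RATE = 0.001   # 0.1%, lower bound quoted above
MAX_RATE = 0.166   # 16.6%, upper bound quoted above


def adjust_mutation_rate(rate, fitnesses, low=0.05, high=0.25, factor=1.5):
    """Raise the rate when the population has converged, lower it when spread out.

    `low`, `high`, and `factor` are hypothetical tuning values.
    """
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    spread = (best - mean) / best      # relative gap between best and mean
    if spread < low:
        rate *= factor                 # little diversity left: mutate more
    elif spread > high:
        rate /= factor                 # plenty of diversity: mutate less
    return min(MAX_RATE, max(MIN_RATE, rate))


print(adjust_mutation_rate(0.003, [1.0, 0.99]))
print(adjust_mutation_rate(0.003, [1.0, 0.2]))
```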

Elitism
The most fit solution was passed unaltered to the next generation.
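
A minimal sketch of elitism, assuming the preserved member simply overwrites the first slot of the new population. Which slot is replaced is an arbitrary choice made here.

```python
def apply_elitism(old_population, old_fitnesses, new_population):
    """Copy the fittest member of the old generation, unchanged, into the new one."""
    best_index = max(range(len(old_population)), key=lambda i: old_fitnesses[i])
    survivors = list(new_population)            # leave the caller's list untouched
    survivors[0] = old_population[best_index]   # slot 0 is an arbitrary choice
    return survivors


print(apply_elitism(["a", "b", "c"], [0.1, 0.9, 0.5], ["x", "y", "z"]))
```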

Rationale
The idea behind the relatively low crossover and mutation rates is to prevent removing promising solutions from each generation too rapidly.

Repetition
The paper states: “Repeating this procedure many times with different random number seeds helps to ensure that the minimum found is truly global”
It does not elaborate on how many “many times” is, though.

Repetition
In a later paper, he uses 5 repetitions.
This number was arrived at as follows:
Values were put in for the model, and pulsation periods generated.
The genetic algorithm attempted to find the original parameters based on the output of the model.
This was done 20 times, and the results were as follows:

Results (second paper)
First-order solution:

Run  Teff (K)  M/Ms   log(MHe/M*)  rms   Generation Found
1    26,800    0.560  -5.70        0.67  245
2    25,000    0.600  -5.96        0.00  159
3    24,800    0.605  -5.96        0.52  145
4    25,000    0.600  -5.96        0.00  68
5    22,500    0.660  -6.33        1.11  97
6    25,000    0.600  -5.96        0.00  142
7    25,000    0.600  -5.96        0.00  97
8    25,000    0.600  -5.96        0.00  194
9    25,200    0.595  -5.91        0.42  116
10   26,100    0.575  -5.80        0.54  87
11   23,900    0.625  -6.12        0.79  79
12   25,000    0.600  -5.96        0.00  165
13   26,100    0.575  -5.80        0.54  92
14   25,000    0.600  -5.96        0.00  95
15   24,800    0.605  -5.96        0.52  42
16   26,600    0.565  -5.70        0.72  246
17   24,800    0.605  -5.96        0.52  180
18   25,000    0.600  -5.96        0.00  62
19   24,100    0.620  -6.07        0.76  228
20   25,000    0.600  -5.96        0.00  167

The genetic algorithm found the exact result in 9 of 20 runs, and was close enough on four other occasions for the correct result to be determined by the addition of some other iterative technique, for a total of 65% accuracy.
If the GA was rerun and the best result selected, the accuracy increased to 88%.
After 5 runs, the accuracy was over 99%.
Because no exact solution was first found later than generation 200 (the latest appeared at generation 194), the number of generations was reduced to 200.
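
The 88% and 99% figures are consistent with treating each run as an independent trial with a 65% chance of success. The slides do not say how the figures were derived, so the independence assumption in this sketch is an inference.

```python
def combined_success(single_run_rate, runs):
    """Chance that at least one of `runs` independent runs succeeds."""
    return 1.0 - (1.0 - single_run_rate) ** runs


for runs in (1, 2, 5):
    print(runs, f"{combined_success(0.65, runs):.4f}")
```

Under that assumption, two runs give 1 - 0.35^2 = 0.8775 (about 88%) and five runs give 1 - 0.35^5, which is a little under 0.995 and therefore over 99%.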

Output Curve

Parallel Computing

Problem Division
Part one: running the numerical model using a large number of different initial parameters.
Part two: determining fitness, selecting the next generation, and performing crossover/mutation.

Master-Slave Paradigm
Part one (running the model with a given set of parameters) was performed by the slave nodes.
Part two (fitness evaluation, selection/crossover/mutation) was performed by the master node.

PVM
PVM (Parallel Virtual Machine) was used as the message-passing library.

Execution
The master machine generates a job pool of parameter values that it passes to the slave machines.
The slave machines in turn run the model and return the results to the master.
If there are more parameter sets available, the node is given another job.

Execution
The master calculates the variance and determines fitness.
After the models have been run for a given generation, the master determines the members of the next generation and runs the crossover/mutation methods on the appropriate portion of the new population.
As the new parameters are created, they are sent to the workstations.
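
The job-pool pattern can be sketched on a single machine. Here a thread pool stands in for the slave nodes and an ordinary function call stands in for PVM message passing, so this shows the division of labor only, not the cluster code. `run_model` is a hypothetical placeholder for the pulsation model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(params):
    """Hypothetical stand-in for the stellar model: returns a fake residual."""
    x, y = params
    return (x - 1.0) ** 2 + (y - 2.0) ** 2


def evaluate_generation(job_pool, workers=4):
    """Master side: hand parameter sets to idle workers, collect results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker takes the next job from the pool as soon as it is free.
        return list(pool.map(run_model, job_pool))


job_pool = [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0)]
residuals = evaluate_generation(job_pool)
print(residuals)
```

Selection, crossover, and mutation would then run on the master using `residuals`, which mirrors the split between part one and part two described above.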

The Network
The cluster is composed of one master computer and 64 slave nodes.
The cluster of computers is divided into three subnets.
Each subnet is connected to the master serially, using coaxial cable and a 10BASE2 (thin Ethernet) system.

Darwin
Pentium-II 333 MHz system with 128 MB RAM
Two 8.4 GB hard disks
Three NE-2000 compatible network cards, one for each of the segments

Nodes
Motherboard
Processor
Single 32 MB RAM chip
NE-2000 compatible network card
No hard drive!

Nodes
Half of the nodes contain Pentium-II 300 MHz processors, while the other half contain AMD K6-II 450 MHz chips.

The Cluster

Conclusion
Based on initial results, the use of genetic algorithms appears to be a promising method for minimizing the residual difference between observational data and the Wilson-Devinney model.

Conclusion
It is also a wonderful example of how parallel computing, open source software, and clusters of workstations can have a profound impact on the course of research.

PIKAIA Namesake
“Pikaia Gracilens, a little worm-like beast that crawled in the mud of a long gone seafloor of the Cambrian era, 530 million years ago. While not particularly impressive in the tooth and claw department, Pikaia is believed to be the founder of the phylum Chordata, whose subsequent evolution had consequences still very much felt today by the rest of the ecosystem”

References
1. Metcalfe, T. S. (1999), Genetic-Algorithm-Based Light-Curve Optimization Applied to Observations of the W Ursae Majoris Star BH Cassiopeiae, The Astronomical Journal, Vol. 117, No. 5, pp. 2503-2510
2. Metcalfe, T. S., R. E. Nather, and D. E. Winget (2000), Genetic-Algorithm-Based Asteroseismological Analysis of the DBV White Dwarf GD 358, The Astrophysical Journal, Vol. 545, No. 2, pp. 974-981
3. Metcalfe, T. S. (2000), The Asteroseismology Metacomputer, Baltic Astronomy, Vol. 9, pp. 479-483

References
Author’s Web page: http://www.whitedwarf.org
Wilson-Devinney: http://cdsads.u-strasbg.fr/cgi-bin/nph-bib_query?1971ApJ...166..605W
PIKAIA Web Page: http://www.hao.ucar.edu/public/research/si/pikaia/pikaia.html

References
Image Sources
All images were taken from http://www.whitedwarf.org, except:
H-R Diagram: http://www.astunit.com/tutorials/stellar.htm
Pikaia Gracilens: PIKAIA Website