SDSS Quasars Spectra Fitting

N. Kuropatkin, C. Stoughton


Post on 02-Jan-2016


Page 1: SDSS Quasars Spectra Fitting

SDSS Quasars Spectra Fitting

N. Kuropatkin, C. Stoughton

Page 2: SDSS Quasars Spectra Fitting

Introduction
Chris Stoughton

Quasars are complex objects. A swirling cloud of gas and plasma falling into a black hole glows at many different wavelengths. Astronomers measure this spectrum of light to measure the properties of each quasar. The model we fit to the spectrum includes the following components:

Page 3: SDSS Quasars Spectra Fitting

power-law continuum, decreasing as exp(-lambda)
a Balmer continuum due to ionized hydrogen, with a characteristic bump from 2000 to 4000 Angstroms
strong emission lines from ionized gas, such as hydrogen, nitrogen, oxygen, and magnesium
many faint emission lines from iron
starlight from the galaxy that surrounds the quasar
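To make the idea of a multi-component model concrete, here is a minimal sketch that adds a continuum and a few emission lines at each wavelength. It is not the authors' code. The function names, the Gaussian line shape, and every number are assumptions for illustration, and the continuum is written as a true power law, lambda^(-alpha), which is an assumption about what the slide's shorthand intends. The Balmer continuum, the iron lines, and the host-galaxy starlight are left out to keep the sketch short.

```python
import math

def power_law(wave, amplitude, alpha, wave0=3000.0):
    """Continuum amplitude * (wave / wave0) ** (-alpha); wave0 is an arbitrary pivot."""
    return amplitude * (wave / wave0) ** (-alpha)

def gaussian_line(wave, center, sigma, peak):
    """One emission line, modelled as a Gaussian (an assumed line shape)."""
    return peak * math.exp(-0.5 * ((wave - center) / sigma) ** 2)

def model_flux(wave, continuum, lines):
    """Sum the continuum and every emission line at a single wavelength."""
    amplitude, alpha = continuum
    flux = power_law(wave, amplitude, alpha)
    for center, sigma, peak in lines:
        flux += gaussian_line(wave, center, sigma, peak)
    return flux

# Illustrative wavelength grid in Angstroms: 2000 to 6000 in steps of 10.
waves = [2000.0 + 10.0 * i for i in range(401)]
continuum = (10.0, 1.5)            # amplitude, alpha (made-up values)
lines = [(2798.0, 20.0, 5.0),      # Mg II near 2798 A: center, sigma, peak
         (4861.0, 30.0, 8.0)]      # H-beta near 4861 A
spectrum = [model_flux(w, continuum, lines) for w in waves]
```

In a real fit each number above would be a free parameter, which is how a model with many lines quickly reaches hundreds of parameters.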

Page 4: SDSS Quasars Spectra Fitting

We vary the values of the parameters in this model to search for the parameter set that minimizes chi-squared. Since this includes hundreds of parameters, we used a "genetic" algorithm to find a good estimate of the parameter set with the best chi-squared.

The genetic algorithm keeps track of 100 sets of parameters. Borrowing terms from biology, we call one set of parameters a chromosome, and each parameter is a gene. We start by generating 100 random chromosomes, using reasonable ranges for the value of each gene. We calculate chi-squared for each chromosome and sort the results in order of increasing chi-squared. We then do 100 iterations of the following steps:
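For reference, chi-squared in its usual form is the sum of squared residuals between data and model, each divided by the measurement error. The sketch below assumes that standard definition; the function name and the sample numbers are made up for illustration.

```python
def chi_squared(observed, errors, predicted):
    """Sum of squared residuals, each weighted by its measurement error."""
    return sum(((o - p) / e) ** 2
               for o, e, p in zip(observed, errors, predicted))

# Made-up fluxes and errors at three wavelengths.
observed = [10.2, 9.7, 9.1]
errors = [0.2, 0.2, 0.3]

good_model = [10.0, 9.5, 9.0]    # close to the data
poor_model = [12.0, 12.0, 12.0]  # far from the data

chi_good = chi_squared(observed, errors, good_model)
chi_poor = chi_squared(observed, errors, poor_model)
```

A parameter set that tracks the data more closely gives the smaller value, which is the quantity the search tries to minimize.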

Page 5: SDSS Quasars Spectra Fitting

save the first chromosome (the "fittest" survives)
for the next 20 chromosomes, perturb the gene values by 1 sigma
for the next 20 chromosomes, perturb the gene values by 5 sigma
for the next 20 chromosomes, "breed" them by taking some genes from one parent and the rest of the genes from another parent
remove the remaining chromosomes and replace them with randomly generated ones
sort these "new" chromosomes in order of increasing chi-squared

Page 6: SDSS Quasars Spectra Fitting

At the end of these iterations, declare the first chromosome to be the estimate of the best chi-squared fit.
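The loop described above can be sketched in a few lines. This is one possible reading of the slides, not the authors' implementation. It assumes that "sigma" is a per-gene step size supplied by the caller, that perturbations are Gaussian, and that each chromosome in the breeding group is crossed gene by gene with a randomly chosen partner from the same group. The names `genetic_fit` and `toy_cost` are hypothetical, and the toy cost function stands in for chi-squared.

```python
import random

def genetic_fit(cost, ranges, sigmas, pop_size=100, iterations=100, seed=0):
    """Return (best chromosome, best cost after each generation)."""
    rng = random.Random(seed)

    def random_chromosome():
        return [rng.uniform(lo, hi) for lo, hi in ranges]

    def perturb(chrom, scale):
        # Assumed: Gaussian step of `scale` sigmas per gene (may leave the range).
        return [g + rng.gauss(0.0, scale * s) for g, s in zip(chrom, sigmas)]

    def breed(mother, father):
        # Assumed: each gene comes from either parent with equal probability.
        return [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]

    population = sorted((random_chromosome() for _ in range(pop_size)), key=cost)
    history = [cost(population[0])]
    for _ in range(iterations):
        parents = population[41:61]
        new_population = (
            [population[0]]                                  # the fittest survives
            + [perturb(c, 1.0) for c in population[1:21]]    # 1-sigma perturbation
            + [perturb(c, 5.0) for c in population[21:41]]   # 5-sigma perturbation
            + [breed(c, rng.choice(parents)) for c in parents]
        )
        # Replace the remaining chromosomes with fresh random ones.
        new_population += [random_chromosome()
                           for _ in range(pop_size - len(new_population))]
        population = sorted(new_population, key=cost)
        history.append(cost(population[0]))
    return population[0], history

# Toy stand-in for chi-squared: squared distance from a known target.
target = [1.0, -2.0, 0.5]

def toy_cost(chrom):
    return sum((g - t) ** 2 for g, t in zip(chrom, target))

best, history = genetic_fit(toy_cost, ranges=[(-5.0, 5.0)] * 3, sigmas=[0.1] * 3)
```

Because the first chromosome is always carried over unchanged, the best cost can never get worse from one generation to the next. The small perturbations refine good solutions, while the large ones and the random replacements keep the search from settling too early.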

The Sloan Digital Sky Survey has measured the spectra of tens of thousands of quasars.

Each spectral fit consumes approximately 1 hour of CPU time.

We are using the OSG to process these spectra with various implementations of this model.
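A quick back-of-the-envelope calculation shows why a grid is attractive here. The sketch below uses the 1 hour per fit quoted above. The spectrum count of 50,000 and the CPU count of 200 are assumptions chosen only for illustration, since the slides say just "tens of thousands".

```python
def serial_days(n_spectra, hours_per_fit=1.0):
    """Days needed to fit every spectrum one after another on a single CPU."""
    return n_spectra * hours_per_fit / 24.0

def wall_days(n_spectra, n_cpus, hours_per_fit=1.0):
    """Days needed when the fits are spread evenly over n_cpus processors."""
    return serial_days(n_spectra, hours_per_fit) / n_cpus

n_spectra = 50_000   # assumption: the slides only say "tens of thousands"
n_cpus = 200         # assumption: an illustrative number of grid slots

print(f"single CPU: {serial_days(n_spectra):.0f} days")
print(f"{n_cpus} CPUs:   {wall_days(n_spectra, n_cpus):.1f} days")
```

Under these assumed numbers a single CPU would need roughly 2083 days, or about 5.7 years, while 200 CPUs running in parallel would finish in a little over 10 days, ignoring queueing and transfer overhead.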

Page 7: SDSS Quasars Spectra Fitting

Generic Grid Gofer
N. Kuropatkin

The task of fitting QSO spectra is an ideal job for the grid.

It is CPU bound. Execution time is about 1 hour.

Staged-in data and parameters are only about 1 Mbyte.

Staged-out results are only about 2 Mbytes.

Page 8: SDSS Quasars Spectra Fitting

SDSS QSO spectra fitting dataflow

Page 9: SDSS Quasars Spectra Fitting

The dataflow shown is very generic.
About 90% of all jobs on the grid can follow this dataflow.
The main difference between grid tools is the software used on the submission host.

We are using Generic Grid Gofer (GGG), a fine blend of an SQL database and grid middleware in the form of a Java package.

Objectives: simplicity, reliability, comprehensive bookkeeping, automatic production.

Page 10: SDSS Quasars Spectra Fitting

Generic dataflow in GGG

Page 11: SDSS Quasars Spectra Fitting

GGG production steps

All jobs are stored in the “jobs” table.
Available grid sites are stored in the “pool” table.
The Job Manager takes jobs from the database, creates Condor DAG files, and submits them to sites from the pool in an automatic mode.
Two main parts: Job Manager and DAG Creator.
All completed stages of a job are recorded in the database together with submission time and execution time.
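The bookkeeping described above can be pictured with a small in-memory database. This is only a sketch: the slides name the "jobs" and "pool" tables but give no schema, so every column, the stage names, the site names, and the `assign_next_job` helper are hypothetical. GGG itself is a Java package; Python's standard `sqlite3` module is used here for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (                -- hypothetical columns
    job_id            INTEGER PRIMARY KEY,
    spectrum          TEXT NOT NULL,
    stage             TEXT NOT NULL DEFAULT 'new',
    site              TEXT,
    submitted_at      TEXT,
    execution_seconds REAL
);
CREATE TABLE pool (                -- hypothetical columns
    site       TEXT PRIMARY KEY,
    free_slots INTEGER NOT NULL
);
""")
conn.executemany("INSERT INTO jobs (spectrum) VALUES (?)",
                 [("spec-0001.fit",), ("spec-0002.fit",)])
conn.executemany("INSERT INTO pool VALUES (?, ?)",
                 [("site_a", 1), ("site_b", 0)])

def assign_next_job(conn, now):
    """Give the oldest unsubmitted job to a site with a free slot, if any."""
    job = conn.execute(
        "SELECT job_id FROM jobs WHERE stage = 'new' ORDER BY job_id LIMIT 1"
    ).fetchone()
    site = conn.execute(
        "SELECT site FROM pool WHERE free_slots > 0 ORDER BY site LIMIT 1"
    ).fetchone()
    if job is None or site is None:
        return None
    # Record the stage change together with the submission time.
    conn.execute(
        "UPDATE jobs SET stage = 'submitted', site = ?, submitted_at = ? "
        "WHERE job_id = ?", (site[0], now, job[0]))
    conn.execute(
        "UPDATE pool SET free_slots = free_slots - 1 WHERE site = ?", (site[0],))
    return job[0], site[0]

first = assign_next_job(conn, "illustrative-timestamp-1")
second = assign_next_job(conn, "illustrative-timestamp-2")
```

Keeping the job state in the database rather than in the manager process means a restarted manager can pick up where it left off, which fits the stated goals of reliability and comprehensive bookkeeping.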

Page 12: SDSS Quasars Spectra Fitting

The DAG creator block diagram

Page 13: SDSS Quasars Spectra Fitting

The DAG Creator class

Implements the interface between the Job Manager and Grid Middleware.

Uses XML templates describing the job DAG and Condor submit files to create an abstract DAG and then a concrete DAG.

Performs several stages of substitution of dummy parameters in the templates using values from environment, job description and site description files.
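Staged substitution of this kind can be illustrated with Python's standard `string.Template`. The sketch makes several assumptions: it uses a plain-text DAG template rather than the XML templates GGG actually uses, the placeholder names and values are invented, and it treats "abstract" as meaning "site placeholders not yet filled", which is a guess at the slide's terminology. The `JOB` and `PARENT ... CHILD` lines follow the Condor DAGMan input file syntax.

```python
from string import Template

# Hypothetical template; ${...} marks a dummy parameter to be filled in later.
DAG_TEMPLATE = """\
# site: ${site_name}
JOB stage_in ${work_dir}/${job_name}_in.sub
JOB fit ${work_dir}/${job_name}_fit.sub
JOB stage_out ${work_dir}/${job_name}_out.sub
PARENT stage_in CHILD fit
PARENT fit CHILD stage_out
"""

def fill(template_text, *stages):
    """Apply each dictionary in turn; unknown placeholders are left untouched."""
    text = template_text
    for values in stages:
        text = Template(text).safe_substitute(values)
    return text

environment = {"work_dir": "/tmp/ggg_demo"}   # made-up value
job = {"job_name": "qso_0001"}                # made-up value
site = {"site_name": "site_a"}                # made-up value

# First two stages: the site is still unresolved.
abstract_dag = fill(DAG_TEMPLATE, environment, job)
# Final stage: every placeholder has a value.
concrete_dag = fill(abstract_dag, site)
```

`safe_substitute` is what makes the staging work: placeholders that a given stage cannot resolve survive unchanged, so the same job description can be bound to a site at the last moment.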

Page 14: SDSS Quasars Spectra Fitting

How can any user use the package to start their own production?

Install OSG software.
Install the GGG package.
Use the Demo Application as a template to create your own production. You will need to modify 5 simple shell scripts and 5 simple XML files.
Create site description XML files for the sites where you want to run your jobs. There is a tool to help with this.
Distribute your software on those sites. See the demo application for how to do this.
Initialize the database. There are example programs.
Launch JobManager.
Watch how it works.

Page 15: SDSS Quasars Spectra Fitting

Conclusion

We have created a simple and generic tool to organize data processing on the grid. This tool was used to process 10% of SDSS QSO spectra in about two weeks. The tool can be used for many different grid productions.

We are working on the software distribution and web page.

More details can be found at http://home.fnal.gov/~kuropat/sdss_grid/sdssprod.html