
Post on 03-Jan-2016


APOGEE-2 Data Infrastructure
Jon Holtzman (NMSU)
APOGEE team

Data infrastructure
- Data infrastructure for APOGEE-2 will be similar to that of APOGEE-1, generalized to multiple observatories, and with improved tracking of processing
- APOGEE raw data and data products are stored on the Science Archive Server (SAS)
- Reduction and analysis software is (mostly) managed through the SDSS SVN repository
- Raw and reduced data are described (mostly) through the SDSS datamodel
- Data and processing are documented via SDSS web pages and technical papers

Raw data
- APOGEE instrument reads continuously (every ~10 s) as data are accumulating, 3 chips at 2048x2048 each
- Raw data are stored on the instrument control computer (current capacity is several weeks of data)
- Individual readouts are annotated with information from the telescope and stored on the analysis computer (current capacity is several months). These frames are archived to local disks that are shelved at APO (currently 20 x 3 TB disks)
- Quick reduction software at the observatory assembles data into data cubes and compresses them (lossless) for archiving on SAS
- Maximum daily compressed data volume ~ 60 Gb

Raw data

- Does not include NMSU 1m + APOGEE data
- LCO data will be concurrent
- Total 2.5m raw data to date: ~11 TB

Initial processing
- Quick reduction software estimates S/N (at H=12.2), which is inserted into the plate database for use with autoscheduling decisions
- APOGEE-1
  - Data transferred to SAS next day, transferred to NMSU later that day, processed with full pipeline following day, updated S/N loaded into platedb, initial QA inspection
- APOGEE-2 proposal:
  - Process data at observatory with full pipeline next day, or at SAS location (Utah), and/or
  - Improve quick reduction S/N

Pipeline processing
- Three main stages (+1 post-processing)
  - APRED: processing of individual visits (multiple exposures at different detector spectral dither positions) into visit-combined spectra, with initial RV estimates. Can be done daily
  - APSTAR: combine multiple visits into combined spectra, with final RV determination. For APOGEE-1, has been run annually (DR10: year 1, DR11: year 1 + year 2)
  - ASPCAP: process combined (or resampled visit) spectra through stellar parameters and chemical abundances pipeline. For APOGEE-1, has been run 3 times
  - ASPCAP/RESULTS: apply calibration relations to derived parameters, set flag values for these

APOGEE data products
- Raw data: data cubes (apR)
- Processed exposures (maybe not of general interest?)
  - 2D images (ap2D)
  - Extracted spectra (ap1D)
  - Sky subtracted and telluric corrected (apCframe)
- Visit spectra
  - Combine multiple exposures at different dither positions
  - apVisit files: native wavelength scale, but with wavelength array
- Combined spectra
  - Combine multiple visits, requires relative RVs
  - apStar files: spectra resampled to log(lambda) scale
- Derived products from spectra
  - Radial velocities and scatter from multiple measurements (done during combination)
  - Stellar parameters/chemical abundances from best-fitting template
    - Parameters: Teff, log g, microturbulence (fixed), [M/H], [alpha/M], [C/M], [N/M]
    - Abundances for 15 individual elements
  - aspcapStar and aspcapField files: stellar parameters of best fit, pseudo-continuum normalized spectra, and best-fitting templates
- Wrap-up catalog files (allStar, allVisit)

APOGEE data volume
- Raw data:
  - 2.5m+APOGEE: ~4 TB/year (APOGEE-1), ~6 TB/year with MaNGA co-observing
  - 1m+APOGEE: ~2 TB/year
  - LCO+APOGEE: ~3 TB/year
  - TOTAL APOGEE-1 + APOGEE-2: ~75 TB
- Processed visit files: ~3 TB/year (80% individual exposure reductions)
- Processed combined star files: ~500 GB/100,000 stars
- Processed ASPCAP files:
  - raw FERRE files: ~500 GB/100,000 stars
  - Bundled output: ~100 GB/100,000 stars
- TOTAL APOGEE-1 + APOGEE-2 (one reduction!): ~40 TB
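
The per-star figures on this slide lend themselves to a quick back-of-envelope estimate. The Python sketch below assumes that star-level processed volume scales linearly with the number of stars, which the slide does not state; the function and dictionary names are hypothetical, and visit files (quoted per year, not per star) are left out.

```python
# Star-level processed volumes quoted on the slide, in GB per 100,000 stars.
GB_PER_100K_STARS = {
    "combined star files": 500,
    "raw FERRE files": 500,
    "bundled ASPCAP output": 100,
}


def processed_volume_gb(n_stars):
    """Estimate star-level processed volume in GB.

    Assumption (not from the slides): volume scales linearly with n_stars.
    """
    return sum(GB_PER_100K_STARS.values()) * n_stars / 100_000


print(processed_volume_gb(100_000))  # 1100.0
```

Under that assumption, one reduction of 100,000 stars comes to roughly 1.1 TB before visit-level products are added.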

APOGEE data access
- Flat files available via SDSS SAS:
  - all intermediate and final data product files
  - summary "wrap-up" files (catalog)
- Catalog files available via SDSS CAS:
  - apogeeVisit, apogeeStar, aspcapStar
- Spectrum files available via SDSS API and web interface

Planning 4 data releases in SDSS-IV:
- DR14: July 2017 (data through July 2016)
- DR15: July 2018 (data through July 2017; first APOGEE-S)
- DR16: July 2019 (data through July 2018)
- DR17: Dec 2020 (all data)

APOGEE software products
- apogeereduce: IDL reduction routines (apred and apstar)
- aspcap
  - speclib: management of spectral libraries, but not all input software (no stellar atmospheres code, limited spectral synthesis code)
  - ferre: F95 code to interpolate in libraries, find best fit
  - idlwrap: IDL code to manage ASPCAP processing
- apogeetarget: IDL code for targeting

APOGEE pipeline processing
- Software all installed and running on Utah servers
- Software already in pipeline form (few lines per full reduction step to distribute and complete among multiple machines/processors)
- Some need to improve distribution of knowledge and operation among team
- Some external data/software required for ASPCAP operation
  - Generation of stellar atmospheres (Kurucz and/or MARCS)
  - Generation of synthetic spectra (ASSET, but considering MOOG and TURBOSPECTRUM)

APOGEE software/personnel
- apogeereduce
  - developer: Nidever, Holtzman, (Nguyen)
  - operation: Holtzman, (Hayden, Nidever, Nguyen)
- ASPCAP
  - grids:
    - ASSET: Allende Prieto / Koesterke
    - Turbospec: Zamora, Garcia-Hernandez, Sobeck, Garcia-Perez, Holtzman
    - MOOG: Shetrone, Holtzman (pipeline), others
  - speclib postprocessing: Allende-Prieto, Holtzman
  - ferre: Allende Prieto
  - idlwrap: Holtzman, Garcia-Perez (Shane)
  - Operation: Holtzman (Shane, Shetrone)

END

Star level bitmasks
- Targeting flags
  - APOGEE_TARGET1, APOGEE_TARGET2: main survey vs ancillary, telluric, etc.
- STARFLAG: bitmask flagging potential conditions, e.g.
  - LOW_SNR
  - BAD_PIXELS
  - VERY_BRIGHT_NEIGHBOR
  - PERSIST_HIGH
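
Since each named condition occupies one bit of an integer, a bitmask value can be turned back into flag names with a small lookup. The Python sketch below shows the idea; the bit positions in `EXAMPLE_STARFLAG_BITS` are illustrative placeholders, not confirmed by the slides, and the real assignments should be taken from the SDSS datamodel. The function names are hypothetical.

```python
# Placeholder bit positions for illustration only; check the SDSS datamodel
# for the actual STARFLAG bit assignments before using this on real data.
EXAMPLE_STARFLAG_BITS = {
    "BAD_PIXELS": 0,
    "VERY_BRIGHT_NEIGHBOR": 3,
    "LOW_SNR": 4,
    "PERSIST_HIGH": 9,
}


def decode_flags(value, bit_names):
    """Return the names of all bits that are set in an integer bitmask."""
    return [name for name, bit in bit_names.items() if value & (1 << bit)]


def has_any(value, names, bit_names):
    """True if any of the named bits is set in the bitmask."""
    mask = 0
    for name in names:
        mask |= 1 << bit_names[name]
    return (value & mask) != 0


flag = (1 << 4) | (1 << 9)  # a made-up STARFLAG value
print(decode_flags(flag, EXAMPLE_STARFLAG_BITS))
print(has_any(flag, ["BAD_PIXELS"], EXAMPLE_STARFLAG_BITS))
```

Keeping the name-to-bit table separate from the decoding logic means the same two functions work for any of the bitmasks once the correct table is supplied.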

Data quality/issues: ASPCAP
- Current ASPCAP runs are fits for 6 parameters: Teff, log g, [M/H], [alpha/M], [C/M], [N/M]
- Teff, log g, [M/H], and [alpha/M] have been calibrated using observations of clusters: systematic corrections have been applied to these parameters, and are nonzero for Teff, log g, and [M/H]
- Results for [C/M] and [N/M] are more challenging to verify, and are more suspect
- In flat files, PARAM (calibrated parameters) vs FPARAM (fit parameters)
- In CAS database, TEFF, LOGG, METALS, ALPHAFE (calibrated) vs FIT_TEFF, FIT_LOGG, FIT_METALS, FIT_ALPHAFE (fit)
- Key catalog bitmasks
  - ASPCAP_FLAG: bitmask flagging potential conditions, e.g.,
    - STAR_BAD
    - STAR_WARN
  - PARAMFLAG: details about nature of ASPCAP_FLAG bits

Scope of Data
- DR10: Data taken from April 2011 through July 2012
  - First year survey data
  - all observed spectra, even if all visits not complete: summed spectra of what is available
  - release spectra and ASPCAP results
  - Commissioning data (through June 2011): degraded LSF (especially red chip). No ASPCAP
  - 170 fields (includes a few commissioning-only fields)
  - 710 plates (+ sky frames + calibration frames/monitors)
  - 40-50K stars
- Looking past DR10
  - 250+ fields available as of May, currently being combined
  - Plan to have DR10-level reductions of all year 2 data around time of DR10 release

Data access: flat files
- SAS: flat files
- Datamodel: http://data.sdss3.org/datamodel/
  - APOGEE_TARGET: targeting files include all _possible_ targets as well as selected ones
  - APOGEE_DATA: raw data cubes
  - APOGEE_REDUX: reduced data
- APOGEE_REDUX: currently corresponds to http://data.sdss3.org/sas/bosswork/apogee/spectro/redux/
  - Embedded web pages provide a guide and some static plots
- Versions / organization
  - Identify via apred_version/apstar_version/aspcap_version/results_version
  - apred_version: contains visit files (apVisit) organized by plate/MJD
  - apstar_version: contains combined star files, organized by field location
  - aspcap_version: raw ASPCAP results, organized by field location
  - results_version: adds ASPCAP calibrated results and sets some additional data quality bits
  - Current version is r3/s3/a3/v302; DR10 version likely to be v303?
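
The four-part version tag can be handled as a small structured value. The Python sketch below splits a tag such as r3/s3/a3/v302 into its components and builds a relative path to a summary file. It assumes the four versions map to nested directories under the redux root and that summary files sit in the innermost one, named after results_version; that layout is an assumption for illustration, and `ReduxVersion`, `parse_version`, and `summary_file_path` are hypothetical names.

```python
from collections import namedtuple

# The four version components named on the slide.
ReduxVersion = namedtuple(
    "ReduxVersion", ["apred", "apstar", "aspcap", "results"]
)


def parse_version(tag):
    """Split a tag like 'r3/s3/a3/v302' into its four components."""
    parts = tag.strip("/").split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 version components, got {len(parts)}")
    return ReduxVersion(*parts)


def summary_file_path(version, kind="allStar"):
    """Relative path to a summary file.

    Assumed layout (not confirmed by the slides):
    apred/apstar/aspcap/results/<kind>-<results>.fits
    """
    filename = f"{kind}-{version.results}.fits"
    return "/".join([*version, filename])


v = parse_version("r3/s3/a3/v302")
print(v.results)
print(summary_file_path(v))
```

Treating the tag as a tuple makes it straightforward to compare reductions that differ only in one stage, for example a new results_version on top of unchanged apred and apstar runs.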

Summary wrap-up files
- Main summary data files
  - allStar-v302.fits: catalog data for all DR10 stars
  - allVisit-v302.fits: catalog data for all DR10 visits
- These files are not overly large (~60000 star entries in allStar currently), so are really quite manageable
- Pay attention to bitmasks!

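    ; read the summary catalog (first FITS extension)
    allstar=mrdfits('allStar-v302.fits',1)
    ; skip stars with STAR_BAD (bit 23) or NO_ASPCAP_RESULT (bit 31) set in aspcapflag
    badbits=(2L^23 or 2L^31)
    ; keep only stars with neither bad bit set
    gd=where((allstar.aspcapflag and badbits) eq 0)
    plot,allstar[gd].teff,allstar[gd].logg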

    ; find giant binaries: large RV scatter, no bad ASPCAP bits, low surface gravity
    badbits=(2L^23 or 2L^31)
    gd=where(allstar.vscatter gt 1 and (allstar.aspcapflag and badbits) eq 0 and allstar.logg lt 3.8)
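
The same selection can be sketched outside IDL. The Python example below applies the giant-binary cut to a handful of made-up catalog rows held as dictionaries. The bit numbers (23 and 31) and the thresholds come from the slide; the rows, the `id` key, and the function names are hypothetical, and a real analysis would read the rows from the allStar file.

```python
STAR_BAD = 1 << 23          # bit 23, as given on the slide
NO_ASPCAP_RESULT = 1 << 31  # bit 31, as given on the slide
BADBITS = STAR_BAD | NO_ASPCAP_RESULT


def is_good(flag):
    """True if neither bad bit is set.

    The value is masked to 32 bits first so that a flag stored as a signed
    32-bit integer (negative when bit 31 is set) is handled the same way.
    """
    return ((flag & 0xFFFFFFFF) & BADBITS) == 0


def giant_binary_candidates(stars):
    """Rows with vscatter > 1, no bad ASPCAP bits, and logg < 3.8."""
    return [
        s for s in stars
        if s["vscatter"] > 1 and is_good(s["aspcapflag"]) and s["logg"] < 3.8
    ]


# Made-up rows for illustration; these are not real catalog entries.
stars = [
    {"id": "A", "vscatter": 2.5, "aspcapflag": 0, "logg": 2.4},
    {"id": "B", "vscatter": 3.0, "aspcapflag": 1 << 23, "logg": 2.0},
    {"id": "C", "vscatter": 0.2, "aspcapflag": 0, "logg": 1.5},
    {"id": "D", "vscatter": 4.0, "aspcapflag": -2147483648, "logg": 3.0},
    {"id": "E", "vscatter": 1.8, "aspcapflag": 0, "logg": 4.5},
]
print([s["id"] for s in giant_binary_candidates(stars)])  # ['A']
```

Row D shows why the sign matters: a 32-bit signed flag with bit 31 set reads as a negative number, yet it is still rejected.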

Data access: API
- Can get programmatic access to data via APOGEE API (soon)
- One particularly useful application: downloading subset of spectra
- Also basis for SAS web app: visual interface to spectra
- APOGEE API currently under development, available in next several months
- Database used by API is loaded, graphical spectrum access available via web app: https://spectra.sdss3.org:8100/

Data access: CAS
- Data from summary files (allStar, allVisit, allPlates) have been loaded into CAS (TESTDR10, currently restricted access) tables apogeePlate, apogeeStar, apogeeVisit, aspcapStar
- Example:

    SELECT TOP 10 p.star, p.ra, p.dec, p.glon, p.glat, p.vhelio_avg, p.vscatter,
        a.teff, a.logg, a.metals, v.vhelio
    FROM apogeeStar p
    JOIN aspcapStar a ON a.apstar_id = p.apstar_id
    JOIN apogeeVisit v ON a.star = v.star
    WHERE (a.aspcap_flag & dbo.fApogeeAspcapFlag('STAR_BAD')) = 0
        AND p.nvisits > 6
    ORDER BY a.star

Object search through CAS implemented in sky server

Abundances of cooler stars

Second instrument or first instrument relocation

Surface gravity issues: red clump vs red giant

Abundance analysis of faint bulge stars: RR Lyr and RC stars

Achieving distance distribution