summary and conclusions specified clustering and mahalanobis classification together provide a...

1
Summary and Conclusions Specified clustering and Mahalanobis classification together provide a useful way of combining several dimensions of multiwavelength optical information (e.g., Absorption Angstrom Exponent, Extinction Angstrom Exponent, Single Scattering Albedo, Real and Imaginary Refractive Index) to assign aerosols to classes (e.g., Urban-Industrial, Biomass-Burning, Mineral Dust, Asian Urban). The specified or “reference” clusters can be established using information (e.g., trajectories, accompanying trace gases, chemical analyses, prior studies) beyond the optical information that will be available in the general case. These reference clusters can then be used to test the skill of a classification algorithm. Applications of the technique to 2 AERONET data sets and aircraft-sampled aerosols yielded skill scores ranging from 87% to 100% (diagonal elements of the skill score matrix). A first step toward simulating Glory APS data, by using bimodal aerosol models with randomly generated errors based on Glory APS expected uncertainties, yielded skill scores of 71% to 99%, with 0% to 12% of simulated points assigned to an “unknown” class (defined as consisting of points with D Mahal >3 from all 4 reference clusters based on 4 AERONET sites and associated seasons). Next Steps Refine filtering rules for reference clusters and definition of “unknown” class to yield a robust classification algorithm with a sound physical basis and relatively high skill scores. Extend tests to more of the Cattrall et al. (2005) AERONET sites and retrieved parameters, to test parameter usefulness and the generality of aerosol type assignments. Improve fidelity of simulations to expected Glory APS data products and uncertainties. Extend input data sets to parameters from other sensors (e.g., CALIOP layer heights, OMI UV absorption). Identifying Aerosol Type from Space: Absorption Angstrom Exponent as a Foundation for Multidimensional Specified Clustering and Mahalanobis Classification P. Russell 1 , P. Hamill 2 , J. Livingston 3 , Y. Shinozuka 1,4 , A. Strawa 1 , J. Redemann 1,4 , A. Omar 5 , A. Clarke 6 , R. Bergstrom 1,4 , B. Holben 7 , R. Ferrare 5 , S. Burton 5 1 NASA Ames Research Center, Moffett Field, CA ([email protected]), 2 San Jose State University, San Jose, CA, USA, 3 SRI International, Menlo Park, CA, 4 B ay Area Environmental Research Institute, Sonoma, CA, 5 NASA Langley Research Center, Hampton, VA, USA, 6 SOEST, University of Hawaii, Honolulu, HI, USA, 7 NASA Goddard Space Flight Center, Greenbelt, MD, USA Resulting clusters (10,000 points each) are shown in Fig. 4.4B with boundaries from the Cattrall-site clusters in Fig. 4.4A. Mahalanobis classification using those boundaries produces the skill scores shown in Fig. 4.4B. Points classified as “unknown” have D >3 (P(D )<1%) for all 4 Cattrall-based clusters. Specified clustering (e.g., Moussiades and Vakali, 2009) uses “a priori” information in a “reference” data set to assign points to clusters. This a priori information can include information (e.g., trajectory or chemical analyses or previous studies) beyond the optical parameters that will be available to the classification method in the general case. In Fig. 2D we assigned points to clusters (symbol colors) using the aerosol type designations of Dubovik et al. (2002); this is an example of specified clustering. Mahalanobis classification (AI Access, 2010; Wikipedia, 2010) assigns any given N- dimensional point (x 1 ,x 2 ,…,x N ) T to the cluster that has minimum Mahalanobis distance, D M , from that point. For purposes of defining D M , a cluster is defined by its mean ( 1 , 2 ,…, N ) T and its covariance matrix S. In N dimensions, (1) where the elements of S are given by (2) i is the standard deviation of x i for all points in the cluster, and ij is the correlation coefficient of x i and x j . 4. Application to example data sets (cont’d) 4.2 Aircraft-sampled aerosols, Clarke et al. (2007) 4.3 AERONET Version 2, building on Cattrall et al. (2005) Abstract Determining either aerosol composition or multiwavelength absorption from space is difficult at best, but recent research on many fronts has improved prospects for success. Results from diverse air, ground, and laboratory studies using both radiometric and in situ techniques show that the fractions of black carbon, organic matter, and mineral dust in atmospheric aerosols affect the wavelength dependence of absorption (often expressed as an Absorption Angstrom Exponent, or AAE). Recent results include analyses of the Dubovik et al. (2002) set of Aerosol Robotic Network (AERONET) retrievals from Sun-sky measurements describing full aerosol vertical columns. AAE values in this set are strongly correlated with aerosol composition or type. Specifically, AAE values are near 1 (the theoretical value for black carbon) for AERONET-measured aerosol columns dominated by urban-industrial aerosol, larger (though partially overlapping) for biomass burning aerosols, and largest for Sahara dust aerosols. These AERONET results are consistent with results from other, very different, techniques, including solar flux-aerosol optical depth (AOD) analyses and airborne in situ analyses examined in this presentation, as well as many other previous results. Although AAE is therefore a useful tool for helping to distinguish aerosol types, it cannot unambiguously distinguish urban-industrial from biomass burning aerosols, even when supplemented by measurements of Extinction Angstrom Exponent (EAE). Hence there is a need to add information from other remotely sensible properties to improve remote identification of aerosol type. Specified clustering, combined with Mahalanobis classification, provides an objective way of using multiple dimensions of data for this purpose. We demonstrate the application of this technique (previously used with High Spectral Resolution Lidar data) to (1) the Dubovik (2002) AERONET data set, (2) an in situ data set, and (3) a larger Version 2 AERONET data set. Results show that combining AAE and EAE with variables such as real and/or imaginary refractive index (RRI, IRI) or single scattering albedo (SSA) can improve separation of urban-industrial from biomass burning aerosols. The soon-to-be-launched Glory Aerosol Polarimetry Sensor (APS) is expected to produce data sets amenable to this aerosol classification technique, especially when combined with OMI aerosol absorption measurements at shorter wavelengths and CALIPSO measurements of aerosol height, to reduce height-absorption aliasing. 2. Previous results relating absorption spectra & aerosol composition 3. Multidimensional specified clustering and Mahalanobis classification (e.g., Ferrare et al., 2010) . Background and goal References AI Access, Mahalanobis distance, http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_mahalanobis.htm , 2010. Bergstrom, R. W., et al., Spectral absorption properties of atmospheric aerosols, Atmos. Chem. Phys., 7, 5937-5943, 2007. Bergstrom, R. W., et al., Aerosol spectral absorption in the Mexico City area: results from airborne measurements during MILAGRO/INTEX B, Atmos. Chem. Phys., 10, 6333-6343, 2010. Cattrall, C., et al., Variability of aerosol and spectral lidar and backscatter and extinction ratios of key aerosol types derived from selected Aerosol Robotic Network locations, J. Geophys. Res., 110, D10S11,doi:10.1029/2004JD005124, 2005. Clarke, A., et al., Biomass burning and pollution aerosol over North America: Organic components and their influence on spectral optical properties and humidification response, J. Geophys. Res., 112, D12S18, doi:10.1029/2006JD007777, 2007. Dubovik, O., et al.: Variability of absorption and optical properties of key aerosol types observed in worldwide locations, J. Atmos. Sci., 59, 590–608, 2002. Ferrare, R., C. Hostetler, S. Burton et al., Aerosol classification of airborne High Spectral Resolution Lidar measurements, ms in preparation, [email protected], 2010. Miller, S. D., A consolidated technique for enhancing desert dust storms with MODIS, Geophys. Res. Lett., 30(20), 2071, doi:10.1029/2003GL018279, 2003. Mishchenko, M., et al.: Accurate monitoring of terrestrial aerosols and total solar irradiance, B. Am. Meteorol. Soc., 88(5), 677– 691, 2007. Moussiades , L., and A. Vakali, Benchmark graphs for the evaluation of clustering algorithms , 3 rd Int’l Conf. Research Challenges in Information Science, Fez, Morocco, DOI 10.1109/RCIS.2009.5089283, 22-24 April 2009. Remer, L. A., et al. (2005), The MODIS aerosol algorithm, products, and validation, J. Atmos. Sci., 62, 947–973, doi:10.1175/JAS3385.1 Russell, P. B., et al., Absorption Angstrom Exponent in AERONET and related data as an indicator of aerosol composition, Atmos. 1D. Urban-industrial pollution? 1C. Sahara dust, NW Africa & Canary Islands Viewed from MODIS on Terra 24 Jul 2003 Atlantic coast of N America 1E. Smoke from Alaska wildfires MODIS on Terra 19 Jul 2004 MODIS on Aqua 21 Jul 2004 Sometimes aerosol type in imagery from space can be identified by tracing the aerosol back to its source (e.g., 1A, 1B). In other cases (e.g., 1C) the apparent color (absorption spectrum) distinguishes mineral dust aerosol from other aerosol types. In still other cases (e.g., 1D) it is tempting to guess aerosol type based on aerosol location. However, this can lead to errors, as exemplified by 1E, in which Alaskan wildfire smoke, carried down the Mississippi Valley, along the Gulf Coast and up the Atlantic seaboard, caused a haze layer off New England. The goal of this research is to develop robust methods for identifying aerosol type from the optical information retrievable from an individual image pixel. To test methods we use optical parameters similar to those expected to be retrieved from the Glory Aerosol Polarimetry Sensor (APS, 1F). Bergstrom et al. (2007) used airborne measurements of solar flux and AOD spectra to show (Fig. 2A) that (1) mineral dust, urban-industrial, and biomass burning aerosols have distinct spectra of SSA, (2) the different shapes of SSA spectra often convert to power-law spectra of absorption optical depth, with near-constant exponent (Absorption Angstrom Exponent, or AAE), and (3) AAE for urban-industrial aerosols is often near 1 (the value for black carbon), whereas AAE values are larger for biomass burning aerosols and largest for desert dust. These results are similar to those (Fig. 2B) obtained by Shinozuka et al. (2009) using very different analysis techniques on aircraft-sampled aerosols. Russell et al. (2010, Fig. 2C) showed that the AERONET data set of Dubovik et al. (2002), describing full aerosol vertical columns, had similar connections between AAE and aerosol type, albeit with some overlap between AAE values for urban-industrial and biomass-burning sites. They showed further that combining AAE with Extinction Angstrom Exponent (EAE) in a 2-dimensional plot (Fig. 2D left frame) still yielded partial overlap between urban-industrial (UrbInd) and biomass-burning (BioBurn) clusters, raising the question of whether the clusters could be separated by using other parameters from the AERONET retrieval set in analyses of higher dimension. ) ( ) ( ) ( 1 x S x x D T M ) ( ) ( ) ( 1 x S x x D T M ; j i ij ij S In 2 dimensions (1) reduces to (3) Notice the similarity of (3) to the exponent of the bivariate normal probability distribution, (4) Comparing (3) and (4) yields (5) Curves of constant D M are ellipses. The integral of f(x,y) over such an ellipse is the probability P(D M ) that a random point from the cluster lies within the ellipse (provided points are bivariate-normally distributed). Fig. 2D shows examples of such ellipses. (6) over an ellipse of constant D M dxdy y x f D P M ) , ( ) ( The Mahalanobis probability ellipses in Fig. 2D show that the 2-D set RRI,AAE (right frame), obtained by combining Real Refractive Index (RRI) with AAE, provides better separation of clusters than the 2-D set EAE,AAE (left frame). One measure of success in cluster separation is the skill score matrix, examples of which are shown in Fig. 2D. Elements are defined as (7) where N Sk is the number of points in the specified (Dubovik) class k and N Mkl is the number of points from Dubovik class k that Mahalanobis classification assigns to Dubovik class l. The 3-D set EAE,AAE,RRI (not shown) provides even better separation, as measured by larger 1F. The A-Train with Glory , ) ( ) ( ) ( 1 μ x μ x x S D T M . ) )( ( 2 ) ( ) ( ) 1 ( 2 1 exp 1 2 1 ) , ( 2 2 2 2 2 2 y x y x y y x x y x y x y x y x f . 2 exp 1 2 1 ) , ( 2 2 M y x D y x f 1A. Southern California wildfires, 26 Oct 2003 Viewed from MISR on Terra 4. Application to example data sets 4.1 AERONET Version 1, Dubovik et al. (2002) . ) )( ( 2 ) ( ) ( ) 1 ( 1 ) ( 2 2 2 2 2 y x y x y y x x M y x y x D x overlap and, in most cases, smaller 2-D Mahalanobis skill scores than AAE,SAE. Combining AAE, SAE, & SSA in a 3-D Mahalanobis classification produces 3-D skill scores (inset matrix) smaller than AAE,SAE scores in most cases, signaling that caution must be used when adding dimensions. (Note, however, that SSA at a shorter wavelength may provide better separation—See Figs. 2A & 2C.) Note also that, since the Clarke (2007) data are for dry aerosol, ambient RH will shift positions of Pollution, BioBurn, and Dust points differently, affecting their relationship to AERONET results. 4.2A. 2-D Mahalanobis classification of Clarke data: AAE,SAE 4.2B. 2-D Mahalanobis classification of Clarke data: AAE,SSA Cattrall et al. (2005) analyzed AERONET data for 26 sites/seasons, which they designated as urban/industrial (here abbreviated UrbInd), biomass burning (BioBurn), SE Asian (AsiaUrb), dust, and maritime. We chose the sites/seasons in Table 4.3 to explore Mahalanobis classification, using AERONET Version 2 data for the years shown. 2-D (AAE,SAE) Mahalanobis Skill Scores Mahalanob is Clarke designation designati on Poll Dust BioBurn Poll: o 367 90% 0 0% 13 8% Dust:+ 1 0% 18 95% 6 4% BioBurn:x 41 10% 1 5% 149 89% Total 409 100% 19 100% 168 100% 3-D (AAE,SAE,SSA) Mahalanobis Skill Scores Mahalanobi s Clarke designation designatio n Poll Dust BioBurn Poll: o 357 87% 0 0% 10 6% Dust:+ 0 0% 17 90% 1 1% BioBurn:x 52 13% 2 11% 157 94% Total 409 100% 19 100% 168 100% 2-D (AAE,SSA) Mahalanobis Skill Scores Mahalanobi s Clarke designation designatio n Poll Dust BioBurn Poll: o 321 78% 0 0% 13 8% Dust:+ 1 0% 14 74% 4 2% BioBurn:x 87 21% 5 26% 151 90% Total 409 100% 19 100% 168 100% designations as a reference set (i.e., 3 specified clusters) and tested how well Mahalanobis classification assigns points to the Clarke clusters. Fig. 4.2B shows that using AAE,SSA produces more cluster 2D. Different 2-D plots of Dubovik (2002) AERONET full-column results, showing different separations of clusters 300 700 1100 300 500 1100 Wavelength, nm 0.1 0.01 0.001 1.0 0.9 0.8 0.7 0.6 0.5 1 0.1 0.01 0.001 300 500 700 1100 1700 Wavelength, nm 300 500 700 900 1100 1300 1700 Wavelength, nm 2-D (AAE,RRI) Mahalanobis Skill Scores Mahalanobi s Dubovik designation designatio n UrbInd Dust BBurn UrbInd: o 4 100% 0 0% 0 0% Dust: + 0 0% 3 100% 0 0% BBurn: x 0 0% 0 0% 4 100% Total 4 100% 3 100% 4 100% 2-D (AAE,EAE) Mahalanobis Skill Scores Mahalanobi s Dubovik designation designatio n UrbInd Dust BBurn UrbInd: o 4 100% 0 0% 1 25% Dust: + 0 0% 3 100% 0 0% BBurn: x 0 0% 0 0% 3 75% Total 4 100% 3 100% 4 100% Boundary: D Mahal to Dust = D Mahal to AsiaUrb Boundary: D Mahal to UrbInd = D Mahal to BioBurn Boundary: D Mahal to BioBurn = D Mahal to AsiaUrb Sk Mkl kl N N SkillScore / % 100 Site Name Coun- try Cat- trall Desig - natio n Mont hs Years N meas Raw Fil- tere d GSFC USA Urb- Ind Jun- Sep 1992- 2009 817 802 Mongu Zam- bia Bio- Burn Aug- Nov 1995- 2008 1513 1490 Senanga Zam- bia Bio- Burn Aug- Nov 1995- 2008 1431 Beijing China Asia- Urb Jun- Feb 2000- 2009 1538 1489 Bahrain Bah- rain Dust Mar- Jul 2004- 2006 136 136 Solar Village Saudi Arabi a Dust Mar- Jul 1999- 2008 1573 1513 4.3A. 2-D Mahalanobis classification for 4 Cattrall sites Solar Village (Dust) Beijing (AsiaUrb) Mongu (BioBurn) GSFC (UrbInd) D M of Bahrain data point to specified cluster 5-D Mahalanobis Skill Scores Mahalanob is Cattrall designat ion (Bahrain ) designati on Dust UrbInd 0 0% Dust 121 89% BioBurn 0 0% AsiaUrb 15 11% Total 136 100% 4.3B. 5-D Mahalanobis classification of Bahrain data 5 Dimensions: EAE (440, 870 nm), AAE (440, 870), SSA (441), RRI (441), IRI (441) 4 Specified Sites (Types): . Mongu (BioBurn) . GSFC (UrbInd) . Beijing (AsiaUrb) . Solar Village (Dust) Models used in MODIS retrievals combine a fine mode and a coarse mode (e.g., Remer et al., 2005). Diamonds in Fig. 4.4A show SSA,EAE coordinates of fine and coarse modes for the MODIS over-ocean set (modes 1-9), augmented with increased absorption (modes 10-18) and adjusted size (modes 19, 20). Lines connecting a fine and a coarse mode show coordinates of weighted combinations of those modes. Together the lines cover the SSA,EAE space of the AERONET clusters from the 4 Cattrall sites in Fig. 4.3A (replotted in Fig. 4.4A). To investigate effects of the Glory APS retrieval uncertainties in Fig. 1F, we started with an augmented MODIS bimodal model (point on a line in Fig. 4.4A) approximating the mean of each cluster, normalized to yield AOD(550 nm)=0.5. We then added random uncertainties with standard 4.4 Aerosol models as a path to simulation of Glory APS products For each site, we filtered points to exclude outliers (here defined as points with D M >3). Boundaries in Fig. 4.3A trace the locus points having equal Mahalanobis distance to 2 sites. In Fig. 4.3B, 5-dimensional Mahalanobis classification assigns 89% of Bahrain points to the Solar Village (Dust) cluster and 11% to the Beijing (AsiaUrb) cluster. 4.4B. Simulated measurements and classifications 4.4A. Bimodal models, with clusters Years after 1 January 2004 Clarke et al. (2007) used analyses of simultaneously-sampled trace gases and other information to assign aircraft-sampled dry aerosols to the 3 types shown in Fig. 4.2A. We used the Clarke data and Table 4.3. AERONET sites and months from Cattrall et al. (2005) selected to explore Mahalanobis classification A11E-0091 1B. Alaskan wildfires, 2004 Viewed from MODIS on Terra 1.00 0.95 0.90 0.85 0.80 0.75 AAOD = K -AAE

Upload: charleen-lawrence

Post on 31-Dec-2015

223 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Summary and Conclusions  Specified clustering and Mahalanobis classification together provide a useful way of combining several dimensions of multiwavelength

Summary and ConclusionsSpecified clustering and Mahalanobis classification together provide a useful way of combining several dimensions of multiwavelength optical information (e.g., Absorption Angstrom Exponent, Extinction Angstrom Exponent, Single Scattering Albedo, Real and Imaginary Refractive Index) to assign aerosols to classes (e.g., Urban-Industrial, Biomass-Burning, Mineral Dust, Asian Urban).The specified or “reference” clusters can be established using information (e.g., trajectories, accompanying trace gases, chemical analyses, prior studies) beyond the optical information that will be available in the general case. These reference clusters can then be used to test the skill of a classification algorithm.Applications of the technique to 2 AERONET data sets and aircraft-sampled aerosols yielded skill scores ranging from 87% to 100% (diagonal elements of the skill score matrix).A first step toward simulating Glory APS data, by using bimodal aerosol models with randomly generated errors based on Glory APS expected uncertainties, yielded skill scores of 71% to 99%, with 0% to 12% of simulated points assigned to an “unknown” class (defined as consisting of points with DMahal>3 from all 4 reference clusters based on 4 AERONET sites and associated seasons).

Next StepsRefine filtering rules for reference clusters and definition of “unknown” class to yield a robust classification algorithm with a sound physical basis and relatively high skill scores.Extend tests to more of the Cattrall et al. (2005) AERONET sites and retrieved parameters, to test parameter usefulness and the generality of aerosol type assignments.Improve fidelity of simulations to expected Glory APS data products and uncertainties.Extend input data sets to parameters from other sensors (e.g., CALIOP layer heights, OMI UV absorption).

Identifying Aerosol Type from Space: Absorption Angstrom Exponent as a Foundation for Multidimensional Specified Clustering and Mahalanobis Classification

P. Russell1, P. Hamill 2, J. Livingston3, Y. Shinozuka1,4, A. Strawa1, J. Redemann1,4, A. Omar5, A. Clarke6, R. Bergstrom1,4, B. Holben7, R. Ferrare5, S. Burton5

1NASA Ames Research Center, Moffett Field, CA ([email protected]), 2San Jose State University, San Jose, CA, USA, 3SRI International, Menlo Park, CA, 4B ay Area Environmental Research Institute, Sonoma, CA,

5NASA Langley Research Center, Hampton, VA, USA, 6SOEST, University of Hawaii, Honolulu, HI, USA, 7NASA Goddard Space Flight Center, Greenbelt, MD, USA

Resulting clusters (10,000 points each) are shown in Fig. 4.4B with boundaries from the Cattrall-site clusters in Fig. 4.4A. Mahalanobis classification using those boundaries produces the skill scores shown in Fig. 4.4B. Points classified as “unknown” have DM>3 (P(DM)<1%) for all 4 Cattrall-based clusters.

Specified clustering (e.g., Moussiades and Vakali, 2009) uses “a priori” information in a “reference” data set to assign points to clusters. This a priori information can include information (e.g., trajectory or chemical analyses or previous studies) beyond the optical parameters that will be available to the classification method in the general case. In Fig. 2D we assigned points to clusters (symbol colors) using the aerosol type designations of Dubovik et al. (2002); this is an example of specified clustering. Mahalanobis classification (AI Access, 2010; Wikipedia, 2010) assigns any given N-dimensional point (x1,x2,…,xN )T to the cluster that has minimum Mahalanobis distance, DM, from that point. For purposes of defining DM, a cluster is defined by its mean (1,2,…,N )T and its covariance matrix S. In N dimensions,

(1)

where the elements of S are given by(2)

i is the standard deviation of xi for all points in the cluster, and ij is the correlation coefficient of xi and xj.

4. Application to example data sets (cont’d)4.2 Aircraft-sampled aerosols, Clarke et al. (2007)

4.3 AERONET Version 2, building on Cattrall et al. (2005)

AbstractDetermining either aerosol composition or multiwavelength absorption from space is difficult at best, but recent research on many fronts has improved prospects for success. Results from diverse air, ground, and laboratory studies using both radiometric and in situ techniques show that the fractions of black carbon, organic matter, and mineral dust in atmospheric aerosols affect the wavelength dependence of absorption (often expressed as an Absorption Angstrom Exponent, or AAE). Recent results include analyses of the Dubovik et al. (2002) set of Aerosol Robotic Network (AERONET) retrievals from Sun-sky measurements describing full aerosol vertical columns. AAE values in this set are strongly correlated with aerosol composition or type. Specifically, AAE values are near 1 (the theoretical value for black carbon) for AERONET-measured aerosol columns dominated by urban-industrial aerosol, larger (though partially overlapping) for biomass burning aerosols, and largest for Sahara dust aerosols. These AERONET results are consistent with results from other, very different, techniques, including solar flux-aerosol optical depth (AOD) analyses and airborne in situ analyses examined in this presentation, as well as many other previous results.Although AAE is therefore a useful tool for helping to distinguish aerosol types, it cannot unambiguously distinguish urban-industrial from biomass burning aerosols, even when supplemented by measurements of Extinction Angstrom Exponent (EAE). Hence there is a need to add information from other remotely sensible properties to improve remote identification of aerosol type. Specified clustering, combined with Mahalanobis classification, provides an objective way of using multiple dimensions of data for this purpose. We demonstrate the application of this technique (previously used with High Spectral Resolution Lidar data) to (1) the Dubovik (2002) AERONET data set, (2) an in situ data set, and (3) a larger Version 2 AERONET data set. Results show that combining AAE and EAE with variables such as real and/or imaginary refractive index (RRI, IRI) or single scattering albedo (SSA) can improve separation of urban-industrial from biomass burning aerosols. The soon-to-be-launched Glory Aerosol Polarimetry Sensor (APS) is expected to produce data sets amenable to this aerosol classification technique, especially when combined with OMI aerosol absorption measurements at shorter wavelengths and CALIPSO measurements of aerosol height, to reduce height-absorption aliasing.

2. Previous results relating absorption spectra & aerosol composition

3. Multidimensional specified clustering and Mahalanobis classification (e.g., Ferrare et al., 2010)

1. Background and goal

ReferencesAI Access, Mahalanobis distance, http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_mahalanobis.htm , 2010.Bergstrom, R. W., et al., Spectral absorption properties of atmospheric aerosols, Atmos. Chem. Phys., 7, 5937-5943, 2007.Bergstrom, R. W., et al., Aerosol spectral absorption in the Mexico City area: results from airborne measurements during MILAGRO/INTEX B, Atmos. Chem. Phys., 10, 6333-

6343, 2010.Cattrall, C., et al., Variability of aerosol and spectral lidar and backscatter and extinction ratios of key aerosol types derived from selected Aerosol Robotic Network locations,

J. Geophys. Res., 110, D10S11,doi:10.1029/2004JD005124, 2005.Clarke, A., et al., Biomass burning and pollution aerosol over North America: Organic components and their influence on spectral optical properties and humidification

response, J. Geophys. Res., 112, D12S18, doi:10.1029/2006JD007777, 2007.Dubovik, O., et al.: Variability of absorption and optical properties of key aerosol types observed in worldwide locations, J. Atmos. Sci., 59, 590–608, 2002.Ferrare, R., C. Hostetler, S. Burton et al., Aerosol classification of airborne High Spectral Resolution Lidar measurements, ms in preparation, [email protected],

2010.Miller, S. D., A consolidated technique for enhancing desert dust storms with MODIS, Geophys. Res. Lett., 30(20), 2071, doi:10.1029/2003GL018279, 2003.Mishchenko, M., et al.: Accurate monitoring of terrestrial aerosols and total solar irradiance, B. Am. Meteorol. Soc., 88(5), 677–691, 2007.Moussiades , L., and A. Vakali, Benchmark graphs for the evaluation of clustering algorithms , 3 rd Int’l Conf. Research Challenges in Information Science, Fez, Morocco, DOI

10.1109/RCIS.2009.5089283, 22-24 April 2009.Remer, L. A., et al. (2005), The MODIS aerosol algorithm, products, and validation, J. Atmos. Sci., 62, 947–973, doi:10.1175/JAS3385.1Russell, P. B., et al., Absorption Angstrom Exponent in AERONET and related data as an indicator of aerosol composition, Atmos. Chem. Phys., 10, 1155-1169, 2010.Shinozuka, Y., et al.: Aerosol optical properties relevant to regional remote sensing of CCN activity and links to their organic mass fraction: airborne observations over

Central Mexico and the US West Coast during MILAGRO/INTEX-B, Atmos. Chem. Phys., 9, 6727–6742, 2009.Wikipedia, Mahalanobis distance, http://en.wikipedia.org/wiki/Mahalanobis_distance , 2010.

1D. Urban-industrial pollution?

1C. Sahara dust, NW Africa & Canary IslandsViewed from MODIS on Terra

24 Jul 2003

Atlantic coast of N America

1E. Smoke from Alaska wildfires

MODIS on Terra19 Jul 2004

MODIS on Aqua21 Jul 2004

Sometimes aerosol type in imagery from space can be identified by tracing the aerosol back to its source (e.g., 1A, 1B). In other cases (e.g., 1C) the apparent color (absorption spectrum) distinguishes mineral dust aerosol from other aerosol types. In still other cases (e.g., 1D) it is tempting to guess aerosol type based on aerosol location. However, this can lead to errors, as exemplified by 1E, in which Alaskan wildfire smoke, carried down the Mississippi Valley, along the Gulf Coast and up the Atlantic seaboard, caused a haze layer off New England. The goal of this research is to develop robust methods for identifying aerosol type from the optical information retrievable from an individual image pixel. To test methods we use optical parameters similar to those expected to be retrieved from the Glory Aerosol Polarimetry Sensor (APS, 1F).

Bergstrom et al. (2007) used airborne measurements of solar flux and AOD spectra to show (Fig. 2A) that (1) mineral dust, urban-industrial, and biomass burning aerosols have distinct spectra of SSA, (2) the different shapes of SSA spectra often convert to power-law spectra of absorption optical depth, with near-constant exponent (Absorption Angstrom Exponent, or AAE), and (3) AAE for urban-industrial aerosols is often near 1 (the value for black carbon), whereas AAE values are larger for biomass burning aerosols and largest for desert dust. These results are similar to those (Fig. 2B) obtained by Shinozuka et al. (2009) using very different analysis techniques on aircraft-sampled aerosols. Russell et al. (2010, Fig. 2C) showed that the AERONET data set of Dubovik et al. (2002), describing full aerosol vertical columns, had similar connections between AAE and aerosol type, albeit with some overlap between AAE values for urban-industrial and biomass-burning sites. They showed further that combining AAE with Extinction Angstrom Exponent (EAE) in a 2-dimensional plot (Fig. 2D left frame) still yielded partial overlap between urban-industrial (UrbInd) and biomass-burning (BioBurn) clusters, raising the question of whether the clusters could be separated by using other parameters from the AERONET retrieval set in analyses of higher dimension.

)()()( 1 xSxxD TM )()()( 1 xSxxD TM

;jiijijS

In 2 dimensions (1) reduces to

(3)

Notice the similarity of (3) to the exponent of the bivariate normal probability distribution,

(4)

Comparing (3) and (4) yields

(5)

Curves of constant DM are ellipses. The integral of f(x,y) over such an ellipse is the probability P(DM) that a random point from the cluster lies within the ellipse (provided points are bivariate-normally distributed). Fig. 2D shows examples of such ellipses.

(6)

over an ellipse of constant DM

dxdyyxfDP M ),()(

The Mahalanobis probability ellipses in Fig. 2D show that the 2-D set RRI,AAE (right frame), obtained by combining Real Refractive Index (RRI) with AAE, provides better separation of clusters than the 2-D set EAE,AAE (left frame). One measure of success in cluster separation is the skill score matrix, examples of which are shown in Fig. 2D. Elements are defined as

(7)where NSk is the number of points in the specified (Dubovik) class k and NMkl is the number of points from Dubovik class k that Mahalanobis classification assigns to Dubovik class l. The 3-D set EAE,AAE,RRI (not shown) provides even better separation, as measured by larger differentiation in Mahalanobis distances than achieved with either 2D set.

1F. The A-Train with Glory

,)()()( 1 μxμxx SD TM

.))((2)()(

)1(2

1exp

12

1),( 2

2

2

2

22

yx

yx

y

y

x

x

yx

yxyxyxf

.2exp12

1),(

2

2

M

yx

Dyxf

1A. Southern California wildfires, 26 Oct 2003

Viewed from MISR on Terra

4. Application to example data sets4.1 AERONET Version 1, Dubovik et al. (2002)

.))((2)()(

)1(

1)( 2

2

2

2

2

yx

yx

y

y

x

xM

yxyxD

x

overlap and, in most cases, smaller 2-D Mahalanobis skill scores than AAE,SAE. Combining AAE, SAE, & SSA in a 3-D Mahalanobis classification produces 3-D skill scores (inset matrix) smaller than AAE,SAE scores in most cases, signaling that caution must be used when adding dimensions. (Note, however, that SSA at a shorter wavelength may provide better separation—See Figs. 2A & 2C.)

Note also that, since the Clarke (2007) data are for dry aerosol, ambient RH will shift positions of Pollution, BioBurn, and Dust points differently, affecting their relationship to AERONET results.

4.2A. 2-D Mahalanobis classification of Clarke data: AAE,SAE 4.2B. 2-D Mahalanobis classification of Clarke data: AAE,SSA

Cattrall et al. (2005) analyzed AERONET data for 26 sites/seasons, which they designated as urban/industrial (here abbreviated UrbInd), biomass burning (BioBurn), SE Asian (AsiaUrb), dust, and maritime. We chose the sites/seasons in Table 4.3 to explore Mahalanobis classification, using AERONET Version 2 data for the years shown.

2-D (AAE,SAE) Mahalanobis Skill ScoresMahalanobis Clarke designation

designation Poll Dust BioBurnPoll: o 367 90% 0 0% 13 8%

Dust:+ 1 0% 18 95% 6 4%BioBurn:x 41 10% 1 5% 149 89%

Total 409 100% 19 100% 168 100%

3-D (AAE,SAE,SSA) Mahalanobis Skill ScoresMahalanobis Clarke designation

designation Poll Dust BioBurnPoll: o 357 87% 0 0% 10 6%

Dust:+ 0 0% 17 90% 1 1%BioBurn:x 52 13% 2 11% 157 94%

Total 409 100% 19 100% 168 100%

2-D (AAE,SSA) Mahalanobis Skill ScoresMahalanobis Clarke designation

designation Poll Dust BioBurnPoll: o 321 78% 0 0% 13 8%

Dust:+ 1 0% 14 74% 4 2%BioBurn:x 87 21% 5 26% 151 90%

Total 409 100% 19 100% 168 100%

designations as a reference set (i.e., 3 specified clusters) and tested how well Mahalanobis classification assigns points to the Clarke clusters. Fig. 4.2B shows that using AAE,SSA produces more cluster

2D. Different 2-D plots of Dubovik (2002) AERONET full-column results, showing different separations of clusters

300 700 1100300 500 1100 Wavelength, nm

0.1

0.01

0.001

1.0

0.9

0.8

0.7

0.6

0.5

1

0.1

0.01

0.001300 500 700 1100 1700 Wavelength, nm300 500 700 900 1100 1300 1700

Wavelength, nm

2-D (AAE,RRI) Mahalanobis Skill ScoresMahalanobis Dubovik designationdesignation UrbInd Dust BBurnUrbInd: o 4 100% 0 0% 0 0%

Dust: + 0 0% 3 100% 0 0%BBurn: x 0 0% 0 0% 4 100%

Total 4 100% 3 100% 4 100%

2-D (AAE,EAE) Mahalanobis Skill ScoresMahalanobis Dubovik designationdesignation UrbInd Dust BBurnUrbInd: o 4 100% 0 0% 1 25%

Dust: + 0 0% 3 100% 0 0%BBurn: x 0 0% 0 0% 3 75%

Total 4 100% 3 100% 4 100%Boundary:

DMahal to Dust =DMahal to AsiaUrb

Boundary:DMahal to UrbInd =DMahal to BioBurn

Boundary:DMahal to BioBurn =DMahal to AsiaUrb

SkMklkl NNSkillScore /%100

Site Name Coun-try

Cat-trall

Desig-nation

Months

Years Nmeas

Raw Fil-tered

GSFC USA Urb-Ind

Jun-Sep

1992-2009

817 802

Mongu Zam-bia

Bio-Burn

Aug-Nov

1995-2008

1513 1490

Senanga Zam-bia

Bio-Burn

Aug-Nov

1995-2008

1431

Beijing China Asia-Urb

Jun-Feb

2000-2009

1538 1489

Bahrain Bah-rain

Dust Mar-Jul

2004-2006

136 136

Solar Village

Saudi Arabia

Dust Mar-Jul

1999-2008

1573 1513

4.3A. 2-D Mahalanobis classification for 4 Cattrall sites

Solar Village

(Dust)

Beijing (AsiaUrb)

Mongu (BioBurn)

GSFC (UrbInd)

DM o

f Bah

rain

dat

a po

int t

o sp

ecifi

ed c

lust

er

5-D Mahalanobis Skill Scores

Mahalanobis

Cattrall designation

(Bahrain)designation Dust

UrbInd 0 0%Dust 121 89%

BioBurn 0 0%AsiaUrb 15 11%

Total 136 100%

4.3B. 5-D Mahalanobis classification of Bahrain data

5 Dimensions: EAE (440, 870 nm), AAE (440, 870), SSA (441), RRI (441), IRI (441)

4 Specified Sites (Types): . Mongu (BioBurn) . GSFC (UrbInd). Beijing (AsiaUrb) . Solar Village (Dust)

Models used in MODIS retrievals combine a fine mode and a coarse mode (e.g., Remer et al., 2005). Diamonds in Fig. 4.4A show SSA,EAE coordinates of fine and coarse modes for the MODIS over-ocean set (modes 1-9), augmented with increased absorption (modes 10-18) and adjusted size (modes 19, 20). Lines connecting a fine and a coarse mode show coordinates of weighted combinations of those modes. Together the lines cover the SSA,EAE space of the AERONET clusters from the 4 Cattrall sites in Fig. 4.3A (replotted in Fig. 4.4A).

To investigate effects of the Glory APS retrieval uncertainties in Fig. 1F, we started with an augmented MODIS bimodal model (point on a line in Fig. 4.4A) approximating the mean of each cluster, normalized to yield AOD(550 nm)=0.5. We then added random uncertainties with standard deviations from Fig. 1F.

4.4 Aerosol models as a path to simulation of Glory APS products

For each site, we filtered points to exclude outliers (here defined as points with DM>3). Boundaries in Fig. 4.3A trace the locus points having equal Mahalanobis distance to 2 sites. In Fig. 4.3B, 5-dimensional Mahalanobis classification assigns 89% of Bahrain points to the Solar Village (Dust) cluster and 11% to the Beijing (AsiaUrb) cluster.

4.4B. Simulated measurements and classifications

4.4A. Bimodal models, with clusters

Years after 1 January 2004

Clarke et al. (2007) used analyses of simultaneously-sampled trace gases and other information to assign aircraft-sampled dry aerosols to the 3 types shown in Fig. 4.2A. We used the Clarke data and

Table 4.3. AERONET sites and months from Cattrall et al. (2005) selected to explore Mahalanobis classification

A11E-0091

1B. Alaskan wildfires, 2004

Viewed from MODIS on Terra

1.00

0.95

0.90

0.85

0.80

0.75

AAOD = K -AAE