MODELING AEROSOL PUFF CONCENTRATION DISTRIBUTIONS
FROM POINT SOURCES
USING ARTIFICIAL NEURAL NETWORKS
Timothy J. De Vito
A Thesis Submitted to the Faculty of the Royal Military College of Canada
In Partial Fulfillment of the Requirements for the Degree of
Master of Engineering in Chemical Engineering
July 2000
© This thesis may be used within the Department of National Defence, but copyright for open publication remains the property of the author.
National Library of Canada (Bibliothèque nationale du Canada)
Acquisitions and Bibliographic Services, 395 Wellington Street, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Acknowledgements
I would like to express my gratitude to those who supported me throughout this
research project, without whose help none of this would have been possible. I am
particularly indebted to Dr. W. S. Andrews at the Royal Military College of Canada for
providing the opportunity to undertake this project, and for his unwavering confidence in
me throughout the course of this work.
The personnel at the Defence Research Establishment Valcartier must also be
gratefully acknowledged. They were instrumental in the collection of these data, and
went out of their way to offer me every bit of assistance possible. Special thanks go to
Gilles Roy, who took time out of his busy schedule to answer my questions, conduct field
trials, and make me feel perfectly at home in Valcartier. Additional thanks go to
Jean-Marc Thériault, Luc Bissonnette and Sylvain Cantin for their insights and assistance.
I would also like to thank the Defence Research and Development Branch
(DRDB) for awarding me the DRDB-RMC Fellowship. I am also grateful to the
Royal Canadian Regiment Trust, who provided support in the form of the Milton Fowler
Gregg VC Memorial Trust Fund Bursary.
Financial support which made this work possible is gratefully acknowledged from
the Academic Research Program and the Director General Nuclear Safety, both agencies
of the Department of National Defence.
Abstract
A series of over 50 field trials has been conducted in order to determine the
concentration distributions within aerosol puffs resulting from near-instantaneous
releases under atmospheric conditions falling within Pasquill stability classes A and B
(very unstable and moderately unstable, respectively). The aerosol examined was kaolin,
an inert, non-buoyant ceramic having particle size less than 3 µm. Concentration
measurements were made using the Defence Research Establishment Valcartier laser
cloud mapper (LCM), a scanning lidar system operating at 1.06 µm.
Artificial neural network (ANN) models were developed using the LCM data to
predict concentration distributions given a number of easily measured meteorological
parameters. ANN model results were compared to those from traditional Gaussian puff
models, including the US Army Research Laboratory's Gaussian-based dispersion model
COMBIC. ANN models provided significantly better predictions than the Gaussian puff
models. The predicted concentration distributions of one of the ANN models were
parameterized as a function of wind speed and diffusion time, and a tilted Gaussian puff
model was developed. Simple analytical expressions were derived for the dispersion
lengths and puff tilt angle. This parameterized model provided better predictions than
traditional Gaussian puff models.
Table of Contents
List of Figures
List of Tables
List of Symbols
1 Introduction
1.1 Aerosol Dispersion in the Planetary Boundary Layer
1.2 Application of Lidar to Aerosol Dispersion Experiments
1.3 Artificial Neural Networks
1.4 State of the Discipline
1.4.1 Dispersion Modeling
1.4.2 Lidar Inversion
1.4.3 Neural Network Applications to Dispersion Modeling
1.5 Thesis Objectives
2 Background and Theory
2.1 The Gaussian Dispersion Model
2.1.1 Stability Classification Schemes and Dispersion Coefficient Parameterizations
2.1.2 The COMBIC Model
2.2 The Laser Cloud Mapper
2.2.1 LCM Specifications
2.2.2 Lidar Inversion
2.3 Artificial Neural Networks
2.3.1 Multi-Layer Feed Forward Networks and Backpropagation Learning
2.3.2 Generalization and Separation of Data Sets
3 Collection and Analysis of Data
3.1 Experimental Setup and Collection of Data
3.2 Analysis of Inverted LCM Scans
3.3 Preparation of Data for ANN Modeling
3.4 Gaussian Puff Modeling
4 Results and Discussion
4.1 ANN Model Development
4.2 Comparing ANN Models with Gaussian Puff Models
4.3 Sensitivity Analysis
4.3.1 Disabled Inputs
4.3.2 Input Dithering
4.3.3 Analysis of Model Residuals
4.4 ANN Model Concentration Distribution Predictions
4.4.1 Horizontal Concentration Distributions
4.4.2 Vertical Concentration Distributions
4.4.3 Cloud Spread
4.4.4 ANN Model 1A Parameterization
5 Conclusions
5.1 Summary and Conclusions
5.2 Recommendations
List of References
Appendix A Meteorological Measurements and Estimates
Appendix B Neural Network Performance Statistics
Appendix C Model Concentration Prediction Scatter Plots
Vita
List of Figures
Figure 2.1 A comparison between Gaussian puff (top) and plume (bottom) diffusion, showing concentration contour surfaces.
Figure 2.2 The effect of atmospheric stability on the dispersion of plumes. The adiabatic lapse rate is shown as a dashed line, while typical vertical temperature profiles are shown as solid lines for (a) unstable conditions, (b) neutral conditions, and (c) stable conditions (Turner, 1994).
Figure 2.3 Slade parameterization of dispersion coefficients as functions of downwind travel distance from the source; (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (CCPS, 1996).
Figure 2.4 Pasquill parameterization of dispersion coefficients as functions of downwind travel distance from the source; shown are the modifications used in COMBIC, equations (2) to (4); (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (Ayres and Desutter, 1996).
Figure 2.5 Raster scanning pattern used by the LCM.
Figure 2.6 The basic structure of a biological neuron.
Figure 2.7 The basic structure of a processing element.
Figure 2.8 A multi-layer feed-forward ANN with two hidden layers (Haykin, 1994).
Figure 2.9 MLFF network connections and variables.
Figure 2.10 Common transfer functions: (top) the hyperbolic tangent, (bottom) the sigmoid function.
Figure 3.1 Layout of the experimental trial plateau.
Figure 3.2 The bottom sweep of a typical LCM scan shown in raw form (bottom) and inverted (top), displayed using LCVS (bird's eye view). Radial grid lines are spaced about 50 m apart; azimuth grid lines are 10° apart. Note the reduced noise in the inverted scan. The strong return across the top is the sand dune.
Figure 3.3 Data set 1 training set frequency distributions after transformation and duplication.
Figure 3.4 Data set 1 test set frequency distributions after transformation and duplication.
Figure 3.5 The 'image source' accounts for surface interactions of the puff by reflecting material off the ground.
Figure 4.1 Learning curves of ANNs 2.2~ and 2.7~. Also indicated in the figure are the points during training where RMS improved, and the ANN was automatically saved by the SaveBest command.
Figure 4.2 Scatter plot for ANN model 1A over the (a) test set and (b) validation set.
Figure 4.3 Comparison of geometric variance and mean bias between the various models for data set 1 (a) test set and (b) validation set.
Figure 4.4 Comparison of geometric variance and mean bias between the various models for data set 2 (a) test set and (b) validation set.
Figure 4.5 Change in ANN 1A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 1A. (time indicates the time of day)
Figure 4.6 Change in ANN 2A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 1A. (time indicates the time of day)
Figure 4.7 Results of dithering inputs by 5% over the test set for (a) model 1A and (b) model 2A. Error bars indicate standard deviation among the 20 ANNs making up each model.
Figure 4.8 ANN model 1A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.
Figure 4.9 ANN model 2A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.
Figure 4.10 ANN model 1A predictions for a horizontal slice through the puff centre at z=0, shown (a) 10 seconds and (b) 30 seconds after release.
Figure 4.11 ANN model 1A predictions and fitted Gaussian curves for profiles along z=0, y=0 (left) and z=0, x=0 (right). Diffusion times (a) 10 seconds and (b) 30 seconds are shown. Note: these are 1-D profiles of the surfaces shown in Figure 4.10.
Figure 4.12 Model 1A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.
Figure 4.13 Model 2A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.
Figure 4.14 Model 1A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the good linear fit and the slow decay of tilt angle with diffusion time (-1.0 mk).
Figure 4.15 Model 2A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the increasing puff tilt with diffusion time for low T.
Figure 4.16 Evolution of dispersion coefficients with diffusion time for wind speed (a) 1.0 m/s, (b) 1.5 m/s and (c) 2.5 m/s.
Figure 4.17 Vertical profiles of downwind and crosswind dispersion lengths as predicted by model 2A. Shown for wind speed of 1.0 m/s, 10 seconds after release. Remaining inputs are the same as those in Figure 4.10 above.
Figure 4.18 Variation of puff tilt angle with diffusion time and wind speed, shown for T=19 °C.
Figure C.1 Scatter plots for ANN model 1A over (a) the test set and (b) the validation set.
Figure C.2 Scatter plots for ANN model 2A over (a) the test set and (b) the validation set.
Figure C.3 Scatter plots for ANN model 1B over (a) the test set and (b) the validation set.
Figure C.4 Scatter plots for ANN model 2B over (a) the test set and (b) the validation set.
Figure C.5 Scatter plots for data set 1 GPMs (Slade) over (a) the test set and (b) the validation set.
Figure C.6 Scatter plots for data set 1 GPMp (Pasquill) over (a) the test set and (b) the validation set.
Figure C.7 Scatter plots for data set 2 GPMs (Slade) over (a) the test set and (b) the validation set.
Figure C.8 Scatter plots for data set 2 GPMp (Pasquill) over (a) the test set and (b) the validation set.
Figure C.9 Scatter plots for COMBIC over (a) the test set and (b) the validation set.
List of Tables
Table 2.1 Determining the Pasquill stability category (Pasquill and Smith, 1983).
Table 3.1 Total number of vectors in the training, test and validation sets of data sets 1 and 2.
Table 3.2 COMBIC input card used to model kaolin trial 18, scan 5.
Table 4.1 The best networks for each architecture listed in Table B.1 (data set 1, relative-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).
Table 4.2 The best networks for each architecture listed in Table B.2 (data set 2, absolute-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).
Table 4.3 Mean test set statistics for the two architectures for each data set. The error term indicates the standard deviation between the 20 ANNs in each group.
Table 4.4a Model comparison over data set 1 test set.
Table 4.4b Model comparison over data set 1 validation set.
Table 4.5a Model comparison over data set 2 test set.
Table 4.5b Model comparison over data set 2 validation set.
Table 4.6 Fitting parameters for the dispersion coefficients as linear functions of diffusion time and wind speed.
Table 4.7a Model comparison over data set 1 test set.
Table 4.7b Model comparison over data set 1 validation set.
Table A.1 Summary of measurements taken during the kaolin trials.
Table A.2 Calculated wind speed and Pasquill stability class.
Table B.1 Preliminary data set 1 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.
Table B.2 Preliminary data set 2 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.
Table B.3 Results of training 20 ANNs with two different architectures on each data set. Each net was initialized to a different random point in weight space.
List of Symbols
A    Pasquill stability class: very unstable
A    receiver area (m²)
a    fitting parameter for COMBIC parameterization
a    fitting parameter for model 1A parameterization
α    mass extinction coefficient (m²/g)
α    momentum coefficient
ANN    Artificial Neural Network
B    Pasquill stability class: moderately unstable
b    fitting parameter for COMBIC parameterization
b    fitting parameter for model 1A parameterization
β    puff tilt angle
β(r)    elastic backscattering coefficient (m⁻¹ sr⁻¹)
C    Pasquill stability class: slightly unstable
C    aerosol concentration (g/m³)
C    clear-air shot returned signal strength (W)
c    fitting parameter for COMBIC parameterization
c    fitting parameter for model 1A parameterization
c    speed of light (m/s)
C_o    observed concentration (g/m³)
COMBIC    Combined Obscuration Model for Battlefield Induced Contaminants
C_p    predicted concentration (g/m³)
D    Pasquill stability class: neutral
d    backscatter to extinction ratio
δ    output layer weight update factor
δ    hidden layer weight update factor
DREV    Defence Research Establishment Valcartier
Δv    hidden layer weight adjustment
Δw    output layer weight adjustment
E    Pasquill stability class: slightly stable
E    neural network global error
F    Pasquill stability class: moderately stable
F(r)    lidar system constant
f()    non-linear transfer function
F10    fraction of predictions within a factor of ten of measurement
F2    fraction of predictions within a factor of two of measurement
GPMp    Gaussian puff model with Pasquill parameterization
GPMs    Gaussian puff model with Slade parameterization
hidden layer PE summation
output layer PE summation
received intensity (W/m²)
intensity of source emission (W/m²)
backscatter to extinction exponent
wavelength (m)
Laser Cloud Mapper
Light Detection and Ranging
Geometric Mean Bias
Pasquill stability class
received signal power (W)
transmitted pulse power (W)
atmospheric pressure (in Hg)
aerosol release mass (g)
radial distance (m)
correlation coefficient
root mean square
volumetric extinction coefficient (m⁻¹)
clear-air extinction coefficient (m⁻¹)
downwind dispersion coefficient (m)
crosswind dispersion coefficient (m)
vertical dispersion coefficient (m)
diffusion time (s)
target output value
ambient temperature (°C)
pulse length (s)
optical depth
mean wind speed (m/s)
geometric mean variance
hidden layer weight
output layer weight
relative downwind distance (m)
i-th input variable
downwind distance from source (m)
downwind puff centroid position (m)
relative crosswind distance (m)
learning coefficient
hidden PE output value
relative vertical distance (m)
absolute vertical distance (m)
output PE value
effective release height (m)
Chapter 1
Introduction
Aerosol dispersion modeling is concerned with predicting the concentration
distribution of a contaminant introduced into the atmosphere and its subsequent
dispersion downwind. Most of the work in this field to date has concerned the dispersion
of a pollutant from a continuous release, such as from a smokestack or evaporating pool.
However, the dispersion from a nearly instantaneous release has received much less
attention, both in theoretical treatment and in experimental trials.
Predicting the dispersion of instantaneous releases has numerous applications of
great importance in environmental science, industry and the military, including
determining the effects of an accidental release of toxic or radioactive material, both on
the surrounding environment, and on the health of the public. A good model can help
industry mitigate such incidents, and provide insight into regulatory requirements (CCPS,
1996). In military science, dispersion models are essential in the study of smoke-grenade
obscurants and countermeasures, as well as in improving the defensive procedures for
dispersing chemical and biological weapons.
1.1 Aerosol Dispersion in the Planetary Boundary Layer
The term 'aerosol' generally refers to a liquid or solid substance suspended in a
gaseous medium, and covers a wide range of matter with varying sizes and composition,
including dust, smoke and mists (Williamson, 1973). Aerosols typically have particle
radii ranging from 0.01 µm to 10 µm (Hidy, 1984). Once introduced into the
atmosphere, an aerosol is acted upon by numerous complex processes.
In the upper parts of the atmosphere, ground surface friction has little effect on
the flow of air, and can often be ignored. However, in regions closer to the surface, the
effects of the frictional drag become important, and ultimately force the flow to zero
velocity at the surface itself. Most dispersion problems of interest occur in this lower
region of the atmosphere, commonly known as the planetary boundary layer (PBL),
which can loosely be defined as the depth of the surface-related influence on the flow
(Oke, 1987), and typically extends about 1 km above the surface.
However, the character and depth of the PBL are directly influenced by the
surface, and vary in response to changes in the surface's physical nature (Pasquill and
Smith, 1983). Thus the PBL is characterized by mechanical turbulence, generated by the
drag of the rough underlying surface, and convective turbulence due to the exchange of
heat between the surface and the flow. Since the strength of convective turbulence
generally varies diurnally, so do the depth and nature of the PBL. By day, the earth is
heated more rapidly than the atmosphere, resulting in an upward flux of heat from the
surface to the air. This causes enhanced thermal mixing and the PBL can reach up to 2
km in depth. Conversely, at night, the earth cools more rapidly than the atmosphere,
producing a downward flux of heat, which suppresses convective turbulence. This can
lead to a reduction of the PBL depth to about 100 m (Oke, 1987).
Most of the heat and moisture in the PBL is transferred by mechanical and
convective turbulence, and these processes cause rapid and efficient mixing. These
phenomena also play a key role in the dispersion of aerosols released into the PBL.
1.2 Application of Lidar to Aerosol Dispersion Experiments
Remote sensing has been an important tool used in probing the atmosphere for
decades. Specifically, monitoring scattered electromagnetic (EM) energy from a target
has been a particularly effective way of inferring a vast number of target properties. This
has direct application to atmospheric science, where the targets of interest are airborne
aerosols or molecules. The use of energy at the optical or infrared wavelengths permits
measurable scattering, even for objects of such small dimension. Even in the visibly
'clear' atmosphere, backscattered signals from gases and suspended particles at ranges of
several kilometers may readily be detected with laser radars, or lidars, of modest
performance. It is thus possible to measure the position of clouds or aerosols, their
motion, and perhaps most importantly, their structure.
The term RADAR was coined during the Second World War as an acronym for
RAdio Detection And Ranging, and refers to the process of measuring the time of flight
of reflected radio-wavelength EM radiation from a distant target to determine its range.
This principle has been applied in the subsequent years at shorter and shorter
wavelengths, and was used to study atmospheric properties as early as the 1940s and
1950s. By analogy, the application of the radar principle to energy of optical or
near-optical wavelengths produced by lasers was called LIDAR: LIght Detection And
Ranging.
At the heart of the lidar system is the laser source; although a number of
techniques exist to modulate the transmitted signal, the most common is pulse
modulation, where a single pulse of light of finite duration is transmitted. Typical pulses
have a length of 10-20 nanoseconds and energy in the range of 0.1 to 1.0 J per pulse.
Lasers today have pulse repetition rates of 10 to 100 pulses per second (Silfvast, 1996).
Once a pulse of light is transmitted, it is scattered in all directions by the various
gases or aerosol particles in the air. A certain fraction of this energy is reflected back in
the direction of the source, where it can be collected by appropriate optical components,
located at the position of the lidar. It is then directed to a photo-detector, where a voltage
or current is produced that is proportional to the power of the received pulse. The
magnitude of the received backscattered signal depends on the specific properties of the
target aerosol, including particle size distribution and its refractive properties. In general,
however, the received power is proportional to the density of the scattering aerosol. In
addition to this, the portion of the target responsible for a given lidar return can be
localized quite accurately, using the time of flight of the laser pulse from the source to the
scattering medium and back and the pointing direction of the lidar. Thus the lidar is
capable of measuring the density at a given point in a target aerosol, and the spatial
position of this point simultaneously.
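The localization step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the LCM processing code: the function names, the angle conventions (azimuth measured from the x-axis in the horizontal plane, elevation measured up from the horizontal) and the use of the vacuum speed of light are assumptions made here for the example.

```python
import math

# Vacuum value; assumed adequate here, since air changes it by only ~0.03 %.
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def range_from_time_of_flight(round_trip_s: float) -> float:
    """Range to the scatterer (m). The pulse travels out and back, hence the /2."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0


def scatterer_position(round_trip_s: float, azimuth_deg: float, elevation_deg: float):
    """Convert a time of flight and a pointing direction to (x, y, z) in metres.

    Hypothetical convention: the lidar sits at the origin, azimuth is measured
    from the x-axis, and elevation is measured up from the horizontal plane.
    """
    r = range_from_time_of_flight(round_trip_s)
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = r * math.cos(el)  # projection of the range onto the ground plane
    return (horizontal * math.cos(az), horizontal * math.sin(az), r * math.sin(el))


if __name__ == "__main__":
    # A return arriving 2 microseconds after transmission lies roughly 300 m away.
    print(scatterer_position(2e-6, azimuth_deg=30.0, elevation_deg=5.0))
```

Each lidar return therefore carries both a signal strength and a position, which is what allows a scan to be assembled into a spatial map.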
The traditional and by far most common method for measuring concentration
distributions of dispersing aerosols is to use an array of detectors, situated along a line or
arc downwind of the source. As the aerosol passes the array, concentration time-series or
dosage counts are calculated at each detector on the array. The lidar presents numerous
advantages over such in situ measurements.
Laser beams are coherent and can be highly collimated. Thus, pulses can be
directed with high precision, and optical scanning systems can be used to guide the beam
to scan a given volume with a specific pattern. In this way, large volumes of space can
be scanned from one remote position in a matter of seconds. There is no need to
physically set up a limited array of devices, which can be expensive, time consuming,
and in many conditions, impossible. A lidar has no problem measuring hazardous or
inaccessible locations. Further, lidar probing does not disturb the process being
measured. This may not be the case for in situ measurements, where the presence of
detector arrays can disrupt the local flow patterns.
There are drawbacks to using a lidar system for aerosol dispersion measurements.
For one, the spatial resolution is limited by the laser pulse length and beamwidth. Each
lidar return is in fact a spatial average over the volume occupied by the laser pulse.
Fast-response detectors offer much higher resolution and are more effective for measuring
fluctuation statistics. Also, a lidar system requires a finite amount of time to scan a given
volume, and thus a three-dimensional (3-D) concentration map is not a true instantaneous
'snapshot' of the diffusing cloud. Finally, it is necessary to convert the measured
backscattered power signal to a measure of concentration. This process is known as lidar
inversion, and is no trivial task; it requires numerous assumptions that are not always
valid.
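To give a rough sense of the resolution limit mentioned above, the following Python sketch estimates the size of the volume averaged by a single return. It uses the standard pulsed-ranging relation that the range cell is c·τ/2 for a pulse of duration τ, together with a small-angle estimate of the beam diameter. The function names, the cylindrical approximation of the sampled volume, and the 1 mrad divergence used in the example are assumptions for illustration, not LCM specifications; the 10 and 20 ns pulse lengths are the typical values quoted earlier in this section.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, vacuum value (assumed adequate for air)


def range_cell_length(pulse_length_s: float) -> float:
    """Length along the beam (m) over which one return is averaged: c * tau / 2."""
    return SPEED_OF_LIGHT * pulse_length_s / 2.0


def beam_diameter(range_m: float, divergence_rad: float, exit_diameter_m: float = 0.0) -> float:
    """Small-angle estimate of the beam diameter (m) at a given range."""
    return exit_diameter_m + range_m * divergence_rad


def sampled_volume(range_m: float, pulse_length_s: float, divergence_rad: float) -> float:
    """Approximate the averaging volume (m^3) as a cylinder: beam area times range cell."""
    radius = beam_diameter(range_m, divergence_rad) / 2.0
    return math.pi * radius ** 2 * range_cell_length(pulse_length_s)


if __name__ == "__main__":
    for tau in (10e-9, 20e-9):  # typical pulse lengths quoted in the text
        print(f"pulse {tau * 1e9:.0f} ns -> range cell {range_cell_length(tau):.2f} m")
    # Hypothetical 1 mrad divergence at 500 m range:
    print(f"sampled volume: {sampled_volume(500.0, 10e-9, 1e-3):.3f} m^3")
```

Under these assumptions a 10 to 20 ns pulse averages over roughly 1.5 to 3 m along the beam, which is why point detectors with fast response remain better suited to fluctuation statistics.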
1.3 Artificial Neural Networks
Artificial neural network (ANN) modeling is an emerging technique that has
shown rapid growth in the past decade. Today, ANNs are becoming more and more
common in a wide variety of disciplines, from financial market analysis and medical
diagnosis to the many fields of engineering. They have demonstrated a remarkable
ability to solve a wide variety of diverse problems, such as pattern recognition,
classification, process control, time-series prediction, function approximation and data
compression.
The success of ANN models in so many applications can be attributed to a
number of factors. Neural networks are inherently nonlinear structures, and are capable
of recognizing nonlinear relationships between seemingly random variables; hence they
model nonlinear processes well. Also, each component of an ANN is potentially affected
by the global activity of all the other components of the network; thus, contextual
information is dealt with naturally by ANNs (Haykin, 1994). Furthermore, they are able
to handle noisy or incomplete data, and can easily accommodate variables that are
difficult to quantify numerically (e.g., categorical or Boolean inputs). These
characteristics give ANNs a significant advantage over traditional multivariate statistical
regression models. One of the chief advantages of ANNs over traditional statistical
modeling approaches is their ability to handle co-linear input variables. Co-linearity can
significantly impair the performance of a statistical model, but presents no problem for
neural networks. Therefore, when modeling non-linear processes with large amounts of
noisy data, categorical and potentially co-linear variables, neural networks are better
suited for the task than are statistical regression techniques. ANNs have been shown to
significantly outperform statistical models in numerous applications (Gardner and
Dorling, 1996; Yi and Prybutok, 1996).
Neural networks were originally modeled after the functioning of the human
brain, which is a complex, nonlinear, parallel processor, capable of performing certain
computations many times faster than the fastest computers today (Haykin, 1994). Work
on ANNs dates back to the 1940s. Early pioneers of the field showed promising success
with abstract models of the biological neuron and a formulation of the process of
learning. The most striking advances have taken place in the last 20 years, with the
advent of cheaper, faster computers. Over the past two decades, numerous unique and
sophisticated ANN paradigms have been developed, many of which differ markedly in
structure, operation and application. However, there are certain features that are shared
by all neural network paradigms.
Essentially, all ANNs perform the same task: they accept a set of inputs (an input
vector), and produce a corresponding set of outputs (an output vector); that is, they
perform a vector mapping (Wasserman, 1993). The relationship between the input vector
and the output vector is encoded in the free parameters of the ANN model, usually
referred to as the network weights.
The process by which an ANN encodes this mapping relationship in its weights is
effected through a learning algorithm. Typically, a given input vector is presented to the
ANN, along with the associated target output vector. The free weights are adjusted so as
to minimize the difference between the ANN's predicted output vector and the desired
target output vector. The precise way in which the weights are adjusted depends on the
specific learning algorithm used by a given ANN paradigm. Numerous such input-output
vector pairs are presented to the network, which adjusts its weights in response to each
pair, until the difference between the predicted and target output vectors reaches a tolerable
minimum. At this point, the network is considered to be 'trained'.
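The training cycle described above can be sketched with a deliberately simple case: a single-layer linear network whose weights are nudged after each input-target pair by the delta (least-mean-squares) rule. This is only an illustrative stand-in, not the learning algorithm of any particular ANN paradigm or software package; the function name, learning rate, tolerance and toy data are all assumptions.

```python
import numpy as np


def train_linear_network(inputs, targets, learning_rate=0.05,
                         tolerance=1e-4, max_epochs=5000):
    """Train a single-layer linear network with the delta rule.

    inputs  : array of shape (n_pairs, n_inputs)
    targets : array of shape (n_pairs, n_outputs)
    Returns the weights, the bias and the final RMS error.
    """
    rng = np.random.default_rng(0)
    weights = rng.normal(scale=0.1, size=(inputs.shape[1], targets.shape[1]))
    bias = np.zeros(targets.shape[1])
    rms_error = np.inf
    for _ in range(max_epochs):
        # Present each input-target pair and adjust the weights in response.
        for x, t in zip(inputs, targets):
            error = t - (x @ weights + bias)
            weights += learning_rate * np.outer(x, error)
            bias += learning_rate * error
        # Stop once the prediction error reaches a tolerable minimum.
        rms_error = np.sqrt(np.mean((inputs @ weights + bias - targets) ** 2))
        if rms_error < tolerance:
            break
    return weights, bias, rms_error
```

With four toy input vectors `[[0, 0], [0, 1], [1, 0], [1, 1]]` and targets generated from `2*x1 - x2 + 0.5`, the loop should recover weights close to 2 and -1 and a bias close to 0.5. A multilayer network replaces the linear map with nonlinear layers, but the present-compare-adjust cycle is the same.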
Regardless of the ANN paradigm, the goal of any trained network is to generalize
well. This is the ability to produce the correct output when presented with an input it has
not seen before. A network that can predict the correct output only for those inputs used
for training is of little practical use. It is the performance of a trained ANN against a set
of inputs not used for training that is the true measure of its usefulness as a model.
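In practice, generalization is usually assessed by withholding part of the data from training and scoring the trained model on that withheld set only. The sketch below shows one minimal way to do this; the 25% test fraction, the random seed and the function names are illustrative assumptions, not the partitioning used in this research.

```python
import numpy as np


def split_train_test(inputs, targets, test_fraction=0.25, seed=0):
    """Randomly partition input-target pairs into training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(inputs))
    n_test = int(round(test_fraction * len(inputs)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (inputs[train_idx], targets[train_idx],
            inputs[test_idx], targets[test_idx])


def rmse(predicted, target):
    """Root-mean-square error between predicted and target values."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

A model would be fitted on the training arrays only, and `rmse` evaluated on the test arrays would then serve as the measure of how well it generalizes.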
ANN model development for this research was done using the Windows-based
proprietary software package NeuralWorks Professional II/PLUS by NeuralWare
(NeuralWare, 1993a).
1.4 State of the Discipline
This section gives a brief discussion concerning the current state of the field of
study, citing appropriate references where necessary. Three topics will be discussed:
dispersion modeling, lidar inversion and ANN applications to dispersion modeling.
1.4.1 Dispersion Modeling
Atmospheric turbulent diffusion modeling has a long and rich history. Today,
countless different models exist and are in common use, many of which vary
considerably in modeling methodology. Common approaches include Lagrangian
stochastic modeling, solutions to the advection-diffusion equations, and Gaussian
modeling.
Lagrangian stochastic modeling is a numerical modeling procedure, where the
trajectory of a particle (or group of particles) is calculated from a known statistical
description of the turbulent velocity field. Although it is a very powerful method, it is
computationally intensive, and is presently a research tool. Sawford and Wilson (1996)
provide discussions of current models.
Another common numerical technique is to iteratively solve the advection-
diffusion equation where closed-form analytical solutions cannot be found. This allows
for the use of more realistic estimates for wind speed, temperature and eddy diffusivity
profiles, since analytical solutions can only be found for the simplest forms. Some
examples of such numerical 'K-Theory' models can be found in current literature (CCPS,
1996).
By far the simplest and most commonly used modeling technique is the Gaussian
dispersion model. The American Institute of Chemical Engineers (CCPS, 1996)
conducted a survey of 22 of the most commonly-used public and proprietary dispersion
modeling software packages; the majority of these incorporated a Gaussian plume or puff
model into the full dispersion code. The US Environmental Protection Agency's
Gaussian-based dispersion model (Industrial Source Complex Short Term Model v.3) is
the most widely used in North America, and has been accepted by many jurisdictions as a
regulatory model (Lehder, 2000; CCPS, 1996). Similarly, the Gaussian model is in wide
use for regulatory control in Europe (Pankrath, 1995; Olesen, 1995).
Over the past decades, there has been a dramatic increase in the application of
dispersion models based on the Gaussian plume formulation as the core procedure
(Griffiths, 1994). Application of a Gaussian model ultimately requires a specification of
the nature of the standard deviations of the Gaussian distribution. These dispersion
coefficients describe the width of a cloud of contaminant and how the cloud grows as it
diffuses downwind. The accuracy of a Gaussian model depends critically upon these
coefficients.
A number of schemes have been developed over the years to estimate the
dispersion coefficients. The usual approach has been to semi-empirically parameterize
the coefficients as functions of downwind distance and atmospheric stability, based on a
series of carefully performed diffusion experiments. The most widely used of these
parameterizations is that of Pasquill, later modified by Gifford (Hanna et al., 1982).
They are based on ground level concentration measurements, due to continuous releases
over level ground. Others have presented similar schemes, most notably Briggs (1973).
The only parameterization based on measurements from instantaneous releases is
provided by Slade (1968). The parameterizations of Slade and Pasquill will be discussed
further in Chapter 2.
Some theoretical and experimental studies of near-instantaneous releases have
been conducted in the past few years. Hanna (1996) summarizes some of the recent work
on characterizing the downwind spread of a puff release, and the role of wind shear on
puff growth.
Van Ulden (1992) conducted a theoretical treatment of the diffusion of a passive
puff near the ground, based on solutions to the advection-diffusion equation and Monin-
Obukhov similarity theory. Using a coordinate system that follows the puff's horizontal
centre of mass, he developed relations for the standard deviations and skewness of the
concentration distribution. He concluded that wind shear and the interaction between
skewness and vertical diffusion dominate the downwind spread of the puff, leading to
tilted concentration distributions.
Yee et al. (1998) performed a detailed analysis on a series of carefully conducted
puff diffusion field trials. This is one of the very few atmospheric experiments where the
field trials were controlled to such a degree that enough repeat realizations of
instantaneously released clouds could be performed to allow the construction of
statistically-sound ensemble averages. Yee analyzed these ensemble average
distributions in the framework of relative diffusion, using Monin-Obukhov similarity
theory. He found that downwind distributions were negatively skewed, while crosswind
distributions were Gaussian in nature. It was also determined that under the conditions of
the trial, the horizontal spread of the puff grew approximately linearly with downwind
distance.
Sato (1995) investigated the longitudinal (downwind) distribution of a diffusing
puff based on a series of release trials. He also found these distributions to be negatively
skewed, and attributed this skewness to the presence of vertical wind shear. At short
times, the growth of the puffs in the downwind direction was found to be proportional to
diffusion time, giving way to a % power relationship at longer times.
1.4.2 Lidar Inversion
With the advent of the use of lidar systems to probe the atmosphere in the past
two decades, numerous techniques have been developed to extract useful optical
properties from backscattered lidar returns. A review of common inversion algorithms in
use can be found in Evans (1988), Elouragini (1995), and Bissonnette (1996).
Bissonnette outlines the major problems remaining in most such inversion attempts. He
concludes that the chief problems are: 1) the need to determine a relationship between
aerosol backscatter and extinction coefficients, 2) the common requirement to specify a
boundary value at some specified range, 3) instabilities or slow convergence of the
solutions, and 4) the need to properly account for multiple-scattering events.
Klett (1981) developed a stable inversion algorithm that assumes a power-law
relation between the backscatter and extinction coefficients. This method is based on the
well-known but unstable 'forward method', but requires the measurement or estimate of
the extinction coefficient at some range beyond the extent of the cloud, rather than in
front of it. This results in an inversion procedure that is more stable with respect to
perturbations in the signal, the postulated relationship between the backscatter and
extinction coefficients, and to the estimate of the boundary condition. Some suggestions
concerning how to make the boundary condition estimate are given, but in practice this
can be quite difficult, and these estimates become less valid as the optical depth of the
cloud becomes small. In such cases, convergence is slow and generally only the front
portion of the returned signal is of use. Klett modified this algorithm to account for
deviations in the relationship between the backscatter and extinction coefficients (Klett,
1985), and showed that stable solutions could be obtained, but the basic problems of the
original method remain. Multiple scattering effects are not accounted for in Klett's
formulations.
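For orientation, the far-end boundary solution usually attributed to Klett (1981) can be sketched numerically. With S(r) = ln[r² P(r)], a backscatter-extinction power-law exponent k, and a boundary extinction value σ_m at the far range r_m, the commonly quoted form is σ(r) = exp[(S − S_m)/k] / {1/σ_m + (2/k) ∫ exp[(S − S_m)/k] dr'}, with the integral taken from r to r_m. The code below is a minimal single-scattering sketch under those assumptions; the function and argument names are hypothetical, and it is not the inversion adopted for this research.

```python
import numpy as np


def klett_inversion(ranges_m, power, sigma_far, k=1.0):
    """Retrieve an extinction profile (1/m) from a lidar return.

    ranges_m  : increasing ranges (m); the last entry is the boundary range r_m
    power     : received power at each range (arbitrary units)
    sigma_far : assumed extinction coefficient at r_m (1/m)
    k         : exponent of the assumed backscatter-extinction power law
    """
    ranges_m = np.asarray(ranges_m, dtype=float)
    power = np.asarray(power, dtype=float)
    s = np.log(ranges_m ** 2 * power)          # range-corrected log signal
    f = np.exp((s - s[-1]) / k)
    # Trapezoidal integral of f from each range out to the far boundary.
    segments = 0.5 * (f[1:] + f[:-1]) * np.diff(ranges_m)
    integral_to_far = np.concatenate((np.cumsum(segments[::-1])[::-1], [0.0]))
    return f / (1.0 / sigma_far + (2.0 / k) * integral_to_far)
```

A simple consistency check is a homogeneous atmosphere: a synthetic return of the form exp(−2σr)/r² with the correct boundary value should yield the same constant σ at every range.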
Evans (1984, 1988) developed a stable inversion algorithm that does not require
the estimation of the extinction coefficient boundary condition. Each lidar return is
instead normalized by a clear air calibration shot, which greatly reduces the effects of
system noise, enables the detection of very weak signals, and provides stable solutions
for a wide range of optical depth (Evans, 1984). Others (Uthe and Livingston, 1986;
Uthe, 1981) have adopted a similar approach. This method also assumes a power law
relation between the backscatter and extinction coefficients, and multiple scattering
effects can be accounted for, to some extent, directly in the algorithm. However, such
compensation is specific to the lidar system. This is the inversion algorithm adopted for
this research.
Roy et al. (1993) developed a lidar inversion technique based on the total
integrated backscatter. This technique does not rely on the lidar equation or any of its
assumptions. Rather, the total integrated backscatter is measured for various aerosols
under controlled conditions, and a calibration curve is formed that can then be used for
field trial measurements. The technique shows promise, but the calibration curve is
specific to the aerosol, the dissemination technique, and the lidar system used to obtain
the curves. In addition, for sufficiently dense clouds, multiple scattering affects the
calibration curve, limiting the application of this technique.
1.4.3 Neural Network Applications to Dispersion Modeling
The application of neural networks to problems in atmospheric science has grown
rapidly in the past few years, and Gardner and Dorling (1998) provide a brief overview of
recent developments. However, in the specific field of atmospheric dispersion, relatively
few attempts have been made to model the process using neural networks.
A few studies have been performed where ANN models were used to predict
time-averaged pollutant concentration at specific receptor sites. Gardner and Dorling
(1996) constructed an ANN model to predict the hourly average ozone levels at a specific
site in the UK. Inputs for the model consisted of hourly averages of meteorological
measurements, including irradiance, temperature, humidity, wind speed and direction,
collected over the course of a year. The ANN model showed considerably better results
than a conventional multiple linear regression model, and demonstrated that about 53%
of the variability in the hourly surface ozone concentrations can be attributed to local
meteorology. Yi and Prybutok (1996) used an ANN modeling technique to predict ozone
levels for the Dallas-Fort Worth area. In this study, the model output was the daily
maximum ozone level; inputs consisted of hourly averaged meteorological parameters
and vehicle emission measurements. Their model also outperformed standard linear
regression models.
Boznar et al. (1993) used ANNs to make short-term predictions of thermal power
plant pollutant levels in an industrialized area of Slovenia. In this study, the terrain was
quite complex, and traditional dispersion models repeatedly failed to give accurate
predictions. Meteorological and pollutant release measurements were continuously made
using a network of measuring stations located throughout the region. These
measurements formed the inputs and target outputs for a series of ANN models, where
one model was constructed for each receptor site. The ANN models predicted the short-
term concentration values at each receptor site very well, far outperforming the
predictions of the numerical dispersion models.
Using measurements of downwind tracer concentration together with time-
averaged meteorological data, Rege and Tock (1996) constructed an ANN model to
predict the source emission rate. Continuous releases of ammonia and hydrogen sulfide
were placed near ground level, and detectors were located less than 30 m downwind from
either source. For data in the test set, most of the ANN predictions were within about
10% of the measured emission rates. Traditional Gaussian models that were empirically
modified to fit the trial data failed to predict better than within about 20% of the measured
emission rates.
Each of these studies addressed either ambient atmospheric aerosols, or
continuously released contaminants. None deals with instantaneous releases, and it
appears that only one attempt has been made to model the evolution of concentration
distributions of instantaneous releases using ANNs. Using some of the same lidar data
analyzed for this research, Costa constructed an ANN to predict aerosol concentration in
the framework of absolute diffusion (Costa, 1998; Andrews et al., 1998). However, the
wind speed and direction measurements used to construct the model were sampled far too
infrequently to construct statistically sound averages. This, coupled with the absolute
diffusion coordinate system, limited the performance of the ANN model. No thorough
analysis of the ANN model's predicted concentration distributions was made.
1.5 Thesis Objectives
It is a well-established fact that a thorough description of the process of
atmospheric turbulent diffusion requires the measurement of a number of statistical
properties of the turbulent flow field, including the means and variances of wind speed,
wind direction and temperature. Nevertheless, in numerous practical situations, such
measurements are not available, and it is necessary to predict concentration distributions
from more readily measurable properties of the atmosphere. It is the purpose of this
research to develop a model that uses routinely measured atmospheric parameters to
predict the average concentration distribution from an instantaneous puff release more
accurately than traditional Gaussian puff models.
Typical analyses of puff diffusion data first require the construction of ensemble
average concentration distributions. This requires a large number of repeat releases taken
under identical atmospheric conditions. During the field trials from which the present
data set is taken, atmospheric conditions were highly variable and no such repeat
realizations could be performed. Thus, statistically sound ensemble averages could not
be constructed. To overcome this difficulty, the data were modeled using artificial neural
networks.
In addition to the development of a successful ANN model, a further goal of the
research was to parameterize this model's predicted concentration distributions. Simple
analytical relations are derived between the moments of the distribution and the most
influential meteorological variables.
Chapter 2
Background and Theory
This section provides a theoretical discussion of the major components of the
work. The Gaussian puff model is first presented, along with a description of the
Pasquill-Gifford stability scheme and dispersion length parameterization. A brief
description of the COMBIC model is also given. This is followed by a detailed
derivation of the lidar inversion technique of Evans (1984, 1988). Feed-forward neural
networks are then discussed, together with an explanation of the learning rule used for
network training.
2.1 The Gaussian Dispersion Model
The dispersion process in turbulent flow is made up of two contributions: one of
molecular scale, due to the random thermal agitation of the molecules, and another of
much larger scale, due to random turbulent bulk motion within the fluid. These two
mechanisms differ in two key ways. First, the resultant motions are on entirely different
scales; the typical span of turbulent movement is many orders of magnitude larger than
the mean free path length of the diffusing particles. Second, due to the macroscopic size
of the parcels of air involved in any single turbulent movement, there is a continuity
constraint. As such a parcel moves, it displaces another, and at the same time leaves a
vacant region that must be filled. This leads to the characterization of turbulent flow by a
group of random, closed-loop motions commonly referred to as turbulent eddies. In
atmospheric flows, the contribution of molecular diffusion is several orders of magnitude
less than that due to turbulent diffusion, and can usually be neglected.
A typical turbulent flow contains eddies that cover a wide range of spatial scales,
and are responsible for the dissipation of energy in the flow. This energy dissipation can
be described as a cascade of energy conversion from the larger eddies to the smaller ones,
where the large eddies extract energy from the main advective flow. These eddies are
unstable, however, and smaller eddies extract energy from them. This process continues
as smaller and smaller eddies feed off the larger ones, until finally the eddies are so small
that the viscosity of the fluid forces the conversion of the eddies' energy into heat.
Turbulent eddies displace parcels of air within a puff of contaminant, mixing
polluted air with relatively clean air, and vice versa. This mixing by bulk displacement
eventually causes polluted air to occupy larger volumes at lower concentrations.
However, not all eddies influence the dispersion of a contaminant in the same way.
Small eddies will displace material short distances, and will contribute less to the
dispersion of the pollutant, except perhaps at the edges of the puff, where mixing with
clean air may cause some notable redistribution of material. Conversely, eddies that are
much larger than the puff or plume will tend to displace the entire mass of pollutant as a
whole, a process known as 'meandering', and thus will contribute little to the internal
mixing of the puff. Eddies that have spatial scales comparable to the size of the puff or
plume are the most efficient in causing rapid mixing.
Given the highly random nature of the motion of the turbulent eddies, it is
apparent that the concentration of a dispersing puff or plume is also in general a random
variable, about which one can only make probabilistic predictions (Csanady, 1973). For
this reason, attempts to describe concentration distributions of a contaminant have been
restricted to considering ensemble average concentrations, which show much more
regular behaviour and can be more easily described mathematically (Williamson, 1973).
Indeed, it has been shown experimentally that in a field of homogeneous
turbulence, ensemble average concentrations can be approximated by a Gaussian
distribution (Csanady, 1973). No rigorous theoretical justification for the observed
Gaussian distribution seems to exist; Csanady summarizes some of the more convincing
arguments, but concedes that "...the question why a Gaussian distribution is observed in
experiments is not yet satisfactorily answered...". Few investigations into this question
have been undertaken in the recent literature.
Nonetheless, common practice is to assume a Gaussian distribution for averaged
concentration distributions, and this is the basis of the Gaussian puff model and its
variants. The Gaussian model originated in the works of Roberts (Sutton, 1953), Sutton,
Pasquill and Gifford (Hanna et al., 1982). The prevalence of the Gaussian model can be
attributed to the following:
1. Its predictions agree with experiment as well as other models.
2. The simple form of the equation facilitates mathematical manipulation.
3. It is conceptually appealing.
4. It is consistent with the random nature of turbulence.
5. It is a solution of the Fickian diffusion equation, assuming that both eddy
diffusivity and wind speed are constants.
6. Other so-called theoretical formulas contain large amounts of empiricism in
their final stages.
7. As a result of the above, it is used in most government guidebooks, thus
acquiring an elevated status (Hanna et al., 1982).
The usual practice is to adopt a coordinate system where the x-axis is along the
direction of the mean wind, U, the y-axis is in the cross-wind direction, i.e. perpendicular
to the x-axis and horizontal, and the z-axis is vertical. When dealing with the diffusion of
a puff, i.e., an instantaneous release, it is also common to use a coordinate system whose
origin is at the centre of mass of the puff. Thus the entire coordinate system moves
downwind with the puff as it is advected by the mean wind. This effectively removes the
effects of puff meander from the diffusion process.
In its simplest form, the Gaussian puff equation takes the following form:
C(x, y, z, t) = \frac{Q}{(2\pi)^{3/2}\,\sigma_x(t)\,\sigma_y(t)\,\sigma_z(t)} \exp\left[-\frac{x^2}{2\sigma_x^2(t)} - \frac{y^2}{2\sigma_y^2(t)} - \frac{z^2}{2\sigma_z^2(t)}\right]    (1)
where C(x, y, z, t) is the ensemble-average concentration (g/m^3),
Q is the mass of aerosol instantaneously released at time t = 0 (g),
x, y, z are the coordinates relative to the centre of mass of the puff (m),
\sigma_x(t), \sigma_y(t), \sigma_z(t) are the standard deviations of the distribution in each of the
coordinate directions; also known as the dispersion coefficients (m).
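As a numerical illustration, the standard free-space Gaussian puff form, a released mass Q spread as a trivariate normal distribution about the puff's centre of mass, can be evaluated as follows. This sketch assumes no ground reflection or deposition term; the function name and the values in the usage note are hypothetical.

```python
import math


def gaussian_puff(x, y, z, mass_g, sigma_x, sigma_y, sigma_z):
    """Ensemble-average concentration (g/m^3) in puff-relative coordinates.

    x, y, z          : position relative to the puff centre of mass (m)
    mass_g           : mass released instantaneously (g)
    sigma_x, _y, _z  : dispersion coefficients at the time of interest (m)
    """
    norm = mass_g / ((2.0 * math.pi) ** 1.5 * sigma_x * sigma_y * sigma_z)
    exponent = -0.5 * ((x / sigma_x) ** 2
                       + (y / sigma_y) ** 2
                       + (z / sigma_z) ** 2)
    return norm * math.exp(exponent)
```

For instance, with a 100 g release and dispersion coefficients of 10 m, 5 m and 2 m, the concentration one standard deviation downwind of the centre, `gaussian_puff(10.0, 0.0, 0.0, 100.0, 10.0, 5.0, 2.0)`, is exp(-1/2), or about 61%, of the centre value.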
Figure 2.1 below illustrates the relative diffusion coordinate system, and the
qualitative difference between puff and plume diffusion. Note the moving coordinate
system for puff diffusion, which is advected with the mean wind, as indicated by the
arrow. In general, the three dispersion coefficients will be distinct, and the puff contours
will be ellipsoidal in shape, rather than spherical.
Figure 2.1 A comparison between Gaussian puff (top) and plume (bottom) diffusion, showing concentration contour surfaces.
The following assumptions are implicit in the Gaussian puff equation:
1. The average concentration distribution is well represented by a Gaussian, or
normal, distribution in each of the three coordinate directions.
2. Homogeneous turbulence: the statistical parameters that characterize the
turbulent flow field are invariant in space.
3. Meteorological conditions are invariant in time, i.e., wind speed, wind
direction, temperature, stability and all other meteorological parameters are
constants with respect to time.
4. Conservation of mass, i.e., no ground deposition or reaction.
The time dependence of C(x, y, z, t) in equation (1) is contained in the dispersion
coefficients, which are, in general, functions of time and the meteorological
characteristics of the flow. Using a statistical approach to the diffusion problem, Taylor
(1921) developed relations describing the evolution of the dispersion coefficients with
time for the problem of a continuous plume. Batchelor (1952) used a similar approach to
tackle the problem of relative diffusion of puffs. In both cases, however, growth of the
dispersion coefficients was found to depend on complex statistical properties of the
turbulent flow. These parameters are usually difficult to assess, and require research-
grade turbulence measurements (Hanna et al., 1982).
In the absence of such measurements, it is general practice to use semi-empirical
parameterizations. These are formed by observing the behaviour of dispersing plumes or
puffs under a broad range of conditions, and generally express the dispersion coefficients
as functions of downwind distance from the source and atmospheric stability. To use
such parameterizations, one must first characterize the atmospheric stability, preferably
by a simple scheme based on inexpensive and easily obtained measurements (Pasquill
and Smith, 1983).
2.1.1 Stability Classification Schemes and Dispersion Coefficient Parameterizations
Atmospheric stability generally refers to the vertical temperature stratification of
the atmosphere, and its resultant effect on the degree of dispersion. Three stability
categories are generally recognized: unstable, stable, and neutral. Unstable conditions are
usually formed shortly after dawn on sunny days, when incoming radiation from the sun
heats the surface of the earth causing the air in the lower levels to be warmer, and
therefore less dense, than the air above it. Such conditions are favourable to the
formation of convective turbulence, since any mass of air displaced slightly up or down
will continue to rise or fall due to the density difference between it and its surroundings
(Sutton, 1949). Thus, in unstable conditions, turbulent mixing is enhanced.
Conversely, stable conditions are usually formed at night, when the surface cools
by emission of long-wave radiation. This often results in an 'inversion', in which
temperature increases with height, causing the suppression of vertical displacements.
The intermediate state is referred to as neutral stability, and is characterized by a slight
decrease in temperature with height, usually very close to the adiabatic lapse rate (about
1 °C per 100 m). Neutral conditions can result from cloudy conditions, which inhibit
incoming and outgoing radiation, and from windy conditions where the wind rapidly
mixes the heated or cooled air vertically, evening out the vertical temperature distribution
(Turner, 1994). Neutral conditions typically show an intermediate level of dispersion.
Figure 2.2 below illustrates the temperature profiles of these stability regimes, together
with typical effects on plume dispersion.
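The three regimes can be distinguished, in a simplified way, by comparing a measured vertical temperature gradient with the adiabatic value of roughly -1 °C per 100 m. The sketch below does this for temperatures measured at two heights. The tolerance band that defines 'close to adiabatic' is an arbitrary assumption, as are the function and argument names, and the rule ignores humidity and the wind-speed and cloud effects described above.

```python
DRY_ADIABATIC_GRADIENT = -0.0098  # K per m, i.e. about -1 deg C per 100 m


def classify_stability(temp_lower_c, temp_upper_c, height_difference_m,
                       tolerance=0.002):
    """Classify a layer as 'unstable', 'neutral' or 'stable'.

    The gradient is the temperature change per metre of height gained.
    tolerance (K per m) is a hypothetical band around the adiabatic gradient.
    """
    gradient = (temp_upper_c - temp_lower_c) / height_difference_m
    if gradient < DRY_ADIABATIC_GRADIENT - tolerance:
        return "unstable"   # cools with height faster than adiabatic
    if gradient > DRY_ADIABATIC_GRADIENT + tolerance:
        return "stable"     # cools more slowly, or warms (an inversion)
    return "neutral"
```

For example, a drop of 2 °C over 100 m is classed as unstable, a drop of 1 °C as neutral, and a rise of 2 °C (an inversion) as stable.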
[Figure 2.2 panel labels: temperature axis; strong lapse condition; weak lapse condition]
Figure 2.2 The effect of atmospheric stability on the dispersion of plumes. The adiabatic lapse rate is shown as a dashed line, while typical vertical temperature profiles are shown as solid lines for (a) unstable conditions, (b) neutral conditions, and (c) stable conditions (Turner, 1994).
One of the most widely used stability classification schemes was first developed
by Pasquill (Pasquill and Smith, 1983), and is applicable for routine meteorological data.
Stability categories are characterized semi-quantitatively by wind speed, incoming
radiation, and the nighttime state of the atmosphere. Specifically, atmospheric stability is
divided into six classes, called 'Pasquill Stability Classes' A to F, where A is the most
unstable category, D is neutral and F is the most stable case. These are based on five
classes of surface wind speeds, three classes of daytime insolation, and two classes of
nighttime cloudiness. Table 2.1 below summarizes the scheme.
Table 2.1 Determining the Pasquill stability category (Pasquill and Smith, 1983).

A: Extremely unstable conditions     D: Neutral conditions
B: Moderately unstable conditions    E: Slightly stable conditions
C: Slightly unstable conditions      F: Moderately stable conditions

Surface wind     Daytime insolation             Nighttime conditions
speed (m/s)      Strong   Moderate   Slight     >1/2 cloud   <3/8 cloud
< 2              A        A-B        B          -            -
2-3              A-B      B          C          E            F
3-4              B        B-C        C          D            E
4-6              C        C-D        D          D            D
> 6              C        D          D          D            D

Pasquill and Smith (1983, pg. 336) offer the following notes for using Table 2.1:
1. Strong insolation corresponds to sunny midday in midsummer England; slight insolation to similar conditions in midwinter.
2. Night refers to the period from 1 hour before sunset to 1 hour after sunrise.
3. Category D should be used, regardless of wind speed, for overcast conditions during day or night, and for any sky conditions during the hour preceding or following night, as defined in (2) above.
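The lookup in Table 2.1 lends itself to a small function. The sketch below encodes the table entries as given above. Two points are assumptions rather than part of the scheme: a wind speed falling exactly on a class boundary is assigned to the higher wind-speed row, and the condition keys and function name are hypothetical. Entries marked '-' in the table are returned as None.

```python
import bisect

# Upper edges (m/s) of the first four surface wind speed classes.
WIND_UPPER_EDGES = [2.0, 3.0, 4.0, 6.0]

# Rows run from the lowest wind speed class (< 2 m/s) to the highest (> 6 m/s).
PASQUILL_TABLE = {
    "strong":       ["A", "A-B", "B", "C", "C"],        # daytime insolation
    "moderate":     ["A-B", "B", "B-C", "C-D", "D"],
    "slight":       ["B", "C", "C", "D", "D"],
    "night_cloudy": [None, "E", "D", "D", "D"],         # > 1/2 cloud
    "night_clear":  [None, "F", "E", "D", "D"],         # < 3/8 cloud
}


def pasquill_class(wind_speed_ms, condition, overcast_or_transition=False):
    """Return the Pasquill stability category from Table 2.1.

    overcast_or_transition covers note 3: overcast skies, or the hour
    preceding or following night, always give category D.
    """
    if overcast_or_transition:
        return "D"
    row = bisect.bisect_right(WIND_UPPER_EDGES, wind_speed_ms)
    return PASQUILL_TABLE[condition][row]
```

For example, `pasquill_class(2.5, "night_clear")` returns "F", while `pasquill_class(5.0, "moderate")` returns the intermediate category "C-D".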
Using these stability criteria, Pasquill analyzed a group of experimental trials, and
measured the crosswind and vertical spread of dispersing plumes at various downwind
distances between 100 m and 1 km. The trials were conducted over flat, uniform terrain,
releases were from near ground level, and concentration measurements were time
averaged over about 10 minutes. Other experimental trials have been used to extend
Pasquill's original parameterization to include the effects of surface roughness and
elevated releases (Pasquill and Smith, 1983). These parameterizations are usually put
into the form of power law equations, where the crosswind and vertical dispersion
coefficients are expressed as functions of downwind distance from the source.
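A power law of this kind has the generic form σ = a·x^b, which is a straight line in log-log coordinates, so the two constants can be estimated from measured spreads by linear least squares. The sketch below illustrates both steps. The coefficient values in the usage note are placeholders chosen for illustration only, not the Pasquill or Slade values, and the function names are hypothetical.

```python
import numpy as np


def power_law_sigma(distance_m, a, b):
    """Dispersion coefficient (m) at a downwind distance: sigma = a * x**b."""
    return a * np.asarray(distance_m, dtype=float) ** b


def fit_power_law(distance_m, sigma_m):
    """Estimate a and b by a least-squares line fit in log-log space."""
    slope, intercept = np.polyfit(np.log(distance_m), np.log(sigma_m), 1)
    return float(np.exp(intercept)), float(slope)
```

For example, spreads generated with the placeholder values a = 0.2 and b = 0.85 at distances of 100, 200, 500 and 1000 m are recovered by `fit_power_law`; real data would scatter about the fitted line.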
The only parameterization of dispersion coefficients for near instantaneous
releases is due to Slade (1968). Slade pooled the results of a number of puff diffusion
experiments to form simple power law relations for crosswind and vertical dispersion
coefficients as functions of downwind distance. Predictions of downwind dispersion
coefficients, \sigma_x, were in general lacking, and are usually taken to be equal to \sigma_y. These
results are based on far fewer experimental trials than are those of Pasquill (CCPS, 1996),
and the experiments varied in source configuration, release height, meteorological and
terrain conditions (Slade, 1968).
Both the Pasquill and Slade curves for dispersion coefficients are in wide use in
Gaussian based dispersion codes (CCPS, 1996), although some models make minor
modifications or further interpolations. Slade's parameterization is more appropriate for
near instantaneous releases than that of Pasquill, which is based on continuous source
trials. Nonetheless, since many more data exist for continuous plumes than for
instantaneous puffs, several models use the Pasquill parameterization for plumes and
puffs alike (CCPS, 1996; Hanna et al., 1982).
2.1.2 The COMBIC Model
The Combined Obscuration Model for Battlefield Induced Contaminants
(COMBIC) is a model developed by the US Army Research Laboratory to estimate
variations of transmissivity through a battlefield obscured by smoke and dust clouds.
These transmissivity values can be calculated for up to seven different EM bands, along
numerous different lines of sight, and for scenarios involving several sources of varying
type.
COMBIC uses a Gaussian-based dispersion code and employs Gaussian puff and
plume models, depending on the source being modeled. In addition to this, COMBIC
incorporates numerous enhancements to account for effects not considered in the basic
Gaussian equation. These include puff-surface interaction effects, cloud buoyancy
effects, and source effects such as initial cloud momentum, temperature, and radius.
COMBIC also uses a sophisticated boundary layer model based on Monin-Obukhov
similarity theory to calculate vertical temperature, wind speed, and density profiles used
in transport, diffusion and buoyancy calculations (Ayres and Desutter, 1995).
For dispersion times less than 30 seconds, the COMBIC Gaussian models use
dispersion coefficients based on the Pasquill parameterization. In the case of an
instantaneous release, the dispersion coefficients take on the following form:
$$ \sigma_y(X) = 0.667\,a\,X^{0.9} $$
where X is the downwind distance from the source (m), and
a, b, and c are numerical parameters derived from the Pasquill
parameterization, and are functions of stability class.
Note that COMBIC uses the same parameters for plume dispersion, with the
difference that $\sigma_x = 0$ (i.e., downwind dispersion is not considered for continuous releases).
Also note the factors of 0.740 and 0.667 for the downwind and crosswind spread
formulas, respectively. These factors are not used in the plume formulas, and are
included to reduce horizontal dispersion in an attempt to compensate for the reduced
effect of meander in puff diffusion (Ayres and Desutter, 1995). Also note that the
downwind factor is larger than the crosswind one, in accordance with the
well-established principle that downwind diffusion proceeds at a greater rate than crosswind
diffusion (Hanna, 1996).
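The effect of the two factors can be sketched in a few lines. This is only an illustration: it assumes that both horizontal puff coefficients are obtained by scaling the same unscaled power-law value, which the text implies but does not state outright, and the function name and the input value are hypothetical.

```python
DOWNWIND_FACTOR = 0.740   # factor quoted in the text for the downwind formula
CROSSWIND_FACTOR = 0.667  # factor quoted in the text for the crosswind formula


def puff_horizontal_sigmas(unscaled_sigma):
    """Scale an unscaled horizontal dispersion coefficient (m) for a puff.

    Assumption: sigma_x and sigma_y both start from the same unscaled value.
    Returns the pair (sigma_x, sigma_y).
    """
    return DOWNWIND_FACTOR * unscaled_sigma, CROSSWIND_FACTOR * unscaled_sigma


# Hypothetical unscaled coefficient of 10 m at some downwind distance.
sigma_x, sigma_y = puff_horizontal_sigmas(10.0)
```

Under this assumption the downwind spread is always about 11% larger than the crosswind spread, whatever the stability class.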
For comparison, the parameterizations due to Slade and Pasquill are shown below
in Figures 2.3 and 2.4, respectively.
[Figure 2.3, panels (a) and (b); horizontal axis: travel distance from source (m), 0 to 120]
Figure 2.3 Slade parameterization of dispersion coefficients as functions of downwind travel distance from the source; (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (CCPS, 1996).
[Figure 2.4, panels (a) and (b); horizontal axis: travel distance from source (m), 0 to 120]
Figure 2.4 Pasquill parameterization of dispersion coefficients as functions of downwind travel distance from the source; shown are the modifications used in COMBIC, equations (2) to (4); (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (Ayres and Desutter, 1996).
COMBIC does not use the modified Pasquill parameterization shown in Figure
2.4 for dispersion times greater than 30 seconds. More complex semi-empirical relations
for the dispersion coefficients are employed. These relations are based on
Monin-Obukhov similarity theory, and account for the effects of wind shear and surface
roughness.
2.2 The Laser Cloud Mapper
The laser cloud mapper (LCM) is a fast scanning lidar system designed and
developed at the Defense Research Establishment Valcartier (DREV), and has been used
in the study of military obscurant characterization (Roy et al., 1994; Evans et al., 1994),
cloud ice formation (Bissonnette et al., 1997), industrial pollutant emissions monitoring
(Pal et al., 1998), and lidar inversion techniques (Evans, 1988; Roy et al., 1993;
Bissonnette and Hutt, 1995). This section will describe the LCM system, and discuss the
inversion algorithm used to extract the concentration maps from the lidar returns.
2.2.1 LCM Specifications
The LCM consists of a 1.06 µm Nd-YAG laser source, a scanning platform,
secondary optics and a receiver. It is controlled by a PC and the entire system is mounted
in a step van. The laser and collecting optics are co-linear.
The laser is a pulse-modulated source, emitting 10 ns pulses at a repetition
frequency of 100 Hz (one emission every 0.01 s). The beam divergence and receiver
field of view were set at 3 and 4 mrads, respectively, which defines a sampling footprint
of 0.3 m at a radial distance of 100 m. The laser pulse energy is 80 mJ.
The scanning optics guide the laser pulses along a raster pattern, as shown in
Figure 2.5, covering a large area in about 3.5 seconds or less. Each lidar emission is
termed a shot, and 150 backscattered returns are collected for each shot. All the shots
along one constant elevation form a sweep. The speed of the scanning optics was set so
that 44 shots were taken along each horizontal sweep, and 6 or 8 sweeps were performed
per scan. The LCM scanned a volume spanned by a radial distance of about 225 m, a
range of 60° in azimuth, and 10° in elevation. Thus the resolution in azimuth is about
1.40°; depending on whether 6 or 8 sweeps were performed per scan, the resolution of the
LCM in elevation is 2° or 1.43°, respectively. The digitization rate of lidar returns was
100 MHz, giving the LCM a radial resolution of 1.5 m.
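The resolution figures quoted above can be checked with a short calculation. This is a minimal sketch: the function names are illustrative, the speed of light is rounded, and the angular spacing is assumed to be the scanned span divided by the number of intervals (samples minus one), which is inferred from the quoted values rather than stated in the text.

```python
SPEED_OF_LIGHT = 3.0e8  # m/s, rounded value used for this estimate


def radial_resolution(digitization_rate_hz):
    """Range covered by one digitizer sample; the factor 2 is the two-way path."""
    return SPEED_OF_LIGHT / (2.0 * digitization_rate_hz)


def angular_resolution(span_deg, n_samples):
    """Angular spacing when n_samples cover span_deg, both ends included (assumed)."""
    return span_deg / (n_samples - 1)


def footprint(divergence_mrad, range_m):
    """Small-angle beam footprint (m) at the given range."""
    return divergence_mrad * 1.0e-3 * range_m


dr = radial_resolution(100e6)            # 100 MHz digitization rate
max_range = 150 * dr                     # 150 returns per shot
azimuth_step = angular_resolution(60.0, 44)
elevation_steps = (angular_resolution(10.0, 6), angular_resolution(10.0, 8))
spot = footprint(3.0, 100.0)             # 3 mrad divergence at 100 m
```

With the values from the text these give 1.5 m, 225 m, roughly 1.40°, 2° and roughly 1.43°, and 0.3 m, matching the quoted specifications.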
Figure 2.5 Raster scanning pattern used by the LCM.
2.2.2 Lidar Inversion
The lidar returns collected by the LCM system are in the form of a current, which
is then amplified logarithmically. The amplified current can then be converted into a
measure of backscattered power, in watts, using the log-amp calibration curve and the
known circuit parameters. One can then use an inversion algorithm to convert this power
measurement into a measurement of aerosol concentration. However, this is no trivial
task, given the non-linearity and complexity of the interaction between the LCM beam
and the scattering aerosol.
The lidar equation relates the power of a backscattered lidar return from some
range r to the volumetric extinction coefficient of the scattering aerosol, usually denoted
by $\sigma(r)$. This is defined as the fraction by which the flux of energy in the direction of
propagation is reduced per unit length, and has units of m$^{-1}$ (Hinckley, 1976). It is
generally a function of both position and wavelength. The extinction coefficient of an
aerosol is a parameter of considerable interest, because it can be used to determine that
aerosol's concentration, via the relation:
$$ \sigma(r, \lambda) = \alpha(\lambda)\,C(r) \qquad (5) $$
where $\alpha(\lambda)$ is the mass extinction coefficient of the aerosol (m$^2$/g), and
$C(r)$ is the concentration (g/m$^3$).
Thus, given that the mass extinction coefficient for a given aerosol is known, extinction
values are easily converted to measures of aerosol concentration. However, determining
the extinction coefficient $\sigma(r)$ from the backscattered lidar returns requires inversion of
the lidar equation, which is based on radiative transfer physics.
The radiative transfer equation is a complex relation describing the intensity of
EM radiation (of wavelength $\lambda$) received at a given point, emanating from a target some
distance r away. In differential form, the equation can be written as (considering only the
radial direction for simplicity):
$$ \frac{d}{dr} I(r, \lambda) = -\sigma(r, \lambda)\,\bigl[ I(r, \lambda) - J(r, \lambda) \bigr] \qquad (6) $$
where $I(r, \lambda)$ is the received intensity (W/m$^2$),
$J(r, \lambda)$ is the intensity of radiation due to the source (W/m$^2$), and
$\sigma(r, \lambda)$ is the extinction coefficient (m$^{-1}$).
Equation (6) assumes that the wavelength of the radiation is much smaller than
the typical distance between scatterers (Costa, 1998). That is, the scattered radiation is
incoherent and non-interfering. The source term, $J(r, \lambda)$, takes into account all radiation
scattered into the propagation path from all directions at all points along the path. It also
accounts for thermal emission of the medium into the propagation path. Thus this term
accounts for all diffuse radiation reaching the receptor, while the $I(r, \lambda)$ term represents
the direct transmittance. In general, $J(r, \lambda)$ is difficult to specify; it depends on the phase
function of the scattering aerosol, and its thermal emission properties. Equation (6) can
be considerably simplified by making the assumption that the diffuse radiation is
negligible in comparison with the direct radiation (i.e., $J(r, \lambda) \ll I(r, \lambda)$). That is to say,
once a photon is scattered by an aerosol particle it is permanently removed from the
beam. This assumption of single-scattering is very important, and is often not valid,
particularly in dense media. The effects of this and other assumptions will be discussed
below.
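Given the assumption of negligible source emission (multi-scattered and thermal),
equation (6) is greatly simplified to:
$$ \frac{d}{dr} I(r, \lambda) = -\sigma(r, \lambda)\,I(r, \lambda) \qquad (7) $$
which integrates to
$$ I(r, \lambda) = I(r_0, \lambda)\,\exp\left[ -\int_{r_0}^{r} \sigma(r', \lambda)\,dr' \right] \qquad (8) $$
where $I(r_0, \lambda)$ is the intensity of the beam before any extinction, that is, at the
target. This is the familiar Beer-Lambert law for direct transmittance in the most general
case.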
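For a sampled extinction profile, the Beer-Lambert law reduces to a running sum of optical depth. The sketch below is illustrative only: it assumes uniformly spaced samples and uses the trapezoid rule for the integral, and the function name and sample values are hypothetical.

```python
import math


def transmittance(sigma, dr):
    """Direct transmittance I(r)/I(r0) at each sample of a uniform range grid.

    sigma: extinction coefficients (1/m) sampled every dr metres.
    The optical depth is accumulated with the trapezoid rule.
    """
    optical_depth = 0.0
    result = [1.0]  # no extinction before the first sample
    for i in range(1, len(sigma)):
        optical_depth += 0.5 * (sigma[i - 1] + sigma[i]) * dr
        result.append(math.exp(-optical_depth))
    return result


# Hypothetical uniform aerosol: 0.01 1/m over 150 m, sampled every 1.5 m.
profile = [0.01] * 101
T = transmittance(profile, 1.5)
```

For this uniform profile the optical depth over 150 m is 1.5, so the final transmittance is exp(-1.5), about 0.22.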
For the specific case of the LCM, the received intensity is dependent on additional
factors. Since the pulse has a finite duration or length, $\tau$, it illuminates a finite portion of
the air at any time ($c\tau$, where c is the speed of light). Thus, the 'effective pulse length' is
the range interval from which a signal is received at any instant; this distance is $c\tau/2$ due
to the two-way path that the pulse must traverse (Hinckley, 1976). The received signal
intensity is also directly proportional to the solid angle subtended by the receiver; this is
the factor $A/r^2$, where A is the effective receiver area, and r is the range from lidar to
target. Other system-specific factors can be incorporated into a system function, denoted
by F(r). Generally F(r) is taken as a calibration constant which accounts for light
reflection losses in the optical sections of the lidar system, geometric crossover of the
receiver and laser beam, and the quantum efficiency of the receiver (Pollock, 1993;
Costa, 1998).
In addition to the above mentioned factors affecting the received signal intensity,
the signal will also depend on the probability of being backscattered at a range r. The
quantity $\beta(r)$ is the elastic volume backscattering coefficient, with units of m$^{-1}$ sr$^{-1}$, and is
a measure of the fraction of incident energy scattered in the backward direction (i.e.,
towards the lidar system) per unit solid angle, per unit path length. Although it is in
general a complex function of aerosol size distribution and shape, backscattering
generally increases with increased aerosol concentration. Finally, the exponential factor
seen in equation (8) will represent the transmissivity along the two-way path from source
to target and back; hence, it must be squared.
Incorporating equation (8) together with all the additional factors affecting the
LCM received signal strength, and noting that power is proportional to intensity, we
finally get the lidar equation:
$$ P(r) = P_0\,\frac{c\tau}{2}\,\frac{A}{r^2}\,F(r)\,\beta(r)\,\exp\left[ -2\int_0^r \sigma(r')\,dr' \right] \qquad (9) $$
where $P(r)$ is the received signal power (W),
$P_0$ is the transmitted pulse power (W),
and the functional dependence on $\lambda$ is understood.
This is the equation that governs the relationship between the received lidar signal power
and the extinction coefficient, and hence the aerosol concentration via equation (5).
However, it rests on a number of assumptions.
In addition to the three assumptions made in arriving at equation (8), a number of
further assumptions were made in deriving the lidar equation. Below is a list of all the
assumptions made in the derivation of the lidar equation:
1. incoherent scatterers,
2. no significant emission or source terms ($J(r, \lambda) \ll I(r, \lambda)$),
3. first-order multiple scattering only,
4. only radiation scattered exactly through 180° is received,
5. only a plane wave is received (or use an effective receiver area),
6. no beam pulse stretching (constant $\tau$),
7. the receiver encompasses the laser beam,
8. the scattering medium does not change in index of refraction, size, shape, or orientation distribution over the scattering volume,
9. $\sigma(r)$ is independent of $P_0$,
10. particles are not shadowed by one another,
11. the light source is monochromatic, and
12. the laser pulse is rectangular (Evans, 1984, 1988).
Note that the scattering volume is the volume of aerosol instantaneously
illuminated by the pulse. Most of the above assumptions are true for typical lidar systems
and aerosols; some of them impose somewhat opposing restrictions, calling for a
compromise between a small pulse length and scattering volume, and large enough
aperture area and beam divergence (Evans, 1988). Nonetheless, a reasonable
compromise is not overly restrictive to typical systems, and most of the assumptions can
be met. However, the assumption that no multiple scattering events take place is the
most difficult assumption to work with and is the one most easily violated. Dense
aerosols and receivers with large apertures will in general cause the violation of this
assumption. The inversion method attempts to correct for the multi-scattering effects, as
will be discussed below.
As can be seen from equation (5), in order to construct a concentration map for an
aerosol, one must first obtain an extinction map. This must somehow be obtained from
the returned power signal, as given by the lidar equation, as expressed in equation (9).
Even casual inspection of this equation shows that this is no trivial task. It is a non-linear
function of two unknowns (assuming F(r) to be constant). Numerous inversion
techniques have been devised over the past few decades, of varying success; most of
these methods are plagued by instability or inaccuracy. The Automatically Generated
Inversion of the Lidar Equation, or AGILE, developed at DREV over the mid-to-late
1980s (Evans, 1984, 1988), provides a fast inversion of the lidar equation without the
persistent instabilities of previous methods.
It is first necessary to reduce the number of unknowns in the lidar equation.
Based on several observational and theoretical studies, it has been found that when
particulate backscattering dominates (generally true for infrared wavelengths), a power
law can be used to relate $\beta(r)$ and $\sigma(r)$:
$$ \beta(r) = d\,\sigma^k(r), \qquad (10) $$
where d is a constant, and k depends on lidar wavelength and various properties of
the aerosol. Reported values of the exponent are generally in the range 0.67 < k < 1.0
(Klett, 1981). A survey of the literature indicates that under most conditions, k = 1 is a
good estimate. Theoretical and experimental results over a wide variety of aerosols,
particle shapes, and extinction coefficients indicate that this assumption of linearity is
excellent. However, caution is required: certain effects can cause non-linearity, including
multi-scattering, changes in aerosol size distribution, shape or index of refraction (Klett,
1981).
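Incorporating the rule (10) into the lidar equation gives:
$$ P(r) = P_0\,\frac{c\tau}{2}\,\frac{A}{r^2}\,F(r)\,d\,\sigma^k(r)\,\exp\left[ -2\int_0^r \sigma(r')\,dr' \right] \qquad (11) $$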
The strength behind the AGILE inversion is in its use of a clear air calibration
shot. Given that the extinction coefficient of clear air, $\sigma_c$, is constant over r, one obtains
from (11) the returned signal strength for the clear air shot, denoted by C(r):
$$ C(r) = P_0\,\frac{c\tau}{2}\,\frac{A}{r^2}\,F(r)\,d\,\sigma_c^k\,\exp\left[ -2\sigma_c r \right] \qquad (12) $$
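Now, dividing the returned signal by the clear air return (i.e., dividing equation
(11) by equation (12)), and rearranging, we get
$$ \frac{P(r)}{C(r)} = \left[ \frac{\sigma(r)}{\sigma_c} \right]^k \exp\left\{ -2\int_0^r \left[ \sigma(r') - \sigma_c \right] dr' \right\} \qquad (13) $$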
Note that system constants and the $1/r^2$ dependence have dropped out. Taking
the $k$th root of this equation, and integrating over r produces
$$ \int_0^r \left[ \frac{P(r')}{C(r')} \right]^{1/k} dr' = \frac{1}{\sigma_c} \int_0^r \sigma(r')\,T(r')^{2/k}\,e^{2\sigma_c r'/k}\,dr' $$
where T(r) is the transmissivity at distance r, defined as $T(r) = e^{-\tau}$, and $\tau$ is the optical
depth, defined as $\tau = \int_0^r \sigma(r')\,dr'$; note that transmissivity is bounded by the
interval [0, 1].
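If it is assumed that $\sigma_c$ is small enough to be negligible ($\sigma_c \ll k/2r$, true for the
ranges considered in this analysis) then
$$ \int_0^r \left[ \frac{P}{C} \right]^{1/k} dr' = \frac{k}{2\sigma_c} \left[ 1 - T^{2/k} \right] \qquad (14) $$
where it is understood that P, C and T are all functions of r. Equation (14) is an
important result. Since transmissivity is a bounded quantity, this indicates that the left side
of the equation is also bounded. That is,
$$ \int_0^r \left[ \frac{P}{C} \right]^{1/k} dr' \le \frac{k}{2\sigma_c} $$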
Thus, as transmissivity decreases (i.e., as the aerosol becomes increasingly dense)
the integral on the left of (14), which is the total normalized integrated backscatter,
approaches a maximum theoretical value. If the measured total integrated backscatter
exceeds this limit, then some experimental error has occurred; perhaps a digitization
error, or a physical assumption is no longer valid (such as that of single-scattering). This
provides a weak check on the system, inversion method and physical assumptions
(Evans, 1984). As will be discussed below, this fact will be used when the BTN
inversion is implemented in practice to attempt to correct for multiscattering effects in
dense clouds.
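Returning to equation (13), taking the $k$th root, and using the definition of
transmissivity,
$$ \left[ \frac{P}{C} \right]^{1/k} = \frac{\sigma}{\sigma_c}\,T^{2/k}\,e^{2\sigma_c r/k} \qquad (15) $$
Again assuming $\sigma_c \ll k/2r$, and rearranging, (15) becomes
$$ \sigma = \sigma_c\,\frac{[P/C]^{1/k}}{T^{2/k}} \qquad (16) $$
Rearranging equation (14) gives:
$$ T^{2/k} = 1 - \frac{2\sigma_c}{k} \int_0^r \left[ \frac{P}{C} \right]^{1/k} dr' \qquad (17) $$
Finally, substituting (17) into equation (16) gives:
$$ \sigma = \frac{\sigma_c\,[P/C]^{1/k}}{1 - \dfrac{2\sigma_c}{k} \displaystyle\int_0^r \left[ \frac{P}{C} \right]^{1/k} dr'} \qquad (18) $$
where P, C and $\sigma$ are all functions of r. This is the final equation used by the AGILE
algorithm to calculate the extinction coefficient from the returned lidar signal and a clear
air calibration shot.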
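The inversion can be illustrated numerically. The sketch below is not the AGILE code itself: it assumes that equation (18) takes the form sigma(r) = sigma_c [P/C]^(1/k) / (1 - (2 sigma_c / k) * integral of [P/C]^(1/k) from 0 to r), which is what the derivation steps above lead to, and it uses a plain trapezoid rule and no multiple-scattering correction. The function names, the synthetic cloud profile, and the clear-air value `sigma_c` are hypothetical.

```python
import math


def agile_invert(norm_return, dr, sigma_c, k=1.0):
    """Invert normalized returns P/C into extinction coefficients (1/m).

    norm_return: P(r)/C(r) sampled every dr metres, starting at r = 0.
    Returns NaN wherever the bounded-integral limit is reached or exceeded.
    """
    sigma = []
    integral = 0.0
    previous = None
    for ratio in norm_return:
        root = ratio ** (1.0 / k)
        if previous is not None:
            integral += 0.5 * (previous + root) * dr  # running trapezoid rule
        denominator = 1.0 - (2.0 * sigma_c / k) * integral
        sigma.append(sigma_c * root / denominator if denominator > 0.0 else float("nan"))
        previous = root
    return sigma


def simulate_norm_return(sigma, dr, sigma_c, k=1.0):
    """Synthetic P/C for a known extinction profile, neglecting sigma_c in the exponent."""
    optical_depth = 0.0
    out = []
    for i, s in enumerate(sigma):
        if i > 0:
            optical_depth += 0.5 * (sigma[i - 1] + s) * dr
        out.append((s / sigma_c) ** k * math.exp(-2.0 * optical_depth))
    return out


# Hypothetical Gaussian cloud centred 100 m from the lidar, sampled every 1.5 m.
dr = 1.5
sigma_c = 2.0e-5  # hypothetical clear-air extinction (1/m)
true_sigma = [1.0e-4 + 0.01 * math.exp(-((i * dr - 100.0) / 20.0) ** 2) for i in range(150)]
recovered = agile_invert(simulate_norm_return(true_sigma, dr, sigma_c), dr, sigma_c)
```

The round trip recovers the synthetic profile closely because the cloud is optically thin. As the denominator approaches zero in denser clouds, small errors in the integral are strongly amplified.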
When implemented in practice, the AGILE algorithm must account for some
additional factors and potential sources of error. The primary challenge to implementing
most inversion algorithms is two-fold: accounting for multiscattering effects, and the
response of the logarithmic amplifier. If the system response is too slow in recovery or
the cloud is dense enough for multiple scattering, then the limit on the integral in
equation (14) could be surpassed (Evans, 1984). For signals that approach this limit too
rapidly, the AGILE algorithm applies a numerical compensation. This keeps the signal
within the allowable limit (conservation of energy), and permits to some extent correction
for multiple scattering and the down time of the system (Roy et al., 1993).
This is done by multiplying the normalized return ($P(r)/C(r)$) by the value $T^z$,
where T is the transmissivity at r given by equation (17), and z is an empirical factor less
than unity. Monte Carlo simulations and theories of multiple scattering suggest that this
is a reasonable rule of thumb, and others (Kunkel and Weinman, 1976) have adopted a
similar approach. This correction procedure is only applied to returns that have an
optical depth greater than unity. It is a fairly general procedure in that it is not specific to
a given aerosol, but numerical compensation is specific to the lidar system (Roy et al.,
1993). Evans (1984) has found that a value of z = 0.8 provides good agreement with
measured extinction values for the LCM.
If too much correction is needed to impose the limit of equation (14), the AGILE
algorithm outputs an error value of 1. In practice, values of $\sigma(r) > 0.5$ m$^{-1}$ are assumed to
be the result of inversion errors, and are discarded. The minimum detectable extinction
of the LCM is estimated to be $10^{-4}$ m$^{-1}$, and inverted values below this threshold are
considered clear air returns (i.e., zero aerosol concentration).
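The screening rules above, combined with equation (5), can be sketched as a small post-processing step. This is an illustration under stated assumptions: the function name is hypothetical, the thresholds are passed in by the caller (the example uses the values as read from the text), and the mass extinction coefficient of 2.0 m^2/g is an arbitrary placeholder.

```python
def extinction_to_concentration(sigma, alpha, sigma_min, sigma_max):
    """Convert inverted extinction values (1/m) to concentrations (g/m^3).

    alpha: mass extinction coefficient (m^2/g), as in equation (5).
    Values above sigma_max, or NaN, are treated as inversion errors (None).
    Values below sigma_min are treated as clear air (zero concentration).
    """
    concentrations = []
    for s in sigma:
        if s != s or s > sigma_max:      # s != s is true only for NaN
            concentrations.append(None)
        elif s < sigma_min:
            concentrations.append(0.0)
        else:
            concentrations.append(s / alpha)
    return concentrations


# Hypothetical inverted values: clear air, a valid return, and an inversion error.
result = extinction_to_concentration(
    [5.0e-5, 0.02, 0.8], alpha=2.0, sigma_min=1.0e-4, sigma_max=0.5
)
```

Returning `None` for discarded values keeps them distinguishable from genuine zero concentrations when a concentration map is later assembled.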
In addition to its ability to compensate for multiple scattering events, the AGILE
algorithm offers a number of advantages over previous inversion methods. The value of
$\sigma_c$ can be easily obtained by a trial inversion of one shot in which the transmissivity is
measured, or by other means (Evans, 1988). This does not pose a problem for the LCM,
which can perform many clear air shots in a very short time. The value used in the
inversion algorithm is a, = 2 - 1 O" m-l. This value must be detamined only once, and
can be used repeatedly for cases where many shots must be processed, provided that the
shots are taken under the same conditions. This is a great improvement over older
methods, where a reference shot must be obtained for every lidar return (Klett, 1981).
Not only does the calibration with a clear air shot eliminate system
constants such as F(r), $P_0$, and the $1/r^2$ attenuation, it also considerably improves the
signal-to-noise ratio. For an AGILE-inverted signal, a histogram of extinction coefficient
frequency will show a standard deviation as much as one half of that obtained for an
inversion not using the clear-air calibration (Evans, 1984). Thus the AGILE inversion
allows for the detection of a very weak return.
Further, the integration of equation (18) is carried out using a hybrid version of
the trapezoid rule and Simpson's rule, providing an accurate method for numerical
integration with a uniform digitization rate, while giving an integrated value at every
digitized point (Evans, 1988). Transmissivity estimates incur less error than previous
methods, since only one numerical integration must be carried out instead of the usual
two (one for the normalized integrated backscatter, and another for the integral of
extinction values).
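One way to obtain a running integral at every sample while still using Simpson's rule is sketched below. The text does not specify the exact hybrid scheme used by AGILE, so this is only one plausible construction: even-numbered samples are reached by Simpson's rule over pairs of intervals, and odd-numbered samples add a single trapezoid step to the preceding Simpson value. The function name is hypothetical.

```python
def cumulative_hybrid(y, h):
    """Running integral of uniformly sampled y (spacing h) at every sample.

    Even indices: Simpson's rule over the last two intervals.
    Odd indices: previous (even-index) value plus one trapezoid step.
    """
    out = [0.0] * len(y)
    for i in range(1, len(y)):
        if i % 2 == 0:
            out[i] = out[i - 2] + h / 3.0 * (y[i - 2] + 4.0 * y[i - 1] + y[i])
        else:
            out[i] = out[i - 1] + 0.5 * h * (y[i - 1] + y[i])
    return out


# Integrate y = x^2 sampled at x = 0, 1, 2, 3, 4.
running = cumulative_hybrid([0.0, 1.0, 4.0, 9.0, 16.0], 1.0)
```

In this example the even-index values are exact for the quadratic (8/3 at x = 2 and 64/3 at x = 4), while the odd-index values carry the usual trapezoid error for their last step.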
The AGILE inversion technique has been validated by comparison with other
inversion techniques and field measurements. Specifically, inverted LCM returns were
compared to computations derived from Klett's inversion method (Klett, 1985), showing
excellent agreement (Evans, 1984). AGILE inversions also showed good agreement with
measurements made by coincident transmissometers, in situ concentration measurements,
and photographic estimates of cloud extent (Evans, 1984).
2.3 Artificial Neural Networks
The principal cellular unit of the brain is the neuron. Within the brain, millions of
these neurons are interconnected in a complex network where information is exchanged
back and forth via electrical signals. A given neuron receives numerous signals along
branch-like structures known as dendrites. If the combined signals are strong enough,
this neuron will 'fire', transmitting a signal along its axon to other neurons. Between the
dendrites of one neuron and the axon of another is a small gap called the synapse, across
which the signal is transferred. The magnitude of the received signal depends both on the
strength of the original transmitted stimulus, and the properties of the synapse. Figure
2.6 below shows the basic structure of the neuron.
Figure 2.6 The basic structure of a biological neuron.
Artificial neural networks attempt to model this mechanism. The basic unit of
ANNs is the processing element (PE), which receives external stimuli, combines them,
and transmits a signal. Each input to a PE is multiplied by a distinct factor, called a
weight, and these products are then summed at the PE. The sum is then modified by a
nonlinear transfer function, and the resultant value is passed on. This output value may
be the final result, or it may act as the input to other PEs, where another connection
weight is applied prior to being summed by the next processing element. The connection
weights are analogous to the synaptic signal strength of neural connections. The basic
structure of a single processing element is shown below in Figure 2.7.
Figure 2.7 The basic structure of a processing element.
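A single processing element amounts to a weighted sum followed by a transfer function. The following is a minimal sketch: the function name and the example values are illustrative, and the hyperbolic tangent is used as the default transfer function only as an example.

```python
import math


def processing_element(inputs, weights, transfer=math.tanh):
    """Weight each input, sum the products, and apply the transfer function."""
    combined = sum(w * x for w, x in zip(weights, inputs))
    return transfer(combined)


# Two inputs whose weighted contributions cancel, so the PE outputs tanh(0) = 0.
output = processing_element([1.0, 2.0], [0.5, -0.25])
```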
The synapses play an important role in the process of learning. In 1949, Hebb
presented his postulate of learning, which states that the effectiveness of a synapse
between two neurons is increased by the repeated stimulation of one neuron by the other
across that synapse (Haykin, 1994). This mechanism for the learning process was
adopted for artificial neural networks. As examples of the process to be learned are
presented to the ANN, the connecting weights are modified in a systematic way so that
the network eventually 'learns' that process.
2.3.1 Multi-Layer Feed Forward Networks and Backpropagation Learning
The simple single PE shown above in Figure 2.7 does not achieve much in the
way of learning or predicting. The functionality desired in practical applications requires
multiple, interconnected PEs, which are organized into groups called layers. A typical
ANN consists of a sequence of layers with connections between the PEs of successive
layers. A characteristic multi-layer feed-forward (MLFF) ANN is shown below in Figure
2.8. In this figure, the leftmost layer is called the input layer, where data are initially
entered into the network, and the rightmost layer is termed the output layer, where the
ANN predictions are generated. The remaining layers are not part of the input or output,
and are called 'hidden layers'. The following convention will be used to describe a
network's architecture: the number of nodes in each layer is listed, starting with the input
layer and separated by a dash. For example, 9-30-15-1 represents a network with 9 input
nodes, 30 PEs in the first hidden layer, 15 PEs in the second hidden layer, and 1 output
PE. Feed-forward ANNs are so named because the input signals and the internal
intermediate signals are always propagated forward. The flow of information is only
directed towards the output, and no returning paths exist.
[Figure 2.8 layer labels, left to right: input layer, first hidden layer, second hidden layer, output layer]
Figure 2.8 A multi-layer feed-forward ANN with two hidden layers (Haykin, 1994).
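The architecture convention and the forward flow of signals can be sketched as follows. This is an illustration only: the helper names are hypothetical, the weights are small random numbers, no bias terms are included, and the hyperbolic tangent is applied at every PE, none of which is prescribed by the text at this point.

```python
import math
import random


def build_network(architecture, seed=0):
    """Create random weight matrices for an architecture string such as '9-30-15-1'.

    Each layer is a list of rows, one row of incoming weights per PE.
    """
    sizes = [int(n) for n in architecture.split("-")]
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def feed_forward(layers, x):
    """Propagate an input vector forward, layer by layer, to the output layer."""
    signal = list(x)
    for layer in layers:
        signal = [math.tanh(sum(w * s for w, s in zip(row, signal))) for row in layer]
    return signal


network = build_network("9-30-15-1")
prediction = feed_forward(network, [0.5] * 9)  # one output value
```

Because signals only ever move from one layer to the next, a single loop over the layers is enough to evaluate the whole network.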
The backpropagation learning algorithm gets its name from the way corrections
are applied to the network weights. During training, input vectors are presented to the
network, and each vector is propagated forward, layer by layer, until an output vector is
calculated. This predicted output vector is compared to the target value associated with
that particular input vector, and an error is calculated. This error is then used to adjust
the weights between the last hidden layer and the output layer. An error value is then
computed using the outputs of the last hidden layer, and this error value is used to adjust
the weights in the previous layer. This continues until the weight connections from the
input layer are adjusted. In this way, errors are propagated backward layer by layer. This
is repeated many times for the vectors in the training set until the output error reaches a
minimum.
The derivation of the algorithm used for learning is presented below (Patterson,
1996). For simplicity, only MLFF networks with one hidden layer will be considered,
and the output layer will consist of only a single PE, as is typical of prediction problems.
Consider a network with n input PEs, h hidden layer PEs and 1 output PE. The following
notation will be adopted:
$x$        n-dimensional input vector (with elements $x_i$, i = 1, 2, ..., n)
$v_{ji}$   weight connection between input-layer PE i and hidden-layer PE j
$w_j$      weight connection between hidden-layer PE j and the output PE
$y_j$      output of the $j$th hidden-layer PE
$z$        output from the output PE for input vector x
$t$        target output for input vector x
$f()$      nonlinear activation function
When a training vector x is presented to the input layer, each PE, j, in the hidden
layer sums the elements of x to $H_j$, first multiplying each $x_i$ by its associated weight
value:
$$ H_j = \sum_{i=1}^{n} v_{ji}\,x_i, \qquad j = 1, 2, \ldots, h \qquad (19) $$
Similarly, as the outputs from the hidden layer, $y_j$, are passed to the output PE,
each one is first multiplied by its associated weight value, then summed to I:
$$ I = \sum_{j=1}^{h} w_j\,y_j \qquad (20) $$
Thus $H_j$ is the combined input to the $j$th hidden-layer PE, and I is the combined
input to the output PE. The output from the $j$th hidden-layer PE is then given by:
$$ y_j = f(H_j), \qquad j = 1, 2, \ldots, h \qquad (21) $$
Similarly, the output from the single output PE is:
$$ z = f(I). \qquad (22) $$
Combining equations (19) to (22), the output PE produces a value z, due to an input
vector x, where:
$$ z = f\left( \sum_{j=1}^{h} w_j\, f\left( \sum_{i=1}^{n} v_{ji}\,x_i \right) \right) \qquad (23) $$
Some of the weights and outputs for a small network are illustrated below in Figure 2.9
for clarity.
Figure 2.9 MLFF network connections and variables
The goal of the learning process is to minimize the difference between the
network's output, z, and the target output, t, over all the input vectors in the training set.
One must therefore first define an error function, or cost function, to be minimized.
Typically, the mean square error is used to define the error function; thus, the error
associated with a given input vector x is:
$$ E = \tfrac{1}{2}\,(t - z)^2 \qquad (24) $$
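To minimize this error function by adjusting the weights, a gradient descent
method is adopted, so that the weight adjustment, $\Delta w_j$, is in the direction of decreasing
error:
$$ \Delta w_j = -\eta\,\frac{\partial E}{\partial w_j} \qquad (25) $$
where $\eta$ is a constant learning coefficient, typically < 1. The chain rule is then invoked to
evaluate equation (25):
$$ \frac{\partial E}{\partial w_j} = \frac{\partial E}{\partial I}\,\frac{\partial I}{\partial w_j} \qquad (26) $$
From equation (20), the second factor is:
$$ \frac{\partial I}{\partial w_j} = y_j \qquad (27) $$
The first factor of (26) can be broken down using the chain rule again:
$$ \frac{\partial E}{\partial I} = \frac{\partial E}{\partial z}\,\frac{\partial z}{\partial I} $$
From equations (24) and (22) we have
$$ \frac{\partial E}{\partial z} = -(t - z) \quad \text{and} \quad \frac{\partial z}{\partial I} = f'(I), $$
respectively, where $f'(I)$ is the derivative of f with respect to I. Consequently,
$$ \frac{\partial E}{\partial I} = -(t - z)\,f'(I) \qquad (28) $$
Substituting equations (27) and (28) into (26) we arrive at the expression:
$$ \frac{\partial E}{\partial w_j} = -(t - z)\,f'(I)\,y_j \qquad (29) $$
For compactness, define $\delta = (t - z)\,f'(I)$.
Then equation (25) can be written:
$$ \Delta w_j = \eta\,\delta\,y_j \qquad (30) $$
This weight update rule is valid for all the weights connecting the hidden layer PEs to the
output layer PE.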
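Next, the weight updates for the weights connecting the input layer to the hidden
layer must be calculated. As is clear from equation (23), these weights are deeply
embedded in the error function. An expression is desired for:
$$ \Delta v_{ji} = -\eta\,\frac{\partial E}{\partial v_{ji}} = -\eta\,\frac{\partial E}{\partial H_j}\,\frac{\partial H_j}{\partial v_{ji}} \qquad (31) $$
The second factor in (31) is easily evaluated from equation (19):
$$ \frac{\partial H_j}{\partial v_{ji}} = x_i \qquad (32) $$
The first factor can be broken down using the chain rule:
$$ \frac{\partial E}{\partial H_j} = \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial H_j} \qquad (33) $$
The second factor of (33) is easily evaluated using (21):
$$ \frac{\partial y_j}{\partial H_j} = f'(H_j) \qquad (34) $$
and the first factor can be evaluated directly from equations (23) and (24):
$$ \frac{\partial E}{\partial y_j} = -(t - z)\,\frac{\partial z}{\partial y_j} $$
and using equation (23):
$$ \frac{\partial E}{\partial y_j} = -(t - z)\,f'(I)\,w_j. \qquad (35) $$
Substituting equations (32) to (35) into (31) finally yields:
$$ \Delta v_{ji} = \eta\,x_i\,f'(H_j)\,(t - z)\,f'(I)\,w_j = \eta\,x_i\,f'(H_j)\,\delta\,w_j, \qquad (36) $$
where $\delta$ is defined above. Again, for compactness, we can define $\delta_j$ as:
$$ \delta_j = f'(H_j)\,\delta\,w_j. $$
Then all the weights connecting the input layer to the hidden layer are adjusted according
to the following rule:
$$ \Delta v_{ji} = \eta\,\delta_j\,x_i \qquad (37) $$
After the presentation of an input vector, and the calculation of the weight updates
according to equations (30) and (37), each weight in the ANN is adjusted according to:
$$ w_j^{new} = w_j^{old} + \Delta w_j \qquad (38) $$
$$ v_{ji}^{new} = v_{ji}^{old} + \Delta v_{ji}. \qquad (39) $$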
This is the most basic form of the backpropagation learning rule. Some comments
concerning the details of implementing this algorithm and some useful enhancements are
in order.
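The basic rule can be written out for the one-hidden-layer, single-output network considered in the derivation. The sketch below makes several assumptions: the transfer function is the hyperbolic tangent, the error is E = (t - z)^2 / 2, the updates are $\Delta w_j = \eta\,\delta\,y_j$ and $\Delta v_{ji} = \eta\,\delta_j\,x_i$ with $\delta = (t - z) f'(I)$ and $\delta_j = f'(H_j)\,\delta\,w_j$, and there are no bias terms. The function names and the numerical values are hypothetical.

```python
import math


def forward(x, v, w):
    """Return the hidden outputs y_j and the network output z for input x."""
    H = [sum(v_ji * x_i for v_ji, x_i in zip(row, x)) for row in v]  # combined hidden inputs
    y = [math.tanh(H_j) for H_j in H]                                 # hidden outputs
    I = sum(w_j * y_j for w_j, y_j in zip(w, y))                      # combined output input
    return y, math.tanh(I)


def squared_error(x, t, v, w):
    """Error for one training vector, E = (t - z)^2 / 2."""
    return 0.5 * (t - forward(x, v, w)[1]) ** 2


def backprop_step(x, t, v, w, eta):
    """Apply one backpropagation update and return the new (v, w)."""
    y, z = forward(x, v, w)
    delta = (t - z) * (1.0 - z * z)  # tanh'(I) = 1 - z^2
    delta_hidden = [(1.0 - y_j * y_j) * delta * w_j for y_j, w_j in zip(y, w)]
    new_w = [w_j + eta * delta * y_j for w_j, y_j in zip(w, y)]
    new_v = [
        [v_ji + eta * d_j * x_i for v_ji, x_i in zip(row, x)]
        for row, d_j in zip(v, delta_hidden)
    ]
    return new_v, new_w


# Hypothetical 2-3-1 network and a single training pair.
x, t = [0.5, -0.3], 0.4
v = [[0.1, -0.2], [0.3, 0.05], [-0.15, 0.2]]
w = [0.2, -0.1, 0.3]
v, w = backprop_step(x, t, v, w, eta=0.1)
```

Both weight sets are updated from values computed before the step, so the hidden-layer update uses the old output weights, as in the derivation.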
The choice of the form of the non-linear transfer function f is somewhat flexible.
The only strict requirements are that it is bounded, continuous, and continuously
differentiable. Typical choices for f are the sigmoid function and the hyperbolic tangent
function. These functions are shown in Figure 2.10 below. The shape of these two
functions is similar; the main difference is the range onto which the domain is mapped.
As can be seen from the figure, the sigmoid function maps values onto the range [0, 1],
while the hyperbolic tangent maps onto [-1, 1]. Note also that the derivative of either
function can be expressed in terms of the function itself (see Figure 2.10), which is
convenient due to the presence of the derivative factors in the weight update equations.
The hyperbolic tangent function was used for this work.
$$ f(x) = \tanh(x), \qquad f'(x) = 1 - (f(x))^2 $$
Figure 2.10 Common transfer functions: (top) the hyperbolic tangent, (bottom) the sigmoid function.
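The convenience of expressing each derivative through the function value can be shown directly. In the sketch below the sigmoid is taken to be the standard logistic function 1/(1 + exp(-x)), whose derivative is f(1 - f); the function names are illustrative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative_from_output(f):
    """Derivative of the sigmoid written in terms of its output f = sigmoid(x)."""
    return f * (1.0 - f)


def tanh_derivative_from_output(f):
    """Derivative of tanh written in terms of its output f = tanh(x)."""
    return 1.0 - f * f


# During training the PE output is already known, so no extra exponentials are needed.
y = math.tanh(0.7)
slope = tanh_derivative_from_output(y)
```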
Input vectors should not be presented to the ANN as raw values. They should be
scaled to a smaller range so that they are more compatible with the transfer functions
used for the learning algorithm. Figure 2.10 shows that both the sigmoid and hyperbolic
tangent functions behave fairly linearly over the range [-2, 2]. If an input value that is
much greater than this is presented to the network, even with small weights in the
network, the summations will be large. This will cause the transfer function to become
saturated. When saturated, the derivative of the transfer function approaches zero; since the
derivative is a factor in the weight update equations, learning effectively stops for PEs with large
summation values (NeuralWare, 1993a). For this reason, it is desirable to map the raw
'real-world' values of the input and target output vectors to a small range. Also, if the
input and output variables are not of the same order of magnitude, some variables may
appear to have more importance than others (Baughman and Liu, 1995). Since the
hyperbolic tangent was used for this work, all input values were linearly mapped to the
range [-1, 1]. The target output values were linearly mapped to the reduced range [-0.8,
0.8], where the transfer function is more linear. This is common practice, and improves
network training.
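The linear mappings described above can be sketched as follows. The function names and the example limits (the 17 to 24 °C temperature range reported later in this work) are used purely for illustration.

```python
def linear_map(value: float, lo: float, hi: float,
               new_lo: float = -1.0, new_hi: float = 1.0) -> float:
    """Linearly map value from the raw range [lo, hi] onto [new_lo, new_hi]."""
    if hi == lo:
        raise ValueError("raw range must have non-zero width")
    return new_lo + (value - lo) * (new_hi - new_lo) / (hi - lo)


def inverse_linear_map(scaled: float, lo: float, hi: float,
                       new_lo: float = -1.0, new_hi: float = 1.0) -> float:
    """Undo linear_map, recovering the raw value from the scaled one."""
    return lo + (scaled - new_lo) * (hi - lo) / (new_hi - new_lo)


# Inputs use the default range [-1, 1]; targets use the reduced range [-0.8, 0.8].
scaled_input = linear_map(20.5, 17.0, 24.0)
scaled_target = linear_map(20.5, 17.0, 24.0, new_lo=-0.8, new_hi=0.8)
```

The inverse mapping is what converts a network output back into a 'real-world' value after training.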
Another common modification to the standard backpropagation learning rule is a
process called batch updating. Instead of updating the network weights after the
presentation of each input vector in the training set, errors and weight adjustments are
stored and averaged over an epoch. An epoch can be a complete pass through the
training set, or a fraction of it. These average adjustments may better represent general
trends in the training set, and erratic changes in response to individual training
vectors are avoided. Batch training often improves convergence rates, particularly for
noisy data sets, and this approach was used here.
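A minimal sketch of batch updating is given below, assuming the weights are held in a flat list and that one list of adjustments has been stored for each training vector in the epoch. The function name and data layout are illustrative assumptions.

```python
from typing import List


def batch_update(weights: List[float],
                 per_vector_adjustments: List[List[float]]) -> List[float]:
    """Average the stored per-vector adjustments over one epoch, then apply them once."""
    n = len(per_vector_adjustments)
    averaged = [
        sum(adj[i] for adj in per_vector_adjustments) / n
        for i in range(len(weights))
    ]
    return [w + dw for w, dw in zip(weights, averaged)]
```

With an epoch of two vectors whose adjustments for the first weight are 0.2 and 0.4, the weight moves by their mean rather than by each value in turn.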
Another technique that can greatly improve convergence rates uses an adjustable
learning coefficient η, and introduces another variable parameter, α, known as the
momentum coefficient. This enhancement is known as the Extended Delta-Bar-Delta
(EDBD) learning rule (NeuralWare, 1993b), and it specifies distinct values of η and α for
each connection weight; furthermore, these values are themselves adjusted with each
iteration. The momentum coefficient is used to add a fraction of the previous weight
adjustment to the current weight adjustment. For iteration s,
$$\Delta w_{ji}(s) = \eta\, \delta_j(s)\, y_i(s) + \alpha\, \Delta w_{ji}(s-1),$$
where normally 0 < α < 1. If a given connection weight is adjusted in the same direction
(same sign) over several iterations, the learning and momentum coefficients for that
connection are increased. Conversely, if a weight adjustment changes direction over
several iterations, η and α are decreased. The effect of these enhancements is to
reinforce general trends while damping out oscillatory behaviour.
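The behaviour described above can be illustrated with a deliberately simplified sketch. This is not the EDBD rule as implemented by NeuralWare: it compares only the current and previous gradient signs for a single weight, and the growth and decay factors and the upper limits are hypothetical values chosen for illustration. The gradient passed in is the derivative of the error with respect to the weight, so its negative plays the role of the δ·y term in the momentum equation.

```python
def adaptive_momentum_step(w: float, grad: float, prev_dw: float, prev_grad: float,
                           eta: float, alpha: float,
                           up: float = 1.1, down: float = 0.5,
                           eta_max: float = 0.5, alpha_max: float = 0.9):
    """One update of a single weight with per-weight learning and momentum coefficients.

    Simplified sign-agreement heuristic (an assumption, not the actual EDBD rule):
    both coefficients grow while the gradient keeps its sign and shrink when it flips.
    """
    if grad * prev_grad > 0:          # same direction as the previous iteration
        eta = min(eta * up, eta_max)
        alpha = min(alpha * up, alpha_max)
    elif grad * prev_grad < 0:        # direction reversed: damp the oscillation
        eta *= down
        alpha *= down
    dw = -eta * grad + alpha * prev_dw   # current adjustment plus momentum term
    return w + dw, dw, eta, alpha
```

Each call returns the updated weight together with the adjustment and coefficients to carry into the next iteration.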
2.3.2 Generalization and Separation of Data Sets
Ultimately, the goal of the training process is to produce an ANN that can predict
output values well when presented with inputs it has not seen before. This is called
generalization. The degree to which a network can generalize well is determined by a
number of factors, including the size of the training set, how well the training set
represents the process being modeled, the choice of appropriate input variables, and
network architecture (i.e., the number of hidden layers and PEs).
In general, larger architectures can model more complex processes. The
additional connection weights give the ANN more flexibility to represent intricate
relationships within the training set. Conversely, a small architecture may not be
sufficiently robust to accurately model the process. However, there is a trade-off
between network complexity and training set size. If the training set is small, an overly
complex network will tend to 'memorize' the data set, meaning it will learn to predict the
output of training examples very well, but will predict poorly when presented with
previously unseen input vectors. This is called over-training, and can be avoided by
testing the ANN's generalization capability during the training process.
In order to do this, the full data set must first be divided into a training set and a
test set. It is also common practice to separate a third set called the validation set, which
is representative of a typical model-deployment scenario. The training set is used in the
backpropagation learning rule to build the ANN model, while the test set is used to assess
the trained net's ability to generalize. The test and training sets are disjoint, but both are
drawn from the same population to ensure that they cover the same domain of input
space.
NeuralWorks Professional II/PLUS provides a method to check the ANN's
generalization ability as training proceeds, using its 'SaveBest' command. This
command first tests the untrained network against the test set, to form a base-line
performance estimate. The network is then trained for a specified number of iterations
(i.e., random presentations of training vectors), and is tested again against the test set. If
the ANN's performance improves from the last evaluation, the network is saved, and
training continues. After the specified number of training iterations, the network is again
tested, and if generalization improves, it is resaved. This process of training and testing
continues until the ANN no longer shows improved generalization from continued
learning. In this way, the SaveBest command retains the network that shows the best
generalization, and thus avoids over-training.
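The train-and-test cycle described above can be sketched generically as follows. This is not the NeuralWorks implementation: the callbacks `train_block` and `test_error`, and the `patience` parameter that decides how many unimproved evaluations are tolerated before stopping, are assumptions introduced for illustration.

```python
import copy
from typing import Any, Callable, Tuple


def train_with_save_best(model: Any,
                         train_block: Callable[[Any], None],
                         test_error: Callable[[Any], float],
                         max_blocks: int,
                         patience: int = 1) -> Tuple[Any, float]:
    """Alternate blocks of training with test-set evaluation, keeping the best network."""
    best_error = test_error(model)          # base-line from the untrained network
    best_model = copy.deepcopy(model)
    stale = 0
    for _ in range(max_blocks):
        train_block(model)                  # train for a fixed number of iterations
        err = test_error(model)
        if err < best_error:                # generalization improved: save the network
            best_error = err
            best_model = copy.deepcopy(model)
            stale = 0
        else:
            stale += 1
            if stale >= patience:           # no further improvement: stop training
                break
    return best_model, best_error
```

The saved copy, not the network in its final state, is what would be carried forward, so any over-training after the best evaluation is discarded.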
Chapter 3
Collection and Analysis of Data
Considerable processing of the LCM data was required to put it into a form
amenable to both ANN modeling and comparison with Gaussian puff models. A
description of the experimental setup of the aerosol release trials is first presented. This
is followed by a discussion of the analysis of the inverted concentration maps, and the
selection of the training and test sets. Finally, the details of the Gaussian puff models
used for ANN model comparison are given.
3.1 Experimental Setup and Collection of Data
The aerosol release trials were conducted at Canadian Forces Base Valcartier, on
a military obscurant trial range controlled by DREV during the period 5-12 August, 1997.
Forested hills covered much of the surrounding area, but the trial range itself was a very
large, level plateau. Most of the surface was exposed soil, with small bumps and ripples
varying about 10 cm in height. About 30% of the surface was covered in long grass
(about 40 cm) and small shrubs (about 50 cm).
The arrangement of the equipment is summarized below in Figure 3.1. The
weather measurement system was placed on a pole about 5 m above the surface, to the
left of the LCM scanning volume. Here, measurements of temperature (°C), atmospheric
pressure (in Hg), wind speed (m/s) and direction were taken. Wind speed was measured
using a cup anemometer, and direction was measured with a bi-directional vane (Davis
Weather Wizard III; Davis, 2000). The LCM was about 1.5 m above the ground. A low
sand dune was located about 175 m north of the LCM.
Figure 3.1 Layout of the experimental trial plateau.
Unfortunately, the meteorological data were not sampled frequently enough to
determine any statistical parameters of the turbulence. Just prior to each release, a single
instantaneous measurement was taken from the weather measurement system and the wet
bulb anemometer. Temperature and pressure varied slowly from trial to trial, but wind
speed and direction fluctuated considerably between trials, with consecutive trials
separated by about 3 minutes. Considerable fluctuations also occurred during each trial,
but these variations were not measured.
As shown in Figure 3.1 above, the aerosol was released from a particle
disseminator, which was placed in one of two separate locations. These two different
locations were used to remove any bias introduced due to the position of the release
relative to the LCM scanning volume. In either release location, the disseminator was
placed directly on the surface. However, the disseminator's nozzle was located about
0.25 m above the ground, at a slight inclination (about 20° relative to the surface). The
aerosol was placed in the disseminator reservoir, and was forced out the nozzle into the
environment.
The aerosol used for all the trials analyzed here was kaolin, a fine ground ceramic
powder (H2Al2Si2O8 · H2O), with particle size less than 3 µm. It had a measured mass
extinction coefficient α = 1.2 ± 0.2 m²/g at a wavelength of 1.06 µm, which compared
favourably with the value used in COMBIC of 1.0 m²/g (Ayres and Desutter, 1995).
Kaolin is inert and non-buoyant. For each trial, 50 g of kaolin was released from the
disseminator.
Table A.1 in Appendix A summarizes the meteorological measurements taken
during the trials. Each kaolin release was scanned six times as it dispersed. Cloud
conditions were also recorded during the trials. The skies were very clear on August 5, 7
and 12, with very little or no cloud cover. On August 6, there was at most about 60%
grey cloud cover. A fuller description of the actual data collection is provided by
Costa (1998).
3.2 Analysis of Inverted LCM Scans
A total of 51 separate kaolin releases were performed over the four days of trials.
Each release was scanned by the LCM six times (except one, which was scanned only
twice), resulting in a total of 302 LCM scans. Each scan was inverted using the AGILE
algorithm. For those releases where six sweeps were taken per scan, this produced a
three-dimensional extinction map containing 39 600 points per scan. When there were
eight sweeps per scan, the extinction maps contained 52 800 points. Most of these data
points were clear air returns, and only a fraction of each scan contained returns from the
diffusing kaolin cloud. All data points from each scan that were below the LCM
extinction detection threshold of 10⁻⁴ m⁻¹ were considered clear air returns, and were
removed from the data set. All values that were a result of inversion error, that is, those
that were ≥0.5 m⁻¹, were also removed. This left 302 LCM scans that contained, on
average, about 1 100 extinction measurements of the diffusing kaolin cloud.
Not all of these scans were useful, however. A custom-designed program for the
LCM, called LCVS, allows for quick viewing of raw and inverted extinction maps, in the
form of contour plots. This lets the user easily get an idea of the qualitative behaviour of
the puffs in each scan. Figure 3.2 below shows a typical LCM scan, both in raw form,
and after being inverted by the AGILE algorithm. Using LCVS, it was found that in
many cases the kaolin release had moved mostly or entirely out of the LCM scanning
volume, leaving an extinction map of a very small portion of the puff. Typically this was
the case for the first and sixth scans of a given release.
[Figure panels: azimuth 60-120 deg, elevation 0 deg.]
Figure 3.2 The bottom sweep of a typical LCM scan shown in raw form (bottom) and inverted (top), displayed using LCVS (bird's eye view). Radial grid lines are spaced about 50 m apart; azimuth grid lines are 10° apart. Note the reduced noise in the inverted scan. The strong return across the top is the sand dune.
Each inverted scan was manually inspected using LCVS, and those that appeared
to be missing considerable portions of the cloud were removed from the data set. This
left 187 scans, many of which were still missing portions of the cloud above and below
the LCM scanning volume. This was unavoidable, given the small range in elevation of
the scanning volume (10°), and the fact that the LCM itself was mounted about 1.5 m
above the ground. However, nearly all of the remaining scans contained the full
horizontal extent of the diffusing kaolin puff.
The puff extinction map in each remaining scan had to be isolated. This was
necessary to discriminate between returns from the puff of interest, and returns from
extraneous influences, such as the sand dune (see Figure 3.2) or portions of previous
releases still in the LCM scanning volume. Again using LCVS, each of the 187 scans
was manually inspected, and boundaries were defined for each sweep of each scan. In
the subsequent analyses, only LCM returns within the predefined boundaries of a given
sweep and scan were considered.
It was then necessary to put the remaining isolated scans into the appropriate
coordinate system. As described in Chapter 2, the coordinate system was centred on the
puff's centre of mass, and was oriented such that the x-axis points downwind. However,
there were serious difficulties with the wind speed and direction data taken during the
trials (listed in Table A.1). As noted, these were instantaneous readings, taken before
each trial; i.e., about every three minutes. These data provided almost no information
about either the means or variances of wind speed and direction. It was decided to
estimate the mean wind speed and direction during a given release from the trajectory of
the kaolin puff from scan to scan.
The centre of mass of each scan was calculated, and from these values the
horizontal distance traveled from scan to scan in a given release was derived. Since the
time between scans was known, an estimate of the horizontal wind speed and direction
could be made. These wind speed estimates replaced those presented in Table A.1 for the
remainder of the analysis, and are listed in Table A.2. The wind direction estimates and
puff centres of mass were used to translate and rotate each scan into the appropriate
coordinate system.
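The procedure above can be sketched as follows, under several assumptions that are not confirmed by the source: each return is a tuple (x, y, z, extinction) in a fixed LCM frame, every return represents an equal volume so that extinction can serve directly as the weight, and the direction is expressed as the heading toward which the puff travels rather than in the meteorological 'from' convention. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float, float]   # (x, y, z, extinction)


def centre_of_mass(points: Sequence[Point]) -> Tuple[float, float, float]:
    """Extinction-weighted centroid of one scan (equal cell volumes assumed)."""
    total = sum(p[3] for p in points)
    return (sum(p[0] * p[3] for p in points) / total,
            sum(p[1] * p[3] for p in points) / total,
            sum(p[2] * p[3] for p in points) / total)


def wind_from_centroids(c0, c1, dt: float) -> Tuple[float, float]:
    """Horizontal speed and downwind heading (radians) from two successive centroids."""
    dx, dy = c1[0] - c0[0], c1[1] - c0[1]
    return math.hypot(dx, dy) / dt, math.atan2(dy, dx)


def to_puff_frame(point, centroid, heading: float) -> Tuple[float, float, float]:
    """Translate to the centroid, then rotate so that the x-axis points downwind."""
    dx, dy, dz = (point[0] - centroid[0], point[1] - centroid[1],
                  point[2] - centroid[2])
    c, s = math.cos(heading), math.sin(heading)
    return c * dx + s * dy, -s * dx + c * dy, dz
```

For example, a centroid that moves 3 m in x and 4 m in y over 5 s gives a speed of 1 m/s, and a point lying along that displacement maps onto the positive x-axis of the puff frame.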
It should be noted here that two separate sets of the LCM data were constructed,
each employing a different coordinate system. That is, the same data were represented in
two different ways. The first data set was translated and rotated as indicated above, so
that the position of each LCM return in a scan was given in relation to that scan's centre
of mass. Thus, this data set was analyzed in the framework of relative diffusion, as
discussed in Chapter 2, where puff meander was entirely removed from the dispersion
process. The second data set was constructed so that the origin of the coordinate system
followed the horizontal centre of mass of the puff, but the vertical origin was fixed at
ground level. Thus, horizontal meander was removed from the diffusion process, but the
vertical meander was incorporated into the puff's dispersion, and would be partially
responsible for the vertical spread of the puff. This second coordinate system was
employed for two reasons. First, many common Gaussian puff models use absolute
coordinates, and include puff meander implicitly as part of the diffusion process. Second,
this system may offer some insight into the inhomogeneity of the turbulence due to the
surface, and its variability with height from the ground. These data sets are referred to as
data set 1 and data set 2, respectively.
3.3 Preparation of Data for ANN Modeling
Both data sets were prepared in the same manner, and the following discussion
applies to them both. The measured variables chosen for input are listed below.
Downwind position, x,
Crosswind position, y,
Vertical position, z,
Diffusion time, t,
Mean wind speed, U,
Ambient temperature, T,
Time of day,
Pasquill stability class, P,
Atmospheric pressure, p.
Recall that position variables are relative to the puff's centre of mass (z is relative to the
ground for data set 2). Wind speed is as calculated from the puff trajectories, as
described above, and Pasquill stability class is estimated from Table 2.1; these values are
listed for each trial in Table A.2 of Appendix A. All other values are as described above.
The ANN predicted a single output variable: kaolin concentration, C, in g/m³.
Given the relation σ(r, λ) = α(λ)C(r) and the fact that α = 1.2 m²/g for kaolin, the
inverted LCM extinction maps immediately yield concentration maps.
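As a minimal sketch, the screening of returns described in Section 3.2 and the conversion from extinction to concentration might be written as follows. The constant and function names are illustrative; the limits are the detection threshold and inversion-error cut-off quoted in the text, and α is the measured value for kaolin.

```python
from typing import List

ALPHA_KAOLIN = 1.2   # mass extinction coefficient, m^2/g
SIGMA_MIN = 1e-4     # LCM extinction detection threshold, m^-1
SIGMA_MAX = 0.5      # values at or above this are treated as inversion error, m^-1


def keep_cloud_returns(sigmas: List[float],
                       lo: float = SIGMA_MIN, hi: float = SIGMA_MAX) -> List[float]:
    """Drop clear air returns (below lo) and inversion errors (at or above hi)."""
    return [s for s in sigmas if lo <= s < hi]


def extinction_to_concentration(sigma: float, alpha: float = ALPHA_KAOLIN) -> float:
    """Concentration in g/m^3 from extinction in m^-1, using C = sigma / alpha."""
    return sigma / alpha
```

An extinction of 0.12 m⁻¹, for instance, corresponds to a kaolin concentration of about 0.1 g/m³.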
Neural networks have greater success modeling data that are evenly distributed
over the entire range of values (Wasserman, 1993). For this reason, it is desirable to
balance the distribution of each input and output variable so that certain portions of a
variable's range are not under-represented. Often a non-linear transform can
considerably improve a variable's frequency distribution. Duplicating vectors that have
values in poorly represented regions can also help to even out the frequency distribution.
Numerous non-linear transforms were applied to each of the input variables listed
above. No transform showed any marked improvement in the uniformity of any of the
input variables, and it was decided that none would be applied. However, for ANN
development inputs must be in numerical format, so certain inputs did require some form
of transformation. Specifically, time of day and Pasquill stability class had to be
transformed.
The time of day inputs were linearly mapped onto the range [0, 1], where a value
of zero corresponds to midnight, 00:00:00, and a value of 1 corresponds to the last second
of the day, i.e., 23:59:59. Pasquill stability class was transformed so that class A was
represented by a value of 1.0, and class B by 2.0. The intermediate class A-B was entered
as 1.5. Other transforms for Pasquill stability class were also applied, such as 1-of-n
coding and thermometer coding (NeuralWare, 1993c). However, these transformations
did not significantly alter network results, and were discarded.
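The encodings described above can be sketched as follows. The value 1.5 for class A-B follows the input range listed for P later in this section. The 1-of-n and thermometer functions reflect one common reading of those terms (a single active unit, and a run of active units up to the class index, respectively) and are assumptions rather than the NeuralWare definitions; all names are hypothetical.

```python
from typing import List, Sequence

PASQUILL_CLASSES = ("A", "A-B", "B")
PASQUILL_VALUE = {"A": 1.0, "A-B": 1.5, "B": 2.0}


def time_of_day_fraction(hh: int, mm: int, ss: int = 0) -> float:
    """Map a clock time onto [0, 1]: 00:00:00 gives 0 and 23:59:59 gives 1."""
    return (hh * 3600 + mm * 60 + ss) / 86399.0


def one_of_n(cls: str, classes: Sequence[str] = PASQUILL_CLASSES) -> List[float]:
    """1-of-n coding: one input per class, only the matching one set to 1."""
    return [1.0 if c == cls else 0.0 for c in classes]


def thermometer(cls: str, classes: Sequence[str] = PASQUILL_CLASSES) -> List[float]:
    """Thermometer coding: all inputs up to and including the class index set to 1."""
    idx = list(classes).index(cls)
    return [1.0 if i <= idx else 0.0 for i in range(len(classes))]
```

The single-value coding keeps the input dimension at nine, whereas either alternative coding would add two further inputs for the three classes encountered here.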
The output variable, C, did require a non-linear transform to even out its
frequency distribution. The raw distribution was heavily skewed toward low
concentration values. Also, the range of kaolin concentration values spanned four orders
of magnitude. It was found that the transform log(C) significantly improved the
uniformity of the concentration frequency distribution. Clear air data points (C = 0)
outlining the kaolin clouds were included in the data set to help the ANN recognize the
boundary of the puff. The inclusion of these points required a slight modification of the
transform. A small offset allowed the clear air values to be logarithmically transformed:
the form log(C + 10^-n), with 10^-n the small offset, was chosen. Trained ANN outputs can be converted back into
concentration predictions by applying the inverse of this logarithmic transform.
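A sketch of the offset logarithmic transform and its inverse is given below. Two points are assumptions made only for illustration: the logarithm is taken to base 10, and the offset is left as a parameter, with the value 1e-6 in the example being hypothetical rather than the value used in this work.

```python
import math


def forward_transform(c: float, offset: float) -> float:
    """Offset logarithm of concentration; the offset keeps clear air (C = 0) finite."""
    return math.log10(c + offset)       # base 10 is an assumption


def inverse_transform(y: float, offset: float) -> float:
    """Recover concentration from a transformed value, clipped at zero."""
    return max(10.0 ** y - offset, 0.0)


EXAMPLE_OFFSET = 1e-6                   # hypothetical value, for illustration only
y_clear = forward_transform(0.0, EXAMPLE_OFFSET)
c_back = inverse_transform(forward_transform(0.01, EXAMPLE_OFFSET), EXAMPLE_OFFSET)
```

The clipping in the inverse guards against small negative concentrations arising from rounding when a prediction falls at the clear air level.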
Before the duplication of any vectors in the data set, a test set and validation set
were removed. The validation set was removed first. Since this set represents a typical
model deployment scenario, one complete trial was removed from the data set to form the
validation set. Trial 44 was selected at random, and was used for both data sets.
Approximately 30% of the original set was randomly selected and set aside to form the
test set.
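The separation described above can be sketched as follows, assuming each vector is stored as a dictionary carrying a "trial" key. The function name, the data layout and the fixed random seed are illustrative assumptions, and the test fraction is applied here to the vectors remaining after the validation trial has been removed.

```python
import random
from typing import Dict, List, Tuple


def split_data(vectors: List[Dict], validation_trial: int,
               test_fraction: float = 0.3,
               seed: int = 0) -> Tuple[List[Dict], List[Dict], List[Dict]]:
    """Return (training, test, validation) sets, removing the validation trial first."""
    validation = [v for v in vectors if v["trial"] == validation_trial]
    remaining = [v for v in vectors if v["trial"] != validation_trial]
    rng = random.Random(seed)           # seeded so that the split is repeatable
    rng.shuffle(remaining)
    n_test = round(test_fraction * len(remaining))
    return remaining[n_test:], remaining[:n_test], validation
```

Holding out a complete trial means the validation vectors share meteorological inputs that appear nowhere in the training or test sets, which is what makes that set resemble a deployment scenario.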
Once the training and test set were separated, certain vectors were duplicated to
balance the input and output space more evenly. Given the high dimension of the input
space (i.e., nine), it is very difficult to improve the distribution of one input without
disrupting the distribution of another. Furthermore, it should be noted that even if all
inputs had perfectly uniform frequency distributions, this would not guarantee that the
entire input space is well represented. In order to determine this, joint frequency
distributions would have to be considered; this would be a formidable task in nine
dimensions. Figure 3.3 and Figure 3.4 show the histograms for the training set and test
set, respectively. These figures show data set 1 histograms; data set 2 distributions are
similar. Table 3.1 below summarizes the size of the training, test and validation sets for
data sets 1 and 2.
Table 3.1 Total number of vectors in the training, test and validation sets of data sets 1 and 2.
Data Set    Training Set    Test Set    Validation Set
[Histogram panels; horizontal axis: bin.]
Figure 3.3 Data set 1 training set frequency distributions after transformation and duplication.
[Histogram panels; horizontal axis: bin.]
Figure 3.4 Data set 1 test set frequency distributions after transformation and duplication.
The input variables covered the following ranges:
x ∈ (-70, 30) m,
y ∈ (-30, 30) m,
z ∈ (-15, 15) m for data set 1, z ∈ (0, 30) m for data set 2,
t ∈ (7.5, 56) s,
U ∈ (0.5, 3.7) m/s,
T ∈ [17, 24] °C,
time of day ∈ [0.4083, 0.6625] (i.e., about 10:00 to 16:00),
P ∈ {1.0, 1.5, 2.0},
p ∈ [30.00, 30.45] in Hg.
Once the data sets were fully prepared as discussed above, network training
began. There are no well-defined rules for determining the optimal parameters for ANN
training. These are largely determined by the process being modeled and the degree to
which the training set represents the salient features of the problem. Numerous networks
of varying architecture were trained. Epoch size was varied to determine an optimal
setting, and numerous adjustments were made to various network parameters in an
attempt to find an optimal solution. The EDBD learning rule was used for all ANN
modeling.
3.4 Gaussian Puff Modeling
Attempts were made to model the kaolin trials with Gaussian puff models. The
traditional Gaussian model, given by equation (l), was used together with both the Slade
and Pasquill parameterizations, as summarized in Figures 2.3 and 2.4, respectively. The
kaolin trials were also modeled using the full COMBIC dispersion code. The predictions
from these models were compared to LCM data, and the performance of these models
was compared with that of the ANN models.
Data set 1 is in a form that is directly amenable to comparison with the Gaussian
puff model as given in equation (1). However, since data set 2 was constructed using
absolute z coordinates, a slightly modified version of the equation was used to model this
set. The form of this equation is given below (Sato, 1995).
where z is now the absolute vertical coordinate (relative to the ground), and zh is the
effective release height of the kaolin puff. The second term in square brackets is included
to account for reflection of the aerosol from the ground. This is a common feature of
many Gaussian puff models, and is accomplished by incorporating an 'image source'
positioned beneath the ground at -zh. Figure 3.5 below illustrates the concept of the
image source.
Figure 3.5 The 'image source' accounts for surface interactions of the puff by reflecting material off the ground.
The use of an image source assumes that all the material interacting with the
surface is reflected back up into the puff. That is, no deposition due to impaction or chemical
reaction is assumed. It is also assumed that the puff is non-buoyant, i.e., it remains at the
effective release height as it disperses, and that gravitational settling is negligible.
COMBIC also employs an image source to account for puff-surface interactions, and it
models non-buoyant puffs using equation (40).
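The image-source idea can be illustrated with the widely used textbook form of a Gaussian puff with total ground reflection. The sketch below is offered under that assumption and is not a transcription of equation (40): the symbols q (released mass), sx, sy and sz (dispersion coefficients) and zh (effective release height) are illustrative names, and x and y are measured from the puff's horizontal centre.

```python
import math


def gaussian_puff_with_reflection(x: float, y: float, z: float, q: float,
                                  sx: float, sy: float, sz: float,
                                  zh: float) -> float:
    """Concentration from a Gaussian puff plus its image source at -zh.

    Assumed standard form; z is the height above the ground.
    """
    norm = q / ((2.0 * math.pi) ** 1.5 * sx * sy * sz)
    horizontal = math.exp(-x * x / (2.0 * sx * sx)) * math.exp(-y * y / (2.0 * sy * sy))
    direct = math.exp(-((z - zh) ** 2) / (2.0 * sz * sz))   # real source at +zh
    image = math.exp(-((z + zh) ** 2) / (2.0 * sz * sz))    # image source at -zh
    return norm * horizontal * (direct + image)
```

With the source on the ground (zh = 0) the two vertical terms coincide, so the concentration at every height is exactly twice that of an unreflected puff, as expected when all material is returned to the upper half-space.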
A modification to COMBIC allows the user to output the model's predicted three-
dimensional concentration distribution. The user specifies the size of the grid and the
grid cells; cubic grid cells with 1-m sides were used, and the grid size was chosen to
entirely contain the modeled puff. Since COMBIC models diffusion relative to the
ground, and includes the vertical meander of the puff in the dispersion process, COMBIC
predictions were compared with data set 2. Also, COMBIC uses a fixed coordinate
system in its model, so the predicted three-dimensional concentration distributions were
centred on the puff's horizontal centre of mass before comparison with data set 2 values.
A representative input card used for COMBIC modeling is shown below in Table 3.2.
Both the COMBIC model and the standard Gaussian puff models given by
equations (1) and (40) require the specification of initial puff size. The dispersion
coefficient profiles in Figures 2.3 and 2.4 are for the idealized case of a point source, i.e., a
puff having no dimension at t = 0. Such a source is inappropriate for modeling the kaolin
releases, since the disseminator immediately sprayed the kaolin over a finite volume.
Initial puff radii were estimated using LCVS.
COMBIC models a single release scenario at a time. Since the validation set
comprises multiple scans of a single release, it is readily modeled using COMBIC.
However, the test set is composed of randomly selected points from all of the 187
different LCM scans. The direct evaluation of COMBIC against the test set would
require a separate COMBIC model for each of the 187 scans. Instead, a smaller selection
of 33 scans was modeled using COMBIC. These 33 scans were chosen so that a broad
range of meteorological conditions from the kaolin release trials was represented. It is
felt that the performance of COMBIC against this smaller subset of scans is
representative of COMBIC's performance against the full test set.
Table 3.2 COMBIC input card used to model kaolin trial 18, scan 5.
WAVL VIS COMBIC PHAS FILE NAME ka01805 MET MET2 TERA MJNT CLOU SUBA SUBB DONE END CONTINUE WAVL VIS COMBIC PHAS FILE FILE NAME ka01805 ORIG LIST SLOC OLOC TLOC EXTC VIEW GREY TPOS DONE END STOP
Chapter 4
Results and Discussion
This chapter is divided into four parts. The first section presents and discusses the
results of various networks trained on data sets 1 and 2. The networks showing the best
performance for each data set were selected for further analysis and refinement. In the
second section, statistical measures of model performance are defined, and the
performance of the ANN models is compared to that of the Gaussian puff models and
COMBIC. A sensitivity analysis conducted on the ANN models is then described, in
order to determine the effect of each input on model performance. The last section
presents an analysis of the concentration distributions predicted by the ANN models, and
the significant features of these models are extracted into a simple parameterization.
Curve fitting and plotting were effected using SigmaPlot v. 5.00 by SPSS, Inc. (SPSS,
1998).
4.1 ANN Model Development
Numerous ANNs were trained on both data sets using the EDBD learning rule.
Training began on networks with simple architectures; more PEs and hidden layers were
then added to evaluate the performance of more complex networks. In addition to testing
different network architectures, numerous other network parameters that affect training
were altered in attempts to improve the performance of the ANN models.
Epoch size was systematically varied to determine the optimal setting. Since
network weight adjustments are averaged over the epoch, a small epoch size can often
lead to erratic jumps in weight space. This is especially true of noisy data sets, where it is
desirable for the network to train to the more general trends of the data, smoothing out
the noise of the individual training vectors. This was found to be the case for data sets 1
and 2. Networks of varying architectures were trained with epoch sizes of 100, 200, 500
and 1000. The results of some of the preliminary ANNs trained on data sets 1 and 2 are
listed in Appendix B in Tables B.1 and B.2, respectively. The best of these ANNs are
tabulated below in Tables 4.1 and 4.2, showing each network's performance against both
the test set and the validation set. Each ANN was trained using the SaveBest command,
and training continued for up to 5 million iterations (about 50 complete presentations of
the training set). Note that an epoch size of 500 was optimal for every architecture for
either data set. Networks trained with epoch sizes of 100 and 200 did indeed show very
erratic behaviour, with oscillatory RMS that failed to stabilize quickly. The larger epoch
setting of 1 000 slowed training considerably.
Table 4.1 The best networks for each architecture listed in Table B.1 (data set 1, relative-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).
ANN    Architecture   Epoch   Test Set           Validation Set
                              RMS      R         RMS      R
Table 4.2 The best networks for each architecture listed in Table B.2 (data set 2, absolute-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).
ANN    Architecture   Epoch   Test Set           Validation Set
                              RMS      R         RMS      R
2.1c   9-10-1         500     0.3565   0.6140    0.3816   0.5571
2.2c   9-20-1         500     0.3549   0.6171    0.3674   0.5923
2.3c   9-30-1         500     0.3508   0.6279    0.3820   0.5572
2.4c   9-10-5-1       500     0.3550   0.6179    0.3969   0.4961
2.5c   9-20-10-1      500     0.3425   0.6494    0.3940   0.5060
2.6c   9-30-15-1      500     0.3407   0.6529    0.3792   0.5555
2.7c   9-40-20-1      500     0.3315   0.6803    0.4098   0.4606
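The two statistics reported in these tables can be sketched as follows, assuming the usual definitions of root-mean-square error and the Pearson linear correlation coefficient, computed between network predictions and target values. Whether the reported values were computed on scaled or unscaled outputs is not assumed here, and the function names are illustrative.

```python
import math
from typing import Sequence


def rms_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square difference between predictions and targets."""
    n = len(pred)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / n)


def linear_correlation(pred: Sequence[float], target: Sequence[float]) -> float:
    """Pearson linear correlation coefficient R between predictions and targets."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)
```

A lower RMS and a higher R both indicate closer agreement, which is why the two columns move in opposite directions down Table 4.2.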
In addition to altering the epoch size, a number of other network adjustments were
made in attempts to improve network performance. Some networks were trained with
additional weights directly connecting the input layer to the output PE. This bypasses the
non-linear transfer function, and allows for linear parts of the problem to be solved
directly. These linear connections had negligible effect on network training, suggesting
that the process being modeled is highly non-linear. A number of networks were trained
with the addition of random noise. This is done by adding a small random value to the
summation at each PE before applying the transfer function, and can improve network
generalization. In every case, the addition of random noise slowed network training with
no noticeable improvement in performance. Given the random nature of instantaneous
concentration distributions, it is likely that the data set is sufficiently noisy that the
additional noise had no effect. In addition to this, a small, constant offset value of 0.1
was added to the derivative of the transfer function. This ensured a non-zero transfer
function derivative, and allowed saturated PEs to continue to learn.
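The effect of the derivative offset can be seen in a short sketch for the hyperbolic tangent transfer function. The function name is hypothetical; the offset of 0.1 is the value quoted above.

```python
import math


def tanh_derivative_with_offset(summation: float, offset: float = 0.1) -> float:
    """Derivative of tanh at a PE's summation value, plus a constant offset."""
    f = math.tanh(summation)
    return (1.0 - f * f) + offset
```

For a heavily saturated PE (a summation of 20, say) the true derivative is effectively zero, so the offset alone sets the size of the weight update; near zero summation the offset is a small addition to a derivative close to 1.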
Network architecture had the most influence by far on ANN training and
performance. As can be seen from Tables 4.1 and 4.2 above, larger architectures showed
better performance against the test set, but not against the validation set. It should be
noted that test set statistics are a better measure of network performance than validation
set statistics. This is due in part to the fact that the test set is composed of about 10 times
more vectors than the validation set (see Table 3.1 above). Also, the test set covers the
full range of input variables, as shown in Figures 3.3 and 3.4, while the validation set is
specific to a single kaolin release. Based only on test set statistics, networks 1.7c and
2.7c clearly show the best performance.
However, it is also customary to use the smallest network that produces adequate
accuracy on the test set (Wasserman, 1993); this guards against over-training. In general,
when modeling with a data set as large and noisy as that used here, it is unlikely that an
ANN will memorize the data. This is especially true when using a cross-validation
learning technique such as the SaveBest command, which continually checks for
improvements in network generalization. However, this data set is somewhat unusual.
Although the training sets are large, input vectors from the same scan will have identical
meteorological inputs, differing only in the position and time variables. This may reduce
the effective size of the training set, increasing the likelihood of over-training. For this
reason, smaller networks that show reasonable performance against the test set were
retained. Networks 1.2c and 2.2c (each with a single hidden layer of 20 PEs) both show
poorer performance against the test set than networks 1.7c and 2.7c, but still give
reasonable performance against the validation set. Thus, the following four networks
were selected for further analysis: 1.2c, 2.2c, 1.7c and 2.7c.
Typical learning curves are shown below in Figure 4.1. This shows the reduction
in RMS error against the test set as more training vectors are presented to the ANN. For
comparison, the figure shows learning curves for networks 2.2c (9-20-1) and 2.7c
(9-40-20-1). Learning curves for the ANNs trained on data set 1 are similar. In general it was
found that the 9-20-1 ANNs showed only marginal improvement after about 2 million
iterations, while the 9-40-20-1 ANNs showed steady improvement up to about 3 million
iterations. In both cases, the steepest part of the learning curve was within the first
500,000 iterations.
Also clearly shown is the difference in RMS approached by either network. The
larger architecture networks generally were able to reach a lower RMS much more
rapidly than the smaller nets. Within the first 1 million iterations, the 9-40-20-1 ANNs
had trained sufficiently to produce a test set RMS that was about 5% lower than that of
the 9-20-1 ANNs. When the RMS stabilized, the larger networks showed about 10%
better performance against the test set than did the smaller ANNs.
[Plot: learning curves versus millions of iterations; legend: ANN 2.2c (9-20-1), ANN 2.7c (9-40-20-1), saved points]
Figure 4.1 Learning curves of ANNs 2.2c and 2.7c. Also indicated in the figure are the points during training where RMS improved, and the ANN was automatically saved by the SaveBest command.
In any gradient-descent algorithm, the starting point can be critical. Before
training begins, ANNs are initialized by assigning every connection weight a small
random value. Each of the four ANNs listed above was initialized to a random point in
weight space, and trained using the SaveBest command. This was done 20 times for each
ANN, producing 20 different models from each of the above four. It was found that in
almost all cases, networks with different initializations converged to different solutions in
weight space. However, no one solution was significantly better than any other,
suggesting that the error surface had numerous minima of approximately the same
magnitude. The full results of these training sessions are listed in Table B.3 of Appendix
B. The mean test set statistics are summarized below in Table 4.3, where the error terms
indicate the standard deviations between the 20 ANNs of a given architecture and data
set.
Table 4.3 Mean test set statistics for the two architectures for each data set. The error term indicates the standard deviation between the 20 ANNs in each group.
Data set   Architecture   RMS   R
As can be seen from Tables B.3 and 4.3, the variation in performance between the
20 networks of a given data set and architecture is small. Consequently, none of the 20
models within a group is preferable to any other. However, many have trained to a
different minimum in weight space, and may produce slightly differing predictions.
Thus, for each architecture and data set listed in Table 4.3, the average of the predictions
of the 20 ANNs was taken as the final model. This is also felt to help increase the
generalization ability of the final models. If any one of the 20 ANNs in a group is trained
more or less in a specific region of input space, perhaps resulting in anomalous
predictions in that region, this model's deficiencies are partly suppressed by averaging
the predictions of all 20 trained networks. Two average models were constructed for
each data set. The final average ANN models are labeled as follows: models 1A and 1B
are trained on data set 1, and have 9-20-1 and 9-40-20-1 architectures, respectively.
Similarly, the average models trained on data set 2 will be labeled models 2A and 2B.
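The averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ensemble_predict` and `members` are hypothetical, the stand-in "networks" are simple functions rather than trained ANNs, and the sketch averages the raw network outputs (the text does not state whether the averaging was done before or after inverting the output transform).

```python
from statistics import mean


def ensemble_predict(models, x):
    """Return the average output of several trained models for one input vector."""
    return mean(model(x) for model in models)


# Hypothetical stand-ins for independently initialized networks: each one
# gives a slightly different prediction for the same input vector.
members = [lambda x, b=b: 2.0 * x[0] + b for b in (-0.1, 0.0, 0.1)]

average_output = ensemble_predict(members, [1.0])
```

Because the individual offsets cancel in the mean, the averaged output is closer to the common trend than most single members, which is the effect the averaging is intended to exploit.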
4.2 Comparing ANN Models with Gaussian Puff Models
Before the average ANN models could be compared to the Gaussian puff models
and COMBIC, it was first necessary to convert the ANN output into concentration
predictions. This was done by inverting the logarithmic transform that was applied to the
ANN target outputs. The concentration predictions of each of the average ANN models
were then evaluated against the test and validation sets, as were the predictions of the
Gaussian puff models and COMBIC. The Gaussian puff model described in equation (1)
was evaluated against data set 1, using both the Slade and the Pasquill dispersion
coefficient parameterizations. The Gaussian model of equation (40) was evaluated
against data set 2, as was the COMBIC model.
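The statistical evaluation of each of these models is based on the following model
performance measures.
Factor of two: $F2$ = fraction of data for which $0.5 \le C_p / C_o \le 2$,
Factor of ten: $F10$ = fraction of data for which $0.1 \le C_p / C_o \le 10$,
Correlation: $R = \overline{(\ln C_o - \overline{\ln C_o})(\ln C_p - \overline{\ln C_p})} \, / \, (\sigma_{\ln C_o} \, \sigma_{\ln C_p})$,
where $\sigma_{\ln C_o}$ is the standard deviation of $\ln C_o$, and similarly for $\sigma_{\ln C_p}$,
Geometric Mean Bias: $MG = \exp\left( \overline{\ln C_o} - \overline{\ln C_p} \right)$,
Geometric Variance: $VG = \exp\left[ \overline{(\ln C_o - \ln C_p)^2} \right]$.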
In the above equations, $C_o$ is an observed concentration, $C_p$ is the corresponding
predicted concentration, and overbars indicate an average over the test or validation set.
The use of the logarithmic forms of correlation, geometric mean and geometric variance
is justified when there is a large range of magnitudes in the observed and predicted
concentrations (CCPS, 1996). Since comparisons are being made between average
concentration predictions and instantaneous concentration measurements, there are many
data pairs with $C_o/C_p$ and $C_p/C_o$ equal to 10, 100 or more. Thus, the logarithmic
forms are more appropriate measures of model performance than are the more common
linear forms (CCPS, 1996; Mohan and Siddiqui, 1997). The ideal value of each
statistical measure defined above is 1.0.
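As a rough illustration of how these measures could be computed, the following Python sketch evaluates F2, F10, R, MG and VG for paired observed and predicted concentrations. It is not the code used in this work. The function name `performance_measures` is hypothetical, the geometric mean bias is assumed to follow the convention MG = exp(mean(ln Co) - mean(ln Cp)), so that MG greater than unity corresponds to under-prediction, and all concentrations are assumed to be strictly positive so that the logarithms exist.

```python
import math


def performance_measures(c_obs, c_pred):
    """Logarithmic performance measures for paired positive concentrations."""
    n = len(c_obs)
    ratios = [p / o for o, p in zip(c_obs, c_pred)]
    ln_o = [math.log(o) for o in c_obs]
    ln_p = [math.log(p) for p in c_pred]
    mean_o = sum(ln_o) / n
    mean_p = sum(ln_p) / n
    # Population covariance and standard deviations of the log concentrations.
    cov = sum((a - mean_o) * (b - mean_p) for a, b in zip(ln_o, ln_p)) / n
    sd_o = math.sqrt(sum((a - mean_o) ** 2 for a in ln_o) / n)
    sd_p = math.sqrt(sum((b - mean_p) ** 2 for b in ln_p) / n)
    return {
        "F2": sum(0.5 <= r <= 2.0 for r in ratios) / n,
        "F10": sum(0.1 <= r <= 10.0 for r in ratios) / n,
        "R": cov / (sd_o * sd_p),
        "MG": math.exp(mean_o - mean_p),
        "VG": math.exp(sum((a - b) ** 2 for a, b in zip(ln_o, ln_p)) / n),
    }


# Illustrative data only: a model that under-predicts every value by a factor of two.
example = performance_measures([1.0, 2.0, 4.0, 8.0], [0.5, 1.0, 2.0, 4.0])
```

For this made-up example the predictions are perfectly correlated with the observations but biased, so R is 1, MG is 2 and ln(VG) equals (ln MG)^2, with no additional scatter.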
Table 4.4a below summarizes the performance against the test set of the two ANN
models (ANN 1A and 1B), and the Gaussian puff model (equation (1)) using the Slade
(GPMs) and Pasquill (GPMp) parameterizations. Table 4.4b shows the performance of
the same models against the validation set.
Table 4.4a Model comparison over data set 1 test set.
Model     R      VG      MG     F2     F10
ANN 1A    0.63   14.43   1.40   0.31   0.84
ANN 1B    0.71    9.14   1.36   0.35   0.89
GPMs      0.44   58.22   1.89   0.25   0.73
GPMp      0.46   38.09   1.43   0.27   0.76
Table 4.4b Model comparison over data set 1 validation set.
Model     R      VG      MG     F2     F10
ANN 1A    0.58   13.52   1.01   0.30   0.83
ANN 1B    0.57   16.14   1.48   0.31   0.83
GPMs      0.52   17.17   0.93   0.30   0.82
GPMp      0.56   15.99   0.72   0.31   0.82
The statistical results for data set 2 are presented below. Table 4.5a shows the
performance against the test set of the two ANN models (ANN 2A and 2B), the Gaussian
puff model (equation (40)) using the Slade (GPMs) and Pasquill (GPMp)
parameterizations, and COMBIC. Table 4.5b shows the performance of the same models
against the validation set.
Table 4.5a Model comparison over data set 2 test set.
Model     R      VG      MG     F2     F10
ANN 2A    0.59   13.89   1.36   0.32   0.85
ANN 2B    0.66    9.22   1.31   0.34   0.89
GPMs      0.37   48.61   1.36   0.27   0.75
GPMp      0.35   50.87   1.34   0.28   0.75
COMBIC*   0.35   50.46   0.59   0.26   0.76
* Note: a different test set was used to evaluate COMBIC. See Section 3.4.
Table 4.5b Model comparison over data set 2 validation set.
Model     R      VG      MG     F2     F10
ANN 2A    0.60   13.15   1.30   0.31   0.85
ANN 2B    0.57   21.78   1.91   0.30   0.81
GPMs      0.46   26.08   0.74   0.25   0.68
GPMp      0.29   78.52   1.15   0.22   0.66
COMBIC    0.29   87.81   1.11   0.19   0.56
Note that in all cases, less than about 30% of the predictions are within a factor of
two of the measurements. This clearly justifies the use of the logarithmic forms of the
statistical measures defined above.
Over both test sets, the ANN models showed significantly better correlation with
concentration measurements than did the Gaussian puff models. The ANN models also
showed better correlation against the data set 2 validation set, but all models gave
comparable correlation over the data set 1 validation set. In no case did any of the
Gaussian puff models have higher correlation than the ANN models.
Scatter plots of each model over both the test set and validation set are presented
in Appendix C. A sample plot for the ANN model 1A is shown below in Figure 4.2. The
large scatter and low slope of the plot are characteristic of all the models analyzed here.
Both of these traits are to be expected. Each model predicts average concentrations, but
is being compared to instantaneous concentration measurements, which vary greatly from
the average distribution. Indeed, such instantaneous distributions are random, and a
perfect correlation is of course impossible.
Figure 4.2 Scatter plot for ANN model 1A over the (a) test set and (b) validation set.
Each model under-predicts the high concentrations and over-predicts the low
concentrations to varying degrees, as indicated by the low slope (significantly less than
unity) of each plot. Again, this can be attributed to the fact that average concentration
predictions are being compared to instantaneous measurements. The very process of
averaging or smoothing over the wildly fluctuating instantaneous distributions causes the
compression of extreme values into a smaller range. Predictions from these models
would presumably show much less scatter and a slope closer to unity if they were
compared to ensemble averages of measured concentration distributions. However, given
the present data set, statistically sound ensemble averages could not be constructed for
comparison.
The geometric mean and variance of each model can be displayed visually by
plotting VG versus MG. These plots are shown below in Figure 4.3 for data set 1 models,
and Figure 4.4 for data set 2 models. A perfect model would be placed at the point (MG,
VG) = (1, 1), and a model that has no random scatter but suffers a mean bias would lie
along the curve $\ln(VG) = (\ln(MG))^2$, which is the minimum value of geometric variance
corresponding to a given geometric mean bias (CCPS, 1996). This parabola is indicated
in the figures along with lines of constant MG, corresponding to factor of two differences
in the mean.
[Plots of geometric variance, VG, versus geometric mean bias, MG: (a) Data Set 1 Test Set Statistics, (b) Data Set 1 Validation Set Statistics]
Figure 4.3 Comparison of geometric variance and mean bias between the various models for data set 1 (a) test set and (b) validation set.
[Plots of geometric variance, VG, versus geometric mean bias, MG: (a) Data Set 2 Test Set Statistics, (b) Data Set 2 Validation Set Statistics]
Figure 4.4 Comparison of geometric variance and mean bias between the various models for data set 2 (a) test set and (b) validation set.
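The minimum-variance parabola drawn in Figures 4.3 and 4.4 can be justified in two lines. The short derivation below is a sketch, assuming the conventions MG = exp(mean(ln Co) - mean(ln Cp)) and VG = exp(mean((ln Co - ln Cp)^2)); the symbol $d$ is introduced here only for illustration.

```latex
% Let d = \ln C_o - \ln C_p for each data pair, so that
% \ln MG = \overline{d} and \ln VG = \overline{d^2}.
\begin{align}
  \ln VG - (\ln MG)^2
    &= \overline{d^2} - \overline{d}^{\,2}
     = \overline{\left(d - \overline{d}\right)^2} \;\ge\; 0 .
\end{align}
% Equality holds only when d is the same for every pair (pure bias, no scatter),
% in which case \ln VG = (\ln MG)^2.
```

A model therefore always plots on or above the parabola, and its height above the curve is the variance of the log residuals, that is, the part of VG due to random scatter rather than mean bias.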
For both data sets, the more complex networks (ANN 1B and 2B) show better
performance over the test set than do the simpler ones (ANN 1A and 2A). The mean bias
of all networks over both test sets is comparable. Over the validation set, however, the
smaller ANNs have less variance and more accurate means. It appears that the complex
ANNs do not generalize well in the region of input space covered by the validation set.
This may indicate that the larger networks are over-trained.
The vectors in the validation set have meteorological inputs that are not
represented by any vectors in the training set. This is not the case for the test set.
Although no vectors are in both the test set and the training set (i.e., the two sets are
disjoint), many vectors in the test set have the same meteorological inputs as vectors in
the training set, since the test set was drawn from the same population of LCM scans. In
such a case, the cross-validation training technique employed by the SaveBest command
may not sufficiently guard against over-training. During training, a network may
continue to show improved performance against the test set simply because it is training
and testing on many of the same meteorological conditions. This may result in an
over-trained network that still shows good generalization over the test set. The validation set
provides a check against this. Taking this into account, it appears that the smaller
networks are better ANN models than are the larger ones. Hence, only networks 1A and
2A will be retained for further analysis.
All the Gaussian puff models estimate the mean within a factor of two of the
observed value, indicating that concentration predictions are within the correct range.
However, these models all suffer from large variance. That is, the Gaussian puff models
predict concentration levels near the correct magnitude, but it is the distribution of these
values that is in error. This may be attributed to the symmetrical shape of the Gaussian
puff model distributions. Neither equation (1) nor equation (40) accounts for the effects of
wind shear, which tends to distort a diffusing puff. The increased wind speed at higher
levels above the ground causes a puff to tilt, so that the top portion is carried downwind
faster than the bottom. This results in a vertically skewed, asymmetrical distribution (van
Ulden, 1992; Sato, 1995). Although the COMBIC model does account for wind shear, its
effect on puff diffusion is not incorporated until after a diffusion time of 30 seconds
(Ayres and Desutter, 1995). No shear-induced puff distortion was observed in any of the
trials modeled using COMBIC, even those with diffusion times greater than 30 seconds.
Another possible source of the large variance observed in the Gaussian puff
models is the estimate of initial puff radii. As previously noted, the initial dimensions of
a kaolin puff, i.e., immediately after it is released from the particle disseminator, were
estimated using the visualization software LCVS. The effective release height needed for
equation (40) was also estimated using LCVS. The first scan from a number of different
kaolin trials was examined (i.e., scans taken at a nominal diffusion time of zero), and it was
found that fairly consistent estimates could be formed. The disseminator immediately
forces the kaolin into a cloud with a vertical extent of about 4 m, centred about 2 m
above the ground. The initial puff length was estimated to be about 10 m in the direction
of the disseminator's nozzle, and about 2 m in the transverse direction. Given these
estimates, and the orientation of the disseminator with respect to the wind direction,
initial puff radii could be estimated for each scan. Of course these methods provided
only rough estimates; to determine the effect of the initial puff radii on each Gaussian
puff model's performance, numerous different values were tried. The best results in each
case were from the estimates listed above, and these are the results reported earlier in
Tables 4.4 and 4.5, and Figures 4.3 and 4.4.
It is often of interest to determine which input variables have the most profound
influence on an ANN model. This can help clarify which inputs are the best descriptive
variables in modeling the process at hand, and may suggest if certain variables can be
excluded from the model. Models 1A and 2A were analyzed to determine the influence
of each input variable on the predicted concentration. This was done in two ways. For
the first analysis, each input PE for each network was disabled one at a time (i.e., set to
zero), and the effect on network performance over the training and test sets was
measured. A similar approach was taken for the second analysis, except that each input
value was adjusted by a fixed percentage, a procedure known as dithering. Both
sensitivity analyses were performed on each of the 20 individual ANNs comprising
models 1A and 2A, and the results were averaged. In addition to this, an analysis of
model residuals was done to determine if there are any trends in model over-prediction or
under-prediction as a function of the input variables.
4.3.1 Disabled Inputs
One way to determine the importance of a given input PE is to fix its value to
zero, and test the resulting performance of the network, leaving all other inputs
unchanged. Since all connection weights attached to a given input PE are multiplied by
the PE value before summation (see equation (19)), a disabled input does not contribute
to any sums in the hidden layer. Consequently, the disabled PE does not contribute to the
predicted output (see equation (20)). In this sense, the disabled PE is effectively removed
from the network, and the change in performance of the ANN can be attributed to the
missing variable.
However, one must exercise caution when interpreting the results of disabling an
input PE. The change in network performance may indeed be due to the importance of
the disabled variable, but the distribution of that input variable also plays a part. Recall
that each input variable is linearly mapped to the range [-1, 1] before presentation to the
ANN. Setting an input PE to zero effectively fixes that variable's value to the middle of
the range of possible values. If a given vector has an extreme input variable value (i.e.,
near ±1), then fixing this variable at zero effects a large change. Conversely, a vector
whose input value is near the middle of the range (i.e., near zero) will not be changed
greatly by disabling the PE.
Therefore, when measuring the change in network performance due to a disabled
PE over a given data set, the distribution of the disabled variable over that data set can
have a significant effect, often making some variables appear more or less important than
they are. For example, if a certain input variable's distribution is centred near the
mid-range value, then fixing this input to zero will effect a smaller change on average and this
variable may seem to have less effect on network performance. Conversely, if a variable
is distributed mostly at the extremes, then fixing the value to zero effects a large change
on average over the data set, and this variable may appear to have more significance than
it truly does.
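The distribution effect described above can be demonstrated with a toy example. The Python sketch below is an illustration only, not the software used in this work: `toy_model` is a hypothetical stand-in for a trained network that weights two inputs equally, and `disable_input` is a hypothetical helper that reports the percentage increase in RMS error when one input is fixed at zero. The first input is clustered near mid-range and the second lies near the extremes of [-1, 1].

```python
import math


def rms(model, inputs, targets):
    """Root-mean-square error of a model over a data set."""
    total = sum((model(x) - t) ** 2 for x, t in zip(inputs, targets))
    return math.sqrt(total / len(inputs))


def disable_input(model, inputs, targets, index):
    """Percentage increase in RMS when input `index` is fixed at zero."""
    base = rms(model, inputs, targets)
    zeroed = [x[:index] + [0.0] + x[index + 1:] for x in inputs]
    return 100.0 * (rms(model, zeroed, targets) - base) / base


def toy_model(x):
    # Hypothetical stand-in for a trained network: both inputs carry equal weight.
    return x[0] + x[1]


# Input 0 is clustered near mid-range, input 1 sits near the extremes.
inputs = [[0.1, 0.9], [-0.1, -0.9], [0.1, -0.9], [-0.1, 0.9]]
noise = [0.05, -0.05, 0.05, -0.05]
targets = [toy_model(x) + e for x, e in zip(inputs, noise)]

change_mid_range = disable_input(toy_model, inputs, targets, 0)
change_extremes = disable_input(toy_model, inputs, targets, 1)
```

Although both inputs are equally important to the toy model by construction, disabling the input distributed at the extremes produces a much larger apparent change than disabling the one clustered near zero.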
Given the importance of variable distributions when disabling inputs, this analysis
was done over both the test set and training set. Change in network performance was
gauged by the percentage drop in R and increase in RMS as a result of disabling a given
input PE. These percentages were calculated for each of the 20 ANNs making up each
model, and the results were averaged. Figure 4.5 below shows the average results of
disabling each input PE in turn for model 1A, over both the test set and the training set.
Figure 4.6 shows the same results for the ANN model 2A.
[Bar charts of percentage change in R and RMS versus disabled input (x, y, z, t, U, T, time, P, p): (a) ANN 1A (training set), (b) ANN 1A (test set)]
Figure 4.5 Change in ANN 1A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 1A. (time indicates time of day)
[Bar charts of percentage change in R and RMS versus disabled input (x, y, z, t, U, T, time, P, p): (a) ANN 2A (training set), (b) ANN 2A (test set)]
Figure 4.6 Change in ANN 2A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 2A. (time indicates time of day)
Note that for ANN 1A, the largest differences between the training set (Figure
4.5a) and the test set (Figure 4.5b) evaluations are for the variables t, T, time and p (i.e.,
diffusion time, temperature, time of day and pressure). These are precisely the variables
whose frequency distributions differ notably between the training and test set (cf. Figures
3.3 and 3.4). This illustrates the effect that input variable distribution can have on the
results of a PE-disabling sensitivity analysis. The differences between Figure 4.6a and
4.6b can be attributed to the same cause. Since the distribution of each of the position
variables (x, y, z) is concentrated at the mid-point of each variable's range, their influence
may be underestimated by this analysis. Conversely, the time of day and pressure
distributions are concentrated at the extremes, and the importance of both of these
variables may be exaggerated.
It is clear from the above discussion that a direct comparison of these results
should only be done between variables with similar distributions. First comparing the
position variables, it appears that downwind distance has the most influence on either
model's performance. Model 2A attributes more importance to the vertical coordinate.
This is plausible, since model 2A was constructed in the framework of absolute vertical
diffusion, i.e., the z coordinate is measured from the ground up. Conversely, the z
coordinate for model 1A is relative to the puff's centre of mass, and the ANN may have
exploited the symmetry of this coordinate system, attributing less importance to the
vertical coordinate.
Diffusion time, wind speed and Pasquill class all have relatively uniform
distributions over both training sets. Both models place more importance on wind speed
than either of the other two variables, which is not unreasonable. Wind speed has a direct
and immediate effect on a concentration distribution, while a puff's distribution changes
less rapidly with diffusion time. The Pasquill stability class was estimated using Table
2.1, which contains a large degree of subjectivity, and attempts to combine the effects of
various independent parameters into a single measure of stability. Stability class
estimates using this scheme may not accurately reflect the true thermal stratification of
the lower PBL.
The remaining three inputs (temperature, time of day and pressure) have such
differing and non-uniform distributions that it is difficult to determine their influence on
network performance with any degree of confidence.
4.3.2 Input Dithering
Dithering is similar to PE disabling in that each input is varied one by one, and
the resulting effect on network performance is measured. However, it differs in a
fundamental way. Instead of fixing each input variable to zero, each value is 'dithered',
or adjusted, by a small constant value. The effect of each input on network performance
is expressed as the change in output divided by the change in input. In other words,
dithering estimates the partial derivative of the output with respect to each input variable,
or $\partial o / \partial x_i$, in the notation of Chapter 2. Since each variable is dithered by the same
amount, a variable's frequency distribution does not affect the analysis.
Each constituent ANN of models 1A and 2A was dithered over the test set. The
dithering constant was chosen as 5% of the input mapping range, [-1, 1] (i.e., 0.05). The
fractional change in output due to each dithered PE was expressed as a percent. The
mean absolute value of this percentage change was calculated over the test set for each of
the 20 ANNs per model, and the values were averaged. Figure 4.7 below shows the
results of the dithering analysis for both models.
[Bar charts of percentage change in output versus dithered input (5%)]
Figure 4.7 Results of dithering inputs by 5% over the test set for (a) model 1A and (b) model 2A. Error bars indicate standard deviation among the 20 ANNs making up each model.
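The dithering procedure can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the software used in this work: `dither_sensitivity` is a hypothetical helper, `toy_model` is a hypothetical stand-in for a trained network, and the dithering constant of 0.05 is taken from the description above. The sketch assumes the undithered model output is never zero, since the change is expressed as a fraction of it.

```python
def dither_sensitivity(model, inputs, index, delta=0.05):
    """Mean absolute percentage change in output when input `index` is shifted by `delta`."""
    changes = []
    for x in inputs:
        shifted = list(x)
        shifted[index] += delta
        base = model(x)
        changes.append(abs(100.0 * (model(shifted) - base) / base))
    return sum(changes) / len(changes)


def toy_model(x):
    # Hypothetical stand-in for a trained network: the output depends
    # strongly on input 0 and only weakly on input 1.
    return 2.0 + 1.0 * x[0] + 0.2 * x[1]


inputs = [[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]]
sensitivity_0 = dither_sensitivity(toy_model, inputs, 0)
sensitivity_1 = dither_sensitivity(toy_model, inputs, 1)
```

For a linear stand-in such as this one, the change in output equals the weight times the dithering constant, so the ratio of the two sensitivities recovers the ratio of the weights regardless of how the inputs are distributed.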
Note the similarity between the results from either model. The only significant
difference arises from z and pressure, p. Model 2A attributes more importance to the
vertical coordinate than does model 1A, and a likely reason for this was described above.
Model 1A shows that dithering the pressure input has a large effect on the predicted
concentration value. It was not expected that atmospheric pressure would be one of the
most influential inputs. In fact, this result may once again be due in part to the frequency
distribution of this variable. Figure 3.4 shows that the test set distribution of p is
characterized by a few regions of high population separated by regions of zero
population. It may be that the ANN models do not interpolate well in the regions of zero
population, and the dithering of the pressure variable may force the model to predict in
this poorly-trained region. It is not clear why only model 1A places such large
importance on pressure.
The lower significance attributed to the downwind distance, x, in either model
may be due to the large range of input values for this variable. As noted in Chapter 3, the
x coordinate takes on values spanning the range (-70 m, 40 m), while the other two spatial
variables cover considerably smaller regions. Concentration levels are not expected to
vary considerably at very large distances from the puff's centre; large variations generally
occur closer to the puff centre of mass. Dithering a vector that has very large spatial
coordinates (i.e., far from the centre of the puff) will likely have very little effect on the
concentration prediction. Since there are more vectors at large |x| than at large |y| or |z|,
the importance of the x coordinate is diminished in the dithering analysis.
The greater importance of temperature than Pasquill stability class may indicate
that the temperature measurements provide some estimate of atmospheric stability. This
may not seem likely, since it is the vertical temperature gradient that provides a measure
of stability, and measurements at a minimum of two elevations are necessary to estimate
the gradient. However, if insolation is constant (as it was for at least 3 of 4 days during
the kaolin trials), the time of day may give an indirect estimate of the temperature of the
surface, since this depends largely on diurnal heating patterns. It is possible that the
ANN models have learned a relationship among the temperature, time of day, and
perhaps other meteorological inputs that affect atmospheric stability, and hence affect
concentration distributions. In any event, it appears as though all inputs contribute
significantly to the performance of either model.
4.3.3 Analysis of Model Residuals
It is of interest to determine if there are regions of input space where each ANN
model's performance varies. In order to determine this, the so-called model residuals, or
$(\ln C_p - \ln C_o) = \ln(C_p / C_o)$, were plotted against a number of input variables. The
independent variables analyzed were diffusion time, wind speed, temperature, time of
day, Pasquill stability class, and pressure. The test set was divided into a number of
subsets, each covering a specific interval of one of these six input variables. Models 1A
and 2A were then evaluated against each subset, and the model residuals were plotted as
box-plots against each of the independent variables. These box-plots are shown below
for models 1A and 2A in Figures 4.8 and 4.9, respectively. Each box lies horizontally
between the two endpoints of the subset, and is divided vertically by seven divisions.
The middle line in each box is the 50th percentile, and the bottom and top borders of the
box are the 25th and 75th percentiles, respectively. The error bars indicate the 10th and
90th percentiles, while the points represent the 5th and 95th percentiles.
[Box plots of model residuals versus input variables; recoverable axis labels: Pasquill Stability Class (A, A-B, B), Time of Day (morning, afternoon), pressure (in Hg)]
Figure 4.8 ANN model 1A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.
[Box plots of model residuals versus input variables; recoverable axis labels: Time of Day (morning, afternoon), Pasquill Stability Class (A, A-B, B), pressure (in Hg)]
Figure 4.9 ANN model 2A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.
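The construction of such box-plot statistics can be sketched in Python as below. This is an illustration under stated assumptions rather than the procedure actually used: the helper names `box_stats` and `binned_residuals` are hypothetical, the residual is taken as ln(Cp/Co), and the percentiles come from the standard library's `statistics.quantiles` with its default interpolation method, which may differ slightly from the convention of the plotting software used for Figures 4.8 and 4.9.

```python
import math
from statistics import quantiles


def box_stats(values):
    """Return the seven percentiles that define one box (5, 10, 25, 50, 75, 90, 95)."""
    q = quantiles(values, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {5: q[0], 10: q[1], 25: q[4], 50: q[9], 75: q[14], 90: q[17], 95: q[18]}


def binned_residuals(variable, c_obs, c_pred, edges):
    """Group residuals ln(Cp/Co) by intervals of one input variable."""
    result = {}
    for lo, hi in zip(edges, edges[1:]):
        residuals = [math.log(p / o)
                     for v, o, p in zip(variable, c_obs, c_pred) if lo <= v < hi]
        if len(residuals) >= 2:  # quantiles() needs at least two points
            result[(lo, hi)] = box_stats(residuals)
    return result


# Illustrative data only: exact predictions in the first interval,
# over-prediction by a factor of e in the second.
wind = [1, 2, 3, 11, 12, 13]
observed = [1.0] * 6
predicted = [1.0, 1.0, 1.0, math.e, math.e, math.e]
boxes = binned_residuals(wind, observed, predicted, [0, 10, 20])
```

Each entry of `boxes` holds the seven values needed to draw one box for one interval of the chosen input variable.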
Note that for both models, there is a general trend of under-prediction. This is
consistent with Figures 4.3 and 4.4 above, which show that both models 1A and 2A have
mean bias greater than unity. Although both models show somewhat better performance
in certain regions of input space than in others, no clear trends are evident. Model 1A
residuals show more variation between intervals of a given variable than do those of
model 2A. This may suggest that model 1A is under-trained in certain regions of input
space, but it is more likely due to the distribution of test set residuals over each input
variable interval. Generally, regions with more data points have lower residuals on
average than regions with fewer points. Again, this is consistent with the tendency of
these models to under-predict. It is unclear why model 2A shows so much less variation
in residual means than model 1A, but both models appear to predict with the same ability
over the entire input space.
4.4 ANN Model Concentration Distribution Predictions
An analysis of the predicted concentration distributions of the ANN models is
presented in this section. For a number of meteorological conditions, three dimensional
concentration distributions of models 1A and 2A were constructed within the spatial
domain of the input space considered here. The functional forms of these distributions
and their moments are examined, and trends with diffusion time and meteorological
variables are investigated. Finally, ANN model 1A is analyzed in detail, and analytical
expressions are derived relating the properties of the predicted distributions to the more
influential input variables.
4.4.1 Horizontal Concentration Distributions
Both ANN models 1A and 2A predict very smoothly varying concentration
distributions, peaked at the puff centroid and falling off rapidly with distance. In the
downwind and crosswind directions, these distributions are very well approximated by a
Gaussian distribution. Figure 4.10 below shows typical horizontal cross-sections taken
through the puff's centre (z=0 m) as predicted by ANN model 1A. These two surface
plots show the model's prediction after diffusion times of 10 seconds (Figure 4.10a) and
30 seconds (Figure 4.10b). The remaining model inputs used to construct these sections
are U=1.0 m/s, T=19 °C, Pasquill stability class A, p=30.3 in Hg, and time of day is
10:48 am.
Sections such as those shown in Figure 4.10 were constructed under a wide
variety of input conditions, and both ANN models 1A and 2A predict curves very similar
in form to those shown below. Indeed, it was found that under almost all conditions, both
models predict horizontal distributions that can be represented quite well by a Gaussian
distribution. Figure 4.11 shows one-dimensional profiles of the surfaces shown in Figure
4.10 with fitted Gaussian curves.
[Surface plots: (a) diffusion time, t=10 seconds; (b) diffusion time, t=30 seconds]
Figure 4.10 ANN model 1A predictions for a horizontal slice through the puff centre at z=0, shown (a) 10 seconds and (b) 30 seconds after release.
[Profile plots: (a) t=10 s, (b) t=30 s; legend: ANN model 1A predictions, Gaussian distribution]
Figure 4.11 ANN model 1A predictions and fitted Gaussian curves for profiles along z=0, y=0 (left) and x=0, z=0 (right). Diffusion times (a) 10 seconds and (b) 30 seconds are shown. Note: these are 1-D profiles of the surfaces shown in Figure 4.10.
These profiles show very good agreement with the fitted Gaussian curves ($R^2 \ge 0.98$
in all cases), but it is clear that the fit is better closer to the cloud's centre. As
distance from the puff centroid increases, both ANN models deviate from the Gaussian
distribution, predicting slightly higher concentration levels. It was found that this
behaviour is typical of both models 1A and 2A over a broad range of input conditions. It
should be noted that the very high $R^2$ values typical of most Gaussian fits are valid only
for the central region of the curve, since the smaller values at the tail contribute less to the
regression. However, after transforming the data logarithmically to account for this, it
was determined that most ANN predictions show very good agreement with the Gaussian
out to distances of about 3 standard deviations. Again, this is the case for virtually all
input conditions.
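One way to fit a Gaussian after a logarithmic transformation is to note that ln C is a quadratic function of position, so an ordinary least-squares parabola gives the Gaussian parameters directly. The Python sketch below illustrates this idea only; it is not the fitting procedure used in this work. The function names `det3` and `fit_gaussian_log` are hypothetical, the profile values are synthetic, and all concentrations must be strictly positive for the logarithm to exist.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_gaussian_log(xs, cs):
    """Fit C = A exp(-(x - mu)^2 / (2 sigma^2)) by least squares on ln C.

    ln C = a + b x + c x^2 with c = -1/(2 sigma^2), b = mu / sigma^2.
    Returns (A, mu, sigma).
    """
    ys = [math.log(c) for c in cs]
    s = [sum(x ** k for x in xs) for k in range(5)]            # sums of x^0 .. x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(normal)
    coeffs = []
    for k in range(3):  # Cramer's rule: replace column k with the right-hand side
        m = [row[:] for row in normal]
        for i in range(3):
            m[i][k] = t[i]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    sigma = math.sqrt(-1.0 / (2.0 * c))
    mu = -b / (2.0 * c)
    amplitude = math.exp(a - b * b / (4.0 * c))
    return amplitude, mu, sigma


# Synthetic profile (illustrative values only): A = 3, mu = 1.5 m, sigma = 4 m.
xs = [float(x) for x in range(-10, 11)]
cs = [3.0 * math.exp(-(x - 1.5) ** 2 / (2.0 * 4.0 ** 2)) for x in xs]
amplitude, mu, sigma = fit_gaussian_log(xs, cs)
```

Because the regression is performed on ln C, points in the tails carry the same weight as points near the peak, which is why a fit of this kind tests agreement with the Gaussian form well away from the centroid.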
Although there is no reason a priori to assume that the models' predicted
distributions should follow a Gaussian distribution, the fact that model predictions are
non-zero far from the puff centroid suggests that these slight over-predictions are not
physically realistic. The behaviour of the tails of the ANNs' predicted distributions can
be largely attributed to the number of clear-air data points included in the data set. As
noted in Chapter 3, data points with zero concentration were incorporated into both data
sets, specifically to help the ANNs learn the boundary of the puffs. Earlier ANN models
were trained on data sets with no such clear-air data points. These models predicted
distributions with significantly larger non-zero tails than those of models 1A and 2A. It
is likely that if more clear-air data points near the puff boundaries were included in the
data set, the ANN models would predict distributions that more rapidly approach zero
away from the puff centroid. However, maintaining the balance of the output frequency
distribution places a strict constraint on the number of such points that can be included in
the data set.
The character of the ANN distribution tails cannot be entirely attributed to the
number of clear-air points in the data set. While the ANN models predict very symmetric
distributions in the crosswind direction, downwind distributions typically exhibit
asymmetry about the centroid. Indeed, this behaviour has been reported by others; both
Sato (1995) and Yee, et al. (1998) found that downwind concentration distributions were
negatively skewed, such that the trailing half of the puff had an elongated tail. The
downwind profiles shown in Figure 4.11 also show this trend to some degree, as did most
ANN predictions. Yee determined that the trailing half of the puff's distribution could be
approximated quite well by an exponential distribution. Such a distribution was fitted to
ANN model predictions under a number of input conditions, but did not provide a
significantly better approximation than the Gaussian distribution. In general, the
skewness of the ANN model predictions in the downwind direction is small, and the
Gaussian distribution is felt to provide a sufficiently accurate approximation.
4.4.2 Vertical Concentration Distributions
The vertical concentration distributions predicted by ANN models 1A and 2A
differ significantly due to the different vertical coordinate systems employed by each
model. Both models predict tilted distributions, such that the upper portions of the puff
diffuse downwind at a greater rate than the lower portions. This is consistent with a
sheared surface layer, where wind speed increases with elevation above the ground.
Typical vertical cross-sections are shown below in Figures 4.12 and 4.13 for models 1A
and 2A, respectively. These sections are taken through the crosswind centre of the puff
(i.e., at y=0), and show normalized concentration contours. Remaining inputs are the
same as those indicated in Figures 4.10 and 4.11 above.
Figure 4.12 Model 1A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.
Figure 4.13 Model 2A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.
In order to determine the degree of puff tilting predicted by each model, the
downwind centroid was calculated along planes of constant z. At a given vertical
position z, the downwind centroid, X_c(z), is given by:
$$ X_c(z) = \frac{\iint x \, \rho(x, y, z) \, dx \, dy}{\iint \rho(x, y, z) \, dx \, dy} $$
For each ANN model, X_c was calculated as a function of vertical position, under a
number of different input conditions. It was found that for both models, the downwind
centroid position varies approximately linearly with vertical distance (R² > 0.98 in most
cases). Significant deviations from linearity were only observed under conditions of low
temperatures and long diffusion times, but even in these cases, the approximation of
linearity is satisfactory (R² > 0.94).
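The centroid calculation and the linearity check described above can be sketched as follows. The function `example_puff` is a hypothetical stand-in for an ANN prediction (a sheared Gaussian with an assumed slope of 1.2), and the grid extents and all names are illustrative assumptions. On a uniform grid the cell area cancels between numerator and denominator, so plain sums are used in place of the integrals.

```python
import math

def downwind_centroid(conc_at, xs, ys, z):
    """Concentration-weighted mean of x over the plane at height z."""
    weighted = 0.0
    total = 0.0
    for x in xs:
        for y in ys:
            c = conc_at(x, y, z)
            weighted += x * c
            total += c
    return weighted / total

def linear_fit(zs, values):
    """Least-squares line values = intercept + slope*z; returns (intercept, slope, r2)."""
    n = len(zs)
    z_mean = sum(zs) / n
    v_mean = sum(values) / n
    s_zz = sum((z - z_mean) ** 2 for z in zs)
    s_zv = sum((z - z_mean) * (v - v_mean) for z, v in zip(zs, values))
    slope = s_zv / s_zz
    intercept = v_mean - slope * z_mean
    ss_res = sum((v - (intercept + slope * z)) ** 2 for z, v in zip(zs, values))
    ss_tot = sum((v - v_mean) ** 2 for v in values)
    return intercept, slope, 1.0 - ss_res / ss_tot

def example_puff(x, y, z, tilt=1.2):
    """Hypothetical stand-in for an ANN prediction: a sheared Gaussian puff."""
    return math.exp(-((x - tilt * z) ** 2) / 50.0 - y ** 2 / 18.0 - z ** 2 / 32.0)

xs = [float(i) for i in range(-40, 41)]       # downwind grid (m)
ys = [float(j) for j in range(-15, 16)]       # crosswind grid (m)
zs = [0.5 * k for k in range(-8, 9)]          # vertical levels, -4 m .. 4 m
centroids = [downwind_centroid(example_puff, xs, ys, z) for z in zs]
intercept, slope, r2 = linear_fit(zs, centroids)
```

For this idealized puff the fitted slope recovers the assumed tilt and R² is essentially unity; for real model output the same fit would yield the slope of the centroid profile together with a measure of how well linearity holds.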
For both models, the tilt angle displayed the same two general trends.
Specifically, puffs exhibited larger tilt angle under conditions of higher wind speed, and
the degree of puff tilting decayed slowly with diffusion time. Both of these results are
consistent with the physical nature of a sheared surface layer. Given that wind speed
approaches zero at the surface, higher wind speeds indicate larger shear close to the
ground. Since the effect of wind shear is to stretch the puff in the downwind direction,
larger shear results in puff distributions with greater vertical skewness, and hence, larger
tilt angle.
In his theoretical analysis of puff dispersion in a sheared surface layer, van Ulden
(1992) concluded that under neutral conditions, a diffusing puff will maintain its shape as
it diffuses, i.e., its tilt angle is invariant with time. He also argued that under stable
conditions, a puff's tilt angle should increase linearly with diffusion time. Although he
presented no discussion of the development of puff tilt angle for unstable conditions, he
determined that the interaction of wind shear and vertical diffusion largely determines the
degree of puff tilting. Specifically, the effect of wind shear is to increase the vertical
skewness of the puff, while the turbulent vertical mixing acts to destroy this skewness. In
unstable conditions, there is an increased level of convective turbulence generated by the
large temperature gradient between the surface and the air above it. These convective
turbulent eddies contribute to enhanced vertical mixing. Thus, in unstable conditions, the
puff skewness generated by the wind shear is destroyed at a greater rate by this increased
vertical mixing, and tilt angle will decrease as the puff diffuses downwind.
Typical vertical profiles of X_c for model 1A are shown below in Figure 4.14, for
two different temperatures, and show the slow decay of puff tilt angle with diffusion
time. These X_c profiles are typical of model 1A predictions under a wide range of input
conditions; a more detailed analysis of model 1A tilt angle is presented below.
Figure 4.14 Model 1A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the good linear fit and the slow decay of tilt angle with diffusion time (U=1.0 m/s).
Model 2A shows similar behaviour under most conditions, but it should be noted
that for low temperatures, a somewhat different trend is apparent. Specifically, model 2A
predicts increasing tilt angle with diffusion time for T ≤ 20 °C. This effect is less severe
as temperature increases, and for T > 20 °C, the model predicts decaying tilt angle with
diffusion time, similar to model 1A predictions. It is unclear why this trend is observed
only in conditions of low temperature; the model may have learned some relationship
between the temperature measurements and the thermal stability, or it may be an artifact
of the specific conditions of the experimental trials.
Typical x-centroid profiles for model 2A are shown below in Figure 4.15. The
low temperature predictions of increasing tilt angle with diffusion time may be due to
gravitational settling of the kaolin as diffusion proceeds. As demonstrated above in
Figure 4.13, model 2A predicts distributions that fall closer to the surface as the puff
disperses downwind. This effect, together with the influence of wind shear, results in a
'squashed' distribution having increased vertical skewness.
Figure 4.15 Model 2A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the increasing puff tilt with diffusion time for low T.
The vertical contours of model 2A concentration distributions provide some
insight about the interaction of the aerosol with the surface. Most contours, such as that
shown above in Figure 4.13, exhibit large concentration gradients close to the ground,
with the distribution falling off to very low levels at z=0. This indicates that very little of
the aerosol is reflected from the ground back up into the puff. Aerosol puffs that are
reflected from the surface will generally show increased concentration levels closer to the
ground. This eventually leads to vertical concentration profiles that are highest at the
surface, and taper off with height above the ground.
However, such behaviour was not predicted by model 2A. This suggests that
either the aerosol release height was sufficiently high that diffusing aerosol did not
interact significantly with the surface, or that there was non-negligible ground deposition.
Although kaolin is non-reactive, deposition due to impaction with the surface roughness
elements (i.e., vegetation) can result in significant aerosol deposition. This effect is
difficult to assess; usually, surface deposition rates are determined empirically, and
depend on the properties of both the aerosol and the surface (Hanna et al., 1982). A
simple yet effective way to deal with dry surface deposition is to incorporate a reflection
coefficient into the image source model. This approach simply multiplies the image
source term of equation (40) by an empirically determined constant less than unity,
effectively reducing the amount of aerosol reflected back up into the puff. COMBIC
employs such a system, but determining the correct value of the reflection coefficient is
almost impossible without some prior knowledge of the deposition rate of the aerosol.
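The reflection-coefficient idea can be sketched with the vertical factor of a Gaussian puff. The form below is the common textbook image-source term and is only assumed to correspond to equation (40); the function name, the argument names and the numerical values are hypothetical.

```python
import math

def vertical_term(z, release_height, sigma_z, reflection=1.0):
    """Vertical factor of a Gaussian puff with a (partially) reflecting ground.

    reflection = 1.0 gives perfect reflection (the usual image source);
    reflection = 0.0 removes the image source, i.e. nothing is reflected.
    """
    direct = math.exp(-((z - release_height) ** 2) / (2.0 * sigma_z ** 2))
    image = math.exp(-((z + release_height) ** 2) / (2.0 * sigma_z ** 2))
    return direct + reflection * image

# Ground-level value for an assumed 2 m release height and 3 m vertical spread.
full = vertical_term(0.0, 2.0, 3.0, reflection=1.0)
partial = vertical_term(0.0, 2.0, 3.0, reflection=0.5)
none = vertical_term(0.0, 2.0, 3.0, reflection=0.0)
```

At ground level the direct and image terms are equal, so the ground-level concentration scales as (1 + reflection) times the direct term; a coefficient below unity therefore lowers the predicted near-surface concentration, as described above.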
4.4.3 Cloud Spread
The spatial spread of the predicted concentration distributions of both models was
examined under a variety of input conditions. This was done by calculating the second
moments of the distribution, defined as:
with similar expressions for σ_y² and σ_z². It was found that for both models, cloud spread
varied significantly with diffusion time and wind speed, but showed little variation with
the remaining input variables. Figure 4.16 below shows the temporal development of the
second moments of predicted concentration distributions for model 1A. Three different
wind speeds are shown, and the remaining model inputs are the same as those indicated
above in Figure 4.10. Model 2A cloud moments show similar behaviour, but generally
show larger vertical dispersion owing to the vertical meander of the puff.
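A minimal sketch of the second-moment calculation is given below for a one-dimensional profile. It assumes the usual concentration-weighted second moment taken about the centroid, with the dispersion coefficient as its square root; the function name and the synthetic profile (centroid at 2 m, spread of 4 m) are illustrative assumptions.

```python
import math

def dispersion_coefficient(positions, concentrations):
    """Square root of the concentration-weighted second moment about the centroid."""
    total = sum(concentrations)
    centroid = sum(p * c for p, c in zip(positions, concentrations)) / total
    second_moment = sum(
        (p - centroid) ** 2 * c for p, c in zip(positions, concentrations)
    ) / total
    return math.sqrt(second_moment)

# Hypothetical downwind profile on a uniform 0.5 m grid.
positions = [0.5 * i for i in range(-80, 81)]             # -40 m .. 40 m
profile = [math.exp(-((p - 2.0) ** 2) / (2.0 * 4.0 ** 2)) for p in positions]
sigma_x = dispersion_coefficient(positions, profile)
```

Repeating the calculation at successive diffusion times gives the temporal development of each dispersion coefficient; for a three-dimensional field the same weighted sums would simply run over all grid points.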
All dispersion coefficients show linear growth with diffusion time, to good
approximation (R² > 0.95 in most cases). This is in agreement with the observations
reported by Yee, et al. (1998), who found that the downwind and crosswind widths of
instantaneously released puffs show nearly linear growth. Sato (1995) found similar
results for puff releases at short diffusion times (i.e., t < 100 s).
Figure 4.16 Evolution of dispersion coefficients with diffusion time for wind speed (a) 1.0 m/s, (b) 1.5 m/s and (c) 2.5 m/s.
The relative magnitudes of the dispersion coefficients illustrated in Figure 4.16
above were observed under all input conditions. Both models predict downwind puff
dimensions that are significantly larger than those in the crosswind or vertical directions.
This can be attributed to the effects of wind shear, which enhance diffusion in the
downwind direction. At short diffusion times, the vertical puff width is also notably
larger than the crosswind width. This is likely due in part to the dissemination technique
used in the kaolin release trials. The disseminator nozzle was inclined slightly from the
plane of the surface, effectively giving kaolin clouds larger initial vertical spread.
Both models predict that downwind dispersion coefficients grow more rapidly
with diffusion time than do the crosswind or vertical widths. They also predict that σ_x
grows more rapidly at higher wind speeds. Again, these trends are consistent with the
enhanced downwind dispersion effected by wind shear.
Model 2A predicts that dispersion coefficients increase with height above the
ground. The downwind and crosswind second moments were calculated about the tilted
axis of the puff at various levels above the surface. It was found that puff spread was
greater at higher levels above the surface. A typical vertical profile of dispersion lengths
is shown below in Figure 4.17.
Figure 4.17 Vertical profiles of downwind and crosswind dispersion lengths as predicted by model 2A (horizontal axis: dispersion coefficients, m). Shown for wind speed of 1.0 m/s, 10 seconds after release. Remaining inputs are the same as those in Figure 4.10 above.
This increase in puff spread with vertical height is due to the vertical
inhomogeneity of the surface layer caused by the presence of the ground. Wind shear is
largely responsible for the enhanced downwind puff spread with height above the
surface. However, the vertical increase in spread in both the downwind and crosswind
directions is a result of the varying eddy spectrum with height above the surface. Very
close to the ground, only the smallest turbulent eddies are present, since the motion of
large eddies is restricted by the rigid surface. However, as distance from the surface
increases, larger eddies are present, resulting in greater dispersion at larger heights.
4.4.4 ANN Model 1A Parameterization
The ANN models developed here can be easily embedded into a simple computer
program for deployment, requiring relatively little computational time to produce results.
Nevertheless, for the sake of convenience, it is often desirable to have simple, explicit
analytical formulas to describe a model. To this end, ANN model 1A predictions were
further analyzed, and the significant features of the model were encapsulated into
analytical expressions of relatively simple form.
The sensitivity analyses described above demonstrate that each of the nine input
variables used to construct the ANN models contributes in a significant way to the
models' performance. However, determining the behaviour of predicted concentration
distributions over the entire range of all nine variables would be an extremely difficult
task. Therefore, efforts were directed at a smaller subset of the input variables.
Specifically, the three spatial variables, diffusion time and wind speed were used as the
primary descriptive variables. Temperature was also varied during this analysis to
determine its effect on predicted distributions. Pasquill stability class was adjusted with
the wind speed, in order to maintain consistency with its definition as shown in Table 2.1.
Conditions of strong daytime insolation were assumed for this analysis, consistent with
the conditions of most of the release trials.
The remaining inputs, time of day and atmospheric pressure, were held at fixed
values. These constant values were chosen such that they lie in the mid-range for each
variable, at values that are fairly well represented by the training set. This ensures that
the ANN model is interpolating in a well-trained region of input space. The chosen fixed
values for time of day and pressure were 10:48 am and 30.30 in Hg, respectively.
Based on the analysis of the concentration distributions presented above, model
1A predictions were modeled by a three-dimensional tilted Gaussian distribution. This
distribution takes the following form:
where β is the tilt angle (i.e., the slope of the X_c(z) curve from the z-axis), and all
remaining variables and parameters are as defined above. The four free parameters, i.e.,
tilt angle and the three dispersion coefficients, were determined as functions of wind
speed and diffusion time. The effect of varying temperature was also examined, but most
analyses were performed at T=19 °C, since this value lies near the mid-range of
temperatures, and is the most populated temperature value in the training set (cf. Figure
3.3).
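A tilted Gaussian of the kind described above can be sketched as follows. The expression used here, in which the downwind coordinate is offset by β·z, is one plausible reading of a distribution whose centroid profile X_c(z) has slope β; it is an assumption for illustration and is not claimed to reproduce equation (41) term by term. The function name, the `mass` argument and the numerical values are likewise hypothetical.

```python
import math

def tilted_gaussian(x, y, z, beta, sigma_x, sigma_y, sigma_z, mass=1.0):
    """Three-dimensional Gaussian whose downwind centroid shifts as beta * z.

    Coordinates are relative to the puff centroid. Integrating over x at any
    fixed z removes the dependence on beta, so the total integral equals mass.
    """
    norm = mass / ((2.0 * math.pi) ** 1.5 * sigma_x * sigma_y * sigma_z)
    exponent = (
        (x - beta * z) ** 2 / (2.0 * sigma_x ** 2)
        + y ** 2 / (2.0 * sigma_y ** 2)
        + z ** 2 / (2.0 * sigma_z ** 2)
    )
    return norm * math.exp(-exponent)

# Illustrative values only: tilt 1.1, spreads of 6 m, 3 m and 2 m.
peak = tilted_gaussian(0.0, 0.0, 0.0, 1.1, 6.0, 3.0, 2.0)
ahead = tilted_gaussian(1.1 * 2.0 + 1.5, 0.0, 2.0, 1.1, 6.0, 3.0, 2.0)
behind = tilted_gaussian(1.1 * 2.0 - 1.5, 0.0, 2.0, 1.1, 6.0, 3.0, 2.0)
```

In this form the distribution at each height is symmetric about the tilted axis x = β·z, so the downwind centroid at height z lies at β·z, consistent with the linear centroid profiles noted earlier.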
Vertical profiles of the x-centroid, such as those shown above in Figure 4.14, were
constructed under a number of wind speeds and diffusion times. As already noted, model
1A predicts that tilt angle decreases with diffusion time and increases with wind speed.
These trends are summarized below in Figure 4.18. Similar trends were observed for
different temperatures.
Figure 4.18 Variation of puff tilt angle with diffusion time and wind speed, shown for T=19 °C (series: U=1.0 m/s, U=1.5 m/s, U=2.5 m/s, with linear fit).
Tilt angle was found to vary approximately linearly with diffusion time and wind
speed, and a plane of the form β(U,t) = a + bU + ct provided a sufficient approximation
(R² > 0.96). The fitting parameters were determined to be:
a = 0.99 ± 0.03
b = 0.20 ± 0.01
c = -0.0049 ± 0.0006.
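The fitted plane can be evaluated directly. The sketch below uses the central values of the three fitting parameters quoted above and ignores their uncertainties; wind speed is assumed to be in m/s and diffusion time in seconds, and the function and constant names are hypothetical.

```python
# Central values of the fitted plane parameters (uncertainties omitted).
TILT_A = 0.99
TILT_B = 0.20
TILT_C = -0.0049

def tilt_angle(wind_speed, diffusion_time):
    """Tilt parameter from the plane a + b*U + c*t (U assumed in m/s, t in s)."""
    return TILT_A + TILT_B * wind_speed + TILT_C * diffusion_time

example = tilt_angle(1.0, 20.0)   # 0.99 + 0.20 - 0.098
```

For a wind speed of 1.0 m/s at 20 s after release this gives a value of about 1.09, and the signs of b and c reproduce the two trends noted above: tilt increases with wind speed and decays slowly with diffusion time. The plane should only be applied within the range of wind speeds and diffusion times over which it was fitted.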
The dispersion lengths were determined by calculating the second moments of the
predicted concentration distributions for various wind speeds and diffusion times. The
downwind and vertical second moments were calculated about the tilted axis of the puff.
The variation of the dispersion coefficients with wind speed and diffusion time is very
similar to that shown above in Figure 4.16, and a plane provided a good fit to the
predicted puff spreads. Table 4.6 below summarizes the fitting parameters to the plane
σ_i(U,t) = a + bU + ct, for i = x, y, z. The dispersion coefficients did show some
variation with temperature; lower temperatures generally showed greater puff spread.
However, the variations were minor, and for the sake of simplicity, they were not
incorporated into the parameterization.
Table 4.6 Fitting parameters for the dispersion coefficients as linear functions of diffusion time and wind speed.
dispersion coefficient    a    b    c    R²
The performance of equation (41) together with the simple parameterizations for
tilt angle and dispersion coefficients was evaluated against data set 1 test set and
validation set. For comparison, the parameterized model's performance statistics are
listed below together with those of the other models considered above. Table 4.7a shows
statistics for the test set, and Table 4.7b for the validation set.
Table 4.7a Model comparison over data set 1 test set.
Model                  R     VG     MG    F2    F10
ANN 1A                 0.63  14.43  1.40  0.31  0.84
1A Parameterization    0.49  27.38  1.34  0.27  0.77
GPMs                   0.44  58.22  1.89  0.25  0.73
GPMp                   0.46  38.09  1.43  0.27  0.76
Table 4.7b Model comparison over data set 1 validation set.
Model                  R     VG     MG    F2    F10
ANN 1A                 0.58  13.52  1.01  0.30  0.83
1A Parameterization    0.54  15.94  0.88  0.29  0.81
GPMs                   0.52  17.17  0.93  0.30  0.82
GPMp                   0.56  15.99  0.72  0.31  0.82
Although the parameterization of model 1A shows significantly worse
performance than the full ANN model, it still outperforms both traditional Gaussian puff
models over the test set, and shows comparable performance over the validation set. It is
clear that much of the predictive power of the ANN model is lost in the parameterization,
and that the effects of the neglected input variables on model performance are significant.
Nonetheless, the parameterization provides a simple analytical model for puff diffusion in
a sheared surface layer, with a clear physical interpretation. Under the conditions
covered by the present data set, this parameterized model provides a more accurate
alternative to existing Gaussian puff models.
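The MG, VG, F2 and F10 columns of Tables 4.7a and 4.7b can be illustrated with the sketch below. It assumes the definitions commonly used in dispersion-model evaluation (geometric mean bias, geometric variance, and the fractions of predictions within a factor of 2 and of 10 of the observations); whether these match the definitions adopted earlier in this work in every detail is an assumption. The function name and the sample values are hypothetical, and strictly positive concentrations are required because of the logarithms.

```python
import math

def performance_measures(observed, predicted):
    """Geometric mean bias, geometric variance, and factor-of-2 / factor-of-10 fractions.

    All concentrations must be strictly positive.
    """
    n = len(observed)
    log_ratios = [math.log(o / p) for o, p in zip(observed, predicted)]
    ratios = [p / o for o, p in zip(observed, predicted)]
    return {
        "MG": math.exp(sum(log_ratios) / n),
        "VG": math.exp(sum(r * r for r in log_ratios) / n),
        "F2": sum(1 for r in ratios if 0.5 <= r <= 2.0) / n,
        "F10": sum(1 for r in ratios if 0.1 <= r <= 10.0) / n,
    }

# Hypothetical example: predictions exceed observations by factors of 1, 2, 4 and 20.
stats = performance_measures([1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 4.0, 20.0])
perfect = performance_measures([0.2, 1.5, 3.0], [0.2, 1.5, 3.0])
```

Under these definitions a perfect model has MG = VG = F2 = F10 = 1, an MG below unity indicates over-prediction on average, and VG grows rapidly with scatter, which is why it separates the models in the tables more strongly than the correlation coefficient does.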
Chapter 5
Conclusions
5.1 Summary and Conclusions
A number of ANN models were constructed to predict the concentration
distribution of an instantaneously released aerosol in the planetary boundary layer. These
models were based on data collected from 50 field trials, where 50 g lots of kaolin
powder were released from near ground level. Three-dimensional concentration maps
were measured using a scanning lidar system, which provided over 100,000
concentration measurements used to train the ANNs. Based on easily measured
meteorological parameters, these ANN models were able to significantly outperform
traditional Gaussian puff models, including the US Army Research Laboratory's
Gaussian-based dispersion code COMBIC.
The field data were analyzed in two separate coordinate systems. One system
followed the centre of mass of the puff as it diffused downwind, and thus removed the
effect of puff meander from the diffusion process. The other coordinate system followed
the horizontal centre of mass of the puff, but had a fixed vertical coordinate, relative to
the surface. Both systems are commonly used in theoretical and practical treatments of
aerosol diffusion, and the second system facilitates the assessment of surface effects on
the dispersion process.
Numerous multi-layer feed-forward neural networks were trained on the field data
using the extended delta-bar-delta backpropagation learning algorithm. It was found that
network architecture had the greatest effect on model performance. More complex
networks (9-40-20-1) repeatedly showed better performance against the test set, but failed
to generalize well against the validation set. It was found that the cross-validation
learning technique employed might not have sufficiently guarded against over-training.
For this reason, networks with the smallest architectures (9-20-1) that showed good
performance against both the test set and the validation set were retained for further
analysis.
ANN models with identical architectures and parameter settings converged to
different points in weight space given different weight initializations. This suggests that
the error surface had numerous minima of comparable magnitude. For this reason, the
final model for each data set was constructed by averaging the output predictions from 20
trained ANN models, each with the same 9-20-1 architecture and network parameter
settings, but initialized to different points in weight space.
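The averaging step can be sketched in a few lines. The three callables below are hypothetical stand-ins for trained networks that differ only in their weight initialization; the function name and the stand-in outputs are assumptions for illustration, and the same averaging would apply to 20 trained networks.

```python
def ensemble_predict(models, inputs):
    """Average the outputs of several trained models for one input vector."""
    predictions = [model(inputs) for model in models]
    return sum(predictions) / len(predictions)

# Hypothetical stand-ins: each 'network' returns a slightly different estimate.
members = [
    lambda v: 0.9 * sum(v),
    lambda v: 1.0 * sum(v),
    lambda v: 1.1 * sum(v),
]
averaged = ensemble_predict(members, [1.0, 2.0, 3.0])
```

Averaging in this way does not favour any single minimum of the error surface, and member-to-member scatter in the predictions tends to cancel in the mean.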
The performance of the final averaged ANN model for each coordinate system
was compared to traditional Gaussian puff models. Two standard Gaussian puff
equations were employed, one for each coordinate system. Two of the most commonly
used dispersion coefficient parameterizations were used with the Gaussian puff
equations; one developed by Pasquill (Pasquill and Smith, 1983), the other by Slade
(1968). The US Army Research Laboratory's Gaussian-based dispersion code COMBIC
was also tested against the field data.
The average ANN models showed better performance than the Gaussian puff
models against both the test set and the validation set. In general, ANN models gave
significantly better correlation, smaller mean bias, and far less variance than exhibited by
the Gaussian puff models.
Sensitivity analyses performed on both ANN models revealed that all nine input
variables used to describe the dispersion process contributed significantly to the
performance of each model. Specifically, it was found that the three spatial variables,
wind speed and ambient temperature had the most effect on model predictions. Diffusion
time was also found to be an influential input.
An analysis of the predicted concentration distributions of both models was
performed to gain insight on the trends learned by the neural networks, and to determine
the significance of these trends with regard to the process of puff dispersion. The models
predicted very smoothly peaked concentration distributions that were well approximated
by a Gaussian distribution, in accord with observations reported in the literature.
Predicted distributions also showed small negative skew in the downwind direction,
consistent with observations reported in the literature for puff releases in a sheared
surface layer.
ANN models predicted tilted distributions, such that the upper portion of the puff
was displaced further downwind than the lower portion as a direct result of wind shear.
Tilt angle was found to increase with wind speed and decay slowly with diffusion time.
These predictions are consistent with theoretical treatments of puff diffusion in a sheared
surface layer.
ANN model 2A (absolute z-coordinates) predicted distributions that fall closer to
the ground as diffusion proceeds, indicating that gravitational settling of the kaolin cloud
may be more significant than expected. These distributions also indicated that surface
reflection of the aerosol was minimal, and that aerosol ground deposition may have been
responsible for the low concentration predictions near the surface. This model also
predicted increasing dispersion at greater distances above the surface. This is consistent
with the common hypothesis of increasing eddy diffusivity with vertical distance.
Both models predicted dispersion coefficients that grow approximately linearly
with diffusion time, consistent with reported observations in the literature. Downwind
dispersion was greater than in either the crosswind or vertical direction, and proceeded at
a greater rate.
ANN model 1A (relative z-coordinates) was parameterized in terms of the most
influential input variables. A tilted Gaussian distribution was used to approximate the
ANN's predicted concentration distribution, and simple analytical formulas were
developed relating tilt angle and the dispersion coefficients to wind speed and diffusion
time. The parameterization did not have the same predictive power as the full ANN
model, but it provided a simple analytical model for puff dispersion in a sheared surface
layer, and significantly outperformed traditional Gaussian puff models over the test set.
5.2 Recommendations
A more robust and accurate ANN model could be developed if a more detailed
description of the character of the flow field was measured. The most severe limitation
of the present data set is the lack of an accurate measurement of mean wind speed and
direction. Since the effects of wind shear were determined to be very influential in the
process of puff dispersion near the ground, vertical profiles of mean velocity should also
be measured. In addition, a more accurate quantification of atmospheric stability could
be obtained from measured vertical temperature profiles. These measurements do not
pose significant experimental challenges, and can be easily made using common
meteorological instrumentation.
Although a goal of this research was to construct a model using easily measured
meteorological parameters, it is well known that an accurate description of turbulent
diffusion requires the determination of the statistical parameters of the turbulent flow
field. Rapidly sampled (~10 Hz) measurements of wind speed, wind direction and
temperature would provide a good description of the turbulence, from which important
descriptive variables could be derived. Quantities such as turbulent intensity, vertical
heat flux, Monin-Obukhov length, and friction velocity could be determined from such
measurements, and provide useful inputs for an ANN model.
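Two of the quantities mentioned above can be sketched from rapidly sampled series. The sketch assumes that a vertical velocity series is also available (for example from a three-component anemometer), which goes beyond the measurements listed above; the function names and sample values are hypothetical, and no detrending or coordinate rotation is applied.

```python
import math

def mean(values):
    """Arithmetic mean of a sequence."""
    return sum(values) / len(values)

def fluctuations(values):
    """Departures of each sample from the series mean (Reynolds decomposition)."""
    m = mean(values)
    return [v - m for v in values]

def turbulence_intensity(speed):
    """Standard deviation of wind speed divided by its mean."""
    primes = fluctuations(speed)
    return math.sqrt(mean([p * p for p in primes])) / mean(speed)

def kinematic_flux(a, b):
    """Mean product of fluctuations, e.g. w'T' for the kinematic heat flux."""
    return mean([pa * pb for pa, pb in zip(fluctuations(a), fluctuations(b))])

# Hypothetical short records: wind speed (m/s), vertical velocity (m/s), temperature (deg C).
intensity = turbulence_intensity([2.0, 4.0])
heat_flux = kinematic_flux([1.0, -1.0, 1.0, -1.0], [22.0, 18.0, 22.0, 18.0])
```

A positive heat flux, as in this toy record where upward motion coincides with warmer air, corresponds to unstable conditions; in practice the averages would be taken over records of many minutes rather than a handful of samples.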
A significant amount of the present data set was discarded because a number of
diffusing puffs passed out of the LCM scanning volume. In future field trials, the aerosol
source should be positioned such that most of the dispersing cloud is contained within the
LCM scanning volume. Increasing the scanning volume would alleviate this problem to
some degree, but at the expense of either angular resolution or scanning time. Also,
positioning the lidar system closer to the ground would provide important measurements
of near-surface concentration levels. Such measurements would provide insights into the
interaction of a diffusing aerosol with the ground.
Given that an ANN model is empirically based, it can only be reliably used for
interpolation. Therefore, the predictive capability of any ANN model is restricted to the
conditions under which the data were collected. A more robust model can be developed if
further field trials are conducted under an extended range of atmospheric conditions.
Specifically, to enhance the present data set, puff releases should be performed under
more stable conditions, perhaps at night, or just before dawn. The range of wind speeds
and temperatures should also be expanded. Varying the type and mass of aerosol
released would also help build a more robust model. Finally, a better dissemination
technique should be employed, or at the very least, source effects should be measured and
controlled. The kaolin disseminator used for the present field trials produced puffs with
very large initial dimensions, which were difficult to quantify.
Aside from considering better descriptive variables for inputs as noted above, two
major considerations should be made when constructing ANN models to predict 3-D
concentration distributions. First, some method to include clear-air zero-concentration
data points in the data set without greatly skewing the output frequency distribution
should be developed. This would likely give the model better predictive capability away
from the central portion of the puff. Second, an alternate method should be considered
for presenting the lidar data to the ANN for training. The typical method of separating
the data set into training and test sets by random selection of vectors is not appropriate for
this type of data set, and the usual cross-validation learning techniques do not properly
guard against over-training. Input vectors from individual lidar scans should be kept
together, and networks should be trained on complete scans, one at a time. Over-training
can be checked by cross-validation between scans.
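The scan-wise partitioning recommended above can be sketched as follows. Each input vector is assumed to carry an identifier for the lidar scan it came from; the function name and the sample identifiers are hypothetical. Every fold holds out one complete scan for checking and trains on all remaining scans, so vectors from one scan never appear on both sides of the split.

```python
def leave_one_scan_out(scan_ids):
    """Yield (held_out_scan, training_indices, checking_indices) for each scan."""
    for held_out in sorted(set(scan_ids)):
        training = [i for i, s in enumerate(scan_ids) if s != held_out]
        checking = [i for i, s in enumerate(scan_ids) if s == held_out]
        yield held_out, training, checking

# Hypothetical data set of six input vectors drawn from three scans.
scan_ids = [1, 1, 2, 2, 2, 3]
folds = list(leave_one_scan_out(scan_ids))
```

Because neighbouring points within a single scan are strongly correlated, a random split lets near-duplicates of the training vectors leak into the test set; holding out whole scans removes that leakage and gives a more honest check against over-training.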
List of References
Andrews, W. S., Costa, J., and Roy, G., "Measuring and Modeling the Influence of Atmospheric Effects on the Concentration Distributions within Transient Aerosol Plumes", Proc. of the 1998 Battlespace Atmospheric and Cloud Impacts on Military Operations Conference, 243-250, 1998.
Ayres, S. D., and Desutter, S., "Combined Obscuration Model for Battlefield Induced Contaminants (COMBIC92) Model Documentation", U.S. Army Atmospheric Sciences Laboratory, White Sands Missile Range, 1995.
Batchelor, G. K., "Diffusion in a Field of Homogeneous Turbulence II. The Relative Motion of Particles", Proc. Cambridge Phil. Soc., 48, 345, 1952.
Baughman, D. R., and Liu, Y. A., Neural Networks in Bioprocessing and Chemical Engineering, Academic Press, San Diego, 1995.
Bissonnette, L. R., Bastille, C., and Vallee, G., "Estimation of Cloud Droplet Size Density Distribution from Multiple Field-of-View Lidar Returns", Report DREV R-9705, Valcartier, 1997.
Bissonnette, L. R., "Lidar inversion methods: an introduction", Proc. 8th Int. Workshop on Multiple Scattering Lidar Experiments, 102, 1996.
Bissonnette, L. R., and Hutt, D. L., "Multiply Scattered Aerosol Lidar Returns: Inversion Method and Comparison with in situ measurements", Applied Optics, 34, 6959-6975, 1995.
Boznar, M., Lesjak, M., and Makar, P., "A Neural Network-based Method for the Short-term Predictions of Ambient SO2 Concentrations in Highly Polluted Industrial Areas of Complex Terrain", Atm. Env., 27B, 2, 221-230, 1993.
Briggs, G. A., "Diffusion Estimation for Small Emissions", U.S. NOAA E.R.L. Report ATDL-106, Oak Ridge, 1973.
Center for Chemical Process Safety (CCPS), Guidelines for Use of Vapor Cloud Dispersion Models, 2nd ed., American Institute of Chemical Engineers, New York, 1996.
Costa, J., "Measuring and Modeling the Atmospheric Concentration Distributions of Aerosols Released From Transient Point Sources", M. Eng. Thesis, Royal Military College of Canada, 1998.
Csanady, G. T., Turbulent Diffusion in the Environment, Reidel Publishing Company, Dordrecht, Holland, 1973.
Davis, (Online). Davis Instruments, <http://www.davisnet.com/> (July, 2000).
Elouragini, S., "Useful Algorithms to Derive the Optical Properties of Clouds from a Back-scatter Lidar Return", J. Mod. Optics, 42, 7, 1439-1446, 1995.
Evans, B. T. N., Yee, E., Roy, G., and Ho, J., "Remote Detection and Mapping of Bioaerosols", J. Aerosol Sci., 25, 8, 1549-1566, 1994.
Evans, B. T. N., "LiDAR Signal Interpretation and Processing with Consideration for Military Obscurants", Report DREV R-4477/88, Valcartier, 1988.
Evans, B. T. N., "On the Inversion of the Lidar Equation", Report DREV R-4343/84, Valcartier, 1984.
Gardner, M. W., and Dorling, S. R., "Artificial Neural Networks (the Multilayer Perceptron)-A Review of Applications in the Atmospheric Sciences", Atm. Env., 32, 14/15, 2627-2636, 1998.
Gardner, M. W., and Dorling, S. R., "Neural Network Modelling of the Influence of Local Meteorology on Surface Layer Ozone Concentrations", Proc. 2" Int. Conf. on GeoComputation, 359-370, 1996.
Griffiths, R. F., "Errors in the Use of the Briggs Parameterization for Atmospheric Dispersion Coefficients", Atm. Env., 28, 17, 2861-2865, 1994.
Hanna, S. R., "Along-Wind Dispersion of Short-Duration Accidental Releases of Hazardous Gases", Proc. 9th Joint Conf. on Applications of Air Pollution Meteorology with A&WMA, Atlanta, 28 January - 2 February, 1996.
Hanna, S., Briggs, G., and Hosker, R., Handbook on Atmospheric Diffusion, National Technical Information Center U.S. Dept of Energy, Springfield, 1982.
Haykin, S., Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, Inc., New York, 1994.
Hidy, G. M., Aerosols: An Industrial and Environmental Science, Academic Press, Inc., Orlando, 1984.
Hinckley, E. D., Laser Monitoring of the Atmosphere, Springer-Verlag, Berlin, 1976.
Klett, J. D., "Lidar inversion with variable backscatter/extinction Applied Optics, 24, 11, 1638-1643, 1985.
Klett, J. D., "Stable analytical inversion solution for processing lidar returns", Applied Optics, 20, 2, 211-220, 1981.
Kunkel, K. E., and Weinman, J. A., "Monte Carlo Analysis of Multiply Scattered Lidar Returns", J. Atmos. Sci., 33, 1772-1781, 1976.
Lehder, (Online). Lehder Environmental Services Ltd., <http://www.lehder.com/> (July, 2000).
Mohan, M., and Siddiqui, T. A., "An Evaluation of Dispersion Coefficients for use in Air Quality Models", Boundary-Layer Meteorology, 84, 177-206, 1997.
NeuralWare, Reference Guide: Software Reference for Professional II/PLUS and NeuralWorks Explorer, NeuralWare, Pittsburgh, 1993a.
NeuralWare, Neural Computing: A Technology Handbook for Professional II/PLUS and NeuralWorks Explorer, NeuralWare, Pittsburgh, 1993b.
NeuralWare, Using NeuralWorks: A Tutorial for NeuralWorks Professional II/PLUS and NeuralWorks Explorer, NeuralWare, Pittsburgh, 1993c.
Oke, T.R., Boundary Layer Climates, Routledge, London, 1987.
Olesen, H. R., "Regulatory Dispersion Modelling in Denmark", Workshop on Operational Short-range Atmospheric Dispersion Models for Environmental Impact Assessment in Europe, Mol, Nov. 1994, published in Int. J. Environment and Pollution, 5, 4-6, 412-417, 1995.
Pal, S. R., Hlaing, D., and Carswell, A. I., "Scanning Lidar Application for Pollutant Sources in an Industrial Complex", SPIE, 3504, 76-86, 1998.
Pankrath, J., "Atmospheric Dispersion Models for Regulatory Purposes in the Federal Republic of Germany. Part 1: Regulatory Modelling", Workshop on Operational Short-range Atmospheric Dispersion Models for Environmental Impact Assessment in Europe, Mol, Nov. 1994, published in Int. J. Environment and Pollution, 5, 4-6, 427-430, 1995.
Pasquill, F., and Smith, F.B., Atmospheric Diffusion, 3rd ed., John Wiley & Sons, Rexdale, 1983.
Patterson, D. W., Artificial Neural Networks: Theory and Applications, Prentice-Hall, Toronto, 1996.
Pollock, D. H., IR/EO Handbook Volume 7: Countermeasure Systems, Environmental Research Institute of Michigan, Ann Arbor, 1993.
Rege, M., and Tock, R., "A Simple Neural Network for Estimating Emission Rates of Hydrogen Sulfide and Ammonia from Single Point Sources", J. Air & Waste Manage. Assoc., 46, 953-962, 1996.
Roy, G., Bonnier, D., DeVillers, Y., Couture, G., Hutt, D., and Vallee, G., "Canadian National Report on the SOCMET Winter Test Held at DREV, Canada in March 1993", Report DREV-TM-9408, Valcartier, 1994.
Roy, G., Vallee, G., and Jean, M., "Lidar-inversion Technique Based on Total Integrated Backscatter Calibrated Curves", Applied Optics, 32, 6754, 1993.
Sato, J., "An Analytical Study on Longitudinal Diffusion in the Atmospheric Boundary Layer", The Geophysical Magazine Series 2, 1, 2, 105-151, 1995.
Sawford, B. L., and Wilson, J. D., "Review of Lagrangian Stochastic Models for Trajectories in the Turbulent Atmosphere", Boundary-Layer Meteorology, 78, 191-210, 1996.
Silfvast, W. T., Laser Fundamentals, Cambridge University Press, New York, 1996.
Slade, D. H., Editor, "Meteorology and Atomic Energy", TID-24190, USAEC, 163-175, 1968.
SPSS, SigmaPlot 5.0 User's Guide, SPSS, Inc., Chicago, 1998.
Sutton, O. G., Micrometeorology: A Study of Physical Processes in the Lowest Layers of the Earth's Atmosphere, McGraw-Hill Book Company, Inc., Toronto, 1953.
Sutton, O. G., Atmospheric Turbulence, 2nd ed., John Wiley & Sons, Inc., New York, 1949.
Taylor, G. I., "Diffusion by Continuous Movements", Proc. London Math. Soc., 20, 196, 1921.
Turner, D. B., Workbook of Atmospheric Dispersion Estimates: An Introduction to Dispersion Modeling, 2nd ed., Lewis Publishers, Ann Arbor, 1994.
Uthe, E. E., and Livingston, J. M., "Lidar Extinction Methods Applied to Observations of Obscurant Events", Applied Optics, 25, 678, 1986.
Uthe, E. E., "Lidar Evaluation of Smoke and Dust Clouds", Applied Optics, 20, 1503, 1981.
van Ulden, A. P., "A Surface-Layer Similarity Model for the Dispersion of a Skewed Passive Puff Near the Ground", Atm. Env., 26A, 4, 681-692, 1992.
Wasserman, P., Advanced Methods in Neural Computing, Van Nostrand Reinhold, New York, 1993.
Williamson, S., Fundamentals of Air Pollution, Addison-Wesley Publishing Company, Don Mills, 1973.
Yee, E., Kosteniuk, P. R., and Bowers, J. F., "A Study of Concentration Fluctuations in Instantaneous Clouds Dispersing in the Atmospheric Surface Layer for Relative Turbulent Diffusion: Basic Descriptive Statistics", Boundary-Layer Meteorology, 87, 409-457, 1998.
Yi, J., and Prybutok, V. R., "A Neural Network Model Forecasting for Prediction of Daily Maximum Ozone Concentrations in an Industrialised Urban Area", Environmental Pollution, 92, 3, 349-357, 1996.
Appendix A
Meteorological Measurements and Estimates
Note: In accord with meteorological convention, wind direction is the direction from
which the wind is blowing. The delay time entry in Table A.1 refers to the approximate
time after kaolin release that the first LCM scan began, while scan to scan time is the
approximate time between subsequent scans of the same release.
Table A.1 Summary of measurements taken during the kaolin trials.
                                      NW    0  10.8
                                      W    16  10.8
                                      NW   13  10.8
                                      NW   21  10.8
                                      NW   15  10.7
                                      NW   19  10.7
                                      WNW   0  10.8
                                      WSW  29  10.7
                                      SW   15  10.7
                                      WNW  26  10.8
                                      NW   15  10.7
          13  10:41  22  30.15  0.28  SSW  17  10.9
8/6/97    14  15:27  21  30.29  1.39  SW    0   8.2
          15  15:30  22  30.29  0.00  W     0   8.2
          16  15:33  22  30.29  3.06  S     0   8.1
          17  15:36  23  30.28  2.78  S     0   8.0
          18  15:39  23  30.28  1.67  WNW   0   8.4
          19  15:42  24  30.28  2.22  SSW   0   8.4
          20  15:45  24  30.28  3.61  SW    0   8.4
          21  15:48  24  30.28  3.06  SSW   0   8.4
          22  15:51  24  30.28  5.83  W     0   8.3
          23  15:54  24  30.28  4.44  W     0   8.4
8/7/97    24  10:18  18  30.28  5.83  W     0   8.4
          25  10:21  18  30.28  6.67  SW    0   8.2
          26  10:24  17  30.28  4.44  SW    0   8.4
          27  10:33  17  30.29  5.00  SW    0   8.4
          28  10:36  17  30.28  5.00  SW    0   8.4
          29  10:39  17  30.29  5.80  WSW   0   8.4
          30  10:42  18  30.28  4.17  SW    0   8.4
          31  10:45  18  30.28  4.44  W     0   8.4
          32  10:48  18  30.28  6.67  W     0   8.5
          33  10:51  18  30.28  3.61  W     0   8.2
          34  10:??  18  30.27  5.28  W     0   8.4
          37  11:02  19  30.45  3.61  S     0   8.4
          38  11:05  19  30.45  2.22  SSW   0   8.4
          39  11:08  18  30.44  3.06  SE    0   8.4
          40  11:11  19  30.45  1.39  E     0   8.4
          41  11:17  18  30.45  0.83  E     0   8.4
          42  11:20  19  30.44  0.83  ENE   0   8.4
          43  11:23  20  30.44  0.00  W     0   8.4
          44  11:26  20  30.44  1.39  S     0   8.4
          45  11:29  20  30.00  0.30  S     0   8.4
          46  11:32  20  30.43  0.00  SW    0   8.4
          49  11:44  20  30.44  2.22  NW    0   8.4
          50  11:47  20  30.43  2.22  NW    0   8.4
          51  11:50  20  30.43  1.67  NE    0   8.5
          52  11:53  20  30.43  1.67  ESE   0   8.5
Table A.2 Calculated wind speed and Pasquill Stability Class.
date      trial no.  speed (m/s)  class
8/5/97    2          2.39         A-B
                                  A-B
                                  A-B
                                  A-B
                                  A
                                  A
                                  B
                                  A
                                  A-B
                                  A
                                  A
          13         1.10         A
8/6/97    14         0.96         A-B
                     0.41         A-B
                     1.54         A-B
                     1.00         A-B
                     0.57         A-B
                     1.77         A-B
                     2.25         B
                     1.96         A-B
                     1.83         A-B
                                  A-B
                                  B
                                  A-B
                                  A-B
                                  B
                                  B
                                  B
                                  A-B
                                  A
                                  A-B
          35         2.90         A-B
8/12/97   36         0.06         A
Appendix B
Neural Network Performance Statistics
Table B.1 Preliminary data set 1 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.
ANN    Architecture  Epoch  Test Set Statistics
                            RMS     R
Table B.2 Preliminary data set 2 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.
ANN    Architecture  Epoch  Test Set Statistics
                            RMS     R
2.1a   9-10-1        100    0.3753  0.5687
2.1b   9-10-1        200    0.3607  0.6110
2.1c   9-10-1        500    0.3565  0.6140
2.1d   9-10-1        1000   0.3611  0.6011
2.2a   9-20-1        100    0.3685  0.5813
2.2b   9-20-1        200    0.3588  0.6044
2.2c   9-20-1        500    0.3549  0.6171
2.2d   9-20-1        1000   0.3562  0.6128
2.3a   9-30-1        100    0.3608  0.6010
2.3b   9-30-1        200    0.3559  0.6183
2.3c   9-30-1        500    0.3508  0.6279
2.3d   9-30-1        1000   0.3540  0.6191
2.4a   9-10-5-1      100    0.3636  0.5961
2.4b   9-10-5-1      200    0.3593  0.6035
2.4c   9-10-5-1      500    0.3550  0.6179
2.4d   9-10-5-1      1000   0.3572  0.6133
2.5a   9-20-10-1     100    0.3496  0.6340
2.5b   9-20-10-1     200    0.3439  0.6494
2.5c   9-20-10-1     500    0.3425  0.6494
2.5d   9-20-10-1     1000   0.3463  0.6404
2.6a   9-30-15-1     100    0.3492  0.6356
2.6b   9-30-15-1     200    0.3406  0.6533
2.6c   9-30-15-1     500    0.3407  0.6529
2.6d   9-30-15-1     1000   0.3461  0.6404
2.7a   9-40-20-1     100    0.3466  0.6453
2.7b   9-40-20-1     200    0.3367  0.6640
2.7c   9-40-20-1     500    0.3315  0.6803
2.7d   9-40-20-1     1000   0.3448  0.6446
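The two statistics listed in Tables B.1 and B.2 can be illustrated with a short sketch. It assumes that RMS means the root mean square of the differences between predicted and target outputs and that R is the Pearson linear correlation coefficient; the software used for the thesis may scale or normalize these quantities differently. The function names and sample values are hypothetical and are not taken from the thesis.

```python
import math


def rms_error(predicted, target):
    """Root mean square of the differences between predictions and targets.

    Assumption: this is the plain RMS error, with no normalization.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(target))


def linear_correlation(predicted, target):
    """Pearson linear correlation coefficient R between predictions and targets."""
    n = len(target)
    if len(predicted) != n or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("R is undefined when either series is constant")
    return cov / math.sqrt(var_p * var_t)


# Hypothetical values for illustration only; not data from the thesis.
target = [0.10, 0.40, 0.35, 0.80]
predicted = [0.20, 0.30, 0.45, 0.70]
print(rms_error(predicted, target), linear_correlation(predicted, target))
```

Under these definitions a lower RMS and an R closer to 1 both indicate closer agreement between predicted and target outputs, which is how the rows of the tables can be compared.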
Table B.3 Results of training 20 ANNs with two different architectures on each data set. Each net was initialized to a different random point in weight space.
Data set 1      Data set 1      Data set 2      Data set 2
9-20-1          9-40-20-1       9-20-1          9-40-20-1
RMS     R       RMS     R       RMS     R       RMS     R
Appendix C
Model Concentration Prediction Scatter Plots
Figure C.1 Scatter plots for ANN model 1A over (a) the test set and (b) the validation set.
Figure C.2 Scatter plots for ANN model 2A over (a) the test set and (b) the validation set.
Figure C.3 Scatter plots for ANN model 1B over (a) the test set and (b) the validation set.
Figure C.4 Scatter plots for ANN model 2B over (a) the test set and (b) the validation set.
Figure C.5 Scatter plots for data set 1 GPMs (Slade) over (a) the test set and (b) the validation set.
Figure C.6 Scatter plots for data set 1 GPMp (Pasquill) over (a) the test set and (b) the validation set.
Figure C.7 Scatter plots for data set 2 GPMs (Slade) over (a) the test set and (b) the validation set.
Figure C.8 Scatter plots for data set 2 GPMp (Pasquill) over (a) the test set and (b) the validation set.
Figure C.9 Scatter plots for COMBIC over (a) the test set and (b) the validation set.
Vita
Name: D. Timothy James DeVito
Education: University of Guelph, 1992-1993 Guelph, ON
Queen's University, 1993-1996 Kingston, ON B.Sc. (Honours) 1st Class Physics, 1996
Royal Military College of Canada, 1998-2000 Kingston, ON Current program
Experience: Research Engineer, 1998-2000 Royal Military College of Canada
Publications: Modeling Aerosol Puff Concentration Distributions from Point Sources Using Artificial Neural Networks. Proceedings of the 2000 Battlespace Atmospheric and Cloud Impacts on Military Operations Conference.
Awards: NSERC Postgraduate Scholarship-B, 2000-2002
Defence Research & Development Branch-Royal Military College Fellowship, 1999-2000
Milton Fowler Gregg VC Memorial Trust Fund Bursary, Royal Military College of Canada, 1998-1999
University of Guelph Entrance Scholarship, 1992-1993