University of California
Los Angeles
Addressing Fault and Calibration
in Wireless Sensor Networks
A thesis submitted in partial satisfaction
of the requirements for the degree
Master of Science in Electrical Engineering
by
Laura Kathryn Balzano
2007
© Copyright by
Laura Kathryn Balzano
2007
The thesis of Laura Kathryn Balzano is approved.
Steven Margulis
Greg Pottie
Mark Hansen
Mani B. Srivastava, Committee Chair
University of California, Los Angeles
2007
To my mother and father who encourage me
to strive to be proud of everything I do.
Table of Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Related Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Sensor Faults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3 Survey of Faults and Aggregation Analysis . . . . . . . . . . . . 13
3.1 Examples of Fault in Sensor Network Deployments . . . . . . . . 14
3.2 Fault Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.1 Offset Fault . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.2 Gain Fault . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.3 Variance Degradation Fault . . . . . . . . . . . . . . . . . 19
3.2.4 Stuck-At Fault . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.5 Static Fault . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3 Aggregation Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3.1 Aggregation as Estimation . . . . . . . . . . . . . . . . . . 21
3.3.2 Sum and Average . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.3 Some Mathematical Details . . . . . . . . . . . . . . . . . 26
3.4 Future Work and Discussion . . . . . . . . . . . . . . . . . . . . . 28
4 Estimation of Calibration Parameters using Dynamical Models 29
4.1 The Ensemble Kalman Filter and the Particle Filter . . . . . . . . 30
4.2 Simple Autoregressive Surface Moisture Model . . . . . . . . . . . 32
4.2.1 Initial Condition . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.2 Model Forcing . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.3 Model Error . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.4 Measurement Model . . . . . . . . . . . . . . . . . . . . . 33
4.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3.1 State Vector . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3.2 Input Parameters . . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.4.1 Estimation with Incorrect Prior Mean . . . . . . . . . . . . 40
4.4.2 Estimation with Changing Prior Variance . . . . . . . . . 45
4.4.3 Estimation with Frequent Updates . . . . . . . . . . . . . 48
4.4.4 Future Work and Discussion . . . . . . . . . . . . . . . . . 50
5 Estimation of Calibration Parameters using Subspace Matching 53
5.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2 Blind Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.3 Offset Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.4 Gain Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.4.1 General Conditions . . . . . . . . . . . . . . . . . . . . . . 61
5.4.2 Bandlimited Subspaces . . . . . . . . . . . . . . . . . . . . 65
5.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.5.1 Robust Estimation . . . . . . . . . . . . . . . . . . . . . . 66
5.5.2 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.5.3 Evaluation on Sensor Datasets . . . . . . . . . . . . . . . . 74
5.6 Future Work and Discussion . . . . . . . . . . . . . . . . . . . . . 78
6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
List of Figures
3.1 An example of the Offset Fault: Various humidity sensors measur-
ing the same phenomenon. . . . . . . . . . . . . . . . . . . . . . . 15
3.2 An example of a sensor which was good and developed some prob-
lematic noise. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Examples of the Stuck-At and Static Faults. . . . . . . . . . . . . 17
3.4 Variables used in fault model descriptions. . . . . . . . . . . . . . 18
3.5 Notation for moments of variables used in aggregation analysis. . 24
3.6 Averaging under a stuck-at-zero fault. . . . . . . . . . . . . . . . . 25
3.7 Averaging under degraded variance. . . . . . . . . . . . . . . . . . 26
4.1 An example soil sampling scenario. . . . . . . . . . . . . . . . . . 31
4.2 Parameters used in generating the true phenomenon. The true
initial condition and the model error are a single instance drawn
from the corresponding distribution. . . . . . . . . . . . . . . . . . 36
4.3 Example of EnKF with different measurement variance and a priori
information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.4 An example of the SIR Filter estimate for the five state parameters. 39
4.5 RMS Error as the difference in the true and prior mean for the
measurement offset increases: 5, 50, 95% quantiles over 100 runs. 41
4.6 RMS Error as the difference in the true and prior mean for the
measurement gain increases: 5, 50, 95% quantiles over 100 runs. . 44
4.7 RMS Error as the prior variance for the offset measurement pa-
rameter increases: 5, 50, 95% quantiles over 100 runs. . . . . . . . 45
4.8 RMS Error as the prior variance for the gain measurement param-
eter increases: 5, 50, 95% quantiles over 100 runs. . . . . . . . . . 47
4.9 RMS Error as the spacing between measurement updates increases:
5, 50, 95% quantiles over 100 runs. . . . . . . . . . . . . . . . . . 49
5.1 Two example simulated square fields. On the left, a 256 × 256
field generated with a basic smoothing kernel, which represents a
true continuous field. On the right, an 8× 8 grid of measurements
of the same field. The fields can be quite dynamic and still meet
the assumptions for blind calibration. The fields are shown in
pseudocolor, with red denoting the maximum valued regions and
blue denoting the minimum valued regions. . . . . . . . . . . . . 68
5.2 Gain and offset error performance with exact knowledge of P and
increasing measurement noise. The results show the mean and
median error over 100 simulation runs. . . . . . . . . . . . . . . . 70
5.3 Gain and offset error performance for mismodeled P and zero mea-
surement noise. The results show the mean and median error for
100 simulation runs. . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.4 Gain error performance for SVD, blind LS, and partially blind LS.
Results show mean error over 50 simulation runs. . . . . . . . . . 72
5.5 Offset error performance for SVD, blind LS, and partially blind
LS. The top graph shows offset error for zero-mean signals, and
the bottom graph is for non-zero-mean signals. Results show mean
error over 50 simulation runs. . . . . . . . . . . . . . . . . . . . . 73
5.6 Results of blind calibration on the calibration dataset. . . . . . . . 75
5.7 The mica2 motes in the cold air drainage transect run down the
side of a hill and across a valley. The mote locations pictured are
those that we used. . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.8 True and estimated gains and offsets for the cold air drainage tran-
sect dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Acknowledgments
I would like to thank my advisor Mani Srivastava, my collaborators Mark Hansen,
Steven Margulis, and Rob Nowak, and members of the Data Integrity Group for
their willingness to discuss and work out all kinds of issues.
Abstract of the Thesis
Addressing Fault and Calibration
in Wireless Sensor Networks
by
Laura Kathryn Balzano
Master of Science in Electrical Engineering
University of California, Los Angeles, 2007
Professor Mani B. Srivastava, Chair
Sensors and devices used in wireless sensor networks are state-of-the-art tech-
nology with the lowest possible price. The sensor measurements we get from
these devices are therefore often noisy, incomplete and inaccurate. Researchers
studying wireless sensor networks hypothesize that much more information can
be extracted from hundreds of unreliable measurements spread across a field of
interest than from a smaller number of high-quality, high-reliability instruments
with the same total cost. This thesis offers a basis for exploring that hypothesis
in some detail. We make four contributions. First, we describe sensor faults
commonly seen in recent sensor network deployments, and we formulate statis-
tical models to assist in the analysis of those faults. Second, we present some
basic tools for assessing the robustness of aggregation algorithms to these com-
mon faults. We then address, in two separate ways, the issue of finding linear
calibration parameters while sensors are deployed. Our third contribution is an
approach to calibration using state space models and non-linear, non-Gaussian
filtering techniques to calibrate sensors without groundtruth knowledge or con-
trolled stimuli. We evaluate this calibration on simulated sensor data with a
simple dynamical model based on the physical process of soil moisture. Fourth,
we present a general problem formulation for blind calibration which assumes
that the n sensor measurements lie in a subspace of n-dimensional space. We
prove the identifiability of the sensor offsets and gains under this assumption,
and we evaluate implementations on both simulated and real sensor data.
CHAPTER 1
Introduction
Wireless Sensor Networks provide a new technology which can enable scientists,
activists, and even the general population to collect data about their environment.
Many people have indoor-outdoor thermometers with wireless communication
devices so they can know the temperature around the house from a single display
on their desk. The vision for Wireless Sensor Networks (WSNs) takes
this idea to the next level, allowing us to have several tiny devices which interface
with all kinds of sensors, which we could place around our homes and neighbor-
hoods or in the middle of natural ecosystems in order to make wiser decisions in
every aspect of life. For example, WSNs which have simple light and temperature
sensors currently enable companies to make smarter decisions about lighting and
heating in a building. WSNs with cameras and magnetic sensors may be deployed
in high-vehicle-traffic areas to learn traffic patterns at unprecedented resolution
and improve traffic lights or inform future road expansion plans. At the Center
for Embedded Networked Sensing (CENS), WSNs are already deployed in nat-
ural ecosystems at the James Reserve (JR) in order to provide biologists and
environmental engineers with data on airflow, soil chemistry, lake ecology, and
bird nesting patterns. These data are all at unprecedented high spatial and tem-
poral resolutions, and the scientists hope to learn intricacies of the environment
so that they can make more informed decisions when they are trying to positively
influence segments of our environment, such as climate or water quality.
Wireless Sensor Networks allow us to collect these data, where before we
could not, for three reasons, all of which are possible due to advances in comput-
ing technology in recent decades. First, the devices used by WSNs are small
and getting smaller every year. This allows us to place them inconspicuously
without disrupting the environment we are trying to monitor. Second, wireless
communication technology and low-power devices allow us to collect data from
remote locations without the burden of cables and infrastructure. Third, the
hardware that makes up these small, wireless devices is getting more and more
inexpensive as time moves on. In order to be a ubiquitous and pervasive tech-
nology, WSNs operate at the lowest-cost boundary, using the most inexpensive
forms of the technology available.
Precisely because of this third reason, the data collected by WSNs are not as
reliable as data collected by high-quality expensive instruments. Given that inex-
pensive pervasive sensing devices will never be highly reliable and state-of-the-art,
we can conclude that sensor data will always be noisy, incomplete and inaccurate.
Researchers studying WSNs, though, hypothesize that unreliable measurements
from hundreds of low-quality instruments can offer a user considerably more in-
formation than a single reliable data source.
Unfortunately, this hypothesis has been taken for granted in much of the
research done on WSNs. This thesis takes first steps to address the two main
issues of low-quality sensor data: fault and calibration.
Sensor faults are the rule and not the exception in every WSN deployment so
far, to our knowledge. Sensors themselves may get stuck at a particular value or
get partially disconnected and report noisy measurements. Sensor nodes reboot
unexpectedly or stop transmitting data. Software running on the sensor nodes
may have bugs and may cause data loss. Algorithms for detecting these failures
and for directing a user to the probable cause [24, 25] are useful for WSN users
who are willing to take care of their sensor network. Algorithms that automatically
discard, or are resilient to, bad data are useful for WSN users who
want a transparent interface.
An important area of research in WSNs is that of in-network processing. Data
are processed on the nodes within the network in order to save transmission en-
ergy when possible. Often the data are aggregated to provide descriptive statistics
across an area of the network instead of sending back each and every data point.
Because context and information get lost during the process of aggregation, it is
absolutely crucial that these algorithms are robust to missing and faulty sensor
data. Chapter 3 of this thesis gives a survey of faults seen in sensor networks
and takes a careful look at whether popular aggregation schemes are robust to
the usual faults.
Even if we can guarantee that the sensors never fail outright, the sensors used
for WSNs are notoriously prone to calibration errors, and arguably these errors
are one of the major obstacles to the practical use of sensor networks [5]. Cal-
ibrating every sensor by hand is infeasible if sensor networks are to scale even
into the tens of devices; yet it may be that applications need more accurate mea-
surements than uncalibrated, low-cost sensors provide. Consequently, automatic
methods for jointly calibrating sensor networks in the field, without dependence
on controlled stimuli or high-fidelity groundtruth data, are of significant interest.
This thesis explores two possible approaches, one based on a known physical dy-
namical model for the environment being sensed (Chapter 4), and one based on
a known subspace model for the environment being sensed (Chapter 5).
CHAPTER 2
Related Work
As mentioned in Chapter 1, a fault can be introduced into sensor data at every
point in the sensor network: from failures in the sensor itself, to software bugs
and computational errors, to lossy communication. This thesis focuses on faults
in the sensors themselves that cause them to report inaccurate data.
2.1 Related Areas
Throughout the analysis in this thesis we assume safe and reliable communi-
cation between all sensor nodes and basestations. Protocols for assessing data
integrity will also need to be reliable, but that can be addressed separately by the
work in research areas such as network systems and network information theory.
Missing data due to lossy links has been a prevailing problem in data collection
sensor networks. In [25], a debugging system called Sympathy assists a network
engineer in identifying the point at which data packets go missing. The Sympa-
thy debugger uses network communications information like neighbor lists and
packet counts to determine a potential area in the network that is causing loss.
In coding theory, algorithms are evaluated in the presence of erasure links to see
how they would perform under random packet losses. For example, generalized
consensus algorithms in sensor networks with erasure links were studied in [23].
It also turns out that distributed statistical analysis and distributed optimization
techniques that are tolerant to faulty data are often also tolerant to missing or
lossy data, though they do converge more slowly under these circumstances.
Prior to work specific to sensor networks, system faults have been studied
thoroughly in control systems, communications networks and in VLSI design.
In control systems and communications networks, three staple concepts create a
foundation for fault tolerance: replication, redundancy and diversity. These con-
cepts are useful not only for network failures, but also for data failures themselves.
For both control systems and communications networks, the fault tolerance con-
siderations lie in the reliability of communicating signals. Control systems that
need to provide safe operation for humans or expensive equipment require ex-
tremely high control-signal integrity [18]. Sensor networks and wireless networks
in general have a long way to go before they will be able to provide the integrity
of data that these safety-critical systems require.
Fault tolerance in communications networks has been an important issue
for many years, hearkening back to the robust design of FDDI and ATM net-
works [27]. In most cases, reliability is supported with redundancy: multiple
independent routes between devices on a network. Clusters, redundant servers,
and redundant power supplies are all techniques for redundancy.
Testing of circuits and VLSI parts has been an area of intense study over
the past few decades with the rise of semiconductor devices [6]. One fault-
identification technique that was born in this area, but is applicable in many
other areas, is that of test vectors. Test vectors may be able to identify faults
such as buggy software, issues with the interaction between micro-controller and
sensor systems, and sensors that are not functional at the start of their life.
2.2 Sensor Faults
In his work on tolerating faulty sensor measurements [22], Keith Marzullo presents
the idea of an abstract sensor that incorporates uncertainty into the measure-
ments of an associated set of real sensors. His goal is then to construct the
abstract sensor in such a way that it is tolerant to failures.
Koushanfar and Potkonjak [19] have explored many facets of faulty data in
sensor networks. They suggest five phases of testing sensor-based systems. The
first phase is test vector generation, with a goal of finding test inputs that are
most likely to excite particular faults they are interested in testing. However, a
special characteristic of fault detection in sensor networks is that we must answer
the question of what happens to the sensors over their lifetime while they sit in
environments which have unknown properties. With sensor networks, we wish to
capture information about changing, dynamic systems that we don’t understand.
This kind of deployment purpose does not fit well with a test vector formulation.
Another important consideration is that sensors change with the climate they are
immersed in. It is difficult to predict what problematic conditions a sensor will
face once it is placed in the environment it is intended to monitor.
The next three phases in [19] for testing sensor-based systems are that of
fault detection, diagnosis and validation. They describe the last phase as one
of mapping sensor readings to correct readings with some confidence, which is
sometimes called compensation in calibration literature. The authors suggest
fault models like the ones we show in Chapter 3, though their list is slightly
shorter. Also, we have presented a general form that can encompass our entire
list of faults, which proves very useful in analysis.
The online fault detection section of this same paper [19] assumes underlying
equations for the phenomenon. Based on those equations, the authors’ method
can define a fault detection algorithm. The idea here is to eliminate sensor
readings from aggregation, and if that elimination improves the consistency of
the results, that sensor reading is most likely faulty. In this thesis, we instead
aim to derive and understand as much as possible about faults in sensor networks
under the assumption that only general statistical properties of the underlying
phenomenon are known.
There are copious instances of faulty data in sensor networks where the scien-
tists did not necessarily know what to expect in the data collected by the sensor
network. In [29], the authors described their experiences in deploying a sensor
network on Great Duck Island to monitor the habitat of the Leach’s Storm Pe-
trel. The authors described both packet loss errors and inaccurate measurement
reports. They found that sensor measurement quality degraded over time in the
outdoor environment, especially in sensors that could not have a protective cov-
ering due to the small size of breeding holes for the petrel. Temperature readings
inside enclosures were regularly much higher than those outside. The integrated
sensor boards caused a domino effect: as soon as one sensor went bust, others
on the board often followed quickly. Humidity sensors went bad when they were
wet, but then recovered as they dried off. Light sensors were reliable for the
most part, and the few that failed had stopped displaying the diurnal pattern
and instead consistently read high values.
From speaking with the engineers at the James Reserve [7] who collect data
from outdoor sensors into a database, we learned that the two main data prob-
lems they have are missing data and sensors that get stuck at a particular value.
A deployment in the Redwoods of California described in [31] had large amounts
of data loss due to both communication failure and sensor failure; those authors
note that bad data was more easily found by looking where battery voltage data
revealed an anomaly as well. Soil deployments in Bangladesh [24, 26] and Cal-
ifornia [26] have turned up very unusual fault modes that are more difficult to
manage.
The problem of robust data fusion for sensor networks is addressed in [32].
The author, David Wagner, discusses a simple application of Robust Statistics,
a decades-old field which studies, among other things, the contamination of a
data sample that can cause a particular computation on that data sample to be
in error. Wagner asks which computations common to sensor networks require
a larger percentage of data points in that computation to go to infinity before
the computation itself goes to infinity. As an example, Wagner suggests we look
at the median and the mean. Half of the data points must go to infinity before the
median of that data goes to infinity. On the other hand, if only one data point
goes to infinity, the mean of the data will become infinite. The median is a more
robust statistical computation than the mean.
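Wagner's comparison is easy to demonstrate with a toy sketch (the readings below are invented for illustration; one "faulty" value stands in for a compromised or broken sensor):

```python
from statistics import mean, median

# Nine well-behaved temperature readings plus one wildly faulty reading.
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 19.8, 20.0, 20.2, 1e9]

# A single corrupted value drags the mean arbitrarily far from the truth...
print(mean(readings))    # roughly 1e8, dominated by the faulty sensor

# ...while the median still reflects the well-behaved majority.
print(median(readings))  # 20.1
```

The mean has a breakdown point of a single sample, while the median tolerates contamination of up to half the data, which is exactly the contrast Wagner draws.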
Wagner approaches this problem using the Byzantine fault model, which is
described in [20] as follows: “The component can exhibit arbitrary and malicious
behavior, perhaps involving collusion with other faulty components.” This kind
of analysis only allows us to understand the worst-case behavior of an aggregation
function under malicious attack. He makes the important point that sensor net-
work systems will be deployed in environments and in numbers such that it is not
possible to monitor them. The sensor nodes will then be easily compromised, and
system designers should use more robust computations to avoid serious problems
that will result from a few compromises.
As we can see from all the descriptions of sensor network faults above, a very
pressing issue is not the threat of compromise but the reality of fault. Again, due
to sensor nodes’ low cost and sub-par quality, sensors themselves often produce
inaccurate measurements. Even before sensor networks become pervasive in our
daily environments and therefore vulnerable to malicious compromise, we have
to make them robust to fault within their own ranks.
The Reputation-based Framework for Sensor Networks [12] is a middleware
framework which allows distributed maintenance of reputation for all the nodes
in a network. All nodes maintain reputation for their neighbor nodes given some
measure of cooperation. The reputation information itself can be used to inform
decisions made in aggregating and forwarding data, or if the reputation infor-
mation is sent to a central location it can be used to inform a user of possible
problem nodes.
Another tool aimed at improving data integrity in sensor networks is called
Confidence [24], and it has a methodology akin to Sympathy [25] for network
integrity. With Confidence, a user can define a general expectation he or she has
for the data and then use the tool to identify integrity compromise and identify
a possible cause.
Methods of fault detection based on simplified algorithms such that they could
be implemented on a single sensor node are investigated in [9, 21]. Both have
approaches based on two relationships of correlation: the correlation of a node’s
measurement and its neighbor’s measurements, and the correlation of a node’s
measurement and its own previous measurement. In [9], a naive Bayes algorithm
is employed which maintains counters of the number of times a particular pair oc-
curs over the history of the sensor network. This approach requires large numbers
of counters and large amounts of training data. On the other hand, [21] keeps
track of the same relationships by only maintaining counters of the differences
between the pairs of values. Their claim is that the differences are sufficiently
representative of the interesting behavior from the sensors, and they demonstrate
that this approach is effective.
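To make the two correlation relationships concrete, here is a minimal sketch (this is an illustration of the general idea, not the algorithm of [9] or [21]; the tolerances and sample readings are invented assumptions) in which a reading is flagged only when it disagrees both with the node's own history and with its neighbors:

```python
def is_suspect(reading, prev_reading, neighbor_readings,
               temporal_tol=5.0, spatial_tol=5.0):
    """Flag a reading whose jumps from its own previous value AND from
    its neighborhood average both exceed illustrative tolerances."""
    temporal_jump = abs(reading - prev_reading) > temporal_tol
    neighbor_mean = sum(neighbor_readings) / len(neighbor_readings)
    spatial_gap = abs(reading - neighbor_mean) > spatial_tol
    return temporal_jump and spatial_gap

# A sudden jump that the neighbors do not share is flagged...
print(is_suspect(35.0, 21.0, [20.5, 21.2, 20.8]))  # True
# ...but a jump shared by the whole neighborhood is not.
print(is_suspect(35.0, 21.0, [34.5, 35.2, 34.8]))  # False
```

Requiring both conditions keeps a genuine environmental change, which neighbors also observe, from being mistaken for a fault.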
The author of [14] presents a method for automated fault detection of in-situ
environmental sensors. Based on data which has been carefully reviewed by a
domain expert, his algorithms learn particular characteristics of the environment
using four different statistical learning methods and are trained to identify
short- and long-duration anomaly periods for sensors. This paper
shows a comparison of the performance of the different methods.
2.3 Calibration
Calibration is the process of taking readings from a sensor and applying an equa-
tion to map these readings as closely as possible to the ground truth value of that
measurement. Certain faults which involve an incorrect or unknown measurement
offset and gain can be adjusted with a calibration curve.
The most straightforward approach to calibration is to apply a known stimulus
x to the sensor network and measure the response y. Then using the groundtruth
input x we can adjust the calibration parameters so that (5.1) is achieved. We
call this non-blind calibration, since the true signal x is known. This problem is
called inverse linear regression; mathematical details can be found in [15]. Non-
blind calibration is used routinely in sensor networks [24, 31], but may be difficult
or impossible in many applications.
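In the scalar case, this fitting step reduces to ordinary least squares on paired stimulus/response data (a hedged sketch, not any particular deployment's procedure; the linear model y = gain · x + offset and the sample values below are assumptions for illustration):

```python
def fit_gain_offset(x, y):
    """Least-squares fit of y ~ gain * x + offset from known stimuli x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    gain = cov / var
    offset = my - gain * mx
    return gain, offset

# Known stimuli and the (noise-free, for clarity) responses of a
# hypothetical sensor with true gain 1.5 and true offset 2.0.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.5 * xi + 2.0 for xi in x]
gain, offset = fit_gain_offset(x, y)
print(gain, offset)  # 1.5 2.0

# Compensation: recover the true signal from a raw reading.
raw = 8.0
print((raw - offset) / gain)  # 4.0
```

Inverting the fitted curve, as in the last two lines, is the compensation step referred to above.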
As for blind calibration in sensor networks, the problem of relating measure-
ments such as received signal strength or time delay to distance for localization
purposes has been studied extensively [17, 30]. This problem is quite different
from the blind calibration problem considered in this thesis, which assumes that
the measurements arise from external signals (e.g., temperature) and not from
range measurements between sensors. In [33], the problem of calibrating sensor
range measurements by enforcing geometric constraints in a system-wide opti-
mization is considered. Calibration using geometric and physical constraints on
the behavior of a point light source is considered in [11]. The constraint that prox-
imal sensors in dense deployments make very similar measurements is leveraged
in [7]. In this thesis, our constraint is simply that the phenomenon of interest lies
in a subspace. This is a much more general constraint, and we therefore hope it
can be widely applicable.
Blind equalization and blind deconvolution [28] are related problems in signal
processing. In these problems, the observation model is of the form y = h ∗ x,
where ∗ is the convolution operator, and both h and x must be recovered from
y. Due to the difference between the calibration and convolution models, results
from blind deconvolution do not readily apply to blind calibration. Most similar
to our problem is work in multi-channel blind deconvolution [13]. This problem
involves observing one unknown signal through multiple unknown channels. Blind
calibration involves observing multiple unknown signals through one unknown
calibration function. This connection merits further study which is beyond the
scope of this thesis.
Finally, many signal processing researchers will recognize this problem
formulation as similar to that of blind source separation. Though the forms are
reminiscent, the problems are quite different. In independent
component analysis (ICA) [16], for example, a solution
to the equation y = Ax is sought where some signals x and a matrix of mixing
coefficients A are not known; only the mixed observations y are known.
The signals are assumed to be independent and non-Gaussian, and the mixing
matrix A should be invertible. Unfortunately, ICA only recovers each signal up
to a scalar constant, which is exactly the scalar gain factor we are looking for in
blind calibration.
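The scale ambiguity can be stated precisely. For any invertible diagonal matrix D (a standard observation about the ICA model, restated here because it is the crux of the connection to calibration):

```latex
y = Ax = \left(A D^{-1}\right)\left(D x\right),
```

so the mixing matrix A D^{-1} paired with the signals D x explains the observations exactly as well as A paired with x. The per-signal scale D is therefore unidentifiable to ICA, and that scale is the very gain factor blind calibration must recover.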
In this thesis, we hope that we have distilled out some of the crucial problems
for addressing data integrity in sensor networks. The next three chapters give
a formulation for each of three approaches. The first approach relates to fault
aggregation while the second two approaches relate to blind calibration.
CHAPTER 3
Survey of Faults and Aggregation Analysis
The work in this chapter was done with the help of Professor Mark Hansen in
the Statistics Department at UCLA.
This chapter presents a study of sensor faults at a node level and in aggre-
gated data. First, an exploratory catalog of sensor node faults is developed, and
faulty behavior is characterized with statistical models. After a survey of sensor
network deployment descriptions in the literature and discussion with engineers
and scientists who have deployed sensor nodes, the choice for faults that are mod-
eled here was based on the relative consequence of different types of data faults
in recent deployments.
The most interesting part of the fault analysis comes at the network level,
when we see how faults affect fusion algorithms among data at multiple nodes.
In order to reduce the number of messages communicated, sensor networks often
aggregate data or perform in-network processing, such that messages need not
travel all the way from a sensor node to a central processing station. In addi-
tion, even when we do transmit all sensor measurements, the central processing
often involves some data fusion, where this data is used for estimation, modeling,
control, etc.
If faulty data is not identified, and if it is incorporated into the data fusion
algorithm, the results will themselves be in error. Even more drastically, if faulty
data is blindly aggregated before it even reaches a central processing station, we
don’t even have the chance to identify and exclude it from fusion. This problem
motivates a careful assessment of algorithm robustness to fault.
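A toy calculation previews the kind of analysis this chapter formalizes (the field value and fault below are invented for illustration): when noise-free readings of a field are averaged, a single stuck-at-zero sensor biases the estimate by true_value/n.

```python
true_value = 20.0
readings = [true_value] * 10   # ten ideal, noise-free sensors
readings[3] = 0.0              # one sensor stuck at zero

estimate = sum(readings) / len(readings)
print(estimate)                # 18.0: the average is pulled low
print(true_value - estimate)   # bias of 2.0, i.e. true_value / 10
```

Because the aggregate alone reaches the central station, nothing in the value 18.0 reveals that a fault, rather than the phenomenon, produced it.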
3.1 Examples of Fault in Sensor Network Deployments
Here we show a few examples of fault from sensors measuring environmental phenomena. In Figure 3.1, there are measurement curves from several humidity sensors placed close together in a styrofoam box. The sensors should all be measuring the same phenomenon, and yet the offset between the curves is as great as 20% relative humidity. Factory calibration curves are often not enough to calibrate sensors which must work together in a network. One of these sensors alone may give appropriate measurements relative to its own baseline; however, when sensors are used in concert in a sensor network, they must be calibrated relative to one another. If these humidity sensors were deployed in a field, it would be impossible to know whether a difference in measurement between two of the sensors was due to a difference in phenomenon or a difference in calibration offset.
Figure 3.2 shows a sensor which becomes noisier as its battery drains. The top plot in the figure shows the temperature measurements on days 1-4, while the bottom plot shows noisier data on days 6-10. On one hand, the measurements in the bottom plot could be thrown away once we see that the battery is low. On the other hand, these data are not completely worthless. On days 6-8, the sensor reports noisier measurements which still track the true phenomenon. After that, the measurements stop reflecting the true phenomenon.
The two faults in Figure 3.3 are very common faults in environmental sensing
where inexpensive sensors are exposed to all kinds of weather conditions. One of
Figure 3.1: An example of the Offset Fault: Various humidity sensors measuring
the same phenomenon.
[Two-panel plot: "Less Noisy Temperature Data" (days 1-4) and "Noisy and Otherwise Bad Temperature Data" (days 6-10), degrees Celsius vs. day.]
Figure 3.2: An example of a sensor which was good and developed some prob-
lematic noise.
[Plot: "Two Faults on the Cold Air Drainage Transect", degrees Celsius vs. date (11/18-11/23); legend: Mote 1: good, Mote 4: good, Mote 12: stuck-at, Mote 15: static.]
Figure 3.3: Examples of the Stuck-At and Static Faults.
the sensors has gotten stuck at a high temperature value. The other problematic
sensor is reporting data which look like static, and it is unclear whether the data
are even following the true temperature trend.
3.2 Fault Model
The examples we have shown of fault in sensor networks are varied, yet we are interested in finding a simple framework within which we can address as many faults as possible. We adopt the perspective that the true measurement values come from a random process, and noise or fault is imposed on that process through either an additive noise process or a linear deterministic function.
Variable     Meaning                                       density          distribution
y            true value of phenomenon                      γ                Γ
x            measurement with acceptable additive noise    f                F
\tilde{x}    measurement transformed by a fault            \tilde{f}        \tilde{F}
ε            additive noise variable                       N(0, σ²)         (1/2)(1 + erf(ε / (σ√2)))
c            percentage of nodes compromised by fault      deterministic    –
β0           offset value                                  deterministic    –
β1           gain value                                    deterministic    –

Figure 3.4: Variables used in fault model descriptions.
Our generalized model for faults is then as follows. A perfect measurement would be one in which we exactly recover y. Even the most expensive systems, however, will have some measurement noise. We call this the pre-fault measurement, represented as follows.

x = y + ε (3.1)

When the measurement is faulty, on the other hand, we assume it takes the following general form:

x = β0 + β1 y + ε (3.2)

With this simple linear relationship between the true value and the faulty measurement, we can capture many of the faults shown in the last section. Let us see how this formula is a generalization of specific kinds of faults.
3.2.1 Offset Fault
The offset fault most commonly manifests itself as a calibration offset. The fault is
an additive constant on top of the pre-fault measurement value. Another example
scenario of an offset fault would be if a column of vertically-spaced soil sensors is
displaced (suddenly or slowly over time), and the sensors are at a different depth
in the soil than when they were first deployed. In this scenario, we may be able to subtract the offset to compensate for the new sensor location; if the model does not allow us to do so, we will need to physically adjust the pylon.
x = β0 + y + ε (3.3)
The faulty value only depends on the current measurement value and the
current offset. From here on, we will state the equations as if the sensor is faulty.
3.2.2 Gain Fault
The gain fault is a description of an error in calibration gain. It is difficult to
differentiate the gain fault from an offset fault without some knowledge of the
ground truth measurement values.
x = β1y + ε (3.4)
3.2.3 Variance Degradation Fault
The variance degradation fault is a fault that affects both cheap sensors and more
expensive measurement and sensing equipment. Over time, a sensor becomes
less and less accurate. If the measurement variance is σ_m² and the fault variance is σ_f² > σ_m², then the sensor noise is now ε_f ∼ N(0, σ_f²) and we have simply

x = y + ε_f (3.5)
3.2.4 Stuck-At Fault
The stuck-at fault represents a sensor getting stuck at a particular value. Most
often this is a value at the high or low end of the appropriate sensing range.
One James Reserve leaf wetness sensor was stuck at the maximum rating, 10,
for several months. These faults are dangerous because the measurement can
tell you nothing about the underlying phenomenon. Yet, the measurements are
in-range, so simple out-of-range detection does not help.
x = β0 (3.6)
3.2.5 Static Fault
In the case of a static fault, the sensor is only reporting noise with no relation to
the true measurements. In our experience this has always been caused by a poor
connection of the sensor to the computational device. We model this with an
offset and additive noise where, as in the variance degradation fault, ε_f ∼ N(0, σ_f²) with σ_f² > σ_m².

x = β0 + ε_f (3.7)
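The five fault models above can be gathered into one sketch. The following Python function is our own illustration (the thesis implementation is in MATLAB, and the parameter defaults here are assumptions, not deployment values); it draws faulty measurements according to equations (3.3)-(3.7):

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_fault(y, kind, beta0=0.0, beta1=1.0, sigma=0.1, sigma_f=0.5):
    """Draw faulty measurements from the linear fault model
    x = beta0 + beta1*y + eps.  All defaults are illustrative."""
    n = len(y)
    eps = rng.normal(0.0, sigma, n)      # pre-fault additive noise
    eps_f = rng.normal(0.0, sigma_f, n)  # higher-variance fault noise
    if kind == "offset":        # eq. (3.3): x = beta0 + y + eps
        return beta0 + y + eps
    if kind == "gain":          # eq. (3.4): x = beta1*y + eps
        return beta1 * y + eps
    if kind == "variance":      # eq. (3.5): x = y + eps_f
        return y + eps_f
    if kind == "stuck_at":      # eq. (3.6): x = beta0
        return np.full(n, beta0)
    if kind == "static":        # eq. (3.7): x = beta0 + eps_f
        return beta0 + eps_f
    raise ValueError(kind)
```

For example, `apply_fault(y, "stuck_at", beta0=35.0)` reproduces the in-range stuck-at behavior of the James Reserve leaf wetness sensor: every reading is the same plausible constant.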
3.3 Aggregation Analysis
The robustness of data aggregation to these sensor faults must be analyzed so
that system designers know under what conditions their system will fail. In [32],
Wagner emphasizes the importance of robustness in sensor networks as a defense
against possible aggressors. He notes that sensor network hardware will likely not
be tamper-proof, and so robust algorithms are a must in order to combat worst-case scenarios where a single node might have the ability to arbitrarily alter aggregation results.
However, most systems implement cut-off points for their sensors, such that
a physically impossible or improbable measurement is always discarded. Thus, a
single sensor cannot arbitrarily affect an estimate or aggregate on the data. In a
scenario where we would like to do a more specific analysis than worst-case, and
we have some particular faults in mind, we can more carefully assess the damage
a fault inflicts on an aggregation algorithm.
3.3.1 Aggregation as Estimation
We can think of any fusion or aggregation function on our data as an estimator.
Our estimator Θ is any function Θ : RN → R. We continue to denote the ground
truth with y, and the set of true values from N sensors in a sensor network is
denoted as y1, · · · , yN . We define θ = Θ(y1, · · · , yN). Because the measured data
x1, · · · , xN are not equal to the ground truth (whether they are faulty data or
simply have measurement error), the function Θ(x1, · · · , xN ) returns an estimate
of θ.
To assess the quality of the estimate, we need an error term. Wagner [32] uses
root-mean-square error, the square-root of mean-square error. We will perform
our analysis using the mean-square error for cleanliness, but plots will show root-mean-square error. For some estimate of θ, which we will call \hat{θ}, mean-square error is defined as:

mse(Θ) = E((\hat{θ} − θ)²) (3.8)
The same work [32] defines rms*(Θ, k) as the root-mean-square error of the estimator when k members of the data set are compromised. However, as we have said, when we are considering systems which throw away out-of-range data points, no single data point can arbitrarily affect an estimator. We need to be more specific. We change our perspective to assess damage not under a number of compromised sensors, but under a percentage of compromised sensors c, and not under a general worst-case error, but under the error for a particular fault function f.

So for Wagner's k we now have c = k/N. We can define a new type of error, mse_f(Θ, c), as the mean-square error when a fraction c of the data set is compromised by the fault function f. For a given data set, if without loss of generality the data are numbered compromised first, unaffected last, then we have

\hat{θ} = Θ(\tilde{x}_1, · · · , \tilde{x}_{cN}, x_{cN+1}, · · · , x_N) (3.9)

We still have θ = Θ(y_1, · · · , y_N), and thus we define mse_f(Θ, c) = E((\hat{θ} − θ)²).
If a system designer is interested in keeping error below a point α, we can find
if our estimator is sufficient with the following definition, again based on [32].
Definition We say that an estimator Θ is (c, α)-resilient with respect to the parametrized distribution p(X_i | θ) under a particular fault function f if

mse_f(Θ, c) ≤ α² · mse(Θ).

We refer to this as the resilience condition. We also define this error comparison under the root-mean-square error metric: rms_f(Θ, c) ≤ α · rms(Θ).
Note that our c relates very closely to the breakdown point of an estimator. While the breakdown point makes sense only for an unbounded estimator, c applies to bounded estimators as well. In a sense, it represents a breakdown with respect to a particular error threshold.
Of the typical functions of interest, we focus here on the average and the sum. Again for notational simplicity, we order the data compromised first, unaffected last. We leave the root-mean-square versions out of the following derivations because they extend simply from the mean-square error.
Once we have defined an estimator for our aggregation function, we can apply
the error condition to find the resiliency under a particular fault.
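As a concrete illustration of the resilience condition, the following sketch estimates mse_f(Θ, c) for the average estimator by Monte Carlo under a stuck-at fault. It is our own Python illustration (the thesis code is MATLAB), and the numbers anticipate the stuck-at-zero example worked out in the next subsection: 20 nodes, constant truth of 20 deg C, measurement variance 2.

```python
import numpy as np

rng = np.random.default_rng(1)

def mse_f(c, N=20, runs=20000, mu=20.0, var=2.0, stuck=0.0):
    """Monte Carlo estimate of mse_f(average, c): a fraction c of the N
    sensors is stuck at `stuck`; the rest report N(mu, var) readings of
    a constant ground truth mu."""
    k = int(round(c * N))
    good = rng.normal(mu, np.sqrt(var), (runs, N - k))
    theta_hat = (good.sum(axis=1) + k * stuck) / N   # average over all N
    return float(np.mean((theta_hat - mu) ** 2))

def is_resilient(c, alpha):
    """Resilience condition: mse_f(Theta, c) <= alpha^2 * mse(Theta)."""
    return mse_f(c) <= alpha ** 2 * mse_f(0.0)
```

With one of twenty sensors stuck at zero, the average picks up a bias of one degree, so the estimator is (0.05, α)-resilient only for α large enough to absorb roughly a tenfold increase in mean-square error.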
3.3.2 Sum and Average
The sum estimators with only pre-fault data and with both faulty and pre-fault
data are as follows.
Θ_prefault = \sum_{i=1}^{N} x_i (3.10)

Θ_fault = \sum_{i=1}^{cN} \tilde{x}_i + \sum_{i=cN+1}^{N} x_i (3.11)
The Θ_prefault and Θ_fault for the average estimator are simply 1/N times those of the sum. When determining whether the resilience condition is satisfied for the average estimator, these 1/N constants on either side of the inequality cancel, and the condition is the same as that of the sum.

Variable     density          first moment                                  second moment
y            γ                \bar{y} ≡ E_γ[y]                              \bar{\bar{y}} ≡ E_γ[y²]
x            f                \bar{x} ≡ E_f[x]                              \bar{\bar{x}} ≡ E_f[x²]
\tilde{x}    \tilde{f}        \bar{\tilde{x}} ≡ E_{\tilde{f}}[\tilde{x}]    \bar{\bar{\tilde{x}}} ≡ E_{\tilde{f}}[\tilde{x}²]
xy           p(x, y)          \overline{xy} ≡ E_{p(x,y)}[xy]                –
\tilde{x}y   p(\tilde{x}, y)  \overline{\tilde{x}y} ≡ E_{p(\tilde{x},y)}[\tilde{x}y]   –

Figure 3.5: Notation for moments of variables used in aggregation analysis.
Let us now examine whether this estimator is resilient. The abbreviations in
Figure 3.5 will be useful in the discussion.
If we assume that measurements for i ≠ j are independent, we can reduce the equation for mean-squared error quite a bit. The algebra and the final equations are in Section 3.3.3. Because the final equation becomes very long, we instead visualize the implications for particular scenarios, with particular faults taken from our fault examples in Section 3.1.
In the first example, we have 20 nodes in a network over which we would like to find the average temperature value. Note that this analysis also applies to the sum of the temperature values. Say we are using very inexpensive electronics, and we often find that our temperature value gets stuck at 0 deg C. We would like to see how resilient our network is to these faults. We assume the ground truth temperatures are constant at 20 deg C, and the measurements are iid and drawn from a normal distribution with mean at ground truth and variance 2 deg C. We also assume that the faults are independent of the good measurements. Then we have: \bar{\tilde{x}} = 0, \bar{\bar{\tilde{x}}} = 0, \bar{x} = 20, \bar{\bar{x}} = 402, \bar{y} = 20, \bar{\bar{y}} = 400, \overline{\tilde{x}y} = 0, \overline{xy} = 400.

Figure 3.6: Averaging under a stuck-at-zero fault.

Figure 3.6 shows how our system will do with increasing numbers of compromised nodes; the y-axis shows root-mean-square error. If, for example, we require that our error not exceed one degree, then we will only be able to tolerate about 5% of the nodes being compromised. The horizontal lines show the root-mean-square error of the pre-fault estimate for different values of α. These curves offer a comparison point; for example, if we are designing for the possible compromise of 5% of nodes under this fault, we must accept between 2 and 5 times more error than we would accept with no fault.
In another scenario, we examine the sum and average under the degraded variance fault. Again we hold the ground truth values constant at 20 deg C, and now both the pre-fault measurement and the faulty measurement are Gaussian distributed around this value. The pre-fault measurement variance is still 2 deg C. The faulty measurement variance is higher, at 5 deg C.

Figure 3.7: Averaging under degraded variance.

In this scenario, whose results are shown in Figure 3.7, we can see that the curve for root-mean-square error stays at the bottom of the plot all the way across. That is, a quarter of our sensors can be faulty with degraded variance and still the root-mean-square error in temperature is less than a quarter of a degree.
3.3.3 Some Mathematical Details
For the Sum estimator, we start with the mse between Θ_fault and Θ_true, where

θ_fault = \sum_{i=1}^{cN} \tilde{x}_i + \sum_{i=cN+1}^{N} x_i (3.12)

θ_true = \sum_{i=1}^{N} y_i (3.13)

We want to compute the value of mse_f(Θ, c), which with these estimators looks as follows:

E[ ( \sum_{i=1}^{cN} \tilde{x}_i + \sum_{i=cN+1}^{N} x_i − \sum_{i=1}^{N} y_i )² ] (3.14)
Recall that in Section 3.3.2 we assume that measurements for nodes (i, j) where i ≠ j are independent.
Here are the useful squared terms:

(A) E[(\sum_{i=1}^{cN} \tilde{x}_i)²] = cN [\bar{\bar{\tilde{x}}} + (cN − 1)\bar{\tilde{x}}²]

(B) E[(\sum_{i=cN+1}^{N} x_i)²] = (N − cN)[\bar{\bar{x}} + (N − cN − 1)\bar{x}²]

(C) E[(\sum_{i=1}^{N} y_i)²] = N[\bar{\bar{y}} + (N − 1)\bar{y}²]

(D) E[(\sum_{i=1}^{N} x_i)²] = N[\bar{\bar{x}} + (N − 1)\bar{x}²]

And here are the useful cross terms:

(F) E[\sum_{i=1}^{cN} \tilde{x}_i \sum_{i=cN+1}^{N} x_i] = cN(N − cN)\bar{\tilde{x}}\bar{x}

(G) E[\sum_{i=cN+1}^{N} x_i \sum_{i=1}^{N} y_i] = (N − cN)[\overline{xy} + (N − 1)\bar{x}\bar{y}]

(H) E[\sum_{i=1}^{cN} \tilde{x}_i \sum_{i=1}^{N} y_i] = cN[\overline{\tilde{x}y} + (N − 1)\bar{\tilde{x}}\bar{y}]

(I) E[\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i] = N[\overline{xy} + (N − 1)\bar{x}\bar{y}]
The MSE is made up of these terms as follows.
mse(Θ) = (C) + (D) − 2(I) (3.15)

mse_f(Θ, c) = (A) + (B) + (C) + 2(F) − 2(G) − 2(H) (3.16)

This analysis shows that it is actually straightforward to assess the impact a particular fault has on the average or sum aggregator: all we need to know are the first and second moments of the true values and the fault variable, and the covariance between the truth and the faults.
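The closed-form terms translate directly into code. The sketch below is our own Python illustration of equations (3.15)-(3.16) for the sum estimator; dividing by N² gives the average, and the stuck-at-zero moments from Section 3.3.2 recover the roughly one-degree RMS error read off Figure 3.6.

```python
def mse_fault(c, N, xb, xbb, xtb, xtbb, yb, ybb, xy, xty):
    """Closed-form mse_f(sum, c) from first and second moments:
    xb = E[x], xbb = E[x^2] for the pre-fault measurement; xtb, xtbb
    the same for the faulty variable; yb, ybb for the truth;
    xy = E[xy] and xty = E[x~ y] the cross moments."""
    k = c * N
    A = k * (xtbb + (k - 1) * xtb ** 2)
    B = (N - k) * (xbb + (N - k - 1) * xb ** 2)
    C = N * (ybb + (N - 1) * yb ** 2)
    F = k * (N - k) * xtb * xb
    G = (N - k) * (xy + (N - 1) * xb * yb)
    H = k * (xty + (N - 1) * xtb * yb)
    return A + B + C + 2 * F - 2 * G - 2 * H     # eq. (3.16)

def mse_prefault(N, xb, xbb, yb, ybb, xy):
    """Closed-form mse(sum) with no fault, eq. (3.15)."""
    C = N * (ybb + (N - 1) * yb ** 2)
    D = N * (xbb + (N - 1) * xb ** 2)
    I = N * (xy + (N - 1) * xb * yb)
    return C + D - 2 * I
```

With the Section 3.3.2 moments (N = 20, truth constant at 20, measurement variance 2, faults stuck at zero), `mse_fault(0.05, 20, 20, 402, 0, 0, 20, 400, 400, 0) / 400` gives a mean-square error of 1.095 for the average, i.e. an RMS error just over one degree at 5% compromised nodes.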
3.4 Future Work and Discussion
Using the mean-squared error to assess the performance of the aggregate implies that only the first and second moments of the distributions matter. This is good because it simplifies some difficult problems. It is problematic because, if the data and fault distributions are not Gaussian, some of the most interesting information often lies in the higher moments, and that information is lost.
The faults listed in this chapter are simple faults; often sensor faults have more
complicated temporal patterns that would require a non-homogeneous statistical
model.
We hope the work in this chapter can help anyone interested in analyzing the effects of particular faults on other, more complicated data aggregation and data fusion algorithms. The important point to take away from this chapter is that simple models can go a long way toward assessing robustness in sensor network algorithms.
CHAPTER 4
Estimation of Calibration Parameters using
Dynamical Models
This chapter presents collaborative work with Professor Steven Margulis in the
environmental engineering department at UCLA.
The first approach we take to address in-situ blind calibration is to use a
physical model of the environment to give context to the measurements and thus
some information about the measurement parameters. In the classical approach
to tracking a phenomenon, adaptive filters are used. This approach takes a
model of how the phenomenon changes from one time step to the next, along with
uncertainty in those changes, and then incorporates measurements when they are
taken. Typically these measurements have only additive noise uncertainty. Thus,
when a measurement is taken, it is possible to find the likelihood of the model
given the measurements. In this thesis, we assume there is also uncertainty in the linear measurement parameters, or calibration parameters; that is, the likelihood itself is uncertain. Instead of using the measurements simply to inform the model, we also use the model to inform an estimate of the measurement parameters.
In the study of soil state in environmental engineering, it is common to use dynamical models informed by the physics of water flow, evaporation and chemical absorption by soils. Soil sensing applications are an exciting yet challenging area for the application of Wireless Sensor Networks. Soil sampling is an important task for two reasons: understanding global warming through the balance of carbon dioxide absorption between our oceans and our soils, and understanding the quality of our underground water sources. Soil sensing is extremely challenging [26] both because soil is extremely heterogeneous, which makes oversampling impossible, and because the soil ecosystem is a confluence of many factors including rain, chemicals, evaporation and plant root growth.
Fortunately, because of the importance of the problem, civil and environmen-
tal engineers have spent a lot of time developing careful models for soil dynamics.
Typically the models have parameters which should be learned from the particular environment of interest. Thus, the problem is well-suited for state estimation filters, e.g. the Kalman Filter or Particle Filters. In this thesis, we are interested in augmenting the usual state vector with the calibration parameters, so that we can estimate them simultaneously.
4.1 The Ensemble Kalman Filter and the Particle Filter
For this analysis, we looked in particular at two filter options. The first, called the
Ensemble Kalman Filter (EnKF), tracks the probability density function (pdf) of
the state of a dynamical system by using a Monte Carlo technique. First, several
replicates (the ensemble) are generated using the joint prior distribution across
all the input variables, which can be non-Gaussian. Those replicates are tracked
with the forward model, which can be nonlinear. When measurements are taken,
the replicates are updated with a modified Kalman gain matrix. This update
step makes loose assumptions on Gaussianity [34].
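The EnKF update just described can be sketched generically. The following Python code (a perturbed-observation variant; all names are our own, and the thesis implementation is in MATLAB) updates an ensemble with a gain computed from the sample covariance:

```python
import numpy as np

rng = np.random.default_rng(2)

def enkf_update(ens, z, H, R):
    """Perturbed-observation EnKF analysis step.  `ens` is
    (n_state, n_reps), z the observation vector, H the linear
    observation operator, R the observation-error covariance."""
    n_state, n_reps = ens.shape
    mean = ens.mean(axis=1, keepdims=True)
    A = ens - mean                                   # ensemble anomalies
    P = A @ A.T / (n_reps - 1)                       # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
    # each replicate sees its own perturbed copy of the observation
    zz = z[:, None] + rng.multivariate_normal(np.zeros(len(z)), R, n_reps).T
    return ens + K @ (zz - H @ ens)
```

For a scalar state with prior N(0, 1), observation 2.0 and observation variance 1, the updated ensemble concentrates near the Bayesian posterior N(1, 0.5), which is the sense in which the update makes only loose Gaussianity assumptions.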
The second, a particle filter implementation called the Sequential Importance
Resampling (SIR) filter, is another Monte Carlo technique which uses the same
Figure 4.1: An example soil sampling scenario.
propagation step but a different update step [34, 1]. Again, an ensemble of
several replicates (or particles) is generated from the joint prior, which can be
non-Gaussian, and again the replicates are tracked with the forward model, which
can be nonlinear. When a measurement is taken, the update is performed in
two steps. First, weights are given to each of the replicates depending on the
probability of that replicate given the measurement. That is, we evaluate the
likelihood function at the point of the replicate. The weights are normalized to
sum to one. Second, a new ensemble is generated from the previous ensemble by
resampling the previous ensemble with replacement according to the normalized
weights. The SIR filter is a theoretically more sound option than the Ensemble
Kalman Filter, as it can be shown to give exact solutions for large ensemble sizes.
However, nothing can be guaranteed in general for finite ensemble sizes.
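The two-step SIR update (weight, then resample with replacement) is compact enough to state in code. This is our own generic Python sketch, with an assumed `likelihood(z, particle)` interface rather than anything from the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

def sir_update(particles, z, likelihood):
    """One SIR measurement update.  `particles` is (n_state, n_particles);
    `likelihood(z, p)` returns p(z | p).  Weights are normalized to sum
    to one, then the ensemble is resampled with replacement."""
    w = np.array([likelihood(z, p) for p in particles.T])
    w = w / w.sum()                                        # normalize
    idx = rng.choice(particles.shape[1], particles.shape[1], p=w)
    return particles[:, idx]                               # resample
```

With a standard-normal prior and a Gaussian likelihood centered on the observation, the resampled cloud shifts toward the observation and tightens, matching the qualitative behavior described above.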
In this thesis we are interested in seeing how the two filters compare in estimating calibration parameters.
4.2 Simple Autoregressive Surface Moisture Model
Figure 4.1 represents the scenario from which we designed our simple model.
The state vector y is a single state (1-dimensional vector) which represents the
moisture at a point near the ground surface (within 5 inches of the surface). We
used a simple autoregressive model from a tutorial paper by Evensen [10]. We
model the moisture draining down out of this point over time with a drainage
coefficient δ. To this we add model error q_k, with a normalization constant defined below, and model forcing due to precipitation, precip_{k−1}.

y_k = δ y_{k−1} + \sqrt{dt}\, σ_w ρ\, q_k + precip_{k−1} (4.1)
4.2.1 Initial Condition
The first input to our dynamical model is the initial condition y_0, i.e. the moisture in the soil at the start of our experiment. We take the distribution of our uncertainty in the initial condition to be lognormal, y_0 ∼ LN(µ_{y_0}, σ²_{y_0}). This keeps our state always positive. The lognormal density is defined as follows.

\frac{1}{x σ \sqrt{2π}} e^{−(\ln x − µ)² / 2σ²} (4.2)
4.2.2 Model Forcing
The forcing in our model is due to precipitation. We take the precipitation input to our model to be the measurements from our rain gauge. For now, the precipitation was generated once and used for all the work in this chapter. In the future we intend to create models that generate different instances of precipitation, along with models more specific and accurate to real-life scenarios of interest.
4.2.3 Model Error
To take into account the model errors [10], we also use a simple AR-1 model. The model error at time k, q_k, is an additive combination of the model error at the previous time step and Gaussian random noise.

q_k = α q_{k−1} + \sqrt{1 − α²}\, w_{k−1} (4.3)

\begin{bmatrix} y_k \\ q_k \end{bmatrix} = \begin{bmatrix} δ y_{k−1} + \sqrt{dt}\, σ_w ρ\, q_k + precip_{k−1} \\ α q_{k−1} + \sqrt{1 − α²}\, w_{k−1} \end{bmatrix} (4.4)

The constant before q_k in the forward equation for y_k is a normalization constant, and σ_w is the noise standard deviation of w. The normalization factor ρ is calculated as follows [10]. This achieves normalization over each time unit, where we call the total number of time units N_units and n is the number of forward model time steps per time unit.

ρ = (1 − α)² / (dt (n − 2α − nα² + 2α^{n+1})) (4.5)

Thus, if we choose to run our forward model over N_steps time steps, and we would like it to normalize every n time steps, then we choose N_units = N_steps/n.
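For concreteness, the forward model of equations (4.1), (4.3) and (4.5) can be sketched as follows. This is our own Python illustration; the settings are assumed stand-ins (they coincide with Figure 4.2 only where noted), and ρ is folded in exactly as written above.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative settings: delta and alpha match Figure 4.2; dt, n and
# sigma_w are assumptions for this sketch.
delta, alpha, dt, n, sigma_w = 0.75, 0.5, 0.1, 10, 1.0

# Normalization factor, eq. (4.5)
rho = (1 - alpha) ** 2 / (dt * (n - 2 * alpha - n * alpha ** 2
                                + 2 * alpha ** (n + 1)))

def forward(y0, steps, precip):
    """Propagate the surface-moisture state:
    q_k = alpha*q_{k-1} + sqrt(1 - alpha^2)*w_{k-1}              (4.3)
    y_k = delta*y_{k-1} + sqrt(dt)*sigma_w*rho*q_k + precip_{k-1} (4.1)"""
    y, q = y0, 0.0
    traj = []
    for k in range(steps):
        w = rng.normal(0.0, sigma_w)
        q = alpha * q + np.sqrt(1 - alpha ** 2) * w
        y = delta * y + np.sqrt(dt) * sigma_w * rho * q + precip[k]
        traj.append(y)
    return np.array(traj)
```

With zero precipitation the trajectory simply drains toward zero at rate δ, with small AR-1 fluctuations around it.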
4.2.4 Measurement Model
In our scenario, we are directly measuring the moisture state variable with an ECHO-20 moisture sensor. An example calibration equation for the ECHO-20 is given by y = 6.95 × 10⁻⁴ x − 2.9 × 10⁻¹, where y is the percent water content by volume and x is the millivolt reading given by the ECHO sensor. So for our simulation experiments, we also use a linear calibration equation for the measurement model.

y = β1 x + β0 + v (4.6)
With the filtering techniques we are examining, we can in principle estimate the calibration coefficients even though the term β1x makes the measurement equation nonlinear in the augmented state. Measurement noise is random and additive, represented here by v. In the Ensemble Kalman Filter and the Particle Filter, v does not have to be Gaussian distributed. When v has some other distribution, the calculation of the likelihood during the update of the Particle Filter will be more difficult [1]; however, the complexity of the Ensemble Kalman Filter does not change.
Classically, the level of noise in the measurements is crucial to the performance of estimation; if the noise is much greater than the signal itself, then obviously estimation cannot work. We need an analogous understanding of the conditions on the calibration parameters. For example, if the gain parameter β1 is zero, then the measurements will be completely uninformative and the filter will rely wholly on the model. A careful understanding of these conditions is part of future work.
4.3 Implementation
We have implemented the SIR Particle Filter Ensemble Kalman Filter in MAT-
LAB1.
We first implemented code to generate a true phenomenon and generate mea-
1The MATLAB code is all available at http://www.ee.ucla.edu/~sunbeam/bc.
34
surements of that phenomenon using the true measurement parameters. The
parameters needed for the true phenomenon and true measurements are listed in
Figure 4.2. We also need a vector of precipitation events, which we generated
once and used throughout all the examples in this chapter.
The filter code itself takes the measurements as input and runs the EnKF
and SIR algorithms with a few additional parameters, including the number of
replicates (i.e. size of the ensemble) and the prior distribution for unknown
parameters.
4.3.1 State Vector
The state vector we use in our implementation consists of five states; we call it Y. We assume the measurement coefficients and the decay parameter are time-invariant, and thus they do not carry a subscript k.

Y = \begin{bmatrix} y_k \\ q_k \\ δ \\ β_1 \\ β_0 \end{bmatrix} (4.7)
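To make the augmented-state idea concrete, here is a much-reduced Python sketch of the SIR filter tracking only [y, β0]: δ and β1 are held known, the model error term is dropped, and every numeric setting (forcing, priors, jitter) is our own illustrative assumption rather than the thesis configuration. The sensor reading x is generated by solving eq. (4.6) for x.

```python
import numpy as np

rng = np.random.default_rng(5)

delta, beta1, sigma_v = 0.75, 2.0, 0.25   # assumed known here
true_beta0 = 5.0                          # offset the filter must recover

def run_sir(n_reps=2000, steps=200):
    # Truth and readings: y = beta1*x + beta0 + v  =>  x = (y - beta0 - v)/beta1
    y_true, xs = 10.0, []
    for _ in range(steps):
        y_true = delta * y_true + 1.0     # deterministic model, constant forcing
        v = rng.normal(0.0, sigma_v)
        xs.append((y_true - true_beta0 - v) / beta1)
    # Particles: rows [y, beta0]; the beta0 prior N(0, 3^2) is deliberately off.
    P = np.vstack([np.full(n_reps, 10.0), rng.normal(0.0, 3.0, n_reps)])
    for x in xs:
        P[0] = delta * P[0] + 1.0                    # propagate state
        resid = P[0] - beta1 * x - P[1]              # implied noise v per particle
        w = np.exp(-0.5 * (resid / sigma_v) ** 2)    # Gaussian likelihood
        w = w / w.sum()
        P = P[:, rng.choice(n_reps, n_reps, p=w)]    # resample with replacement
        P[1] = P[1] + rng.normal(0.0, 0.01, n_reps)  # jitter keeps diversity
    return P

posterior = run_sir()
```

After 200 updates the β0 particles concentrate near the true offset of 5 even though the prior mean was 0; Section 4.4 studies exactly this kind of recovery quantitatively for the full five-state vector.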
4.3.2 Input Parameters
Both the EnKF and the SIR Particle Filter take as input parameters the user's prior knowledge of the states to be estimated.
Clearly, the user does not always know accurately what the state variables should be. The prior expected value for the state variable is the prior mean; when one is uncertain about one's knowledge of the state variable, one can increase the prior uncertainty, or prior variance. We used Gaussian variables to represent the possible values for the calibration coefficients, so the mean and variance describe the variables completely. One could also use a different distribution; all of the parameters which define the chosen distribution would then need to be given as input to the filter.

Parameter      Meaning                                               Setting
Nunits         Number of time units in simulation                    200
dt             Time step                                             0.1
Nsteps         Nunits/dt                                             2000
y0bar          µ parameter in LN distribution for initial condition  0.05
y0var          σ parameter in LN distribution for initial condition  0.7
α              Model error parameter                                 0.5
wbar           Model error mean                                      2
wvar           Model error variance                                  1
δ              Decay parameter                                       0.75
β0             Measurement bias                                      5
β1             Measurement gain                                      2
σv             Measurement variance                                  0.25
Meas spacing   Time steps between two measurements                   50

Figure 4.2: Parameters used in generating the true phenomenon. The true initial condition and the model error are a single instance drawn from the corresponding distribution.
Introducing uncertainty can hurt the ability of the filters to home in on the correct estimate. The work in this chapter aims to understand the tradeoffs and the relative performance of the EnKF vs. the SIR filter.
One important input to both filters is the measurement noise variance. In Figure 4.3, we illustrate an example of what can happen if we do not know the correct measurement noise variance a priori. This illustration uses the Ensemble Kalman Smoother, which estimates the parameters across the entire time window, whereas the Filter estimates online as measurements are taken. In this example, we used all the parameters as listed in Figure 4.2, except Meas spacing = 10, β0 = 0 and β1 = 1. We show only a limited section of the plot for a close-up view. As can be seen, with a higher measurement variance, the filter is less accurate. Additionally, if the prior variance is too small, the replicates do not even encompass the true phenomenon.
In addition to the noise variance, both algorithms also need the number of replicates to generate and propagate, Nreps. This is set to 400 throughout the experiments described in this chapter. Both filters also need a description of the prior distribution for every state variable they will estimate. It is very important to emphasize that we must either gather very good prior knowledge or allow more uncertainty in that knowledge, or else we may be misled by the output of the filter.
[Three panels of moisture vs. t, each showing true, measurements, posterior mean and replicates: "Measurement Variance = 0.25, Prior Variance = 0.25"; "Measurement Variance = 4, Prior Variance = 0.25"; "Measurement Variance = 4, Prior Variance = 4".]
Figure 4.3: Example of EnKF with different measurement variance and a priori
information.
Figure 4.4: An example of the SIR Filter estimate for the five state parameters.
4.4 Evaluation
In order to evaluate the ability of the EnKF and Particle Filter to estimate
measurement coefficients, we ran several simulations using the autoregressive
model. Here we show five experiments. In each experiment, we had uncertainty
in only one of the calibration parameters to be estimated, either the gain or the
offset. Recall that we are assuming the gain and offset are Gaussian distributed,
so we only vary the prior mean or the prior variance. Our results are particular for
uncertainty in one calibration parameter and also particular for the autoregressive
model we have chosen, but they do demonstrate some simple properties.
As an example of what the filter estimate may look like, see Figure 4.4. The five plots are each the estimates over time of one of the state variables, with the moisture state y in the top left, model error q below that, moisture decay δ in the bottom left, measurement gain β1 in the top right, and measurement offset β0 below that. This five-plot layout is the same for all the following plots showing the performance of estimation; we will always show plots for the five state variables in this order.
4.4.1 Estimation with Incorrect Prior Mean
For the first two experiments, we start by testing whether we can estimate the five
state variables even when we assume some wrong prior mean for the offset or gain
calibration parameters. That is, if we assume that offset parameter is zero, but in
reality there is an offset, will we be able to estimate the state variables, including
the correct calibration offset? Or if we assume that the gain parameter is one,
when in reality there is a gain, will we estimate the state variables correctly?
In the first experiment, we allow the prior mean for the calibration offset β0 to vary while holding the variance and other parameters constant. The parameters are listed here verbatim from the code. The variable names are mostly self-explanatory; keep in mind that the δ parameter is uniformly distributed, and lo and hi represent the interval over which it lies. In this case, δ was uniformly distributed between 0.5 and 1.
[Five-panel plot of RMS error (moisture, model error q, measurement gain, measurement offset, decay parameter): "Effect of Prior Offset Mean on RMS Error", with increasing difference between true and prior mean for measurement offset along the x-axis.]
Figure 4.5: RMS Error as the difference in the true and prior mean for the
measurement offset increases: 5, 50, 95% quantiles over 100 runs.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1;
Beta1_Var_vec = 0;
Beta0_Mean_vec = [beta0 beta0-2 beta0-5 beta0-10];
Beta0_Var_vec = 1;
Delta_lo_vec = .5;
Delta_hi_vec = 1;
Figure 4.5 shows the root mean square error of the estimate of the state
variables as a function of the difference between the true β0 and the prior mean
for β0.
The error calculation and plots are the same for all the experiments, so we explain them once here. We take the mean of all the replicates to be the final estimate of the state variables. The plots then show the RMS error between this estimate and the true values. We simulated 100 runs to see the effects; the plots show the 5%, 50% and 95% quantiles² of the resulting error from those 100 runs.
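The quantile summary behind these plots is simple to reproduce; here in Python with placeholder data (the gamma draw is an arbitrary stand-in for the 100 per-run RMS errors, not thesis results):

```python
import numpy as np

rng = np.random.default_rng(6)

# 100 per-run RMS errors (placeholder values)
errs = rng.gamma(2.0, 0.5, 100)

# The plots report the 5%, 50% and 95% quantiles of this vector.
q05, q50, q95 = np.quantile(errs, [0.05, 0.50, 0.95])
```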
In Figure 4.5, we have the results of this first experiment. We have plotted
these three quantiles for the RMS error of all five parameters which we are esti-
mating. The measurement gain error is flat at zero because we are holding this
parameter as known. As we expect, for the measurement offset, decay parameter,
and moisture state, the estimate error increases as the difference between the true
and prior mean increases. So as we make a worse and worse guess at what the
true value is for the offset calibration parameter, our estimation error gets worse.
2The 5% and 95% quantiles are close to the best- and worst-case error. The 50% quantile is the median error.
Interestingly, when we compare the Particle filter to the EnKF, we see that the
extreme cases are more spread apart for the Particle filter. The three quantiles
for the EnKF are quite close to each other in comparison. We can also observe
that the estimate of model error is the same for both the Particle filter and the
EnKF.
In the second experiment, we allow the mean of the gain parameter β1 to vary.
The gain calibration parameter is multiplicative with the moisture state variable
itself and thus also with the decay parameter, and so we expect the results to be
less straightforward.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = [beta1 beta1-.5 beta1-1 beta1-1.5];
Beta1_Var_vec = .4;
Beta0_Mean_vec = beta0;
Beta0_Var_vec = 0;
Delta_lo_vec = .5;
Delta_hi_vec = 1;
Figure 4.6 shows the root mean square error of the estimate of the state
variables as a function of the difference between the true β1 and the prior mean
for β1. As we can see, the results are similar to those of the first experiment.
However, the error in the moisture parameter increases more quickly. As we
said, the error in the decay parameter and the error in the measurement gain are
multiplicative, so this makes sense. Nevertheless, we are pleased to see that we are still
[Figure 4.6, "Effect of Prior Gain Mean on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing difference between true and prior mean for the measurement gain.]
Figure 4.6: RMS Error as the difference in the true and prior mean for the
measurement gain increases: 5, 50, 95% quantiles over 100 runs.
[Figure 4.7, "Effect of Prior Offset Variance on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing prior variance for the measurement offset.]
Figure 4.7: RMS Error as the prior variance for the offset measurement parameter
increases: 5, 50, 95% quantiles over 100 runs.
able to do a reasonable job of estimating the state vector, and that more correct
information improves our estimate.
4.4.2 Estimation with Changing Prior Variance
In the third experiment, we allow the variance of β0 to vary.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1;
Beta1_Var_vec = 0;
Beta0_Mean_vec = beta0-5;
Beta0_Var_vec = [1 2 4 5];
Delta_lo_vec = .5;
Delta_hi_vec = 1;
In Figure 4.7, we see that as we increase the prior variance, the estimation error
for the measurement offset parameter goes down. This is encouraging, because
it means that even when there is a lot of uncertainty in our prior knowledge,
the filters are able to get closer to the true parameter value. When our prior
uncertainty is too tight around the wrong value, we will converge on that wrong
value; this result shows that increasing the uncertainty can help in those
situations.
We can also observe that, for this experiment, the median-case error of the Particle
Filter is generally better than that of the EnKF. The worst-case error for the Particle
Filter, however, is still generally much higher than the EnKF error.
In the fourth experiment, we allow the variance of β1 to vary.
[Figure 4.8, "Effect of Prior Gain Variance on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing prior variance for the measurement gain.]
Figure 4.8: RMS Error as the prior variance for the gain measurement parameter
increases: 5, 50, 95% quantiles over 100 runs.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1-1;
Beta1_Var_vec = [.1 .4 .7 1];
Beta0_Mean_vec = beta0;
Beta0_Var_vec = 0;
Delta_lo_vec = .5;
Delta_hi_vec = 1;
Figure 4.8 shows a slightly depressing result, especially after we saw with the
offset estimation that increasing the prior variance reduces the estimation error. When it
comes to the gain estimation, though the errors in both the gain parameter and
the decay parameter decrease, the estimation error for the moisture variable itself
decreases very little. This of course makes sense, as the errors in the gain and
decay parameters will be multiplicative in the moisture state estimation error.
4.4.3 Estimation with Frequent Updates
In the fifth experiment, we examined what a change in the number of measure-
ments would do to help our estimate.
[Figure 4.9, "Effect of Measurement Frequency on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing number of timesteps between measurements.]
Figure 4.9: RMS Error as the spacing between measurement updates increases:
5, 50, 95% quantiles over 100 runs.
beta1 = 2;
beta0 = 5;
delta = .85;
Meas_spacing_vec = [25 50 100 200 300 400 500];
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1-1;
Beta1_Var_vec = .2;
Beta0_Mean_vec = beta0-5;
Beta0_Var_vec = 2;
Delta_lo_vec = .75;
Delta_hi_vec = .95;
Figure 4.9 shows another disappointing result from this experiment. The error
remains nearly flat for all the parameters. It should surely be the case, however,
that having measurements more often can only help the estimation, especially
the estimation of the calibration parameters themselves. We believe that this
reflects some problems with the way we chose to update the parameters. We
only used the measurements to directly update the moisture state variable, and
we updated the other state variables only indirectly. A more careful choice of
update for the calibration parameters is the main next step for this work.
4.4.4 Future Work and Discussion
This evaluation only shows us the effect of uncertainty in a single input parameter
at a time. In reality, we may have very little information and we may have un-
certainty about several input parameters. The non-linearity of the model makes
it difficult to say anything general about how this might affect the outcome of
the estimation. In turn, the model itself and all its intricacies will have a large
effect on whether the calibration parameter estimation will work. We advocate
a careful evaluation of each input parameter on the specific model implemented.
As we stated in the description of the fifth experiment, we believe that a main
next step is to more carefully design the update step so that both the moisture
state variable and the calibration parameters get directly updated. In the Particle
Filter we implemented for this thesis, during the update step we took the most
recent estimate of the calibration gain and offset and used them directly in the
calculation of the measurement; when the best replicates were chosen, this implicitly
favored the replicates with correct calibration gains and offsets. The
EnKF has a structure that insists on the update of all state variables in each
update step; however, the update matrix cannot be perfectly correct as is, because
it would need to be nonlinear. If we were to do a proper update, we would
have a measurement model which directly incorporates the estimated gain and
offset, and the update would take advantage of the direct relationship between
the measurements and the calibration coefficients.
In addition to the EnKF and the SIR Particle Filter, we also implemented the
Ensemble Kalman Smoother. A Smoother is a type of filter that has some delay
because it does the estimate update over a window of measurements instead of as
each measurement comes in. Hybrid smoother-filter algorithms, where a moving
window of a small number of measurements is used in order to estimate the
calibration parameters, are of interest. This is a sensible approach if we assume
that the calibration parameters are not changing over some short time interval.
There are many ways to discuss the accuracy of the estimate, and we have
made two important choices here. First, we must have a choice of error function
to evaluate the performance of our estimator. In the simulations shown, we used
the RMS error. We could instead look at many other measures of error. Another
important measure would be whether or not the truth lies within the uncertainty
bounds of the smoother. This would tell us if we are being conservative enough
with our prior uncertainty. The second choice we made was to take the mean of
the replicates as our final estimate. Depending on how we believe the replicates
are distributed, taking the mean of the replicates may not be the right choice; we
could instead take another measure from the replicates such as the maximum a
posteriori value or the median value. Understanding the context for these choices
is extremely important, and the choices should be made carefully for any given
model.
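To illustrate why this choice matters, here is a small hypothetical NumPy sketch (the thesis code is in Matlab): for an artificially skewed, lognormal ensemble of replicates, taking the mean and taking the median give noticeably different final estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical skewed ensemble of replicates for a single state variable.
# A lognormal distribution has mean exp(1/2) but median 1, so the two
# candidate point estimates disagree.
replicates = np.exp(rng.normal(0.0, 1.0, size=10_000))

mean_estimate = replicates.mean()
median_estimate = np.median(replicates)
print(mean_estimate, median_estimate)
```

For a symmetric replicate distribution the two estimates would nearly coincide; the skew is exactly what makes the choice consequential.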
CHAPTER 5
Estimation of Calibration Parameters using
Subspace Matching
This chapter presents collaborative work with Professor Robert Nowak in the
Electrical Engineering Department at University of Wisconsin, Madison. This
work was first published as a paper in the Conference on Information Processing
in Sensor Networks, 2007 [3].
This chapter presents a more general problem formulation and approach to
blind calibration. Instead of assuming a particular dynamical model, as we did
in Chapter 4, we assume a model for the subspace of the vector of sensor mea-
surements from a given time instant. We will call this vector a “snapshot.”
One approach to blind sensor network calibration is to begin by assuming that
the deployment is very dense, so that neighboring nodes should (in principle) have
nearly identical readings [7]. Unfortunately, many existing and envisioned sensor
network deployments may not meet the density requirements of such procedures.
However, we can view this choice in another way: as identifying a set of
constraints that the sensor measurements should satisfy.
In the case of [7] the difference between two neighboring sensor measurements
should be approximately zero. We could also choose another set of constraints,
for example that the values of three consecutive sensors lie on a line, i.e. the
second derivative is approximately zero. All of these choices define a subspace
53
in which the sensor measurements should lie. In this chapter we discuss whether
gain and offset calibration coefficients can be identified given a general subspace
for the sensor measurements.
5.1 Problem Formulation
Consider a network of n sensors. At a given time instant, each sensor makes a
measurement, and we denote the vector of n measurements by x = [x(1), . . . , x(n)]′,
where ′ denotes the vector transpose operator (so that x is an n× 1 column vec-
tor). We will refer to x as a “snapshot.” When necessary, we will distinguish
between snapshots taken at different times using a subscript (e.g., xs and xt are
snapshots at times s and t).
Each sensor has an unknown gain and offset associated with its response, so
that instead of measuring x the sensors report
y(j) = (x(j) − β(j))/α(j),   j = 1, . . . , n
where α = [α(1), . . . , α(n)]′ are the sensors’ gain calibration factors and β =
[β(1), . . . , β(n)]′ are the sensors’ calibration offsets. It is assumed that α(j) ≠ 0,
j = 1, . . . , n. With this notation, the sensor measurement y(j) can be calibrated
by the linear transformation x(j) = α(j)y(j) + β(j). We can summarize
this for all n sensors using the vector notation
x = Y α + β , (5.1)
where Y = diag(y) and the diag operator maps the vector y to the n × n diagonal
matrix with entries y(1), . . . , y(n) on the diagonal.
The blind calibration problem entails the recovery of α and β from routine un-
calibrated sensor readings such as y.
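The measurement model and the calibration map (5.1) can be sanity-checked with a short NumPy sketch (all values hypothetical; the thesis code is in Matlab):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Hypothetical true signal, gains, and offsets for an n-sensor snapshot.
x = rng.normal(size=n)
alpha = rng.uniform(0.5, 2.0, size=n)   # alpha(j) != 0 for every sensor
beta = rng.normal(size=n)

y = (x - beta) / alpha                   # what the uncalibrated sensors report
x_cal = np.diag(y) @ alpha + beta        # x = Y alpha + beta, with Y = diag(y)

print(np.allclose(x_cal, x))
```

Applying (5.1) to the raw readings recovers the true snapshot exactly; blind calibration is the problem of finding α and β without access to x.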
When we look at the problem like this, we can see that without further as-
sumptions, blind calibration is an impossible task. Other work that we discussed
in Section 2, such as [13, 16], takes a very similar problem formulation and makes
particular assumptions so that it can be solved. Those assumptions do not help
us in blind calibration; however, it turns out that under a different set of mild
assumptions that may often hold in practice, quite a bit can be learned from raw
(uncalibrated) sensor readings like y in order to do blind calibration.
Assume that the sensor network is slightly “oversampling” the phenomenon
being sensed. Mathematically, this means that the calibrated snapshot x lies
in a lower dimensional subspace of n-dimensional Euclidean space. Let S de-
note this “signal subspace” and assume that it is r-dimensional, for some integer
0 < r < n. For example, if the signal being measured is bandlimited and the sen-
sors are spaced closer than required by the Shannon-Nyquist sampling rate, then
x will lie in a lower dimensional subspace spanned by frequency basis vectors. If
we oversample (relative to Shannon-Nyquist) by a factor of 2, then r = n/2. Basis
vectors that correspond to smoothness assumptions, such as low-order polyno-
mials, are another potentially relevant example. In general, the signal subspace
may be spanned by an arbitrary set of r basis vectors. The calibration coeffi-
cients α and β and the signal subspace S may change over time, but here we
assume they do not change over the course of blind calibration. As we will see,
this is a reasonable assumption, since the network may be calibrated from very
few snapshots.
Let P denote the orthogonal projection matrix onto the orthogonal complement
to the signal subspace S. Then every x ∈ S must satisfy the constraint
Px = P (Y α + β) = 0 (5.2)
This is the key idea behind our blind calibration method. Because the projection
matrix P has rank n−r, the constraint above gives us n−r linearly independent
equations in 2n unknown values (α and β). If we take snapshots from the sensor
network at k distinct times, y1, . . . , yk, then we will have k(n − r) equations in
2n unknowns. For k ≥ 2n/(n − r) we will have more equations than unknowns,
which is a hopeful sign. This observation leads to several basic questions which
we address in this chapter.
1. Is it possible to blindly recover α and β from a sufficient number of un-
calibrated sensor snapshots? Mathematically, this question boils down to
determining whether or not the constraints provide 2n linearly independent
equations.
2. If perfect blind calibration is not possible, then can we achieve a partial cali-
bration from the raw data? Can we improve this partial calibration with a
small amount of additional overhead?
3. How is the recovery affected by sensor noise? Certainly, we cannot expect the
constraint (5.2) to hold perfectly in the presence of noise, so robust versions
of the problem need to be developed.
4. How is the recovery affected by mismodeling in P ? Again, robust versions
of the problem are necessary to cope with cases where the signals are not
perfectly lying in the subspace.
5.2 Blind Calibration
Given k snapshots at different time instants y1, . . . , yk, the subspace constraint
(5.2) results in the following system of k(n − r) equations:
P (Y i α + β) = 0 , i = 1, . . . , k (5.3)
The true gains and offsets must satisfy this equation, but in general the equation
may be satisfied by other vectors as well. Establishing conditions that guarantee
that the true gains and/or offsets are the only solutions is the main theoretical
contribution of the paper [3].
It is easy to verify that the solutions for β satisfy
Pβ = −P Ȳ α (5.4)
where Ȳ = (1/k) ∑_{i=1}^{k} Yi is the time-average of the snapshots. One immediate
observation is that the constraints only determine the components of β (in terms of
the data and α) in the signal “nullspace” (the orthogonal complement to S). The
component of the offset β that lies in the signal subspace is unidentifiable. This
is intuitively very easy to understand. Our only assumption is that the signals
measured by the network lie in a lower dimensional subspace. The component
of the offset in the signal subspace is indistinguishable from the mean or average
signal. Recovery of this component of the offset requires extra assumptions, such
as assuming that the signals have zero mean, or additional calibration resources,
such as the non-blind calibration of some of the sensor offsets. We discuss this
further in Section 5.3.
Given this characterization of the β solutions, we can re-write the constraints
(5.3) in terms of α alone:
P (Yi − Ȳ)α = 0 ,   i = 1, . . . , k (5.5)
If α is a solution to this system of equations, then every vector β satisfying
Pβ = −P Ȳ α is a solution for β in the original system of equations (5.3).
In other words, for a given α, the value of the component of the offset in the
nullspace is −P Ȳ α.
Another simple but very important observation is that there is one degree of
ambiguity in α that can never be resolved blindly using routine sensor measure-
ments alone. The gain vector α can be multiplied by a scalar c, and it cannot
be distinguished whether this scalar multiple is part of the gains or part of the
true signal. We call this scalar multiple the global gain factor. A constraint is
needed to avoid this ambiguity, and without loss of generality we will assume
that α(1) = 1. This constraint can be interpreted physically to mean that we
will calibrate all other sensors to the gain characteristics of sensor 1. The choice
of sensor 1 is arbitrary and is taken here simply for convenience.
If noise, mismodeling effects, or other errors are present in the uncalibrated
sensor snapshots, then a solution to (5.3) or (5.5) may not exist. Robust solutions
are discussed in Section 5.5.1.
5.3 Offset Calibration
The component of the offset in the signal subspace is generally unidentifiable,
but in special cases it can be determined. For example, if it is known that the
phenomenon of interest fluctuates symmetrically about zero (or some other known
value), then the average of many measurements will tend to zero (or the known
mean value). In this situation, the average
(1/k) ∑_{i=1}^{k} yi = ((1/k) ∑_{i=1}^{k} xi − β)/α ≈ −β/α
where the division operation is taken element-by-element. This follows since
(1/k) ∑_{i=1}^{k} xi ≈ 0 for large enough k. Thus we can identify the offset simply by
calculating the average of our measurements. More precisely, we can identify
β̄ = β/α, which suffices since we can equivalently express the basic relationship
(5.1) between calibrated and uncalibrated snapshots as x = (Y + diag(β̄)) α.
Another situation in which we can determine (or partially determine) the
component of the offset in the signal subspace is when we have knowledge of
the correct offsets for a subset of the sensors. We call this partially blind offset
calibration. In [3], we discuss the details of partially blind offset calibration.
Here we summarize by giving the formula for computation. If we let βm be
the vector of known offset calibration coefficients and T be a selection matrix
that selects the columns associated with those coefficients, then we will get the
following equation for a length-r vector of parameters θ:
θ = [TΦ]⁻¹(βm + T P Ȳ α) (5.6)
Recall that the columns of Φ are the basis vectors for the signal subspace.
That is, (I − P ) = ΦΦ′. When we know θ, we can solve for the vector of offset
coefficients β with the equation
θ = Φ′β
Note that in order to solve Equation 5.6, [TΦ] must be invertible. The rank
of ΦT cannot be greater than m, the number of known sensor offsets, which
shows that to completely determine the offset component in the signal subspace
we require at least m = r known offsets. In general, knowing the offsets for
an arbitrary subset of m sensors may not be sufficient (i.e., ΦT may not be
invertible), but there are important special cases when it is. First note that Φ, by
construction, has full rank r. Also note that the selection matrix T selects the m
rows corresponding to the known calibration offsets and eliminates the remaining
n−m rows. So, we require that the elimination of any subset of n−m rows of Φ
does not lead to a linearly dependent set of (m × 1) columns. This requirement
is known as an incoherence condition, and it is satisfied as long as the signal
basis vectors all have small inner products with the natural or canonical sensor
basis (n × 1 vectors that are all zero except for a single non-zero entry). For
example, frequency vectors (e.g., Discrete Fourier Transform vectors) are known
to satisfy this type of incoherence condition [8]. This implies that for subspaces
of bandlimited signals, ΦT is invertible provided m ≥ r.
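This invertibility condition is easy to check numerically. In the following hypothetical NumPy sketch, Φ consists of the r lowest-frequency DFT vectors, and the submatrix obtained by keeping the rows for m = r sensors with known offsets (an arbitrary choice of sensors) turns out to be full rank:

```python
import numpy as np

n, r = 16, 4

# Orthonormal DFT matrix: F[j, m] = exp(-2*pi*i*j*m/n)/sqrt(n).
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
Phi = F[:, :r]                       # n x r lowpass signal-subspace basis

# T selects the rows of Phi for the m sensors with known offsets.
known = [0, 5, 9, 12]                # hypothetical set of m = r sensors
TPhi = Phi[known, :]                 # m x r submatrix

print(np.linalg.matrix_rank(TPhi))   # full rank r: offsets are recoverable
```

For DFT columns the rows of Φ form a Vandermonde system with distinct nodes, which is why any choice of r distinct sensors works here.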
5.4 Gain Calibration
The possibilities for offset calibration are fairly straightforward, as described
above, but conditions that guarantee that the gains can be blindly calibrated
are less obvious. This section theoretically characterizes the existence of unique
solutions to the gain calibration problem. As pointed out in Section 5.2, the gain
calibration problem can be solved independently of the offset calibration task, as
shown in (5.5), which corresponds to simply removing the mean snapshot from
each individual snapshot. Therefore, it suffices to consider the case in which
the snapshots are zero-mean and to assume that Ȳ = 0, in which case the gain
calibration equations may be written as
P Y i α = 0 , i = 1, . . . , k (5.7)
The results we present also hold for the general case in which Ȳ ≠ 0. We first
consider general conditions guaranteeing the uniqueness of the solution to (5.7)
and then look more closely at the special case of bandlimited subspaces.
5.4.1 General Conditions
The following conditions are sufficient to guarantee that a unique solution to (5.7)
exists.
A1. Oversampling: Each signal x lies in a known r-dimensional subspace S,
r < n. Let φ1, . . . , φr denote a basis for S. Then x = ∑_{i=1}^{r} θi φi, for
certain coefficients θ1, . . . , θr.
A2. Randomness: Each signal is randomly drawn from S and has mean zero.
This means that the signal coefficients are zero-mean random variables. The
joint distribution of these random variables is absolutely continuous with
respect to Lebesgue measure (i.e., a joint r-dimensional density function
exists). For any collection of signals x1, . . . , xk, k > 1, the joint distribu-
tion of the corresponding kr coefficients is also absolutely continuous with
respect to Lebesgue measure (i.e., a joint kr-dimensional density function
exists).
A3. Incoherence: Define the nr × n matrix
MΦ = [P diag(φ1) ; . . . ; P diag(φr)] (5.8)
(the r matrices P diag(φi) stacked vertically)
and assume that rank(MΦ) = n−1. Note that MΦ is a function of the basis
of the signal subspace. The matrix P , the orthogonal projection matrix
onto the orthogonal complement to the signal subspace S, can be written
as P = I −ΦΦ′, where I is the n×n identity matrix and Φ = [φ1, . . . , φr].
Assumption A1 guarantees that the calibrated or true sensor measurements
are correlated to some degree. This assumption is crucial since it implies that
measurements must satisfy the constraints in (5.3) and that, in principle, we
can solve for the gain vector α. Assumption A2 guarantees the signals are not
too temporally correlated (e.g., different signal realizations are non-identical with
probability 1). Also, the zero-mean assumption can be removed, as long as one
subtracts the average from each sensor reading. Assumption A3 essentially guar-
antees that the basis vectors are sufficiently incoherent with the canonical sensor
basis, i.e., the basis that forms the columns of the identity matrix. It is easy
to verify that if the signal subspace basis is coherent with the canonical basis,
then rank(MΦ) < n − 1. Also, note that MΦ1 = 0, where 1 = [1, . . . , 1]′, which
implies that rank(MΦ) is at most n−1. In general, assumption A3 only depends
on the assumed signal subspace and can be easily checked for a given basis. In
our experience, the condition is satisfied by most signal subspaces of practical
interest, such as lowpass, bandpass or smoothness subspaces.
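Assumption A3 can be checked numerically for a given basis by forming MΦ and computing its rank. A hypothetical NumPy sketch for a small lowpass subspace (constant, one cosine, one sine), where the rank comes out to n − 1 and A3 holds:

```python
import numpy as np

n, r = 12, 3

# Hypothetical lowpass basis: constant, lowest cosine, lowest sine.
t = np.arange(n)
Phi = np.column_stack([np.ones(n),
                       np.cos(2 * np.pi * t / n),
                       np.sin(2 * np.pi * t / n)])
Phi, _ = np.linalg.qr(Phi)              # orthonormalize the basis
P = np.eye(n) - Phi @ Phi.T             # projection onto the complement of S

# Stack the P diag(phi_i) blocks into the nr x n matrix M_Phi of (5.8)
M = np.vstack([P @ np.diag(Phi[:, i]) for i in range(r)])

print(np.linalg.matrix_rank(M))         # A3 holds iff this equals n - 1
```

The all-ones vector always lies in the nullspace of MΦ, so n − 1 is the largest rank possible; the check confirms it is attained for this basis.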
Theorem 1 Under assumptions A1, A2 and A3, the gains α can be perfectly
recovered from any k ≥ r signal measurements by solving the linear system of
equations (5.5).
This theorem and the following proof demonstrate that the gains are identi-
fiable from routine sensor measurements; that is, in the absence of noise or other
errors, the gains are perfectly recovered. In fact, the proof shows that under A1
and A2, the condition A3 is both necessary and sufficient. When noise and er-
rors are present, the estimated gains may not be exactly equal to the true gains.
However, as the noise/errors in the measurements tend to zero, the estimated
gains tend to the true gains.
Proof First note that the case where the signal subspace is one-dimensional
(r = 1) is trivial. In this case there is one degree of freedom in the signal, and
hence one measurement coupled with the constraint that α(1) = 1 suffices to
calibrate the system. For the rest of the proof we assume that 1 < r < n and
thus 2 ≤ k < n.
Given k signal observations y1, . . . , yk, and letting α represent our estimated
gain vector, we need to show that the system of equations
[P Y1 ; . . . ; P Yk] α = 0 (5.9)
has rank n − 1, and hence may be solved for the n − 1 degrees of freedom in α.
Note each subsystem of equations, PY j, has rank less than or equal to n−r (since
P is rank n − r). Therefore, if k < (n − 1)/(n − r), then the system of equations certainly
has rank less than n − 1. This implies that it is necessary that k ≥ (n − 1)/(n − r). Next note
that Y j = XjA, where Xj = diag(xj) and A = diag([1, 1/α(2), . . . , 1/α(n)]′).
Then write
[P X1 ; . . . ; P Xk] d = 0 (5.10)
where d = Aα. The key observation is that satisfaction of these equations
requires that Xjd ∈ S, for j = 1, . . . , k. Any d that satisfies this relationship
will imply a particular solution for α, and thus d must not be any vector other
than the all-ones vector for blind calibration to be possible.
Recall that by definition Xj = diag(xj). Also note that diag(xj)d = diag(d)xj.
So we can equivalently state the requirement as
diag(d)xj ∈ S, j = 1, . . . , k . (5.11)
The proof proceeds in two steps. First, A2 implies that k ≥ r signal observa-
tions will span the signal subspace with probability 1. This allows us to re-cast
the question in terms of a basis for the signal subspace, rather than particular
realizations of signals. Second, it is shown that A3 (in terms of the basis) suffices
to guarantee that the system of equations has rank n − 1.
Step 1: We will show that all solutions to (5.11) are contained in the set
D = {d : diag(d)φi ∈ S, i = 1, . . . , r}.
We proceed by contradiction. Suppose that there exists a vector d that satisfies
(5.11) but does not belong to D. Since d satisfies (5.11), we know that there
exists an x ∈ S such that diag(d)x ∈ S. We can write x in terms of the basis, as
x = ∑_{i=1}^{r} θi φi, and diag(d)x = ∑_{i=1}^{r} θi diag(d)φi. Since by assumption d does
not satisfy diag(d)φi ∈ S, i = 1, . . . , r, it follows that the coefficients θ1, . . . , θr
must weight the components outside of the signal subspace so that they cancel
out. In other words, the set of signals x ∈ S that satisfy diag(d)x ∈ S is a
proper subspace (of dimension less than r) of the signal subspace S. However, if
we make k ≥ r signal observations, then with probability 1 they collectively span
the entire signal subspace (since they are jointly continuously distributed). In
other words, the probability that all k measurements lie in a lower dimensional
subspace of S is zero. Thus, d cannot be a solution to (5.11).
Step 2: Now we characterize the set D. First, observe that the vectors d ∝ 1,
the constant vector, are contained in D, and those correspond to the global gain
factor ambiguity discussed earlier. Second, note that every d ∈ D must satisfy
P diag(d)φi = P diag(φi)d = 0, i = 1, . . . , r, where P denotes the projection
matrix onto the orthogonal complement to the signal subspace S. Using the
definition of MΦ given in (5.8), we have the following equivalent condition: every
d ∈ D must satisfy MΦd = 0. We know that the vectors d ∝ 1 satisfy this
condition. The condition rank(MΦ) = n − 1 guarantees that these are the only
solutions. This completes the proof.
5.4.2 Bandlimited Subspaces
In the special case in which the signal subspace corresponds to a frequency domain
subspace, a slightly more precise characterization is possible which shows that
even fewer snapshots suffice for blind calibration. As stated earlier, assumption
A3 is often met in practice and can be easily checked given a signal basis Φ. One
case where A3 is automatically met is when the signal subspace is spanned by a
subset of the Discrete Fourier Transform (DFT) vectors:
φm = (1/√n) [1, e^{−i2πm/n}, . . . , e^{−i2π(n−1)m/n}]′,   m = 0, . . . , n − 1
In this case we are able to show in [3] that only ⌈(n−1)/(n−r)⌉ + 1 snapshots are required.
This can be significantly less than r, meaning that the time over which we must
assume that the subspace and calibration coefficients are unchanging is greatly
reduced. See [2] for the necessary assumptions, which are very similar to the
assumptions for general identifiability, and the accompanying proof.
5.5 Evaluation
In order to evaluate whether this approach to blind calibration is feasible in practice,
we explore its performance in simulation under both measurement noise and the
mis-characterization of the projection matrix P . Additionally, we show the per-
formance of the algorithm on two temperature sensor datasets, one dataset from a
controlled experiment where the sensors are measuring all the same phenomenon
and thus lie in a 1-dimensional subspace, and the other from a deployment in
a valley at a nature preserve called the James Reserve1, where the true dimen-
sion of the spatial signal is unknown. First, we discuss the technical tools for
implementation of robust blind calibration.
1http://www.jamesreserve.edu
5.5.1 Robust Estimation
Blind calibration is simply a problem of solving the linear system of equations in
(5.5). If noise, mismodeling effects, or other errors are present in the uncalibrated
sensor snapshots, then a solution to (5.5) may not exist. There are many methods
for finding the best possible solution, and we employ singular value decomposition
and standard least squares techniques.
First, note that the constraints can be expressed as
C α = 0 (5.12)
where the matrix C is given by
C = [P (Y1 − Ȳ) ; . . . ; P (Yk − Ȳ)] (5.13)
In the ideal case, there is always at least one solution to the constraint Cα = 0,
since the true gains must satisfy this equation. On the other hand, if the sensor
measurements contain noise or if the assumed calibration model or signal subspace
is inaccurate, then a solution may not exist. That is, the matrix C may have
full column rank and thus will not have a right nullspace. A reasonable robust
solution in such cases is to find the right singular vector of C associated with the
smallest singular value. This vector is the solution to the following optimization.
α̂ = arg min_α ‖Cα‖₂² (5.14)
In other words, we find the vector of gains such that Cα is as close to zero as
possible. This vector can be efficiently computed in numerical computing envi-
ronments, such as Matlab, using the economy size singular value decomposition
(svd)2. Note that in the ideal case (no noise or error) the svd solution satisfies
(5.5). Thus, this is a general-purpose solution method.
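The following hypothetical NumPy sketch mirrors this SVD procedure (the thesis uses Matlab's svd(C, 0)): noiseless snapshots are drawn from an assumed lowpass subspace, the constraint matrix C is assembled from the mean-removed snapshots, and the gains are recovered up to the global gain factor from the last right singular vector.

```python
import numpy as np

rng = np.random.default_rng(7)
n, r, k = 12, 3, 8

# Hypothetical signal subspace S: constant, lowest cosine, lowest sine.
t = np.arange(n)
Phi, _ = np.linalg.qr(np.column_stack(
    [np.ones(n), np.cos(2 * np.pi * t / n), np.sin(2 * np.pi * t / n)]))
P = np.eye(n) - Phi @ Phi.T

alpha = rng.uniform(0.5, 2.0, size=n)
X = Phi @ rng.normal(size=(r, k))      # k calibrated zero-mean snapshots in S
Y = X / alpha[:, None]                 # uncalibrated readings (beta = 0 here)

# Stack the constraints P (Y_i - Ybar) alpha = 0 into C alpha = 0
Ybar = Y.mean(axis=1)
C = np.vstack([P @ np.diag(Y[:, i] - Ybar) for i in range(k)])

# Robust solution: right singular vector with the smallest singular value
_, _, Vt = np.linalg.svd(C)
alpha_hat = Vt[-1]
alpha_hat = alpha_hat / alpha_hat[0]   # resolve the global gain: alpha(1) = 1

print(np.allclose(alpha_hat, alpha / alpha[0], atol=1e-6))
```

With no noise the smallest singular value is zero and the recovery is exact; with noise the same singular vector gives the minimizer of (5.14).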
Blind calibration of the gains can also be implemented by solving a system of
equations in a least-squares sense as follows. Recall that we have one constraint on
our gain vector, α(1) = 1. This can be interpreted as knowing the gain coefficient
for the first sensor. We can use this knowledge as an additional constraint on the
solution. If we let c₁, …, cₙ be the columns of C, let ᾱ be the gain vector with
α(1) removed, and let C̄ be the matrix C with the first column removed, we can
rewrite the system of equations as C̄ᾱ = −c₁. The robust solution is the value
of ᾱ that minimizes the LS criterion ‖C̄ᾱ + c₁‖₂².
More generally, we may know several of the gain coefficients, for what we call
partially blind calibration. Let h be the sum of the α(i)cᵢ corresponding to the
known gains, let ᾱ be the gain vector with the known gains α(i) removed, and
let C̄ be the matrix C with those columns cᵢ removed. Now we have C̄ᾱ = −h
and the robust solution is the minimizer of

‖C̄ᾱ + h‖₂²    (5.15)
We can solve this optimization in a numerically robust manner by avoiding the
squaring of the matrix C̄ that is implicit in the conventional LS solution,
ᾱ = (C̄′C̄)⁻¹C̄′(−h). This “squaring” effectively worsens the condition number of
the problem and can be avoided by using QR decomposition techniques.3
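To see the effect of the squaring in numbers, consider the following sketch; the matrix and its conditioning are invented for illustration. `numpy.linalg.lstsq`, like Matlab's backslash, solves the problem through an orthogonal factorization rather than by forming C′C.

```python
import numpy as np

rng = np.random.default_rng(0)
# A hypothetical ill-conditioned constraint matrix and a consistent system.
C = rng.standard_normal((200, 50)) @ np.diag(np.logspace(0, -4, 50))
a_true = rng.standard_normal(50)
h = -C @ a_true                    # chosen so that C a = -h holds exactly

# Forming C'C squares the condition number of the problem:
print(np.linalg.cond(C) ** 2)      # roughly the conditioning of C'C

# An orthogonal-factorization solve avoids the squaring entirely:
a_hat, *_ = np.linalg.lstsq(C, -h, rcond=None)
```

With a condition number near 10⁸ after squaring, the normal-equations route loses roughly half the available floating-point digits, while the factorization-based solve does not.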
5.5.2 Simulations
To test the blind calibration methods on simulated data, we simulated both
a field and snapshots of that field. We generated gain and offset coefficients,
2. The Matlab command is svd(C, 0).
3. We used ᾱ = C̄\(−h) in Matlab.
Figure 5.1: Two example simulated square fields. On the left, a 256 × 256 field
generated with a basic smoothing kernel, which represents a true continuous field.
On the right, an 8 × 8 grid of measurements of the same field. The fields can
be quite dynamic and still meet the assumptions for blind calibration. The fields
are shown in pseudocolor, with red denoting the maximum valued regions and
blue denoting the minimum valued regions.
measurement noise, and most importantly, a projection matrix P .
We simulated a smooth field by generating a 256 × 256 array of pseudorandom
Gaussian noise (i.e., a white-noise field) and then convolving it with the smooth
impulse response function h(i, j) = e^(−s((i−l/2)² + (j−l/2)²)), s > 0, where
l = 256 is the side length of the field. Figure 5.1 shows an example field with the
smoothing parameter s = 1, which could represent a smoothly varying temperature
field, for example. We simulated sensor measurements by sampling the field on a
uniform 8 × 8 grid of n = 64 sensors. For gains, we drew uniformly from
α ∈ [0.5, 1.5], and for offsets from β ∈ [−0.5, 0.5]. After applying α and β to the
measurements, we then added Gaussian noise with mean zero and variance σ.
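The simulation just described can be sketched as follows. This is an illustrative reconstruction, not the original code, and the convention y = αx + β for applying the coefficients is our assumption; the thesis does not spell out the direction.

```python
import numpy as np

rng = np.random.default_rng(0)
l, s = 256, 1.0

# White-noise field convolved with h(i,j) = exp(-s((i-l/2)^2 + (j-l/2)^2)).
white = rng.standard_normal((l, l))
idx = np.arange(l)
h = np.exp(-s * ((idx[:, None] - l / 2) ** 2 + (idx[None, :] - l / 2) ** 2))
# Circular convolution via the FFT; ifftshift recenters the kernel at the origin.
field = np.fft.ifft2(np.fft.fft2(white) * np.fft.fft2(np.fft.ifftshift(h))).real

# Sample the field on a uniform 8x8 grid of n = 64 sensors.
step = l // 8
x = field[step // 2::step, step // 2::step].ravel()

# Draw per-sensor gains and offsets, apply them, then add measurement noise.
alpha = rng.uniform(0.5, 1.5, x.size)
beta = rng.uniform(-0.5, 0.5, x.size)
sigma = 1e-3                      # noise variance
y = alpha * x + beta + np.sqrt(sigma) * rng.standard_normal(x.size)
```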
Separately, we created P to be a low-pass DFT matrix. We kept 3 frequencies
in 2d, which means with symmetries we have an r = 49-dimensional subspace.4
With this setup, we can adjust the parameters of the smoothing kernel, while
keeping P constant, to test robustness of blind calibration to an assumed sub-
space model that may over- or under-estimate the dimension of the subspace of
the true field. The smoothing kernel and projection P both characterize lowpass
effects, but the smoothing operator is only approximately described by the pro-
jection operator, even in the best case. We can also create our field by projecting
the random field onto the r-dimensional subspace using P ; this represents the
case where the true subspace is known exactly.
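A low-pass DFT projection of this kind can be constructed explicitly. Below is a numpy sketch (our own, matching the setup rather than the thesis's code) for the 8 × 8 grid, keeping frequencies 0, ±1, ±2, ±3 in each dimension, which gives the (2p + 1)² = 49 dimensions noted above.

```python
import numpy as np

l, p = 8, 3
keep = np.zeros(l)
keep[np.r_[0:p + 1, l - p:l]] = 1       # frequencies 0, ±1, ..., ±p (indices mod l)
mask = np.outer(keep, keep).ravel()     # (2p+1)^2 = 49 of the 64 2-D frequencies

F = np.fft.fft(np.eye(l)) / np.sqrt(l)  # unitary 1-D DFT matrix
F2 = np.kron(F, F)                      # 2-D DFT acting on the flattened grid
P = ((F2.conj().T * mask) @ F2).real    # F2^H diag(mask) F2; symmetric mask => real
```

Because the kept frequency set is symmetric under negation, the projection matrix is real and symmetric, as a projection of real-valued fields must be.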
Estimates of the gains and offsets were calculated using the methods discussed
above and described in more detail below. For all the results, we calculated the
average error per sensor in the estimate α̂, and similarly in the estimate β̂, as
follows:

errα = ‖α̂ − α‖₂² / n    (5.16)

In order to interpret the error results, keep in mind the ranges of α and β.
For gain, a 1% error will be approximately 10−2, and a 1% error in offset will be
approximately 10−3.
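Equation (5.16) is simple enough to state as code; the helper name is ours:

```python
import numpy as np

def avg_error_per_sensor(est, true):
    """Squared-error-per-sensor metric from equation (5.16)."""
    est, true = np.asarray(est, float), np.asarray(true, float)
    return np.sum((est - true) ** 2) / true.size
```

For example, a uniform 0.01 gain error across all n = 64 sensors evaluates to 10−4 under this metric.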
5.5.2.1 Error Results using SVD
We simulated blind calibration with the described simulation set-up. We first
generated mean-zero fields using our smoothing kernel and took snapshot
measurements of each field. We used k = 3r snapshots (three times the theoretical
minimum of k = r) for added robustness to noise and modeling errors.
4. If the 2-dimensional signal has p frequencies, then the subspace is of rank r = (2p + 1)².
[Figure: left, “Gain Error in Noise”; right, “Offset Error in Noise”. Both plot average error per sensor against noise variance on a log scale, with mean and median curves.]
Figure 5.2: Gain and offset error performance with exact knowledge of P and
increasing measurement noise. The results show the mean and median error over
100 simulation runs.
[Figure: left, “Gain Error v. Modeling Error”; right, “Offset Error v. Modeling Error”. Both plot average error per sensor against the amount of signal outside of the subspace, with mean and median curves.]
Figure 5.3: Gain and offset error performance for mismodeled P and zero mea-
surement noise. The results show the mean and median error for 100 simulation
runs.
Then we constructed the matrix C from equation (5.13) and took the right
singular vector associated with the smallest singular value as the estimate α̂ of the
gains, as described in Section 5.5.1. We then estimated the offsets as β̂ = −Ȳα̂.
Results from totally blind calibration in simulation are shown in Figures 5.2
and 5.3. Figure 5.2 shows error in gain and offset estimates under the
burden of increasing noise variance using exact knowledge of the subspace defined
by P . That is, the fields in these simulations were created by projecting random
signals into the space defined by projection matrix P . The maximum value in
the signals was 1, and therefore the noise variance can be taken as a percentage;
i.e., variance of 10−2 represents 1% noise in the signal. The blind calibration
performed very well in this scenario; at 1% noise the gain estimation error was
less than 1 × 10−4 and the offset estimation error was less than 2.4 × 10−3. The
figure shows mean and median error over 100 simulation runs.
Knowing the true subspace exactly is possible in practice only when perform-
ing blind calibration in a very well-known environment, such as an indoor factory.
Even in this case, there will be some component of the true signals which is out-
side of the subspace defined by the chosen P . Figure 5.3 shows how gain and
offset error are affected by out-of-subspace components in the true signals. We
used a basic smoothing kernel to control smoothness of the true field and kept P
constant as described above with r = 49. The smoothing kernel and the projec-
tion operator are both low-pass operators, but even in the best case, some of the
smoothed field will be outside of the space defined by the projection matrix P .
We defined the error in P as ‖x − Px‖2/‖x‖2. The x-axis value in the figure is
the average error in P over 100 random fields smoothed with a given smoothness
parameter. The figure shows mean and median error in gain and offset estimates
over these 100 simulation runs. Again the results are compelling. The gain
estimation error was around 10−2 even when 10% of the signal was outside of the
subspace. The offset estimation was also still very accurate, below 7 × 10−3
even when 20% of the signal was outside of the subspace.

[Figure: mean gain error per sensor versus noise variance, both on log scales, for LS, Partial Blind, and SVD.]
Figure 5.4: Gain error performance for SVD, blind LS, and partially blind LS.
Results show mean error over 50 simulation runs.
5.5.2.2 Comparison of Techniques
Here we compare the SVD technique to the LS technique, and totally blind
calibration to partially blind calibration, where we know some of the calibration
coefficients ahead of time. To be completely explicit, we describe each approach
here.

Totally blind SVD (or SVD) performs gain estimation using the right singular
vector of the svd associated with the smallest singular value, and normalizes
assuming α(1) = 1. Offsets are then estimated using β̂ = −Ȳα̂. Totally blind LS
(or LS) performs gain estimation by solving equation (5.15) in a least-squares
sense, assuming knowledge only of α(1) = 1. Offsets are estimated as in SVD.
Partially blind LS (or partial blind)
performs gain estimation by again solving equation (5.15) in the least-squares
sense, but now assuming we know at least r of the true gain values. Offsets are
then estimated as described in Section 5.3 for non-zero-mean signals, i.e. using
β∆ = TΦΦ′β to solve for θ = Φ′β and thus β.

[Figure: two plots of average error per sensor versus noise variance, both on log scales, for LS, Partial Blind, and SVD: “Offset Error for 0-Mean Signals” (top) and “Offset Error for Non-0-Mean Signals” (bottom).]
Figure 5.5: Offset error performance for SVD, blind LS, and partially blind LS.
The top graph shows offset error for zero-mean signals, and the bottom graph is
for non-zero-mean signals. Results show mean error over 50 simulation runs.
For partially blind LS, we use enough of the true offsets that we can
solve for the complete component of β in the signal subspace. The fields we
simulated are nearly bandlimited, and so the theory implies that
r true offsets are enough to estimate β. In order to be robust to noise, we
used knowledge of the offsets of r + 5 sensors, again slightly more than the bare
minimum suggested by the theory.
A comparison of the techniques is quite interesting. First, as we expect, the
partially blind estimation does better than the other two methods in all cases;
this follows from the fact that it uses more information. In Figure 5.4 one can
see that in gain estimation, the SVD method outperforms totally blind LS, but
partially blind LS has the lowest error of all the methods.
In the case of offset error, the SVD and totally blind LS techniques out-
perform one another depending on the noise variance and whether or not the
signals are zero-mean. Figure 5.5 shows offset error for all three techniques. The
partially blind LS method is unaffected by non-zero mean signals, which follows
because the method for estimating the offsets does not change with a zero-mean
assumption. The other methods, on the other hand, capture the mean signal
as part of their offset estimates, and as we can see, estimation error using the
non-zero-mean signals is higher than using zero-mean signals.
The most intriguing part of these results is that totally blind LS performs
slightly better than SVD for the offset estimate with non-zero-mean signals, despite
the fact that it uses a gain estimate with more error from the first step in
order to estimate the offsets. This implies that if the offset is the most
important calibration parameter for a system which deals with non-zero-mean
signals, one might prefer the totally blind LS method over the SVD.
5.5.3 Evaluation on Sensor Datasets
We evaluate blind calibration on two sensor network datasets, which we call the
calibration dataset and the cold air drainage transect dataset.
5.5.3.1 Calibration Dataset
The calibration dataset was collected in September 2005 [4] along with data from
a reference-caliber instrument in order to characterize the calibration of the ther-
mistors used for environmental temperature measurement at the James Reserve.
From the experiment, the conclusion was drawn that after the factory-supplied
calibration was applied to the raw sensor measurements, the sensors differed from
the reference thermocouple linearly, i.e. by only a gain and offset. Thus these
sensors are suitable for evaluating the work we have done thus far on blind
calibration. The data is available in the NESL CVS repository.5

[Figure: top, optimal and estimated gain and offset values versus sensor id; bottom, true temperature (dashed) plotted with the uncalibrated and calibrated data (solid), in degrees C versus time index.]
Figure 5.6: Results of blind calibration on the calibration dataset.
The setup of this experiment consisted of nine temperature sensors.6 These
sensors were placed in a styrofoam box along with a thermocouple attached to a
datalogger, providing ground-truth temperature readings. Therefore, all sensors
were sensing the same phenomenon, and so the subspace spanned by the nine
measurements is rank one. Thus, for P we used a lowpass dct matrix which kept
only the dc component. To illustrate, we used the following commands in
Matlab:
r = 1; n = 9;        % rank-one subspace, nine sensors
I = eye(n);
U = dct(I);          % the DCT basis as a matrix
U(r+1:n,:) = 0;      % zero all but the first (DC) row
P = idct(U);         % P projects onto the DC subspace
We calibrated these data using snapshots from the dataset and the SVD
method. To get the gain calibration factors, we normalized to the gain charac-
teristic of the groundtruth sensor. Figure 5.6 shows the calibration coefficient
5. This data is available at http://www.ee.ucla.edu/~sunbeam/bc/
6. The experiment had ten sensors, one of which was faulty. In this analysis we used data from the nine functional sensors.
[Figure: three-dimensional plot titled “Sensor Locations for Cold Air Drainage”, showing mote positions with altitude on the vertical axis.]
Figure 5.7: The mica2 motes in the cold air drainage transect run down the side
of a hill and across a valley. The mote locations pictured are those that we used.
estimates and reconstructed signals for the sensors in the experiment. The gains
and offsets were recovered with very little error. The uppermost plot shows the
true and estimated gains and offsets. The lower plot shows the data before and
after calibration, along with the ground truth measurement in blue. This clearly
demonstrates the utility of blind calibration.
5.5.3.2 Cold Air Drainage Dataset
The cold air drainage transect dataset consists of data from an ongoing deploy-
ment at the James Reserve. The deployment measures air temperature and
humidity in a valley in order to characterize the predawn cold air drainage. The
sensors used are the same as the sensors in the calibration dataset, and thus again
the factory calibration brings them within an offset and gain of one another. The
data we used for evaluation is from November 2, 2006, and it is available in
the sensor data repository called SensorBase7. On this same day, we visited the
James Reserve with a reference-caliber sensor and took measurements over the
7. http://sensorbase.org
course of the day in order to get the true calibration parameters for comparison.
The deployment consists of 26 mica2 motes which run from one side of a
valley to the other (Figure 5.7) across a streambed and in various regions of tree
and mountain shade. Each mote has one temperature and one humidity sensor.
For our purposes, we collected calibration coefficients from 10 of the temperature
sensors.
The signal subspace in this application does not correspond to a simple low-
pass or smooth subspace, since sensors at similar elevations may have similar
readings, but can be quite distant from each other. In principle, the signal sub-
space could be constructed based on the geographic positions and elevations of
the sensor deployment. However, since we have the calibrated sensor data in
this experiment, we can use these data directly to infer an approximate signal
subspace. We constructed the projection P using the subspace associated with
the four largest singular values of the calibrated signal data matrix.
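A data-driven projection of this kind is a small computation. The following sketch (the function name is ours) takes a calibrated sensors-by-snapshots data matrix and builds the rank-4 projection from its top left singular vectors.

```python
import numpy as np

def empirical_projection(X, r=4):
    """Projection onto the span of the top-r left singular vectors of X,
    where X is a (sensors x snapshots) matrix of calibrated data."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Ur = U[:, :r]
    return Ur @ Ur.T
```

Any snapshot drawn from the same signal subspace is left unchanged by this projection, which is exactly the property the blind calibration constraints exploit.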
We performed totally blind calibration using SVD. We constructed C using
64 snapshots taken over the course of the morning along with P as described.
Figure 5.8 shows the results. The gain error was very small, only 0.0053 on
average per sensor, whereas if we were to assume the gain was 1 and not calibrate
the sensors at all, the error would be 0.0180 on average per sensor. On the other
hand, the offset error was only slightly better with blind calibration than it would
have been without: we saw 0.3953 average error per sensor as compared to 0.4610 error
if the offsets were assumed to be zero. We believe that the offset estimation did
not perform well due primarily to the fact that the mean signal is not zero in
this case (e.g., the average sensor readings depend on elevation). Better offset
estimates could be obtained using knowledge of one or more of the true sensor
offset values.
[Figure: optimal and estimated gain values (top) and offset values (bottom) plotted against sensor id.]
Figure 5.8: True and estimated gains and offsets for the cold air drainage transect
dataset.
5.6 Future Work and Discussion
There are many issues in blind calibration that could be explored further. The
two main areas ripe for study are the choice of the subspace P and the implemen-
tation of blind calibration. There are many possible choices for a suitable sub-
space, including frequency subspaces and smoothness subspaces. How to choose
the subspace when faced with a sensor deployment where the true signals are un-
known is an extremely important question for blind calibration. Methodologies
for creating a P would be extremely useful to the more general application of
blind calibration, especially ones which could incorporate trusted measurements
or the users’ knowledge of the physical space where the sensors are deployed. At
the same time, implementations of blind calibration that are robust to model
error in the subspace would allow users to be more liberal in the choice of P .
The theoretical analysis in this chapter was done under noiseless conditions and
with a perfect model. Future work includes both analysis under noise, to derive
analytical bounds that can be compared to simulation results, and sensitivity
analysis for our system of linear equations. Our experience is that solutions are
robust to noise and mismodeling in some cases and sensitive in others; we do not
yet have a good understanding of the robustness of the methodology.
Extending the formulation to handle non-linear calibration functions would
be useful in cases where a raw non-linear sensor response must be calibrated. We
believe that many of the techniques developed in this chapter can be extended
to more general polynomial-form calibration functions. Other interesting topics
include distributed blind calibration and blind calibration in the presence of faulty
sensors.
CHAPTER 6
Conclusions
The work presented in this thesis takes a first step toward addressing issues of fault
and calibration in sensor networks. In Chapter 3, we presented typical faults in
sensor networks and a general model which encompasses those faults well. We
used instances of that model to assess the impact of faults on popular aggregation
functions in sensor networks. The model provides great flexibility in representing
the typical faults in sensor networks. We hope this model gives practitioners a
way to assess the impact of faults on their sensor networks, which can in turn
facilitate more robust design principles.
Chapter 4 presented an approach to in-situ blind calibration based on state-space
models. Any phenomenon of interest that is dynamic and can be represented
by a dynamical system of equations can be tracked, and measurements
can be assimilated, using adaptive filters. With Monte Carlo methods, even
non-linear, non-Gaussian systems can be tracked. We showed that two Monte Carlo
filtering algorithms, the Ensemble Kalman Filter and the SIR Particle Filter,
show promise for estimating both gain and offset calibration parameters. In
preliminary implementations of both algorithms, with the calibration parameters
included in the estimated state space, we obtained the expected results: better
information improved our estimates, and even when our information was uncertain
the error remained small. The update step for the calibration parameter
estimates is only a first heuristic implementation, and finding the correct update
is a difficult problem; it will involve fundamentally reworking how the filter
updates operate. However, the promise shown by our initial heuristic is good
motivation to pursue this work.
The blind calibration work in Chapter 5 presented a very general approach to
the problem with a single assumption: that the sensor measurements lie in a
subspace. The formulation and methods developed in this chapter used only
routine sensor measurements; thus we believe they give an extremely promising
formulation for the mass calibration of sensors. We showed that the calibration
gains are identifiable. We proved how many measurements are necessary and
sufficient to estimate the gain factors, and we showed necessary and sufficient
conditions to estimate the offsets. Overall, this work demonstrates, in a very
general way, that blind calibration has great potential in practice. The major
open issue is that the model for the measurements' subspace is crucial to the
entire formulation. One approach is an iterated solution in which progressively
better knowledge of the subspace is attained. Alternatively, if the methods used
for implementation are robust to model error, then a slightly wrong subspace can
still be used for calibration. The formulation is presented in such a way that we
can carefully analyze the sensitivity of the method to particular subspaces, and
understand in what ways the true subspace can lie outside of the assumed
subspace while still allowing blind calibration.
Our final and overwhelming conclusion is that there remains a lot of work to
be done! This thesis presented first steps, and there are certainly several new
approaches to both fault detection and calibration that have yet to be explored.
Data coming from sensor networks is simply not of high enough quality for many
of the envisioned applications, and this fact cannot be swept under the rug. For
any sensor network to be a truly useful and inexpensive technology, it needs to
be robust to fault and calibration errors.
References
[1] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, February 2002.
[2] L. Balzano and R. Nowak. Blind calibration. Technical Report TR-UCLA-NESL-200702-01, Networked and Embedded Systems Laboratory, February 2007.
[3] L. Balzano and R. Nowak. Blind calibration in sensor networks. In Proceedings of Information Processing in Sensor Networks (IPSN), 2007.
[4] L. Balzano, N. Ramanathan, E. Graham, M. Hansen, and M. B. Srivastava. An investigation of sensor integrity. Technical Report UCLA-NESL-200510-01, Networked and Embedded Systems Laboratory, 2005.
[5] P. Buonadonna, D. Gay, J. Hellerstein, W. Hong, and S. Madden. TASK: Sensor network in a box. Technical Report IRB-TR-04-021, Intel Research Berkeley, January 2005.
[6] M. Bushnell and V. Agrawal. Essentials of Electronic Testing for Digital, Memory, and Mixed-Signal VLSI Circuits. Springer, 2000.
[7] V. Bychkovskiy, S. Megerian, D. Estrin, and M. Potkonjak. A collaborative approach to in-place sensor calibration. In 2nd International Workshop on Information Processing in Sensor Networks, pages 301–316, 2003.
[8] E. J. Candes and J. Romberg. Quantitative robust uncertainty principles and optimally sparse decompositions. Foundations of Computational Mathematics, 2006.
[9] E. Elnahrawy and B. Nath. Context-aware sensors. In Proceedings of the European Conference on Wireless Sensor Networks, pages 77–93, 2004.
[10] G. Evensen. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynamics, 53(4):343–367, 2003.
[11] J. Feng, S. Megerian, and M. Potkonjak. Model-based calibration for sensor networks. Sensors, pages 737–742, October 2003.
[12] S. Ganeriwal and M. B. Srivastava. Reputation-based framework for high integrity sensor networks. In Proceedings of SASN ’04, October 2004.
[13] G. Harikumar and Y. Bresler. Perfect blind restoration of images blurred by multiple filters: Theory and efficient algorithms. IEEE Transactions on Image Processing, 8(2):202–219, February 1999.
[14] D. J. Hill and B. S. Minsker. Automated fault detection for in-situ environmental sensors. In Proceedings of the 7th International Conference on Hydroinformatics, 2006.
[15] B. Hoadley. A Bayesian look at inverse linear regression. Journal of the American Statistical Association, 65(329):356–369, March 1970.
[16] A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley-Interscience, New York, 2001.
[17] A. Ihler, J. Fisher, R. Moses, and A. Willsky. Nonparametric belief propagation for self-calibration in sensor networks. In Proceedings of the Third International Symposium on Information Processing in Sensor Networks, 2004.
[18] R. Isermann. Fault-Diagnosis Systems: An Introduction from Fault Detection to Fault Tolerance. Springer, 2005.
[19] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli. On-line fault detection of sensor measurements. IEEE Sensors, pages 974–980, October 2003.
[20] L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems, 4(3):382–401, July 1982.
[21] L. B. Larkey, L. M. A. Bettencourt, and A. A. Hagberg. In-situ data quality assurance for environmental applications of wireless sensor networks. Technical Report LA-UR-06-1117, Los Alamos National Laboratory, 2006.
[22] K. Marzullo. Tolerating failures of continuous-valued sensors. ACM Transactions on Computer Systems, 8(4):284–304, 1990.
[23] M. Rabbat, R. Nowak, and J. Bucklew. Generalized consensus computation in networked systems with erasure links. In Proceedings of the IEEE Workshop on Signal Processing Advances in Wireless Communications, June 2005.
[24] N. Ramanathan, L. Balzano, M. Burt, D. Estrin, T. Harmon, C. Harvey, J. Jay, E. Kohler, S. Rothenberg, and M. Srivastava. Rapid deployment with confidence: Calibration and fault detection in environmental sensor networks. Technical Report 62, Center for Embedded Networked Sensing, 2006.
[25] N. Ramanathan, K. Chang, R. Kapur, L. Girod, E. Kohler, and D. Estrin. Sympathy for the sensor network debugger. In Proceedings of the 3rd International Conference on Embedded Networked Sensor Systems (SenSys), pages 255–267, 2005.
[26] N. Ramanathan, T. Schoellhammer, D. Estrin, M. Hansen, T. Harmon, E. Kohler, and M. Srivastava. The final frontier: Embedding networked sensors in the soil. Technical Report 68, Center for Embedded Networked Sensing, November 2006.
[27] A. Shah and G. Ramakrishnan. FDDI: A High Speed Network. Prentice Hall, 1993.
[28] O. Shalvi and E. Weinstein. New criteria for blind deconvolution of non-minimum phase systems (channels). IEEE Transactions on Information Theory, 36(2):312–321, March 1990.
[29] R. Szewczyk, J. Polastre, A. Mainwaring, and D. Culler. Lessons from a sensor network expedition. In Proceedings of the 1st European Workshop on Wireless Sensor Networks, pages 307–322, January 2004.
[30] C. Taylor, A. Rahimi, J. Bachrach, H. Shrobe, and A. Grue. Simultaneous localization, calibration, and tracking in an ad hoc sensor network. In Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, pages 27–33, 2006.
[31] G. Tolle, J. Polastre, R. Szewczyk, D. Culler, N. Turner, K. Tu, S. Burgess, T. Dawson, P. Buonadonna, D. Gay, and W. Hong. A macroscope in the redwoods. In Proceedings of SenSys, 2005.
[32] D. Wagner. Resilient aggregation in sensor networks. In Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks, pages 78–87, 2004.
[33] K. Whitehouse and D. Culler. Calibration as parameter estimation in sensor networks. In Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, pages 59–67, 2002.
[34] Y. Zhou, D. McLaughlin, and D. Entekhabi. Assessing the performance of the ensemble Kalman filter for land surface data assimilation. Monthly Weather Review, 134:2128–2142, August 2006.