University of California
Los Angeles
Addressing Fault and Calibration
in Wireless Sensor Networks
A thesis submitted in partial satisfaction
of the requirements for the degree
Master of Science in Electrical Engineering
by
Laura Kathryn Balzano
2007
© Copyright by
Laura Kathryn Balzano
2007
The thesis of Laura Kathryn Balzano is approved.
Steven Margulis
Greg Pottie
Mark Hansen
Mani B. Srivastava, Committee Chair
University of California, Los Angeles
2007
To my mother and father who encourage me
to strive to be proud of everything I do.
Table of Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Related Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Sensor Faults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3 Survey of Faults and Aggregation Analysis . . . . . . . . . . . . 13
3.1 Examples of Fault in Sensor Network Deployments . . . . . . . . 14
3.2 Fault Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.1 Offset Fault . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.2 Gain Fault . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.3 Variance Degradation Fault . . . . . . . . . . . . . . . . . 19
3.2.4 Stuck-At Fault . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.5 Static Fault . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3 Aggregation Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3.1 Aggregation as Estimation . . . . . . . . . . . . . . . . . . 21
3.3.2 Sum and Average . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.3 Some Mathematical Details . . . . . . . . . . . . . . . . . 26
3.4 Future Work and Discussion . . . . . . . . . . . . . . . . . . . . . 28
4 Estimation of Calibration Parameters using Dynamical Models 29
4.1 The Ensemble Kalman Filter and the Particle Filter . . . . . . . . 30
4.2 Simple Autoregressive Surface Moisture Model . . . . . . . . . . . 32
4.2.1 Initial Condition . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.2 Model Forcing . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.3 Model Error . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.4 Measurement Model . . . . . . . . . . . . . . . . . . . . . 33
4.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3.1 State Vector . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3.2 Input Parameters . . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.4.1 Estimation with Incorrect Prior Mean . . . . . . . . . . . . 40
4.4.2 Estimation with Changing Prior Variance . . . . . . . . . 45
4.4.3 Estimation with Frequent Updates . . . . . . . . . . . . . 48
4.4.4 Future Work and Discussion . . . . . . . . . . . . . . . . . 50
5 Estimation of Calibration Parameters using Subspace Matching 53
5.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2 Blind Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.3 Offset Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.4 Gain Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.4.1 General Conditions . . . . . . . . . . . . . . . . . . . . . . 61
5.4.2 Bandlimited Subspaces . . . . . . . . . . . . . . . . . . . . 65
5.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.5.1 Robust Estimation . . . . . . . . . . . . . . . . . . . . . . 66
5.5.2 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.5.3 Evaluation on Sensor Datasets . . . . . . . . . . . . . . . . 74
5.6 Future Work and Discussion . . . . . . . . . . . . . . . . . . . . . 78
6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
List of Figures
3.1 An example of the Offset Fault: Various humidity sensors measur-
ing the same phenomenon. . . . . . . . . . . . . . . . . . . . . . . 15
3.2 An example of a sensor which was good and developed some prob-
lematic noise. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Examples of the Stuck-At and Static Faults. . . . . . . . . . . . . 17
3.4 Variables used in fault model descriptions. . . . . . . . . . . . . . 18
3.5 Notation for moments of variables used in aggregation analysis. . 24
3.6 Averaging under a stuck-at-zero fault. . . . . . . . . . . . . . . . . 25
3.7 Averaging under degraded variance. . . . . . . . . . . . . . . . . . 26
4.1 An example soil sampling scenario. . . . . . . . . . . . . . . . . . 31
4.2 Parameters used in generating the true phenomenon. The true
initial condition and the model error are a single instance drawn
from the corresponding distribution. . . . . . . . . . . . . . . . . . 36
4.3 Example of EnKF with different measurement variance and a priori
information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.4 An example of the SIR Filter estimate for the five state parameters. 39
4.5 RMS Error as the difference in the true and prior mean for the
measurement offset increases: 5, 50, 95% quantiles over 100 runs. 41
4.6 RMS Error as the difference in the true and prior mean for the
measurement gain increases: 5, 50, 95% quantiles over 100 runs. . 44
4.7 RMS Error as the prior variance for the offset measurement pa-
rameter increases: 5, 50, 95% quantiles over 100 runs. . . . . . . . 45
4.8 RMS Error as the prior variance for the gain measurement param-
eter increases: 5, 50, 95% quantiles over 100 runs. . . . . . . . . . 47
4.9 RMS Error as the spacing between measurement updates increases:
5, 50, 95% quantiles over 100 runs. . . . . . . . . . . . . . . . . . 49
5.1 Two example simulated square fields. On the left, a 256 × 256
field generated with a basic smoothing kernel, which represents a
true continuous field. On the right, an 8× 8 grid of measurements
of the same field. The fields can be quite dynamic and still meet
the assumptions for blind calibration. The fields are shown in
pseudocolor, with red denoting the maximum valued regions and
blue denoting the minimum valued regions. . . . . . . . . . . . . 68
5.2 Gain and offset error performance with exact knowledge of P and
increasing measurement noise. The results show the mean and
median error over 100 simulation runs. . . . . . . . . . . . . . . . 70
5.3 Gain and offset error performance for mismodeled P and zero mea-
surement noise. The results show the mean and median error for
100 simulation runs. . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.4 Gain error performance for SVD, blind LS, and partially blind LS.
Results show mean error over 50 simulation runs. . . . . . . . . . 72
5.5 Offset error performance for SVD, blind LS, and partially blind
LS. The top graph shows offset error for zero-mean signals, and
the bottom graph is for non-zero-mean signals. Results show mean
error over 50 simulation runs. . . . . . . . . . . . . . . . . . . . . 73
5.6 Results of blind calibration on the calibration dataset. . . . . . . . 75
5.7 The mica2 motes in the cold air drainage transect run down the
side of a hill and across a valley. The mote locations pictured are
those that we used. . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.8 True and estimated gains and offsets for the cold air drainage tran-
sect dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Acknowledgments
I would like to thank my advisor Mani Srivastava, my collaborators Mark Hansen,
Steven Margulis, and Rob Nowak, and members of the Data Integrity Group for
their willingness to discuss and work out all kinds of issues.
Abstract of the Thesis
Addressing Fault and Calibration
in Wireless Sensor Networks
by
Laura Kathryn Balzano
Master of Science in Electrical Engineering
University of California, Los Angeles, 2007
Professor Mani B. Srivastava, Chair
Sensors and devices used in wireless sensor networks are state-of-the-art tech-
nology with the lowest possible price. The sensor measurements we get from
these devices are therefore often noisy, incomplete and inaccurate. Researchers
studying wireless sensor networks hypothesize that much more information can
be extracted from hundreds of unreliable measurements spread across a field of
interest than from a smaller number of high-quality, high-reliability instruments
with the same total cost. This thesis offers a basis for exploring that hypothesis
in some detail. We make four contributions. First, we describe sensor faults
commonly seen in recent sensor network deployments, and we formulate statis-
tical models to assist in the analysis of those faults. Second, we present some
basic tools for assessing the robustness of aggregation algorithms to these com-
mon faults. We then address, in two separate ways, the issue of finding linear
calibration parameters while sensors are deployed. Our third contribution is an
approach to calibration using state space models and non-linear, non-Gaussian
filtering techniques to calibrate sensors without groundtruth knowledge or con-
trolled stimuli. We evaluate this calibration on simulated sensor data with a
simple dynamical model based on the physical process of soil moisture. Fourth,
we present a general problem formulation for blind calibration which assumes
that the n sensor measurements lie in a subspace of n-dimensional space. We
prove the identifiability of the sensor offsets and gains under this assumption,
and we evaluate implementations on both simulated and real sensor data.
CHAPTER 1
Introduction
Wireless Sensor Networks provide a new technology which can enable scientists,
activists, and even the general population to collect data about their environment.
Many people have indoor-outdoor thermometers with wireless communication
devices so they can know the temperature around the house from a single display
on their desk. The vision for Wireless Sensor Networks (WSNs) takes
this idea to the next level, allowing us to have several tiny devices which interface
with all kinds of sensors, which we could place around our homes and neighbor-
hoods or in the middle of natural ecosystems in order to make wiser decisions in
every aspect of life. For example, WSNs which have simple light and temperature
sensors currently enable companies to make smarter decisions about lighting and
heating in a building. WSNs with cameras and magnetic sensors may be deployed
in high-vehicle-traffic areas to learn traffic patterns at unprecedented resolution
and improve traffic lights or inform future road expansion plans. At the Center
for Embedded Networked Sensing (CENS), WSNs are already deployed in nat-
ural ecosystems at the James Reserve (JR) in order to provide biologists and
environmental engineers with data on airflow, soil chemistry, lake ecology, and
bird nesting patterns. These data are all at unprecedented high spatial and tem-
poral resolutions, and the scientists hope to learn intricacies of the environment
so that they can make more informed decisions when they are trying to positively
influence segments of our environment, such as climate or water quality.
Wireless Sensor Networks allow us to collect these data, where before we
could not, for three reasons, all of which are possible due to advances in comput-
ing technology in recent decades. First, the devices used by WSNs are small
and getting smaller every year. This allows us to place them inconspicuously
without disrupting the environment we are trying to monitor. Second, wireless
communication technology and low-power devices allow us to collect data from
remote locations without the burden of cables and infrastructure. Third, the
hardware that makes up these small, wireless devices is getting more and more
inexpensive as time moves on. In order to be a ubiquitous and pervasive tech-
nology, WSNs operate at the lowest-cost boundary, using the most inexpensive
forms of the technology available.
Precisely because of this third reason, the data collected by WSNs are not as
reliable as data collected by high-quality expensive instruments. Given that inex-
pensive pervasive sensing devices will never be highly reliable and state-of-the-art,
we can conclude that sensor data will always be noisy, incomplete and inaccurate.
Researchers studying WSNs, though, hypothesize that unreliable measurements
from hundreds of low-quality instruments can offer a user considerably more in-
formation than a single reliable data source.
Unfortunately, this hypothesis has been taken for granted in much of the
research done on WSNs. This thesis takes first steps to address the two main
issues of low-quality sensor data: fault and calibration.
Sensor faults are the rule and not the exception in every WSN deployment so
far, to our knowledge. Sensors themselves may get stuck at a particular value or
get partially disconnected and report noisy measurements. Sensor nodes reboot
unexpectedly or stop transmitting data. Software running on the sensor nodes
may have bugs and may cause data loss. Algorithms for detecting these failures
and for directing a user to the probable cause [24, 25] are useful for WSN users
who are willing to take care of their sensor network. Algorithms that automatically
discard, or are resilient to, bad data are useful for WSN users who
want a transparent interface.
An important area of research in WSNs is that of in-network processing. Data
are processed on the nodes within the network in order to save transmission en-
ergy when possible. Often the data are aggregated to provide descriptive statistics
across an area of the network instead of sending back each and every data point.
Because context and information get lost during the process of aggregation, it is
absolutely crucial that these algorithms are robust to missing and faulty sensor
data. Chapter 3 of this thesis gives a survey of faults seen in sensor networks
and takes a careful look at whether popular aggregation schemes are robust to
the usual faults.
Even if we can guarantee that the sensors never fail outright, the sensors used
for WSNs are notoriously prone to calibration errors, and arguably these errors
are one of the major obstacles to the practical use of sensor networks [5]. Cal-
ibrating every sensor by hand is infeasible if sensor networks are to scale even
into the tens of devices; yet it may be that applications need more accurate mea-
surements than uncalibrated, low-cost sensors provide. Consequently, automatic
methods for jointly calibrating sensor networks in the field, without dependence
on controlled stimuli or high-fidelity groundtruth data, are of significant interest.
This thesis explores two possible approaches, one based on a known physical dy-
namical model for the environment being sensed (Chapter 4), and one based on
a known subspace model for the environment being sensed (Chapter 5).
CHAPTER 2
Related Work
As mentioned in Chapter 1, a fault can be introduced into sensor data at every
point in the sensor network: from failures in the sensor itself, to software bugs
and computational errors, to lossy communication. This thesis focuses on faults
in the sensors themselves that cause them to report inaccurate data.
2.1 Related Areas
Throughout the analysis in this thesis we assume safe and reliable communi-
cation between all sensor nodes and basestations. Protocols for assessing data
integrity will also need to be reliable, but that can be addressed separately by the
work in research areas such as network systems and network information theory.
Missing data due to lossy links has been a prevailing problem in data collection
sensor networks. In [25], a debugging system called Sympathy assists a network
engineer in identifying the point at which data packets go missing. The Sympa-
thy debugger uses network communications information like neighbor lists and
packet counts to determine a potential area in the network that is causing loss.
In coding theory, algorithms are evaluated in the presence of erasure links to see
how they would perform under random packet losses. For example, generalized
consensus algorithms in sensor networks with erasure links were studied in [23].
It also turns out that distributed statistical analysis and distributed optimization
techniques that are tolerant to faulty data are often also tolerant to missing or
lossy data, though they do converge more slowly under these circumstances.
Prior to work specific to sensor networks, system faults have been studied
thoroughly in control systems, communications networks and in VLSI design.
In control systems and communications networks, three staple concepts create a
foundation for fault tolerance: replication, redundancy and diversity. These con-
cepts are useful not only for network failures, but also for data failures themselves.
For both control systems and communications networks, the fault tolerance con-
siderations lie in the reliability of communicating signals. Control systems that
need to provide safe operation for humans or expensive equipment require ex-
tremely high control-signal integrity [18]. Sensor networks and wireless networks
in general have a long way to go before they will be able to provide the integrity
of data that these safety-critical systems require.
Fault tolerance in communications networks has been an important issue
for many years, hearkening back to the robust design of FDDI and ATM net-
works [27]. In most cases, reliability is supported with redundancy: multiple
independent routes between devices on a network. Clusters, redundant servers,
and redundant power supplies are all techniques for redundancy.
Testing of circuits and VLSI parts has been an area of intense study over
the past few decades with the rise of semiconductor devices [6]. One fault-
identification technique that was born in this area, but is applicable in many
other areas, is that of test vectors. Test vectors may be able to identify faults
such as buggy software, issues with the interaction between micro-controller and
sensor systems, and sensors that are not functional at the start of their life.
2.2 Sensor Faults
In his work on tolerating faulty sensor measurements [22], Keith Marzullo presents
the idea of an abstract sensor that incorporates uncertainty into the measure-
ments of an associated set of real sensors. His goal is then to construct the
abstract sensor in such a way that it is tolerant to failures.
Koushanfar and Potkonjak [19] have explored many facets of faulty data in
sensor networks. They suggest five phases of testing sensor-based systems. The
first phase is test vector generation, with a goal of finding test inputs that are
most likely to excite particular faults they are interested in testing. However, a
special characteristic of fault detection in sensor networks is that we must answer
the question of what happens to the sensors over their lifetime while they sit in
environments which have unknown properties. With sensor networks, we wish to
capture information about changing, dynamic systems that we don’t understand.
This kind of deployment purpose does not fit well with a test vector formulation.
Another important consideration is that sensors change with the climate they are
immersed in. It is difficult to predict what problematic conditions a sensor will
face once it is placed in the environment it is intended to monitor.
The next three phases in [19] for testing sensor-based systems are that of
fault detection, diagnosis and validation. They describe the last phase as one
of mapping sensor readings to correct readings with some confidence, which is
sometimes called compensation in calibration literature. The authors suggest
fault models like the ones we show in Chapter 3, though their list is slightly
shorter. Also, we have presented a general form that can encompass our entire
list of faults, which proves very useful in analysis.
The online fault detection section of this same paper [19] assumes underlying
equations for the phenomenon. Based on those equations, the authors’ method
can define a fault detection algorithm. The idea here is to eliminate sensor
readings from aggregation, and if that elimination improves the consistency of
the results, that sensor reading is most likely faulty. In this thesis, we instead
aim to derive and understand as much as possible about faults in sensor networks
under the assumption that only general statistical properties of the underlying
phenomenon are known.
There are copious instances of faulty data in sensor networks where the scien-
tists did not necessarily know what to expect in the data collected by the sensor
network. In [29], the authors described their experiences in deploying a sensor
network on Great Duck Island to monitor the habitat of the Leach’s Storm Pe-
trel. The authors described both packet loss errors and inaccurate measurement
reports. They found that sensor measurement quality degraded over time in the
outdoor environment, especially in sensors that could not have a protective cov-
ering due to the small size of breeding holes for the petrel. Temperature readings
inside enclosures were regularly much higher than those outside. The integrated
sensor boards caused a domino effect: as soon as one sensor went bust, others
on the board often followed quickly. Humidity sensors went bad when they were
wet, but then recovered as they dried off. Light sensors were reliable for the
most part, and the few that failed had stopped displaying the diurnal pattern
and instead consistently read high values.
From speaking with the engineers at the James Reserve [7] who collect data
from outdoor sensors into a database, we learned that the two main data prob-
lems they have are missing data and sensors that get stuck at a particular value.
A deployment in the Redwoods of California described in [31] had large amounts
of data loss due to both communication failure and sensor failure; those authors
note that bad data was more easily found by looking where battery voltage data
revealed an anomaly as well. Soil deployments in Bangladesh [24, 26] and Cal-
ifornia [26] have turned up very unusual fault modes that are more difficult to
manage.
The problem of robust data fusion for sensor networks is addressed in [32].
The author, David Wagner, discusses a simple application of Robust Statistics,
a decades-old field which studies, among other things, the contamination of a
data sample that can cause a particular computation on that data sample to be
in error. Wagner asks which computations common to sensor networks require
a larger percentage of data points in that computation to go to infinity before
the computation itself goes to infinity. As an example, Wagner suggests we look
at the median and the mean. Half of the data points must go to infinity before the
median of that data goes to infinity. On the other hand, if only one data point
goes to infinity, the mean of the data will become infinite. The median is a more
robust statistical computation than the mean.
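Wagner's comparison is easy to demonstrate with a toy sketch (the readings below are invented for illustration; one "faulty" value stands in for a compromised or broken sensor):

```python
from statistics import mean, median

# Nine well-behaved temperature readings plus one wildly faulty reading.
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 19.8, 20.0, 20.2, 1e9]

# A single corrupted value drags the mean arbitrarily far from the truth...
print(mean(readings))    # roughly 1e8, dominated by the faulty sensor

# ...while the median still reflects the well-behaved majority.
print(median(readings))  # 20.1
```

The mean has a breakdown point of a single sample, while the median tolerates contamination of up to half the data, which is exactly the contrast Wagner draws.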
Wagner approaches this problem using the Byzantine fault model, which is
described in [20] as follows: “The component can exhibit arbitrary and malicious
behavior, perhaps involving collusion with other faulty components.” This kind
of analysis only allows us to understand the worst-case behavior of an aggregation
function under malicious attack. He makes the important point that sensor net-
work systems will be deployed in environments and in numbers such that it is not
possible to monitor them. The sensor nodes will then be easily compromised, and
system designers should use more robust computations to avoid serious problems
that will result from a few compromises.
As we can see from all the descriptions of sensor network faults above, a very
pressing issue is not the threat of compromise but the reality of fault. Again, due
to sensor nodes’ low cost and sub-par quality, sensors themselves often produce
inaccurate measurements. Even before sensor networks become pervasive in our
daily environments and therefore vulnerable to malicious compromise, we have
to make them robust to fault within their own ranks.
The Reputation-based Framework for Sensor Networks [12] is a middleware
framework which allows distributed maintenance of reputation for all the nodes
in a network. All nodes maintain reputation for their neighbor nodes given some
measure of cooperation. The reputation information itself can be used to inform
decisions made in aggregating and forwarding data, or if the reputation infor-
mation is sent to a central location it can be used to inform a user of possible
problem nodes.
Another tool aimed at improving data integrity in sensor networks is called
Confidence [24], and it has a methodology akin to Sympathy [25] for network
integrity. With Confidence, a user can define a general expectation he or she has
for the data and then use the tool to identify integrity compromise and identify
a possible cause.
Methods of fault detection based on simplified algorithms such that they could
be implemented on a single sensor node are investigated in [9, 21]. Both have
approaches based on two relationships of correlation: the correlation of a node’s
measurement and its neighbor’s measurements, and the correlation of a node’s
measurement and its own previous measurement. In [9], a naive Bayes algorithm
is employed which maintains counters of the number of times a particular pair oc-
curs over the history of the sensor network. This approach requires large numbers
of counters and large amounts of training data. On the other hand, [21] keeps
track of the same relationships by only maintaining counters of the differences
between the pairs of values. Their claim is that the differences are sufficiently
representative of the interesting behavior from the sensors, and they demonstrate
that this approach is effective.
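To make the two correlation relationships concrete, here is a minimal sketch (this is an illustration of the general idea, not the algorithm of [9] or [21]; the tolerances and sample readings are invented assumptions) in which a reading is flagged only when it disagrees both with the node's own history and with its neighbors:

```python
def is_suspect(reading, prev_reading, neighbor_readings,
               temporal_tol=5.0, spatial_tol=5.0):
    """Flag a reading whose jumps from its own previous value AND from
    its neighborhood average both exceed illustrative tolerances."""
    temporal_jump = abs(reading - prev_reading) > temporal_tol
    neighbor_mean = sum(neighbor_readings) / len(neighbor_readings)
    spatial_gap = abs(reading - neighbor_mean) > spatial_tol
    return temporal_jump and spatial_gap

# A sudden jump that the neighbors do not share is flagged...
print(is_suspect(35.0, 21.0, [20.5, 21.2, 20.8]))  # True
# ...but a jump shared by the whole neighborhood is not.
print(is_suspect(35.0, 21.0, [34.5, 35.2, 34.8]))  # False
```

Requiring both conditions keeps a genuine environmental change, which neighbors also observe, from being mistaken for a fault.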
The author of [14] presents a method for automated fault detection of in-situ
environmental sensors. Based on data which has been carefully reviewed by a
domain expert, his algorithms learn particular characteristics of the environment
using four different statistical learning methods and are trained to identify
short- and long-duration anomaly periods for sensors. This paper
shows a comparison of the performance of the different methods.
2.3 Calibration
Calibration is the process of taking readings from a sensor and applying an equa-
tion to map these readings as closely as possible to the ground truth value of that
measurement. Certain faults which involve an incorrect or unknown measurement
offset and gain can be adjusted with a calibration curve.
The most straightforward approach to calibration is to apply a known stimulus
x to the sensor network and measure the response y. Then using the groundtruth
input x we can adjust the calibration parameters so that (5.1) is achieved. We
call this non-blind calibration, since the true signal x is known. This problem is
called inverse linear regression; mathematical details can be found in [15]. Non-
blind calibration is used routinely in sensor networks [24, 31], but may be difficult
or impossible in many applications.
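In the scalar case, this fitting step reduces to ordinary least squares on paired stimulus/response data (a hedged sketch, not any particular deployment's procedure; the linear model y = gain · x + offset and the sample values below are assumptions for illustration):

```python
def fit_gain_offset(x, y):
    """Least-squares fit of y ~ gain * x + offset from known stimuli x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    gain = cov / var
    offset = my - gain * mx
    return gain, offset

# Known stimuli and the (noise-free, for clarity) responses of a
# hypothetical sensor with true gain 1.5 and true offset 2.0.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.5 * xi + 2.0 for xi in x]
gain, offset = fit_gain_offset(x, y)
print(gain, offset)  # 1.5 2.0

# Compensation: recover the true signal from a raw reading.
raw = 8.0
print((raw - offset) / gain)  # 4.0
```

Inverting the fitted curve, as in the last two lines, is the compensation step referred to above.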
As for blind calibration in sensor networks, the problem of relating measure-
ments such as received signal strength or time delay to distance for localization
purposes has been studied extensively [17, 30]. This problem is quite different
from the blind calibration problem considered in this thesis, which assumes that
the measurements arise from external signals (e.g., temperature) and not from
range measurements between sensors. In [33], the problem of calibrating sensor
range measurements by enforcing geometric constraints in a system-wide opti-
mization is considered. Calibration using geometric and physical constraints on
the behavior of a point light source is considered in [11]. The constraint that prox-
imal sensors in dense deployments make very similar measurements is leveraged
in [7]. In this thesis, our constraint is simply that the phenomenon of interest lies
in a subspace. This is a much more general constraint, and we therefore hope it
can be widely applicable.
Blind equalization and blind deconvolution [28] are related problems in signal
processing. In these problems, the observation model is of the form y = h ∗ x,
where ∗ is the convolution operator, and both h and x must be recovered from
y. Due to the difference between the calibration and convolution models, results
from blind deconvolution do not readily apply to blind calibration. Most similar
to our problem is work in multi-channel blind deconvolution [13]. This problem
involves observing one unknown signal through multiple unknown channels. Blind
calibration involves observing multiple unknown signals through one unknown
calibration function. This connection merits further study which is beyond the
scope of this thesis.
Finally, many signal processing researchers will recognize this problem
formulation as similar to that of blind source separation. Though the forms are
reminiscent, the problems are quite different. In independent
component analysis (ICA) [16], for example, a solution
to the equation y = Ax is sought where some signals x and a matrix of mixing
coefficients A are not known; only the mixed observations y are known.
The signals are assumed to be independent and non-Gaussian, and the mixing
matrix A should be invertible. Unfortunately, ICA only recovers each signal up
to a scalar constant, which is exactly the scalar gain factor we are looking for in
blind calibration.
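The scale ambiguity can be stated precisely. For any invertible diagonal matrix D (a standard observation about the ICA model, restated here because it is the crux of the connection to calibration):

```latex
y = Ax = \left(A D^{-1}\right)\left(D x\right),
```

so the mixing matrix A D^{-1} paired with the signals D x explains the observations exactly as well as A paired with x. The per-signal scale D is therefore unidentifiable to ICA, and that scale is the very gain factor blind calibration must recover.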
In this thesis, we hope that we have distilled out some of the crucial problems
for addressing data integrity in sensor networks. The next three chapters give
a formulation for each of three approaches. The first approach relates to fault
aggregation while the second two approaches relate to blind calibration.
CHAPTER 3
Survey of Faults and Aggregation Analysis
The work in this chapter was done with the help of Professor Mark Hansen in
the Statistics Department at UCLA.
This chapter presents a study of sensor faults at a node level and in aggre-
gated data. First, an exploratory catalog of sensor node faults is developed, and
faulty behavior is characterized with statistical models. After a survey of sensor
network deployment descriptions in the literature and discussion with engineers
and scientists who have deployed sensor nodes, the choice for faults that are mod-
eled here was based on the relative consequence of different types of data faults
in recent deployments.
The most interesting part of the fault analysis comes at the network level,
when we see how faults affect fusion algorithms among data at multiple nodes.
In order to reduce the number of messages communicated, sensor networks often
aggregate data or perform in-network processing, such that messages need not
travel all the way from a sensor node to a central processing station. In addi-
tion, even when we do transmit all sensor measurements, the central processing
often involves some data fusion, where this data is used for estimation, modeling,
control, etc.
If faulty data is not identified, and if it is incorporated into the data fusion
algorithm, the results will themselves be in error. Even more drastically, if faulty
data is blindly aggregated before it even reaches a central processing station, we
don’t even have the chance to identify and exclude it from fusion. This problem
motivates a careful assessment of algorithm robustness to fault.
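A toy calculation previews the kind of analysis this chapter formalizes (the field value and fault below are invented for illustration): when noise-free readings of a field are averaged, a single stuck-at-zero sensor biases the estimate by true_value/n.

```python
true_value = 20.0
readings = [true_value] * 10   # ten ideal, noise-free sensors
readings[3] = 0.0              # one sensor stuck at zero

estimate = sum(readings) / len(readings)
print(estimate)                # 18.0: the average is pulled low
print(true_value - estimate)   # bias of 2.0, i.e. true_value / 10
```

Because the aggregate alone reaches the central station, nothing in the value 18.0 reveals that a fault, rather than the phenomenon, produced it.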
3.1 Examples of Fault in Sensor Network Deployments
Here we show a few examples of fault from sensors measuring environmental phenomena. In Figure 3.1, there are measurement curves from several humidity sensors placed close together in a styrofoam box. The sensors should all be measuring the same phenomenon, and yet the offset between the curves is as great as 20% relative humidity. Factory calibration curves are often not enough to calibrate sensors which must work together in a network. One of these sensors alone may give appropriate measurements relative to its own baseline; however, when sensors are used in concert in a sensor network, they must be calibrated relative to one another. If these humidity sensors were deployed in a field, it would be impossible to know whether a difference in measurement between two of the sensors was due to a difference in phenomenon or a difference in calibration offset.
Figure 3.2 shows a sensor which becomes noisier as its battery drains. The top plot in the figure shows the temperature measurements on days 1-4, while the bottom plot shows noisier data on days 6-10. On one hand, the measurements in the bottom plot could be thrown away once we see that the battery is low. On the other hand, these data are not completely worthless. On days 6-8, the sensor reports noisier measurements which still track the true phenomenon. After that, the measurements stop reflecting the true phenomenon.
The two faults in Figure 3.3 are very common faults in environmental sensing
where inexpensive sensors are exposed to all kinds of weather conditions. One of
Figure 3.1: An example of the Offset Fault: Various humidity sensors measuring
the same phenomenon.
[Two-panel plot: "Less Noisy Temperature Data" (days 1-4) and "Noisy and Otherwise Bad Temperature Data" (days 6-10), degrees Celsius vs. day.]
Figure 3.2: An example of a sensor which was good and developed some prob-
lematic noise.
[Plot: "Two Faults on the Cold Air Drainage Transect", degrees Celsius vs. date (11/18-11/23); legend: Mote 1: good, Mote 4: good, Mote 12: stuck-at, Mote 15: static.]
Figure 3.3: Examples of the Stuck-At and Static Faults.
the sensors has gotten stuck at a high temperature value. The other problematic
sensor is reporting data which look like static, and it is unclear whether the data
are even following the true temperature trend.
3.2 Fault Model
The examples we have shown of fault in sensor networks are varied, yet we are interested in finding a simple framework within which we can address as many faults as possible. We adopt the perspective that the true measurement values come from a random process, and noise or fault is imposed on that process through either an additive noise process or a linear deterministic function.
Variable     Meaning                                       density          distribution
y            true value of phenomenon                      γ                Γ
x            measurement with acceptable additive noise    f                F
\tilde{x}    measurement transformed by a fault            \tilde{f}        \tilde{F}
ε            additive noise variable                       N(0, σ²)         (1/2)(1 + erf(ε / (σ√2)))
c            percentage of nodes compromised by fault      deterministic    –
β0           offset value                                  deterministic    –
β1           gain value                                    deterministic    –

Figure 3.4: Variables used in fault model descriptions.
Our generalized model for faults is then as follows. A perfect measurement would be one in which we exactly recover y. Even the most expensive systems, however, will have some measurement noise. We call this the pre-fault measurement, represented as follows.

x = y + ε (3.1)

When the measurement is faulty, on the other hand, we assume it takes the following general form:

x = β0 + β1 y + ε (3.2)

With this simple linear relationship between the true value and the faulty measurement, we can capture many of the faults shown in the last section. Let us see how this formula is a generalization of specific kinds of faults.
3.2.1 Offset Fault
The offset fault most commonly manifests itself as a calibration offset. The fault is
an additive constant on top of the pre-fault measurement value. Another example
scenario of an offset fault would be if a column of vertically-spaced soil sensors is
displaced (suddenly or slowly over time), and the sensors are at a different depth
in the soil than when they were first deployed. In this scenario, we may be able to subtract the offset to compensate for the new sensor location; if the model does not allow us to do so, we will need to physically adjust the pylon.
x = β0 + y + ε (3.3)
The faulty value only depends on the current measurement value and the
current offset. From here on, we will state the equations as if the sensor is faulty.
3.2.2 Gain Fault
The gain fault is a description of an error in calibration gain. It is difficult to
differentiate the gain fault from an offset fault without some knowledge of the
ground truth measurement values.
x = β1y + ε (3.4)
3.2.3 Variance Degradation Fault
The variance degradation fault is a fault that affects both cheap sensors and more
expensive measurement and sensing equipment. Over time, a sensor becomes
less and less accurate. If the measurement variance is σ_m² and the fault variance is σ_f² > σ_m², then the sensor noise is now ε_f ∼ N(0, σ_f²) and we have simply

x = y + ε_f (3.5)
3.2.4 Stuck-At Fault
The stuck-at fault represents a sensor getting stuck at a particular value. Most
often this is a value at the high or low end of the appropriate sensing range.
One James Reserve leaf wetness sensor was stuck at the maximum rating, 10,
for several months. These faults are dangerous because the measurement can
tell you nothing about the underlying phenomenon. Yet, the measurements are
in-range, so simple out-of-range detection does not help.
x = β0 (3.6)
3.2.5 Static Fault
In the case of a static fault, the sensor is only reporting noise with no relation to
the true measurements. In our experience this has always been caused by a poor
connection of the sensor to the computational device. We model this with an
offset and additive noise where, as in the variance degradation fault, ε_f ∼ N(0, σ_f²) with σ_f² > σ_m².

x = β0 + ε_f (3.7)
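The five fault models above can be gathered into one sketch. The following Python function is our own illustration (the thesis implementation is in MATLAB, and the parameter defaults here are assumptions, not deployment values); it draws faulty measurements according to equations (3.3)-(3.7):

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_fault(y, kind, beta0=0.0, beta1=1.0, sigma=0.1, sigma_f=0.5):
    """Draw faulty measurements from the linear fault model
    x = beta0 + beta1*y + eps.  All defaults are illustrative."""
    n = len(y)
    eps = rng.normal(0.0, sigma, n)      # pre-fault additive noise
    eps_f = rng.normal(0.0, sigma_f, n)  # higher-variance fault noise
    if kind == "offset":        # eq. (3.3): x = beta0 + y + eps
        return beta0 + y + eps
    if kind == "gain":          # eq. (3.4): x = beta1*y + eps
        return beta1 * y + eps
    if kind == "variance":      # eq. (3.5): x = y + eps_f
        return y + eps_f
    if kind == "stuck_at":      # eq. (3.6): x = beta0
        return np.full(n, beta0)
    if kind == "static":        # eq. (3.7): x = beta0 + eps_f
        return beta0 + eps_f
    raise ValueError(kind)
```

For example, `apply_fault(y, "stuck_at", beta0=35.0)` reproduces the in-range stuck-at behavior of the James Reserve leaf wetness sensor: every reading is the same plausible constant.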
3.3 Aggregation Analysis
The robustness of data aggregation to these sensor faults must be analyzed so
that system designers know under what conditions their system will fail. In [32],
Wagner emphasizes the importance of robustness in sensor networks as a defense
against possible aggressors. He notes that sensor network hardware will likely not
be tamper-proof, and so robust algorithms are a must in order to combat worst-case scenarios where a single node might have the ability to arbitrarily alter aggregation results.
However, most systems implement cut-off points for their sensors, such that
a physically impossible or improbable measurement is always discarded. Thus, a
single sensor cannot arbitrarily affect an estimate or aggregate on the data. In a
scenario where we would like to do a more specific analysis than worst-case, and
we have some particular faults in mind, we can more carefully assess the damage
a fault inflicts on an aggregation algorithm.
3.3.1 Aggregation as Estimation
We can think of any fusion or aggregation function on our data as an estimator.
Our estimator Θ is any function Θ : RN → R. We continue to denote the ground
truth with y, and the set of true values from N sensors in a sensor network is
denoted as y1, · · · , yN . We define θ = Θ(y1, · · · , yN). Because the measured data
x1, · · · , xN are not equal to the ground truth (whether they are faulty data or
simply have measurement error), the function Θ(x1, · · · , xN ) returns an estimate
of θ.
To assess the quality of the estimate, we need an error term. Wagner [32] uses
root-mean-square error, the square-root of mean-square error. We will perform
our analysis using the mean-square error for cleanliness, but plots will show root-mean-square error. For some estimate of θ, which we will call \hat{θ}, mean-square error is defined as:

mse(Θ) = E((\hat{θ} − θ)²) (3.8)
The same work [32] defines rms*(Θ, k) as the root-mean-square error of the estimator when k members of the data set are compromised. However, as we have said, when we are considering systems which throw away out-of-range data points, no single data point can arbitrarily affect an estimator. We need to be more specific. We change our perspective to assess damage not under a number of compromised sensors, but under a percentage of compromised sensors c, and not under a general worst-case error, but under the error for a particular fault function f.

So for Wagner's k we now have c = k/N. We can define a new type of error, mse_f(Θ, c), as the mean-square error when a fraction c of the data set is compromised by the fault function f. For a given data set, if without loss of generality the data are numbered compromised first, unaffected last, then we have

\hat{θ} = Θ(\tilde{x}_1, · · · , \tilde{x}_{cN}, x_{cN+1}, · · · , x_N) (3.9)

We still have θ = Θ(y_1, · · · , y_N), and thus we define mse_f(Θ, c) = E((\hat{θ} − θ)²).
If a system designer is interested in keeping error below a point α, we can find
if our estimator is sufficient with the following definition, again based on [32].
Definition We say that an estimator Θ is (c, α)-resilient with respect to the parametrized distribution p(X_i | θ) under a particular fault function f if

mse_f(Θ, c) ≤ α² · mse(Θ).

We refer to this as the resilience condition. We also define this error comparison under the root-mean-square error metric: rms_f(Θ, c) ≤ α · rms(Θ).
Note that our c relates very closely to the breakdown point of an estimator. While the breakdown point makes sense only for an unbounded estimator, c applies to bounded estimators as well. In a sense, it represents a breakdown with respect to a particular error threshold.
Of the typical functions of interest, we focus here on the average and the sum. Again for notational simplicity, we order the data compromised first, unaffected last. We leave the root-mean-square versions out of the following derivations because they extend simply from the mean-square error.
Once we have defined an estimator for our aggregation function, we can apply
the error condition to find the resiliency under a particular fault.
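As a concrete illustration of the resilience condition, the following sketch estimates mse_f(Θ, c) for the average estimator by Monte Carlo under a stuck-at fault. It is our own Python illustration (the thesis code is MATLAB), and the numbers anticipate the stuck-at-zero example worked out in the next subsection: 20 nodes, constant truth of 20 deg C, measurement variance 2.

```python
import numpy as np

rng = np.random.default_rng(1)

def mse_f(c, N=20, runs=20000, mu=20.0, var=2.0, stuck=0.0):
    """Monte Carlo estimate of mse_f(average, c): a fraction c of the N
    sensors is stuck at `stuck`; the rest report N(mu, var) readings of
    a constant ground truth mu."""
    k = int(round(c * N))
    good = rng.normal(mu, np.sqrt(var), (runs, N - k))
    theta_hat = (good.sum(axis=1) + k * stuck) / N   # average over all N
    return float(np.mean((theta_hat - mu) ** 2))

def is_resilient(c, alpha):
    """Resilience condition: mse_f(Theta, c) <= alpha^2 * mse(Theta)."""
    return mse_f(c) <= alpha ** 2 * mse_f(0.0)
```

With one of twenty sensors stuck at zero, the average picks up a bias of one degree, so the estimator is (0.05, α)-resilient only for α large enough to absorb roughly a tenfold increase in mean-square error.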
3.3.2 Sum and Average
The sum estimators with only pre-fault data and with both faulty and pre-fault
data are as follows.
Θ_prefault = \sum_{i=1}^{N} x_i (3.10)

Θ_fault = \sum_{i=1}^{cN} \tilde{x}_i + \sum_{i=cN+1}^{N} x_i (3.11)
The Θ_prefault and Θ_fault for the average estimator are simply 1/N times those of the sum. When determining whether the resilience condition is satisfied for the average estimator, these 1/N constants on either side of the inequality cancel, and the condition is the same as that of the sum.

Variable     density          first moment                                  second moment
y            γ                \bar{y} ≡ E_γ[y]                              \bar{\bar{y}} ≡ E_γ[y²]
x            f                \bar{x} ≡ E_f[x]                              \bar{\bar{x}} ≡ E_f[x²]
\tilde{x}    \tilde{f}        \bar{\tilde{x}} ≡ E_{\tilde{f}}[\tilde{x}]    \bar{\bar{\tilde{x}}} ≡ E_{\tilde{f}}[\tilde{x}²]
xy           p(x, y)          \overline{xy} ≡ E_{p(x,y)}[xy]                –
\tilde{x}y   p(\tilde{x}, y)  \overline{\tilde{x}y} ≡ E_{p(\tilde{x},y)}[\tilde{x}y]   –

Figure 3.5: Notation for moments of variables used in aggregation analysis.
Let us now examine whether this estimator is resilient. The abbreviations in
Figure 3.5 will be useful in the discussion.
If we assume that measurements for i ≠ j are independent, we can reduce the equation for mean-squared error quite a bit. The algebra and the final equations are in Section 3.3.3. Because the final equation becomes very long, we instead visualize the implications for particular scenarios, with particular faults taken from our fault examples in Section 3.1.
In the first example, we have 20 nodes in a network over which we would like to find the average temperature value. Note that this analysis also applies to the sum of the temperature values. Say we are using very inexpensive electronics, and we often find that our temperature value gets stuck at 0 deg C. We would like to see how resilient our network is to these faults. We assume the ground truth temperatures are constant at 20 deg C, and the measurements are iid and drawn from a normal distribution with mean at ground truth and variance 2 deg C. We also assume that the faults are independent of the good measurements. Then we have: \bar{\tilde{x}} = 0, \bar{\bar{\tilde{x}}} = 0, \bar{x} = 20, \bar{\bar{x}} = 402, \bar{y} = 20, \bar{\bar{y}} = 400, \overline{\tilde{x}y} = 0, \overline{xy} = 400.

Figure 3.6: Averaging under a stuck-at-zero fault.

Figure 3.6 shows how our system will do with increasing numbers of compromised nodes; the y-axis shows root-mean-square error. If, for example, we require that our error not exceed one degree, then we will only be able to tolerate about 5% of the nodes being compromised. The horizontal lines show the root-mean-square error of the pre-fault estimate for different values of α. These curves offer a comparison point; for example, if we are designing for the possible compromise of 5% of nodes under this fault, we must accept between 2 and 5 times more error than we would accept with no fault.
In another scenario, we examine the sum and average under the degraded variance fault. Again we hold the ground truth values constant at 20 deg C, and now both the pre-fault measurement and the faulty measurement are Gaussian distributed around this value. The pre-fault measurement variance is still 2 deg C. The faulty measurement variance is higher, at 5 deg C.

Figure 3.7: Averaging under degraded variance.

In this scenario, whose results are shown in Figure 3.7, we can see that the curve for root-mean-square error stays at the bottom of the plot all the way across. That is, a quarter of our sensors can be faulty with degraded variance and still the root-mean-square error in temperature is less than a quarter of a degree.
3.3.3 Some Mathematical Details
For the Sum estimator, we start with the mse between Θ_fault and Θ_true, where

θ_fault = \sum_{i=1}^{cN} \tilde{x}_i + \sum_{i=cN+1}^{N} x_i (3.12)

θ_true = \sum_{i=1}^{N} y_i (3.13)

We want to compute the value of mse_f(Θ, c), which with these estimators looks as follows:

E[ ( \sum_{i=1}^{cN} \tilde{x}_i + \sum_{i=cN+1}^{N} x_i − \sum_{i=1}^{N} y_i )² ] (3.14)
Recall that in Section 3.3.2 we assume that measurements for nodes (i, j) where i ≠ j are independent.
Here are the useful squared terms:

(A) E[(\sum_{i=1}^{cN} \tilde{x}_i)²] = cN [\bar{\bar{\tilde{x}}} + (cN − 1)\bar{\tilde{x}}²]

(B) E[(\sum_{i=cN+1}^{N} x_i)²] = (N − cN)[\bar{\bar{x}} + (N − cN − 1)\bar{x}²]

(C) E[(\sum_{i=1}^{N} y_i)²] = N[\bar{\bar{y}} + (N − 1)\bar{y}²]

(D) E[(\sum_{i=1}^{N} x_i)²] = N[\bar{\bar{x}} + (N − 1)\bar{x}²]

And here are the useful cross terms:

(F) E[\sum_{i=1}^{cN} \tilde{x}_i \sum_{i=cN+1}^{N} x_i] = cN(N − cN)\bar{\tilde{x}}\bar{x}

(G) E[\sum_{i=cN+1}^{N} x_i \sum_{i=1}^{N} y_i] = (N − cN)[\overline{xy} + (N − 1)\bar{x}\bar{y}]

(H) E[\sum_{i=1}^{cN} \tilde{x}_i \sum_{i=1}^{N} y_i] = cN[\overline{\tilde{x}y} + (N − 1)\bar{\tilde{x}}\bar{y}]

(I) E[\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i] = N[\overline{xy} + (N − 1)\bar{x}\bar{y}]
The MSE is made up of these terms as follows.
mse(Θ) = (C) + (D) − 2(I) (3.15)

mse_f(Θ, c) = (A) + (B) + (C) + 2(F) − 2(G) − 2(H) (3.16)

This analysis shows that it is actually straightforward to assess the impact a particular fault has on the average or sum aggregator: all we need to know are the first and second moments of the true values and the fault variable, and the covariance between the truth and the faults.
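The closed-form terms translate directly into code. The sketch below is our own Python illustration of equations (3.15)-(3.16) for the sum estimator; dividing by N² gives the average, and the stuck-at-zero moments from Section 3.3.2 recover the roughly one-degree RMS error read off Figure 3.6.

```python
def mse_fault(c, N, xb, xbb, xtb, xtbb, yb, ybb, xy, xty):
    """Closed-form mse_f(sum, c) from first and second moments:
    xb = E[x], xbb = E[x^2] for the pre-fault measurement; xtb, xtbb
    the same for the faulty variable; yb, ybb for the truth;
    xy = E[xy] and xty = E[x~ y] the cross moments."""
    k = c * N
    A = k * (xtbb + (k - 1) * xtb ** 2)
    B = (N - k) * (xbb + (N - k - 1) * xb ** 2)
    C = N * (ybb + (N - 1) * yb ** 2)
    F = k * (N - k) * xtb * xb
    G = (N - k) * (xy + (N - 1) * xb * yb)
    H = k * (xty + (N - 1) * xtb * yb)
    return A + B + C + 2 * F - 2 * G - 2 * H     # eq. (3.16)

def mse_prefault(N, xb, xbb, yb, ybb, xy):
    """Closed-form mse(sum) with no fault, eq. (3.15)."""
    C = N * (ybb + (N - 1) * yb ** 2)
    D = N * (xbb + (N - 1) * xb ** 2)
    I = N * (xy + (N - 1) * xb * yb)
    return C + D - 2 * I
```

With the Section 3.3.2 moments (N = 20, truth constant at 20, measurement variance 2, faults stuck at zero), `mse_fault(0.05, 20, 20, 402, 0, 0, 20, 400, 400, 0) / 400` gives a mean-square error of 1.095 for the average, i.e. an RMS error just over one degree at 5% compromised nodes.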
3.4 Future Work and Discussion
Using the mean-squared error to assess the performance of the aggregate implies that only the first and second moments of the distributions matter. This is good because it simplifies some difficult problems. It is problematic because, if the data and fault distributions are not Gaussian, some of the most interesting information often lies in the higher moments, and that information is lost.
The faults listed in this chapter are simple faults; often sensor faults have more
complicated temporal patterns that would require a non-homogeneous statistical
model.
We hope the work in this chapter can help anyone interested in analyzing the effects of particular faults on other, more complicated data aggregation and data fusion algorithms. The important point to take away from this chapter is that simple models can go a long way toward assessing robustness in sensor network algorithms.
CHAPTER 4
Estimation of Calibration Parameters using
Dynamical Models
This chapter presents collaborative work with Professor Steven Margulis in the
environmental engineering department at UCLA.
The first approach we take to address in-situ blind calibration is to use a
physical model of the environment to give context to the measurements and thus
some information about the measurement parameters. In the classical approach
to tracking a phenomenon, adaptive filters are used. This approach takes a
model of how the phenomenon changes from one time step to the next, along with
uncertainty in those changes, and then incorporates measurements when they are
taken. Typically these measurements have only additive noise uncertainty. Thus,
when a measurement is taken, it is possible to find the likelihood of the model
given the measurements. In this thesis, we assume there is also uncertainty in the linear measurement parameters, or calibration parameters; that is, the likelihood itself is uncertain. Instead of using the measurements simply to inform the model, we also use the model to inform an estimate of the measurement parameters.
In the study of soil state in environmental engineering, it is common to use dynamical models informed by the physics of water flow, evaporation and chemical absorption by soils. Soil sensing applications are an exciting yet challenging area for the application of Wireless Sensor Networks. Soil sampling is an important task for two reasons: understanding global warming through the balance of carbon dioxide absorption between our oceans and our soils, and understanding the quality of our underground water sources. Soil sensing is extremely challenging [26] both because soil is extremely heterogeneous, which makes oversampling impossible, and because the soil ecosystem is a confluence of many factors including rain, chemicals, evaporation and plant root growth.
Fortunately, because of the importance of the problem, civil and environmen-
tal engineers have spent a lot of time developing careful models for soil dynamics.
Typically the models have parameters which should be learned from the particular environment of interest. Thus, the problem is well-suited for state estimation filters, e.g. the Kalman Filter or Particle Filters. In this thesis, we are interested in augmenting the usual state vector with the calibration parameters, so that we can estimate them simultaneously.
4.1 The Ensemble Kalman Filter and the Particle Filter
For this analysis, we looked in particular at two filter options. The first, called the
Ensemble Kalman Filter (EnKF), tracks the probability density function (pdf) of
the state of a dynamical system by using a Monte Carlo technique. First, several
replicates (the ensemble) are generated using the joint prior distribution across
all the input variables, which can be non-Gaussian. Those replicates are tracked
with the forward model, which can be nonlinear. When measurements are taken,
the replicates are updated with a modified Kalman gain matrix. This update
step makes loose assumptions on Gaussianity [34].
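The EnKF update just described can be sketched generically. The following Python code (a perturbed-observation variant; all names are our own, and the thesis implementation is in MATLAB) updates an ensemble with a gain computed from the sample covariance:

```python
import numpy as np

rng = np.random.default_rng(2)

def enkf_update(ens, z, H, R):
    """Perturbed-observation EnKF analysis step.  `ens` is
    (n_state, n_reps), z the observation vector, H the linear
    observation operator, R the observation-error covariance."""
    n_state, n_reps = ens.shape
    mean = ens.mean(axis=1, keepdims=True)
    A = ens - mean                                   # ensemble anomalies
    P = A @ A.T / (n_reps - 1)                       # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
    # each replicate sees its own perturbed copy of the observation
    zz = z[:, None] + rng.multivariate_normal(np.zeros(len(z)), R, n_reps).T
    return ens + K @ (zz - H @ ens)
```

For a scalar state with prior N(0, 1), observation 2.0 and observation variance 1, the updated ensemble concentrates near the Bayesian posterior N(1, 0.5), which is the sense in which the update makes only loose Gaussianity assumptions.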
The second, a particle filter implementation called the Sequential Importance
Resampling (SIR) filter, is another Monte Carlo technique which uses the same
Figure 4.1: An example soil sampling scenario.
propagation step but a different update step [34, 1]. Again, an ensemble of
several replicates (or particles) is generated from the joint prior, which can be
non-Gaussian, and again the replicates are tracked with the forward model, which
can be nonlinear. When a measurement is taken, the update is performed in
two steps. First, weights are given to each of the replicates depending on the
probability of that replicate given the measurement. That is, we evaluate the
likelihood function at the point of the replicate. The weights are normalized to
sum to one. Second, a new ensemble is generated from the previous ensemble by
resampling the previous ensemble with replacement according to the normalized
weights. The SIR filter is a theoretically more sound option than the Ensemble
Kalman Filter, as it can be shown to give exact solutions for large ensemble sizes.
However, nothing can be guaranteed in general for finite ensemble sizes.
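The two-step SIR update (weight, then resample with replacement) is compact enough to state in code. This is our own generic Python sketch, with an assumed `likelihood(z, particle)` interface rather than anything from the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

def sir_update(particles, z, likelihood):
    """One SIR measurement update.  `particles` is (n_state, n_particles);
    `likelihood(z, p)` returns p(z | p).  Weights are normalized to sum
    to one, then the ensemble is resampled with replacement."""
    w = np.array([likelihood(z, p) for p in particles.T])
    w = w / w.sum()                                        # normalize
    idx = rng.choice(particles.shape[1], particles.shape[1], p=w)
    return particles[:, idx]                               # resample
```

With a standard-normal prior and a Gaussian likelihood centered on the observation, the resampled cloud shifts toward the observation and tightens, matching the qualitative behavior described above.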
In this thesis we are interested in seeing how the two filters compare in estimating calibration parameters.
4.2 Simple Autoregressive Surface Moisture Model
Figure 4.1 represents the scenario from which we designed our simple model.
The state vector y is a single state (1-dimensional vector) which represents the
moisture at a point near the ground surface (within 5 inches of the surface). We
used a simple autoregressive model from a tutorial paper by Evensen [10]. We
model the moisture draining down out of this point over time with a drainage
coefficient δ. To this we add model error q_k, with a normalization constant defined below, and model forcing due to precipitation, precip_{k−1}.

y_k = δ y_{k−1} + \sqrt{dt}\, σ_w ρ\, q_k + precip_{k−1} (4.1)
4.2.1 Initial Condition
The first input to our dynamical model is the initial condition y_0, i.e. the moisture in the soil at the start of our experiment. We take the distribution of our uncertainty in the initial condition to be lognormal, y_0 ∼ LN(µ_{y_0}, σ²_{y_0}). This keeps our state always positive. The lognormal density is defined as follows.

\frac{1}{x σ \sqrt{2π}} e^{−(\ln x − µ)² / 2σ²} (4.2)
4.2.2 Model Forcing
The forcing in our model is due to precipitation. We take the precipitation input to our model to be the measurements from our rain gauge. For now, the precipitation was generated once and used for all the work in this chapter. In the future we intend to create models that generate different instances of precipitation, along with models more specific and accurate to real-life scenarios of interest.
4.2.3 Model Error
To take into account the model errors [10], we also use a simple AR-1 model. The model error at time k, q_k, is an additive combination of the model error at the previous time step and Gaussian random noise.

q_k = α q_{k−1} + \sqrt{1 − α²}\, w_{k−1} (4.3)

\begin{bmatrix} y_k \\ q_k \end{bmatrix} = \begin{bmatrix} δ y_{k−1} + \sqrt{dt}\, σ_w ρ\, q_k + precip_{k−1} \\ α q_{k−1} + \sqrt{1 − α²}\, w_{k−1} \end{bmatrix} (4.4)

The constant before q_k in the forward equation for y_k is a normalization constant, and σ_w is the noise standard deviation of w. The normalization factor ρ is calculated as follows [10]. This achieves normalization over each time unit, where we call the total number of time units N_units and n is the number of forward model time steps per time unit.

ρ = (1 − α)² / (dt (n − 2α − nα² + 2α^{n+1})) (4.5)

Thus, if we choose to run our forward model over N_steps time steps, and we would like it to normalize every n time steps, then we choose N_units = N_steps/n.
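For concreteness, the forward model of equations (4.1), (4.3) and (4.5) can be sketched as follows. This is our own Python illustration; the settings are assumed stand-ins (they coincide with Figure 4.2 only where noted), and ρ is folded in exactly as written above.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative settings: delta and alpha match Figure 4.2; dt, n and
# sigma_w are assumptions for this sketch.
delta, alpha, dt, n, sigma_w = 0.75, 0.5, 0.1, 10, 1.0

# Normalization factor, eq. (4.5)
rho = (1 - alpha) ** 2 / (dt * (n - 2 * alpha - n * alpha ** 2
                                + 2 * alpha ** (n + 1)))

def forward(y0, steps, precip):
    """Propagate the surface-moisture state:
    q_k = alpha*q_{k-1} + sqrt(1 - alpha^2)*w_{k-1}              (4.3)
    y_k = delta*y_{k-1} + sqrt(dt)*sigma_w*rho*q_k + precip_{k-1} (4.1)"""
    y, q = y0, 0.0
    traj = []
    for k in range(steps):
        w = rng.normal(0.0, sigma_w)
        q = alpha * q + np.sqrt(1 - alpha ** 2) * w
        y = delta * y + np.sqrt(dt) * sigma_w * rho * q + precip[k]
        traj.append(y)
    return np.array(traj)
```

With zero precipitation the trajectory simply drains toward zero at rate δ, with small AR-1 fluctuations around it.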
4.2.4 Measurement Model
In our scenario, we are directly measuring the moisture state variable with an ECHO-20 moisture sensor. An example calibration equation for the ECHO-20 is given by y = 6.95 × 10⁻⁴ x − 2.9 × 10⁻¹, where y is the percent water content by volume and x is the millivolt reading given by the ECHO sensor. So for our simulation experiments, we also use a linear calibration equation for the measurement model.

y = β1 x + β0 + v (4.6)
With the filtering techniques we are examining, we can in principle estimate the calibration coefficients even though the term β1x makes the measurement equation nonlinear in the augmented state. Measurement noise is random and additive, represented here by v. In the Ensemble Kalman Filter and the Particle Filter, v does not have to be Gaussian distributed. When v has some other distribution, the calculation of the likelihood during the update of the Particle Filter will be more difficult [1]; however, the complexity of the Ensemble Kalman Filter does not change.
Classically, the level of noise in the measurements is crucial to the performance of estimation; if the noise is much greater than the signal itself, then obviously estimation cannot work. We need an analogous understanding of the conditions on the calibration parameters. For example, if the gain parameter β1 is zero, then the measurements will be completely uninformative and the filter will rely wholly on the model. A careful understanding of these conditions is part of future work.
4.3 Implementation
We have implemented the SIR Particle Filter Ensemble Kalman Filter in MAT-
LAB1.
We first implemented code to generate a true phenomenon and generate mea-
1The MATLAB code is all available at http://www.ee.ucla.edu/~sunbeam/bc.
34
surements of that phenomenon using the true measurement parameters. The
parameters needed for the true phenomenon and true measurements are listed in
Figure 4.2. We also need a vector of precipitation events, which we generated
once and used throughout all the examples in this chapter.
The filter code itself takes the measurements as input and runs the EnKF
and SIR algorithms with a few additional parameters, including the number of
replicates (i.e. size of the ensemble) and the prior distribution for unknown
parameters.
4.3.1 State Vector
The state vector we use in our implementation consists of five states; we call it Y. We assume the measurement coefficients and the decay parameter are time-invariant, and thus they do not carry a subscript k.

Y = \begin{bmatrix} y_k \\ q_k \\ δ \\ β_1 \\ β_0 \end{bmatrix} (4.7)
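To make the augmented-state idea concrete, here is a much-reduced Python sketch of the SIR filter tracking only [y, β0]: δ and β1 are held known, the model error term is dropped, and every numeric setting (forcing, priors, jitter) is our own illustrative assumption rather than the thesis configuration. The sensor reading x is generated by solving eq. (4.6) for x.

```python
import numpy as np

rng = np.random.default_rng(5)

delta, beta1, sigma_v = 0.75, 2.0, 0.25   # assumed known here
true_beta0 = 5.0                          # offset the filter must recover

def run_sir(n_reps=2000, steps=200):
    # Truth and readings: y = beta1*x + beta0 + v  =>  x = (y - beta0 - v)/beta1
    y_true, xs = 10.0, []
    for _ in range(steps):
        y_true = delta * y_true + 1.0     # deterministic model, constant forcing
        v = rng.normal(0.0, sigma_v)
        xs.append((y_true - true_beta0 - v) / beta1)
    # Particles: rows [y, beta0]; the beta0 prior N(0, 3^2) is deliberately off.
    P = np.vstack([np.full(n_reps, 10.0), rng.normal(0.0, 3.0, n_reps)])
    for x in xs:
        P[0] = delta * P[0] + 1.0                    # propagate state
        resid = P[0] - beta1 * x - P[1]              # implied noise v per particle
        w = np.exp(-0.5 * (resid / sigma_v) ** 2)    # Gaussian likelihood
        w = w / w.sum()
        P = P[:, rng.choice(n_reps, n_reps, p=w)]    # resample with replacement
        P[1] = P[1] + rng.normal(0.0, 0.01, n_reps)  # jitter keeps diversity
    return P

posterior = run_sir()
```

After 200 updates the β0 particles concentrate near the true offset of 5 even though the prior mean was 0; Section 4.4 studies exactly this kind of recovery quantitatively for the full five-state vector.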
4.3.2 Input Parameters
Both the EnKF and the SIR Particle Filter take as input parameters the user's prior knowledge of the states to be estimated.
Clearly, the user does not always know accurately what the state variables should be. The prior expected value for the state variable is the prior mean; when one is uncertain about one's knowledge of the state variable, one can increase the prior uncertainty, or prior variance. We used Gaussian variables to represent the possible values for the calibration coefficients, so the mean and variance describe the variables completely. One could also use a different distribution; all of the parameters which define the chosen distribution would then need to be given as input to the filter.

Parameter      Meaning                                               Setting
Nunits         Number of time units in simulation                    200
dt             Time step                                             0.1
Nsteps         Nunits/dt                                             2000
y0bar          µ parameter in LN distribution for initial condition  0.05
y0var          σ parameter in LN distribution for initial condition  0.7
α              Model error parameter                                 0.5
wbar           Model error mean                                      2
wvar           Model error variance                                  1
δ              Decay parameter                                       0.75
β0             Measurement bias                                      5
β1             Measurement gain                                      2
σv             Measurement variance                                  0.25
Meas spacing   Time steps between two measurements                   50

Figure 4.2: Parameters used in generating the true phenomenon. The true initial condition and the model error are a single instance drawn from the corresponding distribution.
Introducing uncertainty can hurt the ability of the filters to home in on the correct estimate. The work in this chapter aims to understand the tradeoffs and the relative performance of the EnKF vs. the SIR filter.
One important input to both filters is the measurement noise variance. In Figure 4.3, we illustrate an example of what can happen if we do not know the correct measurement noise variance a priori. This illustration uses the Ensemble Kalman Smoother, which estimates the parameters across the entire time window, whereas the Filter estimates online as measurements are taken. In this example, we used all the parameters as listed in Figure 4.2, except Meas spacing = 10, β0 = 0 and β1 = 1. We show only a limited section of the plot for a close-up view. As can be seen, with a higher measurement variance, the filter is less accurate. Additionally, if the prior variance is too small, the replicates do not even encompass the true phenomenon.
In addition to the noise variance, both algorithms also need the number of replicates to generate and propagate, Nreps. This is set to 400 throughout the experiments described in this chapter. Both filters also need a description of the prior distribution for every state variable they will estimate. It is very important to emphasize that we must either gather very good prior knowledge or allow more uncertainty in that knowledge, or else we may be misled by the output of the filter.
[Three panels of moisture vs. t, each showing true, measurements, posterior mean and replicates: "Measurement Variance = 0.25, Prior Variance = 0.25"; "Measurement Variance = 4, Prior Variance = 0.25"; "Measurement Variance = 4, Prior Variance = 4".]
Figure 4.3: Example of EnKF with different measurement variance and a priori
information.
Figure 4.4: An example of the SIR Filter estimate for the five state parameters.
4.4 Evaluation
In order to evaluate the ability of the EnKF and Particle Filter to estimate
measurement coefficients, we ran several simulations using the autoregressive
model. Here we show five experiments. In each experiment, we had uncertainty
in only one of the calibration parameters to be estimated, either the gain or the
offset. Recall that we are assuming the gain and offset are Gaussian distributed,
so we only vary the prior mean or the prior variance. Our results are particular for
uncertainty in one calibration parameter and also particular for the autoregressive
model we have chosen, but they do demonstrate some simple properties.
As an example of what the filter estimate may look like, see Figure 4.4. The five plots are each the estimates over time of one of the state variables, with the moisture state y in the top left, model error q below that, moisture decay δ in the bottom left, measurement gain β1 in the top right, and measurement offset β0 below that. This five-plot layout is the same for all the following plots showing the performance of estimation; we will always show plots for the five state variables in this order.
4.4.1 Estimation with Incorrect Prior Mean
For the first two experiments, we start by testing whether we can estimate the five
state variables even when we assume some wrong prior mean for the offset or gain
calibration parameters. That is, if we assume that offset parameter is zero, but in
reality there is an offset, will we be able to estimate the state variables, including
the correct calibration offset? Or if we assume that the gain parameter is one,
when in reality there is a gain, will we estimate the state variables correctly?
In the first experiment, we allow the prior mean for the calibration offset β0 to vary while holding the variance and other parameters constant. The parameters are listed here verbatim from the code. The variable names are mostly self-explanatory; keep in mind that the δ parameter is uniformly distributed, and lo and hi represent the interval over which it lies. In this case, δ was uniformly distributed between 0.5 and 1.
[Five-panel plot of RMS error (moisture, model error q, measurement gain, measurement offset, decay parameter): "Effect of Prior Offset Mean on RMS Error", with increasing difference between true and prior mean for measurement offset along the x-axis.]
Figure 4.5: RMS Error as the difference in the true and prior mean for the
measurement offset increases: 5, 50, 95% quantiles over 100 runs.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1;
Beta1_Var_vec = 0;
Beta0_Mean_vec = [beta0 beta0-2 beta0-5 beta0-10];
Beta0_Var_vec = 1;
Delta_lo_vec = .5;
Delta_hi_vec = 1;
Figure 4.5 shows the root mean square error of the estimate of the state
variables as a function of the difference between the true β0 and the prior mean
for β0.
The error calculation and plots are the same for all the experiments, so we explain them once here. We take the mean of all the replicates to be the final estimate of the state variables. The plots then show the RMS error between this estimate and the true values. We simulated 100 runs to see the effects; the plots show the 5%, 50% and 95% quantiles² of the resulting error from those 100 runs.
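The quantile summary behind these plots is simple to reproduce; here in Python with placeholder data (the gamma draw is an arbitrary stand-in for the 100 per-run RMS errors, not thesis results):

```python
import numpy as np

rng = np.random.default_rng(6)

# 100 per-run RMS errors (placeholder values)
errs = rng.gamma(2.0, 0.5, 100)

# The plots report the 5%, 50% and 95% quantiles of this vector.
q05, q50, q95 = np.quantile(errs, [0.05, 0.50, 0.95])
```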
In Figure 4.5, we have the results of this first experiment. We have plotted
these three quantiles for the RMS error of all five parameters which we are esti-
mating. The measurement gain error is flat at zero because we are holding this
parameter as known. As we expect, for the measurement offset, decay parameter,
and moisture state, the estimate error increases as the difference between the true
and prior mean increases. So as we make a worse and worse guess at what the
true value is for the offset calibration parameter, our estimation error gets worse.
2The 5% and 95% quantiles are close to the best- and worst-case error. The 50% quantile is the median error.
Interestingly, when we compare the Particle filter to the EnKF, we see that the
extreme cases are more spread apart for the Particle filter. The three quantiles
for the EnKF are quite close to each other in comparison. We can also observe
that the estimate of model error is the same for both the Particle filter and the
EnKF.
In the second experiment, we allow the mean of the gain parameter β1 to vary.
The gain calibration parameter is multiplicative with the moisture state variable
itself and thus also with the decay parameter, and so we expect the results to be
less straightforward.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = [beta1 beta1-.5 beta1-1 beta1-1.5];
Beta1_Var_vec = .4;
Beta0_Mean_vec = beta0;
Beta0_Var_vec = 0;
Delta_lo_vec = .5;
Delta_hi_vec = 1;
Figure 4.6 shows the root mean square error of the estimate of the state
variables as a function of the difference between the true β1 and the prior mean
for β1. As we can see, the results are similar to those of the first experiment.
However, the error in the moisture parameter increases more quickly. As we
said, the error in the decay parameter and the error in the measurement gain are
multiplicative, so this makes sense. Nevertheless, we are pleased to see that we are still
[Figure 4.6, "Effect of Prior Gain Mean on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing difference between true and prior mean for the measurement gain.]
Figure 4.6: RMS Error as the difference in the true and prior mean for the
measurement gain increases: 5, 50, 95% quantiles over 100 runs.
[Figure 4.7, "Effect of Prior Offset Variance on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing prior variance for the measurement offset.]
Figure 4.7: RMS Error as the prior variance for the offset measurement parameter
increases: 5, 50, 95% quantiles over 100 runs.
able to do a reasonable job of estimating the state vector, and that more correct
information improves our estimate.
4.4.2 Estimation with Changing Prior Variance
In the third experiment, we allow the variance of β0 to vary.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1;
Beta1_Var_vec = 0;
Beta0_Mean_vec = beta0-5;
Beta0_Var_vec = [1 2 4 5];
Delta_lo_vec = .5;
Delta_hi_vec = 1;
In Figure 4.7, we see that as we increase the prior variance, the estimation error
for the measurement offset parameter goes down. This is encouraging, because
it means that even when there is a lot of uncertainty in our prior knowledge,
the filters are able to get closer to the true parameter value. When our prior
uncertainty is too tight around the wrong value, we will converge on that wrong
value; this result shows that increasing the uncertainty can help in those
situations.
We can also observe that, for this experiment, the median-case error of the Particle
Filter is generally better than that of the EnKF. The worst-case error for the Particle
Filter, however, is still generally much higher than the EnKF error.
In the fourth experiment, we allow the variance of β1 to vary.
[Figure 4.8, "Effect of Prior Gain Variance on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing prior variance for the measurement gain.]
Figure 4.8: RMS Error as the prior variance for the gain measurement parameter
increases: 5, 50, 95% quantiles over 100 runs.
beta1 = 2;
beta0 = 5;
delta = .75;
Meas_spacing_vec = 50;
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1-1;
Beta1_Var_vec = [.1 .4 .7 1];
Beta0_Mean_vec = beta0;
Beta0_Var_vec = 0;
Delta_lo_vec = .5;
Delta_hi_vec = 1;
Figure 4.8 shows a slightly depressing result, especially after we saw with the
offset estimation that increasing the prior variance reduces the estimation error. When it
comes to the gain estimation, though the errors in both the gain parameter and
the decay parameter decrease, the estimation error for the moisture variable itself
decreases very little. This of course makes sense, as the errors in the gain and
decay parameters will be multiplicative in the moisture state estimation error.
4.4.3 Estimation with Frequent Updates
In the fifth experiment, we examined what a change in the number of measure-
ments would do to help our estimate.
[Figure 4.9, "Effect of Measurement Frequency on RMS Error": panels for moisture, model error q, measurement gain, measurement offset, and decay parameter, RMS error versus increasing number of timesteps between measurements.]
Figure 4.9: RMS Error as the spacing between measurement updates increases:
5, 50, 95% quantiles over 100 runs.
beta1 = 2;
beta0 = 5;
delta = .85;
Meas_spacing_vec = [25 50 100 200 300 400 500];
Meas_Noise_vec = 0.25;
Beta1_Mean_vec = beta1-1;
Beta1_Var_vec = .2;
Beta0_Mean_vec = beta0-5;
Beta0_Var_vec = 2;
Delta_lo_vec = .75;
Delta_hi_vec = .95;
Figure 4.9 shows another disappointing result from this experiment. The error
remains nearly flat for all the parameters. It should surely be the case, however,
that having measurements more often can only help the estimation, especially
the estimation of the calibration parameters themselves. We believe that this
reflects some problems with the way we chose to update the parameters. We
only used the measurements to directly update the moisture state variable, and
we updated the other state variables only indirectly. A more careful choice of
update for the calibration parameters is the main next step for this work.
4.4.4 Future Work and Discussion
This evaluation only shows us the effect of uncertainty in a single input parameter
at a time. In reality, we may have very little information and we may have un-
certainty about several input parameters. The non-linearity of the model makes
it difficult to say anything general about how this might affect the outcome of
the estimation. In turn, the model itself and all its intricacies will have a large
effect on whether the calibration parameter estimation will work. We advocate
a careful evaluation of each input parameter on the specific model implemented.
As we stated in the description of the fifth experiment, we believe that a main
next step is to more carefully design the update step so that both the moisture
state variable and the calibration parameters get directly updated. In the Particle
Filter we implemented for this thesis, during the update step we took the most
recent estimate of the calibration gain and offset and used them directly in the
calculation of the measurement; when the best replicates were chosen, this implicitly
favored the replicates with correct calibration gains and offsets. The
EnKF has a structure that insists on the update of all state variables in each
update step; however, the update matrix cannot be perfectly correct as is, because
it would need to be nonlinear. If we were to do a proper update, we would
have a measurement model which directly incorporates the estimated gain and
offset, and the update would take advantage of the direct relationship between
the measurements and the calibration coefficients.
In addition to the EnKF and the SIR Particle Filter, we also implemented the
Ensemble Kalman Smoother. A Smoother is a type of filter that has some delay
because it does the estimate update over a window of measurements instead of as
each measurement comes in. Hybrid smoother-filter algorithms, where a moving
window of a small number of measurements is used in order to estimate the
calibration parameters, are of interest. This is a sensible approach if we assume
that the calibration parameters are not changing over some short time interval.
There are many ways to discuss the accuracy of the estimate, and we have
made two important choices here. First, we must have a choice of error function
to evaluate the performance of our estimator. In the simulations shown, we used
the RMS error. We could instead look at many other measures of error. Another
important measure would be whether or not the truth lies within the uncertainty
bounds of the smoother. This would tell us if we are being conservative enough
with our prior uncertainty. The second choice we made was to take the mean of
the replicates as our final estimate. Depending on how we believe the replicates
are distributed, taking the mean of the replicates may not be the right choice; we
could instead take another measure from the replicates such as the maximum a
posteriori value or the median value. Understanding the context for these choices
is extremely important, and the choices should be made carefully for any given
model.
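To illustrate why this choice matters, here is a small hypothetical NumPy sketch (the thesis code is in Matlab): for an artificially skewed, lognormal ensemble of replicates, taking the mean and taking the median give noticeably different final estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical skewed ensemble of replicates for a single state variable.
# A lognormal distribution has mean exp(1/2) but median 1, so the two
# candidate point estimates disagree.
replicates = np.exp(rng.normal(0.0, 1.0, size=10_000))

mean_estimate = replicates.mean()
median_estimate = np.median(replicates)
print(mean_estimate, median_estimate)
```

For a symmetric replicate distribution the two estimates would nearly coincide; the skew is exactly what makes the choice consequential.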
CHAPTER 5
Estimation of Calibration Parameters using
Subspace Matching
This chapter presents collaborative work with Professor Robert Nowak in the
Electrical Engineering Department at University of Wisconsin, Madison. This
work was first published as a paper in the Conference on Information Processing
in Sensor Networks, 2007 [3].
This chapter presents a more general problem formulation and approach to
blind calibration. Instead of assuming a particular dynamical model, as we did
in Chapter 4, we assume a model for the subspace of the vector of sensor mea-
surements from a given time instant. We will call this vector a “snapshot.”
One approach to blind sensor network calibration is to begin by assuming that
the deployment is very dense, so that neighboring nodes should (in principle) have
nearly identical readings [7]. Unfortunately, many existing and envisioned sensor
network deployments may not meet the density requirements of such procedures.
However, we can view this choice in another way: as identifying a set of
constraints that the sensor measurements should satisfy.
In the case of [7] the difference between two neighboring sensor measurements
should be approximately zero. We could also choose another set of constraints,
for example that the values of three consecutive sensors lie on a line, i.e. the
second derivative is approximately zero. All of these choices define a subspace
53
in which the sensor measurements should lie. In this chapter we discuss whether
gain and offset calibration coefficients can be identified given a general subspace
for the sensor measurements.
5.1 Problem Formulation
Consider a network of n sensors. At a given time instant, each sensor makes a
measurement, and we denote the vector of n measurements by x = [x(1), . . . , x(n)]′,
where ′ denotes the vector transpose operator (so that x is an n× 1 column vec-
tor). We will refer to x as a “snapshot.” When necessary, we will distinguish
between snapshots taken at different times using a subscript (e.g., xs and xt are
snapshots at times s and t).
Each sensor has an unknown gain and offset associated with its response, so
that instead of measuring x the sensors report
y(j) = (x(j) − β(j))/α(j),   j = 1, . . . , n
where α = [α(1), . . . , α(n)]′ are the sensors’ gain calibration factors and β =
[β(1), . . . , β(n)]′ are the sensors’ calibration offsets. It is assumed that α(j) ≠ 0,
j = 1, . . . , n. With this notation, the sensor measurement y(j) can be calibrated
by the linear transformation x(j) = α(j)y(j) + β(j). We can summarize
this for all n sensors using the vector notation
x = Y α + β , (5.1)
where Y = diag(y) and the diag operator maps the vector y to the n × n diagonal
matrix with entries y(1), . . . , y(n) on the diagonal.
The blind calibration problem entails the recovery of α and β from routine un-
calibrated sensor readings such as y.
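The measurement model and the calibration map (5.1) can be sanity-checked with a short NumPy sketch (all values hypothetical; the thesis code is in Matlab):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Hypothetical true signal, gains, and offsets for an n-sensor snapshot.
x = rng.normal(size=n)
alpha = rng.uniform(0.5, 2.0, size=n)   # alpha(j) != 0 for every sensor
beta = rng.normal(size=n)

y = (x - beta) / alpha                   # what the uncalibrated sensors report
x_cal = np.diag(y) @ alpha + beta        # x = Y alpha + beta, with Y = diag(y)

print(np.allclose(x_cal, x))
```

Applying (5.1) to the raw readings recovers the true snapshot exactly; blind calibration is the problem of finding α and β without access to x.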
When we look at the problem like this, we can see that without further as-
sumptions, blind calibration is an impossible task. Other work that we discussed
in Section 2, such as [13, 16], takes a very similar problem formulation and makes
particular assumptions so that it can be solved. Those assumptions do not help
us in blind calibration; however, it turns out that under a different set of mild
assumptions that may often hold in practice, quite a bit can be learned from raw
(uncalibrated) sensor readings like y in order to do blind calibration.
Assume that the sensor network is slightly “oversampling” the phenomenon
being sensed. Mathematically, this means that the calibrated snapshot x lies
in a lower dimensional subspace of n-dimensional Euclidean space. Let S de-
note this “signal subspace” and assume that it is r-dimensional, for some integer
0 < r < n. For example, if the signal being measured is bandlimited and the sen-
sors are spaced closer than required by the Shannon-Nyquist sampling rate, then
x will lie in a lower dimensional subspace spanned by frequency basis vectors. If
we oversample (relative to Shannon-Nyquist) by a factor of 2, then r = n/2. Basis
vectors that correspond to smoothness assumptions, such as low-order polyno-
mials, are another potentially relevant example. In general, the signal subspace
may be spanned by an arbitrary set of r basis vectors. The calibration coeffi-
cients α and β and the signal subspace S may change over time, but here we
assume they do not change over the course of blind calibration. As we will see,
this is a reasonable assumption, since the network may be calibrated from very
few snapshots.
Let P denote the orthogonal projection matrix onto the orthogonal complement
to the signal subspace S. Then every x ∈ S must satisfy the constraint
Px = P (Y α + β) = 0 (5.2)
This is the key idea behind our blind calibration method. Because the projection
matrix P has rank n−r, the constraint above gives us n−r linearly independent
equations in 2n unknown values (α and β). If we take snapshots from the sensor
network at k distinct times, y1, . . . , yk, then we will have k(n − r) equations in
2n unknowns. For k ≥ 2n/(n − r) we will have more equations than unknowns,
which is a hopeful sign. This observation leads to several basic questions which
we address in this chapter.
1. Is it possible to blindly recover α and β from a sufficient number of un-
calibrated sensor snapshots? Mathematically, this question boils down to
determining whether or not the constraints provide 2n linearly independent
equations.
2. If perfect blind calibration is not possible, then can we achieve a partial cali-
bration from the raw data? Can we improve this partial calibration with a
small amount of additional overhead?
3. How is the recovery affected by sensor noise? Certainly, we cannot expect the
constraint (5.2) to hold perfectly in the presence of noise, so robust versions
of the problem need to be developed.
4. How is the recovery affected by mismodeling in P ? Again, robust versions
of the problem are necessary to cope with cases where the signals are not
perfectly lying in the subspace.
5.2 Blind Calibration
Given k snapshots at different time instants y1, . . . , yk, the subspace constraint
(5.2) results in the following system of k(n − r) equations:
P (Y i α + β) = 0 , i = 1, . . . , k (5.3)
The true gains and offsets must satisfy this equation, but in general the equation
may be satisfied by other vectors as well. Establishing conditions that guarantee
that the true gains and/or offsets are the only solutions is the main theoretical
contribution of the paper [3].
It is easy to verify that the solutions for β satisfy
Pβ = −P Ȳ α (5.4)
where Ȳ = (1/k) ∑_{i=1}^{k} Yi is the time-average of the snapshots. One immediate
observation is that the constraints only determine the components of β (in terms of
the data and α) in the signal “nullspace” (the orthogonal complement to S). The
component of the offset β that lies in the signal subspace is unidentifiable. This
is intuitively very easy to understand. Our only assumption is that the signals
measured by the network lie in a lower dimensional subspace. The component
of the offset in the signal subspace is indistinguishable from the mean or average
signal. Recovery of this component of the offset requires extra assumptions, such
as assuming that the signals have zero mean, or additional calibration resources,
such as the non-blind calibration of some of the sensor offsets. We discuss this
further in Section 5.3.
Given this characterization of the β solutions, we can re-write the constraints
(5.3) in terms of α alone:
P (Yi − Ȳ)α = 0 ,   i = 1, . . . , k (5.5)
If α is a solution to this system of equations, then every vector β satisfying
Pβ = −P Ȳ α is a solution for β in the original system of equations (5.3).
In other words, for a given α, the value of the component of the offset in the
nullspace is −P Ȳ α.
Another simple but very important observation is that there is one degree of
ambiguity in α that can never be resolved blindly using routine sensor measure-
ments alone. The gain vector α can be multiplied by a scalar c, and it cannot
be distinguished whether this scalar multiple is part of the gains or part of the
true signal. We call this scalar multiple the global gain factor. A constraint is
needed to avoid this ambiguity, and without loss of generality we will assume
that α(1) = 1. This constraint can be interpreted physically to mean that we
will calibrate all other sensors to the gain characteristics of sensor 1. The choice
of sensor 1 is arbitrary and is taken here simply for convenience.
If noise, mismodeling effects, or other errors are present in the uncalibrated
sensor snapshots, then a solution to (5.3) or (5.5) may not exist. Robust solutions
are discussed in Section 5.5.1.
5.3 Offset Calibration
The component of the offset in the signal subspace is generally unidentifiable,
but in special cases it can be determined. For example, if it is known that the
phenomenon of interest fluctuates symmetrically about zero (or some other known
value), then the average of many measurements will tend to zero (or the known
mean value). In this situation, the average
(1/k) ∑_{i=1}^{k} yi = ((1/k) ∑_{i=1}^{k} xi − β)/α ≈ −β/α
where the division operation is taken element-by-element. This follows since
(1/k) ∑_{i=1}^{k} xi ≈ 0 for large enough k. Thus we can identify the offset simply by
calculating the average of our measurements. More precisely, we can identify
β̄ = β/α, which suffices since we can equivalently express the basic relationship
(5.1) between calibrated and uncalibrated snapshots as x = (Y + diag(β̄)) α.
Another situation in which we can determine (or partially determine) the
component of the offset in the signal subspace is when we have knowledge of
the correct offsets for a subset of the sensors. We call this partially blind offset
calibration. In [3], we discuss the details of partially blind offset calibration.
Here we summarize by giving the formula for computation. If we let βm be
the vector of known offset calibration coefficients and T be a selection matrix
that selects the columns associated with those coefficients, then we will get the
following equation for a length-r vector of parameters θ:
θ = [TΦ]⁻¹(βm + T P Ȳ α) (5.6)
Recall that the columns of Φ are the basis vectors for the signal subspace.
That is, (I − P ) = ΦΦ′. When we know θ, we can solve for the vector of offset
coefficients β with the equation
θ = Φ′β
Note that in order to solve Equation 5.6, [TΦ] must be invertible. The rank
of ΦT cannot be greater than m, the number of known sensor offsets, which
shows that to completely determine the offset component in the signal subspace
we require at least m = r known offsets. In general, knowing the offsets for
an arbitrary subset of m sensors may not be sufficient (i.e., ΦT may not be
invertible), but there are important special cases when it is. First note that Φ, by
construction, has full rank r. Also note that the selection matrix T selects the m
rows corresponding to the known calibration offsets and eliminates the remaining
n−m rows. So, we require that the elimination of any subset of n−m rows of Φ
does not lead to a linearly dependent set of (m × 1) columns. This requirement
is known as an incoherence condition, and it is satisfied as long as the signal
basis vectors all have small inner products with the natural or canonical sensor
basis (n × 1 vectors that are all zero except for a single non-zero entry). For
example, frequency vectors (e.g., Discrete Fourier Transform vectors) are known
to satisfy this type of incoherence condition [8]. This implies that for subspaces
of bandlimited signals, ΦT is invertible provided m ≥ r.
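This invertibility condition is easy to check numerically. In the following hypothetical NumPy sketch, Φ consists of the r lowest-frequency DFT vectors, and the submatrix obtained by keeping the rows for m = r sensors with known offsets (an arbitrary choice of sensors) turns out to be full rank:

```python
import numpy as np

n, r = 16, 4

# Orthonormal DFT matrix: F[j, m] = exp(-2*pi*i*j*m/n)/sqrt(n).
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
Phi = F[:, :r]                       # n x r lowpass signal-subspace basis

# T selects the rows of Phi for the m sensors with known offsets.
known = [0, 5, 9, 12]                # hypothetical set of m = r sensors
TPhi = Phi[known, :]                 # m x r submatrix

print(np.linalg.matrix_rank(TPhi))   # full rank r: offsets are recoverable
```

For DFT columns the rows of Φ form a Vandermonde system with distinct nodes, which is why any choice of r distinct sensors works here.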
5.4 Gain Calibration
The possibilities for offset calibration are fairly straightforward, as described
above, but conditions that guarantee that the gains can be blindly calibrated
are less obvious. This section theoretically characterizes the existence of unique
solutions to the gain calibration problem. As pointed out in Section 5.2, the gain
calibration problem can be solved independently of the offset calibration task, as
shown in (5.5), which corresponds to simply removing the mean snapshot from
each individual snapshot. Therefore, it suffices to consider the case in which
the snapshots are zero-mean and to assume that Ȳ = 0, in which case the gain
calibration equations may be written as
P Y i α = 0 , i = 1, . . . , k (5.7)
The results we present also hold for the general case in which Ȳ ≠ 0. We first
consider general conditions guaranteeing the uniqueness of the solution to (5.7)
and then look more closely at the special case of bandlimited subspaces.
5.4.1 General Conditions
The following conditions are sufficient to guarantee that a unique solution to (5.7)
exists.
A1. Oversampling: Each signal x lies in a known r-dimensional subspace S,
r < n. Let φ1, . . . , φr denote a basis for S. Then x = ∑_{i=1}^{r} θi φi, for
certain coefficients θ1, . . . , θr.
A2. Randomness: Each signal is randomly drawn from S and has mean zero.
This means that the signal coefficients are zero-mean random variables. The
joint distribution of these random variables is absolutely continuous with
respect to Lebesgue measure (i.e., a joint r-dimensional density function
exists). For any collection of signals x1, . . . , xk, k > 1, the joint distribu-
tion of the corresponding kr coefficients is also absolutely continuous with
respect to Lebesgue measure (i.e., a joint kr-dimensional density function
exists).
A3. Incoherence: Define the nr × n matrix
MΦ = [P diag(φ1) ; . . . ; P diag(φr)] (5.8)
(the r matrices P diag(φi) stacked vertically)
and assume that rank(MΦ) = n−1. Note that MΦ is a function of the basis
of the signal subspace. The matrix P , the orthogonal projection matrix
onto the orthogonal complement to the signal subspace S, can be written
as P = I −ΦΦ′, where I is the n×n identity matrix and Φ = [φ1, . . . , φr].
Assumption A1 guarantees that the calibrated or true sensor measurements
are correlated to some degree. This assumption is crucial since it implies that
measurements must satisfy the constraints in (5.3) and that, in principle, we
can solve for the gain vector α. Assumption A2 guarantees the signals are not
too temporally correlated (e.g., different signal realizations are non-identical with
probability 1). Also, the zero-mean assumption can be removed, as long as one
subtracts the average from each sensor reading. Assumption A3 essentially guar-
antees that the basis vectors are sufficiently incoherent with the canonical sensor
basis, i.e., the basis that forms the columns of the identity matrix. It is easy
to verify that if the signal subspace basis is coherent with the canonical basis,
then rank(MΦ) < n − 1. Also, note that MΦ1 = 0, where 1 = [1, . . . , 1]′, which
implies that rank(MΦ) is at most n−1. In general, assumption A3 only depends
on the assumed signal subspace and can be easily checked for a given basis. In
our experience, the condition is satisfied by most signal subspaces of practical
interest, such as lowpass, bandpass or smoothness subspaces.
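Assumption A3 can be checked numerically for a given basis by forming MΦ and computing its rank. A hypothetical NumPy sketch for a small lowpass subspace (constant, one cosine, one sine), where the rank comes out to n − 1 and A3 holds:

```python
import numpy as np

n, r = 12, 3

# Hypothetical lowpass basis: constant, lowest cosine, lowest sine.
t = np.arange(n)
Phi = np.column_stack([np.ones(n),
                       np.cos(2 * np.pi * t / n),
                       np.sin(2 * np.pi * t / n)])
Phi, _ = np.linalg.qr(Phi)              # orthonormalize the basis
P = np.eye(n) - Phi @ Phi.T             # projection onto the complement of S

# Stack the P diag(phi_i) blocks into the nr x n matrix M_Phi of (5.8)
M = np.vstack([P @ np.diag(Phi[:, i]) for i in range(r)])

print(np.linalg.matrix_rank(M))         # A3 holds iff this equals n - 1
```

The all-ones vector always lies in the nullspace of MΦ, so n − 1 is the largest rank possible; the check confirms it is attained for this basis.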
Theorem 1 Under assumptions A1, A2 and A3, the gains α can be perfectly
recovered from any k ≥ r signal measurements by solving the linear system of
equations (5.5).
This theorem and the following proof demonstrate that the gains are identi-
fiable from routine sensor measurements; that is, in the absence of noise or other
errors, the gains are perfectly recovered. In fact, the proof shows that under A1
and A2, the condition A3 is both necessary and sufficient. When noise and er-
rors are present, the estimated gains may not be exactly equal to the true gains.
However, as the noise/errors in the measurements tend to zero, the estimated
gains tend to the true gains.
Proof First note that the case where the signal subspace is one-dimensional
(r = 1) is trivial. In this case there is one degree of freedom in the signal, and
hence one measurement coupled with the constraint that α(1) = 1 suffices to
calibrate the system. For the rest of the proof we assume that 1 < r < n and
thus 2 ≤ k < n.
Given k signal observations y1, . . . , yk, and letting α represent our estimated
gain vector, we need to show that the system of equations
[P Y1 ; . . . ; P Yk] α = 0 (5.9)
has rank n − 1, and hence may be solved for the n − 1 degrees of freedom in α.
Note each subsystem of equations, PY j, has rank less than or equal to n−r (since
P is rank n − r). Therefore, if k < (n − 1)/(n − r), then the system of equations certainly
has rank less than n − 1. This implies that it is necessary that k ≥ (n − 1)/(n − r). Next note
that Y j = XjA, where Xj = diag(xj) and A = diag([1, 1/α(2), . . . , 1/α(n)]′).
Then write
[P X1 ; . . . ; P Xk] d = 0 (5.10)
where d = Aα. The key observation is that satisfaction of these equations
requires that Xjd ∈ S, for j = 1, . . . , k. Any d that satisfies this relationship
will imply a particular solution for α, and thus d must not be any vector other
than the all-ones vector for blind calibration to be possible.
Recall that by definition Xj = diag(xj). Also note that diag(xj)d = diag(d)xj.
So we can equivalently state the requirement as
diag(d)xj ∈ S, j = 1, . . . , k . (5.11)
The proof proceeds in two steps. First, A2 implies that k ≥ r signal observa-
tions will span the signal subspace with probability 1. This allows us to re-cast
the question in terms of a basis for the signal subspace, rather than particular
realizations of signals. Second, it is shown that A3 (in terms of the basis) suffices
to guarantee that the system of equations has rank n − 1.
Step 1: We will show that all solutions to (5.11) are contained in the set
D = {d : diag(d)φi ∈ S, i = 1, . . . , r}.
We proceed by contradiction. Suppose that there exists a vector d that satisfies
(5.11) but does not belong to D. Since d satisfies (5.11), we know that there
exists an x ∈ S such that diag(d)x ∈ S. We can write x in terms of the basis, as
x = ∑_{i=1}^{r} θi φi, and diag(d)x = ∑_{i=1}^{r} θi diag(d)φi. Since by assumption d does
not satisfy diag(d)φi ∈ S, i = 1, . . . , r, it follows that the coefficients θ1, . . . , θr
must weight the components outside of the signal subspace so that they cancel
out. In other words, the set of signals x ∈ S that satisfy diag(d)x ∈ S is a
proper subspace (of dimension less than r) of the signal subspace S. However, if
we make k ≥ r signal observations, then with probability 1 they collectively span
the entire signal subspace (since they are jointly continuously distributed). In
other words, the probability that all k measurements lie in a lower dimensional
subspace of S is zero. Thus, d cannot be a solution to (5.11).
Step 2: Now we characterize the set D. First, observe that the vectors d ∝ 1,
the constant vector, are contained in D, and those correspond to the global gain
factor ambiguity discussed earlier. Second, note that every d ∈ D must satisfy
P diag(d)φi = P diag(φi)d = 0, i = 1, . . . , r, where P denotes the projection
matrix onto the orthogonal complement to the signal subspace S. Using the
definition of MΦ given in (5.8), we have the following equivalent condition: every
d ∈ D must satisfy MΦd = 0. We know that the vectors d ∝ 1 satisfy this
condition. The condition rank(MΦ) = n − 1 guarantees that these are the only
solutions. This completes the proof.
5.4.2 Bandlimited Subspaces
In the special case in which the signal subspace corresponds to a frequency domain
subspace, a slightly more precise characterization is possible which shows that
even fewer snapshots suffice for blind calibration. As stated earlier, assumption
A3 is often met in practice and can be easily checked given a signal basis Φ. One
case where A3 is automatically met is when the signal subspace is spanned by a
subset of the Discrete Fourier Transform (DFT) vectors:
φm = (1/√n) [1, e^{−i2πm/n}, . . . , e^{−i2π(n−1)m/n}]′,   m = 0, . . . , n − 1
In this case we are able to show in [3] that only ⌈(n−1)/(n−r)⌉ + 1 snapshots are required.
This can be significantly less than r, meaning that the time over which we must
assume that the subspace and calibration coefficients are unchanging is greatly
reduced. See [2] for the necessary assumptions, which are very similar to the
assumptions for general identifiability, and the accompanying proof.
5.5 Evaluation
In order to evaluate whether this approach to blind calibration is feasible in practice,
we explore its performance in simulation under both measurement noise and the
mis-characterization of the projection matrix P . Additionally, we show the per-
formance of the algorithm on two temperature sensor datasets, one dataset from a
controlled experiment where the sensors are measuring all the same phenomenon
and thus lie in a 1-dimensional subspace, and the other from a deployment in
a valley at a nature preserve called the James Reserve1, where the true dimen-
sion of the spatial signal is unknown. First, we discuss the technical tools for
implementation of robust blind calibration.
1http://www.jamesreserve.edu
5.5.1 Robust Estimation
Blind calibration is simply a problem of solving the linear system of equations in
(5.5). If noise, mismodeling effects, or other errors are present in the uncalibrated
sensor snapshots, then a solution to (5.5) may not exist. There are many methods
for finding the best possible solution, and we employ singular value decomposition
and standard least squares techniques.
First, note that the constraints can be expressed as
C α = 0 (5.12)
where the matrix C is given by
C = [P (Y1 − Ȳ) ; . . . ; P (Yk − Ȳ)] (5.13)
In the ideal case, there is always at least one solution to the constraint Cα = 0,
since the true gains must satisfy this equation. On the other hand, if the sensor
measurements contain noise or if the assumed calibration model or signal subspace
is inaccurate, then a solution may not exist. That is, the matrix C may have
full column rank and thus will not have a right nullspace. A reasonable robust
solution in such cases is to find the right singular vector of C associated with the
smallest singular value. This vector is the solution to the following optimization.
α̂ = arg min_α ‖Cα‖₂² (5.14)
In other words, we find the vector of gains such that Cα is as close to zero as
possible. This vector can be efficiently computed in numerical computing envi-
ronments, such as Matlab, using the economy size singular value decomposition
(svd)2. Note that in the ideal case (no noise or error) the svd solution satisfies
(5.5). Thus, this is a general-purpose solution method.
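The following hypothetical NumPy sketch mirrors this SVD procedure (the thesis uses Matlab's svd(C, 0)): noiseless snapshots are drawn from an assumed lowpass subspace, the constraint matrix C is assembled from the mean-removed snapshots, and the gains are recovered up to the global gain factor from the last right singular vector.

```python
import numpy as np

rng = np.random.default_rng(7)
n, r, k = 12, 3, 8

# Hypothetical signal subspace S: constant, lowest cosine, lowest sine.
t = np.arange(n)
Phi, _ = np.linalg.qr(np.column_stack(
    [np.ones(n), np.cos(2 * np.pi * t / n), np.sin(2 * np.pi * t / n)]))
P = np.eye(n) - Phi @ Phi.T

alpha = rng.uniform(0.5, 2.0, size=n)
X = Phi @ rng.normal(size=(r, k))      # k calibrated zero-mean snapshots in S
Y = X / alpha[:, None]                 # uncalibrated readings (beta = 0 here)

# Stack the constraints P (Y_i - Ybar) alpha = 0 into C alpha = 0
Ybar = Y.mean(axis=1)
C = np.vstack([P @ np.diag(Y[:, i] - Ybar) for i in range(k)])

# Robust solution: right singular vector with the smallest singular value
_, _, Vt = np.linalg.svd(C)
alpha_hat = Vt[-1]
alpha_hat = alpha_hat / alpha_hat[0]   # resolve the global gain: alpha(1) = 1

print(np.allclose(alpha_hat, alpha / alpha[0], atol=1e-6))
```

With no noise the smallest singular value is zero and the recovery is exact; with noise the same singular vector gives the minimizer of (5.14).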
Blind calibration of the gains can also be implemented by solving a system of
equations in a least-squares sense as follows. Recall that we have one constraint on
our gain vector, α(1) = 1. This can be interpreted as knowing the gain coefficient
for the first sensor. We can use this knowledge as an additional constraint on the
solution. If we let c₁, …, cₙ be the columns of C, let ᾱ be the gain vector with
α(1) removed, and let C̄ be the matrix C with the first column removed, we can
rewrite the system of equations as C̄ᾱ = −c₁. The robust solution is the value
of ᾱ that minimizes the LS criterion ‖C̄ᾱ + c₁‖₂².
More generally, we may know several of the gain coefficients, for what we call
partially blind calibration. Let h be the sum of the α(i)cᵢ corresponding to the
known gains, let ᾱ be the gain vector with the known gains α(i) removed, and
let C̄ be the matrix C with those columns cᵢ removed. Now we have C̄ᾱ = −h
and the robust solution is the minimizer of

‖C̄ᾱ + h‖₂²    (5.15)
We can solve this optimization in a numerically robust manner by avoiding the
squaring of the matrix C̄ that is implicit in the conventional LS solution,
ᾱ = (C̄′C̄)⁻¹C̄′(−h). This “squaring” effectively worsens the condition number of
the problem and can be avoided by using QR decomposition techniques.3
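To see the effect of the squaring in numbers, consider the following sketch; the matrix and its conditioning are invented for illustration. `numpy.linalg.lstsq`, like Matlab's backslash, solves the problem through an orthogonal factorization rather than by forming C′C.

```python
import numpy as np

rng = np.random.default_rng(0)
# A hypothetical ill-conditioned constraint matrix and a consistent system.
C = rng.standard_normal((200, 50)) @ np.diag(np.logspace(0, -4, 50))
a_true = rng.standard_normal(50)
h = -C @ a_true                    # chosen so that C a = -h holds exactly

# Forming C'C squares the condition number of the problem:
print(np.linalg.cond(C) ** 2)      # roughly the conditioning of C'C

# An orthogonal-factorization solve avoids the squaring entirely:
a_hat, *_ = np.linalg.lstsq(C, -h, rcond=None)
```

With a condition number near 10⁸ after squaring, the normal-equations route loses roughly half the available floating-point digits, while the factorization-based solve does not.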
5.5.2 Simulations
To test the blind calibration methods on simulated data, we simulated both
a field and snapshots of that field. We generated gain and offset coefficients,
2. The Matlab command is svd(C, 0).
3. We used ᾱ = C̄\(−h) in Matlab.
Figure 5.1: Two example simulated square fields. On the left, a 256 × 256 field
generated with a basic smoothing kernel, which represents a true continuous field.
On the right, an 8 × 8 grid of measurements of the same field. The fields can
be quite dynamic and still meet the assumptions for blind calibration. The fields
are shown in pseudocolor, with red denoting the maximum valued regions and
blue denoting the minimum valued regions.
measurement noise, and most importantly, a projection matrix P .
We simulated a smooth field by generating a 256 × 256 array of pseudorandom
Gaussian noise (i.e., a white-noise field) and then convolving it with the smooth
impulse response function h(i, j) = e^(−s((i−l/2)² + (j−l/2)²)), s > 0, where
l = 256 is the side length of the field. Figure 5.1 shows an example field with the
smoothing parameter s = 1, which could represent a smoothly varying temperature
field, for example. We simulated sensor measurements by sampling the field on a
uniform 8 × 8 grid of n = 64 sensors. For gains, we drew uniformly from
α ∈ [0.5, 1.5], and for offsets from β ∈ [−0.5, 0.5]. After applying α and β to the
measurements, we then added Gaussian noise with mean zero and variance σ.
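The simulation just described can be sketched as follows. This is an illustrative reconstruction, not the original code, and the convention y = αx + β for applying the coefficients is our assumption; the thesis does not spell out the direction.

```python
import numpy as np

rng = np.random.default_rng(0)
l, s = 256, 1.0

# White-noise field convolved with h(i,j) = exp(-s((i-l/2)^2 + (j-l/2)^2)).
white = rng.standard_normal((l, l))
idx = np.arange(l)
h = np.exp(-s * ((idx[:, None] - l / 2) ** 2 + (idx[None, :] - l / 2) ** 2))
# Circular convolution via the FFT; ifftshift recenters the kernel at the origin.
field = np.fft.ifft2(np.fft.fft2(white) * np.fft.fft2(np.fft.ifftshift(h))).real

# Sample the field on a uniform 8x8 grid of n = 64 sensors.
step = l // 8
x = field[step // 2::step, step // 2::step].ravel()

# Draw per-sensor gains and offsets, apply them, then add measurement noise.
alpha = rng.uniform(0.5, 1.5, x.size)
beta = rng.uniform(-0.5, 0.5, x.size)
sigma = 1e-3                      # noise variance
y = alpha * x + beta + np.sqrt(sigma) * rng.standard_normal(x.size)
```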
Separately, we created P to be a low-pass DFT matrix. We kept 3 frequencies
in 2d, which means with symmetries we have an r = 49-dimensional subspace.4
With this setup, we can adjust the parameters of the smoothing kernel, while
keeping P constant, to test robustness of blind calibration to an assumed sub-
space model that may over- or under-estimate the dimension of the subspace of
the true field. The smoothing kernel and projection P both characterize lowpass
effects, but the smoothing operator is only approximately described by the pro-
jection operator, even in the best case. We can also create our field by projecting
the random field onto the r-dimensional subspace using P ; this represents the
case where the true subspace is known exactly.
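A low-pass DFT projection of this kind can be constructed explicitly. Below is a numpy sketch (our own, matching the setup rather than the thesis's code) for the 8 × 8 grid, keeping frequencies 0, ±1, ±2, ±3 in each dimension, which gives the (2p + 1)² = 49 dimensions noted above.

```python
import numpy as np

l, p = 8, 3
keep = np.zeros(l)
keep[np.r_[0:p + 1, l - p:l]] = 1       # frequencies 0, ±1, ..., ±p (indices mod l)
mask = np.outer(keep, keep).ravel()     # (2p+1)^2 = 49 of the 64 2-D frequencies

F = np.fft.fft(np.eye(l)) / np.sqrt(l)  # unitary 1-D DFT matrix
F2 = np.kron(F, F)                      # 2-D DFT acting on the flattened grid
P = ((F2.conj().T * mask) @ F2).real    # F2^H diag(mask) F2; symmetric mask => real
```

Because the kept frequency set is symmetric under negation, the projection matrix is real and symmetric, as a projection of real-valued fields must be.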
Estimates of the gains and offsets were calculated using the methods discussed
above and described in more detail below. For all the results, we calculated the
average error per sensor in the estimate α̂, and similarly in the estimate β̂, as
follows:

errα = ‖α̂ − α‖₂² / n    (5.16)

In order to interpret the error results, keep in mind the ranges of α and β.
For gain, a 1% error will be approximately 10−2, and a 1% error in offset will be
approximately 10−3.
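Equation (5.16) is simple enough to state as code; the helper name is ours:

```python
import numpy as np

def avg_error_per_sensor(est, true):
    """Squared-error-per-sensor metric from equation (5.16)."""
    est, true = np.asarray(est, float), np.asarray(true, float)
    return np.sum((est - true) ** 2) / true.size
```

For example, a uniform 0.01 gain error across all n = 64 sensors evaluates to 10−4 under this metric.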
5.5.2.1 Error Results using SVD
We simulated blind calibration with the described simulation set-up. We first
generated mean-zero fields using our smoothing kernel and took snapshot
measurements of each field. We used k = 3r snapshots (three times the theoretical
minimum of k = r) for added robustness to noise and modeling errors.
4. If the 2-dimensional signal has p frequencies, then the subspace is of rank r = (2p + 1)².
[Figure: left, “Gain Error in Noise”; right, “Offset Error in Noise”. Both plot average error per sensor against noise variance on a log scale, with mean and median curves.]
Figure 5.2: Gain and offset error performance with exact knowledge of P and
increasing measurement noise. The results show the mean and median error over
100 simulation runs.
[Figure: left, “Gain Error v. Modeling Error”; right, “Offset Error v. Modeling Error”. Both plot average error per sensor against the amount of signal outside of the subspace, with mean and median curves.]
Figure 5.3: Gain and offset error performance for mismodeled P and zero mea-
surement noise. The results show the mean and median error for 100 simulation
runs.
Then we constructed the matrix C from equation (5.13) and took the right
singular vector associated with the smallest singular value as the estimate α̂ of the
gains, as described in Section 5.5.1. We then estimated the offsets as β̂ = −Ȳα̂.
Results from totally blind calibration in simulation are shown in Figures 5.2
and 5.3. Figure 5.2 shows error in gain and offset estimates under the
burden of increasing noise variance using exact knowledge of the subspace defined
by P . That is, the fields in these simulations were created by projecting random
signals into the space defined by projection matrix P . The maximum value in
the signals was 1, and therefore the noise variance can be taken as a percentage;
i.e., variance of 10−2 represents 1% noise in the signal. The blind calibration
performed very well in this scenario; at 1% noise the gain estimation error was
less than 1 × 10−4 and the offset estimation error was less than 2.4 × 10−3. The
figure shows mean and median error over 100 simulation runs.
Knowing the true subspace exactly is possible in practice only when perform-
ing blind calibration in a very well-known environment, such as an indoor factory.
Even in this case, there will be some component of the true signals which is out-
side of the subspace defined by the chosen P . Figure 5.3 shows how gain and
offset error are affected by out-of-subspace components in the true signals. We
used a basic smoothing kernel to control smoothness of the true field and kept P
constant as described above with r = 49. The smoothing kernel and the projec-
tion operator are both low-pass operators, but even in the best case, some of the
smoothed field will be outside of the space defined by the projection matrix P .
We defined the error in P as ‖x − Px‖2/‖x‖2. The x-axis value in the figure is
the average error in P over 100 random fields smoothed with a given smoothness
parameter. The figure shows mean and median error in gain and offset estimates
over these 100 simulation runs. Again the results are compelling. The gain
estimation error was around 10−2 even when 10% of the signal was outside of the
subspace. The offset estimation was also still very accurate, below 7 × 10−3
even when 20% of the signal was outside of the subspace.

[Figure: mean gain error per sensor versus noise variance, both on log scales, for LS, Partial Blind, and SVD.]
Figure 5.4: Gain error performance for SVD, blind LS, and partially blind LS.
Results show mean error over 50 simulation runs.
5.5.2.2 Comparison of Techniques
Here we compare the SVD technique to the LS technique, and totally blind
calibration to partially blind calibration, where we know some of the calibration
coefficients ahead of time. To be completely explicit, we describe each approach
here.

Totally blind SVD (or SVD) performs gain estimation using the right singular
vector of the svd associated with the smallest singular value, and normalizes
assuming α(1) = 1. Offsets are then estimated using β̂ = −Ȳα̂. Totally blind LS
(or LS) performs gain estimation by solving equation (5.15) in a least-squares
sense, assuming knowledge only of α(1) = 1. Offsets are estimated as in SVD.
Partially blind LS (or partial blind)
performs gain estimation by again solving equation (5.15) in the least-squares
sense, but now assuming we know at least r of the true gain values. Offsets are
then estimated as described in Section 5.3 for non-zero-mean signals, i.e. using
β∆ = TΦΦ′β to solve for θ = Φ′β and thus β.

[Figure: two plots of average error per sensor versus noise variance, both on log scales, for LS, Partial Blind, and SVD: “Offset Error for 0-Mean Signals” (top) and “Offset Error for Non-0-Mean Signals” (bottom).]
Figure 5.5: Offset error performance for SVD, blind LS, and partially blind LS.
The top graph shows offset error for zero-mean signals, and the bottom graph is
for non-zero-mean signals. Results show mean error over 50 simulation runs.
For partially blind LS, we use enough of the true offsets that we can
solve for the complete component of β in the signal subspace. The fields we
simulated are nearly bandlimited, and so the theory implies that
r true offsets are enough to estimate β. In order to be robust to noise, we
used knowledge of the offsets of r + 5 sensors, again slightly more than the bare
minimum suggested by the theory.
A comparison of the techniques is quite interesting. First, as we expect, the
partially blind estimation does better than the other two methods in all cases;
this follows from the fact that it uses more information. In Figure 5.4 one can
see that in gain estimation, the SVD method outperforms totally blind LS, but
partially blind LS has the lowest error of all the methods.
In the case of offset error, the SVD and totally blind LS techniques out-
perform one another depending on the noise variance and whether or not the
signals are zero-mean. Figure 5.5 shows offset error for all three techniques. The
partially blind LS method is unaffected by non-zero mean signals, which follows
because the method for estimating the offsets does not change with a zero-mean
assumption. The other methods, on the other hand, capture the mean signal
as part of their offset estimates, and as we can see, estimation error using the
non-zero-mean signals is higher than using zero-mean signals.
The most intriguing part of these results is that totally blind LS performs
slightly better than SVD for the offset estimate with non-zero-mean signals, despite
the fact that it uses a gain estimate with more error from the first step in
order to estimate the offsets. This implies that if the offset is the most
important calibration parameter for a system which deals with non-zero-mean
signals, one might prefer the totally blind LS method over the SVD.
5.5.3 Evaluation on Sensor Datasets
We evaluate blind calibration on two sensor network datasets, which we call the
calibration dataset and the cold air drainage transect dataset.
5.5.3.1 Calibration Dataset
The calibration dataset was collected in September 2005 [4] along with data from
a reference-caliber instrument in order to characterize the calibration of the ther-
mistors used for environmental temperature measurement at the James Reserve.
From the experiment, the conclusion was drawn that after the factory-supplied
calibration was applied to the raw sensor measurements, the sensors differed from
the reference thermocouple linearly, i.e. by only a gain and offset. Thus these
sensors are suitable for evaluating the work we have done thus far on blind
calibration. The data is available in the NESL CVS repository.5

[Figure: top, optimal and estimated gain and offset values versus sensor id; bottom, true temperature (dashed) plotted with the uncalibrated and calibrated data (solid), in degrees C versus time index.]
Figure 5.6: Results of blind calibration on the calibration dataset.
The setup of this experiment consisted of nine temperature sensors.6 These
sensors were placed in a styrofoam box along with a thermocouple attached to a
datalogger, providing ground-truth temperature readings. Therefore, all sensors
were sensing the same phenomenon, and so the subspace spanned by the nine
measurements is rank one. Thus, for P we used a lowpass dct matrix which kept
only the dc component. To illustrate, we used the following commands in
Matlab:
r = 1; n = 9;        % rank-one subspace, nine sensors
I = eye(n);
U = dct(I);          % the DCT basis as a matrix
U(r+1:n,:) = 0;      % zero all but the first (DC) row
P = idct(U);         % P projects onto the DC subspace
We calibrated these data using snapshots from the dataset and the SVD
method. To get the gain calibration factors, we normalized to the gain charac-
teristic of the groundtruth sensor. Figure 5.6 shows the calibration coefficient
5. This data is available at http://www.ee.ucla.edu/~sunbeam/bc/
6. The experiment had ten sensors, one of which was faulty. In this analysis we used data from the nine functional sensors.
[Figure: three-dimensional plot titled “Sensor Locations for Cold Air Drainage”, showing mote positions with altitude on the vertical axis.]
Figure 5.7: The mica2 motes in the cold air drainage transect run down the side
of a hill and across a valley. The mote locations pictured are those that we used.
estimates and reconstructed signals for the sensors in the experiment. The gains
and offsets were recovered with very little error. The uppermost plot shows the
true and estimated gains and offsets. The lower plot shows the data before and
after calibration, along with the ground truth measurement in blue. This clearly
demonstrates the utility of blind calibration.
5.5.3.2 Cold Air Drainage Dataset
The cold air drainage transect dataset consists of data from an ongoing deploy-
ment at the James Reserve. The deployment measures air temperature and
humidity in a valley in order to characterize the predawn cold air drainage. The
sensors used are the same as the sensors in the calibration dataset, and thus again
the factory calibration brings them within an offset and gain of one another. The
data we used for evaluation is from November 2, 2006, and it is available in
the sensor data repository called SensorBase7. On this same day, we visited the
James Reserve with a reference-caliber sensor and took measurements over the
7. http://sensorbase.org
course of the day in order to get the true calibration parameters for comparison.
The deployment consists of 26 mica2 motes which run from one side of a
valley to the other (Figure 5.7) across a streambed and in various regions of tree
and mountain shade. Each mote has one temperature and one humidity sensor.
For our purposes, we collected calibration coefficients from 10 of the temperature
sensors.
The signal subspace in this application does not correspond to a simple low-
pass or smooth subspace, since sensors at similar elevations may have similar
readings, but can be quite distant from each other. In principle, the signal sub-
space could be constructed based on the geographic positions and elevations of
the sensor deployment. However, since we have the calibrated sensor data in
this experiment, we can use these data directly to infer an approximate signal
subspace. We constructed the projection P using the subspace associated with
the four largest singular values of the calibrated signal data matrix.
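A data-driven projection of this kind is a small computation. The following sketch (the function name is ours) takes a calibrated sensors-by-snapshots data matrix and builds the rank-4 projection from its top left singular vectors.

```python
import numpy as np

def empirical_projection(X, r=4):
    """Projection onto the span of the top-r left singular vectors of X,
    where X is a (sensors x snapshots) matrix of calibrated data."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Ur = U[:, :r]
    return Ur @ Ur.T
```

Any snapshot drawn from the same signal subspace is left unchanged by this projection, which is exactly the property the blind calibration constraints exploit.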
We performed totally blind calibration using SVD. We constructed C using
64 snapshots taken over the course of the morning along with P as described.
Figure 5.8 shows the results. The gain error was very small, only 0.0053 on
average per sensor, whereas if we were to assume the gain was 1 and not calibrate
the sensors at all, the error would be 0.0180 on average per sensor. On the other
hand, the offset error was only slightly better with blind calibration than it would
have been without: we saw 0.3953 average error per sensor as compared to 0.4610 error
if the offsets were assumed to be zero. We believe that the offset estimation did
not perform well due primarily to the fact that the mean signal is not zero in
this case (e.g., the average sensor readings depend on elevation). Better offset
estimates could be obtained using knowledge of one or more of the true sensor
offset values.
[Figure: optimal and estimated gain values (top) and offset values (bottom) plotted against sensor id.]
Figure 5.8: True and estimated gains and offsets for the cold air drainage transect
dataset.
5.6 Future Work and Discussion
There are many issues in blind calibration that could be explored further. The
two main areas ripe for study are the choice of the subspace P and the implemen-
tation of blind calibration. There are many possible choices for a suitable sub-
space, including frequency subspaces and smoothness subspaces. How to choose
the subspace when faced with a sensor deployment where the true signals are un-
known is an extremely important question for blind calibration. Methodologies
for creating a P would be extremely useful to the more general application of
blind calibration, especially ones which could incorporate trusted measurements
or the users’ knowledge of the physical space where the sensors are deployed. At
the same time, implementations of blind calibration that are robust to model
error in the subspace would allow users to be more liberal in the choice of P .
The theoretical analysis in this chapter was done under noiseless conditions and
with a perfect model. Future work includes both analysis under noise, to derive
analytical bounds that can be compared to simulation results, and sensitivity
analysis for our system of linear equations. Our experience is that solutions are
robust to noise and mismodeling in some cases and sensitive in others; we do not
yet have a good understanding of the robustness of the methodology.
Extending the formulation to handle non-linear calibration functions would
be useful in cases where a raw non-linear sensor response must be calibrated. We
believe that many of the techniques developed in this chapter can be extended
to more general polynomial-form calibration functions. Other interesting topics
include distributed blind calibration and blind calibration in the presence of faulty
sensors.
CHAPTER 6
Conclusions
The work presented in this thesis takes a first step toward addressing issues of fault
and calibration in sensor networks. In Chapter 3, we presented typical faults in
sensor networks and a general model which encompasses those faults well. We
used instances of that model to assess the impact of faults on popular aggregation
functions in sensor networks. The model provides great flexibility in representing
the typical faults in sensor networks. We hope this model gives practitioners a
way to assess the impact of faults on their sensor networks, which can in turn
facilitate more robust design principles.
Chapter 4 presented an approach to in-situ blind calibration based on state-space
models. Any phenomenon of interest that is dynamic and can be represented
by a dynamical system of equations can be tracked, and measurements
can be assimilated, using adaptive filters. With Monte Carlo methods, even
non-linear, non-Gaussian systems can be tracked. We showed that two Monte Carlo
filtering algorithms, the Ensemble Kalman Filter and the SIR Particle Filter,
show promise for estimating both gain and offset calibration parameters. In
preliminary implementations of both algorithms, with the calibration parameters
included in the estimated state space, we obtained the expected results: better
information improved our estimates, and even when our information was uncertain
the error remained small. The update step for the calibration parameter
estimates is only a first heuristic implementation, and finding the correct update
is a difficult problem; it will involve fundamentally reworking how the filter
updates operate. However, the promise shown by our initial heuristic is good
motivation to pursue this work.
The blind calibration work in Chapter 5 presented a very general approach to
the problem with a single assumption: that the sensor measurements lie in a
subspace. The formulation and methods developed in this chapter used only
routine sensor measurements; thus we believe they give an extremely promising
formulation for the mass calibration of sensors. We showed that the calibration
gains are identifiable. We proved how many measurements are necessary and
sufficient to estimate the gain factors, and we showed necessary and sufficient
conditions to estimate the offsets. Overall, this work demonstrates, in a very
general way, that blind calibration has great potential in practice. The major
open issue is that the model for the measurements' subspace is crucial to the
entire formulation. One approach is an iterated solution in which progressively
better knowledge of the subspace is attained. Alternatively, if the methods used
for implementation are robust to model error, then a slightly wrong subspace can
still be used for calibration. The formulation is presented in such a way that we
can carefully analyze the sensitivity of the method to particular subspaces, and
understand in what ways the true subspace can lie outside of the assumed
subspace while still allowing blind calibration.
Our final and overwhelming conclusion is that there remains a lot of work to
be done! This thesis presented first steps, and there are certainly several new
approaches to both fault detection and calibration that have yet to be explored.
Data coming from sensor networks is simply not of high enough quality for many
of the envisioned applications, and this fact cannot be swept under the rug. For
any sensor network to be a truly useful and inexpensive technology, it needs to
be robust to fault and calibration errors.
References
[1] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, February 2002.
[2] L. Balzano and R. Nowak. Blind calibration. Technical Report TR-UCLA-NESL-200702-01, Networked and Embedded Systems Laboratory, February 2007.
[3] L. Balzano and R. Nowak. Blind calibration in sensor networks. In Proceedings of Information Processing in Sensor Networks (IPSN), 2007.
[4] L. Balzano, N. Ramanathan, E. Graham, M. Hansen, and M. B. Srivastava. An investigation of sensor integrity. Technical Report UCLA-NESL-200510-01, Networked and Embedded Systems Laboratory, 2005.
[5] P. Buonadonna, D. Gay, J. Hellerstein, W. Hong, and S. Madden. TASK: Sensor network in a box. Technical Report IRB-TR-04-021, Intel Research Berkeley, January 2005.
[6] M. Bushnell and V. Agrawal. Essentials of Electronic Testing for Digital, Memory, and Mixed-Signal VLSI Circuits. Springer, 2000.
[7] V. Bychkovskiy, S. Megerian, D. Estrin, and M. Potkonjak. A collaborative approach to in-place sensor calibration. In 2nd International Workshop on Information Processing in Sensor Networks, pages 301–316, 2003.
[8] E. J. Candes and J. Romberg. Quantitative robust uncertainty principles and optimally sparse decompositions. Foundations of Computational Mathematics, 2006.
[9] E. Elnahrawy and B. Nath. Context-aware sensors. In Proceedings of the European Conference on Wireless Sensor Networks, pages 77–93, 2004.
[10] G. Evensen. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynamics, 53(4):343–367, 2003.
[11] J. Feng, S. Megerian, and M. Potkonjak. Model-based calibration for sensor networks. Sensors, pages 737–742, October 2003.
[12] S. Ganeriwal and M. B. Srivastava. Reputation-based framework for high integrity sensor networks. In Proceedings of SASN ’04, October 2004.
[13] G. Harikumar and Y. Bresler. Perfect blind restoration of images blurred by multiple filters: Theory and efficient algorithms. IEEE Transactions on Image Processing, 8(2):202–219, February 1999.
[14] D. J. Hill and B. S. Minsker. Automated fault detection for in-situ environmental sensors. In Proceedings of the 7th International Conference on Hydroinformatics, 2006.
[15] B. Hoadley. A Bayesian look at inverse linear regression. Journal of the American Statistical Association, 65(329):356–369, March 1970.
[16] A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley-Interscience, New York, 2001.
[17] A. Ihler, J. Fisher, R. Moses, and A. Willsky. Nonparametric belief propagation for self-calibration in sensor networks. In Proceedings of the Third International Symposium on Information Processing in Sensor Networks, 2004.
[18] R. Isermann. Fault-Diagnosis Systems: An Introduction from Fault Detection to Fault Tolerance. Springer, 2005.
[19] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli. On-line fault detection of sensor measurements. IEEE Sensors, pages 974–980, October 2003.
[20] L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems, 4(3):382–401, July 1982.
[21] L. B. Larkey, L. M. A. Bettencourt, and A. A. Hagberg. In-situ data quality assurance for environmental applications of wireless sensor networks. Technical Report LA-UR-06-1117, Los Alamos National Laboratory, 2006.
[22] K. Marzullo. Tolerating failures of continuous-valued sensors. ACM Transactions on Computer Systems, 8(4):284–304, 1990.
[23] M. Rabbat, R. Nowak, and J. Bucklew. Generalized consensus computation in networked systems with erasure links. In Proceedings of the IEEE Workshop on Signal Processing Advances in Wireless Communications, June 2005.
[24] N. Ramanathan, L. Balzano, M. Burt, D. Estrin, T. Harmon, C. Harvey, J. Jay, E. Kohler, S. Rothenberg, and M. Srivastava. Rapid deployment with confidence: Calibration and fault detection in environmental sensor networks. Technical Report 62, Center for Embedded Networked Sensing, 2006.
[25] N. Ramanathan, K. Chang, R. Kapur, L. Girod, E. Kohler, and D. Estrin. Sympathy for the sensor network debugger. In Proceedings of the 3rd International Conference on Embedded Networked Sensor Systems (SenSys), pages 255–267, 2005.
[26] N. Ramanathan, T. Schoellhammer, D. Estrin, M. Hansen, T. Harmon, E. Kohler, and M. Srivastava. The final frontier: Embedding networked sensors in the soil. Technical Report 68, Center for Embedded Networked Sensing, November 2006.
[27] A. Shah and G. Ramakrishnan. FDDI: A High Speed Network. Prentice Hall, 1993.
[28] O. Shalvi and E. Weinstein. New criteria for blind deconvolution of non-minimum phase systems (channels). IEEE Transactions on Information Theory, 36(2):312–321, March 1990.
[29] R. Szewczyk, J. Polastre, A. Mainwaring, and D. Culler. Lessons from a sensor network expedition. In Proceedings of the 1st European Workshop on Wireless Sensor Networks, pages 307–322, January 2004.
[30] C. Taylor, A. Rahimi, J. Bachrach, H. Shrobe, and A. Grue. Simultaneous localization, calibration, and tracking in an ad hoc sensor network. In Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, pages 27–33, 2006.
[31] G. Tolle, J. Polastre, R. Szewczyk, D. Culler, N. Turner, K. Tu, S. Burgess, T. Dawson, P. Buonadonna, D. Gay, and W. Hong. A macroscope in the redwoods. In Proceedings of SenSys, 2005.
[32] D. Wagner. Resilient aggregation in sensor networks. In Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks, pages 78–87, 2004.
[33] K. Whitehouse and D. Culler. Calibration as parameter estimation in sensor networks. In Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, pages 59–67, 2002.
[34] Y. Zhou, D. McLaughlin, and D. Entekhabi. Assessing the performance of the ensemble Kalman filter for land surface data assimilation. Monthly Weather Review, 134:2128–2142, August 2006.