Mapping forest changes using multi-temporal remote sensing images: BITE for accurate
trajectory extraction and CBEST for efficient clustering
By
Yanlei Chen
A dissertation submitted in partial satisfaction of the
requirements for the degree of
Doctor of Philosophy
in
Environmental Science, Policy and Management
in the
Graduate Division
of the
University of California, Berkeley
Committee in charge:
Professor Peng Gong, Chair
Professor Gregory Biging
Professor John Radke
Fall 2014
Abstract
Mapping forest changes using multi-temporal remote sensing images: BITE for accurate
trajectory extraction and CBEST for efficient clustering
By
Yanlei Chen
Doctor of Philosophy in Environmental Science, Policy and Management
University of California, Berkeley
Professor Peng Gong, Chair
We developed a semi-automatic algorithm named Berkeley Indices Trajectory Extractor
(BITE) to detect forest disturbances, especially slow-onset disturbances such as insect-induced tree mortality,
from time series of Landsat 5 Thematic Mapper (TM) images. BITE is a streamlined process that
features trajectory extraction and interpretation of multiple spectral indices followed by an
integration of all indices. The algorithm was tested over Grand County in Colorado, located in
the Southern Rocky Mountains Ecoregion, where forests dominated by lodgepole pine have been
under mountain pine beetle attack since 2000. We produced a disturbance map using BITE with
an identification accuracy of 94.7% assessed from 602 validation sample pixels. The algorithm
is robust in deriving forest disturbance type and timing in the presence of varying
atmospheric conditions, noise, pixel misregistration and residual cloud/snow cover in
the imagery. Outputs of the BITE algorithm could be used in studies designed to increase
understanding of the mechanisms of mountain pine beetle dispersal and tree mortality, as well as
other types of forest disturbances.
Large remote sensing datasets, which either cover large areas or have high spatial resolution, are
often a burden for information mining in scientific studies. Here, we present an approach that
conducts clustering after gray-level vector reduction. In this manner, the speed of clustering can
be considerably improved. The approach features applying eigenspace transformation to the
dataset followed by compressing the data in the eigenspace and storing them in coded matrices
and vectors. The clustering process takes advantage of the reduced size of the compressed data
and thus reduces computational complexity. We name this approach Clustering Based on Eigen
Space Transformation (CBEST). In our experiment with a subscene of Landsat Thematic
Mapper (TM) imagery, CBEST was found to be able to improve speed considerably over
conventional K-means as the volume of data to be clustered increases. We assessed information
loss and several other factors. In addition, we evaluated the effectiveness of CBEST in mapping
land cover/use with the same image, which was acquired over Guangzhou City, South China, and an
AVIRIS hyperspectral image over Tippecanoe County, Indiana. Using reference data, we
assessed the accuracies of both CBEST and conventional K-means and found that, in practice,
CBEST was not negatively affected by information loss during compression. We then
applied CBEST to map forest change from 1986 to 2011 for the entire state of California,
USA, with over 400 Landsat TM images. We discussed potential applications of the fast
clustering algorithm in dealing with large datasets in remote sensing studies.
We present an efficient approach for large-area mapping of forest changes based on
the Clustering Based on Eigen Space Transformation (CBEST) algorithm using remote sensing.
By analyzing 450 Landsat Thematic Mapper (TM) satellite images from 1986 to 2011 at a
five-year interval covering the entire state of California, USA, we derived a forest change type
map, a forest loss map and a forest gain map. Although California has a total land area of
99.6 million acres and the spatial resolution of Landsat TM is 30 m, the task took
only 10 hours on a computer with a 2.8 GHz Intel i5 CPU and 8 GB of RAM. The overall
accuracy of the forest cover in 2011 was 92.9% ± 1.6%. We found that the
estimated forest area changed from 28.20 ± 1.98 million acres in 1986 to 28.05 ± 1.98 million acres
in 2011. In particular, our rough estimate indicates that each year California’s forest
experienced a loss of 92 thousand acres and a recovery of 85 thousand acres, resulting in a net
loss of seven thousand acres per year. In addition, during 1986-2011, around 12% of the forestland
experienced changes, with deforestation, afforestation, and deforestation followed by recovery
each accounting for 4%. We concluded that the forestland in California had
been managed in a sustainable manner over the 25 years, since no significant directional
changes were observed. Our approach yielded a tighter estimate of the true canopy coverage,
namely that 29% of the land in California is forestland, compared with the estimates of 33% and 40% made
by previous studies that had lower spatial resolution and shorter temporal coverage.
Table of Contents
LIST OF TABLE CAPTIONS
LIST OF FIGURE CAPTIONS
INTRODUCTION
ACKNOWLEDGEMENT
CHAPTER 1 BITE: AN ALGORITHM FOR MAPPING SLOW-ONSET FOREST DISTURBANCES CAUSED BY MOUNTAIN PINE BEETLES WITH LANDSAT IMAGE STACKS
  ABSTRACT
  1 INTRODUCTION
  2 METHODOLOGY
    2.1 STUDY AREA
    2.2 DISTURBANCE MAPPING PROCEDURE
      2.2.1 Data and Preprocessing
      2.2.2 Spectral Indices
      2.2.3 Trajectory Extraction
      2.2.4 Trajectory Interpretation
      2.2.5 Post-classification Process
  3 RESULTS AND DISCUSSION
    3.1 EVALUATION OF THE CLASSIFIERS AND THE INDICES
    3.2 ACCURACY ASSESSMENT
    3.3 THE DISTURBANCE MAP PRODUCT
  4 CONCLUSION AND PERSPECTIVES
CHAPTER 2 CLUSTERING BASED ON EIGEN SPACE TRANSFORMATION – CBEST FOR EFFICIENT CLASSIFICATION
  ABSTRACT
  1 INTRODUCTION
  2 BACKGROUND
    2.1 K-MEANS
    2.2 EIGEN-BASED GRAY-LEVEL VECTOR REDUCTION
  3 CLUSTERING BASED ON EIGEN SPACE TRANSFORMATION
    3.1 COMPRESSION
    3.2 CLUSTERING
    3.3 FURTHER IMPROVEMENT
      3.3.1 Mean vectors
      3.3.2 Vacant Eigenspace Partitions
      3.3.3 Boundary Optimization
  4 EXPERIMENTAL DESIGN
    4.1 EXPERIMENT DATA
    4.2 PREPROCESSING
    4.3 METHODS
  5 RESULTS AND ANALYSIS
    5.1 EFFICIENCY & PERFORMANCE TEST
    5.2 APPLICATION EXPERIMENTS
      5.2.1 Landsat TM Image
      5.2.2 AVIRIS Hyperspectral Image
  6 DISCUSSIONS
CHAPTER 3 APPLICATIONS OF CBEST IN EFFICIENTLY MAPPING FOREST CHANGES IN THE STATE OF CALIFORNIA FROM 1986-2011
  ABSTRACT
  1 INTRODUCTION
  2 METHODOLOGY
    2.1 STUDY AREA
    2.2 DATA
    2.3 PROCEDURE
      2.3.1 Data Preparation
      2.3.2 Initial Clustering
      2.3.3 Integrating Cluster Centers
      2.3.4 Probability Assigning
      2.3.5 Probability Trajectory Interpretation
      2.3.6 Post-processing
  3 RESULTS AND ANALYSIS
    3.1 INTERMEDIATE RESULTS
    3.2 FOREST CHANGE MAP AND ACCURACY ASSESSMENT
  4 DISCUSSIONS
  5 CONCLUSIONS
CHAPTER 4 CONCLUSIONS AND PERSPECTIVES
  1 SUMMARY OF THE RESULTS
  2 FUTURE PERSPECTIVES
REFERENCES
List of Table Captions
Table 1 Data acquisition dates and land percentage.
Table 2 The list of the spectral indices.
Table 3 Overall accuracies of the classification test results. ‘CV’ represents the cross-validation test on the training dataset. ‘Test’ represents the evaluation on the test dataset.
Table 4 Overall accuracies of the classification test of integration of multiple indices. The evaluation was done on the test dataset.
Table 5 Confusion matrix of the forest change type classification result. The evaluation was done on the test dataset.
Table 6 Conventional K-means Algorithm
Table 7 Eigen-Based Gray Level Vector Reduction
Table 8 CBEST Algorithm
Table 9 Description of the Indicators
Table 10 Classification System for Guangzhou
Table 11 Test Results w/respect to Data Size
Table 12 Test Results w/respect to k
Table 13 Test Results w/respect to N
Table 14 Assignment of Eigenspace Partitions for Eigen Axes
Table 15 Performance Test w/respect to the Max number of Iterations
Table 16 Confusion Matrices for Validation (Landsat)
Table 17 Summary of Classification Results (Landsat)
Table 18 Summary of Class Results (AVIRIS)
Table 19 Verification classes and corresponding probability weights
Table 20 Elapsed time for the clustering process. Each result was selected as the lowest within-cluster sum of squares from 5 runs. 1986N10 means the mosaicked image in 1986 with projection of UTM Zone 10 North.
Table 21 Means and standard deviations of forest probabilities calculated for the clusters
Table 22 The error matrix of samples with four classes from validation labels and two classes from the forest cover in 2011
Table 23 Error matrix of accuracies for the forest cover in 2011
List of Figure Captions
Figure 1 Study Area: Grand County, CO, USA.
Figure 2 Flowchart of processing steps used in the BITE algorithm.
Figure 3 Example of an NDVI time series for 1 disturbed pixel, and intermediate results of processing steps in the time series. These processing steps include 1) Inter-year value selection (a)-(b); 2) Noise removal (b)-(c); 3) Segmentation (c)-(d).
Figure 4 Example of the intermediate processing steps producing segments of the entire NDVI trajectory for 1 pixel (Segmentation Process).
Figure 5 Outputs of the BITE algorithm, including starting year of (left) slow-onset disturbances and (right) rapid-onset disturbances.
Figure 6 Area affected by different disturbance types for 2001-2009.
Figure 7 Enlarged view of the BITE output showing the starting year of slow-onset disturbances.
Figure 8 Illustrative Comparisons between CBEST and K-means
Figure 9 Test Area: Guangzhou, China
Figure 10 Experiment Flow Chart
Figure 11 Speed Comparison w/respect to Data Size. (a) Elapsed Time Comparison; (b) Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio
Figure 12 Efficiency w/respect to k. (a) Elapsed Time Comparison; (b) Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio; (e) Rescaled Within-Cluster Sum of Square average; (f) Rescaled Within-Cluster Sum of Square Best/Worst Case.
Figure 13 Efficiency w/respect to N. (a) Elapsed Time Comparison; (b) Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio; (e) Within-Cluster Sum of Squares Comparison; (f) Within-Cluster Sum of Squares Limited by various max numbers of Iterations.
Figure 14 Scatterplot of Ground Truth
Figure 15 Land Cover/Use Map derived by K-means and CBEST in Guangzhou
Figure 16 Validation Samples as Ground Reference in Guangzhou
Figure 17 Mapping Results in Tippecanoe County
Figure 18 California: Study Area and Landsat TM scenes. Since the study area is in the northern hemisphere, the UTM is of North Zone.
Figure 19 Flowchart of the Procedure to map forest changes in California
Figure 20 CBEST software interface. The initial clustering was implemented under the configuration in this figure.
Figure 21 Graphic demonstration of probability trajectory interpretation. (a) A typical forest loss pixel with elaborations on the rules for automatic determination of forest loss; (b) Non-forest, all points fall within the bounds; (c) Forest; (d) Forest Gain detected in 2006.
Figure 22 Post-clustering result in year 2011 and stratified samples
Figure 23 California Forest Change Maps 1986-2011. Left: Change Type Map; Upper right: Forest loss characterized by years; Lower right: Forest gain/recovery characterized by years.
Figure 24 Estimated Forest Area by Mapping Years
Figure 25 Proportions of forest change type in California for 1986-2011
Figure 26 Local views of some chosen places of the forest change map. The four images at the bottom of the figure demonstrate the changes detected with historical aerial photographs from 1988 and 1993 in comparison with a recently acquired high-resolution image. Orange circles indicate a regenerated forest patch after early removal, while red circles encompass a clearcutting area.
Figure 27 An example of how scale affects the classified area. Suppose each smallest cell unit is 30 m by 30 m in size; there are 20 cells, or 18,000 m², of forest area. If using 120 m by 120 m cells, there are 3 cells, or 43,200 m². If using the entire 240 m by 240 m scene, the area is classified as one forest patch, with an area of 57,600 m².
Introduction
Forestland is commonly defined as land that is at least one acre in area and has at least 10% of its area
stocked with trees of any size, or that previously had such tree cover and is not currently
developed for non-forest use (Helms, 1998). The Resources Planning Act (RPA) Assessment (USDA
Forest Service, 2012) additionally requires a width of at least 120 feet (37 meters). It also includes
transition zones with 10% tree cover and excludes lands predominantly under agricultural and
urban land use. Forests, when properly managed, are known to be a major carbon sink that can
mitigate the process of climate change. In the United States, forest growth and afforestation
offset approximately 13 percent of the Nation’s fossil fuel CO2 production in 2012 (Vose et al.,
2012). Traditionally, forests have been well recognized for their economic, social and ecological values.
Commercial forest (timberland) provides valuable wood products, while reserved forest is
preserved for recreation, aesthetics, wildlife, biodiversity, etc. The importance of sustainable
forest management, which aims to conserve forests for the benefit of future
generations, is increasingly acknowledged by the public. Therefore, it is crucial to
monitor forest changes and to estimate deforestation for tracking carbon stocks and fluxes
(Running, 2008), as well as to support decision making for better forest management for the
benefit of society. Moreover, monitoring these deforestation and regeneration events over
time is also important since natural and human-induced disturbances that cause deforestation are
becoming more and more frequent under climate change (Overpeck et al., 1990; Westerling et
al., 2006). Natural disturbances include hurricanes, earthquakes, wildfires, increased temperature,
drought, pathogens and insect attacks (Soja et al., 2007; Kurz et al., 2008; Westerling and
Bryant, 2008). Human-induced disturbances include logging, clear-cutting and prescribed fire.
The detection of these disturbances and land use changes provides evidence for scientists and
policy makers to study the implications of such changes and to project future trends. In
particular, slow-onset forest disturbances, which are commonly caused by insects and pathogens,
constitute a significant source of long-term carbon dioxide emissions to the atmosphere through
decomposition of dead organic matter, contributing to climate warming (Metz, 2001; Kurz et al., 2008;
Maness et al., 2013). Currently, there are two major challenges for forest change mapping.
First, there is a lack of reliable approaches for detecting slow-onset disturbances spatially and
temporally and for distinguishing them from rapid-onset disturbances. Second, there is a
lack of efficient algorithms for detecting forest changes over many years in a large area, such as a
large state like California with rich forest resources, or even the entire United States.
To address the first challenge, we were particularly interested in accurately tracking
slow-onset disturbances with satellite images acquired in multiple years. For the second
challenge, we focused on developing an efficient automatic algorithm based on K-means, a
widely used algorithm for data mining, and applied this algorithm to large-area
mapping over many years.
This dissertation consists of four chapters. In the first chapter, a reliable semi-automatic
algorithm was developed for distinguishing slow-onset from rapid-onset disturbances based on Landsat
image stacks from 2001 to 2011 for Grand County in Colorado. The algorithm
was named Berkeley Indices Trajectory Extractor (BITE). Temporal trajectories of multiple
spectral indices were extracted and processed, followed by interpretation and
integration. An overall accuracy of 94.7% for the classification of disturbance types was
achieved. The BITE product effectively maps the spatial and temporal dispersal of the mountain pine
beetle outbreak that occurred during the time frame in the study area, supporting a better
understanding of the mechanisms underlying insect attack patterns. Furthermore, this
algorithm should be suitable for detecting other disturbances that result in canopy loss regardless
of the speed of deforestation.
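To make the notion of an index trajectory concrete, the following is a minimal sketch and not the BITE procedure itself. It computes NDVI from red and near-infrared reflectance (Landsat TM bands 3 and 4) and labels an annual trajectory by how concentrated its decline is. The function names, the thresholds `min_decline` and `rapid_share`, and the labeling rule are all hypothetical choices made for illustration only.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    safe = np.where(denom == 0, 1.0, denom)  # avoid division by zero
    return np.where(denom == 0, 0.0, (nir - red) / safe)

def label_trajectory(years, values, min_decline=0.15, rapid_share=0.7):
    """Label one pixel's annual index trajectory (hypothetical rule).

    min_decline: smallest peak-to-trough drop treated as a disturbance.
    rapid_share: if a single year-to-year step accounts for at least this
                 share of the total drop, the disturbance is called rapid.
    Returns (label, onset_year); onset_year is None for stable pixels.
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    peak = int(values.argmax())
    trough = peak + int(values[peak:].argmin())
    total = values[peak] - values[trough]
    if total < min_decline:
        return "stable", None
    steps = -np.diff(values[peak:trough + 1])  # positive values are declines
    biggest = int(steps.argmax())
    if steps[biggest] >= rapid_share * total:
        return "rapid-onset", int(years[peak + biggest + 1])
    first = int(np.argmax(steps > 0))  # first year showing any decline
    return "slow-onset", int(years[peak + first + 1])
```

Under these assumptions, a trajectory that loses most of its NDVI in a single year is labeled rapid-onset, whereas one that declines gradually over several years is labeled slow-onset. Real trajectories would first need the year-to-year value selection and noise removal that the chapter describes.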
However, BITE had a high computational cost and was time consuming when executed on an
ordinary lab computer. In the second chapter, an efficient unsupervised algorithm was proposed
that greatly lowers the computational cost of the conventional K-means algorithm. The
algorithm was named Clustering Based on Eigen Space Transformation (CBEST). The algorithm
compresses the data before the iterative calculations of the clustering process, so that the
computational cost depends on a fixed number of partitions in the compressed
space rather than on the original data size. Although there is information loss during the compression, the analysis and experiments on
some test images suggest that the loss can be ignored in practice, while the computing time is
greatly improved.
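The idea of clustering compressed data can be sketched as follows. This is an illustrative approximation under stated assumptions, not the CBEST implementation: the equal-width partitioning of each eigen axis, the use of cell means as representatives, the count weighting, and the name `cbest_sketch` with its parameters are all hypothetical. The sketch assumes at least two spectral bands and at least `k` occupied cells.

```python
import numpy as np

def cbest_sketch(pixels, n_bins=8, k=3, n_iter=20, seed=0):
    """Cluster pixels by K-means on eigenspace cells instead of raw pixels.

    pixels: (n_pixels, n_bands) array. Returns (labels, n_cells).
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)

    # 1) Eigenspace transformation: project the centered data onto the
    #    eigenvectors of the band covariance matrix.
    centered = pixels - pixels.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    scores = centered @ eigvecs

    # 2) Compression: cut every eigen axis into n_bins equal-width
    #    partitions (an assumption) and code each pixel by its cell.
    lo = scores.min(axis=0)
    span = scores.max(axis=0) - lo
    span[span == 0] = 1.0
    codes = (n_bins * (scores - lo) / span).astype(np.int64)
    codes = np.minimum(codes, n_bins - 1)
    cells, inverse, counts = np.unique(
        codes, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)

    # One representative per occupied cell: the mean of its member pixels.
    reps = np.zeros((len(cells), scores.shape[1]))
    np.add.at(reps, inverse, scores)
    reps /= counts[:, None]

    # 3) Weighted K-means over the representatives only, so each iteration
    #    costs O(n_cells * k) rather than O(n_pixels * k).
    centers = reps[rng.choice(len(reps), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((reps[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        cell_labels = d2.argmin(axis=1)
        for j in range(k):
            members = cell_labels == j
            if members.any():
                centers[j] = np.average(
                    reps[members], axis=0, weights=counts[members])
    d2 = ((reps[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    cell_labels = d2.argmin(axis=1)

    # Decode: every pixel inherits the label of its cell.
    return cell_labels[inverse], len(cells)
```

Because the number of occupied cells is bounded by `n_bins ** n_bands` regardless of image size, the iterative part of the computation stops growing once the cells are populated. The price is that all pixels in a cell receive the same label, which is the information loss the chapter evaluates.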
In the third chapter, the CBEST algorithm was applied to produce a forest change map for the
entire state of California from 1986 to 2011 at a five-year interval. With a total of 450 Landsat
Thematic Mapper images, the entire computing time was approximately 10 hours on an ordinary
lab computer. The overall accuracy of the forest cover in 2011 derived from the
map was assessed as 92.9% ± 1.6%. This efficient approach allowed us to produce the first California forest
change map with a spatial resolution of 30 meters and a temporal coverage of 25 years.
Key statistics on California’s forestland were derived from the produced map. No significant directional
change was observed. The differences between the produced map and previous forest inventories
were discussed.
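For a sense of the data volume involved, the short calculation below converts the 99.6 million acres of land area cited in the abstract into 30 m pixels. It is a back-of-the-envelope sketch: the list of mapping years assumes that a five-year interval from 1986 to 2011 means six dates, and overlap between scenes and non-land pixels are ignored.

```python
ACRE_M2 = 4046.8564224   # square metres per international acre
land_acres = 99.6e6      # California land area cited in the abstract
pixel_size_m = 30.0      # Landsat TM spatial resolution

# Number of 30 m pixels needed to tile the land area once.
pixels_per_date = land_acres * ACRE_M2 / pixel_size_m ** 2

# Assumed mapping years: every five years from 1986 to 2011 inclusive.
mapping_years = list(range(1986, 2012, 5))
total_pixels = pixels_per_date * len(mapping_years)

print(f"{pixels_per_date / 1e6:.0f} million pixels per date")
print(f"{total_pixels / 1e9:.2f} billion pixel observations in total")
```

Under these assumptions the state corresponds to roughly 448 million pixels per mapping date, which is why a clustering cost that does not grow with the number of pixels matters at this scale.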
In the fourth chapter, the achievements of the first three chapters were summarized. The links
between these chapters were explored, and further integration of BITE and CBEST was
envisioned to take fuller advantage of both algorithms. The ultimate goal was to
efficiently and reliably map forest changes for relatively large administrative areas or
ecoregions. The potential and benefits of the study were also discussed.
Acknowledgement
I am grateful to Congcong Li for sharing the processed TM image and validation data used in
this article. I also thank David Landgrebe of the Laboratory for Applications of Remote
Sensing, Purdue University, for sharing the data online. This research has been partially
supported by USGS (grant number G12AC20085) and a national high technology program grant
from China (grant number 2009AA12200101).
Chapter 1 BITE: an algorithm for mapping slow-onset forest disturbances caused
by mountain pine beetles with Landsat image stacks
Abstract
We developed a semi-automatic algorithm named Berkeley Indices Trajectory Extractor
(BITE) to detect forest disturbances, especially slow-onset disturbances such as insect mortality,
from time series of Landsat 5 Thematic Mapper (TM) images. BITE is a streamlined process that
features trajectory extraction and interpretation of multiple spectral indices followed by an
integration of all indices. The algorithm was tested over Grand County in Colorado, located in
the Southern Rocky Mountains Ecoregion, where forests dominated by lodgepole pine have been
under mountain pine beetle attack since 2000. We produced a disturbance map using BITE with
an identification accuracy of 94.7% assessed from 602 validation sample pixels. The algorithm
is robust in deriving forest disturbance type and timing in the presence of varying
atmospheric conditions, noise, pixel misregistration and residual cloud/snow cover in
the imagery. Outputs of the BITE algorithm could be used in studies designed to increase
understanding of the mechanisms of mountain pine beetle dispersal and tree mortality, as well as
other types of forest disturbances.
Keywords: Mountain Pine Beetle, Forest Disturbance, Slow-onset Disturbance, Landsat TM,
Remote Sensing
1 Introduction
Forest land is a major carbon sink. In the United States, forest growth and afforestation offset
approximately 13 percent of the Nation’s fossil fuel CO2 production in 2012 (Vose et al., 2012).
Monitoring forests and estimating deforestation as a result of disturbances is crucial for tracking
carbon stocks and fluxes in ecosystems (Running, 2008). Slow-onset forest disturbances, which
are commonly caused by insects and pathogens, comprise a significant source of long-term
carbon dioxide emissions to the atmosphere through decomposition of dead organic matter
leading to climate warming (Metz, 2001; Kurz et al. 2008; Maness et al., 2013). Currently, there
is a lack of reliable approaches for detecting slow-onset disturbances spatially and temporally as
well as distinguishing them from rapid-onset disturbances. Therefore, we were particularly
interested in accurately tracking slow-onset disturbances with satellite images acquired in
multiple years.
Mountain pine beetle (MPB, Dendroctonus ponderosae) is a native species to North America
and is known to cause large-scale mortality in coniferous forests. For the past decade, pine
forests in western North America have experienced extensive and severe mortality from MPB
outbreaks (Kurz et al., 2008; Honey-Marie et al., 2011). Besides raising concerns about the large
amount of carbon emissions from extensive tree mortality, the recurring MPB outbreaks in western
North America also have other socioeconomic impacts. These include: the wood industry is affected by
increased cost of timber due to damages from MPB; dead trees increase ground fuel loading and
thus wildfire vulnerability; recreational values of the landscape are negatively affected as dead
trees are not visually appealing; dead tree falls could cause injuries as a safety concern; and
wildlife composition is altered due to drastic habitat change as a result of extensive tree mortality
(Safranyik and Wilson, 2007). Therefore, challenges were raised for management of forest
sustainability in implementing effective measures to prevent and confine MPB dispersal.
Potential counter measures were investigated, including thinning (Mitchell et al., 1983), fire
suppression (Parker et al., 2006), and removing infested trees (Trzcinski and Reid, 2008). These
counter measures were limited by costs and scope, and therefore were not effective in dealing
with large-scale outbreaks.
MPB normally exists at endemic levels that cause limited mortality; however, under certain
circumstances these attacks can reach epidemic levels. Past studies about factors that could
trigger epidemic behavior of MPB focused on physiological interactions between pines, beetles
and climate (Cole and Amman, 1980; Raffa and Berryman, 1983; Safranyik and Whitney, 1985;
Bentz et al., 1991; Bentz et al., 1996). Recent technological advancements in Geospatial
Science, Geographic Information System (GIS) and Remote Sensing (RS) have facilitated the
understanding of patterns of MPB attacks throughout space and time (Logan et al., 1998;
Aukema et al., 2006; Chapman et al., 2012). Moreover, with climate variables (Iverson and
Prasad, 1998; Aukema et al., 2008), future spread trend of MPB attacks could be predicted under
different climate change scenarios leading to improved estimation of the long-term impacts of
MPB mortality on carbon stocks and fluxes. These studies provided significant support for
scientists and decision makers to develop strategies to predict and control dispersal of MPB in
forested landscapes. However, these studies were limited in scope because the spatial resolution
and temporal coverage were relatively low. A better understanding of the driving factors of MPB
dispersal relies on accurate spatial and temporal detection of MPB outbreaks, which can be
derived and interpreted from remotely sensed data acquired at regular intervals over long time
periods.
Satellite imagery has been widely used to provide information about forest coverage. Nationwide
forest land cover mapping in the United States can be traced back to the 1990s (Loveland et al.,
1991; Zhu and Evans, 1994). Worldwide forest cover products such as global tropical forest
cover map (Mayaux et al., 1998), global land cover maps (Hansen et al., 2000; Gong et al., 2013)
along with global forest percentage map (Defries et al., 2000) were produced to reflect forest
conditions in relatively broad categories. With multi-temporal approaches which stack remotely
sensed images acquired from multiple dates, one is able to track land use and land cover (LULC)
changes over time. There is a variety of change detection techniques for remotely sensed data
(Singh, 1989; Coppin and Bauer, 1996; Mas, 1999; Hayes and Sader, 2001; Lu et al., 2004). For
forest change detection related to insects, there are studies that independently interpret
single-date satellite imagery to map disturbances (Keane et al., 1994; Bentz and Endreson, 2004).
These studies were limited to detecting the presence of disturbances but not their timing.
Multi-temporal approaches like time-series analysis of trajectories of vegetation indices were
developed (Goodwin et al., 2008; Goodwin et al., 2010). Similarly, Landsat-based detection of
Trends in Disturbance and Recovery (LandTrendr) was introduced using yearly Landsat time-
series stacks to extract spectral trajectories (Kennedy et al., 2010; Cohen et al., 2010).
Particularly, it was applied to detect MPB outbreaks (Meigs et al., 2011). There is another
algorithm featuring an automatic streamlined procedure for detecting forest changes using
Landsat time-series stacks, which is called Vegetation Change Tracker algorithm (VCT, Huang
et al., 2009; Huang et al., 2010). A recent global map depicting forest changes from 2000 to
2012 was produced using efficient cloud computing (Hansen et al., 2013). Generally, a reliable
forest change map is a crucial source of reference in gaining insights into mechanisms of spatial
patterns of forest changes.
Although having high ground resolution, aerial images often lack temporal consistency in data
acquisition and cost for private sources is high. As for satellite imagery, Landsat TM had a long
life of nearly 30 years of operation, with 16-day revisit intervals, and a medium spatial resolution
of 30 meters, and thus is suitable for mapping the MPB outbreaks over a long period at
appropriate spatial scale. For disturbance detection approaches that use Landsat imagery, there
are limitations when applied to mapping the outbreaks of MPB in the Southern Rocky Mountain
region. Goodwin et al. (2008; 2010) only explored one spectral index to detect disturbances with
cloud-free subsets and thresholds. LandTrendr also used a single spectral index at one time to
process the time-series and there are too many parameters for users to easily test for optimal
values. The VCT algorithm relied on thresholds to classify its forest index into low and high
values. A disturbance was determined based on consecutive high values; thus, it is inaccurate in detecting
slow-onset disturbances. The global forest change map by Hansen et al. (2013) classified time-
series images for each continent which might not adapt to certain local regions. It used
supervised classification to determine forest percentage cover, forest loss and forest gain,
potentially leading to inaccuracies when determining the start year of slow-onset disturbances.
Furthermore, these studies are neither particularly adapted to areas with frequent cloud cover,
nor are they able to separate rapid-onset from slow-onset disturbances.
Generally, the major difficulties in disturbance mapping are: 1) inconsistent quality of data due
to various atmospheric conditions, cloud, shadows and snow; 2) separating slow and abrupt
changes; 3) adaptation to local environment. To overcome these difficulties, we developed a new
semi-automatic algorithm named Berkeley Indices Trajectory Extractor (BITE) to detect
forest disturbances, especially slow-onset disturbances, by integrating multiple spectral indices
from medium resolution remote sensing images. Landsat 5 TM time-series stacks with a medium
spatial resolution of 30 m were chosen for their temporal coverage over the last decade. Our
algorithm can overcome the difficulties in forest change studies described above, namely frequent
cloud cover and the separation of slow-onset from rapid-onset disturbances. Moreover, our algorithm does not rely on
parameters/thresholds and thus reduces complications and potential subjective errors at the users’
end. BITE enables mapping forest disturbances in the regions where MPB outbreaks occur,
which supports the studies of analyzing and modelling the dispersal pattern of the MPB
outbreaks. Furthermore, we expect our algorithm to detect slow-onset and rapid-onset
disturbances without being limited to particular research contexts. In this paper, we present the
BITE algorithm in detail and initial validation results for 1 Landsat path/row.
2 Methodology
2.1 Study Area
Our study area was Grand County, Colorado (Figure 1), located in the Southern Rocky Mountain
region, which has been the epicenter of a widespread mountain pine beetle outbreak that started
around 2000 (Chapman et al., 2012). Grand County is one of the largest counties in the state of
Colorado, with a land area of 4,843 km² and a total population of 14,608 as of 2012 according to
U.S. Census Explorer data (America Community Survey Office, 2013). Towns such as Granby,
Grand Lake, and Winter Park attract a great number of tourists, particularly in the winter. The
majority of the county is covered by forests, dominated by coniferous species, primarily
lodgepole pine (Pinus contorta), Engelmann spruce (Picea engelmannii) and subalpine fir
(Abies lasiocarpa). Quaking aspen (Populus tremuloides) is the primary species in deciduous
forests that cover approximately 15% of the forested area in Grand County according to the
United States Forest Service (USFS) Forest Inventory and Analysis (FIA) (Ruefenacht et al.,
2008). The study area is partially covered by snow above a certain altitude throughout the spring,
fall, and winter, and there is a high frequency of precipitation during the summer. The local
climate leaves a narrow temporal window for acquiring satellite images with little cloud and
snow cover.
Figure 1 Study Area: Grand County, CO, USA.
2.2 Disturbance Mapping Procedure
BITE is a semi-automatic streamlined process that comprises the following steps: 1) Image
Preprocessing; 2) Calculating spectral indices and stacking indices along the timeline; 3)
Trajectory Extraction; 4) Trajectory Interpretation; and 5) Post-classification Processing. The
flow of the procedure is shown in Figure 2. The steps are explained in the following subsections.
Figure 2 Flowchart of processing steps used in the BITE algorithm.
2.2.1 Data and Preprocessing
The spatial frame of a single Landsat satellite scene at Path/Row of 34/32 includes the study
area. Landsat 5 TM images were acquired from 2001 to 2011 during leaf-on seasons (June to
September) to obtain information about vegetation during the peak of the growing season and to
avoid snow cover in the images. The acquired images were preprocessed at the L1T level, which
means systematic radiometric and geometric accuracy were ensured by using ground control
points and a digital elevation model for topographic correction. Images were processed into
surface reflectance using the Land Ecosystem Disturbance Adaptive Processing System
(LEDAPS, Masek et al., 2012). LEDAPS carries out radiance calibration, top-of-atmosphere
reflectance conversion and atmospheric correction. Subsequently, the Fmask algorithm (Zhu and
Woodcock, 2012) that takes advantage of object-based image analysis in cloud masking was
used to mask clouds and their shadows on the ground as well as snow cover and water bodies. A
total of 23 images were collected, each with less than 33% cloud and snow cover combined over
the study area. Therefore, the percentage of cloud- and snow-free land area was at least 66% of
the entire study area. Visual inspections were also carried out to remove unmasked snow cover
and bright scan lines caused by overexposure over large areas of cloud. The acquisition dates and
cloud-/snow-free area percentages of the images are listed in Table 1.
Table 1 Data acquisition dates and land percentage. Starred images have thin bright scanlines.

Acquisition Date    Cloud/Snow/Shadow/Water    Land
2001 June 28        20.7%                      79.3%
2001 July 30*       17.3%                      82.6%
2002 July 1          1.7%                      98.4%
2002 July 17*       11.3%                      88.7%
2003 July 4          6.7%                      93.2%
2003 Sep 22          3.5%                      96.5%
2004 July 6         10.6%                      89.4%
2005 Aug 26*        25.9%                      74.2%
2005 Sep 11          2.2%                      97.8%
2006 June 26*       11.2%                      88.9%
2006 July 28         2.4%                      97.6%
2006 Aug 29          3.8%                      96.2%
2007 June 29*       22.5%                      77.5%
2007 July 15        13.8%                      86.2%
2007 July 31*       21.3%                      78.7%
2007 Aug 16*        15.2%                      84.8%
2008 Aug 18         31.7%                      68.2%
2009 Aug 5           8.7%                      91.3%
2009 Aug 21*         1.8%                      98.2%
2010 June 21         6.2%                      93.8%
2010 Sep 25          3.1%                      96.9%
2011 June 8         22.6%                      77.4%
2011 Aug 27*        17.3%                      82.6%
In addition to the Landsat images, three aerial image mosaics from 2005, 2009 and 2011
collected by the National Agriculture Imagery Program (NAIP) covering the study area were
obtained for visual disturbance detections for validation purposes. NAIP imagery has a 1 meter
spatial resolution, providing sufficient details for visual interpretations at the level of individual
trees. The NAIP imagery was used to identify forest disturbances.
We also visited the study area in 2010, 2012 and 2013 and collected field samples, using handheld
Global Positioning System (GPS) units to record geographic coordinates. In 2010, 118 field
plots with GPS coordinates were visited, each covering a circular area 8 meters in radius,
in which healthy trees, unhealthy trees, sick trees and dead trees were counted
along with many individual tree and stand measurements (Caldwell et al., 2013). In the two
subsequent field trips, a few plots sampled in 2010 were revisited and 76 new locations were
added, where surrounding tree status and environment elements were documented.
For supervised classification and accuracy assessment, a total of 301 pixels consisting of 111
persistent forest pixels, 98 slow-onset disturbance pixels and 92 rapid-onset disturbance pixels
were selected as the training set, while a total of 602 pixels consisting of 222 persistent forest
pixels, 195 slow-onset disturbance pixels and 185 rapid-onset disturbance pixels were selected as
the test set.
Field plots were not evenly distributed throughout the entire study area as some forested areas
are remote and difficult to access due to the limitation of cost and time. Thus most samples were
selected in NAIP aerial images. Both the training and test pixels were randomly selected within
the forest cover throughout the study area. The interpretation of the sample pixels incorporates
the field sample co-registered with GPS in 2010, 2012 and 2013, and visual identification of
NAIP image mosaics in 2005, 2009 and 2011. Particularly, locations of the field plots were
inspected in NAIP images to help identify infested trees from NAIP images based on the
differences between healthy and infested trees in color and texture. Generally, the crown canopy
of infested trees is red or brown and is easily distinguished from that of healthy trees. Dead
trees lose all needles and thus appear as grey branches and trunks. The criterion for determining
a slow-onset disturbance sample is either an increased number of infested/dead trees observed from
2005 to 2011, or at least 30% of trees attacked within roughly a 45 meter radius (3 × 3 pixels)
over the same time period.
2.2.2 Spectral Indices
A total of 9 spectral indices calculated from multiple band combinations for unmasked pixels
were used to test the response performance in the time-series trajectory with regard to forest
disturbances. They include NDVI (Tucker, 1979), NDSI (Shimamura et al., 2006), NBR (Key
and Benson, 2005), NDWI (McFeeters, 1996), NDMI (Wilson and Sader, 2002), EVI (Huete et
al., 1997) and Tasseled-cap indices (Crist and Cicone, 1984). The complete names and equations
for each index are listed in Table 2. Radiometric transformation takes advantage of combining
information from multiple spectral bands to a single spectral index. The spectral indices reflect
dynamic spectral features of certain cover types, forming time-series trajectories from which
either slow or rapid changes could be interpreted. Furthermore, misregistration in land cover
change studies can cause significant errors (Dai and Khorram, 1998). To reduce such errors, a
3 by 3 low pass filter was applied to the unmasked pixels of the spectral indices (Gong et al.,
1992). For each index, a time-series image was created by stacking the sequence of 23 images
from 2001 to 2011.
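The low pass filtering step could be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the filter kernel, so a plain 3 by 3 mean is assumed; masked pixels are represented by None, are excluded from each window and stay masked; windows are truncated at the image border. The function name low_pass_3x3 is hypothetical.

```python
def low_pass_3x3(grid):
    """Smooth a 2-D index image with a 3 x 3 mean filter (assumed kernel).

    grid is a list of rows; masked pixels are None. Masked pixels remain
    masked and are left out of the average of neighbouring windows.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                continue  # keep masked pixels masked
            # Collect the unmasked values of the (border-truncated) window.
            window = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if grid[rr][cc] is not None
            ]
            out[r][c] = sum(window) / len(window)
    return out
```

Averaging only over unmasked neighbours keeps cloud or snow values from leaking into valid pixels, at the cost of smaller effective windows near masked areas.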
Table 2 The list of the spectral indices.

Spectral Index                            Equation
Normalized Difference Vegetation Index    NDVI = (B4 - B3)/(B4 + B3)
Normalized Difference Snow Index          NDSI = (B2 - B5)/(B2 + B5)
Normalized Burn Ratio                     NBR = (B4 - B7)/(B4 + B7)
Normalized Difference Water Index         NDWI = (B2 - B4)/(B2 + B4)
Normalized Difference Moisture Index      NDMI = (B4 - B5)/(B4 + B5)
Enhanced Vegetation Index                 EVI = 2.5 × (B4 - B3)/(B4 + 6 × B3 - 7.5 × B1 + 1)
Brightness (Tasseled Cap)                 Brightness = 0.3037B1 + 0.2793B2 + 0.4343B3 + 0.5585B4 + 0.5082B5 + 0.1863B7
Greenness (Tasseled Cap)                  Greenness = -0.2848B1 - 0.2435B2 - 0.5436B3 + 0.7243B4 + 0.084B5 - 0.18B7
Wetness (Tasseled Cap)                    Wetness = 0.1509B1 + 0.1793B2 + 0.3299B3 + 0.3406B4 - 0.7112B5 - 0.4572B7
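As a small worked illustration of Table 2, the normalized-difference indices and EVI can be computed per pixel from surface reflectance. The sketch below assumes reflectance values scaled to 0..1; the function names and the sample reflectances are hypothetical, and the Tasseled Cap indices are omitted for brevity.

```python
def normalized_difference(x, y):
    """Return (x - y) / (x + y), or None when the denominator is zero."""
    total = x + y
    return (x - y) / total if total != 0 else None


def spectral_indices(b1, b2, b3, b4, b5, b7):
    """Band-ratio indices of Table 2 for one pixel (TM bands 1-5 and 7)."""
    return {
        "NDVI": normalized_difference(b4, b3),
        "NDSI": normalized_difference(b2, b5),
        "NBR": normalized_difference(b4, b7),
        "NDWI": normalized_difference(b2, b4),
        "NDMI": normalized_difference(b4, b5),
        "EVI": 2.5 * (b4 - b3) / (b4 + 6 * b3 - 7.5 * b1 + 1),
    }


# Hypothetical reflectances for a healthy conifer pixel.
pixel = spectral_indices(b1=0.03, b2=0.05, b3=0.04, b4=0.36, b5=0.12, b7=0.06)
```

For this made-up pixel, NDVI is (0.36 - 0.04)/(0.36 + 0.04) = 0.8, a value typical of dense green vegetation.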
2.2.3 Trajectory Extraction
Trajectories in this paper are defined as the individual line segments that represent unique trends
of spectral indices within the complete time series. The trajectory extraction process can be
further divided into three steps: Inter-year value selection, Noise removal, and Segmentation. A
diagram demonstrating the trajectory extraction process is presented in Figure 3 using NDVI as
an example. The figure presents the temporal dynamics of an MPB-attacked pixel, whose 23 index
values from different dates were converted into 3 segments indicating changes within
the trajectory.
Figure 3 Example of an NDVI time series for 1 disturbed pixel, and intermediate results of
processing steps in the time series. These processing steps include 1) Inter-year value selection
(a)-(b); 2) Noise removal (b)-(c); 3) Segmentation (c)-(d).
Inter-year Value Selection
Due to the variable weather conditions in the Southern Rocky Mountain region, it is unpredictable
whether a given location in the study area will be covered by cloud, shadow or snow on any given
date of satellite overpass. For instance, 4 images were selected in 2007 as opposed to only one
in 2004 (Table 1), and the pixel location used in Figure 3(a) was masked for cloud coverage in
2004. The strategy in our algorithm is to select the most suitable value in a year when multiple
values exist, and to interpolate across years for which no cloud/shadow/snow-free data are
available.
The most suitable value for a year is approximated from its neighboring years, which takes a few
iterations until the values no longer change. Specifically, the algorithm initializes each year
with its first value (the earliest in the year derived from the images), then fills the gaps in
the data using linear interpolation. Head and tail gaps are filled with the first/last available
value. The result is a list of 11 values with only one value in each year. The list is then
examined, and for each multiple-valued year the value closest to the mean of the values of the
previous year and the subsequent year is selected. In the special case of the head and the tail,
only the one available direction is examined (e.g. the value of the subsequent year for the
head). The gaps are then re-interpolated, and the examination and selection are repeated as
described above; the process terminates when no further changes occur. With the value selection
and gap filling, the result is a time-series value sequence with a uniform temporal interval of
1 year (Figure 3(b)).
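The inter-year value selection described above might look roughly like the sketch below. It is an interpretation, not the BITE source code: the function names are hypothetical, each year is given as a list of valid index values in acquisition order (empty when every image of that year was masked), each selection pass works on the series from the previous pass, and a max_iter guard is added as an assumption in case the selection oscillates.

```python
def interpolate_gaps(values):
    """Fill None entries linearly; head/tail gaps take the first/last known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid observation in any year")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def select_yearly_values(obs_by_year, max_iter=100):
    """Reduce several observations per year to one value per year."""
    n = len(obs_by_year)
    # Initialise each year with its earliest valid observation.
    chosen = [obs[0] if obs else None for obs in obs_by_year]
    series = interpolate_gaps(chosen)
    for _ in range(max_iter):
        changed = False
        for i, obs in enumerate(obs_by_year):
            if len(obs) < 2:
                continue  # nothing to choose from
            # Head and tail only have one neighbouring year.
            neighbours = [series[j] for j in (i - 1, i + 1) if 0 <= j < n]
            if not neighbours:
                continue
            target = sum(neighbours) / len(neighbours)
            best = min(obs, key=lambda v: abs(v - target))
            if best != chosen[i]:
                chosen[i], changed = best, True
        series = interpolate_gaps(chosen)  # re-interpolate the gaps
        if not changed:
            break
    return series
```

With hypothetical NDVI observations [[0.80], [0.79, 0.50], [], [0.78], [0.30, 0.77]], the cloudy-looking 0.50 and 0.30 are never preferred and the empty third year is interpolated between its neighbours.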
Noise Removal
Noise may remain in the images even after low pass filtering, for instance residual cloud cover,
shadow and snow cover (Kennedy et al., 2010), and device errors including undetected bright scan
lines and misregistration. In addition, no relative calibration was done among the images to
reduce uncertainties in Landsat 5 TM calibration (Thorne et al., 1997). The noise removal
technique in BITE therefore aims to remove abnormal data fluctuations. Noise rarely occurs
within a pixel in two temporally consecutive scenes, which suggests that a noisy value forms a
spike deviating from the distribution of values in the series (Figure 3(b)). We used the
standard deviation as the threshold to determine whether a sample value differed substantially
from those of the previous year and the subsequent year. Because the standard deviation reflects
the variation of the values in the sequence, relatively flat sequences are less tolerant of
abnormal spikes while disturbed sequences are more tolerant. In each iteration, only the
greatest spike is removed as noise, and the standard deviation is then recalculated for noise
detection in the next iteration. We replaced the noise with the closer of the values of the year
before and the year after, rather than their mean, to avoid over-smoothing when the noise happens
to occur at a rapid-onset disturbance, in which case one neighboring value is quite close to the
noise value while the other is far from it. Normally, the previous and subsequent values of the
detected noise are similar and the noise filtering does not affect the trend.
Furthermore, it was sometimes difficult to determine whether the first or the second value was
noise when there was a significant difference between the two (Figure 3(b)). The same could
apply to the tail of the series. We simplified this problem by inserting the mean of the second
and third samples ahead of the first sample, and applied a similar operation to the tail. The
appended samples were removed after the noise removal was complete. The reasoning behind this
solution considers two scenarios. In the first scenario, the second and third samples are
similar and both are distant from the first sample; the first sample is then treated as noise,
because the appended sample, similar in value to the second and third samples, turns it into a
spike. In the second scenario, the second and third samples differ largely in value, so it is
difficult to tell which sample is noise; our approach then makes a ‘mild’ guess by appending the
average of the second and third samples before the first sample, and proceeds with the noise
removal iteration, which removes outstanding spikes and constrains the series to some extent.
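One possible reading of the noise removal step is sketched below. Several details are assumptions because the text leaves them open: a spike is taken to be a value lying strictly above or strictly below both neighbours, its size is the smaller of the two differences, the threshold is one population standard deviation of the padded series, and the function name remove_noise is hypothetical.

```python
from statistics import pstdev


def remove_noise(series):
    """Iteratively replace the largest spike with its closer neighbour."""
    if len(series) < 3:
        return list(series)
    # Pad head and tail with the mean of the two nearest inner samples so
    # that the first and last samples can also be tested as spikes.
    head = (series[1] + series[2]) / 2
    tail = (series[-2] + series[-3]) / 2
    x = [head, *series, tail]
    for _ in range(10 * len(x)):  # safety bound on the number of iterations
        threshold = pstdev(x)  # recomputed after every replacement
        worst, worst_size = None, 0.0
        for i in range(1, len(x) - 1):
            d_prev, d_next = x[i] - x[i - 1], x[i] - x[i + 1]
            if d_prev * d_next <= 0:
                continue  # not a local extremum, so not a spike
            size = min(abs(d_prev), abs(d_next))
            if size > threshold and size > worst_size:
                worst, worst_size = i, size
        if worst is None:
            break
        prev_v, next_v = x[worst - 1], x[worst + 1]
        closer_is_prev = abs(x[worst] - prev_v) <= abs(x[worst] - next_v)
        x[worst] = prev_v if closer_is_prev else next_v
    return x[1:-1]  # drop the padding samples
```

Replacing a spike with the closer neighbour rather than the mean keeps a genuine abrupt drop intact, since a real rapid-onset change is close to at least one neighbour and is therefore not flagged.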
Segmentation
Segmentation is the process of simplifying the time series into a sequence of one or more parts
(segments) that represent different forest conditions (e.g. undisturbed, disturbed, recovering).
As the study period is 11 years, complicated change patterns (e.g. multiple disturbance and
recovery cycles) are unlikely to occur. Therefore, the maximum number of segments was set to 4.
For multiple segments, an exhaustive search over all possible breakpoints between segments was
performed, since there is a limited number of possible combinations for a sequence of 11 points.
Breakpoints were defined as the endpoints between segments and were selected from the points of
the noise-removed time series. Segments were formed by linearly connecting a set of breakpoints
together with the first and the last sample points. For a time-series sequence a with n sample
points and a set of breakpoints \(1 < p_1 < p_2 < \ldots < p_m < n\), where m is the number of
breakpoints (the number of segments minus one), the trajectory function is

\[
\hat{a}(i) =
\begin{cases}
\dfrac{a(p_1) - a(1)}{p_1 - 1}\,(i - 1) + a(1), & i \in [1..p_1] \\[6pt]
\dfrac{a(p_j) - a(p_{j-1})}{p_j - p_{j-1}}\,(i - p_{j-1}) + a(p_{j-1}), & i \in [p_{j-1}+1..p_j],\ j = 2, \ldots, m \\[6pt]
\dfrac{a(n) - a(p_m)}{n - p_m}\,(i - p_m) + a(p_m), & i \in [p_m+1..n]
\end{cases}
\tag{1}
\]

The best breakpoint combination is the one with the minimum sum of squares

\[
\sum_{i=1}^{n} \left(\hat{a}(i) - a(i)\right)^2 .
\]

The coefficient of determination (R²) is a statistical measure of model performance, used here
to evaluate the goodness of fit of the trajectory to the time-series sequence after noise
removal. R² never decreases as the number of segments increases, because any segment can be
broken into two segments to decrease or maintain the sum of squares. However, overfitting with
more than the intended number of segments introduces redundant information and even noise into
the next stage of trajectory interpretation. In practice, therefore, the tradeoff between the
number of segments and model fit is evaluated using three thresholds as criteria to determine
the optimal number of segments: an R² threshold (RThres), a minimum R² (MinR) and an incremental
R² threshold (IThres). If the R² of the trajectory is greater than RThres, or if it is greater
than MinR and its increment in R² over the trajectory from the previous iteration with one fewer
segment is smaller than IThres, then the current trajectory is selected as the final trajectory.
Empirically, after a few tests, we set RThres to 0.95, MinR to 0.9 and IThres to 0.04. RThres can
be regarded as a satisfactory level of model fit, implying that more segments could lead to
overfitting. MinR defines a minimum goodness of fit beyond which, if one more segment cannot
increase R² substantially (by at least IThres), the contribution of more segments is considered
trivial. In the example demonstrated in Figure 4, the trajectory with 3 segments reached an R²
above 0.95 and was therefore the optimal selection, while the trajectory with 4 segments was
apparently redundant in number of segments.
Figure 4 Example of the intermediate processing steps producing segments of the entire
NDVI trajectory for 1 pixel (Segmentation Process).
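The exhaustive breakpoint search and the threshold-based choice of the number of segments can be sketched as below. The sketch makes some assumptions that the text does not state: R² is computed as 1 minus the ratio of the residual to the total sum of squares, the trajectory with the maximum number of segments is returned when no criterion is met, and indices are 0-based (the text uses 1-based positions). All function names are hypothetical.

```python
from itertools import combinations


def fit_through(a, knots):
    """Piecewise-linear trajectory through the samples at the given knots."""
    fitted = []
    for left, right in zip(knots, knots[1:]):
        slope = (a[right] - a[left]) / (right - left)
        fitted.extend(a[left] + slope * (i - left) for i in range(left, right))
    fitted.append(a[knots[-1]])
    return fitted


def best_trajectory(a, n_segments):
    """Try every breakpoint combination; keep the minimum sum of squares."""
    n = len(a)
    best = None
    for interior in combinations(range(1, n - 1), n_segments - 1):
        knots = (0, *interior, n - 1)
        fitted = fit_through(a, knots)
        sse = sum((f - v) ** 2 for f, v in zip(fitted, a))
        if best is None or sse < best[0]:
            best = (sse, knots, fitted)
    return best


def segment(a, max_segments=4, r_thres=0.95, min_r=0.9, i_thres=0.04):
    """Add segments until the R2 criteria described in the text are met."""
    mean = sum(a) / len(a)
    sst = sum((v - mean) ** 2 for v in a)
    prev_r2 = None
    for k in range(1, max_segments + 1):
        sse, knots, fitted = best_trajectory(a, k)
        r2 = 1.0 - sse / sst if sst > 0 else 1.0
        small_gain = prev_r2 is not None and r2 - prev_r2 < i_thres
        if r2 > r_thres or (r2 > min_r and small_gain):
            break
        prev_r2 = r2
    return knots, fitted, r2
```

For a sequence of 11 points and at most 4 segments, the search visits at most 1 + 9 + 36 + 84 = 130 breakpoint combinations per pixel and index, which is why an exhaustive search is affordable here.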
2.2.4 Trajectory Interpretation
In BITE, the trajectory alone does not differentiate forest from non-forest. Therefore,
a prior forest cover map is required, and the National Land Cover Database 2001 (NLCD 2001,
Homer et al., 2007) was chosen for this purpose as it represents land cover at the start of our
study period at the identical spatial resolution of 30 meters. For each of the nine spectral
indices for which we generate trajectories, six features describing the trajectories were extracted:
minimum slope, maximum slope, minimum range change, maximum range change, minimum
value and maximum value. Minimum slope is the negative segment slope with the greatest absolute
value, while maximum slope is the largest positive segment slope. Minimum range change is
the most negative change in index value over a segment, while maximum range change is the
largest positive change. Minimum value and maximum value are the trajectory minimum
and maximum index values, respectively. With the six features for each of the spectral indices,
supervised classification algorithms are able to classify a trajectory into three disturbance classes:
Persistent Forest, Slow-onset Disturbance and Rapid-onset Disturbance. In this study,
Classification and Regression Tree (CART) and Support Vector Machine (SVM) were tested for
forest disturbance type classification (Hastie et al., 2009).
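The six trajectory features could be derived from the segment endpoints roughly as follows. This is a sketch under one explicit assumption: when a trajectory has no negative (or no positive) segment, the corresponding slope and range change are set to 0, a case the text does not address. The function name and dictionary keys are hypothetical.

```python
def trajectory_features(knot_years, knot_values):
    """Six features of a segmented trajectory given its endpoints.

    knot_years  : years at the segment endpoints, in increasing order
    knot_values : fitted index values at those endpoints
    """
    changes = [v1 - v0 for v0, v1 in zip(knot_values, knot_values[1:])]
    slopes = [
        dv / (y1 - y0)
        for dv, y0, y1 in zip(changes, knot_years, knot_years[1:])
    ]
    return {
        # Assumed fallback of 0 when no segment falls (or rises).
        "min_slope": min(min(slopes), 0.0),
        "max_slope": max(max(slopes), 0.0),
        "min_range_change": min(min(changes), 0.0),
        "max_range_change": max(max(changes), 0.0),
        "min_value": min(knot_values),
        "max_value": max(knot_values),
    }


# Hypothetical NDVI trajectory: stable, four-year decline, stable.
features = trajectory_features([2001, 2004, 2008, 2011], [0.8, 0.8, 0.4, 0.4])
```

With all nine indices in use, each pixel is described by 9 × 6 = 54 features, which form the input vector for the supervised classifiers.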
CART and SVM classification algorithms were implemented in Matlab 2013a to test the training
set using stratified 5-fold cross validation (Kohavi, 1995) and to validate with the test set. CART
is a built-in toolset in Matlab, while LIBSVM is an external tool package to carry out multi-class
classification with grid search of optimal parameters (Chang and Lin, 2011). CART divides the
feature space with thresholds into rectangular partitions with such splitting determined by
training data. The interpretability of CART classes is a major advantage of this algorithm. In the
trajectory interpretation process, CART generates classification rules for trajectory features for
slopes, range changes and boundary values. SVM with the Radial Basis Function (RBF) kernel
classifies data by nonlinearly mapping data into higher dimensional space, which is suitable for
splitting datasets in low dimensional space with complex decision boundaries (Hsu et al., 2003).
We used ‘accuracy’ as the indicator of how well the combinations of classifiers and indices
perform. Overall accuracy is defined as the number of correctly classified sample pixels divided
by the number of total sample pixels.
Two accuracies were assessed for each test: 1) 5-fold cross validation was evaluated on the
training set containing 301 pixels. In a 5-fold cross validation, we randomly divided the data
into 5 partitions of (nearly) equal size, using 4 partitions to train the classifier and 1
partition for validation. This was repeated so that each of the 5 partitions was used once for
validation, and the results were combined into one estimate of accuracy. 2) We trained the
classifier with the entire training set, and then used it to classify the test set, which
consisted of 602 pixels independent of the training set. Particularly for SVM, the two parameters
γ and c of the RBF kernel were grid searched (Hsu et al., 2003) with 5-fold cross validations on
the training set. Both cross-validations on the training set and validation on the test set were
conducted to evaluate the performance of the classification algorithms and the combination of
spectral indices. Several of the best combinations of indices and classifiers were chosen to make
forest change type maps. When a pixel is detected as disturbance, the start year of the
disturbance is determined as the left endpoint of the trajectory segment with the minimum or the
maximum slope depending on whether the index decreases or increases with a disturbance. Nine
disturbance maps were created from the time-series images of 9 spectral indices independently.
Subsequently, an integrated disturbance map was created by a simple plurality vote over a
selection of these disturbance maps, in which each pixel is labeled with the category receiving
the most votes. If a pixel is labeled as disturbance, the corresponding start year of the disturbance
is determined as the median of the start years of the winning voters.
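The 5-fold cross validation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `k_fold_indices` and `cross_val_accuracy`, the `fit` and `predict` callables and the random seed are hypothetical and do not come from the original processing chain; any classifier (for example CART or an RBF-kernel SVM) could be supplied through them. Note that 301 pixels cannot be divided into exactly equal partitions, so in this sketch one fold holds 61 pixels and the other four hold 60.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, held_out) index lists for k nearly equal random folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # random division of the samples
    folds = [idx[i::k] for i in range(k)]     # fold sizes differ by at most 1
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Pool the held-out predictions of all k folds into one accuracy."""
    correct = 0
    for train, held_out in k_fold_indices(len(y), k, seed):
        model = fit([X[j] for j in train], [y[j] for j in train])
        preds = predict(model, [X[j] for j in held_out])
        correct += sum(p == y[j] for p, j in zip(preds, held_out))
    return correct / len(y)
```

Pooling the correct counts over all folds, rather than averaging five per-fold accuracies, is one way to combine the results into a single estimate; the text does not say which of the two was used.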
2.2.5 Post-classification Process
To remove speckle noise and increase spatial connectivity of homogeneous patches, a minimum
mapping unit (MMU) filter is favored in a number of studies (Homer et al., 2007; Thomas et al.,
2010). The filter replaces patches of connected pixels smaller than a minimum size with the
majority label of their surroundings. Connectivity can be defined over 4 or 8 neighboring pixels.
Connecting 8 neighboring pixels in Landsat TM imagery preserves narrow
line features such as roads and rivers, but meanwhile retains residual speckles along patch
boundaries. When MPB attacks reach outbreak levels, it is suitable to use Landsat TM for its
medium spatial resolution of 30 m pixel size (Bentz and Endreson, 2004). Rapid-onset
disturbances such as clearcuts and fires are also identifiable as patches. To better depict the
spatial pattern of disturbances, we set the MMU to 0.5 ha (6 pixels) with an 8-neighbor
connectivity. Filtered pixels were refilled by repeatedly applying a 3 × 3 majority filter until no
further change can be made. Residual unfilled pixels were relabeled with their original classes.
This allows for a proportional expansion of multiple surrounding classes into the filtered patch.
Combining disturbance type and start year yields a total of 19 classes: one persistent forest
class, plus slow-onset and rapid-onset disturbances each combined with a start year from 2001 to
2009. A start year of 2010 or 2011 cannot be derived, since the noise removal process eliminates
sudden value changes at the tail of the sequence, where noise cannot be reliably distinguished
from normal values.
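A minimal sketch of the MMU filter described above, in plain Python, may help make the three stages concrete. Several details are assumptions made for illustration and are not stated in the text: the map is a list of lists of class labels, a patch is removed when it has fewer than `min_size` pixels, each refill pass is applied to all blank pixels simultaneously, and ties in the 3 × 3 majority are broken arbitrarily. The function name `mmu_filter` is hypothetical.

```python
from collections import Counter, deque

# Offsets of the 8 neighbors of a pixel (8-neighbor connectivity).
NEIGHBORS8 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def mmu_filter(grid, min_size=6):
    """Blank 8-connected patches smaller than min_size, then refill them."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    # 1) Find every 8-connected patch and blank the ones below the MMU.
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            patch, queue = [(r, c)], deque([(r, c)])
            while queue:
                pr, pc = queue.popleft()
                for dr, dc in NEIGHBORS8:
                    nr, nc = pr + dr, pc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == grid[r][c]):
                        seen[nr][nc] = True
                        patch.append((nr, nc))
                        queue.append((nr, nc))
            if len(patch) < min_size:
                for pr, pc in patch:
                    out[pr][pc] = None
    # 2) Repeat a 3 x 3 majority fill on blank pixels until nothing changes.
    changed = True
    while changed:
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if out[r][c] is not None:
                    continue
                votes = Counter(
                    out[r + dr][c + dc] for dr, dc in NEIGHBORS8
                    if 0 <= r + dr < rows and 0 <= c + dc < cols
                    and out[r + dr][c + dc] is not None)
                if votes:
                    updates[(r, c)] = votes.most_common(1)[0][0]
        for (r, c), label in updates.items():
            out[r][c] = label
        changed = bool(updates)
    # 3) Pixels that could not be refilled get their original class back.
    for r in range(rows):
        for c in range(cols):
            if out[r][c] is None:
                out[r][c] = grid[r][c]
    return out
```

Because blank pixels are filled from their already labeled neighbors pass by pass, the surrounding classes grow into a removed patch from all sides, which is the proportional expansion the text refers to.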
3 Results and Discussion
3.1 Evaluation of the Classifiers and the Indices
Table 3 Overall accuracies of the classification test results. ‘CV’ represents the cross-
validation test on the training dataset. ‘Test’ represents the evaluation on the test dataset.
Spectral Indices CART 5-CV CART Test SVM 5-CV SVM Test
NDVI 95.3% 88.2% 94.0% 90.9%
NDSI 89.7% 73.9% 98.0% 73.8%
NDWI 94.0% 85.4% 94.0% 90.4%
NDMI 95.7% 87.5% 95.0% 92.4%
NBR 95.7% 88.5% 95.3% 92.2%
EVI 90.7% 70.4% 84.1% 78.7%
Brightness 95.0% 88.9% 93.4% 92.4%
Greenness 93.0% 83.6% 86.0% 84.2%
Wetness 96.7% 89.4% 96.3% 91.5%
(Shaded indices, i.e. NDVI, NDMI, NBR, Brightness and Wetness, were selected for integration)
The parameters that achieved the highest cross-validation accuracy were selected as the optimal
configuration and used to classify the test set. With NDSI or EVI, neither CART nor SVM achieved
an accuracy higher than 80% on the test set (Table 3). Compared to the other indices, these two
were considered less separable because of their generally lower accuracies.
Although the SVM cross-validation accuracy of 98.0% with NDSI surpassed all others, only 73.8%
of the test samples were correctly classified. This could result from overfitting to a small and
possibly biased training set that is then evaluated against a larger test sample (Ng, 1997),
considering that the training set and test set were selected independently throughout the study area.
On average, CART yielded better cross-validation accuracy than SVM (94.0% vs.
92.9%, respectively). In contrast, on the test set the average accuracy of CART was 84.0% compared
to 87.4% for SVM. The smaller gap between the cross-validation and test-set accuracies for SVM
suggests that it is less sensitive to selection bias in the samples. Therefore, although both
cross-validation and test-set results were taken into account when selecting the best spectral
indices, test-set results were given greater weight. After examining the performances, five
spectral indices were selected to test for integrated accuracies: Brightness,
NDMI, NBR, Wetness and NDVI.
Table 4 Overall accuracies of the classification test of integration of multiple indices. The
evaluation was done on the test dataset.
Classifier    Intg. of Brightness, NDMI and NBR    Intg. of Brightness, NDMI, NBR, Wetness and NDVI
CART 93.2% 93.9%
SVM 94.4% 94.7%
Both 93.7% 94.4%
We integrated the predicted labels of the test samples via a simple plurality vote, in which the
winner is the label with the most votes. The integrated labels were validated against the
test set to derive accuracies. We tested the integration of the 3 best and the 5 best indices for
CART and SVM independently, and for CART and SVM together (Table 4). The integration of all 5
indices using SVM achieved an accuracy (94.7%) that surpasses that of any single index on the
test set, as well as that of the combination of SVM and CART with 10 voters (94.4%). Integrating
CART results into those of SVM actually decreased accuracy. Given that the training and test samples
were selected independently, the accuracy achieved on the test samples by the
integration of the 5 indices with SVM was considered less biased. This configuration was thus
taken as optimal for producing the final map.
3.2 Accuracy Assessment
The 602 test sample pixels were used as reference for accuracy assessment to generate a
confusion matrix from which the producer’s accuracies, user’s accuracies and an overall
accuracy for change type were calculated (Table 5). The overall accuracy was 94.7% with a
Kappa coefficient of 0.92, and the producer’s and user’s accuracies for persistent forest and
slow-onset disturbance were all above 90%. For the rapid-onset disturbance, the difference
between the user’s accuracy of 99.4% and the relatively low producer’s accuracy of 89.2%
implies an under-classification of this disturbance type. Similarly, most misclassified samples
fell in the lower left triangle of the confusion matrix, indicating an underestimate of
disturbance severity, in agreement with the individual SVM runs. This was slightly mitigated by
the voting protocol: 3 of the 602 samples resulted in a draw, and in a draw the label is assigned
in the priority order of rapid-onset disturbance, slow-onset disturbance and persistent forest.
For instance, if the 5 maps label a pixel as ‘persistent’, ‘rapid’, ‘rapid’, ‘slow’ and ‘slow’,
respectively, there is a tie between ‘rapid’ and ‘slow’, and the pixel is labeled as ‘rapid’.
In the disturbance map consisting of 2973340 forest pixels, 55807 pixels were labeled in draw
situations, covering an area of 5023 ha. Consequently, there was an increased chance that forest
changes were classified into the more intensive type but a decreased chance of the reverse.
Therefore, the integration
process allowed the automatic classification to be more sensitive to disturbance detections.
Although this trend of underestimation was observed, it was difficult to quantify such areas
accurately.
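The voting protocol, including the tie-breaking priority and the median start year of the winning voters, can be sketched as follows. The function name `integrate_votes` and the label strings are hypothetical. The text does not say how the median is taken when the number of winning voters is even, so the use of `statistics.median_low` (which keeps the result an actual voted year) is an assumption.

```python
from collections import Counter
from statistics import median_low

# Priority used to break ties, from most to least intensive change type.
PRIORITY = ("rapid", "slow", "persistent")


def integrate_votes(labels, start_years):
    """Plurality vote over per-index labels; ties follow PRIORITY.

    labels      : one class label per spectral index map
    start_years : start year voted by each map (None for 'persistent')
    Returns (winning label, start year or None).
    """
    counts = Counter(labels)
    top = max(counts.values())
    winner = next(c for c in PRIORITY if counts.get(c, 0) == top)
    if winner == "persistent":
        return winner, None
    # Median of the start years of the maps that voted for the winner
    # (median_low is an assumption for even counts).
    years = [y for lab, y in zip(labels, start_years) if lab == winner]
    return winner, median_low(years)
```

For the example in the text, the votes ‘persistent’, ‘rapid’, ‘rapid’, ‘slow’ and ‘slow’ tie between ‘rapid’ and ‘slow’, and the priority order resolves the tie to ‘rapid’.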
Table 5 Confusion matrix of the forest change type classification result. The evaluation was
done on the test dataset.
Forest Change Type (rows: Reference; columns: Classified)
Reference      Persistent  Slow-onset  Rapid-onset  Total  Prod. Acc.
Persistent     218         3           1            222    98.2%
Slow-onset     8           187         0            195    95.9%
Rapid-onset    6           14          165          185    89.2%
Total          232         204         166          602
User Acc.      94.0%       91.7%       99.4%
OA = 94.7%, Kappa = 0.92
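As a check on Table 5, the accuracy measures can be recomputed from the confusion matrix. The short Python sketch below uses the standard definitions of overall, producer's and user's accuracy and of Cohen's Kappa; the variable names are illustrative only.

```python
# Rows: reference class; columns: classified class (counts from Table 5).
classes = ["persistent", "slow-onset", "rapid-onset"]
cm = [[218, 3, 1],
      [8, 187, 0],
      [6, 14, 165]]

n = sum(sum(row) for row in cm)                    # total sample pixels
row_totals = [sum(row) for row in cm]              # reference totals
col_totals = [sum(col) for col in zip(*cm)]        # classified totals
diag = [cm[i][i] for i in range(len(cm))]          # correctly classified

overall = sum(diag) / n
producer = [d / t for d, t in zip(diag, row_totals)]   # per reference class
user = [d / t for d, t in zip(diag, col_totals)]       # per classified class

# Cohen's Kappa: agreement beyond what the marginal totals give by chance.
chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
kappa = (overall - chance) / (1 - chance)

for name, p, u in zip(classes, producer, user):
    print(f"{name}: producer {p:.1%}, user {u:.1%}")
print(f"overall {overall:.1%}, kappa {kappa:.2f}")
```

The recomputed values agree with those reported in the table.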
As for the sampling scheme, the samples were selected haphazardly, so the inclusion
probabilities of the samples cannot be calculated and unintended selection bias cannot be ruled
out. As a result, the generalization of the samples to estimate the accuracy of the entire map is
limited (Stehman and Czaplewski, 1998). However, with a relatively large sample size and dispersed
spatial coverage, we are confident that the impact of selection bias in this study was small.
Another factor that affects the product accuracy is the accuracy of the NLCD 2001 forest mask
used as input. The overall accuracy of its level 1 classification (including forest) was reported
as 91% for the region (Wickham et al., 2010), which suggests that the mislabeling error of the
forest area was less than 10%. Other sources of error include sub-pixel misregistration, which
was reduced by a 3×3 average filter during time-series image processing. The MMU filter also
reduced such error, but it may have altered small, correctly labeled patches. The error of the
slope-determined start year of disturbance was not assessed due to the lack of consistent aerial
image coverage in the corresponding years or other reliable references.
3.3 The Disturbance Map Product
The disturbance map (Figure 5) is presented in two schemes highlighting slow-onset and
rapid-onset disturbances, respectively. Since the values selected for image analysis were from the
summer season of each year, the ‘start year’ of an MPB disturbance should be understood as the
period from the summer of that year to the summer of the next year. Needles on
MPB-attacked trees turn red within one year (Safranyik and Wilson, 2007) and are then
observable visually and spectrally. The spatial spread pattern can also be roughly assessed
from the disturbance map: the area newly disturbed each year amounts to thousands of
hectares, which agrees with the spatial synchrony of MPB at lag distances on the order of 100 km
(Peltonen et al., 2002). The time required for MPB attacks to spread at the pixel
level is therefore much shorter than the time required for the trees to turn red following the
attack. It is thus reasonable, for MPB mortality, to shift the disturbance start period to the year
before first detection. However, such a temporal shift does not apply to other disturbances, which
are detected immediately, between the summer of the labeled start year and the next summer.
Consequently, the start year should be interpreted differently according to the disturbance type.
Figure 5 Outputs of the BITE algorithm, including starting year of (left) slow-onset
disturbances and (right) rapid-onset disturbances.
From 2001 to 2003, the dispersal of MPB was slow, covering a total of 6117 ha. The major
outbreak took place from 2004 to 2006, as was also observed by Chapman et al. (2012), affecting
18876 ha, 47261 ha and 27350 ha of forested land in the three years, respectively (Figure 6).
Synchronous outbreaks of MPB were also observed in previous studies of the northern Rocky
Mountain area (Goodwin et al., 2008; Honey-Marie et al., 2011). Over the 9-year period, the total
area disturbed by MPB attacks was 111443 ha, far exceeding the 11494 ha of rapid-onset
disturbance. Combined, slow-onset and rapid-onset disturbances affected 46% of the forested
area in Grand County over the 9-year period.
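The area figures quoted in this chapter follow directly from the 30 m Landsat TM pixel size (0.09 ha per pixel). The short Python check below recomputes them from numbers given in the text; the variable names are illustrative only.

```python
PIXEL_SIZE_M = 30
HA_PER_PIXEL = PIXEL_SIZE_M ** 2 / 10_000      # 900 m2 = 0.09 ha

forest_pixels = 2_973_340                      # forest pixels in the map
draw_pixels = 55_807                           # pixels labeled in draw situations
slow_ha, rapid_ha = 111_443, 11_494            # disturbed areas, 2001-2009

forest_ha = forest_pixels * HA_PER_PIXEL       # total forested area
draw_ha = draw_pixels * HA_PER_PIXEL           # area decided by tie-breaking
mmu_ha = 6 * HA_PER_PIXEL                      # minimum mapping unit of 6 pixels
disturbed_share = (slow_ha + rapid_ha) / forest_ha

print(f"forest area {forest_ha:.0f} ha, draws {draw_ha:.0f} ha, "
      f"MMU {mmu_ha:.2f} ha, disturbed {disturbed_share:.0%}")
```

The 6-pixel MMU corresponds to 0.54 ha, which the text rounds to 0.5 ha.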
Figure 6 Area affected by different disturbance types for 2001-2009.
The spatiotemporal pattern of the dispersal of MPB attacks can be easily recognized by visual
inspection. The attack spread outwards from the initial locations in all directions and was not
constrained by physical connections between forest patches. The consistent annual spread of the
outbreak to neighboring regions adds confidence in the maps of disturbance types and start
years. Three locations were identified from the disturbance map as the earliest MPB-attacked
forests: Northeast Lake Granby, Arapaho National Forests and East Fork
Troublesome Creek Valley. All three locations are dominated by MPB-susceptible
lodgepole pines. In particular, the outbreak that originated in East Fork Troublesome Creek and
its surrounding mountains was examined in an enlarged view of the disturbance map (Figure 7).
Before the outbreak in 2004, the infested region was limited to hillsides 0.5 km to 4 km from
the East Fork Troublesome Creek. During the outbreak from 2004 to
2006, MPB rapidly expanded outwards to neighboring areas regardless of terrain, while the
inward dispersal was very slow. After the major outbreak period, the persistent forests
near the center of the region by the creek were attacked. By visually inspecting NAIP imagery
from 2005 and 2011, we found that MPB was present in the center region in 2005 during the
outbreak, though at very low levels (less than 10% infested trees per pixel). These levels were
below the detection capability of our algorithm, because either infested trees did not
comprise a majority of a pixel or the disturbance extent was under the MMU (6 pixels). We
assume that the center area contained a lower density of host tree species, or that biophysical
conditions made it more resistant to MPB attacks, and that it was sought out by MPB only after
locations with more favorable conditions had been depleted (Raffa and Berryman, 1983). This
pattern could also be attributed to other factors such as synchronous adult emergence implied by
tree maturity (Cole and Amman, 1980), the seasonal pattern of temperatures (Logan et al., 1998), or
certain terrain features (Honey-Marie et al., 2011).
Figure 7 Enlarged view of the BITE output showing the starting year of slow-onset
disturbances.
For the rapid-onset disturbances, there are also three locations, around which forests were
abruptly removed by natural successions or human activities. Since disturbances such as wildfire
were not observed during the considered time frame, all rapid-onset disturbances were assumed
to be human caused. In developed areas such as Lake Granby and the Fraser Valley, the
surrounding forests were cleared for construction of roads, snow tracks, residential and resort
development. In the Arapaho National Forests, the rapid-onset disturbances were the treatments
on federal and private lands where infested trees were removed to increase aesthetic value of the
land, to prevent risks of treefall, to reduce wildfire hazards, and to utilize infested trees for wood
products.
4 Conclusion and Perspectives
Compared with existing disturbance detection algorithms using Landsat TM/ETM+ stacks, such
as the NDMI trajectory (Goodwin et al., 2008), the VCT algorithm (Huang et al., 2010) and
LandTrendr (Kennedy et al., 2010), BITE can select the optimal value from multiple scenes in
each year with tolerance for cloud and snow cover, and perform trajectory extraction using
multiple spectral indices. Although its segmentation process is similar to that of LandTrendr,
BITE takes advantage of multiple spectral indices and supervised learning algorithms, so that it
can accurately separate persistent forest, slow-onset disturbances and rapid-onset disturbances
and identify the start year of a disturbance. BITE also showed its robustness to various
atmospheric conditions, noise, pixel misregistration and residual cloud/snow cover. Furthermore, BITE does
not require any parameters, only a training sample, to proceed with an automated mapping
process. During the test, we found that Brightness, NDMI, NBR, Wetness and NDVI yielded the
best accuracies in detecting disturbances and separating slow- and rapid-onset disturbances. An
integration of the five indices via a plurality vote can further improve the accuracy. However,
BITE has some limitations, including: 1) the prerequisite of an accurate forest cover
map synchronous with the start date of the time-series images; 2) the need for representative
training data; 3) the lack of seasonal vegetation dynamics; and 4) insufficient characterization
of forest restoration or multi-staged disturbance, restoration and persistence behaviors. To overcome
these limitations, further efforts will be made to revise current modules and add new ones, such
as automated detection of prior forest (Huang et al., 2008) instead of using an independently
derived forest cover map (e.g. NLCD). Furthermore, comprehensive sampling and response
designs were lacking for the accuracy assessment of the disturbance map, though our tentative
visual validation agrees with our accuracy report. In the future, we plan to compare our algorithm
with existing algorithms and products using a better sampling design.
Currently, our disturbance map is provided for one county, enabling studies of local dispersal
patterns at the landscape level. Over larger areas, some studies found that the dispersal
pattern of the MPB population follows a Moran effect, particularly assuming temperature as the
common environmental factor that triggers the MPB epidemic (Peltonen et al., 2002; Aukema et
al., 2008). In addition, a study found that climate change causes MPB outbreaks to
expand into previously unsusceptible forests in North America (Carroll et al., 2003), which
increases the release of CO2 to the atmosphere via fire and forest decomposition and forms a
positive feedback to climate change (Kurz et al., 2008). Therefore, while improving
the algorithm to overcome its limitations, we also plan to generalize this mapping procedure
to larger areas, e.g. from the southern Rocky Mountain ecoregion to all MPB-susceptible
Rocky Mountain areas. Temporally, lodgepole pine forests are periodically attacked by MPB
(Logan and Powell, 2001). Such patterns could be explored more readily by making
fuller use of all available images over a period of nearly 30 years (Zhu et al.,
2012). With larger spatial and longer temporal coverage, we will have more confidence in
analyzing the driving factors of such disturbances, with less influence from local and/or spurious
conditions. In addition, beyond detecting slow-onset forest disturbances caused by MPB
attacks, BITE has further potential for monitoring other types of ecosystem disturbances such as
wildfires, flooding and hurricanes, as well as restored ecosystems and other LULC changes worldwide.
These changes are theoretically detectable from trajectories of spectral indices, and are
therefore consistent with the framework that the BITE algorithm proposes. By incorporating seasonal
vegetation dynamics, BITE will become more capable of detecting ephemeral disturbances.
To conclude, we plan to test and adapt BITE to larger areas with a greater variety of vegetation
types and climate conditions, to continuously monitor land cover changes.
Chapter 2 Clustering based on eigen space transformation – CBEST for efficient
classification
This chapter has been published in International Society of Photogrammetry and Remote
Sensing
Yanlei Chen and Peng Gong*
Department of Environmental Science, Policy and Management
University of California at Berkeley
137 Mulford Hall, Berkeley, CA 94720-3114
Abstract
Large remote sensing datasets, which either cover large areas or have high spatial resolution, are
often a burden for information mining in scientific studies. Here, we present an approach that
conducts clustering after gray-level vector reduction, which considerably improves the speed of
clustering. The approach applies an eigenspace transformation to the dataset, then compresses
the data in the eigenspace and stores them in coded matrices and vectors. The clustering process
takes advantage of the reduced size of the compressed data and thus reduces computational
complexity. We name this approach Clustering Based on
EigenSpace Transformation (CBEST). In our experiment with a subscene of Landsat Thematic
Mapper (TM) imagery, CBEST was found to improve speed considerably over
conventional K-means as the volume of data to be clustered increases. We assessed information
loss and several other factors. In addition, we evaluated the effectiveness of CBEST in mapping
land cover/use with the same image, which was acquired over Guangzhou City, South China, and
with an AVIRIS hyperspectral image over Tippecanoe County, Indiana. Using reference data, we
assessed the accuracies of both CBEST and conventional K-means and found that, in practice,
CBEST was not negatively affected by information loss during compression. We
discuss potential applications of the fast clustering algorithm in dealing with large datasets in
remote sensing studies.
Keywords: Land cover/use Mapping, Large Dataset, Landsat Thematic Mapper image, K-means,
Remote Sensing, Unsupervised Classification
1 Introduction
Computer-based image classification is a common practice allowing fast and automatic
identification and classification of data. Two types of image classification are found in standard
texts, supervised and unsupervised (Jensen, 2004; Lillesand and Kiefer, 1987; Richards and Jia,
2005). Unsupervised classification is often preferred when a priori knowledge over a study area
is lacking. This is particularly the case for mapping large areas where field data acquisition used
for training in supervised classification becomes prohibitively expensive. Unsupervised
classification involves the use of a clustering algorithm that runs across the entire image in many
iterations until an optimal set of clusters converges. The two most widely used algorithms are
K-means and the iterative self-organizing data analysis technique (ISODATA) (Richards and Jia,
2005). However, these traditional iterative algorithms are time consuming when the data
dimension becomes high or the data volume becomes large. In pursuit of efficient
classification for today's continuously expanding data sizes, we present an approach that
applies the Eigen-Based Gray-Level Vector Reduction proposed in Gong and Howarth (1992)
followed by clustering. We call this approach Clustering Based on EigenSpace
Transformation (CBEST).
Computer assisted classification in large area mapping is more and more popular as both data
availability and computational power are increasing. The demand for variables derived from the
classification of remotely sensed data is increasing in developing global land and ocean
databases for global change studies (Jensen, 2004; Gong, 2012). Clustering is engaged in
mapping from regional (Woodcock et al., 1994; Homer et al., 1997; Franklin et al., 2001), to
continental (Loveland et al., 1991; Stone et al., 1994; Zhu and Evans, 1994; Homer et al., 2007),
even to global scales (Loveland et al., 2000; Bartholomé et al., 2005; Arino et al., 2007; Gong et
al., 2013). Among these mapping efforts, K-means clustering was adopted by some early studies
because of its simplicity for implementation.
K-means clustering is one of the earliest unsupervised classification approaches (Lloyd, 1982)
and has been used in remote sensing studies for various purposes. The applications include land
cover classification (Muller et al., 1999; Han et al., 2004; Zharikov et al., 2005), land cover
change (Brumby et al., 2002; Reger et al., 2007; Celik, 2009), time-series analysis and mapping
(Viovy, 2000; Wulder et al., 2004), cloud mapping (Gordon et al., 2005; Sano et al., 2007; Eitzen
et al., 2008), cotton yield estimation (Zarco-Tejada et al., 2005), chlorophyll concentration
mapping (Roelfsema et al., 2002), plumes and CO2 mapping (Zhang and Small, 2002), and
hydrological analysis (Belluco et al., 2006). In these studies, K-means clustering may function as
the primary classification approach, as a decisive branch of the entire classification method
(Franklin et al., 2001), or as a supplemental analysis tool which provides insight into
understanding of the data.
The K-means clustering is also used as a reference for comparison with other approaches
(Belluco et al., 2006), particularly newly proposed approaches (Viovy, 2000; Remund et al.,
2000; Shah et al., 2004; Zhong et al., 2006; Shah et al., 2007). Although the K-means algorithm
is outperformed in some cases (Shah et al., 2004; Shah et al., 2007), studies (Ouma et al., 2006;
Jiao et al., 2010) in which K-means achieved fair accuracies in remote sensing defend the role of
K-means clustering in near future practices.
Since the limitation of K-means clustering is widely known, some choose to integrate it into
advanced classifiers to utilize its advantages. For instance, Rollet et al. (1998) used it to initialize
the RBF (radial basis function) neural network for image classification. Some choose to revise or
add additional steps to minimize some of the shortcomings in the conventional K-means. Such
studies include contiguity-enhanced K-means (Theiler and Gisler, 1997) and unsupervised
spectral-contextual classification in the context of K-means (Zhou and Robson, 2001). K-means
has also been combined with Principal Component Analysis (PCA), for example through extreme
centroid initialization after PCA transformation (Funk et al., 2001) and through clustering of
hyperspectral data preprocessed with Segmented PCA (Tsagaris et al., 2005).
Outside the field of remote sensing, a number of clustering algorithms designed for large
databases have been proposed in the last few decades, such as DBMS (Ester et al., 1995),
DBSCAN (Ester et al., 1996), BIRCH (Zhang et al., 1997), STING (Wang et al., 1997), CURE
(Guha et al., 1998), WaveCluster (Sheikholeslami et al., 1998), MAFIA (Nagesh et al., 1999),
etc. These approaches have seldom been used in the context of remote sensing classification for
large area mapping. Particularly for K-means, a variety of studies also focused on improving the
efficiency of K-means for large datasets. For instance, the k-d tree and filtering algorithm
(Alsabti et al., 1998; Kanungo et al., 2002) organizes data into a k-d tree structure and prunes or
filters the branches to speed up the K-means. Scaling clustering algorithm (Bradley et al., 1998)
established a scalable framework in which compressible and discardable regions in large datasets
are identified and processed for K-means. Coresets K-means (Frahling and Sohler, 2006) uses
small weighted sets of samples that approximate the original dataset to reduce the computational
complexity. There are also studies exploring the potential of K-means in performance: fuzzy c-
means is a fuzzy implementation of K-means (Bezdek et al., 1984) and kernel K-means projects
data into higher dimensional space using kernel functions to detect complex patterns in feature
space (Girolami, 2002). The efficiency of kernel K-means was later improved by Zhang and
Rudnicky (2002) by shifting the clustering order from sample sequence to kernel sequence.
CBEST, introduced in this paper, is a distinct approach that aims to greatly reduce the space and
time complexity of K-means, with only one additional user-specified parameter, the desired
memory usage, at the cost of a small loss of accuracy.
2 Background
2.1 K-means
Given a set of data in p dimensional space (x1, x2, …, xn), the objective of K-means is to find k
cluster centers (µ1, µ2, …, µk) for partition sets V={V1, V2, …, Vk}, such that within-cluster sum
of squares J is minimized (MacQueen, 1967).
$$J = \sum_{i=1}^{k} \sum_{x_j \in V_i} \lVert x_j - \mu_i \rVert^2 \qquad (1)$$
where $\mu_i$ is the mean of all $x_j \in V_i$.
The iterative algorithm of conventional implementation of K-means clustering (in Euclidean
Distance) can be described in Table 6.
Table 6 Conventional K-means Algorithm
Steps Implementation
I Initialize k centers ($\mu_1^{(0)}, \mu_2^{(0)}, \ldots, \mu_k^{(0)}$); iteration count t=0.
II $D_{ij}^{(t)} = \lVert x_j - \mu_i^{(t)} \rVert$. i=1, 2, …, k; j=1, 2, …, n.
III $V_i^{(t)} = \{ x_j : D_{ij}^{(t)} = \min(D_{1j}^{(t)}, D_{2j}^{(t)}, \ldots, D_{kj}^{(t)}) \}$. i=1, 2, …, k; j=1, 2, …, n.
IV $\mu_i^{(t+1)} = \frac{1}{|V_i^{(t)}|} \sum_{x_j \in V_i^{(t)}} x_j$. i=1, 2, …, k; j=1, 2, …, n.
V t=t+1
VI Repeat steps II to V until convergence
$D_{ij}^{(t)}$ is the distance matrix that records distances between data and cluster centers so
that each data entry can be assigned to its closest cluster center. In practice, it is not
necessary to take the square root of the elements in the distance matrix $D_{ij}^{(t)}$, since only
the comparison of elements matters. At each iteration, the partition sets $V^{(t)}$ are updated by
the minimum distance criterion, and the cluster centers $\mu^{(t+1)}$ are then recalculated from
the members x of their respective partition sets. The termination condition can be determined by
rules such as convergence, membership changing rate < threshold, change of within-cluster sum of
squares < threshold, and others.
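The steps of Table 6 can be sketched with NumPy as follows. This is an illustrative implementation, not the code used in this study: the function name `kmeans`, the optional `init` argument, random initialization from the data, the iteration cap and the handling of empty clusters (the center is left unchanged) are all assumptions. Squared distances are compared directly, as noted above, and the loop stops when cluster membership no longer changes.

```python
import numpy as np


def kmeans(X, k, init=None, max_iter=100, seed=0):
    """Conventional K-means on an (n, p) array X; returns (centers, labels)."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Step I: pick k distinct data points as initial centers (assumption).
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step II: squared Euclidean distances, shape (n, k); no square root.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Step III: assign each point to its closest center.
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                      # Step VI: membership stopped changing
        labels = new_labels
        # Step IV: recompute each center as the mean of its members.
        for i in range(k):
            members = X[labels == i]
            if len(members):           # empty clusters keep their old center
                centers[i] = members.mean(axis=0)
    return centers, labels
```

The distance matrix `d2` holds n × k values and is rebuilt at every iteration, which is the cost that grows with data volume.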
There are a few issues to be considered carefully when applying K-means in practice. Firstly, the
number of clusters k must be defined prior to the classification. Since in clustering
analysis we often assume no prior knowledge of the data, the predefined k is
a guess. The strategy is either to test multiple k values to find the
most appropriate one (Zharikov et al., 2005), or to start with a large k and then merge clusters
down to a lower number according to expertise (Wulder et al., 2004). Secondly, the algorithm
described in Table 6 does not guarantee a global minimum of the within-cluster sum of squares,
and the result may converge to a local minimum. This problem can be mitigated by
multiple runs with different random initial points to derive optimal or near-optimal results. Some
studies have proposed initialization strategies that increase the probability of converging on a
global minimum. Zha et al. (2001) examined the relationship between PCA and K-means and
developed an approach to approximate global solutions for K-means by relaxing special
constraints of a trace maximization problem, which later evolved into PCA-guided K-means
(Ding and He, 2004). Thirdly, K-means assumes identical within-cluster variances of Gaussian
distributions for all clusters. For data with different within-cluster variances, or data
that do not follow a Gaussian distribution, the performance can be rather poor. A number of
more advanced algorithms are available for specific types of data. For instance,
Expectation-Maximization for Gaussian Mixture Models applies a maximum likelihood
optimization that handles data following Gaussian distributions well. Kernel K-means
projects data into a higher dimensional space (Girolami, 2002) so that it can achieve good results
with ‘ring data’. Fourthly, since ‘distance’ is used, K-means suffers from the ‘curse of
dimensionality’: when clustering high dimensional data, irrelevant dimensions introduce
noise that reduces the significance of ‘distance’. Eigenspace transformation is a common data
processing approach to reduce irrelevant dimensions.
2.2 Eigen-Based Gray-Level Vector Reduction
The compression approach proposed by Gong and Howarth (1992) is based on eigenspace (ES)
transformation. Like Principal Component Analysis (PCA) (Jolliffe, 2002), which finds the
eigen structure of the feature space, ES transformation applies a linear transformation to the
data such that the covariance matrix becomes diagonal. The elements of the diagonal covariance
matrix are the eigenvalues, each representing the variance of the data in the corresponding
dimension. The feature space is transformed into eigenspace, with correlated features
transformed into uncorrelated eigen axes. The eigen axes are ranked by eigenvalue from high to
low. Since distinctive features are usually associated with larger variances, whereas noise
usually has smaller variance, low ranked eigen axes are regarded as noise and removed to reduce
dimension as a preprocessing step in some studies (Han et al., 2004). It is important to note that
ES transformation alone does not lose any information.
The objective of the compression is to project original data space (n × p) into a compressed space
(n × 1) represented by N gray levels.
The compression is implemented following an ES transformation of the input p dimensional data
(x1, x2, …, xn) with ranked eigenvalues ($\lambda_1^2, \lambda_2^2, \ldots, \lambda_p^2$) and
corresponding eigenvectors (v1, v2, …, vp).
As $\lambda_i^2$ represents the variance of the data projected onto the ith eigenvector, and
$\lambda_i$ the corresponding standard deviation, the compression rule is that $\lambda_i$
determines how many partitions are assigned along this direction, so that the final partition
cells of the eigenspace are equilateral. It can be written as follows:
$$\frac{N_1}{\lambda_1} = \frac{N_2}{\lambda_2} = \cdots = \frac{N_p}{\lambda_p} \tag{2}$$
where Ni denotes the number of partitions assigned in the direction of the ith eigenvector.
The total partition N could be written as,
$$N = N_1 N_2 \cdots N_p \tag{3}$$
Along the ith eigenvector, 100(1 − α)% of the data should fall within the confidence interval
(-zα(2)λi, zα(2)λi) under the assumption of a normal distribution, given that the mean is zero
because the ES transformation centers the data. Here zα(2) denotes the z-value at a significance
level of α in a two-tailed test. Gong and Howarth (1992) chose a z-value of 2.1 so that roughly
97% of the data fall within this range. Outliers are captured by the first and last partitions,
while the majority of the data is assigned to (Ni – 2) cells uniformly spaced within the
confidence interval. The compressed cells in the eigenspace are subsequently coded into N gray
levels, so that each data entry can be represented by a single integer. The process is summarized
in Table 7.
Table 7 Eigen-Based Gray Level Vector Reduction
Step    Implementation
I       ES Transformation
II      Derive eigenvalues and eigenvectors
III     Partition ith subspace based on its eigenvalue
IV      Coding the gray level
3 Clustering Based on Eigen Space Transformation
In general, CBEST integrates the Eigen-Based Gray Level Vector Reduction approach (Gong and
Howarth, 1992) with the K-means algorithm to speed up what is conventionally a space- and
time-consuming process for large datasets. For remote sensing images, most spectral data are
stored in single bytes, as digital numbers ranging from 0 to 255. Although the data are sometimes
calibrated, the information they contain is still one byte per value. If the dataset is large,
many data entries will inevitably fall into the same cells in feature space. ES transformation
eliminates correlation and thereby identifies the dominant subspace over which the data spread.
Intuitively, the compression approach introduced above can therefore store more of the variation
in the data for a specified total number of partitions. Since identical or close eigenvectors in
eigenspace are very likely, especially when the partition number N is extremely small compared
to the number of data entries n, it is more efficient for a clustering algorithm such as K-means
to scan weighted partitions rather than individual data entries, which eliminates the redundancy.
Here the weight of a partition is the number of data instances falling into that partition in
eigenspace. CBEST is introduced in two sections: Compression and Clustering.
3.1 Compression
In particular, compression starts with an ES transformation applied to the original dataset
(x1, x2, …, xn) to derive the eigenvalues (λ1², λ2², …, λp²) and the eigenvectors (v1, v2, …, vn).
Then an expected total number of partitions N̂ is specified.
Combining (2) and (3) gives the following pair of equations:
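$$\begin{cases} \dfrac{N_1}{\lambda_1} = \dfrac{N_2}{\lambda_2} = \cdots = \dfrac{N_p}{\lambda_p} \\ N_1 N_2 \cdots N_p = \hat{N} \end{cases}$$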
Solving the above pair of equations gives:
$$N_i = \lambda_i \sqrt[p]{\frac{\hat{N}}{\lambda_1 \lambda_2 \cdots \lambda_p}} \tag{4}$$
Then Ni is rounded, since partition numbers are integers. Notice that with rounding up, the actual
partition number N could be larger than N̂. Rounding to the nearest integer was used in this
paper. Since the Ni are rounded to integers and at least two partitions are required to provide
information for separation, it is possible that Nq+1, Nq+2, …, Np ≤ 1. In that case, the eigen
subspaces from dimension q + 1 to p are discarded, because a single partition in a
one-dimensional subspace does not provide any information for clustering. The remaining
subspace therefore has a dimension of q (q ≤ p).
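The allocation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `allocate_partitions` and its `min_cells` parameter are hypothetical, the formula is applied in a single pass, the eigenvalues are the six values listed later in Table 14 and are treated here as variances (λi²), and the default minimum of 3 cells per axis follows the experimental setting mentioned in the text (a value of 2 would correspond to the general rule above).

```python
import math

def allocate_partitions(variances, n_hat, min_cells=3):
    """Split an expected total of n_hat cells across eigen axes.

    variances: ranked eigenvalues (lambda_i squared), largest first.
    Returns the cell counts of the leading axes that are kept.
    """
    sigmas = [math.sqrt(v) for v in variances]        # lambda_i
    p = len(sigmas)
    # Common ratio N_i / lambda_i obtained from N_1 * ... * N_p = n_hat.
    scale = (n_hat / math.prod(sigmas)) ** (1.0 / p)
    counts = [round(scale * s) for s in sigmas]       # nearest integer
    kept = []
    for c in counts:
        if c < min_cells:
            break          # this axis and all lower-ranked axes are discarded
        kept.append(c)
    return kept

eigenvalues = [626.2, 204.7, 78.5, 48.0, 13.3, 1.5]   # from Table 14
cells = allocate_partitions(eigenvalues, 1000)
total_cells = math.prod(cells)
```

With these inputs the sketch keeps four axes with 12, 7, 4 and 3 cells, 1008 cells in total, which agrees with the N = 1000 column reported in Tables 13 and 14. Whether the original implementation reproduces the other columns with the same single-pass rule is not established here.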
A partition set G={G1, G2, …, GN} is used to denote the eigenspace partitions. The location of a
partition Gr in eigenspace is defined by a coordinate (j1, j2, …, jq), ji=1, 2, …, Ni,
meaning that Gr consists of the j1th unit in the first subspace, the j2th unit in the second
subspace, and so on. For convenience of indexing in computation, the coordinate is then
projected onto a single-axis index of G. The index r that represents the coordinate
(j1, j2, …, jq) of Gr is coded as follows:
$$r = j_1 + (j_2 - 1) N_1 + (j_3 - 1) N_1 N_2 + \cdots + (j_q - 1) \prod_{i=1}^{q-1} N_i \tag{5}$$
To determine which partition cell in the ith one-dimensional subspace an eigenvector vi with
value ai should be assigned to, the following rule is applied:
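$$j_i = \begin{cases} 1, & a_i < -z_{\alpha(2)} \lambda_i \\ 2 + \mathrm{round}\left[ \dfrac{(a_i + z_{\alpha(2)} \lambda_i)(N_i - 2)}{2 z_{\alpha(2)} \lambda_i} \right], & \text{otherwise} \\ N_i, & a_i > z_{\alpha(2)} \lambda_i \end{cases} \tag{6}$$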
Here we use α = 0.1, for which z0.1(2)=1.64.
In the experiment presented later in this paper, we set the minimum value of Ni to 3 so that at
least two outlier partitions and one major partition could be established based on (6).
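The two coding steps can be sketched as follows. The cell rule below is one reading of rule (6), whose operators are partly lost in this transcript, so it should be treated as an assumption rather than the definitive rule; the function names `cell_index` and `flat_index` are hypothetical. The default z = 1.64 is the value stated above.

```python
def cell_index(a, sigma, n_cells, z=1.64):
    """1-based cell of value a along one eigen axis (one reading of rule (6)).

    sigma is the standard deviation lambda_i of that axis and n_cells is N_i.
    """
    half_width = z * sigma
    if a < -half_width:
        return 1                      # lower outlier cell
    if a > half_width:
        return n_cells                # upper outlier cell
    # Position inside the confidence interval, scaled to N_i - 2 steps.
    inner = round((a + half_width) * (n_cells - 2) / (2.0 * half_width))
    return 2 + inner

def flat_index(coords, counts):
    """Code a 1-based coordinate (j1, ..., jq) as a single 1-based index r, as in (5)."""
    r, stride = 1, 1
    for j, n in zip(coords, counts):
        r += (j - 1) * stride         # (j_i - 1) times N_1 * ... * N_(i-1)
        stride *= n
    return r

counts = [12, 7, 4, 3]                # example cell counts per eigen axis
first = flat_index((1, 1, 1, 1), counts)
last = flat_index((12, 7, 4, 3), counts)
```

Because the index runs from 1 to N1 × N2 × … × Nq without gaps, each cell of the eigenspace grid corresponds to exactly one gray level.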
For the partition set G, a set of corresponding mean vectors (m1, m2, …, mN) and weight vectors
(w1, w2, …, wN) is calculated as follows:
$$m_i = \frac{1}{|G_i|} \sum_{v_j \in G_i} v_j \tag{7}$$
and
$$w_i = |G_i|$$
Rather than the original dataset (x1, x2, …, xn), the ES-transformed eigenvectors (v1, v2, …, vn)
are used to calculate (m1, m2, …, mN), as K-means performed on eigenvectors could yield results
closer to the global minimum (Zha et al., 2001).
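A small sketch of how the mean vectors and weights in (7) could be accumulated in one pass over the data is given below. It assumes that each transformed vector already carries its partition index r; the function name `compress` and the dictionary layout are illustrative choices, not the original Matlab implementation.

```python
def compress(vectors, indices):
    """Group vectors by partition index.

    Returns {r: (mean vector m_r, weight w_r)} for occupied partitions only.
    """
    sums = {}
    counts = {}
    for v, r in zip(vectors, indices):
        if r not in sums:
            sums[r] = list(v)
            counts[r] = 1
        else:
            sums[r] = [s + x for s, x in zip(sums[r], v)]
            counts[r] += 1
    return {r: ([s / counts[r] for s in sums[r]], counts[r]) for r in sums}

# Three hypothetical transformed vectors falling into partitions 5, 5 and 9.
partitions = compress([(0.0, 0.0), (2.0, 2.0), (10.0, 0.0)], [5, 5, 9])
```

Storing only the partitions that actually receive data keeps the memory footprint proportional to the number of occupied cells rather than to N.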
3.2 Clustering
Given a set of q dimensional mean vectors (m1, m2, …, mN) and a set of weight vectors (w1, w2,
…, wN) representing eigenvectors (v1, v2, …, vn), CBEST aims to find cluster centers (µ1, µ2, …,
µk) for partition sets V={V1, V2, …, Vk}, such that within-cluster sum of squares J can be
minimized:
$$J = \sum_{i=1}^{k} \sum_{v_j \in V_i} \lVert v_j - \mu_i \rVert^2 \tag{8}$$
To differentiate the compressed eigenspace partition sets G from the cluster partitions V, we
refer to G as eigenspace partitions. CBEST clustering differs from conventional K-means in two
ways: (1) the compressed eigenspace partitions (size N × p) are scanned in each iteration instead
of the original dataset (size n × p); (2) the count of eigenvectors in each eigenspace partition
is introduced as a weight when updating the cluster centers for the next iteration. The algorithm
is shown in Table 8. The process of CBEST as compared with K-means is illustrated in Figure 8.
Table 8 CBEST Algorithm
Steps         Implementation
I             Perform compression to derive (m1, m2, …, mN) and (w1, w2, …, wN).
II            Initialize k centers $(\mu_1^{(0)}, \mu_2^{(0)}, \ldots, \mu_k^{(0)})$; iteration count t=0.
III           $D_{ij}^{(t)} = \lVert m_j - \mu_i^{(t)} \rVert$. i=1, 2, …, k; j=1, 2, …, N.
IV            $V_i^{(t)} = \{ m_j, w_j : D_{ij}^{(t)} = \min(D_{1j}^{(t)}, D_{2j}^{(t)}, \ldots, D_{kj}^{(t)}) \}$. i=1, 2, …, k; j=1, 2, …, N.
V             $\mu_i^{(t+1)} = \frac{1}{\sum_{w_j \in V_i^{(t)}} w_j} \sum_{m_j, w_j \in V_i^{(t)}} w_j m_j$. i=1, 2, …, k; j=1, 2, …, N.
VI            t=t+1
VII           Repeat steps III to VI until convergence
Finalization  $D'_{ij} = \lVert v_j - \mu_i^{(t)} \rVert$. i=1, 2, …, k; j=1, 2, …, n.
              $V_i = \{ v_j : D'_{ij} = \min(D'_{1j}, D'_{2j}, \ldots, D'_{kj}) \}$. i=1, 2, …, k; j=1, 2, …, n.
Figure 8 Illustrative Comparisons between CBEST and K-means
Whereas conventional K-means scans the entire dataset once per iteration, CBEST scans the entire
dataset only twice: once for compression and once for finalization. For each iteration, CBEST
reduces the time complexity from O(knp) to O(kNq), where q ≤ p. For large datasets, one can
predefine a much smaller N to significantly decrease the computation time.
However, because cluster centers are calculated from gray-level reduced vectors grouped in
eigenspace partitions, it is difficult for CBEST to reach a local optimum if clusters are
intermixed, as a partition set on a cluster boundary cannot be split between clusters. The
trade-off between the user-defined number of eigenspace partitions N and overall performance is
tested and discussed in later sections of the paper.
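The iterative part of the algorithm in Table 8 can be sketched in plain Python as a weighted K-means over the partition means. This is a simplified illustration, not the original Matlab code: the function name `cbest_cluster` is hypothetical, initial centers are passed in by the caller, squared distances are compared (which selects the same nearest center as the distance itself), and the loop stops when memberships no longer change.

```python
def cbest_cluster(means, weights, centers, max_iter=1000):
    """Weighted K-means over partition means (steps II to VII, as a sketch)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Steps III and IV: assign each partition mean to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(m, centers[i])))
            for m in means
        ]
        if new_labels == labels:
            break                     # memberships stopped changing
        labels = new_labels
        # Step V: weighted mean of the partition means in each cluster.
        for i in range(len(centers)):
            members = [j for j, lab in enumerate(labels) if lab == i]
            total_w = sum(weights[j] for j in members)
            if total_w == 0:
                continue              # empty cluster keeps its old center
            centers[i] = [
                sum(weights[j] * means[j][d] for j in members) / total_w
                for d in range(len(centers[i]))
            ]
    return centers, labels

# Four hypothetical one-dimensional partitions with their weights.
final_centers, final_labels = cbest_cluster(
    [[0.0], [1.0], [10.0], [11.0]], [3, 1, 1, 3], [[0.0], [11.0]])
```

Each pass touches only the N (or fewer) partition means, so the cost per iteration does not depend on the number of original data entries; the finalization step would then be a single nearest-center pass over the original eigenvectors.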
3.3 Further Improvement
3.3.1 Mean vectors
Notice that the mean vectors are obtained by dividing by the weights and are later used only
after being multiplied by the same weights. Additional efficiency can be gained by eliminating
this redundancy. We thus replace the mean vectors defined in (7) with total vectors
(T1, T2, …, TN), defined as follows:
$$T_i = \sum_{v_j \in G_i} v_j$$
The corresponding change in the CBEST algorithm is the update of step V in Table 8:
$$\mu_i^{(t+1)} = \frac{1}{\sum_{w_j \in V_i^{(t)}} w_j} \sum_{T_j \in V_i^{(t)}} T_j$$
i=1, 2, …, k; j=1, 2, …, N.
This revision removes N multiplication operations from each iteration and avoids N divisions in
the compression.
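The equivalence behind this revision is easy to check numerically. In the sketch below, the function name `center_from_totals` and the toy values are hypothetical; the update simply divides the sum of the member total vectors by the sum of the member weights.

```python
def center_from_totals(totals, weights, members):
    """New cluster center from total vectors: sum of T_j over sum of w_j."""
    total_w = sum(weights[j] for j in members)
    dim = len(totals[members[0]])
    return [sum(totals[j][d] for j in members) / total_w for d in range(dim)]

# Partition 0 holds vectors (0, 0) and (2, 2); partition 1 holds (10, 0).
totals = [[2.0, 2.0], [10.0, 0.0]]
weights = [2, 1]
center = center_from_totals(totals, weights, [0, 1])

# The same center computed from means and weights, as in the original step V.
means = [[1.0, 1.0], [10.0, 0.0]]
center_from_means = [
    sum(w * m[d] for w, m in zip(weights, means)) / sum(weights) for d in range(2)
]
```

Both routes give the centroid of the three underlying vectors, but the total-vector form never multiplies a mean by its weight.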
3.3.2 Vacant Eigenspace Partitions
In practice, most of the eigenspace partition sets in G are vacant, that is, unoccupied by any
data. Suppose the number of occupied eigenspace partition sets is N', with N' ≤ N. The vacant
partition sets can be removed from G to free the space occupied by (N – N')(q × size of
datatype(m or T) + size of datatype(w)). The time complexity of each iteration also decreases to
O(kN'q).
3.3.3 Boundary Optimization
When CBEST converges, eigenspace partition sets that are not located on inter-cluster
boundaries are unlikely to change their cluster membership if further refinement is applied to
reach a local optimum. Because the neighbors of an eigenspace partition set are easy to locate
from the coordinate decoded from its index in G, the partition sets on inter-cluster boundaries
can be extracted by checking whether any neighboring partition set belongs to a different
cluster. Conventional K-means is then run on the eigenvectors belonging to these boundary
partition sets, while the other eigenvectors remain static and require no calculation.
However, doing so increases the time complexity from O(kN'qt) to O(kN'qt + kn'qt'), where t is
the number of CBEST iterations, n' is the number of eigenvectors in boundary partition sets, and
t' is the number of iterations of the subsequent K-means. Moreover, as the dimension increases,
the number of boundary partition sets grows greatly because fewer partitions are assigned to the
lower-ranked eigen subspaces. The entire process then amounts to using CBEST to initialize
K-means.
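A possible way to extract boundary partitions is sketched below. The text does not specify which cells count as neighbors, so this sketch assumes axis-aligned neighbors (one step along a single eigen axis); the function names `decode` and `boundary_partitions` are hypothetical, and only occupied partitions are stored.

```python
def decode(r, counts):
    """Invert the single-axis index: 1-based r -> 1-based coordinate (j1, ..., jq)."""
    r -= 1
    coords = []
    for n in counts:
        coords.append(r % n + 1)
        r //= n
    return tuple(coords)

def boundary_partitions(labels, counts):
    """labels maps each occupied index r to its cluster id.

    Returns the set of indices with an occupied axis-aligned neighbor
    that belongs to a different cluster.
    """
    by_coord = {decode(r, counts): lab for r, lab in labels.items()}
    boundary = set()
    for r, lab in labels.items():
        coord = decode(r, counts)
        for axis in range(len(counts)):
            for step in (-1, 1):
                neighbor = list(coord)
                neighbor[axis] += step
                other = by_coord.get(tuple(neighbor))
                if other is not None and other != lab:
                    boundary.add(r)
    return boundary

# A hypothetical 3 x 3 grid with four occupied cells in two clusters.
edge_cells = boundary_partitions({1: 0, 2: 0, 3: 1, 9: 1}, [3, 3])
```

In this toy grid only cells 2 and 3 touch a cell of the other cluster, so only the eigenvectors in those two cells would be passed on to the subsequent K-means refinement.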
4 Experimental Design
The experiment was carried out by applying CBEST to a subscene of a Landsat image to identify
the land cover of each ground mapping unit (image pixel) based on spectral information
only.
4.1 Experiment Data
Experiments were done on two datasets. For the first dataset, calibrated radiance in 6 spectral
bands over the central district of Guangzhou, Guangdong Province, China was extracted from a
Landsat 5 TM image (Path/Row: 122/44) acquired on January 2, 2009 (Figure 9).
Guangzhou is the capital and largest city of Guangdong Province in the People’s Republic of
China. It is located in the northern part of the Pearl River Delta, which consists of a number of
municipalities along the south coast of Mainland China. Guangzhou has an area of 7,434.4 km2,
of which the urban area takes up 3,843.43 km2. The population was about 12.78 million as of 2010.
As a result of its humid subtropical climate influenced by the Asian monsoon, Guangzhou has hot
and humid summers, when cloud cover degrades the quality of satellite images. Therefore, a
cloud-free Landsat 5 TM scene acquired in winter was chosen for the test.
As Guangzhou is located at the estuary connecting the Pearl River and the South China Sea,
agricultural lands lie along the Pearl River in the northwest. Urban settlements and ports
were built near the estuary in the south as a result of a long history of sea trade. The
mountainous area in the northeast features forests and grasslands.
Figure 9 Test Area: Guangzhou, China
The image was clipped by the political boundary of Guangzhou city to yield a partially masked
image of 1484 columns by 1449 rows, in which the number of unmasked pixels is
1,508,553. The TM sensor onboard Landsat 5 has 7 spectral bands, of which bands 1, 2, and 3 lie
in the visible spectrum and bands 4, 5, and 7 in the near and middle infrared. These six bands
have a 30 meter spatial resolution, which means each pixel nominally represents a 30×30 m2 land
area. Band 6 is a thermal band with a 120 m resolution and is usually excluded from land cover
classification because of the resolution difference.
The second dataset is from an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) image
taken over northwest Tippecanoe County, Indiana on June 12, 1992 (AVIRIS Image, 2013).
Tippecanoe County has a humid continental climate with warm summers and cold winters, and a
significant proportion of the land is used for agriculture. The image was clipped to a
subscene of 1467 × 614 pixels, which covers an area of approximately 14 × 6 miles on the
ground. A river passes through the subscene in an east-west direction, with some forests along
its north side. Only a few residential buildings are present in the image, while
most of the area is covered by corn and soybean fields. There are 220 spectral bands, with
wavelengths ranging from 400 nm to 2500 nm. The data are calibrated into digital numbers
proportional to radiance. Although no georeference was found with the hyperspectral image,
a ground reference image spatially matched pixel-to-pixel was available for validation.
4.2 Preprocessing
Since classification was performed for an area covered by one scene, atmospheric correction was
not necessary for either image (Gong and Howarth, 1990; Song et al., 2001). For a data
exploration clustering algorithm, linear radiometric calibration was not necessary either. For
the TM imagery, ES transformation was performed directly on the Digital Numbers (DN) of spectral
bands 1, 2, 3, 4, 5 and 7 to replace the original spectral feature vectors with eigenvectors. All
clustering experiments on the TM imagery were implemented on the gray-level reduced vectors
since it is known that K-means performed on such vectors could yield results closer to the global
optimum (Zha et al., 2001).
4.3 Methods
To determine whether CBEST could significantly improve efficiency with only a mild loss of
accuracy when clustering large remote sensing datasets, the experiment was designed
in two parts (Figure 10).
Figure 10 Experiment Flow Chart
The first part was an efficiency and performance test, in which CBEST and conventional K-means
were run on the preprocessed TM image to compare the following indicators: total elapsed time,
elapsed time per iteration (ETI) and total number of iterations t for efficiency, and
within-cluster sum of squares for performance. For the efficiency test, as the indicators are
relative, time consumption is the major concern. The total elapsed time was used to estimate the
overall time consumption of both algorithms when run to full convergence (capped at 1000
iterations). For the performance test, as the cluster centers derived by CBEST were not at local
optima, the total within-cluster sum of squares was calculated for comparison with K-means,
which converges to a local minimum. Our empirical tests indicated that neither the distance
between cluster centers nor pixel agreement could directly reflect the quality of the algorithm,
so we used the total within-cluster sum of squares, which is what K-means clustering intends to
minimize. Comparisons were made by plotting the indicators of the two algorithms against four
variables: sample size of the image, number of cluster centers k, number of eigenspace
partitions N and maximum number of iterations allowed tmax. We first tested the relationship
between data size and running speed by systematically sampling the TM image at various
intervals. Then, by varying the number of cluster centers, we examined how much speed
CBEST could gain and how much precision it could lose as the number of cluster centers increases.
The number of eigenspace partitions directly determines the compression intensity and affects
both speed and accuracy, so it was varied next. Lastly, we compared how efficiently both
algorithms converge within a given number of iterations, because in practice a limit on the
number of iterations is always predefined to keep slow convergence from causing long running
times. As CBEST and K-means both use randomized initial cluster centers, multiple runs were
performed to mitigate biased assessment due to randomness. In each run, CBEST and K-means were
initialized with an identical number of cluster centers.
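Systematic sampling at a fixed interval can be sketched as below. The pixel count is the number of unmasked pixels stated in Section 4.1; the list of intervals is an assumption, chosen here because it spans the range from 10,000 to 1 used in the test and reproduces the data sizes listed in Table 11. The function name `systematic_sample` is hypothetical.

```python
def systematic_sample(items, interval):
    """Keep every interval-th item, starting from the first one."""
    return items[::interval]

n_pixels = 1508553                      # unmasked pixels in the TM subscene
# Assumed sampling intervals (only the end points 10,000 and 1 are stated in the text).
intervals = [10000, 3000, 1000, 300, 100, 50, 20, 10, 5, 2, 1]
# Number of samples each interval yields, without materializing the pixels.
sample_sizes = [len(range(0, n_pixels, step)) for step in intervals]
```

Under these assumed intervals the sample sizes run from 151 up to the full 1,508,553 pixels, matching the "Data Size (n)" row of Table 11.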
A total of 10 runs were performed for each specific set of variables. In the test, we first
varied the sampling interval from 10,000 to 1 while k was fixed at 16 and N at 1,000,000. Then k
was varied from 2 to 64 cluster centers with the number of eigenspace partitions fixed at
1,000,000. When N was varied from 1000 to 10,000,000, k was fixed at 16. For most test runs,
the maximum number of iterations was set to 1000 in order to assess completely converged
clustering. Finally, the maximum number of iterations was varied from 10 to 80 while N was fixed
at 1,000,000 and k at 16. The mean, minimum and maximum values of the 10 runs were
calculated. The mean gave us an estimator of overall average efficiency and
performance, while the minimum and maximum were intended to represent the range. In the
performance test, the minimum value of the Within-Cluster Sum of Squares (WCSS) was important,
as the randomly initialized K-means is often run multiple times to ensure that WCSS is
closer to the global minimum. The indicators used in the test are described in Table 9.
Table 9 Description of the Indicators
Indicator                     Description
Total Elapsed Time            Start counting right after ES Transformation and stop at converging or reaching the iteration limit.
Number of Iterations          The number of iterations that an algorithm goes through
Elapsed Time per Iteration    Total Elapsed Time / Number of Iterations
Within-Cluster Sum of Square  $J = \sum_{i=1}^{k} \sum_{v_j \in V_i} \lVert v_j - \mu_i \rVert^2$
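For reference, the within-cluster sum of squares and its rescaling to a baseline can be computed as in the following sketch. The function names `wcss` and `rescale_to_baseline` and the toy data are hypothetical; the rescaling mirrors the "rescaled to K-means min" rows reported in the result tables.

```python
def wcss(vectors, labels, centers):
    """Within-cluster sum of squares J for a given assignment."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(v, centers[lab]))
        for v, lab in zip(vectors, labels)
    )

def rescale_to_baseline(values, baseline):
    """Express each WCSS value as a multiple of the baseline value."""
    return [v / baseline for v in values]

# Three hypothetical vectors assigned to two clusters.
j_value = wcss([(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)],
               [0, 0, 1],
               [(1.0, 0.0), (10.0, 0.0)])
relative = rescale_to_baseline([2.0, 3.0], 2.0)
```

A rescaled value of 1.00 then marks the best K-means solution, and values above 1.00 indicate a correspondingly larger sum of squares.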
In the second part, CBEST and K-means were compared in a practical land cover/use
classification scenario.
For the Landsat TM dataset, a classification system for land cover/use types in Guangzhou is
proposed in Table 10. Both land cover and land use were taken into account in the classification
system considering the complicated mixture of urban, agricultural and forest land in the area,
though spectral-based classification is more appropriate for identifying and mapping land cover
(Gong and Howarth, 1990; Ouma et al., 2006) than land use. Since unsupervised classification is
commonly used to explore a natural classification system matching the spectral patterns of the
region of study, we also discuss the potential of K-means in refining our land cover/use
classification.
Table 10 Classification System for Guangzhou
Land Cover              Description
Settlement/Residential  Urban residential areas with tree, lawn and driveways
Industrial/Commercial   Urban Industrial/Commercial built-up areas
Clearland               Bright land cover that cleared for construction
Idle land               Idle land with little growing vegetation
Orchard                 Sparse fruit trees and grass
Cropland/grassland      Grass and crops
Urban Forest            Dense tree canopy
Water body              Including river, reservoirs and ponds
A total of 500 reference plots were collected for validation by stratified sampling (using the
classes in Table 10 as strata), 138 of which were collected in the field in April and
December 2009 and June 2010. The interval between the sampling time and the image
acquisition time was not long enough for major land use change to occur. The other plots were
interpreted from high resolution aerial photos and local knowledge of the area. To ensure a
comparable confidence level of sampling across the strata, each stratum was allocated more than
40 and no more than 80 sampling units. The sample size of each stratum within this range depends
on the area of the corresponding stratum. The sampling units were distributed evenly over the
entire area to avoid redundancy from similar neighboring objects and to increase the
overall sampling quality through wide coverage.
CBEST and K-means were both run multiple times with k=30, and the best solutions
were used to cluster the preprocessed TM imagery. Multiple runs were used to ensure that
the global optimum was approximated. The deliberately overestimated number of cluster
centers was later reduced in post-processing, in which these 30 clusters were merged or split
based on automatic image interpretation criteria, while the natural boundaries between clusters
were maintained. A confusion matrix and the Kappa coefficient were then calculated for accuracy
assessment.
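The accuracy measures mentioned above can be sketched as follows. This uses the standard definition of Cohen's Kappa from a confusion matrix and is not taken from the original code; the function names and the toy class labels are hypothetical.

```python
def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are mapped (predicted) classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def kappa(matrix):
    """Cohen's Kappa: agreement beyond what chance alone would produce."""
    n = sum(sum(row) for row in matrix)
    size = len(matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Four hypothetical reference plots and their mapped classes.
cm = confusion_matrix(["water", "water", "forest", "forest"],
                      ["water", "forest", "forest", "forest"],
                      ["water", "forest"])
k_hat = kappa(cm)
```

In this toy case three of four plots agree (overall accuracy 0.75), while chance agreement is 0.5, giving a Kappa of 0.5.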
For the AVIRIS dataset, CBEST and K-means were applied in a similar way. In this
experiment, we predefined k=100 since the ground reference consists of 58 subdivided classes.
However, some of the 58 classes are defined for management purposes and exhibit great
spectral similarity to each other. For instance, corn is divided into subclasses based on
tillage approach, such as no-till, min-till and clean-till. It is very difficult to distinguish
one from another with only spectral information during the growing season, since the same type
of corn grows in all of these fields. Therefore, the 58 classes were simplified into 10 classes,
to which the 100 clusters were assigned in the post-classification process. The final maps were
then assessed for accuracy against the reference dataset, and their accuracies were
compared.
All programs used in our experiments were coded in Matlab 2011b and run on a computer
with an Intel i5 760 2.8 GHz quad-core CPU and 8 gigabytes of physical memory.
5 Results and Analysis
5.1 Efficiency & Performance Test
The speed gain with respect to data size in the first test is plotted in Figure 11, and the mean
values of elapsed time and iterations are given in Table 11. As the data size increases, the time
cost of CBEST increases much more slowly than that of K-means, since CBEST scans the entire
dataset only twice. At around 5000 samples, CBEST starts to gain an edge over K-means, and beyond
that the gap between them widens significantly. When using the entire dataset of about 1.5
million data instances, K-means took on average about 55 times longer than CBEST to converge
(135.95 s versus 2.45 s in Table 11), and CBEST was about 15 times faster than K-means per
iteration. The slow growth of the time cost of CBEST is attributed to the fixed number of
eigenspace partitions. As the data size increases, more cells are occupied and more data are
scanned in preprocessing and finalization for CBEST. This increase is minimal compared with that
of K-means, for which the additional cost of the increased data size is incurred in every
iteration. As a result, CBEST gains a significant efficiency boost on large datasets, as
confirmed by this experiment.
Table 11 Test Results w/respect to Data Size
Data Size (n)                            151    503    1509   5029   15086  30172  75428  150856  301711  754277  1508553
Total Elapsed Time (s)          CBEST    0.04   0.05   0.06   0.15   0.23   0.41   0.86   1.15    1.34    1.79    2.45
                                K-means  0.01   0.01   0.03   0.14   0.78   1.15   3.83   8.06    25.58   71.28   135.95
Number of Iterations            CBEST    8.9    17.6   30.3   61     67.6   78.8   95.2   83.9    82      72      66.2
                                K-means  8.8    19.5   34     63.9   123.7  89.6   151    154.4   234.5   233.2   225.7
Elapsed Time per Iteration (s)  CBEST    0.005  0.003  0.002  0.003  0.003  0.005  0.010  0.014   0.017   0.026   0.041
                                K-means  0.001  0.000  0.001  0.002  0.006  0.013  0.025  0.052   0.109   0.305   0.603
Figure 11 Speed Comparison w/respect to Data Size. (a) Elapsed Time Comparison; (b)
Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio
The efficiency test with respect to k is summarized in Table 12 (values are means of 10
runs unless otherwise specified) and illustrated in Figure 12. As expected, the advantage of
CBEST over K-means grows as the number of cluster centers increases. The ratios in (b) and (d)
directly represent the number of times that CBEST is faster than K-means, which rises to about
100 in this configuration as k increases to 48 and beyond. K-means needs roughly 2 to 5 times as
many total iterations as CBEST, which also contributes to CBEST’s shorter running time. It
should also be noted that the total elapsed time shows greater variation as a
result of the large variation in iteration numbers. For instance, when k=64 for K-means, the
iteration number ranges from 410 to 907 with an average of 629, so the maximum is over twice the
minimum.
[Figure 11 plots omitted. All panels share the x-axis "Num of Samples"; y-axes: (a) Elapsed Time (s), (b) Time Ratio (k-means/CBEST), (c) Elapsed Time per Iteration (ETI), (d) ETI Ratio (k-means/CBEST). Legend: CBEST, k-means, Max/Min, Ratio = 1.]
Table 12 Test Results w/respect to k
Number of Cluster Centers k                     2      4      8      16     24     32     48     64
Total Elapsed Time (s)          CBEST           0.66   0.89   1.50   2.55   4.25   5.25   6.22   9.06
                                K-means         8.9    16.9   46.7   127.8  306.3  432.0  700.5  1,055.8
Number of Iterations            CBEST           11     27     55     77     114    124    106    136
                                K-means         49     66     122    226    411    444    533    629
Elapsed Time per Iteration (s)  CBEST           0.07   0.03   0.03   0.04   0.04   0.04   0.06   0.07
                                K-means         0.18   0.26   0.38   0.57   0.74   0.97   1.31   1.68
WCSS (×10^7)                    CBEST    mean   93.74  57.48  38.35  19.64  13.60  11.67  9.73   8.46
                                         min    93.66  57.46  31.84  17.10  13.30  11.45  9.54   7.54
                                K-means  mean   93.65  57.45  38.76  17.66  13.19  11.33  9.26   7.72
                                         min    93.65  57.45  38.76  16.84  13.15  11.27  9.23   6.70
WCSS (rescaled to K-means min)  CBEST    mean   1.00   1.00   0.99   1.17   1.03   1.04   1.05   1.26
                                         min    1.00   1.00   0.82   1.02   1.01   1.02   1.03   1.13
                                K-means  mean   1.00   1.00   1.00   1.05   1.00   1.01   1.00   1.15
                                         min    1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
Figure 12 Efficiency w/respect to k. (a) Elapsed Time Comparison; (b) Elapsed Time ratio
(how many times faster); (c) ETI Comparison; (d) ETI ratio; (e) Rescaled Within-Cluster
Sum of Square average; (f) Rescaled Within-Cluster Sum of Square Best/Worst Case.
[Figure 12 plots omitted. All panels share the x-axis "Num of Cluster Centers"; y-axes: (a) Elapsed Time (s), (b) Time Ratio (k-means/CBEST), (c) Elapsed Time per Iteration (ETI), (d) ETI Ratio (k-means/CBEST), (e) and (f) Within-Cluster Sum of Squares (Rescaled to Min).]
The performance of CBEST and K-means is illustrated in Figure 12 (e, f), using the lowest
within-cluster sum of squares obtained by K-means as the baseline to which other values were
rescaled. Figure 12e shows the average performance of K-means and CBEST: CBEST is on average
less effective across the tested numbers of cluster centers, with the exception of k=8, which
suggests that CBEST could reach a better solution on average for certain numbers of initial
centers. The minimum values presented in Figure 12f suggest the same finding: at k=8, the best
CBEST result reached a WCSS of approximately 80% of the best WCSS obtained by K-means. In the
other cases, the best solution CBEST yielded was slightly higher than that achieved by K-means.
CBEST did not find a close solution at k=64. The average WCSS of K-means in this case is also
quite high: K-means converged close to the minimal WCSS of 67 million in 3 of 10 runs, while
CBEST converged to 75 million only once. The other solutions of both algorithms converged above
81 million. We reason that as k increases, the number of eigenspace partitions N should also
increase to achieve a finer resolution in the eigenspace, in order to compensate for the greater
information loss caused by the larger between-cluster boundary surfaces. Otherwise the average
number of instances in an eigenspace partition is too large for them to be divided among more
clusters. This conclusion is strengthened by the following experiments.
The means of the indicators for varying N are compared in Table 13. The minimum
WCSS is also shown in the table as a reference for the optimal solutions. The range and
mean are plotted in Figure 13. The figure is plotted with the number of occupied partitions on
the x-axes because it is this number that determines the time and space complexity of CBEST.
Table 13 Test Results w/respect to N
                                       CBEST                                                                            K-means
Expected Eigenspace Partitions N       1000  5000  10000  50000  100000  500000  1000000  5000000  10000000  n/a
Eigenspace Partitions N'               1008  2700  15300  41184  88200   338580  505050   2079168  11450160  n/a
Occupied Partitions                    554   1370  4423   9904   19091   53640   71434    185901   295316    n/a
Total Elapsed Time (s)                 0.74  0.74  0.84   0.93   1.11    2.03    2.26     6.75     12.70     143.33
Number of Iterations                   9     16    25     32     43      69      63       101      111       256
Elapsed Time per Iteration (s)         0.09  0.05  0.04   0.03   0.03    0.03    0.04     0.07     0.11      0.56
WCSS (×10^8)                     mean  2.67  2.26  2.31   2.24   2.15    2.09    1.96     1.84     1.83      1.77
                                 min   1.90  1.81  1.80   1.74   1.74    1.72    1.72     1.72     1.71      1.68
WCSS (rescaled to K-means min)   mean  1.59  1.34  1.37   1.33   1.27    1.24    1.17     1.09     1.09      1.05
                                 min   1.13  1.07  1.07   1.04   1.03    1.02    1.02     1.02     1.02      1.00
Figure 13 Efficiency w/respect to N. (a) Elapsed Time Comparison; (b) Elapsed Time ratio
(how many times faster); (c) ETI Comparison; (d) ETI ratio. (e) Within-Cluster Sum of
Squares Comparison; (f) Within-Cluster Sum of Squares Limited by various max numbers
of Iterations.
[Figure 13 plots omitted. Panels (a) to (e) share the x-axis "Num of Eigenspace Partitions"; y-axes: (a) Elapsed Time (s), (b) Time Ratio (k-means/CBEST), (c) Elapsed Time per Iteration (ETI), (d) ETI Ratio (k-means/CBEST), (e) Within-Cluster Sum of Squares (Rescaled to Min). Panel (f) plots Within-Cluster Sum of Squares (Rescaled to Min) against "Num of Max Iterations".]
The edge of CBEST over K-means gradually decreases as the number of partitions increases. This
is expected, since the number of partitions in eigenspace determines the complexity of
the CBEST algorithm. In contrast to the total elapsed time, the elapsed time per iteration (ETI,
Figure 13c) follows a quadratic-like curve, which is caused by the large share of time spent in
preprocessing, when the entire dataset is scanned into eigenspace partitions, and in
post-processing, when memberships are assigned. In this case, ETI does not indicate the average
time of each iterative update but the overall time consumption divided by the number of
iterations. Initially, most of the time is spent in preprocessing and post-processing, so a
small number of iterations makes each iteration interval appear longer. As the number of
partitions increases, more of the time is consumed in the iterative calculation and update
steps, and the ETI slowly drops to its lowest level, until the cost of the growing number of
eigenspace partitions outweighs the effect of the iteration count and of pre- and
post-processing.
The performance of CBEST as measured by WCSS (Figure 13e) on average approaches that of K-means
as the number of eigenspace partitions N increases. The lowest WCSS of CBEST starts at 1.13
times that of K-means and decreases to 1.02 times as the partition number increases to 10
million in the test. It can be inferred that in the limiting case, when N is large enough for
every occupied partition set to contain only a single data instance, the clustering process
would be identical to that of K-means.
Furthermore, we can explore how much the information loss caused by dropping the eigen axes with
lower eigenvalues could affect the accuracy of CBEST. As mentioned above, we
empirically set the minimum number of partitions for a one-dimensional subspace to 3, since this
is the minimum that allows the confidence interval to be applied. In the experiment in which N
gradually increases, the dimension starts at 4, increases to 5 at N=10,000, and then to 6 at
N=10,000,000 (Table 14). However, no abrupt boost in performance can be observed in
Table 13 and Figure 13 at these two breaking points. In particular, for N from 500,000 to
10,000,000, the best performance almost stabilizes at 1.02 times that of K-means. It seems that
the additional 3 partitions assigned to the 6th eigen axis have little impact on the clustering
performance.
Table 14 Assignment of Eigenspace Partitions for Eigen Axes
Expected N     1000   5000   10000   50000   100000   500000   1000000   5000000   10000000   Eigen Values
Eigen Axis 1     12     15      17      22       25       33        37        48         54          626.2
Eigen Axis 2      7      9      10      13       14       19        21        28         31          204.7
Eigen Axis 3      4      5       6       8        9       12        13        17         19           78.5
Eigen Axis 4      3      4       5       6        7        9        10        13         15           48.0
Eigen Axis 5      -      -       3       3        4        5         5         7          8           13.3
Eigen Axis 6      -      -       -       -        -        -         -         -          3            1.5
There is an issue of how well the means, as opposed to the minimums, can be used to justify the
experiment. The minimums indicate the best solutions that the algorithms could achieve in 10
runs, while the means indirectly reflect the possibility that the algorithms are trapped in
distant local optima. As an example of a distant local optimum, when K-means was run with 64
cluster centers, 7 runs ended in a local optimum with a WCSS of 81 million, as opposed to 67
million for the other 3 runs. Although K-means is more likely than CBEST to escape a distant
local optimum and reach a solution close to the global optimum, 10 runs are not sufficient to
generalize how often a distant local optimum is reached. Consequently, the WCSS averaged over
only 10 runs is subject to bias from possibly multiple distant local optima. As for the minimum
values, 10 runs are likely, though not guaranteed, to generate at least one solution that is
close to the global optimum. When k is not too large, however, the comparison is more reliable.
The performance of CBEST as opposed to K-means was further examined by limiting the
maximum number of iterations allowed in a single clustering implementation (Table 15, Figure
13f).
Table 15 Performance Test with respect to the Maximum Number of Iterations
Maximum Iteration                          10     20     30     40     50     60     70     80
Within-Cluster Sum of Squares (×10^8)
  CBEST    Mean                          2.52   2.42   2.09   1.92   2.32   2.41   2.06   2.26
  CBEST    Min                           1.78   1.74   1.73   1.71   1.72   1.74   1.72   1.72
  K-means  Mean                          2.57   2.26   2.01   1.78   2.16   2.01   1.70   1.70
  K-means  Min                           2.15   1.71   1.70   1.68   1.69   1.69   1.68   1.68
Rescaled Within-Cluster Sum of Squares
  CBEST    Mean                          1.17   1.42   1.23   1.14   1.37   1.43   1.22   1.34
  CBEST    Min                           0.83   1.02   1.02   1.02   1.02   1.03   1.02   1.02
  K-means  Mean                          1.20   1.32   1.18   1.05   1.28   1.19   1.01   1.01
  K-means  Min                           1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
The best solutions (minimum WCSS) derived by K-means were again used as the baseline for
rescaling. Within 80 iterations, the best solution generated by CBEST is steadily around 2%
higher than that of K-means, with an exception at a limit of 10 iterations, where CBEST
converged to a much lower WCSS. However, this could also be caused by the insufficient number of
runs argued previously. Since the two algorithms converge at similar speed, the larger number of
iterations needed by K-means in the above experiments can be explained speculatively as follows:
CBEST usually reaches the separation limits bounded by the eigenspace partitions much earlier,
while K-means continues to minimize WCSS by separating without limitation.
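The rescaling in Table 15 can be checked with a few lines of arithmetic. The sketch below assumes that each column is divided by the K-means minimum at the same iteration limit, which is the reading that reproduces the rescaled "Min" row for CBEST; the variable names are illustrative.

```python
# WCSS values (x10^8) copied from Table 15, one entry per iteration limit.
max_iter = [10, 20, 30, 40, 50, 60, 70, 80]
cbest_min = [1.78, 1.74, 1.73, 1.71, 1.72, 1.74, 1.72, 1.72]
kmeans_min = [2.15, 1.71, 1.70, 1.68, 1.69, 1.69, 1.68, 1.68]

# Rescale by the best K-means solution at the same iteration limit.
rescaled = [c / k for c, k in zip(cbest_min, kmeans_min)]
for it, r in zip(max_iter, rescaled):
    print(f"max iterations {it:2d}: CBEST best / K-means best = {r:.2f}")
```

The ratios come out at 0.83 for a limit of 10 iterations and between 1.02 and 1.03 for all other limits, matching the roughly 2% gap described in the text.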
In general, the efficiency and performance tests support the following points:
† The time cost of CBEST increases much more slowly than that of K-means as data size increases;
† As k increases, CBEST is faster than K-means, but the rate of increase slows down;
† Both CBEST and K-means can end at an optimum far from the global optimal solution, but
CBEST has a higher probability of failure given that an identical random set of initial
centers is used;
† CBEST slows down as N increases, but approaches a local optimum more closely;
† As N increases, CBEST has a higher probability of approaching a local optimum close to
the global solution;
† CBEST and K-means converge at almost identical efficiency.
Consequently, to carry out a CBEST clustering in practice considering efficiency and accuracy, a
balanced N to allow fast multiple runs to approximate global optimum should be an appropriate
strategy. However, the balanced N may not be properly estimated without experimenting on the
data first. For large datasets, one could first draw a sample from the dataset and experiment on
various N to empirically get an appropriate value.
5.2 Application Experiments
5.2.1 Landsat TM Image
Both CBEST and K-means were run 10 times, and the run with the lowest WCSS was chosen to
proceed to post-processing. The number of clusters k was set to 30. For CBEST, N was set to
1,000,000 (CBEST is approximately 85x faster than K-means in this configuration on average).
The 30 clusters were then split and merged down to 8 land cover/use classes in post-processing.
We developed an approach for splitting and merging that is guided by the spectral information of
ground truth samples projected in eigenspace. If a cluster was considered to be a mixture of
more than one class, it was split and its members were reassigned to its neighboring clusters in
eigenspace. Multiple clusters that were considered to correspond to one class were merged into
one cluster. In particular, we first assumed a Gaussian distribution for each of the 8 classes
and calculated the log likelihoods of the 30 cluster centers with respect to the 8
Gaussian-distributed classes. A cluster was assigned to the class with the highest log
likelihood when that value stood out clearly. To determine whether the highest log likelihood
stands out sufficiently, we set a threshold to test whether the two highest log likelihood
values are close, in which case the cluster is treated as a mixture of at least two classes, and
it is split and its members are reassigned to its neighboring clusters in eigenspace. In this
experiment, one cluster was split because its two highest log likelihood values were close, and
the remaining 29 clusters were merged into their corresponding 8 classes for both algorithms.
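The labeling rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class means and covariances, the gap threshold `min_gap` and the function names are hypothetical, and a cluster flagged with -1 stands for one that would be split and redistributed.

```python
import numpy as np

def gaussian_loglik(x, mean, cov):
    """Log density of a multivariate normal evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (maha + logdet + x.size * np.log(2.0 * np.pi))

def label_clusters(centers, class_means, class_covs, min_gap):
    """Return one class index per cluster center, or -1 to flag a mixed cluster."""
    labels = []
    for c in centers:
        ll = np.array([gaussian_loglik(c, m, s)
                       for m, s in zip(class_means, class_covs)])
        second, best = np.sort(ll)[-2:]
        # Accept the best class only if it clearly beats the runner-up.
        labels.append(int(ll.argmax()) if best - second >= min_gap else -1)
    return labels

# Hypothetical two-class example in a 2-D eigenspace.
class_means = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
class_covs = [np.eye(2), np.eye(2)]
centers = np.array([[0.2, 0.0], [3.9, 0.1], [2.0, 0.0]])
cluster_labels = label_clusters(centers, class_means, class_covs, min_gap=2.0)
print(cluster_labels)  # [0, 1, -1]
```

The third center lies midway between the two class means, so its two log likelihoods are equal and it is flagged as a mixture.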
We compared the results with the ground truth data and generated confusion matrices (Table 16).
Their overall accuracies are summarized in Table 17.
Table 16 Confusion Matrices for Validation (Landsat)
K-means Classified
Reference    Clearland  Cropland  Forest    Idle  Industry  Orchard  Settlement   Water   P Acc
Clearland           13         0       0       7        23        0           1       0  29.55%
Cropland             0        61       7       4         0        3           2       0  79.22%
Forest               0         2      73       0         0        6           2       0  87.95%
Idle                 0        20       0      24         0        0           1       0  53.33%
Industry             9         0       0       1        52        0           9       0  73.24%
Orchard              0         0      15       0         0       33           0       0  68.75%
Settlement           0         0       0       5         0        0          84       2  92.31%
Water                0         0       0       0         0        0           4      37  90.24%
U Acc           59.09%    73.49%  76.84%  58.54%    69.33%   78.57%      81.55%  94.87%
CBEST Classified
Reference    Clearland  Cropland  Forest    Idle  Industry  Orchard  Settlement   Water   P Acc
Clearland           37         0       0       5         1        0           1       0  84.09%
Cropland             0        65       7       3         0        0           2       0  84.42%
Forest               0         3      68       0         0       12           0       0  81.93%
Idle                 0        14       1      30         0        0           0       0  66.67%
Industry            18         0       0       1        41        0          11       0  57.75%
Orchard              0         0       9       0         0       39           0       0  81.25%
Settlement           0         1       0       1         0        0          87       2  95.60%
Water                0         0       0       0         0        0           5      36  87.80%
U Acc           67.27%    78.31%  80.00%  75.00%    97.62%   76.47%      82.08%  94.74%
Table 17 Summary of Classification Results (Landsat)
                            CBEST     K-means
WCSS (×10^8)                1.179     1.166
Average Producer’s Acc      79.94%    71.82%
Average User’s Acc          81.44%    74.04%
Overall Accuracy            80.60%    75.40%
Kappa Coefficient           0.775     0.713
Agreement                   82.05%
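The summary statistics in Table 17 can be recomputed from the CBEST confusion matrix in Table 16. The sketch below takes rows as reference classes and columns as classified classes, which is the orientation that reproduces the producer's accuracies listed in the table; the variable names are illustrative.

```python
import numpy as np

classes = ["Clearland", "Cropland", "Forest", "Idle",
           "Industry", "Orchard", "Settlement", "Water"]
# CBEST matrix from Table 16: rows = reference, columns = classified.
cm = np.array([
    [37,  0,  0,  5,  1,  0,  1,  0],
    [ 0, 65,  7,  3,  0,  0,  2,  0],
    [ 0,  3, 68,  0,  0, 12,  0,  0],
    [ 0, 14,  1, 30,  0,  0,  0,  0],
    [18,  0,  0,  1, 41,  0, 11,  0],
    [ 0,  0,  9,  0,  0, 39,  0,  0],
    [ 0,  1,  0,  1,  0,  0, 87,  2],
    [ 0,  0,  0,  0,  0,  0,  5, 36],
])

n = cm.sum()
diag = np.diag(cm)
producers = diag / cm.sum(axis=1)   # correct / reference total per class
users = diag / cm.sum(axis=0)       # correct / classified total per class
overall = diag.sum() / n
# Chance agreement from the row and column marginals.
expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
kappa = (overall - expected) / (1 - expected)

print(f"overall accuracy  {overall:.2%}")
print(f"mean producer's   {producers.mean():.2%}")
print(f"mean user's       {users.mean():.2%}")
print(f"kappa             {kappa:.3f}")
```

Running the same computation on the K-means matrix gives the second column of Table 17.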
To assess the spectral information, the first three eigen axes of the ground reference data
points are plotted pairwise (Figure 14). Several class pairs are mixed, such as clearland and
industrial land, orchard and forest, and cropland and idle land. Spectrally, clearland can be
grouped into two widely separated subclasses, one of which is closely mixed with industrial use.
Clearland is sometimes cleared for construction and thus covered by paved ground with little
vegetation, which resembles commercial/industrial cover. As a result, both algorithms derived
poor classification accuracies for the two classes, particularly K-means. Orchard and forest are
both tree covers and are thus easily mixed in spectral space. As idle land is sometimes covered
by weeds, its spectral signature could shift towards that of less vegetated cropland and
grassland.
Figure 14 Scatterplot of Ground Truth
Land cover/use maps were created using both algorithms (Figure 15). In general, the
classification results agree with the layout of the area when compared with the validation
samples (Figure 16). Residential areas are located near the Pearl River in the south and west.
Agricultural lands and villages characterized by sparse residential areas are in the northwest of
Guangzhou. Orchards and sparse tree canopies are spread across the northern part and the forested
areas, and some are also found along the Pearl River in the south. Although the two algorithms
are biased toward different sides for intermixed pixels, overall the results are similar, as
82.05% of pixels are identically classified.
Figure 15 Land Cover/Use Map derived by K-means and CBEST in Guangzhou
Figure 16 Validation Samples as Ground Reference in Guangzhou
5.2.2 AVIRIS Hyperspectral Image
The number of clusters k was set to 100 because of the large number of classes (58) provided in
the reference data. For CBEST, N was set to 10,000,000 because of the large number of bands in
the hyperspectral image. CBEST and K-means were both used to cluster the hyperspectral image.
The results were then compared with the reference in order to combine and simplify the 58
classes, since some classes were either too detailed or too similar to another class in spectral
space. The final classification system consists of 10 classes: Urban, Corn, Soybeans, Wheat/Oats,
Grass/Pasture, Hay, Woods, Swampy Area, Water and Other (Bare soils, Not Cropped and Orchard).
Each of the 100 clusters was assigned to one of the 10 classes, namely the class with the highest
pixel count within the cluster. Lastly, the final land cover maps were produced by applying a
majority filter to the reassigned cluster maps to reduce the salt-and-pepper effect. The majority
filter was applied because the image primarily consists of agricultural lands, so pixels of the
same land cover are very likely connected in large patches. The resulting maps are shown in
Figure 17.
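The two post-processing steps, assigning each cluster to its most frequent reference class and then smoothing with a majority filter, can be sketched as below. The tiny arrays, the 3 x 3 window, the edge padding and the convention that negative reference values mark unlabeled pixels are all assumptions for illustration.

```python
import numpy as np

def clusters_to_classes(cluster_map, reference_map, n_clusters, n_classes):
    """Map each cluster to the reference class it overlaps most often."""
    valid = reference_map >= 0  # negative values mark unlabeled pixels
    table = np.zeros((n_clusters, n_classes), dtype=int)
    np.add.at(table, (cluster_map[valid], reference_map[valid]), 1)
    return table.argmax(axis=1)

def majority_filter(class_map, n_classes, size=3):
    """Replace each pixel with the most frequent class in its size x size window."""
    pad = size // 2
    padded = np.pad(class_map, pad, mode="edge")
    rows, cols = class_map.shape
    votes = np.zeros((n_classes, rows, cols), dtype=int)
    for dr in range(size):
        for dc in range(size):
            window = padded[dr:dr + rows, dc:dc + cols]
            for c in range(n_classes):
                votes[c] += window == c
    return votes.argmax(axis=0)

cluster_map = np.array([[0, 0, 1, 1],
                        [0, 2, 1, 1],
                        [0, 0, 1, 1],
                        [3, 3, 3, 3]])
reference = np.array([[0, 0, 1, 1],
                      [0, 0, 1, -1],
                      [0, 0, 1, 1],
                      [2, 2, -1, 2]])

lookup = clusters_to_classes(cluster_map, reference, n_clusters=4, n_classes=3)
class_map = lookup[cluster_map]          # reassigned cluster map
smoothed = majority_filter(class_map, n_classes=3)
print(lookup)
print(smoothed)
```

Ties in either step are resolved in favor of the lowest class index here, which is a simplification rather than a documented rule.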
Figure 17 Mapping Results in Tippecanoe County
The accuracies of the two maps and the percentage of agreement between them were calculated
(Table 18). Despite the large gap in computing time, with CBEST almost 40 times faster than
K-means, the two algorithms generated products at the same level of accuracy with respect to the
given reference. Neither K-means nor CBEST was capable of separating corn from soybeans
effectively. If the two crop types are combined into one class, the accuracy goes up to 88%. The
increase in accuracy is partly attributable to the dominant number of pixels of the two crops, as
they comprise approximately 66% of all reference pixels in the reference image. Another reason is
that, by visual inspection, the two classes have relatively close spectral curves and are thus
difficult to separate. The limitation of K-means when dealing with high-dimensional data is
another possible source of accuracy loss. Furthermore, the reference polygons may not be as
uniform as they were labelled. On close inspection, the textures in some reference polygons are
complicated. For instance, the urban reference polygons delineate entire neighbourhoods, in which
there are only a few sparsely distributed buildings while most of the area is covered by planned
lawns, brush and trees. All of the issues stated above could contribute to the misclassification
in this experiment. To sum up, it is reasonable to interpret the results as an indication that
K-means and CBEST can achieve a similar level of accuracy when clustering hyperspectral data.
Table 18 Summary of Class Results (AVIRIS)
                                      CBEST                           K-means
Elapsed Time (Seconds)                107 (ES Transformation 10 s)    4071
Accuracy                              65.8%                           65.7%
Accuracy (Corn/Soybeans Combined)     88.4%                           88.1%
Agreement                             81.0%
Agreement (Corn/Soybeans Combined)    93.4%
6 Discussions
There is a concern that the eigenspace transformation could cost a comparable amount of time to
K-means under certain circumstances. Theoretically, the computing complexity of the eigen
analysis is O(np^2), while that of K-means is O(npkt). However, the process of K-means is more
complicated and redundant than the eigen analysis, since it involves calculating distance pairs,
searching for minimum distances, recalculating cluster centers, and calculating and checking the
termination flag. Each of these steps requires scanning over the entire dataset. To give an
estimate, an empirical parameter b can be specified such that when np^2 grows large enough to
approach bnpkt, the eigen analysis would take as long as K-means and thus CBEST would not be a
good choice. In the Efficiency & Performance Test, we did not include the time of the eigen
analysis in our time comparison tables, because the eigen analysis performed on the TM image took
only 0.1 seconds on average over 10 runs under the same hardware configuration. Compared with a
run time of over one hundred seconds for K-means, this time is relatively small. In the later
experiment with the hyperspectral image, although the eigen analysis processing time went up to
10 seconds, it is still not significant compared to the 4071 seconds of processing time for
K-means. From the AVIRIS image experiment with p=220, k=100 and t=100, it can be approximated
that this empirical parameter b is around 9, which is large enough to guarantee that the eigen
analysis is not time consuming in most cases for remotely sensed data, even for hyperspectral
data with over 200 bands.
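The estimate of b can be reproduced from the reported timings. The sketch below is one reading of the definition, assuming that the two run times are proportional to np^2 and to b·npkt with the same constant, and that p, k and t denote the number of bands, clusters and iterations.

```python
# Values reported for the AVIRIS experiment.
p, k, t = 220, 100, 100            # bands, clusters, iterations (assumed meanings)
t_eigen, t_kmeans = 10.0, 4071.0   # seconds

# If t_eigen ~ c * n * p^2 and t_kmeans ~ c * b * n * p * k * t, then
# t_eigen / t_kmeans = p / (b * k * t), which gives:
b = (p / (k * t)) * (t_kmeans / t_eigen)
print(round(b, 2))  # 8.96
```

Under the same assumptions, the eigen analysis would only match the K-means run time once p approaches b·k·t, far more bands than any current sensor provides.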
In the experiment on the Landsat TM satellite image, the post-processed CBEST map is
considered to be more accurate than the post-processed K-means map with this validation
sample set. However, due to the same nature of the two algorithms, one could argue the results
simply reflect that the two sets of results are from two respective local optima. Although both
CBEST and K-means intend to find optimal clustering patterns for the data, as they start with
different random initial cluster centers, the clustering partition boundary could be different
because they could possibly be trapped in many local minima. In practice, sometimes some local
minima could yield better classification results than the global minimum depending on the
configuration of the original dataset, classification system and validation dataset. In the
experiment with the AVIRIS data, however, CBEST and K-means yielded nearly identical
results in which they performed at the same level of accuracies yet showing the same weakness
when separating certain classes. The application for land cover/use mapping we demonstrated
here is merely intended to present that CBEST is capable of handling classification practice in
remote sensing studies.
The experiments conducted above assure us of the potential of CBEST in land cover and land use
classification. The pros and cons of CBEST compared to K-means in remote sensing applications
are summarized as follows:
Pros:
† CBEST is fast when dealing with large datasets;
† CBEST converges earlier;
† By choosing a proper N, CBEST could maintain a performance level similar to K-means;
Cons:
† CBEST does not guarantee local optimum;
† Need to estimate a proper N prior to application;
† More sensitive to initial centers.
Consequently, we can roughly conclude that CBEST enhances the advantages of the
conventional K-means while suffering almost identical drawbacks. However, the speed gain is so
significant that more agile strategies could be applied to CBEST. For instance, for an extremely
large dataset, one can only afford to run K-means several times with a low number of iterations.
CBEST can actually run a lot of times and converge at a point which could potentially yield
better result than K-means in a shorter period of time.
In practice, CBEST has already been applied to study the distribution and temporal changes of
surface cover colors over the entire country of China with 8-day composite MODIS satellite
images from 2001 to 2010 (Fu et al., In press). It was shown this method had considerably saved
computation time in comparison to the conventional K-means.
In general, CBEST is competitive in tasks in which a certain loss of accuracy can be tolerated in
exchange for greatly improved processing efficiency on large datasets. For instance, one may want
to scan a large database to find clusters that are small in number and whose centers are
difficult to detect by training. In the context of classification, CBEST can be used to
explore large datasets to search for distinct cluster centers that are unlikely to belong to any
existing classes defined in a supervised classification. One can also use CBEST to search for any
clusters of small populations within existing classes to split existing classes into higher level
subclasses. For instance, if supervised classification is applied to the image used in our
experiment, there are plenty of subclasses that could be further explored such as the large
eigenspace distance between two subclusters within the clearland class that is covered by
different cleared ground covers. In remote sensing, the image we experimented with is hardly
considered a large dataset, as it is only a subset of one TM scene. For clustering the global
land areas, for instance, the total land area on Earth is around 150 million km^2, comprising
roughly 150 million pixels at a coarse spatial resolution of 1 km, which could take days for
K-means to process but maybe only a few minutes for CBEST. With this gain in efficiency, we could
deal with higher spatial resolutions over large areas much faster. Consider a scenario in which
we increase the spatial resolution to 30 meters for clustering tests over all global land areas,
at the same scale as the global land cover product produced with TM and ETM+ data (Gong et al.,
2013). The data size increases by more than 1000 times to over 150 billion pixels in total. It
could be an impossible task for K-means to complete within a reasonable amount of time. However,
the nature of CBEST makes this type of task feasible.
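The pixel counts quoted above follow from simple arithmetic, sketched here with the approximate land area given in the text; the function name is illustrative.

```python
land_area_km2 = 150e6  # approximate global land area quoted in the text

def pixel_count(resolution_m):
    """Number of square pixels of the given side length covering the land area."""
    pixels_per_km2 = (1000.0 / resolution_m) ** 2
    return land_area_km2 * pixels_per_km2

coarse = pixel_count(1000)  # 1 km pixels
fine = pixel_count(30)      # 30 m pixels
print(f"{coarse:.3g} pixels at 1 km")
print(f"{fine:.3g} pixels at 30 m")
print(f"ratio: {fine / coarse:.0f}x")
```

Each 1 km pixel contains about 1,111 pixels of 30 m, so the total rises from 150 million to roughly 167 billion, consistent with "more than 1000 times" and "over 150 billion".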
As with K-means, CBEST is not restricted to Euclidean distance; generalizing it to other types of
distance is also viable. In addition, CBEST could serve as an initialization tool for quickly
approaching the global optimum, followed by other improved K-means approaches. Beyond the
framework of K-means, CBEST as a compression approach could be generalized to many iterative
clustering algorithms by computing additional statistics, such as variances, within the
eigenspace partition sets. More improvements to CBEST are possible under the framework
established in this paper, and we look forward to applying it flexibly in various remote sensing
practices in the future.
Chapter 3 Applications of CBEST in efficiently mapping forest changes in the
state of California from 1986-2011
Abstract
We present an efficient approach for large-area mapping of forest changes from remote sensing
data, based on the Clustering Based on Eigen Space Transformation (CBEST) algorithm. By analyzing
450 Landsat Thematic Mapper (TM) satellite images from 1986 to 2011 at a five-year interval
covering the entire state of California, USA, we derived a forest change type map, a forest loss
map and a forest gain map. Although California has 99.6 million acres of land area in total and
the spatial resolution of Landsat TM is 30 m, the task took only 10 hours of computing time on a
computer with an Intel 2.8 GHz i5 CPU and 8 gigabytes of RAM. The overall accuracy of the forest
cover in year 2011 was 92.9% ± 1.6%. We found that the estimated forest area changed from 28.20 ±
1.98 million acres to 28.05 ± 1.98 million acres between 1986 and 2011. In particular, our rough
estimate indicates that each year California’s forest experienced a loss of 92 thousand acres and
a recovery of 85 thousand acres, resulting in a net forest loss of seven thousand acres per year.
In addition, during 1986-2011, around 12% of the forestland experienced changes, with 4% each for
deforestation, afforestation, and deforestation followed by recovery. We concluded that
forestland in California had been managed in a sustainable manner over the 25 years, since no
significant directional changes were observed. Our approach made a tighter estimate of the true
canopy coverage, such that 29% of the land in California is forestland, as opposed to the figures
of 33% and 40% from previous studies that had lower spatial resolution and shorter temporal
coverage.
Keywords: Forest Change, Deforestation, Forest Disturbance, Landsat, Large Areas Mapping,
Large Dataset, Unsupervised Classification, Multi-temporal Image Analysis, Remote Sensing
1 Introduction
Forestland is commonly defined as land that is at least one acre in area and has at least 10% of
its area stocked with trees of any size, or that previously had such tree cover and is not
currently developed for non-forest use (Helms, 1998). The Resources Planning Act (RPA) Assessment
(USDA Forest Service, 2012) additionally requires a width of at least 120 feet (37 meters). It
also includes transition zones with 10% tree cover and excludes lands predominantly under
agricultural and
urban land use. Forest, when properly managed, is known to be a major carbon sink that can
mitigate the process of climate change. Traditionally, the importance of forest is assessed for its
economic, social and ecological values. Commercial forest (Timberland) provides valuable wood
products, while reserved forest is preserved for recreations, aesthetics, wildlife, watershed,
biodiversity, etc. The importance of sustainable forest management that aims to conserve the
forest for the benefit and sustainability for future generations is increasingly acknowledged by
the public nowadays. Therefore, it is crucial to monitor forest changes and to estimate
deforestation for tracking carbon stocks and fluxes (Running, 2008), as well as to support
decision making for better forest management for the benefit of society. It is therefore
necessary to monitor how much land area is actually dominated by mature trees, rather than
regenerating as bare soil, grasses, shrubs or seedlings, regardless of administrative definitions.
Moreover, monitoring these deforestation and afforestation activities over time is also important
since natural and human-induced disturbances that cause deforestation are becoming more and
more frequent under climate change (Overpeck et al., 1990; Westerling et al., 2006). Natural
disturbances include hurricanes, earthquakes, wildfires, increased temperature, drought,
pathogens and insect attacks (Soja et al., 2007; Kurtz et al., 2008; Westerling and Bryant, 2008).
Human-induced disturbances include logging, clear-cutting and prescribed fire. The detection of
these disturbances and land use changes provides evidence for scientists and policy makers to
study the implications of such changes and to project future trends.
Remote sensing has been widely used in forest mapping. Large area forest land cover mapping
began in the 1990s (Loveland et al., 1991; Zhu and Evans, 1994; Vogelmann et al., 2001). At the
global scale, the global tropical forest cover map (Mayaux et al., 1998), global land cover maps
(Hansen et al., 2000; Gong et al., 2013) and the global forest percentage map (Defries et al.,
2000) mapped land cover including forest at a single time. Among the many sensors that have
been used for forest mapping such as AVHRR (Zhu and Evans, 1994; Hansen et al., 2000),
MODIS (FIA, 2014; Parmentier and Eastman, 2014) and MERIS (Arino et al., 2007), Landsat
satellite sensors can provide both finer resolution (30 meters) and temporal coverage with a
16-day cycle since 1984. Landsat has therefore been a preferred choice for mapping forest
changes. For instance, the National Land Cover Dataset (NLCD) was produced for the years 1992,
2001, 2006 and 2011 (Vogelmann et al., 2001; Homer et al., 2007; Fry et al., 2011; Jin et al.,
2013), in which forest was mapped at 30 meter spatial resolution, the finest resolution for the
country and states so far. Because of the large volume of data resulting from its relatively fine
resolution and the complexity of land cover classification approaches, this sensor has been
utilized more frequently only in recent years. Landsat images acquired from 1988-2006 over a
study site in New
Mexico were used to map forest changes (Vogelmann et al., 2009). Biomass loss as a result of
disturbances was mapped using Landsat-based detection of Trends in Disturbance and Recovery
(LandTrendr, Kennedy et al., 2010; Cohen et al., 2010) for the conterminous US from 1986-2004
(Powell et al., 2014). Souza et al. (2013) used ten years of Landsat images to map the
deforestation of the entire Amazon forest. Hansen et al. (2013) were able to produce a global map
depicting forest changes with 30 meter resolution from 2000 to 2012 using efficient cloud
computing.
A variety of automatic classification algorithms were used to group pixels of satellite images into
land cover classes, including separating forest from non-forest. Generally there are two types of
classification algorithms: supervised and unsupervised (Jensen, 2004). A number of maps were
made using supervised classification, which requires a training dataset to train classifier models
and then uses the models to classify the image. For instance, decision trees were used to produce
the NLCD national land cover maps (Vogelmann et al., 2001), MODIS global land cover
product (Friedl et al., 2010) and 2000-2012 forest change map (Hansen et al., 2013). Support
Vector Machines (SVM) was used to produce the first 30 meter global land cover map (Gong et
al., 2013). The unsupervised clustering algorithm partitions data based on its own properties
rather than prior selection of training data. It has been applied in mapping land cover in a variety
of studies (Woodcock et al., 1994; Zhu and Evans, 1994; Loveland et al., 2000; Bartholomé et al.,
2005; Homer et al., 2007; Arino et al., 2007). In addition, there were studies that used neither
approach but thresholding rules to interpret time-series image stacks (Goodwin et al., 2008;
Kennedy et al., 2010; Cohen et al., 2010; Huang et al., 2010).
Forests cover 31 percent of the total land area in the world (FAO, 2010). In the US,
approximately 750 million acres are forestland, occupying 33 percent land area (USDA Forest
Service, 2001; Smith et al., 2002; FIA, 2014). As one of the most influential states in the US,
with a top-ranked economy and diverse demographics, California has a large proportion of forest
coverage, and almost half of the forest is timberland (Laaksonen-Craig et al., 2003), which is
commercial forestland that is suitable for producing wood products. An earlier assessment by
Laaksonen-Craig et al. (2003) suggests 40% of the California land area is forestland. However,
national annualized inventory since 2001 indicates that 33% of the land area in California is
forest (FIA, 2014). The inventory used MODIS images with spatial resolution of 250 meters, of
which the area of each pixel is 15 times the minimum mapping unit, implying a very rough
estimate of forestland. With finer resolution of 30 meters, Franklin et al. (2000) mapped the tree
cover types in the national forests in California. The CALVEG mapping project, which aims to map
all vegetation covers, had not yet been finished, and only limited regions had been mapped, each
for a single time. These maps were produced for one time and are thus not adequate for change
detection. Although there are nationwide and worldwide map products that include the land of
California, no existing study tracks the forest change in California in a spatially and
temporally consistent manner (same methodology, same data sources and small temporal gaps). It is
both expensive and time consuming for experts to derive forest maps over a large area such as
California. The lack of temporal consistency results in gaps that can potentially impede the
observation and study of important phenomena and events. With the availability of Landsat
satellite images that can be traced back to the 1980s with a relatively fine spatial resolution
of 30 meters, we propose an approach based on an efficient clustering algorithm, Clustering Based
on Eigen Space Transformation (CBEST, Chen and Gong, 2013), to map forest changes in the entire
state of California from 1986-2011. CBEST is an efficient implementation of conventional K-means,
a widely used unsupervised algorithm. Unsupervised classification relies on post-classification
interpretation of the unlabeled clusters into information classes, which involves expertise and
intensive human work. Therefore, the procedure we designed features efficient CBEST clustering,
probability-based forest cluster labeling and a unique automatic multi-temporal forest change
interpretation with probability trajectories. In this way, we can fill the gap in detailed
long-term forest monitoring in California and establish a reliable and consistent map with
efficient computing.
2 Methodology
2.1 Study Area
California is located on the west coast of the United States of America. It has 99,699 thousand
acres of total land area (U.S. Department of Commerce, 2010) and an estimated population of
38,332,521 as of 2013 (America Community Survey Office, 2013). Its gross domestic product (GDP)
was about $2.003 trillion in 2012, which is the largest in the United States (U.S. Bureau of
Economic Analysis, 2013) and would rank 8th among all countries in the world. Cool offshore ocean
currents and cold upwelling subsurface water lead to a Mediterranean climate in the coastal and
southern parts of the state, with a rainy winter and dry summer, and a moderate oceanic climate
in the north of the state. California has diverse ecosystems including deserts, forested
mountains, coastal forests, chaparral and woodlands. The plain of the Central Valley is one of
the world’s most productive agricultural areas and supplies 8 percent of national agricultural
output by value (Reilly et al., 2008). California has a large area of wildland-urban interface
(WUI), making more than 5 million homes vulnerable to wildfires (Stewart et al., 2006).
2.2 Data
Figure 18 California: Study Area and Landsat TM scenes. Since the study area is in the
northern hemisphere, the UTM is of North Zone.
Thirty-one Landsat scenes were required to cover the entire state of California for each year,
crossing three Universal Transverse Mercator (UTM) zones, as Landsat images are projected in the
UTM coordinate system (Figure 18). A total of 450 Landsat Thematic Mapper Surface Reflectance
Climate Data Record (CDR) images were downloaded for the years 1986, 1991, 1996, 2001, 2006 and
2011. The CDR product was processed using the Land Ecosystem Disturbance Adaptive Processing
System (LEDAPS, Masek et al., 2012). LEDAPS carries out radiance calibration, top-of-atmosphere
reflectance conversion and atmospheric correction. A cloud mask layer generated by the Fmask
algorithm (Zhu and Woodcock, 2012), an object-based cloud masking approach, is also included in
the CDR product. To ensure the quality of the data as well as to fill potential gaps resulting
from cloud cover, all images from July to September with cloud coverage of less than 10 percent
were downloaded. Coastal scenes with clear land area were picked manually, since the ocean part
of such images is usually covered by cloud and the scenes therefore always have more than 10
percent cloud coverage overall.
2.3 Procedure
The general procedure can be summarized in the following steps: 1) Data Preparation; 2)
Initial Clustering; 3) Integrating Cluster Centers; 4) Probability Assignment; 5) Probability
Trajectory Interpretation; and 6) Post-processing. Figure 19 shows the flowchart of our approach.
The pixel-level classification for each year was semi-supervised, combining unsupervised and
supervised classification. First, the unsupervised
classification was carried out using CBEST to partition the image into a number of spectral
classes, each with similar spectral properties. A stratified sample was then drawn, and
samples from all spectral classes were visually assigned to a forest percentage class (<10%,
10%-20%, 20%-50% and >50%) using high-resolution images in Google Earth. An arbitrary value
indicating tree cover fraction was assigned to each forest percentage class. By averaging the
cover fractions within each spectral class, the mean and variance of the forest probability were
derived. We then assigned the probability to all pixels for all acquisition years, so that each
pixel had a probability trajectory describing its forest change from 1986 to 2011.
Figure 19 Flowchart of the Procedure to map forest changes in California
2.3.1 Data Preparation
All Landsat scenes with the same coordinate system and acquisition year were mosaicked into
one larger scene. Where multiple images overlapped, the image acquired earlier in the year had
the higher priority. Because all
images were obtained from July to September, the image acquired closest to July
1st was placed on top of the image stack. Later images were used to fill the gaps in the scene
masked as cloud, shadow, snow or water, in ascending order of the difference
between acquisition time and July 1st. We mosaicked images from the same coordinate system
to reduce the computing time and the distortion error caused by re-projecting and re-sampling
the images (Lunetta et al., 1991). In our approach, all images from the same coordinate system
were processed independently until spectral bands were converted into clusters and temporally
stacked, so that only one pass of mosaicking and re-projecting was required.
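The priority rule above can be illustrated with a minimal sketch. It is not the processing code used in this study: the grid representation, the `NODATA` marker and the function name `composite` are assumptions made for illustration, and each scene is assumed to be already resampled to a common pixel grid.

```python
from datetime import date

NODATA = None  # hypothetical marker: pixel masked as cloud, shadow, snow or water


def composite(scenes, year):
    """Fill each pixel from the valid scene acquired closest to July 1st.

    scenes: list of (acquisition_date, grid) pairs; every grid is a list of
    rows on the same pixel grid, with NODATA where the pixel is masked.
    """
    reference = date(year, 7, 1)
    # Ascending order of the difference between acquisition time and July 1st.
    ordered = sorted(scenes, key=lambda s: abs((s[0] - reference).days))
    rows, cols = len(ordered[0][1]), len(ordered[0][1][0])
    mosaic = [[NODATA] * cols for _ in range(rows)]
    for _, grid in ordered:
        for r in range(rows):
            for c in range(cols):
                # Later scenes only fill pixels that are still empty.
                if mosaic[r][c] is NODATA and grid[r][c] is not NODATA:
                    mosaic[r][c] = grid[r][c]
    return mosaic


scenes = [
    (date(2011, 8, 20), [[5, 5], [5, 5]]),
    (date(2011, 7, 4), [[1, NODATA], [NODATA, 1]]),
    (date(2011, 7, 20), [[3, 3], [NODATA, 3]]),
]
mosaic = composite(scenes, 2011)
```

In this toy example the July 4 scene supplies every pixel it covers, and the two later scenes only fill the remaining gaps.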
2.3.2 Initial Clustering
CBEST (Chen and Gong, 2013) was applied to cluster each mosaicked image into 100
clusters. The desired number of eigenspace partitions N was set to 10 million, which means that
CBEST compresses the eigenspace by segmenting the eigen axes, based on the corresponding
eigenvalues, into approximately 10 million eigenspace partition cells. This eigenspace
compression was originally conceptualized and developed by Gong and Howarth (1992) based
on Principal Component Analysis (Jolliffe, 2002). Instead of using pixels as
members in each iteration, CBEST uses the eigenspace partition cells, whose means
and counts enter a revised K-means clustering. The K-means algorithm aims to
partition the data into a specified number of clusters with minimized within-cluster sum of
squares (MacQueen, 1967). Since N is user-defined and is relatively small compared with the
number of pixels in a large image, the computational cost and memory usage of the algorithm
can be greatly reduced. The eigenspace transformation (or PCA transformation) does not strictly
require all pixels to calculate the optimal eigenspace compression. First, using a representative
subsample of the original image changes the eigenvalues only slightly. Second, even if the
eigenspace compression is slightly distorted, the subsequent K-means clustering is still
the major step and has more impact on the result, given that the total number of eigenspace
partitions does not change. Therefore, a systematic sampling with a ten-pixel interval was done
solely for the purpose of compression. The projection from the original feature space to the
compressed eigenspace still scanned the entire dataset. The CBEST software we used in this paper
is available to download at http://data.ess.tsinghua.edu.cn/. The configuration of parameters and
the interface of the CBEST software are shown in Figure 20.
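The idea of clustering partition cells instead of pixels can be sketched as follows. This is a simplified illustration and not the CBEST implementation: a uniform grid (the hypothetical `compress` function with a `cell_size` parameter) stands in for the eigenvalue-based partition of the eigen axes, and the PCA transformation is omitted.

```python
import random


def compress(points, cell_size):
    """Group points into grid cells and return per-cell means and pixel counts."""
    sums, counts = {}, {}
    for p in points:
        key = tuple(int(v // cell_size) for v in p)
        if key not in sums:
            sums[key] = [0.0] * len(p)
            counts[key] = 0
        sums[key] = [s + v for s, v in zip(sums[key], p)]
        counts[key] += 1
    keys = list(sums)
    means = [[s / counts[k] for s in sums[k]] for k in keys]
    return means, [counts[k] for k in keys]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(cell_means, cell_counts, k, iterations=20, seed=0):
    """K-means in which each member is a cell mean weighted by its pixel count."""
    rng = random.Random(seed)
    centers = [list(m) for m in rng.sample(cell_means, k)]
    labels = [0] * len(cell_means)
    for _ in range(iterations):
        # Assignment step: each cell goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(m, centers[j]))
            for m in cell_means
        ]
        # Update step: count-weighted mean of the cells assigned to each center.
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(cell_counts[i] for i in members)
            if total == 0:
                continue  # an empty cluster keeps its previous center
            centers[j] = [
                sum(cell_means[i][d] * cell_counts[i] for i in members) / total
                for d in range(len(centers[j]))
            ]
    return centers, labels


points = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.3), (1.2, 0.4),
          (10.1, 10.3), (11.2, 10.1), (11.4, 10.6), (11.1, 10.2)]
means, counts = compress(points, cell_size=1.0)
centers, labels = weighted_kmeans(means, counts, k=2)
```

Eight points collapse into four cells here, so each K-means iteration touches four members instead of eight. With about 10 million cells standing in for billions of pixels, the saving is correspondingly larger.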
Figure 20 CBEST software interface. The initial clustering was implemented under the
configuration in this figure.
The CBEST software was coded and compiled in Matlab 2013a. All analyses and processing were
run on a computer with an Intel i5 760 2.8 GHz quad-core CPU and 8 gigabytes of
physical memory.
2.3.3 Integrating Cluster Centers
There are three mosaicked images in each year, giving a total of 18 mosaicked images for
clustering. Each mosaic yields 100 clusters, so a total of 1800 cluster centers were
calculated. To integrate these centers and make cluster labels consistent over all images, we
applied a further K-means clustering that grouped these 1800 cluster centers into 30 clusters
with a universal labeling scheme. In this way, all images were relabeled into 30 clusters, within
each of which the samples had similar spectral properties and were distinguished from those of
other clusters.
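The relabeling step can be sketched as a lookup table per mosaic. The sketch assumes the universal centers have already been obtained by K-means on the pooled cluster centers; the function names and the two-band toy values are hypothetical.

```python
def nearest(center, global_centers):
    """Index of the universal center closest to one local cluster center."""
    return min(
        range(len(global_centers)),
        key=lambda g: sum((a - b) ** 2 for a, b in zip(center, global_centers[g])),
    )


def build_lookup(local_centers, global_centers):
    """Map each local cluster label of one mosaic to a universal label."""
    return [nearest(c, global_centers) for c in local_centers]


def relabel(label_image, lookup):
    """Replace local labels in a clustered image with universal labels."""
    return [[lookup[v] for v in row] for row in label_image]


# Toy values: 2 universal centers, and 3 local centers from one mosaic.
global_centers = [(0.05, 0.30), (0.20, 0.10)]
local_centers = [(0.21, 0.11), (0.04, 0.28), (0.06, 0.33)]
lookup = build_lookup(local_centers, global_centers)
relabeled = relabel([[0, 1], [2, 2]], lookup)
```

In the study the same idea would apply with 100 local labels per mosaic and 30 universal labels.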
2.3.4 Probability Assignment
The clustered mosaic images for 2011 were mosaicked into one larger image for the
purpose of sampling integrity. A stratified sampling was carried out with each cluster being a
stratum, sampled in proportion to its share of the total number of land pixels in California. The
desired number of sampling units was set to 1000, with a minimum of 10 sampling units
for each stratum. All the sampling units were categorized by visual verification in
Google Earth. Specifically, a 30 m by 30 m square region centered at the location of
the sampling unit was roughly divided into 3 by 3 cells. By counting the cells in which forest
was dominant, the forest percentage label was determined. The classes are defined
in Table 19. A secondary label was also assigned if a sample consisted of several mixed classes
or a transition (Edwards et al., 1998; Stehman and Czaplewski, 1998; Olofsson et al., 2014).
Since the minimum mapping unit for forestland is one acre and some uncertainty is
caused by misregistration in either Google Earth or the Landsat imagery, it is
reasonable to take the surrounding pixels into account and form a secondary label in addition to
the primary label at the pixel when the sampling location disagreed strongly with the
surrounding neighbor pixels.
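A proportional allocation with a per-stratum minimum, as described above, might look like the following sketch. The function name `allocate` and the pixel counts are hypothetical; only the target of 1000 and the minimum of 10 come from the text.

```python
def allocate(stratum_pixels, target=1000, minimum=10):
    """Proportional allocation of sampling units with a per-stratum minimum."""
    total = sum(stratum_pixels.values())
    return {
        stratum: max(minimum, round(target * n / total))
        for stratum, n in stratum_pixels.items()
    }


# Hypothetical land-pixel counts for three strata (clusters).
allocation = allocate({1: 600_000, 2: 395_000, 3: 5_000})
total_units = sum(allocation.values())
```

Because small strata are raised to the minimum, the total exceeds the target. This would be consistent with more than 1000 samples being generated for a desired number of 1000.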
Table 19 Verification classes and corresponding probability weights
Class                 Description          Probability Weight
Non-forest            Tree cover < 10%     0
Woodland              Tree cover 10%-20%   0.2
Low-density Forest    Tree cover 20%-50%   0.5
High-density Forest   Tree cover >50%      1
The probability weight was arbitrarily determined to convert nominal classes into numerical
values. Given that many sampling units had an additional secondary label, the probability
weights of these samples were averaged between the primary and secondary labels. The mean and
standard deviation of each stratum were calculated excluding pixels that were identified as
agricultural use, since orchards and forests have similar spectral properties. In this manner,
each cluster (stratum) had a mean and variance that indicate the probability of the pixels being
covered by trees.
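The per-stratum statistics can be sketched as below, using the weights of Table 19. The tuple layout of a sample and the use of the population standard deviation (`pstdev`) are assumptions; the text does not state which estimator was used.

```python
from statistics import mean, pstdev

# Probability weights from Table 19.
WEIGHTS = {
    "non-forest": 0.0,
    "woodland": 0.2,
    "low-density forest": 0.5,
    "high-density forest": 1.0,
}


def sample_weight(primary, secondary=None):
    """Average the primary and secondary weights when a secondary label exists."""
    if secondary is None:
        return WEIGHTS[primary]
    return (WEIGHTS[primary] + WEIGHTS[secondary]) / 2


def stratum_statistics(samples):
    """samples: (cluster, primary, secondary, is_agriculture) tuples."""
    by_cluster = {}
    for cluster, primary, secondary, is_agriculture in samples:
        if is_agriculture:
            continue  # agricultural samples are excluded from the statistics
        by_cluster.setdefault(cluster, []).append(sample_weight(primary, secondary))
    return {c: (mean(w), pstdev(w)) for c, w in by_cluster.items()}


# Hypothetical samples for two clusters.
stats = stratum_statistics([
    (3, "high-density forest", None, False),
    (3, "high-density forest", "low-density forest", False),
    (3, "high-density forest", None, True),
    (17, "high-density forest", None, False),
    (17, "non-forest", None, False),
])
```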
2.3.5 Probability Trajectory Interpretation
The probability weight is an indicator of tree cover fraction. By stacking the means and standard
deviations of the probability temporally from 1986 to 2011, one can derive for each pixel a
trajectory of forest probabilities with a range of dispersion. The probability
plus/minus the standard deviation provides a rough range for estimating how much of the pixel is
occupied by trees. We interpreted the probability and the probability plus/minus standard
deviation starting from the second available year, which was 1991 in most cases. However, the
values for some years could be missing for some pixels because of cloud masks. Pixels
without values for at least 4 years were disregarded and remained unclassified. We determined the
rules for detecting a forest loss or gain by utilizing the probability range overlaps and two
thresholds.
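Building a trajectory for one pixel amounts to a table lookup per year. The sketch below uses two (mean, standard deviation) pairs from Table 21. The function name, the tuple layout and the `max_missing` parameter are illustrative assumptions; `max_missing=3` encodes the rule that pixels without values for at least 4 years remain unclassified.

```python
YEARS = [1986, 1991, 1996, 2001, 2006, 2011]

# (mean, standard deviation) for two clusters, taken from Table 21.
CLUSTER_STATS = {3: (0.86, 0.33), 21: (0.00, 0.00)}


def trajectory(labels, cluster_stats, max_missing=3):
    """Return (year, lower, mean, upper) per valid year, or None if unclassified.

    labels: universal cluster label per year, None where the pixel is masked.
    """
    if sum(label is None for label in labels) > max_missing:
        return None
    points = []
    for year, label in zip(YEARS, labels):
        if label is None:
            continue  # a masked year is skipped
        m, s = cluster_stats[label]
        points.append((year, m - s, m, m + s))
    return points


kept = trajectory([3, 3, None, 21, 21, 21], CLUSTER_STATS)
dropped = trajectory([3, None, None, None, None, 21], CLUSTER_STATS)
```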
For each year, an overlap test against the previous years was applied: if the mean value of any
previous year lay beyond the bounds of this year, and the mean value of this year lay beyond
the bounds of all previous years, we continued with further tests. To avoid detecting
insignificant changes (e.g. removal of a small proportion of trees from a pixel that remains
mostly tree-covered), we defined two thresholds, a mean-threshold and an upper-bound-threshold:
a change towards afforestation must have a mean greater than the mean-threshold and an upper
bound greater than the upper-bound-threshold, whereas the reverse change towards deforestation
requires only one of them to be less than its threshold. In addition, if no change was detected,
the average of all six mean probabilities and the highest upper bound were tested against the
same thresholds. A graphical demonstration of the trajectory interpretation is shown in Figure 21.
The thresholds were determined by averaging over the mixed clusters and searching
around that mean to best equalize the user’s and producer’s accuracies, so that
misclassification between ‘high-density forest’ (>50%) and ‘non-forest’ (<10%) was equally
likely in both directions, rather than one class being over-classified and the other
under-classified. To simplify the storage of the
results, only the latest forest loss and forest gain were retained, so that the possible classes
were forest, non-forest, six forest loss years, six forest gain years and 30 combinations of loss
and gain (2-permutations of 6), a total of 44 classes. These 44 classes were coded and stored
in an output image with an unsigned 8-bit (one byte) pixel depth, which can store
integers from 0 to 255.
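One possible reading of the overlap test and the two thresholds is sketched below. The function name `detect_change` is hypothetical, and deciding the direction of change by comparing with the most recent previous mean is an assumption that the text does not spell out. The default thresholds are the values reported in Section 3.1.

```python
def detect_change(history, current, mean_threshold=0.382, upper_threshold=0.5):
    """Classify one year as "gain", "loss" or None (no change).

    history: (lower, mean, upper) tuples of the previous years.
    current: (lower, mean, upper) of the year being tested.
    """
    lower, mean, upper = current
    # Overlap test, part 1: some previous mean lies beyond this year's bounds.
    previous_outside = any(not (lower <= m <= upper) for _, m, _ in history)
    # Overlap test, part 2: this year's mean lies beyond all previous bounds.
    current_outside = all(not (lo <= mean <= hi) for lo, _, hi in history)
    if not (previous_outside and current_outside):
        return None
    previous_mean = history[-1][1]  # assumption: direction from the latest year
    if mean > previous_mean and mean > mean_threshold and upper > upper_threshold:
        return "gain"  # both thresholds must be exceeded
    if mean < previous_mean and (mean < mean_threshold or upper < upper_threshold):
        return "loss"  # either threshold suffices
    return None


# Ranges built as mean -/+ standard deviation from Table 21.
forest = (0.53, 0.86, 1.19)      # cluster 3
forest_b = (0.44, 0.79, 1.14)    # cluster 19
non_forest = (-0.06, 0.01, 0.08)  # cluster 12

loss = detect_change([forest, forest], non_forest)
gain = detect_change([non_forest], forest)
stable = detect_change([forest], forest_b)
```

The asymmetry between the two branches makes a gain harder to detect than a loss, which matches the intent of avoiding insignificant afforestation signals.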
Figure 21 Graphic demonstration of probability trajectory interpretation. (a) A typical
forest loss pixel with elaborations on the rules for automatic determination of forest loss;
(b) Non-forest, all points fall within the bounds; (c) Forest; (d) Forest Gain detected in
2006.
2.3.6 Post-processing
To remove speckle noise and eliminate patches smaller than the minimum mapping unit
(MMU) for better administration, an MMU filter or majority filter is usually applied in mapping
practice (Homer et al., 2007; Thomas et al., 2010). The filter replaces groups of connected
same-valued pixels that contain fewer than a minimum number of pixels with the majority label of
their surroundings. Neighboring connectivity can be determined using 4 neighboring directions
(up, down, left and right) or all 8 neighboring directions. Connecting 8 neighboring pixels
preserves narrow line features such as roads and rivers, but since the USDA Forest Service (2012)
requires a forest patch to be at least 120 ft (37 meters) wide, we decided to use 4-neighbor
connectivity. The MMU was set to one acre (4,047 m²; 5 pixels = 4,500 m²). Filtered pixels were
refilled by
repeatedly applying a 3 × 3 majority filter until no further change could be made. Residual
unfilled pixels were relabeled with their original classes. The 2013 Cultivated Layer, which
identifies agricultural land and is derived from the 2009-2013 US Department of Agriculture,
National Agricultural Statistics Service, Cropland Data Layers (CDL), was acquired (Boyan et al.,
2012). Speckle noise was filtered, followed by a dilation operation carried out twice on the
agricultural areas with a 3 by 3 kernel to expand them and eliminate trivial patches in the
center and at the edge of croplands. Since California has many orchards, which have spectral
properties very similar to those of forests, it is necessary to mask out all agricultural zones.
The map was then
separated into three maps: a change type map, a forest loss year map and a forest gain year map.
The type map reflects categories of forest change (‘forest’, ‘non-forest’, ‘forest loss’, ‘forest
gain’ and ‘forest loss and recovered’), while the forest loss year and gain year maps are temporal
records of the latest forest loss and forest gain. In addition, forest gain before loss was very
rare and occurred mostly at the head or tail of the trajectory, where it reflected erratic
detection of change in what was supposed to be stable non-forest. We therefore relabeled these
changes as non-forest.
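The MMU filter with 4-neighbor connectivity and iterative 3 × 3 majority refilling can be sketched as follows. This is an illustrative implementation under our reading of the description, not the code used to produce the map; the function names are hypothetical and ties in the majority vote are broken arbitrarily.

```python
from collections import Counter, deque


def small_patches(grid, min_pixels=5):
    """Pixels in 4-connected same-valued patches smaller than min_pixels."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    flagged = set()
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            seen[r0][c0] = True
            patch, queue = [(r0, c0)], deque([(r0, c0)])
            while queue:  # breadth-first flood fill over up/down/left/right
                r, c = queue.popleft()
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == grid[r0][c0]):
                        seen[nr][nc] = True
                        patch.append((nr, nc))
                        queue.append((nr, nc))
            if len(patch) < min_pixels:
                flagged.update(patch)
    return flagged


def mmu_filter(grid, min_pixels=5):
    """Remove small patches and refill them with the 3 x 3 neighborhood majority."""
    rows, cols = len(grid), len(grid[0])
    unfilled = small_patches(grid, min_pixels)
    out = [row[:] for row in grid]
    while unfilled:
        updates = {}
        for r, c in unfilled:
            votes = Counter(
                out[nr][nc]
                for nr in range(max(r - 1, 0), min(r + 2, rows))
                for nc in range(max(c - 1, 0), min(c + 2, cols))
                if (nr, nc) not in unfilled
            )
            if votes:
                updates[(r, c)] = votes.most_common(1)[0][0]
        if not updates:
            break  # residual pixels keep their original classes
        for (r, c), value in updates.items():
            out[r][c] = value
        unfilled -= set(updates)
    return out


# 1 = forest, 0 = non-forest; both non-forest patches are below 5 pixels.
grid = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
]
filtered = mmu_filter(grid)
```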
An accuracy assessment was carried out to evaluate the overall quality of the California forest
change map. Because reliable past ground surveys or high-resolution images are lacking
throughout the state, it is very difficult to assess the accuracy of the changes. However, the
samples identified for 2011 were properly sampled and labeled and can thus be used to
validate the forest cover in 2011 derived from the change map. Considering the limitation of
remote sensing for mixed pixels, especially woodland and low-density forest, in which more than
half of the pixel is not covered by trees, we determined that either ‘non-forest’ or ‘forest’ is
an acceptable label for ‘woodland’ and ‘low-density forest’ in the reference.
3 Results and Analysis
3.1 Intermediate Results
Table 20 Elapsed time for the clustering process. Each result was selected as the lowest
within-cluster sum of squares from 5 runs. 1986N10 means the mosaicked image in 1986
with projection of UTM Zone 10 North.
Name      Elapsed Time   Within-Cluster Sum      Name      Elapsed Time   Within-Cluster Sum
          (Seconds)      of Squares (Trillion)             (Seconds)      of Squares (Trillion)
1986N10   10126          17.65                   2001N10   14206          18.00
1986N11   7974           38.20                   2001N11   9939           41.17
1986N12   7504           1.67                    2001N12   7028           1.93
1991N10   10205          17.32                   2006N10   8538           20.78
1991N11   10659          46.85                   2006N11   9498           44.64
1991N12   7227           1.78                    2006N12   6536           1.83
1996N10   12652          17.67                   2007N10   7822           18.90
1996N11   9591           38.76                   2007N11   8816           39.06
1996N12   8393           1.50                    2007N12   6375           2.07
It took approximately 45 hours in total to batch process these images, which contain
approximately 9.5 billion pixels. Because of the efficiency of the CBEST algorithm, we had the
luxury of time to run the clustering multiple times with different randomized initial states,
which can greatly improve the ability of K-means to approximate the globally optimal solution.
The performance of the clustering is documented in Table 20. The elapsed times for the mosaicked
images of the three zones were similar. Although the mosaicked images in zone 10 and zone 11 were
about 10 times larger than that in zone 12, which contains only one scene, the
compressive nature of the CBEST algorithm means that the time cost depends more on the number of
occupied eigenspace partitions, in other words on the dispersion of the data in eigenspace, than
on the size of the data. The within-cluster sum of squares (WCSS) is a relative measure of how
close the result is to the optimum and is comparable across multiple runs on the same image. A
smaller WCSS implies tighter positioning of the cluster members in the eigenspace. We observed
that the mosaicked
images in zone 11 had the highest WCSS, suggesting a more diverse landscape for the region.
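As a quick arithmetic check, the elapsed times in Table 20 can be summed overall and per UTM zone. The dictionary keys are the image names as printed in the table; the variable names are ours.

```python
# Elapsed clustering time in seconds, keyed by image name as printed in Table 20.
ELAPSED_SECONDS = {
    "1986N10": 10126, "1986N11": 7974, "1986N12": 7504,
    "1991N10": 10205, "1991N11": 10659, "1991N12": 7227,
    "1996N10": 12652, "1996N11": 9591, "1996N12": 8393,
    "2001N10": 14206, "2001N11": 9939, "2001N12": 7028,
    "2006N10": 8538, "2006N11": 9498, "2006N12": 6536,
    "2007N10": 7822, "2007N11": 8816, "2007N12": 6375,
}

total_hours = sum(ELAPSED_SECONDS.values()) / 3600

# Total seconds per UTM zone, taken from the "N10" / "N11" / "N12" suffix.
zone_seconds = {}
for name, seconds in ELAPSED_SECONDS.items():
    zone = name[-3:]
    zone_seconds[zone] = zone_seconds.get(zone, 0) + seconds
```

The total is about 45.3 hours, in line with the approximately 45 hours stated above. The zone totals differ by less than a factor of 1.5 even though zone 12 contains a single scene.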
Figure 22 Post-clustering result in year 2011 and stratified sampling units
After the cluster centers were integrated into a universal set of 30 clusters for all images, the
2011 post-clustering result was used for stratified sampling. The cluster map and sampling plots
are shown in Figure 22. A total of 1057 samples were generated and verified in Google Earth. The
mean and standard deviation of the probability weight of each cluster/stratum are listed in
Table 21. From these numbers one can tentatively conclude that clusters 3 and 19 were most
likely to be dense forest, while clusters 1, 2, 5, 6, 12, 13, 15, 18, 20, 21, 23, 24, 26, 28, 29
and 30 were primarily without tree cover. The other clusters, except for 8 and 25, were somewhat
mixed between the two, implying transitional or mixed woodland or low-density forest. Clusters 8
and 25 were labeled as ‘n/a’: cluster 8 contained bright scan lines caused by sensor error, while
cluster 25 was all agricultural land. Therefore, these two clusters were treated as a
continuation or
transition between the previous time and the subsequent time. In addition, for clusters with a
mean less than 0.5 whose range did not reach zero, the ‘standard deviation’ was manually
increased to just cover the zero value. In this way, the mixed clusters were more flexible to
remain non-forest when the previous and/or subsequent clusters were completely non-forest
clusters with both mean and standard deviation equal to zero. The average of the means of the
mixed clusters was 0.35, around which a search was done to determine the mean-threshold that
best evens the chance of misclassification. The best mean-threshold was found to be
0.382. The upper-bound-threshold was set to an arbitrary value of 0.5, implying that for
a pixel to be forest at a given time, the upper bound of its forest probability at that time
must be at least 50%.
Table 21 Means and standard deviations of forest probabilities calculated for the clusters
Cluster #   Mean   Std. Dev.   Cluster #   Mean   Std. Dev.   Cluster #   Mean   Std. Dev.
1           0.03   0.12        11          0.38   0.40        21          0.00   0.00
2           0.05   0.19        12          0.01   0.07        22          0.45   0.45
3           0.86   0.33        13          0.01   0.04        23          0.04   0.09
4           0.33   0.46        14          0.17   0.33        24          0.00   0.00
5           0.00   0.00        15          0.02   0.07        25          n/a    n/a
6           0.00   0.00        16          0.26   0.37        26          0.00   0.00
7           0.27   0.41        17          0.50   0.50        27          0.36   0.41
8           n/a    n/a         18          0.04   0.11        28          0.09   0.18
9           0.55   0.47        19          0.79   0.35        29          0.06   0.18
10          0.37   0.42        20          0.00   0.00        30          0.10   0.32
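Two of the adjustments described in the text can be sketched from the table: widening the standard deviation so that the range just covers zero, and averaging the means of the mixed clusters. The function name `widen_std` is hypothetical. The mixed-cluster means are the rounded values of Table 21, so the average is only an approximation of the figure used in the study.

```python
def widen_std(mean, std):
    """Enlarge std so that mean - std reaches zero, for clusters with mean < 0.5."""
    if mean < 0.5 and mean - std > 0:
        return mean  # just enough for the lower bound to touch zero
    return std


# Rounded means of the mixed clusters in Table 21, i.e. the clusters that are
# neither dense forest (3, 19), primarily non-forest, nor 'n/a' (8, 25).
MIXED_MEANS = {
    4: 0.33, 7: 0.27, 9: 0.55, 10: 0.37, 11: 0.38,
    14: 0.17, 16: 0.26, 17: 0.50, 22: 0.45, 27: 0.36,
}
mixed_average = sum(MIXED_MEANS.values()) / len(MIXED_MEANS)
```

From the rounded table values the average is about 0.36, close to the 0.35 reported in the text. The tabulated ranges for clusters with a mean below 0.5 already reach zero, so `widen_std` leaves them unchanged; a hypothetical cluster with mean 0.3 and standard deviation 0.2 would be widened to 0.3.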
3.2 Forest Change Map and Accuracy Assessment
Figure 23 California Forest Change Maps 1986-2011. Left: Change Type Map; Upper
right: Forest loss characterized by years; Lower right: Forest gain/recovery characterized
by years.
Interpreting the probability trajectories to generate the initial coded forest
change map took approximately 7 hours on the same machine. The initial map was then masked
for agricultural zones and filtered for the MMU, which took around 1 hour of processing time
combined. Subsequently, the final products of forest change in California from 1986 to 2011
at a five-year interval were generated by separating the coded map into three maps: a forest
change type map, a forest loss year map and a forest gain year map (Figure 23).
We defined four classes to validate and characterize forest and non-forest based on the
percentage of tree cover. However, our forest change map contains only forest and non-forest.
As described in section 2.3.5, the rules for separating ‘forest’ from ‘non-forest’ in our forest
change map were based on an even chance of misclassification between ‘high-density forest’
and ‘non-forest’. We explained in section 2.3.6 that ‘low-density forest’ and ‘woodland’ are
mixed classes with tree coverage of less than 50%, which can therefore be classified as
either ‘forest’ or ‘non-forest’ because of the mixed spectral information at the subpixel level.
An error matrix was therefore generated to reflect the match between the four-class reference
and the two-class map (Table 22).
Table 22 The error matrix of samples with four classes from validation labels and two
classes from the forest cover in 2011
                                      Reference
Map          High-Density    Low-Density        Woodland    Non-forest   Total
             Forest (>50%)   Forest (20%-50%)   (10%-20%)   (<10%)
Forest       193             36                 14          34           277
Non-forest   43              57                 55          625          780
Total        236             93                 69          659          1057
It can be observed that the number of ‘high-density forest’ samples misclassified as
‘non-forest’ (43) and that of ‘non-forest’ samples misclassified as ‘forest’ (34) were similar,
which was achieved by searching for the appropriate mean-threshold in the probability trajectory
interpretation. To match the classes between the map and the reference, either ‘forest’ or
‘non-forest’ was accepted as a label for ‘low-density forest’ and ‘woodland’, as described in
section 2.3.6. Rather than strictly matching their names (low-density forest and woodland),
these mixed classes were
probably transition zones between forest and non-forest, marginal areas and urban forests. From
the perspective of change detection, these classes were normally in an intermediate state of
change or in stable form, but were seldom the result of afforestation or deforestation in
forestland. The error matrix of accuracies was then calculated (Table 23).
Table 23 Error matrix of accuracies for the forest cover in 2011
                      Reference
Map          Forest   Non-forest   Total   User's Acc   95% CI (±)
Forest       245      32           277     0.884        0.038
Non-forest   43       737          780     0.945        0.016
Total        288      769
Prod's Acc   0.851    0.958
95% CI (±)   0.035    0.016
Overall Accuracy = 0.929 ± 0.016
Estimated areas and 95% confidence intervals for both the accuracies and the estimated areas were
calculated following the approaches introduced in Olofsson et al. (2014). The estimated area
classified as forest in 2011 was 28.1 million acres, with a 95% confidence interval of 28.05 ±
1.54 million acres, while the non-forest area was 71.22 ± 1.54 million acres. The land area that
remained unclassified was 0.43 million acres, which should be added to the uncertain land area.
We assumed that the error matrices for the forest cover maps of all years derived from the forest
change map were consistent, implying a constant chance of misclassification for both classes
throughout the mapping years. Therefore, the forest area for all mapping years could be
estimated with a 95% confidence interval that incorporates unclassified uncertainties (Figure 24).
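The count-based accuracies in Table 23 can be reproduced with a short sketch. The confidence interval for the user's accuracy uses the variance U(1 - U)/(n - 1) for a map class with n samples. Applying this formula here is our assumption about how the tabulated values were obtained, and the area-weighted estimators of producer's accuracy and area are not reproduced.

```python
from math import sqrt

CLASSES = ["forest", "non-forest"]
# Sample counts from Table 23: MATRIX[map_class][reference_class].
MATRIX = {
    "forest": {"forest": 245, "non-forest": 32},
    "non-forest": {"forest": 43, "non-forest": 737},
}


def users_accuracy(c):
    """Correct samples divided by all samples mapped as class c (row total)."""
    return MATRIX[c][c] / sum(MATRIX[c].values())


def producers_accuracy(c):
    """Correct samples divided by all reference samples of class c (column total)."""
    return MATRIX[c][c] / sum(MATRIX[m][c] for m in CLASSES)


def overall_accuracy():
    correct = sum(MATRIX[c][c] for c in CLASSES)
    total = sum(sum(row.values()) for row in MATRIX.values())
    return correct / total


def users_ci(c, z=1.96):
    """Half-width of the 95% interval, assuming variance U(1 - U) / (n - 1)."""
    n = sum(MATRIX[c].values())
    u = users_accuracy(c)
    return z * sqrt(u * (1 - u) / (n - 1))
```

For example, `users_accuracy("forest")` is 245/277, or about 0.884, and `users_ci("forest")` is about 0.038, matching the first row of Table 23.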
Figure 24 Estimated Forest Area by Mapping Years
4 Discussions
The forestland area in California was quite stable during the 25-year period studied, and no
significant forest loss or gain was detected. The slight fluctuation of the forest area
plot in Figure 24 implies that in some periods disturbances affected more land than the forest
could regenerate, while in other periods new disturbances were less frequent and regeneration
of the resilient forest ecosystems dominated. Since no accuracy assessment of the
forest change was done, we could only estimate yearly loss and gain based solely on the number
of pixels of each change type in our map product. On average over the 25 years, California’s
forest experienced a loss of 92 thousand acres and a recovery of 85 thousand acres each year,
resulting in a net forest loss of seven thousand acres per year. Compared with the area of
forestland in the entire state, the average annual loss was only 0.02%, which is small even
compared with the uncertainties in the map (a confidence interval of 1.54 million acres around
the mean and 0.43 million acres unclassified). Over the 25 years, around 12% of the forestland
experienced change, split equally at 4% each among ‘forest loss’, ‘forest gain’ and ‘forest loss
and recovered’ (Figure 25).
Figure 25 Proportions of forest change type in California for 1986-2011
In particular, the 4% of deforestation should be divided into two categories: 1) Permanent
changes, in which land was converted to other land uses or succeeded by a different ecosystem
without dense tree cover. Such changes show no detectable regeneration in the short or near
term. 2) Temporary changes that result from disturbances such as wildfire or clearcutting.
These changes occurred recently and trees have not yet regenerated. The 4% in the
‘forest loss and recovered’ class falls into the latter, temporary category. The 4% of ‘forest
gain’ could also be categorized into permanent changes and temporary changes. Based on visual
comparisons between our maps and high-resolution images, we found that natural regeneration
usually took more than 10 years for forest to reach the level detectable by our algorithm.
Therefore, for a natural process of forest loss followed by recovery, the loss must have been
detected in the first 15 years. We can roughly calculate the proportion of such temporary
changes for the last 10 years to be 10/(10+15)×4%=1.6%. By assuming a constant ratio between
permanent and temporary changes over the years, we can estimate that the permanent loss was
4% - 1.6% = 2.4% during the 25 years. The same calculation can be applied to the forest gain,
giving a permanent gain of 2.4% that offsets the permanent loss. However, the above analysis was
limited by the 5-year temporal interval as well as by the complexity of forest changes due to
natural disturbances, climate, changes of land ownership, economic fluctuations affecting the
wood products market and changes of management policy. Further exploration could be done to
study the potential factors that affect these changes. In general, we did not observe that the
forestlands in California were either declining or increasing in a significant way. All these
observations suggest that the replacement of California’s forest was relatively stable and healthy,
which should be attributed to the Forest Practices Act and efforts to sustainably manage
forestland in California.
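The rough split between temporary and permanent loss is simple arithmetic. It can be written out as follows, with variable names of our choosing and all percentages taken from the text.

```python
loss_share = 4.0    # percent of forestland in the 'forest loss' class
recent_years = 10   # losses too recent for regeneration to be detectable
early_years = 15    # losses early enough for recovery to be detected

# Temporary losses of the last 10 years, scaled from the 4% 'loss and recovered'.
temporary_loss = recent_years / (recent_years + early_years) * loss_share
permanent_loss = loss_share - temporary_loss
```

This gives 1.6% temporary and 2.4% permanent loss, under the stated assumption of a constant ratio between permanent and temporary changes.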
Figure 26 Local views of selected places in the forest change map. The four images at
the bottom of the figure compare the detected changes with historical aerial
photographs from 1988 and 1993 and with a recently acquired high-resolution image.
Orange circles indicate a regenerated forest patch after early removal, while red
circles encompass a clearcut area.
In Figure 26, we show a closer view of detected disturbances at the landscape scale. The
scene in the north has many small patches of similar size and shape, in which the forests
were harvested for wood products. The harvesting activities in the periods 1996-2001,
2001-2006 and 2006-2011 were almost identical, implying a well-managed forest harvest strategy.
The scene in the south suggests a large irregular disturbance, probably the result of a wildfire
during 2006-2011. The scene in the San Francisco Bay Area, shown in a zoomed view alongside
historical aerial photos, lies in the hills of the University of California at Berkeley campus,
northwest of the botanical garden. This small scene includes two land use change scenarios. The
northern part of the forest was clearcut in the late 1980s and later recovered. The forest in the
southern part was removed for paved road and building construction during the same time and thus
experienced a permanent land use change.
In comparison with the 40% forestland area reported by Laaksonen-Craig et al. (2003) and the 33%
reported by FIA (2014), our map indicated that forest occupies 29% of the land. If we count the
proportion of forest samples (tree cover >10%) in our stratified sampling, forest samples were
38% of the total. Considering that the stratified sampling was not strictly based on equal
probability (each stratum had a sample size of at least 10) and that only a very small proportion
of the entire state was sampled, this estimate could be both biased and large in variance. The
differences in the estimated forestland area between our approach and the previous studies can
be attributed to the following factors. Forestland is officially defined as land at least 120
feet wide and 1 acre in size with tree cover greater than 10%, excluding areas in urban and
agricultural use. All transition zones that meet this standard should also be defined as forest.
The definition also includes any disturbed areas that were previously forested and whose land
use has not changed to urban or agricultural. This leaves room for ambiguity and subjective
interpretation. For instance, a large grassland patch resulting from natural succession after a
forest fire and surrounded by forestland could be defined as either forestland or non-forest.
Viewed at a larger extent together with the surrounding forest, the entire patch that
encompasses the grassland has tree cover over 10% and is thus defined as forestland. However, if
the disturbance occurred a long time ago and the patch can no longer be treated as regenerating
forest, the patch alone does not have tree cover over 10% and should thus be classified as
non-forest. From the perspective of data sources, without additional spatially detailed
administrative information, it is not possible to find all forestland even if we
assume that remote sensing images can provide perfect information about tree coverage. For
instance, under a scenario similar to the one above, areas that were disturbed and under
regeneration may be succeeded by non-forest ecosystems because of factors such as nutrients,
water, climate and invasive species. If the disturbed areas are larger than the one-acre MMU and
the spatial resolution of the data, they are easily classified as non-forest. However, since the
patches are in the middle of a large forestland, they remain in forest use and are classified
as forestland administratively. Here the scale of observation plays an important role. In
Figure 27, we demonstrate that delineating forest patches at different scales of measurement can
result in great differences in mapped area. A finer resolution always draws the classification
boundary closer to the tree cover, leading to a more accurate estimate of actual tree cover than
an administrative boundary that encompasses a greater extent including much non-forest cover.
This is especially so in forest and non-forest transition areas and wildland-urban interfaces,
where tree coverage is relatively sparse.
Figure 27 An example of how scale affects the classified area. Suppose the smallest cell unit
is 30 m by 30 m in size; there are 20 cells or 18,000 m² of forest area. If a 120 m by 120 m
cell is used, there are 3 cells or 43,200 m². If the entire 240 m by 240 m scene is used, the
area is classified as one forest patch, with an area of 57,600 m².
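The scale effect in Figure 27 can be reproduced with a small sketch. The 8 by 8 mask below is a hypothetical layout with 20 forest pixels, not the pattern drawn in the figure, and labelling a block as forest when its tree cover is at least 10% is an assumption borrowed from the forestland definition.

```python
def forest_area(mask, block, pixel_size=30, min_cover=0.10):
    """Area (m²) labelled forest when block x block pixels are classified together."""
    rows, cols = len(mask), len(mask[0])
    forest_blocks = 0
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [
                mask[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            ]
            # A block counts as forest if its forest fraction reaches min_cover.
            if sum(cells) / len(cells) >= min_cover:
                forest_blocks += 1
    return forest_blocks * (block * pixel_size) ** 2


# Hypothetical 240 m x 240 m scene of 30 m pixels (1 = forest), 20 forest pixels.
MASK = [
    [1, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

area_30m = forest_area(MASK, block=1)
area_120m = forest_area(MASK, block=4)
area_240m = forest_area(MASK, block=8)
```

With this layout the mapped forest area grows from 18,000 m² at 30 m to 43,200 m² at 120 m and 57,600 m² at 240 m, the same three values as in the caption.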
Therefore, since our California map was made at the finest resolution (30 meters) among the
maps compared, it is reasonable that it yielded the lowest estimate of forest area.
In addition, we note that FIA data have a sampling intensity of only 1 plot per 6,000 acres, so
estimates of total forest area derived from these data have fairly large confidence intervals.
Furthermore, the classification of ‘low-density forest’ (20%-50%) and ‘woodland’
(10%-20%) in the error matrix of Table 22 was more inclined towards non-forest. A great
number of small patches of mixed pixels with such tree cover percentages were not classified as
forestland. Other error sources may include residual cloud after cloud masking,
misregistration of images, the re-projecting and resampling process, inconsistent surface
reflectance due to complex atmospheric conditions at acquisition, etc. (Lunetta et al., 1991).
5 Conclusions
In this study, we presented an approach to efficiently produce forest change maps for California
covering 25 years at a 5-year interval. Our achievements and findings are as follows:
† The total computer processing time was approximately 10 hours, of which 2 hours were for
preprocessing, 7 hours for clustering and 1 hour for post-processing;
† The overall accuracy of the map was 92.9% ± 1.6%;
† The estimated forest area changed from 28.20 ± 1.98 million acres to 28.05 ± 1.98
million acres;
† During 1986-2011, California’s forests experienced a loss of 92 thousand acres and a
recovery of 85 thousand acres per year, resulting in a net forest loss of seven thousand acres
per year (these numbers were not estimated using the error matrix and thus do not
exactly match the area estimates above);
† During 1986-2011, around 12% of the forestland (‘stable forest’, ‘forest loss’, ‘forest
gain’ and ‘forest loss and recovered’ combined) experienced changes, with 4% each
for ‘forest loss’, ‘forest gain’ and ‘forest loss and recovered’;
† Our estimate of forestland was approximately 29% of the land area in California, as
opposed to 40% by Laaksonen-Craig et al. (2003) and 33% by FIA (2014). We
attribute the disagreement to our purely data-driven methodology, which disregards
subjective administrative considerations, to the finer resolution, to the ambiguity of our
approach in dealing with mixed pixels (tree cover < 50%) and to other errors.
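As a rough consistency check on the figures listed above, the annual loss and recovery rates can be
netted, accumulated over the study period, and compared with the difference between the two area
estimates. The sketch below uses only the reported numbers; the variable names are illustrative,
and the 25-year span is taken as 2011 minus 1986.

```python
# Reported annual rates (thousand acres per year).
loss_per_year = 92
recovery_per_year = 85
years = 2011 - 1986  # 25-year study period

net_loss_per_year = loss_per_year - recovery_per_year
cumulative_net_loss = net_loss_per_year * years  # thousand acres

# Area estimates derived from the error matrix (million acres).
area_1986 = 28.20
area_2011 = 28.05
area_difference = round((area_1986 - area_2011) * 1000)  # thousand acres

print("net loss per year (thousand acres):", net_loss_per_year)
print("cumulative net loss (thousand acres):", cumulative_net_loss)
print("difference of area estimates (thousand acres):", area_difference)
```

The two routes give 175 and 150 thousand acres respectively. The gap is far smaller than the
± 1.98 million acre margin on either area estimate, which is consistent with the remark that the
two sets of numbers do not match exactly.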
In conclusion, our map gave a tighter estimate of forest cover and its changes over the 25
years, meaning that the mapped forest boundaries were closer to the real boundaries of tree cover.
We did not give the wildland-urban interface and urban forests special treatment, as we did by
masking out agricultural land, because our goal was to document all tree-covered areas, which may
also allow a better estimate of carbon sequestration in urban areas. Furthermore, by overlaying
urban areas on our forest change map, one could easily identify wildland-urban interfaces where
wildfire is a threat to property and human lives. Our forest change map can contribute to
monitoring the forestland in California at relatively low cost and without field visits, while
also showing when and where deforestation and afforestation occurred in the past. By
tracking forest changes, policy makers could regularly determine whether there has been a major
deforestation event (such as a devastating and long-lasting wildfire) that natural regeneration
could not offset.
However, our approach could be further improved. To reduce the computational cost, we only
used images acquired at five-year intervals. Without stable consecutive annual values over
multiple years, the probability trajectory interpretation is prone to false detection of
changes (Huang et al., 2010). Neither the number of
clusters nor the minimum number of samples in a stratum was large enough to reduce the
variance of the probabilities estimated for the strata, which made the trajectory interpretation
error-prone and threshold-sensitive. In addition, the change aspect of our map was not assessed
for accuracy because the record of aerial photos and ground surveys is incomplete. However, with
annual Landsat images we would be able to assess whether the changes took place in a recent
10-year period, since most of the aerial photos in Google Earth date back to the 1990s for
California, and then generalize the estimated errors to the earlier detected changes as well.
Given that the processing time for this study was around 10 hours, including all Landsat records
from 1984 to 2011 should increase the processing time to circa 70 hours. Processing on a cluster
of high-performance computers should reduce the time further. It is also possible to extend our
approach to a larger extent such as the entire United States or even the world. This would make
it possible to explore the record of every country’s forests back to 1984 and hence allow
decision makers in these countries to maintain or develop
better strategies for forest management.
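The sensitivity to stratum sample size noted above can be illustrated with the standard error of
a binomial proportion, sqrt(p(1 - p)/n). This is a generic statistical sketch, not the estimator
used in this study: it assumes that the forest probability of a stratum is estimated as the
fraction of n independent samples labeled forest, and the probability of 0.5 and the sample sizes
are hypothetical.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical stratum with a true forest probability of 0.5.
for n in (5, 20, 100):
    se = proportion_standard_error(0.5, n)
    print(f"n = {n:3d}  standard error = {se:.3f}")
```

With only a few samples per stratum, an estimated probability near a decision threshold can cross
it through sampling noise alone; quadrupling the sample size halves the standard error.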
Chapter 4 Conclusions and Perspectives
1 Summary of the Results
In the first chapter, a reliable semi-automatic algorithm for detecting mountain pine beetle
outbreaks based on Landsat image stacks from 2001 to 2011 for Grand County in Colorado was
developed. The algorithm was named Berkeley Indices Trajectory Extractor (BITE). Temporal
trajectories of multiple spectral indices were processed with unique techniques followed by
interpretation and integration. An overall accuracy of 94.7% for the classification of disturbance
types was achieved. BITE proved effective at distinguishing slow-onset disturbances from
rapid-onset disturbances. The spatial and temporal dispersal of the mountain pine beetle
outbreak that occurred in the study area during this time frame was accurately mapped. We
therefore expect the BITE algorithm to be suitable for detecting other disturbances
as well, as long as they cause forest canopy loss.
In the second chapter, our experiment with a subscene of Landsat Thematic Mapper (TM)
imagery suggests that CBEST improves speed considerably over conventional K-means
as the volume of data to be clustered increases. We assessed information loss and several
other factors. In addition, we evaluated the effectiveness of CBEST in mapping land cover/use
with the same image, which was acquired over Guangzhou City, South China, and with an AVIRIS
hyperspectral image over Tippecanoe County, Indiana. Using reference data, we assessed the
accuracies of both CBEST and conventional K-means and found that, in practice, CBEST was not
negatively affected by information loss during compression. We discussed potential
applications of the fast clustering algorithm in dealing with large datasets in remote sensing
studies.
In the third chapter, we efficiently produced a forest change map for the entire state of California
with a spatial resolution of 30 meters over 25 years. The mapping process took only 10 hours
on a computer with an Intel 2.8 GHz i5 CPU and 8 gigabytes of RAM. The
overall accuracy of the forest cover in year 2011 was 92.9% ± 1.6%. We found that
the estimated forest area changed from 28.20 ± 1.98 million acres to 28.05 ± 1.98 million acres
between 1986 and 2011. In particular, our rough estimate indicates that each year California’s
forests experienced a loss of 92 thousand acres and a recovery of 85 thousand acres, resulting in
a net forest loss of seven thousand acres per year. In addition, during the twenty-five-year
period from 1986 to 2011, around 12% of the forestland experienced changes (~0.5%/year), with 4%
(0.16%/year) each for deforestation, afforestation, and deforestation followed by recovery.
We concluded that the forestland in California had been managed in a sustainable manner over
the 25 years, since no significant directional changes were observed. This is an expected result
since forest management in California is regulated by the California Forest Practices Act. Our
approach made a tighter estimate of the true canopy coverage, such that 29% of land in California
is forestland, as opposed to the statistics of 33% and 40% from previous studies that had
lower spatial resolution and shorter temporal coverage.
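The per-year rates quoted above follow from dividing the 25-year percentages by the length of the
period. A minimal sketch of that arithmetic, with illustrative variable names:

```python
years = 25

total_changed_pct = 12.0  # share of forestland that changed, 1986-2011
per_class_pct = 4.0       # share in each of the three change classes

annual_total_pct = total_changed_pct / years      # quoted as ~0.5%/year
annual_per_class_pct = per_class_pct / years      # quoted as 0.16%/year

print(f"all change classes: {annual_total_pct:.2f}%/year")
print(f"each change class:  {annual_per_class_pct:.2f}%/year")
```

The three change classes at 4% each account for the full 12%, and 12% over 25 years is 0.48% per
year, which the text rounds to about 0.5%.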
2 Future Perspectives
BITE and CBEST are distinct algorithms. BITE is more noise-resistant and is capable of
differentiating between slow-onset and rapid-onset disturbances. CBEST is less
accurate, and the CBEST-based approach used for the California forest change mapping was not able
to detect slow-onset disturbances. However, CBEST was more computationally efficient than BITE:
within 10 hours, CBEST processed the entire state of California, while BITE took over 25 hours
to process Grand County, which is only 1% of the size of California. Because the data sources for
the two algorithms are identical, it is viable to integrate them for consistent mapping.
To take advantage of both algorithms for large-area mapping of forest
changes, a strategy for integrating them should be developed. The general
concept is that CBEST is used for an initial mapping of forest changes over
the entire mapping area and temporal coverage. BITE is then applied to the areas where
disturbances are detected, for the onset period of those disturbances. The initial change map from
CBEST can also be used as the input for BITE, so that an additional forest cover map is
no longer required and the analysis is not limited by past maps such as the NLCD and FIA forest
maps. In this way, the processing time for mapping a large area can be significantly reduced by
skipping the unchanged areas, while the detection of deforestation and afforestation is enhanced
by the more accurate BITE algorithm.
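The integration concept described above can be sketched as a two-stage routine in which a cheap
first pass flags candidate pixels and a costly second pass runs only on those. This is a minimal
illustration, not an implementation of CBEST or BITE: the function names, the toy detectors and
the linear scaling of processing time with area are all assumptions, and the 0.12 changed fraction
reuses the 12% of forestland reported as changed in the third chapter purely as an example value.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Pixel = Tuple[int, int]  # (row, column) index of a pixel


def two_stage_change_mapping(
    pixels: Iterable[Pixel],
    coarse_flags_change: Callable[[Pixel], bool],
    refine_label: Callable[[Pixel], str],
) -> Dict[Pixel, str]:
    """Run the costly refinement only on pixels flagged by the cheap first pass."""
    labels: Dict[Pixel, str] = {}
    for pixel in pixels:
        if coarse_flags_change(pixel):
            labels[pixel] = refine_label(pixel)
        else:
            labels[pixel] = "unchanged"
    return labels


def estimated_hours(coarse_hours: float, fine_hours_full_area: float,
                    changed_fraction: float) -> float:
    """Total time if the fine stage scales linearly with flagged area (assumption)."""
    return coarse_hours + fine_hours_full_area * changed_fraction


# Toy 10 x 10 scene; the first two rows stand in for pixels flagged as changed.
scene: List[Pixel] = [(r, c) for r in range(10) for c in range(10)]
refined: List[Pixel] = []


def toy_coarse(pixel: Pixel) -> bool:
    """Hypothetical stand-in for the fast first-pass change map."""
    return pixel[0] < 2


def toy_refine(pixel: Pixel) -> str:
    """Hypothetical stand-in for the slower, more detailed detector."""
    refined.append(pixel)  # record which pixels reach the costly stage
    return "disturbed"


labels = two_stage_change_mapping(scene, toy_coarse, toy_refine)
print(len(refined), "of", len(scene), "pixels refined")

# Reported figures: 10 h for the coarse pass over California, and over 25 h for
# the fine pass over an area 1% of that size (about 2500 h at full extent).
full_fine_hours = 25 * 100
combined = estimated_hours(10, full_fine_hours, changed_fraction=0.12)
print(round(combined), "h combined versus", full_fine_hours, "h fine-only")
```

Under these assumptions the combined run takes on the order of 310 hours instead of 2,500. The
actual saving would depend on how much of the area the first pass flags and on the length of the
image stack passed to the second stage.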
Nevertheless, BITE lacks afforestation detection and requires a pre-mapped forest cover mask.
It also requires training samples from the local region. It is currently unknown how well BITE
performs when applied to other areas without locally trained models, or over how large an area
local models remain applicable. The California forest change map produced with CBEST was not
evaluated for changes and is thus not yet appropriate for follow-up BITE refinement. More sampling
units and clusters should be added to lower the variance of the estimates. Annual or biannual
Landsat records are preferable for removing noise in the data and for increasing the reliability
of the change detection. We expect that by refining and combining the two algorithms developed in
this dissertation, we will be able to derive the finest and longest forest change records for any
place in the world with Landsat satellite coverage. At least for the entire United States of
America, with its well-georegistered image records, the task will be viable and significant.
Decision making in the future could be greatly influenced by a fine-resolution forest change map
with longer temporal coverage. Greater detail leads to more intelligent management of the
sustainability of forestland, which benefits both human society and the natural environment
worldwide.
References
Alsabti, K., Ranka, S., Singh, V., 1998. An Efficient K-means Clustering Algorithm. Proc. 1st
Workshop on High Performance Data Mining.
American Community Survey Office, 2013. American Community Survey Multiyear Accuracy of
the Data (3-year 2010-2012 and 5-year 2008-2012), URL:
http://www.census.gov/acs/www/Downloads/data_documentation/Accuracy/MultiyearACSAccu
racyofData2012.pdf, U.S. Census Bureau, Washington, DC (last date accessed: 2 Feb 2014).
Arino, O., Gross, D., Ranera, F., Bourg, L., Leroy, M., Bicheron, P., Latham, J., Di Gregorio, A.,
Brockman, C., Witt, R., Defourny, P., Vancutsem, C., Herold, M., Sambale, J., Achard, F.,
Durieux, L., Plummer, S., Weber, J.-L., 2007. Globcover. ESA Service for global land cover
from MERIS. IEEE International Geoscience and Remote Sensing Symposium, Barcelona,
Spain, 23-27 July, pp. 2412–2415.
AVIRIS image, 2013. AVIRIS image North-South flightline over Northwest Tippecanoe
County, Indiana, https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html. (Accessed
3 Feb, 2013)
Aukema, B. H., Carroll, A. L., Zhu, J., Raffa, K. F., Sickley, T. A., and Taylor, S. W., 2006.
Landscape level analysis of mountain pine beetle in British Columbia, Canada: spatiotemporal
development and spatial synchrony within the present outbreak, Ecography, 29(3): 427-441.
Aukema, B.H., Carroll, A.L., Zheng, Y., Zhu, J., Raffa, K.F., Moore, R.D., Stahl, K. and Taylor,
S.W., 2008. Movement of outbreak populations of mountain pine beetle: influences of
spatiotemporal patterns and climate. Ecography, 31(3): 348-358.
Bartholomé, E., Belward, A.S., 2005. GLC2000: a new approach to global land cover mapping
from Earth observation data. International Journal of Remote Sensing 26(9), 1959–1977.
Belluco, E., Camuffo, M., Ferrari, S., Modenese, L., Silvestri, S., Marani, A., Marani, M., 2006.
Mapping salt-marsh vegetation by multispectral and hyperspectral remote sensing. Remote
Sensing of Environment 105(1), 54–67.
Bentz, B. J., Logan, J. A., and Amman, G. D., 1991. Temperature-dependent development of the
mountain pine beetle (Coleoptera: Scolytidae) and simulation of its phenology, Canadian
Entomologist, 123(5): 1083-1094.
Bentz, B. J., Powell, J. A., and Logan, J. A., 1996. Localized spatial and temporal attack
dynamics of the mountain pine beetle in lodgepole pine, US Department of Agriculture, Forest
Service, Intermountain Research Station, (INT-RP-494).
Bentz, B. J., and Endreson, D., 2004. Evaluating satellite imagery for estimating mountain pine
beetle-caused lodgepole pine mortality: current status, Information Report, Pacific Forestry
Centre, Canadian Forest Service, (BC-X-399), pp. 154-163.
Bezdek, J.C., Ehrlich, R., Full, W., 1984. FCM: the Fuzzy c-Means clustering algorithm.
Computers and Geosciences 10(2-3), 191–203.
Boryan, C., Yang, Z., & Di, L. (2012, July). Deriving 2011 cultivated land cover data sets using
usda national agricultural statistics service historic cropland data layers. In Geoscience and
Remote Sensing Symposium (IGARSS), 2012 IEEE International (pp. 6297-6300). IEEE.
Bradley, P.S., Fayyad, U., Reina, C., 1998. Scaling Clustering Algorithms to Large Databases.
Proc. 4th Int'l Conf. Knowledge Discovery and Data Mining, pp. 9–15.
Brumby, S.P., Theiler, J.P., Bloch J.J., Harvey N.R., Perkins S.J., Szymanski J.J., Young, A.C.,
2002. Evolving land cover classification algorithms for multispectral and multitemporal imagery.
Proc. SPIE 4480, pp. 120–129.
Caldwell, M. K., Hawbaker, T. J., Briggs, J. S., Cigan, P. W., & Stitt, S., 2013. Simulated
impacts of mountain pine beetle and wildfire disturbances on forest vegetation composition and
carbon stocks in the Southern Rocky Mountains. Biogeosciences, 10(12): 8203-8222.
Carroll, A. L., Taylor, S. W., Régnière, J., and Safranyik, L., 2003. Effect of climate change on
range expansion by the mountain pine beetle in British Columbia, Mountain Pine Beetle
Symposium: Challenges and Solutions, 30-31 Oct 2003, Kelowna, British Columbia, (Natural
Resources Canada, Information Report BC-X-399, Victoria), pp. 223-232.
Celik, T., 2009. Unsupervised Change Detection in Satellite Images Using Principal Component
Analysis and K-means Clustering. IEEE Geoscience and Remote Sensing Letters 6(4), 772–776.
Chang, C. C., and Lin, C. J., 2011. LIBSVM: a library for support vector machines, ACM
Transactions on Intelligent Systems and Technology, 2(3): 1-27.
Chapman, T. B., Veblen, T. T., and Schoennagel, T., 2012. Spatiotemporal patterns of mountain
pine beetle activity in the southern Rocky Mountains, Ecology, 93(10): 2175-2185.
Chen, Y., & Gong, P. (2013). Clustering based on eigenspace transformation–CBEST for
efficient classification. ISPRS Journal of Photogrammetry and Remote Sensing, 83, 64-80.
Cohen, W. B., Yang, Z., and Kennedy, R., 2010. Detecting trends in forest disturbance and
recovery using yearly Landsat time series: 2. TimeSync—Tools for calibration and validation,
Remote Sensing of Environment, 114(12): 2911-2924.
Cole, W. E., and Amman, G. D., 1980. Mountain pine beetle dynamics in lodgepole pine forests,
Part I: Course of an infestation, General Technical Report, Intermountain Forest and Range
Experiment Station, USDA Forest Service, (INT-89).
Coppin, P. R., and Bauer, M. E., 1996. Digital change detection in forest ecosystems with remote
sensing imagery, Remote Sensing Reviews, 13(3-4): 207-234.
Crist, E. P., and Cicone, R. C., 1984. A physically-based transformation of Thematic Mapper
data---The TM Tasseled Cap, Geoscience and Remote Sensing, IEEE Transactions on, GE-22(3):
256-263.
Dai, X., and Khorram, S., 1998. The effects of image misregistration on the accuracy of remotely
sensed change detection, Geoscience and Remote Sensing, IEEE Transactions on, 36(5): 1566-
1577.
DeFries, R. S., Hansen, M. C., Townshend, J. R. G., Janetos, A. C., and Loveland, T. R., 2000. A
new global 1‐km dataset of percentage tree cover derived from remote sensing, Global Change
Biology, 6(2): 247-254.
Ding, C., He, X., 2004. K-means clustering via principal component analysis. Proc. of Int’l Conf.
Machine Learning (ICML 2004), pp. 225-232.
Edwards Jr, T. C., Moisen, G. G., & Cutler, D. R. (1998). Assessing map accuracy in a remotely
sensed, ecoregion-scale cover map. Remote Sensing of environment, 63(1), 73-83.
Eitzen, Z.A., Xu, K.-M., Wong, T., 2008. Statistical Analyses of Satellite Cloud Object Data
from CERES Part V: Relationships between Physical Properties of Marine Boundary Layer
Clouds. Journal of Climate 21(24), 6668–6688.
Ester, M., Kriegel, H.-P., Sander, J., Xu, X., 1996. A Density-Based Algorithm for Discovering
Clusters in Large Spatial Databases with Noise. Proc. 2nd Int’l Conf. Knowledge Discovery and
Data Mining (KDD-96), pp. 226–231.
Ester, M., Kriegel, H.-P., Xu, X., 1995. A Database Interface for Clustering in Large Spatial
Databases. Proc. First Int'l Conf. Knowledge Discovery and Data Mining (KDD-95), pp. 94–99.
FAO, 2010. Global Forest Resources Assessment 2010 Main Report. FAO.
Forest Inventory and Analysis, 2014. Forest Inventory and Analysis Fiscal Year 2013 Business
Report. FIA.
Frahling, G., Sohler, C., 2006. A fast K-means implementation using coresets. Proc. 22nd
symposium on Computational geometry (SoCG).
Franklin, S.E., Stenhouse, G.B., Hansen, M.J., Popplewell, C.C., Dechka, J.A., Peddle, D.R.,
2001. An integrated decision tree approach (IDTA) to mapping landcover using satellite remote
sensing in support of grizzly bear habitat analysis in the Alberta Yellowhead Ecosystem.
Canadian Journal of Remote Sensing 27(6), 579–592.
Franklin, J., Woodcock, C. E., & Warbington, R. (2000). Multi-attribute vegetation maps of
forest service lands in California supporting resource management decisions. Photogrammetric
Engineering and Remote Sensing, 66(10), 1209-1218.
Friedl, M. A., Sulla-Menashe, D., Tan, B., Schneider, A., Ramankutty, N., Sibley, A., & Huang,
X. (2010). MODIS Collection 5 global land cover: Algorithm refinements and characterization
of new datasets. Remote Sensing of Environment, 114(1), 168-182.
Fry, J., Xian, G., Jin, S., Dewitz, J., Homer, C., Yang, L., Barnes, C., Herold, N., and Wickham,
J., 2011. Completion of the 2006 National Land Cover Database for the Conterminous United
States, PE&RS, Vol. 77(9):858-864.
Fu, W., Chen, Y., Shi, M., Zhang, X., Xiao, X., Gong, P., 2013. The distribution and temporal
changes of surface cover color in China revealed by satellite based dynamic observation. Journal
of Remote Sensing. In press.
Funk, C.C., Theiler, J., Roberts, D.A., Borel, C.C., 2001. Clustering to improve matched filter
detection of weak gas plumes in hyperspectral thermal imagery. IEEE Transactions on
Geoscience and Remote Sensing 39(7), 1410–1420.
Girolami, M., 2002. Mercer Kernel Based Clustering in Feature Space. IEEE Transactions on
Neural Networks 13(3), 780–784.
Gong, P., 2012. Remote sensing of environmental change over China, a review. Chinese
Science Bulletin 57(22), 2793-2801.
Gong, P., Howarth, P.J., 1990. An assessment of some factors influencing multispectral land-
cover classification. Photogrammetric Engineering and Remote Sensing 56(5), 597–603.
Gong, P., Howarth, P.J., 1992. Frequency‐based contextual classification and gray‐level vector
reduction for land‐use identification. Photogrammetric Engineering and Remote Sensing 58(4),
423–437.
Gong, P., LeDrew, E. F., and Miller, J. R., 1992. Registration-noise reduction in difference
images for change detection, International Journal of Remote Sensing, 13(4): 773-779.
Gong, P., Wang, J., Yu, L., et al., 2013. Finer resolution observation and monitoring of global
land cover: first mapping results with Landsat TM and ETM+ data, International Journal of
Remote Sensing, 34(7):2607-2654.
Goodwin, N. R., Coops, N. C., Wulder, M. A., Gillanders, S., Schroeder, T. A., and Nelson, T.,
2008. Estimation of insect infestation dynamics using a temporal sequence of Landsat data,
Remote Sensing of Environment, 112(9): 3680-3689.
Goodwin, N. R., Magnussen, S., Coops, N. C., and Wulder, M. A., 2010. Curve fitting of time-
series Landsat imagery for characterizing a mountain pine beetle infestation, International
Journal of Remote Sensing, 31(12): 3263-3271.
Gordon, N.D., Norris, J.R., Weaver, C.P., Klein, S.A., 2005. Cluster analysis of cloud regimes
and characteristic dynamics of midlatitude synoptic systems in observations and a model. Journal
of Geophysical Research 110(D15), D15S17.
Guha, S., Rastogi, R., Shim, K., 1998. CURE: An efficient clustering algorithm for large
databases. Proc. ACM SIGMOD Int. Conf. Management of Data, pp. 73–84.
Han, K.-S., Champeaux, J.-L., Roujean, J.-L., 2004. A land cover classification product over
France at 1 km resolution using SPOT4/VEGETATION data. Remote Sensing of Environment
92(1), 52–66.
Hansen, M. C., DeFries, R. S., Townshend, J. R., and Sohlberg, R., 2000. Global land cover
classification at 1 km spatial resolution using a classification tree approach. International
Journal of Remote Sensing, 21(6-7): 1331-1364.
Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., Thau,
D., Stehman, S. V., Goetz, S. J., Loveland, T. R., Kommareddy, A., Egorov, A., Chini, L.,
Justice, C. O., and Townshend, J. R. G., 2013. High-resolution global maps of 21st-century
forest cover change. Science, 342(6160): 850-853.
Hastie, T., Tibshirani, R., and Friedman, J., 2009. The Elements of Statistical Learning: Data
Mining, Inference, and Prediction, Springer Science+Business Media, New York, NY, 745 p.
Helms, J. A., 1998. The dictionary of forestry. Bethesda: SAF and CABI publishing.
Homer, C.G., Ramsey, R.D., Edwards, T.C. Jr., Falconer, A., 1997. Landscape cover-type
modeling using a multi-scene Thematic Mapper mosaic. Photogrammetric Engineering &
Remote Sensing 63(1), 59–67.
Homer, C., Dewitz, J., Fry, J., Coan, M., Hossain, N., Larson, C., Herold, N., McKerrow, A.,
VanDrel, J. N., and Wickham, J., 2007. Completion of the 2001 National Land Cover Database
for the Conterminous United States, Photogrammetric Engineering & Remote Sensing, 73(4),
337-341.
Honey-Marie, C., Carroll, A. L., Lindgren, B. S., and Aukema, B. H., 2011. Incoming!
Association of landscape features with dispersing mountain pine beetle populations during a
range expansion event in western Canada, Landscape Ecology, 26(8): 1097-1110.
Hsu, C. W., Chang, C. C., and Lin, C. J., 2003. A practical guide to support vector classification.
URL: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf, National Taiwan University,
Taipei, Taiwan (last date accessed: 2 Feb 2014).
Huang, C., Song, K., Kim, S., Townshend, J. R., Davis, P., Masek, J. G., and Goward, S. N.,
2008. Use of a dark object concept and support vector machines to automate forest cover change
analysis, Remote Sensing of Environment, 112(3): 970-985.
Huang, C., Goward, S. N., Masek, J. G., Gao, F., Vermote, E. F., Thomas, N., Schleeweis, K.,
Kennedy, R. E., Zhu, Z., Eidenshink, J. C., and Townshend, J. R., 2009. Development of time
series stacks of Landsat images for reconstructing forest disturbance history, International
Journal of Digital Earth, 2(3): 195-218.
Huang, C., Goward, S. N., Masek, J. G., Thomas, N., Zhu, Z., and Vogelmann, J. E., 2010. An
automated approach for reconstructing recent forest disturbance history using dense Landsat time
series stacks, Remote Sensing of Environment, 114(1): 183-198.
Huete, A. R., Liu, H. Q., Batchily, K., and Van Leeuwen, W. J. D. A., 1997. A comparison of
vegetation indices over a global set of TM images for EOS-MODIS, Remote Sensing of
Environment, 59(3): 440-451.
Iverson, L. R., and Prasad, A. M., 1998. Predicting abundance of 80 tree species following
climate change in the eastern United States, Ecological Monographs, 68(4): 465-485.
Jensen, J.R., 2004. Introductory Digital Image Processing: A Remote Sensing Perspective, third
ed. Prentice Hall, New Jersey.
Jiao, L., Gong, M., Wang, S., Hou, B., Zheng, Z., Wu, Q., 2010. Natural and Remote Sensing
Image Segmentation Using Memetic Computing. IEEE Computational Intelligence Magazine
5(2), 78–91.
Jin, S., Yang, L., Danielson, P., Homer, C., Fry, J., and Xian, G. 2013. A comprehensive change
detection method for updating the National Land Cover Database to circa 2011. Remote Sensing
of Environment, 132: 159 – 175.
Jolliffe, I., 2002. Principal component analysis, second ed. Springer-Verlag, New York.
Kanungo, T., Mount, D., Netanyahu, N., Piatko, C., Silverman, R., Wu, A., 2002. An Efficient
K-means Clustering Algorithm: Analysis and Implementation. IEEE Trans. Pattern Anal. Mach.
Intell. 24(7), 881–892.
Keane, R. E., Morgan, P., and Menakis, J. P., 1994. Landscape Assessment of the Decline of
Whitebark pine (Pinus albicaulis) in the Bob Marshall Wilderness Complex, Montana, USA,
Northwest Science, 68(3): 213-229.
Kennedy, R. E., Yang, Z., and Cohen, W. B., 2010. Detecting trends in forest disturbance and
recovery using yearly Landsat time series: 1. LandTrendr—Temporal segmentation algorithms,
Remote Sensing of Environment, 114(12): 2897-2910.
Key, C. H., and Benson, N. C., 2005. Landscape assessment: Sampling and analysis methods,
FIREMON: Fire Effects Monitoring and Inventory System (D. C. Lutes, R. E. Keane, and J. F.
Caratti, Editors), USDA Forest Service, Rocky Mountain Research Station, Ogden, Utah,
General Technical Report, (RMRS-GTR-164).
Kohavi, R., 1995. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and
Model Selection., International Joint Conference on Artificial Intelligence, 20-25 August 1995,
Montreal, Quebec, Canada, pp. 1137-1145.
Kurz, W. A., Dymond, C. C., Stinson, G., Rampley, G. J., Neilson, E. T., Carroll, A. L., Ebata,
T., and Safranyik, L., 2008. Mountain pine beetle and forest carbon feedback to climate change,
Nature, 452(7190): 987-990.
Laaksonen-Craig, S., Goldman, G. E., & McKillop, W., 2003. Forestry, forest industry, and
forest products consumption in California. Oakland, CA: University of California, Division of
Agriculture and Natural Resources.
Lillesand, T.M., Kiefer, R.W., 1987. Remote Sensing and Image Interpretation, second ed.
Wiley, New Jersey.
Lloyd, S., 1982. Least Squares Quantization in PCM. IEEE Transactions on Information Theory
28(2), 129–137.
Logan, J. A., White, P., Bentz, B. J., and Powell, J. A., 1998. Model analysis of spatial patterns
in mountain pine beetle outbreaks, Theoretical Population Biology, 53(3): 236-255.
Logan, J. A., and Powell, J. A., 2001. Ghost forests, global warming, and the mountain pine
beetle (Coleoptera: Scolytidae), American Entomologist, 47(3): 160-173.
Loveland, T., Merchant, J., Brown, J., and Ohlen, D., 1991. Development of a land-cover
characteristics database for the conterminous U. S., Photogrammetric Engineering & Remote
Sensing, 57(11): 1453-1463.
Loveland, T.R., Reed, B.C., Brown, J.F., Ohlen, D.O., Zhu, Z., Yang, L., Merchant, J.W., 2000.
Development of a global land cover characteristics database and IGBP DISCover from 1 km
AVHRR data. International Journal of Remote Sensing 21(6), 1303–1330.
Lu, D., Mausel, P., Brondizio, E., and Moran, E., 2004. Change detection techniques,
International Journal of Remote Sensing, 25(12): 2365-2401.
Lunetta, R., Congalton, R., Fenstermaker, L., Jensen, J., Mcgwire, K., & Tinney, L. (1991).
Remote sensing and Geographic Information System data integration: error sources and research
issues. Photogrammetric engineering and remote sensing, 57(6), 677-687.
MacQueen, J.B., 1967. Some Methods for classification and Analysis of Multivariate
Observations. Proc. 5th Berkeley Symposium on Mathematical Statistics and Probability,
University of California Press, pp. 281–297.
Maness, H., Kushner, P. J., and Fung, I., 2013. Summertime climate response to mountain pine
beetle disturbance in British Columbia. Nature Geoscience, 6(1): 65-70.
Mas, J. F., 1999. Monitoring land-cover changes: a comparison of change detection techniques,
International Journal of Remote Sensing, 20(1): 139-152.
Masek, J. G., Vermote, E. F., Saleous, N., Wolfe, R., Hall, F. G., Huemmrich, F., Gao, F.,
Kutler, J., and Lim, T. K., 2012. LEDAPS Landsat Calibration, Reflectance, Atmospheric
Correction Preprocessing Code, Model product, URL: http://daac.ornl.gov, Oak Ridge National
Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee (last date accessed: 2 Feb
2014).
Matgen, P., El Idrissi, A., Henry, J.B., Tholey, N., Hoffmann, L., de Fraipont, P., Pfister, L.,
2006. Patterns of remotely sensed floodplain saturation and its use in runoff predictions.
Hydrological Processes 20(8), 1805–1825.
Mayaux, P., Achard, F., and Malingreau, J. P., 1998. Global tropical forest area measurements
derived from coarse resolution satellite imagery: a comparison with other approaches,
Environmental Conservation, 25(1): 37-52.
McFeeters, S. K., 1996. The use of the Normalized Difference Water Index (NDWI) in the
delineation of open water features, International Journal of Remote Sensing, 17(7): 1425-1432.
Meigs, G. W., Kennedy, R. E., and Cohen, W. B., 2011. A Landsat time series approach to
characterize bark beetle and defoliator impacts on tree mortality and surface fuels in conifer
forests, Remote Sensing of Environment, 115(12): 3707-3718.
Metz, B., 2001. Climate change 2001: mitigation: contribution of Working Group III to the third
assessment report of the Intergovernmental Panel on Climate Change (Vol. 3), Cambridge
University Press, Cambridge, United Kingdom, 758 p.
Mitchell, R. G., Waring, R. H., & Pitman, G. B., 1983. Thinning lodgepole pine increases tree
vigor and resistance to mountain pine beetle. Forest Science, 29(1): 204-211.
Muller, S.V., Racoviteanu, A.E., Walker, D.A., 1999. Landsat MSS-derived land-cover map of
northern Alaska: Extrapolation methods and a comparison with photo-interpreted and AVHRR-
derived maps. International Journal of Remote Sensing 20(15–16), 2921–2946.
Nagesh, H., Goil, S., Choudhary, A., 1999. MAFIA: Efficient and scalable subspace clustering
for very large data sets. Center for Parallel and Distributed Computing, NWU, Tech. Rep. 9906-
010.
Ng, A. Y., 1997. Preventing Overfitting of Cross-Validation Data, Proceedings of the Fourteenth
International Conference on Machine Learning, 8-12 July 1997, Nashville, Tennessee, pp. 245-
253.
Olofsson, P., Foody, G. M., Herold, M., Stehman, S. V., Woodcock, C. E., & Wulder, M. A.
(2014). Good practices for estimating area and assessing accuracy of land change. Remote
Sensing of Environment, 148, 42-57.
Ouma, Y., Ngigi, T.G., Tateishi, R.R., 2006. On the optimization and selection of wavelet
texture for feature extraction from high‐resolution satellite imagery with application towards
urban‐tree delineation. International Journal of Remote Sensing 27(1–2), 73–104.
Overpeck, J. T., Rind, D., & Goldberg, R. (1990). Climate-induced changes in forest disturbance
and vegetation. Nature (London), 343(6253), 51-53.
Parker, T. J., Clancy, K. M., and Mathiasen, R. L., 2006. Interactions among fire, insects and
pathogens in coniferous forests of the interior western United States and Canada, Agricultural
and Forest Entomology, 8(3): 167-189.
Peltonen, M., Liebhold, A. M., Bjørnstad, O. N., and Williams, D. W., 2002. Spatial synchrony
in forest insect outbreaks: roles of regional stochasticity and dispersal, Ecology, 83(11):
3120-3129.
Parmentier, B., & Eastman, J. R. (2014). Land transitions from multivariate time series: using
seasonal trend analysis and segmentation to detect land-cover changes. International Journal of
Remote Sensing, 35(2), 671-692.
Powell, S. L., Cohen, W. B., Kennedy, R. E., Healey, S. P., & Huang, C. (2014). Observation of
Trends in Biomass Loss as a Result of Disturbance in the Conterminous US: 1986–2004.
Ecosystems, 17(1), 142-157.
Raffa, K. F., and Berryman, A. A., 1983. The role of host plant resistance in the colonization
behavior and ecology of bark beetles (Coleoptera: Scolytidae), Ecological Monographs, 53(1):
27-49.
Reger, B., Otte A., Waldhardt, R., 2007. Identifying patterns of land-cover change and their
physical attributes in a marginal European landscape. Landscape and Urban Planning 81(1–2),
104–113.
Reilly, T.E., Dennehy, K.F., Alley, W.M., and Cunningham, W.L., 2008, Ground-Water
Availability in the United States: U.S. Geological Survey Circular 1323, 70 p., also available
online at http://pubs.usgs.gov/circ/1323/
Remund, Q.P., Long, D.G., Drinkwater, M.R., 2000. An iterative approach to multisensor sea ice
classification. IEEE Transactions on Geoscience and Remote Sensing 38(4), 1843–1856.
Richards J.A., Jia, X., 2005. Remote Sensing Digital Image Analysis: An Introduction, fourth ed.
Springer-Verlag, Berlin Heidelberg.
Roelfsema, C.M., Phinn, S.R., Dennison, W.C., 2002. Spatial distribution of benthic microalgae
on coral reefs determined by remote sensing. Coral Reefs 21(3), 264–274.
Rollet, R., Benie, G.B., Li, W., Wang, S., Boucher, J.-M., 1998. Image classification algorithm
based on the RBF neural network and K-means. International Journal of Remote Sensing 19(15),
3003–3009.
Ruefenacht, B., Finco, M. V., Nelson, M. D., Czaplewski, R., Helmer, E. H., Blackard, J. A.,
Holden, G. R., Lister, A. J., Salajanu, D., Weyermann, D., and Winterberger, K., 2008.
Conterminous US and Alaska forest type mapping using forest inventory and analysis data,
Photogrammetric Engineering & Remote Sensing, 74(11): 1379-1388.
Running, S. W., 2008. Ecosystem disturbance, carbon, and climate. Science, 321(5889): 652-
653.
Safranyik, L., and Whitney, H. S., 1985. Development and survival of axenically reared
mountain pine beetles, Dendroctonus ponderosae (Coleoptera: Scolytidae), at constant
temperatures, The Canadian Entomologist, 117(02): 185-192.
Safranyik, L., and Wilson, B., 2007. The mountain pine beetle: a synthesis of biology,
management and impacts on lodgepole pine, Natural Resources Canada, Canadian Forest Service,
Victoria, British Columbia, 317 p.
Sano, E.E., Ferreira, L.G., Asner, G.P., Steinke, E.T., 2007. Spatial and temporal probabilities of
obtaining cloud‐free Landsat images over the Brazilian tropical savannah. International Journal
of Remote Sensing 28(12), 2739–2752.
Shah, C.A., Arora, M.K., Varshney, P.K., 2004. Unsupervised classification of hyperspectral
data: an ICA mixture model based approach. International Journal of Remote Sensing 25(2),
481–487.
Shah, C.A., Varshney, P.K., Arora, M.K., 2007. ICA mixture model algorithm for unsupervised
classification of remote sensing imagery. International Journal of Remote Sensing 28(8), 1711–
1731.
Sheikholeslami, G., Chatterjee, S., Zhang, A., 1998. WaveCluster: A multiresolution clustering
approach for very large spatial databases. Proc. 24th VLDB Conf., pp. 428–439.
Shimamura, Y., Izumi, T., and Matsuyama, H., 2006. Evaluation of a useful method to identify
snow‐covered areas under vegetation–comparisons among a newly proposed snow index,
normalized difference snow index, and visible reflectance, International Journal of Remote
Sensing, 27(21): 4867-4884.
Singh, A., 1989. Review Article Digital change detection techniques using remotely-sensed data,
International Journal of Remote Sensing, 10(6): 989-1003.
Smith, W.B., Miles, P.D., Vissage, J.S., Pugh, S.A., 2002. Forest Resources of the United States,
General Tech Rep NC-241 (US Department of Agriculture, Forest Service, North Central
Research Station, St. Paul, MN).
Soja, A. J., Tchebakova, N. M., French, N. H., Flannigan, M. D., Shugart, H. H., Stocks, B. J., ...
& Stackhouse Jr, P. W. (2007). Climate-induced boreal forest change: predictions versus current
observations. Global and Planetary Change, 56(3), 274-296.
Song, C., Woodcock, C.E., Seto, K.C., Lenney, M.P., Macomber, S.A., 2001. Classification and
change detection using Landsat TM data: When and how to correct atmospheric effects. Remote
Sensing of Environment 75(2), 230–244.
Souza Jr, C. M., Siqueira, J. V., Sales, M. H., Fonseca, A. V., Ribeiro, J. G., Numata, I., ... &
Barlow, J. (2013). Ten-Year Landsat Classification of Deforestation and Forest Degradation in
the Brazilian Amazon. Remote Sensing, 5(11), 5493-5513.
Stehman, S. V., and Czaplewski, R. L., 1998. Design and analysis for thematic map accuracy
assessment: fundamental principles, Remote Sensing of Environment, 64(3): 331-344.
Stewart, S. I., Radeloff, V. C., & Hammer, R. B. (2006). The wildland-urban interface in the
United States. The public and wildland fire management: Social science findings for managers,
197-202.
Stone, T.A., Schlesinger, P., Houghton, R.A., Woodwell, G.M., 1994. A map of the vegetation of
South America based on satellite imagery. Photogrammetric Engineering & Remote Sensing
60(5), 541–551.
Theiler, J.P., Gisler, G., 1997. Contiguity-enhanced K-means clustering algorithm for
unsupervised multispectral image segmentation. Proc. SPIE 3159, 108–118.
Thorne, K., Markham, B., Barker, P. S., and Biggar, S., 1997. Radiometric calibration of
Landsat, Photogrammetric Engineering & Remote Sensing, 63(7): 853-858.
Trzcinski, M. K., and Reid, M. L., 2008. Effect of management on the spatial spread of mountain
pine beetle (Dendroctonus ponderosae) in Banff National Park, Forest Ecology and
Management, 256(6): 1418-1426.
Tsagaris, V., Anastassopoulos, V., Lampropoulos, G.A., 2005. Fusion of hyperspectral data
using segmented PCT for color representation and classification. IEEE Transactions on
Geoscience and Remote Sensing 43(10), 2365–2375.
Tucker, C. J., 1979. Red and photographic infrared linear combinations for monitoring
vegetation. Remote Sensing of Environment, 8(2): 127-150.
USDA Forest Service, 2001. U.S. Forest Facts and Historical Trends. FS-696. Washington, DC:
USDA Forest Service.
USDA Forest Service, 2012. Future of America’s Forests and Rangelands: Forest Service 2010
Resources Planning Act Assessment. Gen. Tech. Rep. WO-87. Washington, DC. 198 p.
U.S. Department of Commerce, 2010. Census 2010. U.S. Gazetteer Files at
http://www.census.gov/geo/maps-data/data/gazetteer2010.html.
U.S. Bureau of Economic Analysis, 2013. Widespread Economic Growth in 2012, news release
(June 6, 2013), http://www.bea.gov/newsreleases/regional/gdp_state/2013/pdf/gsp0613.pdf.
Viovy, N., 2000. Automatic Classification of Time Series (ACTS): A new clustering method for
remote sensing time series. International Journal of Remote Sensing 21(6–7), 1537–1560.
Vogelmann, J.E., S.M. Howard, L. Yang, C. R. Larson, B. K. Wylie, and J. N. Van Driel, 2001,
Completion of the 1990’s National Land Cover Data Set for the conterminous United States,
Photogrammetric Engineering and Remote Sensing 67:650-662.
Vogelmann, J. E., Tolk, B., and Zhu, Z. (2009). Monitoring forest changes in the southwestern
United States using multitemporal Landsat data. Remote Sensing of Environment, 113(8), 1739-
1748.
Vose, J. M., Peterson, D. L., Patel-Weynand, T., 2012. Effects of climatic variability and change
on forest ecosystems: a comprehensive science synthesis for the U.S. forest sector. Gen. Tech.
Rep. PNW-GTR-870. Portland, OR: U.S. Department of Agriculture, Forest Service, Pacific
Northwest Research Station. 265 p.
Wang, W., Yang, J., Muntz, R.R., 1997. STING: A Statistical Information Grid Approach to
Spatial Data Mining. Proc. 23rd VLDB Conf., pp. 186–195.
Westerling, A. L., Hidalgo, H. G., Cayan, D. R., & Swetnam, T. W. (2006). Warming and earlier
spring increase western US forest wildfire activity. Science, 313(5789), 940-943.
Westerling, A. L., & Bryant, B. P. (2008). Climate change and wildfire in California. Climatic
Change, 87(1), 231-249.
Wickham, J. D., Stehman, S. V., Fry, J. A., Smith, J. H., and Homer, C. G., 2010. Thematic
accuracy of the NLCD 2001 land cover for the conterminous United States, Remote Sensing of
Environment, 114(6): 1286-1296.
Wilson, E. H., and Sader, S. A., 2002. Detection of forest harvest type using multiple dates of
Landsat TM imagery, Remote Sensing of Environment, 80(3): 385-396.
Woodcock, C.E., Collins, J., Gopal, S., Jakabhazy, V.D., Li, X., Macomber, S., Ryherd, S.,
Harward, V.J., Levitan, J., Wu, Y., Warbington, R., 1994. Mapping forest vegetation using
Landsat TM imagery and a canopy reflectance model. Remote Sensing of Environment 50(3),
240–254.
Wulder, M.A., Franklin, S.E., White, J.C., 2004. Sensitivity of hyperclustering and labelling land
cover classes to Landsat image acquisition date. International Journal of Remote Sensing 25(23),
5337–5344.
Zarco-Tejada, P.J., Ustin, S.L., Whiting, M.L., 2005. Temporal and Spatial Relationships
between Within-Field Yield Variability in Cotton and High-Spatial Hyperspectral Remote
Sensing Imagery, Agronomy Journal 97(3), 641–653.
Zha, H., Ding, C., Gu, M., He, X., Simon, H.D., 2001. Spectral Relaxation for K-means
Clustering. Neural Information Processing Systems 14, Vancouver, Canada, 3-8 Dec, pp. 1057–
1064.
Zhang, L., Small, G.W., 2002. Automated detection of chemical vapors by pattern recognition
analysis of passive multispectral infrared remote sensing imaging data. Applied Spectroscopy
56(8), 1082–1093.
Zhang, R., Rudnicky, A., 2002. A large scale clustering scheme for kernel K-means. Int’l Conf.
Pattern Recognition (ICPR02), pp. 289–292.
Zhang, T., Ramakrishnan, R., Livny, M., 1997. BIRCH: A New Data Clustering Algorithm and
Its Applications. Data Mining and Knowledge Discovery 1(2), 141–182.
Zharikov, Y., Skilleter, G.A., Loneragan, N.R., Taranto, T., Cameron, B.E., 2005. Mapping and
characterising subtropical estuarine landscapes using aerial photography and GIS for potential
application in wildlife conservation and management. Biological Conservation, 125(1), 87–100.
Zhong, Y., Zhang, L., Huang, B., Li, P., 2006. An unsupervised artificial immune classifier for
multi/hyperspectral remote sensing imagery. IEEE Transactions on Geoscience and Remote
Sensing 44(2), 420–431.
Zhou, Q., Robson, M., 2001. Automated rangeland vegetation cover and density estimation using
ground digital images and a spectral-contextual classifier. International Journal of Remote
Sensing, 22(17), 3457–3470.
Zhu, Z., and Evans, D. L., 1994. US forest types and predicted percent forest cover from
AVHRR data, Photogrammetric Engineering and Remote Sensing, 60(5): 525-531.
Zhu, Z., and Woodcock, C. E., 2012. Object-based cloud and cloud shadow detection in Landsat
imagery, Remote Sensing of Environment, 118: 83-94.
Zhu, Z., Woodcock, C. E., and Olofsson, P., 2012. Continuous monitoring of forest disturbance
using all available Landsat imagery, Remote Sensing of Environment, 122: 75-91.