media control - autonomous learning systems · 2013. 7. 23. · 4.2.2.2 multi-input–multi-output...

AutonomousLearning Systems

Plamen AngelovFrom Data Streams to Knowledge in Real-time

Autonomous Learning Systems

Autonomous LearningSystemsFrom Data Streams to Knowledgein Real-time

Plamen AngelovLancaster University, UK

A John Wiley & Sons, Ltd., Publication

This edition first published 2013© 2013 John Wiley & Sons, Ltd

Registered officeJohn Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to applyfor permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with theCopyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, ortransmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise,except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission ofthe publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print maynot be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brandnames and product names used in this book are trade names, service marks, trademarks or registeredtrademarks of their respective owners. The publisher is not associated with any product or vendormentioned in this book. This publication is designed to provide accurate and authoritative information inregard to the subject matter covered. It is sold on the understanding that the publisher is not engaged inrendering professional services. If professional advice or other expert assistance is required, the servicesof a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data:

Angelov, Plamen P.Autonomous learning systems : from data streams to knowledge in real-time / Plamen P. Angelov.

pages cmIncludes bibliographical references and index.ISBN 978-1-119-95152-0 (cloth)

1. Self-organizing systems. 2. Machine learning. I. Title.Q325.A54 2013006.3′1–dc23

2012025907

A catalogue record for this book is available from the British Library.

ISBN: 978-1-119-95152-0

Set in 10/12.5pt Palatino by Aptara Inc., New Delhi, India

http://www.wiley.com

Contents

Forewords xi

Preface xix

About the Author xxiii

1 Introduction 11.1 Autonomous Systems 31.2 The Role of Machine Learning in Autonomous Systems 41.3 System Identification – an Abstract Model of the Real World 6

1.3.1 System Structure Identification 61.3.2 Parameter Identification 81.3.3 Novelty Detection, Outliers and the Link

to Structure Innovation 91.4 Online versus Offline Identification 91.5 Adaptive and Evolving Systems 101.6 Evolving or Evolutionary Systems 111.7 Supervised versus Unsupervised Learning 131.8 Structure of the Book 14

PART I FUNDAMENTALS

2 Fundamentals of Probability Theory 192.1 Randomness and Determinism 202.2 Frequentistic versus Belief-Based Approach 222.3 Probability Densities and Moments 232.4 Density Estimation – Kernel-Based Approach 262.5 Recursive Density Estimation (RDE) 282.6 Detecting Novelties/Anomalies/Outliers using RDE 322.7 Conclusions 36

vi Contents

3 Fundamentals of Machine Learning and Pattern Recognition 373.1 Preprocessing 37

3.1.1 Normalisation and Standardisation 383.1.2 Orthogonalisation of Inputs/Features – rPCA Method 39

3.1.2.1 The Basics of the PCA Method 393.1.2.2 Offline PCA 403.1.2.3 Online (Recursive) Version of PCA, rPCA 41

3.2 Clustering 423.2.1 Proximity Measures and Clusters Shape 443.2.2 Offline Methods 46

3.2.2.1 A Brief Introduction to the MountainClustering Method 46

3.2.2.2 Subtractive Clustering Method Outline 473.2.2.3 Gustafson–Kessel Clustering Algorithm 483.2.2.4 Mean Shift Clustering Algorithm 49

3.2.3 Evolving Clustering Methods 493.2.3.1 Incremental VQ Clustering Method 503.2.3.2 Evolving Clustering Algorithm eClustering 503.2.3.3 Evolving Local Means (ELM) Clustering

Algorithm 513.2.3.4 Evolving GK-like Algorithm 52

3.3 Classification 563.3.1 Recursive LDA, rLDA 58

3.4 Conclusions 58

4 Fundamentals of Fuzzy Systems Theory 614.1 Fuzzy Sets 614.2 Fuzzy Systems, Fuzzy Rules 64

4.2.1 Fuzzy Systems of Zadeh–Mamdani Type 654.2.1.1 Linguistic Terms and Variables 654.2.1.2 Inference and Defuzzification 66

4.2.2 Takagi–Sugeno Fuzzy Systems 664.2.2.1 Architecture of Takagi–Sugeno Fuzzy Systems 664.2.2.2 Multi-Input–Multi-Output (MIMO) Takagi–Sugeno

Fuzzy Systems 684.2.2.3 Analysis of the Inference in Takagi–Sugeno

Fuzzy Systems 694.3 Fuzzy Systems with Nonparametric Antecedents (AnYa) 69

4.3.1 Architecture 704.3.2 Analysis of AnYa 72

4.4 FRB (Offline) Classifiers 734.5 Neurofuzzy Systems 75

4.5.1 Neurofuzzy System Architecture 754.5.1.1 TS Type NFS 764.5.1.2 AnYa Type NFS 78

Contents vii

4.5.2 Evolving NFS 784.5.3 Linguistic Interpretation of the NFS 79

4.6 State Space Perspective 794.7 Conclusions 81

PART II METHODOLOGY OF AUTONOMOUS LEARNING SYSTEMS

5 Evolving System Structure from Streaming Data 855.1 Defining System Structure Based on Prior Knowledge 855.2 Data Space Partitioning 86

5.2.1 Regular Partitioning of the Data Space 875.2.2 Data Space Partitioning through Clustering 875.2.3 Data Space Partitioning Based on Data Clouds 885.2.4 Importance of Partitioning the Joint Input–Output

Data Space 895.2.5 Principles of Data Space Partitioning for Autonomous

Machine Learning 915.2.6 Dynamic Data Space Partitioning – Evolving System Structure

Autonomously – Example of Fish Classification 915.2.6.1 Data Space Partitioning Based on Prior Knowledge 925.2.6.2 Regular Data Space Partitioning 955.2.6.3 Data Space Partitioning through Clustering 955.2.6.4 Data Space Partitioning through Clouds 95

5.3 Normalisation and Standardisation of Streaming Datain an Evolving Environment 965.3.1 Standardisation in an Evolving Environment 975.3.2 Normalisation in an Evolving Environment 98

5.4 Autonomous Monitoring of the Structure Quality 985.4.1 Autonomous Input Variables Selection 985.4.2 Autonomous Monitoring of the Age of the Local Submodels 1015.4.3 Autonomous Monitoring of the Utility of the Local Submodels 1025.4.4 Update of the Cluster Radii 103

5.5 Short- and Long-Term Focal Points and Submodels 1045.6 Simplification and Interpretability Issues 1055.7 Conclusions 107

6 Autonomous Learning Parameters of the Local Submodels 1096.1 Learning Parameters of Local Submodels 1106.2 Global versus Local Learning 1116.3 Evolving Systems Structure Recursively 1136.4 Learning Modes 1166.5 Robustness to Outliers in Autonomous Learning 1186.6 Conclusions 118

viii Contents

7 Autonomous Predictors, Estimators, Filters, Inferential Sensors 1217.1 Predictors, Estimators, Filters – Problem Formulation 1217.2 Nonlinear Regression 1237.3 Time Series 1247.4 Autonomous Learning Sensors 125

7.4.1 Autonomous Sensors – Problem Definition 1257.4.2 A Brief Overview of Soft/Intelligent/Inferential Sensors 1267.4.3 Autonomous Intelligent Sensors (AutoSense) 1277.4.4 AutoSense Architecture 1287.4.5 Modes of Operation of AutoSense 1297.4.6 Autonomous Input Variable Selection 130

7.5 Conclusions 131

8 Autonomous Learning Classifiers 1338.1 Classifying Data Streams 1338.2 Why Adapt the Classifier Structure? 1348.3 Architecture of Autonomous Classifiers of the Family AutoClassify 135

8.3.1 AutoClassify0 1368.3.2 AutoClassify1 137

8.3.2.1 Multiple (m) Two-Class Classification Problems 1388.3.2.2 AutoClass1 MIMO 138

8.4 Learning AutoClassify from Streaming Data 1398.4.1 Learning AutoClassify0 1398.4.2 Learning AutoClassify1 139

8.5 Analysis of AutoClassify 1408.6 Conclusions 140

9 Autonomous Learning Controllers 1439.1 Indirect Adaptive Control Scheme 1449.2 Evolving Inverse Plant Model from Online Streaming Data 1459.3 Evolving Fuzzy Controller Structure from Online Streaming Data 1479.4 Examples of Using AutoControl 1489.5 Conclusions 153

10 Collaborative Autonomous Learning Systems 15510.1 Distributed Intelligence Scenarios 15510.2 Autonomous Collaborative Learning 15710.3 Collaborative Autonomous Clustering, AutoCluster by

a Team of ALSs 15810.4 Collaborative Autonomous Predictors, Estimators, Filters and

AutoSense by a Team of ALSs 15910.5 Collaborative Autonomous Classifiers AutoClassify by a Team of

ALSs 16010.6 Superposition of Local Submodels 16110.7 Conclusions 161

Contents ix

PART III APPLICATIONS OF ALS

11 Autonomous Learning Sensors for Chemical andPetrochemical Industries 16511.1 Case Study 1: Quality of the Products in an Oil Refinery 165

11.1.1 Introduction 16511.1.2 The Current State-of-the-Art 16611.1.3 Problem Description 16711.1.4 The Dataset 16711.1.5 AutoSense for Kerosene Quality Prediction 16811.1.6 AutoSense for Abel Inflammability Test 171

11.2 Case Study 2: Polypropylene Manufacturing 17211.2.1 Problem Description 17211.2.2 Drift and Shift Detection by Cluster Age Derivatives 17611.2.3 Input Variables Selection 177

11.3 Conclusions 178

12 Autonomous Learning Systems in Mobile Robotics 17912.1 The Mobile Robot Pioneer 3DX 17912.2 Autonomous Classifier for Landmark Recognition 180

12.2.1 Corner Detection and Simple Mapping of an IndoorEnvironment through Wall Following 182

12.2.2 Outdoor Landmark Detection Based on Visual InputInformation 185

12.2.3 VideoDiaries 18912.2.4 Collaborative Scenario 190

12.3 Autonomous Leader Follower 19312.4 Results Analysis 196

13 Autonomous Novelty Detection and Object Tracking in Video Streams 19713.1 Problem Definition 19713.2 Background Subtraction and KDE for Detecting Visual Novelties 198

13.2.1 Background Subtraction Method 19813.2.2 Challenges 199

13.2.2.1 Illumination Changes 19913.2.2.2 Shadows and Reflection 20013.2.2.3 Occlusions and Camouflage 20113.2.2.4 Nonstatic Background and Camera Oscillations 201

13.2.3 Parametric versus Nonparametric Approaches 20113.2.4 Kernel Density Estimation Method 202

13.3 Detecting Visual Novelties with the RDE Method 20313.4 Object Identification in Image Frames Using RDE 20413.5 Real-time Tracking in Video Streams Using ALS 20613.6 Conclusions 209

x Contents

14 Modelling Evolving User Behaviour with ALS 21114.1 User Behaviour as an Evolving Phenomenon 21114.2 Designing the User Behaviour Profile 21214.3 Applying AutoClassify0 for Modelling Evolving User Behaviour 21514.4 Case Studies 216

14.4.1 Users of UNIX Commands 21614.4.2 Modelling Activity of People in a Smart Home Environment 21814.4.3 Automatic Scene Recognition 219

14.5 Conclusions 221

15 Epilogue 22315.1 Conclusions 22315.2 Open Problems 22715.3 Future Directions 227

APPENDICESAppendix A Mathematical Foundations 231

A.1 Probability Distributions 231A.2 Basic Matrix Properties 233

Appendix B Pseudocode of the Basic Algorithms 235B.1 Mean Shift with Epanechnikov Kernel 235B.2 AutoCluster 236B.3 ELM 237B.4 AutoCluster 239B.5 AutoSense 240B.6 AutoClassify0 240B.7 AutoClassify1 241B.8 AutoControl 243

References 245

Glossary 259

Index 263

Forewords

Adrian Stoica

Efficient and robust performance in imperfectly known, nonstationary, environ-ments – and this characterizes the vast majority of real-world applications – requiressystems that can improve themselves, transcending their initial design, continuouslyoptimizing their parameters, models, and methods. These improvements come pre-dominantly from learning – about the environment, about the ageing self, about theinteractions with, and within, the environment, and from the ability to put this learn-ing to use. Batch learning – or at least repeated updating learning from most recentbatches, is sufficient only for a limited number of applications. For other applica-tions learning needs to be incremental, to sample level, a learn-or-perish, or at leastlearn-or-pay (a hefty price) situation. In particular, real-time learning is most criticalfor bots, virtual or real, agents of the cyberphysical systems that need the agility toswiftly react to virus attacks, or physical robots exposed to hazards while performingsearch and rescue in disaster areas, or dealing with what is, for now, a largely unpre-dictable partner: the human. The fast advancement in autonomous systems makes thesubject of real-time autonomous learning critically important, and yet the literatureaddressing this topic is extremely scarce.

Dr Angelov’s pioneering book addresses this problem at its core, focusing on real-time, online learning from streaming data on a sample-by-sample basis. It offers abasic framework for understanding and for designing such systems. It importantlycontributes to a more powerful learning, not only of the parameters but of a betterstructure as well. Conventional approaches are characterized by the fact that thesystem structure (model) is determined in the beginning, by human designers ofthe system, and only parameters are learned from the interaction. The entire modelidentification–learning process can, however, be posed as an optimization problem,as the author points out, and this includes automatically determining the optimalstructure in conjunction with the optimal parameters for it. This is done automaticallyin the methods described in the book, and constitutes a significant and valuablecontribution. The system is continuously evolving, not in the evolutionary (genetic)sense of improvement over generations, but continuously perfecting its development.

xii Forewords

A valuable contribution of the book is that it offers a high-level, holistic perspectiveof the field, which helps both students and expert practitioners better comprehend theinterplay of various disciplines involved in learning autonomous systems, as diverseas adaptive control and evolutionary algorithms, offering analogies between differentdisciplines and referring to the equivalency of the concepts characterized by differentterminology in different disciplines. It is not meant to be a comprehensive referenceof concepts and methods in the field, the author instead paints the landscape withselected brush strokes that allows the viewer to see the forest without getting lostin seeing the trees. It is a work that charters a new field, innovates in methods toadvance into it, and outlines new challenges to be addressed by future explorers.

The selected concepts and methods, a good number of which come from the au-thor’s own prior work, are used in the second part of the book to illustrate the im-plementation of learning in autonomous systems. Concepts such as that of evolvingclusters, ‘age’ of an (evolving) local submodel, and methods such as recursive densityestimation (RDE) introduced by the author, showing significant improvement overthe state-of-the art, are important additions to the arsenal of tools for real-time learn-ing. In particular, I believe that adaptive, self-learning (evolving) classifiers will playa fundamental role in future autonomous learning systems.

The book’s last part is a review of three applications: autonomous learning sensorsfor chemical and petrochemical applications, autonomous learning in mobile robots,and autonomous novelty detection and object tracking in video systems. These di-verse domains illustrate the general applicability of the set of methodology presentedin the book and focused on the main theme of this work: the real-time autonomousonline learning from data streams.

The field of autonomous learning systems is destined to play an increasingly im-portant role in most systems that will surround us in cyberphysical space. Convert-ing information in data streams, larger and larger, to actionable knowledge, in realtime: this is the great challenge ahead, and this book is an important step towardsaddressing it.

Adrian StoicaJet Propulsion Laboratory

California Institute of TechnologyJune 2012

Vladik Kreinovich

In many practical situations, we have experts who are skilled in doing certain tasks:expert medical doctors are skilled in diagnosing and curing diseases, professionaldrivers are skilled in driving cars – in particular, driving them in difficult traffic and/orweather conditions, etc. It is desirable to incorporate the knowledge of these topexperts in an automatic system that would help other users perform the correspondingtasks – and, ideally, perform these tasks automatically.

Forewords xiii

Experts are usually willing to share their knowledge, but the difficulty is thatin many situations, experts describe their knowledge by using imprecise (“fuzzy”)words from natural language, like “small”. For example, an expert driver rarelydescribes his or her experience in precise terms like “if the car in front slows down by10 km/h and it is at a distance of 10 meters, you should press the brake for 0.6 secondswith a force of 2.7 Newtons”; most probably, the rule described by an expert driveris “if the car in front of you is close, and it suddenly slows down some, then youshould brake right away”. In this rule, “close” and “some” are imprecise terms: whileeveryone would agree that, say 100 meters is not close while 5 meters is close, therewill not be a precise threshold so that before this threshold the distance is close, anda 1 cm larger distance is not close.

To describe such imprecise (fuzzy) knowledge in computer-understandable preciseterms, Professor Lotfi A. Zadeh invented, in the 1960s, a new approach called fuzzylogic. Zadeh’s ideas led to revolutionary changes in many control situations: from thefirst successful control applications in the 1970s through the fuzzy control boom inthe 1990s – when fuzzy-controlled washing machines, camcorders, elevators, trainswere heavily promoted and advertised – to the current ubiquity of fuzzy controllers.Just like nowadays computers are ubiquitous – companies no longer brag aboutcomputer chips in their cars, since all the cars have such chips – similarly, fuzzycontrol is ubiquitous: for example, in many cars, automatic transmission systems usefuzzy control.

The existing fuzzy controllers are very successful, but they have a serious limitation:they do not learn. Once the original expert rules are implemented, these same rules areused again and again, even when it becomes clear that the rules need to be updated.We still need an expert to update these rules.

There are, of course, numerous intelligent systems that can learn, such as artificialneural networks, but from the viewpoint of the user, these systems are “black boxes”:we may trust them, but we cannot easily understand the recommendations. In con-trast, fuzzy rules, by definition, are formulated in terms of understandable rules. Ifwe could make fuzzy systems themselves learn, make them automatically update therules – this would combine the clarity of fuzzy rules with the autonomous learningability of neural networks. This would make learning fuzzy controllers even moreefficient – and therefore, even more widely used. This would lead to a second revolutionin intelligent control.

And this revolution is starting. This book, by Dr. Plamen Angelov, one of theworld’s leading specialists in learning fuzzy systems, is the first book that sum-marizes the current techniques and successes of autonomously learning fuzzy (andother) systems – techniques mostly developed by Dr. Angelov himself, often in col-laboration with other renowned fuzzy researchers (like Dr. Ronald Yager). Some ofthese techniques have previously appeared in technical journals and proceedings ofinternational conferences, some appear here for the first time.

Ideas are many, it is difficult to describe them all in a short preface, so let us justgive a few examples. The first example is an interesting AnYa algorithm invented byAngelov and Yager (Anya is also a Russian short form of Anna (Anne)). In fuzzy logic,

xiv Forewords

each fuzzy term like “small” is described by a membership function, i.e. a function thatassigns, to each possible value x, the degree μ(x) from the interval [0, 1] to which thisvalue is small. The value μ(x) = 1 means that x is absolutely small, every expert wouldagree to this; μ(x) = 0 means that x is definitely not small, while values between 0and 1 represent the expert’s uncertainty.

In the traditional fuzzy control algorithms, we select a finite-parametric family ofmembership functions – e.g., functions that are of triangular, trapezoid, or Gaussianshapes – and adjust parameters of these functions based on the expert opinions. As aresult, sometimes, the resulting membership functions provide a rather crude and notvery accurate description of the expert knowledge. To improve the situation, AnYadoes not limit the shape of the membership function. Instead, it uses all the valuex1, . . . , xn, that the expert believes to be satisfying the property (like “small”), anddefines the desired membership function by formalizing the statement “x is closeto x1 or x is close to x2, . . .”. Now, all we need to do is describe what expertsmean by “close” (and by “or”), and we will no only have a well-shaped member-ship function, we will also have a way to update its shape when new observationsappear.

A similar idea can be implemented in probabilistic terms, when we use probabilitydensity functions (pdf) ρ(x) instead of membership functions, but the authors showa clear computational advantage of their fuzzy approach: A pdf is normalized bythe condition that the total probability is 1: ∫ρ(x)dx = 1, so we need to go through acomputationally intensive process of renormalize all its values every time we updateone value of ρ(x). In contrast, a membership function is usually normalized by thecondition that max

xμ(x) = 1. Thus, if we change a value of the membership function,

we only need to renormalizing other values when the changed value is μ(x) = 1 – andthis happens rarely.

Similar ideas are used to automatically decide how to adjust the rule’s conclu-sions, when to subdivide the original rule into two subrules – that would providea more subtle description of actions, when to dismiss the old data that is no longerrepresentative of the system’s inputs, etc.

Researchers and practitioners who have been using fuzzy techniques will defi-nitely benefit from learning how to make fuzzy systems learn automatically (punintended :-). But this book is not only for them. Readers who are not familiar withthe current fuzzy techniques will also greatly benefit: the book starts with a nice in-troduction that explains, in popular terms, what is fuzzy, and why and how we canuse fuzzy techniques. (Some math is needed – but math taught to engineers is quiteenough.)

This book is not just for the academics, practitioners will surely benefit. In the lastpart of the book, numerous applications are described in detail, providing the readerwith an understanding of how these new methods can be used in practical situations.It may be a good idea to glance through these exciting applications first, this will givethe readers an excellent motivation to grind through all the formulas and algorithmsin the main part of the book.

Forewords xv

Applications include learning sensors for chemical and petrochemical industries –industries where the chemical contents of the input (such as oil) changes all thetime, and intelligent adjustments need to be constantly made. Another successfulapplication example is mobile robotics, where the robot’s ability to learn how tonavigate in a new environment – and learn fast – is often crucial for the robot’smission. The new methods have also been applied to novelty detection and objecttracking in video streams, to wireless sensor networks, and to many other challengingapplication areas.

The second revolution – of making intelligent control systems fast learners – hasstarted. Its preliminary results are already exciting. This book will definitely helppromote the ideas of this second revolution – and thus, further improve its methodsand use these improved methods to solve numerous challenging problems of today.

Vladik KreinovichPresident

North American Fuzzy Information Processing Society (NAFIPS), El Paso, TexasAugust 2012

Arthur Kordon

One of the most significant changes during the twenty-first century is the fast dy-namics in almost all components of life. Economic, social, and financial systems, toname a few, are moving more of their activities to a real-time mode of operation.Extracting knowledge from data streams becomes as important as was informationretrieval from data bases several decades ago. The need for fast adaptation to un-known conditions, due to the new complex nature of the global economy, is anotherbig challenge in operating the systems in the twenty-first century.

Unfortunately, the existing classical modeling techniques, based on first principles,statistics, system identification, etc., cannot deliver satisfying solutions adequate tothe new fast dynamics. Adaptive systems are limited to models with a fixed struc-ture and linear relationships. Some recent computational intelligence methods withnonlinear and adaptive capabilities, such as evolutionary computation and swarmintelligence, are too slow for real-time operation.

A potential solution to the new needs is the fast-growing research area of evolv-ing intelligent systems. They offer a system that simultaneously learns its structureand calculates its parameters “on the fly” from data streams. An advantage of thisapproach is the simplicity and very low memory requirement of the used algorithms,which makes them appropriate for real time. In addition, the algorithms are universal(i.e. can be applied in various areas with no or minor changes), with minimal numberof tuning parameters, and in the case of evolving fuzzy systems – the suggested mod-els are interpretable by the users. Autonomous learning systems (ALS) is the ultimatesolution of an evolved intelligent system that integrates the broadest possible range

xvi Forewords

of algorithms and requires minimal human intervention. The research area of ALShas grown significantly in terms of publications, conference presence, and fundingsupport. An impressive feature of this emerging technology is its fast applicability indifferent areas, such as inferential sensors, mobile robotics, video streams processing,defence applications, etc.

Unfortunately, the literature for ALS is mostly available in journal or conferencepapers. Some recently published books on evolving intelligent systems and evolv-ing fuzzy systems give a generic overview and some guidelines on the differentcomponents of the technology. However, researchers and practitioners need a bookthat describes in sufficient details the foundation of ALS, offers software for inves-tigating the key algorithms in a popular environment, such as Matlab, and givesappropriate application examples. Autonomous Learning Systems is the first book onthe market that fills this need.

Purpose of the Book

The purpose of the book is to give the reader a comprehensive view of the currentstate of the art of ALS. The key topics of the book are:

1. What are the fundamentals of ALS? The first main topic of the book focuses on theambitious task of describing the diverse research foundation of ALS. The keymethods, such as probability theory, machine learning and pattern recognition,clustering, and fuzzy system theory are presented at a level of detail sufficient forunderstanding the ALS mechanisms.

2. How to develop an ALS? The second key topic of the book is the description of themethodology of ALS. Its main focus is on presenting the key steps in developingof an ALS from streaming data, such as dynamic data space partitioning, normal-ization and standardisation, autonomous monitoring of the structure quality, andautonomous learning parameters of the local submodels.

3. How and where to apply ALS in practice? The third key topic of the book coversthe implementation issues and application areas of ALS. It includes an overviewof the potential application areas, such as autonomous predictors, estimators, fil-ters, and inferential sensors; autonomous learning classifiers; autonomous learningcontrollers and collaborative autonomous learning systems. Of special interest arethe results from several real-world applications of ALS, such as inferential sen-sors in the chemical and petrochemical industries, ALS in mobile robotics, andautonomous novelty detection and object tracking in video streams.

Who this Book is for?

Due to the wide potential application areas, the targeted audience is much broaderthan the traditional scientific communities in computer science, data mining, and

Forewords xvii

engineering. The readers who can benefit from this book are described below:

� Academics – This group includes a large class of academics in the fields of computerscience, data mining, and engineering who are not familiar with the research andtechnical details of this new field. They will benefit from the book by using it as anintroduction to the field, exploring the described algorithms, and understandingits broad application potential.

� Students – Undergraduate and graduate students can benefit from the book byunderstanding this new field. The book could be a basis for a graduate course onthis topic.

� Industrial researchers – This group includes scientists in industrial labs who createnew solutions for the busines. They will benefit from the book by understandingthe value of this new emerging technology in delivering novel solutions in the areaof real-time modelling based on data streams.

� Governmental agencies – ALS have almost unlimited potential in various militaryapplications and space exploration projects. This book can be used by the govern-mental decision makers as an introduction to the technology and even may opennew application areas.

� Software vendors – This group includes vendors of process monitoring and controlsystems, data-mining software, robotic systems, etc. They will benefit from the bookby understanding a new emerging technology, exploring the described algorithms,and defining new application areas, related to data streams.

Features of the Book

The key features that differentiate this book from other titles on learning and evolvingsystems are:

1. A systematic description of autonomous learning systems – One of the most valuablefeatures of the book is the systematic and comprehensive way of presenting relatedmethods. It gives the reader a solid basis for understanding the research foundationof ALS, which is of critical importance for introducing this emerging technologyto a broad audience.

2. A detailed methodology for development of autonomous learning systems – Anotherimportant topic of the book is the step-by-step description of the key algorithms,used in the proposed technology. This allows the reader to easily implement andexplore the large potential of ALS on a simulation level.

3. A broad range of autonomous learning systems applications – The third key feature ofthe book is the impressive list of described real-world applications across severalapplication areas. It illustrates one of the unique advantages of ALS – the fasttransition from theory to practice.

Arthur KordonLake Jackson, Texas

March, 2012

xviii Forewords

Lawrence O. Hall

Today’s data-driven world has many large streams of data, for example a day’s worthof images posted to the internet. Imagine the task of finding all instances, perturbedor not, of an image. It would require learning and adaptation as the way images areplaced and modified evolve. This book provides an informative snapshot of how tobuild autonomous learning systems for data streams. It covers the math you need,probability, normalization, fuzzy systems, the basics of machine learning and patternrecognition, clustering, feature selection and more. It shows how to do learning in anautonomous setting and how to tune out noise/outliers.

Sensor learning, autonomous classifier learning, collaborative learning, learningcontrols systems in an autonomous way are all covered in an interesting way. Thereare case studies given to show how it all fits together. These include object trackingin videos and autonomous learning robots. It is clear that we need adaptable robotscapable of modifying their behaviour. This is an ambitious, early book that coversrobot learning and the basics for much more autonomous learning. It is well worthperusing for those interested in this broad subject.

Lawrence O. HallDept. of Computer Science and Engineering, University of South Florida, USA

May 16, 2012

Preface

This book comes as a result of focused research and studies for over a decade inthe emerging area that is on the crossroads of a number of well-known and well-established disciplines, such as (Figure 1):

� machine learning (ML);� system engineering (specifically, system identification), SI;� data mining, DM;� statistical analysis, SA;� pattern recognition including clustering, classification, PR;� fuzzy logic and fuzzy systems, including neurofuzzy systems, FL;� and so on.

On the one hand, there is a very strong trend of innovation of all of the abovewell-established branches of research that is linked to their online and real-time ap-plication; their adaptability, flexibility and so on (Liu and Meng, 2004; Pang, Ozawaand Kasabov, 2005; Leng, McGuinty and Prasad, 2005). On the other hand, a verystrong driver for the emergence of autonomous learning systems (ALS) is industry, es-pecially defence and security, but also aerospace and advanced process industries,

ML

SI

DM

SA

PR

FL

ALS

Figure 1 Autonomous learning systems theory is build upon other well-establishedareas of research (the list is, of course, not exhaustive)

xx Preface

the Internet, eHealth (assisted living), intelligent transport, and so on. The demand indefence was underpinned recently by a range of multimillion research and develop-ment projects funded by DARPA, USA (notably, two Grand Challenge competitions(Buehler, Iagnemma and Singh, 2010)); by MoD and BIS, UK (Defence TechnologyCentre on Systems Engineering and Autonomous Systems; ASTRAEA and GAMMAmultimillion programmes, in which the author has played the research providerrole, being the technical lead for several projects) and similar programmes in otherEuropean countries (France, Sweden, Spain, Czech Republic, Russia), and Israel.Major global companies have established their own programmes, such as IBM’sautonomous computing initiative (IBM, 2009) and BT’s intelligent network of au-tonomous elements (Detyenecki and Tateson, 2005). The International Neural Net-work Society (INNS) has established in 2011 a section on Autonomous MachineLearning of which the author is a founding member, together with scientists such asBernard Widrow – the father of the famous least mean squares (LMS) algorithm.

This book attempts to address these challenges with a systematic approach that canbe seen as laying the foundations of what can become a fast-growing area of researchthat can underpin a range of technological applications so needed by industry andsociety. The author does not claim that this represents a finished and monolithictheory; this is rather a catalyser for future developments, an inoculum, a vectorpointing the direction rather than a full solution of the problems.

An important aim of preparing this book was also to make it a one-stop shopfor students, researchers, practicing engineers, computer specialists, defence andindustry experts and so on that starts with the motivation, presents the conceptof the approach, describes details of the theoretical methodology based on a rig-orous mathematical foundation, presents a wide range of applications, and moreimportantly, provides illustrations and algorithms that can be used for further re-search. The software (subject to a license) can be downloaded from the author’s website: http://www.lancs.ac.uk/staff/angelov/Downloads.htm. It is covered by USApatent # 2010-0036780, granted 21 August 2012 (priority date 1 Nov. 2006) and twopending patent applications and distributed by the spin-out company of LancasterUniversity called EntelSenSys Ltd. (www.entelsensys.com). From the same web sitethere will also be available for the readers for this book a set of lecture notes thatwill be a useful tool for delivering specialised short courses or an advanced Masterlevel module as a part of various related programmes that cover the topics of ma-chine learning, pattern recognition, control systems, computational intelligence, datamining, systems engineering, and so on.

The book was initially planned at the end of 2006 during the very successful IEEESymposium on Evolving Fuzzy Systems held in Ambleside in the Lake District, UKbut the turn of events (as usually happens) postponed its appearance by more thanfive years, which gave an opportunity for the concepts to mature and evolve furtherand new results and applications to be added.

It would not have become a reality without the support of the colleagues and col-laborators, students, associates and visitors of the author. In the hope not to misssomeone this includes Prof. Ronald Yager (Iona College, NY, USA), Dr. Dimitar Filev

http://www.lancs.ac.uk/staff/angelov/Downloads.htm

http://www.entelsensys.com

Preface xxi

(Ford, MI, USA), Prof. Nikola Kasabov (Auckland University, New Zealand), Prof.Fernando Gomide (University of Campinas, Brazil), Dr. Xiaowei Zhou (HW Com-munications, UK), Dr. Jose Antonio Iglesias (University Carlos III, Madrid, Spain),Dr. Jose Macias Hernandez (CEPSA, Tenerife, Spain), Dr. Arthur Kordon (The DowChemical, TX, USA), Dr. Edwin Lughofer (Johan Kepler University, Linz, Austria),Prof. Igor Skrjanc (University of Ljubljana, Slovenia), Prof. Frank Klawonn (OstfaliaUniversity of Applied Sciences, Germany), Mr. Jose Victor Ramos (University ofCoimbra, Portugal), Dr. Ana Cara Belen (University of Granada), Mr. Javier Andreu(Lancaster University), Mr. Pouria Sadeghi-Tehran (Lancaster University), Mr. DenisKolev (Rinicom Ltd.), Mrs. Rashmi Dutta Baruah (Lancaster University), Mr. RaminRamezani (Imperial College, London), Mr. Julio Trevisan (Lancaster University), andmany more.

The feedback on the manuscript by Professor Vladik Kreinovich (University ofTexas, USA) who is also President of the North American Fuzzy Information Pro-cessing Society (NAFIPS); Dr. Adrian Stoica, Senior Research Scientist at the Au-tonomous Systems Division, NASA Jet Propulsion Laboratory, Pasadena, CA, USA,Dr. Arthur Kordon, Team Leader at Dow Chemical, TX, USA; as well as from Dr. LarryHall, Distinguished Professor and Chair at the Department of Computer Science andEngineering, University of South Florida, USA was also instrumental to improve andsmooth out the presentation and remove some omissions.

Plamen AngelovNovember 2009–May 2012

Lancaster, UK

About the Author

The author, Dr Plamen Angelov, is a Reader in Computational Intelligence and coor-dinator of the Intelligent Systems Research Area at Infolab21, Lancaster University,UK. He is a Senior Member of the IEEE and of INNS (International Neural NetworksSociety) and Chair of the Technical Committee on Evolving Intelligent Systems, Sys-tems, Man and Cybernetics Society, IEEE. He is also a member of the UK AutonomousSystems National Technical Committee and a founding member of the Centre of Ex-cellence in CyberSecurity officially recognised by UK GCHQ for the period 2012–2017.

He has authored or coauthored over 160 peer-reviewed publications in leadingjournals (50+) peer-reviewed conference proceedings, three patents, two researchmonographs, half a dozen edited books, and has an active research portfolio in the areaof computational intelligence and autonomous system modelling, identification, andmachine learning. He has internationally recognised pioneering results into onlineand evolving methodologies and algorithms for knowledge extraction in the form ofhuman-intelligible fuzzy rule-based systems and autonomous machine learning.

Dr. Angelov is a very active researcher leading numerous projects (over fifteenin the last five to six years) funded by UK and EU research councils, industry, HMGovernment, including UK Ministry of Defence (total funding of the order of tens ofmillions pounds with well over £1M for his group alone). His research contributes tothe competitiveness of the industry, defence and quality of life and was recognised by‘The Engineer Innovation and Technology 2008 Award in two categories: i) Aerospaceand Defence and ii) The Special Award.

Dr. Angelov is also the founding Editor-in-Chief of Springer’s journal on Evolv-ing Systems and Associate Editor of the leading international scientific journals inthis area, including IEEE Transactions on Systems, Man and Cybernetics, IEEE Trans-actions on Fuzzy Systems, Elsevier’s Fuzzy Sets and Systems, Soft Computing, Journalon Automation, Mobile Robotics and Intelligent Systems Journal on Advances in Air-craft and Spacecraft Science and several others. He also chairs annual conferencesorganised by IEEE on Evolving and Adaptive Intelligent Systems, will be Generalco-Chair of the prime conferences on neural networks (IJCNN-2013, Dallas, Texas,August, Texas and fuzzy systems, FUZZ-IEEE-2014, June 2014, Beijing, China andon Cybernetics, CYBCO-2013, Lausanne, Switzerland), acted as Visiting Professor

xxiv About the Author

(in Brazil, Germany, Spain) regularly gives invited and plenary talks at leading con-ferences, universities and companies More information can be found at his web sitewww.lancs.ac.uk/staff/angelov.

The evolving face of the author (who is, of course an autonomous learning andevolving system himself) can be seen below:

http://www.lancs.ac.uk/staff/angelov

1Introduction

The main differentiator of the new generation of autonomous systems that is emerg-ing in the twenty-first century is the adaptivity of their intelligence. They are notsimply automatic (usually remote) control devices, not only adaptive control systemsin the narrow sense of systems with tunable parameters as in the last decades of thepast century, but they are rather systems with a certain level of evolving intelligence.While conventional adaptive techniques (Astroem and Wittenmark, 1989) are suit-able to represent objects with slowly changing parameters, they can hardly handlecomplex (usually, nonlinear, nonstationary) systems with multiple operating modesor abruptly changing characteristics since it takes a long time after every drasticchange in the system to update model parameters. The evolving systems paradigm(Angelov, 2002) is based on the concept of evolving (expanding or shrinking) systemstructure that is capable of adapting to the changes in the environment and internalchanges of the system itself that cannot solely be represented by parameter tuning/adjustment.

Evolving intelligent systems (eIS) the concept of which was pioneered recently(Angelov, 2002; Kasabov, 2002; Angelov and Kasabov, 2005; Kasabov and Filev, 2006,Jager, 2006), develop their structure, their functionality, and their internal knowl-edge representation through autonomous learning from data streams generated by the(possibly unknown) environment and from the system self-monitoring. They often(but not necessarily) use as a framework of implementation fuzzy rule-based (FRB)and neurofuzzy (NF) or neural-network (NN) based systems and machine learningas a tool for training. Alternative frameworks (such as conventional multimodel sys-tems, decision trees, probabilistic, e.g. Markov, mixture Gaussian models, etc.) canalso be explored as viable frameworks of eIS and autonomous learning systems.

It should be noted that the physical embodiments of such systems can rangefrom micro-systems-on chip (Everett and Angelov, 2005), motes of a wireless sen-sor network (Andreu, Angelov and Dutta Baruah, 2011), mobile robots (Zhou and

Autonomous Learning Systems: From Data Streams to Knowledge in Real-time, First Edition. Plamen Angelov.C© 2013 John Wiley & Sons, Ltd. Published 2013 by John Wiley & Sons, Ltd.

2 Autonomous Learning Systems: From Data Streams to Knowledge in Real-time

Angelov, 2007; Liu and Meng, 2004; Kanakakis, Valavanis and Tsourveloudis, 2004)to unmanned airborne vehicles (Valavanis, 2006) and computer-controlled industrialprocesses (Filev, Larson and Ma, 2000; Macias-Hernandez et al., 2007).

The potential of these systems for industry was acknowledged by leading re-searchers with a solid industrial background such as Dr. A. Kordon, R&D Leader,Dow Chemical, TX, USA who said in 2006 “Evolving Intelligent Systems have ahigh potential for implementation in industry” (http://news.lancs.ac.uk/Web/News/Pages/930389757F5B0BF4802571FB003CB1A2.aspx); Dr. D. Filev, Senior technicalStaff at Ford R&D, Dearborn, MI, USA who also in 2006 said “Embedded soft com-puting applications are the natural implementation area of evolving systems as one of themain tools for design of real time intelligent systems” (same web reference as above).

The problem of automatic design of computationally intelligent systems for mod-elling, classification, time-series prediction, regression, clustering from data has beensuccessfully addressed during the previous century by a range of techniques such asby gradient-based techniques (as in the neurofuzzy approach ANFIS (Jang, 1993)), bygenetic/evolutionary algorithms (Fogarty and Munro, 1996; Angelov and Buswell,2003), by using partitioning by clustering (Babuska, 1998), learning by least squares(LS) techniques and so on. But, these approaches were assuming all data to be knownin advance (a batch or offline mode of learning) and were not directly applicable todata streams.

At the same time, the twenty-first century is confronting us with a range of newchallenges that require completely new approaches. As John Naisbitt famously said“today we are drowning in information but starved for knowledge” (Naisbitt, 1988).We are in the midst of an information revolution witnessing an exponential growthof the quantity and the rate of appearance of new information by; Internet users,consumers, finance industry, sensors in advanced industrial processes, autonomoussystems, space and aircraft, and so on.

It is reported that every year more than 1 Exabyte (=1018 bytes) of data aregenerated worldwide, most of it in digital form (http://news.bbc.co.uk/2/hi/technology/4079417.stm). Toshiba recently coined the phrase ‘digital obesity’ to illus-trate the ever-growing amount of data that are generated, transmitted and consumedby the users today. In this ocean of data the useful information and knowledge veryoften is difficult to extract in a clear and comprehensive, human-intelligible form. Theavailability of convenient-to-use and efficient methods, algorithms, techniques, andtools that can assist in extracting knowledge from the data (Martin, 2005) is a pressingdemand at individual and corporate level, especially if this can be done online, inreal time.

The new challenges that cannot be successfully addressed by the existing tech-niques, especially in their complexity and interconnection, can be summarisedas follows:

i. to cope with huge amounts of data;ii. to process streaming data online and in real time;

iii. to adapt to the changing environment and data pattern autonomously;

http://news.lancs.ac.uk/Web/News/Pages/930389757F5B0BF4802571FB003CB1A2.aspx

http://news.lancs.ac.uk/Web/News/Pages/930389757F5B0BF4802571FB003CB1A2.aspx

http://news.bbc.co.uk/2/hi/technology/4079417.stm

http://news.bbc.co.uk/2/hi/technology/4079417.stm

Introduction 3

iv. to be computationally efficient (that means, to use recursive, one-pass, nonitera-tive approaches);

v. to preserve the interpretability and transparency in a dynamic sense.

To address these new challenges efficient approaches are needed that deal with datastreams (Domingos and Hulten, 2001), not just with batch sets of data (Fayyad et al.,1996), detect, react and take advantage of concept shift and drift in the data streams(Lughofer and Angelov, 2011). Efficient collaborative and interactive schemes arealso needed for a range of applications in process industries (for self-calibrating, self-maintaining intelligent sensors of the new generation), in autonomous systems androbotics (for systems that have self-awareness, replanning and knowledge summari-sation capabilities), in multimedia and biomedical applications, to name a few.

Autonomous learning (AL) is understood in this book in the context of both systemstructure and system parameters. This means that the overall process of design, devel-opment, redesign/update, adaptation, use and reuse of such systems is autonomous,including the stages of the system design that traditionally assume heavy humaninvolvement and are normally done offline (the system being designed not in real timein which the process that is using this system runs in). Therefore, our understandingof AL and our concept of eIS is intricately related to the concepts of online and real-timesystem structure and parameter design and exploitation and to data streams ratherthan to data sets. This is the main differentiator in comparison with the traditionaldisciplines.

1.1 Autonomous Systems

Autonomous systems are often seen as the physical embodiments of machine intel-ligence. The concept of autonomous systems (AS) is not new and is closely relatedto AI and cybernetics, but became more popular during the last decade or so mainlydue to the interest (and funding) from the defence and aerospace industries. AS aresignificantly different from simple automatic control systems, ACS (Astroem andWittenmark, 1989). In fact, each AS has at its lower level (Layer 1) an ACS, usu-ally, for motion control, for control of the sensors and actuators, and so on. An AS,however, has also important upper layers in its architecture (Figure 1.1) that concernperceptions-behaviours (layer 2 that also corresponds to structure identification inAL systems) and the representation of the environment (usually in a form of rulebase, states, or a map, but not necessarily limiting to these forms of representation)in the model and self-monitoring functions (layer 3 that is linked to the prediction).

AS can be seen as a fusion of computationally enabled sensor platforms(machines/devices) that possess the algorithms (respectively, the software) neededto empower the systems with evolving intelligence that is manifested through inter-action with the outside environment and self-monitoring.

Examples include, but are not limited to unmanned airborne systems, UAS, un-manned ground-based vehicles, UGVs (Figure 1.3), and so on.

4 Autonomous Learning Systems: From Data Streams to Knowledge in Real-time

Perceptions

Sensors

Behaviours

Actuators

Self-monitoring

environment

info goals

Layer 3

Layer 2

Layer 1

IncreasingAbstraction

IncreasingDetail

Data Actions

Model

Figure 1.1 A three-layer structure of an autonomous system. (layer 1 – low-level directcontrol, including teleoperation; layer 2 – a more abstract, behavioural autonomy,specific tasks; layer 3, often called deliberate autonomy – the upper abstract layer ofmodelling the environment and self-monitoring)

1.2 The Role of Machine Learning in Autonomous Systems

The core functionality of an AS depends on the ability to be aware of the environment(through data streams generated by the sensors) and to make decisions accordingly.Obviously, such decisions cannot be made on the basis of a preprogrammed logicbecause this will assume a full knowledge of all the environments in which thesystem will operate and will not be flexible enough. Therefore, core elements of anyAS are self-monitoring and self-adaptation. Autonomous learning and extracting newknowledge as well as updating the existing knowledge base are vitally important forsuch type of systems.

The dependence between autonomy and learning is a two-way process – on theone hand, autonomous systems require learning in order to be aware of, explore, andadapt to the dynamic environment; on the other hand, learning algorithms requireautonomy to make them independent of human involvement. The lack or low levelof autonomy in most of the currently existing algorithms leads to the need to developnew generations of AL systems (ALS) to play an important role in the design andmaintenance of autonomous systems (e.g. UAV, UGV, intelligent/soft sensors, etc.).A system (however well may it have been designed) that is not empowered by anautonomous machine learning capability will fail helplessly in a situation that was

media control - autonomous learning systems · 2013. 7. 23. · 4.2.2.2 multi-input–multi-output...

Documents