data driven model

Upload: hemant

Post on 17-Feb-2018


  • 7/23/2019 Data Driven Model

    1/46

    Tutorial on Data-Driven Modeling in Water Resource and Environmental Engineering Using Matlab

    Feb 2014

    Waqar S. Qureshi

    Teaching Associate, Asian Institute of Technology

    February 11, 2014

    Waqar Qureshi (AIT) Modeling for WREE February 11, 2014 1 / 33


    Outline

    1 Major challenges for water engineers
    2 System modeling for water engineering
        What is modeling
        Types of modeling
        Data-driven models
        Statistical models
        Soft computing model
        Spatio-temporal complexity
        Type of data
        General approach to develop a data-driven model
    3 Regression-based models
        Introduction
        Regression model application
        Linear regression
        Linear-regression-example-1
        Linear-regression-example-2
    4 Matlab Tutorial


    Today's major challenges for water engineers include:

    Securing water resources for people

    Protecting vital ecosystems

    Dealing with variability and uncertainty of water in space and time


    System modeling for water engineering: What is modeling

    The term model refers to the tools, software, and programs used to represent real-world systems.

    Modeling a system makes it possible to predict its behavior and its response to changing factors.


    System modeling for water engineering: Types of modeling

    A physical model is a rescaled copy of the actual system, for example, a dam model.

    A mathematical model is based on mathematical logic, knowledge, and equations.

    Figure: classification of models


    System modeling for water engineering: Applications and complexity

    Water and environmental engineers can apply system modeling in many applications, such as:

    Simulation of natural phenomena

    Synthetic data generation

    Forecasting and warning of extreme events

    Developing decision-making rules

    Modeling a system in the field of water engineering is difficult because of:

    The physical complexity of natural phenomena.

    The time-consuming process of analyzing the different components of the system.


    System modeling for water engineering: Data-driven models

    Models that simulate a system from experimental data of that system are known as data-driven models.

    Data-driven models enable us to map causal factors to consequent outcomes from the observed patterns (experimental data), without a deep understanding of the complex physical process.

    The purposes of data-driven modeling in water engineering can include the following:

    Data classification and clustering.

    Extreme value prediction, with emphasis on floods and droughts.

    Water quality simulation and prediction.

    Extending the length of hydroclimatological data from the historical records.

    Modeling water balance concerning different components of a hydrological system.


    System modeling for water engineering: Data-driven models

    For a complex system, data-driven models are inexpensive, accurate, precise, and flexible in contrast to their physical or analytical counterparts.

    Unlike analytical modeling, data-driven models can be used for problems where we have little information about the intrinsic complexity of the phenomenon.

    Two groups of data-driven models are:

    Statistical modeling

    Soft computing (artificial intelligence)


    System modeling for water engineering: Statistical model

    A statistical model comprises random and deterministic variables. Deterministic variables are defined by a mathematical model and use a set of equations to generate data, while a random variable is represented by a probabilistic model, for example a probability density function, to generate data.

    Probabilistic models can be parametric or non-parametric.

    A parametric model can be described by parameters such as its mean and variance.

    Non-parametric models rely on looser assumptions, such as the nearest-neighbor approach.


    System modeling for water engineering: Soft computing model

    In soft computing, the system is modeled using fuzzy logic, neuro-computing, and genetic algorithms.

    It is tolerant of imprecision, uncertainty, partial truth, and approximation.

    The role model for soft computing is the human mind.

    Example: a suitable room temperature to make people feel comfortable!


    System modeling for water engineering: Spatio-temporal complexity

    Model complexity can also be classified in a spatial and temporal manner.

    The spatial and temporal characteristics of a model are essential for studying the effects of dynamic changes in natural phenomena on the system.

    The spatial complexity of a model can be characterized as lumped, semi-distributed, or distributed.

    Let us take runoff modeling as an example to understand the spatial complexity of models.


    System modeling for water engineering: Lumped models

    Lumped modeling methods were used because of complex data collection methods and software limitations.

    Lumped models are still useful for producing flood guidance. They require less input data and less computational power than more modern methods.

    Figure: Spatial complexity for runoff model


    System modeling for water engineering: Semi-distributed models

    Semi-distributed modeling is a variation of the lumped method and is sometimes referred to as a pseudo-distributed approach. Using this approach, a basin is broken down into smaller sub-basins. Runoff amounts from methods such as the unit hydrograph are used to estimate stream flow from each of these sub-basins.


    System modeling for water engineering: Distributed models

    A truly distributed model is one that represents processes in a gridded manner.

    Each cell has its own parameters, allowing for its own stream flow estimates.

    If the data for a cell are not available, they must somehow be estimated, which introduces uncertainty.


    System modeling for water engineering: Temporal complexity

    Data-driven models can be static or dynamic.

    A runoff model is dynamic if its parameters change as it receives new information, and is considered static if it relies only on historical data.

    In summary, the purpose of modeling is an essential criterion for selecting a model, and it determines the model's complexity, development time, runtime, accuracy, and precision.

    Modeling also depends upon the type of data that is available, and the time required to acquire it.


    System modeling for water engineering: Type of data

    Figure: Types of data: (a) discrete data, (b) continuous data, (c) spatial data, (d) temporal data


    System modeling for water engineering: General approach to develop a data-driven model

    Figure: General approach to develop a data-driven model


    Regression-based models: Introduction

    Regression-based models are data-driven models that are popular and easy to use.

    They range from linear to nonlinear and from parametric to non-parametric models.

    The application areas of regression-based models include:

    Prediction, forecasting, and estimation of missing data.

    Interpolation and extrapolation of data.

    They are categorized as:

    Multiple linear regression model.

    Conventional nonlinear regression method.

    KNN non-parametric model.

    Logistic regression model.
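
Of the model families listed above, the KNN (k-nearest-neighbor) approach is perhaps the easiest to illustrate in a few lines. The following is only a minimal sketch: the training data, the query point `x0`, and the choice `k = 3` are all assumptions made for illustration and do not come from the tutorial.

```matlab
% Hypothetical training data (illustrative only, not from the tutorial)
x_train = [1 2 3 4 5 6 7 8];
y_train = [1.2 1.9 3.2 3.8 5.1 6.3 6.8 8.1];

k  = 3;      % number of neighbors (an assumed choice)
x0 = 4.4;    % query point where an estimate is wanted

% Sort training points by distance to the query, then average the
% responses of the k nearest ones
[~, idx] = sort(abs(x_train - x0));
y_hat = mean(y_train(idx(1:k)));

fprintf('KNN estimate at x = %.1f: %.3f\n', x0, y_hat);
```

No functional form is assumed here; the estimate depends only on the neighboring observations, which is what makes the method non-parametric.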


    Regression-based models: Regression model application

    Figure: A summary of the application of regression models in water resources and environmental engineering


    Regression-based models: Linear regression

    Linear regression is used to model the linear relationship between a continuous dependent variable (y) and an independent variable (x).

    The regression model aims to identify which variables are associated with y, in order to predict future observations of y.


    Regression-based models: Linear regression

    Let x and y be two variables. A plot of y against x shows whether y is a positive linear, negative linear, or non-linear function of x.

    Figure: Different types of correlation between Y and X


    Linear regression

    The strength of the linear relationship between two variables is measured by the simple correlation coefficient.

    The correlation coefficient between n observations of X and Y is calculated as

    Figure: Correlation coefficient

    Figure: sample script for corrcoef(x)
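
The sample script itself is not reproduced in this transcript. As a minimal sketch, assuming two short made-up observation vectors `x` and `y` (not data from the tutorial), the correlation coefficient can be computed from its usual definition and checked against Matlab's built-in `corrcoef`:

```matlab
% Hypothetical observations (illustrative only, not from the tutorial)
x = [1 2 3 4 5 6];
y = [2.1 3.9 6.2 7.8 10.1 12.2];

% Pearson correlation coefficient from its definition:
% r = sum(dx.*dy) / sqrt(sum(dx.^2) * sum(dy.^2))
dx = x - mean(x);
dy = y - mean(y);
r_manual = sum(dx .* dy) / sqrt(sum(dx.^2) * sum(dy.^2));

% corrcoef returns a 2-by-2 matrix; the off-diagonal entry is r
R = corrcoef(x, y);
r_builtin = R(1, 2);

fprintf('r (manual) = %.4f, r (corrcoef) = %.4f\n', r_manual, r_builtin);
```

A value of r close to +1 or -1 indicates a strong positive or negative linear relationship, while a value near 0 indicates little linear association.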


    Linear regression

    The simplest regression function is defined as

    $y = \beta_0 + \beta_1 x$, where $\beta_0$ and $\beta_1$ are the parameters of the model.

    Figure: Samples of errors in linear regression fitting

    Linear regression fits a line to the observed data such that the sum of squared fitting errors over the n observations is minimized:

    $$S^2 = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$
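
Minimizing the sum of squared errors for a straight line has a well-known closed-form solution: the slope is the ratio of the summed cross-deviations of x and y to the summed squared deviations of x, and the intercept follows from the means. The sketch below applies this to made-up data (an assumption for illustration) and compares the result with `polyfit`; the names `b0` and `b1` stand for the intercept and slope parameters.

```matlab
% Hypothetical observations (illustrative only, not from the tutorial)
x = [0 1 2 3 4 5];
y = [1.0 2.9 5.2 7.1 8.8 11.1];

% Closed-form least-squares estimates for y = b0 + b1*x
b1 = sum((x - mean(x)) .* (y - mean(y))) / sum((x - mean(x)).^2);
b0 = mean(y) - b1 * mean(x);

% Sum of squared fitting errors that these estimates minimize
S2 = sum((y - (b0 + b1 * x)).^2);

% polyfit returns coefficients in descending powers: [slope intercept]
p = polyfit(x, y, 1);

fprintf('b0 = %.4f, b1 = %.4f, S2 = %.4f\n', b0, b1, S2);
```

Any other line through the same data would give a larger value of `S2`.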


    Linear regression

    With multiple independent variables $x_1, \dots, x_n$, the model becomes a multiple linear regression, represented here for four variables by the equation

    $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$

    The dependent variable y can be deterministic or probabilistic. If it is probabilistic, the stochastic equation takes the form

    $y = \beta_0 + \beta_1 x + e$,

    where e is the estimation error.

    The output at any value of x can be represented by a distribution function. The expected value of the estimate is the mean of this distribution, which is given by $y = \beta_0 + \beta_1 x$.
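
A multiple linear regression with four predictors can be fitted in Matlab with the backslash operator, which returns the least-squares solution of an overdetermined system. This is a sketch under stated assumptions: the predictors, the "true" coefficients, and the small error term `e` are synthetic and chosen only to make the example reproducible.

```matlab
% Synthetic, deterministic data (illustrative only): 12 observations, 4 predictors
t = (1:12)';
X = [t, mod(t, 3), t.^2 / 10, cos(t)];
e = 0.01 * sin(7 * t);                  % small stand-in for the error term e
y = 2 + X * [1; -0.5; 0.3; 2] + e;      % assumed "true" coefficients

% Design matrix with a leading column of ones for the intercept
A = [ones(size(t)), X];

% Least-squares solution of A*b = y; b = [b0; b1; b2; b3; b4]
b = A \ y;

% Residuals of the fitted model
res = y - A * b;
fprintf('Estimated coefficients: %s\n', mat2str(b', 4));
```

The residuals `res` play the role of the estimation error, and the fitted values `A * b` estimate the expected value of y for each observation.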


    Linear regression

    Figure: PDF of dependent variable in a linear regression model


    Linear regression-example

    Interpolation of water quality values.

    The water quality of a river is tabulated as a function of distance from the upstream end of the river. Use a linear regression model to interpolate total dissolved solids (TDS) at different locations along the river.

    Figure: Data presented for Example
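
The data table for this example is a figure that did not survive in the transcript, so the sketch below uses hypothetical distance and TDS values purely to show the workflow. It fits a straight line with `polyfit` and evaluates it at unsampled locations with `polyval`; the variable names are illustrative.

```matlab
% Hypothetical data: distance from upstream (km) and TDS (mg/L).
% The tutorial's own table is not available in this transcript.
dist = [10 20 30 40 50 60 70 80];
tds  = [210 232 251 268 295 310 334 350];

% Fit TDS = p(2) + p(1) * distance by least squares
p = polyfit(dist, tds, 1);

% Interpolate TDS at unsampled locations inside the observed range
dist_new = [25 45 65];
tds_new  = polyval(p, dist_new);

disp([dist_new; tds_new]);
```

Because the new locations lie inside the sampled range, this is interpolation; evaluating the line outside that range would be extrapolation and is generally less reliable.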


    Linear regression-example

    Solve the above example in a probabilistic manner and calculate the probable range of TDS at a distance of 125 km from the upstream end.
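
One possible way to approach this exercise, not necessarily the one used in the tutorial, is to let `polyfit` return its error-estimation structure and pass it to `polyval`, which then reports a standard error for a new prediction. The data below are hypothetical, and the factor of 2 for an approximate 95% range assumes independent, normally distributed errors with constant variance.

```matlab
% Hypothetical data (illustrative only): distance (km) and TDS (mg/L)
dist = [20 40 60 80 100 140 160 180 200];
tds  = [215 248 262 301 318 372 395 431 446];

% S carries the information polyval needs to estimate prediction error
[p, S] = polyfit(dist, tds, 1);

% delta estimates the standard deviation of the error in predicting
% a new observation at x0
x0 = 125;
[tds_hat, delta] = polyval(p, x0, S);

% Approximate 95% range under the normal-error assumption
lo = tds_hat - 2 * delta;
hi = tds_hat + 2 * delta;
fprintf('TDS at %g km: %.1f mg/L (approx. range %.1f to %.1f)\n', ...
        x0, tds_hat, lo, hi);
```

With only a handful of observations, a quantile from Student's t distribution would give a somewhat wider and more defensible range than the factor of 2 used here.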


    Matlab Tutorial


    End
