Data management and data analysis for offshore windparks

Ralf Herrmann1, Jens Rabe2

1 LUH, Institute of Concrete Construction (IfMa)

2 FhG, Fraunhofer-Institute for Wind Energy and Energy System Technology (IWES)

7th GIGAWIND Symposium, 2 March 2017, Leibniz Universität Hannover

Slide 2

Overview

Introduction

Is Structural Health Monitoring a Big Data problem?

Tasks for data processing

Distributed data processing

Data management and data analysis in Gigawind life

Conclusions

Slide 3

Introduction

alpha ventus

two wind turbines extensively equipped with sensors

Senvion 5M (AV4) and Adwen Multibrid M5000 (AV7)

producing energy and important data day by day

Photo: Martina Nolte / License: Creative Commons CC-by-sa-3.0 de

Slide 4

How to handle all the data?

measurements started in April 2010

approx. 5 billion values every day

6 TByte each year

data is the basis for lifecycle analysis

data is available for download on the Internet via the WEDA platform

Photos of AV7: [Dur2016]

Slide 5

How to handle all the data?

Insights of the architects of the WEDA system [Gud2013]

“When the data is there, researchers want it all”

“Online analysis […] requires a distributed solutions”

“Data quality is an issue”

“constant tuning and optimization as the data volumes grows”

Slide 6

Characteristics of a Big Data System

Visibility: access from various or multiple locations for multiple users

Volume: large amounts of records and files (gigabytes, terabytes, petabytes)

Velocity: streams of data, fast availability for access and delivery, high sampling rates (per minute, per second, in real time)

Variety: analysis depends on the data format (structured, semi-structured and unstructured data)

Value/Veracity: generates additional value for multiple users; development of reliable systems for higher precision and improvement of the quality of data analysis

[NI2014], [Wrobel2012], [Fas2016]

Slide 7

Is Structural Health Monitoring a Big Data problem?

Volume: the data volume is high (>42 TByte), growing by 6 TByte each year from 1,500 sensors (33 GByte per sensor) [Gud2013]

Velocity: results are needed as fast as possible, high sampling rates

Variety: data exists in comma-separated value (csv) files, webcam pictures and data from other sources (weather)

Value: prepared data is valuable for research, industry, standards associations, certification institutes, data resellers

Visibility: interested parties all over the world need access to the data
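To get a feeling for these volume figures, the following back-of-the-envelope sketch combines the numbers quoted on the slides (approx. 5 billion values per day, 6 TByte per year, 1,500 sensors). It assumes decimal units, a 365-day year and values spread evenly over all sensors; none of these assumptions come from the slides, so the results are only rough orders of magnitude.

```python
# Figures quoted on the slides
VALUES_PER_DAY = 5e9      # approx. 5 billion values every day
BYTES_PER_YEAR = 6e12     # 6 TByte each year (decimal TByte assumed)
SENSORS = 1500

# Assumptions made only for this sketch
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 24 * 60 * 60

# Average storage cost of one stored value
bytes_per_value = BYTES_PER_YEAR / (VALUES_PER_DAY * DAYS_PER_YEAR)

# Average sampling rate if all sensors contributed equally
mean_rate_hz = VALUES_PER_DAY / SENSORS / SECONDS_PER_DAY

# Yearly growth per sensor in GByte
gbyte_per_sensor_per_year = BYTES_PER_YEAR / SENSORS / 1e9

print(f"{bytes_per_value:.2f} bytes per value")
print(f"{mean_rate_hz:.1f} Hz mean rate per sensor")
print(f"{gbyte_per_sensor_per_year:.1f} GByte per sensor per year")
```

Real sensors are sampled at very different rates, so the per-sensor numbers are averages and not a description of any individual channel.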

Slide 8

Looking for the right system

Photo: [Mer2014]

Slide 9

Transactional and Analytical data processing

OLTP (online transactional processing)

focus on data operations (insert, copy, update, delete)

quick processing

simple algorithms, query language

user interactions with data

transaction safety

OLAP (online analytical processing)

focus on analytical calculations

usually uses existing analytical algorithms

complex queries

historical and archive data

batch operations
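The difference between the two access patterns can be sketched with Python's built-in sqlite3 module. The table name, columns and values below are invented for illustration and are not taken from the presented system: the OLTP part touches a single row inside a transaction, while the OLAP part scans all rows to compute statistics per sensor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    "sensor_id INTEGER, t INTEGER, value REAL, "
    "PRIMARY KEY (sensor_id, t))"
)

# Made-up sample data: two sensors with five values each
rows = [(s, t, float(s * 100 + t)) for s in (1, 2) for t in range(5)]
with conn:  # one transaction, committed on success
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)", rows)

# OLTP-style: a quick, targeted update and lookup of one row
with conn:
    conn.execute(
        "UPDATE measurement SET value = ? WHERE sensor_id = ? AND t = ?",
        (0.0, 1, 3),
    )
single = conn.execute(
    "SELECT value FROM measurement WHERE sensor_id = ? AND t = ?", (1, 3)
).fetchone()[0]

# OLAP-style: an aggregate over the complete history of every sensor
summary = conn.execute(
    "SELECT sensor_id, COUNT(*), AVG(value), MAX(value) "
    "FROM measurement GROUP BY sensor_id ORDER BY sensor_id"
).fetchall()

print(single)
print(summary)
```

The point lookup is served by the primary key, whereas the aggregate has to read every row, which is why the two workloads stress a system in different ways.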

Slide 10

What kind of work is done?

Transactional behavior:

development of algorithms

inspection of the data

alarm and event based notifications

well defined access restrictions on the data

data manipulation for data cleansing by hand

Analytical behavior:

evaluation of big amounts of data

statistical calculations

complex algorithms e.g. rainflow-counting

automated data validation check

Slide 11

Example

OLTP (online transactional processing)

f( )= 27

find the right piece of data quickly …

… to use it in your own algorithm/software

OLAP (online analytical processing)

run calculations over a large amount of data

… e.g. empirical cumulative distribution functions
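As an illustration of the analytical example, an empirical cumulative distribution function (ECDF) gives, for a threshold x, the fraction of samples that are less than or equal to x. The sketch below uses only the Python standard library; the wind speed values are made up.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function F with F(x) = fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("ECDF needs at least one sample")

    def F(x):
        # bisect_right counts how many sorted samples are <= x
        return bisect_right(ordered, x) / n

    return F


# Made-up wind speeds in m/s, for illustration only
wind_speeds = [4.2, 7.9, 5.5, 12.1, 7.9, 9.3]
F = ecdf(wind_speeds)
print(F(7.9))
```

Sorting touches every sample once, which is what makes this an analytical rather than a transactional operation when the input is years of measurement data.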

Slide 12

Transactional and Analytical data processing

data systems are designed for either transactional or analytical processing

good OLAP performance degrades OLTP performance and vice versa

the same problem exists with traditional Big Data databases and data systems

Apache Cassandra, MySQL Cluster -> transactional data processing

Apache Hadoop -> analytical data processing

[Figure: bar chart of the average response time [s] of OLTP and OLAP queries for two workloads: 1, mainly transactional queries in parallel; 2, mainly analytical queries in parallel. Source: [Bog2011]]

Slide 13

Where is the data processed?

our office computers are not sufficient anymore

too little disk space, main memory and processing power

distributed processing systems:

most widespread open-source framework:

Apache Hadoop

can work as a file system or database

Photo: Apache Software Foundation

Slide 14

Distributed storage of data

a cluster consists of many computers (nodes)

Hadoop Distributed File System (HDFS)

data is split into parts

NameNode: knows the locations of files in the file system

DataNode: stores a part of a file

[Diagram: a client, a NameNode and three DataNodes in a Hadoop cluster]

1) client requests a file from the NameNode

2) NameNode returns the list of relevant DataNodes

3) client reads the data parts from the DataNodes

4) Hadoop replicates the data
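The read path above can be mimicked with a toy in-memory model. This is only a sketch of the idea and not the HDFS API: the class names mirror the roles on the slide, while the block size, the replication factor of 2, the round-robin placement and all function names are assumptions chosen for illustration. In this toy the client writes every replica itself, which simplifies what Hadoop does internally.

```python
class DataNode:
    """Stores parts (blocks) of files."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes


class NameNode:
    """Knows which DataNodes hold which block, but stores no file data."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = datanodes
        self.replication = replication
        self.index = {}  # filename -> list of (block id, holders)

    def allocate(self, filename, n_blocks):
        placements = []
        for block_no in range(n_blocks):
            # Hypothetical round-robin placement of each replica
            holders = [
                self.datanodes[(block_no + r) % len(self.datanodes)]
                for r in range(self.replication)
            ]
            placements.append((f"{filename}#{block_no}", holders))
        self.index[filename] = placements
        return placements

    def locate(self, filename):
        return self.index[filename]


def write_file(namenode, filename, data, block_size=4):
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    placements = namenode.allocate(filename, len(blocks))
    for (block_id, holders), block in zip(placements, blocks):
        for node in holders:  # every holder keeps a full copy of the block
            node.blocks[block_id] = block


def read_file(namenode, filename, failed=frozenset()):
    parts = []
    # 1) request file, 2) list of relevant DataNodes
    for block_id, holders in namenode.locate(filename):
        # 3) read each part from the first holder that is still reachable
        node = next(n for n in holders if n.name not in failed)
        parts.append(node.blocks[block_id])
    return b"".join(parts)


datanodes = [DataNode(f"dn{i}") for i in range(3)]
namenode = NameNode(datanodes)
write_file(namenode, "av7.csv", b"alpha ventus AV7")
print(read_file(namenode, "av7.csv"))
print(read_file(namenode, "av7.csv", failed={"dn1"}))
```

Because every block exists on two different nodes, the second read still succeeds although one DataNode is treated as unavailable.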

Slide 15

Distributed processing of data

For analysis, the data is not sent to the client; instead, the algorithm is sent to the cluster

Master Node: distributes the algorithm to all nodes

Computational Node: executes the algorithm together with the other nodes

Algorithms need to follow a programming model

[Diagram: a client, a Master Node and three Computational Nodes in a Hadoop cluster]

1) client sends the analytics task to the Master Node

2) Master Node assigns tasks to the Computational Nodes

3) Computational Nodes can interchange map results

4) results are stored inside the HDFS
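The idea of sending the algorithm to the data can be sketched as follows. The node names, the data and both function names are hypothetical, and a thread pool merely stands in for the computational nodes: each node reduces its local chunk to a small summary, and only those summaries travel back to the master.

```python
from concurrent.futures import ThreadPoolExecutor


def local_task(chunk):
    """Runs where the data lives: reduce a chunk to (sum, count, max)."""
    return sum(chunk), len(chunk), max(chunk)


def run_on_cluster(node_chunks, task):
    """Master: hand the task to every node and combine the small results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(task, node_chunks.values()))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    peak = max(p[2] for p in partials)
    return total / count, peak


# Made-up measurement chunks, one per node
node_chunks = {
    "node1": [1.0, 2.0, 3.0],
    "node2": [4.0, 5.0],
    "node3": [6.0],
}
mean, peak = run_on_cluster(node_chunks, local_task)
print(mean, peak)
```

Three numbers per node cross the network instead of the raw measurement values, which is the reason for moving the computation rather than the data.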

Slide 16

MapReduce

programming technique for analyzing data sets that do not fit in memory

algorithms need to be programmed to run in parallel

difficult task, including parallelization, fault tolerance, data distribution and load balancing

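simplified by a programming model: MapReduce

[Diagram: the input data is split into parts; map partitions each part by a property (keys k1, k2, k3); partitioned data goes to other reducers and comes in from other mappers; each reducer merges and reduces the data for its key into results]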
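A minimal single-process sketch of the MapReduce model is shown below. It is not Hadoop code; the function names, the 600 s window and the sample values are assumptions for illustration. The map step tags each measurement with the index of its 10-minute window, the shuffle step groups values by that key, and the reduce step averages each group.

```python
from collections import defaultdict

WINDOW_S = 600  # 10-minute window in seconds


def map_phase(part):
    """Emit (key, value) pairs; the key is the 10-minute window index."""
    for t, value in part:
        yield t // WINDOW_S, value


def shuffle(mapped_parts):
    """Group values by key, as if routing them to the responsible reducer."""
    groups = defaultdict(list)
    for mapped in mapped_parts:
        for key, value in mapped:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Merge all values of one key into a single result."""
    return key, sum(values) / len(values)


# Two input parts with made-up (timestamp in s, value) pairs
parts = [
    [(0, 1.0), (300, 3.0), (650, 10.0)],
    [(599, 5.0), (900, 20.0), (1250, 7.0)],
]
groups = shuffle(map_phase(p) for p in parts)
results = dict(reduce_phase(k, v) for k, v in sorted(groups.items()))
print(results)
```

In a real cluster the map calls run on different nodes and the shuffle moves data over the network; here everything runs sequentially so that only the data flow is visible.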

Slide 17

Data management and data analysis in Gigawind life

LAMA: analytical operations (here the big data happens)

I/O logic for time series data

User interface / data import

Distributed database

Data and evaluation management

Transactional database

SMMEXS: transactional operations

Slide 18

Key Features of SMMEXS

management of sensors and evaluations

provides relevant information on the sensor data

user interface for configuration

automated data import

runs periodic analyses on newly incoming data

prepared algorithms for civil-engineering-related analytics

works on measurement data, images, audio files

Slide 19

Key Features of LAMA

redundant, fault-tolerant data storage

processing algorithms can be run directly on the cluster

no data upload / download necessary besides parameters and results

tasks run on the nodes that hold the data locally, minimizing network traffic

Java- and .NET-based interfaces

can be used with LabVIEW, MATLAB and more

2x to 10x faster than running locally, depending on type of algorithm

Slide 20

Performance of the LAMA data storage

performance on a 50 GiB sample

[Figure: two bar charts of computation time in minutes (lower is better) for MATLAB, LAMA using the database, and LAMA using files. Panels: parallelizable task (10 minute average) and non-parallelizable task (cycle counting).]
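One plausible way to see why the two tasks behave differently is the following sketch. It is an illustration under simplifying assumptions and not the benchmark code: counting turning points (strict local extrema) stands in for cycle counting, of which it is only the first step, and the signal is made up. An average can be assembled exactly from per-chunk sums and counts, whereas extrema at chunk borders are missed when every chunk is processed in isolation.

```python
def turning_points(signal):
    """Count interior samples that are strict local maxima or minima."""
    return sum(
        1
        for prev, cur, nxt in zip(signal, signal[1:], signal[2:])
        if (cur - prev) * (nxt - cur) < 0
    )


def split(signal, size):
    """Cut the signal into consecutive chunks of at most `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal), size)]


signal = [0, 2, 1, 3, 0, 4, 1, 5]  # made-up load signal
parts = split(signal, 4)

# Parallelizable: the mean is recovered exactly from per-chunk partial results
chunked_mean = sum(sum(p) for p in parts) / sum(len(p) for p in parts)
global_mean = sum(signal) / len(signal)

# Not trivially parallelizable: extrema next to a chunk border are lost
chunked_count = sum(turning_points(p) for p in parts)
global_count = turning_points(signal)

print(chunked_mean, global_mean)
print(chunked_count, global_count)
```

A chunked version of such an algorithm needs extra handling of the border samples, and full rainflow counting additionally pairs reversals across the whole load history, so the naive split shown here does not carry over.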

Slide 21

Results from the system in Gigawind life

• data synchronized

• data cleaned

• data filtered

• data statistically evaluated

Slide 22

Conclusions

the amount of data will grow, so algorithms are sent to the data

developing parallel algorithms is an essential task

you can no longer get your own local copy of the data

many research areas use Big Data with great success (medicine, economy, BIM, predictive maintenance)

Big Data analytics is notably impacting the Civil Engineering domain [Alavi2017]

current Civil Engineering information systems are still lacking in successful implementation [Alavi2017]

Slide 23

Thank you very much for your attention

For details, see the poster at the poster session:

Fast and scalable distributed storage system for measuring and simulation data

Slide 24

References

[Bog2011] A. Bog, K. Sachs, and A. Zeier, “Benchmarking Database Design for Mixed OLTP and OLAP Workloads,” in ICPE’11: Proceedings of the 2nd ACM/SPEC International Conference on Performance Engineering, March 14-16, 2011, Karlsruhe, Germany. New York: ACM, 2011.

[Gud2013] S. Gudenkauf and A. Claassen, “Data Warehousing for Distributed Offshore Research at Alpha Ventus - Overview and Insights Gained,” in Proceedings of the 27th EnviroInfo 2013 Conference. Hamburg: Shaker-Verlag, 2013.

[Mer2014] R. Merino, “Trafodion: How to use Hadoop for operational and transactional purposes.” URL: https://de.slideshare.net/BigDataSpain/rodrigo-merino-how-to-use-hadoop-for-operational-and-transactional-purposes-big-data-spain-2014 (28 February 2017)

[Alavi2017] A. H. Alavi and A. H. Gandomi, “Big Data in Civil Engineering,” Automation in Construction, January 2017. doi:10.1016/j.autcon.2016.12.008.

[NI2014] National Instruments, “Big Analog Data™ Solutions.” URL: www.ni.com/pdf/info/us/big_analog_data_presentation.pdf (28 February 2017)

[Wrobel2012] S. Wrobel, “Big Data – Vorsprung durch Wissen Chancen erkennen und nutzen,” 2012. URL: https://www.iais.fraunhofer.de/content/dam/iais/gf/bda/Downloads/FraunhoferIAIS_Big-Data_2012-12-10.pdf (28 February 2017)

[Fas2016] D. Fasel and A. Meier, Big Data: Grundlagen, Systeme und Nutzungspotenziale. Wiesbaden: Springer Vieweg, 2016.

[Dur2016] M. Durstewitz and B. Lange, “Meer-Wind-Strom: Forschung am ersten deutschen Offshore-Windpark,” 1st ed. Wiesbaden: Springer Fachmedien Verlag, 2016.