bo - data services & information steward


Upload: ibsolution-gmbh

Post on 13-Apr-2017


IBsolution Academy – Webinar

BO – Data Services & Information Steward

12.11.2015, Goran Deliyski, IBsolution GmbH

This webinar is suitable for:

• application developers

• data consultants

• database administrators

• project managers

• data management solution architects

What will you learn:

• Basic concepts of SAP Data Services and how it works

• How to accomplish a duplicate check task, step by step

• Performance optimization techniques

Your host

Goran Deliyski

IBsolution Bulgaria

Data Services & Information Steward Consultant

Agenda

• Introduction to ETL

• Get to know SAP Data Services

• SAP Data Services Designer features and functionality

• Information Steward data profiling capabilities

• Performance optimization techniques

• Demo

• Questions and Answers

Introduction to ETL

Data Services provides a graphical interface, the Designer, for creating and staging jobs for data integration and data quality purposes.

SAP Data Services Designer Features and Functionality

Data Services object types

Projects (single use)

Jobs

Work flows

Data flows

Scripts

Sources and targets

File formats

Main transformations and their characteristics

Different categories

• Data Integrator

• Data Quality

• Platform

• Text Data Processing

Main transformations and their characteristics

Query Transformation

• The most frequently used transformation

• An already configured and mapped Query transformation, as found in a data flow

• Double-click it to open the editor

Query Transformation characteristics

• The Query transform is used to map source columns to target columns.

• Numerous functions can be applied in the mappings

• It can select only unique (distinct) rows

Query Transformation characteristics

• More than one source can be joined with a Query transformation

• The join condition is specified in the ‘From’ tab

• Filter conditions can be applied in the ‘Where’ tab

Query Transformation characteristics

• Group by

• Order by
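The Query transformation options listed on the slides above (column mapping, functions, distinct rows, joins, filters, Group by, Order by) correspond roughly to the clauses of a SQL SELECT statement. The following is a minimal sketch of that correspondence using Python's built-in sqlite3 module. The tables, columns, and values are invented for illustration and do not come from the webinar.

```python
import sqlite3

# Hypothetical sample data, used only to illustrate the SELECT clauses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT, country TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Alpha', 'DE'), (2, 'Beta', 'BG'), (3, 'Gamma', 'DE');
    INSERT INTO orders VALUES (1, 100.0), (1, 50.0), (2, 75.0), (3, 10.0);
""")

query = """
    SELECT c.country AS country,            -- column mapping
           SUM(o.amount) AS total_amount    -- a function applied in the mapping
    FROM customers AS c
    JOIN orders AS o
      ON o.customer_id = c.id               -- join condition ('From' tab)
    WHERE o.amount >= 50                    -- filter condition ('Where' tab)
    GROUP BY c.country                      -- 'Group by'
    ORDER BY total_amount DESC              -- 'Order by'
"""
rows = conn.execute(query).fetchall()

# "Select only unique rows" corresponds to SELECT DISTINCT.
distinct_countries = conn.execute(
    "SELECT DISTINCT country FROM customers ORDER BY country"
).fetchall()

print(rows)
print(distinct_countries)
```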

Main transformations and their characteristics

Case transformation

• The Case transformation separates input data rows into multiple output data sets

Numerous outputs are possible:

- 2 targets

- 3 or more targets
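The routing idea behind the Case transformation can be sketched in plain Python. This is an illustration under assumptions, not the Data Services implementation: the function name `case_split`, the labels, and the sample rows are hypothetical, and the sketch assumes each row goes to the first case whose condition is true, with unmatched rows sent to a default output.

```python
def case_split(rows, cases, default_label="default"):
    """Route each row to the output of the first case whose condition holds."""
    outputs = {label: [] for label, _ in cases}
    outputs[default_label] = []
    for row in rows:
        for label, condition in cases:
            if condition(row):
                outputs[label].append(row)
                break
        else:
            # No case expression was true for this row.
            outputs[default_label].append(row)
    return outputs


rows = [
    {"name": "A", "country": "DE"},
    {"name": "B", "country": "BG"},
    {"name": "C", "country": "FR"},
]
cases = [
    ("germany", lambda r: r["country"] == "DE"),
    ("bulgaria", lambda r: r["country"] == "BG"),
]
targets = case_split(rows, cases)
print({label: len(data) for label, data in targets.items()})
```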

Case Transformation

• Case transformation Editor

Main transformations and their characteristics

Merge Transformation

• The Merge transform combines incoming data sets with the same schema structure to produce a single output data set with the same schema as the input data sets.

• The Merge transform performs a union. All sources must have the same schema including:

• Number of columns

• Column names

• Column data types
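The schema requirement above can be made concrete with a short sketch. It assumes that the union keeps every incoming row, including duplicates (comparable to SQL UNION ALL). The data set layout (a dict with a "schema" list and a "rows" list) and the function name `merge` are hypothetical choices for this example.

```python
def merge(*datasets):
    """Union data sets that share the same column count, names and types."""
    schema = datasets[0]["schema"]
    for ds in datasets[1:]:
        if ds["schema"] != schema:
            raise ValueError("all sources must have the same schema")
    merged_rows = []
    for ds in datasets:
        merged_rows.extend(ds["rows"])  # duplicates are kept
    return {"schema": schema, "rows": merged_rows}


schema = [("id", int), ("name", str)]
source_a = {"schema": schema, "rows": [(1, "Alpha"), (2, "Beta")]}
source_b = {"schema": schema, "rows": [(2, "Beta"), (3, "Gamma")]}
merged = merge(source_a, source_b)
print(len(merged["rows"]))
```

A source whose column names or data types differ would first need a Query transformation (or an equivalent mapping step) to bring it to the common schema.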

Main transformations and their characteristics

Data Cleanse Transformation

The Data Cleanse transform is used to perform parsing and standardizing.

• Parsing identifies individual data elements and breaks them down into their component parts. It rearranges data elements in a single field or moves multiple data elements from a single data field to multiple discrete fields.

• Standardization includes business rules around formats, abbreviations, acronyms, punctuation, greetings, casing, order, and pattern matching – all examples of elements you can control to meet your business requirements.
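To make the two steps tangible, here is a deliberately simple sketch of parsing a person name held in a single field into discrete fields and standardizing casing and title abbreviations. It is not how the Data Cleanse transform works internally. The `PRENAMES` table, the output field names, and the rule "first token is the given name" are assumptions made for this example.

```python
# Hypothetical standardization table for titles.
PRENAMES = {"mr": "Mr.", "mrs": "Mrs.", "ms": "Ms.", "dr": "Dr."}


def parse_person(raw):
    """Parse one free-text name field into discrete, standardized fields."""
    tokens = raw.replace(",", " ").split()
    prename = ""
    if tokens and tokens[0].rstrip(".").lower() in PRENAMES:
        # Standardize the abbreviation and punctuation of the title.
        prename = PRENAMES[tokens.pop(0).rstrip(".").lower()]
    # Standardize casing of the remaining name parts.
    given_name = tokens[0].capitalize() if tokens else ""
    family_name = " ".join(t.capitalize() for t in tokens[1:])
    return {"prename": prename, "given_name": given_name, "family_name": family_name}


person = parse_person("mr. john  SMITH")
print(person)
```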

Data Cleanse Transformation

Three tabs:

• The Input tab, where you map the fields you want to standardize and/or parse

• The Options tab, where you define the standardization logic to apply to a whole column

• The Output tab, which contains numerous output columns

Data Cleanse Transformation

Use case example

• Source table content

• Selected Output fields

• Result of standardization and parsing

Main transformations and their characteristics

Match transformation

• Match Criteria:

Match criteria refer to the fields you want to match on. You can use criteria options to specify business rules for matching on each of these fields. These options control how close to exact the data must be to be considered a match.

• Match score

• No match score

• Contribution

• Input tab

• Options tab

• Output tab
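How a match score, a no match score, and a contribution might interact can be sketched as a weighted comparison of two records. This is a simplified illustration, not the Match transform's actual algorithm: the similarity measure (Python's difflib), the criteria values, and the threshold of 85 are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical criteria: contributions are percentages that sum to 100.
CRITERIA = [
    {"field": "name", "no_match_score": 60, "contribution": 60},
    {"field": "city", "no_match_score": 50, "contribution": 40},
]
WEIGHTED_MATCH_SCORE = 85  # assumed overall threshold


def similarity(a, b):
    """Similarity of two strings on a 0-100 scale, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100


def weighted_score(rec_a, rec_b):
    """Return the weighted score, or None if any criterion rules out a match."""
    total = 0.0
    for crit in CRITERIA:
        score = similarity(rec_a[crit["field"]], rec_b[crit["field"]])
        if score <= crit["no_match_score"]:
            return None  # too different on this field: no match
        total += score * crit["contribution"] / 100
    return total


def is_match(rec_a, rec_b):
    score = weighted_score(rec_a, rec_b)
    return score is not None and score >= WEIGHTED_MATCH_SCORE


a = {"name": "John Smith", "city": "Sofia"}
b = {"name": "john smith", "city": "SOFIA"}
c = {"name": "Xyz", "city": "Qqq"}
print(is_match(a, b), is_match(a, c))
```

In a duplicate check, records that match would then be collected into the same match group.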

SAP Information Steward

Data Insight use cases

Data quality analysis in the form of:

• Data profiling

• Validation rules

A validation rule is a type of business rule that checks whether the data complies with the business constraints and requirements.
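Data profiling, the first item above, generally means computing descriptive statistics per column before any rules are written. The sketch below shows the kind of figures involved. The chosen statistics and the function name `profile_column` are assumptions for illustration and do not reproduce the Information Steward profiling output.

```python
def profile_column(values):
    """Compute a few basic profiling statistics for one column."""
    non_null = [v for v in values if v not in (None, "")]
    lengths = [len(str(v)) for v in non_null]
    return {
        "rows": len(values),
        "nulls_or_blanks": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }


postal_codes = ["10115", "1000", None, "10115", ""]
profile = profile_column(postal_codes)
print(profile)
```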

SAP Information Steward

Create new rule

• Parameters and Conditions

• Rule binding
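The relationship between a rule's parameters, its condition, and its binding to a concrete column can be sketched as follows. The rule (a five-digit postal code), the column name `POSTAL_CODE`, and the score expressed as the percentage of passing rows are assumptions for this example; Information Steward defines rules in its own rule editor and may present scores on a different scale.

```python
import re


def zip_rule(zip_code):
    """Condition: the parameter must consist of exactly five digits."""
    return zip_code is not None and re.fullmatch(r"\d{5}", zip_code) is not None


# Rule binding: rule parameter name -> column of the table being checked.
binding = {"zip_code": "POSTAL_CODE"}


def rule_score(rows, rule, binding):
    """Percentage of rows that satisfy the rule under the given binding."""
    if not rows:
        return 0.0
    passed = sum(
        1 for row in rows
        if rule(**{param: row[column] for param, column in binding.items()})
    )
    return 100.0 * passed / len(rows)


table = [
    {"POSTAL_CODE": "10115"},
    {"POSTAL_CODE": "1000"},
    {"POSTAL_CODE": "80331"},
    {"POSTAL_CODE": "69120"},
]
score = rule_score(table, zip_rule, binding)
print(score)
```

Because the rule only refers to its parameter, the same rule can be bound to postal code columns in other tables without being rewritten.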

SAP Information Steward

Calculate rule score

Result

Performance Optimization

ETL performance is all about efficiency, and enabling ETL and database engines to process quickly by doing fewer costly operations.

Let the database do the hard work!

• Push Down Operations

Applicable for database sources and targets

To optimize performance, the software pushes down as many SELECT operations as possible to the source database and combines as many operations as possible into one request to the database.

Operations within the SELECT

Aggregations, Distinct rows, Filtering, Joins, Ordering, Projection, Functions
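The effect of a push-down can be illustrated outside Data Services with Python's sqlite3 module: the same result is computed once by fetching every row and doing the work in the client, and once by letting the database filter and aggregate inside a single SELECT. The table and values are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 20), ("north", 5), ("south", 1)],
)

# Not pushed down: every row crosses the connection, and the filtering
# and aggregation happen in the client process.
totals_in_client = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    if amount >= 5:
        totals_in_client[region] = totals_in_client.get(region, 0) + amount

# Pushed down: filtering and aggregation are part of one SELECT, so only
# one row per region is returned by the database.
totals_in_db = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales WHERE amount >= 5 GROUP BY region"
    )
)

print(totals_in_client, totals_in_db)
```

Both variants return the same totals; the difference lies in how many rows leave the database and where the CPU time is spent.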

Performance Optimization

Improving throughput

• Use caching as much as possible to limit the number of times the system must access the database

• Bulk load to the target

• Minimize extracted data

• Increase the Degree of parallelism

• Change array fetch size

• Increase the rows per commit
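Two of the settings above, array fetch size and rows per commit, have close analogues in ordinary database client code: how many rows are fetched per round trip, and how many rows are written before a commit. The sketch below shows the idea with Python's sqlite3 module. The table names, the function `copy_rows`, and the small batch sizes are assumptions chosen to keep the example readable.

```python
import sqlite3


def copy_rows(source, target, fetch_size, rows_per_commit):
    """Copy rows in batches; return the number of commits performed."""
    cursor = source.execute("SELECT id, name FROM src ORDER BY id")
    pending = 0
    commits = 0
    while True:
        batch = cursor.fetchmany(fetch_size)  # rows read per fetch
        if not batch:
            break
        target.executemany("INSERT INTO tgt VALUES (?, ?)", batch)
        pending += len(batch)
        if pending >= rows_per_commit:  # commit only after enough rows
            target.commit()
            commits += 1
            pending = 0
    if pending:
        target.commit()
        commits += 1
    return commits


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE src (id INTEGER, name TEXT)")
source.executemany("INSERT INTO src VALUES (?, ?)", [(i, f"row{i}") for i in range(10)])
source.commit()

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE tgt (id INTEGER, name TEXT)")

commits = copy_rows(source, target, fetch_size=3, rows_per_commit=6)
copied = target.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
print(commits, copied)
```

Larger batches mean fewer round trips and fewer commits, at the cost of more memory per batch and more work to redo if a load fails, so suitable values depend on the environment.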

DEMO

IBsolution Academy Certificate

Individual certificate for every attendee:

• Watch the webinar

• Take the multiple choice test

• Answer 8 out of 10 questions correctly

To the test: http://bit.ly/1Hq6qkH