ETL Process

Upload: karthik-selvaraj

Post on 11-Nov-2014

DESCRIPTION

SAP BI- ETL Process

TRANSCRIPT

Page 1: ETL Process

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

What is ETL? Extraction, Transformation, Loading

Simple Example of ETL

Master Data

By Karthikeyan Selvaraj

Page 2: ETL Process

Let's say the master data table (Customer ID, Customer Name) is a flat file, i.e. an Excel file on your computer. We need to bring this table into the SAP BI platform.

Page 3: ETL Process

The first step is to extract the master data table, i.e. the Excel file, into the BI data warehouse. The components needed for extracting the data into the BI data warehouse are (1) the DataSource and (2) the InfoPackage.

1. DataSource

DataSource: It describes the data, answering two questions: what type of data is it, and where is the data located? For example, once I finish this presentation, I will choose a location to save this ppt, and I will also decide in which version to save it. Similarly, in the DataSource we define these details about the data.
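As a rough illustration of the DataSource idea, here is a plain-Python sketch. It is only an analogy: a DataSource in SAP BI is set up inside the BI system, not written in Python. The class name `FlatFileDataSource`, its attributes, and the file name `customers.csv` are all assumptions made for this example, and it assumes the Excel sheet has been saved as a CSV file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlatFileDataSource:
    """Hypothetical stand-in for a DataSource: it describes the data, nothing more."""
    location: str      # where the data is located
    file_format: str   # what type of data it is
    fields: tuple      # field names, in the order they appear in the file


# Assumed file name; the two field names come from the master data table.
customer_source = FlatFileDataSource(
    location="customers.csv",
    file_format="csv",
    fields=("Customer ID", "Customer Name"),
)

print(customer_source)
```

Note that the object holds no customer rows at all. It only records what the data looks like and where to find it, which is the point of the analogy.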

Page 4: ETL Process

2. InfoPackage

What is an InfoPackage? In simple words, an InfoPackage is like a key that opens a room and lets you enter. It brings the data in from a legacy system or an SAP system. In our scenario it brings the data from our computer into the BI data warehouse.

Diagram: the Excel file (the Customer ID / Customer Name table) on the computer is brought into the BI data warehouse by the InfoPackage. The DataSource describes what type of data it is and where the data is located.

Page 5: ETL Process

Now we have moved the master data table into the BI data warehouse by executing the InfoPackage. Once the data comes into BI, it is stored in a table called the PSA (Persistent Staging Area). Data arriving from any source system is stored temporarily in the PSA.

Diagram: the Excel file on the computer is described by the DataSource and loaded by the InfoPackage into the PSA table in the BI data warehouse. The PSA holds the same four rows as the file (105 Sainsbury, 102 Tesco, 109 Waitrose, 101 Asda).
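The extraction step can be sketched in plain Python as follows. This is an analogy under stated assumptions, not SAP code: `run_info_package` is a hypothetical helper, the "PSA" is just a Python list, and the flat file is assumed to be CSV text held in a string so the example runs on its own.

```python
import csv
import io

# Stand-in for the flat file on the computer (rows taken from the slides).
FLAT_FILE = """Customer ID,Customer Name
105,Sainsbury
102,Tesco
109,Waitrose
101,Asda
"""


def run_info_package(file_obj):
    """Hypothetical helper: read every row from the source and return it unchanged."""
    return list(csv.DictReader(file_obj))


# The "PSA" here is simply a list that holds the rows exactly as they arrived.
psa = run_info_package(io.StringIO(FLAT_FILE))

for row in psa:
    print(row["Customer ID"], row["Customer Name"])
```

Nothing is renamed or converted at this stage; the staging list mirrors the source file, and the transformation comes later.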

Page 6: ETL Process

Transformation of Data

The first part of ETL, i.e. Extraction, is done successfully. Now we need to transform the data so that it is better suited to reporting. To do that, we define the fields of the table as InfoObjects. Our master data table has two fields, Customer ID and Customer Name, so in BI we define them as InfoObjects.

InfoObjects come in several types; the three covered here are:

1. Characteristics: sorting keys such as company code, product ID, etc.
2. Key Figures: quantity, amount or number of items, i.e. numeric data that can be calculated on.
3. Units: currencies and units of measure.

Customer ID and Customer Name are characteristic InfoObjects.

Diagram: the Customer ID and Customer Name fields of the PSA table are each defined as a characteristic InfoObject.
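To make the classification concrete, here is a small Python sketch. The enum `InfoObjectType` and the dictionary `info_objects` are names invented for this illustration; they are not part of SAP BI.

```python
from enum import Enum


class InfoObjectType(Enum):
    """The three InfoObject types named on this page (illustrative only)."""
    CHARACTERISTIC = "characteristic"
    KEY_FIGURE = "key figure"
    UNIT = "unit"


# Both fields of the master data table are characteristics.
info_objects = {
    "Customer ID": InfoObjectType.CHARACTERISTIC,
    "Customer Name": InfoObjectType.CHARACTERISTIC,
}

for name, kind in info_objects.items():
    print(f"{name}: {kind.value}")
```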

Page 7: ETL Process

Transformation of Data

The attribute of Customer ID is Customer Name. In a database we define attributes for a primary key; similarly, we need to define the attributes for the master data field, i.e. for Customer ID. Once that is done, we do the mapping, i.e. the transformation: we map the fields of the DataSource to the corresponding InfoObjects.

Diagram: a Transformation connects the DataSource fields (Customer ID, Customer Name) to the Customer ID and Customer Name InfoObjects in the InfoProvider.
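The mapping idea can be sketched as a simple field-to-field lookup in Python. This is only an analogy: the target names `CUSTOMER_ID` and `CUSTOMER_NAME` and the function `transform` are made up for this example and are not real SAP BI identifiers.

```python
# Source field -> target InfoObject. The target names are invented for illustration.
TRANSFORMATION = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}


def transform(row, mapping=TRANSFORMATION):
    """Rename each source field to its mapped InfoObject; fail on unmapped fields."""
    unmapped = set(row) - set(mapping)
    if unmapped:
        raise KeyError(f"no mapping for source fields: {sorted(unmapped)}")
    return {mapping[field]: value for field, value in row.items()}


print(transform({"Customer ID": "105", "Customer Name": "Sainsbury"}))
```

Raising an error for an unmapped field is a design choice of this sketch: it makes a missing mapping visible instead of silently dropping data.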

Page 8: ETL Process

Loading

Once the mapping is done, the data has to be transferred from the DataSource (PSA table) to the InfoProvider (InfoObjects). This is done by the Data Transfer Process (DTP).

How? We create the DTP in the InfoProvider layer and activate it. After activation we execute the DTP. The data from the PSA table is then transferred to the respective InfoObjects.

Diagram: the DTP moves the data from the DataSource fields (Customer ID, Customer Name) through the Transformation into the Customer ID and Customer Name InfoObjects in the InfoProvider.
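A minimal Python sketch of the loading step, assuming the staged rows and a field mapping already exist. The helper `run_dtp`, the target names, and the use of a dictionary as the "InfoProvider" are all assumptions for illustration; a real DTP is created and executed inside SAP BI.

```python
# Rows as they sit in the staging table (values from the slides).
psa_rows = [
    {"Customer ID": "105", "Customer Name": "Sainsbury"},
    {"Customer ID": "102", "Customer Name": "Tesco"},
    {"Customer ID": "109", "Customer Name": "Waitrose"},
    {"Customer ID": "101", "Customer Name": "Asda"},
]

# Source field -> target InfoObject (target names invented for illustration).
mapping = {"Customer ID": "CUSTOMER_ID", "Customer Name": "CUSTOMER_NAME"}


def run_dtp(rows, mapping, key_field):
    """Hypothetical helper: apply the mapping, then store attributes under the key."""
    target = {}
    for row in rows:
        mapped = {mapping[field]: value for field, value in row.items()}
        key = mapped.pop(key_field)   # the master data key, e.g. the customer ID
        target[key] = mapped          # remaining fields are the key's attributes
    return target


customer_master = run_dtp(psa_rows, mapping, key_field="CUSTOMER_ID")
print(customer_master["105"]["CUSTOMER_NAME"])
```

Storing the name under its customer ID mirrors the earlier statement that Customer Name is an attribute of Customer ID.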

Page 9: ETL Process

Loading

The data has been moved to the respective InfoObjects according to the mapping, and it is ready for reporting from the InfoProvider layer.

InfoProvider

Customer ID InfoObject    Customer Name InfoObject
105                       Sainsbury
102                       Tesco
109                       Waitrose
101                       Asda

Page 10: ETL Process

Thank You