ETL Process
1. What is ETL?
Extraction, Transformation, Loading
Simple Example of ETL
Master Data
Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda
By
Karthikeyan Selvaraj
2. Let's say the master data table is a flat file, i.e. an Excel file stored on your computer.
We need to bring this table into the SAP BI platform.
SAP BI Platform
Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda
3. The first step is to extract the master data table, i.e. the Excel file, into the BI data warehouse.
Two components are needed to extract the data into the BI data warehouse:
1. DataSource
2. InfoPackage
1. DataSource
DataSource: it defines the data by answering two questions: what type of data is it, and where is the data located?
For example, once I finish this presentation, I will choose a location to save the file and also choose which version to save it in. Similarly, in the DataSource we define the data.
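As a rough illustration of the two questions a DataSource answers, the sketch below describes a flat file in plain Python. It is an analogy only, not SAP code: the class name `FlatFileDataSource`, its fields and the file name `customers.csv` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FlatFileDataSource:
    """Hypothetical description of a flat-file source (an analogy, not an SAP API)."""
    file_path: str   # where the data is located
    file_type: str   # what type of data it is, e.g. "csv"
    fields: tuple    # column names expected in the file

# Assumed example values; the slides only say the file is an Excel file on a computer.
customer_source = FlatFileDataSource(
    file_path="customers.csv",
    file_type="csv",
    fields=("Customer ID", "Customer Name"),
)
```

The object holds no data itself; it only records what the data is and where to find it.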
4. The second component: InfoPackage
What is an InfoPackage?
Put simply, an InfoPackage is like a key that opens the door and lets you into a room.
It brings data in from a legacy system or an SAP system. In our scenario, it brings the data from our computer into the BI data warehouse.
[Diagram: the Excel file on the computer (Customer ID and Customer Name columns) is described by the DataSource (what type of data? where is the data located?) and is brought into the BI data warehouse by the InfoPackage.]
5. We have now moved the master data table into the BI data warehouse by executing the InfoPackage.
Once the data arrives in BI, it is stored in a table called the PSA (Persistent Staging Area).
Data coming in from any source system is stored temporarily in the PSA.
[Diagram: executing the InfoPackage copies the rows of the Excel file on the computer (105 Sainsbury, 102 Tesco, 109 Waitrose, 101 Asda), as described by the DataSource, into the PSA table in the BI data warehouse.]
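The extraction step can be pictured with a small Python sketch: a function reads every row of a flat file and keeps an unchanged copy in a staging list, which stands in for the PSA. This is an analogy, not SAP code. The function name `run_info_package` is hypothetical, and a CSV string stands in for the Excel file so that the example is self-contained.

```python
import csv
import io

# Stand-in for the flat file on the local computer (values taken from the slides).
FLAT_FILE = """Customer ID,Customer Name
105,Sainsbury
102,Tesco
109,Waitrose
101,Asda
"""

def run_info_package(flat_file_text):
    """Hypothetical extraction step: read every row and return it unchanged."""
    reader = csv.DictReader(io.StringIO(flat_file_text))
    return [dict(row) for row in reader]

# The staging list plays the role of the PSA: rows are stored as they arrived.
psa = run_info_package(FLAT_FILE)
```

Nothing is renamed or converted at this stage; the values are still strings exactly as they appeared in the file.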
6. Transformation of Data
The first part of ETL, Extraction, is complete. Now we need to transform the data so that it is better optimized for reporting.
To do that, we define the fields of the table as InfoObjects. Our master data table has two fields, Customer ID and Customer Name, so in BI we define them as InfoObjects.
InfoObjects are divided into three types:
1. Characteristics – sorting keys such as company code, product ID, etc.
2. Key Figures – quantities, amounts or numbers of items; data that can be calculated with.
3. Units – currencies and units of measure.
Customer ID and Customer Name are characteristic InfoObjects.
[Diagram: the two PSA columns, Customer ID and Customer Name, are each defined as a characteristic InfoObject.]
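The three InfoObject types listed above can be modelled as a small enumeration. The sketch below is a plain-Python analogy, not SAP code. The names `InfoObjectType`, `InfoObject`, `CUSTOMER_ID` and `CUSTOMER_NAME` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class InfoObjectType(Enum):
    CHARACTERISTIC = "characteristic"  # sorting keys, e.g. company code, product ID
    KEY_FIGURE = "key figure"          # quantities, amounts, numbers of items
    UNIT = "unit"                      # currencies, units of measure

@dataclass(frozen=True)
class InfoObject:
    """Hypothetical stand-in for an InfoObject definition."""
    name: str
    kind: InfoObjectType

# Both fields of the customer master data table are characteristics.
customer_id = InfoObject("CUSTOMER_ID", InfoObjectType.CHARACTERISTIC)
customer_name = InfoObject("CUSTOMER_NAME", InfoObjectType.CHARACTERISTIC)
```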
7. Transformation of Data
Customer Name is an attribute of Customer ID.
Just as we define attributes for a primary key in a database, we need to define the attributes for the master data field, i.e. for Customer ID.
Once that is done, we do the mapping, i.e. the transformation: we map the fields of the DataSource to the corresponding InfoObjects.
[Diagram: the Transformation maps the DataSource fields to the InfoProvider: Customer ID to the Customer ID InfoObject, and Customer Name to the Customer Name InfoObject.]
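The mapping described on this slide is essentially a lookup from source field to target field. A minimal sketch in plain Python follows; it is an analogy, not SAP code. The target names `CUSTOMER_ID` and `CUSTOMER_NAME` and the function `apply_transformation` are hypothetical.

```python
# Hypothetical field mapping: DataSource column -> InfoObject name.
TRANSFORMATION = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}

def apply_transformation(row, mapping):
    """Rename the mapped fields of one staged row; unmapped fields are dropped."""
    return {target: row[source] for source, target in mapping.items()}

# One staged row with values from the slides.
staged_row = {"Customer ID": "105", "Customer Name": "Sainsbury"}
mapped_row = apply_transformation(staged_row, TRANSFORMATION)
```

Here the mapping is one-to-one, matching the slide, where each DataSource field goes to exactly one InfoObject.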
8. Loading
Once the mapping is done, the data has to be transferred from the DataSource (PSA table) to the InfoProvider (InfoObjects).
This is done by a process called the Data Transfer Process (DTP).
How? We create the DTP in the InfoProvider layer and activate it. After activation, we execute the DTP. The data from the PSA table is then transferred to the respective InfoObjects.
[Diagram: the DTP moves the data from the DataSource, through the Transformation, into the Customer ID and Customer Name InfoObjects in the InfoProvider.]
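To make the load step concrete, the sketch below moves staged rows into a target structure keyed by Customer ID, with Customer Name stored as its attribute. It is a plain-Python analogy, not SAP code. The function name `run_dtp` and the dictionary layout are assumptions made for illustration.

```python
# Staged rows as they might sit in the PSA (values from the slides).
psa_rows = [
    {"Customer ID": "105", "Customer Name": "Sainsbury"},
    {"Customer ID": "102", "Customer Name": "Tesco"},
    {"Customer ID": "109", "Customer Name": "Waitrose"},
    {"Customer ID": "101", "Customer Name": "Asda"},
]

def run_dtp(rows):
    """Hypothetical load step: key each row by Customer ID, with Customer Name as its attribute."""
    master_data = {}
    for row in rows:
        customer_id = int(row["Customer ID"])
        master_data[customer_id] = {"customer_name": row["Customer Name"]}
    return master_data

customer_master = run_dtp(psa_rows)
```

After the load, a lookup such as `customer_master[102]` returns the attributes stored for that customer.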
9. Loading
The data is moved to the respective InfoObjects according to the mapping and is ready for reporting from the InfoProvider layer.
InfoProvider
Customer ID InfoObject    Customer Name InfoObject
105                       Sainsbury
102                       Tesco
109                       Waitrose
101                       Asda