Informatica Software Architecture illustrated

Informatica's ETL product, known as Informatica PowerCenter, consists of three main components.

1. Informatica PowerCenter Client Tools:

These are the development tools installed on the developer's machine. They enable a developer to:

        Define the transformation process, known as a mapping (Designer)
        Define run-time properties for a mapping, known as sessions (Workflow Manager)
        Monitor the execution of sessions (Workflow Monitor)
        Manage the repository, which is useful for administrators (Repository Manager)
        Report on metadata (Metadata Reporter)

2. Informatica PowerCenter Repository:

The repository is the heart of the Informatica tools. It is a kind of data inventory where all the
data related to mappings, sources, targets, etc. is kept. This is the place where all the metadata
for your application is stored. All the client tools and the Informatica Server fetch data from the
repository. An Informatica client and server without a repository is like a PC without memory or a
hard disk: it has the ability to process data but has no data to process. The repository can be
treated as the backend of Informatica.

3. Informatica PowerCenter Server:

The server is where all execution takes place. It makes physical connections to sources and
targets, fetches data, applies the transformations defined in the mapping, and loads the data into
the target system.
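
The server's role described above (connect, fetch, transform, load) can be pictured with a small Python sketch. This is only an analogy and not PowerCenter code: the `source_rows` list, the `apply_mapping` function, and the `target` list are hypothetical stand-ins for a source connection, a mapping, and a target system.

```python
# Hypothetical stand-ins: one list plays the role of the source,
# another list plays the role of the target system.
source_rows = [
    {"id": 1, "name": "alice", "amount": "10.50"},
    {"id": 2, "name": "bob", "amount": "3.25"},
]

def apply_mapping(row):
    """Transformation logic that a mapping would define (illustrative)."""
    return {
        "id": row["id"],
        "name": row["name"].upper(),
        "amount": float(row["amount"]),
    }

def run_session(source, target):
    """Fetch each source row, transform it, and load it into the target."""
    for row in source:
        target.append(apply_mapping(row))
    return len(target)

target = []
loaded = run_session(source_rows, target)
```

In this picture the mapping holds only the logic, while the session run is what actually moves the rows.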

This architecture is illustrated by a diagram of the supported sources and targets:

[Diagram: sources and targets connected through Informatica PowerCenter]

Sources (including remote sources):

        Standard: RDBMS, Flat Files, XML, ODBC
        Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
        EAI: MQ Series, Tibco, JMS, Web Services
        Legacy: Mainframes (DB2, VSAM, IMS, IDMS, Adabas), AS400 (DB2, Flat File)

Targets (including remote targets):

        Standard: RDBMS, Flat Files, XML, ODBC
        Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
        EAI: MQ Series, Tibco, JMS, Web Services
        Legacy: Mainframes (DB2), AS400 (DB2)

This is sufficient knowledge to start with Informatica, so let's go straight to development in
Informatica.


Informatica Product Line

Informatica is a powerful ETL tool from Informatica Corporation, a leading provider of
enterprise data integration and ETL software.

The important products provided by Informatica Corporation are listed below:

       Power Center
       Power Mart
       Power Exchange
       Power Center Connect
       Power Channel
       Metadata Exchange
       Power Analyzer
       Super Glue


Power Center & Power Mart: Power Mart is a departmental version of Informatica for
building, deploying, and managing data warehouses and data marts. Power Center is used for
corporate enterprise data warehouses, and Power Mart is used for departmental data warehouses
such as data marts. Power Center supports global repositories and networked repositories and can
be connected to several sources. Power Mart supports a single repository and can be connected
to fewer sources than Power Center. Power Mart can grow into an enterprise implementation, and
its codeless environment aids developer productivity.

Power Exchange: Informatica Power Exchange, as a standalone service or along with Power
Center, helps organizations leverage data by avoiding manual coding of data extraction
programs. Power Exchange supports batch, real-time, and changed data capture options on
mainframe (DB2, VSAM, IMS, etc.) and mid-range (AS400 DB2, etc.) systems, for relational databases
(Oracle, SQL Server, DB2, etc.), and for flat files on Unix, Linux, and Windows systems.

Power Center Connect: This is an add-on to Informatica Power Center. It helps to extract data and
metadata from ERP systems and other third-party applications such as IBM's MQSeries, PeopleSoft,
SAP, and Siebel.

Power Channel: This helps to transfer large amounts of encrypted and compressed data over
LAN and WAN, through firewalls, to transfer files over FTP, etc.

Metadata Exchange: Metadata Exchange enables organizations to take advantage of the time
and effort already invested in defining data structures within their IT environment when used
with Power Center. For example, an organization may be using data modeling tools, such as
Erwin, Embarcadero, Oracle Designer, or Sybase Power Designer, for developing data models.
The functional and technical teams will have spent much time and effort creating the data
model's data structures (tables, columns, data types, procedures, functions, triggers, etc.). By
using Metadata Exchange, these data structures can be imported into Power Center to identify
source and target mappings, which saves time and effort. There is no need for the Informatica
developer to create these data structures once again.

Power Analyzer: Power Analyzer provides organizations with reporting facilities.
PowerAnalyzer makes accessing, analyzing, and sharing enterprise data simple and easily
available to decision makers. PowerAnalyzer enables them to gain insight into business processes
and develop business intelligence.

With PowerAnalyzer, an organization can extract, filter, format, and analyze corporate
information from data stored in a data warehouse, data mart, operational data store, or other data
storage models. PowerAnalyzer works best with a dimensional data warehouse in a relational
database. It can also run reports on data in any table in a relational database that does not
conform to the dimensional model.

Super Glue: SuperGlue is used for loading metadata from several sources into a centralized place.
Reports can be run against SuperGlue to analyze the metadata.


Note: This is not a complete tutorial on Informatica. To know more about Informatica, visit its
official website, www.informatica.com

Informatica Transformations

A transformation is a repository object that generates, modifies, or passes data. The Designer
provides a set of transformations that perform specific functions. For example, an Aggregator
transformation performs calculations on groups of data.

Transformations can be of two types:

Active Transformation

An active transformation can change the number of rows that pass through the transformation,
change the transaction boundary, or change the row type. For example, Filter, Transaction Control,
and Update Strategy are active transformations.

The key point to note is that the Designer does not allow you to connect multiple active
transformations, or an active and a passive transformation, to the same downstream transformation
or transformation input group, because the Integration Service may not be able to concatenate the
rows passed by active transformations. However, the Sequence Generator transformation (SGT) is an
exception to this rule. An SGT does not receive data; it generates unique numeric values. As a
result, the Integration Service does not encounter problems concatenating rows passed by an SGT
and an active transformation.

Passive Transformation

A passive transformation does not change the number of rows that pass through it, maintains the
transaction boundary, and maintains the row type.

The key point to note is that the Designer allows you to connect multiple transformations to the
same downstream transformation or transformation input group only if all transformations in the
upstream branches are passive. The transformation that originates the branch can be active or
passive.
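
The difference between active and passive behaviour can be sketched in Python. This is an illustration of the row-count rule only, not Informatica code: `filter_rows` and `add_full_name` are hypothetical functions standing in for a Filter and an Expression transformation.

```python
def filter_rows(rows, predicate):
    """Active: the number of output rows can differ from the input."""
    return [row for row in rows if predicate(row)]

def add_full_name(rows):
    """Passive: exactly one output row per input row."""
    return [dict(row, full_name=row["first"] + " " + row["last"]) for row in rows]

rows = [
    {"first": "Ada", "last": "Lovelace", "dept": "math"},
    {"first": "Alan", "last": "Turing", "dept": "cs"},
    {"first": "Grace", "last": "Hopper", "dept": "cs"},
]

active_out = filter_rows(rows, lambda r: r["dept"] == "cs")   # 2 rows remain
passive_out = add_full_name(rows)                             # still 3 rows
```

Because `active_out` and `passive_out` no longer have the same number of rows, there is no safe way to line them up row by row, which is the concatenation problem described above.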
Transformations can be Connected or UnConnected to the data flow.

Connected Transformation

A connected transformation is connected to other transformations or directly to the target table
in the mapping.

UnConnected Transformation

An unconnected transformation is not connected to other transformations in the mapping. It is
called within another transformation, and returns a value to that transformation.

Informatica Transformations

The following transformations are available in Informatica:

       Aggregator Transformation
       Application Source Qualifier Transformation
       Custom Transformation
       Data Masking Transformation
       Expression Transformation
       External Procedure Transformation
       Filter Transformation
       HTTP Transformation
       Input Transformation
       Java Transformation
       Joiner Transformation
       Lookup Transformation
       Normalizer Transformation
       Output Transformation
       Rank Transformation
       Reusable Transformation
       Router Transformation
       Sequence Generator Transformation
       Sorter Transformation
       Source Qualifier Transformation
       SQL Transformation
       Stored Procedure Transformation
       Transaction Control Transformation
       Union Transformation
       Unstructured Data Transformation
       Update Strategy Transformation
       XML Generator Transformation
       XML Parser Transformation
       XML Source Qualifier Transformation
       Advanced External Procedure Transformation
       External Transformation


In the following pages, we explain the above Informatica transformations and their
significance in the ETL process in detail.


Aggregator Transformation

The Aggregator transformation performs aggregate functions such as average, sum, and count on
multiple rows or groups. The Integration Service performs these calculations as it reads, and it
stores group and row data in an aggregate cache. It is an Active & Connected transformation.

Difference between the Aggregator and Expression transformations: the Expression transformation
permits you to perform calculations on a row-by-row basis only. In the Aggregator you can perform
calculations on groups.

For example, an Aggregator transformation may have the following ports: State, State_Count,
Previous_State, and State_Counter.

Components: Aggregate Cache, Aggregate Expression, Group by port, Sorted input.

Aggregate Expressions: These are allowed only in Aggregator transformations. They can include
conditional clauses and non-aggregate functions, and can also include one aggregate function
nested within another aggregate function.

Aggregate Functions: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE,
STDDEV, SUM, VARIANCE
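
The contrast between row-by-row and group-level calculation can be sketched in Python. This is a minimal illustration, not Informatica's implementation: the `sales` rows, the function names, and the dictionary used as a stand-in for the aggregate cache are all assumptions made for the example.

```python
from collections import defaultdict

sales = [
    {"state": "CA", "amount": 100.0},
    {"state": "NY", "amount": 40.0},
    {"state": "CA", "amount": 60.0},
]

def expression_step(rows, tax_rate):
    """Row-by-row calculation: each output depends on one input row."""
    return [dict(r, tax=r["amount"] * tax_rate) for r in rows]

def aggregator_step(rows, group_by, value):
    """Group-level calculation: SUM and COUNT per group-by key."""
    # The dictionary plays the role of the aggregate cache.
    cache = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for r in rows:
        entry = cache[r[group_by]]
        entry["sum"] += r[value]
        entry["count"] += 1
    return dict(cache)

with_tax = expression_step(sales, 0.5)                 # three rows in, three rows out
totals = aggregator_step(sales, "state", "amount")     # one entry per state
```

The expression step returns as many rows as it receives, while the aggregator step returns one result per group, which is why the Aggregator is classed as active.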

Application Source Qualifier Transformation

It represents the rows that the Integration Service reads from an application, such as an ERP
source, when it runs a session. It is an Active & Connected transformation.

Custom Transformation

It works with procedures you create outside the Designer interface to extend PowerCenter
functionality. It calls a procedure from a shared library or DLL. It is an Active or Passive &
Connected transformation.

You can use the Custom transformation to create transformations that require multiple input groups
and multiple output groups. The Custom transformation allows you to develop the transformation
logic in a procedure. Some of
the PowerCenter transformations are built using the Custom transformation. Rules that apply to
Custom transformations, such as blocking rules, also apply to transformations built using Custom
transformations. PowerCenter provides two sets of functions called generated and API functions.
The Integration Service uses generated functions to interface with the procedure. When you
create a Custom transformation and generate the source code files, the Designer includes the
generated functions in the files. Use the API functions in the procedure code to develop the
transformation logic.

Difference between the Custom and External Procedure transformations: in the Custom
transformation, input and output functions occur separately. The Integration Service passes the
input data to the procedure using an input function. The output function is a separate function
that you must enter in the procedure code to pass output data to the Integration Service. In
contrast, in the External Procedure transformation, an external procedure function does both input
and output, and its parameters consist of all the ports of the transformation.

Data Masking Transformation

Passive & Connected. It is used to change sensitive production data to realistic test data for
non-production environments. It creates masked data for development, testing, training, and data
mining. Data relationships and referential integrity are maintained in the masked data.
For example, it returns a masked value that has a realistic format for an SSN, credit card number,
birthdate, phone number, etc., but is not a valid value. Masking types: Key Masking, Random
Masking, Expression Masking, Special Mask format. Default is no masking.
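
The idea of format-preserving, repeatable masking can be sketched in Python. This is not the algorithm the Data Masking transformation uses: `mask_digits` and its `seed` parameter are hypothetical, and the sketch only shows how a value can keep its layout while the same input always maps to the same masked output, so that relationships between tables survive. It makes no attempt to guarantee that the result is an invalid number.

```python
import hashlib

def mask_digits(value, seed="demo-seed"):
    """Replace each digit with a pseudo-random digit, keeping the format.

    The same input and seed always give the same output, so repeated
    values stay consistent wherever they appear (illustrative only).
    """
    digest = hashlib.sha256((seed + value).encode("utf-8")).hexdigest()
    out = []
    i = 0
    for ch in value:
        if ch.isdigit():
            # Derive a replacement digit from the next hex character.
            out.append(str(int(digest[i % len(digest)], 16) % 10))
            i += 1
        else:
            out.append(ch)  # keep separators such as '-'
    return "".join(out)

masked = mask_digits("123-45-6789")
```

Changing the seed changes every masked value, which is one simple way to keep masked data sets from different environments unrelated to each other.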

Expression Transformation

Passive & Connected. It is used to perform non-aggregate functions, i.e. to calculate values in a
single row. Examples: calculating the discount on each product, concatenating first and last
names, or converting a date to a string field.

You can create an Expression transformation in the Transformation Developer or the Mapping
Designer. Components: Transformation, Ports, Properties, Metadata Extensions.

External Procedure

Passive & Connected or Unconnected. It works with procedures you create outside of the
Designer interface to extend PowerCenter functionality. You can create complex functions
within a DLL or in the COM layer of Windows and bind them to the External Procedure
transformation. To get this kind of extensibility, use the Transformation Exchange (TX) dynamic
invocation interface built into PowerCenter. You must be an experienced programmer to use TX and
to use multi-threaded code in external procedures.

Filter Transformation

Active & Connected. It passes rows that meet the specified filter condition and removes the rows
that do not meet the condition. For example, it can find all the employees who work in New York,
or all the faculty members teaching Chemistry in a state. The input ports for the filter must come
from a single transformation. You cannot concatenate ports from more than one transformation into
the Filter transformation. Components: Transformation, Ports, Properties, Metadata Extensions.

HTTP Transformation

Passive & Connected. It allows you to connect to an HTTP server to use its services and
applications. With an HTTP transformation, the Integration Service connects to the HTTP server
and issues a request to retrieve data from or post data to the server, and passes the response to
the target or a downstream transformation in the mapping.
Authentication types: Basic, Digest, and NTLM. Methods: GET, POST, and SIMPLE POST.

Java Transformation

Active or Passive & Connected. It provides a simple native programming interface to define
transformation functionality with the Java programming language. You can use the Java
transformation to quickly define simple or moderately complex transformation functionality
without advanced knowledge of the Java programming language or an external Java
development environment.

Joiner Transformation

Active & Connected. It is used to join data from two related heterogeneous sources residing in
different locations or to join data from the same source. In order to join two sources, there must
be at least one pair of matching columns between the sources, and you must specify one source as
the master and the other as the detail. For example, you can join a flat file and a relational
source, two flat files, or a relational source and an XML source.
The Joiner transformation supports the following types of joins:

       Normal

       Normal join discards all the rows of data from the master and detail source that do not
       match, based on the condition.

       Master Outer

       Master outer join discards all the unmatched rows from the master source and keeps all
       the rows from the detail source and the matching rows from the master source.

       Detail Outer

       Detail outer join keeps all rows of data from the master source and the matching rows
       from the detail source. It discards the unmatched rows from the detail source.

       Full Outer

       Full outer join keeps all rows of data from both the master and detail sources.
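
The four join types above can be sketched in Python over two small lists of rows. This is only an illustration of which rows each join type keeps, not how the Joiner transformation is implemented: the `join` function, its `join_type` strings, and the sample `master` and `detail` rows are assumptions made for the example.

```python
def join(master, detail, key, join_type="normal"):
    """Join two lists of dicts on `key` using Joiner-style join types."""
    master_keys = {m[key] for m in master}
    detail_keys = {d[key] for d in detail}
    result = []
    # Matching rows appear in every join type.
    for d in detail:
        for m in master:
            if m[key] == d[key]:
                result.append({**m, **d})
    # Master outer keeps unmatched detail rows; full outer does too.
    if join_type in ("master_outer", "full_outer"):
        result += [d for d in detail if d[key] not in master_keys]
    # Detail outer keeps unmatched master rows; full outer does too.
    if join_type in ("detail_outer", "full_outer"):
        result += [m for m in master if m[key] not in detail_keys]
    return result

master = [{"id": 1, "dept": "HR"}, {"id": 2, "dept": "IT"}]
detail = [{"id": 2, "name": "Bob"}, {"id": 3, "name": "Eve"}]

counts = {t: len(join(master, detail, "id", t))
          for t in ("normal", "master_outer", "detail_outer", "full_outer")}
```

With this sample data only `id` 2 matches, so the normal join yields one row, each one-sided outer join yields two, and the full outer join yields three. In this sketch an unmatched row simply lacks the columns of the other side, where a real join would carry null values for them.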

Limitations on the pipelines you connect to the Joiner transformation:
*You cannot use a Joiner transformation when either input pipeline contains an Update Strategy
transformation.
*You cannot use a Joiner transformation if you connect a Sequence Generator transformation
directly before the Joiner transformation.

Lookup Transformation

Passive & Connected or UnConnected. It is used to look up data in a flat file, relational table,
view, or synonym. It compares lookup transformation ports (input ports) to the source column
values based on the lookup condition. The returned values can then be passed to other
transformations. You can create a lookup definition from a source qualifier, and you can use
multiple Lookup transformations in a mapping.

You can perform the following tasks with a Lookup transformation:
*Get a related value. Retrieve a value from the lookup table based on a value in the source. For
example, the source has an employee ID. Retrieve the employee name from the lookup table.
*Perform a calculation. Retrieve a value from a lookup table and use it in a calculation. For
example, retrieve a sales tax percentage, calculate a tax, and return the tax to a target.
*Update slowly changing dimension tables. Determine whether rows exist in a target.
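
The three tasks above can be sketched in Python with dictionaries standing in for the lookup source. This is an analogy rather than Informatica code: `employee_lookup`, `tax_rate_lookup`, the function names, and the sample row are all hypothetical.

```python
# Hypothetical lookup sources: employee ID -> name, and state -> tax rate.
employee_lookup = {101: "Ana", 102: "Ben"}
tax_rate_lookup = {"CA": 0.25, "NY": 0.5}

def get_related_value(row):
    """Get a related value: fetch the employee name for an employee ID."""
    return dict(row, emp_name=employee_lookup.get(row["emp_id"]))

def perform_calculation(row):
    """Perform a calculation: look up a tax rate and compute the tax."""
    rate = tax_rate_lookup.get(row["state"], 0.0)
    return dict(row, tax=row["amount"] * rate)

def exists_in_target(row, target_keys):
    """Check whether a row already exists in the target (insert vs update)."""
    return row["emp_id"] in target_keys

row = {"emp_id": 101, "state": "CA", "amount": 200.0}
enriched = perform_calculation(get_related_value(row))
action = "update" if exists_in_target(row, {101, 205}) else "insert"
```

Holding the lookup data in a dictionary resembles a cached lookup: the source is read once and each row is then matched in memory.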

Lookup Components: Lookup source, Ports, Properties, Condition.
Types of Lookup:
1) Relational or flat file lookup.
2) Pipeline lookup.
3) Cached or uncached lookup.
4) Connected or unconnected lookup.

A Call to Action for Generative AI in 2024Results
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Informatica ETL Architecture Diagram

Informatica Software Architecture illustrated

The Informatica ETL product, known as Informatica PowerCenter, consists of three main components.

1. Informatica PowerCenter Client Tools:

These are the development tools installed on the developer's machine. They enable a developer to

        Define the transformation process, known as a mapping (Designer)
        Define run-time properties for a mapping, known as sessions (Workflow Manager)
        Monitor execution of sessions (Workflow Monitor)
        Manage the repository, useful for administrators (Repository Manager)
        Report metadata (Metadata Reporter)

2. Informatica PowerCenter Repository:

The repository is the heart of the Informatica tools. It is a kind of data inventory where all the data related to mappings, sources, targets, etc. is kept. This is where all the metadata for your application is stored. All the client tools and the Informatica Server fetch data from the repository. An Informatica client and server without a repository is like a PC without memory or a hard disk: it has the ability to process data but has no data to process. The repository can be treated as the back end of Informatica.

3. Informatica PowerCenter Server:

The server is where all execution takes place. It makes physical connections to sources and targets, fetches data, applies the transformations specified in the mapping, and loads the data into the target system.

This architecture is visually explained in the diagram below:
Sources
        Standard: RDBMS, Flat Files, XML, ODBC
        Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
        EAI: MQ Series, Tibco, JMS, Web Services
        Legacy: Mainframes (DB2, VSAM, IMS, IDMS, Adabas), AS400 (DB2, Flat File)
        Remote Sources

Targets
        Standard: RDBMS, Flat Files, XML, ODBC
        Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
        EAI: MQ Series, Tibco, JMS, Web Services
        Legacy: Mainframes (DB2), AS400 (DB2)
        Remote Targets

This is sufficient knowledge to start with Informatica, so let's go straight to development in Informatica.

Informatica Product Line

Informatica is a powerful ETL tool from Informatica Corporation, a leading provider of enterprise data integration and ETL software. The important products provided by Informatica Corporation are listed below:

        PowerCenter
        PowerMart
        PowerExchange
        PowerCenter Connect
        PowerChannel
        Metadata Exchange
        PowerAnalyzer
        SuperGlue

PowerCenter & PowerMart:

PowerMart is a departmental version of Informatica for building, deploying, and managing data warehouses and data marts. PowerCenter is used for corporate enterprise data warehouses, and PowerMart is used for departmental data warehouses such as data marts. PowerCenter supports global repositories and networked repositories, and it can be connected to several sources. PowerMart supports a single repository and can be connected to fewer sources than PowerCenter. PowerMart can grow into an enterprise implementation, and its codeless environment supports developer productivity.

PowerExchange:

Informatica PowerExchange, as a stand-alone service or along with PowerCenter, helps organizations leverage data by avoiding manual coding of data extraction programs. PowerExchange supports batch, real-time, and changed data capture options for mainframe (DB2, VSAM, IMS, etc.), mid-range (AS400 DB2, etc.), and relational databases (Oracle, SQL Server, DB2, etc.), and for flat files on Unix, Linux, and Windows systems.

PowerCenter Connect:

This is an add-on to Informatica PowerCenter. It helps to extract data and metadata from applications such as IBM MQSeries, PeopleSoft, SAP, and Siebel, and from other third-party applications.

PowerChannel:

This helps to transfer large amounts of encrypted and compressed data over LAN and WAN, through firewalls, and to transfer files over FTP, etc.

Metadata Exchange:

Metadata Exchange, when used with PowerCenter, enables organizations to take advantage of the time and effort already invested in defining data structures within their IT environment. For example, an organization may be using data modeling tools such as Erwin, Embarcadero, Oracle Designer, or Sybase PowerDesigner for developing data models.
Functional and technical teams will have spent considerable time and effort creating the data model's data structures (tables, columns, data types, procedures, functions, triggers, etc.). By using Metadata Exchange, these data structures can be imported into PowerCenter to identify source and target mappings, which saves that time and effort. There is no need for the Informatica developer to create these data structures once again.

PowerAnalyzer:

PowerAnalyzer provides organizations with reporting facilities. It makes accessing, analyzing, and sharing enterprise data simple and easily available to decision makers. PowerAnalyzer enables an organization to gain insight into business processes and develop business intelligence. With PowerAnalyzer, an organization can extract, filter, format, and analyze corporate information from data stored in a data warehouse, data mart, operational data store, or other data storage model. PowerAnalyzer works best with a dimensional data warehouse in a relational database. It can also run reports on data in any table in a relational database that does not conform to the dimensional model.

SuperGlue:

SuperGlue is used for loading metadata into a centralized place from several sources. Reports can be run against SuperGlue to analyze the metadata.

Note: This is not a complete tutorial on Informatica. We will add more tips and guidelines on Informatica in the near future. Please check back soon. To know more about Informatica, visit its official website, www.informatica.com.

Informatica Transformations

A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.

Transformations can be of two types:

Active Transformation

An active transformation can change the number of rows that pass through the transformation, change the transaction boundary, or change the row type. For example, Filter, Transaction Control, and Update Strategy are active transformations.

The key point to note is that the Designer does not allow you to connect multiple active transformations, or an active and a passive transformation, to the same downstream transformation or transformation input group, because the Integration Service may not be able to concatenate the rows passed by active transformations. However, the Sequence Generator transformation (SGT) is an exception to this rule. An SGT does not receive data; it generates unique numeric values. As a result, the Integration Service does not encounter problems concatenating rows passed by an SGT and an active transformation.

Passive Transformation

A passive transformation does not change the number of rows that pass through it, maintains the transaction boundary, and maintains the row type.

The key point to note is that the Designer allows you to connect multiple transformations to the same downstream transformation or transformation input group only if all transformations in the upstream branches are passive. The transformation that originates the branch can be active or passive.
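
The difference between active and passive transformations, and why parallel branches can only be merged when they stay row-aligned, can be sketched in plain Python. This is an illustrative model only, not PowerCenter code: the helper names (`expression`, `filter_rows`, `can_concatenate`) and the sample data are hypothetical, and the row-count check is a stand-in for the Designer's rule, which is decided by transformation type at design time rather than by counting rows at run time.

```python
# Illustrative model only: these helpers are hypothetical, not a PowerCenter API.

def expression(rows, fn):
    """Passive: emits exactly one output row for every input row."""
    return [fn(dict(row)) for row in rows]

def filter_rows(rows, condition):
    """Active: may drop rows, so the output row count can differ from the input."""
    return [row for row in rows if condition(row)]

def can_concatenate(*branches):
    """Branches can be merged port-by-port only if they are still row-aligned.
    Row count is used here as a simple stand-in for that alignment."""
    return len({len(branch) for branch in branches}) == 1

def add_discount(row):
    row["discount"] = round(row["price"] * 0.1, 2)
    return row

rows = [
    {"id": 1, "price": 100.0},
    {"id": 2, "price": 40.0},
    {"id": 3, "price": 75.0},
]

passive_out = expression(rows, add_discount)                 # 3 rows in, 3 rows out
active_out = filter_rows(rows, lambda r: r["price"] >= 50)   # 3 rows in, 2 rows out

print(can_concatenate(rows, passive_out))        # passive branches stay aligned
print(can_concatenate(passive_out, active_out))  # an active branch breaks alignment
```

Once the filter has dropped a row, there is no longer a one-to-one pairing between the rows of the two branches, which is the concatenation problem the rule above guards against.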
Transformations can be connected or unconnected to the data flow.

Connected Transformation

A connected transformation is connected to other transformations or directly to the target table in the mapping.

Unconnected Transformation

An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation and returns a value to that transformation.

Informatica Transformations

The following transformations are available in Informatica:

        Aggregator Transformation
        Application Source Qualifier Transformation
        Custom Transformation
        Data Masking Transformation
        Expression Transformation
        External Procedure Transformation
        Filter Transformation
        HTTP Transformation
        Input Transformation
        Java Transformation
        Joiner Transformation
        Lookup Transformation
        Normalizer Transformation
        Output Transformation
        Rank Transformation
        Reusable Transformation
        Router Transformation
        Sequence Generator Transformation
        Sorter Transformation
        Source Qualifier Transformation
        SQL Transformation
        Stored Procedure Transformation
        Transaction Control Transformation
        Union Transformation
        Unstructured Data Transformation
        Update Strategy Transformation
        XML Generator Transformation
        XML Parser Transformation
        XML Source Qualifier Transformation
        Advanced External Procedure Transformation
        External Transformation

In the following pages, we explain the above Informatica transformations and their significance in the ETL process in detail.

Aggregator Transformation

The Aggregator transformation performs aggregate functions such as average, sum, and count on multiple rows or groups. The Integration Service performs these calculations as it reads, and stores group and row data in an aggregate cache. It is an Active & Connected transformation.

Difference between the Aggregator and Expression transformations: the Expression transformation lets you perform calculations on a row-by-row basis only, whereas in the Aggregator you can perform calculations on groups.

For example, an Aggregator transformation might have the ports State, State_Count, Previous_State, and State_Counter.

Components: Aggregate Cache, Aggregate Expression, Group By port, Sorted Input.

Aggregate expressions:

        Are allowed only in Aggregator transformations.
        Can include conditional clauses and non-aggregate functions.
        Can also include one aggregate function nested within another aggregate function.

Aggregate functions: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE

Application Source Qualifier Transformation

Represents the rows that the Integration Service reads from an application, such as an ERP source, when it runs a session. It is an Active & Connected transformation.

Custom Transformation

It works with procedures you create outside the Designer interface to extend PowerCenter functionality. It calls a procedure from a shared library or DLL. It is Active or Passive & Connected. You can use a Custom transformation to create transformations that require multiple input groups and multiple output groups.
The Custom transformation allows you to develop the transformation logic in a procedure. Some of the PowerCenter transformations are built using the Custom transformation. Rules that apply to Custom transformations, such as blocking rules, also apply to transformations built using Custom transformations. PowerCenter provides two sets of functions, called generated functions and API functions. The Integration Service uses generated functions to interface with the procedure. When you create a Custom transformation and generate the source code files, the Designer includes the generated functions in the files. Use the API functions in the procedure code to develop the transformation logic.

Difference between the Custom and External Procedure transformations: in the Custom transformation, input and output functions occur separately. The Integration Service passes the input data to the procedure using an input function. The output function is a separate function that you must enter in the procedure code to pass output data to the Integration Service. In contrast, in the External Procedure transformation, an external procedure function does both input and output, and its parameters consist of all the ports of the transformation.

Data Masking Transformation

Passive & Connected. It is used to change sensitive production data into realistic test data for non-production environments. It creates masked data for development, testing, training, and data mining. Data relationships and referential integrity are maintained in the masked data.

For example, it returns a masked value that has a realistic format for an SSN, credit card number, birth date, phone number, etc., but is not a valid value. Masking types: Key Masking, Random Masking, Expression Masking, Special Mask Format. The default is no masking.

Expression Transformation

Passive & Connected. It is used to perform non-aggregate functions, i.e. to calculate values in a single row.
Example: to calculate the discount on each product, to concatenate first and last names, or to convert a date to a string field. You can create an Expression transformation in the Transformation Developer or the Mapping Designer. Components: Transformation, Ports, Properties, Metadata Extensions.

External Procedure

Passive & Connected or Unconnected. It works with procedures you create outside of the Designer interface to extend PowerCenter functionality. You can create complex functions within a DLL or in the COM layer of Windows and bind them to an External Procedure transformation. To get this kind of extensibility, use the Transformation Exchange (TX) dynamic invocation interface built into PowerCenter. You must be an experienced programmer to use TX and to use multi-threaded code in external procedures.

Filter Transformation

Active & Connected. It passes rows that meet the specified filter condition and removes the rows that do not meet the condition. For example, to find all the employees who work in New York, or to find all the faculty members teaching Chemistry in a state. The input ports for the filter must come from a single transformation. You cannot concatenate ports from more than one transformation into the Filter transformation. Components: Transformation, Ports, Properties, Metadata Extensions.

HTTP Transformation

Passive & Connected. It allows you to connect to an HTTP server to use its services and applications. With an HTTP transformation, the Integration Service connects to the HTTP server and issues a request to retrieve data from, or post data to, the server; the result is passed to the target or a downstream transformation in the mapping. Authentication types: Basic, Digest, and NTLM. Methods: GET, POST, and SIMPLE POST.

Java Transformation

Active or Passive & Connected. It provides a simple native programming interface to define transformation functionality with the Java programming language. You can use the Java transformation to quickly define simple or moderately complex transformation functionality without advanced knowledge of the Java programming language or an external Java development environment.

Joiner Transformation

Active & Connected. It is used to join data from two related heterogeneous sources residing in different locations, or to join data from the same source. In order to join two sources, there must be at least one pair of matching columns between the sources, and you must specify one source as the master and the other as the detail. For example: to join a flat file and a relational source, to join two flat files, or to join a relational source and an XML source.

The Joiner transformation supports the following types of joins:

Normal: A normal join discards all the rows of data from the master and detail sources that do not match, based on the condition.
Master Outer: A master outer join discards all the unmatched rows from the master source and keeps all the rows from the detail source and the matching rows from the master source.
Detail Outer: A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
Full Outer: A full outer join keeps all rows of data from both the master and detail sources.

Limitations on the pipelines you connect to the Joiner transformation:

        You cannot use a Joiner transformation when either input pipeline contains an Update Strategy transformation.
        You cannot use a Joiner transformation if you connect a Sequence Generator transformation directly before the Joiner transformation.

Lookup Transformation

Passive & Connected or Unconnected. It is used to look up data in a flat file, relational table, view, or synonym. It compares Lookup transformation ports (input ports) to the lookup source column values based on the lookup condition. The returned values can then be passed to other transformations. You can create a lookup definition from a source qualifier, and you can use multiple Lookup transformations in a mapping.

You can perform the following tasks with a Lookup transformation:

        Get a related value. Retrieve a value from the lookup table based on a value in the source. For example, the source has an employee ID. Retrieve the employee name from the lookup table.
        Perform a calculation. Retrieve a value from a lookup table and use it in a calculation. For example, retrieve a sales tax percentage, calculate a tax, and return the tax to a target.
        Update slowly changing dimension tables. Determine whether rows exist in a target.

Lookup components: Lookup source, Ports, Properties, Condition.

Types of lookup:

        1) Relational or flat file lookup.
        2) Pipeline lookup.
        3) Cached or uncached lookup.
        4) Connected or unconnected lookup.
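
The first two Lookup tasks listed above ("get a related value" and "perform a calculation") can be sketched in plain Python as a cached lookup against an in-memory table. This is a conceptual model only, not PowerCenter code: the function names (`build_cache`, `lookup`, `enrich`), the column names, and the sample rows are all hypothetical.

```python
# Conceptual model of a cached lookup; all names and data here are hypothetical.

def build_cache(lookup_rows, key):
    """Read the lookup source once and index it by the lookup condition column."""
    return {row[key]: row for row in lookup_rows}

def lookup(cache, key_value, return_port, default=None):
    """Return one column of the matching lookup row, or a default when no row matches."""
    row = cache.get(key_value)
    return default if row is None else row[return_port]

employees = [
    {"emp_id": 1, "emp_name": "Ada"},
    {"emp_id": 2, "emp_name": "Grace"},
]
tax_rates = [
    {"state": "NY", "tax_pct": 8.0},
    {"state": "OR", "tax_pct": 0.0},
]

emp_cache = build_cache(employees, "emp_id")
tax_cache = build_cache(tax_rates, "state")

source_rows = [
    {"emp_id": 1, "state": "NY", "amount": 200.0},
    {"emp_id": 3, "state": "OR", "amount": 50.0},  # emp_id 3 has no match
]

def enrich(row):
    out = dict(row)
    # Get a related value: employee name for the employee ID in the source row.
    out["emp_name"] = lookup(emp_cache, row["emp_id"], "emp_name")
    # Perform a calculation: look up the tax percentage, then compute the tax.
    pct = lookup(tax_cache, row["state"], "tax_pct", default=0.0)
    out["tax"] = round(row["amount"] * pct / 100, 2)
    return out

# One output row per input row, which is why the lookup is modelled as passive.
target_rows = [enrich(row) for row in source_rows]
```

Building the dictionary once corresponds roughly to a cached lookup; an uncached lookup would instead query the lookup source again for every input row.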