This document provides an overview of an Informatica training course offered by Edureka. The course covers topics such as ETL fundamentals, Informatica PowerCenter components, transformations, debugging techniques, and performance tuning. It aims to help students of varying experience levels learn skills for roles like ETL developer, data specialist, and Informatica administrator. The course contains modules on PowerCenter installation, administration, architecture, and best practices, along with hands-on labs and projects. Students will receive a certificate upon completion. More details on the course structure and registration are available on Edureka's website.
ETL Using Informatica Power Center
1. www.edureka.co
View Informatica course details at www.edureka.co/informatica
For Queries:
Post on Twitter @edurekaIN: #askEdureka
Post on Facebook /edurekaIN
For more details please contact us:
US : 1800 275 9730 (toll free)
INDIA : +91 88808 62004
Email Us : sales@edureka.co
www.edureka.co/informatica
2. Slide 2 www.edureka.co/informatica
Objectives
At the end of this session, you will be able to understand:
The Information Economy
ETL - an Overview
Why ETL is still relevant
Informatica Overview
The Informatica Platform
Why Informatica
Informatica Partners & Customers
Informatica Architecture Overview & Components
Use case 1 - Loading the Product Dimension table using a Slowly Changing Dimension (SCD)
Use case 2 - Populating the Sales summary table using Incremental Aggregation
Job trends
Scope of this course
3. Slide 3 www.edureka.co/informatica
The Information Economy
Lack of Trustworthy Data Impedes Key Business Imperatives
[Figure: business imperatives impeded by a "lack of relevant, trustworthy and timely data". Diagram categories: Consolidation, Globalization, Growth, Operational Efficiency, Governance.]
Business imperatives shown:
» Mergers, Acquisitions & Divestitures
» Acquire & Retain Customers
» Outsource Non-core Functions
» Improve Decisions
» Modernize Business
» Improve Efficiency & Reduce Costs
» Governance, Risk, Compliance
» Increase Partner Network Efficiency
» Increase Business Agility
4. Slide 4 www.edureka.co/informatica
ETL - An Overview
ETL stands for Extraction, Transformation and Load
The "E" represents the ability to consistently and reliably extract data with high performance and minimal impact on the source system
The "T" represents the ability to transform one or more data sets, in batch or real time, into a consumable format
The "L" stands for loading data into a persistent or virtual data store
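The three stages above can be sketched in a few lines of Python. This is only an illustration of the E, T, and L steps, not how Informatica PowerCenter implements them; the sample CSV data, the `product` table, and all function names are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be read from a source system.
SOURCE_CSV = """product_id,name,price
1, Widget ,9.99
2,gadget,19.50
"""


def extract(text):
    """E: read rows from the source without modifying it."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """T: convert raw text fields into a consistent, consumable format."""
    return [
        (int(r["product_id"]), r["name"].strip().title(), float(r["price"]))
        for r in rows
    ]


def load(rows, conn):
    """L: write the transformed rows into a persistent data store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product "
        "(id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three stages in order and return what ended up in the target."""
    load(transform(extract(SOURCE_CSV)), conn)
    return conn.execute("SELECT id, name, price FROM product ORDER BY id").fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads the two cleaned product rows into an in-memory SQLite table, which stands in for the target data store here.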
5. Slide 5 www.edureka.co/informatica
Why ETL is Still Relevant
Is ETL becoming history with the advent of Big Data?
Data needs to flow from source applications into analytic data stores in a controlled, reliable, secure manner
Information needs to be standardized with regard to semantics, format and lexicon for accurate analysis
Operational results need to be consistent and repeatable
Operational results need to be verifiable and transparent
6. Slide 6 www.edureka.co/informatica
Why ETL is Still Relevant
Facilitates integration of data from various data sources for building a data warehouse
Businesses have data in multiple databases with different codification and formats
Transformation is required to convert and summarize operational data into a consistent, business-oriented format
Pre-computation of any derived information
Summarization is also carried out to pre-compute summaries and aggregates
Makes data available in a queryable format
Mergers and acquisitions also create disparities in data representation and pose more difficult challenges in ETL.
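As a rough illustration of the points on this slide, the sketch below standardizes records from two sources that codify the same attribute differently and then pre-computes a summary. The sample records, the region codes, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two source systems that codify region differently.
SALES_SYSTEM_A = [{"region": "N", "amount": 100.0}, {"region": "S", "amount": 40.0}]
SALES_SYSTEM_B = [{"region": "north", "amount": 25.0}, {"region": "south", "amount": 10.0}]

# Assumed mapping from each source's codes to one business-oriented code.
REGION_CODES = {"N": "NORTH", "S": "SOUTH", "north": "NORTH", "south": "SOUTH"}


def standardize(record):
    """Convert a source record to the common codification."""
    return {"region": REGION_CODES[record["region"]], "amount": record["amount"]}


def summarize(records):
    """Pre-compute total sales per region so queries need not re-aggregate."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["region"]] += rec["amount"]
    return dict(totals)


def build_summary():
    """Integrate both sources, then aggregate the unified records."""
    unified = [standardize(r) for r in SALES_SYSTEM_A + SALES_SYSTEM_B]
    return summarize(unified)
```

Here `build_summary()` returns one total per standardized region, regardless of which source system a sale came from.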
7. Slide 7 www.edureka.co/informatica
Informatica – A Product Company
Informatica Corp. provides data integration software and services for various businesses, industries, and government organizations, including telecommunications, health care, financial and insurance services.
Founded : 1993
2012 Revenue : $811.6 million
7-year CAGR : 17%
Employees : 2,810+
Partners : 450+
» Major SI, ISV, OEM and On-Demand Leaders
Customers : Over 5,000
» Customers in 82 countries
» Direct presence in 28 countries
» #1 in Customer Loyalty Rankings (7 years in a row)
9. Slide 9 www.edureka.co/informatica
Informatica Products & Their Functionalities
A wide range of products is available under the Informatica product suite to help satisfy data integration requirements within the enterprise and beyond
Informatica's product portfolio is focused on data integration:
Data Integration & ETL
Information Lifecycle Management
Complex Event Processing
Data Masking
Data Quality
Data Replication
Data Virtualization
Master Data Management
Ultra Messaging
Currently at version 9.6, these components form a toolset for establishing and maintaining enterprise-wide data warehouses
11. Slide 11 www.edureka.co/informatica
A Singular Focus on Data Integration
Why Informatica?
Proven technology leadership
A track record of continuous innovation
The most neutral trusted partner
Long history of customer success
12. Slide 12 www.edureka.co/informatica
Why Informatica?
A Track Record of Continuous Innovation
[Figure: release timeline, Q4 2008 to Q4 2010]
Q4 2008 : Informatica 8.6.1, Cloud Synch. (Business Glossary, ICC Manageability)
Q1 2009 : Application ILM (Application Information Lifecycle Management)
Q2 2009 : Address Validation (Address Validation for DQ)
Q3 2009 : CEP, PowerCenter CE (Complex Event Processing and Cloud IaaS)
Q4 2009 : Informatica 9.0, Informatica Cloud 9 (Collaboration, Pervasive DQ, Data Services)
Q1 2010 : MDM, Ultra Messaging (Multi-domain MDM, Ultra-low Latency Messaging)
Q2 2010 : Informatica Marketplace (Online exchange for solutions)
Q3 2010 : MDM, ILM (Test data mgmt)
Q4 2010 : Cloud (Trust framework, plug-ins)
13. Slide 13 www.edureka.co/informatica
Why Informatica?
Over 4,200 Leaders Rely on Informatica
Financial Services and Insurance
Telecommunications
Manufacturing
Retail and Services
Healthcare and Life Sciences
Utilities and Energy
Government and Public Sector
Transportation and Distribution
14. Slide 14 www.edureka.co/informatica
Introduction to PowerCenter
PowerCenter is a single, unified enterprise data integration platform that allows companies and government organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed.
It is an ETL (Extract, Transform and Load) tool.
PowerCenter can read from a variety of sources and write to as many targets, transforming the data in between.
The main advantages of PowerCenter over other ETL tools, and hence the reasons for its popularity, are as follows:
» It is robust and can be used on both Windows and UNIX-based systems
» It is high performing yet simple to develop with, maintain, and administer
15. Slide 15 www.edureka.co/informatica
PowerCenter Architecture - SOA
The architecture of Informatica PowerCenter (version 9.x onwards) is based on the Service-Oriented Architecture (SOA) concept.
A service-oriented architecture (SOA) can be defined as a group of services that communicate with each other. The communication can involve either simple data passing or two or more services coordinating some activity.
Informatica 9.x represents a major change in the architecture of the product line.
Aim: To provide improved performance and high availability.
Approach: The underlying architecture has been reengineered to be even more service-based.
20. Slide 20 www.edureka.co/informatica
Server Components of PowerCenter
The PowerCenter server components comprise the following services:
Repository service: Manages the repository. It retrieves, inserts, and updates metadata in the repository database tables.
Integration service: Runs sessions and workflows.
SAP BW service: Listens for RFC requests from SAP BW and initiates workflows to extract data from, or load data into, SAP BW.
Web services hub: Receives requests from web service clients and exposes PowerCenter workflows as services.
21. Slide 21 www.edureka.co/informatica
Overall Architecture of PowerCenter
[Figure: architecture diagram. Labeled components: Sources, Targets, PowerCenter Client, Administrator, Security, Domain, Metadata Repository, and, within the Domain, the Repository Service, Repository Service Process, and Integration Service. Connection labels: Native drivers/ODBC, ODBC, TCP/IP, HTTPS.]
22. Slide 22 www.edureka.co/informatica
Informatica - Domain & Nodes
The salient features of a Domain are as follows:
» A Domain is a logical collection or set of nodes and services.
» The PowerCenter Domain is the fundamental administrative unit of PowerCenter.
» A Domain can be a single PowerCenter installation, or it can consist of multiple PowerCenter installations.
The salient features of a node are as follows:
» A node is a logical representation of a physical machine. It has physical attributes such as a hostname and a port number.
» Each node runs a service manager, which is responsible for the application and core services.
» A node can be a gateway node or a worker node, but it can belong to only one Domain.
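The relationship between a Domain and its nodes can be pictured with a small conceptual model. This is a sketch for illustration only and not part of any PowerCenter API: the `Node` and `Domain` classes, the host names, and the port number are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Logical representation of a physical machine."""
    hostname: str
    port: int
    is_gateway: bool = False  # False means the node is a worker node
    domain_name: Optional[str] = None  # set once the node joins a Domain


@dataclass
class Domain:
    """Logical collection of nodes; the basic administrative unit."""
    name: str
    nodes: List[Node] = field(default_factory=list)

    def add_node(self, node):
        # A node can belong to only one Domain.
        if node.domain_name is not None:
            raise ValueError(
                f"{node.hostname} already belongs to {node.domain_name}"
            )
        node.domain_name = self.name
        self.nodes.append(node)

    def gateways(self):
        """Return the host names of the gateway nodes in this Domain."""
        return [n.hostname for n in self.nodes if n.is_gateway]
```

For example, adding a gateway node and a worker node to one `Domain` succeeds, while adding either of them to a second `Domain` raises `ValueError`, mirroring the one-Domain-per-node rule.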
23. Slide 23 www.edureka.co/informatica
Informatica - Services
A service can be described as follows:
A service is a resource that provides specialized functions. All PowerCenter processes run as services on a node.
PowerCenter has two types of services:
» Application Services represent server-based functions, including Repository and Integration Services.
» Core Services represent functions that manage and maintain the environment in which PowerCenter operates, and include services like Log Service, Licensing Service, and Domain Service, amongst many others.
24. Slide 24 www.edureka.co/informatica
Component Based Development Techniques
Component-based development is a technique where predefined components or functional units, or both, with specific functionalities are used to assemble the final product.
PowerCenter follows the component-based development methodology by letting you build a data flow from a source to a target using different components (called transformations) and linking them to each other as required.
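The idea of assembling a data flow from prebuilt, linked components can be sketched in Python. PowerCenter mappings are built graphically in its client tools, so this is only an analogy; `filter_rows`, `expression`, and `build_mapping` are hypothetical names chosen for the example.

```python
from functools import reduce


def filter_rows(predicate):
    """Reusable component: keep only the rows that satisfy the predicate."""
    def component(rows):
        return [r for r in rows if predicate(r)]
    return component


def expression(column, func):
    """Reusable component: add a derived column computed from each row."""
    def component(rows):
        return [{**r, column: func(r)} for r in rows]
    return component


def build_mapping(*components):
    """Link components so that each one's output feeds the next."""
    def mapping(rows):
        return reduce(lambda data, comp: comp(data), components, rows)
    return mapping


# Hypothetical source rows flowing through two linked components to a target.
source = [
    {"item": "pen", "qty": 3, "price": 2.0},
    {"item": "ink", "qty": 0, "price": 5.0},
]
mapping = build_mapping(
    filter_rows(lambda r: r["qty"] > 0),
    expression("total", lambda r: r["qty"] * r["price"]),
)
target = mapping(source)
```

Each component is written once and can be reused in other data flows with different arguments, which is the main appeal of the component-based approach.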
25. Slide 25 www.edureka.co/informatica
Component Based Development Techniques
The advantages of a component-based development model are listed below:
As the functional units are already built, the developer need not build them from scratch and can instead use them directly. Apart from making the entire process easier, this reduces development time as well.
This approach makes bug fixing easier and aids maintenance, since only the malfunctioning components need to be fixed.
Reusability is another factor in favor of a component-based development model.
26. Slide 26 www.edureka.co/informatica
Transformation -> Mapping -> Session -> Workflow
Transformation is the process in ETL where the business rules are applied to the data flow
It is here that data cleansing and formatting are performed, along with data validation, which is one of its main functions
In PowerCenter, transformations are the functional components
To meet all kinds of requirements, a wide range of transformations is available within Informatica
The hierarchy, from the smallest unit to the largest, is as follows:
» Transformation
» Mapping
» Session
» Workflow
27. Slide 27 www.edureka.co/informatica
Informatica PowerCenter - DI Solution
Informatica PowerCenter is the premium data integration solution available today
"Database neutral" - will communicate with any database
Powerful data transformations convert one application's data to another's format
[Figure: Informatica PowerCenter at the center, connected to Manufacturing (DB2), Sales (SalesForce), Billing (Sybase), Resource Planning (PSFT), Inventory (SQL Server), Marketing (ORCL), and Accounting (upgraded)]
28. Slide 28 www.edureka.co/informatica
Data Migration
A company purchases a new accounts payable application
PowerCenter can move the existing account data to the new application
» Preserves data lineage for tax, accounting, and other legally mandated purposes
[Figure: Accounting (Old) -> Informatica PowerCenter -> Accounting (New)]
29. Slide 29 www.edureka.co/informatica
Application Integration
Company A purchases Company B
To achieve the benefits of consolidation, Company B's billing system must be integrated into Company A's billing system
[Figure: Informatica PowerCenter connecting Billing A and Billing B]
30. Slide 30 www.edureka.co/informatica
Data Warehousing
Data warehouses put information from many sources together for analysis
Data is moved from many databases to the data warehouse
[Figure: Inventory (SQL Server), Marketing (ORCL), Accounting (upgraded), Manufacturing (DB2), Resource Planning (PSFT), Sales (SalesForce), and Billing (Sybase) feeding the Data warehouse through Informatica PowerCenter]
31. Slide 31 www.edureka.co/informatica
Middleware
Informatica can connect to a variety of sources, including most application sources
SAP-certified data integration tool
Can pull data from and push data into SAP R/3 and SAP BW systems
Has connectivity adapters for the majority of application sources
Can be used as middleware between two applications, such as SAP R/3 and SAP BW
32. Slide 32 www.edureka.co/informatica
Some Unique Features of Informatica
Single administration console to administer all the application services
Unified Users, Groups, Privileges and Roles admin across PC AE Tools
Single sign-on for all the client tools: once you log in to one client tool, you are automatically logged in to the others
Built-in version control
Grid and high availability
Built-in scheduling tool
33. Slide 33 www.edureka.co/informatica
Use Cases
Loading the Product Dimension table using a Slowly Changing Dimension (SCD)
Populating the Sales summary table using Incremental Aggregation
Demonstrating the Informatica PowerCenter partitioning capability
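The first two use cases can be sketched outside PowerCenter to show the underlying logic. The slides do not say which SCD type the demo uses, so the sketch below assumes Type 2 (history is kept by expiring the old row and adding a new current row). The column names and both function names are hypothetical, and in PowerCenter this logic would be built with transformations rather than Python.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply Type 2 changes in place; dimension is a list of dict rows."""
    current = {row["product_id"]: row for row in dimension if row["is_current"]}
    for rec in incoming:
        existing = current.get(rec["product_id"])
        if existing is not None and existing["price"] == rec["price"]:
            continue  # unchanged product: nothing to do
        if existing is not None:
            # Changed product: expire the old version instead of overwriting it.
            existing["is_current"] = False
            existing["end_date"] = today
        new_row = {
            "product_id": rec["product_id"],
            "price": rec["price"],
            "start_date": today,
            "end_date": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[rec["product_id"]] = new_row
    return dimension


def incremental_aggregate(summary, new_sales):
    """Fold only the new sales rows into the existing per-product totals."""
    for sale in new_sales:
        key = sale["product_id"]
        summary[key] = summary.get(key, 0.0) + sale["amount"]
    return summary
```

The point of `incremental_aggregate` is that the summary is updated from the new rows alone, without re-reading all historical sales. A short usage example:

```python
dim = [{"product_id": 1, "price": 10.0, "start_date": date(2024, 1, 1),
        "end_date": None, "is_current": True}]
apply_scd2(dim, [{"product_id": 1, "price": 12.0}], date(2024, 6, 1))
# dim now holds the expired 10.0 row plus a new current 12.0 row.
```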
37. Slide 37 www.edureka.co/informatica
Scope of This Course
Informatica PowerCenter Basics
Informatica PowerCenter Advanced Transformations
Informatica PowerCenter Installation and Configuration
Informatica PowerCenter Administration and Operation Basics
PowerCenter Troubleshooting & Performance Tuning
Best Practices and Methodology
Ample lab exercises to follow each module
39. Slide 39
How it Works
LIVE Online Class
Class Recording in LMS
24/7 Post Class Support
Module Wise Quiz
Project Work
Verifiable Certificate
www.edureka.co/informatica