ETL Tools
21 June 2017 www.snipe.co.in
Prepared by: Snipe Team
Agenda
Table of Contents
Introduction
What do ETL tools do?
Why use an ETL tool?
ETL tools
Comparison
Conclusion
Overview
What do ETL tools do?
An ETL tool is a tool that:
Extracts data from various data sources (usually legacy systems)
Transforms the data:
from being optimized for transaction processing
to being optimized for reporting and analysis
synchronizes the data coming from different databases
cleanses the data to remove errors
Loads the data into a data warehouse
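The three stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any tool in this deck is implemented: the `orders` and `sales_by_customer` tables, the sample rows, and the function names are all hypothetical, and two in-memory SQLite databases stand in for the source system and the warehouse.

```python
import sqlite3


def extract(conn):
    """Extract: read raw order rows from the transactional source."""
    return conn.execute("SELECT customer, amount FROM orders").fetchall()


def transform(rows):
    """Transform: cleanse bad rows, then aggregate per customer for reporting."""
    totals = {}
    for customer, amount in rows:
        if customer is None or amount is None:
            continue  # cleansing: drop incomplete records
        name = customer.strip().title()  # reconcile inconsistent spellings
        totals[name] = totals.get(name, 0.0) + float(amount)
    return sorted(totals.items())


def load(conn, summary):
    """Load: write the reporting-friendly summary into the warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_customer "
        "(customer TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sales_by_customer VALUES (?, ?)", summary
    )
    conn.commit()


def run_demo():
    # Hypothetical source system with messy transactional data.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("alice ", 10.0), ("ALICE", 5.5), ("bob", 7.0), (None, 3.0), ("bob", None)],
    )
    # Hypothetical warehouse, kept separate from the source.
    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract(source)))
    return warehouse.execute(
        "SELECT customer, total FROM sales_by_customer ORDER BY customer"
    ).fetchall()
```

Calling `run_demo()` returns one cleansed, aggregated row per customer. An ETL tool provides these same stages as configurable building blocks instead of hand-written functions.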
Overview
Why use an ETL tool?
ETL tools save time and money when developing a data warehouse by
removing the need for “hand-coding”.
“Hand-coding” is still the most common way of integrating data today. It
requires many hours of development and considerable expertise to create a
business intelligence system.
It is very difficult for database administrators to connect databases from
different vendors without using an external tool.
If databases are altered or new databases need to be
integrated, a lot of the “hand-coded” work needs to be completely redone.
Overview
How to Implement an ETL System
Tools
ETL Tools
Pentaho Kettle
Pentaho is a commercial open-source BI suite that has a product called
Kettle for data integration.
It uses an innovative metadata-driven approach and has a strong and very
easy-to-use GUI
The company started around 2001
It has a strong community of 13,500 registered users
It uses a stand-alone Java engine that processes the tasks for moving data
between many different databases and files
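To make the metadata-driven idea concrete, here is a small Python sketch. It is not Kettle's file format or API: the step types, the `JOB` description, and `run_job` are hypothetical names chosen for illustration. The point is that the job is plain data, and a generic engine interprets it at run time.

```python
# Hypothetical step implementations; a real tool ships many more.
def step_filter(rows, field, minimum):
    """Keep only rows whose field is at least the given minimum."""
    return [r for r in rows if r[field] >= minimum]


def step_rename(rows, old, new):
    """Rename one field in every row."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]


STEP_TYPES = {"filter": step_filter, "rename": step_rename}

# The job itself is pure metadata: no transformation logic lives here.
JOB = [
    {"type": "filter", "field": "amount", "minimum": 10},
    {"type": "rename", "old": "cust", "new": "customer"},
]


def run_job(job, rows):
    """Generic engine: look up each step by type and apply it in order."""
    for step in job:
        params = {k: v for k, v in step.items() if k != "type"}
        rows = STEP_TYPES[step["type"]](rows, **params)
    return rows
```

Changing the pipeline then means editing the `JOB` description (which a GUI could produce) rather than rewriting code.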
Tools
ETL Tools
Talend
Talend is an open-source data integration tool
It uses a code-generating approach and a GUI (implemented in
Eclipse RCP)
It started around October 2006
It has a much smaller community than Pentaho, but is supported by two
finance companies
It generates Java code or Perl code which can later be run on a server
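A code-generating approach can be sketched as follows. Talend emits Java or Perl. This sketch emits Python only to stay self-contained, and the step vocabulary (`filter`, `uppercase`) and the function names are hypothetical. The job description is turned into source text once, and that text is what later runs on the server.

```python
def generate_source(job_name, steps):
    """Emit Python source for a function that applies each step in order."""
    lines = [f"def {job_name}(rows):"]
    for step in steps:
        field = step["field"]
        if step["type"] == "filter":
            minimum = step["minimum"]
            lines.append(
                f"    rows = [r for r in rows if r[{field!r}] >= {minimum!r}]"
            )
        elif step["type"] == "uppercase":
            lines.append(
                f"    rows = [{{**r, {field!r}: r[{field!r}].upper()}} for r in rows]"
            )
        else:
            raise ValueError(f"unknown step type: {step['type']}")
    lines.append("    return rows")
    return "\n".join(lines) + "\n"


def compile_job(source, job_name):
    """Execute the generated source and return the resulting function."""
    namespace = {}
    exec(source, namespace)  # acceptable here: the source is generated locally
    return namespace[job_name]
```

A short usage example: `generate_source("small_job", [{"type": "filter", "field": "amount", "minimum": 10}])` returns the text of a standalone function, which `compile_job` turns into something callable. Unlike an interpreting engine, the generated code has no run-time dependency on the job description.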
Tools
ETL Tools
Informatica PowerCenter
Informatica has a very good commercial data integration suite
It was founded in 1993
It is the market share leader in data integration (Gartner Dataquest)
It has 2,600 customers, including Fortune 100 companies,
companies listed on the Dow Jones and government organizations
The company's sole focus is data integration
It offers a large package that lets enterprises integrate their systems,
cleanse their data and connect to a vast number of current and legacy
systems
Open Source Tools
ETL Tools
InaPlex Inaport
InaPlex is a small UK company
InaPlex is a producer of Customer Data Integration products for mid-market
CRM solutions
InaPlex mainly focuses on providing simple solutions for its customers to
integrate their data into CRM and accounting software like Sage and
GoldMine
Comparison
Type
IBM (InfoSphere Information Server platform)
Advantages:
strongest vision on the market, flexibility
progress towards common metadata platform
high level of satisfaction from clients and a variety of initiatives
Disadvantages:
steep learning curve
long implementation cycles
became very heavy (many GBs) with version 8.x and requires a lot of
processing power
Type
Informatica PowerCenter
Advantages:
largest size and most resources among data integration
tool vendors
consistent track record, solid technology, straightforward learning
curve, ability to address real-time data integration schemes
Informatica is highly specialized in ETL and Data Integration and
focuses on those topics, not on BI as a whole
focus on B2B data exchange
Disadvantages:
several partnerships diminishing the value of technologies
limited experience in the field.
Type
Microsoft (SQL Server Integration Services)
Advantages:
broad documentation and support, best practices for data warehouses
ease and speed of implementation
standardized data integration
real-time, message-based capabilities
relatively low cost - excellent support and distribution model
Disadvantages:
problems in non-Windows environments; inherits all Microsoft
Windows limitations
unclear vision and strategy
Type
Oracle (OWB and ODI)
Advantages:
based on Oracle Warehouse Builder and Oracle Data Integrator – two
very powerful tools;
tight connection to all Oracle data warehousing applications;
tendency to integrate all tools into one application and one environment.
Disadvantages:
focus on ETL solutions, rather than in an open context of data
management;
tools are used mostly for batch-oriented work and transformation rather
than real-time processes or federated data delivery;
long-awaited bond between OWB and ODI brought only promises -
customers are confused about the functionality and the future is uncertain
Type
SAP BusinessObjects (Data Integrator / Data Services)
Advantages:
integration with SAP
the combination of SAP and Business Objects created a strong company
determined to stir the market;
good data modeling and data-management support;
SAP Business Objects provides tools for data mining, quality and
profiling thanks to many acquisitions of other companies.
quick learning curve and ease of use
Disadvantages:
SAP Business Objects is seen as two different companies
Uncertain future. Controversy over deciding which method of delivering
data integration to use (SAP BW or BODI).
BusinessObjects Data Integrator (Data Services) may not be seen as a
Types
SAS
Advantages:
experienced company, great support and, most of all, a very powerful data
integration tool with lots of multi-management features
can work on many operating systems and gather data from a number of
sources - very flexible
great support for business-class companies as well as for medium-sized
and smaller ones
Disadvantages:
misplaced sales force, company is not well recognized
SAS has to extend its influence to reach the non-BI community
Costly
Types
Sun Microsystems
Advantages:
Data integration tools are part of the huge Java Composite Application
Platform Suite - very flexible, with ongoing development of the products
'Single-view' services draw together data from a variety of sources; small
set of vendors with a strong vision
Disadvantages:
relative weakness in bulk data movement
limited mindshare in the market
support and services rated below adequate
Types
Sybase
Advantages:
assembled a range of capabilities to address a multitude of
data delivery styles
size and global presence of Sybase create opportunities in the market
pragmatic near-term strategy - a better fit for current market demand
broad partnerships with other data quality and data integration tool
vendors
Disadvantages:
falls behind market leaders and large vendors
gaps in many aspects of data management
Types
Syncsort
Advantages:
functionality; well-known brand on the market (40 years' experience);
loyal customer and experience base;
easy implementation, strong performance, targeted functionality and
lower costs
Disadvantages:
struggles to gain mind share in the market
lack of support for delivery styles other than ETL
unsatisfactory professional-services capability
Types
Tibco Software
Advantages:
message-oriented application integration; capabilities based on common
SOA structures;
support for federated views; easy implementation, support
and performance
Disadvantages:
scarce references from customers; not widely enough recognised for
data integration competencies
lacking in data quality capabilities.
Comparison
Pentaho Kettle vs Talend
Pentaho
Pentaho is a commercial open-source BI suite that has a product called
Kettle for data integration.
It uses an innovative metadata-driven approach and has a strong and very
easy-to-use GUI.
The company started around 2001 (2002 was when Kettle was integrated
into it).
It has a strong community of 13,500 registered users.
It has a stand-alone Java engine that processes the jobs and tasks for
moving data between many different databases and files.
It can schedule tasks (but you need an external scheduler for that, such as cron).
It can run remote jobs on "slave servers" on other machines.
It has data quality features: from its own GUI, writing more customised
SQL queries, JavaScript and regular expressions.
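As an illustration of a regular-expression data quality check, the Python sketch below validates fields against patterns and separates clean rows from rejected ones. It is not Pentaho's implementation: the `RULES` table, the field names and the patterns are hypothetical and deliberately simplistic.

```python
import re

# Hypothetical rule set: field name -> pattern a clean value must fully match.
RULES = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "postcode": re.compile(r"\d{5}"),
}


def check(row):
    """Return the names of the fields that fail their rule."""
    return [
        field
        for field, pattern in RULES.items()
        if not pattern.fullmatch(str(row.get(field, "")))
    ]


def partition(rows):
    """Split rows into clean rows and (row, failed_fields) rejects."""
    clean, rejected = [], []
    for row in rows:
        failed = check(row)
        if failed:
            rejected.append((row, failed))
        else:
            clean.append(row)
    return clean, rejected
```

Rejected rows keep the list of failed fields so they can be reported or repaired instead of being silently dropped.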
Conclusion
Conclusion
Informatica and Pentaho have very good products.
Informatica has a far more extensive range of products, but is very
expensive compared to Pentaho.
Pentaho has proved that it can handle small to large scale systems.
Pentaho is rapidly gaining momentum with businesses that would not have
considered using open source products before.
ETL

Mais conteúdo relacionado

Mais procurados

Open Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETLOpen Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETLJonathan Levin
 
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...Gabriele Baldassarre
 
Extract, Transform and Load.pptx
Extract, Transform and Load.pptxExtract, Transform and Load.pptx
Extract, Transform and Load.pptxJesusaEspeleta
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 
Talend Open Studio Data Integration
Talend Open Studio Data IntegrationTalend Open Studio Data Integration
Talend Open Studio Data IntegrationRoberto Marchetto
 
ETL and its impact on Business Intelligence
ETL and its impact on Business IntelligenceETL and its impact on Business Intelligence
ETL and its impact on Business IntelligenceIshaPande
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data MeshLibbySchulze
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseSnowflake Computing
 
Azure Data Factory
Azure Data FactoryAzure Data Factory
Azure Data FactoryHARIHARAN R
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Data Visualization With Tableau | Edureka
Data Visualization With Tableau | EdurekaData Visualization With Tableau | Edureka
Data Visualization With Tableau | EdurekaEdureka!
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Amazon Web Services
 

Mais procurados (20)

Open Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETLOpen Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETL
 
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tr...
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Informatica slides
Informatica slidesInformatica slides
Informatica slides
 
Extract, Transform and Load.pptx
Extract, Transform and Load.pptxExtract, Transform and Load.pptx
Extract, Transform and Load.pptx
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Tableau: A Business Intelligence and Analytics Software
Tableau: A Business Intelligence and Analytics SoftwareTableau: A Business Intelligence and Analytics Software
Tableau: A Business Intelligence and Analytics Software
 
Talend Open Studio Data Integration
Talend Open Studio Data IntegrationTalend Open Studio Data Integration
Talend Open Studio Data Integration
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
ETL and its impact on Business Intelligence
ETL and its impact on Business IntelligenceETL and its impact on Business Intelligence
ETL and its impact on Business Intelligence
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
 
Azure Data Factory
Azure Data FactoryAzure Data Factory
Azure Data Factory
 
Etl techniques
Etl techniquesEtl techniques
Etl techniques
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Data Visualization With Tableau | Edureka
Data Visualization With Tableau | EdurekaData Visualization With Tableau | Edureka
Data Visualization With Tableau | Edureka
 
Tableau Prep.pptx
Tableau Prep.pptxTableau Prep.pptx
Tableau Prep.pptx
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
 

Destaque (11)

Build tool
Build toolBuild tool
Build tool
 
Ci
CiCi
Ci
 
Ejb
EjbEjb
Ejb
 
Visual basics
Visual basicsVisual basics
Visual basics
 
Jdbc
JdbcJdbc
Jdbc
 
Maven
MavenMaven
Maven
 
Web services engine
Web services engineWeb services engine
Web services engine
 
Ide benchmarking
Ide benchmarkingIde benchmarking
Ide benchmarking
 
Project excursion career_orientation
Project excursion career_orientationProject excursion career_orientation
Project excursion career_orientation
 
Training
TrainingTraining
Training
 
Digital marketing
Digital marketingDigital marketing
Digital marketing
 

Semelhante a ETL

Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business IntelligenceDavid Portnoy
 
Big data analytics beyond beer and diapers
Big data analytics   beyond beer and diapersBig data analytics   beyond beer and diapers
Big data analytics beyond beer and diapersKai Zhao
 
Product Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications IntroductionProduct Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications IntroductionAcevedoApps
 
A Comparitive Study Of ETL Tools
A Comparitive Study Of ETL ToolsA Comparitive Study Of ETL Tools
A Comparitive Study Of ETL ToolsRhonda Cetnar
 
10 Best Big Data Management Tools
10 Best Big Data Management Tools10 Best Big Data Management Tools
10 Best Big Data Management ToolsPromptCloud
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Rittman Analytics
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsJane Roberts
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperImpetus Technologies
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Informatica Interview Questions & Answers
Informatica Interview Questions & AnswersInformatica Interview Questions & Answers
Informatica Interview Questions & AnswersZaranTech LLC
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...Eric Javier Espino Man
 

Semelhante a ETL (20)

Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business Intelligence
 
Big data analytics beyond beer and diapers
Big data analytics   beyond beer and diapersBig data analytics   beyond beer and diapers
Big data analytics beyond beer and diapers
 
Product Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications IntroductionProduct Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications Introduction
 
Gowthami_Resume
Gowthami_ResumeGowthami_Resume
Gowthami_Resume
 
BAKKIYA_4YR
BAKKIYA_4YRBAKKIYA_4YR
BAKKIYA_4YR
 
Resume
ResumeResume
Resume
 
VenkatSubbaReddy_Resume
VenkatSubbaReddy_ResumeVenkatSubbaReddy_Resume
VenkatSubbaReddy_Resume
 
A Comparitive Study Of ETL Tools
A Comparitive Study Of ETL ToolsA Comparitive Study Of ETL Tools
A Comparitive Study Of ETL Tools
 
10 Best Big Data Management Tools
10 Best Big Data Management Tools10 Best Big Data Management Tools
10 Best Big Data Management Tools
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
 
Pentaho Suite Analysis
Pentaho Suite Analysis Pentaho Suite Analysis
Pentaho Suite Analysis
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White Paper
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Informatica Interview Questions & Answers
Informatica Interview Questions & AnswersInformatica Interview Questions & Answers
Informatica Interview Questions & Answers
 
AtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White PapaerAtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White Papaer
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
 
Tableau Suite Analysis
Tableau Suite Analysis Tableau Suite Analysis
Tableau Suite Analysis
 
Richa_Profile
Richa_ProfileRicha_Profile
Richa_Profile
 

Mais de Mallikarjuna G D (20)

Reactjs
ReactjsReactjs
Reactjs
 
Bootstrap 5 ppt
Bootstrap 5 pptBootstrap 5 ppt
Bootstrap 5 ppt
 
CSS
CSSCSS
CSS
 
Angular 2.0
Angular  2.0Angular  2.0
Angular 2.0
 
Spring andspringboot training
Spring andspringboot trainingSpring andspringboot training
Spring andspringboot training
 
Hibernate
HibernateHibernate
Hibernate
 
Jspprogramming
JspprogrammingJspprogramming
Jspprogramming
 
Servlet programming
Servlet programmingServlet programming
Servlet programming
 
Servlet programming
Servlet programmingServlet programming
Servlet programming
 
Mmg logistics edu-final
Mmg  logistics edu-finalMmg  logistics edu-final
Mmg logistics edu-final
 
Interview preparation net_asp_csharp
Interview preparation net_asp_csharpInterview preparation net_asp_csharp
Interview preparation net_asp_csharp
 
Interview preparation devops
Interview preparation devopsInterview preparation devops
Interview preparation devops
 
Interview preparation testing
Interview preparation testingInterview preparation testing
Interview preparation testing
 
Interview preparation data_science
Interview preparation data_scienceInterview preparation data_science
Interview preparation data_science
 
Interview preparation full_stack_java
Interview preparation full_stack_javaInterview preparation full_stack_java
Interview preparation full_stack_java
 
Enterprunership
EnterprunershipEnterprunership
Enterprunership
 
Core java
Core javaCore java
Core java
 
Type script
Type scriptType script
Type script
 
Angularj2.0
Angularj2.0Angularj2.0
Angularj2.0
 
Git Overview
Git OverviewGit Overview
Git Overview
 

Último

Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docxPoojaSen20
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 

Último (20)

Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 

ETL

  • 2. 21 June 2017 www.snipe.co.in 2 Prepared :Snipe Team
  • 3. Agenda Table of Contents Introduction What do ETL tools do? Why use an ETL tool? ETL tools Comparison Conclusion
  • 4. Overview What do ETLs tool do? An ETL tool is a tool that: Extracts data from various data sources (usually legacy) Transforms data from -> being optimized for transaction to -> being optimized for reporting and analysis synchronizes the data coming from different databases data cleanses to remove errors Loads data into a data warehouse
  • 5. Overview Why use an ETL tool? ETL tools save time and money when developing a data warehouse by removing the need for “hand-coding”. “Hand Coding” is still the most common way of integrating data today. It requires hours and hours of development and expertise to create a Business-Intelligence-System. It is very difficult for data base administrators to connect between different brands of databases without using an external tool. In the event that databases are altered or new databases need to be integrated, a lot of “hand-coded” work needs to be completely redone.
  • 7. Tools ETL Tools Pentaho Kettle Pentaho is a commercial open-source BI suite that has a product called Kettle for data integration. It uses an innovative meta-driven approach and has a strong and very easy-to-use GUI The company started around 2001 It has a strong community of 13,500 registered users It uses a stand-alone java engine that process the tasks for moving data between many different databases and files
  • 8. Tools. ETL Tools: Talend. Talend is an open-source data integration tool. It uses a code-generating approach and has a GUI (implemented in Eclipse RCP). It started around October 2006. It has a much smaller community than Pentaho, but is supported by two finance companies. It generates Java or Perl code, which can later be run on a server.
  • 9. Tools. ETL Tools: Informatica PowerCenter. Informatica has a very good commercial data integration suite. It was founded in 1993. It is the market share leader in data integration (Gartner Dataquest). It has 2600 customers, including Fortune 100 companies, companies listed on the Dow Jones and government organizations. The company's sole focus is data integration. It has quite a big package for enterprises to integrate their systems and cleanse their data, and it can connect to a vast number of current and legacy systems.
  • 10. Open Source Tools. ETL Tools: InaPlex Inaport. InaPlex is a small UK company. It produces Customer Data Integration products for mid-market CRM solutions. InaPlex mainly focuses on providing simple solutions for its customers to integrate their data into CRM and accounting software such as Sage and Goldmine.
  • 12. Type: IBM (Information Server, InfoSphere platform). Advantages: strongest vision on the market, flexibility; progress towards a common metadata platform; high level of satisfaction from clients and a variety of initiatives. Disadvantages: difficult learning curve; long implementation cycles; became very heavy (lots of GBs) with version 8.x and requires a lot of processing power.
  • 13. Type: Informatica PowerCenter. Advantages: most substantial size and resources among data integration tool vendors; consistent track record, solid technology, straightforward learning curve, ability to address real-time data integration schemes; Informatica is highly specialized in ETL and data integration and focuses on those topics, not on BI as a whole; focus on B2B data exchange. Disadvantages: several partnerships diminishing the value of technologies; limited experience in the field.
  • 14. Type: Microsoft (SQL Server Integration Services). Advantages: broad documentation and support, best practices for data warehouses; ease and speed of implementation; standardized data integration; real-time, message-based capabilities; relatively low cost; excellent support and distribution model. Disadvantages: problems in non-Windows environments; inherits all Microsoft Windows limitations; unclear vision and strategy.
  • 15. Type: Oracle (OWB and ODI). Advantages: based on Oracle Warehouse Builder and Oracle Data Integrator, two very powerful tools; tight connection to all Oracle data warehousing applications; tendency to integrate all tools into one application and one environment. Disadvantages: focus on ETL solutions rather than on an open context of data management; tools are used mostly for batch-oriented work and transformation rather than real-time processes or federated data delivery; the long-awaited bond between OWB and ODI brought only promises, leaving customers confused about functionality and the future uncertain.
  • 16. Type: SAP BusinessObjects (Data Integrator / Data Services). Advantages: integration with SAP; SAP BusinessObjects created a firm company determined to stir the market; good data modeling and data management support; SAP BusinessObjects provides tools for data mining, quality and profiling due to many acquisitions of other companies; quick learning curve and ease of use. Disadvantages: SAP BusinessObjects is seen as two different companies; uncertain future; controversy over deciding which method of delivering data integration to use (SAP BW or BODI); BusinessObjects Data Integrator (Data Services) may not be seen as a
  • 17. Types: SAS. Advantages: experienced company, great support and, most of all, a very powerful data integration tool with lots of multi-management features; can work on many operating systems and gather data from a number of sources, so it is very flexible; great support for business-class companies as well as for medium-sized and smaller ones. Disadvantages: misplaced sales force, company is not well recognized; SAS has to extend its influence to reach the non-BI community; costly.
  • 18. Types: Sun Microsystems. Advantages: data integration tools are part of the large Java Composite Application Platform Suite, which is very flexible, with ongoing development of the products; 'single-view' services draw together data from a variety of sources; one of a small set of vendors with a strong vision. Disadvantages: relative weakness in bulk data movement; limited mindshare in the market; support and services rated below adequate.
  • 19. Types: Sybase. Advantages: assembled a range of capabilities to address a multitude of data delivery styles; the size and global presence of Sybase create opportunities in the market; pragmatic near-term strategy, better suited to current market demand; broad partnerships with other data quality and data integration tool vendors. Disadvantages: falls behind market leaders and large vendors; gaps in many aspects of data management.
  • 20. Types: Syncsort. Advantages: functionality; well-known brand on the market (40 years of experience); loyal customer and experience base; easy implementation, strong performance, targeted functionality and lower costs. Disadvantages: struggles to gain mindshare in the market; lack of support for delivery styles other than ETL; dissatisfaction with the limited capability of professional services.
  • 21. Types: Tibco Software. Advantages: message-oriented application integration; capabilities based on common SOA structures; support for federated views; easy implementation, support and performance. Disadvantages: scarce references from customers; not widely enough recognised for data integration competencies; lacking in data quality capabilities.
  • 22. Comparison: Pentaho Kettle vs Talend. Pentaho: Pentaho is a commercial open-source BI suite that has a product called Kettle for data integration. It uses an innovative metadata-driven approach and has a strong and very easy-to-use GUI. The company started around 2001 (Kettle was integrated into it in 2002). It has a strong community of 13,500 registered users. It has a stand-alone Java engine that processes the jobs and tasks for moving data between many different databases and files. It can schedule tasks (but you need a scheduler for that, such as cron). It can run remote jobs on "slave servers" on other machines. It has data quality features: from its own GUI, writing more customised SQL queries, JavaScript and regular expressions.
  • 23. Conclusion: Informatica and Pentaho have very good products. Informatica has a far more extensive range of products, but compared to Pentaho it is very expensive. Pentaho has proved that it can handle small to large scale systems. Pentaho is gaining momentum fast with businesses that would not have considered using open-source products before.
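
The extract, transform and load steps described on slide 4 can be illustrated with a small hand-coded sketch in Python. It is not how Kettle, Talend or any other tool in this deck works internally. It only shows the three stages, using the standard-library `sqlite3` and `re` modules. The table names (`orders`, `customer_totals`), the column names, the sample rows and the e-mail pattern are all hypothetical and chosen for illustration.

```python
import re
import sqlite3

# Hypothetical, deliberately simple e-mail check used for cleansing.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def extract(source):
    """Extract: read raw rows from the transactional source database."""
    return source.execute("SELECT customer_email, amount FROM orders").fetchall()


def transform(rows):
    """Transform: cleanse the rows and reshape them for reporting."""
    totals = {}
    for email, amount in rows:
        email = email.strip().lower()      # normalise inconsistent formatting
        if not EMAIL_RE.match(email):      # cleanse: drop rows with invalid e-mails
            continue
        totals[email] = totals.get(email, 0.0) + float(amount)
    return sorted(totals.items())          # one aggregated record per customer


def load(warehouse, records):
    """Load: write the aggregated records into the warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(email TEXT PRIMARY KEY, total REAL)"
    )
    warehouse.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)", records
    )
    warehouse.commit()


def run_demo():
    """Build an in-memory source, run the pipeline, return warehouse rows."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, customer_email TEXT, amount TEXT)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, " Ann@Example.com ", "10.50"),
            (2, "ann@example.com", "4.50"),
            (3, "not-an-email", "99"),
            (4, "bob@example.com", "7"),
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract(source)))
    return warehouse.execute(
        "SELECT email, total FROM customer_totals ORDER BY email"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Even in this toy case, a change to the source schema would mean editing the SQL and the transform logic by hand, which is the maintenance cost slide 5 attributes to hand-coding.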