HADOOP IS NOT AN ISLAND IN THE ENTERPRISE:
Understanding Deployment Practices that Merge the
Strengths of Hadoop and the Data Warehouse
Joe Rao
PS Consultant, Teradata Corporation
2 6/17/2014 Teradata Confidential
This presentation covers
• A comparison of the strengths of Hadoop and a Data
Warehouse
• Architectures that involve Hadoop and the data warehouse
working together
AGENDA
• Our two platforms:
> The Data Platform – Hadoop
> The Enterprise Data Warehouse – Teradata
• Both platforms could handle everything by themselves
if we really wanted them to
• Biased organizations will favor one over the other, and
argue that everything can be done in one place
• And they're both right
FRAMING THE DISCUSSION
• Let's consider a software startup or
company that has no IT department yet
• They need to:
> Acquire their technology from scratch
> Build business logic from scratch
> Staff their new department from scratch
• With no existing technology, how should
they structure their data center?
FRAMING THE DISCUSSION
• Traditional data warehouses (like the Teradata
database) have been used as the central repository
of business data for years.
• Data warehouses are great with:
> Thousands of concurrent users and queries
> Full ANSI SQL interfaces
> Very complex SQL query logic
> Advanced workload management
> Transactional capabilities
> Secure access
DATA WAREHOUSE STRENGTHS
• Many companies that have been doing things the
old way with a data warehouse don't think they need
to change anything
• What they've been doing has worked for years. Hadoop
is young and immature, they say. Why change?
• These companies are change-resistant. They are missing
out on the advancements in big data and can fall behind
their competition.
DATA WAREHOUSE ONLY?
I’m lonely
• Hadoop is changing the game in the enterprise
data landscape. Its major strengths include:
> Economical
> Able to process extremely large data sets
> Extremely flexible storage and processing
> Open, free, active community development
HADOOP STRENGTHS
• Appliance Solution
> Purpose-built integrated hardware/software solution
> Optimized hardware for Hadoop, software, storage, and
networking in a single rack
> Delivered ready to run at a competitive price point
• Enterprise Ready
> 100% open-source Hadoop via Hortonworks HDP
> Integrated with Teradata Unified Data Architecture on 40 Gb/s
InfiniBand BYNET V5 for performance and reliability
> Support for major ETL tools, enhanced security, and
metadata management
> Management tools for monitoring system health
• Benefits
> Lowest TCO and fastest time to value
> Fully engineered and supported by Teradata
TERADATA APPLIANCE FOR HADOOP
• Many companies are so eager to jump onto the Hadoop wave
that they think they can run their entire datacenter on
Hadoop.
• It's free, it has lots of development effort put into it, it's
flexible. Why go the “old way” with an EDW?
• These companies are using Hadoop beyond its design and
maturity level, and may run into technical problems
meeting requirements.
HADOOP ONLY?
I’m lonely
CONCLUSIONS — TWO TCOD EXAMPLES
1. TCOD (total cost of data) is NOT platform cost; it is total project cost
2. Each technology has large advantages in its sweet spot(s)
3. Neither platform is cost effective in the other’s sweet spot
4. Biggest differences for the data warehouse are the development of:
> Complex queries
> Analytics
Source: WinterCorp - Full report at www.wintercorp.com/tcod-report
[Two bar charts: cost in millions of dollars, on Hadoop vs. on the data warehouse, broken down by Total System Cost, System and Data Admin, Application Development, ETL, Complex Queries, and Analysis. Panel labels: "Data Refining: Hadoop wins (also: Landing Zone, Archive)" and "EDW: Data W/H Platform Wins".]
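The first conclusion, that total cost is a sum over the whole project rather than the platform price, can be sketched in a few lines of Python. The component names follow the chart legend; the numbers are made-up placeholders chosen only to show the arithmetic and are not taken from the WinterCorp report.

```python
from typing import Dict

# Cost components, named after the chart legend (identifiers are assumptions).
COMPONENTS = ("system", "admin", "app_development", "etl", "complex_queries", "analysis")


def total_cost(costs: Dict[str, float]) -> float:
    """Sum every project cost component; refuse silently incomplete input."""
    missing = [name for name in COMPONENTS if name not in costs]
    if missing:
        raise ValueError(f"missing cost components: {missing}")
    return sum(costs[name] for name in COMPONENTS)


# Placeholder figures in arbitrary units, NOT data from the report.
platform_a = {"system": 1, "admin": 1, "app_development": 2,
              "etl": 2, "complex_queries": 8, "analysis": 6}
platform_b = {"system": 5, "admin": 1, "app_development": 1,
              "etl": 1, "complex_queries": 2, "analysis": 2}

# The platform with the cheaper system line can still cost more in total.
print(total_cost(platform_a), total_cost(platform_b))
```

Comparing only the `system` entry would favor the first placeholder platform, while the full sum favors the second, which is the point the slide makes about sweet spots.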
• These two platforms are complementary!
• Successful enterprise datacenters merge the strengths
of both platforms.
EDW VS. HADOOP
• Split Workload Architecture
• ETL System Architecture
• Secure Access Architecture
• Active Archive Architecture
COMBINED ARCHITECTURES
Insurance Use Case
Impact
• Quickly analyze data for informed decisions and ad hoc reporting
• Streamlined process to calculate vehicle and fleet scores
• Cost effectively quantify, adjust and manage risk premiums
Situation
A large diversified customer needed to accurately calculate scores and adjust risk
premiums for its enterprise fleets based on vehicle data, driver behavior, GPS data,
weather data, traffic, and DW data. Its current custom-developed applications limit the
effectiveness of these scores.
Problem
The customer lacks the infrastructure and systems to handle huge volumes of real-time data.
It has no ad hoc reporting systems to combine, enrich, and analyze the data. Limited storage
capacity restricts the amount of data that can be captured, refined, and stored.
Solution
Used the Teradata Big Analytics Appliance to design a platform that streamlines the ingestion
of telematics data from multiple sources, with varying data types, structures, and frequencies,
and combines it with other data sources to perform meaningful analytics.
HADOOP
Teradata INTEGRATED DATA WAREHOUSE
• The Data Warehouse and Hadoop run different workloads
on different data sets.
SPLIT WORKLOADS
Big Data
Operational Data
• It is not economical to put gigantic, “value sparse” data
sets on an enterprise data warehouse.
• Hadoop was not built to be an accessible, highly concurrent
transactional database.
• The easiest natural architecture is to split up the two
platforms based on the data set and workload.
> Teradata handles the operational business data and queries
> Hadoop handles the cost-prohibitive “big data” sets, such as
web, machine, and social data
SPLIT WORKLOADS
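The split described on this slide can be sketched as a simple routing rule. This is a minimal illustration in Python, not a Teradata or Hadoop API: the `DataSet` fields, the `choose_platform` function, and the decision order are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    """Hypothetical description of a data set; every field is an assumption."""
    name: str
    value_sparse: bool           # huge volume, little business value per byte
    needs_transactions: bool     # requires transactional updates
    many_concurrent_users: bool  # queried by many users at once


def choose_platform(ds: DataSet) -> str:
    """Route a data set to a platform using the slide's rule of thumb."""
    # Operational workloads (transactions, high concurrency) go to the warehouse.
    if ds.needs_transactions or ds.many_concurrent_users:
        return "data warehouse"
    # Gigantic, value-sparse sets (web, machine, social data) go to Hadoop.
    if ds.value_sparse:
        return "hadoop"
    # Everything else stays with the operational business data.
    return "data warehouse"


orders = DataSet("orders", value_sparse=False,
                 needs_transactions=True, many_concurrent_users=True)
clickstream = DataSet("clickstream", value_sparse=True,
                      needs_transactions=False, many_concurrent_users=False)

print(choose_platform(orders))
print(choose_platform(clickstream))
```

A real deployment would weigh more factors (cost, latency, governance), but the rule keeps the two criteria the slide names: workload type and value density.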
• Both systems operate favorably on cost and performance
with respect to their given workloads.
• The business can analyze new data and gain new insights
that their existing platform couldn't handle before.
SPLIT WORKLOADS — BUSINESS VALUE
LARGE COMPUTER MANUFACTURER
Analysis of Customer Web Interactions
Capture, Refine, Store ClickStream Data
Impact
• Reduced data inconsistencies and improved performance
• Capture and curate ALL the data and prepare for analysis
• Perform ad hoc analytics on multi-level interactions
• Improves the marketing campaigns and the customer support process
Situation
Customers interact with the public websites of a large PC vendor for various purposes, resulting in
huge volumes of raw Omniture data. Because of its nature, the data's structure and format are not always
consistent, and because of the volume, processing the data is difficult.
Problem
Inconsistencies in the raw Omniture data, such as file errors and corrupted file compression, make
capture and analysis error-prone. The volume and velocity (70 files/hr, 1M files) add to the
complexity.
Solution
A Teradata Big Analytics solution provides a landing and staging area for incoming high-velocity
data. Hadoop nodes curate the data, check it for consistency, and prepare it for
consumption by higher-end analytic platforms.
HADOOP Teradata
TERADATA
PLATFORM FAMILY
• Hadoop can be used as a staging and ETL preprocessing
layer for the Data Warehouse.
ETL SYSTEM ARCHITECTURE
Source Data Transformed Data
• The Data Warehouse is busy with operational queries.
We can reduce the workload on the DW by
migrating some ETL to Hadoop.
• ETL processing is a write-once step, which fits
Hadoop's architecture.
• Hadoop can inexpensively retain the raw source
data for data lineage purposes.
*Note that there are many cases where this migration doesn't make sense,
such as when it's necessary to do referential integrity checks. The DW is
capable of handling its ETL if necessary.
ETL SYSTEM ARCHITECTURE
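The staging pattern on this slide (retain the raw source, hand only refined rows to the warehouse) can be sketched as follows. This is a plain Python stand-in for what would be a Hadoop job in practice; the `refine` function and the JSON record layout with `id` and `amount` fields are assumptions made for illustration.

```python
import json
from typing import Dict, Iterable, List, Tuple


def refine(raw_lines: Iterable[str]) -> Tuple[List[str], List[Dict[str, float]]]:
    """Keep every raw line for lineage and emit cleaned rows for loading."""
    archived: List[str] = []
    cleaned: List[Dict[str, float]] = []
    for line in raw_lines:
        archived.append(line)  # raw source is retained unchanged
        try:
            record = json.loads(line)
            cleaned.append({"id": int(record["id"]),
                            "amount": float(record["amount"])})
        except (ValueError, KeyError, TypeError):
            continue  # malformed input never reaches the warehouse
    return archived, cleaned


raw = ['{"id": "1", "amount": "9.50"}', 'not json', '{"id": "2"}']
archived, cleaned = refine(raw)
print(len(archived), cleaned)
```

Checks that need warehouse reference data, such as the referential integrity checks mentioned in the note, are deliberately left out of this step and would stay on the DW side.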
HADOOP TERADATA
PLATFORM FAMILY
• Command line interface for Hadoop / TD data transfer
• Batch MapReduce jobs
• Bidirectional
• Run on the Hadoop side
TERADATA CONNECTOR
FOR HADOOP (TDCH)
TDCH
# Export a Hive (HCatalog) table from Hadoop to Teradata with TDCH
hadoop jar /home/jo845b/teradata-connector-1.0.10/lib/teradata-connector-1.0.10.jar \
  com.teradata.hadoop.tool.TeradataExportTool \
  -libjars $LIB_JARS \
  -classname com.teradata.jdbc.TeraDriver \
  -url jdbc:teradata://terarps.ca.boeing.com/DATABASE=SQLH_TEST \
  -username jo845b \
  -password Teradata14 \
  -jobtype hcat \
  -fileformat rcfile \
  -method internal.fastload \
  -sourcedatabase default \
  -sourcetable ontime_sqoop \
  -targettable ontime_sqoop \
  -usexviews true
• There are a plethora of options to fine-tune data transfer
between Teradata and Hadoop
TERADATA CONNECTOR FOR HADOOP
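When a TDCH job is launched from a script, one option is to assemble the argument list in code and take the password from the environment instead of typing it on the command line. The sketch below only builds and prints the command; the option names are the ones shown on the slide, while the host name, user name, `LIB_JARS` value, and the `TDCH_PASSWORD` variable are hypothetical placeholders.

```python
import os
import shlex
from typing import List


def build_tdch_export(jar: str, lib_jars: str, url: str, username: str,
                      password: str, source_table: str, target_table: str) -> List[str]:
    """Return the argument list for a TDCH export job (Hadoop to Teradata)."""
    return [
        "hadoop", "jar", jar,
        "com.teradata.hadoop.tool.TeradataExportTool",
        "-libjars", lib_jars,
        "-classname", "com.teradata.jdbc.TeraDriver",
        "-url", url,
        "-username", username,
        "-password", password,
        "-jobtype", "hcat",
        "-fileformat", "rcfile",
        "-method", "internal.fastload",
        "-sourcedatabase", "default",
        "-sourcetable", source_table,
        "-targettable", target_table,
        "-usexviews", "true",
    ]


def mask_password(args: List[str]) -> List[str]:
    """Hide the value that follows -password so the command can be logged."""
    return ["****" if prev == "-password" else arg
            for prev, arg in zip([""] + args, args)]


args = build_tdch_export(
    jar="teradata-connector-1.0.10.jar",
    lib_jars=os.environ.get("LIB_JARS", ""),                      # placeholder
    url="jdbc:teradata://dbhost.example.com/DATABASE=SQLH_TEST",  # placeholder host
    username="etl_user",                                          # placeholder user
    password=os.environ.get("TDCH_PASSWORD", "changeme"),         # hypothetical variable
    source_table="ontime_sqoop",
    target_table="ontime_sqoop",
)
masked = mask_password(args)
print(shlex.join(masked))
```

Passing `args` as a list to `subprocess.run(args, check=True)` would start the job without going through a shell, which avoids quoting problems in the JDBC URL.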
• Hadoop frees up the Data Warehouse's limited storage and
processing resources, saving the business time and money.
• Data can now be kept in its raw form, adding new data
lineage capabilities to the data center.
ETL SYSTEM ARCHITECTURE — BUSINESS VALUE
BANKING USE CASE
Impact
• Analyze multi-structured data types
• Keep data confidential to those with access rights
• SQL users have easy access to big data sources
Situation
A large national bank needed to securely and inexpensively store and analyze raw
financial data in varied nonrelational formats. The data needs strict access privileges and
should be generally accessible to SQL users in some way.
Problem
Current infrastructure is not flexible enough to handle the expected variations in data
formats and processing algorithms. Security requirements are too strict for vanilla
Hadoop.
Solution
Use the Teradata Big Analytics Appliance to ingest and store the data. Analysts access the data
through an access layer in the data warehouse, and power users manipulate
the data on the Hadoop system directly.
HADOOP TERADATA
PLATFORM FAMILY
Sub-queries
Data
Queries
SECURE ACCESS ARCHITECTURE
• Teradata can be used as an access layer to the data
stored in Hadoop.
• Data in Hadoop can be accessed by data
warehouse users with no knowledge of the
inner workings of Hadoop.
• The full Teradata SQL library is now available to
Hadoop users
• Teradata can be used as a secure gateway to
limit the authentication gap in Hadoop without
needing Kerberos.
SECURE ACCESS ARCHITECTURE
HADOOP TERADATA
PLATFORM FAMILY
Query Grid
Data
TERADATA QUERY GRID:
TERADATA DATABASE TO HADOOP
• Direct data transfer from the Hadoop Distributed Filesystem
• Hadoop data referenced in normal SQL queries
• Transfers occur in a high speed, parallel, scalable fashion
• Data can be processed on the fly or stored long-term
-- Expose Hadoop data (read through HCatalog) as a Teradata view
CREATE VIEW TOM AS (
  SELECT * FROM load_from_hcatalog(
    USING
      server('sdll4364.labs.teradata.com')
      port('9083')
      username('hive')
      dbname('vim')
      templeton_port('1880')
  )
);
• There are a plethora of options to fine-tune data transfer
between Teradata and Hadoop
• Access rights on the view can limit users' access to other
data sets.
TERADATA QUERY GRID
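If several such views are needed, the statement can be generated from parameters rather than written by hand. The sketch below only builds the SQL text in the shape shown on the slide and does not connect to any database; the `hcatalog_view_ddl` and `sql_literal` helpers and the example host name are assumptions, and the operator options are limited to the ones the slide lists.

```python
from typing import Dict


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def hcatalog_view_ddl(view_name: str, options: Dict[str, str]) -> str:
    """Build a CREATE VIEW statement shaped like the one on the slide."""
    # Reject names that are not plain identifiers to avoid injecting SQL.
    if not view_name.isidentifier():
        raise ValueError(f"unsafe view name: {view_name!r}")
    for key in options:
        if not key.isidentifier():
            raise ValueError(f"unsafe option name: {key!r}")
    using = "\n".join(f"      {key}({sql_literal(val)})"
                      for key, val in options.items())
    return (
        f"CREATE VIEW {view_name} AS (\n"
        f"  SELECT * FROM load_from_hcatalog(\n"
        f"    USING\n"
        f"{using}\n"
        f"  )\n"
        f");"
    )


ddl = hcatalog_view_ddl("TOM", {
    "server": "hadoop-metastore.example.com",  # placeholder host
    "port": "9083",
    "username": "hive",
    "dbname": "vim",
    "templeton_port": "1880",
})
print(ddl)
```

Access can then be granted on the view alone, which is how the slide's second bullet limits what warehouse users see of the Hadoop data.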
• Businesses can leverage the much more widespread
SQL and EDW user community instead of the small,
expensive Hadoop expert community. This saves the
business money.
• Data can be stored inexpensively, securely, and
accessibly at the same time.
SECURE ACCESS ARCHITECTURE —
BUSINESS VALUE
PHARMACY USE CASE
Impact
• Reduced storage costs for data variety
• Perform ad hoc analytics on the multiple versions of data
• Retrieve data in minutes (vs. days with tape archives)
• Reduced load and improved performance of DW/Databases
Situation
High-performance storage is expensive. A large integrated pharmacy HC provider deals
with a variety of data of differing business value. Not all data can be stored on the same
system. Ever-expanding data only adds to this challenge.
Problem
Data in long-term storage cannot be queried, and retrieval takes a long time. No analysis
can be performed on the archived data, so the business loses out on the value of this data.
Solution
Used Teradata Hadoop nodes to store all incoming data from weblogs, medical
data, and JSON files. Hadoop also serves as an enrichment layer to enhance data for high-end
analytics consumption. The complete solution provides easy movement of data between
Hadoop, Aster, and Teradata.
HADOOP TERADATA
PLATFORM FAMILY
ACTIVE ARCHIVE
• Hadoop can be used to store the data warehouse's
cold data, historical data, and regular backups.
Backups
Historical Data
• Using Hadoop as an active archive allows database users to
access cold or historical data on the fly, unlike tape
archives.
• Hadoop data can be accessed in the EDW using Teradata
QueryGrid: Teradata-Hadoop.
• The data is no longer stored in the data warehouse,
freeing valuable space. Hadoop is a less expensive
platform to store this data on.
ACTIVE ARCHIVE
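Deciding which rows count as cold is usually a retention rule. A minimal sketch, assuming each row carries an `event_date` field and that anything older than a fixed number of days moves to Hadoop; the field name, the `split_by_age` helper, and the one-year threshold are all hypothetical.

```python
from datetime import date, timedelta
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_by_age(rows: List[Row], today: date, keep_days: int) -> Tuple[List[Row], List[Row]]:
    """Split rows into (hot, cold) by the age of their event_date field."""
    cutoff = today - timedelta(days=keep_days)
    hot = [r for r in rows if r["event_date"] >= cutoff]   # stays in the warehouse
    cold = [r for r in rows if r["event_date"] < cutoff]   # candidate for the archive
    return hot, cold


rows = [
    {"id": 1, "event_date": date(2014, 6, 1)},
    {"id": 2, "event_date": date(2011, 1, 15)},
]
hot, cold = split_by_age(rows, today=date(2014, 6, 17), keep_days=365)
print([r["id"] for r in hot], [r["id"] for r in cold])
```

In practice the cold set would be exported to Hadoop before being removed from the warehouse, so that it remains queryable from the EDW.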
• Storing data on Hadoop frees up cold data storage space
on the relatively expensive data warehouse, saving the
business money.
• Compared to tape, businesses can still analyze and
access their data on Hadoop. This saves time and effort.
ACTIVE ARCHIVE — BUSINESS VALUE
• A successful DW / Hadoop coexistence system will see
varying uses of all four of these mechanisms concurrently.
• Replacing existing infrastructures with Hadoop is not a
feasible goal.
• In order to get Hadoop's foot in the door with large
established enterprises, we need to push Hadoop as an
integrated solution in tandem with a DW.
CONCLUDING REMARKS
PUSHING HADOOP FURTHER
Q&A
THANKS!
WWW.TERADATA.COM

Data Warehouse Modernization: Accelerating Time-To-Action
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 

Hadoop is not an Island in the Enterprise

  • 1. HADOOP IS NOT AN ISLAND IN THE ENTERPRISE: Understanding Deployment Practices that Merge the Strengths of Hadoop and the Data Warehouse
      Joe Rao, PS Consultant, Teradata Corporation
  • 2. AGENDA (6/17/2014, Teradata Confidential)
      This presentation covers:
      • A comparison of the strengths of Hadoop and a Data Warehouse
      • Architectures that involve Hadoop and the data warehouse working together
  • 3. FRAMING THE DISCUSSION
      • Our two platforms:
          > The Data Platform – Hadoop
          > The Enterprise Data Warehouse – Teradata
      • Both platforms could handle everything by themselves if we really wanted them to
      • Biased organizations will favor one over the other, and argue that everything can be done in one place
      • And they're both right
  • 4. FRAMING THE DISCUSSION
      • Let's consider a software startup or company that has no IT department yet
      • They need to:
          > Acquire their technology from scratch
          > Build business logic from scratch
          > Staff their new department from scratch
      • With no existing technology, how should they structure their data center?
  • 5. DATA WAREHOUSE STRENGTHS
      • Traditional data warehouses (like the Teradata database) have been used as the central repository of business data for years.
      • Data warehouses are great with:
          > Thousands of concurrent users and queries
          > Full ANSI SQL interfaces
          > Very complex SQL query logic
          > Advanced workload management
          > Transactional capabilities
          > Secure access
  • 6. DATA WAREHOUSE ONLY?
      • Many companies that have been doing things the old way with a data warehouse don't think they need to change anything.
      • What they've been doing has worked for years. Hadoop is young and immature, they say. Why change?
      • These companies are change-resistant. They are missing out on the advancements in big data and can fall behind their competition.
      [Image caption: "I’m lonely"]
  • 7. HADOOP STRENGTHS
      • Hadoop is changing the game in the enterprise data landscape. Its major strengths include:
          > Economical
          > Able to process extremely large data sets
          > Extremely flexible storage and processing
          > Open, free, active community development
  • 8. TERADATA APPLIANCE FOR HADOOP
      • Appliance Solution
          > Purpose-built integrated hardware/software solution
          > Optimized hardware for Hadoop, software, storage, and networking in a single rack
          > Delivered ready to run at a competitive price point
      • Enterprise Ready
          > 100% open-source Hadoop via Hortonworks HDP
          > Integrated with Teradata Unified Data Architecture on 40 Gb/s InfiniBand BYNET V5 for performance and reliability
          > Support for major ETL tools, enhanced security, and metadata management
          > Management tools for monitoring system health
      • Benefits
          > Lowest TCO and fastest time to value
          > Fully engineered and supported by Teradata
  • 9. HADOOP ONLY?
      • Many companies are so eager to jump onto the Hadoop wave that they think they can run their entire datacenter on Hadoop.
      • It's free, it has lots of development effort put into it, it's flexible. Why go the “old way” with an EDW?
      • These companies are using Hadoop beyond its design and maturity level, and may run into technical problems meeting requirements.
      [Image caption: "I’m lonely"]
  • 10. CONCLUSIONS: TWO TCOD EXAMPLES
      1. TCOD is NOT platform cost; it is total project cost
      2. Each technology has large advantages in its sweet spot(s)
      3. Neither platform is cost effective in the other’s sweet spot
      4. Biggest differences for the data warehouse are the development of:
          > Complex queries
          > Analytics
      [Figure: two bar charts of cost in millions of dollars, each comparing "On Hadoop" with "On Data Warehouse"; axis ranges $0 to $35 and $0 to $800. Panel titles: "Data Refining: Hadoop wins (Also: Landing Zone, Archive)" and "EDW: Data W/H Platform Wins". Legend: Total System Cost, System and Data Admin, Application Development, ETL, Complex Queries, Analysis.]
      Source: WinterCorp. Full report at www.wintercorp.com/tcod-report
  • 11. EDW VS. HADOOP
      • These two platforms are complementary!
      • Successful enterprise datacenters merge the strengths of both platforms.
  • 12. COMBINED ARCHITECTURES
      • Split Workload Architecture
      • ETL System Architecture
      • Secure Access Architecture
      • Active Archive Architecture
  • 13. INSURANCE USE CASE
      Situation: A large diversified customer needed to accurately calculate scores and adjust risk premiums for its enterprise fleets based on vehicle data, driver behavior, GPS data, weather data, traffic and DW data. Its current custom-developed applications limit the effectiveness of these scores.
      Problem: The customer lacks the infrastructure and systems to handle the huge volumes of real-time data. It has no ad hoc reporting systems to combine, enrich and analyze the data. Limited storage capacity restricts the amount of data that can be captured, refined and stored.
      Solution: Used Teradata Big Analytics Appliance to design a platform that streamlines the ingestion of telematics data from multiple sources, data types, structures, and frequencies, and combines it with other data sources to perform meaningful analytics.
      Impact:
      • Quickly analyze data for informed decisions and ad hoc reporting
      • Streamlined process to calculate vehicle and fleet scores
      • Cost effectively quantify, adjust and manage risk premiums
  • 14. SPLIT WORKLOADS
      • The Data Warehouse and Hadoop run different workloads on different data sets.
      [Diagram labels: Hadoop (Big Data); Teradata Integrated Data Warehouse (Operational Data)]
  • 15. SPLIT WORKLOADS
      • It is not economical to put gigantic, “value sparse” data sets on an enterprise data warehouse.
      • Hadoop was not built to be an accessible, highly concurrent transactional database.
      • The easiest natural architecture is to split up the two platforms based on the data set and workload.
          > Teradata handles the operational business data and queries
          > Hadoop handles the cost prohibitive “big data” sets, such as web, machine, social data
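The split described on this slide amounts to a routing rule keyed on the kind of data. A minimal sketch of that rule follows, written in POSIX shell to match the command-line example later in the deck. The function name `route_dataset` and the category labels are assumptions made for illustration; they are borrowed loosely from the data types the slide names and are not part of any Teradata or Hadoop tool.

```shell
#!/bin/sh
# Hypothetical routing rule: decide which platform a data set belongs on.
# The category names are illustrative labels, not a real classification API.
route_dataset() {
    case "$1" in
        web|machine|social) echo hadoop ;;        # large, value-sparse sets named on the slide
        operational)        echo teradata ;;      # concurrent business data and queries
        *)                  echo unclassified ;;  # anything else needs a human decision
    esac
}

# Show the decision for a few sample categories.
for kind in web operational social sensor; do
    printf '%s -> %s\n' "$kind" "$(route_dataset "$kind")"
done
```

In practice the decision would likely also weigh data volume, concurrency, and cost, which this sketch deliberately leaves out.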
  • 16. SPLIT WORKLOADS: BUSINESS VALUE
      • Both systems operate favorably on cost and performance with respect to their given workloads.
      • The business can analyze new data and gain new insights that their existing platform couldn't handle before.
  • 17. LARGE COMPUTER MANUFACTURER: Analysis of Customer Web Interactions (Capture, Refine, Store ClickStream Data)
      Situation: Customers interact with the public websites of a large PC vendor for various purposes, resulting in huge volumes of raw Omniture data. Because of its nature, the data structure and format are not always consistent, and because of the volume, processing the data is difficult.
      Problem: Inconsistencies such as file errors and corrupted file compression in the raw Omniture data make the capture and analysis process error prone. The volume and velocity (70 files/hr, 1M files) add to the complexity.
      Solution: A Teradata Big Analytics solution provides a landing and staging area for incoming data at high velocity. Hadoop nodes curate the data, check for data consistency, and prepare the data for consumption by higher-end analytic platforms.
      Impact:
      • Reduced data inconsistencies and improved performance
      • Capture and curate ALL the data and prepare for analysis
      • Perform ad hoc analytics on multi-level interactions
      • Improves the marketing campaigns and the customer support process
  • 18. ETL SYSTEM ARCHITECTURE
      • Hadoop can be used as a staging and ETL preprocessing layer for the Data Warehouse.
      [Diagram labels: Source Data, Hadoop, Transformed Data, Teradata (Teradata Platform Family)]
  • 19. ETL SYSTEM ARCHITECTURE
      • The Data Warehouse is busy with operational queries. We can reduce the workload on the DW by migrating some ETL to Hadoop.
      • ETL processing is a write-once step, which fits Hadoop's architecture.
      • Hadoop can inexpensively retain the raw source data for data lineage purposes.
      *Note that there are many cases where this migration doesn't make sense, such as when it's necessary to do referential integrity checks. The DW is capable of handling its ETL if necessary.
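As an illustration of the kind of write-once preprocessing that could move off the warehouse, here is a minimal sketch of a cleansing step written as a stdin-to-stdout filter, which is the shape Hadoop Streaming expects of a mapper. The three-field CSV layout (id, event date, amount) and the function name `clean_records` are assumptions made for this example; the deck does not describe any particular record format.

```shell
#!/bin/sh
# Hypothetical cleansing step for a three-field CSV feed (id,event_date,amount).
# It reads raw rows on stdin and writes cleaned, tab-separated rows on stdout.
clean_records() {
    awk -F',' '
        NF != 3            { next }                 # drop rows with the wrong field count
        $1 !~ /^[0-9]+$/   { next }                 # drop rows whose id is not numeric
        {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)         # trim whitespace around the date
            printf "%s\t%s\t%.2f\n", $1, $2, $3     # emit tab-separated, amount normalized
        }
    '
}

# Local dry run on a few sample rows; the raw input itself is never modified.
printf '%s\n' '101,2014-06-17,19.5' 'bad,row' 'x7,2014-06-17,3' '102, 2014-06-18 ,7' | clean_records
```

Because the filter only reads and emits rows, the raw source files can stay in Hadoop unchanged, which is what makes the lineage point on the slide possible. Checks that need lookups against other tables, such as referential integrity, are the cases the slide's note says may be better left on the warehouse.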
  • 20. TERADATA CONNECTOR FOR HADOOP (TDCH)
      • Command line interface for Hadoop / TD data transfer
      • Batch MapReduce jobs
      • Bidirectional
      • Run on the Hadoop side
      [Diagram labels: Hadoop, TDCH, Teradata Platform Family]
  • 21. TERADATA CONNECTOR FOR HADOOP
      hadoop jar /home/jo845b/teradata-connector-1.0.10/lib/teradata-connector-1.0.10.jar \
          com.teradata.hadoop.tool.TeradataExportTool \
          -libjars $LIB_JARS \
          -classname com.teradata.jdbc.TeraDriver \
          -url jdbc:teradata://terarps.ca.boeing.com/DATABASE=SQLH_TEST \
          -username jo845b \
          -password Teradata14 \
          -jobtype hcat \
          -fileformat rcfile \
          -method internal.fastload \
          -sourcedatabase default \
          -sourcetable ontime_sqoop \
          -targettable ontime_sqoop \
          -usexviews true
      • There is a plethora of options to fine-tune data transfer between Teradata and Hadoop.
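The export command on this slide hard-codes the host, user, and password. One way to make it reusable is a small wrapper that assembles the same flags from variables and prints the result as a dry run. The sketch below uses only the flags that appear on the slide; the function name `build_tdch_export_cmd`, the variable names, and the placeholder host and user are assumptions for illustration and are not part of TDCH.

```shell
#!/bin/sh
# Hypothetical wrapper around the slide's TDCH export command.
# All variable names and default values here are placeholders for illustration.
TDCH_JAR="${TDCH_JAR:-teradata-connector-1.0.10.jar}"
TD_URL="${TD_URL:-jdbc:teradata://td-host.example.com/DATABASE=SQLH_TEST}"
TD_USER="${TD_USER:-etl_user}"

# Print the export command for a source table and a target table.
# $LIB_JARS and $TD_PASSWORD are emitted as literal references (single quotes),
# so the password is resolved only when the command is actually run.
build_tdch_export_cmd() {
    src_table="$1"
    tgt_table="$2"
    printf '%s ' \
        hadoop jar "$TDCH_JAR" com.teradata.hadoop.tool.TeradataExportTool \
        -libjars '$LIB_JARS' \
        -classname com.teradata.jdbc.TeraDriver \
        -url "$TD_URL" \
        -username "$TD_USER" \
        -password '$TD_PASSWORD' \
        -jobtype hcat \
        -fileformat rcfile \
        -method internal.fastload \
        -sourcedatabase default \
        -sourcetable "$src_table" \
        -targettable "$tgt_table" \
        -usexviews true
    printf '\n'
}

# Dry run: show the command instead of executing it.
build_tdch_export_cmd ontime_sqoop ontime_sqoop
```

Printing rather than executing keeps the sketch runnable on a machine without Hadoop, and keeping the password as an environment reference avoids writing credentials into scripts or logs.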
  • 22. ETL SYSTEM ARCHITECTURE: BUSINESS VALUE • Hadoop frees up the Data Warehouse's limited storage and processing resources, saving the business time and money. • Data can now be kept in its raw form, adding new data lineage capabilities to the data center.
  • 23. BANKING USE CASE. Situation: A large national bank needed to securely and inexpensively store and analyze raw financial data in varied non-relational formats. The data needs strict access privileges and should be generally accessible to SQL users in some way. Problem: The current infrastructure is not flexible enough to handle the expected variations in data formats and processing algorithms. Security requirements are too strict for vanilla Hadoop. Solution: Use the Teradata Big Analytics Appliance to ingest and store the data. Analysts access the data through an access layer in the data warehouse, and power users manipulate the data on the Hadoop system directly. Impact: • Analyze multi-structured data types • Keep data confidential to those with access rights • SQL users have easy access to big data sources
  • 24. SECURE ACCESS ARCHITECTURE • Teradata can be used as an access layer to the data stored in Hadoop. [Diagram labels: HADOOP, TERADATA PLATFORM FAMILY, Queries, Sub-queries, Data]
  • 25. SECURE ACCESS ARCHITECTURE • Data in Hadoop can be accessed by data warehouse users with no knowledge of the inner workings of Hadoop. • The full Teradata SQL library is now available to Hadoop users. • Teradata can be used as a secure gateway to narrow the authentication gap in Hadoop without needing Kerberos.
  • 26. TERADATA QUERY GRID: TERADATA DATABASE TO HADOOP • Direct data transfer from the Hadoop Distributed File System (HDFS) • Hadoop data referenced in normal SQL queries • Transfers occur in a high-speed, parallel, scalable fashion • Data can be processed on the fly or stored long-term [Diagram labels: HADOOP, TERADATA PLATFORM FAMILY, Query Grid, Data]
  • 27. TERADATA QUERY GRID CREATE VIEW TOM AS ( SELECT * FROM load_from_hcatalog( USING server('sdll4364.labs.teradata.com') port('9083') username('hive') dbname('vim') templeton_port('1880') )); • There is a plethora of options to fine-tune data transfer between Teradata and Hadoop. • Access rights on the view can limit users' access to other data sets.
  • 28. SECURE ACCESS ARCHITECTURE: BUSINESS VALUE • Businesses can draw on the much larger SQL and EDW user community instead of the small, expensive community of Hadoop experts, which saves money. • Data can be stored inexpensively, securely, and accessibly at the same time.
  • 29. PHARMACY USE CASE. Situation: High-performance storage is expensive. A large integrated pharmacy HC provider deals with a variety of data of differing business value. Not all of this data can be stored on the same system, and ever-expanding data volumes only add to the challenge. Problem: Data in long-term storage cannot be queried, and retrieval takes a long time. No analysis can be performed on the archived data, so the business loses out on the value of this data. Solution: Teradata Hadoop nodes store all the data coming in from weblogs, medical data, and JSON files. Hadoop also serves as an enrichment layer that enhances data for high-end analytics consumption. The complete solution provides easy movement of data between Hadoop, Aster, and Teradata. Impact: • Reduced storage costs for data variety • Perform ad hoc analytics on multiple versions of data • Retrieve data in minutes (vs. days with tape archives) • Reduced load and improved performance of DW/databases
  • 30. ACTIVE ARCHIVE • Hadoop can be used to store the data warehouse's cold data, historical data, and regular backups. [Diagram labels: HADOOP, TERADATA PLATFORM FAMILY, Backups, Historical Data]
  • 31. ACTIVE ARCHIVE • Unlike tape archives, using Hadoop as an active archive allows database users to access cold or historical data on the fly. • Hadoop data can be accessed in the EDW using Teradata QueryGrid: Teradata-Hadoop. • The data is no longer stored in the data warehouse, which frees valuable space. Hadoop is a less expensive platform on which to store this data.
  • 32. ACTIVE ARCHIVE: BUSINESS VALUE • Moving cold data to Hadoop frees up storage space on the relatively expensive data warehouse, saving the business money. • Unlike data on tape, data on Hadoop can still be accessed and analyzed. This saves time and effort.
  • 33. CONCLUDING REMARKS: PUSHING HADOOP FURTHER • A successful DW / Hadoop coexistence system will use all four of these mechanisms concurrently, to varying degrees. • Replacing existing infrastructures with Hadoop is not a feasible goal. • To get Hadoop's foot in the door at large, established enterprises, we need to push Hadoop as an integrated solution in tandem with a DW.
  • 34. Q&A

Editor's Notes

  1. 8