WINTERCORP

TCOD: A Framework for the
Total Cost of Big Data
(key charts)
Research Report: wintercorp.com/tcod-report

Spreadsheet: wintercorp.com/tcod-spreadsheet
Key Charts: wintercorp.com/tcod-charts

Richard Winter
WinterCorp
December 6, 2013, V17

THE LARGE SCALE DATA MANAGEMENT EXPERTS
Total Cost of Data
(TCOD)

Software Development/Maintenance

Analytics

Queries

Apps

Admin

ETL*

System

Diagram not to scale.

TCOD is the cost of storing, managing and using data over time for analytic purposes
* ETL is extract, transform and load (preparing data for analytic use)
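As a minimal sketch of the definition above, TCOD can be modeled as the sum of the cost components named in the diagram. The class name, the field names, and the figures in the usage lines are hypothetical placeholders for illustration, not values or structures taken from the report or its spreadsheet.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TcodEstimate:
    """Cost components from the TCOD diagram, in millions of dollars.

    Field names are illustrative assumptions; the report's spreadsheet
    may break costs down differently.
    """
    system: float
    admin: float
    etl: float
    apps: float
    queries: float
    analytics: float

    def total(self) -> float:
        # TCOD is the sum of every component, not the system cost alone.
        return sum(getattr(self, f.name) for f in fields(self))

    def system_share(self) -> float:
        # Fraction of the total that is platform (system) cost.
        return self.system / self.total()


# Hypothetical placeholder figures, for illustration only.
example = TcodEstimate(system=2.0, admin=1.0, etl=3.0,
                       apps=1.5, queries=1.5, analytics=1.0)
print(example.total())
print(round(example.system_share(), 2))
```

Keeping system cost as one field among several makes it easy to see how small or large a share of the total it represents for a given workload.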
WINTERCORP: THE LARGE SCALE DATA MANAGEMENT EXPERTS
© 2010, 2011, 2012 WINTER CORPORATION,, CAMBRIDGE Reserved.
©2010 2013 WINTER CORPORATION CAMBRIDGE MA. ALL RIGHTS RESERVED..
© 2012, Winter Corporation. All Rights MA. ALL RIGHTS RESERVED

Data Refining Example
Data from Turbines

Data Refining Example
Data Management Requirements

1. Hundreds of TB of data per week – 500 TB data capacity
2. Raw data life: few hours to a few days
3. Challenge: find the important events or trends quickly
4. Massive analysis problem
5. When analyzing, read entire files
6. Keep only the significant data
Cost Comparison
Engineering Example – Data Refining

Chart (not to scale): On Hadoop $9.3m; On Data Warehouse Appliance* $30m

                      Data Warehouse Appliance*   Hadoop
Volume of Data        500 TB                      500 TB
System Cost           $23 million                 $1.3 million
Total Cost of Data    $30 million                 $9.3 million

* Performance class of DW Appliance, not the lowest price class
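The arithmetic behind this comparison can be sketched as follows. The dollar figures are the ones on this slide; the dictionary layout and function names are assumptions made for the example.

```python
# Figures from this slide, in millions of dollars.
COSTS = {
    "Data Warehouse Appliance": {"system": 23.0, "total": 30.0},
    "Hadoop": {"system": 1.3, "total": 9.3},
}


def non_system_cost(entry: dict) -> float:
    # Everything in the total cost of data that is not system cost.
    return entry["total"] - entry["system"]


def system_share(entry: dict) -> float:
    # Fraction of the total cost of data spent on the system itself.
    return entry["system"] / entry["total"]


for name, entry in COSTS.items():
    print(f"{name}: non-system ${non_system_cost(entry):.1f}m, "
          f"system share {system_share(entry):.0%}")

# How many times more the appliance costs in total than Hadoop.
total_ratio = COSTS["Data Warehouse Appliance"]["total"] / COSTS["Hadoop"]["total"]
print(f"Total cost ratio (appliance / Hadoop): {total_ratio:.1f}x")
```

On these figures the non-system costs are of similar size on both platforms (about $7 million and $8 million), so the gap in total cost comes mostly from system cost.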

Observations on Hadoop
1. Many examples of the data refining requirement in engineering, operations, business, science, healthcare
2. Cost equation is favorable to Hadoop in these applications even with a wide variety of data types
3. There are also many other excellent Hadoop use cases
– Data landing zone
– Archive
– Intensive batch processing of data
4. This example is one illustration of the Hadoop sweet spot
Business Example
Enterprise Data Warehouse

Business Example - EDW
Data Management Requirements
1. Data volume
   a. 500 TB to start – all retained for at least five years
   b. Continual growth of data and workload
2. Data sources: thousands
   a. Data sources change their feeds frequently
   b. New data sources are frequent
3. Challenges
   a. Data must be correct
   b. Data must be integrated
4. Typical enterprise data lifetime: decades
5. Analytic application lifetime: years
6. Many thousands of data users (10^4 – 10^6)
7. Hundreds of analytic applications
8. Thousands of one time analyses
9. Tens of thousands of complex queries

Cost Comparison
Business Example – EDW
Chart (not to scale): On EDW Platform $265 million; On Hadoop $740 million
Cost categories: Total System Cost, System and Data Admin, ETL, Application Development, Complex Queries, Analysis

                      Data Warehouse Platform   Hadoop
Volume of Data        500 TB                    500 TB
System Cost           $45 million               $1.4 million
Total Cost of Data    $265 million              $740 million

Conclusions – Two TCOD Examples

[Figure: two bar charts of cost in millions of dollars, each comparing "On Hadoop" with "On Data Warehouse". Cost categories: Total System Cost, System and Data Admin, Application Development, ETL, Complex Queries, Analysis.]
Data Refining (axis $0 to $35 million): Hadoop wins. Also: Landing Zone, Archive
EDW (axis $0 to $800 million): Data W/H Platform wins

1. TCOD is NOT platform cost
2. Each technology has large advantages in its sweet spot(s)
3. Neither platform is cost effective in the other’s sweet spot
4. Biggest differences for the data warehouse are the development of:
   – Complex queries
   – Analytics
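The first conclusion can be checked against the figures from the two cost comparison slides. The sketch below picks the cheapest platform once by system cost and once by total cost of data; the dollar values come from those slides, while the data layout and the helper name `cheapest` are assumptions made for the example.

```python
# Figures from the two cost comparison slides, in millions of dollars.
EXAMPLES = {
    "Data refining": {
        "Hadoop": {"system": 1.3, "total": 9.3},
        "Data warehouse": {"system": 23.0, "total": 30.0},
    },
    "EDW": {
        "Hadoop": {"system": 1.4, "total": 740.0},
        "Data warehouse": {"system": 45.0, "total": 265.0},
    },
}


def cheapest(platforms: dict, key: str) -> str:
    # Name of the platform with the lowest value for the given cost key.
    return min(platforms, key=lambda name: platforms[name][key])


for example, platforms in EXAMPLES.items():
    by_system = cheapest(platforms, "system")
    by_total = cheapest(platforms, "total")
    print(f"{example}: lowest system cost = {by_system}, "
          f"lowest TCOD = {by_total}")
```

In the EDW example the two criteria disagree: Hadoop has the lower system cost, but the data warehouse platform has the lower total cost of data.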

TCOD Framework
Additional Notes
Not taken into account
– Actual system workloads, concurrency, availability requirements
– Cost of preparing simple queries
– Cost of query execution
– Workload management
– Vendor supported distributions of Hadoop/Hadoop Appliances
– ETL products available with Hadoop
New Products Should Eventually Decrease TCOD with Hadoop
– Cloudera Impala, IBM BigSQL, Teradata SQL-H, EMC Pivotal
– New version of Hive supports subset of SQL
– Further analysis, evaluation and measurement are required
In Conclusion
1. TCOD estimates what your company will really spend to get
to your business goal.
2. Total cost is extremely sensitive to technology choice
3. Analytic architectures will require both Hadoop and data
warehouse platforms
4. Focus on total cost, not platform cost, in making your choice
for a particular application or use.
5. Many analytic processes will use both Hadoop and data
warehouse technology – so integration counts!
Questions and comments welcome at tcod@wintercorp.com
WINTERCORP: THE LARGE SCALE DATA MANAGEMENT EXPERTS
© 2010, 2011, 2012 WINTER CORPORATION,, CAMBRIDGE Reserved.
©2010 2013 WINTER CORPORATION CAMBRIDGE MA. ALL RIGHTS RESERVED..
© 2012, Winter Corporation. All Rights MA. ALL RIGHTS RESERVED

12

Mais conteúdo relacionado

Mais procurados

Machine Learning Everywhere
Machine Learning EverywhereMachine Learning Everywhere
Machine Learning EverywhereDataWorks Summit
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchCloudera, Inc.
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera, Inc.
 
Presentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_finalPresentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_finalDiego Alberto Tamayo
 
Apache Impala (incubating) 2.5 Performance Update
Apache Impala (incubating) 2.5 Performance UpdateApache Impala (incubating) 2.5 Performance Update
Apache Impala (incubating) 2.5 Performance UpdateCloudera, Inc.
 
Intuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with SearchIntuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with SearchCloudera, Inc.
 
Scaling Data Science on Big Data
Scaling Data Science on Big DataScaling Data Science on Big Data
Scaling Data Science on Big DataDataWorks Summit
 
Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduCloudera, Inc.
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Cloudera, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Better Together: The New Data Management Orchestra
Better Together: The New Data Management OrchestraBetter Together: The New Data Management Orchestra
Better Together: The New Data Management OrchestraCloudera, Inc.
 
Hadoop Hadoop & Spark meetup - Altiscale
Hadoop Hadoop & Spark meetup - AltiscaleHadoop Hadoop & Spark meetup - Altiscale
Hadoop Hadoop & Spark meetup - AltiscaleMark Kerzner
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
 
Breakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopBreakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopCloudera, Inc.
 
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudMichael Rainey
 
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProCloudera, Inc.
 

Mais procurados (20)

Machine Learning Everywhere
Machine Learning EverywhereMachine Learning Everywhere
Machine Learning Everywhere
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
 
Presentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_finalPresentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_final
 
Apache Impala (incubating) 2.5 Performance Update
Apache Impala (incubating) 2.5 Performance UpdateApache Impala (incubating) 2.5 Performance Update
Apache Impala (incubating) 2.5 Performance Update
 
Intuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with SearchIntuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with Search
 
Scaling Data Science on Big Data
Scaling Data Science on Big DataScaling Data Science on Big Data
Scaling Data Science on Big Data
 
Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
Rob Bearden Keynote Hadoop Summit San Jose
Rob Bearden Keynote Hadoop Summit San JoseRob Bearden Keynote Hadoop Summit San Jose
Rob Bearden Keynote Hadoop Summit San Jose
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Better Together: The New Data Management Orchestra
Better Together: The New Data Management OrchestraBetter Together: The New Data Management Orchestra
Better Together: The New Data Management Orchestra
 
Hadoop Hadoop & Spark meetup - Altiscale
Hadoop Hadoop & Spark meetup - AltiscaleHadoop Hadoop & Spark meetup - Altiscale
Hadoop Hadoop & Spark meetup - Altiscale
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Breakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopBreakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with Hadoop
 
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
 
Data-In-Motion Unleashed
Data-In-Motion UnleashedData-In-Motion Unleashed
Data-In-Motion Unleashed
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
 
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
 
Beyond TCO
Beyond TCOBeyond TCO
Beyond TCO
 

Semelhante a Tcod a framework for the total cost of big data - december 6 2013 - winter corp - v17

Foundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureFoundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureInside Analysis
 
Big Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and StoringBig Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and StoringIRJET Journal
 
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBig Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBigDataExpo
 
Data Warehouse Offload
Data Warehouse OffloadData Warehouse Offload
Data Warehouse OffloadJohn Berns
 
Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14John Sing
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?MarketingArrowECS_CZ
 
Postgres.foreign.data.wrappers.2015
Postgres.foreign.data.wrappers.2015Postgres.foreign.data.wrappers.2015
Postgres.foreign.data.wrappers.2015EDB
 
Breakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data StoreBreakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data StoreCloudera, Inc.
 
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Romeo Kienzler
 
IRJET - Weather Log Analysis based on Hadoop Technology
IRJET - Weather Log Analysis based on Hadoop TechnologyIRJET - Weather Log Analysis based on Hadoop Technology
IRJET - Weather Log Analysis based on Hadoop TechnologyIRJET Journal
 
Using Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad AnsersonUsing Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad AnsersonMapR Technologies
 
Rob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoopRob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoopGhassan Al-Yafie
 
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Sumeet Singh
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSenturus
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsAshish Mrig
 

Semelhante a Tcod a framework for the total cost of big data - december 6 2013 - winter corp - v17 (20)

Foundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureFoundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information Architecture
 
Big Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and StoringBig Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and Storing
 
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBig Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
 
EMC config Hadoop
EMC config HadoopEMC config Hadoop
EMC config Hadoop
 
Data Warehouse Offload
Data Warehouse OffloadData Warehouse Offload
Data Warehouse Offload
 
Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14
 
Greenplum feature
Greenplum featureGreenplum feature
Greenplum feature
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
 
Postgres.foreign.data.wrappers.2015
Postgres.foreign.data.wrappers.2015Postgres.foreign.data.wrappers.2015
Postgres.foreign.data.wrappers.2015
 
Breakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data StoreBreakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data Store
 
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
 
IRJET - Weather Log Analysis based on Hadoop Technology
IRJET - Weather Log Analysis based on Hadoop TechnologyIRJET - Weather Log Analysis based on Hadoop Technology
IRJET - Weather Log Analysis based on Hadoop Technology
 
Solving Big Data Problems
Solving Big Data ProblemsSolving Big Data Problems
Solving Big Data Problems
 
Using Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad AnsersonUsing Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
Using Hadoop to Offload Data Warehouse Processing and More - Brad Anserson
 
Rob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoopRob peglar introduction_analytics _big data_hadoop
Rob peglar introduction_analytics _big data_hadoop
 
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
Data centerefficiency
Data centerefficiencyData centerefficiency
Data centerefficiency
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data Platforms
 
Modernise your EDW - Data Lake
Modernise your EDW - Data LakeModernise your EDW - Data Lake
Modernise your EDW - Data Lake
 

Último

AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 

Último (20)

AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

TCOD: A Framework for the Total Cost of Big Data - December 6, 2013 - WinterCorp - V17

  • 1. WINTERCORP: THE LARGE SCALE DATA MANAGEMENT EXPERTS
      TCOD: A Framework for the Total Cost of Big Data (key charts)
      Research Report: wintercorp.com/tcod-report
      Spreadsheet: wintercorp.com/tcod-spreadsheet
      Key Charts: wintercorp.com/tcod-charts
      Richard Winter, WinterCorp
      December 6, 2013, V17
  • 2. Total Cost of Data (TCOD)
      [Diagram, not to scale, with the labels: Software Development/Maintenance, Analytics, Queries, Apps, Admin, ETL*, System]
      TCOD is the cost of storing, managing and using data over time for analytic purposes.
      * ETL is extract, transform and load (preparing data for analytic use).
      WINTERCORP: THE LARGE SCALE DATA MANAGEMENT EXPERTS
      © 2010, 2011, 2012, 2013 WINTER CORPORATION, CAMBRIDGE MA. ALL RIGHTS RESERVED.
  • 3. Data Refining Example: Data from Turbines
  • 4. Data Refining Example: Data Management Requirements
      1. Hundreds of TB of data per week – 500 TB data capacity
      2. Raw data life: a few hours to a few days
      3. Challenge: find the important events or trends quickly
      4. Massive analysis problem
      5. When analyzing, read entire files
      6. Keep only the significant data
  • 5. Cost Comparison: Engineering Example – Data Refining
      [Chart, not to scale. Total cost of data: $9.3 million on Hadoop, $30 million on Data Warehouse Appliance*]
      Volume of Data: Data Warehouse Appliance 500 TB; Hadoop 500 TB
      System Cost: Data Warehouse Appliance $23 million; Hadoop $1.3 million
      Total Cost of Data: Data Warehouse Appliance $30 million; Hadoop $9.3 million
      * Performance class of DW Appliance – not the lowest price class
  • 6. Observations on Hadoop
      1. Many examples of the data refining requirement in engineering, operations, business, science, healthcare
      2. Cost equation is favorable to Hadoop in these applications even with a wide variety of data types
      3. There are also many other excellent Hadoop use cases
         – Data landing zone
         – Archive
         – Intensive batch processing of data
      4. Example is one illustration of Hadoop sweet spot
  • 7. Business Example: Enterprise Data Warehouse
  • 8. Business Example - EDW: Data Management Requirements
      1. Data volume
         a. 500 TB to start – all retained for at least five years
         b. Continual growth of data and workload
      2. Data sources: thousands
         a. Data sources change their feeds frequently
         b. New data sources are frequent
      3. Challenges
         a. Data must be correct
         b. Data must be integrated
      4. Typical enterprise data lifetime: decades
      5. Analytic application lifetime: years
      6. Many thousands of data users (10^4 – 10^6)
      7. Hundreds of analytic applications
      8. Thousands of one time analyses
      9. Tens of thousands of complex queries
  • 9. Cost Comparison: Business Example – EDW
      [Chart, not to scale. Total cost of data: $265 million on EDW Platform, $740 million on Hadoop. Cost categories: Total System Cost, System and Data Admin, ETL, Application Development, Complex Queries, Analysis]
      Volume of Data: 500 TB
      System Cost: Data Warehouse Platform $45 million; Hadoop $1.4 million
      Total Cost of Data: Data Warehouse Platform $265 million; Hadoop $740 million
  • 10. Conclusions – Two TCOD Examples
      [Two charts in millions of dollars, each comparing "On Hadoop" with "On Data Warehouse". Cost categories: Total System Cost, System and Data Admin, Application Development, ETL, Complex Queries, Analysis]
      Data Refining (axis $0 to $35 million): Hadoop wins. Also: Landing Zone, Archive
      EDW (axis $0 to $800 million): Data W/H Platform wins
      1. TCOD is NOT platform cost
      2. Each technology has large advantages in its sweet spot(s)
      3. Neither platform is cost effective in the other's sweet spot
      4. Biggest differences for the data warehouse are the development of:
         – Complex queries
         – Analytics
  • 11. TCOD Framework: Additional Notes
      Not taken into account
         – Actual system workloads, concurrency, availability requirements
         – Cost of preparing simple queries
         – Cost of query execution
         – Workload management
         – Vendor supported distributions of Hadoop/Hadoop Appliances
         – ETL products available with Hadoop
      New Products Should Eventually Decrease TCOD with Hadoop
         – Cloudera Impala, IBM BigSQL, Teradata SQL-H, EMC Pivotal
         – New version of Hive supports subset of SQL
         – Further analysis, evaluation and measurement is required
  • 12. In Conclusion
      1. TCOD estimates what your company will really spend to get to your business goal.
      2. Total cost is extremely sensitive to technology choice.
      3. Analytic architectures will require both Hadoop and data warehouse platforms.
      4. Focus on total cost, not platform cost, in making your choice for a particular application or use.
      5. Many analytic processes will use both Hadoop and data warehouse technology – so integration counts!
      Questions and comments welcome at tcod@wintercorp.com
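
The point that TCOD is not platform cost can be checked with a little arithmetic on the figures quoted on slides 5 and 9. The sketch below is illustrative only and is not part of the TCOD spreadsheet: the dictionary layout and the helper names `system_share` and `cheapest` are assumptions made for this example, and only the dollar amounts come from the slides.

```python
# Figures in millions of US dollars, as quoted on slides 5 and 9.
# The data layout and helper names are hypothetical, chosen for illustration.
EXAMPLES = {
    "Data refining": {
        "Hadoop": {"system": 1.3, "tcod": 9.3},
        "Data warehouse appliance": {"system": 23.0, "tcod": 30.0},
    },
    "Enterprise data warehouse": {
        "Hadoop": {"system": 1.4, "tcod": 740.0},
        "Data warehouse platform": {"system": 45.0, "tcod": 265.0},
    },
}


def system_share(costs):
    """Return the fraction of total cost of data that is system cost."""
    return costs["system"] / costs["tcod"]


def cheapest(platforms, key):
    """Return the platform name with the lowest value for the given cost key."""
    return min(platforms, key=lambda name: platforms[name][key])


if __name__ == "__main__":
    for example, platforms in EXAMPLES.items():
        print(example)
        for name, costs in platforms.items():
            print(
                f"  {name}: system ${costs['system']}m, "
                f"TCOD ${costs['tcod']}m, "
                f"system share {system_share(costs):.1%}"
            )
        print(f"  lowest system cost: {cheapest(platforms, 'system')}")
        print(f"  lowest total cost:  {cheapest(platforms, 'tcod')}")
```

On these figures Hadoop has the lower system cost in both examples, yet it has the lower total cost only for data refining. In the EDW example the Hadoop system cost is well under 1% of its total cost of data, so comparing platforms on system cost alone would pick the more expensive option.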