© 2012 Datameer, Inc. All rights reserved.
Hadoop as a Data Hub:
A Sears Case Study
About our Speaker
Phil Shelley

Dr. Shelley is CTO at Sears Holdings
Corporation (SHC), leading IT Operations
and focusing on the modernization of IT
across the company.

Phil is also CEO of Metascale, a subsidiary
of Sears Holdings. Metascale is an IT
managed services company that makes Big
Data easy by designing, delivering and
operating Hadoop-based solutions for
Analytics, Mainframe Migration and
massive-scale processing, integrated into
the customers’ Enterprise.
About our Speaker
Stefan Groschupf

Stefan Groschupf is the co-founder and CEO of
Datameer and one of the original contributors to
Nutch, the open source predecessor of Hadoop.

Prior to Datameer, Stefan was the co-founder
and CEO of Scale Unlimited, which implemented
custom Hadoop analytic solutions for HP, Sun,
Deutsche Telekom, Nokia and others. Earlier,
Stefan was CEO of 101Tec, a supplier of
Hadoop and Nutch-based search and text
classification software to industry-leading
companies such as Apple, DHL and EMI Music.
Stefan has also served as CTO at multiple
companies, including Sproose, a social search
engine company.
Hadoop as a Data Hub
a new approach to data management
Dr. Phil Shelley
CTO Sears Holdings
CEO MetaScale
The Challenge
• Data Volume / Retention
• Batch Window Limits
• Escalating IT Costs
• Scalability
• Ever Evolving Business
• ETL Complexity / Costs
• Data Latency / Redundancy
• Tight IT Budgets
Challenges & Trends
Constant pressure to lower costs, deliver faster, migrate to real time
and answer more difficult questions…
Batch → Real-Time
Proprietary → Open Source
Capital → Cloud Expense
Heavy Iron → Commodity
Linear → Parallel Processing
Copy and Use → Source Once & Re-Use
Costs → Down
Power → Up
What is a Data Hub
A single, consolidated, fully
populated data archive that
gives unfettered user access to
analyze and report on data, with
appropriate security, as soon as
the data is created by the
transactional or other source
system
Why a Data Hub
• Most data latency is removed
• Users and analysts are put in a self-service mode
• The concept of a “data cube” is unnecessary
• Analysis at the lowest level – No need to run at the segment level
• Any question can be asked
• Business users and analysts have unrestricted ability to explore
• Correlation of any data set is immediately possible
• Significant reduction in reporting and analysis times
– Time to source the data
– Time for users to gain access to the data
• Reduction in IT labor ….
– Source Once – Use Many Times
The Traditional Approach
• Data is copied from source systems via ETL
• Sub-sets of data are captured
– Too expensive to keep all detail
– Takes too long to ETL all data fields from sources
• Each use of data generates more unique ETL jobs
• Data is segmented to reduce query times
• Cubes or views are generated to improve analysis speed
• Disparate data silos require ETL before users have access
• Data warehouse costs and performance limitations force
archiving and data truncation
• Tends to lead to different versions of “truth”
• Time lag or latency from data generation to use
Benefits - Hadoop as a Data Hub
• All data is available
– All history
– All detail
• No need to filter, segment or cube before use
• Data can be consumed almost immediately
• No need to silo into different databases to
accommodate performance limitations
• Users do not require IT to ETL data before use
• Security is applied via Datameer profiles
• User self-service is a reality
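The claim that data need not be filtered, segmented or cubed before use can be illustrated with a small sketch. The records, field names and the `total_by` helper below are hypothetical and are not taken from the Sears or Datameer systems; the point is only that a pre-aggregated cube answers the questions it was built for, while the retained detail can answer a new one.

```python
from collections import defaultdict

# Hypothetical transaction-level records, as they might land in the hub.
transactions = [
    {"store": "A", "category": "tools", "hour": 9, "amount": 20.0},
    {"store": "A", "category": "tools", "hour": 18, "amount": 35.0},
    {"store": "B", "category": "apparel", "hour": 9, "amount": 15.0},
    {"store": "B", "category": "tools", "hour": 18, "amount": 50.0},
]


def total_by(records, *keys):
    """Sum `amount` grouped by any combination of fields in the detail records."""
    totals = defaultdict(float)
    for record in records:
        totals[tuple(record[k] for k in keys)] += record["amount"]
    return dict(totals)


# A pre-built "cube" fixes its dimensions in advance (store x category here).
cube = total_by(transactions, "store", "category")

# A new question (sales by hour) cannot be answered from the cube, because the
# hour field was aggregated away, but it can be answered from the detail.
sales_by_hour = total_by(transactions, "hour")
```

In this toy setup the grouping keys are chosen at query time, which is the sense in which any question can be asked of the lowest-level data.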
Prerequisites
• An Enterprise data architecture that has a Data
Hub as a foundation
• Data sourcing must be controlled
• Metadata must be created for data sources
• A leader with the vision and capability to drive
• Willing business users to pilot and coach others
• A sustained strategy for Enterprise Data
Architecture and governance
• A carefully designed Hadoop data layer
architecture
Key Concepts
• A Data Hub is now reality
• Drives lower costs and reduces delays
• Time to value for data is reduced
• Business users and analysts are empowered
• The most important:
– Source Once – Re-use Many Times
– Source everything
– Retain everything
Key Learning
o ETL complexity is no longer needed – DATA HUB
– Source Once – Re-Use many times
– ETL is transformed to ELTTTTTT with lower data latency
– Consume data in-place with Datameer
o ETL-induced data latency is largely eliminated
– Analysis is routinely possible within minutes of data creation
o Long-running overnight workload on Legacy Systems
– Can be eliminated and executed at any time
– Run times are a fraction of the original clock-time
o Batch processing on mainframes or other conventional batch
– Moved to Hadoop
– Run 10, 50, even 100 times faster
o Intelligent Archive
– Put your archives/tape data on Hadoop and make it Intelligent
– Archive with the ability to run analytics or join it with other data
o Modernize Legacy
– Mainframe MIPS reduction has very attractive ROI
– Move Data Warehouse workload – Reduce Cost – Go Faster
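The "ELTTTTTT" idea (extract and load once, then transform as many times as needed) can be sketched roughly as follows. This is a minimal illustration in plain Python rather than Hadoop or Datameer code; `raw_store`, `load`, the record fields and the two transform functions are all assumed names invented for the example.

```python
import json

# Hypothetical raw store: records are kept exactly as the source emitted them.
raw_store = []


def load(raw_lines):
    """Extract + Load: keep every record and every field, with no transformation."""
    raw_store.extend(raw_lines)


def daily_revenue(store):
    """Transform 1: revenue per day, computed when the data is read."""
    totals = {}
    for line in store:
        rec = json.loads(line)
        totals[rec["date"]] = totals.get(rec["date"], 0.0) + rec["amount"]
    return totals


def customers_per_sku(store):
    """Transform 2: distinct customers per SKU, from the same raw records."""
    seen = {}
    for line in store:
        rec = json.loads(line)
        seen.setdefault(rec["sku"], set()).add(rec["customer"])
    return {sku: len(customers) for sku, customers in seen.items()}


# Source once...
load([
    '{"date": "2012-06-01", "sku": "X1", "customer": "c1", "amount": 10.0}',
    '{"date": "2012-06-01", "sku": "X2", "customer": "c2", "amount": 5.0}',
    '{"date": "2012-06-02", "sku": "X1", "customer": "c2", "amount": 7.5}',
])

# ...and re-use many times, without a separate ETL job per use.
revenue = daily_revenue(raw_store)
reach = customers_per_sku(raw_store)
```

Each new use adds a transform over the data already loaded, instead of a new extraction from the source system.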
Sample Reports - Datameer
Questions and Answers
Online Resources
• Try Datameer: www.datameer.com
• Visit Metascale: www.metascale.com
• Follow us on Twitter @datameer & @BigDataMadeEasy

Google AI Hackathon: LLM based Evaluator for RAG
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 

Hadoop as a Data Hub, featuring Sears

  • 1. © 2012 Datameer, Inc. All rights reserved. Hadoop as a Data Hub: A Sears Case Study
  • 2. About our Speaker: Phil Shelley. Dr. Shelley is CTO at Sears Holdings Corporation (SHC), leading IT Operations and focusing on the modernization of IT across the company. Phil is also CEO of Metascale, a subsidiary of Sears Holdings. Metascale is an IT managed services company that makes Big Data easy by designing, delivering and operating Hadoop-based solutions for Analytics, Mainframe Migration and massive-scale processing, integrated into the customers’ Enterprise.
  • 3. About our Speaker: Stefan Groschupf. Stefan Groschupf is the co-founder and CEO of Datameer and one of the original contributors to Nutch, the open source predecessor of Hadoop. Prior to Datameer, Stefan was the co-founder and CEO of Scale Unlimited, which implemented custom Hadoop analytic solutions for HP, Sun, Deutsche Telekom, Nokia and others. Earlier, Stefan was CEO of 101Tec, a supplier of Hadoop and Nutch-based search and text classification software to industry-leading companies such as Apple, DHL and EMI Music. Stefan has also served as CTO at multiple companies, including Sproose, a social search engine company.
  • 4. Hadoop as a Data Hub: a new approach to data management. Dr. Phil Shelley, CTO Sears Holdings, CEO MetaScale
  • 5. Challenges & Trends. The Challenge: Data Volume / Retention • Batch Window Limits • Escalating IT Costs • Scalability • Ever-Evolving Business • ETL Complexity / Costs • Data Latency / Redundancy • Tight IT Budgets. Constant pressure to lower costs, deliver faster, migrate to real time and answer more difficult questions… Trends: Batch → Real-Time • Proprietary → Open Source • Capital → Cloud Expense • Heavy Iron → Commodity • Linear → Parallel Processing • Copy and Use → Source Once & Re-Use • Costs → Down • Power → Up
  • 6. What is a Data Hub? A single, consolidated, fully populated data archive that gives unfettered user access to analyze and report on data, with appropriate security, as soon as the data is created by the transactional or other source system.
  • 7. Why a Data Hub • Most data latency is removed • Users and analysts are put in a self-service mode • The concept of a “data cube” is unnecessary • Analysis at the lowest level – No need to run at the segment level • Any question can be asked • Business users and analysts have unrestricted ability to explore • Correlation of any data set is immediately possible • Significant reduction in reporting and analysis times – Time to source the data – Time for users to gain access to the data • Reduction in IT labor – Source Once – Use Many Times
  • 8. The Traditional Approach • Data is copied from source systems via ETL • Sub-sets of data are captured – Too expensive to keep all detail – Takes too long to ETL all data fields from sources • Each use of data generates more unique ETL jobs • Data is segmented to reduce query times • Cubes or views are generated to improve analysis speed • Disparate data silos require ETL before users have access • Data warehouse costs and performance limitations force archiving and data truncation • Tends to lead to different versions of “truth” • Time lag or latency from data generation to use
  • 9. Benefits - Hadoop as a Data Hub • All data is available – All history – All detail • No need to filter, segment or cube before use • Data can be consumed almost immediately • No need to silo into different databases to accommodate performance limitations • Users do not require IT to ETL data before use • Security is applied via Datameer profiles • User self-service is a reality
  • 10. Prerequisites • An Enterprise data architecture that has a Data Hub as a foundation • Data sourcing must be controlled • Metadata must be created for data sources • A leader with the vision and capability to drive • Willing business users to pilot and coach others • A sustained strategy for Enterprise Data Architecture and governance • A carefully designed Hadoop data layer architecture
  • 11. Key Concepts • A Data Hub is now reality • Drives lower costs and reduces delays • Time to value for data is reduced • Business users and analysts are empowered • The most important: – Source Once – Re-use Many Times – Source everything – Retain everything
  • 12. Key Learning o ETL complexity is no longer needed – DATA HUB – Source Once – Re-Use many times – ETL is transformed to ELTTTTTT with lower data latency – Consume data in-place with Datameer o ETL-induced data latency is largely eliminated – Analysis is routinely possible within minutes of data creation o Long-running overnight workload on Legacy Systems – Can be eliminated and executed at any time – Run times are a fraction of the original clock-time o Batch processing on mainframes or other conventional batch – Moved to Hadoop – Run 10, 50, even 100 times faster o Intelligent Archive – Put your archives/tape data on Hadoop and make it Intelligent – Archive with the ability to run analytics or join it with other data o Modernize Legacy – Mainframe MIPS reduction has very attractive ROI – Move Data Warehouse workload – Reduce Cost – Go Faster
  • 13. Sample Reports - Datameer
  • 14. Questions and Answers
  • 15. Online Resources • Try Datameer: www.datameer.com • Visit Metascale: www.metascale.com • Follow us on Twitter @datameer & @BigDataMadeEasy
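
The "Source Once, Re-Use Many Times" idea from slides 11 and 12 (ETL becoming ELT with many transforms) can be illustrated with a minimal Python sketch. It is not taken from the presentation and does not use Hadoop or Datameer: the in-memory list `RAW_HUB`, the record fields, and the helper functions `revenue_by` and `units_sold` are all hypothetical stand-ins chosen for illustration. The point is only that full-detail records are loaded once, and each new question is a transform applied at read time rather than a new extraction job.

```python
from collections import defaultdict

# Hypothetical raw transaction records, loaded once into the "hub" with no
# filtering, segmenting or pre-aggregation (the "E" and "L" of ELT).
RAW_HUB = [
    {"store": "A", "sku": "hammer", "qty": 2, "price": 10.0},
    {"store": "A", "sku": "drill", "qty": 1, "price": 80.0},
    {"store": "B", "sku": "hammer", "qty": 5, "price": 10.0},
]


def revenue_by(key, records):
    """Transform at read time: total revenue grouped by any record field."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["qty"] * rec["price"]
    return dict(totals)


def units_sold(sku, records):
    """A second, unrelated question asked of the same raw detail."""
    return sum(rec["qty"] for rec in records if rec["sku"] == sku)


# Three different questions, one sourced copy of the data.
revenue_per_store = revenue_by("store", RAW_HUB)
revenue_per_sku = revenue_by("sku", RAW_HUB)
hammer_units = units_sold("hammer", RAW_HUB)
```

Because nothing was dropped or cubed at load time, a new grouping such as revenue per SKU needs no additional sourcing step; in the traditional approach described on slide 8, each of these questions would typically have required its own ETL job.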