Modern Data Architecture: In-Memory with Hadoop - the new BI
1. Hadoop and the new BI:
The Modern Data Architecture
…for in memory Big Data Analytics
10 December 2013
2. Quick Housekeeping
Q&A box is available for your questions
Webinar will be recorded for future viewing
Thank You for joining!
© Hortonworks Inc. 2013
4. Your Presenters
• Paul Groom (@datagroom)
– Chief Innovation Officer
– 28 years buried in the big data of the data
guiding business users to value
– Two wheels are more fun than four
• John Kreisa (@marked_man)
– VP Strategic Marketing, Hortonworks
– Over 20 years in data management as a
developer and a marketer
– Avid camper
5. Today’s Topics
• Introduction
• Drivers for the Modern Data Architecture (MDA)
• Apache Hadoop in the MDA
• Kognitio’s role in the MDA
• Q&A
7. Modern Data Architecture Enabled
[Diagram: layers of the Modern Data Architecture]
• APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
• DEV & DATA TOOLS: BUILD & TEST
• DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
• OPERATIONAL TOOLS: MANAGE & MONITOR
• SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
8. Hadoop Powers Modern Data Architecture
[Diagram: Hadoop Cluster made up of many nodes, each providing compute & storage]
Hadoop clusters provide
scale-out storage and
distributed data processing
on commodity hardware
Apache Hadoop is an open source project
governed by the Apache Software Foundation
(ASF) that allows you to gain insight from massive
amounts of structured and unstructured data
quickly and without significant investment.
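To make "distributed data processing" concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps that a Hadoop MapReduce job performs across a cluster. It is an illustration only, not Hadoop API code: the function names, the "user_id page_url" record format and the sample data are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    """Emit (page, 1) for one clickstream record."""
    # Hypothetical record format: "user_id page_url"
    _user, page = record.split()
    return [(page, 1)]

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

def run_job(splits):
    """Map every record of every split, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_phase(record) for split in splits for record in split
    )
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

# Two splits stand in for data blocks stored on two different nodes.
splits = [
    ["u1 /home", "u2 /pricing"],
    ["u3 /home", "u1 /pricing", "u4 /home"],
]
page_views = run_job(splits)
```

On a real cluster each split would be mapped on the node that stores it, which is what lets storage and compute scale out together.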
9. Drivers of Hadoop Adoption
• New Business Applications
– From NEW types of data (or existing types for longer)
10. Most Common NEW TYPES OF DATA
1. Sentiment
Understand how your customers feel about your brand and
products – right now
2. Clickstream
Capture and analyze website visitors’ data trails and
optimize your website
3. Sensor/Machine
Discover patterns in data streaming automatically from
remote sensors and machines
4. Geographic
Analyze location-based data to manage operations where
they occur
5. Server Logs
Research logs to diagnose process failures and prevent
security breaches
6. Unstructured (txt, video, pictures, etc.)
Understand patterns in files across millions of web pages,
emails, and documents
11. Keep Existing Data Around Longer
• Online archive
– Data that was once moved to tape can now be queried to understand long-term trends
• Compliance retention
– Industry-specific requirements for retention of data
• Combine with external historical data sources
– Weather, survey, research, purchased, etc.
12. Drivers of Hadoop Adoption
• Architectural: A Modern Data Architecture
– Complement your existing data systems: the right workload in the right place
• New Business Applications
13. Requirements for Hadoop Adoption
Requirements for Hadoop’s Role in the Modern Data Architecture
• Integrated: Interoperable with existing data center investments
• Key Services: Platform, operational and data services essential for the enterprise
• Skills: Leverage your existing skills: development, operations, analytics
14. Requirements for Enterprise Hadoop
1. Key Services: Platform, Operational and Data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integrated: Engineered with existing data center investments
[Diagram: HORTONWORKS DATA PLATFORM (HDP)]
• OPERATIONAL SERVICES: AMBARI, FALCON*, OOZIE
• DATA SERVICES: HBASE, PIG, HIVE & HCATALOG
• LOAD & EXTRACT: SQOOP, FLUME, NFS, WebHDFS, KNOX*
• CORE and PLATFORM SERVICES: MAP REDUCE, TEZ, YARN, HDFS
• Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
• OS/VM, Cloud, Appliance
15. Requirements for Enterprise Hadoop
1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integration: Engineered with existing data center investments
• DEVELOP: COLLECT, PROCESS, BUILD
• ANALYZE: EXPLORE, QUERY, DELIVER
• OPERATE: PROVISION, MANAGE, MONITOR
16. Familiar and Existing Tools
1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integration: Engineered with existing data center investments
• DEVELOP: COLLECT, PROCESS, BUILD
• ANALYZE: EXPLORE, QUERY, DELIVER
• OPERATE: PROVISION, MANAGE, MONITOR
17. Requirements for Enterprise Hadoop
3. Integration: Engineered with existing data center investments
Integrated with:
• Applications: Business Intelligence, Developer IDEs, Data Integration
• Systems: Data Systems & Storage, Systems Management
• Platforms: Operating Systems, Virtualization, Cloud, Appliances
[Diagram: layers of the Modern Data Architecture]
• APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
• DEV & DATA TOOLS: BUILD & TEST
• DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
• OPERATIONAL TOOLS: MANAGE & MONITOR
• SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
18. A Modern Data Architecture Applied
• Complement data systems
• Right workload right place
[Diagram: layers of the Modern Data Architecture]
• APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
• DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
• SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
19. Kognitio in the Modern Data Architecture
[Diagram: layers of the Modern Data Architecture with Kognitio]
• APPLICATIONS: Business Analytics, Business Intelligence Tools, OLAP Clients
• DEV & DATA TOOLS: BUILD & TEST
• DATA SYSTEM: In-memory MPP Accelerator, REPOSITORIES (RDBMS, EDW, MPP)
• OPERATIONAL TOOLS: MANAGE & MONITOR
• SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
20. Kognitio in the Modern Data Architecture
[Diagram: layers of the Modern Data Architecture with Kognitio]
• APPLICATIONS: BusinessObjects BI
• DEV & DATA TOOLS
• DATA SYSTEM: In-memory MPP Accelerator, RDBMS, HANA, EDW, MPP
• OPERATIONAL TOOLS
• SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
• INFRASTRUCTURE
21. Today’s Topics
• Introduction
• Drivers for the Modern Data Architecture (MDA)
• Apache Hadoop’s role in the MDA
• Kognitio’s role in the MDA
• Q&A
22. Hadoop and the new BI
Requirements for Hadoop’s Role in the Modern Data Architecture
1. Integrated: Interoperable with existing data center investments
2. Skills: Leverage your existing skills: development, operations, analytics
3. Key Services: Platform, operational and data services essential for the enterprise
23. Motivation
• Historical architecture = Existing investment
1. Key Services: Platform, Operational and Data services essential for the enterprise
Cognos
• Must plug-and-play with MDA
– Do not disrupt, enhance!
• Performance and behavior expectations
– Dynamic ad-hoc access
– Drill unlimited
– Report on-demand
26. In-memory analytical platform
• Software only
– Easy to deploy alongside HDP
– Simple two stage install
• Commodity Hardware
– X86/64 Linux Platform with 10GbE network – same as HDP
– Biased to more RAM and less disk
3. Integration: Engineered with existing data center investments
• Scale-out MPP
– Same compute model as Hadoop
– Strong focus on 100% effective CPU utilization for any given query
• Exploits features of underlying persistent store
– Simple ‘Pull data’ access methods
– Parallelism – all HDP nodes intercommunicating with all Kognitio nodes
• ANSI 2011 SQL
– Mature fully featured
– Transaction processing capable
• Not-only-SQL
– Any script or binaries executed in-line within SQL queries
2. Skills: Leverage your existing skills: development, analytics, operations
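As a rough illustration of the scale-out MPP idea on this slide, the sketch below hash-distributes rows across a few simulated nodes, lets each node aggregate only its own slice, and then merges the partial results. It is a toy model under stated assumptions, not Kognitio's implementation: the function names, the (customer_id, amount) row shape and the modulo distribution rule are all hypothetical.

```python
from collections import Counter

def partition(rows, node_count):
    """Hash-distribute rows across nodes by customer id (shared-nothing)."""
    nodes = [[] for _ in range(node_count)]
    for customer_id, amount in rows:
        nodes[customer_id % node_count].append((customer_id, amount))
    return nodes

def local_aggregate(node_rows):
    """Each node computes SUM(amount) GROUP BY customer_id on its own slice."""
    totals = Counter()
    for customer_id, amount in node_rows:
        totals[customer_id] += amount
    return totals

def mpp_sum_by_customer(rows, node_count=4):
    """Merge the per-node partial sums into the final result."""
    result = Counter()
    for node_rows in partition(rows, node_count):
        result.update(local_aggregate(node_rows))
    return dict(result)

# Hypothetical (customer_id, amount) rows.
rows = [(1, 10), (2, 5), (1, 7), (6, 3), (2, 1)]
totals = mpp_sum_by_customer(rows, node_count=4)
```

Because every row for a given customer lands on the same node, the per-node work is independent and the answer does not change with the number of nodes; only the amount of work per node does.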
27. Tight Integration
3. Integration: Engineered with existing data center investments
• Map-reduce Connector
– Filtered access
• HDFS Connector
– Low Latency access
28. So why In-memory?
INSTANT WAIT
• Exploit the ‘Dynamic’ access element of ‘D’-RAM
– Data placed in memory in structures best suited for CPUs, not for disks
30. Building Data Models
• Hadoop is a great repository
• Perfect to handle volume and variability without effort
• Perfect to ‘triage’ the data, to reshape, filter and project into…
• Data Virtualisation / Logical Data Warehouse
… but with the associated horsepower to dynamically analyse the data
• Plug standard tools straight in – not a Java programmer in sight!
• Central control and security
• Data model shelf life getting shorter – sandboxes and workbenches
– Build on-demand to meet today’s needs – just pull data from your HDP
– Lots of project based discovery and analytics
– World is changing rapidly
– Ever tighter feedback loops
31. Analytical Complexity
[Chart: analytical workloads arranged by Increasing Computation against Technology/Automation. Labels shown: Reporting & BPM, Campaign Management, Fraud detection, Dynamic Interaction, Clustering, Dynamic Simulation, Statistical Analysis, Behaviour modelling, Machine learning algorithms]
33. Mature SQL atop Hadoop
Kognitio is an in-memory analytical platform that is tightly integrated with Hadoop for high-performance advanced analytics that make Big Data more consumable for enterprises, especially those with mature BI environments or ingrained tools.
• Powering advanced analytics at
organizations worldwide, such as:
• Privately held
• Invented the in‐memory analytical platform
• Labs in the UK ‐ HQ in New York, NY
34. Kognitio in the Modern Data Architecture
[Diagram: layers of the Modern Data Architecture with Kognitio]
• APPLICATIONS: Business Analytics, Business Intelligence Tools, OLAP Clients
• DEV & DATA TOOLS: BUILD & TEST
• DATA SYSTEM: In-memory MPP Accelerator, REPOSITORIES (RDBMS, EDW, MPP)
• OPERATIONAL TOOLS: MANAGE & MONITOR
• SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
35. Forrester Wave: a “strong performer”
• Kognitio’s EDW is a strong, cost-effective alternative to SAP HANA.
• Kognitio…was designed from the start as an MPP (distributed) in-memory RDBMS, making extensive use of RAM-based processing for maximum performance.
• Kognitio’s entirely in-memory, distributed EDW is appealing for customers looking for fast performance on commodity hardware
© Forrester Corp. Used with permission.
Download a complimentary copy of the full report at www.kognitio.com/wave
36. The Modern Data Architecture
…for in memory Big Data Analytics
More about Kognitio and Hortonworks
http://hortonworks.com/partner/kognitio
Get started with Hortonworks Sandbox
http://hortonworks.com/hadoop-tutorial/
Follow us:
@hortonworks @kognitio
Question & Answer session will be conducted electronically,
using the panel to the right of your screen
Today’s Slides available at: www.slideshare.net/kognitio