© 2014 IBM Corporation
Open '14
Analyzing Big Data
Jeff Scheel
Chief Engineer
Linux on Power
June 2, 2014
scheel@us.ibm.com
Agenda
1. Getting started with Big Data
2. OpenPOWER Foundation
3. The future of Analytics
Getting started with Big Data
Big Data is growing
and moving fast from
a variety of sources,
are you keeping up?
• 1 trillion connected
devices generate 2.5
quintillion bytes of data
per day
• 80% of the world’s data
today is unstructured
• 1 in 2 business leaders
don’t have access to
data they need
“Data is the new oil”
In its raw form, oil has little value. Once processed and refined, it helps power the world.
“Big Data has arrived at Seton
Health Care Family, fortunately
accompanied by an analytics tool
that will help deal with the
complexity of more than two
million patient contacts a year…”
“Data is the new oil.”
Clive Humby
“At the World Economic Forum
last month in Davos,
Switzerland, Big Data was a
marquee topic. A report by the
forum, “Big Data, Big Impact,”
declared data a new class of
economic asset, like currency or
gold.”
“Increasingly, businesses are
applying analytics to social
media such as Facebook and
Twitter, as well as to product
review websites, to try to
“understand where customers
are, what makes them tick and
what they want”, says Deepak
Advani, who heads IBM’s
predictive analytics group.”
“Companies are being inundated
with data—from information on
customer-buying habits to
supply-chain efficiency. But many
managers struggle to make
sense of the numbers.”
The challenge: handling the large Volume, Variety, Velocity, and
Veracity of data to find new insights and improve business outcomes
Analytic Applications
• BI / Reporting
• Exploration / Visualization
• Functional App
• Industry App
• Predictive Analytics
• Content Analytics
IBM Big Data Platform
• Visualization & Discovery
• Application Development
• Systems Management
• Accelerators
• Hadoop System
• Stream Computing
• Data Warehouse
• Information Integration & Governance
MFG - Analyze & correlate
log records to improve
service and predict failures
Telco - Address
customer satisfaction,
Predict churn, and match
promotions in real time
Healthcare - Detect life-
threatening conditions at
hospitals in time to
intervene
Retail - Multi-channel
customer sentiment and
experience analysis
Financial Services - Make
risk decisions based on
real-time transactional
data
Law Enforcement -
Identify criminals and
threats from video, audio
feeds
Customers are deploying new infrastructure to leverage all data types
Data in
Motion
Data at
Rest
Data in
Many Forms
Information
Ingestion and
Operational
Information
Decision
Management
BI and Predictive
Analytics
Navigation
and Discovery
Intelligence
Analysis
Landing Area,
Analytics Zone
and Archive
 Raw Data
 Structured Data
 Text Analytics
 Data Mining
 Entity Analytics
 Machine Learning
Real-time
Analytics
 Video/Audio
 Network/Sensor
 Entity Analytics
 Predictive
Exploration,
Integrated
Warehouse, and
Mart Zones
 Discovery
 Deep
Reflection
 Operational
 Predictive Stream Processing
 Data Integration
 Master Data
Streams
Information Governance, Security and Business Continuity
Hadoop Infrastructure – currently being
deployed on commodity hardware
WATSON
Two new Watson-based products:
• Interactive Care Insights for Oncology
• The WellPoint Interactive Care Guide and
Interactive Care Reviewer
IBM and Red Hat
innovating in Healthcare
with Watson
• Watson's oncology education:
• 600,000 pieces of medical
evidence
• 2 million pages of text
• 25,000 training cases
• Watson can review
1.5 million patient records
in less time than it takes most office
computers to boot up
Big Data implementation patterns
• Common analysis of structured & unstructured data
• Warehouse and BigInsights partitioning
• Warehouse batch offload
• Separate unstructured & structured analysis
[Diagrams: each pattern places Hadoop and/or the Warehouse beneath App / BI, Visualization, and Exploration layers, fed by structured and unstructured data.]
What the experts say
1. Seek project input from Sales,
Marketing, and Operations
teams
2. Select projects which are well-
defined and have quick ROI –
less than a year
3. Leverage your experiences
from data warehouse and
business intelligence projects
4. Avoid starting with a “Big Bang”
Source: http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=SA&subtype=WH&htmlfid=POL03133USEN
More ideas for starting
[Diagram: Separate unstructured & structured analysis. The existing BI stack (Warehouse with App / BI, Visualization, Exploration) sits beside a new Hadoop stack (App / BI, Visualization, Exploration).]
 Find a small problem to solve, e.g. an
internal phone directory, and start
“on-the-side”.
 Locate relevant data and identify
pieces that are “in motion” or “at
rest”.
 For data at rest, build open source Hadoop on your PowerLinux
system or try the InfoSphere BigInsights Basic Edition (no charge).
 For data in motion, use the InfoSphere Streams trial download.
 Reference the IBM Information Center for details on
how to import data into Hadoop and
how to write applications using Streams Studio.
 Explore Datameer to visualize your Hadoop based Big Data
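The difference between data "at rest" and data "in motion" can be sketched in a few lines of plain Python. This is only an illustration, not Hadoop or InfoSphere Streams code: the function names, the department labels, and the window size are all hypothetical. The batch function sees the whole data set at once, while the streaming function keeps counts for only the most recent records.

```python
from collections import Counter, deque


def count_at_rest(records):
    """Batch style: the full data set is available, so count it in one pass."""
    return Counter(records)


def count_in_motion(stream, window):
    """Streaming style: yield counts over only the last `window` records."""
    recent = deque()
    counts = Counter()
    for record in stream:
        recent.append(record)
        counts[record] += 1
        if len(recent) > window:
            # Expire the oldest record so memory use stays bounded.
            old = recent.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        yield Counter(counts)  # snapshot of the current window


# Hypothetical sample: department field from phone directory lookups.
departments = ["sales", "ops", "sales", "marketing", "ops", "sales"]
batch_counts = count_at_rest(departments)
window_counts = list(count_in_motion(iter(departments), window=3))
print(batch_counts)
print(window_counts[-1])
```

The batch result answers "what is in the data set", while each streaming snapshot answers "what has been happening recently", which is why the two kinds of data usually call for different tools.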
PowerLinux jump start services facilitate starting with Big
Data Analytics
5 Day IBM Power Analytics
Services Jump Start
Includes:
• 5 days, on-site service offering
• Quick Analytics Assessment Workshop
• Software Installation
• Hands on education in getting started
• Evaluating the analytical approach for your
business that will make the biggest impact
• Quick sample application to consume
customer data
Reference Architecture Workshop
Why Jump Start Services for your
IBM Power Analytics solution?
• Learn how to optimally leverage IBM Power
System for Analytics
• Learn the benefits and reasoning of Big Data
• Learn how to gain business value from the
data you have
2 Day IBM Power Analytics
Services Jump Start
Includes:
• 2 days, on-site Big Data Analytics service
offering
• Software installation
• Hands on education in getting started
• Evaluating the analytical approach for your
business that will make the biggest impact
IBM Systems Lab Services & Training - Power Systems
Services for PowerLinux, AIX, and OS
Contact – Linda Hoben, Opportunity Manager, hoben@us.ibm.com
IBM Power Servers are an ideal
platform for streaming data and
performing analytic computations for
a multitude of applications.
Let us help make you successful!
IBM POWER has a strong history in transactional processing
workloads
[Chart: tpcC and $/tpcC by system generation]
System     tpcC       $/tpcC
S70        1,556      $109.00
S7A        2,845      $89.00
S80        5,669      $52.70
S85        9,200      $43.00
p690       12,602     $17.80
p690+      23,871     $8.31
p690++     32,046     $5.42
p5-595     50,164     $5.19
p5-595+    63,021     $2.97
P6 595     95,081     $2.81
P7 780     150,000    $0.69
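To put the trend in the chart into numbers, the short calculation below compares the first and last systems. The figures are those transcribed from the slide. Pairing each value with a system by order of appearance is an assumption, as are the variable names.

```python
# Values as they appear on the slide, paired with systems by position (assumed).
systems = ["S70", "S7A", "S80", "S85", "p690", "p690+", "p690++",
           "p5-595", "p5-595+", "P6 595", "P7 780"]
tpcc = [1556, 2845, 5669, 9200, 12602, 23871, 32046,
        50164, 63021, 95081, 150000]
dollars_per_tpcc = [109.00, 89.00, 52.70, 43.00, 17.80, 8.31, 5.42,
                    5.19, 2.97, 2.81, 0.69]

# Ratio of the last system to the first on each axis.
throughput_gain = tpcc[-1] / tpcc[0]
cost_reduction = dollars_per_tpcc[0] / dollars_per_tpcc[-1]

for name, t, d in zip(systems, tpcc, dollars_per_tpcc):
    print(f"{name:<8} {t:>8,} tpcC  ${d:>7.2f}/tpcC")
print(f"Throughput gain, {systems[0]} to {systems[-1]}: {throughput_gain:.1f}x")
print(f"Cost per tpcC reduction: {cost_reduction:.1f}x")
```

On these figures, throughput rises roughly 96-fold from the S70 to the P7 780, while cost per tpcC falls by a factor of about 158.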
POWER8 Processor
Caches
• 512 KB SRAM L2 / core
• 96 MB eDRAM shared L3
• Up to 128 MB eDRAM L4
(off-chip)
Cores
• 12 cores (SMT8)
• 8 dispatch, 10 issue,
16 exec pipe
• 2X internal data
flows/queues
• Enhanced prefetching
• 64K data cache,
32K instruction cache
Accelerators
• Crypto & memory expansion
• Transactional Memory
• VMM assist
• Data Move / VM Mobility
Energy Management
• On-chip Power Management Micro-controller
• Integrated Per-core VRM
• Critical Path Monitors
Technology
• 22nm SOI, eDRAM, 15 ML, 650 mm²
Memory
• Up to 230 GB/s
sustained bandwidth
Bus Interfaces
• Durable open memory
attach interface
• Integrated PCIe Gen3
• SMP Interconnect
• CAPI (Coherent
Accelerator Processor
Interface)
ComputerWorld: To make the chip faster, IBM has
turned to a more advanced manufacturing process,
increased the clock speed and added more cache
memory, but perhaps the biggest change heralded
by the Power8 cannot be found in the specifications.
After years of restricting Power processors to its
servers, IBM is throwing open the gates and will be
licensing Power8 to third-party chip and component
makers.
The Register: the Power8 is so clearly engineered
for midrange and enterprise systems for running
applications on a giant shared memory space,
backed by lots of cores and threads. Power8 does
not belong in a smartphone unless you want one the
size of a shoebox that weighs 20 pounds. But it most
certainly does belong in a badass server, and
Power8 is by far one of the most elegant chips that
Big Blue has ever created, based on the initial specs.
PCWorld: With Power8, IBM has more than doubled
the sustained memory bandwidth from the Power7
and Power7+, to 230 GB/s, as well as I/O speed, to
48 GB/s. Put another way, Watson’s ability to look up
and respond to information has more than doubled
as well.
Microprocessor report: Called Power8, the new
chip delivers impressive numbers, doubling the
performance of its already powerful predecessor,
Power7+. Oracle currently leads in server-processor
performance, but IBM’s new chip will crush those
records. The Power8 specs are mind boggling.
Source: Hot Chips presentation
POWER8 delivers 2.5x performance on Big Data / Hadoop
POWER8 reduces the number of servers by 60% based on the best x86 published Terasort
result
 POWER8 S822L will deliver over 2x the
performance of the best published x86 system
… and continues to offer far superior RAS
 POWER8 delivers 1.7X over HP on a
per-core normalized benchmark.
 POWER8 exploits additional cores, more
threads, larger caches, memory bandwidth
 Terasort is a popular benchmark to measure
the performance of a Hadoop solution
 Sorts a large dataset (10 TB) in parallel
 Exercises the MapReduce framework
and Hadoop Distributed File System
(HDFS)
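The idea behind a parallel sort of this kind can be sketched in a single Python process. This is a toy illustration only: the real benchmark runs as a MapReduce job over HDFS, and the function name, sample size, and partition count here are hypothetical. The sketch samples the keys to choose split points, routes every record to a key range, sorts each range independently, and concatenates the results.

```python
import bisect
import random


def toy_terasort(records, num_partitions, sample_size=100, seed=0):
    """Range-partition the records, sort each partition, and concatenate."""
    if not records:
        return []
    rng = random.Random(seed)
    # 1. Sample the keys to pick split points that balance the partitions.
    sample = sorted(rng.sample(records, min(sample_size, len(records))))
    step = len(sample) / num_partitions
    splits = [sample[int(i * step)] for i in range(1, num_partitions)]
    # 2. "Map" phase: route each record to the partition for its key range.
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        partitions[bisect.bisect_right(splits, record)].append(record)
    # 3. "Reduce" phase: each partition is sorted on its own; on a cluster
    #    these sorts would run in parallel on different nodes.
    sorted_parts = [sorted(part) for part in partitions]
    # 4. Partitions cover increasing key ranges, so concatenation is sorted.
    return [record for part in sorted_parts for record in part]


data_rng = random.Random(42)
data = [data_rng.randrange(1_000_000) for _ in range(1000)]
result = toy_terasort(data, num_partitions=4)
print(result[:5], "...", result[-5:])
```

Because the partitions never overlap in key range, no merge step is needed after the per-partition sorts, which is what lets the work spread across many cores and nodes.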
[Chart: Relative System Performance, POWER8 vs. Cisco, on a scale of 0 to 3. POWER8 is labeled 2.5x, with a >2x callout.]
IBM Analytics Stack: IBM Power System S822L; 24 cores / 192 threads, POWER8; 3.0GHz, 512 GB memory, RHEL 6.5, InfoSphere BigInsights 3.0
Compared to a 16-core HP system
http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns944/le_tera.pdf
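The 60% server reduction follows from the 2.5x performance figure if throughput is assumed to scale linearly with server count. That assumption, the function names, and the ten-server example cluster are illustrative and not taken from the slide.

```python
import math


def server_reduction(speedup):
    """Fraction of servers saved when each new server does `speedup` times the work."""
    return 1.0 - 1.0 / speedup


def servers_needed(current_servers, speedup):
    """Servers required to match the current cluster, rounded up to whole machines."""
    return math.ceil(current_servers / speedup)


reduction = server_reduction(2.5)
replacement = servers_needed(10, 2.5)  # hypothetical 10-server x86 cluster
print(f"Server reduction at 2.5x: {reduction:.0%}")
print(f"Servers needed to replace 10: {replacement}")
```

At 2.5x, each new server replaces 2.5 old ones, so only 1/2.5 = 40% of the machines remain, which is the 60% reduction quoted.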
New IBM Power Systems based on POWER8 (1 & 2 Sockets)
Power Systems S812L
• 1-socket, 2U
• Linux Only
Power Systems S822L
• 2-socket, 2U
• Linux Only
Power Systems S822
• 2-socket, 2U
• All Operating Systems
Power Systems S814
• 1-socket, 4U
• All Operating Systems
Power Systems S824
• 2-socket, 4U
• All Operating Systems
Power Systems S824L
• 2-socket, 4U
• Linux Only
• SOD
OpenPOWER Foundation – The emerging
ecosystem
© OpenPOWER Foundation 2014
Industry trends
• The number of companies designing & building servers is
increasing
– Traditionally there have been few companies designing systems: HP, IBM, SUN, Dell,
etc.
– Today there are many more: Google, Microsoft, Facebook, Rackspace, Huawei,
Sugon, Inspur, etc.
– A fairly mature ecosystem including the Taiwanese ODMs is a key enabler of this
trend
• Numerous disruptive forces are impacting these custom
system designs and driving designers to consider new ways of
innovating
– Ability to handle rapid growth in Big Data & Analytics based solutions
– Choice and Innovation
– CPU SOC integration drives the need for chip development
• These trends create a need for a server targeted “chip-system-
software” ecosystem
– IBM has technology and a software stack ready to meet these needs
– IBM recognizes the need to work with partners to create this ecosystem
– IBM recognizes the need for choice and options in processor sourcing
OpenPOWER Foundation Structure
OpenPOWER is an industry foundation based on the POWER architecture, enabling an Open
community for development and opportunity for member differentiation and growth
Building collaboration and innovation at all levels
Welcoming new members in all areas of the ecosystem
100+ inquiries and numerous active dialogues underway
Boards/Systems
I/O, Storage, Acceleration
Chip/SOC
System/Software/Services
OpenPOWER Proposed Ecosystem Enablement
xCAT
System Operating Environment Software Stack
A modern development environment is emerging
based on tools and services
Cloud
Software
Operating
System / KVM
Standard Operating
Environment
(System Mgmt)
Software
Power Open Source Software Stack Components
Existing
Open
Source
Software
Communities
Firmware
Hardware
New OSS
Community
OpenPOWER
Technology
OpenPOWER
Firmware
CAPP
PCIe
POWER8
CAPI over PCIe
“Standard POWER Products” – 2014
Hardware
“Custom POWER SoC” – Future
Customizable
Framework to Integrate
System IP on Chip
Industry IP License Model
Multiple Options to Design with POWER Technology Within OpenPOWER
Non-IBM POWER8 products
http://www.enterprisetech.com/2014/04/28/inside-google-tyan-power8-server-boards/
The Tyan reference (ATX) board,
SP010, measures 12” by 9.6”
➢ one single-chip module (SCM)
➢ four DDR3 memory slots
➢ four 6 Gb/sec SATA peripheral connectors
➢ two USB 3.0 ports
➢ two Gigabit Ethernet network interfaces
➢ keyboard and video
➢ intended for developers
The Google reference board
➢ two single-chip modules (SCMs)
➢ four modified SATA ports
➢ Google use only
The future of Analytics
The future of Analytics: An open approach
Open Platform for
Choice
POWER8 CAPI
Custom
Hardware
Application
POWER8
CAPP
Coherence Bus
PSL
FPGA or ASIC
Customizable Hardware
Application Accelerator
• Specific system SW, middleware, or user application
• Written to durable interface provided by PSL
POWER8
PCIe Gen 3
Transport for encapsulated messages
Processor Service Layer (PSL)
• Present robust, durable interfaces to applications
• Offload complexity / content from CAPP
Virtual Addressing
• Accelerator can work with same memory addresses that the
processors use
• Pointers de-referenced same as the host application
• Removes OS & device driver overhead
Hardware Managed Cache Coherence
• Enables the accelerator to participate in “Locks” as a normal thread
Lowers Latency over IO communication model
Coherent Accelerator Processor Interface (CAPI)
Coherent Accelerator Processor Interface (CAPI) Overview
CAPP PCIe
POWER8 Processor
Typical I/O Model Flow:
DD Call → Copy or Pin Source Data → MMIO Notify Accelerator → Acceleration → Poll / Int Completion → Copy or Unpin Result Data → Ret. From DD Completion
Flow with a Coherent Model:
Shared Mem. Notify Accelerator → Acceleration → Shared Memory Completion
FPGA
Function 0
Function 1
Function 2
Function n
CAPI
IBM Supplied POWER
Service Layer
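The contrast between the two flows can be shown with a plain-Python analogy. This is not CAPI code and uses no accelerator API. The function names and the byte-inverting "acceleration" step are hypothetical stand-ins, chosen only to show where copies occur in the I/O model and why they disappear when the accelerator shares the host's memory.

```python
def offload_io_model(source):
    """Typical I/O model: copy data in, run the function, copy the result out."""
    bytes_copied = 0
    device_buffer = bytearray(source)        # copy source data to the device
    bytes_copied += len(device_buffer)
    for i, value in enumerate(device_buffer):
        device_buffer[i] = value ^ 0xFF      # stand-in "acceleration" step
    result = bytes(device_buffer)            # copy result data back to the host
    bytes_copied += len(result)
    return result, bytes_copied


def offload_coherent_model(shared):
    """Coherent model: the function works on the host's own buffer in place."""
    view = memoryview(shared)                # same memory, no copy is made
    for i in range(len(view)):
        view[i] ^= 0xFF                      # same stand-in "acceleration" step
    return 0                                 # no bytes were copied


payload = bytes(range(8))
copied_result, copied = offload_io_model(payload)
shared = bytearray(payload)
in_place_copied = offload_coherent_model(shared)
print(copied_result == bytes(shared), copied, in_place_copied)
```

Both paths produce the same result, but the first moves every byte twice. The slide's coherent flow removes those copy steps, along with the device driver call and return.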
Example: Innovative “In-Memory” NoSQL/KVS Integrated Solution - via
POWER8 CAPI-attached Flash
WWW
10Gb Uplink
POWER8 Server
Flash Array w/ up
to 40TB
Differentiated NoSQL
(POWER8 + CAPI Flash)
Infrastructure Attributes
- 192 threads in 4U Server drawer
- 40 TB of memory based Flash per 4U Drawer
- Shared Memory & Cache for dynamic tuning
- Elimination of I/O and Network Overhead
- Cluster solution in a box
5X Cost Reduction with
equivalent performance
WWW
500GB Cache Node (shown six times in the diagram)
Backup Node
Load Balancer
Today’s NoSQL
in memory (x86)
10Gb Uplink
Infrastructure Requirements
- Large Distributed (Scale out)
- Large Memory per node
- Networking Bandwidth Needs
- Load Balancing
Power CAPI-attached Flash model for NoSQL offers dramatic (24:1) density advantage
Wrap-up
For more information on Big Data / Analytics
● Sales kits
– PartnerWorld
– IBM internal
● Worldwide contacts
– Renato Loffreda-Mancinelli, World Wide Business Analytics
and Big Data Solutions on Power - Business Dev. Leader
(loffreda@us.ibm.com)
– Michael Tabron, Solution Offering Manager, Power Analytics
(tabron@us.ibm.com)
– Gina King, Solution Offering Manager, Big Data Analytics
(glking@us.ibm.com)
– Bob Friske, Marketing Manager (rfriske@us.ibm.com)
Q & A
Summary:
1. Getting started with Big Data is the
toughest part. Start simple, small,
and on the side.
2. The OpenPOWER Foundation
enables new systems and helps
support the emerging analytic
solutions around NoSQL
databases.
3. POWER8 technology like CAPI will
enable new solutions from IBM and
the OpenPOWER Foundation.
Special notices
This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in
other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM
offerings available in your area.
Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions
on the capabilities of non-IBM products should be addressed to the suppliers of those products.
IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give
you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY
10504-1785 USA.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives
only.
The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or
guarantees either expressed or implied.
All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the
results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations
and conditions.
IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions
worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment
type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal
without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.
All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are
dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this
document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-
available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document
should verify the applicable data for their specific environment.
Revised September 26, 2006
Backup
Where to find more information? http://openpowerfoundation.org/

Hadoop Reporting and Analysis - Jaspersoft
 
Deploying Massive Scale Graphs for Realtime Insights
Deploying Massive Scale Graphs for Realtime InsightsDeploying Massive Scale Graphs for Realtime Insights
Deploying Massive Scale Graphs for Realtime Insights
 
Create your Big Data vision and Hadoop-ify your data warehouse
Create your Big Data vision and Hadoop-ify your data warehouseCreate your Big Data vision and Hadoop-ify your data warehouse
Create your Big Data vision and Hadoop-ify your data warehouse
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 

More from Kangaroot

So you think you know SUSE?
So you think you know SUSE?So you think you know SUSE?
So you think you know SUSE?Kangaroot
 
Live demo: Protect your Data
Live demo: Protect your DataLive demo: Protect your Data
Live demo: Protect your DataKangaroot
 
RootStack - Devfactory
RootStack - DevfactoryRootStack - Devfactory
RootStack - DevfactoryKangaroot
 
Welcome at OPEN'22
Welcome at OPEN'22Welcome at OPEN'22
Welcome at OPEN'22Kangaroot
 
EDB Postgres in Public Sector
EDB Postgres in Public SectorEDB Postgres in Public Sector
EDB Postgres in Public SectorKangaroot
 
Deploying NGINX in Cloud Native Kubernetes
Deploying NGINX in Cloud Native KubernetesDeploying NGINX in Cloud Native Kubernetes
Deploying NGINX in Cloud Native KubernetesKangaroot
 
Cloud demystified, what remains after the fog has lifted.
Cloud demystified, what remains after the fog has lifted.  Cloud demystified, what remains after the fog has lifted.
Cloud demystified, what remains after the fog has lifted. Kangaroot
 
Zimbra at Kangaroot / OPEN{virtual}
Zimbra at Kangaroot / OPEN{virtual}Zimbra at Kangaroot / OPEN{virtual}
Zimbra at Kangaroot / OPEN{virtual}Kangaroot
 
NGINX Controller: faster deployments, fewer headaches
NGINX Controller: faster deployments, fewer headachesNGINX Controller: faster deployments, fewer headaches
NGINX Controller: faster deployments, fewer headachesKangaroot
 
Kangaroot EDB Webinar Best Practices in Security with PostgreSQL
Kangaroot EDB Webinar Best Practices in Security with PostgreSQLKangaroot EDB Webinar Best Practices in Security with PostgreSQL
Kangaroot EDB Webinar Best Practices in Security with PostgreSQLKangaroot
 
Do you want to start with OpenShift but don’t have the manpower, knowledge, e...
Do you want to start with OpenShift but don’t have the manpower, knowledge, e...Do you want to start with OpenShift but don’t have the manpower, knowledge, e...
Do you want to start with OpenShift but don’t have the manpower, knowledge, e...Kangaroot
 
Red Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftRed Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftKangaroot
 
There is no such thing as “Vanilla Kubernetes”
There is no such thing as “Vanilla Kubernetes”There is no such thing as “Vanilla Kubernetes”
There is no such thing as “Vanilla Kubernetes”Kangaroot
 
Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)Kangaroot
 
Hashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public SectorHashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public SectorKangaroot
 
Kangaroot - Bechtle kadercontracten
Kangaroot - Bechtle kadercontractenKangaroot - Bechtle kadercontracten
Kangaroot - Bechtle kadercontractenKangaroot
 
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 8Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 8Kangaroot
 
Kangaroot open shift best practices - straight from the battlefield
Kangaroot open shift best practices - straight from the battlefieldKangaroot open shift best practices - straight from the battlefield
Kangaroot open shift best practices - straight from the battlefieldKangaroot
 
Kubecontrol - managed Kubernetes by Kangaroot
Kubecontrol - managed Kubernetes by KangarootKubecontrol - managed Kubernetes by Kangaroot
Kubecontrol - managed Kubernetes by KangarootKangaroot
 
OpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platformOpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platformKangaroot
 

More from Kangaroot (20)

So you think you know SUSE?
So you think you know SUSE?So you think you know SUSE?
So you think you know SUSE?
 
Live demo: Protect your Data
Live demo: Protect your DataLive demo: Protect your Data
Live demo: Protect your Data
 
RootStack - Devfactory
RootStack - DevfactoryRootStack - Devfactory
RootStack - Devfactory
 
Welcome at OPEN'22
Welcome at OPEN'22Welcome at OPEN'22
Welcome at OPEN'22
 
EDB Postgres in Public Sector
EDB Postgres in Public SectorEDB Postgres in Public Sector
EDB Postgres in Public Sector
 
Deploying NGINX in Cloud Native Kubernetes
Deploying NGINX in Cloud Native KubernetesDeploying NGINX in Cloud Native Kubernetes
Deploying NGINX in Cloud Native Kubernetes
 
Cloud demystified, what remains after the fog has lifted.
Cloud demystified, what remains after the fog has lifted.  Cloud demystified, what remains after the fog has lifted.
Cloud demystified, what remains after the fog has lifted.
 
Zimbra at Kangaroot / OPEN{virtual}
Zimbra at Kangaroot / OPEN{virtual}Zimbra at Kangaroot / OPEN{virtual}
Zimbra at Kangaroot / OPEN{virtual}
 
NGINX Controller: faster deployments, fewer headaches
NGINX Controller: faster deployments, fewer headachesNGINX Controller: faster deployments, fewer headaches
NGINX Controller: faster deployments, fewer headaches
 
Kangaroot EDB Webinar Best Practices in Security with PostgreSQL
Kangaroot EDB Webinar Best Practices in Security with PostgreSQLKangaroot EDB Webinar Best Practices in Security with PostgreSQL
Kangaroot EDB Webinar Best Practices in Security with PostgreSQL
 
Do you want to start with OpenShift but don’t have the manpower, knowledge, e...
Do you want to start with OpenShift but don’t have the manpower, knowledge, e...Do you want to start with OpenShift but don’t have the manpower, knowledge, e...
Do you want to start with OpenShift but don’t have the manpower, knowledge, e...
 
Red Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftRed Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShift
 
There is no such thing as “Vanilla Kubernetes”
There is no such thing as “Vanilla Kubernetes”There is no such thing as “Vanilla Kubernetes”
There is no such thing as “Vanilla Kubernetes”
 
Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)
 
Hashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public SectorHashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public Sector
 
Kangaroot - Bechtle kadercontracten
Kangaroot - Bechtle kadercontractenKangaroot - Bechtle kadercontracten
Kangaroot - Bechtle kadercontracten
 
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 8Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 8
 
Kangaroot open shift best practices - straight from the battlefield
Kangaroot open shift best practices - straight from the battlefieldKangaroot open shift best practices - straight from the battlefield
Kangaroot open shift best practices - straight from the battlefield
 
Kubecontrol - managed Kubernetes by Kangaroot
Kubecontrol - managed Kubernetes by KangarootKubecontrol - managed Kubernetes by Kangaroot
Kubecontrol - managed Kubernetes by Kangaroot
 
OpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platformOpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platform
 

Recently uploaded

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 

Recently uploaded (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

Analyzing Big Data - Jeff Scheel

  • 1. © 2014 IBM Corporation Open '14 Analyzing Big Data Jeff Scheel Chief Engineer Linux on Power June 2, 2014 scheel@us.ibm.com
  • 2. Agenda 1. Getting started with Big Data 2. OpenPOWER Foundation 3. The future of Analytics
  • 3. Getting started with Big Data
  • 4. Big Data is growing and moving fast from a variety of sources; are you keeping up? • 1 trillion connected devices generate 2.5 quintillion bytes of data per day • 80% of the world’s data today is unstructured • 1 in 2 business leaders don’t have access to the data they need
  • 5. “Data is the new oil” In its raw form, oil has little value. Once processed and refined, it helps power the world. “Big Data has arrived at Seton Health Care Family, fortunately accompanied by an analytics tool that will help deal with the complexity of more than two million patient contacts a year…” “Data is the new oil.” Clive Humby “At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic. A report by the forum, “Big Data, Big Impact,” declared data a new class of economic asset, like currency or gold.” “Increasingly, businesses are applying analytics to social media such as Facebook and Twitter, as well as to product review websites, to try to “understand where customers are, what makes them tick and what they want”, says Deepak Advani, who heads IBM’s predictive analytics group.” “Companies are being inundated with data – from information on customer-buying habits to supply-chain efficiency. But many managers struggle to make sense of the numbers.”
  • 6. The challenge: handling the large Volume, Variety, Velocity, and Veracity of data to find new insights and improve business outcomes. Analytic Applications: BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics. IBM Big Data Platform: Systems Management, Application Development, Visualization & Discovery, Accelerators, Information Integration & Governance, Hadoop System, Stream Computing, Data Warehouse. MFG - Analyze & correlate log records to improve service and predict failures. Telco - Address customer satisfaction, predict churn, and match promotions in real time. Healthcare - Detect life-threatening conditions at hospitals in time to intervene. Retail - Multi-channel customer sentiment and experience analysis. Financial Services - Make risk decisions based on real-time transactional data. Law Enforcement - Identify criminals and threats from video and audio feeds.
  • 7. Customers are deploying new infrastructure to leverage all data types Data in Motion Data at Rest Data in Many Forms Information Ingestion and Operational Information Decision Management BI and Predictive Analytics Navigation and Discovery Intelligence Analysis Landing Area, Analytics Zone and Archive • Raw Data • Structured Data • Text Analytics • Data Mining • Entity Analytics • Machine Learning Real-time Analytics • Video/Audio • Network/Sensor • Entity Analytics • Predictive Exploration, Integrated Warehouse, and Mart Zones • Discovery • Deep Reflection • Operational • Predictive Stream Processing • Data Integration • Master Data Streams Information Governance, Security and Business Continuity Hadoop Infrastructure – currently being deployed on commodity hardware
  • 8. WATSON IBM and Red Hat innovating in Healthcare with Watson Two new Watson-based products: • Interactive Care Insights for Oncology • The WellPoint Interactive Care Guide and Interactive Care Reviewer Watson's oncology education: • 600,000 pieces of medical evidence • 2 million pages of text • 25,000 training cases Watson can review 1.5 million patient records in less time than it takes most office computers to boot up
  • 9. Big Data implementation patterns: • Common analysis of structured & unstructured data • Warehouse and BigInsights partitioning • Warehouse batch offload • Separate unstructured & structured analysis [Diagram labels for each pattern: Warehouse, Hadoop, App / BI, Visualization, Exploration, Structured, Unstructured]
  • 10. What the experts say 1. Seek project input from Sales, Marketing, and Operations teams 2. Select projects which are well-defined and have quick ROI – less than a year 3. Leverage your experiences from data warehouse and business intelligence projects 4. Avoid starting with a “Big Bang” Source: http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=SA&subtype=WH&htmlfid=POL03133USEN
  • 11. More ideas for starting [Diagram labels: Existing BI Stack (Warehouse, App / BI, Visualization, Exploration); New (Hadoop, App / BI, Visualization, Exploration); Separate unstructured & structured analysis] • Find a small problem to solve, e.g. an internal phone directory, and start “on the side”. • Locate relevant data and identify which pieces are “in motion” or “at rest”. • For data at rest, build open-source Hadoop on your PowerLinux system or try the InfoSphere BigInsights Basic Edition (no charge). • For data in motion, use the InfoSphere Streams trial download. • Reference the IBM Information Center for details on how to import data into Hadoop and how to write applications using Streams Studio. • Explore Datameer to visualize your Hadoop-based Big Data.
  • 12. PowerLinux jump start services facilitate starting with Big Data Analytics 5 Day IBM Power Analytics Services Jump Start Includes: • 5 days, on-site service offering • Quick Analytics Assessment Workshop • Software installation • Hands-on education in getting started • Evaluating the analytical approach for your business that will make the biggest impact • Quick sample application to consume customer data Reference Architecture Workshop Why Jump Start Services for your IBM Power Analytics solution? • Learn how to optimally leverage IBM Power System for Analytics • Learn the benefits and reasoning of Big Data • Learn how to gain business value from the data you have 2 Day IBM Power Analytics Services Jump Start Includes: • 2 days, on-site Big Data Analytics service offering • Software installation • Hands-on education in getting started • Evaluating the analytical approach for your business that will make the biggest impact IBM Systems Lab Services & Training - Power Systems Services for PowerLinux, AIX, and OS Contact – Linda Hoben, Opportunity Manager, hoben@us.ibm.com IBM Power servers are an ideal platform for streaming data and performing analytic computations for a multitude of applications. Let us help make you successful!
  • 13. IBM POWER has a strong history in transactional processing workloads [Chart: tpcC and $/tpcC by system, listed as system: tpcC, $/tpcC] S70: 1,556, $109.00; S7A: 2,845, $89.00; S80: 5,669, $52.70; S85: 9,200, $43.00; p690: 12,602, $17.80; p690+: 23,871, $8.31; p690++: 32,046, $5.42; p5-595: 50,164, $5.19; p5-595+: 63,021, $2.97; P6 595: 95,081, $2.81; P7 780: 150,000, $0.69
  • 14. POWER8 Processor Caches • 512 KB SRAM L2 / core • 96 MB eDRAM shared L3 • Up to 128 MB eDRAM L4 (off-chip) Cores • 12 cores (SMT8) • 8 dispatch, 10 issue, 16 exec pipe • 2X internal data flows/queues • Enhanced prefetching • 64K data cache, 32K instruction cache Accelerators • Crypto & memory expansion • Transactional Memory • VMM assist • Data Move / VM Mobility Energy Management • On-chip Power Management Micro-controller • Integrated Per-core VRM • Critical Path Monitors Technology • 22nm SOI, eDRAM, 15 ML, 650 mm² Memory • Up to 230 GB/s sustained bandwidth Bus Interfaces • Durable open memory attach interface • Integrated PCIe Gen3 • SMP Interconnect • CAPI (Coherent Accelerator Processor Interface) ComputerWorld: To make the chip faster, IBM has turned to a more advanced manufacturing process, increased the clock speed and added more cache memory, but perhaps the biggest change heralded by the Power8 cannot be found in the specifications. After years of restricting Power processors to its servers, IBM is throwing open the gates and will be licensing Power8 to third-party chip and component makers. The Register: the Power8 is so clearly engineered for midrange and enterprise systems for running applications on a giant shared memory space, backed by lots of cores and threads. Power8 does not belong in a smartphone unless you want one the size of a shoebox that weighs 20 pounds. But it most certainly does belong in a badass server, and Power8 is by far one of the most elegant chips that Big Blue has ever created, based on the initial specs. PCWorld: With Power8, IBM has more than doubled the sustained memory bandwidth from the Power7 and Power7+, to 230 GB/s, as well as I/O speed, to 48 GB/s. Put another way, Watson’s ability to look up and respond to information has more than doubled as well. Microprocessor Report: Called Power8, the new chip delivers impressive numbers, doubling the performance of its already powerful predecessor, Power7+. Oracle currently leads in server-processor performance, but IBM’s new chip will crush those records. The Power8 specs are mind boggling. Source: Hot Chips presentation
  • 15. POWER8 delivers 2.5x performance on Big Data / Hadoop POWER8 reduces the number of servers by 60% based on the best x86 published Terasort result • POWER8 S822L will deliver over 2x the performance of the best published x86 system … and continues to offer far superior RAS • POWER8 delivers 1.7X over HP on a per-core normalized benchmark. • POWER8 exploits additional cores, more threads, larger caches, memory bandwidth • Terasort is a popular benchmark to measure the performance of a Hadoop solution • Sorts a large dataset (10 TB) in parallel • Exercises the MapReduce framework and Hadoop Distributed File System (HDFS) [Chart: Relative System Performance, POWER8 vs. Cisco; labels: >2x, 2.5x] IBM Analytics Stack: IBM Power System S822L; 24 cores / 192 threads, POWER8; 3.0GHz, 512 GB memory, RHEL 6.5, InfoSphere BigInsights 3.0 Compared to a 16-core HP system http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns944/le_tera.pdf
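Slide 15 pairs a 2.5x per-system performance figure with a 60% reduction in server count. A minimal sketch of the arithmetic that may connect the two, assuming the workload scales linearly with the number of servers (an assumption of this sketch, not a statement from the slides); the function name `server_reduction` is hypothetical.

```python
def server_reduction(speedup: float) -> float:
    """Fraction of servers saved if each new server does `speedup` times the work.

    Assumes linear scaling: N old servers are replaced by N / speedup new ones.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # 2.5x per-system performance -> 1 - 1/2.5 = 0.6, i.e. 60% fewer servers
    print(f"{server_reduction(2.5):.0%}")
```

Under the same assumption, a 2x speedup would correspond to a 50% reduction, so the 60% figure matches the 2.5x headline number rather than the "over 2x" bullet.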
  • 16. New IBM Power Systems based on POWER8 (1 & 2 Sockets): Power Systems S812L • 1-socket, 2U • Linux Only; Power Systems S822L • 2-socket, 2U • Linux Only; Power Systems S822 • 2-socket, 2U • All Operating Systems; Power Systems S814 • 1-socket, 4U • All Operating Systems; Power Systems S824 • 2-socket, 4U • All Operating Systems; Power Systems S824L • 2-socket, 4U • Linux Only • SOD
  • 17. OpenPOWER Foundation – The emerging ecosystem
  • 18. 18 © OpenPOWER Foundation 2014 Industry trends • The number of companies designing & building servers is increasing – Traditionally there have been few companies designing systems: HP, IBM, SUN, Dell, etc. – Today there are many more: Google, Microsoft, Facebook, Rackspace, Huawei, Sugon, Inspur, etc. – A fairly mature ecosystem including the Taiwanese ODMs is a key enabler of this trend • Numerous disruptive forces are impacting these custom system designs and driving designers to consider new ways of innovating – Ability to handle rapid growth in Big Data & Analytics based solutions – Choice and Innovation – CPU SOC integration drives the need for chip development • These trends create a need for a server-targeted “chip-system-software” ecosystem – IBM has technology and a software stack ready to meet these needs – IBM recognizes the need to work with partners to create this ecosystem – IBM recognizes the need for choice and options in processor sourcing
  • 19. OpenPOWER Foundation Structure OpenPOWER is an industry foundation based on the POWER architecture, enabling an open community for development and opportunity for member differentiation and growth
  • 20. Building collaboration and innovation at all levels Welcoming new members in all areas of the ecosystem 100+ inquiries and numerous active dialogues underway Boards/Systems I/O, Storage, Acceleration Chip/SOC System/Software/Services
  • 21. OpenPOWER Proposed Ecosystem Enablement xCAT System Operating Environment Software Stack A modern development environment is emerging based on tools and services Cloud Software Operating System / KVM Standard Operating Environment (System Mgmt) Software Power Open Source Software Stack Components Existing Open Source Software Communities Firmware Hardware New OSS Community OpenPOWER Technology OpenPOWER Firmware CAPP PCIe POWER8 CAPI over PCIe “Standard POWER Products” – 2014 Hardware “Custom POWER SoC” – Future Customizable Framework to Integrate System IP on Chip Industry IP License Model Multiple Options to Design with POWER Technology Within OpenPOWER
  • 22. © 2014 IBM Corporation22 Non-IBM POWER8 products http://www.enterprisetech.com/2014/04/28/inside-google-tyan-power8-server-boards/ The Tyan reference (ATX) board, SP010, measures 12” by 9.6” ➢ one single-chip module (SCM) ➢ four DDR3 memory slots ➢ four 6 Gb/sec SATA peripheral connectors ➢ two USB 3.0 ports ➢ two Gigabit Ethernet network interfaces ➢ keyboard and video ➢ intended for developers The Google reference board ➢ two single-chip modules (SCMs) ➢ four modified SATA ports ➢ Google use only
  • 23. © 2014 IBM Corporation The future of Analytics
  • 24. © 2014 IBM Corporation24 The future of Analytics: An open approach Open Platform for Choice
  • 25. 25 © OpenPOWER Foundation 2014 POWER8 CAPI Custom Hardware Application POWER8 CAPP Coherence Bus PSL FPGA or ASIC Customizable Hardware Application Accelerator • Specific system SW, middleware, or user application • Written to durable interface provided by PSL POWER8 PCIe Gen 3 Transport for encapsulated messages Processor Service Layer (PSL) • Present robust, durable interfaces to applications • Offload complexity / content from CAPP Virtual Addressing • Accelerator can work with same memory addresses that the processors use • Pointers de-referenced same as the host application • Removes OS & device driver overhead Hardware Managed Cache Coherence • Enables the accelerator to participate in “Locks” as a normal thread Lowers Latency over IO communication model Coherent Accelerator Processor Interface (CAPI)
  • 26. © 2014 IBM Corporation26 Coherent Accelerator Processor Interface (CAPI) Overview [Diagram labels: POWER8 Processor, CAPP, PCIe, FPGA, CAPI, IBM Supplied POWER Service Layer, Function0, Function1, Function2, ..., Functionn] Typical I/O Model Flow: DD Call; Copy or Pin Source Data; MMIO Notify Accelerator; Acceleration; Poll / Int Completion; Copy or Unpin Result Data; Ret. From DD Completion. Flow with a Coherent Model: Shared Mem. Notify Accelerator; Acceleration; Shared Memory Completion.
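The contrast between the two flows on this slide can be sketched as data. The following is only an illustrative Python sketch: the step names are copied from the slide, but grouping them into these two ordered lists is an assumption about the diagram's layout, and the helper `software_overhead_steps` is a hypothetical name, not part of any CAPI API. It simply counts the steps around the acceleration itself.

```python
# Step names are taken from the slide text; their grouping and order
# are an assumption about how the original diagram was laid out.
TYPICAL_IO_FLOW = [
    "DD Call",
    "Copy or Pin Source Data",
    "MMIO Notify Accelerator",
    "Acceleration",
    "Poll / Int Completion",
    "Copy or Unpin Result Data",
    "Ret. From DD Completion",
]

COHERENT_FLOW = [
    "Shared Mem. Notify Accelerator",
    "Acceleration",
    "Shared Memory Completion",
]


def software_overhead_steps(flow):
    """Return every step in a flow other than the acceleration itself."""
    return [step for step in flow if step != "Acceleration"]


if __name__ == "__main__":
    for name, flow in (("Typical I/O", TYPICAL_IO_FLOW),
                       ("Coherent", COHERENT_FLOW)):
        overhead = software_overhead_steps(flow)
        print(f"{name}: {len(flow)} steps, {len(overhead)} around the acceleration")
```

Counting steps says nothing about how long each one takes; it only makes visible that the coherent model drops the device-driver call and the copy/pin stages listed for the typical I/O model.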
  • 27. © 2014 IBM Corporation27 Example: Innovative “In-Memory” NoSQL/KVS Integrated Solution - via POWER8 CAPI-attached Flash WWW 10Gb Uplink POWER8 Server Flash Array w/ up to 40TB Differentiated NoSQL (POWER8 + CAPI Flash) Infrastructure Attributes - 192 threads in 4U Server drawer - 40 TB of memory based Flash per 4U Drawer - Shared Memory & Cache for dynamic tuning - Elimination of I/O and Network Overhead - Cluster solution in a box 5X Cost Reduction with equivalent performance WWW 500GB Cache Node500GB Cache Node500GB Cache Node500GB Cache Node500GB Cache Node500GB Cache Node Backup Node Load Balancer Today’s NoSQL in memory (x86) 10Gb Uplink Infrastructure Requirements - Large Distributed (Scale out) - Large Memory per node - Networking Bandwidth Needs - Load Balancing Power CAPI-attached Flash model for NoSQL offers dramatic (24:1) density advantage
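As a rough check on the scale being compared, the sketch below works out how many 500GB cache nodes it would take to match the 40 TB of flash quoted for one 4U drawer. The two capacities come from the slide; the use of decimal units (1 TB = 1000 GB) and the function name `cache_nodes_for_capacity` are assumptions made for illustration. It does not reproduce the slide's 24:1 density or 5X cost figures, which depend on rack sizes and prices the slide does not state.

```python
FLASH_PER_DRAWER_TB = 40   # from the slide: 40 TB of flash per 4U drawer
CACHE_NODE_GB = 500        # from the slide: "500GB Cache Node"
GB_PER_TB = 1000           # assumption: decimal units


def cache_nodes_for_capacity(capacity_tb, node_gb=CACHE_NODE_GB):
    """Number of cache nodes needed to hold capacity_tb, rounded up."""
    total_gb = capacity_tb * GB_PER_TB
    return -(-total_gb // node_gb)  # ceiling division on integers


if __name__ == "__main__":
    nodes = cache_nodes_for_capacity(FLASH_PER_DRAWER_TB)
    print(f"{FLASH_PER_DRAWER_TB} TB needs {nodes} nodes of {CACHE_NODE_GB} GB each")
```

Under these assumptions the scale-out design needs 80 cache nodes, before counting the backup node and load balancer shown on the slide, to hold what the slide places in a single drawer.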
  • 28. © 2014 IBM Corporation Wrap-up
  • 29. © 2014 IBM Corporation29 For more information on Big Data / Analytics ● Sales kits – PartnerWorld – IBM internal ● Worldwide contacts – Renato Loffreda-Mancinelli, World Wide Business Analytics and Big Data Solutions on Power - Business Dev. Leader (loffreda@us.ibm.com) – Michael Tabron, Solution Offering Manager, Power Analytics (tabron@us.ibm.com) – Gina King, Solution Offering Manager, Big Data Analytics (glking@us.ibm.com) – Bob Friske, Marketing Manager (rfriske@us.ibm.com)
  • 30. © 2014 IBM Corporation30 Q & A Summary: 1. Getting started with Big Data is the toughest part. Start simple, small, and on the side. 2. The OpenPOWER Foundation enables new systems and helps support the emerging analytic solutions around NoSQL databases. 3. POWER8 technology like CAPI will enable new solutions from IBM and the OpenPOWER Foundation.
  • 31. © 2014 IBM Corporation31 Special notices This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area. Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied. All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions. IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies. All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment. Revised September 26, 2006
  • 32. © 2014 IBM Corporation Backup
  • 33. © 2014 IBM Corporation33 Where to find more information? http://openpowerfoundation.org/