Get Started Quickly with IBM's Hadoop as a Service
1. © 2015 IBM Corporation
BigInsights on Cloud
Hadoop-as-a-Service
July 28th, 2015
2.
Disclaimer
IBM’s statements regarding its plans, directions, and intent are subject to change or
withdrawal without notice at IBM’s sole discretion. Information regarding potential future
products is intended to outline our general product direction and it should not be relied on in
making a purchasing decision. The information mentioned regarding potential future products
is not a commitment, promise, or legal obligation to deliver any material, code or functionality.
Information about potential future products may not be incorporated into any contract. The
development, release, and timing of any future features or functionality described for our
products remains at our sole discretion.
3.
Agenda
• Evolution of the Big Data Analytics space
• Open Data Platform and IBM’s BigInsights
• Hadoop as a Service – BigInsights on Cloud Options
• IBM Analytics for Hadoop – Free, 14-day trial
• BigInsights for Apache Hadoop – Bare Metal option for Production
• Demo
• Questions & Answers
• Resources
4.
Big Data continues to be a hot topic in the market
“At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic. A report by the forum, “Big Data, Big Impact,” declared data a new class of economic asset, like currency or gold.”
“Companies are being inundated with data – from information on customer-buying habits to supply-chain efficiency. But many managers struggle to make sense of the numbers.”
“Increasingly, businesses are applying analytics to social media such as Facebook and Twitter, as well as to product review websites, to try to “understand where customers are, what makes them tick and what they want”, says Deepak Advani, who heads IBM’s predictive analytics group.”
“Big Data has arrived at Seton Health Care Family, fortunately accompanied by an analytics tool that will help deal with the complexity of more than two million patient contacts a year…”
“Data is the new oil.” Clive Humby
“The Oscar Senti-meter – a tool developed by the L.A. Times, IBM and the USC Annenberg Innovation Lab – analyzes opinions about the Academy Awards race shared in millions of public messages on Twitter.”
“…now Watson is being put to work digesting millions of pages of research, incorporating the best clinical practices and monitoring the outcomes to assist physicians in treating cancer patients.”
5.
Big Data implementations are driving real business value for IBM customers
• An automotive company is running a series of experiments to better understand and adapt to the shifting landscape of urban transportation, streaming data from sensors on cars with InfoSphere Streams and analyzing it on Hadoop using BigInsights on Cloud.
• An industrial manufacturer in the United States reduces errors and the time required for engine calibrations by 90 percent, and improves reliability and new product design, by using sensors to collect information on its products in the field and analyzing it with InfoSphere BigInsights.
6.
Rich capabilities in IBM’s Big Data Portfolio mean
lower risk and more successful projects
On premise, Cloud, and “as a Service”
BigInsights
7.
Open Data Platform and IBM BigInsights
8.
Open Data Platform Initiative
Why is IBM involved?
• Strong history of leadership in open source & standards
• Supports our commitment to open source currency in all future releases
• Accelerates our innovation within Hadoop & surrounding applications
Open Data Platform (ODP) vs. Apache Software Foundation (ASF)
• ODP supports the ASF mission
• ASF provides a governance model around individual projects without looking at the ecosystem
• ODP aims to provide a vendor-led, consistent packaging model for core Apache components as an ecosystem
All Standard Apache Open Source Components
HDFS, YARN, MapReduce, Ambari, HBase, Spark, Flume, Hive, Pig, Sqoop, HCatalog, Solr/Lucene
ODP
9.
BigInsights for Apache Hadoop
IOP + IBM Value Adds = BigInsights
IBM value adds
• SQL on Hadoop: Big SQL, optimized ANSI-compliant SQL
• Application Tooling: toolkits and accelerators
• Search & Entity Matching: Watson Explorer, Big Match
• Data Visualization: BigSheets spreadsheet interface
• Predictive Modeling: Big R, Machine Learning
• Text Analytics: advanced text processing with AQL, text extraction web interface
• Real-time Analytics: Streams
• Data Governance and Security: DataClick, LDAP, secure cluster
• Storage Integration: GPFS, a POSIX distributed filesystem
• Enterprise Manageability: Adaptive MapReduce, multi-tenant scheduling
IBM Open Platform (IOP)
Knox, Ambari, Snappy, OpenJDK, Avro, Solr, Oozie, Flume, Slider, Pig, Hadoop (HDFS/MapReduce/YARN*), ZooKeeper, Parquet, HBase, Spark, Hive, Sqoop
ODP
10.
BigInsights Users & Role-Based Modules
IBM Open Platform
BigInsights for
Apache Hadoop
12.
IBM Open Platform uses Ambari
14.
IBM BigInsights – BigSheets
Spreadsheet style analysis tool for business users
Easily visualize big data using
rich built-in graphing and
analytic functions
15.
Big SQL in BigInsights
IBM’s SQL for Hadoop (ANSI SQL 2011 compliant)
• Makes Hadoop data accessible to a wider audience
• Familiar, widely known syntax
• Leverage native Hadoop data sources
Complements the Data Warehouse
• Exploratory analytics
• Sandbox, Data Lake
Included in BigInsights
Use familiar SQL tools
• Cognos, SPSS, Tableau, MicroStrategy
Diagram components: Application, SQL Language, JDBC / ODBC Driver, JDBC / ODBC Server, BigSQL Engine (within BigInsights), Data Sources
Data Sources: Hive Tables, HBase Tables
Native Sources: CSV, SEQ, Parquet, RC, AVRO, ORC, JSON, Custom
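To illustrate the "familiar, widely known syntax" point, here is a minimal sketch of the kind of plain standard SQL an analyst might run. It is not Big SQL itself: an in-memory SQLite database stands in for the engine so the example runs anywhere, and the `web_logs` table, its columns, and the `run_query` helper are hypothetical names invented for this illustration. Against a real cluster, the same query text would presumably be submitted through the JDBC / ODBC driver instead.

```python
import sqlite3

# Stand-in for a Big SQL connection: an in-memory SQLite database.
# (Assumption for illustration only; the deck describes JDBC/ODBC access.)
conn = sqlite3.connect(":memory:")

# Hypothetical table of web log records.
conn.execute("CREATE TABLE web_logs (page TEXT, status INTEGER, bytes INTEGER)")
conn.executemany(
    "INSERT INTO web_logs VALUES (?, ?, ?)",
    [
        ("/home", 200, 512),
        ("/home", 200, 498),
        ("/cart", 500, 120),
        ("/cart", 200, 640),
        ("/help", 404, 90),
    ],
)

# Ordinary standard SQL: filter, aggregate, and sort.
QUERY = """
SELECT page, COUNT(*) AS hits, SUM(bytes) AS total_bytes
FROM web_logs
WHERE status = 200
GROUP BY page
ORDER BY hits DESC, page
"""


def run_query(connection, sql):
    """Execute a query and return all result rows as a list of tuples."""
    return connection.execute(sql).fetchall()


rows = run_query(conn, QUERY)
for page, hits, total_bytes in rows:
    print(page, hits, total_bytes)
```

The point of the sketch is that nothing in the query is Hadoop-specific, which is what lets existing SQL tools and skills carry over.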
16.
IBM BigInsights – Text Analytics
Information Extraction Framework for Text Analytics
Example of text analytic tooling: a graphical interface to describe the structure of various textual formats, from log file data to natural language. Users do not need to know AQL.
17.
IBM BigInsights – Big R
End-to-end integration of R into BigInsights
• Explore, visualize, transform, and model big data using familiar R syntax and paradigm
• Scale out R
  – Partitioning of large data (“divide”)
  – Parallel cluster execution of pushed-down R code (“conquer”)
  – All of this from within the R environment (Jaql and MapReduce are hidden from you)
  – Almost any R package can run in this environment
• Pull data summaries to the R client, or push R functions right onto the data
Diagram components: R Clients, Embedded R Execution, R Packages, Data sources
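The "divide" and "conquer" idea on this slide can be sketched in a few lines. This is not the Big R API (which is R-based and not shown in the deck); it is a Python illustration of the pattern only. The helper names `partition`, `summarize`, and `divide_and_conquer`, the flight-delay records, and the use of a local thread pool in place of cluster nodes are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def partition(records, key):
    """'Divide': group records by a key so each group can be processed alone."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec)
    return groups


def summarize(group):
    """Function 'pushed down' to one partition; returns only a small summary."""
    delays = [rec["delay"] for rec in group]
    return {"n": len(delays), "mean_delay": mean(delays)}


def divide_and_conquer(records, key, fn, workers=4):
    """'Conquer': apply fn to every partition in parallel and collect results.

    A local thread pool stands in for parallel execution on a cluster.
    """
    groups = partition(records, key)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(fn, groups.values())
        return dict(zip(groups.keys(), results))


# Hypothetical sample data.
flights = [
    {"carrier": "AA", "delay": 10},
    {"carrier": "AA", "delay": 20},
    {"carrier": "UA", "delay": 5},
]

summary = divide_and_conquer(flights, "carrier", summarize)
print(summary)
```

Only the per-partition summaries travel back to the caller, which mirrors the "pull data summaries to the client" option, while `summarize` running next to each partition mirrors "push functions onto the data".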
18.
IBM BigInsights – Cloud deployment options
Manage less, analyze more
IBM Analytics for Hadoop
• Prototype, create mash-ups in the cloud for non-production use
• Empowers developers to rapidly drive insight from all data
• Two-node Docker instance
• Enterprise features – BigSheets, Big SQL, Text, and Big R
• Delivered via IBM Bluemix
• 50 GB input data space
• Extendable, free 14-day trial
BigInsights for Apache Hadoop
• For production deployments at scale in the cloud
• Delivers flexibility and efficiency with BYOL and PAYG pricing
• Scale to meet spikes in demand without on-premise infrastructure
• Perform enterprise-class, complex analytics on Big Data
• Available via the IBM Cloud Marketplace
• Web-based UI for sizing/pricing
19.
IBM Analytics for Hadoop Details
Free 14-day trial on www.bluemix.net
20.
BigInsights for Apache Hadoop – Options
Secure, Dedicated Bare-metal
Infrastructure
IBM Open Platform
BigInsights for
Apache Hadoop
21.
IBM BigInsights on Cloud – Security
Dedicated, isolated environment for every client
Administrative control owned by customer at Hadoop
and BigInsights level
Native HDFS encryption; optional Guardium encryption
Firewalls provide perimeter security and private network isolation
Aiming for ISO 27K1 compliance in 2015
Example Configuration…
Non-shared physical machines for added security & performance
22.
BigInsights on Cloud
Demonstration
23.
The IBM Difference
IBM delivers the foundation for Big Data – now and in the future
Embraces open source
Establishes standards
Integrates with familiar interfaces and established systems
Delivers advanced analytic capabilities
IBM is the only vendor providing…
Hadoop as a Managed Service in the Cloud
A single company providing Hadoop-based software, cloud and services
Provides expertise to help you on your journey
6,000 partners
Analytics services and solution centers
24.
IBM BigInsights on Cloud – unique capability
Built-in Twitter Decahose service
Scaled down random sample of Twitter Firehose
Easily land Twitter data into BigInsights HDFS
Manipulate and visualize data using BigSheets
Incorporate sentiment data into analytic models
Easily store and accommodate vast data sets
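As a rough illustration of what a "scaled down random sample" of a message stream means, the sketch below keeps each message independently with a fixed probability. It is not the Decahose service or any Twitter or IBM API. The `sample_stream` function, the synthetic `tweet-N` messages, and the 10% rate (suggested by the "Deca" in the name) are assumptions of this example.

```python
import random


def sample_stream(messages, rate=0.10, seed=None):
    """Yield each message with probability `rate` (independent random sampling)."""
    rng = random.Random(seed)
    for msg in messages:
        if rng.random() < rate:
            yield msg


# Synthetic stand-in for a full message stream.
firehose = (f"tweet-{i}" for i in range(10_000))

# Keep roughly one message in ten; the seed only makes the run repeatable.
decahose = list(sample_stream(firehose, rate=0.10, seed=42))
print(len(decahose))
```

Because the sampling is done per message as the stream passes through, the full stream never has to be held in memory, and the sample stays roughly proportional to the original volume.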
25.
Check out more data management services at www.bluemix.net
Cloudant, dashDB, BigInsights on Cloud, DB2 on Cloud
26.
Resources
Big Data University – Free Training
http://bigdatauniversity.com/
Powered by Hadoop
http://wiki.apache.org/hadoop/PoweredBy
Free Trial Software (both for on-premise and cloud)
http://www-01.ibm.com/software/data/infosphere/hadoop/trials.html
YouTube Videos
Watson
• The Science Behind the Answer (~7 minutes)
• Watson: Final Jeopardy (~11 minute summary)
Big Data Channel
• http://www.youtube.com/user/ibmbigdata
27.
Thank You
English: Thank You
French: Merci
Italian: Grazie
Spanish: Gracias
Portuguese: Obrigado
German: Danke
Romanian: Multumesc
Turkish: Teşekkür ederim