IBM Leads the Way with Hadoop and Spark (15 May 2015)
IBMInfoSphereUGFR
1.
IBM Leads the Way with Hadoop and Spark
The Keys to Getting Value out of Big Data
© 2015 IBM Corporation
2.
IBM’s Framework for Getting Value out of Big Data
• All agree on Big Data’s potential, but there is wide divergence on how to exploit it
• Pioneers who have started to harness Big Data have benefited greatly
• We see Big Data adoption as a continual process with maturity levels
• IBM’s approach enables faster adoption of Big Data technologies
  - Open source innovation (Hadoop, Spark)
  - Standards-based technologies (ODP, SQL, R)
  - Familiar interfaces and integration with established tools (IBM innovations)
  - Advanced analytics (IBM innovations)
• IBM’s commitment to continued innovation
3.
Hadoop and Spark Offer Significant Business Benefits
[Chart: Value plotted against Big Data Maturity, each axis running from Low to High]
Areas shown: Operations, Data Warehousing, Line of Business and Analytics, New Business Imperatives
Lower the Cost of Storage
Warehouse Modernization
• Data lake
• Data offload
• ETL offload
• Queryable archive and staging
Data-Informed Decision Making
• Full dataset analysis (no more sampling)
• Extract value from non-relational data
• 360° view of all enterprise data
• Exploratory analysis and discovery
Business Transformation
• Create new business models
• Risk-aware decision making
• Fight fraud and counter threats
• Optimize operations
• Attract, grow, retain customers
4.
IBM Investing in Four Catalysts for Big Data Adoption
• Open Source Innovation
• Technical Standards
• Familiar Interfaces & Integration with Established Tools
• New Analytics Capabilities
5.
Hadoop Advantages
Unlimited Scale
• Multiple data sources
• Multiple applications
• Multiple users
Enterprise Platform
• Reliability
• Resiliency
• Security
Wide Range of Data Formats
• Files
• Semi-structured
• Databases
6.
Hadoop MapReduce Challenges
Ease of Development
• Need deep Java skills
• Few abstractions available for analysts
In-Memory Performance
• No in-memory framework
• Application tasks write to disk with each cycle
Combine Workflows
• Only suitable for batch workloads
• Rigid processing model
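To make the "write to disk with each cycle" point concrete, here is a minimal sketch in plain Python of a word count split into map, shuffle, and reduce steps, where every step hands its result to the next through a file. It is an illustration only, not Hadoop code: the function names and the JSON file layout are assumptions made for this example.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def map_phase(lines, workdir):
    """Emit (word, 1) pairs and persist them to disk before the next step runs."""
    pairs = [(word.lower(), 1) for line in lines for word in line.split()]
    out = Path(workdir) / "map_output.json"
    out.write_text(json.dumps(pairs))
    return out


def shuffle_phase(map_output, workdir):
    """Read the map output back from disk and group the counts by word."""
    grouped = defaultdict(list)
    for word, count in json.loads(Path(map_output).read_text()):
        grouped[word].append(count)
    out = Path(workdir) / "shuffle_output.json"
    out.write_text(json.dumps(grouped))
    return out


def reduce_phase(shuffle_output):
    """Read the grouped counts back from disk and sum them per word."""
    grouped = json.loads(Path(shuffle_output).read_text())
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three steps in a fixed order, with a disk round trip between each."""
    with tempfile.TemporaryDirectory() as workdir:
        mapped = map_phase(lines, workdir)
        shuffled = shuffle_phase(mapped, workdir)
        return reduce_phase(shuffled)


if __name__ == "__main__":
    print(word_count(["big data big value", "data lake"]))
```

An iterative algorithm that repeats such a job many times pays for these disk round trips on every pass, which is the cost the slide attributes to the lack of an in-memory framework.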
7.
Spark Advantages
Ease of Development
• Easier APIs
• Python, Scala, Java
In-Memory Performance
• Resilient Distributed Datasets
• Unify processing
Combine Workflows
• Batch
• Interactive
• Iterative algorithms
• Micro-batch
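One way to picture "combine workflows" is that the same piece of logic can serve both a batch job and a micro-batch job. The sketch below shows this in plain Python rather than Spark; the helper names (`count_errors`, `micro_batches`) and the sample records are hypothetical and exist only for this illustration.

```python
from itertools import islice


def count_errors(records):
    """Shared logic: count the records whose level is ERROR."""
    return sum(1 for record in records if record["level"] == "ERROR")


def micro_batches(stream, batch_size):
    """Cut an iterator (possibly unbounded) into lists of at most batch_size items."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


events = [{"level": level} for level in ["INFO", "ERROR", "ERROR", "INFO", "ERROR"]]

# Batch: run the logic once over the full dataset.
total = count_errors(events)

# Micro-batch: run the same logic over each small slice as it arrives.
per_batch = [count_errors(batch) for batch in micro_batches(events, batch_size=2)]

if __name__ == "__main__":
    print(total, per_batch)
```

The per-slice counts add up to the batch total, so the two styles differ only in when the work happens, not in the code that does it.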
8.
Spark Libraries
Apache Spark
• Spark SQL
• Spark Streaming
• GraphX
• MLlib
• SparkR
9.
Spark on Hadoop
Compute layer: Apache Spark (Spark SQL, Spark Streaming, GraphX, MLlib, SparkR)
Resource management: Apache Hadoop YARN
Storage management: Apache Hadoop HDFS
Nodes: Slave node 1, Slave node 2, … Slave node n
10.
Spark on Mesos
Compute layer: Apache Spark (Spark SQL, Spark Streaming, GraphX, MLlib, SparkR)
Resource management: Apache Mesos
Storage management: Apache Hadoop HDFS
Nodes: Slave node 1, Slave node 2, … Slave node n
11.
Spark as a Service
Compute layer: Apache Spark (Spark SQL, Spark Streaming, GraphX, MLlib, SparkR)
Resource management: Apache Hadoop YARN
Storage management: Amazon S3
Nodes: Amazon EC2 node 1, Amazon EC2 node 2, … Amazon EC2 node n
12.
Spark on the Amazon Cloud
Compute layer: Apache Spark (Spark SQL, Spark Streaming, GraphX, MLlib, SparkR)
Resource management: Apache Hadoop YARN
Storage management: Amazon S3
Nodes: Amazon EC2 node 1, Amazon EC2 node 2, … Amazon EC2 node n
13.
Spark Running in Standalone Mode
Compute layer: Apache Spark (Spark SQL, Spark Streaming, GraphX, MLlib, SparkR)
Resource management and storage management: single node, with local storage
14.
Spark Resilient Distributed Datasets
[Diagram: three slave nodes. On disk, HDFS distributes file blocks (labelled a to d) across the nodes. In memory, Spark distributes the partitions (partition1 to partition3) of RDD1, RDD2, and RDD3 across the same nodes.]
• Spark RDD: in-memory distribution
• HDFS: on-disk distribution
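The "resilient" part of an RDD comes from remembering how each partition was derived, so a lost in-memory partition can be rebuilt rather than restored from a replica. The following is a toy model of that idea in plain Python, not the Spark API: the class `ToyRDD` and its methods are hypothetical names chosen for this sketch, and unlike Spark, which keeps partitions in memory only when asked to, this toy always keeps what it has computed.

```python
class ToyRDD:
    """Toy stand-in for an RDD: partitions plus the recipe (lineage) to rebuild them."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self.parent = parent        # upstream ToyRDD, or None for source data
        self.fn = fn                # per-element function applied to the parent
        self._source = partitions   # only set for source data
        self._cache = {}            # partition index -> data held in memory

    def num_partitions(self):
        if self.parent is None:
            return len(self._source)
        return self.parent.num_partitions()

    def map(self, fn):
        """Record a transformation; nothing is computed at this point."""
        return ToyRDD(parent=self, fn=fn)

    def compute(self, index):
        """Return one partition, reusing the in-memory copy when it exists."""
        if index not in self._cache:
            if self.parent is None:
                data = list(self._source[index])
            else:
                data = [self.fn(x) for x in self.parent.compute(index)]
            self._cache[index] = data
        return self._cache[index]

    def lose_partition(self, index):
        """Simulate a node failure that drops one in-memory partition."""
        self._cache.pop(index, None)

    def collect(self):
        """Gather all partitions, in order, into a single list."""
        return [x for i in range(self.num_partitions()) for x in self.compute(i)]


source = ToyRDD(partitions=[[1, 2], [3, 4], [5, 6]])
squared = source.map(lambda x: x * x)

before = squared.collect()
squared.lose_partition(1)   # partition 1 disappears from memory
after = squared.collect()   # rebuilt from lineage: source partition 1 plus the map function

if __name__ == "__main__":
    print(before, after)
```

Only the missing partition is recomputed; the other two are served from memory, which is why recovery stays cheap compared with rerunning the whole job.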
15.
The Combination: The Flexibility of Spark on a Stable Hadoop Platform
Spark
• In-Memory Performance
• Ease of Development
• Combine Workflows
Hadoop
• Unlimited Scale
• Enterprise Platform
• Wide Range of Data Formats
16.
IBM Open Platform with Apache Hadoop
• 100% open source code
• Commitment to currency: “days, not months”
• Includes Spark
• Free for production use
• Decoupled Apache Hadoop from IBM analytics and data science technologies
• Production support offering available
Apache Open Source Components: HDFS, YARN, MapReduce, Ambari, HBase, Spark, Flume, Hive, Pig, Sqoop, HCatalog, Solr/Lucene
17.
IBM is Committed to Open Source
• Open source technologies are the base for IBM software and solutions
• IBM’s long history of deep open source commitment
  - Apache Software Foundation: Founding member in 1999
  - Cloud Foundry: #1 contributor; basis for Bluemix
  - OpenStack: #4 contributor; basis for IBM’s IaaS
  - Linux: #3 contributor; IBM was the first enterprise backer of Linux
  - Hadoop/Spark: Extensive investment in open source contribution; integration with Analytics software
Diagram labels: Infrastructure, Systems, Application
18.
Goal of the Apache Software Foundation: Let 1000 Flowers Bloom!
• 249 Top Level Projects, 40 Incubating
• 2 Million+ Code Commits
• IBM co-founded the ASF in 1999 and is a Gold Sponsor
• The “Apache Way” is about fostering open innovation
• Not a standards organization
19.
Apache Hadoop Ecosystem: Rapid Innovation, Few Standards
• Distributions include different projects at different version levels
  “This proliferation of baskets [Hadoop distributions with different project versions] creates significant drag when it comes to building reliable applications ... makes it harder for customers to assess which basket of Hadoop that they need and harder for application developers to create solutions that work broadly.”
  - Raymie Stata, CEO, Altiscale
• Even though the project versions match, there are interface differences
  “Setting a baseline of Hive 13 so we get access to some new syntax. Try it on one, it works great... Try it on another that says it also has Hive 13, and we get ‘syntax error’ …”
  - Craig Rubendall, VP, SAS
  “If the industry is truly committed to developing big data technologies and solutions …, it will require an ecosystem of providers … to create a consistent framework around which everyone can develop.”
  - Siki Giunta, SVP, Verizon
• The Hadoop ecosystem is evolving at a faster pace than is comfortable
  “My personal speculation is that it comes from some who have been evaluating for a while seeing change occur so rapidly that they are dropping back for another look.”
  - Merv Adrian, VP, Gartner
20.
• Certify a standard “ODP Core” set of open source Hadoop family projects with specific versions and patch levels
• Develop tools and methods to help solution providers test applications against the ODP Core
• Contribute changes and fixes in the ODP Core Hadoop family projects to the ASF using the ASF processes
http://opendataplatform.org/
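A certified core with pinned versions makes compatibility something that can be checked mechanically. The sketch below illustrates the idea in plain Python. Everything specific in it is an assumption: the slides do not list the ODP Core contents, so the component names and version strings in `ODP_CORE_BASELINE` are placeholders, and `check_against_baseline` is a hypothetical helper, not an ODP tool.

```python
# Placeholder baseline for illustration; the real ODP Core contents and
# versions are not given in the slides.
ODP_CORE_BASELINE = {"hdfs": "1.0.0", "yarn": "1.0.0", "ambari": "1.0.0"}


def check_against_baseline(distribution, baseline=ODP_CORE_BASELINE):
    """Return component -> problem description for every mismatch with the baseline."""
    problems = {}
    for component, required in baseline.items():
        shipped = distribution.get(component)
        if shipped is None:
            problems[component] = "missing"
        elif shipped != required:
            problems[component] = f"expected {required}, found {shipped}"
    return problems


if __name__ == "__main__":
    # A hypothetical distribution that ships a different YARN and no Ambari.
    vendor_distribution = {"hdfs": "1.0.0", "yarn": "1.0.1"}
    for component, problem in check_against_baseline(vendor_distribution).items():
        print(f"{component}: {problem}")
```

An application tested once against the baseline could then rely on any distribution for which this check returns no problems, which is the reduction in testing effort the initiative aims for.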
21.
Open Data Platform Initiative
Representation across the Hadoop ecosystem…
• Hadoop distribution vendors
• Software application providers
• System integrators/consultants
• Hardware vendors
• Customers
… who all believe in the need for a community-based effort to standardize Hadoop, which will lead to improved adoption
22.
IBM Open Platform with Apache Hadoop adopts ODP Core
• BigInsights will include ODP certified Apache packages
  - ODP will initially target core packages of a Hadoop distribution
  - Packages will expand over time
  - First certification set expected this summer
• Our goal for BigInsights on ODP
  - Better compatibility and less testing against ecosystem software
  - Enable IBM Hadoop capabilities to run on other ODP-certified Hadoop distributions
Apache Open Source Components: HDFS, YARN, MapReduce, Ambari, HBase, Spark, Flume, Hive, Pig, Sqoop, HCatalog, Solr/Lucene
ODP *
* Candidate set of certified ODP modules, expected summer 2015
23.
Goal of the ODP: Enable Innovation to Flourish on a Common Platform
• Complements the Apache Software Foundation’s governance model
• ODP efforts focus on integration, testing, and certifying a standard core of Apache Hadoop ecosystem projects
• Fixes for issues found in ODP testing will be contributed to the ASF projects in line with ASF processes
• The ODP will not override or replace any aspect of ASF governance
24.
IBM BigInsights for Apache Hadoop
IBM BigInsights Analyst
• Big SQL
• BigSheets
IBM BigInsights Data Scientist
• Text Analytics
• Machine Learning with Big R
• Big R
• Big SQL
• BigSheets
IBM BigInsights Enterprise Management
• POSIX Distributed File System
• Multi-workload, Multi-tenant scheduling
Base: IBM Open Platform with Apache Hadoop
25.
IBM BigInsights for Apache Hadoop
Your choice of infrastructure and deployment model
• IBM System z
• IBM Power
• Intel Servers
• On Cloud
26.
IBM Analytic Platform Capabilities
IBM Software Integrates and Extends Hadoop and Spark
• Data Warehousing: PureData for Analytics, Operational Analytics
• Entity Extraction and Matching: Big Match
• Security and Compliance: Optim, Guardium Audit and Encryption
• Data Integration and Governance: Information Server
• Enterprise Search: Watson Explorer
• Real-time Analytics: Streams
• Predictive Modeling and Descriptive Statistics: SPSS, Big R and Scalable Algorithms
• Analysis, Reporting, and Exploration: Watson Analytics, Cognos, BigSheets
• Fast, ANSI SQL 2011, and Secure SQL: Big SQL
• Enterprise File System: GPFS-FPO
• Cluster Resource and Workload Management: Platform Symphony
• Large Scale Text Extraction: Big Text
Base: IBM Open Platform with Apache Hadoop
27.
IBM Leads the Market and Analysts Agree
“IBM’s all-in bet on Apache Hadoop clearly has had the biggest impact among developers we polled”
- Evans Big Data Survey
• Leading Hadoop Distribution
• Leading Streaming Analytics Solution
28.
IBM’s Investment in the Big Data Community
Over 250,000 benefit from free Big Data skills training
http://bigdatauniversity.com
29.
Spark Technology Center
• Focal point for IBM investment in Spark
  - Code contributions to Apache Spark project
  - Build industry solutions using Spark
  - Evangelize Spark technology inside/outside IBM
• Agile engagement across IBM divisions
  - Systems: contribute enhancements to Spark core, and optimized infrastructure (hardware/software) for Spark
  - Analytics: IBM Analytics software will exploit Spark processing
  - Research: build innovations above (solutions that use Spark), inside (improvements to Spark core), and below (improve systems that execute Spark) the Spark stack
Goal: To be the #1 contributor and adopter in the Spark ecosystem
30.
The IBM Difference
• IBM delivers the foundation for Big Data, now and in the future
  - Embraces open source
  - Establishes standards
  - Integrates with familiar interfaces and established systems
  - Delivers advanced analytic capabilities
• Enables you to benefit from broader data and analytics capabilities
  - Data Integration and Governance
  - Predictive and Real-time Analytics
• Provides expertise to help you on your journey
  - 6,000 partners
  - Analytics services and solution centers