Apache Spark: Usage and Roadmap in Hadoop
Cloudera Japan
Presentation on Spark to the Tokyo HUG (Hadoop User Group).
1.
© Cloudera, Inc. All rights reserved.
Apache Spark: Usage and Roadmap in Hadoop
Jai Ranganathan
2.
Spark will replace MapReduce
To become the standard execution engine for Hadoop
3.
The Future of Data Processing on Hadoop
Spark complemented by specialized fit-for-purpose engines
• General Data Processing w/Spark: Fast Batch Processing, Machine Learning, and Stream Processing
• Analytic Database w/Impala: Low-Latency Massively Concurrent Queries
• Full-Text Search w/Solr: Querying textual data
• On-Disk Processing w/MapReduce: Jobs at extreme scale and extremely disk IO intensive
Shared:
• Data Storage
• Metadata
• Resource Management
• Administration
• Security
• Governance
4.
Cloudera Leading the Spark Movement
Timeline (2013, 2014, 2015, 2016):
• Identified Spark’s early potential
• Ships and Supports Spark with CDH 4.4
• Spark on YARN integration
• Announces initiative to make Spark the standard execution engine
• Launches first Spark training
• Added security integration
• Cloudera engineers publish O’Reilly Spark book
• Leading effort to further performance, usability, and enterprise-readiness
5.
Community Initiative: Spark Supersedes MapReduce
Community development to port components to Spark:
Stage 1
• Crunch on Spark
• Search on Spark
Stage 2
• Hive on Spark (beta)
• Spark on HBase (beta)
Stage 3
• Pig on Spark (alpha)
• Sqoop on Spark
6.
Cloudera Customer Use Cases
Core Spark
Financial Services
• Portfolio Risk Analysis
• ETL Pipeline Speed-Up
• 20+ years of stock data
Health
• Identify disease-causing genes in the full human genome
• Calculate Jaccard scores on health care data sets
ERP
• Optical Character Recognition and Bill Classification
Data Services
• Trend analysis
• Document classification (LDA)
• Fraud analytics
Spark Streaming
Financial Services
• Online Fraud Detection
Health
• Incident Prediction for Sepsis
Retail
• Online Recommendation Systems
• Real-Time Inventory Management
Ad Tech
• Real-Time Ad Performance Analysis
7.
Apache Spark
Flexible, in-memory data processing for Hadoop
Easy Development
• Rich APIs for Scala, Java, and Python
• Interactive shell
Flexible Extensible API
• APIs for different types of workloads:
  • Batch
  • Streaming
  • Machine Learning
  • Graph
Fast Batch & Stream Processing
• In-Memory processing and caching
8.
The Spark Ecosystem & Hadoop
Hadoop Integration
• Spark-on-YARN integration
• Shares data, metadata, administration, security, & governance
Diagram layers:
• Spark components: Spark Streaming, MLlib, SparkSQL, GraphX, DataFrames, SparkR
• Engines: Spark, Impala, MR, Others
• RESOURCE MANAGEMENT: YARN
• STORAGE: HDFS, HBase
9.
Logistic Regression Performance (Data Fits in Memory)
Chart: Running Time (s) versus # of Iterations (1, 5, 10, 20, 30), MapReduce compared with Spark
• MapReduce: 110 s/iteration
• Spark: First iteration = 80 s; further iterations 1 s due to caching
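The per-iteration figures on the logistic regression slide can be turned into total running times with a little arithmetic. The sketch below assumes a simple linear cost model (every MapReduce iteration costs the same 110 s; Spark pays 80 s once and then 1 s per cached iteration). The function names are hypothetical and the slide itself only gives the per-iteration numbers, so treat the totals as an illustration rather than measured results.

```python
def mapreduce_seconds(iterations: int, per_iteration: float = 110.0) -> float:
    """Total time if every iteration re-reads its input at the same cost."""
    return iterations * per_iteration


def spark_seconds(iterations: int, first: float = 80.0, cached: float = 1.0) -> float:
    """Total time if the first iteration loads the data and later ones hit the cache."""
    if iterations <= 0:
        return 0.0
    return first + (iterations - 1) * cached


# Iteration counts taken from the x-axis of the chart.
for n in (1, 5, 10, 20, 30):
    mr = mapreduce_seconds(n)
    sp = spark_seconds(n)
    print(f"{n:>2} iterations: MapReduce {mr:6.0f} s, Spark {sp:4.0f} s, ratio {mr / sp:.1f}x")
```

Under this model the gap widens with the iteration count, because the one-time 80 s load is amortized over more cached iterations.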
10.
Apache Spark Streaming
What is it?
• Run continuous processing of data using Spark’s core API
• Extends Spark concepts to fault-tolerant, transformable streams
• Adds “rolling window” operations
  • Example: Compute rolling averages or counts for data over last five minutes
Benefits:
• Reuse knowledge and code in both contexts
• Same programming paradigm for streaming and batch
• Simplicity of development
• High-level API with automatic DAG generation
• Excellent throughput
• Scale easily to support large volumes of data ingest
• Combine elements like MLlib and Oryx into streaming applications
Common Use Cases:
• “On-the-fly” ETL as data is ingested into Hadoop/HDFS
• Detect anomalous behavior and trigger alerts
• Continuous reporting of summary metrics for incoming data
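The "rolling window" example from the Spark Streaming slide (rolling averages or counts over the last five minutes) can be sketched in a few lines. This is plain Python, not the Spark Streaming API: it only illustrates the windowing idea on a single machine. The `RollingWindow` class, its method names, and the sample events are all hypothetical, and timestamps are assumed to be in seconds and to arrive in non-decreasing order.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class RollingWindow:
    """Keeps (timestamp, value) events from the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, float]] = deque()
        self._total = 0.0

    def add(self, timestamp: float, value: float) -> None:
        """Record an event, then drop everything that fell out of the window."""
        self._events.append((timestamp, value))
        self._total += value
        cutoff = timestamp - self.window_seconds
        # The window covers (cutoff, timestamp]; older events are evicted.
        while self._events and self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value

    def count(self) -> int:
        return len(self._events)

    def average(self) -> Optional[float]:
        return self._total / len(self._events) if self._events else None


# Hypothetical events: one reading every two minutes, five-minute window.
window = RollingWindow(window_seconds=300.0)
for ts, value in [(0, 10.0), (120, 20.0), (240, 30.0), (360, 40.0)]:
    window.add(ts, value)
    print(f"t={ts:>3}s count={window.count()} average={window.average()}")
```

Keeping a running total alongside the deque means each update costs amortized constant time instead of re-summing the window on every event.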
11.
Spark Streaming Architectures
Diagram components:
• Data Sources
• Ingest, Integration Layer: Flume, Kafka
• Spark Stream Processing: Data Prep, Aggregation / Scoring
• HDFS: Spark Long-Term Analytics / Model Building
• HBase: Real-Time Result Serving
12.
SparkSQL + DataFrames
Machine Learning Applications
• Goal:
  • Spark/Java Developers and Data Scientists can inline SQL into Spark apps
• Designed for:
  • Ease of development for Spark developers
  • Handful of concurrent Spark jobs
• Strengths:
  • Ease of embedding SQL into Java or Scala applications
  • SQL for common functionality in developer flow (e.g., aggregations, filters, samples)
13.
Execution Pipeline
Diagram: SQL (via AST) and DataFrames → Logical Plan → Optimized Logical Plan → Physical Plans → CBO → Selected Plan → RDDs
14.
Uniting Spark and Hadoop
The One Platform Initiative
• Management: Leverage Hadoop-native resource management.
• Security: Full support for Hadoop security and beyond.
• Scale: Enable 10k-node clusters.
• Streaming: Support for 80% of common stream processing workloads.
15.
Detailed Roadmap: One Platform Initiative
Legend (color-coded on the slide): Completed Work, Planned Future Work
Management
• Spark on YARN Integration
• HBase integration
• Improved metrics for monitoring/troubleshooting
• Dynamic Resource Allocation
• Spark on YARN:
• Container resizing
• Dynamic Resource Allocation for Streaming
• Simplified resource configuration
• Improved WebUI for debugging
• Improved metrics for visibility into resource utilization
• Smart auto-tuning of job parameters
Security
• Kerberos Integration
• HDFS Sync (Sentry)
• Secure data at rest
• Secure data over the wire
• Audit/Lineage (Navigator)
• Spark PCI compliance
• Integration with Intel’s advanced encryption libraries
• Enable column and view level security
Scale
• Revamp Scheduler handling of node failure
• Sort based shuffle improvements
• Task Scheduling based on HDFS data locality and caching
• Scheduler improvements for performance at scale
• Stress test at scale with mixed multi-tenant workloads
• HDFS DDM Integration
• Dynamic resource utilization & prioritization
• Scale Spark History Server for 1000s of jobs
Streaming
• Zero Data Loss with Spark Streaming Resilience
• Flume integration
• Kafka integration
• SQL semantics for expressing streaming jobs (Business Users)
• New streaming specific API extensions
• Streaming application management (pause, update, redeploy) via CM
• Optimized state updates: efficient point lookups and delta updates
16.
Spark Resources
• Learn Spark
  • O’Reilly Advanced Analytics with Spark eBook (written by Clouderans)
  • Cloudera Developer Blog
  • cloudera.com/spark
• Get Trained
  • Cloudera Spark Training
• Try it Out
  • Cloudera Live Spark Tutorial
17.
Try It With Cloudera Live
cloudera.com/live
Featuring tutorials on: CDH
18.
Thank You
Jairam Ranganathan
jairam@cloudera.com