Intro to Cassandra + Hadoop


Jeremy Hanna

A high-level introduction to using Hadoop analytics over data stored in Cassandra.


Cassandra + Hadoop: An Introduction to Hadoop Analytics over Cassandra Data

Introductions
  What is Cassandra?
    A highly scalable distributed data store
    Born at Facebook, grew up in the community
  What is Hadoop?
    A set of Apache projects
    Deal with Big Data in a distributed way
    Open source versions of MapReduce, GFS, BigTable, as well as additions, such as Pig and Hive

What makes them compatible?
  Cassandra is great at a lot of things
    Fast, extremely scalable writes, fast random reads
    Flexible semi-structured data model
  Not as good with ad-hoc answers
  Enter Hadoop
    MapReduce, Pig, and Hive are extensible
    Output from Hadoop into Cassandra

MapReduce
  Input from Cassandra as of 0.6.x
  Baked in output to Cassandra as of 0.7.0
  Streaming support is coming in 0.7
  Example: WordCount
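
The WordCount example named on this slide is not reproduced in the transcript. As a rough illustration of the MapReduce model only, here is a minimal in-memory sketch in Python. It does not use Hadoop or Cassandra's input/output formats; the ROWS dictionary is a hypothetical stand-in for rows read from a column family, and the function names are made up for this sketch.

```python
from collections import defaultdict

# Hypothetical stand-in for Cassandra rows: row key -> text column value.
ROWS = {
    "row1": "the quick brown fox",
    "row2": "the lazy dog",
    "row3": "the fox",
}


def map_phase(row_key, text):
    """Emit a (word, 1) pair for every word in one row's text value."""
    for word in text.split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(rows):
    """Run map, shuffle, and reduce over all rows and return word totals."""
    pairs = (pair for key, text in rows.items() for pair in map_phase(key, text))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(ROWS))
```

In a real job the map and reduce functions would run on separate nodes, with the input split by Cassandra token range and the shuffle handled by Hadoop; the sketch only shows how the three phases fit together.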

Pig
  What is Pig?
    A platform for data analytics developed at Yahoo!
    Includes PigLatin, Grunt shell, and interpreter that compiles down to MapReduce
    Simplifies data analysis
  Cassandra integration
    Stu Hood added Pig integration in Cassandra 0.6
  Example: WordCount with Pig
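
The Pig WordCount script itself is not part of the transcript. To give a feel for the dataflow style Pig encourages, the sketch below mirrors the usual steps of a word count (tokenize, group, count) in plain Python. The comments name the Pig Latin operators each step loosely corresponds to; the function name and sample input are hypothetical, and nothing here talks to Pig or Cassandra.

```python
from itertools import groupby


def pig_style_word_count(lines):
    """Count words using a load -> tokenize -> group -> count pipeline."""
    # Roughly FOREACH ... GENERATE FLATTEN(TOKENIZE(line)): one record per word.
    words = [word for line in lines for word in line.split()]
    # Roughly GROUP words BY word; itertools.groupby needs sorted input.
    grouped = groupby(sorted(words))
    # Roughly FOREACH grouped GENERATE group, COUNT(words).
    return [(word, sum(1 for _ in members)) for word, members in grouped]


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    for word, count in pig_style_word_count(sample):
        print(word, count)
```

Each statement describes a transformation of a whole relation rather than explicit map and reduce functions, which is the main simplification Pig offers over hand-written MapReduce.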

Hive
  What is Hive?
    A platform for data analytics developed at Facebook
    Draws from the familiar SQL -> Hive QL
    Compiles down to MapReduce
  Cassandra integration
    Availability of a Cassandra storage handler is coming soon (HIVE-1434)

Example Use Case
  Raptr.com
    Gaming statistics and achievements across platforms
    Home-grown -> Cassandra + Hadoop (Pig)
    Idea to execution much faster
    Query runtime from hours to 10-15 minutes

Questions
  Contact
    Email: jeremy.hanna@rackspace.com
    Twitter: @jeromatron
    IRC: jeromatron on irc.freenode.net - #cassandra, #hadoop
  Further information
    http://wiki.apache.org/cassandra/HadoopSupport
    Cassandra: The Definitive Guide
