SlideShare uma empresa Scribd logo
1 de 32
Big Data – 4 V’s
NoSQL 
• NoSQL is all about scalability 
• Scaling to size 
• Scaling to complexity 
• Deliver Heavy R/W workloads 
• Data duplication and denormalization are first-class 
citizens
RDBMS vs NoSQL
No SQL Types
Database Chart
CAP Theorem
Re Check.. 
• What is CAP theorem? 
• Does NoSQL supports Transaction? 
• NoSQL Types?
HBase 
• Scalable, distributed data store 
• Sorted map of maps / Key- Value store 
• Open source avatar of Google’s Bigtable 
• Sparse 
• Multi dimensional 
• Tightly integrated with Hadoop 
• Not a RDBMS
Architecture 
HDFS((DataNodes) 
Storage 
ZooKeeper 
Membership management 
RegionServers 
Serve the regions 
HBase Masters 
Janitorial work
Column Oriented
Distributed
Variable number of columns
Important Terms 
• Table 
• Consists of rows and columns 
• Row 
• Has a bunch of columns. 
• Identified by a rowkey (primary’ key) 
• Column Qualifier 
• Dynamic column name 
• Column Family 
• Column groups - logical and physical (Similar access pattern) 
• Cell 
• The actual element that contains the data for a row-column insertion 
• Version 
• Every cell has multiple versions
Logical & Tall(v/s(Wide(tab Plehsy(sstiocraal gSet(rfuocottuprreint 
CF1 CF2 
r1 c1:v1 c1:v9 c6:v2 
r2 c1:v2 c3:v6 
r3 c2:v3 c5:v6 
r4 c2:v4 
r5 c1:v1 c3:v5 c7:v8 
HFile for CF1 HFile for CF2 
r1:CF1:c1:t1:v1 
r2:CF1:c1:t2:v2 
r2:CF1:c3:t3:v6 
r3:CF1:c2:t1:v3 
r4:CF1:c2:t1:v4 
r5:CF1:c1:t2:v1 
r5:CF1:c3:t3:v5 
r1:CF2:c1:t1:v9 
r1:CF2:c6:t4:v2 
r3:CF2:c5:t4:v6 
r5:CF2:c7:t3:v8 
Result object returned for a Get() on row r5 
r5:CF1:c1:t2:v1 
r5:CF1:c3:t3:v5 
r5:cf2:c7:t3:v8 
KeyValue objects 
Cell 
Value 
Time 
Stamp 
Col 
Qual 
Col 
Fam 
Row 
Key 
Key Value 
Logical representation of an HBase table. 
We'll look at what it means to Get() row r5 from this table. 
Actual physical storage of the table 
Structure of a KeyValue object
(J)Ruby Shell Commands 
• General 
• DDL 
• Create 
• Describe 
• Namespace 
• DML 
• Put 
• Get 
• Scan 
• Delete 
• Tools 
• Replication 
• Snapshot 
• Security 
• Visibility 
Creating Table: 
create 'DEVICE_DETAIL','BASIC_INFO','CONTRACT_INFO' 
Data Generation : 
put 'DEVICE_DETAIL','Device1','BASIC_INFO:IP_ADDR','10.10.10.10' 
put 'DEVICE_DETAIL','Device2','BASIC_INFO:IP_ADDR','20.20.20.20' 
Descripting Table: 
describe 'DEVICE_DETAIL' 
Alert Info : 
alter 'DEVICE_DETAIL',{NAME => 'CONTRACT_INFO',VERSIONS => 3 } 
Update Data: 
put 'DEVICE_DETAIL','Device2','CONTRACT_INFO:CONTRACT_NUMBER','22222222' 
Multi- Version Example : 
get 'DEVICE_DETAIL','Device2', {COLUMN=>'CONTRACT_INFO:CONTRACT_NUMBER', VERSIONS=>2} 
Scan Info: 
scan 'DEVICE_DETAIL’ 
Scan with Filter : 
scan 'DEVICE_DETAIL' , { COLUMNS => 'CONTRACT_INFO:STATUS', LIMIT => 10, FILTER => 
"ValueFilter( =, 'binary:IN_ACTIVE' )" } 
Delete Info: 
delete 'DEVICE_DETAIL','Device2','CONTRACT_INFO:STATUS'
Java API 
• HTable 
• HBaseAdmin 
• HTablePool 
• Get 
• Put 
• Delete 
• Scan 
• Increment 
• HTableDescriptor 
• HTableInterface 
• Result 
• ResultScanner 
• KeyValue 
HTable table = new HTable(configuration, hbasetablename); 
Put row = new Put(Bytes.toBytes(rowKey)); 
row.add(Bytes.toBytes(columnFamily), Bytes.toBytes(key), 
Bytes.toBytes(value)); 
Get getKey = new Get(Bytes.toBytes(key)); 
Result result = table.get(getKey);
Spark HBase 
// create configuration 
val config = HBaseConfiguration.create() 
config.set("hbase.zookeeper.quorum", "localhost") 
config.set("hbase.zookeeper.property.clientPort","2181") 
config.set("hbase.mapreduce.inputtable", "hbaseTableName") 
// read data 
val hbaseData = sparkContext.hadoopRDD(new JobConf(config), classOf[TableInputFormat], 
classOf[ImmutableBytesWritable], classOf[Result]) 
// count rows 
println(hbaseData.count)
HBase Architecture
Write & Read Logic
SQL
Re Check.. 
• Column family? 
• HBase components? 
• Name few Shell commands? 
• Version in HBase?
Reference Slides
Use Case 
• Canonical(use(case:(storing(crawl(data(and(indices(for(search 
14 
1 
Web Search 
powered by Bigtable 
Crawlers 
Crawlers 
1 Crawlers constantly scour the Internet for new pages. 
Those pages are stored as individual records in Bigtable. 3 
2 A MapReduce job runs over the entire table, generating 
search indexes for the Web Search application. 
4 
2 
5 
Indexing the Internet 
Searching the Internet 
3 The user initiates a Web Search request. 
4 The Web Search application queries the Search Indexes 
and retries matching documents directly from Bigtable. 
5 Search results are presented to the user. 
Internets Bigtable 
Crawlers 
Crawlers 
MapReduce 
You 
Search 
InSdeeaxrch 
InSdeeaxrch 
Index 
Web Search
Hbase Architecture
Replications
CAP Theorem

Mais conteúdo relacionado

Mais procurados

Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars GeorgeJAX London
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseHBaseCon
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseCloudera, Inc.
 
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightOptimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightHBaseCon
 
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...Cloudera, Inc.
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestHBaseCon
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drilltshiran
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseNick Dimiduk
 
Apache Spark on Apache HBase: Current and Future
Apache Spark on Apache HBase: Current and Future Apache Spark on Apache HBase: Current and Future
Apache Spark on Apache HBase: Current and Future HBaseCon
 
Efficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoEfficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoHyunsik Choi
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHBaseCon
 
Tajo Seoul Meetup July 2015 - What's New Tajo 0.11
Tajo Seoul Meetup July 2015 - What's New Tajo 0.11Tajo Seoul Meetup July 2015 - What's New Tajo 0.11
Tajo Seoul Meetup July 2015 - What's New Tajo 0.11Hyunsik Choi
 
HBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDKHBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDKHBaseCon
 
HBaseCon 2013: Compaction Improvements in Apache HBase
HBaseCon 2013: Compaction Improvements in Apache HBaseHBaseCon 2013: Compaction Improvements in Apache HBase
HBaseCon 2013: Compaction Improvements in Apache HBaseCloudera, Inc.
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...Cloudera, Inc.
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014larsgeorge
 
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...Chicago Hadoop Users Group
 
An Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache HadoopAn Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache HadoopChicago Hadoop Users Group
 

Mais procurados (20)

Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars George
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBase
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
 
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightOptimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
 
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
HBaseCon 2012 | Mignify: A Big Data Refinery Built on HBase - Internet Memory...
 
Apache hadoop hbase
Apache hadoop hbaseApache hadoop hbase
Apache hadoop hbase
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ Pinterest
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBase
 
Apache Spark on Apache HBase: Current and Future
Apache Spark on Apache HBase: Current and Future Apache Spark on Apache HBase: Current and Future
Apache Spark on Apache HBase: Current and Future
 
Efficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoEfficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajo
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
 
Tajo Seoul Meetup July 2015 - What's New Tajo 0.11
Tajo Seoul Meetup July 2015 - What's New Tajo 0.11Tajo Seoul Meetup July 2015 - What's New Tajo 0.11
Tajo Seoul Meetup July 2015 - What's New Tajo 0.11
 
HBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDKHBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDK
 
HBaseCon 2013: Compaction Improvements in Apache HBase
HBaseCon 2013: Compaction Improvements in Apache HBaseHBaseCon 2013: Compaction Improvements in Apache HBase
HBaseCon 2013: Compaction Improvements in Apache HBase
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
 
Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014
 
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic...
 
An Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache HadoopAn Introduction to Impala – Low Latency Queries for Apache Hadoop
An Introduction to Impala – Low Latency Queries for Apache Hadoop
 

Destaque

Apache hbase overview (20160427)
Apache hbase overview (20160427)Apache hbase overview (20160427)
Apache hbase overview (20160427)Steve Min
 
The Hive Overview
The Hive OverviewThe Hive Overview
The Hive Overviewthehivecs
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoopnvvrajesh
 
Base de données graphe et Neo4j
Base de données graphe et Neo4jBase de données graphe et Neo4j
Base de données graphe et Neo4jBoris Guarisma
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4jNeo4j
 
Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...
Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...
Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...Serious Detecting
 
Graphes et détection de fraude : exemple de l'assurance
Graphes et détection de fraude : exemple de l'assuranceGraphes et détection de fraude : exemple de l'assurance
Graphes et détection de fraude : exemple de l'assuranceLinkurious
 
Introduction à Neo4j
Introduction à Neo4jIntroduction à Neo4j
Introduction à Neo4jNeo4j
 
Présentation des bases de données orientées graphes
Présentation des bases de données orientées graphesPrésentation des bases de données orientées graphes
Présentation des bases de données orientées graphesKoffi Sani
 

Destaque (10)

Apache hbase overview (20160427)
Apache hbase overview (20160427)Apache hbase overview (20160427)
Apache hbase overview (20160427)
 
The Hive Overview
The Hive OverviewThe Hive Overview
The Hive Overview
 
HBASE Overview
HBASE OverviewHBASE Overview
HBASE Overview
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
Base de données graphe et Neo4j
Base de données graphe et Neo4jBase de données graphe et Neo4j
Base de données graphe et Neo4j
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
 
Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...
Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...
Instruction Manual Minelab X-TERRA 705 Metal Detector French Language (4901-0...
 
Graphes et détection de fraude : exemple de l'assurance
Graphes et détection de fraude : exemple de l'assuranceGraphes et détection de fraude : exemple de l'assurance
Graphes et détection de fraude : exemple de l'assurance
 
Introduction à Neo4j
Introduction à Neo4jIntroduction à Neo4j
Introduction à Neo4j
 
Présentation des bases de données orientées graphes
Présentation des bases de données orientées graphesPrésentation des bases de données orientées graphes
Présentation des bases de données orientées graphes
 

Semelhante a NoSQL & HBase overview

Introduction to Apache HBase, MapR Tables and Security
Introduction to Apache HBase, MapR Tables and SecurityIntroduction to Apache HBase, MapR Tables and Security
Introduction to Apache HBase, MapR Tables and SecurityMapR Technologies
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtMichael Stack
 
Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14Jeremy Walsh
 
H base introduction & development
H base introduction & developmentH base introduction & development
H base introduction & developmentShashwat Shriparv
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) BigDataEverywhere
 
Performing Data Science with HBase
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBaseWibiData
 
NoSQL HBase schema design and SQL with Apache Drill
NoSQL HBase schema design and SQL with Apache Drill NoSQL HBase schema design and SQL with Apache Drill
NoSQL HBase schema design and SQL with Apache Drill Carol McDonald
 
Advance Hive, NoSQL Database (HBase) - Module 7
Advance Hive, NoSQL Database (HBase) - Module 7Advance Hive, NoSQL Database (HBase) - Module 7
Advance Hive, NoSQL Database (HBase) - Module 7Rohit Agrawal
 
Getting started with HBase
Getting started with HBaseGetting started with HBase
Getting started with HBaseCarol McDonald
 
Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018Aman Sinha
 
Hadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectHadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectMao Geng
 
Hypertable - massively scalable nosql database
Hypertable - massively scalable nosql databaseHypertable - massively scalable nosql database
Hypertable - massively scalable nosql databasebigdatagurus_meetup
 
SQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTPSQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTPTony Rogerson
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkCloudera, Inc.
 
Adding Value to HBase with IBM InfoSphere BigInsights and BigSQL
Adding Value to HBase with IBM InfoSphere BigInsights and BigSQLAdding Value to HBase with IBM InfoSphere BigInsights and BigSQL
Adding Value to HBase with IBM InfoSphere BigInsights and BigSQLPiotr Pruski
 
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...Modern Data Stack France
 
Bay area Cassandra Meetup 2011
Bay area Cassandra Meetup 2011Bay area Cassandra Meetup 2011
Bay area Cassandra Meetup 2011mubarakss
 

Semelhante a NoSQL & HBase overview (20)

Introduction to Apache HBase, MapR Tables and Security
Introduction to Apache HBase, MapR Tables and SecurityIntroduction to Apache HBase, MapR Tables and Security
Introduction to Apache HBase, MapR Tables and Security
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
 
Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14
 
H base introduction & development
H base introduction & developmentH base introduction & development
H base introduction & development
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
 
Performing Data Science with HBase
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBase
 
NoSQL HBase schema design and SQL with Apache Drill
NoSQL HBase schema design and SQL with Apache Drill NoSQL HBase schema design and SQL with Apache Drill
NoSQL HBase schema design and SQL with Apache Drill
 
מיכאל
מיכאלמיכאל
מיכאל
 
Advance Hive, NoSQL Database (HBase) - Module 7
Advance Hive, NoSQL Database (HBase) - Module 7Advance Hive, NoSQL Database (HBase) - Module 7
Advance Hive, NoSQL Database (HBase) - Module 7
 
Getting started with HBase
Getting started with HBaseGetting started with HBase
Getting started with HBase
 
HBase.pptx
HBase.pptxHBase.pptx
HBase.pptx
 
Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
Hadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectHadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log project
 
Hypertable - massively scalable nosql database
Hypertable - massively scalable nosql databaseHypertable - massively scalable nosql database
Hypertable - massively scalable nosql database
 
SQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTPSQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTP
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache Spark
 
Adding Value to HBase with IBM InfoSphere BigInsights and BigSQL
Adding Value to HBase with IBM InfoSphere BigInsights and BigSQLAdding Value to HBase with IBM InfoSphere BigInsights and BigSQL
Adding Value to HBase with IBM InfoSphere BigInsights and BigSQL
 
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
 
Bay area Cassandra Meetup 2011
Bay area Cassandra Meetup 2011Bay area Cassandra Meetup 2011
Bay area Cassandra Meetup 2011
 

Mais de Venkata Naga Ravi

Mais de Venkata Naga Ravi (12)

Microservices with Docker
Microservices with Docker Microservices with Docker
Microservices with Docker
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Quick Trip with Docker
Quick Trip with DockerQuick Trip with Docker
Quick Trip with Docker
 
Glint with Apache Spark
Glint with Apache SparkGlint with Apache Spark
Glint with Apache Spark
 
Flocker
FlockerFlocker
Flocker
 
Big Data Benchmarking
Big Data BenchmarkingBig Data Benchmarking
Big Data Benchmarking
 
Go Lang
Go LangGo Lang
Go Lang
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Software Defined Network - SDN
Software Defined Network - SDNSoftware Defined Network - SDN
Software Defined Network - SDN
 
Virtual Container - Docker
Virtual Container - Docker Virtual Container - Docker
Virtual Container - Docker
 
Java 8 Lambda and Streams
Java 8 Lambda and StreamsJava 8 Lambda and Streams
Java 8 Lambda and Streams
 
In Memory Analytics with Apache Spark
In Memory Analytics with Apache SparkIn Memory Analytics with Apache Spark
In Memory Analytics with Apache Spark
 

Último

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 

Último (20)

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 

NoSQL & HBase overview

  • 1.
  • 2. Big Data – 4 V’s
  • 3. NoSQL • NoSQL is all about scalability • Scaling to size • Scaling to complexity • Deliver Heavy R/W workloads • Data duplication and denormalization are first-class citizens
  • 8. Re Check.. • What is CAP theorem? • Does NoSQL supports Transaction? • NoSQL Types?
  • 9. HBase • Scalable, distributed data store • Sorted map of maps / Key- Value store • Open source avatar of Google’s Bigtable • Sparse • Multi dimensional • Tightly integrated with Hadoop • Not a RDBMS
  • 10.
  • 11. Architecture HDFS((DataNodes) Storage ZooKeeper Membership management RegionServers Serve the regions HBase Masters Janitorial work
  • 15. Important Terms • Table • Consists of rows and columns • Row • Has a bunch of columns. • Identified by a rowkey (primary’ key) • Column Qualifier • Dynamic column name • Column Family • Column groups - logical and physical (Similar access pattern) • Cell • The actual element that contains the data for a row-column insertion • Version • Every cell has multiple versions
  • 16. Logical & Tall(v/s(Wide(tab Plehsy(sstiocraal gSet(rfuocottuprreint CF1 CF2 r1 c1:v1 c1:v9 c6:v2 r2 c1:v2 c3:v6 r3 c2:v3 c5:v6 r4 c2:v4 r5 c1:v1 c3:v5 c7:v8 HFile for CF1 HFile for CF2 r1:CF1:c1:t1:v1 r2:CF1:c1:t2:v2 r2:CF1:c3:t3:v6 r3:CF1:c2:t1:v3 r4:CF1:c2:t1:v4 r5:CF1:c1:t2:v1 r5:CF1:c3:t3:v5 r1:CF2:c1:t1:v9 r1:CF2:c6:t4:v2 r3:CF2:c5:t4:v6 r5:CF2:c7:t3:v8 Result object returned for a Get() on row r5 r5:CF1:c1:t2:v1 r5:CF1:c3:t3:v5 r5:cf2:c7:t3:v8 KeyValue objects Cell Value Time Stamp Col Qual Col Fam Row Key Key Value Logical representation of an HBase table. We'll look at what it means to Get() row r5 from this table. Actual physical storage of the table Structure of a KeyValue object
  • 17. (J)Ruby Shell Commands • General • DDL • Create • Describe • Namespace • DML • Put • Get • Scan • Delete • Tools • Replication • Snapshot • Security • Visibility Creating Table: create 'DEVICE_DETAIL','BASIC_INFO','CONTRACT_INFO' Data Generation : put 'DEVICE_DETAIL','Device1','BASIC_INFO:IP_ADDR','10.10.10.10' put 'DEVICE_DETAIL','Device2','BASIC_INFO:IP_ADDR','20.20.20.20' Descripting Table: describe 'DEVICE_DETAIL' Alert Info : alter 'DEVICE_DETAIL',{NAME => 'CONTRACT_INFO',VERSIONS => 3 } Update Data: put 'DEVICE_DETAIL','Device2','CONTRACT_INFO:CONTRACT_NUMBER','22222222' Multi- Version Example : get 'DEVICE_DETAIL','Device2', {COLUMN=>'CONTRACT_INFO:CONTRACT_NUMBER', VERSIONS=>2} Scan Info: scan 'DEVICE_DETAIL’ Scan with Filter : scan 'DEVICE_DETAIL' , { COLUMNS => 'CONTRACT_INFO:STATUS', LIMIT => 10, FILTER => "ValueFilter( =, 'binary:IN_ACTIVE' )" } Delete Info: delete 'DEVICE_DETAIL','Device2','CONTRACT_INFO:STATUS'
  • 18. Java API • HTable • HBaseAdmin • HTablePool • Get • Put • Delete • Scan • Increment • HTableDescriptor • HTableInterface • Result • ResultScanner • KeyValue HTable table = new HTable(configuration, hbasetablename); Put row = new Put(Bytes.toBytes(rowKey)); row.add(Bytes.toBytes(columnFamily), Bytes.toBytes(key), Bytes.toBytes(value)); Get getKey = new Get(Bytes.toBytes(key)); Result result = table.get(getKey);
  • 19. Spark HBase // create configuration val config = HBaseConfiguration.create() config.set("hbase.zookeeper.quorum", "localhost") config.set("hbase.zookeeper.property.clientPort","2181") config.set("hbase.mapreduce.inputtable", "hbaseTableName") // read data val hbaseData = sparkContext.hadoopRDD(new JobConf(config), classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result]) // count rows println(hbaseData.count)
  • 21. Write & Read Logic
  • 22. SQL
  • 23. Re Check.. • Column family? • HBase components? • Name few Shell commands? • Version in HBase?
  • 25.
  • 26.
  • 27.
  • 28. Use Case • Canonical(use(case:(storing(crawl(data(and(indices(for(search 14 1 Web Search powered by Bigtable Crawlers Crawlers 1 Crawlers constantly scour the Internet for new pages. Those pages are stored as individual records in Bigtable. 3 2 A MapReduce job runs over the entire table, generating search indexes for the Web Search application. 4 2 5 Indexing the Internet Searching the Internet 3 The user initiates a Web Search request. 4 The Web Search application queries the Search Indexes and retries matching documents directly from Bigtable. 5 Search results are presented to the user. Internets Bigtable Crawlers Crawlers MapReduce You Search InSdeeaxrch InSdeeaxrch Index Web Search
  • 31.

Notas do Editor

  1. Most NoSQL stores lack true ACID transactions, although a few recent systems, such as FairCom c-treeACE, Google Spanner (though technically a NewSQL database) and FoundationDB, have made them central to their designs. Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value Eventually consistent services are often classified as providing BASE (Basically Available, Soft state, Eventual consistency) semantics, in contrast to traditional ACID (Atomicity, Consistency, Isolation, Durability) guarantees.
  2. http://blog.monitis.com/2011/05/22/picking-the-right-nosql-database-tool/
  3. Eric Brewer’s CAP theorem says that if you want consistency, availability, and partition tolerance, you have to settle for two out of three. (For a distributed system, partition tolerance means the system will continue to work unless there is a total network failure. A few nodes can fail and the system keeps going.) Consistency means that each client always has the same view of the data. Availability means that all clients can always read and write. Partition tolerance means that the system works well across physical network partitions.
  4. http://localhost:60010/master-status
  5. Eric Brewer’s CAP theorem says that if you want consistency, availability, and partition tolerance, you have to settle for two out of three. (For a distributed system, partition tolerance means the system will continue to work unless there is a total network failure. A few nodes can fail and the system keeps going.) Consistency means that each client always has the same view of the data. Availability means that all clients can always read and write. Partition tolerance means that the system works well across physical network partitions.