Google Bigtable


   Aishwarya Panchbhai




Plan for today …
 Google Scale – Motivation for Bigtable
 How do existing storage solutions compare?
 Overview of Bigtable – Data Model
 A Typical Bigtable Cell
 Compactions
 Performance Evaluation
 Lessons learnt

Google Scale
 Workload
   - Tens of billions of documents (perhaps hundreds of billions?)
   - 10 KB/doc => hundreds of terabytes
   - Web growing at ~5 exabytes/year (growing at 30%) *
   Q: How much is an exabyte? 1000^6 bytes (10^18 bytes)
 Lots of Different kinds of data!
   - Crawling system
     URLs, contents, links, anchors, PageRank, etc.
   - Per-user data: preferences, recent queries / search history
   - Geographic data, images, etc.


                                     * Source: How much information is out there?
Google Philosophy
 Problem: Every Google service sees continuing growth in
  computational needs
  - More Queries
      More Users
  - More Data
      Bigger web, mailbox, blog etc
  - Better Results
      Find the Right information, and find it faster
  [Diagram labels: More data, More queries, Better results]
 Solution?
   Need for more computing power – large, scalable infrastructure
Existing storage solutions?
 Scale is too large for commercial databases
 Commercial databases may not run on Google's commodity hardware
 No dependence on other vendors
 Optimizations
 Better price/performance
 Building internally means the system can be applied
  across many projects for low incremental cost.
Q: How large is the largest database installation?
2005 WinterCorp TopTen Survey
Bigtable
 Distributed multi-level map
 Fault-tolerant, persistent => GFS
 Scalable
  - Thousands of servers
  - Millions of reads/writes, efficient scans
 Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load-imbalance
Bigtable vs. DBMS
 Fast query rate
  - No joins, no SQL support, column-oriented database
  - Uses one Bigtable instead of many normalized tables
 Is not even in 1NF in the traditional sense
 Designed to support historical queries
  timestamp field => what did this webpage look like yesterday?
 Data compression is easier – rows are sparse
Data model: a big map
• <Row, Column, Timestamp> triple as the key; lookup, insert, and delete API
• Arbitrary "columns" on a row-by-row basis
    • Column names have the form family:qualifier. Families are heavyweight, qualifiers lightweight
    • Column-oriented physical store; rows are sparse!
• Does not support a relational model
    • No table-wide integrity constraints
    • No multi-row transactions
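
The "big map" idea can be illustrated with a small in-memory sketch. This is not Bigtable's API: the class name TinyTable, its method names, and the sample row and column names are all assumptions made up for illustration. It only shows a map keyed by (row, column, timestamp) with lookup, insert, and delete, where a lookup can ask for the version visible at an earlier timestamp.

```python
from typing import Dict, Optional, Tuple

# Key is the (row, "family:qualifier", timestamp) triple from the slide.
Key = Tuple[str, str, int]


class TinyTable:
    """Toy in-memory stand-in for the (row, column, timestamp) -> value map."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def insert(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        # Rows are sparse: only the columns actually written take up space.
        self._cells[(row, column, timestamp)] = value

    def lookup(self, row: str, column: str, at: Optional[int] = None) -> Optional[bytes]:
        """Return the newest value with timestamp <= at (newest overall if at is None)."""
        versions = [
            (ts, value)
            for (r, c, ts), value in self._cells.items()
            if r == row and c == column and (at is None or ts <= at)
        ]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]

    def delete(self, row: str, column: str) -> None:
        """Remove every stored version of one cell."""
        for key in [k for k in self._cells if k[0] == row and k[1] == column]:
            del self._cells[key]


# Hypothetical sample data: two versions of a page plus one anchor column.
table = TinyTable()
table.insert("com.cnn.www", "contents:", 1, b"<html>v1</html>")
table.insert("com.cnn.www", "contents:", 2, b"<html>v2</html>")
table.insert("com.cnn.www", "anchor:cnnsi.com", 1, b"CNN")
```

Calling table.lookup("com.cnn.www", "contents:", at=1) answers the "what did this page look like yesterday?" style of query by returning the older version.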
SSTable
 Immutable, sorted file of key-value pairs
 Chunks of data plus an index
   Index is of block ranges, not values




   [Diagram: an SSTable made of 64K blocks followed by an index]
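
A minimal sketch of the block-plus-index layout, under simplifying assumptions: blocks hold a fixed number of entries rather than 64K bytes, everything lives in memory, and the class name ToySSTable is hypothetical. The index stores only the first key of each block (block ranges, not values), so a read does one binary search over the index and then scans a single block.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class ToySSTable:
    """Immutable, sorted key-value file modelled as blocks plus a block index."""

    def __init__(self, items: Dict[str, str], block_size: int = 2) -> None:
        pairs: List[Tuple[str, str]] = sorted(items.items())
        # Split the sorted pairs into fixed-size blocks (entries, not bytes, here).
        self.blocks = [pairs[i:i + block_size] for i in range(0, len(pairs), block_size)]
        # The index records the first key of each block, not every key.
        self.index = [block[0][0] for block in self.blocks]

    def get(self, key: str) -> Optional[str]:
        # Find the last block whose first key is <= the requested key.
        pos = bisect.bisect_right(self.index, key) - 1
        if pos < 0:
            return None
        # Scan only that one block.
        for k, v in self.blocks[pos]:
            if k == key:
                return v
        return None


sst = ToySSTable(
    {"apple": "1", "aardvark": "2", "banana": "3", "cherry": "4", "date": "5"}
)
```

The class has no method that changes a block after construction, which mirrors the "immutable" bullet: new data goes into a new SSTable instead.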
Tablet
  Large tables broken into tablets at row boundaries
     - Tablets hold contiguous rows
     - Approx 100 – 200 MB of data per tablet
  Approx 100 tablets per machine
     - Fast recovery
     - Load-balancing
  Built out of multiple SSTables
  [Diagram: a tablet with Start: aardvark and End: apple, built from two SSTables, each made of 64K blocks plus an index]
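
Because tablets hold contiguous row ranges, finding the tablet for a row is a search over range boundaries. The sketch below assumes inclusive start and end keys (as the aardvark/apple diagram suggests) and uses hypothetical names (Tablet, find_tablet) and sample ranges; it is an illustration, not the production routing code.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tablet:
    start: str  # first row key covered (inclusive, assumed)
    end: str    # last row key covered (inclusive, assumed)


def find_tablet(tablets: List[Tablet], row: str) -> Tablet:
    """Return the tablet whose row range contains `row`.

    `tablets` must be sorted by row range and must not overlap.
    """
    ends = [t.end for t in tablets]
    # First tablet whose end key is >= row.
    pos = bisect.bisect_left(ends, row)
    if pos == len(tablets) or row < tablets[pos].start:
        raise KeyError(row)
    return tablets[pos]


# Hypothetical table split into three tablets at row boundaries.
tablets = [
    Tablet("aardvark", "apple"),
    Tablet("apricot", "cherry"),
    Tablet("date", "zebra"),
]
```

Keeping tablets small (the slide's 100 to 200 MB) means a failed machine's roughly 100 tablets can be handed to many other machines at once, which is where the fast recovery and load-balancing bullets come from.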
A Typical Bigtable Cell
  [Diagram components]
  Bigtable client
     - Client library: Open(), read, write
     - Read and write requests go to the tablet servers
  Bigtable master
     - Performs metadata ops, load-balancing
  Bigtable tablet servers (three shown)
     - Serve data
  Cluster scheduling system
     - Handles failover, monitoring
  Google File System (GFS)
     - Holds tablet data, logs
  Lock service
     - Holds metadata, handles master election
Finding a tablet




   3-level lookup scheme
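
The slide's figure is missing, so here is a rough sketch of a three-level lookup, following the scheme described in the Bigtable paper: a file in the lock service names the root tablet, the root tablet points to METADATA tablets, and each METADATA tablet points to user tablets. All names, end keys, and the dictionary layout below are hypothetical simplifications for illustration.

```python
import bisect
from typing import Dict, List, Tuple

# A level is a list of (end_row_key, target_name) pairs sorted by end key.
Level = List[Tuple[str, str]]


def route(level: Level, key: str) -> str:
    """Pick the first entry whose end key is >= key."""
    ends = [end for end, _ in level]
    pos = bisect.bisect_left(ends, key)
    if pos == len(level):
        raise KeyError(key)
    return level[pos][1]


# Level 1: the lock-service file holds the root tablet's location.
LOCK_SERVICE_FILE = "root-tablet"

# Level 2: the root tablet maps row ranges to METADATA tablets.
ROOT: Dict[str, Level] = {
    "root-tablet": [("m", "meta-1"), ("\uffff", "meta-2")],
}

# Level 3: each METADATA tablet maps row ranges to user tablets.
METADATA: Dict[str, Level] = {
    "meta-1": [("apple", "tablet-A"), ("m", "tablet-B")],
    "meta-2": [("\uffff", "tablet-C")],
}


def find_user_tablet(row: str) -> str:
    root_name = LOCK_SERVICE_FILE               # level 1
    meta_name = route(ROOT[root_name], row)     # level 2
    return route(METADATA[meta_name], row)      # level 3
```

In practice a client would cache the locations it learns, so most reads skip these hops; that caching is not modelled here.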
Compactions
 Minor compaction – convert the memtable into an SSTable
   Reduce memory usage
   Reduce log traffic on restart
 Merging compaction
   Periodically executed in the background
   Reduce number of SSTables
   Good place to apply policy “keep only N versions”
 Major compaction
   Merging compaction that results in only one SSTable
   No deletion records, only live data
   Reclaim resources.
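
The minor and major compactions above can be sketched with plain dictionaries. This is a toy model under stated assumptions: ToyTablet and its methods are hypothetical, a deletion record is represented by None, versions and the "keep only N versions" policy are not modelled, and the intermediate merging compaction is left out.

```python
from typing import Dict, List, Optional

TOMBSTONE = None  # deletion record marker (an assumption of this sketch)


class ToyTablet:
    def __init__(self) -> None:
        self.memtable: Dict[str, Optional[str]] = {}
        self.sstables: List[Dict[str, Optional[str]]] = []  # oldest first

    def write(self, key: str, value: str) -> None:
        self.memtable[key] = value

    def delete(self, key: str) -> None:
        # A delete is recorded, not applied, until a major compaction.
        self.memtable[key] = TOMBSTONE

    def read(self, key: str) -> Optional[str]:
        # Newest data wins: memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sst in reversed(self.sstables):
            if key in sst:
                return sst[key]
        return None

    def minor_compaction(self) -> None:
        """Convert the memtable into a new sorted SSTable and free the memory."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def major_compaction(self) -> None:
        """Merge all SSTables into exactly one, keeping only live data."""
        merged: Dict[str, Optional[str]] = {}
        for sst in self.sstables:  # oldest first, so newer entries overwrite
            merged.update(sst)
        live = {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
        self.sstables = [live]


tablet = ToyTablet()
tablet.write("a", "1")
tablet.minor_compaction()
tablet.write("b", "2")
tablet.delete("a")
tablet.minor_compaction()
```

At this point the tablet holds two SSTables, one of which still carries the deletion record for "a"; calling tablet.major_compaction() leaves a single SSTable with only the live entry for "b".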
Locality Groups
 Group column families together into an SSTable
   Avoid mingling data, e.g. page contents and page metadata
   Can keep some groups all in memory
 Can compress locality groups
 Bloom filters on locality groups – avoid searching
  SSTables that cannot contain the requested data
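
To show why a Bloom filter lets a read skip an SSTable, here is a small generic Bloom filter. It is a sketch, not Bigtable's implementation: the class name, the bit-array size, the number of hashes, and the use of SHA-256 are all assumptions chosen for clarity. The key property is that a negative answer is always correct, so the SSTable need not be read, while a positive answer only means "maybe".

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions per key by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means the key was definitely never added.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# One filter per SSTable: consult it before touching the file.
sstable_rows = {"com.cnn.www": "page data"}
bloom = BloomFilter()
for row in sstable_rows:
    bloom.add(row)


def read_row(row: str):
    if not bloom.might_contain(row):
        return None  # skip the SSTable entirely
    return sstable_rows.get(row)
```

A false positive only costs one wasted SSTable read, so the filter can stay small and be kept in memory.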
Microbenchmarks
Application at Google
Lessons learned
 Interesting point: implement only some of the
  requirements, since the remaining ones are probably not needed

 Many types of failure possible
 Big systems need proper systems-level monitoring
 Value simple designs
Thank You For Your Time!


        QUESTIONS ?




