MEGASTORE: Providing
Scalable, Highly Available
Storage for Interactive
Services
Guided By- Prof. Kong Li
Presented By- (TEAM 1)
Anumeha Shah(009423973)
Ankita Kapratwar (009413469)
Swapna Kulkarni(009264905)
What is Megastore
● Megastore combines the scalability and availability of a NoSQL datastore
with the ACID semantics of an RDBMS so that it can meet the requirements of
interactive online services. It provides both strong consistency and high
availability, which neither NoSQL nor an RDBMS provides alone.
● Megastore uses Paxos, a replication and consensus algorithm, for high
availability with low latency.
● It partitions the data at a fine granularity and provides ACID semantics
within each partition, replicated across a wide area network with low latency.
Why Megastore
Online interactive services require high availability as well as strong
consistency.
● Online services are growing rapidly as the number of potential users grows.
● More and more desktop services are moving to the cloud.
● Opposing storage requirements are arising, which makes storage
challenging.
Reasons for the opposing requirements:
● Applications should be scalable.
● Services should be responsive.
● Users should have a consistent view of the data.
● Services should be highly available: up 24/7.
Approach to Provide High Availability
and Consistency
Two approaches have been taken:
1. For availability, a synchronous, fault-tolerant log replicator.
2. For scalability, partition the data into many small databases and
provide each database with its own log replicator.
Replications for High Availability
Need for replication:
● Replication is needed for high availability.
● Replication within a datacenter overcomes host-specific failures.
● To overcome datacenter-specific failures and regional disasters, the data
must also be replicated across geographically distributed datacenters.
Common Replication Strategies and
Issues
Asynchronous master/slave:
● The master node replicates write-ahead log entries to at least one slave.
● Log appends are acknowledged at the master in parallel with their
transmission to the slaves.
● However, if the master fails, there is downtime until a slave becomes
master, and data can be lost.
Synchronous master/slave:
● Changes on the master and the slaves are made synchronously: the master
acknowledges a change only once it has been mirrored to the slaves.
● This approach prevents data loss when failing over from master to slave.
● However, master and slave failures need timely detection by an external
system; otherwise they may cause high latency and user-visible outages.
Common Replication Strategies and
Issues Cont..
Optimistic replication:
● There is no master.
● Any member can accept changes, and the changes propagate through the
group asynchronously. This approach provides high availability and
excellent latency.
● However, transactions are not possible because the global mutation
ordering is not known at commit time.
Use of Paxos for Replication
● Paxos is a fault-tolerant consensus algorithm.
● There is no master, only a group of similar peers.
● A write-ahead log can be replicated over all the peers.
● Any peer can initiate reads and writes.
● A change is added to the log only if a majority of the peers acknowledge
it.
● Peers that did not acknowledge the change eventually catch up.
● There is no distinguished failed state.
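The majority rule above can be illustrated with a small sketch. It is not Paxos itself (there are no proposal numbers and no prepare phase); it only shows that an entry counts as committed once a strict majority of peers acknowledge it, and that a peer that missed the entry can catch up later. The names `Peer`, `replicate` and `catch_up` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """One replica holding its own copy of the write-ahead log (hypothetical)."""
    name: str
    up: bool = True
    log: dict = field(default_factory=dict)  # log position -> entry

    def accept(self, position: int, entry: str) -> bool:
        # An unreachable peer cannot acknowledge; it has to catch up later.
        if not self.up:
            return False
        self.log[position] = entry
        return True


def replicate(peers: list, position: int, entry: str) -> bool:
    """Return True when a strict majority of peers acknowledged the entry."""
    acks = sum(1 for peer in peers if peer.accept(position, entry))
    return acks > len(peers) // 2


def catch_up(lagging: Peer, source: Peer) -> None:
    """Copy entries the lagging peer missed from an up-to-date peer."""
    for position, entry in source.log.items():
        lagging.log.setdefault(position, entry)


peers = [Peer("A"), Peer("B"), Peer("C", up=False)]
committed = replicate(peers, 1, "set x=1")  # A and B acknowledge: 2 of 3
peers[2].up = True
catch_up(peers[2], peers[0])                # C receives the missed entry
```

With three peers, the write succeeds while one peer is down; with two of the three down it would not.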
Use of Paxos for Replication Cont..
Issues with the Paxos replication strategy
● A single replicated log spanning a wide area may suffer high latencies,
which limits throughput.
● What if none of the replicas is up to date?
● What if a majority of the replicas do not acknowledge a write?
Solution
● Partition the data.
● Use multiple replicated logs.
● Give each partition of the data its own replicated log.
● Replicate the logs synchronously among the datacenters.
Partitioning For Scalable Replication
Cross Entity Groups Operations.
Partitioning For Scalability and
Consistency
● Partition the data into entity groups.
● Each partition is replicated across different datacenters synchronously
and independently.
● Within each datacenter, the data is stored in a NoSQL datastore.
● Within an entity group, changes are made with single-phase ACID
transactions.
● Across entity groups, operations use two-phase commit or, more commonly,
asynchronous messaging.
● Entity groups are logically distant, not physically distant, so
operations across different entity groups are local to a datacenter.
● Traffic between datacenters is only for synchronous replication.
Physical Layout
How to select entity group boundaries:
● Groups should not be too fine-grained, because that requires excessive
cross-group operations. A group should also not contain a large number of
unrelated entities, because that serializes unrelated writes and reduces
throughput.
Physical Layout
● Google's Bigtable is used as the storage system; it is fault tolerant and
scalable.
● Applications keep data near the user, or in the region where it is
accessed the most, and place replicas near each other to avoid high latency
during failures. Data that is accessed together is kept either in nearby
rows or within the same row.
● A cache is implemented for low latency.
Data Model Overview
● The data model lies between the abstract tuples of an RDBMS and the
concrete row-column storage of NoSQL.
● Schema => set of tables => each contains entities => each contains
properties.
● An entity group consists of a root entity along with all entities in child
tables that reference it.
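As a rough illustration of the hierarchy above, the following sketch models an entity group as a root entity plus the child entities whose keys reference it. The class names, the `User`/`Photo` tables and the key layout (a child key that starts with the root key) are assumptions made for this example, not Megastore's actual schema language.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    table: str    # table the entity belongs to
    key: tuple    # primary key; a child key is assumed to start with the root key
    properties: dict = field(default_factory=dict)


@dataclass
class EntityGroup:
    root: Entity
    children: list = field(default_factory=list)

    def add_child(self, child: Entity) -> None:
        # A child belongs to the group only if its key references the root key.
        if child.key[:len(self.root.key)] != self.root.key:
            raise ValueError("child does not reference this root entity")
        self.children.append(child)


user = Entity("User", (101,), {"name": "John"})
group = EntityGroup(root=user)
group.add_child(Entity("Photo", (101, 500), {"tag": "Dinner"}))
```

A photo whose key starts with a different user id would be rejected, because it belongs to another entity group.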
Data Model Cont..
Indexes
● An index can be declared on any property.
● Local index: used to find data within an entity group.
● Global index: used to find entities without knowing in advance the entity
groups that contain them.
● STORING clause: applications store additional properties from the primary
table in the index for faster access at read time.
● Repeated indexes: indexes on repeated properties.
● Inline indexes: extract slices of information from child entities and
store them in the parent for fast access. They can also implement
many-to-many links.
Mapping to Bigtable
● Column name = Megastore table name + property name.
● The Bigtable row for the root entity stores the transaction and
replication metadata for the entity group, including the transaction log.
● Because the metadata is in the same row, it can be updated atomically
through a single Bigtable transaction.
● Index entry: represented as a single Bigtable row. Row key = indexed
property values + primary key of the indexed entity.
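The two naming rules above can be written as small helper functions. This is only a sketch: the "." separator in column names and the use of Python tuples for row keys are assumptions for readability, since the slides do not specify how the parts are encoded.

```python
def column_name(table: str, prop: str) -> str:
    """Bigtable column name built from the Megastore table and property names."""
    # The "." separator is an assumption for this example.
    return f"{table}.{prop}"


def index_row_key(indexed_values: tuple, primary_key: tuple) -> tuple:
    """Index row key: indexed property values followed by the entity's primary key."""
    return tuple(indexed_values) + tuple(primary_key)


photo_tag_column = column_name("Photo", "tag")
tag_index_key = index_row_key(("Dinner",), (101, 500))
```

Placing the indexed values first means index entries sort by the indexed property, and the trailing primary key keeps entries for different entities distinct.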
Transactions and Concurrency control
● An entity group functions as a mini-database.
● A transaction first writes its mutations to the write-ahead log; the
mutations are then applied to the data.
● Multiple values can be stored in the same row/column pair with different
timestamps.
● This enables multiversion concurrency control (MVCC).
● Readers and writers don't block each other.
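A minimal sketch of the timestamped-cell idea follows, assuming integer timestamps and a hypothetical `VersionedCell` class. A reader asks for the value as of some timestamp and is unaffected by writes at later timestamps, which is why readers and writers need not block each other.

```python
import bisect


class VersionedCell:
    """One row/column pair holding several timestamped values (hypothetical)."""

    def __init__(self):
        self._timestamps = []  # kept sorted in ascending order
        self._values = []      # values aligned with self._timestamps

    def write(self, timestamp: int, value) -> None:
        index = bisect.bisect_left(self._timestamps, timestamp)
        if index < len(self._timestamps) and self._timestamps[index] == timestamp:
            self._values[index] = value  # overwrite the same version
        else:
            self._timestamps.insert(index, timestamp)
            self._values.insert(index, value)

    def read(self, timestamp: int):
        """Return the newest value written at or before `timestamp`, or None."""
        index = bisect.bisect_right(self._timestamps, timestamp)
        return self._values[index - 1] if index else None


cell = VersionedCell()
cell.write(10, "draft")
cell.write(20, "final")
```

A read at timestamp 15 still sees "draft" even though a newer version exists.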
Cont..
Reads:
a. Current: ensures that all previously committed writes are applied, then
reads at the timestamp of the latest committed transaction.
b. Snapshot: reads at the timestamp of the last known fully applied
transaction, even if some committed transactions have not yet been applied.
c. Inconsistent: ignores the state of the log and reads the latest values
directly.
Writes:
A write transaction begins with a current read to determine the next
available log position. The commit operation gathers the mutations into a log
entry, assigns it a timestamp higher than any previous one, and appends it to
the log using Paxos.
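The write path can be sketched as optimistic concurrency on the log position: each writer reads the next free position, and only one writer can claim it. In this sketch a single-process `GroupLog` class (a hypothetical name) stands in for the Paxos-replicated log, so the consensus step itself is not shown, and the retry by the losing writer is an assumption about how an application would react.

```python
class GroupLog:
    """Single-process stand-in for one entity group's replicated log (hypothetical)."""

    def __init__(self):
        self.entries = []  # list of (timestamp, mutations)

    def next_position(self) -> int:
        return len(self.entries)

    def last_timestamp(self) -> int:
        return self.entries[-1][0] if self.entries else 0

    def append_at(self, position: int, mutations: dict) -> bool:
        # Only one writer can claim a given position; a stale position loses.
        if position != self.next_position():
            return False
        timestamp = self.last_timestamp() + 1  # higher than any previous one
        self.entries.append((timestamp, dict(mutations)))
        return True


log = GroupLog()
pos_a = log.next_position()  # writer A reads position 0
pos_b = log.next_position()  # writer B reads position 0 as well
a_committed = log.append_at(pos_a, {"x": 1})
b_committed = log.append_at(pos_b, {"x": 2})  # loses the race
b_retried = log.append_at(log.next_position(), {"x": 2})  # retry at position 1
```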
Transaction Lifecycle
1. Read: obtain the timestamp and log position of the last committed
transaction.
2. Application logic: read from Bigtable and gather writes into a log entry.
3. Commit: use Paxos to append that entry to the log.
4. Apply: write the mutations to the entities and indexes.
5. Clean up: delete data that is no longer required.
Replication
● Reads and writes can be initiated from any replica.
● Replication is done per entity group by synchronously replicating the
group's transaction log to a quorum of replicas.
● Read guarantees:
o A read always observes the last-acknowledged write.
o After a write has been observed, all future reads observe that write.
Megastore Architecture
Data Structures and Algorithms
Replicated Logs
Reads
Algorithm for a Current Read
● Query Local
● Find Position
● Catch-Up
● Validate
● Query Data
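The slide lists only the step names, so the following is one simplified reading of them rather than the production algorithm. Assumptions made here: the coordinator is modelled as a plain set of up-to-date replica names, every peer is queried instead of a majority, some peer is assumed to hold each missing log entry, and each log entry is a single `(key, value)` mutation. `Replica` and `current_read` are hypothetical names.

```python
class Replica:
    """One replica of an entity group: a log plus the data it has applied."""

    def __init__(self, name):
        self.name = name
        self.log = {}      # log position -> (key, value)
        self.applied = -1  # highest log position applied to the data
        self.data = {}

    def highest_position(self):
        return max(self.log, default=-1)

    def apply_through(self, position):
        for pos in range(self.applied + 1, position + 1):
            key, value = self.log[pos]
            self.data[key] = value
        self.applied = max(self.applied, position)


def current_read(key, local, others, up_to_date):
    # 1. Query Local: ask the coordinator whether the local replica is current.
    local_ok = local.name in up_to_date
    # 2. Find Position: the highest log position that may have been committed.
    if local_ok:
        position = local.highest_position()
    else:
        position = max(r.highest_position() for r in [local, *others])
        # 3. Catch-Up: fetch log entries the local replica is missing.
        for pos in range(position + 1):
            if pos not in local.log:
                donor = next(r for r in others if pos in r.log)
                local.log[pos] = donor.log[pos]
    local.apply_through(position)
    # 4. Validate: tell the coordinator the local replica is current again.
    if not local_ok:
        up_to_date.add(local.name)
    # 5. Query Data: read from the now up-to-date local replica.
    return local.data.get(key)


replica_a = Replica("A")
replica_b = Replica("B")
replica_b.log = {0: ("x", 1), 1: ("x", 2)}
coordinator_state = set()  # replica A is initially not known to be current
value = current_read("x", replica_a, [replica_b], coordinator_state)
```

After the first read, replica A has caught up and is marked current, so a second read takes the fast local path.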
Reads
Timeline for reads for local replica A
Writes
Algorithm for writes
● Accept Leader
● Prepare
● Accept
● Invalidate
● Apply
Writes
Timeline for writes
Coordinator Availability
Failure Detection
● Google's Chubby lock service is used to detect coordinator failures.
● Writers are insulated from coordinator failure by testing whether a
coordinator has lost its locks.
Validation Races
● Races between validates for earlier writes and invalidates for later
writes are protected in the coordinator by always sending the log
position associated with the action.
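The effect of attaching a log position to every message can be sketched as follows. The `Coordinator` class and its bookkeeping are assumptions for illustration: it remembers the highest invalidated and highest validated position per entity group, so a validate for an earlier write that arrives late cannot mask an invalidate for a later write.

```python
class Coordinator:
    """Tracks per entity group whether the local replica is up to date (hypothetical)."""

    def __init__(self):
        self._valid_through = {}  # group -> highest position validated
        self._invalid_from = {}   # group -> highest position invalidated

    def invalidate(self, group, position):
        current = self._invalid_from.get(group, -1)
        self._invalid_from[group] = max(current, position)

    def validate(self, group, position):
        current = self._valid_through.get(group, -1)
        self._valid_through[group] = max(current, position)

    def is_up_to_date(self, group):
        # Up to date only if validation has reached the latest invalidation.
        valid = self._valid_through.get(group, -1)
        invalid = self._invalid_from.get(group, -1)
        return valid >= invalid


coordinator = Coordinator()
coordinator.invalidate("group-1", 5)  # a later write missed this replica
coordinator.validate("group-1", 3)    # late validate for an earlier write
stale_after_race = not coordinator.is_up_to_date("group-1")
coordinator.validate("group-1", 5)    # replica has now caught up through 5
fresh_after_catch_up = coordinator.is_up_to_date("group-1")
```

Without the positions, the late validate would have wrongly marked the group as current.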
Operational Issues
Distribution of Availability
Production Metrics
Distribution of Average Latencies
Conclusion
● Megastore
● Paxos for Synchronization
● Bigtable Datastore
QUESTIONS??????
THANK YOU !

Megastore by Google

  • 1. MEGASTORE: Providing Scalable, Highly Available Storage for Interactive Services Guided By- Prof. Kong Li Presented By- (TEAM 1) Anumeha Shah(009423973) Ankita Kapratwar (009413469) Swapna Kulkarni(009264905)
  • 2. What is Megastore ● Megastore combines the scalability and availability of a NoSQL datastore with the ACID semantics of an RDBMS so that it can meet the requirements of interactive online services. It provides both strong consistency and high availability, which neither NoSQL nor an RDBMS provides alone. ● Megastore uses the Paxos consensus algorithm for replication, giving high availability with low latency. ● It partitions the data at a fine granularity and provides ACID semantics within each partition, replicated across a wide area network with low latency.
  • 3. Why Megastore Online interactive services require high availability as well as high consistency. ● Online services are growing rapidly as the number of potential users grows. ● More and more desktop services are moving to the cloud. ● Opposing storage requirements are arising and making storage challenging. Reasons for opposing requirements are: ● Applications should be scalable. ● Services should be responsive. ● Users should have a consistent view of the data. ● Services should be highly available, up 24/7.
  • 4. Approach to Provide High Availability and Consistency Two approaches have been taken. 1. Use a synchronous, fault-tolerant log replicator to provide availability. 2. To provide scalability, partition the data into many small databases, each with its own log replicator. Replication for High Availability Need for replication: ● Replication is needed for high availability. ● Replication within a datacenter overcomes host-specific failures. ● To overcome datacenter-specific failures and regional disasters, the data should be replicated across geographically distributed datacenters.
  • 5. Common Replication Strategies and Issues Asynchronous master/slave: ● The master node replicates write-ahead log entries to at least one slave. ● Log appends are acknowledged at the master in parallel with transmission to the slaves. ● However, if the master fails, there is downtime until a slave becomes master, and data can be lost. Synchronous master/slave: ● Changes on the master and slaves are made synchronously; that is, the master acknowledges a change only once it has been mirrored to the slaves. ● This approach prevents data loss when failing over from master to slave. ● However, master and slave failures need timely detection by an external system; otherwise they can cause high latency and user-visible outages.
  • 6. Common Replication Strategies and Issues Cont.. Optimistic Replication: ● There is no master. ● Any member can accept changes, and the changes propagate through the group asynchronously. ● This approach provides high availability and excellent latency. ● However, transactions are not possible because the global mutation ordering is not known at commit time.
  • 7. Use of Paxos for Replication ● Paxos is a fault-tolerant consensus algorithm. ● There is no master, only a group of similar peers. ● A write-ahead log can be replicated over all the peers. ● Any peer can initiate a read or a write. ● A change is added to the log only if a majority of the peers acknowledge it. ● The peers that did not acknowledge the change catch up eventually. ● There is no distinguished failed state.
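The majority rule on this slide can be illustrated with a minimal Python sketch. It only counts acknowledgements and lets a lagging peer copy missed entries; it is not Paxos itself (there are no proposal numbers or prepare/accept rounds), and the names `Replica`, `replicate`, and `catch_up` are hypothetical, chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical peer holding a copy of the write-ahead log."""
    name: str
    up: bool = True
    log: dict = field(default_factory=dict)  # log position -> entry

    def accept(self, position: int, entry: str) -> bool:
        # A replica that is down cannot acknowledge; it catches up later.
        if not self.up:
            return False
        self.log[position] = entry
        return True


def replicate(replicas, position, entry):
    """Return True if a strict majority of replicas acknowledged the entry."""
    acks = sum(1 for r in replicas if r.accept(position, entry))
    return acks > len(replicas) // 2


def catch_up(lagging, source):
    """Copy entries the lagging replica missed from a more complete peer."""
    for position, entry in source.log.items():
        lagging.log.setdefault(position, entry)


# Usage: two of three peers are up, so the write still reaches a majority.
peers = [Replica("A"), Replica("B"), Replica("C", up=False)]
committed = replicate(peers, 1, "set x=1")
```

Because any majority is enough, no single peer is special, which is why the slide notes that there is no distinguished failed state.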
  • 8. Use of Paxos for Replication Cont.. Issues with the Paxos replication strategy ● A single replicated log over a wide area may suffer high latencies, which limit throughput. ● What if none of the replicas is up to date? ● What if a majority of the replicas do not acknowledge a write? Solution ● Partition the data. ● Use multiple replicated logs. ● Each partition of the data has its own replicated log. ● Replicate each log synchronously across the datacenters.
  • 10. Cross Entity Groups Operations.
  • 11. Partitioning For Scalability and Consistency ● Partition the data into entity groups. ● Each partition is replicated across different datacenters synchronously and independently. ● Within each datacenter the data is stored in a NoSQL datastore. ● Within an entity group, changes are made using single-phase ACID transactions. ● Across entity groups, operations use two-phase commit or asynchronous messaging. ● The entity groups involved are logically distant, not physically distant, so operations across different entity groups are local. ● Traffic between the datacenters is only for synchronous replication.
  • 12. Physical Layout How to select entity group boundaries: ● Groups should not be too fine grained, as that may require excessive cross-group operations. ● A group should also not contain too many unrelated entities, as that serializes unrelated writes and limits throughput. Physical Layout ● Google's Bigtable is used as the storage system; it is fault tolerant and scalable. ● Applications keep data near the user or in the region from which it is accessed most, and keep the replicas near each other to avoid high latency during failures. ● Data that is accessed together is kept either close together or within the same row. ● A cache is implemented for low latency.
  • 13. Data Model Overview ● Lies between the abstract tuples of an RDBMS and the concrete row-column storage of NoSQL. ● Schema => set of tables => each contains entities => each contains properties. ● An entity group consists of a root entity along with all entities in child tables that reference it.
  • 15. Indexes ● An index can be applied to any property. ● Local index: used to find data within an entity group. ● Global index: used to find entities without knowing in advance the entity groups that contain them. ● STORING clause: applications store additional properties from the primary table in the index for faster access at read time. ● Repeated indexes: for repeated properties. ● Inline indexes: extract slices of information from child entities and store them in the parent for fast access; they can be used to implement many-to-many links.
  • 16. Mapping to Bigtable ● Bigtable column name = Megastore table name + property name. ● The Bigtable row for the root entity stores the transaction and replication metadata for the entity group, including the transaction log. ● Keeping the metadata in the same row allows it to be updated atomically through a single Bigtable transaction. ● Index entry: represented as a single Bigtable row. Row key = indexed property values + primary key of the indexed entity.
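As a rough illustration of the naming rules on this slide, the Python sketch below builds a column name and an index row key. The `.` separator, the tuple representation of keys, and the function names are assumptions made for the example; real Bigtable keys are encoded byte strings.

```python
def column_name(table: str, prop: str) -> str:
    """Column name = Megastore table name + property name (separator assumed)."""
    return f"{table}.{prop}"


def entity_to_row(table, primary_key, properties):
    """Map one entity to a (row key, columns) pair."""
    row_key = tuple(primary_key)
    columns = {column_name(table, p): v for p, v in properties.items()}
    return row_key, columns


def index_row_key(indexed_values, primary_key):
    """Index row key = indexed property values + primary key of the entity."""
    return tuple(indexed_values) + tuple(primary_key)


# Usage with a hypothetical Photo entity keyed by (user_id, photo_id).
row_key, columns = entity_to_row("Photo", (101, 500), {"time": 12345, "tag": "trip"})
by_time_key = index_row_key([12345], (101, 500))
```

Putting the indexed values first means index rows sort by the indexed property, so a range scan over the index returns entities in property order.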
  • 17. Transactions and Concurrency Control ● Each entity group functions as a mini-database. ● A transaction first writes its mutations to the write-ahead log; the mutations are then applied to the data. ● Multiple values can be stored in the same row/column pair with different timestamps. ● This enables multiversion concurrency control (MVCC). ● Readers and writers don't block each other.
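The timestamped-cell idea behind MVCC can be sketched in a few lines of Python. `VersionedCell` is a hypothetical in-memory stand-in for one row/column pair, not a Bigtable API: a read at timestamp `t` returns the newest value written at or before `t`, so a reader is unaffected by later writes.

```python
import bisect


class VersionedCell:
    """One row/column pair holding several values, each with its own timestamp."""

    def __init__(self):
        self._timestamps = []  # kept sorted in ascending order
        self._values = []      # values aligned with _timestamps

    def write(self, timestamp, value):
        i = bisect.bisect_left(self._timestamps, timestamp)
        if i < len(self._timestamps) and self._timestamps[i] == timestamp:
            # Same timestamp written again: replace the stored value.
            self._values[i] = value
        else:
            self._timestamps.insert(i, timestamp)
            self._values.insert(i, value)

    def read(self, timestamp):
        """Return the newest value with a timestamp <= the given one, or None."""
        i = bisect.bisect_right(self._timestamps, timestamp)
        return self._values[i - 1] if i else None


# Usage: a reader pinned at timestamp 15 is unaffected by the write at 20.
cell = VersionedCell()
cell.write(10, "draft")
cell.write(20, "final")
pinned = cell.read(15)
```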
  • 18. Cont.. Reads: a. Current: ensure that all committed writes are applied first, then read at the timestamp of the latest committed transaction. b. Snapshot: read at the timestamp of the last known fully applied transaction, even if some committed writes have not yet been applied. c. Inconsistent: ignore the state of the log and read the latest values directly. Writes: a write begins with a current read to determine the next available log position. The commit operation gathers mutations into a log entry, assigns it a timestamp higher than any previous one, and appends it to the log using Paxos.
  • 19. Transaction Lifecycle Read: obtain the timestamp and log position of the last committed transaction. Application Logic: read from Bigtable and gather writes into a log entry. Commit: use Paxos to append that entry to the log. Apply: write mutations to the entities and indexes. Clean Up: delete data that is no longer required.
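The lifecycle can be sketched in Python as optimistic concurrency on a log position. This is a single-process stand-in under stated assumptions: `commit` replaces the Paxos append with a plain list append, and `EntityGroup` and `ConflictError` are hypothetical names. The point it shows is that two writers that read the same log position cannot both commit.

```python
class ConflictError(Exception):
    """Raised when another writer claimed the log position first."""


class EntityGroup:
    def __init__(self):
        self.log = []     # committed log entries, one per transaction
        self.data = {}    # state after applying log entries
        self.applied = 0  # number of log entries already applied

    def begin(self):
        """Read step: apply outstanding entries, return the next log position."""
        self.apply()
        return len(self.log)

    def commit(self, position, mutations):
        """Commit step: append only if the position is still free."""
        if position != len(self.log):
            raise ConflictError(f"log position {position} already taken")
        self.log.append(dict(mutations))

    def apply(self):
        """Apply step: write committed mutations to the data in log order."""
        while self.applied < len(self.log):
            self.data.update(self.log[self.applied])
            self.applied += 1


# Usage: the second writer loses the race and would have to retry.
group = EntityGroup()
first, second = group.begin(), group.begin()
group.commit(first, {"x": 1})
try:
    group.commit(second, {"x": 2})
except ConflictError:
    pass  # retry from begin() in a real client
group.apply()
```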
  • 20. Replication ● Reads and writes can be initiated from any replica. ● Replication is done per entity group by synchronously replicating the group's transaction log to a quorum of replicas. ● Guarantees for current reads: o A read will always observe the last-acknowledged write. o After a write has been observed, all future reads observe that write.
  • 22. Data Structures and Algorithms Replicated Logs
  • 23. Reads Algorithm for a Current Read ● Query Local ● Find Position ● Catch-Up ● Validate ● Query Data
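The five steps of a current read can be sketched in Python as follows. This is a simplified illustration, not the actual protocol: it asks every peer rather than a majority, uses the number of log entries as the log position, and the names `Coordinator`, `ReplicaStore`, and `current_read` are hypothetical.

```python
class Coordinator:
    """Tracks which entity groups are known to be up to date locally."""

    def __init__(self):
        self.up_to_date = set()

    def is_current(self, group):
        return group in self.up_to_date

    def validate(self, group):
        self.up_to_date.add(group)


class ReplicaStore:
    """One replica's log and applied data for a single entity group."""

    def __init__(self):
        self.log = []
        self.applied = 0
        self.data = {}

    def apply_through(self, position):
        while self.applied < position:
            self.data.update(self.log[self.applied])
            self.applied += 1


def current_read(key, group, local, peers, coordinator):
    # 1. Query Local: ask the coordinator whether the local replica is current.
    local_current = coordinator.is_current(group)
    # 2. Find Position: highest log position known to any replica consulted.
    replicas = [local] if local_current else [local, *peers]
    position = max(len(r.log) for r in replicas)
    # 3. Catch-Up: copy missing entries from the most complete replica, then apply.
    source = max(replicas, key=lambda r: len(r.log))
    local.log.extend(source.log[len(local.log):position])
    local.apply_through(position)
    # 4. Validate: tell the coordinator the local replica is now current.
    if not local_current:
        coordinator.validate(group)
    # 5. Query Data: read from the caught-up local replica.
    return local.data.get(key)
```

Once the group has been validated, later reads take the fast path in step 1 and skip the round trip to the peers.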
  • 24. Reads Timeline for reads for local replica A
  • 25. Writes Algorithm for writes ● Accept Leader ● Prepare ● Accept ● Invalidate ● Apply
  • 27. Coordinator Availability Failure Detection ● Google's Chubby lock service is used to detect coordinator failures. ● Writers are insulated from coordinator failure by testing whether a coordinator has lost its locks. Validation Races ● Races between validates for earlier writes and invalidates for later writes are protected against in the coordinator by always sending the log position associated with the action.
  • 30. Conclusion ● Megastore ● Paxos for Synchronization ● Bigtable Datastore