Level 300: Storage Systems for Scalable Systems. Haytham ElFadeel, Researcher in Computer Sciences
Agenda: Introduction: a glance at scalable systems; the available storage solutions; the problem with the current solutions; the problem with the database. The next generation of storage systems: key-value storage systems; performance comparison; how it works. Discussion, Q/A.
Glance at Scalable Systems: Scalable systems. Scalability is the ability to deliver better performance when you add more computing power. The performance gained should be proportional to the computing power added. Examples: Google, Yahoo, Facebook, Amazon, eBay, Orkut, Google App Engine, etc.
Glance at Scalable Systems: Types of scalability. Vertical scalability: adding resources within the same logical unit to increase capacity, for example adding more CPUs or expanding the storage or the memory. Horizontal scalability: adding multiple logical units of resources and making them work together as a single unit. Think of clustering, distributed systems, and load balancing.
Vertical Scalability vs. Horizontal Scalability [Diagram: Vertical Scaling vs. Horizontal Scaling, with the labels "Limited" / "Not limited" and "Software and Hardware" / "Hardware only"]
Vertical Scalability vs. Horizontal Scalability. Haytham ElFadeel quote: If you need scalability urgently, vertical scaling is probably the easiest route, but be aware that vertical scaling gets more and more expensive as you grow. And while infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible.
Vertical Scalability vs. Horizontal Scalability. Haytham ElFadeel quote: On the other hand, horizontal scalability doesn't require you to buy more and more expensive hardware. It's meant to be scaled using commodity storage and server solutions. But horizontal scalability isn't cheap either: the application has to be built from the ground up to run on multiple servers as a single application.
Glance at Scalable Systems: Facebook. More than 200,000,000 active users. 50,000 photos uploaded per minute. The most active social network on the Web. Facebook chat: the main challenge is maintaining the users' status. Load distribution should depend on the users and their friends, to avoid traveling between machines. Building a system that must scale from the start to serve 100,000,000 users is really hard.
Glance at Scalable Systems: Amazon. More than 10,000,000 transactions every holiday season. The reliability of the user's shopping cart is not optional. Google, Yahoo, Microsoft, Kngine, etc.: Processing huge amounts of data, more than 1 TB. Sorting the index by rank value, which means sorting more than 1 TB of data. Saving the crawled Web pages.
The Available Storage Solutions: Memory: just a data structure :) (what about capacity?). Disk: text file { XML, Protocol Buffer, JSON } (bad performance); binary file { serialized, custom format } (not portable, questions about performance). Database { MySQL, SQL Server, SQLite, Oracle } (bad performance, complex, huge latency).
The Problem with the Database: Causes. An old and very complex system. Many wasted features. Many steps to process an SQL query. Needs administration, among other costs.
The Problem with the Database: Causes. An old and very complex system. The RDBMS is a very complex system, much like an operating system: thread scheduling, deadlock monitor, resource manager; I/O manager, page manager, execution plan manager; cache manager, memory manager, transaction manager, etc. Most DBMS architectures, designs, and algorithms date from around the 1970s: different hardware and platform properties; old architecture, design, and algorithms. Please review resource #1.
The Problem with the Database: Causes. Many wasted features. Today's systems have very rich feature sets, simply because their designers assume that "one size fits all": CLR types, CLR integration, replication, functions, policies, relations, transactions, stored procedures, ACID, etc. You can even call a Web Service from SQL Server! All this mess makes the database look like a platform and a development environment.
The problem with the Database: Causes. Many steps to process the query: parse the query; build the expression tree and resolve the relational algebra expression; optimize the expression tree; choose the execution plan; start execution. Please review resources #2 and #3.
The problem with the Database: Effects. Bad performance: throughput, resource usage, latency. Not scalable.
The problem with the Database: Effects. Bad performance (throughput, resource usage, latency): even the fastest DBMS, MySQL, can't serve more than 5,000 queries per second*. Add to this the resources consumed and the high latency. * Depends on the configuration.
The problem with the Database: Effects. Not scalable: the database is not designed to scale. Even if you add a new machine and partition the database, you will never get an acceptable performance improvement. Please review resource #1.
The problem with the Database: The database gives us ACID. Atomicity: a transaction is all or nothing. Consistency: only valid data is written to the database. Isolation: transactions behave as if they were happening serially, and the data stays correct. Durability: once a transaction commits, what you wrote stays written.
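To make the atomicity and consistency guarantees above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the account names, and the `transfer` helper are assumptions made up for illustration; they do not come from the talk.

```python
import sqlite3

# In-memory SQLite database; the schema and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True if committed."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This statement violates the CHECK constraint on overdraft,
            # which undoes the credit above as well (all or nothing).
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw the source account returns False and leaves both balances unchanged, even though the credit statement had already run inside the transaction.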
The problem with the Database: The problem with ACID is that it gives you too much; it trips you up when you are trying to scale a system across multiple nodes. Downtime is unacceptable, so your system needs to be reliable. Reliability requires multiple nodes to handle machine failures. To make a scalable system that can handle lots and lots of reads and writes, you need many more nodes.
The problem with the Database: Once you try to scale ACID across many machines, you hit problems with network failures and delays. The algorithms don't work in a distributed environment at any acceptable speed. It's a dead end.
The Next Generation of Storage Systems: A long time ago, many research teams and companies discovered that the database is the main bottleneck: many wasted features, bad performance, and not designed for scalable systems.
The Next generation of Storage Systems Building large systems on top of a traditional RDBMS data storage layer is no longer good enough. This talk explores the landscape of new technologies available today to augment your data layer to improve performance and reliability. Please review resource #4
Key-Value Storage Systems: Simple data model, just key-value pairs: every value is assigned to a key. No complex stuff such as relations, ACID, or SQL queries. Simple interface: Get(key), Put(key, value), Delete(key) (optional).
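The Get/Put/Delete interface above can be sketched in a few lines. This is only an in-memory stand-in to show the shape of the API; the class name `KeyValueStore` and the choice of `bytes` for keys and values are assumptions, not something the slides specify.

```python
from typing import Dict, Optional


class KeyValueStore:
    """In-memory stand-in for the Get / Put / Delete interface."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        """Store the value under the key, replacing any previous value."""
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        """Return the value for the key, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        """Remove the key; return True if it existed."""
        return self._data.pop(key, None) is not None
```

There is no query language, schema, or join here: every operation addresses exactly one key, which is what makes it straightforward to spread keys across machines.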
Key-Value Storage Systems: Designed from the start to scale to hundreds of machines. Designed to be reliable, even if 50% of the machines crash. No extra work is required to add a new machine: just plug it in and it will work in harmony with the rest. Many open source projects (C++, Java, Lisp).
Key-Value Storage Systems: Who uses such systems: Facebook. Google Orkut, Analysis. Google Web Crawling. Amazon. Powerset. eBay. Kngine. Yahoo.
Storing, and huge data analysis.
Transactions, and huge data analysis.
Key-Value Storage Systems: Now you should make your decision: take the blue pill and see the truth, or take the red pill and stay in wonderland.
Key-Value Storage Systems: Key-value storage systems, and other systems, are built around the CAP concept. Consistency: your data is correct all the time; what you write is what you read. Availability: you can read and write your data all the time. Partition tolerance: if one or more nodes fail, the system still works and becomes consistent again when the system comes back online.
Key-Value Storage Systems: One Node - Performance Comparison (Web)
System         sets/second            gets/second
MySQL          3,030                  4,670
Redis          11,200 (3.7x MySQL)    9,840 (2.1x MySQL)
Tokyo Tyrant   9,030 (3.0x MySQL)     9,250 (2.0x MySQL)
Please review resource #5
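The multipliers quoted for this single-node comparison follow directly from the throughput figures. A small sketch of the arithmetic, with the numbers copied from the slide (the `speedup` helper name is made up for illustration):

```python
# Throughput figures copied from the slide (operations per second).
results = {
    "MySQL": {"sets": 3030, "gets": 4670},
    "Redis": {"sets": 11200, "gets": 9840},
    "Tokyo Tyrant": {"sets": 9030, "gets": 9250},
}


def speedup(system, op, baseline="MySQL"):
    """Throughput of `system` relative to `baseline`, rounded to one decimal."""
    return round(results[system][op] / results[baseline][op], 1)


for system in ("Redis", "Tokyo Tyrant"):
    print(system, speedup(system, "sets"), speedup(system, "gets"))
```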
Key-Value Storage Systems Two High-End Nodes - Performance Comparison (Web) Redis 89,230 sets/second. 85,840 gets/second.
Key-Value Storage Systems: One Node - Performance Comparison
System        sets/second                gets/second
SQL Server    2,900                      3,500
Vina*         10,100 (3.5x SQL Server)   9,970 (2.8x SQL Server)
* Vina: key-value storage system used inside Kngine.
How It Works: Any key-value storage system consists of two primary layers. Aggregation layer: manages the instances, replication, and distribution. Storing layer: one or more disk-based hash tables.
How It Works (Storing Layer): On the board.
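The storing layer was explained on the board, so the transcript does not record the actual design. As a rough illustration only, one common way to build a disk-based hash table is an append-only data file plus an in-memory hash index that maps each key to the location of its latest value. The class name `LogHashStore` and the record layout below are assumptions, not the author's implementation; deletion and recovery from a partially written record are left out.

```python
import os
import struct
from typing import Dict, Optional, Tuple

# Record layout (an assumption for this sketch):
# 4-byte key length, 4-byte value length, key bytes, value bytes.
_HEADER = struct.Struct(">II")


class LogHashStore:
    """Append-only data file plus an in-memory hash index of key -> location."""

    def __init__(self, path: str) -> None:
        self._index: Dict[bytes, Tuple[int, int]] = {}
        # "a+b" creates the file if needed; every write lands at the end.
        self._file = open(path, "a+b")
        self._rebuild_index()

    def _rebuild_index(self) -> None:
        """Scan the log once; later records for a key override earlier ones."""
        self._file.seek(0)
        while True:
            header = self._file.read(_HEADER.size)
            if len(header) < _HEADER.size:
                break
            key_len, value_len = _HEADER.unpack(header)
            key = self._file.read(key_len)
            value_offset = self._file.tell()
            self._file.seek(value_len, os.SEEK_CUR)  # skip over the value
            self._index[key] = (value_offset, value_len)

    def put(self, key: bytes, value: bytes) -> None:
        """Append a record and point the index at the new value."""
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(_HEADER.pack(len(key), len(value)) + key + value)
        self._file.flush()
        self._index[key] = (offset + _HEADER.size + len(key), len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        """One hash lookup plus one disk read; None if the key is unknown."""
        location = self._index.get(key)
        if location is None:
            return None
        offset, length = location
        self._file.seek(offset)
        return self._file.read(length)

    def close(self) -> None:
        self._file.close()
```

In this layout a read costs one hash lookup and one seek, and a write is a sequential append, which is part of why such stores can outrun a general-purpose SQL engine on simple get/set workloads.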
How It Works (Aggregation Layer): Receives the requests. Routes each request to the target node. Manages partitioning and replicas. Partitioning and replication are done by the consistent hashing algorithm. On the board. Please review resource #6.
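The routing role of the aggregation layer can be sketched with a small consistent hashing ring. This is a generic textbook-style version, not the design drawn on the board: the class name `ConsistentHashRing`, the use of MD5 to place points on the ring, and the default of 100 virtual points per node are all assumptions chosen for illustration.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Routes keys to nodes; adding or removing a node moves only nearby keys."""

    def __init__(self, vnodes: int = 100) -> None:
        self._vnodes = vnodes             # virtual points per physical node
        self._points: List[int] = []      # sorted ring positions
        self._owner: Dict[int, str] = {}  # ring position -> node name

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def nodes_for(self, key: str, replicas: int = 1) -> List[str]:
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._points:
            return []
        start = bisect.bisect_right(self._points, _hash(key))
        found: List[str] = []
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            node = self._owner[point]
            if node not in found:
                found.append(node)
                if len(found) == replicas:
                    break
        return found
```

The first node returned is the key's primary and the following ones hold its replicas. When a node joins, the only keys that change owner are the ones the new node takes over, which is what lets a machine be added without reshuffling the whole data set.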
Key-Value Storage Systems: Amazon Dynamo (paper). Facebook Cassandra (open source). Tokyo Cabinet/Tyrant (open source). Redis (open source). MongoDB (open source).
Q / A
References
1. The End of an Architectural Era (It's Time for a Complete Rewrite). Paper.
2. Database Systems - Paul Beynon-Davies. Book.
3. Inside SQL Server engine - MS Press. Book.
4. Drop ACID and Think About Data. Highscalability.com.
5. Redis vs MySQL vs Tokyo Tyrant. Colin Howe's Blog.
6. Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. Paper.
7. Dynamo: Amazon's Highly Available Key-value Store. Paper.
8. Redis, Tokyo Tyrant project.
9. Consistent Hashing. Tom White's blog.
Resources: High Scalability blog: Highscalability.com. It's all about innovation blog: Hfadeel.com/blog. All Things Distributed: Allthingsdistributed.com. Tom White's blog: lexemetech.com.

Generative Artificial Intelligence: How generative AI works.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 

http://www.hfadeel.com/Blog/?p=151

  • 1. Level 300: Storage Systems for Scalable Systems. Haytham ElFadeel, Researcher in Computer Sciences
  • 2. Agenda. Introduction: a glance at scalable systems; the available storage solutions; the problem with the current solutions; the problem with the database. The next generation of storage systems: key-value store systems; performance comparison; how it works. Discussion, Q/A.
  • 3. Glance at Scalable Systems Scalable systems: Scalability is the ability to provide better performance when you add more computing power. The performance gained should be proportional to the added computing power. Examples: Google, Yahoo, Facebook, Amazon, eBay, Orkut, Google App Engine, etc.
  • 4. Glance at Scalable Systems Scalability types: Vertical Scalability: adding resources within the same logical unit to increase capacity. For example: adding more CPUs, or expanding the storage or the memory. Horizontal Scalability: adding multiple logical units of resources and making them work together as a single unit. You can think of it as clustering, distribution, and load balancing.
  • 5. Vertical Scalability vs. Horizontal Scalability Limited Not limited Vertical Scaling Horizontal Scaling Software and Hardware Hardware only
  • 6. Vertical Scalability vs. Horizontal Scalability Haytham ElFadeel quote: If you need scalability urgently, vertical scaling is probably the easiest route, but be aware that vertical scaling gets more and more expensive as you grow. While infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible.
  • 7. Vertical Scalability vs. Horizontal Scalability Haytham ElFadeel quote: On the other hand, horizontal scalability doesn’t require you to buy more and more expensive hardware. It’s meant to be scaled using commodity storage and server solutions. But horizontal scalability isn’t cheap either. The application has to be built from the ground up to run on multiple servers as a single application.
  • 8. Glance at Scalable Systems Facebook: More than 200,000,000 active users. 50,000 photos uploaded per minute. The most active social network on the Web. Facebook chat: The main challenge is maintaining the users' status. Load distribution should depend on the users and their friends, to avoid traffic traveling between nodes. Building a system that must scale from the start to serve 100,000,000 users is really hard.
  • 9. Glance at Scalable Systems Amazon: More than 10,000,000 transactions every holiday season. The reliability of the user's shopping cart is not optional. Google, Yahoo, Microsoft, Kngine, etc.: Processing huge amounts of data, more than 1 TB. Sorting the index by rank value, which means sorting more than 1 TB of data. Saving the crawled Web pages.
  • 10. The Available Storage Solutions Memory: just a data structure :) Disk: Text file: { XML, Protocol Buffer, JSON }. Binary file: { serialized, custom format }. Database: { MySQL, SQL Server, SQLite, Oracle }.
  • 11. The Available Storage Solutions Memory: just a data structure :) (what about capacity?) Disk: Text file: { XML, Protocol Buffer, JSON } (bad performance). Binary file: { serialized, custom format } (not portable, questions about performance). Database: { MySQL, SQL Server, SQLite, Oracle } (bad performance, complex, huge latency).
  • 12. The Problem with the Database Causes: Old and very complex system. Many wasted features. Many steps to process a SQL query. Needs administration, among other things.
  • 13. The Problem with the Database Causes: Old and very complex system. The RDBMS is a very complex system, much like an operating system: Thread Scheduling, Deadlock Monitor, Resource Manager, I/O Manager, Pages Manager, Execution Plan Manager, Case Manager, Memory Manager, Transaction Manager, etc. Most DBMS architectures, designs, and algorithms date from around the 1970s: different hardware and platform properties; old architecture, design, and algorithms. Please review resource #1
  • 14. The Problem with the Database Causes: Many wasted features. Today's systems have very rich feature sets, simply because their designers think that ‘one size fits all’: CLR Types, CLR Integration, Replication, Functions, Policy, Relations, Transactions, Stored Procedures, ACID, etc. You can even call a Web Service from SQL Server! All this mess makes the database look like a platform and development environment.
  • 15. The problem with the Database Causes: Many steps to process the query. Parse the query. Build the expression tree and resolve the relational algebra expression. Optimize the expression tree. Choose the execution plan. Start executing. Please review resources #2, #3
  • 16. The problem with the Database Effects Bad Performance: Throughput, Resource usage, Latency. Not Scalable.
  • 17. The problem with the Database Effects: Bad performance (throughput, resource usage, latency): Even a fast DBMS such as MySQL can’t deliver more than 5,000 queries per second*. Add to this the resources consumed and the high latency. * Depends on the configuration
  • 18. The problem with the Database Effects: Does not scale: The database is not designed to scale. Even if you add a new machine and partition the database, you will never get an acceptable performance improvement. Please review resource #1
  • 19. The problem with the Database The database gives us ACID: Atomicity: a transaction is all or nothing. Consistency: only valid data is written to the database. Isolation: transactions behave as if they were happening serially, and the data is correct. Durability: what you write is what you get.
  • 20. The problem with the Database The problem with ACID is that it gives you too much; it trips you up when you are trying to scale a system across multiple nodes. Downtime is unacceptable, so your system needs to be reliable. Reliability requires multiple nodes to handle machine failures. To make a scalable system that can handle lots and lots of reads and writes, you need many more nodes.
  • 21. The problem with the Database Once you try to scale ACID across many machines you hit problems with network failures and delays. The algorithms don't work in a distributed environment at any acceptable speed. It’s a dead end
  • 22. The Next generation of Storage Systems Long ago, many research teams and companies discovered that the database is the main bottleneck: many wasted features, bad performance, and not designed for scalable systems.
  • 23. The Next generation of Storage Systems Building large systems on top of a traditional RDBMS data storage layer is no longer good enough. This talk explores the landscape of new technologies available today to augment your data layer to improve performance and reliability. Please review resource #4
  • 24. Key-Value Storage Systems Simple data model: just key-value pairs. Every value is assigned to a key. No complex features such as relations, ACID, or SQL queries. Simple interface: Get(key), Put(key, value), Delete(key) (optional).
  • 25. Key-Value Storage Systems Designed from the start to scale to hundreds of machines. Designed to be reliable, even if 50% of the machines crash. No extra work is required to add a new machine: just plug the machine in and it will work in harmony with the rest. Many open source projects (C++, Java, Lisp).
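The Get/Put/Delete interface on slide 24 can be sketched in a few lines of Python. This is only an illustration: the class name `KeyValueStore` and the in-memory dictionary behind it are assumptions made for the example, not the API of Redis, Tokyo Tyrant, or any other system named in the talk.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Hypothetical in-memory stand-in for a key-value store interface."""

    def __init__(self) -> None:
        # A plain dict plays the role of the storage engine here.
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        """Store (or overwrite) the value assigned to a key."""
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        """Return the value for a key, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("user:42", b"Alice")
print(store.get("user:42"))
```

Note that there is no schema, no query language, and no relation between keys; the caller decides how to encode values.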
  • 26.
  • 27. Storing, and huge data analysis.
  • 28.
  • 29. Key-Value Storage Systems Now you should make your decision: take the blue pill and see the truth, or take the red pill and stay in Wonderland.
  • 30. Key-Value Storage Systems Key-value storage systems, and other systems, are built around the CAP concept: Consistency: your data is correct all the time. What you write is what you read. Availability: you can read and write your data all the time. Partition Tolerance: if one or more nodes fail, the system still works and becomes consistent when the system comes back on-line.
  • 31. Key-Value Storage Systems One Node - Performance Comparison (Web). MySQL: 3,030 sets/second, 4,670 gets/second. Redis: 11,200 sets/second (3.7x MySQL), 9,840 gets/second (2.1x MySQL). Tokyo Tyrant: 9,030 sets/second (3.0x MySQL), 9,250 gets/second (2.0x MySQL). Please review resource #5
  • 32. Key-Value Storage Systems Two High-End Nodes - Performance Comparison (Web). Redis: 89,230 sets/second, 85,840 gets/second.
  • 33. Key-Value Storage Systems One Node - Performance Comparison. SQL Server: 2,900 sets/second, 3,500 gets/second. Vina*: 10,100 sets/second (3.4x SQL Server), 9,970 gets/second (2.8x SQL Server). * Vina: key-value storage system used inside Kngine.
  • 34. How It Works Any key-value storage system consists of two primary layers: Aggregation Layer. Storing Layer.
  • 35. How It Works Any key-value storage system consists of two primary layers: Aggregation Layer: manages the instances, replication, and distribution. Storing Layer: one or many disk-based hash tables.
  • 36. How It Works (Storing Layer) On the board
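The slides leave the storing layer's design for the whiteboard, so the following is only a rough stand-in for the idea of a disk-based table that maps keys to values. It assumes Python's standard-library `dbm.dumb` module as the on-disk store; the function name `roundtrip` and the file name `kvstore` are hypothetical, and nothing here reflects how Redis, Tokyo Cabinet, or Vina lay out data on disk.

```python
import dbm.dumb
import os
import tempfile


def roundtrip(key: bytes, value: bytes) -> bytes:
    """Write one pair to disk, reopen the store, and read the pair back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "kvstore")  # hypothetical file name

        # "c" opens the store for read/write, creating it if needed.
        db = dbm.dumb.open(path, "c")
        db[key] = value
        db.close()  # flushes the index to disk

        # Reopen read-only to show the value survived outside the process memory.
        db = dbm.dumb.open(path, "r")
        stored = db[key]
        db.close()
        return stored


print(roundtrip(b"user:42", b"Alice"))
```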
  • 37. How It Works (Aggregation Layer) Receives the requests. Routes them to the target node. Manages partitioning and replicas. Partitioning and replication are done by the consistent hashing algorithm. On the board. Please review resource #6
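The routing step on slide 37 can be illustrated with a small consistent hashing ring. This is a minimal sketch under stated assumptions: the class `ConsistentHashRing`, the use of MD5 as the hash function, the 100 virtual points per node, and the node names are all choices made for the example; the slides do not say how any particular system configures its ring.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(text: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring: each node owns several points on a hash circle."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def nodes_for(self, key: str, replicas: int = 1) -> List[str]:
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (_hash(key), ""))
        owners: List[str] = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == replicas:
                    break
        return owners


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.nodes_for("user:42", replicas=2))
```

The property that matters for scaling is that removing or adding a node only moves the keys that node owned; every other key keeps its owner, so most of the data stays where it is.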
  • 38. Key-Value Storage Systems Amazon Dynamo (paper). Facebook Cassandra (open source). Tokyo Cabinet/Tyrant (open source). Redis (open source). MongoDB (open source).
  • 39. Q / A
  • 40. References (1) The End of an Architectural Era (It’s Time for a Complete Rewrite). Paper. (2) Database Systems, Paul Beynon-Davies. Book. (3) Inside SQL Server engine, MS Press. Book. (4) Drop ACID and Think About Data. Highscalability.com. (5) Redis vs MySQL vs Tokyo Tyrant. Colin Howe’s blog. (6) Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. Paper. (7) Dynamo: Amazon’s Highly Available Key-value Store. Paper. (8) Redis, Tokyo Tyrant projects. (9) Consistent Hashing. Tom White’s blog.
  • 41. Resources High Scalability blog: Highscalability.com. It’s all about innovation blog: Hfadeel.com/blog. All Things Distributed: Allthingsdistributed.com. Tom White’s blog: lexemetech.com.
  • 42. Thanks… Dear all, all of my presentation content is open source. Please feel free to use, copy, and redistribute it.