NewSQL - Deliverance from BASE and back to SQL and ACID 
There are a number of NewSQL products now on the market, such as VoltDB and Postgres-XL. These promise 
NoSQL performance and scalability but with ACID and relational concepts implemented with ANSI SQL. 
This session will cover why NoSQL came about, why it's had its day and why NewSQL will become the 
backbone of the Enterprise for OLTP and Analytics. 
Tony Rogerson, SQL Server MVP 
tonyrogerson@torver.net 
@tonyrogerson 
http://dataidol.com/tonyrogerson
Who am I? 
Freelance SQL Server professional and Data Specialist 
Fellow BCS, MSc in BI, PGCert in Data Science 
28 years of development and database experience, 22 of them with SQL Server – starting out in 1986 
with VSAM, System W, Application System, DB2 and Oracle, crossing over to Client/Server and 
SQL Server since 4.21a in 1993 
Awarded SQL Server MVP yearly since ’97 
Founded UK SQL Server User Group back in ’99, founder member of DDD, SQL Bits, SQL Relay, 
SQL Santa 
Interested in commodity based distributed processing of Data (naturally!)
Agenda 
NoSQL 
◦ Why the need? 
◦ What products are available? 
Transactions 
◦ BASE 
◦ ACID 
SQL 
◦ What is today’s SQL capable of? 
◦ SQL Server performance – NoSQL required? 
NewSQL 
◦ SQL -> NoSQL -> NewSQL (distributed form of where we started) 
◦ Distributed Data and ACID 
Discussion
Not Only SQL (NoSQL) 
WHY THE NEED?
Why the Need? 
The year is 2001 and 
◦ It’s that Big Data thing…. 
◦ Mainstream Relational Databases (that use SQL) are scale up 
◦ More grunt required – buy a bigger box 
◦ SAN based storage is ridiculously expensive and complicated, heavy TCO 
Y2K + 1 
◦ Developers twiddling their thumbs ;) 
Web adoption accelerates 
◦ Google, Yahoo, Amazon and the like are born 
◦ MySQL does not scale – too inflexible 
◦ Up front costs of kit for projects/business that may fail – need elasticity 
http://www.tomshardware.co.uk/15-years-of-hard-drive-history-uk,review-1908-7.html
Products Available 
Varied – type of NoSQL database 
◦ Graph 
◦ Key-Value 
◦ Column store/Column Family 
◦ Document Store 
◦ Object 
◦ Relational but without SQL 
You name it and there is a product to do it
Performance Today [commodity] 
[Chart: 64KiB I/O, 100% read; 100% sequential vs 100% random]
ACID 
Atomicity 
◦ The bounds of the transaction – everything within those bounds is a single unit of work 
◦ All or nothing 
Consistency 
◦ Data must reside in the correct Domain of values 
◦ Deferrable to the end of the unit of work 
Isolation 
◦ Changes are Isolated from other users 
◦ Other connections cannot update what you have updated/updating 
◦ Multi-Version Concurrency Control (MVCC) – snapshots 
◦ Locking 
Durability 
◦ In system failure your changes are still maintained – nothing is lost
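
The "all or nothing" point can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the account table, its CHECK constraint and the transfer function are assumptions made up for illustration. The credit succeeds first, the debit then violates the constraint, and the rollback undoes both.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single unit of work (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # debit fails, so the credit is undone as well
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer the balances are the same as after the first, successful one: nothing from the aborted unit of work is left behind.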
BASE (Basically Available, Soft-state, Eventually Consistent) 
BASE is a transactional model of sorts (applied at the global level, rather than to individual transactions) 
Specific to the distributed database model 
Basically Available – all or some of the system is available 
Node 1 Node 2 Node 3
BASE (Basically Available, Soft-state, Eventually 
Consistent) 
Soft-state 
Eventually Consistent 
System may change over time [as replicas become up-to-date (consistent)] 
Node 1 Node 2 Node 3 
Insert value ‘A’
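
A toy simulation of the "Insert value ‘A’" picture, offered as a rough sketch only. The Cluster class, its in-memory replication queue and the zero-based node numbers are all assumptions for illustration, not how any particular product replicates.

```python
from collections import deque

class Cluster:
    """Hypothetical three-node store with asynchronous replication."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.pending = deque()  # replication messages not yet delivered

    def write(self, node, key, value):
        # The write is acknowledged once the local node has it (soft state).
        self.nodes[node][key] = value
        for other in range(len(self.nodes)):
            if other != node:
                self.pending.append((other, key, value))

    def read(self, node, key):
        return self.nodes[node].get(key)

    def replicate_all(self):
        # Deliver queued changes; afterwards every replica agrees.
        while self.pending:
            target, key, value = self.pending.popleft()
            self.nodes[target][key] = value

cluster = Cluster(3)
cluster.write(0, "k", "A")
stale = cluster.read(2, "k")  # replica has not seen the insert yet
cluster.replicate_all()
converged = [cluster.read(n, "k") for n in range(3)]
```

Reading from the third node before replication returns nothing; once the queue drains, all three nodes return 'A'.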
Eventual Consistency in SQL Server 
Asynchronous Availability Groups/Database Mirroring 
Replication 
Eventual / Causal Consistency 
◦ Eventual no good for order specific [and important] transactions 
◦ Like Merge replication 
◦ Causal: deliver messages in correct order [e.g. service broker] 
◦ Like Transactional Replication
ACID - Distributed 
2PC is clunky and doesn’t scale across many nodes 
Paxos – consensus protocol – scales better 
Remove the need for distributed ACID altogether 
[Diagram: 2PC transaction – the Coordinator sends the INSERT to three Subordinates; all or nothing]
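
The coordinator/subordinate flow can be sketched in a few lines. This is a simplified in-process model under stated assumptions: the Subordinate class, its can_commit flag and two_phase_commit are hypothetical names, and a real 2PC implementation also needs durable logging, timeouts and recovery, none of which is shown.

```python
class Subordinate:
    """Hypothetical participant that stages a row and then commits or rolls back."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.staged = None
        self.rows = []

    def prepare(self, row):
        # Phase 1: vote yes only if the work can be made permanent.
        if not self.can_commit:
            return False
        self.staged = row
        return True

    def commit(self):
        # Phase 2 (all voted yes): make the staged row permanent.
        self.rows.append(self.staged)
        self.staged = None

    def rollback(self):
        # Phase 2 (someone voted no): discard the staged row.
        self.staged = None


def two_phase_commit(subordinates, row):
    """Coordinator: commit everywhere or nowhere."""
    votes = [s.prepare(row) for s in subordinates]
    if all(votes):
        for s in subordinates:
            s.commit()
        return True
    for s in subordinates:
        s.rollback()
    return False


healthy = [Subordinate("s1"), Subordinate("s2"), Subordinate("s3")]
committed = two_phase_commit(healthy, ("Tony", "Rogerson"))

degraded = [Subordinate("s1"), Subordinate("s2", can_commit=False), Subordinate("s3")]
aborted = two_phase_commit(degraded, ("Tony", "Rogerson"))
```

Every subordinate is contacted twice per transaction, which is one reason the slide calls 2PC clunky across many nodes.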
Mixing BASE and ACID 
ACID applied on the local data node 
BASE applied to remote nodes
Relational 
Sets 
Tables with Rows x Columns 
Relational Theory dictates the row/column intersection is an Atomic value i.e. contains only a 
single value from the domain modelled for that column 
Chris Date: 
◦ Atomicity cannot really be defined as absolute in Normal Form 
◦ a column can contain “relational values” i.e. another table 
Normal Form – the process used to define the schema around the data being modelled
OldSQL roots 
Built for disk storage 
Built for single machine, scale-up 
Mature SQL language (decades of research) over the Relational Model 
SQL extensions to deal with unstructured data (freetext)
OldSQL today 
ACI [no Durability] 
In-Memory 
Modified design to work with Flash 
Still scale-up
SQL Server 
Delayed / No-Durability in SQL Server 2014 
In-Memory extensions 
Entity Attribute Value design combined with ColumnStore 
Sparse Columns / Column sets 
DEMOS
NewSQL 
OLDSQL -> SQL -> NEWSQL
Describe NewSQL 
NewSQL = OldSQL + Transparent_Data_Distribution + ACID 
Also – add in the bells and whistles for new tech 
◦ Flash 
◦ RAM 
◦ Processor cache improvements 
◦ Better parallelisation across local processor cores 
Basically -> Scale out with ACID
Latency in a Distributed environment 
[Diagram: servers and a SQL Server connected through a switch over 1Gbit ethernet] 
Query returns FirstName, Surname, DOB: 20,000 rows, 558KiBytes of data 
Data travel: Slowest, Slower, Fastest
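
As a back-of-envelope check on the figures above, the wire time for the result set can be estimated. This is a rough lower bound only: it assumes 1Gbit means 10^9 bits per second and ignores protocol overhead, switch hops and round trips. The function name is made up for illustration.

```python
def transfer_ms(payload_kib, link_gbit_per_s):
    """Best-case time in milliseconds to push a payload across a link."""
    bits = payload_kib * 1024 * 8           # KiB -> bytes -> bits
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds * 1000

wire_time = transfer_ms(558, 1.0)  # roughly 4.57 ms for the 558KiB result set
```

A few milliseconds per query is small in isolation but adds up quickly in an OLTP workload, which is the motivation for keeping data close to the processing.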
Reduce Latency – Data Locality 
[Diagram: servers connected through a switch over 1Gbit ethernet, with SQL Server running on several of the servers]
Distributed SQL with ACID 
[Diagram: SQL Server instances on separate servers, connected through a switch over 1Gbit ethernet] 
• 2 Phase Commit using DTC 
• High Latency 
• All or nothing 
BEGIN DISTRIBUTED TRAN 
INSERT Server3.pres_NEWSQL.dbo.people( ….. ) 
INSERT Server2.pres_NEWSQL.dbo.people( ….. ) 
INSERT Server1.pres_NEWSQL.dbo.people( ….. ) 
COMMIT TRAN
Querying a Distributed Environment 
Financial Trading – Global position of the book 
TOP 10 customers 
Not easy (at speed) in an OLTP setting 
Network Switch 
N1 N2 N3 N4
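
Why a global TOP 10 is awkward can be seen in a small scatter-gather sketch. The per-node totals and customer names are invented sample data, and global_top_n is a hypothetical helper. The point is that each node's local top list is not enough when a customer's activity is spread across nodes, so partial totals have to be merged first.

```python
from collections import Counter
import heapq

def global_top_n(node_totals, n):
    """Merge per-node partial totals, then rank across the whole cluster."""
    merged = Counter()
    for totals in node_totals:
        merged.update(totals)  # adds each node's partial sum per customer
    return heapq.nlargest(n, merged.items(), key=lambda kv: kv[1])

# Partial totals per customer as held on three nodes (sample data).
nodes = [
    {"alice": 50, "bob": 40},
    {"carol": 45, "bob": 30},
    {"alice": 10, "dave": 5},
]

top2 = global_top_n(nodes, 2)
local_leaders = [max(t, key=t.get) for t in nodes]
```

Here bob is the global leader with 70, yet he is not the top customer on any single node, so every node has to ship its partial aggregates across the network before the ranking can be computed.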
Couple {Data, Processing} with {Machine-n}
Partitioning 
Chop a big table up into “horizontal partitions” 
Partition key required (Hash, Modulo, Key range) 
Each partition is self-contained, binding rows by the partitioning key 
Access all data through a logical view over all partitions (local database) 
Table by table basis
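
The three partition-key schemes named above can be sketched as small functions. The function names, the CRC32 hash and the boundary values are assumptions chosen for illustration; real products use their own hash functions and boundary definitions.

```python
import bisect
import zlib

def by_modulo(key, partitions):
    """Modulo: suits evenly distributed integer keys."""
    return key % partitions

def by_hash(key, partitions):
    """Hash: spreads arbitrary string keys; CRC32 is deterministic across runs."""
    return zlib.crc32(key.encode("utf-8")) % partitions

def by_range(key, boundaries):
    """Key range: boundaries are sorted; each value starts a new partition."""
    return bisect.bisect_right(boundaries, key)

boundaries = [1000, 2000]  # partition 0: < 1000, 1: 1000..1999, 2: >= 2000
low = by_range(5, boundaries)
mid = by_range(1000, boundaries)
high = by_range(2500, boundaries)
```

Range partitioning keeps neighbouring keys together, which helps range scans, while hash and modulo spread load more evenly at the cost of scattering related keys.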
Shared Nothing 
Partitioning+ 
Each Shard is self-contained and has all the procs, meta-data and of course your partition of data 
Shard Key common to multiple tables, for example CustomerID, Email Address. 
Greater autonomy across the distributed database 
Seeing the entire database as a logical unit is more difficult – joining is a nightmare 
Node 1 
Node 2 
Node 3
Data Distribution using Hashing 
Distributed Database Cluster has fixed number of data nodes 
Your data is spread across the database cluster 
◦ 10 node cluster; each data item may reside on 3 nodes 
◦ Which 3 nodes? 
Data key is Hashed to a number – hashing algorithm is deterministic 
data-node = f( data-key ) 
◦ print ( checksum( 'All hale to the ale' ) * 1.) % 10 
◦ print ( checksum( 'And a glass of wine for the ladies' ) * 1.) % 10
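
The "which 3 nodes?" question can be answered with a deterministic placement function. This is a minimal sketch under stated assumptions: replica_nodes is a hypothetical helper, SHA-256 stands in for whatever hash a real product uses, and placing the extra copies on the next nodes in sequence is just one simple policy. Unlike the T-SQL CHECKSUM expressions above, which work on a signed integer and can therefore produce a negative remainder, the value here is always in the range 0 to 9.

```python
import hashlib

def replica_nodes(data_key, node_count=10, copies=3):
    """Map a key to `copies` distinct nodes: data-node = f(data-key)."""
    digest = hashlib.sha256(data_key.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % node_count
    # Place the remaining copies on the following nodes, wrapping around.
    return [(primary + i) % node_count for i in range(copies)]

first = replica_nodes("All hale to the ale")
second = replica_nodes("And a glass of wine for the ladies")
```

Because the hash is deterministic, any client or coordinator can compute the same node list from the key alone, with no lookup table. The catch is the fixed node count: changing it remaps most keys.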
Sharding Sync 
[Diagram labels: Apps; Pick a node; LOGICAL DATABASE; Node 1, Node 2, Node 3; Full copy of data; Subset of data; Replication]
Postgres-XC 
Applications (issue SQL to coordinators) 
Coordinators (plans, 2pc trans, knows about data distribution) 
Data Nodes 
GTM – Global Transaction Manager 
http://de.slideshare.net/PavanDeolasee/postgresxc-28475161
Combine Sharding + Replication 
Shard your big tables based on a hash (or something) around your business key e.g. Customer, 
EmailAddress etc. 
Replicate static tables.
Discussion 
Tonyrogerson@torver.net 
@tonyrogerson 
http://dataidol.com/tonyrogerson


Editor's Notes

  1. GTM keeps simple state info (not a database itself). GXIDs (Global Transaction IDs) – across cluster. MVCC. One active GTM per cluster, though standbys are available.