The ScaleDB 
Storage Engine 
Enabling high performance and 
scalability, using a Multi-Table Index, 
and a Shared-Disk Clustering 
Architecture 
Moshe Shadmon moshe@scaledb.com
Agenda 
 Overview 
 ScaleDB’s Clustering Architecture 
o Shared-Disk vs. Shared-Nothing 
o MySQL and a Shared-Disk Storage Engine 
o ScaleDB Installation 
o Demo 
 ScaleDB’s Indexing Technology 
o Multi-Table Index 
o Enabling Multi-Table Index in MySQL 
o Demo 
 Summary 
 ScaleDB Status & Product Availability
Overview 
 Plug-in Storage Engine for MySQL 
 Main Features: 
o Shared-Disk Architecture 
o Innovative Multi-Table Indexing 
o Transactional 
o Row-Level Locking 
o ACID Compliant 
o Atomicity: All tasks of a transaction are performed, or none of them are. 
o Consistency: The database is in a consistent state before and after the transaction. 
o Isolation: Data is not available in an intermediate state during a transaction. 
o Durability: When a transaction completes, the transaction’s data will persist. 
o Disk-Based Storage Engine
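The atomicity property listed above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in (it is not ScaleDB); the account table, the names and the transfer helper are hypothetical and exist only for this illustration.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one transaction; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            # This debit violates the CHECK constraint when src lacks funds,
            # which also undoes the credit executed just above.
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM account"))

if __name__ == "__main__":
    bank = make_bank()
    print(transfer(bank, "alice", "bob", 30), balances(bank))
    print(transfer(bank, "alice", "bob", 500), balances(bank))
```

The second transfer fails halfway through, and the balances afterwards are the same as before it started: either both updates apply or neither does.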
Shared-Disk vs. Shared-Nothing 
 Manageability 
 Adaptability 
 Availability/Fault-Tolerance 
 Scalability 
 Performance 
 Total Cost of Ownership (TCO)
Shared-Nothing: 
Vertical Partitioning 
[Diagram: a single database instance (Instance 1) holding Tables A, B, and C, next to three database instances (Instances 1, 2, and 3) with Tables A, B, and C spread across them]
Shared-Nothing: 
Partitioning Your Data…How 
 Predict usage patterns, application evolution, data 
growth patterns…all are moving targets 
 Avoid data skew: bottlenecks caused by frequently 
accessed data on just a few nodes 
 Avoid data shipping between nodes 
 Avoid delays from distributed 2-phase commit 
 Searches outside the partition column require 
participation by all nodes 
 Scaling becomes an exercise in fire fighting
Shared-Nothing: 
Horizontal Partitioning 
Logical View 
name    age  salary 
Bob     20   10K 
Shideh  18   35K 
Ted     50   60K 
Kevin   62   120K 
Angela  55   140K 
Mike    45   90K 
Physical View (Partitioned by Salary) 
Partition with salary % 3 = 0 
name    age  salary 
Ted     50   60K 
Kevin   62   120K 
Mike    45   90K 
Partition with salary % 3 = 1 
name    age  salary 
Bob     20   10K 
Partition with salary % 3 = 2 
name    age  salary 
Shideh  18   35K 
Angela  55   140K 
Horizontal Partitioning – Salary % 3
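The Salary % 3 scheme can be sketched in a few lines of Python. The rows are the ones from the logical view, with salary expressed in thousands; the function name and the choice of a dictionary keyed by remainder are assumptions made for this illustration only.

```python
from collections import defaultdict

# Logical view from the slide: (name, age, salary in thousands).
EMP = [
    ("Bob", 20, 10), ("Shideh", 18, 35), ("Ted", 50, 60),
    ("Kevin", 62, 120), ("Angela", 55, 140), ("Mike", 45, 90),
]

def partition_by_salary(rows, nodes=3):
    """Assign each row to the partition numbered salary % nodes."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[2] % nodes].append(row)
    return dict(parts)

if __name__ == "__main__":
    for node, rows in sorted(partition_by_salary(EMP).items()):
        print(node, [name for name, _, _ in rows])
```

Note that the partitions are uneven (three rows, one row, two rows), which is a small-scale instance of the data skew that shared-nothing designs have to manage.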
Shared-Nothing: 
Horizontal Partitioning Pitfalls 
 Selections with equality predicates referencing 
the partitioning attribute are directed to a 
single node: 
o Retrieve Emp where salary = 60K 
SELECT * FROM Emp WHERE salary = 60K 
 Equality predicates referencing a non-partitioning 
attribute and range predicates are 
directed to all nodes: 
o Retrieve Emp where age = 20 
o Retrieve Emp where salary < 20K 
SELECT * FROM Emp WHERE salary < 20K
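The routing rule behind these pitfalls can be sketched as follows, assuming the salary modulo 3 partitioning used in the horizontal partitioning example. The function name and its parameters are hypothetical and do not correspond to any real router API.

```python
def target_nodes(column, op, value, partition_column="salary", nodes=3):
    """Return the ids of the nodes a single-predicate query must visit.

    Only an equality predicate on the partitioning column can be routed
    to one node; every other predicate is broadcast to all nodes.
    """
    if column == partition_column and op == "=":
        return [value % nodes]
    return list(range(nodes))

if __name__ == "__main__":
    print(target_nodes("salary", "=", 60))  # equality on the partition column
    print(target_nodes("age", "=", 20))     # non-partitioning attribute
    print(target_nodes("salary", "<", 20))  # range predicate
```

With hash-style partitioning such as modulo, a range predicate gives no information about which remainders can match, so it has to be sent everywhere even though it references the partitioning column.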
Shared-Disk: 
No Partitioning, Full Access to Data 
[Diagram: a single database instance (Instance 1) holding Tables A, B, and C, next to three DB cluster nodes (Nodes 1, 2, and 3) joined by a High-Speed Interconnect and all accessing Tables A, B, and C on a Shared Disk Subsystem]
Scalability & Availability 
Shared-Nothing 
[Diagram: Slave A, Slave B, Slave C]
Scalability & Availability 
Shared-Disk 
[Diagram: Nodes A, B, C, D, and E are MySQL servers with the ScaleDB engine, all attached to the same data]
Shared-Disk: 
Summarizing Shared-Disk Benefits 
 Grow by simply adding nodes to the cluster 
o Servers can be added and removed dynamically 
according to your needs 
o No interruption to your application 
 High-Availability with dynamic failover 
o Existing nodes automatically take over 
 Significantly reduced maintenance costs 
o Can be built on low-cost commodity hardware 
o No data partitioning 
o No need for slaves 
 Low Total Cost of Ownership (TCO)
Shared-Disk: 
Making it work with MySQL 
[Diagram: Node 1 runs Server Instance A with ScaleDB Engine Instance A; Node 2 runs Server Instance B with ScaleDB Engine Instance B. Each engine instance contains a Cluster Manager, a Buffer Manager, and a Comm. Layer. The nodes are joined by the Cluster Interconnect and attached to the Shared Disk Sub-system.]
Shared-Disk: Insert New Row 
[Diagram: inserting a new row in a two-node cluster. Node 1 (Server Instance A, ScaleDB Engine Instance A) and Node 2 (Server Instance B, ScaleDB Engine Instance B), each with a Cluster Manager, Buffer Manager, and Comm. Layer, joined by the Cluster Interconnect over the Shared Disk Sub-system.]
Shared-Disk: Select 
[Diagram: a select in a two-node cluster. Node 1 (Server Instance A, ScaleDB Engine Instance A) and Node 2 (Server Instance B, ScaleDB Engine Instance B), each with a Cluster Manager, Buffer Manager, and Comm. Layer, joined by the Cluster Interconnect over the Shared Disk Sub-system.]
Shared-Disk: Create Table 
[Diagram: creating Table A in a two-node cluster. Node 1 (Server Instance A, ScaleDB Engine Instance A) and Node 2 (Server Instance B, ScaleDB Engine Instance B), each with a Cluster Manager, Buffer Manager, and Comm. Layer, joined by the Cluster Interconnect over the Shared Disk Sub-system. Labels: Table A, Table A Meta-Data, Meta-Data.]
ScaleDB Installation 
 Define cluster = true in the ScaleDB config file 
 ScaleDB.cnf is in the same directory as my.cnf 
 Cluster params: 
o cluster = true 
o nodes_in_cluster = 2 
o node_id = 1 
o this_machine_port = 100 
o next_machine_ip_address = 192.168.0.101 
o next_machine_port = 100 
o log_directory = /share/logs/
Demo - Sysbench 
 ScaleDB cluster – one node – show throughput 
 ScaleDB cluster – 2nd node – show throughput
ScaleDB: Multi-Table Indexing 
B-tree: Only indexes the data in tables 
[Diagram: tables #1 to #5, each with its own separate index] 
ScaleDB: Indexes the data and relationships 
[Diagram: a single ScaleDB index spanning tables #1 to #5] 
Advantages: 
• Faster 
• Smaller 
• Referential integrity
Example 
 Scenario: Select information that is spread 
across 3 tables: Colleges, Students and 
Enrollment 
 Relationships: Students are enrolled in courses 
within departments of colleges 
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade 
FROM College AS c1 
JOIN Student AS s 
JOIN Enrollment AS e 
JOIN Course AS c2 
ON ( c1.CollNo = s.CollNo AND 
s.CollNo = e.CollNo AND 
s.StudentNo = e.StudentNo AND 
e.CollNo = c2.CollNo AND 
e.DeptNo = c2.DeptNo AND 
e.CourseNum = c2.CourseNum ) 
WHERE c1.CollNo = X 
AND s.StudentNo = Y ;
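The join above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, writes one ON clause per join, and binds X and Y as parameters. The column types, the sample rows, the course numbers and the course names are assumptions made for this illustration.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE College (CollNo TEXT PRIMARY KEY, CollName TEXT);
CREATE TABLE Student (CollNo TEXT, StudentNo TEXT, StudName TEXT,
                      PRIMARY KEY (CollNo, StudentNo));
CREATE TABLE Course (CollNo TEXT, DeptNo TEXT, CourseNum TEXT, CourseName TEXT,
                     PRIMARY KEY (CollNo, DeptNo, CourseNum));
CREATE TABLE Enrollment (CollNo TEXT, StudentNo TEXT, DeptNo TEXT,
                         CourseNum TEXT, Grade TEXT);
INSERT INTO College VALUES ('008', 'Medicine');
INSERT INTO Student VALUES ('008', '56-8037', 'Saul Goode');
INSERT INTO Student VALUES ('008', '56-8033', 'Mike Hogan');
-- Course numbers and names are hypothetical sample data.
INSERT INTO Course VALUES ('008', '4455', 'C1', 'Anatomy');
INSERT INTO Enrollment VALUES ('008', '56-8037', '4455', 'C1', 'B+');
INSERT INTO Enrollment VALUES ('008', '56-8033', '4455', 'C1', 'C');
"""

QUERY = """
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s ON c1.CollNo = s.CollNo
JOIN Enrollment AS e ON s.CollNo = e.CollNo AND s.StudentNo = e.StudentNo
JOIN Course AS c2 ON e.CollNo = c2.CollNo
                 AND e.DeptNo = c2.DeptNo
                 AND e.CourseNum = c2.CourseNum
WHERE c1.CollNo = ? AND s.StudentNo = ?
ORDER BY c2.CourseName
"""

def make_db():
    """Build the four tables in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn

def grades_for(conn, coll_no, student_no):
    """Run the four-table join for one college and one student."""
    return conn.execute(QUERY, (coll_no, student_no)).fetchall()

if __name__ == "__main__":
    print(grades_for(make_db(), "008", "56-8037"))
```

Each result row needs a lookup in every one of the four tables, which is the cost the later options try to reduce.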
Option #1: Conventional Joins 
College Table 
ID   College                  Students 
234  Institute of Technology  1,334 
167  High Tech Institute      5,742 
85   Golden State College     2,119 
298  Kaplan College           12,323 
510  California College       1,926 
Students Table 
ID    Student Name     SS#          Phone 
1220  Bruce Chizen     422-72-8495  (650) 234-2234 
6778  Naomi Seligman   533-99-1234  (279) 331-2345 
4435  Raymond Bingham 
8872  Reed Hastings    412-44-5567  (312) 676-8812 
1129  Maria Klawe 
1123  Bernard Vergnes 
Enrollment Table 
College  ID    Course Name    Student  Grade 
510      C67   Mathematics    4435     87 
167      C123  History 1      1129     70 
167      C14   Photography 1  1120     88 
Get College information 
Get Student information 
Search enrollment by College & Student
Option #2: Materialized View 
ID   College                  Students  ID    Course Name    ID    Student Name 
234  Institute of Technology  1,334     C134  Mathematics    1145  John Cheechoo … 
234  Institute of Technology  1,334     C134  Mathematics    1837  Ryane Clowe … 
234  Institute of Technology  1,334     C134  Mathematics    2256  Patrick Marleau … 
234  Institute of Technology  1,334     C134  Mathematics    2277  Jamie McGinn … 
234  Institute of Technology  1,334     C134  Mathematics    4113  Torrey Mitchell … 
. . . 
234  Institute of Technology  1,334     C134  Mathematics    1145  … 
385  Golden State College     2,224     G85   World History  7783  Joe Pavelski … 
385  Golden State College     2,224     G85   World History  2234  Jeremy Roenick … 
385  Golden State College     2,224     G85   World History  1177  Devin Setoguchi … 
385  Golden State College     2,224     G85   World History  4113  Torrey Mitchell …
Option #3: Multi-Table Index 
Colleges 
Coll_ID#  Coll_Name     Coll_Budget  Coll_Description 
001       Agriculture   $1,234,567   Nice place to visit 
002       Arts          $5,432,567   Sports not so good 
003       Business      $9,999,666   Cool logo 
004       Education     $3,234,567   Ugh Worcester 
005       Engineering   $8,238,568   Serious work 
006       Law           $7,237,767   Jumpy students 
007       Liberal Arts  $9,898,777   Pretty campus 
008       Medicine      $5,987,004   In Texas 
Students 
Student_ID#  College_ID#  Student_Name   Student_Desc 
56-8033      008          Mike Hogan     Caucasian 
56-8045      008          Moshe Smith    Caucasian 
56-8044      008          Sally Shadmon  Native American 
56-8055      008          Billy Fleegle  African American 
56-8037      008          Saul Goode     African American 
56-8122      008          Tim Collins    Polynesian 
56-8233      008          Sam Gee        Asian 
56-8334      008          Rod Paulino    Asian 
Enrollment 
College_ID#  Dept_ID#  Student_ID#  Grade 
008          4455      56-8037      B+ 
008          4455      56-8033      C 
008          4455      56-8045      B+ 
008          4456      56-8044      A- 
008          4456      56-8122      B- 
008          4454      56-8233      C 
008          4455      56-8334      F 
008          4454      56-8055      D 
[Diagram: the ScaleDB Multi-Table Index linking College, Students, Enrollment, Departments, and Courses]
Mapping Foreign Keys to Data Views 
 Create Students Table 
o Foreign key – College 
 Create Enrollment Table 
o Foreign key – Students 
 Create Course Table 
o Foreign key – Department 
 Create Department Table 
o Foreign key – College 
 Create College Table 
[Diagram: Students, Enrollment, Course, Department, College] 
The Parent-Child tables are created in MySQL so that MySQL can operate over the new tables. 
The data of the Parent-Child tables is assembled on the fly from the source tables.
Mapping Foreign Keys to Data Views 
[Diagram: ScaleDB parent-child hierarchies: College; College > Department; College > Department > Course; College > Students; College > Students > Enrollment] 
Physical files: 
1. College 
2. Department 
3. Student 
4. Course 
5. Enrollment 
Meta-Data Tables: 
1. College 
2. College-Dept 
3. College-Dept-Course 
4. College-Students 
5. College-Students-Enrollment 
6. Department 
7. Students 
8. Course 
9. Enrollment
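The first five meta-data table names follow from the foreign keys: each one is the chain of parents from College down to a given table. A minimal sketch of that derivation is shown below. The dictionary layout and function names are assumptions for illustration, "Dept" abbreviates Department as in the list above, and nothing here reflects how ScaleDB actually builds its meta-data.

```python
# Foreign keys from the slides, written as child table -> parent table.
PARENT = {
    "Students": "College",
    "Enrollment": "Students",
    "Course": "Dept",
    "Dept": "College",
}

def path_from_root(table, parent=PARENT):
    """Follow foreign keys up to the root and return the chain root-first."""
    chain = [table]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return "-".join(reversed(chain))

def parent_child_views(tables, parent=PARENT):
    """Return one parent-child view name per table, in the order given."""
    return [path_from_root(t, parent) for t in tables]

if __name__ == "__main__":
    for name in parent_child_views(
            ["College", "Dept", "Course", "Students", "Enrollment"]):
        print(name)
```

The remaining entries in the list (Department, Students, Course, Enrollment) are the single-table names, which need no derivation.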
Enabling the MySQL optimizer to 
use a Multi-Table Index 
Query with joins: 
SELECT c1.CollName, s.StudName, 
c2.CourseName, e.Grade 
FROM College AS c1 
JOIN Student AS s 
JOIN Enrollment AS e 
JOIN Course AS c2 
ON ( c1.CollNo = s.CollNo AND 
s.CollNo = e.CollNo AND 
s.StudentNo = e.StudentNo AND 
e.CollNo = c2.CollNo AND 
e.DeptNo = c2.DeptNo AND 
e.CourseNum = c2.CourseNum ) 
WHERE c1.CollNo = X 
AND s.StudentNo = Y ; 
Multi-Table Index table definition: 
CREATE TABLE sdb_view_college_course_student ( 
L1_CollNo INT NOT NULL, 
L1_CollName CHAR(32) NOT NULL, 
L1_CollBudget INT NOT NULL, 
L1_CollDescription CHAR(60) NOT NULL, 
-- … Table College Columns 
L2_StudNo INT NOT NULL, 
L2_StudName CHAR(48) NOT NULL, 
-- … Table Student Columns 
L3_CourseNum CHAR(9) NOT NULL, 
L3_Grade CHAR(2) NOT NULL, 
-- … Table Enrollment Columns 
PRIMARY KEY (L1_CollNo, L2_StudNo, L3_CourseNum)) 
ENGINE = SCALEDB; 
Query against the Multi-Table Index table: 
SELECT L1_CollName, L2_StudName, L3_CourseName, L3_Grade 
FROM sdb_view_college_course_student 
WHERE L1_CollNo = X AND L2_StudNo = Y ;
The Multi-Table Index 
 The Multi-Table Index appears to MySQL as a data table 
 ScaleDB does not maintain a data file associated with 
the Multi-Table Index 
 For a query using a virtual table, ScaleDB assembles 
the rows on the fly using the Multi-Table Index 
 ScaleDB indexes are different from B-tree indexes 
 ScaleDB indexes provide the same functionality as 
B-tree, plus… 
o They maintain referential integrity with minimal overhead 
o They allow you to search for the data and relationships 
o They are much smaller in size
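The idea of assembling virtual-table rows on the fly from an index that records relationships can be illustrated with a toy model. The slides do not describe the internal structure of the ScaleDB index, so the nested dictionary below is an assumption made purely for illustration; the sample rows come from the Colleges, Students and Enrollment tables shown earlier, and all function names are hypothetical.

```python
def build_index(colleges, students, enrollments):
    """Nest each child row under its parent: college -> student -> enrollment."""
    index = {}
    for coll_no, coll_name in colleges:
        index[coll_no] = {"name": coll_name, "students": {}}
    for stud_no, coll_no, stud_name in students:
        # A student whose college is missing raises KeyError here.
        index[coll_no]["students"][stud_no] = {"name": stud_name, "enrollment": []}
    for coll_no, dept_no, stud_no, grade in enrollments:
        # An enrollment whose student is missing raises KeyError here.
        index[coll_no]["students"][stud_no]["enrollment"].append((dept_no, grade))
    return index

def virtual_rows(index, coll_no, stud_no):
    """Assemble joined rows for one college and student without a stored view."""
    college = index[coll_no]
    student = college["students"][stud_no]
    return [(college["name"], student["name"], dept_no, grade)
            for dept_no, grade in student["enrollment"]]

COLLEGES = [("008", "Medicine")]
STUDENTS = [("56-8037", "008", "Saul Goode"), ("56-8033", "008", "Mike Hogan")]
ENROLLMENTS = [("008", "4455", "56-8037", "B+"), ("008", "4455", "56-8033", "C")]

if __name__ == "__main__":
    idx = build_index(COLLEGES, STUDENTS, ENROLLMENTS)
    print(virtual_rows(idx, "008", "56-8037"))
```

In this toy model a lookup by college and student walks one structure instead of probing a separate index per table, and a child row cannot be added without its parent, which loosely mirrors the referential-integrity point above.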
Demo 
 Query with join 
 Query with Multi-Table Index 
 2nd node virtual table
Benchmarking ScaleDB Index 
[Bar chart: Queries/Sec, axis 0 to 60, comparing Engine X Join, ScaleDB MTI, and ScaleDB 2 Nodes]
Summary 
 ScaleDB Cluster 
o Multiple ScaleDB instances share the same physical data. 
o Connecting to the cluster is similar to connecting to a single 
node. 
o For the application, the cluster appears as a single node. 
o Transparent application failover 
o Transparent Scalability 
 ScaleDB Indexes 
o Provide the B-tree functionality 
o High performance 
 Map relationships 
 Maintain referential integrity 
 Smaller footprint 
 Independent of the key size
ScaleDB Status and Product Availability 
 Started Beta Process 
o We are looking for beta companies 
 Product launch is scheduled for the June timeframe 
 Please talk to us if you are a developer interested 
in working with ScaleDB 
moshe@scaledb.com

Mais conteúdo relacionado

Destaque

EWD 3 Training Course Part 22: Traversing Documents using DocumentNode Objects
EWD 3 Training Course Part 22: Traversing Documents using DocumentNode ObjectsEWD 3 Training Course Part 22: Traversing Documents using DocumentNode Objects
EWD 3 Training Course Part 22: Traversing Documents using DocumentNode ObjectsRob Tweed
 
EWD 3 Training Course Part 30: Modularising QEWD Applications
EWD 3 Training Course Part 30: Modularising QEWD ApplicationsEWD 3 Training Course Part 30: Modularising QEWD Applications
EWD 3 Training Course Part 30: Modularising QEWD ApplicationsRob Tweed
 
EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2
EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2
EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2Rob Tweed
 
EWD 3 Training Course Part 18: Modelling NoSQL Databases using Global Storage
EWD 3 Training Course Part 18: Modelling NoSQL Databases using Global StorageEWD 3 Training Course Part 18: Modelling NoSQL Databases using Global Storage
EWD 3 Training Course Part 18: Modelling NoSQL Databases using Global StorageRob Tweed
 
EWD 3 Training Course Part 25: Document Database Capabilities
EWD 3 Training Course Part 25: Document Database CapabilitiesEWD 3 Training Course Part 25: Document Database Capabilities
EWD 3 Training Course Part 25: Document Database CapabilitiesRob Tweed
 
EWD 3 Training Course Part 24: Traversing a Document's Leaf Nodes
EWD 3 Training Course Part 24: Traversing a Document's Leaf NodesEWD 3 Training Course Part 24: Traversing a Document's Leaf Nodes
EWD 3 Training Course Part 24: Traversing a Document's Leaf NodesRob Tweed
 
EWD 3 Training Course Part 16: QEWD Services
EWD 3 Training Course Part 16: QEWD ServicesEWD 3 Training Course Part 16: QEWD Services
EWD 3 Training Course Part 16: QEWD ServicesRob Tweed
 
EWD 3 Training Course Part 26: Event-driven Indexing
EWD 3 Training Course Part 26: Event-driven IndexingEWD 3 Training Course Part 26: Event-driven Indexing
EWD 3 Training Course Part 26: Event-driven IndexingRob Tweed
 
EWD 3 Training Course Part 21: Persistent JavaScript Objects
EWD 3 Training Course Part 21: Persistent JavaScript ObjectsEWD 3 Training Course Part 21: Persistent JavaScript Objects
EWD 3 Training Course Part 21: Persistent JavaScript ObjectsRob Tweed
 
EWD 3 Training Course Part 31: Using QEWD for Web and REST Services
EWD 3 Training Course Part 31: Using QEWD for Web and REST ServicesEWD 3 Training Course Part 31: Using QEWD for Web and REST Services
EWD 3 Training Course Part 31: Using QEWD for Web and REST ServicesRob Tweed
 
EWD 3 Training Course Part 27: The QEWD Session
EWD 3 Training Course Part 27: The QEWD SessionEWD 3 Training Course Part 27: The QEWD Session
EWD 3 Training Course Part 27: The QEWD SessionRob Tweed
 
EWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWD
EWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWDEWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWD
EWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWDRob Tweed
 
EWD 3 Training Course Part 20: The DocumentNode Object
EWD 3 Training Course Part 20: The DocumentNode ObjectEWD 3 Training Course Part 20: The DocumentNode Object
EWD 3 Training Course Part 20: The DocumentNode ObjectRob Tweed
 
EWD 3 Training Course Part 19: The cache.node APIs
EWD 3 Training Course Part 19: The cache.node APIsEWD 3 Training Course Part 19: The cache.node APIs
EWD 3 Training Course Part 19: The cache.node APIsRob Tweed
 
EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...
EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...
EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...Rob Tweed
 
EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5
EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5
EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5Rob Tweed
 
MariaDB Vorstellung
MariaDB VorstellungMariaDB Vorstellung
MariaDB VorstellungMariaDB plc
 

Destaque (17)

EWD 3 Training Course Part 22: Traversing Documents using DocumentNode Objects
EWD 3 Training Course Part 22: Traversing Documents using DocumentNode ObjectsEWD 3 Training Course Part 22: Traversing Documents using DocumentNode Objects
EWD 3 Training Course Part 22: Traversing Documents using DocumentNode Objects
 
EWD 3 Training Course Part 30: Modularising QEWD Applications
EWD 3 Training Course Part 30: Modularising QEWD ApplicationsEWD 3 Training Course Part 30: Modularising QEWD Applications
EWD 3 Training Course Part 30: Modularising QEWD Applications
 
EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2
EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2
EWD 3 Training Course Part 38: Building a React.js application with QEWD, Part 2
 
EWD 3 Training Course Part 18: Modelling NoSQL Databases using Global Storage
EWD 3 Training Course Part 18: Modelling NoSQL Databases using Global StorageEWD 3 Training Course Part 18: Modelling NoSQL Databases using Global Storage
EWD 3 Training Course Part 18: Modelling NoSQL Databases using Global Storage
 
EWD 3 Training Course Part 25: Document Database Capabilities
EWD 3 Training Course Part 25: Document Database CapabilitiesEWD 3 Training Course Part 25: Document Database Capabilities
EWD 3 Training Course Part 25: Document Database Capabilities
 
EWD 3 Training Course Part 24: Traversing a Document's Leaf Nodes
EWD 3 Training Course Part 24: Traversing a Document's Leaf NodesEWD 3 Training Course Part 24: Traversing a Document's Leaf Nodes
EWD 3 Training Course Part 24: Traversing a Document's Leaf Nodes
 
EWD 3 Training Course Part 16: QEWD Services
EWD 3 Training Course Part 16: QEWD ServicesEWD 3 Training Course Part 16: QEWD Services
EWD 3 Training Course Part 16: QEWD Services
 
EWD 3 Training Course Part 26: Event-driven Indexing
EWD 3 Training Course Part 26: Event-driven IndexingEWD 3 Training Course Part 26: Event-driven Indexing
EWD 3 Training Course Part 26: Event-driven Indexing
 
EWD 3 Training Course Part 21: Persistent JavaScript Objects
EWD 3 Training Course Part 21: Persistent JavaScript ObjectsEWD 3 Training Course Part 21: Persistent JavaScript Objects
EWD 3 Training Course Part 21: Persistent JavaScript Objects
 
EWD 3 Training Course Part 31: Using QEWD for Web and REST Services
EWD 3 Training Course Part 31: Using QEWD for Web and REST ServicesEWD 3 Training Course Part 31: Using QEWD for Web and REST Services
EWD 3 Training Course Part 31: Using QEWD for Web and REST Services
 
EWD 3 Training Course Part 27: The QEWD Session
EWD 3 Training Course Part 27: The QEWD SessionEWD 3 Training Course Part 27: The QEWD Session
EWD 3 Training Course Part 27: The QEWD Session
 
EWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWD
EWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWDEWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWD
EWD 3 Training Course Part 28: Integrating Legacy Mumps Code with QEWD
 
EWD 3 Training Course Part 20: The DocumentNode Object
EWD 3 Training Course Part 20: The DocumentNode ObjectEWD 3 Training Course Part 20: The DocumentNode Object
EWD 3 Training Course Part 20: The DocumentNode Object
 
EWD 3 Training Course Part 19: The cache.node APIs
EWD 3 Training Course Part 19: The cache.node APIsEWD 3 Training Course Part 19: The cache.node APIs
EWD 3 Training Course Part 19: The cache.node APIs
 
EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...
EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...
EWD 3 Training Course Part 37: Building a React.js application with ewd-xpres...
 
EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5
EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5
EWD 3 Training Course Part 41: Building a React.js application with QEWD, Part 5
 
MariaDB Vorstellung
MariaDB VorstellungMariaDB Vorstellung
MariaDB Vorstellung
 

Último

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

The scale db storage engine enabling high performance and scalability using materialized views and a shared-disk clustering architecture presentation

  • 1. The ScaleDB Storage Engine Enabling high performance and scalability, using a Multi-Table Index, and a Shared-Disk Clustering Architecture Moshe Shadmon moshe@scaledb.com
  • 2. Agenda
      • Overview
      • ScaleDB’s Clustering Architecture
          o Shared-Disk vs. Shared-Nothing
          o MySQL and a Shared-Disk Storage Engine
          o ScaleDB Installation
          o Demo
      • ScaleDB’s Indexing Technology
          o Multi-Table Index
          o Enabling Multi-Table Index in MySQL
          o Demo
      • Summary
      • ScaleDB Status & Product Availability
  • 3. Overview
      • Plug-in Storage Engine for MySQL
      • Main Features:
          o Shared-Disk Architecture
          o Innovative Multi-Table Indexing
          o Transactional
          o Row-Level Locking
          o ACID Compliant
              o Atomicity: All tasks of a transaction are performed, or none of them are.
              o Consistency: The database is in a consistent state before and after the transaction.
              o Isolation: Data is not available in an intermediate state during a transaction.
              o Durability: When a transaction completes, the transaction’s data will persist.
          o Disk-Based Storage Engine
  • 4. Shared-Disk vs. Shared-Nothing
      • Manageability
      • Adaptability
      • Availability/Fault-Tolerance
      • Scalability
      • Performance
      • Total Cost of Ownership (TCO)
  • 5. Shared-Nothing: Vertical Partitioning [Diagram labels: a single Database Instance 1 with Table A, Table B, Table C; Database Instance 1, Database Instance 2, Database Instance 3 with Table A, Table B, Table C spread across them]
  • 6. Shared Nothing: Partitioning Your Data…How
      • Predict usage patterns, application evolution, data growth patterns…all are moving targets
      • Avoid data skew: bottlenecks caused by frequently accessed data on just a few nodes
      • Avoid data shipping between nodes
      • Avoid delays from distributed 2-phase commit
      • Searches outside the partition column require participation by all nodes
      • Scaling becomes an exercise in fire fighting
  • 7. Shared-Nothing: Horizontal Partitioning
      Logical View
          name    age  salary
          Bob     20   10K
          Shideh  18   35K
          Ted     50   60K
          Kevin   62   120K
          Angela  55   140K
          Mike    45   90K
      Physical View, Partitioned by Salary (Horizontal Partitioning: Salary % 3)
          name    age  salary
          Ted     50   60K
          Kevin   62   120K
          Mike    45   90K

          name    age  salary
          Bob     20   10K

          name    age  salary
          Shideh  18   35K
          Angela  55   140K
  • 8. Shared-Nothing: Horizontal Partitioning Pitfalls
      • Selections with equality predicates referencing the partitioning attribute are directed to a single node:
          o Retrieve Emp where salary = 60K
            SELECT * FROM Emp WHERE salary = 60000
      • Equality predicates referencing a non-partitioning attribute, and range predicates, are directed to all nodes:
          o Retrieve Emp where age = 20
          o Retrieve Emp where salary < 20K
            SELECT * FROM Emp WHERE salary < 20000
  • 9. Shared-Disk: No Partitioning, Full Access to Data [Diagram labels: DB Cluster Node 1, DB Cluster Node 2, DB Cluster Node 3; High-Speed Interconnect; Shared Disk Subsystem with Table A, Table B, Table C; Database Instance 1 with Table A, Table B, Table C]
  • 10. Scalability & Availability: Shared Nothing [Diagram labels: Slave A, Slave B, Slave C]
  • 11. Scalability & Availability: Shared Disk [Diagram labels: Node A, Node B, Node C, Node D, Node E; MySQL Servers with ScaleDB Engine; Data]
  • 12. Shared-Disk: Summarizing Shared-Disk Benefits
      • Grow by simply adding nodes to the cluster
          o Servers can be added and removed dynamically according to your needs
          o No interruption to your application
      • High-Availability with dynamic failover
          o Existing nodes automatically take over
      • Significantly reduced maintenance costs
          o Can be built on low-cost commodity hardware
          o No data partitioning
          o No need for slaves
      • Low Total Cost of Ownership (TCO)
  • 13. Shared-Disk: Making it work with MySQL [Diagram labels: Node 1 with Server Instance A and ScaleDB Engine Instance A (Cluster Manager, Buffer Manager, Comm. Layer); Node 2 with Server Instance B and ScaleDB Engine Instance B (Cluster Manager, Buffer Manager, Comm. Layer); Cluster Interconnect; Shared Disk Sub-system]
  • 14. Shared-Disk: Insert New Row [Diagram labels: Node 1 with Server Instance A and ScaleDB Engine Instance A (Cluster Manager, Buffer Manager, Comm. Layer); Node 2 with Server Instance B and ScaleDB Engine Instance B (Cluster Manager, Buffer Manager, Comm. Layer); Cluster Interconnect; Shared Disk Sub-system]
  • 15. Shared-Disk: Select [Diagram labels: Node 1 with Server Instance A and ScaleDB Engine Instance A (Cluster Manager, Buffer Manager, Comm. Layer); Node 2 with Server Instance B and ScaleDB Engine Instance B (Cluster Manager, Buffer Manager, Comm. Layer); Cluster Interconnect; Shared Disk Sub-system]
  • 16. Shared-Disk: Create Table [Diagram labels: Node 1 with Server Instance A and ScaleDB Engine Instance A (Cluster Manager, Buffer Manager, Comm. Layer); Node 2 with Server Instance B and ScaleDB Engine Instance B (Cluster Manager, Buffer Manager, Comm. Layer); Cluster Interconnect; Shared Disk Sub-system; Table A and Meta-Data shown twice]
  • 17. ScaleDB Installation
      • Define cluster = true in the ScaleDB config file
      • ScaleDB.cnf is in the same directory as my.cnf
      • Cluster params:
          o cluster = true
          o nodes_in_cluster = 2
          o node_id = 1
          o this_machine_port = 100
          o next_machine_ip_address = 192.168.0.101
          o next_machine_port = 100
          o log_directory = /share/logs/
  • 18. Demo - Sysbench
      • ScaleDB cluster, one node: show throughput
      • ScaleDB cluster, 2nd node: show throughput
  • 19. ScaleDB: Multi-Table Indexing
      • B-tree: Only indexes the data in tables [Diagram labels: Index #1 to Index #5; tables #1 to #5]
      • ScaleDB: Indexes the data and relationships [Diagram labels: ScaleDB Index; tables #1 to #5]
      • Advantages:
          o Faster
          o Smaller
          o Referential integrity
  • 20. Example
      • Scenario: Select information that is spread across 3 tables: Colleges, Students and Enrollment
      • Relationships: Students are enrolled in courses within departments of colleges

          SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
          FROM College AS c1
               JOIN Student AS s
               JOIN Enrollment AS e
               JOIN Course AS c2
               ON ( c1.CollNo = s.CollNo
                    AND s.CollNo = e.CollNo
                    AND s.StudentNo = e.StudentNo
                    AND e.CollNo = c2.CollNo
                    AND e.DeptNo = c2.DeptNo
                    AND e.CourseNum = c2.CourseNum )
          WHERE c1.CollNo = X AND s.StudentNo = Y;
  • 21. Option #1: Conventional Joins
      College Table (step: Get College information)
          ID   College                  Students
          234  Institute of Technology  1,334
          167  High Tech Institute      5,742
          85   Golden State College     2,119
          298  Kaplan College           12,323
          510  California College       1,926
      Students Table (step: Get Student information)
          ID    Student Name     SS#          Phone
          1220  Bruce Chizen     422-72-8495  (650) 234-2234
          6778  Naomi Seligman   533-99-1234  (279) 331-2345
          4435  Raymond Bingham
          8872  Reed Hastings    412-44-5567  (312) 676-8812
          1129  Maria Klawe
          1123  Bernard Vergnes
      Enrollment Table (step: Search enrollment by College & Student)
          College  ID    Course Name    Student  Grade
          510      C67   Mathematics    4435     87
          167      C123  History 1      1129     70
          167      C14   Photography 1  1120     88
  • 22. Option #2: Materialized View
          ID   College                  Students  ID    Course Name    ID    Student Name
          234  Institute of Technology  1,334     C134  Mathematics    1145  John Cheechoo …
          234  Institute of Technology  1,334     C134  Mathematics    1837  Ryane Clowe …
          234  Institute of Technology  1,334     C134  Mathematics    2256  Patrick Marleau …
          234  Institute of Technology  1,334     C134  Mathematics    2277  Jamie McGinn …
          234  Institute of Technology  1,334     C134  Mathematics    4113  Torrey Mitchell …
          . . .
          234  Institute of Technology  1,334     C134  Mathematics    1145  …
          385  Golden State College     2,224     G85   World History  7783  Joe Pavelski …
          385  Golden State College     2,224     G85   World History  2234  Jeremy Roenick …
          385  Golden State College     2,224     G85   World History  1177  Devin Setoguchi …
          385  Golden State College     2,224     G85   World History  4113  Torrey Mitchell …
  • 23. Option #3: Multi-Table Index
      Colleges
          Coll_ID#  Coll_Name     Coll_Budget  Coll_Description
          001       Agriculture   $1,234,567   Nice place to visit
          002       Arts          $5,432,567   Sports not so good
          003       Business      $9,999,666   Cool logo
          004       Education     $3,234,567   Ugh Worcester
          005       Engineering   $8,238,568   Serious work
          006       Law           $7,237,767   Jumpy students
          007       Liberal Arts  $9,898,777   Pretty campus
          008       Medicine      $5,987,004   In Texas
      Students
          Student_ID#  College_ID#  Student_Name   Student_Desc
          56-8033      008          Mike Hogan     Caucasian
          56-8045      008          Moshe Smith    Caucasian
          56-8044      008          Sally Shadmon  Native American
          56-8055      008          Billy Fleegle  African American
          56-8037      008          Saul Goode     African American
          56-8122      008          Tim Collins    Polynesian
          56-8233      008          Sam Gee        Asian
          56-8334      008          Rod Paulino    Asian
      Enrollment
          College_ID#  Dept_ID#  Student_ID#  Grade
          008          4455      56-8037      B+
          008          4455      56-8033      C
          008          4455      56-8045      B+
          008          4456      56-8044      A-
          008          4456      56-8122      B-
          008          4454      56-8233      C
          008          4455      56-8334      F
          008          4454      56-8055      D
      [Diagram labels: College, Students, Enrollment, Departments, Courses; ScaleDB Multi-Table Index]
  • 24. Mapping Foreign Keys to Data Views
      • Create Students Table
          o Foreign key: College
      • Create Enrollment Table
          o Foreign key: Students
      • Create Course Table
          o Foreign key: Department
      • Create Department Table
          o Foreign key: College
      • Create College Table
      [Diagram labels: Students, Enrollment, Course, Department, College]
      The Parent-Child tables are created in MySQL such that MySQL is able to operate over the new tables. The data of the Parent-Child tables is assembled on the fly from the source tables.
  • 25. Mapping Foreign Keys to Data Views [Diagram labels: Students, Course, Enrollment, College, Department]
      ScaleDB Physical files:
          1. College
          2. Department
          3. Student
          4. Course
          5. Enrollment
      Meta-Data Tables:
          1. College
          2. College-Dept
          3. College-Dept-Course
          4. College-Students
          5. College-Students-Enrollment
          6. Department
          7. Students
          8. Course
          9. Enrollment
  • 26. Enabling the MySQL optimizer to use a Multi-Table Index

          SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
          FROM College AS c1
               JOIN Student AS s
               JOIN Enrollment AS e
               JOIN Course AS c2
               ON ( c1.CollNo = s.CollNo
                    AND s.CollNo = e.CollNo
                    AND s.StudentNo = e.StudentNo
                    AND e.CollNo = c2.CollNo
                    AND e.DeptNo = c2.DeptNo
                    AND e.CourseNum = c2.CourseNum )
          WHERE c1.CollNo = X AND s.StudentNo = Y;

          CREATE TABLE sdb_view_college_course_student (
              L1_CollNo INT NOT NULL,
              L1_CollName CHAR(32) NOT NULL,
              L1_CollBudget INT NOT NULL,
              L1_CollDescription CHAR(60) NOT NULL,
              …   (Table College columns)
              L2_StudNo INT NOT NULL,
              L2_StudName CHAR(48) NOT NULL,
              …   (Table Student columns)
              L3_CourseNum CHAR(9) NOT NULL,
              L3_Grade CHAR(2) NOT NULL,
              …   (Table Enrollment columns)
              PRIMARY KEY (L1_CollNo, L2_StudNo, L3_CourseNum)
          ) ENGINE = SCALEDB;

          SELECT L1_CollName, L2_StudName, L3_CourseName, L3_Grade
          FROM sdb_view_college_course_student
          WHERE L1_CollNo = X AND L2_StudNo = Y;
  • 27. The Multi-Table Index
      • The Multi-Table Index appears to MySQL as a data table
      • ScaleDB does not maintain a data file associated with the Multi-Table Index
      • For a query using the virtual table, ScaleDB assembles the rows on the fly using the Multi-Table Index
      • ScaleDB indexes are different from B-tree indexes
      • ScaleDB indexes provide the same functionality as B-tree, plus…
          o They maintain referential integrity with minimal overhead
          o They allow you to search for the data and relationships
          o They are much smaller in size
  • 28. Demo
      • Query with join
      • Query with Multi-Table Index
      • 2nd node virtual table
  • 29. Benchmarking ScaleDB Index [Bar chart: Queries/Sec (axis 0 to 60) for Engine X Join, ScaleDB MTI, ScaleDB 2 Nodes]
  • 30. Summary
      • ScaleDB Cluster
          o Multiple ScaleDB instances share the same physical data.
          o Connecting to the cluster is similar to connecting to a single node.
          o For the application, the cluster appears as a single node.
          o Transparent application failover
          o Transparent scalability
      • ScaleDB Indexes
          o Provide the B-tree functionality
          o High performance
          o Map relationships
          o Maintain referential integrity
          o Smaller footprint
          o Independent of the key size
  • 31. ScaleDB Status and Product Availability
      • Started Beta Process
          o We are looking for beta companies
      • Product launch is scheduled for the June timeframe
      • Please talk to us if you are a developer interested in working with ScaleDB: moshe@scaledb.com
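The routing behaviour described on slides 7 and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node numbering, the function names and the in-memory lists are hypothetical, and salaries are stored in thousands (10 stands for 10K) so that the "Salary % 3" rule from slide 7 applies directly.

```python
# Sketch of shared-nothing routing with "salary % 3" partitioning.
# Salaries are in thousands (10 means 10K), as on the slides.
# All names here are illustrative; none of this is ScaleDB or MySQL code.

NUM_NODES = 3  # three partitions, matching the "Salary % 3" slide

EMP = [
    ("Bob", 20, 10),
    ("Shideh", 18, 35),
    ("Ted", 50, 60),
    ("Kevin", 62, 120),
    ("Angela", 55, 140),
    ("Mike", 45, 90),
]


def node_for(salary):
    """Return the node that stores rows with this salary."""
    return salary % NUM_NODES


def build_partitions(rows):
    """Distribute rows across nodes by the partitioning attribute (salary)."""
    nodes = {n: [] for n in range(NUM_NODES)}
    for row in rows:
        nodes[node_for(row[2])].append(row)
    return nodes


def nodes_for_salary_equals(salary):
    """Equality on the partitioning attribute touches exactly one node."""
    return [node_for(salary)]


def nodes_for_any_other_predicate():
    """Range predicates and predicates on other columns touch every node."""
    return list(range(NUM_NODES))


def select(nodes, target_nodes, predicate):
    """Run a predicate on the chosen nodes and merge the results."""
    return [row for n in target_nodes for row in nodes[n] if predicate(row)]


partitions = build_partitions(EMP)
```

With this layout, `select(partitions, nodes_for_salary_equals(60), lambda r: r[2] == 60)` consults one node only, while a query on age or a salary range has to pass `nodes_for_any_other_predicate()` and therefore visits all three nodes, which is the pitfall slide 8 describes.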

Editor's Notes

  1. In this presentation I will discuss two main features of the ScaleDB storage engine. First, ScaleDB is a shared-disk storage engine, which means that multiple servers share the same physical database. Second, ScaleDB implements a unique indexing method: databases conventionally use B-trees to index their data, whereas ScaleDB uses an innovative index based on a trie structure. In this talk I will explain and demonstrate both technologies.
  2. ScaleDB is a general-purpose storage engine that supports the MySQL storage engine API. It is oriented to support large, disk-based data files with high performance and scalability. In particular, it enables multiple MySQL server instances to share the same physical database, and it provides innovative high-performance indexing.
  3. With shared nothing, the data needs to be partitioned with the following in mind. Usage pattern: we try to have a balanced distribution of calls to each of the nodes. Completing the query at a single node: we try to distribute the data such that each query can be satisfied by a single node. Data growth: we try to keep the growth of data evenly distributed across the nodes.
  4. With shared nothing, assuming even distribution of the data and the users across the nodes in the cluster, some queries can be satisfied well. However, these are only the queries that match the way the data is partitioned. With horizontal partitioning, queries using non-partitioning attributes and range queries are executed on all nodes.
  5. With a shared disk, growth in data size and in the number of users is addressed by adding nodes to the cluster. All the nodes see a unified view of the data, and each node can satisfy all the queries. With this architecture, there is no need to partition the data, and the distribution of users across nodes depends only on the availability of the nodes.
  6. With shared nothing, if the information resides on a single node, execution may be efficient. However, the information may reside on multiple nodes, in which case the processing node needs to communicate with one or more additional nodes.
  7. One main question is how MySQL can take advantage of a shared-disk architecture, given that MySQL is not designed for one. The answer is that the synchronization is done in the engine layer: every MySQL server operates as a single server, and the ScaleDB engine synchronizes the processes among the different nodes.
  8. Databases conventionally use a B-tree as their main index structure. ScaleDB uses a proprietary multi-table index. The main difference is that a B-tree indexes only the data: we can search by a key such as customer id to find the customer row. The ScaleDB index allows searching for the data as well as the relations. For example, we can search for the customer row by the customer id, and the search can then continue along the same index path to the invoices of that customer. With a B-tree, we would need to initiate a new search, using a different index, for the invoices relating to the customer.
  9. Let's consider an example where rows from 3 tables are joined to satisfy a particular query. These are potential methods to execute the query:
  10. Conventional joins step sequentially through the applicable tables, building the query result with each step.
  11. The second method is to build a materialized view. A materialized view works very efficiently, as there is no need to join the data, but it has the following drawbacks: (1) the data needs to be duplicated; (2) the data needs to be constantly synchronized with the source tables (for example, if we change a student's address, it needs to be modified not only in the Students table but also in the view data); (3) there are no MySQL engines supporting materialized views.
  12. The third approach is the ScaleDB Multi-Table Index, which provides the view without duplicating the data. In addition, there is no need to synchronize the view with the source data, as the view is constructed on the fly from the source data. In this example, the Multi-Table Index maintains a path that ends with pointers to the needed rows in the College, Students and Enrollment tables. It obviates the need to physically materialize the view.
  13. Two important features relating to the creation of a Multi-Table Index are: 1. The index is created transparently from the SQL primary-key and foreign-key definitions. 2. There are no supporting data tables; the index creates the tables for MySQL on the fly.
  14. With the ScaleDB index, a query that performs joins can be replaced by a simpler query over an “sdb_view” table. The engine executes calls to the sdb_view table more efficiently than it would execute the join.
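The idea in notes 8, 12 and 14, a single index path that leads from a parent row to its related child rows, with the flattened view rows assembled on the fly, can be illustrated with a small Python sketch. It is not ScaleDB's implementation: the notes describe a trie-based structure, whereas this sketch assumes a plain sorted list of key paths as a stand-in, and every name in it (`prefix_scan`, `view_rows`, the dictionaries) is hypothetical. The sample values are loosely taken from slide 23, with the value 4455 used as a course number purely for illustration.

```python
import bisect

# Source tables. Each row is stored once; the index only holds key paths.
# Keys are tuples so that a child's key extends its parent's key.
college = {
    ("007",): {"CollName": "Liberal Arts"},
    ("008",): {"CollName": "Medicine"},
}
student = {
    ("008", "56-8033"): {"StudName": "Mike Hogan"},
    ("008", "56-8037"): {"StudName": "Saul Goode"},
}
enrollment = {
    ("008", "56-8033", "4455"): {"Grade": "C"},
    ("008", "56-8037", "4455"): {"Grade": "B+"},
}

# One ordered index over all three tables: (key path, table name).
# Sorting places each parent directly before its children.
index = sorted(
    [(k, "college") for k in college]
    + [(k, "student") for k in student]
    + [(k, "enrollment") for k in enrollment]
)


def prefix_scan(prefix):
    """Yield (table, key) for every index entry whose key starts with prefix."""
    start = bisect.bisect_left(index, (prefix,))
    for key, table in index[start:]:
        if key[: len(prefix)] != prefix:
            break
        yield table, key


def view_rows(coll_no, stud_no):
    """Assemble (CollName, StudName, CourseNum, Grade) rows on the fly."""
    coll = college[(coll_no,)]
    stud = None
    rows = []
    for table, key in prefix_scan((coll_no, stud_no)):
        if table == "student":
            stud = student[key]  # parent entry sorts before its children
        elif table == "enrollment":
            rows.append(
                (coll["CollName"], stud["StudName"], key[2], enrollment[key]["Grade"])
            )
    return rows
```

Calling `view_rows("008", "56-8033")` follows one path through the index and returns the joined row without a stored copy of the view, which is the behaviour the notes attribute to the sdb_view table. A sorted list makes the prefix scan easy to read; a trie would give the same parent-to-child traversal with shared key prefixes.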