2. Agenda
□ Background & Goal
□ CUBRID Cluster basic concept
□ CUBRID Cluster general design
□ Result & Status of each milestone
□ Demo
□ Performance results
□ Pros and cons
□ Next version plan
4. Background & Goal
□ Background
In Internet portal services, the volume of service data grows very fast and is rarely deleted
(such as the Café service)
How can we scale out the DB system without modifying applications?
• Get big-system power from cheap commodity servers – clustering or grid computing
□ Goal
Support Dynamic Scalability
Location transparency to the applications
Volume size & Performance
• At the same performance, the cluster can store more data
• At the same data size, the cluster can provide higher performance
Others
• Global Schema, Distributed Partition, Load Balancing
• Cluster management node, heartbeat
5. Background & Goal (cont.)
□ As-Is
DB system architecture is coded into the application's logic
The application's logic decides which SQL goes to which DB server
(e.g. DB1/UPDATE tbl01, DB1/SELECT tbl01, DB3/UPDATE tbl35, DB4/SELECT tbl47)
□ To-Be
Provides a "single DB view" / "multi access point"
DB system scales out independently of applications (linear scalability)
Applications issue plain SQL (UPDATE tbl01, SELECT tbl01, SELECT tbl35, SELECT tbl47, UPDATE tbl35)
[Diagram: a Global Schema with Distributed Partition over DB1–DB4, plus an HA DB with master (M, RW) and slave (S, RO) nodes marked "To do"]
7. CUBRID Cluster basic concept
Basic Features:
• Global schema
• Global database
• Distributed partition
• Global transaction
• Dynamic scalability
• Global serial & global index
Advanced Features (To do):
• Support HA
• Cluster management node
• Deadlock detection
8. CUBRID Cluster basic concept – Global schema
The global schema is a single representation, or global view, of all nodes, where each node has its own database and schema.
SELECT * FROM info, code WHERE info.id = code.id
SELECT * FROM contents WHERE auth = (SELECT name FROM author WHERE …)
INSERT INTO contents…
[Diagram: a Global Schema (contents, info, author) spanning Local Schema #1–#4 on Database #1–#4; global tables (contents, author) appear on every node, while local-level tables (info, code) belong to individual nodes]
9. CUBRID Cluster basic concept – Global database
A global database is a logical concept that represents the database managed by the CUBRID Cluster system.
[Diagram: logical view vs. physical view. Global DB A is one logical database whose volumes ⑴⑵⑶ physically live in DB A instances on Node #1 and Node #2; Global DB C is one logical database whose volumes ⒜⒝ physically live in DB C instances on Node #2 and Node #3; local databases DB B and DB D coexist on the same nodes]
10. Distributed partition concept
[Diagram: under the Global Schema, a distributed partition table has a single logical view (schema, data, system catalog, index); physically, DB1 ON NODE #1 and DB1 ON NODE #2 each hold the full schema, system catalog, and index, while the data itself is split between the two nodes]
11. CUBRID Cluster basic concept – others
□ Global Transaction
A global transaction is divided into several local transactions that run on different server nodes.
The global transaction makes sure that every server node in CUBRID Cluster is consistent both before and after the transaction.
The processing of a global transaction is transparent to the application.
□ Dynamic Scalability
Dynamic scalability allows the user to extend or shrink the server nodes in CUBRID Cluster without stopping the cluster.
After a new server node is added to the cluster, users can access and query global tables from the new node.
12. CUBRID Cluster basic concept – User specs
□ Registering Local DB into Global DB (Cluster)
REGISTER NODE 'node1' '10.34.64.64';
REGISTER NODE 'node2' 'out-dev7';
□ Creating Global Table/Global Partition table
CREATE GLOBAL TABLE gt1 (…) ON NODE 'node1';
CREATE GLOBAL TABLE gt2 (id INT PRIMARY KEY, …) PARTITION BY HASH (id) PARTITIONS 2 ON NODE 'node1', 'node2';
□ DML operations (INSERT/SELECT/DELETE/UPDATE)
□ Dynamic Scalability
-- add a new server node in global database
REGISTER 'node3' '10.34.64.66';
-- adjust data to new server node
ALTER GLOBAL TABLE gt2 ADD PARTITION PARTITIONS 1 ON NODE 'node3';
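For reference, a minimal sketch of driving these statements from an application over JDBC. The driver class and URL format follow the usual CUBRID JDBC conventions, but the host, port, database name, and credentials below are placeholders, not values from this deck:

    // Sketch: running the cluster user specs through a plain JDBC connection.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class ClusterSetup {
        public static void main(String[] args) throws Exception {
            Class.forName("cubrid.jdbc.driver.CUBRIDDriver");
            // Connect to the broker on node1 (jdbc:cubrid:host:port:db:user:password:)
            Connection con = DriverManager.getConnection(
                    "jdbc:cubrid:10.34.64.64:33000:db1:dba::", "dba", "");
            Statement st = con.createStatement();
            // Register the participating nodes into the global database
            st.executeUpdate("REGISTER NODE 'node1' '10.34.64.64'");
            st.executeUpdate("REGISTER NODE 'node2' 'out-dev7'");
            // Create a hash-partitioned global table over the two nodes
            st.executeUpdate("CREATE GLOBAL TABLE gt2 (id INT PRIMARY KEY, name VARCHAR(100))"
                    + " PARTITION BY HASH (id) PARTITIONS 2 ON NODE 'node1', 'node2'");
            // Ordinary DML works unchanged against the global table
            st.executeUpdate("INSERT INTO gt2 VALUES (1, 'cafe')");
            st.close();
            con.close();
        }
    }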
14. CUBRID Cluster general design (DDL/INSERT)
CREATE GLOBAL TABLE gt1 … PARTITION BY HASH … ON NODE 'Server1', 'Server2', 'Server3', 'Server4';
INSERT INTO gt1 …
[Diagram: APs connect through the extension broker, whose workspace stores remote OIDs; C2S communication applies the global schema and distributed partition across DB1 on Server #1–#4, which together form Global DB1]
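Each inserted row is routed to a server by hashing the partition key. The actual hash function and partition-to-node map are internal to CUBRID Cluster; the sketch below only illustrates the idea of hash partitions spread round-robin over nodes:

    // Illustrative only: route a row to a server by hashing the partition key.
    public class PartitionRouter {
        private final String[] nodes;  // e.g. {"Server1", "Server2", "Server3", "Server4"}
        private final int partitions;  // e.g. 256

        public PartitionRouter(String[] nodes, int partitions) {
            this.nodes = nodes;
            this.partitions = partitions;
        }

        public String nodeFor(int partitionKey) {
            // Pick a partition from the key, then spread partitions over nodes
            int p = Math.floorMod(Integer.hashCode(partitionKey), partitions);
            return nodes[p % nodes.length];
        }
    }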
15. CUBRID Cluster general design (SELECT/DELETE)
SELECT … FROM gt1 WHERE …
UPDATE …
DELETE …
[Diagram: APs connect through the broker; statements are pushed down to DB1 on Server #1–#4 by remote execution, and rows are read back by remote scan over S2S communication]
16. CUBRID Cluster general design (COMMIT)
INSERT INTO gt1 …
SELECT … FROM …
COMMIT
[Diagram: APs issue statements through the broker; a global index entry (e.g. 0x40430000) maps each index key (0, 2, 3, 5, 1) to a local entry (2, 3, 5, 1) on Server1–Server4; at COMMIT the broker acts as coordinator and runs two-phase commit against the participant servers DB1 at 10.34.64.64–10.34.64.67]
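For clarity, a sketch of what the coordinator does during two-phase commit. The Participant interface is hypothetical; the real protocol runs between the CUBRID server processes, not in application code:

    // Sketch of the coordinator side of 2PC over a hypothetical Participant interface.
    import java.util.List;

    interface Participant {
        boolean prepare();  // phase 1: vote to commit (forces a prepare log record)
        void commit();      // phase 2: make the local transaction durable
        void rollback();
    }

    class Coordinator {
        boolean commitAll(List<Participant> participants) {
            for (Participant p : participants) {           // phase 1: collect votes
                if (!p.prepare()) {
                    for (Participant q : participants) q.rollback();
                    return false;                          // any "no" vote aborts everyone
                }
            }
            for (Participant p : participants) p.commit(); // phase 2: all voted yes
            return true;
        }
    }

Each prepare and commit is forced to the log, which is why 2PC costs extra log I/O (see the appendix).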
17. CUBRID Cluster general design (dynamic scale-out)
CREATE GLOBAL TABLE gt2 … PARTITION BY HASH ON NODE 'Server1', 'Server2', 'Server3';
REGISTER 'Server4' '10.34.64.67';
ALTER GLOBAL TABLE gt2 ADD PARTITION … ON NODE 'Server4';
[Diagram: after the new node is registered, the brokers sync up the global schema and the existing partitions rehash their data onto DB1 at Server #4]
18. CUBRID Cluster general design (ORDER BY-Ongoing)
SELECT … FROM gt1 ORDER BY …
Step 1: send the remote query, including the ORDER BY, to every server
Step 2: each of DB1 on Server #1–#4 scans and sorts its local rows
Step 3: merge the sorted results from each server (see the sketch below)
[Diagram: AP → broker → parallel scan and sort on Server #1–#4 → merged result]
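Step 3 is a k-way merge: because each server returns its rows already sorted, the merging side only ever compares the current head row of each stream. A minimal sketch with integer sort keys (not CUBRID's actual implementation):

    // Sketch of step 3: k-way merge of already-sorted per-server results.
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import java.util.PriorityQueue;

    class OrderByMerger {
        static List<Integer> merge(List<List<Integer>> sortedPerServer) {
            // heap entries are {headValue, serverIndex}
            PriorityQueue<int[]> heap =
                    new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
            List<Iterator<Integer>> its = new ArrayList<>();
            for (List<Integer> s : sortedPerServer) {
                Iterator<Integer> it = s.iterator();
                its.add(it);
                if (it.hasNext()) heap.add(new int[]{it.next(), its.size() - 1});
            }
            List<Integer> out = new ArrayList<>();
            while (!heap.isEmpty()) {
                int[] top = heap.poll();               // smallest head among all servers
                out.add(top[0]);
                Iterator<Integer> it = its.get(top[1]);
                if (it.hasNext()) heap.add(new int[]{it.next(), top[1]});
            }
            return out;
        }
    }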
19. The result & status of each milestone
20. CUBRID Cluster Project Overview
□ Team Composition & Roles
Service Platform and Development Center, NHN Korea
• Architect: Park Kiun (Architect/SW)
Platform Development Lab, NHN China
• Project Manager: Baek Jeonghan (Director) / Li Chenglong (Team Leader)
• Dev leader: Li Chenglong (Team Leader) / Wang Dong (Part Leader)
□ Project Duration
May, 2010 ~ Oct, 2011
□ Quality requirements
Passed all CUBRID regression test cases
Passed all CUBRID Cluster QA and dev function test cases
Passed QA performance test cases
□ Others:
Code based on CUBRID 8.3.0.0337 (release version)
21. The result & status of each milestone – Overview
[Timeline: May 2010 – Dec 2011, covering M1 Global Schema, M2 Distributed partition, M3 Performance, M4 nReport, then the next version]
M1: May 24th, 2010 – Oct 20th, 2010
M2: Oct 21st, 2010 – Mar 25th, 2011
M3: Mar 28th, 2011 – Jul 17th, 2011
M4 (ongoing): Jul 18th, 2011 – Oct 30th, 2011
22. The result & status of each milestone – M1
□ Achievements:
Open-sourced on sf.net (including code, wiki, BTS, forum)
General design for CUBRID Cluster
Implemented the global database
Implemented the system catalog extension and global table DDL
Supported basic DML statements (INSERT/SELECT/DELETE/UPDATE) for global tables
Supported S2S communication (server-to-server data transfer)
□ Others:
Source lines of code (LOC): 19,246 (added 11,358, deleted 817, modified 7,071)
Added Chinese messages (LOC): 7,507
BTS issues: 178
Subversion check-ins: 387
23. The result & status of each milestone – M2
□ Achievements:
Implemented the hash-based distributed partition table (basic DDL and DML)
Supported constraints (global index, primary key, unique) and queries using indexes
Supported global serials
Supported global transactions (commit, rollback)
Refactored S2S communication (added an S2S communication interface and connection pooling)
Supported all SQL statements used by the Café service
Passed QA functional testing
□ Others:
Source lines of code (LOC): 20,242 (added 8,670, deleted 4,385, modified 7,187)
BTS issues: 241
QA bugs fixed: 43
Subversion check-ins: 461
24. The result & status of each milestone – M3
□ Achievements:
Performance improvements over M2 (DDL, query, server-side insert, 2PC)
Refactored global transactions; supported savepoints and atomic statements
Implemented dynamic scalability (register/unregister node, add/drop partition)
Supported loaddb/unloaddb and killtran
Other features: auto increment, global deadlock timeout
Passed QA functional and performance testing
□ Others:
Source lines of code (LOC): 11,518 (added 7,065, deleted 1,092, modified 3,361)
BTS issues: 165
QA bugs fixed: 52
Subversion check-ins: 461
25. The result & status of each milestone – M4 (Ongoing)
□ Goal:
Provide the data storage engine for the nReport project
Improve performance of ORDER BY and GROUP BY statements
Support joining a big table with a small table (global partitioned table joined with a non-partitioned table)
27. Performance Results
□ Test environment
3 server nodes (10.34.64.201/202/204):
• CPU: Intel(R) Xeon(R) E5405 @ 2.00GHz
• Memory: 8G
• Network: 1000 Mbps
• OS: CentOS 5.5 (64bit)
Configuration: data_buffer_pages=1,000,000
Table size: 100,000 and 10,000,000 rows
Data size: 108M (total 207M) and 9.6G (total 30G)
Each thread runs 5,000 times
[Diagram: CUBRID Cluster M3 – a Java program on 10.34.64.203 drives 40 threads (14/13/13) against Node1–Node3 (10.34.64.204/201/202) forming the cluster database; CUBRID 8.3.0.0337 – the same Java program drives 40 threads against a single CUBRID DB on 10.34.64.201]
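The driver can be reproduced with a small multi-threaded JDBC program along these lines; the statement, host, port, database name, and credentials below are placeholders for this test, not the actual benchmark code:

    // Sketch of the load driver: N threads each run the same prepared
    // statement 5,000 times; TPS = total executions / elapsed seconds.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class LoadDriver {
        static final int THREADS = 40, RUNS = 5000;

        public static void main(String[] args) throws Exception {
            Class.forName("cubrid.jdbc.driver.CUBRIDDriver");
            long start = System.currentTimeMillis();
            Thread[] ts = new Thread[THREADS];
            for (int i = 0; i < THREADS; i++) {
                ts[i] = new Thread(() -> {
                    try (Connection con = DriverManager.getConnection(
                            "jdbc:cubrid:10.34.64.201:33000:db1:dba::", "dba", "")) {
                        PreparedStatement ps =
                                con.prepareStatement("SELECT * FROM t1 WHERE a = ?");
                        for (int r = 0; r < RUNS; r++) {
                            ps.setInt(1, r);       // bind a fresh key each run
                            ps.executeQuery().close();
                        }
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });
                ts[i].start();
            }
            for (Thread t : ts) t.join();
            double sec = (System.currentTimeMillis() - start) / 1000.0;
            System.out.printf("TPS = %.1f%n", THREADS * RUNS / sec);
        }
    }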
28. Performance Results (cont.)
□ Create table statements:
Cluster M3:
• CREATE GLOBAL TABLE t1 (a INT, b INT, c INT, d CHAR(10), e CHAR(100), f CHAR(500), INDEX i_t1_a(a), INDEX i_t1_b(b)) PARTITION BY HASH(a) PARTITIONS 256 ON NODE 'node1', 'node2', 'node3';
CUBRID R3.0:
• CREATE TABLE t1 (a INT, b INT, c INT, d CHAR(10), e CHAR(100), f CHAR(500), INDEX i_t1_a(a), INDEX i_t1_b(b)) PARTITION BY HASH(a) PARTITIONS 256;
□ Test statements:
Select by partition-key column: SELECT * FROM t1 WHERE a = ?
Select by non-partition-key column: SELECT * FROM t1 WHERE b = ?
Select by non-partition-key column range: SELECT * FROM t1 WHERE b BETWEEN ? AND ?
Insert with auto commit: INSERT INTO t1 VALUES (?,?,?,?,?,?)
29. Performance Results (cont.)
□ TPS (Transactions Per Second) Graphs
[Graphs: TPS for SELECT * FROM t1 WHERE a = ? (a is indexed and is the partition key); SELECT * FROM t1 WHERE b = ? and SELECT * FROM t1 WHERE b BETWEEN ? AND ? (b is indexed but is not the partition key); INSERT INTO t1 VALUES (?,?,?,?,?,?) with auto commit]
30. Performance Results (cont.)
□ ART (Average Response Time) Graphs – lower is better
[Graphs: ART for the same four statements: SELECT by partition key a, SELECT by non-partition key b, range SELECT on b, and INSERT with auto commit]
31. Performance Results (cont.)
□ Test environment
Server nodes (10.34.64.49/50 …/58):
• CPU: Intel(R) Xeon(R) E5645 @ 2.40GHz (12 cores)
• Memory: 16G
• Network: 1000 Mbps
• OS: CentOS 5.5 (64bit)
Configuration: cubrid.conf data_buffer_pages=1,000,000
Table size: 100,000,000 rows (one hundred million)
Data size: 88G (total size: 127G)
33. Pros and cons
□ Pros
Existing applications can adopt CUBRID Cluster easily
CUBRID Cluster can store more data or provide higher performance than CUBRID
CUBRID Cluster makes it easy to scale out as data grows
CUBRID Cluster can save cost
Supports transactions
□ Cons
Joins are not supported yet
Performance is not good enough yet
• S2S communication may incur network cost
• 2PC writes many log records, incurring I/O cost
34. Next Version plan
□ Tentative Work plan
Performance improvement
Support HA for each server node in CUBRID Cluster
Support load balancing (write to the active server, read from the standby server)
Support distributed partitioning by range/list
Support global users
Others: backup/restore DB
35. Appendix
□ Why is SELECT by partition key not fast enough? (back)
SELECT … FROM t1 WHERE a = 100 is rewritten to SELECT … FROM t1__p__p2 WHERE a = 100, and partition p2 is stored on Server2.
Current behavior: Step 1: the broker sends the request to Server1 (the default server); Step 2: Server1 sends a remote scan request to Server2; Step 3: Server2 does the scan; Step 4: the rows are fetched back through Server1.
Direct routing: Step 1: the broker sends the request to Server2 directly; Step 2: Server2 does the scan – no remote scan is needed.
36. Appendix (cont.)
□ Why is INSERT not fast enough? (back)
INSERT INTO t1 (a, …) VALUES (100, …); COMMIT – the row with a = 100 should be stored on Server2.
With 2PC (current): the broker sends the insert through Server #1, so DB1 on both Server #1 and Server #2 becomes dirty, and two-phase commit writes the log 3 times on one server and 2 times on the other.
Without 2PC (direct routing): the broker sends the insert to Server #2 directly; only that server becomes dirty and the log is written once.