This presentation was given at Oracle Technical Carnival China 2016. It is the second presentation about the Oracle sharding feature that will be released in 12.2. It uses a real case to describe how Oracle constructs sharded tables and duplicated tables.
4. WHAT IS SHARDING EXACTLY
//Query single row
select User_Name from T1 where User_ID=1;
//Query all
create view T as select * from T1 union all select * from T2;
select count(*) from T;
Split Table: table “T” in database “DB” is split into tables “T1” and “T2” in the same database “DB”.
Table “T”
User_ID  User_Name
1        Kim
2        Tim
3        Jim
4        Sim
Table “T1”
User_ID  User_Name
1        Kim
3        Jim
Table “T2”
User_ID  User_Name
2        Tim
4        Sim
5. WHAT IS SHARDING EXACTLY
//Query single row
select User_Name from T where User_ID=1;
//Query all
select count(*) from T;
Partition Table: table “T” in database “DB” is divided into partitions “P1” and “P2” of the same table “T” in the same database “DB”.
Table “T”
User_ID  User_Name
1        Kim
2        Tim
3        Jim
4        Sim
Partition “P1”
User_ID  User_Name
1        Kim
3        Jim
Partition “P2”
User_ID  User_Name
2        Tim
4        Sim
6. WHAT IS SHARDING EXACTLY
//Query single row
select User_Name from T where User_ID=1;
//Query all
select count(*) from T;
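Shard Table: table “T” in database “DB” is spread over two databases, with shard “S1” of table “T” in database “DB1” and shard “S2” of table “T” in database “DB2”.
Table “T”
User_ID  User_Name
1        Kim
2        Tim
3        Jim
4        Sim
Shard “S1” (database “DB1”)
User_ID  User_Name
1        Kim
3        Jim
Shard “S2” (database “DB2”)
User_ID  User_Name
2        Tim
4        Sim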
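The shard table slide can be illustrated with a toy model. This is only a sketch: plain Python dicts stand in for the databases “DB1” and “DB2”, and the odd/even placement rule is an assumption chosen to match the example rows, not Oracle's actual distribution algorithm. The function names are hypothetical.

```python
# Toy model of a sharded table: dicts stand in for databases DB1 and DB2.
# Assumption: odd User_IDs live in DB1 (shard S1), even ones in DB2 (shard S2),
# which matches the example rows but is not Oracle's real placement rule.
ROWS = {1: "Kim", 2: "Tim", 3: "Jim", 4: "Sim"}


def shard_for(user_id):
    """Pick the database that owns a given User_ID."""
    return "DB1" if user_id % 2 == 1 else "DB2"


def build_shards(rows):
    """Distribute the rows of table T over the two databases."""
    shards = {"DB1": {}, "DB2": {}}
    for user_id, user_name in rows.items():
        shards[shard_for(user_id)][user_id] = user_name
    return shards


def query_single_row(shards, user_id):
    """Single-row query: only the database that owns the key is touched."""
    return shards[shard_for(user_id)].get(user_id)


def query_count(shards):
    """Query all: every database is visited and the partial counts are summed."""
    return sum(len(table) for table in shards.values())


shards = build_shards(ROWS)
print(query_single_row(shards, 1))
print(query_count(shards))
```

The application still issues the same two queries against “T”; the difference from the split-table and partition-table slides is that the rows now live in separate databases.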
7. WHICH SYSTEM WILL NEED SHARDING?
➤ Greater scalability and fault isolation than possible with RAC
➤ Large billing systems
➤ Airline ticketing systems
➤ Online financial services
➤ Media companies
➤ Online information services
➤ Social media companies
(Chart: OLTP Throughput, axis from 0 to 1,000,000)
8. APPLICATION DESIGNED FOR SHARDING
➤ Sharding is not application transparent
➤ Application must specify a sharding key for optimal performance
➤ e.g. customer_id, account_id etc
➤ Primary usage pattern
➤ Direct routing to a shard based on sharding key
➤ Single-shard operations for highest performance
➤ Ancillary usage pattern
➤ Proxy routing for multi-shard queries (reporting)
➤ Able to tolerate lesser performance than direct routing used for single-shard operations
13. SIMPLE ENV FOR TESTING
shard director + shard catalog
shard node1, shard node2
14. SDB DEPLOYMENT OVERVIEW
➤ 1. Oracle Sharding Prerequisites
➤ 2. Installing Oracle Database Software (database)
➤ 3. Installing the Shard Director Software (gsm)
➤ 4. Installing schagent on all shard nodes (database)
➤ 5. Creating the Shard Catalog Database (dbca)
➤ 6. Setting Up the Oracle Sharding Management and GDS
➤ 7. Deploying and Managing a System-Managed SDB (gdsctl)
https://oracleblog.org/working-case/deployoracle-sharding-database/
Creating an Oracle sharding database - 小荷 OracleBlog: "Heaven to the left, DBA to the right"
15. REQUIRED MEDIA
➤ database.zip, gsm.zip
db software, for shardcat database
db software on every shard node
GDS framework and GSM service
Scheduler Agent on shard node
16. ORACLE SHARDING PREREQUISITES
➤ 12.2 Enterprise Edition
➤ Non-CDB
➤ Filesystem, no ASM (12.2 Beta)
➤ Every shard node's IP must be resolvable through the hosts file on every node
➤ A clean machine with no Oracle software preinstalled
17. WHAT DOES DEPLOY DO?
➤ Creates shards and listeners
➤ DBMS_SCHEDULER package (executed on shard catalog) communicates with Scheduler Agents on remote hosts
➤ Agents run DBCA and NETCA to create shards and listeners
➤ Creates the Data Guard configuration
➤ Primaries are created first, RMAN duplicate is used to create corresponding standbys
➤ Redo transport and broker are configured, observers are started on shard director hosts and Fast-Start Failover is enabled
➤ Optionally, deploys GoldenGate bi-directional replication (OGG 12.3)
➤ Replication pipelines are configured and replication is started
18. CENTRALIZED SCHEMA MANAGEMENT
connect to GDS$CATALOG service
alter session enable shard ddl;
create tablespace set …
create tablespace …
create user ...
create sharded table … tablespace set
create duplicated table … tablespace
(Diagram: Shard Catalog, Shard Director, Shard 1, Shard 2, Shard n)
19. UNDERSTANDING CHUNKS AND TABLESPACE
➤ A chunk is the unit of data movement in a sharded database
➤ Simple form: 1 chunk = 1 tablespace = 1 datafile
➤ The number of chunks is defined during the creation of the shard catalog
20. UNDERSTANDING CHUNKS AND TABLESPACE
//Log in GSM
GDSCTL>config chunks
Chunks
------------------------
Database From To
-------- ---- --
sh1 1 6
sh2 7 12
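The layout reported by config chunks (12 chunks, 6 per shard) is what an even, contiguous split would produce. The sketch below reproduces that arithmetic. The function name and the rule for uneven divisions (earlier shards receive the remainder) are assumptions for illustration, not documented Oracle behavior.

```python
def assign_chunks(num_chunks, shard_names):
    """Split chunks 1..num_chunks into contiguous, near-equal ranges per shard.

    Assumption: when the division is uneven, the earlier shards each take
    one extra chunk. Only the even 12-over-2 case is taken from the slide.
    """
    base, extra = divmod(num_chunks, len(shard_names))
    ranges = {}
    start = 1
    for index, name in enumerate(shard_names):
        size = base + (1 if index < extra else 0)
        ranges[name] = (start, start + size - 1)
        start += size
    return ranges


layout = assign_chunks(12, ["sh1", "sh2"])
for name, (first, last) in layout.items():
    print(f"{name:8} {first:4} {last:4}")
```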
21. UNDERSTANDING CHUNKS AND TABLESPACE
//Log in shard node database sh1
SQL> select tablespace_name from dba_tablespaces where tablespace_name like '%TSSET%';
TABLESPACE_NAME
--------------------
TSSET1
C001TSSET1
C002TSSET1
C003TSSET1
C004TSSET1
C005TSSET1
C006TSSET1
7 rows selected.
22. UNDERSTANDING CHUNKS AND TABLESPACE
//Log in shard node database sh2
SQL> select tablespace_name from dba_tablespaces where tablespace_name like '%TSSET%';
TABLESPACE_NAME
------------------------------
TSSET1
C007TSSET1
C008TSSET1
C009TSSET1
C00ATSSET1
C00BTSSET1
C00CTSSET1
7 rows selected.
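In the two listings above, the per-chunk tablespace names look like the letter C, the chunk number as three uppercase hexadecimal digits, and the tablespace set name (chunk 10 appears as C00ATSSET1). The sketch below encodes that observed pattern. It is inferred only from these 12 names, so how names are built outside this range is an assumption, and the function name is hypothetical.

```python
def chunk_tablespace(chunk_no, tablespace_set="TSSET1"):
    """Build the tablespace name observed for a chunk in the listings above.

    Pattern inferred from the output: 'C' + 3-digit uppercase hex + set name.
    """
    return f"C{chunk_no:03X}{tablespace_set}"


# Chunks 1-6 were listed on sh1, chunks 7-12 on sh2.
sh1_names = [chunk_tablespace(n) for n in range(1, 7)]
sh2_names = [chunk_tablespace(n) for n in range(7, 13)]
print(sh1_names)
print(sh2_names)
```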
23. UNDERSTANDING CHUNKS AND TABLESPACE
//Log in catalog database
//Where is the sharded table?
SQL> select table_name from dba_tables where tablespace_name='TSSET1';
no rows selected
//Where is the duplicated table?
SQL> select table_name from dba_tables where tablespace_name='TS1';
TABLE_NAME
--------------------
PRODUCTS
MLOG$_PRODUCTS
24. UNDERSTANDING CHUNKS AND TABLESPACE AND DATAFILE
//Log in shard node database sh1
SQL> select partition_name, tablespace_name from dba_tab_partitions where table_name='CUSTOMERS' and tablespace_name like 'C%TSSET%'
order by tablespace_name;
PARTITION_NAME TABLESPACE_NAME
-------------------- --------------------
CUSTOMERS_P1 C001TSSET1
CUSTOMERS_P2 C002TSSET1
CUSTOMERS_P3 C003TSSET1
CUSTOMERS_P4 C004TSSET1
CUSTOMERS_P5 C005TSSET1
CUSTOMERS_P6 C006TSSET1
6 rows selected.
SQL> select table_name from dba_tables where tablespace_name='TS1';
TABLE_NAME
--------------------
PRODUCTS
25. UNDERSTANDING CHUNKS AND TABLESPACE
//Log in shard node database sh2
SQL> select partition_name, tablespace_name from dba_tab_partitions where table_name='CUSTOMERS' and tablespace_name like 'C%TSSET%'
order by tablespace_name;
PARTITION_NAME TABLESPACE_NAME
---------------------------------------- ------------------------------
CUSTOMERS_P7 C007TSSET1
CUSTOMERS_P8 C008TSSET1
CUSTOMERS_P9 C009TSSET1
CUSTOMERS_P10 C00ATSSET1
CUSTOMERS_P11 C00BTSSET1
CUSTOMERS_P12 C00CTSSET1
6 rows selected.
SQL> select table_name from dba_tables where tablespace_name='TS1';
TABLE_NAME
--------------------
PRODUCTS
26. AND … DATAFILES AND TABLESPACE
//Log in shard node database sh1
SQL> select TABLESPACE_NAME,FILE_NAME from dba_data_files where TABLESPACE_NAME like
'C%TSSET%' order by tablespace_name;
TABLESPACE_NAME FILE_NAME
-------------------- ----------------------------------------------------------------------
C001TSSET1 /u01/app/oracle/oradata/SH1/datafile/o1_mf_c001tsse_d1rfod3l_.dbf
C002TSSET1 /u01/app/oracle/oradata/SH1/datafile/o1_mf_c002tsse_d1rfofj6_.dbf
C003TSSET1 /u01/app/oracle/oradata/SH1/datafile/o1_mf_c003tsse_d1rfogs5_.dbf
C004TSSET1 /u01/app/oracle/oradata/SH1/datafile/o1_mf_c004tsse_d1rfoht8_.dbf
C005TSSET1 /u01/app/oracle/oradata/SH1/datafile/o1_mf_c005tsse_d1rfojs6_.dbf
C006TSSET1 /u01/app/oracle/oradata/SH1/datafile/o1_mf_c006tsse_d1rfokv6_.dbf
6 rows selected.
29. ROUTING IN AN ORACLE SHARDED ENVIRONMENT
➤ Direct Routing
➤ For OLTP workloads that specify sharding_key (e.g. customer_id) during connect
➤ Connect string must contain: (SHARD_KEY=...)
➤ JDBC: connection.setShardKey(<shard_key>,<shard_group_key>);
➤ Support for OCI/OCCI (C++)/ODP.NET
➤ Support for PHP, Python, Perl, and Node.js
➤ Proxy Routing
➤ Multi-shard queries – e.g. reporting workloads
➤ Workloads that cannot specify sharding_key as part of connection
30. DIRECT ROUTING VIA SHARDING KEY
➤ The connection pool maintains a shard topology cache = a mapping of key ranges to shards
➤ DB requests for a key in a cached range go directly to the shard (i.e., bypass the shard director)
➤ Otherwise, a new connection is created by forwarding the request with the sharding key to the shard director
Shard Key Ranges   Chunk Name   Shards
1 -- 10            Chunk 1      Shard 1, Shard 2
10 -- 20           Chunk 2      Shard 1, Shard 2
20 -- 30           Chunk 3      Shard 3, Shard 4
30 -- 40           Chunk 4      Shard 3, Shard 4
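The lookup described on this slide can be sketched as a small range cache. This is an illustration, not the connection pool's real implementation: the class and function names are hypothetical, ranges are assumed to be half-open (lower bound included, upper bound excluded) because the table does not say which chunk owns a boundary key, and the first listed shard is returned where a real pool would choose among the shards holding the chunk.

```python
import bisect


class ShardTopologyCache:
    """Maps sharding-key ranges to chunks and shards, as in the table above."""

    def __init__(self):
        self._lows = []      # sorted lower bounds of the cached ranges
        self._entries = []   # (low, high, chunk, shards), same order as _lows

    def add_range(self, low, high, chunk, shards):
        """Cache one key range; assumed half-open: low <= key < high."""
        index = bisect.bisect_left(self._lows, low)
        self._lows.insert(index, low)
        self._entries.insert(index, (low, high, chunk, tuple(shards)))

    def lookup(self, key):
        """Return (chunk, shards) for a key, or None on a cache miss."""
        index = bisect.bisect_right(self._lows, key) - 1
        if index < 0:
            return None
        low, high, chunk, shards = self._entries[index]
        if low <= key < high:
            return chunk, shards
        return None


def route(cache, key):
    """Go straight to a shard on a cache hit, else fall back to the director."""
    hit = cache.lookup(key)
    if hit is None:
        return "shard director"
    _chunk, shards = hit
    return shards[0]


cache = ShardTopologyCache()
cache.add_range(1, 10, "Chunk 1", ["Shard 1", "Shard 2"])
cache.add_range(10, 20, "Chunk 2", ["Shard 1", "Shard 2"])
cache.add_range(20, 30, "Chunk 3", ["Shard 3", "Shard 4"])
cache.add_range(30, 40, "Chunk 4", ["Shard 3", "Shard 4"])
print(route(cache, 25))
print(route(cache, 45))
```

A sorted list with binary search keeps each lookup logarithmic in the number of cached ranges, which matters because this path runs on every request that carries a sharding key.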
31. PROXY ROUTING VIA COORDINATOR (SHARD CATALOG)
➤ Multi-shard Queries & Non-shard Key Access
➤ Connection is made to the coordinator
➤ Coordinator parses SQL and will proxy/route request to correct shard
➤ SQL statements rewritten to get much of the query processing done on the participating shards and as little as possible on the coordinator shard
➤ For developer convenience and not for high performance
(Diagram: Coordinator (shard catalog), Application Server, Shard Directors; tiers: App Tier, Routing Tier, Data Tier)
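The idea of pushing most of the work to the participating shards can be shown with a scatter-gather sketch for a count query. Everything here is a simplified assumption: lists of dicts stand in for the shards, the function names are hypothetical, and the real coordinator rewrites SQL rather than calling Python functions.

```python
# Stand-in data: each shard holds part of the CUSTOMERS rows.
SHARDS = {
    "sh1": [{"custid": 1}, {"custid": 3}],
    "sh2": [{"custid": 2}, {"custid": 4}],
}


def shard_count(rows):
    """Work pushed down to one shard: count its local rows."""
    return len(rows)


def coordinator_count(shards):
    """Coordinator: ask every shard for a partial count, then add them up."""
    partials = {name: shard_count(rows) for name, rows in shards.items()}
    return sum(partials.values()), partials


total, partials = coordinator_count(SHARDS)
print(total, partials)
```

Only one number per shard travels back to the coordinator, so the coordinator's own share of the processing is a single addition.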
32. EXECUTION PLAN
Execution Plan
----------------------------------------------------------
Plan hash value: 2953441084
--------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| Inst |IN-OUT|
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 0 (0)| | |
| 1 | SHARD ITERATOR | | | | |
| 2 | REMOTE | | | ORA_S~ | R->S |
--------------------------------------------------------------
Remote SQL Information (identified by operation id):
----------------------------------------------------
2 - EXPLAIN PLAN SET STATEMENT_ID='PLUS630005' INTO PLAN_TABLE@! FOR
SELECT "A1"."CUSTID" FROM "CUSTOMERS" "A1" /*
coord_sql_id=0zpg825w625yn */ (accessing
'ORA_SHARD_POOL@ORA_MULTI_TARGET' )