Easy MySQL Database Sharding with CUBRID SHARD - 2013 Percona

If you ask companies that operate mission-critical services, they will tell you:

1) that a relational database system is still the best choice for mission-critical data;
2) that service availability is more important than performance;
3) that high performance is good, but predictable performance is king.

We know this from experience. At NHN, more than 30,000 web servers run over 150 large-scale web and mobile services. At that scale we must know what scales, how to provide high availability, and how to operate at a predictable speed.

At the Percona Live MySQL Conference 2013 I will talk about CUBRID SHARD, a universal database sharding solution for CUBRID, MySQL, and Oracle. CUBRID SHARD can be used with a heterogeneous database backend: some shards can be stored in CUBRID, some in MySQL, or even in Oracle. At NHN we deploy various combinations: MySQL only, MySQL + Oracle, MySQL + CUBRID, CUBRID only, and Oracle only. I will explain how DBAs can easily configure it and how we implemented this feature.

CUBRID SHARD lets you store an unlimited number of database shards and distribute data based on modulo, DATETIME, or hash/range calculations. Developers can even plug in their own library to calculate the SHARD_ID with a custom algorithm. At the session I will show how easy it is to set all this up, with no need for a third-party management tool. With CUBRID SHARD, application developers do not need to modify the application logic to provide data sharding. That is the DBA's job, since the database system handles it automatically.

CUBRID SHARD is designed to be very efficient. It provides built-in distributed load balancing as well as connection and statement pooling. At the conference I will present several cases where CUBRID SHARD is deployed as a shard manager and a connection manager, or where it is used for seamless data migration between different systems.

Who should come to the session?

If you run a service that spends money on a database solution, or on tools for sharding databases or managing connections, come and learn how CUBRID SHARD can give your applications native scale-out through a single database view.


  1. Easy MySQL Database Sharding with CUBRID SHARD
      Esen Sagynov
      April 24, 2013
  2. Today
      1. About NHN
      2. Sharding in Production
      3. Why CUBRID SHARD
      4. How to shard MySQL databases
      5. DEMO
      6. CUBRID SHARD in Ndrive
  3. About me
      • Esen Sagynov (NHN Corp.)
        – @CUBRID
        – fb.com/cubrid
        – esen@cubrid.org
  4. About NHN
  5. Sharding in Production
      • Uses RDBMS with Sharding
      • Data is stored as simple Key-Value.
  6. Sharding Solutions

      | Name | Type | Requirements: DB | Requirements: etc. | Interface |
      |---|---|---|---|---|
      | Hibernate Shards | AS framework | DBMS with Hibernate support | Hibernate, JVM | Java |
      | HiveDB | AS framework | MySQL | Hibernate, JVM | Java |
      | dbShards | AS & Middleware | MySQL | | Java, C, PHP, Python, Ruby |
      | Gizzard (Twitter) | Middleware | Any storage | JVM | Java |
      | Spider for MySQL | Middleware & Storage Engine | MySQL | | Any |
      | Spock Proxy | Middleware | MySQL | | Any |
      | Shard-Query | Middleware | MySQL | | PHP, RESTful API |
      | CUBRID SHARD | Middleware | CUBRID, MySQL, Oracle | | Any |
  7. Sharding Solution Categories
      • Application layer
      • Storage layer
      • Heavy middleware
      • Lightweight middleware
  8. Application & Storage Layers
      Application Layer
      • Hibernate Shards
      • HiveDB
      Disadvantages
      • Requires Hibernate/Java
      • Uses many XML files for configuration
      • Not for running services
      Storage Layer
      • Spider for MySQL
      Disadvantages
      • Requires changing the storage engine
      • Not for running services
  9. Heavy Middleware
      dbShards, Gizzard
      • Requires changing application code
      • Requires agents to be installed on each DB server
      • Not for running services
      • Not active
  10. Lightweight Middleware
      • Spock Proxy
        – Active project
        – Lightweight
        – Flexible
        – Easy to configure
        – No application change
  11. Spock Proxy
      • Sharding rule storage: Database
      • Sharding strategy: Modulo
      • Determine sharding key: Full SQL parsing
      • Strength: No need to change SQL
      • Weakness:
        – Performance degradation: extra SQL parsing, resultset merging
        – Not all MySQL SQL is supported
        – Single threaded
      Blog post: http://www.cubrid.org/blog/dev-platform/database-sharding-platform-at-nhn/
  12. Spock Proxy Performance
      • Single threaded
      • Parses and rewrites SQL
      [Chart: execution time vs. concurrent clients (1 to 400) for App Sharding, Spock Proxy, and CUBRID SHARD]
  13. Spock Proxy
      • Active project
      • Lightweight
      • Flexible
      • Easy to configure
      • No application change
      ✕ No performance impact
  14. Lightweight, Easy to Configure Sharding Middleware: CUBRID SHARD
  15. Spock Proxy vs. CUBRID SHARD

      | | Spock Proxy | CUBRID SHARD |
      |---|---|---|
      | Sharding rule storage | Database | Configuration file |
      | Sharding strategy | Modulo | Modulo; user-defined hash function |
      | Determine sharding key | Full SQL parsing | SQL hint search |
      | Strength | No need to change SQL | Supports CUBRID and MySQL; full MySQL SQL support; higher performance (no SQL parsing, multi-threaded, connection pooling, load balancing); custom sharding strategy; easy configuration |
      | Weakness | Performance degradation (extra SQL parsing, resultset merging); supports MySQL only; not all MySQL SQL is supported; single threaded | Requires changing SQL queries to insert the sharding hint |

      Blog post: http://www.cubrid.org/blog/dev-platform/database-sharding-platform-at-nhn/
  16. CUBRID Facts
      • RDBMS
      • True Open Source @ www.cubrid.org
      • Optimized for Web services
      • High performance
      • Large DB support
      • High-Availability feature
      • DB Sharding support
      • 90+% MySQL compatible SQL syntax + Oracle analytical functions
      • ACID Transactions
      • Online Backup
      • Supported by NHN Corporation
  17. CUBRID SHARD Architecture
      [Diagram: single database view]
  18. SHARD Environment
      [Diagram]
  19. Installing CUBRID SHARD is easy!
  20. Easy Installation
      • http://www.cubrid.org/downloads
      • apt-get
      • yum
      • chef ⭐
      • VM
      • EC2 AMI
      • cloud service
      Doc page: http://www.cubrid.org/wiki_tutorials/entry/cubrid-installation-instructions
  21. Configuring is very easy and intuitive!
  22. Configuration Steps
      1. Create shards
      2. Create database users
      3. Create database schema
      4. Configure CUBRID SHARD
        – shard database information
        – backend shards connection information
        – sharding strategy
      5. Start CUBRID SHARD
      6. Change application code
        – connection URL
        – shard hint
  23. # 1. Create Shards
      • Host 1..N:
          $> mysql -ushard -ppassword -hnode1
          mysql> CREATE DATABASE sharddb;
  24. # 2. Create Users
      • Host 1..N:
          $> mysql -ushard -ppassword -hnode1
          mysql> USE mysql;
          mysql> GRANT ALL PRIVILEGES ON sharddb.* TO shard@localhost IDENTIFIED BY 'shard123';
          mysql> GRANT ALL PRIVILEGES ON sharddb.* TO shard@shardBrokerNode IDENTIFIED BY 'shard123';
  25. # 3. Create same tables
      • Host 1..N:
          $> mysql -ushard -ppassword -hnode1
          mysql> USE sharddb;
          mysql> CREATE TABLE tbl_users (id BIGINT PRIMARY KEY, name VARCHAR(20), age SMALLINT);
          $> mysql -ushard -ppassword -hnode2
          …
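Because every shard must hold the same schema, the step above is repeated on each host. A minimal Python sketch of that loop follows. It only builds the `mysql` client command lines (using the client's `-e` option) and does not run them. The host names `node1` to `node3` and the helper name `schema_commands` are assumptions made for illustration, not part of CUBRID SHARD.

```python
import shlex

# DDL from the slide; the same statement must be applied on every shard.
DDL = ("CREATE TABLE tbl_users ("
       "id BIGINT PRIMARY KEY, name VARCHAR(20), age SMALLINT);")


def schema_commands(hosts, user="shard", password="password", db="sharddb"):
    """Return one mysql client invocation per shard host (hypothetical helper)."""
    commands = []
    for host in hosts:
        argv = ["mysql", f"-u{user}", f"-p{password}", f"-h{host}", db, "-e", DDL]
        commands.append(shlex.join(argv))  # quote the DDL safely for a shell
    return commands


if __name__ == "__main__":
    # Host names are assumptions; substitute the real shard hosts.
    for command in schema_commands(["node1", "node2", "node3"]):
        print(command)
```

Generating the commands from a single `DDL` constant keeps the table definition identical across shards, which is what the middleware relies on.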
  26. # 4. Simple Configuration
      • shard.conf
        – Main configuration file for CUBRID SHARD.
      • shard_connection.txt
        – Predefined list of shard IDs, database names, and host names for CUBRID/MySQL.
      • shard_keys.txt
        – A list of shard_key_columns and their mapping to shard_id.
      Doc page: http://www.cubrid.org/manual/91/en/shard.html#configuration-and-setup
  27. shard.conf
      Set:
      1. SHARD_DB_NAME
      2. SHARD_DB_USER
      3. SHARD_DB_PASSWORD
      4. APPL_SERVER

          …
          SHARD_DB_NAME = sharddb
          SHARD_DB_USER = shard
          SHARD_DB_PASSWORD = shard123
          APPL_SERVER = CAS_MYSQL
          …

      Doc page: http://www.cubrid.org/manual/91/en/shard.html#default-configuration-file-shard-conf
  28. shard_connection.txt
      Set:
      1. Shard ID
      2. Real database name
      3. Remote/local host name

          # shard-id real-db-name connection-info
          0 sharddb mysqlA:3306
          1 sharddb mysqlB:3306
          2 sharddb mysqlC:3306
          …

      ** Host names must be identical to the output of the hostname command on every node.
      Doc page: http://www.cubrid.org/manual/91/en/shard.html#setting-shard-metadata
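To make the three-column layout of shard_connection.txt concrete, here is a small Python sketch that reads such a file into a lookup table. It assumes the `host:port` form shown in the MySQL example on the slide and treats `#` as a comment marker. The names `ShardTarget` and `parse_shard_connections` are invented for this illustration and are not CUBRID APIs.

```python
from typing import Dict, NamedTuple


class ShardTarget(NamedTuple):
    db_name: str
    host: str
    port: int


def parse_shard_connections(text: str) -> Dict[int, ShardTarget]:
    """Parse 'shard-id real-db-name host:port' lines; '#' starts a comment."""
    shards: Dict[int, ShardTarget] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank lines and comment-only lines
        shard_id, db_name, conn = line.split()
        host, port = conn.rsplit(":", 1)  # assumes the host:port form
        shards[int(shard_id)] = ShardTarget(db_name, host, int(port))
    return shards


SAMPLE = """\
# shard-id real-db-name connection-info
0 sharddb mysqlA:3306
1 sharddb mysqlB:3306
2 sharddb mysqlC:3306
"""

if __name__ == "__main__":
    for shard_id, target in parse_shard_connections(SAMPLE).items():
        print(shard_id, target)
```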
  29. shard_keys.txt
      Set:
      1. Min shard key
      2. Max shard key
      3. Shard ID

          [%student_no]
          # min max shard_id
          0 63 0
          64 127 1
          128 191 2
          192 255 3

      ** The default sharding strategy is to apply modulo 256 (SHARD_KEY_MODULAR in shard.conf).
      Doc page: http://www.cubrid.org/manual/91/en/shard.html#setting-shard-metadata
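The mapping described on this slide can be sketched in a few lines of Python: take the key modulo 256, then find the range that contains the result. This is only a model of the slide's description for integer keys. The function name `shard_id_for` is an assumption, and the sketch does not cover how CUBRID SHARD hashes string keys.

```python
SHARD_KEY_MODULAR = 256  # default modulus named on the slide

# (min, max, shard_id) rows from the [%student_no] section above
RANGES = [(0, 63, 0), (64, 127, 1), (128, 191, 2), (192, 255, 3)]


def shard_id_for(key: int, ranges=RANGES, modular: int = SHARD_KEY_MODULAR) -> int:
    """Map an integer shard key to a shard_id via modulo, then range lookup."""
    hashed = key % modular  # Python's % is non-negative for a positive modulus
    for low, high, shard_id in ranges:
        if low <= hashed <= high:
            return shard_id
    raise LookupError(f"no range covers hash value {hashed}")


if __name__ == "__main__":
    for student_no in (100, 256, 511):
        print(student_no, "->", shard_id_for(student_no))
```

For example, `student_no = 100` hashes to 100, which falls in the 64 to 127 range and therefore maps to shard 1.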
  30. Custom Library

          /* Constants such as SHARD_U_TYPE_INT and ERROR_ON_ARGUMENT are used as on the slide; their definitions are not shown there. */
          int fn_shard_key_udf(int type, void *val)
          {
              int mod = 2;

              if (val == NULL)
              {
                  return ERROR_ON_ARGUMENT;
              }

              switch (type)
              {
              case SHARD_U_TYPE_INT:
                  {
                      int ival = *(int *) val;
                      /* Note: in C, a negative ival yields a negative remainder. */
                      return ival % mod;
                  }
              case SHARD_U_TYPE_STRING:
                  /* String keys are not supported by this example. */
                  return ERROR_ON_MAKE_SHARD_KEY;
              default:
                  return ERROR_ON_ARGUMENT;
              }

              return ERROR_ON_MAKE_SHARD_KEY;
          }

      shard.conf
      1. SHARD_KEY_LIBRARY_NAME
      2. SHARD_KEY_FUNCTION_NAME

          [%student_no]
          SHARD_KEY_LIBRARY_NAME=$CUBRID/conf/shard_key_udf.so
          SHARD_KEY_FUNCTION_NAME=fn_shard_key_udf

      Doc page: http://www.cubrid.org/manual/91/en/shard.html#setting-user-defined-hash-function
  31. # 5. Start CUBRID SHARD
          $> cubrid shard start
          @ cubrid shard start
          ++ cubrid shard start: success
  32. # 6. Connection URL
          connectionURL = "jdbc:cubrid:localhost:45511:sharddb:shard:shard123:?althosts=node2:port2,node3:port3&loadBalance=true";
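The URL on this slide reads as colon-separated fields (host, port, database, user, password) followed by optional properties. A small Python sketch that assembles a string in that shape is shown below. The field order is inferred from the slide alone, and `build_shard_url` is a hypothetical helper, not part of the CUBRID JDBC driver.

```python
def build_shard_url(host, port, db, user, password, alt_hosts=(), load_balance=False):
    """Assemble a connection URL in the shape shown on the slide (hypothetical helper)."""
    url = f"jdbc:cubrid:{host}:{port}:{db}:{user}:{password}:"
    params = []
    if alt_hosts:
        # Alternative hosts, written as host:port pairs separated by commas.
        params.append("althosts=" + ",".join(alt_hosts))
    if load_balance:
        params.append("loadBalance=true")
    if params:
        url += "?" + "&".join(params)
    return url


if __name__ == "__main__":
    print(build_shard_url("localhost", 45511, "sharddb", "shard", "shard123",
                          alt_hosts=["node2:port2", "node3:port3"],
                          load_balance=True))
```

Building the URL from named parts makes it harder to misplace a colon, which is easy to do in the hand-written form.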
  33. Querying Shards
          SELECT name FROM student WHERE student_no = /*+ shard_key */ ?;
  34. Types of SQL Hints
  35. String query = "SELECT name FROM student WHERE student_no = /*+ shard_key */ ?;";
      PreparedStatement query_stmt = connection.prepareStatement(query);
      query_stmt.setInt(1, 100);
      ResultSet rs = query_stmt.executeQuery();
      // fetch resultset

      | key_column | range min (hash result) | range max (hash result) | shard_id |
      |---|---|---|---|
      | student_no | 0 | 63 | 0 |
      | student_no | 64 | 127 | 1 |
      | student_no | 128 | 191 | 2 |
      | student_no | 192 | 255 | 3 |
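The "SQL hint search" approach can be illustrated without a full SQL parser: locate the `/*+ shard_key */` comment, work out which bind parameter it tags, and route on that value. The Python sketch below is an illustration of the idea only and is not CUBRID SHARD's actual implementation. The function names are invented, only the `shard_key` hint is modeled, and the routing hard-codes the four equal ranges from the table above.

```python
import re

# The hint text used on the slides; other hint types are not modeled here.
HINT = re.compile(r"/\*\+\s*shard_key\s*\*/\s*\?")


def shard_key_param_index(sql: str) -> int:
    """Return the 1-based index of the bind parameter tagged with the hint."""
    match = HINT.search(sql)
    if match is None:
        raise ValueError("no /*+ shard_key */ hint found")
    # Count the '?' placeholders that appear before the hinted one.
    # Simplification: assumes no literal '?' inside strings or comments.
    return sql[:match.start()].count("?") + 1


def route(sql: str, params: list, modular: int = 256, shard_width: int = 64) -> int:
    """Pick a shard_id using modulo 256 and four ranges of 64 hash values each."""
    key = params[shard_key_param_index(sql) - 1]
    return (key % modular) // shard_width


if __name__ == "__main__":
    sql = "SELECT name FROM student WHERE student_no = /*+ shard_key */ ?"
    print(route(sql, [100]))
```

With the slide's example value of 100, the hash result is 100, which lies in the 64 to 127 range, so the statement goes to shard 1.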
  36. MySQL Sharding DEMO
      Requirements:
      • 1GB free RAM
      • 3GB free space for 2 VMs
      • VirtualBox
      • Vagrant
  37. MySQL Sharding DEMO
      https://github.com/kadishmal/cubrid-shard-demo
  38. CUBRID SHARD
      • Easy
        – No configuration hassle
        – No "moving parts"
      • Reliable
        – High performance
        – No SPOF
      • Open source
        – Supported by NHN
  39. CUBRID SHARD Disadvantages
      • Need to alter SQL to add hints
      • No data rebalancing
      • Need to carefully plan the sharding strategy in advance
      • No GUI monitoring tool, only the command line
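Why planning ahead matters when there is no rebalancing can be shown with a little arithmetic. The Python sketch below assumes the modulo 256 scheme with evenly split ranges, as in the shard_keys.txt example, and counts how many hash values would land on a different shard if the ranges were naively re-split for more shards. The helper names are invented for this illustration, and the sketch assumes the shard count divides 256.

```python
def even_ranges(shard_count: int, modular: int = 256):
    """Split hash values 0..modular-1 into equal (min, max, shard_id) ranges."""
    width = modular // shard_count  # assumes shard_count divides modular
    return [(i * width, (i + 1) * width - 1, i) for i in range(shard_count)]


def lookup(hashed: int, ranges) -> int:
    """Return the shard_id whose range contains the hash value."""
    for low, high, shard_id in ranges:
        if low <= hashed <= high:
            return shard_id
    raise LookupError(hashed)


def moved_hash_values(old_count: int, new_count: int, modular: int = 256) -> int:
    """Count hash values whose shard_id changes after an even re-split."""
    old = even_ranges(old_count, modular)
    new = even_ranges(new_count, modular)
    return sum(lookup(h, old) != lookup(h, new) for h in range(modular))


if __name__ == "__main__":
    print(moved_hash_values(4, 8), "of 256 hash values change shard")
```

Under these assumptions, going from 4 to 8 evenly split shards reassigns 224 of the 256 hash values. A more careful split moves fewer, but without built-in rebalancing the affected rows still have to be migrated by hand.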
  40. CUBRID SHARD is great when…
      • Services are already running and stable
      • But data is growing fast
      • And you need a stable solution
      • Quick installation and easy configuration
      • Time constraints
  41. Ndrive cloud storage service
      • User files meta data
      • Sharding strategy by user ID
      • 24 master shards
        – Intel(R) Xeon(R) L5640 @ 2.27GHz * 8, 16G RAM, 820G HDD
      • 10TB data
      • Load pattern:
        – 75~80% SELECT vs. 20~25% INSERT
        – Avg. ~3000 QPS/shard
        – Avg. ~5% CPU load/shard
  42. Ndrive cloud storage service
      • 1 SHARD BROKER
      • 4 Proxies per Broker
      • 50 CAS per proxy
      • No performance degradation after CUBRID SHARD is used
      [Chart: x-axis Vuser (64, 128, 192, 256, 320)]
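The figures on the two Ndrive slides can be combined into a rough picture of the deployment. The Python sketch below does that arithmetic. It assumes that the 10TB figure is the total across all 24 master shards, that data is spread evenly, and that 1 TB equals 1000 GB. None of these assumptions are stated on the slides.

```python
# Figures quoted on the two Ndrive slides.
MASTER_SHARDS = 24
AVG_QPS_PER_SHARD = 3000
TOTAL_DATA_TB = 10          # assumed to be the total across all shards
BROKERS = 1
PROXIES_PER_BROKER = 4
CAS_PER_PROXY = 50


def aggregate_qps() -> int:
    """Approximate total queries per second across all master shards."""
    return MASTER_SHARDS * AVG_QPS_PER_SHARD


def data_per_shard_gb() -> float:
    """Average data per shard in GB (decimal units, even spread assumed)."""
    return TOTAL_DATA_TB * 1000 / MASTER_SHARDS


def total_cas() -> int:
    """Total CAS processes behind the single broker."""
    return BROKERS * PROXIES_PER_BROKER * CAS_PER_PROXY


if __name__ == "__main__":
    print("aggregate QPS:", aggregate_qps())
    print("data per shard (GB):", round(data_per_shard_gb(), 1))
    print("total CAS:", total_cas())
```

Under these assumptions the cluster handles roughly 72,000 queries per second in aggregate, each shard holds a little over 400 GB on its 820G disk, and one broker fronts 200 CAS processes.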
  43. CUBRID SHARD Next
      • Auto-rebalancing in CUBRID SHARD
      • CM shard monitoring
      • Aggregation feature
  44. Questions?
      • Esen Sagynov (NHN Corp.)
        – @CUBRID
        – fb.com/cubrid
        – esen@cubrid.org