1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Moving a Galaxy into Cloud
Best Practices from Samsung on Migrating to Amazon DynamoDB
Vijay Natarajan - Principal Product Manager, AWS
Sangpill Kim - Enterprise Solutions Architect Manager, AWS Korea
Seonggyu Kim - Server Engineer, Samsung Electronics
DAT320
2.
Topics
• Best practices for migrating to Amazon DynamoDB
• Samsung’s migration from Apache Cassandra to Amazon DynamoDB
3.
Migrating to Amazon DynamoDB
Sangpill Kim
Enterprise Solutions Architect Manager, AWS Korea
4.
Amazon DynamoDB
Fast and flexible NoSQL database service for any scale
• Fast, consistent performance: Consistent single-digit millisecond latency; DAX in-memory performance reduces response times to microseconds
• Highly scalable: Auto-scaling to hundreds of terabytes of data that serve millions of requests per second
• Fully managed: Automatic provisioning, infrastructure management, scaling, and configuration with zero downtime
• Business critical reliability: Data is stored in fault tolerant availability zones, with fine-grained access control
5.
Best Practices for Migrating to DynamoDB
• Planning
• Data Analysis
• Data Modeling
• Testing
• Migration
6.
Planning Phase
• Goals of the migration
• Identify tables to migrate
• Document per-table challenges
• Define and document backup and restore strategies
7.
Data Analysis Phase
Source Data Analysis
Key data attributes
• Number of items to be imported into DynamoDB
• Distribution of the item sizes
• Multiplicity of values to be used as partition or sort keys
Access Pattern of the Application
Examples:
• Write only
• Fetches by distinct value
• Queries across a range of values
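The key data attributes above can be gathered from a sample of the source data. The following is a minimal sketch, assuming records are available as Python dictionaries; the function name `analyze_items`, the sample records, and the use of JSON length as a stand-in for item size are all illustrative assumptions, not part of the original talk.

```python
import json
from collections import Counter


def analyze_items(items, key_candidates):
    """Summarize a sample of source records before migration."""
    # JSON length is only a rough proxy for the stored item size.
    sizes = [len(json.dumps(item).encode("utf-8")) for item in items]
    # Bucket item sizes by whole KB, rounded up.
    size_histogram = Counter((size + 1023) // 1024 for size in sizes)
    # Count distinct values for each candidate partition or sort key.
    distinct_values = {
        key: len({item.get(key) for item in items}) for key in key_candidates
    }
    return {
        "item_count": len(items),
        "max_size_bytes": max(sizes, default=0),
        "size_histogram_kb": dict(size_histogram),
        "distinct_values": distinct_values,
    }


# Hypothetical sample records.
sample = [
    {"user_id": "u1", "record_id": "r1", "payload": "x" * 10},
    {"user_id": "u1", "record_id": "r2", "payload": "x" * 1500},
    {"user_id": "u2", "record_id": "r3", "payload": "x" * 10},
]
report = analyze_items(sample, ["user_id", "record_id"])
print(report)
```

A key candidate with few distinct values relative to the item count is a warning sign for use as a partition key.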
8.
Data Modeling Phase
Choosing primary key for a table
• Distribute the data across many distinct partition key values
• Simple (partition key) vs. Composite (partition key + sort key)
• Consider the workload patterns on individual items
• Randomize across multiple partition key values
Understanding data access patterns
• Local Secondary Indexes
• Global Secondary Indexes
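Spreading writes across multiple partition key values can be sketched as follows. This is an illustration only: it uses a calculated suffix (derived from a hash of the item id) rather than a purely random one, so that a reader can recompute the key; `SHARD_COUNT`, the `#` separator, and both function names are hypothetical.

```python
import hashlib

SHARD_COUNT = 10  # hypothetical number of key suffixes


def sharded_partition_key(base_key, item_id, shards=SHARD_COUNT):
    """Append a suffix computed from item_id to spread writes for base_key."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    suffix = int.from_bytes(digest[:4], "big") % shards
    return f"{base_key}#{suffix}"


def all_shard_keys(base_key, shards=SHARD_COUNT):
    """List every key a reader must query to see all items for base_key."""
    return [f"{base_key}#{n}" for n in range(shards)]


key = sharded_partition_key("2017-11-28", "order-42")
print(key, all_shard_keys("2017-11-28"))
```

The trade-off is that reading everything for one base key now requires one query per suffix.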
9.
Testing Phase
End-to-end test for entire migration process
• Tests are developed and run during the other phases, because the migration strategy is iterative
• The outcome of a round of tests will often lead to revisiting a previous phase
Testing Overview
• Basic Acceptance Test
• Functional Tests
• Non-Functional Tests
• User Acceptance Tests
10.
Data Migration Phase
The migration should be fully documented and automated as much as possible
• If the migration fails for any reason, execute the rollback procedure, which should also be well documented and tested
• After the rollback, perform a root cause analysis of the failure and reschedule the migration once the issue is identified and resolved
11.
Massive Scale Migration in Action
• Evaluation (7 months)
• Testing (1 month)
• Modeling (1 month)
• Migration (4 months)
• Operation (~2 years)
12.
Moving a Galaxy into Cloud:
Samsung’s case study
Seonggyu Kim
Server Engineer, Samsung Electronics
13.
Samsung Cloud Service
• Backup and restore data and settings: Home screen, App data, Contact, Messages, Device settings, Music, Documents, etc.
• Your photos on multiple devices any time: Sync photos, videos, notes using native applications across Samsung devices
• 15 GB of free storage, upgrade for more
• US, 29 countries in EU, KR, etc.
• Storage service providing backup and restore and a key-value store to mobile applications
• ~300M users with Samsung Accounts, 860TB DynamoDB storage
14.
DynamoDB Usage Today
DynamoDB usage is growing steadily
• 500 tables, 3.5M RCU, 3M WCU, 860TB storage in total
• Growth rate (YoY, as of July 2017): RCU 136%, WCU 175%, Storage 226%
15.
NoSQL Database Usage in 2014
Cassandra Cluster
• Cassandra ring : > 100 i2.8xlarge instances (50% On-demand, 50% Reserved Instance)
Challenges
• High cost of operations and resources
• Unstable consistency
Requirements
• Providing indexes
• Real-time query response
• Support for large data sizes
• Efficient and fast scale-out
• Easy and secure operation
? DynamoDB
16.
Phase 1: Evaluation
Scalability
Requirements
• > 20K concurrent connections, > 100TB table size
DynamoDB
• DynamoDB does not monitor or limit the connection count; it only limits throughput to the level the user provisions
• There are no limits on the request capacity or storage size for a given table
17.
Phase 1: Evaluation
Performance
Requirements
• Consistent latency at scale
DynamoDB
• No storage capacity limit, and no latency impact from large amounts of data and transactions
18.
Phase 1: Evaluation
Reliability
Requirements
• Amazon S3 level availability & durability, Backup and Recovery
DynamoDB
• Fault tolerance in the event of a server failure or Availability Zone outage
• Synchronously replicates data across three facilities within an AWS Region
• Export / Import as a full backup
19.
Phase 1: Evaluation
Security
Requirements
• Data encryption at rest, DB access logs
DynamoDB
• DynamoDB is already secured by AWS-managed infrastructure, IAM for access control, and encryption in transit
• Client-side encryption with KMS
20.
Phase 1: Evaluation
Cost
Instance Type Spec
• i2.8xlarge : 365K Read IOPS, 315K First Write IOPS (with 4KB blocksize)
• c4.8xlarge : 48K IOPS (with 16KB blocksize), 500Mbps
Usage
• i2.8xlarge, 10K Read IOPS, 2K Write IOPS (with 40KB)
Results
• Changed i2.8xlarge instances to c4.8xlarge instances with 8 x 1TB EBS gp2 volumes
• i2.8xlarge $5,500/month vs. c4.8xlarge + gp2 $2,568/month
• > 50% cost savings
Lessons
• As data grows, instances are used for storage capacity rather than IOPS. This is an opportunity to optimize
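The "> 50% cost savings" figure follows directly from the two monthly prices quoted above. A short check, using only the slide's numbers (the function name `savings_percent` is illustrative):

```python
def savings_percent(old_monthly, new_monthly):
    """Percentage saved when moving from old_monthly to new_monthly cost."""
    return (old_monthly - new_monthly) / old_monthly * 100


i2_monthly = 5500.0      # i2.8xlarge, USD per month (from the slide)
c4_gp2_monthly = 2568.0  # c4.8xlarge + gp2, USD per month (from the slide)
saving = round(savings_percent(i2_monthly, c4_gp2_monthly), 1)
print(saving)
```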
21.
Phase 1: Evaluation (Contd.)
Cost Comparison
Capacity
• Total = Used 230TB / Physical 512TB (800GB * 8 instance store volumes * 80 instances)
• DynamoDB indexed data storage capacity = 80TB ( = 230TB / 3 replication factor)
Calculate Capacity Unit
• 43K reads / second = 3,700M calls per day
• 14K writes / second = 1,250M calls per day
• 43K RCU, 14K WCU (Item size < 1KB, strong consistency)
Cost
• Estimated > 90% cost savings without RI/RC
• The estimate was not realized; savings were 60~70% in reality
Lessons
• Provisioned throughput might not be 100% utilized. Consider % utilization in cost comparisons
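The capacity-unit figures above can be reproduced with the standard DynamoDB sizing rules: one RCU covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), and one WCU covers one write per second of an item up to 1 KB. The sketch below applies those rules to the slide's request rates; the 1,000-byte item size is an assumed value standing in for "Item size < 1KB".

```python
import math

SECONDS_PER_DAY = 86_400


def read_capacity_units(reads_per_sec, item_size_bytes, strongly_consistent=True):
    """RCU needed: 4 KB per unit, halved for eventually consistent reads."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    if not strongly_consistent:
        return math.ceil(reads_per_sec * units_per_read / 2)
    return reads_per_sec * units_per_read


def write_capacity_units(writes_per_sec, item_size_bytes):
    """WCU needed: 1 KB per unit."""
    return writes_per_sec * math.ceil(item_size_bytes / 1024)


item_size = 1000  # bytes; assumed stand-in for "Item size < 1KB"
rcu = read_capacity_units(43_000, item_size)
wcu = write_capacity_units(14_000, item_size)
reads_per_day = 43_000 * SECONDS_PER_DAY
writes_per_day = 14_000 * SECONDS_PER_DAY
physical_gb = 800 * 8 * 80  # 800GB x 8 volumes x 80 instances
print(rcu, wcu, reads_per_day, writes_per_day, physical_gb)
```

With the rounded rate of 14K writes per second the daily total comes to about 1,210M calls; the slide's 1,250M presumably reflects an unrounded rate.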
22.
Phase 2: Testing
YCSB (Yahoo! Cloud Serving Benchmark)
• Open source benchmark tool for NoSQL: DynamoDB, Cassandra, Couchbase, etc.
• Wiki: https://github.com/brianfrankcooper/YCSB
Core Workloads
• Sets of pre-defined properties: readproportion, insertproportion, requestdistribution, operationcount
• Workload A: Update heavy workload (50/50 reads and writes)
• Workload B: Read mostly workload (95/5 reads/writes)
• Workload C: Read only (100% read)
• Workload D: Read latest workload
• Workload E: Short ranges
• Workload F: Read-modify-write
23.
Phase 2: Testing
Test Results (Item count: 100M, Item size < 1KB, 4 clients)

RCU/WCU: 80K (Strong Consistency)
Workload                  A        B        C        F        D        E
Consumed Read Capacity    7,000    14,000   14,000   14,000   80,000   22,000
Consumed Write Capacity   14,500   1,500    -        14,000   9,000    207

RCU/WCU: 40K (Strong Consistency)
Workload                  A        B        C        F        D        E
Consumed Read Capacity    8,000    14,000   14,000   14,000   40,000   -
Consumed Write Capacity   16,500   1,500    -        14,600   4,500    -

• Provisioned throughput was increased from 40K to 80K, but consumed throughput did not scale for workloads A, B, C, and F.
24.
Phase 2: Testing
Core Workloads: sets of core properties
• Workload A: Update heavy workload, requestdistribution=zipfian
• Workload B: Read mostly workload, requestdistribution=zipfian
• Workload C: Read only, requestdistribution=zipfian
• Workload D: Read latest workload, requestdistribution=latest
• Workload E: Short ranges, requestdistribution=zipfian
• Workload F: Read-modify-write, requestdistribution=zipfian
Zipfian distribution
• Requests popular items more often than others
• Does not spread the workload across partitions in DynamoDB
• For a small table with a small number of partitions (e.g., with 1K RCU/WCU), this does not matter
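The effect of a Zipfian request distribution on partition load can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: keys are mapped to partitions with an MD5-based hash as a stand-in for DynamoDB's internal partitioning, the Zipf weights use exponent 1 rather than YCSB's exact generator, and the key, partition, and request counts are arbitrary.

```python
import hashlib
import random
from collections import Counter


def partition_of(key, partitions):
    """Map a key to a partition with a stable hash (illustrative only)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % partitions


def busiest_partition_share(keys, partitions):
    """Fraction of all requests that land on the most loaded partition."""
    counts = Counter(partition_of(k, partitions) for k in keys)
    return max(counts.values()) / len(keys)


rng = random.Random(42)
NUM_KEYS, NUM_PARTITIONS, NUM_REQUESTS = 10_000, 20, 50_000
population = range(NUM_KEYS)
# Weight of the key at rank r is proportional to 1 / (r + 1).
zipf_weights = [1.0 / (rank + 1) for rank in range(NUM_KEYS)]

uniform_requests = rng.choices(population, k=NUM_REQUESTS)
zipfian_requests = rng.choices(population, weights=zipf_weights, k=NUM_REQUESTS)

uniform_share = busiest_partition_share(uniform_requests, NUM_PARTITIONS)
zipfian_share = busiest_partition_share(zipfian_requests, NUM_PARTITIONS)
print(f"uniform: {uniform_share:.3f}, zipfian: {zipfian_share:.3f}")
```

With an even split each of the 20 partitions would receive 5% of requests; under the Zipfian weights the partition holding the most popular key receives a much larger share, so its slice of the provisioned throughput is exhausted first.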
25.
Phase 2: Testing
Recommendation: set the workload request distribution to ‘uniform’
• https://github.com/brianfrankcooper/YCSB/tree/master/dynamodb
• Best practices for using DynamoDB: a uniform, evenly distributed workload is the recommended pattern for scaling and getting predictable performance out of DynamoDB
26.
Phase 3: Design - Architecture
Record type sync server
• With Cassandra: Mobile Client → External ELB → Sync API Server → Cassandra Cluster, S3 for large size data, Memcached for user lock, RDS for scheme info
• With DynamoDB: Mobile Client → External ELB → Sync API Server → DynamoDB, S3 for large size data, RDS for scheme info
27.
Phase 3: Design - Table
Function tables
• Partition key: composite_key
• Sort key: record_id
• A single big table to put and query once
Composite key: user_id + service_id + unique_id
• user_id: random alphanumeric
• service_id: unique values per service
• unique_id: unique values per record
Table layout: Partition key | Sort key | Attributes…
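A composite partition key of this shape could be assembled as in the sketch below. The slide only states that the key combines user_id, service_id, and unique_id; the `#` separator, the validation step, and the function names are assumptions made for illustration.

```python
def composite_key(user_id, service_id, unique_id, sep="#"):
    """Join the three parts into one partition key value (format assumed)."""
    parts = (user_id, service_id, unique_id)
    for part in parts:
        # Reject parts containing the separator so keys stay unambiguous.
        if sep in part:
            raise ValueError(f"key part {part!r} contains separator {sep!r}")
    return sep.join(parts)


def function_table_item(user_id, service_id, unique_id, record_id, **attributes):
    """Build an item with the function-table key attributes from the slide."""
    return {
        "composite_key": composite_key(user_id, service_id, unique_id),
        "record_id": record_id,
        **attributes,
    }


item = function_table_item("a1b2c3", "notes", "n-001", "rec-42", title="hello")
print(item)
```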
28.
Phase 3: Design - Table
Contents tables
• Partition key: user_id
• Sort key: record_id
• > 70 small tables, one per service, so that different throughput can be provisioned for each service
e.g.)
• More popular table (> 50TB): 45K WCU / 200K RCU
• Less popular table (< 4GB): 50 WCU / 300 RCU
Table layout: Partition key (user_id) | Sort key (record_id) | Attributes…
Local Secondary Index
• Partition key: user_id
• Sort key: update_time
• Attributes: record_id
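The contents-table layout above maps onto a table definition like the following sketch. It only builds a parameter dictionary in the shape accepted by the DynamoDB CreateTable API (for example via boto3's `create_table`); nothing is sent to AWS here. The table names, the index name `by_update_time`, the attribute types, and the `KEYS_ONLY` projection are assumptions, since the slide does not state them.

```python
def contents_table_definition(table_name, read_units, write_units):
    """Build CreateTable parameters for one per-service contents table."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "user_id", "AttributeType": "S"},
            {"AttributeName": "record_id", "AttributeType": "S"},
            {"AttributeName": "update_time", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "user_id", "KeyType": "HASH"},     # partition key
            {"AttributeName": "record_id", "KeyType": "RANGE"},  # sort key
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "by_update_time",  # hypothetical name
                "KeySchema": [
                    {"AttributeName": "user_id", "KeyType": "HASH"},
                    {"AttributeName": "update_time", "KeyType": "RANGE"},
                ],
                # KEYS_ONLY still projects the table's own keys (record_id).
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Throughput values taken from the slide's two examples.
popular = contents_table_definition("contents_popular", 200_000, 45_000)
less_popular = contents_table_definition("contents_small", 300, 50)
print(popular["ProvisionedThroughput"], less_popular["ProvisionedThroughput"])
```

Keeping one table per service is what allows the two examples to carry such different provisioned throughput.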
29.
Phase 4: Data Migration
Online migration
• Full migration is not possible (tables are several hundreds of TB in size): migrate per user
• Some users are in Cassandra while others are in DynamoDB: use a storage path DB
• To minimize impact on each user, migrate as quickly as possible: accelerate migration
Storage path DB (user → storage path)
• User A: <path_to_cassandra>
• User B: <path_to_dynamodb>
Data flows (Mobile clients → App Servers, which consult the Storage path DB)
• Normal data flow for migrated users (User B): DynamoDB
• Normal data flow for not migrated users (User A): Cassandra cluster
• Migration data flow: Cassandra cluster → DynamoDB
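The per-user routing described above can be sketched as follows. This is a toy model, not Samsung's implementation: plain dictionaries stand in for the Cassandra cluster, DynamoDB, and the storage path DB, all class and function names are hypothetical, and the sketch ignores writes that arrive while a user is being copied.

```python
from enum import Enum


class Backend(Enum):
    CASSANDRA = "cassandra"
    DYNAMODB = "dynamodb"


class StoragePathDB:
    """Records which backend currently holds each user's data."""

    def __init__(self):
        self._paths = {}

    def backend_for(self, user_id):
        # Users without an entry have not been migrated yet.
        return self._paths.get(user_id, Backend.CASSANDRA)

    def mark_migrated(self, user_id):
        self._paths[user_id] = Backend.DYNAMODB


def migrate_user(user_id, paths, cassandra, dynamodb):
    """Copy one user's records, then flip that user's storage path."""
    if paths.backend_for(user_id) is Backend.DYNAMODB:
        return 0
    records = cassandra.get(user_id, [])
    dynamodb[user_id] = list(records)
    paths.mark_migrated(user_id)
    return len(records)


def read_records(user_id, paths, cassandra, dynamodb):
    """Serve a read from whichever backend the storage path points to."""
    migrated = paths.backend_for(user_id) is Backend.DYNAMODB
    store = dynamodb if migrated else cassandra
    return store.get(user_id, [])


paths = StoragePathDB()
cassandra = {"userA": ["r1", "r2"], "userB": ["r3"]}
dynamodb = {}
migrated_count = migrate_user("userB", paths, cassandra, dynamodb)
print(migrated_count, read_records("userB", paths, cassandra, dynamodb))
```

Because the path is flipped only after the copy completes, each user is served by exactly one backend at any time.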
30.
Phase 4: Data Migration
Low utilization
• Simultaneous write calls and batch deletion during migration create a spiky workload pattern
• For instance, a user could hit only 2~3 partitions when writing 1K~10K items
Figure: User A’s items hit only a few partitions (out of Partition #1 … Partition #N) with multiple threads
31.
Phase 4: Data Migration
Solutions
• Decrease thread count for proper load control
• Reduce the number of items to read and write at once
• Improvements: throttled events reduced by > 90%, utilization increased from ~45% to ~80%, and migration speed accelerated
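The two load-control levers above (fewer threads, fewer items per call) can be sketched as a small batching helper. This is illustrative only: `write_batch` is a hypothetical callable that writes one batch and returns the number of items written, and the defaults are arbitrary apart from 25, which is the per-request item limit of DynamoDB's BatchWriteItem.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def migrate_items(items, write_batch, batch_size=25, max_workers=2):
    """Write items in small batches using a deliberately small thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(write_batch, chunked(items, batch_size)))


written_sizes = []


def fake_write_batch(batch):
    # Stand-in for a real batch write; only records the batch size.
    written_sizes.append(len(batch))
    return len(batch)


total = migrate_items(list(range(60)), fake_write_batch, batch_size=25, max_workers=2)
print(total, sorted(written_sizes))
```

Lowering `max_workers` or `batch_size` flattens the write spikes that a single user's items would otherwise concentrate on a few partitions.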
32.
Phase 5: Operation
Diluted partitions
• As storage grows or provisioned throughput is increased, the number of partitions grows automatically
• But partitions do not shrink when throughput is decreased
• Avoid diluted-partition situations
Example
• Create table: Partition #1
• Migration: increase WCU to 1,000K → 1,000 WCU per partition x 1,000 partitions
• After: decrease WCU to 100K → 100 WCU per partition x 1,000 partitions (1/10 WCU per partition)
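The dilution in this example is simple arithmetic: the table's provisioned throughput is divided across its partitions, and the partition count stays at 1,000 after the decrease. The sketch below uses only the slide's numbers and assumes an even split across partitions, which is a simplification; actual partition counts are internal to DynamoDB.

```python
def wcu_per_partition(table_wcu, partitions):
    """Provisioned WCU available to each partition, assuming an even split."""
    return table_wcu / partitions


partitions = 1_000  # partition count reached during migration (from the slide)
during_migration = wcu_per_partition(1_000_000, partitions)
after_decrease = wcu_per_partition(100_000, partitions)
print(during_migration, after_decrease, after_decrease / during_migration)
```

A hot key that could absorb 1,000 writes per second during migration can sustain only 100 after the decrease, even though the table as a whole is still provisioned for 100K WCU.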
33.
Phase 5: Operation
Data Backup
• Backup requirements: RPO 1 day, RTO 7 days for internal SLA
• Export / Import is not supported in SIN (as Data Pipeline is not available)
• Backup of large tables (> 40 TB) fails using Export / Import
Solutions
• Daily full backup with custom backup scripts on EMR/Hive: cost is still a challenge
34.
After Migration
Results
• Phases: Evaluation (7 months), Testing (1 month), Modeling (1 month), Migration (4 months), Operation (> 2 years)
• Timeline markers: Feb. 2015, Aug. 2015, Sep. 2015, Oct. 2015, Dec. 2015, Feb. 2016, Dec. 2017
• Before: Cassandra, > 100 i2.8xlarge instances
• After: DynamoDB, > 250K WCU, > 1M RCU
• ~40% cost savings (excludes transfers, services costs, etc.)
Realized huge cost savings by migrating to DynamoDB!
35.
Benefits of using DynamoDB
• Successfully launched the Samsung Cloud service, supporting massive-scale workloads for Samsung Galaxy smartphones
• 40% cost savings in NoSQL infrastructure
• No capacity planning for petabyte-scale storage, with on-demand capacity
• Consistent performance at 10s of millions in throughput
• Zero administration for hundreds of tables with DynamoDB Auto Scaling
• No failures during 2 years of operation with a fully managed service
• No data corruption or loss for billions of items
• Enterprise-level security and compliance using VPC Endpoints for DynamoDB
36.
Lessons learned
Evaluation
• TCO drives technology adoption / innovation
• As data grows, instances are used for storage capacity rather than IOPS. This is an opportunity to optimize
• Provisioned throughput might not be 100% utilized. Consider utilization in cost comparisons
Table Design
• Both the primary key selection and the workload patterns on individual items are important
Data Migration
• Enable online migration by migrating per user using a “storage path DB”
• Handle spiky workloads
• Go back to table design to spread workloads across partitions
Operations
• Avoid diluted partitions
• Consider how to back up
37.
Q & A
38.
Thank you!