2. Databases - Background
1. SQL – Normalized Schema, Referential Integrity, Constraints, Indexed data
2. OLTP vs OLAP
3. Teradata, EDW - OLAP
4. NoSQL – Not only SQL use cases
i. Document Database – MongoDB
Website Visitor profile
Case Document of Patient
Wheel traceability
ii. Column Family Database – Cassandra
Gmail with mail Id and mails
iii. Graph Database – queried with Gremlin (graph traversal language)
most influential person(s) in a network
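The graph use case above can be sketched in a few lines. This is a minimal illustration, not a Gremlin query: it assumes "influence" means the highest number of incoming follower edges (in-degree), and the names and edges are hypothetical sample data.

```python
from collections import Counter

# Hypothetical follower edges: (follower, followed).
EDGES = [
    ("ana", "raj"),
    ("li", "raj"),
    ("sam", "raj"),
    ("raj", "li"),
    ("sam", "li"),
]


def most_influential(edges):
    """Return the people with the highest in-degree, sorted by name."""
    in_degree = Counter(followed for _, followed in edges)
    if not in_degree:
        return []
    top = max(in_degree.values())
    # Keep every person tied for the top count.
    return sorted(person for person, count in in_degree.items() if count == top)


print(most_influential(EDGES))
```

A graph database answers the same question with a traversal over stored vertices and edges instead of an in-memory list.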
3. Cloud Concepts
High Availability in Cloud
[Diagram: Geography > Region > Availability Zone (AZ). A Geography contains Regions; each Region contains multiple AZs.]
• Failures can occur that affect the availability of resources that are in the same Region and AZ.
• An AZ may host multiple data centers
• 54 regions – the highest count of any cloud provider, counting public and private regions
4. Cosmos DB - Background
• IaaS – Compute, Storage and Network
• PaaS – Brand of Products – OS, Databases, Middleware as managed service
Management and sizing are planned by the provider
• SaaS – Readily available Product – SalesForce, Gmail, PeopleSoft, SAP SuccessFactors
• DBaaS [database-as-a-service]
1. SQL Database – Structured Database as service
2. Azure Document DB – NoSQL Database service with support for Document model
3. Cosmos DB – evolution of Document DB with multi model support
Supported Models – MongoDB, Gremlin, Cassandra, Table and SQL APIs
[Figure: Service Models]
5. Globally Distributed
• Scale read and write throughput globally.
• Maintain business continuity during regional outages.
• Globally Distributed – multiple read regions on demand
• Low Latency – 10 ms at 99th percentile
• Elastic Scalability – thousands to hundreds of millions of requests/sec
• High Availability - 99.999% for multi region reads
• Tunable consistency – Strong, Bounded Staleness, Session, Consistent Prefix, Eventual.
Each level trades consistency against performance: stronger levels give fresher reads at the cost of latency and throughput.
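The consistency tradeoff can be sketched as an ordered set of levels plus a staleness check. This is an illustrative model only, not the Cosmos DB SDK: the function names, the version counters, and the `max_lag_versions` parameter are assumptions, and the ordering and session guarantees of the weaker levels are not modeled.

```python
from enum import IntEnum


class Consistency(IntEnum):
    """Levels named on the slide; a higher value means a stronger guarantee."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def within_staleness_bound(latest_version, replica_version, max_lag_versions):
    """True if the replica lags the latest write by at most max_lag_versions."""
    lag = latest_version - replica_version
    return 0 <= lag <= max_lag_versions


def read_allowed(level, latest_version, replica_version, max_lag_versions=0):
    """Hypothetical check: may this replica serve a read at the given level?"""
    if level is Consistency.STRONG:
        # Strong: the replica must have the latest write.
        return replica_version == latest_version
    if level is Consistency.BOUNDED_STALENESS:
        # Bounded staleness: the replica may lag, but only within the bound.
        return within_staleness_bound(latest_version, replica_version, max_lag_versions)
    # Weaker levels tolerate any lag in this simplified sketch.
    return replica_version <= latest_version
```

The stronger the level, the more often a lagging replica must be rejected, which is where the extra latency comes from.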
6. High Availability
Partitions: Logical to Physical Mapping
• Container = collection or table
• Group of nodes = replication set
[Diagram: London and NYC. Before: MySQL DB1 (2 TB) and MySQL DB2 (2 TB); Total Data = 3 TB]
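The logical-to-physical mapping can be sketched as hashing a partition key onto a fixed number of physical partitions. This is a simplified illustration under stated assumptions: the function name, the SHA-256 hash, and the modulo placement are hypothetical choices for the example, not the service's actual algorithm.

```python
import hashlib


def physical_partition(partition_key: str, partition_count: int) -> int:
    """Map a logical partition key to a physical partition index (sketch)."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical keys: every item with the same key lands on the same partition.
for key in ("patient-001", "patient-002", "visitor-42"):
    print(key, "->", physical_partition(key, partition_count=4))
```

Because the mapping depends only on the key, all items that share a partition key stay together, while different keys spread across the replication sets.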
7. Pricing
Cosmos DB was expensive, now affordable to get started.
                      Before               Now
Minimum throughput    10,000 RU/s          400 RU/s
Scaling increment     1,000 RU/s           100 RU/s
1 region              $584 per month       $23.36 per month
Multi region          $1,168 per month     $46.72 per month
RU/s – Request Units per second
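The figures above are consistent with a flat rate per 100 RU/s per region. A minimal sketch of that arithmetic, assuming the rate implied by the slide ($23.36 / 4 = $5.84 per 100 RU/s per month) and assuming "multi region" means two regions, since the quoted prices double; the function name is hypothetical and current prices may differ.

```python
from decimal import Decimal

# Derived from the slide: $23.36 per month for 400 RU/s in one region.
PRICE_PER_100_RUS_PER_MONTH = Decimal("5.84")


def monthly_cost(ru_per_second: int, regions: int = 1) -> Decimal:
    """Estimate monthly cost in USD for provisioned throughput (sketch)."""
    if ru_per_second <= 0 or ru_per_second % 100 != 0:
        raise ValueError("throughput must be a positive multiple of 100 RU/s")
    if regions < 1:
        raise ValueError("regions must be at least 1")
    return PRICE_PER_100_RUS_PER_MONTH * (ru_per_second // 100) * regions


print(monthly_cost(400))        # new minimum, 1 region
print(monthly_cost(400, 2))     # new minimum, 2 regions
print(monthly_cost(10_000))     # old minimum, 1 region
print(monthly_cost(10_000, 2))  # old minimum, 2 regions
```

The same function reproduces all four prices in the table, which shows the drop in entry cost comes from the lower minimum throughput, not from a lower unit rate.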
8. Best Practices
• In the cloud, keep a close eye on your usage.
• Scale on demand – is a complete size estimate needed before deployment?
• Test with production-scale data in a test environment
• Technology stack evolution with multi-model support
• Supports manual failover of a region. Simulate failure and plan.
• Leverage automatic secondary indexing
• If using the MongoDB API, get started using the emulator.
Editor's Notes
An AZ has one or more data centers (DCs)
Why multi-model – avoids maintaining a different database for each model