SlideShare uma empresa Scribd logo
1 de 38
NoSQL, NO PROBLEM:
USING AZURE
DOCUMENTDB
{
"name": "Ken Cenerelli",
"twitter": "@KenCenerelli",
"e-mail": "Ken_Cenerelli@Outlook.com",
"hashtags": ["#DevTeach", "#DocumentDB"]
}
ABOUT ME
Twitter: @KenCenerelli
Email: Ken_Cenerelli@Outlook.com
Blog: kencenerelli.wordpress.com
LinkedI
n:
linkedin.com/in/kencenerelli
Bio:
 Content Developer / Programmer
Writer
 Microsoft MVP - Visual Studio and
Development Technologies
 Microsoft TechNet Wiki Guru
 Co-Organizer of CTTDNUG
 Technical reviewer of multiple
booksCTTDNU
G
Ken
Cenerelli
2
ROAD MAP
1. Overview
2. The Resource Model
3. Modeling Your Data
4. Performance
5. Developing with DocumentDB & Demos {“aka”: “The Good Stuff”}
6. Pricing
7. Wrap-up
3
WHAT IS NoSQL?
 NoSQL → Not Only SQL
 No up-front (schema) design
 Easier to scale horizontally
 Easier to develop iteratively
 Types & Examples:
 Document databases: DocumentDB, MongoDB, CouchDB
 Key-value stores: Redis
 Graph stores: Neo4J, Giraph
 Wide-column: Cassandra, HBase
4
WHAT IS AZURE DOCUMENTDB?
 NoSQL document database fully managed by Microsoft Azure
 Part of the NoSQL family of databases
 For rapid development of cloud-designed apps (web, mobile,
gaming, IoT)
 Store and query schema agnostic JSON data with SQL-like grammar
 Fast, predictable performance
 Transactionally process multiple documents via native JavaScript
processing
 Tunable consistency levels
 Built with familiar tools – REST, JSON, JavaScript
5
6
WHERE DOES IT FIT IN THE AZURE
FAMILY?
7
WHEN TO USE DOCUMENTDB?
 In General
 You don’t want to do replication and scale-out by yourself
 You want ACID transactions
 You want to have tunable consistency
 You want to do rapid development where models can evolve
 You want to utilize your .NET, JavaScript and MongoDB skills
 Compared to relational databases
 You don’t want predefined columns
 Compared to other document stores
 You want to use a SQL-like grammar
8
WHEN TO NOT USE DOCUMENTDB?
 If your data has complex relationships
 If your data has rigid schemas
 If your data has complex transactions
 If your data needs aggregation
 If your data needs encrypted storage
 If you’re planning to move your entire data store to DocumentDB
 If you do not want your data to be locked into Azure
9
DOCUMENTDB USE CASES
 User generated content
 Blog posts, chat sessions, ratings, comments, feedback, polls
 Catalog data
 User accounts, product catalogs, device registries for IoT
 Logging and Time-series data
 Event logs, input source for data analytics jobs performed offline
 Gaming
 In-game stats, social media integration, and high-score leaderboards
 User preferences data
 Modern web and mobile applications
 IoT and Device sensor data
 Ingest bursts of data from device sensors, ad-hoc querying and offline analytics
10
RESOURCE MODEL
11
JS
JS
JS
101
010
RESOURCE MODEL
JS
JS
JS
101
010
RESOURCE MODEL
JS
JS
JS
101
010
* collection != table of homogenous entities
collection ~ a data partition
RESOURCE MODEL
14
JS
JS
JS
101
010
{
"id" : "123"
"name" : "joe"
"age" : 30
"address" : {
"street" : "some st"
}
}
RESOURCE MODEL
15
JS
JS
JS
101
010
RESOURCE ADDRESSING
 Native REST Interface
 Each resource has a permanent unique ID
 API URL:
 https://{database account}.documents.azure.com
 Document Path:
 /dbs/{database id}/colls/{collection id}/docs/{document id}
16
DOCUMENTDB JSON DOCUMENTS
JSON
 Intersection of most
modern type systems
JSON values
 Self-describable,
self-contained values
 Are trivially serialized
to/from text
17
{
"locations":
[
{"country": "Germany", "city": "Berlin"},
{"country": "France", "city": "Paris"},
],
"headquarters": "Belgium",
"exports":[{"city"; "Moscow"},{"city: "Athens"}]
};
a JSON document, as a tree
Locations
Headquarte
rs
Belgium
Country City Country City
Germany Berlin France Paris
Exports
CityCity
Moscow Athens
0 10 1
DATA MODELING WITH RDBMS
18
Doing it the RDBMS way: normalize everything!
To query for Person joins are needed to related
tables:
SELECT p.name, p.lastName, p.age, cd.detail,
cdt.type, a.street, a.city, a.state, a.zip
FROM Person p
INNER JOIN Address a
ON a.person_id = p.id
INNER JOIN ContactDetail cd
ON cd.person_id = p.id
INNER JOIN ContactDetailType cdt
ON cd.type_id = cdt.id
multiple
table updates
DATA MODELING WITH
DENORMALIZATION
19
{
"id": "1",
"firstName": "Thomas",
"lastName": "Andersen",
"addresses": [
{
"line1": "100 Some Street",
"line2": "Unit 1",
"city": "Seattle",
"state": "WA",
"zip": 98012 }
],
"contactDetails": [
{"email: "thomas@andersen.com"},
{"phone": "+1 555 555-5555", "extension": 5555}
]
}
Try to model your entity as a self-
contained document
Generally, use embedded data models
when:
 contains
 one-to-few
 changes infrequently
 won’t grow
 integral
better read
performance
DATA MODELING WITH
REFERENCING
20
In general, use normalized
data models when:
 Write performance is more
important than read
performance
 Representing one-to-many
relationships
 Can representing many-to-many
relationships
 Related data changes frequently
Provides more flexibility than
embedding
More round trips to read data
{
"id": "xyz",
"username: "user xyz"
}
{
"id": "address_xyz",
"userid": "xyz",
"address" : {
…
}
}
{
"id: "contact_xyz",
"userid": "xyz",
"email" : "user@user.com"
"phone" : "555 5555"
}
Normalizing typically provides better write performance
HYBRID MODELS: DENORMALIZE +
REFERENCE
21
No magic bullet!
Think about how your data is
going to be written and read
then model accordingly
{
"id": "1",
"firstName": "Thomas",
"lastName": "Andersen",
"countOfBooks": 3,
"books": [1, 2, 3],
"images": [
{"thumbnail": "http://....png"}
{"profile": "http://....png"}
]
}
{
"id": 1,
"name": "DocumentDB 101",
"authors": [
{"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
{"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
]
}
Author document
Book document
DATA MODELLING TIPS
 Map properties to JSON types
 Prefer smaller documents (<16KB) for smaller footprint, less IO,
lower RU charges
 Maximum size is 512KB – watch unbounded arrays leading to
document bloat
 Store metadata on attachments, reference binary data/free text as
external links
 Prefer sparse properties – skip rather than explicit null
 Use fullName = "Azure DocumentDB" instead of firstName =
"Azure" AND lastName = "DocumentDB"
22
TUNABLE CONSISTENCY
 Set at the account level
 Can be overridden at the query level
 Levels:
 Strong
 Session (default option)
 Bounded Staleness
 Eventual
23
Strong consistency; slow write
speeds
Weak consistency; fast write
speeds
INDEXING
 Automatic indexing of documents and its properties when added
to the collection
 Instantly queryable by property using a SQL-like grammar
 No need to define secondary indices / schema hints for indexing
24
Indexing Modes
Consistent
 Default mode
 Index updated
synchronously on
writes
Lazy
 Useful for bulk
ingestion scenarios
Indexing Policies
Automatic
 Default
Manual
 Can manually opt-
out of automatic
indexing
Indexing Types
Hash
 For equality queries
 Strings and
numbers
Range
 For comparison
queries
 Numbers
INDEXING POLICIES
25
Configuration Level Options
Automatic Per collection True (default) or False
Override with each document write
Indexing Mode Per collection Consistent or Lazy
Lazy for eventual updates/bulk ingestion
Included and excluded
paths
Per path Individual path or recursive includes (? And *)
Indexing Type Per path Support Hash (Default) and Range
Hash for equality, range for range queries
Indexing Precision Per path Supports 3 – 7 per path
Tradeoff storage, query RUs and write RUs
DOCUMENTDB FOR DEVELOPERS
 Promotes code-first development
 Resilient to iterative schema changes
 Low impedance as object / JSON store; no ORM required
 Richer query and indexing
 Has a REST API
 Available SDKs and libraries:
 .NET (LINQ to SQL is supported)
 Node.js
 JavaScript
 Python
 Java
 JavaScript for server-side app logic
26
QUERYING LIMITATION
 Within a collection
 Besides filtering, ORDER BY and TOP is supported
 No aggregation yet
 No COUNT
 No GROUP BY
 No SUM, AVG, etc.
SQL for queries only
 No batch UPDATE or DELETE or CREATE
27
DEMO TIME!
28
REQUEST UNITS
 DocumentDB unit of scale
 Throughput (in terms of rate of transactions / second)
 Measured in Request Units (RUs)
 1 RU = throughput for a 1KB document/second
 2,000 requests per second allowed
 “Request” depends on the size of the document
 For example, uploading 1,000 large JSON documents might count as more than
one request
 Max throughput per collection, measured in RUs per second per
collection, is 250,000 RUs/second
29
REQUEST UNITS
30
Request Unit (RU) is
the normalized
currency
%
Memory
% IOPS
% CPU
Replica gets a fixed
budget of Request Units
Resource
Resource
set
Resource
Resource
DocumentsSQL
sprocsargs
Resource Resource
Predictable Performance
NOT ALL REQUEST UNITS ARE
CREATED EQUALLY
31
PRICING
 Standard pricing tier with
hourly billing
 99.95% availability
 Adjustable performance
levels
 Collections have 10 GB SSD
room
 Limit of 100 collections (1
TB) for each account – can be
adjusted
 http://bit.do/documentdb-
pricing
32
LIMITATIONS & QUOTA
33
Entity Quota
Accounts 5 (soft)
DBs / Account 100
Document storage per collection 250 GB
Collections / DB 100 (soft)
Request document size 512 KB
Permissions / Account 2M
Stored Procedures, Triggers & UDFs / collection 25
Max Execution Time / Stored Procedure or Trigger 5 seconds
ID Length 255 chars
AND, OR / query 20
https://azure.microsoft.com/en-
us/documentation/articles/documentdb-limits/
SUMMARY
 Collections != Tables
 De-normalize data where appropriate
 Tuning / Performance
 Consistency Levels
 Indexing Policies
 Understand Query Costs / Limits / Avoid Scans
34
DESIGNING A DOCUMENTDB APP
1.
2.
3.
4.
5.
6.





35
RESOURCES
 Query Playground: aka.ms/docdbplayground
 Data Import Tool: aka.ms/docdbimport
 Docs & Tutorials: aka.ms/documentdb-docs
 Code Samples: aka.ms/documentdb-samples
 Cheat Sheet: aka.ms/docdbcheatsheet
 Blog: aka.ms/documentdb-blog
 Twitter: @documentdb
36
QUESTIONS?
37
@KenCenerelli
Ken_Cenerelli@Outlook.
com
Please complete the session evaluation to win
prizes!
CLD101: NoSQL, No Problem: Use Azure DocumentDB
38Credit:

Mais conteúdo relacionado

Mais procurados

SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsMike Broberg
 
Data Platform Overview
Data Platform OverviewData Platform Overview
Data Platform OverviewHamid J. Fard
 
Document Databases & RavenDB
Document Databases & RavenDBDocument Databases & RavenDB
Document Databases & RavenDBBrian Ritchie
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2Raul Chong
 
Analyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BIAnalyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BISriram Hariharan
 
Tokyo azure meetup #2 big data made easy
Tokyo azure meetup #2   big data made easyTokyo azure meetup #2   big data made easy
Tokyo azure meetup #2 big data made easyTokyo Azure Meetup
 
Odessa .net-user-group-sql-server-2019-hidden-gems by Denis Reznik
Odessa .net-user-group-sql-server-2019-hidden-gems by Denis ReznikOdessa .net-user-group-sql-server-2019-hidden-gems by Denis Reznik
Odessa .net-user-group-sql-server-2019-hidden-gems by Denis ReznikAlex Tumanoff
 
Cloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamCloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamRomeo Kienzler
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeTom Kerkhove
 
SharePoint 2013 APIs
SharePoint 2013 APIsSharePoint 2013 APIs
SharePoint 2013 APIsJohn Calvert
 
ECS 19 Anil Erduran - simplifying microsoft architectures with aws services
ECS 19 Anil Erduran - simplifying microsoft architectures with aws servicesECS 19 Anil Erduran - simplifying microsoft architectures with aws services
ECS 19 Anil Erduran - simplifying microsoft architectures with aws servicesEuropean Collaboration Summit
 
Thinking in a document centric world with RavenDB by Nick Josevski
Thinking in a document centric world with RavenDB by Nick JosevskiThinking in a document centric world with RavenDB by Nick Josevski
Thinking in a document centric world with RavenDB by Nick JosevskiNick Josevski
 
Azure Data Factory presentation with links
Azure Data Factory presentation with linksAzure Data Factory presentation with links
Azure Data Factory presentation with linksChris Testa-O'Neill
 
Deep Dive into Azure Data Factory v2
Deep Dive into Azure Data Factory v2Deep Dive into Azure Data Factory v2
Deep Dive into Azure Data Factory v2Eric Bragas
 
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...NoSQLmatters
 

Mais procurados (20)

SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 Questions
 
Data Platform Overview
Data Platform OverviewData Platform Overview
Data Platform Overview
 
Document Databases & RavenDB
Document Databases & RavenDBDocument Databases & RavenDB
Document Databases & RavenDB
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 
Introduction to RavenDB
Introduction to RavenDBIntroduction to RavenDB
Introduction to RavenDB
 
Analyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BIAnalyze and visualize non-relational data with DocumentDB + Power BI
Analyze and visualize non-relational data with DocumentDB + Power BI
 
Tokyo azure meetup #2 big data made easy
Tokyo azure meetup #2   big data made easyTokyo azure meetup #2   big data made easy
Tokyo azure meetup #2 big data made easy
 
Odessa .net-user-group-sql-server-2019-hidden-gems by Denis Reznik
Odessa .net-user-group-sql-server-2019-hidden-gems by Denis ReznikOdessa .net-user-group-sql-server-2019-hidden-gems by Denis Reznik
Odessa .net-user-group-sql-server-2019-hidden-gems by Denis Reznik
 
Cloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamCloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa Neddam
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data Lake
 
Serverless SQL
Serverless SQLServerless SQL
Serverless SQL
 
SharePoint 2013 APIs
SharePoint 2013 APIsSharePoint 2013 APIs
SharePoint 2013 APIs
 
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
 
ECS 19 Anil Erduran - simplifying microsoft architectures with aws services
ECS 19 Anil Erduran - simplifying microsoft architectures with aws servicesECS 19 Anil Erduran - simplifying microsoft architectures with aws services
ECS 19 Anil Erduran - simplifying microsoft architectures with aws services
 
R in Power BI
R in Power BIR in Power BI
R in Power BI
 
Thinking in a document centric world with RavenDB by Nick Josevski
Thinking in a document centric world with RavenDB by Nick JosevskiThinking in a document centric world with RavenDB by Nick Josevski
Thinking in a document centric world with RavenDB by Nick Josevski
 
Azure Data Factory presentation with links
Azure Data Factory presentation with linksAzure Data Factory presentation with links
Azure Data Factory presentation with links
 
Oracle application container cloud back end integration using node final
Oracle application container cloud back end integration using node finalOracle application container cloud back end integration using node final
Oracle application container cloud back end integration using node final
 
Deep Dive into Azure Data Factory v2
Deep Dive into Azure Data Factory v2Deep Dive into Azure Data Factory v2
Deep Dive into Azure Data Factory v2
 
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
 

Semelhante a No SQL, No Problem: Use Azure DocumentDB

[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features
[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features
[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced FeaturesAndrew Liu
 
Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Maxime Beugnet
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureMark Kromer
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMark Kromer
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)Michael Rys
 
Technological insights behind Clusterpoint database
Technological insights behind Clusterpoint databaseTechnological insights behind Clusterpoint database
Technological insights behind Clusterpoint databaseClusterpoint
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDBNick Court
 
Semi Formal Model for Document Oriented Databases
Semi Formal Model for Document Oriented DatabasesSemi Formal Model for Document Oriented Databases
Semi Formal Model for Document Oriented DatabasesDaniel Coupal
 
Getting Started with NoSQL
Getting Started with NoSQLGetting Started with NoSQL
Getting Started with NoSQLAaron Benton
 
Big Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI MobileBig Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI MobileRoy Kim
 
Big data technology unit 3
Big data technology unit 3Big data technology unit 3
Big data technology unit 3RojaT4
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaperRajesh Kumar
 
MIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome MeasuresMIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome MeasuresSteven Johnson
 
February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBAmazon Web Services
 
3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sqlŁukasz Grala
 
SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms Andrew Brust
 
SQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George GrammatikosSQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George GrammatikosGeorge Grammatikos
 

Semelhante a No SQL, No Problem: Use Azure DocumentDB (20)

[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features
[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features
[PASS Summit 2016] Azure DocumentDB: A Deep Dive into Advanced Features
 
Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)
 
Technological insights behind Clusterpoint database
Technological insights behind Clusterpoint databaseTechnological insights behind Clusterpoint database
Technological insights behind Clusterpoint database
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDB
 
Semi Formal Model for Document Oriented Databases
Semi Formal Model for Document Oriented DatabasesSemi Formal Model for Document Oriented Databases
Semi Formal Model for Document Oriented Databases
 
Getting Started with NoSQL
Getting Started with NoSQLGetting Started with NoSQL
Getting Started with NoSQL
 
Big Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI MobileBig Data Analytics from Azure Cloud to Power BI Mobile
Big Data Analytics from Azure Cloud to Power BI Mobile
 
Big data technology unit 3
Big data technology unit 3Big data technology unit 3
Big data technology unit 3
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
 
MIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome MeasuresMIS5101 WK10 Outcome Measures
MIS5101 WK10 Outcome Measures
 
February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDB
 
3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql
 
SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms SQL Server Denali: BI on Your Terms
SQL Server Denali: BI on Your Terms
 
Presentation
PresentationPresentation
Presentation
 
Nosql
NosqlNosql
Nosql
 
SQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George GrammatikosSQL or NoSQL, is this the question? - George Grammatikos
SQL or NoSQL, is this the question? - George Grammatikos
 

Mais de Ken Cenerelli

ASP.NET Core deployment options
ASP.NET Core deployment optionsASP.NET Core deployment options
ASP.NET Core deployment optionsKen Cenerelli
 
Azure app service to create web and mobile apps
Azure app service to create web and mobile appsAzure app service to create web and mobile apps
Azure app service to create web and mobile appsKen Cenerelli
 
ASP.NET Core: The best of the new bits
ASP.NET Core: The best of the new bitsASP.NET Core: The best of the new bits
ASP.NET Core: The best of the new bitsKen Cenerelli
 
Analyze Your Code With Visual Studio 2015 Diagnostic Tools
Analyze Your Code With Visual Studio 2015 Diagnostic ToolsAnalyze Your Code With Visual Studio 2015 Diagnostic Tools
Analyze Your Code With Visual Studio 2015 Diagnostic ToolsKen Cenerelli
 
Building high performance software with Microsoft Application Insights
Building high performance software with Microsoft Application InsightsBuilding high performance software with Microsoft Application Insights
Building high performance software with Microsoft Application InsightsKen Cenerelli
 
An Introduction to Universal Windows Apps
An Introduction to Universal Windows AppsAn Introduction to Universal Windows Apps
An Introduction to Universal Windows Apps Ken Cenerelli
 
Build end-to-end video experiences with Azure Media Services
Build end-to-end video experiences with Azure Media ServicesBuild end-to-end video experiences with Azure Media Services
Build end-to-end video experiences with Azure Media ServicesKen Cenerelli
 
Cloud Powered Mobile Apps with Azure
Cloud Powered Mobile Apps with AzureCloud Powered Mobile Apps with Azure
Cloud Powered Mobile Apps with AzureKen Cenerelli
 
Building Windows 8.1 Apps with Mobile Services
Building Windows 8.1 Apps with Mobile ServicesBuilding Windows 8.1 Apps with Mobile Services
Building Windows 8.1 Apps with Mobile ServicesKen Cenerelli
 
Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...
Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...
Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...Ken Cenerelli
 
Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...
Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...
Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...Ken Cenerelli
 
An Introduction to Windows Phone 7 Development
An Introduction to Windows Phone 7 DevelopmentAn Introduction to Windows Phone 7 Development
An Introduction to Windows Phone 7 DevelopmentKen Cenerelli
 
Introduction To Umbraco
Introduction To UmbracoIntroduction To Umbraco
Introduction To UmbracoKen Cenerelli
 

Mais de Ken Cenerelli (14)

ASP.NET Core deployment options
ASP.NET Core deployment optionsASP.NET Core deployment options
ASP.NET Core deployment options
 
Azure app service to create web and mobile apps
Azure app service to create web and mobile appsAzure app service to create web and mobile apps
Azure app service to create web and mobile apps
 
ASP.NET Core: The best of the new bits
ASP.NET Core: The best of the new bitsASP.NET Core: The best of the new bits
ASP.NET Core: The best of the new bits
 
Analyze Your Code With Visual Studio 2015 Diagnostic Tools
Analyze Your Code With Visual Studio 2015 Diagnostic ToolsAnalyze Your Code With Visual Studio 2015 Diagnostic Tools
Analyze Your Code With Visual Studio 2015 Diagnostic Tools
 
Azure Data Storage
Azure Data StorageAzure Data Storage
Azure Data Storage
 
Building high performance software with Microsoft Application Insights
Building high performance software with Microsoft Application InsightsBuilding high performance software with Microsoft Application Insights
Building high performance software with Microsoft Application Insights
 
An Introduction to Universal Windows Apps
An Introduction to Universal Windows AppsAn Introduction to Universal Windows Apps
An Introduction to Universal Windows Apps
 
Build end-to-end video experiences with Azure Media Services
Build end-to-end video experiences with Azure Media ServicesBuild end-to-end video experiences with Azure Media Services
Build end-to-end video experiences with Azure Media Services
 
Cloud Powered Mobile Apps with Azure
Cloud Powered Mobile Apps with AzureCloud Powered Mobile Apps with Azure
Cloud Powered Mobile Apps with Azure
 
Building Windows 8.1 Apps with Mobile Services
Building Windows 8.1 Apps with Mobile ServicesBuilding Windows 8.1 Apps with Mobile Services
Building Windows 8.1 Apps with Mobile Services
 
Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...
Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...
Maximizing code reuse between Windows Phone 8 and Windows 8 (That Conference ...
 
Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...
Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...
Maximizing code reuse between Windows Phone 8 and Windows 8 (DevTeach Toronto...
 
An Introduction to Windows Phone 7 Development
An Introduction to Windows Phone 7 DevelopmentAn Introduction to Windows Phone 7 Development
An Introduction to Windows Phone 7 Development
 
Introduction To Umbraco
Introduction To UmbracoIntroduction To Umbraco
Introduction To Umbraco
 

Último

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Último (20)

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

No SQL, No Problem: Use Azure DocumentDB

  • 1. NoSQL, NO PROBLEM: USING AZURE DOCUMENTDB { "name": "Ken Cenerelli", "twitter": "@KenCenerelli", "e-mail": "Ken_Cenerelli@Outlook.com", "hashtags": ["#DevTeach", "#DocumentDB"] }
  • 2. ABOUT ME Twitter: @KenCenerelli Email: Ken_Cenerelli@Outlook.com Blog: kencenerelli.wordpress.com LinkedI n: linkedin.com/in/kencenerelli Bio:  Content Developer / Programmer Writer  Microsoft MVP - Visual Studio and Development Technologies  Microsoft TechNet Wiki Guru  Co-Organizer of CTTDNUG  Technical reviewer of multiple booksCTTDNU G Ken Cenerelli 2
  • 3. ROAD MAP 1. Overview 2. The Resource Model 3. Modeling Your Data 4. Performance 5. Developing with DocumentDB & Demos {“aka”: “The Good Stuff”} 6. Pricing 7. Wrap-up 3
  • 4. WHAT IS NoSQL?  NoSQL → Not Only SQL  No up-front (schema) design  Easier to scale horizontally  Easier to develop iteratively  Types & Examples:  Document databases: DocumentDB, MongoDB, CouchDB  Key-value stores: Redis  Graph stores: Neo4J, Giraph  Wide-column: Cassandra, HBase 4
  • 5. WHAT IS AZURE DOCUMENTDB?  NoSQL document database fully managed by Microsoft Azure  Part of the NoSQL family of databases  For rapid development of cloud-designed apps (web, mobile, gaming, IoT)  Store and query schema agnostic JSON data with SQL-like grammar  Fast, predictable performance  Transactionally process multiple documents via native JavaScript processing  Tunable consistency levels  Built with familiar tools – REST, JSON, JavaScript 5
  • 6. 6
  • 7. WHERE DOES IT FIT IN THE AZURE FAMILY? 7
  • 8. WHEN TO USE DOCUMENTDB?  In General  You don’t want to do replication and scale-out by yourself  You want ACID transactions  You want to have tunable consistency  You want to do rapid development where models can evolve  You want to utilize your .NET, JavaScript and MongoDB skills  Compared to relational databases  You don’t want predefined columns  Compared to other document stores  You want to use a SQL-like grammar 8
  • 9. WHEN TO NOT USE DOCUMENTDB?  If your data has complex relationships  If your data has rigid schemas  If your data has complex transactions  If your data needs aggregation  If your data needs encrypted storage  If you’re planning to move your entire data store to DocumentDB  If you do not want your data to be locked into Azure 9
  • 10. DOCUMENTDB USE CASES  User generated content  Blog posts, chat sessions, ratings, comments, feedback, polls  Catalog data  User accounts, product catalogs, device registries for IoT  Logging and Time-series data  Event logs, input source for data analytics jobs performed offline  Gaming  In-game stats, social media integration, and high-score leaderboards  User preferences data  Modern web and mobile applications  IoT and Device sensor data  Ingest bursts of data from device sensors, ad-hoc querying and offline analytics 10
  • 13. RESOURCE MODEL JS JS JS 101 010 * collection != table of homogenous entities collection ~ a data partition
  • 14. RESOURCE MODEL 14 JS JS JS 101 010 { "id" : "123" "name" : "joe" "age" : 30 "address" : { "street" : "some st" } }
  • 16. RESOURCE ADDRESSING  Native REST Interface  Each resource has a permanent unique ID  API URL:  https://{database account}.documents.azure.com  Document Path:  /dbs/{database id}/colls/{collection id}/docs/{document id} 16
  • 17. DOCUMENTDB JSON DOCUMENTS JSON  Intersection of most modern type systems JSON values  Self-describable, self-contained values  Are trivially serialized to/from text 17 { "locations": [ {"country": "Germany", "city": "Berlin"}, {"country": "France", "city": "Paris"}, ], "headquarters": "Belgium", "exports":[{"city"; "Moscow"},{"city: "Athens"}] }; a JSON document, as a tree Locations Headquarte rs Belgium Country City Country City Germany Berlin France Paris Exports CityCity Moscow Athens 0 10 1
  • 18. DATA MODELING WITH RDBMS 18 Doing it the RDBMS way: normalize everything! To query for Person joins are needed to related tables: SELECT p.name, p.lastName, p.age, cd.detail, cdt.type, a.street, a.city, a.state, a.zip FROM Person p INNER JOIN Address a ON a.person_id = p.id INNER JOIN ContactDetail cd ON cd.person_id = p.id INNER JOIN ContactDetailType cdt ON cd.type_id = cdt.id multiple table updates
  • 19. DATA MODELING WITH DENORMALIZATION 19 { "id": "1", "firstName": "Thomas", "lastName": "Andersen", "addresses": [ { "line1": "100 Some Street", "line2": "Unit 1", "city": "Seattle", "state": "WA", "zip": 98012 } ], "contactDetails": [ {"email: "thomas@andersen.com"}, {"phone": "+1 555 555-5555", "extension": 5555} ] } Try to model your entity as a self- contained document Generally, use embedded data models when:  contains  one-to-few  changes infrequently  won’t grow  integral better read performance
  • 20. DATA MODELING WITH REFERENCING 20 In general, use normalized data models when:  Write performance is more important than read performance  Representing one-to-many relationships  Can representing many-to-many relationships  Related data changes frequently Provides more flexibility than embedding More round trips to read data { "id": "xyz", "username: "user xyz" } { "id": "address_xyz", "userid": "xyz", "address" : { … } } { "id: "contact_xyz", "userid": "xyz", "email" : "user@user.com" "phone" : "555 5555" } Normalizing typically provides better write performance
  • 21. HYBRID MODELS: DENORMALIZE + REFERENCE 21 No magic bullet! Think about how your data is going to be written and read then model accordingly { "id": "1", "firstName": "Thomas", "lastName": "Andersen", "countOfBooks": 3, "books": [1, 2, 3], "images": [ {"thumbnail": "http://....png"} {"profile": "http://....png"} ] } { "id": 1, "name": "DocumentDB 101", "authors": [ {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"}, {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"} ] } Author document Book document
  • 22. DATA MODELLING TIPS  Map properties to JSON types  Prefer smaller documents (<16KB) for smaller footprint, less IO, lower RU charges  Maximum size is 512KB – watch unbounded arrays leading to document bloat  Store metadata on attachments, reference binary data/free text as external links  Prefer sparse properties – skip rather than explicit null  Use fullName = "Azure DocumentDB" instead of firstName = "Azure" AND lastName = "DocumentDB" 22
  • 23. TUNABLE CONSISTENCY  Set at the account level  Can be overridden at the query level  Levels:  Strong  Session (default option)  Bounded Staleness  Eventual 23 Strong consistency; slow write speeds Weak consistency; fast write speeds
  • 24. INDEXING  Automatic indexing of documents and its properties when added to the collection  Instantly queryable by property using a SQL-like grammar  No need to define secondary indices / schema hints for indexing 24 Indexing Modes Consistent  Default mode  Index updated synchronously on writes Lazy  Useful for bulk ingestion scenarios Indexing Policies Automatic  Default Manual  Can manually opt- out of automatic indexing Indexing Types Hash  For equality queries  Strings and numbers Range  For comparison queries  Numbers
  • 25. INDEXING POLICIES 25 Configuration Level Options Automatic Per collection True (default) or False Override with each document write Indexing Mode Per collection Consistent or Lazy Lazy for eventual updates/bulk ingestion Included and excluded paths Per path Individual path or recursive includes (? And *) Indexing Type Per path Support Hash (Default) and Range Hash for equality, range for range queries Indexing Precision Per path Supports 3 – 7 per path Tradeoff storage, query RUs and write RUs
  • 26. DOCUMENTDB FOR DEVELOPERS  Promotes code-first development  Resilient to iterative schema changes  Low impedance as object / JSON store; no ORM required  Richer query and indexing  Has a REST API  Available SDKs and libraries:  .NET (LINQ to SQL is supported)  Node.js  JavaScript  Python  Java  JavaScript for server-side app logic 26
  • 27. QUERYING LIMITATION  Within a collection  Besides filtering, ORDER BY and TOP is supported  No aggregation yet  No COUNT  No GROUP BY  No SUM, AVG, etc. SQL for queries only  No batch UPDATE or DELETE or CREATE 27
  • 29. REQUEST UNITS  DocumentDB unit of scale  Throughput (in terms of rate of transactions / second)  Measured in Request Units (RUs)  1 RU = throughput for a 1KB document/second  2,000 requests per second allowed  “Request” depends on the size of the document  For example, uploading 1,000 large JSON documents might count as more than one request  Max throughput per collection, measured in RUs per second per collection, is 250,000 RUs/second 29
  • 30. REQUEST UNITS 30 Request Unit (RU) is the normalized currency % Memory % IOPS % CPU Replica gets a fixed budget of Request Units Resource Resource set Resource Resource DocumentsSQL sprocsargs Resource Resource Predictable Performance
  • 31. NOT ALL REQUEST UNITS ARE CREATED EQUALLY 31
  • 32. PRICING  Standard pricing tier with hourly billing  99.95% availability  Adjustable performance levels  Collections have 10 GB SSD room  Limit of 100 collections (1 TB) for each account – can be adjusted  http://bit.do/documentdb- pricing 32
  • 33. LIMITATIONS & QUOTA 33 Entity Quota Accounts 5 (soft) DBs / Account 100 Document storage per collection 250 GB Collections / DB 100 (soft) Request document size 512 KB Permissions / Account 2M Stored Procedures, Triggers & UDFs / collection 25 Max Execution Time / Stored Procedure or Trigger 5 seconds ID Length 255 chars AND, OR / query 20 https://azure.microsoft.com/en- us/documentation/articles/documentdb-limits/
  • 34. SUMMARY  Collections != Tables  De-normalize data where appropriate  Tuning / Performance  Consistency Levels  Indexing Policies  Understand Query Costs / Limits / Avoid Scans 34
  • 35. DESIGNING A DOCUMENTDB APP 1. 2. 3. 4. 5. 6.      35
  • 36. RESOURCES  Query Playground: aka.ms/docdbplayground  Data Import Tool: aka.ms/docdbimport  Docs & Tutorials: aka.ms/documentdb-docs  Code Samples: aka.ms/documentdb-samples  Cheat Sheet: aka.ms/docdbcheatsheet  Blog: aka.ms/documentdb-blog  Twitter: @documentdb 36
  • 37. QUESTIONS? 37 @KenCenerelli Ken_Cenerelli@Outlook. com Please complete the session evaluation to win prizes! CLD101: NoSQL, No Problem: Use Azure DocumentDB

Notas do Editor

  1. Document databases pair each key with a complex data structure known as a document. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents. Mongo, Azure DocumentDB. Work best with hierarchical documents that are entirely or almost entirely self contained. Graph stores are used to store information about networks of data, such as social connections. Graph stores include Neo4J and Giraph. Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or 'key'), together with its value. Examples of key-value stores are Riak and Berkeley DB. Some key-value stores, such as Redis, allow each value to have a type, such as 'integer', which adds functionality. Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows.
  2. In DocumentDB, persist whole object in DB ; query the whole object and display it Service: no need for containers or VMs to run it; provision the service; has fine grain control b/c of this Indexing: builds the structure to start right away; can shape the data; no need to spend time on ERD or large schemas to persist data; indexes evolve as models do DocumentDB database can grow to hundred of terabytes or event petabytes; thousands of nodes in data centers
  3. Table Storage (Key Value) Azure Blob Storage can be used to store full user profiles including images. Azure Tables is cheap, scalable No SQL solution Good for key value patterns to be scored at scale Runs on spinning disk and has higher latency when getting to 95th percentile But no secondary indexes either
  4. Mongo protocol support now supports Mongo drivers Can leverage existing skills and tools to interact with a DocumentDB service This supports the native MongoDB wire protocol
  5. Complex relationships – obvious (denormalized data and flex schema) Rigid schema – for instance if your data was imported from XML docs Complex transactions – supports some but can’t cross boundaries Aggregation – since some properties could be present or not, it’s not a good fit for data aggregation Encrypted data storage – protocol is encrypted, data is not. No current in-built mechanisms for encrypted data storage Moving entire datastore to Azure DocumentDB – use as supplement not replacement Azure specific – although adding protocol support for MongoDB
  6. UGC in social media applications is a blend of free form text, properties, tags and relationships not bounded by rigid structure. Content such as chats, comments, and posts can be stored in DocumentDB without requiring transformations or complex object to relational mapping layers. Data properties can be added or modified easily to match requirements as developers iterate over the application code, thus promoting rapid development. Catalog Data - Attributes for this data may vary and can change over time to fit application requirements. Consider an example of a product catalog for an automotive parts supplier. Every part may have its own attributes in addition to the common attributes that all parts share. Furthermore, attributes for a specific part can change the following year when a new model is released. As a JSON document store, DocumentDB supports flexible schemas and allows you to represent data with nested properties, and thus it is well suited for storing product catalog data. Logging - Perform ad-hoc queries over a subset of data for troubleshooting. Subset of data is first retrieved from the logs, typically by time series. Then, a drill-down is performed by filtering the dataset with error levels or error messages. Long running data analytics jobs performed offline over a large volume of log data. Examples of this use case include server availability analysis, application error analysis, and clickstream data analysis. Typically, Hadoop is used to perform these types of analyses with data from DocumentDB using Hadoop connector Gaming - Handle updating profile and stats from millions of simultaneous gamers, millisecond reads and writes to help avoid any lags, automatic indexing allows for filtering against multiple different properties in real-time User preferences data - most modern web and mobile applications come with complex views and experiences. These views and experiences are usually dynamic, catering to user preferences or moods and branding needs. Hence, applications need to be able to retrieve personalized settings effectively in order to render UI elements and experiences quickly. IoT - Bursts of data can be ingested by Azure Event Hubs as it offers high throughput data ingestion with low latency. Data ingested that needs to be processed for real time insight can be funneled to Azure Stream Analytics for real time analytics. Data can be loaded into DocumentDB for ad-hoc querying. Once the data is loaded into DocumentDB, the data is ready to be queried. The data in DocumentDB can be used as reference data as part of real time analytics. In addition, data can further be refined and processed by connecting DocumentDB data to HDInsight for Pig, Hive or Map/Reduce jobs. Refined data is then loaded back to DocumentDB for reporting.
  7. Provides keys and resource location for databases and contents
  8. Container of JSON documents and the associated JavaScript application logic (stored proc, trigger, UDF) Collections are isolated and independent from one another. Queries run against a single collection and returns documents from that collection
  9. Read consistency level of documents follows the consistency policy on the database account
  10. DocumentDB cannot have poor performing code affect the server so it uses bounded execution to limit how long code can run
  11. Requires multiple reads Typically provides faster write speeds
  12. Data from entities are queried together Only making one round trip to server for all the data
  13. One to many relationships (unbounded) Blog article comments array could have millions of entries Many-to-many relationships For example, one speaker could have multiple sessions and each session could have multiple speakers Related data changes with differing volatility Document does not change but document with views or likes does
  14. Use Amazon/Best Buy product example Can have embedded product info that also contains current review summary so don’t have to retrieve each time When you want the rating summary or review you have different query
  15. Trade-off between performance and consistency Strong: Strong consistency guarantees that a write is only visible after it is committed durably by the majority quorum of replicas. Strong consistency provides absolute guarantees on data consistency, but offers the lowest level of read and write performance Session: ability to read your own writes. A read request for session consistency is issued against a replica that can serve the client requested version (part of the session cookie). Session consistency provides predictable read data consistency for a session while offering the lowest latency writes. Reads are also low latency as except in the rare cases, the read will be served by a single replica. Bounded staleness: provides more predictable behavior for read consistency while offering the lowest latency writes. As reads are acknowledged by a majority quorum, read latency is not the lowest offered by the system. This provides a stronger guarantee than Session or Eventual. Eventual: weakest form of consistency wherein a client may get the values which are older than the ones it had seen before, over time. In the absence of any further writes, the replicas within the group will eventually converge. The read request is served by any secondary index. Eventual consistency provides the weakest read consistency but offers the lowest latency for both reads and writes.
  16. DocumentDB supports a RESTful protocol Any DocumentDB operation can be performed from any HTTP client as long as request URL points to valid DocumentDB resource and the headers contain the required authentication info
  17. Throughput = amount of items passing through a system
  18. Every request you do has a response that displays the RequestUnit charge whether action succeeds or fails Can see charge in QueryExplorer Writes are more expensive than reads
  19. Standard – pay as you go, allows partitioning, storage measured in GB, 400-250K RUs, 250GB storage max (can be increased) S1, S2, S3 – predefined. 10GB storage max, throughput max 2.5K Rus Can have mix of S1, S2 and S3 collections in a DB; can archive data in an S1 and keep active data in an S3 Once you create an account there is a Minimum S1 charge ($25) even with no collections created But when you create a collection you satisfy the charge and no longer have the $25 fee