Strongly Typed Languages and Flexible Schemas
Agenda
Strongly Typed Languages
Flexible Schema Databases
Change Management
Strategies
Tradeoffs
Strongly Typed Languages
"A programming language that requires a variable to be defined, as well as the type of variable it is"
Flexible Schema Databases
Traditional RDBMS
Table definition with its column structure:
create table users (id int primary key, firstname text, lastname text);
Traditional RDBMS
Table with checks (null checks, primary and foreign key checks):
create table cat_pictures(
    id int not null,
    size int not null,
    picture blob not null,
    user_id int,
    primary key (id),
    foreign key (user_id) references users(id));
Traditional RDBMS
users (1) to cat_pictures (N): one users row relates to many cat_pictures rows
Is this Flexible?
•  What happens when we need to change the schema?
–  Add new fields
–  Add new relations
–  Change data types
•  What happens when we need to scale out our data structure?
Flexible Schema Database
Document, Graph, Key Value
Flexible Schema
•  No mandatory schema definition
•  No structure restrictions
•  No schema validation process
We start from code
public class CatPicture {

    int size;
    byte[] blob;

}
public class User {

    int id;
    String firstname;
    String lastname;

    CatPicture[] cat_pictures;

}
Document Structure
Rich data types (BinData) and embedded documents (cat_pictures):
{
    _id: 1234,
    firstname: 'Juan',
    lastname: 'Olivo',
    cat_pictures: [ {
        size: 10,
        picture: BinData("0x133334299399299432")
    }
    ]
}
Flexible Schema Databases
•  Challenges
–  Different Versions of Documents
–  Different Structures of Documents
–  Different Value Types for Fields in Documents
Different Versions of Documents
The same document changes over time in how it represents data
First Version:
{ "_id" : 174, "firstname": "Juan" }
Second Version:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }
Third Version:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "cat_pictures":
[{"size": 10, "picture": BinData("0x133334299399299432")}]
}
Different Versions of Documents
The same document changes over time in how it represents data, sometimes with a different structure
{ "_id" : 174, "firstname": "Juan" }
{ "_id" : 174, "name": { "first": "Juan", "last": "Olivo"} }
Different Structures of Documents
Different documents coexisting within the same collection
{ "_id" : 175, "brand": "Ford", "model": "Mustang", "date": ISODate("XXX") }
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }
Different Data Types for Fields
Same field, different data type, across documents coexisting in the same collection
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "bdate": 1224234312}
{ "_id" : 175, "firstname": "Paco", "lastname": "Hernan", "bdate": "2015-06-27"}
{ "_id" : 176, "firstname": "Tomas", "lastname": "Marce", "bdate": ISODate("2015-06-27")}
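In a strongly typed language, a field that arrives with several data types has to be normalized before it can be assigned to one typed attribute. The following is a minimal sketch of such a tolerant reader, not something taken from the slides: the class name BirthDateReader is hypothetical, numeric values are assumed to be epoch milliseconds in UTC, and strings are assumed to use the yyyy-MM-dd format.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;

// Hypothetical helper, not part of any MongoDB driver or ODM.
class BirthDateReader {

    // Normalizes the three "bdate" representations into one Java type.
    static LocalDate toLocalDate(Object raw) {
        if (raw instanceof Number) {
            // Assumption: numeric values are epoch milliseconds, read in UTC.
            long millis = ((Number) raw).longValue();
            return Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).toLocalDate();
        }
        if (raw instanceof String) {
            // Assumption: strings use the ISO format yyyy-MM-dd.
            return LocalDate.parse((String) raw);
        }
        if (raw instanceof Date) {
            // The MongoDB Java driver hands BSON dates (ISODate) back as java.util.Date.
            return ((Date) raw).toInstant().atZone(ZoneOffset.UTC).toLocalDate();
        }
        throw new IllegalArgumentException("Unsupported bdate value: " + raw);
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(1435394461522L));           // 2015-06-27
        System.out.println(toLocalDate("2015-06-27"));             // 2015-06-27
        System.out.println(toLocalDate(new Date(1435394461522L))); // 2015-06-27
    }
}
```

Keeping the conversion in one place means the rest of the code base only ever sees a LocalDate, whichever version of the document was stored.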
Change Management
Versioning: how do we set the correct data format versioning?
Class Loading: what mechanisms are out there to make this work?
Strategies
•  Decoupling Architectures
•  ODMs
•  Versioning
•  Data Migrations
Decoupled Architectures
Strongly Coupled
Becomes a mess in your hair…
Coupled Architectures
[Diagram: Application A, Application B and Application C each connect directly to the Database. Application B: "Let me perform some schema changes!"]
Decoupled Architecture
[Diagram: Application A, Application B and Application C reach the Database through an API]
Decoupled Architectures
•  Allows the business logic to evolve independently of the data layer
•  Decouples the underlying storage / persistence option from the business service
•  Changes are "requested" and not imposed across all applications
•  Better versioning control of each request and its mapping
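The bullets above can be illustrated with a small sketch. It is only an illustration under stated assumptions: every name in it (DecoupledDemo, UserApi, InMemoryUserApi, UserView) is hypothetical, and a HashMap stands in for the database. The point is that applications call the API and receive one typed view, while the API alone knows that stored documents may use either "firstname" or the nested "name.first" shape.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: all type names are invented for illustration.
class DecoupledDemo {

    // The typed view that every application sees.
    static class UserView {
        final int id;
        final String firstName;

        UserView(int id, String firstName) {
            this.id = id;
            this.firstName = firstName;
        }
    }

    // The API boundary: applications depend on this, not on the storage layout.
    interface UserApi {
        UserView findUser(int id);
    }

    static class InMemoryUserApi implements UserApi {
        // Stand-in for a collection: _id -> stored document.
        private final Map<Integer, Map<String, Object>> storage = new HashMap<>();

        void store(int id, Map<String, Object> document) {
            storage.put(id, document);
        }

        @Override
        public UserView findUser(int id) {
            Map<String, Object> doc = storage.get(id);
            if (doc == null) {
                return null;
            }
            // Newer structure: { "name": { "first": ... } }
            Object name = doc.get("name");
            if (name instanceof Map) {
                return new UserView(id, (String) ((Map<?, ?>) name).get("first"));
            }
            // Older structure: { "firstname": ... }
            return new UserView(id, (String) doc.get("firstname"));
        }
    }

    public static void main(String[] args) {
        InMemoryUserApi api = new InMemoryUserApi();

        Map<String, Object> flat = new HashMap<>();
        flat.put("firstname", "Juan");
        api.store(174, flat);

        Map<String, Object> nestedName = new HashMap<>();
        nestedName.put("first", "Paco");
        Map<String, Object> nested = new HashMap<>();
        nested.put("name", nestedName);
        api.store(175, nested);

        System.out.println(api.findUser(174).firstName); // Juan
        System.out.println(api.findUser(175).firstName); // Paco
    }
}
```

A schema change then becomes a change inside the API implementation rather than something every application has to absorb at once.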
ODMs
•  Reduce the impedance mismatch between code and databases
•  Data management facilitator
•  Hide the complexity of operators
•  Try to decouple business complexity with "magic" recipes
Spring Data
•  POJO centric model
•  MongoTemplate || CrudRepository extensions to make the connection to the repositories
•  Uses annotations to override default field names and even data types (data type mapping)
public interface UserRepository extends MongoRepository<User, Integer> {
}
public class User {

    @Id
    int id;

    @Field("first_name")
    String firstname;
    String lastname;
    // remaining fields are cut off on the slide
}
Spring Data Document Structure
{
    "_id": 1,
    "first_name": "first",
    "lastname": "last",
    "catpictures": [
        {
            "size": 10,
            "blob": BinData(0, "Kr3AqmvV1R9TJQ==")
        }
    ]
}
Spring Data Considerations
•  Data formats, versions and types still need to be managed
•  Does not solve issues like type validation out of the box
•  Can make things more complicated but more "controllable"
    @Field("first_name")
    String firstname;
Morphia
•  Data source centric
•  Will do all the discovery of POJOs for a given package
•  Also uses annotations to perform overrides and deal with object mapping
@Entity("users")
public class User {
    @Id
    int id;
    String firstname;
    String lastname;
    // remaining fields are cut off on the slide
}
morphia.mapPackage("examples.odms.morphia.pojos");

Datastore datastore = morphia.createDatastore(new MongoClient(),
        "morphia_example");
datastore.save(user);
Morphia Document Structure
The className field stores the class definition:
{
    "_id": 1,
    "className": "examples.odms.morphia.pojos.User",
    "firstname": "first",
    "lastname": "last",
    "catpictures": [
        {
            "size": 10,
            "blob": BinData(0, "Kr3AqmvV1R9TJQ==")
        }
    ]
}
Morphia Considerations
•  Enables better control at class loading
•  Like Spring Data, also facilitates field overriding (tags to define field keys)
•  Better support for object polymorphism
Versioning
Versioning of data structures (especially documents) can be very helpful
•  Recreate documents over time
•  Flow Control
•  Data / Field Multiversion Requirements
•  Archiving and History Purposes
Versioning – Option 0
Change the existing document on each write, with a monotonically increasing version number inside
{ "_id" : 174, "v" : 1, "firstname": "Juan" }
{ "_id" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" }
{ "_id" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }
Increment the version field value on every update:
> db.users.update( {"_id": 174}, { "$set": { ... }, "$inc": { "v": 1 } } )
Versioning – Option 1
Store the full document on each write, with a monotonically increasing version number inside
{ "docId" : 174, "v" : 1, "firstname": "Juan" }
{ "docId" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" }
{ "docId" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }
> db.users.insert( {"docId": 174 … } )
Always find the latest version:
> db.users.find( {"docId": 174} ).sort( {"v": -1} ).limit(1)
Versioning – Option 2
Store all document versions inside a single document.
// Fields under "current" are set by dotted path: a $set on "current" itself would conflict with the $inc on "current.v"
> db.users.update( {"_id": 174}, {
    "$set": { "current.attr1": ... },
    "$inc": { "current.v": 1 },
    "$addToSet": { "prev": { ... } } } )
Current value in "current", previous values in "prev":
{ "_id" : 174, "current" : { "v" : 3, "attr1": 184, "attr2" : "A-1" },
"prev" : [
{ "v" : 1, "attr1": 165 },
{ "v" : 2, "attr1": 165, "attr2": "A-1" }
]
}
Versioning – Option 3
Keep one collection for the "current" version and another for past versions
> db.users.find( {"_id": 174} )
> db.users_past.find( {"pid": 174} )
Previous versions collection (users_past):
{ "pid" : 174, "v" : 1, "firstname": "Juan" }
{ "pid" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" }
Current collection (users):
{ "_id" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }
Versioning
Schema                       Fetch 1         Fetch Many       Update         Recover if Fail
0) Increment Version         Easy, Fast      Fast             Easy, Medium   N/A
1) New Document              Easy, Fast      Not Easy, Slow   Medium         Hard
2) Embedded in Single Doc    Easy, Fastest   Easy, Fastest    Medium         N/A
3) Separate Collection       Easy, Fastest   Easy, Fastest    Medium         Medium, Hard
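To make Option 3 (separate collection) concrete, here is a minimal in-memory sketch. It is an illustration only: the class VersionedStore is hypothetical and two maps stand in for the users and users_past collections; no database calls are involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of Option 3; the two maps stand in for the
// "users" (current) and "users_past" (previous versions) collections.
class VersionedStore {
    private final Map<Integer, Map<String, Object>> current = new HashMap<>();
    private final Map<Integer, List<Map<String, Object>>> past = new HashMap<>();

    // Writes a new version: archive the old current document, then replace it.
    void write(int id, Map<String, Object> fields) {
        Map<String, Object> previous = current.get(id);
        int nextVersion = 1;
        if (previous != null) {
            nextVersion = (Integer) previous.get("v") + 1;
            past.computeIfAbsent(id, k -> new ArrayList<>()).add(previous);
        }
        // Copy the fields so archived versions are not affected by later changes.
        Map<String, Object> doc = new HashMap<>(fields);
        doc.put("v", nextVersion);
        current.put(id, doc);
    }

    Map<String, Object> fetchCurrent(int id) {
        return current.get(id);
    }

    List<Map<String, Object>> fetchPast(int id) {
        return past.getOrDefault(id, new ArrayList<>());
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        Map<String, Object> fields = new HashMap<>();
        fields.put("firstname", "Juan");
        store.write(174, fields);
        fields.put("lastname", "Olivo");
        store.write(174, fields);
        fields.put("gender", "M");
        store.write(174, fields);

        System.out.println(store.fetchCurrent(174).get("v")); // 3
        System.out.println(store.fetchPast(174).size());      // 2
    }
}
```

In this sketch a write touches both maps in one method call. Against a real database those are two separate writes, which is consistent with the "Medium, Hard" rating in the "Recover if Fail" column.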
Migrations
Several types of "Migrations":
Add/Remove Fields
Change Field Names
Change Field Data Type
Extract Embedded Document into Collection
Add / Remove Fields
For flexible schema databases this is our bread and butter
Before:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }
After:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "newfield": "value" }
> db.users.update( {"_id": 174}, { "$set": { "newfield": "value" }, "$unset": { "gender": "" } } )
Change Field Names
Again, you can do it programmatically
Before:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }
After:
{ "_id" : 174, "first": "Juan", "last": "Olivo" }
> db.users.update( {"_id": 174}, { "$rename": { "firstname": "first", "lastname": "last" } } )
Change Field Data Type
Align to a new code change and move from Int to String!
Before: {..."bdate": 1435394461522}
After: {..."bdate": "2015-06-27"}
1) Batch Process
2) Aggregation Framework
3) Change based on usage
1) Batch Process – bulk api
public void migrateBulk(){
    // "dd" is the day of the month; "DD" would format the day of the year
    DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    ...
    List<UpdateOneModel<Document>> toUpdate =
        new ArrayList<UpdateOneModel<Document>>();
    for (Document doc : coll.find()){
        // the sample bdate (1435394461522) is epoch milliseconds, so it is read as a long, not an int
        String dateAsString = df.format( new Date( doc.getLong("bdate") ));
        Document filter = new Document("_id", doc.getInteger("_id"));
        Document value = new Document("bdate", dateAsString);
        Document update = new Document("$set", value);

        toUpdate.add(new UpdateOneModel<Document>(filter, update));
    }
    coll.bulkWrite(toUpdate);
}
Is there any problem with this?
public void migrateBulk(){
    ...
    for (Document doc : coll.find()){
        ...
    }
    coll.bulkWrite(toUpdate);
}
More efficient filtering!
public void migrateBulk(){
    ...
    // BSON type 18 is a 64-bit integer (alias "long"); epoch milliseconds such as
    // 1435394461522 do not fit in an int32 (type 16). $type takes a number or an alias, not "16".
    Document query = new Document("bdate", new Document("$type", 18));
    for (Document doc : coll.find(query)){
        ...
    }
    coll.bulkWrite(toUpdate);
}
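Option 3 from the list above, change based on usage, is not shown in code on the slides. A minimal sketch of the idea follows, under stated assumptions: the class LazyBdateMigration and its upgrade method are hypothetical, a Map stands in for a loaded document, and numeric values are assumed to be epoch milliseconds read in UTC. The document is upgraded only when the application happens to load it, and the return value tells the caller whether a write-back is needed.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "change based on usage"; a Map stands in for a document.
class LazyBdateMigration {

    // Upgrades "bdate" in place when it is still numeric.
    // Returns true when the document changed and should be written back.
    static boolean upgrade(Map<String, Object> doc) {
        Object bdate = doc.get("bdate");
        if (!(bdate instanceof Number)) {
            return false; // already migrated (or absent): nothing to do
        }
        // Assumption: the number is epoch milliseconds, interpreted in UTC.
        long millis = ((Number) bdate).longValue();
        String asString = DateTimeFormatter.ISO_LOCAL_DATE
                .format(Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC));
        doc.put("bdate", asString);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("_id", 174);
        doc.put("bdate", 1435394461522L);

        System.out.println(upgrade(doc));     // true
        System.out.println(doc.get("bdate")); // 2015-06-27
        System.out.println(upgrade(doc));     // false
    }
}
```

Compared with the batch process, this spreads the migration cost over normal traffic, at the price of both representations coexisting until every document has been touched.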
Extract Document into Collection
Normalize your schema
Before, in users:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "cat_pictures":
[{"size": 10, "picture": BinData(0, "m/lhLlLmoNiUKQ==")}]
}
> db.users.aggregate( [
    {$unwind: "$cat_pictures"},
    {$project: { "_id": 0, "uid": "$_id", "size": "$cat_pictures.size",
        "picture": "$cat_pictures.picture"}},
    {$out: "cats"} ])
After, in cats:
{"uid": 174, "size": 10, "picture": BinData(0, "m/lhLlLmoNiUKQ==")}
Target shape in users, once cat_pictures is removed:
{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }
Tradeoffs
Decoupled Architecture
Positives:
-  Should be your default approach
-  Clean solution
-  Scalable
Penalties: N/A
Data Structures Variability
Positives:
-  Reflects today's data structures
-  You can push decisions for later
Penalties:
-  More complex code base
Data Structures Strictness
Positives:
-  Simple to maintain
-  Always aligned with your code base
Penalties:
-  Will eventually need migrations
-  Restricts your code iterations
Recap
•  Flexible and Dynamic Schemas are a great tool
–  Use them wisely
–  Make sure you understand the tradeoffs
–  Make sure you understand the different strategies and options
•  Works well with Strongly Typed Languages
Free Education
https://university.mongodb.com/courses/M101J/about
Thank you!
Norberto Leite
Technical Evangelist
http://www.mongodb.com/norberto
norberto@mongodb.com
@nleite
Strongly Typed Languages and Flexible Schemas

  • 1. Strongly Typed Languages and Flexible Schemas
  • 2. 2 Agenda Strongly Typed Languages Flexible Schema Databases Change Management Strategies Tradeoffs
  • 4. "Aprogramming language that requires a variable to be defined as well as the variable it is"
  • 6. 6 Traditional RDMS create table users (id int, firstname text, lastname text);Table definition Column structure
  • 7. 7 Traditional RDMS Table with checks create table cat_pictures( id int not null, size int not null, picture blob not null, user_id int, primary key (id), foreign key (user_id) references users(id)); Null checks Foreign and Primary key checks
  • 9. 9 Is this Flexible? •  What happens when we need to change the schema? –  Add new fields –  Add new relations –  Change data types •  What happens when we need to scale out our data structure?
  • 11. 11 Flexible Schema •  No mandatory schema definition •  No structure restrictions •  No schema validation process
  • 12. 12 We start from code public class CatPicture { int size; byte[] blob; } public class User { int id; String firstname; String lastname; CatPicture[] cat_pictures; }
  • 13. 13 Document Structure { _id: 1234, firstname: 'Juan', lastname: 'Olivo', cat_pictures: [ { size: 10, picture: BinData("0x133334299399299432"), } ] } Rich Data Types Embedded Documents
  • 14. 14 Flexible Schema Databases •  Challenges – Different Versions of Documents – Different Structures of Documents – Different Value Types for Fields in Documents
  • 15. 15 Different Versions of Documents Same document across time suffers changes on how it represents data { "_id" : 174, "firstname": "Juan" } { "_id" : 174, "firstname": "Juan", "lastname": "Olivo" } First Version Second Version { "_id" : 174, "firstname": "Juan", "lastname": "Olivo" , "cat_pictures": [{"size": 10, picture: BinData("0x133334299399299432")}] } Third Version
  • 16. 16 Different Versions of Documents Same document across time suffers changes on how it represents data { "_id" : 174, "firstname": "Juan" } { "_id" : 174, "name": { "first": "Juan", "last": "Olivo"} } Different Structure
  • 17. 17 Different Structures of Documents Different documents coexisting on the same collection { "_id" : 175, "brand": "Ford", "model": "Mustang", "date": ISODate("XXX") } { "_id" : 174, "firstname": "Juan", "lastname": "Olivo" } Within same collection
  • 18. 18 Different Data Types for Fields Different documents coexisting on the same collection { "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "bdate": 1224234312} { "_id" : 175, "firstname": "Paco", "lastname": "Hernan", "bdate": "2015-06-27"} { "_id" : 176, "firstname": "Tomas", "lastname": "Marce", "bdate": ISODate("2015-06-27")} Same field, different data type
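One way for typed code to tolerate the three `bdate` representations is to normalize them on read. This is only a sketch under stated assumptions: numeric values are treated as epoch seconds in UTC (the slides do not say which unit is meant), strings are expected in `yyyy-MM-dd` form, and `BirthDates.normalize` is a hypothetical helper name.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;

// Hypothetical helper (not from the slides): converts the different stored
// representations of "bdate" into a single LocalDate.
final class BirthDates {

    private BirthDates() {}

    static LocalDate normalize(Object bdate) {
        if (bdate instanceof Number) {
            // Assumption: numbers are epoch seconds, interpreted in UTC.
            long seconds = ((Number) bdate).longValue();
            return Instant.ofEpochSecond(seconds).atZone(ZoneOffset.UTC).toLocalDate();
        }
        if (bdate instanceof String) {
            // Assumption: strings use the ISO yyyy-MM-dd form, e.g. "2015-06-27".
            return LocalDate.parse((String) bdate);
        }
        if (bdate instanceof Date) {
            // A driver would typically hand back a date-typed value as java.util.Date.
            return ((Date) bdate).toInstant().atZone(ZoneOffset.UTC).toLocalDate();
        }
        throw new IllegalArgumentException("Unsupported bdate value: " + bdate);
    }
}
```

Failing loudly on an unknown type is deliberate here: a silent default would hide documents that still need migrating.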
  • 20. 20 Change Management Versioning Class Loading How to set correct data format versioning? What mechanisms are out there to make this work ?
  • 22. Strategies •  Decoupling Architectures •  ODMs •  Versioning •  Data Migrations
  • 25. 25 Becomes a mess in your hair…
  • 26. Coupled Architectures DatabaseApplication A Application C Application B Let me perform some schema changes!
  • 27. Decoupled Architecture DatabaseApplication A API Application C Application B
  • 28. Decoupled Architectures •  Allows the business logic to evolve independently of the data layer •  Decouples the underlying storage / persistence option from the business service •  Changes are "requested" and not imposed across all applications •  Better versioning control of each request and its mapping
  • 29. ODMs
  • 30. ODM •  Reduces the impedance mismatch between code and databases •  Data management facilitator •  Hides complexity of operators •  Tries to decouple business complexity with "magic" recipes
  • 31. 31 Spring Data •  POJO centric model •  MongoTemplate || CrudRepository extensions to make the connection to the repositories •  Uses annotations to override default field names and even data types (data type mapping) public interface UserRepository extends MongoRepository<User, Integer>{ } public class User { @Id int id; @Field("first_name") String firstname; String lastname;
  • 32. 32 Spring Data Document Structure { "_id": 1, "first_name": "first", "lastname": "last", "catpictures": [ { "size": 10, "blob": BinData(0, "Kr3AqmvV1R9TJQ==") }, ] }
  • 33. 33 Spring Data Considerations •  Data formats, versions and types still need to be managed •  Does not solve issues like type validation out-of-box •  Can make things more complicated but more "controllable" @Field("first_name") String firstname;
  • 34. Morphia •  Data source centric •  Discovers all the POJOs in a given package •  Also uses annotations to perform overrides and deal with object mapping @Entity("users") public class User { @Id int id; String firstname; String lastname; morphia.mapPackage("examples.odms.morphia.pojos"); Datastore datastore = morphia.createDatastore(new MongoClient(), "morphia_example"); datastore.save(user);
  • 35. 35 Morphia Document Structure { "_id": 1, "className": "examples.odms.morphia.pojos.User", "firstname": "first", "lastname": "last", "catpictures": [ { "size": 10, "blob": BinData(0, "Kr3AqmvV1R9TJQ==") }, ] } Class Definition
  • 36. Morphia Considerations •  Enables better control at class loading •  Like Spring Data, also facilitates field overriding (tags to define field keys) •  Better support for object polymorphism
  • 38. Versioning: versioning of data structures (especially documents) can be very helpful: recreate documents over time; flow control; data / field multiversion requirements; archiving and history purposes
  • 39. Versioning – Option 0: Change the existing document on each write, incrementing a monotonically increasing version number stored inside it. { "_id" : 174, "v" : 1, "firstname": "Juan" } { "_id" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" } { "_id" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" } > db.users.update( {"_id": 174}, { "$set": { ... }, "$inc": { "v": 1 } } ) (the $inc increments the version field)
  • 40. Versioning – Option 1: Store the full document on each write, with a monotonically increasing version number inside. { "docId" : 174, "v" : 1, "firstname": "Juan" } { "docId" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" } { "docId" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" } > db.users.insert( {"docId":174 …}) > db.users.find({"docId":174}).sort({"v":-1}).limit(-1); (always finds the latest version)
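The "find the latest version" query boils down to picking the highest `v` among all documents that share a `docId`. A minimal in-memory sketch of that selection, assuming documents are plain maps and using the hypothetical class name `LatestVersion` (no database driver involved):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not from the slides): in-memory equivalent of
// find({"docId": id}).sort({"v": -1}) limited to one result.
final class LatestVersion {

    private LatestVersion() {}

    static Optional<Map<String, Object>> latest(List<Map<String, Object>> docs, int docId) {
        return docs.stream()
                // Keep only the versions of the requested logical document.
                .filter(d -> Integer.valueOf(docId).equals(d.get("docId")))
                // Highest version number wins.
                .max(Comparator.comparingInt(
                        (Map<String, Object> d) -> ((Number) d.get("v")).intValue()));
    }
}
```

Against a real collection the sort would normally be backed by an index on the document id and version fields; the sketch only shows the selection logic.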
  • 41. Versioning – Option 2: Store all document versions inside a single document. > db.users.update( {"_id": 174}, { "$set": { "current.attr1": ... }, "$inc": { "current.v": 1 }, "$addToSet": { "prev": {... } } } ) { "_id" : 174, "current" : { "v" :3, "attr1": 184, "attr2" : "A-1" }, "prev" : [ { "v" : 1, "attr1": 165 }, { "v" : 2, "attr1": 165, "attr2": "A-1" } ] } ("current" holds the current value, "prev" the previous values)
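The embedded-versions layout can be illustrated without a database: archive a copy of `current` into `prev`, apply the changes, and bump `v`. This is a sketch over plain maps with a hypothetical class name `EmbeddedVersioning`; it mirrors the document shape on the slide, not any driver call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the slides): applies a change to a document
// shaped like { "current": {...}, "prev": [ {...}, ... ] }.
final class EmbeddedVersioning {

    private EmbeddedVersioning() {}

    @SuppressWarnings("unchecked")
    static void applyChange(Map<String, Object> doc, Map<String, Object> changes) {
        Map<String, Object> current = (Map<String, Object>) doc.get("current");
        List<Map<String, Object>> prev = (List<Map<String, Object>>)
                doc.computeIfAbsent("prev", k -> new ArrayList<Map<String, Object>>());

        // Archive a copy of the current version before touching it.
        prev.add(new HashMap<>(current));

        // Build the next version: old fields, overridden by the changes, v + 1.
        Map<String, Object> next = new HashMap<>(current);
        next.putAll(changes);
        next.put("v", ((Number) current.get("v")).intValue() + 1);
        doc.put("current", next);
    }
}
```

Because every version lives in one document, the history grows with each write; the sketch does not cap the size of `prev`.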
  • 42. Versioning – Option 3: Keep one collection for the "current" version and another for past versions. > db.users.find( {"_id": 174 }) > db.users_past.find( {"pid": 174 }) Previous versions collection: { "pid" : 174, "v" : 1, "firstname": "Juan" } { "pid" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" } Current collection: { "_id" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }
  • 43. 43 Versioning Schema Fetch 1 Fetch Many Update Recover if Fail 0) Increment Version Easy, Fast Fast Easy Medium N/A 1) New Document Easy, Fast Not Easy, Slow Medium Hard 2) Embedded in Single Doc Easy, Fastest Easy, Fastest Medium N/A 3) Separate Collection Easy, Fastest Easy, Fastest Medium Medium, Hard
  • 45. Migrations. Several types of "migrations": add / remove fields; change field names; change field data type; extract embedded document into collection
  • 46. Add / Remove Fields: for a flexible schema database, this is our bread and butter. Before: { "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "gender": "M" } After: { "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "newfield": "value" } > db.users.update( {"_id": 174}, {"$set": { "newfield": "value" }, "$unset": {"gender":""} })
  • 47. Change Field Names: again, you can do it programmatically. Before: { "_id" : 174, "firstname": "Juan", "lastname": "Olivo" } After: { "_id" : 174, "first": "Juan", "last": "Olivo" } > db.users.update( {"_id": 174}, {"$rename": { "firstname": "first", "lastname":"last"} })
  • 48. 48 Change Field Data Type Align to a new code change and move from Int to String! {..."bdate": 1435394461522} {..."bdate": "2015-06-27"} 1) Batch Process 2) Aggregation Framework 3) Change based on usage
  • 49. Change Field Data Type 1) Batch Process – Bulk API public void migrateBulk(){ DateFormat df = new SimpleDateFormat("yyyy-MM-dd"); ... List<UpdateOneModel<Document>> toUpdate = new ArrayList<UpdateOneModel<Document>>(); for (Document doc : coll.find()){ String dateAsString = df.format( new Date( doc.getInteger("bdate", 0) )); Document filter = new Document("_id", doc.getInteger("_id")); Document value = new Document("bdate", dateAsString); Document update = new Document("$set", value); toUpdate.add(new UpdateOneModel<Document>(filter, update)); } coll.bulkWrite(toUpdate); }
  • 50. 50 Change Field Data Type 1) Batch Process – bulk api public void migrateBulk(){ ... for (Document doc : coll.find()){ ... } coll.bulkWrite(toUpdate); Is there any problem with this?
  • 51. Change Field Data Type 1) Batch Process – Bulk API public void migrateBulk(){ ... //bson type 16 represents int32 data type Document query = new Document("bdate", new Document("$type", 16)); for (Document doc : coll.find(query)){ ... } coll.bulkWrite(toUpdate); } More efficient filtering!
  • 52. Extract Document into Collection: normalize your schema. > db.users.aggregate( [ {$unwind: "$cat_pictures"}, {$project: { "_id":0, "uid":"$_id", "size": "$cat_pictures.size", "picture": "$cat_pictures.picture"}}, {$out:"cats"}]) Before: { "_id" : 174, "firstname": "Juan", "lastname": "Olivo" , "cat_pictures": [{"size": 10, "picture": BinData(0, "m/lhLlLmoNiUKQ==")}] } After: { "_id" : 174, "firstname": "Juan", "lastname": "Olivo" } and {"size": 10, "picture": BinData(0, "m/lhLlLmoNiUKQ==")}
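What the unwind-and-project steps compute can be sketched in plain Java: one output record per embedded picture, carrying the owner's id as `uid`. The class `CatPictureExtractor` is a hypothetical name, documents are assumed to be plain maps, and the sketch only builds the list of new records rather than writing a collection.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the slides): flattens embedded cat_pictures
// into standalone records, roughly what the unwind + project stages produce.
final class CatPictureExtractor {

    private CatPictureExtractor() {}

    static List<Map<String, Object>> extract(List<Map<String, Object>> users) {
        List<Map<String, Object>> cats = new ArrayList<>();
        for (Map<String, Object> user : users) {
            Object pictures = user.get("cat_pictures");
            if (!(pictures instanceof List)) {
                continue; // in this sketch, users without a picture list contribute nothing
            }
            for (Object p : (List<?>) pictures) {
                if (!(p instanceof Map)) {
                    continue; // skip malformed entries
                }
                Map<?, ?> pic = (Map<?, ?>) p;
                Map<String, Object> cat = new LinkedHashMap<>();
                cat.put("uid", user.get("_id")); // back-reference to the owning user
                cat.put("size", pic.get("size"));
                cat.put("picture", pic.get("picture"));
                cats.add(cat);
            }
        }
        return cats;
    }
}
```

Keeping the `uid` back-reference is what lets the application re-join pictures to users after the embedded array is removed.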
  • 54. Tradeoffs (positives / penalties). Decoupled Architecture: positives: should be your default approach, clean solution, scalable; penalties: N/A. Data Structures Variability: positives: reflects today's data structures, you can push decisions for later; penalties: more complex code base. Data Structures Strictness: positives: simple to maintain, always aligned with your code base; penalties: will eventually need migrations, restricts your code iterations.
  • 55. Recap
  • 56. 56 Recap •  Flexible and Dynamic Schemas are a great tool –  Use them wisely –  Make sure you understand the tradeoffs –  Make sure you understand the different strategies and options •  Works well with Strongly Typed Languages