SQL vs NoSQL and moving data from MongoDB to Azure Data Lake by using Azure Data Factory
Diponkar Paul
Father and husband
Blogger & speaker
Profession: Data Engineer, working with BI and data warehousing for 12 years
Diverse background: South Asia, Nordic region, North America
Community: Lead of the Toronto Data Professionals Community
Twitter: @Paulswengrr
Blog: www.allaboutdata.ca
What we cover
• Refresh our memory of traditional SQL
• Get to know NoSQL (MongoDB)
• Demo: NoSQL
• Comparison
• Azure Data Factory: copy data from MongoDB
• Demo: MongoDB with ADF
SQL Syntax
SELECT Id, Product, Price
FROM Product
WHERE ProductCategory = 'Bikes'
Other statements: JOIN, INSERT, UPDATE, DELETE
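A minimal sketch of the statements named on this slide, run from Python against an in-memory SQLite database. SQLite stands in for SQL Server here, the table layout is simplified, and the "Helmet" row is an illustrative assumption rather than data from the talk.

```python
import sqlite3

# In-memory SQLite stands in for the SQL Server database used in the talk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (Id INTEGER PRIMARY KEY, Product TEXT, "
    "Price INTEGER, ProductCategory TEXT)"
)

# INSERT: add rows (the Helmet row is illustrative only).
conn.executemany(
    "INSERT INTO Product (Id, Product, Price, ProductCategory) VALUES (?, ?, ?, ?)",
    [
        (1, "Mountain Bike", 2500, "Bikes"),
        (2, "City Bike", 1000, "Bikes"),
        (3, "Helmet", 80, "Accessories"),
    ],
)

# UPDATE and DELETE change or remove the rows that match a WHERE clause.
conn.execute("UPDATE Product SET Price = 900 WHERE Id = 2")
conn.execute("DELETE FROM Product WHERE Id = 3")

# SELECT: the query from the slide, with the category passed as a parameter.
bikes = conn.execute(
    "SELECT Id, Product, Price FROM Product WHERE ProductCategory = ? ORDER BY Id",
    ("Bikes",),
).fetchall()
print(bikes)
```

Passing the category as a parameter instead of pasting it into the SQL string is a design choice that avoids quoting problems such as the typographic quotes that often creep into slides.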
Well-defined schema
CREATE TABLE [Production].[Product](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](100) NOT NULL,
[ProductNumber] [nvarchar](25) NOT NULL,
[MakeFlag] [dbo].[Flag] NOT NULL,
[FinishedGoodsFlag] [dbo].[Flag] NOT NULL,
[Color] [nvarchar](15) NULL,
[SafetyStockLevel] [smallint] NOT NULL,
[StandardCost] [money] NOT NULL,
[ListPrice] [money] NOT NULL,
[Size] [nvarchar](5) NULL)
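What a well-defined schema buys you can be sketched as follows: the database itself rejects rows that break the rules. This is an illustration only. It uses SQLite types instead of the SQL Server ones, keeps just a few of the columns, and the product numbers and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified version of the slide's table: SQLite types replace the SQL Server ones.
conn.execute(
    """
    CREATE TABLE Product (
        ProductID     INTEGER PRIMARY KEY AUTOINCREMENT,
        Name          TEXT NOT NULL,
        ProductNumber TEXT NOT NULL,
        Color         TEXT NULL,
        ListPrice     NUMERIC NOT NULL
    )
    """
)


def try_insert(name, product_number, color, list_price):
    """Return True if the row satisfies the schema, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO Product (Name, ProductNumber, Color, ListPrice) "
            "VALUES (?, ?, ?, ?)",
            (name, product_number, color, list_price),
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert("Mountain Bike", "BK-M001", None, 2500)  # Color may be NULL
rejected = try_insert(None, "BK-C001", "Red", 1000)      # Name is NOT NULL
print(ok, rejected)
```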
Relationship/Normalization

Product
Id | Name | Price | Description
1 | "Mountain Bike" | 2500 | "Bike for mountain trek"
2 | "City Bike" | 1000 | "Best fit to roam around city"

Bridge table (Order)
Id | Customer_ID | Product_ID
1 | 2 | 1
2 | 2 | 2
3 | 1 | 1

Customer
Id | Name | Email
1 | Morten Sorenson | m.s@outlook.com
2 | Andersen Lu | al@yahoo.com
3 | Derek Paul | dp@outlook.com
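To show how the bridge table connects customers to products, here is a small sketch that loads the three tables from this slide into in-memory SQLite and joins them. The bridge table is named `Orders` because `Order` is a reserved word; that renaming and the use of SQLite are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
    CREATE TABLE Product  (Id INTEGER PRIMARY KEY, Name TEXT, Price INTEGER);
    -- "Order" is a reserved word, so the bridge table is called Orders here.
    CREATE TABLE Orders (
        Id          INTEGER PRIMARY KEY,
        Customer_ID INTEGER REFERENCES Customer(Id),
        Product_ID  INTEGER REFERENCES Product(Id)
    );
    INSERT INTO Customer VALUES (1, 'Morten Sorenson', 'm.s@outlook.com'),
                                (2, 'Andersen Lu', 'al@yahoo.com'),
                                (3, 'Derek Paul', 'dp@outlook.com');
    INSERT INTO Product VALUES (1, 'Mountain Bike', 2500), (2, 'City Bike', 1000);
    INSERT INTO Orders VALUES (1, 2, 1), (2, 2, 2), (3, 1, 1);
    """
)

# Each order row is resolved to a customer name and a product name.
rows = conn.execute(
    """
    SELECT c.Name, p.Name
    FROM Orders o
    JOIN Customer c ON c.Id = o.Customer_ID
    JOIN Product  p ON p.Id = o.Product_ID
    ORDER BY o.Id
    """
).fetchall()
print(rows)
```

Customer 3 has no row in the bridge table, so an inner join returns nothing for that customer.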
Types of relationships
NoSQL
• MongoDB
• Azure Cosmos DB
• Amazon DocumentDB
• Oracle NoSQL
• Google Bigtable
Not Only SQL!!
NoSQL: MongoDB
“MongoDB” derives from the word “humongous”
What do we call them?
Database: E-Commerce
Collections: the equivalent of tables (Customer, Product…)
Documents: the equivalent of rows, for example
{"Name": "Anders", "age": 36}
{"Name": "Carsten", "age": 42}
No defined schema
Id: 1, Age: 36, Name: 'Anders', …..
Id: 2, Age: 36, Name: 'Carsten', …..
Id: 3, …..
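A rough illustration of what "no defined schema" means for the code that reads the data. Plain Python dicts stand in for documents, so no MongoDB server is needed. The `email` field and the third document's name are assumptions added to show documents of different shapes.

```python
# Plain dicts stand in for documents in one collection.
customers = [
    {"id": 1, "name": "Anders", "age": 36},
    {"id": 2, "name": "Carsten", "age": 36, "email": "carsten@example.com"},  # extra field
    {"id": 3, "name": "Derek"},                                               # no age
]


def field_names(collection):
    """Collect every field name that appears in at least one document."""
    names = set()
    for doc in collection:
        names.update(doc)
    return sorted(names)


# Readers must tolerate missing fields, because nothing guarantees they exist.
ages = [doc.get("age") for doc in customers]
print(field_names(customers))
print(ages)
```

The flexibility moves the responsibility for consistency from the database to the application code.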
NoSQL – No relation

Option 1: separate collections linked by ids

Profession
{id: 1, profession: 'Developer'}
{id: 2, profession: 'Data Engineer'}
{id: 3, profession: 'Actor'}

Users
{id: 1, name: 'Tom Hanks', age: 20}
{id: 2, name: 'Casper Ruther', age: 42}
{id: 3, name: 'Paul Anders', age: 63}

Usersprofession
{id: 1, userId: 1, professionId: 1}
{id: 2, userId: 1, professionId: 2}
{id: 3, userId: 1, professionId: 3}
{id: 4, userId: 2, professionId: 2}

Option 2: embed the professions in the user document

// insert() is deprecated in current shells; insertOne() is its replacement
db.Users.insertOne(
  {
    id: "01",
    name: "Tom Hanks",
    age: 20,
    email: "th@hollywood.com",
    Profession: ["Developer", "Data Engineer", "Actor"]
  }
)
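The two modelling options on this slide, linked collections versus an embedded array, can be compared with a short sketch. It uses plain Python lists of dicts in place of MongoDB collections, and the helper name `embed_professions` is hypothetical.

```python
professions = [
    {"id": 1, "profession": "Developer"},
    {"id": 2, "profession": "Data Engineer"},
    {"id": 3, "profession": "Actor"},
]
users = [
    {"id": 1, "name": "Tom Hanks", "age": 20},
    {"id": 2, "name": "Casper Ruther", "age": 42},
    {"id": 3, "name": "Paul Anders", "age": 63},
]
users_profession = [
    {"id": 1, "userId": 1, "professionId": 1},
    {"id": 2, "userId": 1, "professionId": 2},
    {"id": 3, "userId": 1, "professionId": 3},
    {"id": 4, "userId": 2, "professionId": 2},
]


def embed_professions(users, professions, links):
    """Return copies of the user documents with profession names embedded as a list."""
    name_by_id = {p["id"]: p["profession"] for p in professions}
    embedded = []
    for user in users:
        names = [
            name_by_id[link["professionId"]]
            for link in links
            if link["userId"] == user["id"]
        ]
        embedded.append({**user, "Profession": names})
    return embedded


embedded_users = embed_professions(users, professions, users_profession)
print(embedded_users[0])
```

With the embedded form, reading a user and their professions is a single document lookup; with the linked form, the application has to resolve the ids itself.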
Tools: MongoDB
MongoDB Compass: https://www.mongodb.com/products/compass
Robo 3T: https://robomongo.org/
Data model design: https://docs.mongodb.com/manual/core/data-model-design/
db.collection.update(): https://docs.mongodb.com/manual/reference/method/db.collection.update/
Languages
• Mongo shell
• Python
• Java
• C#
• Scala
• Go, and many more
Demo
SQL vs NoSQL
SQL | NoSQL
Data uses a schema | Schema-less (schema-agnostic)
Maintains relationships | No relations, though you can design relationships
Data distributed across multiple tables | Data in one collection (embedded)
Move your NoSQL data from on-premises to Data Lake Gen2
Azure Data Lake
Azure Data Lake is a scalable data storage and analytics service
- Fully HDFS-compatible file system
- Azure AD integrated
- Microsoft's PaaS big data solution
Azure Data Factory
- ETL/ELT tool
- Code-free
- Runs in the Azure cloud
- and a lot more…
Prerequisites
• Azure account
• Azure Data Factory resource
• Linked services (source and target connections)
• Integration runtime
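To make the linked service and integration runtime prerequisites more concrete, the sketch below builds the kind of JSON definition a MongoDB source linked service might have. The `MongoDbV2` type and the property names reflect my reading of the ADF connector documentation and should be verified there; the linked service name, connection string, database and runtime name are placeholders, and the helper function is hypothetical. The talk's demo may have used a different connector version.

```python
import json


def mongodb_linked_service(name, connection_string, database, integration_runtime):
    """Build a linked-service definition for a MongoDB source (assumed JSON shape)."""
    return {
        "name": name,
        "properties": {
            "type": "MongoDbV2",  # assumption: check the connector documentation
            "typeProperties": {
                "connectionString": connection_string,
                "database": database,
            },
            # A self-hosted integration runtime lets ADF reach a server
            # that sits inside a private, on-premises network.
            "connectVia": {
                "referenceName": integration_runtime,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


# All values below are placeholders, not settings from the talk.
definition = mongodb_linked_service(
    name="OnPremMongoDb",
    connection_string="mongodb://localhost:27017",
    database="ECommerce",
    integration_runtime="SelfHostedIR",
)
print(json.dumps(definition, indent=2))
```

In practice the connection string would be stored as a secret, for example in Azure Key Vault, rather than in plain text.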
Demo
Be cautious!
• Check which MongoDB versions the ADF copy activity supports (V 3.4 at the time of this talk):
https://docs.microsoft.com/en-us/azure/data-factory/connector-mongodb
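A small pre-flight check along these lines could catch a version mismatch before a pipeline run. It is a sketch under a stated assumption: it treats the 3.4 from the slide as an upper limit, which may not match the current connector documentation, and both function names are hypothetical.

```python
def parse_version(version):
    """Turn a version string such as '3.4.24' into a (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_supported(server_version, max_supported="3.4"):
    """Return True if the server's major.minor version does not exceed the limit."""
    return parse_version(server_version) <= parse_version(max_supported)


# Example version strings; a real check would read the version from the server.
for candidate in ("3.2.22", "3.4.24", "4.2.0"):
    print(candidate, is_supported(candidate))
```

Comparing tuples of integers rather than raw strings avoids the trap where "3.10" sorts before "3.4".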
Questions
@paulswengrr
Diponkarpaul

SQL vs NoSQL and Azure Data Factory, Glasgow Data UG
