We offer MongoDB-as-a-Service on any cloud of your choice. You can read more about our
MongoDB-as-a-Service in the white paper on our website:
http://www.cumulogic.com/resources/mongodb_wp/
The goal of this boot camp is to give you hands-on experience with MongoDB database-as-a-service,
show you how to load data, and walk through a sample application that analyzes the data. We will
use a small sample Twitter application for our hands-on lab, which will help you write a
MongoDB application. We will also briefly discuss a few performance-related topics so you can
analyze and tune the performance of your databases. At the same time, you will see how
you can easily launch a fully managed MongoDB instance in the cloud.
About a decade ago, business applications were transactional in nature and most of the
issues were related to executing transactions (e.g., credit card processing) with low latency.
As a result, enterprise data was more “relational” in nature and was therefore “structured”.
The nature of business applications has changed. Enterprises are trying to figure out how
to use all the data in their enterprise systems, social media, machine logs, etc. to understand
how that data impacts their business and how they can gain a competitive advantage by
leveraging the nuggets in it.
Fast forward to today and businesses are trying to solve a different problem. With the
diverse nature of data sources and data formats, we need newer technologies that scale and
that provide answers, or identify those nuggets in the data, much faster and at lower cost than
traditional SQL databases or data warehouse systems. Hence, we see a slew of new database
technologies being developed that promise to help solve these problems.
Depending on the nature of the data or the problem they solve, we can group these new
database technologies into three major categories: (1) document-oriented databases, which
store and crunch data in document formats, (2) key-value databases such as Riak and
Redis, and (3) graph databases. Depending on the type of data, you could use one of these
databases to solve your data analytics problems. Today, we focus on MongoDB.
When should we use a NoSQL database versus a SQL database, and which NoSQL
database?
As I mentioned before, the problems that NoSQL databases solve are related to the nature and
amount of data we want to process in our next-generation applications. We need databases
that can scale to petabytes of data at a fraction of the cost of a relational database. We need
database systems that can help us quickly analyze petabytes of data and provide results in
real time, so the speed of data access is critical.
NoSQL database systems can provide high-speed, low-latency access to large
amounts of data. One key criterion to consider when choosing a NoSQL database is the
nature of your applications and the main issues with them: are they operational or analytical? For
example, for batch-processing, analytical apps, you may be better off with Hadoop, while for
operational issues of scalability and real-time processing, you may want to choose MongoDB.
So consider these criteria in making your decisions, run some experiments, and
find the database that best fits your application's needs.
1. Let’s take a look at the key feature sets of MongoDB at a very high level. MongoDB is a
document-oriented database server. It stores objects as BSON (pronounced as bison), which
is a binary version of the JSON format, and it supports dynamic schemas, which essentially
means it is a schema-less database. There is no rigid SQL-like schema for storing the data. This
gives you the flexibility to store differently shaped data from different sources such as social
networks, machine logs or CRM systems.
2. MongoDB supports indexing much like traditional SQL databases do, which means you can index
data on any field, and indexing a high-cardinality field improves query performance the most.
(FYI: high cardinality here means a field whose value varies across most records. For example,
if we are storing data about employees, the data field that varies most is the phone number
and not the city name or company name.)
3. Most of you may be familiar with the concept of database sharding. MongoDB is a
horizontally scalable database and supports sharding, which means it stores data in smaller
chunks on several data nodes for low-latency access to the data. Hence MongoDB is widely
used in the cloud: you can scale the database by adding shards as your data grows
and maintain that low latency of data access even as the size of your data grows.
4. MongoDB is designed for resilience and data durability, and supports replica sets, which can
be geographically distributed.
5. MongoDB supports map-reduce operations and provides fast updates to the data.
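Item 2 above suggests indexing the field whose values vary the most across records. As a rough illustration, the sketch below measures that variation as the ratio of distinct values to records. It is plain Python with no database involved; the employee records and the `distinct_ratio` helper are hypothetical and invented for this example.

```python
from typing import Any


def distinct_ratio(records: list[dict[str, Any]], field: str) -> float:
    """Return the number of distinct values of `field` divided by the record count."""
    if not records:
        return 0.0
    values = {record.get(field) for record in records}
    return len(values) / len(records)


# Hypothetical employee records, invented for illustration only.
employees = [
    {"name": "Ana", "phone": "555-0101", "city": "New York", "company": "Acme"},
    {"name": "Ben", "phone": "555-0102", "city": "New York", "company": "Acme"},
    {"name": "Cy", "phone": "555-0103", "city": "Boston", "company": "Acme"},
    {"name": "Di", "phone": "555-0104", "city": "New York", "company": "Acme"},
]

# A ratio near 1.0 means almost every record has its own value for that field.
ratios = {field: distinct_ratio(employees, field) for field in ("phone", "city", "company")}
best_field = max(ratios, key=ratios.get)
print(ratios, best_field)
```

In this toy data set the phone number is unique per employee, so it comes out as the most selective field, which matches the example given in item 2.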
FAQs
Question: When do you want to use Hadoop vs. MongoDB for map-reduce?
Answer: You want to use Hadoop for batch jobs, where you can run analytics on
offline data, whereas you can use MongoDB for real-time data analytics.
Question: How does sharding work in MongoDB?
Answer: MongoDB sharding works by spreading writes across multiple data nodes.
mongos, the MongoDB routing process, directs each read or write to the appropriate data node.
(Show the slide and refer to the sharding diagram.)
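The routing idea in the answer above can be sketched in a few lines. This is a toy model under stated assumptions, not MongoDB's implementation: it picks a shard with a simple hash-modulo rule, whereas MongoDB assigns ranges of shard-key values (chunks) to shards. The `Router` class, `pick_shard` function and user names are all hypothetical.

```python
import hashlib
from typing import Optional


def pick_shard(shard_key: str, shard_count: int) -> int:
    """Map a shard-key value to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


class Router:
    """Toy stand-in for a routing process: sends each operation to exactly one shard."""

    def __init__(self, shard_count: int) -> None:
        # Each shard is modeled as an in-memory dict keyed by the shard key.
        self.shards: list[dict[str, dict]] = [{} for _ in range(shard_count)]

    def write(self, key: str, document: dict) -> int:
        index = pick_shard(key, len(self.shards))
        self.shards[index][key] = document
        return index

    def read(self, key: str) -> Optional[dict]:
        # The same key always hashes to the same shard, so only one shard is consulted.
        index = pick_shard(key, len(self.shards))
        return self.shards[index].get(key)


router = Router(shard_count=3)
for user in ("alice", "bob", "carol", "dave", "erin", "frank"):
    router.write(user, {"user": user, "tweets": 0})

print([len(shard) for shard in router.shards])  # how the six documents were spread
print(router.read("carol"))
```

The point of the sketch is only that the application talks to the router, and the router decides which data node holds a given key.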
Since MongoDB scales very well horizontally, it is widely used in the cloud.
Given the complexity of managing MongoDB to maintain availability, data durability
and performance, you may want to leverage platforms that provide MongoDB-as-a-Service:
a single web service call provisions a dedicated MongoDB server that is fully sharded
and replicated and scales automatically.
You will get a chance to use the MongoDB service on our platform shortly.
The specific MongoDB architecture that you choose will impact performance, availability
and data durability. MongoDB is flexible and supports high-availability and sharding
architectures to provide the level of redundancy, performance and SLA you want for your
service.
MongoDB supports replica set and sharded deployment architectures. Replica sets provide
high availability and data durability, while sharding provides scalability. You can run
each shard as a replica set to get the best of both: reliability and scalability.
This is a replica set with three replica nodes in two datacenters or two regions of a public
cloud.
MongoDB replicates writes from the primary to the other replicas asynchronously, so reads from
those replicas are “eventually consistent”: data on the replicas may briefly be out of sync with
the primary node. You may want to use this architecture for data redundancy rather than for
scaling. In this architecture, you still send reads and writes to the primary node, which means
that even with multiple nodes, your application wouldn’t
necessarily scale better. To maintain this level of redundancy yet improve scalability, you can
use sharding as in the next slide.
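To make the out-of-sync window concrete, here is a minimal sketch of a primary with secondaries that apply writes later. It is a toy in-memory model, not MongoDB's replication protocol; the `ReplicaSet` class and its methods are hypothetical names chosen for this example.

```python
from typing import Optional


class ReplicaSet:
    """Toy model: one primary plus secondaries that receive writes after a delay."""

    def __init__(self, secondary_count: int) -> None:
        self.primary: dict[str, str] = {}
        self.secondaries: list[dict[str, str]] = [{} for _ in range(secondary_count)]
        self.pending: list[tuple[str, str]] = []  # writes not yet copied to secondaries

    def write(self, key: str, value: str) -> None:
        # Every write goes to the primary first and is queued for replication.
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self) -> None:
        # Apply the queued writes, in order, to every secondary.
        for key, value in self.pending:
            for secondary in self.secondaries:
                secondary[key] = value
        self.pending.clear()

    def read_primary(self, key: str) -> Optional[str]:
        return self.primary.get(key)

    def read_secondary(self, index: int, key: str) -> Optional[str]:
        return self.secondaries[index].get(key)


rs = ReplicaSet(secondary_count=2)
rs.write("status", "v1")
stale = rs.read_secondary(0, "status")  # replication has not run yet
rs.replicate()
fresh = rs.read_secondary(0, "status")  # the secondary has now caught up
print(rs.read_primary("status"), stale, fresh)
```

Reading from the primary always returns the latest write in this model, which is why the notes describe the replicas here as a redundancy mechanism rather than a scaling one.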
This is a three-shard deployment architecture that uses three replica sets and can be in a
single region or datacenter or distributed geographically.
With this architecture, you get the benefit of both: data redundancy with replica sets and
high scalability with shards. Each shard can itself be a replica set, which provides data
redundancy at the level of each shard. But keep in mind that sharding and replication both add
overhead, so choose what is best for your database.
Now let’s take a look at a sample application. We have a sample Twitter app to experiment
with hands-on. We will use MongoDB-as-a-Service on the cloud and use the sample app to
analyze Twitter data.
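The sample app itself is not reproduced in these notes. As a rough idea of the kind of analysis it might perform, the sketch below counts hashtags with a map step and a reduce step. It is plain Python over in-memory documents, not the lab code and not a MongoDB call; the tweets, field names and function names are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical tweet documents. They do not all share the same fields,
# which is what a schema-less document store allows.
tweets = [
    {"user": "alice", "text": "Loving #mongodb at #cloudexpo"},
    {"user": "bob", "text": "Sharding talk next #mongodb", "retweets": 3},
    {"user": "carol", "text": "No tags here", "lang": "en"},
]


def map_tweet(tweet: dict) -> Iterator[tuple[str, int]]:
    """Map step: emit (hashtag, 1) for every hashtag found in the tweet text."""
    for word in tweet.get("text", "").split():
        if word.startswith("#") and len(word) > 1:
            yield word.lower(), 1


def reduce_counts(pairs: Iterable[tuple[str, int]]) -> dict[str, int]:
    """Reduce step: sum the emitted values for each key."""
    totals: dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


pairs = [pair for tweet in tweets for pair in map_tweet(tweet)]
hashtag_counts = reduce_counts(pairs)
print(hashtag_counts)
```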
Just like any database, the performance of a MongoDB database must be monitored and
optimized for a given workload or application type.
These are the key metrics to watch in MongoDB: (1) CPU, (2) memory, (3) ops counters:
the total number of operations over a period of time, which shows you the number
of active and pending operations, and (4) background flush: the number of disk writes when
MongoDB flushes all in-memory data to disk. Keep an eye on this number and
tune it if you wish to reduce the frequency of disk writes. There are other
metrics which we will see during our hands-on lab.
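Operation counters are typically reported as running totals, so a rate has to be derived from two readings. The sketch below shows that arithmetic. It is a minimal illustration, assuming two snapshots taken a known number of seconds apart; the `ops_per_second` helper and all numbers are invented, and no monitoring API is called.

```python
def ops_per_second(before: dict[str, int], after: dict[str, int], seconds: float) -> dict[str, float]:
    """Turn two cumulative counter snapshots into per-second rates."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # A counter missing from the first snapshot is treated as starting at zero.
    return {name: (after[name] - before.get(name, 0)) / seconds for name in after}


# Hypothetical counter snapshots taken 10 seconds apart.
snapshot_start = {"insert": 1000, "query": 5000, "update": 200, "delete": 10}
snapshot_end = {"insert": 1500, "query": 5200, "update": 260, "delete": 10}

rates = ops_per_second(snapshot_start, snapshot_end, seconds=10.0)
print(rates)
```

Comparing these rates over time, rather than the raw totals, is what reveals whether a workload is growing or a particular operation type dominates.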
Hands on Big Data Analysis with MongoDB - Cloud Expo Bootcamp NYC
Hands on Big Data Analysis with MongoDB - Cloud Expo Bootcamp NYC

  • 1.
  • 2.
  • 3. We offer MongoDB-as-a-Service on any cloud of your choice. You can read more about our MongoDB-as-a-Service in our white paper on our website: http://www.cumulogic.com/resources/mongodb_wp/
  • 4. The goal of this boot camp is to give you hands-on experience with MongoDB database-as-a-service: how to load data, and how a sample application analyzes that data. We will use a small sample Twitter application for our hands-on lab, which will help you write a MongoDB application. We will also briefly discuss a few performance-related metrics so you can analyze and tune the performance of your databases. Along the way, you will see how easily you can launch a fully managed MongoDB instance in the cloud.
  • 5. About a decade ago, business applications were transactional in nature, and most of the issues were related to executing transactions (e.g. credit card processing) with low latency. As a result, enterprise data was more “relational” in nature and was therefore “structured”. The nature of business applications has changed, and enterprises are trying to figure out how to use all the data in their enterprise systems, social media, machine logs, etc. to understand how that data impacts their business and how they can gain a competitive advantage by leveraging the nuggets in it. Fast forward to today, and businesses are trying to solve a different problem. With the diverse nature of data sources and data formats, we need newer technologies that scale and that provide answers, or identify those nuggets in the data, much faster and at lower cost than traditional SQL databases or data warehouse systems. Hence, we see a slew of new database technologies being developed that promise to help solve these problems. Depending on the nature of the data or the problem they solve, we can group these new database technologies into three major categories: (1) document-oriented databases, which store and crunch data in document formats, (2) key-value databases such as Riak and Redis, and (3) graph databases. Depending on the type of data, you could use one of these databases to solve your data analytics problems. Today, we focus on MongoDB.
  • 6. When should we use a NoSQL database vs. a SQL database, and which NoSQL database? As I mentioned before, the problems that NoSQL databases solve are related to the nature and amount of data we want to process in our next-generation applications. We need databases that can scale to petabytes of data at a fraction of the cost of a relational database. We need database systems that can help us quickly analyze petabytes of data and provide results in real time, so the speed of data access is critical. NoSQL database systems can provide high-speed, low-latency access to large amounts of data. One key criterion to consider when choosing a NoSQL database is the nature of your applications and the main issues with them: are they operational or analytical? For example, for batch-processing analytical apps, you may be better off with Hadoop, while for operational issues of scalability and real-time processing, you may want to choose MongoDB. So consider these criteria in making your decision, run some experiments, and find the database that best fits your application's needs.
  • 7. 1. Let’s take a look at the key feature sets of MongoDB at a very high level. MongoDB is a document-oriented database server. It stores objects as BSON (pronounced "bee-son"), which is a binary-encoded version of the JSON format, and it supports dynamic schemas, which essentially means it is a schema-less database. There is no rigid SQL-like schema for storing the data. This gives flexibility in handling the data types that come from different data sources such as social networks, machine logs or CRM systems. 2. MongoDB supports indexing much like a traditional SQL database, which means you can index any field, and indexing a field with high fidelity improves query performance the most. (FYI: high fidelity here, more commonly called high cardinality, means a field whose value varies across most records. For example, if we are storing data about employees, the field that varies most is the phone number, not the city name or company name.) 3. Most of you may be familiar with the concept of database sharding. MongoDB is a horizontally scalable database and supports sharding, which means it stores data in smaller chunks on several data nodes for low-latency access to the data. Hence MongoDB is widely used in the cloud, because you can scale the database by adding shards as your data grows and maintain that low latency of data access even as the size of the data grows. 4. MongoDB is designed to be resilient for data durability and supports replica sets, which can be geographically distributed. 5. MongoDB supports map-reduce operations and provides fast updates to the data. FAQs: Question: When do you want to use Hadoop vs. MongoDB for map-reduce? Answer: You want to use Hadoop for batch jobs, where you can fire up analytics on offline data, whereas you can use MongoDB for real-time data analytics. Question: How does sharding work in MongoDB? Answer: MongoDB sharding works by spreading reads and writes across multiple data nodes. mongos, the MongoDB routing process, directs each read or write to the appropriate data node. (Show the slide and refer to the sharding diagram.)
  • 8. Since MongoDB scales very well horizontally, it is widely used in the cloud. Given the complexity of managing MongoDB to maintain availability, data durability and performance, you may want to leverage platforms that provide MongoDB-as-a-Service, where a single web service call provisions a dedicated MongoDB server, fully sharded and replicated, that scales automatically. You will get a chance to use the MongoDB service on our platform shortly.
  • 9. The specific MongoDB architecture that you choose will impact performance, availability and data durability. MongoDB is flexible and supports high-availability and sharding architectures to provide the level of redundancy, performance and SLA you want for your service. MongoDB supports two deployment architectures: replica sets and sharding. Replica sets provide high availability and data durability, while sharding provides scalability. You can configure shards on top of replica sets to achieve the best of both, reliability and scalability.
  • 10. This is a replica set with three replica nodes in two datacenters or two regions of a public cloud. Replication to the secondary nodes is asynchronous, so the replicas are “eventually consistent”, which means data on the replicas may at times be out of sync with the primary node. You may want to use this architecture for data redundancy rather than for scaling. In this architecture, you still send reads and writes to the primary node, which means that even with multiple nodes, your application wouldn’t necessarily scale better. To maintain this level of redundancy yet improve scalability, you can use sharding as in the next slide.
  • 11. This is a three-shard deployment architecture which uses three replica sets and can be in a single region or datacenter, or distributed geographically. With this architecture, you get the benefit of both: data redundancy from replica sets and high scalability from shards. Each shard is itself a replica set, which provides data redundancy at the level of each shard. But keep in mind that there is an overhead to sharding and replication, and you want to choose what’s best for your database.
  • 12. Now let’s take a look at a sample application. We have a sample Twitter app for hands-on experiments. We will use MongoDB-as-a-Service on the cloud and use the sample app to analyze Twitter data.
  • 13.
  • 14. Just like any database, the performance of a MongoDB database must be monitored and optimized for a given workload or application type. These are the key metrics you want to look at in MongoDB: (1) CPU, (2) memory, (3) ops counters: this is the total number of operations over a period of time, and it shows you how busy the database is, (4) background flush: this is the number of disk writes that occur when MongoDB flushes all in-memory data to disk. You want to keep an eye on this number and tune the configuration if you wish to reduce the frequency of disk writes. There are other metrics which we will see during our hands-on lab.
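
The ops counters and background flush metrics above are reported by the server as running totals, so a monitoring script usually compares two readings taken some time apart. The following is a minimal sketch of that arithmetic in Python, not the tooling used in the lab. The function names and the sample numbers are made up for illustration. The field names (`opcounters`, `backgroundFlushing`, `flushes`, `total_ms`) follow the output of MongoDB's `serverStatus` command, and the `backgroundFlushing` section is assumed to be present, which is only the case for the MMAPv1 storage engine.

```python
# Sketch: turn two serverStatus-style snapshots into monitoring numbers.
# EARLIER and LATER are hand-written sample values, not real server output.
# With pymongo, a real snapshot could be read with
# client.admin.command("serverStatus") (assumes a reachable server).

OP_TYPES = ("insert", "query", "update", "delete", "getmore", "command")


def ops_per_second(earlier, later, interval_seconds):
    """Return the per-second rate of each operation type between two snapshots.

    opcounters are cumulative since server start, so the rate is the
    difference between two readings divided by the time between them.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates = {}
    for op in OP_TYPES:
        delta = later["opcounters"].get(op, 0) - earlier["opcounters"].get(op, 0)
        # A negative delta means the counters were reset, e.g. by a restart.
        rates[op] = max(delta, 0) / interval_seconds
    return rates


def average_flush_ms(earlier, later):
    """Return the average duration in ms of the flushes between two snapshots.

    Returns None when no flush happened in the interval.
    """
    before = earlier["backgroundFlushing"]
    after = later["backgroundFlushing"]
    flushes = after["flushes"] - before["flushes"]
    if flushes <= 0:
        return None
    return (after["total_ms"] - before["total_ms"]) / flushes


# Two hypothetical snapshots taken 60 seconds apart.
EARLIER = {
    "opcounters": {"insert": 1000, "query": 5000, "update": 200,
                   "delete": 10, "getmore": 0, "command": 300},
    "backgroundFlushing": {"flushes": 100, "total_ms": 4000},
}
LATER = {
    "opcounters": {"insert": 1600, "query": 8000, "update": 260,
                   "delete": 10, "getmore": 0, "command": 420},
    "backgroundFlushing": {"flushes": 101, "total_ms": 4050},
}

if __name__ == "__main__":
    for op, rate in ops_per_second(EARLIER, LATER, 60).items():
        print(f"{op}: {rate:.1f} ops/s")
    print("average flush time (ms):", average_flush_ms(EARLIER, LATER))
```

Working with rates rather than raw totals makes readings comparable across servers with different uptimes. A flush time that grows from one interval to the next would suggest the disk is struggling to keep up with writes.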