SlideShare uma empresa Scribd logo
1 de 48
Baixar para ler offline
2 App Designs
Using MongoDB
Kyle Banker
kyle@10gen.com
@hwaet
What we'll cover:
Brief intro

Feed reader

Website analyitcs

Questions
Intro
MongoDB
Rich data model

Replication and automatic failover

Sharding
Use cases
Social networking and geolocation

E-commerce

CMS

Analytics

Production deployments: http://is.gd/U0VkG6
Demo
Feed reader
Four collections
Users

Feeds

Entries

Buckets
Users
{ _id: ObjectId("4e316f236ce9cca7ef17d59d"),
  username: 'kbanker',
    feeds: [
        { _id: ObjectId("4e316f236ce9cca7ef17d54c"),
          name: "GigaOM" },

          { _id: ObjectId("4e316f236ce9cca7ef17d54d"),
            name: "Salon.com" }

          { _id: ObjectId("4e316f236ce9cca7ef17d54e"),
            name: "Foo News" }
     ],

    latest: Date(2011, 7, 25)
}
Index
db.users.ensureIndex( { username: 1 }, { unique: true } )
Feeds
{ _id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
  url: 'http://news.foo.com/feed.xml/',
  name: 'Foo News',
  subscriber_count: 2,
  latest: Date(2011, 7, 25)
}


                 Index
db.feeds.ensureIndex( { url: 1 }, { unique: true
Adding a feed subscription
// Upsert
db.feeds.update(
  { url: 'http://news.foo.com/feed.xml/', name: 'Foo News'},
  { $inc: {subscriber_count: 1} },
   true )
Adding a feed subscription
// Add this feed to user feed list
db.users.update( {_id: ObjectId("4e316f236ce9cca7ef17d59d") },
  { $addToSet: { feeds:
             { _id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
               name: 'Foo News'
             }
  }
)
Removing a feed
            subscription
db.users.update(
  { _id: ObjectId("4e316f236ce9cca7ef17d59d") },

 { $pull: { feeds:
            { _id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
              name: 'Foo News'
            }
 } )
Removing a feed
            subscription
db.feeds.update( {url: 'http://news.foo.com/feed.xml/'},
  { $inc: {subscriber_count: -1} } )
Entries (populate in
          background)
{ _id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
  feed_id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
  title: 'Important person to resign',
  body: 'A person deemed very important has decided...',
  reads: 5,
  date: Date(2011, 7, 27)
}
What we need now
Populate personal feeds (buckets)

Avoid lots of expensive queries

Record what's been read
Without bucketing
// Naive query runs every time
db.entries.find(
  feed_id: {$in: user_feed_ids}
).sort({date: 1}).limit(25)
With bucketing
// A bit smarter: only runs once
entries = db.entries.find(
  { date: {$gt: user_latest },
    feed_id: { $in: user_feed_ids }
 ).sort({date: 1})
Index
db.entries.ensureIndex( { date: 1, feed_id: 1} )
bucket = {
  _id: ObjectId("4e3185c26ce9cca7ef17d552"),
  user_id: ObjectId("4e316f236ce9cca7ef17d59d"),
  date: Date( 2011, 7, 27 )
  n: 100,

    entries: [
      { _id: ObjectId("4e316b8a6ce9cca7ef17d5ac"),
         feed_id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
         title: 'Important person to resign',
         body: 'A person deemed very important has decided...',
         date: Date(2011, 7, 27),
         read: false
      },
        { _id: ObjectId("4e316b8a6ce9cca7ef17d5c8"),
          feed_id: ObjectId("4e316b8a6ce9cca7ef17d54b"),
          title: 'Panda bear waves hello',
          body: 'A panda bear at the local zoo...',
          date: Date(2011, 7, 27),
          read: false
        }
    ]
}
db.buckets.insert( buckets )

db.users.update(
  { _id: ObjectId("4e316f236ce9cca7ef17d59d") }
  { $set: { latest: latest_entry_date } }
)
Viewing a personal feed
// Newest
db.buckets.find(
  { user_id: ObjectId("4e316f236ce9cca7ef17d59d") }
).sort({date: -1}).limit(1)
Viewing a personal feed
// Next newest (that's how we paginate)
db.buckets.find(
  { user_id: ObjectId("4e316f236ce9cca7ef17d59d"),
    date: { $lt: previous_reader_date }
  }
).sort({date: -1}).limit(1)
Index
db.buckets.ensureIndex( { user_id: 1, date: -1 } )
Marking a feed item
db.buckets.update(
  { user_id: ObjectId("4e316f236ce9cca7ef17d59d"),
    'entries.id': ObjectId("4e316b8a6ce9cca7ef17d5c8")},

    { $set: { 'entries.$.read' : true }
)
Marking a feed item
db.entries.update(
  { _id: ObjectId("4e316b8a6ce9cca7ef17d5c8") },

    { $inc: { read: 1 } }
)
Sharding note:
Buckets collection is eminently shardable

Shard key: { user_id: 1, date: 1 }
Website
analyitcs
Challenges:
Real-time reporting.

Efficient use of space.

Easily wipe unneeded data.
Techniques
Pre-aggregate the data.

Pre-construct document structure.

Store emphemeral data in a separate
database.
Two collections:
Each collection gets its own database.

Collections names are time-scoped.

Clean, fast removal of old data.
// Collections holding totals for each day, stored
// in a database per month
days_2011_5
days_2011_6
days_2011_7
...

// Totals for each month...
months_2011_1_4
months_2011_5_8
months_2011_9_12
...
Hours and minutes
{ _id: { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"),
         day: '2011-5-1'
       },
  total: 2820,
  hrs: { 0: 500,
         1: 700,
         2: 450,
         3: 343,
         // ... 4-23 go here
         }
   // Minutes are rolling. This gives real-time
   // numbers for the last hour. So when you increment
   // minute n, you need to $set minute n-1 to 0.
   mins: { 1: 12,
           2: 10,
           3: 5,
           4: 34
           // ... 5-60 go here
         }
}
Updating day
document...
//   Update 'days' collection at 5:37 p.m. on 2011-5-1
//   Might want to queue increments so that $inc is greater
//   than 1 for each write...
id   = { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"),
         day: '2011-5-1'
       };

update = { $inc: { total: 1, 'hrs.17': 1, 'mins.37': 1 },
           $set: { 'mins.36': 0} };

// Update collection containing days 1-5
db.days_2011_5.update( id, update );
Months and days
{ _id: { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"),
         month: '2011-5'
       },
  total: 34173,
  months: { 1: 4000,
            2: 4300,
            3: 4200,
            4: 5000,
            5: 5100,
            6: 5700,
            7: 5873
            // ... 8-12 go here
         }
}
Updating month
 document...
// Update 'months' collection at 5:37 p.m. on 2011-5-1
query = { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"),
          month: '2011-5'
        };

update = { $inc: { total: 1, 'months.5': 1 } };

// Update collection containing days 1-5
db.month_2011_1_4.update( query, update );
Reporting
Must provide data at multiple resolutions
(second, minute, etc.).

We have the raw materials for that.

Application assembles the data
intelligently.
Summary
Feed Reader
Rich documents

Incremental modifiers

Bucketing strategy
Website Analytics
Pre-aggregate data

Time-scoped databases and collections
http://www.10gen.com/presentations
http://manning.com/banker
Thank you

Mais conteúdo relacionado

Mais procurados

Google cloud datastore driver for Google Apps Script DB abstraction
Google cloud datastore driver for Google Apps Script DB abstractionGoogle cloud datastore driver for Google Apps Script DB abstraction
Google cloud datastore driver for Google Apps Script DB abstractionBruce McPherson
 
Hitchhiker's guide to the win8
Hitchhiker's guide to the win8Hitchhiker's guide to the win8
Hitchhiker's guide to the win8Jua Alice Kim
 
Tools and Projects Dec 2018 Edition
Tools and Projects Dec 2018 EditionTools and Projects Dec 2018 Edition
Tools and Projects Dec 2018 EditionJesus Manuel Olivas
 
Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...
Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...
Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...Codemotion
 
Starting with MongoDB
Starting with MongoDBStarting with MongoDB
Starting with MongoDBDoThinger
 
Mongo db query docuement
Mongo db query docuementMongo db query docuement
Mongo db query docuementzarigatongy
 
Eventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDBEventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDBJacopo Nardiello
 
Web client security
Web client securityWeb client security
Web client securityZiv Birer
 
Operational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarOperational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarMongoDB
 
Server Side Events
Server Side EventsServer Side Events
Server Side Eventsthepilif
 
Creating, Updating and Deleting Document in MongoDB
Creating, Updating and Deleting Document in MongoDBCreating, Updating and Deleting Document in MongoDB
Creating, Updating and Deleting Document in MongoDBWildan Maulana
 
2011 Mongo FR - Indexing in MongoDB
2011 Mongo FR - Indexing in MongoDB2011 Mongo FR - Indexing in MongoDB
2011 Mongo FR - Indexing in MongoDBantoinegirbal
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBantoinegirbal
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBNosh Petigara
 

Mais procurados (20)

Google cloud datastore driver for Google Apps Script DB abstraction
Google cloud datastore driver for Google Apps Script DB abstractionGoogle cloud datastore driver for Google Apps Script DB abstraction
Google cloud datastore driver for Google Apps Script DB abstraction
 
Hitchhiker's guide to the win8
Hitchhiker's guide to the win8Hitchhiker's guide to the win8
Hitchhiker's guide to the win8
 
Goa tutorial
Goa tutorialGoa tutorial
Goa tutorial
 
Tools and Projects Dec 2018 Edition
Tools and Projects Dec 2018 EditionTools and Projects Dec 2018 Edition
Tools and Projects Dec 2018 Edition
 
Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...
Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...
Sebastian Schmidt, Rachel Myers - How To Go Serverless And Not Violate The GD...
 
Beyond the page
Beyond the pageBeyond the page
Beyond the page
 
Starting with MongoDB
Starting with MongoDBStarting with MongoDB
Starting with MongoDB
 
Git as NoSQL
Git as NoSQLGit as NoSQL
Git as NoSQL
 
Mongo db query docuement
Mongo db query docuementMongo db query docuement
Mongo db query docuement
 
Eventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDBEventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDB
 
Web client security
Web client securityWeb client security
Web client security
 
Operational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarOperational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB Webinar
 
Server Side Events
Server Side EventsServer Side Events
Server Side Events
 
Creating, Updating and Deleting Document in MongoDB
Creating, Updating and Deleting Document in MongoDBCreating, Updating and Deleting Document in MongoDB
Creating, Updating and Deleting Document in MongoDB
 
2011 Mongo FR - Indexing in MongoDB
2011 Mongo FR - Indexing in MongoDB2011 Mongo FR - Indexing in MongoDB
2011 Mongo FR - Indexing in MongoDB
 
MongoDB 3.2 - Analytics
MongoDB 3.2  - AnalyticsMongoDB 3.2  - Analytics
MongoDB 3.2 - Analytics
 
Diving into php
Diving into phpDiving into php
Diving into php
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Indexed db
Indexed dbIndexed db
Indexed db
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 

Semelhante a Getting Started with MongoDB: 4 Application Designs

Graph db as metastore
Graph db as metastoreGraph db as metastore
Graph db as metastoreHaris Khan
 
Context Information Management in IoT enabled smart systems - the basics
Context Information Management in IoT enabled smart systems - the basicsContext Information Management in IoT enabled smart systems - the basics
Context Information Management in IoT enabled smart systems - the basicsFernando Lopez Aguilar
 
Javascript Application Architecture with Backbone.JS
Javascript Application Architecture with Backbone.JSJavascript Application Architecture with Backbone.JS
Javascript Application Architecture with Backbone.JSMin Ming Lo
 
Marc s01 e02-crud-database
Marc s01 e02-crud-databaseMarc s01 e02-crud-database
Marc s01 e02-crud-databaseMongoDB
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...MongoDB
 
Taking Web Apps Offline
Taking Web Apps OfflineTaking Web Apps Offline
Taking Web Apps OfflinePedro Morais
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring DataEric Bottard
 
MongoDB-MFF.pptx
MongoDB-MFF.pptxMongoDB-MFF.pptx
MongoDB-MFF.pptxYellowGem
 
171_74_216_Module_5-Non_relational_database_-mongodb.pptx
171_74_216_Module_5-Non_relational_database_-mongodb.pptx171_74_216_Module_5-Non_relational_database_-mongodb.pptx
171_74_216_Module_5-Non_relational_database_-mongodb.pptxsukrithlal008
 
How to Bring Common UI Patterns to ADF
How to Bring Common UI Patterns to ADF How to Bring Common UI Patterns to ADF
How to Bring Common UI Patterns to ADF Luc Bors
 
Understanding backbonejs
Understanding backbonejsUnderstanding backbonejs
Understanding backbonejsNick Lee
 
Session #5 content providers
Session #5  content providersSession #5  content providers
Session #5 content providersVitali Pekelis
 
Local data storage for mobile apps
Local data storage for mobile appsLocal data storage for mobile apps
Local data storage for mobile appsIvano Malavolta
 
Superficial mongo db
Superficial mongo dbSuperficial mongo db
Superficial mongo dbDaeMyung Kang
 
Elasticsearch an overview
Elasticsearch   an overviewElasticsearch   an overview
Elasticsearch an overviewAmit Juneja
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling rogerbodamer
 

Semelhante a Getting Started with MongoDB: 4 Application Designs (20)

ABCD firebase
ABCD firebaseABCD firebase
ABCD firebase
 
Graph db as metastore
Graph db as metastoreGraph db as metastore
Graph db as metastore
 
Context Information Management in IoT enabled smart systems - the basics
Context Information Management in IoT enabled smart systems - the basicsContext Information Management in IoT enabled smart systems - the basics
Context Information Management in IoT enabled smart systems - the basics
 
Javascript Application Architecture with Backbone.JS
Javascript Application Architecture with Backbone.JSJavascript Application Architecture with Backbone.JS
Javascript Application Architecture with Backbone.JS
 
Platform BigData 2.pptx
Platform BigData 2.pptxPlatform BigData 2.pptx
Platform BigData 2.pptx
 
Marc s01 e02-crud-database
Marc s01 e02-crud-databaseMarc s01 e02-crud-database
Marc s01 e02-crud-database
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
 
Taking Web Apps Offline
Taking Web Apps OfflineTaking Web Apps Offline
Taking Web Apps Offline
 
Advance Mobile Application Development class 05
Advance Mobile Application Development class 05Advance Mobile Application Development class 05
Advance Mobile Application Development class 05
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring Data
 
MongoDB-MFF.pptx
MongoDB-MFF.pptxMongoDB-MFF.pptx
MongoDB-MFF.pptx
 
How te bring common UI patterns to ADF
How te bring common UI patterns to ADFHow te bring common UI patterns to ADF
How te bring common UI patterns to ADF
 
171_74_216_Module_5-Non_relational_database_-mongodb.pptx
171_74_216_Module_5-Non_relational_database_-mongodb.pptx171_74_216_Module_5-Non_relational_database_-mongodb.pptx
171_74_216_Module_5-Non_relational_database_-mongodb.pptx
 
How to Bring Common UI Patterns to ADF
How to Bring Common UI Patterns to ADF How to Bring Common UI Patterns to ADF
How to Bring Common UI Patterns to ADF
 
Understanding backbonejs
Understanding backbonejsUnderstanding backbonejs
Understanding backbonejs
 
Session #5 content providers
Session #5  content providersSession #5  content providers
Session #5 content providers
 
Local data storage for mobile apps
Local data storage for mobile appsLocal data storage for mobile apps
Local data storage for mobile apps
 
Superficial mongo db
Superficial mongo dbSuperficial mongo db
Superficial mongo db
 
Elasticsearch an overview
Elasticsearch   an overviewElasticsearch   an overview
Elasticsearch an overview
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling
 

Mais de DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for YouDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

Mais de DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Último

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Último (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

Getting Started with MongoDB: 4 Application Designs

  • 4. What we'll cover: Brief intro Feed reader Website analyitcs Questions
  • 6. MongoDB Rich data model Replication and automatic failover Sharding
  • 7. Use cases Social networking and geolocation E-commerce CMS Analytics Production deployments: http://is.gd/U0VkG6
  • 11. Users { _id: ObjectId("4e316f236ce9cca7ef17d59d"), username: 'kbanker', feeds: [ { _id: ObjectId("4e316f236ce9cca7ef17d54c"), name: "GigaOM" }, { _id: ObjectId("4e316f236ce9cca7ef17d54d"), name: "Salon.com" } { _id: ObjectId("4e316f236ce9cca7ef17d54e"), name: "Foo News" } ], latest: Date(2011, 7, 25) }
  • 12. Index db.users.ensureIndex( { username: 1 }, { unique: true } )
  • 13. Feeds { _id: ObjectId("4e316b8a6ce9cca7ef17d54b"), url: 'http://news.foo.com/feed.xml/', name: 'Foo News', subscriber_count: 2, latest: Date(2011, 7, 25) } Index db.feeds.ensureIndex( { url: 1 }, { unique: true
  • 14. Adding a feed subscription // Upsert db.feeds.update( { url: 'http://news.foo.com/feed.xml/', name: 'Foo News'}, { $inc: {subscriber_count: 1} }, true )
  • 15. Adding a feed subscription // Add this feed to user feed list db.users.update( {_id: ObjectId("4e316f236ce9cca7ef17d59d") }, { $addToSet: { feeds: { _id: ObjectId("4e316b8a6ce9cca7ef17d54b"), name: 'Foo News' } } )
  • 16. Removing a feed subscription db.users.update( { _id: ObjectId("4e316f236ce9cca7ef17d59d") }, { $pull: { feeds: { _id: ObjectId("4e316b8a6ce9cca7ef17d54b"), name: 'Foo News' } } )
  • 17. Removing a feed subscription db.feeds.update( {url: 'http://news.foo.com/feed.xml/'}, { $inc: {subscriber_count: -1} } )
  • 18. Entries (populate in background) { _id: ObjectId("4e316b8a6ce9cca7ef17d54b"), feed_id: ObjectId("4e316b8a6ce9cca7ef17d54b"), title: 'Important person to resign', body: 'A person deemed very important has decided...', reads: 5, date: Date(2011, 7, 27) }
  • 19. What we need now Populate personal feeds (buckets) Avoid lots of expensive queries Record what's been read
  • 20. Without bucketing // Naive query runs every time db.entries.find( feed_id: {$in: user_feed_ids} ).sort({date: 1}).limit(25)
  • 21. With bucketing // A bit smarter: only runs once entries = db.entries.find( { date: {$gt: user_latest }, feed_id: { $in: user_feed_ids } ).sort({date: 1})
  • 23. bucket = { _id: ObjectId("4e3185c26ce9cca7ef17d552"), user_id: ObjectId("4e316f236ce9cca7ef17d59d"), date: Date( 2011, 7, 27 ) n: 100, entries: [ { _id: ObjectId("4e316b8a6ce9cca7ef17d5ac"), feed_id: ObjectId("4e316b8a6ce9cca7ef17d54b"), title: 'Important person to resign', body: 'A person deemed very important has decided...', date: Date(2011, 7, 27), read: false }, { _id: ObjectId("4e316b8a6ce9cca7ef17d5c8"), feed_id: ObjectId("4e316b8a6ce9cca7ef17d54b"), title: 'Panda bear waves hello', body: 'A panda bear at the local zoo...', date: Date(2011, 7, 27), read: false } ] }
  • 24. db.buckets.insert( buckets ) db.users.update( { _id: ObjectId("4e316f236ce9cca7ef17d59d") } { $set: { latest: latest_entry_date } } )
  • 25. Viewing a personal feed // Newest db.buckets.find( { user_id: ObjectId("4e316f236ce9cca7ef17d59d") } ).sort({date: -1}).limit(1)
  • 26. Viewing a personal feed // Next newest (that's how we paginate) db.buckets.find( { user_id: ObjectId("4e316f236ce9cca7ef17d59d"), date: { $lt: previous_reader_date } } ).sort({date: -1}).limit(1)
  • 28. Marking a feed item db.buckets.update( { user_id: ObjectId("4e316f236ce9cca7ef17d59d"), 'entries.id': ObjectId("4e316b8a6ce9cca7ef17d5c8")}, { $set: { 'entries.$.read' : true } )
  • 29. Marking a feed item db.entries.update( { _id: ObjectId("4e316b8a6ce9cca7ef17d5c8") }, { $inc: { read: 1 } } )
  • 30. Sharding note: Buckets collection is eminently shardable Shard key: { user_id: 1, date: 1 }
  • 32. Challenges: Real-time reporting. Efficient use of space. Easily wipe unneeded data.
  • 33. Techniques Pre-aggregate the data. Pre-construct document structure. Store emphemeral data in a separate database.
  • 34. Two collections: Each collection gets its own database. Collections names are time-scoped. Clean, fast removal of old data.
  • 35. // Collections holding totals for each day, stored // in a database per month days_2011_5 days_2011_6 days_2011_7 ... // Totals for each month... months_2011_1_4 months_2011_5_8 months_2011_9_12 ...
  • 36. Hours and minutes { _id: { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"), day: '2011-5-1' }, total: 2820, hrs: { 0: 500, 1: 700, 2: 450, 3: 343, // ... 4-23 go here } // Minutes are rolling. This gives real-time // numbers for the last hour. So when you increment // minute n, you need to $set minute n-1 to 0. mins: { 1: 12, 2: 10, 3: 5, 4: 34 // ... 5-60 go here } }
  • 38. // Update 'days' collection at 5:37 p.m. on 2011-5-1 // Might want to queue increments so that $inc is greater // than 1 for each write... id = { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"), day: '2011-5-1' }; update = { $inc: { total: 1, 'hrs.17': 1, 'mins.37': 1 }, $set: { 'mins.36': 0} }; // Update collection containing days 1-5 db.days_2011_5.update( id, update );
  • 39. Months and days { _id: { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"), month: '2011-5' }, total: 34173, months: { 1: 4000, 2: 4300, 3: 4200, 4: 5000, 5: 5100, 6: 5700, 7: 5873 // ... 8-12 go here } }
  • 41. // Update 'months' collection at 5:37 p.m. on 2011-5-1 query = { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"), month: '2011-5' }; update = { $inc: { total: 1, 'months.5': 1 } }; // Update collection containing days 1-5 db.month_2011_1_4.update( query, update );
  • 42. Reporting Must provide data at multiple resolutions (second, minute, etc.). We have the raw materials for that. Application assembles the data intelligently.
  • 44. Feed Reader Rich documents Incremental modifiers Bucketing strategy