Mongo
Philip.Zhong/Chen.Tao/Leaf.Zhu, 2014
Agenda
• What’s Mongo?
• Mongo Advantages & Limitations
• Mongo Case Studies
What’s Mongo?

http://www.mongodb.org/
$1,200,000,000 (2007-2013)
http://www.mongodb.com/
Red Hat (1993-2013)
$16.75 Billion
$30.0+ Billion
What’s Mongo?
• MongoDB (from "humongous") is an open-source document database and the leading NoSQL database. It is written in C++.
• The most SQL-like NoSQL database.
• Mongo is an open, schemaless, document-oriented NoSQL database with rich queries, high performance, high availability, high scalability, and high flexibility.
1. Document Data Model. Documents, BSON.
2. Rich Query Model. Full index support, various query types.
3. Idiomatic Drivers. Drivers for over 17 languages.
4. Horizontal Scalability. Easy to add capacity.
5. High Availability. HA, journaling, automatic recovery.
6. In-Memory Performance. Memory-mapped files, reads/writes in RAM.
7. Flexibility. Schema-free, multi-datacenter deployments, tunable consistency, widely used across many industries.
Data Model

• Maximum BSON document size: 16 MB
• Maximum nesting depth for BSON documents: 100 levels
• Atomic operations at the document level
Data Operation
Query
Query Type
1. Key-value
2. Range queries.
3. Text search: AND, OR, NOT, etc.
4. Aggregation: count, min, max, average, etc.
5. MapReduce
Cursor

• A query returns a cursor.
• Iterate the cursor to get the results.
• By default, the first batch returns 101 documents or about 1 MB of data; batchSize or limit overrides this, but a batch never exceeds 16 MB.
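The batching behaviour described above can be pictured with a small toy model. This is only a sketch in plain Python, not the driver's implementation: `iter_batches` and its parameters are hypothetical names, byte-size caps (1 MB / 16 MB) are not modeled, and when no batch size is given the remainder after the first 101 documents is delivered as one batch as a stand-in for size-capped batches.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional

DEFAULT_FIRST_BATCH = 101  # default first-batch document count


def iter_batches(docs: Iterable[dict],
                 batch_size: Optional[int] = None,
                 limit: Optional[int] = None) -> Iterator[List[dict]]:
    """Yield documents in batches, the way a client drains a cursor (toy model)."""
    it = iter(docs)
    if limit is not None:
        # limit caps the total number of documents returned
        it = islice(it, limit)
    if batch_size is not None:
        # an explicit batch size applies to every batch
        while True:
            batch = list(islice(it, batch_size))
            if not batch:
                return
            yield batch
    else:
        # default: 101 documents first, then the rest (size caps not modeled)
        first = list(islice(it, DEFAULT_FIRST_BATCH))
        if first:
            yield first
        rest = list(it)
        if rest:
            yield rest


docs = [{"n": i} for i in range(250)]
print([len(b) for b in iter_batches(docs)])                  # [101, 149]
print([len(b) for b in iter_batches(docs, batch_size=100)])  # [100, 100, 50]
```

The point of the sketch is that the client sees one logical result stream while the server hands it over in several round trips.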
Create/Update/Delete
Write Concern

• Errors Ignored
• Unacknowledged
• Acknowledged
• Journaled
Index

Index types:
1. Single Field Indexes
2. Compound Indexes
3. Array Indexes
4. Geospatial Indexes
5. Hashed Indexes
6. Text Search Indexes (V2.4, Beta)
Index properties:
1. Unique Indexes
2. Sparse Indexes
Index
• Each index takes at least 8 KB.
• Indexes slow down write operations, so they are expensive for collections with a high write-to-read ratio.
• Indexes benefit collections with a high read-to-write ratio.
• Indexes consume disk space and memory; track usage and plan capacity carefully.
Availability
RDBMS Replication
Mongo Replication
• Have up to 12 mongod instances
• Have a Primary member, which receives write requests
Mongo Failover
Secondary Hidden
Read Preference
Scalability
Basic Concepts
• Config Servers
– Exist in sets of three
– Maintain metadata
– Are mongod instances
• Shards
– Contain fractions of global data
– Are replica sets in production
– Can be queried directly by clients (not recommended)
• Replica Set
– A group of mongod processes
– Includes Primary and Secondaries
• Mongos
– Process APP requests
– Direct requests to shards
– Direct results to clients
– Exist as 1+
– Are mongos instances
– Cache metadata
Range Based Sharding
Hash Based Sharding
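A minimal sketch of the difference between the two strategies, in plain Python. The shard names, split points, and both functions are hypothetical. The hash variant uses a simple modulo over a SHA-256 digest, which is a simplification: MongoDB hashes the key and then range-partitions the hashed values into chunks.

```python
import bisect
import hashlib
from typing import List, Sequence

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def range_shard(key: int, split_points: Sequence[int], shards: List[str]) -> str:
    """Range-based: shard i owns keys in [split_points[i-1], split_points[i])."""
    return shards[bisect.bisect_right(split_points, key)]


def hash_shard(key, shards: List[str]) -> str:
    """Hash-based (simplified): hash the key, then pick a shard by modulo."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Adjacent keys stay together under range sharding ...
print([range_shard(k, [100, 200], SHARDS) for k in (101, 102, 103)])
# ... and are typically scattered under hash sharding.
print([hash_shard(k, SHARDS) for k in (101, 102, 103)])
```

Range-based sharding keeps range queries on few shards but can concentrate monotonically increasing keys on one shard; hash-based sharding spreads writes evenly at the cost of scatter/gather for range queries.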
Splitting and Balancing
Data Store As Service
Case Study
Schema Design
• Remember, "schemaless" doesn't mean you don't need to design your schema!
• Considerations to avoid the pitfalls of MongoDB schema design:
1. Avoid growing documents
3. Pay attention to BSON data types
5. Field names take up space
6. Consider using _id for your own purposes
7. Can you use covered indexes?
8. Use collections and databases to your advantage
• Test everything
• Schema design affects performance
• Schema design affects infrastructure: RAM > indexes + hot data = better performance
MongoDB for MDS – Sharding Strategy
• When do you need to shard?
– Your data set approaches or exceeds the storage capacity of a single MongoDB instance.
– The size of your system's active working set will soon exceed the capacity of your system's maximum RAM.
– A single MongoDB instance cannot meet the demands of your write operations, and all other approaches have not reduced contention.
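The three criteria above can be sketched as a simple check. Everything here is illustrative: `InstanceStats`, `reasons_to_shard`, and the 0.8 headroom threshold are assumptions made for this example, and the write-load test is only a rough proxy for "cannot meet the demands of your write operations".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceStats:
    """Hypothetical measurements for a single MongoDB instance."""
    data_gb: float          # current data set size
    storage_gb: float       # storage capacity of the instance
    working_set_gb: float   # active working set (indexes + hot data)
    ram_gb: float           # maximum RAM of the system
    write_ops: float        # observed writes per second
    max_write_ops: float    # sustainable writes per second


def reasons_to_shard(s: InstanceStats, headroom: float = 0.8) -> List[str]:
    """Return which of the three criteria are met (headroom is an assumed threshold)."""
    reasons = []
    if s.data_gb >= headroom * s.storage_gb:
        reasons.append("storage")
    if s.working_set_gb >= headroom * s.ram_gb:
        reasons.append("working set")
    if s.write_ops >= headroom * s.max_write_ops:
        reasons.append("write load")
    return reasons


stats = InstanceStats(data_gb=300, storage_gb=1000, working_set_gb=60,
                      ram_gb=64, write_ops=2000, max_write_ops=10000)
print(reasons_to_shard(stats))  # ['working set']
```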

• Considerations for sharding
– Multiple ways to model a domain problem
– Understand the key use cases of your app
– Balance ease of query against ease of write
– Random I/O should be avoided

• Meeting behavior and sharding considerations (from 10G)
– Schedule meeting: ~800K meeting writes/day
– ~20% instant meetings
– Scalability best practice: Don't scale by using replication. Scale by using local read nodes.
– Recommend implementing local writes to meet the JOIN meetings use case requirements
Cross DC latency Testing
Local vs Remote Write/Read Latency Test:
Scenario:
Create two shards, each a three-member replica set. Make sure that the Primary node of one runs in the local DC (SJ), whereas the Primary of the second runs in the remote DC (TX). Run a small number of writes from the local DC to the Replica1 Primary, and then run the same writes against the Replica2 Primary. Write concern = majority. Average object size is 1500 bytes. Ping time from the local DC (SJ) to the remote DC (TX) is 46 ms.

Local vs Remote Insert Tests (YCSB test):
Replication delay cross DC
• Replication lag between data centers:
• Scenario: In the local DC (SJ), where the replication Primary is running, insert 500 records at a time, up to a total of 550,000 records. Record the record count and the current timestamp at the end of every 500 insertions. This is a single-threaded operation: only one process inserts these records. In the remote DC (TX), where the third secondary is running (this node is the farthest of all the secondaries and so is not part of the initial write), call db.collection.count() in a loop and, whenever the count is a multiple of 500, record the count and the current timestamp. Use the data collected on the Primary and the remote secondary to compute the replication delay.
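The final step of the scenario, computing the delay from the two logs, could look like the sketch below. The function name and the log format (a list of `(count, timestamp_in_seconds)` pairs per node) are assumptions for illustration, and the result is only meaningful if the clocks in both data centers are synchronized.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, float]  # (record count, timestamp in seconds)


def replication_lag(primary_log: List[LogEntry],
                    secondary_log: List[LogEntry]) -> Dict[int, float]:
    """Map each record count to the delay between Primary and remote secondary."""
    written_at = dict(primary_log)
    lags: Dict[int, float] = {}
    for count, seen_at in secondary_log:
        # keep the first time the secondary reported this count
        if count in written_at and count not in lags:
            lags[count] = seen_at - written_at[count]
    return lags


# Made-up sample data, not measurements from the test.
primary = [(500, 10.0), (1000, 10.5), (1500, 11.0)]
secondary = [(500, 10.25), (1000, 11.0)]
lags = replication_lag(primary, secondary)
print(lags)                # {500: 0.25, 1000: 0.5}
print(max(lags.values()))  # 0.5
```

Counts the secondary never reported (1500 in the sample) are simply left out, since the polling loop may skip past a multiple of 500.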
MongoDB for MDS – Sharding
Goals:
- write to a shard primary node with physical proximity to the application server

- keep the shard primary node in close proximity to the application server [monitor the primary node of the replica set and if possible, restore the primary t
- reduce 'scatter/gather' on reads - use smart shard keys

Solution:

Add a geo-location based field in the schema, create a shard index based on that field, assign a tag to each shard and assign specific shard index field ra

e.g., Say we can add a 'DC' field into our collection. Assuming that the application somehow knows the data center it is running on, it can use this value for

Associate the tag ranges to specific tagged shard.
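A toy model of the tag-range idea, in plain Python rather than real cluster configuration. The shard names, tag names, and ranges below are hypothetical; only the 'DC' field and the SJ/TX data centers come from the slides. Ranges follow the usual convention of an inclusive lower bound and an exclusive upper bound.

```python
from typing import Dict, List, Optional, Tuple

# (lower bound inclusive, upper bound exclusive, tag) on the 'DC' field
TAG_RANGES: List[Tuple[str, str, str]] = [
    ("SJ", "SK", "sj"),
    ("TX", "TY", "tx"),
]

# which tag each shard carries (hypothetical shard names)
SHARD_TAGS: Dict[str, str] = {"shard_sj": "sj", "shard_tx": "tx"}


def tag_for(dc_value: str) -> Optional[str]:
    """Find the tag whose range contains the document's DC value."""
    for low, high, tag in TAG_RANGES:
        if low <= dc_value < high:
            return tag
    return None


def shard_for(doc: dict) -> Optional[str]:
    """Return the shard tagged for this document, or None if no range matches."""
    tag = tag_for(doc["DC"])
    if tag is None:
        return None
    for shard, shard_tag in SHARD_TAGS.items():
        if shard_tag == tag:
            return shard
    return None


print(shard_for({"DC": "SJ", "meeting": "m1"}))  # shard_sj
print(shard_for({"DC": "TX", "meeting": "m2"}))  # shard_tx
```

An application running in SJ that stamps its documents with DC = "SJ" therefore writes to the shard whose Primary is kept in SJ, which is the local-write goal stated above.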
Inferred Technical Requirements
1. MongoDB sharding (shard keys: region + siteId + userId, region + siteId + meetingUUID) to support 3 regions (US, EMEA, APAC)
2. Sharding by siteId + userId or siteId + meetingUUID allows hosts from the same company (siteId) and the same region to create meetings in different shards. If we need to scale horizontally, the shard config will add another shard for the same siteId.
3. Based on the shard keys, we can support the requirements for local writes and local reads.
4. Replication requirement: replicate 600,000 meetings/day within 15 minutes between 2 nodes (remark: early benchmarking shows 11M meetings' data replicated across 3 sites within 4 minutes)
5. Availability requirement: a primary node fails over to a secondary node within the same data center in <= 30 sec; a primary node fails over to a secondary in a different data center in <= 10 minutes
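MongoDB Use Cases
• BillRun billing system
– Ofer Cohen released BillRun, a next-generation open-source billing solution that uses MongoDB as its back-end store. The billing system already runs in production at Israel's fastest-growing mobile operator and processes more than 500M call detail records (CDRs) per month.
• Visual China (视觉中国)
– Stores comments, feeds, and full-text search data
– Problems:
– Failover did not work because the replica set was not configured correctly; it needs at least 1 primary + 2 secondaries + n arbiters.
– Out-of-memory errors caused crashes. Fix: add memory and use the correct driver (not a development build).
• Youku (优酷)
– Youku's online comment service has been partially migrated to MongoDB; operational data analysis and mining currently use Hadoop/HBase.
• Qihoo 360 (奇虎360)
– More than 100 million documents
– Problem: timeouts (data larger than memory, random reads and writes, chunk migration time)
– Solution: add memory (or even use SSDs), reduce space usage (schema refactoring), and adjust the balancer window to avoid peak hours
• Mailbox
– 100 million messages per day; stores email and related data in MongoDB
– https://tech.dropbox.com/2013/09/scaling-mongodb-at-mailbox/
– Lesson: write lock contention. Solution: move the hot collection to a standalone cluster; sharding
• Other
– Baidu Open Cloud, cloud database: its non-relational database uses MongoDB, and many small and mid-sized developers build on MongoDB
– Amazon E2: MongoDB as the back-end database, if the application data on it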
Q&A

Gen AI in Business - Global Trends Report 2024.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

MongoDB Knowledge Sharing

  • 2. Agenda • What’s Mongo? • Mongo Advantages & Limitations • Mongo Case Studies
  • 4. What’s Mongo? • MongoDB (from "humongous") is an open-source document database and the leading NoSQL database, written in C++. • The most SQL-like NoSQL database. • Mongo is an open, schemaless, document-oriented NoSQL database with rich queries, high performance, high availability, high scalability, and high flexibility.
  • 5. 1. Document Data Model: documents, BSON. 2. Rich Query Model: full index support, various query types. 3. Idiomatic Drivers: drivers for over 17 languages. 4. Horizontal Scalability: easy to add capacity. 5. High Availability: HA, journaling, automatic recovery. 6. In-Memory Performance: memory-mapped files, reads and writes served from RAM. 7. Flexibility: schema-free, multi-datacenter deployments, tunable consistency, widely used across many industries.
  • 7. Data Model • Maximum BSON document size: 16 MB • Maximum nesting depth for a BSON document: 100 levels • Atomic operations at the document level
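The nesting limit on this slide can be checked on the client before a write is attempted. The following is a minimal sketch in plain Python, not driver code: `nesting_depth` and `fits_nesting_limit` are hypothetical helper names, and counting every dict or list as one level is an assumption about how the limit is measured.

```python
MAX_NESTING = 100  # limit quoted on the slide


def nesting_depth(value):
    """Count how many levels of dicts/lists are nested in value (assumed counting rule)."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0  # scalars add no level


def fits_nesting_limit(doc, limit=MAX_NESTING):
    """Return True when the document stays within the nesting limit."""
    return nesting_depth(doc) <= limit


doc = {"user": {"address": {"geo": [1.0, 2.0]}}}
print(nesting_depth(doc), fits_nesting_limit(doc))
```

A similar guard for the 16 MB size limit would need the encoded BSON size, which this sketch does not compute.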
  • 10. Query Type 1. Key-value queries. 2. Range queries. 3. Text search: AND, OR, NOT, etc. 4. Aggregation: count, min, max, average, etc. 5. MapReduce.
  • 11. Cursor • A query returns a cursor • Iterate the cursor to get the results • The first batch returns 101 documents or less than 1 MB of data; batchSize or limit overrides this, but a batch never exceeds 16 MB
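To make the first-batch rule concrete, here is a small simulation in plain Python. It does not talk to a server: `first_batch` is a hypothetical function, the JSON-encoded length stands in for the BSON size, and the cut-off logic only follows the rule as stated on the slide.

```python
import json

FIRST_BATCH_DOCS = 101            # default document count from the slide
FIRST_BATCH_BYTES = 1024 * 1024   # 1 MB cap from the slide


def first_batch(docs, batch_size=None):
    """Return the documents a first batch would carry under the slide's rule."""
    limit = batch_size if batch_size is not None else FIRST_BATCH_DOCS
    batch, used = [], 0
    for doc in docs:
        size = len(json.dumps(doc).encode("utf-8"))  # stand-in for BSON size
        # Stop at the document limit, or when the next document would pass 1 MB.
        if len(batch) >= limit or (batch and used + size > FIRST_BATCH_BYTES):
            break
        batch.append(doc)
        used += size
    return batch


small_docs = [{"n": i} for i in range(500)]
print(len(first_batch(small_docs)))                 # capped by document count
print(len(first_batch(small_docs, batch_size=10)))  # capped by batchSize
```

Large documents hit the byte cap first, which is why a query over big documents can return far fewer than 101 results per round trip.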
  • 13. Write Concern • Errors Ignored • Unacknowledged • Acknowledged • Journaled
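The four levels on this slide correspond to the `w` and `j` write concern options documented for MongoDB 2.4. The sketch below only records that mapping as plain data; the dictionary name and the `waits_for_server` helper are hypothetical, and no driver call is made.

```python
# Write concern levels from the slide, expressed as w/j option values
# (per the MongoDB 2.4 write concern documentation).
WRITE_CONCERNS = {
    "errors_ignored": {"w": -1},           # no acknowledgement, errors suppressed
    "unacknowledged": {"w": 0},            # no acknowledgement from the server
    "acknowledged": {"w": 1},              # the server confirms the write
    "journaled": {"w": 1, "j": True},      # confirmed after the journal commit
}


def waits_for_server(level):
    """Return True when the client blocks until the server confirms the write."""
    return WRITE_CONCERNS[level]["w"] >= 1


for level in WRITE_CONCERNS:
    print(level, waits_for_server(level))
```

Moving down the list trades write latency for stronger durability guarantees.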
  • 14. Index • Index types: 1. Single Field Indexes 2. Compound Indexes 3. Array Indexes 4. Geospatial Indexes 5. Hashed Indexes 6. Text Search Indexes (V2.4, Beta) • Index properties: 1. Unique Indexes 2. Sparse Indexes
  • 15. Index • Each index takes at least 8 KB. • Indexes slow down write operations, so they are expensive for collections with a high write-to-read ratio. • They benefit collections with a high read-to-write ratio. • They consume disk space and memory, so track and plan them carefully.
  • 18. Mongo Replication • A replica set has up to 12 mongod instances • One primary member receives the write requests
  • 23. Basic Concepts • Shards: contain fractions of global data; are replica sets in production; can be queried directly by clients (not recommended) • Replica Set: a group of mongod processes; includes a Primary and Secondaries • Config Servers: exist in sets of three; maintain metadata; are mongod instances • Mongos: process APP requests; direct requests to shards; direct results to clients; exist as 1+; are mongos instances; cache metadata
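The routing role of mongos can be pictured as a range lookup over cached chunk metadata. The sketch below is a toy model under stated assumptions: the chunk table, shard names, and `target_shard` are all made up for illustration, and string keys stand in for real shard key values.

```python
from bisect import bisect_right

# Hypothetical cached metadata: sorted lower bounds of each chunk's key range
# and the shard that owns the chunk starting at that bound.
CHUNK_LOWER_BOUNDS = ["", "g", "n", "t"]
CHUNK_SHARDS = ["shard0", "shard1", "shard0", "shard2"]


def target_shard(key):
    """Find the shard whose chunk range [lower, next_lower) contains key."""
    index = bisect_right(CHUNK_LOWER_BOUNDS, key) - 1
    return CHUNK_SHARDS[index]


for key in ("alice", "mike", "zoe"):
    print(key, target_shard(key))
```

A query that includes the shard key can be sent to one shard this way; a query without it has to be sent to every shard and the results merged.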
  • 27. Data Store As Service
  • 29. Schema Design • Remember, "schemaless" doesn't mean you don't need to design your schema! • Considerations to avoid the pitfalls of MongoDB schema design: 1. Avoid growing documents 3. Pay attention to BSON data types 5. Field names take up space 6. Consider using _id for your own purposes 7. Can you use covered indexes? 8. Use collections and databases to your advantage • Test everything • Schema design affects performance • Schema design affects infrastructure: RAM > indexes + hot data = better performance
  • 30. MongoDB for MDS – Sharding Strategy • When do you need to shard? – Your data set approaches or exceeds the storage capacity of a single MongoDB instance. – The size of your system’s active working set will soon exceed your system’s maximum RAM. – A single MongoDB instance cannot meet the demands of your write operations, and all other approaches have not reduced contention. • Considerations for sharding – There are multiple ways to model a domain problem – Understand the key use cases of your app – Balance ease of query against ease of write – Avoid random I/O • Meeting behavior and sharding considerations (from 10G) – Schedule meeting: ~800K meeting writes/day – ~20% instant meetings – Scalability best practice: don’t scale by using replication; scale by using local read nodes. – Recommendation: implement local writes to meet the JOIN meetings use case requirements
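The three "when to shard" conditions on this slide can be written as a simple checklist. This is a sketch only: `InstanceStats`, `reasons_to_shard`, and the 0.8 headroom factor used to interpret "approaches" are assumptions, and the sample numbers are invented.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class InstanceStats:
    data_bytes: int
    storage_capacity_bytes: int
    working_set_bytes: int
    ram_bytes: int
    write_contention_unresolved: bool  # True if other tuning has not helped


def reasons_to_shard(stats, headroom=0.8):
    """Return the slide's sharding conditions that currently apply."""
    reasons = []
    if stats.data_bytes >= headroom * stats.storage_capacity_bytes:
        reasons.append("data set approaches storage capacity")
    if stats.working_set_bytes >= headroom * stats.ram_bytes:
        reasons.append("working set approaches RAM")
    if stats.write_contention_unresolved:
        reasons.append("write load exceeds a single instance")
    return reasons


stats = InstanceStats(900 * GIB, 1000 * GIB, 40 * GIB, 64 * GIB, False)
print(reasons_to_shard(stats))
```

An empty result suggests a single replica set is still sufficient under these assumptions.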
  • 31. Cross DC latency Testing • Local vs. remote write/read latency test. Scenario: create two shards, each a three-member replica set. Make sure the primary of one runs in the local DC (SJ), whereas the primary of the other runs in the remote DC (TX). Run a small number of writes from the local DC against the Replica1 primary, then run the same writes against the Replica2 primary. Write concern = majority. Average object size is 1500 bytes. Ping time from the local DC (SJ) to the remote DC (TX) is 46 ms. • Local vs. remote insert tests (YCSB test):
  • 32. Replication delay cross DC • Replication lag between data centers. Scenario: in the local DC (SJ), where the replication primary is running, insert 500 records at a time, up to a total of 550,000 records. Record the record count and the current timestamp at the end of every 500 insertions. Note that this is a single-threaded operation: only one process inserts these records. In the remote DC (TX), where the 3rd secondary is running (this node is the farthest of all the secondaries and so is not part of the initial write), keep calling db.collection.count() in a loop, and whenever the count is a multiple of 500, record the count and the current timestamp. Use the data collected on the primary and on the remote secondary to compute the replication delay.
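The final step, computing the delay from the two logs, could look like the sketch below. It assumes each log is a list of (record count, timestamp in milliseconds) pairs and that the clocks in both data centers are synchronized; the function name and the sample values are made up for illustration and are not measurements from the test.

```python
def replication_lag(primary_log, secondary_log):
    """Map each record count to (secondary timestamp - primary timestamp) in ms."""
    primary_times = dict(primary_log)
    lags = {}
    for count, seen_at in secondary_log:
        # Keep the first time the secondary reported this count.
        if count in primary_times and count not in lags:
            lags[count] = seen_at - primary_times[count]
    return lags


# Invented sample data: (record count, timestamp in ms).
primary_log = [(500, 1000), (1000, 1400), (1500, 1850)]
secondary_log = [(500, 1060), (1000, 1475), (1500, 1930)]

lags = replication_lag(primary_log, secondary_log)
print(lags)
print("max lag (ms):", max(lags.values()))
```

Because the secondary is polled in a loop, each lag value also includes the polling interval, so the result is an upper bound on the true delay.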
  • 33. MongoDB for MDS – Sharding • Goals: - write to a shard primary node with physical proximity to the application server - keep the shard primary node in close proximity to the application server [monitor the primary node of the replica set and if possible, restore the primary t - reduce 'scatter/gather' on reads - use smart shard keys • Solution: Add a geo-location based field to the schema, create a shard index based on that field, assign a tag to each shard and assign specific shard index field ra • e.g., say we add a 'DC' field to our collection. Assuming that the application somehow knows the data center it is running in, it can use this value for • Associate the tag ranges with a specific tagged shard. • Inferred Technical Requirements 1. MongoDB sharding (shard keys: region + siteId + userId, region + siteId + meetingUUID) to support 3 regions (US, EMEA, APAC) 2. Sharding by siteId + userId or siteId + meetingUUID allows hosts from the same company (siteId) and the same region to create meetings in different shards. If we need to scale horizontally, the shard config adds another shard for the same siteId 3. Based on the shard keys, we can support the requirements for local writes and local reads 4. Replication requirement: replicate 600,000 meetings/day within 15 minutes between 2 nodes (remark: early benchmarking shows 11M meetings' data replicated across 3 sites within 4 minutes) 5. Availability requirement: a primary node fails over to a secondary node within the same data center in <= 30 sec; a primary node fails over to a secondary in a different data center in <= 10 minutes
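The idea of a region-prefixed shard key with tagged shards can be sketched in plain Python. Everything named here is hypothetical (the tag names, shard names, and the `route` function), and spreading documents inside a region with a CRC32 hash is a simplification: MongoDB assigns key ranges to tagged shards rather than hashing.

```python
import zlib

# Hypothetical tags: one per region, each covering the shards in that region.
REGION_TAGS = {"US": "dc-us", "EMEA": "dc-emea", "APAC": "dc-apac"}
TAG_SHARDS = {
    "dc-us": ["shard-us-1", "shard-us-2"],
    "dc-emea": ["shard-emea-1"],
    "dc-apac": ["shard-apac-1"],
}


def shard_key(doc):
    """Build the compound shard key region + siteId + userId from the slide."""
    return (doc["region"], doc["siteId"], doc["userId"])


def route(doc):
    """Pick a shard inside the document's region (simplified, hash-based)."""
    region, site_id, user_id = shard_key(doc)
    shards = TAG_SHARDS[REGION_TAGS[region]]
    # A stable hash spreads hosts of the same site across the region's shards.
    index = zlib.crc32(f"{site_id}:{user_id}".encode("utf-8")) % len(shards)
    return shards[index]


meeting_host = {"region": "US", "siteId": 42, "userId": "u-1001"}
print(route(meeting_host))
```

Because the region is the leading key field, writes from an application server in one data center stay on shards tagged for that region, which is the local-write goal listed on the slide.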
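  • 34. MongoDB Use Cases • BillRun billing system: Ofer Cohen released BillRun, a next-generation open-source billing solution that uses MongoDB as its back-end store. The billing system already runs in production at Israel's fastest-growing mobile operator and processes more than 500M call detail records (CDRs) per month. • Visual China Group: stores comments/feed/full text search. Problems: fail-over did not work because the replica set was not configured correctly (at least 1 primary + 2 secondary + n arbiter are needed); out-of-memory errors caused crashes. Fix: add memory and use the correct (non-development) driver. • Youku: Youku's online comment service has been partially migrated to MongoDB; operational data analysis and mining currently use Hadoop/HBase. • Qihoo 360: Document > 100 Million. Problem: time outs (data larger than memory, random reads and writes, chunk-moving time). Solution: add memory (or even use SSDs), save space (schema refactor), and adjust the balancer's working hours to avoid peak times. • Mailbox: 100 Million Messages Per Day, store email and related data by MongoDB https://tech.dropbox.com/2013/09/scaling-mongodb-at-mailbox/ Lesson: write lock contention. Solution: separate hot collection to standalone cluster, sharding. • Other: Baidu Open Cloud (cloud database): its non-relational database uses MongoDB, and many small and medium-sized developers build on MongoDB. Amazon E2: MongoDB as the back-end database, if the applications on it data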
  • 35. Q&A

Editor's Notes

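  1. MongoDB originated in 2007 as a project at 10gen. The goal of the project was to build a PaaS platform similar to Google App Engine that would manage the hardware and software infrastructure automatically and let developers concentrate on program design. This also took away much of the developers' autonomy, and the response was lukewarm. The original PaaS platform consisted of an application service and a database; the team found that people were more interested in the database, so they focused on the database part, which is today's MongoDB. Dwight Merriman & Kevin Ryan. MongoDB became a rising star among big-data startups in 2013. Founded in 2007, the company recently raised US$231 million in funding and thereby became the first open-source startup worth more than US$1 billion. The industry currently values the company at as much as US$1.2 billion.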
  2. To support hash-based sharding, MongoDB provides a hashed index type. The sparse property of an index ensures that the index only contains entries for documents that have the indexed field.