#MDBlocal
Alex Komyagin
Senior Consulting Engineer
MongoDB
OCTOBER 12, 2017 | BESPOKE | SAN FRANCISCO
Who stole my write?
Or the story of Write Concern and Read Concern
WHAT ARE WE GOING TO LEARN TODAY?
• What those things are - Write Concern and Read Concern
• What you can do with them
• What you should do with them
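TYPICAL WRITE WORKFLOW
[Diagram: The App sends {name:"Alex"} to the Primary and receives {ok:1}. The Primary is drawn with its journal, its in-memory structures and oplog, and its data files, next to two Secondaries. The steps of the write are numbered 1 to 7.]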
WANT TO SEE A MAGIC TRICK?
WRITE SOME DATA
[Diagram: The App writes {x:1},...,{x:99} to the Primary and receives {ok:1}. The Primary and both Secondaries hold {x:1} … {x:99}.]
WRITE SOME MORE
[Diagram: The App writes {x:100} to the Primary and receives {ok:1}. The Primary holds {x:1} … {x:100}; both Secondaries still hold only {x:1} … {x:99}.]
OOOPSIE!
[Diagram: same state as on the previous slide: the App has its {ok:1} for {x:100}, but only the Primary holds that document.]
KEEP WRITING
[Diagram: two nodes are now labeled Primary. The App writes {x:101} and receives {ok:1}. The old Primary still holds {x:1} … {x:100}; the new Primary and the remaining Secondary hold {x:1} … {x:99} and {x:101}.]
THE OLD PRIMARY COMES BACK ONLINE
[Diagram: the returning node is labeled "???". It holds {x:1} … {x:100}, while the current Primary and the Secondary hold {x:1} … {x:99} and {x:101}.]
HE HAS TO FIX HIS STATE TO RESUME REPLICATION
[Diagram: the returning node is labeled ROLLBACK. It holds {x:1} … {x:100}; the Primary and the Secondary hold {x:1} … {x:99} and {x:101}. A rollback file is shown at <dbpath>/rollback/<...>.bson.]
{x:99} is the last common point
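The rollback step can be sketched in a few lines of Python. This is a toy model under stated assumptions, not MongoDB's actual rollback algorithm: each oplog is a plain list of documents, and the function name `roll_back` and its return shape are made up for illustration.

```python
def roll_back(stale_oplog, current_oplog):
    """Toy rollback: make stale_oplog agree with current_oplog.

    Returns (common_point, rolled_back, repaired_oplog).
    Assumption: entries are comparable and the two lists share a prefix.
    """
    # Walk forward while both nodes agree; the last agreeing entry
    # is the "last common point".
    common = 0
    while (common < len(stale_oplog)
           and common < len(current_oplog)
           and stale_oplog[common] == current_oplog[common]):
        common += 1

    # Entries past the common point exist only on the stale node.
    # MongoDB sets them aside under <dbpath>/rollback/; here we return them.
    rolled_back = stale_oplog[common:]

    # The node then applies what it missed from the current primary.
    repaired = stale_oplog[:common] + current_oplog[common:]
    common_point = stale_oplog[common - 1] if common else None
    return common_point, rolled_back, repaired


old_primary = [{"x": i} for i in range(1, 101)]                 # {x:1} … {x:100}
new_primary = [{"x": i} for i in range(1, 100)] + [{"x": 101}]  # {x:1} … {x:99}, {x:101}

common_point, rolled_back, repaired = roll_back(old_primary, new_primary)
print(common_point)   # {'x': 99}
print(rolled_back)    # [{'x': 100}]
print(repaired[-2:])  # [{'x': 99}, {'x': 101}]
```

The rolled-back list plays the role of the .bson file: the document still exists, but no node serves it to the app any more.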
…AND THINGS ARE BACK TO NORMAL
[Diagram: one Primary and two Secondaries, all holding {x:1} … {x:99} and {x:101}. The rollback file <dbpath>/rollback/<...>.bson is shown next to the former Primary.]
The {x:100} write is not lost per se, but it is no longer accessible to the app
Rollback is unavoidable, but it is not a problem in itself: it works like self-healing
SO WHERE WAS THE PROBLEM?
[Diagram: the state right after the {x:100} write. The App has received {ok:1}; the Primary holds {x:1} … {x:100}, both Secondaries hold {x:1} … {x:99}.]
The App got the “OK” before the write was replicated to any of the secondaries
Solution: write receipt
WRITE CONCERN
• Form of an intelligent receipt/confirmation that the write operation
was replicated to the desired number of nodes
• Default number is 1
• Allows us to express how concerned we are with durability of a
particular write in a replica set
• Can be set for individual ops / collections / etc
• NOT a distributed transaction
db.test.insert({x:100},{writeConcern:{w:2}})
HOW DOES IT WORK?
• Different levels
- {w:<N>}
- {w:<N>, j:true}: the journal requirement includes secondaries since 3.2
- {w:"majority"}: implies {j:true} in MongoDB 3.2+ and guarantees that confirmed operations won’t be rolled back
• Supports timeout
- {w:2, wtimeout:100}
- A timeout doesn’t imply a write failure: you just get no receipt
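The levels and the timeout rule above can be sketched as a small Python model. It is a simplification under stated assumptions: every member is treated as a data-bearing voting node, and the names `acks_required` and `wait_for_write_concern` are hypothetical. A real server reports a timeout as a write concern error; the sketch uses `None` to stand for "no receipt".

```python
def acks_required(w, members):
    """Number of nodes that must confirm a write (toy model).

    Assumption: all `members` are data-bearing voting nodes.
    """
    if w == "majority":
        return members // 2 + 1
    return w


def wait_for_write_concern(w, confirmations, members=3):
    """Return a receipt, or None if wtimeout expired first.

    confirmations: how many nodes (primary included) confirmed the
    write before the timeout.
    """
    if confirmations >= acks_required(w, members):
        return {"ok": 1}
    # No receipt. The write is NOT undone: it stays on the nodes that
    # applied it and may still replicate, or be rolled back, later.
    return None


print(wait_for_write_concern(1, confirmations=1))           # {'ok': 1}
print(wait_for_write_concern(2, confirmations=1))           # None
print(wait_for_write_concern("majority", confirmations=2))  # {'ok': 1}
```

The second call is the interesting one: the primary has the write, the app has no receipt, and the app cannot tell from this alone whether the write will survive.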
WRITE CONCERN TRADEOFFS
• Choose {w:"majority"} for writes that matter
• The main tradeoff is latency
- It’s not as bad as you think (within the same DC, AZ or even region)
- Use multiple threads to get desired throughput
- Use async frameworks in user facing applications, if needed
• For cross-regional deployments choose {w:2}
- Reasonable compromise between performance and durability
Failures
WHAT HAPPENS IF WRITE CONCERN FAILS?
• “wtimeout” only generates a write concern failure exception
- Similar to network exceptions
- No useful information in a failure
• App code has to handle exceptions and retry when appropriate
- Writes need to be made idempotent (e.g. updates with $inc -> $set)
- When idempotency is not possible, at least log the failures
• Retryable writes: coming soon!
db.test.insert({name:"Alex"}, {writeConcern:{w:2,wtimeout:1000}})
[Diagram: the insert goes to the Primary, shown next to a Secondary, and comes back with a writeConcernError.]
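Why "$inc -> $set" matters for retries can be shown without a database. The sketch below is an illustration only: `apply_update` handles just these two operators on a plain dict, and `apply_with_blind_retry` is a hypothetical helper that simulates an update which was applied, lost its receipt, and was sent again.

```python
def apply_update(doc, update):
    """Apply a tiny subset of update operators to a dict (toy model)."""
    result = dict(doc)
    for field, amount in update.get("$inc", {}).items():
        result[field] = result.get(field, 0) + amount
    for field, value in update.get("$set", {}).items():
        result[field] = value
    return result


def apply_with_blind_retry(doc, update):
    """Simulate: the first attempt was applied, the receipt was lost,
    and the app sent the same update again."""
    return apply_update(apply_update(doc, update), update)


counter = {"_id": 1, "visits": 10}

# Not idempotent: the retry increments a second time.
print(apply_with_blind_retry(counter, {"$inc": {"visits": 1}}))
# {'_id': 1, 'visits': 12}

# Idempotent: the app computes the target value once; repeating $set is safe.
print(apply_with_blind_retry(counter, {"$set": {"visits": 11}}))
# {'_id': 1, 'visits': 11}
```

Moving the arithmetic into the app has its own cost (concurrent writers can overwrite each other), so this is a trade-off rather than a free fix.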
BEST EFFORT WRITE CODE EXAMPLE
• Replica set with 2 data nodes and an arbiter
• One node goes down every 90 seconds
• Inserting 2 million records
• w:1 - only 1999911 records were actually there in the end!
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure, DuplicateKeyError

client = MongoClient("mongodb://a,b,c/?replicaSet=rs")
coll = client.test_db.test_col
i = 0
while i < 2000000:
    my_object = {'number': i}
    try:
        coll.insert_one(my_object)
    except Exception:
        # repeat until success or we hit a dup key error
        while True:
            try:
                coll.insert_one(my_object)
                break
            except DuplicateKeyError:
                # the earlier attempt went through after all
                break
            except ConnectionFailure:
                pass
    i += 1
HOW TO MAKE IT BETTER?
• Use write concern to know if writes are durable
- We’ll pay with additional latency for writes that might never be rolled back (but we don’t know that!)
• It’s not practical to wait for every write
- Use bulk inserts
BETTER, SAFER CODE
• db.test.count() is 2000000 after the test
• Takes the same amount of time with w:2 as w:1
from pymongo import InsertOne, MongoClient, WriteConcern
from pymongo.errors import BulkWriteError, ConnectionFailure

client = MongoClient("mongodb://a,b,c/?replicaSet=rs")
coll = client.test_db.test_col.with_options(write_concern=WriteConcern(w=2))
i = 0
while i < 20000:
    requests = []
    for j in range(0, 100):
        requests.append(InsertOne({"number": i * 100 + j}))
    while True:  # repeat until success or write concern is satisfied
        try:
            coll.bulk_write(requests, ordered=False)
            break
        except BulkWriteError as bwe:
            # e.g. duplicate keys when a retried batch was already applied;
            # move on only if the write concern itself was satisfied
            if not bwe.details.get('writeConcernErrors'):
                break
        except ConnectionFailure:
            pass
    i += 1
[Flowchart: insert a batch; on success, go to the next batch; on problems, go to the next batch if there are no write concern errors, otherwise insert the batch again.]
Let’s look at the reads now
WHAT IS A DIRTY READ?
[Diagram: The App runs db.test.find({x:100}) against the Primary and gets {x:100} back. The Primary holds {x:1} … {x:100}; both Secondaries hold {x:1} … {x:99}.]
WHAT IS A DIRTY READ?
[Diagram: after the rollback, the Primary and both Secondaries hold {x:1} … {x:99} and {x:101}, and the rollback file <dbpath>/rollback/<...>.bson is shown. The App runs db.test.find({x:100}) again and gets null.]
READ CONCERN
• Determines which data to return from a query
• Different modes:
- Local
- Majority (3.2)
- Linearizable (3.4)
• NOT related to read preferences
[Diagram: the newest document each node returns per read concern]
- Primary: {x:100} for local, {x:99} for majority
- Secondary: {x:99} for both local and majority
- Secondary: {x:99} for local, {x:98} for majority
READ CONCERN
• db.test.find( { x:100 } )
- WORKS
• db.test.find( { x:100 } ).readConcern("majority")
- RETURNS “null”
• db.test.find( { x:100 } ).readConcern("linearizable")
- BLOCKS until the last write is replicated
- Use the maxTimeMS() option to avoid blocking forever
[Diagram: the Primary returns {x:100} for local and {x:99} for majority; the Secondary returns {x:99} for local and {x:98} for majority.]
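The difference between "local" and "majority" reads can be sketched as a toy model in Python. The assumptions are loud ones: a node is reduced to the highest position it has applied, a single commit point is computed for the whole set (on the slides each node tracks its own, possibly older, view of it), and `majority_commit_point` and `find_x` are illustrative names, not driver or server APIs.

```python
def majority_commit_point(positions):
    """Highest position that a majority of nodes has reached.

    positions: latest position applied on each node, e.g. [100, 99, 99].
    """
    majority = len(positions) // 2 + 1
    # The majority-th highest position is present on `majority` nodes.
    return sorted(positions, reverse=True)[majority - 1]


def find_x(value, node_position, commit_point, read_concern="local"):
    """Toy find({x: value}) on a node holding {x:1} … {x:node_position}."""
    if read_concern == "local":
        visible_up_to = node_position
    elif read_concern == "majority":
        visible_up_to = min(node_position, commit_point)
    else:
        raise ValueError("this sketch models only 'local' and 'majority'")
    return {"x": value} if 1 <= value <= visible_up_to else None


positions = [100, 99, 99]  # primary, secondary, secondary
commit = majority_commit_point(positions)

print(commit)                                         # 99
print(find_x(100, positions[0], commit))              # {'x': 100}
print(find_x(100, positions[0], commit, "majority"))  # None
print(find_x(99, positions[0], commit, "majority"))   # {'x': 99}
```

"linearizable" is left out on purpose: its blocking behaviour depends on confirming the primary's status with a majority, which a position list cannot express.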
MAJORITY VS. LINEARIZABLE
• Both return data that won’t be rolled back
• “Majority” returns the most recent data that this particular node knows to be replicated to a majority of nodes
- Each node maintains and advances a separate “majority-committed” pointer/snapshot
• “Linearizable” ensures that this data is the most recent
- Enables multiple threads to perform reads and writes on a single document as if a single thread performed these operations in real time
- Only on Primary
- Significantly slower than “majority”
• In most applications dirty reads are not a big problem
- If write failures are handled correctly, the “dirtiness” is temporary
- Think of a Twitter feed vs. changing your password
DID WE FORGET ANYTHING?
• Read preference controls where we are reading from
• Read concern controls what we are reading
• Causal consistency, new in 3.6, allows us to read what we wrote from any node
- Extension for read concern (read-after-optime)
- Compatible with read concern “majority”
- Enabled on the session level
db.getMongo().setCausalConsistency(true)
[Diagram: The App sends Writes to the Primary and Reads to both the Primary and the Secondary. The Primary returns {x:100} for local and {x:99} for majority; the Secondary returns {x:99} for local and {x:98} for majority.]
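The read-after-optime idea behind causal consistency can be sketched as follows. This is a toy model, not the driver's session API: `ToySession` is a hypothetical class, an oplog position stands in for the optime, and where a real node would wait for replication to catch up, the sketch simply returns `None`.

```python
class ToySession:
    """Remembers the position of this session's latest write."""

    def __init__(self):
        self.last_write_position = 0

    def write(self, primary_log, doc):
        primary_log.append(doc)
        self.last_write_position = len(primary_log)

    def read_latest(self, node_log):
        """Read-after-optime: do not answer from a node that has not
        yet applied this session's last write."""
        if len(node_log) < self.last_write_position:
            return None  # a real node would wait here instead
        return node_log[-1] if node_log else None


primary = [{"x": i} for i in range(1, 100)]  # {x:1} … {x:99}
secondary = list(primary)

session = ToySession()
session.write(primary, {"x": 100})

print(session.read_latest(primary))    # {'x': 100}
print(session.read_latest(secondary))  # None: the secondary is behind

secondary.append(primary[-1])          # replication catches up
print(session.read_latest(secondary))  # {'x': 100}
```

Because the position travels with the session rather than with the node, the same check works whichever node the read preference picks.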
Successes
HOW TO CHOOSE THE RIGHT CONCERN?
THINK ABOUT WHAT YOUR USERS CARE ABOUT
Writing important data that has to be durable?
• Example: ETL process for reporting
• Use {w:2}* or {w:"majority"}
Reads must see the most recent durable state (can’t be stale or uncommitted)?
• Example: Credentials Management Application
• Use {w:"majority"} and “linearizable” read concern
Mission-critical data where dirty reads are not allowed?
• Example: Config servers in sharding
• Use {w:"majority"} and “majority” read concern
DOES MY DRIVER SUPPORT THIS??
• Java
- https://mongodb.github.io/mongo-java-driver/3.4/javadoc/com/mongodb/WriteConcern.html
- https://mongodb.github.io/mongo-java-driver/3.4/javadoc/com/mongodb/ReadConcern.html
• C#
- https://mongodb.github.io/mongo-csharp-driver/2.3/apidocs/html/T_MongoDB_Driver_WriteConcern.htm
- https://mongodb.github.io/mongo-csharp-driver/2.3/apidocs/html/T_MongoDB_Driver_ReadConcern.htm
• PHP
- http://php.net/manual/en/mongo.writeconcerns.php
- http://php.net/manual/en/class.mongodb-driver-readconcern.php
• Others do, too!
THANK YOU!
TIME FOR YOUR QUESTIONS
My name is Alex
Don’t email me here:
alex@mongodb.com
MORE RESOURCES
• Documentation is where we all start:
https://docs.mongodb.com/manual/reference/write-concern/
https://docs.mongodb.com/manual/reference/read-concern/
• Great presentation by Jesse Davis on resilient operations:
https://www.slideshare.net/mongodb/mongodb-world-2016-smart-strategies-for-resilient-applications

ReadConcern and WriteConcern


Editor's Notes

  1. Why this talk? Distributed systems are different from standalone systems. A story about "where is my data?"
  2. Typical app operations workflow with a replica set (the app sends ops to the primary, the secondaries replicate): the steps that a write goes through, mainly to establish common ground, especially in terminology.
  3. Can I disable rollback? Is rollback avoidable?
  4. Replication is async, so the primary waits for the secondaries, which send a special replSetUpdatePosition command to inform upstream nodes of their replication progress. We’ll talk more about timeouts later.
  5. This is how you would probably write it if you didn’t attend this presentation. By the way, in prod we discourage while True loops (Jesse). Why did we lose documents?
  6. Now, knowing all of the above, let’s make it better.
  7. 30 min
  8. What happens if someone reads data that is going to be rolled back? How much of a problem is that?
  9. Now, after the rollback, we get nothing.
  10. Almost not related, as we will see.
  11. Changing your password requires linearizable read concern when verifying.
  12. As of now, you can only get what you wrote by reading from the primary. Not a need for everyone, but there were quite a few requests.