ReadConcern and WriteConcern

#MDBlocal
Alex Komyagin
Senior Consulting Engineer
MongoDB

OCTOBER 12, 2017 | BESPOKE | SAN FRANCISCO
Who stole my write?
Or the story of Write Concern and Read Concern

WHAT ARE WE GOING TO LEARN TODAY?
• What those things are - Write Concern and Read Concern
• What you can do with them
• What you should do with them

TYPICAL WRITE WORKFLOW
[Diagram: The App sends {name:"Alex"} to the Primary and receives {ok:1}. Seven numbered steps cover the in-memory structures and oplog, the journal, the data files, and replication to two Secondaries.]

WANT TO SEE A MAGIC TRICK?

WRITE SOME DATA
[Diagram: The App writes {x:1},...,{x:99} to the Primary and receives {ok:1}. The Primary and both Secondaries hold {x:1} … {x:99}.]

WRITE SOME MORE
[Diagram: The App writes {x:100} to the Primary and receives {ok:1}. The Primary holds {x:1} … {x:100}; both Secondaries still hold only {x:1} … {x:99}.]

OOOPSIE!
[Diagram: Same state as the previous slide: {x:100} was acknowledged with {ok:1} but exists only on the Primary, which now becomes unavailable. Both Secondaries hold {x:1} … {x:99}.]

KEEP WRITING
[Diagram: One of the Secondaries is now the Primary. The App writes {x:101} to it and receives {ok:1}. The new Primary and the remaining Secondary hold {x:1} … {x:99}, {x:101}; the old Primary holds {x:1} … {x:100}.]

THE OLD PRIMARY COMES BACK ONLINE
[Diagram: The old Primary rejoins with an undetermined role (???). It holds {x:1} … {x:100}; the current Primary and the Secondary hold {x:1} … {x:99}, {x:101}.]

HE HAS TO FIX HIS STATE TO RESUME REPLICATION
[Diagram: ROLLBACK. The old Primary holds {x:1} … {x:100}; the current Primary and the Secondary hold {x:1} … {x:99}, {x:101}. {x:99} is the last common point. The rolled-back {x:100} goes to <dbpath>/rollback/<...>.bson.]

…AND THINGS ARE BACK TO NORMAL
[Diagram: The old Primary is now a Secondary. All three nodes hold {x:1} … {x:99}, {x:101}.]
The {x:100} write is not lost per se, but it is not accessible to the app.

Rollback cannot be ruled out entirely, but it is not a problem in itself; it is more like self-healing

SO WHERE WAS THE PROBLEM?
[Diagram: The App writes {x:100} and receives {ok:1} while only the Primary holds {x:100}; both Secondaries hold {x:1} … {x:99}.]
The App got the “OK” before the write was replicated to any of the secondaries
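The sequence from the preceding slides can be replayed with a toy model. This is a minimal sketch in plain Python, not MongoDB code: the `Node` class, its `log` list, and the `rollback_to_common_point` helper are hypothetical names made up for illustration, and each node's oplog is reduced to a list of `x` values.

```python
class Node:
    """Toy replica set member: its oplog is just a list of x values."""
    def __init__(self, name):
        self.name = name
        self.log = []


def rollback_to_common_point(stale, source):
    """Trim `stale` back to the last entry it shares with `source`, then catch up.

    Returns the entries that were removed (in MongoDB these would end up
    in a file under <dbpath>/rollback/).
    """
    common = 0
    while (common < len(stale.log) and common < len(source.log)
           and stale.log[common] == source.log[common]):
        common += 1
    rolled_back = stale.log[common:]
    stale.log = list(source.log)
    return rolled_back


a, b, c = Node("a"), Node("b"), Node("c")

# {x:1}..{x:99} reach every node.
for node in (a, b, c):
    node.log = list(range(1, 100))

# {x:100} is acknowledged by primary `a` only (w:1), then `a` goes down.
a.log.append(100)

# `b` becomes primary and accepts {x:101}, which replicates to `c`.
b.log.append(101)
c.log = list(b.log)

# `a` comes back and rolls back to the last common point, {x:99}.
lost = rollback_to_common_point(a, b)
print(lost)        # [100]
print(a.log[-2:])  # [99, 101]
```

The app was told {ok:1} for {x:100}, yet after the rollback no node serves it.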

Solution – write receipt

WRITE CONCERN
• Form of an intelligent receipt/confirmation that the write operation was replicated to the desired number of nodes
• Default number is 1
• Allows us to express how concerned we are with durability of a particular write in a replica set
• Can be set for individual ops / collections / etc.
• NOT a distributed transaction
db.test.insert({x:100},{writeConcern:{w:2}})
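One way to picture the receipt is as a count of nodes that already hold the write. The sketch below is plain Python, not driver code; `required_acks` and `is_acknowledged` are hypothetical helpers, and the majority calculation assumes every member is a data-bearing voting member (arbiters and non-voting members are ignored).

```python
def required_acks(w, voting_members):
    """Number of nodes that must hold the write before it is acknowledged."""
    if w == "majority":
        return voting_members // 2 + 1
    return int(w)


def is_acknowledged(w, nodes_with_write, voting_members):
    """True once enough nodes hold the write to satisfy the write concern."""
    return nodes_with_write >= required_acks(w, voting_members)


# Three-member replica set, {x:100} applied on the primary only.
print(is_acknowledged(1, 1, 3))           # True
print(is_acknowledged(2, 1, 3))           # False

# Once one secondary has the write, a majority (2 of 3) is reached.
print(is_acknowledged("majority", 2, 3))  # True
```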

HOW DOES IT WORK?
• Different levels
- {w:<N>}
- {w:<N>, j:true}: includes secondaries since 3.2
- {w:"majority"}: implies {j:true} in MongoDB 3.2+; guarantees that confirmed operations won't be rolled back
• Supports timeout
- {w:2, wtimeout:100}
- Timeout doesn't imply a write failure - you just get no receipt
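The last point is easy to miss, so here is a toy illustration in plain Python. It is a sketch under simplifying assumptions: time is simulated with two numbers instead of real waiting, and `insert_with_write_concern` is a hypothetical function, not a driver or server API.

```python
def insert_with_write_concern(primary, doc, w, wtimeout_ms, replication_lag_ms):
    """Apply `doc` on the primary, then wait (in simulated time) for w nodes.

    Returns True when the receipt arrives in time and False on timeout.
    Either way the document stays on the primary.
    """
    primary.append(doc)  # the write itself has already happened
    if w <= 1:
        return True      # the primary alone satisfies w:1
    return replication_lag_ms <= wtimeout_ms


primary = []
ok = insert_with_write_concern(primary, {"x": 100}, w=2,
                               wtimeout_ms=100, replication_lag_ms=250)
print(ok)       # False
print(primary)  # [{'x': 100}]
```

The timeout tells the app only that no receipt arrived in time; the document is still on the primary and may yet replicate.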

WRITE CONCERN TRADEOFFS
• Choose {w:"majority"} for writes that matter
• The main tradeoff is latency
• It's not as bad as you think (within the same DC, AZ or even region)
• Use multiple threads to get desired throughput
• Use async frameworks in user-facing applications, if needed
• For cross-regional deployments choose {w:2}
• Reasonable compromise between performance and durability

WHAT HAPPENS IF WRITE CONCERN FAILS?
• "wtimeout" only generates a write concern failure exception
• Similar to network exceptions
• No useful information in a failure
• App code has to handle exceptions and retry when appropriate
• Writes need to be made idempotent (e.g. replace $inc updates with $set)
• When idempotency is not possible, at least log the failures
• Retriable writes: Coming soon!
db.test.insert({name:"Alex"}, {writeConcern:{w:2,wtimeout:1000}})
[Diagram: Primary and Secondary; the insert comes back with a writeConcernError.]
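Why idempotency matters for retries can be shown with a small sketch. It is plain Python operating on a dict; `apply_update` is a hypothetical helper that mimics only the `$inc` and `$set` operators, and `send_twice` stands in for an app that retries after losing the receipt of an update that had in fact been applied.

```python
def apply_update(doc, update):
    """Apply a tiny subset of MongoDB update operators to a dict in place."""
    for field, amount in update.get("$inc", {}).items():
        doc[field] = doc.get(field, 0) + amount
    for field, value in update.get("$set", {}).items():
        doc[field] = value


def send_twice(doc, update):
    """Simulate a retry after a lost receipt: the update lands both times."""
    apply_update(doc, update)
    apply_update(doc, update)
    return doc


# $inc is not idempotent: the retry counts the visit twice.
print(send_twice({"_id": 1, "visits": 5}, {"$inc": {"visits": 1}}))
# {'_id': 1, 'visits': 7}

# $set is idempotent: the retry leaves the same final state.
print(send_twice({"_id": 1, "visits": 5}, {"$set": {"visits": 6}}))
# {'_id': 1, 'visits': 6}
```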

BEST EFFORT WRITE CODE EXAMPLE
• Replica set with 2 data nodes and an arbiter
• One node goes down every 90 seconds
• Inserting 2 million records
• w:1 - only 1999911 records were actually there in the end!
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure, DuplicateKeyError

client = MongoClient("mongodb://a,b,c/?replicaSet=rs")
coll = client.test_db.test_col
i = 0
while i < 2000000:
    my_object = {'number': i}
    try:
        coll.insert_one(my_object)  # insert() is deprecated since PyMongo 3.0
    except:
        while True:  # repeat until success or we hit a dup key error
            try:
                coll.insert_one(my_object)
                break
            except DuplicateKeyError:
                break  # the first attempt had already landed
            except ConnectionFailure:
                pass  # no primary yet, try again
    i += 1

HOW TO MAKE IT BETTER?
• Use write concern to know if writes are durable
• We'll pay with additional latency for writes that might never be rolled back (but we don't know that!)
• It's not practical to wait for every write
- Use bulk inserts
[Code: the best-effort insert loop from the previous slide, shown again.]

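BETTER, SAFER CODE
• db.test.count() is 2000000 after the test
• Takes the same amount of time with w:2 as w:1
from pymongo import InsertOne, WriteConcern
from pymongo.errors import BulkWriteError

coll = client.test_db.test_col.with_options(write_concern=WriteConcern(w=2))
i = 0
while i < 20000:
    # Build one batch of 100 inserts.
    requests = []
    for j in range(0, 100):
        requests.append(InsertOne({"number": i*100 + j}))
    while True:  # repeat until success or write concern is satisfied
        try:
            coll.bulk_write(requests, ordered=False)
            break
        except BulkWriteError as bwe:
            # Only duplicate keys left and no write concern errors: the batch is safe.
            if bwe.details.get('writeConcernErrors') == []:
                break
    i += 1
[Flowchart: Insert batch. On success, move to the next batch. On problems, move to the next batch if there are no write concern errors; otherwise insert the batch again.]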
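The retry decision in the flowchart can be isolated into a small function and tried without a server. This is a sketch, not driver code: `batch_needs_retry` is a hypothetical helper, and the two sample dicts are hand-written illustrations shaped like PyMongo's `BulkWriteError.details` (a dict with `writeErrors` and `writeConcernErrors` lists), not captured output.

```python
def batch_needs_retry(details):
    """Decide whether a failed bulk insert has to be sent again.

    Mirrors the check on the slide: the batch is considered done only
    when the list of write concern errors is present and empty.
    """
    return details.get("writeConcernErrors") != []


# Retry of a batch that had already landed: duplicate keys, but replicated.
dup_only = {
    "writeErrors": [{"index": 0, "code": 11000}],
    "writeConcernErrors": [],
}

# Batch written on the primary, but the write concern was not satisfied.
not_replicated = {
    "writeErrors": [],
    "writeConcernErrors": [{"code": 64}],
}

print(batch_needs_retry(dup_only))        # False
print(batch_needs_retry(not_replicated))  # True
```

Duplicate key errors alone are harmless here because the batch is unordered and each retried document keeps the `_id` assigned on the first attempt.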

Let’s look at the reads now

WHAT IS A DIRTY READ?
[Diagram: The App runs db.test.find({x:100}) and gets {x:100} back. The Primary holds {x:1} … {x:100}; both Secondaries hold only {x:1} … {x:99}.]

WHAT IS A DIRTY READ?
[Diagram: After the failover and rollback, all three nodes hold {x:1} … {x:99}, {x:101}. The same db.test.find({x:100}) now returns null.]

READ CONCERN
• Determines which data to return from a query
• Different modes:
- Local
- Majority (3.2)
- Linearizable (3.4)
• NOT related to read preferences
[Diagram: What each node returns per mode. Primary: {x:100} is "local", {x:99} is "majority". One Secondary: {x:99} is both "majority" and "local". The other Secondary: {x:99} is "local", {x:98} is "majority".]

READ CONCERN
• db.test.find( { x:100 } )
- WORKS
• db.test.find( { x:100 } ).readConcern("majority")
- RETURNS “null”
• db.test.find( { x:100 } ).readConcern("linearizable")
- BLOCKS until the last write is replicated
- Use the maxTimeMS() option to avoid blocking forever
[Diagram: The Primary holds {x:100} as "local" and {x:99} as "majority"; the Secondary holds {x:99} as "local" and {x:98} as "majority".]
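The difference between the first two queries can be sketched with a toy model. It is plain Python under stated assumptions: documents are inserted in increasing `x` order so `x` doubles as the oplog position, the commit point is computed with global knowledge (a real node only knows what it has heard from the others), and `majority_committed` and `find_x` are hypothetical helpers.

```python
def majority_committed(last_applied):
    """Newest position that a majority of nodes has reached.

    `last_applied` maps node name -> highest x applied on that node.
    """
    positions = sorted(last_applied.values(), reverse=True)
    majority = len(positions) // 2 + 1
    return positions[majority - 1]


def find_x(data, x, read_concern, commit_point):
    """Return x if the node may show it under the read concern, else None."""
    if x not in data:
        return None
    if read_concern == "majority" and x > commit_point:
        return None  # present locally, but not yet majority-committed
    return x


last_applied = {"primary": 100, "secondary1": 99, "secondary2": 99}
commit_point = majority_committed(last_applied)
primary_data = list(range(1, 101))

print(commit_point)                                         # 99
print(find_x(primary_data, 100, "local", commit_point))     # 100
print(find_x(primary_data, 100, "majority", commit_point))  # None
```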

MAJORITY VS. LINEARIZABLE
• Return data that won’t be rolled back
• “Majority” returns the most recent data replicated to a majority of nodes that this particular node knows about
- Each node maintains and advances a separate “majority-committed” pointer/snapshot
• “Linearizable” ensures that this data is the most recent
- Enables multiple threads to perform reads and writes on a single document as if a single thread performed these operations in real time
- Only on Primary
- Significantly slower than “majority”
• In most applications dirty reads are not a big problem
- If write failures are handled correctly, the “dirtiness” is temporary
- Twitter vs. Changing your password

DID WE FORGET ANYTHING?
• Read preference controls where we are reading from
• Read concern controls what we are reading
• Causal consistency, new in 3.6, allows us to read what we wrote from any node
• Extension for read concern (read-after-optime)
• Compatible with read concern "majority"
• Enabled on the session level
[Diagram: The App sends Writes and Reads to the replica set. The Primary holds {x:100} as "local" and {x:99} as "majority"; the Secondary holds {x:99} as "local" and {x:98} as "majority".]
db.getMongo().setCausalConsistency(true)
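The read-after-optime idea can be sketched as a session that remembers how far its own writes went. This is a toy model in plain Python, not the driver API: `Session`, `write`, and `can_serve_read` are hypothetical names, the optime is reduced to a position in a list, and a real node would wait until it has caught up rather than report that it cannot answer.

```python
class Session:
    """Remembers the position of the last write made through it."""
    def __init__(self):
        self.operation_time = 0


def write(session, primary, x):
    """Append x on the primary and record its position in the session."""
    primary.append(x)
    session.operation_time = len(primary)


def can_serve_read(session, node):
    """A node may answer once it has applied everything the session wrote."""
    return len(node) >= session.operation_time


primary = list(range(1, 100))
secondary = list(primary)
session = Session()

write(session, primary, 100)
print(can_serve_read(session, secondary))  # False

secondary.append(100)                      # replication catches up
print(can_serve_read(session, secondary))  # True
```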

HOW TO CHOOSE THE RIGHT CONCERN?
THINK WHAT YOUR USERS CARE ABOUT
Writing important data that has to be durable?
• Example: ETL process for reporting
• Use {w:2}* or {w:"majority"}
Reads must see the most recent durable state (can’t be stale or uncommitted)?
• Example: Credentials Management Application
• Use {w:"majority"} and “linearizable” read concern
Mission-critical data where dirty reads are not allowed?
• Example: Config servers in sharding
• Use {w:"majority"} and “majority” read concern

DOES MY DRIVER SUPPORT THIS??
• Java
- https://mongodb.github.io/mongo-java-driver/3.4/javadoc/com/mongodb/WriteConcern.html
- https://mongodb.github.io/mongo-java-driver/3.4/javadoc/com/mongodb/ReadConcern.html
• C#
- https://mongodb.github.io/mongo-csharp-driver/2.3/apidocs/html/T_MongoDB_Driver_WriteConcern.htm
- https://mongodb.github.io/mongo-csharp-driver/2.3/apidocs/html/T_MongoDB_Driver_ReadConcern.htm
• PHP
- http://php.net/manual/en/mongo.writeconcerns.php
- http://php.net/manual/en/class.mongodb-driver-readconcern.php
• Others do, too!

THANK YOU!
TIME FOR YOUR QUESTIONS
My name is Alex
Don’t email me here:
alex@mongodb.com

MORE RESOURCES
• Documentation is where we all start:
https://docs.mongodb.com/manual/reference/write-concern/
https://docs.mongodb.com/manual/reference/read-concern/
• Great presentation by Jesse Davis on resilient operations:
https://www.slideshare.net/mongodb/mongodb-world-2016-smart-strategies-for-resilient-applications
