SlideShare uma empresa Scribd logo
1 de 28
A Technical Introduction to WiredTiger
Michael Cahill
Director of Engineering (Storage), MongoDB
You may have seen this:
or this…
How does WiredTiger do it?
6
What’s different about WiredTiger?
• Document-level concurrency
• In-memory performance
• Multi-core scalability
• Checksums
• Compression
• Durability with and without journaling
7
This presentation is not…
• How to write stand-alone WiredTiger apps
• How to configure MongoDB with WiredTiger for your workload
WiredTiger Background
9
Why create a new storage engine?
• Minimize contention between threads
– lock-free algorithms, hazard pointers
– eliminate blocking due to concurrency control
• Hotter cache and more work per I/O
– compact file formats
– compression
– big-block I/O
10
MongoDB’s Storage Engine API
• Allows different storage engines to "plug-in"
– Different workloads have different performance characteristics
– mmap is not ideal for all workloads
– More flexibility
• mix storage engines on same replica set/sharded cluster
• Opportunity to integrate further (HDFS, native encrypted,
hardware optimized …)
• Great way for us to demonstrate WiredTiger’s performance
11
Storage Engine Layer
Content
Repo
IoT Sensor
Backend
Ad Service
Customer
Analytics
Archive
MongoDB Query Language (MQL) + Native Drivers
MongoDB Document Data Model
MMAP V1 WT In-Memory ? ?
Supported in MongoDB 3.0 Future Possible Storage Engines
Management
Security
Example Future State
Experimental
WiredTiger Architecture
13
WiredTiger Architecture
WiredTiger Engine
Schema &
Cursors
Python API C API Java API
Database
Files
Transactions
Page
read/ write
Logging
Column
storage
Block
management
Row
storage
Snapshots
Log Files
Cache
In-memory
consistency
Per-file
checkpoints
14
Trees in cache
non-resident
child
ordinary pointer
root page
internal
page
internal
page
root page
leaf page
leaf page leaf page leaf page
1 memory
flush
read required
disk image
15
Pages in cache
cache
data files
page images
on-disk
page
image
index
clean
page on-disk
page
image
indexdirty
page
updates
constructed
during read
skiplist
reconciled
during write
Concurrency
and
multi-core scaling
17
Multiversion Concurrency Control (MVCC)
• Multiple versions of records kept in cache
• Readers see the version that was committed before the operation
started
– MongoDB “yields” turn large operations into small transactions
• Writers can create new versions concurrent with readers
• Concurrent updates to a single record cause write conflicts
– MongoDB retries with back-off
Transforming data during I/O
19
Checksums
• A checksum is stored with every page
• WiredTiger stores the checksum with the page address (typically
in a parent page)
– Extra safety against reading an old, valid page image
• Checksums are validated during page read
– Detects filesystem corruption, random bitflips
20
Compression
• WiredTiger uses snappy compression by default in MongoDB
• supported compression algorithms
– snappy [default]: good compression, low overhead
– zlib: better compression, more CPU
– none
• Indexes are compressed using prefix compression
– allows compression in memory
Durability
with and without
Journal
22
Writing a checkpoint
1. Write the leaves
2. Write the internal pages, including the root
– the old checkpoint is still valid
3. Sync the file
4. Write the new root’s address to the metadata
– free pages from old checkpoints once the metadata is durable
23
Durability without Journaling
• MMAPv1 uses a write-ahead log (journal) to guarantee
consistency
– Running with “nojournal” is unsafe
• WiredTiger doesn't have this need: no in-place updates
– Write-ahead log can be truncated at checkpoints
• Every 2GB or 60sec by default – configurable
– Updates are written to optional journal as they commit
• Not flushed on every commit by default
• Recovery rolls forward from last checkpoint
• Replication can guarantee durability
24
Journal and recovery
• Optional write-ahead logging
• Only written at transaction commit
• snappy compression by default
• Group commit
• Automatic log archive / removal
• On startup, we rely on finding a consistent checkpoint in the
metadata
• Check LSNs in the metadata to figure out where to roll forward
from
25
So, what’s different about WiredTiger?
• Multiversion Concurrent Control
– No lock manager
• Non-locking data structures
– Multi-core scalability
• Different representation of in-memory data vs on-disk
– Enabled checksums, compression
• Copy-on-write storage
– Durability without a journal
What’s next?
27
What’s next for WiredTiger?
• WiredTiger btree access method is optional in 3.0
• plan to make it the default in 3.2
• tuning for (many) more workloads
• make Log Structured Merge (LSM) work well with MongoDB
– out-of-cache, write-heavy workloads
• adding encryption
• more advanced transactional semantics in the storage engine API
Thanks!
Questions?
Michael Cahill
michael.cahill@mongodb.com

Mais conteúdo relacionado

Mais procurados

mongodb와 mysql의 CRUD 연산의 성능 비교
mongodb와 mysql의 CRUD 연산의 성능 비교mongodb와 mysql의 CRUD 연산의 성능 비교
mongodb와 mysql의 CRUD 연산의 성능 비교
Woo Yeong Choi
 

Mais procurados (20)

Introduction to Mongodb execution plan and optimizer
Introduction to Mongodb execution plan and optimizerIntroduction to Mongodb execution plan and optimizer
Introduction to Mongodb execution plan and optimizer
 
Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0
 
AWS RDS Benchmark - Instance comparison
AWS RDS Benchmark - Instance comparisonAWS RDS Benchmark - Instance comparison
AWS RDS Benchmark - Instance comparison
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDB
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic Introduction
 
카프카, 산전수전 노하우
카프카, 산전수전 노하우카프카, 산전수전 노하우
카프카, 산전수전 노하우
 
mongodb와 mysql의 CRUD 연산의 성능 비교
mongodb와 mysql의 CRUD 연산의 성능 비교mongodb와 mysql의 CRUD 연산의 성능 비교
mongodb와 mysql의 CRUD 연산의 성능 비교
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
Apache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeApache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In Practice
 
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train...
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Dive into PySpark
Dive into PySparkDive into PySpark
Dive into PySpark
 
Amazon DocumentDB vs MongoDB 의 내부 아키텍쳐 와 장단점 비교
Amazon DocumentDB vs MongoDB 의 내부 아키텍쳐 와 장단점 비교Amazon DocumentDB vs MongoDB 의 내부 아키텍쳐 와 장단점 비교
Amazon DocumentDB vs MongoDB 의 내부 아키텍쳐 와 장단점 비교
 
DynamoDB를 게임에서 사용하기 – 김성수, 박경표, AWS솔루션즈 아키텍트:: AWS Summit Online Korea 2020
DynamoDB를 게임에서 사용하기 – 김성수, 박경표, AWS솔루션즈 아키텍트::  AWS Summit Online Korea 2020DynamoDB를 게임에서 사용하기 – 김성수, 박경표, AWS솔루션즈 아키텍트::  AWS Summit Online Korea 2020
DynamoDB를 게임에서 사용하기 – 김성수, 박경표, AWS솔루션즈 아키텍트:: AWS Summit Online Korea 2020
 
[2D1]Elasticsearch 성능 최적화
[2D1]Elasticsearch 성능 최적화[2D1]Elasticsearch 성능 최적화
[2D1]Elasticsearch 성능 최적화
 

Semelhante a MongoDB World 2015 - A Technical Introduction to WiredTiger

Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
MongoDB
 

Semelhante a MongoDB World 2015 - A Technical Introduction to WiredTiger (20)

https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
 
WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0
 
A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTiger
 
Beyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage Engines	Beyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage Engines
 
Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0
 
Let the Tiger Roar!
Let the Tiger Roar!Let the Tiger Roar!
Let the Tiger Roar!
 
Running MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSRunning MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWS
 
What'sNnew in 3.0 Webinar
What'sNnew in 3.0 WebinarWhat'sNnew in 3.0 Webinar
What'sNnew in 3.0 Webinar
 
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
MongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage Engine
MongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage EngineMongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage Engine
MongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage Engine
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Beyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage EnginesBeyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage Engines
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Conceptos Avanzados 1: Motores de Almacenamiento
Conceptos Avanzados 1: Motores de AlmacenamientoConceptos Avanzados 1: Motores de Almacenamiento
Conceptos Avanzados 1: Motores de Almacenamiento
 
MongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseMongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL Database
 
MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...
MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...
MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...
 

Último

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 

Último (20)

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 

MongoDB World 2015 - A Technical Introduction to WiredTiger

  • 1.
  • 2. A Technical Introduction to WiredTiger Michael Cahill Director of Engineering (Storage), MongoDB
  • 3. You may have seen this:
  • 6. 6 What’s different about WiredTiger? • Document-level concurrency • In-memory performance • Multi-core scalability • Checksums • Compression • Durability with and without journaling
  • 7. 7 This presentation is not… • How to write stand-alone WiredTiger apps • How to configure MongoDB with WiredTiger for your workload
  • 9. 9 Why create a new storage engine? • Minimize contention between threads – lock-free algorithms, hazard pointers – eliminate blocking due to concurrency control • Hotter cache and more work per I/O – compact file formats – compression – big-block I/O
  • 10. 10 MongoDB’s Storage Engine API • Allows different storage engines to "plug-in" – Different workloads have different performance characteristics – mmap is not ideal for all workloads – More flexibility • mix storage engines on same replica set/sharded cluster • Opportunity to integrate further (HDFS, native encrypted, hardware optimized …) • Great way for us to demonstrate WiredTiger’s performance
  • 11. 11 Storage Engine Layer Content Repo IoT Sensor Backend Ad Service Customer Analytics Archive MongoDB Query Language (MQL) + Native Drivers MongoDB Document Data Model MMAP V1 WT In-Memory ? ? Supported in MongoDB 3.0 Future Possible Storage Engines Management Security Example Future State Experimental
  • 13. 13 WiredTiger Architecture WiredTiger Engine Schema & Cursors Python API C API Java API Database Files Transactions Page read/ write Logging Column storage Block management Row storage Snapshots Log Files Cache In-memory consistency Per-file checkpoints
  • 14. 14 Trees in cache non-resident child ordinary pointer root page internal page internal page root page leaf page leaf page leaf page leaf page 1 memory flush read required disk image
  • 15. 15 Pages in cache cache data files page images on-disk page image index clean page on-disk page image indexdirty page updates constructed during read skiplist reconciled during write
  • 17. 17 Multiversion Concurrency Control (MVCC) • Multiple versions of records kept in cache • Readers see the version that was committed before the operation started – MongoDB “yields” turn large operations into small transactions • Writers can create new versions concurrent with readers • Concurrent updates to a single record cause write conflicts – MongoDB retries with back-off
  • 19. 19 Checksums • A checksum is stored with every page • WiredTiger stores the checksum with the page address (typically in a parent page) – Extra safety against reading an old, valid page image • Checksums are validated during page read – Detects filesystem corruption, random bitflips
  • 20. 20 Compression • WiredTiger uses snappy compression by default in MongoDB • supported compression algorithms – snappy [default]: good compression, low overhead – zlib: better compression, more CPU – none • Indexes are compressed using prefix compression – allows compression in memory
  • 22. 22 Writing a checkpoint 1. Write the leaves 2. Write the internal pages, including the root – the old checkpoint is still valid 3. Sync the file 4. Write the new root’s address to the metadata – free pages from old checkpoints once the metadata is durable
  • 23. 23 Durability without Journaling • MMAPv1 uses a write-ahead log (journal) to guarantee consistency – Running with “nojournal” is unsafe • WiredTiger doesn't have this need: no in-place updates – Write-ahead log can be truncated at checkpoints • Every 2GB or 60sec by default – configurable – Updates are written to optional journal as they commit • Not flushed on every commit by default • Recovery rolls forward from last checkpoint • Replication can guarantee durability
  • 24. 24 Journal and recovery • Optional write-ahead logging • Only written at transaction commit • snappy compression by default • Group commit • Automatic log archive / removal • On startup, we rely on finding a consistent checkpoint in the metadata • Check LSNs in the metadata to figure out where to roll forward from
  • 25. 25 So, what’s different about WiredTiger? • Multiversion Concurrent Control – No lock manager • Non-locking data structures – Multi-core scalability • Different representation of in-memory data vs on-disk – Enabled checksums, compression • Copy-on-write storage – Durability without a journal
  • 27. 27 What’s next for WiredTiger? • WiredTiger btree access method is optional in 3.0 • plan to make it the default in 3.2 • tuning for (many) more workloads • make Log Structured Merge (LSM) work well with MongoDB – out-of-cache, write-heavy workloads • adding encryption • more advanced transactional semantics in the storage engine API

Notas do Editor

  1. Different in-memory representation vs on-disk – opportunities to optimize plus some interesting fetures Writing a page: Combine the previous page image with in-memory changes Allocate new space in the file If the result is too big, split Always write multiples of the allocation size Can configure direct I/O to keep reads and writes out of the filesystem cache