Replication Online

MongoDB

Open source, high performance database

Replication

Summer 2012

Why Have Replication?

• High Availability (auto-failover)

• Read Scaling (extra copies to read from)

• Backups
– Online, delayed copy (guards against fat-finger errors)
– Point in Time (PiT) backups

• Use a (hidden) replica for secondary workloads
– Analytics
– Data processing
– Integration with external systems

Planned
– Hardware upgrade
– O/S or file-system tuning
– Relocation of data to new file-system / storage
– Software upgrade

Unplanned
– Hardware failure
– Data center failure
– Region outage
– Human error
– Application corruption

• A cluster of N servers
• All writes to primary
• Reads can go to the primary (default) or a secondary
• Any (one) node can be primary
• Consensus election of primary
• Automatic failover
• Automatic recovery

[Diagram: Member 1, Member 2, Member 3]

• A replica set is made up of 2 or more nodes

[Diagram: Member 1, Member 2 (Primary), Member 3]

• Election establishes the PRIMARY
• Data replicates from the PRIMARY to the SECONDARIES

[Diagram: Member 2 is DOWN; Member 1 and Member 3 negotiate a new master]

• The PRIMARY may fail
• A new PRIMARY is elected automatically if a majority exists

[Diagram: Member 2 is DOWN; after negotiating a new master, one surviving member is Primary]

• New PRIMARY elected
• Replica Set re-established

[Diagram: Member 2 is Recovering; the newly elected member remains Primary]

• Automatic recovery

[Diagram: Member 1, Member 2, Member 3; the newly elected member remains Primary]

• Replica Set re-established

Understanding automatic failover

[Diagram: Primary, Secondary, Secondary]

As long as a partition can see a majority (>50%) of the cluster, it will elect a primary.
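The majority rule can be sketched in a few lines of Python. This is an illustration only, not MongoDB's election code; the function name `can_elect_primary` is made up for this example.

```python
def can_elect_primary(visible_members: int, total_members: int) -> bool:
    """Return True if a partition sees a strict majority (>50%) of the set."""
    if total_members <= 0 or not 0 <= visible_members <= total_members:
        raise ValueError("expected 0 <= visible_members <= total_members, total > 0")
    # Strict majority: seeing exactly half of the members is not enough.
    return 2 * visible_members > total_members


if __name__ == "__main__":
    # (visible, total) pairs: 2 of 3, 1 of 3, and 2 of 4 members reachable.
    for visible, total in [(2, 3), (1, 3), (2, 4)]:
        print(visible, total, can_elect_primary(visible, total))
```

With 2 of 3 members visible the check passes; with 1 of 3, or with exactly half (2 of 4), it fails and no primary can be elected on that side.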

[Diagram: Primary, Secondary, Failed Node]

66% of cluster visible. Primary is elected.

[Diagram: Secondary, Failed Node, Failed Node]

33% of cluster visible. Read-only mode.

[Diagram: before: Primary, Secondary, Secondary; after: Primary, Secondary, Failed Node]

66% of cluster visible. Primary is elected.

[Diagram: before: Primary, Secondary, Secondary; after: Secondary, Failed Node, Failed Node]

33% of cluster visible. Read-only mode.

[Diagram: Primary, Secondary, Secondary, Secondary]

[Diagram: before: Primary, Secondary, Secondary, Secondary; after: Secondary, Secondary, Failed Node, Failed Node]

50% of cluster visible. Read-only mode.

Avoid single points of failure

[Diagram: Primary, Secondary, Secondary; Top of rack switch]

Rack falls over

[Diagram: Primary, Secondary, Secondary]

Loss of internet

Building burns down

[Diagram: San Francisco: Primary, Secondary; Dallas: Secondary]

[Diagram: San Francisco: Primary (priority 1), Secondary (priority 1); Dallas: Secondary (priority 0)]

Dallas is the disaster recovery data center. It will never become primary automatically.
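One way to express this layout is through member priorities in the replica set configuration document. The sketch below builds such a document as a plain Python dict. The set name `rs0` and the hostnames are hypothetical, and the helper `election_candidates` exists only for this illustration; `_id`, `members`, `host`, and `priority` are the standard configuration fields.

```python
from typing import Dict, List


def build_config() -> Dict:
    """Config sketch: two electable members in San Francisco, one priority-0 member in Dallas."""
    return {
        "_id": "rs0",  # hypothetical replica set name
        "members": [
            {"_id": 0, "host": "sf1.example.net:27017", "priority": 1},
            {"_id": 1, "host": "sf2.example.net:27017", "priority": 1},
            # priority 0: still replicates data, but never becomes primary automatically
            {"_id": 2, "host": "dallas1.example.net:27017", "priority": 0},
        ],
    }


def election_candidates(config: Dict) -> List[str]:
    """Hosts that are allowed to become primary (priority greater than 0)."""
    return [m["host"] for m in config["members"] if m.get("priority", 1) > 0]


if __name__ == "__main__":
    print(election_candidates(build_config()))
```

A document of this shape is what would be handed to `rs.initiate()` in the mongo shell; here it is only built and inspected locally.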

[Diagram: one member each in San Francisco, New York, and Dallas: one Primary and two Secondaries]

Fast recovery

[Diagram: Primary, Secondary, Arbiter]

Is this a good idea?

[Diagram: steps 1 and 2, each showing Primary, Secondary, Arbiter]

[Diagram: steps 1 to 3, each showing Primary, Secondary, Arbiter; in step 3 an additional Secondary performs a Full Sync]

Uh oh. Full Sync is going to use a lot of resources on the primary, so I may have downtime or degraded performance.

[Diagram: steps 1 and 2, each showing Primary, Secondary, Secondary]

[Diagram: steps 1 to 3, each showing Primary, Secondary, Secondary; in step 3 an additional Secondary performs a Full Sync]

Sync can happen from a secondary, which will not impact traffic on the Primary.
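The difference between the two layouts can be checked with a toy model. This is not MongoDB's sync-source selection; the role strings and the helper name `secondary_sync_sources` are assumptions made for this example, and it assumes one data-bearing secondary has been lost and its replacement needs a full sync. Arbiters hold no data, so they never count as a source.

```python
from typing import List


def secondary_sync_sources(roles: List[str]) -> int:
    """Count healthy secondaries that could serve a full sync instead of the primary."""
    return sum(1 for role in roles if role == "secondary")


# Layout A: Primary + Secondary + Arbiter, after the secondary is lost.
with_arbiter = ["primary", "failed", "arbiter"]
# Layout B: Primary + Secondary + Secondary, after one secondary is lost.
three_data_members = ["primary", "secondary", "failed"]

if __name__ == "__main__":
    print(secondary_sync_sources(with_arbiter))        # 0: the full sync must read from the primary
    print(secondary_sync_sources(three_data_members))  # 1: the full sync can read from a secondary
```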

• Avoid single points of failure
– Separate racks
– Separate data centers
• Avoid long recovery downtime
– Use journaling
– Use 3+ replicas
• Keep your actives close
– Use priority to control where failovers happen

Q&A after this session

Editor's Notes

• Change operations are written to the oplog
• The oplog is a capped collection (fixed size)
– Must have enough space to allow new secondaries to catch up after copying from a primary
– Must have enough space to cope with any applicable slaveDelay
• Secondaries query the primary's oplog and apply what they find
• All replicas contain an oplog
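The sizing concern in these notes comes down to simple arithmetic: the oplog must cover the time a new secondary needs for its initial copy, and also any configured slaveDelay. The sketch below is a back-of-the-envelope estimate, not a MongoDB API; the function names and the sample figures are hypothetical.

```python
def oplog_window_hours(oplog_size_mb: float, write_rate_mb_per_hour: float) -> float:
    """Hours of history a fixed-size oplog holds at a steady write rate."""
    if write_rate_mb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / write_rate_mb_per_hour


def oplog_is_large_enough(window_hours: float,
                          initial_copy_hours: float,
                          slave_delay_hours: float = 0.0) -> bool:
    """The window must outlast both the initial copy and any configured delay."""
    return window_hours > initial_copy_hours and window_hours > slave_delay_hours


if __name__ == "__main__":
    window = oplog_window_hours(oplog_size_mb=10240, write_rate_mb_per_hour=512)
    print(window)                                 # 20.0
    print(oplog_is_large_enough(window, 6, 1))    # True
    print(oplog_is_large_enough(window, 6, 24))   # False: a 24 h delay exceeds the window
```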
