GHX uses MongoDB to store over 500,000 documents per day relating to healthcare transactions. They evaluated NoSQL databases and selected MongoDB due to its scalability, flexible data model, and developer friendliness. Their data model involves storing events that are assigned to workers by a distributed event broker. After 9 months in production, MongoDB has supported throughput of over 90,000 operations per second through sharding and indexing, meeting GHX's requirements.
MongoDB in Denver: How Global Healthcare Exchange is Using MongoDB
1. GHX proprietary information: Please do not copy or distribute
GHX Use of MongoDB
Bill Perkins: Director of Data and Architecture
Jeff Sherard: Manager of Database Services
3. What is GHX?
• 500K+ documents a day
– Scaling to many millions
• Saved $5.5 billion over the last 5 years
• Connects 1000's of healthcare providers with 100's of healthcare suppliers (85% of the US market)
[Diagram: healthcare providers and suppliers exchanging documents through GHX]
• Document types: Purchase Order, Purchase Order Acknowledgement, Advanced Shipment Notice, Invoice, Contract, Item Master, Vendor Master…
4. Yah, so tell me: what did the Mongo selection process feel like?
5. This is me being greeted by the committee for homogeneous databases
6. In this picture I am standing to the right of the photographer (see arrow)
7. This is me after the selection of Mongo was approved
9. Applicability of Different NoSQL Data Models
Characteristics are scored 1-4, lower is better: Performance / Scalability / Flexibility / Complexity / Index Support.
• Key-Value Stores (1 / 3 / 1 / 1 / 4). Examples: Oracle NoSQL, Redis, Riak, Membrain, LevelDB, Voldemort, Castle, Aerospike. Applicability: LOW. Stores a key associated with a binary; good for cache, but we will need multiple keys on objects.
• Column Store (1 / 1 / 2 / 3 / 3). Examples: Cassandra, HBase, Hypertable, Accumulo. Applicability: MEDIUM. Very scalable with good tooling support, but no built-in secondary indexes and a steep learning curve.
• Document Store (2 / 2 / 1 / 1 / 1). Examples: MongoDB, Couchbase, RavenDB, RethinkDB, CouchDB. Applicability: HIGH. Flexible model, supports multiple indexes, low learning curve. Matches our model.
• Graph Database (2 / 3 / 4 / 4 / 2). Examples: Neo4j, InfiniteGraph, OrientDB. Applicability: LOW. Very good at quick storage, but queries are harder, horizontal scale is questionable, and our data is not graph-oriented.
10. DBs that made the first cut
• Oracle NoSQL (LOW): Low traction; potential community is high due to Oracle. This is BerkeleyDB with an Oracle wrapper. May be good for cache.
• Cassandra (MEDIUM): High traction, good community. Companies are using this to solve very large problems. Scales well, but does not support secondary indexes natively. Good tool support and good integration with ETL / BI tools.
• HBase (MEDIUM): High traction, good community, but no one is using this successfully for OLTP problems. Very complex.
• MongoDB (HIGH): High traction, strong community. Lots of features, including many types of indexes. Leading in deployments and growing fastest. Developer friendly. Tooling for operations is command line.
11. So then, what was it like when Mongo battled Cassandra?
12. The one who looks like "David" in this picture is actually Bryan
13. Performance Test Results
• Mongo config:
– Primary and 2 replicas
• Cassandra config:
– 3-node ring
• Mongo outperformed Cassandra in each test
• Different architectures: Mongo performs best in read-heavy scenarios, while Cassandra performs best in write-heavy scenarios
• Mongo's latency was better in each test
[Chart: OPS by read/write % (99/1, 90/10, 70/30, 50/50, 30/70), MongoDB vs. DataStax]
[Chart: Latency by read/write % (99/1, 90/10, 70/30, 50/50, 30/70), MongoDB vs. DataStax]
14. Mongo: Index Test Results
• Added 3 secondary indexes to the Mongo collection
• The YCSB test reflects the cost of index management on writes, but does not use the secondary indexes for reads

Throughput (ops/second) by read/write ratio:

Read/Write ratio:    99/1    90/10   70/30   50/50   30/70
Primary key only:    36274   29244   21042   17381   15270
With 3 indexes:      31151   23297   15158   11742    9142
Performance drop:    14%     20%     28%     32%     40%

[Charts: throughput and latency by read/write ratio, with and without secondary indexes]
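The performance-drop row follows directly from the two throughput rows; a quick check of the arithmetic:

```python
# Throughput (ops/s) from the YCSB index test, per read/write ratio.
ratios = ["99/1", "90/10", "70/30", "50/50", "30/70"]
primary_key_only = [36274, 29244, 21042, 17381, 15270]
with_3_indexes = [31151, 23297, 15158, 11742, 9142]

for ratio, base, indexed in zip(ratios, primary_key_only, with_3_indexes):
    drop = (base - indexed) / base
    print(f"{ratio}: {drop:.0%} drop")  # 14%, 20%, 28%, 32%, 40%
```

The drop grows with the write fraction, which is what the slide expects: secondary indexes are maintained on every write but never consulted for these reads.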
15. Mongo: Sharding Test Results
• 90/10 read/write ratio
• This was not the same configuration as the previous results; we are interested in the gain, not the raw value
[Diagram: application servers running YCSB, each through a MongoS router, to MongoD shard 1 and shard 2]

Shards   Ops/Second   Gain %
1        12K          ----
2        22.6K        88.3%
16. Rolling Up Results to Match Our Requirements

Criteria                   Weight   Mongo   Cassandra
Scalability                3
Development throughput     1
Flexible Data Model        3
Developer Impact           2
Database Services Impact   2
19. Event-Driven Work Broker
[Diagram: Work Broker distributing events to a pool of workers (Worker A, Worker B, Worker C)]
1. Broker reads events from Mongo
2. Broker assigns each event (e.g. Event A) to exactly one worker
3. Worker A processes Event A, generating a Work Log and Event B
4. Repeat…
20. Distributed Event Broker
• Many readers read events from Mongo and assign them to workers as fast as the workers can take the work
…
21. Assignment Approach 1: findAndModify
• Steps
– findAndModify:
• Find an event of a certain type that is not assigned
• Mark it as assigned
• Do this BATCH_SIZE times
• Result
– Really high WriteLock% (90+)
– Upset stakeholders
– (MongoDB did not recommend this approach)
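The slide names the pattern but not the call shape; below is a minimal in-memory sketch of the approach, repeating an atomic claim BATCH_SIZE times. The collection contents and field names here are hypothetical stand-ins, not GHX's schema; in real MongoDB each claim is a single findAndModify, and it is that per-document write serialization that drove WriteLock% so high:

```python
BATCH_SIZE = 3

# Hypothetical stand-in for a Mongo collection of event documents.
events = [{"_id": i, "type": "order", "assigned": False} for i in range(5)]

def find_and_modify(event_type):
    """Atomically claim one unassigned event of the given type
    (stand-in for a single findAndModify call on the server)."""
    for doc in events:
        if doc["type"] == event_type and not doc["assigned"]:
            doc["assigned"] = True
            return doc
    return None

# Approach 1: repeat the atomic claim BATCH_SIZE times.
batch = []
for _ in range(BATCH_SIZE):
    doc = find_and_modify("order")
    if doc is None:
        break
    batch.append(doc)

print([d["_id"] for d in batch])  # prints [0, 1, 2]
```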
22. Assignment Approach 2: N Updates and a Find
• Steps
– Update an event of a certain type that is not assigned
• Mark it as assigned with a batch ID
• Do this BATCH_SIZE times
– Find all events with that batch ID
• Result
– Lots of loss: # modified > # found
– Upset stakeholders
– We only tested this; it was never used in production
23. Assignment Approach 3: Find-Modify-Find
• Steps
– Find event _IDs where the event is unassigned and of a certain type, limit BATCH_SIZE
– Modify the set of events found, writing a batch ID to assign them
– Find the modified events by batch ID
• Result
– WriteLock% went down
– Lots of loss
• Count of # found (first find) > # modified > # found (by batch ID)
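The loss above comes from racing readers. A small deterministic sketch (hypothetical in-memory collection, two interleaved readers) shows why the first find can return more events than the conditional modify actually claims:

```python
# Hypothetical in-memory event collection.
events = [{"_id": i, "assigned_batch": None} for i in range(4)]

def find_unassigned_ids(limit):
    return [d["_id"] for d in events if d["assigned_batch"] is None][:limit]

def modify_if_unassigned(ids, batch_id):
    """Conditional update: only claims events that are still unassigned."""
    claimed = 0
    for d in events:
        if d["_id"] in ids and d["assigned_batch"] is None:
            d["assigned_batch"] = batch_id
            claimed += 1
    return claimed

def find_by_batch(batch_id):
    return [d["_id"] for d in events if d["assigned_batch"] == batch_id]

# Two readers race: both run their first find before either modifies.
ids_a = find_unassigned_ids(limit=3)   # reader A sees [0, 1, 2]
ids_b = find_unassigned_ids(limit=3)   # reader B sees the same [0, 1, 2]

modified_a = modify_if_unassigned(ids_a, batch_id="A")  # claims all 3
modified_b = modify_if_unassigned(ids_b, batch_id="B")  # claims 0, too late

print(len(ids_b), modified_b, len(find_by_batch("B")))  # prints 3 0 0
```

Reader B found 3 events but modified 0 and retrieved 0 by its batch ID, reproducing the "# found > # modified > # found" pattern from the slide.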
24. Assignment Approach 4: Don't Do It
• Steps
– Assign each worker a range of events on startup
– Mark every event with a random number token (1-1000)
– A worker updates an event whose token is in its range
• Results (moving to this now)
– No conflict or loss expected
– A WorkerManager rebalances ranges and monitors the health of workers
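A minimal sketch of the token-range idea under the stated assumptions (tokens 1-1000, ranges handed out on startup); the worker names and helper functions are hypothetical illustrations, not GHX's implementation:

```python
import random

TOKEN_MIN, TOKEN_MAX = 1, 1000

def tag_event(event):
    """Every event is written with a random token."""
    event["token"] = random.randint(TOKEN_MIN, TOKEN_MAX)
    return event

def split_ranges(worker_ids):
    """WorkerManager stand-in: give each worker a disjoint token range
    on startup, so no two workers ever compete for the same event."""
    n = len(worker_ids)
    width = (TOKEN_MAX - TOKEN_MIN + 1) // n
    ranges = {}
    for i, wid in enumerate(worker_ids):
        lo = TOKEN_MIN + i * width
        hi = TOKEN_MAX if i == n - 1 else lo + width - 1
        ranges[wid] = (lo, hi)
    return ranges

def events_for(worker_id, ranges, events):
    lo, hi = ranges[worker_id]
    return [e for e in events if lo <= e["token"] <= hi]

ranges = split_ranges(["worker-a", "worker-b", "worker-c"])
events = [tag_event({"_id": i}) for i in range(100)]

# Each event falls in exactly one worker's range: no conflict, no loss.
total = sum(len(events_for(w, ranges, events)) for w in ranges)
print(total)  # prints 100: every event claimed exactly once
```

Because the ranges are disjoint and cover the whole token space, each event belongs to exactly one worker with no coordination at claim time, which is why no conflict or loss is expected.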
25. Throughput Challenges
• 25MM customer documents * 20 activities per doc * 4K per activity = 2TB of throughput per day
• 90,000 operations per second at peak load
Scaling
• 7 to 10 shards (21-30 MongoD processes)
– 9,000 to 12,700 db operations per shard per second
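The sizing above is straightforward arithmetic; a quick check, assuming "4K" means 4 KiB and "MM" means million (90,000/7 rounds to ~12,857, close to the slide's 12,700 upper bound):

```python
docs_per_day = 25_000_000        # 25MM customer documents
activities_per_doc = 20
bytes_per_activity = 4 * 1024    # 4K per activity (assumed 4 KiB)

daily_bytes = docs_per_day * activities_per_doc * bytes_per_activity
print(f"{daily_bytes / 1e12:.2f} TB per day")  # prints 2.05 TB per day

# Peak load spread across the planned shard counts.
peak_ops = 90_000
for shards in (7, 10):
    print(f"{shards} shards: {peak_ops // shards:,} ops per shard per second")
```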
Retention tiers:
• Execution: < 1 day
• Audit: 30 days
• Archive: quite a while