This document discusses how MongoDB can help capital markets firms address the challenges of traditional relational database solutions for tasks such as risk analysis and reporting, market data aggregation, and reference data management. It gives examples of how MongoDB's flexible schema, replication, and sharding capabilities allow global reference data to be distributed in real time for low-latency access. The document argues that MongoDB can significantly reduce costs compared with existing ETL-based approaches, because updates are made in one place and distributed immediately.
Real World MongoDB: Use Cases from Financial Services by Daniel Roberts
3. 3
• Introduction
• We feel your pain
• Traditional Solutions not working
• What are your peers doing?
• How MongoDB has helped them
• Current Use Cases
• Future Opportunities
6. 6
• Tick Data Capture & Analysis
• Reference Data Management
• Risk Analysis & Reporting
• Trade Repository
• Portfolio / Position Reporting
• Caching Tiers
• Mainframe offloading
• Know Your Client
9. 9
Use Case:
– Collect and aggregate risk data
– Calculate risk / exposures, potentially real-time
Why MongoDB?
• Collect data from a single source or multiple sources
• Support for polymorphic data formats
• Documents used to create ‘pre-aggregated’ reports
• Aggregation Framework or Map Reduce
• Horizontal scale
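The "Aggregation Framework" point can be illustrated with a minimal sketch. The record layout and field names (`counterparty`, `asset_class`, `exposure`, `tenor`) are assumptions made for illustration and do not come from the slides. The pipeline uses the standard `$group`, `$sum` and `$sort` stages; the pure-Python function mirrors what that pipeline would compute, so the example runs without a database server.

```python
from collections import defaultdict

# Hypothetical risk records. The second record carries an extra field,
# as polymorphic source feeds might produce.
risk_docs = [
    {"counterparty": "ACME", "asset_class": "FX", "exposure": 120.0},
    {"counterparty": "ACME", "asset_class": "Rates", "exposure": 80.0, "tenor": "5Y"},
    {"counterparty": "Globex", "asset_class": "Equity", "exposure": 50.0},
]

# Pipeline a driver could pass to collection.aggregate():
# total exposure per counterparty, sorted by counterparty name.
pipeline = [
    {"$group": {"_id": "$counterparty", "total_exposure": {"$sum": "$exposure"}}},
    {"$sort": {"_id": 1}},
]


def total_exposure_by_counterparty(docs):
    """Pure-Python equivalent of the $group/$sum/$sort pipeline above."""
    totals = defaultdict(float)
    for doc in docs:
        totals[doc["counterparty"]] += doc["exposure"]
    return dict(sorted(totals.items()))


exposures = total_exposure_by_counterparty(risk_docs)
```

The result could itself be stored as a document, which is one way to read the "pre-aggregated reports" bullet.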
12. 12
• Capture real-time market data (multi-asset, top of book, depth of book, even news)
• Load historical data
• Aggregate data into bars, daily, monthly intervals
• Enable queries & analysis on raw ticks or aggregates
• Drive backtesting or automated signals
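Aggregating raw ticks into bars can be sketched as follows. This is a minimal illustration under stated assumptions: the tick layout (`ts` in epoch seconds, `price`, `size`) and the function name `build_bars` are hypothetical, and a single instrument is assumed.

```python
def build_bars(ticks, interval_seconds=60):
    """Group ticks into open/high/low/close/volume bars keyed by bar start time."""
    bars = {}
    # Process ticks in time order so that open and close are well defined.
    for tick in sorted(ticks, key=lambda t: t["ts"]):
        price = tick["price"]
        # Start of the interval this tick falls into.
        bar_start = tick["ts"] - (tick["ts"] % interval_seconds)
        bar = bars.get(bar_start)
        if bar is None:
            bars[bar_start] = {
                "open": price,
                "high": price,
                "low": price,
                "close": price,
                "volume": tick["size"],
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
            bar["volume"] += tick["size"]
    return bars


# Hypothetical ticks, deliberately out of order.
ticks = [
    {"ts": 615, "price": 10.5, "size": 50},
    {"ts": 600, "price": 10.0, "size": 100},
    {"ts": 659, "price": 10.2, "size": 25},
    {"ts": 660, "price": 10.1, "size": 10},
    {"ts": 700, "price": 9.9, "size": 40},
]
bars = build_bars(ticks)
```

Each bar is a plain dictionary, so it could be stored as one document per instrument and interval and queried alongside the raw ticks.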
13. 13
[Diagram: tick data capture architecture]
Components: Exchanges/Markets/Brokers; News & social networking sources; Feed Handler; Capturing Application; Low Latency Applications; Higher Latency Trading Applications; Backtesting and Analysis Applications
Data: Trades/metrics; Market Data; Cached Static & Aggregated Data; Orders
Data Types
• Top of book
• Depth of book
• Multi-asset
• Derivatives
• News (text, video)
• Social Networking
16. 16
• How do you globally distribute reference data?
– Polymorphic data
• Price / Products / Securities Master
• Counterparty information - KYC
• Corporate Actions
• Golden / single source of truth
– Often changing in structure
• e.g. new products
– Often high volume
• How is this typically solved today?
17. 17
• What do reference data solutions look like today?
• Storage
– Relational Database or Caching Technologies
• Replication
– ETL or Messaging
• Complex, Costly and Brittle
– Maintenance
• schema changes
• infrastructure
– Multiple technologies
18. 18
• What features in MongoDB are ideally suited for globally replicated reference data systems?
1. Dynamic and flexible schema
19. 19
IssID    IssuerName                  PVCurrency
117883   DWS Vietnam Fund            USD
69461    Independence III Cdo Ltd    USD
102862   Zamano Plc                  EUR
73277    Green Way                   BMD
65134    First European Growth Inc.  CHF

SecID    EventID  Company_Meeting  IssID
762288   407341   AGM              117883
81198    243459   SDCHG            69461
422999   410626   AGM              102862
422999   243440   SDCHG            102862
75128    20056    ISCHG            65134
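One plausible way to express these two relational tables with a flexible schema is to embed each issuer's events in the issuer document. The sketch below is an assumption about the document shape, not the layout shown in the original slides; the helper name `embed_events` and the choice of `IssID` as `_id` are illustrative.

```python
# Rows copied from the two tables above.
issuers = [
    {"IssID": 117883, "IssuerName": "DWS Vietnam Fund", "PVCurrency": "USD"},
    {"IssID": 69461, "IssuerName": "Independence III Cdo Ltd", "PVCurrency": "USD"},
    {"IssID": 102862, "IssuerName": "Zamano Plc", "PVCurrency": "EUR"},
    {"IssID": 73277, "IssuerName": "Green Way", "PVCurrency": "BMD"},
    {"IssID": 65134, "IssuerName": "First European Growth Inc.", "PVCurrency": "CHF"},
]
events = [
    {"SecID": 762288, "EventID": 407341, "Company_Meeting": "AGM", "IssID": 117883},
    {"SecID": 81198, "EventID": 243459, "Company_Meeting": "SDCHG", "IssID": 69461},
    {"SecID": 422999, "EventID": 410626, "Company_Meeting": "AGM", "IssID": 102862},
    {"SecID": 422999, "EventID": 243440, "Company_Meeting": "SDCHG", "IssID": 102862},
    {"SecID": 75128, "EventID": 20056, "Company_Meeting": "ISCHG", "IssID": 65134},
]


def embed_events(issuers, events):
    """Build one document per issuer with its events embedded as a list."""
    docs = []
    for issuer in issuers:
        docs.append({
            "_id": issuer["IssID"],  # assumption: issuer ID doubles as document key
            "IssuerName": issuer["IssuerName"],
            "PVCurrency": issuer["PVCurrency"],
            # The foreign key is dropped because the parent document implies it.
            "events": [
                {k: v for k, v in event.items() if k != "IssID"}
                for event in events
                if event["IssID"] == issuer["IssID"]
            ],
        })
    return docs


issuer_docs = {doc["_id"]: doc for doc in embed_events(issuers, events)}
```

With this shape a new kind of event, or an extra field on one issuer, can be added to the affected documents only, without a schema migration for the rest.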
21. 21
• What features in MongoDB are ideally suited for globally replicated reference data systems?
1. Dynamic and flexible schema
2. Built in replication and high availability
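As a rough sketch of what built-in replication looks like in practice, the snippet below builds a replica set configuration document with one member per data centre. The `_id`, `members`, `host`, `priority` and `tags` fields are part of MongoDB's replica set configuration format; the set name, host names, data-centre labels and the helper `build_replica_set_config` are hypothetical.

```python
def build_replica_set_config(set_name, members):
    """Return a replica set configuration document.

    `members` is a list of (host, data_centre, priority) tuples.
    """
    return {
        "_id": set_name,
        "members": [
            {"_id": index, "host": host, "priority": priority, "tags": {"dc": dc}}
            for index, (host, dc, priority) in enumerate(members)
        ],
    }


# Hypothetical deployment: the New York member has the highest priority,
# so it is the preferred primary; the others serve local reads.
config = build_replica_set_config(
    "refdata",
    [
        ("nyc-db1.example.com:27017", "NYC", 2),
        ("ldn-db1.example.com:27017", "LDN", 1),
        ("hkg-db1.example.com:27017", "HKG", 1),
    ],
)

# Connection string asking the driver to read from the lowest-latency member.
hosts = ",".join(member["host"] for member in config["members"])
uri = f"mongodb://{hosts}/?replicaSet={config['_id']}&readPreference=nearest"
```

Reads served by a nearby secondary may lag the primary slightly, which is the usual trade-off for low-latency local access.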
23. 23
• What features in MongoDB are ideally suited for globally replicated reference data systems?
1. Dynamic and flexible schema
2. Built in replication and high availability
3. Tag Aware Sharding (Geo)
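The idea behind tag aware sharding can be modelled in a few lines: shards carry tags, ranges of the shard key are assigned to tags, and a document is placed on a shard whose tag matches its key range. The code below is a conceptual model only, not MongoDB's balancer; the shard names, tags and the `(region, SecID)` shard key are assumptions. In the mongo shell of that era, the corresponding setup helpers were `sh.addShardTag` and `sh.addTagRange`.

```python
import math

# Hypothetical shards and the tags assigned to them.
SHARD_TAGS = {
    "shard-nyc": ["NYC"],
    "shard-ldn": ["LDN"],
}

# Tag ranges on a hypothetical compound shard key (region, SecID).
# Each range is half-open: lower bound inclusive, upper bound exclusive.
TAG_RANGES = [
    (("AMER", 0), ("AMER", math.inf), "NYC"),
    (("EMEA", 0), ("EMEA", math.inf), "LDN"),
]


def tag_for(key):
    """Return the tag whose range contains the shard key, or None."""
    for lower, upper, tag in TAG_RANGES:
        if lower <= key < upper:
            return tag
    return None


def shards_for(key):
    """Return the shards eligible to hold a document with this shard key."""
    tag = tag_for(key)
    if tag is None:
        # Keys outside every tagged range may live on any shard.
        return sorted(SHARD_TAGS)
    return sorted(name for name, tags in SHARD_TAGS.items() if tag in tags)


amer_shards = shards_for(("AMER", 762288))
emea_shards = shards_for(("EMEA", 422999))
untagged_shards = shards_for(("APAC", 75128))
```

Pinning each region's key range to shards in that region keeps the data physically close to the applications that use it most.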
25. 25
[Diagram: reference data distribution today]
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
Source Master Data (RDBMS) → ETL (multiple links) → Destination Data (RDBMS)
Each represents
• People $
• Hardware $
• License $
• Reg penalty $
• & other downstream problems
26. 26
[Diagram: reference data distribution with MongoDB]
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
MongoDB Primary → Real-time (multiple links) → MongoDB Secondaries
Each represents
• No people $
• Less hardware $
• Less license $
• No penalty $
• & far fewer problems
27. 27
Distribute reference data globally in real-time for fast local access and querying
Problem
• Delays of up to 20 hours in distributing data via ETL
• Had to manage 20 distributed systems with the same data
• Incurring regulatory penalties from missing SLAs
• Stale data caused operational issues
Why MongoDB
• Dynamic schema management: update immediately & in one place
• Auto-replication: data distributed in real-time
• Both cache and database: cache always up-to-date
• Simple data modeling & analysis: easy changes and understanding
Results
• Will save about $40,000,000 in costs and penalties over 5 years
• Greater throughput means charging more to internal groups
• Network and disk speed is the bottleneck, not software and applications
29. 29
• Not Just Capital Markets
• Insurance / Consumer banking
• Inspiration from other Industries
– Telco
– Retail
• Opportunity to consolidate and utilise data in
interesting ways
32. 32
Resource Location
MongoDB Downloads 10gen.com/products/mongodb
Free Online Training education.10gen.com
Webinars and Events 10gen.com/events
White Papers 10gen.com/white-papers
Case Studies 10gen.com/customers
Presentations 10gen.com/presentations
Documentation docs.mongodb.org
Additional Info info@10gen.com
Editor's Notes
Increased regulation means increased reporting, more IT effort, spend and complexity, and increasing volumes of data: the three Vs of volume, velocity and variability. Firms need to keep regulators happy.
There is a move to new technologies to solve these problems in financial services, and the starting point is how we manage these large volumes of data at velocity, with continued variability and increasing volume. What kind of areas are we working on?