Webinar: Managing Real Time Risk Analytics with MongoDB
1. Webinar: Managing Real Time Risk Analytics with MongoDB
will begin at 14:00 GMT / 7:00 AM PDT / 2:00 PM UTC
Audio should start immediately when you log into the event via
Audio Broadcast. You will need a VoIP headset and a reliable
internet connection for Audio Broadcast. If you are having issues
connecting, please dial 1-877-668-4493; Access code: 666 722
454.
There is a Q&A following the webinar. You can enter questions in
the chat box to the Host and Presenter.
A recording of the webinar will be available 24 hours after the
event is complete.
For any other issues please email webinars@10gen.com.
2. Easy to Start, Easy to Develop, Easy to Scale
Managing Real Time Risk Analytics with
MongoDB
10gen, Inc.
November 2012
7. Risk Analytics & Reporting
Use Case:
•Collect and aggregate risk data
•Calculate risk / exposures
•Potentially real time
Why MongoDB?
•Collect data from a single or multiple sources
•Different formats
•Documents used to create ‘pre-aggregated’ reports
•Real Time
•Aggregation Framework for reporting
•e.g. exposure for a counter party
•Internal MR or Hadoop connector
•Batch process risk data
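The per-counterparty exposure report the slide mentions maps to a `$group`/`$sum` stage in the Aggregation Framework. Below is a plain-JavaScript sketch of that grouping over an in-memory array; the collection and field names (`trades`, `counterparty`, `notional`) are hypothetical, not from the deck.

```javascript
// Hypothetical trade documents as they might sit in a MongoDB collection.
const trades = [
  { counterparty: "ACME", notional: 500000 },
  { counterparty: "ACME", notional: 250000 },
  { counterparty: "GLOBEX", notional: 100000 }
];

// In-memory equivalent of:
//   db.trades.aggregate([
//     { $group: { _id: "$counterparty", exposure: { $sum: "$notional" } } }
//   ])
function exposureByCounterparty(docs) {
  const totals = {};
  for (const t of docs) {
    totals[t.counterparty] = (totals[t.counterparty] || 0) + t.notional;
  }
  return totals;
}

console.log(exposureByCounterparty(trades));
// { ACME: 750000, GLOBEX: 100000 }
```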
8. Portfolio / Position reporting
Use Case:
•Store positions or portfolio information
•Query to find current positions/portfolios
•Query by client or trader
Why MongoDB?
•Customer/client may have many different products
•Aggregation Framework to calculate values and views
•Work on extremely large data sets
•Current and historic data
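"Aggregation Framework to calculate values and views" here would typically be a `$group` keyed by client with a `$sum` of `$multiply: ["$quantity", "$price"]`. A small in-memory sketch of that calculation (document shapes and field names are illustrative, not from the deck):

```javascript
// Hypothetical position documents, one per holding.
const positions = [
  { client: "C1", security: "ORCL", quantity: 1000, price: 30.23 },
  { client: "C1", security: "IBM",  quantity: 200,  price: 190.0 },
  { client: "C2", security: "ORCL", quantity: 50,   price: 30.23 }
];

// Mirrors: db.positions.aggregate([
//   { $group: { _id: "$client",
//               value: { $sum: { $multiply: ["$quantity", "$price"] } } } }
// ])
function portfolioValueByClient(docs) {
  const out = {};
  for (const p of docs) {
    out[p.client] = (out[p.client] || 0) + p.quantity * p.price;
  }
  return out;
}
```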
9. Reporting / Analytics requirements
•How quickly do you need answers?
•How often do you need updates?
•Requirements will drive which methods to utilise.
•Generally, the higher the latency tolerance, the greater the insight.
•Choices
•Batch calculations - large complex data volumes
•Pre-Aggregated - specific and very fast
•Real-time calculations - As needed reports and calculations
10. Batch Processing
•MongoDB internal Map Reduce
•Hadoop Map Reduce with MongoDB connector
•Insight after batch run
•For instance every hour or day
•Output to documents/collection
•Fast read once data produced
•Results not up to last millisecond
•Can generate insight from huge datasets
•Rolled up stats
•Source collections -> reporting collection
[Diagram: raw data rolled up into hourly, daily and monthly reporting collections]
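The "source collections -> reporting collection" rollup can be sketched as a batch job that buckets raw events by hour and writes the totals to a reporting store. This is an in-memory illustration of the pattern only; the field names (`ts`, `exposure`) and the hourly bucketing are assumptions, not from the deck.

```javascript
// Raw risk events, as a batch job might read them from a source collection.
const raw = [
  { ts: new Date("2012-11-01T09:05:00Z"), exposure: 100 },
  { ts: new Date("2012-11-01T09:40:00Z"), exposure: 50 },
  { ts: new Date("2012-11-01T10:10:00Z"), exposure: 75 }
];

// Batch rollup: sum exposure per hour. The result object stands in for the
// reporting collection the batch run would write to.
function rollupHourly(events) {
  const hourly = {};
  for (const e of events) {
    const bucket = e.ts.toISOString().slice(0, 13); // e.g. "2012-11-01T09"
    hourly[bucket] = (hourly[bucket] || 0) + e.exposure;
  }
  return hourly;
}
```

Reads against the rolled-up output are then fast, at the cost of results only being current as of the last batch run, exactly the trade-off the slide describes.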
11. Sharded MongoDB + Hadoop
[Diagram: data partitioned across Shard 1 through Shard 5, each shard feeding a colocated Hadoop Node for parallel batch processing]
12. Use Query Language
•Query across documents using MongoDB JSON query language
•Infer results in the application code
•Dynamic - but what happens when we have 1 billion documents?
•Indexing strategy is key
•var data = db.pl.find({ positionId: 1234 })[0]
{
"_id" : ObjectId("50990a10fd421cb025407cb1"),
"positionId" : 1234,
"security" : "ORCL",
"quantity" : 1000,
"price" : 30.23,
"currency" : "USD"
}
data.price * data.quantity = 30230.00
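Why the indexing strategy matters at a billion documents: without an index on `positionId`, the server must scan every document; with one, the lookup is a direct seek. A rough in-memory analogue of that difference, using a `Map` as a stand-in for the index (the data mirrors the slide's example document):

```javascript
// A tiny stand-in for the pl collection.
const pl = [
  { positionId: 1234, security: "ORCL", quantity: 1000, price: 30.23, currency: "USD" },
  { positionId: 5678, security: "IBM",  quantity: 200,  price: 190.0, currency: "USD" }
];

// An index on positionId behaves like a keyed lookup structure:
// roughly what db.pl.ensureIndex({ positionId: 1 }) buys the query above.
const byPositionId = new Map(pl.map(d => [d.positionId, d]));

const data = byPositionId.get(1234); // direct seek, no collection scan
const marketValue = data.price * data.quantity; // 30230.00, as on the slide
```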
13. Leverage schema design
•Group useful data together into documents
•Utilise upsert and $inc functionality of MongoDB
•Pre-aggregate reports
•Incrementing counters with $inc is lightweight
•Fast pre calculated data
•Low latency retrieval
•http://docs.mongodb.org/manual/use-cases/pre-aggregated-reports/
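The upsert + `$inc` pattern the slide points to keeps the report pre-aggregated at write time. Below, an in-memory simulation of that update; the mongo-shell command in the comment is the real shape, while the store, key format, and counter names (`dailyRisk`, `trades`, `notional`) are hypothetical.

```javascript
// Mimics: db.dailyRisk.update(
//   { _id: key },
//   { $inc: { trades: 1, notional: n } },
//   { upsert: true })
const dailyRisk = {}; // stand-in for the pre-aggregated reports collection

function incReport(key, notional) {
  if (!dailyRisk[key]) {
    dailyRisk[key] = { trades: 0, notional: 0 }; // upsert: create on first write
  }
  dailyRisk[key].trades += 1;          // $inc: { trades: 1 }
  dailyRisk[key].notional += notional; // $inc: { notional: n }
}

incReport("2012-11-01:ORCL", 30230);
incReport("2012-11-01:ORCL", 15000);
// The report document is already aggregated at read time:
// dailyRisk["2012-11-01:ORCL"] -> { trades: 2, notional: 45230 }
```

Each trade costs one cheap in-place increment, and reads are a single keyed fetch, which is what gives the low-latency retrieval the slide claims.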