This document discusses analytics with NoSQL databases. It begins by defining different types of analytics like alerting, getting insights, and transforming data. It then discusses challenges like having lots of data in many formats from different sources. It provides examples of real-time analytics like credit card fraud detection and collaborative filtering. It argues that MongoDB is useful for analytics because it allows for horizontal scalability, flexibility to add new data, and high performance for ingesting and serving operational analytics. Specific use cases discussed include retail price optimization, smart grid analytics, mobile analytics, and financial customer insights. It concludes that analytics now require integrating real-time context and that MongoDB can help process data where it lands more flexibly.
Webinar: Analytics with NoSQL: Why, for What, and When?
1.
Analytics with NoSQL: why? for what? and when?
Edouard Servan-Schreiber, Ph.D.
Director for Solution Architecture
10gen
2.
What is Analytics?
• Alerting
– Let me know when a cell tower has failed
• Getting insights - Strategic Analytics
– Churn rates, Customer segment distribution
• Transforming, Enriching, Aggregating
– Identifying faces in videos and images
– Identifying voices in recordings
• Operating smarter
– Having a pre-approved offer ready for a customer who calls after expressing interest on the web
• Analytics-driven actions in real-time
– Smart modeling integrating real time context
– This customer has lower status but suffered multiple delays in the past month, and should have priority over this higher-status customer right now on this flight
3.
Why is this hard?
• Lots of data
– but few eyes and slow brains
• Lots of data
– just as many formats
• Lots of data
– many owners with unaligned interests and concerns
• Can you get your analysis in a useful timeframe?
• Can you make improvements in a useful timeframe?
• When you get new data, how fast can you do something with it?
• The more DATA you have, the easier it is to get lost in it...
• Data is useful only if it allows you to CHANGE the way you run your activity
– this is a surprisingly useful litmus test
• Any change requires measurement to make sure it helps
– this is a remarkably effective test to identify analytical organizations
4.
Seven vital success areas
CRISP-DM methodology
Data
• Many data sources and schemas
• Hard to integrate
• Keeps evolving
• Acting on “real time” data is particularly hard
5.
Collaborative Filtering
“Those who saw this also liked this...”
• Real-time continuous updates of the user-product matrix to make up-to-date predictions
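The continuous update described above can be sketched as an item-to-item co-occurrence matrix that is adjusted one event at a time. This is a minimal illustration, not the method from the talk: the class name `CoViewRecommender`, the event shape (user, item) and the sample data are all assumptions made for the example.

```python
from collections import defaultdict


class CoViewRecommender:
    """Item-to-item co-occurrence counts, updated one event at a time.

    Hypothetical sketch: names and data layout are assumptions for illustration.
    """

    def __init__(self):
        # user -> set of items that user has seen so far
        self.items_by_user = defaultdict(set)
        # item -> other item -> number of users who saw both
        self.co_counts = defaultdict(lambda: defaultdict(int))

    def record_view(self, user, item):
        """Fold a single (user, item) event into the matrix."""
        seen = self.items_by_user[user]
        if item in seen:
            return  # repeat views do not change the counts in this sketch
        for other in seen:
            self.co_counts[item][other] += 1
            self.co_counts[other][item] += 1
        seen.add(item)

    def also_liked(self, item, k=3):
        """Return up to k items most often seen by users who saw `item`."""
        neighbours = self.co_counts.get(item, {})
        # Highest count first; ties broken alphabetically for stable output.
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[:k]]


rec = CoViewRecommender()
events = [("ann", "tent"), ("ann", "stove"), ("bob", "tent"),
          ("bob", "stove"), ("bob", "lamp"), ("cy", "tent")]
for user, item in events:
    rec.record_view(user, item)

print(rec.also_liked("tent"))
```

Because each event only touches the counts for the items that user has already seen, predictions reflect new activity immediately, without a batch recomputation of the whole matrix.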
6.
Credit Card Fraud
Complex Event Processing
• Each transaction must be approved in a matter of seconds. At each step, the relevant authority must decide in real time whether the transaction is suspicious enough to warrant an alert and refuse the transaction.
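A per-transaction decision of this kind can be sketched as a small rule check over a sliding time window. The thresholds, the class name `FraudScreen` and the two rules used here (a velocity limit and an amount limit) are assumptions chosen for illustration; real fraud screening combines many more signals.

```python
from collections import defaultdict, deque

# All three limits are assumed values for this sketch.
WINDOW_SECONDS = 60      # look-back window per card
MAX_TXNS_IN_WINDOW = 3   # more attempts than this inside the window is suspicious
MAX_AMOUNT = 5000.0      # single transactions above this are suspicious


class FraudScreen:
    """Decide on each transaction as it arrives, keeping only recent history."""

    def __init__(self):
        # card -> timestamps (seconds) of attempts inside the current window
        self.recent = defaultdict(deque)

    def decide(self, card, amount, ts):
        """Return 'approve' or 'decline' for one transaction at time `ts`."""
        window = self.recent[card]
        # Drop attempts that have fallen out of the look-back window.
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        window.append(ts)
        if amount > MAX_AMOUNT or len(window) > MAX_TXNS_IN_WINDOW:
            return "decline"
        return "approve"


screen = FraudScreen()
decisions = [screen.decide("card-A", 20.0, ts) for ts in (0, 10, 20, 30)]
print(decisions)
```

The state kept per card is bounded by the window, which is what makes a decision within seconds feasible at high transaction volume.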
8.
Once you have built insights, the hard part is turning those insights into money-making actions through a multitude of field systems
Actions are taken in field systems...
DWH
Sensor Store
Order Store
Inventory Mgmt
Warranty Mgmt
Customer Portal
Analytical Store
Data is built here and action is taken here
Long-running batch analysis
Development of stats models
Integration of enterprise data
9.
• Once you have built insights, the hard part is turning those insights into money-making actions through a multitude of field systems
Actions are taken in field systems...
DWH
Sensor Store
Order Store
Inventory Mgmt
Warranty Mgmt
Customer Portal
Analytical Store
Data is built here and action is taken here
BIG ETL Mess
10.
Once you have built insights, the hard part is turning those insights into money-making actions through a multitude of field systems
Actions are taken in field systems...
DWH
Sensor Store
Order Store
Inventory Mgmt
Warranty Mgmt
Customer Portal
Analytical Store
Operational Pre-aggregation
BIG Moveable
Normal
ETL
Mess
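Operational pre-aggregation, as labelled on this slide, can be sketched as updating a small rollup at ingest time so that reads never scan raw events. The example below is an in-memory illustration under assumed names (`HourlyRollup`, `hour_bucket`, the sensor IDs and readings are all hypothetical); in MongoDB the same increment could be expressed as an upsert using the `$inc` update operator on one document per sensor and hour.

```python
from collections import defaultdict
from datetime import datetime


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


class HourlyRollup:
    """Keep one running (count, total) per sensor and hour, updated on ingest."""

    def __init__(self):
        # (sensor_id, hour) -> running aggregate
        self.buckets = defaultdict(lambda: {"count": 0, "total": 0.0})

    def ingest(self, sensor_id, ts, value):
        """Fold one reading into its hourly bucket (the pre-aggregation step)."""
        bucket = self.buckets[(sensor_id, hour_bucket(ts))]
        bucket["count"] += 1
        bucket["total"] += value

    def average(self, sensor_id, hour):
        """Serve the hourly average from the rollup, or None if no data."""
        bucket = self.buckets.get((sensor_id, hour))
        if bucket is None:
            return None
        return bucket["total"] / bucket["count"]


rollup = HourlyRollup()
rollup.ingest("meter-1", datetime(2024, 1, 1, 9, 15), 10.0)
rollup.ingest("meter-1", datetime(2024, 1, 1, 9, 45), 20.0)
rollup.ingest("meter-1", datetime(2024, 1, 1, 10, 5), 30.0)

print(rollup.average("meter-1", datetime(2024, 1, 1, 9, 0)))
```

The trade-off is that the aggregates to maintain must be chosen up front; anything not pre-aggregated still needs the raw data.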
13.
Use Cases
• Retail:
– Price Optimization
• Utilities and Manufacturing:
– Using smart meter data, optimizing the flow of electrical power to maximize yield and usage
– Sensor data from vehicles to build truck fleet analytics in real time
• Telco:
– Geo-based advertising, delivering relevant ads based on interest and locality
– Smart call routing taking into account saturated cell towers and customer value
14.
Use Cases
• Gov: City of Chicago (WindyGrid)
– Based on reports of maintenance needs (e.g. broken streetlights), dispatching police in targeted ways to reduce crime
• Financial Services: MetLife (The Wall)
– Moving from a policy-centric view to a customer-centric view, enabling informed upsell and cross-sell offers based on historical analysis and recent activity
15.
How does MongoDB help with these?
• Agility to compute and aggregate in place
– All use cases
• Agility to add new data to existing schema
– Price Optimization
• Highly scalable performance to ingest operational data
– Sensor data
• Highly scalable performance to serve operational analytics
– MetLife, Telco
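To make "compute and aggregate in place" concrete, here is a hedged sketch of an aggregation pipeline for the price optimization case. The stages `$match`, `$group`, `$avg`, `$sum` and `$sort` are standard MongoDB aggregation operators; the field names (`category`, `store_id`, `price`) and both function names are assumptions for illustration, and `collection` is assumed to be a pymongo `Collection` or any object exposing `aggregate(pipeline)`.

```python
def average_price_pipeline(category):
    """Build a pipeline: keep one category, then average the price per store."""
    return [
        # Filter first so later stages only see the relevant documents.
        {"$match": {"category": category}},
        # One output document per store, with its average price and item count.
        {"$group": {
            "_id": "$store_id",
            "avg_price": {"$avg": "$price"},
            "n": {"$sum": 1},
        }},
        # Most expensive stores first.
        {"$sort": {"avg_price": -1}},
    ]


def average_price_by_store(collection, category):
    """Run the pipeline where the data lives and return the results as a list.

    `collection` is assumed to behave like a pymongo Collection.
    """
    return list(collection.aggregate(average_price_pipeline(category)))


print(average_price_pipeline("dairy"))
```

Because documents in a collection need not share identical fields, a new attribute (for example a hypothetical `promo_code`) can be added to newly ingested documents and referenced in a later pipeline without migrating older data.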
16.
NoSQL and Analytics
Tech                   Dev Time  Exec latency  Exec Power  Data Transfer  Functional Depth
Hadoop                 *         *             *****       **             *****
MongoDB                *****     *****         ***         *****          **
Cassandra with Hadoop  *         *             *****       *****          *****
DWH                    ***       *****         *****       **             ****
SAS                    *****     *****         **          *              *****
17.
Conclusions
• Analytics are no longer just batch
• Analytics require integrating the real-time context
• Big Data is creating pressure to process data where it lands
• New sources and forms of data are making it difficult to stick to RDBMS rigidity
• MongoDB can help you