SlideShare uma empresa Scribd logo
1 de 12
Telco Analytics @Scale
Harikumar, Director Platform & Architecture
www.subex.com
Nov , 2016
1
Private & Confidentialwww.subex.com2
Subex Intro
Private & Confidentialwww.subex.com
Subex BSS/OSS Portfolio
3
Data Crunching - Use Cases & Latency
www.subex.com
Real Time
(Milliseconds)
Near Real Time
(Seconds)
Micro Batch
(Minutes)
Batch
(Hours-Days)
Latency
Algorithmic Complexity
Reporting
Aggregation
Rule Engine
Profiling
Machine Learning
Audits
Graph/Network Analysis
Text Search
Natural Language Processing
Stream Processing & Complex Event Processing (CEP)
Event Processing in the Eventful World
www.subex.com 5
• (Aggregated) Event
Data is
combined/correlated
with
• Users
• Assets
• Threats
• Vulnerabilities
• Location
• Historical
Techniques
• Rule Engine
• Event filtering
• Event aggregation and
transformation
• Operate on stored and
streaming data
• SQL like semantics over
stream data
• Supervised/Unsupervised
machine learning
• Applying known Models
• Event Pattern Detection
• Detecting Event
relationships
Areas
• Real time fraud
detection.
• Real time rating.
• Security Information and
Event Management
• Sensor Data/IOT
• DPI Data – Metadata,
Content,Flow
correlation.
• M2M Data
• Data Fraud – Malware
• Transaction Risk Scoring
Stream processing
• Keep the data Moving (Low Latency) – In Memory
• Distributed Message Queues
• Distributed In Memory Caches
• Distributed In Memory Stores
• Scalable, Highly Available Distributed stream Processing(Partition Data &
Scale, Data safety & Highly Available)
• Handle Stream Imperfections( Delayed, Missing, Out-Of-Order Data)
Key considerations
www.subex.com 6
ETL @ Scale – In Memory Distributed Cache
• Problem Statement(s)
• Scale the ETL
enrichment/lookups
layer
• High throughput +
Streaming Low latency
• Support Multiple
Access mechanisms
• GET/PUT
• SQL
• Views
www.subex.com 7
JVM
ETL
JVM
Cache
JVM
ETL
JVM
ETL
JVM
Cache
JVM
Cache
RDBMS
Read Write Update
Rule n
Rule Engine – In Memory Aggregation
www.subex.com 8
Event / I/P Data Record
Rule 2
Rule 1
Event Filters
Filtered Records
Aggregation Layer
Condition Evaluation
Actions
Shared Memory
8K
Page
Pool
16K
Page
Pool
32K
Page
Pool
..256
K
Page
Pool
Key / Value
Byte Stream
SerDEI
M
L
o
g
Shared Memory
8K
Page
Pool
16K
Page
Pool
32K
Page
Pool
..256
K
Page
Pool
Key / Value
Byte Stream
SerDEI
M
L
o
g
Shared Memory
8K
Page
Pool
16K
Page
Pool
32K
Page
Pool
..256
K
Page
Pool
Key / Value
Byte Stream
SerDEI
M
L
o
g
www.subex.com 9
Data placement Strategies
Application Data
• Application configuration data– Rule libraries
,DNA Configurations, Configurations – MySQL.
• Application generated data – Alarms,
Discrepancies – MySQL
• Operations Data (Application generated , Infra
Monitoring ) – Logs , Audit ,Metrics – Solr
• Application Aggregations - Summary/Pre-
aggregated data – Hive Tables
• Statistical Profiles, In Memory aggregation files –
HDFS
Traditional Telco Data
• Telco Entity Data – With Update Semantics –
HBase/MySQL
• Telco Historic Transaction Data – Hive with ORC file
format Partitioned by Date Stored in HDFS
• Switch Input Raw Files –HDFS
Other Sources
• Social Media
• DPI Flow Data
• Location Data
• IOT Sensor Data
Spark Streaming
Application Data
Data Flow
www.subex.com 10
Landing Directory
SAN/HDFS Apache
Flume
Flume –
Spark
Sink
Apache Kafka
In Memory
Rule Engine
Analytics
Application
s
…
Apache Spark
Streaming
ETL Adaptors
Flume – Dir
Source
Message
Queue
Flume –Kafka
Source
DB Sources
Sqoop/CDC
Tools
HDFS – Raw
File Backup
HDFS Hive Tables Hbase Tables Solr - Search
Indexes
Audits
MySQL–
Ref DB
HDFS
Hive Tables
Hbase Tables
Dist Message Queue
Data
Lake
Submit Spark
Jobs
Data Access
Hive/Presto
Distributed Cache
Operational
Metrics
Data Load
Stage
O
M
Spark Streaming
O
M
Pre-
aggregation
Data Management
Data Platform – Business and Domain Packaging
11
Data Acquisition/Ingest
Data Federation F/W
Data Processing
PreAggregation
Distributed Stream
Processing Apache Spark
Data Visualization & Analysis
Mobile F/W
ROC View
Case Management
Standard APIs – EAI & WS
Analytics Engine
Reconciliation Engine BPM-
Workflow
Engine
Flexible ETL
Rule Processing - In
Memory
Common Data Model
DistributedCache
Control
Panel
Operations &
Admin
Resource
Mgmt
Data Security
Audit &
Logging
Scheduler
Network Analysis
ROC Insights
Real time Message based
Distributed
MessageQueue
Hadoop – HDFS, Hive , HBase
Multi -tenancy
Machine Learning
Enterprise Search
Real time
Continuous
Query - CEP
Document Store Graph Data Store
Authoriza
tion &
Authentic
ation
Real time Rating
Profiling
Cloud
Metering
Risk Scoring
Cloud connectors
API Mgmt
Infrastructure
On premise OS/Servers/Network/StorageIaaS(Public /Private cloud)
ESB
Analytic Models
Thank you
Harikumar
hari.kumar@Subex.com
www.subex.com 12

Mais conteúdo relacionado

Mais procurados

Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudRick Bilodeau
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...HostedbyConfluent
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Guglielmo Iozzia
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureDatabricks
 
03-NOV-1510-Ognjen-Antonic-Telemach-stream-1
03-NOV-1510-Ognjen-Antonic-Telemach-stream-103-NOV-1510-Ognjen-Antonic-Telemach-stream-1
03-NOV-1510-Ognjen-Antonic-Telemach-stream-1Ognjen Antonic
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Databricks
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsDr. Mirko Kämpf
 
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...Databricks
 
#BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask
#BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask #BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask
#BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask Cask Data
 
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Spark Summit
 
Realtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIORealtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIOJozo Kovac
 
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham ChopraSpark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham ChopraSpark Summit
 
Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...
Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...
Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...Databricks
 
RealTime Recommendations @Netflix - Spark
RealTime Recommendations @Netflix - SparkRealTime Recommendations @Netflix - Spark
RealTime Recommendations @Netflix - SparkNitin S
 
Spline 0.3 User Guide
Spline 0.3 User GuideSpline 0.3 User Guide
Spline 0.3 User GuideVaclav Kosar
 
Superset druid realtime
Superset druid realtimeSuperset druid realtime
Superset druid realtimearupmalakar
 
Lambda architecture: from zero to One
Lambda architecture: from zero to OneLambda architecture: from zero to One
Lambda architecture: from zero to OneSerg Masyutin
 
Obfuscating LinkedIn Member Data
Obfuscating LinkedIn Member DataObfuscating LinkedIn Member Data
Obfuscating LinkedIn Member DataDataWorks Summit
 

Mais procurados (20)

Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data Capture
 
03-NOV-1510-Ognjen-Antonic-Telemach-stream-1
03-NOV-1510-Ognjen-Antonic-Telemach-stream-103-NOV-1510-Ognjen-Antonic-Telemach-stream-1
03-NOV-1510-Ognjen-Antonic-Telemach-stream-1
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
 
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q...
 
#BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask
#BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask #BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask
#BDAM: EDW Optimization with Hadoop and CDAP, by Sagar Kapare from Cask
 
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
 
Realtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIORealtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIO
 
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham ChopraSpark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
 
Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...
Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...
Lightning-Fast Analytics for Workday Transactional Data with Pavel Hardak and...
 
Self-Service Analytics on Hadoop: Lessons Learned
Self-Service Analytics on Hadoop: Lessons LearnedSelf-Service Analytics on Hadoop: Lessons Learned
Self-Service Analytics on Hadoop: Lessons Learned
 
RealTime Recommendations @Netflix - Spark
RealTime Recommendations @Netflix - SparkRealTime Recommendations @Netflix - Spark
RealTime Recommendations @Netflix - Spark
 
Spline 0.3 User Guide
Spline 0.3 User GuideSpline 0.3 User Guide
Spline 0.3 User Guide
 
Superset druid realtime
Superset druid realtimeSuperset druid realtime
Superset druid realtime
 
Lambda architecture: from zero to One
Lambda architecture: from zero to OneLambda architecture: from zero to One
Lambda architecture: from zero to One
 
ASPgems - kappa architecture
ASPgems - kappa architectureASPgems - kappa architecture
ASPgems - kappa architecture
 
Obfuscating LinkedIn Member Data
Obfuscating LinkedIn Member DataObfuscating LinkedIn Member Data
Obfuscating LinkedIn Member Data
 

Destaque

Building scalable rest service using Akka HTTP
Building scalable rest service using Akka HTTPBuilding scalable rest service using Akka HTTP
Building scalable rest service using Akka HTTPdatamantra
 
Platform for Data Scientists
Platform for Data ScientistsPlatform for Data Scientists
Platform for Data Scientistsdatamantra
 
Interactive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark StreamingInteractive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark Streamingdatamantra
 
Functional programming in Scala
Functional programming in ScalaFunctional programming in Scala
Functional programming in Scaladatamantra
 
M&A Trends in Telco Analytics
M&A Trends in Telco AnalyticsM&A Trends in Telco Analytics
M&A Trends in Telco AnalyticsOpen Analytics
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionAlessandro Salvatico
 
Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi Sachin Aggarwal
 
Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview Actian Corporation
 
Jump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROIJump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROIActian Corporation
 
Turning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business ValueTurning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business ValueActian Corporation
 
Telco Big Data 2012 Highlights
Telco Big Data 2012 HighlightsTelco Big Data 2012 Highlights
Telco Big Data 2012 HighlightsAlan Quayle
 
Lead to Cash: The Value of Big Data and Analytics for Telco
Lead to Cash: The Value of Big Data and Analytics for TelcoLead to Cash: The Value of Big Data and Analytics for Telco
Lead to Cash: The Value of Big Data and Analytics for TelcoSam Thomsett
 
Gruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter
 
Big Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsBig Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsYousun Jeong
 
Variance in scala
Variance in scalaVariance in scala
Variance in scalaLyleK
 
Types by Adform Research, Saulius Valatka
Types by Adform Research, Saulius ValatkaTypes by Adform Research, Saulius Valatka
Types by Adform Research, Saulius ValatkaVasil Remeniuk
 
Python in real world.
Python in real world.Python in real world.
Python in real world.Alph@.M
 
Spark & Cassandra at DataStax Meetup on Jan 29, 2015
Spark & Cassandra at DataStax Meetup on Jan 29, 2015 Spark & Cassandra at DataStax Meetup on Jan 29, 2015
Spark & Cassandra at DataStax Meetup on Jan 29, 2015 Sameer Farooqui
 

Destaque (20)

Building scalable rest service using Akka HTTP
Building scalable rest service using Akka HTTPBuilding scalable rest service using Akka HTTP
Building scalable rest service using Akka HTTP
 
Platform for Data Scientists
Platform for Data ScientistsPlatform for Data Scientists
Platform for Data Scientists
 
Interactive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark StreamingInteractive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark Streaming
 
Functional programming in Scala
Functional programming in ScalaFunctional programming in Scala
Functional programming in Scala
 
Actian Vector Whitepaper
 Actian Vector Whitepaper Actian Vector Whitepaper
Actian Vector Whitepaper
 
M&A Trends in Telco Analytics
M&A Trends in Telco AnalyticsM&A Trends in Telco Analytics
M&A Trends in Telco Analytics
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL Edition
 
Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi
 
Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview
 
Jump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROIJump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROI
 
Turning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business ValueTurning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business Value
 
Telco Big Data 2012 Highlights
Telco Big Data 2012 HighlightsTelco Big Data 2012 Highlights
Telco Big Data 2012 Highlights
 
Lead to Cash: The Value of Big Data and Analytics for Telco
Lead to Cash: The Value of Big Data and Analytics for TelcoLead to Cash: The Value of Big Data and Analytics for Telco
Lead to Cash: The Value of Big Data and Analytics for Telco
 
Gruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in Telco
 
Big Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsBig Telco Real-Time Network Analytics
Big Telco Real-Time Network Analytics
 
Variance in scala
Variance in scalaVariance in scala
Variance in scala
 
Types by Adform Research, Saulius Valatka
Types by Adform Research, Saulius ValatkaTypes by Adform Research, Saulius Valatka
Types by Adform Research, Saulius Valatka
 
Q2 teenagers
Q2 teenagersQ2 teenagers
Q2 teenagers
 
Python in real world.
Python in real world.Python in real world.
Python in real world.
 
Spark & Cassandra at DataStax Meetup on Jan 29, 2015
Spark & Cassandra at DataStax Meetup on Jan 29, 2015 Spark & Cassandra at DataStax Meetup on Jan 29, 2015
Spark & Cassandra at DataStax Meetup on Jan 29, 2015
 

Semelhante a Telco analytics at scale

Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Amazon Web Services
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big DataFrank Kienle
 
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...Kevin Mao
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amirydatastack
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of MetadataJim Dowling
 
A Gentle Introduction to Big Data
A Gentle Introduction to Big DataA Gentle Introduction to Big Data
A Gentle Introduction to Big DataMehmet Ali Akyol
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidDataWorks Summit
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceeRic Choo
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john malloryAmazon Web Services
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionCodemotion
 
An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...DataWorks Summit
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsAbhishekKumarAgrahar2
 

Semelhante a Telco analytics at scale (20)

Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiry
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of Metadata
 
A Gentle Introduction to Big Data
A Gentle Introduction to Big DataA Gentle Introduction to Big Data
A Gentle Introduction to Big Data
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and Druid
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data Science
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
 
Big data.ppt
Big data.pptBig data.ppt
Big data.ppt
 
Lecture1
Lecture1Lecture1
Lecture1
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john mallory
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
 
Analytics&IoT
Analytics&IoTAnalytics&IoT
Analytics&IoT
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...An architecture for federated data discovery and lineage over on-prem datasou...
An architecture for federated data discovery and lineage over on-prem datasou...
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in details
 

Mais de datamantra

Multi Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and TelliusMulti Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and Telliusdatamantra
 
State management in Structured Streaming
State management in Structured StreamingState management in Structured Streaming
State management in Structured Streamingdatamantra
 
Spark on Kubernetes
Spark on KubernetesSpark on Kubernetes
Spark on Kubernetesdatamantra
 
Understanding transactional writes in datasource v2
Understanding transactional writes in  datasource v2Understanding transactional writes in  datasource v2
Understanding transactional writes in datasource v2datamantra
 
Introduction to Datasource V2 API
Introduction to Datasource V2 APIIntroduction to Datasource V2 API
Introduction to Datasource V2 APIdatamantra
 
Exploratory Data Analysis in Spark
Exploratory Data Analysis in SparkExploratory Data Analysis in Spark
Exploratory Data Analysis in Sparkdatamantra
 
Core Services behind Spark Job Execution
Core Services behind Spark Job ExecutionCore Services behind Spark Job Execution
Core Services behind Spark Job Executiondatamantra
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsdatamantra
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafkadatamantra
 
Understanding time in structured streaming
Understanding time in structured streamingUnderstanding time in structured streaming
Understanding time in structured streamingdatamantra
 
Spark stack for Model life-cycle management
Spark stack for Model life-cycle managementSpark stack for Model life-cycle management
Spark stack for Model life-cycle managementdatamantra
 
Productionalizing Spark ML
Productionalizing Spark MLProductionalizing Spark ML
Productionalizing Spark MLdatamantra
 
Introduction to Structured streaming
Introduction to Structured streamingIntroduction to Structured streaming
Introduction to Structured streamingdatamantra
 
Building real time Data Pipeline using Spark Streaming
Building real time Data Pipeline using Spark StreamingBuilding real time Data Pipeline using Spark Streaming
Building real time Data Pipeline using Spark Streamingdatamantra
 
Testing Spark and Scala
Testing Spark and ScalaTesting Spark and Scala
Testing Spark and Scaladatamantra
 
Understanding Implicits in Scala
Understanding Implicits in ScalaUnderstanding Implicits in Scala
Understanding Implicits in Scaladatamantra
 
Migrating to Spark 2.0 - Part 2
Migrating to Spark 2.0 - Part 2Migrating to Spark 2.0 - Part 2
Migrating to Spark 2.0 - Part 2datamantra
 
Migrating to spark 2.0
Migrating to spark 2.0Migrating to spark 2.0
Migrating to spark 2.0datamantra
 
Scalable Spark deployment using Kubernetes
Scalable Spark deployment using KubernetesScalable Spark deployment using Kubernetes
Scalable Spark deployment using Kubernetesdatamantra
 
Introduction to concurrent programming with akka actors
Introduction to concurrent programming with akka actorsIntroduction to concurrent programming with akka actors
Introduction to concurrent programming with akka actorsdatamantra
 

Mais de datamantra (20)

Multi Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and TelliusMulti Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and Tellius
 
State management in Structured Streaming
State management in Structured StreamingState management in Structured Streaming
State management in Structured Streaming
 
Spark on Kubernetes
Spark on KubernetesSpark on Kubernetes
Spark on Kubernetes
 
Understanding transactional writes in datasource v2
Understanding transactional writes in  datasource v2Understanding transactional writes in  datasource v2
Understanding transactional writes in datasource v2
 
Introduction to Datasource V2 API
Introduction to Datasource V2 APIIntroduction to Datasource V2 API
Introduction to Datasource V2 API
 
Exploratory Data Analysis in Spark
Exploratory Data Analysis in SparkExploratory Data Analysis in Spark
Exploratory Data Analysis in Spark
 
Core Services behind Spark Job Execution
Core Services behind Spark Job ExecutionCore Services behind Spark Job Execution
Core Services behind Spark Job Execution
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloads
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafka
 
Understanding time in structured streaming
Understanding time in structured streamingUnderstanding time in structured streaming
Understanding time in structured streaming
 
Spark stack for Model life-cycle management
Spark stack for Model life-cycle managementSpark stack for Model life-cycle management
Spark stack for Model life-cycle management
 
Productionalizing Spark ML
Productionalizing Spark MLProductionalizing Spark ML
Productionalizing Spark ML
 
Introduction to Structured streaming
Introduction to Structured streamingIntroduction to Structured streaming
Introduction to Structured streaming
 
Building real time Data Pipeline using Spark Streaming
Building real time Data Pipeline using Spark StreamingBuilding real time Data Pipeline using Spark Streaming
Building real time Data Pipeline using Spark Streaming
 
Testing Spark and Scala
Testing Spark and ScalaTesting Spark and Scala
Testing Spark and Scala
 
Understanding Implicits in Scala
Understanding Implicits in ScalaUnderstanding Implicits in Scala
Understanding Implicits in Scala
 
Migrating to Spark 2.0 - Part 2
Migrating to Spark 2.0 - Part 2Migrating to Spark 2.0 - Part 2
Migrating to Spark 2.0 - Part 2
 
Migrating to spark 2.0
Migrating to spark 2.0Migrating to spark 2.0
Migrating to spark 2.0
 
Scalable Spark deployment using Kubernetes
Scalable Spark deployment using KubernetesScalable Spark deployment using Kubernetes
Scalable Spark deployment using Kubernetes
 
Introduction to concurrent programming with akka actors
Introduction to concurrent programming with akka actorsIntroduction to concurrent programming with akka actors
Introduction to concurrent programming with akka actors
 

Último

Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxchadhar227
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxParas Gupta
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制vexqp
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制vexqp
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabiaahmedjiabur940
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schscnajjemba
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...nirzagarg
 

Último (20)

Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptx
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 

Telco analytics at scale

  • 1. Telco Analytics @Scale Harikumar, Director Platform & Architecture www.subex.com Nov , 2016 1
  • 4. Data Crunching - Use Cases & Latency www.subex.com Real Time (Milliseconds) Near Real Time (Seconds) Micro Batch (Minutes) Batch (Hours-Days) Latency Algorithmic Complexity Reporting Aggregation Rule Engine Profiling Machine Learning Audits Graph/Network Analysis Text Search Natural Language Processing
  • 5. Stream Processing & Complex Event Processing (CEP) Event Processing in the Eventful World www.subex.com 5 • (Aggregated) Event Data is combined/correlated with • Users • Assets • Threats • Vulnerabilities • Location • Historical Techniques • Rule Engine • Event filtering • Event aggregation and transformation • Operate on stored and streaming data • SQL like semantics over stream data • Supervised/Unsupervised machine learning • Applying known Models • Event Pattern Detection • Detecting Event relationships Areas • Real time fraud detection. • Real time rating. • Security Information and Event Management • Sensor Data/IOT • DPI Data – Metadata, Content,Flow correlation. • M2M Data • Data Fraud – Malware • Transaction Risk Scoring
  • 6. Stream processing • Keep the data Moving (Low Latency) – In Memory • Distributed Message Queues • Distributed In Memory Caches • Distributed In Memory Stores • Scalable, Highly Available Distributed stream Processing(Partition Data & Scale, Data safety & Highly Available) • Handle Stream Imperfections( Delayed, Missing, Out-Of-Order Data) Key considerations www.subex.com 6
  • 7. ETL @ Scale – In Memory Distributed Cache • Problem Statement(s) • Scale the ETL enrichment/lookups layer • High throughput + Streaming Low latency • Support Multiple Access mechanisms • GET/PUT • SQL • Views www.subex.com 7 JVM ETL JVM Cache JVM ETL JVM ETL JVM Cache JVM Cache RDBMS Read Write Update
  • 8. Rule n Rule Engine – In Memory Aggregation www.subex.com 8 Event / I/P Data Record Rule 2 Rule 1 Event Filters Filtered Records Aggregation Layer Condition Evaluation Actions Shared Memory 8K Page Pool 16K Page Pool 32K Page Pool ..256 K Page Pool Key / Value Byte Stream SerDEI M L o g Shared Memory 8K Page Pool 16K Page Pool 32K Page Pool ..256 K Page Pool Key / Value Byte Stream SerDEI M L o g Shared Memory 8K Page Pool 16K Page Pool 32K Page Pool ..256 K Page Pool Key / Value Byte Stream SerDEI M L o g
  • 9. www.subex.com 9 Data placement Strategies Application Data • Application configuration data– Rule libraries ,DNA Configurations, Configurations – MySQL. • Application generated data – Alarms, Discrepancies – MySQL • Operations Data (Application generated , Infra Monitoring ) – Logs , Audit ,Metrics – Solr • Application Aggregations - Summary/Pre- aggregated data – Hive Tables • Statistical Profiles, In Memory aggregation files – HDFS Traditional Telco Data • Telco Entity Data – With Update Semantics – HBase/MySQL • Telco Historic Transaction Data – Hive with ORC file format Partitioned by Date Stored in HDFS • Switch Input Raw Files –HDFS Other Sources • Social Media • DPI Flow Data • Location Data • IOT Sensor Data
  • 10. Spark Streaming Application Data Data Flow www.subex.com 10 Landing Directory SAN/HDFS Apache Flume Flume – Spark Sink Apache Kafka In Memory Rule Engine Analytics Application s … Apache Spark Streaming ETL Adaptors Flume – Dir Source Message Queue Flume –Kafka Source DB Sources Sqoop/CDC Tools HDFS – Raw File Backup HDFS Hive Tables Hbase Tables Solr - Search Indexes Audits MySQL– Ref DB HDFS Hive Tables Hbase Tables Dist Message Queue Data Lake Submit Spark Jobs Data Access Hive/Presto Distributed Cache Operational Metrics Data Load Stage O M Spark Streaming O M Pre- aggregation
  • 11. Data Management Data Platform – Business and Domain Packaging 11 Data Acquisition/Ingest Data Federation F/W Data Processing PreAggregation Distributed Stream Processing Apache Spark Data Visualization & Analysis Mobile F/W ROC View Case Management Standard APIs – EAI & WS Analytics Engine Reconciliation Engine BPM- Workflow Engine Flexible ETL Rule Processing - In Memory Common Data Model DistributedCache Control Panel Operations & Admin Resource Mgmt Data Security Audit & Logging Scheduler Network Analysis ROC Insights Real time Message based Distributed MessageQueue Hadoop – HDFS, Hive , HBase Multi -tenancy Machine Learning Enterprise Search Real time Continuous Query - CEP Document Store Graph Data Store Authoriza tion & Authentic ation Real time Rating Profiling Cloud Metering Risk Scoring Cloud connectors API Mgmt Infrastructure On premise OS/Servers/Network/StorageIaaS(Public /Private cloud) ESB Analytic Models