A trusted partner
Business Powered By
Data
What it takes to build Real-Time Operational Intelligence and
Big Data solutions
Row level
data security
manual
development
Massively Parallel
Processing Systems
Example: Vertica,
GreenPlum,
Netezza,
ParStream
NoSQL Databases
Example:
MongoDB, Amazon
DynamoDB,
Cassandra
Relational
Databases
Example: Microsoft
SQL Server, IBM
DB2, Oracle,
Sybase
OLAP
Example: Microsoft
SSAS, Cognos
PowerPlay
NewSQL
Example: NuoDB
Discovery & Analysis
Tableau, QlikView,
Cognos, SiSense
Reporting
(Many)
Event Stream Processing – Developer
focused (Tibco, Microsoft, IBM)
Streaming data consumption (APIs, Enterprise Service Bus)
Static data (Connectors)
Data Mining
R, SAS, SPSS
Custom Applications
Data Transformation (Ascential Software,
Cognos, Microsoft Integration Services)
Store & Manage Data
Process Data
Structured
Semi Structured
Unstructured
Data Access &
Visualization
Acquire Data
Generate Insights (Correlation, KPIs, Data
Denormalization) – manual custom development
Hadoop
Example:
HortonWorks,
Cloudera
Data
Access &
Security
Real-time
and historical
data
publishing -
manual
development
API Data
Export
Database Design and Development
Solution design, rules for data manipulation, rules for monitoring conditions and KPIs, rules for detecting events...
Pre work
Innovations in Big Data technologies over the last 5 years
Challenging bits not addressed in this innovation cycle
This causes:
• Lots of systems integration of
point solutions
• Custom code
• Specialist skills
• Hard to change and evolve

Rapidly industrialize the use of data by
designing, building and running real-time
business intelligence and big data
solutions with StreamCentral.
Solution Designer
(Data Consumption, data
transformations, conditions,
event, correlation)
Workbench – Easy to Design
Security Designer | Systems Management | API Designer
Meta Data Manager
Information Warehouse Manager – Auto Build
Denormalized schema
generation for data marts
Security schema generation
Normalized schema
generation for Fact and
Dimensions
Auto generate database design, auto generate database and application code, infer relationships in data
BI Server – Run with scale
Data Processing
Analytic Applications
BI /
Reporting
Data
Exploration /
Visualization
Functional
Application
Event Driven
Predictive Analytics
Industry
Application
Association
Analysis
Data Collection
Business Event
Detection
Data Publishing - SQL
Server, Vertica,
MongoDB
Data Export Caching
Putting it together – High-impact real-time solutions in a
fraction of the time
StreamCentral
auto builds
security
infrastructure
Massively Parallel
Processing Systems
Vertica
NoSQL Databases
MongoDB
Relational Databases
Microsoft SQL Server
Discovery & Analysis
Tableau, QlikView,
Cognos, SiSense
Reporting
(Many)
Built in StreamToMe API (Stream any data from any application
or device to StreamCentral)
Static data (Connectors)
Data Mining
R, SAS, SPSS
Custom Applications
Store & Manage Data
Process Data
Structured
Semi Structured
Unstructured
Data Access &
Visualization
Acquire Data
Hadoop
Data
Access &
Security
StreamCentral
Built in API
builder
API Data
Export
Database Development – StreamCentral auto generates database design and database code
StreamCentral Workbench – No coding required – solution design, rules for data manipulation, rules for monitoring conditions
and KPIs, rules for detecting events... – for a broad set of people with varying technical skills
Pre work
Event Stream Processing (No coding)
Data Transformation (No coding)
Generate Insights (Correlation, KPIs, Data
Denormalization) (No coding)
StreamCentral
+
Big Data
• Massively Parallel Processing
architecture
• Distributed processing
• Scale out and distribute any
component of StreamCentral
independently on commodity
hardware
• Integrates with best of breed
database technologies
Collector
Service
Processing
Service
Business Event
Service
Data
Publishing
Service
Cache
Service
StreamCentral BI Server Scalability
Data available via StreamCentral
Processed
Source Data
• Data Validation
• Association to
entities
• Evaluated for
conditions
• Time and location
standardization
• Custom dimension
standardization
Single Event
Stream
• Correlated data
across multiple
data sources
• Event detection
based on condition
evaluation
Event Analysis
Data Marts
• Data mart built on
highly correlated
data
• Updated real-time
• Analyze multiple
events and
conditions
• Bring together
relevant data
360° Analysis
Data Marts
• Data mart built on
loosely correlated
data
• Updated
periodically
• Analyze any data
Real-time Push
Historical Pull
API Access:
Real-time Push
Historical Pull
API Access:
Real-time Push
Historical Pull
API Access:
Historical Pull
API Access:
Database Access:
Historical Pull
Database Access:
Historical Pull
Database Access:
Historical Pull
Database Access:
Historical Pull
Example Big Data Solutions: Telco
Telco’s Core IMS
Network Data
Data, Voice & Video
Performance Data
Data, Voice & Video
Performance Data
Data from
Telco Towers
Weather Data
Traffic Incidents
Population Data
Data Stream
Weather Underground
Mapquest
USA Today
Census data
Sources of real
time streaming
data from
networks,
devices, services
and other
internal
applications
External sources
of data that add
understanding of
what’s happening
when events are
detected
Network
Test
New
Service –
Investment
Planning
Adaptive
Bit Rate –
Video
Streaming
QoE
360° Customer
QoE for
1st-level
customer
service
Video QoE
for IPTV
Business
Solutions
New
revenue
sources from
marketing
operations
Service
Disruption
Making changes to definitions
• StreamCentral allows updates to data sources, entities, dimensions, rules for
conditions, event detection rules and data mart definitions
• When changes are made, the Workbench updates the schema change
information in the StreamCentral metadata database. It also makes the
corresponding changes to the underlying database schema
• Configuration data for all services running within StreamCentral is also held in
the distributed cache. The next step is to update this distributed cache, which
then notifies the various services of the updates in schema definition
• The correlation and publishing engines evaluate the schema changes and make
the appropriate changes to their in-memory data before sending the data to the
database
• Roll back is built in to account for errors
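The update flow in these bullets can be sketched as a versioned configuration cache that notifies subscribed services; a minimal illustration, assuming hypothetical class and method names (this is not StreamCentral code):

```python
# Sketch of definition-change propagation: the Workbench publishes new
# definitions into a shared cache, which bumps a version number and
# notifies subscribed services so they can reload their in-memory schema.
# All names here are illustrative assumptions.

class ConfigCache:
    def __init__(self):
        self.version = 0
        self.definitions = {}
        self._subscribers = []

    def subscribe(self, service):
        self._subscribers.append(service)

    def publish(self, definitions):
        self.version += 1                  # new schema definition version
        self.definitions = definitions
        for service in self._subscribers:  # cache notifies running services
            service.on_config_change(self.version, definitions)

class Service:
    def __init__(self):
        self.seen_version = 0

    def on_config_change(self, version, definitions):
        # A real service would rebuild its in-memory structures here,
        # with rollback on error; we only record the version.
        self.seen_version = version

cache = ConfigCache()
svc = Service()
cache.subscribe(svc)
cache.publish({"data_marts": ["event_analysis"]})
```

The version check lets a service that missed a notification detect staleness on its next cache read, which is one common way to keep many consumers in sync with a single source of definitions.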
Many point solutions from multiple
vendors
High learning curve
Maximum time spent integrating
Manual design and coding
Many steps to solution
Older technology
Years to Value
= High Risk,
= High Cost
Agility in meeting changing customer needs in real-time
Data
Real-time or Historical | Streaming or batch | Structured or unstructured
Business
Analysis
Detailed
Solution
Design
Manual
Database
Design
Database
Development
CEP -
Development
Platform
Enterprise
Service Bus
Traditional
ETL tools
Application
Development
Workbench – Business Solutions Designer
Consume data, design transformations, conditions, events, analytics,
security, APIs to export and share data
Information Warehouse Manager
Auto generate design, auto generate code, infer relationships, reduce
manual design
BI Server
Built-in Event Processing, high speed data processing, scalable, secure, run
on modern database platforms
Traditional
Pre-work
Data Acquisition, Transformation and
Enrichment
Data Correlation &
Event Mgmt
Analytics & insight
specific data marts
Data Level
Security
Export Enriched Data &
Real Time Analytics
High Automation
No coding required
Contains multiple components that
work together (ETL, CEP, data mart
builder, location intelligence and
more)
Fewer steps to solution
Modern technology
Weeks to Value
= Low Risk,
= Reduced Cost
StreamCentral advantage: Agility to change how you use data in real-time
Risk
Value
Current technology and approach
StreamCentral
Risk
Value
Time
Time
StreamCentral Concepts
Definitions of key concepts in
StreamCentral..
• Entity: An entity represents a group of people or things that incoming
data is directly connected to. Examples include departments, customers,
sites, products, etc. By defining entities you tell StreamCentral how
distributed data is connected to things core to your business
• Data Source: StreamCentral can pull data from a variety of sources
using standard web interfaces and data can also be streamed directly
to StreamCentral API for processing purposes by devices, sensors,
applications and services
• Dimension: Common attributes in a variety of data sources that can
be used to categorize and analyze data
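The three concepts above can be sketched as plain data structures; the field names below are illustrative assumptions, not StreamCentral's actual schema:

```python
# Illustrative data structures for the three core concepts defined above.
from dataclasses import dataclass, field

@dataclass
class Entity:
    """A group of people or things that incoming data connects to,
    e.g. customer, department, site."""
    name: str
    attributes: list = field(default_factory=list)

@dataclass
class DataSource:
    """Data pulled via standard web interfaces, or streamed directly
    to the API by devices, sensors, applications and services."""
    name: str
    mode: str  # "pull" or "stream"

@dataclass
class Dimension:
    """A common attribute across sources used to categorize and analyze data."""
    name: str

customer = Entity("Customer", ["id", "name", "location"])
tower_feed = DataSource("TowerTelemetry", mode="stream")
location = Dimension("Location")
```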
Definitions of key concepts in
StreamCentral..
• Conditions: A condition is a rule-based measurement that is applied to incoming
data. A condition has three parts: the condition name (for example, Voice
Quality), the condition range (a range of quality from hard to hear, poor, average,
toll quality, to excellent) and the condition KPI (for example, a RED KPI when the
range is hard to hear or poor). Individual conditions can be grouped together
into a condition set, which can then be used to detect events in aggregate
• Events: An event happens when patterns of multiple conditions with specific
ranges from different data streams and environmental data sources are detected
as the data streams in. While StreamCentral allows sophisticated rule-based
event detection, it goes further than that: StreamCentral auto builds a data mart
around the event that contains a variety of context, such as
entities, environmental data, dimensions and detailed data from data sources
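The name/range/KPI structure of a condition can be sketched in a few lines; the function names, scale, and range boundaries below are invented for illustration and are not StreamCentral's actual rule syntax:

```python
# Sketch of the condition model described above: a named measurement
# mapped to labelled ranges, with a KPI colour per range.

def make_condition(name, ranges, red_ranges):
    """ranges: list of (label, low, high); red_ranges: labels whose KPI is RED."""
    def evaluate(value):
        for label, low, high in ranges:
            if low <= value < high:
                kpi = "RED" if label in red_ranges else "GREEN"
                return {"condition": name, "range": label, "kpi": kpi}
        return {"condition": name, "range": None, "kpi": None}
    return evaluate

# The Voice Quality example from the slide, on an assumed 0-5 quality scale.
voice_quality = make_condition(
    "Voice Quality",
    ranges=[("Hard to hear", 0, 1), ("Poor", 1, 2), ("Average", 2, 3),
            ("Toll quality", 3, 4), ("Excellent", 4, 6)],
    red_ranges={"Hard to hear", "Poor"},
)

result = voice_quality(1.4)  # falls in the "Poor" range, so the KPI is RED
```

A condition set would then be a collection of such evaluators whose combined results feed the event-detection rules described in the Events bullet.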
Insight
Who (entities like
customer,
patient)
When (time)
Where (location)
What (streaming
& static data
correlation)
Generating insights from data requires context to be
added to the data. This context is a continuous
thread that connects all types of data throughout the
BI solution lifecycle. Four typical examples of
context:
• StreamCentral automatically builds
and maintains time and location
dimensions
• Entities like customer, department,
site can be created and defined in
StreamCentral. Entity data can be
imported for initial load and
continuously kept in sync
• All incoming data in StreamCentral is
continuously and automatically
connected to time, location and
defined entities
• Resultant real-time events and
analytical data marts automatically
inherit this context without need for
any programming or development
work
Converting data to insights by continuously adding
context
Types of data sources: Regular
• Data sources used to measure performance
• Examples include data that will be measured for conditions, ranges and
events
• This data can be connected to entities directly – For example data from a
device can be connected to a customer or sales data can be connected to a
product and a customer
• Can be used in correlation, event detection and data marts
Types of data sources : Environmental
• This source of data is used to add context and measure performance – these are
also called environmental data sources
• Examples typically include external data that adds context about external factors in play
• Does not have to be connected to the entities directly. StreamCentral will use implicit
relations with time and location dimension to tie environmental data to other enterprise
data. For example, consider an environmental data source called weather. Weather has
location information associated with it. There are two entities namely “Customer” and
“Tower”. Both also have location information associated with them. StreamCentral
standardizes all three to the location dimension but StreamCentral also implicitly connects
Customer to weather and Tower to weather because weather was created as an
environmental data source. Now when analyzing data, StreamCentral will be able to provide
real-time or historical context as to what the weather is where the customer is and what the
weather is where the tower is
• Great to use in data marts for analyzing associations with other data
• Can be used in event detection as part of conditions set and to evaluate events
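The implicit weather-to-Customer and weather-to-Tower connection described above can be illustrated as a join through a shared location key rather than a direct relationship; the data and field names below are invented:

```python
# Sketch of the implicit relation: an environmental source (weather) is
# tied to entities (Customer, Tower) through the shared location dimension,
# not through a direct foreign key on either entity.

weather_by_location = {"loc-boston": {"condition": "snow", "temp_f": 28}}

customers = [{"id": 1, "name": "Acme", "location": "loc-boston"}]
towers = [{"id": "T-9", "location": "loc-boston"}]

def with_weather(rows, weather):
    """Enrich any entity rows carrying a location key with weather context."""
    return [dict(row, weather=weather.get(row["location"])) for row in rows]

enriched_customers = with_weather(customers, weather_by_location)
enriched_towers = with_weather(towers, weather_by_location)
```

Because both entities standardize to the same location dimension, one environmental source answers both questions at once: what the weather is where the customer is, and where the tower is.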
A note on time and location data
• StreamCentral auto creates time and location dimensions.
• Extended data types allow very specific association of a variety of time and
location based attributes
• Data types can be assigned to attributes in entities, regular data sources and
environmental data sources
• For every incoming attribute that is associated with one of the special time or
location data types, StreamCentral looks to see if a specific record for that data
already exists in the dimension. If not, it creates a new record for that value. If it
exists already, then the key value of that data is substituted in the data source
• Time and location data is stored in the database and in the distributed cache
though the real-time lookups are done against the data stored in the cache
• StreamCentral can dynamically feed time or location data to REST or SOAP based
web services from these dimensions
• StreamCentral supports standardizing location data at any geographic level and
for a specific radius
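The lookup-or-create behaviour in the bullets above (check the dimension for an existing record, create one on first sight, otherwise substitute the existing key) can be sketched with an in-memory cache; class and method names are illustrative assumptions:

```python
# Sketch of dimension standardization: incoming attribute values are
# checked against a cached dimension; a new surrogate key is created the
# first time a value is seen, otherwise the existing key is substituted.

class DimensionCache:
    def __init__(self):
        self._keys = {}      # value -> surrogate key
        self._next_key = 1

    def lookup_or_create(self, value):
        key = self._keys.get(value)
        if key is None:              # no record yet for this value
            key = self._next_key
            self._keys[value] = key  # create the new dimension record
            self._next_key += 1
        return key                   # key substituted into the data source

location_dim = DimensionCache()
k1 = location_dim.lookup_or_create("Boston, MA")
k2 = location_dim.lookup_or_create("Boston, MA")  # same value, same key
```

Doing the real-time lookups against a cache rather than the database, as the slide notes, keeps this hot path off the database entirely.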
Types of data outputs available from
StreamCentral
• Processed Source Data – Once real-time streaming data (or static data via a
scheduled pull) is received by StreamCentral, it is validated, evaluated for
conditions, and associated with entities and dimensions like time and location;
the data is then available to be published
• Event data – Processed data is evaluated for events. If an event is detected, the
event data along with its associated context is available as a real-time stream. In
addition, StreamCentral builds a data mart just for this event. Access to historical
data for an event is also available
• Events data mart analysis – Custom data marts that evaluate multiple events and
the conditions that were recorded when the events were detected are available
via events data mart. Historical access is available
• Aggregate 360 degree data mart analysis – Bring disparate data together that is
standardized to common themes and StreamCentral automatically builds a
scalable data mart structure for this data
Type of data available | Real-time access method | Historical access method
--- | --- | ---
Processed Source Data | ActiveMQ messages; JMS-based HornetQ; OracleQ; Microsoft MSMQ; WCF-based pub/sub model. Format options: XML/JSON | REST API (XML/JSON). Method name: getFactualData. Input parameters: source name, filter parameters (location, time), numOfRecords
Event Data with context | ActiveMQ messages; JMS-based HornetQ; OracleQ; Microsoft MSMQ; WCF-based pub/sub model. Format options: XML/JSON | REST API (XML/JSON). Method name: getEventData. Input parameters: event name or id, filter parameters (location, time), entity id array, numOfRecords
Events Data Mart | ActiveMQ messages; JMS-based HornetQ; OracleQ; Microsoft MSMQ; WCF-based pub/sub model. Format options: XML/JSON | REST API (XML/JSON). Method name: getAnalysisData. Input parameters: analysis collection name or id, filter parameters (location, time), entity id array, numOfRecords
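As a rough illustration of calling the REST methods listed above, the sketch below builds a getEventData request URL from the documented input parameters. The base URL and the exact query-parameter spellings are assumptions for illustration, not the product's published API:

```python
# Hypothetical construction of a getEventData REST call from the input
# parameters named in the table: event name/id, filter parameters
# (location, time), entity id array, numOfRecords.
from urllib.parse import urlencode

BASE = "https://streamcentral.example.com/api"  # assumed endpoint

def build_event_data_url(event_name, location=None, time=None,
                         entity_ids=(), num_of_records=100):
    params = {"eventName": event_name, "numOfRecords": num_of_records}
    if location:
        params["location"] = location
    if time:
        params["time"] = time
    if entity_ids:
        # entity id array flattened to a comma-separated list
        params["entityIds"] = ",".join(str(i) for i in entity_ids)
    return f"{BASE}/getEventData?{urlencode(params)}"

url = build_event_data_url("PoorVoiceQuality", location="Boston",
                           entity_ids=[17, 23], num_of_records=50)
```

The same shape would apply to getFactualData (source name instead of event name) and getAnalysisData (analysis collection name or id).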
Choosing the right technology for visualization
• Don't select a delivery technology for these reasons – best to use StreamCentral
• Centralize business logic in one place – use many tools to deliver the insight
• Definition of KPIs
• Rules for events
• Alias’s for data attributes
• Connectivity and transformation requirements of source data
• Adding context to data
• Select one or more delivery technologies for these reasons
• Performance (in-memory aggregation)
• Cross browser support, support for various tablets and mobile device platforms
• Broad portfolio of charts and visualizations
• Highly interactive
• Ability to be integrated in portals for internal (employees) or external (partners or
customers) consumption
• Standards based like HTML5 and CSS3
• Can be hosted in a SaaS model
Data Security
StreamCentral
Database
The Workbench administrator defines roles, specifies data
access rules, and assigns users to roles. StreamCentral
builds and manages metadata for row-level access
• Centralize data security with StreamCentral
• Custom applications and analytical/reporting tools only pass
user id as part of their query to StreamCentral database.
• Two types of row level security:
1. Underlying fact data based on dimensions (like time,
location) and entities (like customer, department, site)
2. Denormalized aggregated data based on and/or rules
StreamCentral row level
security layer
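The flow above (client passes only a user id; centrally stored role rules decide which rows come back) can be sketched as a filter over fact rows. The role names, rule shape, and sample data below are invented for illustration:

```python
# Sketch of centralized row-level security: applications send only a user
# id; the security layer resolves the user's role and applies that role's
# data access rule to the fact rows before returning them.

role_of_user = {"alice": "northeast_sales"}          # user -> role
role_rules = {"northeast_sales": {"region": "northeast"}}  # role -> rule

facts = [
    {"region": "northeast", "revenue": 120},
    {"region": "southwest", "revenue": 95},
]

def query_facts(user_id):
    """Return only the fact rows the user's role is allowed to see."""
    rule = role_rules[role_of_user[user_id]]
    return [row for row in facts
            if all(row.get(col) == val for col, val in rule.items())]

rows = query_facts("alice")
```

Keeping the rule in one place is the point of the slide: reporting tools and custom applications never embed security logic themselves.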
Managing row-level data security
Factual tables of
StreamCentral Database
Security tables of
StreamCentral DB
StreamCentralSecurity
ScrtyRoleID
Role Processing Tables
in StreamCentral
StreamCentral Database (MS
SQL / HP Vertica)
StreamCentral Metadata DB
The Workbench administrator manages data security by
creating data access rules for roles and assigning users to roles
For data accessed from the StreamCentral database via
reporting/analytical tools or the API, StreamCentral
determines the data access permissions for that user
StreamCentral
Workbench
Distributed Caching
• Storing Time and Location dimension data for fast lookups and data
standardization
• Maintaining configuration information about the system which aids in
managing updates to definitions
• Storing entity data required for adding context to incoming data
• Managing correlation of real-time data
• Managing event detection
• Processed data formatted to data mart specification
• Managing batch data inserts into the database
Availability
OUTSIDE NETWORK
....
CACHE CLUSTER
Microsoft AppFabric Cache is a distributed caching
technology that allows the cache to be highly
available by configuring more than one server to
participate in storing cache data, commonly
called a cache cluster.
Software Network Load Balancing (NLB)
Microsoft IIS web servers configured with the software NLB provided
by Microsoft Windows Server allow all websites to be highly
available.
Microsoft Message Queue persists unread messages in the queue
in the event of a sudden server shutdown. The physical hardware
can be clustered to ensure failover in case of hardware
failure
Web Application
StreamCentral Public API
Workbench Application
Reports / analytics
Messaging
Inbound Message Queue
Publish Message Queue
Processing Service
Correlation Service
Publish Service
Workbench Database
(StreamCentral MetaData)
StreamCentral Database
(Fact and aggregate data – Vertica/
MS SQL Server)
The Processing Engine, Correlation
Engine and Publish Engine can
run on multiple physical servers
to keep these services highly
available.
StreamCentral High Availability
Thank you
for your time
Raheel Retiwalla
CTO - Virtus IT Ltd
E: raheel.retiwalla@virtus-it.com
M: +1 617 901 8370

Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
 
Hadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsHadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural Patterns
 
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
 
DA_01_Intro.pptx
DA_01_Intro.pptxDA_01_Intro.pptx
DA_01_Intro.pptx
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Building your Datalake on AWS
Building your Datalake on AWSBuilding your Datalake on AWS
Building your Datalake on AWS
 
Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kin...
Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kin...Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kin...
Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kin...
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
 
AWS Big Data Platform
AWS Big Data PlatformAWS Big Data Platform
AWS Big Data Platform
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data Lake
 

Último

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

StreamCentral Technical Overview

  • 1. A trusted partner: Business Powered By Data
  • 2. What it takes to build Real-Time Operational Intelligence and Big Data solutions (technology landscape, reconstructed from the slide diagram):
    - Pre work: database design and development; solution design, rules for data manipulation, rules for monitoring conditions and KPIs, rules for detecting events
    - Acquire Data: streaming data consumption (APIs, Enterprise Service Bus); static data (connectors)
    - Process Data: event stream processing, developer focused (Tibco, Microsoft, IBM); data transformation (Ascential Software, Cognos, Microsoft Integration Services); generate insights (correlation, KPIs, data denormalization) via manual custom development
    - Store & Manage Data (structured, semi-structured and unstructured): relational databases (Microsoft SQL Server, IBM DB2, Oracle, Sybase); OLAP (Microsoft SSAS, Cognos PowerPlay); massively parallel processing systems (Vertica, Greenplum, Netezza, ParStream); NoSQL databases (MongoDB, Amazon DynamoDB, Cassandra); NewSQL (NuoDB); Hadoop (Hortonworks, Cloudera)
    - Data Access & Security: row-level data security (manual development); real-time and historical data publishing (manual development); API data export
    - Data Access & Visualization: discovery & analysis (Tableau, QlikView, Cognos, SiSense); reporting (many); data mining (R, SAS, SPSS); custom applications
  • 3. Innovations in Big Data technologies over the last 5 years (repeats the technology landscape diagram from slide 2)
  • 4.
  • 5. Challenging bits not addressed in this innovation cycle (the manual pieces of the slide-2 landscape: pre-work database design, insight generation, row-level security and data publishing). This causes:
    - Lots of systems integration of point solutions
    - Custom code
    - Specialist skills
    - Hard to change and evolve
  • 6. Rapidly industrialize the use of data by designing, building and running real-time business intelligence and big data solutions with StreamCentral.
    - Workbench (easy to design): Solution Designer (data consumption, data transformations, conditions, events, correlation); Security Designer; API Designer; Systems Management; Meta Data Manager
    - Information Warehouse Manager (auto build): auto-generates the database design, the database and application code, and infers relationships in data; denormalized schema generation for data marts; security schema generation; normalized schema generation for facts and dimensions
    - BI Server (run with scale): data processing (data collection, business event detection, data publishing to SQL Server, Vertica and MongoDB, data export, caching) and analytic applications (BI/reporting, data exploration/visualization, functional applications, event-driven predictive analytics, industry applications, association analysis)
  • 7. Putting it together: high-impact real-time solutions in a fraction of the time (StreamCentral + Big Data)
    - Pre work: StreamCentral Workbench, no coding required, usable by a broad set of people with varying technical skills (solution design, rules for data manipulation, rules for monitoring conditions and KPIs, rules for detecting events); StreamCentral auto-generates the database design and database code
    - Acquire Data: built-in StreamToMe API (stream any data from any application or device to StreamCentral); static data (connectors)
    - Process Data: event stream processing, data transformation and insight generation (correlation, KPIs, data denormalization), all with no coding
    - Store & Manage Data (structured, semi-structured and unstructured): Microsoft SQL Server, Vertica, MongoDB, Hadoop
    - Data Access & Security: StreamCentral auto-builds the security infrastructure; built-in API builder for API data export
    - Data Access & Visualization: discovery & analysis (Tableau, QlikView, Cognos, SiSense); reporting (many); data mining (R, SAS, SPSS); custom applications
  • 8. Scalability
    - Massively parallel processing architecture with distributed processing
    - Scale out and distribute any component of StreamCentral independently on commodity hardware
    - Integrates with best-of-breed database technologies
    - StreamCentral BI Server services: Collector Service, Processing Service, Business Event Service, Data Publishing Service, Cache Service
  • 9. Data available via StreamCentral
    - Processed Source Data: data validation; association to entities; evaluation for conditions; time and location standardization; custom dimension standardization. API access: real-time push and historical pull. Database access: historical pull.
    - Single Event Stream: correlated data across multiple data sources; event detection based on condition evaluation. API access: real-time push and historical pull. Database access: historical pull.
    - Event Analysis Data Marts: data marts built on highly correlated data, updated in real-time; analyze multiple events and conditions; bring together relevant data. API access: real-time push and historical pull. Database access: historical pull.
    - 360° Analysis Data Marts: data marts built on loosely correlated data, updated periodically; analyze any data. API access: historical pull. Database access: historical pull.
  • 10. Example Big Data solutions: Telco
    - Data streams: sources of real-time streaming data from networks, devices, services and other internal applications, such as the telco's core IMS network data; data, voice and video performance data; and data from telco towers
    - External sources of data that add understanding of what's happening when events are detected: weather data, traffic incidents and population data from feeds such as Weather Underground, Mapquest, USA Today and census data
    - Business solutions: network test; new service investment planning; adaptive bit rate video streaming QoE; 360° customer QoE for first-level customer service; video QoE for IPTV; new revenue sources from marketing operations; service disruption
  • 11. Making changes to definitions
    - StreamCentral allows updates to data sources, entities, dimensions, rules for conditions, event detection rules and data mart definitions
    - When changes are made, the Workbench updates the schema change information in the StreamCentral metadata database and makes the corresponding changes to the underlying database schema
    - Configuration data for all services running within StreamCentral is also held in the distributed cache; the next step is to update this cache, which then notifies the various services of the updated schema definition
    - The correlation and publishing engines evaluate the schema changes and make the appropriate changes to their in-memory data before sending the data to the database
    - Roll back is built in to account for errors
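The update flow on this slide (metadata change, pushed to a shared cache, which notifies running services) can be sketched in miniature. This is an illustration only; all class and method names here are invented, not StreamCentral's API.

```python
# Hypothetical sketch: a schema change is written to the shared config cache,
# which fans out a notification so each service refreshes its in-memory copy.

class ConfigCache:
    """Stand-in for the distributed cache holding service configuration."""
    def __init__(self):
        self._entries = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def put(self, key, value):
        self._entries[key] = value
        for notify in self._subscribers:   # fan out the change notification
            notify(key, value)

class CorrelationService:
    """A service that keeps schema definitions in memory."""
    def __init__(self, cache):
        self.schema_version = 0
        cache.subscribe(self.on_config_change)

    def on_config_change(self, key, value):
        if key == "schema":
            self.schema_version = value["version"]  # reload definitions

cache = ConfigCache()
service = CorrelationService(cache)
cache.put("schema", {"version": 2, "data_marts": ["event_analysis"]})
assert service.schema_version == 2
```

The key design point from the slide is that services never poll the metadata database directly; the cache is the single distribution point for definition changes.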
  • 12. StreamCentral advantage: agility to change how you use data in real-time (real-time or historical, streaming or batch, structured or unstructured), plotted as risk and value over time
    - Current technology and approach: many point solutions from multiple vendors; high learning curve; maximum time spent integrating; manual design and coding; many steps to solution (business analysis, detailed solution design, manual database design, database development, CEP development platform, enterprise service bus, traditional ETL tools, application development); older technology. Years to value = high risk, high cost.
    - StreamCentral: Workbench business solutions designer (consume data; design transformations, conditions, events, analytics, security, and APIs to export and share data); Information Warehouse Manager (auto-generate design and code, infer relationships, reduce manual design); BI Server (built-in event processing, high-speed data processing, scalable, secure, runs on modern database platforms). High automation, no coding required; multiple components that work together (ETL, CEP, data mart builder, location intelligence and more); fewer steps to solution; modern technology. Weeks to value = low risk, reduced cost.
    - Solution stages covered: pre-work; data acquisition, transformation and enrichment; data correlation and event management; analytics and insight-specific data marts; data-level security; export of enriched data and real-time analytics
  • 14. Definitions of key concepts in StreamCentral
    - Entity: an entity represents a group of people or things that incoming data is directly connected to. Examples include departments, customers, sites and products. By defining entities you tell StreamCentral how distributed data is connected to things core to your business.
    - Data Source: StreamCentral can pull data from a variety of sources using standard web interfaces, and data can also be streamed directly to the StreamCentral API for processing by devices, sensors, applications and services.
    - Dimension: common attributes across a variety of data sources that can be used to categorize and analyze data.
  • 15. Definitions of key concepts in StreamCentral (continued)
    - Conditions: a condition is a rule-based measurement applied to incoming data. A condition has three parts: the condition name (for example, Voice Quality), the condition range (a quality scale such as hard to hear, poor, average, toll quality, excellent) and the condition KPI (for example, a RED KPI when the range is hard to hear or poor). Individual conditions can be grouped into a condition set, which can then be used to detect events in aggregate.
    - Events: an event occurs when patterns of multiple conditions with specific ranges, drawn from different data streams and environmental data sources, are detected as the data streams in. StreamCentral goes beyond rule-based event detection: it auto-builds a data mart around the event containing context such as entities, environmental data, dimensions and detailed data from the data sources.
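The condition structure above (name, named ranges, KPI rule) maps naturally to a small evaluator. This is a minimal sketch mirroring the Voice Quality example; the thresholds and scores are invented for illustration.

```python
# Illustrative condition evaluator: map a measurement to a named range and a
# KPI colour, as in the Voice Quality example. Thresholds are made up.

VOICE_QUALITY_RANGES = [        # (upper bound of score, range name)
    (2.0, "Hard to hear"),
    (3.0, "Poor"),
    (3.5, "Average"),
    (4.0, "Toll quality"),
    (5.0, "Excellent"),
]
RED_KPI_RANGES = {"Hard to hear", "Poor"}   # ranges that trip the RED KPI

def evaluate_condition(score):
    """Return (range name, KPI colour) for one incoming measurement."""
    for upper, name in VOICE_QUALITY_RANGES:
        if score <= upper:
            kpi = "RED" if name in RED_KPI_RANGES else "GREEN"
            return name, kpi
    return "Excellent", "GREEN"   # scores above the table's last bound

assert evaluate_condition(1.4) == ("Hard to hear", "RED")
assert evaluate_condition(3.8) == ("Toll quality", "GREEN")
```

A condition set, as described above, would then aggregate several such evaluations; an event fires when the combined pattern of ranges matches a rule.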
  • 16. Converting data to insights by continuously adding context. Generating insights from data requires context to be added to the data; this context is a continuous thread that connects all types of data throughout the BI solution lifecycle. Four typical kinds of context: who (entities like customer or patient), when (time), where (location) and what (streaming and static data correlation).
    - StreamCentral automatically builds and maintains time and location dimensions
    - Entities like customer, department and site can be created and defined in StreamCentral; entity data can be imported for the initial load and continuously kept in sync
    - All incoming data in StreamCentral is continuously and automatically connected to time, location and defined entities
    - The resulting real-time events and analytical data marts automatically inherit this context without any programming or development work
  • 17. Types of data sources: regular
    - Data sources used to measure performance; their data is measured for conditions, ranges and events
    - This data can be connected to entities directly; for example, data from a device can be connected to a customer, or sales data can be connected to a product and a customer
    - Can be used in correlation, event detection and data marts
  • 18. Types of data sources: environmental
    - These sources are used to add context as well as measure performance; examples typically include external data that adds context about external factors in play
    - Environmental data does not have to be connected to entities directly: StreamCentral uses implicit relations with the time and location dimensions to tie environmental data to other enterprise data. For example, consider an environmental data source called weather, which carries location information, and two entities, Customer and Tower, which also carry location information. StreamCentral standardizes all three to the location dimension, and because weather was created as an environmental data source, it also implicitly connects Customer to weather and Tower to weather. When analyzing data, StreamCentral can then provide real-time or historical context on the weather at the customer's location and at the tower's location.
    - Great to use in data marts for analyzing associations with other data
    - Can be used in event detection as part of a condition set and to evaluate events
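The implicit relation described above amounts to a join through the shared location dimension: weather and customers never reference each other, but both carry a location key. A minimal sketch, with invented data and keys:

```python
# Sketch of an implicit location-based relation: an environmental source
# (weather) and an entity (customer) are both standardized to the same
# location dimension, so they can be joined on the location key alone.

location_dim = {1: "Seattle", 2: "Portland"}   # location dimension records

customers = [{"customer": "Acme",   "location_id": 1},
             {"customer": "Globex", "location_id": 2}]

weather = [{"location_id": 1, "conditions": "Rain"},
           {"location_id": 2, "conditions": "Clear"}]

def weather_for_customers(customers, weather):
    """Join customers to environmental data via the shared location key."""
    by_location = {w["location_id"]: w["conditions"] for w in weather}
    return {c["customer"]: by_location.get(c["location_id"]) for c in customers}

assert weather_for_customers(customers, weather) == {
    "Acme": "Rain", "Globex": "Clear"}
```

The same pattern would connect the Tower entity to weather: any record standardized to the location dimension picks up the environmental context for free.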
  • 19. A note on time and location data
    - StreamCentral auto-creates time and location dimensions
    - Extended data types allow very specific association of a variety of time- and location-based attributes
    - Data types can be assigned to attributes in entities, regular data sources and environmental data sources
    - For every incoming attribute associated with one of the special time or location data types, StreamCentral checks whether a record for that value already exists in the dimension. If not, it creates a new record; if it does, the key value of that record is substituted into the data source
    - Time and location data is stored both in the database and in the distributed cache, though real-time lookups are done against the cache
    - StreamCentral can dynamically feed time or location data from these dimensions to REST or SOAP based web services
    - StreamCentral supports standardizing location data at any geographic level, including standardizing to a specific radius
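The lookup-or-create behaviour described above is the classic dimension-key substitution step. A minimal sketch, assuming an in-memory structure standing in for the cache-backed dimension (not StreamCentral code):

```python
# Lookup-or-create against a dimension: reuse the existing key for a known
# value, otherwise mint a new record, then substitute the key into the row.

class LocationDimension:
    def __init__(self):
        self._by_value = {}    # stand-in for the cache backed by the database
        self._next_key = 1

    def key_for(self, value):
        if value not in self._by_value:      # no record yet: create one
            self._by_value[value] = self._next_key
            self._next_key += 1
        return self._by_value[value]

dim = LocationDimension()
rows = [{"city": "Austin", "latency_ms": 40},
        {"city": "Austin", "latency_ms": 55}]

# Substitute the dimension key for the raw value in each incoming row.
facts = [{"location_key": dim.key_for(r["city"]), "latency_ms": r["latency_ms"]}
         for r in rows]

assert facts[0]["location_key"] == facts[1]["location_key"] == 1
```

Doing the lookup against an in-memory store rather than the database is what makes this viable per-record on a real-time stream, which is the point of the cache-first design above.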
  • 20. Types of data outputs available from StreamCentral
    - Processed Source Data: once real-time streaming data (or static data via scheduled pull) is received by StreamCentral, it is validated, evaluated for conditions, and associated with entities and with dimensions like time and location; the data is then available to be published
    - Event data: processed data is evaluated for events. If an event is detected, the event data and its associated context are available as a real-time stream; in addition, StreamCentral builds a data mart just for that event, and historical data for the event is also available
    - Events data mart analysis: custom data marts that evaluate multiple events, and the conditions recorded when those events were detected, are available via the events data mart, with historical access
    - Aggregate 360 degree data mart analysis: bring disparate data together, standardized to common themes, and StreamCentral automatically builds a scalable data mart structure for it
  • 21. Access methods by type of data
    Real-time access (same options for all three types below): ActiveMQ messages; JMS-based HornetQ; OracleQ; Microsoft-based MSMQ; WCF-based pub/sub model; format options XML/JSON.
    Historical access is via a REST API (format options XML/JSON):
    - Processed Source Data: method getFactualData; input parameters: source name, filter parameters (location, time), numOfRecords
    - Event Data with context: method getEventData; input parameters: event name or id, filter parameters (location, time), entity id array, numOfRecords
    - Events Data Mart: method getAnalysisData; input parameters: analysis collection name or id, filter parameters (location, time), entity id array, numOfRecords
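The method names and parameters above come from the slide, but the endpoint layout does not. This sketch only shows how such a historical query might be assembled; the host name, URL shape and parameter spellings are assumptions for illustration.

```python
# Assemble a historical REST query for one of the methods named above.
# The base URL and exact query layout are hypothetical.

from urllib.parse import urlencode

BASE_URL = "https://streamcentral.example.com/api"   # invented host

def build_historical_query(method, **params):
    """Compose a REST query string for a historical access method."""
    return f"{BASE_URL}/{method}?{urlencode(params)}"

url = build_historical_query(
    "getFactualData",
    sourceName="towerTelemetry",   # illustrative source name
    location="Seattle",
    numOfRecords=100,
)
assert url == ("https://streamcentral.example.com/api/getFactualData?"
               "sourceName=towerTelemetry&location=Seattle&numOfRecords=100")
```

The same helper would serve getEventData and getAnalysisData, swapping in the event or analysis-collection identifiers listed in the table.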
  • 22. Choosing the right technology for visualization
    - Don't select a delivery technology for these reasons; they are best handled in StreamCentral, which centralizes business logic in one place so many tools can deliver the insight: definition of KPIs; rules for events; aliases for data attributes; connectivity and transformation requirements of source data; adding context to data
    - Select one or more delivery technologies for these reasons: performance (in-memory aggregation); cross-browser support and support for various tablet and mobile device platforms; a broad portfolio of charts and visualizations; high interactivity; ability to be integrated in portals for internal (employees) or external (partners or customers) consumption; standards based, like HTML5 and CSS3; can be hosted in a SaaS model
  • 24. Managing row-level data security
    • Centralize data security with StreamCentral: in the StreamCentral Workbench, the administrator defines roles, specifies data access rules, and assigns users to roles. StreamCentral builds and manages the metadata for row-level access.
    • Custom applications and analytical/reporting tools only pass a user id as part of their query to the StreamCentral database; the StreamCentral row-level security layer determines what that user may see.
    • Two types of row-level security:
      1. On underlying fact data, based on dimensions (like time and location) and entities (like customer, department, and site)
      2. On denormalized aggregated data, based on and/or rules
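The row-level idea – the caller supplies only a user id, and the security layer joins role permissions into the query – can be shown with an in-memory SQLite database. Table and column names here are illustrative, not StreamCentral's actual security schema.

```python
import sqlite3

# Toy schema: a fact table, a user-to-role mapping, and a
# role-to-location grant table (all names hypothetical).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE fact_sales(location TEXT, amount REAL);
    CREATE TABLE user_role(user_id TEXT, role TEXT);
    CREATE TABLE role_location(role TEXT, location TEXT);
    INSERT INTO fact_sales VALUES ('Boston', 100), ('London', 250);
    INSERT INTO user_role VALUES ('alice', 'us_analyst');
    INSERT INTO role_location VALUES ('us_analyst', 'Boston');
""")

def secure_query(user_id):
    """Return only fact rows whose location dimension is granted
    to the caller's role; the caller passes nothing but a user id."""
    return db.execute("""
        SELECT f.location, f.amount
        FROM fact_sales f
        JOIN role_location rl ON rl.location = f.location
        JOIN user_role ur ON ur.role = rl.role
        WHERE ur.user_id = ?
    """, (user_id,)).fetchall()

print(secure_query("alice"))  # → [('Boston', 100.0)]
```

Because the filter is applied inside the query, every reporting tool or API client gets the same enforcement without duplicating the rules, which is the centralization benefit the slide describes.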
  • 25. [Architecture diagram: StreamCentral Workbench and its metadata DB alongside the StreamCentral database (MS SQL / HP Vertica), which holds factual tables, processing tables, and security tables such as StreamCentralSecurity (ScrtyRoleID, Role).] The Workbench administrator manages data security by creating data access rules for roles and assigning users to roles. For data accessed from the StreamCentral database via reporting/analytical tools or the API, StreamCentral determines the data access permissions for that user.
  • 26. Distributed caching is used for:
    • Storing time and location dimension data for fast lookups and data standardization
    • Maintaining configuration information about the system, which aids in managing updates to definitions
    • Storing entity data required for adding context to incoming data
    • Managing correlation of real-time data
    • Managing event detection
    • Holding processed data formatted to the data mart specification
    • Managing batch data inserts into the database
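The first use above – fast dimension lookups that avoid a database round trip – can be sketched as a small TTL cache. This stands in for a distributed cache such as AppFabric; the class, the loader callback, and the dimension shape are all illustrative.

```python
import time

class DimensionCache:
    """Keep dimension rows in memory with a time-to-live, falling back
    to a loader (e.g. a database query) on a miss or expiry."""

    def __init__(self, loader, ttl_seconds=300):
        self._loader = loader
        self._ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]                # cache hit: no loader call
        value = self._loader(key)          # cache miss: load and remember
        self._store[key] = (time.monotonic() + self._ttl, value)
        return value

# Hypothetical loader standing in for a dimension-table query.
cache = DimensionCache(loader=lambda k: {"location_id": k, "city": "Boston"})
print(cache.get("loc-7")["city"])  # → Boston
```

A distributed cache adds replication and cross-server visibility on top of this idea, but the hit/miss/expiry contract that the processing tier relies on is the same.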
  • 28. StreamCentral high availability [Architecture diagram: outside network → software network load balancer → web tier (Workbench application, StreamCentral public API, reports/analytics) → messaging (inbound and publish message queues) → processing, correlation, and publish services → Workbench database (StreamCentral metadata) and StreamCentral database (fact and aggregate data – Vertica / MS SQL Server), backed by a cache cluster.]
    • Microsoft AppFabric Cache is a distributed caching technology that keeps the cache highly available by having more than one server store the cache data – a configuration commonly called a cache cluster.
    • Microsoft IIS web servers configured behind the software network load balancing (NLB) provided by Microsoft Windows Server keep all websites highly available.
    • Microsoft Message Queuing (MSMQ) persists unread messages in the queue in the event of a sudden server shutdown.
    • The physical hardware can be clustered to ensure failover in case of hardware failure.
    • The processing, correlation, and publish engines can run on multiple physical servers so that these services remain highly available.
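The client-facing effect of running each engine on multiple servers can be sketched as simple failover: try each replica in turn, so one node failing does not lose the request. The endpoint names and the stub service below are illustrative only; in the architecture above the load balancer and the durable queues play this role.

```python
def call_with_failover(endpoints, work):
    """Try each replica in turn; return the first successful result,
    or raise once every replica has failed."""
    last_error = None
    for endpoint in endpoints:
        try:
            return work(endpoint)
        except ConnectionError as exc:
            last_error = exc          # this node is down; try the next one
    raise RuntimeError("all replicas failed") from last_error

def fake_service(endpoint):
    """Stub standing in for a processing-service replica."""
    if endpoint == "node-1":
        raise ConnectionError("node-1 is down")
    return f"processed by {endpoint}"

print(call_with_failover(["node-1", "node-2"], fake_service))
# → processed by node-2
```

Combined with a persistent queue (as MSMQ provides here), a request that reaches no healthy replica is retained rather than lost, which is the availability property the slide is after.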
  • 29. Thank you for your time. Raheel Retiwalla, CTO – Virtus IT Ltd. E: raheel.retiwalla@virtus-it.com M: +1 617 901 8370