© Copyright 2018 Pivotal Software, Inc. All rights Reserved. Version 1.0
Simplifying Data and Analytics with
Pivotal Greenplum
451 Research and Pivotal, Inc.
May 24, 2018
Welcome!
James Curtis
Sr Analyst, Data Platforms & Analytics
james.curtis@451research.com
@jmscrts
www.451research.com
Bob Glithero
Principal Product Marketing Mgr
linkedin.com/in/glithero
@bglithero
www.pivotal.io
Bharath Sitaraman
Principal Product Manager
linkedin.com/in/bsitaraman
@bharath1028
www.pivotal.io
Agenda
●  Expanding Analytics with EDW
●  Integrating Data for Analytical
Transformation
●  Use Case: Layered Analytics in
Cybersecurity
●  Q&A
451 Research is a leading IT research & advisory company
Founded in 2000
300+ employees, including over 120 analysts
2,000+ clients: Technology & Service providers, corporate
advisory, finance, professional services, and IT decision makers
70,000+ IT professionals, business users and consumers in our research
community
Over 52 million data points published each quarter and 4,500+ reports
published each year
3,000+ technology & service providers under coverage
451 Research and its sister company, Uptime Institute, are the two divisions
of The 451 Group
Headquartered in New York City, with offices in London, Boston, San
Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia,
Taiwan, Singapore and Malaysia
Research & Data
Advisory
Events
Go 2 Market
Becoming Data Driven, Analytics Driven
[Diagram: enterprise applications, data warehouse, decision makers, data analysts, IT pros]
Enterprise Data Warehouse: Common Characteristics
Analytic Data Platforms: A Growing Market
9.1% CAGR, 2017-22
Source: 451 Research, Market Monitor, Total Data: Platforms & Analytics, February 2018.
Adapt and Expand Our Field of Vision
[Diagram: enterprise applications, data warehouse, decision makers, data analysts, IT pros]
Expanded Processing Choices
[Diagram: enterprise applications, cloud storage, Hadoop, Spark, AI+ML, data warehouse, decision makers, data analysts, IT pros]
Leads to Expansion of Data Sources
[Diagram: enterprise applications, cloud storage, mobile apps, bots, IoT devices and sensors, social media, log and clickstream data, Hadoop, Spark, AI+ML, data warehouse, decision makers, data analysts, IT pros]
Which Leads to More Advanced Decision-Making Processes
[Diagram: enterprise applications, cloud storage, mobile apps, bots, IoT devices and sensors, social media, log and clickstream data, Hadoop, Spark, AI+ML, data warehouse, decision makers, data analysts, IT pros, business users, data-driven applications, data scientists, OT users]
Consider the Environment
•  Too many systems to maintain
•  Excessive data movement
•  Analytics on a portion of data
•  Duplicate capabilities
•  Low utilization
•  Low optimization
[Diagram: cloud storage, Hadoop, Spark, AI+ML, data warehouse]
Consolidate Analytical Frameworks
•  Fewer systems to maintain
•  Minimize data movement
•  Analytic optimization
•  Resource efficiency
[Diagram: cloud storage, Hadoop, Spark, AI+ML, data warehouse]
Consolidated Systems Enable In-Database Machine Learning
✔︎ Operate on all of the data, including varied types
✔︎ Algorithms optimized to the architecture
✔︎ No moving of data
✔︎ Leverage the use of SQL
Level Set on Machine Learning
●  The terms ‘algorithm’ and ‘model’ are often used interchangeably, but they are not the same thing.
●  An algorithm is a set of computational instructions, such as Random Forest.
●  A model, in that context, is the result of applying the Random Forest algorithm to a dataset: the output the algorithm produces from that data.
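The distinction can be sketched in a few lines. This is only an illustration under stated assumptions: it swaps Random Forest for single-variable least squares to stay short, and the function names and data are hypothetical. The function is the algorithm; what it returns for a given dataset is the model.

```python
from statistics import mean


def fit_line(xs, ys):
    """The algorithm: ordinary least squares for one input variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    # The returned parameters are the model: the algorithm's output for this data.
    return {"slope": slope, "intercept": intercept}


def predict(model, x):
    """Apply a fitted model to a new input."""
    return model["slope"] * x + model["intercept"]


# Hypothetical datasets: one algorithm, two different models.
model_a = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
model_b = fit_line([1, 2, 3, 4], [5, 5, 5, 5])
```

The same `fit_line` call yields a different model for each dataset, which is why retraining on new data changes the model but not the algorithm.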
In-Database Machine Learning Works Best When You...
1. Understand the business problem (and rules) thoroughly
2. Have a decent amount of data that is consolidated
3. Have algorithms and tools for data analysis and preparation (optimized)
4. Have algorithms for machine learning development (optimized)
5. Have algorithms and tools for carrying out maintenance, validation, and updating
6. Have methods for machine learning model deployment
[Diagram: cloud storage, Hadoop, Spark, AI+ML, data warehouse]
Key Takeaways
Integrating Data for Analytical
Transformation
Data is at the center of digital transformation;
data-driven action is how transformation happens
How do you get your
arms around this?
So How Can We Use Data Effectively?
●  Over 30% of organizations have failed on big data
projects
●  Recent research says it takes an average of 52 days
to build a predictive model
●  Consolidating data and analytics in fewer
environments simplifies modeling and deployment
1. Converge Analytics and Data
●  Run algorithms in a database, as close to the data as
possible
●  Leverage MPP architectures for rapid data science
●  Avoid ETL by moving data only when necessary
●  Integrate structured and unstructured data in one
environment, reducing footprint of specialist databases
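A minimal sketch of running the computation next to the data, assuming Python's built-in `sqlite3` as a stand-in for an MPP database such as Greenplum. The table and column names are hypothetical. The first approach moves every row to the client before aggregating; the second sends the query to the database and moves only the result.

```python
import sqlite3

# In-memory SQLite stands in for the analytical database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 5.0), ("west", 15.0), ("west", 40.0)],
)

# Approach 1: pull every row out of the database, then aggregate client-side.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals = {}
for region, amount in rows:
    total, count = totals.get(region, (0.0, 0))
    totals[region] = (total + amount, count + 1)
client_avg = {region: total / count for region, (total, count) in totals.items()}

# Approach 2: run the aggregation in the database; only one row per region leaves it.
in_db_avg = dict(
    conn.execute("SELECT region, AVG(amount) FROM sales GROUP BY region").fetchall()
)
```

Both approaches give the same averages, but the second transfers two rows instead of five. On real tables the gap between result size and table size is usually far larger.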
2. Remove Friction from Data Science
●  Test many hypotheses in parallel to find the
most relevant features as quickly as possible
●  Don’t push/pull data into other environments,
train and test, over and over...
●  Instead, develop a process that lets you
○  train a model with consolidated, cleansed
data sets in the database
○  deploy the model as a pre-computed
object
○  quickly retrain if data patterns change
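The train, deploy, and retrain loop above can be sketched as follows. This is an illustration only: a one-dimensional nearest-centroid classifier stands in for a real model, JSON stands in for the pre-computed model object, and all names and data are hypothetical.

```python
import json
from statistics import mean


def train(rows):
    """Train a nearest-centroid model from (value, label) rows."""
    by_label = {}
    for value, label in rows:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def deploy(model):
    """Serialize the model so scoring needs no access to the training data."""
    return json.dumps(model, sort_keys=True)


def score(deployed, value):
    """Score a new value against the deployed, pre-computed model."""
    centroids = json.loads(deployed)
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical cleansed training data.
training = [(1.0, "normal"), (2.0, "normal"), (9.0, "unusual"), (11.0, "unusual")]
deployed = deploy(train(training))

# Data patterns change: retrain on the new data and redeploy.
shifted = [(6.0, "normal"), (8.0, "normal"), (19.0, "unusual"), (21.0, "unusual")]
redeployed = deploy(train(shifted))
```

With the first model, a value of 9.5 scores as "unusual"; after retraining on the shifted data the same value scores as "normal", which is the reason to keep retraining cheap.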
3. Choose a Data Platform for Rapid Analytics
Some algorithms are iterative, like clustering or
graphing
Some algorithms can be parallelized, like random
forests
Sometimes patterns in the underlying data change, so
models become obsolete
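To illustrate why an algorithm like random forest parallelizes well, here is a hedged sketch in which each worker trains an independent model on its own bootstrap sample. One-dimensional decision stumps stand in for trees, and a thread pool stands in for the segments of an MPP database. In CPython, threads will not speed up this CPU-bound work; the point is only that the training tasks share nothing. All names and data are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def train_stump(args):
    """Train one decision stump (a single threshold) on a bootstrap sample."""
    data, seed = args
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]  # sample with replacement
    best_threshold, best_errors = None, len(sample) + 1
    for threshold, _ in sample:
        # The stump predicts True when x is above the threshold.
        errors = sum((x > threshold) != label for x, label in sample)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(thresholds, x):
    """Majority vote across all stumps."""
    votes = sum(x > t for t in thresholds)
    return votes * 2 > len(thresholds)


# Hypothetical labeled data: the label is True for values of 10 and above.
data = [(float(x), x >= 10) for x in range(20)]

# Each stump depends only on its own seed and sample, so tasks can run independently.
with ThreadPoolExecutor(max_workers=4) as pool:
    thresholds = list(pool.map(train_stump, [(data, seed) for seed in range(25)]))
```

Iterative algorithms such as clustering differ because each pass depends on the result of the previous one, so only the work inside a pass can be spread across workers.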
4. Start with Standard Statistical Methods and ML
A lot of useful data science can
be done with standard algorithms
Exotic algorithms need more data
to train
Standard algorithms can be
combined into ensembles for
greater predictive power
Source: “A Few Useful Things to Know About Machine Learning,” Pedro Domingos, Communications of the ACM, October 2012
Starting with simpler algorithms and ensembles paves the way for more advanced data science
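A minimal sketch of combining standard models into an ensemble by majority vote. The predictions below are made up for illustration, and the benefit shown depends on the assumption that the individual models make their mistakes on different items.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common prediction among the models for one item."""
    return Counter(predictions).most_common(1)[0][0]


def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical ground truth and predictions from three simple models.
truth = [1, 1, 0, 0, 1, 0]
model_a = [1, 1, 0, 0, 0, 0]  # wrong on the fifth item
model_b = [1, 0, 0, 0, 1, 0]  # wrong on the second item
model_c = [1, 1, 1, 0, 1, 0]  # wrong on the third item

# Vote item by item across the three models.
ensemble = [majority_vote(votes) for votes in zip(model_a, model_b, model_c)]
```

Here each model is right on five of six items, while the vote is right on all six because no two models fail on the same item. Models with strongly correlated errors would gain much less.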
But Don’t Just Take Our Word for It...
“As we built the ML model, we were
surprised to learn that none of the most
hyped data science tools — such as deep
learning, AutoML, and ‘AI that creates AI’ —
were needed to make it work.”
Pivotal Solutions for Data and Analytics
Pivotal Greenplum
Multi-Cloud, MPP data platform
for complex analytics with
diverse data locality and data
types
Pivotal GemFire
Fast, transactional
in-memory grid for rapid
data refresh
Complete portfolio
Multi-Cloud and
on premises
Based on
open source
Flexible licensing
Advanced
data services
Pivotal Cloud Cache
On-demand in-memory
caching for cloud native
apps
Pivotal Cloud Foundry
Proven solution for
operationalization of
analytics and software-led,
digital transformation
Pivotal Data Science
World-class Data Science
consulting to drive more
insights from data for Data-
Driven Applications.
Apache MADlib
Distributed, in-database
analytical library for large-scale
data sets.
Consolidating Diverse Data Enables New Use Cases
●  Native Graph: Relationship Intelligence
●  Greenplum GPText: Fast Search and Semantic Intelligence
●  Greenplum PostGIS: Location Intelligence
●  Greenplum: Integrated, Cleansed Data
New use cases for locations, flows, connections, relationships, and intent
Text Analytics with GPText
Extracts content, structure from
many binary formats
Fast heterogeneous document
indexing, search, and retrieval
Massive parallelism for
●  Topic modeling
●  Named entity recognition
●  Term frequency
●  Stemming
●  Topic graph
●  Topic cloud
●  NLP
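Two of the techniques listed above, stemming and term frequency, can be sketched with the standard library. This is not the GPText API: the suffix-stripping stemmer is a deliberately naive assumption made for illustration, and the sample text is hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(word):
    """Strip a few common suffixes; real stemmers use far richer rules."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_frequency(text):
    """Count how often each stemmed term occurs in the text."""
    return Counter(naive_stem(word) for word in tokenize(text))


doc = "Indexing documents: the index is searching indexed documents"
tf = term_frequency(doc)
```

Stemming folds "Indexing", "index", and "indexed" into one term, so the counts reflect the topic rather than the surface form of each word.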
Layered Analytics for Security
Attacks go unnoticed for long periods
Insider Threats Increasingly Evade Network-Level
Visibility
Source: Verizon, 2017, n = 77
Layering Analytic Techniques for a More Complete
Picture
Layers, from general to specific:
●  Understand activity: network-level intelligence (firewalls, IDS, SIEMs)
●  Understand behavior: understanding user/entity behavior (graphing, clustering, predictive analytics)
●  Understand intent: semantic understanding (e.g., text analytics, NLP)
Engineering-level view misses higher-level activity
Understanding Activity - Network Intelligence
●  SIEM/IDS useful if activity matches a signature
or rule
●  Small-ish data sets - APTs unfold over months
or years
●  Inflexible schema - difficult to extend with user-
level attributes
●  A user is not the same as a device or an IP address
[Diagram: SIEM functions: log collection, log analysis, event correlation, log forensics, object access auditing, alerts, reports, log monitoring, log retention, file integrity monitoring]
Advanced Analytics Reveal Hidden Patterns
Reveal latent/invisible patterns that stand
out from normal behavior
●  Reconnaissance
●  Privilege escalation attempts
●  Access attempts
●  Unusual data flows or exfiltration
attempts
Predictive Analytics for Understanding Behavior
Chaining models increases predictive power, decreases false positives
Training data
●  Kerberos authentication events (source, destination, account, type, success/failure, etc.)
●  10K users, 13K nodes
●  >110M events
Model features
●  # of distinct destinations logged in to
●  # of distinct sources logged in from
●  # of distinct destination user accounts
●  # of distinct processes started by user
Lateral Movement Ensemble Model
●  Regression analysis
●  Constrained diameter authentication graph
●  Robust rank aggregation
Outcomes
●  Signals of credential takeover, running scripts, and other behavioral anomalies
https://content.pivotal.io/blog/insider-threat-detection-detecting-variance-in-user-behavior-using-an-ensemble-approach
Ensemble of methods reveals hidden patterns that signal problem behavior
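The four model features named on this slide are distinct counts per user, which can be sketched as below. The event schema, field names, and sample events are hypothetical and far simpler than real Kerberos logs; this is an illustration, not the method used in the linked post.

```python
from collections import defaultdict

# Fields counted per user (hypothetical names for the four features).
FIELDS = ("source", "destination", "destination_account", "process")


def user_features(events):
    """Count distinct values of each field per user."""
    seen = defaultdict(lambda: defaultdict(set))
    for event in events:
        for field in FIELDS:
            seen[event["user"]][field].add(event[field])
    return {
        user: {field: len(values[field]) for field in FIELDS}
        for user, values in seen.items()
    }


def make_event(user, source, destination, destination_account, process):
    return {
        "user": user,
        "source": source,
        "destination": destination,
        "destination_account": destination_account,
        "process": process,
    }


# Hypothetical authentication events.
events = [
    make_event("alice", "ws1", "fileserver", "alice", "explorer"),
    make_event("alice", "ws1", "fileserver", "alice", "explorer"),
    make_event("mallory", "ws7", "db1", "mallory", "powershell"),
    make_event("mallory", "ws7", "db2", "svc_backup", "powershell"),
    make_event("mallory", "ws9", "hr1", "admin", "cmd"),
    make_event("mallory", "ws7", "db1", "mallory", "powershell"),
]

features = user_features(events)
```

In this made-up sample, "alice" has a count of 1 for every feature while "mallory" touches three destinations under three accounts. That kind of spread is what a downstream model could weigh as a possible lateral-movement signal.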
Revealing the Needle in the Haystack
Surge in access attempts at
regular intervals indicates
possible background script
Regular surges in logins from
unusual user accounts
indicates possible account
takeovers
Graph reveals two access
attempts to a sensitive server
indirectly via other servers
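The graph finding above, access to a sensitive server reached indirectly through other servers, can be sketched as a breadth-first search over login edges. The host names and edges are hypothetical, and this plain-Python search only stands in for the graph analytics a database would run at scale.

```python
from collections import deque


def shortest_path(edges, start, target):
    """Breadth-first search over directed (source, destination) login edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the target is not reachable from the start


# Hypothetical login edges: (from host, to host).
logins = [
    ("ws3", "build01"),
    ("build01", "jump02"),
    ("jump02", "payroll-db"),
    ("ws5", "mail01"),
]

path = shortest_path(logins, "ws3", "payroll-db")
# A path with intermediate hops means the sensitive server was reached indirectly.
indirect = path is not None and len(path) > 2
```

No single login from "ws3" touches "payroll-db" directly, so a per-event rule would miss it. The chain only appears once the events are connected as a graph.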
Understanding Intent with Semantic Intelligence
Semantic intelligence can clarify
ambiguous signals from other analytics or
security appliances
Scan for interesting words, words in
proximity, variations
Scan and index content - “Is this document
leaving the network similar to other known
sensitive documents?”
Scan document to suggest possible
classification tags
✔ “sorry, i sent this file by mistake”: benign, no further action
! “please don’t share this file outside the organization”: investigate
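One way to approach the question "is this document similar to other known sensitive documents?" is cosine similarity over term counts, sketched below. This is an assumption for illustration, not how GPText implements it. The documents are hypothetical, and any cutoff for flagging would have to be tuned on real data.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Represent a document as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical known-sensitive document and two outgoing documents.
known_sensitive = vectorize("quarterly payroll report with employee salary details")
outgoing_a = vectorize("employee salary details from the payroll report")
outgoing_b = vectorize("lunch menu for the team offsite")

sim_a = cosine(known_sensitive, outgoing_a)
sim_b = cosine(known_sensitive, outgoing_b)
```

The first outgoing document shares five of seven terms with the known sensitive one and scores about 0.71; the second shares none and scores 0. A score like this could help clarify an otherwise ambiguous alert from another layer.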
Final Thoughts
Digital
Transformation
is Real
T-Mobile goes from 7 months and 72 steps to update
software, to same day deployments.
Liberty Mutual builds and deploys an MVP in one
month and delivers revenue-generating version just
months later.
Comcast supports over 1500 developers with an
operator team of 4 people.
The Home Depot ships to production 1,500 times a
month, and 17,000 times a month to all environments.
Leading companies trust Pivotal as
a transformation partner
Data is the Key to
Transformation
●  Consolidating data simplifies modeling
and deployment
●  Executing massively parallel analytics in
the database speeds complex use
cases
●  Pivotal Greenplum integrates diverse
data with machine learning at scale for
faster value with less risk
Start Your Data Transformation Journey Today!
Pivotal Greenplum
pivotal.io/pivotal-greenplum
Pivotal Data Science
pivotal.io/data-science
Apache MADlib
madlib.apache.org
Greenplum Database
Channel
Data Tells the Story
Mais conteúdo relacionado

Mais procurados

Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data scienceVignesh Prajapati
 
Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices
Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best PracticesNeo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices
Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best PracticesNeo4j
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesSpringPeople
 
Unified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge GraphUnified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge GraphVaticle
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAmpoolIO
 
Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Cambridge Semantics
 
Hadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance InitiativeHadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance InitiativeDataWorks Summit
 
Big data deep learning: applications and challenges
Big data deep learning: applications and challengesBig data deep learning: applications and challenges
Big data deep learning: applications and challengesfazail amin
 
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...Cambridge Semantics
 
Sustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsSustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsCambridge Semantics
 
Big Data Analytics Using Hadoop
Big Data Analytics Using HadoopBig Data Analytics Using Hadoop
Big Data Analytics Using HadoopSrikanth VNV
 
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data PlatformPredictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data PlatformSavita Yadav
 
Modern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An OverviewModern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An OverviewGreat Wide Open
 
From hadoop to spark
From hadoop to sparkFrom hadoop to spark
From hadoop to sparksteccami
 
Big Tools for Big Data
Big Tools for Big DataBig Tools for Big Data
Big Tools for Big DataLewis Crawford
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceCaserta
 
Applying Noisy Knowledge Graphs to Real Problems
Applying Noisy Knowledge Graphs to Real ProblemsApplying Noisy Knowledge Graphs to Real Problems
Applying Noisy Knowledge Graphs to Real ProblemsDataWorks Summit
 

Mais procurados (20)

Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices
Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best PracticesNeo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices
Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practices
 
Unified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge GraphUnified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge Graph
 
Apouc 2014-business-analytics-and-big-data
Apouc 2014-business-analytics-and-big-dataApouc 2014-business-analytics-and-big-data
Apouc 2014-business-analytics-and-big-data
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?
 
Hadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance InitiativeHadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance Initiative
 
Are you ready for BIG DATA?
Are you ready for BIG DATA?Are you ready for BIG DATA?
Are you ready for BIG DATA?
 
Big data deep learning: applications and challenges
Big data deep learning: applications and challengesBig data deep learning: applications and challenges
Big data deep learning: applications and challenges
 
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
 
Sustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsSustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive Analytics
 
Big Data Analytics Using Hadoop
Big Data Analytics Using HadoopBig Data Analytics Using Hadoop
Big Data Analytics Using Hadoop
 
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data PlatformPredictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
 
Modern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An OverviewModern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An Overview
 
From hadoop to spark
From hadoop to sparkFrom hadoop to spark
From hadoop to spark
 
Big Tools for Big Data
Big Tools for Big DataBig Tools for Big Data
Big Tools for Big Data
 
Big data 101
Big data 101Big data 101
Big data 101
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Applying Noisy Knowledge Graphs to Real Problems
Applying Noisy Knowledge Graphs to Real ProblemsApplying Noisy Knowledge Graphs to Real Problems
Applying Noisy Knowledge Graphs to Real Problems
 

Semelhante a Simplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum

When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
using big-data methods analyse the Cross platform aviation
 using big-data methods analyse the Cross platform aviation using big-data methods analyse the Cross platform aviation
using big-data methods analyse the Cross platform aviationranjit banshpal
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedcedrinemadera
 
Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Denodo
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)Denodo
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...Alex Liu
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricCambridge Semantics
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIDenodo
 
DATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptxDATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptxOTA13NayabNakhwa
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoopDr. Wilfred Lin (Ph.D.)
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Prof.Balakrishnan S
 
Got data?… now what? An introduction to modern data platforms
Got data?… now what?  An introduction to modern data platformsGot data?… now what?  An introduction to modern data platforms
Got data?… now what? An introduction to modern data platformsJamesAnderson599331
 

Semelhante a Simplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum (20)

When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 
using big-data methods analyse the Cross platform aviation
 using big-data methods analyse the Cross platform aviation using big-data methods analyse the Cross platform aviation
using big-data methods analyse the Cross platform aviation
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-shared
 
Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
 
BigData Analysis
BigData AnalysisBigData Analysis
BigData Analysis
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
 
IBM Cloud pak for data brochure
IBM Cloud pak for data   brochureIBM Cloud pak for data   brochure
IBM Cloud pak for data brochure
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
 
DATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptxDATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptx
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19
 
Proposed Talk Outline for Pycon2017
Proposed Talk Outline for Pycon2017 Proposed Talk Outline for Pycon2017
Proposed Talk Outline for Pycon2017
 
Got data?… now what? An introduction to modern data platforms
Got data?… now what?  An introduction to modern data platformsGot data?… now what?  An introduction to modern data platforms
Got data?… now what? An introduction to modern data platforms
 

Mais de VMware Tanzu

What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItVMware Tanzu
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023VMware Tanzu
 
Enhancing DevEx and Simplifying Operations at Scale
Simplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum

  • 1. © Copyright 2018 Pivotal Software, Inc. All rights Reserved. Version 1.0 Simplifying Data and Analytics with Pivotal Greenplum 451 Research and Pivotal, Inc. May 24, 2018
  • 2. Welcome! ●  James Curtis, Sr Analyst, Data Platforms & Analytics (james.curtis@451research.com, @jmscrts, www.451research.com) ●  Bob Glithero, Principal Product Marketing Mgr (linkedin.com/in/glithero, @bglithero, www.pivotal.io) ●  Bharath Sitaraman, Principal Product Manager (linkedin.com/in/bsitaraman, @bharath1028, www.pivotal.io)
  • 3. Agenda ●  Expanding Analytics with EDW ●  Integrating Data for Analytical Transformation ●  Use Case: Layered Analytics in Cybersecurity ●  Q&A
  • 4. 451 Research is a leading IT research & advisory company ●  Founded in 2000 ●  300+ employees, including over 120 analysts ●  2,000+ clients: technology & service providers, corporate advisory, finance, professional services, and IT decision makers ●  70,000+ IT professionals, business users and consumers in our research community ●  Over 52 million data points published each quarter and 4,500+ reports published each year ●  3,000+ technology & service providers under coverage ●  451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group ●  Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia ●  Research & Data; Advisory; Events; Go 2 Market
  • 5. Becoming Data Driven, Analytics Driven
  • 7. Analytic Data Platforms: A Growing Market ●  9.1% CAGR, 2017-22 ●  Source: 451 Research, Market Monitor, Total Data: Platforms & Analytics, February 2018.
  • 10. Leads to Expansion of Data Sources [Diagram labels: Enterprise Applications, Cloud Storage, Mobile Apps, Bots, IoT Devices and Sensors, Social Media, Log and Clickstream Data, Data Warehouse, Hadoop, Spark, AI+ML, Decision Makers, Data Analysts, IT Pros]
  • 11. Which Leads to More Advanced Decision-Making Processes [Diagram labels: Enterprise Applications, Cloud Storage, Mobile Apps, Bots, IoT Devices and Sensors, Social Media, Log and Clickstream Data, Data Warehouse, Hadoop, Spark, AI+ML, Business Users, Data-Driven Applications, Data Scientists, Decision Makers, Data Analysts, IT Pros, OT Users]
  • 12. Consider the Environment •  Too many systems to maintain •  Excessive data movement •  Analytics on a portion of data •  Duplicate capabilities •  Low utilization •  Low optimization [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 13. Consolidate Analytical Frameworks •  Fewer systems to maintain •  Minimize data movement •  Analytic optimization •  Resource efficiency [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 14. Consolidated Systems Enable In-Database Machine Learning ✔︎ Operate on all of the data, including varied types ✔︎ Algorithms optimized to the architecture ✔︎ No moving of data ✔︎ Leverage the use of SQL
  • 15. Level Set on Machine Learning ●  The terms ‘algorithm’ and ‘model’ are often used to mean the same thing. They are not. ●  An algorithm is a set of computational instructions, such as Random Forest. ●  A model in that context would be the result of applying a Random Forest to a dataset: its output, which is based on the algorithm.
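The distinction on slide 15 can be made concrete with a minimal sketch. It swaps in one-variable least squares for Random Forest to stay short, and the names `fit_line` and `predict` are hypothetical, not taken from the deck or from any library. The function is the algorithm; the dictionary it returns is the model.

```python
from statistics import mean


def fit_line(xs, ys):
    """The algorithm: ordinary least squares for a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    # The returned parameters are the model: the output of the algorithm
    # after it has been applied to one particular dataset.
    return {"slope": slope, "intercept": intercept}


def predict(model, x):
    """Score a new value using a previously fitted model."""
    return model["slope"] * x + model["intercept"]


# One algorithm, two datasets, two different models.
model_a = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
model_b = fit_line([1, 2, 3, 4], [5, 5, 5, 5])
```

Applying the same algorithm to different data yields different models, which is why retraining matters when the underlying data changes.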
  • 16. In-Database Machine Learning Works Best When You... (1) Understand the business problem (and rules) thoroughly [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 17. In-Database Machine Learning Works Best When You... (1) Understand the business problem (and rules) thoroughly (2) Have a decent amount of data that is consolidated [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 18. In-Database Machine Learning Works Best When You... (1) Understand the business problem (and rules) thoroughly (2) Have a decent amount of data that is consolidated (3) Have algorithms and tools for data analysis and preparation (optimized) [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 19. In-Database Machine Learning Works Best When You... (1) Understand the business problem (and rules) thoroughly (2) Have a decent amount of data that is consolidated (3) Have algorithms and tools for data analysis and preparation (optimized) (4) Have algorithms for machine learning development (optimized) [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 20. In-Database Machine Learning Works Best When You... (1) Understand the business problem (and rules) thoroughly (2) Have a decent amount of data that is consolidated (3) Have algorithms and tools for data analysis and preparation (optimized) (4) Have algorithms for machine learning development (optimized) (5) Have algorithms and tools to carry out maintenance, validation, and updating [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 21. In-Database Machine Learning Works Best When You... (1) Understand the business problem (and rules) thoroughly (2) Have a decent amount of data that is consolidated (3) Have algorithms and tools for data analysis and preparation (optimized) (4) Have algorithms for machine learning development (optimized) (5) Have algorithms and tools to carry out maintenance, validation, and updating (6) Have methods for machine learning model deployment [Diagram labels: Cloud Storage, Hadoop, Spark, AI+ML, Data Warehouse]
  • 23. PIVOTAL Integrating Data for Analytical Transformation
  • 24. Data is at the center of digital transformation; data-driven action is how transformation happens
  • 25. How do you get your arms around this?
  • 26. So How Can We Use Data Effectively? ●  Over 30% of organizations have failed on big data projects ●  Recent research says it takes an average of 52 days to build a predictive model ●  Consolidating data and analytics in fewer environments simplifies modeling and deployment
  • 27. 1. Converge Analytics and Data ●  Run algorithms in a database, as close to the data as possible ●  Leverage MPP architectures for rapid data science ●  Avoid ETL by moving data only when necessary ●  Integrate structured and unstructured data in one environment, reducing footprint of specialist databases
  • 28. 2. Remove Friction from Data Science ●  Test as many hypotheses as possible in parallel to find the most relevant features quickly ●  Don’t repeatedly push/pull data into other environments to train and test ●  Instead, develop a process that lets you ○  train a model with consolidated, cleansed data sets in the database ○  deploy the model as a pre-computed object ○  quickly retrain if data patterns change
  • 29. 3. Choose a Data Platform for Rapid Analytics ●  Some algorithms are iterative, like clustering or graphing ●  Some algorithms can be parallelized, like random forests ●  Sometimes patterns in the underlying data change, so models become obsolete
  • 30. 4. Start with Standard Statistical Methods and ML ●  A lot of useful data science can be done with standard algorithms ●  Exotic algorithms need more data to train ●  Standard algorithms can be combined into ensembles for greater predictive power ●  Starting with simpler algorithms and ensembles paves the way for more advanced data science ●  Source: “A Few Useful Things to Know About Machine Learning,” Pedro Domingos, Communications of the ACM, October 2012
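As a rough illustration of slide 30's point about combining standard algorithms into ensembles, the sketch below uses majority voting over three simple rules. The rules, field names, and thresholds are all hypothetical and stand in for trained models; the deck does not specify which ensemble method it has in mind.

```python
from collections import Counter


def majority_vote(models, record):
    """Return the label that most of the individual models agree on."""
    votes = Counter(model(record) for model in models)
    return votes.most_common(1)[0][0]


# Three deliberately simple classifiers (hypothetical thresholds).
# Each returns True when it considers the record suspicious.
def many_failures(record):
    return record["failed_logins"] > 5


def odd_hours(record):
    return record["hour"] < 6


def many_hosts(record):
    return record["distinct_hosts"] > 20


# An odd number of binary voters avoids ties.
models = [many_failures, odd_hours, many_hosts]

suspicious = {"failed_logins": 9, "hour": 3, "distinct_hosts": 4}
ordinary = {"failed_logins": 0, "hour": 14, "distinct_hosts": 2}
```

Here `majority_vote(models, suspicious)` is `True` because two of the three rules fire, even though no single rule is decisive on its own.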
  • 31. But Don’t Just Take Our Word for It... “As we built the ML model, we were surprised to learn that none of the most hyped data science tools — such as deep learning, AutoML, and ‘AI that creates AI’ — were needed to make it work.”
  • 32. Pivotal Solutions for Data and Analytics ●  Pivotal Greenplum: multi-cloud MPP data platform for complex analytics with diverse data locality and data types ●  Pivotal GemFire: fast, transactional in-memory grid for rapid data refresh ●  Pivotal Cloud Cache: on-demand in-memory caching for cloud native apps ●  Pivotal Cloud Foundry: proven solution for operationalization of analytics and software-led digital transformation ●  Pivotal Data Science: world-class data science consulting to drive more insights from data for data-driven applications ●  Apache MADlib: distributed, in-database analytical library for large-scale data sets ●  Complete portfolio; multi-cloud and on premises; based on open source; flexible licensing; advanced data services
  • 33. Consolidating Diverse Data Enables New Use Cases ●  Native Graph: relationship intelligence ●  Greenplum GPText: fast search and semantic intelligence ●  Greenplum PostGIS: location intelligence ●  Greenplum: integrated, cleansed data ●  New use cases for locations, flows, connections, relationships, and intent
  • 34. Text Analytics with GPText ●  Extracts content and structure from many binary formats ●  Fast heterogeneous document indexing, search, and retrieval ●  Massive parallelism for topic modeling, named entity recognition, term frequency, stemming, topic graph, topic cloud, and NLP
  • 36. Insider Threats Increasingly Evade Network-Level Visibility ●  Attacks go unnoticed for long periods ●  Source: Verizon, 2017, n = 77
  • 37. Layering Analytic Techniques for a More Complete Picture ●  Understand activity: network-level intelligence (firewalls, IDS, SIEMs) ●  Understand behavior: user/entity behavior (graphing, clustering, predictive analytics) ●  Understand intent: semantic understanding (e.g., text analytics, NLP) ●  Layers run from general to specific
  • 38. Understanding Activity - Network Intelligence ●  Engineering-level view misses higher-level activity ●  SIEM/IDS are useful if activity matches a signature or rule ●  Small-ish data sets, while APTs unfold over months or years ●  Inflexible schema, difficult to extend with user-level attributes ●  A user is not the same as a device or IP address [Diagram: SIEM, with Log Collection, Log Analysis, Event Correlation, Log Forensics, Object Access Auditing, Alerts, Reports, Log Monitoring, Log Retention, File Integrity Monitoring]
  • 39. Advanced Analytics Reveal Hidden Patterns: reveal latent/invisible patterns that stand out from normal behavior ●  Reconnaissance ●  Privilege escalation attempts ●  Access attempts ●  Unusual data flows or exfiltration attempts
  • 40. Predictive Analytics for Understanding Behavior ●  Chaining models increases predictive power, decreases false positives ●  Training data: Kerberos authentication events (source, destination, account, type, success/failure, etc.); 10K users, 13K nodes; >110M events ●  Lateral movement ensemble model: regression analysis; constrained diameter authentication graph; robust rank aggregation ●  Model features: # of distinct destinations logged in to; # of distinct sources logged in from; # of distinct destination user accounts; # of distinct processes started by user ●  Outcomes: signals of credential takeover, running scripts, and other behavioral anomalies ●  https://content.pivotal.io/blog/insider-threat-detection-detecting-variance-in-user-behavior-using-an-ensemble-approach
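The model features listed on slide 40 are per-user distinct counts, which can be sketched in a few lines. The event schema below (`user`, `src`, `dst`, `dst_account`) is an assumption made for illustration, not the layout of the actual Kerberos logs, and the fourth feature (distinct processes started) is omitted because it would need process-start events that this sketch does not model.

```python
from collections import defaultdict


def user_features(events):
    """Count distinct destinations, sources, and destination accounts per user."""
    destinations = defaultdict(set)
    sources = defaultdict(set)
    accounts = defaultdict(set)
    for event in events:
        user = event["user"]
        destinations[user].add(event["dst"])
        sources[user].add(event["src"])
        accounts[user].add(event["dst_account"])
    return {
        user: {
            "distinct_destinations": len(destinations[user]),
            "distinct_sources": len(sources[user]),
            "distinct_dst_accounts": len(accounts[user]),
        }
        for user in destinations
    }


# Hypothetical authentication events.
events = [
    {"user": "alice", "src": "ws1", "dst": "srv1", "dst_account": "alice"},
    {"user": "alice", "src": "ws1", "dst": "srv2", "dst_account": "alice"},
    {"user": "alice", "src": "ws2", "dst": "srv2", "dst_account": "svc_backup"},
    {"user": "bob", "src": "ws3", "dst": "srv1", "dst_account": "bob"},
]
features = user_features(events)
```

In a database setting the same counts would typically be expressed as a grouped aggregate over the events table, so the data never has to leave the warehouse.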
  • 41. Revealing the Needle in the Haystack ●  Ensemble of methods reveals hidden patterns that signal problem behavior ●  A surge in access attempts at regular intervals indicates a possible background script ●  Regular surges in logins from unusual user accounts indicate possible account takeovers ●  Graph reveals two access attempts to a sensitive server indirectly via other servers
  • 42. Understanding Intent with Semantic Intelligence ●  Semantic intelligence can clarify ambiguous signals from other analytics or security appliances ●  Scan for interesting words, words in proximity, variations ●  Scan and index content: “Is this document leaving the network similar to other known sensitive documents?” ●  Scan document to suggest possible classification tags ●  ✔ “sorry, i sent this file by mistake”: benign, no further action ●  ! “please don’t share this file outside the organization”: investigate
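The "words in proximity, variations" scan on slide 42 can be sketched as a token-distance check. This is plain Python rather than the GPText API, and the function name, the prefix matching (a crude stand-in for stemming), and the `max_gap` threshold are all assumptions for illustration.

```python
import re


def words_within(text, first, second, max_gap=3):
    """Return True if words starting with `first` and `second` occur
    within `max_gap` tokens of each other, in either order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Prefix matching catches simple variations such as "shared" or "sharing".
    first_positions = [i for i, t in enumerate(tokens) if t.startswith(first)]
    second_positions = [i for i, t in enumerate(tokens) if t.startswith(second)]
    return any(
        0 < abs(i - j) <= max_gap
        for i in first_positions
        for j in second_positions
    )


benign = "sorry, i sent this file by mistake"
flagged = "please don't share this file outside the organization"
```

With these inputs, `words_within(flagged, "share", "outside")` matches while the same check on `benign` does not, mirroring the two outcomes on the slide. A rule this simple only surfaces candidates for investigation; it does not establish intent.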
  • 44. Digital Transformation is Real T-Mobile goes from 7 months and 72 steps to update software, to same day deployments. Liberty Mutual builds and deploys an MVP in one month and delivers revenue-generating version just months later. Comcast supports over 1500 developers with an operator team of 4 people. The Home Depot ships to production 1,500 times a month, and 17,000 times a month to all environments. Leading companies trust Pivotal as a transformation partner
  • 46. Data is the Key to Transformation ●  Consolidating data simplifies modeling and deployment ●  Executing massively parallel analytics in the database speeds complex use cases ●  Pivotal Greenplum integrates diverse data with machine learning at scale for faster value with less risk
  • 47. Start Your Data Transformation Journey Today! Pivotal Greenplum pivotal.io/pivotal-greenplum Pivotal Data Science pivotal.io/data-science Apache MADlib madlib.apache.org Greenplum Database Channel
  • 48. Data Tells the Story