Shape as Organizing Principle for Data
MLConf, SF 2014 
Anthony Bak, Principal Data Scientist
The Data Problem: Complexity
Solution: Topological Summaries
Shape as Organizing Principle
Reduce Bias, Discover Models 
We want to discover the underlying structure without bias.
TDA analyzes the data you have, not the data you want to have.
Generating Topological Summaries
Remember/Forget
• Different lenses provide different summaries.
• Use multiple lenses/metrics to get the complete picture.
Lenses: where do they come from?
• Statistics: Mean/Max/Min, Variance, n-Moment, Density, …
• Machine Learning: PCA/SVD, Autoencoders, Isomap/MDS/t-SNE, …
• Geometry: Centrality, Curvature, Harmonic Cycles, …
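Each lens family listed above can be read as a function that assigns one number to every data point. The sketch below is a minimal illustration with NumPy, not the speaker's implementation. The function name `lenses`, the bandwidth `h`, and the choice of "mean distance to all other points" as the centrality measure are assumptions made for illustration.

```python
import numpy as np

def lenses(X, h=1.0):
    """Return a few example lenses, each one value per row of X.

    X is an (n_points, n_features) array. All names and parameter
    choices here are illustrative assumptions.
    """
    # Statistical lenses: summary statistics of each row's features.
    row_mean = X.mean(axis=1)
    row_var = X.var(axis=1)

    # Pairwise Euclidean distances, used by the density and geometry lenses.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Density lens: Gaussian kernel density estimate (unnormalized),
    # with an assumed bandwidth h.
    density = np.exp(-(D ** 2) / (2.0 * h ** 2)).mean(axis=1)

    # Machine-learning lens: score on the first principal component,
    # computed from the SVD of the column-centered data.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    pca1 = Xc @ Vt[0]

    # Geometric lens: one possible centrality, the mean distance from
    # each point to all others (small values mean "central").
    centrality = D.mean(axis=1)

    return {
        "mean": row_mean,
        "variance": row_var,
        "density": density,
        "pca1": pca1,
        "centrality": centrality,
    }

# Hypothetical data: 100 points with 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
L = lenses(X)
```

Each entry of `L` is a length-100 vector, so any of them (or several together) can serve as the lens when building a summary.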
Why Topology?
Key Properties of TDA
• Deformation Invariance
• Compressed Representation
• Coordinate Freeness
Coordinate Invariance
1. The topology of a shape doesn’t depend on the coordinates used to describe the shape.
2. Different feature sets can describe the same phenomenon.
3. While processing data, we frequently alter coordinates: scaling, rotating, whitening.
You want to study properties of your data that are invariant under coordinate changes.
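A small check of the simplest case: the k-nearest-neighbor graph of a point cloud is unchanged when the coordinates are rotated (or reflected), uniformly scaled, and translated. This is a sketch under stated assumptions. The helper `knn_graph` and the data are hypothetical, and the check covers only similarity transforms. Per-feature scaling or whitening changes Euclidean distances, so it can change this particular graph.

```python
import numpy as np

def knn_graph(X, k=3):
    """Return, for each point, the sorted indices of its k nearest neighbors."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(D, axis=1)[:, :k]
    return np.sort(nearest, axis=1)  # compare neighbor sets, not their order

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))  # hypothetical data

# Random orthogonal matrix (rotation or reflection) from a QR factorization.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))

# New coordinates: uniform scaling, orthogonal transform, translation.
Y = 2.5 * X @ Q + np.array([1.0, -2.0, 0.5])

same = np.array_equal(knn_graph(X), knn_graph(Y))
```

Here `same` records whether the two coordinate systems yield the same neighbor structure.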
Coordinate Invariance: Gene Expression 
NKI 
GSE230
Deformation Invariance 
• Topological features don’t change when you stretch and distort the data.
Advantage: Makes problems easier
• Noise resistance
• Less pre-processing of data
• Robust (stable) data
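As a rough illustration of the point above, the number of connected components of a neighborhood graph (a simple topological feature) can survive a stretch, a bend, and a little noise. This is a sketch under assumptions: the scale `eps`, the deformation, and the toy data are chosen for illustration, and at a fixed scale a sufficiently violent distortion would change the count.

```python
import numpy as np

def count_components(X, eps):
    """Count connected components of the graph linking points within eps."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    seen = np.zeros(len(X), dtype=bool)
    count = 0
    for start in range(len(X)):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:  # depth-first search over eps-neighbors
            i = stack.pop()
            for j in np.nonzero((D[i] <= eps) & ~seen)[0]:
                seen[j] = True
                stack.append(j)
    return count

def deform(P, rng):
    """Bend along x, stretch along y, then add small bounded noise."""
    x, y = P[:, 0], P[:, 1]
    bent = np.column_stack([x + 0.3 * np.sin(y), 1.5 * y])
    return bent + rng.uniform(-0.02, 0.02, size=bent.shape)

# Toy data: two 5x5 grids (spacing 0.2) placed far apart.
g = np.arange(5) * 0.2
grid = np.array([(x, y) for x in g for y in g])
X = np.vstack([grid, grid + np.array([5.0, 0.0])])

Y = deform(X, np.random.default_rng(2))

before = count_components(X, eps=0.5)
after = count_components(Y, eps=0.5)
```

Both `before` and `after` count the two clusters, even though every coordinate has changed.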
Compressed Representation 
• Replace the metric space with a combinatorial summary: a simplicial complex.
• Data becomes easier to manage, search, and query while maintaining essential features.
• Leverages many known algorithms from graph theory, computational topology, computational geometry.
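One common way to build such a summary is a Mapper-style construction: cover the range of a lens with overlapping intervals, cluster the points that fall in each interval, and connect clusters that share points. The slides do not spell out the algorithm, so the following is a minimal sketch under that assumption. The function names, the single-linkage clustering at scale `eps`, and all parameter values are illustrative choices.

```python
import numpy as np

def connected_components(dist, eps):
    """Single-linkage clusters: points within eps of each other are joined."""
    n = dist.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.nonzero(dist[i] <= eps)[0]:
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def mapper_sketch(X, lens, n_intervals=6, overlap=0.3, eps=0.2):
    """Return (nodes, edges): nodes are sets of point indices, edges join
    nodes that share at least one point (the 1-skeleton of the nerve)."""
    lo, hi = lens.min(), lens.max()
    width = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Interval k, widened on both sides so neighbors overlap.
        a = lo + k * width - overlap * width
        b = lo + (k + 1) * width + overlap * width
        idx = np.nonzero((lens >= a) & (lens <= b))[0]
        if idx.size == 0:
            continue
        sub = X[idx]
        dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=-1)
        labels = connected_components(dist, eps)
        for c in range(labels.max() + 1):
            nodes.append(frozenset(idx[labels == c].tolist()))
    edges = {
        (i, j)
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if nodes[i] & nodes[j]
    }
    return nodes, edges

# Toy example: 200 points on a circle, with the x-coordinate as the lens.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
nodes, edges = mapper_sketch(circle, lens=circle[:, 0])

# For a connected graph, edges - nodes + 1 counts independent loops.
loops = len(edges) - len(nodes) + 1
```

In this toy case 200 points are compressed to a handful of nodes, and the resulting graph still contains the single loop of the circle.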
Baby Steps: PCA
PCA
Data Stories
Model Introspection
Predictive Maintenance
Customer Churn
Transaction Fraud
Data has shape, shape has meaning.
http://www.ayasdi.com/company/careers/
