Graph Theory at work 
doug.needham@ilwllc.com
• @dougneedham 
• Data Guy - Started as a DBA in the Marine Corps, 
evolved to Architect, now aspiring Data Scientist. 
• Oracle, SQL Server, Cassandra, Hadoop, MySQL. 
• I have a strong relational/traditional background. 
• Perpetual Student 
• Learning new things challenges our assumptions. 
Forces us to take a new perspective on “old” 
problems. Eventually maybe even shows us that 
there is a better way to solve a problem.
• Stand back, we are going to talk about math! 
• Basically we are talking about a bunch of dots joined together by 
lines 
• Vertex – Dot on a graph 
• Edge – Line connecting the two points 
• Triangle – 3 Vertices, 3 Edges 
• Square – 4 Vertices, 4 edges 
• Open Triangle - 3 Vertices, 2 edges 
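The shapes above can be written down as plain data. This is a minimal sketch, assuming undirected graphs; the helper name `make_graph` and the vertex labels are illustrative and not from the talk.

```python
def make_graph(edges):
    """Build (vertices, edges) for an undirected graph from 2-tuples."""
    # frozenset makes ("a", "b") and ("b", "a") the same edge.
    edge_set = {frozenset(e) for e in edges}
    vertices = set().union(*edge_set) if edge_set else set()
    return vertices, edge_set

triangle = make_graph([("a", "b"), ("b", "c"), ("c", "a")])
square = make_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
open_triangle = make_graph([("a", "b"), ("b", "c")])

for name, (v, e) in [("triangle", triangle),
                     ("square", square),
                     ("open triangle", open_triangle)]:
    print(f"{name}: {len(v)} vertices, {len(e)} edges")
```

The open triangle is the same three vertices as the triangle with one edge missing, which is the distinction that transitivity measures later on.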
• A lot of things are networks if you look at them the right way. 
• Mark Newman has given a number of really cool presentations 
on network analysis, available on YouTube. 
• https://www.youtube.com/watch?v=lETt7IcDWLI
• The 7 Bridges of Königsberg 
• Every tome on Graph theory or Network 
analysis devotes a small portion of its time 
to the 7 Bridges of Königsberg. 
• If I don’t cover this with you, the gods of 
mathematics will strike me down, and never 
allow me to do analysis again in the future.
• Folks enjoyed their Sunday afternoon strolls across the 
bridges, but occasionally people would wonder if one 
particular route was more efficient than another. 
• Eventually Leonhard Euler was brought into the debate 
about the efficiency problem. 
• Euler used vertices to represent the land masses and edges 
(or arcs, at the time) to represent bridges. He realized that 
because every land mass touches an odd number of bridges, 
no walk can cross each bridge exactly once, which made the 
problem unsolvable. 
• And here is the cool thing about mathematicians. If we tell 
you something is impossible, we have to tell you why in a 
way you can understand it. In explaining why, Euler also 
founded the branch of mathematics we now call Graph Theory. 
• http://en.wikipedia.org/wiki/Leonhard_Euler
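Euler's argument can be checked in a few lines. This is a rough sketch, assuming the classic form of the puzzle (cross every bridge exactly once); the land-mass labels are made up for illustration, and the function assumes the graph is connected.

```python
from collections import Counter

# Four land masses (labels are illustrative) joined by seven bridges.
BRIDGES = [
    ("north", "island"), ("north", "island"),
    ("south", "island"), ("south", "island"),
    ("north", "east"), ("south", "east"),
    ("island", "east"),
]

def degrees(edges):
    """Count how many edge ends touch each vertex."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def has_euler_walk(edges):
    """True if a connected multigraph has a walk using every edge once.

    Such a walk exists only when zero or two vertices have odd degree.
    Connectivity is assumed, not checked.
    """
    odd = [v for v, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)

bridge_degrees = degrees(BRIDGES)
print(dict(bridge_degrees))
print(has_euler_walk(BRIDGES))
```

All four vertices come out with odd degree, so the check fails, which is the whole proof in miniature.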
• http://gephi.github.io/ 
• From the website: “Gephi is an interactive visualization and 
exploration platform for all kinds of networks and complex 
systems, dynamic and hierarchical graphs.” 
• To get this yourself, go into Facebook and search for 
Netvizz. (You have to authorize it. You can de-authorize it 
later.) 
• Click the application. 
• Click “personal network” 
• Click Start 
• Download your gdf file 
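If you want to look at the gdf file outside Gephi, a small reader is enough. This is a sketch only, assuming the file follows the usual GDF layout of a `nodedef>` header row followed by node rows, then an `edgedef>` header row followed by edge rows. The sample data is invented, and a real export will likely carry more columns than shown here.

```python
import csv
import io

def parse_gdf(text):
    """Split GDF text into node rows and edge rows (lists of dicts)."""
    nodes, edges = [], []
    columns, target = [], None
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        first = row[0]
        if first.startswith("nodedef>") or first.startswith("edgedef>"):
            target = nodes if first.startswith("nodedef>") else edges
            row[0] = first.split(">", 1)[1]
            # Header cells look like "name VARCHAR"; keep only the name.
            columns = [cell.strip().split(" ")[0] for cell in row]
        elif target is not None:
            target.append(dict(zip(columns, row)))
    return nodes, edges

# Invented sample, not a real Netvizz export.
SAMPLE = """nodedef>name VARCHAR,label VARCHAR
1,Alice
2,Bob
3,Carol
edgedef>node1 VARCHAR,node2 VARCHAR
1,2
2,3
"""

nodes, edges = parse_gdf(SAMPLE)
print(len(nodes), "nodes,", len(edges), "edges")
```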
• Quick Demo:
• Shortest path – How are two vertices connected? 
• What is a path? 
• Centrality 
• Transitivity 
• Homophily 
• Directed Graphs – or Digraphs 
• Contagion – How do things “spread” through a network? 
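Three of the ideas above (shortest path, centrality, transitivity) are small enough to sketch by hand. The network below is hypothetical, the graph is treated as undirected, and degree centrality is only one of several centrality measures; Gephi computes these and more for you.

```python
from collections import deque
from itertools import combinations

# Hypothetical friendship network as an undirected adjacency map.
GRAPH = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"bob", "eve"},
    "eve": {"dan"},
}

def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

def degree_centrality(graph):
    """Fraction of the other vertices that each vertex is joined to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def transitivity(graph):
    """Closed triples divided by all connected triples."""
    closed = triples = 0
    for nbrs in graph.values():
        # Every pair of neighbors forms a triple centered on this vertex.
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in graph[a]:
                closed += 1
    return closed / triples if triples else 0.0

print(shortest_path(GRAPH, "ann", "eve"))
print(degree_centrality(GRAPH))
print(transitivity(GRAPH))
```

Transitivity here is the share of open triangles that have been closed: a friend of my friend is also my friend.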
• Let’s rearrange things, how does the layout affect 
understanding? 
• This is not just data visualization, it can also be used for 
prediction. https://www.youtube.com/watch?v=rwA-y-XwjuU
• GraphX requires Spark, which is not a bad deal. 
• Jump to Demo 
• http://ampcamp.berkeley.edu/big-data-mini-course/graph-analytics-with-graphx.html
• Giraph: I haven’t done as much with it as I 
wanted to. Perhaps a later presentation 
will have a more detailed example comparing 
GraphX with Giraph.
• I started doing some analysis some time ago 
using Graph models to understand metadata. 
• I came up with two types of Graphs: 
• Data Structure Graph Level 1 – This is roughly like 
an Entity Relationship Diagram (ERD): tables are 
vertices, foreign keys are edges. 
• Data Structure Graph Level 2 – Each Vertex in this 
graph is an application. Each Edge is data transfer. 
Roughly equivalent to what we used to call Data 
Flow diagrams.
• A DSG Level 1 can show you where you are 
going to have the most interesting query 
performance of your tables. 
• A DSG Level 2 can show you where the most 
work is going on in your Enterprise.
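A DSG Level 1 can be sketched from nothing more than a list of foreign keys. The schema below is hypothetical, and using raw degree as the signal for "interesting" tables is an assumption made for illustration, not necessarily the measure used in the original analysis.

```python
from collections import Counter

# Hypothetical schema: one (child_table, parent_table) pair per foreign key.
FOREIGN_KEYS = [
    ("orders", "customers"),
    ("order_items", "orders"),
    ("order_items", "products"),
    ("payments", "orders"),
    ("shipments", "orders"),
]

def table_degrees(foreign_keys):
    """Count how many foreign keys touch each table (its vertex degree)."""
    deg = Counter()
    for child, parent in foreign_keys:
        deg[child] += 1
        deg[parent] += 1
    return deg

fk_degrees = table_degrees(FOREIGN_KEYS)
for table, degree in fk_degrees.most_common():
    print(table, degree)
```

The table at the top of the list is the one most joins will pass through, so it is a reasonable first place to look when query performance gets interesting. The same counting works for a DSG Level 2 if the pairs are (source application, target application) for each data transfer.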
• Network/Graph Analysis is cool. 
• It can show you some interesting things about your data. 
• Some things to consider. 
• Some thought needs to be put into how the raw data is 
organized for a Graph Analysis. 
• Directed graph, undirected graph, or bigraph (bipartite graph)? 
Some up-front setup work needs to be done. 
• Tools help with the detailed calculations, and show the 
paths, walks, etc. 
• However, due thought should be put towards a network 
analysis project.
• http://blog.revolutionanalytics.com/2012/05/facebook-class-social-network-analysis-with-r-and-hadoop.html

Predicting Loan Approval: A Data Science Project
 

Gephi, GraphX, and Giraph

  • 1. Graph Theory at work doug.needham@ilwllc.com
  • 2. • @dougneedham • Data Guy - Started as a DBA in the Marine Corps, evolved to Architect, now aspiring Data Scientist. • Oracle, SQL Server, Cassandra, Hadoop, MySQL. • I have a strong relational/traditional background. • Perpetual Student • Learning new things challenges our assumptions. Forces us to take a new perspective on “old” problems. Eventually maybe even shows us that there is a better way to solve a problem.
  • 3. • Stand back, we are going to talk about math! • Basically we are talking about a bunch of dots joined together by lines • Vertex – Dot on a graph • Edge – Line connecting two vertices • Triangle – 3 vertices, 3 edges • Square – 4 vertices, 4 edges • Open Triangle – 3 vertices, 2 edges • A lot of things are networks if you look at them the right way. • Mark Newman has given a number of really cool presentations on network analysis, available on YouTube. • https://www.youtube.com/watch?v=lETt7IcDWLI
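The vocabulary on slide 3 maps directly onto a small data structure. As a minimal sketch in Python (the helper names `build_graph` and `edge_count` are illustrative and not from the talk), an undirected graph can be stored as a dictionary that maps each vertex to the set of its neighbours:

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) edge pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


def edge_count(graph):
    """Each undirected edge shows up in two neighbour sets, so halve the total."""
    return sum(len(neighbours) for neighbours in graph.values()) // 2


# The three shapes named on the slide, with made-up vertex labels.
triangle = build_graph([("a", "b"), ("b", "c"), ("c", "a")])
open_triangle = build_graph([("a", "b"), ("b", "c")])
square = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])

for name, g in [("triangle", triangle), ("open triangle", open_triangle), ("square", square)]:
    print(name, "vertices:", len(g), "edges:", edge_count(g))
```

A set-based adjacency map cannot hold parallel edges, which is fine for these shapes but would not be enough for a multigraph.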
  • 4.
  • 5. • The 7 Bridges of Königsberg • Every tome on Graph theory or Network analysis devotes a small portion of its time to the 7 Bridges of Königsberg. • If I don’t cover this with you, the gods of mathematics will strike me down, and never allow me to do analysis again in the future.
  • 6.
  • 7. • Folks enjoyed their Sunday afternoon strolls across the bridges, but occasionally people would wonder if one particular route was more efficient than another. • Eventually Leonhard Euler was brought into the debate about the efficiency problem. • Euler used vertices to represent the land masses and edges (or arcs, at the time) to represent bridges. He realized that because every land mass had an odd number of bridges (edges per vertex), the problem was unsolvable. • And here is the cool thing about mathematicians. If we tell you something is impossible, we have to tell you why in a way you can understand. In doing so, he also invented the branch of mathematics we today call Graph Theory. • http://en.wikipedia.org/wiki/Leonhard_Euler
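Euler's parity argument from slide 7 can be checked in a few lines. The sketch below is an illustration, not something shown in the talk: the labels A to D are made up (A is the central island, B and C the two river banks, D the eastern land mass), and the check assumes the graph is connected. A connected graph has a walk that uses every edge exactly once only if zero or two vertices have an odd number of edges.

```python
from collections import Counter

# The seven bridges as a multigraph; repeated pairs are parallel bridges.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def degrees(edges):
    """Count how many edge ends touch each vertex."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def has_euler_walk(edges):
    """Parity test only: assumes the graph is connected."""
    odd = [v for v, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_euler_walk(BRIDGES))
```

All four land masses come out with an odd degree, so the test fails for Königsberg, while a plain triangle (every vertex has degree 2) passes.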
  • 8. • http://gephi.github.io/ • From the website: “Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs.” • To get this yourself go into Facebook and search for: Netvizz. (You have to authorize it. You can de-authorize it later.) • Click the application. • Click “personal network” • Click Start • Download your gdf file • Quick Demo:
  • 9. • Shortest path – How are two vertices connected? • What is a path? • Centrality • Transitivity • Homophily • Directed Graphs – or Digraphs • Contagion – How do things “spread” through a network? • Let’s rearrange things: how does the layout affect understanding? • This is not just data visualization, it can also be used for prediction. • https://www.youtube.com/watch?v=rwA-y-XwjuU
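To make the "shortest path" question from slide 9 concrete, here is a minimal breadth-first search sketch in plain Python. It is not the algorithm Gephi itself uses, and the `friends` network and the function name `shortest_path` are hypothetical. For an unweighted graph, breadth-first search finds a path with the fewest edges between two vertices.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path (fewest edges) as a list of vertices, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # vertex -> vertex we reached it from
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour not in previous:
                previous[neighbour] = vertex
                if neighbour == goal:
                    # Walk the back-pointers from goal to start, then reverse.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None  # goal is not reachable from start


# Hypothetical friendship network; "zed" has no connections.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann"},
    "dan": {"bob", "eve"},
    "eve": {"dan"},
    "zed": set(),
}

print(shortest_path(friends, "ann", "eve"))
print(shortest_path(friends, "ann", "zed"))
```

The same back-pointer map also answers "what is a path?": any chain of vertices in which each consecutive pair is joined by an edge.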
  • 10. • Requires Spark, which is not a bad deal. • Jump to Demo • http://ampcamp.berkeley.edu/big-data-mini-course/graph-analytics-with-graphx.html
  • 11. • I haven’t done as much with Giraph as I wanted to. Perhaps a later presentation will include a more detailed example comparing GraphX with Giraph.
  • 12. • Some time ago I started using Graph models to understand metadata. • I came up with two types of Graphs: • Data Structure Graph Level 1 – This is roughly like an Entity Relationship Diagram (ERD): tables are vertices, foreign keys are edges. • Data Structure Graph Level 2 – Each vertex in this graph is an application. Each edge is a data transfer. Roughly equivalent to what we used to call Data Flow diagrams.
  • 13. • A DSG Level 1 can show you which of your tables are likely to have the most interesting query performance. • A DSG Level 2 can show you where the most work is going on in your Enterprise.
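A DSG Level 1 as described on slides 12 and 13 can be sketched from nothing more than a list of foreign keys. Everything specific below is an assumption for illustration: the schema is hypothetical, and counting edges per table (degree) is only one rough proxy for where "interesting" query behaviour might show up; the talk does not say which measure was used.

```python
from collections import Counter

# Hypothetical foreign keys: (referencing table, referenced table).
FOREIGN_KEYS = [
    ("orders", "customers"),
    ("order_items", "orders"),
    ("order_items", "products"),
    ("payments", "orders"),
    ("shipments", "orders"),
]


def table_degrees(foreign_keys):
    """Tables are vertices, foreign keys are edges; count edges per table."""
    deg = Counter()
    for child, parent in foreign_keys:
        deg[child] += 1
        deg[parent] += 1
    return deg


def most_connected(foreign_keys, n=3):
    """Return the n tables that take part in the most foreign keys."""
    return table_degrees(foreign_keys).most_common(n)


print(most_connected(FOREIGN_KEYS, 1))
```

In a real database the edge list would come from the catalog (for example the foreign key metadata views), and a DSG Level 2 could reuse the same counting with applications as vertices and data transfers as edges.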
  • 14. • Network/Graph Analysis is cool. • It can show you some interesting things about your data. • Some things to consider: • Some thought needs to be put into how the raw data is organized for a Graph Analysis. • Directed graph, undirected, bigraph? Some up-front setup work needs to be done. • Tools help with the detailed calculations, and show the paths, walks, etc. • However, due thought should be put into a network analysis project before reaching for the tools.