SlideShare uma empresa Scribd logo
1 de 26
Baixar para ler offline
Open Source Big Graph Analytics
on Neo4j with Apache Spark
Kenny Bastani (@kennybastani)
Speaker Introduction
§ Graph database enthusiast
§ Building microservice architectures
§ Lead developer at Digital Insight
§ Co-author of upcoming O’Reilly book — Spring Boot Essentials
The Problem
It's hard to analyze graphs at scale
The importance of graph algorithms
§ PageRank gave us Google
§ Friend of a friend gave us Facebook
§ Collaborative filtering makes Netflix recommendations awesome
Why is it so hard to do this stuff?
Enemy #1:
Relational databases store data in ways that make it difficult to extract graphs for analysis
Enemy #2:

If you still think Big Data is a buzz word
You haven’t had to feel the pain of failing at it.
When you hit a wall because your data is too big
You start to see what this big data thing is all about.
Distributed File Systems
Distributed file systems are a foundational component of big data analytics
Chops things into manageable sized blocks, usually 64MB
Spreads those blocks out across a cluster of VM resources
Hadoop MapReduce
Worth mentioning, Hadoop started this whole distributed MapReduce thing
You could translate the raw data from a CSV and turn it into a map of keys to values
Keys are distributed per node and used to reduce the values into a partitioned analysis
Graph algorithms can be evil at scale
It depends on the complexity of your graph
How many strongly connected components you have
But since some graph algorithms like PageRank are iterative
You have to iterate from one stage and use the results of the previous stage
It doesn't matter how many nodes you have in your cluster
For iterative graph algorithms, the complexity of the graph will make you or break you
Graphs with high complexity need a lot of memory to be processed iteratively
Neo4j Mazerunner Project
2-way Graph ETL
What is Neo4j Mazerunner?
The basic idea is…
Graph databases need ETL so you can analyze your data and look it up later.
Docker
If you’re not up on Docker, let me give you a quick intro.
Docker
Docker is a VM framework that let’s you easily create a recipe for an image and deploy applications with ease.
The idea is that infrastructure and operational complexity makes it hard for agile development of new products.
Why?
If I am an engineer on a product team, I want to choose my own software libraries and languages to solve
problems.
Microservices and Docker
§ If want to build a new service, use whatever application framework you want. As long as you communicate over
REST.
§ Docker gives you the freedom to use Neo4j, or MySQL, or MongoDB or whatever application dependency you
want inside your container.
Docker Compose
Docker Compose allows you to run multi-container applications
It uses a single YAML file
PageRank
Distributed PageRank
Questions?
kennybastani.com

Mais conteúdo relacionado

Mais procurados

Large-Scale Malicious Domain Detection with Spark AI
Large-Scale Malicious Domain Detection with Spark AILarge-Scale Malicious Domain Detection with Spark AI
Large-Scale Malicious Domain Detection with Spark AI
Databricks
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
Paco Nathan
 

Mais procurados (19)

GraphGen: Conducting Graph Analytics over Relational Databases
GraphGen: Conducting Graph Analytics over Relational DatabasesGraphGen: Conducting Graph Analytics over Relational Databases
GraphGen: Conducting Graph Analytics over Relational Databases
 
Janus graph lookingbackwardreachingforward
Janus graph lookingbackwardreachingforwardJanus graph lookingbackwardreachingforward
Janus graph lookingbackwardreachingforward
 
Julia + R for Data Science
Julia + R for Data ScienceJulia + R for Data Science
Julia + R for Data Science
 
Scaling Analysis Responsibly
Scaling Analysis ResponsiblyScaling Analysis Responsibly
Scaling Analysis Responsibly
 
Large-Scale Malicious Domain Detection with Spark AI
Large-Scale Malicious Domain Detection with Spark AILarge-Scale Malicious Domain Detection with Spark AI
Large-Scale Malicious Domain Detection with Spark AI
 
Big Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingBig Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely heading
 
[DataCon.TW 2019] Graph Query on Big-data, REST API, and Live Analysis Systems
[DataCon.TW 2019] Graph Query on Big-data, REST API, and Live Analysis Systems[DataCon.TW 2019] Graph Query on Big-data, REST API, and Live Analysis Systems
[DataCon.TW 2019] Graph Query on Big-data, REST API, and Live Analysis Systems
 
Cloud Computing in the Cloud (Hadoop.tw Meetup @ 2015/11/23)
Cloud Computing in the Cloud (Hadoop.tw Meetup @ 2015/11/23)Cloud Computing in the Cloud (Hadoop.tw Meetup @ 2015/11/23)
Cloud Computing in the Cloud (Hadoop.tw Meetup @ 2015/11/23)
 
Continuum Analytics and Python
Continuum Analytics and PythonContinuum Analytics and Python
Continuum Analytics and Python
 
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
 
Are we reaching a Data Science Singularity? How Cognitive Computing is emergi...
Are we reaching a Data Science Singularity? How Cognitive Computing is emergi...Are we reaching a Data Science Singularity? How Cognitive Computing is emergi...
Are we reaching a Data Science Singularity? How Cognitive Computing is emergi...
 
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
 
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWhat We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
 
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
 
GluonNLP MXNet Meetup-Aug
GluonNLP MXNet Meetup-AugGluonNLP MXNet Meetup-Aug
GluonNLP MXNet Meetup-Aug
 
Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]
Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]
Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]
 

Destaque

Perbedaan & persamaan linux & windows
Perbedaan & persamaan  linux & windowsPerbedaan & persamaan  linux & windows
Perbedaan & persamaan linux & windows
Safrin_Sarifudin
 
Tài chính tiền tệ nhóm 5 lớp 54ckt 3
Tài chính tiền tệ nhóm 5 lớp 54ckt 3Tài chính tiền tệ nhóm 5 lớp 54ckt 3
Tài chính tiền tệ nhóm 5 lớp 54ckt 3
lovelycat1416
 
Perbedaan & persamaan linux & windows
Perbedaan & persamaan  linux & windowsPerbedaan & persamaan  linux & windows
Perbedaan & persamaan linux & windows
Safrin_Sarifudin
 

Destaque (20)

Big Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache SparkBig Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache Spark
 
Natural Language Processing with Neo4j
Natural Language Processing with Neo4jNatural Language Processing with Neo4j
Natural Language Processing with Neo4j
 
Natural language search using Neo4j
Natural language search using Neo4jNatural language search using Neo4j
Natural language search using Neo4j
 
Building a Graph-based Analytics Platform
Building a Graph-based Analytics PlatformBuilding a Graph-based Analytics Platform
Building a Graph-based Analytics Platform
 
Document Classification with Neo4j
Document Classification with Neo4jDocument Classification with Neo4j
Document Classification with Neo4j
 
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Real time, streaming advanced analytics, approximations, and recommendations ...
Real time, streaming advanced analytics, approximations, and recommendations ...Real time, streaming advanced analytics, approximations, and recommendations ...
Real time, streaming advanced analytics, approximations, and recommendations ...
 
Tax and fee rates
Tax and fee ratesTax and fee rates
Tax and fee rates
 
Perbedaan & persamaan linux & windows
Perbedaan & persamaan  linux & windowsPerbedaan & persamaan  linux & windows
Perbedaan & persamaan linux & windows
 
Социальные права человека.
Социальные права человека.  Социальные права человека.
Социальные права человека.
 
Nhóm 5 chủ đề 5
Nhóm 5   chủ đề 5Nhóm 5   chủ đề 5
Nhóm 5 chủ đề 5
 
Безопасность в интернете
Безопасность в интернетеБезопасность в интернете
Безопасность в интернете
 
Hassuhn perth’s electronic career portfolio
Hassuhn perth’s electronic career portfolioHassuhn perth’s electronic career portfolio
Hassuhn perth’s electronic career portfolio
 
Nhóm 5 chủ đề 5
Nhóm 5   chủ đề 5Nhóm 5   chủ đề 5
Nhóm 5 chủ đề 5
 
Slides on CSR and GRI
Slides on CSR and GRISlides on CSR and GRI
Slides on CSR and GRI
 
Tài chính tiền tệ nhóm 5 lớp 54ckt 3
Tài chính tiền tệ nhóm 5 lớp 54ckt 3Tài chính tiền tệ nhóm 5 lớp 54ckt 3
Tài chính tiền tệ nhóm 5 lớp 54ckt 3
 
Perbedaan & persamaan linux & windows
Perbedaan & persamaan  linux & windowsPerbedaan & persamaan  linux & windows
Perbedaan & persamaan linux & windows
 
Quản trị học
Quản trị họcQuản trị học
Quản trị học
 

Semelhante a Open Source Big Graph Analytics on Neo4j with Apache Spark

Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Ian Gomez
 

Semelhante a Open Source Big Graph Analytics on Neo4j with Apache Spark (20)

Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production
 
.NET per la Data Science e oltre
.NET per la Data Science e oltre.NET per la Data Science e oltre
.NET per la Data Science e oltre
 
Deeplearning on Hadoop @OSCON 2014
Deeplearning on Hadoop @OSCON 2014Deeplearning on Hadoop @OSCON 2014
Deeplearning on Hadoop @OSCON 2014
 
Hadoop
HadoopHadoop
Hadoop
 
Neurodb Engr245 2021 Lessons Learned
Neurodb Engr245 2021 Lessons LearnedNeurodb Engr245 2021 Lessons Learned
Neurodb Engr245 2021 Lessons Learned
 
PPT5: Neuron Introduction
PPT5: Neuron IntroductionPPT5: Neuron Introduction
PPT5: Neuron Introduction
 
Deep Learning for Java Developer - Getting Started
Deep Learning for Java Developer - Getting StartedDeep Learning for Java Developer - Getting Started
Deep Learning for Java Developer - Getting Started
 
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Enabling Data centric Teams
Enabling Data centric TeamsEnabling Data centric Teams
Enabling Data centric Teams
 
Should You Choose Java or Python for Data Science?
Should You Choose Java or Python for Data Science?Should You Choose Java or Python for Data Science?
Should You Choose Java or Python for Data Science?
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at Scale
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best Practices
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best Practices
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best Practices
 

Mais de Kenny Bastani

Mais de Kenny Bastani (9)

In the Eventual Consistency of Succeeding at Microservices
In the Eventual Consistency of Succeeding at MicroservicesIn the Eventual Consistency of Succeeding at Microservices
In the Eventual Consistency of Succeeding at Microservices
 
Building Cloud Native Architectures with Spring
Building Cloud Native Architectures with SpringBuilding Cloud Native Architectures with Spring
Building Cloud Native Architectures with Spring
 
Extending the Platform with Spring Boot and Cloud Foundry
Extending the Platform with Spring Boot and Cloud FoundryExtending the Platform with Spring Boot and Cloud Foundry
Extending the Platform with Spring Boot and Cloud Foundry
 
Back your app with MySQL and Redis on Cloud Foundry
Back your app with MySQL and Redis on Cloud FoundryBack your app with MySQL and Redis on Cloud Foundry
Back your app with MySQL and Redis on Cloud Foundry
 
Using Docker, Neo4j, and Spring Cloud for Developing Microservices
Using Docker, Neo4j, and Spring Cloud for Developing MicroservicesUsing Docker, Neo4j, and Spring Cloud for Developing Microservices
Using Docker, Neo4j, and Spring Cloud for Developing Microservices
 
Cloud Native Java Microservices
Cloud Native Java MicroservicesCloud Native Java Microservices
Cloud Native Java Microservices
 
Building REST APIs with Spring Boot and Spring Cloud
Building REST APIs with Spring Boot and Spring CloudBuilding REST APIs with Spring Boot and Spring Cloud
Building REST APIs with Spring Boot and Spring Cloud
 
Neo4j Graph Data Modeling
Neo4j Graph Data ModelingNeo4j Graph Data Modeling
Neo4j Graph Data Modeling
 
Building Killer Apps with Neo4j 2.0
Building Killer Apps with Neo4j 2.0Building Killer Apps with Neo4j 2.0
Building Killer Apps with Neo4j 2.0
 

Último

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 

Último (20)

%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 

Open Source Big Graph Analytics on Neo4j with Apache Spark

  • 1. Open Source Big Graph Analytics on Neo4j with Apache Spark Kenny Bastani (@kennybastani)
  • 2. Speaker Introduction § Graph database enthusiast § Building microservice architectures § Lead developer at Digital Insight § Co-author of upcoming O’Reilly book — Spring Boot Essentials
  • 3. The Problem It's hard to analyze graphs at scale
  • 4. The importance of graph algorithms § PageRank gave us Google § Friend of a friend gave us Facebook § Collaborative filtering makes Netflix recommendations awesome
  • 5. Why is it so hard to do this stuff?
  • 6. Enemy #1: Relational databases store data in ways that make it difficult to extract graphs for analysis
  • 7. Enemy #2:
 If you still think Big Data is a buzz word You haven’t had to feel the pain of failing at it.
  • 8. When you hit a wall because your data is too big You start to see what this big data thing is all about.
  • 9. Distributed File Systems Distributed file systems are a foundational component of big data analytics Chops things into manageable sized blocks, usually 64MB Spreads those blocks out across a cluster of VM resources
  • 10. Hadoop MapReduce Worth mentioning, Hadoop started this whole distributed MapReduce thing You could translate the raw data from a CSV and turn it into a map of keys to values Keys are distributed per node and used to reduce the values into a partitioned analysis
  • 11. Graph algorithms can be evil at scale It depends on the complexity of your graph How many strongly connected components you have But since some graph algorithms like PageRank are iterative You have to iterate from one stage and use the results of the previous stage
  • 12. It doesn't matter how many nodes you have in your cluster For iterative graph algorithms, the complexity of the graph will make you or break you Graphs with high complexity need a lot of memory to be processed iteratively
  • 14. What is Neo4j Mazerunner?
  • 15. The basic idea is… Graph databases need ETL so you can analyze your data and look it up later.
  • 16. Docker If you’re not up on Docker, let me give you a quick intro.
  • 17. Docker Docker is a VM framework that let’s you easily create a recipe for an image and deploy applications with ease. The idea is that infrastructure and operational complexity makes it hard for agile development of new products.
  • 18. Why? If I am an engineer on a product team, I want to choose my own software libraries and languages to solve problems.
  • 19. Microservices and Docker § If want to build a new service, use whatever application framework you want. As long as you communicate over REST. § Docker gives you the freedom to use Neo4j, or MySQL, or MongoDB or whatever application dependency you want inside your container.
  • 20.
  • 21. Docker Compose Docker Compose allows you to run multi-container applications It uses a single YAML file
  • 22.
  • 25.