Open Source Big Graph Analytics on Neo4j with Apache Spark

•

9 gostaram•4,115 visualizações

In this talk I will introduce you to a Docker container that provides an easy way to do distributed graph processing using Apache Spark GraphX and a Neo4j graph database. You’ll learn how to analyze big data graphs that are exported from Neo4j and consequently updated from the results of a Spark GraphX analysis. The types of analysis I will be talking about are PageRank, connected components, triangle counting, and community detection.

Software

Open Source Big Graph Analytics
on Neo4j with Apache Spark
Kenny Bastani (@kennybastani)

Speaker Introduction
§ Graph database enthusiast
§ Building microservice architectures
§ Lead developer at Digital Insight
§ Co-author of upcoming O’Reilly book — Spring Boot Essentials

The Problem
It's hard to analyze graphs at scale

The importance of graph algorithms
§ PageRank gave us Google
§ Friend of a friend gave us Facebook
§ Collaborative filtering makes Netflix recommendations awesome

Enemy #1:
Relational databases store data in ways that make it difficult to extract graphs for analysis

Enemy #2: 
If you still think Big Data is a buzz word
You haven’t had to feel the pain of failing at it.

When you hit a wall because your data is too big
You start to see what this big data thing is all about.

Distributed File Systems
Distributed file systems are a foundational component of big data analytics
Chops things into manageable sized blocks, usually 64MB
Spreads those blocks out across a cluster of VM resources

Hadoop MapReduce
Worth mentioning, Hadoop started this whole distributed MapReduce thing
You could translate the raw data from a CSV and turn it into a map of keys to values
Keys are distributed per node and used to reduce the values into a partitioned analysis

Graph algorithms can be evil at scale
It depends on the complexity of your graph
How many strongly connected components you have
But since some graph algorithms like PageRank are iterative
You have to iterate from one stage and use the results of the previous stage

It doesn't matter how many nodes you have in your cluster
For iterative graph algorithms, the complexity of the graph will make you or break you
Graphs with high complexity need a lot of memory to be processed iteratively

Neo4j Mazerunner Project
2-way Graph ETL

The basic idea is…
Graph databases need ETL so you can analyze your data and look it up later.

Docker
If you’re not up on Docker, let me give you a quick intro.

Docker
Docker is a VM framework that let’s you easily create a recipe for an image and deploy applications with ease.
The idea is that infrastructure and operational complexity makes it hard for agile development of new products.

Why?
If I am an engineer on a product team, I want to choose my own software libraries and languages to solve
problems.

Microservices and Docker
§ If want to build a new service, use whatever application framework you want. As long as you communicate over
REST.
§ Docker gives you the freedom to use Neo4j, or MySQL, or MongoDB or whatever application dependency you
want inside your container.

Docker Compose
Docker Compose allows you to run multi-container applications
It uses a single YAML file

Mais conteúdo relacionado

Mais procurados

GraphGen: Conducting Graph Analytics over Relational Databases

Konstantinos Xirogiannopoulos

JanusGraph: Looking Backward and Reaching Forward - by Jason Plurad (@pluradj): The JanusGraph project started at the Linux Foundation earlier this year, but it is not the new kid on the block. We'll start with a look at the origins and evolution of this open source graph database through the lens of a few IBM graph use cases. We'll discuss the new features in latest release of JanusGraph, and then take a look at future directions to explore together with the open community.

Janus graph lookingbackwardreachingforward

Demai Ni

Julia + R for Data Science

Work-Bench

Scaling Analysis Responsibly

Work-Bench

Malicious domains are one of the main resources used to mount attacks over the Internet. It is important to detect such activities by mining the large-scale network traffic data and identifying malicious URLs, domains or IPs. The attackers often take advantages of vulnerabilities in DNS and commit activities such as stealing private information, spamming, phishing, and DDoS attacks, and tend to by-pass botnet detection by generating domain clusters from Domain Generation Algorithms (DGA). We have billions of DNS records per day. Spark AI platform hence serves as an efficient distributed platform for the processing and mining of this huge amount of data. We work on the following two cybersecurity use cases. 1. Detect DGA, Porn, and Gambling domains Each malware-compromised host machine will have a large amount of DNS request in sequential order. The domain names are either generated by DGAs or preserve particular string patterns by design. We use spark to generate DNS request domain sequences and use Word2Vec to estimate the embedding of the domains. We then estimate the similarity and the most similar domains in the embedding space are discovered as the potential malicious domains. 2. Detect cryptocurrency mining pool domains The attackers are interested in accessing computing resources to mine cryptocurrency. The malware infected computers will be directed to attacker-controlled mining pool domains. This type of DNS request does not preserve sequential order and is relatively random. Since each mining pool domain cluster is visited by a wide range of different host machines, we used LSH to evaluate the similarity among sets of hosts. As a result, LSH generates domain-bucket bipartite graph and FastUnfolding algorithm is used to discover the domain clusters. We leveraged spark AI for large scale DNS data analysis and discovered hundreds of thousands of malicious domains each day at high precision. Authors: Ting Chen, Hao Guo

Large-Scale Malicious Domain Detection with Spark AI

Databricks

Big Data is changing abruptly, and where it is likely heading

Paco Nathan

There are different kinds of ways to query things. Relational (algebra), Search (engine), and Streaming (processing) are the common three. Beyond them, Graph Query is more complex and resource consuming. So that there is no commonly accepted system available for graph query currently - especially when you have petabytes of data. In this talk, we will share how Trend Micro built the in-house graph query system for petabytes of data. Even more, together with big-data, this system can also query existing REST-based services and live analysis system at the same time. This enables researchers in Trend Micro to get the latest intelligence for threat analysis and Machine Learning modeling.

[DataCon.TW 2019] Graph Query on Big-data, REST API, and Live Analysis Systems

Jeff Hung

It is a common believe that Hadoop should run on physical servers. However, this requires huge capital investment in the beginning while you have no guarantee for the returns. Therefore, things usually end up in proving big-data with not-that-big data. One approach to workaround this dilemma is to run Cloud Computing in the Cloud. With the elastic that AWS provides, you could spend little but run big!! However, is it really a good idea? In this sharing, we will try to answer it – based on the result from an 1-year journey with real application and real big-data.

Cloud Computing in the Cloud (Hadoop.tw Meetup @ 2015/11/23)

Jeff Hung

Continuum Analytics and Python

Travis Oliphant

How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...

Imply

Are we reaching a Data Science Singularity? How Cognitive Computing is emergi...

Big Data Spain

[Presentation by Skylar Lyon at DataWeek 2014, September 17 2014.] I recently faced the task of how to scale out an existing analytics process. The schedule was compressed - it always is in my world. The data was big - 400+ million rows waiting in database. What did I do? I offered my favorite type of solution - quick and dirty. At the outset, I wasn't sure how easy it would be. Nor was I certain of realized performance gains. But the concept seemed sound and the exercise fun. Let's move the compute to the data via Revolution R Enterprise for Teradata. This presentation outlines my approach in leveraging a colleague's R models as I experimented with running R in-database. Would my path lead to significant improvement? Could it be used to productionalize the workflow?

Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...

Revolution Analytics

http://www.oscon.com/open-source-2015/public/schedule/detail/41579 In this presentation, an open source developer community considers itself algorithmically. This shows how to surface data insights from the developer email forums for just about any Apache open source project. It leverages advanced techniques for natural language processing, machine learning, graph algorithms, time series analysis, etc. As an example, we use data from the Apache Spark email list archives to help understand its community better; however, the code can be applied to many other communities. Exsto is an open source project that demonstrates Apache Spark workflow examples for SQL-based ETL (Spark SQL), machine learning (MLlib), and graph algorithms (GraphX). It surfaces insights about developer communities from their email forums. Natural language processing services in Python (based on NLTK, TextBlob, WordNet, etc.), gets containerized and used to crawl and parse email archives. These produce JSON data sets, then we run machine learning on a Spark cluster to find out insights such as: * What are the trending topic summaries? * Who are the leaders in the community for various topics? * Who discusses most frequently with whom? This talk shows how to use cloud-based notebooks for organizing and running the analytics and visualizations. It reviews the background for how and why the graph analytics and machine learning algorithms generalize patterns within the data — based on open source implementations for two advanced approaches, Word2Vec and TextRank The talk also illustrates best practices for leveraging functional programming for big data.

Microservices, containers, and machine learning

Paco Nathan

What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline

Work-Bench

Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks

DataWorks Summit/Hadoop Summit

Building Reactive Real-time Data Pipeline

Trieu Nguyen

Recently, deep learning has delivered groundbreaking advances in many industries. In this presentation, Dr. Wang will share empirical experiences of applying deep learning to solving some specific security problems with real-world customer attack detection examples. He will also discuss the challenges and guidelines for successfully deploying deep learning, or general machine learning, in broader security. This session will feature two deep learning examples. The first example is a user-behavior anomaly detection solution using Convolutional Neural Network (CNN). Since CNN is most effective for image processing, Dr. Wang will introduce an innovative way to encode a user’s daily behavior into multi-channel images. He will also share the experimental comparison results of CNN hyperparameter tuning. The second example is a stateful user risk scoring system using Long Short Term Memory (LSTM). Most of the modern attacks happen in a multi-stage fashion, i.e., infection -> command & control -> lateral movement -> data infiltration -> data exfiltration. In this case, the company uses LSTM to monitor the temporal state transition of each user over these.“

Deep Learning in Security—An Empirical Example in User and Entity Behavior An...

Databricks

GluonNLP MXNet Meetup-Aug

Chenguang Wang

This paper describes the applications of deep learning-based image recognition in the DARPA Memex program and its repository of 1.4 million weapons-related images collected from the Deep web. We develop a fast, efficient, and easily deployable framework for integrating Google’s Tensorflow framework with Apache Tika for automatically performing image forensics on the Memex data. Our framework and its integration are evaluated qualitatively and quantitatively and our work suggests that automated, large-scale, and reliable image classification and forensics can be widely used and deployed in bulk analysis for answering domain-specific questions

Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]

Thamme Gowda

Mais procurados (19)

GraphGen: Conducting Graph Analytics over Relational Databases

Janus graph lookingbackwardreachingforward

Julia + R for Data Science

Scaling Analysis Responsibly

Large-Scale Malicious Domain Detection with Spark AI

Big Data is changing abruptly, and where it is likely heading

[DataCon.TW 2019] Graph Query on Big-data, REST API, and Live Analysis Systems

Cloud Computing in the Cloud (Hadoop.tw Meetup @ 2015/11/23)

Continuum Analytics and Python

How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...

Are we reaching a Data Science Singularity? How Cognitive Computing is emergi...

Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...

Microservices, containers, and machine learning

What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline

Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks

Building Reactive Real-time Data Pipeline

Deep Learning in Security—An Empirical Example in User and Entity Behavior An...

GluonNLP MXNet Meetup-Aug

Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]

Destaque

In this talk I will introduce you to a Docker container that provides you an easy way to do distributed graph processing using Apache Spark GraphX and a Neo4j graph database. You'll learn how to analyze big data graphs that are exported from Neo4j and consequently updated from the results of a Spark GraphX analysis. The types of analysis I will be talking about are PageRank, connected components, triangle counting, and community detection. Database technologies have evolved to be able to store big data, but are largely inflexible. For complex graph data models stored in a relational database there may be tedious transformations and shuffling around of data to perform large scale analysis. Fast and scalable analysis of big data has become a critical competitive advantage for companies. There are open source tools like Apache Hadoop and Apache Spark that are providing opportunities for companies to solve these big data problems in a scalable way. Platforms like these have become the foundation of the big data analysis movement. Speakers

Big Graph Analytics on Neo4j with Apache Spark

Kenny Bastani

Recent natural language processing advancements have propelled search engine and information retrieval innovations into the public spotlight. People want to be able to interact with their devices in a natural way. In this talk I will be introducing you to natural language search using a Neo4j graph database. I will show you how to interact with an abstract graph data structure using natural language and how this approach is key to future innovations in the way we interact with our devices.

Natural Language Processing with Neo4j

Kenny Bastani

Natural language search using Neo4j

Kenny Bastani

Meetup is a valuable source of data for understanding trends around products or brands. Meetup does not support an analytics package to track group statistics overtime unless you are an administrator of a group. There are no third-party tools or websites that analyze Meetup trends to understand how communities grow. In this talk I will present a graph-based analytics platform that uses the Meetup.com API to collect and analyze membership statistics over time. This talk will cover: How to poll and import periodic data from the Meetup.com API into Neo4j using Node.js. How to track meetup group growth over time using a Neo4j graph database using Node.js. How to apply tags to meetup groups and report combined growth of all groups over time. How to build an interactive documented analytics API to support applications using Node.js and Neo4j. How to build a business dashboard to visualize time-based statistics and reports using a Node.js based REST API that queries Neo4j.

Building a Graph-based Analytics Platform

Kenny Bastani

Graphs are a perfect solution to organize information and to determine the relatedness of content. Neo4j Developer Evangelist Kenny Bastani will discuss using Neo4j to perform document classification and text classification using a graph database. Kenny will demonstrate how to build a scalable architecture for classifying natural language text using a graph-based algorithm called Hierarchical Pattern Recognition. This approach encompasses a set of techniques familiar to Deep Learning practitioners.

Document Classification with Neo4j

Kenny Bastani

Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...

Martin Junghanns

Natural Language Processing with Graph Databases and Neo4j

William Lyon

Introduction to Graph Databases

Max De Marzi

Real time, streaming advanced analytics, approximations, and recommendations ...

DataWorks Summit/Hadoop Summit

Tax and fee rates

lovelycat1416

Perbedaan & persamaan linux & windows

Safrin_Sarifudin

Социальные права человека.

Natalia Semkova

Chủ đề 5: Khi doanh nghiệp thay đổi lập dự phòng và không lập dự phòng ảnh hưởng đến những thông tin trên các sổ kế toán nào (theo từng hình thức kế toán), các mã số nào trong từng báo cáo tài chính. Sự thay đổi đó có tác động gì đến giá cổ phiều trên thị trường chứng khoán. Minh họa cụ thể cho một doanh nghiệp.

Nhóm 5 chủ đề 5

lovelycat1416

Презентация для проведения урока безопасности работы детей в интернете. Цель презентации 6 Знакомство школьников с элементарными правилами поведения в информационном пространстве Обучение школьников безопасным способам пользования Интернетом, сотовой связью

Безопасность в интернете

Natalia Semkova

Hassuhn perth’s electronic career portfolio

Coleman1955

Nhóm 5 chủ đề 5

lovelycat1416

Slides on CSR and GRI

themaineventita

Tài chính tiền tệ nhóm 5 lớp 54ckt 3

lovelycat1416

Perbedaan & persamaan linux & windows

Safrin_Sarifudin

Quản trị học

lovelycat1416

Destaque (20)

Big Graph Analytics on Neo4j with Apache Spark

Natural Language Processing with Neo4j

Natural language search using Neo4j

Building a Graph-based Analytics Platform

Document Classification with Neo4j

Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...

Natural Language Processing with Graph Databases and Neo4j

Introduction to Graph Databases

Real time, streaming advanced analytics, approximations, and recommendations ...

Tax and fee rates

Perbedaan & persamaan linux & windows

Социальные права человека.

Nhóm 5 chủ đề 5

Безопасность в интернете

Hassuhn perth’s electronic career portfolio

Nhóm 5 chủ đề 5

Slides on CSR and GRI

Tài chính tiền tệ nhóm 5 lớp 54ckt 3

Perbedaan & persamaan linux & windows

Quản trị học

Semelhante a Open Source Big Graph Analytics on Neo4j with Apache Spark

Big Data Analytics (ML, DL, AI) hands-on

Dony Riyanto

Architecting Your First Big Data Implementation

Adaryl "Bob" Wakefield, MBA

Bringing Deep Learning into production

Paolo Platter

.NET per la Data Science e oltre

Marco Parenzan

Distributed Deep Learning on Hadoop Deep-learning is useful in detecting anomalies like fraud, spam and money laundering; identifying similarities to augment search and text analytics; predicting customer lifetime value and churn; recognizing faces and voices. Deeplearning4j is an infinitely scalable deep-learning architecture suitable for Hadoop and other big-data structures. It includes a distributed deep-learning framework and a normal deep-learning framework; i.e. it runs on a single thread as well. Training takes place in the cluster, which means it can process massive amounts of data. Nets are trained in parallel via iterative reduce, and they are equally compatible with Java, Scala and Clojure. The distributed deep-learning framework is made for data input and neural net training at scale, and its output should be highly accurate predictive models. The framework's neural nets include restricted Boltzmann machines, deep-belief networks, deep autoencoders, convolutional nets and recursive neural tensor networks.

Deeplearning on Hadoop @OSCON 2014

Adam Gibson

Hadoop

thisisnabin

Neurodb Engr245 2021 Lessons Learned

Stanford University

PPT5: Neuron Introduction

akira-ai

Deep Learning for Java Developer - Getting Started

Suyash Joshi

This talk was recorded in London on October 30th, 2018 and can be viewed here: https://youtu.be/18Pxvs50G-0 Bio: Mark Landry is a competition data scientist and product manager at H2O. He enjoys testing ideas in Kaggle competitions, where he is ranked in the top 100 in the world (top 0.03%) and well-trained in getting quick solutions to iterate over. Most at home in SQL, he found H2O through hacking in R. Interests are multi-model architectures and helping the world make fewer models that perform worse than the mean Linkedin: https://www.linkedin.com/in/mark-landry-78b863a/

Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...

Sri Ambati

Spark is an ETL and Data Processing engine especially suited for big data. Most of the time an organization has different teams working on different languages, frameworks and libraries, which needs to be integrated in the ETL Pipelines or for general data processing. For example, a Spark ETL job may be written in Scala by data engineering team, but there is a need to integrate a machine learning solution written in python/R developed by Data Science team. These kinds of solutions are not very straightforward to integrate with spark engine, and it required great amount of collaboration between different teams, hence increasing overall project time and cost. Furthermore, these solutions will keep on changing/upgrading with time using latest versions of the technologies and with improved design and implementation, especially in Machine Learning domain where ML models/algorithms keep on improving with new data and new approaches. And so there is significant downtime involved in integrating the these upgraded version.

Simplifying AI integration on Apache Spark

Databricks

Data Con LA 2020 Description Coming from a grand belief of data democratization, I believe that in order for any team to be successful collaborators, it has to be data centric and data should be accessible to all. *To ensure that your non software or software engineering centric team has maximum efficiency, data should be visible, data lake should be accessible. *Form a database for analytics summaries, talk about the different technologies(SQL, NoSQL) cost of deployment, need, team driven structure. Build an API for this database for external/inter team crosstalk. *Build analytics and visual layer on top of it. Flask/Django/Node, etc.., to enable the team to have high visibility in their analysis, and to ensure a higher turnaround of data. *Talk about an easy way of enabling the team to run code, could be local/cloud, JupyterHub is a great way of doing so, talk about the tremendous value added in that and the potential it enables *Talk about the common tools user for version control/CICD/Coding technologies, etc.. *Finally summarize the value of the mixture of all these tools and technologies in order to ensure the maximum efficiency. Speaker Nawar Khabbaz, Rivian, Data Engineer

Enabling Data centric Teams

Data Con LA

We know the scientific libraries of Python. Time to see what libraries every Java web application development company needs to know! Also known as Java Machine Learning. This library comes with the ability to develop calculative applications. These applications are capable of data processing, classification, and analysis. A Java development company will be able to master this library as machine learning is an expanding field.

Should You Choose Java or Python for Data Science?

Narola Infotech

Graph Data Science at Scale

Neo4j

At H2O.ai we see a world where all software will incorporate AI, and we’re focused on bringing AI to business through software. H2O.ai is the maker behind H2O, the leading open source machine and deep learning platform for smarter applications and data products. H2O operationalizes data science by developing and deploying algorithms and models for R, Python and the Sparkling Water API for Spark. In this webinar, you will learn about the scalable H2O core platform and the distributed algorithms it supports. H2O integrates seamlessly with the R and the Python environments. We will show you how to leverage the power of H2O algorithms in R, Python and H2O Flow interface. Come with an open mind and some high level knowledge of machine learning, and you will take away a stream of knowledge for your next ML/DL project. Amy Wang is a math hacker at H2O, as well as the Sales Engineering Lead. She graduated from Hunter College in NYC with a Masters in Applied Mathematics and Statistics with a heavy concentration on numerical analysis and financial mathematics. Her interest in applicable math eventually lead her to big data and finding the appropriate mediums for data analysis. Desmond is a Senior Director of Marketing at H2O.ai. In his 15+ years of career in Enterprise Software, Desmond worked in Distributed Systems, Storage, Virtualization, MPP databases, Streaming Analytics Platform, and most recently Machine Learning. He obtained his Master’s degree in Computer Science from Stanford University and MBA degree from UC Berkeley, Haas School of Business.

Start Getting Your Feet Wet in Open Source Machine and Deep Learning

Ian Gomez

Lambda architecture for real time big data

Trieu Nguyen

Serverless Toronto's 6th-anniversary event helps IT pros understand and prepare for the #GenAI tsunami ahead. You'll gain situational awareness of the LLM Landscape, receive condensed insights, and actionable advice about RAG in 2024 from Google AI Lead Mark Ryan and LlamaIndex creator Jerry Liu. We chose #RAG (Retrieval-Augmented Generation) because it is the predominant paradigm for building #LLM (Large Language Model) applications in enterprises today - and that's where the jobs will be shifting. Here is the recording: https://youtu.be/P5xd1ZjD-Os?si=iq8xibj5pJsJ62oW

All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...

Daniel Zivkovic

The combination of Deep Learning with Apache Spark has the potential for tremendous impact in many sectors of the industry. This webinar, based on the experience gained in assisting customers with the Databricks Virtual Analytics Platform, will present some best practices for building deep learning pipelines with Spark. Rather than comparing deep learning systems or specific optimizations, this webinar will focus on issues that are common to deep learning frameworks when running on a Spark cluster, including: * optimizing cluster setup; * configuring the cluster; * ingesting data; and * monitoring long-running jobs. We will demonstrate the techniques we cover using Google’s popular TensorFlow library. More specifically, we will cover typical issues users encounter when integrating deep learning libraries with Spark clusters. Clusters can be configured to avoid task conflicts on GPUs and to allow using multiple GPUs per worker. Setting up pipelines for efficient data ingest improves job throughput, and monitoring facilitates both the work of configuration and the stability of deep learning jobs.

Deep Learning on Apache® Spark™ : Workflows and Best Practices

Jen Aman

Deep Learning on Apache® Spark™: Workflows and Best Practices

Databricks

Deep Learning on Apache® Spark™: Workflows and Best Practices

Jen Aman

Semelhante a Open Source Big Graph Analytics on Neo4j with Apache Spark (20)

Big Data Analytics (ML, DL, AI) hands-on

Architecting Your First Big Data Implementation

Bringing Deep Learning into production

.NET per la Data Science e oltre

Deeplearning on Hadoop @OSCON 2014

Hadoop

Neurodb Engr245 2021 Lessons Learned

PPT5: Neuron Introduction

Deep Learning for Java Developer - Getting Started

Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...

Simplifying AI integration on Apache Spark

Enabling Data centric Teams

Should You Choose Java or Python for Data Science?

Graph Data Science at Scale

Start Getting Your Feet Wet in Open Source Machine and Deep Learning

Lambda architecture for real time big data

All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...

Deep Learning on Apache® Spark™ : Workflows and Best Practices

Deep Learning on Apache® Spark™: Workflows and Best Practices

Mais de Kenny Bastani

In the Eventual Consistency of Succeeding at Microservices

Kenny Bastani

Cloud-native architectures are an emerging practice of software development and delivery. This deck was presented at the Pivotal Cloud Native roadshow and teaches developers how to build modern cloud-native applications using the popular JVM-based application framework: Spring Boot. You'll be provided with a walk through from the monolith application architecture into the more modern microservices architecture. Two open source reference architectures are introduced for building cloud-native microservices. Learn the basics of cloud native platforms and also the approaches for integrating and strangling legacy systems. https://pivotal.io/event/pivotal-cloud-native-roadshow

Building Cloud Native Architectures with Spring

Kenny Bastani

Extending the Platform with Spring Boot and Cloud Foundry

Kenny Bastani

Back your app with MySQL and Redis on Cloud Foundry

Kenny Bastani

In this talk we will explore a sample microservice architecture that uses Spring Boot, Docker, and Neo4j to discover similar users on Twitter. We will dive into the architecture and talk about how the application uses Spring Cloud to add service discovery and an API gateway to help services communicate. Finally we will take a look at how to use Docker Compose to run the multi-container application, using Docker Hub distributions of Neo4j and Apache Spark for graph processing and ranking of Twitter users.

Using Docker, Neo4j, and Spring Cloud for Developing Microservices

Kenny Bastani

In this talk, Kenny Bastani will introduce you to Spring Cloud, a set of tools for building cloud-native JVM applications. We will take a look at some of the common patterns for microservice architectures and how to use Cloud Foundry to deploy multiple microservices to the cloud. We will also dive into a microservices example project of a cloud-native application built using Spring Boot and Spring Cloud. Using this example project, I'll show you how to use Cloud Foundry to spin up a microservice cluster. We will then explore what a cloud-native application looks like when using self-describing REST APIs that link multiple microservices together.

Cloud Native Java Microservices

Kenny Bastani

In this talk I will introduce you to Spring Cloud, a set of tools for building cloud-native JVM applications. We will take a look at some of the common patterns for microservice architectures and how to use Cloud Foundry to deploy multiple microservices to the cloud. We will also dive into a microservices example project of a cloud-native application built using Spring Boot and Spring Cloud. Using this example project, I'll show you how to use Lattice to spin up a microservice cluster on AWS. We will then explore what a cloud-native application looks like when using self-describing REST APIs that link multiple microservices together.

Building REST APIs with Spring Boot and Spring Cloud

Kenny Bastani

Neo4j Graph Data Modeling

Kenny Bastani

As companies like Facebook and Google have introduced us to Graph Search and the Knowledge Graph, developers are learning the benefits of graph database architectures. Graph databases, like Neo4j, have increased in popularity by nearly 250% from last year - the highest among all other DBMS categories, according to db-engines.com. Join Kenny Bastani as we look at the benefits of using a graph database, explore various use cases and walkthrough creating a movie recommendation app on Neo4j 2.0.

Building Killer Apps with Neo4j 2.0

Kenny Bastani

Mais de Kenny Bastani (9)

In the Eventual Consistency of Succeeding at Microservices

Building Cloud Native Architectures with Spring

Extending the Platform with Spring Boot and Cloud Foundry

Back your app with MySQL and Redis on Cloud Foundry

Using Docker, Neo4j, and Spring Cloud for Developing Microservices

Cloud Native Java Microservices

Building REST APIs with Spring Boot and Spring Cloud

Neo4j Graph Data Modeling

Building Killer Apps with Neo4j 2.0

Último

%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein

masabamasaba

%in Midrand+277-882-255-28 abortion pills for sale in midrand

masabamasaba

%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg

masabamasaba

%in kempton park+277-882-255-28 abortion pills for sale in kempton park

masabamasaba

Model Call Girl Services in Delhi reach out to us at 🔝 9953056974 🔝✔️✔️ Our agency presents a selection of young, charming call girls available for bookings at Oyo Hotels. Experience high-class escort services at pocket-friendly rates, with our female escorts exuding both beauty and a delightful personality, ready to meet your desires. Whether it's Housewives, College girls, Russian girls, Muslim girls, or any other preference, we offer a diverse range of options to cater to your tastes. We provide both in-call and out-call services for your convenience. Our in-call location in Delhi ensures cleanliness, hygiene, and 100% safety, while our out-call services offer doorstep delivery for added ease. We value your time and money, hence we kindly request pic collectors, time-passers, and bargain hunters to refrain from contacting us. Our services feature various packages at competitive rates: One shot: ₹2000/in-call, ₹5000/out-call Two shots with one girl: ₹3500/in-call, ₹6000/out-call Body to body massage with sex: ₹3000/in-call Full night for one person: ₹7000/in-call, ₹10000/out-call Full night for more than 1 person: Contact us at 🔝 9953056974 🔝. for details Operating 24/7, we serve various locations in Delhi, including Green Park, Lajpat Nagar, Saket, and Hauz Khas near metro stations. For premium call girl services in Delhi 🔝 9953056974 🔝. Thank you for considering us!

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE

9953056974 Low Rate Call Girls In Saket, Delhi NCR

Direct Style Effect Systems -The Print[A] Example- A Comprehension Aid

Philip Schwarz

Abortion Clinic And Abortion Pills For Sale in Oman MEDICAL ABORTION PILLS FOR SALE in Fujairah , Ajman, Al Ain Quality Abortion services provided by specialists. Women's health problems. A same day service, safe, legal & pain free Abortions. From 1 week to 5 months. We use tested and approved pills. It's 100% guaranteed & safe. No side effects at all. We also do free womb cleaning and free pills delivery. We sell pregnancy termination pills {ABORTION PILLS} Our service is ever private with all our clients. Call/WhatsApp:

%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...

masabamasaba

The title is not connected to what is inside

shinachiaurasa2

%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview

masabamasaba

Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...

SelfMade bd

Announcing Codolex 2.0 from GDK Software

Jim McKeeth

Test automation is a cornerstone of software development and quality assurance in today's rapidly evolving digital landscape. Its significance cannot be overstated. Businesses can enhance efficiency, productivity, and accelerate software delivery to market through automation, streamlining testing processes effectively. This comprehensive guide addresses the best practices for test automation in 2024. It offers a detailed checklist to empower you to optimize your automation efforts and maintain a competitive edge.

The Ultimate Test Automation Guide_ Best Practices and Tips.pdf

kalichargn70th171

We specialize in Psychic Readings, Psychic Love Spells, Binding Love Spells, Obsession Spells, Voodoo Spells, Lottery Spells, Marriage Spells, Black Magic Spells, Palm Readings & much more. Are you depressed? We perform this come-to-me love spell that works instantly with the aim of bringing back the victim to the person performing the magic. Have you lost your lover? We perform this come-to-me love spell that works instantly with the aim of bringing back the victim to the person performing the magic. Have you lost your lover? Do u need to solve any relationship problem? Contact the powerful spells caster chief kule with love spells that work overnight and love spells that really work. Have you found yourself infatuated with a special someone you think could be the one? Are you looking for a spell to provide them with a nudge in the right direction? Or maybe the spell you cast didn’t achieve the results you were hoping for? Whether you’re new or versed in the ways of spell casting, we’re here to help. Today we’re going to provide you with a detailed guide on the types of love spells to cast. Not only that but there’s something for those who wish to find outside advice from more advanced spell casters. We’re also going to provide you with the top sites available to help you with your dilemma. Let’s begin our journey by educating ourselves on love magic and what a real love caster looks like. Love Magic and Love Casters Love magic made its first appearance back in Ancient Egypt and has been an active practice since. This type of magic is a branch of traditional magic and can be practiced in various ways. Typically the more common use of love magic is through the work of spells, but other methods look like Charms Rituals-LOVE Potions-Dolls and even Amulets If you are interested in becoming a love caster, be prepared for what’s to come. A genuine love caster knows that the art of love casting is no easy feat and shouldn’t be done casually. You should know that not only does it require you to be gifted spiritually, but you must be ready to serve others. Someone who is considered a real love caster has experience in all manner of spells, no matter the difficulty. Training yourself in attraction, commitment, and marriage spells is an excellent place to start. But this by no means will make you a professional. Practice your craft and expand your knowledge; understand that you will possess the ability to help others in time truly. Types of Love Spells What better way to start broadening your experiences with love spells than by learning more about them? These spells work like just about any other spell. Simply apply your intention, use a medium (sigils, mantras, candles, or charm bags), and top it off with establishing the belief that you will receive what you want. So what kind of spells are available and which ones suit your needs the best? Let’s take a look at the many options you have at your disposal.

%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...

masabamasaba

In today's dynamic e-commerce landscape, the payment gateway emerges as a linchpin, ensuring smooth and secure transactions between buyers and sellers. In this discourse, we delve into the meticulous process of devising test cases tailored for scrutinizing payment gateways. Crafting precise test cases for payment gateways is a quintessential responsibility for testers operating within the service industry. This article meticulously explores pivotal scenarios integral to how to test payment gateways, coupled with essential guidelines for drafting effective test cases.

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf

kalichargn70th171

%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...

masabamasaba

%in tembisa+277-882-255-28 abortion pills for sale in tembisa

masabamasaba

Foundation models are machine learning models which are easily capable of performing variable tasks on large and huge datasets. FMs have managed to get a lot of attention due to this feature of handling large datasets. It can do text generation, video editing to protein folding and robotics. In case we believe that FMs can help the hospitals and patients in any way, we need to perform some important evaluations, tests to test these assumptions. In this review, we take a walk through Fms and their evaluation regimes assumed clinical value. To clarify on this topic, we reviewed no less than 80 clinical FMs built from the EMR data. We added all the models trained on structured and unstructured data. We are referring to this combination of structured and unstructured EMR data or clinical data.

Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...

harshavardhanraghave

Technology has taken up space all over the world. From generating content with a single command on ChatGPT to getting your food served by Robots at your favorite restaurant, artificial advancements have ruled every space. Every industry is set to develop top-notch technology in every sector; finance, IT, healthcare, gaming, and banking, with competitive market standards. One of these rapidly growing industries is Mobile App Development. According to the Straits Research report, it is expected to reach USD 583.03 billion at a CAGR OF 12.8% between (2022 and 2030). It clearly shows how mobile app development has become an integral part of the digital landscape and revolutionized technology.

The Top App Development Trends Shaping the Industry in 2024-25 .pdf

ayushiqss

Looking for an efficient way to manage your finances? Look no further than our money management app. With easy-to-use features, you can track your expenses, create budgets, and monitor your savings goals all in one place. Our app provides real-time updates on your spending habits and helps you make smarter financial decisions. Take control of your finances today with our user-friendly money management app.

Right Money Management App For Your Financial Goals

Jhone kinadey

%in ivory park+277-882-255-28 abortion pills for sale in ivory park

masabamasaba

Open Source Big Graph Analytics on Neo4j with Apache Spark

1. Open Source Big Graph Analytics on Neo4j with Apache Spark Kenny Bastani (@kennybastani)

2. Speaker Introduction § Graph database enthusiast § Building microservice architectures § Lead developer at Digital Insight § Co-author of upcoming O’Reilly book — Spring Boot Essentials

3. The Problem It's hard to analyze graphs at scale

4. The importance of graph algorithms § PageRank gave us Google § Friend of a friend gave us Facebook § Collaborative filtering makes Netflix recommendations awesome

5. Why is it so hard to do this stuff?

6. Enemy #1: Relational databases store data in ways that make it difficult to extract graphs for analysis

7. Enemy #2:  If you still think Big Data is a buzz word You haven’t had to feel the pain of failing at it.

8. When you hit a wall because your data is too big You start to see what this big data thing is all about.

9. Distributed File Systems Distributed file systems are a foundational component of big data analytics Chops things into manageable sized blocks, usually 64MB Spreads those blocks out across a cluster of VM resources

10. Hadoop MapReduce Worth mentioning, Hadoop started this whole distributed MapReduce thing You could translate the raw data from a CSV and turn it into a map of keys to values Keys are distributed per node and used to reduce the values into a partitioned analysis

11. Graph algorithms can be evil at scale It depends on the complexity of your graph How many strongly connected components you have But since some graph algorithms like PageRank are iterative You have to iterate from one stage and use the results of the previous stage

12. It doesn't matter how many nodes you have in your cluster For iterative graph algorithms, the complexity of the graph will make you or break you Graphs with high complexity need a lot of memory to be processed iteratively

13. Neo4j Mazerunner Project 2-way Graph ETL

14. What is Neo4j Mazerunner?

15. The basic idea is… Graph databases need ETL so you can analyze your data and look it up later.

16. Docker If you’re not up on Docker, let me give you a quick intro.

17. Docker Docker is a VM framework that let’s you easily create a recipe for an image and deploy applications with ease. The idea is that infrastructure and operational complexity makes it hard for agile development of new products.

18. Why? If I am an engineer on a product team, I want to choose my own software libraries and languages to solve problems.

19. Microservices and Docker § If want to build a new service, use whatever application framework you want. As long as you communicate over REST. § Docker gives you the freedom to use Neo4j, or MySQL, or MongoDB or whatever application dependency you want inside your container.

20.

21. Docker Compose Docker Compose allows you to run multi-container applications It uses a single YAML file

22.

23. PageRank

24. Distributed PageRank

25.

26. Questions? kennybastani.com

Open Source Big Graph Analytics on Neo4j with Apache Spark

Recomendados

Recomendados

Mais conteúdo relacionado

Mais procurados

Mais procurados (19)

Destaque

Destaque (20)

Semelhante a Open Source Big Graph Analytics on Neo4j with Apache Spark

Semelhante a Open Source Big Graph Analytics on Neo4j with Apache Spark (20)

Mais de Kenny Bastani

Mais de Kenny Bastani (9)

Último

Último (20)

Open Source Big Graph Analytics on Neo4j with Apache Spark