O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

Graph tour keynote 2019

A look at how Neo4j is transforming enterprise applications

  • Seja o primeiro a comentar

  • Seja a primeira pessoa a gostar disto

Graph tour keynote 2019

  1. 1. Graph Tour Washington DC #1 Database for Connected Data Jeff Morris Head of Product Marketing jeff@neo4j.com 5/7/19
  2. 2. I’m still listening to a lot of graph-y books Adjacent Possibilities Think in Maps Connecting with PeopleJPL Innovation Uniqueness of Individuals Practice, Practice, Practice Food Journey Space Journey Human Senses InnovationStartups
  3. 3. FATHER_OF DRIVE S My Graph
  4. 4. Agenda • Great Graph Stories are here in Washington DC • State of the Graph in 2019 • Innovation Waves • Looking ahead at Recommendations, AI and Graphs
  5. 5. Neo4j Is Helping The World To Make Sense of Data 5 ICIJ used Neo4j to uncover the world’s largest journalistic leak to date, The Panama Papers NASA uses Neo4j for a “Lessons Learned” database to improve effectiveness in search missions in space Neo4j is used to graph the human body, map correlations, identify cause & effect and search for the cure for cancer SAVING DEMOCRACY MISSION TO MARS CURING CANCER
  6. 6. In-Q-Tel’s Mission Economy 6 • Venture Capital sponsored by National Intelligence • Decomposes and reassembles technology stacks into common “genome” vocabulary • Matches mission problems to technology assemblies and vendors • Evaluates tech across communications, Bio tech, robotics, software, hardware, IoT • Faster evaluations, better innovations
  7. 7. ACCOUNT ADDRESS PERSON PERSON NAME STREET BANK NAME COMPANY BANK BAHAMAS 2.6 TB 11.5 million documents Emails, Scanned Documents, Bank Statements etc…
  8. 8. 2.6 TB 11.5 million documents Emails, Scanned Documents, Bank Statements etc… Person B Bank US Account 123 Person A Acme Inc Bank Bahama s Address XNODE RELATIONSHIP
  9. 9. ICIJ Pulitzer Price Winner 2017
  10. 10. Paradise Papers Metadata Model
  11. 11. Business Problem • Find relationships between people, corporations, accounts, shell companies and offshore accounts • Journalists are non-technical • 2017 Leak from Appleby tax sheltering law firm matched 13.4 million account records with public business registrations data from across Caribbean Solution and Benefits • Exposed tax sheltering practices of Apple, Nike • Revealed hidden connections among politicians and nations, like Wilbur Ross & Putin’s son in law • Triggered government tax evasion investigations in US, UK, Europe, India, Australia, Bermuda, Canada and Cayman Islands within 2 days. • Granted $1M endowment from Golden Globes’ HFPA Background • International Consortium of Investigative Journalists (ICIJ), Pulitzer Prize winning journalists • Fourth blockbuster investigation using Neo4j to reveal connections in text-based, and account-based data leaked from offshore law firms and government records about the “1% Elite” • Appends Neo4j-based, “Offshore Leaks Database” ICIJ Paradise Papers INVESTIGATIVE JOURNALISM Fraud Detection / Knowledge Graph16
  12. 12. Background • US IT consulting firm helped US Army streamline equipment deployments and maintenance spending • Saving lives by improving the operational readiness of Army equipment like tanks, radios, transports, aircraft, weaponry, etc. Business Problem • Needed to modernize procurement, budget and logistics processes for equipment & spare parts • Millions of connections among a tank’s bill-of- materials, for example • Improve “what if” cost calculations when planning missions and troop deployments • Mainframe systems required over 60 man-hrs to calculate changes… planning took too long. Solution and Benefits • 118M nodes & 185M relationships • Shed cost estimation times by 88% • Improved parts delivery timing and accuracy • DBA labor required dropped by 77% • Equipment TCO more predictable • Safer soldiers US Army / Calibre Systems Equipment Logistics Parts Assembly & Equipment Maintenance19
  13. 13. State of the graph in 2019
  14. 14. 2000+ 7/10 12/25 8/10 53K+ 100+ 300+ 450+ Adoption Top Retail Firms Top Financial Firms Top Software Vendors Customers Partners • Creator of the Neo4j Graph Platform • 250+ employees • HQ in Silicon Valley, other offices include London, Munich, Paris and Malmö Sweden • $80M Series E led by Morgan Stanley & One Peak. • $160M total raised to date • Over 20M+ downloads & container pulls • 300+ enterprise subscription customers with over half with >$1B in revenue Ecosystem Startup Program Alumni Enterprise customers Partners Meet up members Events per year Neo4j - The Graph Company 2 1 The Industry’s Largest Dedicated Investment in Graphs
  15. 15. Networks of People Business Processes Knowledge Networks E.g., Risk management, Supply chain, Payments E.g., Employees, Customers, Suppliers, Partners, Influencers E.g., Enterprise content, Domain specific content, eCommerce content Data connections are increasing as rapidly as data volumes The Rise of Connections in Data Electronic Networks On-prem & cloud computing, Cellular, Telco & Internet, IoT, Blockchain
  16. 16. CAR name: “Dan” born: May 29, 1970 twitter: “@dan” name: “Ann” born: Dec 5, 1975 since: Jan 10, 2011 brand: “Volvo” model: “V70” Latitude: 37.5629900° Longitude: -122.3255300° Nodes • Can have Labels to classify nodes • Labels have native indexes Relationships • Relate nodes by type and direction Properties • Attributes of Nodes & Relationships • Stored as Name/Value pairs • Can have indexes and composite indexes • Visibility security by user/role Neo4j Invented the Labeled Property Graph Model MARRIED TO LIVES WITH PERSON PERSON 23
  17. 17. 24 Graph Databases are Designed for Connected Data TRADITIONAL DATABASES BIG DATA TECHNOLOGY Store and retrieve data Aggregate and filter data Connections in data Real time storage & retrieval Real-Time Connected Insights Long running queries aggregation & filtering “Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code” Volker Pacher, Senior Developer Up to 3 Max # of hops 1 Millions
  18. 18. Internal & Confidential, Neo4j Inc. 25 Neo4j Graph Advantage: Foundational Components 1 2 3 4 5 6 Index-Free Adjacency In memory and on flash/disk vs ACID Foundation Required for safe writes Full-Stack Clustering Causal consistency Language, Drivers, Tooling Developer Experience, Graph Efficiency, Type Safety Graph Engine Cost-Based Optimizer, Graph Statistics, Cypher Runtime Hardware Optimizations For next-gen infrastructure
  19. 19. 26 Strongly Differentiated Commercial Offering Enterprise Edition is Highly Differentiated from Community Open Source Edition Date/Time data type1 ✔ ✔ 3D Geospatial data types1 ✔ ✔ Native String Indexes – up to 5x faster writes1 ✔ ✔ 100B+ Bulk Importer1 ✔ Resumable1 Enterprise Cypher Runtime up to 70% faster – ✔ Hot Backups – 2x Faster1 ACID Transactions ✔ ✔ High-performance native API ✔ ✔ High-performance caching ✔ ✔ Cost-based query optimizer ✔ ✔ Graph algorithms library to support AI initiatives ✔ ✔ Massively parallel graph algorithms – ✔ Query monitoring with enriched metrics – ✔ User and role-based security – ✔ LDAP and Active Directory Integration – ✔ Kerberos security option – ✔ Multi-Clustering (partition of clusters)1 – ✔ Automatic Cache Warming1 – ✔ Rolling Upgrades1 – ✔ Resumable Copy/ Restore Cluster Member – ✔ New diagnostic metrics and support tools1 – ✔ Property Blacklisting – ✔ Language drivers for Java, Python, C# & JavaScript ✔ ✔ Bolt Binary Protocol ✔ ✔ RPM, Debian, Docker, Azure & AWS Cloud Delivery ✔ ✔ Intra-cluster encryption secures all traffic across data centers and cloud zones – ✔ IPv6 support in clustered deployments – Available High throughput, least-connected load balancing built into Bolt drivers – ✔ Causal Clustering, core and read-replica design at global scale for applications, analytics workflows, HA and DR – ✔ Enterprise Lock Manager accesses all cores on server – ✔ Labeled property graph model ✔ ✔ Native graph processing & storage ✔ ✔ Cypher graph query language ✔ ✔ Neo4j Browser with syntax highlighting ✔ ✔ Fast writes via native label indexes ✔ ✔ Composite Indexes ✔ ✔ Cypher for Apache Spark (CAPS) for big data analytics ✔ ✔ Graph size limitations 34B nodes None Auto reuse of deleted space – ✔ Property existence constraints – ✔ Cypher query tracing, monitoring and metrics – ✔ Node Key schema constraints – ✔ Neo4j Desktop: Free developer-friendly package with full database and tools – ✔ CommunityDatabase Features Architectural Features Graph Platform Features 1New in Neo4j 3.4 Enterprise Community Enterprise Community Enterprise
  20. 20. Neo4j Graph Platform Vision 27 Development & Administration Analytics Tooling BUSINESS USERS DEVELOPERS ADMINS Graph Analytics Graph Transactions Data Integration Discovery & Visualization DATA ANALYSTS DATA SCIENTISTS Drivers & APIs APPLICATIONS AI openCypherCloud
  21. 21. Development & Administration Analytics Tooling Graph Analytics Graph Transactions Data Integration Discovery & VisualizationDrivers & APIs AI Neo4j Database 3.4 & 3.5 • Full Text Search • Native Indexes (up to 5x faster writes) • 100B+ bulk importer Improved Admin Experience • Rolling upgrades • 2x faster backups • Cache Warming on startup • Improved diagnostics Morpheus for Apache Spark • Graph analytics in the data lake • In-memory Spark graphs from Apache Hadoop, Hive, Gremlin and Spark • Save graphs into Neo4j • High-speed data exchange between Neo4j & data lake • Progressive analysis using named graphs Graph Data Science • High speed graph algorithms Neo4j Bloom • New graph illustration and communication tool for non- technical users • Explore and edit graph • Search-based • Create storyboards • Foundation for graph data discovery • Integrated with graph platform Multi-Cluster routing built into Bolt drivers • Date/Time data type • 3-D Geospatial search • Secure, Horizontal Multi-Clustering • Property-value Security The Neo4j Graph Platform
  22. 22. Graph Apps 30
  23. 23. Neo4j Bloom 31 • High fidelity • Scene navigation • Property views • Search suggestions • Saved phrase history • Property editor • Schema perspectives • Bloom chart type • Visualize • Communicate • Discover • Navigate • Isolate • Edit • Share
  24. 24. Neo4j Fabric Schema-Based Security Multi- DatabaseNeo4j 4.0 3 2 Reactive Drivers And more!!
  25. 25. Graphs Are VERY Hungry for Data Graphs’ appetite to connect more data accelerates the ability to find adjacent innovations Customer iteration cycles from 2 weeks to 3 months
  26. 26. Graph Database Surging in Popularity 34
  27. 27. 20M+ Downloads 8M+ from Neo4j Distribution 12M+ from Docker Events 400+ Approximate Number of Neo4j Events per Year 50k+ Meetups Number of Meetup Members Globally 50k+ Trained/certified Neo4j professionals 1k Certified Trained Developers Largest Pool of Graph Technologists
  28. 28. Density Drives Value In Graphs Metcalfe’s Law of the Network (V=n2) 5 hops < less Value 100’s of hops deliver immense VALUE
  29. 29. "Neo4j continues to dominate the graph database market.” “69% of enterprises have, or are planning to implement graphs over next 12 months” October, 2017 “The most widely stated reason in the survey for selecting Neo4j was to drive innovation” February, 2018 Critical Capabilities for DBMA “In fact, the rapid rise of Neo4j and other graph technologies may signal that data connectedness is indeed a separate paradigm from the model consolidation happening across the rest of the NoSQL landscape.” March, 2018 Analysts See Unique Benefits of Graphs "Neo4j is the clear market leader in the graph space. It has the most users, it uses a widely adopted language that is much easier than Gremlin and in many respects, it has consistently been a lot more innovative than its competitors.” “It is the Oracle or SQL Server of the graph database world.” March, 2019 "Our research suggests that graph databases have the best chance to survive and thrive as a distinct category (versus the other NoSQL models) because connected data applications present serious performance problems that only a specialized graph DB can solve.” March, 2019
  30. 30. Neo4j Has a Ten Year Head Start Native Connectedness Differentiates Neo4j Conceive Code Compute Store Non-Native Graph DBNative Graph DB RDBMS Optimized for graph workloads
  31. 31. Graph Database Vendor Landscape 3 9 NEO4J SIGNIFICANTLY OUTPACES COMPETITION IN GRAPH LEADERSHIP & INVESTMENT, TECHNOLOGY CAPABILITY, COMMUNITY BREADTH AND PRODUCT MATURITY Graph Pioneer & Leader Architectures optimized for non-graph workloads. Not easily adaptable for graphs. Lack “minutes to milliseconds” performance. Few graph-expert resources Nascent products fall vastly short. Graphs as a checkbox. Slow performance. Playing ‘catch-up,’ requiring years to stabilize & grow. Aggressive posture & claims to secure PR. Many fail the “kill -9” test Graph pioneer & visionary. Largest, most active community. More customer successes than all other vendors combined. Strongest technology. Diverse roadmap: cloud, DBaaS, Spark, Algos for AI, GQL.
  32. 32. 40 Real-Time Recommendations Fraud Detection Network & IT Operations Master Data Management Knowledge Graph Identity & Access Management Common Graph Technology Use Cases AirBnb
  33. 33. Highly Valuable Connected Data Use Cases Drive Enterprise Adoption 41 Real-Time Recommendations Fraud Detection Network & IT Operations Master Data Management Identity & Access Management Knowledge Graph
  34. 34. Background • Over 7M citizens suffer from Diabetes • Connecting over 400 researchers • Incorporates over 50 databases, 100k’s of Excel workbooks, 30 database of biological samples • Sought to examine disease from as many angles as possible. Business Problem • Genes are connected by proteins or to metabolites, and patients are connected with their diets, etc… • Needed to improve the utilization of immensely technical data • Needed to cater to doctors and researchers with simple navigation, communication and connections of the graph. Solution and Benefits • Dr. Alexander Jarasch, Head of Bioinformatics and Data Management • Scientists can conduct parallel research without asking the same questions or repeating tests • Built views like a liver sample knowledge graph DZD - German Center for Diabetes Research Medical Genomic Research43 EE Customer since 2016 Q4
  35. 35. Software Financial Services Telecom Retail & Consumer Goods Media & Entertainment Other Industries Airbus Over 300 Enterprises and 10s of Thousands of Projects on Neo4j
  36. 36. Background • Fortune 100 heavy equipment manufacturer • 27 Million warranty & service documents parsed • Foundation for AI-based supply chain management Business Problem • Improve maintenance predictability • Need a knowledge base for 27 million warranty documents and maintenance orders • Graphs gather context for AI to identify ‘prime examples’ of connections among parts, suppliers, customers and their mechanics anticipate when equipment will need servicing and by whom. Solution and Benefits • Text to knowledge graph • Common ontology for complaints, symptoms & parts • Anticipates when equipment will need servicing • Improves customer and brand satisfaction • Maximizes lifespan and value of equipment Caterpillar Heavy Equipment Manufacturing Parts Assembly & Equipment Maintenance45
  37. 37. 7 of the Top 10 Software Companies Use Neo4j
  38. 38. Background • Social network of 10M graphic artists • Peer-to-peer evaluation of art and works-in-progress • Job sourcing site for creatives • Massive, millions of updates (reads & writes) to Activity Feed • 150 Mongos to 48 Cassandras to 3 Neo4j’s! Business Problem • Artists subscribe, appreciate and curate “galleries” of works of their own and from other artists • Activities Feed is how everyone receives updates • 1st implementation was 150 MongoDB instances • 2nd implementation shrunk to 48 Cassandras, but it was still too slow and required heavy IT overhead Solution and Benefits • 3rd implementation shrunk to 3 Neo4j instances • Saved over $500k in annual AWS fees • Reduced data footprint from 50TB to 40GB • Significantly easier to introduce new features like, “New projects in you Network” Adobe Behance Social Network of 10M Graphic Artists Social Network47 EE Customer since 2016 Q4
  39. 39. 8 of the Top 10 Insurance Companies Use Neo4j
  40. 40. Home Security Internet of things Institutional Memory Entertainment Recommendations Home Operations Personalization Voice Enabled Smart Home
  41. 41. 51
  42. 42. Background • Largest Cable TV & Internet Provider in US • 3rd Largest network on the planet • xFi is consumer experience in 3M houses • Internet, router, devices, security, voice & telephony • Transformational customer experience Business Problem • Integrate all experience in a smart home • Create innovative ideas based on cross-platform and household member preferences • Add integrated value of xFinity triple play & quad- play services (internet, VoIP, cable TV & home security) Solution and Benefits • Custom content per household member • Security reminders (kids are home, garage left open) • Serves millions of households • Makes content recommendations based on occupant, time of day, permissions and preferences • Has Siri-like voice commands COMCAST Xfinity xFi TELECOMMUNICATIONS Smart Home / Internet of Things52 EE Customer since 2016 Q4
  43. 43. Analog to Digital Innovations
  44. 44. Common Graph Entities are Analog People Locations Processes Devices Objects Motives • Who – People • What – Activities & Events • Where – Locations • When – Time • Why – Motives & Feelings • How – Processes, Devices & Networks Activities
  45. 45. The Whiteboard Model Is the Physical Model 55 Ideation is an analog activity • Easily understood • Easily evolved • Easy collaboration between business and IT
  46. 46. Neo4j Innovation Lab 56
  47. 47. Innovation Lab Illustrated 57
  48. 48. Graphs Drive Innovation 58 Context Paths Auto-Graphs Graph Layers 1st Order Graph Cross-Connect Cross-tech applications Internet of Things operations Transparent Neural Networks Blockchain-managed systems Adjacent graph layers inspire new innovations Metadata / Risk Management Knowledge Graphs AI- Powered Customer Experiences Connect unlike objects such as people to products, locations Mobile app explosion Recommendation engines Fraud detectors Desire for more context to follow connections Extract properties during traversals Connects like objects People, computer networks, telco, etc
  49. 49. Cypher: Powerful and Expressive Query Language MATCH (:Person { name:“Dan”} ) -[:MARRIED_TO]-> (spouse) MARRIED_TO Dan Ann NODE RELATIONSHIP TYPE LABEL PROPERTY VARIABLE
  50. 50. 60 The GQL Manifesto: https://gql.today/ • Introduced in May 2018: https://gql.today/ • An initiative to immediately rally support for a unified Graph Query Language • Standards meetings are ongoing • All community members are encouraged to Vote their support at https://gql.today/#vote
  51. 51. Keith Hare, GraphConnect 2018 61
  52. 52. Data Sources CLIENT Admin Dashboard Session Data Feedback Scored Recommen- dations Graph Algorithms AI / ML Click Stream Data INTELLIGENT RECOMMENDATIONS FRAMEWORK Discovery Exclude Boost Diversity User Segmentation Item Similarity Intelligent Recommendations Framework Recommendation Engines 62
  53. 53. Graph Analytics Graph Algorithms Cypher for Apache Spark™ Graph-Enhanced AI & ML Similarity ML
  54. 54. Graph & ML Algorithms in Neo4j +35 neo4j.com/ graph- algorithms- book/ Pathfinding & Search Centrality / Importance Community Detection Link Prediction Finds optimal paths or evaluates route availability and quality Determines the importance of distinct nodes in the network Detects group clustering or partition options Evaluates how alike nodes are Estimates the likelihood of nodes forming a future relationship Similarity
  55. 55. Graph and ML Algorithms in Neo4j • Parallel Breadth First Search & DFS • Shortest Path • Single-Source Shortest Path • All Pairs Shortest Path • Minimum Spanning Tree • A* Shortest Path • Yen’s K Shortest Path • K-Spanning Tree (MST) • Random Walk • Degree Centrality • Closeness Centrality • CC Variations: Harmonic, Dangalchev, Wasserman & Faust • Betweenness Centrality • Approximate Betweenness Centrality • PageRank • Personalized PageRank • ArticleRank • Eigenvector Centrality • Triangle Count • Clustering Coefficients • Connected Components (Union Find) • Strongly Connected Components • Label Propagation • Louvain Modularity – 1 Step & Multi- Step • Balanced Triad (identification) • Euclidean Distance • Cosine Similarity • Jaccard Similarity • Overlap Similarity • Pearson Similarity Pathfinding & Search Centrality / Importance Community Detection Similarity neo4j.com/docs/ graph-algorithms/current/ Updated April 2019 Link Prediction • Adamic Adar • Common Neighbors • Preferential Attachment • Resource Allocations • Same Community • Total Neighbors
  56. 56. Graph Analytics: SparkCypher & Morpheus Objective: Draw new users from the Spark ecosystem to graphs & Neo4j (Also bolsters Cypher as the de-facto query language)
  57. 57. AI & Graphs
  59. 59. Graphs Provide Connections & Context for AI
  60. 60. Knowledge Graphs 70 GraphConnect speakers 2015-2017
  61. 61. Thomson Reuters Graph 71 • Data Fusion for Portfolio Managers • Graph layers
  62. 62. 1. Knowledge Graphs Context for Decisions 2. Connected Feature Extraction Context for Credibility 4. AI Explainability3. Graph- Accelerated AI Context for Efficiency Context for Accuracy Four Pillars of Graph-Enhanced AI
  63. 63. More Data Enables More Use Cases
  64. 64. Data Network Effect “A product, generally powered by machine learning, becomes smarter as it gets more data from your users. The more users use your product, the more data they contribute; the more data they contribute, the smarter your product becomes.” — Matt Turck
  65. 65. A Highly Connected Future
  66. 66. Your Homework - Connect
  67. 67. Enjoy the Conference!