Should a Graph Database Be in Your Next Data Warehouse Stack?

In this webinar, AnzoGraph’s graph database guru Barry Zane (former co-founder of Netezza) and data governance author Steve Sarsfield talk about how graph databases fit into the data warehouse modernization trend. They also explore how certain workloads can be better served with an analytical graph database and how today’s technology stacks offer new paradigms for deployment like the cloud, containers and graph analytics.

Published in: Data & Analytics

  1. Should a Graph Database Be in Your Next Data Warehouse Stack?
     Presenters: Barry Zane, VP Engineering; Steve Sarsfield, VP Product
  2. Relationship Between Anzo & AnzoGraph
     ©2018 Cambridge Semantics Inc. All rights reserved.
     Anzo - Managed Data Fabric
     • Connect and blend enterprise data
     • Ingestion & mapping
     • Rich data management, harmonization, provenance
     • Uses AnzoGraph as an engine
     AnzoGraph - Data Mart in your warehouse stack (our topic today)
     • Embed a graph database in your application or write queries directly
     • Use with your own or third-party ETL and visualization tools
     • 1001 other uses
  3. Graph’s time has come.
     “Graph analytics will grow in the next few years due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries”
     - Gartner, Top 10 Data and Analytics Technology Trends for 2019
  4. What is Driving People to Graph
     Additional powerful graph analysis, not present in SQL
     Graph algorithms and capabilities:
     • PageRank
     • Shortest Path
     • All Paths
     • Label Propagation
     • Weakly Connected Components
     • K neighborhood
     • Counting Triangles
     • Inferences (RDFS+)
     • Advanced Grouping Sets
     • Labeled Property Graphs (RDF*)
     Why you should care: Use Cases
     • R&D acceleration (pharma)
     • Fraud detection
     • Recommendation engines
     • Network and IT operations
     • Search
     • Master data management
     • Machine learning
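To make one of the algorithms listed on slide 4 concrete, here is a minimal Python sketch of triangle counting. The edge list and the function name `count_triangles` are invented for illustration, and the brute-force approach is only meant to show what the algorithm computes, not how a parallel graph engine would run it.

```python
from itertools import combinations

# Hypothetical undirected edges, invented for this sketch.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

def count_triangles(edges):
    """Count sets of three nodes that are all connected to each other."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Brute force: test every combination of three nodes.
    return sum(
        1
        for a, b, c in combinations(sorted(neighbors), 3)
        if b in neighbors[a] and c in neighbors[a] and c in neighbors[b]
    )

print(count_triangles(EDGES))
```

In this toy graph only A, B and C form a closed triangle; D hangs off C and does not contribute.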
  5. …but with Standard Analytics Present
     All of the standard analytics are still available:
     • BI-style analytics
     • Aggregates
     • Sub-queries
     • Views
     • User-defined extensions
     Query clauses:
     • FROM: multiple graphs or default graph
     • WHERE: joins/traversals
     • GROUP BY: Sum/Count/Min/Max/Avg/Std/Var, GroupConcat/Sample
     • ORDER BY: Offset & Limit
     • HAVING
     • Built-in functions: 80+ functions on strings, numbers, dates, times
     Why you should care:
     • You aren’t losing any analytical capabilities
  6. Analytic Systems Complement Transactional Systems
     Analytics (OLAP)
     • Aggregations and multi-way joining across broad swaths of the data
     • Inserts are primarily bulk loads
     • Query: “Tell me about population trends”
     • Redshift: handles inserts and fetches, just slowly
     Transactions (OLTP)
     • Fetches and relations between a small set of subjects
     • Inserts are primarily single-subject
     • Query: “Tell me about Barry”
     • MySQL: can do analytical workloads, just slowly
     The bigger the data, the more this difference is noticed.
  7. Table Stakes for Modern Analytical Systems
     Massively Parallel Processing (MPP) - Scalable BI Analytics
     • Scalable to hundreds of compute servers - trillions of nodes/edges
     Standards Driven
     • Relational: ANSI SQL
     • AnzoGraph: W3C SPARQL, openCypher
     Automatic Analytics Optimization
     • Compiled queries: automatic, invisible code generator
     • First query of a given shape is slower
     • Subsequent queries of that shape run at “physics speed”
     Simple Deployment
     • No customer-defined indexes
     • Load and go
     • Cloud, containers, on-premises
     Analytics Speed, Simplicity, Lower Costs
  8. What is Graph? A Wonderful Way To Think About Data
  9. Graph Nodes and Edges - Simple, but Rich
     • More like human processing of the world.
     • Graphs do not require a pre-created schema.
       – New properties are immediately usable
     • Possible in a SQL schema, but you need to make schema choices. Possibly:
       – Person table [name, birthday…]
       – Places table [height…]
       – friend join table [person1, person2]
       – wentUp join table [person, place]
       – has join table [place, property]
     Diagram nodes:
     • Jack: type <Boy>, birthday 09/17/1975
     • Jill: type <Girl>
     • TheHill: type <Place>, height 1500 feet, has Grass, has Trees, partOf <TheMountain>
  10. EVERYTHING is expressed as atomic “Triple Statements” (RDF)
      Diagram nodes:
      • Jack: type <Boy>, birthday 09/17/1975
      • Jill: type <Girl>
      • TheHill: type <Place>, height 1500 feet, has Grass, has Trees, partOf <TheMountain>

      Subject | Predicate | Object        | (Hint)
      --------+-----------+---------------+----------
      Jack    | type      | <Boy>         | label
      Jack    | birthday  | 09/17/1975    | property
      Jack    | friend    | <Jill>        | edge
      Jack    | wentUp    | <TheHill>     | edge
      Jill    | type      | <Girl>        | label
      Jill    | wentUp    | <TheHill>     | edge
      TheHill | type      | <Place>       | label
      TheHill | has       | Water         | property
      TheHill | has       | Trees         | property
      TheHill | partOf    | <TheMountain> | edge
      TheHill | height    | 1500          | property
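The triple table on slide 10 maps naturally onto a list of three-element tuples. The Python sketch below is an illustration only: it assumes that objects written in angle brackets are nodes and everything else is a literal value, and the function name `hint` is invented here to mirror the slide's "(Hint)" column.

```python
# Each statement is one (subject, predicate, object) tuple, as on the slide.
# Assumption for this sketch: objects in angle brackets are nodes,
# anything else is a literal value.
TRIPLES = [
    ("Jack", "type", "<Boy>"),
    ("Jack", "birthday", "09/17/1975"),
    ("Jack", "friend", "<Jill>"),
    ("Jack", "wentUp", "<TheHill>"),
    ("Jill", "type", "<Girl>"),
    ("Jill", "wentUp", "<TheHill>"),
    ("TheHill", "type", "<Place>"),
    ("TheHill", "partOf", "<TheMountain>"),
    ("TheHill", "height", 1500),
]

def hint(triple):
    """Classify a triple as a label, an edge, or a property."""
    _, predicate, obj = triple
    if predicate == "type":
        return "label"
    if isinstance(obj, str) and obj.startswith("<"):
        return "edge"
    return "property"

for triple in TRIPLES:
    print(triple, "->", hint(triple))
```

Labels, properties and edges all share one shape, so adding a new kind of fact means appending a tuple rather than altering a schema.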
  11. Let’s Add Triples for Inferences - Semantic Enrichment
      Diagram nodes (with inferred facts):
      • Jack: type <Boy>, birthday 09/17/1975, type <Man>, type <Person>, gender Male
      • Jill: type <Girl>, type <Woman>, type <Person>, gender Female
      • TheHill: type <Place>, height 1500 feet, has Grass, has Trees, partOf <TheMountain>

      Subject | Predicate | Object        | (Hint)
      --------+-----------+---------------+----------
      Jack    | type      | <Boy>         | label
      Jack    | birthday  | 09/17/1975    | property
      Jack    | friend    | <Jill>        | edge
      Jack    | wentUp    | <TheHill>     | edge
      Jill    | type      | <Girl>        | label
      Jill    | wentUp    | <TheHill>     | edge
      TheHill | type      | <Place>       | label
      TheHill | has       | Water         | property
      TheHill | has       | Trees         | property
      TheHill | partOf    | <TheMountain> | edge
      TheHill | height    | 1500          | property
      Boy     | subClass  | <Man>         |
      Man     | gender    | Male          |
      Man     | subClass  | <Person>      |
      Girl    | subClass  | <Woman>       |
      Woman   | gender    | Female        |
      Woman   | subClass  | <Person>      |
      friend  | property  | Reflexive     |
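The enrichment on slide 11 can be approximated in a few lines of Python. This is a simplified sketch, not an RDFS+ implementation and not how AnzoGraph performs inference: the dictionaries `SUBCLASS` and `CLASS_FACTS` and the function `infer` are names invented here, and the sketch handles only the subclass chain and the class-level gender facts shown on the slide.

```python
# Ontology triples from the slide, held in two lookup tables for this sketch.
SUBCLASS = {"Boy": "Man", "Man": "Person", "Girl": "Woman", "Woman": "Person"}
CLASS_FACTS = {"Man": [("gender", "Male")], "Woman": [("gender", "Female")]}

BASE = [
    ("Jack", "type", "Boy"),
    ("Jill", "type", "Girl"),
    ("TheHill", "type", "Place"),
]

def infer(triples):
    """Return the input triples plus every fact implied by the subclass chain."""
    result = set(triples)
    for subject, predicate, cls in triples:
        if predicate != "type":
            continue
        # Walk up the hierarchy: Boy -> Man -> Person.
        while cls in SUBCLASS:
            cls = SUBCLASS[cls]
            result.add((subject, "type", cls))
            for prop, value in CLASS_FACTS.get(cls, []):
                result.add((subject, prop, value))
    return result

for triple in sorted(infer(BASE)):
    print(triple)
```

The inferred facts are ordinary triples, so queries cannot tell them apart from loaded data. That is why the type query on slide 17 returns Man, Woman and Person as well as Boy and Girl.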
  12. Let’s Add Edge Properties (LPG, RDF*)
      • Edge properties have been the primary historical difference between RDF and LPG graph engines.
      • RDF* is a slight extension of RDF to reference and store these properties.
      • While not yet a full W3C standard, RDF* is supported by multiple vendors and is on the standards track.
      • AnzoGraph fully supports the RDF*/LPG model.
      • Essentially, edge properties are triples about other triples.
      Diagram nodes:
      • Jack: type <Boy>, birthday 09/17/1975
      • Jill: type <Girl>
      • TheHill: type <Place>, height 1500 feet, has Grass, has Trees, partOf <TheMountain>
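Slide 12's closing point, that edge properties are triples about other triples, can be shown in Python by letting a triple appear as the subject of another triple. The `purpose` property and the helper `edge_properties` are invented for this sketch; the slide's own edge properties did not survive in the transcript, and this is not RDF* syntax.

```python
# The edge we want to annotate.
WENT_UP = ("Jack", "wentUp", "TheHill")

TRIPLES = [
    WENT_UP,
    ("Jack", "friend", "Jill"),
    # Hypothetical edge property: the subject is itself a triple.
    (WENT_UP, "purpose", "fetch a pail of water"),
]

def edge_properties(triples, edge):
    """Return the properties attached to one edge as a dict."""
    return {p: o for s, p, o in triples if s == edge}

print(edge_properties(TRIPLES, WENT_UP))
```

Because the annotation is just another statement, the subject/predicate/object model covers node data and edge data alike.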
  13. How We Load Relational Tables to AnzoGraph
      • Rows -> Nodes
        – Value fields -> Node properties
          • Example: Birthdate, Gender, SSN…
        – Foreign keys -> Edges
          • Example: Father, Mother…
      • Join tables disappear!
        – Multi-value properties & edges
          • Example: Friends, Credit Cards…
      • NULLs disappear
        – Example: If birthday is unknown, that node property is not present
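The mapping rules on slide 13 can be sketched as two small Python functions. Everything here is illustrative: the function names, the `Table/key` node naming scheme and the sample row are assumptions made for this sketch, not AnzoGraph's actual loader.

```python
def row_to_triples(table, row, key, foreign_keys):
    """Turn one relational row into triples.

    foreign_keys maps a column name to the table it references.
    """
    subject = f"{table}/{row[key]}"
    triples = [(subject, "type", table)]
    for column, value in row.items():
        if column == key or value is None:
            continue  # NULLs disappear: no triple is produced at all
        if column in foreign_keys:
            # Foreign key -> edge to another node
            triples.append((subject, column, f"{foreign_keys[column]}/{value}"))
        else:
            # Value field -> node property
            triples.append((subject, column, value))
    return triples

def join_table_to_edges(rows, predicate, table):
    """A two-column join table becomes one multi-valued edge per row."""
    return [(f"{table}/{a}", predicate, f"{table}/{b}") for a, b in rows]

# Hypothetical sample data.
person = {"id": 1, "name": "Jack", "birthday": None, "father": 7}
print(row_to_triples("Person", person, "id", {"father": "Person"}))
print(join_table_to_edges([(1, 2), (1, 3)], "friend", "Person"))
```

The unknown birthday leaves no trace in the output, and the friend join table collapses into two plain edges on the same node.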
  14. Similarities, Differences -> Power
      • Like SQL relational, but…
      • No explicit schema. The ontology (fancy word for schema) is in the data.
      • Further ontology information may also be called out in the data, such as inference rules.
      • Standard SQL aggregates, joins, etc., plus simple and powerful relationship capabilities.
      • “How is Jack related to Jill?” (upcoming slide)
        – In SQL relational
          • Are they spouses?
          • Are they siblings?
          • Are they friends?
          • Do they have the same hobby?
          • … enumerate the choices, which EXPLODES with degrees of separation
        – In SPARQL/Cypher graph
          • How is Jack related to Jill?
          • … you can directly specify degrees of separation
      • Pretty exciting: essentially all the power of SQL, but you can do more, with more diverse data, where the data tells you about itself rather than you knowing it in advance.
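Slide 14's "How is Jack related to Jill?" question, with a limit on degrees of separation, amounts to a bounded path search. The Python sketch below uses a breadth-first search over made-up edges; the data, the function name `how_related` and the `max_hops` parameter are assumptions for illustration, and a real engine would express this in SPARQL or Cypher instead.

```python
from collections import deque

# Hypothetical edges, invented for this sketch.
EDGES = [
    ("Jack", "worksWith", "Harry"),
    ("Harry", "sibling", "Jill"),
    ("Jack", "wentUp", "TheHill"),
]

def how_related(triples, start, goal, max_hops):
    """Return the first chain of edges from start to goal within max_hops, else None."""
    neighbors = {}
    for s, p, o in triples:
        neighbors.setdefault(s, []).append((p, o))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) == max_hops:
            continue  # do not explore beyond the allowed degrees of separation
        for predicate, target in neighbors.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [(node, predicate, target)]))
    return None

print(how_related(EDGES, "Jack", "Jill", 1))
print(how_related(EDGES, "Jack", "Jill", 2))
```

The search never names a relationship type in advance, whereas the SQL approach on the slide has to enumerate spouse, sibling, friend and so on for every hop.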
  15. Graph Queries
  16. You know SQL, introducing LPG SPARQL*, Cypher!
      • SPARQL* - New Labelled Property Graph (LPG)
        – SQL-like - Standard keywords (SELECT, WHERE…)
        – W3C standard (LPG aspect de facto)
        – Most attractive to SQL folks
        – Amazon Neptune, Jena, AllegroGraph…
      • Cypher
        – New keywords (MATCH vs. WHERE)
        – De facto market standard
        – Most attractive to programmers
        – Neo4j, Redis, …
      Not a repeat of Beta vs. VHS
      • Both can work on the same data on the same system.
      • They do the same things; it is really a matter of taste!
      • AnzoGraph has SPARQL*, soon Cypher… GQL when available
  17. Discovery Inference
      Queries in SPARQL… can just as easily be in Cypher.

      What kind of things are in the graph?
        SELECT $thing $type
        WHERE { $thing <type> $type }
      Returns:
        thing   | type
        --------+--------
        Jack    | Boy
        Jill    | Girl
        Jack    | Man
        Jill    | Woman
        Jack    | Person
        Jill    | Person
        TheHill | Place

      How is Jack related to Jill?
        SELECT $edge
        WHERE { <Jack> $edge <Jill> }
      Returns:
        edge
        --------
        friend

      The “scan” in the WHERE clause is ALWAYS Subject, Predicate, Object, as in the RDF!
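The two discovery queries on slide 17 both reduce to a scan over subject, predicate and object with some positions left open. The Python sketch below mimics that scan on a small in-memory list; the function `scan` and the use of `None` as a stand-in for a query variable are conventions invented here, and only the un-inferred triples are loaded to keep the example short.

```python
TRIPLES = [
    ("Jack", "type", "Boy"),
    ("Jack", "friend", "Jill"),
    ("Jack", "wentUp", "TheHill"),
    ("Jill", "type", "Girl"),
    ("Jill", "wentUp", "TheHill"),
    ("TheHill", "type", "Place"),
]

def scan(triples, subject=None, predicate=None, obj=None):
    """Yield triples matching the pattern; None plays the role of a variable."""
    for s, p, o in triples:
        if subject in (None, s) and predicate in (None, p) and obj in (None, o):
            yield (s, p, o)

# "What kind of things are in the graph?"  { $thing <type> $type }
things = [(s, o) for s, _, o in scan(TRIPLES, predicate="type")]

# "How is Jack related to Jill?"  { <Jack> $edge <Jill> }
edges = [p for _, p, _ in scan(TRIPLES, subject="Jack", obj="Jill")]

print(things)
print(edges)
```

Both questions use the same function and differ only in which positions are fixed, which is the point the slide makes about the WHERE clause.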
  18. Discovery Analytics
      Queries in SPARQL… can just as easily be in Cypher.
      Triple clauses with the same variable are “joined”.

      What do Jack and Jill have in common?
        SELECT $jack $common $jill
        WHERE {
          <Jack> $jack $common .
          <Jill> $jill $common .
        }
      Returns:
        jack      | common  | jill
        ----------+---------+---------
        wentUp    | TheHill | wentUp
        worksWith | Harry   | brother
        friend    | Aesop   | friend
      Both Jack and Jill wentUp TheHill. Jack worksWith Jill’s brother Harry. They both have a friend Aesop.

      Number of Men & Women By State
        SELECT $state $type (COUNT(*) AS $cnt)
        WHERE {
          $person <type> $type .
          $person <livesIn> $zipcode .
          $zipcode <in> $state .
          FILTER($type=<Man> || $type=<Woman>)
        }
        GROUP BY $state $type
        ORDER BY $state
      Returns:
        state      | type  | cnt
        -----------+-------+------------
        Alabama    | Man   | 2,355,456
        Alabama    | Woman | 2,456,642
        Alaska     | Man   | 351,455
        Alaska     | Woman | 350,637
        California | Man   | 19,344,567
        California | Woman | 20,454,679
        ...
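Slide 18's remark that triple clauses sharing a variable are joined can be illustrated in Python. In this sketch the shared variable `$common` becomes an equality test between two scans, and the state/type count becomes a `Counter`. The function names and the three-person `PEOPLE` dataset are invented for illustration, and the second function assumes each person has one type and one zipcode.

```python
from collections import Counter

TRIPLES = [
    ("Jack", "wentUp", "TheHill"),
    ("Jill", "wentUp", "TheHill"),
    ("Jack", "worksWith", "Harry"),
    ("Jill", "brother", "Harry"),
    ("Jack", "friend", "Aesop"),
    ("Jill", "friend", "Aesop"),
    ("Jack", "birthday", "09/17/1975"),
]

def in_common(triples, a, b):
    """Join a's triples with b's triples on a shared object ($common)."""
    return [
        (pa, oa, pb)
        for sa, pa, oa in triples if sa == a
        for sb, pb, ob in triples if sb == b and ob == oa
    ]

# Hypothetical miniature dataset for the GROUP BY example.
PEOPLE = [
    ("p1", "type", "Man"), ("p1", "livesIn", "z1"),
    ("p2", "type", "Woman"), ("p2", "livesIn", "z1"),
    ("p3", "type", "Man"), ("p3", "livesIn", "z2"),
    ("z1", "in", "Alabama"), ("z2", "in", "Alaska"),
]

def count_by_state_and_type(triples):
    """Join person -> zipcode -> state, then count per (state, type)."""
    kinds = {s: o for s, p, o in triples if p == "type" and o in ("Man", "Woman")}
    zipcodes = {s: o for s, p, o in triples if p == "livesIn"}
    states = {s: o for s, p, o in triples if p == "in"}
    counts = Counter(
        (states[zipcodes[person]], kind)
        for person, kind in kinds.items()
        if person in zipcodes and zipcodes[person] in states
    )
    return sorted(counts.items())

print(in_common(TRIPLES, "Jack", "Jill"))
print(count_by_state_and_type(PEOPLE))
```

Jack's birthday drops out of the first result because Jill has no triple with the same object, which is the behaviour a join on `$common` gives.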
  19. Simple, Powerful, Next Generation
      Download or deploy to AWS Cloud: www.anzograph.com
  20. Begin your journey to graph today!
      Watch the on-demand webinar, or to learn more, go to www.anzograph.com
