SlideShare uma empresa Scribd logo
1 de 16
GraphTO
                 February 2013, Mozilla Toronto




David Colebatch & Darrick Wiebe               us@xnlogic.com
Agenda

• Who We Are
• Intro to GraphDB             Sponsored By:


• Intro to Patent-Grant Data
• Graph Concepts
• Pacer::Xml
¿por qué?

• Data Set Size
• Connectivity of Data
• Semi-structure
• Evolution of SOA and REST
The Zone of SQL Adequacy
                                                                         SQL database
                                                     Social
                                                                         Requirement of application
                                         Geo
Performance




               Salary List

                                                       Network / Cloud
                                                        Management

                             ERP

                                               MDM
                                   CRM




                                               Data complexity
How?
• Nodes / Vertices
• Relationships / Edges
Relational Model vs. Graph


                                      Each of these models
                                    expresses the same thing

Person*   Person-Friend   Friend*
Graph db performance
๏ a sample social graph
• with ~1,000 persons
๏ average 50 friends per person
๏ pathExists(a,b) limited to depth 4
๏ caches warmed up to eliminate disk I/O
          Database             # persons           query time
  MySQL                                    1,000       2,000 ms
  Neo4j                                    1,000          2 ms
  Neo4j                           1,000,000               2 ms
Different Visualization
Query Languages


• Pacer - gem install pacer
• Cypher
• SPARQL - if you grok RDF already
US PTO Data

• Patent Grant Data in XML
• bi-weekly chunks
• Pacer::Xml has handy loader as an example:
  jruby-1.7.0 > g = PacerXml::Sample.load_100
  Downloading a sample xml file from...
001> PacerXml



Importing XML into a graph?   What do you do next?
Resources

https://github.com/xnlogic/pacer-xml
https://github.com/pangloss/pacer
http://neo4j.org/
http://tinkerpop.com/

Mais conteúdo relacionado

Semelhante a 20130204 graph to-pacer-xml

Modèles de données et langages de description ouverts 6 - 2021-2022
Modèles de données et langages de description ouverts   6 - 2021-2022Modèles de données et langages de description ouverts   6 - 2021-2022
Modèles de données et langages de description ouverts 6 - 2021-2022François-Xavier Boffy
 
Applying large scale text analytics with graph databases
Applying large scale text analytics with graph databasesApplying large scale text analytics with graph databases
Applying large scale text analytics with graph databasesData Ninja API
 
Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015StampedeCon
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use CasesMax De Marzi
 
Microservices Patterns with GoldenGate
Microservices Patterns with GoldenGateMicroservices Patterns with GoldenGate
Microservices Patterns with GoldenGateJeffrey T. Pollock
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceCambridge Semantics
 
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceNodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceDeepak Chandramouli
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...lucenerevolution
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...lucenerevolution
 
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Igor De Souza
 
Information processing architectures
Information processing architecturesInformation processing architectures
Information processing architecturesRaji Gogulapati
 
Etosha - Data Asset Manager : Status and road map
Etosha - Data Asset Manager : Status and road mapEtosha - Data Asset Manager : Status and road map
Etosha - Data Asset Manager : Status and road mapDr. Mirko Kämpf
 
Model-Driven Cloud Data Storage
Model-Driven Cloud Data StorageModel-Driven Cloud Data Storage
Model-Driven Cloud Data Storagejccastrejon
 
CIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis DesignCIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis DesignAntonio Castellon
 
Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)Steve Min
 
La bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphesLa bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphesCédric Fauvet
 
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jTransforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jDatabricks
 
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jTransforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jFred Madrid
 

Semelhante a 20130204 graph to-pacer-xml (20)

Modèles de données et langages de description ouverts 6 - 2021-2022
Modèles de données et langages de description ouverts   6 - 2021-2022Modèles de données et langages de description ouverts   6 - 2021-2022
Modèles de données et langages de description ouverts 6 - 2021-2022
 
Big data and cloud
Big data and cloudBig data and cloud
Big data and cloud
 
Applying large scale text analytics with graph databases
Applying large scale text analytics with graph databasesApplying large scale text analytics with graph databases
Applying large scale text analytics with graph databases
 
Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
Microservices Patterns with GoldenGate
Microservices Patterns with GoldenGateMicroservices Patterns with GoldenGate
Microservices Patterns with GoldenGate
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
 
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceNodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
 
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
 
Drill njhug -19 feb2013
Drill njhug -19 feb2013Drill njhug -19 feb2013
Drill njhug -19 feb2013
 
Information processing architectures
Information processing architecturesInformation processing architectures
Information processing architectures
 
Etosha - Data Asset Manager : Status and road map
Etosha - Data Asset Manager : Status and road mapEtosha - Data Asset Manager : Status and road map
Etosha - Data Asset Manager : Status and road map
 
Model-Driven Cloud Data Storage
Model-Driven Cloud Data StorageModel-Driven Cloud Data Storage
Model-Driven Cloud Data Storage
 
CIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis DesignCIKB - Software Architecture Analysis Design
CIKB - Software Architecture Analysis Design
 
Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)
 
La bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphesLa bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphes
 
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jTransforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
 
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jTransforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
 

Último

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

20130204 graph to-pacer-xml

Notas do Editor

  1. There are four trends underpinning the NoSQL and specifically the GraphDB movements: 1)...the size of data that we are managing is more than doubling every two years, with around 2.4 Zettabytes expected by the end of this year (or 250mil years of the TV show “24”). 2) Data is more highly-connected than ever before. FOAF on social networks; Configuration Management for a Datacenter 3) Schema-less data persistence; Add a field to just one record, no problem. Sparkes on Toyota 4) Application Architecture changed from flat-files and batch processing, to shared RDBMS, SOA + Web services
  2. *This is a somewhat contrived example, as “person” & “friend” would normally be one table with a self join.
  3. A borrowed slide from neo technology
  4. Gephi - example of high-level graph visualization where you might be looking for clustering of data types and super nodes.
  5. d3js.org - example of mixing high-level overview of relationships, with specific relationships on hover
  6. A few options exist for graph query languages, some you may have hear of. SPARQL is a recursive acronym for “SPARQL Protocol and RDF Query Language” for Resource Description Framework. Cypher and Gremlin are modern graph query languages with strong ties to the Neo4j community. Pacer is a ruby gem that you can include in your projects and get jamming on embedded graph databases straight away.
  7. Chris compared Traffic-based and Content-based message ranking approaches to discover Ego Networks. We don’t need to worry about the details here though. Chris has left us with a nice property graph which identifies official reporting relationships by an edge labelled “Directly_Reported_To”.
  8. Go here, cool stuff.