O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

GraphChain

735 visualizações

Publicada em

A Distributed Database with Explicit Semanticsand Chained RDF Graphs.
The talk at "The Web Conference" Lyon 2018

Publicada em: Internet
  • Seja o primeiro a comentar

GraphChain

  1. 1. A Distributed Database with Explicit Semantics and Chained RDF Graphs GraphChain Mirek Sopek, Przemysław Grądzki, Witold Kosowski, Dominik Kuziński, Rafał Trójczak, Robert Trypuz 3rd Workshop on Linked Data & Distributed Ledgers (LD-DL)
  2. 2. 23RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 THE TALK AGENDA • The Motivation • Related Works • GraphChain defined • GraphChain visualized • GraphChain challenges • GraphChain Ontology • The Implementations • The future of GraphChain GraphChain
  3. 3. The MotivationWhy did we invent GraphChain?
  4. 4. 43RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 WHY GraphChain? • What is LEI system? • The LEI (Legal Entity Identifier) is the most universal, global and trustworthy identification mechanism for companies from all countries of the world. • It is a 20-digit, globally unique identifier for businesses created as Legal Entities • LEI.INFO (https://lei.info) is the LEI Linked Data Resolver. We have been working with the LEI system, and discovered its shortcomings …
  5. 5. 53RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 WHY GraphChain? • Lack of explicit data semantics • While LEI system is distributed in nature, the technology used to support it is of old style and inherently unsecured • So, using LEI.INFO as the starting point, we have created a concept of Blockchained LEI system and we have made a number of POCs demonstrating its usefulness. LEI system shortcomings …
  6. 6. 63RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 WHY GraphChain? • It was very hard to combine the three features the LEI system required: • Explicit data semantics • Linked Open Data/SW data model and • Blockchain security model • In our Blockchain based POCs, the extensibility of Legal Entity data was poor (unless we used simple textual serialization for the LEI records) • The queries across the entire LEI dataset were impossible or very hard  What we really needed was a standard RDF Graph database protected by a Blockchain. However, despite our efforts …
  7. 7. Related worksDevelopments combining SW & Blockchain technologies
  8. 8. 83RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 Ontologies and LD data models for Blockchain • BLONDiE - a generic Blockchain ontology developed by Héctor Ugarte in the Semantic Blockchain project. “BLONDiE is the most developed vocabulary for representing blockchain concepts, with the most potential to enable reusable modelling across different distributed ledgers in the future.” - Allan Third and John Domingue. 2017. Linked Data Indexing of Distributed Ledgers • EthOn is an ontology developed by Consensys (https://consensys.net/) that is intended to be a semantical counterpart of the Ethereum Blockchain framework. (http://ethon.consensys.net). We noticed some problems with its taxonomy. • Flex (Web) Ledger is a graph data model and a protocol for decentralized ledgers developed by Digital Bazaar. From the data model perspective, Flex Ledger assumes the use of generic JSON objects encapsulated in the ledger. “The ledger data model and syntax make no assumption about which ontology is used.”
  9. 9. 93RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 Bring Blockchain to legacy databases • BigchainDB – Blockchain database build using MongoDB „Rather than trying to enhance blockchain technology, BigchainDB starts with a big data distributed database and then adds Blockchain characteristics - decentralized control, immutability and the transfer of digital assets.” ( https://www.bigchaindb.com/features/ ) • MongoDB Blockchained – is essentially the same development but from MongoDB perspective: “A blockchain-enabled MongoDB that wraps the core database (MongoDB) and implements the three blockchain characteristics of decentralization, immutability, and assets.“ (“Building Enterprise-Grade Blockchain Databases with MongoDB”, A MongoDB White Paper)  In both cases the essence is in the ability of the distributed database to use legacy access methods (Querying, Scalability, Operationalizability)
  10. 10. The birth of GraphChain Rather than trying to add Graphs and Ontologies to Blockchain, GraphChain starts with RDF database and then adds Blockchain features to the system.
  11. 11. 113RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 GraphChain defined • An RDF graph is an unordered set of RDF triples and a named RDF graph is an RDF graph which is assigned a name in the form of a URI • GraphChain is thus defined as: • A linked chain of named RDF graphs specified by the GraphChain ontology and an ontology for data graph part of the GraphChain. • A set of general mechanisms for calculating a digest of the named RDF graphs. • A set of network mechanisms that are responsible for the distribution of the named RDF graphs among the distributed peers and the for achieving the consensus. The main idea behind GraphChain is to use Blockchain mechanisms on top of an abstract RDF graph data type.
  12. 12. 123RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The GraphChain Architecture • A single node of the GraphChain consists of several parts: • a web interface for communication with clients (via the HTTP protocol), • a web socket endpoint for communication with others nodes, • a cryptography module for handling of digest calculation, • a triple store repository manager for storing blocks as sets of triples and obtaining blocks from the repository, • and services which bind all these parts together. A single node perspective
  13. 13. 133RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The GraphChain process flow
  14. 14. 143RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The artistic visualization of the GraphChain Watch it at: https://youtu.be/C7_mB_myo5w
  15. 15. The implementation challenges The implementation of GraphChain brings a number of challenges that must be addressed before production-grade alternative to the existing Blockchain implementations is offered.
  16. 16. 163RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The implementation challenges • Performance of the programmatic access to RDF graphs. • Performance and quality of the RDF graphs serialization used for the broadcast of the named graphs to other nodes. • Computation of the RDF digests. The most important challenges
  17. 17. 173RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The computation of the RDF digests • Blank nodes’ identifiers are implementation dependent, i.e. they may change while transferring the same graph between different implementations, triple stores or other methods of their instantiation. • Every RDF graph is equivalent to an unordered set of triples so the one and the same graph, even in the same syntax, can be serialized in many different ways. • RDF graph serialization can be differently encoded; hashing functions are encoding sensitive. The issues
  18. 18. 183RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The computation of the RDF digests • “Canonicalization” — calculation of the digest as the hash of the canonical graph serialization • “DotHash” — calculation of the digest as the result of the combining operation on the hashes of the individual triples • “Interwoven DotHash” — calculation of the digest as the result of the combining operation on the hashes of the individual triples and the triples linked by blank nodes. The proposed solutions
  19. 19. 193RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 Interwoven DotHash • If the triple does not contain a blank node, compute it as a hash of its N3 serialized format. • If the triple contains a blank node as its Subject, then compute its hash as the sum of the hashes of the N3 serialized Predicate and Object and the hashes of non-blank nodes of all those triples where the blank node appears in the Object nodes. • If the triple contains a blank node as its Object, then compute its hash as the sum of the hashes of the N3 serialized Subject and Predicate and the hashes of non-blank nodes of all those triples where the blank node appears in the Subject nodes. • If the triple contains blank nodes in both Subject & Object nodes, use the above rules twice, once for Subject, then for Objects. The algorithm
  20. 20. The GraphChain Ontology The GraphChain ontology is an OWL ontology of the chained, named RDF graphs
  21. 21. 213RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The GraphChain Ontology • The GraphChain ontology resembles sequentially ordered and cryptographically secured chain structure presented in the original paper by Nakamoto. • From the ontological perspective GraphChain’s block (unit) is a reified 7-ary relation • In the reification pattern each n-ary relation between resources is represented as a separate OWL class and each instance of the relation (an n-tuple) as an instance of the class plus n additional binary relations providing links to each argument of the n-tuple. An OWL ontology of chained named RDF graphs
  22. 22. 223RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The GraphChain Ontology • The GraphChain ontology is a light-weight OWL ontology: • it consists of 2 classes, 1 object property and 6 data properties. • It has ALQ(D) expressivity and 11 restrictions on classes. • It’s main purpose is to describe GraphChain constructs • GraphChain sets no restriction on the ontological description of the data graphs http://ontologies.makolab.com/bc
  23. 23. The implementations So far, GraphChain has three simple, exemplary implementations and one production grade (in progress)
  24. 24. 243RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 Simple implementations of the GraphChain • .NET/C# The C# implementation uses .NET Core platform. The node itself is an ASP.NET Core web application with REST API for a client - node communication and WebSocket layer for Peer to Peer transmission • Java The Java implementation was created as a Spring-managed web application. It uses the RDF4J library for handling semantic-related operations. Can store RDF graphs in both the RDF4J triple store and the AllegroGraph triple store. Adding a new storage method is easy. • JavaScript/node.js We have also developed a third, illustrative and simple implementation of the node: a JavaScript implementation which is based on Naivechain. It offers HTTP API and P2P communication between nodes. There are some differences between our implementation and Naivechain, though. The simple implementations
  25. 25. 253RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 Web interface for the GraphChain • Uses YasUI client (http://about.ayasgui.org) • Allows for SPARQL querying both GraphChain ledger and data graphs • Connects to multiple nodes of the GraphChain http://binsem.makolab.pl/gcgui/
  26. 26. 263RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 Production grade implementation of the GraphChain • Hyperledger Indy is a distributed ledger, purpose- built for decentralized identity. • Used by Sovrin foundation (https://sovrin.org/) for digital identity. • Ideal solution for our LEI applications • „Observer” nodes in the second half of 2018 • We are analyzing the results of our POC on the use of the GraphChain and Hyperledger Indy and are working on the production system. Using Hyperledger Indy
  27. 27. The future of the GraphChain
  28. 28. 283RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 The plans for the project development • Finalization of Hyperledger Indy implementation • In parallel: selection of the alternative triple store implementations • Testing other graph “models” and engines (like neo4j) and other “non-SQL” database models • Creating a solution for the global LEI system. • Considering using QRL code (Quantum Resistent Ledger) as part of the GraphChain. What’s next ?
  29. 29. THANK YOU The slide deck at: https://ml.ms/TheGraphChainLyon ❧
  30. 30. 303RD WORKSHOP ON LINKED DATA & DISTRIBUTED LEDGERS (LD-DL), Lyon, April 24, 2018 PLEASE CONTACT US! DR. MIREK SOPEK CTO & founder sopek@makolab.com Poland: MakoLab SA, Demokratyczna 46, 93430 Lodz, Poland Phone: +48 600 814 537, www.makolab.com USA: 2153 S.E. Hawthorne Road, Suite 205, Gainesville, Florida 32641 Phone: +1 551 226 5488 us.makolab.com Dr ROBERT TRYPUZ MakoLab SA (DC Lublin) KUL University robert.trypuz@makolab.com PRZEMYSŁAW GRĄDZKI MakoLab SA (DC Lublin) KUL University przemyslaw.gradzki@makolab.com WITOLD KOSOWSKI MakoLab SA witold.kosowski@makolab.com DOMINIK KUZIŃSKI MakoLab SA dominik.kuzinski@makolab.com RAFAŁ TRÓJCZAK MakoLab SA (DC Lublin) KUL University Rafal.trojczak@makolab.com

×