Graph Databases and Web Frameworks (NodeJS, AngularJS, GridFS, OpenLink Virtuoso)

Web frameworks and graph databases
Overview and code demos
João Rocha da Silva
May 2014
joaorosilva@gmail.com

Contents
• Modeling limits of relational databases
• Entities with variable attributes
• Time-variant values
• Inheritance
• Hierarchies (parents of parents of parents…)

Contents (cont’d)
• Modeling problems in a graph
• Ontologies and SPARQL
• OpenLink Virtuoso
• Scalable file storage: GridFS within MongoDB
• Scalable document indexing: ElasticSearch

Contents (cont’d)
• NodeJS and asynchronous flow control
• AngularJS for dynamic web interfaces
• BONUS: Socket.io sneak peek

Relational databases
• Good when you know everything about the problem at the time of modeling
• A column can only be of a single type (VARCHAR, int, etc.)
• Hard to document
• Model can become too attached to the code

Relational databases (cont’d)
• Handling historical values = complex SQL
• Hierarchies = foreign key loops (tables that reference themselves)
• Variable attributes, inheritance = NULL-filled columns plus “if hell” in the code, or many JOINs

[Figure: a table from an SAP schema (one of 78,826 tables and counting), annotated “Beautiful, meaningful column names ;-)” and “Even better table names”. Source: SAP]

[Figure: MediaWiki schema. “Old Versions”, aka “copy everything and add a timestamp”. Source: MediaWiki]

[Figure: MediaWiki schema. Source: MediaWiki]
Now imagine we want to store images of different kinds, with different attributes…

[Figure: CKAN schema for an entity with variable, time-dependent attributes. Fixed attributes sit in regular columns; variable ones are stored as rows of attribute name, timestamps and value (always varchar). Source: CKAN]
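
The attribute name / timestamp / value layout above can be sketched in a few lines of JavaScript. This is only an illustration of the pattern, not CKAN's actual schema: the rows, the entity name and the `snapshot` helper are all hypothetical. Note how every value is a string, mirroring the "always varchar" column, so the caller must parse numbers itself.

```javascript
// Hypothetical in-memory version of an entity-attribute-value table.
// Each row holds one attribute of one entity, valid from a given timestamp.
const rows = [
  { entity: "dataset-1", attribute: "title", value: "Draft", timestamp: "2014-01-01T00:00:00Z" },
  { entity: "dataset-1", attribute: "title", value: "DCB Base Data", timestamp: "2014-02-01T00:00:00Z" },
  { entity: "dataset-1", attribute: "width", value: "120", timestamp: "2014-01-01T00:00:00Z" },
];

// Return the attributes of an entity as they were at a given instant.
// ISO 8601 timestamps in the same format compare correctly as strings.
function snapshot(rows, entity, at) {
  const latest = {};
  for (const row of rows) {
    if (row.entity !== entity || row.timestamp > at) continue;
    const seen = latest[row.attribute];
    if (!seen || row.timestamp > seen.timestamp) latest[row.attribute] = row;
  }
  const result = {};
  for (const [name, row] of Object.entries(latest)) result[name] = row.value;
  return result;
}

console.log(snapshot(rows, "dataset-1", "2014-01-15T00:00:00Z"));
```

Even this small lookup needs a "latest row per attribute" pass, which is the kind of logic that turns into complex SQL in a relational store.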

Graph databases
• Represent entities (Users, Products, Places…) as vertices (entity types are called classes)
• Connections between them are directed graph edges (edge types are called properties)
• The meaning of these connections is expressed in ontologies that can be shared and reused

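Representing a person using ontologies

[Figure: graph describing a person]
• <http://www.fe.up.pt/~pro11004> foaf:name “João Rocha”
• <http://www.fe.up.pt/~pro11004> rdf:type up:PhDStudent
• <http://www.fe.up.pt/~pro11004> org:memberOf <http://www.fe.up.pt/>
• Ontologies referenced: http://www.w3.org/TR/rdf-schema/ and http://www.foaf-project.org/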
Getting all the students
# PREFIX declarations for rdf: and up: omitted for brevity
SELECT ?uri ?attribute ?value
FROM <http://myorganization.com/data>
WHERE
{
  ?uri rdf:type up:Student .
  ?uri ?attribute ?value
}
• Will fetch all the students, regardless of their specific type (subclasses such as up:PhDStudent are matched through inference)
• Will also return their attributes (“database columns”)
• Different types of students will have different attributes
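
A query like this returns one row per (student, attribute, value) combination, so the client usually regroups the rows per student. The sketch below assumes the endpoint answers in the W3C SPARQL 1.1 Query Results JSON Format; the sample response, its URIs and the `groupByUri` helper are made up for illustration.

```javascript
// Sample response in the SPARQL 1.1 Query Results JSON Format.
// All URIs and values here are hypothetical.
const response = {
  head: { vars: ["uri", "attribute", "value"] },
  results: {
    bindings: [
      { uri: { type: "uri", value: "http://example.org/s1" },
        attribute: { type: "uri", value: "http://xmlns.com/foaf/0.1/name" },
        value: { type: "literal", value: "Ana" } },
      { uri: { type: "uri", value: "http://example.org/s1" },
        attribute: { type: "uri", value: "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" },
        value: { type: "uri", value: "http://example.org/up#PhDStudent" } },
      { uri: { type: "uri", value: "http://example.org/s2" },
        attribute: { type: "uri", value: "http://xmlns.com/foaf/0.1/name" },
        value: { type: "literal", value: "Rui" } },
    ],
  },
};

// Turn the flat rows into one object per student URI.
// Attributes are multi-valued in RDF, so each one maps to an array.
function groupByUri(response) {
  const students = {};
  for (const b of response.results.bindings) {
    const record = students[b.uri.value] || (students[b.uri.value] = {});
    const values = record[b.attribute.value] || (record[b.attribute.value] = []);
    values.push(b.value.value);
  }
  return students;
}

const grouped = groupByUri(response);
console.log(JSON.stringify(grouped, null, 2));
```

Because each student may carry a different set of attributes, the grouped objects do not share a fixed shape, which is exactly the "variable attributes" case that is awkward in a relational table.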

Inference
• Transitive properties (subclass of subclass…)
• Subclasses
• Multiple inheritance handling (Student + Researcher + ScholarshipHolder)
Saves coding time otherwise spent writing complex queries
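
To make the idea concrete, here is a minimal sketch of what subclass inference computes, written by hand in JavaScript. The class hierarchy is hypothetical and the code is not how any particular graph database implements inference; it only shows the transitive closure that the store would otherwise work out for you.

```javascript
// Hypothetical class hierarchy: class -> direct superclasses.
// PhDStudent has two parents (multiple inheritance).
const subClassOf = {
  PhDStudent: ["Student", "Researcher"],
  Student: ["Person"],
  Researcher: ["Person"],
  Person: [],
};

// Collect every superclass reachable through subClassOf (transitive closure)
function superClasses(cls, hierarchy) {
  const found = new Set();
  const pending = [...(hierarchy[cls] || [])];
  while (pending.length > 0) {
    const next = pending.pop();
    if (found.has(next)) continue; // already visited via another parent
    found.add(next);
    pending.push(...(hierarchy[next] || []));
  }
  return found;
}

// An instance belongs to a class if its asserted type is that class
// or any direct or indirect subclass of it
function isInstanceOf(assertedType, cls, hierarchy) {
  return assertedType === cls || superClasses(assertedType, hierarchy).has(cls);
}

console.log(isInstanceOf("PhDStudent", "Person", subClassOf));
console.log(isInstanceOf("Student", "PhDStudent", subClassOf));
```

With inference enabled in the store, a query for all `Person` instances would return PhD students as well, without this traversal appearing in application code.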

Nothing comes for free
• Aggregation operators are slow
• Transactions are not supported in standard SPARQL
• (“SPARQL 1.1 Query/Update Services should be atomic but that they are not required to be atomic.”)
• Graph DBMS solutions are in early stages (many bugs, many “beta”s, many mailing lists…)

Dendro
(dendro-dev.fe.up.pt:3001)
• Dropbox and File/Folder description platform
• Variable descriptions
• Time-dependent values
• Directory structures (hierarchy)
• Need for simple querying…

[Figure: Dendro versioning model, showing the current dataset version and its past revisions.
Nodes: Pn, Dn, Dn-1, Fn, Un, C.
Properties: nie:isLogicalPartOf, dc:title (“DCB Base Data”), dcb:specimenWidth (120), dcb:initialCrackLength (280mm), dc:isReferencedBy, dc:isVersionOf, dc:created and dc:modified (01/01/2014^^xsd:date), dc:creator.
Change recording: ddr:pertainsTo, ddr:changedDescriptor (ddr:initialCrackLength), ddr:operation (“add”).
Annotations: added property instance; changed modification timestamp; revision creation timestamp.]
[Figure: architecture]
• Socket.io: real-time events
• NodeJS: business logic
• AngularJS: dynamic interfaces à la Google Docs
• GridFS: files
• OpenLink Virtuoso: database
• ElasticSearch: free-text search

Code Demos
NodeJS (Dendro) http://192.168.5.75:3001
GridFS http://192.168.5.75:27017
OpenLink Virtuoso http://192.168.5.75:8890
ElasticSearch http://192.168.5.75:9200/_plugin/head/
Socket.io (BattleBits) http://localhost:3000

Conclusions
• JavaScript + JSON = easy parsing, less verbose code
• NodeJS = asynchronous everything; needs precise flow control
• ElasticSearch = scalable indexing, easy-to-use JSON API
• GridFS = transparent scaling for huge numbers of large files; querying through a JSON-based API
• Graph databases = model certain problems better than their relational counterparts. Simpler queries using SPARQL. Less mature than RDBMSs. No transactions in standard SPARQL.
• Socket.io = real-time library for client-server-client push communication
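
The remark about NodeJS needing precise flow control can be illustrated with two small callback-style helpers. This is a minimal sketch, not code from Dendro: `series`, `parallel` and `delayed` are hypothetical names, and the timers merely stand in for database, file and index lookups. Promises or async/await are an equally valid way to express the same control flow.

```javascript
// Run callback-style tasks one after another, collecting results in order.
// Each task follows the Node convention: task(callback), callback(err, result).
function series(tasks, done) {
  const results = [];
  function runNext(index) {
    if (index === tasks.length) return done(null, results);
    tasks[index]((err, result) => {
      if (err) return done(err); // stop at the first failure
      results.push(result);
      runNext(index + 1);
    });
  }
  runNext(0);
}

// Start all tasks at once; finish when every one has called back.
function parallel(tasks, done) {
  const results = new Array(tasks.length);
  let remaining = tasks.length;
  let failed = false;
  if (remaining === 0) return done(null, results);
  tasks.forEach((task, i) => {
    task((err, result) => {
      if (failed) return; // done was already called with an error
      if (err) { failed = true; return done(err); }
      results[i] = result; // keep task order, not completion order
      if (--remaining === 0) done(null, results);
    });
  });
}

// Hypothetical tasks standing in for database, file and index lookups
const delayed = (value, ms) => (cb) => setTimeout(() => cb(null, value), ms);

parallel([delayed("metadata", 30), delayed("file", 10), delayed("hits", 20)], (err, results) => {
  console.log(err, results);
});
```

`series` suits steps that depend on each other, while `parallel` suits independent lookups whose results are combined afterwards.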

João Rocha da Silva is an Informatics Engineering PhD student at the Faculty of Engineering of the University of Porto. He specializes in research data management, applying the latest Semantic Web technologies to the adequate preservation and discovery of research data assets.

He is experienced in many programming languages and frameworks (JavaScript with Node, PHP with MVC frameworks, Ruby on Rails, J2EE, etc.) running on the major operating systems (everyday Mac user). Regardless of language, he is a quick learner who adapts to new technologies quickly and effectively.

He is also an experienced freelance iOS developer with several apps published on the App Store, and a self-taught DIY mechanic with a special interest in classic cars, particularly his 1987 Toyota Corolla GT Twin Cam, also known as Hachi-Roku or AE86.

Research Data Management and Semantic Web Researcher, Web & iPhone Developer
João Rocha da Silva
joaorosilva@gmail.com
