SlideShare a Scribd company logo
1 of 55
FluidDB
Design ⊳ Architecture ⊳ Tradeoffs

           Terry Jones

           terry@jon.es
           @terrycojones
Or...
          How someone
who knows nothing about databases
    started building a database
Early days

FluidDB is not yet released

It’s an experiment

Much remains to be done

Alpha release real soon now (end of June?)
Fluidinfo

Founded in the UK in March 2006

Two full-time employees

Development in Barcelona

Friends, family, & Esther Dyson funded
Motivations

How humans work with information: Concepts

User interface and API difficulties / restrictions

Personalization

Make the world more writable
Concepts
             Are not owned
         No formal structure
       No permission is needed
     Partial or full disorganization
     No predefined set of qualities
  No special central piece of content
Easy to reorganize or multiply organize
Concepts
               Are not owned
           No formal structure
         No permission is needed
       Partial or full disorganization
       No predefined set of qualities
    No special central piece of content
  Easy to reorganize or multiply organize

To engineer this kind of flexibility,
    we must rethink control
UIs and APIs
Where did my information go?
Can I extract, re-use, add, delete?
Is there an API?
What am I allowed to do?
Can I search? On what?
Special pleading
Wouldn't it be cool if...
Personalization
In our hands vs on our behalf
 Add anything to anything, and search on it

 Protect, share, delete as you wish

 Organize things as you please

 Freedom of choice in applications
Read ⊳ Search ⊳ Write ?

 The web is readable (c. 1990)
 And searchable (c. 1995)
 But not generally writable
Read ⊳ Search ⊳ Write ?

 The web is readable (c. 1990)
 And searchable (c. 1995)
 But not generally writable

 Plus, search is not very interactive:
   Dull: refine query or ask for more results
   Can’t change results
   Can’t organize
Why don’t our architectures
let us work with information
        more flexibly?
What would it take
to build something that did?
How can we address
all these problems at once ?
Alan Kay
At PARC we had a slogan: “Point of view is worth 80 IQ
points.” It was based on a few things from the past like
how smart you had to be in Roman times to multiply two
numbers together; only geniuses did it.

We haven't gotten any smarter, we've just changed our
representation system. We think better generally by
inventing better representations; that's something that we
as computer scientists recognize as one of the main
things that we try to do.
Alan Kay
At PARC we had a slogan: “Point of view is worth 80 IQ
points.” It was based on a few things from the past like
how smart you had to be in Roman times to multiply two
numbers together; only geniuses did it.

We haven't gotten any smarter, we've just changed our
representation system. We think better generally by
inventing better representations; that's something that we
as computer scientists recognize as one of the main
things that we try to do.
1             1
     2            10
     4           100
     8          1000
   16          10000
   32         100000
   64        1000000
  128       10000000
  256      100000000
  512     1000000000
1024 +   10000000000 +
??????   11111111111
VIII
 XVII
 XLIV
LXXX
XCVI
CCLV
VIII
 XVII
 XLIV
LXXX
XCVI
CCLV +
VIII
 XVII
 XLIV
LXXX
XCVI
CCLV +
   D
VIII
 XVII
 XLIV
LXXX
XCVI
CCLV +
   D
8 queens problem
Use a poor representation with
264 (281,474,976,710,656) states:

Look for a smart algorithm!


              Or...

Use a good representation with 8! (40,320) states
and exhaustive search as an algorithm!
Objects
      digg.com/date May 21, 2009
         meg/web/rating 6
         tim/seen true
   about http://digg.com/news.html
mike/opinion “half-baked nonsense”
What else?
Objects have no owner, though attributes do
No metadata/data duality
Unlimited aggregation & combination of information
No a priori organization
Search, search, search!
Dynamic data structures
FluidDB
Objects composed of attributes with values

Attributes in namespaces: joe/rating, amazon.com/price

Simple query language

Simple API: HTTP, JSON, REST etc.

Users, attributes, namespaces each have an object
Applications are users too
FluidDB

Single-instance hosted database (like SimpleDB)

Enables sharing between applications, people

Distributed storage and query processing
Design goals
Simple data model
Simple permissions model
Simple, easily parallelizable, query language
Horizontally scalable
Fast for common tasks
Implementable!
Permissions

For each action on a namespace or attribute:

  There’s a policy: ‘open’ or ‘closed’

  And a (possibly empty) list of exceptions
Permissions
attribute or
               action   policy   exceptions
namespace
  tim/seen      read    closed    tim, meg
mike/opinion   update   open
   mike/       create   closed     mike
 meg/rating     see     open
 meg/rating     read    closed      meg
Query language
Numeric: attribute value (=, <, etc.)
Textual: attribute text match
Presence: has attribute
Grouping/logic: (...), and, or, not
Query language
   Numeric: attribute value (=, <, etc.)
   Textual: attribute text match
   Presence: has attribute
   Grouping/logic: (...), and, or, not


Designed to bring back object ids
Demo
Data buffet
username seen, lastvisited, rating, goingtoread, comment
username me, FBfriend, linkedInContact, met, family
username/myusername name, password
flickr.com longitude, latitude, owner, date, camera/make
username/flickr title, description
VCspotter fredwilson, bradburnham
nasdaq.com name, symbol, type, outstanding, value, price
username/stocks shares, date
google.com pagerank
tracks album, artist, name, year
username/music count, favorite, lastPlay, stars, bestOf2007
Data buffet 2
email messageId, fromId, toId
username/email
 from, to, subject, date
alexa.com rank
digg.com title, description, date, diggs
mahalo.com appeared, category
readwriteweb.com appeared
reddit.com date, score
techcrunch.com appeared, URI
attribute description, name, path
namespace description, name, path
Example queries
terry/rating > 5 and has reddit.com/score

has goingtoread and seen > quot;January 1, 2008quot;

has FBfriend and has linkedInContact

has james/FBfriend and not has anne/FBfriend

alexa.com/rank < 50 and fred/comment ~ cool

has reddit.com/score and not has digg.com/diggs

has readwriteweb.com/appeared and not has
techcrunch.com/appeared
More queries
terry/seen > quot;July 1, 2007quot;

has russell/myusername/name and not has
terry/myusername/name

flickr.com/latitude > 52.15 and flickr.com/
latitude < 52.35 and flickr.com/camera/make ~
Sony and has sally/seen

amazon.com/stars > 3 and amazon.com/price <
20 and amazon.com/title ~ chess and peter/
bookrating > 3 sort by amazon.com/
publication-date
Architecture
 Permissions
 Software
 Communications
 Functional
 Storage
 Query processing
 Per box
Twisted

Asynchronous Python libraries

We’re using Python because of Twisted

Event-driven, using deferreds with callbacks

Steep learning curve, documentation is patchy
AMQP
Standardized, high perfomance messaging

Backed by JP Morgan, Novell, MS, Redhat

Fully asynchronous, up to 600K messages/sec

Exchanges, queues, bindings

We released txAMQP
Thrift
Released by Facebook

Serialization of structures

Definition of services

RPC

We released txThrift
Communications

     AMQP
d
d
                  Functional
                                    objects
           http   facade    sets
    apps                           objects
           http   facade    sets namespace

                  coord            namespace
                  coord               attr
                                     attr
                  control            text
                                     text
d
d
           Functional
                            objects
    http   facade   sets
                           objects
    http   facade   sets namespace

           coord           namespace
           coord              attr
                             attr
                             text
                             text
d
d
           Functional
                           objects
    http   facade   sets
                           objects
    http   facade   sets namespace

           coord    amqp namespace
           coord    amqp    attr
                            attr
                            text
                            text
d
d
           Functional
                           objects
    http   facade   sets
                           objects
    http   facade   sets namespace
    xmpp   coord    amqp namespace
    xmpp   coord    amqp    attr
                            attr
                            text
                            text
d
d
            Functional
                            objects
    http    facade   sets
                            objects
    http    facade   sets namespace
    xmpp    coord    amqp namespace
    xmpp    coord    amqp    attr
    other                    attr
    other                    text
                             text
d
d
                   Functional
                                   objects
    load   http    facade   sets
                                   objects
    load   http    facade   sets namespace
    load xmpp      coord    amqp namespace
    load xmpp      coord    amqp    attr
           other                    attr
           other                    text
                                    text
d
d
                   Functional
                                       objects
    load   http    facade     sets
                                     objects
    load   http    facade     sets namespace
    load xmpp      coord     amqp namespace
    load xmpp      coord     amqp    attr
           other            memcache    attr
           other                        text
                            memcache
                                        text
d
d
                   Functional
                                       objects   db
    load   http    facade     sets
                                     objects     db
    load   http    facade     sets namespace     db
    load xmpp      coord     amqp namespace      db
    load xmpp      coord     amqp    attr        db
           other            memcache    attr     db
           other                        text     db
                            memcache
                                        text     db
d
d
                   Functional
                                       objects   db   kv
    load   http    facade     sets
                                     objects     db
    load   http    facade     sets namespace     db
    load xmpp      coord     amqp namespace      db
    load xmpp      coord     amqp    attr        db
           other            memcache    attr     db
           other                        text     db
                            memcache
                                        text     db
Attribute Storage
meg/rating                  tim/books/opinion
object id user id   value   object id user id    value
1234567 667           26    526141     362       nice
6527527 667          188    726483     362        fun
2876281     17       207    635378     362      boring
7628876 667         1225    477582     362       sexy
                            362782     362       long
  PostgreSQL
  Tall tables
  Independent (column store)
  Backed by key/value store (Amazon S3, for now)
Query processing

                       and

digg/date > “Monday”     or

        meg/rating > 5        has tim/seen
Query processing
                       attr   set ops

digg/date > “Monday”


meg/rating > 5


has tim/seen
Attribute affinity
                       attr   set ops

digg/date > “Monday”




meg/rating > 5
has tim/seen
Per box
A controller service, launched on boot

The controller launches new services (processes)

All services talk AMQP as well as pure Thrift

A coordinator brings up new boxes & services

We use Amazon EC2, for now

More Related Content

What's hot

Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphsandyseaborne
 
IPython Notebook as a Unified Data Science Interface for Hadoop
IPython Notebook as a Unified Data Science Interface for HadoopIPython Notebook as a Unified Data Science Interface for Hadoop
IPython Notebook as a Unified Data Science Interface for HadoopDataWorks Summit
 
Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016Holden Karau
 
Scalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedInScalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedInVitaly Gordon
 
SPARQL-DL - Theory & Practice
SPARQL-DL - Theory & PracticeSPARQL-DL - Theory & Practice
SPARQL-DL - Theory & PracticeAdriel Café
 
WebTech Tutorial Querying DBPedia
WebTech Tutorial Querying DBPediaWebTech Tutorial Querying DBPedia
WebTech Tutorial Querying DBPediaKatrien Verbert
 
Debugging PySpark - PyCon US 2018
Debugging PySpark -  PyCon US 2018Debugging PySpark -  PyCon US 2018
Debugging PySpark - PyCon US 2018Holden Karau
 
070517 Jena
070517 Jena070517 Jena
070517 Jenayuhana
 
An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQLOlaf Hartig
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)andyseaborne
 
A fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsA fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsHolden Karau
 
Hadoop with Python
Hadoop with PythonHadoop with Python
Hadoop with PythonDonald Miner
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIsJosef Petrák
 
Apache Drill Workshop
Apache Drill WorkshopApache Drill Workshop
Apache Drill WorkshopCharles Givre
 
F# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformF# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformHoward Mansell
 
Testing and validating distributed systems with Apache Spark and Apache Beam ...
Testing and validating distributed systems with Apache Spark and Apache Beam ...Testing and validating distributed systems with Apache Spark and Apache Beam ...
Testing and validating distributed systems with Apache Spark and Apache Beam ...Holden Karau
 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQLOlaf Hartig
 

What's hot (20)

Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
IPython Notebook as a Unified Data Science Interface for Hadoop
IPython Notebook as a Unified Data Science Interface for HadoopIPython Notebook as a Unified Data Science Interface for Hadoop
IPython Notebook as a Unified Data Science Interface for Hadoop
 
Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016
 
Scalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedInScalable and Flexible Machine Learning With Scala @ LinkedIn
Scalable and Flexible Machine Learning With Scala @ LinkedIn
 
Jena
JenaJena
Jena
 
SPARQL-DL - Theory & Practice
SPARQL-DL - Theory & PracticeSPARQL-DL - Theory & Practice
SPARQL-DL - Theory & Practice
 
WebTech Tutorial Querying DBPedia
WebTech Tutorial Querying DBPediaWebTech Tutorial Querying DBPedia
WebTech Tutorial Querying DBPedia
 
Debugging PySpark - PyCon US 2018
Debugging PySpark -  PyCon US 2018Debugging PySpark -  PyCon US 2018
Debugging PySpark - PyCon US 2018
 
070517 Jena
070517 Jena070517 Jena
070517 Jena
 
4 sw architectures and sparql
4 sw architectures and sparql4 sw architectures and sparql
4 sw architectures and sparql
 
An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQL
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)
 
A fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsA fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFs
 
Hadoop with Python
Hadoop with PythonHadoop with Python
Hadoop with Python
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 
Apache Drill Workshop
Apache Drill WorkshopApache Drill Workshop
Apache Drill Workshop
 
F# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformF# Type Provider for R Statistical Platform
F# Type Provider for R Statistical Platform
 
Testing and validating distributed systems with Apache Spark and Apache Beam ...
Testing and validating distributed systems with Apache Spark and Apache Beam ...Testing and validating distributed systems with Apache Spark and Apache Beam ...
Testing and validating distributed systems with Apache Spark and Apache Beam ...
 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQL
 
Avro intro
Avro introAvro intro
Avro intro
 

Viewers also liked

Task6 -project mgm tool eval--canavan2
Task6 -project mgm tool eval--canavan2Task6 -project mgm tool eval--canavan2
Task6 -project mgm tool eval--canavan2Debra Canavan
 
Final Dissertation Defense: Summary Lecture Phonecasting
Final Dissertation Defense: Summary Lecture PhonecastingFinal Dissertation Defense: Summary Lecture Phonecasting
Final Dissertation Defense: Summary Lecture PhonecastingWesley Fryer
 
Using Media to Support Inquiry (June 2013)
Using Media to Support Inquiry (June 2013)Using Media to Support Inquiry (June 2013)
Using Media to Support Inquiry (June 2013)Wesley Fryer
 
Mapping Media to the Curriculum (June 2012)
Mapping Media to the Curriculum (June 2012)Mapping Media to the Curriculum (June 2012)
Mapping Media to the Curriculum (June 2012)Wesley Fryer
 
Mapping Media to the Common Core (May 2014)
Mapping Media to the Common Core (May 2014)Mapping Media to the Common Core (May 2014)
Mapping Media to the Common Core (May 2014)Wesley Fryer
 
Empowering Students as Digital Witnesses (Part 1 of 2)
Empowering Students as Digital Witnesses (Part 1 of 2)Empowering Students as Digital Witnesses (Part 1 of 2)
Empowering Students as Digital Witnesses (Part 1 of 2)Wesley Fryer
 
Masters Dissertation
Masters DissertationMasters Dissertation
Masters DissertationDarren Gash
 
Designer Social: a cross-channel marketing case study with Dukky
Designer Social: a cross-channel marketing case study with DukkyDesigner Social: a cross-channel marketing case study with Dukky
Designer Social: a cross-channel marketing case study with DukkyDukky
 
2014 festa do sobreiro
2014 festa do sobreiro2014 festa do sobreiro
2014 festa do sobreiroelsaduarte
 
Powerpoint10
Powerpoint10Powerpoint10
Powerpoint10coral91
 
Atv7 Judimari Unid3
Atv7 Judimari Unid3Atv7 Judimari Unid3
Atv7 Judimari Unid3Judimari
 
Joaquim pacheco neves
Joaquim pacheco nevesJoaquim pacheco neves
Joaquim pacheco nevesmalex86
 
Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...
Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...
Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...Jonathan Stalin Delgado Guerrero
 
Presentación1
Presentación1Presentación1
Presentación1gatike
 
Powerpoint08
Powerpoint08Powerpoint08
Powerpoint08coral91
 
Leonie Baker - Portfolio
Leonie Baker - PortfolioLeonie Baker - Portfolio
Leonie Baker - PortfolioLeonie Baker
 
Magazine Advert Analysis - LCD Soundsystem
Magazine Advert Analysis - LCD SoundsystemMagazine Advert Analysis - LCD Soundsystem
Magazine Advert Analysis - LCD SoundsystemSmith_
 

Viewers also liked (20)

Task6 -project mgm tool eval--canavan2
Task6 -project mgm tool eval--canavan2Task6 -project mgm tool eval--canavan2
Task6 -project mgm tool eval--canavan2
 
Final Dissertation Defense: Summary Lecture Phonecasting
Final Dissertation Defense: Summary Lecture PhonecastingFinal Dissertation Defense: Summary Lecture Phonecasting
Final Dissertation Defense: Summary Lecture Phonecasting
 
Using Media to Support Inquiry (June 2013)
Using Media to Support Inquiry (June 2013)Using Media to Support Inquiry (June 2013)
Using Media to Support Inquiry (June 2013)
 
Mapping Media to the Curriculum (June 2012)
Mapping Media to the Curriculum (June 2012)Mapping Media to the Curriculum (June 2012)
Mapping Media to the Curriculum (June 2012)
 
Mapping Media to the Common Core (May 2014)
Mapping Media to the Common Core (May 2014)Mapping Media to the Common Core (May 2014)
Mapping Media to the Common Core (May 2014)
 
Empowering Students as Digital Witnesses (Part 1 of 2)
Empowering Students as Digital Witnesses (Part 1 of 2)Empowering Students as Digital Witnesses (Part 1 of 2)
Empowering Students as Digital Witnesses (Part 1 of 2)
 
Masters Dissertation
Masters DissertationMasters Dissertation
Masters Dissertation
 
Designer Social: a cross-channel marketing case study with Dukky
Designer Social: a cross-channel marketing case study with DukkyDesigner Social: a cross-channel marketing case study with Dukky
Designer Social: a cross-channel marketing case study with Dukky
 
2014 festa do sobreiro
2014 festa do sobreiro2014 festa do sobreiro
2014 festa do sobreiro
 
Powerpoint10
Powerpoint10Powerpoint10
Powerpoint10
 
Atv7 Judimari Unid3
Atv7 Judimari Unid3Atv7 Judimari Unid3
Atv7 Judimari Unid3
 
Joaquim pacheco neves
Joaquim pacheco nevesJoaquim pacheco neves
Joaquim pacheco neves
 
Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...
Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...
Uso de tecnología inalámbrica de bajo costo para mejorar la calidad de vida d...
 
Presentación1
Presentación1Presentación1
Presentación1
 
Variável
VariávelVariável
Variável
 
Emilie
EmilieEmilie
Emilie
 
Powerpoint08
Powerpoint08Powerpoint08
Powerpoint08
 
Leonie Baker - Portfolio
Leonie Baker - PortfolioLeonie Baker - Portfolio
Leonie Baker - Portfolio
 
Magazine Advert Analysis - LCD Soundsystem
Magazine Advert Analysis - LCD SoundsystemMagazine Advert Analysis - LCD Soundsystem
Magazine Advert Analysis - LCD Soundsystem
 
caso de sucesso tupy
caso de sucesso tupycaso de sucesso tupy
caso de sucesso tupy
 

Similar to The design, architecture, and tradeoffs of FluidDB

RDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use itRDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use itJose Luis Lopez Pino
 
The Semantic Web An Introduction
The Semantic Web An IntroductionThe Semantic Web An Introduction
The Semantic Web An Introductionshaouy
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic WebIvan Herman
 
Python tools for testing web services over HTTP
Python tools for testing web services over HTTPPython tools for testing web services over HTTP
Python tools for testing web services over HTTPMykhailo Kolesnyk
 
Ruby Conf Preso
Ruby Conf PresoRuby Conf Preso
Ruby Conf PresoDan Yoder
 
Web scraping with BeautifulSoup, LXML, RegEx and Scrapy
Web scraping with BeautifulSoup, LXML, RegEx and ScrapyWeb scraping with BeautifulSoup, LXML, RegEx and Scrapy
Web scraping with BeautifulSoup, LXML, RegEx and ScrapyLITTINRAJAN
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingTill Rohrmann
 
Allura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeAllura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeRick Copeland
 
Intro to-html-backbone
Intro to-html-backboneIntro to-html-backbone
Intro to-html-backbonezonathen
 
What is WebDAV - uploaded by Murali Krishna Nookella
What is WebDAV - uploaded by Murali Krishna NookellaWhat is WebDAV - uploaded by Murali Krishna Nookella
What is WebDAV - uploaded by Murali Krishna Nookellamuralikrishnanookella
 
Web Development Environments: Choose the best or go with the rest
Web Development Environments:  Choose the best or go with the restWeb Development Environments:  Choose the best or go with the rest
Web Development Environments: Choose the best or go with the restgeorge.james
 
Delphi ORM SOA MVC SQL NoSQL JSON REST mORMot
Delphi ORM SOA MVC SQL NoSQL JSON REST mORMotDelphi ORM SOA MVC SQL NoSQL JSON REST mORMot
Delphi ORM SOA MVC SQL NoSQL JSON REST mORMotArnaud Bouchez
 
How To Crawl Amazon Website Using Python Scrap (1).pptx
How To Crawl Amazon Website Using Python Scrap (1).pptxHow To Crawl Amazon Website Using Python Scrap (1).pptx
How To Crawl Amazon Website Using Python Scrap (1).pptxiwebdatascraping
 

Similar to The design, architecture, and tradeoffs of FluidDB (20)

Avro
AvroAvro
Avro
 
RDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use itRDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use it
 
Ruby on rails
Ruby on railsRuby on rails
Ruby on rails
 
Java scriptforjavadev part2a
Java scriptforjavadev part2aJava scriptforjavadev part2a
Java scriptforjavadev part2a
 
The Semantic Web An Introduction
The Semantic Web An IntroductionThe Semantic Web An Introduction
The Semantic Web An Introduction
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
Php resque
Php resquePhp resque
Php resque
 
Apache Persistence Layers
Apache Persistence LayersApache Persistence Layers
Apache Persistence Layers
 
Python tools for testing web services over HTTP
Python tools for testing web services over HTTPPython tools for testing web services over HTTP
Python tools for testing web services over HTTP
 
Ruby Conf Preso
Ruby Conf PresoRuby Conf Preso
Ruby Conf Preso
 
Web scraping with BeautifulSoup, LXML, RegEx and Scrapy
Web scraping with BeautifulSoup, LXML, RegEx and ScrapyWeb scraping with BeautifulSoup, LXML, RegEx and Scrapy
Web scraping with BeautifulSoup, LXML, RegEx and Scrapy
 
Ruby On Rails
Ruby On RailsRuby On Rails
Ruby On Rails
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
Allura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeAllura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForge
 
Intro to-html-backbone
Intro to-html-backboneIntro to-html-backbone
Intro to-html-backbone
 
Web Topics
Web TopicsWeb Topics
Web Topics
 
What is WebDAV - uploaded by Murali Krishna Nookella
What is WebDAV - uploaded by Murali Krishna NookellaWhat is WebDAV - uploaded by Murali Krishna Nookella
What is WebDAV - uploaded by Murali Krishna Nookella
 
Web Development Environments: Choose the best or go with the rest
Web Development Environments:  Choose the best or go with the restWeb Development Environments:  Choose the best or go with the rest
Web Development Environments: Choose the best or go with the rest
 
Delphi ORM SOA MVC SQL NoSQL JSON REST mORMot
Delphi ORM SOA MVC SQL NoSQL JSON REST mORMotDelphi ORM SOA MVC SQL NoSQL JSON REST mORMot
Delphi ORM SOA MVC SQL NoSQL JSON REST mORMot
 
How To Crawl Amazon Website Using Python Scrap (1).pptx
How To Crawl Amazon Website Using Python Scrap (1).pptxHow To Crawl Amazon Website Using Python Scrap (1).pptx
How To Crawl Amazon Website Using Python Scrap (1).pptx
 

Recently uploaded

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfAnna Loughnan Colquhoun
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum ComputingGDSC PJATK
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 

Recently uploaded (20)

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdf
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum Computing
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 

The design, architecture, and tradeoffs of FluidDB

  • 1. FluidDB Design ⊳ Architecture ⊳ Tradeoffs Terry Jones terry@jon.es @terrycojones
  • 2. Or... How someone who knows nothing about databases started building a database
  • 3. Early days FluidDB is not yet released It’s an experiment Much remains to be done Alpha release real soon now (end of June?)
  • 4. Fluidinfo Founded in the UK in March 2006 Two full-time employees Development in Barcelona Friends, family, & Esther Dyson funded
  • 5. Motivations How humans work with information: Concepts User interface and API difficulties / restrictions Personalization Make the world more writable
  • 6. Concepts Are not owned No formal structure No permission is needed Partial or full disorganization No predefined set of qualities No special central piece of content Easy to reorganize or multiply organize
  • 7. Concepts Are not owned No formal structure No permission is needed Partial or full disorganization No predefined set of qualities No special central piece of content Easy to reorganize or multiply organize To engineer this kind of flexibility, we must rethink control
  • 8. UIs and APIs Where did my information go? Can I extract, re-use, add, delete? Is there an API? What am I allowed to do? Can I search? On what? Special pleading Wouldn't it be cool if...
  • 9. Personalization In our hands vs on our behalf Add anything to anything, and search on it Protect, share, delete as you wish Organize things as you please Freedom of choice in applications
  • 10. Read ⊳ Search ⊳ Write ? The web is readable (c. 1990) And searchable (c. 1995) But not generally writable
  • 11. Read ⊳ Search ⊳ Write ? The web is readable (c. 1990) And searchable (c. 1995) But not generally writable Plus, search is not very interactive: Dull: refine query or ask for more results Can’t change results Can’t organize
  • 12. Why don’t our architectures let us work with information more flexibly?
  • 13. What would it take to build something that did?
  • 14. How can we address all these problems at once ?
  • 15. Alan Kay At PARC we had a slogan: “Point of view is worth 80 IQ points.” It was based on a few things from the past like how smart you had to be in Roman times to multiply two numbers together; only geniuses did it. We haven't gotten any smarter, we've just changed our representation system. We think better generally by inventing better representations; that's something that we as computer scientists recognize as one of the main things that we try to do.
  • 16. Alan Kay At PARC we had a slogan: “Point of view is worth 80 IQ points.” It was based on a few things from the past like how smart you had to be in Roman times to multiply two numbers together; only geniuses did it. We haven't gotten any smarter, we've just changed our representation system. We think better generally by inventing better representations; that's something that we as computer scientists recognize as one of the main things that we try to do.
  • 17. 1 1 2 10 4 100 8 1000 16 10000 32 100000 64 1000000 128 10000000 256 100000000 512 1000000000 1024 + 10000000000 + ?????? 11111111111
  • 22. 8 queens problem Use a poor representation with 264 (281,474,976,710,656) states: Look for a smart algorithm! Or... Use a good representation with 8! (40,320) states and exhaustive search as an algorithm!
  • 23. Objects digg.com/date May 21, 2009 meg/web/rating 6 tim/seen true about http://digg.com/news.html mike/opinion “half-baked nonsense”
  • 24. What else? Objects have no owner, though attributes do No metadata/data duality Unlimited aggregation & combination of information No a priori organization Search, search, search! Dynamic data structures
  • 25. FluidDB Objects composed of attributes with values Attributes in namespaces: joe/rating, amazon.com/price Simple query language Simple API: HTTP, JSON, REST etc. Users, attributes, namespaces each have an object Applications are users too
  • 26. FluidDB Single-instance hosted database (like SimpleDB) Enables sharing between applications, people Distributed storage and query processing
  • 27. Design goals Simple data model Simple permissions model Simple, easily parallelizable, query language Horizontally scalable Fast for common tasks Implementable!
  • 28. Permissions For each action on a namespace or attribute: There’s a policy: ‘open’ or ‘closed’ And a (possibly empty) list of exceptions
  • 29. Permissions attribute or action policy exceptions namespace tim/seen read closed tim, meg mike/opinion update open mike/ create closed mike meg/rating see open meg/rating read closed meg
  • 30. Query language Numeric: attribute value (=, <, etc.) Textual: attribute text match Presence: has attribute Grouping/logic: (...), and, or, not
  • 31. Query language Numeric: attribute value (=, <, etc.) Textual: attribute text match Presence: has attribute Grouping/logic: (...), and, or, not Designed to bring back object ids
  • 32. Demo
  • 33. Data buffet username seen, lastvisited, rating, goingtoread, comment username me, FBfriend, linkedInContact, met, family username/myusername name, password flickr.com longitude, latitude, owner, date, camera/make username/flickr title, description VCspotter fredwilson, bradburnham nasdaq.com name, symbol, type, outstanding, value, price username/stocks shares, date google.com pagerank tracks album, artist, name, year username/music count, favorite, lastPlay, stars, bestOf2007
  • 34. Data buffet 2 email messageId, fromId, toId username/email from, to, subject, date alexa.com rank digg.com title, description, date, diggs mahalo.com appeared, category readwriteweb.com appeared reddit.com date, score techcrunch.com appeared, URI attribute description, name, path namespace description, name, path
  • 35. Example queries terry/rating > 5 and has reddit.com/score has goingtoread and seen > quot;January 1, 2008quot; has FBfriend and has linkedInContact has james/FBfriend and not has anne/FBfriend alexa.com/rank < 50 and fred/comment ~ cool has reddit.com/score and not has digg.com/diggs has readwriteweb.com/appeared and not has techcrunch.com/appeared
  • 36. More queries terry/seen > quot;July 1, 2007quot; has russell/myusername/name and not has terry/myusername/name flickr.com/latitude > 52.15 and flickr.com/ latitude < 52.35 and flickr.com/camera/make ~ Sony and has sally/seen amazon.com/stars > 3 and amazon.com/price < 20 and amazon.com/title ~ chess and peter/ bookrating > 3 sort by amazon.com/ publication-date
  • 37. Architecture Permissions Software Communications Functional Storage Query processing Per box
  • 38. Twisted Asynchronous Python libraries We’re using Python because of Twisted Event-driven, using deferreds with callbacks Steep learning curve, documentation is patchy
  • 39. AMQP Standardized, high perfomance messaging Backed by JP Morgan, Novell, MS, Redhat Fully asynchronous, up to 600K messages/sec Exchanges, queues, bindings We released txAMQP
  • 40. Thrift Released by Facebook Serialization of structures Definition of services RPC We released txThrift
  • 42. d d Functional objects http facade sets apps objects http facade sets namespace coord namespace coord attr attr control text text
  • 43. d d Functional objects http facade sets objects http facade sets namespace coord namespace coord attr attr text text
  • 44. d d Functional objects http facade sets objects http facade sets namespace coord amqp namespace coord amqp attr attr text text
  • 45. d d Functional objects http facade sets objects http facade sets namespace xmpp coord amqp namespace xmpp coord amqp attr attr text text
  • 46. d d Functional objects http facade sets objects http facade sets namespace xmpp coord amqp namespace xmpp coord amqp attr other attr other text text
  • 47. d d Functional objects load http facade sets objects load http facade sets namespace load xmpp coord amqp namespace load xmpp coord amqp attr other attr other text text
  • 48. d d Functional objects load http facade sets objects load http facade sets namespace load xmpp coord amqp namespace load xmpp coord amqp attr other memcache attr other text memcache text
  • 49. d d Functional objects db load http facade sets objects db load http facade sets namespace db load xmpp coord amqp namespace db load xmpp coord amqp attr db other memcache attr db other text db memcache text db
  • 50. d d Functional objects db kv load http facade sets objects db load http facade sets namespace db load xmpp coord amqp namespace db load xmpp coord amqp attr db other memcache attr db other text db memcache text db
  • 51. Attribute Storage meg/rating tim/books/opinion object id user id value object id user id value 1234567 667 26 526141 362 nice 6527527 667 188 726483 362 fun 2876281 17 207 635378 362 boring 7628876 667 1225 477582 362 sexy 362782 362 long PostgreSQL Tall tables Independent (column store) Backed by key/value store (Amazon S3, for now)
  • 52. Query processing and digg/date > “Monday” or meg/rating > 5 has tim/seen
  • 53. Query processing attr set ops digg/date > “Monday” meg/rating > 5 has tim/seen
  • 54. Attribute affinity attr set ops digg/date > “Monday” meg/rating > 5 has tim/seen
  • 55. Per box A controller service, launched on boot The controller launches new services (processes) All services talk AMQP as well as pure Thrift A coordinator brings up new boxes & services We use Amazon EC2, for now

Editor's Notes