http://streamreasoning.org




Order Matters!
Harnessing a World of Orderings
for Reasoning over Massive Data
Emanuele Della Valle
emanuele.dellavalle@polimi.it - http://emanueledellavalle.org
Acknowledgements
§  This talk presents the content of a joint paper with
    Stefan Schlobach (b), Markus Krötzsch (c), Alessandro Bozzon (a),
    Stefano Ceri (a), and Ian Horrocks (c), to appear in SWJ
    (a) Politecnico di Milano
    (b) Vrije Universiteit Amsterdam
    (c) University of Oxford


§  I also want to thank Frank van Harmelen (b) for his important
    contribution to the discussion, and Tony Lee (Saltlux), Andreas
    Schreiber (DLR) and Achim Basermann (DLR) for the valuable
    discussion on concrete examples of problems that require
    order-aware reasoning. Moreover, I want to thank Sara Magliacane (b)
    for her work on SPARQL-RANK and the slides I use in this
    presentation, and Marco Balduini (a), Davide Barbieri (a), and
    Daniele Braga (a) for their work on C-SPARQL
§  Check out the paper:
    •  http://www.semantic-web-journal.net/content/order-matters-harnessing-world-orderings-reasoning-over-massive-data
 Trento, Italy, 6.11.2012       Emanuele Della Valle - http://streamreasoning.org/
References
§  The numbers in square brackets refer to references
    in the SWJ paper
    •  http://www.semantic-web-journal.net/content/order-matters-harnessing-world-orderings-reasoning-over-massive-data

§  A short selection of references to my papers is
    available at the end of the presentation.




The problem, three use cases, and …
§  More and more applications require real-time
    processing of massive, dynamically generated data
    •  Space Situational Awareness
    •  Jet Engine Design
    •  Intelligent Surveillance




The Problem
Use case: space junk




[Source: http://wordlesstech.com/2011/03/26/space-junk/ ]

The Problem
Use case: jet engine design




[Source: http://www.sae.org/mags/aem/10018/ ]

The Problem
 Use case: intelligent surveillance




[Source: http://youtu.be/I3iDBfB_ZC0 ]



The Problem
… and four common features!
§  their data is ordered,
    •  naturally ordered by recency, proximity, etc.
    •  intrinsically ordered by precision, popularity, provenance,
       certainty, trust, etc.
    •  and, in any case, it is explicitly sortable through attribute
       values
§  the answers are also required to come
    in an ordered fashion
    •  engineers surveying a satellite orbit need to know the largest
       pieces of debris in closest proximity with maximal certainty,
       measured with highest precision, etc.
§  they require immediate answers at runtime
    •  flight paths have to be adapted once an object on a collision
       course is detected
§  and, they require inference
    •  rich ontological models describing complex domain
       knowledge are often used to pose the queries and to interpret
       the results

The Problem
Performance targets
[Figure: answer quality at time t plotted against computation time t. The target is fully correct answers; the curves contrast the desired situation with the current situation, marking real-time behaviour and the maximum runtime.]
Note: completeness may not be necessary if all relevant answers are found
The Problem
A running example
§  Imagine a system which
    •  listens to all micro-posts that are published,
    •  knows the geographic location of social media
       users,
    •  has the ability of detecting the topic of each micro-
       post, and
    •  has modelled relationships between topics in an
       expressive ontological language

§  Let us suppose that each of us asks a query like
    the following to such a system:
    •  Which users of social media, currently leading
       popular discussions on fashion-related topics, are
       closest to my current location? What are they
       saying about the shopping district nearby?

The solution space
[Figure: the solution space. Vertical axis, types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations. Horizontal axis, types of reasoning: no reasoning, data-driven, query-driven, combinations. A third axis covers approximation and parallelisation.]
The solution space
no ordering, no reasoning
[Figure: the solution space (types of orders vs. types of reasoning), highlighting the cell with no ordering and no reasoning.]
The solution space
no ordering, no reasoning
§  Most of the big data solutions currently
    on the market
    •    BSP (Bulk Synchronous Parallel)
    •    PRAM (Parallel Random Access Machine)
    •    PGAS (Partitioned Global Address Space)
    •    Map-Reduce implementations
    •    and data-centric workflow systems based on them

§  Some (e.g., Hive and Pig) allow the specification of
    ordering constraints, but no specific optimisation is
    provided for top-k or streaming queries
§  W.r.t. the running example
    •  Good performance and scalability
    •  Limited ability to harness orderings
    •  Missing inference capability


The solution space
Order aware data management
[Figure: the solution space (types of orders vs. types of reasoning), with order-aware data management spanning the types of orders in the "no reasoning" column.]
The solution space
Order aware data management
§  When treating massive data order matters!
    •  Data as a sortable entity
    •  where we can enforce orderings easily and logically
    •  e.g., order by
          –  sortable literals
          –  popularity
          –  uncertainty
          –  trust
    •  streaming algorithms: most relevant answers first



§  If N is the size of the input, a problem is considered to be
    “well-solved” if a streaming algorithm exists which
    requires at most O(poly(log N)) space and time [31]
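
To make the notion of a streaming algorithm concrete, here is a minimal sketch of the Misra-Gries frequent-items summary. It is a well-known technique chosen for illustration, not one taken from the talk or the SWJ paper; the topic names are hypothetical. It reads the stream once and keeps at most k - 1 counters, regardless of the stream length.

```python
def misra_gries(stream, k):
    """Approximate frequent items using at most k - 1 counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical stream of detected micro-post topics.
topics = ["fashion"] * 6 + ["sports"] * 3 + ["food", "cars", "music"]
summary = misra_gries(topics, k=3)
```

Any item occurring more than N/k times is guaranteed to survive in the summary, and each reported count underestimates the true one by at most N/k, so a larger k buys more accuracy at the price of more space.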

The solution space
Order aware data management and approximation

§  Approximate streaming algorithms can outperform
    classical, data-bound approaches to this problem by
    several orders of magnitude [6,14].
§  Such approximations can be asymptotic, so that
    arbitrary accuracy can be achieved [6].

[Figure: answer accuracy at computation time t plotted against computation time t, with the level of fully correct answers marked.]

The solution space
Harnessing natural orderings
[Figure: the solution space (types of orders vs. types of reasoning), highlighting natural orderings.]
The solution space
Harnessing natural orderings
§  Continuous queries are registered over streams that, in most
    cases, are observed through windows

[Figure: input streams (unbounded and time-varying) are observed through a window by a registered continuous query, which produces streams of answers.]

§  Assumption: recent information is more relevant, as it describes
    the current state of a dynamic system
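
A time-based sliding window can be sketched in a few lines. This is only an illustration under simple assumptions (timestamps in seconds, arriving in non-decreasing order); the class name `TimeWindow` and the post identifiers are hypothetical and do not come from any DSMS.

```python
from collections import deque


class TimeWindow:
    """Keep only the items whose timestamp lies within the last `width` seconds."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, payload), oldest first

    def push(self, timestamp, payload):
        """Add an item, evict expired ones, and return the window content."""
        self.items.append((timestamp, payload))
        while self.items and self.items[0][0] <= timestamp - self.width:
            self.items.popleft()
        return [p for _, p in self.items]


window = TimeWindow(width=10)
window.push(0, "post1")
window.push(4, "post2")
current = window.push(12, "post3")  # post1 has expired by now
```

A continuous query would be re-evaluated over `current` each time the window changes, instead of once over a stored dataset.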
The solution space
Harnessing natural orderings
§  The nature of streams requires a
    paradigmatic change*
    •  from persistent data
          –  to be stored and queried on demand
          –  a.k.a. one time semantics
    •  to transient data
          –  to be consumed on the fly by continuous queries
          –  a.k.a. continuous semantics




* This paradigmatic change first arose in the DB community [31]
The solution space
Harnessing natural orderings
§  Two types of solutions
    •    Data Stream Management Systems (DSMS)
    •    Complex Event Processors (CEP)
§  Research Prototypes
    •    Amazon/Cougar (Cornell) – sensors
    •    Aurora (Brown/MIT) – sensor monitoring, dataflow
    •    Gigascope: AT&T Labs – Network Monitoring
    •    Hancock (AT&T) – Telecom streams
    •    Niagara (OGI/Wisconsin) – Internet DBs & XML
    •    OpenCQ (Georgia) – triggers, view maintenance
    •    Stream (Stanford) – general-purpose DSMS
    •    Stream Mill (UCLA) - power & extensibility
    •    Tapestry (Xerox) – publish/subscribe filtering
    •    Telegraph (Berkeley) – adaptive engine for sensors
    •    Tribeca (Bellcore) – network monitoring
§  High-tech startups
    •    StreamBase, Coral8, Apama, Truviso
§  Major DBMS vendors are all adding stream extensions as well
    •    IBM InfoSphere Streams
    •    Microsoft StreamInsight
    •    Oracle CEP



The solution space
Harnessing natural orderings
§  DSMSs are optimised for the simplest portion of the
    query in our running example
    •  retrieve the micro-posts that have been posted recently




The solution space
Harnessing other types of orders
[Figure: the solution space (types of orders vs. types of reasoning), highlighting the other types of orders: cheap to enforce and expensive to enforce.]
The solution space
Harnessing other types of orders
§  W.r.t. the running example, solutions studied in these
    two areas make it possible to efficiently
    •  retrieve nearby shops that are discussed by popular social
       media users.

§  This is a typical top-k query
    •  a limited number of results k
    •  ordered by a scoring function
    •  that combines several criteria
          –  e.g., nearby and most discussed
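
The three ingredients of a top-k query listed above can be sketched as follows. The shop names, the scores, and the equal weighting are all hypothetical; the point is only to show a limit k, a scoring function, and several criteria combined.

```python
import heapq

# Hypothetical candidates: (shop, proximity score, discussion score), both in [0, 1].
candidates = [
    ("shopA", 0.9, 0.2),
    ("shopB", 0.6, 0.8),
    ("shopC", 0.3, 0.9),
    ("shopD", 0.8, 0.7),
]


def score(proximity, discussion, w=0.5):
    """Monotone scoring function: a weighted sum of the two criteria."""
    return w * proximity + (1 - w) * discussion


def top_k(items, k):
    """Return the k items with the highest combined score."""
    return heapq.nlargest(k, items, key=lambda c: score(c[1], c[2]))


best = [name for name, _, _ in top_k(candidates, k=2)]
```

This version scores every candidate before picking the best k, which is exactly the cost that order-aware evaluation tries to avoid.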




The solution space - Harnessing other types of orders
Treating order as a first class citizen
§  Traditional query evaluation schema: materialize then sort
    •  join shops and social media users through "discussed"
    •  materialize the join results and order them all by proximity of the
       shop to the issuer and popularity of the social media user
    •  limit to K
§  Order-aware query evaluation schema: split and interleave
    •  order shops by proximity to the issuer and social media users
       by popularity
    •  join them through "discussed"
    •  limit to K
[Figure: the two query plans side by side, with edges annotated by intermediate result sizes ([10s], [1,000s], [100,0000s]).]
The solution space - Harnessing other types of orders

The split-and-interleave scheme
§  State-of-the-art
    •  Literature in RDBMS (for a survey see [35]) presents the
       split-and-interleave scheme:
          1.  Split the evaluation of the scoring function
              into the evaluation of the single criteria
          2.  Interleave them with other operators
          3.  Use partial orders to construct incrementally the final order

§  Standard assumptions:
    •    Monotone increasing scoring function
    •    Sorted access for each criterion
    •    Random access, when possible, is expensive
    •    No uncertainty in the scores
    •    No uncertainty in the scoring function
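
Under these assumptions, one classic instance of rank-aware evaluation is Fagin's threshold algorithm. The sketch below is an illustration of that textbook algorithm, not the specific scheme surveyed in [35]; the score tables are hypothetical, and sorting in memory merely simulates sources that offer sorted access.

```python
import heapq


def threshold_top_k(lists, k, combine=sum):
    """Threshold algorithm over per-criterion score tables.

    `lists` holds one dict (object -> score) per criterion, all over the
    same objects. `combine` takes an iterable of scores and must be
    monotone increasing. Returns (top-k as [(score, object)], rounds used).
    """
    # Simulated sorted access: objects by decreasing score, per criterion.
    sorted_lists = [sorted(d.items(), key=lambda kv: -kv[1]) for d in lists]
    seen = {}
    depth = 0
    while depth < len(sorted_lists[0]):
        last_scores = []
        for sl in sorted_lists:
            obj, s = sl[depth]
            last_scores.append(s)
            if obj not in seen:
                # Random access: look up the object under every criterion.
                seen[obj] = combine(d[obj] for d in lists)
        depth += 1
        top = heapq.nlargest(k, ((s, o) for o, s in seen.items()))
        # No unseen object can score above the threshold, so we may stop.
        if len(top) == k and top[-1][0] >= combine(last_scores):
            return top, depth
    return heapq.nlargest(k, ((s, o) for o, s in seen.items())), depth


# Hypothetical integer scores on a 0-10 scale.
proximity = {"shopA": 9, "shopB": 6, "shopC": 3, "shopD": 8}
discussion = {"shopA": 2, "shopB": 8, "shopC": 9, "shopD": 7}
top, rounds = threshold_top_k([proximity, discussion], k=2)
```

Here the algorithm stops after three of the four possible rounds; on larger inputs with small k the saving is what makes the scheme attractive.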




The solution space - Harnessing other types of orders
Be aware, it’s a trade-off




[Figure: plot with orders of magnitude on the vertical axis.]
 NOTE: Typically users are interested in 1 <= k <= 100
The solution space
Harnessing all types of orders together
[Figure: the solution space (types of orders vs. types of reasoning), highlighting combinations of all types of orders.]
The solution space
Harnessing all types of orders together
§  W.r.t. the running example, solutions studied in this
    area make it possible to efficiently
    •  retrieve the shops nearby that popular social media users
       are currently posting positively about.

§  This is a typical continuous monitoring of top-k
    queries over sliding windows [45]
§  A very promising and little explored research area in
    data management
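
As a rough picture of what continuous top-k monitoring over a sliding window means, the sketch below recomputes the top-k from scratch on every arrival. This naive baseline is an assumption made for illustration; the techniques in [45] are incremental and far more efficient. The stream content is hypothetical.

```python
import heapq
from collections import deque


def continuous_top_k(stream, width, k):
    """Yield (timestamp, top-k payloads by score) after each arrival.

    `stream` yields (timestamp, score, payload) with non-decreasing timestamps.
    """
    window = deque()
    for timestamp, score, payload in stream:
        window.append((timestamp, score, payload))
        # Evict items that have slid out of the window.
        while window[0][0] <= timestamp - width:
            window.popleft()
        best = heapq.nlargest(k, window, key=lambda item: item[1])
        yield timestamp, [p for _, _, p in best]


# Hypothetical stream: (time, score of the mention, shop mentioned).
posts = [(0, 5, "shopA"), (3, 9, "shopB"), (6, 7, "shopC"), (14, 4, "shopD")]
results = list(continuous_top_k(posts, width=10, k=2))
```

At time 14 both shopA and shopB have expired, so shopD enters the top-2 even though its score is the lowest seen so far.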




The solution space
Wrapping up order-aware data management
§  Two parts of the query in the running example
    remain difficult to express:
    •  knowing which topics are related to fashion
          –  requires at least a taxonomy of fashion-related topics
    •  computing which recent discussions on social media
       are popular
          –  requires computing the transitive closure of the discussion

§  Both are
    •  difficult to model without an expressive ontological
       language (such as OWL 2) and
    •  both require complex algorithms that an ontology
       reasoner can handle natively

§  Moreover, order-aware data management
    techniques do not cope with heterogeneity
    •  i.e., data should be translated into one common representation
       before order-aware data management techniques can be
       applied.
The solution space
[Figure: the solution space (types of orders vs. types of reasoning), with scalable reasoning spanning the types of reasoning in the "no ordering" row.]
The Solution Space
Scalable Reasoning
§  Why?
    •  handling heterogeneity in the input data through
       ontology-based information integration

§  In the running example,
    •  ontological background knowledge can be used to model
       relationships between more specific and more general topics
       of interest, which can be used to infer which concrete topics
       are related to fashion

§  How?
    •  Data-driven methods
          –  Scalable methods available in the state-of-the-art
    •  Query-driven methods
          –  research trend, implementations are appearing
    •  Combinations of the previous two
          –  mostly theoretical results

The Solution Space – Scalable Reasoning
Data-driven
§  Ontological Language:
    •  OWL 2 RL
          –  aimed at applications that require scalable reasoning without sacrificing
             too much expressive power
          –  http://www.w3.org/TR/owl2-profiles/#OWL_2_RL

§  Reasoning approach
    •  Forward chaining: from asserted data to all possible entailments
§  Pros: low query latency
§  Cons: it does not take the actual information need into account
§  Implementations
    •  OWLIM, Virtuoso, AllegroGraph, and OntoBroker
§  Research trend
    •  Parallelization using Map-Reduce as a main paradigm
          –  e.g., [33,65] for OWL 2 RL or a fragment thereof [32,64,66,38]
    •  Applying similar techniques to more expressive fragments of OWL
          –  e.g., ELK reasoner for OWL EL [37]
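
The data-driven approach can be pictured as a naive materialization loop. The sketch below applies only two RDFS-style subclass rules, a tiny fragment of what OWL 2 RL covers, and runs them to a fixpoint; the topic taxonomy is hypothetical and the quadratic join is for readability, not performance.

```python
def materialize(triples):
    """Forward chaining for two subclass rules, run until nothing new appears.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A), (A subClassOf B)       => (x type B)
    """
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if o != s2 or p2 != "subClassOf":
                    continue
                if p in ("subClassOf", "type"):
                    new.add((s, p, o2))
        if not new <= closure:
            closure |= new
            changed = True
    return closure


# Hypothetical taxonomy of fashion-related topics.
facts = {
    ("Handbags", "subClassOf", "Accessories"),
    ("Accessories", "subClassOf", "Fashion"),
    ("topic42", "type", "Handbags"),
}
entailed = materialize(facts)
```

After materialization, asking whether `topic42` is fashion-related is a plain lookup, which is where the low query latency comes from; the price is that all entailments are computed whether or not any query needs them.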

The Solution Space – Scalable Reasoning
Query-driven
§  Ontological Language
    •  OWL 2 QL
          –  designed for query answering in LOGSPACE w.r.t. the size of the data,
             with the expressivity of conceptual models (e.g., UML class diagrams)
          –  http://www.w3.org/TR/owl2-profiles/#OWL_2_QL

§  Reasoning approach
    •  Backward chaining: from query to asserted facts
    •  Query rewriting: from ontological query to a set of SQL queries
§  Pros: limit the search space by considering the actual query
§  Cons: the number of rewritings can grow exponentially
§  Implementations
    •  QuOnto, Owlgres, and Requiem
§  Research trend
    •  Extend query rewriting for more expressive ontology languages
          –    e.g., Datalog± [27,4]
    •  Parallelization using Map-Reduce
          –    e.g., QueryPIE
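
Query rewriting can be illustrated on the simplest possible case: an atomic class query rewritten along the subclass hierarchy into a union of SQL queries. This sketch handles subclass axioms only, which is far less than OWL 2 QL supports, and the table and column names (`type_assertions`, `subject`, `class`) are hypothetical.

```python
def subclasses_of(cls, sub_class_of):
    """All classes whose instances are also instances of `cls`, including `cls`."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in sub_class_of:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def rewrite(cls, sub_class_of):
    """Rewrite the query `?x type cls` into one SQL query per relevant class."""
    return [
        f"SELECT subject FROM type_assertions WHERE class = '{c}'"
        for c in sorted(subclasses_of(cls, sub_class_of))
    ]


# Hypothetical ontology as (subclass, superclass) pairs.
ontology = [("Handbags", "Accessories"), ("Accessories", "Fashion")]
queries = rewrite("Fashion", ontology)
```

The data is never touched during rewriting; the ontology is compiled into the query, and the resulting union is evaluated over the asserted facts only.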
The Solution Space – Scalable Reasoning

Combinations
§  Ontological Language
    •    Subject to research

§  Reasoning approach
    •    combine the advantages of data- and query-driven approaches

§  State-of-the-art
    •    Magic Sets technique [1]

§  Recent theoretical results
    •    for a limited fragment of OWL EL [44]
    •    for existential rules [4]




The Solution Space – Scalable Reasoning

Approximation
§  Many rule-based systems compute only part of the
    entailed consequences by employing a set of rules that cannot
    derive all results
    •  E.g., Jena, Sesame, OWLIM, and Virtuoso
§  A typical approach is to approximate the input information
    by restricting to a simpler ontology language that is then
    processed with a more efficient, sound and complete algorithm
    •    e.g., TrOWL [48] and SCREECH [62].

§  Approximate reasoning is used as a sub-method in many
    sound and complete reasoners,
    •    e.g., the OWL reasoner HermiT first computes the syntactically told class
         hierarchy before using more complex algorithms for a complete subsumption
         check.

§  None of the above, however, deals with or takes advantage of
    orderings of any kind.
§  A number of interesting research challenges thus remain open.


The solution space
Wrap up of the talk so far
[Figure: the solution space (types of orders vs. types of reasoning), with order-aware data management along the "no reasoning" column and scalable reasoning along the "no ordering" row.]
The solution space
Reasoning with streaming algorithms
[Figure: the solution space (types of orders vs. types of reasoning), with order-aware data management and scalable reasoning as before, plus three new areas: stream reasoning (natural orders), top-k reasoning (orders that are cheap or expensive to enforce), and order-aware reasoning (combinations).]
The solution space
Reasoning with streaming algorithms
[Figure: the solution space (types of orders vs. types of reasoning), highlighting stream reasoning on natural orders, alongside order-aware data management and scalable reasoning.]
The solution space
Stream Reasoning [IEEE-IS2009]
§  W.r.t. the running example, solutions studied in this
    area make it possible to efficiently
    •  compute which recent discussions on social media are
       popular

§  For instance, how many micro-posts discuss (by either
    replying to or retweeting) my tweet?


[Figure: a graph of micro-posts t1 to t8 connected by reply and retweet edges, each also labelled discuss; counting transitively, the answer is 7.]
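
The counting question above boils down to a transitive closure over the discuss relation. The sketch below uses a hypothetical set of edges (the exact edges of the figure are not reproduced here) in which every reply or retweet is treated as a discuss edge, and counts the reachable posts with a breadth-first search.

```python
from collections import deque

# Hypothetical edges (child, parent): child replies to or retweets parent,
# and both relations imply that child discusses parent.
discusses = [
    ("t2", "t1"), ("t3", "t1"), ("t4", "t2"), ("t5", "t3"),
    ("t6", "t3"), ("t7", "t4"), ("t8", "t5"),
]


def count_discussing(post, edges):
    """Count posts that discuss `post` directly or transitively."""
    children = {}
    for child, parent in edges:
        children.setdefault(parent, []).append(child)
    seen, queue = set(), deque([post])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return len(seen)


total = count_discussing("t1", discusses)
```

A stream reasoner has to keep such a count up to date as new replies and retweets arrive and old ones leave the window, rather than recomputing it from scratch as done here.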
The solution space
Stream Reasoning features

[Table: features compared across what traditional data processing, stream processing, and automatic reasoning offer, and what stream reasoning aims at.]
Features:
•  Processing streams
•  Handling large datasets
•  Reactivity (real-time)
•  Expressing fine-grained queries
•  Capturing knowledge
•  Access to persistent data
The solution space
Stream Reasoning definition
§  Making sense [IEEE-IS2010]
     •  in real time
     •  of multiple, heterogeneous, gigantic and inevitably noisy
        data streams
     •  in order to support the decision process of extremely
        large numbers of concurrent users




§  Note: making sense of streams necessarily requires processing
    them against rich background knowledge, an unsolved problem
    in databases
The solution space
Architecture of a Stream Reasoner
§  Continuous reasoning tasks are registered over
    streams that, in most cases, are observed
    through windows

[Figure: input streams are observed through a window by registered continuous reasoning tasks, which produce streams of answers.]
  


The solution space
Stream Reasoning PoliMi’s Achievements
§  RDF Stream data type [WWW2009]
    •  (virtually) represent heterogeneous data streams
§  C-SPARQL query language [WWW2009]
    •  express fine-grained continuous queries
    •  It is “compiled down” to keep performance high
§  Incremental RDFS++ Reasoning [ESWC2010]
    •  allows for domain knowledge exploitation
§  C-SPARQL Engine [EDBT2010]
    •  Fully operational prototype
    •  Deployed in award winning applications (e.g., Bottari [JWS2012])




The solution space
Stream Reasoning PoliMi’s Achievements
[Figure: the solution space (types of orders vs. types of reasoning), with order-aware data management and scalable reasoning marked.]
The solution space – Stream Reasoning “alla PoliMi”

RDF Stream

§  RDF Stream Data Type
     •  Ordered sequence of pairs, where each pair is made of an
        RDF triple and its timestamp




     §  Timestamps are not required to be unique, but they must be
         non-decreasing

§  E.g.,
     (<:Alice       :posts                       :post1 >,        2010-02-12T13:34:41)
     (<:post1       :talksAboutPositively        :LaScala>,       2010-02-12T13:34:41)
     (<:Bob         :posts                       :post2 >,        2010-02-12T13:36:28)
     (<:post2       :talksAboutNegatively        :Duomo>,         2010-02-12T13:36:28)
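
The data type can be mirrored in a few lines of code. The class and function names below are hypothetical and not part of the C-SPARQL implementation; the sketch only encodes the definition above, namely timestamped triples whose timestamps may repeat but never decrease.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimestampedTriple:
    subject: str
    predicate: str
    obj: str
    timestamp: datetime


def is_valid_rdf_stream(items):
    """True if timestamps never decrease (equal timestamps are allowed)."""
    return all(a.timestamp <= b.timestamp for a, b in zip(items, items[1:]))


t1 = datetime(2010, 2, 12, 13, 34, 41)
t2 = datetime(2010, 2, 12, 13, 36, 28)
stream = [
    TimestampedTriple(":Alice", ":posts", ":post1", t1),
    TimestampedTriple(":post1", ":talksAboutPositively", ":LaScala", t1),
    TimestampedTriple(":Bob", ":posts", ":post2", t2),
    TimestampedTriple(":post2", ":talksAboutNegatively", ":Duomo", t2),
]
```

Two triples sharing a timestamp, as in the example, can be read as facts that became true at the same instant.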




MEMO: SPARQL




The solution space – Stream Reasoning “alla PoliMi”
Where C-SPARQL Extends SPARQL




The solution space – Stream Reasoning “alla PoliMi”
An Example of C-SPARQL Query
    Who are the opinion makers? i.e., the users who are likely to
      influence the behavior of other users who follow them

    REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
    CONSTRUCT { ?opinionMaker sd:about ?resource }
    FROM STREAM <http://streamingsocialdata.org/interactions>
       [RANGE 30m STEP 5m]
    WHERE {
                ?opinionMaker ?opinion ?resource .
                ?follower sioc:follows ?opinionMaker.
                ?follower ?opinion ?resource.
                FILTER ( cs:timestamp(?follower) >
                          cs:timestamp(?opinionMaker)
                            && ?opinion != sd:accesses )
    }
    HAVING ( COUNT(DISTINCT ?follower) > 3 )
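
    To see what the query computes, the following sketch emulates it over the content of a single window. It is a plain re-implementation under stated assumptions, not how the C-SPARQL engine evaluates the query: the results are grouped by opinion maker and resource, the follow relation is passed in separately, and the `sd:likes` opinion and the user names are hypothetical.

```python
from collections import defaultdict


def opinion_makers(window, follows, min_followers=3):
    """Emulate the query over one window of (timestamp, user, opinion, resource).

    `follows` is a set of (follower, followed) pairs. A user qualifies for a
    resource when more than `min_followers` distinct followers express the
    same opinion on it strictly later than the user did.
    """
    supporters = defaultdict(set)
    for t_maker, maker, opinion, resource in window:
        if opinion == "sd:accesses":
            continue  # mirrors the FILTER on ?opinion
        for t_follower, follower, opinion2, resource2 in window:
            if (
                (follower, maker) in follows
                and opinion2 == opinion
                and resource2 == resource
                and t_follower > t_maker
            ):
                supporters[(maker, resource)].add(follower)
    return sorted(
        pair for pair, users in supporters.items() if len(users) > min_followers
    )


# Hypothetical window content: alice likes LaScala, then four followers do too.
window = [(0, "alice", "sd:likes", "LaScala")] + [
    (i, f"user{i}", "sd:likes", "LaScala") for i in range(1, 5)
]
follows = {(f"user{i}", "alice") for i in range(1, 5)}
makers = opinion_makers(window, follows)
```

    In the registered query this computation would be repeated every 5 minutes over the last 30 minutes of the stream.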



The solution space – Stream Reasoning “alla PoliMi”
An Example of C-SPARQL Query
    Who are the opinion makers? i.e., the users who are likely to
      influence the behavior of other users who follow them

    REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
    CONSTRUCT { ?opinionMaker sd:about ?resource }
    FROM STREAM <http://streamingsocialdata.org/interactions>
       [RANGE 30m STEP 5m]
    WHERE {
                ?opinionMaker ?opinion ?resource .
                ?follower sioc:follows ?opinionMaker.
                ?follower ?opinion ?resource.
                FILTER ( cs:timestamp(?follower) >
                          cs:timestamp(?opinionMaker)
                            && ?opinion != sd:accesses )
    }
    HAVING ( COUNT(DISTINCT ?follower) > 3 )

    Callouts on the query:
     •  REGISTER STREAM ... COMPUTED EVERY 5m: query registration (for continuous execution)
     •  RDF Stream added as new output format
     •  FROM STREAM clause
     •  [RANGE 30m STEP 5m]: WINDOW
     •  cs:timestamp: built-in to access timestamps
     •  HAVING ( COUNT ... ): aggregates as in SPARQL 1.1



The solution space – Stream Reasoning “alla PoliMi”
Efficiency of C-SPARQL Query Evaluation
§  Window-based selection in C-SPARQL outperforms the
    standard FILTER-based selection




The solution space – Stream Reasoning “alla PoliMi”
Efficiency of C-SPARQL Query Evaluation

§  The C-SPARQL algebra allows filters and projections to be pushed down




The solution space – Stream Reasoning “alla PoliMi”
High Throughputs of C-SPARQL Engine




The solution space – Stream Reasoning “alla PoliMi”
Incremental Materialization evaluation
§  Baseline: re-computing the materialization from scratch
§  State of the art: incremental maintenance of materialized views
§  PoliMi’s incremental stream approach [ESWC2010]

[Chart; x-axis: % of the materialization changed when the window slides]
The solution space – Stream Reasoning “alla PoliMi”
Incremental Maintenance and Query Latency
§  Comparison of the average time needed to answer
    a C-SPARQL query using
    •    a backward reasoner
    •    the naive approach of re-computing the materialization
    •    PoliMi’s incremental-stream approach

    Time (ms)          backward reasoning   naive approach   incremental-stream
    query                     5.82               1.61               1.61
    materialization           0                 15.91               0.28
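
To compare the three approaches on a single number, the two components can be added up. The short Python sketch below does that with the figures from this slide. It assumes the chart is a stacked bar chart, so that the time per answer is the sum of the query time and the materialization time. The dictionary layout is purely illustrative.

```python
# Average times in milliseconds, copied from the chart on this slide.
timings_ms = {
    "backward reasoning": {"query": 5.82, "materialization": 0.0},
    "naive approach": {"query": 1.61, "materialization": 15.91},
    "incremental-stream": {"query": 1.61, "materialization": 0.28},
}

# Assumption: total time per answer = query time + materialization time.
totals_ms = {
    name: round(parts["query"] + parts["materialization"], 2)
    for name, parts in timings_ms.items()
}

fastest = min(totals_ms, key=totals_ms.get)
```

Under that assumption the totals are 5.82 ms, 17.52 ms and 1.89 ms, so the incremental-stream approach has the lowest total.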




The solution space
Stream Reasoning Community Achievements
§  RDF Stream data type
    •  Adopted by most of the research groups active on Stream
       Reasoning
    •  Alternative solution based on two timestamps used in eTalis
§  Continuous query language
    •  C-SPARQL was extended by the community
    •  Alternative solutions have been studied
          –    without FROM STREAM clause [CQUELS]
          –    oriented to complex event processing [2]

§  Reasoning
    •    Data-driven for RDFS++ [ESWC2010]
    •    Goal-driven for temporal logics (eTalis) [2]
    •    Time-decaying logic programs [26]
    •    Inductive reasoning [IEEE-IS2010]
§  Implementation Experiences
    •    C-SPARQL Engine
    •    eTalis / EP-SPARQL
    •    CQUELS
    •    S2R

The solution space
Stream Reasoning next steps

§  Scientific
    •  Notions of soundness and completeness
    •  More expressive reasoning
          –  with minor loss in throughput
          –  and predictable loss in scalability
    •  Dealing with incomplete & noisy data
    •  Parallelization and distribution of the processing

§  Technical
    •  Prove effectiveness and efficacy in specific application
       domains
    •  Better integrate continuous semantics with Linked Data
    •  Design and develop a software framework to simplify stream
       reasoning application development

§  Organizational
    •  Standardize RDF Stream, C-SPARQL, Streaming Linked
       Data, etc.

The solution space
Wrap-up of Stream Reasoning
[Figure: map of the solution space.
 Vertical axis, types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations.
 Horizontal axis, types of reasoning: no reasoning, data-driven, query-driven, combinations.
 Areas shown: order-aware data management, scalable reasoning, stream reasoning.]
The solution space
Top-k reasoning
[Figure: map of the solution space.
 Vertical axis, types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations.
 Horizontal axis, types of reasoning: no reasoning, data-driven, query-driven, combinations.
 Areas shown: order-aware data management, scalable reasoning, stream reasoning, top-k reasoning.]
The solution space
Top-k reasoning approach
§  In traditional reasoning, ranking of results is
    normally considered a task that makes scaling
    inference to massive data sets even more hopeless
§  Top-k reasoning should, instead, overcome this
    common practice and interleave ordering and
    reasoning
§  W.r.t. the running example, top-k reasoning should
    make it possible to efficiently
    •  compute the top-k social media users who are
       known to lead discussions on fashion-related topics and
       are closest to the requester's current location.




The solution space
Top-k reasoning attempts
§  SoftFacts [60]
    •  an ontology-mediated top-k information retrieval system over
       relational databases

§  SPARQL-Rank [13]
    •  adds order to the SPARQL algebra as a first-class citizen and
       experimentally shows the performance gain

§  AnQL [41]
    •  extends SPARQL to query RDFS annotated with a bounded
       lattice (and thus comes with a partial ordering).

§  Notion of exact top-k closure of an ontology w.r.t. a
    query and a scoring function [53]




The solution space
Top-k queries in SPARQL 1.1
§  Retrieve the best 10 offers ordered by a function of
    user ratings of the product and offer price:

      SELECT ?product ?offer
             (g1(?avgRat1) + g2(?avgRat2) + g3(?price) AS ?score)
      WHERE {
         ?product hasAvgRat1 ?avgRat1 .
         ?product hasAvgRat2 ?avgRat2 .
         ?product hasName ?name .
         ?product hasOffers ?offer .
         ?offer hasPrice ?price
      }
      ORDER BY DESC(?score)
      LIMIT 10

§  Slow: tens of seconds on a 5M-triple dataset (could be improved to
    milliseconds)
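
Why the plain SPARQL 1.1 formulation is slow can be seen from a sketch of the materialize-then-sort evaluation it implies: every solution is scored and sorted before the first ten are returned. The Python below illustrates that strategy only. The functions `g1`, `g2` and `g3` are hypothetical stand-ins, since the slide leaves them unspecified, and the rows are made-up sample data.

```python
def naive_top_k(rows, score, k=10):
    """Score every row, sort all of them, and only then keep the first k."""
    scored = [(score(row), row) for row in rows]           # ... AS ?score
    scored.sort(key=lambda pair: pair[0], reverse=True)    # ORDER BY DESC(?score)
    return scored[:k]                                      # LIMIT k


# Hypothetical scoring functions; the slide does not define g1, g2, g3.
def g1(rating):
    return rating / 5.0


def g2(rating):
    return rating / 5.0


def g3(price):
    return 1.0 / (1.0 + price)


def score(row):
    return g1(row["avgRat1"]) + g2(row["avgRat2"]) + g3(row["price"])


# Made-up sample solutions of the WHERE clause.
rows = [
    {"product": "A", "offer": "A1", "avgRat1": 5, "avgRat2": 5, "price": 0},
    {"product": "B", "offer": "B1", "avgRat1": 1, "avgRat2": 1, "price": 9},
    {"product": "C", "offer": "C1", "avgRat1": 4, "avgRat2": 3, "price": 1},
]
best = naive_top_k(rows, score, k=2)
```

The cost of this strategy grows with the number of solutions rather than with k, which is the inefficiency that rank-aware evaluation aims to avoid.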
The solution space - Top-k queries in SPARQL 1.1

Challenges
§  Adapting SQL optimizations to SPARQL is not
    straightforward:
    •  Different algebra
    •  Different cost of data access in native RDF triplestores
          –  Sorted access is slow, random access is fast
    •  Additional optimization dimensions
          –  Pushing the evaluation of BGPs into the storage layer

§  Research tasks
    •  New algebra for SPARQL where order is a first-class citizen
    •  New algorithms, and
    •  Optimization techniques




The solution space - Top-k queries in SPARQL 1.1
The SPARQL-Rank algebra

§  Extends the standard SPARQL algebra
§  Ranked set of mappings: set of mappings augmented
    with an order relation




      Extended OPERATORS          New EQUIVALENCES
The solution space – SPARQL-Rank algebra
The new Rank Operator




       F(p1, p2) = ?p1 + ?p2

       Ω                                     ρp1(Ω)
            ?x  ?y  ?p1  ?p2                      ?x  ?y  ?p1  ?p2  Fp1
       µ1    1   8  0.8  0.8                 µ1    1   8  0.8  0.8  1.8
       µ2    3   3  0.3  0.6                 µ3    3   4  0.4  0.6  1.4
       µ3    3   4  0.4  0.6                 µ2    3   3  0.3  0.6  1.3
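
The Fp1 column is consistent with an upper bound on the final score: the evaluated criterion ?p1 contributes its actual value, while the not-yet-evaluated ?p2 is replaced by its maximum. The Python sketch below reproduces the table under that reading. The maximum of 1.0 per criterion is an assumption inferred from the numbers (0.8 + 1 = 1.8), and the function names are hypothetical.

```python
MAX_SCORE = 1.0  # assumption: every ranking criterion is normalized to [0, 1]


def upper_bound(mapping, evaluated, criteria):
    """Best score the mapping can still reach: real values for evaluated
    criteria, MAX_SCORE for the ones not evaluated yet."""
    return sum(mapping[c] if c in evaluated else MAX_SCORE for c in criteria)


def rank(mappings, evaluated, new_criterion, criteria):
    """Evaluate one more criterion and re-order the mappings by upper bound."""
    evaluated = evaluated | {new_criterion}
    ranked = sorted(
        mappings,
        key=lambda m: upper_bound(m, evaluated, criteria),
        reverse=True,
    )
    return ranked, evaluated


# The set of mappings from the slide.
omega = [
    {"name": "mu1", "x": 1, "y": 8, "p1": 0.8, "p2": 0.8},
    {"name": "mu2", "x": 3, "y": 3, "p1": 0.3, "p2": 0.6},
    {"name": "mu3", "x": 3, "y": 4, "p1": 0.4, "p2": 0.6},
]
criteria = ("p1", "p2")
ranked, evaluated = rank(omega, frozenset(), "p1", criteria)
bounds = [round(upper_bound(m, evaluated, criteria), 2) for m in ranked]
```

With ?p1 evaluated, the mappings come out in the order µ1, µ3, µ2, matching the right-hand table.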

The solution space – SPARQL-Rank algebra
The redefined Join Operator




       Ωp1                                   Ω’p2
            ?x  ?y  ?p1  ?p2  Fp1                 ?x  ?z  ?p2  Fp2
       µ1    1   8  0.8  0.8  1.8            µ4    1   9  0.8  1.8
       µ3    3   4  0.4  0.6  1.4            µ5    3   0  0.6  1.6
       µ2    3   3  0.3  0.6  1.3

       Join result
                 ?x  ?y  ?z  ?p1  ?p2  Fp1∪p2
       µ1 ∪ µ4    1   8   9  0.8  0.8   1.6
       µ3 ∪ µ5    3   4   0  0.4  0.6   1.0
       µ2 ∪ µ5    3   3   0  0.3  0.6   0.9
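
The join of two ranked sets can be sketched as follows: compatible mappings are merged, the sets of evaluated criteria are united, and the result is ordered by the score bound over the united set. This Python illustration uses the values from the slide. The function names and the maximum of 1.0 per criterion are assumptions of the sketch, not definitions taken from the SPARQL-Rank paper.

```python
def compatible(m1, m2):
    """SPARQL-style compatibility: shared variables must have equal values."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def ranked_join(left, left_eval, right, right_eval, criteria, max_score=1.0):
    """Join two ranked sets; order the result by the bound over the union
    of the criteria evaluated on either side."""
    evaluated = left_eval | right_eval

    def bound(m):
        return sum(m[c] if c in evaluated else max_score for c in criteria)

    merged = [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]
    merged.sort(key=bound, reverse=True)
    return [(m, bound(m)) for m in merged], evaluated


# Left input ranked on p1, right input ranked on p2 (values from the slide).
left = [
    {"x": 1, "y": 8, "p1": 0.8, "p2": 0.8},
    {"x": 3, "y": 4, "p1": 0.4, "p2": 0.6},
    {"x": 3, "y": 3, "p1": 0.3, "p2": 0.6},
]
right = [
    {"x": 1, "z": 9, "p2": 0.8},
    {"x": 3, "z": 0, "p2": 0.6},
]
joined, evaluated = ranked_join(left, {"p1"}, right, {"p2"}, ("p1", "p2"))
```

Because both criteria are evaluated after the join, the bound equals the actual score ?p1 + ?p2, which gives 1.6, 1.0 and 0.9 as on the slide.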

The solution space – SPARQL-Rank algebra
Rank Join Algorithms

§  Different algorithms based on available access in
    the inputs:
    •  Hash Rank-Join
          –  e.g. HRJN [Ilyas2004]
    •  Random Access Rank-Join
          –  e.g. RA-HRJN [Ilyas2004]
    •  RankSequence (e.g. RSEQ), NEW [ISWC2012]
          –  Minimum sorted access
          –  Leverages random access

[Figure: access patterns on the two inputs.
 (a) RankJoin: sortedAccess, sortedAccess
 (b) RankSequence: sortedAccess, randomAccess
 (c) RA-RankJoin: sortedAccess and randomAccess on both inputs]
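
The idea behind a hash rank-join over two sorted inputs can be sketched in Python. This is a simplified illustration in the spirit of HRJN, not the published algorithm or the ARQ-Rank code. It assumes the score of a join result is the sum of the two input scores, pulls from the inputs alternately, and emits a buffered result once no unseen combination can beat it. The name `hrjn_top_k` is hypothetical.

```python
import heapq
from itertools import count


def hrjn_top_k(left, right, k):
    """Top-k join results by summed score.

    left, right: lists of (join_key, score), each sorted by score descending.
    Returns up to k pairs (join_key, total_score), best first.
    """
    if not left or not right:
        return []
    seen_left, seen_right = {}, {}   # join_key -> scores read so far
    candidates = []                  # max-heap of join results (scores negated)
    tie = count()                    # tie-breaker so the heap never compares keys
    results = []
    i = j = 0
    top_l, top_r = left[0][1], right[0][1]
    last_l, last_r = top_l, top_r
    while len(results) < k:
        if i < len(left) and (i <= j or j >= len(right)):
            key, s = left[i]         # sorted access on the left input
            i += 1
            last_l = s
            seen_left.setdefault(key, []).append(s)
            for other in seen_right.get(key, []):
                heapq.heappush(candidates, (-(s + other), next(tie), key))
        elif j < len(right):
            key, s = right[j]        # sorted access on the right input
            j += 1
            last_r = s
            seen_right.setdefault(key, []).append(s)
            for other in seen_left.get(key, []):
                heapq.heappush(candidates, (-(other + s), next(tie), key))
        else:
            break                    # both inputs exhausted
        # No result involving an unseen tuple can score above this value.
        threshold = max(top_l + last_r, last_l + top_r)
        while candidates and len(results) < k and -candidates[0][0] >= threshold:
            neg, _, key = heapq.heappop(candidates)
            results.append((key, -neg))
    # Inputs exhausted: everything still buffered is final.
    while candidates and len(results) < k:
        neg, _, key = heapq.heappop(candidates)
        results.append((key, -neg))
    return results


# Made-up inputs, each sorted by score descending.
left = [("a", 0.9), ("b", 0.8), ("c", 0.1)]
right = [("b", 0.9), ("a", 0.5), ("c", 0.4)]
top2 = hrjn_top_k(left, right, 2)
```

On this sample the two best results, `b` and then `a`, are produced without reading the last tuple of the right input, which is the kind of early termination that sorted access makes possible.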
The solution space – SPARQL-Rank algebra
The new Algebraic Equivalences



     Split




The solution space – SPARQL-Rank algebra
The new Algebraic Equivalences




 Interleave




The solution space – SPARQL-Rank algebra
Planning Strategies

§  Apply algebraic equivalences
§  Result: three possible strategies




  1. Rank of BGPs           2. Interleaved                  3. Rank Join




The solution space – SPARQL-Rank algebra
Planning Strategies: rank of BGPs (ROB)

§  Substitute the monolithic scoring function with a
    number of incremental rank operators (rho)

[Figure: query plans for the example query, both returning ?pr, ?of, ?score.
 (a) Standard plan, bottom-up:
     BGP { ?pr hasA1 ?a1 . ?pr hasA2 ?a2 . ?pr hasN ?n . ?pr hasO ?of . ?of hasP ?p1 . },
     EXTEND [?score = g1(?a1)+g2(?a2)+g3(?p1)], ORDER [?score], SLICE [0,10]
 (b) ROB plan: the same BGP read with a seqScan, then one rank operator each
     for g1(?a1), g2(?a2) and g3(?p1), then SLICE [0,10]]
The solution space – SPARQL-Rank algebra
     Planning Strategies: Interleaved (INTER)

     §  Separate the pattern in two groups:
          •  Triple patterns that influence the ranking
          •  Triple patterns that don’t influence the ranking


[Figure: query plans for the example query, both returning ?pr, ?of, ?score.
 (a) Standard plan, bottom-up:
     BGP { ?pr hasA1 ?a1 . ?pr hasA2 ?a2 . ?pr hasN ?n . ?pr hasO ?of . ?of hasP ?p1 . },
     EXTEND [?score = g1(?a1)+g2(?a2)+g3(?p1)], ORDER [?score], SLICE [0,10]
 (b) Interleaved plan: rank operators for g1(?a1), g2(?a2) and g3(?p1) are
     interleaved with the evaluation of the triple patterns, ending in SLICE [0,10];
     labels legible in the figure include orderScan_a1, seqScan and Sequence]
The solution space – SPARQL-Rank algebra
Planning Strategies: Rank-Join (RJ)

§  Split into one pattern for each ranking criterion
§  Use the most appropriate join based on type of access
[Figure: query plans for the example query, both returning ?pr, ?of, ?score.
 (a) Standard plan, bottom-up:
     BGP { ?pr hasA1 ?a1 . ?pr hasA2 ?a2 . ?pr hasN ?n . ?pr hasO ?of . ?of hasP ?p1 . },
     EXTEND [?score = g1(?a1)+g2(?a2)+g3(?p1)], ORDER [?score], SLICE [0,10]
 (b) Rank-join plan: the pattern is split into ranked sub-patterns,
     g3(?p1) over { ?pr hasO ?of . ?of hasP ?p1 . }, g1(?a1) over { ?pr hasA1 ?a1 . }
     and g2(?a2) over { ?pr hasA2 ?a2 . }, combined by RankJoin operators on ?pr = ?pr;
     { ?pr hasN ?n . } is attached with a plain Join on ?pr = ?pr, followed by SLICE [0,10]]
The solution space – SPARQL-Rank algebra
Experimental evidence of performance improvements

§  Example query, 5M triples dataset
§  Assumption: availability of sorted access indexes


[Chart; annotation: two orders of magnitude better]
The solution space – SPARQL-Rank algebra
Experimental evidence of performance improvements

§  Benchmark: 8 queries from an extension of BSBM




The solution space
Wrap-up of Top-k Reasoning
[Figure: map of the solution space.
 Vertical axis, types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations.
 Horizontal axis, types of reasoning: no reasoning, data-driven, query-driven, combinations.
 Areas shown: order-aware data management, scalable reasoning, stream reasoning, top-k reasoning.]
The solution space
Full-fledged order-aware reasoning
[Figure: map of the solution space.
 Vertical axis, types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations.
 Horizontal axis, types of reasoning: no reasoning, data-driven, query-driven, combinations.
 Areas shown: order-aware data management, scalable reasoning, stream reasoning, top-k reasoning, order-aware reasoning.]
The solution space
Full-fledged order-aware reasoning
§  In full-fledged order-aware reasoning, data- and
    query-driven inference methods have to deal with
    combinations of natural, cheap-to-enforce and
    expensive-to-enforce types of orders.
    •  the naive assumption of independence of orderings would
       have to be relaxed
    •  theories and methods that exploit mutual relationships
       between the three types of orders have to be rethought

§  Considering our running example, methods
    implementing order-aware reasoning are the only
    ones able to answer the query
    •  Which users of social media, currently leading popular
       discussions on fashion-related topics, are closest to my
       current location? What are they saying about the shopping
       district nearby?


The solution space
Full-fledged order-aware reasoning
§  State-of-the-art
    •  None

§  Promising work
    •  The Answer Set Programming (ASP) community has recently
       proposed a streaming algorithm for ASP [25] that
          1.  ranks the constants referring to domain elements and,
          2.  fetches them, increasing the domain size until an answer set is
              found.

§  Challenges
    •  a theoretical framework that unifies and generalises those
       defined for stream reasoning and top-k reasoning
    •  designing and testing scalable data- and query-driven methods
       that allow for efficient answering of queries that involve all
       types of orders



The solution space
Wrap-up of Top-k Reasoning
[Figure: map of the solution space.
 Vertical axis, types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations.
 Horizontal axis, types of reasoning: no reasoning, data-driven, query-driven, combinations.
 Areas shown: order-aware data management, scalable reasoning, stream reasoning, top-k reasoning, order-aware reasoning.]
References
My papers
[IEEE-IS2009] E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel:
It's a Streaming World! Reasoning upon Rapidly Changing Information.
IEEE Intelligent Systems 24(6): 83-89 (2009)
[EDBT2010] D.F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus:
An Execution Environment for C-SPARQL Queries. EDBT 2010
[WWW2009] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus:
C-SPARQL: SPARQL for Continuous Querying. WWW 2009: 1061-1062
[IEEE-IS2010] D. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser:
Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics.
IEEE Intelligent Systems, 30 Aug. 2010
[JWS2012] M. Balduini, I. Celino, E. Della Valle, D. Dell'Aglio, Y. Huang, T. Lee, S. Kim, V. Tresp:
BOTTARI: an Augmented Reality Mobile Application to Deliver Personalized and Location-based
Recommendations by Continuous Analysis of Social Media Streams. JWS 2012. IN PRESS
[ESWC2010] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus:
Incremental Reasoning on Streams and Rich Background Knowledge. ESWC 2010
[SWJ2012] E. Della Valle, S. Schlobach, M. Krötzsch, A. Bozzon, S. Ceri, I. Horrocks:
Order Matters! Harnessing a World of Orderings for Reasoning over Massive Data. IN PRESS
[ISWC2012] S. Magliacane, A. Bozzon, E. Della Valle:
Efficient Execution of Top-k SPARQL Queries. ISWC 2012. IN PRESS
Downloads
§  C-SPARQL Engine (no reasoning support)
    •  A ready to go pack for eclipse
          –  http://streamreasoning.org/download
    •  Source code available on request

§  SPARQL-Rank Engine (ARQ-Rank)
    •  Source code and experimental data
          –  http://sparqlrank.search-computing.org/




Thank You!

Any questions? emanuele.dellavalle@polimi.it



                           Keep an eye on
                           http://www.streamreasoning.org
                           There’s much more to come!





More Related Content

Similar to Harnessing Order for Reasoning over Massive Data

Challenges, Approaches, and Solutions in Stream Reasoning
Challenges, Approaches, and Solutions in Stream ReasoningChallenges, Approaches, and Solutions in Stream Reasoning
Challenges, Approaches, and Solutions in Stream Reasoning Emanuele Della Valle
 
The Reactive Principles: Design Principles For Cloud Native Applications
The Reactive Principles: Design Principles For Cloud Native ApplicationsThe Reactive Principles: Design Principles For Cloud Native Applications
The Reactive Principles: Design Principles For Cloud Native ApplicationsJonas Bonér
 
The Reactive Principles: Eight Tenets For Building Cloud Native Applications
The Reactive Principles: Eight Tenets For Building Cloud Native ApplicationsThe Reactive Principles: Eight Tenets For Building Cloud Native Applications
The Reactive Principles: Eight Tenets For Building Cloud Native ApplicationsLightbend
 
A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...
A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...
A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...Veselin Pizurica
 
A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...
A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...A Cloud-Based Bayesian Smart Agent Architecture for Internet-of-Things Applic...
Harnessing Order for Reasoning over Massive Data

  • 1. http://streamreasoning.org Order Matters! Harnessing a World of Orderings for Reasoning over Massive Data Emanuele Della Valle emanuele.dellavalle@polimi.it - http://emanueledellavalle.org
  • 2. Acknowledgements
    §  This talk presents the content of a joint paper with Stefan Schlobach (b), Markus Krötzsch (c), Alessandro Bozzon (a), Stefano Ceri (a), and Ian Horrocks (c), to appear in SWJ
       (a) Politecnico di Milano; (b) Vrije Universiteit Amsterdam; (c) University of Oxford
    §  I also want to thank Frank van Harmelen (b) for his important contribution to the discussion, and Tony Lee (Saltlux), Andreas Schreiber (DLR) and Achim Basermann (DLR) for the valuable discussion on concrete examples of problems that require order-aware reasoning. Moreover, I want to thank Sara Magliacane (b) for her work on SPARQL-RANK and the slides I use in this presentation, and Marco Balduini (a), Davide Barbieri (a), and Daniele Braga (a) for their work on C-SPARQL
    §  Check out the paper:
       •  http://www.semantic-web-journal.net/content/order-matters-harnessing-world-orderings-reasoning-over-massive-data
    Trento, Italy, 6.11.2012 Emanuele Della Valle - http://streamreasoning.org/
  • 3. References
    §  The numbers in square brackets refer to references in the SWJ paper
       •  http://www.semantic-web-journal.net/content/order-matters-harnessing-world-orderings-reasoning-over-massive-data
    §  A short selection of references to my papers is available at the end of the presentation.
  • 4. The problem, three use cases, and …
    §  More and more applications require real-time processing of massive, dynamically generated data
       •  Space Situational Awareness
       •  Jet Engine Design
       •  Intelligent Surveillance
  • 5. The Problem - Use case: space junk [Source: http://wordlesstech.com/2011/03/26/space-junk/ ]
  • 6. The Problem - Use case: jet engine design [Source: http://www.sae.org/mags/aem/10018/ ]
  • 7. The Problem - Use case: intelligent surveillance [Source: http://youtu.be/I3iDBfB_ZC0 ]
  • 8. The Problem - … and four common features!
    §  their data is ordered,
       •  naturally ordered by recency, proximity, etc.
       •  intrinsically ordered by precision, popularity, provenance, certainty, trust, etc.
       •  and, in any case, it is explicitly sortable through attribute values
    §  the answers are also required to come in an ordered fashion
       •  engineers surveying a satellite orbit need to know the largest pieces of debris in closest proximity with maximal certainty, measured with highest precision, etc.
    §  they require immediate answers at runtime
       •  flight paths have to be adapted once an object on a collision course is detected
    §  and they require inference
       •  rich ontological models describing complex domain knowledge are often used to pose the queries and to interpret the results
  • 9. The Problem - Performance targets
    [Figure: answer quality at time t plotted against computation time t, comparing the current situation with the desired situation; the target (fully correct answers), real-time behaviour and max runtime are marked.]
    Note: completeness may not be necessary if all relevant answers are found
  • 10. The Problem - A running example
    §  Imagine a system which
       •  listens to all micro-posts that are published,
       •  knows the geographic location of social media users,
       •  is able to detect the topic of each micro-post, and
       •  has modelled relationships between topics in an expressive ontological language
    §  Let us suppose that each of us poses a query like the following to such a system:
       •  Which users of social media, currently leading popular discussions on fashion-related topics, are closest to my current location? What are they saying about the shopping district nearby?
  • 11. The solution space
    [Figure: a grid with types of orders on one axis (no ordering, natural, cheap to enforce, expensive to enforce, combinations) and types of reasoning on the other (no reasoning, data-driven, query-driven, combinations), plus the label "Approximation and parallelisation".]
  • 12. The solution space - no ordering, no reasoning
    [Figure: the same grid of types of orders against types of reasoning, shown for the "no ordering, no reasoning" case.]
  • 13. The solution space - no ordering, no reasoning
    §  Most of the big data solutions currently on the market
       •  BSP (Bulk Synchronous Parallel)
       •  PRAM (Parallel Random Access Machine)
       •  PGAS (Partitioned Global Address Space)
       •  Map-Reduce implementations
       •  and data-centric workflow systems based on them
    §  Some (e.g., Hive and Pig) allow the specification of ordering constraints, but provide no specific optimisation for top-k or streaming queries
    §  W.r.t. the running example
       •  The right performance and scalability
       •  Limited ability to harness orderings
       •  Missing inference capability
  • 14. The solution space - Order aware data management
    [Figure: the grid of types of orders against types of reasoning, with the region labelled "Order-aware data management".]
  • 15. The solution space - Order aware data management
    §  When treating massive data, order matters!
       •  Data as a sortable entity, where we can enforce orderings easily and logically, e.g., order by sortable literals, popularity, uncertainty, or trust
       •  Most relevant answers first
       •  Streaming algorithms
    §  If N is the size of the input, a problem is considered to be “well-solved” if a streaming algorithm exists which requires at most O(poly(log N)) space and time [31]
  • 16. The solution space - Order aware data management and approximation
    §  Approximate streaming algorithms can outperform classical, data-bound approaches to this problem by several orders of magnitude [6,14].
    §  Such approximations can be asymptotic, so that arbitrary accuracy can be achieved [6].
    [Figure: answer accuracy at computation time t plotted against computation time t, with the level of fully correct answers marked.]
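To make the idea of an approximate streaming algorithm concrete, here is a minimal sketch of the Misra-Gries frequent-items summary. It is an illustration only and is not taken from the talk or from [6,14]; the function name `misra_gries` and the sample topic stream are assumptions. The summary keeps at most k - 1 counters however long the stream is, and each estimate undercounts the true frequency by at most N/k.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Return approximate counts using at most k - 1 counters.

    Each returned count c(x) satisfies f(x) - N/k <= c(x) <= f(x),
    where f(x) is the true frequency and N is the stream length.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical stream of micro-post topics (illustrative data only).
topics = ["fashion"] * 6 + ["shoes"] * 3 + ["football", "weather", "politics"]
summary = misra_gries(topics, k=3)
exact = Counter(topics)
```

Any topic that occurs in more than N/k of the posts is guaranteed to survive in `summary`, which is why such summaries are a natural fit for "most relevant answers first" over streams.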
  • 17. The solution space - Harnessing natural orderings
    [Figure: the grid of types of orders against types of reasoning, shown for the natural-orderings case.]
  • 18. The solution space - Harnessing natural orderings
    §  Continuous queries are registered over streams that, in most cases, are observed through windows
    [Figure: input streams (unbounded and time-varying) pass through a window into a registered continuous query, which produces streams of answers.]
    §  Assumption: recent information is more relevant because it describes the current state of a dynamic system
  • 19. The solution space - Harnessing natural orderings
    §  The nature of streams requires a paradigmatic change*
       •  from persistent data
          –  to be stored and queried on demand
          –  a.k.a. one time semantics
       •  to transient data
          –  to be consumed on the fly by continuous queries
          –  a.k.a. continuous semantics
    * This paradigmatic change first arose in the DB community [31]
  • 20. The solution space - Harnessing natural orderings
    §  Two types of solutions
       •  Data Stream Management Systems (DSMS)
       •  Complex Event Processors (CEP)
    §  Research prototypes
       •  Amazon/Cougar (Cornell) – sensors
       •  Aurora (Brown/MIT) – sensor monitoring, dataflow
       •  Gigascope (AT&T Labs) – network monitoring
       •  Hancock (AT&T) – telecom streams
       •  Niagara (OGI/Wisconsin) – Internet DBs & XML
       •  OpenCQ (Georgia) – triggers, view maintenance
       •  STREAM (Stanford) – general-purpose DSMS
       •  Stream Mill (UCLA) – power & extensibility
       •  Tapestry (Xerox) – publish/subscribe filtering
       •  Telegraph (Berkeley) – adaptive engine for sensors
       •  Tribeca (Bellcore) – network monitoring
    §  High-tech startups
       •  StreamBase, Coral8, Apama, Truviso
    §  Major DBMS vendors are all adding stream extensions as well
       •  IBM InfoSphere Streams
       •  Microsoft StreamInsight
       •  Oracle CEP
  • 21. The solution space - Harnessing natural orderings
    §  DSMSs are optimised for the simplest portion of the query in our running example
       •  retrieve the micro-posts that have been posted recently
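The recency part of the running example can be sketched as a continuous query over a time-based sliding window. This is a toy illustration in plain Python, not the API of any DSMS; the class name `SlidingWindow`, the 60-second width and the sample posts are assumptions, and posts are assumed to arrive in timestamp order.

```python
from collections import deque
from typing import Deque, List, Tuple

Post = Tuple[float, str]  # (timestamp in seconds, text)


class SlidingWindow:
    """Keep only the posts whose timestamp lies in (now - width, now]."""

    def __init__(self, width: float) -> None:
        self.width = width
        self._posts: Deque[Post] = deque()

    def push(self, post: Post) -> List[Post]:
        """Consume one post and return the current window content."""
        now = post[0]
        self._posts.append(post)
        # Posts arrive in timestamp order, so expired ones sit at the left.
        while self._posts and self._posts[0][0] <= now - self.width:
            self._posts.popleft()
        return list(self._posts)


# Hypothetical input stream; each push yields a fresh answer (continuous semantics).
window = SlidingWindow(width=60.0)
stream = [(0.0, "a"), (30.0, "b"), (61.0, "c"), (95.0, "d")]
answers = [window.push(post) for post in stream]
```

Each arrival produces a new answer and old posts are discarded rather than stored, which is the difference between continuous and one-time semantics.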
  • 22. The solution space - Harnessing other types of orders
    [Figure: the grid of types of orders against types of reasoning, shown for the other types of orders.]
  • 23. The solution space - Harnessing other types of orders
    §  W.r.t. the running example, solutions studied in these two areas allow one to efficiently
       •  retrieve nearby shops that are discussed by popular social media users.
    §  This is a typical top-k query
       •  a limited number of results k
       •  ordered by a scoring function
       •  that combines several criteria
          –  e.g., nearby and most discussed
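The three ingredients of a top-k query (a limit k, a scoring function, several combined criteria) can be sketched as follows. This is only a baseline that scores every candidate; the `Candidate` type, the equal weights and the sample shops are assumptions made for illustration.

```python
import heapq
from typing import List, NamedTuple


class Candidate(NamedTuple):
    shop: str
    proximity: float   # normalised to [0, 1]; 1 means closest to the issuer
    popularity: float  # normalised to [0, 1]; 1 means most discussed


def score(c: Candidate, w_proximity: float = 0.5, w_popularity: float = 0.5) -> float:
    """Monotone scoring function: a weighted sum of the two criteria."""
    return w_proximity * c.proximity + w_popularity * c.popularity


def top_k(candidates: List[Candidate], k: int) -> List[Candidate]:
    """Return the k best candidates, best first."""
    return heapq.nlargest(k, candidates, key=score)


# Hypothetical shops with made-up scores.
shops = [
    Candidate("A", proximity=0.9, popularity=0.2),
    Candidate("B", proximity=0.6, popularity=0.8),
    Candidate("C", proximity=0.3, popularity=0.9),
    Candidate("D", proximity=0.8, popularity=0.7),
]
best = top_k(shops, k=2)
```

Because every candidate is scored before the limit is applied, the cost grows with the full input size even when k is small.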
  • 24. The solution space - Harnessing other types of orders - Treating order as a first class citizen
    §  Traditional query evaluation schema: materialize then sort
       •  Join shops and social media users on "discussed", materialize the join results, and order them all by proximity of the discussed shop to the issuer and by popularity of the social media user
       •  Limit to K
    §  Order-aware query evaluation schema: split and interleave
       •  Order shops by proximity to the issuer and social media users by popularity
       •  Interleave these orderings with the "discussed" join
       •  Limit to K
    [Figure: the two query plans side by side, annotated with result sizes [10s], [1,000s] and [100,0000s].]
  • 25. The solution space - Harnessing other types of orders - The split-and-interleave scheme
    §  State-of-the-art
       •  Literature in RDBMS (for a survey see [35]) presents the split-and-interleave scheme:
          1.  Split the evaluation of the scoring function into the evaluation of the single criteria
          2.  Interleave them with other operators
          3.  Use partial orders to construct incrementally the final order
    §  Standard assumptions:
       •  Monotone increasing scoring function
       •  Sorted access for each criterion
       •  Random access, when possible, is expensive
       •  No uncertainty in the scores
       •  No uncertainty in the scoring function
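A classic algorithm that works under these assumptions (monotone scoring function, sorted access per criterion, random access available but costly) is Fagin's Threshold Algorithm. The sketch below is a swapped-in illustration of that related technique, not the split-and-interleave operator plan surveyed in [35]; the function name `threshold_top_k`, the use of a plain sum as the scoring function and the sample scores are assumptions.

```python
from typing import Dict, List, Tuple


def threshold_top_k(
    criteria: List[Dict[str, float]], k: int
) -> Tuple[List[Tuple[str, float]], int]:
    """Top-k by the sum of per-criterion scores, with early termination.

    Assumes at least one criterion and that every object has a score in
    every criterion. Returns the top-k list and the number of sorted-access
    rounds performed.
    """
    # Sorted access: one list per criterion, best score first.
    sorted_lists = [
        sorted(c.items(), key=lambda kv: kv[1], reverse=True) for c in criteria
    ]
    n = len(sorted_lists[0])
    seen: Dict[str, float] = {}
    for depth in range(n):
        for lst in sorted_lists:
            obj, _ = lst[depth]
            if obj not in seen:
                # Random access: fetch the object's score in every criterion.
                seen[obj] = sum(c[obj] for c in criteria)
        # Monotonicity: no unseen object can score above this threshold.
        threshold = sum(lst[depth][1] for lst in sorted_lists)
        best = sorted(seen.items(), key=lambda kv: kv[1], reverse=True)[:k]
        if len(best) == k and best[-1][1] >= threshold:
            return best, depth + 1
    best = sorted(seen.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return best, n


# Hypothetical per-criterion scores for five shops.
proximity = {"A": 0.9, "B": 0.6, "C": 0.3, "D": 0.8, "E": 0.1}
popularity = {"A": 0.2, "B": 0.8, "C": 0.9, "D": 0.7, "E": 0.1}
top, rounds = threshold_top_k([proximity, popularity], k=2)
```

On this data the loop stops after three of the five possible rounds and never touches shop E, which illustrates how partial orders per criterion let the final order be built incrementally.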
  • 26. The solution space - Harnessing other types of orders: Be aware, it's a trade-off
      [Figure: chart annotated "Orders of magnitude".]
      NOTE: Typically users are interested in 1 <= k <= 100
  • 27. The solution space: Harnessing all types of orders together
      [Figure: solution-space grid. Types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations. Types of reasoning: no reasoning, data-driven, query-driven, combinations.]
  • 28. The solution space: Harnessing all types of orders together
      §  W.r.t. the running example, solutions studied in this area make it possible to efficiently
          •  retrieve the nearby shops that popular social media users are currently posting positively about.
      §  This is a typical continuous monitoring of top-k queries over sliding windows [45]
      §  A very promising and little explored research area in data management
  • 29. The solution space: Wrapping up order-aware data management
      §  Two parts of the query in the running example remain difficult to express:
          •  knowing which topics are related to fashion
              –  requires at least a taxonomy of fashion-related topics
          •  computing which recent discussions on social media are popular
              –  requires computing the transitive closure of the discussion
      §  Both
          •  are difficult to model without an expressive ontological language (such as OWL 2) and
          •  require complex algorithms that an ontology reasoner can handle natively
      §  Moreover, order-aware data management techniques do not cope with heterogeneity
          •  i.e., data must be translated into one common representation before order-aware data management techniques can be applied.
  • 30. The solution space
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Region labelled: Scalable reasoning.]
  • 31. The Solution Space: Scalable Reasoning
      §  Why?
          •  handling heterogeneity in the input data through ontology-based information integration
      §  In the running example,
          •  ontological background knowledge can be used to model relationships between more specific and more general topics of interest, which can be used to infer which concrete topics are related to fashion
      §  How?
          •  Data-driven methods
              –  Scalable methods available in the state-of-the-art
          •  Query-driven methods
              –  research trend, implementations are appearing
          •  Combinations of the previous two
              –  mostly theoretical results
  • 32. The Solution Space – Scalable Reasoning: Data-driven
      §  Ontological Language:
          •  OWL 2 RL
              –  aimed at applications that require scalable reasoning without sacrificing too much expressive power
              –  http://www.w3.org/TR/owl2-profiles/#OWL_2_RL
      §  Reasoning approach
          •  Forward chaining: from asserted data to all possible entailments
      §  Pros: Low query latency
      §  Cons: it does not take the actual information need into account
      §  Implementations
          •  OWLIM, Virtuoso, AllegroGraph, and OntoBroker
      §  Research trend
          •  Parallelization using Map-Reduce as a main paradigm
              –  e.g. [33,65] for OWL 2 RL or a fragment thereof [32,64,66,38]
          •  Applying similar techniques to more expressive fragments of OWL
              –  e.g., ELK reasoner for OWL EL [37]
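      To make "from asserted data to all possible entailments" concrete, here is a minimal Python sketch of data-driven materialization. It is an illustration under strong assumptions: only two subclass rules are applied (far less than OWL 2 RL), triples are plain tuples, and the predicate names `subClassOf` and `type` as well as the fashion taxonomy are hypothetical.

```python
def materialize(triples):
    """Naive forward chaining: apply the rules until no new fact is derived.

    Rules (a tiny subset of what a real rule-based reasoner applies):
      (A subClassOf B), (B subClassOf C)  =>  (A subClassOf C)
      (x type A),       (A subClassOf B)  =>  (x type B)
    """
    facts = set(triples)
    while True:
        derived = set()
        for s, p, o in facts:
            if p != "subClassOf":
                continue
            for s2, p2, o2 in facts:
                if p2 == "subClassOf" and s2 == o:
                    derived.add((s, "subClassOf", o2))
                if p2 == "type" and o2 == s:
                    derived.add((s2, "type", o))
        if derived <= facts:
            return facts  # fixpoint reached: the materialization is complete
        facts |= derived


# Hypothetical taxonomy of fashion-related topics.
asserted = {
    ("Handbags", "subClassOf", "Accessories"),
    ("Accessories", "subClassOf", "Fashion"),
    ("post1", "type", "Handbags"),
}
closure = materialize(asserted)
print(("post1", "type", "Fashion") in closure)
```

      Once the closure is stored, a query for fashion-related posts is a plain lookup, which is where the low query latency comes from. The cost is that entailments are computed whether or not any query ever needs them.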
  • 33. The Solution Space – Scalable Reasoning: Query-driven
      §  Ontological Language
          •  OWL 2 QL
              –  designed for query answering in LOGSPACE w.r.t. the size of the data, with the expressivity of conceptual models (e.g., UML class diagrams)
              –  http://www.w3.org/TR/owl2-profiles/#OWL_2_QL
      §  Reasoning approach
          •  Backward chaining: from query to asserted facts
          •  Query rewriting: from ontological query to a set of SQL queries
      §  Pros: limits the search space by considering the actual query
      §  Cons: the number of rewritings grows exponentially
      §  Implementations
          •  QuOnto, Owlgres, and Requiem
      §  Research trend
          •  Extend query rewriting for more expressive ontology languages
              –  e.g., Datalog± [27,4]
          •  Parallelization using Map-Reduce
              –  e.g., QueryPIE
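      The query-rewriting idea can be sketched in a few lines of Python. This is a simplified illustration, not how QuOnto, Owlgres or Requiem work internally: it assumes the ontology contains only subclass axioms, so a query for one class is rewritten into a union of lookups over that class and all of its subclasses. The function names and the example data are hypothetical.

```python
def rewrite(query_class, sub_class_of):
    """Rewrite a query for query_class into the set of classes to look up."""
    rewritten = {query_class}
    frontier = [query_class]
    while frontier:
        current = frontier.pop()
        for sub, sup in sub_class_of:
            if sup == current and sub not in rewritten:
                rewritten.add(sub)
                frontier.append(sub)
    return rewritten


def answer(query_class, sub_class_of, asserted_types):
    """Evaluate the rewritten query directly over the asserted facts."""
    classes = rewrite(query_class, sub_class_of)
    return sorted(ind for ind, cls in asserted_types if cls in classes)


# Hypothetical ontology and data.
axioms = [("Handbags", "Accessories"), ("Accessories", "Fashion"), ("Shoes", "Fashion")]
data = [("post1", "Handbags"), ("post2", "Shoes"), ("post3", "Football")]
print(rewrite("Fashion", axioms))
print(answer("Fashion", axioms, data))
```

      Nothing is materialized here: the stored data stays as asserted and only the query grows. In this sketch the growth is one lookup per subclass; with richer axioms and multi-atom queries the rewritings multiply.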
  • 34. The Solution Space – Scalable Reasoning: Combinations
      §  Ontological Language
          •  Subject to research
      §  Reasoning approach
          •  combine the advantages of data- and query-driven approaches
      §  State-of-the-art
          •  Magic Sets technique [1]
      §  Recent theoretical results
          •  for a limited fragment of OWL EL [44]
          •  for existential rules [4]
  • 35. The Solution Space – Scalable Reasoning: Approximation
      §  Many rule-based systems compute only part of the entailed consequences by employing a set of rules that cannot derive all results
          •  E.g., Jena, Sesame, OWLIM, and Virtuoso
      §  A typical approach is to approximate the input information by restricting it to a simpler ontology language that is then processed with a more efficient, sound and complete algorithm
          •  e.g., TrOWL [48] and SCREECH [62].
      §  Approximate reasoning is used as a sub-method in many sound and complete reasoners,
          •  e.g., the OWL reasoner HermiT first computes the syntactically told class hierarchy before using more complex algorithms for a complete subsumption check.
      §  None of the above, however, deals with or takes advantage of orderings of any kind.
      §  A number of interesting research challenges thus remain open.
  • 36. The solution space: Wrap up of the talk so far
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning.]
  • 37. The solution space: Reasoning with streaming algorithms
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning, Top-k reasoning, Order-aware reasoning.]
  • 38. The solution space: Reasoning with streaming algorithms
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning.]
  • 39. The solution space: Stream Reasoning [IEEE-IS2009]
      §  W.r.t. the running example, solutions studied in this area make it possible to efficiently
          •  compute which recent discussions on social media are popular
      §  For instance, how many micro-posts discussed (either replying or retweeting) my tweet?
      [Figure: graph of micro-posts t1 to t8 connected by reply and retweet edges, each of which also counts as a discuss edge; the answer shown is 7.]
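      Counting the micro-posts that discuss a tweet amounts to a reachability computation over the reply and retweet edges. The following Python sketch illustrates it on a static graph; the edge list is hypothetical (the exact edges of the slide's figure are not recoverable) and was chosen so that the answer for t1 is 7. A stream reasoner would additionally have to maintain this result as posts enter and leave the window.

```python
def count_discussing(root, edges):
    """Count the posts that directly or transitively discuss `root`.

    edges: (source, kind, target) triples where kind is "reply" or "retweet";
    both kinds are treated as specialisations of "discuss".
    """
    incoming = {}
    for source, _kind, target in edges:
        incoming.setdefault(target, set()).add(source)
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for source in incoming.get(node, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return len(seen)


# Hypothetical discussion graph over micro-posts t1..t8.
edges = [
    ("t2", "reply", "t1"), ("t3", "retweet", "t1"),
    ("t4", "reply", "t2"), ("t5", "reply", "t3"),
    ("t6", "retweet", "t3"), ("t7", "reply", "t4"),
    ("t8", "reply", "t5"),
]
print(count_discussing("t1", edges))
```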
  • 40. The solution space: Stream Reasoning features
      [Table comparing what each approach provides per feature; the per-cell marks are not shown.]
      Columns: Traditional Data Processing (offers), Stream Processing (offers), Automatic Reasoning (offers), Stream Reasoning (aims at)
      Rows (features): Processing streams; Handling large datasets; Reactivity (real-time); Expressing fine-grained queries; Capturing knowledge; Access to persistent data
  • 41. The solution space: Stream Reasoning definition
      §  Making sense [IEEE-IS2010]
          •  in real time
          •  of multiple, heterogeneous, gigantic and inevitably noisy data streams
          •  in order to support the decision process of extremely large numbers of concurrent users
      §  Note: making sense of streams necessarily requires processing them against rich background knowledge, an unsolved problem in databases
  • 42. The solution space: Architecture of a Stream Reasoner
      §  Continuous reasoning tasks registered over streams that, in most cases, are observed through windows
      [Figure: input streams, observed through a window, feed the registered continuous reasoning tasks, which produce streams of answers.]
  • 43. The solution space: Stream Reasoning, PoliMi's Achievements
      §  RDF Stream data type [WWW2009]
          •  (virtually) represent heterogeneous data streams
      §  C-SPARQL query language [WWW2009]
          •  express fine-grained continuous queries
          •  It is "compiled down" to keep performance high
      §  Incremental RDFS++ Reasoning [ESWC2010]
          •  allows for domain knowledge exploitation
      §  C-SPARQL Engine [EDBT2010]
          •  Fully operational prototype
          •  Deployed in award-winning applications (e.g., Bottari [JWS2012])
  • 44. The solution space: Stream Reasoning, PoliMi's Achievements
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning.]
  • 45. The solution space – Stream Reasoning "alla PoliMi": RDF Stream
      §  RDF Stream Data Type
          •  Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp
      §  Timestamps are not required to be unique, but they must be non-decreasing
      §  E.g.,
          (<:Alice :posts :post1>, 2010-02-12T13:34:41)
          (<:post1 :talksAboutPositively :LaScala>, 2010-02-12T13:34:41)
          (<:Bob :posts :post2>, 2010-02-12T13:36:28)
          (<:post2 :talksAboutNegatively :Duomo>, 2010-02-12T13:36:28)
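      The data type above can be mirrored in a few lines of Python. This is only a sketch of the definition, not the C-SPARQL Engine's internal representation: triples are plain tuples of strings, and the helper name `is_valid_rdf_stream` is hypothetical.

```python
from datetime import datetime


def is_valid_rdf_stream(stream):
    """True if timestamps never decrease; equal timestamps are allowed."""
    timestamps = [ts for _triple, ts in stream]
    return all(a <= b for a, b in zip(timestamps, timestamps[1:]))


# The four (triple, timestamp) pairs from the slide.
stream = [
    ((":Alice", ":posts", ":post1"), datetime.fromisoformat("2010-02-12T13:34:41")),
    ((":post1", ":talksAboutPositively", ":LaScala"), datetime.fromisoformat("2010-02-12T13:34:41")),
    ((":Bob", ":posts", ":post2"), datetime.fromisoformat("2010-02-12T13:36:28")),
    ((":post2", ":talksAboutNegatively", ":Duomo"), datetime.fromisoformat("2010-02-12T13:36:28")),
]
print(is_valid_rdf_stream(stream))
print(is_valid_rdf_stream(list(reversed(stream))))
```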
  • 46. MEMO: SPARQL
  • 47. The solution space – Stream Reasoning "alla PoliMi": Where C-SPARQL Extends SPARQL
  • 48. The solution space – Stream Reasoning "alla PoliMi": An Example of C-SPARQL Query
      Who are the opinion makers? i.e., the users who are likely to influence the behavior of other users who follow them

      REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
      CONSTRUCT { ?opinionMaker sd:about ?resource }
      FROM STREAM <http://streamingsocialdata.org/interactions> [RANGE 30m STEP 5m]
      WHERE {
        ?opinionMaker ?opinion ?resource .
        ?follower sioc:follows ?opinionMaker .
        ?follower ?opinion ?resource .
        FILTER ( cs:timestamp(?follower) > cs:timestamp(?opinionMaker)
                 && ?opinion != sd:accesses )
      }
      HAVING ( COUNT(DISTINCT ?follower) > 3 )
  • 49. The solution space – Stream Reasoning "alla PoliMi": An Example of C-SPARQL Query
      Who are the opinion makers? i.e., the users who are likely to influence the behavior of other users who follow them
      [Figure: the opinion-makers C-SPARQL query (REGISTER STREAM OpinionMakers COMPUTED EVERY 5m ... HAVING ( COUNT(DISTINCT ?follower) > 3 )) with the following call-outs.]
      §  REGISTER STREAM ... COMPUTED EVERY 5m: query registration (for continuous execution)
      §  CONSTRUCT: RDF Stream added as new output format
      §  FROM STREAM <http://streamingsocialdata.org/interactions>: FROM STREAM clause
      §  [RANGE 30m STEP 5m]: WINDOW
      §  cs:timestamp(...): builtin to access timestamps
      §  HAVING ( COUNT(DISTINCT ?follower) > 3 ): aggregates as in SPARQL 1.1
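      The window `[RANGE 30m STEP 5m]` selects, every 5 minutes, the triples of the last 30 minutes. A minimal Python sketch of that selection follows. It is an approximation under stated assumptions: the window is taken as open on the left and closed on the right, the first evaluation happens once a full range has elapsed, and the stream is a finite list. The function name and the sample triples are hypothetical, and the real engine evaluates windows incrementally rather than rescanning the stream.

```python
from datetime import datetime, timedelta


def sliding_windows(stream, first_close, range_, step, evaluations):
    """Return (closing time, window content) for each evaluation.

    stream: list of (triple, timestamp) pairs with non-decreasing timestamps.
    A window closing at time c contains the pairs with c - range_ < ts <= c.
    """
    results = []
    for i in range(evaluations):
        close = first_close + i * step
        open_ = close - range_
        content = [triple for triple, ts in stream if open_ < ts <= close]
        results.append((close, content))
    return results


t0 = datetime(2010, 2, 12, 13, 0, 0)
stream = [
    (("Alice", "likes", "LaScala"), t0 + timedelta(minutes=2)),
    (("Bob", "likes", "LaScala"), t0 + timedelta(minutes=8)),
    (("Carol", "likes", "Duomo"), t0 + timedelta(minutes=33)),
]
evaluated = sliding_windows(
    stream,
    first_close=t0 + timedelta(minutes=30),
    range_=timedelta(minutes=30),
    step=timedelta(minutes=5),
    evaluations=3,
)
for close, content in evaluated:
    print(close.time(), [triple[0] for triple in content])
```

      Because the step is shorter than the range, consecutive windows overlap: Bob's triple is visible in the first two evaluations, and Alice's drops out as soon as the window slides past it.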
  • 50. The solution space – Stream Reasoning "alla PoliMi": Efficiency of C-SPARQL Query Evaluation
      §  The window-based selection of C-SPARQL outperforms the standard FILTER-based selection
  • 51. The solution space – Stream Reasoning "alla PoliMi": Efficiency of C-SPARQL Query Evaluation
      §  The C-SPARQL algebra allows pushing of filters and projections
  • 52. The solution space – Stream Reasoning "alla PoliMi": High Throughputs of C-SPARQL Engine
  • 53. The solution space – Stream Reasoning "alla PoliMi": Incremental Materialization evaluation
      §  base-line: re-computing the materialization from scratch
      §  state-of-the-art (materialized view incremental maintenance)
      §  PoliMi's incremental stream approach [ESWC2010]
      [Figure: chart comparing the three approaches; horizontal axis: % of the materialization changed when the window slides.]
  • 54. The solution space – Stream Reasoning "alla PoliMi": Incremental Maintenance and Query Latency
      §  comparison of the average time needed to answer a C-SPARQL query using
          •  a backward reasoner
          •  the naive approach of re-computing the materialization
          •  PoliMi's incremental-stream approach

      Time (ms)          backward reasoning    naive approach    incremental-stream
      query              5.82                  1.61              1.61
      materialization    0                     15.91             0.28
  • 55. The solution space: Stream Reasoning, Community Achievements
      §  RDF Stream data type
          •  Adopted by most of the research groups active on Stream Reasoning
          •  Alternative solution based on two timestamps used in eTalis
      §  Continuous query language
          •  C-SPARQL was extended by the community
          •  Alternative solutions have been studied
              –  without FROM STREAM clause [CQUELS]
              –  oriented to complex event processing [2]
      §  Reasoning
          •  Data-driven for RDFS++ [ESWC2010]
          •  Goal-driven for temporal logics (eTalis) [2]
          •  Time-decaying logic programs [26]
          •  Inductive reasoning [IEEE-IS2010]
      §  Implementation Experiences
          •  C-SPARQL Engine
          •  eTalis / EP-SPARQL
          •  CQUELS
          •  S2R
  • 56. The solution space: Stream Reasoning, next steps
      §  Scientific
          •  Notions of soundness and completeness
          •  More expressive reasoning
              –  with minor loss in throughput
              –  and predictable loss in scalability
          •  Dealing with incomplete & noisy data
          •  Parallelization and distribution of the processing
      §  Technical
          •  Prove effectiveness and efficacy in specific application domains
          •  Better integrate continuous semantics with Linked Data
          •  Design and develop a software framework to simplify stream reasoning application development
      §  Organizational
          •  Standardize RDF Stream, C-SPARQL, Streaming Linked Data, etc.
  • 57. The solution space: Wrap-up of Stream Reasoning
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning.]
  • 58. The solution space: Top-k reasoning
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning, Top-k reasoning.]
  • 59. The solution space: Top-k reasoning approach
      §  In traditional reasoning, ranking of results is normally considered a task that increases the hopelessness of scaling inference to massive data sets
      §  Top-k reasoning should, instead, overcome this common practice and interleave ordering and reasoning
      §  W.r.t. the running example, top-k reasoning should make it possible to efficiently
          •  compute the top-k social media users who are well known to lead discussions on fashion-related topics and are closest to the requester's current location.
  • 60. The solution space: Top-k reasoning attempts
      §  SoftFacts [60]
          •  an ontology-mediated top-k information retrieval system over relational databases
      §  SparqlRank [13]
          •  adds order to the SPARQL algebra as a first class citizen and experimentally shows the performance gain
      §  AnQL [41]
          •  extends SPARQL to query RDFS annotated with a bounded lattice (which thus comes with a partial ordering).
      §  Notion of exact top-k closure of an ontology w.r.t. a query and a scoring function [53]
  • 61. The solution space: Top-k queries in SPARQL 1.1
      §  Retrieve the best 10 offers ordered by a function of user ratings of the product and offer price:

      SELECT ?product ?offer
             (g1(?avgRat1) + g2(?avgRat2) + g3(?price) AS ?score)
      WHERE {
        ?product hasAvgRat1 ?avgRat1 .
        ?product hasAvgRat2 ?avgRat2 .
        ?product hasName ?name .
        ?product hasOffers ?offer .
        ?offer hasPrice ?price
      }
      ORDER BY DESC(?score)
      LIMIT 10

      §  Slow = tens of seconds on 5M triples (could be improved to milliseconds)
  • 62. The solution space - Top-k queries in SPARQL 1.1: Challenges
      §  Adapting SQL optimizations to SPARQL is not straightforward:
          •  Different algebra
          •  Different cost of data access in native RDF triplestores
              –  Sorted access is slow, random access is fast
          •  Additional optimization dimensions
              –  Pushing the evaluation of BGPs into the storage
      §  Research tasks
          •  New algebra for SPARQL where order is a first class citizen
          •  New algorithms, and
          •  optimization techniques
  • 63. The solution space - Top-k queries in SPARQL 1.1: The SPARQL-Rank algebra
      §  Extends the standard SPARQL algebra
      §  Ranked set of mappings: set of mappings augmented with an order relation
      [Figure labels: new operators, extended operators, equivalences.]
  • 64. The solution space – SPARQL-Rank algebra: The new Rank Operator
      Scoring function: F(p1, p2) = ?p1 + ?p2

      Ω (input mappings)
            ?x   ?y   ?p1   ?p2
      µ1    1    8    0.8   0.8
      µ2    3    3    0.3   0.6
      µ3    3    4    0.4   0.6

      ρp1(Ω) (mappings ranked after evaluating p1)
            ?x   ?y   ?p1   ?p2   Fp1
      µ1    1    8    0.8   0.8   1.8
      µ3    3    4    0.4   0.6   1.4
      µ2    3    3    0.3   0.6   1.3
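      The Fp1 column is consistent with reading the rank operator as follows: once p1 has been evaluated, each mapping is ordered by an upper bound of F in which every predicate not yet evaluated is replaced by its maximum possible value, assumed here to be 1. The Python sketch below reproduces the slide's numbers under that assumption. It is an illustration, not the ARQ-Rank implementation; mappings are plain dictionaries and the function names are hypothetical.

```python
def upper_bound(mapping, evaluated, predicates, max_score=1.0):
    """Upper bound of F = sum of predicates, given the evaluated ones."""
    return sum(mapping[p] if p in evaluated else max_score for p in predicates)


def rank(mappings, evaluated, predicate, predicates):
    """Rank operator: evaluate one more predicate and re-order by upper bound."""
    evaluated = evaluated | {predicate}
    ordered = sorted(
        mappings,
        key=lambda m: upper_bound(m, evaluated, predicates),
        reverse=True,
    )
    return ordered, evaluated


# The input mappings from the slide.
omega = [
    {"name": "mu1", "x": 1, "y": 8, "p1": 0.8, "p2": 0.8},
    {"name": "mu2", "x": 3, "y": 3, "p1": 0.3, "p2": 0.6},
    {"name": "mu3", "x": 3, "y": 4, "p1": 0.4, "p2": 0.6},
]
ranked, done = rank(omega, evaluated=set(), predicate="p1", predicates=["p1", "p2"])
for m in ranked:
    print(m["name"], round(upper_bound(m, done, ["p1", "p2"]), 2))
```

      Ordering by an upper bound rather than by the final score is what lets later operators stop early: a mapping whose bound is already below the k-th best complete score can never enter the result.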
  • 65. The solution space – SPARQL-Rank algebra: The redefined Join Operator

      Ωp1 (ranked on p1)
            ?x   ?y   ?p1   ?p2   Fp1
      µ1    1    8    0.8   0.8   1.8
      µ3    3    4    0.4   0.6   1.4
      µ2    3    3    0.3   0.6   1.3

      Ω'p2 (ranked on p2)
            ?x   ?z   ?p2   Fp2
      µ4    1    9    0.8   1.8
      µ5    3    0    0.6   1.6

      Join result (ranked on p1 ∪ p2)
                  ?x   ?y   ?z   ?p1   ?p2   Fp1∪p2
      µ1 ∪ µ4     1    8    9    0.8   0.8   1.6
      µ3 ∪ µ5     3    4    0    0.4   0.6   1.0
      µ2 ∪ µ5     3    3    0    0.3   0.6   0.9
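      The join result can be reproduced with a naive Python sketch: merge every pair of compatible mappings (mappings that agree on all shared variables, as in the standard SPARQL join) and order the merged mappings by the score over the union of the evaluated predicates. This nested-loop version only illustrates the semantics of the redefined join; it is not one of the rank-join algorithms such as HRJN, which avoid reading the inputs in full. The function names are hypothetical.

```python
def compatible(a, b):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())


def ranked_join(left, right, score):
    """Join compatible mappings and order the result by descending score."""
    merged = [{**a, **b} for a in left for b in right if compatible(a, b)]
    return sorted(merged, key=score, reverse=True)


# The two ranked inputs from the slide.
omega_p1 = [
    {"x": 1, "y": 8, "p1": 0.8, "p2": 0.8},  # mu1
    {"x": 3, "y": 4, "p1": 0.4, "p2": 0.6},  # mu3
    {"x": 3, "y": 3, "p1": 0.3, "p2": 0.6},  # mu2
]
omega_p2 = [
    {"x": 1, "z": 9, "p2": 0.8},  # mu4
    {"x": 3, "z": 0, "p2": 0.6},  # mu5
]
joined = ranked_join(omega_p1, omega_p2, score=lambda m: m["p1"] + m["p2"])
for m in joined:
    print(m["x"], m["y"], m["z"], round(m["p1"] + m["p2"], 2))
```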
  • 66. The solution space – SPARQL-Rank algebra: Rank Join Algorithms
      §  Different algorithms based on available access in the inputs:
          •  Hash Rank-Join
              –  e.g. HRJN [Ilyas2004]
          •  Random Access Rank-Join
              –  e.g. RA-HRJN [Ilyas2004]
          •  RankSequence (e.g., RSEQ), NEW [ISWC2012]
              –  Minimum sorted access
              –  Leverages random access
      [Figure: three operator diagrams. (a) RankJoin: sortedAccess on both inputs. (b) RankSequence: sortedAccess on one input, randomAccess on the other. (c) RA-RankJoin: sortedAccess and randomAccess on both inputs.]
  • 67. The solution space – SPARQL-Rank algebra: The new Algebraic Equivalences, Split
  • 68. The solution space – SPARQL-Rank algebra: The new Algebraic Equivalences, Interleave
  • 69. The solution space – SPARQL-Rank algebra: Planning Strategies
      §  Apply algebraic equivalences
      §  Result: three possible strategies
          1.  Rank of BGPs
          2.  Interleaved
          3.  Rank Join
  • 70. The solution space – SPARQL-Rank algebra: Planning Strategies, rank of BGPs (ROB)
      §  Substitute the monolithic scoring function with a number of incremental rank operators (rho)
      [Figure: query plans for the example query, each topped by a projection on ?pr, ?of, ?score and SLICE [0,10]. The baseline plan applies ORDER [?score] and EXTEND [?score = g1(?a1)+g2(?a2)+g3(?p1)] over the whole graph pattern (?pr hasA1 ?a1 . ?pr hasA2 ?a2 . ?pr hasN ?n . ?pr hasO ?of . ?of hasP ?p1); the ROB plan applies the rank operators g1(?a1), g2(?a2), g3(?p1) over a seqScan of the same pattern.]
  • 71. The solution space – SPARQL-Rank algebra: Planning Strategies, Interleaved (INTER)
      §  Separate the pattern in two groups:
          •  Triple patterns that influence the ranking
          •  Triple patterns that don't influence the ranking
      [Figure: query plans for the example query, each topped by a projection on ?pr, ?of, ?score and SLICE [0,10]. The plans shown combine Sequence and RankJoin operators on ?pr = ?pr with the rank operators g1(?a1), g2(?a2), g3(?p1), over orderScan and seqScan accesses to the triple patterns (?pr hasA1 ?a1 . ?pr hasA2 ?a2 . ?pr hasN ?n . ?pr hasO ?of . ?of hasP ?p1).]
  • 72. The solution space – SPARQL-Rank algebra: Planning Strategies, Rank-Join (RJ)
      §  Split into one pattern for each ranking criterion
      §  Use the most appropriate join based on type of access
      [Figure: query plans for the example query, each topped by a projection on ?pr, ?of, ?score and SLICE [0,10]. (a) Baseline: ORDER [?score] and EXTEND [?score = g1(?a1)+g2(?a2)+g3(?p1)] over the whole graph pattern. (b) RJ plan: RankJoin operators on ?pr = ?pr combining one ranked pattern per criterion, g1(?a1) over ?pr hasA1 ?a1, g2(?a2) over ?pr hasA2 ?a2, and g3(?p1) over ?pr hasO ?of . ?of hasP ?p1, with a Join for ?pr hasN ?n.]
  • 73. The solution space – SPARQL-Rank algebra: Experimental evidence of performance improvements
      §  Example query, 5M triples dataset
      §  Assumption: availability of sorted access indexes
      [Figure: results chart annotated "Two orders of magnitude better".]
  • 74. The solution space – SPARQL-Rank algebra: Experimental evidence of performance improvements
      §  Benchmark: 8 queries on an extension of BSBM
  • 75. The solution space: Wrap-up of Top-k Reasoning
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning, Top-k reasoning.]
  • 76. The solution space: Full-fledged Order-aware reasoning
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning, Top-k reasoning, Order-aware reasoning.]
  • 77. The solution space: Full-fledged Order-aware reasoning
      §  In full-fledged order-aware reasoning, data- and query-driven inference methods have to deal with combinations of natural, cheap-to-enforce and expensive-to-enforce types of orders.
          •  the naive assumption of independence of orderings would have to be relaxed
          •  theories and methods that exploit mutual relationships between the three types of orders have to be rethought
      §  Considering our running example, methods implementing order-aware reasoning are the only ones able to answer the query
          •  Which users of social media, currently leading popular discussions on fashion-related topics, are closest to my current location? What are they saying about the shopping district nearby?
  • 78. The solution space: Full-fledged Order-aware reasoning
      §  State-of-the-art
          •  None
      §  Promising work
          •  The Answer Set Programming (ASP) community has recently proposed a streaming algorithm for ASP [25] that
              1.  ranks the constants referring to domain elements and,
              2.  fetches them, increasing the domain size until an answer set is found.
      §  Challenges
          •  a theoretical framework that unifies and generalises those defined for stream reasoning and top-k reasoning
          •  designing and testing scalable data- and query-driven methods that allow for efficient answering of queries that involve all types of orders
  • 79. The solution space: Wrap-up of Top-k Reasoning
      [Figure: solution-space grid (types of orders: no ordering, natural, cheap to enforce, expensive to enforce, combinations; types of reasoning: no reasoning, data-driven, query-driven, combinations). Regions labelled: Order-aware data management, Scalable reasoning, Stream reasoning, Top-k reasoning, Order-aware reasoning.]
  • 80. References: My papers
      [IEEE-IS2009] E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel: It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)
      [EDBT2010] D.F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus: An Execution Environment for C-SPARQL Queries. EDBT 2010
      [WWW2009] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062
      [IEEE-IS2010] D. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems, 30 Aug. 2010
      [JWS2012] M. Balduini, I. Celino, E. Della Valle, D. Dell'Aglio, Y. Huang, T. Lee, S. Kim, V. Tresp: BOTTARI: an Augmented Reality Mobile Application to deliver Personalized and Location-based Recommendations by Continuous Analysis of Social Media Streams. JWS 2012. In press
      [ESWC2010] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Incremental Reasoning on Streams and Rich Background Knowledge. ESWC 2010
      [SWJ2012] E. Della Valle, S. Schlobach, M. Krötzsch, A. Bozzon, S. Ceri, I. Horrocks: Order Matters! Harnessing a World of Orderings for Reasoning over Massive Data. In press
      [ISWC2012] S. Magliacane, A. Bozzon, E. Della Valle: Efficient Execution of Top-k SPARQL Queries. ISWC 2012. In press
  • 81. Downloads
      §  C-SPARQL Engine (no reasoning support)
          •  A ready-to-go pack for Eclipse
              –  http://streamreasoning.org/download
          •  Source code available on request
      §  SPARQL-Rank Engine (ARQ-Rank)
          •  Source code and experimental data
              –  http://sparqlrank.search-computing.org/
  • 82. Thank You! Any questions? emanuele.dellavalle@polimi.it
      Keep an eye on http://www.streamreasoning.org There's much more to come!