1. Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
On-The-Fly Generation of
Multidimensional Data Cubes for
Web of Things
Muntazir Mehdi
(DERI, TU Kaiserslautern)
Stefan.Decker@deri.org
http://www.StefanDecker.org/
Agenda
Motivation and Background
• Problem statement, Use case, Linked Data, WoT
Processing Metadata for Cube Creation
• Capturing and Publishing Sensor data, Event Registration
Cube Generation
• EDWH Agent, An example Scenario
Other Potential Use Cases
Results and Evaluation
Conclusion
Motivation & Background
Enterprises produce huge amounts of data, which makes
data management, exchange and decision making complex.
Use Case (Smart Buildings)
1. Rely on sensor data for decision making
2. Heterogeneous and Big Data management
3. Event processing can be applied to sustain decision making
4. Limited support for decision making with event processing techniques
5. Controlling supply / demand based on statistical data
6. Identify meaningful events and deal with them as soon as possible
Motivation & Background (continued)
Heterogeneous Data Management
1. Different data is generated by different applications within one or
more smart environments.
2. For example: a smart city relying on combined data from different
smart buildings.
3. Linked Data: a set of best practices for representing data as RDF and
linking it to other RDF data.
4. Linked Open Data (LOD) Cloud: a huge, openly available cloud of
linked data from different domains.
Motivation & Background (continued)
Big Data Management
1. A fast response to complex queries to support event processing.
2. Huge amounts of sensor data as RDF.
3. Generation of real-time multidimensional and contextual data cubes
to sustain fast responses to complex queries.
4. An event data-warehouse.
5. Multidimensional shape of data in a data warehouse = a data cube =
information structured into dimensions and facts (measures).
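The idea of dimensions and measures can be illustrated with a minimal sketch. The dimension names (sensor, location, hour), the measure (watts) and the sample values are hypothetical and do not come from the slides; the sketch only shows how facts are aggregated into cube cells along chosen dimensions.

```python
from collections import defaultdict
from typing import NamedTuple


class Fact(NamedTuple):
    """One observation: dimension values plus a numeric measure."""
    sensor: str     # dimension (hypothetical)
    location: str   # dimension (hypothetical)
    hour: int       # time dimension, hour of day (hypothetical)
    watts: float    # measure (hypothetical)


def roll_up(facts, dimensions):
    """Aggregate the measure into one cell per combination of dimension values."""
    cells = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for fact in facts:
        key = tuple(getattr(fact, name) for name in dimensions)
        cells[key]["sum"] += fact.watts
        cells[key]["count"] += 1
    return dict(cells)


facts = [
    Fact("s1", "room-1", 9, 120.0),
    Fact("s1", "room-1", 9, 80.0),
    Fact("s2", "room-2", 9, 50.0),
    Fact("s2", "room-2", 10, 70.0),
]

# The same facts viewed along one dimension and along two dimensions.
by_location = roll_up(facts, ["location"])
by_location_hour = roll_up(facts, ["location", "hour"])
```

Adding a dimension splits cells further, which is why the number and size of cube cells depend on the dimensions chosen.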
Motivation & Background (continued)
Why a data warehouse for events?
1. Data characteristics:
• Logged once, never updated
• Flat data, no need to normalize
• Incoming data is temporal (based on time)
2. Objective characteristics:
• Reporting, analysis, prediction, mining, pattern identification, ...
• Use a data model that speeds up querying, unlike a transactional processing system
• Provide a historical repository containing the features of interest
• Support Complex Event Processing
Motivation & Background (continued)
Web of Things
1. Extending the Web to blend in real-world objects such as electronic
appliances, sensors and embedded devices.
2. Although our use case is limited to sensor data, the approach can be
extended to other real-world objects.
3. CoAP (Constrained Application Protocol): a Web transfer protocol for
constrained devices that follows a request/response model.
Related Work
Antoniades, Athos, et al. "Linked2Safety: A secure linked data medical information space for semantically-interconnecting EHRs advancing patients' safety in medical research." 2012 IEEE 12th International Conference on Bioinformatics & Bioengineering (BIBE). IEEE, 2012.
Lefort, Laurent, et al. "A Linked Sensor Data Cube for a 100 Year Homogenised Daily Temperature Dataset." SSN. 2012.
ENERGIE VISIBLE (http://www.webofthings.org/energievisible/)
Processing Metadata for Cube Generation
Involves two major steps:
1. Capturing and Publishing Sensor Data
2. Event Registration
Capturing and Publishing Sensor Data: An Example Scenario
[Figure: several sensor event streams; labels RDF, SSN and JMS Server. Speech bubble: "Oh wait, I see a way of converting them into RDF, add relevant metadata, & publish on JMS Server"]
Capturing and Publishing Sensor Data: Process
[Figure: process diagram with the components: sensors S1, S2, S3, ..., Sn; UDP Listeners & CoAP Clients; Filter; RDFizer; Enricher; Metadata Knowledge Base; JMS Publisher; JMS Server]
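The components named in the process diagram can be sketched as a small pipeline. This is only an illustration under stated assumptions: the stage order (filter, RDFize, enrich, publish) is inferred from the component names, the `http://example.org/` vocabulary is a placeholder rather than the SSN or EDWH vocabulary, readings are assumed to arrive as JSON strings, and an in-memory list stands in for the JMS server.

```python
import json

EX = "http://example.org/"  # placeholder namespace, not taken from the slides


def keep(reading):
    """Filter: keep only readings that carry a numeric value."""
    return isinstance(reading.get("value"), (int, float))


def rdfize(reading):
    """RDFizer: turn one reading into N-Triples lines."""
    subject = f"<{EX}event/{reading['id']}>"
    return [
        f"{subject} <{EX}sensor> <{EX}sensor/{reading['sensor']}> .",
        f"{subject} <{EX}value> \"{reading['value']}\" .",
        f"{subject} <{EX}time> \"{reading['time']}\" .",
    ]


def enrich(triples, reading, knowledge_base):
    """Enricher: append metadata about the sensor from the knowledge base."""
    subject = f"<{EX}event/{reading['id']}>"
    metadata = knowledge_base.get(reading["sensor"], {})
    extra = [
        f"{subject} <{EX}{prop}> \"{value}\" ."
        for prop, value in sorted(metadata.items())
    ]
    return triples + extra


def run_pipeline(raw_messages, knowledge_base, publish):
    """Run every raw message through filter, RDFizer, enricher and publisher."""
    for raw in raw_messages:
        reading = json.loads(raw)
        if not keep(reading):
            continue
        publish("\n".join(enrich(rdfize(reading), reading, knowledge_base)))


# Hypothetical sample data; the list 'published' stands in for the JMS server.
published = []
knowledge_base = {"s1": {"location": "room-1", "unit": "watt"}}
raw_messages = [
    '{"id": 1, "sensor": "s1", "value": 120.5, "time": "2013-04-01T09:00:00"}',
    '{"id": 2, "sensor": "s1", "value": null, "time": "2013-04-01T09:01:00"}',
]
run_pipeline(raw_messages, knowledge_base, published.append)
```

Passing the publisher as a callable keeps the sketch independent of any particular messaging client; a real deployment would hand in a function that sends the message to a JMS topic.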
Event Registration: EDWH Ontology
[Figure: ontology diagram with the labels NamedCubeGraph, Configuration, Dimension, Measure, Source Event, JMSSource]
Event Registration Process
1. Specify Event Type
2. Specify Event Source
3. Select Measures
4. Select Dimensions
5. Specify Graph Details
Result: EDWH Ontology Instance
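The registration steps can be sketched as a small record that is only turned into triples once every step has been completed. The class names Configuration, JMSSource and NamedCubeGraph are taken from the ontology slide; the namespace IRI, the property names (eventType, source, graph, measure, dimension) and the sample values are assumptions made for illustration, since the slides do not give them.

```python
from dataclasses import dataclass, field

EDWH = "http://example.org/edwh#"  # placeholder; the real ontology IRI is not given
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class EventRegistration:
    event_type: str = ""                            # step 1
    source: str = ""                                # step 2, e.g. a JMS topic name
    measures: list = field(default_factory=list)    # step 3
    dimensions: list = field(default_factory=list)  # step 4
    graph: str = ""                                 # step 5, IRI of the cube graph

    def validate(self):
        """Raise ValueError if any of the five registration steps was skipped."""
        missing = [name for name, value in vars(self).items() if not value]
        if missing:
            raise ValueError("missing registration steps: " + ", ".join(missing))

    def to_triples(self):
        """Return the registration as (subject, predicate, object) triples."""
        self.validate()
        config = EDWH + "config/" + self.event_type
        source = EDWH + "source/" + self.source
        triples = [
            (config, RDF_TYPE, EDWH + "Configuration"),
            (config, EDWH + "eventType", self.event_type),  # hypothetical property
            (config, EDWH + "source", source),              # hypothetical property
            (source, RDF_TYPE, EDWH + "JMSSource"),
            (config, EDWH + "graph", self.graph),           # hypothetical property
            (self.graph, RDF_TYPE, EDWH + "NamedCubeGraph"),
        ]
        triples += [(config, EDWH + "measure", m) for m in self.measures]
        triples += [(config, EDWH + "dimension", d) for d in self.dimensions]
        return triples


registration = EventRegistration(
    event_type="EnergyReading",
    source="energy-topic",
    measures=["watts"],
    dimensions=["time"],
    graph="http://example.org/cubes/energy",
)
triples = registration.to_triples()
```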
Cube Generation
1. Requires an event to be registered in the system.
2. The current implementation generates cubes based on the
time dimension only; however, it can be extended to cover
other dimensions.
3. Critical component: EDWH Agent
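Cube generation along the time dimension can be sketched as grouping events into time buckets and aggregating a measure per bucket. This is not the EDWH Agent itself: the event format, the aggregate names and the reading of "quarter" as a quarter of an hour are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def bucket_start(ts, granularity):
    """Truncate a timestamp to the start of its time bucket."""
    if granularity == "quarter":  # assumption: a quarter of an hour
        return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unknown granularity: {granularity}")


def build_time_cube(events, granularity):
    """Aggregate (iso_timestamp, value) events into one cell per time bucket."""
    values = defaultdict(list)
    for iso_time, value in events:
        start = bucket_start(datetime.fromisoformat(iso_time), granularity)
        values[start].append(value)
    return {
        start: {
            "count": len(v),
            "sum": sum(v),
            "min": min(v),
            "max": max(v),
            "avg": sum(v) / len(v),
        }
        for start, v in values.items()
    }


# Hypothetical sample events.
events = [
    ("2013-04-01T09:05:00", 10.0),
    ("2013-04-01T09:20:00", 20.0),
    ("2013-04-01T09:25:00", 30.0),
    ("2013-04-01T10:00:00", 40.0),
]
quarter_cube = build_time_cube(events, "quarter")
hour_cube = build_time_cube(events, "hour")
day_cube = build_time_cube(events, "day")
```

Supporting a further dimension would mean extending the bucket key from a single timestamp to a tuple of dimension values.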
Cube Generation: An Example Scenario
[Figure: RDF from the JMS server "meets Mr. Cubes"; labels EDWH Ontology, "Cubes as RDF" and Cube Store]
Use Case: 1
The electricity usage at location X for duration Y for consumer Z
has been moderate compared to the previous duration W.
Use Case: 2
Historical data suggests that the weather is going to be windy and
rainy in Galway even after Easter.
Use Case: 3
Some suspicious activity has been detected on your credit card!
Use Case: 4
Linked cube stores: each of these things can be achieved from one place.
Evaluation
We evaluated our system in terms of
1. Total number of cubes generated
2. Size of each cube
3. Accuracy of generated cubes
4. Impact of adding and removing dimensions on size of cube
5. Performance of the system to generate cubes
6. Query Execution Time (QET)
Evaluation
[Chart: time in milliseconds (axis from 0 to 16000) for the Quarter Cube, Day Cube and Hour Cube]
Evaluation: Impact of dimensions
[Chart: storage size per cube in KB (axis from 0 to 200) for the Quarter, Hour and Day cubes, each with 1, 2 and 3 dimensions]
Conclusion
With the approach presented, we were able to enrich
events with necessary metadata, and process
enriched events to generate on-the-fly data cubes.
Based on the performance charts shown in the previous
slides, we conclude that our approach is a viable way
of generating data cubes on the fly in a real-time
sensor network.