This document discusses analyzing and specifying concerns for data as a service (DaaS). It outlines key issues including a lack of standardized models and terminology for describing data quality, context, and other concerns. The document presents examples of how data concerns are important for mashups, open data usage, and smart environments. It also discusses challenges like missing evaluation techniques and APIs for concerns. The document proposes developing linked models, concern evaluation techniques, and APIs to address these issues.
1. Advanced Services Engineering,
WS 2012, Lecture 4
Analyzing and Specifying Concerns for
DaaS
Hong-Linh Truong
Distributed Systems Group,
Vienna University of Technology
truong@dsg.tuwien.ac.at
http://www.infosys.tuwien.ac.at/staff/truong
ASE WS 2012 1
2. Outline
What are data concerns and why they are
important
Issues in DaaS concerns
Analysis and specification of DaaS concerns
3. What are data concerns?
data .... .... DaaS data assets
APIs, Querying, Data Management, etc.
Located in US?
Quality of data?
Privacy problem?
Service quality?
price?
free?
redistribution?
4. DaaS Concerns
data .... .... DaaS data assets
APIs, Querying, Data Management, etc.
Data concerns: Quality of data, Ownership, Price, License, ....
DaaS concerns include QoS, quality of data (QoD),
service licensing, data licensing, data governance, etc.
5. Why are DaaS/data concerns
important?
Too much data returned to the
consumer/integrator
Results are returned without clear usage and
ownership information, causing data compliance problems
Ultimate goal: to provide relevant data with
acceptable constraints on data concerns
6. Example: Mashup (1)
Composition of Yahoo! Boss News Search,
Google News Search, and Flickr
Recent news and high-quality images, free of
charge, related to "Haiti earthquake"
Hong Linh Truong, Marco Comerio, Andrea Maurino, Schahram Dustdar, Flavio De Paoli, Luca Panziera: On
Identifying and Reducing Irrelevant Information in Service Composition and Execution. WISE 2010: 52-66
8. If the composer is aware of context
and quality parameters
Possible mappings of context and quality
requirements
but this is a tedious task that is hard to automate, and we
cannot be sure that the mapping is correct.
9. Example: open data (1)
Retrieve big datasets from RESTful services for further
extraction, transformation, or data composition activities
http://www.undata-api.org/
10. Example: open data (2)
Example: study the population growth and
literacy rate from 1990-2009 for all countries in
the world
Without QoD: get datasets and perform mashup
11. Example: open data (2)
[Figure: data matrix with one row per country (223 rows) and one column per year, 1990 ... 2009]
With QoD support:
Population annual growth rate (percent):
dataelementcompleteness= 0.8654708520179372,
datasetcompleteness=0.7356502242152466;
Adult literacy rate (percent):
dataelementcompleteness=0.5874439461883408
datasetcompleteness=0.04349775784753363
Should we retrieve the data and perform data
composition?
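The slides report the two completeness values without defining them. The following is a minimal sketch under an assumed reading: data element completeness is taken as the fraction of rows (countries) that have at least one value, and dataset completeness as the fraction of all country/year cells that are filled. The function names, the threshold, and the toy data are hypothetical and only illustrate how such values could drive the decision to retrieve a dataset.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]


def data_element_completeness(rows: Sequence[Row]) -> float:
    """Fraction of rows (e.g. countries) with at least one non-missing value.

    Assumed definition; the slides do not spell it out.
    """
    if not rows:
        return 0.0
    covered = sum(1 for row in rows if any(v is not None for v in row))
    return covered / len(rows)


def dataset_completeness(rows: Sequence[Row]) -> float:
    """Fraction of all cells (row x year) that hold a non-missing value.

    Assumed definition; the slides do not spell it out.
    """
    total = sum(len(row) for row in rows)
    if total == 0:
        return 0.0
    filled = sum(1 for row in rows for v in row if v is not None)
    return filled / total


def worth_retrieving(rows: Sequence[Row], min_dataset_completeness: float) -> bool:
    """Decide whether to fetch and compose a dataset, given a consumer threshold."""
    return dataset_completeness(rows) >= min_dataset_completeness


# Toy data: 4 countries x 3 years, None marks a missing value.
toy = [
    [1.2, 1.3, None],
    [None, None, None],
    [0.8, None, None],
    [2.0, 2.1, 2.2],
]
```

In practice a DaaS would publish these values alongside the dataset, so that the consumer can apply a check like `worth_retrieving` before downloading anything.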
12. Example: smart environments
Smart environments with several low level sensors:
Recognize human activities: idle, relaxing, and cleaning
up,
Provide context information for adaptive service
discovery and execution
E.g., FP7 SM4All, FP7 EU OPPORTUNITY
Virtual Sensor-as-a-Service provides human activities
13. Example: smart environments (2)
PoC: Probability of Correctness
QoC: Quality of Context
VSS: Virtual Sensor Service
CMS: Context Management Service
CCS: Context Consumer Service
AC: Appliances Control
AM: Ambiance Management
Atif Manzoor, Hong Linh Truong, Christoph Dorn, Schahram Dustdar: Service-centric Inference and Utilization of Confidence on Context. APSCC 2010: 11-18
14. Discussion time
WHAT ARE OTHER CASES
WHERE DAAS CONCERNS
ARE IMPORTANT?
15. Issues on DaaS concerns (1)
DaaS concern models
Unstructured description of context, QoS and
quality of data (QoD)
Different specifications and terminologies
Mismatching semantics of information about
services and data concerns
16. Issues on DaaS concerns (2)
DaaS APIs
No/Limited description of data and service
usage
No API for retrieving quality and context
information
No quality and context information associated
with requested data
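One way to picture the missing API support is a response that carries concern information next to the requested data, plus a separate document that exposes the concerns alone. The sketch below is purely illustrative: the `Concerns` fields, the two function names, and the JSON layout are assumptions made for this example, not an API of any existing DaaS.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Concerns:
    """Hypothetical concern metadata; the field names are assumptions."""
    dataset_completeness: float
    data_license: str
    service_location: str


def concerns_document(concerns: Concerns) -> str:
    """Body for a hypothetical 'concerns only' endpoint."""
    return json.dumps(asdict(concerns), sort_keys=True)


def data_document(records: List[Dict[str, Any]], concerns: Concerns) -> str:
    """Body that returns the requested data together with its concerns."""
    payload = {"data": records, "concerns": asdict(concerns)}
    return json.dumps(payload, sort_keys=True)


# Example values are made up for illustration.
example = Concerns(
    dataset_completeness=0.74,
    data_license="personal-use",
    service_location="EU",
)
body = json.loads(
    data_document([{"country": "AT", "year": 2009, "growth": 0.3}], example)
)
```

With such a layout a consumer could first read the concerns document and only request the data if the concerns are acceptable.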
17. Issues on DaaS concerns (3)
Evaluation techniques
Missing evaluation of compatibility of context
and concerns for multiple DaaS and data
assets
Missing evaluation techniques to filter
large/irrelevant data quantity
This requires a "holistic integration" of information models,
APIs and evaluation techniques for DaaS concerns!
18. Solutions needed
Developing meta-model and domain-dependent semantic
representations for quality and context information specifications
• Reconciliation of DaaS concern terms
• Linked DaaS concerns models
Developing context and DaaS concerns that can be accessed via open
APIs
• APIs extension
• External DaaS information service
Developing techniques for context and DaaS concerns evaluation
• On-the-fly data concerns evaluation
• Concerns compatibility evaluation and composition
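A concerns compatibility evaluation can be pictured as checking the concerns a DaaS publishes against the constraints a consumer states. The sketch below is a simplified assumption of how that could look: the `Offer` and `Requirement` types, their fields, and the sample services are hypothetical, and real concerns would come from linked models rather than flat records.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Offer:
    """Concerns published by one DaaS; all fields are hypothetical."""
    name: str
    price_per_call: float
    dataset_completeness: float
    location: str


@dataclass(frozen=True)
class Requirement:
    """Consumer constraints on concerns; None means 'no constraint'."""
    max_price_per_call: Optional[float] = None
    min_dataset_completeness: Optional[float] = None
    allowed_locations: Optional[FrozenSet[str]] = None


def is_compatible(offer: Offer, req: Requirement) -> bool:
    """True if the offer violates none of the stated constraints."""
    if (req.max_price_per_call is not None
            and offer.price_per_call > req.max_price_per_call):
        return False
    if (req.min_dataset_completeness is not None
            and offer.dataset_completeness < req.min_dataset_completeness):
        return False
    if (req.allowed_locations is not None
            and offer.location not in req.allowed_locations):
        return False
    return True


def select(offers: List[Offer], req: Requirement) -> List[Offer]:
    """Keep only offers whose concerns satisfy the requirement."""
    return [o for o in offers if is_compatible(o, req)]


# Made-up offers: the consumer wants free, fairly complete data hosted in the EU.
offers = [
    Offer("news-a", 0.0, 0.90, "EU"),
    Offer("news-b", 0.0, 0.40, "EU"),
    Offer("news-c", 0.02, 0.95, "US"),
]
req = Requirement(
    max_price_per_call=0.0,
    min_dataset_completeness=0.7,
    allowed_locations=frozenset({"EU"}),
)
selected = [o.name for o in select(offers, req)]
```

Filtering before retrieval is what keeps irrelevant or non-compliant data from reaching the consumer in the first place.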
20. DaaS concerns analysis and
specification
Which concerns are important in which
situations?
How to specify concerns?
Hong Linh Truong, Schahram Dustdar: On analyzing and specifying concerns for data as a service. APSCC 2009: 87-94
21. The importance of concerns in
DaaS consumer's view – data
governance
Storage/Database-as-a-Service
data DaaS
Data governance
Important factor, for example, the security and
privacy compliance, data distribution, and auditing
22. The importance of concerns in DaaS
consumer's view – quality of data
Read-only DaaS: Important factor for the selection of DaaS, for example the accuracy and completeness of the data and whether the data is up to date.
CRUD DaaS: Some support is expected to control the quality of the data in case the data is offered to other consumers.
23. The importance of concerns in
DaaS consumer's view – data and
service usage
Read-only DaaS: Important factor, in particular price, data and service APIs licensing, law enforcement, and intellectual property rights.
CRUD DaaS: Important factor, in particular price, service APIs licensing, and law enforcement.
24. The importance of concerns in
DaaS consumer's view – QoS
Read-only DaaS: Important factor, in particular availability and response time.
CRUD DaaS: Important factor, in particular availability, response time, dependability, and security.
25. The importance of concerns in DaaS
consumer's view – service context
Read-only DaaS: Useful factor, such as classification and service type (REST, SOAP), location.
CRUD DaaS: Important factor, e.g. location (for regulation compliance) and versioning.
26. Discussion time
WHAT ARE OTHER
IMPORTANT ISSUES? ADD
YOUR FINDINGS!
28. Capability concerns
Data Quality capabilities
Based on well-established research on data quality
Timeliness, up-to-date, free-of-error, cleaning, consistency,
completeness, domain-specific metrics, etc.
We mainly support the specification of QoD metrics for the whole
DaaS, but this can be extended to the service operation level
Data Security/Privacy capabilities
Data protection within DaaS, e.g. encryption, sensitive data
filtering, and data privacy
Many terms are based on the W3C P3P
29. Capability concerns (2)
Auditing capabilities
Logging, reporting (e.g., daily, weekly, and monthly),
and warning
Support system maintenance, SLA monitoring, billing,
and taxation
Data lifecycle
Backup/recovery, distribution (e.g., a service is in
Europe but data is stored in US), and disposition
Support system maintenance but also regulation on
data
30. Capability concerns (3)
Data and service license
Usage permission: for data (distribution, transfer,
personal use, etc.) and for service APIs (adaptation,
composition, derivation, etc.)
We utilize some terms from ODRL/ODRL-S
Copyrights
Liability: e.g., who is responsible for the loss due to a
network disruption?
Law enforcement (e.g., US or European court)
Domain-specific intellectual property rights
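When data from several DaaSs is merged, the usage permissions of the sources have to be combined. The sketch below uses one conservative rule as an assumption: a use is permitted on the composed result only if every contributing source permits it. The permission names echo the examples on this slide; representing a license as a plain set of strings is a simplification and does not model ODRL/ODRL-S.

```python
from typing import FrozenSet, Iterable


def composed_permissions(sources: Iterable[FrozenSet[str]]) -> FrozenSet[str]:
    """Permissions that hold for data merged from all given sources.

    Assumed rule: intersect the permission sets, so a use survives only
    if every source allows it. No sources means nothing is permitted.
    """
    source_list = list(sources)
    if not source_list:
        return frozenset()
    result = source_list[0]
    for permissions in source_list[1:]:
        result = result & permissions
    return result


def can_use(sources: Iterable[FrozenSet[str]], intended_use: str) -> bool:
    """Check one intended use against the composed permissions."""
    return intended_use in composed_permissions(sources)


# Hypothetical licenses of two sources in a mashup.
news_license = frozenset({"personal use", "distribution"})
image_license = frozenset({"personal use"})
merged = composed_permissions([news_license, image_license])
```

Under this rule the mashup above may be used personally but not redistributed, because the image source does not grant distribution.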
31. Data source concerns
A DaaS may utilize data from many sources.
Similar DaaSs may utilize data from the same source
Data source properties
Name: e.g. ddfFlus or DataFlux
Size
Timespan: the duration of collected data
Update Frequency: how often the data is updated
etc.
32. Service context concerns
Location:
Selecting a DaaS in Amazon US Zone or European Zone?
Service Type: REST or SOAP?
Level of Service
Service Classification
Based on UNSPSC Code Classification Services
Data Classification
Service/data versioning
33. XML Diagram for the DaaS
capability specification
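The diagram itself is not reproduced in this transcript. As a rough illustration of what an XML-based specification of concerns could look like, the sketch below builds a small document with Python's standard `xml.etree.ElementTree` module. All element and attribute names (`DaaSConcerns`, `DataQuality`, `Metric`, `ServiceContext`, and so on) are invented for this example and are not taken from the actual schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def capability_spec(
    daas_name: str,
    qod_metrics: Dict[str, float],
    location: str,
    service_type: str,
) -> str:
    """Serialize a few concerns as XML; the vocabulary is hypothetical."""
    root = ET.Element("DaaSConcerns", {"service": daas_name})

    # Quality-of-data metrics, one element per metric, in a stable order.
    quality = ET.SubElement(root, "DataQuality")
    for metric, value in sorted(qod_metrics.items()):
        ET.SubElement(quality, "Metric", {"name": metric}).text = str(value)

    # Service context: where the service runs and how it is accessed.
    context = ET.SubElement(root, "ServiceContext")
    ET.SubElement(context, "Location").text = location
    ET.SubElement(context, "ServiceType").text = service_type

    return ET.tostring(root, encoding="unicode")


# Made-up values for illustration.
xml_text = capability_spec("example-daas", {"completeness": 0.74}, "EU", "REST")
parsed = ET.fromstring(xml_text)
```

A structured document of this kind is what makes concerns machine-readable, in contrast to the HTML descriptions discussed later in the lecture.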
35. From capability/context to
DaaS contract
Inputs: DaaS Capabilities, Context, Data Source; Consumer-specific concerns
Steps: Search properties of DaaSs, then define and negotiate contract terms, resulting in Contracts
A DaaS contract includes a set of generic, data-specific and
service-specific conditions established based on concerns
36. Recall -- stakeholders in data
provisioning
Data Provider
• People (individual/crowds/organization)
• Software, Things
Service Provider
• Software and people
Data Assessment
• Software and people
Data Consumer
• People, Software, Things
Data Aggregator/Integrator
• Software
• People + software
37. Populating DaaS concerns
The role of stakeholders in the most trivial view
Data Consumer, Data Aggregator/Integrator: specify, select, monitor, evaluate DaaS concerns
Data Provider, Service Provider: evaluate, specify, publish and manage DaaS concerns
Data Assessment: monitor and evaluate DaaS concerns
38. Support DaaS concerns selection
[Diagram components: Data Consumer; SECO2; DeXIN; Service Information Management Service; SEMF-based information, including concerns; External sources]
1. Muhammad Intizar Ali, Reinhard Pichler, Hong Linh Truong, Schahram Dustdar: Data Concern Aware Querying
for the Integration of Data Services. ICEIS (1) 2011: 111-119
2. Marco Comerio, Hong Linh Truong, Flavio De Paoli, Schahram Dustdar: Evaluating Contract Compatibility for
Service Composition in the SeCO2 Framework. ICSOC/ServiceWave 2009: 221-236
41. Implementation (3)
Michael Mrissa, Salah-Eddine Tbahriti, Hong Linh
Truong: Privacy Model and Annotation for
DaaS. ECOWS 2010: 3-10
Joint work with
http://infochimps.org/datasets/twitter-haiti-earthquake-data
42. Some Studies
We are not aware of any provider that publishes
DaaS concerns in a well-defined form
Mainly in HTML
Our study examines the descriptions of DaaSs
Enterprise computing
StrikeIron, Xignite, serviceobjects.NET, WebserviceX,
XWebServices, AERS, Amazon
E-science
GBIF (Global Biodiversity Information Facility), EBI
(European Bioinformatics Institute) Web Services,
EMBRACE Service Registry, and BioCatalogue
43. Concerns in HTML descriptions
29 services from 7 providers, most are SOAP-based
[Bar chart, axis 0 to 35, showing for each concern whether it is "Mentioned" or "Not mentioned/clear". Concerns shown: Completeness, Uptodate, Correctness, Cleaning, Standard output, Privacy, Logging, Reporting, Warning, Backup, Response Time, Availability, Network Latency, Packet Loss, Network Security, Price Model, Service Credit, Usage Permission, Copyright, Liability, Law Enforcement, Domain-specific IPR, Location, Service Type, Data Classification, Data Source Name, Data Source Size, Data Source Update Freq.]
Hong Linh Truong, Schahram Dustdar: On analyzing and specifying concerns for data as a service. APSCC 2009: 87-94
44. Concerns of DaaSs in E-science
From the DaaS description point of view
Service Registry | DQ | QoS | Business | Licensing: Ownership | Licensing: Usage permission
GBIF | No | No | No | unstructured | unstructured
EBI Web Services | No | No | No | No | No
EMBRACE Service Registry | No | No | No | No | No
BioCatalogue | No | No | unstructured | unstructured | unstructured
Hong Linh Truong, Schahram Dustdar: On analyzing and specifying concerns for data as a service. APSCC 2009: 87-94
45. Discussion time
WHAT MORE CAN WE DO
WITH INFORMATION ABOUT
DAAS CONCERNS?
46. Exercises
Read the papers mentioned
Visit the DaaSs mentioned in previous lectures
Analyze existing DaaS concerns
Examine how they specify and publish concerns
Investigate possible concerns when merging
data from different types of DaaS
Open government data and near-realtime data from
sensors
47. Thanks for
your attention
Hong-Linh Truong
Distributed Systems Group
Vienna University of Technology
truong@dsg.tuwien.ac.at
http://www.infosys.tuwien.ac.at/staff/truong