SlideShare uma empresa Scribd logo
1 de 26
User Centric Integration of Activity Data Mathieu d’Aquin, Stuart Brown, SalmanElahi, Enrico Motta The Open University
Agenda Introduction of the Team Objectives and Hypothesis Overview of technical realization Challenges Summary of results so far and dissemination
Team Dr Mathieu d’Aquin– Research fellow, KMi – project director Stuart Brown – Web developments and online communities, communication services – member of the steering group, liaison with online services SalmanElahi– Resarch assistant and PhD student, KMi – developer/researcher  Prof Enrico Motta – Professor of knowledge technologies, KMi – Chair of the steering group
Objectives and Hypothesis Hypothesis Taking a user centric point of view can allow different types of analysis of logs/activity data, which are valuable to the organisation and the user Ontologiesand Ontology-based reasoning can support the integration, consolidation and interpretation of activity data from multiple sources
Organisation Centric Activity Data  Analytics = aggregated stats Consolidation Consolidation Consolidation Logs 2 Logs 4 Logs 1 Logs 3 Website 2 Website 4 Website 1 Website 3 Organisation Users
At the Open University An analytics system building aggregated data from various university’s websites Based on a manually defined sitemaps Good for website optimization, marketing campaigns, etc. But the data being pre-aggregated, it is limited with respect to what it can do Limited control No user view
User Centric Activity Data Activity analysis for and by individual users Consolidation Integration Interpretation Ontologies Logs 2 Logs 4 Logs 1 Logs 3 Website 2 Website 4 Website 1 Website 3 Organisation Users
Ontologies Formal conceptual models of a domain Here, the domain is online user activity  At the basis of Semantic Web technologies Standard languages for expressing ontologies and ontological data (RDF, OWL) Tools to manipulate and work with ontologies and semantic data (NeOn Toolkit, OWLIM) Many ontologies to reuse (cf. Watson) Adhere to a logical formalism Enable inferences on the data
Objectives and Deliverables Build the technical infrastructure that can hold traces of activity data as semantic data Include triple store with reasoning capability, log parsers for different formats of logs, and renderers as semantic data (RDF) Build the ontologies to interpret and reason upon activity data Including various aspects of activity data in a way which is extensible  Tools to support users in analyzing their own activity data Recognize a user from the different settings and provide view on his/her own data  Allow him/her to customize the view, by customizing the ontology Test, validate, deploy, distribute
Technical infrastructure Semantic Triple Store Scheduler/Manager Daily RDF traces Daily RDF traces Parser/RDF renderer Parser/RDF renderer Daily RDF traces Daily RDF traces Daily RDF traces Log Log Parser/RDF renderer Parser/RDF renderer Parser/RDF renderer Application Log Log Log Application Server1 Server2 Server3
Technical infrastructure Development of parsers for different kinds a log formats  Currently handle Apache web server log files, parameterized from the Apache configuration Easily extensible for dedicated log formats Provide a common data structure serialized in RDF by the RDF renderer Each server produces a daily extract from the logs in RDF, which is being used to populate the semantic triple store The triple store includes multiple repositories and sub-spaces depending on time/user/server
Ontologies Key concepts to be represented: Actors (human users and robots) Sitemaps Traces (broad notion of logs) Activities Reusing existing ontologies FOAF: for people and documents Time Ontology: for traces Action ontology: for traces and activities (Planned) OPO: Online presence (Planner) SIOC: Online communities
Iterative and extensible construction of the ontologies Provide a base with actors, sitemaps and traces Specific extensions with typologies of activities, depending on user and site Dynamically building and integrating
Tool for analysis Need a tool which given A set of ontologies A data repository (which can be the overall one, the one restricted by time, and one for a given user) 	can provide a meaningful and interactive overview of the activity data To be used for  Provide an ontology-specific view of data analytics Support the iterative development of the ontologies Provide a user centric view of the data
Tools for analysis
Example In the ontology: /robot.txt is a RobotTXT page A Spider is an RobotAgent (ActorAgent) An agent used to access a RobotTXT is a Spider An AutomaticActivity is a Trace realized by a RobotAgent Result: Thousands of traces automatically classified as automatic activities.
Example In the ontology: UCIAD-Blog and LUCERO-Blog are Blogs (Website) A BlogPage is a page which is part of a Blog An activity onBlog is an activity happening on a Blog Page Result: Can look specifically at activities happening on a Blog and specialize them (same applies to Wikis, and other types of websites)
Example In the ontology: A SPARQLEndpoint is a specific type of Webpage AccessingSparqlEnpoint is an activity on a SPARQLEndpoint SPARLQQueryParameter is a parameter with the name “query” used in an AccessingSPARQLEndpoint activity ExecutingSPARQLQuery is an AccessingSPARQLQuery activity attached to a SPARQLQueryParameter Result: Can explore the specific activity of executing SPARQL queries and its parameters Can combine: Detect the activity of Automatically Accessing a SPARQL endpoint: and automatic activity and accessing a SPARQL endpoint.
Next step: User support Allow users  to log-in detect setting  bring up the relevant data  explore it But also,  to customize the view of the data to extend the ontologies to provide a personalized analysis of activity data to export (interpreted) activity data for reuse
User support User Logging or register Detect setting (agent+IP) unknown setting It is the first time you log into UCIAD with this setting (detail) do you want to attach it to your account? Check setting non-ambiguous non-ambiguous ambiguous known setting for user Add setting to known setting Register setting as ambiguous Display Activity Data related to all known settings of the user yes no
User support: data for a user For a user <u> the SPARQL query     Construct {?trace ?p ?y.  							?y ?q ?z} where 				{<u> actor:hasKnownSetting ?s.          ?trace trace:hasSetting ?s.          ?trace ?p ?y. ?trace ?q ?z} builds the traces of activities around the known setting of <u> Used to populate a specific repository with sub-spaces for each registered users
Deployment, test, validation At the moment, testing for websites of projects and events hosted on KMi servers: Sssw.org, sssw09.org, loted.eu, lucero-project.info, uciad.info, data.open.ac.uk, lucero.open.ac.uk, … Next level up, websites/systems from main open university website: www.open.ac.uk, study at the OU, podcasts.open.ac.uk, VLE Extend to deployment of instances for specific projects with distributed websites
Challenges Scalability OWLIM triple store can handle billions of triples But struggle with millions when inference is “on”  1 repository without inference with all historical data, 1 with inference with 1 week of data only, and 1 with inference for registered users User management and privacy Ensuring that the user who logs in from a particular setting is the one having the activity is difficult (e.g., in the case of shared computers) Is this really a problem? Check ambiguity – ask verification questions – moderate? Distribution and IPR Code and ontologies under open licenses (small uncertainty regarding code developed in other projects) Overall data: privacy issues (is k-anonymity actually applicable? Would it work?) Overall data: institutional issues (can we show the traffic on our websites to everybody) User data export: what license?
Summary and dissemination Promising initial results Can create new ways of analysis at run-time by editing the ontologies! Mechanisms to provide personal views on own activity data across websites First version of the ontologies: ongoing task First version of the tools: test and validate! Dissemination Blog / Twitter #uciad KMi’sinternal news letter (KMi Planet) Salman’s paper at the ESWC 2011 PhD symposium: “Personal Semantics: Personal information management in the Web with Semantic Technologies” Position paper at the W3C Web tracking and privacy workshop: “Self-Tracking on the Web: Why and How” Submission to the Personal Semantic Data workshop at K-CAP 2011
More info UCIAD Blog: http://uciad.info Code base: http://github.com/uciad Twitter: #uciad @mdaquin

Mais conteúdo relacionado

Mais procurados

IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia CommunitiesIEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia CommunitiesKalman Graffi
 
Introduction to Text Mining and Visualization with Interactive Web Application
Introduction to Text Mining and Visualization with Interactive Web ApplicationIntroduction to Text Mining and Visualization with Interactive Web Application
Introduction to Text Mining and Visualization with Interactive Web ApplicationOlga Scrivner
 
Annotating Digital Texts in the Brown University Library
Annotating Digital Texts in the Brown University LibraryAnnotating Digital Texts in the Brown University Library
Annotating Digital Texts in the Brown University LibraryTimothy Cole
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsDaniel S. Katz
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
Python tool to data analysis and artificial intelligence
Python tool to data analysis and artificial intelligencePython tool to data analysis and artificial intelligence
Python tool to data analysis and artificial intelligenceMd Aksam VK
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect WorldVital.AI
 
SageCite demonstrator overview
SageCite demonstrator overviewSageCite demonstrator overview
SageCite demonstrator overviewmonicaduke
 
Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11monicaduke
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Keiichiro Ono
 
One IOTA at a time: A Case Study of OpenURL Success Metrics
One IOTA at a time: A Case Study of OpenURL Success MetricsOne IOTA at a time: A Case Study of OpenURL Success Metrics
One IOTA at a time: A Case Study of OpenURL Success MetricsCharleston Conference
 
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)Keiichiro Ono
 
2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 TutorialAlexander Pico
 
NTCIR-12 MobileClick-2 Overview
NTCIR-12 MobileClick-2 OverviewNTCIR-12 MobileClick-2 Overview
NTCIR-12 MobileClick-2 Overviewkt.mako
 
Open Annotation Collaboration Briefing
Open Annotation Collaboration BriefingOpen Annotation Collaboration Briefing
Open Annotation Collaboration BriefingTimothy Cole
 
WoSC19: Serverless Workflows for Indexing Large Scientific Data
WoSC19: Serverless Workflows for Indexing Large Scientific DataWoSC19: Serverless Workflows for Indexing Large Scientific Data
WoSC19: Serverless Workflows for Indexing Large Scientific DataUniversity of Chicago
 
Using Implicit Preference Relations to Improve Content-based Recommendations,...
Using Implicit Preference Relations to Improve Content-based Recommendations,...Using Implicit Preference Relations to Improve Content-based Recommendations,...
Using Implicit Preference Relations to Improve Content-based Recommendations,...Ladislav Peska
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Bertram Ludäscher
 
IntAct and data distribution with PSICQUIC
IntAct and data distribution with PSICQUICIntAct and data distribution with PSICQUIC
IntAct and data distribution with PSICQUICRafael C. Jimenez
 

Mais procurados (19)

IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia CommunitiesIEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
 
Introduction to Text Mining and Visualization with Interactive Web Application
Introduction to Text Mining and Visualization with Interactive Web ApplicationIntroduction to Text Mining and Visualization with Interactive Web Application
Introduction to Text Mining and Visualization with Interactive Web Application
 
Annotating Digital Texts in the Brown University Library
Annotating Digital Texts in the Brown University LibraryAnnotating Digital Texts in the Brown University Library
Annotating Digital Texts in the Brown University Library
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research Objects
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
Python tool to data analysis and artificial intelligence
Python tool to data analysis and artificial intelligencePython tool to data analysis and artificial intelligence
Python tool to data analysis and artificial intelligence
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect World
 
SageCite demonstrator overview
SageCite demonstrator overviewSageCite demonstrator overview
SageCite demonstrator overview
 
Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
 
One IOTA at a time: A Case Study of OpenURL Success Metrics
One IOTA at a time: A Case Study of OpenURL Success MetricsOne IOTA at a time: A Case Study of OpenURL Success Metrics
One IOTA at a time: A Case Study of OpenURL Success Metrics
 
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
 
2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial
 
NTCIR-12 MobileClick-2 Overview
NTCIR-12 MobileClick-2 OverviewNTCIR-12 MobileClick-2 Overview
NTCIR-12 MobileClick-2 Overview
 
Open Annotation Collaboration Briefing
Open Annotation Collaboration BriefingOpen Annotation Collaboration Briefing
Open Annotation Collaboration Briefing
 
WoSC19: Serverless Workflows for Indexing Large Scientific Data
WoSC19: Serverless Workflows for Indexing Large Scientific DataWoSC19: Serverless Workflows for Indexing Large Scientific Data
WoSC19: Serverless Workflows for Indexing Large Scientific Data
 
Using Implicit Preference Relations to Improve Content-based Recommendations,...
Using Implicit Preference Relations to Improve Content-based Recommendations,...Using Implicit Preference Relations to Improve Content-based Recommendations,...
Using Implicit Preference Relations to Improve Content-based Recommendations,...
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
 
IntAct and data distribution with PSICQUIC
IntAct and data distribution with PSICQUICIntAct and data distribution with PSICQUIC
IntAct and data distribution with PSICQUIC
 

Semelhante a UCIAD overview

Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesASIS&T
 
Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...Amit Sheth
 
Web analytics webinar
Web analytics webinarWeb analytics webinar
Web analytics webinarJim Jansen
 
OSFair2017 Workshop | EGI applications database
OSFair2017 Workshop | EGI applications databaseOSFair2017 Workshop | EGI applications database
OSFair2017 Workshop | EGI applications databaseOpen Science Fair
 
Web analytics presentation
Web analytics presentationWeb analytics presentation
Web analytics presentationJim Jansen
 
Biocatalogue Talk Slides
Biocatalogue Talk SlidesBiocatalogue Talk Slides
Biocatalogue Talk SlidesBioCatalogue
 
Author's workflow and the role of open access
Author's workflow and the role of open accessAuthor's workflow and the role of open access
Author's workflow and the role of open accessPaola Gargiulo
 
CREW VRE Release 5 - 2009 May
CREW VRE Release 5 - 2009 MayCREW VRE Release 5 - 2009 May
CREW VRE Release 5 - 2009 MayMartin Turner
 
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...IEEEMEMTECHSTUDENTPROJECTS
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
Tracking user activity logs using Loggastic #ApiPlatformCon
Tracking user activity logs using Loggastic #ApiPlatformConTracking user activity logs using Loggastic #ApiPlatformCon
Tracking user activity logs using Loggastic #ApiPlatformConPaula Čučuk
 
Suricate
SuricateSuricate
Suricatebefreax
 
Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...
Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...
Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...eMadrid network
 
A Distributed Architecture for Sharing Ecological Data Sets with Access and U...
A Distributed Architecture for Sharing Ecological Data Sets with Access and U...A Distributed Architecture for Sharing Ecological Data Sets with Access and U...
A Distributed Architecture for Sharing Ecological Data Sets with Access and U...Javier González
 
A Look into the Apache OODT Ecosystem
A Look into the Apache OODT EcosystemA Look into the Apache OODT Ecosystem
A Look into the Apache OODT EcosystemChris Mattmann
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseSplunk
 

Semelhante a UCIAD overview (20)

UCIAD - quick overview
UCIAD - quick overviewUCIAD - quick overview
UCIAD - quick overview
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication Repositories
 
Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...
 
Web analytics webinar
Web analytics webinarWeb analytics webinar
Web analytics webinar
 
OSFair2017 Workshop | EGI applications database
OSFair2017 Workshop | EGI applications databaseOSFair2017 Workshop | EGI applications database
OSFair2017 Workshop | EGI applications database
 
Web analytics presentation
Web analytics presentationWeb analytics presentation
Web analytics presentation
 
Biocatalogue Talk Slides
Biocatalogue Talk SlidesBiocatalogue Talk Slides
Biocatalogue Talk Slides
 
Author's workflow and the role of open access
Author's workflow and the role of open accessAuthor's workflow and the role of open access
Author's workflow and the role of open access
 
CREW VRE Release 5 - 2009 May
CREW VRE Release 5 - 2009 MayCREW VRE Release 5 - 2009 May
CREW VRE Release 5 - 2009 May
 
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Tracking user activity logs using Loggastic #ApiPlatformCon
Tracking user activity logs using Loggastic #ApiPlatformConTracking user activity logs using Loggastic #ApiPlatformCon
Tracking user activity logs using Loggastic #ApiPlatformCon
 
Suricate
SuricateSuricate
Suricate
 
Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...
Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...
Seminario eMadrid sobre "Nuevas experiencias en laboratorios remotos". Estand...
 
A Distributed Architecture for Sharing Ecological Data Sets with Access and U...
A Distributed Architecture for Sharing Ecological Data Sets with Access and U...A Distributed Architecture for Sharing Ecological Data Sets with Access and U...
A Distributed Architecture for Sharing Ecological Data Sets with Access and U...
 
A Look into the Apache OODT Ecosystem
A Look into the Apache OODT EcosystemA Look into the Apache OODT Ecosystem
A Look into the Apache OODT Ecosystem
 
Executable papers
Executable papersExecutable papers
Executable papers
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk Enterprise
 
Archonnex at ICPSR
Archonnex at ICPSRArchonnex at ICPSR
Archonnex at ICPSR
 
Prototype Design of Open Access Institutional Repository
Prototype Design of Open Access Institutional RepositoryPrototype Design of Open Access Institutional Repository
Prototype Design of Open Access Institutional Repository
 

Mais de Mathieu d'Aquin

A factorial study of neural network learning from differences for regression
A factorial study of neural network learning from  differences for regressionA factorial study of neural network learning from  differences for regression
A factorial study of neural network learning from differences for regressionMathieu d'Aquin
 
Recentrer l'intelligence artificielle sur les connaissances
Recentrer l'intelligence artificielle sur les connaissancesRecentrer l'intelligence artificielle sur les connaissances
Recentrer l'intelligence artificielle sur les connaissancesMathieu d'Aquin
 
Data and Knowledge as Commodities
Data and Knowledge as CommoditiesData and Knowledge as Commodities
Data and Knowledge as CommoditiesMathieu d'Aquin
 
Unsupervised learning approach for identifying sub-genres in music scores
Unsupervised learning approach for identifying sub-genres in music scoresUnsupervised learning approach for identifying sub-genres in music scores
Unsupervised learning approach for identifying sub-genres in music scoresMathieu d'Aquin
 
Is knowledge engineering still relevant?
Is knowledge engineering still relevant?Is knowledge engineering still relevant?
Is knowledge engineering still relevant?Mathieu d'Aquin
 
A data view of the data science process
A data view of the data science processA data view of the data science process
A data view of the data science processMathieu d'Aquin
 
Dealing with Open Domain Data
Dealing with Open Domain DataDealing with Open Domain Data
Dealing with Open Domain DataMathieu d'Aquin
 
Web Analytics for Everyday Learning
Web Analytics for  Everyday LearningWeb Analytics for  Everyday Learning
Web Analytics for Everyday LearningMathieu d'Aquin
 
Presentation a in ovive montpellier - 26%2 f06%2f2018 (1)
Presentation a in ovive   montpellier - 26%2 f06%2f2018 (1)Presentation a in ovive   montpellier - 26%2 f06%2f2018 (1)
Presentation a in ovive montpellier - 26%2 f06%2f2018 (1)Mathieu d'Aquin
 
Learning Analytics: understand learning and support the learner
Learning Analytics: understand learning and support the learnerLearning Analytics: understand learning and support the learner
Learning Analytics: understand learning and support the learnerMathieu d'Aquin
 
Assessing the Readability of Policy Documents: The Case of Terms of Use of On...
Assessing the Readability of Policy Documents: The Case of Terms of Use of On...Assessing the Readability of Policy Documents: The Case of Terms of Use of On...
Assessing the Readability of Policy Documents: The Case of Terms of Use of On...Mathieu d'Aquin
 
Data for Learning and Learning with Data
Data for Learning and Learning with DataData for Learning and Learning with Data
Data for Learning and Learning with DataMathieu d'Aquin
 
Towards an “Ethics in Design” methodology for AI research projects
Towards an “Ethics in Design” methodology  for AI research projects Towards an “Ethics in Design” methodology  for AI research projects
Towards an “Ethics in Design” methodology for AI research projects Mathieu d'Aquin
 
AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...
AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...
AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...Mathieu d'Aquin
 
Profiling information sources and services for discovery
Profiling information sources and services for discoveryProfiling information sources and services for discovery
Profiling information sources and services for discoveryMathieu d'Aquin
 
Analyse de données et de réseaux sociaux pour l’aide à l’apprentissage infor...
Analyse de données et de réseaux sociaux pour  l’aide à l’apprentissage infor...Analyse de données et de réseaux sociaux pour  l’aide à l’apprentissage infor...
Analyse de données et de réseaux sociaux pour l’aide à l’apprentissage infor...Mathieu d'Aquin
 
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsFrom Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsMathieu d'Aquin
 
Data analytics beyond data processing and how it affects Industry 4.0
Data analytics beyond data processing and how it affects Industry 4.0Data analytics beyond data processing and how it affects Industry 4.0
Data analytics beyond data processing and how it affects Industry 4.0Mathieu d'Aquin
 

Mais de Mathieu d'Aquin (20)

A factorial study of neural network learning from differences for regression
A factorial study of neural network learning from  differences for regressionA factorial study of neural network learning from  differences for regression
A factorial study of neural network learning from differences for regression
 
Recentrer l'intelligence artificielle sur les connaissances
Recentrer l'intelligence artificielle sur les connaissancesRecentrer l'intelligence artificielle sur les connaissances
Recentrer l'intelligence artificielle sur les connaissances
 
Data and Knowledge as Commodities
Data and Knowledge as CommoditiesData and Knowledge as Commodities
Data and Knowledge as Commodities
 
Unsupervised learning approach for identifying sub-genres in music scores
Unsupervised learning approach for identifying sub-genres in music scoresUnsupervised learning approach for identifying sub-genres in music scores
Unsupervised learning approach for identifying sub-genres in music scores
 
Is knowledge engineering still relevant?
Is knowledge engineering still relevant?Is knowledge engineering still relevant?
Is knowledge engineering still relevant?
 
A data view of the data science process
A data view of the data science processA data view of the data science process
A data view of the data science process
 
Dealing with Open Domain Data
Dealing with Open Domain DataDealing with Open Domain Data
Dealing with Open Domain Data
 
Web Analytics for Everyday Learning
Web Analytics for  Everyday LearningWeb Analytics for  Everyday Learning
Web Analytics for Everyday Learning
 
Presentation a in ovive montpellier - 26%2 f06%2f2018 (1)
Presentation a in ovive   montpellier - 26%2 f06%2f2018 (1)Presentation a in ovive   montpellier - 26%2 f06%2f2018 (1)
Presentation a in ovive montpellier - 26%2 f06%2f2018 (1)
 
Learning Analytics: understand learning and support the learner
Learning Analytics: understand learning and support the learnerLearning Analytics: understand learning and support the learner
Learning Analytics: understand learning and support the learner
 
The AFEL Project
The AFEL ProjectThe AFEL Project
The AFEL Project
 
Assessing the Readability of Policy Documents: The Case of Terms of Use of On...
Assessing the Readability of Policy Documents: The Case of Terms of Use of On...Assessing the Readability of Policy Documents: The Case of Terms of Use of On...
Assessing the Readability of Policy Documents: The Case of Terms of Use of On...
 
Data ethics
Data ethicsData ethics
Data ethics
 
Data for Learning and Learning with Data
Data for Learning and Learning with DataData for Learning and Learning with Data
Data for Learning and Learning with Data
 
Towards an “Ethics in Design” methodology for AI research projects
Towards an “Ethics in Design” methodology  for AI research projects Towards an “Ethics in Design” methodology  for AI research projects
Towards an “Ethics in Design” methodology for AI research projects
 
AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...
AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...
AFEL: Towards Measuring Online Activities Contributions to Self-Directed Lear...
 
Profiling information sources and services for discovery
Profiling information sources and services for discoveryProfiling information sources and services for discovery
Profiling information sources and services for discovery
 
Analyse de données et de réseaux sociaux pour l’aide à l’apprentissage infor...
Analyse de données et de réseaux sociaux pour  l’aide à l’apprentissage infor...Analyse de données et de réseaux sociaux pour  l’aide à l’apprentissage infor...
Analyse de données et de réseaux sociaux pour l’aide à l’apprentissage infor...
 
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsFrom Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
 
Data analytics beyond data processing and how it affects Industry 4.0
Data analytics beyond data processing and how it affects Industry 4.0Data analytics beyond data processing and how it affects Industry 4.0
Data analytics beyond data processing and how it affects Industry 4.0
 

Último

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Último (20)

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

UCIAD overview

  • 1. User Centric Integration of Activity Data Mathieu d’Aquin, Stuart Brown, SalmanElahi, Enrico Motta The Open University
  • 2. Agenda Introduction of the Team Objectives and Hypothesis Overview of technical realization Challenges Summary of results so far and dissemination
  • 3. Team Dr Mathieu d’Aquin– Research fellow, KMi – project director Stuart Brown – Web developments and online communities, communication services – member of the steering group, liaison with online services SalmanElahi– Resarch assistant and PhD student, KMi – developer/researcher Prof Enrico Motta – Professor of knowledge technologies, KMi – Chair of the steering group
  • 4. Objectives and Hypothesis Hypothesis Taking a user centric point of view can allow different types of analysis of logs/activity data, which are valuable to the organisation and the user Ontologiesand Ontology-based reasoning can support the integration, consolidation and interpretation of activity data from multiple sources
  • 5. Organisation Centric Activity Data Analytics = aggregated stats Consolidation Consolidation Consolidation Logs 2 Logs 4 Logs 1 Logs 3 Website 2 Website 4 Website 1 Website 3 Organisation Users
  • 6. At the Open University An analytics system building aggregated data from various university’s websites Based on a manually defined sitemaps Good for website optimization, marketing campaigns, etc. But the data being pre-aggregated, it is limited with respect to what it can do Limited control No user view
  • 7. User Centric Activity Data Activity analysis for and by individual users Consolidation Integration Interpretation Ontologies Logs 2 Logs 4 Logs 1 Logs 3 Website 2 Website 4 Website 1 Website 3 Organisation Users
  • 8. Ontologies Formal conceptual models of a domain Here, the domain is online user activity At the basis of Semantic Web technologies Standard languages for expressing ontologies and ontological data (RDF, OWL) Tools to manipulate and work with ontologies and semantic data (NeOn Toolkit, OWLIM) Many ontologies to reuse (cf. Watson) Adhere to a logical formalism Enable inferences on the data
  • 9. Objectives and Deliverables Build the technical infrastructure that can hold traces of activity data as semantic data Include triple store with reasoning capability, log parsers for different formats of logs, and renderers as semantic data (RDF) Build the ontologies to interpret and reason upon activity data Including various aspects of activity data in a way which is extensible Tools to support users in analyzing their own activity data Recognize a user from the different settings and provide view on his/her own data Allow him/her to customize the view, by customizing the ontology Test, validate, deploy, distribute
  • 10. Technical infrastructure Semantic Triple Store Scheduler/Manager Daily RDF traces Daily RDF traces Parser/RDF renderer Parser/RDF renderer Daily RDF traces Daily RDF traces Daily RDF traces Log Log Parser/RDF renderer Parser/RDF renderer Parser/RDF renderer Application Log Log Log Application Server1 Server2 Server3
  • 11. Technical infrastructure Development of parsers for different kinds a log formats Currently handle Apache web server log files, parameterized from the Apache configuration Easily extensible for dedicated log formats Provide a common data structure serialized in RDF by the RDF renderer Each server produces a daily extract from the logs in RDF, which is being used to populate the semantic triple store The triple store includes multiple repositories and sub-spaces depending on time/user/server
  • 12. Ontologies Key concepts to be represented: Actors (human users and robots) Sitemaps Traces (broad notion of logs) Activities Reusing existing ontologies FOAF: for people and documents Time Ontology: for traces Action ontology: for traces and activities (Planned) OPO: Online presence (Planner) SIOC: Online communities
  • 13.
  • 14. Iterative and extensible construction of the ontologies Provide a base with actors, sitemaps and traces Specific extensions with typologies of activities, depending on user and site Dynamically building and integrating
  • 15. Tool for analysis Need a tool which given A set of ontologies A data repository (which can be the overall one, the one restricted by time, and one for a given user) can provide a meaningful and interactive overview of the activity data To be used for Provide an ontology-specific view of data analytics Support the iterative development of the ontologies Provide a user centric view of the data
  • 17. Example In the ontology: /robot.txt is a RobotTXT page A Spider is an RobotAgent (ActorAgent) An agent used to access a RobotTXT is a Spider An AutomaticActivity is a Trace realized by a RobotAgent Result: Thousands of traces automatically classified as automatic activities.
  • 18. Example In the ontology: UCIAD-Blog and LUCERO-Blog are Blogs (Website) A BlogPage is a page which is part of a Blog An activity onBlog is an activity happening on a Blog Page Result: Can look specifically at activities happening on a Blog and specialize them (same applies to Wikis, and other types of websites)
  • 19. Example In the ontology: A SPARQLEndpoint is a specific type of Webpage AccessingSparqlEnpoint is an activity on a SPARQLEndpoint SPARLQQueryParameter is a parameter with the name “query” used in an AccessingSPARQLEndpoint activity ExecutingSPARQLQuery is an AccessingSPARQLQuery activity attached to a SPARQLQueryParameter Result: Can explore the specific activity of executing SPARQL queries and its parameters Can combine: Detect the activity of Automatically Accessing a SPARQL endpoint: and automatic activity and accessing a SPARQL endpoint.
  • 20. Next step: User support Allow users to log-in detect setting bring up the relevant data explore it But also, to customize the view of the data to extend the ontologies to provide a personalized analysis of activity data to export (interpreted) activity data for reuse
  • 21. User support User Logging or register Detect setting (agent+IP) unknown setting It is the first time you log into UCIAD with this setting (detail) do you want to attach it to your account? Check setting non-ambiguous non-ambiguous ambiguous known setting for user Add setting to known setting Register setting as ambiguous Display Activity Data related to all known settings of the user yes no
  • 22. User support: data for a user For a user <u> the SPARQL query Construct {?trace ?p ?y. ?y ?q ?z} where {<u> actor:hasKnownSetting ?s. ?trace trace:hasSetting ?s. ?trace ?p ?y. ?trace ?q ?z} builds the traces of activities around the known setting of <u> Used to populate a specific repository with sub-spaces for each registered users
  • 23. Deployment, test, validation At the moment, testing for websites of projects and events hosted on KMi servers: Sssw.org, sssw09.org, loted.eu, lucero-project.info, uciad.info, data.open.ac.uk, lucero.open.ac.uk, … Next level up, websites/systems from main open university website: www.open.ac.uk, study at the OU, podcasts.open.ac.uk, VLE Extend to deployment of instances for specific projects with distributed websites
  • 24. Challenges Scalability OWLIM triple store can handle billions of triples But struggle with millions when inference is “on”  1 repository without inference with all historical data, 1 with inference with 1 week of data only, and 1 with inference for registered users User management and privacy Ensuring that the user who logs in from a particular setting is the one having the activity is difficult (e.g., in the case of shared computers) Is this really a problem? Check ambiguity – ask verification questions – moderate? Distribution and IPR Code and ontologies under open licenses (small uncertainty regarding code developed in other projects) Overall data: privacy issues (is k-anonymity actually applicable? Would it work?) Overall data: institutional issues (can we show the traffic on our websites to everybody) User data export: what license?
  • 25. Summary and dissemination Promising initial results Can create new ways of analysis at run-time by editing the ontologies! Mechanisms to provide personal views on own activity data across websites First version of the ontologies: ongoing task First version of the tools: test and validate! Dissemination Blog / Twitter #uciad KMi’sinternal news letter (KMi Planet) Salman’s paper at the ESWC 2011 PhD symposium: “Personal Semantics: Personal information management in the Web with Semantic Technologies” Position paper at the W3C Web tracking and privacy workshop: “Self-Tracking on the Web: Why and How” Submission to the Personal Semantic Data workshop at K-CAP 2011
  • 26. More info UCIAD Blog: http://uciad.info Code base: http://github.com/uciad Twitter: #uciad @mdaquin