INFORMATION MANAGEMENT



                    Semantic Document Architecture for Desktop
                    Data Integration And Management


                      November 30, 2010


                      Saša Nešić
                      PhD Dissertation Defense
Motivation



   Semantic Web



  Semantic Desktop

                       Ontologies
                       Resource Description Framework (RDF)
                       SPARQL query language
 Semantic Documents




Semantic Documents


Semantic documents are composite information resources composed of data/information
units that are:


    uniquely identified by globally unique URIs,
    semantically annotated with concepts from domain ontologies,
    interlinked with other data/information units via explicit semantic links.
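
As a rough illustration of these three properties, a document unit can be modelled as a set of subject-predicate-object triples. The sketch below uses plain Python tuples rather than an RDF library; the `http://example.org/...` namespace and the property names `annotatedBy` and `relatedTo` are hypothetical placeholders, not the vocabulary defined by SDM.

```python
import uuid

BASE = "http://example.org/sdm/"  # hypothetical namespace, not from the thesis


def new_unit_uri() -> str:
    """Mint a globally unique URI for a document unit."""
    return f"{BASE}unit/{uuid.uuid4()}"


def annotate(triples: set, unit: str, concept: str) -> None:
    """Record that a unit is annotated by a domain-ontology concept."""
    triples.add((unit, BASE + "annotatedBy", concept))


def link(triples: set, source: str, target: str) -> None:
    """Record an explicit semantic link between two units."""
    triples.add((source, BASE + "relatedTo", target))


triples = set()
slide = new_unit_uri()
paragraph = new_unit_uri()
annotate(triples, slide, "http://example.org/onto#DesignPattern")
link(triples, slide, paragraph)
```

Each unit gets its own URI, so annotations and links can refer to a slide or paragraph independently of the file that contains it.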




Thesis Statement


  “Semantic documents integrate desktop data into a unified desktop information
space and enable desktop data to be integrated into a unified information space of
                             social communities”




     Improving the Effectiveness and Efficiency of Desktop Users




Outline

  Motivation

  Semantic Document Model - SDM

  Semantic Document Architecture - SDArch

  Prototype

  Thesis Validation

  Conclusions
Semantic Document Model
Parts shown in the figure: Semantic-Linking Part, Change-Tracking Part, Annotation Part, Core Part

Core Part
- document unit types
- structural relationships
- identification
- binary content linking

Annotation Part
- annotation types
- annotation interface

Semantic-Linking Part
- semantic linking interface

Change-Tracking Part
- types of doc. unit changes
- change-tracking interface




                                                                                Annals of Information Systems ’09
Machine-Processable and Human-Readable instances of SDM


 MP document representation
   Unique and permanent instance
   HTTP de-referencable URIs
   RDF data format


 HR document representation
   Temporal document instances
   Rendered from the MP instance
   Existing document formats




Semantic Document Architecture - SDArch




                            Annals of Information Systems ’09
Semantic Document Authoring, Search, and Navigation


     Concept Exploration Algorithm

 Objective:
  - conceptualization of DU semantics
 Input:
  - document unit
  - domain ontology(ies)
 Output:
  - concept vector
  - concept weight vector
 Features:
  - lexical expansion of concept labels
  - syntactic concept matching
  - semantic concept matching
  - measuring concept relevance


     Search Algorithm

 Objective:
  - search for semantic document units (DUs)
 Input:
  - a free-text keyword query
 Output:
  - a ranked list of semantic DUs
 Features:
  - forming semantic query
  - executing semantic query against CI
  - measuring similarity between [...] and [...]


     Search Personalization Algorithm

 Objective:
  - personalization of semantic doc. search
 Input:
  - list of retrieved semantic DUs
  - list of user preferences
 Output:
  - re-ranked list of semantic DUs
 Features:
  - extracting SCA for each DU
  - weighting schema for each user preference
  - ranking DUs based on calculated weights
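
To make the concept exploration idea more concrete, here is a minimal Python sketch of lexical expansion followed by syntactic matching, producing a concept weight vector for one document unit. It is a simplified stand-in, not the algorithm from the thesis: the two-concept ontology, the plural-only expansion rule, and the count-based weights are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical mini-ontology: concept -> known labels (illustrative only).
ONTOLOGY = {
    "Observer": ["observer", "publish-subscribe"],
    "Singleton": ["singleton"],
}


def expand_labels(labels):
    """Naive lexical expansion: add a plural form for each label."""
    expanded = set(labels)
    expanded.update(label + "s" for label in labels)
    return expanded


def concept_weights(text, ontology):
    """Count label occurrences per concept and normalize them to weights."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    tokens = Counter(cleaned.split())
    hits = {}
    for concept, labels in ontology.items():
        count = sum(tokens[label] for label in expand_labels(labels))
        if count:
            hits[concept] = count
    total = sum(hits.values())
    return {concept: count / total for concept, count in hits.items()}


unit_text = (
    "Observers register with a subject. "
    "The observer is notified, like a singleton registry."
)
weights = concept_weights(unit_text, ONTOLOGY)
```

Here "Observers" only matches because of the expansion step, which is the kind of effect the lexically expanded matching is meant to capture.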



                                               Semantic document authoring service
                                               Semantic document search and navigation service


                                                                                SEKE ’10
Semantic Document Sharing


 SDArch social network
 Publishing only RDFs
 Capturing social-context annotations
 Contributing to:
    Linked Open Data Cloud
    Web of Linked Data
    Semantic Web




                                         ESEC/FSE – SoSEA ’09
SDArch Prototype

Objectives:
  Validation of SDArch and SDM
  Enabling experimental evaluation
  Enabling usability evaluation

Source Code Organization:
  Number of services          5
  Number of .NET assemblies   15
  Number of .NET namespaces   14

Implementation:
 Semantic Document Repository
      Sesame 2 RDF repository
      SemWeb C# Library
      MySQL DB-backed persistent RDF storage
      SPARQL query support
      Full-Text query support (Lucene)
 Services
    WCF Framework

 Tools
    MS Office Add-Ins

SemanticDoc - MS Office Add-Ins




                                  ICWE ’08
Thesis Validation



Q1: How do semantic documents improve information finding and retrieval in
semantically integrated document collections?

     1. Experimental evaluation of Information Retrieval in Semantic Documents


Q2: How do semantic documents facilitate desktop users in completing tasks that
draw data from both a personal desktop and social communities?

     2. Usability evaluation of SDArch Services and Tools




Experimental Evaluation of Information Retrieval in
 Semantic Documents

 Objectives:

     Measuring effectiveness of the semantic document search
     Measuring effectiveness of the semantic document annotation (indexing)



 Compared approaches:

       Concept-Based Indexing and Search – Simple Syntactic Matching
       Concept-Based Indexing and Search – Lexically Expanded Syntactic Matching
       Full Text Indexing and Search (Lucene)
       Semantic Document Indexing and Search




                                                                               SEMAPRO ’10
Test Collections

      Mammals of the World                    Metals and Alloys

 MAMO Ontology                      Metals Ontology
   OWL + SKOS                         OWL + SKOS
   Finnish National Museum            Key-To-Metals, Zurich
   ~ 5000 domain concepts             ~ 1800 domain concepts
 Document Set                       Document Set
   Wikipedia – List of Mammals        Key-To-Metals records
   150 articles                       240 Word documents
   2130 semantic document units       3312 semantic document units
 Query Set                          Query Set
    5 queries related to Mammals       5 queries related to Metals and Alloys




Measuring Effectiveness of the Semantic Document Annotation (Indexing)


Test collection 1: Mammals of the World

Approach                                     # of syntactic   # of semantic   Weight of syntactic   Weight of semantic
                                             matches          matches         matches               matches
CB – simple syntactic matching               1524             -               2.56                  -
CB – lexically expanded syntactic matching   3182             -               3.62                  -
Semantic document indexing and annotation    3182             2437            3.62                  2.96


Test collection 2: Metals and Alloys

Approach                                     # of syntactic   # of semantic   Weight of syntactic   Weight of semantic
                                             matches          matches         matches               matches
CB – simple syntactic matching               2153             -               1.73                  -
CB – lexically expanded syntactic matching   2879             -               2.43                  -
Semantic document indexing and annotation    2879             1024            2.43                  2.14




Measuring Effectiveness of the Semantic Document Search


Test collection: Mammals of the World   Test collection: Metals and alloys




Usability Evaluation

Evaluation Hypothesis :

    “Using SDArch results in a more effective, efficient, and satisfactory user
 experience when authoring, exploring (i.e., searching and navigating) and utilizing
                     documents in carrying out daily tasks.”



Usability evaluation criteria :


      User Effectiveness
      User Efficiency
      User Satisfaction




                                                                           ICALT ’10
Case Study: Authoring of Course Material


 Participants – SDArch Social Network
       University of Lugano, Switzerland – 7 participants
       Simon Fraser University, Canada – 7 participants
       Athabasca University, Canada – 2 participants
       University of Belgrade, Serbia – 2 participants



 Document Collection
     “Software Design Patterns” – 70 PowerPoint and Word documents


 Evaluation Session
     Task-Based Usability Test
     Follow-up questionnaires




Usability Test Use Cases


i. Setting Up the User Profile and the Social Network Properties



ii. Authoring and Publishing Semantic Documents




iii. Searching and Navigating across Semantic Documents

                                        Task   Task objective               Slide
                                        1      Design patterns definition   1
                                        2      Example 1 - definition       2
                                        3      Example 1 - illustration     2
                                        4      Example 2 - definition       3
                                        5      Example 2 - illustration     3


Evaluation Methods and Metrics



Evaluation Criteria   Evaluation Method                  Evaluation Metric

1. Effectiveness      Objective - Quantitative Measure   Task Success Rates

2. Efficiency         Objective - Quantitative Measure   Task Completion Times
                      Objective - Quantitative Measure   Number of Mouse Clicks
                      Objective - Quantitative Measure   Number of Window Switches

3. Satisfaction       Subjective - Questionnaire         5-level Likert scale




1. User Effectiveness

 metric: Task success rate




    Task      Conventional System               SDArch System
              Successful Completions   %        Successful Completions   %
     1        18                       100      18                       100
     2        17                       94.44    18                       100
     3        15                       83.33    17                       94.44
     4        17                       94.44    18                       100
     5        14                       77.78    16                       88.89
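
The percentages in this table follow from dividing the successful completions by the 18 participants of the case study. A small Python sketch of that arithmetic, using the completion counts from the table (the function and variable names are illustrative):

```python
PARTICIPANTS = 18  # 7 + 7 + 2 + 2 participants in the case study

# Successful completions per task: (conventional system, SDArch system)
completions = {1: (18, 18), 2: (17, 18), 3: (15, 17), 4: (17, 18), 5: (14, 16)}


def success_rate(successes: int, total: int = PARTICIPANTS) -> float:
    """Task success rate as a percentage, rounded to two decimals."""
    return round(100 * successes / total, 2)


rates = {
    task: (success_rate(conventional), success_rate(sdarch))
    for task, (conventional, sdarch) in completions.items()
}
```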




2. User Efficiency


metric: Task execution time   metric: Number of mouse clicks   metric: Number of window switches


T-test results (three p-values per task):

Task   p-values
1      1.6*10^-12    0.00071      0.00004
2      1.22*10^-7    0.00011      0.0041
3      6.91*10^-8    9.17*10^-6   0.00016
4      3.67*10^-7    0.00034      0.00009
5      4.82*10^-10   2.6*10^-6    0.00004

If p < 0.05, the results are statistically significant.
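
The slides do not say which t-test variant was used. As one possible reading, the Python sketch below computes a paired-samples t statistic, assuming each user completed the same task on both systems. The timing values are made up for illustration and are not data from the study; instead of a p-value, the sketch compares the statistic with the standard two-sided critical value for 5 degrees of freedom at the 0.05 level.

```python
import math
import statistics

# Hypothetical task completion times in seconds for the same six users
# (illustrative values only, not data from the study).
conventional = [120.0, 135.0, 150.0, 110.0, 140.0, 125.0]
sdarch = [80.0, 95.0, 100.0, 85.0, 90.0, 88.0]


def paired_t(before, after):
    """Return the paired-samples t statistic and its degrees of freedom."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


t_stat, df = paired_t(conventional, sdarch)

# Two-sided critical value of Student's t for df = 5 at the 0.05 level.
T_CRITICAL_DF5 = 2.571
significant = abs(t_stat) > T_CRITICAL_DF5
```

With 18 participants the degrees of freedom would be 17 and the critical value correspondingly smaller.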




3. User Satisfaction


      metric: 5-level Likert Scale (from "Strongly disagree" to "Strongly agree")

Internal consistency (reliability) test:

   Dimension              Cronbach’s α
   Usefulness             0.85
   Ease-of-Use            0.78
   Ease-of-Learning       0.92
   Overall Satisfaction   0.83

Recommended α values > 0.75
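
For reference, Cronbach's α for a dimension with k questionnaire items is k/(k-1) * (1 - sum of item variances / variance of the respondents' total scores). The Python sketch below applies that formula to a small made-up response matrix; the responses are hypothetical and do not come from the study's questionnaires.

```python
import statistics

# Hypothetical Likert responses (1-5): one row per respondent, one column per
# questionnaire item of a single dimension. Not data from the study.
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]


def cronbach_alpha(rows):
    """Cronbach's alpha from sample variances of items and of total scores."""
    k = len(rows[0])
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


alpha = cronbach_alpha(responses)
```

For this toy matrix α comes out at roughly 0.92, which would pass the 0.75 threshold quoted on the slide.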




Conclusions


 Main contributions

      Introducing the Semantic Document Model – SDM
      Designing the Semantic Document Architecture – SDArch
      Providing the SDArch Prototype Implementation
      Experimental and Usability evaluations


 Future directions:

    Document units versioning
    Document units privacy and security
    Decentralized storage of shared semantic documents




Publications

Journals:

S. Nešić, "Semantic Document Model to Enhance Data and Knowledge Interoperability," Annals of Information Systems - Special Issue on Semantic Web & Web 2.0, Springer US, pp. 135–160, 2009.

Conferences:

S. Nešić, F. Crestani, D. Gašević, M. Jazayeri, "Search and Navigation in Semantically Integrated Document Collections," 4th International Conference on Advances in Semantic Processing - SEMAPRO, pp. 123–129, Firenze, Italy, 2010.

S. Nešić, D. Gašević, M. Jazayeri, "Semantic Document Architecture for Desktop Data Integration and Management," The 22nd International Conference on Software Engineering and Knowledge Engineering - SEKE, pp. 73–78, San Francisco, USA, 2010.

S. Nešić, D. Gašević, M. Jazayeri, M. Landoni, "Using Semantic Documents and Social Networking in Authoring Course Material: An Empirical Study," 10th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 666–670, Sousse, Tunisia, 2010. (Best paper award)

S. Nešić, F. Crestani, D. Gašević, M. Jazayeri, "Concept-Based Semantic Annotation, Indexing and Retrieval of Office-Like Document Units," 9th RIAO Conference, pp. 234–237, Paris, France, 2010.

S. Nešić, D. Gašević, M. Jazayeri, "Extending MS Office for Sharing Document Content Units over the Semantic Web," 8th International Conference on Web Engineering - ICWE, pp. 350–353, Yorktown Heights, New York, USA, 2008.

S. Nešić, D. Gašević, M. Jazayeri, "Semantic Document Management for Collaborative Learning Object Authoring," 8th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 751–755, Santander, Spain, 2008.

S. Nešić, D. Gašević, M. Jazayeri, "An Ontology-Based Framework for Author-Learning Content Interaction," 6th International Conference on Web-based Education - WBE, Chamonix, France, 2007.

S. Nešić, D. Gašević, M. Jazayeri, "An Ontology-Based Framework for Authoring Assisted by Recommendation," 7th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 227–231, Niigata, Japan, 2007.

S. Nešić, J. Jovanović, D. Gašević, M. Jazayeri, "Ontology-Based Content Model for Scalable Content Reuse," 4th ACM SIGART International Conference on Knowledge Capture - K-CAP, pp. 195–198, Whistler, Canada, 2007.

Workshops:

S. Nešić, M. Jazayeri, F. Lelli, S. Nešić, "Towards Efficient Document Content Sharing in Social Networks," 2nd Workshop on Social Software Engineering and Applications, co-located with ESEC/FSE, pp. 1–8, Amsterdam, Netherlands, 2009.


1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 

Sasa Nesic - PhD Dissertation Defense

  • 1. INFORMATION MANAGEMENT
      Semantic Document Architecture for Desktop Data Integration and Management
      November 30, 2010
      Saša Nešić, PhD Dissertation Defense
  • 2. Motivation
      Semantic Web, Semantic Desktop, Semantic Documents
      - Ontologies
      - Resource Description Framework (RDF)
      - SPARQL query language
  • 3. Semantic Documents
      Semantic documents are composite information resources composed of data/information units that are:
      - uniquely identified by globally unique URIs,
      - semantically annotated by concepts from domain ontologies,
      - interlinked with other data/information units via explicit semantic links.
  • 4. Thesis Statement
      “Semantic documents integrate desktop data into a unified desktop information space and enable desktop data to be integrated into a unified information space of social communities”
      Improving the Effectiveness and Efficiency of Desktop Users
  • 5. Outline: Motivation; Semantic Document Model - SDM; Semantic Document Architecture - SDArch; Prototype; Thesis Validation; Conclusions
  • 6. Semantic Document Model (Annals of Information Systems ’09)
      - Core Part: document unit types; structural relationships; identification; binary content linking
      - Annotation Part: annotation types; annotation interface
      - Semantic-Linking Part: semantic linking interface
      - Change-Tracking Part: types of doc. unit changes; change-tracking interface
  • 7. Machine-Processable and Human-Readable Instances of SDM
      - MP (machine-processable) document representation
          - Unique and permanent instance
          - HTTP dereferenceable URIs
          - RDF data format
      - HR (human-readable) document representation
          - Temporal document instances
          - Rendered from the MP instance
          - Existing document formats
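To make the machine-processable representation more concrete, here is a minimal sketch of how one document unit might be serialized as RDF in N-Triples syntax. The `http://example.org/sdm/` namespace and the property names (`isPartOf`, `annotatedBy`, `textContent`) are hypothetical placeholders, not the vocabulary defined by SDM, and the UUID-based URIs only illustrate the idea of unique, permanent identifiers.

```python
import uuid

# Hypothetical namespace; the real SDM vocabulary is not given in the slides.
BASE = "http://example.org/sdm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def mint_uri(kind: str) -> str:
    """Mint a globally unique URI for a new resource of the given kind."""
    return f"{BASE}{kind}/{uuid.uuid4()}"


def triple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Format one N-Triples statement; literals are quoted and escaped."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = '"' + escaped + '"'
    else:
        rendered = f"<{obj}>"
    return f"<{subject}> <{predicate}> {rendered} ."


def describe_unit(doc_uri: str, text: str, concept_uri: str):
    """Return the URI of a new document unit and the triples describing it."""
    unit = mint_uri("unit")
    return unit, [
        triple(unit, RDF_TYPE, BASE + "DocumentUnit"),
        triple(unit, BASE + "isPartOf", doc_uri),          # structural relationship
        triple(unit, BASE + "annotatedBy", concept_uri),   # semantic annotation
        triple(unit, BASE + "textContent", text, literal=True),
    ]


if __name__ == "__main__":
    _, lines = describe_unit(
        mint_uri("doc"),
        "Design patterns definition",
        "http://example.org/onto#DesignPattern",  # hypothetical ontology concept
    )
    print("\n".join(lines))
```

A human-readable instance would then be rendered from such triples into an existing document format, while the RDF remains the single permanent copy.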
  • 8. Outline: Motivation; Semantic Document Model - SDM; Semantic Document Architecture - SDArch; Prototype; Thesis Validation; Conclusions
  • 9. Semantic Document Architecture - SDArch (Annals of Information Systems ’09)
  • 10. Semantic Document Authoring, Search, and Navigation (SEKE ’10)
      Concept Exploration Algorithm (semantic document authoring service)
      - Objective: conceptualization of DU semantics
      - Input: document unit; domain ontology(ies)
      - Output: concept vector; concept weight vector
      - Features: lexical expansion of concept labels; syntactic concept matching; semantic concept matching; measuring concept relevance
      Search Algorithm (semantic document search and navigation service)
      - Objective: search for semantic document units (DUs)
      - Input: a free-text keyword query
      - Output: a ranked list of semantic DUs
      - Features: forming semantic query; executing SCA for each DU against CI; measuring similarity between … and …
      Search Personalization Algorithm (semantic document search and navigation service)
      - Objective: personalization of semantic doc. search
      - Input: list of retrieved semantic DUs; list of user preferences
      - Output: re-ranked list of semantic DUs
      - Features: extracting semantic query; weighting schema for each user preference; ranking DUs based on calculated weights
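The personalization step above takes a retrieved list of DUs plus user preferences and returns a re-ranked list. The sketch below shows one way such a re-ranking could work. The additive scoring rule, the feature names, and the weights are all assumptions made for illustration; the slides do not specify the actual weighting schema.

```python
from typing import Dict, List, Tuple

# (DU identifier, base relevance score from search, preference-related features)
SearchResult = Tuple[str, float, Dict[str, float]]


def rerank(results: List[SearchResult],
           preference_weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Re-rank DUs by adding a weighted preference boost to the base score.

    Assumption: preferences combine additively; unknown features count as 0.
    """
    rescored = []
    for du_id, base_score, features in results:
        boost = sum(preference_weights.get(name, 0.0) * value
                    for name, value in features.items())
        rescored.append((du_id, base_score + boost))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    retrieved = [
        ("du1", 0.9, {"from_trusted_author": 0.0}),  # hypothetical feature
        ("du2", 0.7, {"from_trusted_author": 1.0}),
    ]
    for du_id, score in rerank(retrieved, {"from_trusted_author": 0.5}):
        print(du_id, round(score, 2))
```

In this toy input, the preference for trusted authors lifts `du2` above `du1` even though its base relevance is lower.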
  • 11. Semantic Document Sharing (ESEC/FSE - SoSEA ’09)
      - SDArch social network
      - Publishing only RDFs
      - Capturing social-context annotations
      - Contributing to:
          - Linked Open Data Cloud
          - Web of Linked Data
          - Semantic Web
  • 12. Outline: Motivation; Semantic Document Model - SDM; Semantic Document Architecture - SDArch; Prototype; Thesis Validation; Conclusions
  • 13. SDArch Prototype
      Objectives:
      - Validation of SDArch and SDM
      - Enabling experimental evaluation
      - Enabling usability evaluation
      Source Code Organization:
      - Number of services: 5
      - Number of .NET assemblies: 15
      - Number of .NET namespaces: 14
      Implementation:
      - Semantic Document Repository
          - Sesame 2 RDF repository
          - SemWeb C# Library
          - MySQL DB-backed persistent RDF storage
          - SPARQL query support
          - Full-Text query support (Lucene)
      - Services
          - WCF Framework
      - Tools
          - MS Office Add-Ins
  • 14. SemanticDoc - MS Office Add-Ins (ICWE ’08)
  • 15. Outline: Motivation; Semantic Document Model - SDM; Semantic Document Architecture - SDArch; Prototype; Thesis Validation; Conclusions
  • 16. Thesis Validation
      Q1: How do semantic documents improve information finding and retrieval in semantically integrated document collections?
          1. Experimental evaluation of Information Retrieval in Semantic Documents
      Q2: How do semantic documents facilitate desktop users in completing tasks that draw data from both a personal desktop and social communities?
          2. Usability evaluation of SDArch Services and Tools
  • 17. Experimental Evaluation of Information Retrieval in Semantic Documents (SEMAPRO ’10)
      - Objectives:
          - Measuring effectiveness of the semantic document search
          - Measuring effectiveness of the semantic document annotation (indexing)
      - Compared approaches:
          - Concept-Based Indexing and Search - Simple Syntactic Matching
          - Concept-Based Indexing and Search - Lexically Expanded Syntactic Matching
          - Full Text Indexing and Search (Lucene)
          - Semantic Document Indexing and Search
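The difference between the first two compared approaches can be illustrated with a small sketch. It treats a concept as matched when every word of its label occurs in the unit text, and models lexical expansion as a lookup of alternative labels. Both the token-subset rule and the synonym table are simplifying assumptions for illustration; the slides do not state which matching rule or lexical resource was actually used.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphabetic word set of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def simple_matches(unit_text: str, concept_labels: Dict[str, str]) -> Set[str]:
    """Simple syntactic matching: all words of the concept label occur in the text."""
    words = tokens(unit_text)
    return {concept for concept, label in concept_labels.items()
            if tokens(label) <= words}


def expanded_matches(unit_text: str, concept_labels: Dict[str, str],
                     synonyms: Dict[str, List[str]]) -> Set[str]:
    """Lexically expanded matching: the label or any alternative label matches."""
    words = tokens(unit_text)
    hits = set()
    for concept, label in concept_labels.items():
        variants = [label] + synonyms.get(label, [])
        if any(tokens(variant) <= words for variant in variants):
            hits.add(concept)
    return hits


if __name__ == "__main__":
    labels = {"ex:Cat": "cat", "ex:Dog": "dog"}           # hypothetical concepts
    alt = {"cat": ["feline"], "dog": ["hound", "canine"]}  # hypothetical synonyms
    text = "The feline chased a dog."
    print(sorted(simple_matches(text, labels)))
    print(sorted(expanded_matches(text, labels, alt)))
```

Expansion can only add matches, never remove them, which is the same direction of change the approach comparison is designed to measure.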
  • 18. Test Collections
      Mammals of the World
      - MAMO Ontology: OWL + SKOS; Finnish National Museum; ~5000 domain concepts
      - Document Set: Wikipedia - List of Mammals; 150 articles; 2130 semantic document units
      - Query Set: 5 queries related to Mammals
      Metals and Alloys
      - Metals Ontology: OWL + SKOS; Key-To-Metals, Zurich; ~1800 domain concepts
      - Document Set: Key-To-Metals records; 240 Word documents; 3312 semantic document units
      - Query Set: 5 queries related to Metals and Alloys
  • 19. Measuring Effectiveness of the Semantic Document Annotation (Indexing)
      Test collection 1: Mammals of the World
      Approach                                      # of syn.   # of sem.   weight of syn.   weight of sem.
                                                    matches     matches     matches          matches
      CB - simple syntactic matching                1524        -           2.56             -
      CB - lexically expand. syntactic matching     3182        -           3.62             -
      Semantic document indexing and annotation     3182        2437        3.62             2.96

      Test collection 2: Metals and Alloys
      Approach                                      # of syn.   # of sem.   weight of syn.   weight of sem.
                                                    matches     matches     matches          matches
      CB - simple syntactic matching                2153        -           1.73             -
      CB - lexically expand. syntactic matching     2879        -           2.43             -
      Semantic document indexing and annotation     2879        1024        2.43             2.14
  • 20. Measuring Effectiveness of the Semantic Document Search
      [Charts: test collection Mammals of the World; test collection Metals and Alloys]
  • 21. Usability Evaluation (ICALT ’10)
      Evaluation hypothesis: “Using SDArch results in a more effective, efficient, and satisfactory user experience when authoring, exploring (i.e., searching and navigating) and utilizing documents in carrying out daily tasks.”
      Usability evaluation criteria:
      - User Effectiveness
      - User Efficiency
      - User Satisfaction
  • 22. Case Study: Authoring of Course Material
      - Participants - SDArch Social Network
          - University of Lugano, Switzerland - 7 participants
          - Simon Fraser University, Canada - 7 participants
          - Athabasca University, Canada - 2 participants
          - University of Belgrade, Serbia - 2 participants
      - Document Collection
          - “Software Design Patterns” - 70 PowerPoint and Word documents
      - Evaluation Session
          - Task-Based Usability Test
          - Follow-up questionnaires
  • 23. Usability Test Use Cases
      i. Setting Up the User Profile and the Social Network Properties
      ii. Authoring and Publishing Semantic Documents
      iii. Searching and Navigating across Semantic Documents

      Task   Task objective                 Slide
      1      Design patterns definition     1
      2      Example 1 - definition         2
      3      Example 1 - illustration       2
      4      Example 2 - definition         3
      5      Example 2 - illustration       3
  • 24. Evaluation Methods and Metrics
      Evaluation Criteria   Evaluation Method                   Evaluation Metric
      1. Effectiveness      Objective - Quantitative Measure    Task Success Rates
      2. Efficiency         Objective - Quantitative Measure    Task Completion Times
                            Objective - Quantitative Measure    Number of Mouse Clicks
                            Objective - Quantitative Measure    Number of Window Switches
      3. Satisfaction       Subjective - Questionnaire          5-level Likert scale
  • 25. 1. User Effectiveness
      Metric: task success rate
             Conventional System            SDArch System
      Task   Successful completions   %     Successful completions   %
      1      18                       100   18                       100
      2      17                       94.44 18                       100
      3      15                       83.33 17                       94.44
      4      17                       94.44 18                       100
      5      14                       77.77 16                       88.88
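The percentages in the success-rate table follow from dividing successful completions by the 18 participants listed in the case study (7 + 7 + 2 + 2). The short sketch below recomputes them from the completion counts on the slide. Note that rounding to two decimals gives 77.78 and 88.89 where the slide shows 77.77 and 88.88, which suggests the slide values were truncated rather than rounded.

```python
PARTICIPANTS = 18  # 7 + 7 + 2 + 2 participants from the case study

# task -> (successful completions with conventional system, with SDArch)
COMPLETIONS = {1: (18, 18), 2: (17, 18), 3: (15, 17), 4: (17, 18), 5: (14, 16)}


def success_rate(successes: int, participants: int = PARTICIPANTS) -> float:
    """Task success rate as a percentage of participants."""
    return 100.0 * successes / participants


if __name__ == "__main__":
    for task, (conventional, sdarch) in COMPLETIONS.items():
        print(task,
              round(success_rate(conventional), 2),
              round(success_rate(sdarch), 2))
```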
  • 26. 2. User Efficiency
      Metrics: task execution time; number of mouse clicks; number of window switches
      T-Test results:
      Task   p-value       p-value      p-value
      1      1.6*10^-12    0.00071      0.00004
      2      1.22*10^-7    0.00011      0.0041
      3      6.91*10^-8    9.17*10^-6   0.00016
      4      3.67*10^-7    0.00034      0.00009
      5      4.82*10^-10   2.6*10^-6    0.00004
      If p < 0.05, results are statistically significant.
  • 27. 3. User Satisfaction
      Metric: 5-level Likert scale (from “Strongly disagree” to “Strongly agree”)
      Internal consistency (reliability) test:
      Dimension              Cronbach’s α
      Usefulness             0.85
      Ease-of-Use            0.78
      Ease-of-Learning       0.92
      Overall Satisfaction   0.83
      Recommended α values > 0.75
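For readers unfamiliar with the reliability measure above, the following sketch computes Cronbach's α using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of respondent totals), where k is the number of questionnaire items in a dimension. The Likert responses in the example are made up for illustration; they are not the study's data.

```python
from statistics import variance
from typing import List


def cronbach_alpha(items: List[List[float]]) -> float:
    """Cronbach's alpha for k items, each a list with one score per respondent."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


if __name__ == "__main__":
    # Hypothetical 5-level Likert answers: 3 items, 5 respondents.
    responses = [
        [4, 5, 3, 4, 2],
        [4, 4, 3, 5, 2],
        [5, 5, 2, 4, 3],
    ]
    print(round(cronbach_alpha(responses), 2))
```

Values closer to 1 indicate that the items of a dimension vary together across respondents, which is what the recommended threshold on the slide checks.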
  • 28. Outline: Motivation; Semantic Document Model - SDM; Semantic Document Architecture - SDArch; Prototype; Thesis Validation; Conclusions
  • 29. Conclusions
      - Main contributions:
          - Introducing the Semantic Document Model - SDM
          - Designing the Semantic Document Architecture - SDArch
          - Providing the SDArch Prototype Implementation
          - Experimental and Usability evaluations
      - Future directions:
          - Document units versioning
          - Document units privacy and security
          - Decentralized storage of shared semantic documents
  • 30. Publications
      Journals:
      - S. Nešić, "Semantic Document Model to Enhance Data and Knowledge Interoperability," Annals of Information Systems - Special Issue on Semantic Web & Web 2.0, Springer US, pp. 135-160, 2009.
      Conferences:
      - S. Nešić, F. Crestani, D. Gašević, M. Jazayeri, "Search and Navigation in Semantically Integrated Document Collections," 4th International Conference on Advances in Semantic Processing - SEMAPRO, pp. 123-129, Firenze, Italy, 2010.
      - S. Nešić, D. Gašević, M. Jazayeri, "Semantic Document Architecture for Desktop Data Integration and Management," The 22nd International Conference on Software Engineering and Knowledge Engineering - SEKE, pp. 73-78, San Francisco, USA, 2010.
      - S. Nešić, D. Gašević, M. Jazayeri, M. Landoni, "Using Semantic Documents and Social Networking in Authoring Course Material: An Empirical Study," 10th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 666-670, Sousse, Tunisia, 2010. (Best paper award)
      - S. Nešić, F. Crestani, D. Gašević, M. Jazayeri, "Concept-Based Semantic Annotation, Indexing and Retrieval of Office-Like Document Units," 9th RIAO Conference, pp. 234-237, Paris, France, 2010.
      - S. Nešić, D. Gašević, M. Jazayeri, "Extending MS Office for sharing Document Content Units over the Semantic Web," 8th International Conference on Web Engineering - ICWE, pp. 350-353, Yorktown Heights, New York, USA, 2008.
      - S. Nešić, D. Gašević, M. Jazayeri, "Semantic Document Management for Collaborative Learning Object Authoring," 8th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 751-755, Santander, Spain, 2008.
      - S. Nešić, D. Gašević, M. Jazayeri, "An ontology-based framework for author-learning content interaction," 6th International Conference on Web-based Education - WBE, Chamonix, France, 2007.
      - S. Nešić, D. Gašević, M. Jazayeri, "An Ontology-Based Framework for Authoring Assisted by Recommendation," 7th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 227-231, Niigata, Japan, 2007.
      - S. Nešić, J. Jovanović, D. Gašević, M. Jazayeri, "Ontology-Based Content Model for Scalable Content Reuse," 4th ACM SIGART International Conference on Knowledge Capture - K-CAP, pp. 195-198, Whistler, Canada, 2007.
      Workshops:
      - S. Nešić, M. Jazayeri, F. Lelli, S. Nešić, "Towards Efficient Document Content Sharing in Social Networks," 2nd Workshop on Social Software Engineering and Applications, co-located with ESEC/FSE, pp. 1-8, Amsterdam, Netherlands, 2009.