data.cnr.it and the
           Semantic Scout
      CNR Semantic Technology Lab
              ISTC - SI
Aldo Gangemi, Alberto Salvati, Enrico Daga, Gianluca Troiani
Thanks to Claudio Baldassarre (UN-FAO) and Alfio Gliozzo (IBM-Watson)
                       http://stlab.istc.cnr.it
                          http://data.cnr.it
                    http://bit.ly/semanticscout
                                                                        1
data.cnr.it




              2
Enhanced SPARQL endpoint




                      3
Ontologies




             4
Sample class from ontology




                         5
The Semantic Scout
• A framework for search, presentation, and analysis of entities and
  their associated knowledge
• Employs SW, LOD, NLP, IR
• Scientific work goes back to 2006, first presented at ISWC2007
• An evolving prototype for requirements of the EU IP IKS: semantic
  search, hybrid IR/SW identity management, automatic document
  classification (against DBpedia)
• 2009 requirements from the technology transfer office of CNR for the
  NetwOrK initiative




                                                              6
The CNR

• CNR is the largest research institution in Italy
 – about 8000 permanent researchers (+14000)
 – 7 departments focused on the main scientific
   research areas
 – 108 institutes spread all over Italy
   • Subdivided into research units, labs, etc.




                                                  7
The CNR data sources
• Organizational data
 – DB: departments
 – DB: institutes, central admin, publications
• Activity-related data
 – DB: frameworks, programmes, workpackages
 – File system: administration documentation
• Personnel-related data
 – DB: curricula
 – DB: permanent employees
 – DB: other research employees, externally funded projects
• Financial data
 – DB: accounting, contracts, invoicing
• Only partly as open data!
                                                                                                           8
The CNR tasks
• Strategic objective: matching the research
  demand to the research supply
• Requirements
 – Semantic interoperability between heterogeneous
   data sources
 – Expert finding based on competence
 – Monitoring funding and evolution of different
   research areas and units
 – Browsing and reporting capabilities


                                              9
Architecture




               10
Methods for data conversion, extraction, inference,
  integration, linking, publishing, and searching




                                              12
Figures



• CNR Ontology
 – 28 modules
 – 120 classes
 – 300 relations
 – 1200 axioms
• CNR Data
 – >200K entities
 – ≈3M facts (about 2M inferred or extracted)
 – ≈240 datasets


                                             13
Sources and lifting
• The situation is usually not as clean as having a
  single CMS for most organizational tasks
• DB (e.g. SQL Server) + a lot of textual
  records + HTML Web Site + textual corpus +
  linked open data
• DB + interaction schemata (XML templates
  and HTML scraping, needed because of
  schemata degradation and user perspective
  evolution)

                                      14
Ontology design
• Starting from XML templates as module/pattern drafts
• Reengineering XML and scraped templates
• Reengineering DB schemata (system engineer
  involved)
• Obtained modular, pattern-based, task-based ontology
• Textual DB records with identity: precondition for
  hybridizing IR and SW (see later)
• Alignments to FOAF, SIOC, SKOS, WordNet ontologies
• Used patterns: situation, place, transitive reduction
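
As an illustration of the transitive reduction pattern named above, here is a minimal sketch. It assumes the relation forms a directed acyclic graph (for example a part-of hierarchy among organizational units). The unit names and the function `transitive_reduction` are hypothetical and are not taken from the CNR ontology.

```python
def reachable(graph, start):
    """Return every node reachable from start by following one or more edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def transitive_reduction(graph):
    """Drop each edge (u, v) that is implied by a longer path from u to v.

    Assumes the graph is acyclic; with cycles the reduction is not unique.
    """
    reduced = {}
    for u, targets in graph.items():
        kept = set(targets)
        for w in targets:
            # Anything reachable from another direct successor w is redundant.
            kept -= reachable(graph, w)
        reduced[u] = kept
    return reduced


# Hypothetical part-of assertions: lab -> unit -> institute -> department,
# plus two redundant shortcuts from "lab".
part_of = {
    "lab": {"unit", "institute", "department"},
    "unit": {"institute"},
    "institute": {"department"},
    "department": set(),
}
direct_part_of = transitive_reduction(part_of)
```

Storing only the reduced (direct) relation and letting a reasoner derive the transitive closure keeps the asserted data small, which is one plausible reason to use the pattern.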


                                                15
The CNR
ontology




           16
Data design
• Triplifiers based on SQL rules (automatic
  scripting on JDBC drivers not enough because
  of legacy degradation of physical schemata)
 – Cf. also: Semion reengineering tool
• Inferences: OWL (Pellet, HermiT), SPARQL
  CONSTRUCT
• Extraction tool: Semiosearch, categorizer over
  Wikipedia categories
 – Next: deep parsing approach (facts, relations, entities)
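
The idea of a triplifier driven by SQL rules can be sketched as follows. This is not the Semantic Scout or Semion code: the table, the column names, the namespace and the `worksFor` property are all made up for the example, and an in-memory SQLite database stands in for the legacy DB.

```python
import sqlite3

BASE = "http://example.org/cnr/"  # hypothetical namespace, not the real data.cnr.it one

# One "SQL rule": a query whose result columns are read as (subject id, object id),
# paired with the predicate and URI prefixes used to mint the triple.
RULES = [
    {
        "sql": "SELECT emp_id, inst_id FROM employee WHERE inst_id IS NOT NULL",
        "subject": BASE + "person/",
        "predicate": BASE + "ontology/worksFor",
        "object": BASE + "institute/",
    },
]


def triplify(conn, rules):
    """Run each rule and return its rows as N-Triples lines."""
    lines = []
    for rule in rules:
        for s_id, o_id in conn.execute(rule["sql"]):
            lines.append(
                f"<{rule['subject']}{s_id}> <{rule['predicate']}> <{rule['object']}{o_id}> ."
            )
    return lines


# Made-up sample data; the second row has no institute and is skipped by the rule.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, inst_id TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Person One", "ISTC"), (2, "Person Two", None)],
)
triples = triplify(conn, RULES)
```

Because each rule is hand-written SQL, it can filter or repair rows (here, skipping missing institute ids) in ways that automatic mapping from the physical schema could not.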


                                                   17
Publishing and hybridizing
• Publishing OWL-RDF datasets
  – linked data approach (persistent URIs, triple stores for RDF dataset management,
    linking to common vocabularies: FOAF, DBpedia, Geonames, Bibo, ...)
  – OWL ontologies for dataset generation, querying, inference (new enriched
    datasets)
• Subgraph extraction through SNA
• Virtual semantic corpus
  – IRW to distinguish information and non-information resources
  – SPARQL rules to generate virtual texts associated with entities
• Indexing
  – Lucene+LSA indexing of semantic corpus
  – “Semantic” Lucene extension to produce tight coupling of virtual texts with
    entities
  – Multilinguality
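
A rough sketch of the virtual semantic corpus idea: literal values attached to an entity are concatenated into a "virtual text", which is then indexed so that a keyword hit leads straight back to the entity. Everything here is a simplification under stated assumptions: a Python loop replaces the SPARQL rules, a plain term-frequency index replaces Lucene+LSA, and the triples are invented.

```python
from collections import Counter, defaultdict

# Hypothetical triples (subject, predicate, literal value).
TRIPLES = [
    ("person/1", "name", "Person One"),
    ("person/1", "interest", "social simulation"),
    ("person/1", "publication_title", "Reputation in social networks"),
    ("person/2", "name", "Person Two"),
    ("person/2", "interest", "marine biology"),
]


def virtual_texts(triples):
    """Concatenate the literal values attached to each entity into one text."""
    texts = defaultdict(list)
    for subject, _predicate, value in triples:
        texts[subject].append(value)
    return {entity: " ".join(values) for entity, values in texts.items()}


def build_index(texts):
    """Map each lowercased term to a Counter of {entity: term frequency}."""
    index = defaultdict(Counter)
    for entity, text in texts.items():
        for term in text.lower().split():
            index[term][entity] += 1
    return index


def search(index, query):
    """Rank entities by the summed frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        scores.update(index.get(term, {}))
    return [entity for entity, _score in scores.most_common()]


index = build_index(virtual_texts(TRIPLES))
hits = search(index, "social reputation")
```

The point of the coupling is that search results are entity identifiers rather than documents, so they can be fed directly into further graph queries.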

                                                                          18
Consuming
• SPARQL endpoint, with interface enhancement
• Keyword-based search
  – Semantic browsing with SPARQL-based AJAX DHTML, RDF
    relation browser, or XML-based relation browser
• Category-based search
  – Keyword-based result focusing
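
For readers who want to consume the data programmatically, the sketch below builds a standard SPARQL Protocol GET request using only the Python standard library. The endpoint path `/sparql` is an assumption (the slides only give the host `data.cnr.it`), and the query is a generic one that does not rely on any CNR-specific vocabulary.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint location; check the site for the actual address.
ENDPOINT = "http://data.cnr.it/sparql"

# Generic query: list a few entities that have an rdfs:label.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entity ?label
WHERE { ?entity rdfs:label ?label }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
# To actually run it (requires network access and a live endpoint):
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       rows = json.load(response)["results"]["bindings"]
```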




                                                19
http://bit.ly/semanticscout




                          22
Expert finding: Task-based testing
• It is based on the ability to materialize on
  demand a contextual network of relevant
  information.
• It is performed with a combination of tools in the
  toolkit to:
 – Identify the main topics of research
 – Recursively search the CNR data cloud
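
The recursive search step can be pictured as a bounded graph traversal starting from an entry point. The following is only a sketch: the link structure, the identifiers and the function `people_near` are hypothetical, and a breadth-first search with a hop limit is one possible reading of "recursively search", not necessarily what the Semantic Scout does.

```python
from collections import deque

# Hypothetical undirected links between resources in the data cloud.
LINKS = {
    "programme:P1": ["person:A", "unit:U1"],
    "person:A": ["programme:P1", "person:B"],
    "unit:U1": ["programme:P1", "person:C"],
    "person:B": ["person:A"],
    "person:C": ["unit:U1", "person:D"],
    "person:D": ["person:C"],
}


def people_near(links, entry_point, max_hops):
    """Breadth-first search returning {person: hop distance} within max_hops."""
    distance = {entry_point: 0}
    queue = deque([entry_point])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in links.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {n: d for n, d in distance.items() if n.startswith("person:")}


experts = people_near(LINKS, "programme:P1", max_hops=2)
```

With a limit of two hops, `person:D` (three hops from the programme) is left out, which shows how the hop limit bounds the contextual network that gets materialized.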




                                              23
Identifying the main topics of research:
              project description
• “Reputation is a social knowledge, on which a number of social decisions are
  accomplished. Regulating society from the morning of mankind becomes more
  crucial with the pace of development of ICT technologies, dramatically
  enlarging the range of interaction and generating new types of aggregation.
  Despite its critical role, reputation generation, transmission and use are
  unclear. The project aims to an interdisciplinary theory of reputation and to
  modeling the interplay between direct evaluations and meta-evaluations in
  three types of decisions, epistemic (whether to form a given evaluation),
  strategic (whether and how interact with target), and memetic (whether and
  which evaluation to transmit).”
  – Project About: Social Knowledge for e-Governance.
  – Topics can be manually annotated, or automatically induced,
    e.g.: ethics, sociology, collaboration, social network,
    reputation



                                                                     24
Identifying the main topics of
        research: text categorization
• Query: “ethics, sociology, collaboration, social network, reputation”




                                                               25
Search the CNR data cloud: identify an
                 entry point
• “Commessa” (programme): “Il Circuito dell’Integrazione: Mente, Relazioni
  e Reti Sociali. Simulazione Sociale e Strumenti di Governance”




                                                                26
Search the CNR data cloud: identify
                   key people
• Ing. Jordi Sabater: Cognitive Science;
• Dott. Mario Paolucci: Sociology, Psychology;
• Gennaro di Tosto: Artificial Intelligence;
• Walter Quattrociocchi: Interdisciplinary Fields;
• Giuseppe Castaldi: Ethics;
• Aldo Gangemi: Semantic Web, Knowledge representation.

                                                          27
Expert Finding: Results
• The description of “eRep project” was adopted as a
  gold standard to evaluate the results when testing the
  Semantic Scout.
• 6 out of 10 CNR researchers were correctly retrieved,
  along with a project member affiliated with another
  institution.
 – Project Coordinator: Dott. Mario Paolucci
 – External Member: Jordi Sabater Mir




                                                28
Functional evaluation of Semantic
              Scout (example)
• Expert finding accuracy
 – All the 6 retrieved people scored among the first 10 in the
   result from the search engine.
• Benefit of integrated data cloud
 – The user judged an “activity” to be relevant to his goal and
   used it as entry point to the CNR network of resources.




                                                        29
Functional evaluation of Semantic
                   Scout
• Accessibility and Interaction
  – Multiple user interfaces give users a level of interaction
    adapted to each specific type of required information
• Completeness of retrieval
  – 4 people were not included in our result set.
  – Antonietta Di Salvatore scored below the first 10 people in the
    list (+1).
  – Giulia Andrighetto was not listed among the people relevant to
    the query, but belongs to the social network of Dr. Rosaria
    Conte (+1).
  – Marco Capenni and Stefano Picascia have a technician profile,
    hence they are neither reported among the people relevant to
    the search query, nor belong to the network of any of the other
    researchers.
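
The figures above amount to a recall of 6/10 at rank 10 against the eRep gold standard. A small worked sketch of that arithmetic follows. The identifiers `r1`..`r10` and `x1`..`x5` are placeholders, and the ranking is invented so that it reproduces the reported counts; it is not the actual result list.

```python
def recall_at_k(ranked, gold, k):
    """Fraction of gold-standard items found in the first k results."""
    top_k = set(ranked[:k])
    return len(top_k & gold) / len(gold)


# Placeholder identifiers: r1..r10 stand for the 10 gold-standard researchers.
gold = {f"r{i}" for i in range(1, 11)}
# Invented ranking: 6 gold members appear in the top 10, mixed with other
# people (x*), and one more gold member (r7) appears only below rank 10.
ranked = ["r1", "x1", "r2", "r3", "x2", "r4", "x3", "r5", "x4", "r6", "x5", "r7"]

recall = recall_at_k(ranked, gold, k=10)  # 6 / 10
```

Raising `k` past 10 would pick up researchers who, like the first case listed above, were retrieved but ranked below the cut-off.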


                                                           30
Ongoing work
• More data linking (e.g. DBLP,
  Georeferencing)
• Synchronization with data sources
• More interaction paradigms
• Privacy issues interlaced with hierarchical
  and idiosyncratic practices




                                          31
Conclusions
• Hybridizing several semantic and retrieval
  technologies provides added value to a
  research organization
• Scalability works for CNR figures
• Interaction is a core selling point
• Try it at http://bit.ly/semanticscout
• @data_cnr_it, @semanticscout,
  @aldogangemi

                                         32

Mais conteúdo relacionado

Mais procurados

Similarity based Dynamic Web Data Extraction and Integration System from Sear...
Similarity based Dynamic Web Data Extraction and Integration System from Sear...Similarity based Dynamic Web Data Extraction and Integration System from Sear...
Similarity based Dynamic Web Data Extraction and Integration System from Sear...IDES Editor
 
Crushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional DataCrushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional DataRay Schwartz
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsGDi Techno Solutions
 
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip finalDeborah McGuinness
 
Preparing eScience librarians -- RDAP 2012
Preparing eScience librarians -- RDAP 2012 Preparing eScience librarians -- RDAP 2012
Preparing eScience librarians -- RDAP 2012 Jian Qin
 
Everything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information WorkbenchEverything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information WorkbenchPeter Haase
 
Data-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemData-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemMaryann Martone
 
Data Mining
Data MiningData Mining
Data Miningswami920
 
Dc sheridan dlf_2011_final
Dc sheridan dlf_2011_finalDc sheridan dlf_2011_final
Dc sheridan dlf_2011_finalSayeed Choudhury
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsPlanetData Network of Excellence
 
20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicagoDeborah McGuinness
 
Introduction to Data Mining for Newbies
Introduction to Data Mining for NewbiesIntroduction to Data Mining for Newbies
Introduction to Data Mining for NewbiesEunjeong (Lucy) Park
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
Small Data: Bridging the Gap Between Generic and Specific Repositories
Small Data: Bridging the Gap Between Generic and Specific RepositoriesSmall Data: Bridging the Gap Between Generic and Specific Repositories
Small Data: Bridging the Gap Between Generic and Specific RepositoriesAnita de Waard
 
Indexing techniques for advanced database systems
Indexing techniques for advanced database systemsIndexing techniques for advanced database systems
Indexing techniques for advanced database systemsMohammed Muqeet
 
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 FinalLibby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Finala.carusi
 
Open hpi semweb-06-part7
Open hpi semweb-06-part7Open hpi semweb-06-part7
Open hpi semweb-06-part7Nadine Ludwig
 

Mais procurados (20)

Similarity based Dynamic Web Data Extraction and Integration System from Sear...
Similarity based Dynamic Web Data Extraction and Integration System from Sear...Similarity based Dynamic Web Data Extraction and Integration System from Sear...
Similarity based Dynamic Web Data Extraction and Integration System from Sear...
 
Crushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional DataCrushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional Data
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno Solutions
 
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
 
Preparing eScience librarians -- RDAP 2012
Preparing eScience librarians -- RDAP 2012 Preparing eScience librarians -- RDAP 2012
Preparing eScience librarians -- RDAP 2012
 
Everything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information WorkbenchEverything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information Workbench
 
Data-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemData-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystem
 
Data Mining
Data MiningData Mining
Data Mining
 
Introducation to metadata
Introducation to metadataIntroducation to metadata
Introducation to metadata
 
Dc sheridan dlf_2011_final
Dc sheridan dlf_2011_finalDc sheridan dlf_2011_final
Dc sheridan dlf_2011_final
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of Endpoints
 
NISO Forum, Denver, Sept. 24, 2012: EZID: Easy dataset identification & manag...
NISO Forum, Denver, Sept. 24, 2012: EZID: Easy dataset identification & manag...NISO Forum, Denver, Sept. 24, 2012: EZID: Easy dataset identification & manag...
NISO Forum, Denver, Sept. 24, 2012: EZID: Easy dataset identification & manag...
 
20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago
 
Introduction to Data Mining for Newbies
Introduction to Data Mining for NewbiesIntroduction to Data Mining for Newbies
Introduction to Data Mining for Newbies
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
Small Data: Bridging the Gap Between Generic and Specific Repositories
Small Data: Bridging the Gap Between Generic and Specific RepositoriesSmall Data: Bridging the Gap Between Generic and Specific Repositories
Small Data: Bridging the Gap Between Generic and Specific Repositories
 
Indexing techniques for advanced database systems
Indexing techniques for advanced database systemsIndexing techniques for advanced database systems
Indexing techniques for advanced database systems
 
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 FinalLibby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
 
Open hpi semweb-06-part7
Open hpi semweb-06-part7Open hpi semweb-06-part7
Open hpi semweb-06-part7
 
Role of Semantic Web in Health Informatics
Role of Semantic Web in Health InformaticsRole of Semantic Web in Health Informatics
Role of Semantic Web in Health Informatics
 

Destaque

Innovative teaching manual of surumi
Innovative teaching manual of surumiInnovative teaching manual of surumi
Innovative teaching manual of surumiSano Anil
 
Step by step guidance general overview final_new
Step by step guidance general overview final_newStep by step guidance general overview final_new
Step by step guidance general overview final_neweTwinning Europe
 
Animal classification based on Job 39
Animal classification based on Job 39Animal classification based on Job 39
Animal classification based on Job 39Kathy Page-Applebee
 
#weightloss 2014 vs Old School #Dieting
#weightloss 2014 vs Old School #Dieting#weightloss 2014 vs Old School #Dieting
#weightloss 2014 vs Old School #DietingCindy McAsey
 
The Millennial Shift: Financial Services and the Digial Generation Study Preview
The Millennial Shift: Financial Services and the Digial Generation Study PreviewThe Millennial Shift: Financial Services and the Digial Generation Study Preview
The Millennial Shift: Financial Services and the Digial Generation Study PreviewCorporate Insight
 
осіння фантазія
осіння фантазіяосіння фантазія
осіння фантазіяNatalya Markova
 
Command keynote! part 2 p1
Command keynote! part 2 p1Command keynote! part 2 p1
Command keynote! part 2 p1ambersweet95
 
Implementation training updated 9.27.13
Implementation training updated 9.27.13Implementation training updated 9.27.13
Implementation training updated 9.27.13progroup
 
Iphone app possibilities
Iphone app possibilitiesIphone app possibilities
Iphone app possibilitiesJenny Chang
 
Новости недвижимости Майами - Февраль 2016
Новости недвижимости Майами - Февраль 2016Новости недвижимости Майами - Февраль 2016
Новости недвижимости Майами - Февраль 2016The Reznik Group
 
Evaluation Question 4
Evaluation Question 4Evaluation Question 4
Evaluation Question 4AmyLongworth
 
지정공모(Pt제출) 소셜나눔
지정공모(Pt제출) 소셜나눔지정공모(Pt제출) 소셜나눔
지정공모(Pt제출) 소셜나눔Seong Whan Park
 
KSA by Samaiel Bakolka & Rahaf Tawfeeg
KSA by Samaiel Bakolka & Rahaf TawfeegKSA by Samaiel Bakolka & Rahaf Tawfeeg
KSA by Samaiel Bakolka & Rahaf Tawfeegliza14
 
Four Ways to Leverage Social Media in Your Marketing
Four Ways to Leverage Social Media in Your MarketingFour Ways to Leverage Social Media in Your Marketing
Four Ways to Leverage Social Media in Your MarketingJocelyn Murray
 

Destaque (19)

Tradicii
TradiciiTradicii
Tradicii
 
Innovative teaching manual of surumi
Innovative teaching manual of surumiInnovative teaching manual of surumi
Innovative teaching manual of surumi
 
Question 1
Question 1Question 1
Question 1
 
Step by step guidance general overview final_new
Step by step guidance general overview final_newStep by step guidance general overview final_new
Step by step guidance general overview final_new
 
Animal classification based on Job 39
Animal classification based on Job 39Animal classification based on Job 39
Animal classification based on Job 39
 
#weightloss 2014 vs Old School #Dieting
#weightloss 2014 vs Old School #Dieting#weightloss 2014 vs Old School #Dieting
#weightloss 2014 vs Old School #Dieting
 
The Millennial Shift: Financial Services and the Digial Generation Study Preview
The Millennial Shift: Financial Services and the Digial Generation Study PreviewThe Millennial Shift: Financial Services and the Digial Generation Study Preview
The Millennial Shift: Financial Services and the Digial Generation Study Preview
 
Ch06
Ch06Ch06
Ch06
 
осіння фантазія
осіння фантазіяосіння фантазія
осіння фантазія
 
Command keynote! part 2 p1
Command keynote! part 2 p1Command keynote! part 2 p1
Command keynote! part 2 p1
 
Implementation training updated 9.27.13
Implementation training updated 9.27.13Implementation training updated 9.27.13
Implementation training updated 9.27.13
 
Bs ipa7 semester 2
Bs ipa7 semester 2Bs ipa7 semester 2
Bs ipa7 semester 2
 
Iphone app possibilities
Iphone app possibilitiesIphone app possibilities
Iphone app possibilities
 
Новости недвижимости Майами - Февраль 2016
Новости недвижимости Майами - Февраль 2016Новости недвижимости Майами - Февраль 2016
Новости недвижимости Майами - Февраль 2016
 
Evaluation Question 4
Evaluation Question 4Evaluation Question 4
Evaluation Question 4
 
지정공모(Pt제출) 소셜나눔
지정공모(Pt제출) 소셜나눔지정공모(Pt제출) 소셜나눔
지정공모(Pt제출) 소셜나눔
 
KSA by Samaiel Bakolka & Rahaf Tawfeeg
KSA by Samaiel Bakolka & Rahaf TawfeegKSA by Samaiel Bakolka & Rahaf Tawfeeg
KSA by Samaiel Bakolka & Rahaf Tawfeeg
 
Ucm237512
Ucm237512Ucm237512
Ucm237512
 
Four Ways to Leverage Social Media in Your Marketing
Four Ways to Leverage Social Media in Your MarketingFour Ways to Leverage Social Media in Your Marketing
Four Ways to Leverage Social Media in Your Marketing
 

Semelhante a Linked Open data: CNR

Data Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data ManagementData Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data ManagementRENDER project
 
Contributing to the Smart City Through Linked Library Data
Contributing to the Smart City Through Linked Library DataContributing to the Smart City Through Linked Library Data
Contributing to the Smart City Through Linked Library DataMarcia Zeng
 
Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management Jian Qin
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic WebNuxeo
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Jian Qin
 
Linked Data at the Open University: From Technical Challenges to Organization...
Linked Data at the Open University: From Technical Challenges to Organization...Linked Data at the Open University: From Technical Challenges to Organization...
Linked Data at the Open University: From Technical Challenges to Organization...Mathieu d'Aquin
 
Why manage research data?
Why manage research data?Why manage research data?
Why manage research data?Graham Pryor
 
Metadata in general and Dublin Core in specific; some experiences
Metadata in general and Dublin Core in specific; some experiencesMetadata in general and Dublin Core in specific; some experiences
Metadata in general and Dublin Core in specific; some experiencesKerstin Forsberg
 
Simon Hodson
Simon HodsonSimon Hodson
Simon HodsonEduserv
 
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...SEAD
 
Minimizing the Complexities of Machine Learning with Data Virtualization
Minimizing the Complexities of Machine Learning with Data VirtualizationMinimizing the Complexities of Machine Learning with Data Virtualization
Minimizing the Complexities of Machine Learning with Data VirtualizationDenodo
 
A Framework for Ontology Usage Analysis
A Framework for Ontology Usage AnalysisA Framework for Ontology Usage Analysis
A Framework for Ontology Usage AnalysisJamshaid Ashraf
 
ESI Supplemental Webinar 2 - DataONE presentation slides
ESI Supplemental Webinar 2 - DataONE presentation slides ESI Supplemental Webinar 2 - DataONE presentation slides
ESI Supplemental Webinar 2 - DataONE presentation slides DuraSpace
 
Data repositories -- Xiamen University 2012 06-08
Data repositories -- Xiamen University 2012 06-08Data repositories -- Xiamen University 2012 06-08
Data repositories -- Xiamen University 2012 06-08Jian Qin
 
Preparing eScience Librarians for Managing Research Data - Jian Qin - RDAP12
Preparing eScience Librarians for Managing Research Data - Jian Qin - RDAP12Preparing eScience Librarians for Managing Research Data - Jian Qin - RDAP12
Preparing eScience Librarians for Managing Research Data - Jian Qin - RDAP12ASIS&T
 
Building a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceBuilding a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceRobert H. McDonald
 
Introduction to Object Oriented databases
Introduction to Object Oriented databasesIntroduction to Object Oriented databases
Introduction to Object Oriented databasesDr. C.V. Suresh Babu
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notesAnandh Arumugakan
 

Semelhante a Linked Open data: CNR (20)

Data Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data ManagementData Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data Management
 
Contributing to the Smart City Through Linked Library Data
Contributing to the Smart City Through Linked Library DataContributing to the Smart City Through Linked Library Data
Contributing to the Smart City Through Linked Library Data
 
Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management Educating a New Breed of Data Scientists for Scientific Data Management
Educating a New Breed of Data Scientists for Scientific Data Management
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...
 
Linked Data at the Open University: From Technical Challenges to Organization...
Linked Data at the Open University: From Technical Challenges to Organization...Linked Data at the Open University: From Technical Challenges to Organization...
Linked Data at the Open University: From Technical Challenges to Organization...
 
Why manage research data?
Why manage research data?Why manage research data?
Why manage research data?
 
Linked Open data: CNR

  • 1. data.cnr.it and the Semantic Scout
      CNR Semantic Technology Lab, ISTC - SI
      Aldo Gangemi, Alberto Salvati, Enrico Daga, Gianluca Troiani
      Thanks to Claudio Baldassarre (UN-FAO) and Alfio Gliozzo (IBM-Watson)
      http://stlab.istc.cnr.it
      http://data.cnr.it
      http://bit.ly/semanticscout
  • 5. Sample class from ontology
  • 6. The Semantic Scout
      • A framework for search, presentation, and analysis of entities and their associated knowledge
      • Employs SW, LOD, NLP, IR
      • Scientific work goes back to 2006, first presented at ISWC2007
      • An evolving prototype for requirements of the EU IP IKS: semantic search, hybrid IR/SW identity management, automatic document classification (against DBpedia)
      • 2009 requirements from the technology transfer office of CNR for the NetwOrK initiative
  • 7. The CNR
      • CNR is the largest research institution in Italy
          – about 8000 permanent researchers (+14000)
          – 7 departments focused on the main scientific research areas
          – 108 institutes spread all over Italy
              • Subdivided into research units, labs, etc.
  • 8. The CNR data sources (diagram of databases and a file system; slide note: "Only partly as open data!")
      • Categories shown: organizational data, activity-related data, personnel-related data
      • Sources shown: departments, institutes, central admin; administration documentation; frameworks, programmes, workpackages; publications; curricula; permanent employees; other research employees; financial data (accounting, contracts, invoicing); externally funded projects
  • 9. The CNR tasks
      • Strategic objective: matching the research demand to the research supply
      • Requirements
          – Semantic interoperability between heterogeneous data sources
          – Expert finding based on competence
          – Monitoring funding and evolution of different research areas and units
          – Browsing and reporting capabilities
  • 12. Methods for data conversion, extraction, inference, integration, linking, publishing, and searching
  • 13. Figures
      • CNR Ontology: 28 modules, 120 classes, 300 relations, 1200 axioms
      • CNR Data: >200K entities, ≈3M facts (about 2M inferred or extracted), ≈240 datasets
  • 14. Sources and lifting
      • Situation usually not as clean as using a unique CMS for most organizational tasks
      • DB (e.g. SQL Server) + a lot of textual records + HTML Web Site + textual corpus + linked open data
      • DB + interaction schemata (XML templates and HTML scraping, needed because of schemata degradation and user perspective evolution)
  • 15. Ontology design
      • Starting from XML templates as module/pattern drafts
      • Reengineering XML and scraped templates
      • Reengineering DB schemata (system engineer involved)
      • Obtained modular, pattern-based, task-based ontology
      • Textual DB records with identity: precondition for hybridizing IR and SW (see later)
      • Alignments to FOAF, SIOC, SKOS, WordNet ontologies
      • Used patterns: situation, place, transitive reduction
  • 17. Data design
      • Triplifiers based on SQL rules (automatic scripting on JDBC drivers not enough because of legacy degradation of physical schemata)
          – Cf. also: Semion reengineering tool
      • Inferences: OWL (Pellet, HermiT), SPARQL CONSTRUCT
      • Extraction tool: Semiosearch, categorizer over Wikipedia categories
          – Next: deep parsing approach (facts, relations, entities)
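The idea of a triplifier driven by SQL rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the CNR implementation: the `researcher` table, the `http://example.org/cnr/` namespace, the `worksAt` property and the rule format are all hypothetical, and Python's built-in `sqlite3` stands in for the real databases. Only `foaf:name` is an actual vocabulary term.

```python
import sqlite3

# Hypothetical namespace for the example; not the real data.cnr.it URIs.
BASE = "http://example.org/cnr/"

# Each rule pairs an SQL query with templates that turn one row into one triple.
# The table and column names are assumptions made for this sketch.
RULES = [
    {
        "sql": "SELECT id, name FROM researcher ORDER BY id",
        "subject": BASE + "person/{0}",
        "predicate": "http://xmlns.com/foaf/0.1/name",
        "object": "{1}",
        "literal": True,
    },
    {
        "sql": "SELECT id, institute_id FROM researcher "
               "WHERE institute_id IS NOT NULL ORDER BY id",
        "subject": BASE + "person/{0}",
        "predicate": BASE + "ontology/worksAt",  # hypothetical property
        "object": BASE + "institute/{1}",
        "literal": False,
    },
]


def format_object(rule, row):
    """Render the object of a triple as an N-Triples literal or URI."""
    value = rule["object"].format(*row)
    if rule["literal"]:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def triplify(conn, rules):
    """Run every rule against the database and return N-Triples lines."""
    lines = []
    for rule in rules:
        for row in conn.execute(rule["sql"]):
            subject = rule["subject"].format(*row)
            lines.append(
                f"<{subject}> <{rule['predicate']}> {format_object(rule, row)} ."
            )
    return lines


def demo():
    """Build a tiny in-memory database and triplify it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE researcher (id INTEGER, name TEXT, institute_id TEXT)")
    conn.executemany(
        "INSERT INTO researcher VALUES (?, ?, ?)",
        [(1, "Ada Example", "istc"), (2, "Bob Example", None)],
    )
    return triplify(conn, RULES)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Keeping the rules as explicit SQL, rather than deriving them from the physical schema, is what lets such a sketch cope with the degraded legacy schemata the slide mentions: each rule can select and clean exactly the columns it needs.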
  • 18. Publishing and hybridizing
      • Publishing OWL-RDF datasets
          – linked data approach (persistent URIs, triple stores for RDF dataset management, linking to common vocabularies: FOAF, DBpedia, Geonames, Bibo, ...)
          – OWL ontologies for dataset generation, querying, inference (new enriched datasets)
      • Subgraph extraction through SNA
      • Virtual semantic corpus
          – IRW to distinguish information and non-information resources
          – SPARQL rules to generate virtual texts associated with entities
      • Indexing
          – Lucene+LSA indexing of semantic corpus
          – “Semantic” Lucene extension to produce tight coupling of virtual texts with entities
          – Multilinguality
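A rough way to picture the "virtual semantic corpus" is sketched below: each entity receives a text assembled from its own literals and those of directly linked entities, and that text is indexed so a keyword hit leads back to the entity. The sketch deliberately swaps in plain Python for the real components: a list of tuples replaces the triple store and SPARQL rules, and a toy inverted index replaces Lucene+LSA. All identifiers and sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; illustrative data only.
TRIPLES = [
    ("person/1", "name", "Ada Example"),
    ("person/1", "worksOn", "project/9"),
    ("project/9", "title", "Reputation in social networks"),
    ("person/2", "name", "Bob Example"),
    ("person/2", "worksOn", "project/7"),
    ("project/7", "title", "Protein folding simulation"),
]


def virtual_texts(triples):
    """Build one text per entity from its literals and its neighbours' literals."""
    subjects = {s for s, _, _ in triples}
    literals = defaultdict(list)  # entity -> literal values
    links = defaultdict(list)     # entity -> directly linked entities
    for s, _, o in triples:
        # An object that is itself a subject is treated as a link, else as a literal.
        (links if o in subjects else literals)[s].append(o)
    texts = {}
    for entity in sorted(subjects):
        parts = list(literals[entity])
        for neighbour in links[entity]:
            parts.extend(literals[neighbour])
        texts[entity] = " ".join(parts)
    return texts


def build_index(texts):
    """Map each lower-cased token to the set of entities whose text contains it."""
    index = defaultdict(set)
    for entity, text in texts.items():
        for token in text.lower().split():
            index[token].add(entity)
    return index


def search(index, query):
    """Return entities whose virtual text contains every query token."""
    hits = [index.get(token, set()) for token in query.lower().split()]
    return sorted(set.intersection(*hits)) if hits else []


if __name__ == "__main__":
    idx = build_index(virtual_texts(TRIPLES))
    print(search(idx, "reputation"))
```

Because the project title is folded into the person's virtual text, a query on a topic word returns the person as well as the project, which is the coupling between texts and entities that the slide describes.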
  • 19. Consuming
      • SPARQL endpoint, with interface enhancement
      • Keyword-based search
          – Semantic browsing with SPARQL-based AJAX DHTML, RDF relation browser, or XML-based relation browser
      • Category-based search
          – Keyword-based result focusing
  • 23. Expert finding: Task-based testing
      • It is based on the ability to materialize on demand a contextual network of relevant information.
      • It is performed with a combination of tools in the toolkit to:
          – Identify the main topics of research
          – Recursively search the CNR data cloud
  • 24. Identifying the main topics of research: project description
      • “Reputation is a social knowledge, on which a number of social decisions are accomplished. Regulating society from the morning of mankind becomes more crucial with the pace of development of ICT technologies, dramatically enlarging the range of interaction and generating new types of aggregation. Despite its critical role, reputation generation, transmission and use are unclear. The project aims to an interdisciplinary theory of reputation and to modeling the interplay between direct evaluations and meta-evaluations in three types of decisions, epistemic (whether to form a given evaluation), strategic (whether and how interact with target), and memetic (whether and which evaluation to transmit).”
          – Project About: Social Knowledge for e-Governance.
          – Topics can be manually annotated, or automatically induced, e.g.: ethics, sociology, collaboration, social network, reputation
  • 25. Identifying the main topics of research: text categorization
      • Query: “ethics, sociology, collaboration, social network, reputation”
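  • 26. Search the CNR data cloud: identify an entry point
      • “Commessa” (programme): “Il Circuito dell’Integrazione: Mente, Relazioni e Reti Sociali. Simulazione Sociale e Strumenti di Governance” (in English: “The Integration Circuit: Mind, Relations and Social Networks. Social Simulation and Governance Tools”)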
  • 27. Search the CNR data cloud: identify key people
      • Ing. Jordi Sabater: Cognitive Science
      • Dott. Mario Paolucci: Sociology, Psychology
      • Gennaro di Tosto: Artificial Intelligence
      • Walter Quattrociocchi: Interdisciplinary Fields
      • Giuseppe Castaldi: Ethics
      • Aldo Gangemi: Semantic Web, Knowledge representation
  • 28. Expert Finding: Results
      • The description of the “eRep project” was adopted as a gold standard to evaluate the results when testing the Semantic Scout.
      • 6 out of 10 CNR researchers were correctly retrieved, along with a project member affiliated with another institution.
          – Project Coordinator: Dott. Mario Paolucci
          – External Member: Jordi Sabater Mir
  • 29. Functional evaluation of Semantic Scout (example)
      • Expert finding accuracy
          – All 6 retrieved people ranked among the first 10 results from the search engine.
      • Benefit of integrated data cloud
          – The user judged an “activity” to be relevant to his goal and used it as an entry point to the CNR network of resources.
  • 30. Functional evaluation of Semantic Scout
      • Accessibility and Interaction
          – Multiple user interfaces give users a level of interaction adapted to each specific type of required information
      • Completeness of retrieval
          – 4 people were not included in our result set.
          – Antonietta Di Salvatore: scored below the first 10 people in the list; (+1)
          – Giulia Andrighetto was not listed among the people relevant to the query, but belongs to the social network of Dr. Rosaria Conte. (+1)
          – Marco Capenni and Stefano Picascia have a technician profile, hence they are neither reported among the people relevant to the search query, nor belong to the network of any of the other researchers.
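The completeness figures reported in the evaluation slides amount to a recall measurement against the gold standard. The short sketch below only restates that arithmetic; the function and constant names are illustrative, and the counts are the ones given in the slides (10 CNR researchers in the gold standard, 6 retrieved, 4 missed). Precision is not computed because the slides do not say how many of the returned results were irrelevant.

```python
def recall(retrieved_relevant, total_relevant):
    """Fraction of the gold-standard items that the search actually returned."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return retrieved_relevant / total_relevant


# Counts taken from the slides; the names are illustrative.
GOLD_CNR_RESEARCHERS = 10
RETRIEVED = 6
MISSED = 4

# The retrieved and missed researchers together make up the gold standard.
assert RETRIEVED + MISSED == GOLD_CNR_RESEARCHERS

if __name__ == "__main__":
    print(f"recall = {recall(RETRIEVED, GOLD_CNR_RESEARCHERS):.1f}")
```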
  • 31. Ongoing work
      • More data linking (e.g. DBLP, Georeferencing)
      • Synchronization with data sources
      • More interaction paradigms
      • Privacy issues interlaced with hierarchical and idiosyncratic practices
  • 32. Conclusions
      • Hybridizing several semantic and retrieval technologies provides added value to a research organization
      • Scalability works for CNR figures
      • Interaction is a core selling point
      • Try it at http://bit.ly/semanticscout
      • @data_cnr_it, @semanticscout, @aldogangemi