Open Education Challenge 2014: exploiting Linked Data in Educational Applications

Presentation from a mentoring event of the Open Education Europa Challenge (http://www.openeducationchallenge.eu/) on using Linked Data in educational applications.

  1. Exploiting (Linked) Web Data in Educational Applications. Stefan Dietze, L3S Research Center, http://purl.org/dietze, @stefandietze. Open Education Challenge, Berlin, 2014 (28/10/14).
  2. Introduction (http://www.l3s.de/)
     Research areas:
     - Web & data science, information retrieval, semantic web & Linked Data, data & knowledge integration
     - Application domains: education/TEL, Web archiving, …
     Linked Data for education:
     - Data sharing: TED, Open Courseware, mEducator, LinkedUp, LAK, …
     - Tutorials & workshops (e.g. “Linked Learning” series)
     - LinkedUniversities.org and LinkedEducation.org
     - W3C Linked Open Education community group
     Some projects: see also http://purl.org/dietze
  3. Exploiting Open Data for Education? In a nutshell. Diagram labels: World Wide Web, Social Media, (Open) Educational Resources, Distance Universities, MOOCs, Linked Open Data.
  4. How Open is Open Data?
     Open Data (as in “open licensing”):
     - Open licensing (ODL, CC etc.)
     - Yet: variety of approaches. APIs/feeds: SOAP, REST, etc.; diverse schemas & vocabularies; (lack of) controlled vocabularies
     - Reuse & interoperability?
     Linked Data (technology) (as in “interoperability”):
     - De facto standard for Open Data on the Web
     - W3C standards. Common HTTP interface: SPARQL; common representation: RDF; dereferenceable URIs; shared/linked vocabularies
     - Linked Open Data 5-star scheme by Sir Tim Berners-Lee
  5. Semantic Web
     - Example: Google Knowledge Graph (DBpedia, Freebase, Yago etc.)
     - W3C standards (RDF & SPARQL) for knowledge representation and querying
     - URIs to identify/link data
     - “A little semantics goes a long way” (J. Hendler [1])
     Graph diagram. Nodes: dbp:United_States, http://dbpedia.org/resource/Cambridge_MA, dbp:W3C, schema:City, dbp:MIT, ru.dbp:Кембридж_(Массачусетс). Edge labels: country, cityOf, typeOf, sameAs, headquarterOf.
     [1] Hendler, J., The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007.
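A minimal sketch of the "URIs plus triples" idea from this slide, using plain Python tuples rather than an RDF library. The node and edge labels come from the slide's diagram, but which edge connects which pair of nodes is an assumption, because the transcript does not preserve the drawing. `match` is a hypothetical helper, not part of any standard.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

CAMBRIDGE = "http://dbpedia.org/resource/Cambridge_MA"

# ASSUMPTION: edge directions and endpoints are guessed from the slide's labels.
TRIPLES: List[Triple] = [
    (CAMBRIDGE, "typeOf", "schema:City"),
    (CAMBRIDGE, "country", "dbp:United_States"),
    (CAMBRIDGE, "cityOf", "dbp:MIT"),
    (CAMBRIDGE, "headquarterOf", "dbp:W3C"),
    (CAMBRIDGE, "sameAs", "ru.dbp:Кембридж_(Массачусетс)"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "In which country is Cambridge?" expressed as a triple pattern.
country = match(TRIPLES, s=CAMBRIDGE, p="country")
```

The wildcard pattern is roughly what a SPARQL basic graph pattern does with variables; a real application would use an RDF store instead of a list.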
  6. HTTP accessibility: persistent URIs, SPARQL
     - Connected graph of open Web data (500+ datasets and 100 billion triples)
     - Persistent, dereferenceable URIs & content negotiation, shared/linked vocabularies
     - SPARQL to query via HTTP
     - Other “incarnations”: Google Knowledge Graph, Facebook Open Graph, http://schema.org
     Diagram labels: FOAF, Gene Ontology, BIBO, Geo Ontology, DBpedia Ontology, Dublin Core, BBC Programmes. Example URI: http://dbpedia.org/resource/Cambridge_MA
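The two access paths named on this slide, content negotiation on a URI and SPARQL over HTTP, can be sketched with the Python standard library. The resource URI is the one shown on the slide; the endpoint address `https://dbpedia.org/sparql` and the requested media types are assumptions, and the sketch only builds the requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request

RESOURCE = "http://dbpedia.org/resource/Cambridge_MA"  # URI shown on the slide
ENDPOINT = "https://dbpedia.org/sparql"                # ASSUMPTION: endpoint address


def dereference_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Content negotiation: ask for an RDF serialisation instead of HTML."""
    return Request(uri, headers={"Accept": media_type})


def sparql_request(endpoint: str, query: str) -> Request:
    """SPARQL protocol via HTTP GET: the query travels in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


QUERY = "SELECT ?p ?o WHERE { <%s> ?p ?o } LIMIT 10" % RESOURCE

deref = dereference_request(RESOURCE)
sparql = sparql_request(ENDPOINT, QUERY)
```

Passing either object to `urllib.request.urlopen` would send the request; whether the slide's example URI still resolves is not guaranteed.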
  7. Schema.org for discovery of content/websites
     - LD is not just for your data
     - LD to ensure discoverability of content/websites (e.g. schema.org/microdata/RDFa)
     - Annotating HTML documents about (educational) material with schema.org (e.g. LRMI, Learning Resource Metadata Initiative)
     - Adopted by major sites (YouTube, LinkedIn etc.) & tool support (Drupal, WordPress)
     Figure: http://schema.org, © Ramanathan V. Guha, Google, SemTech2014
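As an illustration of annotating an HTML page about learning material, the following sketch generates a microdata snippet in Python. `learningResourceType` and `educationalUse` are schema.org properties that originate from LRMI; the helper `microdata`, the choice of `CreativeWork` as item type, and the property values are assumptions made for this example.

```python
from html import escape
from typing import Dict


def microdata(item_type: str, props: Dict[str, str]) -> str:
    """Render properties as a schema.org microdata block (hypothetical helper)."""
    lines = ['<div itemscope itemtype="http://schema.org/%s">' % escape(item_type)]
    for name, value in props.items():
        lines.append('  <span itemprop="%s">%s</span>' % (escape(name), escape(value)))
    lines.append("</div>")
    return "\n".join(lines)


snippet = microdata("CreativeWork", {
    "name": "ECG Patient case 1001 chest and limb leads",
    "learningResourceType": "patient case",  # ASSUMPTION: illustrative value
    "educationalUse": "self-study",          # ASSUMPTION: illustrative value
})
```

The same statements could be expressed in RDFa; microdata is used here only because it needs the least markup.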
  8. LD as body of knowledge for education (http://linkededucation.org, http://linkeduniversities.org)
     Educational datasets and vocabularies:
     - University Linked Data: The Open University UK, http://data.open.ac.uk, Southampton University, http://education.data.gov.uk, …
     - Open Educational Resources metadata: mEducator, Open Learn, Open Courseware, …
     - Schemas: Learning Resource Metadata Initiative (LRMI), mEducator Educational Resources schema, BIBO, AIISO, …
     Other learning-relevant data & resources:
     - Publications & literature
     - (Social) media resource metadata
     - Domain-specific knowledge: Bioportal, Europeana, Geonames, …
     - Cross-domain factual knowledge: DBpedia, Freebase, …
  9. LD as background knowledge for educational apps? http://metamorphosis.med.duth.gr/ Title: ECG Patient case 1001 chest and limb leads
  10. LD as background knowledge for educational apps? Title: ECG Patient case 1001 chest and limb leads. “ECG” disambiguation on Wikipedia: 9 meanings.
  11. Understanding, enriching, linking data
      Example: Title: ECG Patient case 1001 chest and limb leads
      1. Understanding data: contextual disambiguation through NLP tools (dbpedia.org/resource/Electrocardiography)
      2. Enrichment with factual knowledge: dbpedia:Электрокардиография, category:Cardiac_procedures, dbpedia:Willem_Einthoven
      3. Interlinking with related resources: bbc:ProgrammeXY, slideshare:SlidesetXY, yovisto:VideolectureXY
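The three steps on this slide can be sketched as a tiny pipeline. This is a toy stand-in, not the NLP tooling the talk refers to: disambiguation is reduced to keyword overlap, the second "ECG" sense and all keyword sets are invented placeholders, and the enrichment and link tables simply reuse the identifiers shown on the slide.

```python
import re
from typing import Dict, List, Set

# ASSUMPTION: hypothetical sense inventory; only the first sense comes from the slide.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "ECG": {
        "dbpedia:Electrocardiography": {"patient", "chest", "limb", "leads"},
        "example:Some_other_ECG_sense": {"company", "route"},
    }
}
FACTS: Dict[str, List[str]] = {
    "dbpedia:Electrocardiography": [
        "dbpedia:Электрокардиография",
        "category:Cardiac_procedures",
        "dbpedia:Willem_Einthoven",
    ]
}
LINKS: Dict[str, List[str]] = {
    "dbpedia:Electrocardiography": [
        "bbc:ProgrammeXY", "slideshare:SlidesetXY", "yovisto:VideolectureXY",
    ]
}


def disambiguate(term: str, context: str) -> str:
    """Step 1: pick the sense whose keywords overlap most with the context."""
    words = set(re.findall(r"\w+", context.lower()))
    senses = SENSES[term]
    return max(senses, key=lambda uri: len(senses[uri] & words))


def annotate(title: str) -> Dict[str, dict]:
    """Steps 2 and 3: attach facts and related resources to each recognised term."""
    tokens = re.findall(r"\w+", title)
    result = {}
    for term in SENSES:
        if term in tokens:
            entity = disambiguate(term, title)
            result[term] = {
                "entity": entity,
                "facts": FACTS.get(entity, []),
                "links": LINKS.get(entity, []),
            }
    return result


annotation = annotate("ECG Patient case 1001 chest and limb leads")
```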
  12. LinkedUp – Linking Web Data for Education (http://www.linkedup-project.eu/)
      EC-funded project aimed at advancing take-up of open data and related technologies.
      - “Success models”: data & applications. Supporting innovative tools & applications; evaluation methods. http://www.linkedup-challenge.org/
      - Technology transfer & community-building: involving educators, developers, computer scientists, data engineers, … http://www.linkedup-project.eu/events
      - Data curation & profiling: collecting & exposing open data for education; profiling of Web Data. http://data.linkededucation.org
  13. Community-building and collaboration. Joint work on tangible outcomes (datasets, applications, …) with associated partners, initiatives and EC projects.
  14. Publishing & curating educational data (http://data.linkededucation.org/linkedup/catalog/)
      - Collected & curated datasets of educational relevance
      - Beyond collecting: published over 50 datasets as LD together with the most important content providers, e.g. TED, OCW, SoLAR etc.
      - LinkedUp catalog: most comprehensive collection of LD/Open Data for education
      - RDF dataset metadata
      - Federated queries across datasets using type mappings
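One way to read "federated queries across datasets using type mappings" is sketched below: a shared type is mapped to each dataset's own class, and a single SPARQL 1.1 query with `SERVICE` clauses is generated from that mapping. This is an illustrative guess at the idea, not the LinkedUp catalog's implementation; all endpoints, class URIs and the `ex:Lecture` type are hypothetical.

```python
from typing import Dict

# ASSUMPTION: hypothetical mapping from a shared type to per-dataset classes.
TYPE_MAPPINGS: Dict[str, Dict[str, str]] = {
    "ex:Lecture": {
        "http://example.org/dataset-a/sparql": "http://example.org/a/Talk",
        "http://example.org/dataset-b/sparql": "http://example.org/b/CourseUnit",
    }
}


def federated_query(shared_type: str, limit: int = 10) -> str:
    """Build one query that asks every mapped endpoint for its local class."""
    blocks = [
        "  { SERVICE <%s> { ?item a <%s> } }" % (endpoint, cls)
        for endpoint, cls in TYPE_MAPPINGS[shared_type].items()
    ]
    body = "\n  UNION\n".join(blocks)
    return "SELECT ?item WHERE {\n%s\n}\nLIMIT %d" % (body, limit)


query = federated_query("ex:Lecture")
```

The generated query can be sent to any endpoint that supports SPARQL 1.1 Federated Query; endpoints without `SERVICE` support would need one request per dataset instead.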
  15. Supporting developers and data consumers
      - Devtalk blog: developer resource & community to aid developers; webinars and tutorials (http://data.linkededucation.org/linkedup/devtalk/)
      - Topic-based annotation and discovery of data; data exploration & visualisation features (http://data-observatory.org/lod-explorer)
  16. LinkedUp events, training & technology transfer
      - Bringing stakeholders together: data providers & data scientists, developers, users (learners, tutors, teachers)
      - Community-building through events & communication channels/social media (cross-disciplinary, industry & academia)
      - Exploitation of project outcomes across communities: technology transfer
      - (Co-)organised approx. 20 events (tutorials, workshops, booths etc.)
      - More than 30 invited talks/lectures
  17. LinkedUp Challenge (http://www.linkedup-challenge.org/). Series of Open Data Competitions to promote applications which exploit Linked Open Data: May – September 2013, October 2013 – May 2014, May 2014 – October 2014.
  18. LinkedUp Challenge results
      - 50 submissions, of which 27 were shortlisted and supported (through travel grants, participation in events and rewards)
      - 13 Veni, Vidi, Vici winners (grants: 1000 – 3000 €)
      - Authors from 23 distinct, mostly European countries
      LinkedUp submissions & shortlist:
        Competition  Submissions  Shortlist
        Veni         23           8
        Vidi         14           9
        Vici         13           10
      Top-10 authors' origins (number of authors): United Kingdom 21; United States 15; Netherlands 15; France 14; Spain 13; Germany 11; Italy 7; Belgium 5; Greece 4; Croatia 4
  19. Issues (1/3) - open data is messier than we think
      - Accessibility of datasets? Less than 50% of all SPARQL endpoints are actually responsive at a given point in time [Buil-Aranda2013]
      - “THE” SPARQL protocol? No, but many variants & subsets
      - Data “quality”? Data accuracy (e.g. DBpedia)? [Paulheim2013]; vocabulary reuse/links? [D’AquinWebSci13]; schema compliance (RDFS, schemas) [HoganJWS2012]
      Figure: SPARQL endpoint availability over time [Buil-Aranda2013]
      References:
      [Buil-Aranda2013] Buil-Aranda, C., Hogan, A., Umbrich, J., Vandenbussche, P.-Y., SPARQL Web-Querying Infrastructure: Ready for Action?, International Semantic Web Conference 2013 (ISWC2013).
      [D’AquinWebSci13] D’Aquin, M., Adamou, A., Dietze, S., Assessing the Educational Linked Data Landscape, ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
      [Paulheim2013] Paulheim, H., Bizer, C., Type Inference on Noisy RDF Data, Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp 510-525.
      [HoganJWS2012] Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., An empirical survey of Linked Data conformance, Journal of Web Semantics 14, 2012.
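Given the availability figures above, an application may want to probe endpoints itself before relying on them. The sketch below is one possible check, not the methodology of the cited study: `is_responsive` sends a trivial `ASK` query with a timeout, and `availability` turns logged probe outcomes into a per-endpoint success rate. The function names and the sample log are hypothetical.

```python
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def is_responsive(endpoint: str, timeout: float = 10.0) -> bool:
    """Send a trivial ASK query; an HTTP 2xx answer counts as responsive."""
    url = endpoint + "?" + urlencode({"query": "ASK { ?s ?p ?o }"})
    request = Request(url, headers={"Accept": "application/sparql-results+json"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def availability(probe_log: Dict[str, List[bool]]) -> Dict[str, float]:
    """Fraction of successful probes per endpoint (endpoints without probes are skipped)."""
    return {
        endpoint: sum(results) / len(results)
        for endpoint, results in probe_log.items()
        if results
    }


# ASSUMPTION: invented probe outcomes, one boolean per probe.
sample_log = {
    "http://example.org/a/sparql": [True, False, True, True],
    "http://example.org/b/sparql": [False, False],
}
rates = availability(sample_log)
```

An application could fall back to a cached dump or another dataset when an endpoint's rate drops below a chosen threshold.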
  20. Issues (2/3) – accepting inconsistency. Reference: Yuan, W., Demidova, E., Dietze, S., Zhu, X., Analyzing Relative Incompleteness of Movie Descriptions in the Web of Data: A Case Study, International Semantic Web Conference 2014 (ISWC2014).
  21. Issues (3/3) – licensing/legal aspects
      Mashing up data: legal and licensing related issues are underestimated.
      - Attribution: copyright violation from missing (86%) or incorrect attribution (14%) information
      - Terms & conditions: complexity and conflicts when merging data from different sources
      - Potential non-compliance from evolution of (a) LOD applications and (b) underlying datasets (and their licenses)
      - What license do you get when mashing up: Nature (CC0) + DBpedia (CC-ShareAlike) + FAO (Proprietary non-commercial) => ?
      T&C of established datasets:
        Dataset      Words   Pages
        DBpedia      7163    16
        Flickr       10367   23
        ConceptNet   7163    16
        World Bank   7056    16
        Nature       7024    16
        LinkedIn     6104    14
        Google+      5740    13
        Tumblr       5362    12
        Twitter      4247    9
        Facebook     4179    9
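The mash-up question on this slide can be made concrete with a deliberately simplified model. Each source is reduced to three flags, the combined obligations are their union, and a conflict is reported when one source demands share-alike redistribution while another forbids commercial use. The flag encoding of the three sources and the conflict rule are assumptions for illustration; real terms and conditions are far richer than this.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Terms:
    attribution: bool = False
    share_alike: bool = False
    non_commercial: bool = False


# ASSUMPTION: coarse encodings of the licences named on the slide.
SOURCES: Dict[str, Terms] = {
    "Nature": Terms(),                                     # CC0
    "DBpedia": Terms(attribution=True, share_alike=True),  # CC-ShareAlike
    "FAO": Terms(non_commercial=True),                     # proprietary, non-commercial
}


def combine(sources: Dict[str, Terms]) -> Tuple[Terms, List[Tuple[str, str]]]:
    """Union of obligations plus pairs of sources whose terms clash (toy rule)."""
    obligations = Terms(
        attribution=any(t.attribution for t in sources.values()),
        share_alike=any(t.share_alike for t in sources.values()),
        non_commercial=any(t.non_commercial for t in sources.values()),
    )
    conflicts = [
        (a, b)
        for a, terms_a in sources.items()
        for b, terms_b in sources.items()
        if terms_a.share_alike and not terms_a.non_commercial and terms_b.non_commercial
    ]
    return obligations, conflicts


obligations, conflicts = combine(SOURCES)
```

Under this toy rule the slide's example has no clean answer: the share-alike source and the non-commercial source pull in different directions, which is the point of the "=> ?".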
  22. Get involved! http://www.w3.org/community/opened, http://data.linkededucation.org/linkedup/catalog/, http://data.linkededucation.org/linkedup/devtalk/
  23. Thank you!