Sem. Markup and RDF            Krextor Framework         Applications          Examples           Related         Conclusion




    Krextor – An Extensible XML→RDF Extraction
                     Framework
                      Scripting for the Semantic Web, 5th Workshop


                                                   Christoph Lange

                                         Jacobs University, Bremen, Germany
                          KWARC – Knowledge Adaptation and Reasoning for Content


                                                    May 31, 2009



         Ch. Lange (Jacobs University)               Krextor – An Extensible XML→RDF Extraction Framework   May 31, 2009 1/15



Overview
 Want XML applications to contribute to the Semantic Web?
  1 Define a schema→ontology mapping for your XML language
  2 Extract RDF from XML

Krextor:
    Specify XML→ontology
    mappings (as extraction
    rules)
    Perform extraction
    (XSLT-based
    implementation)
 http://kwarc.info/projects/krextor/



XML vs. RDF

 Two slices of the infamous Layer Cake:
                 RDF

                 XML


 Doesn’t tell much about the role of XML:
   1  XML only for encoding higher-layer formalisms like RDF or OWL?
   2  or XML as a metalanguage in its own right?
 In case (2), we need a semantics for XML-based languages!






XML languages


 Advantages of using XML for knowledge representation (and not
 just RDF):
   1  Sequential order out of the box
   2  Style languages (CSS, XSL)
 Given any domain, . . .
      can define an XML schema for a domain-specific language
      concise syntax for domain experts
      no need to think in triples (compare OWL XML vs. RDF/XML)







What about the semantics?

 <workshop xml:id="SFSW09"
  conference="#ESWC09"
  number="5"
  date="2009-05-31">
   <title short="SFSW">Scripting for the Semantic Web</title>
 </workshop>
 Usual approach: human-readable specification, then hard-code
 Semantic approaches: RDFa, Microformats
 Open questions:
  1 How to give the above language a direct RDF-based semantics?
  2 How to implement the XML→RDF translation?




Making an XML language semantic


         We are focused on practical implementation, not on a formal
         semantics bridging XML and RDF.
         We want to benefit from existing XML and RDF tools.
 Our approach:
  1  provide rules that translate XML to RDF
  2  if needed, supply an ontology as vocabulary for the extracted
     RDF
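A minimal sketch of step 1, written in Python rather than Krextor's XSLT, applied to the workshop element shown earlier. The vocabulary namespace, the base URI, and the predicate names (partOf, number, date, title) are hypothetical; the slides do not name an ontology for this language.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and base URI, chosen only for illustration.
EX = "http://example.org/vocab#"
BASE = "http://example.org/doc"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SOURCE = """<workshop xml:id="SFSW09"
  conference="#ESWC09"
  number="5"
  date="2009-05-31">
  <title short="SFSW">Scripting for the Semantic Web</title>
</workshop>"""

# Rule table: attribute name -> (predicate, True if the value is a URI reference).
ATTRIBUTE_RULES = {
    "conference": (EX + "partOf", True),
    "number": (EX + "number", False),
    "date": (EX + "date", False),
}


def extract(xml_text):
    """Return a list of (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    subject = BASE + "#" + root.attrib[XML_ID]
    # The element name determines the (hypothetical) class of the resource.
    triples = [(subject, RDF_TYPE, EX + root.tag.capitalize())]
    for name, (predicate, is_uri) in ATTRIBUTE_RULES.items():
        if name in root.attrib:
            value = root.attrib[name]
            # Fragment references are resolved against the base URI;
            # everything else is kept as a quoted literal.
            obj = BASE + value if is_uri else '"%s"' % value
            triples.append((subject, predicate, obj))
    title = root.find("title")
    if title is not None:
        triples.append((subject, EX + "title", '"%s"' % title.text))
    return triples


if __name__ == "__main__":
    for triple in extract(SOURCE):
        print(triple)
```

Keeping the rules in a table separate from the traversal code is the point of the sketch: supporting another attribute means adding one entry, not new code.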







Krextor’s History

   1   Origin: OMDoc (Open Mathematical
       Documents; XML schema and ontology),
       to be managed in a semantic wiki
   2   Hard-coded Java implementation: too
       inflexible to maintain
   3   More lightweight approach: XSLT coded
       from scratch (OMDoc→RXR→Java)
   4   Needed support for other languages
   5   Created Krextor, a generic XSLT-based
       framework
   6   . . . and provided some more
       translations (“extraction modules”)
 http://kwarc.info/projects/krextor/




The Framework
 [Figure: input formats (OMDoc, OMDoc/OWL, XHTML, OpenMath, "my XML",
 "my Microformat"; several optionally with embedded RDFa) → generic
 representation → output formats (RDF/XML, RXR, Turtle, Java callback,
 "your format")]



         Collection of XSLT stylesheets, Java wrapper, Shell frontend
         Output targeted at machines, not humans





Adding Input and Output Modules

 Input module (for a new XML language):
     very simple declarative mappings (element class)
     otherwise pattern-match XML structure, then call a predefined
     template: create resource, add property, etc.
     several ways of generating URIs for XML elements: xml:id,
     auto-generated, custom
 Output module (for a new RDF serialization):
     implement low-level “triple generation template”
     or post-process output of an existing module
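The two extension points above can be sketched as follows. This is a Python illustration and not Krextor's XSLT API: the function names, the "#auto-…" URI scheme, and the example base URI are all assumptions made for the sketch. It shows URI generation with the three strategies listed (custom, xml:id, auto-generated) and an output module that serializes one triple at a time.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def uri_for(element, base, path, custom=None):
    """Pick a URI: custom hook first, then xml:id, then a generated one."""
    if custom is not None:
        uri = custom(element)
        if uri is not None:
            return uri
    if XML_ID in element.attrib:
        return base + "#" + element.attrib[XML_ID]
    # Auto-generated fallback derived from the element's position in the tree.
    return base + "#auto-" + "-".join(str(i) for i in path)


def walk(element, base, path=(1,), custom=None):
    """Yield (element, uri) for the element and all of its descendants."""
    yield element, uri_for(element, base, path, custom)
    for index, child in enumerate(element, start=1):
        yield from walk(child, base, path + (index,), custom)


def ntriples_output(subject, predicate, obj, is_literal):
    """Output module: serialize a single triple as one N-Triples line."""
    if is_literal:
        escaped = (obj.replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("\n", "\\n"))
        return '<%s> <%s> "%s" .' % (subject, predicate, escaped)
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)


if __name__ == "__main__":
    doc = ET.fromstring('<doc xml:id="d1"><sec/><sec xml:id="s2"/></doc>')
    for element, uri in walk(doc, "http://example.org/doc"):
        print(element.tag, uri)
```

Because the serializer only ever sees one triple, a different output format needs only a replacement for `ntriples_output`; the traversal and URI logic stay untouched.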






Our own applications


 Semantic wiki: SWiM semantic wiki (http://swim.kwarc.info)
                 mathematical documents (OMDoc, OpenMath)
                 extract RDF outline from documents
                 use it for navigation, querying, problem-solving
                 assistance
 Documented ontologies:
                                 write ontologies in OMDoc
                                 (better documentability → poster session)
                                 Krextor translates to OWL






Example: hCalendar Microformat (1)


 Input:
 <div class="vevent">
  <a class="url" href="http://www.eswc2009.org">ESWC</a>
  starts on <span class="dtstart">2009-05-31</span>.</div>
 Desired output:
 <http://www.eswc2009.org>
   a <http://www.w3.org/2002/12/cal/ical#Vevent> ;
   <http://www.w3.org/2002/12/cal/ical#dtstart>
      "2009-05-31"^^<http://www.w3.org/2001/XMLSchema#date> .






Example: hCalendar Microformat (2)




 Usage: krextor hcalendar..turtle infile.xhtml
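To make the mapping from the hCalendar input to the Turtle output concrete, here is a small Python sketch of the same extraction. It is not how Krextor's hcalendar module is implemented (that is XSLT); it assumes the input is well-formed XML without a namespace, as in the slide's snippet, and it only handles the `url` and `dtstart` classes.

```python
import xml.etree.ElementTree as ET

ICAL = "http://www.w3.org/2002/12/cal/ical#"
XSD = "http://www.w3.org/2001/XMLSchema#"

SNIPPET = """<div class="vevent">
 <a class="url" href="http://www.eswc2009.org">ESWC</a>
 starts on <span class="dtstart">2009-05-31</span>.</div>"""


def has_class(element, name):
    """True if the element's class attribute contains the given token."""
    return name in element.get("class", "").split()


def vevents_to_turtle(xml_text):
    """Serialize every vevent that carries a URL as a Turtle statement."""
    root = ET.fromstring(xml_text)
    statements = []
    for event in root.iter():
        if not has_class(event, "vevent"):
            continue
        url = next((e.get("href") for e in event.iter()
                    if has_class(e, "url")), None)
        start = next((e.text for e in event.iter()
                      if has_class(e, "dtstart")), None)
        if url is None:
            continue  # this sketch skips events without a URL
        lines = ["<%s>" % url, "  a <%sVevent>" % ICAL]
        if start is not None:
            lines[-1] += " ;"
            lines.append('  <%sdtstart> "%s"^^<%sdate>'
                         % (ICAL, start.strip(), XSD))
        lines[-1] += " ."
        statements.append("\n".join(lines))
    return "\n\n".join(statements)


if __name__ == "__main__":
    print(vevents_to_turtle(SNIPPET))
```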





Example: Declarative Mapping (OpenMath)
<!-- Resources: elements that are extracted as typed RDF resources -->
<xsl:variable name="krextor:resources">
  <CD type="&omo;ContentDictionary"/>
  <CDDefinition type="&omo;SymbolDefinition"
   related-via-properties="&omo;containsSymbolDefinition"/>
  <Example type="&omo;Example"
   related-via-properties="&omo;hasExample"/>
</xsl:variable>

<xsl:template match="CD|CDDefinition|Example">
  <xsl:apply-templates select="." mode="krextor:create-resource"/>
</xsl:template>

<!-- Properties: elements whose text is extracted as a literal property value -->
<xsl:variable name="krextor:literal-properties">
  <Name property="&dc;identifier" normalize-space="true"/>
  <Description property="&dc;description" normalize-space="true"/>
  <Title property="&dc;title" normalize-space="true"/>
  <Role property="&omo;role" normalize-space="true"/>
</xsl:variable>

<xsl:template match="Name|Description|Title|Role">
  <xsl:apply-templates select="." mode="krextor:add-literal-property"/>
</xsl:template>
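The idea behind the two mapping tables can be imitated in a few lines of Python. This is only a sketch of the table-driven technique, not Krextor's implementation. Several details are assumptions: the expansion of `&omo;` is not given in the slides, so a placeholder namespace is used; `&dc;` is assumed to be the Dublin Core elements namespace; the linking property is assumed to point from the enclosing resource to the nested one; resources get blank-node labels instead of real URIs; and the sample input is un-namespaced.

```python
import xml.etree.ElementTree as ET

OMO = "http://example.org/omo#"           # placeholder for &omo; (not given)
DC = "http://purl.org/dc/elements/1.1/"   # assumed expansion of &dc;
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Element name -> class, plus the property linking it to its parent resource.
RESOURCES = {
    "CD": {"type": OMO + "ContentDictionary"},
    "CDDefinition": {"type": OMO + "SymbolDefinition",
                     "related_via": OMO + "containsSymbolDefinition"},
    "Example": {"type": OMO + "Example",
                "related_via": OMO + "hasExample"},
}

# Element name -> property whose literal value is the element's text.
LITERAL_PROPERTIES = {
    "Name": DC + "identifier",
    "Description": DC + "description",
    "Title": DC + "title",
    "Role": OMO + "role",
}


def extract(element, parent_uri=None, counter=None, triples=None):
    """Walk the tree and collect triples according to the two tables."""
    counter = counter if counter is not None else [0]
    triples = triples if triples is not None else []
    tag = element.tag
    if tag in RESOURCES:
        counter[0] += 1
        uri = "_:n%d" % counter[0]  # blank-node label instead of a real URI
        triples.append((uri, RDF_TYPE, RESOURCES[tag]["type"]))
        link = RESOURCES[tag].get("related_via")
        if link and parent_uri:
            triples.append((parent_uri, link, uri))
        for child in element:
            extract(child, uri, counter, triples)
    elif tag in LITERAL_PROPERTIES and parent_uri:
        # Equivalent of normalize-space="true": collapse runs of whitespace.
        text = " ".join("".join(element.itertext()).split())
        triples.append((parent_uri, LITERAL_PROPERTIES[tag], text))
    else:
        # Unmapped elements are skipped, but their children are still visited.
        for child in element:
            extract(child, parent_uri, counter, triples)
    return triples


if __name__ == "__main__":
    sample = ("<CD><CDName>arith1</CDName><CDDefinition>"
              "<Name> plus </Name><Role>application</Role>"
              "</CDDefinition></CD>")
    for triple in extract(ET.fromstring(sample)):
        print(triple)
```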



Related Work


 Swignition: extensive support for “standard” semantics (RDFa,
             microformats, GRDDL), but harder to add a new input
             language
     XSDL: declarative XML→OWL-DL mapping. Not (?)
             implemented; would make a nice frontend to Krextor
  XSPARQL: combines SPARQL and XQuery, breaks boundaries
             between XML and RDF. Currently better suited to one-off
             queries than to complete translations.







Conclusion


         Krextor supports many XML→RDF conversion tasks
         Easy to extend, easy to integrate into applications
 Possible integration into engineering workflows:
 Ontology engineering: First design the ontology, then a convenient
              XML syntax for domain-specific knowledge
 Language engineering: Specify the semantics while engineering
              the schema



