Integrating NLP with Linked Data and RDF: 
the NIF format (hands on) 
Ciro Baron Neto 
Ph.D student at University of Leipzig 
Building the Multilingual Web of Data – ISWC 
10/20/14 tutorial 
Overview 
• Overview of the NLP2RDF GitHub web page and the NIF online demos (Dashboard, Combinator...) 
• Examples 
–Example 1: How to annotate a string 
• using the Snowball Stemmer and OpenNLP 
–Example 2: 
• Querying the generated NIF data and the Brown Corpus
NLP2RDF GitHub Website 
• https://github.com/NLP2RDF/ 
• /home/ciro/websites/github/github.com/NLP2RDF/index.html 
dashboard.nlp2rdf.aksw.org 
nlp2rdf.aksw.org
Example 1: Snowball Stemmer 
Wrapper 
Snowball Stemmer Wrapper 
• A stemming algorithm is a process for removing suffixes from words. For example, the following words all reduce to the stem CONNECT: 
–CONNECTED 
–CONNECTION 
–CONNECTING 
–CONNECTIONS
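
The suffix-removal idea can be illustrated with a minimal sketch. This is a toy suffix stripper, not the Snowball algorithm used by the wrapper; the suffix list and the function name `toy_stem` are assumptions made up for this example.

```python
# Toy suffix stripper: NOT the Snowball algorithm, only an illustration
# of the idea of removing suffixes. The suffix list is a hypothetical choice.
SUFFIXES = ("IONS", "ION", "ING", "ED", "S")  # longer suffixes come first


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Remove the first matching suffix, keeping at least min_stem_len characters."""
    upper = word.upper()
    for suffix in SUFFIXES:
        if upper.endswith(suffix) and len(upper) - len(suffix) >= min_stem_len:
            return upper[: -len(suffix)]
    return upper


if __name__ == "__main__":
    for w in ["CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"]:
        print(w, "->", toy_stem(w))
```

A real stemmer applies ordered rule groups with conditions on the remaining stem, so its output can differ from this sketch on other words.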
Snowball Stemmer Wrapper 
• 1. Open the USB stick folder 
• 2. Go to “NIF_tutorial_hands_on_jars” folder 
• 3. Open the “instructions.txt” file in a text 
editor 
• 4. Open a terminal 
• 5. Go to the “jar” folder 
Snowball Stemmer Wrapper 
• Copy the second command from instructions.txt: 
java -jar snowball.jar -f text -i 'My favorite actress is Natalie Portman.' 
• -f defines the format (here, text) 
• -i defines the input (the string to annotate) 
• Paste the command into the terminal 
Snowball Stemmer Wrapper 
(Screenshot of the wrapper output; labelled parts: NIF Standard Annotations, Snowball Stem, NIF Offset)
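
To make the "NIF Offset" label more concrete, here is a minimal sketch of how offset-based NIF URIs and the basic annotations could be built for the example sentence. It is not the wrapper's actual code or output: the document URI `http://example.org/doc1`, the regex tokenizer, and the helper names are assumptions for illustration only. The `#char=begin,end` URI scheme and the `nif:beginIndex`, `nif:endIndex`, `nif:anchorOf`, and `nif:referenceContext` properties come from the NIF 2.0 Core ontology.

```python
import re

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
XSD = "http://www.w3.org/2001/XMLSchema#"
BASE = "http://example.org/doc1"  # hypothetical document URI


def token_offsets(text):
    """Return (begin, end, surface) for each word or punctuation mark."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def escape(literal):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(text, base=BASE):
    """Serialize the context and its tokens as Turtle using offset-based URIs."""
    context = f"<{base}#char=0,{len(text)}>"
    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        f"{context} a nif:Context ;",
        f'    nif:isString "{escape(text)}" .',
    ]
    for begin, end, surface in token_offsets(text):
        lines += [
            "",
            f"<{base}#char={begin},{end}> a nif:Word ;",
            f"    nif:referenceContext {context} ;",
            f'    nif:beginIndex "{begin}"^^xsd:nonNegativeInteger ;',
            f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger ;',
            f'    nif:anchorOf "{escape(surface)}" .',
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle("My favorite actress is Natalie Portman."))
```

Each token URI encodes its character span in the context string, so an annotation can be traced back to the exact substring it describes.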
OpenNLP Wrapper 
• Go back to the terminal and run the first command from instructions.txt: 
java -jar opennlp.jar -f text -i 'My favorite actress is Natalie Portman.' -modelFolder ../model/ 
• The -modelFolder parameter sets the folder that contains the trained OpenNLP models for tokenization and POS tagging. 
• You can add the parameter “--outfile myAnnotatedFile.ttl” to store the triples in a file. 
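
The wrapper invocations above can also be assembled from a script. The following is a minimal sketch that only builds the argument list; the helper name `build_wrapper_command` is hypothetical, and the flags are limited to those shown on the slides (-f, -i, -modelFolder, --outfile). Actually running the command assumes Java and the tutorial jars are available in the current folder.

```python
import shlex


def build_wrapper_command(jar, text, fmt="text", model_folder=None, outfile=None):
    """Build the argument list for one of the tutorial's wrapper jars."""
    cmd = ["java", "-jar", jar, "-f", fmt, "-i", text]
    if model_folder is not None:
        cmd += ["-modelFolder", model_folder]  # used by the OpenNLP wrapper
    if outfile is not None:
        cmd += ["--outfile", outfile]  # store the triples in a file
    return cmd


if __name__ == "__main__":
    sentence = "My favorite actress is Natalie Portman."
    cmd = build_wrapper_command(
        "opennlp.jar", sentence, model_folder="../model/", outfile="myAnnotatedFile.ttl"
    )
    # Print a shell-quoted version that can be pasted into a terminal.
    print(shlex.join(cmd))
    # To run it directly (requires Java and the jar in the current folder):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the sentence as a single list element avoids shell-quoting problems when the input contains apostrophes or other special characters.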
Example 2: Query Brown Corpus 
Querying with Twinkle 
• Open the “/twinkle/example” folder 
• Open the NIF_query_example file in a text editor and copy the query 
• Open the “/twinkle” folder and run the command: 
java -jar twinkle.jar 
Querying Brown Corpus 
Exercise 3: Querying your own NIF 
annotated string 
Querying your own NIF annotated 
string 
1. Annotate your string using one of the 
wrappers 
2. Save your annotated sentence to a file 
(using “--outfile”) 
3. Open Twinkle 
4. Query your string using Twinkle 
• Query your annotated string: 
– nif:Context 
– nif:Sentence 
– nif:anchorOf 
– nif:oliaCategory 
– nif:oliaLink 
… or practice with Brown Corpus! 
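
As a starting point for the exercise, the sketch below generates a SPARQL query over two of the properties listed above. It is not the query from the NIF_query_example file, which is not reproduced in these slides; the function name `build_token_query` and the variable names are assumptions. The query text it prints can be pasted into Twinkle and run against the file saved with --outfile, provided the wrapper output contains these properties.

```python
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"


def build_token_query(limit=20):
    """Return a SPARQL query listing annotated strings and their OLiA categories."""
    return f"""PREFIX nif: <{NIF}>

SELECT ?token ?text ?category
WHERE {{
  ?token nif:anchorOf ?text ;
         nif:oliaCategory ?category .
}}
LIMIT {int(limit)}"""


if __name__ == "__main__":
    print(build_token_query(limit=10))
```

Swapping `nif:oliaCategory` for `nif:oliaLink`, or adding a pattern such as `?s a nif:Sentence`, gives variations on the same query for the other items in the list.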
Thank you! 
http://site.nlp2rdf.org/ 
NLP2RDF Google+ Community 
