Introduction to Sketch Engine
http://www.sketchengine.co.uk/
1
Outline
Basic Terminology
Introduction
How to Use Sketch Engine?
Research Issues
2
Basic Terminology
English Term
Corpus - Corpora (≠ Blog)
Parallel corpora
Comparable corpora
Written corpora
Spoken corpora
3
Basic Terminology
English Term
Collocation
Concordances
Lemma
4
Basic Terminology
English Term
Part-of-Speech (PoS) Tagging
Thesaurus
5
What is Sketch Engine?
• It is a corpus query tool that takes as input a corpus of any
language and the corresponding grammar patterns, and
generates, among other things, word sketches for the words
of that language.
• The Sketch Engine is designed for anyone who wants to research
how words behave.
6
What is Sketch Engine?
Corpus → SkE → Word Sketches
7
Sketch Engine Features
• Upload your own corpus
• Access to public corpora
• Advanced search options
1. Web-based tool: no installation
2. Supports Arabic corpora
3. The Concordancer with advanced options
4. The Word Sketches
8
Sketch Engine Features
5. The Thesaurus (find similar words)
6. Support for parallel corpora, virtual sub-corpora and super-corpora
7. Full regular-expression searching using CQL
8. Corpus Architect: user corpora, uploaded by users or created by WebBootCaT
9
Who Uses Sketch Engine?
10
Language learners
Writers
Linguists
Researchers
Sketch Engine usage:
11
Common words/collocations
Synonyms
Grammar
Word behaviour
Available corpora
12
200+ corpora in 60+ languages
Available Arabic corpora
13

14
How to create your corpus using SKE?
Steps to create a Corpus in SKE
15
Raw text → Tokenization → Lemmatization → POS tagging → Sketch Grammar → SKE Features (Word Sketches, Sketch Diff, Thesaurus)
16
1- Upload your text:
- Sketch Engine accepts file types such as .xml, .doc, .docx, .htm,
.html, .pdf, .txt, …
17
2- Tokenization:
- The process of splitting text into tokens (words) and adding structure tags
(<s>, <doc>, <p>).
- The output is a vertical file (one token per line).
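A minimal sketch of the tokenization step, assuming a naive regular-expression tokenizer and sentence splitter. The function name `to_vertical` and the `id` attribute on `<doc>` are illustrative assumptions; Sketch Engine's own tokenizers are language-specific and more careful than this.

```python
import re

def to_vertical(doc_id, paragraphs):
    """Turn raw paragraphs into vertical text: one token or structure tag per line."""
    lines = [f'<doc id="{doc_id}">']
    for para in paragraphs:
        lines.append("<p>")
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", para.strip()):
            if not sentence:
                continue
            lines.append("<s>")
            # Words and punctuation marks become separate tokens.
            lines.extend(re.findall(r"\w+|[^\w\s]", sentence))
            lines.append("</s>")
        lines.append("</p>")
    lines.append("</doc>")
    return "\n".join(lines)

print(to_vertical("d1", ["The cat sat. It slept."]))
```

Each token ends up on its own line, wrapped by the structure tags named on the slide.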
18
3- Lemmatization (optional):
- The process of annotating each word with its lemma.
19
4- POS tagging (mandatory for word sketches):
- The process of annotating each word with its part-of-speech tag.
- An SKE Arabic tagger is not available.
• V
• PN
• N
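A rough sketch of what lemmatization and POS tagging add to each token line, assuming a tiny hand-made lexicon in place of a real tagger. The lexicon entries, the `UNK` fallback tag, and the column order (word, tag, lemma) are illustrative assumptions; in practice the column order depends on how the corpus is configured.

```python
# Hypothetical toy lexicon: word form -> (lemma, tag). Tags V, PN, N as on the slide.
LEXICON = {
    "sara": ("Sara", "PN"),
    "reads": ("read", "V"),
    "books": ("book", "N"),
}

def annotate(tokens, lexicon):
    """Return tab-separated vertical lines: word, tag, lemma."""
    lines = []
    for tok in tokens:
        # Unknown words fall back to their lowercase form and a placeholder tag.
        lemma, tag = lexicon.get(tok.lower(), (tok.lower(), "UNK"))
        lines.append(f"{tok}\t{tag}\t{lemma}")
    return lines

for line in annotate(["Sara", "reads", "books"], LEXICON):
    print(line)
```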
20
5- Uploading the Sketch Grammar:
- A file describing the grammatical relations in a language.
Example: 1:"V" "(DET|NUM|ADJ|ADV|N)"* 2:"N"
Vertical file with annotations
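The example rule 1:"V" "(DET|NUM|ADJ|ADV|N)"* 2:"N" can be read as: a verb, then any number of determiners, numerals, adjectives, adverbs or nouns, then a noun. The sketch below emulates that reading over a list of (word, tag) pairs. It is a toy illustration, not how Sketch Engine evaluates grammars internally, and the function name and sample sentence are assumptions.

```python
# Tags allowed between the verb (position 1) and the noun (position 2).
MIDDLE = {"DET", "NUM", "ADJ", "ADV", "N"}

def verb_noun_pairs(tagged):
    """Return (verb, noun) pairs matching V (DET|NUM|ADJ|ADV|N)* N."""
    pairs = []
    for i, (verb, tag) in enumerate(tagged):
        if tag != "V":
            continue
        j = i + 1
        while j < len(tagged):
            word, t = tagged[j]
            if t == "N":
                pairs.append((verb, word))
            if t not in MIDDLE:
                break  # a token outside the allowed set ends the pattern
            j += 1
    return pairs

sentence = [("read", "V"), ("the", "DET"), ("old", "ADJ"),
            ("book", "N"), ("in", "PREP"), ("library", "N")]
print(verb_noun_pairs(sentence))
```

The preposition stops the match, so only the pair (read, book) is found.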
21
Adding data to the corpus by uploading a file
22
Adding data to the corpus using WebBootCaT
23
Seeds/URLs → WebBootCaT → Your corpus
How to Use Sketch Engine?
• As a Corpus User (Querying Corpora):
Concordance, Word Lists, Word Sketches, Sketch Diff, Thesaurus
24

Concordance
25
Concordance
What is a Concordancer?
A concordancer looks through the
whole corpus and finds every
example of a particular word or
phrase, then displays it with its
immediate context.
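A minimal keyword-in-context (KWIC) sketch of what a concordancer does, assuming a plain string as the corpus and a simple regular-expression tokenizer. The function name `kwic` and the `width` parameter are illustrative, not part of Sketch Engine.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, hit, right context) for every occurrence of keyword."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

for left, hit, right in kwic("The cat sat on the mat. The dog saw the cat.", "cat", width=2):
    print(f"{left:>10} [{hit}] {right}")
```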
26
27
Query Types
Context
Text Types
28
Concordance
Query Types
• Simple
• Lemma
• Phrase
• Word
• Character
• CQL
29
Concordance
Query Types
Simple: matches the lemma (the base form)
as well as the word form.
+ Works for phrases.
30
Concordance
Query Types
Lemma: matches any form of the given lemma.
+ You can select the PoS (not for Arabic corpora).
This option does not work for phrases.
31
Concordance
Query Types
Phrase: matches a phrase
+ any capitalized variant (not for Arabic corpora),
but does not match the lemma.
32
Concordance
Query Types
Word: matches a word form exactly.
+ You can select the PoS (not for Arabic corpora).
+ You can select "match case" (not for Arabic corpora).
33
Concordance
Query Types
Character: matches a character string.
34
Concordance
Query Types
CQL: for entering complex queries in Corpus
Query Language.
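The differences between the query types can be sketched over a toy annotated corpus. This assumes each token carries only a word form and a lemma; the data and function names are invented for illustration and simplify what the real interface does.

```python
# Toy corpus: invented tokens with word form and lemma.
CORPUS = [
    {"word": "She", "lemma": "she"},
    {"word": "works", "lemma": "work"},
    {"word": "at", "lemma": "at"},
    {"word": "Work", "lemma": "work"},
    {"word": "Street", "lemma": "street"},
]

def word_query(corpus, form, match_case=True):
    """Word: exact word form, optionally ignoring case."""
    if match_case:
        return [i for i, t in enumerate(corpus) if t["word"] == form]
    return [i for i, t in enumerate(corpus) if t["word"].lower() == form.lower()]

def lemma_query(corpus, lemma):
    """Lemma: every form whose lemma is the given one."""
    return [i for i, t in enumerate(corpus) if t["lemma"] == lemma]

def simple_query(corpus, text):
    """Simple: matches the lemma as well as the word form."""
    q = text.lower()
    return [i for i, t in enumerate(corpus)
            if t["lemma"] == q or t["word"].lower() == q]

def character_query(corpus, chars):
    """Character: any token containing the character string."""
    return [i for i, t in enumerate(corpus) if chars in t["word"]]

print(word_query(CORPUS, "Work"), lemma_query(CORPUS, "work"),
      simple_query(CORPUS, "works"), character_query(CORPUS, "or"))
```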
35
• The general form is: [attr="value"]
• "Match any sequence of characters" operator: .*
• Or, And operators: | and &
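A small sketch of how single-token CQL conditions behave, assuming the value is treated as a regular expression that must match the whole attribute. The helper names `attr`, `all_of` and `any_of` and the sample tokens are invented for illustration; the CQL strings in the comments show the queries being imitated.

```python
import re

def attr(name, pattern):
    """[name="pattern"]: the whole attribute value must match the regex."""
    return lambda tok: re.fullmatch(pattern, tok[name]) is not None

def all_of(*conds):
    """CQL '&': every condition must hold for the token."""
    return lambda tok: all(c(tok) for c in conds)

def any_of(*conds):
    """CQL '|': at least one condition must hold for the token."""
    return lambda tok: any(c(tok) for c in conds)

TOKENS = [
    {"word": "works", "lemma": "work", "tag": "V"},
    {"word": "work", "lemma": "work", "tag": "N"},
    {"word": "worker", "lemma": "worker", "tag": "N"},
]

def select(cond):
    return [t["word"] for t in TOKENS if cond(t)]

print(select(attr("lemma", "work.*")))                         # [lemma="work.*"]
print(select(all_of(attr("lemma", "work"), attr("tag", "V")))) # [lemma="work" & tag="V"]
print(select(any_of(attr("tag", "V"), attr("lemma", "worker"))))  # [tag="V" | lemma="worker"]
```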
36
Concordance
Corpus Query Language (Basics)
• "Match any token" operator: []
• Specifying the number of tokens: {}, for example []{0,3} for zero to three tokens
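To illustrate [] and {}, here is a toy matcher for token sequences such as [lemma="make"] []{0,3} [lemma="decision"]. It is a sketch under simplifying assumptions: the pattern is given as a Python list rather than parsed from a CQL string, and the names `token_ok`, `match_at` and `search` are hypothetical.

```python
import re

def token_ok(token, cond):
    """cond is None for [] (any token), else a dict of attribute -> regex."""
    return cond is None or all(
        re.fullmatch(rx, token.get(a, "")) for a, rx in cond.items())

def match_at(tokens, pos, pattern):
    """Return the end index of a match starting at pos, or None."""
    if not pattern:
        return pos
    cond, lo, hi = pattern[0]          # element repeated between lo and hi times
    count = 0
    while True:
        if count >= lo:
            end = match_at(tokens, pos, pattern[1:])
            if end is not None:
                return end
        if count == hi or pos >= len(tokens) or not token_ok(tokens[pos], cond):
            return None
        pos += 1
        count += 1

def search(tokens, pattern):
    """Return (start, end) index pairs for all matches in the token list."""
    return [(i, end) for i in range(len(tokens))
            if (end := match_at(tokens, i, pattern)) is not None]

words = ["she", "made", "a", "very", "hard", "decision", "today"]
lemmas = ["she", "make", "a", "very", "hard", "decision", "today"]
tokens = [{"word": w, "lemma": l} for w, l in zip(words, lemmas)]

# [lemma="make"] []{0,3} [lemma="decision"]
pattern = [({"lemma": "make"}, 1, 1), (None, 0, 3), ({"lemma": "decision"}, 1, 1)]
print(search(tokens, pattern))
```

Narrowing the gap to []{0,2} makes the same sentence stop matching, because three tokens separate the two lemmas.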
37
Concordance
Corpus Query Language (Basics)
Concordance
Exercises (CQL)
• Ex1:
• Ex2:
38
Concordance
Exercises (CQL)
• Ex1:
"" [] ""
• Ex2:
"" []{0,3} "|"
39
Context
40
• Here you can specify criteria on the context for
your query.
41
Concordance
Context
42
Concordance
Context (Exercise)
43
Concordance
Context (Exercise)
Text Types
44
• Here you can:
- Select a sub-corpus, or
- Create a new sub-corpus from a subset
of the current corpus
• You can also select constraints on the
text types of the documents that will be
searched for your query
45
Concordance
Text Types
46
Concordance
Text Types
47
Concordance
Concordance Menu Options
• Save
• View Options
• Sort
• Sample
• Filter
• Frequency
• Collocations
• ConcDesc
• Visualize
Concordance
Exercises
• Ex1: Filter
• Ex2: Collocation
• Ex3: Frequency (Node Tags)
• Ex4: CQL, then Frequency (Node Forms)
48
Concordance
Exercises
• Ex1: Concordance → Make Concordance
→ Filter → select negative, Simple query
• Ex2: Concordance → Make Concordance
→ Collocation → Attribute: word → Make Candidate List
• Ex3: Concordance → Make Concordance
→ Click Node Tags
• Ex4: Concordance → CQL
49
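A rough sketch of what a collocation candidate list computes, assuming a flat list of word tokens and a symmetric window around the node word. The ranking uses the logDice score, 14 + log2(2·f_xy / (f_x + f_y)); the function name, the window size and the sample sentence are illustrative assumptions.

```python
import math
from collections import Counter

def collocation_candidates(tokens, node, window=3):
    """Rank words co-occurring with node within the window by logDice."""
    freq = Counter(tokens)
    co = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                co[tokens[j]] += 1
    scored = []
    for word, f_xy in co.items():
        log_dice = 14 + math.log2(2 * f_xy / (freq[node] + freq[word]))
        scored.append((word, f_xy, round(log_dice, 2)))
    return sorted(scored, key=lambda row: row[2], reverse=True)

tokens = "strong tea and strong coffee but weak tea".split()
for row in collocation_candidates(tokens, "tea", window=1):
    print(row)
```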

Word List
50
Word List
What is the Word List?
• Word List: produces word lists ranked by
frequency for an entire corpus or a
specified sub-corpus.
• It can be useful for investigating, for instance,
whether a word is used more frequently in its verb or
noun form.
51
52
Word List
Input: RE pattern or any
attribute (word, tag, lemma…)
Output:
Filtered list of lemmas and/or
words with frequencies
53
Word List
Exercises
• Ex1:
54
Choose "lemma" as the search attribute.
Type the lemma into
the RE pattern box.
Tick the box labelled "change
output attribute(s)".
In the first two levels, select
"lemma" and "tag".
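The steps above amount to filtering tokens by a regular expression on one attribute and counting combinations of output attributes. A minimal sketch, assuming tokens are dictionaries with `lemma` and `tag` keys; the function name `word_list` and the sample data are invented for illustration.

```python
import re
from collections import Counter

def word_list(tokens, search_attr, pattern, output_attrs):
    """Count output-attribute combinations for tokens whose search_attr matches pattern."""
    counts = Counter(
        tuple(tok[a] for a in output_attrs)
        for tok in tokens
        if re.fullmatch(pattern, tok[search_attr])
    )
    return counts.most_common()

tokens = [
    {"lemma": "work", "tag": "V"},
    {"lemma": "work", "tag": "V"},
    {"lemma": "work", "tag": "N"},
    {"lemma": "walk", "tag": "V"},
]

# Is "work" used more often as a verb or as a noun in this toy sample?
print(word_list(tokens, "lemma", "work", ("lemma", "tag")))
```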
55
56
Word List
Exercises
• Ex1:
57
Word List
Exercises
58
Word List
Exercises
59

Word Sketch
60
Word Sketch
What is a Word Sketch?
• Word Sketch: lets you explore the
grammatical and collocational behaviour of
a word.
• The Word Sketch function doesn't just tell
you which words are commonly found in the
company of your search word; it also tells
you their grammatical relationship
to the search word.
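The idea of grouping collocates by grammatical relation can be sketched as follows, assuming the relations have already been extracted as (relation, headword, collocate) triples. The relation names and the data are invented; a real word sketch also ranks collocates by an association score rather than by raw count.

```python
from collections import Counter, defaultdict

def word_sketch(triples, headword):
    """Group the collocates of headword by grammatical relation, most frequent first."""
    sketch = defaultdict(Counter)
    for relation, head, collocate in triples:
        if head == headword:
            sketch[relation][collocate] += 1
    return {rel: counts.most_common() for rel, counts in sketch.items()}

triples = [
    ("object_of", "tea", "drink"),
    ("object_of", "tea", "drink"),
    ("modifier", "tea", "green"),
    ("object_of", "coffee", "drink"),
]
print(word_sketch(triples, "tea"))
```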
61
62
Word Sketch
Input: lemma
Output: collocations grouped by grammatical relation
Word Sketch
Example
63
Word Sketch
Example
64
Word Sketch
Exercises
• Ex1:
• Ex2:
65

Thesaurus
66
Thesaurus
What is the Thesaurus?
• Thesaurus: lets you find other
words with grammatical and
collocational behaviour similar to that of a given word.
• Note that this thesaurus is produced
automatically from statistics on word
co-occurrences.
• It is not a manually constructed thesaurus;
for each entry it lists words that are
distributionally related but not necessarily
synonyms.
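A minimal sketch of a distributional thesaurus, assuming each word is described by counts of the collocates it occurs with. Cosine similarity over raw counts is used here as a simple stand-in; it is not claimed to be the measure Sketch Engine uses, and the profiles are invented.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two collocate-count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def similar_words(profiles, target, top_n=3):
    """Rank the other words by similarity of their profile to the target's."""
    scores = [(w, cosine(profiles[target], p))
              for w, p in profiles.items() if w != target]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_n]

profiles = {
    "tea": Counter(drink=3, hot=2),
    "coffee": Counter(drink=2, hot=1),
    "car": Counter(drive=4),
}
print(similar_words(profiles, "tea"))
```

Words that share collocates come out as similar even if they are not synonyms, which is the point the slide makes.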
67
68
Thesaurus
Input: lemma + POS tag
Output: similar lemmas
Thesaurus
Example
69
Thesaurus
Example
70
Thesaurus
Example
71

Word Sketch difference
72
Sketch-Diff
What is Word Sketch Difference?
• Sketch-Diff: lets you compare the
behaviour of two words.
• This function is also very useful for
comparing, or deciding between, two possible
translations of an item.
73
74
Sketch-Diff
Input: two words or lemmas
Output: the different and common collocations of
the two lemmas.
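The different and common collocations can be sketched with plain set operations, assuming each lemma comes with a list of its collocates. The function name, the dictionary keys and the sample collocates are illustrative assumptions; the real tool also weighs collocates by score.

```python
def sketch_diff(collocates_a, collocates_b):
    """Split the collocates of two lemmas into distinct and shared groups."""
    a, b = set(collocates_a), set(collocates_b)
    return {
        "only_first": sorted(a - b),
        "shared": sorted(a & b),
        "only_second": sorted(b - a),
    }

tea = ["strong", "hot", "green"]
coffee = ["strong", "hot", "black"]
print(sketch_diff(tea, coffee))
```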
Sketch-Diff
Example
75
Sketch-Diff
Example
76
Sketch-Diff
Exercises
• Ex1:
• Ex2:
77

Compare corpora
78
79
Research Issues!
Please visit: http://goo.gl/HqhUir
Limitations!
Usage!
References
• http://www.sketchengine.co.uk/
• http://lisan1.com/wordpress/?p=146
• Kilgarriff, A., Rychlý, P., Smrž, P., & Tugwell, D.
(2004). The Sketch Engine. ITRI-04-08. Information
Technology, 105-116.
81
Thank You
82

Mais conteúdo relacionado

Mais procurados

Historical linguistics
Historical linguisticsHistorical linguistics
Historical linguisticsRick McKinnon
 
Corpus linguistics
Corpus linguisticsCorpus linguistics
Corpus linguisticsIrum Malik
 
Corpus and bnc
Corpus and bncCorpus and bnc
Corpus and bncmoona butt
 
Corpus annotation for corpus linguistics (nov2009)
Corpus annotation for corpus linguistics (nov2009)Corpus annotation for corpus linguistics (nov2009)
Corpus annotation for corpus linguistics (nov2009)Jorge Baptista
 
Ch 6 corpus linguistics
Ch 6   corpus linguisticsCh 6   corpus linguistics
Ch 6 corpus linguisticsNaveed Khokher
 
Principles of parameters
Principles of parametersPrinciples of parameters
Principles of parametersVelnar
 
Corpus linguistics, ch6
Corpus linguistics, ch6Corpus linguistics, ch6
Corpus linguistics, ch6VivaAs
 
19th century linguistics
19th century linguistics19th century linguistics
19th century linguisticsVenus Withers
 
Supir whorf final
Supir whorf finalSupir whorf final
Supir whorf finalflzza
 
Types of corpus linguistics Parallel ,aligned...
 Types of corpus linguistics Parallel ,aligned... Types of corpus linguistics Parallel ,aligned...
Types of corpus linguistics Parallel ,aligned...RajpootBhatti5
 
Applied linguisticss
Applied linguisticssApplied linguisticss
Applied linguisticssAprian0704
 
Sociolinguistic Aspect of Language
Sociolinguistic Aspect of Language Sociolinguistic Aspect of Language
Sociolinguistic Aspect of Language Aulia Hakim
 
MELT 104 - Construction Grammar
MELT 104 - Construction GrammarMELT 104 - Construction Grammar
MELT 104 - Construction GrammarGlynn Palecpec
 
chapter 1 - Harley (2001)_Eftekhari
chapter 1 - Harley (2001)_Eftekharichapter 1 - Harley (2001)_Eftekhari
chapter 1 - Harley (2001)_EftekhariNasrin Eftekhary
 

Mais procurados (20)

Historical linguistics
Historical linguisticsHistorical linguistics
Historical linguistics
 
Corpus linguistics
Corpus linguisticsCorpus linguistics
Corpus linguistics
 
Corpus and bnc
Corpus and bncCorpus and bnc
Corpus and bnc
 
Corpus annotation for corpus linguistics (nov2009)
Corpus annotation for corpus linguistics (nov2009)Corpus annotation for corpus linguistics (nov2009)
Corpus annotation for corpus linguistics (nov2009)
 
Ch 6 corpus linguistics
Ch 6   corpus linguisticsCh 6   corpus linguistics
Ch 6 corpus linguistics
 
Principles of parameters
Principles of parametersPrinciples of parameters
Principles of parameters
 
Corpus linguistics, ch6
Corpus linguistics, ch6Corpus linguistics, ch6
Corpus linguistics, ch6
 
19th century linguistics
19th century linguistics19th century linguistics
19th century linguistics
 
Supir whorf final
Supir whorf finalSupir whorf final
Supir whorf final
 
Types of corpus linguistics Parallel ,aligned...
 Types of corpus linguistics Parallel ,aligned... Types of corpus linguistics Parallel ,aligned...
Types of corpus linguistics Parallel ,aligned...
 
Generative grammar
Generative grammarGenerative grammar
Generative grammar
 
Applied linguisticss
Applied linguisticssApplied linguisticss
Applied linguisticss
 
Sociolinguistic Aspect of Language
Sociolinguistic Aspect of Language Sociolinguistic Aspect of Language
Sociolinguistic Aspect of Language
 
MELT 104 - Construction Grammar
MELT 104 - Construction GrammarMELT 104 - Construction Grammar
MELT 104 - Construction Grammar
 
Generative grammar ppt report
Generative grammar ppt reportGenerative grammar ppt report
Generative grammar ppt report
 
"The study of language" - Chapter 20
"The study of language" - Chapter 20"The study of language" - Chapter 20
"The study of language" - Chapter 20
 
chapter 1 - Harley (2001)_Eftekhari
chapter 1 - Harley (2001)_Eftekharichapter 1 - Harley (2001)_Eftekhari
chapter 1 - Harley (2001)_Eftekhari
 
Corpus Linguistics
Corpus LinguisticsCorpus Linguistics
Corpus Linguistics
 
Parameter setting
Parameter settingParameter setting
Parameter setting
 
Corpus linguistics
Corpus linguisticsCorpus linguistics
Corpus linguistics
 

Destaque

Классификация корпусов
Классификация корпусовКлассификация корпусов
Классификация корпусовArtem Lukanin
 

Destaque (20)

Sketch engine
Sketch engine Sketch engine
Sketch engine
 
Corpora and its use in elt
Corpora and its use in eltCorpora and its use in elt
Corpora and its use in elt
 
Баев Системы для обучения программированию
Баев Системы для обучения программированиюБаев Системы для обучения программированию
Баев Системы для обучения программированию
 
Смолина Пользовательские интерфейсы систем лингвистической разметки текстов
Смолина Пользовательские интерфейсы систем лингвистической разметки текстовСмолина Пользовательские интерфейсы систем лингвистической разметки текстов
Смолина Пользовательские интерфейсы систем лингвистической разметки текстов
 
Савкуев. Построение формального описания фотографий на основе контекстно-собы...
Савкуев. Построение формального описания фотографий на основе контекстно-собы...Савкуев. Построение формального описания фотографий на основе контекстно-собы...
Савкуев. Построение формального описания фотографий на основе контекстно-собы...
 
Классификация корпусов
Классификация корпусовКлассификация корпусов
Классификация корпусов
 
Мищенко. Методы автоматического определения наиболее частотного значения слова.
Мищенко. Методы автоматического определения наиболее частотного значения слова.Мищенко. Методы автоматического определения наиболее частотного значения слова.
Мищенко. Методы автоматического определения наиболее частотного значения слова.
 
Савостин. Системы и методы научного поиска и мониторинга
Савостин. Системы и методы научного поиска и мониторингаСавостин. Системы и методы научного поиска и мониторинга
Савостин. Системы и методы научного поиска и мониторинга
 
Лукьяненко. Извлечение коллокаций из текста
Лукьяненко. Извлечение коллокаций из текстаЛукьяненко. Извлечение коллокаций из текста
Лукьяненко. Извлечение коллокаций из текста
 
Котиков Простые методы выделения ключевых слов и построения рефератов
Котиков Простые методы выделения ключевых слов и построения рефератовКотиков Простые методы выделения ключевых слов и построения рефератов
Котиков Простые методы выделения ключевых слов и построения рефератов
 
Смирнова. Методы исправления ошибок в текстах, написанных иностранцами.
Смирнова. Методы исправления ошибок в текстах, написанных иностранцами.Смирнова. Методы исправления ошибок в текстах, написанных иностранцами.
Смирнова. Методы исправления ошибок в текстах, написанных иностранцами.
 
Тодуа. Сериализация и язык YAML
Тодуа. Сериализация и язык YAMLТодуа. Сериализация и язык YAML
Тодуа. Сериализация и язык YAML
 
Багдатов Методы автоматического выявления плагиата в текстах компьютерных про...
Багдатов Методы автоматического выявления плагиата в текстах компьютерных про...Багдатов Методы автоматического выявления плагиата в текстах компьютерных про...
Багдатов Методы автоматического выявления плагиата в текстах компьютерных про...
 
Иванов. Автоматизация построения предметных указателей
Иванов. Автоматизация построения предметных указателейИванов. Автоматизация построения предметных указателей
Иванов. Автоматизация построения предметных указателей
 
Можарова Тематические модели: учет сходства между униграммами и биграммами.
Можарова Тематические модели: учет сходства между униграммами и биграммами.Можарова Тематические модели: учет сходства между униграммами и биграммами.
Можарова Тематические модели: учет сходства между униграммами и биграммами.
 
Муромцев. Обзор библиографических менеджеров
Муромцев. Обзор библиографических менеджеровМуромцев. Обзор библиографических менеджеров
Муромцев. Обзор библиографических менеджеров
 
Панфилов. Корпусы текстов и принципы их создания
Панфилов. Корпусы текстов и принципы их созданияПанфилов. Корпусы текстов и принципы их создания
Панфилов. Корпусы текстов и принципы их создания
 
куликов Sketch engine ord
куликов Sketch engine ordкуликов Sketch engine ord
куликов Sketch engine ord
 
Сапин. Интеллектуальные агенты и обучение с подкреплением
Сапин. Интеллектуальные агенты и обучение с подкреплениемСапин. Интеллектуальные агенты и обучение с подкреплением
Сапин. Интеллектуальные агенты и обучение с подкреплением
 
Рой. Аспектный анализ тональности отзывов
Рой. Аспектный анализ тональности отзывов Рой. Аспектный анализ тональности отзывов
Рой. Аспектный анализ тональности отзывов
 

Semelhante a Sketch engine presentation

ANTLR - Writing Parsers the Easy Way
ANTLR - Writing Parsers the Easy WayANTLR - Writing Parsers the Easy Way
ANTLR - Writing Parsers the Easy WayMichael Yarichuk
 
Using ANTLR on real example - convert "string combined" queries into paramete...
Using ANTLR on real example - convert "string combined" queries into paramete...Using ANTLR on real example - convert "string combined" queries into paramete...
Using ANTLR on real example - convert "string combined" queries into paramete...Alexey Diyan
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Chunyang Chen
 
Fusing Modeling and Programming into Language-Oriented Programming
Fusing Modeling and Programming into Language-Oriented ProgrammingFusing Modeling and Programming into Language-Oriented Programming
Fusing Modeling and Programming into Language-Oriented ProgrammingMarkus Voelter
 
Computational model language and grammar bnf
Computational model language and grammar bnfComputational model language and grammar bnf
Computational model language and grammar bnfTaha Shakeel
 
Towards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression DerivativesTowards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression DerivativesJose Emilio Labra Gayo
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler ConstructionAhmed Raza
 
My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017
My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017
My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017Codemotion
 
Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...
Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...
Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...AboutYouGmbH
 
Introduction to Boost regex
Introduction to Boost regexIntroduction to Boost regex
Introduction to Boost regexYongqiang Li
 
Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...
Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...
Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...DevClub_lv
 

Semelhante a Sketch engine presentation (20)

ANTLR - Writing Parsers the Easy Way
ANTLR - Writing Parsers the Easy WayANTLR - Writing Parsers the Easy Way
ANTLR - Writing Parsers the Easy Way
 
Using ANTLR on real example - convert "string combined" queries into paramete...
Using ANTLR on real example - convert "string combined" queries into paramete...Using ANTLR on real example - convert "string combined" queries into paramete...
Using ANTLR on real example - convert "string combined" queries into paramete...
 
Plc part 2
Plc  part 2Plc  part 2
Plc part 2
 
LANGUAGE TRANSLATOR
LANGUAGE TRANSLATORLANGUAGE TRANSLATOR
LANGUAGE TRANSLATOR
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
 
Parser
ParserParser
Parser
 
Fusing Modeling and Programming into Language-Oriented Programming
Fusing Modeling and Programming into Language-Oriented ProgrammingFusing Modeling and Programming into Language-Oriented Programming
Fusing Modeling and Programming into Language-Oriented Programming
 
Lexical analyzer
Lexical analyzerLexical analyzer
Lexical analyzer
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
 
Computational model language and grammar bnf
Computational model language and grammar bnfComputational model language and grammar bnf
Computational model language and grammar bnf
 
2.regular expressions
2.regular expressions2.regular expressions
2.regular expressions
 
Towards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression DerivativesTowards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression Derivatives
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
 
My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017
My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017
My 10 favorite Haxe language features - Francis Bourre - Codemotion Rome 2017
 
8074448.ppt
8074448.ppt8074448.ppt
8074448.ppt
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
 
Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...
Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...
Stefan Richter - Writing simple, readable and robust code: Examples in Java, ...
 
Compiler design Project
Compiler design ProjectCompiler design Project
Compiler design Project
 
Introduction to Boost regex
Introduction to Boost regexIntroduction to Boost regex
Introduction to Boost regex
 
Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...
Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...
Building .NET Core tools using the Roslyn API by Arthur Tabatchnic at .Net fo...
 

Mais de iwan_rg

Automatic text simplification evaluation aspects
Automatic text simplification  evaluation aspectsAutomatic text simplification  evaluation aspects
Automatic text simplification evaluation aspectsiwan_rg
 
تلخيص كتاب مقدمة في معالجة اللغة العربية
تلخيص كتاب مقدمة في معالجة اللغة العربيةتلخيص كتاب مقدمة في معالجة اللغة العربية
تلخيص كتاب مقدمة في معالجة اللغة العربيةiwan_rg
 
Building theoretical models using structured equation modeling
Building theoretical models using structured equation modelingBuilding theoretical models using structured equation modeling
Building theoretical models using structured equation modelingiwan_rg
 
ورشة تضمين الكلمات في التعلم العميق Word embeddings workshop
ورشة تضمين الكلمات في التعلم العميق Word embeddings workshopورشة تضمين الكلمات في التعلم العميق Word embeddings workshop
ورشة تضمين الكلمات في التعلم العميق Word embeddings workshopiwan_rg
 
Introduction to Arabic natural language processing (Infographics)
Introduction to Arabic natural language processing (Infographics)Introduction to Arabic natural language processing (Infographics)
Introduction to Arabic natural language processing (Infographics)iwan_rg
 
Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...iwan_rg
 
التقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـ
التقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـالتقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـ
التقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـiwan_rg
 
CHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERS
CHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERSCHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERS
CHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERSiwan_rg
 
التقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـ
التقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـالتقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـ
التقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـiwan_rg
 
مركز تميز الحوسبة العربية المتقدمة
مركز تميز  الحوسبة العربية المتقدمةمركز تميز  الحوسبة العربية المتقدمة
مركز تميز الحوسبة العربية المتقدمةiwan_rg
 
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis iwan_rg
 
P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization
P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization
P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization iwan_rg
 
P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation
P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation
P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation iwan_rg
 
P02- Towards a New Arabic Corpus of Dyslexic Texts
P02- Towards a New Arabic Corpus of Dyslexic TextsP02- Towards a New Arabic Corpus of Dyslexic Texts
P02- Towards a New Arabic Corpus of Dyslexic Textsiwan_rg
 
P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects
P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects
P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects iwan_rg
 
Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...
Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...
Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...iwan_rg
 
OSACT2 LREC 2016 workshop proceedings
OSACT2 LREC 2016 workshop proceedingsOSACT2 LREC 2016 workshop proceedings
OSACT2 LREC 2016 workshop proceedingsiwan_rg
 
محاضرة المدونات اللغوية وأدواتها
محاضرة المدونات اللغوية وأدواتهامحاضرة المدونات اللغوية وأدواتها
محاضرة المدونات اللغوية وأدواتهاiwan_rg
 
لغويات المدونة الحاسوبية
لغويات المدونة الحاسوبيةلغويات المدونة الحاسوبية
لغويات المدونة الحاسوبيةiwan_rg
 
iWAN Annual Report 1435/1436H
 iWAN Annual Report 1435/1436H iWAN Annual Report 1435/1436H
iWAN Annual Report 1435/1436Hiwan_rg
 

Mais de iwan_rg (20)

Automatic text simplification evaluation aspects
Automatic text simplification  evaluation aspectsAutomatic text simplification  evaluation aspects
Automatic text simplification evaluation aspects
 
تلخيص كتاب مقدمة في معالجة اللغة العربية
تلخيص كتاب مقدمة في معالجة اللغة العربيةتلخيص كتاب مقدمة في معالجة اللغة العربية
تلخيص كتاب مقدمة في معالجة اللغة العربية
 
Building theoretical models using structured equation modeling
Building theoretical models using structured equation modelingBuilding theoretical models using structured equation modeling
Building theoretical models using structured equation modeling
 
ورشة تضمين الكلمات في التعلم العميق Word embeddings workshop
ورشة تضمين الكلمات في التعلم العميق Word embeddings workshopورشة تضمين الكلمات في التعلم العميق Word embeddings workshop
ورشة تضمين الكلمات في التعلم العميق Word embeddings workshop
 
Introduction to Arabic natural language processing (Infographics)
Introduction to Arabic natural language processing (Infographics)Introduction to Arabic natural language processing (Infographics)
Introduction to Arabic natural language processing (Infographics)
 
Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...Summary of Multilingual Natural Language Processing Applications: From Theory...
Summary of Multilingual Natural Language Processing Applications: From Theory...
 
التقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـ
التقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـالتقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـ
التقرير السنوي لمجموعة إيوان البحثية 1437هـ-1438هـ
 
CHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERS
CHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERSCHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERS
CHOOSING RESEARCH TOPICS AND WRITING RESEARCH PAPERS
 
التقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـ
التقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـالتقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـ
التقرير السنوي لمجموعة إيوان البحثية 1436هـ-1437هـ
 
مركز تميز الحوسبة العربية المتقدمة
مركز تميز  الحوسبة العربية المتقدمةمركز تميز  الحوسبة العربية المتقدمة
مركز تميز الحوسبة العربية المتقدمة
 
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
 
P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization
P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization
P03- MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization
 
P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation
P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation
P04- Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation
 
P02- Towards a New Arabic Corpus of Dyslexic Texts
P02- Towards a New Arabic Corpus of Dyslexic TextsP02- Towards a New Arabic Corpus of Dyslexic Texts
P02- Towards a New Arabic Corpus of Dyslexic Texts
 
P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects
P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects
P01- Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects
 
Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...
Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...
Keynote - Computational Processing of Arabic Dialects: Challenges, Advances a...
 
OSACT2 LREC 2016 workshop proceedings
OSACT2 LREC 2016 workshop proceedingsOSACT2 LREC 2016 workshop proceedings
OSACT2 LREC 2016 workshop proceedings
 
محاضرة المدونات اللغوية وأدواتها
محاضرة المدونات اللغوية وأدواتهامحاضرة المدونات اللغوية وأدواتها
محاضرة المدونات اللغوية وأدواتها
 
لغويات المدونة الحاسوبية
لغويات المدونة الحاسوبيةلغويات المدونة الحاسوبية
لغويات المدونة الحاسوبية
 
iWAN Annual Report 1435/1436H
 iWAN Annual Report 1435/1436H iWAN Annual Report 1435/1436H
iWAN Annual Report 1435/1436H
 

Editor's Notes

  1-2. It is a corpus query tool that takes as input a corpus of any language (with an appropriate level of linguistic mark-up) and a corresponding set of grammar patterns, and generates, among other things, word sketches for the words of that language. Those other things include a corpus-based thesaurus and ‘sketch differences’, which specify, for two semantically related words, what behaviour they share and how they differ. We anticipate that sketch differences will be particularly useful for lexicographers interested in near-synonym differentiation. Word sketches were first used in the production of the Macmillan English Dictionary (Rundell 2002) and were presented at Euralex 2002 (Kilgarriff and Rundell 2002). Following that presentation, the most-asked question was “can I have them for my language?” In response, we have now developed the Sketch Engine.
  3. The Sketch Engine has a number of language-analysis functions, the core ones being:
     The Concordancer: a program that displays all occurrences in the corpus for a given query. It is very powerful, with a wide variety of query types and many different ways of displaying and organising the results (concordancing, sorting, sampling, wordlists, collocation lists).
     The Word Sketch program: provides a corpus-based summary of a word's grammatical and collocational behaviour.
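As a rough illustration of what a concordancer does, the Python sketch below builds keyword-in-context (KWIC) lines from a small tokenised text. The function name `kwic`, the `width` parameter, and the sample sentence are illustrative assumptions and not part of Sketch Engine, which works on indexed, annotated corpora rather than a plain token list.

```python
def kwic(tokens, query, width=3):
    """Return (left context, node, right context) for each occurrence of `query`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == query.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Toy text, tokenised on whitespace (an assumption for the example only).
text = "the cat sat on the mat and the dog sat on the rug"
tokens = text.split()

# Print the hits with the node aligned in a central column.
for left, node, right in kwic(tokens, "sat", width=3):
    print(f"{left:>20}  [{node}]  {right}")
```

Sorting the returned triples by their left or right element would roughly mimic the concordance sort options.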
  4-6. With Corpus Architect, you can build your own corpora from documents in various formats: TXT, PDF, PS, DOC, HTML, VERT. Once processed, you can search and query them within Sketch Engine.
  7. Concordance: for querying a corpus and obtaining concordances, which you can then further refine, filter and use for generating frequency information and collocation lists.
     Word List: for obtaining word lists for an entire corpus or a specified subcorpus.
     Word Sketch: this allows you to explore the grammatical and collocational behaviour of a word.
     Thesaurus: this allows you to find other words that have similar grammatical and collocational behaviour to a given word. Note that this thesaurus is produced automatically from statistics on word co-occurrences. It is not a manually constructed thesaurus, and for each entry it will list words that are distributionally related but not necessarily synonyms.
     Sketch-Diff: this allows you to compare the behaviour of two words.
  8. Main Sketch Engine Links: https://www.sketchengine.co.uk/documentation/wiki/SkE/Help/MainLinkHelp
  9. Concordance Query: https://www.sketchengine.co.uk/documentation/wiki/SkE/Help/PageSpecificHelp/ConcordanceQuery
     Query Types: Using Query Type, you can refine the type of query you wish to make in the main panel.
     Context: If Context is selected in the left-hand side menu, you can specify criteria on the context for your query in the main panel. You can specify the context in terms of surrounding lemma(s) and/or PoS tag(s).
     Text Types: Here you can select a subcorpus or create a new subcorpus from a subset of the current corpus. You can also set constraints on the text types of the documents that will be searched for your query.
  10. CQL: https://www.sketchengine.co.uk/documentation/wiki/SkE/CorpusQuerying#1.
  11. Ex1: Lemma filter: Window: right, 1 tokens; Lemma(s): عن; none
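The lemma filter in this example keeps only those hits that have a given lemma within one token to the right of the node. A minimal Python sketch of that idea follows, using an English toy corpus for readability; in the example above the context lemma would be عن. The `Token` type, the function names, and the toy annotations are assumptions made for illustration and do not reflect Sketch Engine internals.

```python
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    lemma: str
    tag: str


def find_node(tokens, node_lemma):
    """Positions of all tokens whose lemma equals `node_lemma`."""
    return [i for i, t in enumerate(tokens) if t.lemma == node_lemma]


def lemma_filter(tokens, hits, lemma, window=1, side="right"):
    """Keep hit positions that have `lemma` within `window` tokens on `side`."""
    kept = []
    for i in hits:
        if side == "right":
            context = tokens[i + 1:i + 1 + window]
        else:
            context = tokens[max(0, i - window):i]
        if any(t.lemma == lemma for t in context):
            kept.append(i)
    return kept


# Hypothetical, hand-annotated toy corpus.
tokens = [
    Token("She", "she", "PRON"), Token("looked", "look", "VERB"),
    Token("for", "for", "ADP"), Token("keys", "key", "NOUN"),
    Token(".", ".", "PUNCT"), Token("He", "he", "PRON"),
    Token("looks", "look", "VERB"), Token("at", "at", "ADP"),
    Token("maps", "map", "NOUN"), Token(".", ".", "PUNCT"),
]

hits = find_node(tokens, "look")
filtered = lemma_filter(tokens, hits, "for", window=1, side="right")
print(hits, filtered)
```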
  12. Concordance Menu options: https://www.sketchengine.co.uk/documentation/wiki/SkE/Help/PageSpecificHelp/Concordance
     Menu options: Note that the options in the left-hand side panel are all available when you are viewing the concordance. Some of the options will not be shown if you have already selected from this menu. If so, you can click View Concordance to get back to the concordance.
     View Options: Click this to alter how the concordance looks, including which attributes of the words in the concordance you see.
     KWIC/Sentence: Toggle between KWIC mode, where the queried text (node) is in a central column with context displayed on either side, and Sentence mode, where the queried text (node) is shown in the context of the sentence in which it occurs.
     Save: Click this to see options for saving the concordance in the main panel (or the frequency list or collocation candidates).
     Sort: Click this to see complex sorting options. If the concordance is sorted by context, an option to "Jump to" a page whose context starts with a certain letter appears. Alternatively, you can click on:
        Left (Right): to sort by the text to the left (right) of the node
        Node: to sort by the text in the central column (referred to as the node or KWIC)
        References: to sort by the document references at the left-hand side of the concordance
        Shuffle: the concordance is jumbled to avoid the bias of a user looking only at the first portion
     Sample: Click this to select a random sample of the concordance lines.
     Filter: Click this to specify further contextual features for filtering the concordance, for example words to the left or right of the node word, or text type.
     Frequency: Click this to see a variety of complex methods for obtaining frequency lists. Alternatively, you can click on:
        Node tags: to get a frequency list over the part-of-speech tags of the node word(s) in the central column
        Node forms: to get a frequency list over the node word forms in the central column
        Doc IDs: to get a frequency list over the Doc IDs for the node word(s) in the central column
        Text Types: to get a frequency list over all the text types of the node word(s) in the central column
     Collocations: Click this to specify criteria and build collocation lists for the node word(s) in the central column.
     ConcDesc: Shows the query in detail (for technical users) and lets you go back in the history if the query consists of several subsequent actions.
     Visualize: Shows the distributional graph of the concordance within the corpus. The x-axis shows concordance positions (by default 100 columns for 100 slices of the corpus; you can change the granularity with the slider and then click the Redraw button). The y-axis shows the relative frequency of the query hits within each concordance part (column). Columns are clickable: clicking a column filters the concordance so that you see only the corresponding part.
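The distribution graph described under Visualize can be approximated as follows: divide the corpus into equal slices, count the hits that fall into each slice, and normalise. This is only a sketch; the function name `hit_distribution` is hypothetical, and "relative frequency" is assumed here to mean the share of all hits that falls in each slice, which may differ from the normalisation Sketch Engine actually uses.

```python
def hit_distribution(hit_positions, corpus_size, slices=100):
    """Share of all hits in each of `slices` equal-width parts of the corpus."""
    counts = [0] * slices
    for pos in hit_positions:
        # Map a token position (0-based) to its slice index.
        idx = min(pos * slices // corpus_size, slices - 1)
        counts[idx] += 1
    total = len(hit_positions)
    return [c / total if total else 0.0 for c in counts]


# Toy data: four hits in a 100-token corpus, shown with 4 slices
# instead of the default 100 to keep the output short.
distribution = hit_distribution([0, 5, 50, 99], corpus_size=100, slices=4)
print(distribution)
```

Changing `slices` plays the role of the granularity slider.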
  13. Word List Options:
     Left-hand side options: select All words to generate a list of words in the corpus ranked by frequency; select All lemmas to generate a list of lemmas in the corpus ranked by frequency. A lemma is the base (dictionary) form of a word.
     In the main panel of the interface you have further options:
     Subcorpus: where you can specify a subcorpus for the source data, or create a new one.
     Search Attribute: you can specify word, lemma, tag (part-of-speech tag), etc., depending on the attributes defined for the corpus, or you can specify one of the text types defined for the corpus. The default attribute is word.
     Filter Options: you can either build the list for all words (or lemmas, or whichever attribute you specify) or filter the list.
     Output Options: you can choose among different types of output list.
  14. Choose a corpus and click on Word List in the left-hand side menu. Choose lemma at Search attribute. Type the lemma (e.g. حار) into the RE pattern box. Tick the box that says change output attribute(s). In the first two levels, select "lemma" and "tag". Click on Make Word List.
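The word-list steps above amount to filtering tokens by a regular expression on one attribute and then counting combinations of the chosen output attributes. A minimal Python sketch, assuming tokens are stored as dictionaries with `word`, `lemma` and `tag` keys, and assuming the RE pattern must match the whole attribute value; the function name `word_list` and the English toy data are illustrative only.

```python
import re
from collections import Counter


def word_list(tokens, search_attr="lemma", pattern=".*", output_attrs=("lemma",)):
    """Frequency-ranked list of output-attribute combinations for matching tokens."""
    regex = re.compile(pattern)
    counts = Counter(
        tuple(tok[attr] for attr in output_attrs)
        for tok in tokens
        if regex.fullmatch(tok[search_attr])
    )
    return counts.most_common()


# Hypothetical annotated tokens (word form, lemma, PoS tag).
rows = [
    ("lights", "light", "NOUN"), ("light", "light", "NOUN"),
    ("light", "light", "NOUN"), ("lighter", "light", "ADJ"),
    ("light", "light", "ADJ"), ("lit", "light", "VERB"),
    ("dark", "dark", "ADJ"),
]
tokens = [dict(zip(("word", "lemma", "tag"), row)) for row in rows]

# Search attribute: lemma; output attributes: lemma and tag (two levels).
result = word_list(tokens, "lemma", "light", ("lemma", "tag"))
for combo, freq in result:
    print(combo, freq)
```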
  15. Wordlist  search Attr: lemma, Change Attr: gender