SlideShare uma empresa Scribd logo
1 de 23
Baixar para ler offline
31 March 2009 Yoruba verb morphology 1
A computational approach to
Yorùbá morphology
Ọdẹtúnjí A. Ọdẹjọbí
Cork Constraint Computation
Center (4C)
Computer Science Department
University College Cork
Cork, Ireland
SUPPORTED BY Science Foundation Ireland Grant 05/IN/I886 and Marie
Curie Grant MTKD-CT-2006-042563
Raphael Finkel
Computer Science Department
University of Kentucky,
USA
SUPPORTED BY US National Science Foundation Grants IIS-0097278 and IIS-0325063
and by the University of Kentucky Center for Computational Science
31 March 2009 Yoruba verb morphology 2
In this presentation
 Explain the output of our program for
Yoruba verb morphology
 Discuss how we developed the program
 Discuss the significance of our efforts
 State our ongoing efforts
/home/odetunji/Desktop/ConferenceSlides/yoruba.utf8.html
31 March 2009 Yoruba verb morphology 3
Yorùbá in Brief
• Edikiri language in the Niger-Congo family spoken
widely in southwestern Nigeria (ISO: yor)
• Many dialects, with a standard form (SY) for
communication and education
• 3 tones: High(H), Medium(M), Low(L)
• 2 tonal contours: falling (HL) and rising (LH)
• Simple verb morphology: Only one conjugation
• The verb morphology is documented.
31 March 2009 Yoruba verb morphology 4
Our goals
To generate verb forms for SY
(i) realise all 160 combinations of morphosyntactic
properties
Tense: present, continuous, past, future
Polarity: positive, negative
Person: 1, 2Older, 3Older, 2Notolder, 3NotOlder
Number: singular, plural
Strength: normal, emphatic
(ii) provide a computational description of SY verb
formation
31 March 2009 Yoruba verb morphology 5
The KATR formalism
 Based on DATR, a formalism for representing
lexical knowledge by default-inheritance
hierarchies (Evans & Gazdar, 1989).
 Queries (such as 1 pl past) are directed to
nodes that contain rules that either answer the
queries or direct them to further nodes.
31 March 2009 Yoruba verb morphology 6
Generating Queries in KATR
We declare variables to represent
morphosyntactic properties
1) #vars $tense: present past continuous future .
2) #vars $polarity: positive negative .
3) #vars $person: 1 2Older 3Older 2NotOlder 3NotOlder .
4) #vars $number: sg pl .
5) #vars $strength: normal emphatic .
31 March 2009 Yoruba verb morphology 7
Generating multiple queries
 This "show" line generates 160 queries such as:
 <normal negative past 3Older sg>
 <emphatic negative continuous 3Older pl>
 These queries are directed to all leaf nodes, such
as the "Take" node. (Node names always start with
upper-case letters)
#show <$strength :: $polarity :: $tense :: $person :: $number > .
31 March 2009 Yoruba verb morphology 8
Take:
1 <stem> = m un ´ % tone marks always follow vowels
2 {} = Verb
The "Take" node
 The order of rules is not significant.
 The query <emphatic negative continuous 3Older pl>
only matches Rule 2, which is completely
unconstrained.
 Rule 2 directs the query to the “Verb” node.
31 March 2009 Yoruba verb morphology 9
This query:
<emphatic negative continuous 3Older pl>
matches both rules. KATR chooses the more constraining rule (Panini's
principle), that is, Rule 2.
Rule 2 converts the query to
<present negative emphatic 3Older pl>
and directs it again to the "Verb" node.
The "Verb" node
Verb:
1 {} = Person Negator1 Tense Negator2 , "<stem>" Ending
2 {continuous negative} = <present negative>
31 March 2009 Yoruba verb morphology 10
Verb:
1 {} = Person Negator1 Tense Negator2 , "<stem>" Ending
2 {continuous negative} = <present negative>
The "Verb" node, modified query
This modified query:
<present negative emphatic 3Older pl>
matches only Rule 1, which

Represents our analysis of SY, which identifies 6 slots.

Combines the results for each slot into a single result
• The results of sending the query to five different nodes.
• The surface form "," which we use to create word boundaries.
• The result of sending the new query "<stem>" to the starting leaf
node ”Take", which returns the surface form “m un ´”
31 March 2009 Yoruba verb morphology 11
Person:
1 {3Older positive !future} = w ϙn ´
2 {3Older} = w ϙn
3 {3NotOlder} = o ´
4 {3NotOlder negative sg} =
5 {3NotOlder future} = y i ´
6 {3NotOlder pl ++} = <3Older>
... % omitting many other rules
The "Person" node
This query:
<present negative emphatic 3Older pl>
only matches Rule 2, generating the answer “w ϙn”.
31 March 2009 Yoruba verb morphology 12
Negator1:
1 {negative} = , (k) o `
2 {negative 3NotOlder sg} = k o `
3 {} =
The "Negator1" node
This query:
<present negative emphatic 3Older pl>
matches Rules 1 and 3. KATR chooses Rule 1,
generating the answer “, (k) o `”.
31 March 2009 Yoruba verb morphology 13
Tense: % polarity, tense
1 {} =
2 {past} = , t i
3 {continuous positive} = , n ´
4 {future positive} = , o ̂
5 {future 1 sg positive} = , a ̌
6 {future 3NotOlder positive} = <future 3Older positive>
The "Tense" node
This query:
<present negative emphatic 3Older pl>
matches Rule 1, generating an empty (but valid!) output.
31 March 2009 Yoruba verb morphology 14
Negator2: % polarity, tense
1 {future negative} = , n i ´
2 {past negative} = ´ i `
3 {} =
The "Negator2" node
This query:
<present negative emphatic 3Older pl>
Matches only Rule 3, which generates an empty output.
31 March 2009 Yoruba verb morphology 15
Ending:
1 {} =
2 {emphatic} = ↓
The "Ending" node
This query:
<present negative emphatic 3Older pl>
Matches both rules; KATR chooses Rule 2, which
generates ↓, which is a jer for post-processing.
31 March 2009 Yoruba verb morphology 16
The "Verb" node assembles all the results into this surface
form:
w ϙn , (k) o ` , m un ´ ↓
This surface form is now treated by postprocessing rules.
Postprocessing
1) #sandhi $vowel ↓ => $1 $1 ` .
2) #sandhi $vowel $tone ↓ => $1 $2 $1 .
3) #sandhi un $tone => u $1 n . % spelling
4) %(others omitted)
Rules 1 and 2 remove the ↓ jer. In this case, Rule 2 applies,
giving us:
wwϙϙn , (k) o ` , m un ´ unn , (k) o ` , m un ´ un
31 March 2009 Yoruba verb morphology 17
Then Rule 3 applies, giving us
wϙn , (k) o ` , m u ´ n un
When we compress spaces out and replace comma
with space, we get:
wϙn (k)ò múnun
which is the correct surface form for
Take:<emphatic negative continuous 3Older pl>
“They (older) are certainly not taking (that object)”
31 March 2009 Yoruba verb morphology 18
Implementation
1. A Perl script converts the KATR theory into
 yoruba.katr.pro: a Prolog representation of the theory
 yoruba.sandhi.pl: a Perl script for post-processing
2. A Prolog interpreter computes the results of all queries generated
by “show” directed to all leaf nodes in the KATR theory.
3. The Perl post-processing script applies the Sandhi and other
post-processing rules.
4. We then either generate textual output for direct viewing or HTML
output for a browser.
The KATR theory implemenation for Yoruba is available at
http://www.cs.uky.edu/~raphael/KATR.html
31 March 2009 Yoruba verb morphology 19
Applications
Linguistics: Theoretical studies of SY
Pedagogy: Describing SY verbs to students
Learning : Facilitating tool for teaching SY
Technology: Developing software products
such as spelling and grammar checkers
31 March 2009 Yoruba verb morphology 20
KATR instead of DATR
 KATR allows sets in addition to paths on the
left-hand side, so it is easy to ignore
irrelevant morphosyntactic properties.
 KATR is fast, so turn-around time is very short.
 KATR lets us specify post-processing directly
instead of embedding it in the default-inheritance
hierarchy.
31 March 2009 Yoruba verb morphology 21
 Description of slots in SY verb morphology
 Six slots identified
 Complete specification of the realizations of
those slots
 A simple use of jers to deal with the tone
Sandhi of the emphatic suffix.
Contributions
31 March 2009 Yoruba verb morphology 22
On going efforts
 Evaulation: Subject out programe to further
evaluation throught working with Yoruba
linguists and phonologist
 Expansion: Expand the rule for similar African
tone languages
 Exploration: Explore the generalitry of our
approach and the possibility for developing
genertic morphological rules
31 March 2009 Yoruba verb morphology 23
HELP!!
Suggestions?
Education?
Questions?

Mais conteúdo relacionado

Mais procurados

Named Entity Recognition - VLSP 2016
Named Entity Recognition - VLSP 2016Named Entity Recognition - VLSP 2016
Named Entity Recognition - VLSP 2016Anh Vũ
 
Deep Dependency Graph Conversion in English
Deep Dependency Graph Conversion in EnglishDeep Dependency Graph Conversion in English
Deep Dependency Graph Conversion in EnglishJinho Choi
 
Jarrar: Final Check & Schema Engineering Issues
Jarrar: Final Check & Schema Engineering IssuesJarrar: Final Check & Schema Engineering Issues
Jarrar: Final Check & Schema Engineering IssuesMustafa Jarrar
 
Aaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced conceptsAaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced conceptsAminaRepo
 
Tcs technical interview questions
Tcs technical interview questionsTcs technical interview questions
Tcs technical interview questionsAshu0711
 
TCS Job Interview Questions
TCS Job Interview QuestionsTCS Job Interview Questions
TCS Job Interview QuestionsNavdeep Kumar
 
Functional dependencies in Database Management System
Functional dependencies in Database Management SystemFunctional dependencies in Database Management System
Functional dependencies in Database Management SystemKevin Jadiya
 
Sdd Syntax Descriptions
Sdd Syntax DescriptionsSdd Syntax Descriptions
Sdd Syntax Descriptionsgavhays
 
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
FUNCTION DEPENDENCY  AND TYPES & EXAMPLEFUNCTION DEPENDENCY  AND TYPES & EXAMPLE
FUNCTION DEPENDENCY AND TYPES & EXAMPLEVraj Patel
 
A Survey of Various Methods for Text Summarization
A Survey of Various Methods for Text SummarizationA Survey of Various Methods for Text Summarization
A Survey of Various Methods for Text SummarizationIJERD Editor
 
B.tech admission in india
B.tech admission in indiaB.tech admission in india
B.tech admission in indiaEdhole.com
 
Four Languages From Forty Years Ago
Four Languages From Forty Years AgoFour Languages From Forty Years Ago
Four Languages From Forty Years AgoScott Wlaschin
 
Jarrar: Subtype Relations and Constraints
Jarrar: Subtype Relations and ConstraintsJarrar: Subtype Relations and Constraints
Jarrar: Subtype Relations and ConstraintsMustafa Jarrar
 
Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Avelin Huo
 
Python Interview questions 2020
Python Interview questions 2020Python Interview questions 2020
Python Interview questions 2020VigneshVijay21
 

Mais procurados (20)

Named Entity Recognition - VLSP 2016
Named Entity Recognition - VLSP 2016Named Entity Recognition - VLSP 2016
Named Entity Recognition - VLSP 2016
 
Deep Dependency Graph Conversion in English
Deep Dependency Graph Conversion in EnglishDeep Dependency Graph Conversion in English
Deep Dependency Graph Conversion in English
 
Jarrar: Final Check & Schema Engineering Issues
Jarrar: Final Check & Schema Engineering IssuesJarrar: Final Check & Schema Engineering Issues
Jarrar: Final Check & Schema Engineering Issues
 
Aaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced conceptsAaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced concepts
 
FinalReport
FinalReportFinalReport
FinalReport
 
Tcs technical interview questions
Tcs technical interview questionsTcs technical interview questions
Tcs technical interview questions
 
Database management system session 5
Database management system session 5Database management system session 5
Database management system session 5
 
TCS Job Interview Questions
TCS Job Interview QuestionsTCS Job Interview Questions
TCS Job Interview Questions
 
Functional dependencies in Database Management System
Functional dependencies in Database Management SystemFunctional dependencies in Database Management System
Functional dependencies in Database Management System
 
Sdd Syntax Descriptions
Sdd Syntax DescriptionsSdd Syntax Descriptions
Sdd Syntax Descriptions
 
Presentation1
Presentation1Presentation1
Presentation1
 
Interpreter
InterpreterInterpreter
Interpreter
 
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
FUNCTION DEPENDENCY  AND TYPES & EXAMPLEFUNCTION DEPENDENCY  AND TYPES & EXAMPLE
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
 
Parsalyzer
ParsalyzerParsalyzer
Parsalyzer
 
A Survey of Various Methods for Text Summarization
A Survey of Various Methods for Text SummarizationA Survey of Various Methods for Text Summarization
A Survey of Various Methods for Text Summarization
 
B.tech admission in india
B.tech admission in indiaB.tech admission in india
B.tech admission in india
 
Four Languages From Forty Years Ago
Four Languages From Forty Years AgoFour Languages From Forty Years Ago
Four Languages From Forty Years Ago
 
Jarrar: Subtype Relations and Constraints
Jarrar: Subtype Relations and ConstraintsJarrar: Subtype Relations and Constraints
Jarrar: Subtype Relations and Constraints
 
Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03
 
Python Interview questions 2020
Python Interview questions 2020Python Interview questions 2020
Python Interview questions 2020
 

Semelhante a A Computational Approach to Yoruba Morphology

A Plan towards Ruby 3 Types
A Plan towards Ruby 3 TypesA Plan towards Ruby 3 Types
A Plan towards Ruby 3 Typesmametter
 
Deep learning for biotechnology presentation
Deep learning for biotechnology presentationDeep learning for biotechnology presentation
Deep learning for biotechnology presentationashuh3
 
NL to OCL via SBVR
NL to OCL via SBVRNL to OCL via SBVR
NL to OCL via SBVRImran Bajwa
 
The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...
The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...
The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...Katerina Vylomova
 
L06 stemmer and edit distance
L06 stemmer and edit distanceL06 stemmer and edit distance
L06 stemmer and edit distanceananth
 
Monitoring and feedback in the process of language acquisition analysis and ...
Monitoring and feedback in the process of language acquisition  analysis and ...Monitoring and feedback in the process of language acquisition  analysis and ...
Monitoring and feedback in the process of language acquisition analysis and ...ijnlc
 

Semelhante a A Computational Approach to Yoruba Morphology (7)

A Plan towards Ruby 3 Types
A Plan towards Ruby 3 TypesA Plan towards Ruby 3 Types
A Plan towards Ruby 3 Types
 
Deep learning for biotechnology presentation
Deep learning for biotechnology presentationDeep learning for biotechnology presentation
Deep learning for biotechnology presentation
 
NL to OCL via SBVR
NL to OCL via SBVRNL to OCL via SBVR
NL to OCL via SBVR
 
The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...
The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...
The Secret Life of Words: Exploring Regularity and Systematicity (joint talk ...
 
L06 stemmer and edit distance
L06 stemmer and edit distanceL06 stemmer and edit distance
L06 stemmer and edit distance
 
N20190729
N20190729N20190729
N20190729
 
Monitoring and feedback in the process of language acquisition analysis and ...
Monitoring and feedback in the process of language acquisition  analysis and ...Monitoring and feedback in the process of language acquisition  analysis and ...
Monitoring and feedback in the process of language acquisition analysis and ...
 

Mais de Guy De Pauw

Technological Tools for Dictionary and Corpora Building for Minority Language...
Technological Tools for Dictionary and Corpora Building for Minority Language...Technological Tools for Dictionary and Corpora Building for Minority Language...
Technological Tools for Dictionary and Corpora Building for Minority Language...Guy De Pauw
 
Semi-automated extraction of morphological grammars for Nguni with special re...
Semi-automated extraction of morphological grammars for Nguni with special re...Semi-automated extraction of morphological grammars for Nguni with special re...
Semi-automated extraction of morphological grammars for Nguni with special re...Guy De Pauw
 
Resource-Light Bantu Part-of-Speech Tagging
Resource-Light Bantu Part-of-Speech TaggingResource-Light Bantu Part-of-Speech Tagging
Resource-Light Bantu Part-of-Speech TaggingGuy De Pauw
 
Natural Language Processing for Amazigh Language
Natural Language Processing for Amazigh LanguageNatural Language Processing for Amazigh Language
Natural Language Processing for Amazigh LanguageGuy De Pauw
 
POS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik LanguagePOS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik LanguageGuy De Pauw
 
The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)Guy De Pauw
 
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...Guy De Pauw
 
Tagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News CorpusTagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News CorpusGuy De Pauw
 
A Corpus of Santome
A Corpus of SantomeA Corpus of Santome
A Corpus of SantomeGuy De Pauw
 
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...Guy De Pauw
 
Compiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTCompiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTGuy De Pauw
 
The Database of Modern Icelandic Inflection
The Database of Modern Icelandic InflectionThe Database of Modern Icelandic Inflection
The Database of Modern Icelandic InflectionGuy De Pauw
 
Learning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic ProgrammingLearning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic ProgrammingGuy De Pauw
 
Issues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken IrishIssues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken IrishGuy De Pauw
 
How to build language technology resources for the next 100 years
How to build language technology resources for the next 100 yearsHow to build language technology resources for the next 100 years
How to build language technology resources for the next 100 yearsGuy De Pauw
 
Towards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersTowards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersGuy De Pauw
 
The PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource DevelopmentThe PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource DevelopmentGuy De Pauw
 
A System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá CharactersA System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá CharactersGuy De Pauw
 
IFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation SystemIFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation SystemGuy De Pauw
 
A Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription SystemA Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription SystemGuy De Pauw
 

Mais de Guy De Pauw (20)

Technological Tools for Dictionary and Corpora Building for Minority Language...
Technological Tools for Dictionary and Corpora Building for Minority Language...Technological Tools for Dictionary and Corpora Building for Minority Language...
Technological Tools for Dictionary and Corpora Building for Minority Language...
 
Semi-automated extraction of morphological grammars for Nguni with special re...
Semi-automated extraction of morphological grammars for Nguni with special re...Semi-automated extraction of morphological grammars for Nguni with special re...
Semi-automated extraction of morphological grammars for Nguni with special re...
 
Resource-Light Bantu Part-of-Speech Tagging
Resource-Light Bantu Part-of-Speech TaggingResource-Light Bantu Part-of-Speech Tagging
Resource-Light Bantu Part-of-Speech Tagging
 
Natural Language Processing for Amazigh Language
Natural Language Processing for Amazigh LanguageNatural Language Processing for Amazigh Language
Natural Language Processing for Amazigh Language
 
POS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik LanguagePOS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik Language
 
The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)
 
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
 
Tagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News CorpusTagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News Corpus
 
A Corpus of Santome
A Corpus of SantomeA Corpus of Santome
A Corpus of Santome
 
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
 
Compiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTCompiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFST
 
The Database of Modern Icelandic Inflection
The Database of Modern Icelandic InflectionThe Database of Modern Icelandic Inflection
The Database of Modern Icelandic Inflection
 
Learning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic ProgrammingLearning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
 
Issues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken IrishIssues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken Irish
 
How to build language technology resources for the next 100 years
How to build language technology resources for the next 100 yearsHow to build language technology resources for the next 100 years
How to build language technology resources for the next 100 years
 
Towards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersTowards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound Analysers
 
The PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource DevelopmentThe PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource Development
 
A System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá CharactersA System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá Characters
 
IFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation SystemIFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation System
 
A Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription SystemA Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription System
 

Último

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 

Último (20)

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

A Computational Approach to Yoruba Morphology

  • 1. 31 March 2009 Yoruba verb morphology 1 A computational approach to Yorùbá morphology Ọdẹtúnjí A. Ọdẹjọbí Cork Constraint Computation Center (4C) Computer Science Department University College Cork Cork, Ireland SUPPORTED BY Science Foundation Ireland Grant 05/IN/I886 and Marie Curie Grant MTKD-CT-2006-042563 Raphael Finkel Computer Science Department University of Kentucky, USA SUPPORTED BY US National Science Foundation Grants IIS-0097278 and IIS-0325063 and by the University of Kentucky Center for Computational Science
  • 2. 31 March 2009 Yoruba verb morphology 2 In this presentation  Explain the output of our program for Yoruba verb morphology  Discuss how we developed the program  Discuss the significance of our efforts  State our ongoing efforts /home/odetunji/Desktop/ConferenceSlides/yoruba.utf8.html
  • 3. 31 March 2009 Yoruba verb morphology 3 Yorùbá in Brief • Edikiri language in the Niger-Congo family spoken widely in southwestern Nigeria (ISO: yor) • Many dialects, with a standard form (SY) for communication and education • 3 tones: High(H), Medium(M), Low(L) • 2 tonal contours: falling (HL) and rising (LH) • Simple verb morphology: Only one conjugation • The verb morphology is documented.
  • 4. 31 March 2009 Yoruba verb morphology 4 Our goals To generate verb forms for SY (i) realise all 160 combinations of morphosyntactic properties Tense: present, continuous, past, future Polarity: positive, negative Person: 1, 2Older, 3Older, 2Notolder, 3NotOlder Number: singular, plural Strength: normal, emphatic (ii) provide a computational description of SY verb formation
  • 5. 31 March 2009 Yoruba verb morphology 5 The KATR formalism  Based on DATR, a formalism for representing lexical knowledge by default-inheritance hierarchies (Evans & Gazdar, 1989).  Queries (such as 1 pl past) are directed to nodes that contain rules that either answer the queries or direct them to further nodes.
  • 6. 31 March 2009 Yoruba verb morphology 6 Generating Queries in KATR We declare variables to represent morphosyntactic properties 1) #vars $tense: present past continuous future . 2) #vars $polarity: positive negative . 3) #vars $person: 1 2Older 3Older 2NotOlder 3NotOlder . 4) #vars $number: sg pl . 5) #vars $strength: normal emphatic .
  • 7. 31 March 2009 Yoruba verb morphology 7 Generating multiple queries  This "show" line generates 160 queries such as:  <normal negative past 3Older sg>  <emphatic negative continuous 3Older pl>  These queries are directed to all leaf nodes, such as the "Take" node. (Node names always start with upper-case letters) #show <$strength :: $polarity :: $tense :: $person :: $number > .
  • 8. 31 March 2009 Yoruba verb morphology 8 Take: 1 <stem> = m un ´ % tone marks always follow vowels 2 {} = Verb The "Take" node  The order of rules is not significant.  The query <emphatic negative continuous 3Older pl> only matches Rule 2, which is completely unconstrained.  Rule 2 directs the query to the “Verb” node.
  • 9. 31 March 2009 Yoruba verb morphology 9 This query: <emphatic negative continuous 3Older pl> matches both rules. KATR chooses the more constraining rule (Panini's principle), that is, Rule 2. Rule 2 converts the query to <present negative emphatic 3Older pl> and directs it again to the "Verb" node. The "Verb" node Verb: 1 {} = Person Negator1 Tense Negator2 , "<stem>" Ending 2 {continuous negative} = <present negative>
  • 10. 31 March 2009 Yoruba verb morphology 10 Verb: 1 {} = Person Negator1 Tense Negator2 , "<stem>" Ending 2 {continuous negative} = <present negative> The "Verb" node, modified query This modified query: <present negative emphatic 3Older pl> matches only Rule 1, which  Represents our analysis of SY, which identifies 6 slots.  Combines the results for each slot into a single result • The results of sending the query to five different nodes. • The surface form "," which we use to create word boundaries. • The result of sending the new query "<stem>" to the starting leaf node ”Take", which returns the surface form “m un ´”
  • 11. 31 March 2009 Yoruba verb morphology 11 Person: 1 {3Older positive !future} = w ϙn ´ 2 {3Older} = w ϙn 3 {3NotOlder} = o ´ 4 {3NotOlder negative sg} = 5 {3NotOlder future} = y i ´ 6 {3NotOlder pl ++} = <3Older> ... % omitting many other rules The "Person" node This query: <present negative emphatic 3Older pl> only matches Rule 2, generating the answer “w ϙn”.
  • 12. 31 March 2009 Yoruba verb morphology 12 Negator1: 1 {negative} = , (k) o ` 2 {negative 3NotOlder sg} = k o ` 3 {} = The "Negator1" node This query: <present negative emphatic 3Older pl> matches Rules 1 and 3. KATR chooses Rule 1, generating the answer “, (k) o `”.
  • 13. 31 March 2009 Yoruba verb morphology 13 Tense: % polarity, tense 1 {} = 2 {past} = , t i 3 {continuous positive} = , n ´ 4 {future positive} = , o ̂ 5 {future 1 sg positive} = , a ̌ 6 {future 3NotOlder positive} = <future 3Older positive> The "Tense" node This query: <present negative emphatic 3Older pl> matches Rule 1, generating an empty (but valid!) output.
  • 14. 31 March 2009 Yoruba verb morphology 14 Negator2: % polarity, tense 1 {future negative} = , n i ´ 2 {past negative} = ´ i ` 3 {} = The "Negator2" node This query: <present negative emphatic 3Older pl> Matches only Rule 3, which generates an empty output.
  • 15. 31 March 2009 Yoruba verb morphology 15 Ending: 1 {} = 2 {emphatic} = ↓ The "Ending" node This query: <present negative emphatic 3Older pl> Matches both rules; KATR chooses Rule 2, which generates ↓, which is a jer for post-processing.
  • 16. 31 March 2009 Yoruba verb morphology 16 The "Verb" node assembles all the results into this surface form: w ϙn , (k) o ` , m un ´ ↓ This surface form is now treated by postprocessing rules. Postprocessing 1) #sandhi $vowel ↓ => $1 $1 ` . 2) #sandhi $vowel $tone ↓ => $1 $2 $1 . 3) #sandhi un $tone => u $1 n . % spelling 4) %(others omitted) Rules 1 and 2 remove the ↓ jer. In this case, Rule 2 applies, giving us: wwϙϙn , (k) o ` , m un ´ unn , (k) o ` , m un ´ un
  • 17. 31 March 2009 Yoruba verb morphology 17 Then Rule 3 applies, giving us wϙn , (k) o ` , m u ´ n un When we compress spaces out and replace comma with space, we get: wϙn (k)ò múnun which is the correct surface form for Take:<emphatic negative continuous 3Older pl> “They (older) are certainly not taking (that object)”
  • 18. 31 March 2009 Yoruba verb morphology 18 Implementation 1. A Perl script converts the KATR theory into  yoruba.katr.pro: a Prolog representation of the theory  yoruba.sandhi.pl: a Perl script for post-processing 2. A Prolog interpreter computes the results of all queries generated by “show” directed to all leaf nodes in the KATR theory. 3. The Perl post-processing script applies the Sandhi and other post-processing rules. 4. We then either generate textual output for direct viewing or HTML output for a browser. The KATR theory implemenation for Yoruba is available at http://www.cs.uky.edu/~raphael/KATR.html
  • 19. 31 March 2009 Yoruba verb morphology 19 Applications Linguistics: Theoretical studies of SY Pedagogy: Describing SY verbs to students Learning : Facilitating tool for teaching SY Technology: Developing software products such as spelling and grammar checkers
  • 20. 31 March 2009 Yoruba verb morphology 20 KATR instead of DATR  KATR allows sets in addition to paths on the left-hand side, so it is easy to ignore irrelevant morphosyntactic properties.  KATR is fast, so turn-around time is very short.  KATR lets us specify post-processing directly instead of embedding it in the default-inheritance hierarchy.
  • 21. 31 March 2009 Yoruba verb morphology 21  Description of slots in SY verb morphology  Six slots identified  Complete specification of the realizations of those slots  A simple use of jers to deal with the tone Sandhi of the emphatic suffix. Contributions
  • 22. 31 March 2009 Yoruba verb morphology 22 On going efforts  Evaulation: Subject out programe to further evaluation throught working with Yoruba linguists and phonologist  Expansion: Expand the rule for similar African tone languages  Exploration: Explore the generalitry of our approach and the possibility for developing genertic morphological rules
  • 23. 31 March 2009 Yoruba verb morphology 23 HELP!! Suggestions? Education? Questions?