Selecting Proper Lexical Paraphrase for Children
Tomoyuki Kajiwara 
Hiroshi Matsumoto 
Kazuhide Yamamoto 
Nagaoka University of Technology
Lexical Paraphrase for Children
• Newspaper for Adults → Newspaper for Children
– Newspaper for Adults: 大詰めの大一番 (Big match of the final stage)
– Newspaper for Children: 最後の大一番 (Big match of the last)
• Elementary school Japanese dictionary
– 【大詰め: final stage】芝居の最後の場面 (The last scene of the play)
– The target word is selected by its similarity to the headword
• Vocabulary sizes
– Total annual number of vocabulary: 200,000 words
– Basic Vocabulary to Learn: 5,404 words
BVL : Basic Vocabulary to Learn
• General Vocabulary (GV)
– Vocabulary that is registered in the general dictionary
• Vocabulary to Learn (VL): 25,000 words
– Vocabulary that is registered in the elementary school dictionary
• Basic Vocabulary to Learn (BVL): 5,404 words
– Vocabulary that elementary school students can use sufficiently
• Basic Vocabulary: 2,000 words
– The minimum vocabulary necessary for daily living
Paraphrase to BVL from GV and VL
– Reading assistance for elementary school students
Related Works
• Paraphrasing with a dictionary
– headword → headword
• Fujita et al. (2000), Mino and Tanaka (2011)
– headword → word from the end of the definition statement
• Kaji et al. (2002), Mino and Tanaka (2011), Kajiwara and Yamamoto (2013)
"The definition statements are simpler than the headwords"
"The last segment represents the meaning of the headword"
Problem of Related Works
Definition
【大詰め】芝居の最後の場面
【final stage】the last scene of the play
Paraphrase
 ✕ 大詰めの大一番 → 場面の大一番
 Big match of the final stage → Big match of the scene
 ✔ 大詰めの大一番 → 最後の大一番
 Big match of the final stage → Big match of the last
Appropriate target words are not always found at the end of definitions
Proposed Method
Proposed Method (1/2)
• Acquisition of the Target Word Candidates
① The difficult word is extracted from the original sentence
② The dictionary entries of the difficult word are searched
③ Words are extracted from the definitions if they have the same part of speech as the difficult word
Example
• Original sentence: ... professor ...
• Japanese dictionary entries
– 【professor】People of status as professor.
– 【professor】Status as professor.
– 【professor】Teach learning and skill.
– 【professor】University teacher.
• Extracted candidates: People, Status, Professor, Learning, Skill, University, Teacher
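Steps ① to ③ can be sketched as below. This is only an illustration: the slides do not show the implementation, so the pre-tokenized `toy_dictionary`, the `(surface, part_of_speech)` token format, and the function name `acquire_candidates` are all assumptions, and English glosses stand in for the Japanese entries.

```python
from typing import Dict, List, Tuple

# Hypothetical format: headword -> list of definitions,
# each definition a list of (surface, part_of_speech) pairs.
Token = Tuple[str, str]
Dictionary = Dict[str, List[List[Token]]]


def acquire_candidates(difficult_word: str, pos: str, dictionary: Dictionary) -> List[str]:
    """Collect every word, from every definition, that shares the POS of the difficult word."""
    candidates: List[str] = []
    for definition in dictionary.get(difficult_word, []):
        for surface, token_pos in definition:
            # Keep first-seen order and skip repeats across definitions.
            if token_pos == pos and surface not in candidates:
                candidates.append(surface)
    return candidates


# Toy data mirroring the four "professor" entries on the slide (POS tags are assumed).
toy_dictionary: Dictionary = {
    "professor": [
        [("people", "noun"), ("of", "particle"), ("status", "noun"),
         ("as", "particle"), ("professor", "noun")],
        [("status", "noun"), ("as", "particle"), ("professor", "noun")],
        [("teach", "verb"), ("learning", "noun"), ("and", "particle"), ("skill", "noun")],
        [("university", "noun"), ("teacher", "noun")],
    ]
}

print(acquire_candidates("professor", "noun", toy_dictionary))
```

Unlike the end-of-definition approach of the related work, every position in every definition contributes candidates here, which is why the headword itself and non-final nouns such as "people" also appear in the list.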
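Proposed Method (2/2)
• Selection of the Proper Target Word
④ Simple words (those in the Basic Vocabulary to Learn) are extracted from the candidates
⑤ The similarity of meaning between each simple word and the difficult word is calculated
⑥ The simple word with the highest similarity is selected
Example
• Candidates: People, Status, Professor, Learning, Skill, University, Teacher
• Simple words in the Basic Vocabulary to Learn: People, Learning, University, Skill, Teacher
• Similarities, in slide order: 0.17, 0.11, 0.08, 0.13, 0.25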
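Steps ④ to ⑥ amount to a filter followed by an argmax. The sketch below assumes a similarity function is supplied by the caller; the slides name a WordNet-based similarity but not the exact measure, so `toy_similarity` is a stand-in lookup table. Pairing each score with a word in listing order is also an assumption, since the extracted slide text does not preserve the alignment.

```python
from typing import Callable, Iterable, Optional, Set


def select_target(
    difficult_word: str,
    candidates: Iterable[str],
    simple_vocabulary: Set[str],
    similarity: Callable[[str, str], float],
) -> Optional[str]:
    """Return the simple candidate most similar to the difficult word, or None."""
    # Step 4: keep only candidates that belong to the simple vocabulary.
    simple = [w for w in candidates if w in simple_vocabulary and w != difficult_word]
    if not simple:
        return None
    # Steps 5 and 6: score each simple word and keep the best one.
    # max() returns the first maximum, so ties are resolved by list order.
    return max(simple, key=lambda w: similarity(difficult_word, w))


# Assumed pairing of the slide's scores with the slide's simple words.
toy_scores = {"people": 0.17, "learning": 0.11, "university": 0.08,
              "skill": 0.13, "teacher": 0.25}


def toy_similarity(difficult_word: str, candidate: str) -> float:
    # Stand-in for a thesaurus-based measure; ignores the difficult word.
    return toy_scores.get(candidate, 0.0)


toy_bvl = set(toy_scores)
toy_candidates = ["people", "status", "professor", "learning",
                  "skill", "university", "teacher"]

print(select_target("professor", toy_candidates, toy_bvl, toy_similarity))
```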
Experiments
Comparative Methods
• Acquisition of the Target Word Candidates
– One word is extracted from the end of the definition statements if it has the same part of speech as the difficult word
• Selection of the Proper Target Word
– Weighted voting over the following methods
• Frequency
• Co-occurrence frequency
• Point-wise Mutual Information
• Tri-gram frequency
• Cosine similarity between document vectors
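The slides do not spell out the voting scheme, so the following is one plausible reading rather than the authors' procedure: each method casts a single vote for its top-scoring candidate, and votes are summed with per-method weights. The scorer tables, the weights, and the name `weighted_vote` are invented for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Scorer = Callable[[str], float]


def weighted_vote(
    candidates: List[str],
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
) -> str:
    """Each scorer votes for its favourite candidate; a missing weight counts as 1.0."""
    tally: Dict[str, float] = defaultdict(float)
    for name, scorer in scorers.items():
        favourite = max(candidates, key=scorer)
        tally[favourite] += weights.get(name, 1.0)
    # Ties in the tally fall back to the order of the candidate list.
    return max(candidates, key=lambda c: tally[c])


# Invented scores for two candidates of the 大詰め example.
toy_candidates = ["scene", "last"]
toy_scorers: Dict[str, Scorer] = {
    "frequency": {"scene": 120.0, "last": 300.0}.get,
    "pmi": {"scene": 1.2, "last": 0.9}.get,
    "trigram": {"scene": 3.0, "last": 5.0}.get,
}
toy_weights = {"frequency": 0.5, "pmi": 2.0, "trigram": 1.0}

print(weighted_vote(toy_candidates, toy_scorers, toy_weights))  # weighted
print(weighted_vote(toy_candidates, toy_scorers, {}))           # weightless
```

With these made-up numbers the weighted and weightless votes disagree, which shows why the weights need to be tuned on held-out words before the vote is evaluated.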
Experimental Setup
• Test set: 152 difficult words that
– do not appear in BVL
– appear more than 50 times in the Mainichi Newspaper published in 2000
– include paraphrasable simple words in their definition statements
• Dictionaries: three Japanese dictionaries
• Thesaurus: Japanese WordNet
Procedure (1/2)
• Experiments on 52 difficult words
– Decide the weights
• Experiments on 100 difficult words
– Weighted voting
• Evaluation
– Three evaluators judge each paraphrase
– The final judgement is decided by majority vote
– Definition of "paraphrasable": the simple word can replace the difficult word in the original sentence
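The majority-vote evaluation can be written in a few lines. This is a minimal sketch that assumes each evaluator gives a yes/no judgement; the function name is hypothetical.

```python
from typing import Sequence


def majority_paraphrasable(judgements: Sequence[bool]) -> bool:
    """True when more than half of the evaluators accept the paraphrase."""
    return sum(judgements) * 2 > len(judgements)


print(majority_paraphrasable([True, True, False]))
print(majority_paraphrasable([True, False, False]))
```

With three evaluators a tie is impossible, so every paraphrase receives a definite label.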
Procedure (2/2)
① The difficult word is extracted
• Original sentence: ... professor ...
② Entries of "professor" are searched in the Japanese dictionary
– 【professor】People of status as professor.
– 【professor】Status as professor.
– 【professor】Teach learning and skill.
– 【professor】University teacher.
③ Nouns are extracted
• People, Status, Professor, Learning, Skill, University, Teacher
④ Simple words are extracted
• In the Basic Vocabulary to Learn: People, Learning, University, Skill, Teacher
⑤ Similarities of meaning are calculated
• In slide order: 0.17, 0.11, 0.08, 0.13, 0.25
Result (1/3)
• Acquisition of the Target Word Candidates
– More paraphrasable simple words are acquired
– The difference is only 3.2 points

Method        Number of paraphrasable words   Percentage of paraphrasable words
Proposed      165 / 221                       74.7 %
Comparative   158 / 221                       71.5 %

Many paraphrasable simple words appear at the end of definition statements
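The percentages and the 3.2-point gap follow directly from the counts in the table. The short check below recomputes them; the helper name `percentage` is only for illustration.

```python
def percentage(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100 * part / total, 1)


proposed = percentage(165, 221)
comparative = percentage(158, 221)
difference = round(proposed - comparative, 1)

print(proposed, comparative, difference)
```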
Result (2/3)
[Bar chart, axis from 0 to 70; the bar values are not recoverable from the extracted text]
Series: acquisition by comparative method, acquisition by proposed method
Selection methods compared
【Baseline】Randomness
【Proposed】WordNet-similarity
(1) Frequency
(2) Co-occurrence frequency
(3) Point-wise Mutual Information
(4) Tri-gram frequency
(5) Cosine similarity
Result (3/3)
[Bar chart, axis from 0 to 70; the bar values are not recoverable from the extracted text]
Series: acquisition by comparative method, acquisition by proposed method
Selection methods compared
【Baseline】Randomness
【Proposed】WordNet-similarity
A) Weightless voting by comparative methods (1)-(5)
B) Weighted voting by comparative methods (1)-(5)
C) Weightless voting that adds the WordNet-similarity to A)
D) Weighted voting that adds the WordNet-similarity to B)
Erroneous Examples (1/2)
• Two or more simple words have the highest similarity
Example
• Original: A summary of the main points.
• Definition: 【Points】essential, score, game, spot
– essential: similarity 1.0
– score: similarity 1.0
– game: similarity 1.0
– spot: similarity 1.0
Methods using frequency or context information selected a paraphrasable word
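A tie like this one can at least be detected before a word is chosen. The sketch below is not part of the proposed method: it assumes a fallback score (here, invented frequency counts) is used only when several candidates share the top similarity, and the outcome depends entirely on those made-up counts.

```python
from typing import Dict, List


def top_candidates(scores: Dict[str, float], tolerance: float = 1e-9) -> List[str]:
    """All candidates whose score is within `tolerance` of the best score."""
    best = max(scores.values())
    return [word for word, score in scores.items() if best - score <= tolerance]


def select_with_fallback(scores: Dict[str, float], fallback: Dict[str, float]) -> str:
    """Pick the unique best candidate, or break a tie with a secondary score."""
    tied = top_candidates(scores)
    if len(tied) == 1:
        return tied[0]
    return max(tied, key=lambda word: fallback.get(word, 0.0))


# Similarities from the slide; the frequency counts are hypothetical.
similarities = {"essential": 1.0, "score": 1.0, "game": 1.0, "spot": 1.0}
hypothetical_frequency = {"essential": 40.0, "score": 25.0, "game": 30.0, "spot": 10.0}

print(top_candidates(similarities))
print(select_with_fallback(similarities, hypothetical_frequency))
```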
Erroneous Examples (2/2)
• A non-paraphrasable word has the highest similarity
Example
• Original: I can play the program during recording.
• Definition: 【Play】Use the garbage again. What was gone once again regains power and life.
– use: paraphrasable, similarity 0.8
– power: non-paraphrasable, similarity 1.0
Methods using frequency or context information selected a paraphrasable word
Conclusion
We paraphrase a difficult word into the simple word with the highest similarity, using the whole definition statements
• Acquisition of the Target Word Candidates
– More paraphrasable simple words are acquired
– Many of them appear at the end of definitions
• Selection of the Proper Target Word
– Selection based on similarity is better than selection by frequency or context information


Selecting Proper Lexical Paraphrase for Children

  • 1. Selecting Proper Lexical Paraphrase for Children Tomoyuki Kajiwara Hiroshi Matsumoto Kazuhide Yamamoto Nagaoka University of Technology
  • 2. Lexical Paraphrase for Children • Elementary school Japanese dictionary: 【大詰め: final stage】 芝居の最後の場面 (the last scene of the play) • Newspaper for adults: 大詰めの大一番 (big match of the final stage); total annual number of vocabulary: 200,000 words • Newspaper for children, written with the Basic Vocabulary to Learn (5,404 words): 最後の大一番 (big match of the last) • The target word is selected by its similarity to the headword
  • 3. BVL: Basic Vocabulary to Learn • General Vocabulary (GV): vocabulary registered in the general dictionary • Vocabulary to Learn (VL), 25,000 words: vocabulary registered in the elementary school dictionary • Basic Vocabulary to Learn (BVL), 5,404 words: vocabulary that elementary school students can use sufficiently • Basic Vocabulary, 2,000 words: the minimum vocabulary necessary for daily living • Goal: paraphrase from GV and VL into BVL, as reading assistance for elementary school students
  • 4. Related Works • Paraphrasing using a dictionary: (a) headword → headword: Fujita et al. (2000), Mino and Tanaka (2011); (b) headword → word from the end of the definition statement: Kaji et al. (2002), Mino and Tanaka (2011), Kajiwara and Yamamoto (2013) • "The definition statements are simpler than the headwords" • "The last segment represents the meaning of the headword"
  • 5. Problem of Related Works • Definition: 【大詰め】 芝居の最後の場面 (【final stage】 the last scene of the play) • Paraphrase: ✕ 大詰めの大一番 → 場面の大一番 (Big match of the final stage → Big match of the scene); ✔ 大詰めの大一番 → 最後の大一番 (Big match of the final stage → Big match of the last) • Appropriate target words are not always found at the end of definitions
  • 7. Proposed Method (1/2) • Acquisition of the Target Word Candidates: ① The difficult word is extracted from the original sentence ② Dictionary entries for the difficult word are searched ③ Words are extracted from the definitions if they have the same part of speech as the difficult word • Example: original sentence "・・・ professor ・・・"; Japanese dictionary entries: 【professor】 People of status as professor. 【professor】 Status as professor. 【professor】 Teach learning and skill. 【professor】 University teacher.; extracted words: People, Status, Professor, Learning, Skill, University, Teacher
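Steps ① to ③ on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy dictionary, its part-of-speech tags, and the function name `acquire_candidates` are all assumptions made for the example, and a real system would need a morphological analyzer to tag Japanese definition statements.

```python
# Toy dictionary (hypothetical): headword -> list of definitions,
# each definition being a list of (word, part-of-speech) tokens.
TOY_DICTIONARY = {
    "professor": [
        [("people", "NOUN"), ("of", "ADP"), ("status", "NOUN"),
         ("as", "ADP"), ("professor", "NOUN")],
        [("status", "NOUN"), ("as", "ADP"), ("professor", "NOUN")],
        [("teach", "VERB"), ("learning", "NOUN"), ("and", "CCONJ"),
         ("skill", "NOUN")],
        [("university", "NOUN"), ("teacher", "NOUN")],
    ],
}


def acquire_candidates(difficult_word, pos, dictionary):
    """Collect every word, from all definitions of difficult_word,
    that has the same part of speech as the difficult word."""
    candidates = []
    for definition in dictionary.get(difficult_word, []):
        for word, word_pos in definition:
            # Keep first-seen order and skip duplicates.
            if word_pos == pos and word not in candidates:
                candidates.append(word)
    return candidates


candidates = acquire_candidates("professor", "NOUN", TOY_DICTIONARY)
print(candidates)
```

Unlike an approach that looks only at the last word of a definition, this scan keeps every same-part-of-speech word from every entry, which is the point the slide makes.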
  • 8. Proposed Method (2/2) • Selection of the Proper Target Word: ④ Simple words (those in the Basic Vocabulary to Learn) are extracted from the candidates ⑤ Similarities of meaning are calculated ⑥ The simple word with the highest similarity is selected • Example: candidates: People, Status, Professor, Learning, Skill, University, Teacher; simple words in the Basic Vocabulary to Learn: People, Learning, University, Skill, Teacher; similarity scores shown: 0.17, 0.11, 0.08, 0.13, 0.25
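Steps ④ to ⑥ might look like the sketch below. The similarity values reuse the five numbers on the slide, but which score belongs to which word is an assumption, since the slide layout was lost in the transcript. The lookup table `SIMILARITY` stands in for a thesaurus-based measure; `select_target` is a hypothetical name.

```python
# Simple words, taken from the slide's Basic Vocabulary to Learn example.
BASIC_VOCABULARY = {"people", "learning", "university", "skill", "teacher"}

# Assumed pairing of the slide's scores to words (illustrative only).
SIMILARITY = {
    "people": 0.17,
    "learning": 0.11,
    "skill": 0.08,
    "university": 0.13,
    "teacher": 0.25,
}


def select_target(candidates, simple_vocabulary, similarity):
    """Return the simple candidate with the highest similarity, or None."""
    # Step 4: keep only candidates that are simple words.
    simple = [w for w in candidates if w in simple_vocabulary]
    if not simple:
        return None
    # Steps 5 and 6: score each simple word and take the best one.
    # On a tie, max() returns the first candidate with the top score.
    return max(simple, key=lambda w: similarity.get(w, 0.0))


candidates = ["people", "status", "professor", "learning",
              "skill", "university", "teacher"]
target = select_target(candidates, BASIC_VOCABULARY, SIMILARITY)
print(target)
```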
  • 10. Comparative Methods • Acquisition of the Target Word Candidates: one word is extracted from the end of the definition statements, if it has the same part of speech as the difficult word • Selection of the Proper Target Word: weighted voting by the following methods: (1) Frequency (2) Co-occurrence frequency (3) Point-wise Mutual Information (4) Tri-gram frequency (5) Cosine similarity between document vectors
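The slide does not spell out the voting scheme, so the following is only one plausible reading: each method nominates its preferred candidate, each nomination counts with that method's weight, and the candidate with the largest total wins. The method names, choices, and weights below are made up for illustration.

```python
from collections import defaultdict


def weighted_vote(method_choices, weights):
    """Sum the weight of every method behind each candidate and
    return the winning candidate together with all totals."""
    totals = defaultdict(float)
    for method, choice in method_choices.items():
        # A method with no listed weight counts as one plain vote.
        totals[choice] += weights.get(method, 1.0)
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


# Hypothetical nominations from the five comparative methods.
choices = {
    "frequency": "scene",
    "cooccurrence": "last",
    "pmi": "last",
    "trigram": "scene",
    "cosine": "last",
}
# Hypothetical weights; the slides say they are tuned on held-out words.
weights = {"frequency": 0.5, "cooccurrence": 1.0, "pmi": 1.5,
           "trigram": 0.5, "cosine": 1.0}

winner, totals = weighted_vote(choices, weights)
print(winner, totals)
```

Passing an empty `weights` dictionary gives every method one vote, which corresponds to an unweighted vote.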
  • 11. Experimental Setup • Experimental targets: 152 difficult words that do not appear in BVL, appear more than 50 times in the Mainichi Newspaper published in 2000, and include paraphrasable simple words in their definition statements • Dictionary: three Japanese dictionaries • Thesaurus: Japanese WordNet
  • 12. Procedure (1/2) • Experiments on 52 of the difficult words: decide the weights • Experiments on the remaining 100 difficult words: weighted voting • Evaluation: three evaluators judge each paraphrase, and the result is decided by majority vote • Definition of "paraphrasable": the simple word can replace the difficult word in the original sentence
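The majority-vote evaluation described on this slide reduces to a few lines. This is a sketch under the assumption that each evaluator gives a yes/no judgement; the function name is hypothetical.

```python
def majority_paraphrasable(judgements):
    """Return True when more than half of the evaluators judged
    the substitution to be paraphrasable."""
    return sum(judgements) > len(judgements) / 2


# Three evaluators, as in the slide; True means "paraphrasable".
print(majority_paraphrasable([True, True, False]))
print(majority_paraphrasable([False, True, False]))
```

With three evaluators a tie cannot occur, so the strict "more than half" comparison is enough.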
  • 13. Procedure (2/2) ① The difficult word is extracted from the original sentence ("・・・ professor ・・・") ② Entries for "professor" are searched in the Japanese dictionary: 【professor】 People of status as professor. 【professor】 Status as professor. 【professor】 Teach learning and skill. 【professor】 University teacher. ③ Nouns are extracted: People, Status, Professor, Learning, Skill, University, Teacher ④ Simple words are extracted using the Basic Vocabulary to Learn: People, Learning, University, Skill, Teacher ⑤ Similarities of meaning are calculated: 0.17, 0.11, 0.08, 0.13, 0.25
  • 14. Result (1/3) • Acquisition of the Target Word Candidates: the proposed method acquires more paraphrasable simple words, but the difference is only 3.2 points • Number (percentage) of paraphrasable words: Proposed 165 / 221 (74.7 %); Comparative 158 / 221 (71.5 %) • Many paraphrasable simple words appear at the end of definition statements
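The percentages and the 3.2-point gap on this slide can be checked directly from the reported counts. The helper name `percentage` is introduced here only for the check.

```python
def percentage(count, total):
    """Percentage rounded to one decimal place, as on the slide."""
    return round(100 * count / total, 1)


proposed = percentage(165, 221)
comparative = percentage(158, 221)
# Difference in percentage points between the two methods.
difference = round(proposed - comparative, 1)
print(proposed, comparative, difference)
```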
  • 15. Result (2/3) [Bar chart, axis from 0 to 70. Bars: 【Baseline】 Randomness; 【Proposed】 WordNet-similarity; (1) Frequency; (2) Co-occurrence frequency; (3) Point-wise Mutual Information; (4) Tri-gram frequency; (5) Cosine similarity. Series: acquisition by comparative method; acquisition by proposed method. Bar values are not included in the transcript.]
  • 16. Result (3/3) [Bar chart, axis from 0 to 70. Bars: 【Baseline】 Randomness; 【Proposed】 WordNet-similarity; A) Weightless voting by comparative methods (1)-(5); B) Weighted voting by comparative methods (1)-(5); C) Weightless voting that adds the WordNet-similarity to A); D) Weighted voting that adds the WordNet-similarity to B). Series: acquisition by comparative method; acquisition by proposed method. Bar values are not included in the transcript.]
  • 17. Erroneous Examples (1/2) • Two or more simple words have the highest similarity • Example: Original: A summary of the main points. Definition: 【Points】 essential, score, game, spot. Similarities: essential 1.0, score 1.0, game 1.0, spot 1.0 • The methods using frequency or context information selected the paraphrasable word
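The tie in this example is easy to detect programmatically. The sketch below uses the similarity values from the slide; `top_candidates` is a hypothetical helper, and what to do once a tie is found (for instance, falling back to a frequency-based method) is left open.

```python
def top_candidates(similarity):
    """Return every candidate that shares the highest similarity."""
    best = max(similarity.values())
    return [word for word, score in similarity.items() if score == best]


# Similarities for the candidates of "points", as given on the slide.
points_similarity = {"essential": 1.0, "score": 1.0, "game": 1.0, "spot": 1.0}

tied = top_candidates(points_similarity)
print(tied)
# More than one entry means similarity alone cannot pick a target word.
print(len(tied) > 1)
```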
  • 18. Erroneous Examples (2/2) • A non-paraphrasable word has the highest similarity • Example: Original: I can play the program during recording. Definition: 【Play】 Use the garbage again. What was gone once again regains power and life. Candidates: use (paraphrasable, similarity 0.8); power (non-paraphrasable, similarity 1.0) • The methods using frequency or context information selected the paraphrasable word
  • 19. Conclusion • We paraphrase a difficult word into the simple word with the highest similarity, using the whole of the definition statements • Acquisition of the Target Word Candidates: more paraphrasable simple words are acquired; many of them appear at the end of definitions • Selection of the Proper Target Word: selection based on similarity is better than selection by frequency or context information