Internal Evaluation: German to English translation
Nervo Verdezoto D.
Outline
The Problem
Objectives
Literature Review
Road Map
Software & Hardware
Baseline

Table 1. Statistics of the Dataset

                      German     English
Training  Sentences        78524
          Words       1581042    1684639
Dev       Sentences         2000
          Words         55118      58761
Test      Sentences         2000
          Words         55580      59153

de: wiederaufnahme der sitzungsperiode
en: resumption of the session
de: ich erkläre die am freitag , dem 17. dezember unterbrochene sitzungsperiode des europäischen parlaments für wiederaufgenommen , wünsche ihnen nochmals
en: i declare resumed the session of the european parliament adjourned on friday 17 december 1999 , and i would like once again to wish you a happy new year in
de: alles gute zum jahreswechsel und hoffe , daß sie schöne ferien hatten .
en: the hope that you enjoyed a pleasant festive period .

Figure 1. Sample of the training corpus
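The sentence and word counts in Table 1 can be reproduced with a simple count over a tokenized corpus. The sketch below is only an illustration, assuming one sentence per line and whitespace-separated tokens; the function name `corpus_stats` is hypothetical, and the two sample lines are taken from Figure 1.

```python
from typing import Iterable, Tuple


def corpus_stats(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sentence_count, word_count) for one side of a tokenized corpus.

    Assumes one sentence per line and whitespace-separated tokens.
    """
    sentences = 0
    words = 0
    for line in lines:
        sentences += 1
        words += len(line.split())
    return sentences, words


# Two English lines from the Figure 1 sample.
sample_en = [
    "resumption of the session",
    "the hope that you enjoyed a pleasant festive period .",
]
print(corpus_stats(sample_en))
```

For a real corpus, the same function can be called on an open file handle for each language side.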
Baseline - Results

Table 2. Performance of initial models

Measure  Model 0 (Baseline)  Model 1 (Tuning)
BLEU     23.24%              23.62%
NIST     6.5426              6.4539
WER      69.09%              70.90%
PER      18.82%              16.43%
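To make the error-rate measures in Table 2 concrete, here is a minimal sketch of WER and PER. It assumes WER is the word-level edit distance divided by the reference length, and that PER denotes position-independent error rate computed from bag-of-words overlap (one common formulation). The slides do not say which evaluation tool or exact definitions were used, so the function names and formulas are illustrative only.

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: edit distance normalized by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref: List[str], hyp: List[str]) -> float:
    """Position-independent error rate (assumed formulation): word order is ignored."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


ref = "the way forward".split()
hyp = "the forward way".split()
print(wer(ref, hyp), per(ref, hyp))
```

The example shows why the two measures can diverge: a pure reordering is penalized by WER but not by PER.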
Experiments

Setup             Description
Setup 1, 2        Filter sentences. Baseline: 40; Setup 1: 45; Setup 2: 35.
Setup 3, 4 and 5  Combination of the baseline, Setup 1 and Setup 2 filters with a lexicalized reordering model (reordering configuration msd-bidirectional-fe and distortion limit 6). Setup 3: filter (40); Setup 4: filter (45); Setup 5: filter (35).
Setup 6           I tried to split the source data, but it did not work.
Setup 7 and 8     Adding part-of-speech information to the target data (English) using a factored translation model. LM: Setup 7 (3-gram), Setup 8 (5-gram).
Setup 9           I tried to use Moses for a factored translation model on the source side (German), but it did not work. I tried to train the supertagger with a German corpus (TIGER corpus) and ran into a problem with the format of the files; see http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/annotation
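The sentence filter in Setups 1 to 5 can be sketched as follows. This assumes the values 40, 45 and 35 are maximum sentence lengths in tokens, which the slides do not state explicitly. Moses ships a `clean-corpus-n.perl` script for this kind of filtering, but whether it was used here is not stated. The function name `filter_parallel` and the `min_len` parameter are hypothetical.

```python
from typing import List, Tuple


def filter_parallel(pairs: List[Tuple[str, str]],
                    max_len: int,
                    min_len: int = 1) -> List[Tuple[str, str]]:
    """Keep sentence pairs whose token counts fall within [min_len, max_len] on both sides."""
    kept = []
    for src, tgt in pairs:
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if min_len <= n_src <= max_len and min_len <= n_tgt <= max_len:
            kept.append((src, tgt))
    return kept


pairs = [
    ("wiederaufnahme der sitzungsperiode", "resumption of the session"),
    ("wort " * 50, "word " * 50),  # artificial over-long pair for illustration
]
for limit in (35, 40, 45):
    print(limit, len(filter_parallel(pairs, max_len=limit)))
```

Filtering both sides together keeps the corpus parallel: a pair is dropped as a whole if either side is out of range.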
Factored Translation model

    nervo@nervo-laptop:~/supertagger/candc-1.00$ bin/pos --model models/pos --input data/baseline.de-en.tok.clean.en --output data/baseline.de-en.tok.clean.postag.en
    nervo@nervo-laptop:~/supertagger/candc-1.00$ bin/super --model models/super --input data/baseline.de-en.tok.clean.postag.lowercased.en --output data/baseline.de-en.tok.clean.postag.lowercased.supertag.en
    nervo@nervo-laptop:~/MOSESMT/baseline-system/baseline-system$ cat trainingcorpus/baseline.de-en.tok.clean.postag.en | perl ../../moses-scripts/lowercase.perl > trainingcorpus/baseline.de-en.tok.clean.postag.lowercased.en

Figure 3. Sample of pre-processing (supertagger)

    resumption|resumption|nn of|of|in the|the|dt session|session|nn
    i|i|prp declare|declare|vbp resumed|resumed|vbn the|the|dt session|session|nn of|of|in the|the|dt european|european|nnp parliament|parliament|nnp adjourned|adjourned|vbd on|on|in friday|friday|nnp 17|17|cd december|december|nnp 1999|1999|cd ,|,|, and|and|cc i|i|prp would|would|md like|like|vb once|once|rb again|again|rb to|to|to wish|wish|vb you|you|prp a|a|dt happy|happy|jj new|new|jj year|year|nn in|in|in
    the|the|dt hope|hope|nn that|that|in you|you|prp enjoyed|enjoyed|vbd a|a|dt pleasant|pleasant|jj festive|festive|jj period|period|nn .|.|.

Figure 4. Sample of training data with supertags

Original corpus:   resumption of the session
POS-tagged corpus: resumption|resumption|nn of|of|in the|the|dt session|session|nn
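The factored token format in Figure 4 joins the factors of each word with `|`. The sketch below builds and parses such lines. It assumes, as the sample suggests, that the second factor repeats the surface form and the third is the lowercased POS tag. The helper names `to_factored` and `parse_factored` are hypothetical and not part of Moses or the C&C tools.

```python
from typing import List, Tuple


def to_factored(words: List[str], tags: List[str]) -> str:
    """Join words and POS tags into word|word|tag tokens, as in Figure 4."""
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return " ".join(f"{w}|{w}|{t}" for w, t in zip(words, tags))


def parse_factored(line: str) -> List[Tuple[str, ...]]:
    """Split a factored line back into tuples of factors.

    Note: a token whose surface form itself contains '|' would be split
    incorrectly; such tokens need escaping before they are factored.
    """
    return [tuple(tok.split("|")) for tok in line.split()]


line = to_factored(["resumption", "of", "the", "session"],
                   ["nn", "in", "dt", "nn"])
print(line)
print(parse_factored(line))
```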
Training
Changes in moses.ini
SETUP7
Experimental results

Measure  Baseline (40)  Setup 1 (45)  Setup 2 (35)
BLEU     23.24%         23.20%        22.88%
NIST     6.5426         6.5490        6.4742
WER      69.09%         69.08%        69.58%
PER      18.82%         18.80%        18.84%

Measure  Setup 3 (40)   Setup 4 (45)  Setup 5 (35)
BLEU     23.03%         23.06%        22.56%
NIST     6.5168         6.5349        6.4485
WER      68.52%         68.29%        69.07%
PER      19.46%         19.72%        19.46%

Measure  Setup 7 (40, 3-gram)  Setup 8 (40, 5-gram)  Setup 9
BLEU     21.51%                21.59%                /
NIST     6.1699                6.1754                /
WER      73.29%                73.45%                /
PER      17.74%                17.39%                /
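The differences between setups are small, so it can help to look at each score relative to the baseline. The sketch below copies the figures from the tables above into a dictionary and computes the deltas. The structure and the helper name `delta_vs_baseline` are illustrative only.

```python
# Scores copied from the experimental results tables (BLEU, WER, PER in percent).
results = {
    "Baseline": {"BLEU": 23.24, "NIST": 6.5426, "WER": 69.09, "PER": 18.82},
    "Setup 1":  {"BLEU": 23.20, "NIST": 6.5490, "WER": 69.08, "PER": 18.80},
    "Setup 2":  {"BLEU": 22.88, "NIST": 6.4742, "WER": 69.58, "PER": 18.84},
    "Setup 3":  {"BLEU": 23.03, "NIST": 6.5168, "WER": 68.52, "PER": 19.46},
    "Setup 4":  {"BLEU": 23.06, "NIST": 6.5349, "WER": 68.29, "PER": 19.72},
    "Setup 5":  {"BLEU": 22.56, "NIST": 6.4485, "WER": 69.07, "PER": 19.46},
    "Setup 7":  {"BLEU": 21.51, "NIST": 6.1699, "WER": 73.29, "PER": 17.74},
    "Setup 8":  {"BLEU": 21.59, "NIST": 6.1754, "WER": 73.45, "PER": 17.39},
}


def delta_vs_baseline(scores: dict, metric: str, baseline: str = "Baseline") -> dict:
    """Difference of each setup from the baseline for one metric, rounded to 2 places."""
    base = scores[baseline][metric]
    return {name: round(vals[metric] - base, 2)
            for name, vals in scores.items() if name != baseline}


bleu_delta = delta_vs_baseline(results, "BLEU")
lowest_wer = min(results, key=lambda name: results[name]["WER"])
print(bleu_delta)
print("Lowest WER:", lowest_wer)
```

Higher is better for BLEU and NIST, lower is better for WER and PER, so the sign of a delta has to be read per metric.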
Model      Example
REFERENCE  he wanted the presidency to outline the way forward at nice .
BASELINE   he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** AHEAD AUFZEIGT
TUNING     he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** AHEAD AUFZEIGT
SETUP1     he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** *** AHEAD
SETUP2     he HAS EXPRESSED the WISH THAT THE PRESIDENCY IN NICE way AUFZEIGT THE FUTURE
SETUP3     he HAS EXPRESSED THE WISH THAT the presidency IN NICE , the way *** AHEAD AUFZEIGT
SETUP4     he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** *** AHEAD
SETUP5     he HAS EXPRESSED THE WISH THAT the presidency IN NICE , THE FUTURE PATH AUFZEIGT
SETUP6     --
SETUP7     he HAS EXPRESSED THE WISH THAT the presidency IN NICE , THE FUTURE PATH SHOWS
SETUP8     he HAS EXPRESSED THE WISH THAT the presidency IN NICE , the way *** AHEAD SHOWS
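As a rough illustration of how such an output scores against the reference, the sketch below computes clipped unigram precision, one of the ingredients of BLEU. It assumes that uppercase marks words that differ from the reference in the alignment display and that `***` marks an alignment gap, so the system output is recovered by lowercasing and dropping `***`. This reading of the display is an assumption, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def recover_hypothesis(aligned: str) -> List[str]:
    """Lowercase the aligned display and drop '***' gap markers (assumed convention)."""
    return [tok.lower() for tok in aligned.split() if tok != "***"]


def clipped_unigram_precision(reference: List[str],
                              hypothesis: List[str]) -> Tuple[int, int]:
    """Return (clipped matches, hypothesis length); each word counts at most as often as in the reference."""
    ref_counts = Counter(reference)
    hyp_counts = Counter(hypothesis)
    matches = sum(min(count, ref_counts[word]) for word, count in hyp_counts.items())
    return matches, len(hypothesis)


reference = "he wanted the presidency to outline the way forward at nice .".split()
baseline = recover_hypothesis(
    "he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** AHEAD AUFZEIGT"
)
matches, total = clipped_unigram_precision(reference, baseline)
print(f"{matches}/{total} = {matches / total:.3f}")
```

Full BLEU also combines higher-order n-gram precisions and a brevity penalty over the whole test set, so this single-sentence figure is not comparable to the percentages in the results tables.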
Summary and Conclusion

Internal Evaluation for a MT System, German to English


Editor's Notes

  1. Let's now move to the problem.
  2. The problem addressed in this paper concerns coordinative and prepositional syntactic ambiguity in Spanish, since Spanish is considered a complex language because of its variable structure and its differing grammatical rules.
  3. So, turning to the objectives.
  4. This paper proposes a method to resolve coordinative and prepositional syntactic ambiguity in written natural-language text. The main aims are: to decrease the number of syntactic representations of a phrase; to define a set of heuristic rules to identify and resolve this type of ambiguity; and to implement this method of syntactic disambiguation for Spanish in Python (Natural Language Toolkit, NLTK).
  5. So, turning to the objectives.
  6. Next, I will give you a brief explanation of the implementation of this method.
  7. Next, I will give you a brief explanation of the implementation of this method.
  8. Next, I will give you a brief explanation of the implementation of this method.
  9. Next, I will give you a brief explanation of the implementation of this method.
  10. Next, I will give you a brief explanation of the implementation of this method.
  11. For the extension of the work!
  12. For the extension of the work!