Statistical Machine Translation
(SMT) for Indian Language
Presented By:
Nakul Sharma, Parteek Bhatia.
Thapar University, Patiala.
Main Agenda
• Introduction to SMT.
• Tools.
• Popular Machine Translation Systems.
• Machine Translation Projects in India.
• Machine Translation Tools and Punjabi
Language.
• Conclusion and future work.
• References.
Introduction
• SMT is a form of corpus-based machine translation.
• System consists of 3 components:
– Language Model (LM).
– Translation Model (TM).
– Decoder.
System Architecture
[Figure: system architecture. Source sentence S, Decoder, target sentence T; the decoder draws on the Language Model P(T) and the Translation Model P(S|T).]
Language Model (LM)
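• Gives the probability of a sentence, computed from the probability of each word given the words that precede it.
• N-gram model: the history is truncated to the previous N-1 words.
• P(s) = P(w_1, w_2, \ldots, w_n)
       = P(w_1) P(w_2 | w_1) P(w_3 | w_1 w_2) P(w_4 | w_1 w_2 w_3) \cdots P(w_n | w_1 w_2 \ldots w_{n-1}).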
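The chain rule above can be illustrated with a minimal bigram sketch, in which each word is conditioned only on the previous word. This is an assumption made for illustration: the slides do not say which N or which smoothing the authors used, and the toy corpus and function names below are hypothetical. Probabilities are plain maximum-likelihood estimates with no smoothing, so unseen bigrams get probability 0.

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Count histories and bigrams over tokenized sentences with boundary markers."""
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        histories.update(padded[:-1])                 # every token that acts as a history
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous word, word) pairs
    return histories, bigrams

def sentence_probability(tokens, histories, bigrams):
    """P(s) as a product of P(w_i | w_{i-1}), using unsmoothed relative frequencies."""
    padded = ["<s>"] + tokens + ["</s>"]
    prob = 1.0
    for prev, word in zip(padded[:-1], padded[1:]):
        if histories[prev] == 0:
            return 0.0  # history never seen in training
        prob *= bigrams[(prev, word)] / histories[prev]
    return prob

# Hypothetical two-sentence training corpus.
corpus = [["she", "slept"], ["she", "slept", "in", "garden"]]
histories, bigrams = train_bigram_lm(corpus)
p_seen = sentence_probability(["she", "slept"], histories, bigrams)
p_unseen = sentence_probability(["garden", "slept"], histories, bigrams)
```

Toolkits such as SRILM add smoothing and back-off on top of this counting step, so that unseen n-grams do not zero out the whole sentence.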
Translation Model (TM)
• Computes the conditional probability P(S|T).
• Breaks the translation process into smaller units (words, phrases, ...).
• Here T: target language, S: source language.
• For example: (aUH baag wYWch s/UN gaYI | she slept in garden).
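As a sketch of what "smaller units" can mean at the word level, the code below estimates word translation probabilities t(s | t) with expectation-maximization in the style of IBM Model 1. The slides do not state which model the authors trained, so this choice is an assumption; the NULL word is omitted for brevity, and the tokens s1, s2, t1, t2 are hypothetical placeholders rather than real data.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """Estimate t(s | t) from (source_tokens, target_tokens) pairs with EM."""
    source_vocab = {s for src, _ in pairs for s in src}
    # Uniform initialisation over the source vocabulary.
    t = defaultdict(lambda: 1.0 / len(source_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (source, target) link
        total = defaultdict(float)  # expected count of each target word
        # E-step: distribute each source word over the target words of its sentence.
        for src, tgt in pairs:
            for s in src:
                norm = sum(t[(s, w)] for w in tgt)
                for w in tgt:
                    frac = t[(s, w)] / norm
                    count[(s, w)] += frac
                    total[w] += frac
        # M-step: renormalise the expected counts per target word.
        for (s, w), c in count.items():
            t[(s, w)] = c / total[w]
    return t

# Hypothetical toy parallel corpus: the second pair pins s1 to t1.
pairs = [(["s1", "s2"], ["t1", "t2"]), (["s1"], ["t1"])]
t = train_ibm_model1(pairs)
```

Because s1 also appears alone with t1, EM shifts probability mass towards the links s1-t1 and s2-t2, which is the word-alignment behaviour that tools such as GIZA++ provide at scale.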
Decoder
• Searches for the target sentence T that maximizes P(T|S), i.e.
– \hat{T} = \arg\max_{T} P(T) P(S|T).
• Starts with the null hypothesis, i.e. the search starts from an empty target sequence.
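The argmax over P(T) P(S|T) can be sketched by scoring a small, fixed list of candidate translations. This is a simplification: real decoders such as Moses build hypotheses incrementally instead of enumerating whole sentences. The probability tables and the source label "S" below are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical language-model probabilities P(T) for two candidate outputs.
LM = {
    "she slept in garden": 0.02,
    "slept she garden in": 0.0001,
}
# Hypothetical translation-model probabilities P(S | T) for one source sentence "S".
TM = {
    ("S", "she slept in garden"): 0.3,
    ("S", "slept she garden in"): 0.4,
}

def score(source, target):
    """log P(T) + log P(S | T); log space avoids underflow on long sentences."""
    return math.log(LM[target]) + math.log(TM[(source, target)])

def decode(source, candidates):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda target: score(source, target))

best = decode("S", list(LM))
```

In this toy setting the translation model slightly prefers the scrambled candidate, but the language model outweighs it, so the fluent word order wins.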
Tools for SMT
• LM Tools
– CMU Statistical Language Modeling (SLM) Toolkit
– SRILM
• TM Tools
– GIZA++
– MGIZA++
• Decoder
– Moses
– ISI ReWrite Decoder
– Pharaoh
LM Tools
• CMU Statistical Language Modeling (SLM)
Toolkit.
– Set of Unix software tools.
– Written by Roni Rosenfeld.
• SRILM
– Developed by the SRI Speech Technology and Research
Laboratory.
– Used for building and applying language models.
Architecture for LM
[Figure: architecture of LM.]
TM Tools
• GIZA++
– Implements several word-alignment models, including the IBM models and the HMM model.
– Performs word alignment.
• MGIZA++
– Multi-threaded word alignment
– Memory optimization.
Format of the GIZA++ t3.final file (translation table):
First column: IDs of source words.
Second column: IDs of target words.
Third column: translation probability of the word pair.
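A minimal sketch of reading such a three-column table into a dictionary follows. It assumes whitespace-separated fields in the order described above (source ID, target ID, probability); the sample lines are hypothetical and do not come from a real training run.

```python
def parse_t3_final(lines):
    """Map (source_id, target_id) to probability, skipping malformed lines."""
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # blank or malformed line
        source_id, target_id = int(fields[0]), int(fields[1])
        table[(source_id, target_id)] = float(fields[2])
    return table

# Hypothetical sample content in the three-column layout.
sample = ["2 5 0.75", "2 7 0.25", "", "3 5 1.0"]
table = parse_t3_final(sample)
```

To read a file from disk, the same function can be given an open file handle, since iterating over a file yields its lines. The word IDs would then be looked up in the vocabulary files written alongside the table.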
Decoder Tools
• Moses
– Automatic training of translation models for any
language pair.
– Works with SRILM and GIZA++.
• ISI ReWrite Decoder
– Performs the search (decoding) step of an SMT system.
– Works with the CMU Statistical Language Modeling
toolkit and GIZA++.
Popular Machine Translation
Systems
• Google Translator.
• Bing Translator.
• Systran.
• Hindi to Punjabi Machine Translation System.
• METAL.
Machine Translation Projects in
India
• Anglabharti and Anubharti
• Anusaaraka
• MaTra
• Mantra
• UCSG-based English-Kannada MT
• UNL-based MT between English, Hindi and
Marathi
• Tamil-Hindi Anusaaraka and English-Tamil MT
• English-Hindi SMT.
Machine Translation Tools and
Punjabi Language
• Punjabi University.
– Online Hindi-Punjabi and Punjabi-Hindi
Machine Translation.
• Thapar University.
– Punjabi language server, which includes the
Punjabi-UNL EnConverter and the UNL-Punjabi
DeConverter.
Conclusion and Future Work
• There are applications supporting regional language translation.
• Future research directions include tree-to-string alignment templates and clause-based
restructuring.
• Combining various MT techniques can lead to more efficient translation.
Contact Us
nakul777@gmail.com
950303762
