2. Introduction
What is Machine Translation (MT)
What is Computer Assisted Translation (CAT)
Technologies adopted in CAT and MT
Technologies adopted in Automatic Speech Recognition (ASR)
ASR and CAT integration
Translation Quality Evaluation
New Trends in Machine Translation
Conclusion
Overview
3. Introduction
Research has been undertaken to implement translation tools that work without a
human translator, to increase translation quality and to reduce the
human post-editing needed
Speech is the most natural, practical, simple and efficient method of
human communication
Integrating speech into the translation process is a wide research area
It can also serve as a solution for blind people to communicate across different
languages
4. Sub categories of Computational Linguistics
Computational linguistics is the branch of linguistics in which the techniques of computer
science are applied to the analysis and synthesis of language and speech
Machine Translation is accomplished by feeding a text to a computer algorithm that
automatically translates it into another language; that is, there is no human
involvement
Computer Assisted Translation is human translation carried out with the aid of
computerized tools
What is MT & CAT
6. A rule-based machine translation system consists of a collection of rules called
grammar rules, a lexicon, and software programs to process the rules
It focuses on syntactic, semantic and morphological details of both the source language
and the target language when translating
• Syntactic – how words are grammatically arranged in sentences, as in spoken
communication
• Semantic – how words are meaningfully arranged in sentences
• Morphological – the internal structure of words in the source and target languages
Rule Based Machine Translation
7. Rule Based Machine Translation (continued..)
Structure of the Rule Based Machine Translation system
• A tree structure is used to represent the structure of the sentence
• A typical English sentence consists of two major parts as the noun phrase (NP) &
the verb phrase (VP)
• These two can be further divided
• Following are the rules to represent a simple grammar
S -> NP VP
VP -> V NP
NP -> Name
NP -> ART N
where S stands for sentence, NP for noun phrase, VP for verb phrase, V for verb, N for noun and ART for article
• Example : “Saman was happy” can be written in logical form as
(<PAST HAPPY> (NAME “Saman”))
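The four grammar rules above can be sketched as a tiny top-down recognizer. This is a minimal illustration, not a real RBMT component; the lexicon and test sentences are made up for the example.

```python
# The simple grammar from the slide: S -> NP VP, VP -> V NP,
# NP -> Name, NP -> ART N, plus a toy lexicon (illustrative only).
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Name"], ["ART", "N"]],
}
LEXICON = {
    "Name": {"Saman"},
    "V":    {"saw"},
    "ART":  {"the"},
    "N":    {"dog"},
}

def parse(symbol, words, pos=0):
    """Return the end position if `symbol` derives words[pos:end], else None."""
    if symbol in LEXICON:  # terminal category: match one word
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            return pos + 1
        return None
    for rhs in GRAMMAR.get(symbol, []):  # try each rewrite rule in turn
        p = pos
        for sym in rhs:
            p = parse(sym, words, p)
            if p is None:
                break
        else:
            return p
    return None

def accepts(sentence):
    words = sentence.split()
    return parse("S", words) == len(words)

print(accepts("Saman saw the dog"))  # True
print(accepts("saw the dog"))        # False: no NP before the verb phrase
```

A full RBMT system would build the tree structure explicitly and attach semantic and morphological information to each node; this sketch only checks whether a sentence fits the grammar.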
8. Translation is based on previous example translations obtained as results of experiments
Uses knowledge sources to support the translation process
Requires bilingual content
Two types
• Statistical Machine Translation
• Example Based Machine Translation
Empirical Based Machine Translation
9. Statistical Machine Translation
Translations based on statistical models
Statistical Translation Model – Learned from Bilingual Data (TM)
• Probabilistic mapping of equivalences between source-language words and phrases and
target-language words and phrases, learned through unsupervised Expectation-Maximization
(EM) training and a word- and phrase-alignment process
• Generates many possible translations
• Includes finite-state models such as finite-state transducers, alignment models and
phrase-based models
Statistical Language Model – Learned from Monolingual Target Language Data
• Probabilistic model of relative fluency and general usage patterns in the target language
• Based on n-gram model
• The target language model selects the “best” translations from a list of possible candidates
• Candidates are stored in an N-best list
• Concept of re-ranking
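The interplay of the two models above can be sketched as noisy-channel rescoring: the translation model proposes candidates, and the combined score log P(e) + log P(f|e) picks the best one. The candidate list and probabilities below are made-up illustrations, not real model output.

```python
import math

# Toy N-best list: (candidate target sentence e, model probabilities).
# "lm" stands in for the language model P(e), "tm" for the translation
# model P(f|e); all numbers are invented for illustration.
candidates = [
    ("the house is small", {"lm": 0.020, "tm": 0.30}),
    ("small is the house", {"lm": 0.002, "tm": 0.35}),
    ("the home is little", {"lm": 0.010, "tm": 0.10}),
]

def score(probs):
    # Sum in log space to avoid underflow when multiplying many probabilities.
    return math.log(probs["lm"]) + math.log(probs["tm"])

best = max(candidates, key=lambda c: score(c[1]))
print(best[0])  # "the house is small"
```

Note how the second candidate has the higher translation probability but loses overall because the language model judges it disfluent; this is the re-ranking idea mentioned above.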
10. • N-gram is a contiguous sequence of n items from a given sequence of text or speech
• When the items are words, n-grams are also called shingles
• An n-gram of size 1 is a “unigram”, size 2 a “bigram”, size 3 a “trigram”, and so on
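The n-gram definition above can be shown in a few lines; the sentence is an arbitrary illustration.

```python
# Extract contiguous n-grams from a list of word tokens.
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "the cat sat on the mat".split()
print(ngrams(words, 1))  # unigrams: ('the',), ('cat',), ...
print(ngrams(words, 2))  # bigrams: ('the', 'cat'), ('cat', 'sat'), ...
```

An n-gram language model estimates the probability of each word from counts of such sequences in a monolingual corpus.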
Advantages : More efficient use of human and data resources; the disadvantages of the rule-based approach are eliminated
Disadvantages : Corpus creation can be costly; errors are hard to predict and fix
Examples : SYSTRAN, ART, METEO, LOGOS, Anusaarka, TC-Star, Google Translate
Statistical Machine Translation (continued..)
11. What is Speech Recognition ?
Speech Recognition is the translation of spoken words into text
Also called “Automatic Speech Recognition” (ASR), “Computer Speech
Recognition” or just “Speech To Text” (STT)
13. Machine Learning paradigms for speech recognition
• Hidden Markov Model
• Discriminative Learning
• Structured Sequence Learning
• Bayesian Learning
• Adaptive Learning
• Multi-task Learning
• Active Learning
14. Speech recognition systems
• Google Speech API
• Cloud Speech API
• Microsoft cognitive services – Bing speech API
• API.AI
• Speechmatics
• Vocapia Speech to Text API
• Kaldi
• iSpeech
• Baidu
• Siri
• Hound
• Google Now
15. Integration of ASR & MT
Four approaches
Word graph product – Separate large word graphs are generated for the ASR and MT
systems, and their product is taken using the composition operation from
automata theory
ASR-constrained search – The n-gram language model of the phrase-based
MT system is replaced with the ASR word graph
Adapted Language Model – The MT system is improved by adapting its
language model to the ASR output
MT-Derived Language Model – The ASR word graph is rescored with a
language model derived from the MT system
Examples : TELNET, IBM 1,2 Models , SEECAT
16. Combined ASR / SMT Model
e* = argmax_e { P(e) . P(f|e) . P(x|e) }
where e is the target-language text, f is the source-language text and x is the speech signal;
P(e) is the language model, P(f|e) the translation model and P(x|e) the acoustic model
17. Loose Integration & Tight Integration approaches
Loose Integration approach
• P(e) has 2 components
• PS(e) – characterizes those aspects of language that can be acquired from large
text corpora in the target language
• PM(e) – represents the effects that can be acquired from the source language text
Linear interpolation: P(e) = (λM) PM(e) + (λS) PS(e)
Assumption :- these 2 models are independent, which gives the log-linear (product) form
P(e) = PM(e)^λM . PS(e)^λS
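The two combination forms above can be compared numerically. The probabilities and weights λM, λS below are invented for illustration; real systems tune the weights on held-out data.

```python
# Two ways of combining the MT-derived model PM(e) and the
# target-corpus model PS(e); all numbers are made up.
p_m, p_s = 0.02, 0.05          # PM(e), PS(e) for some hypothesis e
lam_m, lam_s = 0.4, 0.6        # interpolation / log-linear weights

interpolated = lam_m * p_m + lam_s * p_s       # mixture (sum) form
log_linear = (p_m ** lam_m) * (p_s ** lam_s)   # product form

print(round(interpolated, 4))  # 0.038
print(round(log_linear, 4))
```

The product form corresponds to adding weighted log-probabilities, which is how the tight-integration score on the next slide is built.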
18. Tight Integration approach
• Involves using SMT to re-evaluate ASR hypotheses
• Each string hypothesis appearing in the ASR N-best list is rescored using the translation
probability, P(f|e), obtained from the SMT system
• The score for each string is a log-linear combination of the acoustic, language model &
translation model probabilities
e* = argmax_e { (λ1) log P(e) + (λ2) log P(f|e) + (λ3) log P(x|e) }
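The rescoring formula above can be sketched directly. The hypotheses, probabilities and weights are made-up illustrations, not output of a real ASR or SMT system.

```python
import math

# Log-linear weights λ1, λ2, λ3 for the language, translation and
# acoustic models (invented values).
weights = {"lm": 1.0, "tm": 0.8, "am": 1.2}

# Toy ASR N-best list: (hypothesis e, P(e), P(f|e), P(x|e)).
nbest = [
    ("recognize speech",   0.030, 0.40, 0.20),
    ("wreck a nice beach", 0.004, 0.05, 0.25),
]

def log_linear(p_lm, p_tm, p_am):
    return (weights["lm"] * math.log(p_lm)
            + weights["tm"] * math.log(p_tm)
            + weights["am"] * math.log(p_am))

best = max(nbest, key=lambda h: log_linear(h[1], h[2], h[3]))
print(best[0])  # "recognize speech"
```

Here the second hypothesis scores slightly better acoustically, but the language and translation models pull the combined score toward the first one.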
20. New trends in Machine Translation
• Neural MT
• Google Translate -> phrase-based: breaks an input sentence into words and phrases to be
translated largely independently
• GNMT -> considers the entire input sentence as a unit for translation
• Maps the meaning of a sentence into a fixed-length vector representation and then generates a
translation based on that vector
• Advantages over phrase-based Google Translate:
Requires fewer engineering design choices
Easier to build and train
Small memory footprint
Generalizes well to long sequences
22. Neural MT of Microsoft
[Figure: Microsoft's NMT architecture – stacked encoder layers feeding an attention layer; 500- and 1000-dimension vectors; final output matrix]
23. ASR – MT System of Microsoft
[Figure: pipeline – Automatic Speech Recognition -> True Text -> Machine Translation -> Text To Speech]
24. MT in 2017 (Some popular and publicly available commercial systems)
• Microsoft
• Microsoft Adapted
• Systran Neural
• SDL
• SDL Adapted
• Lilt
• Lilt Adapted
• Lilt Interactive
• A/B Testing
• Convergence of TM and MT Systems
• Adoption of MT for instant web publishing
• NMT + SMT/RBMT approach
• NMT for mobile devices
25. Deep learning in MT
Attention Mechanism
• NMT translates the whole sentence at once
• Able to translate very long sentences
• Attention retains a memory of source hidden states (like a random access memory)
• Compares target and source hidden states
• Learns both translation & alignment
• Local attention & Global attention
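The comparison of target and source hidden states can be sketched as dot-product (global) attention. The vectors below are tiny made-up illustrations, not real encoder states.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# One hidden state per source word (toy 2-dimensional vectors).
source_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
target_state = [0.9, 0.1]  # current decoder hidden state

scores = [dot(target_state, h) for h in source_states]  # compare states
weights = softmax(scores)          # soft alignment over source words
context = [sum(w * h[i] for w, h in zip(weights, source_states))
           for i in range(len(target_state))]
print([round(w, 3) for w in weights])
```

The weights form a learned soft alignment, which is why attention is said to learn translation and alignment jointly; the context vector is what the decoder consumes at each step.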
26. Conclusion
Implemented models are still at a developing stage
Statistical approaches and language models have been the most popular so far
Many speech recognition tools have been developed
Concepts of AI & ML can be integrated
Future works
Approaches to minimize the human post-editing needed in translation
Increase the quality of available translation tools
Enhance the effectiveness of speech recognition paradigms
A method of communication across languages for blind people, so that they can give a
speech input and receive responses in their own language by means of speech-to-text conversion