SlideShare uma empresa Scribd logo
1 de 19
Baixar para ler offline
Dhvani
Indian Language
Text To Speech System
Santhosh Thottingal
Dhvani 2
Agenda
 Text to speech system – An introduction
 Dhvani Introduction
 Algorithm and Architecture
 Demo –Malayalam, Hindi, Kannada
 How to add a new Language support
 Discussion on the Front ends, Integration with other
application
Dhvani 3
 Software to Read text
 Algorithms- Concatenation
 Festival- Open source Text to speech system
 Intonation and Prosody- Still a research area
 Indian Languages and Text to speech-Direct
Grapheme to phoneme mapping(G2P mapping)
Text to speech Systems
Dhvani 4
 Started as a part of Simputer Project.
 Designed By Dr. Ramesh Hariharan, IISC
 Sound database developed at IISC Bangalore
 Language Independent Design
 Based on phoneme concatenation technology
 Hindi , Kannada and Malayalam support
 Project was inactive for the last 5 Years.
 An attempt in India to Cover all Indian languages under a single
framework
 A GPLed Project in GNU/Linux Platform
Dhvani TTS
Dhvani 5
 Based on the Observation that a Direct G2P mapping
exits for all Indian languages in general.
 Each language requires a Unicode parser.
 A UTF to phonetic conversion system converts the
Text to a phonetic script- Dhvani specific.
 Speech synthesizer takes phonetic script and
concatenates the sound files to produce speech
 words are identified by space, comma, full stop, new
line etc...
 Pause at each word gap, new line, paragraph
Algorithm
Dhvani 6
Architecture
Input Text
UTF-8
Text parser
Text to Dhvani
Phonetic Script
Conversion
Grapheme To
Phoneme Rules
Phonetic Synthesizer
CV Pair
Algorithm
Sound
concatenation
Sound Database
Speech Synthesizer
Speech
Dhvani 7
 This makes Dhvani language independent
 Any unicode text will be converted to a common
script.
 This script is the input to the Speech Synthesizer
 Examples:
* khana (food in hindi) kh2 n2 (CV CV)
* maun (silence in hindi) m13n (CVC)
* kahaan (where in hindi) k1 h2an (CV CVC)
* pratibha (talent in hindi) pHr1 t3 bh2 (HCV CV CV)
* sankalp (resolution in hindi) s1n k1l 0p (CVC CVC 0C)
* chandramaa (the moon in hindi) ch1n dHr1 m2 (CVC HCV CV)
* praan (life in hindi) pHr2n (HCVC)
* mysore (as pronounced in kannada) m10 s6 r5 (CV CV CV)
* rashtr (nation in hindi) r2sh 0tt 0r (CVC 0C 0C)
* aadesh (instruction in hindi) 2 d8sh (V CHC)
* andaaz (style in urdu) 1n d2z (VC CVC)
* ahimsa (nonviolence) 1 h3n s2 (V CVC CV)
* vazhapazham (banana in tamil) v2 zh1 p1 zh1m (CV CV CV CVC)
Text to Dhvani Phonetic Script
Dhvani 8
 The phonetic description is syllable based.
 8kinds of sounds are allowed (C -consonant, V -Vowel, H
-Half Sound).
 V: a plain vowel
 CV: a consonant followed by a vowel
 VC: a vowel followed by a consonant
 CVC: a consonant followed by a vowel followed by a
consonant
 HCV: a half consonant, followed by a CV
 HCVC: a half consonant, followed by a CVC
 0C: a consonant alone
 G[0-9]*: a silence gap of the specified length (typical
gaps
Grapheme to Phoneme Conversion
Dhvani 9
vowels allowed are:
2. a as is pun
3. aa as in the hindi word saal (meaning year)
4. i as in pin
5. ii as in keen
6. u as in pull
7. uu as in pool
8. e as in met
9. ee as in mate
10. ae as in mat
11. ai as in height
12. o as in the tamil word ponni (meaning gold)
13. oo as in court
14. au as in call
15. ow as in cow
16. tamil-u : as in the tamil aanddu (meaning year)
 The phonetic description uses the numbers 1-15
instead of the pnemonics given above.
Vowels
Dhvani 10
Consonats are:
k kh g gh
ch chh j jh
t th d dh n
tt tth dd ddh nna
p f b bh m y r l ll v sh s h
zh z an
 These consonants are numbered 1..34. the phonetic
description however uses the pnemonics above.
Within the program and in the database
nomenclature, the numbers are used.
Consonants
Dhvani 11
 All sound files stored in the database are gsm
compressed .gsm files.(GSM standard by The
Communications and Operating Systems Research
Group (KBS) at the Technische Universitaet Berlin)
 Recorded at 16KHz as 16bit signed linear samples.
The following sound units are stored in the database
 CV pairs: 1..33 * 2 4 6 8 9 10 12 13 14 15
 VC pairs: 2 4 6 8 9 10 12 13 14 15 * 1..34
 V: 1..14
 C: 1..34
 Halfs: ky kr kl kll kv ksh khy khr khl khv gy gr gl gv gn ghy ghr ghv
ghn chy chr chv jy jv ty tr tv thy thr dy dr dv dhy
dhr dhv ny nr nv tty ttr ttv ddy ddr ddv py pr pl pll fr fl
by br bl bhy bhr bhl my mr vy vr vl
The total size of the database is around 1MB
Sound Database
Dhvani 12
 CV files are named x.y.gsm where x is the consonant
number and y is the vowel number.
 VC files are named x.y.gsm where x is the vowel
number and y is the consonant number.
 V files are named x.gsm where x is the vowel
number.
 Halfs files are named x.y.gsm where x,y are the two
consonants involved.
 0C files are named x.gsm where x is the consonant
number.
 All files other than the 0C files have been pitch
marked and the marks appear in the corresponding
.marks files, one mark per byte as an unsigned char.
Sound Concatenation
Dhvani 13
 In addition to the sound files, there are four files in
database/, namely cvoffsets, vcoffsets, voffsets and
hoffsets, which store various attributes of the sound
files.
 cvoffsets
CV fields:
start(start of the cv)
diphst(diphone start position: default halfway to ctov
from start)
ctov(cons to vowel change position)
longvowlen(length of long vowel, currently not really used)
shortvowlen(length of short vowel)
diphend(end of diphone for long vowel, short will be obtained from
long)
diphshortfactor(factor for getting short diphone from long)
halfst(place where this cv is cut to connect to previous half)
Sound Concatenation
Dhvani 14
Sound Concatenation
vcoffsets
 VC fields:
 end(end of vc)
 diphend(diphone end position: default halfway from ctov to end)
 vtoc(vowel to cons change position) longvowlen(length of long vowel,
currently not really used)
 shortvowlen(length of short vowel)
 diphst(start of diphone for long vowel, short will be obtained from
long)
voffsets
 V fields:
 length (length to be played starting from 0)
hoffsets
 Halfs fields:
 start (start of half) end (place where this half is cut and appended to
the next
Dhvani 15
Language Modules
A language Module does the parsing, grapheme to
phoneme conversion
Input is text in Unicode format.
Output is phonetic script
Any logic for producing it based on the language
characteristics can be done in the language module
Dhvani can detect the languages and it dispatches
the text to the corresponding phonetic synthesizer
Multiple languages in a single input text is
supported
Dhvani 16
Language Modules
 Language module can handle the number reading
logic
 Acronyms, Currency, other features of language
can be done.
To write a new Language module, start with one
existing one make necessary changes.
Dhvani 17
Typical use of TTS systems
● TTS can save time and money in business, when
compared to studio based pre-recorded speech files
● Telephony applications- voice portals, CRM, call centers
● In-vehicle environments to read text while driving
● Hands-busy, eyes-Busy applications in industry
● Many applications if we can develop a voice recognition
system and integrating it with TTS
Dhvani 18
TODO
●More Language Modules
●Integration with Desktop Environments, Text editors etc..
Already Integrated with Gedit as an External tool
●A GUI for Dhvani
●Facility to save the speech in various sound formats.
Currently it saves the file in 16 bit unsigned 16KHz PCM
format
●Applications that use Dhvani as a back end for various
accessibility requirements
Dhvani 19
Thanks
●Developers: Dr Ramesh Hariharan, Santhosh Thottingal
●Download: http://sourceforge.net/projects/dhvani
●Documentation: http://fci.wikia.com/wiki/Dhvani
●License: GPL version 2 or Later

Mais conteúdo relacionado

Destaque

Structure analysis of dhvani school by anandavardhana
Structure analysis of dhvani school by anandavardhanaStructure analysis of dhvani school by anandavardhana
Structure analysis of dhvani school by anandavardhana
Kinjal Patel
 
Dhwani Theory and Alamkara
Dhwani Theory and AlamkaraDhwani Theory and Alamkara
Dhwani Theory and Alamkara
goswamigayatri19
 
Paper no.:7 Literary Theory and Criticism
Paper no.:7 Literary Theory and CriticismPaper no.:7 Literary Theory and Criticism
Paper no.:7 Literary Theory and Criticism
goswamigayatri
 
Literary theory and criticism
Literary theory and criticismLiterary theory and criticism
Literary theory and criticism
Vibhuti Bhatt
 
Shakuntala ryder full script
Shakuntala ryder full scriptShakuntala ryder full script
Shakuntala ryder full script
dean dundas
 
Auchitya, vakrokti, and riti
Auchitya, vakrokti, and ritiAuchitya, vakrokti, and riti
Auchitya, vakrokti, and riti
bhattprakruti20
 

Destaque (20)

Structure analysis of dhvani school by anandavardhana
Structure analysis of dhvani school by anandavardhanaStructure analysis of dhvani school by anandavardhana
Structure analysis of dhvani school by anandavardhana
 
Dhwani Theory and Alamkara
Dhwani Theory and AlamkaraDhwani Theory and Alamkara
Dhwani Theory and Alamkara
 
Rasa theory-shakuntala
Rasa theory-shakuntalaRasa theory-shakuntala
Rasa theory-shakuntala
 
Theory of rasa
Theory of rasaTheory of rasa
Theory of rasa
 
Rasa Theory
Rasa TheoryRasa Theory
Rasa Theory
 
Daya paper 7-Define the Riti school.
Daya paper 7-Define the Riti school.Daya paper 7-Define the Riti school.
Daya paper 7-Define the Riti school.
 
Paper no.:7 Literary Theory and Criticism
Paper no.:7 Literary Theory and CriticismPaper no.:7 Literary Theory and Criticism
Paper no.:7 Literary Theory and Criticism
 
Presentation7
Presentation7Presentation7
Presentation7
 
Natyasastra: Dramatic Mode in light of the Western Concept of Drama
Natyasastra: Dramatic Mode in light of the Western Concept of DramaNatyasastra: Dramatic Mode in light of the Western Concept of Drama
Natyasastra: Dramatic Mode in light of the Western Concept of Drama
 
rasa theory__(mrinal)
rasa theory__(mrinal)rasa theory__(mrinal)
rasa theory__(mrinal)
 
Natya veda for global peace new
Natya veda for global peace newNatya veda for global peace new
Natya veda for global peace new
 
Literary theory and criticism
Literary theory and criticismLiterary theory and criticism
Literary theory and criticism
 
Sanskriti
SanskritiSanskriti
Sanskriti
 
Theory of Vakrokti
Theory of VakroktiTheory of Vakrokti
Theory of Vakrokti
 
Shakuntala ryder full script
Shakuntala ryder full scriptShakuntala ryder full script
Shakuntala ryder full script
 
Auchitya, vakrokti, and riti
Auchitya, vakrokti, and ritiAuchitya, vakrokti, and riti
Auchitya, vakrokti, and riti
 
Comparative study of Oliver with Jamal
Comparative study of Oliver with Jamal Comparative study of Oliver with Jamal
Comparative study of Oliver with Jamal
 
Contribution of Abhinav Gupt in Rasa theory
Contribution of Abhinav Gupt in Rasa theoryContribution of Abhinav Gupt in Rasa theory
Contribution of Abhinav Gupt in Rasa theory
 
Rasa Theory
Rasa TheoryRasa Theory
Rasa Theory
 
460.02a Enlightening Flavors: Bharata
460.02a Enlightening Flavors: Bharata460.02a Enlightening Flavors: Bharata
460.02a Enlightening Flavors: Bharata
 

Semelhante a Dhvani TTS

Intern Presentation
Intern PresentationIntern Presentation
Intern Presentation
Apurva Singh
 
Speech To Sign Language Interpreter System
Speech To Sign Language Interpreter SystemSpeech To Sign Language Interpreter System
Speech To Sign Language Interpreter System
kkkseld
 
Automatic subtitle generation
Automatic subtitle generationAutomatic subtitle generation
Automatic subtitle generation
tanyasaxena1611
 

Semelhante a Dhvani TTS (20)

IRJET- Text to Speech Synthesis for Hindi Language using Festival Framework
IRJET- Text to Speech Synthesis for Hindi Language using Festival FrameworkIRJET- Text to Speech Synthesis for Hindi Language using Festival Framework
IRJET- Text to Speech Synthesis for Hindi Language using Festival Framework
 
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
 
High Quality Arabic Concatenative Speech Synthesis
High Quality Arabic Concatenative Speech SynthesisHigh Quality Arabic Concatenative Speech Synthesis
High Quality Arabic Concatenative Speech Synthesis
 
SAP (SPEECH AND AUDIO PROCESSING)
SAP (SPEECH AND AUDIO PROCESSING)SAP (SPEECH AND AUDIO PROCESSING)
SAP (SPEECH AND AUDIO PROCESSING)
 
Intern Presentation
Intern PresentationIntern Presentation
Intern Presentation
 
Gujarati Text-to-Speech Presentation
Gujarati Text-to-Speech PresentationGujarati Text-to-Speech Presentation
Gujarati Text-to-Speech Presentation
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...
 
Ijetcas14 575
Ijetcas14 575Ijetcas14 575
Ijetcas14 575
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
G1803013542
G1803013542G1803013542
G1803013542
 
visH (fin).pptx
visH (fin).pptxvisH (fin).pptx
visH (fin).pptx
 
Deep Learning for Machine Translation - A dramatic turn of paradigm
Deep Learning for Machine Translation - A dramatic turn of paradigmDeep Learning for Machine Translation - A dramatic turn of paradigm
Deep Learning for Machine Translation - A dramatic turn of paradigm
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
 
Speech To Sign Language Interpreter System
Speech To Sign Language Interpreter SystemSpeech To Sign Language Interpreter System
Speech To Sign Language Interpreter System
 
An Introduction To Speech Recognition
An Introduction To Speech RecognitionAn Introduction To Speech Recognition
An Introduction To Speech Recognition
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
Automatic subtitle generation
Automatic subtitle generationAutomatic subtitle generation
Automatic subtitle generation
 
Segmentation Words for Speech Synthesis in Persian Language Based On Silence
Segmentation Words for Speech Synthesis in Persian Language Based On SilenceSegmentation Words for Speech Synthesis in Persian Language Based On Silence
Segmentation Words for Speech Synthesis in Persian Language Based On Silence
 
Speech and Language Processing
Speech and Language ProcessingSpeech and Language Processing
Speech and Language Processing
 

Mais de Shrinivasan T

Tamilinayavaani - integrating tva open-source spellchecker with python
Tamilinayavaani -  integrating tva open-source spellchecker with pythonTamilinayavaani -  integrating tva open-source spellchecker with python
Tamilinayavaani - integrating tva open-source spellchecker with python
Shrinivasan T
 
கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...
கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...
கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...
Shrinivasan T
 
Sprit of Engineering
Sprit of EngineeringSprit of Engineering
Sprit of Engineering
Shrinivasan T
 

Mais de Shrinivasan T (20)

Giving New Life to Old Tamil Little Magazines Through Digitization
Giving New Life to Old Tamil Little Magazines Through DigitizationGiving New Life to Old Tamil Little Magazines Through Digitization
Giving New Life to Old Tamil Little Magazines Through Digitization
 
Digitization of Tamil Soviet Publications and Little Magazines.pdf
Digitization of Tamil Soviet Publications and Little Magazines.pdfDigitization of Tamil Soviet Publications and Little Magazines.pdf
Digitization of Tamil Soviet Publications and Little Magazines.pdf
 
python-an-introduction
python-an-introductionpython-an-introduction
python-an-introduction
 
Tamilinayavaani - integrating tva open-source spellchecker with python
Tamilinayavaani -  integrating tva open-source spellchecker with pythonTamilinayavaani -  integrating tva open-source spellchecker with python
Tamilinayavaani - integrating tva open-source spellchecker with python
 
Algorithms for certain classes of tamil spelling correction
Algorithms for certain classes of tamil spelling correctionAlgorithms for certain classes of tamil spelling correction
Algorithms for certain classes of tamil spelling correction
 
Tamil and-free-software - தமிழும் கட்டற்ற மென்பொருட்களும்
Tamil and-free-software - தமிழும் கட்டற்ற மென்பொருட்களும்Tamil and-free-software - தமிழும் கட்டற்ற மென்பொருட்களும்
Tamil and-free-software - தமிழும் கட்டற்ற மென்பொருட்களும்
 
Introducing FreeTamilEbooks
Introducing FreeTamilEbooks Introducing FreeTamilEbooks
Introducing FreeTamilEbooks
 
கணித்தமிழும் மென்பொருள்களும் - தேவைகளும் தீர்வுகளும்
கணித்தமிழும் மென்பொருள்களும் - தேவைகளும் தீர்வுகளும் கணித்தமிழும் மென்பொருள்களும் - தேவைகளும் தீர்வுகளும்
கணித்தமிழும் மென்பொருள்களும் - தேவைகளும் தீர்வுகளும்
 
Contribute to free open source software tamil - கட்டற்ற மென்பொருளுக்கு பங்களி...
Contribute to free open source software tamil - கட்டற்ற மென்பொருளுக்கு பங்களி...Contribute to free open source software tamil - கட்டற்ற மென்பொருளுக்கு பங்களி...
Contribute to free open source software tamil - கட்டற்ற மென்பொருளுக்கு பங்களி...
 
ஏன் லினக்ஸ் பயன்படுத்த வேண்டும்? - Why Linux? in Tamil
ஏன் லினக்ஸ் பயன்படுத்த வேண்டும்? - Why Linux? in Tamilஏன் லினக்ஸ் பயன்படுத்த வேண்டும்? - Why Linux? in Tamil
ஏன் லினக்ஸ் பயன்படுத்த வேண்டும்? - Why Linux? in Tamil
 
கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...
கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...
கட்டற்ற மென்பொருள் பற்றிய அறிமுகம் - தமிழில் - Introduction to Open source in...
 
Share your knowledge in wikipedia
Share your knowledge in wikipediaShare your knowledge in wikipedia
Share your knowledge in wikipedia
 
Open-Tamil Python Library for Tamil Text Processing
Open-Tamil Python Library for Tamil Text ProcessingOpen-Tamil Python Library for Tamil Text Processing
Open-Tamil Python Library for Tamil Text Processing
 
Version control-systems
Version control-systemsVersion control-systems
Version control-systems
 
Contribute to-ubuntu
Contribute to-ubuntuContribute to-ubuntu
Contribute to-ubuntu
 
Freedom toaster
Freedom toasterFreedom toaster
Freedom toaster
 
Sprit of Engineering
Sprit of EngineeringSprit of Engineering
Sprit of Engineering
 
Amace ion newsletter-01
Amace ion   newsletter-01Amace ion   newsletter-01
Amace ion newsletter-01
 
Rpm Introduction
Rpm IntroductionRpm Introduction
Rpm Introduction
 
Foss History
Foss HistoryFoss History
Foss History
 

Último

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 

Último (20)

Magic bus Group work1and 2 (Team 3).pptx
Magic bus Group work1and 2 (Team 3).pptxMagic bus Group work1and 2 (Team 3).pptx
Magic bus Group work1and 2 (Team 3).pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 

Dhvani TTS

  • 1. Dhvani Indian Language Text To Speech System Santhosh Thottingal
  • 2. Dhvani 2 Agenda  Text to speech system – An introduction  Dhvani Introduction  Algorithm and Architecture  Demo –Malayalam, Hindi, Kannada  How to add a new Language support  Discussion on the Front ends, Integration with other application
  • 3. Dhvani 3  Software to Read text  Algorithms- Concatenation  Festival- Open source Text to speech system  Intonation and Prosody- Still a research area  Indian Languages and Text to speech-Direct Grapheme to phoneme mapping(G2P mapping) Text to speech Systems
  • 4. Dhvani 4  Started as a part of Simputer Project.  Designed By Dr. Ramesh Hariharan, IISC  Sound database developed at IISC Bangalore  Language Independent Design  Based on phoneme concatenation technology  Hindi , Kannada and Malayalam support  Project was inactive for the last 5 Years.  An attempt in India to Cover all Indian languages under a single framework  A GPLed Project in GNU/Linux Platform Dhvani TTS
  • 5. Dhvani 5  Based on the Observation that a Direct G2P mapping exits for all Indian languages in general.  Each language requires a Unicode parser.  A UTF to phonetic conversion system converts the Text to a phonetic script- Dhvani specific.  Speech synthesizer takes phonetic script and concatenates the sound files to produce speech  words are identified by space, comma, full stop, new line etc...  Pause at each word gap, new line, paragraph Algorithm
  • 6. Dhvani 6 Architecture Input Text UTF-8 Text parser Text to Dhvani Phonetic Script Conversion Grapheme To Phoneme Rules Phonetic Synthesizer CV Pair Algorithm Sound concatenation Sound Database Speech Synthesizer Speech
  • 7. Dhvani 7  This makes Dhvani language independent  Any unicode text will be converted to a common script.  This script is the input to the Speech Synthesizer  Examples: * khana (food in hindi) kh2 n2 (CV CV) * maun (silence in hindi) m13n (CVC) * kahaan (where in hindi) k1 h2an (CV CVC) * pratibha (talent in hindi) pHr1 t3 bh2 (HCV CV CV) * sankalp (resolution in hindi) s1n k1l 0p (CVC CVC 0C) * chandramaa (the moon in hindi) ch1n dHr1 m2 (CVC HCV CV) * praan (life in hindi) pHr2n (HCVC) * mysore (as pronounced in kannada) m10 s6 r5 (CV CV CV) * rashtr (nation in hindi) r2sh 0tt 0r (CVC 0C 0C) * aadesh (instruction in hindi) 2 d8sh (V CHC) * andaaz (style in urdu) 1n d2z (VC CVC) * ahimsa (nonviolence) 1 h3n s2 (V CVC CV) * vazhapazham (banana in tamil) v2 zh1 p1 zh1m (CV CV CV CVC) Text to Dhvani Phonetic Script
  • 8. Dhvani 8  The phonetic description is syllable based.  8kinds of sounds are allowed (C -consonant, V -Vowel, H -Half Sound).  V: a plain vowel  CV: a consonant followed by a vowel  VC: a vowel followed by a consonant  CVC: a consonant followed by a vowel followed by a consonant  HCV: a half consonant, followed by a CV  HCVC: a half consonant, followed by a CVC  0C: a consonant alone  G[0-9]*: a silence gap of the specified length (typical gaps Grapheme to Phoneme Conversion
  • 9. Dhvani 9 vowels allowed are: 2. a as is pun 3. aa as in the hindi word saal (meaning year) 4. i as in pin 5. ii as in keen 6. u as in pull 7. uu as in pool 8. e as in met 9. ee as in mate 10. ae as in mat 11. ai as in height 12. o as in the tamil word ponni (meaning gold) 13. oo as in court 14. au as in call 15. ow as in cow 16. tamil-u : as in the tamil aanddu (meaning year)  The phonetic description uses the numbers 1-15 instead of the pnemonics given above. Vowels
  • 10. Dhvani 10 Consonats are: k kh g gh ch chh j jh t th d dh n tt tth dd ddh nna p f b bh m y r l ll v sh s h zh z an  These consonants are numbered 1..34. the phonetic description however uses the pnemonics above. Within the program and in the database nomenclature, the numbers are used. Consonants
  • 11. Dhvani 11  All sound files stored in the database are gsm compressed .gsm files.(GSM standard by The Communications and Operating Systems Research Group (KBS) at the Technische Universitaet Berlin)  Recorded at 16KHz as 16bit signed linear samples. The following sound units are stored in the database  CV pairs: 1..33 * 2 4 6 8 9 10 12 13 14 15  VC pairs: 2 4 6 8 9 10 12 13 14 15 * 1..34  V: 1..14  C: 1..34  Halfs: ky kr kl kll kv ksh khy khr khl khv gy gr gl gv gn ghy ghr ghv ghn chy chr chv jy jv ty tr tv thy thr dy dr dv dhy dhr dhv ny nr nv tty ttr ttv ddy ddr ddv py pr pl pll fr fl by br bl bhy bhr bhl my mr vy vr vl The total size of the database is around 1MB Sound Database
  • 12. Dhvani 12  CV files are named x.y.gsm where x is the consonant number and y is the vowel number.  VC files are named x.y.gsm where x is the vowel number and y is the consonant number.  V files are named x.gsm where x is the vowel number.  Halfs files are named x.y.gsm where x,y are the two consonants involved.  0C files are named x.gsm where x is the consonant number.  All files other than the 0C files have been pitch marked and the marks appear in the corresponding .marks files, one mark per byte as an unsigned char. Sound Concatenation
  • 13. Dhvani 13  In addition to the sound files, there are four files in database/, namely cvoffsets, vcoffsets, voffsets and hoffsets, which store various attributes of the sound files.  cvoffsets CV fields: start(start of the cv) diphst(diphone start position: default halfway to ctov from start) ctov(cons to vowel change position) longvowlen(length of long vowel, currently not really used) shortvowlen(length of short vowel) diphend(end of diphone for long vowel, short will be obtained from long) diphshortfactor(factor for getting short diphone from long) halfst(place where this cv is cut to connect to previous half) Sound Concatenation
  • 14. Dhvani 14 Sound Concatenation vcoffsets  VC fields:  end(end of vc)  diphend(diphone end position: default halfway from ctov to end)  vtoc(vowel to cons change position) longvowlen(length of long vowel, currently not really used)  shortvowlen(length of short vowel)  diphst(start of diphone for long vowel, short will be obtained from long) voffsets  V fields:  length (length to be played starting from 0) hoffsets  Halfs fields:  start (start of half) end (place where this half is cut and appended to the next
  • 15. Dhvani 15 Language Modules A language Module does the parsing, grapheme to phoneme conversion Input is text in Unicode format. Output is phonetic script Any logic for producing it based on the language characteristics can be done in the language module Dhvani can detect the languages and it dispatches the text to the corresponding phonetic synthesizer Multiple languages in a single input text is supported
  • 16. Dhvani 16 Language Modules  Language module can handle the number reading logic  Acronyms, Currency, other features of language can be done. To write a new Language module, start with one existing one make necessary changes.
  • 17. Dhvani 17 Typical use of TTS systems ● TTS can save time and money in business, when compared to studio based pre-recorded speech files ● Telephony applications- voice portals, CRM, call centers ● In-vehicle environments to read text while driving ● Hands-busy, eyes-Busy applications in industry ● Many applications if we can develop a voice recognition system and integrating it with TTS
  • 18. Dhvani 18 TODO ●More Language Modules ●Integration with Desktop Environments, Text editors etc.. Already Integrated with Gedit as an External tool ●A GUI for Dhvani ●Facility to save the speech in various sound formats. Currently it saves the file in 16 bit unsigned 16KHz PCM format ●Applications that use Dhvani as a back end for various accessibility requirements
  • 19. Dhvani 19 Thanks ●Developers: Dr Ramesh Hariharan, Santhosh Thottingal ●Download: http://sourceforge.net/projects/dhvani ●Documentation: http://fci.wikia.com/wiki/Dhvani ●License: GPL version 2 or Later