Watch this presentation in video:
http://videolectures.net/internetofeducation2013_diaz_munio_translectures/

Internet of Education 2013
12 November 2013
Universitat Politècnica de València
www.translectures.eu

EC FP7 ICT project #287755
Outline
• transLectures (tL): motivation and goals
• Video demo
• tL technologies
• Progress and results
• Scientific evaluations
• User evaluations
• Quality control (expert evaluations)

• Implementation and integration
• tL open source tools
Motivation
• Video lecture repositories and MOOCs
• Thousands of hours of video lectures available
• Hundreds of hours of video lectures recorded every week

• Most video lectures only available in their original language
• No subtitles

Motivation
• Transcriptions and translations are needed
• Accessibility for people with disabilities
• Accessibility for speakers of different languages
• Search and analysis functions
• Automated topic finding
• …

• How do we get there?

The transLectures approach
1. Automatic Speech Recognition (ASR) and Machine Translation (MT)
• Adaptation: taking advantage of the characteristics of video lecture repositories
• High-quality automatic transcriptions and translations

2. Interactive postediting: intelligent interaction for reduced effort

Goals
• Massive adaptation
• Intelligent interaction
• Implementation
• Case studies: Videolectures.NET & Polimedia
• Real-life evaluation

• Integration into Opencast Matterhorn
http://opencast.org/matterhorn/
The transLectures partners
No.  Name                                  Country
1    Universitat Politècnica de València   Spain
2    Xerox SAS                             France
3    Institut Jožef Stefan                 Slovenia
3+   Knowledge for All Foundation          UK
4    RWTH Aachen University                Germany
5    EML – European Media Laboratory       Germany
6    DDS – Deluxe Digital Studios          UK

Languages
• Transcription (ASR)
• EN
• SL
• ES

• Translation (MT)
• EN>SL, SL>EN
• EN>ES, ES>EN
• EN>FR
• EN>DE

transLectures: video demo

http://www.translectures.eu/new-demo-video/
Massive adaptation
• Characteristics of video lectures
• Just one person
• Known speaker
• Clear talking
• No interruptions
• Focused on a topic
• Slides

Massive adaptation
• Known speaker and topic
• Slides
• Related documents

Scientific evaluations (Y2)
• Transcription results
• WER: Word Error Rate (%); lower is better
• Goal: WER < 25%
• EN, SL, ES
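
The WER metric on this slide can be made concrete with a short sketch. It uses the standard definition (word-level edit distance divided by the number of reference words, as a percentage). The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, not the project's evaluation tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against a 4-word reference.
print(word_error_rate("a b c d", "a x c"))  # 50.0
```

Under this definition, the stated goal of WER < 25% corresponds to fewer than one word error per four reference words.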

Scientific evaluations (Y2)
• Translation results
• BLEU; higher is better
• Goal: BLEU > 30
• EN>SL, SL>EN
• EN>ES, ES>EN
• EN>FR
• EN>DE

Y1 results and comparison
Org. X = Undisclosed, state-of-the-art ASR & MT systems
[Figure: Y1 transLectures results compared with Org. X]

Intelligent interaction
• Postediting automatic transcriptions/translations
• The user invests the least possible effort
• The system learns the most from it

• Confidence measures
• Fast constrained search
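
As an illustration only of how confidence measures can reduce user effort, the sketch below picks the words a user would be asked to check first. The per-word confidence scores, the function name `words_to_supervise`, and the fixed word budget are hypothetical; the slides do not describe the actual selection rule.

```python
from typing import List, Tuple


def words_to_supervise(words: List[Tuple[str, float]], max_words: int) -> List[int]:
    """Return positions of the max_words least confident words, in reading order."""
    if max_words < 0:
        raise ValueError("max_words must be non-negative")
    # Rank word positions from least to most confident (hypothetical scores in [0, 1]).
    ranked = sorted(range(len(words)), key=lambda i: words[i][1])
    # Keep the least confident positions and present them in transcript order.
    return sorted(ranked[:max_words])


# Hypothetical recognizer output: (word, confidence) pairs.
transcript = [("the", 0.98), ("hidden", 0.41), ("mark-off", 0.22),
              ("model", 0.90), ("is", 0.95)]
print(words_to_supervise(transcript, max_words=2))  # [1, 2]
```

With a budget of two words, the user would be pointed at "hidden" and "mark-off" and could skip the three high-confidence words.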

Intelligent interaction
• User evaluations
• UPV (Polimedia)
• JSI (Videolectures.NET)

User evaluations
• User evaluations at UPV
• Users: lecturers
• Revising their own lectures

• 3 different experiments
1. Complete supervision
2. Intelligent interaction
3. Two-round supervision

User evaluations
• User evaluations at UPV: results

Quality control: expert evaluations
• Transcription quality (EN, ES, SL)
• UPV: Representative set of Spanish transcriptions
• Avg. WER: 23.2; Avg. RTF: 3.8

• Translation quality (EN<>SL, EN<>ES, EN>FR, EN>DE)
• UPV: Representative set of Spanish>English translations
• Avg. BLEU: 46.6; Avg. RTF: 14.8; Avg. Score: 3.6 out of 5
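
The slides do not define RTF. Assuming it stands for real time factor, computed as the time an expert spends revising divided by the duration of the video, a small sketch shows how to read the figures. The helper name `real_time_factor` is hypothetical.

```python
def real_time_factor(review_seconds: float, video_seconds: float) -> float:
    """Return time spent revising divided by video duration (assumed RTF definition)."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    return review_seconds / video_seconds


# Hypothetical case: a 10-minute video whose transcription took 38 minutes to revise.
rtf = real_time_factor(review_seconds=38 * 60, video_seconds=10 * 60)
print(round(rtf, 1))  # 3.8
```

Under that assumption, an average RTF of 3.8 would mean about 3.8 minutes of expert work per minute of video for transcriptions, and 14.8 minutes per minute for translations.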

Implementation and integration
• Videolectures.NET
• Polimedia
• Opencast Matterhorn

transLectures: Open source tools
• The tL player (& editor)
• Coming soon (www.translectures.eu)

• The transLectures-UPV Toolkit (TLK) for ASR
• www.translectures.eu/tlk

• RWTH Aachen: rASR, Jane (MT)
• http://www-i6.informatik.rwth-aachen.de/web/Software/

Next steps for transLectures
• Keep improving ASR and MT results
• Keep improving tL open source tools (TLK, tL player)
• External user evaluations (VL.NET and Polimedia)
• External trials: implementation in other universities

• More detailed info in the public project deliverables, soon available from:
http://www.translectures.eu/progress/
http://www.translectures.eu/public-reports/
(M12 = Year 1; M24 = Year 2, most recent results)
• More tL video demos:
http://www.translectures.eu/progress/
• Follow transLectures:
• http://twitter.com/translectures
• http://www.facebook.com/translectures
• http://www.slideshare.net/transLectures
www.translectures.eu
• About this presentation: Gonçal Garcés Díaz-Munío (ggarces@dsic.upv.es)
• Project coordinator: Alfons Juan-Ciscar (ajuan@dsic.upv.es)

EC FP7 ICT Programme – Project Number 287755
