Georg Rehm
German Research Center for Artificial Intelligence (DFKI) GmbH
Language Technology Lab – Berlin, Germany
META-NET, General Secretary
georg.rehm@dfki.de
Towards a Human Language
Project for Multilingual Europe
AI and Interpretation
Artificial Intelligence
SCIC Universities Conference (19/20 April 2018)
Data Intelligence
Current breakthroughs based on Machine Learning (“Deep Learning”)
Also still in use: symbolic, rule-based methods and systems
Artificial Intelligence
• Huge data sets + powerful algorithms + extremely fast hardware
• Enormous potential for disruptions in all sectors and areas
META-NET and
Multilingual Europe
• Multilingualism is at the heart of the European idea
• 24 EU languages – all have the same status
• Dozens of regional and minority languages as well as
languages of immigrants and trade partners
• Many economic and social challenges:
– The Digital Single Market needs to be multilingual
– Cross-border, cross-lingual, cross-cultural
communication
• 60 research centres in 34 countries (founded in 2010)
Chair of Executive Board: Jan Hajic (CUNI)
Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde)
General Secretary: Georg Rehm (DFKI)
• Multilingual Europe Technology Alliance: 826 members in 67 countries
(published in 2013) (31 volumes; published in 2012)
T4ME (META-NET), CESAR, METANET4U, META-NORD
• Basque
• Bulgarian*
• Catalan
• Croatian*
• Czech*
• Danish*
• Dutch*
• English*
• Estonian*
• Finnish*
• French*
• Galician
• German*
• Greek*
• Hungarian*
• Icelandic
• Irish*
• Italian*
• Latvian*
• Lithuanian*
• Maltese*
• Norwegian
• Polish*
• Portuguese*
• Romanian*
• Serbian
• Slovak*
• Slovene*
• Spanish*
• Swedish*
• Welsh
* Official EU language
http://www.meta-net.eu/whitepapers
MT
excellent: none
good: English
moderate: French, Spanish
fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
weak or no support through LT: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh

Speech
excellent: none
good: English
moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
weak or no support through LT: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh

Text Analytics
excellent: none
good: English
moderate: Dutch, French, German, Italian, Spanish
fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
weak or no support through LT: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh

Resources
excellent: none
good: English
moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
weak or no support through LT: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
[Figure: level of support per language. Levels: Excellent, Good, Moderate, Fragmentary, Weak/none. Languages shown: Welsh, Maltese, Lithuanian, Latvian, Icelandic, Irish, Croatian, Serbian, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English.]
Languages with names in red
have little or no MT support
Source: META-NET White Paper Series: Europe's Languages in the Digital Age. Springer, Heidelberg,
New York, Dordrecht, London, September 2012. Georg Rehm and Hans Uszkoreit (series editors)
Important: even current state of the art
technologies are far from being perfect!
Important: 20+ European languages are
severely under-supported and face the
danger of digital extinction.
We carried out the study in 2010/2012. While support
for many languages has improved in the meantime,
the overall picture remains mostly the same.
AI and Interpretation
• Since approx. 2015, with breakthroughs in neural technologies,
Machine Translation has been getting better and better.
• All areas of AI look for “super-human performance” but
language is fundamentally different and much more complex.
• Neural AI approaches cannot understand language; they
process it according to huge underlying data sets.
• In many use cases, mistakes can be tolerated.
• But: translation and interpretation are often mission-critical!
• Mistakes can have serious consequences (politics, medicine).
Translation and Interpretation
• Example: Lecture Translator
– University lectures are automatically transcribed and translated,
in near-real time, into several languages
– Students can follow the translation through a web interface
• Example: Presentation Translator
– Presenter can have the speech automatically translated
– Translations are displayed as subtitles
• Example: Call Translator
– Internet telephony provider offers automatic voice translation
Speech Translation
• The three example applications work surprisingly well for
general-domain language and input. But:
– They are far from being perfect.
– They aren’t robust.
– They cannot cope with unforeseen situations.
– They cannot understand language as humans do.
– They are not (yet?) suited for conference interpretation.
→ Limitations as regards their fields of application.
• Interpretation is often mission-critical.
→ Human interpreters won’t be replaced anytime soon.
Issues and Limitations
https://slator.com/features/ai-interpreter-fail-at-china-summit-sparks-debate-about-future-of-profession/
Human Language
Project
• LT in Europe: World class research, strong SME base, thousands
of LSPs; immense fragmentation; need for coordination.
• Need for high-quality LT: translation, interpretation, Multilingual Digital Single Market (MDSM) etc.
• The European Language Challenge cannot be – it must not be –
abandoned or outsourced!
→ Need for Language Technology, made in Europe, for Europe!
→ STOA Workshop in the EP (January 2017): “Language equality in
the digital age – towards a Human Language Project”
LT – Current Developments
STUDY
EPRS | European Parliamentary Research Service
Scientific Foresight Unit (STOA)
PE 581.621
Science and Technology Options Assessment
• Goal: Deep Natural Language Understanding by 2030
• Vision: EU FET Flagship Project (10+ years)
• Broad coverage, high quality, high precision
• Create approaches, algorithms, data sets, resources
• Across modalities: text, text types, speech, video etc.
Artificial Intelligence
including cognition, perception, vision,
cross-modal, cross-platform, cross-culture etc.
Machine Learning
Language Technology
Linguistics
Human Language Project
Summary & Conclusions
• AI is disrupting all industries – including translation
and, increasingly, also interpretation.
→ But: perfect, robust, precise language technologies (incl.
written/spoken MT and interpretation) are still far away.
• Linguists are increasingly needed – new profiles emerging
→ The machine will support human experts and help them
become more efficient – it will not replace them.
• The Human Language Project is still a vision. Its goal:
develop new breakthroughs in Language Technology.
Recommendation
• SCIC Speech Repository
• 4,000 speeches (3,000 public + 1,000 private)
• Extremely interesting data set and language resource for
Language Technology researchers!
• Many R&D groups currently work on TED talk data sets
• Recommendation: establish bridges between SCIC
and research groups for spoken language translation
• Help build the next generation of AI tools for interpreters
• AI tools that are tailored to the needs and wishes, topics
and domains of conference interpreters in the EC/EP
Thank you!
Dr. Georg Rehm
DFKI Berlin
georg.rehm@dfki.de
http://de.linkedin.com/in/georgrehm
https://www.slideshare.net/georgrehm
Strategic Research and Innovation Agenda
Language Technologies for
Multilingual Europe
Towards a Human Language Project
SRIA Editorial Team
Version 1.0 – December 2017

Towards a Human Language Project for Multilingual Europe: AI and Interpretation

  • 1. Georg Rehm German Research Center for Artificial Intelligence (DFKI) GmbH Language Technology Lab – Berlin, Germany META-NET, General Secretary georg.rehm@dfki.de Towards a Human Language Project for Multilingual Europe AI and Interpretation
  • 2. Artificial Intelligence SCIC Universities Conference (19/20 April 2018) 2/12
  • 3. SCIC Universities Conference (19/20 April 2018) 3
  • 4. Artificial Intelligence • Data Intelligence • Current breakthroughs based on Machine Learning (“Deep Learning”) • Also still in use: symbolic, rule-based methods and systems • Huge data sets + powerful algorithms + extremely fast hardware • Enormous potential for disruptions in all sectors and areas
  • 5. META-NET and Multilingual Europe
  • 6. • Multilingualism is at the heart of the European idea • 24 EU languages – all have the same status • Dozens of regional and minority languages as well as languages of immigrants and trade partners • Many economic and social challenges: – The Digital Single Market needs to be multilingual – Cross-border, cross-lingual, cross-cultural communication
  • 7. META-NET: 60 research centres in 34 countries (founded in 2010) • Chair of Executive Board: Jan Hajic (CUNI) • Deputies: J. van Genabith (DFKI), A. Vasiljevs (Tilde) • General Secretary: Georg Rehm (DFKI) • META (Multilingual Europe Technology Alliance): 826 members in 67 countries • Publications pictured: one published in 2013; one in 31 volumes, published in 2012 • Projects: T4ME (META-NET), CESAR, METANET4U, META-NORD
  • 8. Basque, Bulgarian*, Catalan, Croatian*, Czech*, Danish*, Dutch*, English*, Estonian*, Finnish*, French*, Galician, German*, Greek*, Hungarian*, Icelandic, Irish*, Italian*, Latvian*, Lithuanian*, Maltese*, Norwegian, Polish*, Portuguese*, Romanian*, Serbian, Slovak*, Slovene*, Spanish*, Swedish*, Welsh (* Official EU language) • http://www.meta-net.eu/whitepapers
  • 9. Level of support through language technology (LT), by area:
      – MT • excellent: none • good: English • moderate: French, Spanish • fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian • weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh
      – Speech • excellent: none • good: English • moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish • fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish • weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh
      – Text Analytics • excellent: none • good: English • moderate: Dutch, French, German, Italian, Spanish • fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish • weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
      – Resources • excellent: none • good: English • moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish • fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene • weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
  • 10. Chart: level of support (weak/none, fragmentary, moderate, good, excellent) per language: Welsh, Maltese, Lithuanian, Latvian, Icelandic, Irish, Croatian, Serbian, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English • Languages with names in red have little or no MT support • Source: META-NET White Paper Series: Europe's Languages in the Digital Age. Springer, Heidelberg, New York, Dordrecht, London, September 2012. Georg Rehm and Hans Uszkoreit (series editors)
  • 11. (Chart repeated from slide 10) Important: even current state-of-the-art technologies are far from perfect!
  • 12. (Chart repeated from slide 10) Important: 20+ European languages are severely under-supported and face the danger of digital extinction.
  • 13. (Chart repeated from slide 10) We carried out the study in 2010/2012. While support for many languages has improved in the meantime, the overall picture remains mostly the same.
  • 14. AI and Interpretation
  • 15. Translation and Interpretation • Since approx. 2015, with breakthroughs in neural technologies, Machine Translation has been getting better and better. • All areas of AI look for “super-human performance” but language is fundamentally different and much more complex. • Neural AI approaches cannot understand language; they process it according to huge underlying data sets. • In many use cases, mistakes can be tolerated. • But: translation and interpretation are often mission-critical! • Mistakes can have serious consequences (politics, medicine).
  • 16. Speech Translation • Example: Lecture Translator – University lectures are automatically transcribed and translated, in near-real time, into several languages – Students can follow the translation through a web interface • Example: Presentation Translator – Presenter can have the speech automatically translated – Translations are displayed as subtitles • Example: Call Translator – Internet telephony provider offers automatic voice translation
  • 17. Issues and Limitations • The three example applications work surprisingly well for general-domain language and input. But: – They are far from being perfect. – They aren’t robust. – They cannot cope with unforeseen situations. – They cannot understand language as humans do. – They are not (yet?) suited for conference interpretation. → Limitations as regards their fields of application. • Interpretation is often mission-critical. → Human interpreters won’t be replaced anytime soon.
  • 18. https://slator.com/features/ai-interpreter-fail-at-china-summit-sparks-debate-about-future-of-profession/
  • 19. Human Language Project
  • 20. LT – Current Developments • LT in Europe: world-class research, strong SME base, thousands of LSPs; immense fragmentation; need for coordination. • Need for high-quality LT: translation, interpretation, the Multilingual Digital Single Market (MDSM) etc. • The European Language Challenge cannot be – it must not be – abandoned or outsourced! → Need for Language Technology, made in Europe, for Europe! → STOA Workshop in the EP (January 2017): “Language equality in the digital age – towards a Human Language Project” • Pictured: study cover (EPRS | European Parliamentary Research Service, Scientific Foresight Unit (STOA), PE 581.621, Science and Technology Options Assessment)
  • 21. Human Language Project • Goal: Deep Natural Language Understanding by 2030 • Vision: EU FET Flagship Project (10+ years) • Broad coverage, high quality, high precision • Create approaches, algorithms, data sets, resources • Across modalities: text, text types, speech, video etc. • Diagram labels: Artificial Intelligence (including cognition, perception, vision, cross-modal, cross-platform, cross-culture etc.), Machine Learning, Language Technology, Linguistics
  • 22. Summary & Conclusions • AI is disrupting all industries, including translation and, increasingly, also interpretation. → But: perfect, robust, precise language technologies (incl. written/spoken MT and interpretation) are still far away. • Linguists are increasingly needed; new profiles are emerging. → The machine will support human experts and help them become more efficient; it will not replace them. • The Human Language Project is still a vision. Its goal: develop new breakthroughs in Language Technology.
  • 23. Recommendation • SCIC Speech Repository • 4,000 speeches (3,000 public + 1,000 private) • Extremely interesting data set and language resource for Language Technology researchers! • Many R&D groups currently work on TED talk data sets • Recommendation: establish bridges between SCIC and research groups for spoken language translation • Help build the next generation of AI tools for interpreters • AI tools that are tailored to the needs and wishes, topics and domains of conference interpreters in the EC/EP
  • 24. Thank you! • Dr. Georg Rehm, DFKI Berlin • georg.rehm@dfki.de • http://de.linkedin.com/in/georgrehm • https://www.slideshare.net/georgrehm • Pictured: Strategic Research and Innovation Agenda “Language Technologies for Multilingual Europe: Towards a Human Language Project”, SRIA Editorial Team, Version 1.0 – December 2017