Apertium: Free/open-source rule-based machine translation and language processors, Mikel L. Forcada, Universitat d'Alacant, Spain

Apertium is a free/open-source platform for rule-based machine translation which was started in 2005 and is developed collaboratively. Apertium provides a translation engine, linguistic data for a variety of language pairs, and a host of tools for developers and users. Anyone with reasonable computing skills and good translation skills can join the Apertium community and, in no time, find themselves contributing to the building of machine translation systems for a language pair. Apertium is particularly suitable for related-language pairs (such as Spanish→Portuguese or Czech→Slovak), where its shallow-transfer technology suffices to produce posteditable translations, but it is also being used for less-related language pairs in gisting applications. A nice side effect is the development of monolingual language processors (lemmatizers, part-of-speech taggers), which are available to help statistical machine translation deal with languages having a challenging morphology. Apertium is a mature technology: for instance, it is currently being used by the Spanish Government to provide on-the-fly machine translation for public-service webpages, by the regional newspaper Levante-EMV to generate a Catalan online edition, and by Wikipedia to offer a service to translate articles.

  1. Apertium: Free/open-source rule-based machine translation and language processors. Mikel L. Forcada, Universitat d'Alacant, E-03690 Sant Vicent del Raspeig, Spain. Riga TAUS Roundtable, June 1, 2016.
  2. What is Apertium? Apertium (since 2005) is a free/open-source platform for shallow-transfer rule-based machine translation which is collaboratively developed and provides: a configurable, language-independent machine translation engine; data (dictionaries, rules) for more than 40 language pairs (in XML and text-based formats); and many tools for developers and users.
  3. What is Apertium? Pipeline architecture. A pipelined architecture allows for easy customization and diagnostics. (Figure: pipeline diagram.) Source-language (SL) text → deformatter → morphological analyser → morphological disambiguator → lexical transfer → lexical selection → structural transfer → morphological generator → post-generator → reformatter → target-language (TL) text.
  4. What is Apertium? Languages and language pairs. (Figure: graph of language pairs; the individual links are not recoverable from the transcript.) Languages shown, by three-letter code: afr, nld, arg, cat, ita, bre, fra, spa, cym, eng, glg, dan, nno, nob, ast, por, ron, epo, eus, hbs, mkd, slv, bul, ind, zsm, isl, swe, kaz, tat, mlt, ara, oci, sme, urd, hin.
  5. What is Apertium? Apertium loves small languages. Some unique MT systems for small languages: Breton→French, Aragonese↔Spanish, Occitan↔Catalan, Aragonese↔Catalan, Occitan↔Spanish, North Sámi→Norwegian. To love is to give: e.g. provide small languages with language resources and computational-linguistic descriptions of their language.
  6. What is Apertium good for? Apertium is mainly good at translating between related languages. Some examples in Apertium: Spanish ↔ Portuguese, Norwegian Nynorsk ↔ Norwegian Bokmål, Slovenian ↔ Croatian, Tatar ↔ Kazakh. Postediting Apertium output in these cases may save time compared to translation from scratch. It is also being used for less-related language pairs in gisting applications.
  7. Apertium is collaboratively developed. Apertium licensing: free/open-source. Apertium language data and code are both licensed under the GNU General Public License: a free/open-source license allowing free distribution of unmodified and modified versions; and a copylefted license, which avoids private appropriation and encourages giving improvements back to the project (a commons) → community.
  8. Apertium is collaboratively developed. Very active group of hundreds of developers (freelance developers, researchers, industrial partners). Wiki documentation (wiki.apertium.org) in addition to formal documents. Help available on the IRC channel #apertium at freenode.net. Mailing lists: apertium-stuff@lists.sf.net and other language-specific lists.
  9. Apertium is collaboratively developed. Research and business with Apertium. Apertium is already an active research and business platform. Research: 40+ publications, 2 PhD theses, 4 master's theses. Business: companies (Prompsit, Eleka, Imaxin Software, etc.) offering services to customers such as Autodesk, the Government of Catalonia, one of the main Basque banks, the daily newspaper La Voz de Galicia, etc. The free/open-source model creates a community which effectively connects researchers, developers, vendors and users.
  10. Becoming an Apertium user. Professional translators can: use Apertium offline plugins in the OmegaT free/open-source CAT environment; and (as with any other system) easily align source and MT output to generate machine translation memories to feed into other CAT systems. Muggles can use: a stand-alone Java application for the desktop (apertium-caffeine); an Android version for handhelds; a stand-alone version (Apertium Simpleton) for Windows and MacOS; and a plug-in for the OmegaT CAT platform (apertium-omegat).
  11. Becoming an Apertium developer. It's easy to become an Apertium developer. It just takes reasonable computing skills (XML, shell commands, etc.), which are not too hard to acquire, and good translation skills. In no time, developers find themselves contributing to a language pair with the support of the community.
  12. A nice side effect: monolingual resources. When developing a language pair, monolingual language resources are developed, such as morphological dictionaries and morphological disambiguation rules and probabilities. The corresponding monolingual processors are available to help statistical machine translation deal, for instance, with languages having a challenging morphology.
  13. Success cases. Apertium is a mature technology which is used: in Wikimedia Content Translation to generate Wikipedia content in other languages; to produce a Catalan edition of the Valencia daily newspaper Levante-EMV; by universities in the Catalan-speaking area to help in the generation of courseware and academic information; and in PLATA, the Spanish government platform for on-the-fly machine translation of public-service webpages.
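
The pipeline idea from slide 3 can be illustrated with a small sketch. This is not Apertium code or its API: the stage functions, the `run_pipeline` helper and the three tiny Spanish→Portuguese dictionaries are all hypothetical, invented here for illustration, and only three of the stages in the diagram are mimicked. It aims to show why a chain of independent stages is easy to customize (swap one function) and to diagnose (inspect what each stage produced).

```python
from typing import Callable, List, Optional, Tuple

# A lexical unit is (lemma, tags). All data below is toy data invented for
# this sketch; it is NOT real Apertium linguistic data.
LexicalUnit = Tuple[str, Tuple[str, ...]]

ANALYSES = {  # Spanish surface form -> (lemma, tags)
    "gatos": ("gato", ("n", "m", "pl")),
    "negros": ("negro", ("adj", "m", "pl")),
}
BILINGUAL = {"gato": "gato", "negro": "preto"}  # Spanish lemma -> Portuguese lemma
GENERATION = {  # (lemma, tags) -> Portuguese surface form
    ("gato", ("n", "m", "pl")): "gatos",
    ("preto", ("adj", "m", "pl")): "pretos",
}


def analyse(text: str) -> List[LexicalUnit]:
    """Morphological analysis: look up each word; unknown words pass through."""
    return [ANALYSES.get(word, (word, ("unknown",))) for word in text.split()]


def lexical_transfer(units: List[LexicalUnit]) -> List[LexicalUnit]:
    """Replace each source lemma by its target lemma, keeping the tags."""
    return [(BILINGUAL.get(lemma, lemma), tags) for lemma, tags in units]


def generate(units: List[LexicalUnit]) -> str:
    """Morphological generation: inflect each unit; fall back to the lemma."""
    return " ".join(GENERATION.get(unit, unit[0]) for unit in units)


def run_pipeline(data, stages: List[Callable], trace: Optional[list] = None):
    """Feed the output of each stage into the next, optionally recording it."""
    for stage in stages:
        data = stage(data)
        if trace is not None:
            trace.append((stage.__name__, data))
    return data


STAGES = [analyse, lexical_transfer, generate]

if __name__ == "__main__":
    trace: list = []
    print(run_pipeline("gatos negros", STAGES, trace))
    for name, output in trace:  # diagnostics: what did each stage produce?
        print(f"{name}: {output}")
```

Because each stage only depends on the output of the previous one, a single stage can be replaced or its intermediate output examined without touching the rest, which is the property the slide attributes to the pipelined architecture.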
