OpenLogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries

Anabela Barreiro (1), Fernando Batista (1,2), Ricardo Ribeiro (1,2), Helena Moniz (1,3), Isabel Trancoso (1,4)

(1) INESC-ID, (2) ISCTE-IUL, (3) FLUL/CLUL, (4) IST

{abarreiro;fmmb;rdmr;helenam;imt}@l2f.inesc-id.pt
http://www.l2f.inesc-id.pt/

Characteristics
– Representation schema with eclectic categories
– Designed to work in concert with the lexical resources and linguistic rules (transfer (TRAN) and semantico-syntactic (SEMTAB) rules)
– Easy mapping from natural to symbolic language, representing both meaning and structure in a continuum, undissociated, represented in the same layer, based on the belief that the semantics of a word often affects the surrounding syntax
– Extensible system, designed so that developers would expand and add to its capabilities
– Initially developed for English, but many of its elements are universal (mostly nouns, adjectives, and adverbs) and applicable to other languages

Representation
– SAL knowledge is embedded in the dictionary in the form of numeric codes (SAL mnemonics are used for easier understanding)
  • E.g. the noun (N) table has two SAL representations:
    – COsurf: concrete, surface
    – INdata: information, recorded data
– Nouns have 12 supersets. Superset measure (ME) has 3 sets and 11 subsets:
  • SAL codes for nouns represent semantic groupings and are language independent, as concepts cut across languages
– Verbs are subdivided into 3 types: intransitive, weak transitive and strong transitive. Intransitive verbs have 3 supersets: motional (INMO), operational (INOP), and existential (INEX)
  • Existential intransitive verbs include be and be-substitutes that take predicate nouns and adjectives
– Adjectives are classified into 2 types: descriptive and participial, sub-classified according to syntactic relationships with other words
  • syntactic patterns for the descriptive pre-clausal good-type adjectives

– OpenLogos (OL) is the open source derivative of the Logos machine translation (MT) system
– OL's strength resides in its lexical resources, the knowledge-rich bilingual dictionaries
  • contain semantico-syntactic knowledge and ontological relations for all lexical entries, represented at an abstract/higher level by the Semantico-Syntactic Abstraction Language (SAL)
  • present other idiosyncrasies that distinguish them from other publicly available dictionaries

Motivation
– OL resources were used successfully in the Logos commercial MT product for two to three decades
  • validated by the Logos development team and clients
– Possible applications
  • basis for new linguistic and NLP tools, especially for poorly resourced languages
  • enhancement of other MT systems

Bilingual Dictionaries: EN > GE/FR/IT
– Verbs, nouns and adjectives are clearly the most represented classes: each dictionary holds more than 80,000 entries in total, of which these three classes account for roughly 77,000 to 83,000, depending on the target language.
– Dictionaries stored in self-contained XML files
  • easily addressed by small programs
  • supported by existing efficient XML APIs
– Example for the verb entry depart, extracted from the English-French dictionary

Introduction

Semantico-Syntactic Knowledge
– Part-of-speech (POS)
– Gender (GEN)
– Number (NUM)
– Morphological paradigms (PAT) for source and target words
  • make it possible to map inflected forms across languages and improve agreement in SMT
– Head word (HEAD) in multiword
  • useful to correct MT problems related to agreement within multiwords or within larger units (e.g. between nominal multiwords and verb, or agreement within verbal multiwords)
– Homographs (HOMO)
  • homographs are a major source of translation errors and their identification is crucial
– Auxiliary (AUX)
  • helps improve precision in the translation when auxiliary choice is subtle
– Alternate word (ALT)
  • nominalization (process noun), predicate adjective, etc.; useful for paraphrasing purposes
– Causative verb (CAUS)
– Reflexive verb (REFL)
– Aspectual verb (ASP)
– Semantico-Syntactic Knowledge (SAL)
  • interlingua-style hierarchical taxonomy with over 1,000 elements, embracing all POS
  • 3 levels of representation: superset (SUPER), set (SET), and subset (SUB), embedded in the dictionary entries and in the translation system's rules (help with disambiguation). E.g. pipe, hose:

OpenLogos Data
– Three bilingual dictionaries were created
  • English-French; English-German; English-Italian
  • online and free for research purposes
    – http://metanet4u.l2f.inesc-id.pt/
– The resources contain semantico-syntactic knowledge concerning the conceptual formalization of things, ideas, relationships, dispositions, conditions, processes, etc.
  • valuable for MT and other NLP applications
  • stored in XML format for easy processing
– In the future, we will make available three complementary bilingual dictionaries
  • English-Portuguese; English-Spanish; German-English

Acknowledgments
– This work was supported by national funds through Fundação para a Ciência e a Tecnologia, under grants SFRH/BPD/91446/2012 and SFRH/BPD/95849/2013 and project PEst-OE/EEI/LA0021/2013

Conclusions and Future Work

Resulting Resources

Instituto de Engenharia de Sistemas e Computadores
Investigação e Desenvolvimento em Lisboa
Laboratório de Sistemas de Língua Falada
	
  	
Word class                          id   EN-GE   EN-FR   EN-IT
Noun                                 1   28266   25910   23505
Verb                                 2   33855   33354   33021
Adverb (locative)                    3     465     442     450
Adjective                            4   21219   20749   20518
Pronoun                              5     121     121     121
Adverb (manner, agency, degree)      6    2207    2167    2173
Preposition (non-locative)          11     140     140     139
Auxiliary and Modal                 12      34      34      34
Preposition (locative)              13     148     148     148
Definite Article                    14     194     194     189
Indefinite Article                  15      66      66      65
Arithmate in Apposition             16     208     208     203
Negative                            17       2       2       2
Relative and Interrogative Pronoun  18      23      23      20
Conjunction                         19     160     160     160
Punctuation                         20      30      30      30
Total                                    87138   83748   80778

[Figure: SAL hierarchy for the example pipe, hose. Levels shown: word class (nouns) > superset (concrete) > set (functionals) > subset (conduits, shown alongside barriers, containers, ...).]

	
<Entry source="depart" target="quitter">
  <source head_word="1" homograph="no" word_type="01">
    <pos description="Verb" wclass="02"/>
    <morphology>
      <inflection description="like walk, walked, walking" example="walk" id="1"/>
    </morphology>
    <sal code="13,98,596" description="create, etc." mnemonic="generictransitive4" set="other98"/>
  </source>
  <target aux="1" head_word="1" word_type="01">
    <pos description="Verb" wclass="02"/>
    <morphology>
      <inflection description="regular ending in -er: parler" example="parler" id="3"/>
    </morphology>
  </target>
</Entry>
<Entry source="depart" target="partir">
  <source head_word="1" homograph="no" word_type="01">
    <pos description="Verb" wclass="02"/>
    <morphology>
      <inflection description="like walk, walked, walking" example="walk" id="1"/>
    </morphology>
    <sal code="10,24,596" description="from = away from, off of, out of" set="governsawayfrom"/>
  </source>
  <target aux="2" head_word="1" word_type="01">
    <pos description="Verb" wclass="02"/>
    <morphology>
      <inflection description="Irreg. in -ir with shortened stem ..." example="partir" id="12"/>
    </morphology>
  </target>
</Entry>

Mnemonic          Example Verb      Example Sentence
INEXbe-type       be                She was at the seashore all summer.
INEXbecome-type   become, remain    He became a doctor at a very young age.
INEXgrow-type     sound, look       Their voices sounded cheerful.
INEXseem-type     seem, appear      He seemed happy with the results.

Mnemonic      Description                     Examples
MEabs         abstract measurable concepts    humidity, length
MEdis         discrete measurable concepts    sum, increment
MEunit        units of measure                See subsets
MEunitwt      units of weight                 ounce, pound
MEunitvel     units of velocity               mph, megahertz
MEunitvol     units of volume measure         gallon, liter
MEunittemp    units of temperature            degrees Celsius
MEunitener    units of energy/force           watt, horsepower
MEunitsys     measurement systems             Fahrenheit, Kelvin
MEunitdur     units of duration               hour, year
MEunitspec    specialized units of measure    oersted, ohm
MEunitvalue   units of money/value            dollar, euro
MEunitlin     units of linear/area measure    inch, mile
MEundif       undifferentiated measure        degree, share

Pattern                   Example Sentence
It is ADJ that            It is silly that...
It is ADJ for NP that     It is good for the employees that...
It is ADJ to VP           It is smart to exercise.
It is ADJ for NP to VP    It was silly for them to expect...
It is ADJ V'ing           It is smart doing the right thing.
NP is ADJ to VP           John is smart to exercise.

Mais conteúdo relacionado

Mais procurados

Semantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditorSemantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditorCognitum
 
Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Taggingtheyaseen51
 
referát.doc
referát.docreferát.doc
referát.docbutest
 
7 probability and statistics an introduction
7 probability and statistics an introduction7 probability and statistics an introduction
7 probability and statistics an introductionThennarasuSakkan
 
natural language processing
natural language processing natural language processing
natural language processing sunanthakrishnan
 
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...Guy De Pauw
 
NLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological ParsingNLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological ParsingHemantha Kulathilake
 
Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678Editor IJARCET
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsAdrian Paschke
 
Lean Logic for Lean Times: Entailment and Contradiction Revisited
Lean Logic for Lean Times: Entailment and Contradiction RevisitedLean Logic for Lean Times: Entailment and Contradiction Revisited
Lean Logic for Lean Times: Entailment and Contradiction RevisitedValeria de Paiva
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationLifeng (Aaron) Han
 
AN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATION
AN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATIONAN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATION
AN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATIONijnlc
 
Portuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowPortuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowValeria de Paiva
 
Lean Logic for Lean Times: Varieties of Natural Logic
Lean Logic for Lean Times: Varieties of Natural LogicLean Logic for Lean Times: Varieties of Natural Logic
Lean Logic for Lean Times: Varieties of Natural LogicValeria de Paiva
 

Mais procurados (20)

Semantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditorSemantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditor
 
Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
 
referát.doc
referát.docreferát.doc
referát.doc
 
7 probability and statistics an introduction
7 probability and statistics an introduction7 probability and statistics an introduction
7 probability and statistics an introduction
 
natural language processing
natural language processing natural language processing
natural language processing
 
NLTK
NLTKNLTK
NLTK
 
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
 
A Proposition Bank of Urdu
A Proposition Bank of UrduA Proposition Bank of Urdu
A Proposition Bank of Urdu
 
NLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological ParsingNLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological Parsing
 
Intro to NLP. Lecture 2
Intro to NLP.  Lecture 2Intro to NLP.  Lecture 2
Intro to NLP. Lecture 2
 
Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678
 
Anabela Barreiro - Alinhamentos
Anabela Barreiro - AlinhamentosAnabela Barreiro - Alinhamentos
Anabela Barreiro - Alinhamentos
 
Cross language alignments - challenges guidelines and gold sets
Cross language alignments - challenges guidelines and gold setsCross language alignments - challenges guidelines and gold sets
Cross language alignments - challenges guidelines and gold sets
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and Systems
 
NLP todo
NLP todoNLP todo
NLP todo
 
Lean Logic for Lean Times: Entailment and Contradiction Revisited
Lean Logic for Lean Times: Entailment and Contradiction RevisitedLean Logic for Lean Times: Entailment and Contradiction Revisited
Lean Logic for Lean Times: Entailment and Contradiction Revisited
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine Translation
 
AN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATION
AN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATIONAN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATION
AN EMPIRICAL STUDY OF WORD SENSE DISAMBIGUATION
 
Portuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowPortuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and How
 
Lean Logic for Lean Times: Varieties of Natural Logic
Lean Logic for Lean Times: Varieties of Natural LogicLean Logic for Lean Times: Varieties of Natural Logic
Lean Logic for Lean Times: Varieties of Natural Logic
 

Destaque

Sight interpreting - sight translation
Sight interpreting - sight translationSight interpreting - sight translation
Sight interpreting - sight translationTyana Widyanandini
 
14 2 Paraphrasing Power Point
14 2 Paraphrasing Power Point14 2 Paraphrasing Power Point
14 2 Paraphrasing Power PointBulldog4
 

Destaque (10)

Machine Translation of Discontinuous Multiword Units
Machine Translation of Discontinuous Multiword UnitsMachine Translation of Discontinuous Multiword Units
Machine Translation of Discontinuous Multiword Units
 
Automatic Paraphrasing of Human Intransitive Adjectives in Portuguese
Automatic Paraphrasing of Human Intransitive Adjectives in PortugueseAutomatic Paraphrasing of Human Intransitive Adjectives in Portuguese
Automatic Paraphrasing of Human Intransitive Adjectives in Portuguese
 
Contributos das Tecnologias da Língua para a Globalização do Português
Contributos das Tecnologias da Língua para a Globalização do PortuguêsContributos das Tecnologias da Língua para a Globalização do Português
Contributos das Tecnologias da Língua para a Globalização do Português
 
Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...
Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...
Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...
 
CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Transla...
CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Transla...CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Transla...
CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Transla...
 
Agente Virtual
Agente Virtual Agente Virtual
Agente Virtual
 
ReEscreve: a translator-friendly multi-purpose paraphrasing software tool - A...
ReEscreve: a translator-friendly multi-purpose paraphrasing software tool - A...ReEscreve: a translator-friendly multi-purpose paraphrasing software tool - A...
ReEscreve: a translator-friendly multi-purpose paraphrasing software tool - A...
 
When Multiwords Go Bad in Machine Translation
When Multiwords Go Bad in Machine TranslationWhen Multiwords Go Bad in Machine Translation
When Multiwords Go Bad in Machine Translation
 
Sight interpreting - sight translation
Sight interpreting - sight translationSight interpreting - sight translation
Sight interpreting - sight translation
 
14 2 Paraphrasing Power Point
14 2 Paraphrasing Power Point14 2 Paraphrasing Power Point
14 2 Paraphrasing Power Point
 

Semelhante a OpenLogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries

Disntinguished Speaker - Corina Forascu
Disntinguished Speaker - Corina ForascuDisntinguished Speaker - Corina Forascu
Disntinguished Speaker - Corina Forascuoxwocs
 
NLP Deep Learning with Tensorflow
NLP Deep Learning with TensorflowNLP Deep Learning with Tensorflow
NLP Deep Learning with Tensorflowseungwoo kim
 
ENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptx
ENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptxENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptx
ENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptxSyedNadeemAbbas6
 
Fao Semantics Related Projects
Fao Semantics Related ProjectsFao Semantics Related Projects
Fao Semantics Related ProjectsMargherita Sini
 
5a use of annotated corpus
5a use of annotated corpus5a use of annotated corpus
5a use of annotated corpusThennarasuSakkan
 
Natural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptxNatural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptxAlyaaMachi
 
Etymology Markup in TEI XML
Etymology Markup in TEI XMLEtymology Markup in TEI XML
Etymology Markup in TEI XMLJack Bowers
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Chunyang Chen
 
Pos Tagging for Classical Tamil Texts
Pos Tagging for Classical Tamil TextsPos Tagging for Classical Tamil Texts
Pos Tagging for Classical Tamil Textsijcnes
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsR Systems International
 
An exploratory corpus study of the AP Spanish
An exploratory corpus study of the AP SpanishAn exploratory corpus study of the AP Spanish
An exploratory corpus study of the AP SpanishSteven Saffels
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShashank Shisodia
 
Presentation ASLIB 2014_Ghoula
Presentation ASLIB 2014_GhoulaPresentation ASLIB 2014_Ghoula
Presentation ASLIB 2014_GhoulaNizar Ghoula
 
A Brief Introduction to SKOS
A Brief Introduction to SKOSA Brief Introduction to SKOS
A Brief Introduction to SKOSHeather Hedden
 
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert Systemcsandit
 

Semelhante a OpenLogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries (20)

eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and SummarizationeSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
 
Poster @ enetCollect CA MC meeting in Iasi, Romania
Poster @ enetCollect CA MC meeting in Iasi, Romania Poster @ enetCollect CA MC meeting in Iasi, Romania
Poster @ enetCollect CA MC meeting in Iasi, Romania
 
Disntinguished Speaker - Corina Forascu
Disntinguished Speaker - Corina ForascuDisntinguished Speaker - Corina Forascu
Disntinguished Speaker - Corina Forascu
 
haenelt.ppt
haenelt.ppthaenelt.ppt
haenelt.ppt
 
NLP Deep Learning with Tensorflow
NLP Deep Learning with TensorflowNLP Deep Learning with Tensorflow
NLP Deep Learning with Tensorflow
 
ENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptx
ENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptxENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptx
ENeL_WG3_Survey-AKA4Lexicography-TiberiusHeylenKrek (1).pptx
 
nlp (1).pptx
nlp (1).pptxnlp (1).pptx
nlp (1).pptx
 
Fao Semantics Related Projects
Fao Semantics Related ProjectsFao Semantics Related Projects
Fao Semantics Related Projects
 
5a use of annotated corpus
5a use of annotated corpus5a use of annotated corpus
5a use of annotated corpus
 
Natural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptxNatural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptx
 
Etymology Markup in TEI XML
Etymology Markup in TEI XMLEtymology Markup in TEI XML
Etymology Markup in TEI XML
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
 
Pos Tagging for Classical Tamil Texts
Pos Tagging for Classical Tamil TextsPos Tagging for Classical Tamil Texts
Pos Tagging for Classical Tamil Texts
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analytics
 
An exploratory corpus study of the AP Spanish
An exploratory corpus study of the AP SpanishAn exploratory corpus study of the AP Spanish
An exploratory corpus study of the AP Spanish
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliterator
 
Presentation ASLIB 2014_Ghoula
Presentation ASLIB 2014_GhoulaPresentation ASLIB 2014_Ghoula
Presentation ASLIB 2014_Ghoula
 
A Brief Introduction to SKOS
A Brief Introduction to SKOSA Brief Introduction to SKOS
A Brief Introduction to SKOS
 
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
 

Mais de INESC-ID (Spoken Language Systems Laboratory - L2F)

Mais de INESC-ID (Spoken Language Systems Laboratory - L2F) (20)

Multi3Generation@INGL2020
Multi3Generation@INGL2020Multi3Generation@INGL2020
Multi3Generation@INGL2020
 
NooJ 2020 presentation
NooJ 2020 presentationNooJ 2020 presentation
NooJ 2020 presentation
 
PROPOR2020_Barreiroetal
PROPOR2020_BarreiroetalPROPOR2020_Barreiroetal
PROPOR2020_Barreiroetal
 
Análise comparativa das edições portuguesa e brasileira de Os livros que dev...
Análise comparativa das edições portuguesa e brasileira de  Os livros que dev...Análise comparativa das edições portuguesa e brasileira de  Os livros que dev...
Análise comparativa das edições portuguesa e brasileira de Os livros que dev...
 
Welcome session 3rd Annual MC Meeting - enetCollect COST Action
Welcome session 3rd Annual MC Meeting - enetCollect COST ActionWelcome session 3rd Annual MC Meeting - enetCollect COST Action
Welcome session 3rd Annual MC Meeting - enetCollect COST Action
 
Syntactic-semantic analysis for information extraction in biomedicine
Syntactic-semantic analysis for information extraction in biomedicineSyntactic-semantic analysis for information extraction in biomedicine
Syntactic-semantic analysis for information extraction in biomedicine
 
Cross language semantic relations between English and Portuguese
Cross language semantic relations between English and PortugueseCross language semantic relations between English and Portuguese
Cross language semantic relations between English and Portuguese
 
Paraphrasing biomedical support verb constructions for machine translation
Paraphrasing biomedical support verb constructions for machine translationParaphrasing biomedical support verb constructions for machine translation
Paraphrasing biomedical support verb constructions for machine translation
 
ReWriter for legal text
ReWriter for legal textReWriter for legal text
ReWriter for legal text
 
Chatbots for Language Learning
Chatbots for Language LearningChatbots for Language Learning
Chatbots for Language Learning
 
Barreiro et al POP@PROPOR2018-informal2formal-language
Barreiro et al POP@PROPOR2018-informal2formal-languageBarreiro et al POP@PROPOR2018-informal2formal-language
Barreiro et al POP@PROPOR2018-informal2formal-language
 
Rebelo-Arnold et al POP@PROPOR2018-EP-BP-alignments
Rebelo-Arnold et al POP@PROPOR2018-EP-BP-alignmentsRebelo-Arnold et al POP@PROPOR2018-EP-BP-alignments
Rebelo-Arnold et al POP@PROPOR2018-EP-BP-alignments
 
Barreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Batista-LR4NLP@Coling2018-presentationBarreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Batista-LR4NLP@Coling2018-presentation
 
Barreiro-Mota-VarDial@Coling2018-poster
Barreiro-Mota-VarDial@Coling2018-posterBarreiro-Mota-VarDial@Coling2018-poster
Barreiro-Mota-VarDial@Coling2018-poster
 
NooJ-2018-Palermo
NooJ-2018-PalermoNooJ-2018-Palermo
NooJ-2018-Palermo
 
projeto-eSPERTo
projeto-eSPERToprojeto-eSPERTo
projeto-eSPERTo
 
ReEscreve: A Translator-Friendly Multi-Purpose Paraphrasing Software Tool
ReEscreve: A Translator-Friendly Multi-Purpose Paraphrasing Software ToolReEscreve: A Translator-Friendly Multi-Purpose Paraphrasing Software Tool
ReEscreve: A Translator-Friendly Multi-Purpose Paraphrasing Software Tool
 
Poster l2f 2017
Poster l2f 2017Poster l2f 2017
Poster l2f 2017
 
Nooj2017 cmota-etal
Nooj2017 cmota-etalNooj2017 cmota-etal
Nooj2017 cmota-etal
 
Content Writing Optimization with ReWriter
Content Writing Optimization with ReWriterContent Writing Optimization with ReWriter
Content Writing Optimization with ReWriter
 

OpenLogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries

Anabela Barreiro (1), Fernando Batista (1,2), Ricardo Ribeiro (1,2), Helena Moniz (1,3), Isabel Trancoso (1,4)
(1) INESC-ID, (2) ISCTE-IUL, (3) FLUL/CLUL, (4) IST
{abarreiro;fmmb;rdmr;helenam;imt}@l2f.inesc-id.pt
http://www.l2f.inesc-id.pt/
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa, Laboratório de Sistemas de Língua Falada

Introduction
–  OpenLogos (OL) is the open-source derivative of the Logos machine translation (MT) system
–  OL's strength resides in its lexical resources, the knowledge-rich bilingual dictionaries, which
   •  contain semantico-syntactic knowledge and ontological relations for all lexical entries, represented at an abstract/higher level by the Semantico-Syntactic Abstraction Language (SAL)
   •  present other idiosyncrasies that distinguish them from other publicly available dictionaries

Motivation
–  OL resources were used successfully in the Logos commercial MT product for two to three decades
   •  validated by the Logos development team and clients
–  Possible applications
   •  basis for new linguistic and NLP tools, especially for under-resourced languages
   •  enhancement of other MT systems

OpenLogos Data

Semantico-Syntactic Knowledge
–  Part-of-speech (POS)
–  Gender (GEN)
–  Number (NUM)
–  Morphological paradigms (PAT) for source and target words
   •  make it possible to map inflected forms across languages and improve agreement in SMT
–  Head word (HEAD) in multiword
   •  useful to correct MT problems related to agreement within multiwords or within larger units (e.g. between nominal multiwords and verb, or agreement within verbal multiwords)
–  Homographs (HOMO)
   •  homographs are a major source of translation errors and their identification is crucial
–  Auxiliary (AUX)
   •  helps improve precision in the translation when auxiliary choice is subtle
–  Alternate word (ALT)
   •  nominalization (process noun), predicate adjective, etc.; useful for paraphrasing purposes
–  Causative verb (CAUS)
–  Reflexive verb (REFL)
–  Aspectual verb (ASP)
–  Semantico-Syntactic Knowledge (SAL)
   •  interlingua-style hierarchical taxonomy with over 1,000 elements, embracing all POS
   •  3 levels of representation: superset (SUPER), set (SET), and subset (SUB), embedded in the dictionary entries and in the translation system's rules (help with disambiguation). E.g. pipe, hose:

      word class:  nouns
      superset:    concrete
      set:         functionals
      subset:      conduits (sibling subsets: barriers, containers, ...)

Characteristics
–  Representation schema with eclectic categories
–  Designed to work in concert with the lexical resources and linguistic rules (transfer (TRAN) and semantico-syntactic (SEMTAB) rules)
–  Easy mapping from natural to symbolic language, representing both meaning and structure as an undissociated continuum in the same layer, based on the belief that the semantics of a word often affects the surrounding syntax
–  Extensible system, designed so that developers can expand and add to its capabilities
–  Initially developed for English, but many of its elements are universal (mostly nouns, adjectives, and adverbs) and applicable to other languages

Representation
–  SAL knowledge is embedded in the dictionary in the form of numeric codes (SAL mnemonics are used for easier understanding)
   •  E.g. the noun (N) table has two SAL representations:
      –  COsurf: concrete, surface
      –  INdata: information, recorded data
–  Nouns have 12 supersets. Superset measure (ME) has 3 sets and 11 subsets:

      Mnemonic      Description                      Examples
      MEabs         abstract measurable concepts     humidity, length
      MEdis         discrete measurable concepts     sum, increment
      MEunit        units of measure                 see subsets
      MEunitwt      units of weight                  ounce, pound
      MEunitvel     units of velocity                mph, megahertz
      MEunitvol     units of volume measure          gallon, liter
      MEunittemp    units of temperature             degrees celsius
      MEunitener    units of energy/force            watt, horsepower
      MEunitsys     measurement systems              fahrenheit, kelvin
      MEunitdur     units of duration                hour, year
      MEunitspec    specialized units of measure     oersted, ohm
      MEunitvalue   units of money/value             dollar, euro
      MEunitlin     units of linear/area measure     inch, mile
      MEundif       undifferentiated measure         degree, share

   •  SAL codes for nouns represent semantic groupings, and are language independent, as concepts are transverse across languages
–  Verbs are subdivided into 3 types: intransitive, weak transitive and strong transitive. Intransitive verbs have 3 supersets: motional (INMO), operational (INOP), and existential (INEX)
   •  Existential intransitive verbs include be and be-substitutes that take predicate nouns and adjectives

      Mnemonic           Example Verb      Example Sentence
      INEXbe-type        be                She was at the seashore all summer.
      INEXbecome-type    become, remain    He became a doctor at a very young age.
      INEXgrow-type      sound, look       Their voices sounded cheerful.
      INEXseem-type      seem, appear      He seemed happy with the results.

–  Adjectives are classified into 2 types: descriptive and participial, sub-classified according to syntactic relationships with other words
   •  syntactic patterns for the descriptive pre-clausal good-type adjectives

      Pattern                   Example Sentence
      It is ADJ that            It is silly that...
      It is ADJ for NP that     It is good for the employees that...
      It is ADJ to VP           It is smart to exercise.
      It is ADJ for NP to VP    It was silly for them to expect...
      It is ADJ V'ing           It is smart doing the right thing.
      NP is ADJ to VP           John is smart to exercise.

Resulting Resources

Bilingual Dictionaries: EN > GE/FR/IT
–  Verbs, nouns and adjectives are clearly the most represented classes; together they account for more than 95% of the entries (between roughly 77,000 and 83,000) for each target language

      Word class                            id    EN-GE    EN-FR    EN-IT
      Noun                                   1    28266    25910    23505
      Verb                                   2    33855    33354    33021
      Adverb (locative)                      3      465      442      450
      Adjective                              4    21219    20749    20518
      Pronoun                                5      121      121      121
      Adverb (manner, agency, degree)        6     2207     2167     2173
      Preposition (non-locative)            11      140      140      139
      Auxiliary and Modal                   12       34       34       34
      Preposition (locative)                13      148      148      148
      Definite Article                      14      194      194      189
      Indefinite Article                    15       66       66       65
      Arithmate in Apposition               16      208      208      203
      Negative                              17        2        2        2
      Relative and Interrogative Pronoun    18       23       23       20
      Conjunction                           19      160      160      160
      Punctuation                           20       30       30       30
      Total                                       87138    83748    80778

–  Dictionaries stored in self-contained XML files
   •  easily addressed by small programs
   •  supported by existing efficient XML APIs
–  Example for the verb entry depart, extracted from the English-French dictionary

      <Entry source="depart" target="quitter">
          <source head_word="1" homograph="no" word_type="01">
              <pos description="Verb" wclass="02"/>
              <morphology>
                  <inflection description="like walk, walked, walking" example="walk" id="1"/>
              </morphology>
              <sal code="13,98,596" description="create, etc." mnemonic="generictransitive4" set="other98"/>
          </source>
          <target aux="1" head_word="1" word_type="01">
              <pos description="Verb" wclass="02"/>
              <morphology>
                  <inflection description="regular ending in -er: parler" example="parler" id="3"/>
              </morphology>
          </target>
      </Entry>
      <Entry source="depart" target="partir">
          <source head_word="1" homograph="no" word_type="01">
              <pos description="Verb" wclass="02"/>
              <morphology>
                  <inflection description="like walk, walked, walking" example="walk" id="1"/>
              </morphology>
              <sal code="10,24,596" description="from = away from, off of, out of" set="governsawayfrom"/>
          </source>
          <target aux="2" head_word="1" word_type="01">
              <pos description="Verb" wclass="02"/>
              <morphology>
                  <inflection description="Irreg. in -ir with shortened stem ..." example="partir" id="12"/>
              </morphology>
          </target>
      </Entry>

Conclusions and Future Work
–  Three bilingual dictionaries were created
   •  English-French; English-German; English-Italian
   •  online and free for research purposes
      –  http://metanet4u.l2f.inesc-id.pt/
–  The resources contain semantico-syntactic knowledge concerning the conceptual formalization of things, ideas, relationships, dispositions, conditions, processes, etc.
   •  valuable for MT and other NLP applications
   •  stored in XML format for easy processing
–  In the future, we will make available three complementary bilingual dictionaries
   •  English-Portuguese; English-Spanish; German-English

Acknowledgments
–  This work was supported by national funds through Fundação para a Ciência e a Tecnologia, under grants SFRH/BPD/91446/2012 and SFRH/BPD/95849/2013 and project PEst-OE/EEI/LA0021/2013
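
The poster notes that the XML dictionaries are easily addressed by small programs. As a rough illustration only, the sketch below reads the two depart entries from the English-French example with Python's standard xml.etree.ElementTree module. Several things here are assumptions rather than facts about the released files: the <Dictionary> root element is a hypothetical wrapper added so the fragment parses as one document, the helper names parse_sal and load_entries are invented for this example, and reading the three comma-separated numbers of a SAL code as separate hierarchy levels is a guess based on the three SAL levels (superset, set, subset) the poster describes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The two "depart" entries from the poster's English-French example.
# ASSUMPTION: <Dictionary> is a hypothetical root wrapper; the real
# files may use a different top-level element.
SAMPLE = """
<Dictionary>
  <Entry source="depart" target="quitter">
    <source head_word="1" homograph="no" word_type="01">
      <pos description="Verb" wclass="02"/>
      <morphology>
        <inflection description="like walk, walked, walking" example="walk" id="1"/>
      </morphology>
      <sal code="13,98,596" description="create, etc." mnemonic="generictransitive4" set="other98"/>
    </source>
    <target aux="1" head_word="1" word_type="01">
      <pos description="Verb" wclass="02"/>
      <morphology>
        <inflection description="regular ending in -er: parler" example="parler" id="3"/>
      </morphology>
    </target>
  </Entry>
  <Entry source="depart" target="partir">
    <source head_word="1" homograph="no" word_type="01">
      <pos description="Verb" wclass="02"/>
      <morphology>
        <inflection description="like walk, walked, walking" example="walk" id="1"/>
      </morphology>
      <sal code="10,24,596" description="from = away from, off of, out of" set="governsawayfrom"/>
    </source>
    <target aux="2" head_word="1" word_type="01">
      <pos description="Verb" wclass="02"/>
      <morphology>
        <inflection description="Irreg. in -ir with shortened stem ..." example="partir" id="12"/>
      </morphology>
    </target>
  </Entry>
</Dictionary>
"""


def parse_sal(code):
    """Split a SAL code string such as "13,98,596" into a tuple of three ints.

    ASSUMPTION: the three fields correspond to the three SAL levels; the
    poster does not state the field order explicitly.
    """
    first, second, third = (int(part) for part in code.split(","))
    return first, second, third


def load_entries(xml_text):
    """Return one flat dict per <Entry> element found in the XML text."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for entry in root.iter("Entry"):
        sal = entry.find("source/sal")
        entries.append({
            "source": entry.get("source"),
            "target": entry.get("target"),
            "pos": entry.find("source/pos").get("description"),
            "sal": parse_sal(sal.get("code")),
            "sal_set": sal.get("set"),
            "aux": entry.find("target").get("aux"),
        })
    return entries


entries = load_entries(SAMPLE)

# Group candidate translations by source word, keeping the SAL set label
# that distinguishes the two readings of "depart".
translations = defaultdict(list)
for e in entries:
    translations[e["source"]].append((e["target"], e["sal_set"]))

for target, sal_set in translations["depart"]:
    print(f"depart -> {target} (SAL set: {sal_set})")
```

Grouping by source word keeps both French candidates side by side together with the SAL set label that separates them, which is the kind of information a disambiguation step could use. How the target aux attribute should be interpreted is left open here, since the poster only says that AUX helps when auxiliary choice is subtle.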