Alfredo studied computer engineering in Mexico and worked at Microsoft before pursuing a PhD in computational linguistics at Trinity College Dublin, where he researches word sense disambiguation and induction using co-occurrence vectors. Liliana is also a PhD student at Trinity, exploring speculation and emotion in text. Both noted differences between public and private education in their home countries of Mexico and Peru compared to Ireland, but see growing opportunities for technology careers and research investment in Latin America.
Experiences from Two Latin American PhD Students in Ireland
1. A Latin American Perspective
on Education in Ireland
Experiences of two Latin American PhD students in Ireland
Liliana Mamani Sánchez
Alfredo Maldonado Guerra
Trinity College Dublin
4. • Born in Monterrey, Mexico
• Studied Computer Systems Engineering at
Tecnológico de Monterrey (1998)
• Got interested in the interaction of computers
with human language
• Most of my classmates ended up developing
'boring software' and I wanted a challenge
involving language and computers
Background
5. • Joined Microsoft Ireland as Spanish Terminologist (2000)
Focus on 'Neutral' Spanish and products specific to
Mexican Market
• As extra-curricular activities I developed some database
tools to help organise terminological work
• In 2006 I transitioned to a fully technical position:
Linguistic Engineer
– Continued working on terminology management tools
– Developed a consistency terminology checker (NLP)
Background
6. Photo by artemuestra on Flickr
Photo by Scorpions and Centaurs on Flickr
Photo by infomatique on Flickr
2008: Maybe a good time to go back to college?
Background
7. PhD at Trinity College Dublin
• Joined TCD as a full-time student in July
2009
• Funded by CNGL
• Research was originally inspired by my
terminology background, but it soon evolved
into an interest in the polysemy of words in
general (word-sense disambiguation)
8. Research Summary
• Compositionality (transparency) of multi-
word expressions / terms
• Method based on word co-occurrence
vectors
• Competed in Shared Task (DiSCo 2011
Workshop in Portland, Oregon)
9. Research Summary
• Word-sense disambiguation and induction
Two possible meanings of the word rock:
Photo by deep_schismic on Flickr
Photo by Minerva Bloom on Photopedia
10. Research Summary
• Word-sense disambiguation: Given a word
in context, determine its meaning
• Word-sense induction: Given a collection of
texts (corpus), find out what senses a word
has
• Applications in machine translation, search,
information retrieval, dictionary making
(lexicography), terminology, etc.
• Sense depends on word's context:
– Rock fractures in geological processes
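To make the disambiguation task concrete, here is a minimal sketch that picks a sense for "rock" by counting the words a context shares with each sense. It uses a simplified overlap heuristic in the spirit of the Lesk algorithm, which is a stand-in for illustration and not the method used in this research. The sense labels, the signature word lists and the function name are all assumptions made up for the example.

```python
def disambiguate(context_tokens, sense_signatures):
    """Return the sense whose signature shares the most words with the context."""
    context = {token.lower() for token in context_tokens}
    return max(sense_signatures,
               key=lambda sense: len(context & sense_signatures[sense]))

# Hypothetical signatures for two senses of "rock" (illustrative only).
SENSES = {
    "rock (stone)": {"geological", "fractures", "mineral", "stone", "processes"},
    "rock (music)": {"band", "guitar", "concert", "music", "album"},
}

sentence = "Rock fractures in geological processes".split()
print(disambiguate(sentence, SENSES))  # prints "rock (stone)"
```

Word-sense induction is the harder variant: the sense inventory above would not be given and would have to be discovered from the corpus, for example by clustering context vectors.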
11. Research Summary
• Context representation: co-occurrence
vector
• First-order co-occurrence vector
• Second-order co-occurrence vector
• Compared the performance of both vector
types and concluded that simpler first-order
vectors are just as good as second-order
vectors
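The two vector types can be sketched as follows, assuming one common formulation: a first-order vector counts the words that appear in the context itself, while a second-order vector sums the corpus-wide co-occurrence vectors of those context words. The toy corpus, the window size and the function names are assumptions chosen for illustration, not the setup used in the research.

```python
from collections import Counter, defaultdict

def word_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def first_order(context):
    """First-order vector: counts of the words occurring in the context itself."""
    return Counter(context)

def second_order(context, vectors):
    """Second-order vector: sum of the co-occurrence vectors of the context words."""
    total = Counter()
    for word in context:
        total.update(vectors.get(word, Counter()))
    return total

# Toy corpus (made up for this example).
corpus = [
    ["rock", "band", "plays", "loud", "music"],
    ["rock", "fractures", "in", "geological", "processes"],
    ["guitar", "music", "is", "loud"],
]
vectors = word_vectors(corpus)
context = ["loud", "guitar"]
print(first_order(context))
print(second_order(context, vectors))
```

In this toy case the second-order vector gives weight to "music" even though that word never appears in the context, which is the usual motivation for second-order representations.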
12. Research Summary
• Context representation has thousands of
dimensions
• Dimensionality Reduction
– Singular Value Decomposition
(computationally intensive)
– Alternative based on consolidating redundant
dimensions together (less computationally
intensive)
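As a rough illustration of what an SVD-based reduction does, the sketch below finds the leading singular directions of a small context matrix and projects every context vector onto them. It uses power iteration with deflation in plain Python, a stand-in chosen only to keep the example free of dependencies; real work would normally call a linear algebra library. The matrix values and function names are assumptions. The alternative based on consolidating redundant dimensions is not sketched, because the slides do not describe how it works.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leading_direction(A, iters=200, seed=0):
    """Leading singular value and right singular vector of A (power iteration on A^T A)."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in A[0]]
    for _ in range(iters):
        Av = [dot(row, v) for row in A]
        w = [sum(A[i][j] * Av[i] for i in range(len(A))) for j in range(len(v))]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:          # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    sigma = math.sqrt(sum(dot(row, v) ** 2 for row in A))
    return sigma, v

def reduce_dimensions(A, k):
    """Project the rows of A onto (at most) its k leading singular directions."""
    residual = [row[:] for row in A]
    basis = []
    for _ in range(k):
        sigma, v = leading_direction(residual)
        if sigma < 1e-9:
            break
        basis.append(v)
        # Deflate: remove the component along v from every row.
        deflated = []
        for row in residual:
            p = dot(row, v)
            deflated.append([a - p * x for a, x in zip(row, v)])
        residual = deflated
    return [[dot(row, v) for v in basis] for row in A]

# Toy context matrix: 3 contexts x 4 dimensions, with duplicated columns.
contexts = [
    [2.0, 2.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
]
reduced = reduce_dimensions(contexts, 2)
print(reduced)
```

Because the toy matrix has rank 2, the two-dimensional projection keeps the dot products between contexts intact; with real data, truncation discards the weakest directions and is therefore lossy.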
13. Education in Mexico and Ireland
• Whole education (with exception of PhD)
done in Mexico
• Attended private schools (often
considered more prestigious than public
schools) – Unequal access to quality
education
• Still, leading research in Mexico is done in
public universities (UNAM)
14. Education in Mexico and Ireland
• Definitely more funding for research
available in Ireland than in Mexico.
– Cuts in Ireland due to euro crisis
• Career opportunities for technology
graduates looking brighter in both
countries
– Recent 'digital' twinning of Dublin and
Guadalajara
– Flurry of startup companies
15. Education in Mexico and Ireland
“What struck me most here in Monterrey,
though, is the number of tech start-ups
that are emerging from Mexico’s young
population [...] thanks to cheap, open
source innovation tools and cloud
computing.”
• Friedman, Thomas. The New York Times,
23 Feb 2013 http://tinyurl.com/abwke7t
19. Research
• Exploring topics in Computational Linguistics
that also have a short deployment time can be difficult.
• Study of speculation in informal language
(hedging)
• Study of emotions in text independent of
language.
• Research guided by my supervisor: Carl Vogel.
21. Linguistic Speculation in
Text
hi jt...
also might just download
program again from here and
the install should pick up your
license with no interaction
needed from you.
22. Signals of Emotion
• Emoticons and smilies conveying polarity of
sentiment: positive, negative, neutral.
• Language independent signals of
emotions
:) smile
;) wink
laughing
:’( crying
angry
:O surprised
Positive
Negative
Neutral
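A small sketch of how emoticons could serve as language-independent polarity signals: each emoticon found in a text is looked up in a lexicon and the polarities are tallied. The assignment of emoticons to polarity classes below is an assumption (the slide lists the emoticons and the three classes but the mapping is inferred), ":(" is added from the forum examples, and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumed mapping from emoticon to polarity (illustrative only).
EMOTICON_POLARITY = {
    ":)": "positive",   # smile
    ";)": "positive",   # wink
    ":'(": "negative",  # crying
    ":(": "negative",   # sad (not on the slide, added for the example)
    ":O": "neutral",    # surprised
}

# Longest emoticons first so that multi-character forms match as a whole.
_EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_POLARITY, key=len, reverse=True))
)

def polarity_counts(text):
    """Count the polarity of every known emoticon found in `text`."""
    normalised = text.replace("\u2019", "'")  # treat a curly apostrophe as a plain one
    return Counter(EMOTICON_POLARITY[m] for m in _EMOTICON_RE.findall(normalised))

print(polarity_counts("Thanks!! :) ;)"))
print(polarity_counts("Problem 1 is sort of resolved! :("))
```

Since the lookup never inspects the words, the same lexicon applies to posts written in any language.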
23. Different domains
Academic writing
• Grammatically correct
• Limited use of first-person
sentences.
• Easier to be processed
by software tools.
Informal writing (Web
forums)
• Noisy text: typos,
complex names,
non-natural-language text,
slang, etc.
• Analysis of this kind of
text is still an ongoing
research topic.
User oriented analysis: Writer oriented
Reader oriented
24. Main questions to be
answered
• Do speculation markers really convey
speculation?
• How does the use of emoticons and speculation
markers correlate with user categories?
o I welcome this sort of post for everyone else reading this thread
o Thanks!! I was sort of thinking that..
o Problem 1 is sort of resolved! :(
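The first question can be illustrated with a naive marker spotter. The sketch below only finds surface occurrences of candidate speculation markers; the two-item marker list and the function name are assumptions for illustration, not the lexicon used in the research.

```python
import re

# Hypothetical, deliberately tiny list of candidate speculation markers.
HEDGE_MARKERS = ["might", "sort of"]

_HEDGE_RE = re.compile(
    r"\b(" + "|".join(re.escape(m) for m in HEDGE_MARKERS) + r")\b",
    re.IGNORECASE,
)

def find_hedges(text):
    """Return every candidate speculation marker found in `text`, lower-cased."""
    return [m.group(1).lower() for m in _HEDGE_RE.finditer(text)]

posts = [
    "I welcome this sort of post for everyone else reading this thread",
    "Thanks!! I was sort of thinking that..",
    "Problem 1 is sort of resolved! :(",
]
for post in posts:
    print(find_hedges(post), "<-", post)
```

All three posts match "sort of", yet in the first one it means "kind of post" rather than expressing uncertainty. A surface match alone therefore cannot tell whether a marker really conveys speculation.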
25. Potential applications
• Analysis of sentiment in text in response
to marketing strategies.
• Analysis of point of view in discussion
forums.
• User profiling in user generated content:
– Identification of features that make a user a
proactive individual
– Features that make a text likable
26. Education in Latin America
and Ireland
• Differences between public and private
educational institutions in my country.
• Establishment of Computer Science as a body
of knowledge.
• Governmental and private investment in
research is limited in my country, but it is
increasing.
• Career opportunities for graduates in LA and
Ireland.