diccionarios
Wörterbücher
사전
λεξικά
‫מילון‬
辞書
словари
dictionnaires
字典
dizionari
शब्दकोष
Natural Language
Processing (NLP)
Kristen Parton
What is NLP?
• “Natural” languages
– English, Mandarin, French, Swahili, Arabic, Nahuatl, …
– NOT Java, C++, Perl, …
• Ultimate goal: Natural human-to-computer communication
• Sub-field of Artificial Intelligence, but very interdisciplinary
– Computer science, human-computer interaction (HCI), linguistics,
cognitive psychology, speech signal processing (EE), …
• Shall we play a game? (1983)
Real-world NLP
How does NLP work…
• Morphology: What is a word?
• 奧林匹克運動會(希臘語:Ολυμπιακοί Αγώνες,簡稱奧運會或
奧運)是國際奧林匹克委員會主辦的包含多種體育運動項目的國際
性運動會,每四年舉行一次。
• ك بيوت ها = “to her houses”
• Lexicography: What does each word mean?
– He plays bass guitar.
– That bass was delicious!
• Syntax: How do the words relate to each other?
– The dog bit the man. ≠ The man bit the dog.
– But in Russian: человек собаку съел = человек съел собаку
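The “What is a word?” question can be made concrete with a small sketch. Chinese text has no spaces, so splitting on whitespace returns a whole clause as one token. A dictionary-based segmenter is one classic workaround. The lexicon below is a hypothetical toy list covering only the last clause of the example sentence, and greedy longest-match is just one simple baseline, not necessarily what any real system uses.

```python
# Toy lexicon (an assumption for illustration); real segmenters rely on
# large dictionaries or statistical models.
LEXICON = {"每", "四年", "舉行", "一次"}


def max_match(text, lexicon):
    """Greedy left-to-right longest-match segmentation."""
    longest = max(len(word) for word in lexicon)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing matches.
            if candidate in lexicon or size == 1:
                tokens.append(candidate)
                i += size
                break
    return tokens


clause = "每四年舉行一次"  # "held once every four years"
print(clause.split())              # whitespace splitting finds one "word"
print(max_match(clause, LEXICON))  # dictionary lookup finds four
```

Greedy matching can pick the wrong split when dictionary entries overlap, which is part of why segmentation is treated as a research problem rather than a lookup.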
How does NLP work…
• Semantics: How can we infer meaning from sentences?
– I saw the man on the hill with the telescope.
– The iPod is so small! (good)
– The monitor is so small! (bad)
• Discourse: How about across many sentences?
– President Bush met with President-Elect Obama today at the
White House. He welcomed him, and showed him around.
– Who is “he”? Who is “him”? How would a computer figure that
out?
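To see why “Who is he?” is hard for a computer, consider a deliberately naive sketch: resolve every pronoun to the most recently mentioned name. The mention list is supplied by hand here (an assumption; a real system would have to find the names itself), and the heuristic is a strawman, not a method taken from the slides.

```python
TEXT = (
    "President Bush met with President-Elect Obama today at the White House. "
    "He welcomed him, and showed him around."
)

# Hand-written mention list (hypothetical; normally produced by a
# named-entity recognizer).
MENTIONS = ["President Bush", "President-Elect Obama"]


def most_recent_mention(text, pronoun_start, mentions):
    """Return the mention that starts closest before the pronoun."""
    best, best_pos = None, -1
    for mention in mentions:
        pos = text.rfind(mention, 0, pronoun_start)
        if pos > best_pos:
            best, best_pos = mention, pos
    return best


he = most_recent_mention(TEXT, TEXT.index("He welcomed"), MENTIONS)
him = most_recent_mention(TEXT, TEXT.index("him,"), MENTIONS)
print(he, "|", him)
```

Both pronouns resolve to the same person, so the heuristic reads the sentence as someone welcoming himself. Getting this right needs knowledge about syntax and about the world, such as who hosts at the White House.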
Examples from Prof. Julia Hirschberg’s slides
Spoken Language Processing
• Speech Recognition
– Automatic dictation, assistance for blind people, indexing
YouTube videos, automatic 411, …
• Related things we study…
– How does intonation affect semantic meaning?
– Detecting uncertainty and emotions
– Detecting deception!
• Why is this hard?
– Each speaker has a different voice (male vs. female, child
vs. older person)
– Many different accents (Scottish, American, non-native
speakers) and ways of speaking
– Conversation: turn taking, interruptions, …
Examples from Prof. Julia Hirschberg’s slides
Spoken Language Processing
• Text-to-Speech / Spoken dialog systems
– Call response centers, tutoring systems, …
• Related things we study…
– Making computer voices sound more human
– Making computer speech acts more human-like
Machine Translation
Machine Translation
• About $10 billion spent annually on human translation
• Hotels in Beijing, China
– 昨天我打电话订的时候艺龙信誓旦旦的保证说是四星级的酒店,住进去
以后一看没,我靠,这在80年代可能算得上是四星的,我要的是368的大床
房,房间只有一个0.5米*1米的小窗户,打开一看,我靠, ...
– Yesterday, I called out when Art Long vowed to ensure that the four-
star hotel, to live in. I see no future, I rely on it in the 80s may be
regarded as a four-star, and I want the big 368-bed Room, the room
is only one 0.5 m * 1-meter small windows, what we can see, I rely
on, ...
– "本人刚从酒店回来,很想发表一下自己的看法。总体印象:位置很好
,价格也不错,但是服务一般或是太一般了,前台接待的水平和效
率 ..."
– "I came back from the hotel, would like to express my own views. The
overall impression: a good location, good prices, but services in
general or too general, the level of the front reception and efficiency
..."
Why is machine translation hard?
• Requires both understanding the “from” language and
generating the “to” language.
• How can we teach a computer a “second language”
when it doesn’t even really have a first language?
• Can we do machine translation without solving natural
language understanding and natural language
generation first?
Que hambre tengo yo
What hunger have I
I've got that hunger
I am so hungry
She let the cat out of the bag.
Ella deja que el gato fuera de la bolsa
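The gap between “What hunger have I” and “I am so hungry” is what a pure dictionary lookup produces. A minimal sketch, assuming a four-entry gloss table copied from the literal rendering above (the table and function name are illustrative, not part of any real system):

```python
# Word-level gloss table taken from the literal rendering on the slide.
GLOSS = {"que": "what", "hambre": "hunger", "tengo": "have", "yo": "I"}


def word_for_word(sentence, gloss):
    """Replace each word independently, keeping the source word order."""
    return " ".join(gloss.get(word.lower(), word) for word in sentence.split())


print(word_for_word("Que hambre tengo yo", GLOSS))
```

The output keeps Spanish word order and phrasing, which is why fluent translation needs more than a dictionary. Idioms such as “let the cat out of the bag” fail the same way.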
Rosetta Stone (not the product)
• Example of “parallel text”: same text in two or more
languages
– Hieroglyphic Egyptian, Demotic Egyptian and classical Greek
• Used to understand hieroglyphic writing system
Statistical Machine Translation
• Lots and lots of parallel text
– Learn word-for-word translations
– Learn phrase-for-phrase translations
– Learn syntax and grammar rules?
Taken from Prof. Chris Manning’s slides
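One way to “learn word-for-word translations” from parallel text is IBM Model 1 trained with expectation-maximization. The slides do not name a specific method, so this choice is an assumption, and the three-sentence German-English corpus is a made-up toy example. The sketch omits the NULL source word that fuller versions of the model include.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (German words, English words).
CORPUS = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]


def train_ibm1(corpus, iterations=10):
    """Estimate t(target | source) with EM, starting from uniform values."""
    src_vocab = {s for src, _ in corpus for s in src}
    tgt_vocab = {t for _, tgt in corpus for t in tgt}
    prob = {(t, s): 1.0 / len(tgt_vocab) for s in src_vocab for t in tgt_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words in its sentence.
        for src, tgt in corpus:
            for t in tgt:
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    share = prob[(t, s)] / norm
                    count[(t, s)] += share
                    total[s] += share
        # M-step: renormalize the expected counts per source word.
        for t, s in prob:
            prob[(t, s)] = count[(t, s)] / total[s]
    return prob


def best_translation(prob, source_word):
    """Pick the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in prob.items() if s == source_word}
    return max(candidates, key=candidates.get)


table = train_ibm1(CORPUS)
for word in ["das", "Haus", "Buch", "ein"]:
    print(word, "->", best_translation(table, word))
```

No dictionary is given to the model. “das” ends up paired with “the” only because the two co-occur across sentences, which is the same idea as using the Rosetta Stone as parallel text.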
NLP: Conclusions
• NLP is already used in many systems today
– Indexing words on the web: Segmenting Chinese, tokenizing
English, decompounding German, …
– Call centers (“Welcome to AT&T…”)
• Many technologies are in use, and still improving
– Machine translation used by soldiers in Iraq (speech to speech
translation?)
– Dictation used by doctors, many professionals
• Lots of awesome research to work on!
– Detecting deception in speech?
– Tracking social networks via documents?
– Can a computer get an 800 on the verbal SAT? (not yet!)
NLP @ Columbia
• CS4705 Natural Language Processing
• CS4706 Spoken Language Processing
• CS6998 Search Engine Technology, CS6870 Speech Recognition,
CS6998 Computational Approaches to Emotional Speech, …
• Related to the Artificial Intelligence track
• Professor Kathleen McKeown
• Professor Julia Hirschberg
• Researchers Owen Rambow,
Nizar Habash, Mona Diab,
Rebecca Passonneau (@ CCLS)
• Opportunities for undergrad
research 
Taken from Prof. Chris Manning’s slides
Natural Language Understanding
• Syntactic Parse
Taken from Prof. Chris Manning’s slides
Why is this customer confused?
• A: And, what day in May did you want to travel?
• C: OK, uh, I need to be there for a meeting that’s from the
12th to the 15th.
• Note that the client did not answer the question.
• Meaning of client’s sentence:
– Meeting
• Start-of-meeting: 12th
• End-of-meeting: 15th
– Doesn’t say anything about flying!!!!!
• How does agent infer client is informing him/her of travel dates?
Examples from Prof. Julia Hirschberg’s slides
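The frame on this slide can be sketched in code. Extracting the meeting dates is the easy part. The step from “meeting on the 12th to 15th” to travel dates needs an extra rule that the sentence never states. The class, function names, and inference rule below are all hypothetical, chosen only to make that gap visible.

```python
import re
from dataclasses import dataclass


@dataclass
class MeetingFrame:
    start_day: int
    end_day: int


def parse_meeting(utterance):
    """Pull the first two ordinal days (e.g. '12th') out of the utterance."""
    days = [int(d) for d in re.findall(r"\b(\d{1,2})(?:st|nd|rd|th)\b", utterance)]
    if len(days) < 2:
        return None
    return MeetingFrame(start_day=days[0], end_day=days[1])


def infer_travel(frame):
    """Assumed world-knowledge rule: be there for the whole meeting."""
    return {"arrive_by": frame.start_day, "depart_on_or_after": frame.end_day}


utterance = "OK, uh, I need to be there for a meeting that's from the 12th to the 15th."
frame = parse_meeting(utterance)
print(frame)
print(infer_travel(frame))
```

Nothing in `parse_meeting` knows about flights. The travel constraint comes entirely from `infer_travel`, which stands in for the background knowledge a human agent applies without noticing.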
Question Answering
• How old is Julia Roberts?
• When did the Berlin Wall fall?
• What about something more open-ended?
– Why did the US enter WWII?
– How does the Electoral College work?
• May want to ask questions about non-English, non-text
documents… and get responses back in English text.
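A common first step in question answering is guessing what kind of answer a question expects. The sketch below uses a hand-written prefix table (an assumption for illustration, with made-up label names) to separate the factoid questions above from the open-ended ones.

```python
# Ordered rules: more specific prefixes must come before general ones.
RULES = [
    ("how old", "AGE"),
    ("when", "DATE"),
    ("why", "EXPLANATION"),
    ("how", "EXPLANATION"),
]


def answer_type(question):
    """Return a coarse expected-answer label based on the question's opening words."""
    normalized = question.lower().strip()
    for prefix, label in RULES:
        if normalized.startswith(prefix):
            return label
    return "UNKNOWN"


for q in [
    "How old is Julia Roberts?",
    "When did the Berlin Wall fall?",
    "Why did the US enter WWII?",
    "How does the Electoral College work?",
]:
    print(answer_type(q), "<-", q)
```

An AGE or DATE answer can be a single number pulled from a document, while an EXPLANATION has to be assembled from several sentences. That difference is what makes the open-ended questions harder.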
Natural Language Understanding
Taken from Prof. Chris Manning’s slides
