 Overview
 Basic knowledge
 Demonstration

USER : Men are all alike.
ELIZA : In what way?
USER : They’re always bugging us about something or other.
ELIZA : Can you think of a specific example?
USER : Well, my boyfriend made me come here.
ELIZA : Your boyfriend made you come here ?
USER : He says I’m depressed much of the time.
ELIZA : I am sorry to hear you are depressed.
USER : It’s true, I am unhappy.
ELIZA : Do you think coming here will help you not to be unhappy?
USER : I need some help; that much seems certain.
ELIZA : What would it mean to you if you got some help?
USER : Perhaps I could learn to get along with my mother.
ELIZA : Tell me more about your family.
USER : My mother takes care of me.
ELIZA : Who else in your family takes care of you?
USER : My father.
ELIZA : Your father ?
USER : You are like my father in some ways.
ELIZA : What resemblance do you see?
 A sub-field of Artificial Intelligence, studied since the 1960s …

 Concerned with the interactions between computers and
  human languages, with one ultimate goal: computers can
  “understand” human language

 Many real-world applications
 Natural language unit?
    Natural language understanding
    Natural language generation


 Data?
    Speech processing
    Text processing


Natural language text understanding!
 The task of generating natural language from a machine
  representation
 May be viewed as the opposite of natural language
  understanding

 Applications:
   Joke generation
   Textual summaries of databases
   Enhancing accessibility
 An advanced subtopic of NLP that deals with reading
  comprehension
 More complex than NLG
 Much commercial interest in this field
   News-gathering
   Data-Mining
   Voice-Activation
   Large-scale content analysis
 Natural language lacks the precision of formal logic; its
  flexibility causes difficulties in NLP

 Examples:
   Time flies like an arrow
  Can be understood in 7 ways!

   I never said she stole my money!
      Stress on "I": Someone else said it, but I didn't
      Stress on "never": I simply didn't ever say it
      Stress on "said": I might have implied it in some way, but I never explicitly said it
      Stress on "she": I said someone took it; I didn't say it was she
      Stress on "stole": I just said she probably borrowed it
      Stress on "my": I said she stole someone else's money
      Stress on "money": I said she stole something, but not my money
 Word combination and division
 Stress placed on words
 The properties of subjects
   We gave the monkeys the bananas because they were
    hungry
   We gave the monkeys the bananas because they were
    over-ripe
 Specifying which word an adjective applies to
   A pretty little girls' school
 Involves reasoning about the world
 Embedded in a social system of people interacting
   persuading, insulting and amusing them
   changing over time
 Homonyms
 Automatic Summarization
 Information Extraction
 Grammar Testing
 ePi Group:
   Automatic Vietnamese processing system
   www.baomoi.com
      Collecting news from all Vietnamese e-newspapers

 EVTrans – Softex Co Ltd.
 Cyclop
 VnKim
 Morphological analysis
   Individual words are analyzed into their
    components
 Syntactic analysis
   Linear sequences of words are transformed
    into structures that show how the words
    relate to each other
 Semantic analysis
   A transformation is made from the input
    text to an internal representation that
    reflects the meaning
 Pragmatic analysis
   Reinterpreting what was said as what was
    actually meant
 Discourse analysis
   Resolving references between sentences
Morphology

Syntax

Semantic

Pragmatic

Discourse
 Morphemes: the smallest meaningful units of language
   Stem: book, cat, car, …
   Affixes: un-, -s, -es, …
   Clitics: ’ve, ’m

 Morphological parsing: parsing a word
  into its stem and affixes and identifying the
  parts and their relationships
 Word Classes
   Parts of speech: noun, verb, adjective,
    etc.
   Word class dictates how a word combines
    with morphemes to form new words

 Examples
   Books = book + s
   Unladylike = un + lady + like
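The decompositions above can be sketched as simple affix stripping. This is only a toy illustration: the `PREFIXES`, `SUFFIXES` and `STEMS` mini-lexicons and the `parse_word` helper are hypothetical names introduced here, and real morphological parsers typically rely on finite-state transducers with much larger lexicons.

```python
# Toy affix-stripping morphological parser (illustrative sketch only).
# PREFIXES, SUFFIXES and STEMS are hypothetical mini-lexicons.
PREFIXES = ["un"]
SUFFIXES = ["like", "es", "s"]
STEMS = {"book", "cat", "car", "lady"}


def parse_word(word):
    """Return the morphemes of `word` as a list, or None if no parse is found."""
    word = word.lower()
    if word in STEMS:
        return [word]
    # Try peeling a known prefix off the front of the word.
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            rest = parse_word(word[len(prefix):])
            if rest is not None:
                return [prefix] + rest
    # Try peeling a known suffix off the end of the word.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            rest = parse_word(word[:-len(suffix)])
            if rest is not None:
                return rest + [suffix]
    return None


print(parse_word("Books"))
print(parse_word("Unladylike"))
```

A word whose stem is missing from the lexicon simply gets no parse, which is one reason practical systems need broad lexicons.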
 Vietnamese?
   Ăn = ăn
   Uống = uống
   Xe = xe

 No ‘Xes’ in Vietnamese!
 The problem is tokenizing the text.
 Why parse words?
   To identify a word’s part of speech
   To identify a word’s stem (IR)

… then?
   Spell-checking
   To predict next words
   To predict the word’s accent
 Ambiguity
   I want her to go to the cinema with me
      To - infinitive?
      To - preposition?
   Con ngựa đá đá con ngựa đá. (Vietnamese: "The stone horse kicks the stone horse.")
      đá = đá? ("đá" can mean "stone" or "to kick")
 How to implement?
   Regular expressions
   Finite State Transducers (FST)
   Finite State Acceptors (FSA)

  *.exe
  ir??man
  \b[0-9]+ *(Mb|[Mm]egabytes?)\b
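As a rough sketch of how the three patterns above behave, the snippet below uses Python's standard `re` and `fnmatch` modules. The first two patterns are treated as shell-style wildcards and the third as a regular expression with `\b` word boundaries; the sample strings and the `find_sizes` helper are made up for illustration.

```python
import fnmatch
import re

# Regular expression for sizes such as "500 Mb" or "2 megabytes".
# \b marks a word boundary, so "100Mbps" is not accepted.
SIZE = re.compile(r"\b[0-9]+ *(Mb|[Mm]egabytes?)\b")


def find_sizes(text):
    """Return every substring of `text` that looks like a size in megabytes."""
    return [match.group(0) for match in SIZE.finditer(text)]


# Shell-style wildcards: * matches any run of characters, ? exactly one.
print(fnmatch.fnmatchcase("setup.exe", "*.exe"))
print(fnmatch.fnmatchcase("ironman", "ir??man"))
print(find_sizes("The disk holds 500 Mb and the RAM is 2 megabytes."))
```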
 Related terms:
   Stem, stemming
   Part of speech
   N-gram
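To make the N-gram term concrete, here is a minimal sketch that extracts n-grams from a token list and counts which words follow which. The `ngrams` and `next_word_counts` helpers and the sample sentence are assumptions introduced for illustration; the counts are the raw material a next-word predictor could use.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token windows of `tokens` as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def next_word_counts(tokens):
    """For each word, count the words that directly follow it (bigram counts)."""
    following = {}
    for first, second in ngrams(tokens, 2):
        following.setdefault(first, Counter())[second] += 1
    return following


tokens = "i ate an apple and i ate a pear".split()
print(ngrams(tokens, 2))
print(next_word_counts(tokens)["i"])
```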
SYNTAX
 Linear sequences of words are transformed into
  structures that show how the words relate to
  each other.
 Determine grammatical structure.

 I am a boy = [Subject] [Verb] [Article] [Noun]
 Syntax
   Actual structure of a sentence

 Grammar
   The rule set used in the analysis
 A grammar defines syntactically legal sentences
    I ate an apple     (syntactically legal)
    I ate apple        (not syntactically legal)
    I ate a building   (syntactically legal, but?)

 Syntactically legal doesn’t mean that it’s meaningful!
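The three example sentences can be checked against a toy grammar. The sketch below assumes a hypothetical mini-lexicon (`LEXICON`) and three context-free rules (`RULES`); it is not the grammar of any real parser, and it only tests syntactic legality, so "I ate a building" passes even though it is odd in meaning.

```python
# Hypothetical mini-lexicon: word -> part of speech.
LEXICON = {
    "i": "Pron", "ate": "V",
    "a": "Det", "an": "Det",
    "apple": "N", "building": "N",
}

# Toy context-free rules: left-hand side -> possible right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Pron"], ["Det", "N"]],
    "VP": [["V", "NP"]],
}


def derive(symbol, tags, start):
    """Return every position where `symbol` can end when matched from `start`."""
    if symbol not in RULES:  # a part-of-speech tag, matched against one word
        if start < len(tags) and tags[start] == symbol:
            return {start + 1}
        return set()
    ends = set()
    for rhs in RULES[symbol]:
        positions = {start}
        for part in rhs:
            positions = {e for p in positions for e in derive(part, tags, p)}
        ends |= positions
    return ends


def is_legal(sentence):
    """True if the whole sentence can be derived from S with the toy grammar."""
    words = sentence.lower().split()
    if any(word not in LEXICON for word in words):
        return False
    tags = [LEXICON[word] for word in words]
    return len(tags) in derive("S", tags, 0)


for s in ["I ate an apple", "I ate apple", "I ate a building"]:
    print(s, "->", is_legal(s))
```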
 Ambiguities
SEMANTIC
 What could this mean…
   Representations of linguistic inputs that capture
    the meanings of those inputs

 For us it means
   Representations that permit or facilitate
    semantic processing
   Permit us to reason about their truth
    (relationship to some world)
   Permit us to answer questions based on their
    content
   Permit us to perform inference (answer
    questions and determine the truth of things we
    don’t actually know)
 Requirements
   Verifiability
   Unambiguous representation
   Canonical form
   Inference
   Expressiveness
 Pragmatics: concerns how sentences are
  used in different situations and how use
  affects the interpretation of the sentence

 Discourse: concerns how the
  immediately preceding sentences affect
  the interpretation of the next sentence
 ‘He’, ‘it’, ‘his’ can be inferred from the
  previous sentence

 It’s discourse
 WordNet
 MindNet
 Stanford Tagger
 Stanford Parser
 ……..
 Machine translation
 Search engine
 Information extraction
 Chat bot
 Can we use previously translated text to learn how to
  translate new texts?
   Yes! But, it’s not so easy
   Two paradigms: statistical MT and EBMT (example-based MT)
 Requirements:
   Aligned large parallel corpus of translated sentences
    {S_source ⇔ S_target}
   Bilingual dictionary for intra-S alignment
   Generalization patterns (names, numbers, dates…)
 Simplest: Translation Memory
   If S_new = S_source in the corpus, output the aligned S_target

 Compositional EBMT
   If a fragment of S_new matches a fragment of S_source, output
    the corresponding fragment of the aligned S_target
   Prefer maximal-length fragments
   Maximize grammatical compositionality
      Via a target language grammar
      Or, via an N-gram statistical language model
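A minimal sketch of the two cases above: an exact translation-memory lookup, then a fallback that greedily covers the new sentence with the longest fragments found in the memory. The `MEMORY` table (English to French) and the `translate` helper are hypothetical, and the greedy longest-match strategy is a simplification chosen here; it does not score the output with a target grammar or an N-gram language model.

```python
# Hypothetical aligned fragments: source text -> target text.
MEMORY = {
    "good morning": "bonjour",
    "thank you": "merci",
    "my friend": "mon ami",
}


def translate(sentence, memory=MEMORY):
    """Translate by exact match, else by greedy longest-fragment matching."""
    words = sentence.lower().split()
    whole = " ".join(words)
    if whole in memory:  # translation memory: whole sentence seen before
        return memory[whole]
    output, i = [], 0
    while i < len(words):
        # Prefer the maximal-length fragment starting at position i.
        for j in range(len(words), i, -1):
            fragment = " ".join(words[i:j])
            if fragment in memory:
                output.append(memory[fragment])
                i = j
                break
        else:
            return None  # no known fragment covers this position
    return " ".join(output)


print(translate("Good morning"))
print(translate("thank you my friend"))
```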
 Requires an Interlingua - language-neutral Knowledge
  Representation (KR)
 Philosophical debate: Is there an interlingua?
   FOL is not totally language neutral (predicates,
    functions, expressed in a language)
   Other near-interlinguas (Conceptual Dependency)
 Requires a fully-disambiguating parser
   Domain model of legal objects, actions, relations
 Requires an NL generator (KR -> text)
 Applicable only to well-defined technical domains
 Produces high-quality MT in those domains
 Interlingua-based MT
 Rule-based MT
 Each approach has its own strengths
   Rapidly adaptable: statistical, example-based
   Good grammar: rule-based
   High precision in narrow domains: interlingua
 Google
 Yahoo
 AltaVista
 Answer.com
 Spider - a browser-like program that downloads web pages.
 Crawler - a program that automatically follows all of the
  links on each web page.
 Indexer - a program that analyzes web pages downloaded
  by the spider and the crawler.
 Database - storage for downloaded and processed pages.
 Results engine - extracts search results from the database.
 Web server - a server that is responsible for interaction
  between the user and other search engine components.
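The indexer and results engine can be sketched with an inverted index held in memory. Everything here is an assumption for illustration: `PAGES` stands in for pages a spider would have downloaded, and `build_index` and `search` are hypothetical helpers; a real engine would also rank results rather than just sort them.

```python
from collections import defaultdict

# Hypothetical in-memory "web": URL -> page text (stand-in for spider output).
PAGES = {
    "a.example/1": "natural language processing with finite state transducers",
    "a.example/2": "search engine indexing and crawling",
    "a.example/3": "language models for machine translation",
}


def build_index(pages):
    """Indexer: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Results engine: sorted URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


index = build_index(PAGES)
print(search(index, "language"))
print(search(index, "language translation"))
```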
 The idea is to ‘extract’ particular types of information from
  arbitrary text or transcribed speech

 Examples:
   Named entities: people, places, organizations
   Telephone numbers
   Dates
 Many uses:
   Question answering systems, gisting of news or mail…
   Job ads, financial information, terrorist attacks
 Often uses a set of simple templates or frames with slots
  to be filled in from the input text. Everything else is ignored.
   Husni’s number is 966-3-860-2624.
   The inventor of the first plane was Abbas ibnu Fernas.
   The British King died in March of 1932.
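Slot filling for two of the example sentences can be sketched with regular expressions. The `PHONE` and `DATE` patterns and the `fill_template` helper are hypothetical and tuned only to the formats shown in the slides; everything in the input that does not match a slot is ignored.

```python
import re

# Hypothetical patterns covering only the formats in the example sentences.
PHONE = re.compile(r"\b\d{3}-\d-\d{3}-\d{4}\b")
DATE = re.compile(
    r"\b(January|February|March|April|May|June|July|August|"
    r"September|October|November|December)(?: of)? (\d{4})\b"
)


def fill_template(text):
    """Fill a simple template with the phone number and date found in `text`."""
    template = {"phone": None, "month": None, "year": None}
    phone = PHONE.search(text)
    if phone:
        template["phone"] = phone.group(0)
    date = DATE.search(text)
    if date:
        template["month"] = date.group(1)
        template["year"] = int(date.group(2))
    return template


print(fill_template("Husni's number is 966-3-860-2624."))
print(fill_template("The British King died in March of 1932."))
```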
 Named Entity recognition (NE)
   Finds and classifies names, places etc.
 Co-reference Resolution (CO)
   Identifies identity relations between entities in texts.
 Template Element construction (TE)
   Adds descriptive information to NE results (using CO).
 Template Relation construction (TR)
   Finds relations between TE entities.
 Scenario Template production (ST)
   Fits TE and TR results into specified event scenarios.
 AIML = Artificial Intelligence Markup Language
 Alice
 A.L.I.C.E. (Artificial Linguistic Internet Computer
  Entity)
   an award-winning free natural language artificial
    intelligence chat robot.

 Rule-based
 Human-like answers without a complicated “brain”
 Multi-language
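The rule-based idea can be sketched as a list of pattern and response pairs, in the spirit of ELIZA and of AIML categories. This is not AIML syntax and not the A.L.I.C.E. rule set: the `RULES` list, `DEFAULT` reply and `reply` helper are hypothetical, with responses borrowed from the ELIZA dialogue at the start of the deck.

```python
import re

# Hypothetical rules: (pattern, response template). First match wins.
RULES = [
    (re.compile(r"\bmy (mother|father|family)\b", re.I),
     "Tell me more about your family."),
    (re.compile(r"\bi am (.+)", re.I),
     "I am sorry to hear you are {0}."),
    (re.compile(r"\bi need (.+)", re.I),
     "What would it mean to you if you got {0}?"),
]
DEFAULT = "In what way?"


def reply(utterance):
    """Answer with the first matching rule, reusing the captured text."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT


print(reply("It's true, I am unhappy."))
print(reply("I need some help"))
```

No model of meaning is involved: the answers look human-like only because the rules echo parts of the user's own sentence.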
 NLP course, Husni Al-Muhtaseb
 Lexical descriptions for Vietnamese language
  processing
 en.wikipedia.org
 www.xulyngonngu.com
Natural language processing 2

More Related Content

What's hot

Group presentation lexical semantics
Group presentation lexical semanticsGroup presentation lexical semantics
Group presentation lexical semanticsblessedkkr
 
Langacker's cognitive grammar
Langacker's cognitive grammarLangacker's cognitive grammar
Langacker's cognitive grammarJOy Verzosa
 
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYAMORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYAijnlc
 
Translation
TranslationTranslation
Translationmjkay
 
Feature Structure Unification Syntactic Parser 2.0
Feature Structure Unification Syntactic Parser 2.0Feature Structure Unification Syntactic Parser 2.0
Feature Structure Unification Syntactic Parser 2.0rcaneba
 
5a use of annotated corpus
5a use of annotated corpus5a use of annotated corpus
5a use of annotated corpusThennarasuSakkan
 
A Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammarsA Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammarsFederico Gobbo
 
Unit 1 Semantics
Unit 1 SemanticsUnit 1 Semantics
Unit 1 Semanticsmjgvalcarce
 
Natural language-processing
Natural language-processingNatural language-processing
Natural language-processingHareem Naz
 
Constructive Hybrid Logics
Constructive Hybrid LogicsConstructive Hybrid Logics
Constructive Hybrid LogicsValeria de Paiva
 
Constructive Description Logics 2006
Constructive Description Logics 2006Constructive Description Logics 2006
Constructive Description Logics 2006Valeria de Paiva
 
Prosodic Morphology
Prosodic Morphology Prosodic Morphology
Prosodic Morphology Maroua Harrif
 
Text : Definition, Elaboration and Examples
Text : Definition, Elaboration and ExamplesText : Definition, Elaboration and Examples
Text : Definition, Elaboration and ExamplesAlaahussein81
 
Minimalist program
Minimalist programMinimalist program
Minimalist programRabbiaAzam
 

What's hot (19)

Group presentation lexical semantics
Group presentation lexical semanticsGroup presentation lexical semantics
Group presentation lexical semantics
 
Langacker's cognitive grammar
Langacker's cognitive grammarLangacker's cognitive grammar
Langacker's cognitive grammar
 
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYAMORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
 
Minimalist program
Minimalist programMinimalist program
Minimalist program
 
ACTIVIDAD 7
ACTIVIDAD 7ACTIVIDAD 7
ACTIVIDAD 7
 
Translation
TranslationTranslation
Translation
 
Presentation1
Presentation1Presentation1
Presentation1
 
Feature Structure Unification Syntactic Parser 2.0
Feature Structure Unification Syntactic Parser 2.0Feature Structure Unification Syntactic Parser 2.0
Feature Structure Unification Syntactic Parser 2.0
 
5a use of annotated corpus
5a use of annotated corpus5a use of annotated corpus
5a use of annotated corpus
 
Semantics
SemanticsSemantics
Semantics
 
A Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammarsA Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammars
 
Unit 1 Semantics
Unit 1 SemanticsUnit 1 Semantics
Unit 1 Semantics
 
Natural language-processing
Natural language-processingNatural language-processing
Natural language-processing
 
Constructive Hybrid Logics
Constructive Hybrid LogicsConstructive Hybrid Logics
Constructive Hybrid Logics
 
Constructive Description Logics 2006
Constructive Description Logics 2006Constructive Description Logics 2006
Constructive Description Logics 2006
 
Narrative
NarrativeNarrative
Narrative
 
Prosodic Morphology
Prosodic Morphology Prosodic Morphology
Prosodic Morphology
 
Text : Definition, Elaboration and Examples
Text : Definition, Elaboration and ExamplesText : Definition, Elaboration and Examples
Text : Definition, Elaboration and Examples
 
Minimalist program
Minimalist programMinimalist program
Minimalist program
 

Viewers also liked

NLP and its applications
NLP and its applicationsNLP and its applications
NLP and its applicationsUtphala P
 
Natural Language Processing: Definition and Application
Natural Language Processing: Definition and ApplicationNatural Language Processing: Definition and Application
Natural Language Processing: Definition and ApplicationStephen Shellman
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translationHrishikesh Nair
 
Jeeves -natural language interface application
Jeeves -natural language interface applicationJeeves -natural language interface application
Jeeves -natural language interface applicationKaran Harsh Wardhan
 
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning KeynoteStartupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning KeynoteStartupfest
 
Statistical machine translation in a few slides
Statistical machine translation in a few slidesStatistical machine translation in a few slides
Statistical machine translation in a few slidesForcada Mikel
 
Natural language procesing in R
Natural language procesing in RNatural language procesing in R
Natural language procesing in ROlabanji Shonibare
 
Machine translation with statistical approach
Machine translation with statistical approachMachine translation with statistical approach
Machine translation with statistical approachvini89
 
Gordana Panajotović - NLP Master
Gordana Panajotović - NLP MasterGordana Panajotović - NLP Master
Gordana Panajotović - NLP MasterNLP Centar Beograd
 
Text Mining Infrastructure in R
Text Mining Infrastructure in RText Mining Infrastructure in R
Text Mining Infrastructure in RAshraf Uddin
 
Introduction to nlp 2014
Introduction to nlp 2014Introduction to nlp 2014
Introduction to nlp 2014Grant Hamel
 
Types of machine translation
Types of machine translationTypes of machine translation
Types of machine translationRushdi Shams
 
Text analytics in Python and R with examples from Tobacco Control
Text analytics in Python and R with examples from Tobacco ControlText analytics in Python and R with examples from Tobacco Control
Text analytics in Python and R with examples from Tobacco ControlBen Healey
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introductionRobert Lujo
 
Practical Natural Language Processing
Practical Natural Language ProcessingPractical Natural Language Processing
Practical Natural Language ProcessingJaganadh Gopinadhan
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processingrohitnayak
 
Introducing natural language processing(NLP) with r
Introducing natural language processing(NLP) with rIntroducing natural language processing(NLP) with r
Introducing natural language processing(NLP) with rVivian S. Zhang
 
Natural language processing
Natural language processingNatural language processing
Natural language processingYogendra Tamang
 
Natural Language Processing in R (rNLP)
Natural Language Processing in R (rNLP)Natural Language Processing in R (rNLP)
Natural Language Processing in R (rNLP)fridolin.wild
 

Viewers also liked (20)

NLP and its applications
NLP and its applicationsNLP and its applications
NLP and its applications
 
Natural Language Processing: Definition and Application
Natural Language Processing: Definition and ApplicationNatural Language Processing: Definition and Application
Natural Language Processing: Definition and Application
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translation
 
Jeeves -natural language interface application
Jeeves -natural language interface applicationJeeves -natural language interface application
Jeeves -natural language interface application
 
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning KeynoteStartupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
 
Statistical machine translation in a few slides
Statistical machine translation in a few slidesStatistical machine translation in a few slides
Statistical machine translation in a few slides
 
Natural language procesing in R
Natural language procesing in RNatural language procesing in R
Natural language procesing in R
 
Machine translation with statistical approach
Machine translation with statistical approachMachine translation with statistical approach
Machine translation with statistical approach
 
Intro to nlp
Intro to nlpIntro to nlp
Intro to nlp
 
Gordana Panajotović - NLP Master
Gordana Panajotović - NLP MasterGordana Panajotović - NLP Master
Gordana Panajotović - NLP Master
 
Text Mining Infrastructure in R
Text Mining Infrastructure in RText Mining Infrastructure in R
Text Mining Infrastructure in R
 
Introduction to nlp 2014
Introduction to nlp 2014Introduction to nlp 2014
Introduction to nlp 2014
 
Types of machine translation
Types of machine translationTypes of machine translation
Types of machine translation
 
Text analytics in Python and R with examples from Tobacco Control
Text analytics in Python and R with examples from Tobacco ControlText analytics in Python and R with examples from Tobacco Control
Text analytics in Python and R with examples from Tobacco Control
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
Practical Natural Language Processing
Practical Natural Language ProcessingPractical Natural Language Processing
Practical Natural Language Processing
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Introducing natural language processing(NLP) with r
Introducing natural language processing(NLP) with rIntroducing natural language processing(NLP) with r
Introducing natural language processing(NLP) with r
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural Language Processing in R (rNLP)
Natural Language Processing in R (rNLP)Natural Language Processing in R (rNLP)
Natural Language Processing in R (rNLP)
 

Similar to Natural language processing 2

Visual Word Recognition. The Journey from Features to Meaning
Visual Word Recognition. The Journey from Features to MeaningVisual Word Recognition. The Journey from Features to Meaning
Visual Word Recognition. The Journey from Features to Meaningfawzia
 
Structural grammar iii
Structural grammar iiiStructural grammar iii
Structural grammar iiiflakcute
 
Syntactic Features in Mother Tongue.pptx
Syntactic Features in Mother Tongue.pptxSyntactic Features in Mother Tongue.pptx
Syntactic Features in Mother Tongue.pptxJamelMirafuentes
 
Understanding ASL Grammatical Features and Discourse Mapping
Understanding ASL Grammatical Features and Discourse MappingUnderstanding ASL Grammatical Features and Discourse Mapping
Understanding ASL Grammatical Features and Discourse MappingDoug Stringham
 
Language in cognitive psychology
Language in cognitive psychologyLanguage in cognitive psychology
Language in cognitive psychologyAli Bahrani
 
Nlp Sentemental analysis of Tweetr And CaseStudy
Nlp Sentemental analysis of Tweetr And CaseStudyNlp Sentemental analysis of Tweetr And CaseStudy
Nlp Sentemental analysis of Tweetr And CaseStudyRaza Azeem
 
05 linguistic theory meets lexicography
05 linguistic theory meets lexicography05 linguistic theory meets lexicography
05 linguistic theory meets lexicographyDuygu Aşıklar
 
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...Guy De Pauw
 
Grammar Presentation
Grammar PresentationGrammar Presentation
Grammar Presentationtickingmindpd
 
Assignment on morphology
Assignment on morphologyAssignment on morphology
Assignment on morphologyLinda Midy
 
What English Do University Students Really Need
What English Do University Students Really NeedWhat English Do University Students Really Need
What English Do University Students Really NeedHala Nur
 

Similar to Natural language processing 2 (20)

NLP
NLPNLP
NLP
 
Semantics
SemanticsSemantics
Semantics
 
Mental grammar
Mental grammarMental grammar
Mental grammar
 
Visual Word Recognition. The Journey from Features to Meaning
Visual Word Recognition. The Journey from Features to MeaningVisual Word Recognition. The Journey from Features to Meaning
Visual Word Recognition. The Journey from Features to Meaning
 
Structural grammar iii
Structural grammar iiiStructural grammar iii
Structural grammar iii
 
Syntactic Features in Mother Tongue.pptx
Syntactic Features in Mother Tongue.pptxSyntactic Features in Mother Tongue.pptx
Syntactic Features in Mother Tongue.pptx
 
Understanding ASL Grammatical Features and Discourse Mapping
Understanding ASL Grammatical Features and Discourse MappingUnderstanding ASL Grammatical Features and Discourse Mapping
Understanding ASL Grammatical Features and Discourse Mapping
 
Language in cognitive psychology
Language in cognitive psychologyLanguage in cognitive psychology
Language in cognitive psychology
 
Nlp ambiguity presentation
Nlp ambiguity presentationNlp ambiguity presentation
Nlp ambiguity presentation
 
Nlp
NlpNlp
Nlp
 
Nlp Sentemental analysis of Tweetr And CaseStudy
Nlp Sentemental analysis of Tweetr And CaseStudyNlp Sentemental analysis of Tweetr And CaseStudy
Nlp Sentemental analysis of Tweetr And CaseStudy
 
Semantics
SemanticsSemantics
Semantics
 
05 linguistic theory meets lexicography
05 linguistic theory meets lexicography05 linguistic theory meets lexicography
05 linguistic theory meets lexicography
 
Syntax
SyntaxSyntax
Syntax
 
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
 
Grammar Presentation
Grammar PresentationGrammar Presentation
Grammar Presentation
 
Assignment on morphology
Assignment on morphologyAssignment on morphology
Assignment on morphology
 
What English Do University Students Really Need
What English Do University Students Really NeedWhat English Do University Students Really Need
What English Do University Students Really Need
 
Grammar 4
Grammar 4Grammar 4
Grammar 4
 
Grammar 4
Grammar 4Grammar 4
Grammar 4
 

Natural language processing 2

  • 1.
  • 2. ▪ Overview ▪ Basic knowledge ▪ Demonstration
  • 3. USER: Men are all alike. ELIZA: In what way? USER: They’re always bugging us about something or other. ELIZA: Can you think of a specific example? USER: Well, my boyfriend made me come here. ELIZA: Your boyfriend made you come here? USER: He says I’m depressed much of the time. ELIZA: I am sorry to hear you are depressed. USER: It’s true, I am unhappy. ELIZA: Do you think coming here will help you not to be unhappy? USER: I need some help; that much seems certain. ELIZA: What would it mean to you if you got some help? USER: Perhaps I could learn to get along with my mother. ELIZA: Tell me more about your family. USER: My mother takes care of me. ELIZA: Who else in your family takes care of you? USER: My father. ELIZA: Your father? USER: You are like my father in some ways. ELIZA: What resemblance do you see?
  • 4. ▪ A sub-field of Artificial Intelligence, since the 1960s … ▪ Concerned with the interactions between computers and human languages, with one ultimate goal: computers can “understand” human language ▪ Many real-world applications
  • 5. ▪ Natural language unit? ➢ Natural language understanding ➢ Natural language generation ▪ Data? ➢ Speech processing ➢ Text processing Natural language text understanding!
  • 6. ▪ Task of generating natural language from a machine representation ▪ May be viewed as the opposite of natural language understanding. ▪ Applications: ➢ Joke generation ➢ Textual summaries of databases ➢ Enhancing accessibility
  • 7. ▪ An advanced subtopic of NLP that deals with reading comprehension ▪ More complex than NLG ▪ Much commercial interest in this field ➢ News-gathering ➢ Data-Mining ➢ Voice-Activation ➢ Large-scale content analysis
  • 8-14. ▪ Logic is too clear-cut; the loss of flexibility causes difficulties in NLP ▪ Examples: ➢ Time flies like an arrow: can be understood in 7 ways! ➢ I never said she stole my money! The meaning shifts with the stressed word: (I) Someone else said it, but I didn't. (never) I simply didn't ever say it. (said) I might have implied it in some way, but I never explicitly said it. (she) I said someone took it; I didn't say it was she. (stole) I just said she probably borrowed it. (my) I said she stole someone else's money. (money) I said she stole something, but not my money.
  • 15. ▪ Word combination and division ▪ Stress placement on words ▪ The properties of subjects ➢ We gave the monkeys the bananas because they were hungry ➢ We gave the monkeys the bananas because they were over-ripe ▪ Specifying which word an adjective applies to ➢ A pretty little girls' school
  • 16. ▪ Involves reasoning about the world ▪ Embedded in a social system of people interacting ➢ persuading, insulting and amusing them ➢ changing over time ▪ Homonyms
  • 17.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28. ▪ ePi Group: ➢ Automatic Vietnamese processing system ➢ www.baomoi.com ➢ Collecting news from all Vietnamese e-newspapers ▪ EVTrans – Softex Co Ltd. ▪ Cyclop ▪ VnKim
  • 29.
  • 30.
  • 31.
  • 32.
  • 33. ▪ Morphological analysis ➢ Individual words are analyzed into their components ▪ Syntactic analysis ➢ Linear sequences of words are transformed into structures that show how the words relate to each other ▪ Semantic analysis ➢ A transformation is made from the input text to an internal representation that reflects the meaning ▪ Pragmatic analysis ➢ To reinterpret what was said as what was actually meant ▪ Discourse analysis ➢ Resolving references between sentences
  • 36. ▪ Morphemes: the smallest meaningful units of language. ➢ Stem: book, cat, car, … ➢ Affixes: un-, -s, -es, … ➢ Clitic: ’ve, ’m ▪ Morphological parsing: parsing a word into stem and affixes and identifying the parts and their relationships
  • 37. ▪ Word Classes ▪ Parts of speech: noun, verb, adjective, etc. ▪ Word class dictates how a word combines with morphemes to form new words ▪ Examples ➢ Books = book + s ➢ Unladylike = un + lady + like
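The decomposition of "Books" and "Unladylike" above can be sketched as a toy affix stripper. This is only an illustration under stated assumptions: the affix lists, the stem set, and the function name `parse_word` are hypothetical, and a real analyzer would use a full lexicon and a finite-state transducer rather than brute-force string matching.

```python
# Hypothetical toy affix lists; not a real morphological lexicon.
PREFIXES = ("un",)
SUFFIXES = ("like", "es", "s")
STEMS = {"book", "cat", "car", "lady"}

def parse_word(word, stems=STEMS):
    """Return (prefix, stem, suffix) if the word splits over a known stem, else None."""
    word = word.lower()
    for prefix in ("",) + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            # Strip the candidate suffix and check what remains against the stem set.
            stem = rest[:len(rest) - len(suffix)]
            if stem in stems:
                return (prefix, stem, suffix)
    return None

print(parse_word("Books"))       # ('', 'book', 's')
print(parse_word("Unladylike"))  # ('un', 'lady', 'like')
```

A word with no known stem, such as a Vietnamese word absent from this English stem set, simply returns `None`.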
  • 38. ▪ Vietnamese? ➢ Ăn (eat) = ăn ➢ Uống (drink) = uống ➢ Xe (vehicle) = xe ▪ No ‘Xes’ in Vietnamese! ▪ The problem is text tokenization.
  • 39. ▪ Why parse words? ➢ To identify a word’s part-of-speech ➢ To identify a word’s stem (IR) ▪ … then? ➢ Spell-checking ➢ To predict next words ➢ To predict the word’s accent
  • 40. ▪ Ambiguity ➢ I want her to go to the cinema with me: to as infinitive marker? to as preposition? ➢ Con ngựa đá đá con ngựa đá (Vietnamese; đá can mean both "stone" and "to kick"): đá = đá?
  • 41. ▪ How to implement? ➢ Regular expression ➢ Finite State Transducers (FST) ➢ Finite State Acceptor (FSA) ▪ Pattern examples: *.exe, ir??man, \b[0-9]+ *(Mb|[Mm]egabytes?)\b
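The three patterns on this slide can be tried out in Python. This is a minimal sketch, assuming that the first two patterns are shell-style wildcards (handled here with the standard `fnmatch` module) and that the stray `b` at each end of the third pattern stands for the word-boundary anchor `\b`; the sample strings are made up for illustration.

```python
import fnmatch
import re

# Shell-style wildcards: * matches any run of characters, ? matches exactly one.
print(fnmatch.fnmatchcase("setup.exe", "*.exe"))   # True
print(fnmatch.fnmatchcase("ironman", "ir??man"))   # True

# Regular expression for sizes such as "512 Mb" or "40megabytes".
# Assumption: the slide's leading/trailing "b" is the word boundary \b.
SIZE = re.compile(r"\b[0-9]+ *(Mb|[Mm]egabytes?)\b")

for text in ["a 512 Mb disk", "needs 40megabytes free", "a 512 Mbps link"]:
    match = SIZE.search(text)
    print(text, "->", match.group(0) if match else None)
```

The last sample does not match because the word boundary must come directly after `Mb`, which is one reason the anchors matter.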
  • 42.
  • 43. ▪ Related terms: ➢ Stem, stemming ➢ Part of speech ➢ N-gram
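As a small illustration of the N-gram term listed above, the sketch below slides a window of n tokens over a whitespace-tokenized sentence and counts the results. The helper name `ngrams` and the sample sentence are illustrative choices, and splitting on whitespace is a simplifying assumption that would not work for Vietnamese word segmentation.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "time flies like an arrow".split()
bigram_counts = Counter(ngrams(tokens, 2))

print(ngrams(tokens, 2)[:2])               # [('time', 'flies'), ('flies', 'like')]
print(bigram_counts[("time", "flies")])    # 1
```

Counts like these are the raw material for the next-word prediction mentioned earlier in the morphology section.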
  • 45. SYNTAX
  • 46. ▪ Linear sequences of words are transformed into structures that show how the words relate to each other. ▪ Determine grammatical structure. ▪ I am a boy = [Subject] [Verb] [Determiner] [Noun]
  • 48. ▪ Syntax ➢ Actual structure of a sentence ▪ Grammar ➢ The rule set used in the analysis
  • 49. ▪ A grammar defines syntactically legal sentences ➢ I ate an apple (syntactically legal) ➢ I ate apple (not syntactically legal) ➢ I ate a building (syntactically legal, but?) ▪ Syntactically legal doesn’t mean that it’s meaningful!
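The three example sentences can be checked against a toy grammar. This is a sketch under stated assumptions: the two rules (S -> PRON VERB NP, NP -> DET NOUN), the tiny lexicon, and the function names are hypothetical and cover only these examples; agreement such as "a" versus "an" is not modeled.

```python
# Hypothetical toy lexicon and grammar, just enough for the slide's examples.
LEXICON = {"i": "PRON", "ate": "VERB", "a": "DET", "an": "DET",
           "apple": "NOUN", "building": "NOUN"}
GRAMMAR = {"S": [["PRON", "VERB", "NP"]], "NP": [["DET", "NOUN"]]}

def match(symbol, tags, pos):
    """Return the set of end positions reachable by deriving symbol from tags[pos:]."""
    if symbol not in GRAMMAR:  # terminal symbol: a part-of-speech tag
        return {pos + 1} if pos < len(tags) and tags[pos] == symbol else set()
    ends = set()
    for rhs in GRAMMAR[symbol]:
        positions = {pos}
        for part in rhs:
            positions = {e for p in positions for e in match(part, tags, p)}
        ends |= positions
    return ends

def is_legal(sentence):
    """True if the whole sentence can be derived from S."""
    tags = [LEXICON.get(word) for word in sentence.lower().split()]
    return len(tags) in match("S", tags, 0)

print(is_legal("I ate an apple"))    # True
print(is_legal("I ate apple"))       # False
print(is_legal("I ate a building"))  # True: legal syntax, odd meaning
```

The grammar accepts "I ate a building" because it only checks structure; rejecting it would need the semantic analysis described in the next section.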
  • 50. ▪ Ambiguities
  • 52. SEMANTIC
  • 53. ▪ What could this mean… ➢ Representations of linguistic inputs that capture the meanings of those inputs ▪ For us it means ➢ Representations that permit or facilitate semantic processing ▪ Permit us to reason about their truth (relationship to some world) ▪ Permit us to answer questions based on their content ▪ Permit us to perform inference (answer questions and determine the truth of things we don’t actually know)
  • 55. ▪ Requirements ➢ Verifiability ➢ Ambiguity ➢ Canonical Form ➢ Inference ➢ Expressiveness
  • 57. ▪ Pragmatics: concerns how sentences are used in different situations and how use affects the interpretation of the sentence ▪ Discourse: concerns how the immediately preceding sentences affect the interpretation of the next sentence
  • 58. ▪ ‘He’, ‘it’, ‘his’ can be inferred from the previous sentence ▪ It’s discourse
  • 64. ▪ WordNet ▪ MindNet ▪ Stanford Tagger ▪ Stanford Parser ▪ …
  • 65. ▪ Machine translation ▪ Search engine ▪ Information extraction ▪ Chat bot
  • 66.
  • 67.
  • 68.
  • 69. ▪ Can we use previously translated text to learn how to translate new texts? ➢ Yes! But it’s not so easy ➢ Two paradigms: statistical MT and example-based MT (EBMT) ▪ Requirements: ➢ A large aligned parallel corpus of translated sentences {S_source ↔ S_target} ➢ Bilingual dictionary for intra-S alignment ➢ Generalization patterns (names, numbers, dates…)
  • 70. ▪ Simplest: Translation Memory ➢ If S_new = S_source in the corpus, output the aligned S_target ▪ Compositional EBMT ➢ If a fragment of S_new matches a fragment of S_source, output the corresponding fragment of the aligned S_target ➢ Prefer maximal-length fragments ➢ Maximize grammatical compositionality: via a target language grammar or via an N-gram statistical language model
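The translation-memory lookup and the "prefer maximal-length fragments" rule can be sketched together. This is an illustration only: the English-to-French entries in `MEMORY` and the function name `translate` are hypothetical, fragments are matched greedily from left to right, and the final step of scoring candidates with a target grammar or N-gram language model is left out.

```python
# Hypothetical aligned fragments (source -> target); not a real corpus.
MEMORY = {
    "good morning": "bonjour",
    "thank you": "merci",
    "thank you very much": "merci beaucoup",
}

def translate(sentence, memory=MEMORY):
    """Greedy fragment matching: at each position take the longest known fragment."""
    words = sentence.lower().split()
    output, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest fragment first
            fragment = " ".join(words[i:j])
            if fragment in memory:
                output.append(memory[fragment])
                i = j
                break
        else:
            output.append("<" + words[i] + ">")  # no match: mark the word as untranslated
            i += 1
    return " ".join(output)

print(translate("Good morning"))            # bonjour
print(translate("Thank you very much"))     # merci beaucoup
print(translate("Good morning thank you"))  # bonjour merci
```

When the whole input is stored in the memory, the longest fragment is the sentence itself, so the plain translation-memory case falls out of the same loop.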
  • 71. ▪ Requires an Interlingua: a language-neutral Knowledge Representation (KR) ➢ Philosophical debate: Is there an interlingua? ➢ FOL is not totally language neutral (predicates, functions, expressed in a language) ➢ Other near-interlinguas (Conceptual Dependency) ▪ Requires a fully-disambiguating parser ➢ Domain model of legal objects, actions, relations ▪ Requires a NL generator (KR -> text) ▪ Applicable only to well-defined technical domains ▪ Produces high-quality MT in those domains
  • 73. ▪ Each approach has its own strengths ➢ Rapidly adaptable: statistical, example-based ➢ Good grammar: rule-based (grammar) ➢ High precision in a narrow domain: Interlingua
  • 74. ▪ Google ▪ Yahoo ▪ AltaVista ▪ Answers.com
  • 75-76. ▪ Spider: a browser-like program that downloads web pages. ▪ Crawler: a program that automatically follows all of the links on each web page. ▪ Indexer: a program that analyzes web pages downloaded by the spider and the crawler. ▪ Database: storage for downloaded and processed pages. ▪ Results engine: extracts search results from the database. ▪ Web server: a server that is responsible for interaction between the user and other search engine components.
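The indexer, database, and results engine roles can be sketched with an in-memory inverted index. This is a simplified illustration, not how any particular search engine works: the page names and texts are made up, the "database" is a plain dictionary, and queries are treated as an AND of whitespace-separated words with no ranking.

```python
from collections import defaultdict

def build_index(pages):
    """Indexer: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Results engine: return the pages that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages standing in for what a spider/crawler would download.
pages = {
    "a.html": "natural language processing",
    "b.html": "natural language generation",
}
index = build_index(pages)
print(sorted(search(index, "natural language")))  # ['a.html', 'b.html']
print(sorted(search(index, "processing")))        # ['a.html']
```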
  • 77.
  • 78.
  • 79.
  • 80. ▪ Idea is to ‘extract’ particular types of information from arbitrary text or transcribed speech ▪ Examples: ➢ Named entities: people, places, organizations ➢ Telephone numbers ➢ Dates ▪ Many uses: ➢ Question answering systems, gisting of news or mail… ➢ Job ads, financial information, terrorist attacks
  • 81. ▪ Often use a set of simple templates or frames with slots to be filled in from input text. Ignore everything else. ➢ Husni’s number is 966-3-860-2624. ➢ The inventor of the first plane was Abbas ibnu Fernas ➢ The British King died in March of 1932.
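Slot filling for two of the example sentences can be sketched with regular expressions. This is a minimal sketch under stated assumptions: the slot names (`phone`, `month`, `year`), the patterns, and the function name `extract` are illustrative, and the patterns only cover the phone and date formats shown on the slide.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Assumed formats: digit groups joined by hyphens, and "<Month> of <year>".
PHONE = re.compile(r"\b\d+(?:-\d+){2,}\b")
DATE = re.compile(r"\b(" + MONTHS + r") of (\d{4})\b")

def extract(text):
    """Fill whichever template slots can be found; ignore everything else."""
    slots = {}
    phone = PHONE.search(text)
    if phone:
        slots["phone"] = phone.group(0)
    date = DATE.search(text)
    if date:
        slots["month"], slots["year"] = date.group(1), date.group(2)
    return slots

print(extract("Husni’s number is 966-3-860-2624."))
# {'phone': '966-3-860-2624'}
print(extract("The British King died in March of 1932."))
# {'month': 'March', 'year': '1932'}
```

The inventor sentence needs named entity recognition rather than a fixed pattern, which is why it is not handled here.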
  • 82. ▪ Named Entity recognition (NE) ➢ Finds and classifies names, places etc. ▪ Co-reference Resolution (CO) ➢ Identifies identity relations between entities in texts. ▪ Template Element construction (TE) ➢ Adds descriptive information to NE results (using CO). ▪ Template Relation construction (TR) ➢ Finds relations between TE entities. ▪ Scenario Template production (ST) ➢ Fits TE and TR results into specified event scenarios.
  • 83.
  • 84.
  • 85.
  • 86.
  • 87.
  • 88.
  • 89. ▪ AIML = Artificial Intelligence Markup Language ▪ Alice
  • 90. ▪ A.L.I.C.E. (Artificial Linguistic Internet Computer Entity) ▪ An award-winning free natural language artificial intelligence chat robot. ▪ Rule-based ▪ Human-like answers without a complicated “brain” ▪ Multi-language
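The rule-based idea of answering without a complicated "brain" can be sketched as pattern/response pairs with a `*` wildcard. This is only a loose Python imitation of the category idea: it is not AIML's XML format and not the A.L.I.C.E. implementation, and the rules, the `{star}` placeholder, and the function names are assumptions made for illustration, with responses borrowed from the ELIZA dialogue at the start of the deck.

```python
import re

# Hypothetical rules: (pattern with * wildcard, response template). First match wins.
CATEGORIES = [
    ("I AM *", "I am sorry to hear you are {star}."),
    ("MY MOTHER *", "Tell me more about your family."),
    ("*", "In what way?"),
]

def _to_regex(pattern):
    """Turn a wildcard pattern into a regex where * captures one or more characters."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$")

def respond(text):
    # Normalize: drop punctuation, upper-case, trim whitespace.
    normalized = re.sub(r"[^\w\s]", "", text).upper().strip()
    for pattern, template in CATEGORIES:
        match = _to_regex(pattern).match(normalized)
        if match:
            star = match.group(1).lower() if match.groups() else ""
            return template.format(star=star)
    return ""

print(respond("I am depressed."))              # I am sorry to hear you are depressed.
print(respond("My mother takes care of me."))  # Tell me more about your family.
print(respond("Men are all alike."))           # In what way?
```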
  • 91.
  • 92. ▪ NLP course, Husni Al-Muhtaseb ▪ Lexical descriptions for Vietnamese language processing ▪ en.wikipedia.org ▪ www.xulyngonngu.com