CS60057 Speech & Natural Language Processing, Autumn 2007
Lecture 5, 2 August 2007
WORDS: The Building Blocks of Language
Tokens, Types and Texts
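As a minimal sketch of the token/type distinction named in this slide title: tokens are the running words of a text, types are the distinct word forms. The function name `count_tokens_and_types`, the sample sentence, and the choice of lowercased whitespace tokenization are assumptions made for illustration only.

```python
from collections import Counter


def count_tokens_and_types(text):
    """Return (number of tokens, number of types) for a text.

    Assumption: tokens are lowercased, whitespace-separated strings.
    Real tokenizers also handle punctuation, clitics, and so on.
    """
    tokens = text.lower().split()
    types = Counter(tokens)  # one key per distinct word form
    return len(tokens), len(types)


sample = "the cat sat on the mat and the dog sat too"
n_tokens, n_types = count_tokens_and_types(sample)
print(f"tokens={n_tokens} types={n_types}")
```

Here "the" and "sat" are repeated, so the number of types is smaller than the number of tokens.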
Extracting text from the Web
Extracting text from NLTK Corpora
Brown Corpus
Corpus Linguistics
What’s a word?
Another example
Some Useful Empirical Observations
Common words in Tom Sawyer
but words in NL have an uneven distribution…
Text properties (formalized): Sample word frequency data
Frequency of frequencies
Zipf’s Law
Zipf’s Law
Zipf curve
Predicting Occurrence Frequencies
Fraction of words with frequency n is 1/(n(n+1)). The fraction of words appearing only once is therefore ½.
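One common route to the ½ figure, sketched under the assumption that Zipf's law holds exactly in the form r · f = k. The symbols r (rank), f (frequency), k (constant), I_n (number of words occurring exactly n times) and D (number of distinct words) are introduced here for illustration and are not taken from the slide.

```latex
\begin{align*}
f = \frac{k}{r}
  &\;\Rightarrow\; \text{a word occurs exactly } n \text{ times when }
     \frac{k}{n+1} < r \le \frac{k}{n} \\
I_n &= \frac{k}{n} - \frac{k}{n+1} = \frac{k}{n(n+1)} \\
D   &= \frac{k}{1} = k
     \quad \text{(rank of the last word, which occurs once)} \\
\frac{I_n}{D} &= \frac{1}{n(n+1)},
     \qquad \frac{I_1}{D} = \frac{1}{1 \cdot 2} = \frac{1}{2}
\end{align*}
```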
Explanations for Zipf’s Law
Zipf’s First Law
Zipf’s Second Law
Zipf’s Third Law
Zipf’s Law Impact on Language Analysis
Vocabulary Growth
Heaps’ Law
Heaps’ Law Data
Word counts are interesting...
Zipf’s Law on Tom Sawyer
Plot of Zipf’s Law
Plot of Zipf’s Law (cont’d)
Zipf’s Law, so what?
N-Grams and Corpus Linguistics
A bad language model
N-grams & Language Modeling
A bad language model
A bad language model
Herman is reprinted with permission from LaughingStock Licensing Inc., Ottawa, Canada. All rights reserved.
What’s a Language Model
What’s a language model for?
Next Word Prediction
Human Word Prediction
Claim
Applications
Overview
Simple N-Grams
N-grams
Computing the Probability of a Word Sequence
Bigram Model
Using N-Grams
The n-gram Approximation
n-grams, continued
N-grams for Language Generation
Unigram: 5. …Here words are chosen independently but with their appropriate frequencies.
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE.
Bigram: 6. Second-order word approximation. The word transition probabilities are correct but no further structure is included.
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
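The two approximations quoted above can be imitated with a few lines of Python. This is a sketch under stated assumptions, not the procedure used to produce the quoted text: the toy corpus, the function names (`train`, `sample_from`, `generate_unigram`, `generate_bigram`), and the fallback to a unigram draw when a word has no observed successor are all choices made for illustration.

```python
import random
from collections import Counter, defaultdict


def train(tokens):
    """Collect unigram counts and, for each word, counts of its successors."""
    unigrams = Counter(tokens)
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        bigrams[prev][nxt] += 1
    return unigrams, bigrams


def sample_from(counter, rng):
    """Draw one word with probability proportional to its count."""
    words = list(counter)
    weights = [counter[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate_unigram(unigrams, length, rng):
    """Words chosen independently, with their observed frequencies."""
    return [sample_from(unigrams, rng) for _ in range(length)]


def generate_bigram(unigrams, bigrams, length, rng):
    """Each word is drawn from the successors of the previous word."""
    if length <= 0:
        return []
    word = sample_from(unigrams, rng)
    out = [word]
    while len(out) < length:
        followers = bigrams.get(word)
        # Assumption: fall back to a unigram draw when the previous
        # word was never seen with a successor in the training text.
        word = sample_from(followers if followers else unigrams, rng)
        out.append(word)
    return out


tokens = "i want to eat chinese food i want to eat lunch".split()
unigrams, bigrams = train(tokens)
rng = random.Random(0)  # fixed seed so runs are repeatable
print(" ".join(generate_unigram(unigrams, 8, rng)))
print(" ".join(generate_bigram(unigrams, bigrams, 8, rng)))
```

With a corpus this small the bigram output mostly replays the training text; the contrast between the two orders becomes visible only on larger corpora.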
N-Gram Models of Language
Counting Words in Corpora
Terminology
Corpora
Simple N-Grams
Computing the Probability of a Word Sequence
Bigram Model
Using N-Grams
Training and Testing
A Simple Example
A Bigram Grammar Fragment from BERP

Bigram              Prob.   Bigram              Prob.
Eat on              .16     Eat Thai            .03
Eat some            .06     Eat breakfast       .03
Eat lunch           .06     Eat in              .02
Eat dinner          .05     Eat Chinese         .02
Eat at              .04     Eat Mexican         .02
Eat a               .04     Eat tomorrow        .01
Eat Indian          .04     Eat dessert         .007
Eat today           .03     Eat British         .001

<start> I           .25     Want some           .04
<start> I’d         .06     Want Thai           .01
<start> Tell        .04     To eat              .26
<start> I’m         .02     To have             .14
I want              .32     To spend            .09
I would             .29     To be               .02
I don’t             .08     British food        .60
I have              .04     British restaurant  .15
Want to             .65     British cuisine     .01
Want a              .05     British lunch       .01
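Under a bigram model, the probability of a sentence is the product of the bigram probabilities along it. The sketch below multiplies six entries from the BERP fragment for the sentence "I want to eat British food". The dictionary `BIGRAM_PROB`, the function name `sentence_probability`, the lowercasing of the slide's capitalized words, and the choice to give unseen bigrams probability 0 (no smoothing) are assumptions made for illustration.

```python
import math

# Six entries copied from the BERP fragment; words other than "I" and
# "British" are lowercased here for readability (an assumption).
BIGRAM_PROB = {
    ("<start>", "I"): 0.25,
    ("I", "want"): 0.32,
    ("want", "to"): 0.65,
    ("to", "eat"): 0.26,
    ("eat", "British"): 0.001,
    ("British", "food"): 0.60,
}


def sentence_probability(words, bigram_prob):
    """Multiply P(w_i | w_{i-1}) over the sentence, starting from <start>.

    Bigrams missing from the table get probability 0.0, i.e. this
    sketch is unsmoothed.
    """
    prob = 1.0
    prev = "<start>"
    for word in words:
        prob *= bigram_prob.get((prev, word), 0.0)
        prev = word
    return prob


p = sentence_probability("I want to eat British food".split(), BIGRAM_PROB)
print(f"P = {p:.3e}")                    # about 8.1e-06
print(f"log10 P = {math.log10(p):.2f}")  # sums of logs avoid underflow
```

The product is small even for a six-word sentence, which is why practical systems usually add log probabilities instead of multiplying raw ones.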
BERP Bigram Counts (rows: first word; columns: following word)

          I     Want   To    Eat   Chinese  Food   Lunch
I         8     1087   0     13    0        0      0
Want      3     0      786   0     6        8      6
To        3     0      10    860   3        0      12
Eat       0     0      2     0     19       2      52
Chinese   2     0      0     0     0        120    1
Food      19    0      17    0     0        0      0
Lunch     4     0      0     0     0        1      0

BERP Bigram Probabilities

Word counts:
I       Want    To      Eat     Chinese  Food    Lunch
3437    1215    3256    938     213      1506    459
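The bigram probabilities follow from the two sets of counts by dividing each bigram count by the count of its first word. The sketch below assumes that rows of the count table are the first word and columns the following word; this reading reproduces the .32, .65 and .26 entries of the grammar fragment. The names `WORDS`, `BIGRAM_COUNTS`, `UNIGRAM_COUNTS` and `bigram_probability` are illustrative.

```python
WORDS = ["I", "want", "to", "eat", "Chinese", "food", "lunch"]

# BIGRAM_COUNTS[prev][j] = count of (prev, WORDS[j]), from the slide.
BIGRAM_COUNTS = {
    "I":       [8, 1087, 0, 13, 0, 0, 0],
    "want":    [3, 0, 786, 0, 6, 8, 6],
    "to":      [3, 0, 10, 860, 3, 0, 12],
    "eat":     [0, 0, 2, 0, 19, 2, 52],
    "Chinese": [2, 0, 0, 0, 0, 120, 1],
    "food":    [19, 0, 17, 0, 0, 0, 0],
    "lunch":   [4, 0, 0, 0, 0, 1, 0],
}

# Total occurrences of each word in the corpus, from the slide.
UNIGRAM_COUNTS = {
    "I": 3437, "want": 1215, "to": 3256, "eat": 938,
    "Chinese": 213, "food": 1506, "lunch": 459,
}


def bigram_probability(prev, word):
    """Relative-frequency estimate: C(prev word) / C(prev)."""
    count = BIGRAM_COUNTS[prev][WORDS.index(word)]
    return count / UNIGRAM_COUNTS[prev]


for prev, word in [("I", "want"), ("want", "to"), ("to", "eat")]:
    print(f"P({word} | {prev}) = {bigram_probability(prev, word):.2f}")
```

Many cells are zero, so the corresponding probabilities are zero as well; any sentence containing such a bigram gets probability zero under this unsmoothed estimate.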
What do we learn about the language?
Approximating Shakespeare
N-Gram Training Sensitivity
Some Useful Empirical Observations
Smoothing Techniques
Smoothing Techniques
Add-one Smoothing
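A minimal sketch of add-one (Laplace) smoothing for bigrams in its usual textbook form: add 1 to every bigram count and add the vocabulary size V to the context count, so that unseen bigrams get a small non-zero probability. The toy sentences, the `<s>` and `</s>` boundary markers, and the function names `train_bigrams` and `add_one_prob` are assumptions made for illustration.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count bigrams and their contexts over whitespace-tokenized sentences."""
    bigram_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])  # every symbol that can be predicted
        for prev, word in zip(tokens, tokens[1:]):
            bigram_counts[(prev, word)] += 1
            context_counts[prev] += 1
    return bigram_counts, context_counts, vocab


def add_one_prob(prev, word, bigram_counts, context_counts, vocab):
    """P(word | prev) = (C(prev word) + 1) / (C(prev) + V)."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))


sentences = ["i want to eat", "i want chinese food"]
bigram_counts, context_counts, vocab = train_bigrams(sentences)

seen = add_one_prob("want", "to", bigram_counts, context_counts, vocab)
unseen = add_one_prob("want", "food", bigram_counts, context_counts, vocab)
print(f"P(to | want) = {seen:.3f}, P(food | want) = {unseen:.3f}")
```

With V = 7 here, the seen bigram "want to" drops from 1/2 to 2/9 while the unseen "want food" rises from 0 to 1/9, which shows how much probability mass add-one moves to unseen events when V is large relative to the counts.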
Witten-Bell Discounting
Good-Turing Discounting
Backoff methods (e.g. Katz ‘87)
Summary
