BY
VEENA .S.KUMAR
Natural Language Processing
(NLP)
Contents
• What Is NLP?
• Why NLP?
• Basic Terms In NLP
• Approaches To NLP
• NLTK
• Setting Up NLP Environment
• Components Of NLP
• Levels In NLP
• Stages In NLP
• Some Applications Of NLP
What Is NLP?
[Figure: NLP at the intersection of Artificial Intelligence and Computational Linguistics]
• NLP is the automatic manipulation of speech or text
• Goal: to accomplish human-like language processing
• The field of NLP involves making computers perform useful tasks with the
natural languages humans use. The input and output of an NLP system can be:
• Speech
• Written text
Why NLP?
• Computers lack knowledge
• Large volumes of textual data: there are at least 30 trillion pages, and 70-80% of
data is unstructured, i.e. raw text
• Structuring a highly unstructured data source
• Text data: websites, tweets, blogs, etc.
• Audio data: speech
• Applications that process large amounts of data require NLP expertise
Basic Terms In NLP
Tokenization
Tokenization is the task of chopping up a string of characters into pieces, called tokens, perhaps at
the same time throwing away certain characters, such as punctuation.
Input: Friends, Romans, Countrymen, lend me your ears;
Output: Friends Romans Countrymen lend me your ears
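The tokenization example above can be reproduced with a minimal sketch that uses only Python's standard library. The regular expression is an illustrative assumption, not the rule NLTK applies; NLTK ships its own tokenizers (for example nltk.word_tokenize), which handle many more cases.

```python
import re

def tokenize(text):
    """Split text into word tokens, discarding punctuation."""
    # A token is a run of word characters, optionally with inner apostrophes
    # (so "don't" stays in one piece). Everything else is thrown away.
    return re.findall(r"\w+(?:'\w+)*", text)

print(tokenize("Friends, Romans, Countrymen, lend me your ears;"))
# ['Friends', 'Romans', 'Countrymen', 'lend', 'me', 'your', 'ears']
```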
Stemming
Stemming is the process of eliminating affixes (suffixes, prefixes, infixes, circumfixes)
from a word in order to obtain a word stem.
running → run
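As a rough illustration, suffix stripping can be sketched in a few lines. This is a deliberately crude, hypothetical rule set (the suffix list and the minimum stem length are assumptions) and it is not the Porter algorithm; for real work NLTK provides nltk.stem.PorterStemmer.

```python
VOWELS = set("aeiou")
# Suffixes are tried in this order; longer ones come first.
SUFFIXES = ("ing", "ed", "es", "s")

def simple_stem(word):
    """Strip one common English suffix from a word (toy stemmer)."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling: "runn" -> "run".
            if (suffix in ("ing", "ed") and stem[-1] == stem[-2]
                    and stem[-1] not in VOWELS):
                stem = stem[:-1]
            return stem
    return word

print(simple_stem("running"))  # run
```

A stemmer this simple over-stems and under-stems many words, which is one reason lemmatization exists.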
Lemmatization
Lemmatization is related to stemming, but differs in that it can capture a word's
canonical (dictionary) form, its lemma, even when that form shares no stem with the word.
better → good
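Because a lemma cannot always be derived by stripping affixes, lemmatizers rely on a vocabulary lookup. The sketch below uses a tiny hand-written table (a hypothetical stand-in for a real lexical database); NLTK's WordNetLemmatizer does the same job against WordNet.

```python
# Hypothetical lemma table keyed by (word, part of speech).
LEMMAS = {
    ("better", "adj"): "good",
    ("best", "adj"): "good",
    ("running", "verb"): "run",
    ("ran", "verb"): "run",
    ("mice", "noun"): "mouse",
}

def lemmatize(word, pos):
    """Return the lemma for a word and part of speech, or the word itself."""
    key = (word.lower(), pos)
    # Unknown words fall back to their lowercased form.
    return LEMMAS.get(key, word.lower())

print(lemmatize("Better", "adj"))  # good
```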
Corpus
A corpus is a collection of texts. Corpora may also consist of themed texts
(historical, Biblical, etc.). Corpora are generally used for statistical linguistic
analysis and hypothesis testing.
Stop Words
Stop words are words that are filtered out before further processing of text, typically very common words such as "the".
Example: The quick brown fox jumps over the lazy dog.
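Stop-word filtering on the example sentence might look like the sketch below. The stop list here is a small hand-picked assumption; NLTK offers much fuller lists through nltk.corpus.stopwords.

```python
import re

# Hypothetical, hand-picked stop list for illustration only.
STOP_WORDS = {"a", "an", "the", "over", "of", "and", "in", "to"}

def remove_stop_words(text):
    """Lowercase and tokenize the text, then drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

print(remove_stop_words("The quick brown fox jumps over the lazy dog."))
# ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```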
Part-of-speech (POS) Tagging
POS tagging consists of assigning a category tag to the tokenized parts of a
sentence. The most common form of POS tagging identifies words as nouns,
verbs, adjectives, etc.
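A toy tagger makes the idea concrete: each token is looked up in a small lexicon and unknown words default to NOUN. The lexicon and the tag names are assumptions made for this sketch; trained taggers such as nltk.pos_tag use context rather than a fixed lookup.

```python
# Hypothetical word-to-tag lexicon for illustration only.
LEXICON = {
    "the": "DET",
    "quick": "ADJ",
    "brown": "ADJ",
    "lazy": "ADJ",
    "fox": "NOUN",
    "dog": "NOUN",
    "jumps": "VERB",
    "over": "ADP",
}

def tag(tokens):
    """Pair every token with a tag; unknown words default to NOUN."""
    return [(token, LEXICON.get(token.lower(), "NOUN")) for token in tokens]

print(tag(["The", "quick", "fox", "jumps"]))
# [('The', 'DET'), ('quick', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'VERB')]
```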
Approaches To NLP
Symbolic
• Explicit representation of facts about language through well-understood schemes
and algorithms
• Deep Analysis of linguistic phenomena
Statistical
• Uses mathematical techniques and large text corpora without
incorporating world knowledge
• The output produced by each state has a definite probability
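One simple instance of the statistical approach is a bigram model, which estimates the probability of a word given the previous word purely from corpus counts. The sketch below uses maximum-likelihood estimates on a made-up toy corpus; it is only an illustration of the idea, not a method taken from the slides.

```python
from collections import Counter

def bigram_probabilities(tokens):
    """Estimate P(next | current) as count(current, next) / count(current)."""
    # The last token never starts a bigram, so it is left out of the denominator.
    first_word_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return {
        pair: count / first_word_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }

corpus = "the cat sat on the mat the cat ran".split()
probs = bigram_probabilities(corpus)
print(probs[("cat", "sat")])  # 0.5
```

Here "the" is followed by "cat" twice and by "mat" once, so the model assigns those continuations probabilities of 2/3 and 1/3.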
Connectionist
• Combines statistical learning with various representation theories
• Allows transformation, inference, and manipulation of logic formulae
• Less Constrained Architecture
NLTK
• Natural Language Toolkit (NLTK) was originally created in 2001 as part of a
computational linguistics course in the Department of Computer and Information
Science at the University of Pennsylvania.
• The Natural Language Toolkit (NLTK) defines a basic infrastructure that can be used to build
NLP programs in Python. It provides:
o Basic classes for representing data relevant to natural language processing.
o Standard interfaces for performing tasks, such as tokenization, tagging, and parsing.
o Standard implementations for each task, which can be combined to solve complex problems.
NLTK was designed with four primary goals in mind:
 Simplicity
 Consistency
 Modularity
 Extensibility
Setting Up NLP Environment
Open Anaconda Prompt
Install pip if it is missing: run in terminal python -m ensurepip --upgrade
Install NLTK: run in terminal pip install -U nltk
Open Spyder
Run in terminal 1)import nltk
2) nltk.download()
Press Enter
After Pressing Enter this dialogue box appears on the screen
Components Of NLP
NLP has two components:
Natural Language Understanding (NLU)
 Understanding involves the following tasks −
 Mapping the given input in natural language into useful representations.
 Analyzing different aspects of the language.
Natural Language Generation (NLG)
 It is the process of producing meaningful phrases and sentences in the form
of natural language from some internal representation. It involves:
 Text planning − It includes retrieving the relevant content from knowledge
base.
 Sentence planning − It includes choosing required words, forming
meaningful phrases, setting tone of the sentence.
 Text Realization − It is mapping sentence plan into sentence structure.
NLU is generally considered harder than NLG.
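The three NLG steps listed above can be sketched as a tiny pipeline. The knowledge base, the chosen facts, and the sentence template are all hypothetical; real NLG systems are far richer.

```python
# Hypothetical knowledge base for illustration only.
KNOWLEDGE_BASE = {
    "city": "Paris",
    "temperature_c": 21,
    "condition": "sunny",
    "humidity": 40,
}

def plan_text(knowledge_base):
    """Text planning: retrieve the relevant content from the knowledge base."""
    wanted = ("city", "condition", "temperature_c")
    return {key: knowledge_base[key] for key in wanted}

def plan_sentence(content):
    """Sentence planning: choose the words and the order of the phrases."""
    temperature = f"{content['temperature_c']} degrees Celsius"
    return ["In", content["city"], "it is", content["condition"], "and", temperature]

def realize(phrases):
    """Text realization: map the sentence plan onto a surface sentence."""
    return " ".join(phrases) + "."

sentence = realize(plan_sentence(plan_text(KNOWLEDGE_BASE)))
print(sentence)  # In Paris it is sunny and 21 degrees Celsius.
```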
Levels In NLP
• Phonology
• Morphology
• Lexical
• Syntactic
• Semantic
• Discourse
• Pragmatic
Stages In NLP
Input
Parsing
• Phonology
• Morphological
• Lexical
Translating
• Syntactic
• Semantic
Generating
• Discourse
• Pragmatic
Some Applications Of NLP
Machine Translation
• Machine Translation (MT) is the task of automatically converting one natural
language into another, preserving the meaning of the input text, and producing
fluent text in the output language.
• The human translation process may be described as:
• Decoding the meaning of the source text
• Re-encoding this meaning in the target language.
• The challenge: how to program a computer that will "understand" a text as a person
does, and that will "create" a new text in the target language that
sounds as if it has been written by a person?
• In practice, MT often provides a general, though imperfect, approximation of the
original text, getting the "gist" of it (a process called "gisting").
This is sufficient for many purposes, including making best use of
the finite and expensive time of a human translator, which can be reserved for those cases in
which total accuracy is indispensable.
Information Retrieval
• The process of accessing and retrieving the most appropriate information from text
based on a particular query using context-based indexing or metadata.
• Simply put, information retrieval addresses the problem of finding those documents
whose content matches a user's request from among a large collection of documents.
User i/p Indian
PM
Doc1Indian PM
Doc2Pakistan
PM
Doc3American
President
Brings document
relating to Indian
PM
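The retrieval example above can be sketched by scoring each document on how many query terms it shares with the query. Term overlap is a simplifying assumption chosen for brevity; practical systems use weighted indexes rather than raw overlap.

```python
import re

# The three documents from the example above.
DOCUMENTS = {
    "Doc1": "Indian PM",
    "Doc2": "Pakistan PM",
    "Doc3": "American President",
}

def terms(text):
    """Return the set of lowercased word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents):
    """Rank documents by the number of terms they share with the query."""
    query_terms = terms(query)
    scores = {name: len(query_terms & terms(text)) for name, text in documents.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Documents with no shared term are not returned at all.
    return [(name, score) for name, score in ranked if score > 0]

print(retrieve("Indian PM", DOCUMENTS))
# [('Doc1', 2), ('Doc2', 1)]
```

Doc1 ranks first because it matches both query terms, while Doc3 shares none and is dropped.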
Sentiment Analysis
o The process of evaluating and determining the sentiment captured in a selection of
text
o Sentiment is defined as feeling or emotion.
o This sentiment can be simply:
• positive (happy)
• negative (sad or angry)
• neutral
o Or it can be a more precise measurement along a scale, with neutral in the middle, and positive and
negative increasing in either direction.
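Both views of sentiment, a three-way label and a position on a scale, can be illustrated with a lexicon-based sketch. The word lists and the scoring formula are assumptions made for this example; they are not taken from the slides or from any particular library.

```python
import re

# Hypothetical sentiment lexicons for illustration only.
POSITIVE = {"good", "great", "happy", "love", "excellent"}
NEGATIVE = {"bad", "sad", "angry", "hate", "terrible"}

def sentiment_score(text):
    """Return a score from -1.0 (negative) to 1.0 (positive); 0.0 is neutral."""
    tokens = re.findall(r"\w+", text.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    total = positive_hits + negative_hits
    if total == 0:
        return 0.0  # no sentiment-bearing words found
    return (positive_hits - negative_hits) / total

def sentiment_label(text):
    """Collapse the score into positive, negative, or neutral."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment_label("I love this great phone"))  # positive
```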
Information Extraction
• Information extraction (IE) is the task of automatically extracting structured
information from unstructured and/or semi-structured machine-readable documents.
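A minimal sketch of information extraction is a pattern that turns free text into records. The sentence pattern, the field names, and the sample text below are all hypothetical; production systems typically rely on trained named-entity and relation extractors rather than one regular expression.

```python
import re

# Hypothetical pattern: "<First Last> joined <Company> in <year>".
PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) joined "
    r"(?P<company>[A-Z][A-Za-z]+) in (?P<year>\d{4})"
)

def extract(text):
    """Return one dictionary per matched statement in the text."""
    return [match.groupdict() for match in PATTERN.finditer(text)]

# Made-up sample text for illustration only.
text = "Alice Smith joined Acme in 2019. Later, Bob Jones joined Initech in 2021."
for record in extract(text):
    print(record)
```

Each record has person, company, and year fields, which is the structured form the unstructured sentence did not have.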
Question Answering
• ELIZA, the first chatbot, was developed by Joseph Weizenbaum:
http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm
• Question-answering systems are intelligent systems that respond to the questions
a user asks, based on facts or rules stored in a knowledge base.
• The accuracy of a question-answering system's responses therefore depends
on the rules or facts stored in the knowledge base.
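The dependence on the knowledge base shows up clearly in a small sketch: a question is matched against a rule, and the answer is whatever fact the knowledge base holds. The facts, the single question pattern, and the fallback messages are hypothetical choices for this example.

```python
import re

# Hypothetical knowledge base of (relation, entity) -> answer facts.
FACTS = {
    ("capital", "france"): "Paris",
    ("capital", "japan"): "Tokyo",
}

# Each rule maps a question pattern to the relation it asks about.
RULES = [
    (re.compile(r"what is the capital of (\w+)\??", re.IGNORECASE), "capital"),
]

def answer(question):
    """Answer a question from the stored facts, or admit that it cannot."""
    for pattern, relation in RULES:
        match = pattern.fullmatch(question.strip())
        if match:
            entity = match.group(1).lower()
            # A matching rule is useless without a matching fact.
            return FACTS.get((relation, entity), "I do not know.")
    return "I do not understand the question."

print(answer("What is the capital of France?"))  # Paris
```

Asking about a country that is missing from FACTS returns "I do not know.", so accuracy is bounded by what the knowledge base contains.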
To Conclude
• While NLP is a relatively recent area of research and application compared to other
information technology approaches, there have been sufficient successes to date to
suggest that NLP-based information access technologies will continue to be a major area
of research and development in information systems now and far into the future.
ANY QUESTIONS???
THANK YOU!!
