Natural Language Processing
CREATED BY AANCHAL CHAURASIA - 2017
Contents
❏ What is NLP?
❏ Where is NLP?
❏ Components of NLP
❏ NLP Terminology
❏ Steps in NLP
❏ Developers’ Use of NLP Algorithms
❏ Practical Examples
❏ NLP Libraries
❏ Difficulties of NLP
❏ Innovations
❏ Research Areas
❏ Expansion domains of NLP
❏ References
Natural Language?
Refers to the languages spoken by people,
e.g. English, Spanish, or Japanese, as opposed
to artificial languages such as C++ or Java.
Natural Language Processing?
Natural Language Processing is the science of
teaching machines to understand the language
we humans speak and write.
Aim of NLP?
To build intelligent systems that can interact
with human beings the way a human being would.
Where is NLP?
NLP is everywhere, even if we don’t notice it.
Although NLP applications rarely achieve
100% accuracy, they are already
part of our lives and provide
valuable help to all of us.
❏ Machine translation
❏ Automatic summarization
❏ Sentiment analysis
❏ Text classification
❏ Language detection
❏ Information extraction, and a lot more
Input & Output of NLP
NLP involves making computers perform useful tasks with the natural
languages humans use. The input and output of an NLP system can be :
❏ Speech
❏ Written text
Components of NLP
Natural Language Understanding
NLU involves the following tasks :
❏ Analyzing different aspects of the language.
❏ Mapping the given natural language input into useful representations.
NLU is generally considered harder than NLG.
Natural Language Generation
NLG involves the following tasks :
❏ Text planning : Retrieving the relevant content from the knowledge base.
❏ Sentence planning : Choosing the required words, forming meaningful phrases, and setting the tone of the sentence.
❏ Text realization : Mapping the sentence plan into sentence structure.
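The three NLG stages above can be sketched as a toy pipeline. This is only an illustration under stated assumptions: the knowledge base, the function names (plan_text, plan_sentence, realize, generate) and the "formal" switch are all hypothetical and do not come from any real NLG library.

```python
# Hypothetical toy knowledge base, used only for this illustration.
KNOWLEDGE_BASE = {
    "Paris": {"country": "France"},
    "Tokyo": {"country": "Japan"},
}


def plan_text(topic):
    """Text planning: retrieve the relevant content from the knowledge base."""
    facts = KNOWLEDGE_BASE[topic]
    return {"subject": topic, "country": facts["country"]}


def plan_sentence(content, formal=True):
    """Sentence planning: choose the words and set the tone."""
    verb = "is located in" if formal else "is in"
    return [content["subject"], verb, content["country"]]


def realize(sentence_plan):
    """Text realization: map the sentence plan onto a surface sentence."""
    return " ".join(sentence_plan) + "."


def generate(topic, formal=True):
    """Run the three stages in order."""
    return realize(plan_sentence(plan_text(topic), formal))


print(generate("Paris"))
print(generate("Tokyo", formal=False))
```

Real NLG systems are far richer at every stage; the point here is only the separation into content selection, wording, and surface form.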
NLP Terminology
❏ Phonology : The study of how sounds are organized systematically in a language.
❏ Morphology : The study of how words are constructed from primitive meaningful units.
❏ Morpheme : A meaningful morphological unit of a language that cannot be further divided (e.g. “in”, “come”, and “-ing”, which together form “incoming”).
❏ Syntax : The arrangement of words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.
❏ Semantics : Concerned with the meaning of words and how to combine words into meaningful phrases and sentences.
❏ Pragmatics : Deals with using and understanding sentences in different situations, and with how the situation affects the interpretation of the sentence (e.g. “young men and women”).
❏ Discourse : Deals with how the immediately preceding sentence can affect the interpretation of the next sentence.
❏ World Knowledge : General knowledge about the world.
Steps in NLP
Detailed Study of the NLP Process
❏ Lexical Analysis : Dividing the whole chunk of text into paragraphs, sentences, and words.
❏ Syntactic Analysis (Parsing) : Analyzing the words in a sentence for grammar and arranging them in a manner that shows the relationships among them. For example, a word sequence such as “Boy the go the to store” is rejected as ungrammatical.
❏ Semantic Analysis : Drawing the exact, or dictionary, meaning from the text and checking the text for meaningfulness. This is done by mapping syntactic structures to their meanings. For example, a phrase such as “colourless green idea” is grammatical but is rejected as meaningless.
❏ Discourse Integration : The meaning of a sentence depends on the meaning of the sentence just before it, and it in turn influences the meaning of the sentence that immediately follows. For example, in “Kate wanted it.”, what “it” refers to depends on the preceding sentence.
❏ Pragmatic Analysis : What was said is re-interpreted as what was actually meant. This involves deriving those aspects of language which require real-world knowledge. For example, “Do you know what day it is?” should be read as a request for the day, not as a yes/no question.
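The first step, lexical analysis, can be sketched with regular expressions from the Python standard library. This is a minimal sketch, assuming paragraphs are separated by blank lines and sentences end in ".", "!" or "?"; the function name lexical_analysis is hypothetical, and production tokenizers handle abbreviations, numbers, and punctuation far more carefully.

```python
import re


def lexical_analysis(text):
    """Split text into paragraphs, then sentences, then word tokens."""
    result = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # A sentence boundary is whitespace that follows ., ! or ?
        sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
        result.append([re.findall(r"[A-Za-z0-9']+", s) for s in sentences if s])
    return result


sample = "The boy goes to the store. Kate wanted it.\n\nDo you know what day it is?"
structure = lexical_analysis(sample)
for number, paragraph in enumerate(structure, start=1):
    print("Paragraph", number, paragraph)
```

The result is a nested list (paragraphs, sentences, words), which is the kind of structure the later analysis steps would consume.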
Developers’ Use of NLP Algorithms
❏ By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relationship extraction, sentiment analysis, and speech recognition.
❏ Algorithms with many useful applications are available on many platforms and languages : Ruby, Python, Java, JavaScript, Node.js, Swift, and so on.
❏ Instead of hand-coding large sets of rules, NLP can rely on machine learning to learn these rules automatically by analyzing a set of examples.
❏ In general, the more data is analyzed, the more accurate the model tends to be.
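To illustrate learning from examples rather than hand-coding rules, here is a minimal sketch of a Naive Bayes text classifier in plain Python. The training sentences, labels, and function names (train, predict) are invented for this illustration; the slides do not name a specific learning algorithm, so Naive Bayes is an assumption chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count word frequencies per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Pick the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training examples; a real model needs far more data.
examples = [
    ("great movie loved it", "positive"),
    ("wonderful acting great story", "positive"),
    ("terrible movie hated it", "negative"),
    ("boring story awful acting", "negative"),
]
model = train(examples)
print(predict(model, "great story"))
print(predict(model, "awful movie"))
```

No rule such as "great means positive" is written anywhere; the association is estimated from the word counts in the examples.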
Let us look at
some practical applications
1. Summarize blocks of Text
Summarizer extracts the most important and central ideas while ignoring
irrelevant information.
❏ INPUTS :
(Required): Large block of text.
(Optional): Number of sentences (default=3)
❏ OUTPUT :
Summarized block of text.
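A summarizer with this interface can be sketched as a simple extractive method: score each sentence by the average frequency of its words and keep the top ones. This is an assumption about one possible approach, not the algorithm behind the Summarizer referenced in the slides; the stop-word list and the function name summarize are hypothetical.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "that", "for", "on"}


def content_words(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, num_sentences=3):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)


sample = (
    "NLP helps computers understand language. The weather was nice. "
    "NLP systems process language with computers. I had lunch."
)
print(summarize(sample, num_sentences=2))
```

As in the slide, the text is required and the number of sentences is optional with a default of 3.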
2. Auto tag
Automatically generate keyword tags : A technique that discovers topics
contained within a body of text.
❏ INPUTS :
(Required): Raw text.
❏ OUTPUT :
List of extracted tags.
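A very rough stand-in for auto-tagging is to return the most frequent non-trivial words as tags. Note the swap: the slide describes topic discovery, which typically relies on topic models, whereas this sketch uses plain word frequency. The function name auto_tag, the max_tags parameter, and the stop-word list are hypothetical.

```python
import re
from collections import Counter

# Small hypothetical stop-word list for the illustration.
STOPWORDS = {"the", "and", "with", "for", "that", "this", "are", "was", "make"}


def auto_tag(text, max_tags=3):
    """Return the most frequent words (longer than two letters) as tags."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(max_tags)]


raw_text = (
    "Python parsing tools make parsing text easy. "
    "Parsing Python text is fun with Python and parsing."
)
print(auto_tag(raw_text))
```

The input is raw text and the output is a list of tags, matching the interface on the slide.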
3. Named Entity Recognition
Identify the type of each extracted entity, such as a person, place, or
organization, using Named Entity Recognition.
❏ INPUTS :
{
  "document": text
}
❏ OUTPUT :
{
  "sentences": List
  [{
    "detectedEntities": List
    [{
      "word": String,
      "entity": String
    }]
  }]
}
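The output shape of the Named Entity Recognition example can be illustrated with a toy recognizer that looks words up in a fixed dictionary (a gazetteer). This is a sketch under assumptions: the gazetteer entries, the entity labels (PERSON, ORGANIZATION, LOCATION) and the function name recognize_entities are hypothetical, and real NER systems use statistical or neural models rather than a lookup table.

```python
import json
import re

# Hypothetical gazetteer mapping known names to entity types.
GAZETTEER = {
    "Google": "ORGANIZATION",
    "Apple": "ORGANIZATION",
    "Kate": "PERSON",
    "Paris": "LOCATION",
}


def recognize_entities(document):
    """Return detected entities per sentence in the documented output shape."""
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    output = {"sentences": []}
    for sentence in sentences:
        detected = [
            {"word": word, "entity": GAZETTEER[word]}
            for word in re.findall(r"[A-Za-z]+", sentence)
            if word in GAZETTEER
        ]
        output["sentences"].append({"detectedEntities": detected})
    return output


result = recognize_entities("Kate works at Google. She moved to Paris.")
print(json.dumps(result, indent=2))
```

Each sentence yields one "detectedEntities" list, and each entry pairs a "word" with its "entity" type.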
4. Sentiment Analysis
Identify and extract the sentiment in a given string. This algorithm takes an input string
and assigns a sentiment rating in the range [-1, 1] (very negative to very
positive).
❏ INPUTS :
{
  "document": text
}
❏ OUTPUTS :
[
  {
    "sentiment": 0.474,
    "document": "I really like eating ice cream in the morning!"
  }
]
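A sentiment rating in the range [-1, 1] can be sketched with a small word lexicon: average the scores of the known words in the input. The lexicon values and the function name sentiment are invented for this illustration, so the sketch will not reproduce the 0.474 score shown in the slide, which comes from a different (unspecified) model.

```python
import re

# Hypothetical lexicon; every value lies in [-1, 1], so the mean does too.
LEXICON = {
    "like": 0.5,
    "love": 1.0,
    "great": 0.8,
    "hate": -1.0,
    "awful": -0.8,
    "bad": -0.6,
}


def sentiment(document):
    """Average the lexicon scores of known words; 0.0 if none are found."""
    tokens = re.findall(r"[a-z']+", document.lower())
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    rating = sum(scores) / len(scores) if scores else 0.0
    return [{"sentiment": round(rating, 3), "document": document}]


print(sentiment("I really like eating ice cream in the morning!"))
print(sentiment("I hate this awful day"))
```

A lexicon lookup like this ignores negation and context ("not great" still scores as positive), which is one reason learned models are usually preferred.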
5. Summarize URL
This is an algorithm for summarizing the content of web pages. It combines two
algorithms : util/Html2Text and nlp/Summarizer. It retrieves the HTML content using the
heuristics of the Html2Text algorithm, and summarizes that content using the
Summarizer algorithm.
❏ INPUTS :
(Required): String URL.
(Optional): Number of sentences. (default=3)
❏ OUTPUT :
Summarized text of the page.
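The two-step idea behind Summarize URL (HTML to text, then summarize) can be sketched with the Python standard library. This is not the util/Html2Text or nlp/Summarizer implementation: the text extraction simply drops script and style content, and the summarizing step is a stand-in that keeps the first sentences. All function and class names here are hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def summarize_html(html, num_sentences=3):
    """Stand-in summarizer: keep the first num_sentences sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", html_to_text(html))
    return " ".join(sentences[:num_sentences])


def summarize_url(url, num_sentences=3):
    """Fetch a page and summarize it (requires network access)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return summarize_html(html, num_sentences)


SAMPLE_HTML = (
    "<html><head><style>p{color:red}</style><script>var x = 1;</script></head>"
    "<body><p>NLP is everywhere. It powers translation.</p>"
    "<p>It also powers search.</p></body></html>"
)
print(summarize_html(SAMPLE_HTML, num_sentences=2))
```

The sample runs on an inline HTML string so it works offline; summarize_url shows where the fetch would happen.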
Some Other Applications
❏ Spelling correction : Suggest spelling corrections (a.k.a. autocorrect) for English words.
❏ Detect profanity in text automatically. The detection is word-based only, so it may miss some words, certain misspellings, double entendres, and other material that is offensive only in context.
❏ Create a chat bot using Parsey McParseface, a deep learning model for language parsing made by Google that uses part-of-speech tagging.
❏ Reduce words to their root, or stem, using PorterStemmer, or break up text into tokens using Tokenizer.
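The limitation of word-based profanity detection mentioned above is easy to see in a sketch. The blocklist below uses mild placeholder words and the function name contains_profanity is hypothetical; this illustrates the general technique, not the specific algorithm the slides refer to.

```python
import re

# Mild placeholder words; a real blocklist would be far larger.
BLOCKLIST = {"darn", "heck"}


def contains_profanity(text):
    """Return True if any whole word of the text is on the blocklist."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in BLOCKLIST for token in tokens)


print(contains_profanity("Well, darn it!"))
print(contains_profanity("Well, d4rn it!"))
```

The second call shows the weakness: an obfuscated spelling no longer matches any blocklisted word, so it slips through.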
NLP Libraries
These libraries provide the algorithmic building blocks of NLP in real-world applications.
❏ Algorithmia : provides a free API endpoint for many algorithms.
❏ Apache OpenNLP : a machine learning toolkit that provides tokenizers, sentence segmentation, part-of-speech tagging, named entity extraction, parsing, and more.
❏ Natural Language Toolkit (NLTK) : a Python library that provides modules for processing text, classifying, tokenizing, tagging, parsing, and more.
❏ Stanford NLP : a suite of NLP tools that provides part-of-speech tagging, a named entity recognizer, a coreference resolution system, sentiment analysis, and more.
❏ Google SyntaxNet : a neural-network framework for analyzing and understanding the grammatical structure of sentences.
Difficulties of NLP
Human language is rarely precise or plainly spoken. To understand human language is
to understand not only the words, but also the concepts and how they’re linked together to
create meaning.
Although language is one of the easiest things for humans to learn, its ambiguity
is what makes NLP a difficult problem for computers to master.
❏ Lexical ambiguity : Ambiguity at a very primitive level, such as the word level.
For example, should the word “board” be treated as a noun or a verb?
❏ Syntax-level ambiguity : A sentence can be parsed in different ways.
For example, “He lifted the beetle with red cap.” Did he use a cap to lift the beetle, or did he
lift a beetle that had a red cap?
❏ Referential ambiguity : Ambiguity about what a pronoun refers to.
For example, Rima went to Gauri. She said, “I am tired.” Exactly who is tired?
❏ One input can have several different meanings.
❏ Many inputs can mean the same thing.
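Lexical ambiguity can be made concrete with a toy lexicon in which a word may carry more than one part-of-speech tag. The lexicon entries, tag names, and the function name ambiguous_words are hypothetical and only illustrate why a tagger needs context to decide.

```python
# Hypothetical toy lexicon: each word maps to its possible part-of-speech tags.
LEXICON = {
    "board": {"NOUN", "VERB"},
    "the": {"DET"},
    "passengers": {"NOUN"},
    "plane": {"NOUN"},
}


def ambiguous_words(sentence):
    """Return the words of the sentence that have more than one possible tag."""
    return {
        word: sorted(LEXICON[word])
        for word in sentence.lower().split()
        if len(LEXICON.get(word, ())) > 1
    }


print(ambiguous_words("Passengers board the plane"))
```

A lookup alone cannot choose between the two readings of “board”; resolving it requires the surrounding words, which is what syntactic analysis provides.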
Innovations
❏ In 2016, Google announced the public beta launch of its Cloud Natural Language API, which offers sentiment analysis, syntax analysis, entity recognition, etc. It sits alongside Google’s related APIs such as the Vision API and the Translate API.
❏ Vision API features : Label Detection, Face Detection, Explicit Content Detection, Image Attributes, Logo Detection, Landmark Detection, Optical Character Recognition.
❏ Translate API features : Text Translation, Language Detection.
❏ Vision API and Translate API pricing is based on usage.
❏ A new foundational machine learning framework used across Apple products, including Siri, Camera, and QuickType, has been made available to developers as Core ML.
❏ Core ML delivers fast performance with easy integration using just a few lines of code.
❏ Supported features include face tracking, face detection, landmarks, text detection, rectangle detection, barcode detection, object tracking, and image registration.
❏ Core ML Tools is a Python package.
❏ Ready-to-use Core ML models include Places205-GoogLeNet, ResNet50, Inception v3, and VGG16.
❏ Requirements : Xcode 9 and iOS 11.
Expansion Domains of NLP
❏ Medical
❏ Forensic Science
❏ Advertisement
❏ Education
❏ Business Development
❏ Marketing, and wherever we use language
Research Areas
❏ Text Processing
❏ Speech Processing
❏ Text to Speech
❏ Automatic Speech Recognition
❏ Speech to Speech Translation
❏ Commercial machine translation systems
❏ Search engines
❏ Advanced text editors
❏ Information extraction
❏ Fraud detection
❏ Opinion mining
❏ Sentiment analysis
References
❏ http://blog.algorithmia.com/introduction-natural-language-processing-nlp/
❏ https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_natural_language_processing.htm
❏ https://monkeylearn.com/blog/the-definitive-guide-to-natural-language-processing/
❏ https://cloudacademy.com/blog/google-cloud-natural-language-processing-api/
❏ https://algorithmia.com/algorithms/nlp/
❏ https://github.com/algorithmiaio/algorithmia-swift
❏ https://techcrunch.com/2016/07/20/google-launches-new-api-to-help-you-parse-natural-language/
❏ https://cloud.google.com/vision/
❏ https://cloud.google.com/translate/
❏ https://developer.apple.com/machine-learning/
Thank You!!
CREATED BY AANCHAL CHAURASIA - 2017

Mais conteĂșdo relacionado

Mais procurados

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Yasir Khan
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
Kuppusamy P
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Jaganadh Gopinadhan
 

Mais procurados (20)

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Nlp
NlpNlp
Nlp
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
NLP
NLPNLP
NLP
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - Introduction
 
Natural language processing PPT presentation
Natural language processing PPT presentationNatural language processing PPT presentation
Natural language processing PPT presentation
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
NLP
NLPNLP
NLP
 
Natural Language Processing
Natural Language Processing Natural Language Processing
Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 

Semelhante a Natural language processing

Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Abdullah al Mamun
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
PriyadharshiniG41
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
PriyadharshiniG41
 
1 Introduction.ppt
1 Introduction.ppt1 Introduction.ppt
1 Introduction.ppt
tanishamahajan11
 
Full text search
Full text searchFull text search
Full text search
deleteman
 
Natural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxNatural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptx
SHIBDASDUTTA
 

Semelhante a Natural language processing (20)

Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
NLP
NLPNLP
NLP
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
 
Using Stanza NLP and TensorFlow to create a summary of a book
Using Stanza NLP and TensorFlow to create a summary of a bookUsing Stanza NLP and TensorFlow to create a summary of a book
Using Stanza NLP and TensorFlow to create a summary of a book
 
Portuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowPortuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and How
 
Programming languages and concepts by vivek parihar
Programming languages and concepts by vivek pariharProgramming languages and concepts by vivek parihar
Programming languages and concepts by vivek parihar
 
NLP todo
NLP todoNLP todo
NLP todo
 
Big data
Big dataBig data
Big data
 
Untitled presentation.pdf
Untitled presentation.pdfUntitled presentation.pdf
Untitled presentation.pdf
 
NLP.pptx
NLP.pptxNLP.pptx
NLP.pptx
 
NLP.pptx
NLP.pptxNLP.pptx
NLP.pptx
 
1 Introduction.ppt
1 Introduction.ppt1 Introduction.ppt
1 Introduction.ppt
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Full text search
Full text searchFull text search
Full text search
 
Natural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxNatural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptx
 
ARTIFICIAL INTELLEGENCE AND MACHINE LEARNING.pptx
ARTIFICIAL INTELLEGENCE AND MACHINE LEARNING.pptxARTIFICIAL INTELLEGENCE AND MACHINE LEARNING.pptx
ARTIFICIAL INTELLEGENCE AND MACHINE LEARNING.pptx
 
Introduction to natural language processing, history and origin
Introduction to natural language processing, history and originIntroduction to natural language processing, history and origin
Introduction to natural language processing, history and origin
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
 
NLP
NLPNLP
NLP
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Natural language processing

  • 1. Natural Language Processing CREATED BY AANCHAL CHAURASIA - 2017
  • 2. Contents ❏ What is NLP? ❏ Where is NLP? ❏ Components of NLP ❏ NLP Terminology ❏ Steps in NLP ❏ Developers Usage of NLP Algorithms? ❏ Practical Examples ❏ NLP Libraries ❏ Difficulties of NLP ❏ Innovations ❏ Research Areas ❏ Expansion domains of NLP ❏ References CREATED BY AANCHAL CHAURASIA - 2017
  • 3. Natural Language? Refers to the language spoken by people, i.e English, Spanish, japanese as opposed to artificial languages like C++, Java etc. CREATED BY AANCHAL CHAURASIA - 2017
  • 4. Natural Language Processing? Natural Language Processing is the science of teaching Machines how to understand the language we Human speak and write. CREATED BY AANCHAL CHAURASIA - 2017
  • 5. Aim of NLP? To build intelligent system that can interact With Human Being like a Human Being. CREATED BY AANCHAL CHAURASIA - 2017
  • 6. Where is NLP? CREATED BY AANCHAL CHAURASIA - 2017
  • 7. NLP is Everywhere even if we don’t know it. And although NLP applications can rarely achieve performances of 100%, still they are part of our lives and provide a precious help to all of us. ❏ Machine translation ❏ Automatic summarization ❏ Sentiment analysis ❏ Text classification ❏ Language detection ❏ Information Extraction & lot more CREATED BY AANCHAL CHAURASIA - 2017
  • 8. Input & Output of NLP NLP involves making computers to perform useful tasks with the natural languages humans use. The input and output of an NLP system can be : ❏ Speech ❏ Written Text CREATED BY AANCHAL CHAURASIA - 2017
  • 9. Components of NLP CREATED BY AANCHAL CHAURASIA - 2017
  • 10. Natural Language Understanding NLU involves the following tasks : ❏ Analyzing different aspects of the language. ❏ Mapping the given input in natural language into useful representations. ❏ The NLU is harder than NLG. Natural Language Generation NLG involves the following tasks : ❏ Text planning : It includes retrieving the relevant content from knowledge base. ❏ Sentence planning : It includes choosing required words, forming meaningful phrases, setting tone of the sentence. ❏ Text Realization : It is mapping sentence plan into sentence structure. CREATED BY AANCHAL CHAURASIA - 2017
  • 11. NLP Terminology CREATED BY AANCHAL CHAURASIA - 2017
  • 12. ❏ Phonology : It is study of organizing sound patterns and their meanings. ❏ Morphology : It is a study of construction of words from primitive meaningful units. ❏ Morpheme : A meaningful morphological unit of a language that cannot be further divided (e.g. in, come, -ing, forming incoming ) ❏ Syntax : It refers to arranging words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases. CREATED BY AANCHAL CHAURASIA - 2017
  • 13. ❏ Semantics : It is concerned with the meaning of words and how to combine words into meaningful phrases and sentences. ❏ Pragmatics : It deals with using and understanding sentences in different situations and how the interpretation of the sentence is affected. (e.g. Young men and women) ❏ Discourse : It deals with how the immediately preceding sentence can affect the interpretation of the next sentence. ❏ World Knowledge : It includes the general knowledge about the world. CREATED BY AANCHAL CHAURASIA - 2017
  • 14. Steps in NLP CREATED BY AANCHAL CHAURASIA - 2017
  • 15. CREATED BY AANCHAL CHAURASIA - 2017
  • 16. Detailed study Of NLP Process CREATED BY AANCHAL CHAURASIA - 2017
  • 17. ❏ Lexical Analysis : Lexical analysis is dividing the whole chunk of text into paragraphs, sentences, and words. ❏ Syntactic Analysis (Parsing) : It involves analysis of words in the sentence for grammar and arranging words in a manner that shows the relationship among the words. i.e, Boy the go the to store ❏ Semantic Analysis : It draws the exact meaning or the dictionary meaning from the text. The text is checked for meaningfulness. It is done by mapping syntactic structures. i.e, colourless green idea CREATED BY AANCHAL CHAURASIA - 2017
  • 18. ❏ Discourse Integration : The meaning of any sentence depends upon the meaning of the sentence just before it. In addition, it also brings about the meaning of immediately succeeding sentence. I.e, Kate wanted it. ❏ Pragmatic Analysis : During this, what was said is re-interpreted on what it actually meant. It involves deriving those aspects of language which require real world knowledge. I.e, Do you know what day it is? CREATED BY AANCHAL CHAURASIA - 2017
  • 19. Developers Use of NLP Algorithms? CREATED BY AANCHAL CHAURASIA - 2017
  • 20. ❏ By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relationship extraction, sentiment analysis, speech recognition etc. ❏ Algorithms are available with various useful applications on many platforms using : Ruby, Python, Java, JavaScript, NodeJs, Swift & so on. ❏ Instead of hand-coding large sets of rules, NLP can rely on machine learning to automatically learn these rules by analyzing a set of examples. ❏ The more data analyzed, the more accurate the model will be. CREATED BY AANCHAL CHAURASIA - 2017
  • 21. Let us look into some practical Applications CREATED BY AANCHAL CHAURASIA - 2017
  • 22. 1. Summarize blocks of Text Summarizer extracts the most important and central ideas while ignoring irrelevant information. ❏ INPUTS : (Required): Large block of text. (Optional): Number of sentences (default=3) ❏ OUTPUT : Summarized block of text. CREATED BY AANCHAL CHAURASIA - 2017
  • 23. 2. Auto tag Automatically generate keyword tags : A technique that discovers topics contained within a body of text. ❏ INPUTS : (Required): Raw text. ❏ OUTPUT : List of extracted tags. CREATED BY AANCHAL CHAURASIA - 2017
  • 24. ❏ INPUTS : { "document": text } ❏ OUTPUT : { "sentences": List [{ "detectedEntities": List [{ "word": String, "entity": String }] }] } Identify the type of entity extracted, such as it being a person, place, or organization using Named Entity Recognition. 3. Named Entity Recognition CREATED BY AANCHAL CHAURASIA - 2017
  • 25. Identify and extract sentiment in given string. This algorithm takes an input string and assigns a sentiment rating in the range [-1 t0 1] (very negative to very positive). ❏ INPUTS : { "document": text } ❏ OUTPUTS : [ { "sentiment": 0.474, "document": "I really like eating ice cream in the morning! } ] 4. Sentiment Analysis CREATED BY AANCHAL CHAURASIA - 2017
  • 26. This is a algorithm for summarizing content in webpages. This algorithm utilizes two algorithm: util/Html2Text and nlp/Summarizer. It retrieves the html content in a smart way by using heuristics with Html2Text algorithm, and summarizes the content using the Summarizer algorithm. ❏ INPUTS : (Required): String URL. (Optional): Number of sentences. (default=3) ❏ OUTPUT : List of extracted tags. 5. Summarize URL CREATED BY AANCHAL CHAURASIA - 2017
  • 27. Some Other Applications CREATED BY AANCHAL CHAURASIA - 2017
  • 28. ❏ Spelling Correction, Suggests spelling corrections (aka. Autocorrect) for English words. ❏ Detect profanity in text automatically. It is word-based only. It may miss some words, miss certain misspellings, or double entendres and other material that is offensive in context. ❏ Create a chat bot using Parsey McParseface, a language parsing deep learning model made by Google that uses Point-of-Speech tagging. ❏ Reduce words to their root, or stem, using PorterStemmer, or break up text into tokens using Tokenizer. CREATED BY AANCHAL CHAURASIA - 2017
  • 29. NLP Libraries CREATED BY AANCHAL CHAURASIA - 2017
  • 30. These libraries provide the algorithmic building blocks of NLP in real-world applications. ❏ Algorithmia provides a free API endpoint for many algorithms. ❏ Apache OpenNLP: a machine learning toolkit that provides tokenizers, sentence segmentation, part-of-speech tagging, named entity extraction, parsing and more. ❏ Natural Language Toolkit (NLTK): a Python library that provides modules for processing text, classifying, tokenizing, tagging, parsing, and more. ❏ Standford NLP: a suite of NLP tools that provide part-of-speech tagging, the named entity recognizer, coreference resolution system, sentiment analysis, and more. ❏ Google SyntaxNet a neural-network framework for analyzing and understanding the grammatical structure of sentences. CREATED BY AANCHAL CHAURASIA - 2017
  • 31. Difficulties Of NLP
  • 32. Human language is rarely precise or plainly spoken. To understand human language is to understand not only the words, but also the concepts and how they are linked together to create meaning. Although language is one of the easiest things for humans to learn, its ambiguity is what makes NLP a difficult problem for computers to master. ❏ Lexical ambiguity : It arises at a very primitive level, the word level. For example, should the word “board” be treated as a noun or a verb?
  • 33. ❏ Syntax-level ambiguity : A sentence can be parsed in different ways. For example, “He lifted the beetle with red cap.” Did he use the cap to lift the beetle, or did he lift a beetle that had a red cap? ❏ Referential ambiguity : It arises when pronouns refer to something. For example, Rima went to Gauri. She said, “I am tired.” Exactly who is tired? ❏ One input can have several meanings. ❏ Many inputs can mean the same thing.
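The syntax-level ambiguity above can be made measurable by counting parse trees. The sketch below runs a CYK-style chart over a toy grammar written for this one sentence. The grammar rules, the lexicon, and the name `count_parses` are assumptions made for illustration, not part of any library.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),    # "lifted [the beetle ...]"
    ("VP", "VP", "PP"),   # the prepositional phrase attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "Adj", "N"),
    ("NP", "NP", "PP"),   # the prepositional phrase attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "he": ["NP"], "lifted": ["V"], "the": ["Det"], "beetle": ["N"],
    "with": ["P"], "red": ["Adj"], "cap": ["N"],
}


def count_parses(words: list, start: str = "S") -> int:
    """Count the distinct parse trees that derive `words` from `start`."""
    n = len(words)
    # chart[i][j][X] = number of trees deriving words[i:j] from symbol X
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[i][i + 1][tag] += 1
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between left and right child
                for parent, left, right in BINARY_RULES:
                    left_count = chart[i][k].get(left, 0)
                    right_count = chart[k][j].get(right, 0)
                    if left_count and right_count:
                        chart[i][j][parent] += left_count * right_count
    return chart[0][n].get(start, 0)


if __name__ == "__main__":
    print(count_parses("he lifted the beetle with red cap".split()))
    print(count_parses("he lifted the beetle".split()))
```

Under this grammar the full sentence has two parses, one for each reading described on the slide, while the shorter sentence without the prepositional phrase has only one.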
  • 34. Innovations
  • 35. ❏ Last year Google announced the public beta launch of its Cloud Natural Language API, which offers sentiment analysis, syntax detection, entity recognition, etc., joining the Vision API and the Translate API. ❏ Vision API features : Label Detection, Face Detection, Explicit Content Detection, Image Attributes, Logo Detection, Landmark Detection, Optical Character Recognition. ❏ Translate API features : Text Translation, Language Detection. ❏ Vision API & Translate API pricing is based on usage.
  • 36. ❏ A new foundational machine learning framework used across Apple products, including Siri, Camera, and QuickType, has been made available to developers via Core ML. ❏ Core ML delivers fast performance with easy integration using just a few lines of code. ❏ Supported features include face tracking, face detection, landmarks, text detection, rectangle detection, barcode detection, object tracking, and image registration. ❏ Core ML Tools is a Python package. ❏ Ready-made Core ML models include Places205-GoogLeNet, ResNet50, Inception v3, and VGG16. ❏ Requirements : Xcode 9 and iOS 11.
  • 37. Expansion Domains of NLP ❏ Medical ❏ Forensic Science ❏ Advertisement ❏ Education ❏ Business Development ❏ Marketing, and wherever else we use language
  • 38. Research Areas ❏ Text Processing ❏ Speech Processing ❏ Text to Speech ❏ Automatic Speech Recognition ❏ Speech to Speech Translation ❏ Commercial machine translation systems ❏ Search Engines ❏ Advanced text editors ❏ Information extraction ❏ Fraud detection ❏ Opinion mining ❏ Sentiment analysis
  • 39. References ❏ http://blog.algorithmia.com/introduction-natural-language-processing-nlp/ ❏ https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_natural_language_processing.htm ❏ https://monkeylearn.com/blog/the-definitive-guide-to-natural-language-processing/ ❏ https://cloudacademy.com/blog/google-cloud-natural-language-processing-api/ ❏ https://algorithmia.com/algorithms/nlp/ ❏ https://github.com/algorithmiaio/algorithmia-swift ❏ https://techcrunch.com/2016/07/20/google-launches-new-api-to-help-you-parse-natural-language/ ❏ https://cloud.google.com/vision/?utm_source=google&utm_medium=cpc&utm_campaign=2017-q1-cloud-japac-in-gcp-SKWS-freetrial&utm_content=en&gclid=CJ7wrIOb9NQCFUiGjwodCBcHcQ&dclid=CMDY1IOb9NQCFYqFaAod6M0I9g ❏ https://cloud.google.com/translate/ ❏ https://developer.apple.com/machine-learning/
  • 40. Thank You!!