7. WHAT IS TEXT ANALYTICS?
unstructured data
Linguistics, Statistics, Text Processing, Machine Learning, Natural Language Processing
Search, Data Extraction, Document Organization, Business Intelligence, Opinion Mining
Text Mining
8. TEXT ANALYTICS SAVES MORE TIME
• Compose search reports
• Extract entities
• Mine opinions & sentiment
• Cluster search results
• Redact
• Summarize
• Generate metadata
• Fill databases
• Profanity check
… automatically
10. TEXT ANALYTICS: GLOBAL PERSPECTIVE
User adoption grew by 25% in 2010,
creating an $835 million market, because:
• Unstructured data keeps growing (e.g. social media) → Text analytics!
• Text analytics is central to effective information access
• Many successes in NLP: IBM Watson, Wolfram Alpha
Full report by Seth Grimes:
http://altaplana.com/TA2011
11. APPLICATIONS OF TEXT ANALYTICS
Search & info access 39%
Customer experience management 39%
Brand management 39%
Research 36%
Competitive intelligence 33%
Customer service 26%
E-discovery 15%
Life sciences 15%
Product design 15%
Online commerce 11%
Finance 10%
Other 9%
Content management 8%
Insurance & fraud 8%
Military intelligence 7%
Law enforcement 6%
Source: http://altaplana.com/TA2011
12. SEARCH & INFO ACCESS
METADATA EXTRACTION
Document → Metadata
Easy to extract:
File type, name & location, creation & modification date, authors
Difficult to extract:
Keywords, people & companies mentioned, suppliers & addresses mentioned
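The "easy" half of this slide can be sketched with the Python standard library alone. This is a minimal illustration, not the presenter's tooling: the function name `easy_metadata` and the dictionary keys are assumptions. Creation date and authors are left out because they depend on the platform and on the document format.

```python
from datetime import datetime, timezone
from pathlib import Path


def easy_metadata(path):
    """Collect metadata that the file system already knows about a document."""
    p = Path(path)
    info = p.stat()
    return {
        "file_type": p.suffix.lstrip(".").lower(),
        "name": p.name,
        "location": str(p.resolve().parent),
        # st_mtime is the time of the last content modification.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }
```

Keywords, people, companies and addresses are not stored by the file system, which is why they need text analytics.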
13. SEARCH & INFO ACCESS
KEYWORD EXTRACTION
Document → Candidates → Keywords
Hi All,
As of today, MetaStock has several new functions.
The most important new feature is the ability to
display forward heat rate charts.
Also, notice that the interface looks different -- this
reflects and accommodates the new features.
If you have any questions regarding this new
version of MetaStock, please contact Bella Santuri.
15. SEARCH & INFO ACCESS
KEYWORD EXTRACTION
Document → Candidates → Properties → Keywords
Properties: Frequency, Position, Corpus stats, Relatedness
16. SEARCH & INFO ACCESS
KEYWORD EXTRACTION
Document → Candidates → Properties → Scoring → Keywords
Scoring: Heuristic scoring, Machine learning
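The four stages on these slides (candidates, properties, scoring, keywords) can be sketched in a few lines. This is a hedged illustration using heuristic scoring only: the stopword list, the function names and the scoring formula are assumptions, and the corpus-statistics and relatedness properties are omitted because they need a background corpus.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {
    "a", "all", "also", "and", "any", "as", "has", "have", "hi", "if", "is",
    "most", "of", "please", "several", "that", "the", "this", "to", "today",
    "you",
}

EMAIL = (
    "Hi All, As of today, MetaStock has several new functions. "
    "The most important new feature is the ability to display forward heat "
    "rate charts. Also, notice that the interface looks different -- this "
    "reflects and accommodates the new features. If you have any questions "
    "regarding this new version of MetaStock, please contact Bella Santuri."
)


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def candidates(words, max_len=3):
    """Yield (phrase, start_offset) for every stopword-free n-gram."""
    for start in range(len(words)):
        for n in range(1, max_len + 1):
            gram = words[start:start + n]
            if len(gram) < n or gram[-1] in STOPWORDS:
                break  # longer n-grams from this start would fail too
            yield " ".join(gram), start


def properties(text):
    """Compute frequency and relative first position for each candidate."""
    words = tokenize(text)
    freq, first = Counter(), {}
    for phrase, start in candidates(words):
        freq[phrase] += 1
        first.setdefault(phrase, start / len(words))
    return {p: {"frequency": freq[p], "position": first[p]} for p in freq}


def score(props):
    # Heuristic: frequent phrases that first appear early score higher.
    return props["frequency"] * (1.0 - props["position"])


def keywords(text, top_n=5):
    props = properties(text)
    return sorted(props, key=lambda p: (-score(props[p]), p))[:top_n]


print(keywords(EMAIL))
```

A generic word such as "new" can rank highly under this heuristic, which is the kind of error that corpus statistics or a trained scoring model is meant to correct.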
17. SEARCH & INFO ACCESS
NAMES EXTRACTION
Document → Examples → Properties → Learning → Names
If you have any questions regarding this new version of MetaStock, please contact Bella Santuri.
Examples: Training data (annotations)
Properties: NLP, Heuristics, Text mining
Learning: Machine Learning
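The "Properties" step for names can be illustrated with a few heuristic features. This sketch only produces candidates and their properties. The learning step, which would be trained on annotated examples, is out of scope here. The cue-word list, the regular expression and the feature names are assumptions made for illustration.

```python
import re

# Hypothetical cue words that often precede a person's name.
PERSON_CUES = {"contact", "mr", "mrs", "ms", "dr"}

# One or more capitalized words separated by single spaces.
CAPITALIZED_RUN = re.compile(r"[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*")


def name_candidates(text):
    """Return capitalized runs with simple properties a learner could use."""
    results = []
    for match in CAPITALIZED_RUN.finditer(text):
        before = text[:match.start()].split()
        previous = before[-1].lower().strip(".,") if before else ""
        results.append({
            "candidate": match.group(),
            "tokens": len(match.group().split()),
            "after_cue": previous in PERSON_CUES,
            "sentence_initial": not before or before[-1].endswith((".", "!", "?")),
        })
    return results


sentence = ("If you have any questions regarding this new version of "
            "MetaStock, please contact Bella Santuri.")
for row in name_candidates(sentence):
    print(row)
```

On the example sentence, "If", "MetaStock" and "Bella Santuri" all become candidates, and the properties are what would let a trained model separate the person name from the rest.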
19. BRAND & CUSTOMER MANAGEMENT
SENTIMENT ANALYSIS
Reviews, Tweets, Surveys → Sentiment Analysis → Document Summary, Document Visualization
Naïve approach: Sentiment-words dictionary!
Negative: suck, terrible, awful
Positive: fantastic, excellent, awesome
BUT:
“If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.”
No sentiment words!
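The naïve dictionary approach and its failure on the fragrance review can be reproduced in a few lines. The word lists come from the slide. The function name and the scoring rule (positive hits minus negative hits) are assumptions for this sketch.

```python
import re

NEGATIVE = {"suck", "terrible", "awful"}
POSITIVE = {"fantastic", "excellent", "awesome"}


def naive_sentiment(text):
    """Count positive dictionary hits minus negative dictionary hits."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


review = ("If you are reading this because it is your darling fragrance, "
          "please wear it at home exclusively, and tape the windows shut.")

print(naive_sentiment("The service was excellent"))  # 1
print(naive_sentiment(review))                       # 0
```

The review is clearly negative, yet it scores 0 because none of its words are in the dictionary.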
20. BRAND & CUSTOMER MANAGEMENT
SENTIMENT ANALYSIS
Reviews, Tweets, Surveys → Examples → Properties → Learning → Document Summary, Document Visualization
Examples: Training data (annotations)
Properties: Presence, Position, Part-of-Speech, Negation, Generalization
Lexicon induction
Learning: Machine Learning
Important:
• Identifying sentiment-bearing sentences
• Attaching sentiment to a topic!
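Two of the properties listed on this slide, presence and negation, can be sketched as a feature extractor. Marking words that follow a negation with a `NOT_` prefix is one common convention, used here as an assumption rather than as the presenter's method. The negation list and function name are also illustrative. Position and part-of-speech features are omitted.

```python
import re

NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = set(".,!?;")


def presence_features(sentence):
    """Binary word-presence features, with negated words prefixed by NOT_."""
    features = set()
    negated = False
    for token in re.findall(r"[a-z']+|[.,!?;]", sentence.lower()):
        if token in PUNCTUATION:
            negated = False  # assume negation scope ends at punctuation
        elif token in NEGATIONS or token.endswith("n't"):
            negated = True
        else:
            features.add("NOT_" + token if negated else token)
    return features


print(sorted(presence_features(
    "I do not like this fragrance, but the bottle is excellent.")))
```

A classifier trained on such features can learn that `NOT_like` is negative even though "like" alone is positive.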
22. RESEARCH
TEXT SUMMARIZATION
Address: Hi All,
Announcement: As of today, MetaStock has several new functions.
Details: The most important new feature is the ability to display forward heat rate charts.
More details: Also, notice that the interface looks different -- this reflects and accommodates the new features.
Conclusion: If you have any questions regarding this new version of MetaStock, please contact Bella Santuri.
Extractive summary: As of today, MetaStock has several new functions.
Sentence compression: MetaStock has several new functions.
The new interface looks different.
Abstractive summary: MetaStock has new features and a new interface.
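Extractive summarization, the simplest of the three variants above, can be sketched by scoring each sentence with the average frequency of its words and keeping the top ones. This is a common baseline, assumed here for illustration. It is not necessarily how the summary on the slide was produced, and it may select a different sentence.

```python
import re
from collections import Counter


def split_sentences(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def extractive_summary(text, n_sentences=1):
    """Return the n highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]
```

Sentence compression and abstractive summarization need to rewrite text rather than select it, so they are beyond what a frequency count can do.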
27. APIs
What’s an API and how does it work?
What are the advantages of the API model?
Which API is the right one for you?
28. API ACCESS
Developer creates an application → API: an interface that ensures communication → ENGINE: a software engine that solves a specific task
SDK, usage examples
Calls via a web service:
• a protocol (SOAP or REST) specifies how the XML needs to be encoded
• a call is an XML message describing the request
• a call includes API authentication
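To make "a call is an XML message describing the request" concrete, here is a sketch that builds a SOAP 1.1 envelope with the Python standard library. The operation name `ExtractKeywords` and the `apiKey` and `text` elements are hypothetical. A real service defines its own message schema, usually in a WSDL file.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_call(operation, api_key, text):
    """Build a SOAP envelope carrying authentication and the request text."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, operation)      # hypothetical operation
    ET.SubElement(request, "apiKey").text = api_key  # hypothetical auth field
    ET.SubElement(request, "text").text = text
    return ET.tostring(envelope, encoding="unicode")


print(build_call("ExtractKeywords", "demo-key",
                 "MetaStock has several new functions."))
```

The resulting string would be sent to the web service in the body of an HTTP POST request. Sending it is omitted here because the endpoint is service-specific.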
29. REST API ACCESS FROM A BROWSER
API request
http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna&context=Italian+sculptors+and+painters+of+the+renaissance+favored+the+Virgin+Mary+for+inspiration
API response
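The request URL on this slide can be assembled programmatically instead of typed into a browser. The sketch below only builds the URL and does not send it, since the endpoint may no longer be available. The function name is an assumption. The parameter names and values are taken from the slide.

```python
from urllib.parse import urlencode

BASE = "http://search.yahooapis.com/WebSearchService/V1/webSearch"


def build_request_url(appid, query, context):
    """Encode the parameters as a query string; spaces become '+'."""
    params = {"appid": appid, "query": query, "context": context}
    return f"{BASE}?{urlencode(params)}"


url = build_request_url(
    "YahooDemo",
    "madonna",
    "Italian sculptors and painters of the renaissance favored "
    "the Virgin Mary for inspiration",
)
print(url)
```

The `context` parameter is what lets the service disambiguate the query "madonna".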
31. SOAP API ACCESS IN POWERSHELL
Read complete blog post “Bulk metadata extraction in SharePoint”:
http://bit.ly/powershell-migrate
32. API = EASY INTEGRATION & FLEXIBILITY
• Integrate into existing architecture
via any programming language
• Improve known flaws in the current system/process
• Minimize adoption barriers within the company:
little or no training required for staff
• Only pay for the features you need
• Flexible deployment:
• Host API on site = Secure data exchange
• Access the API in the cloud = Save on tech support & hardware
33. WHICH API IS BEST FOR YOU?
I need to take some text and get a list of the
important entities/keywords/phrases.
APIs compared: Y: Term Extractor, OpenCalais, BeliefNetworks, OpenAmplify, AlchemyAPI (2nd), Evri (1st)
Comparison criteria: API restrictions, Supported languages, Quality of results, Semantic links, Synonyms/Duplicates
Blog post on API comparison:
faganm.com/blog
34. HOW TO CHOOSE AN API:
• Define a specific task
• Think of what features are important
• Get prepared:
• Sign up for API keys
• Get SDKs
• Learn libraries
• Find representative data
• Build a test framework
• Compare results
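The last two steps, building a test framework and comparing results, can be sketched as a small scoring harness. It assumes you have already collected each API's keyword output for the same representative document and written down the keywords you expect. The API names below are placeholders, and precision, recall and F1 are one reasonable choice of metric, not the only one.

```python
def precision_recall(predicted, expected):
    """Compare two keyword lists case-insensitively."""
    predicted = {p.lower() for p in predicted}
    expected = {e.lower() for e in expected}
    hits = len(predicted & expected)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(expected) if expected else 0.0
    return precision, recall


def compare(api_outputs, expected):
    """Rank candidate APIs by F1 score, best first."""
    rows = []
    for name, predicted in api_outputs.items():
        p, r = precision_recall(predicted, expected)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        rows.append((name, p, r, f1))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Placeholder outputs; in practice these come from real API calls.
outputs = {
    "api_a": ["MetaStock", "heat rate charts"],
    "api_b": ["MetaStock", "today", "functions", "version"],
}
expected = ["metastock", "heat rate charts", "interface"]

for name, p, r, f1 in compare(outputs, expected):
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Running the same harness over many representative documents gives a fairer picture than judging a single example.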
38. THE NEXT-GENERATION SHAREPOINT:
POWERED BY TEXT ANALYTICS
• What can be automated?
• Metadata extraction, Data entry, Opinion mining,
Sanitization, Doc approval, Summarization, …
• How to integrate text analytics
into existing SharePoint applications?
• Easy! Via an API
• How to find the right text analytics API?
• Review what’s available
• Set up an experiment
• Compare results
How many hours per week does an average computer user spend on searching?
What the heck is text analytics? A 101 introduction course…
How APIs work and why they are great for both business people and developers.
What are your primary applications where text comes into play?