Technology Frontiers: Text, Sentiment, and Sense by Seth Grimes of Alta Plana Corporation - Presented at the Insight Innovation eXchange North America 2013
A basic definition: Text analytics transforms text-sourced information into data to help you generate insights that fuel better-informed business decision-making. Methods are applied to online and social information, as well as enterprise feedback, to complement and extend traditional and emerging research methods. Text analytics is the leading opinion mining technique, evolving to link emotion and intent signals to behaviors, profiles, and transactions. If text analytics isn’t part of your data toolkit, it should be; if you’re already exploiting text analytics, you’ll want to stay on top of developments. Seth Grimes, in this What’s Next talk, will tell you how.
4. Natural Language Processing
By H.P. Luhn, in IBM Journal, April 1958
http://altaplana.com/ibm-luhn58-LiteratureAbstracts.pdf
5. Modelling Text
“Statistical information derived from word frequency and distribution is
used by the machine to compute a relative measure of significance, first
for individual words and then for sentences. Sentences scoring highest in
significance are extracted and printed out to become the auto-abstract.”
-- H.P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal, 1958.
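The frequency-based scoring Luhn describes can be sketched in a few lines. This is a simplified illustration, not Luhn's exact procedure: it scores a sentence by the average corpus frequency of its non-stop words rather than by his clusters of significant words, and the stop list, function names, and sample text are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
              "it", "that", "for", "on", "by", "with", "as", "be"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens; punctuation and digits are dropped."""
    return re.findall(r"[a-z']+", sentence.lower())


def auto_abstract(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word significance = frequency across the whole text, stop words excluded.
    freq = Counter(w for s in sentences for w in tokenize(s)
                   if w not in STOP_WORDS)

    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Sentence significance = summed word significance, length-normalized.
        return sum(freq[w] for w in words if w not in STOP_WORDS) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


sample = ("Nerve cells send chemical messengers. The weather was pleasant. "
          "Chemical messengers carry signals between nerve cells.")
print(auto_abstract(sample, n_sentences=2))
```

On the sample text, the off-topic weather sentence scores lowest because none of its words recur elsewhere, so it is left out of the two-sentence abstract.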
Luhn’s analysis of Messengers of the Nervous System, a Scientific American article
http://wordle.net, applied to the NY Times article
13. Text Analytics
Lexical, syntactic, and semantic analysis discern features, including relationships, in source materials.
Features = entities, measure-value pairs, concepts, topics, events, sentiment, and more.
Text analytics may draw on:
• Lexicons & taxonomies.
• Statistics.
• Patterns.
• Linguistics.
• Machine learning.
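As a rough illustration of two of the resources listed above, a lexicon and patterns, the following sketch pulls entities and measure-value pairs out of a sentence. The lexicon entries, the unit list, and the function name are all hypothetical; production systems combine much larger resources with statistics, linguistics, and machine learning.

```python
import re

# Hypothetical lexicon mapping lowercase surface forms to entity types.
LEXICON = {"ibm": "ORGANIZATION", "new york": "LOCATION"}

# Pattern for measure-value pairs: a number followed by an (assumed) unit.
MEASURE = re.compile(r"(\d+(?:\.\d+)?)\s*(%|percent|dollars|km|kg)",
                     re.IGNORECASE)


def extract_features(text):
    """Return lexicon-matched entities and pattern-matched measures."""
    lowered = text.lower()
    entities = [(term, label) for term, label in LEXICON.items()
                if re.search(r"\b" + re.escape(term) + r"\b", lowered)]
    measures = [(float(value), unit.lower())
                for value, unit in MEASURE.findall(text)]
    return {"entities": entities, "measures": measures}


features = extract_features("IBM revenue in New York rose 4.5 percent.")
print(features)
```

Exact lexicon lookup is brittle (it misses variants and cannot disambiguate), which is one reason the other resources in the list are needed alongside it.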
15. From POS to Relationships
Understand parts of speech (POS), e.g., <subject> <verb> <object>, to discern facts and relationships.
Semantic networks such as WordNet are a disambiguation asset.
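The step from POS tags to a <subject> <verb> <object> relationship can be sketched as follows. The input is assumed to be already tagged with Penn Treebank-style tags (here by hand, to keep the example self-contained), and the heuristic of taking the nearest noun on each side of a verb is a deliberate oversimplification; real systems rely on a syntactic parse.

```python
def extract_svo(tagged):
    """Return (subject, verb, object) triples from (token, tag) pairs.

    Naive heuristic: for each verb, the subject is the nearest noun to its
    left and the object is the nearest noun to its right.
    """
    triples = []
    for i, (token, tag) in enumerate(tagged):
        if not tag.startswith("VB"):
            continue
        subject = next((t for t, g in reversed(tagged[:i])
                        if g.startswith("NN")), None)
        obj = next((t for t, g in tagged[i + 1:]
                    if g.startswith("NN")), None)
        if subject and obj:
            triples.append((subject, token, obj))
    return triples


# Hand-tagged example sentence: "Luhn wrote the paper"
tagged_sentence = [("Luhn", "NNP"), ("wrote", "VBD"),
                   ("the", "DT"), ("paper", "NN")]
print(extract_svo(tagged_sentence))
```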
17. The Back End
Platforms and ecosystems.
APIs and services.
Text and content analytics:
Discerns and extracts features, including relationships, from source materials.
Features = entities, key-value pairs, concepts, topics, events, sentiment, etc.
Provide (for) BI on content-sourced data.
Data integration, record linkage, data fusion.
21. Sentiment Analysis
“Sentiment analysis is the task of identifying positive
and negative opinions, emotions, and evaluations.”
-- Wilson, Wiebe & Hoffmann, 2005, “Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis”
“Sentiment analysis or opinion mining is the
computational study of opinions, sentiments and
emotions expressed in text… An opinion on a feature f is
a positive or negative view, attitude, emotion or
appraisal on f from an opinion holder.”
-- Bing Liu, 2010, “Sentiment Analysis and Subjectivity,” in Handbook of
Natural Language Processing
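A minimal sketch of these definitions: an opinion is recorded as a holder, a feature, and a polarity, and polarity is computed from a small lexicon with crude negation handling. The lexicon, the negator list, and the `Opinion` record are hypothetical illustrations, not the methods of the cited authors.

```python
import re
from dataclasses import dataclass

# Hypothetical prior-polarity lexicon and negator list.
POLARITY = {"good": 1, "great": 1, "love": 1,
            "bad": -1, "poor": -1, "hate": -1}
NEGATORS = {"not", "never", "no"}


@dataclass
class Opinion:
    """An opinion on a feature from a holder, following Liu's framing."""
    holder: str
    feature: str
    polarity: str


def polarity(sentence):
    """Classify a sentence as positive, negative, or neutral.

    A negator flips the next polarity word that follows it; contractions
    such as "isn't" are not handled.
    """
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", sentence.lower()):
        if word in NEGATORS:
            negate = True
        elif word in POLARITY:
            score += -POLARITY[word] if negate else POLARITY[word]
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "The battery is not good"
opinion = Opinion(holder="reviewer", feature="battery",
                  polarity=polarity(review))
print(opinion)
```

The negation flip is a small example of contextual polarity: "good" is positive in the lexicon but negative in this sentence.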
25. Complications
Sentiment may be of interest at multiple levels.
Corpus / data space, i.e., across multiple sources.
Document.
Statement / sentence.
Entity / topic / concept.
Human language is noisy and chaotic!
Jargon, slang, irony, ambiguity, anaphora, polysemy,
synonymy, etc.
Context is key. Discourse analysis comes into play.
Must distinguish the sentiment holder from the object:
“Geithner said the recession may worsen.”
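One way to see the holder/object distinction is to split a reported statement at its reporting verb. The sketch below does this with a regular expression; the verb list and function name are assumptions, and the pattern only covers the simple "X said Y" shape.

```python
import re

# Hypothetical list of reporting verbs that introduce someone's view.
REPORTING = r"(?:said|says|warned|believes|thinks|claimed)"
PATTERN = re.compile(
    rf"^(?P<holder>.+?)\s+{REPORTING}\s+(?:that\s+)?(?P<statement>.+?)[.!?]?$"
)


def split_holder(sentence):
    """Return (holder, statement); holder is None if no reporting verb matches."""
    sentence = sentence.strip()
    match = PATTERN.match(sentence)
    if not match:
        return None, sentence
    return match.group("holder"), match.group("statement")


holder, statement = split_holder("Geithner said the recession may worsen.")
print(holder, "|", statement)
```

Here the holder is Geithner, while the negative outlook attaches to the recession. A sentence-level classifier that ignored this split could wrongly attribute negative sentiment to the speaker.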
27. Sensemaking
“It is convenient to divide the entire
information access process into two
main components: information retrieval
through searching and browsing, and
analysis and synthesis of results. This
broader process is often referred to in
the literature as sensemaking.
Sensemaking refers to an iterative
process of formulating a conceptual
representation from a large volume
of information. Search plays only one
part in this process.”
-- Marti Hearst, 2009 http://searchuserinterfaces.com/
28. Suggestions
Apply new tech to old needs, e.g., automated coding.
Select from and use all available data.
Marry social to profiles and surveys.
Factor in behaviors.
Interpret according to context and needs.
Understand intent to create situational predictive models.
Explore; experiment.