3. Text Analytics 2015
“Who controls the past controls the future: who
controls the present controls the past.”
-- 1984, George Orwell
Let’s start with the past…
4. Document input and processing
Knowledge handling is key
Desk Set (1957): Computer engineer
Richard Sumner (Spencer Tracy)
and television network librarian
Bunny Watson (Katharine Hepburn)
and the "electronic brain" EMERAC.
Hans Peter Luhn
“A Business Intelligence System”
IBM Journal, October 1958
5. Text Analytics 2015
2005: “The bulk of information value is perceived as
coming from data in relational tables. The reason is
that data that is structured is easy to mine and
analyze.”
-- Prabhakar Raghavan, now Google VP Engineering
6. Text Analytics 2015
2007: “Organizations embracing text analytics all
report having an epiphany moment when they
suddenly knew more than before.”
-- Philip Russom, the Data Warehousing Institute
7. Text Analytics 2015
2010: “The Web has dramatically changed the way
that people express their views and opinions.”
-- Prof. Bing Liu, Univ. of Illinois, Chicago
“The future is clearly about analyzing feedback
in any form that your customers give it. That’s a
trend that won’t go away.”
-- Bruce Temkin
11. Text Analytics 2015
Ava applies Affective, Cognitive & Psychomotor
methods (per Bloom’s Taxonomy of educational
objectives).
12. Text Analytics 2015
Drivers and Trends
For insights, technology drives method.
• Data science, data monetization.
• Big data: Social, online & enterprise.
• Volume and velocity mean new analytical
approaches.
• Variety: new types and a new fusion imperative.
• Algorithms… cognitive and affective.
• Stats.
• Language engineering.
• Deep learning; unsupervised, semi-supervised,
supervised & active methods.
• Via-API cloud services… the API economy.
13. Text Analytics 2015
Do you currently need (or expect to need) to
extract or analyze –
(share of respondents: Current / Expect)
• Events: 33% / 21%
• Semantic annotations: 31% / 24%
• Other entities – phone numbers, part/product
numbers, e-mail & street addresses, etc.: 34% / 23%
• Metadata such as document author, publication
date, title, headers, etc.: 47% / 23%
• Concepts, that is, abstract groups of entities: 51% / 28%
• Named entities – people, companies,
geographic locations, brands, ticker symbols,…: 56% / 25%
• Relationships and/or facts: 47% / 33%
• Sentiment, opinions, attitudes, emotions,
perceptions, intent: 54% / 28%
• Topics and themes: 66% / 22%
http://altaplana.com/TA2014
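To make the "Other entities" category in the survey above concrete: items such as e-mail addresses and phone numbers are often pulled out with simple patterns rather than statistical models. The sketch below is illustrative only. The function name `extract_other_entities`, the sample sentence, and both regular expressions are assumptions made for this example (the phone pattern covers only North American style numbers); none of it comes from the survey or from any particular product.

```python
import re

# Deliberately simplified patterns (assumptions, not production-grade validators).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# North American style numbers such as 555-123-4567 or (555) 123-4567.
PHONE_RE = re.compile(r"(?:\(\d{3}\)\s*|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")


def extract_other_entities(text):
    """Return pattern-matched e-mail addresses and phone numbers found in text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


# Hypothetical sample text for illustration.
sample = "Contact Jane at jane.doe@example.com or (555) 123-4567; fax 555-987-6543."
result = extract_other_entities(sample)
print(result)
```

Pattern matching of this kind suits rigidly formatted items; the other survey categories (named entities, sentiment, topics) generally call for linguistic or statistical methods.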
14. Text Analytics 2015
“The sharp rise in users who selected Arabic…coincided
with much of the civil unrest… in Middle Eastern
countries.”
http://bits.blogs.nytimes.com/2014/03/
09/the-languages-of-twitter-users/