Sentiment analysis of online social networks, such as Twitter, is a rapidly growing research area with an increasing number of applications, such as informing disaster response, tracking the spread of contagious disease, and forecasting civil unrest early. This talk will present an approach for building scalable social media analysis platforms in Clojure.
7. Apache Ant
Leiningen Versus the Ants
Clojure Versus Java
Functional Versus Object Oriented
8. “You wanted a banana but what you got was a
gorilla holding the banana and the entire jungle.”
Joe Armstrong - inventor of Erlang
9. “Sometimes, the elegant implementation is just
a function. Not a method. Not a class. Not a
framework. Just a function.”
John Carmack - Lead on Doom, Quake, Rage
11. Blood aches pain
sick shivers spasm
vomit dizzy fainting
colic and in a
coma!
1 FAVORITE
12. Scores HIGH for flu symptoms
13. Who is Catalina Rubottom?
14. 30 million geo-tagged
tweets sent from UK
HDFS/Hadoop
Mongo/Aggregation
Mongo/MapReduce
Postgres
15. How can we do fast, real time analytics of
social media?
44. How many people in England are Happy
about the referendum result?
(wcar*
  (car/bitop "AND" "ENGLAND&JOVIALITY" "ENGLAND" "JOVIALITY")
  (car/expire "ENGLAND&JOVIALITY" 10)
  (car/bitcount "ENGLAND&JOVIALITY"))
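The query above only makes sense if each key is a bitmap with one bit per user, which the slides imply ("how many people") but do not spell out. The following is a minimal in-memory sketch of that data model, using `java.util.BitSet` as a stand-in for Redis. The names `bitmaps`, `bitmap-for`, `record-tweet!` and `count-all`, and the idea of a numeric per-user offset, are assumptions made for illustration and are not taken from the talk.

```clojure
(import 'java.util.BitSet)

;; Hypothetical in-memory store: key name -> BitSet, one bit per user offset.
(def bitmaps (atom {}))

(defn bitmap-for
  "Returns the bitmap stored under k, creating an empty one if missing
  (Redis likewise treats a missing key as all zeros in BITOP)."
  ^BitSet [k]
  (get (swap! bitmaps update k #(or % (BitSet.))) k))

(defn record-tweet!
  "Sets the bit for user-offset in every bitmap named in ks."
  [user-offset ks]
  (doseq [k ks]
    (.set (bitmap-for k) (int user-offset))))

(defn count-all
  "Counts the users whose bit is set in every one of the named bitmaps."
  [k & ks]
  (let [^BitSet result (.clone (bitmap-for k))] ; copy, so stored bitmaps stay untouched
    (doseq [other ks]
      (.and result (bitmap-for other)))
    (.cardinality result)))

;; Three users: one happy in England, one merely in England, one happy elsewhere.
(record-tweet! 2 ["ENGLAND" "JOVIALITY"])
(record-tweet! 7 ["ENGLAND"])
(record-tweet! 9 ["JOVIALITY"])

(println (count-all "ENGLAND" "JOVIALITY"))
```

In the Redis version the copy-and-intersect step corresponds to writing the result to a temporary key, which is presumably why the slide's code expires that key after 10 seconds.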
54. How many people in England are Happy
about the referendum result?
ENG 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 1
JOVIAL 1 0 1 0 1 1 1 0 0 1 0 1 0 0 0 1
AND 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1
(car/bitop "AND" "ENGLAND&JOVIALITY" "ENGLAND" "JOVIALITY")
55. How many people in England are Happy
about the referendum result?
ENG 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 1
JOVIAL 1 0 1 0 1 1 1 0 0 1 0 1 0 0 0 1
AND 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1
(car/bitcount "ENGLAND&JOVIALITY")
4
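The bit rows on this slide can be checked without Redis. This is a small sketch in plain Clojure that treats each bitmap as a vector of 0s and 1s, copied from the slide; `bitmap-and` and `bitmap-count` are illustrative helper names, not Carmine or Redis functions.

```clojure
;; Bit rows copied from the slide: one position per user.
(def england   [0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 1])
(def joviality [1 0 1 0 1 1 1 0 0 1 0 1 0 0 0 1])

(defn bitmap-and
  "Position-wise AND of two equal-length bit vectors (what BITOP AND does)."
  [a b]
  (mapv bit-and a b))

(defn bitmap-count
  "Number of set bits (what BITCOUNT returns)."
  [bitmap]
  (reduce + bitmap))

(def england-and-joviality (bitmap-and england joviality))

(println england-and-joviality)
(println (bitmap-count england-and-joviality)) ; prints 4, matching the slide
```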
56. How many people in Scotland are Tired
and Grumpy after the referendum?
(wcar*
  (car/bitop "OR" "FATIGUE|HOSTILITY" "FATIGUE" "HOSTILITY")
  (car/expire "FATIGUE|HOSTILITY" 10)
  (car/bitop "AND" "SCOTLAND&(FATIGUE|HOSTILITY)"
             "SCOTLAND" "FATIGUE|HOSTILITY")
  (car/expire "SCOTLAND&(FATIGUE|HOSTILITY)" 10)
  (car/bitcount "SCOTLAND&(FATIGUE|HOSTILITY)"))
66. How many people in Scotland are Tired
and Grumpy after the referendum?
HOSTILE 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0
FATIGUE 0 1 0 1 0 0 0 0 1 0 0 0 1 0 0 0
BOTH 0 1 0 1 0 0 0 1 1 0 1 0 1 0 1 0
SCT 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0
67. How many people in Scotland are Tired
and Grumpy after the referendum?
HOSTILE 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0
FATIGUE 0 1 0 1 0 0 0 0 1 0 0 0 1 0 0 0
BOTH 0 1 0 1 0 0 0 1 1 0 1 0 1 0 1 0
SCT 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0
BOTH 0 1 0 1 0 0 0 1 1 0 1 0 1 0 1 0
AND
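The two-step query (OR the sentiment bitmaps, then AND with the region) can be worked through in plain Clojure using the bit rows from the slides. This is only a sketch: `bitmap-op` is an illustrative helper, and the final row and count are computed here from the slide's inputs rather than quoted from the talk. Note that the slide's "BOTH" row is the OR of the two sentiments, so it marks users who are tired or grumpy.

```clojure
;; Bit rows copied from the slides: one position per user.
(def hostility [0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0])
(def fatigue   [0 1 0 1 0 0 0 0 1 0 0 0 1 0 0 0])
(def scotland  [1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0])

(defn bitmap-op
  "Applies a bitwise function (bit-and, bit-or, ...) position by position
  across equal-length bit vectors."
  [f & bitmaps]
  (apply mapv f bitmaps))

;; Step 1: FATIGUE|HOSTILITY (the row labelled BOTH on the slide).
(def fatigue-or-hostility (bitmap-op bit-or fatigue hostility))

;; Step 2: SCOTLAND&(FATIGUE|HOSTILITY).
(def scotland-tired-or-grumpy (bitmap-op bit-and scotland fatigue-or-hostility))

(println fatigue-or-hostility)
(println scotland-tired-or-grumpy)
(println (reduce + scotland-tired-or-grumpy)) ; prints 2 for these rows
```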
79. PANAS-t
Accounts for bias on social media!
Outlines sanitisation
Validated against 10 real events
80. dakrone/clojure-opennlp
Sanitisation
• Exclude media/advertising/spam
• Account for text speak
• Account for smileys & emojis
• Word stemming (or lemmatisation)
• Part of Speech Tagging (POS)
LMAO
GetRichQuick.com
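A rough sketch of the first two sanitisation steps, in plain Clojure. The spam list and text-speak dictionary below are tiny hypothetical stand-ins built from the slide's two examples, not the data used in the talk. Smiley handling, stemming and part-of-speech tagging are left out; the slide points to dakrone/clojure-opennlp for that kind of processing.

```clojure
(require '[clojure.string :as str])

;; Hypothetical lookup data, for illustration only.
(def spam-domains #{"getrichquick.com"})
(def text-speak {"lmao" "laughing"
                 "gr8"  "great"})

(defn spam?
  "True when the tweet mentions any known spam domain."
  [tweet]
  (let [lower (str/lower-case tweet)]
    (boolean (some #(str/includes? lower %) spam-domains))))

(defn tokenise
  "Lower-cases the tweet, splits it on non-alphanumeric characters and
  expands known text-speak tokens."
  [tweet]
  (->> (str/split (str/lower-case tweet) #"[^a-z0-9']+")
       (remove str/blank?)
       (mapv #(get text-speak % %))))

(defn sanitise
  "Drops spam tweets and tokenises the rest."
  [tweets]
  (->> tweets
       (remove spam?)
       (mapv tokenise)))

(println (sanitise ["LMAO feeling gr8 today!"
                    "Earn cash at GetRichQuick.com"]))
```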
81. SHYNESS
FATIGUE
SERENITY
SURPRISE
FEAR
HOSTILITY
GUILT
SADNESS
JOVIALITY
SELF ASSURANCE
ATTENTIVENESS
We have sentiment keys!
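One way to get from sanitised tokens to these keys is a word-to-category lookup. The sketch below assumes such a lookup; the `lexicon` shown is a made-up handful of words for illustration and is not the PANAS-t word list. The upper-case names it produces match the key names used in the bitmap queries earlier in the deck.

```clojure
(require '[clojure.string :as str])

;; Hypothetical mini-lexicon: word -> sentiment category.
(def lexicon
  {"happy"    :joviality
   "cheerful" :joviality
   "tired"    :fatigue
   "sleepy"   :fatigue
   "angry"    :hostility
   "scared"   :fear})

(defn sentiment-keys
  "Returns the sorted set of sentiment categories found among the tokens."
  [tokens]
  (into (sorted-set) (keep lexicon) tokens))

(defn key-names
  "Upper-case key names for the categories, e.g. \"FATIGUE\"."
  [tokens]
  (mapv (comp str/upper-case name) (sentiment-keys tokens)))

(println (key-names ["so" "tired" "and" "angry" "tired"]))
```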
82. RING
Journaling
Syncing
Business logic
What
Where
When
Memory
Web API
Incoming Data
mbostock/d3
84. Reverse Geocoding Issues
• Don’t want external services
• Don’t want heavy IO
• Don’t want round trips to the database
• Accuracy not too much of a concern
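Given these constraints, one possible approach (an assumption here, not necessarily what the talk used) is an in-memory bounding-box lookup: no external service, no IO and no database round trip, at the cost of accuracy near borders. The boxes below are rough illustrative values, not surveyed boundaries, and `regions`, `in-box?` and `region-key` are hypothetical names.

```clojure
;; Rough, illustrative bounding boxes; earlier entries win where boxes overlap.
(def regions
  [{:key "SCOTLAND" :min-lat 54.6 :max-lat 60.9 :min-lon -7.7 :max-lon -0.7}
   {:key "ENGLAND"  :min-lat 49.9 :max-lat 55.8 :min-lon -5.8 :max-lon 1.8}])

(defn in-box?
  "True when the point lies inside the region's bounding box."
  [{:keys [min-lat max-lat min-lon max-lon]} lat lon]
  (and (<= min-lat lat max-lat)
       (<= min-lon lon max-lon)))

(defn region-key
  "Returns the key of the first region containing the point, or nil."
  [lat lon]
  (some #(when (in-box? % lat lon) (:key %)) regions))

(println (region-key 55.95 -3.19)) ; a point in Edinburgh
(println (region-key 51.50 -0.12)) ; a point in London
(println (region-key 48.85 2.35))  ; a point in Paris, outside both boxes
```

Because the two boxes overlap, a point in the far north of England would be labelled SCOTLAND by this sketch; a polygon test would fix that at the cost of more computation.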