Mining auditory hallucinations from unsolicited Twitter posts

M. Belousov, M. Dinev, R. M. Morris, N. Berry, S. Bucci, G. Nenadic
University of Manchester
Health eResearch Centre (HeRC), The Farr Institute of Health Informatics Research
Portorož, May 2016

Auditory hallucinations: schizophrenia, hearing voices, mental health, psychosis, symptom, sound

Unsolicited Twitter posts: social network, brief message, fewer than 140 characters, 310M active users, share opinions, spontaneous, unforced, unasked-for

Mining: knowledge discovery, exploratory data analysis, unseen patterns

Mining auditory hallucinations from unsolicited Twitter posts
Mining: knowledge discovery, unseen patterns
Auditory hallucinations: schizophrenia, hearing voices, mental health, psychosis, symptom, sound
Unsolicited Twitter posts: social network, brief message, fewer than 140 characters, 320M active users, share opinions, spontaneous, unforced, unasked-for

Research aim
Q: Is it feasible to generate useful datasets from
unsolicited Twitter posts regarding auditory
hallucinatory experiences to support psychological
investigations?
A: Classification model that can predict whether a
given post is related to hallucinatory experiences.

Potentially related posts
✅ I am hearing a scary voice right now, I don’t know if it’s in my head or in television.. Crazy
✅ If hallucinating is thought of as hearing voices that are not actually real, then these painkillers are causing me to hallucinate like mad
All Twitter posts were paraphrased to preserve anonymity

Unrelated posts
❌ My grandmom is watching Deliver Us From Evil and I can hear this weird high-pitched voice and want Ralph Sarchie to hold me
❌ So I was convinced I was hearing stuff. It was so funny because the noise was coming from the kitchen but I thought I was hallucinating

Iterative workflow
1. Define search queries
2. Collect unique posts from Twitter
3. Annotate posts & explore data
4. Predict relatedness of posts to hallucinatory experiences
5. Analyse data
6. Redefine search queries and repeat

Data collection
List of defined search queries for Twitter Search API
• hallucinating hearing
• (“hear things” OR “hearing things”) “in my head”
• hearing scary things “in my head”
• (hear OR hearing) (“other people” OR “other ppls” OR “other ppl”) thoughts
• (voice OR voices) (commenting OR criticising) (scary OR frightening OR “everything I do”)
• (hear OR hearing) (voice OR voices) (god OR angel OR allah OR spirit OR soul OR “holy spirit” OR djinn OR jinn)
• (hear OR hearing) (voice OR voices) (scary OR devil OR demon OR daemon OR evil OR “evil spirit”)
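
The queries combine alternative terms in OR groups and join the groups with spaces, which Twitter search treats as AND. A minimal Python sketch of composing such query strings might look as follows. The helper names `any_of` and `all_of` are hypothetical, straight quotes stand in for the slide's typographic ones, and sending the string to the Twitter Search API is left out.

```python
def any_of(*terms):
    """Join alternatives into a parenthesised OR group, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def all_of(*parts):
    """Join query parts with spaces (an implicit AND in the search syntax)."""
    return " ".join(parts)


# Rebuild the last query from the list above.
query = all_of(
    any_of("hear", "hearing"),
    any_of("voice", "voices"),
    any_of("scary", "devil", "demon", "daemon", "evil", "evil spirit"),
)
print(query)
```

Keeping each group as a separate call makes it easy to add or drop terms when the queries are redefined in a later iteration.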

Data annotation
• Two research psychologists manually annotated posts:
  • Assign classes: related or unrelated to hallucinations
  • Highlight specific phrases to describe their decisions
• Later, the highlighted words and phrases were utilised to identify characteristics of each classification category
[Figure: Data annotation process]
RESULT: 401 annotated examples, 94 of them related to hallucinatory experiences
• The observed inter-annotator agreement (IAA) was 0.85 on 41 examples (10% of the final annotated set)
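
The slide reports the agreement value without naming the measure. Assuming a chance-corrected measure such as Cohen's kappa (an assumption, not stated in the source), agreement between two annotators could be computed roughly as below. The labels in the usage lines are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set")
    n = len(labels_a)
    # Proportion of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Hypothetical annotations for four posts.
annotator_1 = ["related", "related", "unrelated", "unrelated"]
annotator_2 = ["related", "unrelated", "unrelated", "unrelated"]
print(cohen_kappa(annotator_1, annotator_2))
```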

Data exploration: semantic classes
• Relative (father, friend)
• Communication Tool (phone)
• Audio Device (headphones, TV)
• Drug (cannabis, painkillers)
• Audio Recording (voicemail)
• Possible Hallucination (seeing things, in my head)
• Audio & Visual Media, Apps (song, YouTube, Siri)
• Religious Term (prayer)
• Emotional Support (helpline)
• Own Voice Indicator (my voice, our own voice)
• Fear Expression (scared, creepy)
• Abusive Language (sh*t, hell)
• Stigmatising Language (crazy, insane)
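
One simple way to detect mentions of these classes is a dictionary lookup. The sketch below assumes that approach; the source does not say how mentions were matched. The term lists contain only the examples shown on the slide, so a real lexicon would be much larger.

```python
import re

# Example terms taken from the slide; real lexicons would be far more complete.
SEMANTIC_CLASSES = {
    "Relative": ["father", "friend"],
    "Audio Device": ["headphones", "tv"],
    "Drug": ["cannabis", "painkillers"],
    "Possible Hallucination": ["seeing things", "in my head"],
    "Fear Expression": ["scared", "creepy"],
    "Stigmatising Language": ["crazy", "insane"],
}


def count_mentions(text):
    """Count whole-word matches of each semantic class in a post."""
    lowered = text.lower()
    counts = {}
    for name, terms in SEMANTIC_CLASSES.items():
        total = 0
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            total += len(re.findall(pattern, lowered))
        if total:
            counts[name] = total
    return counts


print(count_mentions("Is it in my head or the TV? Crazy, I'm scared"))
```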

Text classification pipeline
raw (unstructured) text → Text Preprocessing → corrected text → Information Extraction → structured text → Classification → label
Example
Raw text: Im hearing a scary voice rn,idk if it’s in my head or in TV..craazy
Corrected text: I am hearing a scary voice right now, I don’t know if it’s in my head or in television.. Crazy
Structured text: fear expr. (scary), possible hallucination (in my head), audio device (television), stigmatising lang. (Crazy)
POS tags: O V V D A N R R O V V P AL P D N & P N
Label: related to hallucinatory experience
POS tagset from Gimpel et al. (2011): O - personal pronoun, V - verb, D - determiner, etc.
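
The preprocessing step in the example turns informal spelling into standard text. A rough Python sketch of that idea is shown below; the abbreviation table and the rules are illustrative assumptions, not the authors' implementation.

```python
import re

# Hypothetical abbreviation table; a real one would be much larger.
ABBREVIATIONS = {
    "im": "I am",
    "rn": "right now",
    "idk": "I don't know",
    "tv": "television",
}


def normalise(text):
    """Apply a few simple normalisation rules to an informal post."""
    # Insert a missing space after punctuation that runs into a word.
    text = re.sub(r"([,.!?])(?=[A-Za-z])", r"\1 ", text)
    # Shorten characters repeated three or more times ("soooo" -> "soo").
    text = re.sub(r"(\w)\1{2,}", r"\1\1", text)

    def expand(match):
        word = match.group(0)
        return ABBREVIATIONS.get(word.lower(), word)

    # Expand known abbreviations word by word.
    return re.sub(r"[A-Za-z]+", expand, text)


print(normalise("Im hearing a scary voice rn,idk if it's in TV"))
```

Rules like these only cover part of the example: a misspelling such as "craazy" would still need a spelling corrector.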

Information extraction
Post: My grandmom is watching Deliver Us From Evil and I can hear this weird high-pitched voice and want Ralph Sarchie to hold me
Extracted from the post: Neg. sentiment, Relative [1], NE (person) [1], NE (misc) [1], POS Tags
Key phrase extraction: hear this weird high-pitched voice
Extracted from the key phrase: Neg. sentiment, Weird / strange [1], POS Tags (V D A A N)
* Stanford NER using 4-class model trained on the CoNLL 2003 data

Groups of features
Feature group                   Features
Mentions of semantic classes    mentions of each semantic class
Key phrases                     sentiment polarity, sem. classes, POS tags
Part-of-speech tags             nouns, verbs, adjectives, etc.
Sentiment polarity              positive, negative or neutral
Popularity of the post          likes, retweets
Use of nonstandard language     spelling mistakes, abbreviations
Number of Twitter entities      URLs, #hashtags, @mentions
Named entities                  persons, locations, organisations
Lexical distribution            sentences, words, characters
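
Two of these groups, the number of Twitter entities and the lexical distribution, can be approximated with regular expressions. The sketch below is one possible approximation under that assumption; the function name and the patterns are illustrative and not taken from the source.

```python
import re

URL_PATTERN = r"https?://\S+"


def surface_features(text):
    """Count Twitter entities and basic lexical units in a post."""
    # Remove URLs before sentence splitting so their dots are not counted.
    without_urls = re.sub(URL_PATTERN, "", text)
    sentences = [s for s in re.split(r"[.!?]+", without_urls) if s.strip()]
    return {
        "urls": len(re.findall(URL_PATTERN, text)),
        "hashtags": len(re.findall(r"(?<!\w)#\w+", text)),
        "mentions": len(re.findall(r"(?<!\w)@\w+", text)),
        "sentences": len(sentences),
        "words": len(text.split()),
        "characters": len(text),
    }


example = "Hearing voices again. @friend help! #scared http://t.co/x"
features = surface_features(example)
print(features)
```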

Classification scenario
• 401 labelled examples: 94 related; 307 unrelated
• Three different types of classification methods:
  • Naive Bayes (probabilistic model)
  • Support Vector Machine (geometric model)
  • AdaBoost (boosting of the tree model)
• Compare performance with simple baseline: tf-idf features
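
For reference, the tf-idf baseline weights each term by its frequency in a post and its rarity across posts. The sketch below uses the plain tf × log(N/df) form with whitespace tokenisation; both choices are assumptions, since the source does not specify the variant, and libraries often add smoothing.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dictionary per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


vectors = tfidf(["hearing voices", "hearing music"])
print(vectors)
```

A term that occurs in every post, such as "hearing" here, gets weight zero, so only the distinguishing terms carry information.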

Evaluation
Classification performance of various classification methods on two different sets of features
[Bar chart: F2-score (axis 0 to 0.9) for NB, SVM and AdaBoost, each with proposed features and baseline features; bar labels as extracted: 0.711, 0.751, 0.486, 0.772, 0.743, 0.831; 🏆 marks the best result]
Based on ten experiments of stratified 10-fold cross validation
Baseline features outperform the proposed features only with SVM; the difference is non-significant (p-value = 0.375)
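
The F2-score used here is the F-beta measure with beta = 2, which weights recall more heavily than precision. A small Python sketch of the formula is given below; the counts in the usage line are made up and are not results from the study.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from true positives, false positives and false negatives."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    # With no positives predicted or present, the score is defined as 0 here.
    return (1 + b2) * tp / denominator if denominator else 0.0


# Hypothetical counts: 8 true positives, 4 false positives, 2 false negatives.
print(round(f_beta(8, 4, 2), 3))
```

With these counts, precision is 8/12 and recall is 8/10, and the F2-score of about 0.769 sits closer to the recall value.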

Contribution of features
Features                          F2-score
Mentions of semantic classes *    0.769 ▼
Key phrases *                     0.788 ▼
Part-of-speech tags               0.817 ▼
Sentiment polarity *              0.818 ▼
Popularity of the post            0.828 ▼
Use of nonstandard language       0.831 ▬
Number of Twitter entities        0.832 ▲
Named entities                    0.832 ▲
Lexical distribution              0.833 ▲
All features                      0.831 ▲
* Statistically significant differences are marked with asterisk

Error analysis (highlights)
Text                                                                        Predicted       Actual
I do not hear voices, I am not paranoid                                     ✅ Related      ❌ Unrelated
I’m hallucinating I’m hearing hawks! Oh hang on, it is just the television  ✅ Related      ❌ Unrelated
The voices which I hear every night tell me to do it                        ❌ Unrelated    ✅ Related

Generating dataset for analysis
1. Take the best-performing classification model
2. Predict relatedness for unlabelled examples
3. Combine with the 401 labelled (annotated) examples
RESULT: 4957 examples: 546 potentially related to hallucinatory experiences *
* e.g. the national survey by Wiles et al. (2006) identified only 62 cases
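
The three steps can be sketched in Python as below. If the combined set is exactly the annotated plus the newly predicted posts, the totals imply 4957 − 401 = 4556 unlabelled posts, of which 546 − 94 = 452 were predicted as related. The function names and the keyword rule standing in for the trained classifier are hypothetical.

```python
def build_dataset(labelled, unlabelled, predict):
    """Combine annotated posts with model predictions for unlabelled posts.

    labelled: list of (text, label) pairs from the annotators
    unlabelled: list of post texts
    predict: callable mapping a text to "related" or "unrelated"
    """
    annotated = [(text, label, "annotated") for text, label in labelled]
    predicted = [(text, predict(text), "predicted") for text in unlabelled]
    return annotated + predicted


def toy_predict(text):
    """Stand-in for the trained model: a keyword rule for illustration only."""
    return "related" if "voices" in text.lower() else "unrelated"


dataset = build_dataset(
    [("The voices tell me to do it", "related")],
    ["I hear voices at night", "Nice song on the radio"],
    toy_predict,
)
related = [row for row in dataset if row[1] == "related"]
print(len(dataset), len(related))
```

Keeping the source of each label in the third field lets later analysis separate annotated posts from predicted ones.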

Preliminary data analysis
[Bar chart, 0 to 100%: Related 72% vs 28%; Unrelated 19% vs 81%]
• Negative sentiments were significantly associated with posts that indicated the occurrence of auditory hallucinations
• A higher proportion of the posts linked to auditory hallucinations were posted between the hours of 11pm and 5am
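
The posting-time observation amounts to comparing the share of posts that fall in the night window across the two groups. A minimal sketch of that share is shown below; it assumes timestamps are already in the poster's local time, and the example timestamps are made up.

```python
from datetime import datetime


def night_share(timestamps, start_hour=23, end_hour=5):
    """Fraction of posts made from start_hour up to, but excluding, end_hour."""
    if not timestamps:
        return 0.0
    # The window wraps past midnight, so either condition counts as night.
    night = sum(1 for t in timestamps if t.hour >= start_hour or t.hour < end_hour)
    return night / len(timestamps)


# Hypothetical posting times.
posts = [
    datetime(2016, 5, 1, 23, 30),
    datetime(2016, 5, 2, 2, 0),
    datetime(2016, 5, 2, 5, 0),
    datetime(2016, 5, 2, 14, 0),
]
print(night_share(posts))
```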

Summary
• Experimental methodology to harvest and mine datasets from unsolicited Twitter posts to identify potential psychotic(-like) experiences
• Classification model that can predict relatively accurately the relatedness of posts to auditory hallucinations
• Preliminary data analysis that identified interesting patterns in sentiment polarity and posting time
• Future research: investigate expressions of sleep in Twitter users who report a diagnosis of a psychosis-related disorder

Questions?
Acknowledgements
Centre for Doctoral Training, School of Computer Science, University of Manchester
School of Psychological Sciences, University of Manchester
