Screening Twitter Users for Depression and PTSD

Screening Twitter Users for Depression and PTSD with Lexical Decision Lists
Ted Pedersen
University of Minnesota, Duluth
tpederse@d.umn.edu

Motivations
● Interesting classification task
● Even more interesting to identify vocabulary
that indicates depression or PTSD
● Or tendency to self-report?
● Focused on decision lists, a simple machine
learning method that learns a human
interpretable model

Decision Lists
● All tweets for each user kept on single line (to avoid splitting)
● Text is lowercased, anything not alpha-numeric is removed
● Randomly shuffled
● Ngram features learned from first 8 million words in training data
for each condition
● Ngrams may be bigrams only or any length 1-6
● Ngrams made up of stopwords removed (or not)
● Ngrams weighted by frequency (or binary)
● Eight different decision lists learned
● System2 most accurate: Ngrams 1-6, stoplist, and binary weighting
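
The preprocessing and Ngram extraction steps above can be sketched roughly as follows. This is a minimal illustration, not the original system: the function names and the tiny stoplist are hypothetical. It also assumes that non-alphanumeric characters act as separators (replaced by a space), which fits features such as "don t" and "http t co" reported later.

```python
import re

# Hypothetical stoplist for illustration only; the original stoplist is not given.
STOPWORDS = {"the", "a", "of", "to", "and", "is", "in"}

def normalize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()

def ngrams(tokens, min_n=1, max_n=6, stoplist=None):
    """Yield Ngrams of length min_n..max_n as space-joined strings.

    If a stoplist is given, Ngrams made up entirely of stopwords are skipped.
    """
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if stoplist and all(tok in stoplist for tok in gram):
                continue
            yield " ".join(gram)
```

Under these assumptions, `normalize("Don't click http://t.co/x")` gives the tokens don, t, click, http, t, co, x, and `ngrams(tokens, 1, 6, STOPWORDS)` then enumerates the candidate features.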

Decision Lists
● Any Ngram that meets previous three conditions
and occurs at least 50 times more often in one
condition than the other is selected as a feature
● Since each task is binary (DvC, PvC, DvP, where D = Depression,
P = PTSD, C = Control), frequency in one condition is positive
while in the other it is negative
● Ngrams that occur about the same number of times in both
conditions are not especially indicative or interesting
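
A minimal sketch of this selection step is given below. It reads "at least 50 times more often" as a difference in raw counts of at least 50; a ratio-based reading is also possible and would need a different test. Using the signed count difference as the feature value is likewise an assumption, and `learn_decision_list` and `min_diff` are hypothetical names.

```python
from collections import Counter

def learn_decision_list(counts_a, counts_b, min_diff=50):
    """Return {ngram: signed value} for Ngrams that favour one condition.

    counts_a and counts_b are Counters of Ngram frequencies in the two
    conditions. Positive values favour condition A, negative values B.
    Assumption: the value is the raw count difference.
    """
    decision_list = {}
    for gram in counts_a.keys() | counts_b.keys():
        diff = counts_a[gram] - counts_b[gram]  # a Counter returns 0 for unseen Ngrams
        if abs(diff) >= min_diff:
            decision_list[gram] = diff
    return decision_list
```

An Ngram with counts of 500 and 480 in the two conditions would be dropped under this reading, while one with counts of 120 and 20 would be kept with value +100.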

Running Decision List
● For each Ngram in tweet, check to see if it
is in decision list
● If using frequency weight, add value (positive
or negative) of the Ngram to an overall score
● If using binary weight, add 1 or -1 to overall
score
● Do this for all tweets for a user, if overall
score > 0 then one class, <= 0 the other
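
The scoring procedure above can be sketched as follows. The function names are hypothetical, and the decision list is assumed to be a dictionary mapping each Ngram to a signed value (positive for one class, negative for the other).

```python
def score_user(user_ngrams, decision_list, binary=True):
    """Sum the contribution of every Ngram in a user's tweets found in the list."""
    score = 0
    for gram in user_ngrams:
        value = decision_list.get(gram)
        if value is None:
            continue  # Ngram is not a feature, so it contributes nothing
        if binary:
            score += 1 if value > 0 else -1
        else:
            score += value  # frequency weighting: add the signed value itself
    return score

def classify(score, positive_label, negative_label):
    """Score > 0 gives one class; score <= 0 gives the other."""
    return positive_label if score > 0 else negative_label
```

With a toy list `{"love you": 100, "lol": -80}` and a user whose tweets contain "love you" once and "lol" once, binary weighting gives a score of 0 (the second class), while frequency weighting gives 20 (the first class), so the two schemes can disagree on the same input.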

Decision List
● Decision lists often make a classification after
finding the most indicative feature
● Elected to use all features found in user tweets
to provide more nuanced decision
● System2 decision list has
● 18,617 features (DvC)
● 21,145 features (DvP)
● 17,936 features (PvC)
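
To illustrate the difference between the two strategies, the sketch below contrasts a first-match decision with a vote over all matching features. Ranking features by absolute value is an assumption made for illustration, and both function names are hypothetical.

```python
def classify_first_match(user_ngrams, decision_list):
    """Classic style: decide on the single most indicative feature present.

    Assumption: "most indicative" means largest absolute value.
    Returns 1 or -1; defaults to -1 when no feature matches.
    """
    present = [g for g in set(user_ngrams) if g in decision_list]
    if not present:
        return -1
    best = max(present, key=lambda g: abs(decision_list[g]))
    return 1 if decision_list[best] > 0 else -1

def classify_all_features(user_ngrams, decision_list):
    """All-features style: every matching Ngram casts a +1 or -1 vote."""
    score = sum(1 if decision_list[g] > 0 else -1
                for g in user_ngrams if g in decision_list)
    return 1 if score > 0 else -1
```

On a toy list `{"love you": 100, "lol": -80, "rt": -60}` and a user with all three Ngrams, the first-match rule follows "love you" alone, while the vote over all features is outweighed by the two negative features.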

Results?
          DvP    DvC    PvC
System2   .769   .736   .720
System1   .760   .731   .721
Random    .471   .492   .489
● System2 and System1 are identical except
that 2 uses a stoplist while 1 does not
● Both use Ngrams 1-6 and binary weighting

Top 10 Features
● DvC
● Depression : ud83c, please, love, follow, ufe0f, re, f*cking, love you, im, udf38
● Control : http, http t co, http t, co, t co, ud83d, lol, u2764 u2764 -, u2764 u2764
u2764, u2764 u2764 u2764 u2764
● PvC
● PTSD : u2026, co, t co, u043e, u0430, u0435, thank, thank you, please, u0438
● Control : ud83d, rt, ude02, ud83d ude02, gt, u2764 -, lol, u201c, ude02 ud83d -,
ud83d ude02 ud83d
● DvP
● Depression : ud83d, ud83c, rt, love, ude02, ud83d ude02, im, follow, don t, don,
love you
● PTSD : co, t co, http -, http t, http t co, u2026, amp, news, thanks, answer

Lessons
● Standard machine learning algorithms can
perform well at this task
● Even very simple ones like our decision lists
● Emoticons and Emoji are often strong indicators
● Ngrams of varying length combined with binary
weights attained best results
● Frequency weighting very poor
● Stoplist has minimal impact

Discussion
● How typical is it to self-report depression or PTSD?
● Is desire to self-report an indicator of something else?
● Do untreated / undiagnosed users look different?
● How common are these conditions?
● PTSD : 7-8% (www.ptsd.va.gov)
● Depression : 17% (www.adaa.org)
● Typical to have multiple diagnoses
● PTSD + Depression
● Anxiety + Depression

A case of self-reporting
Which is worse, cancer or depression? The answer
is clear. Depression is worse: depression makes
you want to die and cancer doesn’t.
I’ve spent all my adult life with depression lurking. I
haven’t mentioned it to very many people at all. For
the first ten years I talked about it to nobody at all,
for the next decade only Gill and therapists ...

Adam Kilgarriff
● Posted to blog May 3, 2015. Died
May 16 at age 55.
● https://blog.kilgarriff.co.uk/?p=101
