Hate speech is language intended to cause harm against a particular individual or group, often based on their racial, ethnic, religious, or gender identity. Hate speech is widespread on social media, and is increasingly common in mainstream political discourse. That said, there is no clear consensus as to what constitutes hate speech. In addition, human moderators come with their own biases, and automatic computer algorithms are often easy to fool. All of these factors complicate the efforts of social media platforms to filter or reduce such content. During this interactive workshop we will discuss examples from Twitter in the hopes of reaching some consensus as to what is and is not hate speech. We will also try to determine what kind of knowledge a human moderator or an automatic algorithm would need to have in order to make this determination. We will try to avoid particularly graphic examples of hate speech and focus on more subtle cases.
What Makes Hate Speech: Understanding and Addressing Offensive Language
1. What Makes Hate Speech
An interactive workshop
March 4, 2020 - UMD Summit on Equity, Race, and Diversity
Ted Pedersen
Department of Computer Science
University of Minnesota, Duluth
tpederse@d.umn.edu @SeeTedTalk
2. About Me
● Computer Science Professor at UMD since 1999
○ http://www.d.umn.edu/~tpederse (today’s slides)
● Research in NLP on automatically identifying the …
○ Meaning of words & phrases in text
○ Sentiment of a text
○ Intent of an author (?)
● Recent research problems have included identifying ...
○ Computational Humor
○ Offensive Language
○ Hate Speech (our focus today)
○ Islamophobia
3. Interactive means we ask and answer (?) questions
● What is hate speech?
● How do we recognize hate speech?
● Can automatic methods identify hate speech?
● What do we do about hate speech?
5. YouTube Hate Speech policy
Hate speech is not allowed on YouTube. We
remove content promoting violence or hatred
against individuals or groups based on any of
the following attributes:
● Age
● Caste
● Disability
● Ethnicity
● Gender Identity and Expression
● Nationality
● Race
● Immigration Status
● Religion
● Sex/Gender
● Sexual Orientation
● Victims of a major violent event and their kin
● Veteran Status
If you find content that violates this policy, please
report it. Instructions for reporting violations of our
Community Guidelines are available here. If you find
many videos, comments, or a creator's entire channel
that you wish to report, visit our reporting tool.
https://support.google.com/youtube/answer/2801939
6. Twitter Hateful Conduct policy
Hateful conduct: You may not promote violence against or directly attack or
threaten other people on the basis of race, ethnicity, national origin, caste, sexual
orientation, gender, gender identity, religious affiliation, age, disability, or serious
disease. We also do not allow accounts whose primary purpose is inciting harm
towards others on the basis of these categories.
Hateful imagery and display names: You may not use hateful images or
symbols in your profile image or profile header. You also may not use your
username, display name, or profile bio to engage in abusive behavior, such as
targeted harassment or expressing hate towards a person, group, or protected
category.
https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
7. Facebook Hate Speech policy
We define hate speech as a direct attack on people based on what we call
protected characteristics — race, ethnicity, national origin, religious affiliation,
sexual orientation, caste, sex, gender, gender identity, and serious disease or
disability. We also provide some protections for immigration status. We define
attack as violent or dehumanizing speech, statements of inferiority, or calls for
exclusion or segregation. We separate attacks into three tiers of severity, as
described below.
https://www.facebook.com/communitystandards/hate_speech
8. Dangerous Speech
Any form of expression (e.g., speech, text, or images) that can increase the risk
that its audience (in-group) will condone or commit violence against members of
another group (out-group).
Dangerous speech:
● Is aimed at groups (or individuals who are seen as members of a group)
● Promotes fear
● Is often false
● Harms directly and indirectly
https://dangerousspeech.org/
9. Examples
A series of examples from Twitter follows. I’ve tried to be
careful in these selections, but realize that different kinds of
content are potentially hurtful and offensive to different people
in different ways.
The intent of the examples is to generate discussion about
what is hate speech (or not).
10.–20. Twitter example images (content not captured in these notes)
21. How do we recognize hate speech?
● Our ideas …
22. How do we recognize hate speech?
● Dehumanizing
○ Biologically subhuman, vermin, biological threat, environmental catastrophe
○ Supernaturally strong
● Accusation in a mirror
○ Attribute to your “enemies” the acts which you wish to perpetrate on them
● Threat to Group Purity or Integrity
○ Out-group as a target
● Assertions of Attack against Women & Girls
○ Defiled by members of out-group
● Question of Loyalty
https://dangerousspeech.org/guide/
23. Can automatic methods identify hate speech?
● Need manually classified examples of hate speech that Artificial Intelligence algorithms can learn from
○ Training data
○ These are hard to reliably create
● Training data teaches you about what has already happened
● New manifestations of hate arise which are not in the training data
○ Coronavirus
● Blacklists and rule-based systems reflect human insight on hate speech
● Hate speech can be profane, but not always
○ Hard to distinguish between profanity and hate speech
○ In-group use of profanity or slurs is often not hate speech
● Blacklists are easy to trick with misspellings and typos
○ Eye will k1ll ewe
● Algorithms generally don’t recognize hate speech that doesn’t include profanity or obvious slurs
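The blacklist-evasion point above can be sketched in a few lines of Python. This is a toy illustration, not any platform’s actual filter: the word list, the leet-speak substitution map, and the euphemism example are assumptions made up for this sketch.

```python
import re

# Toy blacklist; real systems use much larger curated lists.
BLACKLIST = {"kill", "vermin", "subhuman"}

# Undo common character substitutions (leet-speak) before matching.
LEET_MAP = str.maketrans({"1": "i", "3": "e", "0": "o", "@": "a", "$": "s"})

def naive_flags(text: str) -> bool:
    """Match blacklisted words with no normalization at all."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in BLACKLIST for token in tokens)

def flags_text(text: str) -> bool:
    """Normalize leet-speak first, then match against the blacklist."""
    normalized = text.lower().translate(LEET_MAP)
    tokens = re.findall(r"[a-z]+", normalized)
    return any(token in BLACKLIST for token in tokens)

# "k1ll" slips past the naive matcher but not the normalizing one ...
assert naive_flags("I will k1ll you") is False
assert flags_text("I will k1ll you") is True
# ... yet a euphemism absent from the list evades both.
assert flags_text("I will unalive you") is False
```

Each normalization rule patches one known trick, but novel spellings and euphemisms keep appearing, which is the same gap that training data has: both encode what has already happened, not what comes next.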
25. What should we do about hate speech?
● Protect Privacy
● Ignore
● Report
● Block
● Expose
● Engage
● Counter Speech
● Stay Online
https://www.takebackthetech.net/be-safe/hate-speech-strategies
26. Why be concerned?
● Hate speech creates a negative environment online
○ https://www.pewresearch.org/internet/2017/07/11/online-harassment-2017/
● Hate speech online escalates to violence in the “real world”
○ https://www.unlv.edu/news/release/hate-speech-hate-crimes
● Hate speech is a prerequisite to genocide
○ https://www.un.org/en/genocideprevention/hate-speech-strategy.shtml
○ Rwanda : https://dangerousspeech.org/tag/rwanda/
○ Myanmar : https://www.wired.com/story/how-facebooks-rise-fueled-chaos-and-confusion-in-myanmar/
○ Nazi Germany : https://news.un.org/en/story/2019/01/1031742
27. Resources
● Dangerous Speech Project
○ https://dangerousspeech.org/guide/
● Hate Watch - Southern Poverty Law Center
○ https://www.splcenter.org/hatewatch
● Free Speech Debate
○ https://freespeechdebate.com/en/
● Automatic Approaches (algorithms)
○ https://www.wired.com/story/break-hate-speech-algorithm-try-love/
○ https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0221152
○ https://techcrunch.com/2019/08/14/racial-bias-observed-in-hate-speech-detection-algorithm-from-google/
● These slides
○ http://www.d.umn.edu/~tpederse