Coding Social Imagery: Learning from a #selfie #humor Image Set from Instagram

DIGITAL POSTER SESSION
BIG XII TEACHING AND LEARNING CONFERENCE (3RD ANNUAL)
THEME: TRANSFORMATIONAL TEACHING AND LEARNING
KANSAS STATE UNIVERSITY
JUNE 2 – 3, 2016

Definition: Social imagery
Digital images shared through social media platforms
Social networking sites (Facebook)
Work-based social networking sites (LinkedIn)
Digital content sharing sites
Image-sharing sites (Flickr)
Video-sharing sites (YouTube, Vimeo)
Microblogging sites (Twitter)
Web logs / blogs
Wikis
Crowd-sourced online encyclopedias (Wikipedia)
Email systems
SMS (short message service) systems
Work-based collaboration systems
The Web and Internet (broadly speaking), and others

Social imagery
SOME CHARACTERISTICS
In the wild (shared on social media platforms)
Social
Plentiful (“big data”)
Multilingual
Opinion-ful
Repurposed, mash-ups
Low-res, medium-res, high-res
Uncensored, low-censored, censored
USE FOR RESEARCH
Scrape-able / mass collectible
High-dimension data
Codable (by type, by concept, by content, and other aspects)
May be harnessed for research…but how?

#selfie and #humor (Venn diagram)

Overview
Social media messaging has long been harnessed to inform faculty about their respective
learners. The textual channel is often used because of the ease of interpretation and analysis.
Social imagery (tagged images, #selfies, grouped imagery, and others) has been less used, in
part because images are more complex and carry multiple meanings, which makes them harder to analyze. Also, there are not
many generalist models that inform how to code or even understand social imagery in an
emergent way. (There are large-scale computational means to interpret online images, such as
the AlchemyAPI of IBM Watson, for various types of feature extractions. There are ways to code
imagery based on specific research questions in particular fields-of-practice.)

Overview (cont.)
The presenter recently analyzed a 941-image #selfie + #humor image set from Instagram, with
three main research questions:
(1) What does identity-based humor look like in terms of a #selfie #humor-tagged image set
from the Instagram photo-sharing mobile app?
(2) Do more modern forms of mediated social humor link to more traditional forms
theoretically? Is it possible to apply the Humor Styles Model to the images from the #selfie
#humor Instagram image set to better understand #selfie #humor?
(3) What are some constructive and systematized ways to analyze social image sets manually
(with some computational support)?
This digital poster session will highlight some of the initial research findings (forthcoming in a
near-future publication) and share insights about effectively coding social imagery in a bottom-up
and emergent way.

Study of #selfie #humor Image Set from Instagram

About Instagram
Initially designed as a mobile app for mobile-image sharing (and now also short video sharing)
Conceptualized as an “instant” “telegram” (= Instagram)
By the numbers (in 2016):
◦ 400 million monthly active users
◦ More than 75 million daily active users
◦ 20% of Internet users use Instagram
◦ 77.6 million Instagram users in the U.S.
◦ 51% male; 49% female Instagram users (Craig Smith, April 5, 2016, DMR)
A young demographic user base (55% in 18 – 29 age group, 28% in 30 – 49 age group) (Pew Research
Center, Mobile Messaging and Social Media, 2015)
Purchased by Facebook in 2012 for $1 billion
Launched in 2010 as a free mobile app

Research method
Study of identity-based humor based on on-the-fly digital self-portraits (“selfies”) from imagery
shared on Instagram tagged with #selfie #humor
◦ Review of the Literature
◦ Extraction of Targeted Imagery (based on folk tagging)
◦ Ingestion of Imagery into NVivo 11 Plus
◦ Light data cleaning
◦ Analysis and Interpretation
◦ Theme extraction (qualitative analysis)
◦ Image categorization (quantitative analysis)
◦ Comparison with humor theories, humor styles, and other prior research
◦ Consideration for Future Research and Methodologies

Some features of the #selfie #humor image set
Imagery tagged with #selfie and #humor on Instagram
◦ “Selfie” defined generally as an on-the-fly digital self-portrait
◦ “Humor” defined as something created to induce amusement
“Found” imagery vs. created imagery in user-generated contents
Digital image editing with some text overlays, melded images, image filters, and higher end
image editing (image masking, layering)
Identifiable social messaging and statements about self and about others
Some recurring themes and apparent broad emulation of others
Some elicitations for topical image sharing, such as through hashtag campaigns and event-based
calls

Some features of the #selfie #humor image set (cont.)
Word play and visual gags, name calling, and public call-outs and shaming
Visual synecdoche and visual metonymy; visual symbolism
Cause-and-effect narratives (if this, then this), side-by-side image comparisons (before-after,
analogies in imagery), and image sequences (implied chronologies)
Images separated from original contexts (in some cases), such as from blog posts and
Tweetstreams and websites
◦ Varying degrees of messaging coherence / effectiveness as stand-alone images

What do people find funny (based on research)?
ABOUT HUMOR
People are hard-wired to respond instinctually
to the ludicrous and the incongruous; there
has to be surprise (the unexpected) but not
contravening of internalized social ethics
(humor can be edgy, but if it is seen as truly
wrong, people will respond with anger, not
amusement)
Appreciation for humor is linked to cognition
and emotion…and personality
ABOUT LAUGHTER
A lot of laughter is not related to humor per se
but may be used to relieve tension in a social
context
◦ Laughter may be used to emphasize certain
points made in speech
◦ Shared laughter raises people’s moods
Laughter may be voiced or unvoiced
◦ There is a wide array of laugh sounds

What do people find funny (based on research)? (cont.)
ABOUT HUMOR
Areas of the brain have been linked to
response to jokes, such as one area for word
play and puns
Certain types of humor may be preferred by
certain individuals in certain demographics,
such as a preference for pratfalls and fart jokes
among younger males
tragedy + time = comedy
ABOUT LAUGHTER
Laughter tends to be social (much more likely
in a context where people are in the company
of others)
◦ People who have lived for long periods in
isolation still have a laugh response but it tends
to be silent or unvoiced
People who tell jokes tend to laugh more at
their own jokes

What do people find funny (based on research)? (cont.)
ABOUT HUMOR
There are social power dynamics in the
expression of humor in the workplace
◦ Supervisors are more likely to tell jokes than
subordinates
◦ Humor can have salutary effects in workplaces
(those that are not NSFW)
Males tend to tell more jokes than females
ABOUT LAUGHTER
Subordinates are more likely to laugh at their
supervisor’s jokes than vice versa (and more
likely to laugh at their supervisor’s jokes than
their peers’ jokes)

High-level topic-based image categories in the #selfie #humor image set
1. Truth-telling about the self
2. Un-selfie (counter-messaging)
3. Animals (as self and other)
4. Inspirational
5. Human sociality and social media (meta-perspective)
6. Human tensions (and issues)
7. Funny faces
8. Spectacle

High-level image categories in the #selfie #humor image set (cont.)
Image counts by category:
1. Truth-telling about the self: 8
2. Un-selfie (counter-messaging): 40
3. Animals (as self and other): 31
4. Inspirational: 57
5. Human sociality and social media (meta-perspective): 78
6. Human tensions and issues: 54
7. Funny faces: 8
8. Spectacle: 12

Understanding the frequency counts
Image set was scraped over two days using two different web browser add-ons to extract images returned
for the search <#selfie #humor Instagram> in Google Images and then in Microsoft’s Bing Images
◦ Range of dependencies (time, technological)
Not an N-of-all (by any means)
Not a random sampling of the target image set (so not generalizable or directly
representational)
◦ Unclear what sampling bias was inherent in the data extraction
◦ = Convenience sample descriptions
Unweighted count of the respective images
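As a minimal sketch of how the unweighted counts might be summarized, the snippet below converts the eight category counts into shares of all category assignments. It assumes the chart values (8, 40, 31, 57, 78, 54, 8, 12) map to categories 1 through 8 in the order listed; the variable and function names are illustrative only. Because this is a convenience sample, the shares describe this image set alone and should not be generalized.

```python
# Category counts, assuming the chart values follow the category order 1-8.
category_counts = {
    "1. Truth-telling about the self": 8,
    "2. Un-selfie (counter-messaging)": 40,
    "3. Animals (as self and other)": 31,
    "4. Inspirational": 57,
    "5. Human sociality and social media (meta-perspective)": 78,
    "6. Human tensions (and issues)": 54,
    "7. Funny faces": 8,
    "8. Spectacle": 12,
}


def category_shares(counts):
    """Return each category's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = category_shares(category_counts)

# Print categories from most to least frequent.
for name, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {category_counts[name]} ({pct}%)")
```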

Three main types of organizing themes from the image set
(a) the purposes of the selfies
◦ 1. truth-telling about the self, 4. inspirational, and 8. spectacle;
(b) the messaging
◦ 2. un-selfie (counter-messaging), 5. human sociality and social media (meta), 6. human tensions (and issues), and
(c) the types
◦ 3. animals (as self and other), 7. funny faces

Critique of this bottom-up coding of complex social image sets
WEAKNESSES
Non-alignment in using three different
approaches and conceptualizations to organize
images (purposes / intentional
communication, messaging, and types)
◦ Why not purposes alone? Or messaging alone?
Or types alone?
Inelegance in terms of lack of mutual
exclusivity in terms of image coding (social
images that can fit multiple categories
simultaneously)
◦ The practice of analyzing for salience but lacking
in sufficient explanatory power
STRENGTHS
Mixed approaches necessary given complexity
of social image sets
◦ There is some degree of coherence
Fairly parsimonious in its approach
Somewhat transferable to other types of data
sets: purposes, messaging, and types
Tapping into human perceptual analytics
(without necessarily a need for computational
supports)

Category 1: Truth-telling about the self
Owning one’s own reality
Owning one’s own laziness
Avoiding gullibility

Category 2: Un-selfie (counter-messaging)
Calling out humble-brags
Mistaking the virtual for the real
Using others to see the self

Category 3: Animals (as self and other)
A literal animal selfie (“Monkey Selfie”)
Animal selfie humor
Straight animal images

Category 4: Inspirational
Going forth and conquering
Human predicaments
Not Monday!
Text manifestos about life
Time
Expressions of gratitude

Category 5: Human sociality and social media (meta-perspective)
Over-focusing on looks
Relating around money and purchases
Social exclusion and inclusion
Lots of talk and word play
Social media socializing
◦ But first, let me take a selfie!
◦ Sharing food…images
◦ Dismissing others
Truth behind the screen

Category 6: Human tensions (and issues)
Racial tensions
Gender group tensions
Critiques of and comments about celebrities
No drunkenness please
Contemporary social and political issues

Category 7: Funny faces
Funny faces

Category 8: Spectacle
Spectacle (acts of derring-do)

Gender counts and number presentation
Near male-female gender parity in single selfie image counts (selfies showing one individual in the
image), with a slightly higher count for male subjects
A larger male-to-female gap for duo and group selfie images
Majority of selfie images in set were singletons, followed by duo and group selfies (the latter
two at about the same rates)
Minority of selfie images had unclear gender (in all three categories)

Selfie types and counts
Photos predominated over drawings / illustrations and mixed text and visualizations
Text-only selfies tend to be drawings or illustrations mostly, not photos
Animal selfies tended to combine images and text

“Extended self” and behavioral residue approaches to image set
Not just images of the “self” in a self-captured digital image in a humorous context (in a literalist
way)
“Extended self” analyses of selfies showed uses of the following to represent the “self”: people,
animals, figures (objects) / materials, and texts
Broad application of “personality psychology” concepts which suggest that everything people do
reflects something of their personality (“inner world”) and social lives
◦ Concept of “behavioral residue”
◦ Consistent “self” emerges and self-reveals…over time…and over data…in an observable and describable
way

Humans by general age categories
General age categories:
Adults: 526
Children: 32
Toddlers: 4
Babies: 6
Numbers may be indicative of ease-of-access
to imagery of adults by adults
Child images are usually from a mix of child
stars (through screen grabs) and family images
Likewise, toddler and baby images seem to be
either in the public arena (likely copyrighted)
or personal

Three common imagery format types
Common imagery format types:
Photographs: 429
Drawings / illustrations: 91
Mixed images and text visuals: 63

Observed humor styles
Predominant humor style may have an effect on one’s social-psychological health.
R. Martin, P. Puhlik-Doris, G. Larsen, J. Gray, and K. Weir (Feb. 2003) shared a piece “Individual
differences in uses of humor and their relation to psychological well-being: Development of the
Humor Styles Questionnaire” in the Journal of Research in Personality about four types of humor
styles:
1. Affiliative humor (social): used to charm and amuse others so as to benefit relationships
2. Aggressive humor (social): used to critique and ridicule others so as to put others down
3. Self-enhancing humor (self): used to relieve tensions and stress so as to aid in coping
4. Self-defeating humor (self): used to put oneself down to make others laugh (at a cost to
one’s dignity)

Observed humor styles (cont.)
Humor styles have implications for healthy / unhealthy self-concept and for constructive / non-constructive
social interactions with others.
Humor styles may be seen in what types of humor people engage in and prefer.
Humor styles may be inferred from shared messaging in the #selfie #humor image set.

An expanded 2 x 2 table of the four dimensions of humor styles with linked #selfie #humor image themes
Columns: Enhancing Self | Enhancing Social Relations
Benign Humor Style: Self-enhancing (adaptive) | Affiliative (adaptive)
◦ Linked themes: 1. Truth-telling about the self; 2. Un-selfie (counter-messaging); 3. Animals (as self and other); 4. Inspirational; 5. Human sociality and social media (meta-perspective); 7. Funny faces; 8. Spectacle
Injurious Humor Style: Aggressive (mal-adaptive) | Self-defeating (mal-adaptive)
◦ Linked themes: 6. Human tensions (and issues) | 6. Human tensions (and issues)

Some Early Insights re: Coding Social Imagery

Image data cleaning
Keep a pristine master collection of the raw images before any data cleaning is done.
◦ Avoid losing data from data cleaning.
Early data cleaning involves deciding what belongs in the research set and what doesn’t.
◦ Spell out standards for inclusion / exclusion. Be consistent.
◦ What is a #selfie? What is #humor?
Remove duplicate images.
◦ Different messages using the same underlying image were considered to be different #selfie #humor
images.
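The duplicate-removal step can be partly automated. The sketch below is one possible approach, not the method used in this study: it groups files whose bytes are identical by comparing SHA-256 hashes. The function name and folder layout are assumptions. Only exact copies are caught; the same underlying image with a different text overlay hashes differently, which fits the rule above of treating such images as distinct.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(image_dir):
    """Group files in image_dir that have byte-identical contents."""
    groups = defaultdict(list)
    for path in sorted(Path(image_dir).iterdir()):
        if path.is_file():
            # Hash the raw bytes; identical files produce identical digests.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    # Keep only the digests shared by more than one file.
    return [names for names in groups.values() if len(names) > 1]
```

This would be run against a working copy of the images so that the pristine master collection stays untouched.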

Image sufficiency
Knowing how many images to collect was not clear initially, or even later in the work.
◦ Saturation would suggest that images should be collected until no relevant new themes are
identifiable, for a fair emergent representation.
Amount of effort required to iteratively code the imagery manually was a deterrent against
searching for more images.
Check-backs at multiple periods thereafter (over months) using the same seeding terms (#selfie
#humor Instagram) resulted in
◦ many images that were visually and thematically similar to those identified in this exploratory research and
◦ some recognizable images (from the initial image sets).
◦ Novelty of concept, along with production quality, tends to be rare.
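The saturation idea can be made concrete with a simple stopping rule. The sketch below is a hypothetical illustration, not a procedure from this study: images are coded in batches, and collection is treated as saturated once a chosen number of consecutive batches (the assumed `patience` parameter) adds no new themes.

```python
def batches_until_saturation(coded_batches, patience=2):
    """Return the 1-based index of the batch at which saturation is reached.

    coded_batches: list of sets of theme labels, one set per coding batch.
    patience: hypothetical threshold, the number of consecutive batches
        with no new themes before the set is considered saturated.
    Returns None if saturation is never reached.
    """
    seen = set()
    quiet = 0  # consecutive batches that added nothing new
    for index, themes in enumerate(coded_batches, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        quiet = 0 if new_themes else quiet + 1
        if quiet >= patience:
            return index
    return None
```

The threshold is a judgment call; a larger value trades more coding effort for more confidence that no theme was missed.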

Image sufficiency (cont.)
Originality is a rarity in this context (combined folk-labeled #hashtagged topics, the selected
images from Instagram).
◦ Contents on Instagram are generally conceptualized to be “instant” “telegrams”—so the speed of
creation is a factor.
◦ Social media account follower-ship may encourage emulation of others.
◦ Trending memes, which encourage copycats, apparently lead to repeated types of messaging.
◦ Practice of “photoshop battles” (such as on Reddit) appears to lead to particular types of visual
expressions based on specific digital image editing / visual expressions.
◦ Instagram “filters” (puppy filter, flower crown filter, and others) result in some common image overlays.

Some benefits to emergent coding
Emergent coding begins with the imagery set…and not any a priori coding theory, framework,
model, research question, or other approach.
◦ This bottom-up approach starts with the minutiae of specific dimensions of the images in the image set.
This type of coding is closer to the data (rather than starting from a top-down theory and seeing how
the data fits).
◦ This does not assume that the data is somehow speaking for itself. Rather, this still acknowledges the
subjectivity of coding achieved by a person or people.
Setting a baseline description for an image set provides some useful insights…which may be
built upon with more targeted types of coding (see reference above).
◦ “Baseline” is used as a limited descriptor of the particular limited image set. This is not understood in
any way as generalizing out to the N=all image set.

Some benefits to emergent coding (cont.)
Baseline-setting of social image sets includes qualitative (descriptive) and quantitative (count)
approaches: categorization of images based on …
◦ concepts (themes, messaging)
◦ contents (gender)
◦ types (single; duo; group) (photos, drawings, combined text and visualization)
Identification of anomalies in image sets
◦ It is helpful to have a sense of general tendencies of the image set.
◦ It is helpful to identify outliers and to be able to describe why and how the selected images are outliers.

Social image data extraction
Only 62% of extracted images fit a broad definition of #selfie #humor (the seeding hashtags)
◦ Using “extended self” conceptualizations
◦ “Self” is not only the individual but relationships, possessions, employment, and other aspects
◦ Including non-human animal and object-based and word representations of a self
◦ Plenty of images that were “selfies” but with no apparent “humor” (except maybe in the sense of “in
good humor,” as in a good mood: a smiling face but nothing funny or no attempt at funny)
“Selfie” understood multiple ways (in terms of counts):
◦ self as individual “I,”
◦ self as collective “we,”
◦ self as individual “other,”
◦ self as collective “other”
Messaging on continuum of empathy and sympathy to non-empathetic and antagonistic
A fair amount of noise (vs. signal) to folksonomic tagging of socially-shared images
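One way to sanity-check the 62% figure is sketched below. It rests on two assumptions that the deck does not state outright: that the 941-image set is the full extraction, and that the three format-type counts reported elsewhere in this deck (429 photographs, 91 drawings / illustrations, 63 mixed images and text) together make up the retained images.

```python
# Counts reported in the deck.
total_extracted = 941  # images in the scraped set
format_counts = {
    "photographs": 429,
    "drawings / illustrations": 91,
    "mixed images and text": 63,
}

# Assumption: the three format counts cover every retained image.
retained = sum(format_counts.values())
retained_share = retained / total_extracted

print(f"{retained} of {total_extracted} images retained ({retained_share:.0%})")
```

Under those assumptions the retained share rounds to the reported figure, which suggests the numbers are internally consistent.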

Social image data extraction (cont.)
Methods for image data capture limit amount of data and metadata captured
◦ Some data lossiness such as non-capture of the original names of images (using Firefox browser add-on
DownThemAll and Google Chrome add-on Chronos Download Manager)
◦ Google Images more effective than Bing Images in capturing larger sets with more original images, using
the same data find parameters (#selfie #humor Instagram)
Image data scraping using Python or R would likely be much more effective in terms of amounts
of images and additional data collection beyond the images
Manual image downloads could enable the capture of more information (higher resolution
images, more metadata), but that process is time-costly and slow and not particularly scalable
(maybe except through crowd-sourcing)
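A scripted capture could address the lost file names and metadata noted above. The following is a minimal sketch under stated assumptions, not a full scraper: it assumes the image bytes and source URL have already been obtained by some download step (omitted here), and the function name, manifest columns, and file layout are illustrative. It saves each image under the file name found in its URL and appends a row to a CSV manifest.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import unquote, urlparse


def record_image(url, content, out_dir, manifest_name="manifest.csv"):
    """Save image bytes under their original file name and log metadata."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Recover the original file name from the URL path (query string ignored).
    original_name = Path(unquote(urlparse(url).path)).name or "unnamed"
    (out_dir / original_name).write_bytes(content)

    manifest = out_dir / manifest_name
    is_new = not manifest.exists()
    with manifest.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["file_name", "source_url", "retrieved_utc", "sha256"])
        writer.writerow([
            original_name,
            url,
            datetime.now(timezone.utc).isoformat(),
            hashlib.sha256(content).hexdigest(),
        ])
    return original_name
```

Note that two URLs ending in the same file name would overwrite each other in this sketch; a fuller version would need a collision rule.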

Social image data extraction (cont.)
May look for “semi-automatic” social image-capture approaches in the future (part machine
scraping, part human downloads)

Some observations about social image coding
Image pixelation can be a problem given the low resolution of the scraped images.
◦ Some research, such as through reverse image searches online (like TinEye), can be helpful to establish
provenance.
◦ However, it is important to set research limits for interpretation of the images in the #selfie #humor set.
Some research led to some iffy sites. Others dead-ended in misspellings.
◦ If an image was too muddy on its own, it was generally omitted from the research set.
◦ Obvious ads were also omitted.
There is more ambiguity in imagery than one might initially assume. For example, coding for
gender can be challenging. Coding for age can be challenging as well; for example, does the image
show a child or a young adult? Coding for race was avoided because race cannot be verifiably
determined from an image.
Honest and thorough coding means handling some images that can be offensive and socially
questionable.

Some observations about social image coding (cont.)
An image may be coded multiple ways because the categories tend not to be mutually exclusive.
◦ An image can fit fully and / or partially in multiple coding categorizations.
Non-English languages (or non-base languages) can be a problem, even with the help of Google
Translate. (Original language use may be slang, and many textual elements contain
misspellings.)
It is helpful to iterate over image sets multiple times with different focuses each time in order to
capture accurate information. Image coding is not going to be a once-through sort of activity.
◦ It helps to have a clear and focused purpose for each iteration.
Guesstimating numbers is neither helpful nor accurate. For an accurate manual count, it helps to
go through and count attentively…and even recount.
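Because categories are not mutually exclusive and manual counts benefit from a recount, a small tally script can help. The sketch below uses hypothetical file names and code assignments. It counts images per category, so the tallies can sum to more than the number of images, and it compares two coding passes to flag categories whose counts disagree.

```python
from collections import Counter

# Hypothetical coding results: image file name -> set of assigned categories.
coded_images = {
    "img_001.jpg": {"3. Animals", "7. Funny faces"},
    "img_002.jpg": {"4. Inspirational"},
    "img_003.jpg": {"5. Human sociality and social media", "6. Human tensions"},
}


def tally_codes(assignments):
    """Count how many images were coded to each category."""
    tally = Counter()
    for codes in assignments.values():
        tally.update(codes)
    return tally


def recount_differences(first_pass, second_pass):
    """Return the categories whose counts differ between two coding passes."""
    first, second = tally_codes(first_pass), tally_codes(second_pass)
    return sorted(c for c in set(first) | set(second) if first[c] != second[c])
```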

Coding social imagery in NVivo 11 Plus
NVivo 11 Plus (a qualitative data analytics research suite) enables the curation of a large number
of images for data analytics.
To benefit more from NVivo 11 Plus, it helps to add plenty of insightful descriptive text in the
notes fields…in order to have textual contents against which to run data queries and autocoding.
◦ Without text annotations, word frequency counts, word searches, and other types of data queries
cannot be run against the image data.
◦ Without detailed text annotations, no sentiment and no theme extractions will be seen through the
autocoding.
It helps to also do other coding using other software, such as Word (for notetaking) and Excel
(for quantitative representations).

Social image exploration to create a social image codebook
An initial image set may be used to create analytical tools to apply to larger sets of images.
◦ For example, it may help to use an image sample set to create an initial codebook.
A basic codebook consists of a code and then a description of the standards for what would
be coded to that particular node (or code category).
◦ It may help to have some digital exemplars for those categories as well (to better explain the coding).
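The codebook structure described above (a code, the standards for coding to it, and optional exemplars) could be represented as follows. This is an illustrative sketch: the class, field names, placeholder descriptions, and exemplar file names are assumptions, not contents of the actual codebook.

```python
from dataclasses import dataclass, field


@dataclass
class CodebookEntry:
    code: str          # node / code category name
    description: str   # standards for what is coded to this node
    exemplars: list = field(default_factory=list)  # example image file names


# Illustrative entries; descriptions and file names are placeholders.
codebook = [
    CodebookEntry(
        code="7. Funny faces",
        description="(placeholder) inclusion and exclusion standards go here",
        exemplars=["exemplar_funny_face_01.jpg"],
    ),
    CodebookEntry(
        code="8. Spectacle",
        description="(placeholder) inclusion and exclusion standards go here",
    ),
]


def lookup(entries, code):
    """Return the entry for a code, or None if the code is not defined."""
    return next((entry for entry in entries if entry.code == code), None)
```

Keeping the standards next to each code makes it easier to apply the same inclusion and exclusion rules when the codebook is reused on a larger image set.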

Copyright challenges
It is hard to chase copyright on such user-shared images since there is so much cooptation of
images and ideas from others.
◦ The ostensible users of the images were often unlikely to be the owners of the images and so did not have
standing to release copyright for an image.
◦ It’s easier to not publish any of the images without established image ownership and a legal release.
◦ Descriptions of #selfie #humor imagery with proper uses of quotation marks were preferable to actual inclusion of the images.
The Instagram End User License Agreement (EULA) requires people who post to own copyright
to the materials that they upload and to release rights to Instagram, but there is no blanket
copyright release to users of the service.
◦ Users of Instagram indemnify Instagram as a service and platform only.

Other image coding methods in the academic literature
In the research literature, there have been some early works on digital image coding:
◦ using algorithms to analyze social images for contents (understanding of one central form), facial
recognition, and sentiment (such as AlchemyAPI, now of IBM Watson) / algorithms trained on Web-scale
data
◦ seems to enable broad-scale summary data
◦ seems to be applied to trending issues
◦ using crowd-sourcing (CrowdFlower) to have people label social and other images (with fairly high levels
of confidence)
◦ seems to require a targeted question or research aim
Both of the above methods involve commercial entities and additional costs.

Other image coding methods in the academic literature (cont.)
There are domain-specific works as well, with coding for specific research and queries (a priori
coding)…but nothing the author could find about emergent manual coding of social image sets.
Some generic types of research questions based on image coding include the following:
◦ Are there gender biases in how people are depicted in mass media around particular social issues?
◦ Are there geographical tendencies in terms of visual depictions based around particular topics?
◦ What are differences between image sets tagged with the same term (keyword or hashtag) from
different geographical regions? Different cultures? Different people groups?

Conclusion and contact
Dr. Shalin Hai-Jew
◦ iTAC, Kansas State University
◦ 212 Hale / Farrell Library
◦ 785-532-5262
◦ shalin@k-state.edu
Thanks to Dr. Jana R. Fallin and her team at the K-State Teaching & Learning Center for accepting
this digital poster session for the Big XII Teaching and Learning Conference (2016)!
This digital poster session was created from a chapter that is forthcoming in a text scheduled for
publication in 2017. © All contents are copyrighted.
