Data Driven Societies
Digital & Computational Studies
Bowdoin College
February 10, 2014
Professors Gieseking & Gaze
Lecture Slides "Defining Data & Redefining Privacy"
2. Data & Information (Recap)
✦ Information society
✦ Data vs. information
✦ Information-as-freedom vs. information-as-control
3. Big Data & Privacy
✦ Ethical research
✦ Data sample and data access
✦ Defining big data, defining privacy
daily.captaindash.com
4. Social Scientific Approach
0. Identify an issue
1. Research question
2. Theoretical approach
3. Literature review
4. Methods
5. Analysis
6. Discussion
7. Conclusion
Ethics, anyone?
8. The Future of Now
The Chronicle of Higher Ed
The White House
9. Visualize This
How we handle the emergence of Big Data is critical.
…it is still necessary to ask critical questions about
what all this data means, who gets access to what
data, how data analysis is deployed, and to what ends.
—danah boyd & Kate Crawford,
“Critical Questions for Big Data” (2012)
13. Data Access: Twitter
✦ API (application programming interface): the set of tools developers can use to access structured data
✦ “Firehose” of access: Gnip, DataSift
✦ “Gardenhose” of access: 10% of public tweets
✦ “Spritzer” of access: about 1% of public tweets
✦ White-listed accounts: allowed access to certain subject matter
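To get a feel for how much data each access tier leaves a researcher with, here is a minimal Python sketch. It is only an illustration: the "tweets" are synthetic integer IDs, and independent random sampling is an assumption made here, not a description of how Twitter actually selects the tweets in each tier. The function name `sample_stream` is hypothetical.

```python
import random

def sample_stream(stream, rate, seed=0):
    """Keep each item independently with probability `rate` (assumed sampling scheme)."""
    rng = random.Random(seed)
    return [item for item in stream if rng.random() < rate]

# Synthetic stand-in for all public tweets: just integer IDs.
firehose = list(range(100_000))

# The two sampled tiers from the slide, at their stated rates.
gardenhose = sample_stream(firehose, 0.10)
spritzer = sample_stream(firehose, 0.01)

print(len(firehose), len(gardenhose), len(spritzer))
```

Whatever the real sampling mechanism is, the point stands: a project built on the smallest tier sees only a small slice of what a firehose customer sees.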
14. Data Rich and the Data Poor
Manovich (2011) writes of three classes of people in the
realm of Big Data: “those who create data (both consciously
and by leaving digital footprints), those who have the means
to collect it, and those who have expertise to analyze it.”
-boyd & Crawford (2012)
✦ Data rich and data poor: research insiders and outsiders, respectively, who have varied degrees of access to data and the means to analyze it
15. Defining Big Data
1. Large data sets that require supercomputers for analysis, i.e., usually over 2 GB (Manovich 2011)
2. A cultural, technological, and scholarly phenomenon that depends on the interplay of the following:
✦ Technology: maximized computation power and algorithmic accuracy
✦ Analysis: examining large data sets to identify patterns to make claims
✦ Mythology: widespread belief that the larger the data set, the more accurate the findings (boyd & Crawford 2012)
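The "mythology" point can be made concrete with a small simulation. The Python sketch below uses made-up numbers throughout: a hypothetical population in which 30% hold some opinion, and an assumed bias in which opinion holders are three times as likely to end up in the large "found" data set. It compares that large biased data set with a small simple random sample.

```python
import random

rng = random.Random(42)

# Hypothetical population: 200,000 people, 30% of whom hold some opinion (1 = holds it).
population = [1] * 60_000 + [0] * 140_000
true_share = sum(population) / len(population)

# Large but biased data set (assumption): holders are captured at 30%, others at 10%.
biased = [x for x in population if rng.random() < (0.30 if x else 0.10)]

# Small simple random sample of 1,000 people.
small = rng.sample(population, 1_000)

biased_est = sum(biased) / len(biased)
small_est = sum(small) / len(small)

print("true share:", true_share)
print("large biased estimate:", round(biased_est, 3), "from", len(biased), "records")
print("small random estimate:", round(small_est, 3), "from", len(small), "records")
```

Under these assumptions the large data set lands well above the true 30%, while the much smaller random sample lands close to it. Size alone does not fix a skewed sample.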
17. ScraperWiki Support
A clever and elegant solution to our problem of
accessing Twitter data with a limited number of calls:
1. Open ScraperWiki and view your table
2. Download EVERY MONDAY
3. Restart EVERY MONDAY (you will need to do this the first Monday of break too)
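Weekly downloads eventually need to be stitched back into one data set. The following is a minimal Python sketch of that step, not part of the ScraperWiki workflow itself. It assumes each download is a CSV with an `id` column that uniquely identifies a tweet. The column names, the function `merge_weekly_exports`, and the sample rows are all hypothetical, and in-memory strings stand in for the downloaded files.

```python
import csv
import io

def merge_weekly_exports(csv_texts, key="id"):
    """Combine weekly CSV exports, keeping the first row seen for each key value."""
    seen = set()
    merged = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            if row[key] not in seen:
                seen.add(row[key])
                merged.append(row)
    return merged

# Two made-up weekly downloads that overlap on tweet id 2.
week1 = "id,text\n1,first tweet\n2,second tweet\n"
week2 = "id,text\n2,second tweet\n3,third tweet\n"

merged = merge_weekly_exports([week1, week2])
print([row["id"] for row in merged])
```

Deduplicating on an ID matters because consecutive downloads may overlap if a scraper is not restarted cleanly.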
18. Next Class: Feb. 12
✦ Today: big data, privacy, research ethics, data rich vs. data poor
✦ Quiz: terms / concepts coming via email
✦ Readings: Pariser, Stray
✦ Next class/lab:
  ✦ filter bubbles
  ✦ correlation/causation
  ✦ work with Twitter datasets
  ✦ continue learning R