SlideShare uma empresa Scribd logo
1 de 15
Baixar para ler offline
1
My
Little
Data
A Poster by Candida Haynes - PyData NYC 2015
Big
Data
World
in a
!
!
!
!
!
!
!
You share on social media. You use email. Big data machines “think”
they know you once they have analyzed some of your patterns. Now
imagine “writing” a deliberate data story.
!
This poster describes an early process of
!
1) retrieving personal data,
!
2) securing and exploring it, and
!
3) identifying the tools that will allow individuals to start cultivating and/or
preserving their data identities with code.
Description
3
Considerations at Each
Step:
Does it expose
personal data or
thought processes
to the web? To a
platform that does
not already have
the data?
Friction for beginners –
i.e. difficulty,
information available
online for someone
with limited knowledge
of jargon.
More on Friction
Most
documentation
describes how to
pull information
from distributed
API's.
Required
precautions might
delay beginners
and ultimately lose
them by seeming
(or being) too far
from their project
goals.
5
Problem-Solving Strategy:
Local Security
!
!
Computed behind the firewall
that was already on my system.
Encrypted hard drive,
protected via password.
6
USB 4 gigabytes
Did not encrypt USB drive but will
consider it in the next iteration to
discover if / how that changes the
process
Storage and Memory
Management
7
Getting Your Data
Some social media sites have a page
where you can request a download file of
your data. I chose to use Twitter and found
my request link here: https://twitter.com/
settings/account
Timing: Sometimes the packaging and
prep of your data download can take
several minutes, several hours, or a few
days. The email alert that the data was
ready for download took less than two
minutes to arrive this time.
8
Twitter Data Characteristics
Personal information that is / was already
consumable by the public.
!
Delivered as ZIP file with Twitter's
encryption while in transit.
!
Twitter archive file, which had a numerical
name, included the a tweets.csv file, which
matched the data type in Bokeh's example.
!
Each word in a tweet had its own column,
which made counting easier.
!
9
The data I downloaded appeared
in a folder once I unzipped the file.
Grailbird.data.tweets_2008_12 =
[ {
"source" : "u003Ca href="http://twitter.com" rel="nofollow"u003ETwitter Web Clientu003C/au003E",
"entities" : {
"user_mentions" : [ ],
"media" : [ ],
"hashtags" : [ ],
"urls" : [ ]
},
"geo" : { },
"id_str" : "1064099455",
"text" : "I need cookies.",
"id" : 1064099455,
"created_at" : "2008-12-18 00:00:00 +0000",
"user" : {
"name" : "Dida Lakes",
"screen_name" : "dihaynes",
"protected" : false,
"id_str" : "18206386",
"profile_image_url_https" : "https://pbs.twimg.com/profile_images/654911288917659648/QKsP0wHR_normal.jpg",
"id" : 18206386,
This is a JSON file of my first tweet!
11
!
Handled a .CSV file with > 7k tweets
Adequately displayed data
Sorting and counting tools did not require code
!
!
!
!
Problem: Dependencies
Solution: Anaconda
Understanding Data: OpenOffice Calc
as a Problem-Solving Tool
Not having the right software and configurations for
the new software you are installing causes errors.
Anaconda resolves a lot of those problems and is
recommended in the Bokeh documentation
From @dihaynes on Twitter
(sans cleansing) via Jason Davies.
You can explore more word cloud generators at
http://worditout.com/ and http://tagcrowd.com.
Word Cloud:
Visual Alternative to Calc
13
Data that matched my
interest in employment
data for future projects
Learning opportunities via
dependencies that I could
use to interact with tools that
had previously posed too
much friction
It had a
presentation format
that would allow for
quick interactions
“Employment Sample”
visualization had colors
that resonated with me
Data Visualization Tool: Bokeh
Data visualization from Bokeh sample.
http://bokeh.pydata.org/en/latest/docs/gallery/unemployment.html
Visual Inspiration
15
What is the role of
data science in
society?
What would you add
to the story of this
project?
What are some
moments when
data science and
storytelling are at
odds? When are
they not?
What questions do
you still have about
the data? About the
process?
Questions for Discussion?

Mais conteúdo relacionado

Semelhante a My Little Data in a Big Data World

Masterclass on the DID Universal Resolver
Masterclass on the DID Universal ResolverMasterclass on the DID Universal Resolver
Masterclass on the DID Universal ResolverMarkus Sabadello
 
Data protection in Practice
Data protection in PracticeData protection in Practice
Data protection in PracticeTomppa Järvinen
 
Content Disarm Reconstruction & Cyber Kill Chain
Content Disarm Reconstruction & Cyber Kill ChainContent Disarm Reconstruction & Cyber Kill Chain
Content Disarm Reconstruction & Cyber Kill ChainMuhammad Sahputra
 
How to protect the cookies once someone gets into the cookie jar
How to protect the cookies once someone gets into the cookie jarHow to protect the cookies once someone gets into the cookie jar
How to protect the cookies once someone gets into the cookie jarJudgeEagle
 
Content Disarm Reconstruction and Cyber Kill Chain - Muhammad Sahputra
Content Disarm Reconstruction and Cyber Kill Chain - Muhammad SahputraContent Disarm Reconstruction and Cyber Kill Chain - Muhammad Sahputra
Content Disarm Reconstruction and Cyber Kill Chain - Muhammad Sahputraidsecconf
 
So whats in a password
So whats in a passwordSo whats in a password
So whats in a passwordRob Gillen
 
Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019Kimberley Dray
 
Notes to The Ten Commandments of Online Security and Privacy
Notes to The Ten Commandments of Online Security and PrivacyNotes to The Ten Commandments of Online Security and Privacy
Notes to The Ten Commandments of Online Security and PrivacyJonathan Bacon
 
Hacking databases
Hacking databasesHacking databases
Hacking databasessunil kumar
 
Hacking databases
Hacking databasesHacking databases
Hacking databasessunil kumar
 
CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...
CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...
CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...PROIDEA
 
Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...
Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...
Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...Chetan Khatri
 
Ethical_Hacking_ppt
Ethical_Hacking_pptEthical_Hacking_ppt
Ethical_Hacking_pptNarayanan
 
Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Boris Adryan
 
Open Source Information Gathering Brucon Edition
Open Source Information Gathering Brucon EditionOpen Source Information Gathering Brucon Edition
Open Source Information Gathering Brucon EditionChris Gates
 
Baking Security into the Company Culture (2017)
Baking Security into the Company Culture (2017) Baking Security into the Company Culture (2017)
Baking Security into the Company Culture (2017) Mike Kleviansky
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science TJ Stalcup
 

Semelhante a My Little Data in a Big Data World (20)

Masterclass on the DID Universal Resolver
Masterclass on the DID Universal ResolverMasterclass on the DID Universal Resolver
Masterclass on the DID Universal Resolver
 
Data protection in Practice
Data protection in PracticeData protection in Practice
Data protection in Practice
 
Content Disarm Reconstruction & Cyber Kill Chain
Content Disarm Reconstruction & Cyber Kill ChainContent Disarm Reconstruction & Cyber Kill Chain
Content Disarm Reconstruction & Cyber Kill Chain
 
How to protect the cookies once someone gets into the cookie jar
How to protect the cookies once someone gets into the cookie jarHow to protect the cookies once someone gets into the cookie jar
How to protect the cookies once someone gets into the cookie jar
 
Content Disarm Reconstruction and Cyber Kill Chain - Muhammad Sahputra
Content Disarm Reconstruction and Cyber Kill Chain - Muhammad SahputraContent Disarm Reconstruction and Cyber Kill Chain - Muhammad Sahputra
Content Disarm Reconstruction and Cyber Kill Chain - Muhammad Sahputra
 
So whats in a password
So whats in a passwordSo whats in a password
So whats in a password
 
Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019
 
Notes to The Ten Commandments of Online Security and Privacy
Notes to The Ten Commandments of Online Security and PrivacyNotes to The Ten Commandments of Online Security and Privacy
Notes to The Ten Commandments of Online Security and Privacy
 
Hacking databases
Hacking databasesHacking databases
Hacking databases
 
Hacking databases
Hacking databasesHacking databases
Hacking databases
 
CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...
CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...
CONFidence 2017: Hackers vs SOC - 12 hours to break in, 250 days to detect (G...
 
Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...
Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...
Data Science for Beginner by Chetan Khatri and Deptt. of Computer Science, Ka...
 
Ethical_Hacking_ppt
Ethical_Hacking_pptEthical_Hacking_ppt
Ethical_Hacking_ppt
 
Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16
 
Codemash-2017
Codemash-2017Codemash-2017
Codemash-2017
 
Open Source Information Gathering Brucon Edition
Open Source Information Gathering Brucon EditionOpen Source Information Gathering Brucon Edition
Open Source Information Gathering Brucon Edition
 
Baking Security into the Company Culture (2017)
Baking Security into the Company Culture (2017) Baking Security into the Company Culture (2017)
Baking Security into the Company Culture (2017)
 
Life on Clouds: a forensics overview
Life on Clouds: a forensics overviewLife on Clouds: a forensics overview
Life on Clouds: a forensics overview
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science
 
OlgerHoxha_Thesis_Final
OlgerHoxha_Thesis_FinalOlgerHoxha_Thesis_Final
OlgerHoxha_Thesis_Final
 

Último

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 

Último (20)

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 

My Little Data in a Big Data World

  • 1. 1 My Little Data A Poster by Candida Haynes - PyData NYC 2015 Big Data World in a
  • 2. ! ! ! ! ! ! ! You share on social media. You use email. Big data machines “think” they know you once they have analyzed some of your patterns. Now imagine “writing” a deliberate data story. ! This poster describes an early process of ! 1) retrieving personal data, ! 2) securing and exploring it, and ! 3) identifying the tools that will allow individuals to start cultivating and/or preserving their data identities with code. Description
  • 3. 3 Considerations at Each Step: Does it expose personal data or thought processes to the web? To a platform that does not already have the data? Friction for beginners – i.e. difficulty, information available online for someone with limited knowledge of jargon.
  • 4. More on Friction Most documentation describes how to pull information from distributed API's. Required precautions might delay beginners and ultimately lose them by seeming (or being) too far from their project goals.
  • 5. 5 Problem-Solving Strategy: Local Security ! ! Computed behind the firewall that was already on my system. Encrypted hard drive, protected via password.
  • 6. 6 USB 4 gigabytes Did not encrypt USB drive but will consider it in the next iteration to discover if / how that changes the process Storage and Memory Management
  • 7. 7 Getting Your Data Some social media sites have a page where you can request a download file of your data. I chose to use Twitter and found my request link here: https://twitter.com/ settings/account Timing: Sometimes the packaging and prep of your data download can take several minutes, several hours, or a few days. The email alert that the data was ready for download took less than two minutes to arrive this time.
  • 8. 8 Twitter Data Characteristics Personal information that is / was already consumable by the public. ! Delivered as ZIP file with Twitter's encryption while in transit. ! Twitter archive file, which had a numerical name, included the a tweets.csv file, which matched the data type in Bokeh's example. ! Each word in a tweet had its own column, which made counting easier. !
  • 9. 9 The data I downloaded appeared in a folder once I unzipped the file.
  • 10. Grailbird.data.tweets_2008_12 = [ { "source" : "u003Ca href="http://twitter.com" rel="nofollow"u003ETwitter Web Clientu003C/au003E", "entities" : { "user_mentions" : [ ], "media" : [ ], "hashtags" : [ ], "urls" : [ ] }, "geo" : { }, "id_str" : "1064099455", "text" : "I need cookies.", "id" : 1064099455, "created_at" : "2008-12-18 00:00:00 +0000", "user" : { "name" : "Dida Lakes", "screen_name" : "dihaynes", "protected" : false, "id_str" : "18206386", "profile_image_url_https" : "https://pbs.twimg.com/profile_images/654911288917659648/QKsP0wHR_normal.jpg", "id" : 18206386, This is a JSON file of my first tweet!
  • 11. 11 ! Handled a .CSV file with > 7k tweets Adequately displayed data Sorting and counting tools did not require code ! ! ! ! Problem: Dependencies Solution: Anaconda Understanding Data: OpenOffice Calc as a Problem-Solving Tool Not having the right software and configurations for the new software you are installing causes errors. Anaconda resolves a lot of those problems and is recommended in the Bokeh documentation
  • 12. From @dihaynes on Twitter (sans cleansing) via Jason Davies. You can explore more word cloud generators at http://worditout.com/ and http://tagcrowd.com. Word Cloud: Visual Alternative to Calc
  • 13. 13 Data that matched my interest in employment data for future projects Learning opportunities via dependencies that I could use to interact with tools that had previously posed too much friction It had a presentation format that would allow for quick interactions “Employment Sample” visualization had colors that resonated with me Data Visualization Tool: Bokeh
  • 14. Data visualization from Bokeh sample. http://bokeh.pydata.org/en/latest/docs/gallery/unemployment.html Visual Inspiration
  • 15. 15 What is the role of data science in society? What would you add to the story of this project? What are some moments when data science and storytelling are at odds? When are they not? What questions do you still have about the data? About the process? Questions for Discussion?