An attempt to inspire kids to pay attention in math and science classes so they can get a good job and help fill the skills gap in the years to come.
11. The Hive Mind Map shows popular Twitter hashtags
for the last 7 days and how they are connected
http://hivemindmap.com/?#
12. HIVE MIND MAP
A mind-map of what’s happening on Twitter
Thanks to Mark Harwood for these slides and the Hive Mind Map
http://www.infoq.com/presentations/elasticsearch-revealing-uncommonly-common
13. Connections
The thickness of a line between hashtags is
based on the strength of connection
Tip: Strength of connection is the number of tweets with both tags divided by the number with either tag; see “Jaccard similarity coefficient”
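The tip above can be sketched in a few lines of Python. This is only an illustration, not the code behind the Hive Mind Map: the function name `jaccard`, the sample tweets, and the choice to represent each tweet as a set of hashtags are all assumptions made for the example.

```python
def jaccard(tweets, tag_a, tag_b):
    """Jaccard similarity of two hashtags.

    tweets: a list where each item is the set of hashtags in one tweet
    (a hypothetical representation chosen for this sketch).
    """
    # Indexes of the tweets that mention each tag
    with_a = {i for i, tags in enumerate(tweets) if tag_a in tags}
    with_b = {i for i, tags in enumerate(tweets) if tag_b in tags}

    either = with_a | with_b   # tweets with at least one of the tags
    both = with_a & with_b     # tweets with both tags

    if not either:
        return 0.0             # neither tag was ever used
    return len(both) / len(either)


# Made-up sample data, for illustration only
tweets = [
    {"#python", "#data"},
    {"#python"},
    {"#data", "#bigdata"},
    {"#python", "#data", "#bigdata"},
]

strength = jaccard(tweets, "#python", "#data")
print(strength)  # 2 tweets with both tags / 4 tweets with either tag = 0.5
```

A value near 1 means the two tags almost always appear together, which would be drawn as a thick line; a value near 0 means they rarely do.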
14. Top tweets
The most popular tweets for a tag are sorted
based on the number of “retweets”
15. When?
The rise and fall of each hashtag’s popularity
can be shown over time
16. Calendar summary
Tags that “peak” together are grouped into
events on a calendar
Tip: Peaks are detected using standard deviations. Only tags with a single peak are chosen as events
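One way to read the peak-detection tip is sketched below. The slides do not say which threshold is used, so the rule here (a day counts as part of a peak when its tweet count is more than `k` standard deviations above the mean, with `k = 2`) is an assumption, as are the function names and the sample counts.

```python
from statistics import mean, pstdev


def find_peaks(counts, k=2.0):
    """Return (start, end) index pairs for runs of unusually busy days.

    counts: tweets per day for one hashtag.
    k: how many standard deviations above the mean counts as a peak
       (an assumed threshold, not taken from the slides).
    """
    threshold = mean(counts) + k * pstdev(counts)
    peaks = []
    in_peak = False
    for day, count in enumerate(counts):
        if count > threshold:
            if in_peak:
                peaks[-1][1] = day        # extend the current peak
            else:
                peaks.append([day, day])  # start a new peak
                in_peak = True
        else:
            in_peak = False
    return [tuple(p) for p in peaks]


def is_event(counts, k=2.0):
    """A tag is treated as an event only if it has exactly one peak."""
    return len(find_peaks(counts, k)) == 1


# Made-up daily counts: quiet all week except for one busy day
daily = [3, 4, 2, 5, 40, 4, 3]
print(find_peaks(daily))  # [(4, 4)]
print(is_event(daily))    # True
```

A tag that is equally popular every day has no peak at all, and a tag that spikes twice has two, so neither would be picked as a calendar event under this rule.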
Tip: Tags that rise and fall in popularity at the same time are detected using Pearson’s correlation
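Pearson's correlation compares two series of numbers and returns a value between -1 and 1. A minimal sketch, assuming each tag is represented by its tweet count per day (the function name and the sample numbers are made up for illustration):

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # How much the two series move together around their averages
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each series moves on its own
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)

    if spread_x == 0 or spread_y == 0:
        return 0.0  # a flat series has no rise or fall to compare
    return covariance / sqrt(spread_x * spread_y)


# Made-up daily counts for two tags that rise and fall on the same days
tag_one = [1, 2, 3]
tag_two = [2, 4, 6]
print(pearson(tag_one, tag_two))  # 1.0
```

A result close to 1 means the two tags peak together, close to -1 means one rises while the other falls, and close to 0 means there is no clear relationship.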
17. What makes this possible?
• Free software (Lucene, Java, Eclipse, Gephi, Tomcat, d3, Google Analytics…)
• Free data (millions of users’ tweets from Twitter’s 1% sample feed)
• “Cloud” computing (rented server)
• Smarter web browsers (visualizations using HTML5’s SVG/Canvas)
• All the friendly folks on the internet (e.g. http://stackoverflow.com/questions/14799842)
• Some imagination…
18. Opportunities in Data Science
• We are all generating volumes of data never seen before
• You can recycle the behaviors of billions of people into more intelligent systems:
  • Customer purchases can be used for product recommendations
  • User searches can be used for spelling corrections
  • Reader clicks can influence the trending news
  • Spotify activity is used to make music recommendations
• The tools have never been cheaper
• It has never been easier to find help in developing systems
19. …one more thing…
I’m writing these slides for you while on my annual snowboarding trip to Canada.
Data science pays well ;-)
Wish you were here…
31. SCORES 2004-2012
Elementary - 4th Grade, Middle School - 8th Grade, High School
About half of high school students in California are proficient at Math and Science
33. CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
34. CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
Maybe the Math tests were harder for everyone that year?
35. CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
4th Grade “cohort” in 2004 was 8th Grade in 2008
Maybe the Math tests were harder for everyone that year?
36. DATA SCIENCE WITH EXCEL
Pivot tables let you rearrange data and trend lines measure the slope
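The same two ideas can be tried outside Excel. The sketch below is an assumption-laden illustration, not a reproduction of the spreadsheet in the slides: it regroups rows by school level (a tiny "pivot") and then fits a least-squares trend line to get the slope. The rows are made-up placeholder numbers, not the real California scores.

```python
from collections import defaultdict


def trend_slope(xs, ys):
    """Slope of the least-squares trend line through the points (x, y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("need at least two different x values")
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / spread_x


# Placeholder rows: (year, school level, percent proficient). NOT real data.
rows = [
    (2004, "Elementary", 40.0),
    (2005, "Elementary", 42.0),
    (2006, "Elementary", 44.0),
    (2004, "High School", 30.0),
    (2005, "High School", 31.0),
    (2006, "High School", 32.0),
]

# "Pivot": one entry per school level, holding its score for each year
pivot = defaultdict(dict)
for year, level, percent in rows:
    pivot[level][year] = percent

# Trend line slope per level = change in percent proficient per year
slopes = {}
for level, by_year in pivot.items():
    years = sorted(by_year)
    slopes[level] = trend_slope(years, [by_year[y] for y in years])

print(slopes)
```

A positive slope means scores for that level are rising year over year; comparing slopes across levels shows which group is improving fastest.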
37. LEARN TO BE A DATA SCIENTIST FOR $1
• Everything is being measured
• The latest data science tools are available to anyone for pennies
• There is lots of freely available data
• Pay attention in math and science class, play around with EMR and BigQuery, and get an interesting and well-paid job as a data scientist!