Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/17bqGie.
Rebecca Parsons reviews some of the changes in how data is used and analyzed, including new technology approaches. She looks at how data is used to track election violence, follow the movement of people after a natural disaster, and attempt to predict famine and other humanitarian crises before they happen. Filmed at qconnewyork.com.
Dr. Rebecca Parsons is ThoughtWorks' Chief Technology Officer. She has more than 30 years' experience in leading the creation of large-scale distributed and services-based applications, and the integration of disparate systems. Rebecca received a BS in Computer Science and Economics from Bradley University, and both an MS and a Ph.D. in Computer Science from Rice University.
2. InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/big-data-analysis
3. Presented at QCon New York
www.qconnewyork.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
4. Changing Nature of Data
Response
How we use data now
The 3 V's: velocity, variety and volume (and add a fourth: value)
5. Walmart: 1 million transactions per hour
Facebook: 40 billion photos
The Economist: Feb 25th 2010
Data is: Growing
Set up the next slide by saying that some people think only Google-size
companies should worry about this.
6. 640K ought to be
enough for anybody
Note: Although this is often attributed to Bill Gates - he never said
it.
7. Monthly Contributors to Wikipedia (source: Wikipedia)
Data is: Distributed
Year: Contributors
2002: 442
2003: 1,990
2004: 8,640
2005: 40,223
2006: 127,942
2007: 356,191
2008: 616,308
2009: 853,698
2010: 1,080,872
2011: 1,287,537
2012: 1,482,824
http://stats.wikimedia.org/EN/TablesWikipediansContributors.htm
Contributors are defined as people who edited at least 10 times. Data
are for the month of January of each year shown.
9. Data is: Distributed
98% of internet access
points in Africa are mobile
30 million networked sensor nodes
growing 30% per year
McKinsey Global Institute: Big data: The next frontier
for innovation, competition, and productivity
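As a rough illustration of what "growing 30% per year" implies, the sketch below compounds the 30 million figure quoted on the slide. It assumes the growth rate stays constant, which the slide does not claim; the function names are hypothetical and chosen only for this example.

```python
import math

BASE_NODES = 30_000_000  # sensor-node count quoted on the slide
GROWTH_RATE = 0.30       # 30% per year; assumed constant for this sketch


def projected_nodes(years):
    """Compound the base count forward by the given number of years."""
    return BASE_NODES * (1 + GROWTH_RATE) ** years


def doubling_time():
    """Years needed for the count to double at the assumed rate."""
    return math.log(2) / math.log(1 + GROWTH_RATE)


if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, round(projected_nodes(years)))
    print(round(doubling_time(), 2))
```

Under that constant-rate assumption the count doubles in a little over two and a half years, which is the sense in which the volume of distributed data keeps growing.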
10. Data is: Valuable
$300 billion / year for US health care
60% increase in retail margins
McKinsey Global Institute: Big data: The next frontier
for innovation, competition, and productivity
19. Analytics
were: roll-ups, trends, variance
will be: pattern recognition, data mining, chasing connections
Not just chasing but discovering connections: there is significant value
here. Exploratory.
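To make the contrast concrete, here is a minimal Python sketch of both styles of analysis. Everything in it is an assumption made for illustration: the transaction data is invented, and the function names are hypothetical. The first half computes a roll-up, a trend and a variance; the second half does a very simple form of connection discovery by counting which items appear together.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import pvariance

# Hypothetical transactions: (month, store, items bought together).
TRANSACTIONS = [
    ("2013-01", "north", {"bread", "milk"}),
    ("2013-01", "south", {"bread", "butter"}),
    ("2013-02", "north", {"bread", "milk", "butter"}),
    ("2013-02", "south", {"milk"}),
    ("2013-03", "north", {"bread", "milk"}),
    ("2013-03", "south", {"bread", "milk", "eggs"}),
]


def roll_up(transactions):
    """Roll-up: total number of items sold per month."""
    totals = defaultdict(int)
    for month, _store, items in transactions:
        totals[month] += len(items)
    return dict(sorted(totals.items()))


def trend(totals):
    """Trend: month-over-month change in the rolled-up totals."""
    values = list(totals.values())
    return [later - earlier for earlier, later in zip(values, values[1:])]


def connections(transactions):
    """Connection discovery: count how often each pair of items co-occurs."""
    pairs = Counter()
    for _month, _store, items in transactions:
        pairs.update(combinations(sorted(items), 2))
    return pairs


if __name__ == "__main__":
    monthly = roll_up(TRANSACTIONS)
    print("roll-up:", monthly)
    print("trend:", trend(monthly))
    print("variance:", pvariance(list(monthly.values())))
    print("strongest connection:", connections(TRANSACTIONS).most_common(1))
```

The roll-up, trend and variance answer questions that were chosen in advance. The pair counts surface a relationship nobody asked about directly, which is the exploratory side of the shift described on this slide.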
40. More data, more readily available, requires better access
protection. Protect it internally, from hackers, and from accidental
exposure.
Balance needs against privacy, even given changing expectations
around privacy. Also worry about accuracy.