In April, the growing impact of Big Data and data-driven insight on our daily lives became increasingly apparent. While pundits debated the merits of this sea change in data collection and analysis, its value was borne out this month in intriguing and surprising ways, from revealing why UPS trucks never turn left to exploring whether time travelers are living among us.
2. Learn more about how data science is changing your world at blog.gopivotal.com
Why UPS Trucks Don’t Turn Left
Among data geeks, UPS’s 2004
announcement that its delivery vehicles
would avoid taking left turns to conserve
fuel has long been a source of curiosity. As this Priceonomics post
explains, the company’s idiosyncratic yet data-driven policy has
yielded significant efficiency gains, using simple algorithms to map routes
that maximize right turns. According to the company, since 2012 the policy
has “saved around 10 million gallons of gas and reduced emissions by the equivalent of
taking 5,300 cars off the road for a year.”
http://priceonomics.com/why-ups-trucks-dont-turn-left/?imm_mid=0bae5b&cmp=em-strata-na-na-newsltr_20140416_elist
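As a rough illustration of what "routes that maximize right turns" could mean in code, the sketch below counts the left turns in a route and picks the candidate route with the fewest. It is not UPS's routing algorithm, which the post does not detail. It assumes routes are given as lists of planar (x, y) points with x increasing to the east and y to the north, and the names `turn_direction`, `count_left_turns`, and the two sample routes are hypothetical.

```python
def turn_direction(a, b, c):
    """Classify the turn made at point b when travelling a -> b -> c.

    Uses the sign of the 2D cross product of the two travel segments.
    Assumes x grows eastward and y grows northward (an assumption of
    this sketch, not something stated in the article).
    """
    cross = (b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0])
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "straight"


def count_left_turns(route):
    """Count the left turns along a route given as a list of (x, y) points."""
    triples = zip(route, route[1:], route[2:])
    return sum(1 for a, b, c in triples if turn_direction(a, b, c) == "left")


# Two hypothetical routes around the same block, driven in opposite directions.
route_a = [(0, 0), (2, 0), (2, 2), (0, 2)]  # east, then north, then west
route_b = [(0, 0), (0, 2), (2, 2), (2, 0)]  # north, then east, then south

# Prefer the candidate route with the fewest left turns.
best_route = min([route_a, route_b], key=count_left_turns)
```

Here `route_a` contains two left turns and `route_b` contains none, so `best_route` is `route_b`. A real planner would also have to weigh distance and travel time, which this sketch ignores.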
Data Science: From Half-Baked Ideas to
Data-Driven Insights
In this post for the Wall Street Journal’s CIO Journal, Irving Wladawsky-Berger
provides executives with a high-level overview of the growing importance of
data science within the enterprise, a field he describes as “one of the most
exciting new professions and academic disciplines.”
http://blogs.wsj.com/cio/2014/04/11/data-science-from-half-baked-ideas-to-data-driven-insights/
The Backlash Against Big Data,
Continued
Big Data may have entered the hype cycle’s dreaded trough of
disillusionment if the recent media backlash is any indication, even
though many critiques lack a sophisticated understanding of the tools
and methodologies involved. Mike Loukides at O’Reilly Media pushes
back against the backlash. He acknowledges that data scientists must
be ever-vigilant and skeptical when considering the limitations of
particular methodologies and data sources, but emphasizes that the Big
Data revolution is well underway and powers a great number of
technologies we rely on daily, a trend that will only continue to grow
in future years.
http://radar.oreilly.com/2014/04/the-backlash-against-big-data-continued.html?imm_mid=0ba721&cmp=em-strata-na-na-newsltr_20140409_elist
Is There a Wonk Bubble?
The semi-concurrent launch of Nate Silver’s 538, Vox, and a
slew of data-driven “explainer” sites from big media outlets
like the Washington Post and the New York Times has prompted
much debate this month about the value and potential
limitations of data journalism. In this Politico essay, Felix
Salmon argues that the boom in data journalism is actually a
good thing for the news industry and media junkies alike.
http://www.politico.com/magazine/story/2014/04/is-there-a-wonk-bubble-105473.html?imm_mid=0bb43a&cmp=em-strata-na-na-newsltr_20140423_elist#
The Internet of Things is Great for
Chipmakers And a Challenge for Intel
GigaOM details how the Internet of Things
has the potential to revitalize the chip
industry, noting the number of new
opportunities and challenges that will arise as
companies attempt to bring everyday
physical objects into the connected world.
http://gigaom.com/2014/04/16/the-internet-of-things-is-great-for-chipmakers-and-a-challenge-for-intel/
900 Years of Tree Diagrams, the Most
Important Data Viz Tool in History
It may be the new hotness in boardrooms and shareable viral content, but
data visualization is a centuries-old practice. In this fun post, Wired looks
back at the past 900 years of tree diagrams, which emerged in
the Middle Ages, when an explosion of new
knowledge needed to be categorized and communicated, and draws
parallels with the Big Data explosion of today.
http://www.wired.com/2014/04/tree-diagrams-the-most-important-data-viz-tool-in-history/
The Big Data Challenge to Legacy Data
Management Companies
The New York Times explores how big data software companies are threatening the
profitability of legacy hardware vendors such as Oracle, IBM, Teradata, and others. It
compares the current industry shift to the way microprocessor-based computing drove
mainframe computer prices into the ground.
http://bits.blogs.nytimes.com/2014/04/07/the-big-data-challenge-to-legacy-data-management-companies/?_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&emc=edit_tu_20140407&nl=technology&nlid=7804711&_r=2
Researchers Search for Time Travelers
Using Internet Tools, Clever Statistical
Analysis
A capricious group of Cornell researchers used
data mining and deep statistical analysis to trawl the
web and determine whether there are time travelers
lurking in our midst. Unfortunately for the
sci-fi minded among us, the researchers came up
empty, but in the process they illuminated
the lighter side of data analysis.
http://www.geek.com/science/new-statistical-research-asks-do-time-travelers-walk-among-us-1581169/?imm_mid=0bb43a&cmp=em-strata-na-na-newsltr_20140423_elist
THIS MONTH IN PIVOTAL
DATA SCIENCE
Pivotal’s New Big Data Suite Redefines
the Economics of Big Data Including
UNLIMITED Hadoop to Enterprises
This month, Pivotal changed the economics of Big Data
forever, launching the Pivotal Big Data Suite. It is an annual
subscription-based software, support, and maintenance
package that bundles Pivotal Greenplum Database, Pivotal
GemFire, Pivotal SQLFire, Pivotal GemFire XD, and Pivotal
HAWQ into a flexible pool of big and fast data products.
http://blog.gopivotal.com/pivotal/products/pivotals-new-big-data-suite-redefines-the-economics-of-big-data-including-unlimited-hadoop-to-enterprises
Pivotal Debuts at ApacheCon North
America Thanks For Having Us!
Pivotal’s Roman Shaposhnik reviews ApacheCon 14, which took place
last week in Denver. At Pivotal’s self-described “coming out party” to
the Apache Software Foundation, we worked to make an impression
by starting off with a keynote, presenting and attending various sessions,
and even hosting a cocktail party. In this review of the event,
Shaposhnik also points community members to some of the newer
technologies he believes are worth watching and using right now.
http://blog.gopivotal.com/pivotal/features/pivotal-debuts-at-apachecon-north-america-2014-thanks-for-having-us
Big Data & Brews Video Explains How
Pivotal’s Hadoop Distribution Is Different
In a video interview for the Big Data & Brews series, Pivotal’s
Chief Scientist Milind Bhandarkar shares a beer with Datameer’s
CEO Stefan Groschupf and provides an overview of the many
features that differentiate Pivotal’s Hadoop distribution from the
rest.
http://blog.gopivotal.com/pivotal/products/big-data-brews-video-explains-how-pivotals-hadoop-distribution-is-different
DSC Webinar Series: Data Science for the
99% Open Source Software for Machine
Learning and Analytics
In this webinar, now available to view at Data Science Central, Pivotal’s
Woo J. Jung, Sarah Aerni, and Srivatsan Ramanujam discuss some of the
open source tools in their arsenal. They introduce and provide details on the
variety of open source tools, such as MADlib, PL/R, PL/Python, PivotalR,
PyMADlib, and a host of others, that they have used and extended for
customer engagements.
http://www.datasciencecentral.com/video/dsc-webinar-series-data-science-for-the-99-open-source-software
Time Series Analysis Part 3:
Resampling and Interpolation
The previous blog posts in this series showed
how window functions can be used for many
types of ordered data analysis. This post
elaborates on how these techniques can be
extended to handle time series resampling and
interpolation.
http://blog.gopivotal.com/tag/data-science#sthash.RCeaWqlT.dpuf
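For readers unfamiliar with the terms, resampling places irregularly spaced observations onto a regular time grid, and interpolation estimates the values at grid points that fall between observations. The sketch below shows the idea with linear interpolation in plain Python. It is not the SQL window-function approach the post itself describes, and the function name `resample_linear` and the sample data are hypothetical.

```python
from bisect import bisect_left


def resample_linear(times, values, start, stop, step):
    """Resample (times, values) onto a regular grid by linear interpolation.

    `times` must be sorted in ascending order. Grid points outside the
    observed range get None instead of an extrapolated value.
    """
    resampled = []
    t = start
    while t <= stop:
        i = bisect_left(times, t)
        if i < len(times) and times[i] == t:
            # The grid point coincides with an observation.
            resampled.append((t, values[i]))
        elif i == 0 or i == len(times):
            # Before the first or after the last observation.
            resampled.append((t, None))
        else:
            # Interpolate between the neighbouring observations.
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            fraction = (t - t0) / (t1 - t0)
            resampled.append((t, v0 + fraction * (v1 - v0)))
        t += step
    return resampled


# Hypothetical readings taken at irregular times (in seconds).
times = [0, 4, 6, 10]
values = [0.0, 8.0, 6.0, 10.0]

# Resample onto a grid with one point every 2 seconds.
grid = resample_linear(times, values, start=0, stop=12, step=2)
```

With this data, the grid points at 2 and 8 seconds are interpolated to 4.0 and 8.0, and the point at 12 seconds is None because it lies past the last reading.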
UPCOMING EVENTS
Parquet: Open-Source Columnar
Format For Hadoop
Thursday, May 15, 2014
5:45 PM to 8:30 PM
Pivotal Labs
San Francisco, CA
Twitter’s Dmitriy Ryaboy and Pivotal’s Milind Bhandarkar discuss Parquet, an open source
project implementing columnar storage that supports deeply nested structures, efficient
encoding and column compression schemes, and is designed to be compatible with a
variety of higher-level type systems. In this talk, they will go over the Parquet design, use
cases, and performance numbers.
http://www.meetup.com/Pivotal-Open-Source-Hub/events/177942192/
Hadoop Cluster Configuration and
Performance Tuning
Tuesday, May 20, 2014
5:45 PM to 8:30 PM
Pivotal Labs
San Francisco, CA
Configuring and operating a Hadoop cluster is still not a trivial task and requires special
consideration. In this talk, Pivotal’s Suhas Gogate will provide various tips to configure a
Hadoop cluster and to analyze and tune the performance of Map/Reduce applications. He
will also demo “Hadoop Vaidya”, a performance advisor for Hadoop M/R, which he
submitted as a Hadoop contrib project.
http://www.meetup.com/Pivotal-Open-Source-Hub/events/178861422/
2014 Hadoop Summit
June 3–5, 2014
San Jose, CA
The 7th Annual Hadoop Summit will feature many of the Apache Hadoop thought leaders
who will showcase successful Hadoop use cases, share development and administration
tips and tricks, and educate organizations about how best to leverage Apache Hadoop as a
key component in their enterprise data architecture.
http://hadoopsummit.org/san-jose/