BIG DATA AT HUMAN SCALE.
Matt LeMay, @mattlemay
BIG DATA
IS BIG
How BIG is it?
We have built the capacity to store
more bytes of data than the
Earth has grains of sand.
... about 315 times more.
If each bit of data we have the capacity to
store were to represent a star, then
there would be a GALAXY OF
DATA for every person on Earth.
The data Walmart generates every hour from
its customer transactions represents 167 times the
information contained in all the books in the United
States Library of Congress.
PWNED
The number of bytes
we’ve built the
capacity to store
constitutes only a
TINY FRACTION
of the number of
atoms you have in
your body.
... or the amount of
data stored in your
DNA.
In fact, the data storage capacity of the entire
world is less than one percent of the information
stored in the DNA molecules of a single person.
as we approach human scale...
...big data seems smaller.
... but it’s bigger than it’s ever been before.
ALL the data
created until the
year 2003
=
ALL the data
created every
two days
Scale of Data ~3,000 Years Ago:
Scale of Data ~300 Years Ago:
Scale of Data ~30 Years Ago:
Scale of Data ~3 Years Ago:
We’ve been writing stuff on walls for 30,000 years...
... and we’re still not entirely sure what it all means.
“BIG DATA” is US*,
in higher resolution.
“We’re distracted by a bunch of nonsense.”
“Ephemeral thoughts and actions, which were once
lost to time, are now recorded forever.”
That record is “BIG DATA.”
According to , 43% of all data
gathered on people comes from social media.
We overshare compulsively, but we are more
concerned than ever before about our privacy.
Privacy vs Permission
Privacy = “My data is valuable, and
others want access so that they can spy
on me or sell me stuff I don’t want.”
Permission = “My data is valuable, so
I will explicitly grant others access to it
in specific situations where it is
worthwhile for me to do so.”
Privacy is something we need to worry about
when expectations are violated around the
permissions we agree to.
Even explicit permission...
... doesn’t override expectation.
... often struggles to square permission with
expectation, at times to their own detriment.
weknowwhatyouredoing.com
We expect clicks to be private gestures,
and shares to be public gestures.
Facebook’s social reader violated those
expectations.
We share who we want to be.
We click who we fear we are.
... and it matters.
We share our
information
because we trust
that sharing will
make it more
valuable to us.
“The future has an ancient heart.”
- Carlo Levi
My data Your data
BIG DATA “MAGIC”
Me You
BIG DATA “MAGIC”
“HADOOP!”
MAGICKAL RABBITS OF INSIGHT!!11
Me You
... but “BIG DATA” is not magic.
“MAGIC BIG DATA TECHNOLOGY”
is a set of tools...
... necessitated by scale.
- Tim O’Brien, O’Reilly Strata Conference
COUNTING
is not
UNDERSTANDING
THE ALGORITHM
WON’T SAVE YOU
BIG DATA is only as
good as the questions
we ask of it.
... and many of those questions haven’t changed.
Loyalty clubs and targeted coupons are the
oldest trick in the “big data” book.
- Andrew Pole, Target
Big Data could make advertising and
marketing better.*
(Which will, in turn, hopefully pay for all those nifty services we use to generate all that data.)
Twitter Search == BIG Data.
... but the potential goes beyond advertising.
When done right, BIG DATA encourages
you to SHARE MORE, not less.
“BIG DATA” is all around us.
...and it doesn’t feel ZOMG WORLD-CHANGING
... because it’s in our cells.
Thank you.
Questions?
@MATTLEMAY

"Big Data at Human Scale," Wharton Web Conference 2013