MIT Brown Bag Session:
Making Decisions in a
World Awash in Data:
We’re going to need a
different boat!
Anthony J. Scriffignano, Ph.D.
SVP/Chief Data Scientist
Dun & Bradstreet
14 November 2016
2
The Jetsons image is licensed under Creative Commons
We need to seriously think
about the implications of
trivial inference from data…
3
How is the data landscape changing?
What are we doing in data science to respond?
What type of skills and thinking are required to
remain relevant in this evolving world?
4
The New Normal:
How the data around us
continues to change
What’s changing?
What’s New?
When do we have enough?
5
We must critically observe what is changing
Reading about events
that happened in the
past, listening to people
who are not present.
Reading about things in the “now” – can’t
tell who is communicating with whom –
does anybody really know what is going
on?
Courtesy Get Smart (1965–1970)
Information was created and
shared in silos
According to Google Plus, precisely five photographs were ever taken of Neil Armstrong while Apollo 11 operated on the surface of the moon. Only four of those
photos show Armstrong outside the Lunar Module and actually moonwalking. Only three of them show Armstrong in direct view, rather than a reflection. (Aug 31, 2012)
6
A qualitative look at a shorter period from the
perspective of an extensible analytical frame…
What about the digital footprint of all of the
smartphones?
What about the social networks of the crowd?
What about the metadata in the photos?
What are the opportunity costs to other
activities?
The largest corpus of data preceded the event
Most data created about the event had
significant and asymmetric latency
The rate of “data decay” attributable to the
participants in the event is significant
Pope Benedict Inauguration
Pope Francis Inauguration
BEWARE THE DANGERS OF MAKING
INFERENCES OUTSIDE FRAMES OF
REFERENCE. KNOW THE RULES OF
WHEN YOU CAN, OR PAY THE PRICE!
7
The burning platform…
Too much data?
Different objectives?
8
When is enough enough?
Identify the Scenario
• Relative size
• Key question
Assess Dispositive Threshold
• Estimate
• Triangulate
Decision Elasticity
• Bias
• Opportunity cost
DATA IN HAND
DISCOVERABLE DATA
EXISTING BUT INACCESSIBLE DATA
@Scriffignano1
9
Challenging assumptions: Taking a More Sophisticated View of “Big Data”
VOLUME
Data sensing
Curation, data at rest vs. data in motion
More is not necessarily better
VELOCITY
Judging
Simultaneity of truth
Time of curation vs. time of creation
The myth of real-time data
VERACITY
Triangulation, non-regressive methods
Malfeasance innovation, regulatory
All true data is not necessarily simultaneously true
VARIETY
Entity extraction, discovery
Data that exists but is unavailable, unstructured
More data is being disregarded than used
VALUE
Disambiguation
Opportunity cost of curation, single-use data
Value deteriorates at an alarming rate
10
This change gives rise to choices about “where to land”
• Where to “land”
• How to measure in a changing environment
• Implications of choosing poorly
11
How are We Reacting:
Bringing data science into the room
Bringing Data Science into the room
“Black cat” problems
Emerging relationships
Quantum observation / thinking
Emerging Digital Technologies: Behaviors
In scope for today’s discussion
Columns: Emerging Trends; Description; Implications (Opportunities, Risks)
Emerging trend: Adoption of Natural Language Processing
Description:
• Semantic disambiguation: automated parsing and analysis of unstructured and structured text, and spoken language or, in a more advanced state, across multiple languages
Opportunities:
• Detecting conversations of interest in the ocean of social chatter
• Foundational capability for detecting relationships between entities of interest and their behaviors
Risks:
• Falling behind “bad guys,” who may use increasingly sophisticated “cloaking” techniques in digital communications
Emerging trend: Export of Fintech innovation into other areas
Description:
• Social decision-making
• Blockchain and related technologies: promotes transparency and security of transactions
Opportunities:
• Supports regulatory efforts, e.g. privacy
• Exporting techniques outside of finance
Risks:
• New malfeasance, especially with alternate digital identity
• Dispositive threshold
Emerging trend: Emergence of AI in place of human decision makers
Description:
• Relegation of decision making to algorithms, e.g. in trading, but potentially in healthcare, transportation, energy, etc.
Opportunities:
• Significantly faster, cheaper and more efficient outcomes
• Drives scale / consistency
Risks:
• Cybersecurity is even more critical
• Economically optimal and socially beneficial?
• Perpetuating heuristic error
13
How data is discovered and used is constantly changing, and so is its context.
Situational awareness…
Unstructured Data
• More than 85% of data creation is unstructured
• Language and use of language are constantly evolving
• Commonly available tools and solutions only address a small part of this space
14
Focusing on smaller parts of the problem space…
CHALLENGES
The “John Smith” problem
Caroline M Smith
302 N Liberty St.
Albion, IA
Addr. Type: Residential
Carrie Smith
Monolith Corporation
1716 Locust St.
Des Moines, IA
Addr. Type: Commercial
Caroline Smith
University of Iowa
21 E Market St.
Iowa City, IA
Addr. Type: Commercial
The “Sybil” problem
Carrie Smith
Tenderheart Daycare
2635 Cleveland Dr.
Adel, IA
Addr. Type: Commercial
The “Ann Taylor” problem
PROGRESSIVE DECOMPOSITION
Who is speaking?
About whom?
How do they feel?
In what context?
EPISTEMOLOGY
Dispositive threshold
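The naming problems on this slide can be made concrete with a small matching sketch over the four records shown. It assumes (an interpretation, not something the slide states) that the “John Smith” problem concerns different people sharing a similar name and the “Sybil” problem concerns one person appearing under several identities. The nickname table and the `candidate_key` and `group_candidates` helpers are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Records as shown on the slide: (name, organization, city).
RECORDS = [
    ("Caroline M Smith", None, "Albion, IA"),
    ("Carrie Smith", "Monolith Corporation", "Des Moines, IA"),
    ("Caroline Smith", "University of Iowa", "Iowa City, IA"),
    ("Carrie Smith", "Tenderheart Daycare", "Adel, IA"),
]

# Hypothetical nickname table; a real system would need a far richer one.
NICKNAMES = {"carrie": "caroline"}


def candidate_key(name: str) -> tuple[str, str]:
    """Reduce a name to (canonical first name, last name), ignoring middle initials."""
    parts = name.lower().replace(".", "").split()
    first, last = parts[0], parts[-1]
    return NICKNAMES.get(first, first), last


def group_candidates(records):
    """Group records whose names collapse to the same key."""
    groups = defaultdict(list)
    for record in records:
        groups[candidate_key(record[0])].append(record)
    return dict(groups)


groups = group_candidates(RECORDS)
for key, members in groups.items():
    print(key, "->", len(members), "candidate records")
```

All four records fall into a single candidate group, which is the point: the name alone cannot settle whether they describe one person or four, so further evidence (address type, employer, timing) is needed before deciding.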
15
At the same time, we must continue to innovate in traditional spaces
where we have strong enabling capabilities.
16
Intentionally focusing on smaller parts of the problem space…
THE CHALLENGE
The “John Smith” problem
Caroline M Smith
302 N Liberty St.
Albion, IA
Addr. Type: Residential
Carrie Smith
Monolith Corporation
1716 Locust St.
Des Moines, IA
Addr. Type: Commercial
Caroline Smith
University of Iowa
21 E Market St.
Iowa City, IA
Addr. Type: Commercial
The “Sybil” problem
Carrie Smith
Tenderheart Daycare
2635 Cleveland Dr.
Adel, IA
Addr. Type: Commercial
What would it look like?
The “Ann Taylor” problem
Who is speaking?
About whom?
How do they feel?
In what context?
Epistemology
17
Simple mean of sentiment
Weighted mean of sentiment
Standard deviation of sentiment
Thought experiment (assuming completion)
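The three measures named on this slide can be sketched as follows. The sentiment scores, the weights, and the idea of weighting by something like audience reach are all made-up assumptions for illustration; the slide does not say how sentiment is scored or weighted.

```python
import math


def simple_mean(scores):
    """Unweighted average of the sentiment scores."""
    return sum(scores) / len(scores)


def weighted_mean(scores, weights):
    """Average in which each score counts in proportion to its weight."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def std_dev(scores):
    """Population standard deviation: how widely the scores are spread."""
    mu = simple_mean(scores)
    return math.sqrt(sum((s - mu) ** 2 for s in scores) / len(scores))


# Made-up sentiment scores in [-1, 1] and made-up weights (e.g. audience reach).
scores = [0.5, -0.5, 1.0, 0.0]
weights = [1, 3, 2, 2]

print(simple_mean(scores))
print(weighted_mean(scores, weights))
print(std_dev(scores))
```

A gap between the simple and weighted means suggests that the heavily weighted voices feel differently from the rest, and a large standard deviation suggests the mean is hiding disagreement.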
18
New Answers…
Are there identifiable
modes? Do they change
when speaking about
their own enterprise?
Is the leadership leading
or lagging mean
sentiment? How does this
change over time? How
do leaders influence?
19
To avoid getting distracted by “all things social”, the
science involves continuous evolution and focus on
specific use cases that drive value
Sarcasm
ABC corporation is a wonderful
company, if you don’t do business
with them.
Neologism
Be sure to like us on FaceBook and
use #shallow when you Tweet.
Grammar variations
FBI is Hunting Terrorists With
Explosives.
Punctuation / Intrusion of foreign language
“Hi mom!” vs. “Hi, mom?”
Intentional mis-spelling
RU There?
Context /
Behavior
Sentiment
Attribution
Entity
Extraction
USE CASES
CONFOUNDING CHARACTERISTICS
DERIVING EMPIRICAL MEASURES THAT INFORM USE CASES
D&B proprietary information, do not distribute or copy without permission
Feedback
20
What do we have to believe?
21
Watch this space…
• Inter- and intra-language correlation: deciding when things mean the same thing
• Inter- and intra-language transformation: transforming inference among languages
• Changing behavior to attract/obviate grapheme analysis: reacting to changing language
• Emerging “metalanguage” (e.g. “textspeak”): reacting to language about language
• A language of “things”: reacting to new languages used by automation
• Using language to hide language: reacting to attempts to obscure via language
• Unicode is not universal…: understanding the limitations of automation
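One small, concrete instance of the Unicode point is that two strings can look identical on screen yet compare as different. The sketch below uses Python's standard `unicodedata` module; the choice of the word “café” and of NFC normalization is an assumption made for illustration and is not drawn from the talk.

```python
import unicodedata

composed = "caf\u00e9"      # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (U+0301)

# Raw comparison looks at code points, not at what a reader sees.
same_raw = composed == decomposed

# Normalizing both strings to the same form (NFC) makes them comparable.
same_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(same_raw, same_nfc, len(composed), len(decomposed))
```

Automation that matches names or keywords without normalizing first can miss such pairs, and normalization itself only covers some of the ways text can differ while looking the same.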
Emerging Digital Technologies: Relationships
In scope for today’s discussion
Columns: Emerging Trends; Description; Implications (Opportunities, Risks)
Emerging trend: Connected Space
Description:
• Discovery of complex counterparty relationships: construction of an n-dimensional space where dyadic relationships are established into a connected graph
Opportunities:
• Discovering relationships between entities of interest and their behavioral patterns
Risks:
• Changing the environment by measuring it (e.g. fraud) and creating smarter malfeasance
• Confirmation bias based on available data
Emerging trend: Large Scale Machine Learning
Description:
• Intelligence models which rely on computational intelligence: automated learning, classification and storage
Opportunities:
• Extending discovery of relationships beyond rule-based approaches
• New valuable insights, esp. counter-intuitive and paradoxical
Risks:
• “Garbage in, garbage out”
• Exporting human bias into training sets
Emerging trend: Digital Everything
Description:
• Scientific concepts become computable as data is digitized in new and more nuanced ways (e.g. digital X-ray, fitness monitors, autonomous vehicles)
Opportunities:
• Discovery of patterns and solutions that are theoretically “invisible”
Risks:
• A powerful weapon in hands of a “bad guy”
• Laws will always lag evolution
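The idea of assembling dyadic relationships into a connected graph can be sketched in a few lines. The entity names are hypothetical, and breadth-first search is simply one common way to find connected groups; nothing here is presented as the method actually used at Dun & Bradstreet.

```python
from collections import defaultdict, deque

# Hypothetical dyadic (pairwise) relationships between entities.
DYADS = [
    ("Company A", "Company B"),
    ("Company B", "Director X"),
    ("Company C", "Director Y"),
]


def build_graph(dyads):
    """Turn pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for left, right in dyads:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def connected_components(graph):
    """Group entities that are linked directly or through intermediaries."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


components = connected_components(build_graph(DYADS))
for component in components:
    print(sorted(component))
```

Company A and Director X never appear in the same pair, yet they land in the same component through Company B, which is the kind of indirect relationship that pairwise records alone do not reveal.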
23
The Black Cat Problem
… but also some very ominous
Dealing with “Black Cat” problems
• Signals
• Systemic measures
• Anomaly detection
• Isotropism
• Character / quality measures
• Data sensing
• Triggers
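Of the approaches listed, anomaly detection is the easiest to illustrate briefly. The sketch below flags values that sit far from the median, using the median absolute deviation; the sample values are invented, the 0.6745 constant and 3.5 threshold follow a common convention for the modified z-score, and the talk does not say which anomaly detection technique is meant.

```python
import statistics


def flag_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no measurable spread, so nothing can be scored
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


# Invented daily counts of some business signal; the last one is unusual.
daily_counts = [10, 11, 9, 10, 12, 10, 48]
anomalies = flag_anomalies(daily_counts)
print(anomalies)
```

A median-based measure is used here because a single extreme value inflates the mean and standard deviation, which can hide the very point being looked for.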
24
Visualizing extremely complex, changing relationships
addresses questions never before feasible
Dyadic relationships across
multiple perspectives
Abstracting dimensions
Time
“What If”
scenarios
News
Social
signals
Market
Data
Events
Internal
Data
Some examples
• Understanding Signals derived from changes to business information
• Discovering and investigating clusters of unusual behavior
• Exploring the impact of new regulation
• Applying standard measures to a highly dynamic environment
• Exploring the impact of new market forces
• Studying the real or potential impact of supply chain interruptions
• Investigating emerging capabilities (e.g. reputational risk)
Asking new questions never before feasible
Observing key
measures over time
Blending with
similarly
constructed graphs
Understanding language as a changing phenomenon
leads to greater computational capability with
regard to changing behavior
Money laundering
Corporate Identity Theft
Bust-out
Shell Company
Trade Rings
TRADITIONAL VIEW
MORE NUANCED VIEW
Cybersecurity - inside out/outside in
Data sovereignty
Permissible use
Discovering prior behavior
vs. emerging behavior in
extremely large sets of data
25
Manifesting new relationships in our behavior
Data Privacy
Expressed Consent
Intellectual Property
Transferring data across borders
New data assets
Compliance with industry
standards and best practices
26
Data sovereignty
Cybersecurity
Integrated global value chain
Emergency preparedness
National security: cyber-everything
Border protection
Balance of Trade
27
The Path Ahead:
Creating a mindset in the
organization that evolves
Where do we need to focus
Future challenges
Emerging Digital Technologies: “Things”
In scope for today’s discussion
Columns: Emerging Trends; Description; Implications (Opportunities, Risks)
Emerging trend: Changing Nature of Computing
Description:
• Neuromorphic computing: biologically-inspired techniques aiming to mimic human thought
• Quantum computing: non-Boolean physical and algorithmic devices
• Computing as a commodity: ubiquitous access to unlimited computational power on a “pay-as-you-go” model
• Analytics / Data Science as a Service: attempts to deliver capabilities down-market or in a more scalable way
Opportunities:
• Solutions to previously intractable problems enabled by increased “parallelism,” and ability to compute on non-binary concepts
Risks:
• “Bad guys” get access to unprecedented computing power and techniques
• Need to rethink cybersecurity
Emerging trend: Internet of Things
Description:
• Objects connected on a network, sending and receiving data, distinctions by use (e.g. Industrial)
Opportunities:
• Gleaning previously unobserved patterns of behaviors and relationships
Risks:
• Cybersecurity and privacy challenges
• Authentication / validation
29
Quantum, Neuromorphic Computing
Technology to go beyond digital / Boolean computation
Neural Networks – integrated computing models inspired by biological analogs
Quantum Computing – Continuous, non-Boolean computational engines and algorithms
Substitution of Addition/Multiplication/AND & OR with Translation/Rotation & Expectation
Potential applications:
• Quicker non-deterministic searches from Quantum Algorithms
• Reconstruction of implied data/metadata models
• Quantum Pattern Discovery (e.g. anisotropism, neosophism)
30
Quantum Computing
Information is stored in qubits. A qubit can be in a basis state (|0>, |1>) or any superposition of these, and several qubits can be entangled, so that their joint state lives in a tensor product space and cannot be described one qubit at a time.
Each additional qubit doubles the number of amplitudes needed to describe the state.
Calculations are performed by unitary transformations, essentially norm-preserving rotations, on the qubit states.
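The qubit description on this slide can be illustrated with a single qubit and one standard gate. This is a minimal sketch in plain Python that simulates the arithmetic on a classical machine; the helper names `apply_gate`, `probabilities`, and `amplitude_count` are hypothetical, and the Hadamard gate is chosen only because it is a well-known unitary transformation.

```python
import math


def apply_gate(gate, state):
    """Multiply a 2x2 matrix by a 2-element state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def probabilities(state):
    """Measurement probabilities are the squared magnitudes of the amplitudes."""
    return [abs(amplitude) ** 2 for amplitude in state]


def amplitude_count(n_qubits):
    """Number of complex amplitudes needed to describe n qubits."""
    return 2 ** n_qubits


ZERO = [1 + 0j, 0 + 0j]  # the basis state |0>
ROOT_HALF = 1 / math.sqrt(2)
H = [[ROOT_HALF, ROOT_HALF],
     [ROOT_HALF, -ROOT_HALF]]  # Hadamard gate, a standard unitary

superposed = apply_gate(H, ZERO)   # equal superposition of |0> and |1>
probs = probabilities(superposed)
back = apply_gate(H, superposed)   # applying H again returns to |0>

print(probs, sum(probs), amplitude_count(3))
```

The probabilities still sum to one after the gate is applied, which is what it means for the transformation to be unitary, and applying the same gate twice undoes it, which shows the operation is reversible.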
31
Food for Thought…
Traditional thinking
• Deterministic add/change/delete/search
• Workflow
• Value chain dynamics
• Regressive analysis
Quantum thinking
• Non-deterministic approach to monotonicity
• Probabilistic workflow / heuristics
• Value chain order-n understanding
• Non-regressive analysis, especially with new data/new behavior
32
Machine Learning / Learning from Machines
33
Watch this space…
34
It is very important to have a working definition of
“Innovation”
New ways of doing old things?
Doing new things?
(Traditional)
Ways of doing new things?
Creating simpler unsolved
problems?
New ways of knowing?
New ways of cheating?
35
Reflecting on the “Data Journey” of Large Multinational Organizations
From focus on data
to focus on meaning
From analytic focus
to holistic focus
From ingestion
to curation
From early adoption to
integrated value chain
Technology
Process
Mindset
People
36
The Evolving Leadership Mindset for Data-Inspired Organizations
Initial Focus
• Awareness of a skills gap
• Recognition of significant risk and opportunity
• Some degree of feeling overwhelmed by the “new”
• Silos of evolution, internal competition due to new roles/responsibilities
Emerging Mindset
• Focus on evolving the existing workforce and hiring new skills
• As landscape rapidly changes, skills required shift from tools to competencies
• More focus on governance, provenance, security, malfeasance
• More educated customers placing higher demands on the organization
Modern Mindset
• Breaking down silos of data and innovation
• Ability to measure and respond in a truly agile fashion
• Reflective leaders who constantly re-evaluate and constantly learn
• Analytical, data-based decisions evolve through new methods, learning
37
Reflecting on the Journey -- Things to Consider
Too Much Data?
• Start with a problem or question, not a tool or dataset
• Understand the going-in assumptions
• Continuously evaluate how the environment is changing
New Types of Data
• Select methods (especially non-regressive) carefully
• Use new methods and visualizations for a reason, not as an expedient
• Be aware of “Black Cat” problems
The Burning Platform
• Continuously evaluate new skills and capabilities
• Continuously evaluate new ways of knowing, breaking down problems into smaller pieces, reducing complexity
• The decision to do nothing is a decision to sink further behind
38
We can’t solve problems by using
the same kind of thinking we used
when we created them.
Anthony Scriffignano, Ph.D., SVP / Chief Data Scientist
scriffignanoa@dnb.com
@SCRIFFIGNANO1
Thank You

Mais conteúdo relacionado

Mais procurados

Critically Assembling Data, Processes & Things: Toward and Open Smart City
Critically Assembling Data, Processes & Things: Toward and Open Smart CityCritically Assembling Data, Processes & Things: Toward and Open Smart City
Critically Assembling Data, Processes & Things: Toward and Open Smart City
Communication and Media Studies, Carleton University
 
Black Box Learning Analytics? Beyond Algorithmic Transparency
Black Box Learning Analytics? Beyond Algorithmic TransparencyBlack Box Learning Analytics? Beyond Algorithmic Transparency
Black Box Learning Analytics? Beyond Algorithmic Transparency
Simon Buckingham Shum
 
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain
 
Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...
Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...
Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...
summersocialwebshop
 

Mais procurados (20)

Comments to FTC on Mobile Data Privacy
Comments to FTC on Mobile Data PrivacyComments to FTC on Mobile Data Privacy
Comments to FTC on Mobile Data Privacy
 
Critically Assembling Data, Processes & Things: Toward and Open Smart City
Critically Assembling Data, Processes & Things: Toward and Open Smart CityCritically Assembling Data, Processes & Things: Toward and Open Smart City
Critically Assembling Data, Processes & Things: Toward and Open Smart City
 
Available Data Science M.Sc. Thesis Proposals
Available Data Science M.Sc. Thesis Proposals Available Data Science M.Sc. Thesis Proposals
Available Data Science M.Sc. Thesis Proposals
 
"Reproducibility from the Informatics Perspective"
"Reproducibility from the Informatics Perspective""Reproducibility from the Informatics Perspective"
"Reproducibility from the Informatics Perspective"
 
Black Box Learning Analytics? Beyond Algorithmic Transparency
Black Box Learning Analytics? Beyond Algorithmic TransparencyBlack Box Learning Analytics? Beyond Algorithmic Transparency
Black Box Learning Analytics? Beyond Algorithmic Transparency
 
Ethics for Conversational AI
Ethics for Conversational AIEthics for Conversational AI
Ethics for Conversational AI
 
Automating Homelessness
Automating HomelessnessAutomating Homelessness
Automating Homelessness
 
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
 
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCESBROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
 
What's up at Kno.e.sis?
What's up at Kno.e.sis? What's up at Kno.e.sis?
What's up at Kno.e.sis?
 
Linking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesLinking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual Archives
 
Fact Checking & Information Retrieval
Fact Checking & Information RetrievalFact Checking & Information Retrieval
Fact Checking & Information Retrieval
 
Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...
Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...
Jessica Vitak, "When Contexts Collapse: Managing Self-Presentation Across Soc...
 
Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Text REtrieval Conference (TREC) Dynamic Domain Track 2015Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Text REtrieval Conference (TREC) Dynamic Domain Track 2015
 
Info leakage 200510
Info leakage 200510Info leakage 200510
Info leakage 200510
 
Rogers data days_2014_slides_opti
Rogers data days_2014_slides_optiRogers data days_2014_slides_opti
Rogers data days_2014_slides_opti
 
Reputation Management for Early Career Researchers
Reputation Management for Early Career ResearchersReputation Management for Early Career Researchers
Reputation Management for Early Career Researchers
 
Qual, Mixed, Machine and Everything in Between
Qual, Mixed, Machine and Everything in BetweenQual, Mixed, Machine and Everything in Between
Qual, Mixed, Machine and Everything in Between
 
Rogers studyingpoliticalissues mar2014_optimized_ii_
Rogers studyingpoliticalissues mar2014_optimized_ii_Rogers studyingpoliticalissues mar2014_optimized_ii_
Rogers studyingpoliticalissues mar2014_optimized_ii_
 
WiNLP2020 Keynote "Challenges for Conversational AI: Reflections on Gender Is...
WiNLP2020 Keynote "Challenges for Conversational AI: Reflections on Gender Is...WiNLP2020 Keynote "Challenges for Conversational AI: Reflections on Gender Is...
WiNLP2020 Keynote "Challenges for Conversational AI: Reflections on Gender Is...
 

Destaque

BROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKEL
BROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKELBROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKEL
BROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKEL
Micah Altman
 
BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...
BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...
BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...
Micah Altman
 
Redistricting Algorithms
Redistricting AlgorithmsRedistricting Algorithms
Redistricting Algorithms
Micah Altman
 
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
Micah Altman
 

Destaque (20)

The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
 
Gary Price, MIT Program on Information Science
Gary Price, MIT Program on Information ScienceGary Price, MIT Program on Information Science
Gary Price, MIT Program on Information Science
 
Data Citation Rewards and Incentives
 Data Citation Rewards and Incentives Data Citation Rewards and Incentives
Data Citation Rewards and Incentives
 
Can computers be feminist? Program on Information Science Talk by Gillian Smith
Can computers be feminist? Program on Information Science Talk by Gillian SmithCan computers be feminist? Program on Information Science Talk by Gillian Smith
Can computers be feminist? Program on Information Science Talk by Gillian Smith
 
BROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKEL
BROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKELBROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKEL
BROWN BAG: THE VISUAL COMPONENT: MORE THAN PRETTY PICTURES - WITH FELICE FRANKEL
 
BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...
BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...
BROWN BAG TALK WITH CHAOQUN NI- TRANSFORMATIVE INTERACTIONS IN THE SCIENTIFIC...
 
Software Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanSoftware Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental Scan
 
Program on Information Science Brown Bag:David Weinberger on Libraries as Pla...
Program on Information Science Brown Bag:David Weinberger on Libraries as Pla...Program on Information Science Brown Bag:David Weinberger on Libraries as Pla...
Program on Information Science Brown Bag:David Weinberger on Libraries as Pla...
 
Dulin PermaCC Talk for MIT PIS
Dulin PermaCC Talk for MIT PISDulin PermaCC Talk for MIT PIS
Dulin PermaCC Talk for MIT PIS
 
Brown Bag: DMCA §1201 and Video Game Preservation Institutions: A Case Study ...
Brown Bag: DMCA §1201 and Video Game Preservation Institutions: A Case Study ...Brown Bag: DMCA §1201 and Video Game Preservation Institutions: A Case Study ...
Brown Bag: DMCA §1201 and Video Game Preservation Institutions: A Case Study ...
 
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALSBROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
 
Ndsa 2016 opening plenary
Ndsa 2016 opening plenaryNdsa 2016 opening plenary
Ndsa 2016 opening plenary
 
Redistricting Algorithms
Redistricting AlgorithmsRedistricting Algorithms
Redistricting Algorithms
 
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
 
Complicating the Question of Access (and Value) with University Press Publica...
Complicating the Question of Access (and Value) with University Press Publica...Complicating the Question of Access (and Value) with University Press Publica...
Complicating the Question of Access (and Value) with University Press Publica...
 
Brown Bag: New Models of Scholarly Communication for Digital Scholarship, by ...
Brown Bag: New Models of Scholarly Communication for Digital Scholarship, by ...Brown Bag: New Models of Scholarly Communication for Digital Scholarship, by ...
Brown Bag: New Models of Scholarly Communication for Digital Scholarship, by ...
 
10 SIMPLE STEPS TO BUILDING A REPUTATION AS A RESEARCHER, IN YOUR EARLY CAREER
10 SIMPLE STEPS TO BUILDING A REPUTATION AS A RESEARCHER, IN YOUR EARLY CAREER10 SIMPLE STEPS TO BUILDING A REPUTATION AS A RESEARCHER, IN YOUR EARLY CAREER
10 SIMPLE STEPS TO BUILDING A REPUTATION AS A RESEARCHER, IN YOUR EARLY CAREER
 
Academic social networking sites
Academic social networking sitesAcademic social networking sites
Academic social networking sites
 
Con3036 soaring-through-the-clouds-oow2016-160920214845
Con3036 soaring-through-the-clouds-oow2016-160920214845Con3036 soaring-through-the-clouds-oow2016-160920214845
Con3036 soaring-through-the-clouds-oow2016-160920214845
 
Test driven cloud development using Oracle SOA CS and Oracle Developer CS
Test driven cloud development using Oracle SOA CS and Oracle Developer CSTest driven cloud development using Oracle SOA CS and Oracle Developer CS
Test driven cloud development using Oracle SOA CS and Oracle Developer CS
 

Semelhante a Making Decisions in a World Awash in Data: We’re going to need a different boat : Anthony Scriffignano’s Talk for the MIT Program on Information Science

Emergent MEDIA, NEXT GEN THINKING
Emergent MEDIA, NEXT GEN THINKINGEmergent MEDIA, NEXT GEN THINKING
Emergent MEDIA, NEXT GEN THINKING
Ann DeMarle
 
Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...
Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...
Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...
Michael Edson
 
Ppt shark global forum session 3 2012 v4
Ppt shark global forum session 3 2012 v4Ppt shark global forum session 3 2012 v4
Ppt shark global forum session 3 2012 v4
GlobalForum
 
Carl Miller
Carl MillerCarl Miller
Carl Miller
MRS
 

Semelhante a Making Decisions in a World Awash in Data: We’re going to need a different boat : Anthony Scriffignano’s Talk for the MIT Program on Information Science (20)

Corso pisa-2 dh-2017
Corso pisa-2 dh-2017Corso pisa-2 dh-2017
Corso pisa-2 dh-2017
 
The Ethics of Structured Information
The Ethics of Structured InformationThe Ethics of Structured Information
The Ethics of Structured Information
 
How to Think in the Information Age: Finding Facts in a Post-Truth World
How to Think in the Information Age: Finding Facts in a Post-Truth WorldHow to Think in the Information Age: Finding Facts in a Post-Truth World
How to Think in the Information Age: Finding Facts in a Post-Truth World
 
Emergent MEDIA, NEXT GEN THINKING
Emergent MEDIA, NEXT GEN THINKINGEmergent MEDIA, NEXT GEN THINKING
Emergent MEDIA, NEXT GEN THINKING
 
Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...
Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...
Michael Edson, Relevance, Existence, and Smithsonian Strategy, for OCLC "Web ...
 
The Problem with dots: A critique of the Lessig and Murray models
The Problem with dots: A critique of the Lessig and Murray modelsThe Problem with dots: A critique of the Lessig and Murray models
The Problem with dots: A critique of the Lessig and Murray models
 
CKX: The Dark Side of Data
CKX: The Dark Side of DataCKX: The Dark Side of Data
CKX: The Dark Side of Data
 
SXSW 2012 - Big Data Conversation
SXSW 2012 - Big Data ConversationSXSW 2012 - Big Data Conversation
SXSW 2012 - Big Data Conversation
 
2008 earth intelligence network at hope
2008 earth intelligence network at hope2008 earth intelligence network at hope
2008 earth intelligence network at hope
 
Conference Report Final 11.18
Conference Report Final 11.18Conference Report Final 11.18
Conference Report Final 11.18
 
Web Science Session 2: Social Media
Web Science Session 2: Social MediaWeb Science Session 2: Social Media
Web Science Session 2: Social Media
 
Loss Of Innocence Essay
Loss Of Innocence EssayLoss Of Innocence Essay
Loss Of Innocence Essay
 
Ppt shark global forum session 3 2012 v4
Ppt shark global forum session 3 2012 v4Ppt shark global forum session 3 2012 v4
Ppt shark global forum session 3 2012 v4
 
Big data, human agency, critical realism and the future of the social sciences
Big data, human agency, critical realism and the future of the social sciencesBig data, human agency, critical realism and the future of the social sciences
Big data, human agency, critical realism and the future of the social sciences
 
Carl Miller
Carl MillerCarl Miller
Carl Miller
 
Philosophy of Big Data: Big Data, the Individual, and Society
Philosophy of Big Data: Big Data, the Individual, and SocietyPhilosophy of Big Data: Big Data, the Individual, and Society
Philosophy of Big Data: Big Data, the Individual, and Society
 
who is writing my autobiography
who is writing my autobiographywho is writing my autobiography
who is writing my autobiography
 
Coś zupełnie offline: badania etnograficzne są kluczem do skutecznego zaangaż...
Coś zupełnie offline: badania etnograficzne są kluczem do skutecznego zaangaż...Coś zupełnie offline: badania etnograficzne są kluczem do skutecznego zaangaż...
Coś zupełnie offline: badania etnograficzne są kluczem do skutecznego zaangaż...
 
Bocconi futuro
Bocconi futuroBocconi futuro
Bocconi futuro
 
Knowledge Sharing in the Networked World of the Internet of Things
Knowledge Sharing in the Networked World of the Internet of ThingsKnowledge Sharing in the Networked World of the Internet of Things
Knowledge Sharing in the Networked World of the Internet of Things
 

Making Decisions in a World Awash in Data: We’re going to need a different boat : Anthony Scriffignano’s Talk for the MIT Program on Information Science

  • 1. MIT Brown Bag Session: Making Decisions in a World Awash in Data: We’re going to need a different boat! Anthony J. Scriffignano,Ph.D. SVP/Chief Data Scientist Dun & Bradstreet 14-November,2016
  • 2. 2 The Jetsons image is licensed under Creative Commons We need to seriously think about the implications of trivial inference from data…
  • 3. 3 How is the data landscape changing? What are we doing in data science to respond? What type of skills and thinking are required to remain relevant in this evolving world?
  • 4. 4 The New Normal: How the data around us continues to change What’s changing? What’s New? When do we have enough?
  • 5. 5 We must observe what is changing critically Reading about events that happened in the past, listening to people who are not present. Reading about things in the “now” – can’t tell who is communicating with whom – does anybody really know what is going on? Courtesy Get Smart (1965–1970) Information was created and shared in silos According to Google Plus. Precisely five photographs were ever taken of Neil Armstrong while Apollo 11 operated on the surface of the moon. Only four of those photos show Armstrong outside the Lunar Module and actually moonwalking. Only three of them show Armstrong in direct view, rather than a reflection. Aug 31, 2012
  • 6. 6 A qualitative look at a shorter period from the perspective of an extensible analytical frame… What about the digital footprint of all of the smartphones? What about the social networks of the crowd? What about the metadata in the photos? What are the opportunity costs to other activities? The largest corpus of data preceded the event Most data created about the event had significant and asymmetric latency The rate of “data decay” attributable to the participants in the event is significant Pope Benedict Inauguration Pope Francis Inauguration BEWARE THE DANGERS OF MAKING INFERENCE OUTSIDE FRAMES OF REFERENCE, KNOW THE RULES OF WHEN YOU CAN OR PAY THE PRICE!
  • 7. 7 The burning platform… Too much data? Different objectives?
  • 8. 8 When is enough enough? Identify the Scenario • Relative size • Key question Assess Dispositive Threshold • Estimate • Triangulate Decision Elasticity • Bias • Opportunity cost DATA IN HAND / DISCOVERABLE DATA / EXISTING BUT INACCESSIBLE DATA @Scriffignano1
  • 9. 9 VOLUME Data sensing Curation, Data at rest vs. data in motion More is not necessarily better VELOCITY Judging Simultaneity of truth Time of curation vs. time of creation The myth of real time data VERACITY Triangulation, Non-regressive methods Malfeasance innovation, regulatory All true data is not necessarily simultaneously true VARIETY Entity extraction, discovery Data that exists but is unavailable, unstructured More data is being disregarded than used VALUE Disambiguation Opportunity cost of curation, single use data Value deteriorates at an alarming rate Challenging assumptions: Taking a More Sophisticated View of “Big Data”
  • 10. 10 This change gives rise to choices about “where to land” • Where to “land” • How to measure in a changing environment • Implications of choosing poorly
  • 11. 11 How are We Reacting: Bringing data science into the room Bringing Data Science into the room “Black cat” problems Emerging relationships Quantum observation / thinking
  • 12. Implications Emerging Trends Description Opportunities Risks Adoption of Natural Language Processing • Semantic Disambiguation: automated parsing and analysis of unstructured and structured text, and spoken language – or, in a more advanced state, across multiple languages • Detecting conversations of interest in the ocean of social chatter • Foundational capability for detecting relationships between entities of interest and their behaviors • Falling behind “bad guys,” who may use increasingly sophisticated “cloaking” techniques in digital communications Export of Fintech innovation into other areas • Social decision-making • Blockchain and related technologies: promotes transparency and security of transactions • Supports regulatory efforts e.g. privacy • Exporting techniques outside of finance • New malfeasance, especially with alternate digital identity • Dispositive threshold Emergence of AI in place of human decision makers • Relegation of decision making to algorithms, e.g. in trading, but potentially in healthcare, transportation, energy, etc. • Significantly faster, cheaper and more efficient outcomes • Drives scale / consistency • Cybersecurity is even more critical • Economically optimal and socially beneficial? • Perpetuating heuristic error Emerging Digital Technologies: Behaviors In scope for today’s discussion
  • 13. 13 How data is being discovered and used is a constantly-changing environment with changing context. Situational awareness… • More than 85% of data creation is unstructured • Language and use of language are constantly evolving • Commonly available tools and solutions only address a small part of this space Unstructured Data
  • 14. 14 Focusing on smaller parts of the problem space… CHALLENGES The “John Smith” problem Caroline M Smith 302 N Liberty St. Albion, IA Addr Type: Residential Carrie Smith Monolith Corporation 1716 Locust St. Des Moines, IA Addr. Type: Commercial Caroline Smith University of Iowa 21 E Market St. Iowa City, IA Addr. Type: Commercial The “Sybil” problem Carrie Smith Tenderheart Daycare 2635 Cleveland Dr. Adel, IA Addr. Type: Commercial The “Ann Taylor” problem PROGRESSIVE DECOMPOSITION Who is speaking? About whom? How do they feel? In what context? EPISTEMOLOGY Dispositive threshold
  • 15. 15 At the same time, we must continue to innovate in traditional spaces where we have strong enabling capabilities.
  • 16. 16 Intentionally focusing on smaller parts of the problem space… THE CHALLENGE The “John Smith” problem Caroline M Smith 302 N Liberty St. Albion, IA Addr Type: Residential Carrie Smith Monolith Corporation 1716 Locust St. Des Moines, IA Addr. Type: Commercial Caroline Smith University of Iowa 21 E Market St. Iowa City, IA Addr. Type: Commercial The “Sybil” problem Carrie Smith Tenderheart Daycare 2635 Cleveland Dr. Adel, IA Addr. Type: Commercial What would it look like? The “Ann Taylor” problem Who is speaking? About whom? How do they feel? In what context? Epistemology
  • 17. 17 Simple mean of sentiment Weighted mean of sentiment Standard deviation of sentiment Thought experiment (assuming completion)
  • 18. 18 New Answers… Are there identifiable modes? Do they change when speaking about their own enterprise? Is the leadership leading or lagging mean sentiment? How does this change over time? How do leaders influence?
  • 19. 19 To avoid getting distracted by “all things social”, the science involves continuous evolution and focus on specific use cases that drive value Sarcasm ABC corporation is a wonderful company, if you don’t do business with them. Neologism Be sure to like us on FaceBook and use #shallow when you Tweet. Grammar variations FBI is Hunting Terrorists With Explosives. Punctuation / Intrusion of foreign language “Hi mom!” vs. “Hi, mom?” Intentional mis-spelling RU There? Context / Behavior Sentiment Attribution Entity Extraction USE CASES CONFOUNDING CHARACTERISTICS DERIVING EMPIRICAL MEASURES THAT INFORM USE CASES D&B proprietary information, do not distribute or copy without permission Feedback
  • 20. 20 What do we have to believe?
  • 21. 21 Watch this space… • Inter- and intra-language correlation : deciding when things mean the same thing • Inter- and intra-language transformation : transforming inference among languages • Changing behavior to attract/obviate grapheme analysis : reacting to changing language • Emerging “metalanguage” (e.g. “textspeak”) : reacting to language about language • A language of “things” : reacting to new languages used by automation • Using language to hide language : reacting to attempts to obscure via language • Unicode is not universal… : understanding the limitations of automation
  • 22. Implications Emerging Trends Description Opportunities Risks Connected Space • Discovery of Complex Counterparty Relationships: construction of an n-dimensional space where dyadic relationships are established into a connected graph • Discovering relationships between entities of interest and their behavioral patterns • Changing the environment by measuring it (e.g. fraud) and creating smarter malfeasance • Confirmation bias based on available data Large Scale Machine Learning • Intelligence models which rely on computational intelligence: automated learning, classification and storage • Extending discovery of relationships beyond rule-based approaches • New valuable insights, esp. counter-intuitive and paradoxical • “Garbage in, garbage out” • Exporting human bias into training sets Digital Everything • Scientific concepts become computable as data is digitized in new and more nuanced ways (e.g. digital X-ray, fitness monitors, autonomous vehicles) • Discovery of patterns and solutions that are theoretically “invisible” • A powerful weapon in hands of a “bad guy” • Laws will always lag evolution Emerging Digital Technologies: Relationships In scope for today’s discussion
  • 23. 23 The Black Cat Problem … but also some very ominous 23 Dealing with “Black Cat” problems • Signals • Systemic measures • Anomaly detection • Isotropism • Character / quality measures • Data sensing • Triggers
  • 24. 24 Visualizing extremely complex, changing relationships addresses questions never before feasible Dyadic relationships across multiple perspectives Abstracting dimensions Time “What If” scenarios News Social signals Market Data Events Internal Data Some examples • Understanding Signals derived from changes to business information • Discovering and investigating clusters of unusual behavior • Exploring the impact of new regulation • Applying standard measures to a highly dynamic environment • Exploring the impact of new market forces • Studying the real or potential impact of supply chain interruptions • Investigating emerging capabilities (e.g. reputational risk) Asking new questions never before feasible Observing key measures over time Blending with similarly constructed graphs
  • 25. Understanding language as a changing phenomenon leads to greater computational capability with regard to changing behavior Money laundering Corporate Theft Identify Bust-out Shell Company Trade Rings TRADITIONAL VIEW / MORE NUANCED VIEW Cybersecurity - inside out/outside in Data sovereignty Permissible use Discovering prior behavior vs. emerging behavior in extremely large sets of data @Scriffignano1 25
  • 26. Manifesting new relationships in our behavior Data Privacy Expressed Consent Intellectual Property Transferring data across borders New data assets Compliance with industry standards and best practices @Scriffignano1 26 Data sovereignty Cybersecurity Integrated global value chain Emergency preparedness National security: cyber-everything Border protection Balance of Trade
  • 27. 27 The Path Ahead: Creating a mindset in the organization that evolves Where do we need to focus Future challenges
  • 28. Implications Emerging Trends Description Opportunities Risks Changing Nature of Computing • Neuromorphic computing: Biologically-inspired techniques aiming to mimic human thought • Quantum computing: Non-Boolean physical and algorithmic devices • Computing as a commodity: ubiquitous access to unlimited computational power on a “pay-as-you-go” model • Analytics / Data Science as a Service: Attempts to deliver capabilities down-market or in a more scalable way • Solutions to previously intractable problems enabled by increased “parallelism,” and ability to compute on non-binary concepts • ‘Bad guys’ get access to unprecedented computing power and techniques • Need to rethink cybersecurity Internet of Things • Objects connected on a network, sending and receiving data, distinctions by use (e.g. Industrial) • Gleaning previously unobserved patterns of behaviors and relationships • Cybersecurity and privacy challenges • Authentication / validation Emerging Digital Technologies: “Things” In scope for today’s discussion
  • 29. 29 Quantum, Neuromorphic Computing Technology to go beyond digital / Boolean computation Neural Networks – integrated computing models inspired by biological analogs Quantum Computing – Continuous, non-Boolean computational engines and algorithms Substitution of Addition/Multiplication/AND & OR with Translation/Rotation & Expectation Potential applications: •Quicker non-deterministic searches from Quantum Algorithms •Reconstruction of implied data/metadata models •Quantum Pattern Discovery (e.g. anisotropism, neosophism)
  • 30. 30 Quantum Computing Information stored in qubits; qubits can be in base states (|0>, |1>) or any superposition or entanglement (inner product) of these. Each additional qubit adds infinitely many more superpositions Calculations are performed by unitary transformations – essentially linear translations or rotations – on the qubit states
  • 31. 31 Food for Thought… Traditional thinking Quantum thinking • Deterministic Add/change/delete/search • Workflow • Value Chain dynamics • Regressive analysis • Non-Deterministic approach to monotonicity • Probabilistic workflow / heuristics • Value Chain order-n understanding • Non-Regressive analysis, especially with new data/new behavior
  • 32. 32 Machine Learning / Learning from Machines 32
  • 34. 34 It is very important to have a working definition of “Innovation” New ways of doing old things? Doing new things? (Traditional) Ways of doing new things? Creating simpler unsolved problems? New ways of knowing? New ways of cheating?
  • 35. 35 Reflecting on the “Data Journey” of Large Multinational Organizations From focus on data to focus on meaning From analytic focus to holistic focus From ingestion to curation From early adoption to integrated value chain Technology Process Mindset People
  • 36. 36 The Evolving Leadership Mindset for Data-Inspired Organizations Initial Focus Emerging Mindset Modern Mindset • Awareness of a skills gap • Recognition of significant risk and opportunity • Some degree of feeling overwhelmed by the “new” • Silos of evolution, internal competition due to new roles/responsibilities • Focus on evolving the existing workforce and hiring new skills • As landscape rapidly changes, skills required shift from tools to competencies • More focus on governance, provenance, security, malfeasance • More educated customers placing higher demands on the organization • Breaking down silos of data and innovation • Ability to measure and respond in a truly agile fashion • Reflective leaders who constantly re-evaluate and constantly learn • Analytical, data-based decisions evolve through new methods, learning
  • 37. 37 Reflecting on the Journey -- Things to Consider Too Much Data? • Start with a problem or question, not a tool or dataset • Understand the going-in assumptions • Continuously evaluate how the environment is changing New Types of Data • Select methods (especially non-regressive) carefully • Use new methods and visualizations for a reason, not for an expedient • Be aware of “Black Cat” problems The Burning Platform • Continuously evaluate new skills and capabilities • Continuously evaluate new ways of knowing, breaking down problems into smaller pieces, reducing complexity • The decision to do nothing is a decision to sink further behind 37
  • 38. 38 We can’t solve problems by using the same kind of thinking we used when we created them.
  • 39. Anthony Scriffignano, Ph.D., SVP / Chief Data Scientist scriffignanoa@dnb.com @SCRIFFIGNANO1 Thank You
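
Slide 17 names three aggregate measures for its sentiment thought experiment: a simple mean, a weighted mean, and a standard deviation. The deck does not say how sentiment is scored or what the weights represent, so the following is only a minimal sketch under stated assumptions: scores are numbers where higher means more positive, and each weight is a hypothetical importance attached to a speaker (for example, seniority). The function name `sentiment_summary` is illustrative, not something from the talk.

```python
import math


def sentiment_summary(scores, weights):
    """Return (simple mean, weighted mean, population standard deviation).

    Assumptions (not from the talk): scores are numeric sentiment values,
    and weights are non-negative importance values, one per score.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    n = len(scores)
    simple_mean = sum(scores) / n
    # Each score contributes in proportion to its weight.
    weighted_mean = sum(s * w for s, w in zip(scores, weights)) / total_weight
    # Population standard deviation around the simple mean.
    variance = sum((s - simple_mean) ** 2 for s in scores) / n
    return simple_mean, weighted_mean, math.sqrt(variance)


if __name__ == "__main__":
    # One positive and one negative speaker; the positive one is weighted 3x.
    print(sentiment_summary([1.0, -1.0], [3.0, 1.0]))
```

In this example the simple mean is neutral while the weighted mean is positive, and the standard deviation shows that the neutral average hides two opposed opinions. That gap between the measures is the kind of difference the follow-up questions on slide 18 probe.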