SlideShare uma empresa Scribd logo
1 de 21
Baixar para ler offline
Big Data Conference 2013:
Analytics and Applications for Federal Big Data

Data Tactics Corp: A Blended Approach to Big
Data Analytics
!

Richard Heimann,
Data Scientist at Data Tactics Corporation
!

Data Tactics Analytics Practice
The Team:
(Nathan D., Shrayes R., David P., Adam VE., Geoffrey B., Rich H.)

Graduates from top universities...


!
Advanced degrees include:

mathematics, computer science, astrophysics, electrical
engineering, mechanical engineering, statistics, social sciences.

!
Base competencies (horizontals): clustering, association rules,
regression, naive bayesian classifier, decision trees, time-series,
text analysis.

!
Going beyond the base (verticals)...
th

an

pl

st

RT

CA

Ra

ru

nd
om
se
ct
nt
ni
co
ur
ng
im Fo
ns
al
en res
alg
tra
eq
ta
t
in
or
ua
na
ed
ith
tio
to
lys
m
op
n
pi
ec
s
is
m
tim
c
on
od
m
om
od iza
eli
ng
els tion fac
et
sp
ri
to
s
ra
at cs
ial
na
ec
di
lys
au
ba
m
on
is
to
ye
en
om
re
sia
sio
gr
et
n
es
na
ric
st
siv
lr
at
s
ed
ist
e
m
uc
lat
ics
od
tio PC
en
els
n
tc
A
las
IC
s
A
as
an
hi
tro
gr
aly
er
ph
ap
ar
ys sis
ch
h
th
ica
ica
eo
lt
lm
ry
im
od
DL
alg
enu IRT
els
se
IS
or
m
A
rie
ith
er
s
m
ica
an
s
l in
aly
te
sis
m
gr
ba
ixt
at
gg
ur
io
SV
e
in
n
m
g/
M
te
od
bo
ch
m
els
os
ni
ax
qu
tin
en
es
g
t

pa

Horizontals & Verticals

Clustering || Regression || Decision Trees || Text Analysis

Association Rules || Naive Bayesian Classifier || Time Series Analysis
Data Tactics Analytics Practice
Hierarchy of Data Scientists
Why Analytics [Business]???
Why are analytics important? 

(Business, Analytics, Practical)

!
!

!

"We need to stop reinventing the cloud
and start using it!"
(Dave Boyd)
!
!
!
!
Why Analytics [Analytics]???
Why are analytics important? 

(Business, Analytics, Practical)
!
!
No Free Lunch (NFL): no algorithm performs better than
any other when their performance is averaged uniformly
over all possible problems of a particular type. Algorithms
must be designed for a particular domain or style of
problem, and that there is no such thing as a general
purpose algorithm.

!
!
!
Why Analytics [Practical]???
Academic Publications Scale

N

Web Scales
IC Scales

t

If this guy doesn’t scale - none of us do.

t
algo to users > algo to data
Development
Deployment
Machine

User

Parallel

Distributed

Objective

Subjective

M/R

HDFS

Valid

Useful

MPP

SOA

Nontrivial

Novel

Accurate

Comprehensible

GPU
Shiny
Open Sourced by RStudio in November 2012

!
Not the first to wrap R in the browser but perhaps the
easiest for R developers 

!
Don’t need to know HTML, CSS and javascript to get
started 

!
Reactive Programming model 

!
Web sockets for communication
server.R
# Define server logic required to generate and plot a random
# distribution!
shinyServer(function(input, output) {!
!
# Expression that generates a plot of the distribution.!
# renderPlot:!
#!
# 1: Is "reactive" and will therefore automatically !
#
re-executed when inputs change.!
# 2: Its output type is a plot. !
!
output$distPlot <- renderPlot({!
!
# generate an rnorm distribution and plot it!
dist <- rnorm(input$obs)!
hist(dist)!
})!
})
ui.R
library(shiny)!

!

# Define UI for application that plots random distributions !
shinyUI(pageWithSidebar(!
!
# Application title:!
headerPanel("My Shiny App!"),!
!
# Sidebar with a slider input for number of observations:!
sidebarPanel(!
sliderInput("obs", !
"Number of observations:", !
min = 0, !
max = 1000, !
value = 500)!
),!
# Show a plot of the generated distribution:!
mainPanel(!
plotOutput("distPlot")!
)!
))
ui.R
headerPanel()

sidebarPanel()

mainPanel()
server.R + ui.R = microscope
adjustable parameters (knobs): 0 < knobs < small k
knobs = lighting, varying objectives, focusing (fine and course)

!
knobs: 

fine and course filtering: 

geography

time

variable of interest 

observations of interest

promote significant (objective) patterns

change model parameters
BDE + Shiny
Overlapping Solutions
Multiple models allow more nuanced
learning from data.

Latent Spatial Traffic Patterns

!

Convergent results serve as crossvalidation.

!

2

Points of divergence provide additional
insights and allow models to be
calibrated further.

!

Different models can provide answers to
different questions or answers to the
same question for different analysts.

!

Multi-method excels to diverse teams
with mutable missions.

!
smooth + rough = data
!

New paradigm where the question, “Are
there multiple, overlapping ways to solve
this problem” dominate.

3

1
Overlapping Solutions
Are there multiple, overlapping ways to solve this problem?

yt
ic

yt

al

A


An

An

B

al

ic

A+B

+

+

B

C

A+B+C

A

C

Analytic C
Summary:

# our blended approach !
dt.philosophy <- lm(analytics ~ bigdata +
smalldata + objective +
subjective:overlapping.solutions,
data=data)
Overlapping Solutions
Data Science for Government (DS4G)
About (DS4G):

!

1: Improve on definitions of analytics.

2: Outline optimal interactions with Data Scientists.

3: Provide a life-cycle for Data Science.

4: Most importantly, share a taxonomy to identify analytical questions one
could ask of data (Causal Effects, Classification, Outlier Detection, Big Data and
Analytics, Measurement Models, & Text Analysis)

!

Presented by Data Tactics Analytics Team

Location: TBD 

Time: 1Q 2014

Duration: ~ 5 hrs.

Cost: FREE

Audience: Government managers and Data Tactics partners with their
customers.
LUBAP goes wild!
421 attending!

http://www.meetup.com/Data-Science-DC/events/146953142/
Thank you...	

Questions?
Homepage: http://www.data-tactics.com
Blog: http://datatactics.blogspot.com
Twitter: @DataTactics
Slideshare: http://www.slideshare.net/DataTactics/presentations
Or, me (Rich Heimann): rheimann@data-tactics-corp.com

Mais conteúdo relacionado

Mais procurados

Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!Sri Ambati
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown BagDataTactics
 
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web ChallengeSchema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web ChallengeAndre Freitas
 
Model evaluation in the land of deep learning
Model evaluation in the land of deep learningModel evaluation in the land of deep learning
Model evaluation in the land of deep learningPramit Choudhary
 
Feature Engineering for Machine Learning at QConSP
Feature Engineering for Machine Learning at QConSPFeature Engineering for Machine Learning at QConSP
Feature Engineering for Machine Learning at QConSPAmanda Casari
 
Moving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow ExtendedMoving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow ExtendedJonathan Mugan
 
[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies台灣資料科學年會
 
Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...
Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...
Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...Sri Ambati
 
Module 8: Natural language processing Pt 1
Module 8:  Natural language processing Pt 1Module 8:  Natural language processing Pt 1
Module 8: Natural language processing Pt 1Sara Hooker
 
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RGentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RMarco Wirthlin
 

Mais procurados (13)

Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
 
Data science
Data scienceData science
Data science
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web ChallengeSchema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
 
Model evaluation in the land of deep learning
Model evaluation in the land of deep learningModel evaluation in the land of deep learning
Model evaluation in the land of deep learning
 
Feature Engineering for Machine Learning at QConSP
Feature Engineering for Machine Learning at QConSPFeature Engineering for Machine Learning at QConSP
Feature Engineering for Machine Learning at QConSP
 
Moving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow ExtendedMoving Your Machine Learning Models to Production with TensorFlow Extended
Moving Your Machine Learning Models to Production with TensorFlow Extended
 
[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies
 
Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...
Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...
Keynote by Agus Sudjianto, Wells Fargo - Interpretable Machine Learning - H2O...
 
Poster
PosterPoster
Poster
 
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
 
Module 8: Natural language processing Pt 1
Module 8:  Natural language processing Pt 1Module 8:  Natural language processing Pt 1
Module 8: Natural language processing Pt 1
 
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RGentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
 

Destaque

Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcDataTactics
 
Ontology and Reports
Ontology and ReportsOntology and Reports
Ontology and ReportsDataTactics
 
Data Tactics Open Source Brief
Data Tactics Open Source BriefData Tactics Open Source Brief
Data Tactics Open Source BriefDataTactics
 
Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1DataTactics
 
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013DataTactics
 
Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3DataTactics
 
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATANETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATADataTactics
 
ODSC_Cherven_20160518
ODSC_Cherven_20160518ODSC_Cherven_20160518
ODSC_Cherven_20160518Ken Cherven
 
Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?Rich Heimann
 
Horizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataHorizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataDataTactics
 

Destaque (11)

Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtc
 
Οι Λάπωνες
Οι ΛάπωνεςΟι Λάπωνες
Οι Λάπωνες
 
Ontology and Reports
Ontology and ReportsOntology and Reports
Ontology and Reports
 
Data Tactics Open Source Brief
Data Tactics Open Source BriefData Tactics Open Source Brief
Data Tactics Open Source Brief
 
Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1
 
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
 
Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3
 
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATANETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
 
ODSC_Cherven_20160518
ODSC_Cherven_20160518ODSC_Cherven_20160518
ODSC_Cherven_20160518
 
Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?
 
Horizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataHorizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence Data
 

Semelhante a Big Data Conference

The Rensselaer IDEA: Data Exploration
The Rensselaer IDEA: Data Exploration The Rensselaer IDEA: Data Exploration
The Rensselaer IDEA: Data Exploration James Hendler
 
How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?Samet KILICTAS
 
392_SannaReddyBharath (1)
392_SannaReddyBharath (1)392_SannaReddyBharath (1)
392_SannaReddyBharath (1)bharath reddy
 
How to Become a Big Data Professional.pdf
How to Become a Big Data Professional.pdfHow to Become a Big Data Professional.pdf
How to Become a Big Data Professional.pdfCareervira
 
Data Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneData Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneDoug Needham
 
Data science presentation
Data science presentationData science presentation
Data science presentationMSDEVMTL
 
438_AmeeruddinMohammed
438_AmeeruddinMohammed438_AmeeruddinMohammed
438_AmeeruddinMohammedAmeeruddin MD
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesCodePolitan
 
(Big) Data (Science) Skills
(Big) Data (Science) Skills(Big) Data (Science) Skills
(Big) Data (Science) SkillsOscar Corcho
 
Data science | What is Data science
Data science | What is Data scienceData science | What is Data science
Data science | What is Data scienceShilpaKrishna6
 

Semelhante a Big Data Conference (20)

566_SriramDandamudi_CEE
566_SriramDandamudi_CEE566_SriramDandamudi_CEE
566_SriramDandamudi_CEE
 
587_EswarPrasadReddyMachireddy_CEE
587_EswarPrasadReddyMachireddy_CEE587_EswarPrasadReddyMachireddy_CEE
587_EswarPrasadReddyMachireddy_CEE
 
The Rensselaer IDEA: Data Exploration
The Rensselaer IDEA: Data Exploration The Rensselaer IDEA: Data Exploration
The Rensselaer IDEA: Data Exploration
 
662_AravindKumarN_CEE
662_AravindKumarN_CEE662_AravindKumarN_CEE
662_AravindKumarN_CEE
 
671_JeevanRavula_CEE
671_JeevanRavula_CEE671_JeevanRavula_CEE
671_JeevanRavula_CEE
 
How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?
 
598_RamaSrikanthJakkam_CEE
598_RamaSrikanthJakkam_CEE598_RamaSrikanthJakkam_CEE
598_RamaSrikanthJakkam_CEE
 
603_SaiKiranPutta_CEE
603_SaiKiranPutta_CEE603_SaiKiranPutta_CEE
603_SaiKiranPutta_CEE
 
392_SannaReddyBharath (1)
392_SannaReddyBharath (1)392_SannaReddyBharath (1)
392_SannaReddyBharath (1)
 
How to Become a Big Data Professional.pdf
How to Become a Big Data Professional.pdfHow to Become a Big Data Professional.pdf
How to Become a Big Data Professional.pdf
 
Data Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneData Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZone
 
Data analysis
Data analysisData analysis
Data analysis
 
Data Analytics_BigData Cert
Data Analytics_BigData CertData Analytics_BigData Cert
Data Analytics_BigData Cert
 
Data science presentation
Data science presentationData science presentation
Data science presentation
 
421_PrakashMudholkar
421_PrakashMudholkar421_PrakashMudholkar
421_PrakashMudholkar
 
402_DheerajKura
402_DheerajKura402_DheerajKura
402_DheerajKura
 
438_AmeeruddinMohammed
438_AmeeruddinMohammed438_AmeeruddinMohammed
438_AmeeruddinMohammed
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & Opportunities
 
(Big) Data (Science) Skills
(Big) Data (Science) Skills(Big) Data (Science) Skills
(Big) Data (Science) Skills
 
Data science | What is Data science
Data science | What is Data scienceData science | What is Data science
Data science | What is Data science
 

Mais de DataTactics

C Star Analytic Presentation
C Star Analytic PresentationC Star Analytic Presentation
C Star Analytic PresentationDataTactics
 
Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka DataTactics
 
Data Tactics Analytics Practice
Data Tactics Analytics PracticeData Tactics Analytics Practice
Data Tactics Analytics PracticeDataTactics
 
Discontinuities Demo
Discontinuities DemoDiscontinuities Demo
Discontinuities DemoDataTactics
 
Analytics Brownbag
Analytics Brownbag Analytics Brownbag
Analytics Brownbag DataTactics
 
Big Data Taxonomy 8/26/2013
Big Data Taxonomy 8/26/2013Big Data Taxonomy 8/26/2013
Big Data Taxonomy 8/26/2013DataTactics
 
Data Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionData Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionDataTactics
 
Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2DataTactics
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013DataTactics
 
Capabilities Brief Analytics
Capabilities Brief AnalyticsCapabilities Brief Analytics
Capabilities Brief AnalyticsDataTactics
 

Mais de DataTactics (11)

C Star Analytic Presentation
C Star Analytic PresentationC Star Analytic Presentation
C Star Analytic Presentation
 
Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka
 
Data Tactics Analytics Practice
Data Tactics Analytics PracticeData Tactics Analytics Practice
Data Tactics Analytics Practice
 
Discontinuities Demo
Discontinuities DemoDiscontinuities Demo
Discontinuities Demo
 
DLISA
DLISADLISA
DLISA
 
Analytics Brownbag
Analytics Brownbag Analytics Brownbag
Analytics Brownbag
 
Big Data Taxonomy 8/26/2013
Big Data Taxonomy 8/26/2013Big Data Taxonomy 8/26/2013
Big Data Taxonomy 8/26/2013
 
Data Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionData Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and Description
 
Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013
 
Capabilities Brief Analytics
Capabilities Brief AnalyticsCapabilities Brief Analytics
Capabilities Brief Analytics
 

Último

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Último (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Big Data Conference

  • 1. Big Data Conference 2013: Analytics and Applications for Federal Big Data Data Tactics Corp: A Blended Approach to Big Data Analytics ! Richard Heimann, Data Scientist at Data Tactics Corporation
  • 2. ! Data Tactics Analytics Practice The Team: (Nathan D., Shrayes R., David P., Adam VE., Geoffrey B., Rich H.) Graduates from top universities... ! Advanced degrees include: mathematics, computer science, astrophysics, electrical engineering, mechanical engineering, statistics, social sciences. ! Base competencies (horizontals): clustering, association rules, regression, naive bayesian classifier, decision trees, time-series, text analysis. ! Going beyond the base (verticals)...
  • 3. th an pl st RT CA Ra ru nd om se ct nt ni co ur ng im Fo ns al en res alg tra eq ta t in or ua na ed ith tio to lys m op n pi ec s is m tim c on od m om od iza eli ng els tion fac et sp ri to s ra at cs ial na ec di lys au ba m on is to ye en om re sia sio gr et n es na ric st siv lr at s ed ist e m uc lat ics od tio PC en els n tc A las IC s A as an hi tro gr aly er ph ap ar ys sis ch h th ica ica eo lt lm ry im od DL alg enu IRT els se IS or m A rie ith er s m ica an s l in aly te sis m gr ba ixt at gg ur io SV e in n m g/ M te od bo ch m els os ni ax qu tin en es g t pa Horizontals & Verticals Clustering || Regression || Decision Trees || Text Analysis Association Rules || Naive Bayesian Classifier || Time Series Analysis
  • 4. Data Tactics Analytics Practice Hierarchy of Data Scientists
  • 5. Why Analytics [Business]??? Why are analytics important? (Business, Analytics, Practical) ! ! ! "We need to stop reinventing the cloud and start using it!" (Dave Boyd) ! ! ! !
  • 6. Why Analytics [Analytics]??? Why are analytics important? (Business, Analytics, Practical) ! ! No Free Lunch (NFL): no algorithm performs better than any other when their performance is averaged uniformly over all possible problems of a particular type. Algorithms must be designed for a particular domain or style of problem, and that there is no such thing as a general purpose algorithm. ! ! !
  • 7. Why Analytics [Practical]??? Academic Publications Scale N Web Scales IC Scales t If this guy doesn’t scale - none of us do. t
  • 8. algo to users > algo to data Development Deployment Machine User Parallel Distributed Objective Subjective M/R HDFS Valid Useful MPP SOA Nontrivial Novel Accurate Comprehensible GPU
  • 9. Shiny Open Sourced by RStudio in November 2012 ! Not the first to wrap R in the browser but perhaps the easiest for R developers ! Don’t need to know HTML, CSS and javascript to get started ! Reactive Programming model ! Web sockets for communication
  • 10. server.R # Define server logic required to generate and plot a random # distribution! shinyServer(function(input, output) {! ! # Expression that generates a plot of the distribution.! # renderPlot:! #! # 1: Is "reactive" and will therefore automatically ! # re-executed when inputs change.! # 2: Its output type is a plot. ! ! output$distPlot <- renderPlot({! ! # generate an rnorm distribution and plot it! dist <- rnorm(input$obs)! hist(dist)! })! })
  • 11. ui.R library(shiny)! ! # Define UI for application that plots random distributions ! shinyUI(pageWithSidebar(! ! # Application title:! headerPanel("My Shiny App!"),! ! # Sidebar with a slider input for number of observations:! sidebarPanel(! sliderInput("obs", ! "Number of observations:", ! min = 0, ! max = 1000, ! value = 500)! ),! # Show a plot of the generated distribution:! mainPanel(! plotOutput("distPlot")! )! ))
  • 13. server.R + ui.R = microscope adjustable parameters (knobs): 0 < knobs < small k knobs = lighting, varying objectives, focusing (fine and course) ! knobs: fine and course filtering: geography time variable of interest observations of interest promote significant (objective) patterns change model parameters
  • 15. Overlapping Solutions Multiple models allow more nuanced learning from data. Latent Spatial Traffic Patterns ! Convergent results serve as crossvalidation. ! 2 Points of divergence provide additional insights and allow models to be calibrated further. ! Different models can provide answers to different questions or answers to the same question for different analysts. ! Multi-method excels to diverse teams with mutable missions. ! smooth + rough = data ! New paradigm where the question, “Are there multiple, overlapping ways to solve this problem” dominate. 3 1
  • 16. Overlapping Solutions Are there multiple, overlapping ways to solve this problem? yt ic yt al A An An B al ic A+B + + B C A+B+C A C Analytic C
  • 17. Summary: # our blended approach ! dt.philosophy <- lm(analytics ~ bigdata + smalldata + objective + subjective:overlapping.solutions, data=data)
  • 19. Data Science for Government (DS4G) About (DS4G): ! 1: Improve on definitions of analytics. 2: Outline optimal interactions with Data Scientists. 3: Provide a life-cycle for Data Science. 4: Most importantly, share a taxonomy to identify analytical questions one could ask of data (Causal Effects, Classification, Outlier Detection, Big Data and Analytics, Measurement Models, & Text Analysis) ! Presented by Data Tactics Analytics Team Location: TBD Time: 1Q 2014 Duration: ~ 5 hrs. Cost: FREE Audience: Government managers and Data Tactics partners with their customers.
  • 20. LUBAP goes wild! 421 attending! http://www.meetup.com/Data-Science-DC/events/146953142/
  • 21. Thank you... Questions? Homepage: http://www.data-tactics.com Blog: http://datatactics.blogspot.com Twitter: @DataTactics Slideshare: http://www.slideshare.net/DataTactics/presentations Or, me (Rich Heimann): rheimann@data-tactics-corp.com