Big (Geo) Data Science




Robert Cheetham
cheetham@azavea.com
   @rcheetham
Web/Mobile

Geospatial

UI/UX Design

High Performance Computing

R&D
B Corporation
   • Projects w/ Social Value
   • Summer of Maps
   • Pro Bono Program
   • Donate share of profits

Research-Driven
  • 10% Research Program
  • Academic Collaborations
  • Open Source
Spatial Temporal Forecasting
with Philadelphia Crime Data
How Phila PD uses Maps

 Customized Map Products




            Weekly CompStat Meetings




   Web Crime Analysis
INCT & PARS – main database sources
over 5,000 incidents daily, over 2 million annually



[Diagram: incident data flow. Labels recovered from the slide:]
   • Complainant, Verizon 911, 911 Operator, Radio Dispatcher, CAD
   • Police Officer, Incident Report (completed by officer), District 48 Desk
   • INCT and PARS, Daily download & geocoding routines
   • Maps distributed through intranet, printing, CompStat
   • District X, District Y, District Z
The Context

1,500,000 people
7,000 police
1,000 civilian employees
2,000,000 new incidents / year
3 crime analysts
What we did

•   Weekly Compstat
•   Lots of maps
•   Automation of map creation
•   Web-based systems
… but what if we could…

 Accelerate the cycle
 Proactively notify
 Automate the process
Prototype
[Diagram: prototype architecture. Labels recovered from the slide:]
   • VB & MapObjects
   • ArcView
   • .ini file
   • Process Documentation
   • Shapefiles and GRIDs
   • MS SQL Server Crime Incidents Database
… but there was a problem …
…it was crap …
… sort of.
We needed ….

1. Better Statistics

2. Notification

3. Simplicity
Crime Analysis – What has happened?
   – Mapping (spatial / temporal densities)
   – Trending
   – Intelligence Dashboard
Early Warning – What is out of the ordinary?
   – Statistical & Threshold-based Hunches (data mining)
   – Alerting
Risk Forecasting – What is likely to happen next?
   – Near Repeat Pattern
   – Load Forecasting
Crime Analysis
Intelligence Dashboard
Early Warning

• Geographic Early Warning System
   – A system to alert staff of an unusual situation in a particular
     location
   – Ingests data sets to automatically “cook on” and only
     involves staff when a statistically unusual situation is found


[Diagram: Operational Databases, HunchLab Database, Geostatistical Engine, Alerting System]
Early Warning
What is a Hunch?

• A proposed hypothesis, saved into the system, and
  continually tested for validity
• Incident Attribute Requirements
   – Location (x, y)
   – Time (timestamp)
   – Classification
• Hunch Attributes
   – Location (area)
   – Time (recent / historic periods)
   – Classification
• Analyses
   – Statistical Hunch
   – Threshold Hunch
Hunch Parameters: Location

•   Address & Radius
•   Precinct/County/Country
•   Custom Drawn Area
•   Mass Hunch
Hunch Parameters: Time

• Statistical Hunch
   – Recent Past
   – Historic Past
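
A minimal sketch of how the two hunch types might be evaluated, assuming the incidents have already been filtered to the hunch's area and classification. The slides do not say which statistical test is used; the Poisson tail probability below, along with every function name and the `alpha` cutoff, is an illustrative assumption and not HunchLab's actual method.

```python
import math

def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def threshold_hunch(recent_count, threshold):
    """Threshold hunch: alert when the recent count reaches a fixed limit."""
    return recent_count >= threshold

def statistical_hunch(recent_count, recent_days,
                      historic_count, historic_days, alpha=0.05):
    """Statistical hunch: alert when the recent count is improbably high
    given the historic daily rate (Poisson is an assumed model)."""
    expected = historic_count / historic_days * recent_days
    p_value = poisson_tail(recent_count, expected)
    return p_value < alpha, expected, p_value

# Hypothetical numbers: 12 incidents in the last 7 days,
# against 180 incidents in the previous 365 days.
alert, expected, p_value = statistical_hunch(12, 7, 180, 365)
print(alert, round(expected, 2), p_value)
```

The threshold hunch needs only the recent period, which is why the historic period appears only under the statistical hunch.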
Hunch Parameters: Classification

• Category
• Time of Day
• Narrative
Hunch Helper
Email Alert
Hunch Details
Risk Forecasting
Predictive Analytics?

• Prediction vs. Forecasting
Near Repeat Pattern Analysis
Contagious Crime?

• Near repeat pattern analysis
      • “If one burglary occurs, how does the risk change nearby?”
What Do We Mean By Near Repeat?

• Repeat victimization
   – Incident at the same location at a later time (likely related)
• Near repeat victimization
   – Incident at a nearby location at a later time (likely related)

• Incident A (place, time) --> Incident B (place, time)
Near Repeat Pattern Analysis

• The goal:
   – Quantify short term risk due to near-repeat victimization
      • “If one burglary occurs, how does the risk of burglary for the
        neighbors change?”


• What we know:
   – Incident A (place, time) --> Incident B (place, time)
      • Distance between A and B
      • Timeframe between A and B


• What we need to know:
   – What distances/timeframes are not simply random?
Near Repeat Pattern Analysis

• The process
   –   Observe the pattern in historic data
   –   Simulate the pattern in randomized historic data
   –   Compare the observed pattern to the simulated patterns
   –   Apply the non-random pattern to new incidents

• An example
   – 180 days of burglaries in Division 6 of Philadelphia
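
The observe / simulate / compare steps above can be sketched as a simple permutation test. This is a hedged illustration, not the Near Repeat Calculator's implementation: it assumes planar coordinates, uses a single distance/time band, and randomizes the historic data by shuffling timestamps across locations. All function names and the synthetic data are hypothetical.

```python
import math
import random

def close_pairs(points, times, max_dist, max_days):
    """Count incident pairs that are close in both space and time."""
    count = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            if dist <= max_dist and abs(times[i] - times[j]) <= max_days:
                count += 1
    return count

def near_repeat_test(points, times, max_dist, max_days, n_sim=199, seed=0):
    """Compare the observed pair count with counts from shuffled timestamps."""
    rng = random.Random(seed)
    observed = close_pairs(points, times, max_dist, max_days)
    shuffled = list(times)
    at_least, total = 0, 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)              # break the space-time link
        simulated = close_pairs(points, shuffled, max_dist, max_days)
        total += simulated
        if simulated >= observed:
            at_least += 1
    p_value = (at_least + 1) / (n_sim + 1)   # pseudo p-value
    mean_sim = total / n_sim
    ratio = observed / mean_sim if mean_sim > 0 else float("inf")
    return observed, ratio, p_value

# Synthetic data: 10 clusters of 3 incidents, tight in space and in time.
points, times = [], []
for k in range(10):
    for j in range(3):
        points.append((k * 1000.0 + j * 10.0, 0.0))   # metres (assumed)
        times.append(k * 20 + j)                      # day number
print(near_repeat_test(points, times, max_dist=100.0, max_days=7))
```

A ratio well above 1 with a small p-value suggests that this distance/time band holds more pairs than chance alone would produce.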
Near Repeat Pattern Analysis

• How can you test your own data?
   – Near Repeat Calculator
      • http://www.temple.edu/cj/misc/nr/
• Papers
   – Near-Repeat Patterns in Philadelphia Shootings (2008)
      • One city block & two weeks after one shooting
           – 33% increase in likelihood of a second event




                                             Jerry Ratcliffe
                                           Temple University
Contagious Crime?
Workload Forecasting
Improving CompStat

• Workload forecasting
      • “Given the time of year, day of week, time of day and
        general trend, what counts of crimes should I expect?”
What Do We Mean By Load Forecasting?

 • Workload forecasting
         • Generating aggregate crime counts for a future timeframe
           using cyclical time series analysis



                                    Measure cyclical patterns
                                  + Identify non-cyclical trend
                                  = Forecast expected count

bit.ly/gorrcrimeforecastingpaper
Load Forecasting

• Measure cyclical patterns
      • Take historic incidents (for example: last five years)
      • Generate multiplicative seasonal indices
          – For each time cycle:
              » time of year
              » day of week
              » time of day
          – Count incidents within each time unit (for example: Monday)
          – Calculate average per time unit if incidents were evenly
            distributed
          – Divide counts within each time unit by the calculated average to
            generate multiplicative indices
              » Index ~ 1 means at the average
              » Index > 1 means above average
              » Index < 1 means below average
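
The index calculation above can be sketched for the day-of-week cycle as follows. This is a minimal illustration that assumes every weekday occurs equally often in the historic window; the function names are hypothetical.

```python
from collections import Counter
from datetime import date

def multiplicative_indices(unit_ids, n_units):
    """Count incidents per time unit and divide by the even-spread average."""
    counts = Counter(unit_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no incidents to index")
    average = total / n_units          # count per unit if evenly distributed
    return [counts[u] / average for u in range(n_units)]

def day_of_week_indices(incident_dates):
    """Indices for Monday (0) through Sunday (6)."""
    return multiplicative_indices((d.weekday() for d in incident_dates), 7)

# Hypothetical incidents: four on Mondays, fewer on the other days.
weekdays = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 6]
dates = [date(2024, 1, 1 + w) for w in weekdays]   # 2024-01-01 is a Monday
print(day_of_week_indices(dates))
```

The same helper applies to time of year or time of day by changing how each incident is mapped to a unit id.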
Load Forecasting

• Identify non-cyclical trend
      • Take recent daily counts (for example: last year daily counts)
      • Remove cyclical trends by dividing by indices




      • Run a trending function on the new counts
          – Simple average
              » Last X Days
          – Smoothing function
              » Exponential smoothing
              » Holt’s linear exponential smoothing
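
A minimal sketch of the trend step, assuming each daily count comes with a positive seasonal index. The smoothing constants `alpha` and `beta` and the initialisation choices are illustrative assumptions; the slides do not specify them.

```python
def deseasonalize(counts, indices):
    """Remove cyclical effects by dividing each count by its index."""
    return [c / i for c, i in zip(counts, indices)]

def simple_average(series, last_n):
    """Average of the last `last_n` values."""
    window = series[-last_n:]
    return sum(window) / len(window)

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; returns the final level."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def holt_linear(series, alpha, beta):
    """Holt's linear exponential smoothing; returns (level, trend).
    Needs at least two observations."""
    level = series[0]
    trend = series[1] - series[0]
    for x in series[1:]:
        previous = level
        level = alpha * x + (1 - alpha) * (previous + trend)
        trend = beta * (level - previous) + (1 - beta) * trend
    return level, trend

adjusted = deseasonalize([20, 10, 14, 16], [2.0, 1.0, 1.0, 1.0])
print(adjusted)
print(simple_average(adjusted, 3))
print(exponential_smoothing(adjusted, 0.5))
print(holt_linear(adjusted, 0.5, 0.5))
```

The simple average and exponential smoothing yield only a level, while Holt's method also yields a slope.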
Load Forecasting

• Forecast expected count
      • Project trend into future timeframe
          – Always flat
              » Simple average
              » Exponential smoothing
          – Linear trend
              » Holt’s linear exponential smoothing
      • Multiply by seasonal indices to reseasonalize the data
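
The projection and reseasonalizing steps can be sketched as below. The sketch assumes the level and trend come from the deseasonalized series and that indices from several cycles are combined by multiplication; both function names are hypothetical.

```python
def combined_index(*indices):
    """Combine indices from several cycles (assumed multiplicative)."""
    product = 1.0
    for index in indices:
        product *= index
    return product

def forecast_counts(level, trend, future_indices):
    """Project the trend forward, then multiply by each period's index.
    Use trend=0.0 for the always-flat techniques."""
    return [(level + trend * step) * index
            for step, index in enumerate(future_indices, start=1)]

# Flat projection (simple average or exponential smoothing)
print(forecast_counts(10.0, 0.0, [2.0, 0.5]))
# Linear projection (Holt's method)
print(forecast_counts(16.0, 2.0, [1.0, 1.0, 0.5]))
```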
Improving CompStat
How Do We Know It’s Accurate?

• Testing
      • Generated forecasting techniques (examples)
            – Commonly Used
                » Average of last 30 days
                » Average of last 365 days
                » Last year’s count for the same time period
            – Advanced Combinations
                » Different cyclical indices (example: day of year vs. month of year)
                » Different levels of geographic aggregation for indices
                » Different trending functions
      • Scoring methodologies (examples)
            – Mean absolute percent error (with some enhancements)
            – Mean percent error
            – Mean squared error
      • Run thousands of forecasts through testing framework
      • Choose the right technique in the right situation
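
A minimal sketch of the three scoring measures and of picking the best technique. The slides do not describe the "enhancements" to mean absolute percent error; skipping periods with an actual count of zero is one assumed workaround, and the sign convention (positive means over-forecast) is also an assumption.

```python
def _nonzero_pairs(actual, forecast):
    """Pairs usable for percent errors (actual count must be nonzero)."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no periods with a nonzero actual count")
    return pairs

def mean_absolute_percent_error(actual, forecast):
    pairs = _nonzero_pairs(actual, forecast)
    return 100.0 * sum(abs(f - a) / abs(a) for a, f in pairs) / len(pairs)

def mean_percent_error(actual, forecast):
    """Signed error: reveals systematic over- or under-forecasting."""
    pairs = _nonzero_pairs(actual, forecast)
    return 100.0 * sum((f - a) / a for a, f in pairs) / len(pairs)

def mean_squared_error(actual, forecast):
    return sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def best_technique(actual, forecasts_by_name,
                   score=mean_absolute_percent_error):
    """Return the technique name with the lowest score."""
    return min(forecasts_by_name,
               key=lambda name: score(actual, forecasts_by_name[name]))

actual = [10, 20]
candidates = {"last_30_days": [12, 18], "indexed_holt": [10, 21]}
print(best_technique(actual, candidates))
```

Because mean percent error is signed, it is better suited to diagnosing bias than to ranking techniques by lowest value.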
Ongoing Research
Research Topics

• Risk Forecasting
   – Load forecasting enhancements
      • Weather and special events




   – Combining short and long term risk forecasts (Temple)
      • Socioeconomic changes in neighborhoods
   – Risk Terrain Modeling (Rutgers)
      • Context of crime at the microplace
Research Topics
Research Topics

• Risk Forecasting
   – Offender Management
      • Prioritize offenders based upon statistical models using past
        behaviors
• Evaluation
   – Automate Randomized Controlled Trials
Data Processing for Big (Geo) Data
A Story
Robert’s Rules of Housing
   Factor                                     Importance
   Close to Center City                       somewhat important
   Walk to Grocery Store                      vital
   Nearby Restaurants                         very important
   Library                                    nice to have
   Near a Park                                somewhat important
   Biking / walking distance from our work    very important
   Biking distance to fencing                 somewhat important
Your factors might include…
                      Child Care
                      Local School Rankings
                      Farmer's Market
                      Car Share
                      Public Transit
We stand on the
shoulders of giants
Not a new idea … Design with Nature
Not a new idea … Dana Tomlin
Desktop GIS
Weighted Overlay


   [Diagram: four map layers weighted x5, x1, x3 and x2, added together (+) to produce one combined layer (=)]
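
A weighted overlay multiplies each raster layer by its weight and sums the results cell by cell. The sketch below uses the slide's weights (5, 1, 3, 2) on tiny hypothetical layers stored as nested lists; the layer names and values are made up for illustration.

```python
def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of equally sized raster layers."""
    rows, cols = len(layers[0]), len(layers[0][0])
    result = [[0] * cols for _ in range(rows)]
    for layer, weight in zip(layers, weights):
        for r in range(rows):
            for c in range(cols):
                result[r][c] += weight * layer[r][c]
    return result

# Hypothetical 1 x 2 layers: 1 = criterion met in that cell, 0 = not met
grocery     = [[1, 0]]
library     = [[1, 1]]
restaurants = [[0, 1]]
park        = [[1, 0]]

print(weighted_overlay([grocery, library, restaurants, park], [5, 1, 3, 2]))
```

Each cell's score depends only on the same cell in each input layer, which is what makes this a local operation.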
Summary

      Geography-driven Decisions

      Iterative

      Individual

      Web [and Mobile]

      Growing data sets
Web Challenges
Web is different from the Desktop

  Lots of simultaneous users

  Stateless environment

  HTML+JS+CSS

  Users are less skilled

  Users are less patient
But wait … there’s a problem
 10 – 60 second calculation time

 Multiple simultaneous users …

 … that are impatient
Data Challenges
Big Data – Social Media
Big Data – Science
Big Data – Citizen Science
Big Data – Cities
Early Prototype
Specific Optimization Goals
 New Raster File Structure

 Distributed processing

 Binary messaging protocol
Optimization: File Format
 Limit data type and range

 1D arrays are fast to read/write

 Tiled

 Pyramids

 Azavea Raster Grid (ARG)
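
A minimal sketch of three of the ideas above: a limited data type, a flat 1D array, and tiles. It is not the actual ARG file layout, which the slides do not describe; the class, its methods, and the choice of signed 8-bit cells are assumptions, and pyramids are left out.

```python
from array import array

class FlatRaster:
    """Raster stored row by row in one flat array of signed 8-bit cells."""

    def __init__(self, cols, rows, fill=0):
        self.cols = cols
        self.rows = rows
        self.data = array("b", [fill]) * (cols * rows)   # values -128..127

    def index(self, col, row):
        if not (0 <= col < self.cols and 0 <= row < self.rows):
            raise IndexError("cell outside raster")
        return row * self.cols + col                     # row-major offset

    def get(self, col, row):
        return self.data[self.index(col, row)]

    def set(self, col, row, value):
        self.data[self.index(col, row)] = value

    def tile(self, col0, row0, width, height):
        """Copy a rectangular window into its own FlatRaster."""
        if col0 + width > self.cols or row0 + height > self.rows:
            raise IndexError("tile outside raster")
        out = FlatRaster(width, height)
        for r in range(height):
            start = self.index(col0, row0 + r)
            out.data[r * width:(r + 1) * width] = self.data[start:start + width]
        return out

raster = FlatRaster(4, 3)
raster.set(2, 1, 7)
print(raster.get(2, 1), len(raster.data.tobytes()))
print(list(raster.tile(2, 1, 2, 2).data))
```

With one byte per cell, a whole row or tile can be read or written as one contiguous slice.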
Optimization: Distributed Processing
 Parallelizable - Local Ops and Focal Ops

 Support multiple
  –   Threads
  –   Cores
  –   CPUs
  –   Machines


 Considered
  – Hadoop
  – Amazon Elastic MapReduce
  – Beowulf
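
Local operations parallelize well because each output cell depends only on the matching input cells, so a raster can be split into row bands that are processed independently. The sketch below shows that decomposition with a thread pool; it is an assumed illustration and not the author's Scala + Akka design. In standard CPython, threads show the structure but typically will not speed up pure-Python arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil

def split_rows(n_rows, n_chunks):
    """Return (start, end) row bounds for roughly equal bands."""
    size = max(1, ceil(n_rows / n_chunks))
    return [(s, min(s + size, n_rows)) for s in range(0, n_rows, size)]

def local_add(a_rows, b_rows):
    """A local operation: add two rasters cell by cell."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a_rows, b_rows)]

def parallel_local_op(op, a, b, workers=4):
    """Apply a two-raster local operation band by band."""
    bounds = split_rows(len(a), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda se: op(a[se[0]:se[1]], b[se[0]:se[1]]), bounds)
        return [row for part in parts for row in part]   # map keeps band order

a = [[1, 2], [3, 4], [5, 6]]
b = [[10, 10], [10, 10], [10, 10]]
print(parallel_local_op(local_add, a, b, workers=2))
```

Focal operations can be split the same way, but each band would also need a border of neighbouring rows, which this sketch leaves out.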
Success!!
  Reduced from 10-60 seconds to

  <500 milliseconds
Optimizing one process sub-optimizes others
   Complex to configure and maintain
   Limited to one operation
   No interpolation
   No mixing
    – cell sizes
    – extents
    – projections
 etc.
 Broader set of functionality

 Both raster and vector

 Scala + Akka

 Open source
Faster is Different
Regional/State:   84 ms
National:         84 ms
Large Country:   115 ms
Continental:     271 ms
Planet:          1.2 – 2.0 s
Ongoing R&D
GPUs
GPU Results
  Rewrote a few Map Algebra operations:
    Local
    Neighborhood
    Zonal
    Viewshed
    etc.
  15 – 120x
  Large grids
  Large kernels
New Spatial Operations
 Vector

 Neighborhood/Focal

 Spatial Statistics

 Integration
Urban Forest Ecosystem Modeling
Crime Analysis, Early Warning and Forecasting
Open Source Geoprocessing

       GDAL

       GeoServer

       PostGIS

      R

       GeoDa
Many Thanks!
© Photo used with permission from Alphafish, via Flickr.com
Big (Geo) Data Science

                 [We are hiring]


Robert Cheetham
cheetham@azavea.com
   @rcheetham

Time series and forecasting from wikipedia
 

Mais de Azavea

Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest Azavea
 
7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinarAzavea
 
Tracking Your Green Infrastructure
Tracking Your Green InfrastructureTracking Your Green Infrastructure
Tracking Your Green InfrastructureAzavea
 
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk UploaderGrowing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk UploaderAzavea
 
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...Azavea
 
Mobile Citizen Science
Mobile Citizen Science Mobile Citizen Science
Mobile Citizen Science Azavea
 
Getting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap CloudGetting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap CloudAzavea
 
HunchLab 2.0 Getting Started
HunchLab 2.0 Getting StartedHunchLab 2.0 Getting Started
HunchLab 2.0 Getting StartedAzavea
 
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...Azavea
 
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...Azavea
 
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...Azavea
 
HunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - PlaceHunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - PlaceAzavea
 
Five Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to KnowFive Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to KnowAzavea
 
PhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital ProjectPhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital ProjectAzavea
 
Fed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army CorpsFed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army CorpsAzavea
 
Fed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis IntroFed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis IntroAzavea
 
Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro Azavea
 
Modeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and RModeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and RAzavea
 
OpenTreeMap NCGIS
OpenTreeMap NCGISOpenTreeMap NCGIS
OpenTreeMap NCGISAzavea
 
OpenTreeMap Overview
OpenTreeMap OverviewOpenTreeMap Overview
OpenTreeMap OverviewAzavea
 

Mais de Azavea (20)

Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest
 
7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar
 
Tracking Your Green Infrastructure
Tracking Your Green InfrastructureTracking Your Green Infrastructure
Tracking Your Green Infrastructure
 
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk UploaderGrowing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
 
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
 
Mobile Citizen Science
Mobile Citizen Science Mobile Citizen Science
Mobile Citizen Science
 
Getting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap CloudGetting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap Cloud
 
HunchLab 2.0 Getting Started
HunchLab 2.0 Getting StartedHunchLab 2.0 Getting Started
HunchLab 2.0 Getting Started
 
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
 
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
 
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
 
HunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - PlaceHunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - Place
 
Five Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to KnowFive Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to Know
 
PhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital ProjectPhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital Project
 
Fed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army CorpsFed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army Corps
 
Fed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis IntroFed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis Intro
 
Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro
 
Modeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and RModeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and R
 
OpenTreeMap NCGIS
OpenTreeMap NCGISOpenTreeMap NCGIS
OpenTreeMap NCGIS
 
OpenTreeMap Overview
OpenTreeMap OverviewOpenTreeMap Overview
OpenTreeMap Overview
 

Último

Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Último (20)

Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

Big (Geo) Data Science Insights

  • 1. Big (Geo) Data Science Robert Cheetham cheetham@azavea.com @rcheetham
  • 3. B Corporation • Projects w/ Social Value • Summer of Maps • Pro Bono Program • Donate share of profits Research-Driven • 10% Research Program • Academic Collaborations • Open Source
  • 4. Spatial Temporal Forecasting with Philadelphia Crime Data
  • 5. How Phila PD uses Maps Customized Map Products Weekly CompStat Meetings Web Crime Analysis
  • 6. INCT & PARS – main database sources: over 5,000 incidents daily, over 2 million annually. [Data-flow diagram; labels: Complainant, Verizon 911, 911 Operator, Radio Dispatcher, CAD, Police Officer, District 48 Desk, Incident Report Completed by Officer, INCT, PARS, Daily download & Geocoding Routines, Districts X / Y / Z, Maps distributed through Intranet, Printing, CompStat]
  • 7. The Context 1,500,000 people 7,000 police 1,000 civilian employees 2,000,000 new incidents / year 3 crime analysts
  • 8. What we did • Weekly Compstat • Lots of maps • Automation of map creation • Web-based systems
  • 9. … but what if we could… • Accelerate the cycle • Proactively notify • Automate the process
  • 10. Prototype VB & MapObjects ArcView .ini file Process Documentation Shapefiles and GRIDs MS SQL Server Crime Incidents Database
  • 12. … but there was a problem …
  • 15. We needed …. 1. Better Statistics 2. Notification 3. Simplicity
  • 17. Crime Analysis – What has happened? – Mapping (spatial / temporal densities) – Trending – Intelligence Dashboard Early Warning – What is out of the ordinary? – Statistical & Threshold-based Hunches (data mining) – Alerting Risk Forecasting – What is likely to happen next? – Near Repeat Pattern – Load Forecasting
  • 18. Crime Analysis – Mapping (spatial / temporal densities) – Trending – Intelligence Dashboard Early Warning – Statistical & Threshold-based Hunches (data mining) – Alerting Risk Forecasting – Near Repeat Pattern – Load Forecasting
  • 23. Early Warning • Geographic Early Warning System – A system to alert staff of an unusual situation in a particular location – Ingests data sets to automatically “cook on” and only involves staff when a statistically unusual situation is found [Diagram; labels: Operational Databases, Geostatistical Engine, HunchLab Database, Alerting System]
  • 25. What is a Hunch? • A proposed hypothesis, saved into the system, and continually tested for validity • Incident Attribute Requirements – Location (x, y) – Time (timestamp) – Classification • Hunch Attributes – Location (area) – Time (recent / historic periods) – Classification • Analyses – Statistical Hunch – Threshold Hunch
  • 26. Hunch Parameters: Location • Address & Radius • Precinct/County/Country • Custom Drawn Area • Mass Hunch
  • 27. Hunch Parameters: Time • Statistical Hunch – Recent Past – Historic Past
  • 28. Hunch Parameters: Classification • Category • Time of Day • Narrative
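The hunch slides above describe a saved hypothesis with a location, a time window, and a classification that is re-tested as new incidents arrive. A minimal sketch of a threshold hunch follows. The class names, field names, the "address & radius" location type, and the day-number timestamps are all assumptions made for illustration; this is not HunchLab's actual data model.

```python
import math
from dataclasses import dataclass


@dataclass
class Incident:
    x: float
    y: float
    day: int        # timestamp simplified to a day number (assumption)
    category: str   # classification


@dataclass
class ThresholdHunch:
    center: tuple     # (x, y) of the address the hunch is centered on
    radius: float     # same distance units as the incident coordinates
    category: str
    recent_days: int  # length of the "recent" window
    threshold: int    # alert when at least this many incidents match

    def matches(self, incident, today):
        """True if the incident falls inside the hunch's area, window and class."""
        in_area = math.dist(self.center, (incident.x, incident.y)) <= self.radius
        in_window = today - self.recent_days < incident.day <= today
        return in_area and in_window and incident.category == self.category

    def is_triggered(self, incidents, today):
        """Re-test the hypothesis against the current incident list."""
        matching = sum(self.matches(i, today) for i in incidents)
        return matching >= self.threshold
```

A statistical hunch would replace the fixed threshold with a comparison of the recent period against the historic period; the slides do not say which test is used, so it is left out here.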
  • 35. Contagious Crime? • Near repeat pattern analysis • “If one burglary occurs, how does the risk change nearby?”
  • 36. What Do We Mean By Near Repeat? • Repeat victimization – Incident at the same location at a later time (likely related) • Near repeat victimization – Incident at a nearby location at a later time (likely related) • Incident A (place, time) --> Incident B (place, time)
  • 37. Near Repeat Pattern Analysis • The goal: – Quantify short term risk due to near-repeat victimization • “If one burglary occurs, how does the risk of burglary for the neighbors change?” • What we know: – Incident A (place, time) --> Incident B (place, time) • Distance between A and B • Timeframe between A and B • What we need to know: – What distances/timeframes are not simply random?
  • 38. Near Repeat Pattern Analysis • The process – Observe the pattern in historic data – Simulate the pattern in randomized historic data – Compare the observed pattern to the simulated patterns – Apply the non-random pattern to new incidents • An example – 180 days of burglaries in Division 6 of Philadelphia
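The observe / simulate / compare process above can be sketched as a small Monte Carlo test. This is an illustrative sketch only, in the spirit of a Knox-style space-time permutation test, and is not the Near Repeat Calculator's implementation. The function names, the single distance/time band, and the choice to randomize by shuffling incident times across locations are assumptions.

```python
import math
import random


def count_close_pairs(points, times, max_dist, max_days):
    """Count incident pairs that are close in both space and time."""
    count = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            near_in_space = math.dist(points[i], points[j]) <= max_dist
            near_in_time = abs(times[i] - times[j]) <= max_days
            if near_in_space and near_in_time:
                count += 1
    return count


def near_repeat_test(points, times, max_dist, max_days, n_sims=199, seed=0):
    """Compare the observed pair count with counts from randomized data."""
    rng = random.Random(seed)
    observed = count_close_pairs(points, times, max_dist, max_days)
    shuffled = list(times)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Simulate: keep the locations, shuffle which time belongs to which one.
        rng.shuffle(shuffled)
        simulated = count_close_pairs(points, shuffled, max_dist, max_days)
        if simulated >= observed:
            at_least_as_extreme += 1
    # Pseudo p-value: share of runs (including the observed one) at least this extreme.
    p_value = (at_least_as_extreme + 1) / (n_sims + 1)
    return observed, p_value
```

A small p-value suggests that the chosen distance/timeframe band holds more pairs than chance alone would produce. A fuller analysis would repeat this over a grid of distance and time bands.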
  • 43. Near Repeat Pattern Analysis • How can you test your own data? – Near Repeat Calculator • http://www.temple.edu/cj/misc/nr/ • Papers – Near-Repeat Patterns in Philadelphia Shootings (2008) • One city block & two weeks after one shooting – 33% increase in likelihood of a second event Jerry Ratcliffe Temple University
  • 46. Improving CompStat • Workload forecasting • “Given the time of year, day of week, time of day and general trend, what counts of crimes should I expect?”
  • 47. What Do We Mean By Load Forecasting? • Workload forecasting • Generating aggregate crime counts for a future timeframe using cyclical time series analysis • Measure cyclical patterns + Identify non-cyclical trend → Forecast expected count • bit.ly/gorrcrimeforecastingpaper
  • 48. Load Forecasting • Measure cyclical patterns • Take historic incidents (for example: last five years) • Generate multiplicative seasonal indices – For each time cycle: » time of year » day of week » time of day – Count incidents within each time unit (for example: Monday) – Calculate average per time unit if incidents were evenly distributed – Divide counts within each time unit by the calculated average to generate multiplicative indices » Index ~ 1 means at the average » Index > 1 means above average » Index < 1 means below average
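The index calculation above (count per time unit, divide by the evenly distributed average) can be sketched as follows. The function name and the input shape, one time-unit label per historic incident, are assumptions for illustration.

```python
from collections import Counter


def seasonal_indices(incident_units, n_units):
    """Multiplicative indices for one time cycle.

    incident_units: one entry per historic incident giving its time unit,
                    e.g. 0=Monday .. 6=Sunday for the day-of-week cycle.
    n_units:        number of units in the cycle (7 for day of week).
    """
    counts = Counter(incident_units)
    # Average count per unit if incidents were spread evenly over the cycle.
    average = len(incident_units) / n_units
    return [counts.get(unit, 0) / average for unit in range(n_units)]


# Example: 105 incidents, concentrated on Friday (4) and Saturday (5).
days = [0] * 10 + [1] * 10 + [2] * 10 + [3] * 10 + [4] * 20 + [5] * 30 + [6] * 15
indices = seasonal_indices(days, 7)
# Saturday's index is 30 / 15, i.e. twice the average day.
```

This sketch assumes the historic window contains the same number of each time unit (for example, whole weeks); otherwise the counts would need to be normalized by how often each unit occurs.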
  • 53. Load Forecasting • Identify non-cyclical trend • Take recent daily counts (for example: last year daily counts) • Remove cyclical trends by dividing by indices • Run a trending function on the new counts – Simple average » Last X Days – Smoothing function » Exponential smoothing » Holt’s linear exponential smoothing
  • 54. Load Forecasting • Forecast expected count • Project trend into future timeframe – Always flat » Simple average » Exponential smoothing – Linear trend » Holt’s linear exponential smoothing • Multiply by seasonal indices to reseasonalize the data
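The trend and forecast steps above can be sketched end to end: divide out the seasonal indices, smooth what is left, project forward, then multiply the indices back in. The function names, the smoothing constants, and the simple initialization of Holt's method are assumptions; this is a sketch of the textbook techniques named on the slides, not the presenter's code.

```python
def deseasonalize(daily_counts, indices_by_day):
    """Remove cyclical effects by dividing each count by its day's index."""
    return [count / index for count, index in zip(daily_counts, indices_by_day)]


def exponential_smoothing(values, alpha=0.3):
    """Simple exponential smoothing; returns the final level (a flat forecast)."""
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def holt_linear(values, alpha=0.3, beta=0.1):
    """Holt's linear exponential smoothing; returns (level, trend)."""
    level, trend = values[0], values[1] - values[0]
    for value in values[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend


def forecast(level, trend, future_indices):
    """Project the trend forward and reseasonalize with the future indices."""
    return [
        (level + trend * step) * index
        for step, index in enumerate(future_indices, start=1)
    ]
```

Passing `trend=0` to `forecast` gives the "always flat" case used with a simple average or simple exponential smoothing.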
  • 55. Load Forecasting • Measure cyclical patterns + Identify non-cyclical trend → Forecast expected count • bit.ly/gorrcrimeforecastingpaper
  • 57. How Do We Know It’s Accurate? • Testing • Generated forecasting techniques (examples) – Commonly Used » Average of last 30 days » Average of last 365 days » Last year’s count for the same time period – Advanced Combinations » Different cyclical indices (example: day of year vs. month of year) » Different levels of geographic aggregation for indices » Different trending functions • Scoring methodologies (examples) – Mean absolute percent error (with some enhancements) – Mean percent error – Mean squared error • Run thousands of forecasts through testing framework • Choose the right technique in the right situation
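The three scoring methodologies listed above have standard definitions, sketched below in their plain form. The "enhancements" mentioned for mean absolute percent error are not described in the slides and are not reproduced here; note that the plain percent-error formulas are undefined when an actual count is zero. The `best_technique` helper and its input shape are hypothetical.

```python
def mean_absolute_percent_error(actual, predicted):
    """Average size of the error as a percentage of the actual count."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def mean_percent_error(actual, predicted):
    """Signed version: shows whether forecasts run high or low on average."""
    return 100 * sum((a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared error; penalizes large misses more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def best_technique(actual, forecasts_by_name, score=mean_squared_error):
    """Pick the technique whose forecasts score lowest (hypothetical helper)."""
    return min(forecasts_by_name, key=lambda name: score(actual, forecasts_by_name[name]))
```

Running many techniques through the same scoring function and keeping the winner per situation (crime type, geography, timeframe) mirrors the "choose the right technique in the right situation" step.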
  • 59. Research Topics • Risk Forecasting – Load forecasting enhancements • Weather and special events – Combining short and long term risk forecasts (Temple) • Socioeconomic changes in neighborhoods – Risk Terrain Modeling (Rutgers) • Context of crime at the microplace
  • 61. Research Topics • Risk Forecasting – Offender Management • Prioritize offenders based upon statistical models using past behaviors • Evaluation – Automate Randomized Controlled Trials
  • 62. Data Processing for Big (Geo) Data
  • 64. Robert’s Rules of Housing • Close to Center City → somewhat important • Walk to Grocery Store → vital • Nearby Restaurants → very important • Library → nice to have • Near a Park → somewhat important • Biking / walking distance from our work → very important • Biking distance to fencing → somewhat important
  • 65. Your factors might include… • Child Care • Local School Rankings • Farmer's Market • Car Share • Public Transit
  • 66. We stand on the shoulders of giants
  • 67. Not a new idea … Design with Nature
  • 68. Not a new Idea … Dana Tomlin
  • 70. Weighted Overlay: (layer 1 × 5) + (layer 2 × 1) + (layer 3 × 3) + (layer 4 × 2) = combined result
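A weighted overlay multiplies each raster layer by its weight and sums the layers cell by cell. The sketch below uses tiny 2×2 grids; the layer names, their values, and the assignment of the weights 5, 1, 3, 2 to particular factors are hypothetical.

```python
def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of equally sized raster layers."""
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    return [
        [
            sum(weight * layer[row][col] for layer, weight in zip(layers, weights))
            for col in range(n_cols)
        ]
        for row in range(n_rows)
    ]


# Hypothetical 2x2 suitability layers (1 = factor present in that cell).
grocery = [[1, 0], [0, 1]]
restaurants = [[1, 1], [0, 0]]
park = [[0, 1], [0, 1]]
library = [[0, 0], [1, 1]]

score = weighted_overlay([grocery, restaurants, park, library], [5, 1, 3, 2])
# score == [[6, 4], [2, 10]]
```

Because every output cell depends only on the same cell in each input layer, this is a local operation and can be split across workers without coordination.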
  • 71. Summary Geography-driven Decisions Iterative Individual Web [and Mobile] Growing data sets
  • 73. Web is different from the Desktop • Lots of simultaneous users • Stateless environment • HTML+JS+CSS • Users are less skilled • Users are less patient
  • 74. But wait … there’s a problem • 10 – 60 second calculation time • Multiple simultaneous users … • … that are impatient
  • 76. Big Data – Social Media
  • 77. Big Data – Science
  • 78. Big Data – Citizen Science
  • 79. Big Data – Cities
  • 82. Specific Optimization Goals • New Raster File Structure • Distributed processing • Binary messaging protocol
  • 83. Optimization: File Format • Limit data type and range • 1D arrays are fast to read/write • Tiled • Pyramids • Azavea Raster Grid (ARG)
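The first two points above, a restricted data type and a flat 1D layout, can be illustrated with Python's standard `array` module. This is only a sketch of the general idea and makes no claim about the actual ARG file layout; the function names and the choice of unsigned bytes (values 0 to 255) are assumptions.

```python
from array import array


def cell_index(row, col, n_cols):
    """Position of a (row, col) cell in a row-major 1D array."""
    return row * n_cols + col


def write_raster(path, cells):
    """Dump the flat array to disk as raw bytes, with no per-cell parsing."""
    with open(path, "wb") as f:
        cells.tofile(f)


def read_raster(path, n_rows, n_cols, typecode="B"):
    """Read n_rows * n_cols cells back in a single call."""
    cells = array(typecode)
    with open(path, "rb") as f:
        cells.fromfile(f, n_rows * n_cols)
    return cells
```

For example, a 2×3 grid can be held as `array("B", [0, 1, 2, 3, 4, 5])`, and the cell at row 1, column 2 is `cells[cell_index(1, 2, 3)]`. The raw file carries no dimensions, so the grid size has to be stored separately as metadata.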
  • 84. Optimization: Distributed Processing • Parallelizable - Local Ops and Focal Ops • Support multiple – Threads – Cores – CPUs – Machines • Considered – Hadoop – Amazon Map Reduce – Beowulf
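Local operations parallelize easily because each output cell depends only on the matching input cells, so a flat raster can be cut into chunks and processed independently. The sketch below shows that decomposition with a thread pool from the standard library. The function names and chunking scheme are assumptions, and threads are used only to keep the example self-contained: in CPython they will not speed up pure-Python arithmetic, so a real system would hand the chunks to processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def local_add(chunk_a, chunk_b):
    """A local (cell-by-cell) operation applied to one chunk."""
    return [a + b for a, b in zip(chunk_a, chunk_b)]


def parallel_local_op(op, raster_a, raster_b, n_chunks=4):
    """Split two flat rasters into chunks, apply op per chunk, and reassemble."""
    size = max(1, -(-len(raster_a) // n_chunks))  # ceiling division
    starts = range(0, len(raster_a), size)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # map() yields results in input order, so the chunks concatenate correctly.
        parts = pool.map(
            lambda start: op(raster_a[start:start + size], raster_b[start:start + size]),
            starts,
        )
        return [cell for part in parts for cell in part]
```

Focal operations can be split the same way, but each chunk also needs the neighboring cells along its edges, so the chunks have to overlap by the kernel radius.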
  • 85. Success!! Reduced from 10-60 seconds to <500 milliseconds
  • 86. Optimizing one process sub-optimizes others • Complex to configure and maintain • Limited to one operation • No interpolation • No mixing – cell sizes – extents – projections • etc.
  • 88. • Broader set of functionality • Both raster and vector • Scala + Akka • Open source
  • 101. Regional/State: 84 ms • National: 84 ms • Large Country: 115 ms • Continental: 271 ms • Planet: 1.2 – 2.0 s
  • 103. GPUs
  • 105. GPU Results • Re-wrote a few Map Algebra operations: – Local – Neighborhood – Zonal – Viewshed – etc. • 15 – 120x • Large grids • Large kernels
  • 106. New Spatial Operations Vector Neighborhood/Focal Spatial Statistics Integration
  • 108. Crime Analysis, Early Warning and Forecasting
  • 109. Open Source Geoprocessing • GDAL • GeoServer • PostGIS • R • GeoDa
  • 110. Many Thanks! © Photo used with permission from Alphafish, via Flickr.com
  • 111. Big (Geo) Data Science [We are hiring] Robert Cheetham cheetham@azavea.com @rcheetham