GeoMesa: Scalable Geospatial Analytics 
Chris Eichelberger 
christopher.eichelberger@ccri.com
terms 
• GeoMesa: an open-source project organized under LocationTech 
• scalable: if you can continue to solve problems as N >> 1 with no more change than adding hardware and minor tweaks, you scale
• geospatial: data that contain a geographic reference, a date/time, and zero or more additional attributes
• analytics: formally, a logical decomposition via truth-preserving transformations; informally, any useful derivation (whether deductive or inductive)
outline 
• part 1: why? (3 minutes)
• part 2: how? (10 minutes)
• part 3: what? (10 minutes)
• part 4: who? (2 minutes)
part 1: why?
[why] which X (points) are close to location Y? 
• hundreds: PostgreSQL and brute force 
– full table scan 
• hundreds of thousands: PostgreSQL and PostGIS 
– GeoTools API 
– GiST (think R-trees) 
• hundreds of millions: a funny thing happens as you collect much more data...
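As a rough illustration of the "hundreds of points" case, a brute-force proximity query can be sketched as a full scan that measures the distance from every point to the query location. The function names, the sample coordinates (rounded from the city table later in the deck), and the use of the haversine formula are all assumptions made for this sketch, not something the slides specify.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within(points, lat, lon, radius_km):
    """Full scan: every point is tested, so each query costs O(N)."""
    return [name for name, plat, plon in points
            if haversine_km(plat, plon, lat, lon) <= radius_km]


# Hypothetical sample data: (name, latitude, longitude) in decimal degrees.
POINTS = [
    ("Ottawa", 45.4208, -75.69),
    ("Montréal", 45.5, -73.5667),
    ("Charlottesville", 38.03, -78.4789),
]

# Points within 200 km of a location just outside Ottawa.
print(points_within(POINTS, 45.4, -75.7, 200))
```

Because every query touches every row, the cost grows linearly with N, which is why this approach stops being practical well before hundreds of millions of points.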
[why] dissolution of large-volume data
[why] perhaps SQL is the bottleneck? 
• NoSQL databases, such as Apache Accumulo 
• trade ACID for distributed processing, storage 
• but there’s no PostGIS for Accumulo, so how does the canonical diagram of an Accumulo (key, value) pair help us answer some simple questions...
[why] questions that ought to be easy for an index to answer 
• easy question: Which comes first, “Ontario” or “Quebec”? 
• similar question: Which comes first, [image] or [image]?
• simplify: think only of representative cities, and treat them strictly as points
[why] geohashing 
City                               Coordinates (courtesy Wikipedia)   Geohash
Ottawa                             45°25′15″N 75°41′24″W              f244m
Montréal                           45°30′N 73°34′W                    f25dv
Charlottesville (Virginia, USA)    38°1′48″N 78°28′44″W               dqb0q
• Two distinct orders:
– Order by name: Charlottesville, Montréal, Ottawa
– Order by longitude or latitude or geohash: Charlottesville, Ottawa, Montréal
• Lexicoding (location -> geohash) provides a deterministic, repeatable ordering
– with this, we can index, store, and query points by lexicographic ranges
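The geohash values in the table can be reproduced with the standard public geohash scheme: alternately bisect the longitude and latitude ranges, emit one bit per bisection, and encode each group of five bits as a base-32 character. The sketch below is a minimal illustration of that scheme, not GeoMesa code. The helper names (`geohash`, `dms`, `prefix_scan`) are made up for this example, and the prefix scan over a sorted list merely stands in for a range scan over a sorted key-value store.

```python
import bisect

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash(lat, lon, length=5):
    """Encode a point as a geohash by alternately bisecting lon and lat."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, nbits = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | bit
        nbits += 1
        use_lon = not use_lon
        if nbits == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)


def dms(degrees, minutes=0, seconds=0, sign=1):
    """Degrees/minutes/seconds to signed decimal degrees (sign=-1 for W or S)."""
    return sign * (degrees + minutes / 60 + seconds / 3600)


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix using one lexicographic range."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[lo:hi]


CITIES = {
    "Ottawa": (dms(45, 25, 15), dms(75, 41, 24, sign=-1)),
    "Montréal": (dms(45, 30), dms(73, 34, sign=-1)),
    "Charlottesville": (dms(38, 1, 48), dms(78, 28, 44, sign=-1)),
}

hashes = {name: geohash(lat, lon) for name, (lat, lon) in CITIES.items()}
print(hashes)   # Ottawa -> f244m, Montréal -> f25dv, Charlottesville -> dqb0q
print(prefix_scan(sorted(hashes.values()), "f2"))  # ['f244m', 'f25dv']
```

Ottawa and Montréal share the prefix `f2`, so a single range scan over sorted keys returns both while skipping Charlottesville. That is the property the lexicographic-range bullet relies on.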
[why] build-versus-buy remorse 
• PostgreSQL+PostGIS has some nice functions 
– geometric predicates 
– secondary indexes 
– standard GeoTools API 
• some of our data are (multi) lines, (multi) polygons 
• time is often more than a secondary consideration 
• sometimes, analysis work needn’t be done on the same old client 
– distributed across the tablet servers? 
– using tools like Spark? 
– streaming?
[why] synthesis
part 2: how?
[how] GeoMesa features 
• GeoTools API 
• sharding distributes queries uniformly 
• flexible SFC can incorporate time 
• supports (multi) point, (multi) line, (multi) polygon geometries 
• secondary indexes and a multi-stage query planner 
• burgeoning raster support via WCS 
• GeoServer as a plugin-based GUI 
• WPS standards for computation (and function chaining)
[how] GeoTools API
[how] sharding
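One common way to get the uniform distribution mentioned in the feature list is to prefix each row key with a shard number derived from a hash of the feature identifier. A spatial query then fans out into one key range per shard. The sketch below illustrates that general idea only. The shard count, the key layout, and the function names are hypothetical and are not taken from GeoMesa.

```python
import zlib

NUM_SHARDS = 4   # hypothetical shard count
SEP = "~"        # hypothetical separator between key parts
HIGH = "\uffff"  # sentinel that sorts after any key character used here


def shard_of(feature_id, num_shards=NUM_SHARDS):
    """Deterministically map a feature id to a shard number."""
    return zlib.crc32(feature_id.encode("utf-8")) % num_shards


def sharded_key(feature_id, geohash, num_shards=NUM_SHARDS):
    """Row key of the form <shard>~<geohash>~<id>."""
    return SEP.join([str(shard_of(feature_id, num_shards)), geohash, feature_id])


def query_ranges(geohash_prefix, num_shards=NUM_SHARDS):
    """One (start, end) key range per shard for a geohash prefix query."""
    return [(f"{s}{SEP}{geohash_prefix}", f"{s}{SEP}{geohash_prefix}{HIGH}")
            for s in range(num_shards)]


key = sharded_key("track-42", "f244m")
matching = [(lo, hi) for lo, hi in query_ranges("f2") if lo <= key < hi]
print(key, "falls in", matching)
```

The trade-off is that every query issues `NUM_SHARDS` range scans instead of one, in exchange for spreading nearby points (and the load they attract) across servers.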
[how] space-filling curve progression 
%~#s%3#r%0,3#gh%yyyyMM#d::%~#s%3,2#gh::%~#s%5,2#gh%HHmm#d%id
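The slide shows the pattern without explaining its syntax, so the following is only one plausible reading, stated here as an assumption: `::` separates the row, column family, and column qualifier; `%~#s` sets `~` as the separator; `%3#r` is a shard number; `%a,b#gh` takes `b` geohash characters starting at offset `a`; `%...#d` formats the date; and `%id` is the feature identifier. Under that reading, coarse space and coarse time go in the row, and finer space and finer time follow in the column parts. The function name and the sample values below are hypothetical.

```python
from datetime import datetime

SEP = "~"  # assumed meaning of "%~#s"


def compose_key(shard, geohash7, when, feature_id):
    """Build (row, column family, column qualifier) under the assumed reading.

    row:              shard ~ geohash[0:3] ~ yyyyMM
    column family:    geohash[3:5]
    column qualifier: geohash[5:7] ~ HHmm ~ id
    """
    if len(geohash7) < 7:
        raise ValueError("need at least 7 geohash characters")
    row = SEP.join([str(shard), geohash7[0:3], when.strftime("%Y%m")])
    col_family = geohash7[3:5]
    col_qualifier = SEP.join([geohash7[5:7], when.strftime("%H%M"), feature_id])
    return row, col_family, col_qualifier


# "f244m12" is an illustrative 7-character geohash, not a computed one.
print(compose_key(1, "f244m12", datetime(2014, 11, 13, 9, 30), "track-42"))
# ('1~f24~201411', '4m', '12~0930~track-42')
```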
[how] multi-step query planning
[how] non-point geometries
[how] rasters + GeoWave integration
[how] supporting other frameworks
[how] GeoServer as a plug-in GUI
[how] Web Processing Service 
• WPS is another OGC standard 
• Think of it as an abstract function definition, mapping input types to output types, and defining 
the computation that occurs between the two. 
• WPS processes can be chained. 
• This provides for a natural extension mechanism to GeoMesa.
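The chaining idea can be pictured with ordinary function composition: each process maps an input type to an output type, and a chain feeds one process's output into the next. The sketch below illustrates only that concept. It does not speak the WPS protocol, and the process names (`within_bbox`, `centroid`) and the `chain` helper are hypothetical.

```python
from functools import reduce


def chain(*processes):
    """Compose processes left to right: the output of one feeds the next."""
    def run(value):
        return reduce(lambda acc, process: process(acc), processes, value)
    return run


def within_bbox(min_lon, min_lat, max_lon, max_lat):
    """Process factory: list of (lon, lat) -> list of (lon, lat) inside the box."""
    def process(points):
        return [(lon, lat) for lon, lat in points
                if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat]
    return process


def centroid(points):
    """Process: list of (lon, lat) -> a single (lon, lat) mean position."""
    if not points:
        raise ValueError("centroid of an empty point set is undefined")
    n = len(points)
    return (sum(lon for lon, _ in points) / n,
            sum(lat for _, lat in points) / n)


pipeline = chain(within_bbox(-80.0, 40.0, -70.0, 50.0), centroid)
points = [(-75.69, 45.42), (-73.57, 45.50), (-78.48, 38.03)]
print(pipeline(points))  # mean of the two points inside the box
```

A new analytic becomes available to every existing chain simply by being written as one more process with declared inputs and outputs.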
[how] synthesis 
Those are merely the highlights of some of GeoMesa’s current features… 
… so what?
part 3: what?
[what] distributing computation
[what] queries that interpolate both position and time
[what] K-nearest neighbor
[what] clustering (DBSCAN)
[what] near-real-time streaming track analytics with web sockets
[what] track viewer utility
part 4: who?
[who] LocationTech and the greater community
[who] synthesis
questions 
For extended questions: 
geomesa-user@locationtech.org 
geomesa@ccri.com 
christopher.eichelberger@geomesa.org 
For additional reading: 
geomesa.org 
For code: 
github.com/locationtech/geomesa
