High Performance OSM Data
Manipulation With Osmium

Jochen Topf
CC-BY http://www.flickr.com/photos/x1brett/4562610437/
Typical Problems

Slow.
Needs a lot of memory/disk space.
Doesn't work with entire planet.
OSM Data

There isn't all that much data
(current planet PBF: 23 GB)
But we need to store it efficiently!
OSM Data

Often we can work on the data
piece by piece
Streaming
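
A minimal sketch of the piece-by-piece idea, not Osmium code: the input is read one line at a time and only three counters are kept, so memory use does not grow with file size. The names `Counts`, `count_objects` and `count_objects_in_text` are made up for this illustration, and the code assumes every OSM object starts on its own line, as in typical OSM XML dumps.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts per object type; this is the only state kept while streaming.
struct Counts {
    std::size_t nodes = 0;
    std::size_t ways = 0;
    std::size_t relations = 0;
};

// Assumption: each OSM object starts on its own line.
// This is a simple line scanner, not an XML parser.
Counts count_objects(std::istream& in) {
    Counts counts;
    std::string line;
    while (std::getline(in, line)) {
        // Skip leading indentation and ignore blank lines.
        const std::size_t pos = line.find_first_not_of(" \t");
        if (pos == std::string::npos) continue;
        if (line.compare(pos, 6, "<node ") == 0) ++counts.nodes;
        else if (line.compare(pos, 5, "<way ") == 0) ++counts.ways;
        else if (line.compare(pos, 10, "<relation ") == 0) ++counts.relations;
    }
    return counts;
}

// Convenience wrapper for in-memory text, handy for small extracts.
Counts count_objects_in_text(const std::string& text) {
    std::istringstream in(text);
    return count_objects(in);
}
```

Passing `std::cin` to `count_objects` would process a file piped in on stdin without ever holding more than one line in memory.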
C++
Osmium

A fast and flexible C++ library
for working with OSM data
Modular

CC-BY http://www.flickr.com/photos/jronaldlee/4479381576/
Has to work with
data of entire
planet!

...or a
small extract!
Features

Basic OSM objects:
Nodes, ways, relations, tags, ...
And operations on them.
Tag filtering
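
To make "tag filtering" concrete, here is a rough sketch of what such a filter could look like. It uses a simplified, hypothetical tag model (`Tag`, `TagList`, `Rule`, `TagFilter` are invented for this example) and is not the Osmium API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified tag model; Osmium has its own classes for this.
using Tag = std::pair<std::string, std::string>; // key, value
using TagList = std::vector<Tag>;

// A rule matches a key and, if `value` is non-empty, that exact value too.
struct Rule {
    std::string key;
    std::string value; // empty means "any value"
};

class TagFilter {
public:
    void add_rule(std::string key, std::string value = "") {
        assert(!key.empty()); // a rule without a key would match nothing useful
        rules_.push_back(Rule{std::move(key), std::move(value)});
    }

    // True if at least one tag matches at least one rule.
    bool matches(const TagList& tags) const {
        for (const Tag& tag : tags) {
            for (const Rule& rule : rules_) {
                if (tag.first == rule.key &&
                    (rule.value.empty() || tag.second == rule.value)) {
                    return true;
                }
            }
        }
        return false;
    }

private:
    std::vector<Rule> rules_;
};
```

With the rules `highway` (any value) and `amenity=pub`, an object tagged `highway=residential` passes the filter and one tagged only `amenity=cafe` does not.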
Input/Output
Read from: file, stdin or URL.
Write to: file or stdout.
XML or PBF.
Compressed or uncompressed.
OSM data (.osm) or changes (.osc).
With or without history.
Geometry

Add node locations to ways
Assemble Multipolygons
Convert geometries to WKT, WKB, OGR, GEOS
Line length (haversine)
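
For reference, a line length based on the haversine formula can be sketched as below. The `Location` struct, the function names and the mean Earth radius of 6,371,000 m are assumptions made for this example; Osmium's own implementation may differ in naming and constants.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration; not Osmium's own location class.
struct Location {
    double lon; // degrees
    double lat; // degrees
};

constexpr double kEarthRadiusM = 6371000.0; // assumed mean Earth radius in metres
constexpr double kPi = 3.14159265358979323846;

inline double deg_to_rad(double deg) { return deg * kPi / 180.0; }

// Great-circle distance in metres between two locations (haversine formula).
double haversine_distance(const Location& a, const Location& b) {
    assert(a.lat >= -90.0 && a.lat <= 90.0 && b.lat >= -90.0 && b.lat <= 90.0);
    const double s1 = std::sin(deg_to_rad(b.lat - a.lat) / 2.0);
    const double s2 = std::sin(deg_to_rad(b.lon - a.lon) / 2.0);
    const double h = s1 * s1 +
        std::cos(deg_to_rad(a.lat)) * std::cos(deg_to_rad(b.lat)) * s2 * s2;
    // Clamp to 1.0 so rounding errors cannot push asin() out of its domain.
    return 2.0 * kEarthRadiusM * std::asin(std::sqrt(std::min(1.0, h)));
}

// Length of a line: sum of the distances between consecutive points.
double line_length(const std::vector<Location>& line) {
    double sum = 0.0;
    for (std::size_t i = 1; i < line.size(); ++i) {
        sum += haversine_distance(line[i - 1], line[i]);
    }
    return sum;
}
```

With this radius, one degree of longitude along the equator comes out at roughly 111.2 km.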
Handler

OSM file -> Reader -> Handler -> Writer -> OSM file
Temp. Storage
Writer and output file: for converter and filter
Example: main
#include <osmium/io/any_input.hpp>
// NamesHandler is defined in the handler example.
int main(int argc, char* argv[]) {
    osmium::io::Reader reader(argv[1]);
    NamesHandler handler;

    reader.open();
    reader.push(handler);
}
Example: handler
#include <iostream>
#include <osmium/handler.hpp>
struct NamesHandler : public osmium::handler::Handler<NamesHandler> {
    void node(const osmium::Node& node) {
        auto n = node.tags().get_value_by_key("name");
        if (n) std::cout << n << std::endl;
    }
};
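
The handler above derives from a base class templated on itself (the "curiously recurring template pattern"). A self-contained sketch of how such a base can work is shown here. `Node`, `Way`, `HandlerBase`, `NameCollector` and `push_all` are stand-ins invented for this example and are not Osmium's classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for OSM objects; not Osmium's classes.
struct Node { long id; std::string name; };
struct Way  { long id; };

// CRTP base: supplies empty callbacks, so a derived handler only
// defines the ones it cares about.
template <typename Derived>
struct HandlerBase {
    void node(const Node&) {}
    void way(const Way&) {}

    // Forward to the derived class; resolved at compile time, no virtual call.
    void dispatch(const Node& n) { static_cast<Derived*>(this)->node(n); }
    void dispatch(const Way& w)  { static_cast<Derived*>(this)->way(w); }
};

// Collects the names of all named nodes and ignores ways.
struct NameCollector : public HandlerBase<NameCollector> {
    std::vector<std::string> names;
    void node(const Node& n) {
        if (!n.name.empty()) names.push_back(n.name);
    }
};

// Stand-in for a reader pushing objects into a handler one by one.
template <typename Handler>
void push_all(const std::vector<Node>& nodes,
              const std::vector<Way>& ways,
              Handler& handler) {
    for (const Node& n : nodes) handler.dispatch(n);
    for (const Way& w : ways) handler.dispatch(w);
}
```

Because the callbacks are resolved at compile time, an empty default such as `way()` can be inlined away entirely, which matters when it is called for every object in a planet file.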
taginfo.openstreetmap.org

Statistics for
61 million different tags
on 2.2 billion objects.
Runs for about two hours every day.
Needs less than 8 GB RAM.
Linux
Mac OS X
Windows
Osmium History
Development started October 2010

Recently started "New Osmium"

The New Osmium

Object Storage/Transport
Indexes
Multithreading
(no multipolygon support yet)

C++11

Modern C++
Official ISO standard
Works with GCC 4.7.3, clang 3.2
Easier to write, more efficient, cleaner code

Multithreading

Better design to take advantage of multithreading
Dynamic memory allocation is even more costly than
with a single thread

osmcode.org

Osmium
and
Osmium-based
software
github.com/osmcode

Javascript

Old Osmium: osmjs
New Osmium: Working on NodeJS module
Status

Old Osmium: Tried and tested,
In production for >2 years
New Osmium: New and untested,
Not production ready yet
Thanks!
Hackday
tomorrow!

wiki.osm.org/wiki/Osmium
github.com/joto/osmium
osmcode.org
github.com/osmcode/libosmium

Jochen Topf
jochen@topf.org
jochentopf.com

High Performance OSM Data Manipulation With Osmium - State of the Map 2013