The World of Geocoding and challenges in India
Dr. Nishant Sinha
Discussion
• Introduction to Geocoding
• All about spatial data and real world data
• Addressing system of developing nation
• Addressing system in India
• Data sources and standardization
• Data arrangement
• Geocoding
• Challenges
• Steps to overcome challenges
• Q & A
Introduction to Geocoding
What is Geocoding?
Geocoding is the process of transforming a description of a location (such as a pair of
coordinates, an address, or a name of a place) to a location on the earth's surface.
Source | Definition | Possible Problems
Environmental Sciences Research Institute (1999) | The process of matching tabular data that contains location information such as street addresses with real-world coordinates. | Limited to coordinate output only.
Harvard University (2008) | The assignment of a numeric code to a geographical location. | Limited to numeric code output only.
Statistics Canada (2008) | The process of assigning geographic identifiers (codes) to map features and data records. | Limited input range.
U.S. Environmental Protection Agency (2008) | The process of assigning latitude and longitude to a point, based on street addresses, city, state and USPS ZIP Code. | Limited to coordinate output only.
What is Geocoding?
Geocoding (verb) is the act of transforming
aspatial locationally descriptive text into a valid
spatial representation using a predefined process.
A geocoder (noun) is a set of inter-related
components in the form of operations, algorithms,
and data sources that work together to produce a
spatial representation for descriptive locational
references.
A geocode (noun) is a spatial representation of a
descriptive locational reference.
To geocode (verb) is to perform the process of
geocoding.
Geocoding ………
Some points to ponder…….
• Does geocoding refer to a specific
computational process of transforming
something into something else, or simply
the concept of a transformation?
• Is a geocode a real-world object, simply an
attribute of something else, or the process
itself?
• Is a geocoder the computer program that
performs calculations, a single component
of the process, or the human who makes
the decisions?
Spatial real world data
What is there in
Spatial data….
Location information
• The fundamental primitive is the
point, a 0-dimensional (0-D) object
that has a position in space but no
length.
• A line is a 1-D geographic object
having a length and is composed
of two or more 0-D point objects.
• A polygon is a geographic object
bounded by at least three 1-D line
objects or segments with the
requirement that they must start
and end at the same location (i.e.,
node)
Spatial Data
• Spatial data are often referred to as layers
• Layers represent features on, above, or
below the surface of the earth
• Data layers are of 2 major types
• Vector data represent features as discrete
points, lines, and polygons.
• Raster data represent the landscape as a
rectangular matrix of square cells.
The Real World
Vector Data
Raster Data
Spatial Data
• Vector
• TAB Files
• Shape Files
• CAD (AutoCAD DXF & DWG)
• Raster
• Grids
• Images
• Digital Elevation Models (DEMs)
Geocoding and Spatial Data
• Most geocoding applications work with
vector-based GIS data.
• The key aspects from a geocoding
perspective are the ability to:
• Determine and record the locations of
these objects on the surface of the Earth,
and
• Calculate distance, because many
geocoding algorithms rely on one or more
forms of linear interpolation.
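The role of distance and linear interpolation can be illustrated with a minimal sketch. It assumes a street segment stored as two (latitude, longitude) end points with a known house-number range, and it assumes houses are spread evenly along the segment. The function names and the coordinates in the usage note are hypothetical, chosen only for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def interpolate_address(number, low, high, start, end):
    """Place house `number` on the segment start -> end.

    Assumes house numbers low..high are evenly distributed along the
    segment, which is a simplification for real streets.
    """
    if not low <= number <= high:
        raise ValueError("house number outside the segment's address range")
    fraction = 0.0 if high == low else (number - low) / (high - low)
    lat = start[0] + fraction * (end[0] - start[0])
    lon = start[1] + fraction * (end[1] - start[1])
    return lat, lon
```

For example, `interpolate_address(145, 140, 150, (28.6, 77.2), (28.6, 77.202))` places number 145 at the midpoint of the segment, and `haversine_m` can then report how far that point lies from either end.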
Addressing system of developing nation
Addressing System
• Addresses are one of the fundamental means by which
people conceptualize location in the modern world
• Addresses are of 2 Types
• Relative input data
• Examples of these types of data include “Across the street from Togo’s”
and “The northeast corner of Vermont Avenue and 36th Place.”
• Absolute input data
• Example House no-xxxx, ABC Street, JJJJJ Locality, XYZ City, QQ
State, DEF Country, ###### Postal Code
Addresses
• Come in a variety of formats
• Address Components
• Address (house or building) number
• Prefix direction
• Prefix type
• Street name
• Street type
• Suffix direction
• Zone
What is street addressing?
• Street addressing is an exercise that makes it
possible to identify the location of a plot or
dwelling on the ground, that is, to “assign an
address” using a system of maps and signs that
give the numbers or names of streets and
buildings
• Street addressing provides an opportunity to
• Create a map of the city that can be used by different
municipal units
• Conduct a systematic survey that collects a
significant amount of information about the city and
its population, and
• Set up a database on the built environment
History of street addressing across some countries
• Before 1728, no street names were indicated in Paris,
except in very rare cases, such as “Rue Saint-
Dominique, formerly des Vaches” (1643)
• In Belgrade, changes to street names are frequent
• In China, references to geographical direction appear in
the name of the country itself (country of the center) and
in the names of the capitals, Beijing (capital of the North)
and Nanjing (capital of the South). Names of the provinces
are also strongly influenced by these references to
geographical direction
• Aside from a few main thoroughfares, streets in Japan
do not have names. In fact, the city districts (ku) are
divided into neighborhoods (chome) that group
together several dozen houses and thus form a block.
Street Addressing System: Some Common Practices
• Sequential alternating numbering
systems
• Decametric numbering system
• Codification of intersections
• Combined addressing system
Segment Distance (m) No. left side No. right side
1 0–10 1 2
2 10–20 3 4
3 20–30 5 6
4 30–40 7 8
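The table above can be read as a distance-based numbering rule: every 10 m segment receives one odd number on the left side and one even number on the right. The sketch below reproduces the table under that reading. The function name is hypothetical, and treating each segment as a half-open interval (so that exactly 10 m falls in segment 2) is an assumption, since the table's ranges share their end points.

```python
def decametric_numbers(distance_m, segment_length_m=10.0):
    """Return (segment, left_number, right_number) for a point located
    `distance_m` metres from the start of the street.

    Segments are half-open intervals [0, 10), [10, 20), ... (assumption).
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    segment = int(distance_m // segment_length_m) + 1
    return segment, 2 * segment - 1, 2 * segment
```

A plot located 35 m along the street falls in segment 4, so `decametric_numbers(35)` gives numbers 7 (left) and 8 (right), matching the last row of the table.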
Sequential alternating numbering systems
Codification of intersections
Combined addressing system
Addressing system in India
Addressing System in India
• Spatial data capturing based on demand rather
than homogeneous capture
• Streets with no names or unstructured addresses
• Absence of consistent and accurate dataset
throughout the area being geocoded
• Presence of slum-like areas that change frequently
and are not street addressable
• Non-existence of reference datasets or GIS data
infrastructure
• Lack of hierarchical data structure beyond tehsil
• Absence of standardized geocoding algorithm
• Non-existence of approach to validate assumptions
made in the geocoding algorithm
Sample address varieties in India
Address with Person Name
c/o Yashwant S.Prabhu , 318, C - Wing, Suyog Co.Housing Society Ltd, T. P.S. Road & III Link Road, Vazira, Borivali, West Mumbai, Maharashtra, 400092
c/o Late Esmail Bagani, Y/2/122, Satghara Road, PO- Badartala, PS – Nadial, Kolkata, West Bengal, 700044
Address with Building names
White C/403, Aamrpali Appt, opp. GHB complex, Ankur Road, Ahmedabad, Gujarat, 380013
13/9, Daksha Bldg, Vallabh Baug Lane, Ghatkopar, Mumbai, Maharashtra, 400077
Address with House No
299/15, Padmavati Vikar Mandal, Shahibaug, Ahmedabad, Gujarat, 380001
NO88, Srinivasa Nagar, 2NS Main Road, Kolathur, Chennai, Tamil Nadu 600099
Address with Street Name
1304, Cornation Road, Bargarpet, Kolar, Bangalore, Karnataka, 560000
20K, Dhakuria Station Road, Dhakuria PS: Jadavpur, Kolkata, West Bengal, 700031
Address with POI
BMC Software, Next Muttha Chamber, Senapati Bapat Road, Pune, Maharashtra, 411016
Life Style International Pvt. Ltd., Near Payal Cinema Complex, Gurgaon, Haryana, 122001
India Dynamics
• 35 States & UTs, ~650 districts, ~6,000 tehsils/towns, more than
30,000 postcodes
• 21 regional languages, two official languages (Hindi & English)
• 3.2 million sq km area, 1.2 billion population
• Administrative Hierarchy
• State >> District >> Tehsil/Town >> Ward >> Locality/Village >> Sub locality >> Block/Pocket
• Addressing Pattern
• Near Govt. Hopital, Zirapur , District Rajgarh (MP)
• 351 Ground Floor , Shakti khand 3 , Near St. Teresa School ,
Indirapuram, Ghaziabad 201010 U.P.
• 9 Mansarovar Colony Opp. 3/686, Kala kuan Housing Board Alwar
301001
• 9/19/98/19-D Flat No. # 303 Hitech City Madhapur Hyderabad
• 176 Devi Nagar New Sanganer Road Sodala , Jaipur Rajasthan
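Several of these patterns locate a place relative to a landmark ("Near", "Opp.", "Next"). A minimal sketch of how such cues might be separated from the rest of an address is shown below. The cue list is an assumption drawn only from the sample addresses in this deck, the function name is hypothetical, and the sketch only recognizes a cue at the start of a comma-separated part, so an address written without commas is left untouched.

```python
import re

# Cue words seen in the sample addresses; a production list would be longer.
LANDMARK_CUE = re.compile(
    r"\b(near|opp(?:osite)?|next(?:\s+to)?|behind)\b\.?\s*", re.IGNORECASE
)


def split_landmark(address):
    """Return (relation, landmark, remaining_parts) for one address string."""
    parts = [p.strip() for p in address.split(",") if p.strip()]
    relation, landmark, remaining = None, None, []
    for part in parts:
        match = LANDMARK_CUE.match(part)  # cue must start the part
        if match and landmark is None:
            relation = match.group(1).lower()
            landmark = part[match.end():].strip()
        else:
            remaining.append(part)
    return relation, landmark, remaining
```

Applied to "351 Ground Floor , Shakti khand 3 , Near St. Teresa School , Indirapuram, Ghaziabad 201010 U.P.", the sketch returns the relation `near`, the landmark `St. Teresa School`, and the four remaining parts, which could then be matched against a POI dataset and the administrative hierarchy separately.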
Indian Postal Code
The History of Postal System in India
• Long colonial realm – British, Mughal,
Portuguese…….
• Britain's involvement in the postal services of
India began in the eighteenth century
• Warren Hastings (Governor General of
British India from 1773-1784) opened the
posts to the public in March 1774
• Main purpose of the postal system had been
to serve the commercial interests of the East
India Company and to serve Govt. orders
Data sources and standardization
India Geocoding Data
Streets
• NH
• SH
• Local roads
Place of Interest
• Banks
• Retail
• Hospitals
• Other landmarks
Administrative
• State
• District
• Town
• Locality
• Sub locality
Geography
• Block
• Locality
• Town
• Postcode
Address Point
• House no
• Building name
Reference datasets
• The reference dataset is the underlying
geographic database containing
geographic features that the geocoder can
use to generate a geographic output.
• This dataset stores all of the information the
geocoder knows about the world and
provides the base data from which the
geocoder calculates, derives, or obtains
geocodes.
Type | Example
Vector line file | U.S. Census Bureau’s TIGER/Line (United States Census Bureau 2008c)
Vector polygon file | Los Angeles (LA) County Assessor Parcel Data (Los Angeles County Assessor 2008)
Vector point file | Australian Geocoded National Address File (G-NAF) (Paull 2003)
Reference dataset types
• Linear-Based Reference Datasets
• Roads, Ferries
• Polygon-Based Reference datasets
• Administrative Boundaries, Postal Codes
• Point-Based Reference Datasets
• POIs
Source | Description | Coverage | Cost
Tele Atlas (2008c), NAVTEQ (2008) | Building footprints, parcel footprints | Worldwide, but sparse | Expensive
County or municipal Assessors | Building footprints, parcel footprints | U.S., but sparse | Relatively inexpensive but varies
U.S. Census Bureau | Census Block Groups, Census Tracts, ZCTA, MCD, MSA, Counties, States | U.S. | Free
Name | Description | Coverage
U.S. Census Bureau’s TIGER/Line files (United States Census Bureau 2008c) | Street centerlines | U.S.
NAVTEQ Streets (NAVTEQ 2008) | Street centerlines | Worldwide
Tele Atlas Dynamap, MultiNet (Tele Atlas 2008a, c) | Street centerlines | Worldwide
Supplier | Product | Description | Coverage
Government | GeoNames (United States National Geospatial-Intelligence Agency 2008) | Gazetteer of geographic features | World, excepting U.S.
Academia | Alexandria Digital Library (2008) | Gazetteer of geographic features | World
The geocoding algorithm
• The geocoding algorithm performs two basic tasks:
• Feature matching
• Feature interpolation
Input data processing
Address normalization
• Address normalization organizes and
cleans input data to increase its
efficiency for use and sharing
Address standardization
• Address standardization converts an
address from one normalized format into
another. It is closely linked to
normalization and is heavily influenced
by the performance of the normalization
process.
Sample Address
3620 South Vermont Avenue, Unit 444, Los Angeles, CA 90089-0255
3620 S Vermont Ave, #444, Los Angeles, CA 90089-0255
3620 S Vermont Ave, 444, Los Angeles, 90089-0255
3620 Vermont, Los Angeles, CA 90089
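The first two sample addresses above describe the same unit in different surface forms. A minimal normalization sketch that brings them to one form is given below. The abbreviation table is a small hypothetical stand-in, not an official postal standard, and a real table would need context (for example, "ST" can mean Street or Saint).

```python
import re

# Hypothetical abbreviation table for illustration only.
ABBREVIATIONS = {
    "SOUTH": "S", "NORTH": "N", "EAST": "E", "WEST": "W",
    "AVENUE": "AVE", "STREET": "ST", "ROAD": "RD",
    "UNIT": "#",
}


def normalize_address(raw):
    """Upper-case, strip punctuation, and abbreviate known tokens."""
    text = raw.upper().replace("#", " # ")  # treat "#444" like "# 444"
    tokens = re.findall(r"[A-Z0-9#-]+", text)  # drops commas and periods
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)
```

Both "3620 South Vermont Avenue, Unit 444, Los Angeles, CA 90089-0255" and "3620 S Vermont Ave, #444, Los Angeles, CA 90089-0255" normalize to `3620 S VERMONT AVE # 444 LOS ANGELES CA 90089-0255`. The third and fourth samples still differ because they omit attributes, which is a matching problem rather than a normalization one.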
Output data
• The last component of the geocoder is the actual
output data, which are the valid spatial
representations derived from features in the
reference dataset.
• Data can have many different forms and formats, but
each must contain some type of valid spatial
attribute.
• The most common format of output is points
described with geographic coordinates (latitude,
longitude).
• Alternate forms can include multi-point
representations such as polylines or polygons.
• These geocoder outputs, while in the same format
and produced through the same process, do not
represent data at the same geographic resolution
and must be differentiated.
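One way to keep outputs of different geographic resolution apart is to store the resolution with every geocode. The sketch below is an assumption about how such a record could look, not a description of any particular geocoder. The `Resolution` levels are hypothetical and loosely follow the data layers named earlier (address point, street, locality, postcode, town).

```python
from dataclasses import dataclass
from enum import IntEnum


class Resolution(IntEnum):
    """Hypothetical resolution levels; a lower value means a finer match."""
    ADDRESS_POINT = 1
    STREET_SEGMENT = 2
    LOCALITY = 3
    POSTCODE = 4
    TOWN = 5


@dataclass(frozen=True)
class Geocode:
    latitude: float
    longitude: float
    resolution: Resolution

    def __post_init__(self):
        # Reject coordinates that cannot be a valid spatial representation.
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude out of range")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude out of range")


def finest(results):
    """Pick the candidate geocode with the finest resolution."""
    return min(results, key=lambda g: g.resolution)
```

With this record, two points that look identical in format, such as a postcode centroid and an address point, remain distinguishable, and downstream analysis can filter on `resolution`.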
Feature matching - Algorithm
• The matching algorithms are
• Noninteractive matching algorithms (i.e., they are
automated and the user is not directly involved).
• Interactive matching algorithms
• Classifications of matching algorithms
• Two main categories:
• Deterministic
• Probabilistic
Deterministic matching
• Ease of implementation
• These algorithms are created by
defining a series of rules and a
sequential order in which they
should be applied, for example:
“Match all attributes of the input
address to the corresponding
attributes of the reference feature.”
• Attribute relaxation
• Attribute relaxation is the process of
easing the requirement that all street
address attributes must exactly
match a feature in the reference data
source to obtain a matching street
feature. It is often applied to create
less restrictive rules.
Preferred attribute relaxation order with resulting ambiguity, relative magnitudes of ambiguity and spatial error, and worst-case resolution, passes 1 – 4
Relaxed Attribute | Ambiguity | Relative Exponent and Magnitude of Ambiguity | Relative Magnitude of Spatial Error | Worst-Case Resolution
none | none | (0) none | certainty of address location | single address location
number | multiple houses on single street | (0) # houses on street | length of street | single street
pre | single house on multiple streets | (1) # streets with same name and different pre | bounding area of locations containing same number house on all streets with the same name | USPS ZIP Code
post | (as above) | (1) # streets with same name and different post | (as above) | (as above)
type | (as above) | (1) # streets with same name and different type | (as above) | (as above)
number, pre | multiple houses on multiple streets | (2) # houses on street * # streets with same name and different pre | bounding area of all streets with the same name | (as above)
number, type | (as above) | (2) # houses on street * # streets with same name and different type | (as above) | (as above)
number, post | (as above) | (2) # houses on street * # streets with same name and different post | (as above) | (as above)
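A minimal sketch of deterministic matching with attribute relaxation follows. It tries an exact match first and then relaxes attributes in the order listed in the table, stopping at the first pass that returns candidates. The reference features are invented for illustration (only "3620 S Vermont Ave" appears in this deck), and the attribute names are assumptions.

```python
# Hypothetical reference features; only the first echoes a sample address.
REFERENCE = [
    {"number": 3620, "pre": "S", "name": "VERMONT", "type": "AVE", "post": ""},
    {"number": 3620, "pre": "N", "name": "VERMONT", "type": "AVE", "post": ""},
    {"number": 3700, "pre": "S", "name": "VERMONT", "type": "AVE", "post": ""},
]

ATTRIBUTES = ("number", "pre", "name", "type", "post")

# Each tuple names the attributes ignored in that pass, in table order.
RELAXATION_ORDER = [
    (), ("number",), ("pre",), ("post",), ("type",),
    ("number", "pre"), ("number", "type"), ("number", "post"),
]


def match(query, reference=REFERENCE):
    """Return (relaxed_attributes, candidates) for the first pass that hits."""
    for relaxed in RELAXATION_ORDER:
        compared = [a for a in ATTRIBUTES if a not in relaxed]
        candidates = [
            feature for feature in reference
            if all(feature[a] == query[a] for a in compared)
        ]
        if candidates:
            return relaxed, candidates
    return None, []
```

A query without a prefix direction, such as `{"number": 3620, "pre": "", "name": "VERMONT", "type": "AVE", "post": ""}`, fails the exact pass and the number-relaxed pass, then matches two features once `pre` is relaxed. That is the "single house on multiple streets" ambiguity from the table, and the caller still has to decide how to resolve it.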
Probabilistic matching
• Probabilistic matching has its roots in the
fields of probability and decision theory
• Employed in geocoding processes since the
outset (e.g., O’Reagan and Saalfeld 1987,
Jaro 1989).
• The exact implementation details can be
quite messy and mathematically
complicated, but the concept in general is
quite simple.
Attribute weighting
• Attribute weighting is a form of
probabilistic feature matching in which
probability based values are associated
with each attribute, and either subtract
from or add to the composite score for
the feature as a whole.
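The composite score can be sketched as follows. The weights and the threshold are invented for illustration. In a full probabilistic linkage model they would be estimated from data rather than set by hand, so this is only the scoring skeleton.

```python
# Hypothetical weights: (added on agreement, subtracted on disagreement).
WEIGHTS = {
    "number": (2.0, 1.0),
    "name": (5.0, 4.0),
    "type": (1.0, 0.5),
    "postcode": (3.0, 2.0),
}


def composite_score(query, feature, weights=WEIGHTS):
    """Sum per-attribute contributions into one score for the feature."""
    score = 0.0
    for attribute, (agree, disagree) in weights.items():
        q, f = query.get(attribute), feature.get(attribute)
        if q is None or f is None:
            continue  # missing values neither add nor subtract
        score += agree if q == f else -disagree
    return score


def best_match(query, features, threshold=6.0):
    """Return (feature, score), or (None, score) if below the threshold."""
    if not features:
        return None, 0.0
    scored = [(composite_score(query, f), f) for f in features]
    score, feature = max(scored, key=lambda pair: pair[0])
    return (feature, score) if score >= threshold else (None, score)
```

Under these weights a candidate that agrees on street name and postcode but differs in house number scores 5 + 3 - 1 = 7 and is accepted, while one that agrees on number and postcode but differs in name scores 2 + 3 - 4 = 1 and is rejected.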
String comparison algorithms
• Character-level equivalence
• Essence-level equivalence
• This allows for minor misspellings in the input address to be handled,
returning reference features that “closely match” what the input may have
“intended.”
• Word stemming
• Phonetic algorithms or the Soundex Algorithm
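One common way to approximate "closely match" is edit distance. The sketch below uses Levenshtein distance, which is a substitution on our part since the deck does not name a specific measure. The tolerance of two edits is an arbitrary assumption, and reading "Cornation Road" from the sample addresses as a misspelling of "Coronation Road" is likewise an assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string `a` into string `b`."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def closely_matches(a, b, max_edits=2):
    """Case-insensitive comparison with a small edit tolerance."""
    return levenshtein(a.upper(), b.upper()) <= max_edits
```

Here `closely_matches("Cornation Road", "Coronation Road")` holds because the two strings are one insertion apart, while unrelated street names fall well outside the tolerance.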
Soundex Algorithm
• It has existed since the late 1800s and originally was
used by the U.S. Census Bureau.
• The algorithm is very simple and consists of the
following steps:
• Keep the first letter of the string
• Replace all letters after the first with numbers
based on a known table
• Remove any numbers which are repeated in a row
• Remove all vowels and the letters y, h, and w,
unless they are the first letter
• Return the first four characters, padded on the right
with zeros if there are fewer than four
Original | Porter Stemmed | Soundex
Running Ridge | run ridg | R552 R320
Runs Ridge | run ridg | R520 R320
Hawthorne Street | hawthorn street | H650 S363
Heatherann Street | heatherann street | H650 S363
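A minimal sketch of the widely used American Soundex variant is given below, as an assumption about which variant the table reflects. Vowels separate repeated codes in this variant, which is why "Running" keeps both 5s. The sketch reproduces R552, R520, R320 and S363 from the table. For Hawthorne and Heatherann it yields H365 rather than the H650 listed, which may reflect a different variant, but both names still share one code.

```python
SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(word):
    """Return the four-character Soundex code of a single word."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    previous = SOUNDEX_CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate equal codes
        code = SOUNDEX_CODES.get(ch, "")  # vowels and Y have no code
        if code and code != previous:
            encoded.append(code)
        previous = code  # a vowel resets this, so a repeat is kept
    return (first + "".join(encoded) + "000")[:4]
```

Multi-word street names can be encoded word by word, for example `[soundex(w) for w in "Running Ridge".split()]`.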
Challenges of India Geocoding
Geocoding Challenges
• Unavailability of geospatial data
• Data, if available, are
• Unstructured
• Incomplete
• Inaccurate
• Lacking precision
• Missing official names, or covering regions
that developed because of anthropogenic
pressure
• “General" geocode no longer sufficient; the
"most accurate" geocode required
• Inconsistent use of base maps and geocoding
services within and across programs and
agencies
Data inconsistency
Erroneous data
Frequent data change requests
Coverage
New and modified data in every vintage
Some Samples of real postal deliveries
Classes of geocoding failures with an example of true address “Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India”
Class | Geocoded | Problem | Example
1 | No | Failed to geocode because the input data are incorrect. | Maulsari BnB, 142 Sunder Nagar, New Delhi, 110003, India
2 | No | Failed to geocode because the input data are incomplete. | Maulsari BnB, Sunder Nagar, Near Delhi Public School, New Delhi, 110003, India
3 | No | Failed to geocode because the reference data are incorrect. | 140 Sunder Nagar, New Delhi, Delhi 110003, India
4 | No | Failed to geocode because the reference data are incomplete. | Street segment does not exist in reference data
5 | No | Failed to geocode because the reference data are temporally incompatible. | Street segment name has not been updated in the reference data
6 | No | Failed to geocode because of combination of one or more of 1-5. | Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India where the reference data has not been updated to include Near Delhi Public School, New Delhi, Delhi address segment
7 | Yes | Geocoded to incorrect location because the input data are incorrect. | Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India was (incorrectly) relaxed and matched to Delhi Public School, New Delhi, 110003, India
8 | Yes | Geocoded to incorrect location because the input data are incomplete. | Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India was arbitrarily (incorrectly) assigned to 145 Sunder Nagar, New Delhi, Delhi 110003, India
9 | Yes | Geocoded to incorrect location because the reference data are incorrect. | Sunder Nagar, New Delhi, Delhi 110003, India
10 | Yes | Geocoded to incorrect location because the reference data are incomplete. | Street segment geometry is generalized straight line when the real street is extremely curvy
11 | Yes | Geocoded to incorrect location because of interpolation error. | Interpolation (incorrectly) assumes equal distribution of properties along street segment
12 | Yes | Geocoded to incorrect location because of drop back error. | Drop back placement (incorrectly) assumes a constant distance and direction
13 | Yes | Geocoded to incorrect location because of combination of one or more of 7-12. | The address range for 140-150 is reversed to 150-140 and dropback of length 0 is used
Geocoding Impacts
A good geocoder vs A Bad Geocoder
Data Issues
Steps to address challenges
Overcome the challenge
• Standardization of addresses by governing bodies
• Integrating addressing data models from a variety of addressing systems to
develop region-specific data models
• Utilizing multiple data sources for data completeness
• Inclusion of local landmarks in the geocoding process, as they form an integral part
of the Indian address system
• Embed locational awareness and intelligence within geocoding data
models
• Multilingual phonetic support
• Large test set to address different kinds of geocoder irregularities
• Validation methodology to confirm geocode results
• Changes in map policies for consistent capturing
• Homogeneous information dissemination by governing bodies in regard to
spatial data
• Standard protocols developed for input/reference data correction
• Address normalization
• Address standardization
Features to address challenges
• Sub locality Features
• Inclusion of Local Landmarks
• POI Level Geocoding
• Bank Dictionaries
• Focus on Local Geography and Social Settings
……. Questions?

Mais conteúdo relacionado

Mais procurados

Spatial Data Science with R
Spatial Data Science with RSpatial Data Science with R
Spatial Data Science with Ramsantac
 
E-R Diagram of College Management Systems
E-R Diagram of College Management SystemsE-R Diagram of College Management Systems
E-R Diagram of College Management SystemsOmprakash Chauhan
 
Hospital management system project
Hospital management system projectHospital management system project
Hospital management system projectHimani Chopra
 
Online Vegetable Selling project Presentation
Online Vegetable Selling project PresentationOnline Vegetable Selling project Presentation
Online Vegetable Selling project Presentationmayur patel
 
2.1 project management srs
2.1 project management   srs2.1 project management   srs
2.1 project management srsAnil Kumar
 
What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)Pratik Tambekar
 
Blood Bank Management System
Blood Bank Management SystemBlood Bank Management System
Blood Bank Management SystemChirag N Jain
 
7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location AnalyticsCARTO
 
Project for Student Result System
Project for Student Result SystemProject for Student Result System
Project for Student Result SystemKuMaR AnAnD
 
Blood donation ppt
Blood donation pptBlood donation ppt
Blood donation pptR prasad
 
FINAL REPORT DEC
FINAL REPORT DECFINAL REPORT DEC
FINAL REPORT DECAxis Bank
 
Internal assessment marking system
Internal assessment marking systemInternal assessment marking system
Internal assessment marking systemShreshth Saxena
 
Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...
Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...
Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...CARTO
 
Software design
Software designSoftware design
Software designambitlick
 

Mais procurados (20)

Spatial Data Science with R
Spatial Data Science with RSpatial Data Science with R
Spatial Data Science with R
 
E-R Diagram of College Management Systems
E-R Diagram of College Management SystemsE-R Diagram of College Management Systems
E-R Diagram of College Management Systems
 
Hospital management system project
Hospital management system projectHospital management system project
Hospital management system project
 
Online Vegetable Selling project Presentation
Online Vegetable Selling project PresentationOnline Vegetable Selling project Presentation
Online Vegetable Selling project Presentation
 
Data visualization
Data visualizationData visualization
Data visualization
 
2.1 project management srs
2.1 project management   srs2.1 project management   srs
2.1 project management srs
 
What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)
 
Blood Bank Management System
Blood Bank Management SystemBlood Bank Management System
Blood Bank Management System
 
7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics
 
Project for Student Result System
Project for Student Result SystemProject for Student Result System
Project for Student Result System
 
Spatial Database Systems
Spatial Database SystemsSpatial Database Systems
Spatial Database Systems
 
Blood donation ppt
Blood donation pptBlood donation ppt
Blood donation ppt
 
Tableau Presentation
Tableau PresentationTableau Presentation
Tableau Presentation
 
FINAL REPORT DEC
FINAL REPORT DECFINAL REPORT DEC
FINAL REPORT DEC
 
Data Models
Data ModelsData Models
Data Models
 
MYSQL.ppt
MYSQL.pptMYSQL.ppt
MYSQL.ppt
 
Internal assessment marking system
Internal assessment marking systemInternal assessment marking system
Internal assessment marking system
 
Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...
Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...
Location Intelligence for All: Enabling Individuals to Use Spatial Analysis [...
 
Database anomalies
Database anomaliesDatabase anomalies
Database anomalies
 
Software design
Software designSoftware design
Software design
 

Destaque

Geocoding for beginners
Geocoding for beginnersGeocoding for beginners
Geocoding for beginnersAkansha Mishra
 
Physical street addressing of mombasa city
Physical street addressing of mombasa cityPhysical street addressing of mombasa city
Physical street addressing of mombasa cityMsaTech Mombasa
 
Reputational Due Diligence - The key to strategic risk management
Reputational Due Diligence - The key to strategic risk managementReputational Due Diligence - The key to strategic risk management
Reputational Due Diligence - The key to strategic risk managementJenniferHG
 
Learning & Development and the Performance management
Learning & Development and the Performance managementLearning & Development and the Performance management
Learning & Development and the Performance managementAhmed Shamim
 
Power Hour: 50 Actionable SEO Tips & Tricks
Power Hour: 50 Actionable SEO Tips & TricksPower Hour: 50 Actionable SEO Tips & Tricks
Power Hour: 50 Actionable SEO Tips & TricksConductor
 
A Charitable Life Wellness
A Charitable Life Wellness A Charitable Life Wellness
A Charitable Life Wellness Brian Barden
 
ITIL - IAM (Access Management)
ITIL - IAM (Access Management)ITIL - IAM (Access Management)
ITIL - IAM (Access Management)Josep Bardallo
 
Large Scale Data Processing & Storage
Large Scale Data Processing & StorageLarge Scale Data Processing & Storage
Large Scale Data Processing & StorageIlayaraja P
 
Some of Dr. Nishant Sinha's Research Papers
Some of Dr. Nishant Sinha's Research PapersSome of Dr. Nishant Sinha's Research Papers
Some of Dr. Nishant Sinha's Research PapersNishant Sinha
 
Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...
Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...
Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...CA API Management
 
Secure Your REST API (The Right Way)
Secure Your REST API (The Right Way)Secure Your REST API (The Right Way)
Secure Your REST API (The Right Way)Stormpath
 
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities SAP Portal
 
Secure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the WebSecure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the WebSafeNet
 

Destaque (20)

Geocoding for beginners
Geocoding for beginnersGeocoding for beginners
Geocoding for beginners
 
Physical street addressing of mombasa city
Physical street addressing of mombasa cityPhysical street addressing of mombasa city
Physical street addressing of mombasa city
 
Reputational Due Diligence - The key to strategic risk management
Reputational Due Diligence - The key to strategic risk managementReputational Due Diligence - The key to strategic risk management
Reputational Due Diligence - The key to strategic risk management
 
Learning & Development and the Performance management
Learning & Development and the Performance managementLearning & Development and the Performance management
Learning & Development and the Performance management
 
Power Hour: 50 Actionable SEO Tips & Tricks
Power Hour: 50 Actionable SEO Tips & TricksPower Hour: 50 Actionable SEO Tips & Tricks
Power Hour: 50 Actionable SEO Tips & Tricks
 
A Charitable Life Wellness
A Charitable Life Wellness A Charitable Life Wellness
A Charitable Life Wellness
 
ITIL - IAM (Access Management)
ITIL - IAM (Access Management)ITIL - IAM (Access Management)
ITIL - IAM (Access Management)
 
Large Scale Data Processing & Storage
Large Scale Data Processing & StorageLarge Scale Data Processing & Storage
Large Scale Data Processing & Storage
 
Some of Dr. Nishant Sinha's Research Papers
Some of Dr. Nishant Sinha's Research PapersSome of Dr. Nishant Sinha's Research Papers
Some of Dr. Nishant Sinha's Research Papers
 
Android report
Android reportAndroid report
Android report
 
Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...
Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...
Best Practices You Must Apply to Secure Your APIs - Scott Morrison, SVP & Dis...
 
Campaign planning
Campaign planningCampaign planning
Campaign planning
 
Secure Your REST API (The Right Way)
Secure Your REST API (The Right Way)Secure Your REST API (The Right Way)
Secure Your REST API (The Right Way)
 
ACUTE, SUB ACUTE & CHRONIC TOXICOLOGICAL STUDIES
ACUTE, SUB ACUTE & CHRONIC TOXICOLOGICAL STUDIESACUTE, SUB ACUTE & CHRONIC TOXICOLOGICAL STUDIES
ACUTE, SUB ACUTE & CHRONIC TOXICOLOGICAL STUDIES
 
cathy resume
cathy resumecathy resume
cathy resume
 
Basics of Coding in Pediatrics Medical Billing
Basics of Coding in Pediatrics Medical BillingBasics of Coding in Pediatrics Medical Billing
Basics of Coding in Pediatrics Medical Billing
 
Nt1310 project
Nt1310 projectNt1310 project
Nt1310 project
 
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
 
Secure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the WebSecure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the Web
 
"15 Business Story Ideas to Jump on Now"
"15 Business Story Ideas to Jump on Now""15 Business Story Ideas to Jump on Now"
"15 Business Story Ideas to Jump on Now"
 


The World of Geocoding and Challenges in India

  • 1. The World of Geocoding and challenges in India Dr. Nishant Sinha
  • 2. Discussion • Introduction to Geocoding • All about spatial data and real world data • Addressing system of developing nation • Addressing system in India • Data sources and standardization • Data arrangement • Geocoding • Challenges • Steps to overcome challenges • Q & A
  • 4. What is Geocoding? Geocoding is the process of transforming a description of a location (such as a pair of coordinates, an address, or a name of a place) to a location on the earth's surface.
      Source | Definition | Possible Problems
      Environmental Systems Research Institute (1999) | The process of matching tabular data that contains location information such as street addresses with real-world coordinates. | Limited to coordinate output only.
      Harvard University (2008) | The assignment of a numeric code to a geographical location. | Limited to numeric code output only.
      Statistics Canada (2008) | The process of assigning geographic identifiers (codes) to map features and data records. | Limited input range.
      U.S. Environmental Protection Agency (2008) | The process of assigning latitude and longitude to a point, based on street addresses, city, state and USPS ZIP Code. | Limited to coordinate output only.
  • 5. What is Geocoding ? Geocoding (verb) is the act of transforming aspatial locationally descriptive text into a valid spatial representation using a predefined process. A geocoder (noun) is a set of inter-related components in the form of operations, algorithms, and data sources that work together to produce a spatial representation for descriptive locational references. A geocode (noun) is a spatial representation of a descriptive locational reference. To geocode (verb) is to perform the process of geocoding.
  • 6. Geocoding ……… Some points to ponder……. • Does geocoding refer to a specific computational process of transforming something into something else, or simply the concept of a transformation? • Is a geocode a real-world object, simply an attribute of something else, or the process itself? • Is a geocoder the computer program that performs calculations, a single component of the process, or the human who makes the decisions?
  • 8. What is there in Spatial data…. Location information • The fundamental primitive is the point, a 0-dimensional (0-D) object that has a position in space but no length. • A line is a 1-D geographic object having a length and is composed of two or more 0-D point objects. • A polygon is a geographic object bounded by at least three 1-D line objects or segments with the requirement that they must start and end at the same location (i.e., node)
  • 9. Spatial Data • Spatial data are often referred to as layers • Layers represent features on, above, or below the surface of the earth • Data layers are of two major types • Vector data represent features as discrete points, lines, and polygons. • Raster data represent the landscape as a rectangular matrix of square cells. [Figure: The Real World, Vector Data, Raster Data]
  • 10. Spatial Data • Vector • TAB files • Shapefiles • CAD (AutoCAD DXF & DWG) • Raster • Grids • Images • Digital Elevation Models (DEMs)
  • 11. Geocoding and Spatial Data • Most geocoding applications work with vector-based GIS data. • The key tasks from a geocoding perspective are to: • Determine and record the locations of these objects on the surface of the Earth, and • Calculate distance, because many geocoding algorithms rely on one or more forms of linear interpolation.
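
The linear interpolation mentioned on slide 11 can be sketched as follows. This is a minimal illustration, not the method of any particular geocoder: the function name, the planar (x, y) coordinates, and the midpoint fallback for a single-number range are all assumptions. It also bakes in the assumption that properties are evenly spaced along the segment, which slide 41 lists as a source of interpolation error.

```python
def interpolate_address(number, range_start, range_end, start_xy, end_xy):
    """Place a house number proportionally along a straight street segment.

    Hypothetical helper: assumes planar coordinates and evenly spaced houses.
    """
    if range_start == range_end:
        # Single-number address range: fall back to the midpoint (assumption).
        fraction = 0.5
    else:
        fraction = (number - range_start) / (range_end - range_start)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("house number outside the segment's address range")
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return (x, y)


# House 150 on a segment whose address range runs from 100 to 200.
print(interpolate_address(150, 100, 200, (0.0, 0.0), (10.0, 20.0)))
```

Because the fraction is computed from the range endpoints, a reversed range (200 down to 100) is handled the same way as an ascending one.
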
  • 12. Addressing system of developing nation
  • 13. Addressing System • Addresses are one of the fundamental means by which people conceptualize location in the modern world • Addresses are of 2 Types • Relative input data • Examples of these types of data include “Across the street from Togo’s” and “The northeast corner of Vermont Avenue and 36th Place.” • Absolute input data • Example House no-xxxx, ABC Street, JJJJJ Locality, XYZ City, QQ State, DEF Country, ###### Postal Code
  • 14. Addresses • Come in a variety of formats • Address Components • Address (house or building) number • Prefix direction • Prefix type • Street name • Street type • Suffix direction • Zone
  • 15. What is street addressing? • Street addressing is an exercise that makes it possible to identify the location of a plot or dwelling on the ground, that is, to “assign an address” using a system of maps and signs that give the numbers or names of streets and buildings • Street addressing provides an opportunity to • Create a map of the city that can be used by different municipal units • Conduct a systematic survey that collects a significant amount of information about the city and its population, and • Set up a database on the built environment
  • 16. History of street addressing across some countries • Before 1728, no street names were indicated in Paris, except in very rare cases, such as “Rue Saint-Dominique, formerly des Vaches” (1643) • In Belgrade, changes to street names are frequent • In China, place names often refer to geographical direction: the name China itself (country of the center) reflects this principle, as do the names of the capitals Beijing (capital of the North) and Nanjing (capital of the South). Names of the provinces are also strongly influenced by these references to geographical direction • Aside from a few main thoroughfares, streets in Japan do not have names. Instead, the city districts (ku) are divided into neighborhoods (chome) that group together several dozen houses and thus form a block.
  • 17. Street Addressing System: some common practices • Sequential alternating numbering systems • Decametric numbering system • Codification of intersections • Combined addressing system
      Segment | Distance (m) | No. left side | No. right side
      1 | 0–10 | 1 | 2
      2 | 10–20 | 3 | 4
      3 | 20–30 | 5 | 6
      4 | 30–40 | 7 | 8
      [Figures: Sequential alternating numbering systems; Codification of intersections; Combined addressing system]
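
The segment table on slide 17 appears to pair each 10 m stretch of street with one odd number on the left side and one even number on the right. A small sketch of that rule follows; the function name is hypothetical, and treating each segment as a half-open interval (so that exactly 10 m falls in segment 2) is an assumption the slide does not settle.

```python
def segment_numbers(distance_m, segment_length_m=10):
    """Return (segment, left_number, right_number) for a distance along a street.

    Assumes half-open segments [0, 10), [10, 20), ... as read off the slide's table.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    segment = int(distance_m // segment_length_m) + 1
    left_number = 2 * segment - 1   # odd numbers on the left side
    right_number = 2 * segment      # even numbers on the right side
    return segment, left_number, right_number


# A plot 25 m from the start of the street lies in segment 3.
print(segment_numbers(25))
```
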
  • 19. Addressing System in India • Spatial data capturing based on demand rather than homogeneous capture • Streets with no names or unstructured addresses • Absence of consistent and accurate dataset throughout the area being geocoded • Presence of slum-like areas that change frequently and are not street addressable • Non-existence of reference datasets or GIS data infrastructure • Lack of hierarchical data structure beyond tehsil • Absence of standardized geocoding algorithm • Non-existence of approach to validate assumptions made in the geocoding algorithm
  • 20. Sample address varieties of India
      Address with person name:
        c/o Yashwant S.Prabhu , 318, C - Wing, Suyog Co.Housing Society Ltd, T. P.S. Road & III Link Road, Vazira, Borivali, West Mumbai, Maharashtra, 400092
        c/o Late Esmail Bagani, Y/2/122, Satghara Road, PO- Badartala, PS – Nadial, Kolkata, West Bengal, 700044
      Address with building names:
        White C/403, Aamrpali Appt, opp. GHB complex, Ankur Road, Ahmedabad, Gujarat, 380013
        13/9, Daksha Bldg, Vallabh Baug Lane, Ghatkopar, Mumbai, Maharashtra, 400077
      Address with house no.:
        299/15, Padmavati Vikar Mandal, Shahibaug, Ahmedabad, Gujarat, 380001
        NO88, Srinivasa Nagar, 2NS Main Road, Kolathur, Chennai, Tamil Nadu 600099
      Address with street name:
        1304, Cornation Road, Bargarpet, Kolar, Bangalore, Karnataka, 560000
        20K, Dhakuria Station Road, Dhakuria PS: Jadavpur, Kolkata, West Bengal, 700031
      Address with POI:
        BMC Software, Next Muttha Chamber, Senapati Bapat Road, Pune, Maharashtra, 411016
        Life Style International Pvt. Ltd., Near Payal Cinema Complex, Gurgaon, Haryana, 122001
  • 21. India Dynamics • 35 States & UTs, ~650 districts, ~6,000 tehsils/towns, more than 30k postcodes • 21 regional and two official (Hindi & English) languages • 3.2 million sq km area, 1.2 billion population • Administrative hierarchy • State >> District >> Tehsil/Town >> Ward >> Locality/Village >> Sub locality >> Block/Pocket • Addressing pattern • Near Govt. Hopital, Zirapur , District Rajgarh (MP) • 351 Ground Floor , Shakti khand 3 , Near St. Teresa School , Indirapuram, Ghaziabad 201010 U.P. • 9 Mansarovar Colony Opp. 3/686, Kala kuan Housing Board Alwar 301001 • 9/19/98/19-D Flat No. # 303 Hitech City Madhapur Hyderabad • 176 Devi Nagar New Sanganer Road Sodala , Jaipur Rajasthan [Figure: Indian Postal Code]
  • 22. The History of Postal System in India • Long colonial realm – British, Mughal, Portuguese……. • Britain's involvement in the postal services of India began in the eighteenth century • Warren Hastings (Governor General of British India from 1773-1784) opened the posts to the public in March 1774 • Main purpose of the postal system had been to serve the commercial interests of the East India Company and to serve Govt. orders
  • 23. The History of Postal System in India
  • 24. The History of Postal System in India
  • 25. Data sources and standardization
  • 26. India Geocoding Data
      Streets: NH • SH • Local roads
      Place of Interest: Banks • Retails • Hospital • Other landmarks
      Administrative: State • District • Town • Locality • Sub locality
      Geography: Block • Locality • Town • Postcode
      Address Point: House no • Building name
  • 27. Reference datasets • The reference dataset is the underlying geographic database containing geographic features that the geocoder can use to generate a geographic output. • This dataset stores all of the information the geocoder knows about the world and provides the base data from which the geocoder calculates, derives, or obtains geocodes. [Figure label: Interpolation algorithms]
      Type | Example
      Vector line file | U.S. Census Bureau’s TIGER/Line (United States Census Bureau 2008c)
      Vector polygon file | Los Angeles (LA) County Assessor Parcel Data (Los Angeles County Assessor 2008)
      Vector point file | Australian Geocoded National Address File (G-NAF) (Paull 2003)
  • 28. Reference dataset types • Linear-Based Reference Datasets • Roads, Ferries • Polygon-Based Reference Datasets • Administrative Boundaries, Postal Codes • Point-Based Reference Datasets • POIs
      Source | Description | Coverage | Cost
      Tele Atlas (2008c), NAVTEQ (2008) | Building footprints, parcel footprints | Worldwide, but sparse | Expensive
      County or municipal Assessors | Building footprints, parcel footprints | U.S., but sparse | Relatively inexpensive but varies
      U.S. Census Bureau | Census Block Groups, Census Tracts, ZCTA, MCD, MSA, Counties, States | U.S. | Free

      Name | Description | Coverage
      U.S. Census Bureau’s TIGER/Line files (United States Census Bureau 2008c) | Street centerlines | U.S.
      NAVTEQ Streets (NAVTEQ 2008) | Street centerlines | Worldwide
      Tele Atlas Dynamap, MultiNet (Tele Atlas 2008a, c) | Street centerlines | Worldwide

      Supplier | Product | Description | Coverage
      Government | GeoNames (United States National Geospatial-Intelligence Agency 2008) | Gazetteer of geographic features | World, excepting U.S.
      Academia | Alexandria Digital Library (2008) | Gazetteer of geographic features | World
  • 29. The geocoding algorithm • The geocoding algorithm performs two basic tasks: • Feature matching • Feature interpolation
  • 30. Input data processing • Address normalization • Address normalization organizes and cleans input data to increase its efficiency for use and sharing. • Address standardization • Address standardization converts an address from one normalized format into another. It is closely linked to normalization and is heavily influenced by the performance of the normalization process.
      Sample address variants:
        3620 South Vermont Avenue, Unit 444, Los Angeles, CA 90089-0255
        3620 S Vermont Ave, #444, Los Angeles, CA 90089-0255
        3620 S Vermont Ave, 444, Los Angeles, 90089-0255
        3620 Vermont, Los Angeles, CA 90089
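
As a rough illustration of the normalization step on slide 30, the sketch below rewrites the first sample address into the second. The abbreviation tables, the unit pattern, and the function name are hypothetical and deliberately tiny; a real geocoder would rely on a much larger, region-specific dictionary. Abbreviations are applied only to the first comma-separated part (the street line), so that names such as "West Bengal" later in an address are left alone.

```python
import re

# Hypothetical abbreviation tables, for illustration only.
DIRECTIONS = {"north": "N", "south": "S", "east": "E", "west": "W"}
STREET_TYPES = {"avenue": "Ave", "street": "St", "road": "Rd", "boulevard": "Blvd"}
UNIT_PATTERN = re.compile(r"^(?:unit|apt|suite)\s+(\w+)$", re.IGNORECASE)


def normalize_address(raw):
    """Abbreviate directions and street types in the street line; rewrite 'Unit N' as '#N'."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    normalized = []
    for index, part in enumerate(parts):
        unit = UNIT_PATTERN.match(part)
        if unit:
            normalized.append("#" + unit.group(1))
            continue
        if index == 0:
            tokens = []
            for token in part.split():
                key = token.lower().rstrip(".")
                tokens.append(DIRECTIONS.get(key) or STREET_TYPES.get(key) or token)
            part = " ".join(tokens)
        normalized.append(part)
    return ", ".join(normalized)


print(normalize_address("3620 South Vermont Avenue, Unit 444, Los Angeles, CA 90089-0255"))
```
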
  • 31. Output data • The last component of the geocoder is the actual output data, which are the valid spatial representations derived from features in the reference dataset. • Data can have many different forms and formats, but each must contain some type of valid spatial attribute. • The most common format of output is points described with geographic coordinates (latitude, longitude). • Alternate forms can include multi-point representations such as polylines or polygons. • These geocoder outputs, while in the same format and produced through the same process, do not represent data at the same geographic resolution and must be differentiated.
  • 32. Feature matching - Algorithm • The matching algorithms are • Noninteractive matching algorithms (i.e., they are automated and the user is not directly involved). • Interactive matching algorithms • Classifications of matching algorithms • Two main categories: • Deterministic • Probabilistic
  • 33. Deterministic matching • Ease of implementation • These algorithms are created by defining a series of rules and a sequential order in which they should be applied, for example: “Match all attributes of the input address to the corresponding attributes of the reference feature.” • Attribute relaxation • Attribute relaxation is the process of easing the requirement that all street address attributes must exactly match a feature in the reference data source to obtain a matching street feature; it is often applied to create these less restrictive rules.
      Preferred attribute relaxation order with resulting ambiguity, relative magnitudes of ambiguity and spatial error, and worst-case resolution, passes 1–4 (a " mark repeats the cell above, where the slide's table appears to use a merged cell)
      Relaxed Attribute | Ambiguity | Relative Exponent and Magnitude of Ambiguity | Relative Magnitude of Spatial Error | Worst-Case Resolution
      none | none | (0) none | certainty of address location | single address location
      number | multiple houses on single street | (0) # houses on street | length of street | single street
      pre | single house on multiple streets | (1) # streets with same name and different pre | bounding area of locations containing same number house on all streets with the same name | USPS ZIP Code
      post | " | (1) # streets with same name and different post | " | "
      type | " | (1) # streets with same name and different type | " | "
      number, pre | multiple houses on multiple streets | (2) # houses on street * # streets with same name and different pre | bounding area of all streets with the same name | "
      number, type | " | (2) # houses on street * # streets with same name and different type | " | "
      number, post | " | (2) # houses on street * # streets with same name and different post | " | "
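
A minimal sketch of deterministic matching with attribute relaxation, as described on slide 33: each pass drops the exact-match requirement for one more set of attributes and stops at the first pass that yields candidates. The attribute names, the pass order (read loosely from the slide's table), and the dictionary-based reference records are assumptions for illustration; real reference data would usually match house numbers against address ranges rather than single values.

```python
ATTRIBUTES = ("number", "pre", "name", "type", "post")

# Assumed relaxation order: nothing relaxed first, then single attributes, then pairs.
RELAXATION_PASSES = [
    (),
    ("number",),
    ("pre",),
    ("post",),
    ("type",),
    ("number", "pre"),
    ("number", "type"),
    ("number", "post"),
]


def match_with_relaxation(address, reference):
    """Return (candidates, relaxed_attributes) from the first pass that matches anything."""
    for relaxed in RELAXATION_PASSES:
        required = [attr for attr in ATTRIBUTES if attr not in relaxed]
        candidates = [
            feature for feature in reference
            if all(feature.get(attr) == address.get(attr) for attr in required)
        ]
        if candidates:
            return candidates, relaxed
    return [], None


reference = [
    {"number": "3620", "pre": "S", "name": "Vermont", "type": "Ave", "post": ""},
    {"number": "3620", "pre": "N", "name": "Vermont", "type": "Ave", "post": ""},
]
# The input lacks a prefix direction, so the match succeeds only once "pre" is relaxed,
# and it is ambiguous: the same house number exists on two streets.
query = {"number": "3620", "pre": "", "name": "Vermont", "type": "Ave", "post": ""}
candidates, relaxed = match_with_relaxation(query, reference)
print(len(candidates), relaxed)
```

Returning every candidate together with the relaxed attributes lets the caller see how much ambiguity a given pass introduced, rather than silently picking one feature.
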
  • 34. Probabilistic matching • Probabilistic matching has its roots in the fields of probability and decision theory • Employed in geocoding processes since the outset (e.g., O’Reagan and Saalfeld 1987, Jaro 1989). • The exact implementation details can be quite messy and mathematically complicated, but the concept in general is quite simple.
  • 35. Attribute weighting • Attribute weighting is a form of probabilistic feature matching in which probability based values are associated with each attribute, and either subtract from or add to the composite score for the feature as a whole.
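
The scoring idea on slide 35 can be sketched as below. The weights are hypothetical integers chosen only to make the arithmetic easy to follow; in a probabilistic matcher they would be derived from how discriminating each attribute is. A matching attribute adds its weight to the composite score, a conflicting one subtracts it, and an attribute missing on either side contributes nothing (this last rule is an assumption).

```python
# Hypothetical weights, for illustration only.
WEIGHTS = {"number": 20, "pre": 10, "name": 40, "type": 10, "zip": 20}


def composite_score(address, feature, weights=WEIGHTS):
    """Sum the weights of agreeing attributes and subtract those of conflicting ones."""
    score = 0
    for attribute, weight in weights.items():
        given = address.get(attribute)
        known = feature.get(attribute)
        if given is None or known is None:
            continue  # missing on either side: neither rewards nor penalizes
        score += weight if given == known else -weight
    return score


feature = {"number": "3620", "pre": "S", "name": "Vermont", "type": "Ave", "zip": "90089"}
query = {"number": "3620", "pre": "N", "name": "Vermont", "type": "Ave", "zip": "90089"}
print(composite_score(query, feature))  # the conflicting prefix lowers the score
```
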
  • 36. String comparison algorithms • Character-level equivalence • Essence-level equivalence • This allows minor misspellings in the input address to be handled, returning reference features that “closely match” what the input may have “intended.” • Word stemming • Phonetic algorithms such as the Soundex algorithm
  • 37. Soundex Algorithm • It has existed since the late 1800s and originally was used by the U.S. Census Bureau. • The algorithm is very simple and consists of the following steps: • Keep the first letter of the string • Remove all vowels and the letters y, h, and w, unless they are the first letter • Replace all letters after the first with numbers based on a known table • Remove any numbers which are repeated in a row • Return the first four characters, padded on the right with zeros if there are fewer than four
      Original | Porter Stemmed | Soundex
      Running Ridge | run ridg | R552 R320
      Runs Ridge | run ridg | R520 R320
      Hawthorne Street | hawthorn street | H365 S363
      Heatherann Street | heatherann street | H365 S363
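
A compact sketch of Soundex follows, assuming the standard American variant rather than a literal reading of the steps on slide 37. The difference matters for the slide's own example: in the standard variant, letters with the same code are merged only when adjacent or separated by h or w, so a vowel between them keeps both codes. That is what yields R552 for "Running"; removing all vowels before collapsing repeats would give R520 instead. The letter-to-digit table is the conventional one.

```python
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character Soundex code of a single word (standard American variant)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0])
    for letter in letters[1:]:
        if letter in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(letter)  # None for vowels and y
        if code is not None and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code after it is kept
    return (result + "000")[:4]


print([soundex(w) for w in ("Running", "Runs", "Ridge", "Street")])
```

Applied per word, this maps both "Running Ridge" and "Runs Ridge" to codes that share the second word (R320) but differ in the first (R552 versus R520), which is why the slide pairs Soundex with word stemming.
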
  • 38. Challenges of India Geocoding
  • 39. Geocoding Challenges • Unavailability of geospatial data • Data, if available, are often • Unstructured • Incomplete • Inaccurate • Lacking in precision • Without official names, or for regions that developed because of anthropogenic pressure • A “general” geocode is no longer sufficient; the “most accurate” geocode is required • Inconsistent use of base maps and geocoding services within and across programs and agencies [Figure labels: Data inconsistency; Erroneous data; Frequent data change requests; Coverage; New and modified data in every vintage]
  • 40. Some Samples of real postal deliveries
  • 41. Classes of geocoding failures, with an example of the true address “Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India”
      Class | Geocoded | Problem | Example
      1 | No | Failed to geocode because the input data are incorrect. | Maulsari BnB, 142 Sunder Nagar, New Delhi, 110003, India
      2 | No | Failed to geocode because the input data are incomplete. | Maulsari BnB, Sunder Nagar, Near Delhi Public School, New Delhi, 110003, India
      3 | No | Failed to geocode because the reference data are incorrect. | 140 Sunder Nagar, New Delhi, Delhi 110003, India
      4 | No | Failed to geocode because the reference data are incomplete. | Street segment does not exist in reference data
      5 | No | Failed to geocode because the reference data are temporally incompatible. | Street segment name has not been updated in the reference data
      6 | No | Failed to geocode because of combination of one or more of 1-5. | Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India where the reference data has not been updated to include Near Delhi Public School, New Delhi, Delhi address segment
      7 | Yes | Geocoded to incorrect location because the input data are incorrect. | Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India was (incorrectly) relaxed and matched to Delhi Public School, New Delhi, 110003, India
      8 | Yes | Geocoded to incorrect location because the input data are incomplete. | Maulsari BnB, 142 Sunder Nagar, Near Delhi Public School, New Delhi, Delhi 110003, India was arbitrarily (incorrectly) assigned to 145 Sunder Nagar, New Delhi, Delhi 110003, India
      9 | Yes | Geocoded to incorrect location because the reference data are incorrect. | Sunder Nagar, New Delhi, Delhi 110003, India
      10 | Yes | Geocoded to incorrect location because the reference data are incomplete. | Street segment geometry is generalized straight line when the real street is extremely curvy
      11 | Yes | Geocoded to incorrect location because of interpolation error. | Interpolation (incorrectly) assumes equal distribution of properties along street segment
      12 | Yes | Geocoded to incorrect location because of drop back error. | Drop back placement (incorrectly) assumes a constant distance and direction
      13 | Yes | Geocoded to incorrect location because of combination of one or more of 7-12. | The address range for 140-150 is reversed to 150-140 and dropback of length 0 is used
  • 44. Steps to address challenges
  • 45. Overcome the challenge • Standardization of addresses by governing bodies • Integrating addressing data models from a variety of addressing systems to develop region-specific data models • Utilizing multiple data sources for data completeness • Inclusion of local landmarks in the geocoding process, as they form an integral part of the Indian address system • Embed locational awareness and intelligence within geocoding data models • Multilingual phonetic support • Large test set to address different kinds of geocoder irregularities • Validation methodology to confirm geocode results • Changes in map policies for consistent capturing • Homogeneous information dissemination by governing bodies with regard to spatial data • Standard protocols developed for input/reference data correction • Address normalization • Address standardization
  • 46. Features to address challenges • Sub locality Features • Inclusion of Local Landmarks • POI Level Geocoding • Bank Dictionaries • Focus on Local Geography and Social Settings
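
Classes 11 to 13 in the failure table of slide 41 concern address-range interpolation and drop-back placement. The following is a minimal Python sketch of both steps, not the method of any particular geocoder. It assumes a hypothetical straight-line `Segment` in planar coordinates whose endpoints carry the low and high house numbers; the class name, the function name `interpolate`, and all values are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical street segment: a straight line with an address range."""
    x1: float
    y1: float
    x2: float
    y2: float
    low: int   # house number assumed to sit at (x1, y1)
    high: int  # house number assumed to sit at (x2, y2)


def interpolate(seg, house_number, dropback=0.0):
    """Place a house number along the segment, then offset it sideways.

    Assumes houses are evenly spaced along the segment (the assumption
    behind class 11) and that every house sits at the same perpendicular
    distance on the right-hand side (the assumption behind class 12).
    """
    if not min(seg.low, seg.high) <= house_number <= max(seg.low, seg.high):
        raise ValueError("house number outside the segment's address range")

    span = seg.high - seg.low
    # Fraction of the way from (x1, y1) to (x2, y2).
    t = 0.5 if span == 0 else (house_number - seg.low) / span
    x = seg.x1 + t * (seg.x2 - seg.x1)
    y = seg.y1 + t * (seg.y2 - seg.y1)

    # Drop-back: constant offset perpendicular to the segment,
    # to the right of the direction of travel.
    dx, dy = seg.x2 - seg.x1, seg.y2 - seg.y1
    length = math.hypot(dx, dy)
    if length > 0 and dropback:
        x += dropback * dy / length
        y -= dropback * dx / length
    return x, y


if __name__ == "__main__":
    correct = Segment(0.0, 0.0, 100.0, 0.0, low=140, high=150)
    reversed_range = Segment(0.0, 0.0, 100.0, 0.0, low=150, high=140)
    print(interpolate(correct, 142, dropback=5.0))
    print(interpolate(reversed_range, 142))
```

With the range stored as 140-150, house 142 lands one fifth of the way along the segment; with the range reversed to 150-140 it lands four fifths of the way along, which is the kind of silent misplacement that class 13 describes.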

Editor's Notes

  1. Relative input data are textual location descriptions which, by themselves, are not sufficient to produce an output geographic location. These produce relative geocodes that are geographic locations relative to some other reference geographic locations. Examples of these types of data include “Across the street from Togo’s” and “The northeast corner of Vermont Avenue and 36th Place.” Absolute input data are textual location descriptions which, by themselves, are sufficient to produce an output geographic location. These input data produce an absolute geocode in the form of an absolute known location or an offset from an absolute known location. Example: House no-xxxx, ABC Street, JJJJJ Locality, XYZ City, QQ State, DEF Country, ###### Postal Code.
  2. LT: once a year in Tab format. TT: quarterly in Shape, OSL, and some other formats.
  3. Linear-Based Reference Datasets: A linear-based (line-based) reference dataset is composed of linear-based data, which can be either simple-line or polyline vectors. The type of line vector contained typically can be used as a first-order estimate of the descriptive quality of the reference data source. Polygon-Based Reference Datasets: A polygon-based reference dataset is composed of polygon-based data. These datasets are interesting because they can represent both the most accurate and the most inaccurate forms of reference data. Point-Based Reference Datasets: A point-based reference dataset is composed of point-based data. These are the least commonly encountered, partly because of their usability and partly because of the wide ranges in cost and accuracy.
  4. Feature matching: the process of identifying a geographic feature in the reference dataset corresponding to the input data, to be used to derive the final geocode output for an input. A feature-matching algorithm is an implementation of a particular form of feature matching. These algorithms are highly dependent on both the type of reference dataset utilized and the attributes it maintains about its geographic features. Feature interpolation: the process of deriving a geographic output from a reference feature selected by feature matching. A feature interpolation algorithm is an implementation of a particular form of feature interpolation. These algorithms are also highly dependent on the reference dataset in terms of the type of data it contains and the attributes it maintains about these features.
  5. Matching algorithms are either noninteractive or interactive. Noninteractive matching algorithms are automated, and the user is not directly involved. Interactive matching algorithms involve the user in making choices when the algorithm fails to produce an exact match, by having the user either correct/refine the input data or make a subjective, informed decision between two equally likely options. Classifications of matching algorithms: there are two main categories. Deterministic: a deterministic matching method is based on a series of rules that are processed in a specific sequence. These can be thought of as binary operations; a feature is either matched or it is not. Probabilistic: a probabilistic matching method uses a computational scheme to determine the likelihood, or probability, that a feature matches, and returns this value for each feature in the reference set.
  6. The main benefit of deterministic matching is the ease of implementation. These algorithms are created by defining a series of rules and a sequential order in which they should be applied. The simplest possible matching rule is the following: “Match all attributes of the input address to the corresponding attributes of the reference feature.” Attribute relaxation: the process of easing the requirement that all street address attributes must exactly match a feature in the reference data source to obtain a matching street feature. It is often applied to create less restrictive rules than the simplest one.
  7. Any feature-matching algorithm requires the comparison of strings of character data to determine matches and non-matches. Character-level equivalence enforces that each character of two strings must be exactly the same. Essence-level equivalence uses metrics capable of determining if two strings are “essentially” the same. This allows for minor misspellings in the input address to be handled, returning reference features that “closely match” what the input may have “intended.” Word stemming is the simplest version of an essence-level equivalence technique. These algorithms reduce a word to its root (stem), which is then used for essence-level equivalence testing. The Porter Stemmer (Porter 1980) is the most famous of these. It starts by removing common suffixes (e.g., “-ed,” “-ing”) and additionally applies more complex rules for specific substitutions, such as “-sses” being replaced with “-ss.” Phonetic algorithms, such as the Soundex algorithm, provide an alternative method for encoding the essence of a word. These algorithms enable essence-level equivalence testing by representing a word in terms of how it sounds when it is pronounced (i.e., phonetically). The goal of these types of algorithms is to produce common representations for words that are spelled differently, yet sound the same.
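
Notes 5 to 7 describe deterministic matching, attribute relaxation, and phonetic (essence-level) comparison. The sketch below combines them in Python as one possible illustration, not as the approach of any specific geocoder. It uses the standard American Soundex coding; the reference records, the attribute names (`number`, `locality`, `postcode`), the rule names, and the function `match` are all hypothetical and invented for the example.

```python
# Standard American Soundex digit groups.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # vowels reset prev, so repeats across a vowel count
    return (encoded + "000")[:4]


def same_text(a, b):
    """Character-level equivalence, ignoring case and outer whitespace."""
    return a.strip().lower() == b.strip().lower()


def same_sound(a, b):
    """Essence-level equivalence: word-by-word Soundex comparison."""
    return [soundex(w) for w in a.split()] == [soundex(w) for w in b.split()]


# Illustrative reference records only; not real reference data.
REFERENCE = [
    {"number": "142", "locality": "Sunder Nagar", "postcode": "110003"},
    {"number": "7", "locality": "Lodhi Colony", "postcode": "110003"},
]

# Rules are tried in order, each one more relaxed than the last.
RULES = [
    ("exact", {"number": same_text, "locality": same_text,
               "postcode": same_text}),
    ("phonetic locality", {"number": same_text, "locality": same_sound,
                           "postcode": same_text}),
    ("locality only", {"locality": same_sound, "postcode": same_text}),
]


def match(address, reference=REFERENCE):
    """Return (feature, rule_name) for the first rule with a unique hit."""
    for name, tests in RULES:
        hits = [f for f in reference
                if all(test(address[k], f[k]) for k, test in tests.items())]
        if len(hits) == 1:
            return hits[0], name
    return None, None


if __name__ == "__main__":
    query = {"number": "142", "locality": "Sundar Nagar",
             "postcode": "110003"}
    print(match(query))
```

The misspelled “Sundar Nagar” fails the exact rule but matches under the phonetic rule. The last rule drops the house number entirely, so an address such as 999 Sunder Nagar would still be matched to the 142 feature; this is the over-relaxation risk listed as class 7 in slide 41, and it is why the rule that produced a match is returned alongside the feature.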