FME Data Transformation for the Geographic Support System Initiative
1. FME Data Transformation for
the Geographic Support
System Initiative
Jay E. Spurlin
Software Architect and Development Manager for the
GSS-I Feature Source Evaluation software system
April 8, 2013
2. U.S. Census Bureau
• The Census Bureau serves as the leading
source of quality data about the nation's
people and economy. We honor privacy,
protect confidentiality, share our expertise
globally, and conduct our work openly. We
are guided on this mission by our strong and
capable workforce, our readiness to innovate,
and our abiding commitment to our
customers.
3. Geography Division
• The Geography Division plans, coordinates, and
administers all geographic and cartographic activities
needed to facilitate the Census Bureau's statistical
programs throughout the US and its territories. We
manage the Census Bureau's programs to
continuously update features, boundaries and
geographic entities in TIGER and the Master Address
File (MAF). We also conduct research into geographic
concepts, methods, and standards needed to facilitate
the Census Bureau's data collection and dissemination
programs.
4. GSS-I
• In support of the 2020 Decennial Census, the Census Bureau
is evaluating what areas should be targeted for a traditional,
on-the-ground address canvassing operation and in which
areas a traditional canvassing operation is not necessary.
• The task the Census Bureau is undertaking is determining
how to decide which areas should be considered for targeting
– GEO has evaluated the MAF/TIGER database and assigned
quality indicators to each of the census tracts
– A Targeted Address Canvassing strategy has been developed
that contains an inventory of criteria for evaluation
5. GSS-I
• The Geographic Partnership program is now underway.
– GEO is receiving both address and spatial data from invited partners
• This data is at the state, county, and local level.
• The data is being evaluated and integrated with the MAF/TIGER database.
• The next step is to determine what level of feedback we can give to the partners
about their data.
• GEO is also working with statisticians on predictive modeling to help
determine where to target.
• The combination of the evaluation of the current MAF/TIGER
database, the partner data, and the predictive modeling will
contribute to the recommendation on which areas of the country
should be considered for targeting.
6. The Geographic Partnership Program
• A partner provides a set of source files
• The source files are moved inside the Census firewall via a secure web-exchange module
• The content inventory of the files undergoes initial verification
• The files are preserved, as supplied, for later reference
• A more detailed content assessment is done, including verification the files meet the
minimum guidelines for content and metadata
• The files are prepared for automated processing, including re-projection and mapping to a
standardized schema
• A series of (mostly) automated checks is run, which provides metrics about the data in the
files
• An interactive review is conducted, in which the files and their associated metrics are
reviewed and a decision is made how to capture any new data
• Any data that are not useful for updating the MAF/TIGER database get removed from the
files
• Features or addresses are added or modified, using an automated conflate and review
process – or – an interactive update process
7. Feature Source Evaluation Software
• A number of MAF/TIGER spatial layers will be extracted for the extent of the partner
entity
• An analyst will use the supplied data and metadata to map the provided source
schema to a standardized schema, and the supplied road centerline file will be
converted to an ArcSDE layer, re-projected, and the name and MTFCC mappings
applied
• The feature names in the source file will be standardized to the parsed, MAF/TIGER
naming conventions
• The standardized feature names will be checked to see if any contain illegal
characters or prohibited or generic names
• A topological check will be run, to gauge the topological stability of the source file
• A completeness / change detection check will be run to attempt to identify areas in
the source file that contain features not found in MAF/TIGER
• A comparison will be run between the universe of feature names in the source file
and the universe of feature names found in MAF/TIGER within the extent of the entity
• All intersections that meet the requirements for CE95 assessment will be identified
8. Previous FME Technology Architecture
• FME Workspaces were developed using FME Workbench 2012 on
desktop workstations, running 32-bit Windows XP Service Pack 3
• FME Server 2012 (FME Engine only), on batch servers running
Linux Redhat Enterprise 5 connected to a SAN (Storage Area
Network)
• Linux Batch Server
– Cronacle job-queueing system
– Perl and shell scripts
– FME Server (command-line invocation of FME Engines)
– Oracle Run-Time Client
• MAF/TIGER (Oracle Database)
• Shapefiles on SAN
9. New FME Technology Architecture
• FME Workspaces are developed using FME Workbench 2012 SP3 on
desktop workstations, running 32-bit Windows XP Service Pack 3
• FME Server 2012 SP3 (FME Server Console), on batch servers running
Linux Redhat Enterprise 5
• FME Server 2012 SP3, on Windows server, with SAN (Storage Area
Network) disk(s) mounted via Samba
• Linux Batch Server
– Cronacle job-queueing system
– Perl and shell scripts
– FME Server Console (remote job submission to FME Server)
– Oracle Run-Time Client
• Windows Web Server
– ArcGIS for Server
– FME Server (full installation)
• MAF/TIGER (Oracle Database)
• Shapefiles on SAN
• ArcSDE Geodatabase
11. Topology Check
• The Topology Check workspace compiles a number of topology- and
tolerance-based metrics:
– Gaps – endpoints within 5 meters of any line segment
– Overshoots – line segments extending less than 5 meters beyond an
intersection
– Tiny Features – features with a total length less than 5 meters
– Floating Features – features or connected sets of features that are not
connected to the rest of the road network
– Exact Duplicates – features whose geometry and name are identical to
another feature
– Coincident – features whose geometry overlaps with another feature
– Crossing – features that cross but do not intersect at a node
– Multi-part – features that consist of multiple geometry parts
– Cutbacks – features containing angles less than 25 degrees
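Two of the simpler metrics above can be sketched in plain Python. This is an illustrative sketch only, not the FME workspace itself: the feature dictionaries, coordinate lists, and function names are assumptions made for the example, and coordinates are assumed to be in a projected system measured in meters.

```python
import math

def length(feature):
    """Total length of a polyline given as a list of (x, y) vertices, in meters."""
    coords = feature["coords"]
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def tiny_features(features, tolerance=5.0):
    """Tiny Features metric: features with a total length under the tolerance (5 m)."""
    return [f for f in features if length(f) < tolerance]

def exact_duplicates(features):
    """Exact Duplicates metric: features whose geometry AND name match another feature."""
    seen, dupes = {}, []
    for f in features:
        key = (tuple(f["coords"]), f["name"])
        if key in seen:
            dupes.append(f)
        else:
            seen[key] = f
    return dupes
```

The remaining metrics (gaps, overshoots, crossings, floating features) need proximity and intersection tests over the whole network, which is what the FME transformers in the workspace provide.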
12. Completeness / Change Detection Check
• The MAF/TIGER road centerline features and the
feature source file road centerline features will be
compared using an FME workspace.
• The MAF/TIGER features will be Buffered to a
distance of 15 meters, then “overlaid” with the
source file features.
• Any source file feature parts that fall outside of the
Buffer areas will be chained together, and the total
length of difference (and of each part) will be
reported as an evaluation metric.
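The buffer-and-overlay comparison can be approximated with a point-to-segment distance test. A minimal sketch, assuming projected coordinates in meters; the function names and data shapes are invented for illustration, and the real check is done with FME buffering and overlay transformers.

```python
import math

def point_seg_dist(p, a, b):
    """Distance from point p to the line segment a-b."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def outside_buffer_length(source, reference_segments, buffer_m=15.0):
    """Sum the length of source-file line parts whose endpoints both fall
    outside the 15 m buffer around every reference (MAF/TIGER) segment."""
    def inside(p):
        return any(point_seg_dist(p, a, b) <= buffer_m for a, b in reference_segments)
    total = 0.0
    for a, b in zip(source, source[1:]):
        if not inside(a) and not inside(b):
            total += math.dist(a, b)
    return total
```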
13. CE95 Qualifying Intersection Identification
• Qualifying intersections must meet the
following criteria:
– Must consist of three roads (a “T” intersection)
or four roads (an “X” intersection)
– Must consist of only secondary roads or local
roads
– Must meet at 90 or 180 degree angles, with a
15 degree plus/minus tolerance
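The criteria above amount to a test on the "node star" of each intersection: count the rays, check the road classes, and check the angles between adjacent rays. A hedged sketch in Python; the MTFCC codes shown (S1200 for secondary roads, S1400 for local roads) and the function signature are assumptions made for the example.

```python
# Assumed MTFCC codes for secondary and local roads (illustrative).
SECONDARY_OR_LOCAL = {"S1200", "S1400"}

def qualifies_ce95(ray_angles, mtfccs, tol=15.0):
    """Check one node star: 3 rays ('T') or 4 rays ('X'), only secondary or
    local roads, and adjacent rays meeting at 90 or 180 degrees (+/- tol)."""
    if len(ray_angles) not in (3, 4):
        return False
    if not all(m in SECONDARY_OR_LOCAL for m in mtfccs):
        return False
    angles = sorted(a % 360 for a in ray_angles)
    # Angles between consecutive rays, including the wrap-around gap.
    between = [angles[i + 1] - angles[i] for i in range(len(angles) - 1)]
    between.append(360 - angles[-1] + angles[0])
    return all(abs(g - 90) <= tol or abs(g - 180) <= tol for g in between)
```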
14. Thank You!
Questions?
For more information:
Jay E. Spurlin
jay.e.spurlin@census.gov
U.S. Census Bureau
http://www.census.gov/geo/www/gss/
Editor's Notes
I work in the Geography Division – or GEO, as we refer to it. We manage MAF/TIGER (Topologically Integrated Geographic Encoding and Referencing), which is a geospatial database system. The data is stored in Oracle Spatial Topology Manager format, and is used in support of various censuses and surveys of the Census Bureau.
This is the basic set of steps through which a set of partner-supplied source files proceeds. Currently, this is a highly manual process and most of the processing is done on shapefiles using ArcGIS for Desktop.

A partner provides a set of source files – this could be through a Regional Office contact, Community TIGER, or via a direct upload. The source files are moved inside the Census firewall via a secure web-exchange module. The content inventory of the files undergoes initial verification, to make sure someone has not accidentally supplied their laundry list. The files are preserved, as supplied, for later reference. This provides a re-start point, if it is ever necessary – as well as a reference against which future submissions could be compared to determine change over time.

A more detailed content assessment is done, including verification that the files meet the minimum guidelines for content and metadata. The files are prepared for automated processing, including re-projection and mapping to a standardized schema. The feature names are standardized to fit the parsed MAF/TIGER naming convention, and metadata is used to derive the MAF/TIGER Feature Classification Code (or MTFCC) for each record.

A series of (mostly) automated checks is run, which provides metrics about the data in the files. For addresses, this includes a range of geocoding checks and comparisons for the addresses and for the address point locations, if they were provided. For the spatial features, I'll talk more about these checks in a moment.

An interactive review is conducted, in which the files and their associated metrics are reviewed and an assessment is made as to how many new features or addresses have been supplied, as well as how many attribute or shape updates.
Based on this review, a decision is made about how to capture any new data – whether the data can continue through an automated update process or should be handled through an interactive update process. If the automated process is appropriate, then any data that are not useful for updating the MAF/TIGER database get removed from the files. Features or addresses are added and/or modified, using the method chosen during the interactive review – either an automated conflate and review process, or an interactive update process.
For the purposes of this discussion, we will focus on the Feature Source Evaluation software – in contrast to the Address Source Evaluation software. There are two separate, dedicated software systems for the evaluation of spatial features and addresses, though the architecture of the GSS-I is integrated to include both. The business model, hardware and software architecture, technology architecture, and security models have been integrated; it is really only the application architectures that have been separated out – and that only because there are established, separate areas of development expertise for spatial features, geographic entities, and addresses.

The list of functionality on this slide indicates the first set of functions targeted for production release at the end of March 2013. Other checks have been proposed, and will likely be added to the software at a future date. Basically, each of the pieces of functionality listed corresponds to a module in the Feature Source Evaluation software system.

A number of MAF/TIGER spatial layers will be extracted for the extent of the partner entity. These will include the road centerline layer, a number of geographic entity boundaries for reference, and the topological edge layer with the primary feature name for each edge. These layers will be extracted using automated FME workspaces, but they are fairly simple and obvious – they basically just read from Oracle Spatial using an SDO_FILTER SQL query, narrow the selection with an AreaOnAreaOverlayer or Clipper, and write to an ArcSDE geodatabase, so I don't plan to show any examples of those.

An analyst will use the supplied data and metadata to map the provided source schema to a standardized schema, and the supplied road centerline file will be converted to an ArcSDE layer, re-projected if necessary, and the name and MTFCC mappings applied.
We will look at some example transformers in a few minutes. The feature names in the source file will be standardized to the parsed MAF/TIGER naming conventions. In production, this will be a Java application, but for the current, manual procedures, an FME workspace is making an HTTPFetcher call to a published web service to do the feature name standardization, with a Decelerator to keep from overloading the web service.

The standardized feature names will be checked to see if any contain illegal characters or prohibited or generic names; another Java application.

A topological check will be run, to gauge the topological stability of the source file. This will be accomplished using a fairly complicated FME Workspace, which we will look at in detail shortly.

A completeness / change detection check will be run to attempt to identify areas in the source file that contain features not found in MAF/TIGER. This will also be accomplished using an FME Workspace, which we will also look at in a moment.

A comparison will be run between the universe of feature names in the source file and the universe of feature names found in MAF/TIGER within the extent of the entity; this will be another Java application.

All intersections that meet the requirements for conducting the CE95 accuracy assessment will be identified. The CE95 accuracy value is stated as a distance in meters, and denotes the circular standard error confidence – a 95% chance that each coordinate falls within that distance from "ground truth". This is the final FME workspace that we will be looking at today.
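The HTTPFetcher-plus-Decelerator pattern described above is essentially a throttled loop over the feature names. A rough Python equivalent, with the standardization service abstracted as a callable, since the actual endpoint is not shown here:

```python
import time

def standardize_names(names, call_service, min_interval=0.5):
    """Decelerator-style throttle: invoke the standardization service for
    each raw name, waiting at least min_interval seconds between calls so
    the web service is not overloaded. call_service is any callable that
    takes a raw name and returns its standardized form."""
    results = []
    last = 0.0
    for name in names:
        wait = min_interval - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        results.append(call_service(name))
    return results
```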
Previously, our general technology architecture as it related to FME was very simple. FME Server was installed on our production Linux batch servers, and FME Engines were invoked via the command line from Perl scripts driven by Cronacle-based control systems.

To keep things simple and better highlight the differences in architecture, the illustrations on this slide and the next depict only the production, batch configuration as it relates to FME.
The technology architecture for FME was restructured for GSS-I to support products and processes that depend on ArcGIS on Windows. The Geography Division deployment of ArcGIS for Server is limited to Windows servers, because a Linux deployment was not seen as a viable option, for various reasons. This prompted us to research and develop a new technology architecture pattern for utilizing FME. The old pattern is still in use, as well, but this new pattern will be applied for the GSS-I and several other new software systems.
One of the business functions for which we are utilizing FME is crosswalking (or transmogrification, as some of our subject matter experts have taken to calling it). This mapping of each source file schema to a standard schema is configured using FME Workbench, and the data transformation is done using FME Server. Source schemas can – and do – vary widely. As you might imagine, the string manipulation and filter transformers come in extremely handy while doing these mappings.

The example on the left shows the use of the AttributeValueMapper transformer to transform a set of road type identifiers into MAF/TIGER Feature Classification Codes. The example on the right shows the use of the StringSearcher transformer to find all instances of a street classification code that end with the digit '5' – then set the MTFCC value to the code that designates the feature as a "Ramp".
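The two transformer examples can be mimicked with a lookup table and a regular expression. A sketch under stated assumptions: the source road-type codes here are hypothetical, and the mapping to MTFCC values (including S1630 for "Ramp") is illustrative rather than the actual crosswalk used in production.

```python
import re

# Hypothetical source road-type codes; real partner schemas vary widely.
TYPE_TO_MTFCC = {
    "1": "S1100",  # primary road
    "2": "S1200",  # secondary road
    "4": "S1400",  # local road
}

def map_mtfcc(record):
    """AttributeValueMapper-style lookup: translate a source road-type code
    into an MTFCC, falling back to local road when the code is unmapped."""
    record["MTFCC"] = TYPE_TO_MTFCC.get(record.get("road_type"), "S1400")
    return record

def flag_ramps(record):
    """StringSearcher-style rule: any classification code ending in the
    digit '5' is reclassified as a ramp."""
    if re.search(r"5$", str(record.get("road_type", ""))):
        record["MTFCC"] = "S1630"
    return record
```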
The topology check workspace uses various transformers to collect metrics about certain types of features or feature interactions in the feature source file. Please note – not all of these are technically "wrong" topologically; they are only meant to be markers for identifying general topology or network stability and to predict MAF/TIGER update behavior. The list of metrics might shrink or grow with time, as more partner files get processed and we learn more about which situations indicate data issues or cause problems during the update of the MAF/TIGER database.

{show topology workspace and explain}

The road centerlines are projected to the North American Lambert Conformal Conic projection, which preserves shape (and thereby angle).
{show the change detection workspace and explain}
For the CE95 accuracy assessment, qualifying intersections must be perpendicular 'T' or 'X' intersections (plus or minus 15 degrees) on secondary and/or local roads.

{show CE95 QI workspace and explain}

The road type selection is accomplished using a TestFilter. The names of the attributes that contain the MTFCC code (road type) and road name are passed in via published parameters. The road centerlines are projected to the North American Lambert Conformal Conic projection, which preserves shape (and thereby angle). The TopologyBuilder is used to find all of the intersection nodes. 'T' and 'X' intersections are identified by counting the number of rays emanating from each node star (the number of elements in the _node_angle list). The _fme_arc_angle values are exposed with an AttributeExposer, and a composite test in a Tester transformer checks the angle ranges. The nodes are projected back to NAD83.

The requirement was to create at least 200 randomly selected nodes, with the goal of assessing the accuracy of 100 of them. A RandomNumberGenerator and Sorter are used to randomly sort and output all the nodes, allowing the user to weed through as many as necessary. The CoordinateExtractor is used to expose the coordinate x and y values as attributes. The StringConcatenator is used to string together all of the road names, which were preserved from the line segments during the topology build.
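The RandomNumberGenerator-and-Sorter step amounts to a random shuffle of the qualifying nodes. A small sketch; the function name is invented, and the seed parameter is added only so the example is reproducible:

```python
import random

def select_assessment_nodes(nodes, seed=None):
    """Randomly order the qualifying intersection nodes so reviewers can
    work down the list (at least 200 output, roughly 100 assessed)."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    return shuffled
```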