The document discusses managing a spatial data warehouse (SDW) using FME. Previously, spatial data was stored across multiple SDE instances, causing issues with data duplication, inconsistencies, and access. Now, FME is used to consolidate all spatial data into a single SDW for standardized management, access control, and self-serve capabilities like web services. FME drives the SDW through hundreds of jobs that synchronize, quality check, derive, and distribute data via incremental updates and web services. This centralized approach using FME solves previous problems and provides benefits like standards, metadata, elimination of redundancy, and a single access point for spatial data.
4. The Case for an Enterprise Spatial Data Warehouse (SDW)
As the company grew, many separate SDE instances popped up for specific use cases. We used FME to move data then, as we do now.
Diagram (simplification of integrations): Geo, Land, Eng, Fin, EHS, HR, shapefiles, GDB, Web Service and Data Service connected to four separate SDE instances (SDE 1 to SDE 4).
5. SIGNIFICANT ISSUES WITH OLD SETUP
• Managing data (8 different SDE instances in 2014)
• Duplicated data
• Inconsistency in naming conventions, coordinate systems & metadata
• Access: Shared local user accounts = issues with account locks
• Synchronization between geodatabases
6. THE SDW
We can
• Create & Implement Standards
• Control access for individual users
• Add standardized metadata
• Eliminate redundant datasets
• And MUCH MORE!
Now there is ONE place to store all spatial data.
7. Spatial Data Warehouse (SDW)
Single source for spatial data: client connections, integrations, ArcGIS services, FME web services, etc.
Diagram (simplification of integrations): Geo, Land, Eng, Fin, EHS, HR, shapefiles, GDB, Web Service and Data Service all connected to the single SDW, which serves users through FME/ArcGIS Web Services.
9. FME’s role in the SDW
Hundreds of FME Server jobs manage the following tasks:
• Sync and Spatialize
• Ensure data quality (QA/QC)
• Derive new data
• Self-Serve GIS via web services
• Download/Update vendor data
10. EXAMPLE
Incremental Updates
For data that requires high availability and frequent updates
11. EXAMPLE
Incremental Updates
Startup Python:
● Before the workbench runs, we want to make sure it isn’t already running. If it is, this job will be canceled.
● We get this information from the FME REST services on FME Server via Python.
import requests

# FME Server REST endpoint listing the currently running jobs
# (the query string is truncated on the slide)
url = 'http://fme.dvn.com/fmerest/v2/transformations/jobs/running?accept=json…'
jobName = "WELL_MDM_WC_ALL_SH_P_v4.fmw"

response = requests.get(url).json()

# Workspace file name of every running job
jobList = [job['request']['workspacePath'].split('/')[2].strip('"') for job in response]

# runJob is declared global, as in the original script
global runJob
# This job is already in the running list when the startup script executes,
# so a count above 1 means another instance is running and this run is canceled.
runJob = 'N' if jobList.count(jobName) > 1 else 'Y'
12. EXAMPLE
Incremental Updates
Writing Data:
● First write to landing table, then MERGE
UPDATE DVN_GIS.WELL_MDM_WC_ALL_SH_PU
    SET UPDATEID = (NEXT VALUE FOR DVN_GIS.MDMWELLUPDATE);
BEGIN TRAN;
MERGE DVN_GIS.WELL_MDM_WC_ALL_SH_P T
USING DVN_GIS.WELL_MDM_WC_ALL_SH_PU U
    ON (T.WELLBORE_COMPLETION_DUWI = U.WELLBORE_COMPLETION_DUWI)
WHEN NOT MATCHED BY TARGET THEN
    INSERT [OBJECTID], [WELL….
    VALUES U.[UPDATEID],U.[WELL…
WHEN MATCHED THEN
    UPDATE SET T.[WELL…,… = U.[WELL… ,...;
COMMIT;
(Flow: Geo source, then landing table WELL_MDM_WC_ALL_SH_PU, then FME SQL After, then target table WELL_MDM_WC_ALL_SH_P)
TIP: OBJECTID will be a pain! Use a database sequence to “fake” it for INSERTs
14. Why Offer Geospatial Web Services?
• EASY to set up and use & SAVES TIME!
• Geospatial data/geoprocessing via web request with parameters
• Users can run individually or incorporate into scripts, software, etc.
• Compatibility: Data returned in JSON format
• No additional software needed
15. Geospatial Web Services* at Devon
• Footage Call Converter (given footages referenced to landgrid, return lat/long)
• Offset wells (given wellbore, find neighboring wells in 3D)
• Wellbore XYZ from MD (given measured depth along wellbore, return XYZ)
• Gunbarrel Well Views (cross-sectional sub-surface view for set of well laterals)
• Coordinate conversions
• Well Area Analyzer (given location, return county/state/division/BU, etc.)
• Convergence Calculator (Grid North <--> True North)
* Provided via FME Server Data Streaming Service
17. EXAMPLE
Lat/Lon to XY Web Service
JSONTemplater formats attributes into JSON string
18. EXAMPLE
Lat/Lon to XY Web Service - result
http://fme.pre.dvn.com/fmedatastreaming/web_services/LL_to_XY_Conv.fmw?lat=35.934458&lon=-98.585498&ll_epsg=4267&end_epsg=32024&token=xxxxx
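A script or application could assemble the same request programmatically. The following is a minimal sketch, not Devon's client code: the helper name build_ll_to_xy_url is hypothetical, and the parameter names are taken only from the example URL on this slide. The structure of the JSON response is not shown in the slides, so the sketch stops at building the URL.

```python
from urllib.parse import urlencode

# Base address copied from the slide's example request
BASE_URL = "http://fme.pre.dvn.com/fmedatastreaming/web_services/LL_to_XY_Conv.fmw"


def build_ll_to_xy_url(lat, lon, ll_epsg, end_epsg, token):
    """Return the request URL for the Lat/Lon to XY web service (hypothetical helper)."""
    params = {
        "lat": lat,            # latitude in the source coordinate system
        "lon": lon,            # longitude in the source coordinate system
        "ll_epsg": ll_epsg,    # EPSG code of the input lat/lon
        "end_epsg": end_epsg,  # EPSG code of the desired XY output
        "token": token,        # FME Server token (placeholder value below)
    }
    return BASE_URL + "?" + urlencode(params)


url = build_ll_to_xy_url(35.934458, -98.585498, 4267, 32024, "xxxxx")
print(url)
```

A caller with network access to the server would then fetch this URL with any HTTP client and parse the returned JSON.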
I can’t even fit the outputs on this diagram (even though it is simplified!)
Clients directly access the SDW via applications or through web map services served from ArcGIS Server.
Data can also be consumed via web services from FME Server.
FME was integral in building out the SDW (migrating, reprojecting and consolidating data), but I am going to focus on a few specific examples of ongoing jobs we currently use to maintain our SDW.
The #1 way we use FME is to read data from some source (spatial or non-spatial) and create/update a spatial version of that dataset in our SDW. These jobs are scheduled on FME Server and run at varying frequencies.
- For small datasets, truncate and load overnight is sufficient
- For large datasets, we must update via transactions
“UPSERT”ing Data
Problem: Unfortunately, we don’t have a good way of knowing whether each well data update is a new (INSERT) or existing (UPDATE) record. We would have to read in the entire dataset and use a ChangeDetector to find out. With a dataset of 4.8 million records, this is not feasible. Furthermore, updating data transactionally with the GeoDB writer is relatively slow.
Solution: Insert all updates into a “hidden” table and use the “SQL After” functionality to execute a database MERGE statement. The MERGE updates or inserts accordingly and is significantly faster.
Read the FME job history to see when the last successful job started, then run updates from that time.
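The selection of that start time can be sketched as follows. This is an illustration under assumptions, not the job's actual logic: the record keys (workspace, status, timeStarted) and the status values are hypothetical stand-ins for whatever the FME Server job history returns, and the sample history is made up.

```python
from datetime import datetime


def last_successful_start(jobs, workspace):
    """Return the latest start time among successful runs of `workspace`, or None."""
    starts = [
        datetime.fromisoformat(job["timeStarted"])
        for job in jobs
        if job["workspace"] == workspace and job["status"] == "SUCCESS"
    ]
    return max(starts, default=None)


# Hypothetical job history records (keys and values are assumptions)
history = [
    {"workspace": "WELL_MDM_WC_ALL_SH_P_v4.fmw", "status": "SUCCESS",
     "timeStarted": "2017-05-01T02:00:00"},
    {"workspace": "WELL_MDM_WC_ALL_SH_P_v4.fmw", "status": "FME_FAILURE",
     "timeStarted": "2017-05-01T03:00:00"},
    {"workspace": "OTHER.fmw", "status": "SUCCESS",
     "timeStarted": "2017-05-01T04:00:00"},
]

since = last_successful_start(history, "WELL_MDM_WC_ALL_SH_P_v4.fmw")
print(since)
```

Filtering on the start time rather than the finish time means that source records changed while the previous job was still running are picked up again on the next run.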
The DATABASE handles the updates and inserts. This will always be more efficient than an application using transactions!
The landing table is truncated and then loaded with plain INSERTs.
I like to refer to this method as UPSERTs.
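The landing-table pattern can be tried out end to end in a small, self-contained sketch. It uses SQLite from the Python standard library, which has no MERGE statement, so the sketch swaps in SQLite's INSERT ... ON CONFLICT DO UPDATE (SQLite 3.24 or later) to get the same insert-or-update effect. The table names, column names and sample rows are hypothetical and much simpler than the real well tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target  (duwi TEXT PRIMARY KEY, well_name TEXT);
    CREATE TABLE landing (duwi TEXT, well_name TEXT);
    INSERT INTO target VALUES ('A1', 'old name');
""")


def upsert_from_landing(conn, updates):
    """Load `updates` into the landing table, then upsert them into the target."""
    with conn:  # commits on success, rolls back on error
        # Truncate and load the landing table with plain inserts
        conn.execute("DELETE FROM landing")
        conn.executemany("INSERT INTO landing VALUES (?, ?)", updates)
        # Let the database decide between INSERT and UPDATE per row.
        # "WHERE true" is needed by SQLite's parser when an upsert reads from a SELECT.
        conn.execute("""
            INSERT INTO target (duwi, well_name)
            SELECT duwi, well_name FROM landing WHERE true
            ON CONFLICT(duwi) DO UPDATE SET well_name = excluded.well_name
        """)


upsert_from_landing(conn, [("A1", "new name"), ("B2", "brand new")])
rows = conn.execute("SELECT duwi, well_name FROM target ORDER BY duwi").fetchall()
print(rows)
```

As in the slide's MERGE, the writer only ever performs simple inserts into the landing table, and a single set-based statement resolves which rows are new and which already exist.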
Data deletions are not common, but a weekly full refresh is completed over the weekend just in case.
Devon makes use of the FME Server Data Streaming Service to provide web services for users across the enterprise.
Listed are a few examples of the web services offered.
Some of these are run ad hoc by users, but the most common use comes from custom developed applications that require GIS data.