Trusted analytics and predictive data models require accurate, consistent, and contextual data. The more attributes used to fuel models, the more accurate their results. However, building comprehensive models with trusted data is not easy. Accessing data from multiple disparate sources, making spatial data consumable, and enriching models with reliable third-party data are all challenging.
In this webinar you will learn how to:
• Organize and manage address data and assign a unique and persistent identifier
• Enrich addresses with standard and dynamic attributes from our curated data portfolio
• Analyze enriched data to uncover relationships and create dashboard visualizations
Learn How to Turbocharge Your AI/ML Data Workflows with Data Enrichment
1. Data Enrichment: The Key to Turbocharging your AI/ML Data Workflow
Tim McKenzie | Director, Solution Architecture
2. Location and Data are solving real-world challenges in a complex, digital economy
INSURANCE / TELECOMMUNICATIONS
• Underwriting
• Risk Accumulation
• Catastrophe Modelling
• Claims Processing
• Fraud analysis
• Customer Insights
• Network and coverage planning
• Opportunity analysis
• Location-based marketing & advertising
• Asset management
GOVERNMENT
• Citizen communications
• Service optimization
• Election operations
• Census operations
• Emergency response and management
REAL ESTATE
• Home search
• Data cleaning
• Data preparation
• Automated valuations
AD TECH
• Geotargeting
• Audience profile creation
• Mobile marketing & advertising
• Geofence campaigns
RETAIL
• Retail location analysis
• Location-based marketing & advertising
• Store finder
• Service area analysis
ECOMMERCE
• Address data capture
• Customer insight
• Reduce abandonments
• Logistics and delivery
• Location-based marketing & advertising
FINANCIAL SERVICES
• Mortgage processing
• Customer Insight
• Master Data Management
• Financial crimes and compliance
• Branch location analytics
3. Location data challenges
• Location is messy: addresses, lat/long, shapes, lines, formats
• Complexity of joining location-based data sources (3rd party and internal)
• Data sourcing challenges: many providers, many formats, and many pricing and licensing differences
• Global extensibility: data sources tend to be regional, yet use cases are often global
• Need to identify and process multi-family and condo properties
• De-centralized repositories of data
• Complex properties can often have multiple valid addresses, parcels, and buildings
• Legal descriptions come in a variety of formats, leading to discrepancies, inefficiencies, errors, and non-compliance
“For every minute spent in organizing, an hour is earned.”
Benjamin Franklin, Inventor, Statesman, Insurer
4. Data prep slows data science
What data scientists spend the most time doing:
• Building datasets: 3%
• Cleaning and organizing data: 60%
• Collecting datasets: 19%
• Mining data for patterns: 9%
• Refining algorithms: 4%
• Other: 5%
Data preparation accounts for about 80% of the work of data scientists
5. Location enabling strategies for data analytics
01. Organize: Assign a trusted ID that is unique and persistent to each address
02. Enrich: Leverage trusted ID to join massive amounts of your own and 3rd party data sources
03. Analyze: Apply data science at scale to gain a competitive advantage
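The Organize step can be illustrated with a minimal sketch. This is not how the PreciselyID is generated (the slides do not describe that). It only assumes a crude, hypothetical `normalize` helper and uses a deterministic UUID as a stand-in for a key that is unique and persistent, so the same address always maps to the same ID across runs.

```python
import re
import uuid

# Hypothetical namespace for the stand-in IDs; any fixed UUID would work.
ADDRESS_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "addresses.example.com")


def normalize(address: str) -> str:
    """Crude normalization: uppercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", address.upper())
    return " ".join(cleaned.split())


def assign_id(address: str) -> str:
    """Deterministic stand-in ID: same normalized address, same key."""
    return str(uuid.uuid5(ADDRESS_NAMESPACE, normalize(address)))


raw = [
    "287 E 300 S. Provo, UT 84606",
    "287 e 300 s,  provo ut 84606",  # same place, different formatting
    "410 N University Ave. Provo, UT 84601",
]
ids = {address: assign_id(address) for address in raw}
```

String normalization alone will not resolve real-world variants such as abbreviations or alternate addresses for the same property; that is where reference data and a geocoder come in.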
6. Fast, easy, and consistent data enrichment
Precisely’s Geo Addressing with hyper-accurate Master Location Data (MLD) reference data
International Coverage
• Belgium & Luxembourg
• Canada
• Finland
• France
• Germany
• Great Britain
• Ireland
• Netherlands
• Sweden
• Singapore
• United States
• More coming soon!
Data Sources
• Postal Authorities
• Government datasets: local city, county, and state
• Global Vendors
• Local Players
• Open Sources
• Proprietary Sources
MLD Attributes
• Largest & Best available
• Unparalleled &
• Parent-child relationship
• Unique and Persistent Identifier
• Multi-sourced
• Simplify data enrichment process
7. Cloud-based location analytics technology
• Spatial Functions: 30+ common spatial processes
• Global Geocoding: Forward & reverse global geocoding and Trusted ID
• Global Addressing: Validate, standardize, and parse global addresses
• Global Tax Jurisdiction
• Map Visualization
• Global Street Routing
How do extreme weather events affect the “creditworthiness” of my portfolio?
What alternative data helps me better understand investment opportunities?
Where are my customers and how do they want to interact with me?
8. Data Enrichment – A global product portfolio
Addresses & Property: Verified and validated address and property data for map display and analytics
Boundaries: Administrative, community, and industry-specific boundaries for data enrichment and territory analysis
Demographics: Demographic and consumer context data for better understanding people and behavior
Points of Interest: Detailed business, leisure, and geographic features for location and competitive intelligence
Streets: Robust street-level data for mapping, analysis, routing, and geocoding
Risk: Natural hazard boundaries related to flood, fire, earthquakes, and weather
Expertly curated datasets containing thousands of attributes for faster, confident decisions
9. Uniquely positioned to address data enrichment needs
Global coverage location enrichment data. Our portfolio includes:
• 400+ datasets
• 250+ countries and territories
• 100s of millions of data points
Datasets that are interoperable and managed to a quality standard, with consistent documentation and support, e.g.:
• Property Graph
• Market and Community Link
Ability to enrich with dynamic data (Dynamic Weather and Dynamic Demographics)
• Data that includes time as a dimension
• Creating insights from data that is updated at regular and short time intervals (e.g. 5 min)
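To make "time as a dimension" concrete, here is a minimal sketch of an as-of lookup against readings that arrive every 5 minutes. The location key `LOC-1`, the readings, and both helper names are hypothetical; the slides do not describe how the dynamic datasets are delivered or queried.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=5)


def floor_to_interval(ts: datetime) -> datetime:
    """Round a naive timestamp down to the start of its 5-minute bucket."""
    elapsed = (ts - datetime.min) // INTERVAL * INTERVAL
    return datetime.min + elapsed


def value_as_of(series, ts):
    """Return the latest reading at or before ts, or None if there is none.

    `series` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    times = [t for t, _ in series]
    i = bisect_right(times, ts)
    return series[i - 1][1] if i else None


# Hypothetical temperature readings for one location, one every 5 minutes.
readings = {
    "LOC-1": [
        (datetime(2023, 1, 1, 12, 0), 3.5),
        (datetime(2023, 1, 1, 12, 5), 3.9),
        (datetime(2023, 1, 1, 12, 10), 4.2),
    ]
}

event_time = datetime(2023, 1, 1, 12, 7, 30)
bucket = floor_to_interval(event_time)               # 12:05 bucket
temp = value_as_of(readings["LOC-1"], event_time)    # reading from 12:05
```

Looking up the value as of the event time, rather than the latest value, keeps a model from training on observations that were not yet available when the event happened.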
Data experience through deep-domain expertise
• Adding data through development, partnerships, and acquisitions
Best-in-class addressing and property datasets with a unique and persistent ID
• Link Precisely and customer address, building, demographic, risk, and other data using the PreciselyID, a unique and persistent location identifier
10. Understanding the data challenge
Key data challenges that organizations face when productionizing ML systems:
• Accessing the right raw data
• Keeping up with continuously changing data feeds
• Building features from raw data
• Combining features into training data
• Calculating and serving features in production
• Monitoring features in production
11. Location-enabled analytics
All of your sources. Any structure or frequency.
• Bank Branch & ATM
• Call Center/Web
• Customers by Product
• Commercial & Mortgage
• Active Mortgages
• Historical Defaults
• Financial Transactions
Geocoding and location intelligence capabilities to organize and enrich your data
PreciselyID
• ADMIN BOUNDARIES
• BANK DEPOSITS
• MOBILE MOVEMENT
• WEATHER EVENTS
• HAZARD & RISK DATA
• AMENITIES & COMPETITION
• EVERY US/CAN ADDRESS
• BUSINESS LOCATIONS
• PROPERTY ATTRIBUTES
• SCHOOLS & NEIGHBORHOODS
• POPULATION DEMOGRAPHICS
• PARCELS & BUILDINGS
Analytics capabilities for any use case or persona
• Ad Hoc Data Science: Low-cost, rapid experimentation with new data and models.
• Explainable Machine Learning: High volume, fine-grained analysis at scale served in the tightest of service windows.
• BI Reporting & Dashboarding: Power real-time dashboarding directly, or feed data to a data warehouse for high-concurrency reporting.
• Real-time Applications: Provide real-time data to downstream applications or power applications via APIs.
Analytics Platform
12. What is a “feature-based” architecture?
A feature store is an ML-specific data system that:
• Runs data pipelines that transform raw data into feature values
• Stores and manages the feature data itself, and
• Serves feature data consistently for training and inference purposes
A feature is data used as an input signal to a predictive model
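The three responsibilities of a feature store can be sketched in a few lines. This is a toy in-memory illustration, not any particular product's API: the `FeatureStore` class, its method names, the feature names, and the raw record are all hypothetical.

```python
from typing import Callable, Dict, List


class FeatureStore:
    """Toy in-memory feature store: transform, store, and serve feature values."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Callable[[dict], float]] = {}
        self._values: Dict[str, Dict[str, float]] = {}

    def register(self, name: str, transform: Callable[[dict], float]) -> None:
        """Register a pipeline step that turns a raw record into one feature value."""
        self._transforms[name] = transform

    def ingest(self, entity_id: str, raw: dict) -> None:
        """Run every registered transform and store the results for the entity."""
        row = self._values.setdefault(entity_id, {})
        for name, transform in self._transforms.items():
            row[name] = transform(raw)

    def get_features(self, entity_id: str, names: List[str]) -> List[float]:
        """Serve features in a fixed order; training and inference share this path."""
        row = self._values[entity_id]
        return [row[name] for name in names]


store = FeatureStore()
store.register("flood_risk", lambda r: 1.0 if r["flood_zone"] else 0.0)
store.register("income_k", lambda r: r["median_income"] / 1000)

# Hypothetical raw record keyed by a hypothetical address ID.
store.ingest("addr-001", {"flood_zone": True, "median_income": 52000})

features = ["flood_risk", "income_k"]
training_row = store.get_features("addr-001", features)
inference_row = store.get_features("addr-001", features)
```

Because training and inference read through the same `get_features` call, the model sees identically computed values in both settings, which is the consistency the slide refers to.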
13.
Processing
Storage
Inputs: Location-specific records, shape files, streaming records
Address Fabric
Analytics Processing
• Model outputs
• Scores
• Computed columns
• Analysis outcome
Batch Geocoding with the Operational Addressing SDKs
• Validate input addresses
• Validate other data
• Locate addresses
• Match inputs
• Assign PreciselyID
• Relate data around PreciselyID
Batch Spatial Processing with the Location Intelligence SDK
• Flatten shape files
• Compute PIP
• Compute D2P, D2L
• Compute basic scores
• Generate geohash
• Relate data around geohash (where applicable)
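Two of the spatial computations listed above can be sketched in plain Python, assuming PIP stands for point-in-polygon and D2P for distance-to-point (the slide does not expand the abbreviations). This does not use the Location Intelligence SDK; the function names, the square zone, and the test points are hypothetical.

```python
import math


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


# Hypothetical square zone and two test points, all as (lon, lat).
zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
inside = point_in_polygon(2.0, 2.0, zone)    # True
outside = point_in_polygon(5.0, 2.0, zone)   # False
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
```

The polygon test treats longitude and latitude as planar coordinates, which is a reasonable approximation for small zones but not for polygons that span large areas or cross the antimeridian.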
Realtime Processing with the Precisely SDKs
• Operational Addressing APIs
• Assign PreciselyID
• Generate geohash
• Relate data
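For readers unfamiliar with the "Generate geohash" step, the following is a minimal sketch of the standard public geohash encoding written from scratch. It is not the Precisely SDK call, and the function name `geohash_encode` is an assumption for illustration.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


cell = geohash_encode(57.64911, 10.40744, precision=5)  # "u4pru"
```

Points that fall in the same cell share the same geohash string, so the string can act as a join key for "relating data around geohash" when no address-level ID is available.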
Message Bus
Feature Store
In-stream Analytics Layer: Model outputs, scores, computed columns, analysis outcomes
PreciselyID | Address
P0000MK1IAAD | 287 E 300 S. Provo, UT 84606
P0000MK1DPRD | 410 N University Ave. Provo, UT 84601
• Vendor data files
• Customer Loyalty Records
• Equipment Inventories
• Franchise Zones
• Pricing
• Delivery Territories
• Mobile Trace Data
• POS/IOT Data
Administration, Governance, Security, Connectivity, Schema, Catalog
Model Training
EDW
Precisely data subscriptions with PreciselyID
PreciselyID | Address | Name | Type | Score | Location | MICode | PointCode | DemoRgn
P0000MK1IAAD | 287 E 300 S. Provo, UT 84606 | Empas LLC | REST | 91.529 | UT108 | 10020100 | 101067669 | 8926
P0000MK1DPRD | 410 N University Ave. Provo, UT 84601 | THAI HUT | REST | 65.981 | UT108 | 10020100 | 100854441 | 4144
…. ….. ….. ….. ….. …. …. ….. ….
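The enrichment shown in the table amounts to a key-based join on the PreciselyID. A minimal sketch, using the two sample rows from the slide: the dictionary field names and the `enrich` helper are hypothetical, and a real pipeline would more likely do this in a dataframe or SQL engine.

```python
# Customer-side records already keyed by PreciselyID (addresses from the slide).
customers = [
    {"precisely_id": "P0000MK1IAAD", "address": "287 E 300 S. Provo, UT 84606"},
    {"precisely_id": "P0000MK1DPRD", "address": "410 N University Ave. Provo, UT 84601"},
]

# Enrichment rows from a data subscription (values copied from the slide table).
enrichment = [
    {"precisely_id": "P0000MK1IAAD", "name": "Empas LLC", "type": "REST", "score": 91.529},
    {"precisely_id": "P0000MK1DPRD", "name": "THAI HUT", "type": "REST", "score": 65.981},
]


def enrich(records, reference, key="precisely_id"):
    """Left-join records to reference rows on a shared key."""
    lookup = {row[key]: row for row in reference}
    return [{**rec, **lookup.get(rec[key], {})} for rec in records]


enriched = enrich(customers, enrichment)
```

Records with no match in the reference data pass through unchanged, so unmatched addresses remain visible for follow-up instead of silently dropping out.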