All the information needed to find bad actors and stop them from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data scattered across the internet. For those performing investigations and evaluating risk, this needle-in-a-stack-of-needles problem is huge and growing: unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing (NLP), the field of AI that focuses on human language, have for the first time made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured text. In this talk, I will share some of the common patterns, common mistakes, and opportunities that I see in the field.
4. JIHADI BRIDES TRAGEDY
AI FOR GOOD ● BASIS TECHNOLOGY
Image Sources:
- Bethnal trio: Mirror
- Article: Independent
5. ALL THE EVIDENCE EXISTS
Scotland Yard Report
ID
Social Activity
Image Sources:
- Tweet: ISD Global
6. WHAT’S AT STAKE
FINANCIAL STABILITY
Global Money Laundering Operations
1% of Illegal Funds Captured
PUBLIC SAFETY
Deaths from Terrorist Attacks in Europe
11,288 from 1970-2017
Sources:
- Terrorism: Washington Post
- Money laundering: Wall Street Journal
9. COMMON PATTERN
80% of data is unstructured
Join Processed and Structured Data into Knowledge Graph
1) Natural Language Processing Extracts Facts
2) Scored for confidence & relevance
Mine Graph For Patterns & Changes
People
Organizations
Locations
Relationships
Searching
Alerting
Anomaly Detection
Reporting
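The pattern on this slide (extract facts, score them, join them into a graph, then mine the graph for changes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Fact` type, the `min_confidence` threshold, and the made-up facts are not taken from the talk, and a real system would populate the facts from an NLP extraction step rather than by hand.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted (subject, relation, object) triple with a score."""
    subject: str
    relation: str
    obj: str
    confidence: float  # hypothetical score in [0, 1]


def build_graph(facts, min_confidence=0.5):
    """Keep facts at or above the threshold and index them by subject."""
    graph = defaultdict(list)
    for fact in facts:
        if fact.confidence >= min_confidence:
            graph[fact.subject].append((fact.relation, fact.obj, fact.confidence))
    return dict(graph)


def new_edges(old_graph, new_graph):
    """Return edges present in new_graph but absent from old_graph."""
    changes = []
    for subject, edges in new_graph.items():
        known = {(rel, obj) for rel, obj, _ in old_graph.get(subject, [])}
        for rel, obj, conf in edges:
            if (rel, obj) not in known:
                changes.append((subject, rel, obj, conf))
    return changes


# Made-up facts standing in for the output of an extraction step.
day1 = [
    Fact("Person A", "works_for", "Company B", 0.9),
    Fact("Person A", "located_in", "City C", 0.3),  # dropped: low confidence
]
day2 = day1 + [Fact("Person A", "director_of", "Company D", 0.8)]

alerts = new_edges(build_graph(day1), build_graph(day2))
print(alerts)
```

Filtering on confidence before the join keeps weak extractions out of the graph, and diffing two snapshots is one simple way to drive the "Alerting" box; pattern mining and anomaly detection would sit on top of the same structure.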
10. CHALLENGES AT EVERY LEVEL
● Domains
● Languages
● Training Data
● Data Salad!
● Data Access
● Duplication
● Variation
● Ambiguity
● Semantics
● Honey Pots
● Training Data
● GIGO
● Data Overload
● Alert Bombs
● Privacy
● Trust
11. CHALLENGES AND ANTI-PATTERNS
Identifying Context
1) Reliance on Keywords
2) Naive Rules
Leads to False Positives and False Negatives
Example: “... government officials were convicted of corruption. ABC Company saw a drop in sales as …”
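A short sketch shows why keyword reliance produces both kinds of error. The watch list, the function names, and both sample texts are made up for illustration (the first text is written in the spirit of the slide's example, not copied from it). The sentence-scoped rule is only a slightly less naive rule; it does not stand in for real context modeling.

```python
import re

RISK_TERMS = {"corruption", "laundering", "fraud"}  # hypothetical watch list


def keyword_hit(text, entity):
    """Naive rule: flag if the entity and any risk term appear anywhere in the text."""
    lowered = text.lower()
    return entity.lower() in lowered and any(term in lowered for term in RISK_TERMS)


def same_sentence_hit(text, entity):
    """Slightly less naive: require the entity and a risk term in one sentence."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        if entity.lower() in lowered and any(term in lowered for term in RISK_TERMS):
            return True
    return False


unrelated = ("Government officials were convicted of corruption. "
             "ABC Company saw a drop in sales last quarter.")
relevant = "ABC Company executives were found guilty of bribing inspectors."

print(keyword_hit(unrelated, "ABC Company"))        # True: false positive
print(same_sentence_hit(unrelated, "ABC Company"))  # False
print(keyword_hit(relevant, "ABC Company"))         # False: false negative
```

The first text mentions the company and a risk term in unrelated sentences, so the document-level rule fires wrongly. The second text describes real risk but uses none of the listed words, so both rules miss it.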
12. CHALLENGES AND ANTI-PATTERNS
Identifying Proper Names
3) Name Variants
4) Name Parts (common keys)
Leads to False Positives and False Negatives
Abdul-Rasheed ➔
abdul rashid
abdal rashide
abdal-rasheed
abdul-rashiyd
abdul-rachid
abd-errshiyd
abd-errchide
abd-errcheed
abd-errchiyd …
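To make the variant problem concrete, here is a minimal sketch that compares an exact normalized key with a graded similarity score. Plain Levenshtein edit distance is used as a stand-in: it is an assumption for illustration, not the matching method from the talk, and it ignores transliteration rules, phonetics, and cross-script matching that production name matching would need.

```python
def normalize(name):
    """Lowercase and keep letters only, so 'Abdul-Rasheed' becomes 'abdulrasheed'."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def edit_distance(a, b):
    """Classic Levenshtein distance computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(name_a, name_b):
    """Score in [0, 1]; 1.0 means the normalized names are identical."""
    a, b = normalize(name_a), normalize(name_b)
    longest = max(len(a), len(b)) or 1
    return 1.0 - edit_distance(a, b) / longest


query = "Abdul-Rasheed"
for variant in ["abdal-rasheed", "abdul rashid", "abd-errchide"]:
    exact = normalize(query) == normalize(variant)
    print(f"{variant}: exact_key={exact} similarity={similarity(query, variant):.2f}")
```

An exact key rejects every variant in the list, which is the false-negative side of the anti-pattern. A graded score lets the system rank candidates and apply a threshold, although character-level distance alone still scores the more distant transliterations poorly.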
14. CHALLENGES AND ANTI-PATTERNS
3) Failure to match variants
4) Failure to disambiguate
5) Failure to model what matters
6) Monolingual design
“Operation Hairball”
17. CROSS-LINGUAL SEMANTIC MODELING
Machine Learning
למידה חישובית
Eagle Pharmaceuticals Inc.
Eagle Drugs, Co.
Tesla
Energy Storage
טסלה
AI
تيسلا موتورز
計算学習
אחסון אנרגיה
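The idea behind the slide is that names meaning the same thing should resolve to the same concept regardless of wording or language. The sketch below fakes that with a tiny hand-written lookup table. The table, the suffix list, and the `concept_key` function are all hypothetical; a real system would learn such equivalences (for example from multilingual text embeddings) rather than enumerate them.

```python
import re

# Hypothetical hand-written concept table standing in for a learned semantic model.
CONCEPTS = {
    "pharmaceuticals": "PHARMA",
    "drugs": "PHARMA",
    "tesla": "TESLA",
    "טסלה": "TESLA",
}
LEGAL_SUFFIXES = {"inc", "co", "corp", "ltd"}  # illustrative, not exhaustive


def concept_key(name):
    """Map a name to a tuple of concept ids, ignoring legal suffixes and punctuation."""
    tokens = re.findall(r"\w+", name.lower())
    return tuple(CONCEPTS.get(tok, tok) for tok in tokens if tok not in LEGAL_SUFFIXES)


print(concept_key("Eagle Pharmaceuticals Inc.") == concept_key("Eagle Drugs, Co."))
print(concept_key("Tesla") == concept_key("טסלה"))
```

Both comparisons succeed even though the surface strings share few or no characters, which is something string similarity alone cannot deliver. A lookup table does not scale to every language and domain, which is why the talk points to learned semantic models.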
18. AI BUILDING BLOCKS: Algorithms & High Quality Data
● NN NER
● NN CLASS
● NN RELAX
● SVM
● TEXT EMBEDDINGS
● NNs
● NL SEARCH
● CLASSIC ML
● ANOMALY DETECTION
● HMM
● SEMANTIC MODELING
● GRAPH SIMILARITY
● Data Filtering
● Classification
● Deduplication
● High Quality Annotations
● Language & domain combos
● Active Learning Feedback
● High Quality Name Pairs in every language pair
● Confidence Modeling
● Semantic Model
● Baseline “normal”
● Queries
● Visualizations
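Two of the building blocks above, anomaly detection and a baseline of “normal”, fit together in a simple way that a sketch can show. The z-score test, the threshold of 3, and the weekly counts below are assumptions chosen for illustration; the talk does not say which anomaly detection method is used.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, threshold=3.0):
    """Flag value if it lies more than `threshold` sample standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return value != baseline[0]
    return abs(value - mean(baseline)) / spread > threshold


weekly_mentions = [4, 5, 6, 5, 5]  # made-up counts of mentions per week
print(is_anomalous(weekly_mentions, 12))  # True
print(is_anomalous(weekly_mentions, 6))   # False
```

The quality of the alert depends on the quality of the baseline, which is one reason the slide pairs algorithms with high-quality data.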
19. PUTTING IT ALL TOGETHER
People
Organizations
Locations
Relationships
Searching
Alerting
Anomaly Detection
Reporting
20. THIS TECHNOLOGY IS ALREADY AT WORK
21. CAPTURING EL CHAPO
Source: U.S. Immigration and Customs Enforcement
22. CAPTURING EL CHAPO
Source: El Chapo recaptured in gun battle
23. KEY DOMAINS OF IMPACT
National Security
Financial Services
Law Enforcement
Intelligence
24. THANK YOU
Chris Mack
● Basis Technology
● I design & implement NLP / NLU solutions for good
● Please reach out!
@cgmack