THE WORLD IS Y0UR$:
GEOLOCATION-BASED WORDLIST
GENERATION WITH WORDSMITH
SANJIV KAWA | TOM PORTER
@hackerjiv | @porterhau5
Formalities
2
Sanjiv Kawa
@hackerjiv
SR. PENETRATION TESTER
PSC / NCC GROUP
• Roots in dev and IT
• Penetration testing
• Binary analysis and exploitation
Formalities
3
Tom Porter
@porterhau5
SR. SECURITY CONSULTANT
FUSIONX RED TEAM
• Flow data analytics
• Penetration testing
• Red teaming
• BloodHound extensions
What is Wordsmith?
4
Custom wordlist generation
Crack hashes / password attacks
Tailored for your target
Geo-location data
Modular and extensible
Username generation
Dictionary Attack
5
1. Guess
apple
banana
cherry
…
2. Hash
$hash <- hash(apple)
$hash : 5ebe7dfa074da8ee8aef1faa2bbde876
3. Compare
Search for $hash in obtained hash list:
af5432a79b941528fa7fac9e7e391651
5ebe7dfa074da8ee8aef1faa2bbde876
8846f7eaee8fb117ad06bdd830b7586c
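The guess, hash, compare loop can be sketched in a few lines of Python. This is only an illustration: the slide does not name the hash algorithm, so MD5 is used here purely as an assumption, and `md5_hex` and `dictionary_attack` are hypothetical helper names, not part of Wordsmith or Hashcat.

```python
import hashlib


def md5_hex(word: str) -> str:
    """Hash a candidate word (MD5 is an assumption made for this sketch)."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()


def dictionary_attack(wordlist, target_hashes, hash_fn=md5_hex):
    """Return {hash: word} for every wordlist entry whose hash is a target."""
    targets = {h.lower() for h in target_hashes}
    recovered = {}
    for word in wordlist:          # 1. Guess
        digest = hash_fn(word)     # 2. Hash
        if digest in targets:      # 3. Compare
            recovered[digest] = word
    return recovered


if __name__ == "__main__":
    wordlist = ["apple", "banana", "cherry"]
    # Stand-in for an obtained hash list: one crackable entry, one not.
    obtained = [md5_hex("banana"), md5_hex("not-in-the-wordlist")]
    for digest, word in dictionary_attack(wordlist, obtained).items():
        print(digest, "->", word)
```

Storing the targets in a set keeps each comparison constant-time, so the cost is dominated by hashing the guesses. That is why a smaller, better-targeted wordlist pays off.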
Wordsmith v1
6
Wordsmith v1: Geo-location Data Collected
7
Major league sports teams
Colleges and universities
Common names
Area codes
Zip codes
Streets and roads
Landmarks
Cities, towns, etc
Wordsmith v1: Additional Features
8
CeWL Integration
Basic mangling (whitespace, specials, split on space)
Specify minimum character length
To lowercase [a-z]
Wordsmith v1: Things we learned
9
Feedback from the community was incredible. Thank you!
Top three requests:
1. More countries need to be available (v1 was US only)
2. Needs to be a way to introduce more/your own data
3. Support for languages other than English (v1 was English only)
Wordsmith v2
10
New CLI design
Multi-language (13 so far!)
Introduced religions
Generate usernames
Modular framework allows for user contribution and extensibility
Geo-location data sets for over 230 countries!
Data Sources
11
Coverage: World
Data types: Population, Religion, Languages, etc
www.cia.gov/library/publications/the-world-factbook/geos/print_[aa-zz].html
Coverage: 13 languages (hunspell)
Data Sources
12
Coverage: US
Data Types: Sports teams, colleges
Coverage: World
Data Types: Landmarks and archeological sites
Coverage: World
Data Types: Religious texts
Data Sources
13
Coverage: World
Data Types: Roads, Cities, Counties
Coverage: US
Data Types: Popular first names, last names
Coverage: US
Data Types: Area Codes, Zip Codes
How to get Wordsmith
14
❯ git clone https://github.com/skahwah/wordsmith.git
❯ cd wordsmith
❯ bundle install # (optional for CeWL integration)
❯ ruby wordsmith.rb
wordsmith v2.0.7
Written by: Sanjiv "Trashcan Head" Kawa & Tom "Pain Train" Porter
Twitter: @hackerjiv & @porterhau5
[*] Hello new wordsmither!
[*] This script will remove the data/ directory in the current working
directory. Enter 'y' to continue: y
[*] Just need to unpack some files (Running: tar -xf data.tar.xz)
[*] Unpack completed!
[*] CeWL found: /usr/bin/cewl
Files
15
❯ ls -l
-rw-r--r-- 1 user staff 3159 Oct 1 22:57 CHANGELOG.md
drwxr-xr-x 2 user staff 4096 Oct 1 22:57 data
-rw-r--r-- 1 user staff 50602888 Oct 1 22:57 data.tar.xz
-rw-r--r-- 1 user staff 116 Oct 1 22:57 Gemfile
-rw-r--r-- 1 user staff 1393 Oct 1 22:57 LICENSE
-rw-r--r-- 1 user staff 7514 Oct 1 22:57 README.md
-rwxr-xr-x 1 user staff 31081 Oct 1 22:57 wordsmith.rb
• View README first, or check out -E option (examples)
• wordsmith.rb: primary ruby script
• data.tar.xz (~50 MB): compressed archive of data
• data/ (~250 MB): data arranged in hierarchy
Boundaries & Attributes
16
Boundaries (-I <input>)
• Areas of the world to get words for
• 249 countries and territories
• States/Provinces
• Cities
• Custom regions
Attributes (ex: -r -l)
• Types of words to grab:
– Cities
– Colleges
– Landmarks
– Languages
– Names
– Roads
– Religions
– and more…
❯ ruby wordsmith.rb -I usa -r -l
Structure
17
❯ ls data/
abw afg ago aia ala alb and are arg arm ... wlf wsm yem zaf zmb zwe
ISO ALPHA-3 Country Codes
❯ ls data/usa
ak al ar az ca cia.txt co ct dc ... tx usa.yaml ut va vt wa wi wv wy
States, Provinces, Counties, Municipalities
❯ ls data/usa/nc
areacodes.txt charlotte cities.txt colleges.txt counties.txt ...
Cities, Counties
❯ ls data/usa/nc/charlotte
sports.txt
Attributes (sports, colleges, roads, etc.) are .txt files
Boundaries and Input
18
❯ ruby wordsmith.rb -I usa [options]
❯ ruby wordsmith.rb -I usa-nc [options]
❯ ruby wordsmith.rb -I usa-nc-charlotte [options]
❯ ruby wordsmith.rb -I usa,can [options]
❯ ruby wordsmith.rb -I usa-dc,usa-md,usa-va [options]
-I for specifying input boundaries
Can supply one or many boundaries
❯ ruby wordsmith.rb -I 10 [options]
Providing a number (ex: 10) will select N most populous countries
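Given the data/ layout shown on the Structure slide, a boundary string appears to map directly onto a directory path. The sketch below illustrates that idea only. `boundary_to_path` and `parse_input` are hypothetical names, and the real wordsmith.rb may resolve input differently. Numeric input and region aliases are deliberately not handled here.

```python
from pathlib import PurePosixPath

# Assumed location of the unpacked data directory.
DATA_ROOT = PurePosixPath("data")


def boundary_to_path(boundary: str) -> PurePosixPath:
    """Map a boundary such as 'usa-nc-charlotte' to data/usa/nc/charlotte."""
    parts = [p for p in boundary.strip().lower().split("-") if p]
    if not parts:
        raise ValueError("empty boundary")
    return DATA_ROOT.joinpath(*parts)


def parse_input(value: str) -> list:
    """Split a comma-delimited -I value into one path per boundary."""
    return [boundary_to_path(b) for b in value.split(",") if b.strip()]


if __name__ == "__main__":
    for path in parse_input("usa-dc,usa-md,usa-va"):
        print(path)
```

Splitting on the hyphen only works because directory names use underscores for multi-word places (for example east_sussex), as the child-node listing suggests.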
Regions
19
❯ ruby wordsmith.rb -I europe [options]
❯ grep europe data/regions.csv
europe,"Continent of Europe",ala alb and arm aut aze bel bgr bih blr che
cyp cze deu dnk esp est fin fra fro gbr geo ggy gib grc hrv hun imn irl
isl ita jey kaz lie ltu lux lva mco mda mkd mlt mne nld nor pol prt rou
rus sjm smr srb svk svn swe tur ukr vat
regions.csv contains custom grouping of boundaries
Can see regions with -R option:
❯ ruby wordsmith.rb -R
Alias: newengland
Description: US - New England
Members: usa-ct usa-me usa-ma usa-nh usa-ri usa-vt
Alias: mideast
Description: US - Mideast
Members: usa-de usa-dc usa-md usa-nj usa-ny usa-pa
Alias: greatlakes
Description: US - Great Lakes
Members: usa-il usa-in usa-mi usa-oh usa-wi
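The regions.csv row shown for europe has three fields: an alias, a quoted description, and a space-separated member list. A minimal sketch of reading such a file follows. The sample row for newengland is reconstructed from the -R output above and is an assumption about how that row is stored. `load_regions` and `expand` are hypothetical helpers.

```python
import csv
import io

# Assumed CSV form of the 'newengland' region, rebuilt from the -R output.
SAMPLE = 'newengland,"US - New England",usa-ct usa-me usa-ma usa-nh usa-ri usa-vt\n'


def load_regions(handle) -> dict:
    """Parse alias,description,members rows into a dictionary keyed by alias."""
    regions = {}
    for row in csv.reader(handle):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        alias, description, members = row[0], row[1], row[2]
        regions[alias] = {"description": description, "members": members.split()}
    return regions


def expand(alias: str, regions: dict) -> list:
    """Return a region's member boundaries, or the input itself if not a region."""
    return regions[alias]["members"] if alias in regions else [alias]


if __name__ == "__main__":
    regions = load_regions(io.StringIO(SAMPLE))
    print(expand("newengland", regions))
    print(expand("usa-nc", regions))
```

Because a region is just a list of ordinary boundaries, adding a custom grouping should only require a new row in the CSV.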
Attributes
20
❯ ruby wordsmith.rb -I europe [options]
❯ ruby wordsmith.rb -h
Main Arguments:
-I, --input <input> Comma-delimited list of inputs
Input Options:
-a, --all Grab all options
-b, --other Grab other miscellaneous attributes
-e, --cia Grab demographics compiled by the CIA
-c, --cities Grab all city names
-f, --colleges Grab all college sports
-l, --landmarks Grab all landmarks
-v, --language Grab the most popular language(s)
-N, --all-names Grab all first names and last names
-G, --first-names Grab all first names
-L, --last-names Grab all last names
-F, --female-fnames Grab all female first names
-M, --male-fnames Grab all male first names
-p, --phone Grab all area codes
-r, --roads Grab all road names
-g, --religion Grab the most popular relgious text(s)
-t, --teams Grab all major sports teams
-u, --counties Grab all counties
-z, --zip Grab all zip codes
Attribute Examples
21
❯ ruby wordsmith.rb -I usa-ca -z
90001
90002
90003
90004
...
Grab all zip codes for California
❯ ruby wordsmith.rb -I gbr-eng -r -c -l
Ab Kettleby
Abberley
Abberton
Abbess Roding
...
Grab all roads, cities, and landmarks for England, GBR
❯ ruby wordsmith.rb -I asia -a
Abas
Abatan
Abbeg
Abejao
...
Grab all attributes for Asia
Child Nodes
22
❯ ruby wordsmith.rb -I gbr -C
Format:
boundary-name : attribute1 attribute2 attribute3 etc.
gbr : cities counties landmarks roads cia
|-- gbr-sco : cities counties roads
|-- gbr-wal : cities counties roads
|-- gbr-eng : cities counties roads
| |-- gbr-eng-su : cities counties roads
| |-- gbr-eng-ch : cities counties roads
| |-- gbr-eng-ex : cities roads
| |-- gbr-eng-nt : cities counties roads
| |-- gbr-eng-sk : cities roads
| |-- gbr-eng-ca : cities counties roads
| |-- gbr-eng-bu : cities counties roads
| |-- gbr-eng-sx
| | |-- gbr-eng-sx-east_sussex : cities counties roads
| | |-- gbr-eng-sx-west_sussex : cities counties roads
...
See the child nodes (-C) and their attributes of a given boundary
Country Metadata
23
❯ ls -l data/jpn/
-rw-r--r-- 1 user staff 32002 Aug 30 19:16 cia.txt
-rw-r--r-- 1 user staff 13184 Sep 9 2016 cities.txt
-rw-r--r-- 1 user staff 5608 Sep 9 2016 counties.txt
-rw-r--r-- 1 user staff 107 Aug 30 19:36 jpn.yaml
-rw-r--r-- 1 user staff 113672 Oct 1 21:10 landmarks.txt
-rw-r--r-- 1 user staff 871994 Sep 9 2016 roads.txt
❯ cat data/jpn/jpn.yaml
config:
population: 126,702,133
language_1: Japanese
religion_1: Shintoism
religion_2: Buddhism
From The World Factbook:
• Population: selects the most populous countries (ex: -I 25)
• Official languages: used by -v, --language
• Most popular religions: used by -g, --religion
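Selecting the N most populous countries from per-country metadata can be sketched as below. Only the jpn population string comes from the jpn.yaml shown on this slide. The other two entries use placeholder codes and made-up numbers purely so the example has something to sort. `parse_population` and `most_populous` are hypothetical names, not Wordsmith functions.

```python
def parse_population(text: str) -> int:
    """Convert a population string such as '126,702,133' to an integer."""
    return int(text.replace(",", "").strip())


def most_populous(populations: dict, n: int) -> list:
    """Return the n country codes with the largest populations, largest first."""
    ranked = sorted(
        populations,
        key=lambda code: parse_population(populations[code]),
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    populations = {
        "jpn": "126,702,133",  # value from the jpn.yaml shown on the slide
        "xxa": "1,000",        # placeholder entry, not real data
        "xxb": "2,000,000",    # placeholder entry, not real data
    }
    print(most_populous(populations, 2))
```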
Religions
24
❯ wc -l data/religion/*
28168 douay-rheims-parsed.txt
97682 king-james-bible-book-verse.txt
20190 king-james-bible-parsed.txt
42876 niv-bible-parsed-spanish.txt
34202 niv-bible-parsed.txt
7872 quran-parsed-eng.txt
❯ cat king-james-bible-book-verse.txt
The First Book of Moses: Called Genesis
Genesis1:1
1:1Genesis
John3:16
3:16John
...
❯ cat king-james-bible-parsed.txt
...
Jesuite
Jesus
Jether
Jetheth
Jethro
...
(-g, --religion)
Identified the most common religions
• KJV Bible
• NIV Bible
• Douay Rheims
• Quran
~200 countries are covered
Languages
25
❯ head -n 5 language-frequency.txt
83:English
38:French
29:Spanish
26:Arabic
11:Russian
❯ wc -l data/languages/*.txt
457097 arabic.txt
47866 bahasa.txt
110750 bengali.txt
115485 cedict.txt
466544 english.txt
72038 french.txt
585844 german.txt
338534 hebrew.txt
15990 hindi.txt
95152 italian.txt
47866 malay.txt
340235 portuguese.txt
379324 russian.txt
798915 spanish.txt
371169 turkish.txt
(-v, --language)
Identified the most common languages
~195 countries are covered
Modular Design
26
❯ ls data/usa/mn/
areacodes.txt colleges.txt fnames.txt landmarks.txt sports.txt
cities.txt counties.txt lakes.txt roads.txt zipcodes.txt
❯ cat data/usa/mn/lakes.txt
Aaron
Abbey
Acorn
Adelman's Pond
...
❯ ruby wordsmith.rb -I usa-mn -b
Aaron
Abbey
Acorn
Adelman's Pond
...
Modular design:
- Easily extensible
- Introduce your own .txt files (grab with -b option)
- Contribute and help build the project
Output Options
27
❯ ruby wordsmith.rb -h
<Input options snipped>
Output Options:
-o, --output FILE The filename for writing output
-q, --quiet Don't show words, use with -o option
-k, --min-length LEN Minimum length of word to include
-n, --max-length LEN Maximum length of word to include
-D, --complexity Words meet Windows default complexity
-j, --lowercase Convert all words to lowercase
-w, --specials Add words with special chars removed
-x, --spaces Add words with spaces removed
-y, --split Split words by space and add
-m, --mangle Add all permutations (-w, -x, -y)
-P, --prepend-phones Prepend state area codes to each word
-A, --append-phones Append state area codes to each word
-X, --prepend-zips Prepend zip codes to each word
-Z, --append-zips Append zip codes to each word
-W, --prepend-wordlist FILE Prepend words in FILE to each word
-Y, --append-wordlist FILE Append words in FILE to each word
Tweaking Output
28
❯ ruby wordsmith.rb -I usa-dc -r
Pennsylvania Ave.
Name of a road generated for D.C.
Mangle (-m): split words, remove specials, remove spaces
❯ ruby wordsmith.rb -I usa-dc -r -m
Pennsylvania Ave.
Pennsylvania Ave
Pennsylvania
Ave.
Ave
PennsylvaniaAve.
PennsylvaniaAve
❯ ruby wordsmith.rb -I usa-dc -r -m -k 8
Pennsylvania Ave.
Pennsylvania Ave
Pennsylvania
PennsylvaniaAve.
PennsylvaniaAve
Min Length (-k): specify minimum char length
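The mangling and minimum-length behaviour shown on this slide can be approximated with the sketch below. It is not taken from wordsmith.rb. `mangle`, `strip_specials` and `min_length` are hypothetical names, and treating "specials" as anything outside ASCII letters, digits and spaces is an assumption that would be too aggressive for non-English word lists.

```python
import re


def strip_specials(text: str) -> str:
    """Drop every character that is not an ASCII letter, digit, or space."""
    return re.sub(r"[^A-Za-z0-9 ]", "", text)


def mangle(word: str) -> list:
    """Return the word plus its specials-removed, split, and space-removed forms."""
    variants = []

    def add(candidate: str) -> None:
        # Keep first-seen order and skip empties and duplicates.
        if candidate and candidate not in variants:
            variants.append(candidate)

    add(word)
    add(strip_specials(word))                   # like -w
    for part in word.split():                   # like -y
        add(part)
        add(strip_specials(part))
    add(word.replace(" ", ""))                  # like -x
    add(strip_specials(word).replace(" ", ""))
    return variants


def min_length(words: list, length: int) -> list:
    """Keep only words with at least `length` characters (like -k)."""
    return [w for w in words if len(w) >= length]


if __name__ == "__main__":
    for w in min_length(mangle("Pennsylvania Ave."), 8):
        print(w)
```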
Tweaking Output
29
❯ ruby wordsmith.rb -I usa-dc -r -m -D
Pennsylvania Ave.
Pennsylvania Ave
PennsylvaniaAve.
Windows default complexity (-D): at least 8 characters, drawn from 3 of the 4 character classes
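A filter matching the output above might look like the following sketch. The four classes are taken to be uppercase, lowercase, digits and everything else. Counting a space as a special character is an inference from the slide's output, where "Pennsylvania Ave" passes the filter, and is not a statement about how wordsmith.rb or Windows implements the check. `char_classes` and `meets_complexity` are hypothetical names.

```python
def char_classes(word: str) -> int:
    """Count how many of the four character classes appear in the word."""
    return sum([
        any(c.isupper() for c in word),      # uppercase letters
        any(c.islower() for c in word),      # lowercase letters
        any(c.isdigit() for c in word),      # digits
        any(not c.isalnum() for c in word),  # everything else, including spaces
    ])


def meets_complexity(word: str, min_len: int = 8, min_classes: int = 3) -> bool:
    """True if the word is long enough and mixes enough character classes."""
    return len(word) >= min_len and char_classes(word) >= min_classes


if __name__ == "__main__":
    candidates = [
        "Pennsylvania Ave.", "Pennsylvania Ave", "Pennsylvania",
        "Ave.", "Ave", "PennsylvaniaAve.", "PennsylvaniaAve",
    ]
    for w in candidates:
        if meets_complexity(w):
            print(w)
```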
❯ ruby wordsmith.rb -I usa-dc -a -q -o DC.txt
cities in ./data/usa/dc: 1
colleges in ./data/usa/dc: 24
counties in ./data/usa/dc: 1
landmarks in ./data/usa/dc: 75
fnames in ./data/usa/dc: 2646
areacodes in ./data/usa/dc: 1
roads in ./data/usa/dc: 2088
sports in ./data/usa/dc: 4
zipcodes in ./data/usa/dc: 284
religions: 123676
languages: 1107300
[*] 1221226 words written to: /opt/wordsmith/DC.txt
Quiet output (-q), write results to file (-o DC.txt)
Prepending & Appending
30
• Prepend or Append:
• Zip codes (-X,-Z)
• Area codes (-P,-A)
• User-supplied wordlist (-W,-Y)
https://arstechnica.com/tech-policy/2016/08/if-youre-an-alleged-drug-dealer-dont-use-asshole209-as-a-password/
Prepending & Appending
31
❯ cat years.txt
17
17!
2017
2017!
years.txt: file I created with words I want to append
❯ ruby wordsmith.rb -I usa-dc -f -m -Y years.txt
...
Gallaudet
Gallaudet17
Gallaudet17!
Gallaudet2017
Gallaudet2017!
Hoyas
Hoyas17
Hoyas17!
Hoyas2017
Hoyas2017!
...
Grab colleges (-f), mangle (-m), then append custom wordlist (-Y)
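The append step amounts to emitting each base word followed by that word joined with every suffix. A minimal sketch, assuming the output order seen above (base word first, then suffixes in file order), is shown here. `append_wordlist` and `prepend_wordlist` are hypothetical helpers that mirror -Y and -W in spirit only.

```python
def append_wordlist(words: list, suffixes: list) -> list:
    """Emit each word, then the word with every suffix appended."""
    out = []
    for word in words:
        out.append(word)
        out.extend(word + suffix for suffix in suffixes)
    return out


def prepend_wordlist(words: list, prefixes: list) -> list:
    """Emit each word, then the word with every prefix prepended."""
    out = []
    for word in words:
        out.append(word)
        out.extend(prefix + word for prefix in prefixes)
    return out


if __name__ == "__main__":
    years = ["17", "17!", "2017", "2017!"]  # contents of years.txt on the slide
    for candidate in append_wordlist(["Gallaudet", "Hoyas"], years):
        print(candidate)
```

The output grows as words × (suffixes + 1), so a short, deliberate suffix file keeps the list manageable.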
Names
32
❯ cat data/usa/fnames.txt
James
John
Robert
Michael
Mary
...
❯ cat data/usa/lnames.txt
Smith
Johnson
Williams
Brown
Jones
...
• Most common baby names in each state since 1910
• -G: most common first names
• -L: most common last names
• -N: all names
Username Generation
33
❯ ruby wordsmith.rb -h
<other options snipped>
Username Generation Options:
--filn FirstInitialLastName (bsmith)
--fnln FirstNameLastName (bobsmith)
--fnli FirstNameLastInitial (bobs)
--lnfi LastNameFirstInitial (smithb)
--lnfn LastNameFirstName (smithbob)
--fidln FirstInitial.LastName (b.smith)
--fndln FirstName.LastName (bob.smith)
--truncate LEN Truncate username at LEN number of chars (bobsmi)
--max-users LEN Max number of usernames to generate
--name-depth LEN Num of first/last names to iterate over
(default:100, 0 will get all)
• Generate different username formats
• Use --max-users and --name-depth to handle speed & volume
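The seven formats in the help text are simple string compositions of a first and last name. The sketch below reproduces the help text's own examples for "bob" and "smith". The `FORMATS` table and `username` function are hypothetical, and case handling is left to the caller because the slides do not say how wordsmith.rb treats case.

```python
# Format names follow the option names in the help text; the lambdas are a sketch.
FORMATS = {
    "filn": lambda first, last: first[0] + last,         # bsmith
    "fnln": lambda first, last: first + last,            # bobsmith
    "fnli": lambda first, last: first + last[0],         # bobs
    "lnfi": lambda first, last: last + first[0],         # smithb
    "lnfn": lambda first, last: last + first,            # smithbob
    "fidln": lambda first, last: first[0] + "." + last,  # b.smith
    "fndln": lambda first, last: first + "." + last,     # bob.smith
}


def username(fmt: str, first: str, last: str, truncate=None) -> str:
    """Build one username; optionally cut it to `truncate` characters."""
    name = FORMATS[fmt](first, last)
    return name[:truncate] if truncate else name


if __name__ == "__main__":
    for fmt in FORMATS:
        print(fmt, username(fmt, "bob", "smith"))
    print("truncated", username("fnln", "bob", "smith", truncate=6))
```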
Username Generation
34
❯ ruby wordsmith.rb -I usa --fnln
JamesSmith
JamesJohnson
JamesWilliams
JamesBrown
JamesJones
JamesGarcia
JamesMiller
...
First name Last Name
❯ ruby wordsmith.rb -I usa --fndln
James.Smith
James.Johnson
James.Williams
James.Brown
James.Jones
James.Garcia
James.Miller
...
First name (dot) Last Name
Username Generation
35
❯ ruby wordsmith.rb -I usa --filn --truncate 8
...
aDavis
aRodrigu
aMartine
aHernand
aGonzale
aWilson
aAnderso
...
Truncate down to 8 characters
❯ ruby wordsmith.rb -I usa --lnfn -q
usernames in ./data/usa: 10000
❯ ruby wordsmith.rb -I usa --lnfn -q --name-depth 250
usernames in ./data/usa: 62500
❯ ruby wordsmith.rb -I usa --lnfn -q --name-depth 1000
usernames in ./data/usa: 1000000
Adjust --name-depth to generate more usernames
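The counts above are consistent with pairing the top N first names with the top N last names: 100 × 100 = 10,000, 250 × 250 = 62,500 and 1000 × 1000 = 1,000,000. The sketch below illustrates that reading. It is an inference from the numbers, not a description of wordsmith.rb internals, and `generate` is a hypothetical function using the LastNameFirstName format.

```python
from itertools import islice, product


def generate(first_names, last_names, name_depth=100, max_users=None):
    """Pair the top `name_depth` first and last names (0 means use them all)."""
    firsts = first_names if name_depth == 0 else first_names[:name_depth]
    lasts = last_names if name_depth == 0 else last_names[:name_depth]
    # First names form the outer loop, matching the ordering in the fnln output.
    pairs = (last + first for first, last in product(firsts, lasts))
    # islice with a stop of None yields everything; otherwise it caps the count.
    return list(islice(pairs, max_users))


if __name__ == "__main__":
    # Synthetic stand-ins for fnames.txt and lnames.txt.
    firsts = [f"First{i}" for i in range(1000)]
    lasts = [f"Last{i}" for i in range(1000)]
    for depth in (100, 250, 1000):
        print(depth, len(generate(firsts, lasts, name_depth=depth)))
```

Because the count grows with the square of the depth, capping the result with a max-users limit is the cheaper control when only a sample is needed.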
Ireland – Interesting Password Recoveries
37
• Cork1234
• Carlow123
• Dublin1234
• Seapoint1916
• Artane2016
• Templeroan2009
• Donegal56
• ParkLodge30!
• Portishead01
• Tipperary2
• Larkfield18
• Wolseley2014
• Farriers40
• 5RotheAbbey
Multinational Organization Results
38
• Organization has offices in USA, Australia and Canada
• Unable to disclose total number of hashes
Wordlist                               Hashcat run time   Number of passwords recovered
Top 10k (10k words)                    4 sec              256
Rockyou (14.4m words)                  30 mins            476
AUS, CAN, USA Wordlist (7.3m words)    13 mins            241
ruby wordsmith.rb -I aus,can,usa -a -j -q -m -o aus-can-usa-all-lowercase-q-m.txt
Multinational – Interesting Password Recoveries
39
Australia:
• Bayswater2017
• Primavera001
• Padstow123!
• Queenslander2015
• Razorback1965
• Parramatta16
• Sydney201%
Canada:
• !Matthew2222
• Canada1984
• Vancouver186
USA:
• Bernie424!
• ColoradoSprings3!
• ChicagoCubs2016
• BostonCeltics29
• Anakin2005s
• Denean1973
• Cubbie221!
• Metrocenter11
Parsers
40
• Collecting and collating this data required the development of some parsers
❯ git clone https://github.com/skahwah/wordsmith_parsers.git
❯ ls
LICENSE cia-parsers landmark-parser osm-parsers
README.md census-parsers names-parsers religion-parsers
https://github.com/skahwah/wordsmith_parsers
Future Work
41
• Data!
– Diving deeper into OpenStreetMap
– Popular song lyrics (h/t @pfizzell)
– Got ideas? We’d love to hear them!
• Skills
– GIS
– Multiple language speakers
– Obscure website hunting & scraping
• Design
– Lookups based on coordinates
Thank you!
42
Sanjiv Kawa
@hackerjiv
SR. PENETRATION TESTER
PSC / NCC GROUP
Tom Porter
@porterhau5
SR. SECURITY CONSULTANT
FUSIONX RED TEAM
https://github.com/skahwah/wordsmith

Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 

The world is y0ur$: Geolocation-based wordlist generation with wordsmith

  • 1. THE WORLD IS Y0UR$: GEOLOCATION-BASED WORDLIST GENERATION WITH WORDSMITH
      SANJIV KAWA | TOM PORTER
      @hackerjiv | @porterhau5
  • 2. Formalities
      Sanjiv Kawa, @hackerjiv
      Sr. Penetration Tester, PSC / NCC Group
      • Roots in dev and IT
      • Penetration testing
      • Binary analysis and exploitation
  • 3. Formalities
      Tom Porter, @porterhau5
      Sr. Security Consultant, FusionX Red Team
      • Flow data analytics
      • Penetration testing
      • Red teaming
      • BloodHound extensions
  • 4. What is Wordsmith?
      • Custom wordlist generation
      • Crack hashes / password attacks
      • Tailored for your target
      • Geo-location data
      • Modular and extensible
      • Username generation
  • 5. Dictionary Attack
      1. Guess 2. Encrypt (hash the guess) 3. Compare
      Guess: apple, banana, cherry, …
      Encrypt: $hash <- encrypt(apple)
      $hash : 5ebe7dfa074da8ee8aef1faa2bbde876
      Compare: search for $hash in the obtained hash list:
        af5432a79b941528fa7fac9e7e391651
        5ebe7dfa074da8ee8aef1faa2bbde876
        8846f7eaee8fb117ad06bdd830b7586c
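The guess, hash, compare loop from the dictionary attack slide can be sketched in a few lines of Python. This is only an illustration: the slide does not name the hash algorithm, so MD5 from the standard library is used here as a stand-in, and the function names are hypothetical.

```python
import hashlib


def hash_guess(word: str) -> str:
    # Assumption: MD5 is a stand-in; the slide does not say which algorithm is used.
    return hashlib.md5(word.encode("utf-8")).hexdigest()


def dictionary_attack(wordlist, target_hashes):
    """Hash every guess and report the ones that appear in the obtained hash list."""
    targets = set(target_hashes)
    recovered = {}
    for word in wordlist:               # 1. Guess
        digest = hash_guess(word)       # 2. Hash the guess
        if digest in targets:           # 3. Compare
            recovered[digest] = word
    return recovered


wordlist = ["apple", "banana", "cherry"]
# Build a small "obtained" hash list: one hash is crackable, one is not.
obtained = [hash_guess("banana"), hash_guess("not-in-the-wordlist")]
recovered = dictionary_attack(wordlist, obtained)
print(recovered)
```

The attack only recovers passwords that are present in the wordlist, which is why a wordlist tailored to the target matters.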
  • 7. Wordsmith v1: Geo-location Data Collected
      • Major league sports teams
      • Colleges and universities
      • Common names
      • Area codes
      • Zip codes
      • Streets and roads
      • Landmarks
      • Cities, towns, etc.
  • 8. Wordsmith v1: Additional Features
      • CeWL integration
      • Basic mangling (whitespace, specials, split on space)
      • Specify minimum character length
      • To lowercase [a-z]
  • 9. Wordsmith v1: Things we learned
      Feedback from the community was incredible. Thank you!
      Top three requests:
      1. More countries need to be available (v1 was US only)
      2. Needs to be a way to introduce more/your own data
      3. Limited to the English language
  • 10. Wordsmith v2
      • New CLI design
      • Multi-language (13 so far!)
      • Introduced religions
      • Generate usernames
      • Modular framework allows for user contribution and extensibility
      • Geo-location data sets for over 230 countries!
  • 11. Data Sources
      Coverage: World. Data types: Population, Religion, Languages, etc.
      www.cia.gov/library/publications/the-world-factbook/geos/print_[aa-zz].html
      Coverage: 13 languages (hunspell)
  • 12. Data Sources
      Coverage: US. Data types: Sports teams, colleges
      Coverage: World. Data types: Landmarks and archeological sites
      Coverage: World. Data types: Religious texts
  • 13. Data Sources
      Coverage: World. Data types: Roads, Cities, Counties
      Coverage: US. Data types: Popular first names, last names
      Coverage: US. Data types: Area Codes, Zip Codes
  • 14. How to get Wordsmith
      ❯ git clone https://github.com/skahwah/wordsmith.git
      ❯ cd wordsmith
      ❯ bundle install # (optional for CeWL integration)
      ❯ ruby wordsmith.rb
      wordsmith v2.0.7
      Written by: Sanjiv "Trashcan Head" Kawa & Tom "Pain Train" Porter
      Twitter: @hackerjiv & @porterhau5
      [*] Hello new wordsmither!
      [*] This script will remove the data/ directory in the current working directory. Enter 'y' to continue: y
      [*] Just need to unpack some files (Running: tar -xf data.tar.xz)
      [*] Unpack completed!
      [*] CeWL found: /usr/bin/cewl
  • 15. Files
      ❯ ls -l
      -rw-r--r-- 1 user staff     3159 Oct 1 22:57 CHANGELOG.md
      drwxr-xr-x 2 user staff     4096 Oct 1 22:57 data
      -rw-r--r-- 1 user staff 50602888 Oct 1 22:57 data.tar.xz
      -rw-r--r-- 1 user staff      116 Oct 1 22:57 Gemfile
      -rw-r--r-- 1 user staff     1393 Oct 1 22:57 LICENSE
      -rw-r--r-- 1 user staff     7514 Oct 1 22:57 README.md
      -rwxr-xr-x 1 user staff    31081 Oct 1 22:57 wordsmith.rb
      • View README first, or check out the -E option (examples)
      • wordsmith.rb: primary ruby script
      • data.tar.xz (~50 MB): compressed archive of data
      • data/ (~250 MB): data arranged in hierarchy
  • 16. Boundaries & Attributes
      Boundaries (-I <input>)
      • Areas of the world to get words for
      • 249 countries and territories
      • States/Provinces
      • Cities
      • Custom regions
      Attributes (ex: -r -l)
      • Types of words to grab:
      • Cities
      • Colleges
      • Landmarks
      • Languages
      • Names
      • Roads
      • Religions
      • and more…
      ❯ ruby wordsmith.rb -I usa -r -l
  • 17. Structure
      ❯ ls data/
      abw afg ago aia ala alb and are arg arm ... wlf wsm yem zaf zmb zwe
      ISO ALPHA-3 Country Codes
      ❯ ls data/usa
      ak al ar az ca cia.txt co ct dc ... tx usa.yaml ut va vt wa wi wv wy
      States, Provinces, Counties, Municipalities
      ❯ ls data/usa/nc
      areacodes.txt charlotte cities.txt colleges.txt counties.txt ...
      Cities, Counties
      ❯ ls data/usa/nc/charlotte
      sports.txt
      Attributes (sports, colleges, roads, etc.) are .txt files
  • 18. Boundaries and Input
      ❯ ruby wordsmith.rb -I usa [options]
      ❯ ruby wordsmith.rb -I usa-nc [options]
      ❯ ruby wordsmith.rb -I usa-nc-charlotte [options]
      ❯ ruby wordsmith.rb -I usa,can [options]
      ❯ ruby wordsmith.rb -I usa-dc,usa-md,usa-va [options]
      -I specifies the input boundaries
      One or many boundaries can be supplied
      ❯ ruby wordsmith.rb -I 10 [options]
      Providing a number N (ex: 10) selects the N most populous countries
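The boundary syntax on this slide lines up with the data/ directory layout shown on the Structure slide: usa-nc-charlotte appears to correspond to data/usa/nc/charlotte. A minimal Python sketch of that mapping follows, assuming a plain hyphen-to-directory translation. The function names are hypothetical, and Wordsmith's own resolution logic (written in Ruby) may differ.

```python
from pathlib import PurePosixPath


def boundary_to_path(boundary: str, data_root: str = "data") -> PurePosixPath:
    """Map a hyphen-separated boundary such as 'usa-nc-charlotte' to a data path."""
    parts = [part for part in boundary.strip().lower().split("-") if part]
    return PurePosixPath(data_root, *parts)


def parse_input(value: str):
    """Split a comma-delimited -I value into one path per boundary."""
    return [boundary_to_path(item) for item in value.split(",") if item.strip()]


paths = parse_input("usa-dc,usa-md,usa-va")
print([str(p) for p in paths])
print(boundary_to_path("usa-nc-charlotte"))
```

Region aliases and numeric inputs such as -I 10 are not handled here; they would need to be expanded into boundaries before this step.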
  • 19. Regions
      ❯ ruby wordsmith.rb -I europe [options]
      ❯ grep europe data/regions.csv
      europe,"Continent of Europe",ala alb and arm aut aze bel bgr bih blr che cyp cze deu dnk esp est fin fra fro gbr geo ggy gib grc hrv hun imn irl isl ita jey kaz lie ltu lux lva mco mda mkd mlt mne nld nor pol prt rou rus sjm smr srb svk svn swe tur ukr vat
      regions.csv contains custom groupings of boundaries
      Regions can be listed with the -R option:
      ❯ ruby wordsmith.rb -R
      Alias: newengland
      Description: US - New England
      Members: usa-ct usa-me usa-ma usa-nh usa-ri usa-vt
      Alias: mideast
      Description: US - Mideast
      Members: usa-de usa-dc usa-md usa-nj usa-ny usa-pa
      Alias: greatlakes
      Description: US - Great Lakes
      Members: usa-il usa-in usa-mi usa-oh usa-wi
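The grep output on this slide suggests each row of regions.csv holds an alias, a quoted description, and a space-separated member list. The Python sketch below parses rows in that shape and expands an alias into its members. The sample rows are abbreviated, the newengland row is an assumption based on the -R output, and the function names are hypothetical.

```python
import csv
import io

# Abbreviated sample in the layout shown on the slide:
# alias, quoted description, space-separated members.
# Assumption: the newengland row is reconstructed from the -R output.
SAMPLE_REGIONS_CSV = (
    'europe,"Continent of Europe",ala alb and\n'
    'newengland,"US - New England",usa-ct usa-me usa-ma usa-nh usa-ri usa-vt\n'
)


def load_regions(handle):
    """Return {alias: {"description": ..., "members": [...]}} from CSV rows."""
    regions = {}
    for row in csv.reader(handle):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        alias, description, members = row
        regions[alias] = {"description": description, "members": members.split()}
    return regions


def expand_inputs(value, regions):
    """Replace region aliases in a comma-delimited input with their members."""
    expanded = []
    for item in value.split(","):
        item = item.strip()
        if item in regions:
            expanded.extend(regions[item]["members"])
        elif item:
            expanded.append(item)
    return expanded


regions = load_regions(io.StringIO(SAMPLE_REGIONS_CSV))
print(expand_inputs("newengland,can", regions))
```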
  • 20. Attributes
      ❯ ruby wordsmith.rb -I europe [options]
      ❯ ruby wordsmith.rb -h
      Main Arguments:
        -I, --input <input>    Comma-delimited list of inputs
      Input Options:
        -a, --all              Grab all options
        -b, --other            Grab other miscellaneous attributes
        -e, --cia              Grab demographics compiled by the CIA
        -c, --cities           Grab all city names
        -f, --colleges         Grab all college sports
        -l, --landmarks        Grab all landmarks
        -v, --language         Grab the most popular language(s)
        -N, --all-names        Grab all first names and last names
        -G, --first-names      Grab all first names
        -L, --last-names       Grab all last names
        -F, --female-fnames    Grab all female first names
        -M, --male-fnames      Grab all male first names
        -p, --phone            Grab all area codes
        -r, --roads            Grab all road names
        -g, --religion         Grab the most popular religious text(s)
        -t, --teams            Grab all major sports teams
        -u, --counties         Grab all counties
        -z, --zip              Grab all zip codes
  • 21. Attribute Examples
      ❯ ruby wordsmith.rb -I usa-ca -z
      90001 90002 90003 90004 ...
      Grab all zip codes for California
      ❯ ruby wordsmith.rb -I gbr-eng -r -c -l
      Ab Kettleby Abberley Abberton Abbess Roding ...
      Grab all roads, cities, and landmarks for England, GBR
      ❯ ruby wordsmith.rb -I asia -a
      Abas Abatan Abbeg Abejao ...
      Grab all attributes for Asia
  • 22. Child Nodes
      ❯ ruby wordsmith.rb -I gbr -C
      Format: boundary-name : attribute1 attribute2 attribute3 etc.
      gbr : cities counties landmarks roads cia
      |-- gbr-sco : cities counties roads
      |-- gbr-wal : cities counties roads
      |-- gbr-eng : cities counties roads
      | |-- gbr-eng-su : cities counties roads
      | |-- gbr-eng-ch : cities counties roads
      | |-- gbr-eng-ex : cities roads
      | |-- gbr-eng-nt : cities counties roads
      | |-- gbr-eng-sk : cities roads
      | |-- gbr-eng-ca : cities counties roads
      | |-- gbr-eng-bu : cities counties roads
      | |-- gbr-eng-sx
      | | |-- gbr-eng-sx-east_sussex : cities counties roads
      | | |-- gbr-eng-sx-west_sussex : cities counties roads
      ...
      See the child nodes of a given boundary and their attributes (-C)
  • 23. Country Metadata
      ❯ ls -l data/jpn/
      -rw-r--r-- 1 user staff  32002 Aug 30 19:16 cia.txt
      -rw-r--r-- 1 user staff  13184 Sep 9 2016 cities.txt
      -rw-r--r-- 1 user staff   5608 Sep 9 2016 counties.txt
      -rw-r--r-- 1 user staff    107 Aug 30 19:36 jpn.yaml
      -rw-r--r-- 1 user staff 113672 Oct 1 21:10 landmarks.txt
      -rw-r--r-- 1 user staff 871994 Sep 9 2016 roads.txt
      ❯ cat data/jpn/jpn.yaml
      config:
        population: 126,702,133
        language_1: Japanese
        religion_1: Shintoism
        religion_2: Buddhism
      The World Factbook:
      • Population: most populous countries (ex: -I 25)
      • Official languages (-v, --language)
      • Most popular religions (-g, --religion)
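The population field in each country's YAML file is what makes a numeric input such as -I 25 possible. The Python sketch below ranks countries by that field, assuming the comma-grouped number format shown for Japan. Only the jpn value comes from the slide. The other two entries are made-up placeholders, the function names are hypothetical, and YAML loading is left out to keep the example standard-library only.

```python
def parse_population(text: str) -> int:
    """Convert a comma-grouped figure such as '126,702,133' to an integer."""
    return int(text.replace(",", ""))


def most_populous(metadata: dict, n: int) -> list:
    """Return the n country codes with the largest population, largest first."""
    ranked = sorted(
        metadata,
        key=lambda code: parse_population(metadata[code]["population"]),
        reverse=True,
    )
    return ranked[:n]


metadata = {
    "jpn": {"population": "126,702,133"},  # value shown on the slide
    "xxa": {"population": "1,000"},        # made-up placeholder
    "xxb": {"population": "250,000,000"},  # made-up placeholder
}
print(most_populous(metadata, 2))  # ['xxb', 'jpn']
```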
  • 24. Religions
      ❯ wc -l data/religion/*
      28168 douay-rheims-parsed.txt
      97682 king-james-bible-book-verse.txt
      20190 king-james-bible-parsed.txt
      42876 niv-bible-parsed-spanish.txt
      34202 niv-bible-parsed.txt
      7872 quran-parsed-eng.txt
      ❯ cat king-james-bible-book-verse.txt
      The First Book of Moses: Called Genesis
      Genesis1:1
      1:1Genesis
      John3:16
      3:16John
      ...
      ❯ cat king-james-bible-parsed.txt
      ...
      Jesuite
      Jesus
      Jether
      Jetheth
      Jethro
      ...
      (-g, --religion) Identified the most common religions
      • KJV Bible
      • NIV Bible
      • Douay Rheims
      • Quran
      ~ 200 countries are covered
  • 25. Languages
      ❯ head -n 5 language-frequency.txt
      83:English
      38:French
      29:Spanish
      26:Arabic
      11:Russian
      ❯ wc -l data/languages/*.txt
      457097 arabic.txt
      47866 bahasa.txt
      110750 bengali.txt
      115485 cedict.txt
      466544 english.txt
      72038 french.txt
      585844 german.txt
      338534 hebrew.txt
      15990 hindi.txt
      95152 italian.txt
      47866 malay.txt
      340235 portuguese.txt
      379324 russian.txt
      798915 spanish.txt
      371169 turkish.txt
      (-v, --language) Identified the most common languages
      ~ 195 countries are covered
  • 26. Modular Design
      ❯ ls data/usa/mn/
      areacodes.txt colleges.txt fnames.txt landmarks.txt sports.txt
      cities.txt counties.txt lakes.txt roads.txt zipcodes.txt
      ❯ cat data/usa/mn/lakes.txt
      Aaron Abbey Acorn Adelman's Pond ...
      ❯ ruby wordsmith.rb -I usa-mn -b
      Aaron Abbey Acorn Adelman's Pond ...
      Modular design:
      - Easily extensible
      - Introduce your own .txt files (grab with the -b option)
      - Contribute and help build the project
  • 27. Output Options
      ❯ ruby wordsmith.rb -h
      <Input options snipped>
      Output Options:
        -o, --output FILE            The filename for writing output
        -q, --quiet                  Don't show words, use with -o option
        -k, --min-length LEN         Minimum length of word to include
        -n, --max-length LEN         Maximum length of word to include
        -D, --complexity             Words meet Windows default complexity
        -j, --lowercase              Convert all words to lowercase
        -w, --specials               Add words with special chars removed
        -x, --spaces                 Add words with spaces removed
        -y, --split                  Split words by space and add
        -m, --mangle                 Add all permutations (-w, -x, -y)
        -P, --prepend-phones         Prepend state area codes to each word
        -A, --append-phones          Append state area codes to each word
        -X, --prepend-zips           Prepend zip codes to each word
        -Z, --append-zips            Append zip codes to each word
        -W, --prepend-wordlist FILE  Prepend words in FILE to each word
        -Y, --append-wordlist FILE   Append words in FILE to each word
  • 28. Tweaking Output
      ❯ ruby wordsmith.rb -I usa-dc -r
      Pennsylvania Ave.
      Name of a road generated for D.C.
      Mangle (-m): split words, remove specials, remove spaces
      ❯ ruby wordsmith.rb -I usa-dc -r -m
      Pennsylvania Ave.
      Pennsylvania Ave
      Pennsylvania
      Ave.
      Ave
      PennsylvaniaAve.
      PennsylvaniaAve
      Min Length (-k): specify minimum char length
      ❯ ruby wordsmith.rb -I usa-dc -r -m -k 8
      Pennsylvania Ave.
      Pennsylvania Ave
      Pennsylvania
      PennsylvaniaAve.
      PennsylvaniaAve
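The mangle behaviour on this slide (remove specials, split on spaces, remove spaces) followed by a minimum-length filter can be approximated as below. This Python sketch was written to reproduce the slide's output for "Pennsylvania Ave."; it is not the tool's Ruby implementation, and the function names are hypothetical.

```python
import re


def remove_specials(word: str) -> str:
    # Assumption: a "special" is anything other than a letter, digit, or space.
    return re.sub(r"[^A-Za-z0-9 ]", "", word)


def mangle(word: str) -> list:
    """Return the word plus its variants, without duplicates, in a stable order."""
    variants = []

    def add(candidate: str) -> None:
        if candidate and candidate not in variants:
            variants.append(candidate)

    add(word)                                        # original
    add(remove_specials(word))                       # specials removed (-w)
    for piece in word.split():                       # split on spaces (-y)
        add(piece)
        add(remove_specials(piece))
    add(word.replace(" ", ""))                       # spaces removed (-x)
    add(remove_specials(word).replace(" ", ""))      # specials and spaces removed
    return variants


def min_length(words, length: int) -> list:
    """Keep only words with at least `length` characters (-k)."""
    return [w for w in words if len(w) >= length]


mangled = mangle("Pennsylvania Ave.")
print(mangled)
print(min_length(mangled, 8))
```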
  • 29. Tweaking Output
      Windows default complexity (-D): 8 char min, 3 of 4 character classes
      ❯ ruby wordsmith.rb -I usa-dc -r -m -D
      Pennsylvania Ave.
      Pennsylvania Ave
      PennsylvaniaAve.
      Quiet output (-q), write results to file (-o DC.txt)
      ❯ ruby wordsmith.rb -I usa-dc -a -q -o DC.txt
      cities in ./data/usa/dc: 1
      colleges in ./data/usa/dc: 24
      counties in ./data/usa/dc: 1
      landmarks in ./data/usa/dc: 75
      fnames in ./data/usa/dc: 2646
      areacodes in ./data/usa/dc: 1
      roads in ./data/usa/dc: 2088
      sports in ./data/usa/dc: 4
      zipcodes in ./data/usa/dc: 284
      religions: 123676
      languages: 1107300
      [*] 1221226 words written to: /opt/wordsmith/DC.txt
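The -D filter on this slide keeps words of at least 8 characters that use 3 of the 4 character classes. A Python sketch of such a check follows. It assumes the four classes are uppercase, lowercase, digits, and everything else, with a space counting as "everything else", which is inferred from "Pennsylvania Ave" surviving the filter on the slide. The function name is hypothetical.

```python
def meets_complexity(word: str, min_length: int = 8, min_classes: int = 3) -> bool:
    """Check length and the number of character classes present in the word."""
    classes = [
        any(c.isupper() for c in word),      # uppercase letters
        any(c.islower() for c in word),      # lowercase letters
        any(c.isdigit() for c in word),      # digits
        any(not c.isalnum() for c in word),  # assumption: spaces and punctuation
    ]
    return len(word) >= min_length and sum(classes) >= min_classes


candidates = [
    "Pennsylvania Ave.",
    "Pennsylvania Ave",
    "Pennsylvania",
    "Ave.",
    "Ave",
    "PennsylvaniaAve.",
    "PennsylvaniaAve",
]
complex_words = [w for w in candidates if meets_complexity(w)]
print(complex_words)
```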
  • 30. Prepending & Appending
      Prepend or append:
      • Zip codes (-X, -Z)
      • Area codes (-P, -A)
      • User-supplied wordlist (-W, -Y)
      https://arstechnica.com/tech-policy/2016/08/if-youre-an-alleged-drug-dealer-dont-use-asshole209-as-a-password/
  • 31. Prepending & Appending
      ❯ cat years.txt
      17 17! 2017 2017!
      years.txt: a file I created with the words I want to append
      ❯ ruby wordsmith.rb -I usa-dc -f -m -Y years.txt
      ...
      Gallaudet Gallaudet17 Gallaudet17! Gallaudet2017 Gallaudet2017!
      Hoyas Hoyas17 Hoyas17! Hoyas2017 Hoyas2017!
      ...
      Grab colleges (-f), mangle (-m), then append custom wordlist (-Y)
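Appending a custom wordlist amounts to pairing every generated word with every entry in the supplied file. The Python sketch below reproduces the output shape on this slide, where the base word is kept alongside its suffixed forms. The function names are hypothetical and the prepend variant is included by analogy with the -W option.

```python
def append_wordlist(words, suffixes) -> list:
    """Emit each word, then the word followed by each suffix (like -Y)."""
    combined = []
    for word in words:
        combined.append(word)
        combined.extend(word + suffix for suffix in suffixes)
    return combined


def prepend_wordlist(words, prefixes) -> list:
    """Emit each word, then each prefix followed by the word (like -W)."""
    combined = []
    for word in words:
        combined.append(word)
        combined.extend(prefix + word for prefix in prefixes)
    return combined


years = ["17", "17!", "2017", "2017!"]
appended = append_wordlist(["Gallaudet", "Hoyas"], years)
print(appended)
```

The output grows as words × (suffixes + 1), so a short suffix file keeps the resulting wordlist manageable.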
  • 32. Names
      ❯ cat data/usa/fnames.txt
      James John Robert Michael Mary ...
      ❯ cat data/usa/lnames.txt
      Smith Johnson Williams Brown Jones ...
      • Most common baby names in each state since 1910
      • -G: most common first names
      • -L: most common last names
      • -N: all names
  • 33. Username Generation
      ❯ ruby wordsmith.rb -h
      <other options snipped>
      Username Generation Options:
        --filn              FirstInitialLastName (bsmith)
        --fnln              FirstNameLastName (bobsmith)
        --fnli              FirstNameLastInitial (bobs)
        --lnfi              LastNameFirstInitial (smithb)
        --lnfn              LastNameFirstName (smithbob)
        --fidln             FirstInitial.LastName (b.smith)
        --fndln             FirstName.LastName (bob.smith)
        --truncate LEN      Truncate username at LEN number of chars (bobsmi)
        --max-users LEN     Max number of usernames to generate
        --name-depth LEN    Num of first/last names to iterate over (default: 100, 0 will get all)
      • Generate different username formats
      • Use --max-users and --name-depth to handle speed & volume
  • 34. Username Generation
      ❯ ruby wordsmith.rb -I usa --fnln
      JamesSmith JamesJohnson JamesWilliams JamesBrown JamesJones JamesGarcia JamesMiller ...
      First name Last Name
      ❯ ruby wordsmith.rb -I usa --fndln
      James.Smith James.Johnson James.Williams James.Brown James.Jones James.Garcia James.Miller ...
      First name (dot) Last Name
  • 35. Username Generation
      ❯ ruby wordsmith.rb -I usa --filn --truncate 8
      ... aDavis aRodrigu aMartine aHernand aGonzale aWilson aAnderso ...
      Truncate down to 8 characters
      ❯ ruby wordsmith.rb -I usa --lnfn -q
      usernames in ./data/usa: 10000
      ❯ ruby wordsmith.rb -I usa --lnfn -q --name-depth 250
      usernames in ./data/usa: 62500
      ❯ ruby wordsmith.rb -I usa --lnfn -q --name-depth 1000
      usernames in ./data/usa: 1000000
      Adjust --name-depth to generate more usernames
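The counts on this slide (10000, 62500, 1000000) are consistent with pairing the top N first names with the top N last names, giving N × N usernames. The Python sketch below illustrates the username formats and the --name-depth, --truncate, and --max-users options under that assumption. It is not Wordsmith's Ruby code, the function and table names are hypothetical, and letter case is passed through unchanged.

```python
from itertools import product

# Format names follow the option names on the slides; names must be non-empty.
FORMATS = {
    "filn": lambda first, last: first[0] + last,         # bsmith
    "fnln": lambda first, last: first + last,             # bobsmith
    "fnli": lambda first, last: first + last[0],          # bobs
    "lnfi": lambda first, last: last + first[0],          # smithb
    "lnfn": lambda first, last: last + first,             # smithbob
    "fidln": lambda first, last: first[0] + "." + last,   # b.smith
    "fndln": lambda first, last: first + "." + last,      # bob.smith
}


def generate_usernames(first_names, last_names, fmt,
                       name_depth=100, truncate=None, max_users=None):
    """Combine the top `name_depth` first and last names (0 means all)."""
    firsts = first_names if name_depth == 0 else first_names[:name_depth]
    lasts = last_names if name_depth == 0 else last_names[:name_depth]
    build = FORMATS[fmt]
    usernames = []
    for first, last in product(firsts, lasts):  # last names vary fastest
        username = build(first, last)
        if truncate:
            username = username[:truncate]
        usernames.append(username)
        if max_users and len(usernames) >= max_users:
            break
    return usernames


firsts = ["James", "John", "Robert"]
lasts = ["Smith", "Johnson", "Williams"]
print(generate_usernames(firsts, lasts, "fnln", name_depth=2))
print(generate_usernames(["bob"], ["smith"], "fnln", truncate=6))
```

Because the output grows with the square of the name depth, --max-users style capping is the simplest way to bound run time.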
  • 36.
  • 37. Ireland – Interesting Password Recoveries
      • Cork1234
      • Carlow123
      • Dublin1234
      • Seapoint1916
      • Artane2016
      • Templeroan2009
      • Donegal56
      • ParkLodge30!
      • Portishead01
      • Tipperary2
      • Larkfield18
      • Wolseley2014
      • Farriers40
      • 5RotheAbbey
  • 38. Multinational Organization Results
      • Organization has offices in USA, Australia and Canada
      • Unable to disclose total number of hashes
      Wordlist                               Hashcat run time   Number of passwords recovered
      Top 10k (10k words)                    4 sec              256
      Rockyou (14.4m words)                  30 mins            476
      AUS, CAN, USA Wordlist (7.3m words)    13 mins            241
      ruby wordsmith.rb -I aus,can,usa -a -j -q -m -o aus-can-usa-all-lowercase-q-m.txt
  • 39. Multinational – Interesting Password Recoveries
      Australia:
      • Bayswater2017
      • Primavera001
      • Padstow123!
      • Queenslander2015
      • Razorback1965
      • Parramatta16
      • Sydney201%
      Canada:
      • !Matthew2222
      • Canada1984
      • Vancouver186
      USA:
      • Bernie424!
      • ColoradoSprings3!
      • ChicagoCubs2016
      • BostonCeltics29
      • Anakin2005s
      • Denean1973
      • Cubbie221!
      • Metrocenter11
  • 40. Parsers
      • Collecting and collating this data required the development of some parsers
      ❯ git clone https://github.com/skahwah/wordsmith_parsers.git
      ❯ ls
      LICENSE cia-parsers landmark-parser osm-parsers
      README.md census-parsers names-parsers religion-parsers
      https://github.com/skahwah/wordsmith_parsers
  • 41. Future Work
      • Data!
        - Diving deeper into OpenStreetMap
        - Popular song lyrics (h/t @pfizzell)
        - Got ideas? We'd love to hear them!
      • Skills
        - GIS
        - Multiple language speakers
        - Obscure website hunting & scraping
      • Design
        - Lookups based on coordinates
  • 42. Thank you!
      Sanjiv Kawa, @hackerjiv
      Sr. Penetration Tester, PSC / NCC Group
      Tom Porter, @porterhau5
      Sr. Security Consultant, FusionX Red Team
      https://github.com/skahwah/wordsmith