go.indeed.com/IndeedEngTalks
Imhotep
Large Scale Analytics and
Machine Learning at Indeed
Jeff Plaisance
Engineering Manager
I help
people
get jobs.
Indeed is a
Search Engine for Jobs
Indeed is a data driven
organization
Data driven organizations
need great tools
What does Imhotep allow you to do?
● Decision Tree Building
● Analytics
Indeed’s Analytics Philosophy
Analytics systems should be:
1. Interactive
2. Not Sampled
3. Not Approximate
Imhotep answers questions
What was the weekly average query time in the
last quarter from people doing the query
“software”?
Imhotep answers questions
What percent of jobsearch results pages are for
page 2 and beyond?
Imhotep answers questions
What are the 5 most common queries in each
country?
Total Job Searches From 2014-03-09
to 2014-03-23
?
Query
Query Location
Query Location
Impression
Document
query: “indeed software engineer”
location: “austin”
impressions: 10
clicks: 2
time: 2014-03-17T12:00:00
Shard
0 1 2 3 4
5 6 7 8 9
10 11 12 13 14
Server
2014/03/02 2014/03/09 2014/03/11
2014/03/12 2014/03/22 2014/03/24
Documents Documents Documents
Documents Documents Documents
Cluster
2014-03-02
Server A
2014-03-03
Server B
2014-03-04
Server C
Cluster
2014-03-02
Server A
2014-03-03
Server B
2014-03-04
Server C
Client
Session
Total Job Searches From 2014-03-09
to 2014-03-23
secret
Total Job Searches From 2014-03-09
to 2014-03-23 Per Day
2014-03-09 2014-03-16 2014-03-23
Metrics
● 64 bit integers
● Exactly one value per doc
● Random access by doc id
Metrics
● Time
● Clicks
● Impressions
● Revenue
● … or anything else that is a number
Groups
● Documents are placed into numbered
groups
● Every document starts in group 1
● Group 0 means “filtered out”
Groups
● Groups are stateful and scoped to a session
● Regroup operations update group for each
doc in shard
Metric Regroup
● Iterate over doc_id->metric lookup
● Set group to (value - start) / bucket_width
● Useful for making graphs (buckets on x-axis)
Figure: buckets 1 2 3 4 5 laid out from start to end
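The bucketing rule above can be sketched in a few lines. This is a minimal illustration, not Imhotep's implementation: the function name `metric_regroup`, the `+ 1` offset (so that buckets start at group 1 and group 0 stays reserved for "filtered out"), and sending out-of-range values to group 0 are all assumptions.

```python
def metric_regroup(values, groups, start, end, bucket_width):
    """Return a new group number per doc, bucketing docs by a metric value.

    values[doc_id] is the metric, groups[doc_id] is the current group.
    Assumption: buckets are numbered from 1; group 0 means "filtered out".
    """
    new_groups = []
    for value, group in zip(values, groups):
        if group == 0 or not (start <= value < end):
            new_groups.append(0)  # already filtered, or outside [start, end)
        else:
            new_groups.append((value - start) // bucket_width + 1)
    return new_groups


# Five docs with a toy "time" metric, bucketed into widths of 10.
times = [0, 5, 12, 29, 40]
print(metric_regroup(times, [1, 1, 1, 1, 1], start=0, end=30, bucket_width=10))
```

Each bucket becomes one point on the x-axis of a graph; the doc with value 40 falls outside the range and is dropped.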
Get Group Stats
● For each group, sums a metric for all docs in
that group
Bucket By Day
1. Regroup on time metric
2. Get Group Stats for count metric (always 1)
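A rough sketch of the second step, assuming the regroup has already produced one group number per document. The name `get_group_stats` and the list-based layout are illustrative choices, not Imhotep's API.

```python
def get_group_stats(groups, metric, num_groups):
    """Sum a metric over all docs in each group.

    Index 0 of the result collects filtered-out docs (group 0).
    """
    totals = [0] * (num_groups + 1)
    for group, value in zip(groups, metric):
        totals[group] += value
    return totals


# Groups as produced by a regroup on time; the count metric is always 1.
groups = [1, 1, 2, 3, 0]
count = [1] * len(groups)
print(get_group_stats(groups, count, num_groups=3))
```

With a count metric the result is simply the number of documents per bucket, which is what a searches-per-day graph plots.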
Total Job Searches From 2014-03-09
to 2014-03-23 Per Day
2014-03-09 2014-03-16 2014-03-23
Total and US Job Searches From
2014-03-09 to 2014-03-23 Per Day
2014-03-09 2014-03-16 2014-03-23
Inverted Indexes
Inverted Index
● Like index in the back of a book
● words = terms, page numbers = doc ids
● Term list is sorted
● Doc list for each term is sorted
Standard Index
doc id  query     country  impressions  clicks
0       software  Canada   10           1
1       blank     Canada   10           0
2       sales     US       5            0
3       software  US       8            1
4       blank     US       10           1
Constructing an Inverted Index
        query                   country     impressions  clicks
doc id  blank  sales  software  Canada  US  5   8   10   0   1
0                     ✔         ✔                   ✔        ✔
1       ✔                       ✔                   ✔    ✔
2              ✔                        ✔   ✔            ✔
3                     ✔                 ✔       ✔            ✔
4       ✔                               ✔           ✔        ✔
Constructing an Inverted Index
field        term      0  1  2  3  4
query        blank        ✔        ✔
             sales           ✔
             software  ✔        ✔
country      Canada    ✔  ✔
             US              ✔  ✔  ✔
impressions  5               ✔
             8                  ✔
             10        ✔  ✔        ✔
clicks       0            ✔  ✔
             1         ✔        ✔  ✔
Inverted Index
field        term      doc list
query        blank     1, 4
             sales     2
             software  0, 3
country      Canada    0, 1
             US        2, 3, 4
impressions  5         2
             8         3
             10        0, 1, 4
clicks       0         1, 2
             1         0, 3, 4
Inverted Indexes
Allow you to:
● Quickly find all documents containing
a term
● Intersect several terms to perform
boolean queries
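As an illustration of both points, here is a small sketch that builds an inverted index from the five example documents and intersects two doc lists. The dictionary layout and the function names are assumptions made for the example; a real index also keeps the term list sorted and compressed.

```python
from collections import defaultdict

# The five example documents from the Standard Index slide.
docs = [
    {"query": "software", "country": "Canada", "impressions": 10, "clicks": 1},
    {"query": "blank", "country": "Canada", "impressions": 10, "clicks": 0},
    {"query": "sales", "country": "US", "impressions": 5, "clicks": 0},
    {"query": "software", "country": "US", "impressions": 8, "clicks": 1},
    {"query": "blank", "country": "US", "impressions": 10, "clicks": 1},
]


def build_inverted_index(docs):
    """Map (field, term) to a sorted list of doc ids containing that term."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):  # ascending doc ids keep lists sorted
        for field, term in doc.items():
            index[(field, term)].append(doc_id)
    return index


def intersect(a, b):
    """Intersect two sorted doc lists with a linear two-pointer walk."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


index = build_inverted_index(docs)
# Boolean query: country:US AND query:software
print(intersect(index[("country", "US")], index[("query", "software")]))
```

Because every doc list is sorted, the intersection never needs to look at a document twice.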
Lucene
● Open source inverted index implementation
● Reasonably fast
● Widely used, well tested
Global and US Job Searches From
2014-03-09 to 2014-03-23 Per Day
2014-03-09 2014-03-16 2014-03-23
Searches in the US only
field term doc list
country Canada 0, 1
US 2, 3, 4
Searches in the US only
Query Regroup
● Regroup all docs which do not match a
boolean query to group zero
field term doc list
country Canada 0, 1
US 2, 3, 4
Term Regroup
Splits docs in a group into one of two new
groups based on presence/absence of a term
Group 1 is split into groups 2 and 3: country:US, everything else
Multiterm Regroup
Generalization of term regroup to N terms and
N+1 new groups
Group 1 is split into groups 2, 3, 4 and 5: country:US, country:CA, country:FR, everything else
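The regroup operations can be sketched on top of plain doc lists. This is an illustrative model only: the function names, the choice of which new group number each term receives, and the list-per-doc group array are assumptions, not Imhotep's actual interface. A term regroup is the N = 1 case of the multiterm version.

```python
def query_regroup(groups, matching_docs):
    """Move every doc that does not match the query to group 0."""
    matching = set(matching_docs)
    return [g if doc in matching else 0 for doc, g in enumerate(groups)]


def multiterm_regroup(groups, doc_lists, target_group=1, first_new_group=2):
    """Split target_group into N + 1 groups, one per term plus "everything else".

    doc_lists holds one doc list per term, taken from the inverted index.
    Assumption: term k gets group first_new_group + k, the rest get the last one.
    """
    else_group = first_new_group + len(doc_lists)
    new_groups = [else_group if g == target_group else g for g in groups]
    for offset, doc_list in enumerate(doc_lists):
        for doc in doc_list:
            if groups[doc] == target_group:
                new_groups[doc] = first_new_group + offset
    return new_groups


canada, us = [0, 1], [2, 3, 4]  # doc lists for the country field
everyone = [1, 1, 1, 1, 1]      # every document starts in group 1

print(query_regroup(everyone, us))            # keep US searches only
print(multiterm_regroup(everyone, [us]))      # term regroup: US vs. the rest
print(multiterm_regroup(everyone, [us, canada]))
```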
Total and US Job Searches From
2014-03-09 to 2014-03-23 Per Day
2014-03-09 2014-03-16 2014-03-23
Inverted Index Compression
Size of Organic Dataset for last 5 months
● Original: 102 TB
● Inverted: 51 TB
Inverted Index Optimizations
● Compressed data structures
○ Better use of RAM and processor cache
○ Better use of memory bandwidth
○ Increased CPU usage and time
● Micro optimizations matter!
Delta / Varint Encoding
● Doc id lists are sorted
● Delta between a doc id and the previous doc
id is sufficient
● Deltas are usually small integers
● Use fewer bits for small integers and more bits
for large integers
Delta Encoding
field term doc list
query nursing 34, 86, 247, 301, 674, 714
deltas: 34, 52, 161, 54, 373, 40
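A short sketch of delta encoding on the doc list shown above. The helper names are made up for the example, and the first delta is taken relative to 0, which matches the numbers on the slide.

```python
from itertools import accumulate


def delta_encode(doc_ids):
    """Replace each doc id with its distance from the previous doc id."""
    return [cur - prev for prev, cur in zip([0] + doc_ids[:-1], doc_ids)]


def delta_decode(deltas):
    """Recover doc ids with a running sum over the deltas."""
    return list(accumulate(deltas))


doc_list = [34, 86, 247, 301, 674, 714]
deltas = delta_encode(doc_list)
print(deltas)
print(delta_decode(deltas) == doc_list)
```

The deltas are never larger than the original ids, which is what makes a variable-length integer encoding pay off.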
Small Integer Compression
● Golomb/Rice
● Varint
● Bit Packing
● PForDelta
Varint Encoding
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1 1 0 0 1 1 0 1 1 1 0
9838
1 1 1 0 1 1 1 0
0 1 0 0 1 1 0 0
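The two output bytes above can be reproduced with a byte-at-a-time encoder. This sketch assumes the common layout the slide appears to use: 7 payload bits per byte, least significant group first, with the top bit set when more bytes follow. The function names are illustrative.

```python
def varint_encode(value):
    """Encode a non-negative integer as a little-endian base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # top bit set: more bytes follow
        else:
            out.append(low7)         # top bit clear: last byte
            return bytes(out)


def varint_decode(data, pos=0):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


encoded = varint_encode(9838)
print([format(b, "08b") for b in encoded])
print(varint_decode(encoded))
```

The decoder's loop has a data-dependent branch on every byte, which is the cost the vectorized decoder later in the talk is designed to avoid.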
Inverted Index Compression
Size of Organic Dataset for last 5 months
● Original: 102 TB
● Inverted: 51 TB
● Delta / Varint: 17 TB
Flamdex
● Two files per field (terms/docs)
● Can add fields without rebuilding index
● Faster varint decoding
● No TF or positions (or wasted time decoding
them)
Varints
Pros:
● Compression
● Can fit more of index in RAM
● Higher information throughput per byte read
from disk
Varints
Cons:
● Decodes one byte at a time
● Lots of branch mispredictions
● Not fast to decode
Vectorized Varint Decoding
01001010 11001000 01110001 01001110
10011011 01101010 10110101 00010111
01110110 10001101 10110011 11000001
pmovmskb: Extract top bit of each byte
010010100111
Lookup in 4096 entry lookup table
010010100111
Pattern of leading bits determines:
● how many varints to decode
● sizes and offsets of varints
● length of longest varint in bytes
● number of bytes to consume
010010100111
Decoding options for:
● up to twelve 1 byte varints
● six 1-2 byte varints
● four 1-3 byte varints
● two 1-5 byte varints
Vectorized Varint Decoding
● Decode six 1-2 byte varints in parallel
● Need to pad out all 1 byte varints to 2 bytes
pshufb: Intel SSSE3 instruction to shuffle
bytes
Vectorized Varint Decoding
01001010 11001000 01110001 01001110
10011011 01101010 10110101 00010111
01110110 10001101 10110011 11000001
Decode 6 varints from 9 bytes
Vectorized Varint Decoding
01001010 00000000 11001000 01110001
01001110 00000000 10011011 01101010
10110101 00010111 01110110 00000000
Pad out 1 byte ints to 2 bytes
Vectorized Varint Decoding
00000000 01001010 01110001 11001000
00000000 01001110 01101010 10011011
00010111 10110101 00000000 01110110
Reverse bytes in 2 byte varints
Vectorized Varint Decoding
00000000 01001010 01110001 01001000
00000000 01001110 01101010 00011011
00010111 00110101 00000000 01110110
Mask out the leading 1’s (continuation bits, purple on the slide)
Vectorized Varint Decoding
00000000 01001010 00111000 11001000
00000000 01001110 00110101 00011011
00001011 10110101 00000000 01110110
Shift top bytes of each varint 1 bit right
(mask/shift/or)
Vectorized Varint Decoding
00000000 01001010 00111000 11001000
00000000 01001110 00110101 00011011
00001011 10110101 00000000 01110110
● ~10 instructions
● No branches
● Less than 2 instructions per varint
Vectorized Varint Decoding
00000000 01001010 00111000 11001000
00000000 01001110 00110101 00011011
00001011 10110101 00000000 01110110
● Imhotep spends ~40% of its CPU time
decoding varints
● Vectorized decoder ~3-5x faster
○ Decompresses at 1.5 GB per second
○ ~2x overall system performance
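To make the data flow concrete, here is a scalar emulation of the six 1-2 byte case on the 12 bytes from the slides. It is only a sketch: real code would use SSE instructions (pmovmskb, pshufb) and a lookup table indexed by the mask, whereas this version branches in plain Python. The function names are hypothetical.

```python
def top_bit_mask(block):
    """Emulate pmovmskb: top bit of each byte, first byte leftmost."""
    return "".join("1" if byte & 0x80 else "0" for byte in block)


def decode_six_short_varints(block):
    """Decode six varints of 1-2 bytes each from a 12-byte block.

    Returns (values, bytes_consumed). A 1-byte varint is treated as if it
    were padded with a zero high byte, mirroring the padding step.
    """
    mask = top_bit_mask(block)
    values = []
    pos = 0
    for _ in range(6):
        if mask[pos] == "0":
            low, high = block[pos], 0
            pos += 1
        else:
            if mask[pos + 1] != "0":
                raise ValueError("varint longer than 2 bytes")
            low, high = block[pos] & 0x7F, block[pos + 1]
            pos += 2
        values.append((high << 7) | low)  # drop the continuation bit, join
    return values, pos


block = bytes([
    0b01001010, 0b11001000, 0b01110001, 0b01001110,
    0b10011011, 0b01101010, 0b10110101, 0b00010111,
    0b01110110, 0b10001101, 0b10110011, 0b11000001,
])
print(top_bit_mask(block))
print(decode_six_short_varints(block))
```

The decoded values match the final 16-bit lanes shown on the slide, and 9 of the 12 bytes are consumed.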
Top 5 Locations
Term Stats
atlanta 49
austin 14
boston 25
chicago 28
dallas 13
houston 36
new york 68
san francisco 54
Term Stats Iterator
● For each term in a field, sum metrics across
all docs containing that term
● How do we compute this across many
machines?
Merge animation: three sorted input streams
Stream 1: atlanta 16, austin 3, boston 12, dallas 5
Stream 2: atlanta 12, austin 4, chicago 19, dallas 8
Stream 3: atlanta 21, austin 7, boston 13, chicago 9
Merged output, smallest term first, summing stats for equal terms:
atlanta 49
austin 14
boston 25
chicago 28
dallas 13
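The merge animated above is a k-way merge of sorted streams in which equal terms are summed. A minimal sketch using Python's standard library follows; the function name and the tuple layout are assumptions for illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_term_stats(streams):
    """Merge term-sorted (term, stat) streams, summing stats per term."""
    merged = heapq.merge(*streams, key=itemgetter(0))
    return [
        (term, sum(stat for _, stat in group))
        for term, group in groupby(merged, key=itemgetter(0))
    ]


streams = [
    [("atlanta", 16), ("austin", 3), ("boston", 12), ("dallas", 5)],
    [("atlanta", 12), ("austin", 4), ("chicago", 19), ("dallas", 8)],
    [("atlanta", 21), ("austin", 7), ("boston", 13), ("chicago", 9)],
]
for term, total in merge_term_stats(streams):
    print(term, total)
```

Only the smallest unconsumed term of each stream is held at any time, so the merge works on streams far larger than memory.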
Term Stats 1-6
TS 1 TS 2 TS 3 TS 4 TS 5 TS 6
TS 1-6 TS 7-12 TS 13-18
Term Stats 1-18
Amdahl’s Law
● The speedup of a program using multiple
processors is limited by the time needed for
the sequential fraction of the program
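For reference, the usual statement of Amdahl's Law as a small function. The parameter names are illustrative; `parallel_fraction` is the share of the running time that can be spread across workers.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup when only parallel_fraction of the work scales with workers."""
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / workers)


# With 10% sequential work the speedup can never reach 10x.
for workers in (2, 10, 100, 1000):
    print(workers, round(amdahl_speedup(0.9, workers), 2))
```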
Amdahl’s Law
● Sequential part of FTGS is last step in
merge
● Can we distribute some part of the final
merge?
Hash Partition + Interleave
● Send all stats for each unique term to the
same thread based on a hash of the term
● Interleave merged terms
TS 1-6 TS 7-12 TS 13-18
Term Stats 1-18
Hash partition animation: each input stream is split by a hash of the term
Partition A (atlanta, boston, dallas):
Stream 1: atlanta 16, boston 12, dallas 5
Stream 2: atlanta 12, dallas 8
Stream 3: atlanta 21, boston 13
Merged: atlanta 49, boston 25, dallas 13
Partition B (austin, chicago):
Stream 1: austin 3
Stream 2: austin 4, chicago 19
Stream 3: austin 7, chicago 9
Merged: austin 14, chicago 28
Interleaved output:
atlanta 49
austin 14
boston 25
chicago 28
dallas 13
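A compact sketch of hash partition plus interleave, under several assumptions: `zlib.crc32` stands in for whatever hash the real system uses, the workers run one after another rather than on separate threads, and each worker sums with a dictionary instead of a streaming merge. The point it illustrates is that the final step only interleaves, because no term appears in more than one partition.

```python
import heapq
import zlib
from collections import defaultdict


def hash_partition_merge(streams, num_workers=2):
    """Sum stats per term in hash partitions, then interleave the results."""
    # Step 1: route every (term, stat) pair to a worker by hash of the term.
    partitions = [defaultdict(int) for _ in range(num_workers)]
    for stream in streams:
        for term, stat in stream:
            worker = zlib.crc32(term.encode("utf-8")) % num_workers
            partitions[worker][term] += stat
    # Step 2: each worker emits its merged terms in sorted order.
    worker_outputs = [sorted(partition.items()) for partition in partitions]
    # Step 3: interleave the sorted outputs; no summing is needed here.
    return list(heapq.merge(*worker_outputs))


streams = [
    [("atlanta", 16), ("austin", 3), ("boston", 12), ("dallas", 5)],
    [("atlanta", 12), ("austin", 4), ("chicago", 19), ("dallas", 8)],
    [("atlanta", 21), ("austin", 7), ("boston", 13), ("chicago", 9)],
]
print(hash_partition_merge(streams))
```

The summing work in step 2 is what moves out of the sequential tail; the remaining sequential step does one comparison per output term.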
Shard Distribution
● Lots of datasets for different event types
● Each dataset is split into one shard per
time period (hour or day)
● Each shard has 2 replicas for fault tolerance
● How do we assign shards to machines?
Shard Distribution Considerations
● Space
● Load
● Hot Spots
● Adding/Removing machines
Homogeneous vs. Heterogeneous
Systems
● Must decide how or if you will handle
heterogeneous hardware
● Cannot balance for both space and load on
heterogeneous hardware
Example: two machines, one with 3 TB and one with 1 TB
Homogeneous vs. Heterogeneous
Balanced for space (read hotspot on the 3 TB machine):
● 3 TB machine: 12 shards, 50% capacity used
● 1 TB machine: 4 shards, 50% capacity used
Homogeneous vs. Heterogeneous
Balanced for load (wasted space on the 3 TB machine):
● 3 TB machine: 8 shards, 33% capacity used
● 1 TB machine: 8 shards, 100% capacity used
Hot Spots
When accessing any subset of a dataset,
evenly spread the load across CPUs, drives,
network cards
This is hard
Hot Spots
Maybe random is good enough?
On average about 10% more data read from
the most loaded machine than the least
Two Choice Randomized Load
Balancing
● 2 replicas of each shard to choose from
● Greedily choose the machine that currently
has the least load from this client
● On average about 1% more data read from
the most loaded machine than the least
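A toy simulation of the greedy rule, for intuition only. It assumes load is counted in shards rather than bytes, that replicas were placed on two distinct random machines, and that a single client tracks its own load; the function name and parameters are hypothetical, and the printed spreads will not reproduce the percentages quoted on the slides.

```python
import random


def assign_reads(shard_replicas, num_machines, two_choice=True, rng=None):
    """Pick one replica per shard and return the resulting load per machine."""
    rng = rng or random.Random(0)
    load = [0] * num_machines
    for replicas in shard_replicas:
        if two_choice:
            # Greedy: read from whichever replica holder has less load so far.
            machine = min(replicas, key=lambda m: load[m])
        else:
            machine = rng.choice(replicas)
        load[machine] += 1
    return load


rng = random.Random(42)
shards = [tuple(rng.sample(range(10), 2)) for _ in range(1000)]
greedy = assign_reads(shards, 10)
uniform = assign_reads(shards, 10, two_choice=False, rng=rng)
print("two choice spread:", max(greedy) - min(greedy))
print("random spread:", max(uniform) - min(uniform))
```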
Rendezvous Hashing
● Assignment of a shard to machines
determined only by the machines that exist
in the cluster
● Hash all pairs of shard ID and machine ID
and pick the largest two
Rendezvous Hashing
Shard ID: organic.2014-03-02T06:00:00
H(Shard ID + m1) = 0.592624
H(Shard ID + m2) = 0.294647
H(Shard ID + m3) = 0.736681
H(Shard ID + m4) = 0.647578
H(Shard ID + m5) = 0.835598
Rendezvous Hashing
Figure: machines placed by hash on a line from 1 down to 0: m5, m3, m4, m1, m2
(the largest two, m5 and m3, hold the shard)
Rendezvous Hashing
● No coordination required - deterministic
algorithm used to determine assignment
● No centralized storage for shard to machine
assignment
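A minimal sketch of the scheme, assuming SHA-256 as the hash and a `|` separator between shard ID and machine ID; neither detail comes from the talk, and the function names are made up for the example.

```python
import hashlib


def score(shard_id, machine_id):
    """Hash a (shard, machine) pair to a number in [0, 1)."""
    digest = hashlib.sha256(f"{shard_id}|{machine_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def assign(shard_id, machines, replicas=2):
    """Return the machines with the largest scores for this shard."""
    ranked = sorted(machines, key=lambda m: score(shard_id, m), reverse=True)
    return ranked[:replicas]


machines = ["m1", "m2", "m3", "m4", "m5"]
shard = "organic.2014-03-02T06:00:00"
print(assign(shard, machines))
# Adding a machine moves a shard only if the new machine ranks in the top two.
print(assign(shard, machines + ["m6"]))
```

Any client or server that knows the machine list computes the same answer, so nothing has to be stored or agreed on centrally.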
Rendezvous Hashing
Expected max hash for a shard is
Probability that new machine will get shard
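The formulas for these two statements did not survive in this transcript. As a worked reference under stated assumptions (the hash values behave like independent uniform draws on [0, 1], and the cluster has n machines before the new one is added), the standard results are:

```latex
\mathbb{E}\left[\max_{1 \le i \le n} H_i\right]
  = \int_0^1 x \, n x^{n-1} \, dx
  = \frac{n}{n+1}

\Pr[\text{new machine has the largest of } n+1 \text{ hashes}] = \frac{1}{n+1},
\qquad
\Pr[\text{new machine is among the largest two}] = \frac{2}{n+1}
```

So with two replicas per shard, adding one machine to a cluster of n is expected to move roughly a 2/(n+1) share of shard replicas onto it, and nothing else moves.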
Imhotep answers questions
What was the weekly average query time in the
last quarter from people doing the query
“software”?
1. Query Regroup on query:software
2. Metric Regroup on time, width 7 days
3. Get Group Stats on query time and count,
divide after summing
Ramses
Imhotep answers questions
What percent of jobsearch results pages are for
page 2 and beyond?
1. Get Group Stats on count
2. Query Regroup on “-page:1”
3. Get Group Stats on count
4. Divide -page:1 count by total count
Ramses
Imhotep answers questions
What are the 5 most common queries in each
country?
1. Multiterm Regroup on all values of country
2. Term Group Stats Iteration on query
IQL
select count()
from jobsearch
‘2014-01-01’
‘2014-03-26’
group by country, query[5]
IQL
● Metrics: select count()
● Dataset: from jobsearch ‘2014-01-01’ ‘2014-03-26’
● Regroup: group by country
● Term Group Stats: query[5]
Imhotep
Large Scale Analytics and Machine
Learning
● Varint Decoding:
High Performance Vector Instructions
● Stream Merging: Hash Partition +
Interleave
● Shard Distribution: Rendezvous Hashing
We’re Open Sourcing
Imhotep
How You Can Use Imhotep
Data Ingestion
● TSV Uploader
● Hadoop
Data Access
● Imhotep Primitives
● IQL
Next @IndeedEng Talk
Large Scale Interactive Analytics
with Imhotep
Tom Bergman, Product Manager
Zak Cocos, Manager of Marketing Sciences
April 30, 2014
http://engineering.indeed.com/talks
Q&A
More Questions?
David James

More Related Content

Viewers also liked

NYC event Razom Ukraine: Rise of Ukraine as Technology Nation
NYC event Razom Ukraine: Rise of Ukraine as Technology NationNYC event Razom Ukraine: Rise of Ukraine as Technology Nation
NYC event Razom Ukraine: Rise of Ukraine as Technology NationYevgen Sysoyev
 
[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Searchindeedeng
 
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctorindeedeng
 
Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Daniel Lemire
 
Vectorized VByte Decoding
Vectorized VByte DecodingVectorized VByte Decoding
Vectorized VByte Decodingindeedeng
 
Lessons from Sharding Solr
Lessons from Sharding SolrLessons from Sharding Solr
Lessons from Sharding SolrGregg Donovan
 
How to Define, Measure and Ensure Workplace Happiness
How to Define, Measure and Ensure Workplace HappinessHow to Define, Measure and Ensure Workplace Happiness
How to Define, Measure and Ensure Workplace HappinessIndeed
 
The Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People Click The Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People Click Indeed
 
Hiring Practices That Build Powerfully Diverse Teams
Hiring Practices That Build Powerfully Diverse TeamsHiring Practices That Build Powerfully Diverse Teams
Hiring Practices That Build Powerfully Diverse TeamsIndeed
 
Experts Weigh In: Data Discoveries That Changed Our Businesses
Experts Weigh In: Data Discoveries That Changed Our BusinessesExperts Weigh In: Data Discoveries That Changed Our Businesses
Experts Weigh In: Data Discoveries That Changed Our BusinessesIndeed
 
Creating Top-Notch Job Content
Creating Top-Notch Job ContentCreating Top-Notch Job Content
Creating Top-Notch Job ContentIndeed
 
Inform, Delight and Engage with Talent on Company Pages
Inform, Delight and Engage with Talent on Company PagesInform, Delight and Engage with Talent on Company Pages
Inform, Delight and Engage with Talent on Company PagesIndeed
 
[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacentersindeedeng
 
The Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People ClickThe Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People ClickIndeed
 
Finding Your Next Great Hire with Indeed CV
Finding Your Next Great Hire with Indeed CVFinding Your Next Great Hire with Indeed CV
Finding Your Next Great Hire with Indeed CVIndeed
 
Digging Into Data to Create Hiring Strategies That Work
Digging Into Data to Create Hiring Strategies That WorkDigging Into Data to Create Hiring Strategies That Work
Digging Into Data to Create Hiring Strategies That WorkIndeed
 

Viewers also liked (17)

NYC event Razom Ukraine: Rise of Ukraine as Technology Nation
NYC event Razom Ukraine: Rise of Ukraine as Technology NationNYC event Razom Ukraine: Rise of Ukraine as Technology Nation
NYC event Razom Ukraine: Rise of Ukraine as Technology Nation
 
[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search
 
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
 
Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)
 
Vectorized VByte Decoding
Vectorized VByte DecodingVectorized VByte Decoding
Vectorized VByte Decoding
 
Lessons from Sharding Solr
Lessons from Sharding SolrLessons from Sharding Solr
Lessons from Sharding Solr
 
How to Define, Measure and Ensure Workplace Happiness
How to Define, Measure and Ensure Workplace HappinessHow to Define, Measure and Ensure Workplace Happiness
How to Define, Measure and Ensure Workplace Happiness
 
The Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People Click The Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People Click
 
Job Search Gone Mobile
Job Search Gone MobileJob Search Gone Mobile
Job Search Gone Mobile
 
Hiring Practices That Build Powerfully Diverse Teams
Hiring Practices That Build Powerfully Diverse TeamsHiring Practices That Build Powerfully Diverse Teams
Hiring Practices That Build Powerfully Diverse Teams
 
Experts Weigh In: Data Discoveries That Changed Our Businesses
Experts Weigh In: Data Discoveries That Changed Our BusinessesExperts Weigh In: Data Discoveries That Changed Our Businesses
Experts Weigh In: Data Discoveries That Changed Our Businesses
 
Creating Top-Notch Job Content
Creating Top-Notch Job ContentCreating Top-Notch Job Content
Creating Top-Notch Job Content
 
Inform, Delight and Engage with Talent on Company Pages
Inform, Delight and Engage with Talent on Company PagesInform, Delight and Engage with Talent on Company Pages
Inform, Delight and Engage with Talent on Company Pages
 
[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters
 
The Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People ClickThe Science of Talent Attraction: Understanding What Makes People Click
The Science of Talent Attraction: Understanding What Makes People Click
 
Finding Your Next Great Hire with Indeed CV
Finding Your Next Great Hire with Indeed CVFinding Your Next Great Hire with Indeed CV
Finding Your Next Great Hire with Indeed CV
 
Digging Into Data to Create Hiring Strategies That Work
Digging Into Data to Create Hiring Strategies That WorkDigging Into Data to Create Hiring Strategies That Work
Digging Into Data to Create Hiring Strategies That Work
 

More from indeedeng

Weapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
Weapons of Math Instruction: Evolving from Data0-Driven to Science-DrivenWeapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
Weapons of Math Instruction: Evolving from Data0-Driven to Science-Drivenindeedeng
 
Alchemy and Science: Choosing Metrics That Work
Alchemy and Science: Choosing Metrics That WorkAlchemy and Science: Choosing Metrics That Work
Alchemy and Science: Choosing Metrics That Workindeedeng
 
@IndeedEng: Imhotep - Large Scale Analytics and Machine Learning at Indeed