SlideShare a Scribd company logo
1 of 74
Download to read offline
Zardosht Kasheff
Understanding Indexing
Without Needing to Understand Data Structures
What’s a Table?
A dictionary is a set of (key, value) pairs.
• We’ll assume you can update the dictionary (insertions,
deletions, updates) and query the dictionary (point
queries, range queries)
• B-Trees and Fractal Trees are examples of dictionaries
• Hashes are not (range queries are not supported)
2
What’s a Table?
3
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
A table is a set of
dictionaries.
Example:
create table foo (a
int, b int, c int,
primary key(a));
Then we insert a bunch
of data and get...
What’s a Table?
The sort order of a dictionary is defined by the key
• For the data structures/storage engines we’ll think about,
range queries on the sort order are FAST
• Range queries on any other order require a table scan =
SLOW
• Point queries -- retrieving the value for one particular key --
is SLOW
‣A single point query is fast, but reading a bunch of rows this way is going
to be 2 orders of magnitude slower than reading the same number of
rows in range query order
4
What’s an Index?
A Index I on table T is itself a dictionary
• We need to define the (key, value) pairs.
• The key in index I is a subset of fields in the primary
dictionary T.
• The value in index I is the primary key in T.
‣ There are other ways to define the value, but we’re sticking with this.
Example:
alter table foo add key(b);
Then we get...
5
What’s an Index?
6
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
Primary key(b)
Q: count(*) where a<120;
7
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
Q: count(*) where a<120;
7
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
100 5 45
101 92 2
2
Q: count(*) where b>50;
8
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
Q: count(*) where b>50;
8
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
56 156
56 256
92 101
202 198
4
Q: sum(c) where b>50;
9
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
Q: sum(c) where b>50;
9
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
56 156
56 256
92 101
202 198
105
156 56 45
256 56 2
101 92 2
198 202 56
SlowFast
What are indexes good for?
Indexes make queries go fast
• Each index will speed up some subset of queries
Design indexes with queries in mind
• Pick most important queries and set up indexes for those
• Consider the cost of maintaining the indexes
10
What are indexes good for?
Indexes make queries go fast
• Each index will speed up some subset of queries
Design indexes with queries in mind
• Pick most important queries and set up indexes for those
• Consider the cost of maintaining the indexes
10
Goals of Talk
3 simple rules for designing good indexes
Avoid details of any data structure
• B-trees and Fractal Trees are interesting and fun for the
algorithmically minded computer scientist, but the 3 rules
will apply equally well to either data structure.
• All we need to care about is that range queries are fast
(per row) and that point queries are much slower (per
row).
11
The Rule to Rule Them All
There is no absolute rule
• Indexing is like a math problem
• Rules help, but each scenario is its own problem, which
requires problem-solving analysis
That said, rules help a lot
12
Three Basic Rules
1. Retrieve less data
• Less bandwidth, less processing, ...
2. Avoid point queries
• Not all data access cost is the same
• Sequential access is MUCH faster than random access
3. Avoid Sorting
• GROUP BY and ORDER BY queries do post-retrieval
work
• Indexing can help get rid of this work
13
Rule 1
Retrieve less data
Rule 1: Example of slow query
Example TABLE (1B rows, no indexes):
•create table foo (a int, b int, c int);
Query (1000 rows match):
•select sum(c) from foo where b=10 and a<150;
Query Plan:
• Rows where b=10 and a<150 can be anywhere in the table
• Without an index, entire table is scanned
Slow execution:
• Scan 1B rows just to count 1000 rows
Rule 1: How to add an index
What should we do?
• Reduce the data retrieved
• Analyze much less than 1B rows
How (for a simple select)?
• Design index by focusing on the WHERE clause
‣This defines what rows the query is interested in
‣Other rows are not important for this query
16
Rule 1: How to add an index
What should we do?
• Reduce the data retrieved
• Analyze much less than 1B rows
How (for a simple select)?
• Design index by focusing on the WHERE clause
‣This defines what rows the query is interested in
‣Other rows are not important for this query
16
select sum(c) from foo where b=10 and a<150;
Rule 1: Which index?
Option 1: key(a)
Option 2: key(b)
Which is better? Depends on selectivity:
• If there are fewer rows where a<150, then key(a) is better
• If there are fewer rows where b=10, then key(b) is better
Option 3: key(a) AND key(b), then MERGE
• We’ll come to this later
17
Rule 1: Picking the best key
Neither key(a) nor key(b) is optimal
Suppose:
• 200,000 rows exist where a<150
• 100,000 rows exist where b=10
• 1000 rows exist where b=10 and a<150
Then either index retrieves too much data
For better performance, indexes should try to
optimize over as many pieces of the where clause
as possible
• We need a composite index
18
Composite indexes reduce data retrieved
Where clause: b=5 and a<150
• Option 1: key(a,b)
• Option 2: key(b,a)
Which one is better?
• key(b,a)!
KEY RULE:
• When making a composite index, place equality checking
columns first. Condition on b is equality, but not on a.
19
Q: where b=5 and a>150;
20
a b c
100 5 45
101 6 2
156 5 45
165 6 2
198 6 56
206 5 252
256 5 2
412 6 45
b,a a
5,100 100
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
Q: where b=5 and a>150;
20
a b c
100 5 45
101 6 2
156 5 45
165 6 2
198 6 56
206 5 252
256 5 2
412 6 45
b,a a
5,100 100
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
5,156 156
5,206 206
5,256 256
Composite Indexes: No equality clause
What if where clause is:
•where a>100 and a<200 and b>100;
Which is better?
• key(a), key(b), key(a,b), key(b,a)?
KEY RULE:
• As soon as a column on a composite index is NOT used for
equality, the rest of the composite index no longer reduces
data retrieved.
‣key(a,b) is no better* than key(a)
‣key(b,a) is no better* than key(b)
21
Composite Indexes: No equality clause
What if where clause is:
•where a>100 and a<200 and b>100;
Which is better?
• key(a), key(b), key(a,b), key(b,a)?
KEY RULE:
• As soon as a column on a composite index is NOT used for
equality, the rest of the composite index no longer reduces
data retrieved.
‣key(a,b) is no better* than key(a)
‣key(b,a) is no better* than key(b)
21
* Are there corner cases where it helps? Yes, but rare.
Q: where b>=5 and a>150;
22
a b c
100 5 45
101 6 2
156 5 45
165 6 2
198 6 56
206 5 252
256 5 2
412 6 45
b,a a
5,100 100
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
Q: where b>=5 and a>150;
22
a b c
100 5 45
101 6 2
156 5 45
165 6 2
198 6 56
206 5 252
256 5 2
412 6 45
b,a a
5,100 100
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
Q: where b>=5 and a>150;
22
a b c
100 5 45
101 6 2
156 5 45
165 6 2
198 6 56
206 5 252
256 5 2
412 6 45
b,a a
5,100 100
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
5,156 156
5,206 206
5,256 256
6,101 101
6,165 165
6,198 198
6,412 412
Composite Indexes: Another example
WHERE clause: b=5 and c=100
• key(b,a,c) is as good as key(b), because a is not used
in clause, so having c in index doesn’t help. key(b,c,a)
would be much better.
23
a b c
100 5 100
101 6 200
156 5 200
165 6 100
198 6 100
206 5 200
256 5 100
412 6 100
b,a,c a
5,100,100 100
5,156,200 156
5,206,200 206
5,256,100 256
6,101,200 101
6,165,100 165
6,198,100 6,198
6,412,100 412
Composite Indexes: Another example
WHERE clause: b=5 and c=100
• key(b,a,c) is as good as key(b), because a is not used
in clause, so having c in index doesn’t help. key(b,c,a)
would be much better.
23
a b c
100 5 100
101 6 200
156 5 200
165 6 100
198 6 100
206 5 200
256 5 100
412 6 100
b,a,c a
5,100,100 100
5,156,200 156
5,206,200 206
5,256,100 256
6,101,200 101
6,165,100 165
6,198,100 6,198
6,412,100 412
5,100,100 100
5,156,200 156
5,206,200 206
5,256,100 256
Rule 1: Recap
Goal is to reduce rows retrieved
• Create composite indexes based on where clause
• Place equality-comparison columns at the beginning
• Make first non-equality column in index as selective as
possible
• Once first column in a composite index is not used for
equality, or not used in where clause, the rest of the
composite index does not help reduce rows retrieved
‣Does that mean they aren’t helpful?
‣They might be very helpful... on to Rule 2.
24
Rule 2
Avoid Point Queries
Rule 2: Avoid point queries
Table:
•create table foo (a int, b int, c int,
primary key(a), key(b));
Query:
•select sum(c) from foo where b>50;
Query plan: use key(b)
• retrieval cost of each row is high because random point
queries are done
26
Q: sum(c) where b>50;
27
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
Q: sum(c) where b>50;
27
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
56 156
56 256
92 101
202 198
105
156 56 45
256 56 2
101 92 2
198 202 56
Rule 2: Avoid Point Queries
Table:
•create table foo (a int, b int, c int,
primary key(a), key(b));
Query:
•select sum(c) from foo where b>50;
Query plan: scan primary table
• retrieval cost of each row is CHEAP!
• But you retrieve too many rows
28
Q: sum(c) where b>50;
29
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
Q: sum(c) where b>50;
29
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b a
5 100
6 165
23 206
43 412
56 156
56 256
92 101
202 198
105
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
Rule 2: Avoid Point Queries
Table:
•create table foo (a int, b int, c int,
primary key(a), key(b));
Query:
•select sum(c) from foo where b>50;
What if we add another index?
• What about key(b,c)?
• Since we index on b, we retrieve only the rows we need.
• Since the index has information about c, we don’t need to
go to the main table. No point queries!
30
Q: sum(c) where b>50;
31
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b,c a
5,45 100
6,2 165
23,252 206
43,45 412
56,2 256
56,45 156
92,2 101
202,56 198
Q: sum(c) where b>50;
31
a b c
100 5 45
101 92 2
156 56 45
165 6 2
198 202 56
206 23 252
256 56 2
412 43 45
b,c a
5,45 100
6,2 165
23,252 206
43,45 412
56,2 256
56,45 156
92,2 101
202,56 198
56,2 256
56,45 156
92,2 101
202,56 198
105
Covering Index
An index covers a query if the index has enough
information to answer the query.
Examples:
Q: select sum(c) from foo where b<100;
Q: select sum(d) from foo where b<100;
Indexes:
• key(b,c) -- covering index for first query
• key(b,d) -- covering index for second query
• key(b,c,d) -- covering index for both
32
How to build a covering index
Add every field from the select
• Not just where clause
Q: select c,d from foo where a=10 and
b=100;
Mistake: add index(a,b);
• This doesn’t cover the query. You still need point queries
to retrieve c and d values.
Correct: add index(a,b,c,d);
• Includes all referenced fields
• Place a and b at beginning by Rule 1
33
What if Primary Key matches where?
Q: select sum(c) from foo where b>100
and b<200;
Schema: create table foo (a int, b
int, c int, ai int auto_increment,
primary key (b,ai));
• Query does a range query on primary dictionary
• Only one dictionary is accessed, in sequential order
• This is fast
Primary key covers all queries
• If sort order matches where clause, problems solved
34
What’s a Clustering Index
What if primary key doesn’t match the where
clause?
• Ideally, you should be able to declare secondary indexes
that carry all fields
• AFAIK, storage engines don’t let you do this
• With one exception... TokuDB
• TokuDB allows you to declare any index to be
CLUSTERING
• A CLUSTERING index covers all queries, just like the
primary key
35
Clustering Indexes in Action
Q: select sum(c) from foo where b<100;
Q: select sum(d) from foo where b>200;
Q: select c,e from foo where b=1000;
Indexes:
• key(b,c) covers first query
• key(b,d) covers second query
• key(b,c,e) covers first and third queries
• key(b,c,d,e) covers all three queries
Indexes require a lot of analysis of queries
36
Clustering Indexes in Action
Q: select sum(c) from foo where b<100;
Q: select sum(d) from foo where b>200;
Q: select c,e from foo where b=1000;
What covers all queries?
• clustering key(b)
Clustering keys let you focus on the where
clause
• They eliminate point queries and make queries fast
37
More on clustering: Index Merge
We had example:
• create table foo(a int, b int, c int);
• select sum(c) from foo where b=10 and
a<150;
Suppose
• 200,000 rows have a<150
• 100,000 rows have b=10
• 1000 rows have b=10 and a<150
What if we use key(a) and key(b) and merge
results?
38
Query Plans
Merge plan:
• Scan 200,000 rows in key(a) where a<150
• Scan 100,000 rows in key(b) where b=10
• Merge the results and find 1000 row identifiers that
match query
• Do point queries with 1000 row identifiers to retrieve c
Better than no indexes
• Reduces number of rows scanned compared to no index
• Reduces number of point queries compared to not
merging
39
Does Clustering Help Merging?
Suppose key(a) is clustering
Query plan:
• Scan key(a) for 200,000 rows where a<150
• Scan resulting rows for ones where b=10
• Retrieve c values from 1000 remaining rows
Once again, no point queries
What’s even better?
• clustering key(b,a)!
40
Rule 2 Recap
Avoid Point Queries
Make sure index covers query
• By mentioning all fields in the query, not just those in the
where clause
Use clustering indexes
• Clustering indexes cover all queries
• Allows user to focus on where clause
• Speeds up more (and unforeseen) queries -- simplifies
database design
41
Rule 3
Avoid Sorting
Rule 3: Avoid Sorting
Simple selects require no post-processing
• select * from foo where b=100;
Just get the data and return to user
More complex queries do post-processing
• GROUP BY and ORDER BY sort the data
Index selection can avoid this sorting step
43
Avoid Sorting
Q1: select count(c) from foo;
Q2: select count(c) from foo group by
b, order by b;
Q1 plan:
• While doing a table scan, count rows with c
Q2 plan
• Scan table and write data to a temporary file
• Sort tmp file data by b
• Rescan sorted data, counting rows with c, for each b
44
Avoid Sorting
Q2: select count(c) from foo group by
b, order by b;
Q2: what if we use key(b,c)?
• By adding all needed fields, we cover query. FAST!
• By sorting first by b, we avoid sort. FAST!
Take home:
• Sort index on group by or order by fields to avoid
sorting
45
Summing Up
Use indexes to pre-sort for order by and
group by queries
46
Putting it all together
Sample Queries
Example
48
Example
select count(*) from foo where c=5,
group by b;
48
Example
select count(*) from foo where c=5,
group by b;
key(c,b):
• First have c to limit rows retrieved (R1)
• Then have remaining rows sorted by b to avoid sort (R3)
• Remaining rows will be sorted by b because of equality
test on c
48
Example
49
Example
select sum(d) from foo where c=100,
group by b;
49
Example
select sum(d) from foo where c=100,
group by b;
key(c,b,d):
• First have c to limit rows retrieved (R1)
• Then have remaining rows sorted by b to avoid sort (R3)
• Make sure index covers query, to avoid point queries (R2)
49
Example
Sometimes, there is no clear answer
• Best index is data dependent
Q: select count(*) from foo where
c<100, group by b;
Indexes:
• key(c,b)
• key(b,c)
50
Example
Q: select count(*) from foo where
c<100, group by b;
Query plan for key(c,b):
• Will filter rows where c<100
• Still need to sort by b
‣ Rows retrieved will not be sorted by b
‣ where clause does not do an equality check on c, so b values are scattered in clumps for
different c values
51
Example
Q: select count(*) from foo where
c<100, group by b;
Query plan for key(b,c):
• Sorted by b, so R3 is covered
• Rows where c>=100 are also processed, so not taking
advantage of R1
52
Example
Which is better?
• Answer depends on the data
• If there are many rows where c>=100, saving time by not
retrieving these useless rows helps. Use key(c,b).
• If there aren’t so many rows where c>=100, the time to
execute the query is dominated by sorting. Use
key(b,c).
The point is, in general, rules of thumb help, but
often they help us think about queries and
indexes, rather than giving a recipe.
53
Why not just load up
with indexes?
Need to keep up with insertion load
More indexes = smaller max load
Indexing cost
Space
• Issue
‣Each index adds storage requirements
• Options
‣Use compression (i.e., aggressive compression of 5-15x always on for
TokuDB)
Performance
• Issue
‣B-trees while fast for certain indexing tasks (in memory, sequential keys),
are over 20x slower for other types of indexing
• Options
‣Fractal Tree indexes (TokuDB’s data structure) is fast at all kinds of
indexing (i.e., random keys, large tables, wide keys, etc...)
‣ No need to worry about what type of index you are creating.
‣ Fractal trees enable customers to index early, index often
55
Last Caveat
Range Query Performance
• Issue
‣Rule #2 (range query performance over point query performance)
depends on range queries being fast
‣However B-trees can get fragmented
‣ from deletions, from random insertions, ...
‣ Fragmented B-trees get slow for range queries
• Options
‣For B-trees, optimize tables, dump and reload, (ie, time consuming and
offline maintenance) ...
‣For Fractal Tree indexing (TokuDB), not an issue
‣ Fractal trees don’t fragment
56
Thanks!
For more information...
• Please contact me at zardosht@tokutek.com for any
thoughts or feedback
• Please visit Tokutek.com for a copy of this presentation
(goo.gl/S2LBe) to learn more about the power of
indexing, read about Fractal Tree indexes, or to download
a free eval copy of TokuDB
57

More Related Content

What's hot

Lecture 5 data structures and algorithms
Lecture 5 data structures and algorithmsLecture 5 data structures and algorithms
Lecture 5 data structures and algorithmsAakash deep Singhal
 
OOUG: VST , visual sql tuning diagrams
OOUG: VST , visual sql tuning diagramsOOUG: VST , visual sql tuning diagrams
OOUG: VST , visual sql tuning diagramsKyle Hailey
 
SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)Logan Palanisamy
 
9-2 Compare and Order Integers
9-2 Compare and Order Integers9-2 Compare and Order Integers
9-2 Compare and Order IntegersRudy Alfonso
 
Insersion & Bubble Sort in Algoritm
Insersion & Bubble Sort in AlgoritmInsersion & Bubble Sort in Algoritm
Insersion & Bubble Sort in AlgoritmEhsan Ehrari
 
SQL Tuning Methodology, Kscope 2013
SQL Tuning Methodology, Kscope 2013 SQL Tuning Methodology, Kscope 2013
SQL Tuning Methodology, Kscope 2013 Kyle Hailey
 

What's hot (8)

Sorting ppt
Sorting pptSorting ppt
Sorting ppt
 
Lecture 5 data structures and algorithms
Lecture 5 data structures and algorithmsLecture 5 data structures and algorithms
Lecture 5 data structures and algorithms
 
OOUG: VST , visual sql tuning diagrams
OOUG: VST , visual sql tuning diagramsOOUG: VST , visual sql tuning diagrams
OOUG: VST , visual sql tuning diagrams
 
SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)
 
01 analysis-of-algorithms
01 analysis-of-algorithms01 analysis-of-algorithms
01 analysis-of-algorithms
 
9-2 Compare and Order Integers
9-2 Compare and Order Integers9-2 Compare and Order Integers
9-2 Compare and Order Integers
 
Insersion & Bubble Sort in Algoritm
Insersion & Bubble Sort in AlgoritmInsersion & Bubble Sort in Algoritm
Insersion & Bubble Sort in Algoritm
 
SQL Tuning Methodology, Kscope 2013
SQL Tuning Methodology, Kscope 2013 SQL Tuning Methodology, Kscope 2013
SQL Tuning Methodology, Kscope 2013
 

Viewers also liked

магия чтения, или библиотека вне стен
магия чтения, или библиотека вне стенмагия чтения, или библиотека вне стен
магия чтения, или библиотека вне стенAtner Yegorov
 
Linde lecturesmunich2
Linde lecturesmunich2Linde lecturesmunich2
Linde lecturesmunich2Atner Yegorov
 
Fractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexingFractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexingAtner Yegorov
 
Prezentaciya chr (russk_)_11_02_14_(3)(1)
Prezentaciya chr (russk_)_11_02_14_(3)(1)Prezentaciya chr (russk_)_11_02_14_(3)(1)
Prezentaciya chr (russk_)_11_02_14_(3)(1)Atner Yegorov
 
Peat processing and production of soil modificator for agriculture
Peat processing and production of soil modificator for agriculturePeat processing and production of soil modificator for agriculture
Peat processing and production of soil modificator for agricultureAtner Yegorov
 
долговая политика пресс конференция
долговая политика пресс конференциядолговая политика пресс конференция
долговая политика пресс конференцияAtner Yegorov
 

Viewers also liked (7)

магия чтения, или библиотека вне стен
магия чтения, или библиотека вне стенмагия чтения, или библиотека вне стен
магия чтения, или библиотека вне стен
 
Linde lecturesmunich2
Linde lecturesmunich2Linde lecturesmunich2
Linde lecturesmunich2
 
Msatw 2011 report
Msatw 2011 reportMsatw 2011 report
Msatw 2011 report
 
Fractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexingFractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexing
 
Prezentaciya chr (russk_)_11_02_14_(3)(1)
Prezentaciya chr (russk_)_11_02_14_(3)(1)Prezentaciya chr (russk_)_11_02_14_(3)(1)
Prezentaciya chr (russk_)_11_02_14_(3)(1)
 
Peat processing and production of soil modificator for agriculture
Peat processing and production of soil modificator for agriculturePeat processing and production of soil modificator for agriculture
Peat processing and production of soil modificator for agriculture
 
долговая политика пресс конференция
долговая политика пресс конференциядолговая политика пресс конференция
долговая политика пресс конференция
 

Similar to Understanding indexing-webinar-deck

[Www.pkbulk.blogspot.com]file and indexing
[Www.pkbulk.blogspot.com]file and indexing[Www.pkbulk.blogspot.com]file and indexing
[Www.pkbulk.blogspot.com]file and indexingAnusAhmad
 
Table partitioning in PostgreSQL + Rails
Table partitioning in PostgreSQL + RailsTable partitioning in PostgreSQL + Rails
Table partitioning in PostgreSQL + RailsAgnieszka Figiel
 
Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...
Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...
Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...Databricks
 
Chapter 2 Boolean Algebra (part 2)
Chapter 2 Boolean Algebra (part 2)Chapter 2 Boolean Algebra (part 2)
Chapter 2 Boolean Algebra (part 2)Frankie Jones
 
Dat 305 dat305 dat 305 education for service uopstudy.com
Dat 305 dat305 dat 305 education for service   uopstudy.comDat 305 dat305 dat 305 education for service   uopstudy.com
Dat 305 dat305 dat 305 education for service uopstudy.comULLPTT
 
Performance tuning through iterations
Performance tuning through iterationsPerformance tuning through iterations
Performance tuning through iterationsGert Franz
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sSveta Smirnova
 
Rijpma's Catasto meets SPARQL dhb2017_workshop
Rijpma's Catasto meets SPARQL dhb2017_workshopRijpma's Catasto meets SPARQL dhb2017_workshop
Rijpma's Catasto meets SPARQL dhb2017_workshopRichard Zijdeman
 
boolean algebra and logic simplification
boolean algebra and logic simplificationboolean algebra and logic simplification
boolean algebra and logic simplificationUnsa Shakir
 
Travelling salesman problem
Travelling salesman problemTravelling salesman problem
Travelling salesman problemMariamS99
 
45 Essential SQL Interview Questions
45 Essential SQL Interview Questions45 Essential SQL Interview Questions
45 Essential SQL Interview QuestionsBest SEO Tampa
 
Normalization.pdf
Normalization.pdfNormalization.pdf
Normalization.pdfabhijose1
 
Dsoop (co 221) 1
Dsoop (co 221) 1Dsoop (co 221) 1
Dsoop (co 221) 1Puja Koch
 
Fortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLASFortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLASJongsu "Liam" Kim
 
SciPy 2011 pandas lightning talk
SciPy 2011 pandas lightning talkSciPy 2011 pandas lightning talk
SciPy 2011 pandas lightning talkWes McKinney
 
You might be paying too much for BigQuery
You might be paying too much for BigQueryYou might be paying too much for BigQuery
You might be paying too much for BigQueryRyuji Tamagawa
 
Geek Sync I Consolidating Indexes in SQL Server
Geek Sync I Consolidating Indexes in SQL ServerGeek Sync I Consolidating Indexes in SQL Server
Geek Sync I Consolidating Indexes in SQL ServerIDERA Software
 

Similar to Understanding indexing-webinar-deck (20)

[Www.pkbulk.blogspot.com]file and indexing
[Www.pkbulk.blogspot.com]file and indexing[Www.pkbulk.blogspot.com]file and indexing
[Www.pkbulk.blogspot.com]file and indexing
 
Table partitioning in PostgreSQL + Rails
Table partitioning in PostgreSQL + RailsTable partitioning in PostgreSQL + Rails
Table partitioning in PostgreSQL + Rails
 
Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...
Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...
Cardinality Estimation through Histogram in Apache Spark 2.3 with Ron Hu and ...
 
DBMS.pptx
DBMS.pptxDBMS.pptx
DBMS.pptx
 
Chapter 2 Boolean Algebra (part 2)
Chapter 2 Boolean Algebra (part 2)Chapter 2 Boolean Algebra (part 2)
Chapter 2 Boolean Algebra (part 2)
 
Dat 305 dat305 dat 305 education for service uopstudy.com
Dat 305 dat305 dat 305 education for service   uopstudy.comDat 305 dat305 dat 305 education for service   uopstudy.com
Dat 305 dat305 dat 305 education for service uopstudy.com
 
Performance tuning through iterations
Performance tuning through iterationsPerformance tuning through iterations
Performance tuning through iterations
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]s
 
Rijpma's Catasto meets SPARQL dhb2017_workshop
Rijpma's Catasto meets SPARQL dhb2017_workshopRijpma's Catasto meets SPARQL dhb2017_workshop
Rijpma's Catasto meets SPARQL dhb2017_workshop
 
boolean algebra and logic simplification
boolean algebra and logic simplificationboolean algebra and logic simplification
boolean algebra and logic simplification
 
Travelling salesman problem
Travelling salesman problemTravelling salesman problem
Travelling salesman problem
 
45 Essential SQL Interview Questions
45 Essential SQL Interview Questions45 Essential SQL Interview Questions
45 Essential SQL Interview Questions
 
Normalization.pdf
Normalization.pdfNormalization.pdf
Normalization.pdf
 
Dsoop (co 221) 1
Dsoop (co 221) 1Dsoop (co 221) 1
Dsoop (co 221) 1
 
LEC3-DS ALGO(updated).pdf
LEC3-DS  ALGO(updated).pdfLEC3-DS  ALGO(updated).pdf
LEC3-DS ALGO(updated).pdf
 
Fortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLASFortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLAS
 
SciPy 2011 pandas lightning talk
SciPy 2011 pandas lightning talkSciPy 2011 pandas lightning talk
SciPy 2011 pandas lightning talk
 
You might be paying too much for BigQuery
You might be paying too much for BigQueryYou might be paying too much for BigQuery
You might be paying too much for BigQuery
 
Geek Sync I Consolidating Indexes in SQL Server
Geek Sync I Consolidating Indexes in SQL ServerGeek Sync I Consolidating Indexes in SQL Server
Geek Sync I Consolidating Indexes in SQL Server
 
UNIT V.pptx
UNIT V.pptxUNIT V.pptx
UNIT V.pptx
 

More from Atner Yegorov

Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015Atner Yegorov
 
Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015Atner Yegorov
 
Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015Atner Yegorov
 
Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015Atner Yegorov
 
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015Atner Yegorov
 
Брендинг Чувашии
Брендинг ЧувашииБрендинг Чувашии
Брендинг ЧувашииAtner Yegorov
 
VIII Экономический форум Чебоксары
VIII Экономический форум ЧебоксарыVIII Экономический форум Чебоксары
VIII Экономический форум ЧебоксарыAtner Yegorov
 
Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015Atner Yegorov
 
Аттракционы тематического парка, Татарстан
Аттракционы тематического парка, ТатарстанАттракционы тематического парка, Татарстан
Аттракционы тематического парка, ТатарстанAtner Yegorov
 
Тематический парк, Татарстан
Тематический парк, ТатарстанТематический парк, Татарстан
Тематический парк, ТатарстанAtner Yegorov
 
Будущее журналистики
Будущее журналистикиБудущее журналистики
Будущее журналистикиAtner Yegorov
 
Будущее нишевых СМИ
Будущее нишевых СМИБудущее нишевых СМИ
Будущее нишевых СМИAtner Yegorov
 
Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах Atner Yegorov
 
Война форматов. Как люди читают новости
Война форматов. Как люди читают новости Война форматов. Как люди читают новости
Война форматов. Как люди читают новости Atner Yegorov
 
Национальная технологическая инициатива
Национальная технологическая инициативаНациональная технологическая инициатива
Национальная технологическая инициативаAtner Yegorov
 
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ Atner Yegorov
 
Участники заседания по модернизации
Участники заседания по модернизацииУчастники заседания по модернизации
Участники заседания по модернизацииAtner Yegorov
 
Город Innopolis
Город InnopolisГород Innopolis
Город InnopolisAtner Yegorov
 
презентация1
презентация1презентация1
презентация1Atner Yegorov
 

More from Atner Yegorov (20)

Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015
 
Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015
 
Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015
 
Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015
 
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
 
Брендинг Чувашии
Брендинг ЧувашииБрендинг Чувашии
Брендинг Чувашии
 
VIII Экономический форум Чебоксары
VIII Экономический форум ЧебоксарыVIII Экономический форум Чебоксары
VIII Экономический форум Чебоксары
 
Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015
 
Аттракционы тематического парка, Татарстан
Аттракционы тематического парка, ТатарстанАттракционы тематического парка, Татарстан
Аттракционы тематического парка, Татарстан
 
Тематический парк, Татарстан
Тематический парк, ТатарстанТематический парк, Татарстан
Тематический парк, Татарстан
 
Будущее журналистики
Будущее журналистикиБудущее журналистики
Будущее журналистики
 
Будущее нишевых СМИ
Будущее нишевых СМИБудущее нишевых СМИ
Будущее нишевых СМИ
 
Relap
RelapRelap
Relap
 
Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах
 
Война форматов. Как люди читают новости
Война форматов. Как люди читают новости Война форматов. Как люди читают новости
Война форматов. Как люди читают новости
 
Национальная технологическая инициатива
Национальная технологическая инициативаНациональная технологическая инициатива
Национальная технологическая инициатива
 
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
 
Участники заседания по модернизации
Участники заседания по модернизацииУчастники заседания по модернизации
Участники заседания по модернизации
 
Город Innopolis
Город InnopolisГород Innopolis
Город Innopolis
 
презентация1
презентация1презентация1
презентация1
 

Recently uploaded

Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 

Recently uploaded (20)

Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 

Understanding indexing-webinar-deck

  • 1. Zardosht Kasheff Understanding Indexing Without Needing to Understand Data Structures
  • 2. What’s a Table? A dictionary is a set of (key, value) pairs. • We’ll assume you can update the dictionary (insertions, deletions, updates) and query the dictionary (point queries, range queries) • B-Trees and Fractal Trees are examples of dictionaries • Hashes are not (range queries are not supported) 2
  • 3. What’s a Table? 3 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 A table is a set of dictionaries. Example: create table foo (a int, b int, c int, primary key(a)); Then we insert a bunch of data and get...
  • 4. What’s a Table? The sort order of a dictionary is defined by the key • For the data structures/storage engines we’ll think about, range queries on the sort order are FAST • Range queries on any other order require a table scan = SLOW • Point queries -- retrieving the value for one particular key -- is SLOW ‣A single point query is fast, but reading a bunch of rows this way is going to be 2 orders of magnitude slower than reading the same number of rows in range query order 4
  • 5. What’s an Index? A Index I on table T is itself a dictionary • We need to define the (key, value) pairs. • The key in index I is a subset of fields in the primary dictionary T. • The value in index I is the primary key in T. ‣ There are other ways to define the value, but we’re sticking with this. Example: alter table foo add key(b); Then we get... 5
  • 6. What’s an Index? 6 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198 Primary key(b)
  • 7. Q: count(*) where a<120; 7 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198
  • 8. Q: count(*) where a<120; 7 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198 100 5 45 101 92 2 2
  • 9. Q: count(*) where b>50; 8 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198
  • 10. Q: count(*) where b>50; 8 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198 56 156 56 256 92 101 202 198 4
  • 11. Q: sum(c) where b>50; 9 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198
  • 12. Q: sum(c) where b>50; 9 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198 56 156 56 256 92 101 202 198 105 156 56 45 256 56 2 101 92 2 198 202 56 SlowFast
  • 13. What are indexes good for? Indexes make queries go fast • Each index will speed up some subset of queries Design indexes with queries in mind • Pick most important queries and set up indexes for those • Consider the cost of maintaining the indexes 10
  • 14. What are indexes good for? Indexes make queries go fast • Each index will speed up some subset of queries Design indexes with queries in mind • Pick most important queries and set up indexes for those • Consider the cost of maintaining the indexes 10
  • 15. Goals of Talk 3 simple rules for designing good indexes Avoid details of any data structure • B-trees and Fractal Trees are interesting and fun for the algorithmically minded computer scientist, but the 3 rules will apply equally well to either data structure. • All we need to care about is that range queries are fast (per row) and that point queries are much slower (per row). 11
  • 16. The Rule to Rule Them All There is no absolute rule • Indexing is like a math problem • Rules help, but each scenario is its own problem, which requires problem-solving analysis That said, rules help a lot 12
  • 17. Three Basic Rules 1. Retrieve less data • Less bandwidth, less processing, ... 2. Avoid point queries • Not all data access cost is the same • Sequential access is MUCH faster than random access 3. Avoid Sorting • GROUP BY and ORDER BY queries do post-retrieval work • Indexing can help get rid of this work 13
  • 19. Rule 1: Example of slow query Example TABLE (1B rows, no indexes): •create table foo (a int, b int, c int); Query (1000 rows match): •select sum(c) from foo where b=10 and a<150; Query Plan: • Rows where b=10 and a<150 can be anywhere in the table • Without an index, entire table is scanned Slow execution: • Scan 1B rows just to count 1000 rows
  • 20. Rule 1: How to add an index What should we do? • Reduce the data retrieved • Analyze much less than 1B rows How (for a simple select)? • Design index by focusing on the WHERE clause ‣This defines what rows the query is interested in ‣Other rows are not important for this query 16
  • 21. Rule 1: How to add an index What should we do? • Reduce the data retrieved • Analyze much less than 1B rows How (for a simple select)? • Design index by focusing on the WHERE clause ‣This defines what rows the query is interested in ‣Other rows are not important for this query 16 select sum(c) from foo where b=10 and a<150;
  • 22. Rule 1: Which index? Option 1: key(a) Option 2: key(b) Which is better? Depends on selectivity: • If there are fewer rows where a<150, then key(a) is better • If there are fewer rows where b=10, then key(b) is better Option 3: key(a) AND key(b), then MERGE • We’ll come to this later 17
  • 23. Rule 1: Picking the best key Neither key(a) nor key(b) is optimal Suppose: • 200,000 rows exist where a<150 • 100,000 rows exist where b=10 • 1000 rows exist where b=10 and a<150 Then either index retrieves too much data For better performance, indexes should try to optimize over as many pieces of the where clause as possible • We need a composite index 18
  • 24. Composite indexes reduce data retrieved Where clause: b=5 and a<150 • Option 1: key(a,b) • Option 2: key(b,a) Which one is better? • key(b,a)! KEY RULE: • When making a composite index, place equality checking columns first. Condition on b is equality, but not on a. 19
  • 25. Q: where b=5 and a>150; 20 a b c 100 5 45 101 6 2 156 5 45 165 6 2 198 6 56 206 5 252 256 5 2 412 6 45 b,a a 5,100 100 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412
  • 26. Q: where b=5 and a>150; 20 a b c 100 5 45 101 6 2 156 5 45 165 6 2 198 6 56 206 5 252 256 5 2 412 6 45 b,a a 5,100 100 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412 5,156 156 5,206 206 5,256 256
  • 27. Composite Indexes: No equality clause What if where clause is: •where a>100 and a<200 and b>100; Which is better? • key(a), key(b), key(a,b), key(b,a)? KEY RULE: • As soon as a column on a composite index is NOT used for equality, the rest of the composite index no longer reduces data retrieved. ‣key(a,b) is no better* than key(a) ‣key(b,a) is no better* than key(b) 21
  • 28. Composite Indexes: No equality clause What if where clause is: •where a>100 and a<200 and b>100; Which is better? • key(a), key(b), key(a,b), key(b,a)? KEY RULE: • As soon as a column on a composite index is NOT used for equality, the rest of the composite index no longer reduces data retrieved. ‣key(a,b) is no better* than key(a) ‣key(b,a) is no better* than key(b) 21 * Are there corner cases where it helps? Yes, but rare.
  • 29. Q: where b>=5 and a>150; 22 a b c 100 5 45 101 6 2 156 5 45 165 6 2 198 6 56 206 5 252 256 5 2 412 6 45 b,a a 5,100 100 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412
  • 30. Q: where b>=5 and a>150; 22 a b c 100 5 45 101 6 2 156 5 45 165 6 2 198 6 56 206 5 252 256 5 2 412 6 45 b,a a 5,100 100 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412
  • 31. Q: where b>=5 and a>150; 22 a b c 100 5 45 101 6 2 156 5 45 165 6 2 198 6 56 206 5 252 256 5 2 412 6 45 b,a a 5,100 100 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412 5,156 156 5,206 206 5,256 256 6,101 101 6,165 165 6,198 198 6,412 412
  • 32. Composite Indexes: Another example WHERE clause: b=5 and c=100 • key(b,a,c) is as good as key(b), because a is not used in clause, so having c in index doesn’t help. key(b,c,a) would be much better. 23 a b c 100 5 100 101 6 200 156 5 200 165 6 100 198 6 100 206 5 200 256 5 100 412 6 100 b,a,c a 5,100,100 100 5,156,200 156 5,206,200 206 5,256,100 256 6,101,200 101 6,165,100 165 6,198,100 6,198 6,412,100 412
  • 33. Composite Indexes: Another example WHERE clause: b=5 and c=100 • key(b,a,c) is as good as key(b), because a is not used in clause, so having c in index doesn’t help. key(b,c,a) would be much better. 23 a b c 100 5 100 101 6 200 156 5 200 165 6 100 198 6 100 206 5 200 256 5 100 412 6 100 b,a,c a 5,100,100 100 5,156,200 156 5,206,200 206 5,256,100 256 6,101,200 101 6,165,100 165 6,198,100 6,198 6,412,100 412 5,100,100 100 5,156,200 156 5,206,200 206 5,256,100 256
  • 34. Rule 1: Recap Goal is to reduce rows retrieved • Create composite indexes based on where clause • Place equality-comparison columns at the beginning • Make first non-equality column in index as selective as possible • Once first column in a composite index is not used for equality, or not used in where clause, the rest of the composite index does not help reduce rows retrieved ‣Does that mean they aren’t helpful? ‣They might be very helpful... on to Rule 2. 24
  • 36. Rule 2: Avoid point queries Table: •create table foo (a int, b int, c int, primary key(a), key(b)); Query: •select sum(c) from foo where b>50; Query plan: use key(b) • retrieval cost of each row is high because random point queries are done 26
  • 37. Q: sum(c) where b>50; 27 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198
  • 38. Q: sum(c) where b>50; 27 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198 56 156 56 256 92 101 202 198 105 156 56 45 256 56 2 101 92 2 198 202 56
  • 39. Rule 2: Avoid Point Queries Table: •create table foo (a int, b int, c int, primary key(a), key(b)); Query: •select sum(c) from foo where b>50; Query plan: scan primary table • retrieval cost of each row is CHEAP! • But you retrieve too many rows 28
  • 40. Q: sum(c) where b>50; 29 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198
  • 41. Q: sum(c) where b>50; 29 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b a 5 100 6 165 23 206 43 412 56 156 56 256 92 101 202 198 105 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45
  • 42. Rule 2: Avoid Point Queries Table: •create table foo (a int, b int, c int, primary key(a), key(b)); Query: •select sum(c) from foo where b>50; What if we add another index? • What about key(b,c)? • Since we index on b, we retrieve only the rows we need. • Since the index has information about c, we don’t need to go to the main table. No point queries! 30
  • 43. Q: sum(c) where b>50; 31 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b,c a 5,45 100 6,2 165 23,252 206 43,45 412 56,2 256 56,45 156 92,2 101 202,56 198
  • 44. Q: sum(c) where b>50; 31 a b c 100 5 45 101 92 2 156 56 45 165 6 2 198 202 56 206 23 252 256 56 2 412 43 45 b,c a 5,45 100 6,2 165 23,252 206 43,45 412 56,2 256 56,45 156 92,2 101 202,56 198 56,2 256 56,45 156 92,2 101 202,56 198 105
  • 45. Covering Index An index covers a query if the index has enough information to answer the query. Examples: Q: select sum(c) from foo where b<100; Q: select sum(d) from foo where b<100; Indexes: • key(b,c) -- covering index for first query • key(b,d) -- covering index for second query • key(b,c,d) -- covering index for both 32
  • 46. How to build a covering index Add every field from the select • Not just where clause Q: select c,d from foo where a=10 and b=100; Mistake: add index(a,b); • This doesn’t cover the query. You still need point queries to retrieve c and d values. Correct: add index(a,b,c,d); • Includes all referenced fields • Place a and b at beginning by Rule 1 33
  • 47. What if Primary Key matches where? Q: select sum(c) from foo where b>100 and b<200; Schema: create table foo (a int, b int, c int, ai int auto_increment, primary key (b,ai)); • Query does a range query on primary dictionary • Only one dictionary is accessed, in sequential order • This is fast Primary key covers all queries • If sort order matches where clause, problems solved 34
  • 48. What’s a Clustering Index What if primary key doesn’t match the where clause? • Ideally, you should be able to declare secondary indexes that carry all fields • AFAIK, storage engines don’t let you do this • With one exception... TokuDB • TokuDB allows you to declare any index to be CLUSTERING • A CLUSTERING index covers all queries, just like the primary key 35
  • 49. Clustering Indexes in Action Q: select sum(c) from foo where b<100; Q: select sum(d) from foo where b>200; Q: select c,e from foo where b=1000; Indexes: • key(b,c) covers first query • key(b,d) covers second query • key(b,c,e) covers first and third queries • key(b,c,d,e) covers all three queries Indexes require a lot of analysis of queries 36
  • 50. Clustering Indexes in Action Q: select sum(c) from foo where b<100; Q: select sum(d) from foo where b>200; Q: select c,e from foo where b=1000; What covers all queries? • clustering key(b) Clustering keys let you focus on the where clause • They eliminate point queries and make queries fast 37
  • 51. More on clustering: Index Merge We had example: • create table foo(a int, b int, c int); • select sum(c) from foo where b=10 and a<150; Suppose • 200,000 rows have a<150 • 100,000 rows have b=10 • 1000 rows have b=10 and a<150 What if we use key(a) and key(b) and merge results? 38
  • 52. Query Plans Merge plan: • Scan 200,000 rows in key(a) where a<150 • Scan 100,000 rows in key(b) where b=10 • Merge the results and find 1000 row identifiers that match query • Do point queries with 1000 row identifiers to retrieve c Better than no indexes • Reduces number of rows scanned compared to no index • Reduces number of point queries compared to not merging 39
  • 53. Does Clustering Help Merging? Suppose key(a) is clustering Query plan: • Scan key(a) for 200,000 rows where a<150 • Scan resulting rows for ones where b=10 • Retrieve c values from 1000 remaining rows Once again, no point queries What’s even better? • clustering key(b,a)! 40
  • 54. Rule 2 Recap Avoid Point Queries Make sure index covers query • By mentioning all fields in the query, not just those in the where clause Use clustering indexes • Clustering indexes cover all queries • Allows user to focus on where clause • Speeds up more (and unforeseen) queries -- simplifies database design 41
  • 56. Rule 3: Avoid Sorting Simple selects require no post-processing • select * from foo where b=100; Just get the data and return to user More complex queries do post-processing • GROUP BY and ORDER BY sort the data Index selection can avoid this sorting step 43
  • 57. Avoid Sorting Q1: select count(c) from foo; Q2: select count(c) from foo group by b, order by b; Q1 plan: • While doing a table scan, count rows with c Q2 plan • Scan table and write data to a temporary file • Sort tmp file data by b • Rescan sorted data, counting rows with c, for each b 44
  • 58. Avoid Sorting Q2: select count(c) from foo group by b, order by b; Q2: what if we use key(b,c)? • By adding all needed fields, we cover query. FAST! • By sorting first by b, we avoid sort. FAST! Take home: • Sort index on group by or order by fields to avoid sorting 45
  • 59. Summing Up Use indexes to pre-sort for order by and group by queries 46
  • 60. Putting it all together Sample Queries
  • 62. Example select count(*) from foo where c=5, group by b; 48
  • 63. Example select count(*) from foo where c=5, group by b; key(c,b): • First have c to limit rows retrieved (R1) • Then have remaining rows sorted by b to avoid sort (R3) • Remaining rows will be sorted by b because of equality test on c 48
  • 65. Example select sum(d) from foo where c=100, group by b; 49
  • 66. Example select sum(d) from foo where c=100, group by b; key(c,b,d): • First have c to limit rows retrieved (R1) • Then have remaining rows sorted by b to avoid sort (R3) • Make sure index covers query, to avoid point queries (R2) 49
  • 67. Example Sometimes, there is no clear answer • Best index is data dependent Q: select count(*) from foo where c<100, group by b; Indexes: • key(c,b) • key(b,c) 50
  • 68. Example Q: select count(*) from foo where c<100, group by b; Query plan for key(c,b): • Will filter rows where c<100 • Still need to sort by b ‣ Rows retrieved will not be sorted by b ‣ where clause does not do an equality check on c, so b values are scattered in clumps for different c values 51
  • 69. Example Q: select count(*) from foo where c<100, group by b; Query plan for key(b,c): • Sorted by b, so R3 is covered • Rows where c>=100 are also processed, so not taking advantage of R1 52
  • 70. Example Which is better? • Answer depends on the data • If there are many rows where c>=100, saving time by not retrieving these useless rows helps. Use key(c,b). • If there aren’t so many rows where c>=100, the time to execute the query is dominated by sorting. Use key(b,c). The point is, in general, rules of thumb help, but often they help us think about queries and indexes, rather than giving a recipe. 53
  • 71. Why not just load up with indexes? Need to keep up with insertion load More indexes = smaller max load
  • 72. Indexing cost Space • Issue ‣Each index adds storage requirements • Options ‣Use compression (i.e., aggressive compression of 5-15x always on for TokuDB) Performance • Issue ‣B-trees while fast for certain indexing tasks (in memory, sequential keys), are over 20x slower for other types of indexing • Options ‣Fractal Tree indexes (TokuDB’s data structure) is fast at all kinds of indexing (i.e., random keys, large tables, wide keys, etc...) ‣ No need to worry about what type of index you are creating. ‣ Fractal trees enable customers to index early, index often 55
  • 73. Last Caveat Range Query Performance • Issue ‣Rule #2 (range query performance over point query performance) depends on range queries being fast ‣However B-trees can get fragmented ‣ from deletions, from random insertions, ... ‣ Fragmented B-trees get slow for range queries • Options ‣For B-trees, optimize tables, dump and reload, (ie, time consuming and offline maintenance) ... ‣For Fractal Tree indexing (TokuDB), not an issue ‣ Fractal trees don’t fragment 56
  • 74. Thanks! For more information... • Please contact me at zardosht@tokutek.com for any thoughts or feedback • Please visit Tokutek.com for a copy of this presentation (goo.gl/S2LBe) to learn more about the power of indexing, read about Fractal Tree indexes, or to download a free eval copy of TokuDB 57