Mais conteúdo relacionado Semelhante a Government and Education Webinar: SQL Server—Indexing for Performance (20) Government and Education Webinar: SQL Server—Indexing for Performance3. 3
@solarwinds
Agenda
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
• SolarWinds overview
• Index structures and fundamentals
• Strategies for choosing good
indexes
• Best practices for high-
performance indexes
• Summary
• SolarWinds® Database
Performance Analyzer (DPA)
Overview
• Demonstration
• Additional resources
• Q&A
4. 4
@solarwinds
SolarWinds at a Glance
1. IDC defined Network Management Software functional market, IDC’s Worldwide Semiannual Software Tracker, October 15, 2020.
2. Gartner, Market Share Analysis: ITOM: Performance Analysis Software, Worldwide, 2019. June 17, 2020. (AIOps/ITIM/Other Monitoring Tools Software Market). SolarWinds term, Systems Management, refers to the AIOps/ITIM/Other Monitoring
Tools Software Market Taxonomy referenced in the Gartner report. All statements in this report attributable to Gartner represent SolarWinds interpretation of data, research opinion, or viewpoints published as part of a syndicated subscription service
by Gartner, Inc., and have not been reviewed by Gartner. Each Gartner publication speaks as of its original publication date (and not as of the date of this presentation). The opinions expressed in Gartner publications are not representations of fact
and are subject to change without notice.
3. Customers are defined as individuals or entities that have an active subscription for our subscription products or that have purchased one or more of our perpetual license products since our inception under a unique customer identification number.
We may have multiple purchasers of our products within a single organization, each of which may be assigned a unique customer identification number and deemed a separate customer.
#1
in network
management1
320,000+
customers in 190
countries3
60+
IT management
products
22,000+ MSPs serving
450,000+ organizations
Every branch of the DoD and
nearly every civilian and
intelligence agency
150,000+ registered members of THWACK®, our global IT community
Founded in 1999
More than 3,200
employees globally
Austin, TX headquarters
Reston, VA, government office
30+ offices globally
Leader
in remote monitoring and
management
#3
in systems
management2
Growing security
portfolio
499 of the
Fortune 500®
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
5. 5
@solarwinds
What Is SolarWinds ITOM?
IT operations management. Simplified.
IT Ops
On-Premises Public Cloud
Service Management
IT
Security
Database Performance Management
Network Management
Infrastructure Management
IT Security
DevOps
Private Cloud
Wherever IT
sits
All things IT Whoever
manages IT
performance
Application Performance Management
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
6. 6
@solarwinds
• Three primary structures to remember – tables without a clustered
index (Heap), tables with a clustered index, (CI) and non-clustered
indexes (NCI).
• All three structures use an 8Kb data page (8192b total/ 8096
available) for storage:
Page header
1 - Cooper
1
2 - Hofstadter
2
3
4 - Kripke
5 4
5 - Wolowitz
5
3 - Koothrappali
Page header: Contains PageID, Next Page,
Previous Page, amount of free space available,
the object ID of owning object, status, and more.
96 bytes total.
Slot array tracks logical order of records
Graphic by Rich Douglas, @sqlrich,
rdouglas@sentryone.com
Overview of Tables and Indexes
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
7. 7
@solarwinds
Leaf Pages
Root Page
Level 0
Intermediate
Pages Level 1
Level 2
IAM
Leaf Pages
Heap
…
… …
Index
Rows
Leaf
NCI
…
Doubly-
linked
list
Doubly-
linked
list
Heaps, Clustered, and Non-Clustered
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
8. 8
@solarwinds
Clustered vs. Non
Clustered index *is* the data, sorted based on key
Non-clustered index has its own key
• Points to the row (RID) for a heap, to the key for a clustered index
• Extents (sets of eight 8kb pages) may not be physically sorted
Like a clustered index key, narrower index is better
Should every table have a clustered index? In most systems, yes!
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
9. 9
@solarwinds
Heap Pros and Cons
Pros:
• Great for bulk loading (create indexes after)
• Freed space can be reused on delete + subsequent insert
Cons:
• Insert performance can be poor if space is reclaimed
• Updates lead to forwarding pointers
• 8-byte RID assigned to every row (RID is file:page:slot)
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
10. 10
@solarwinds
Considerations for Clustered
Narrow, static, unique, ever-increasing
• Narrow and static because key is repeated in every non-clustered index
• Unique because otherwise a unique identifier needs to be added to
each row
• Ever-increasing to avoid “bad” page splits and fragmentation
Some will argue a GUID is better – spread out I/O
• This is great for high-end, write-heavy workloads, until you have to read
On modern hardware, these factors are less important
• More on this later
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
11. 11
@solarwinds
Considerations for Non-Clustered
Consider an index a skinnier, sorted version of your table
Leading key column should support some balance of:
• Most selective
• Most likely to be used to sort
• Most likely to be used to filter
• Most likely to be used to join
• Less likely to change often (or at least less often than queried)
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
12. 12
@solarwinds
Benefits of Non-Clustered
Can have multiple (only one clustered)
Can support ordering not supported by clustered index
Can contain fewer columns and still “cover” many queries,
i.e., fewer I/O reads.
Can contain fewer rows (filtered) and still satisfy queries
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
13. 13
@solarwinds
Seek vs. Scan
Seeks are used for single-row or range scans
• Seek really means “where does the scan start”
• SQL Server chooses seeks versus scans based on statistics
Scans are generally used to look at the whole index / table
• Scan count in stats I/O does not necessarily mean full scan count
• E.g., a parallel query or partitioned query has multiple partial scans
Generally, a seek is better, but not universally true
• Depending on index structure, a scan can be better
• And a NC index scan can be better than a CIX/table scan
• Don’t blindly optimize for seeks, or assume a scan is bad news
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
14. 14
@solarwinds
• During design, try to maximize the number of rows per page
• Determine row size = data type consumption + overhead bytes
• Determine rows per page = 8096 / row size
• Full pages are best for read-heavy workloads. Less pages to read equals less I/O work
• Avoid:
• MAX data types, where possible, especially on heaps
• ONE BIG TABLE: Too many columns hanging on one primary key, usually 1st or 2nd normal form
• Redundant data or malformed data, as evidenced by lots of calculations, datepart function calls, isnull
function calls, concatenations, and shredding upon retrieval
• Column stuffing:
• Storing arrays (CSVs, XML, JSON to stuff multiple values into a single field)
• Parsing and validation are very cumbersome.
• Clustered index best practices:
• Narrow, static, unique (but not GUID), and ever-increasing columns
• Remember the clustered index is stored in every non-clustered index!
Full Pages vs. Read Performance
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
15. 15
@solarwinds
Full Pages vs. Write Performance
Remember:
Each write to leaf pages requires all non-clustered
index structures to be updated!
Leaf Pages
Root Page
Level 0
Intermediate
Pages
Level 1
Level 2
Page
Split
DML
Actual
place-
ment
• Upon creation of the clustered
index every page is 100% full
leaving no empty space on the leaf
or intermediate pages for added
data from INSERT or UPDATE
statements.
• Promotes logical table
fragmentation, if there are writes.
• Default fill factor of 0/100 can cause
costly page splits on some tables
for write-heavy workloads.
• Ways to fix:
• Rebuild or Reorg indexes
• Specify FILL FACTOR option to
leave open space in leaf pages
• Specify PAD_INDEX option pushes
fill factor up into intermediate pages
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
16. 16
@solarwinds
Power move? Implement Data Compression for huge performance
improvements
Index Analogy – Full Pages vs. an Old Time Bath
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
17. 17
@solarwinds
• Don’t forget the clustered index is stored in EVERY non-clustered index! The
shorter the clustered index, the less the storage and fewer I/Os.
• Remember composite keys are most useful from the leftmost column to the
rightmost column, in the order they appeared in the CREATE INDEX statement.
Example:
• Now, consider this index:
• ALTER TABLE yellow_pages ADD CONSTRAINT [UPKCL_sales] PRIMARY KEY
CLUSTERED ([lname], [fname], [street] )
They can help. They can hurt.
LName
• Austin
• Hanso
• Linus
FName
• Kate
• Magnus
• Benjamin
Street
• 1010 Madison
• 315 Ajira
• 815 Oceanic
Performance Penalty for Concatenated Index
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
18. 18
@solarwinds
Works well with:
• WHERE lname = @a AND fname = @b AND street = @c
Also works well with:
• WHERE lname = @a AND fname = @b
How does the index work with this?
• WHERE fname = @b OR title_id = @c
Best practice? Use a surrogate key for the primary key and
then a natural key as an additional non-clustered index.
Might not be what you expect
What Happens When Querying a Concatenated Index
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
19. 19
@solarwinds
Covering Indexes – The Flipside of Concatenated Indexes
A covering index answers a query entirely from its own
intermediate pages, rather than going all the way down the tree to
leaf pages.
All columns referenced in the query need to be in the index. For
example:
• SELECT lname, fname
• FROM employee
• WHERE lname LIKE '%ttlieb%'
All columns in the index must be referenced in the query:
• CREATE INDEX employee_name_lvl ON employee (lname, fname)
Don’t confuse a covering index with an included columns index
An index covering the entire data requested by the query
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
20. 20
@solarwinds
Include Columns
• Included columns “come along for the ride”
• Copied into NC index structure, again to avoid lookups
• Can exceed 900b key length, can be types not allowed in key (e.g., MAX)
• Should additional columns be in the key or included?
• Generally:
• Only add columns that are small and queried often
• Don’t change an index to satisfy one query that does 12 LOB lookups
• If they are just in the SELECT list, INCLUDE
• If they are used in JOIN/WHERE/GROUP BY/ORDER BY, key if possible
The foil to the dreaded key/RID lookup
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
21. 21
@solarwinds
“Tipping Point”
Generally, if seek estimated to hit > 25-30% of pages, scan is better
Lookups needed with seek on non-covering indexes
• A lookup is executed for every row
Again, generally, tipping point looks like this:
• Decision can be guided by index coverage, row size, cost of lookups, hardware
• Often the tipping point will be much lower
You can’t always get what you want
http://www.sqlpassion.at/archive/2013/06/12/sql-server-tipping-games-why-non-clustered-indexes-are-just-ignored/
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
22. 22
@solarwinds
Sargability
The ability to actually *use* the index
Beware of applying functions against JOIN/WHERE columns
• WHERE DATEDIFF(DAY, Column, GETDATE()) = 0
• Not sargable – will scan – converting to a string is even worse
• WHERE Column >= @today AND Column < DATEADD(DAY, 1, @today)
• Sargable
WHERE CONVERT(DATE, Column) = @today
• Can be sargable, but there’s a catch – bad estimates – see link in notes
ORDER BY key column(s) won’t use index if it doesn’t cover
Surprising reasons why indexes might be unused
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
23. 23
@solarwinds
Creating Indexes: Rules of Thumb for CIs
• Clustered indexes are the actual physically written records
• A SELECT statement with no ORDER BY clause will return
data in the clustered index order
• 1 clustered index per table, 249 non-clustered indexes per table
• Highly (almost) recommended for every table!
• Very useful for columns sorted on GROUP BY and ORDER BY
clauses, as well as those filtered by WHERE clauses
• Place clustered indexes on the PK on high-intensity OLTP
tables
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
24. 24
@solarwinds
Creating Indexes: Rules of Thumb for NCIs
• Useful for retrieving a single record or a range of records
• Maintained in a separate structure and maintained as
changes are made to the base table
• Tend to be much narrower than the base table, so they can
locate the exact record(s) with much less I/O
• Has at least one more intermediate level than the clustered
index but are much less valuable if table doesn’t have a
clustered index
Non-clustered indexes
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
25. 25
@solarwinds
Creating Indexes: Short vs. Long Keys
• A “concatenated key” is an index with more than one
column:
• Up to 16 columns
• Up to 900 bytes
• Short keys usually perform better, from fewer I/Os.
• Concatenated keys, also called composite keys, are
evaluated from leftmost column to right (more on that later).
So be sure the columns are indexed in order from most
frequently used to least frequently used.
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
26. 26
@solarwinds
GUID vs. INT / BIGINT as Clustered Key
Constant debate – use GUID or INT for clustering key?
INT / BIGINT GUID SEQUENTIAL GUID
Pros
• Small
• Compresses well
• Fewer “bad” page splits
• Globally unique*
• Minimal hotspots
• Globally unique*
• Fewer “bad” page splits
Cons
• Can cause hotspots
• IDENTITY/SEQUENCE
issues on 2012+
• Ascending key problem
• Wide
• Compresses poorly
• Many “bad” page splits
• Heavy fragmentation
• Tough to troubleshoot
• Wide
• Compresses poorly
• Causes hotspots
• Tough to troubleshoot
* I’m not convinced – we’ll run out eventually, right?
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
27. 27
@solarwinds
GUID vs. INT / BIGINT Issues
I’m still a fan of minimalism
• Prefer INT / BIGINT mostly for memory footprint
Which would you rather troubleshoot?
• OrderID = 42
• OrderID = ‘2C0C06E1-28A0-46BF-8CC6-63BC437AB42F’
On modern hardware, performance differences can be minimal
• “Disk space is cheap” – as long as you have lots of memory too
• Modern CPUs and SSDs are fast
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
28. 28
@solarwinds
Optimize Index I/O
• Minimize the number of pages you need to read
• Narrower indexes, narrower columns, compression, columnstore
• Minimize output columns to hopefully hit non-clustered and avoid lookups
• Keep statistics up to date (so memory grants are right)
• Otherwise tempdb may be required to help sorts and joins
• Avoid sorting by columns not supported by indexes
• Adding memory can keep bigger indexes in cache, but can’t help sort
• Minimize the number of indexes, especially in write-heavy workloads
• One wide index may be better than two skinny indexes
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
29. 29
@solarwinds
Summary
• Understand the Primary Usage of Table (read vs
read/write). This determines lots of things such as fill factor,
number of indexes, and which is column clustered index.
• (Almost) Always create primary key, clustering key.
• Manually add (non-clustered) indexes to foreign key
constraints and other important columns such as join
columns.
• Most important of all is to test, analyze, and retest. Don’t be
afraid to experiment!
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
31. 31
@solarwinds
Database Performance Analyzer (DPA)
• Down-to-the-second monitoring, for both real-time and historical analysis
• Find your worst-performing applications, SQL, and tables in seconds
• Hybrid monitoring for virtualized, physical, and cloud-based DB instances
• Uncover root causes from CPU and I/O down to the SQL statement
• Visualize all wait times from resources to blocking, to host resources and
VM resources, and more
• Machine learning anomaly detection gets better over time
• Detailed blocking and deadlock analysis expose your SQL blocking
hierarchy
• Links: Data - Demo - Resource
Cross-platform performance monitoring and management for cloud and on-premises
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
33. 33
@solarwinds
Additional Resources
• Indexing
• SQL index fragmentation use case page
• SQL performance tuning use case page
• Database Performance Analyzer
• Webpage
• Datasheet
• DPA demo
• Download free DPA trial
Let us know how we can help you
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
34. 34
@solarwinds
Q&A
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
Call government or education
sales:
877.946.3751
Contact federal sales:
federalsales@solarwinds.com
Contact state and local
government sales:
governmentsales@solarwinds.com
Contact education sales:
educationsales@solarwinds.com
35. 35
@solarwinds
2021 Virtual Government and Education User Group
Wednesday, April 7, 2021
9:00:00 AM MDT - 3:00:00 PM MDT
Register here
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
36. 36
@solarwinds
Upcoming SolarWinds Webinars
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
Government and Education
Webinar: Conquering Remote
Work IT Challenges
March 23, 2021
2:00 p.m. – 3:00 p.m. ET
Register here
Government and Education
Webinar: SQL Server—Advanced
Performance Tuning
June 10, 2021
2:00 p.m. – 3:00 p.m. ET
Register here
37. 37
@solarwinds
Contact Us
• Call government sales: 877.946.3751
• Email SolarWinds federal government sales: federalsales@solarwinds.com
• Email SolarWinds state and local government sales:
governmentsales@solarwinds.com
• Email SolarWinds education sales: educationsales@solarwinds.com
• Visit our THWACK government group: http://thwack.com/government
• Watch a short demo video: http://demo.solarwinds.com/sedemo/
• Download a free trial: http://www.solarwinds.com/downloads/
• Visit our government website: http://www.solarwinds.com/government
• Follow us on LinkedIn®: https://www.linkedin.com/company/solarwinds-
government
Let us know how we can help you
© 2021 SolarWinds Worldwide, LLC. All rights reserved.
39. 39
@solarwinds
The SolarWinds, SolarWinds & Design, Orion, and THWACK
trademarks are the exclusive property of SolarWinds Worldwide,
LLC or its affiliates, are registered with the U.S. Patent and
Trademark Office, and may be registered or pending registration
in other countries. All other SolarWinds trademarks, service
marks, and logos may be common law marks or are registered or
pending registration. All other trademarks mentioned herein are
used for identification purposes only and are trademarks of (and
may be registered trademarks) of their respective companies.
Notas do Editor There are 128 pages stored in 1Mb of storage or memory.
By checking the header, SQL Server can quickly determine the status of each page, making it easy to find. For example, there’s a single bit in the header that marks whether a page is clean or dirty. Another bit in the header marks if the page has been backed up. That’s why, using these bits, SQL Server can be so fast with differential backups.
Heaps, have index_id of 0, Uses IAM and PFS to find the next place to put a record. Not optimised to find a single row, always scan.
No doubly-linked list.
Efficient for large serial data loads starting from empty
Inefficient for transaction processing: Can provoke forwarded record pointers, Forces an 8-byte RID for every inserted row (RID is file:page:slot)
Clustered index is a physically sorted table using a Binary Plus tree structure.
Contains non-leaf pages or “nodes” (Intermediate & Root in image) and leaf / data pages
Uses doubly linked lists – drive an optimization called read ahead – requests up to 512kb using lowest non-leaf level.
Note: Data is never returned from non-leaf level.
Primary means of exclusive locking on a table.
Non-clustered index is also a b-tree structure containing the clustered index data and a pointer row_id (for heaps) or clustered key (for clustered tables). Speed up SELECT statements, slow down write operations
The key and data is passed up the tree, unless the NCI is a unique nci whence it is stored at the leaf-level
Large and concatenated clustered indexes bloat ALL nci’s on a table.
Has at least one more intermediate level than the clustered index, but are much less valuable if table doesn’t have a clustered index.
RID = x:y:z where x = file number, y = page number, z = slot number http://support2.microsoft.com/kb/297861 http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx
http://sqlskills.com/BLOGS/KIMBERLY/post/The-Clustered-Index-Debate-Continues.aspx
http://www.sqlskills.com/BLOGS/KIMBERLY/post/Ever-increasing-clustering-key-the-Clustered-Index-Debateagain!.aspx
http://www.sqlskills.com/BLOGS/KIMBERLY/post/Disk-space-is-cheap.aspx
http://sqlperformance.com/2015/01/sql-indexes/bad-habits-disk-space
http://use-the-index-luke.com/blog/2014-01/unreasonable-defaults-primary-key-clustering-key http://www.sqlhammer.com/blog/sql-server-index-column-order/ MAX data types provoke implicit conversion issues when used as a search argument. FORWARDED RECORD POINTERS
In general, it is best practice to create a clustered index on narrow, static, unique, and ever-increasing columns. This is for numerous reasons. First, using an updateable column as the clustering key can be expensive, as updates to the key value could require the data to be moved to another page. This can result in slower writes and updates, and you can expect higher levels of fragmentation. Secondly, the clustered key value is used in non-clustered indexes as a pointer back into the leaf level of the clustered index. This means that the overhead of a wide clustered key is incurred in every index created.
If you don’t make your clustered index unique, it’ll have to add a 4-byte “unique-ifyer” to each row. But since it’s stored as a variable-length column it will actually consume 8-bytes of over head space. Creating a non-unique clustered index on a column with many duplicate values, perhaps on a column of date data type where you might have thousands of records with the same clustered key value, could result in a significant amount of internal overhead. Think about an index as eliminating rows, not finding rows Concatenated keys, also called composite keys, are evaluated from leftmost column to right (more on that later). So be sure the columns are indexed in order from most frequently used to least frequently used.
Also called composite indexes or composite keys.
SQL Server Index Design Guide:
https://technet.microsoft.com/en-us/library/jj835095(v=sql.110).aspx
---
Klaus shows how creating a recommended missing index does not help the query and adds overhead to the database anyway:
http://www.sqlpassion.at/archive/2015/01/20/how-i-tune-sql-server-queries/
Queries may filter on LastName, but return more columns
Can be useful to add those other columns to the index
Don’t do this blindly! Bigger indexes = costlier DML maintenance
Used to avoid lookups into heap or clustered index
Lookups are expensive – 1 read per row, more with LOBs
In the key, put most selective columns first
Think about an index as eliminating rows, not finding rows http://sqlperformance.com/2014/07/sql-indexes/new-index-columns-key-vs-include DEMO – 03.TippingPoint.sql
http://www.sqlpassion.at/archive/2013/06/12/sql-server-tipping-games-why-non-clustered-indexes-are-just-ignored/
http://www.sqlskills.com/blogs/kimberly/why-arent-those-nonclustered-indexes-being-used/
http://www.sqlskills.com/blogs/kimberly/the-tipping-point-query-answers/
http://sqlfascination.com/tag/tipping-point/
http://sqlinthewild.co.za/index.php/2009/01/09/seek-or-scan/
DEMO – 04.Sargability.sql
http://dba.stackexchange.com/questions/34047/cast-to-date-is-sargable-but-is-it-a-good-idea http://sqlperformance.com/2015/01/sql-indexes/bad-habits-disk-space
http://sqlperformance.com/2015/01/sql-indexes/bad-habits-disk-space
Thomas Kejser highly advocates GUIDs to spread out write I/O:
http://kejser.org/boosting-insert-speed-by-generating-scalable-keys/
(Which is great if all you care about is write performance – eventually you’re going to have to read and manage that data.)