Underestimated Details of Database Indexing (and Their Impact)

Vienna System Architects Meetup
2013-12-02

© 2013 by Markus Winand

What do people know about indexing?
Results of the "3-Minute Test"
http://use-the-index-luke.com/de/3-minuten-test

5 questions: each one tests knowledge of one specific indexing technique.

3-Minute Quiz: Indexing Skills
Q1: Good or Bad? (Function use)

CREATE INDEX tbl_idx ON tbl (date_column);

SELECT text, date_column
  FROM tbl
 WHERE TO_CHAR(date_column, 'YYYY') = '2013';

...WHERE TO_CHAR(date_column, 'YYYY') = '2013';

Seq Scan on tbl (rows=365)
  Rows Removed by Filter: 49635
Total runtime: 118.796 ms

...WHERE date_column BETWEEN '2013-01-01'
                         AND '2013-12-31'

Index Scan using tbl_idx on tbl (rows=365)
Total runtime: 0.430 ms

(Above: simplified PostgreSQL execution plans when selecting 365 rows out of 50000)
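
The Q1 effect can be reproduced on a small scale. The following is a minimal sketch, not the setup from the talk: it assumes SQLite (via Python's sqlite3 module) as a stand-in for PostgreSQL, uses strftime() in place of TO_CHAR(), renames the text column to txt, and fills the table with hypothetical data (one row per day). It only illustrates that wrapping the indexed column in a function keeps the planner from using tbl_idx, while the equivalent range condition can use it.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: SQLite stands in for PostgreSQL; strftime() replaces TO_CHAR().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, txt TEXT, date_column TEXT)")

# Hypothetical data: one row per day, 5000 days starting 2005-01-01.
start = date(2005, 1, 1)
rows = [(f"row {i}", (start + timedelta(days=i)).isoformat()) for i in range(5000)]
conn.executemany("INSERT INTO tbl (txt, date_column) VALUES (?, ?)", rows)
conn.execute("CREATE INDEX tbl_idx ON tbl (date_column)")


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    return "; ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))


by_function = ("SELECT txt, date_column FROM tbl "
               "WHERE strftime('%Y', date_column) = '2013'")
by_range = ("SELECT txt, date_column FROM tbl "
            "WHERE date_column BETWEEN '2013-01-01' AND '2013-12-31'")

function_plan = plan(by_function)   # full scan: the index holds dates, not years
range_plan = plan(by_range)         # range search on tbl_idx
function_rows = conn.execute(by_function).fetchall()
range_rows = conn.execute(by_range).fetchall()

print("function:", function_plan)
print("range:   ", range_plan)
print(len(function_rows), len(range_rows))
```

Both queries return the same rows; only the access path differs. An alternative that keeps the original WHERE clause would be an index on the expression itself, where the database supports it.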

Q2: Good or Bad? (Indexed Top-N, no index-only scan)

CREATE INDEX tbl_idx ON tbl (a, date_col);

SELECT id, a, date_col
  FROM tbl
 WHERE a = ?
 ORDER BY date_col DESC
 LIMIT 1;

Understandable controversy!

This is already the optimal solution (without an index-only scan).

Limit (rows=1)
  -> Index Scan Backward using tbl_idx on tbl (rows=1)
       Index Cond: (a = 123::numeric)
Total runtime: 0.053 ms

As fast as a primary key lookup because it can never return more than one row.
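
A small experiment can make the Q2 answer tangible. This is a sketch under stated assumptions: SQLite replaces PostgreSQL, the data is hypothetical (100 values of a with 50 dates each), and id is deliberately not the primary key so that the index does not cover the query. It compares the plan before and after creating the (a, date_col) index: without the index the engine scans and sorts, with it the engine reads the index in reverse order and needs no sort step.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL. id is a plain column so the
# index (a, date_col) does not cover the query (no index-only scan).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, a INTEGER, date_col TEXT)")

# Hypothetical data: 100 values of a, 50 distinct dates each.
rows = []
for a in range(100):
    for d in range(50):
        rows.append((len(rows), a, f"2013-{1 + d // 28:02d}-{1 + d % 28:02d}"))
conn.executemany("INSERT INTO tbl (id, a, date_col) VALUES (?, ?, ?)", rows)

query = "SELECT id, a, date_col FROM tbl WHERE a = ? ORDER BY date_col DESC LIMIT 1"


def plan(sql, params):
    """Return the 'detail' column of each EXPLAIN QUERY PLAN step."""
    return [r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


plan_without_index = plan(query, (42,))   # scan plus a separate sort step
conn.execute("CREATE INDEX tbl_idx ON tbl (a, date_col)")
plan_with_index = plan(query, (42,))      # index delivers rows already ordered
latest = conn.execute(query, (42,)).fetchone()

print(plan_without_index)
print(plan_with_index)
print(latest)
```

Because the index is ordered by date_col within each value of a, the first entry read in reverse order is the answer, and LIMIT 1 stops the scan there.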

Q3: Good or Bad? (Column order)

CREATE INDEX tbl_idx ON tbl (a, b);

SELECT id, a, b FROM tbl
 WHERE a = ? AND b = ?;

SELECT id, a, b FROM tbl
 WHERE b = ?;

As-is, only one query can use the index (a, b):

...WHERE a = ? AND b = ?;

Bitmap Heap Scan on tbl (rows=6)
  -> Bitmap Index Scan on tbl_idx (rows=6)
       Index Cond: ((a = 123) AND (b = 1))
Total runtime: 0.055 ms

...WHERE b = ?;

Seq Scan on tbl (rows=5142)
  Rows Removed by Filter: 44858
Total runtime: 29.849 ms

Change the index to (b, a) so both can use it:

...WHERE a = ? AND b = ?;

Bitmap Heap Scan on tbl (rows=6)
  -> Bitmap Index Scan on tbl_idx (rows=6)
       Index Cond: ((a = 123) AND (b = 1))
Total runtime: 0.056 ms

...WHERE b = ?;

Bitmap Heap Scan on tbl (rows=5142)
  -> Bitmap Index Scan on tbl_idx (rows=5142)
       Index Cond: (b = 1::numeric)
Total runtime: 6.932 ms
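
The column-order effect from Q3 can be checked with a short script. This is a sketch that assumes SQLite in place of PostgreSQL and hypothetical data; the helper plans_for is made up for this illustration. In SQLite's plan output, SEARCH means the engine seeks into the index, while SCAN means it visits every entry.

```python
import sqlite3


def plans_for(index_columns):
    """Build tbl with one index and return the plans of both Q3 queries.

    Hypothetical helper; SQLite stands in for PostgreSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO tbl (a, b) VALUES (?, ?)",
                     [(i % 1000, i % 10) for i in range(5000)])
    conn.execute(f"CREATE INDEX tbl_idx ON tbl ({index_columns})")

    def plan(sql, params):
        steps = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
        return "; ".join(r[3] for r in steps)

    both = plan("SELECT id, a, b FROM tbl WHERE a = ? AND b = ?", (123, 1))
    b_only = plan("SELECT id, a, b FROM tbl WHERE b = ?", (1,))
    conn.close()
    return both, b_only


both_ab, b_only_ab = plans_for("a, b")   # index (a, b)
both_ba, b_only_ba = plans_for("b, a")   # index (b, a)

print("(a, b):", both_ab, "|", b_only_ab)
print("(b, a):", both_ba, "|", b_only_ba)
```

With (a, b) the b-only query cannot seek because b is not the leading column; with (b, a) both queries can, since an equality on both columns works in either order.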

Q4: Good or Bad? (Indexing LIKE)

CREATE INDEX tbl_idx ON tbl (text);

SELECT id, text
  FROM tbl
 WHERE text LIKE '%TERM%';

B-tree indexes don't support leading wildcards (patterns that start with %).

Seq Scan on tbl (rows=0)
  Rows Removed by Filter: 50000
Total runtime: 23.494 ms

Using a special purpose index (e.g. in PostgreSQL):

CREATE INDEX tbl_tgrm ON tbl
 USING gin (text gin_trgm_ops);

Bitmap Heap Scan on tbl (rows=0)
  Rows Removed by Index Recheck: 2
  -> Bitmap Index Scan on tbl_tgrm (rows=2)
       Index Cond: ((text)::text ~~ '%TERM%'::text)
Total runtime: 0.114 ms
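
Why a B-tree cannot help with '%TERM%' can be illustrated without a database. In this sketch a sorted Python list stands in for the ordered entries of a B-tree index; the function names and sample strings are hypothetical. It is a conceptual model only, not how PostgreSQL implements index scans.

```python
from bisect import bisect_left

# Assumption: a sorted list models the ordered leaf entries of a B-tree index.
index = sorted(["ALPHA", "BETA TERM", "TERM", "TERMINAL", "TERMS", "XTERM", "ZETA"])


def prefix_search(keys, prefix):
    """LIKE 'prefix%': matches are adjacent, so two binary searches find them."""
    lo = bisect_left(keys, prefix)
    # Upper bound: just past every key that starts with the prefix.
    hi = bisect_left(keys, prefix + "\U0010ffff")
    return keys[lo:hi]


def infix_search(keys, term):
    """LIKE '%term%': matches are scattered, so every key must be inspected."""
    return [k for k in keys if term in k]


prefix_hits = prefix_search(index, "TERM")
infix_hits = infix_search(index, "TERM")
print(prefix_hits)
print(infix_hits)
```

The sort order says nothing about what appears in the middle of a string, which is why the leading-wildcard query falls back to reading everything unless a different index type, such as the trigram index above, is used.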

Q5: Good or Bad? (equality vs. ranges)

CREATE INDEX tbl_idx ON tbl (date_col, state);

SELECT id, date_col, state
  FROM tbl
 WHERE date_col >= TO_DATE('2008-12-02', 'YYYY-MM-DD')
   AND state = 'X';

For the curious, the data distribution:

SELECT count(*)
  FROM tbl
 WHERE date_col >= TO_DATE('2008-12-02', 'YYYY-MM-DD');   ---> 1826

SELECT count(*)
  FROM tbl
 WHERE state = 'X';                                       ---> 10000

SELECT count(*)
  FROM tbl
 WHERE date_col >= TO_DATE('2008-12-02', 'YYYY-MM-DD')
   AND state = 'X';                                       ---> 365

CREATE INDEX tbl_idx ON tbl (date_col, state);

Index Scan using tbl_idx on tbl (rows=365)
Total runtime: 0.893 ms

CREATE INDEX tbl_idx ON tbl (state, date_col);

Index Scan using tbl_idx on tbl (rows=365)
Total runtime: 0.530 ms
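
The difference between the two column orders shows up in which conditions the engine can use to narrow the index range. The sketch below assumes SQLite instead of PostgreSQL, an empty table (only the plan is inspected), and a hypothetical helper named search_condition. With the range column first, only the date condition bounds the scanned range; with the equality column first, both conditions do.

```python
import sqlite3


def search_condition(index_columns):
    """Return the plan for the Q5 query with the given index column order.

    Hypothetical helper; SQLite stands in for PostgreSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, date_col TEXT, state TEXT)")
    conn.execute(f"CREATE INDEX tbl_idx ON tbl ({index_columns})")
    sql = ("SELECT id, date_col, state FROM tbl "
           "WHERE date_col >= ? AND state = 'X'")
    steps = conn.execute("EXPLAIN QUERY PLAN " + sql, ("2008-12-02",))
    details = "; ".join(r[3] for r in steps)
    conn.close()
    return details


range_first = search_condition("date_col, state")     # range column leads
equality_first = search_condition("state, date_col")  # equality column leads

print(range_first)
print(equality_first)
```

Read against the distribution above, this is consistent with the range-first index having to walk all 1826 entries from the start date onward and filter on state, while the equality-first index can jump straight to the 365 matching entries.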

Indexes: The Neglected All-Rounder
Everybody knows indexing is important for performance, yet nobody takes the time to learn and apply it properly.

About Markus Winand
Tuning developers for high SQL performance
Training & co (one-man show): winand.at
Author of: SQL Performance Explained
Geeky blog: use-the-index-luke.com

SQL Performance - Vienna System Architects Meetup 20131202
