SlideShare uma empresa Scribd logo
1 de 27
PostgreSQL
It’s kind’ve a nifty database
How do you pronounce it?
Thanks to Thad for the suggestion
Answer Response Percentage
post-gres-q-l 2379 45%
post-gres 1611 30%
pahst-grey 24 0%
pg-sequel 50 0%
post-gree 350 6%
postgres-sequel 574 10%
p-g 49 0%
database 230 4%
Total 5267
Who is this guy?
• I’m Barry Jones
• Been a developing web apps since ’98
• Not a DBA
• Performance and infrastructure nut
• Pragmatic Idealist
– Prefer the best solution possible for current circumstances vs the best solution
possible at all costs
What is PostgreSQL NOT?
• NOT a silver bullet
• NOT the answer to life, the universe and
everything
• NOT better at everything than everybody
• NOT always the best option for your needs
• NOT used to it’s potential in most cases
• NOT owned by Oracle or Microsoft
What IS PostgreSQL?
• Fully ACID compliant
• Feature rich and extensible
• Fast, scalable and leverages multicore
processors very well
• Enterprise class with quality corporate
support options
• Free as in beer
• It’s kind’ve nifty
Laundry List of Features
• Multi-version Concurrency Control (MVCC)
• Point in Time Recovery
• Tablespaces
• Asynchronous replication
• Nested Transactions
• Online/hot backups
• Genetic query optimizer multiple index types
• Write ahead logging (WAL)
• Internationalization: character sets, locale-aware sorting, case sensitivity,
formatting
• Full subquery support
• Multiple index scans per query
• ANSI-SQL:2008 standard conformant
• Table inheritance
• LISTEN / NOTIFY event system
• Ability to make a Power Point slide run out of room
What are we covering today?
• Full text-search
• Built in data types
• User defined data types
• Automatic data compression
• A look at some other cool features and
extensions, depending how we’re doing on
time
Full-text Search
• What about…?
– Solr
– Elastic Search
– Sphinx
– Lucene
– MySQL
• All have their purpose
– Distributed search of multiple document types
• Sphinx
– Client search performance is all that matters
• Solr
– Search constantly incoming data with
streaming index updates
• Elastic Search excels
– You really like Java
• Lucene
– You want terrible search results that don’t even
make sense to you much less your users
• MySQL full text search = the worst thing in the world
Full-text Search
• Complications of stand alone search engines
– Data synchronization
• Managing deltas, index updates
• Filtering/deleting/hiding expired data
• Search server outages, redundancy
– Learning curve
– Character sets match up with my database?
– Additional hardware / servers just for search
– Can feel like a black box when you get a support
question asking “why is/isn’t this showing up?”
Full-text Search
• But what if your needs are more like:
– Search within my database
– Avoid syncing data with outside systems
– Avoid maintaining outside systems
– Less black box, more control
Full-text Search
• tsvector
– The text to be searched
• tsquery
– The search query
• to_tsvector(„the church is AWESOME‟) @@ to_tsquery(SEARCH)
• @@ to_tsquery(„church‟) == true
• @@ to_tsquery(„churches‟) == true
• @@ to_tsquery(„awesome‟) == true
• @@ to_tsquery(„the‟) == false
• @@ to_tsquery(„churches & awesome‟) == true
• @@ to_tsquery(„church & okay‟) == false
• to_tsvector(„the church is awesome‟)
– 'awesom':4 'church':2
• to_tsvector(„simple‟,‟the church is awesome‟)
– 'are':3 'awesome':4 'church':2 'the':1
Full-text Search
• ALTER TABLE mytable ADD COLUMN search_vector tsvector
• UPDATE mytable
SET search_vector = to_tsvector(„english‟,coalesce(title,‟‟) || „ „ ||
coalesce(body,‟‟) || „ „ || coalesce(tags,‟‟))
• CREATE INDEX search_text ON mytable USING gin(search_vector)
• SELECT some, columns, we, need
FROM mytable
WHERE search_vector @@ to_tsquery(„english‟,„Jesus & awesome‟)
ORDER BY ts_rank(search_vector,to_tsquery(„english‟,„Jesus & awesome‟))
DESC
• CREATE TRIGGER search_update BEFORE INSERT OR UPDATE
ON mytable FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(search_vector, ‟english‟, title, body, tags)
Full-text Search
• CREATE FUNCTION search_trigger RETURNS trigger AS $$
begin
new.search_vector :=
setweight(to_tsvector(„english‟,coalesce(new.title,‟‟)),‟A‟) ||
setweight(to_tsvector(„english‟,coalesce(new.body,‟‟)),‟D‟) ||
setweight(to_tsvector(„english‟,coalesce(new.tags,‟‟)),‟B‟);
return new;
end
$$ LANGUAGE plpgsql;
• CREATE TRIGGER search_vector_update
BEFORE INSERT OR UPDATE OF title, body, tags ON mytable
FOR EACH ROW EXECUTE PROCEDURE search_trigger();
Full-text Search
• A variety of dictionaries
– Various Languages
– Thesaurus
– Snowball, Stem, Ispell, Synonym
– Write your own
• ts_headline
– Snippet extraction and highlighting
Datatypes: ranges
• int4range, int8range, numrange, tsrange, tstzrange, daterange
• SELECT int4range(10,20) @> 3 == false
• SELECT numrange(11.1,22.2) && numrange(20.0,30.0) == true
• SELECT int4range(10,20) * int4range(15,25) == 15-20
• CREATE INDEX res_index ON schedule USING gist(during)
• ALTER TABLE schedule ADD EXCLUDE USING gist (during WITH &&)
ERROR: conflicting key value violates exclusion constraint
”schedule_during_excl”
DETAIL: Key (during)=([ 2010-01-01 14:45:00, 2010-01-01
15:45:00 )) conflicts with existing key (during)=([ 2010-01-01
14:30:00, 2010-01-01 15:30:00 )).
Datatypes: hstore
• properties
– {“author” => “John Grisham”, “pages” => 535}
– {“director” => “Jon Favreau”, “runtime” = 126}
• SELECT … FROM mytable
WHERE properties -> „director‟ LIKE „%Favreau‟
– Does not use an index
• WHERE properties @> („author‟ LIKE “%Grisham”)
– Uses an index to only check properties with an „author‟
• CREATE INDEX table_properties ON mytable USING gin(properties)
Datatypes: arrays
• CREATE TABLE sal_emp(name text, pay_by_quarter integer[],
schedule text[][])
• CREATE TABLE tictactoe ( squares integer[3][3] )
• INSERT INTO tictactoe VALUES („{{1,2,3},{4,5,6},{7,8,9}}‟)
• SELECT squares[1:2][1:1] == {{1},{4}}
• SELECT squares[2:3][2:3] == {{5,6},{8,9}}
Datatypes: JSON
• Validate JSON structure
• Convert row to JSON
• Functions and operators very similar to hstore
Datatypes: XML
• Validates well-formed XML
• Stores like a TEXT field
• XML operations like Xpath
• Can’t index XML column but you can index the
result of an Xpath function
Data compression with TOAST
• TOAST = The Oversized Attribute Storage Technique
• TOASTable data is automatically TOASTed
• Example:
– stored a 2.2m XML document
– storage size was 81k
User created datatypes
• Built in types
– Numerics, monetary, binary, time, date, interval, boolean,
enumerated, geometric, network address, bit string, text search, UUID,
XML, JSON, array, composite, range
– Add-ons for more such as UPC, ISBN and more
• Create your own types
– Address (contains 2 streets, city, state, zip, country)
– Define how your datatype is indexed
– GIN and GiST indexes are used by custom datatypes
Further exploration: PostGIS
• Adds Geographic datatypes
• Distance, area, union, intersection, perimeter
• Spatial indexes
• Tools to load available geographic data
• Distance, Within, Overlaps, Touches, Equals,
Contains, Crosses
• SELECT name, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = „Broad St‟
• SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(geom,
ST_GeomFromText(„POINT(583571 4506714)‟,26918)
• SELECT sub.name, nh.name, nh.borough
FROM nyc_neighborhoods AS nh
JOIN nyc_subway_stations AS sub
ON ST_Contains(nh.geom, sub.geom)
WHERE sub.name = „Broad St”
Further exploration: Functions
• Can be used in queries
• Can be used in stored procedures and triggers
• Can be used to build indexes
• Can be used as table defaults
• Can be written in PL/pgSQL, PL/Tcl, PL/Perl,
PL/Python out of the box
• PL/V8 is available an an extension to use
Javascript
Further exploration: PLV8
• CREATE OR REPLACE FUNCTION plv8_test(keys text[], vals text[])
RETURNS text AS $$
var o = {};
for(var i = 0; i < keys.length; i++) {
o[keys[i]] = vals[i];
}
return JSON.stringify(o);
$$ LANGUAGE plv8 IMMUTABLE STRICT;
SELECT plv8_test(ARRAY[„name‟,‟age‟],ARRAY[„Tom‟,‟29‟]);
• CREATE TYPE rec AS (i integer, t text);
CREATE FUNCTION set_of_records RETURNS SETOF rec AS $$
plv8.return_next({“i”: 1,”t”: ”a”});
plv8.return_next({“i”: 2,”t”: “b”});
$$ LANGUAGE plv8;
SELECT * FROM set_of_records();
Further exploration: Async commands
/ indexes
• Fine grained control within functions
– PQsendQuery
– PQsendQueryParams
– PQsendPrepare
– PQsendQueryPrepared
– PQsendDescribePrepared
– PQgetResult
– PQconsumeInput
• Per connection asynchronous commits
– set synchronous_commit = off
• Concurrent index creation to avoid blocking large tables
– CREATE INDEX CONCURRENTLY big_index ON mytable (things)
Thanks!
References / Credits
• NOTE: Some code samples in this presentation have minor
alterations for presentation clarity (such as leaving out
dictionary specifications on some search calls, etc)
• http://www.postgresql.org/docs/9.2/static/index.html
• http://workshops.opengeo.org/postgis-intro/
• http://stackoverflow.com/questions/15983152/how-can-i-find-out-
how-big-a-large-text-field-is-in-postgres
• https://devcenter.heroku.com/articles/heroku-postgres-
extensions-postgis-full-text-search
• http://railscasts.com/episodes/345-hstore?view=asciicast
• http://www.slideshare.net/billkarwin/full-text-search-in-
postgresql

Mais conteúdo relacionado

Mais procurados

Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesRahul Singh
 
Drupal meets PostgreSQL for DrupalCamp MSK 2014
Drupal meets PostgreSQL for DrupalCamp MSK 2014Drupal meets PostgreSQL for DrupalCamp MSK 2014
Drupal meets PostgreSQL for DrupalCamp MSK 2014Kate Marshalkina
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - DenverJon Haddad
 
Rupy2012 ArangoDB Workshop Part2
Rupy2012 ArangoDB Workshop Part2Rupy2012 ArangoDB Workshop Part2
Rupy2012 ArangoDB Workshop Part2ArangoDB Database
 
ElasticSearch AJUG 2013
ElasticSearch AJUG 2013ElasticSearch AJUG 2013
ElasticSearch AJUG 2013Roy Russo
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search medcl
 
Fluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableFluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableShu Ting Tseng
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.Jurriaan Persyn
 
Building a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchBuilding a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchMark Greene
 
Dcm#8 elastic search
Dcm#8  elastic searchDcm#8  elastic search
Dcm#8 elastic searchIvan Wallarm
 
Modernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchModernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchTaylor Lovett
 
Elastic Search
Elastic SearchElastic Search
Elastic SearchNavule Rao
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiRobert Calcavecchia
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swaggerTony Tam
 
High-Performance Hibernate Devoxx France 2016
High-Performance Hibernate Devoxx France 2016High-Performance Hibernate Devoxx France 2016
High-Performance Hibernate Devoxx France 2016Vlad Mihalcea
 
Why Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, ClouderaWhy Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, ClouderaLucidworks
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchSperasoft
 
APRICOT 2015 - NetConf for Peering Automation
APRICOT 2015 - NetConf for Peering AutomationAPRICOT 2015 - NetConf for Peering Automation
APRICOT 2015 - NetConf for Peering AutomationTom Paseka
 

Mais procurados (20)

Apache Jackrabbit
Apache JackrabbitApache Jackrabbit
Apache Jackrabbit
 
Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source Technologies
 
Drupal meets PostgreSQL for DrupalCamp MSK 2014
Drupal meets PostgreSQL for DrupalCamp MSK 2014Drupal meets PostgreSQL for DrupalCamp MSK 2014
Drupal meets PostgreSQL for DrupalCamp MSK 2014
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - Denver
 
Rupy2012 ArangoDB Workshop Part2
Rupy2012 ArangoDB Workshop Part2Rupy2012 ArangoDB Workshop Part2
Rupy2012 ArangoDB Workshop Part2
 
ElasticSearch AJUG 2013
ElasticSearch AJUG 2013ElasticSearch AJUG 2013
ElasticSearch AJUG 2013
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
 
Fluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableFluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, Scalable
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
 
Building a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchBuilding a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearch
 
Dcm#8 elastic search
Dcm#8  elastic searchDcm#8  elastic search
Dcm#8 elastic search
 
Modernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchModernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with Elasticsearch
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swagger
 
High-Performance Hibernate Devoxx France 2016
High-Performance Hibernate Devoxx France 2016High-Performance Hibernate Devoxx France 2016
High-Performance Hibernate Devoxx France 2016
 
Why Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, ClouderaWhy Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, Cloudera
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
APRICOT 2015 - NetConf for Peering Automation
APRICOT 2015 - NetConf for Peering AutomationAPRICOT 2015 - NetConf for Peering Automation
APRICOT 2015 - NetConf for Peering Automation
 

Semelhante a PostgreSQL - It's kind've a nifty database

MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to SphinxMYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to SphinxPythian
 
Transforming WordPress Search and Query Performance with Elasticsearch
Transforming WordPress Search and Query Performance with Elasticsearch Transforming WordPress Search and Query Performance with Elasticsearch
Transforming WordPress Search and Query Performance with Elasticsearch Taylor Lovett
 
PostgreSQL 9.0 & The Future
PostgreSQL 9.0 & The FuturePostgreSQL 9.0 & The Future
PostgreSQL 9.0 & The FutureAaron Thul
 
Elasticsearch - Scalability and Multitenancy
Elasticsearch - Scalability and MultitenancyElasticsearch - Scalability and Multitenancy
Elasticsearch - Scalability and MultitenancyBozhidar Bozhanov
 
You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)Timon Vonk
 
Wordpress search-elasticsearch
Wordpress search-elasticsearchWordpress search-elasticsearch
Wordpress search-elasticsearchTaylor Lovett
 
Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...
Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...
Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...Continuent
 
Modernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchModernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchTaylor Lovett
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life琛琳 饶
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Michael Rys
 
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...Yandex
 
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander KorotkovPostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander KorotkovNikolay Samokhvalov
 
GreenDao Introduction
GreenDao IntroductionGreenDao Introduction
GreenDao IntroductionBooch Lin
 
Turning a Search Engine into a Relational Database
Turning a Search Engine into a Relational DatabaseTurning a Search Engine into a Relational Database
Turning a Search Engine into a Relational DatabaseMatthias Wahl
 
3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sqlŁukasz Grala
 

Semelhante a PostgreSQL - It's kind've a nifty database (20)

MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to SphinxMYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
 
PostgreSQL
PostgreSQLPostgreSQL
PostgreSQL
 
Transforming WordPress Search and Query Performance with Elasticsearch
Transforming WordPress Search and Query Performance with Elasticsearch Transforming WordPress Search and Query Performance with Elasticsearch
Transforming WordPress Search and Query Performance with Elasticsearch
 
Full Text Search In PostgreSQL
Full Text Search In PostgreSQLFull Text Search In PostgreSQL
Full Text Search In PostgreSQL
 
PostgreSQL 9.0 & The Future
PostgreSQL 9.0 & The FuturePostgreSQL 9.0 & The Future
PostgreSQL 9.0 & The Future
 
Mathias test
Mathias testMathias test
Mathias test
 
Elasticsearch - Scalability and Multitenancy
Elasticsearch - Scalability and MultitenancyElasticsearch - Scalability and Multitenancy
Elasticsearch - Scalability and Multitenancy
 
You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)You're not using ElasticSearch (outdated)
You're not using ElasticSearch (outdated)
 
Wordpress search-elasticsearch
Wordpress search-elasticsearchWordpress search-elasticsearch
Wordpress search-elasticsearch
 
Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...
Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...
Webinar Slides: Tungsten Replicator for Elasticsearch - Real-time data loadin...
 
Modernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchModernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with Elasticsearch
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
 
ORM Pink Unicorns
ORM Pink UnicornsORM Pink Unicorns
ORM Pink Unicorns
 
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
Типы данных JSONb, соответствующие индексы и модуль jsquery – Олег Бартунов, ...
 
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander KorotkovPostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
PostgreSQL Moscow Meetup - September 2014 - Oleg Bartunov and Alexander Korotkov
 
GreenDao Introduction
GreenDao IntroductionGreenDao Introduction
GreenDao Introduction
 
Oracle by Muhammad Iqbal
Oracle by Muhammad IqbalOracle by Muhammad Iqbal
Oracle by Muhammad Iqbal
 
Turning a Search Engine into a Relational Database
Turning a Search Engine into a Relational DatabaseTurning a Search Engine into a Relational Database
Turning a Search Engine into a Relational Database
 
3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql3 CityNetConf - sql+c#=u-sql
3 CityNetConf - sql+c#=u-sql
 

Mais de Barry Jones

Repeating History...On Purpose...with Elixir
Repeating History...On Purpose...with ElixirRepeating History...On Purpose...with Elixir
Repeating History...On Purpose...with ElixirBarry Jones
 
Go from a PHP Perspective
Go from a PHP PerspectiveGo from a PHP Perspective
Go from a PHP PerspectiveBarry Jones
 
Day 1 - Intro to Ruby
Day 1 - Intro to RubyDay 1 - Intro to Ruby
Day 1 - Intro to RubyBarry Jones
 
Protecting Users from Fraud
Protecting Users from FraudProtecting Users from Fraud
Protecting Users from FraudBarry Jones
 
AWS re:Invent 2013 Recap
AWS re:Invent 2013 RecapAWS re:Invent 2013 Recap
AWS re:Invent 2013 RecapBarry Jones
 
Pair Programming - the lightning talk
Pair Programming - the lightning talkPair Programming - the lightning talk
Pair Programming - the lightning talkBarry Jones
 
What's the "right" PHP Framework?
What's the "right" PHP Framework?What's the "right" PHP Framework?
What's the "right" PHP Framework?Barry Jones
 
Exploring Ruby on Rails and PostgreSQL
Exploring Ruby on Rails and PostgreSQLExploring Ruby on Rails and PostgreSQL
Exploring Ruby on Rails and PostgreSQLBarry Jones
 

Mais de Barry Jones (10)

Repeating History...On Purpose...with Elixir
Repeating History...On Purpose...with ElixirRepeating History...On Purpose...with Elixir
Repeating History...On Purpose...with Elixir
 
Go from a PHP Perspective
Go from a PHP PerspectiveGo from a PHP Perspective
Go from a PHP Perspective
 
Day 8 - jRuby
Day 8 - jRubyDay 8 - jRuby
Day 8 - jRuby
 
Day 6 - PostGIS
Day 6 - PostGISDay 6 - PostGIS
Day 6 - PostGIS
 
Day 1 - Intro to Ruby
Day 1 - Intro to RubyDay 1 - Intro to Ruby
Day 1 - Intro to Ruby
 
Protecting Users from Fraud
Protecting Users from FraudProtecting Users from Fraud
Protecting Users from Fraud
 
AWS re:Invent 2013 Recap
AWS re:Invent 2013 RecapAWS re:Invent 2013 Recap
AWS re:Invent 2013 Recap
 
Pair Programming - the lightning talk
Pair Programming - the lightning talkPair Programming - the lightning talk
Pair Programming - the lightning talk
 
What's the "right" PHP Framework?
What's the "right" PHP Framework?What's the "right" PHP Framework?
What's the "right" PHP Framework?
 
Exploring Ruby on Rails and PostgreSQL
Exploring Ruby on Rails and PostgreSQLExploring Ruby on Rails and PostgreSQL
Exploring Ruby on Rails and PostgreSQL
 

Último

Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 

Último (20)

Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

PostgreSQL - It's kind've a nifty database

  • 2. How do you pronounce it? Thanks to Thad for the suggestion Answer Response Percentage post-gres-q-l 2379 45% post-gres 1611 30% pahst-grey 24 0% pg-sequel 50 0% post-gree 350 6% postgres-sequel 574 10% p-g 49 0% database 230 4% Total 5267
  • 3. Who is this guy? • I’m Barry Jones • Been a developing web apps since ’98 • Not a DBA • Performance and infrastructure nut • Pragmatic Idealist – Prefer the best solution possible for current circumstances vs the best solution possible at all costs
  • 4. What is PostgreSQL NOT? • NOT a silver bullet • NOT the answer to life, the universe and everything • NOT better at everything than everybody • NOT always the best option for your needs • NOT used to it’s potential in most cases • NOT owned by Oracle or Microsoft
  • 5. What IS PostgreSQL? • Fully ACID compliant • Feature rich and extensible • Fast, scalable and leverages multicore processors very well • Enterprise class with quality corporate support options • Free as in beer • It’s kind’ve nifty
  • 6. Laundry List of Features • Multi-version Concurrency Control (MVCC) • Point in Time Recovery • Tablespaces • Asynchronous replication • Nested Transactions • Online/hot backups • Genetic query optimizer multiple index types • Write ahead logging (WAL) • Internationalization: character sets, locale-aware sorting, case sensitivity, formatting • Full subquery support • Multiple index scans per query • ANSI-SQL:2008 standard conformant • Table inheritance • LISTEN / NOTIFY event system • Ability to make a Power Point slide run out of room
  • 7. What are we covering today? • Full text-search • Built in data types • User defined data types • Automatic data compression • A look at some other cool features and extensions, depending how we’re doing on time
  • 8. Full-text Search • What about…? – Solr – Elastic Search – Sphinx – Lucene – MySQL • All have their purpose – Distributed search of multiple document types • Sphinx – Client search performance is all that matters • Solr – Search constantly incoming data with streaming index updates • Elastic Search excels – You really like Java • Lucene – You want terrible search results that don’t even make sense to you much less your users • MySQL full text search = the worst thing in the world
  • 9. Full-text Search • Complications of stand alone search engines – Data synchronization • Managing deltas, index updates • Filtering/deleting/hiding expired data • Search server outages, redundancy – Learning curve – Character sets match up with my database? – Additional hardware / servers just for search – Can feel like a black box when you get a support question asking “why is/isn’t this showing up?”
  • 10. Full-text Search • But what if your needs are more like: – Search within my database – Avoid syncing data with outside systems – Avoid maintaining outside systems – Less black box, more control
  • 11. Full-text Search • tsvector – The text to be searched • tsquery – The search query • to_tsvector(„the church is AWESOME‟) @@ to_tsquery(SEARCH) • @@ to_tsquery(„church‟) == true • @@ to_tsquery(„churches‟) == true • @@ to_tsquery(„awesome‟) == true • @@ to_tsquery(„the‟) == false • @@ to_tsquery(„churches & awesome‟) == true • @@ to_tsquery(„church & okay‟) == false • to_tsvector(„the church is awesome‟) – 'awesom':4 'church':2 • to_tsvector(„simple‟,‟the church is awesome‟) – 'are':3 'awesome':4 'church':2 'the':1
  • 12. Full-text Search • ALTER TABLE mytable ADD COLUMN search_vector tsvector • UPDATE mytable SET search_vector = to_tsvector(„english‟,coalesce(title,‟‟) || „ „ || coalesce(body,‟‟) || „ „ || coalesce(tags,‟‟)) • CREATE INDEX search_text ON mytable USING gin(search_vector) • SELECT some, columns, we, need FROM mytable WHERE search_vector @@ to_tsquery(„english‟,„Jesus & awesome‟) ORDER BY ts_rank(search_vector,to_tsquery(„english‟,„Jesus & awesome‟)) DESC • CREATE TRIGGER search_update BEFORE INSERT OR UPDATE ON mytable FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(search_vector, ‟english‟, title, body, tags)
  • 13. Full-text Search • CREATE FUNCTION search_trigger RETURNS trigger AS $$ begin new.search_vector := setweight(to_tsvector(„english‟,coalesce(new.title,‟‟)),‟A‟) || setweight(to_tsvector(„english‟,coalesce(new.body,‟‟)),‟D‟) || setweight(to_tsvector(„english‟,coalesce(new.tags,‟‟)),‟B‟); return new; end $$ LANGUAGE plpgsql; • CREATE TRIGGER search_vector_update BEFORE INSERT OR UPDATE OF title, body, tags ON mytable FOR EACH ROW EXECUTE PROCEDURE search_trigger();
  • 14. Full-text Search • A variety of dictionaries – Various Languages – Thesaurus – Snowball, Stem, Ispell, Synonym – Write your own • ts_headline – Snippet extraction and highlighting
  • 15. Datatypes: ranges • int4range, int8range, numrange, tsrange, tstzrange, daterange • SELECT int4range(10,20) @> 3 == false • SELECT numrange(11.1,22.2) && numrange(20.0,30.0) == true • SELECT int4range(10,20) * int4range(15,25) == 15-20 • CREATE INDEX res_index ON schedule USING gist(during) • ALTER TABLE schedule ADD EXCLUDE USING gist (during WITH &&) ERROR: conflicting key value violates exclusion constraint ”schedule_during_excl” DETAIL: Key (during)=([ 2010-01-01 14:45:00, 2010-01-01 15:45:00 )) conflicts with existing key (during)=([ 2010-01-01 14:30:00, 2010-01-01 15:30:00 )).
  • 16. Datatypes: hstore • properties – {“author” => “John Grisham”, “pages” => 535} – {“director” => “Jon Favreau”, “runtime” = 126} • SELECT … FROM mytable WHERE properties -> „director‟ LIKE „%Favreau‟ – Does not use an index • WHERE properties @> („author‟ LIKE “%Grisham”) – Uses an index to only check properties with an „author‟ • CREATE INDEX table_properties ON mytable USING gin(properties)
  • 17. Datatypes: arrays • CREATE TABLE sal_emp(name text, pay_by_quarter integer[], schedule text[][]) • CREATE TABLE tictactoe ( squares integer[3][3] ) • INSERT INTO tictactoe VALUES („{{1,2,3},{4,5,6},{7,8,9}}‟) • SELECT squares[1:2][1:1] == {{1},{4}} • SELECT squares[2:3][2:3] == {{5,6},{8,9}}
  • 18. Datatypes: JSON • Validate JSON structure • Convert row to JSON • Functions and operators very similar to hstore
  • 19. Datatypes: XML • Validates well-formed XML • Stores like a TEXT field • XML operations like Xpath • Can’t index XML column but you can index the result of an Xpath function
  • 20. Data compression with TOAST • TOAST = The Oversized Attribute Storage Technique • TOASTable data is automatically TOASTed • Example: – stored a 2.2m XML document – storage size was 81k
  • 21. User created datatypes • Built in types – Numerics, monetary, binary, time, date, interval, boolean, enumerated, geometric, network address, bit string, text search, UUID, XML, JSON, array, composite, range – Add-ons for more such as UPC, ISBN and more • Create your own types – Address (contains 2 streets, city, state, zip, country) – Define how your datatype is indexed – GIN and GiST indexes are used by custom datatypes
  • 22. Further exploration: PostGIS • Adds Geographic datatypes • Distance, area, union, intersection, perimeter • Spatial indexes • Tools to load available geographic data • Distance, Within, Overlaps, Touches, Equals, Contains, Crosses • SELECT name, ST_AsText(geom) FROM nyc_subway_stations WHERE name = „Broad St‟ • SELECT name, boroname FROM nyc_neighborhoods WHERE ST_Intersects(geom, ST_GeomFromText(„POINT(583571 4506714)‟,26918) • SELECT sub.name, nh.name, nh.borough FROM nyc_neighborhoods AS nh JOIN nyc_subway_stations AS sub ON ST_Contains(nh.geom, sub.geom) WHERE sub.name = „Broad St”
  • 23. Further exploration: Functions • Can be used in queries • Can be used in stored procedures and triggers • Can be used to build indexes • Can be used as table defaults • Can be written in PL/pgSQL, PL/Tcl, PL/Perl, PL/Python out of the box • PL/V8 is available an an extension to use Javascript
  • 24. Further exploration: PLV8 • CREATE OR REPLACE FUNCTION plv8_test(keys text[], vals text[]) RETURNS text AS $$ var o = {}; for(var i = 0; i < keys.length; i++) { o[keys[i]] = vals[i]; } return JSON.stringify(o); $$ LANGUAGE plv8 IMMUTABLE STRICT; SELECT plv8_test(ARRAY[„name‟,‟age‟],ARRAY[„Tom‟,‟29‟]); • CREATE TYPE rec AS (i integer, t text); CREATE FUNCTION set_of_records RETURNS SETOF rec AS $$ plv8.return_next({“i”: 1,”t”: ”a”}); plv8.return_next({“i”: 2,”t”: “b”}); $$ LANGUAGE plv8; SELECT * FROM set_of_records();
  • 25. Further exploration: Async commands / indexes • Fine grained control within functions – PQsendQuery – PQsendQueryParams – PQsendPrepare – PQsendQueryPrepared – PQsendDescribePrepared – PQgetResult – PQconsumeInput • Per connection asynchronous commits – set synchronous_commit = off • Concurrent index creation to avoid blocking large tables – CREATE INDEX CONCURRENTLY big_index ON mytable (things)
  • 27. References / Credits • NOTE: Some code samples in this presentation have minor alterations for presentation clarity (such as leaving out dictionary specifications on some search calls, etc) • http://www.postgresql.org/docs/9.2/static/index.html • http://workshops.opengeo.org/postgis-intro/ • http://stackoverflow.com/questions/15983152/how-can-i-find-out- how-big-a-large-text-field-is-in-postgres • https://devcenter.heroku.com/articles/heroku-postgres- extensions-postgis-full-text-search • http://railscasts.com/episodes/345-hstore?view=asciicast • http://www.slideshare.net/billkarwin/full-text-search-in- postgresql