Admin | How To

Installing and Using PostgreSQL Modules

In this article, we will learn how to install and use the PostgreSQL modules chkpass,
fuzzystrmatch, isn and hstore. Modules add different capabilities to a database, like
admin and monitoring tools, new data types, operators, functions and algorithms.
Let’s look at modules that add new data types and algorithms, which will help us to
push some of the application logic to the database.



PostgreSQL has been called the ‘most advanced open source database’. I have been
using it for the last four years as an RDBMS for Foodlets.in, and as a spatial data
store at CSTEP (Center for Study of Science, Technology and Policy). PostgreSQL is
one piece of software that never fails to impress me.

Installing the modules
Note: I am running Ubuntu 10.04 and PostgreSQL 8.4.

Install the postgresql-contrib package and restart the database server, then check
the contrib directory for the list of available modules:

sudo apt-get install postgresql-contrib
sudo /etc/init.d/postgresql-8.4 restart
cd /usr/share/postgresql/8.4/contrib/
ls

Create a test database called module_test:

su postgres
createdb module_test

Apply the chkpass, fuzzystrmatch, isn and hstore modules to the module_test
database by running the following commands:

psql -d module_test -f chkpass.sql
psql -d module_test -f fuzzystrmatch.sql
psql -d module_test -f isn.sql
psql -d module_test -f hstore.sql

Let us now look at an example of how each of the modules is used.

Using chkpass
The chkpass module will introduce a new data type, ‘chkpass’, in the database. This
type is used to store an encrypted field, e.g., a password. Let’s see how chkpass
works for a user account table that we create and insert two rows into:


88  |  March 2012  | LINUX For You  |  www.LinuxForU.com

CREATE TABLE accounts (username varchar(100), password chkpass);
INSERT INTO accounts(username, "password") VALUES ('user1', 'pass1');
INSERT INTO accounts(username, "password") VALUES ('user2', 'pass2');

We can authenticate users with a query like the one that follows:

SELECT count(*) FROM accounts WHERE username = 'user1' AND password = 'pass1';

The ‘=’ operator uses the eq(column_name, text) function in the module to test for
equality. Chkpass uses the Unix crypt() function, and hence it is weak; only the
first eight characters of the text are used in the algorithm. Chkpass has limited
practical use; the pgcrypto module is an effective alternative.

Using isn
This module will introduce data types to store international standard numbers like
International Standard Book Numbers (ISBN), International Standard Music Numbers
(ISMN), International Standard Serial Numbers (ISSN), Universal Product Codes (UPC),
etc. It will also add functions to validate data, type-cast numbers from older
formats to the newer 13-digit formats, and vice-versa. Let’s test this module for
storing book information:

CREATE TABLE books(number isbn13, title varchar(100));
INSERT INTO books("number", title) VALUES ('978-03', 'Rework');

The INSERT statement throws an error: Invalid input syntax for ISBN number:
“978-03”. However, this works just fine:

INSERT INTO books("number", title) VALUES ('978-0307463746', 'Rework');
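
To see roughly what such validation and 10-to-13-digit conversion involve, here is a
minimal Python sketch of the standard ISBN-13 check-digit arithmetic. The function
names are hypothetical, and the sketch checks only the length and the check digit;
it does not claim to reproduce everything the isn module verifies.

```python
def isbn13_check_digit(first12: str) -> int:
    """Return the ISBN-13 check digit for the first 12 digits."""
    # Digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(text: str) -> bool:
    """Check length and check digit only; hyphens are ignored."""
    digits = text.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978, drop the old check digit and compute the new one."""
    core = "978" + isbn10.replace("-", "")[:9]
    return core + str(isbn13_check_digit(core))


print(is_valid_isbn13("978-03"))          # False: too short
print(is_valid_isbn13("978-0307463746"))  # True
print(isbn10_to_isbn13("0307463745"))     # 9780307463746
```

The short value ‘978-03’ fails on length alone, while the full number passes because
its last digit matches the computed check digit.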

To convert a 10-digit ISBN to 13 digits, use the isbn13() function:

INSERT INTO books("number", title) VALUES (isbn13('0307463745'), 'Rework');

(Incidentally, the book mentioned here, 'Rework' by Jason Fried, happens to be my
favourite book on product/project management! I have recommended it to all my
team-mates.)

Using fuzzystrmatch
This module installs the soundex(), difference(), levenshtein() and metaphone()
functions. Soundex() and metaphone() are phonetic algorithms: they convert a text
string to a code string based on its pronunciation. Difference() and levenshtein()
return a numeric value based on the similarity of the two input strings. Let’s now
look into the levenshtein() and metaphone() functions. The Levenshtein distance
between two strings is the minimum number of insertions, deletions or substitutions
required to convert one string to another.
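
The definition above can be sketched in a few lines of Python using the textbook
dynamic-programming approach. This is only an illustration of the idea, not the
code the fuzzystrmatch module itself runs; the function and variable names are my
own.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character edits turning source into target."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("foodlets", "booklets"))  # 2
print(levenshtein("FTLTS", "FTLTS"))        # 0
```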

SELECT levenshtein('foodlets', 'booklets');

This query returns 2: substituting ‘b’ for ‘f’ and ‘k’ for ‘d’ turns ‘foodlets’ into
‘booklets’.

The metaphone() function takes a text string and the maximum length of the output
code as its two input parameters. These examples return FTLTS:

SELECT metaphone('foodlets', 6);
SELECT metaphone('fudlets', 6);

If we try to get the Levenshtein distance between the returned strings, this
returns 0:

SELECT levenshtein('FTLTS', 'FTLTS');

A distance of 0 means the two codes are identical, i.e., the two words sound alike.
Fuzzystrmatch is very helpful in implementing the search feature for a website. Now
the search can work with alternate spellings and misspelled keywords. Reminds you of
the ‘Did you mean...’ feature on Google Search, right?

Using hstore
You must have heard enough about NoSQL and key-value databases. It’s not always
NoSQL vs relational databases; with the hstore module, PostgreSQL allows you to
store data in the form of key-value pairs, within a column of a table. Imagine you
are processing spreadsheets and you have no idea about the column headers and the
data type of the data in the sheets. That’s when hstore comes to your rescue!
Incidentally, hstore takes keys and values as text; the value can be NULL, but not
the key. Let’s create a table with a column of type hstore and insert some rows:

CREATE TABLE kv_data(id integer, data hstore);
INSERT INTO kv_data VALUES
(1, hstore('name', 'amit') || hstore('city', 'bangalore')),
(2, hstore('name', 'raghu') || hstore('age', '26')),
(3, hstore('name', 'ram') || hstore('age', '28'));

You can create your own keys like ‘height’, ‘favourite_book’, etc. The ‘||’ operator
is used for concatenation. Now that we have a table and a few rows of data, let’s
look at some SELECT, UPDATE and DELETE queries. To select rows with the value for
‘city’ as ‘bangalore’, use the following query:

SELECT * FROM kv_data WHERE data->'city' = 'bangalore';

To get the average age across the table (returns 27), use the query given below:

SELECT avg((data->'age')::integer) AS age FROM kv_data;

Here, ::integer is used to type-cast the text value to an integer, so that math
operations can be performed on it. Rows that have no ‘age’ key yield NULL, which
avg() ignores.
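
The same idea can be sketched in Python, modelling each hstore value as a dict of
text keys and text values. The names rows and average_of_key are hypothetical and
exist only for this illustration.

```python
# Each dict stands in for the hstore column of one row; all values are text.
rows = [
    {"name": "amit", "city": "bangalore"},
    {"name": "raghu", "age": "26"},
    {"name": "ram", "age": "28"},
]


def average_of_key(rows, key):
    """Cast the text values for key to integers and average them.

    Rows without the key are skipped, much as avg() skips NULLs.
    """
    values = [int(row[key]) for row in rows if row.get(key) is not None]
    return sum(values) / len(values) if values else None


print(average_of_key(rows, "age"))  # 27.0
```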

To select and sort rows by ‘name’ values, use:

SELECT * FROM kv_data ORDER BY data->'name' DESC;

Update the ‘city’ value to ‘delhi’ for all rows, as follows:

UPDATE kv_data SET data = data || ('city' => 'delhi');

The ‘=>’ operator works on PostgreSQL 8.4; it was deprecated in 9.0 and removed in
9.2, where hstore('city', 'delhi') does the same job.

Then, delete the ‘age’ key (and values) from all rows, as shown below:

UPDATE kv_data SET data = delete(data, 'age');

Next, delete rows with the ‘name’ as ‘amit’:

DELETE FROM kv_data WHERE data->'name' = 'amit';
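
To make the effect of these three statements easier to follow, here is a rough
Python model that treats the table as a dict of row ids to dicts. The helper names
concat and delete_key are hypothetical; they only mimic how ‘||’ (the right-hand
value wins for a duplicate key) and delete() behave.

```python
def concat(data, other):
    """Model of data || other: keys in other override keys in data."""
    merged = dict(data)
    merged.update(other)
    return merged


def delete_key(data, key):
    """Model of delete(data, key): return a copy without that key."""
    return {k: v for k, v in data.items() if k != key}


kv_data = {
    1: {"name": "amit", "city": "bangalore"},
    2: {"name": "raghu", "age": "26"},
    3: {"name": "ram", "age": "28"},
}

# UPDATE ... SET data = data || ('city' => 'delhi')
kv_data = {row_id: concat(d, {"city": "delhi"}) for row_id, d in kv_data.items()}
# UPDATE ... SET data = delete(data, 'age')
kv_data = {row_id: delete_key(d, "age") for row_id, d in kv_data.items()}
# DELETE ... WHERE data->'name' = 'amit'
kv_data = {row_id: d for row_id, d in kv_data.items() if d.get("name") != "amit"}

print(kv_data)
```

After the three steps only rows 2 and 3 remain, each holding a ‘name’ and the new
‘city’ value.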


Although not a full-fledged key-value store, hstore does provide us with the
flexibility of a key-value database and the power of SQL queries.

Other useful modules
Here are some other modules you may find useful:
  • Pgcrypto provides functions for hashing and encryption. It supports SHA, MD5,
    Blowfish, AES and other algorithms.
  • Citext adds a case-insensitive text data type, which preserves the case of the
    stored text but ignores it in comparisons.
  • Uuid-ossp provides functions to generate universally unique identifiers.
  • Pg_trgm adds functions to find text similarity based on trigram matching.


By: Sagar Arlekar
The author is a research engineer at CSTEP, Bengaluru. He works in the domains of
GIS and agent-based simulations. He co-founded Foodlets.in, a visual food guide
built entirely on open source technologies.


