The document compares INT and GUID data types for primary keys. INTs perform better in joins and indexes, while GUIDs are unique across systems and make asynchronous architectures easier. A GUID8 data type is proposed that combines the benefits of both: a semi-sequential 8-byte number built from a date-time stamp and a random number, which makes duplicates very unlikely. Code samples are provided to generate GUID8 values in SQL and PostgreSQL.
2. INT
ADVANTAGE
Performs better in joins, indexes and conditions.
Numeric values are easier for application users to read when they
are displayed.
Widely used for serially incrementing key values.
Needs less storage space.
3. GUID
ADVANTAGE
Unique across servers.
Client-side generated. With a GUID, the client application can
generate a new value and send it to the server. It does not need
to wait until the SAVE function returns to know what the ID is.
Consolidation and synchronization. Suppose you have a Customer table
in 5 different databases and you want to build a data warehouse: no
problem, the records can keep their keys.
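As a minimal sketch of the last two points, the Python below generates keys on the client with the standard `uuid` module and then merges rows from two sources. The `new_customer` helper and the dictionaries standing in for databases are hypothetical, invented only for illustration.

```python
import uuid

def new_customer(name):
    """Build a customer row whose key is generated on the client (hypothetical helper)."""
    return {"id": uuid.uuid4(), "name": name}

# Two independent "databases", modeled here as plain dicts keyed by ID.
db_a = {}
db_b = {}
for db, names in ((db_a, ["Ana", "Luis"]), (db_b, ["Marta", "Pedro"])):
    for name in names:
        row = new_customer(name)   # the ID is known before any save happens
        db[row["id"]] = row

# Consolidation: every row keeps its key, and no key overwrites another.
warehouse = {**db_a, **db_b}
print(len(warehouse))
```

With serial INT keys, both sources would typically start at 1 and the merge would need a key remapping step.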
4. GUID
ANATOMY
A GUID is most commonly shown as text: a sequence of
hexadecimal digits separated into five groups, such as:
{3F2504E0-4F89-41D3-9A0C-0305E82C3301}
This text notation contains the following fields, separated by hyphens:
Hex digits   Description
8            Data1
4            Data2
4            Data3
4            Initial two bytes from Data4
12           Remaining six bytes from Data4
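The field layout can be checked with a short Python sketch that splits the sample GUID into its groups and parses it with the standard `uuid` module. The variable names are illustrative only.

```python
import uuid

text = "{3F2504E0-4F89-41D3-9A0C-0305E82C3301}"

# Split the text form into its five hyphen-separated groups.
groups = text.strip("{}").split("-")
group_lengths = [len(g) for g in groups]   # hex digits per group

# Data4 is 8 bytes long: the fourth and fifth groups together.
data4 = bytes.fromhex(groups[3] + groups[4])

# The standard library parses the same notation, braces included.
guid = uuid.UUID(text)

print(group_lengths)      # [8, 4, 4, 4, 12]
print(len(guid.bytes))    # 16
```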
5. INT, BIGINT & GUID
SPACE
INT (4 bytes or 32 bits)
++ Native and faster to manage on older PCs.
++ Less space for indexing.
BIGINT (8 bytes or 64 bits)
++ Native and faster to manage on newer PCs.
+ Moderate space for indexing.
GUID (16 bytes or 128 bits)
+ Practical without performance hits on newer PCs.
- Fragmentation hits on indexing.
- More space for storing.
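To make the sizes concrete, this small Python sketch measures the three key widths and the raw key storage for a hypothetical table of one million rows. It counts key bytes only and ignores any index or row overhead a real database adds.

```python
import struct
import uuid

int_size = struct.calcsize("<i")      # 32-bit signed integer
bigint_size = struct.calcsize("<q")   # 64-bit signed integer
guid_size = len(uuid.uuid4().bytes)   # raw 128-bit GUID

rows = 1_000_000                      # hypothetical table size
raw_key_bytes = {
    "INT": int_size * rows,
    "BIGINT": bigint_size * rows,
    "GUID": guid_size * rows,
}

print(int_size, bigint_size, guid_size)   # 4 8 16
print(raw_key_bytes)
```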
6. Cast BIGINT as GUID8
THE BEST OF BOTH WORLDS
Practically unique for most database systems, which makes
integration with replication easier.
Semi-random, generated on the client side or the server side.
Semi-sequential, based on a date-time stamp.
No fragmentation hits on indexes.
7. GUID8
ANATOMY
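HI INT (upper 4 bytes)
Holds the seconds elapsed since Jan 1, 2000. Keeping the time stamp in the
high-order bytes is what makes the values sort roughly by creation time.
LOW INT (lower 4 bytes)
Holds a random number.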
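A client-side generator for this layout might look like the Python sketch below. It places the time stamp in the high-order 4 bytes, matching the `<< 32` shift in the deck's SQL generation code, so that values sort by creation time. The function names, the UTC epoch and the use of `secrets` for the random half are assumptions, not part of the original slides.

```python
import secrets
import time
from datetime import datetime, timezone

# Epoch named on the slide: Jan 1, 2000 (UTC is an assumption).
GUID8_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc).timestamp()

def make_guid8(now=None, rand=None):
    """Pack a 4-byte time stamp and a 4-byte random number into one 8-byte integer."""
    if now is None:
        now = time.time()
    if rand is None:
        rand = secrets.randbits(32)          # 0 .. 4,294,967,295
    seconds = int(now - GUID8_EPOCH) & 0xFFFFFFFF
    return (seconds << 32) | (rand & 0xFFFFFFFF)

def split_guid8(value):
    """Return the (seconds, random) halves of a GUID8."""
    return value >> 32, value & 0xFFFFFFFF

guid8 = make_guid8()
print(guid8, split_guid8(guid8))
```

The optional `now` and `rand` parameters exist only to make the function easy to test with fixed inputs.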
8. GUID8
Date-time stamp RANGE?
One year has 31536000 seconds (365*24*60*60).
A 4-byte unsigned INT can hold up to 4294967295.
4294967295 / 31536000 = 136 years
The GUID8 time stamp approach is safe from year 2000 to 2136.
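The arithmetic can be reproduced in Python as a quick check. The second figure is an added assumption, not something the slide states: if only 31 bits are usable, as happens when the time stamp occupies the upper half of a signed 64-bit integer that must stay positive, the range is roughly halved.

```python
from datetime import datetime, timedelta

SECONDS_PER_YEAR = 365 * 24 * 60 * 60    # 31,536,000, as on the slide
EPOCH = datetime(2000, 1, 1)

unsigned_max = 2**32 - 1                 # 4,294,967,295
signed_max = 2**31 - 1                   # 2,147,483,647 (assumed signed case)

years_unsigned = unsigned_max // SECONDS_PER_YEAR
years_signed = signed_max // SECONDS_PER_YEAR
last_unsigned = EPOCH + timedelta(seconds=unsigned_max)
last_signed = EPOCH + timedelta(seconds=signed_max)

print(years_unsigned, last_unsigned.year)   # 136 2136
print(years_signed, last_signed.year)       # 68 2068
```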
9. GUID8
Probability of one DUPLICATE?
One INT holds the date-time stamp in seconds.
For every second, the other INT can hold a
random number in the range from 0 to 4,294,967,295.
So a duplicate is possible but highly improbable.
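To put a rough number on "improbable", the sketch below applies the standard birthday-problem approximation to IDs generated within the same second. The approximation and the sample rates are additions for illustration, and the estimate assumes the random half is uniformly distributed over all 2^32 values.

```python
import math

SLOTS = 2**32   # random values available within one second

def collision_probability(ids_per_second, slots=SLOTS):
    """Approximate chance of at least one duplicate among IDs created in the same second."""
    n = ids_per_second
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * slots))
    return -math.expm1(-n * (n - 1) / (2 * slots))

for n in (1, 100, 1_000, 100_000):
    print(n, collision_probability(n))
```

Under these assumptions the risk stays small at modest insert rates but grows quickly once many thousands of IDs share one second, so a unique constraint on the column is still worth keeping.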
10. FACTS
ON DATABASE MANAGEMENT
A GUID is hard to read or type... Yes, but come on! If you're querying that much at
once, you're probably doing it wrong anyhow.
Not all tables need a GUID.
Storage is cheaper and computers are faster.
GUIDs make asynchronous architectures easier.
GUIDs are used without guilt by: IPv6, electronic devices, item tagging, OS…
11. GUID8
GENERATION CODE SQL
-- PostgreSQL: build a GUID8 as a BIGINT.
-- Upper 32 bits: whole seconds elapsed since 2000-01-01 (UTC).
-- Lower 32 bits: random number in the range 0 to 4,294,967,295.
SELECT
  (
    CAST(
      FLOOR( EXTRACT(EPOCH FROM now() - TIMESTAMPTZ '2000-01-01 00:00:00+00') )
    AS BIGINT ) << 32
  )
  +
  CAST( FLOOR( RANDOM() * 4294967296 ) AS BIGINT )
  AS guid8;
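Because the time stamp sits in the upper 32 bits, a stored GUID8 can be decoded back to its creation time, and sorting by the integer value orders rows chronologically. The Python sketch below illustrates this with hand-made values. It assumes the Jan 1, 2000 epoch from the anatomy slide, taken as UTC, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

GUID8_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)   # assumed UTC

def guid8_created_at(value):
    """Recover the creation time stored in the upper 32 bits of a GUID8."""
    return GUID8_EPOCH + timedelta(seconds=value >> 32)

# Three hand-made values: (seconds since epoch << 32) + random part.
values = [
    (800_000_002 << 32) + 17,
    (800_000_000 << 32) + 4_000_000_000,
    (800_000_001 << 32) + 123_456,
]

# Sorting by the integer value alone orders the rows by creation time.
ordered = sorted(values)
print([guid8_created_at(v).isoformat() for v in ordered])
```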
15. LINKS
Download this presentation:
http://www.carabez.com/downloads/sql_guid_vs_int.zip
More info:
http://es.wikipedia.org/wiki/Globally_unique_identifier
http://betterexplained.com/articles/the-quick-guide-to-guids/
http://krow.livejournal.com/497839.html
http://blog.sqlauthority.com/2010/04/28/sql-server-guid-vs-int-your-opinion/