This document summarizes a lecture on database integrity and performance. It discusses various techniques for ensuring database integrity, including primary key constraints to prevent duplicate tuples, foreign key constraints to prevent dangling references, and attribute constraints to prevent inconsistent attribute values. It also covers views, which allow querying virtual tables, and indexes to improve query performance by enabling faster searching. The document proposes discussing index structures and report topics at the next meeting.
3. Primary Key Constraints What are Primary Keys good for? Uniquely identify the subject of each tuple Ensure that there are no duplicates Cannot be null – that would imply a NULL subject. A table may not have more than one primary key A Primary Key may consist of one or more columns Multiple Unique keys are OK For Table R, <P1, P2, …, Pm> together constitute a primary key if for each tuple in R, <P1, P2, …, Pm> are unique P1, P2, …, Pm are non-null <U1, U2, …, Um> together constitute a unique key if for each tuple in R, <U1, U2, …, Um> are unique But U1, U2, …, Umcan be null
4. Foreign Key Constraints (p1) Main Idea: Prevent Dangling Tuples Foreign Key Key Reference Foreign Key Must point to a Key Reference CREATE TABLE City ( :: CountryCode char(3) REFERENCES Country(Code) ) Key Reference Must be unique or primary key Try: INSERT INTO city (Name, CountryCode) value ('xyzzy', 'XYZ'); Try: UPDATE city set CountryCode='XYZ' where CountryCode='FIN'; Key reference must already exist before a referencing tuple can be added
5. Foreign Key Constraints (p2) Alternative methods of defining a foreign key CREATE TABLE City ( CountryCode char(3) REFERENCES COUNTRY(Code), …) CREATE TABLE City ( CountryCode char(3), …, FOREIGN KEY CountryCode [CONSTRAINT [ctyREFcntry]] REFERENCES COUNTRY(Code)) CREATE TABLE City ( CountryCode char(3), …) Then, later, ALTER TABLE City ADD [CONSTRAINT [ctyREFcntry]] FOREIGN KEY CountryCode REFERENCES COUNTRY(Code); Notation: [] signifies optional
6. Foreign Key Constraints (p3) Foreign Key Key Reference Referential Integrity Options Restrict (default) Reject request Cascade Reflect changes back Set Null Set the foreign key to NULL Changes to Key References Try: DELETE FROM country WHERE code=‘FIN’; Try: UPDATE country SET Code='XYZ' WHERE Code='FIN‘;
7. Foreign Key Constraints (p4) Chicken and Egg definitions CREATE TABLE chicken ( cID INT PRIMARY KEY, eID INT REFERENCES egg(eID)); CREATE TABLE egg( eID INT PRIMARY KEY, cID INT REFERENCES chicken(cID)); Consistently fails Can’t define a foreign key to a table before it has been defined Solution Define the tables w/o constraints CREATE TABLE chicken( cID INT PRIMARY KEY, eID INT); CREATE TABLE egg( eID INT PRIMARY KEY, cID INT); And then add foreign keys ALTER TABLE chicken ADD CONSTRAINT c_e FOREIGN KEY (eID) REFERENCES egg(eID); ALTER TABLE egg ADD CONSTRAINT e_c FOREIGN KEY (cID) REFERENCES chicken(cID);
8. Foreign Key Constraints (p5) Chicken and Egg insertion INSERT INTO chicken VALUES(1, 1001); INSERT INTO egg VALUES(1001, 1); Still consistently fails Need a way to postpone constraint checking How long to postpone? Until transaction commit Solution Define the tables with deferred constraint-checking ALTER TABLE chicken ADD CONSTRAINT c_e FOREIGN KEY (eID) REFERENCES egg(eID) INITIALLY DEFERRED DEFERRABLE; ALTER TABLE egg ADD CONSTRAINT e_c FOREIGN KEY (cID) REFERENCES chicken(cID) INITIALLY DEFERRED DEFERRABLE; And then INSERT INTO chicken VALUES(1, 1001); INSERT INTO egg VALUES(1001, 1); COMMIT;
9. Attribute-Based Constraints NOT NULL The most common Reasonability Constraints Validate incoming data? e.g., Population Density < 30000 Specification: Population INT(11) NOT NULL CHECK (Population <= 30000 * SurfaceArea), The condition in CHECK(cond) can take any value that a condition in WHERE(cond) can take Including subqueries The attribute constraint is checked when assigned Can be violated underneath as long as it is not re-evaluated For example, if we update SurfaceArea, the violation won’t be flagged Not implemented in all databases, e.g., MySQL
10. Tuple-Based Constraints Validate the entire tuple whenever anything in that tuple is updated More integrity enforcement than with attribute-based constraints e.g., Population Density <= 30000 Specification: Population INT(11) NOT NULL, CHECK (Population <= 30000 * SurfaceArea), The condition in CHECK(cond) can take any value that a condition in WHERE(cond) can take Including subqueries The attribute constraint is checked when tuple is updated If we update SurfaceArea, the violation will be flagged But the violation of CHECK (Population > ( SELECT SUM(Population) FROM City WHERE City.CountryCode = Code)) which specifies a subquery involving another table, will not be flagged Not implemented in all databases, e.g., MySQL
11. Assertions Validate the entire database whenever anything in the database is updated Part of the database, not any specific table Specification: Table-like CREATE ASSERTION CountryPop CHECK ( NOT EXISTS (SELECT * FROM Country WHERE Population < (SELECT SUM(Population) FROM City WHERE City.CountryCode = Code))) Difficult to implement efficiently Often not implemented I don’t know of any implementations Can be implemented for specific cases using Triggers, see Section 7.5
12. Views Also called Virtual Views Don’t actually exist in the database but behave as if they do Can be subsets of the data or joins – actually, arbitrary queries Subset example, CREATE VIEW ct AS SELECT c.Name AS nm, c.countrycode AS cntry FROM city c WHERE population > 0 Join example CREATE VIEW CityLanguage as SELECT city.name, city.countrycode, lang.languageas Language FROM city, countrylanguage as lang WHERE city.countrycode = lang.countrycode AND lang.isOfficial= ‘T‘;
13. Operations on Views (p1) SELECT SELECT * FROM CityLanguage WHERE Language='Dutch'; Shouldn’t ‘temporarily’ create the table and SELECT from it. Should use the definition of CityLanguage to make a query, i.e., SELECT * FROM (SELECT …blabla… FROM city, countrylanguage as lang WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T') WHERE Language='Dutch';
14. Operations on Views (p2) UPDATE, INSERT not always possible, except Can sometimes be implemented using INSTEAD OF triggers Modifications are permitted when the view is derived from a single table R and The WHERE clause does not involve R in a Subquery The FROM clause can only consist of one occurrence of R The valued of all attributes not specified in the view definition can be ‘manufactured’ by the database Example. For the view ct CREATE VIEW ct AS SELECT c.Name AS nm, c.countrycodeAS cntry FROM city c WHERE population > 0 the query INSERT INTO ct (nm, cntry) values ('FirSPA', 'FIN') can be automatically rewritten as INSERT INTO CITY (Name, CountryCode) values ('FirSPA', 'FIN')
15. Top-Down Datalog Recursion Revisited IDB’s are conceptualized (and implemented) as Views for IDB predicate p(x,y, …) FOR EACH subgoal of p DO IF subgoal is IDB, recursive call; IF subgoal is EDB, look up
16. Indexes Main Idea: Data Structures for Fast Search Motivation: Preventing the need for linear search through a big table Example query: SELECT * FROM City WHERE CountryCode = 'FIN'; Another: SELECT * FROM City WHERE Population > (0.4 * ( SELECT Population FROM Country WHERE CountryCode= Code)); Expected time for first example: O(n). For the second, O(n2) Declaration CREATE INDEX CityIndex ON City(CountryCode); CREATE INDEX CityPopIndex ON City(Population); CREATE INDEX CountryPopIndex ON Country(Population);
17. Selection of Indexes (p1) Why not create an index for every attribute? Useful indexes, and not so useful ones Primary key? Unique key? From previous examples, CityIndex? CityPopIndex? CountryPopIndex?
18. Selection of Indexes (p2) The Mantra: Don’t define indexes too early: know your workload first Be as empirical as is practical The Greedy approach to index selection: Start with no indexes Evaluate candidate indexes, choose the one potentially most effective Repeat Query execution will take advantage of defined indexes
19. CS 542 Database Management Systems Report Proposals J Singh January 31, 2011