Introduction to SQL for MBA students, presented spring 2012 to the Chicago Booth technology club.
Why an MBA would want to know SQL, a sandbox to practice in, and the basics of SQL syntax so you can pull your own datasets to analyze.
2. AGENDA
• What's SQL and when would you use it
• How to find out what's in your database
• How to pull the information you need, with
progressively more realistic exercises
• How to learn more
• Questions and more practice time
• Note: specific syntax today is for MySQL, but any
database will be similar
3. WHAT IS SQL?
A N D WH Y I T ’ S U S E F U L T O A N M B A
5. WHAT KIND OF DATA?
• Numeric
• Integers with various possible ranges (sometimes TRUE/FALSE
Boolean values are encoded as 0/1 in an integer field) - INT
• Decimals or floating-point numbers – FLOAT
• Time and date - DATETIME
• Text
• With a specified length in characters – VARCHAR
• Longer text – TEXT
• Binary data – BLOB
6. WHY USE A RELATIONAL
DATABASE
• Think about a list of contacts in Excel – basically one
database table – and adding a phone number.
• Do you have enough phone number columns?
• Does cell phone have its own field or is it just
another phone number?
• How do you mark the primary phone number?
• A database lets you have as many phone numbers
as you want associated with one person, because
you can put phone numbers in a separate table.
7. STRUCTURED QUERY LANGUAGE
• SQL is a way to pull the information you need out of
a relational database
• You’ll probably use it to fetch summary information
from a central data warehouse and then do
additional analysis in Excel/SAS/etc.
• SQL is also used (by the people who set up the data
warehouse) to create tables, insert data, and
delete unneeded information.
8. SANDBOX
1. Take out your laptop
2. Go to http://coderzone.org/sqlsandbox/
3. Click “Sign In”
9. WHAT’S IN YOUR DATABASE?
SHOW TABLES; DESCRIBE <TABLENAME>;
10. SHOW TABLES;
• Returns a list of tables in the database
• May not need to use this if you have a graphical
user interface (GUI)
• Can limit which tables are returned, e.g. “SHOW
TABLES LIKE ‘%c%’;” will return all the tables with a
“c” in their name
11. DESCRIBE <TABLENAME>;
• Returns information about each field in table
<tablename>
• Includes field name, type, and other information like
if the field can be NULL or if it’s required to have a
value
• May not need to use this if you have a graphical
user interface (GUI)
12. TRY IT OUT
• In the sandbox, try looking at tables.
• SHOW TABLES [LIKE “%c%”];
• DESCRIBE sb_css_colors;
• Which tables contain the most types of data?
13. SQL STYLE
• Capitalize SQL keywords (SELECT, JOIN, AS, etc.)
• Lower-case table and field names
Makes it easy to see what values are specific to the
database you’re using.
15. SELECT <FIELDS> FROM <TABLENAME>;
• Returns some data from a table
• You’ll need the field names you found before from
the DESCRIBE <tablename> statement
• <fields> can be:
•
•
•
•
One field name
Multiple field names separated by commas
Special value * (asterisk), which means “everything”
Special value COUNT(*), which means “how many rows are
there?”
• This returns all the rows of the table – why might that
be a problem?
16. TRY IT OUT
• In the sandbox, try looking at some data in tables.
• SELECT * FROM sb_css_colors;
• SELECT COUNT(*) FROM sb_airports;
• SELECT biz_name, state, phone FROM sb_airports;
18. WHERE
• SELECT <fields> FROM <tablename> WHERE <limits> ;
• This is how to start picking out only the data you’re
interested in
• WHERE clauses can be:
• <fieldname> = <value> such as 2 or “Bob”
• Similar but with LIKE, >, >=, <, <=, != (note that different SQL
implementations require different representations for
dates/times, so you may have to check documentation to
figure out how to say purchase_date > “2012-01-01”, for
example)
• Things involving functions, like ABS(<fieldname>) > 20
• Several of those joined by AND, OR, etc.
19. TRY IT OUT
• In the sandbox, try limiting what data you grab.
• SELECT * FROM sb_css_colors WHERE
hexcolor=“#800000”
• SELECT * FROM sb_css_colors WHERE hexcolor LIKE
“%FF%”
• SELECT biz_name, state, zip, phone FROM
sb_airports WHERE (state=“IL” OR state=“IN”) AND
phone LIKE “%(800)%”;
20. MORE DATA SELECTION
SELECT <FIELDS> FROM < TABLENAME> WHERE <LIMITS>
ORDER BY <FIELD> <ORDER> GROUP BY <FIELD>;
21. ORDER BY
• SELECT <fields> FROM <tablename> ORDER BY
<field> <order>;
• Like sorting in Excel
• Order can be ASC (ascending) or DESC
(descending)
• Don’t sort and then assume you know what the
possible values were – e.g. what’s at the top if you
do SELECT * FROM sb_css_colors ORDER BY hexcolor
DESC;
22. GROUP BY
• SELECT <fields and/or functions of fields> FROM
<tablename> GROUP BY <field>;
• Lets you create summary statistics directly instead of
in Excel later
• Commonly used functions: COUNT(),
COUNT(DISTINCT), SUM(), AVG(), MAX(), MIN(),
23. TRY IT OUT
• In the sandbox, try more complicated SELECTs.
• SELECT * FROM sb_css_colors ORDER BY hexcolor
DESC; (what were the possible values?)
• SELECT state, zip, COUNT(*) from sb_airports GROUP
BY zip;
• SELECT state, AVG(zip) FROM sb_airports GROUP BY
state;
25. JOIN
• SELECT <fields> FROM <tablename> JOIN
<tablename> ON <field match>;
• Joins let you connect related data from multiple
tables, e.g. a customer name from a customers
table and an order date from an orders table
• Normally you’ll join on an ID or “key” column
• Often need to specify both table and field when
writing the field match statement
• customers.id = orders.customer_id
26. TRY IT OUT
• In the sandbox, try joining tables.
• SELECT comp_dept, first_name, last_name, phone
FROM sb_departments JOIN sb_employees ON
sb_departments.emp_id = sb_employees.emp_id
WHERE comp_dept = “Payroll”;
28. ONLINE RESOURCES
• MySQL manuals (MySQL is open source, so there are
lots of resources freely available)
• http://dev.mysql.com/doc/refman/5.6/en/index.html
• Question sites:
• http://www.quora.com/MySQL/
• http://programmers.stackexchange.com/questions/tagged
/sql (pretty technical in general)
29. SANDBOXES
• We’ve used http://coderzone.org/sqlsandbox/
(nice because it has multiple tables)
• Also http://www.w3schools.com/sql/default.asp
• You can make your own sandbox to try things out if
you have web hosting (e.g. unlimited MySQL
databases on DreamHost)