Relational Database Management Systems For Epidemiologists
SQL Part I
Outline
SQL Basics
Retrieving Data from a Table
Operators and Functions
What is SQL?
SQL is the standard programming language used to
create, update, delete, and retrieve data stored in
an RDBMS.
SQL is defined by rules of syntax (words and
symbols allowed) and semantics (meaning).
Three versions of the ANSI standard you may encounter:
ANSI-89 SQL
ANSI-92 SQL
ANSI SQL-99 (newer)
SQL Syntax
Elements of an SQL statement:
Keywords
Identifiers
Terminating semicolon
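The elements named above can be pointed out in a short statement. This is only a sketch: the table patients_demo and its rows are hypothetical and are created here so that the statement can be run on its own.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    caseid INTEGER PRIMARY KEY,
    lname  VARCHAR(30)
);
INSERT INTO patients_demo (caseid, lname)
VALUES (1, 'Johnson'),
       (2, 'Patel');

SELECT caseid, lname      -- SELECT: keyword; caseid, lname: identifiers
FROM patients_demo        -- FROM: keyword; patients_demo: identifier
WHERE lname = 'Johnson';  -- WHERE: keyword; the semicolon ends the statement
```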
Elements of SQL Style
SQL statements
can be written in uppercase or lowercase (keywords are case insensitive).
can extend across multiple lines, so long as you do
not split words or quoted strings in two.
To improve readability and maintainability:
Begin each SQL statement on a new line.
Use uppercase for SQL keywords (e.g., SELECT,
NULL, CHARACTER).
Indent with a fixed number of spaces (e.g., four).
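As a brief illustration of these conventions, the two queries below mean the same thing to the database, but the second follows the style guidelines. The table patients_demo and its rows are hypothetical and exist only to make the sketch runnable.

```sql
-- Hypothetical demo table (an assumption, not part of the course data).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    caseid INTEGER PRIMARY KEY,
    fname  VARCHAR(30),
    lname  VARCHAR(30)
);
INSERT INTO patients_demo (caseid, fname, lname)
VALUES (1, 'Ann', 'Johnson'),
       (2, 'Raj', 'Patel');

-- Legal, but harder to read: lowercase keywords, everything on one line.
select caseid, fname, lname from patients_demo where lname = 'Johnson';

-- Same query, styled: uppercase keywords, one clause per line.
SELECT caseid, fname, lname
FROM patients_demo
WHERE lname = 'Johnson';
```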
SELECT
SELECT retrieves rows, columns, and derived
values from one or more tables.
Syntax:
SELECT column(s)
FROM table(s)
[JOIN join(s)]
[WHERE search_condition(s)]
[GROUP BY grouping_column(s)]
[HAVING search_condition(s)]
[ORDER BY sort_column(s)];
SELECT Example
SELECT *
FROM patients;
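SELECT * returns every column. A minimal sketch of naming columns explicitly and sorting the result with the optional ORDER BY clause from the syntax above follows. The table patients_demo and its rows are hypothetical.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    caseid INTEGER PRIMARY KEY,
    fname  VARCHAR(30),
    lname  VARCHAR(30)
);
INSERT INTO patients_demo (caseid, fname, lname)
VALUES (1, 'Ann', 'Johnson'),
       (2, 'Raj', 'Patel'),
       (3, 'Mei', 'Chen');

-- Retrieve three named columns, sorted by last name, then first name.
SELECT caseid, fname, lname
FROM patients_demo
ORDER BY lname, fname;
```

Listing columns by name keeps the result stable if columns are later added to the table.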
AS
The AS keyword can be used to create a column
alias (an alternative name/identifier) that you
specify to control how column headings are
displayed in a result.
Syntax:
SELECT column1 AS alias1,
column2 AS alias2,
...
columnN AS aliasN
FROM table;
AS Example
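A minimal sketch of column aliases, assuming a hypothetical patients_demo table. The alias names case_number and last_name are illustrative choices, not names from the course data.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    caseid INTEGER PRIMARY KEY,
    lname  VARCHAR(30)
);
INSERT INTO patients_demo (caseid, lname)
VALUES (1, 'Johnson'),
       (2, 'Patel');

-- The result headings read case_number and last_name
-- instead of caseid and lname.
SELECT caseid AS case_number,
       lname  AS last_name
FROM patients_demo;
```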
Comparison Operators
Operator    Description
=           Equal to
<>          Not equal to
<           Less than
<=          Less than or equal to
>           Greater than
>=          Greater than or equal to
WHERE Example
SELECT caseid, fname, lname
FROM patients
WHERE lname = 'Johnson';
WHERE Example
SELECT caseid, fname, lname
FROM patients
WHERE lname <> 'Johnson';
WHERE Example
SELECT caseid, fname, lname
FROM patients
WHERE datevar >= '2001-01-10';
WHERE Example
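-- A column alias cannot be referenced in WHERE in standard SQL,
-- so the expressions are repeated in the search condition.
SELECT caseid,
    fname,
    lname,
    MONTH(datevar) AS "Month",
    DAY(datevar) AS "Day"
FROM patients
WHERE DAY(datevar) >= MONTH(datevar);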
Notes on WHERE
Occasionally, you may need to specify multiple
conditions in a single WHERE clause.
You can use the AND, OR or NOT operators to
combine two or more conditions into a compound
condition.
AND, OR, and NOT operators are known as Boolean
operators; they are designed to work with “truth”
values: true, false, and unknown.
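The compound conditions described above can be sketched with OR and NOT as well. The table patients_demo and its rows are hypothetical. Parentheses are used because AND is evaluated before OR when no grouping is given.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    caseid INTEGER PRIMARY KEY,
    fname  VARCHAR(30),
    lname  VARCHAR(30),
    state  CHAR(2)
);
INSERT INTO patients_demo (caseid, fname, lname, state)
VALUES (1, 'Ann', 'Johnson', 'CA'),
       (2, 'Raj', 'Patel',   'NY'),
       (3, 'Mei', 'Chen',    'TX');

-- Patients in CA or NY whose last name is not Johnson.
-- With the rows above, only caseid 2 qualifies.
SELECT caseid, fname, lname, state
FROM patients_demo
WHERE (state = 'CA' OR state = 'NY')
  AND NOT lname = 'Johnson';
```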
AND Example
-- The alias "Month" is not available in WHERE, so MONTH(datevar) is repeated.
SELECT caseid,
    fname,
    lname,
    MONTH(datevar) AS "Month"
FROM patients
WHERE MONTH(datevar) = 12 AND lname = 'Johnson';
Truth Table for Two Conditions
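One way to explore how AND, OR, and NOT combine truth values, including unknown, is to evaluate them directly. The sketch below assumes a DBMS that accepts Boolean literals in a SELECT list, such as PostgreSQL; other systems may need different syntax. Unknown is represented by NULL.

```sql
-- Assumes PostgreSQL-style Boolean literals; NULL stands for "unknown".
SELECT (TRUE  AND TRUE)  AS t_and_t,          -- true
       (TRUE  AND FALSE) AS t_and_f,          -- false
       (TRUE  AND NULL)  AS t_and_u,          -- unknown (NULL)
       (FALSE AND NULL)  AS f_and_u,          -- false
       (TRUE  OR  FALSE) AS t_or_f,           -- true
       (TRUE  OR  NULL)  AS t_or_u,           -- true
       (FALSE OR  NULL)  AS f_or_u,           -- unknown (NULL)
       (NOT CAST(NULL AS BOOLEAN)) AS not_u;  -- unknown (NULL)
```

A WHERE clause keeps a row only when its condition evaluates to true, so rows for which the condition is unknown are dropped along with those for which it is false.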
LIKE
Operator    Matches
%           Any string of zero or more characters
_           Any one character
LIKE Example
Syntax:
SELECT columns
FROM table
WHERE test_column [NOT] LIKE 'pattern';
Example:
SELECT fname, lname
FROM patients
WHERE lname LIKE 'John%';
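A short sketch of the underscore wildcard and of NOT LIKE, assuming a hypothetical patients_demo table. Whether LIKE treats upper and lower case as equal depends on the DBMS and its collation settings, so the patterns here match the stored capitalization exactly.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    fname VARCHAR(30),
    lname VARCHAR(30)
);
INSERT INTO patients_demo (fname, lname)
VALUES ('Ann', 'Johnson'),
       ('Bo',  'Johnston'),
       ('Cy',  'Jensen'),
       ('Di',  'Jonson');

-- _ matches exactly one character: finds Jensen, but not Jonson.
SELECT fname, lname
FROM patients_demo
WHERE lname LIKE 'J_nsen';

-- NOT LIKE keeps the rows that do not match: Jensen and Jonson.
SELECT fname, lname
FROM patients_demo
WHERE lname NOT LIKE 'John%';
```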
BETWEEN
Use the BETWEEN clause to determine whether
a given value falls within a specified range.
BETWEEN works with character strings,
numbers, and date/times.
The range contains a low and high value,
separated by AND (inclusive).
You can negate a BETWEEN condition with
NOT BETWEEN.
BETWEEN Example
Syntax:
SELECT columns
FROM table
WHERE test_column [NOT] BETWEEN
low_value AND high_value;
Example:
SELECT fname, lname
FROM patients
WHERE zip BETWEEN 94510 AND 94515;
Equivalent Statements
SELECT fname, lname
FROM patients
WHERE zip BETWEEN 94510 AND 94515;
SELECT fname, lname
FROM patients
WHERE (zip >= 94510) AND (zip<=94515);
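BETWEEN applies to dates in the same way, and NOT BETWEEN inverts the test. A sketch assuming a hypothetical patients_demo table in which zip is stored as a number and datevar as a date; date literal handling varies somewhat between systems.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    fname   VARCHAR(30),
    lname   VARCHAR(30),
    zip     INTEGER,
    datevar DATE
);
INSERT INTO patients_demo (fname, lname, zip, datevar)
VALUES ('Ann', 'Johnson', 94510, '2001-01-10'),
       ('Raj', 'Patel',   94520, '2001-06-30'),
       ('Mei', 'Chen',    94515, '2002-03-05');

-- Dates during 2001 (both endpoints included): Johnson and Patel.
SELECT fname, lname, datevar
FROM patients_demo
WHERE datevar BETWEEN '2001-01-01' AND '2001-12-31';

-- Zip codes outside 94510 through 94515: Patel only.
SELECT fname, lname, zip
FROM patients_demo
WHERE zip NOT BETWEEN 94510 AND 94515;
```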
List Filtering with IN
The IN clause can be used to display records with
any value in a specified list for a particular
column.
Syntax:
SELECT columns
FROM table
WHERE test_column [NOT] IN
(value1, value2, ...);
IN Example
Example:
SELECT fname, lname, state
FROM patients
WHERE state IN ('CA', 'NY');
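IN is shorthand for a chain of OR comparisons, and NOT IN excludes the listed values. A sketch assuming a hypothetical patients_demo table; one row has a NULL state to show that neither IN nor NOT IN returns it.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    fname VARCHAR(30),
    lname VARCHAR(30),
    state CHAR(2)
);
INSERT INTO patients_demo (fname, lname, state)
VALUES ('Ann', 'Johnson', 'CA'),
       ('Raj', 'Patel',   'NY'),
       ('Mei', 'Chen',    'TX'),
       ('Sam', 'Ortiz',   NULL);

-- Same result as state IN ('CA', 'NY'): Johnson and Patel.
SELECT fname, lname, state
FROM patients_demo
WHERE state = 'CA' OR state = 'NY';

-- Returns Chen only; the row with a NULL state is not returned,
-- because comparing NULL with the list yields unknown.
SELECT fname, lname, state
FROM patients_demo
WHERE state NOT IN ('CA', 'NY');
```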
IS NULL
Recall: NULLs represent missing or unknown
values.
IS NULL can be used to find records with NULL
values.
Syntax:
SELECT columns
FROM table
WHERE test_column IS [NOT] NULL;
IS NULL Example
SELECT caseid,
fname,
lname
FROM patients
WHERE occupation IS NULL;
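A common mistake is to test for missing values with = NULL. In standard SQL that comparison evaluates to unknown rather than true, so no rows are returned. A sketch assuming a hypothetical patients_demo table:

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    caseid     INTEGER PRIMARY KEY,
    lname      VARCHAR(30),
    occupation VARCHAR(30)
);
INSERT INTO patients_demo (caseid, lname, occupation)
VALUES (1, 'Johnson', 'Nurse'),
       (2, 'Patel',   NULL);

-- Under standard SQL semantics this returns no rows:
-- "= NULL" is unknown for every row.
SELECT caseid, lname
FROM patients_demo
WHERE occupation = NULL;

-- Returns caseid 2, the row with a missing occupation.
SELECT caseid, lname
FROM patients_demo
WHERE occupation IS NULL;

-- Returns caseid 1, the row with a recorded occupation.
SELECT caseid, lname
FROM patients_demo
WHERE occupation IS NOT NULL;
```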
Additional Operators & Functions
Arithmetic Operations (+, -, *, /)
Concatenation (||)
Extracting Text (SUBSTRING())
Changing Case (UPPER() and LOWER())
Trimming Characters (TRIM())
Length of a String (CHARACTER_LENGTH() or
LEN())
Position of a Substring (POSITION())
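The operators and functions listed above can be combined in a single SELECT. The sketch below uses the standard SQL forms as accepted by PostgreSQL; this is an assumption, since other systems differ (for example, some use + or CONCAT() for concatenation and LEN() for string length). The table patients_demo, its age_years column, and its single row are hypothetical.

```sql
-- Hypothetical demo table (an assumption for illustration only).
DROP TABLE IF EXISTS patients_demo;
CREATE TABLE patients_demo (
    fname     VARCHAR(30),
    lname     VARCHAR(30),
    age_years INTEGER
);
INSERT INTO patients_demo (fname, lname, age_years)
VALUES ('  Ann ', 'Johnson', 34);

SELECT age_years * 12                 AS age_months,    -- 408
       TRIM(fname) || ' ' || lname    AS full_name,     -- 'Ann Johnson'
       SUBSTRING(lname FROM 1 FOR 4)  AS lname_prefix,  -- 'John'
       UPPER(lname)                   AS lname_upper,   -- 'JOHNSON'
       LOWER(lname)                   AS lname_lower,   -- 'johnson'
       CHARACTER_LENGTH(lname)        AS lname_length,  -- 7
       POSITION('son' IN lname)       AS son_position   -- 5
FROM patients_demo;
```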
Next Time
Summarizing and Grouping Data
Aggregate functions (MIN, MAX, SUM, AVG,
COUNT)
Grouping rows with GROUP BY
Filtering groups with HAVING
Joins
Cross, natural, inner, left outer, right outer, full outer,
self-join