SQL
The select statement is used to query the database and retrieve selected data that match the criteria that you specify. Here is the format of a simple select statement:
select "column1" [,"column2",etc] from "tablename" [where "condition"]; [] = optional
The column names that follow the select keyword determine which columns will be returned in the results. You can select as many column names as you'd like, or you can use a "*" to select all columns.
The table name that follows the keyword from specifies the table that will be queried to retrieve the desired results.
The where clause (optional) specifies which data values or rows will be returned or displayed, based on the criteria described after the keyword where.
Conditional selections used in the where clause:
=     Equal
>     Greater than
<     Less than
>=    Greater than or equal
<=    Less than or equal
<>    Not equal to
LIKE  *See note below
The LIKE pattern matching operator can also be used in the conditional selection of the where clause. LIKE is a very powerful operator that allows you to select only rows that are "like" what you specify. The percent sign "%" can be used as a wild card to match any sequence of characters (including none) that might appear before or after the characters specified. For example:
select first, last, city from empinfo where first LIKE 'Er%';
This SQL statement will match any first names that start with 'Er'. Strings must be in single quotes. Or you can specify,
select first, last from empinfo where last LIKE '%s';
This statement will match any last names that end in 's'.
select * from empinfo where first = 'Eric';
This will only select rows where the first name equals 'Eric' exactly.
Sample Table: empinfo
first     last     id     age  city        state
John      Jones    99980  45   Payson      Arizona
Mary      Jones    99982  25   Payson      Arizona
Eric      Edwards  88232  32   San Diego   California
Mary Ann  Edwards  88233  32   Phoenix     Arizona
Ginger    Howell   98002  42   Cottonwood  Arizona
...       ...      92001  23   Gila Bend   Arizona
...       ...      22322  35   Bagdad      Arizona
...       ...      32326  52   Tucson      Arizona
...       ...      ...    ...  Show Low    Arizona
...       ...      ...    ...  Pinetop     Arizona
...       ...      ...    ...  Globe       Arizona
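The where clause and LIKE patterns described above can be tried locally. The following is a minimal sketch, not part of the original tutorial: it assumes Python's built-in sqlite3 module as a stand-in for the tutorial's interpreter, and it loads only the five rows of empinfo that are fully legible in the sample table. Note that SQLite's LIKE ignores case for ASCII letters by default, which other databases may not do.

```python
import sqlite3

# In-memory database with the five fully legible rows of the sample table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table empinfo (first text, last text, id integer, "
    "age integer, city text, state text)"
)
conn.executemany(
    "insert into empinfo values (?, ?, ?, ?, ?, ?)",
    [
        ("John", "Jones", 99980, 45, "Payson", "Arizona"),
        ("Mary", "Jones", 99982, 25, "Payson", "Arizona"),
        ("Eric", "Edwards", 88232, 32, "San Diego", "California"),
        ("Mary Ann", "Edwards", 88233, 32, "Phoenix", "Arizona"),
        ("Ginger", "Howell", 98002, 42, "Cottonwood", "Arizona"),
    ],
)

# First names that start with 'Er'.
starts_with_er = conn.execute(
    "select first, last from empinfo where first like 'Er%' order by id"
).fetchall()

# Last names that end in 's'.
ends_with_s = conn.execute(
    "select first, last from empinfo where last like '%s' order by id"
).fetchall()

# Exact match only.
exact_eric = conn.execute(
    "select first from empinfo where first = 'Eric'"
).fetchall()

print(starts_with_er)
print(ends_with_s)
print(exact_eric)
```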
Enter the following sample select statements in the SQL Interpreter Form at the bottom of this page. Before you press "submit", write down your expected results. Press "submit", and compare the results.
select first, last, city from empinfo;
select last, city, age from empinfo where age > 30;
select first, last, city, state from empinfo where first LIKE 'J%';
select * from empinfo;
select first, last from empinfo where last LIKE '%s';
select first, last, age from empinfo where last LIKE '%illia%';
select * from empinfo where first = 'Eric';
Creating Tables
The create table statement is used to create a new table. Here is the format of a simple create table statement:
create table "tablename" ("column1" "data type", "column2" "data type", "column3" "data type");
Note: You may have as many columns as you'd like, and the constraints are optional. Example:
create table employee (first varchar(15), last varchar(20), age number(3), address varchar(30), city varchar(20), state varchar(20));
To create a new table, enter the keywords create table followed by the table name, followed by an open parenthesis, followed by the first column name, followed by the data type for that column, followed by any optional constraints, and followed by a closing parenthesis. It is important to make sure you use an open parenthesis before the first column definition, and a closing parenthesis after the end of the last column definition. Make sure you separate each column definition with a comma. All SQL statements should end with a ";".
The table and column names must start with a letter and can be followed by letters, numbers, or underscores - not to exceed a total of 30 characters in length. Do not use any SQL reserved keywords as names for tables or column names (such as "select", "create", "insert", etc).
Data types specify what the type of data can be for that particular column. If a column called "Last_Name" is to be used to hold names, then that particular column should have a "varchar" (variable-length character) data type. Here are the most common data types:
char(size)      Fixed-length character string. Size is specified in parentheses. Max 255 bytes.
varchar(size)   Variable-length character string. Max size is specified in parentheses.
number(size)    Number value with a max number of column digits specified in parentheses.
date            Date value
number(size,d)  Number value with a maximum number of digits of "size" total, with a maximum number of "d" digits to the right of the decimal.
What are constraints? When tables are created, it is common for one or more columns to have constraints associated with them. A constraint is basically a rule associated with a column that the data entered into that column must follow. For example, a "unique" constraint specifies that no two records can have the same value in a particular column. They must all be unique. The other two most popular constraints are "not null", which specifies that a column can't be left blank, and "primary key". A "primary key" constraint defines a unique identification of each record (or row) in a table. All of these and more will be covered in the future Advanced release of this Tutorial. Constraints can be entered in this SQL interpreter; however, they are not supported in this Intro to SQL tutorial & interpreter. They will be covered and supported in the future release of the Advanced SQL tutorial - that is, if "response" is good.
It's now time for you to design and create your own table. You will use this table throughout the rest of the tutorial. If you decide to change or redesign the table, you can either drop it and recreate it or you can create a completely different one. The SQL statement drop will be covered later.
IMPORTANT: When selecting a table name, it is important to select a unique name that no one else will use or guess. Your table names should have an underscore followed by your initials and the digits of your birth day and month. For example, Tom Smith, who was born on November 2nd, would name his table myemployees_ts0211. Use this convention for all of the tables you create. Your tables will remain on a shared database until you drop them, or they will be cleaned up if they aren't accessed in 4-5 days. If "support" is good, I hope to eventually extend this to at least one week. When you are finished with your table, it is important to drop your table (covered in the last lesson).
In the example below, the column name first will match up with the value 'Luke', and the column name state will match up with the value 'Georgia'. Example:
insert into employee (first, last, age, address, city, state) values ('Luke', 'Duke', 45, '2130 Boars Nest', 'Hazard Co', 'Georgia');
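The create table and insert statements above can be checked end to end. This is a sketch under the assumption that Python's built-in sqlite3 module stands in for the tutorial's interpreter; SQLite accepts the varchar and number type names used here, though it treats them more loosely than other databases do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The tutorial's create table statement, unchanged.
conn.execute(
    "create table employee (first varchar(15), last varchar(20), "
    "age number(3), address varchar(30), city varchar(20), state varchar(20))"
)

# The tutorial's insert statement: each column name lines up with one value.
conn.execute(
    "insert into employee (first, last, age, address, city, state) "
    "values ('Luke', 'Duke', 45, '2130 Boars Nest', 'Hazard Co', 'Georgia')"
)

# Read the row back to confirm which value landed in which column.
row = conn.execute("select first, state, age from employee").fetchone()
print(row)
```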
Updating Records
The update statement is used to update or change records that match specified criteria. This is accomplished by carefully constructing a where clause.
update "tablename" set "columnname" = "newvalue" [,"nextcolumn" = "newvalue2"...] where "columnname" OPERATOR "value" [and|or "column" OPERATOR "value"];
[] = optional
Examples:
update phone_book set area_code = 623 where prefix = 979;
update phone_book set last_name = 'Smith', prefix=555, suffix=9292 where last_name = 'Jones';
update employee set age = age+1 where first_name='Mary' and last_name='Williams';
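To see how the where clause limits an update, here is a small sketch that runs the last example above. It assumes Python's built-in sqlite3 module and a hypothetical three-column employee table with two made-up rows; only the update statement itself comes from the tutorial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, just enough to run the update.
conn.execute(
    "create table employee (first_name varchar(15), last_name varchar(20), age number(3))"
)
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [("Mary", "Williams", 30), ("Mary", "Jones", 25)],
)

# Only the row matching BOTH conditions is changed.
cur = conn.execute(
    "update employee set age = age+1 where first_name='Mary' and last_name='Williams'"
)
updated = cur.rowcount  # number of rows the update touched

ages = conn.execute(
    "select last_name, age from employee order by last_name"
).fetchall()
print(updated, ages)
```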
Deleting Records
The delete statement is used to delete records or rows from the table.
delete from "tablename" where "columnname" OPERATOR "value" [and|or "column" OPERATOR "value"]; [ ] = optional
Examples:
delete from employee;
Note: if you leave off the where clause, all records will be deleted!
delete from employee where lastname = 'May';
delete from employee where firstname = 'Mike' or firstname = 'Eric';
To delete an entire record/row from a table, enter "delete from" followed by the table name, followed by the where clause which contains the conditions to delete. If you leave off the where clause, all records will be deleted.
Drop a Table
The drop table command is used to delete a table and all rows in the table. To delete an entire table including all of its rows, issue the drop table command followed by the tablename. drop table is different from deleting all of the records in the table. Deleting all of the records in the table leaves the table including column and constraint information. Dropping the table removes the table definition as well as all of its rows.
drop table "tablename"
Example:
drop table myemployees_ts0211;
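The difference between deleting rows and dropping a table can be demonstrated with a short sketch. It assumes Python's built-in sqlite3 module; the two columns and the three rows are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table that follows the tutorial's naming convention.
conn.execute(
    "create table myemployees_ts0211 (firstname varchar(15), lastname varchar(20))"
)
conn.executemany(
    "insert into myemployees_ts0211 values (?, ?)",
    [("Mike", "Smith"), ("Eric", "Jones"), ("Anna", "May")],
)

# With a where clause, only the matching rows are deleted.
deleted_some = conn.execute(
    "delete from myemployees_ts0211 where firstname = 'Mike' or firstname = 'Eric'"
).rowcount

# Without a where clause, every remaining row goes, but the table stays.
conn.execute("delete from myemployees_ts0211")
rows_left = conn.execute("select count(*) from myemployees_ts0211").fetchone()[0]

# drop table removes the definition as well, so a later select fails.
conn.execute("drop table myemployees_ts0211")
try:
    conn.execute("select * from myemployees_ts0211")
    table_exists = True
except sqlite3.OperationalError:
    table_exists = False

print(deleted_some, rows_left, table_exists)
```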
SELECT Statement
The SELECT statement is used to query the database and retrieve selected data that match the criteria that you specify. The SELECT statement has five main clauses to choose from, although FROM is the only required clause. Each of the clauses has a vast selection of options, parameters, etc. The clauses are listed below, and each of them will be covered in more detail later in the tutorial.
SELECT [ALL | DISTINCT] column1[,column2] FROM table1[,table2] [WHERE "conditions"] [GROUP BY "column-list"] [HAVING "conditions"] [ORDER BY "column-list" [ASC | DESC] ]
Example:
SELECT name, age, salary FROM employee WHERE age > 50;
The above statement will select all of the values in the name, age, and salary columns from the employee table whose age is greater than 50.
Note: Remember to put a semicolon at the end of your SQL statements. The ; indicates that your SQL statement is complete and is ready to be interpreted.
Comparison Operators
=         Equal
>         Greater than
<         Less than
>=        Greater than or equal to
<=        Less than or equal to
<> or !=  Not equal to
LIKE      String comparison test
For example:
SELECT name, title, dept FROM employee WHERE title LIKE 'Pro%';
The above statement will select all of the rows/values in the name, title, and dept columns from the employee table whose title starts with 'Pro'. This may return job titles including Programmer or Pro-wrestler.
ALL and DISTINCT are keywords used to select either ALL (default) or the "distinct" or unique records in your query results. If you would like to retrieve just the unique records in specified columns, you can use the "DISTINCT" keyword. DISTINCT will discard the duplicate records for the columns you specified after the "SELECT" keyword. For example:
SELECT DISTINCT age FROM employee_info;
This statement will return all of the unique ages in the employee_info table. ALL will display "all" of the specified columns including all of the duplicates. The ALL keyword is the default if nothing is specified.
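A quick way to see the effect of DISTINCT versus ALL is to run both against the same data. The sketch below assumes Python's built-in sqlite3 module and a hypothetical employee_info table holding three made-up rows, two of which share an age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: two employees share the age 30.
conn.execute("create table employee_info (name text, age integer)")
conn.executemany(
    "insert into employee_info values (?, ?)",
    [("Ann", 30), ("Bob", 30), ("Cy", 45)],
)

# DISTINCT collapses the duplicate ages.
distinct_ages = conn.execute(
    "SELECT DISTINCT age FROM employee_info ORDER BY age"
).fetchall()

# ALL (the default) keeps every row, duplicates included.
all_ages = conn.execute(
    "SELECT ALL age FROM employee_info ORDER BY age"
).fetchall()

print(distinct_ages)
print(all_ages)
```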
Aggregate Functions
MIN       returns the smallest value in a given column
MAX       returns the largest value in a given column
SUM       returns the sum of the numeric values in a given column
AVG       returns the average value of a given column
COUNT     returns the total number of values in a given column
COUNT(*)  returns the number of rows in a table
Aggregate functions are used to compute against a "returned column of numeric data" from your SELECT statement. They basically summarize the results of a particular column of selected data. We are covering these here since they are required by the next topic, "GROUP BY". Although they are required for the "GROUP BY" clause, these functions can be used without the "GROUP BY" clause. For example:
SELECT AVG(salary) FROM employee;
This statement will return a single result which contains the average value of everything returned in the salary column from the employee table. Another example:
SELECT AVG(salary) FROM employee WHERE title = 'Programmer';
This statement will return the average salary for all employees whose title is equal to 'Programmer'. Example:
SELECT Count(*) FROM employees;
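The aggregate functions listed above each reduce a column to a single value. The following sketch assumes Python's built-in sqlite3 module and a hypothetical employee table with three made-up rows, so the numbers are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rows: two programmers and one manager.
conn.execute("create table employee (name text, title text, salary integer)")
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [
        ("Ann", "Programmer", 50000),
        ("Bob", "Programmer", 70000),
        ("Cy", "Manager", 90000),
    ],
)

# One row comes back, with one value per aggregate.
summary = conn.execute(
    "SELECT MIN(salary), MAX(salary), SUM(salary), AVG(salary), COUNT(*) FROM employee"
).fetchone()

# The where clause is applied before the average is computed.
avg_programmer = conn.execute(
    "SELECT AVG(salary) FROM employee WHERE title = 'Programmer'"
).fetchone()[0]

print(summary)
print(avg_programmer)
```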
GROUP BY clause
The GROUP BY clause will gather all of the rows together that contain data in the specified column(s) and will allow aggregate functions to be performed on the one or more columns. This can best be explained by an example: GROUP BY clause syntax:
SELECT column1, SUM(column2) FROM "list-of-tables" GROUP BY "column-list";
Let's say you would like to retrieve a list of the highest paid salaries in each dept:
SELECT max(salary), dept FROM employee GROUP BY dept;
This statement will select the maximum salary for the people in each unique department. Basically, the salary for the person who makes the most in each department will be displayed. Their salary and their department will be returned.
HAVING clause
The HAVING clause allows you to specify conditions on the rows for each group - in other words, which groups should be selected will be based on the conditions you specify. The HAVING clause should follow the GROUP BY clause if you are going to use it. HAVING clause syntax:
SELECT column1, SUM(column2) FROM "list-of-tables" GROUP BY "column-list" HAVING "condition";
HAVING can best be described by example. Let's say you have an employee table containing the employee's name, department, salary, and age. If you would like to select the average salary of the employees in each department, you could enter:
SELECT dept, avg(salary) FROM employee GROUP BY dept;
But let's say that you want to calculate and display the average ONLY for departments whose average salary is over 20000:
SELECT dept, avg(salary) FROM employee GROUP BY dept HAVING avg(salary) > 20000;
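GROUP BY and HAVING are easier to follow with concrete numbers. This sketch assumes Python's built-in sqlite3 module and a hypothetical employee table with two departments; the ORDER BY on the first query is added only to make the output order predictable. Note that HAVING filters whole groups by their average, not individual rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rows: Sales averages 17500, IT averages 40000.
conn.execute("create table employee (name text, dept text, salary integer)")
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [
        ("Ann", "Sales", 15000),
        ("Bob", "Sales", 20000),
        ("Cy", "IT", 30000),
        ("Dee", "IT", 50000),
    ],
)

# One row per department, carrying that department's highest salary.
max_by_dept = conn.execute(
    "SELECT max(salary), dept FROM employee GROUP BY dept ORDER BY dept"
).fetchall()

# Only departments whose average salary exceeds 20000 survive HAVING.
avg_over_20000 = conn.execute(
    "SELECT dept, avg(salary) FROM employee GROUP BY dept HAVING avg(salary) > 20000"
).fetchall()

print(max_by_dept)
print(avg_over_20000)
```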
ORDER BY clause
ORDER BY is an optional clause which will allow you to display the results of your query in a sorted order (either ascending order or descending order) based on the columns that you specify to order by.
ORDER BY clause syntax:
SELECT column1[,column2] FROM "list-of-tables" ORDER BY "column-list" [ASC | DESC];
[ ] = optional
ASC = Ascending Order - default
DESC = Descending Order
For example:
SELECT employee_id, dept, name, age, salary FROM employee_info WHERE dept = 'Sales' ORDER BY salary;
This statement will select the employee_id, dept, name, age, and salary from the employee_info table where the dept equals 'Sales' and will list the results in ascending (default) order based on their salary.
If you would like to order based on multiple columns, you must separate the columns with commas. For example:
SELECT employee_id, dept, name, age, salary FROM employee_info WHERE dept = 'Sales' ORDER BY salary, age DESC;
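When ordering by several columns, each column carries its own direction. The sketch below runs the statement above with Python's built-in sqlite3 module against hypothetical rows, and shows that DESC applies to age only while salary keeps the default ascending order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rows: two Sales employees tie on salary.
conn.execute(
    "create table employee_info (employee_id integer, dept text, "
    "name text, age integer, salary integer)"
)
conn.executemany(
    "insert into employee_info values (?, ?, ?, ?, ?)",
    [
        (1, "Sales", "Ann", 40, 30000),
        (2, "Sales", "Bob", 50, 30000),
        (3, "Sales", "Cy", 30, 20000),
        (4, "IT", "Dee", 25, 10000),
    ],
)

# salary ascending first; ties on salary are broken by age, oldest first.
rows = conn.execute(
    "SELECT employee_id, dept, name, age, salary FROM employee_info "
    "WHERE dept = 'Sales' ORDER BY salary, age DESC"
).fetchall()

names_in_order = [r[2] for r in rows]
print(names_in_order)
```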
The OR operator can also be used to join two or more conditions in the WHERE clause. With OR, the condition is met, and the rows are displayed, when either side is true or when both sides are true. With the AND operator, by contrast, both sides must be true. For example:
SELECT employeeid, firstname, lastname, title, salary FROM employee_info WHERE salary >= 50000.00 AND title = 'Programmer';
This statement will select the employeeid, firstname, lastname, title, and salary from the employee_info table where the salary is greater than or equal to 50000.00 AND the title is equal to 'Programmer'. Both of these conditions must be true in order for the rows to be returned in the query. If either is false, then the row will not be displayed. Although they are not required, you can use parentheses around your conditional expressions to make them easier to read:
SELECT employeeid, firstname, lastname, title, salary FROM employee_info WHERE (salary >= 50000.00) AND (title = 'Programmer');
Another example:
SELECT firstname, lastname, title, salary FROM employee_info WHERE (title = 'Sales') OR (title = 'Programmer');
This statement will select the firstname, lastname, title, and salary from the employee_info table where the title is either equal to 'Sales' OR the title is equal to 'Programmer'.
The IN conditional operator is really a set membership test operator. That is, it is used to test whether or not a value (stated before the keyword IN) is "in" the list of values provided after the keyword IN. For example:
SELECT employeeid, lastname, salary FROM employee_info WHERE lastname IN ('Hernandez', 'Jones', 'Roberts', 'Ruiz');
This statement will select the employeeid, lastname, and salary from the employee_info table where the lastname is equal to either Hernandez, Jones, Roberts, or Ruiz. It will return the rows if it is ANY of these values. The IN conditional operator can be rewritten as compound conditions that combine the equals operator with OR, with exactly the same results:
SELECT employeeid, lastname, salary FROM employee_info WHERE lastname = 'Hernandez' OR lastname = 'Jones' OR lastname = 'Roberts' OR lastname = 'Ruiz';
As you can see, the IN operator is much shorter and easier to read when you are testing for more than two or three values. You can also use NOT IN to exclude the rows in your list.
The BETWEEN conditional operator is used to test to see whether or not a value (stated before the keyword BETWEEN) is "between" the two values stated after the keyword BETWEEN. For example:
SELECT employeeid, age, lastname, salary FROM employee_info WHERE age BETWEEN 30 AND 40;
This statement will select the employeeid, age, lastname, and salary from the employee_info table where the age is between 30 and 40 (including 30 and 40). This statement can also be rewritten without the BETWEEN operator:
SELECT employeeid, age, lastname, salary FROM employee_info WHERE age >= 30 AND age <= 40;
You can also use NOT BETWEEN to exclude the values between your range.
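The equivalences described above (IN versus a chain of OR, BETWEEN versus a pair of comparisons) can be checked directly. This sketch assumes Python's built-in sqlite3 module and four hypothetical rows chosen so that the boundary ages 30 and 40 are both present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rows; ages 30 and 40 sit exactly on the BETWEEN boundaries.
conn.execute(
    "create table employee_info (employeeid integer, lastname text, age integer)"
)
conn.executemany(
    "insert into employee_info values (?, ?, ?)",
    [(1, "Hernandez", 30), (2, "Jones", 40), (3, "Smith", 35), (4, "Ruiz", 41)],
)

def ids(where):
    """Return the employee ids matching a where clause, in id order."""
    sql = "SELECT employeeid FROM employee_info WHERE " + where + " ORDER BY employeeid"
    return [row[0] for row in conn.execute(sql)]

in_ids = ids("lastname IN ('Hernandez', 'Jones', 'Roberts', 'Ruiz')")
or_ids = ids(
    "lastname = 'Hernandez' OR lastname = 'Jones' "
    "OR lastname = 'Roberts' OR lastname = 'Ruiz'"
)

between_ids = ids("age BETWEEN 30 AND 40")
range_ids = ids("age >= 30 AND age <= 40")
not_between_ids = ids("age NOT BETWEEN 30 AND 40")

print(in_ids, between_ids, not_between_ids)
```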
Mathematical Operators
Standard ANSI SQL-92 supports the first four of the following basic arithmetic operators:
+  addition
-  subtraction
*  multiplication
/  division
%  modulo
The modulo operator determines the integer remainder of the division. This operator is not ANSI SQL supported; however, most databases support it.
The following are some more useful mathematical functions to be aware of since you might need them. These functions are not standard in the ANSI SQL-92 specs, therefore they may or may not be available on the specific RDBMS that you are using. However, they were available on several major database systems that I tested. They WILL work on this tutorial.
ABS(x)                  returns the absolute value of x
SIGN(x)                 returns the sign of input x as -1, 0, or 1 (negative, zero, or positive respectively)
MOD(x,y)                modulo - returns the integer remainder of x divided by y (same as x%y)
FLOOR(x)                returns the largest integer value that is less than or equal to x
CEILING(x) or CEIL(x)   returns the smallest integer value that is greater than or equal to x
POWER(x,y)              returns the value of x raised to the power of y
ROUND(x)                returns the value of x rounded to the nearest whole integer
ROUND(x,d)              returns the value of x rounded to the number of decimal places specified by the value d
SQRT(x)                 returns the square-root value of x
For example:
SELECT round(salary), firstname FROM employee_info;
This statement will select the salary rounded to the nearest whole value and the firstname from the employee_info table.
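Because these functions are not standard, availability varies by database. As an illustration, the sketch below assumes Python's built-in sqlite3 module and uses only ABS, ROUND, and the % operator, which SQLite provides by default; functions such as FLOOR, POWER, and SQRT depend on how SQLite was built, so they are left out. The employee_info row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Functions and operators evaluated on literal values.
abs_val, remainder, rounded, rounded_2 = conn.execute(
    "SELECT abs(-7), 7 % 3, round(1234.56), round(1234.5678, 2)"
).fetchone()

# The tutorial's statement, run against one hypothetical row.
conn.execute("create table employee_info (firstname text, salary real)")
conn.execute("insert into employee_info values ('Ann', 51234.56)")
row = conn.execute(
    "SELECT round(salary), firstname FROM employee_info"
).fetchone()

print(abs_val, remainder, rounded, rounded_2)
print(row)
```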
For example:
SELECT "list-of-columns" FROM table1,table2 WHERE "search-condition(s)"
Joins can be explained more easily by demonstrating what would happen if you worked with one table only, and didn't have the ability to use "joins". This single table database is also sometimes referred to as a "flat table". Let's say you have a one-table database that is used to keep track of all of your customers and what they purchase from your store:
id  first  last  address  city  state  zip  date  item  price
Every time a new row is inserted into the table, all columns will be updated, thus resulting in unnecessary "redundant data". For example, every time Wolfgang Schultz purchases something, the following rows will be inserted into the table:
id     first     last     address         city  state  zip    date    item         price
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  032299  snowboard    45.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  082899  snow shovel  35.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  091199  gloves       15.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  100999  lantern      35.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  022900  tent         85.00
An ideal database would have two tables:
1. One for keeping track of your customers
2. And the other to keep track of what they purchase
"Customer_info" table:
customer_number  firstname  lastname  address  city  state  zip
"Purchases" table:
customer_number  date  item  price
Now, whenever a purchase is made by a repeat customer, only the 2nd table, "Purchases", needs to be updated! We've just eliminated useless redundant data, that is, we've just normalized this database! Notice how each of the tables has a common "customer_number" column. This column, which contains the unique customer number, will be used to JOIN the two tables. Using the two new tables, let's say you would like to select the customer's name, and items they've purchased. Here is an example of a join statement to accomplish this:
SELECT customer_info.firstname, customer_info.lastname, purchases.item FROM customer_info, purchases WHERE customer_info.customer_number = purchases.customer_number;
This particular "Join" is known as an "Inner Join" or "Equijoin". This is the most common type of "Join" that you will see or use. Notice that each of the columns is always preceded by the table name and a period. This isn't always required; however, it IS good practice so that you won't confuse which columns go with which tables. It is required if the column names are the same in the two tables. I recommend preceding all of your columns with the table names when using joins.
Note: The syntax described above will work with most database systems, including the one with this tutorial. However, in the event that this doesn't work with yours, please check your specific database documentation. Although the above will probably work, here is the ANSI SQL-92 syntax specification for an Inner Join, using the preceding statement, that you might want to try:
SELECT customer_info.firstname, customer_info.lastname, purchases.item FROM customer_info INNER JOIN purchases ON customer_info.customer_number = purchases.customer_number;
Another example:
SELECT employee_info.employeeid, employee_info.lastname, employee_sales.comission FROM employee_info, employee_sales WHERE employee_info.employeeid = employee_sales.employeeid;
This statement will select the employeeid, lastname (from the employee_info table), and the comission value (from the employee_sales table) for all of the rows where the employeeid in the employee_info table matches the employeeid in the employee_sales table.
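The two inner join syntaxes shown above should return the same rows. The sketch below checks that, assuming Python's built-in sqlite3 module and cut-down customer_info and purchases tables. The second customer, Ann Lee, is hypothetical and has no purchases, which shows that an inner join omits rows without a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute(
    "create table customer_info (customer_number integer, firstname text, lastname text)"
)
conn.execute(
    "create table purchases (customer_number integer, date text, item text, price real)"
)
conn.executemany(
    "insert into customer_info values (?, ?, ?)",
    [(10982, "Wolfgang", "Schultz"), (10983, "Ann", "Lee")],  # Ann Lee is made up
)
conn.executemany(
    "insert into purchases values (?, ?, ?, ?)",
    [(10982, "032299", "snowboard", 45.00), (10982, "022900", "tent", 85.00)],
)

# Join written with a comma-separated table list and a where clause.
where_join = conn.execute(
    "SELECT customer_info.firstname, customer_info.lastname, purchases.item "
    "FROM customer_info, purchases "
    "WHERE customer_info.customer_number = purchases.customer_number "
    "ORDER BY purchases.item"
).fetchall()

# The same join written with INNER JOIN ... ON.
inner_join = conn.execute(
    "SELECT customer_info.firstname, customer_info.lastname, purchases.item "
    "FROM customer_info INNER JOIN purchases "
    "ON customer_info.customer_number = purchases.customer_number "
    "ORDER BY purchases.item"
).fetchall()

print(where_join)
print(where_join == inner_join)
```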
SQL JOIN
The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables. Tables in a database are often related to each other with keys. A primary key is a column (or a combination of columns) with a unique value for each row. Each primary key value must be unique within the table. The purpose is to bind data together, across tables, without repeating all of the data in every table.
Look at the "Persons" table:
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
Note that the "P_Id" column is the primary key in the "Persons" table. This means that no two rows can have the same P_Id. The P_Id distinguishes two persons even if they have the same name.
Next, we have the "Orders" table:
O_Id  OrderNo  P_Id
1     77895    3
2     44678    3
3     22456    1
4     24562    1
5     34764    15
Note that the "O_Id" column is the primary key in the "Orders" table and that the "P_Id" column refers to the persons in the "Persons" table without using their names. Notice that the relationship between the two tables above is the "P_Id" column.
Before we continue with examples, we will list the types of JOIN you can use, and the differences between them.
JOIN: Return rows when there is at least one match in both tables
LEFT JOIN: Return all rows from the left table, even if there are no matches in the right table
RIGHT JOIN: Return all rows from the right table, even if there are no matches in the left table
FULL JOIN: Return rows when there is a match in one of the tables
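The difference between JOIN and LEFT JOIN shows up clearly on the "Persons" and "Orders" data. This is a sketch assuming Python's built-in sqlite3 module and cut-down versions of the two tables; RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute("CREATE TABLE Persons (P_Id int, LastName varchar(255))")
conn.execute("CREATE TABLE Orders (O_Id int, OrderNo int, P_Id int)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [(1, "Hansen"), (2, "Svendson"), (3, "Pettersen")],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, 77895, 3), (2, 44678, 3), (3, 22456, 1), (4, 24562, 1), (5, 34764, 15)],
)

# JOIN: only persons with a matching order, and only orders with a matching person.
inner_rows = conn.execute(
    "SELECT Persons.LastName, Orders.OrderNo FROM Persons "
    "JOIN Orders ON Persons.P_Id = Orders.P_Id "
    "ORDER BY Persons.LastName, Orders.OrderNo"
).fetchall()

# LEFT JOIN: every person appears; Svendson has no order, so OrderNo is NULL (None).
left_rows = conn.execute(
    "SELECT Persons.LastName, Orders.OrderNo FROM Persons "
    "LEFT JOIN Orders ON Persons.P_Id = Orders.P_Id "
    "ORDER BY Persons.LastName, Orders.OrderNo"
).fetchall()

print(inner_rows)
print(left_rows)
```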
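"Employees_Norway":
E_ID  E_Name
01    Hansen, Ola
02    Svendson, Tove
03    Svendson, Stephen
04    Pettersen, Kari
"Employees_USA":
E_ID  E_Name
01    Turner, Sally
02    Kent, Clark
03    Svendson, Stephen
04    Scott, Stephen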
Now we want to list all the different employees in Norway and USA. We use the following SELECT statement:
SELECT E_Name FROM Employees_Norway
UNION
SELECT E_Name FROM Employees_USA
The result-set will look like this:
E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Scott, Stephen
Note: This command cannot be used to list all employees in Norway and USA. In the example above we have two employees with equal names, and only one of them will be listed. The UNION command selects only distinct values.
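To list all employees, including those with equal names, use UNION ALL:
SELECT E_Name FROM Employees_Norway
UNION ALL
SELECT E_Name FROM Employees_USA
Result
E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Svendson, Stephen
Scott, Stephen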
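The effect of UNION versus UNION ALL on duplicate names can be reproduced with a short sketch. It assumes Python's built-in sqlite3 module, and it assumes the table contents implied by the result sets above (four names per table, with "Svendson, Stephen" present in both).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute("CREATE TABLE Employees_Norway (E_ID text, E_Name text)")
conn.execute("CREATE TABLE Employees_USA (E_ID text, E_Name text)")
conn.executemany(
    "INSERT INTO Employees_Norway VALUES (?, ?)",
    [("01", "Hansen, Ola"), ("02", "Svendson, Tove"),
     ("03", "Svendson, Stephen"), ("04", "Pettersen, Kari")],
)
conn.executemany(
    "INSERT INTO Employees_USA VALUES (?, ?)",
    [("01", "Turner, Sally"), ("02", "Kent, Clark"),
     ("03", "Svendson, Stephen"), ("04", "Scott, Stephen")],
)

# UNION keeps only distinct values, so the shared name appears once.
union_names = [r[0] for r in conn.execute(
    "SELECT E_Name FROM Employees_Norway UNION SELECT E_Name FROM Employees_USA"
)]

# UNION ALL keeps every row from both tables.
union_all_names = [r[0] for r in conn.execute(
    "SELECT E_Name FROM Employees_Norway UNION ALL SELECT E_Name FROM Employees_USA"
)]

print(len(union_names), len(union_all_names))
```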
SQL Alias
You can give a table or a column another name by using an alias. This can be a good thing to do if you have very long or complex table names or column names. An alias name could be anything, but usually it is short.
Alias Example
Assume we have a table called "Persons" and another table called "Product_Orders". We will give the tables the aliases "p" and "po" respectively. Now we want to list all the orders that "Ola Hansen" is responsible for. We use the following SELECT statement:
SELECT po.OrderID, p.LastName, p.FirstName
FROM Persons AS p, Product_Orders AS po
WHERE p.LastName='Hansen' AND p.FirstName='Ola'
The same SELECT statement without aliases:
SELECT Product_Orders.OrderID, Persons.LastName, Persons.FirstName
FROM Persons, Product_Orders
WHERE Persons.LastName='Hansen' AND Persons.FirstName='Ola'
As you can see from the two SELECT statements above, aliases can make queries easier both to write and to read.
SQL Constraints
Constraints are used to limit the type of data that can go into a table. Constraints can be specified when a table is created (with the CREATE TABLE statement) or after the table is created (with the ALTER TABLE statement). We will focus on the following constraints:
The following SQL creates a PRIMARY KEY on the "P_Id" column when the "Persons" table is created:
MySQL:
CREATE TABLE Persons
(
P_Id int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
PRIMARY KEY (P_Id)
)
The "Orders" table:
O_Id  OrderNo  P_Id
1     77895    3
2     44678    3
3     22456    2
4     24562    1
Note that the "P_Id" column in the "Orders" table points to the "P_Id" column in the "Persons" table. The "P_Id" column in the "Persons" table is the PRIMARY KEY in the "Persons" table. The "P_Id" column in the "Orders" table is a FOREIGN KEY in the "Orders" table.
The FOREIGN KEY constraint is used to prevent actions that would destroy links between tables. The FOREIGN KEY constraint also prevents invalid data from being inserted into the foreign key column, because the value has to be one of the values contained in the table it points to.
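To illustrate how a FOREIGN KEY rejects invalid data, here is a hedged sketch using Python's built-in sqlite3 module. The CREATE TABLE statement for "Orders" is an assumption (the text above does not show one), the tables are cut down to a few columns, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; other databases enforce them without such a switch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE Persons (P_Id int NOT NULL, LastName varchar(255) NOT NULL, "
    "PRIMARY KEY (P_Id))"
)
# Assumed definition: Orders.P_Id must match an existing Persons.P_Id.
conn.execute(
    "CREATE TABLE Orders (O_Id int NOT NULL, OrderNo int NOT NULL, P_Id int, "
    "PRIMARY KEY (O_Id), FOREIGN KEY (P_Id) REFERENCES Persons(P_Id))"
)

conn.execute("INSERT INTO Persons VALUES (3, 'Pettersen')")
conn.execute("INSERT INTO Orders VALUES (1, 77895, 3)")  # valid: person 3 exists

# Person 15 does not exist, so the constraint refuses the row.
try:
    conn.execute("INSERT INTO Orders VALUES (5, 34764, 15)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(rejected, order_count)
```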
To delete a column in a table, use the following syntax (notice that some database systems don't allow deleting a column):
ALTER TABLE table_name
DROP COLUMN column_name
To change the data type of a column in a table, use the following syntax:
ALTER TABLE table_name
ALTER COLUMN column_name datatype
Now we want to add a column named "DateOfBirth" in the "Persons" table. We use the following SQL statement:
ALTER TABLE Persons
ADD DateOfBirth date
Notice that the new column, "DateOfBirth", is of type date and is going to hold a date. The data type specifies what type of data the column can hold. For a complete reference of all the data types available in MS Access, MySQL, and SQL Server, go to our complete Data Types reference.
The "Persons" table will now look like this:
P_Id  LastName   FirstName  Address       City       DateOfBirth
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
Now we want to change the data type of the column named "DateOfBirth" in the "Persons" table. We use the following SQL statement:
ALTER TABLE Persons
ALTER COLUMN DateOfBirth year
Notice that the "DateOfBirth" column is now of type year and is going to hold a year in a two-digit or four-digit format.
SQL Views
A view is a virtual table. This chapter shows how to create, update, and delete a view.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.
CREATE VIEW [Category Sales For 1997] AS
SELECT DISTINCT CategoryName,Sum(ProductSales) AS CategorySales
FROM [Product Sales for 1997]
GROUP BY CategoryName
We can query the view above as follows:
SELECT * FROM [Category Sales For 1997]
We can also add a condition to the query. Now we want to see the total sale only for the category "Beverages":
SELECT * FROM [Category Sales For 1997]
WHERE CategoryName='Beverages'
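The view above can be exercised end to end with a small sketch. It assumes Python's built-in sqlite3 module (SQLite accepts the bracketed names), and it assumes that [Product Sales for 1997] is an ordinary table with the hypothetical columns and rows shown; in the original database it may itself be a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the source of the view.
conn.execute(
    "CREATE TABLE [Product Sales for 1997] "
    "(CategoryName text, ProductName text, ProductSales real)"
)
conn.executemany(
    "INSERT INTO [Product Sales for 1997] VALUES (?, ?, ?)",
    [("Beverages", "Chai", 100.0), ("Beverages", "Chang", 50.0), ("Dairy", "Brie", 30.0)],
)

# The view stores the query, not the data: it is re-evaluated on each select.
conn.execute(
    "CREATE VIEW [Category Sales For 1997] AS "
    "SELECT DISTINCT CategoryName, Sum(ProductSales) AS CategorySales "
    "FROM [Product Sales for 1997] GROUP BY CategoryName"
)

beverages = conn.execute(
    "SELECT * FROM [Category Sales For 1997] WHERE CategoryName='Beverages'"
).fetchall()

# A later insert into the base table is visible through the view.
conn.execute("INSERT INTO [Product Sales for 1997] VALUES ('Beverages', 'Cola', 25.0)")
beverages_after = conn.execute(
    "SELECT * FROM [Category Sales For 1997] WHERE CategoryName='Beverages'"
).fetchall()

print(beverages, beverages_after)
```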