Unit4 SQL and Database Project at Students Notes
Definition: Relational algebra is an algebra whose operands are relations or variables that
represent relations.
Unary operations
By definition, a unary operation is an operation that uses only one operand (relation). In
relational algebra, the unary operations are selection and projection.
Selection operation
Selection is a unary operation that selects records satisfying a given predicate (criteria). It selects
a subset of records. The lowercase Greek letter sigma (σ) is used to denote selection. The
selection condition appears as a subscript to σ. The argument relation is given in parentheses
following the σ.
The selection condition or selection criterion can be any legally formed expression that involves:
Projection operation
The PROJECT operation is another unary operation. It returns a set of tuples containing a
subset of the attributes of the original relation. The SELECT operation selects some rows and
discards the others. The PROJECT operation, on the other hand, selects some columns of the
relation and discards the other columns. The PROJECT operation can be viewed as a vertical
filter of the relation.
The projection operation copies its argument relation but leaves out certain columns. The
attributes that should appear in the result are listed as a subscript to π.
Notice that if the projection produces two identical rows, the duplicate rows must be removed,
because a relation is a set and is not allowed to contain identical records.
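Selection and projection can be sketched with plain Python sets. This is only an illustration: the students relation, its column order (id, name, age) and the helper names select and project are assumptions made for the example, not part of these notes.

```python
# Hypothetical relation: each tuple is (student_id, name, age).
students = {
    (1, "Alice", 20),
    (2, "Bob", 17),
    (3, "Carol", 20),
}

def select(relation, predicate):
    """Selection (sigma): keep only the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}

def project(relation, positions):
    """Projection (pi): keep only the listed columns; the set removes duplicates."""
    return {tuple(row[i] for i in positions) for row in relation}

adults = select(students, lambda row: row[2] >= 18)  # sigma age >= 18 (students)
ages = project(students, [2])                        # pi age (students)

print(sorted(adults))
print(sorted(ages))
```

Two students share the age 20, yet the projection on age holds that value only once, because the result is a set.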
PREPARED BY GILBERT H. 1
Binary operations
A binary operation is an operation that uses two operands (relations). In Relational algebra, the
binary operations are Cartesian product, Union operator, Set Difference, Intersection, Theta-join
and Natural Join.
a. Cartesian product
The Cartesian product combines every tuple of one relation with every tuple of another, so the
result contains the columns of both relations. On its own, a Cartesian product is rarely a
meaningful operation. It becomes meaningful when it is followed by other operations. It is also
called Cross Product or Cross Join.
C. Intersection
An intersection is defined by the symbol ∩
A∩B
Defines a relation consisting of the set of all tuples that are in both A and B. However, A and
B must be union-compatible.
Join Operations
Join operation is essentially a Cartesian product followed by a selection criterion.
JOIN operation also allows joining variously related tuples from different relations.
Types of JOIN:
Inner Joins:
Theta join
EQUI join
Natural join
Outer join:
Inner Join:
In an inner join, only those tuples that satisfy the matching criteria are included, while the rest
are excluded. Let’s study various types of Inner Joins:
Theta Join:
The general case of the JOIN operation is called a theta join. It is denoted by the symbol θ.
Example
A ⋈θ B
EQUI join:
When a theta join uses only an equality condition, it becomes an equi join.
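The idea that a join is a Cartesian product followed by a selection can be sketched in Python. The relations A and B, their columns and the helper theta_join are hypothetical names chosen for this illustration.

```python
from itertools import product

# Hypothetical relations: A(id, price) and B(code, budget).
A = {(1, 50), (2, 80)}
B = {("x", 60), ("y", 80)}

def theta_join(r, s, condition):
    """Cartesian product of r and s, keeping only the pairs that satisfy the condition."""
    return {a + b for a, b in product(r, s) if condition(a, b)}

cross = {a + b for a, b in product(A, B)}              # plain Cartesian product
theta = theta_join(A, B, lambda a, b: a[1] < b[1])     # theta join on price < budget
equi = theta_join(A, B, lambda a, b: a[1] == b[1])     # equi join on price = budget

print(len(cross), sorted(theta), sorted(equi))
```

The Cartesian product has 2 x 2 = 4 tuples; each join keeps only the combinations that pass its condition.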
OUTER JOIN
In an outer join, along with tuples that satisfy the matching criteria, we also include some or all
tuples that do not match the criteria.
Summary
Operation (Symbol): Purpose
Select (σ): The SELECT operation is used for selecting a subset of the tuples according to a given selection condition.
Projection (π): The projection eliminates all attributes of the input relation except those mentioned in the projection list.
Union (∪): UNION is symbolized by ∪. It includes all tuples that are in A or in B.
Set Difference (-): Denoted by the - symbol. The result of A - B is a relation that includes all tuples that are in A but not in B.
Intersection (∩): Intersection defines a relation consisting of the set of all tuples that are in both A and B.
Cartesian Product (X): The Cartesian product is helpful for merging columns from two relations.
Inner Join: An inner join includes only those tuples that satisfy the matching criteria.
Theta Join (θ): The general case of the JOIN operation is called a theta join. It is denoted by the symbol θ.
EQUI Join: When a theta join uses only an equality condition, it becomes an equi join.
Natural Join (⋈): A natural join can only be performed if there is a common attribute (column) between the relations.
Outer Join: In an outer join, along with tuples that satisfy the matching criteria, some or all tuples that do not match the criteria are also included.
Left Outer Join: The left outer join keeps all tuples of the left relation.
Right Outer Join: The right outer join keeps all tuples of the right relation.
Full Outer Join: In a full outer join, all tuples from both relations are included in the result, irrespective of the matching condition.
Division operation
Division identifies attribute values from a relation that are paired with all of the values
from another relation.
The division operator is used for queries which involve the ‘all’.
The result contains those values from the first relation, R, that appear in R in combination
with every tuple of the second relation, S.
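A small sketch of division, assuming a hypothetical relation R(student, course) and S(course); the query it answers is "which students take all of the courses in S". The helper name divide is an assumption for this example.

```python
# Hypothetical relations: R(student, course) and S(course).
R = {("Ann", "DB"), ("Ann", "OS"), ("Ben", "DB"), ("Cy", "DB"), ("Cy", "OS")}
S = {("DB",), ("OS",)}

def divide(r, s):
    """Division: values x such that (x, y) is in r for every (y,) in s."""
    candidates = {x for x, _ in r}
    return {(x,) for x in candidates if all((x, y) in r for (y,) in s)}

result = divide(R, S)
print(sorted(result))
```

Ben is left out because he is paired with only one of the two courses in S.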
What is SQL?
SQL is a standard language for storing, manipulating and retrieving data in databases.
Commonly used statements are grouped into the following categories:
A data definition language (DDL) is a computer language used to create and modify the structure
of database objects in a database.
DDL Commands:
1. Create
2. Alter
3. Truncate
4. Drop
1.CREATE:
This command is used to create a new table in SQL. The user has to give information like table
name, column names, and their datatypes.
Syntax –
CREATE TABLE table_name
(
Column_1 datatype,
Column_2 datatype,
Column_3 datatype,
....
);
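As a minimal sketch, the CREATE TABLE syntax can be tried with Python's built-in sqlite3 module and an in-memory database. The table name Students and its columns are assumptions for this example, and SQLite's type names may differ from those of other database systems.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table following the CREATE TABLE syntax above.
conn.execute("""
    CREATE TABLE Students (
        StudentID INTEGER,
        FullName  TEXT,
        BirthDate TEXT
    )
""")

# PRAGMA table_info is SQLite-specific; index 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]
print(columns)
```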
2.ALTER:
This command is used to add, delete or change columns in an existing table. The user needs to
know the name of the existing table in order to add, delete or modify its columns.
Syntax –
Syntax to add a column to an existing table.
3.TRUNCATE:
This command is used to remove all rows from the table, but the structure of the table still exists.
Syntax –
Syntax to remove an existing table.
SQL Constraints
SQL constraints are a set of rules implemented on tables in relational databases to dictate
what data can be inserted, updated or deleted in its tables.
Constraints can be specified when the table is created with the CREATE TABLE statement, or
after the table is created with the ALTER TABLE statement.
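The sketch below shows constraints rejecting invalid rows. It uses Python's sqlite3 module; the Members table and the particular constraints (PRIMARY KEY, NOT NULL, CHECK) are assumptions picked for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Members (
        MemberID INTEGER PRIMARY KEY,        -- must be unique
        Name     TEXT NOT NULL,              -- must be present
        Age      INTEGER CHECK (Age >= 18)   -- must satisfy the rule
    )
""")
conn.execute("INSERT INTO Members (MemberID, Name, Age) VALUES (1, 'Alice', 30)")

# Each row below breaks exactly one constraint.
bad_rows = [(1, "Bob", 25), (2, None, 40), (3, "Carol", 12)]
rejected = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO Members (MemberID, Name, Age) VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[0])

count = conn.execute("SELECT COUNT(*) FROM Members").fetchone()[0]
print(rejected, count)
```

All three inserts are refused, so the table still holds only the first, valid row.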
Syntax
2. If you are adding values for all the columns of the table, you do not need to specify the column
names in the SQL query. However, make sure the order of the values is in the same order as the
columns in the table. Here, the INSERT INTO syntax would be as follows:
SELECT Syntax
SELECT column1, column2 ...
FROM table_name;
Here, column1, column2 ... are the field names of the table you want to select data from. If you
want to select all the fields available in the table, use the following syntax:
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values; and sometimes you only want to
list the different (distinct) values.
WHERE Syntax
SELECT column1, column2 ...
FROM table_name
WHERE condition;
The WHERE clause can be combined with AND, OR, and NOT operators.
The AND and OR operators are used to filter records based on more than one condition:
The AND operator displays a record if all the conditions separated by AND are TRUE.
The OR operator displays a record if any of the conditions separated by OR is TRUE.
AND Syntax
SELECT column1, column2 ...
FROM table_name
WHERE condition1 AND condition2 AND condition3 ...;
OR Syntax
SELECT column1, column2 ...
FROM table_name
WHERE condition1 OR condition2 OR condition3 ...;
NOT Syntax
SELECT column1, column2 ...
FROM table_name
WHERE NOT condition;
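The three operators can be compared on a small sample table. This is a sketch using Python's sqlite3 module; the Customers rows and the helper names(sql) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT, City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    ("Alfreds", "Germany", "Berlin"),
    ("Blauer", "Germany", "Mannheim"),
    ("Around", "UK", "London"),
    ("Bolido", "Spain", "Madrid"),
])

def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

base = "SELECT CustomerName FROM Customers WHERE "
order = " ORDER BY CustomerName"

both = names(base + "Country = 'Germany' AND City = 'Berlin'" + order)   # all conditions true
either = names(base + "Country = 'UK' OR Country = 'Spain'" + order)     # any condition true
negated = names(base + "NOT Country = 'Germany'" + order)                # condition not true

print(both, either, negated)
```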
SQL Aliases
SQL aliases are used to give a table, or a column in a table, a temporary name.
There are two wildcards often used in conjunction with the LIKE operator:
The percent sign (%) represents zero, one, or multiple characters
The underscore sign (_) represents one, single character
Note: MS Access uses an asterisk (*) instead of the percent sign (%), and a question mark (?)
instead of the underscore (_).
The percent sign and the underscore can also be used in combinations!
LIKE Syntax
SELECT column1, column2, ...
FROM table_name
WHERE columnN LIKE pattern;
Tip: You can also combine any number of conditions using AND or OR operators.
Here are some examples showing different LIKE operators with '%' and '_'
wildcards:
WHERE CustomerName LIKE 'a%'      Finds any values that start with "a"
WHERE CustomerName LIKE '%a'      Finds any values that end with "a"
WHERE CustomerName LIKE '%or%'    Finds any values that have "or" in any position
WHERE CustomerName LIKE '_r%'     Finds any values that have "r" in the second position
WHERE CustomerName LIKE 'a_%'     Finds any values that start with "a" and are at least 2 characters in length
WHERE CustomerName LIKE 'a__%'    Finds any values that start with "a" and are at least 3 characters in length
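These patterns can be checked against a few sample names. The sketch uses Python's sqlite3 module with made-up customer names; note that in SQLite, LIKE ignores case for ASCII letters by default, so 'a%' also matches names starting with "A". Other database systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?)",
                 [("Alfreds",), ("Ana",), ("Bora",), ("Ernst",), ("Al",)])

def matches(pattern):
    """Return the names matching a LIKE pattern, sorted by name."""
    sql = ("SELECT CustomerName FROM Customers "
           "WHERE CustomerName LIKE ? ORDER BY CustomerName")
    return [row[0] for row in conn.execute(sql, (pattern,))]

starts_with_a = matches("a%")
ends_with_a = matches("%a")
contains_or = matches("%or%")
r_in_second = matches("_r%")
a_min_three = matches("a__%")

print(starts_with_a, ends_with_a, contains_or, r_in_second, a_min_three)
```

"Al" matches 'a%' but not 'a__%', because the two underscores require at least two more characters after the "a".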
Wildcard characters are used with the LIKE operator. The LIKE operator is used in
a WHERE clause to search for a specified pattern in a column.
*    Represents zero or more characters                            bl* finds bl, black, blue, and blob
?    Represents a single character                                 h?t finds hot, hat, and hit
[]   Represents any single character within the brackets           h[oa]t finds hot and hat, but not hit
!    Represents any character not in the brackets                  h[!oa]t finds hit, but not hot and hat
-    Represents any single character within the specified range    c[a-b]t finds cat and cbt
#    Represents any single numeric character                       2#5 finds 205, 215, 225, 235, 245, 255, 265, 275, 285, and 295
IN Syntax
SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1, value2, ...);
or:
SELECT column_name(s)
FROM table_name
WHERE column_name IN (SELECT STATEMENT);
The BETWEEN operator selects values within a given range. The values can be numbers, text,
or dates.
The BETWEEN operator is inclusive: begin and end values are included.
BETWEEN Syntax
SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value1 AND value2;
The ORDER BY keyword is used to sort the result-set in ascending or descending order.
The ORDER BY keyword sorts the records in ascending order by default. To sort the records in
descending order, use the DESC keyword.
ORDER BY Syntax
SELECT column1, column2 ...
FROM table_name
ORDER BY column1, column2 ... ASC|DESC;
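IN, BETWEEN and ORDER BY can be tried together on a small table. This is a sketch with Python's sqlite3 module; the Products table and its prices are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Price REAL)")
conn.executemany("INSERT INTO Products VALUES (?, ?)", [
    ("Chai", 18), ("Chang", 19), ("Syrup", 10), ("Tofu", 23.25), ("Pavlova", 17.45),
])

# IN: the value must equal one of the listed values.
in_list = [row[0] for row in conn.execute(
    "SELECT ProductName FROM Products "
    "WHERE ProductName IN ('Chai', 'Tofu') ORDER BY ProductName")]

# BETWEEN is inclusive, so prices of exactly 10 and 18 are kept; DESC sorts high to low.
between = [row[0] for row in conn.execute(
    "SELECT ProductName FROM Products "
    "WHERE Price BETWEEN 10 AND 18 ORDER BY Price DESC")]

print(in_list, between)
```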
UPDATE Syntax
UPDATE table_name
SET column1 = value1, column2 = value2 ...
WHERE condition;
Note: Be careful when updating records in a table! Notice the WHERE clause in
the UPDATE statement. The WHERE clause specifies which record(s) should be
updated. If you omit the WHERE clause, all records in the table will be updated!
DELETE Syntax
DELETE FROM table_name WHERE condition;
Note: Be careful when deleting records in a table! Notice the WHERE clause in
the DELETE statement. The WHERE clause specifies which record(s) should be deleted. If you
omit the WHERE clause, all records in the table will be deleted!
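The effect of the WHERE clause can be observed by counting the affected rows. A minimal sketch with Python's sqlite3 module, assuming a hypothetical three-row Customers table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    (1, "Alfreds", "Berlin"), (2, "Ana", "Mexico"), (3, "Around", "London"),
])

# rowcount reports how many rows the statement changed.
updated = conn.execute(
    "UPDATE Customers SET City = 'Frankfurt' WHERE CustomerID = 1").rowcount
deleted = conn.execute(
    "DELETE FROM Customers WHERE CustomerID = 3").rowcount

remaining = conn.execute(
    "SELECT CustomerID, City FROM Customers ORDER BY CustomerID").fetchall()
print(updated, deleted, remaining)
```

Each statement touches exactly one row because its WHERE clause names a single CustomerID; without the WHERE clause every row would be affected.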
Aggregate functions:
GENERAL SYNTAX
SELECT aggregate_function(column_name)
FROM table_name
WHERE condition;
The GROUP BY statement groups rows that have the same values into summary rows, like "find
the number of customers in each country".
GROUP BY Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);
Example
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
The HAVING clause was added to SQL because the WHERE keyword cannot be used with
aggregate functions.
HAVING Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);
Example
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5;
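GROUP BY and HAVING can be tried on a small sample. This sketch uses Python's sqlite3 module with six made-up customers, and it lowers the HAVING threshold to 1 so that the tiny data set still produces a result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [
    (1, "Germany"), (2, "Germany"), (3, "Germany"),
    (4, "UK"), (5, "UK"), (6, "Spain"),
])

# One summary row per country.
per_country = conn.execute(
    "SELECT COUNT(CustomerID), Country FROM Customers "
    "GROUP BY Country ORDER BY Country").fetchall()

# HAVING filters the groups after the counting has been done.
busy_countries = conn.execute(
    "SELECT COUNT(CustomerID), Country FROM Customers "
    "GROUP BY Country HAVING COUNT(CustomerID) > 1 ORDER BY Country").fetchall()

print(per_country, busy_countries)
```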
SQL string functions are used primarily for string manipulation. The following table details the
important string functions:
NAME DESCRIPTION
CONCAT() Returns concatenated string
LCase() Synonym for LOWER()
UCase() Synonym for UPPER()
Left() Returns the leftmost number of characters as specified
Right() Returns the rightmost number of characters as specified
Mid() Returns the middle number of characters as specified
Space() Returns a string of the specified number of spaces
Reverse() Reverses the characters in a string
Length() Returns the length of a string in bytes
Strcmp() Compares two strings
A JOIN clause is used to combine rows from two or more tables, based on a
related column between them.
OrderID  CustomerID  OrderDate
10308    2           1996-09-18
10309    37          1996-09-19
10310    77          1996-09-20
Notice that the "CustomerID" column in the "Orders" table refers to the
"CustomerID" in the "Customers" table. The relationship between the two tables
above is the "CustomerID" column.
Then, we can create the following SQL statement (that contains an INNER JOIN),
that selects records that have matching values in both tables:
Example
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers ON Orders.CustomerID=Customers.CustomerID;
(INNER) JOIN: Returns records that have matching values in both tables
LEFT (OUTER) JOIN: Returns all records from the left table, and the
matched records from the right table
RIGHT (OUTER) JOIN: Returns all records from the right table, and the
matched records from the left table
FULL (OUTER) JOIN: Returns all records when there is a match in either
left or right table
1. SQL INNER JOIN Keyword
The INNER JOIN keyword selects records that have matching values in both tables.
The LEFT JOIN keyword returns all records from the left table (table1), and the matching
records from the right table (table2). If there is no match, the columns from the right table
contain NULL.
The RIGHT JOIN keyword returns all records from the right table (table2), and the matching
records from the left table (table1). If there is no match, the columns from the left table
contain NULL.
The FULL OUTER JOIN keyword returns all records when there is a match in left (table1) or
right (table2) table records.
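The difference between INNER JOIN and LEFT JOIN can be seen on a small sample. The sketch below uses Python's sqlite3 module; the three Orders rows come from these notes, while the Customers rows are made up for illustration. RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)",
                 [(1, "Alfreds"), (2, "Ana"), (3, "Antonio")])  # hypothetical customers
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", [
    (10308, 2, "1996-09-18"), (10309, 37, "1996-09-19"), (10310, 77, "1996-09-20"),
])

# INNER JOIN: only orders whose CustomerID exists in Customers.
inner = conn.execute(
    "SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate "
    "FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID"
).fetchall()

# LEFT JOIN: every customer, with NULL (None in Python) where no order matches.
left = conn.execute(
    "SELECT Customers.CustomerName, Orders.OrderID "
    "FROM Customers LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID "
    "ORDER BY Customers.CustomerID"
).fetchall()

print(inner)
print(left)
```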
A self join is a regular join, but the table is joined with itself.
Every SELECT statement within UNION must have the same number of columns
The columns must also have similar data types
The columns in every SELECT statement must also be in the same order
UNION Syntax
SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2;
The UNION operator selects only distinct values by default. To allow duplicate values,
use UNION ALL:
Note: The column names in the result-set are usually equal to the column names in the
first SELECT statement.
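UNION and UNION ALL can be compared directly. A sketch with Python's sqlite3 module, assuming two hypothetical tables, Customers and Suppliers, that each have a City column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (City TEXT)")
conn.execute("CREATE TABLE Suppliers (City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?)", [("Berlin",), ("London",)])
conn.executemany("INSERT INTO Suppliers VALUES (?)", [("London",), ("Paris",)])

# UNION removes duplicate rows from the combined result.
distinct_cities = [row[0] for row in conn.execute(
    "SELECT City FROM Customers UNION SELECT City FROM Suppliers ORDER BY City")]

# UNION ALL keeps every row, including duplicates.
all_cities = [row[0] for row in conn.execute(
    "SELECT City FROM Customers UNION ALL SELECT City FROM Suppliers ORDER BY City")]

print(distinct_cities, all_cities)
```

London appears in both tables, so it is listed once by UNION and twice by UNION ALL.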
The SQL INTERSECT operator is used to return the results of 2 or more SELECT statements.
However, it only returns the rows selected by all queries or data sets. If a record exists in one
query and not in the other, it will be omitted from the INTERSECT results.
Syntax
The syntax for the INTERSECT operator in SQL is:
SELECT column_name(s)
FROM tables
[WHERE conditions]
INTERSECT
SELECT column_name(s)
FROM tables
[WHERE conditions];
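INTERSECT can be tried with Python's sqlite3 module. The Customers and Suppliers tables below are hypothetical; EXCEPT is included for contrast, since it returns the rows of the first query that the second query does not return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (City TEXT)")
conn.execute("CREATE TABLE Suppliers (City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?)", [("Berlin",), ("London",)])
conn.executemany("INSERT INTO Suppliers VALUES (?)", [("London",), ("Paris",)])

# INTERSECT: only rows returned by both queries.
in_both = [row[0] for row in conn.execute(
    "SELECT City FROM Customers INTERSECT SELECT City FROM Suppliers")]

# EXCEPT: rows of the first query that the second query does not return.
only_customers = [row[0] for row in conn.execute(
    "SELECT City FROM Customers EXCEPT SELECT City FROM Suppliers")]

print(in_both, only_customers)
```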
Syntax
EXCEPT
GRANT privilege_name
ON object_name
TO {user_name |PUBLIC |role_name}
[WITH GRANT OPTION];
For Example:
The REVOKE command removes user access rights or privileges to the database
objects.
For Example:
DATABASE SECURITY CONCEPT
Entity (or table) integrity requires that all rows in a table have a unique identifier,
known as the primary key value. Whether the primary key value can be changed,
or whether the whole row can be deleted, depends on the level of integrity required
between the primary key and any other tables.
Referential integrity ensures that the relationship between the primary key (in a
referenced table) and the foreign key (in each of the referencing tables) is always
maintained.
Domain (or column) integrity specifies the set of data values that are valid for a
column and determines whether null values are allowed. Domain integrity is
enforced by validity checking and by restricting the data type, format, or range of
possible values allowed in a column.
User-Defined integrity: Enforces some specific business rules that do not fall into
entity, domain, or referential integrity.
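Referential integrity can be demonstrated with a foreign key. The sketch below uses Python's sqlite3 module with hypothetical Customers and Orders tables; SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch for enforcement
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE Orders ("
    " OrderID INTEGER PRIMARY KEY,"
    " CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID))")
conn.execute("INSERT INTO Customers VALUES (2, 'Ana')")
conn.execute("INSERT INTO Orders VALUES (10308, 2)")

statements = [
    "INSERT INTO Orders VALUES (10309, 37)",       # no customer 37 exists
    "DELETE FROM Customers WHERE CustomerID = 2",  # an order still refers to customer 2
]
errors = []
for sql in statements:
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as error:
        errors.append(str(error))

print(errors)
```

Both statements are refused, so every order keeps pointing at an existing customer.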
2. Availability
Availability is the condition wherein a given resource can be accessed by its
consumers. In terms of databases, availability means that the users of the data
(applications, customers, and business users) can access the database. Any
condition that renders the resource inaccessible causes the opposite of
availability: unavailability.
Reliability: the ability to deliver service at specified levels for a stated period
All four of these “abilities” impact the overall availability of a system, database, or
application.
3. Data Privacy
Privacy of information is extremely important in this digital age, where everything
is interconnected and can be accessed and used easily. The risk of private
information being exposed is very real, which is why data privacy is required.
Importance of Data Security
4. Confidentiality
a. Backup
b. Remote access
Individuals and institutions or companies, both small and big, use databases in
their daily business. Institutions often have agencies spread around the country,
the region or the world. Umwalimu SACCO is a savings and credit cooperative that
helps teachers improve their lives by giving them loans at low interest rates.
This institution has agencies in different districts. The central agency is located
in Kigali and hosts the main database of all members of Umwalimu SACCO in
Rwanda. When a client asks for a service at an agency, the teller requests
permission from Kigali by identifying and authenticating himself or herself so that
authorization can be granted. The whole network works in Client/Server mode.
Connecting to the server from a distance is what we call "Remote Access". Hence,
the database is accessed remotely. This requires security measures, because
otherwise anybody could disrupt the system and hack the whole business system of
Umwalimu SACCO.
c. Concurrency control
The process of managing simultaneous operations on the database without having them
interfere with one another.
DATABASE THREATS
A threat is any situation or event, whether intentional or unintentional, that will
adversely affect a system and consequently an organization.
Threats to databases result in the loss or degradation of some or all of the following
security goals: integrity, availability, and confidentiality.
Database protection
A database can be protected through access control and data encryption.
Access control
Data encryption
The encoding of the data by a special algorithm that renders the data
unreadable by any program without the decryption key.
A database is planned, designed, created and managed by following a defined
sequence of steps known as the Database System Development Lifecycle.
Those steps are:
Database planning
System definition
Requirements collection and analysis
Database design
DBMS selection (optional)
Prototyping (optional)
Implementation
Data conversion and loading
Testing
Operational maintenance
END UNIT