SQL Notes
Aggregate functions return a calculated summary of a column, such as the sum of
salary or the average of salary. The available aggregate functions are:
1. SUM() 2. AVG() 3. COUNT() 4. MAX() 5. MIN() 6. COUNT(*)
COUNT(*) vs COUNT(column)
COUNT(*) counts the number of rows in the query output, whereas COUNT(column)
counts the values present in the given column, excluding NULL values.
Note: All aggregate functions except COUNT(*) ignore NULL values.
The GROUP BY clause divides the table into logical groups so that aggregate
functions can be applied to each group. The aggregate function then returns one
result per group. For example, to get the sum of salary for each department, we
group the table records by department.
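The difference between COUNT(*) and COUNT(column), and the effect of GROUP BY, can be sketched as follows. The table name emp_pay, its columns and the sample rows are invented for illustration and are not part of the notes.

```sql
-- Hypothetical sample table; names and data are illustrative only.
CREATE TABLE emp_pay (
    empid  INT PRIMARY KEY,
    dept   VARCHAR(20),
    salary INT               -- NULL means the salary is not recorded
);

INSERT INTO emp_pay (empid, dept, salary) VALUES
    (1, 'Sales', 30000),
    (2, 'Sales', 40000),
    (3, 'HR',    25000),
    (4, 'HR',    NULL);

-- COUNT(*) counts rows; COUNT(salary) skips the NULL salary.
SELECT COUNT(*) AS all_rows, COUNT(salary) AS salaries
FROM emp_pay;
-- all_rows = 4, salaries = 3

-- One output row per department; the NULL salary is ignored by SUM and AVG.
SELECT dept, SUM(salary) AS total, AVG(salary) AS average
FROM emp_pay
GROUP BY dept;
-- HR:    total 25000, average 25000
-- Sales: total 70000, average 35000
```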
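The MINUS operator compares the results of two queries and returns the distinct
rows from the result set of the first query that do not appear in the result set
of the second query. MINUS is Oracle syntax; standard SQL and MySQL (8.0.31 and
later) use the keyword EXCEPT for the same operation.
SELECT admn FROM s1 MINUS SELECT admn FROM s2;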
The UNION operator allows you to combine two or more result sets of queries
into a single result set. Duplicate rows are removed; UNION ALL keeps them.
SELECT admn FROM s1 UNION SELECT admn FROM s2;
The INTERSECT operator compares the result sets of two queries and returns
the distinct rows that are output by both queries. To use the INTERSECT operator
for two queries, follow these rules:
1. The order and the number of columns in the select list of the queries
must be the same.
2. The data types of the corresponding columns must be compatible.
SELECT admn FROM s1 INTERSECT SELECT admn FROM s2;
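The three set operators can be compared on a small data set. This is a sketch that assumes MySQL 8.0.31 or later (earlier versions support only UNION); the rows in s1 and s2 are invented for illustration.

```sql
-- Hypothetical tables of admission numbers; data is illustrative only.
CREATE TABLE s1 (admn INT);
CREATE TABLE s2 (admn INT);
INSERT INTO s1 (admn) VALUES (101), (102), (103);
INSERT INTO s2 (admn) VALUES (102), (103), (104);

-- Rows from either table, duplicates removed: 101, 102, 103, 104
SELECT admn FROM s1 UNION SELECT admn FROM s2;

-- Rows present in both tables: 102, 103
SELECT admn FROM s1 INTERSECT SELECT admn FROM s2;

-- Rows of s1 that are not in s2: 101
-- (written MINUS in Oracle, EXCEPT in standard SQL and MySQL)
SELECT admn FROM s1 EXCEPT SELECT admn FROM s2;
```

The order of the rows in each result is not guaranteed unless an ORDER BY clause is added.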
1. What is a database?
There are many definitions for a database. We will use a simple definition:
“A database is an organized collection of data, generally stored and accessed electronically from
a computer system.”
There are many examples of databases around us: an admission application’s database, a fee
collection database, and so on.
2. Need for database
A database is typically used to make the storage of and access to information quicker and more
efficient. Beyond this, there are many advantages of having computerised data. Some of
them are:
(a) Controlled redundancy : Data redundancy is the duplication of data, where some data is
repeated in many places. This increases the storage space required and may also result in
data inconsistency. A database, by its design, controls and limits data duplication to only
where and what is necessary.
(b) Data consistency : Data consistency can be defined as ‘Same data drawn from separate
places must not conflict with each other.’ This does not relate to the correctness of the data
but to the same data being available at all access points and at all times.
(c) Data integrity : Data integrity refers to data being accurate and consistent in the database. It
is very important that data integrity is maintained through various rules.
(d) Data security : There are many ways in which data stored in a database can be secured. There
can be multiple users with different kinds of access. Only authorised users can access the database.
(e) Standard structure : Data stored in a database must follow a certain standard structure. This
makes the data more understandable and portable.
3. RDBMS (Relational Database Management System)
A relational database management system (RDBMS) is a database management system (DBMS)
that is based on the relational model introduced by E. F. Codd. A relational database has the
following major components:
3.1 Table
A relational model uses the table as its basic structure for storing data. A table is an
arrangement of rows and columns and is identified by its name. Here is an example of an RDBMS
table:
[Image: example table]
3.2 Record or Tuple
Each row in the table is known as a tuple, also called a record. The following image shows a
row in the table.
[Image: a row of the table]
The number of rows of a table is the cardinality of the table. The above table has cardinality 5.
3.3 Attribute
Each column in the table is called an attribute. The attributes describe the entity. In the
example, a pet is described by its identification number, the name of the pet, the owner’s name,
the species of the pet, the gender of the pet and the pet’s date of birth.
The number of columns of the table is its degree. The degree of the above table is 6.
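As a rough illustration of degree and cardinality, the pet table described above could be declared as below. The table name, column names, data types and the two sample rows are assumptions made for this sketch; the notes give only the descriptions of the six attributes.

```sql
-- Hypothetical declaration of the pet table; names and types are assumed.
CREATE TABLE pet (
    petid   INT PRIMARY KEY,   -- identification number
    name    VARCHAR(20),       -- name of the pet
    owner   VARCHAR(20),       -- owner's name
    species VARCHAR(20),
    gender  CHAR(1),
    dob     DATE               -- date of birth
);

-- Invented sample rows, for illustration only.
INSERT INTO pet (petid, name, owner, species, gender, dob) VALUES
    (1, 'Tommy', 'Asha', 'dog', 'm', '2019-04-12'),
    (2, 'Kitty', 'Ravi', 'cat', 'f', '2020-11-03');

-- Cardinality: the number of rows (2 for the sample rows above).
SELECT COUNT(*) AS cardinality FROM pet;

-- Degree: the number of columns (6), read from MySQL's metadata tables.
SELECT COUNT(*) AS degree
FROM information_schema.columns
WHERE table_schema = DATABASE() AND table_name = 'pet';
```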
3.4 Domain
A domain is the set of permitted values for an attribute in a table. For example, a
domain of month-of-year can accept January, February, …, December as values. An attribute
cannot accept values that are outside its domain.
3.5 Keys
Keys play an important role in a relational database: they identify unique rows in a table
and establish relationships among tables. We are going to discuss four keys.
Primary key : A primary key is a column or set of columns in a table that uniquely identifies
tuples (rows) in that table.
Candidate key : Candidate keys are the columns or sets of columns that qualify to be the primary
key. To qualify, a column must hold unique and non-null values.
Alternate key : From all the candidate keys, one is selected as the primary key; the
remaining candidate keys are called alternate keys.
Foreign key : A foreign key is a column (or set of columns) of a table that points to the
primary key of another table. The values of a foreign key are drawn from the primary key of that
other table. Foreign keys set up relationships between tables.
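The four kinds of key can be seen together in one small schema. This is only a sketch: the department and staff tables and all their columns are hypothetical names chosen for illustration.

```sql
-- Hypothetical parent table.
CREATE TABLE department (
    deptid INT PRIMARY KEY,
    dname  VARCHAR(20)
);

-- Hypothetical child table. staffid and email are both candidate keys,
-- because each is unique and non-null.
CREATE TABLE staff (
    staffid INT NOT NULL,
    email   VARCHAR(50) NOT NULL,
    deptid  INT,
    PRIMARY KEY (staffid),      -- the candidate key chosen as primary key
    UNIQUE (email),             -- the remaining candidate key: an alternate key
    FOREIGN KEY (deptid) REFERENCES department (deptid)  -- foreign key
);
```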
SQL Introduction
Structured Query Language (SQL) is a standard database language used to create,
maintain and retrieve data from relational databases. Many software products
use SQL as their language. A few of them are
▪ MySQL ▪ Oracle ▪ SQL Server
The commands of SQL are called queries. They are categorised as follows:
• DML (Data Manipulation Language)
These commands are used to manipulate data in the relation. Some of the DML commands are
INSERT, UPDATE, DELETE.
• TCL (Transaction Control Language)
TCL commands can be used only with DML commands such as INSERT, UPDATE and DELETE. Some
TCL commands are COMMIT and ROLLBACK.
• DCL (Data Control Language)
DCL commands are used to grant authority to, and take it back from, a database user. Some DCL
commands are GRANT and REVOKE.
• DQL (Data Query Language)
These commands are used to extract data from the relations. There is only one DQL
command, and that is SELECT.
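One statement from each category is sketched below. The account table and the user 'clerk'@'localhost' are hypothetical; the example assumes MySQL with the default InnoDB storage engine (needed for ROLLBACK to take effect) and that the user already exists.

```sql
-- Setup: a hypothetical table to work on.
CREATE TABLE account (
    accno   INT PRIMARY KEY,
    balance INT
);

-- DML: add, change and remove rows.
INSERT INTO account (accno, balance) VALUES (1, 500), (2, 300);
UPDATE account SET balance = balance + 100 WHERE accno = 1;
DELETE FROM account WHERE accno = 2;

-- TCL: group DML statements into a transaction, then undo them.
START TRANSACTION;
UPDATE account SET balance = 0 WHERE accno = 1;
ROLLBACK;                       -- the balance of account 1 is 600 again

-- DQL: read the data.
SELECT accno, balance FROM account;   -- one row: 1, 600

-- DCL: assumes the user 'clerk'@'localhost' has already been created.
GRANT SELECT ON account TO 'clerk'@'localhost';
REVOKE SELECT ON account FROM 'clerk'@'localhost';
```

Using COMMIT in place of ROLLBACK would make the change inside the transaction permanent.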
Datatypes in MySQL
1. INT : This datatype stores integers, i.e. numbers without a fractional part. An integer can be
zero, positive or negative. In the CREATE TABLE command this datatype can be specified as ‘INT’
or as ‘INTEGER’. A display width can be given in parentheses, e.g. INT(11). By default this
value is 11, which means 11 characters can be displayed (10 digits plus a sign); the display
width does not limit the range of values that can be stored.
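2. DECIMAL (P,D) : This datatype stores exact numbers with a fractional part. The syntax is
column_name DECIMAL(P,D)
where P is the precision (the total number of digits) and D is the scale (the number of digits
after the decimal point). D should be less than or equal to P. The actual range of the decimal
column depends on the precision and scale.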
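How precision and scale fix the range of a DECIMAL column can be sketched with DECIMAL(6,2), which holds values from -9999.99 to 9999.99. The item table and its rows are invented for illustration, and the behaviour of the last statement assumes MySQL's default strict SQL mode.

```sql
-- Hypothetical table; names and data are illustrative only.
CREATE TABLE item (
    itemid INT,
    price  DECIMAL(6,2)    -- P = 6 digits in total, D = 2 after the point
);

INSERT INTO item (itemid, price) VALUES (1, 1234.56);    -- fits
INSERT INTO item (itemid, price) VALUES (2, -9999.99);   -- smallest value that fits

-- 12345.67 needs 7 digits, so in strict mode this statement is rejected
-- with an out-of-range error and no row is stored.
INSERT INTO item (itemid, price) VALUES (3, 12345.67);
```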
Constraints in MySQL
PRIMARY KEY : A PRIMARY KEY constraint forces a specific column of the table to accept only
unique and non-null data.
NOT NULL : This constraint specifies that a column cannot contain any NULL value. A
null value means the absence of a value.
UNIQUE : The UNIQUE constraint does not allow a duplicate value to be inserted in a column. The
difference between UNIQUE and PRIMARY KEY is that a UNIQUE column can have null values
while a PRIMARY KEY column cannot.
DEFAULT : While inserting data into a table, if no value is supplied for a column, then the
column gets the value set as DEFAULT.
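The four constraints can be tried out on one table. This is a sketch only; the member table, its columns, the default city and the sample names are all hypothetical.

```sql
-- Hypothetical table using each constraint once.
CREATE TABLE member (
    memberid INT PRIMARY KEY,
    name     VARCHAR(30) NOT NULL,
    email    VARCHAR(50) UNIQUE,
    city     VARCHAR(20) DEFAULT 'Delhi'
);

-- No city or email supplied: city becomes 'Delhi', email is NULL.
INSERT INTO member (memberid, name) VALUES (1, 'Asha');

-- A second NULL email is accepted, because UNIQUE allows NULL values.
INSERT INTO member (memberid, name) VALUES (2, 'Ravi');

-- Each of the following statements is rejected:
INSERT INTO member (memberid, name) VALUES (1, 'Meena');  -- duplicate primary key
INSERT INTO member (memberid, name) VALUES (3, NULL);     -- NOT NULL violated
```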
So far we have worked on a single table at a time. But keeping all the data in a single table can make
it bulky and difficult to manage. Therefore large data sets are often divided into multiple
connected tables.
1) Foreign Key
When we have multiple tables and need to establish a connection between them, we use a
foreign key. A FOREIGN KEY is a key used to link two tables together.
Recall the definition of a foreign key: a FOREIGN KEY is a field (or collection of fields) in
one table that refers to the PRIMARY KEY in another table.
The table containing the foreign key is called the child table, and the table containing the
primary key it refers to is called the referenced or parent table.
2) Referential Constraint
Referential integrity requires that every foreign key value has a matching primary key value
in the parent table. This constraint is specified between two tables (parent and child); it
maintains the correspondence between rows in these tables. It means the reference from a row
in one table to another table must be valid.
Example:
CREATE TABLE student (
    studentid CHAR(8) PRIMARY KEY,
    fname VARCHAR(20), lname VARCHAR(20),
    courseid VARCHAR(20),
    CONSTRAINT FK_course FOREIGN KEY (courseid) REFERENCES course(courseid)
);
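To see the constraint at work, the parent table must exist before the child table is created. The sketch below assumes a course table with a hypothetical cname column and invented sample rows, and it assumes the default InnoDB storage engine, which is the one that enforces foreign keys.

```sql
-- Parent table; the cname column and the data are assumptions.
CREATE TABLE course (
    courseid VARCHAR(20) PRIMARY KEY,
    cname    VARCHAR(30)
);

-- Child table, as in the example above.
CREATE TABLE student (
    studentid CHAR(8) PRIMARY KEY,
    fname     VARCHAR(20),
    lname     VARCHAR(20),
    courseid  VARCHAR(20),
    CONSTRAINT FK_course FOREIGN KEY (courseid) REFERENCES course (courseid)
);

INSERT INTO course (courseid, cname) VALUES ('C01', 'Physics');

-- Accepted: course 'C01' exists in the parent table.
INSERT INTO student (studentid, fname, lname, courseid)
VALUES ('S0000001', 'Amit', 'Rao', 'C01');

-- Rejected: there is no course 'C99' to refer to.
INSERT INTO student (studentid, fname, lname, courseid)
VALUES ('S0000002', 'Neha', 'Jain', 'C99');

-- Rejected: a student row still refers to course 'C01'.
DELETE FROM course WHERE courseid = 'C01';
```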
3) Cartesian Products
Cartesian product (also called Cross Join) of two tables is a table obtained by pairing up each
row of one table with each row of the other table. If the two tables have m and n rows, the
result has m × n rows.
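A minimal sketch of a Cartesian product, using two invented tables (shirt and fit are hypothetical names, as is the data):

```sql
-- Hypothetical tables; data is illustrative only.
CREATE TABLE shirt (colour VARCHAR(10));
CREATE TABLE fit (label VARCHAR(10));
INSERT INTO shirt (colour) VALUES ('red'), ('blue');
INSERT INTO fit (label) VALUES ('small'), ('medium'), ('large');

-- Every colour is paired with every label: 2 x 3 = 6 rows.
SELECT colour, label FROM shirt CROSS JOIN fit;

-- Listing both tables without a join condition gives the same 6 rows.
SELECT colour, label FROM shirt, fit;
```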
JOINS
1) Equi-Join
An SQL equi-join performs a JOIN based on equality between matching column values of the
associated tables. An equal sign (=) is used as the comparison operator in the WHERE clause
to express the equality.
2) Natural Join
A natural join is a type of equi-join that implicitly compares all columns with the same
name in both tables. The join result has only one column for each pair of equally named
columns.
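The difference between the two joins shows up in the columns of the result. The dept and emp tables below, with their columns and rows, are hypothetical and serve only as an illustration.

```sql
-- Hypothetical tables that share the column name deptid.
CREATE TABLE dept (deptid INT PRIMARY KEY, dname VARCHAR(20));
CREATE TABLE emp (empid INT PRIMARY KEY, ename VARCHAR(20), deptid INT);
INSERT INTO dept (deptid, dname) VALUES (10, 'Sales'), (20, 'HR');
INSERT INTO emp (empid, ename, deptid) VALUES
    (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', 10);

-- Equi-join: 3 rows, 5 columns (deptid appears once from each table).
SELECT * FROM emp, dept WHERE emp.deptid = dept.deptid;

-- Natural join: the same 3 rows, 4 columns (deptid appears only once).
SELECT * FROM emp NATURAL JOIN dept;
```

A natural join matches on every pair of equally named columns, so adding another shared column name to both tables would silently change its result.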
3) Left Join
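The SQL LEFT JOIN (specified with the keywords LEFT JOIN and ON) joins two tables and
fetches all matching rows of the two tables for which the join condition is true, plus the rows
from the first table that do not match any row in the second table. For those unmatched rows
the columns of the second table are NULL, so the following query counts the rows of gfs1 that
have no match in gfs2. The column in the WHERE clause must be qualified with its table name,
because admn exists in both tables:
SELECT COUNT(*) FROM gfs1 LEFT JOIN gfs2 ON gfs1.admn = gfs2.admn WHERE gfs2.admn IS NULL;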
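What a left join returns for unmatched rows can be sketched on sample data. The name and house columns and all rows are invented for illustration; the notes specify only the admn column of gfs1 and gfs2.

```sql
-- Hypothetical contents for gfs1 and gfs2.
CREATE TABLE gfs1 (admn INT, name VARCHAR(20));
CREATE TABLE gfs2 (admn INT, house VARCHAR(20));
INSERT INTO gfs1 (admn, name) VALUES (101, 'Asha'), (102, 'Ravi'), (103, 'Meena');
INSERT INTO gfs2 (admn, house) VALUES (102, 'Red'), (103, 'Blue');

-- All three rows of gfs1 appear; 101 has no match, so its house is NULL.
SELECT gfs1.admn, gfs1.name, gfs2.house
FROM gfs1 LEFT JOIN gfs2 ON gfs1.admn = gfs2.admn;
-- 101  Asha   NULL
-- 102  Ravi   Red
-- 103  Meena  Blue

-- Counting only the unmatched rows gives 1.
SELECT COUNT(*)
FROM gfs1 LEFT JOIN gfs2 ON gfs1.admn = gfs2.admn
WHERE gfs2.admn IS NULL;
```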
4) Union
Union is the operation of combining the output of two SELECT statements. The union of two
SELECT statements can be performed only if their outputs contain the same number of columns
and the data types of corresponding columns are compatible.
5) Intersection
The intersect operation is used to combine two SELECT statements, but it returns only the
records that are common to both SELECT statements.
To use the INTERSECT operator for two queries, follow these rules:
• The order and the number of columns in the select list of the queries must be the same.
• The data types of the corresponding columns must be compatible.
The same result can be obtained without INTERSECT by using an IN subquery; DISTINCT removes
duplicate rows, as INTERSECT does:
SELECT DISTINCT admn FROM gfs1 WHERE admn IN (SELECT admn FROM gfs2);
6) Minus
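The MINUS operator compares the results of two queries and returns the distinct rows from the
result set of the first query that do not appear in the result set of the second query.
The same result can be obtained with a LEFT JOIN that keeps only the unmatched rows. The
column in the WHERE clause must be qualified, because admn exists in both tables:
SELECT DISTINCT gfs1.admn FROM gfs1 LEFT JOIN gfs2 ON gfs1.admn = gfs2.admn WHERE gfs2.admn IS NULL;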
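The intersection and minus queries written with IN and LEFT JOIN can be checked on a small data set. The rows below are invented for illustration, and the sketch assumes that admn contains no NULL values.

```sql
-- Hypothetical contents for gfs1 and gfs2.
CREATE TABLE gfs1 (admn INT);
CREATE TABLE gfs2 (admn INT);
INSERT INTO gfs1 (admn) VALUES (101), (102), (103);
INSERT INTO gfs2 (admn) VALUES (102), (103), (104);

-- Intersection: admission numbers present in both tables: 102, 103
SELECT DISTINCT admn
FROM gfs1
WHERE admn IN (SELECT admn FROM gfs2);

-- Minus: admission numbers in gfs1 that are not in gfs2: 101
SELECT DISTINCT gfs1.admn
FROM gfs1 LEFT JOIN gfs2 ON gfs1.admn = gfs2.admn
WHERE gfs2.admn IS NULL;
```

These forms also run on MySQL versions older than 8.0.31, which do not have the INTERSECT and EXCEPT keywords.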