SQL: The Ultimate Beginner’s Guide To Learn SQL Programming Step-by-Step
Mark Reed
© Copyright 2018 - All rights reserved.
The content contained within this book may not be reproduced, duplicated or transmitted without direct
written permission from the author or the publisher.
Under no circumstances will any blame or legal responsibility be held against the publisher, or author,
for any damages, reparation, or monetary loss due to the information contained within this book. Either
directly or indirectly.
Legal Notice:
This book is copyright protected. This book is only for personal use. You cannot amend, distribute, sell,
use, quote or paraphrase any part, or the content within this book, without the consent of the author or
publisher.
Disclaimer Notice:
Please note the information contained within this document is for educational and entertainment
purposes only. Every effort has been made to present accurate, up-to-date, reliable, and complete
information. No warranties of any kind are declared or implied. Readers acknowledge that the author is
not engaging in the rendering of legal, financial, medical or professional advice. The content within this
book has been derived from various sources. Please consult a licensed professional before attempting
any techniques outlined in this book.
By reading this document, the reader agrees that under no circumstances is the author responsible for
any losses, direct or indirect, which are incurred as a result of the use of information contained within
this document, including, but not limited to, errors, omissions, or inaccuracies.
Table of Contents
Introduction
Chapter 1: Understanding Databases
What’s a Database?
Database Management Systems
Flat Files
Database Types
The Relational Database
The Object-Relational Database
Chapter 2: Using Queries to Obtain Data
The Basics
Query Structure and the SELECT Statement
The WHERE Clause
Using ORDER BY
Subqueries
Filtering Data with Subqueries
Derived Tables with Subqueries
Table Expressions
Cross Tabulations
Tabulating
Chapter 3: The Data Definition Language (DDL)
Using DDL to Create
Adding Foreign Keys with ALTER
Creating Foreign Key DDL
Unique Constraints
Deleting Tables and Databases
How to Create Views
Chapter 4: SQL Joins and Union
INNER JOIN
RIGHT JOIN
LEFT JOIN
The UNION Statement
The UNION ALL Statement
Chapter 5: Data Integrity
Integrity Constraints
The Not Null Constraint
The Unique Constraint
The PRIMARY KEY Constraint
The FOREIGN KEY Constraint
The MATCH Part
The Referential Action
The CHECK Constraint
Defining the Assertion
Using the Domain Constraint
Chapter 6: Creating an SQL View
Adding a View to the Database
Defining the View
Creating an Updatable View
How to Drop a View
Database Security
The Security Scheme
Creating and Deleting a Role
How to Assign and Revoke a Privilege
Chapter 7: Database Setup
Creating a Database
Deleting a Database
Schema Creation
Specific Software Considerations
Creating Tables and Inserting Data
How to Create a Table
Creating a New Table Based on Existing Tables
How to Insert Data into a Table
Inserting New Data
Inserting Data into Specific Columns
Inserting NULL Values
Chapter 8: Table Manipulation
Altering Column Attributes
Renaming Columns
Deleting a Column
Adding a New Column
Alter a Column without Modifying the Name
Chapter 9: Time
Datetime Data Types
Time Periods
Time Period Tables
System Versioned Tables
Chapter 10: Database Administration
Recovery Models
Database Backup Methods
Restoring a Database
Restore Types
Attaching and Detaching Databases
Detaching the Database
Attaching Databases
Chapter 11: Logins, Users and Roles
Server Logins
Server Roles
Assigning Server Roles
Database Users and Roles
The LIKE Clause
The COUNT Function
The AVG Function
The ROUND Function
The SUM Function
The MAX() Function
The MIN() Function
Chapter 12: Dealing with Errors
SQLSTATE
The WHENEVER Clause
Diagnostics
Exceptions
Conclusion
The Next Steps
References
Introduction
SQL, which stands for Structured Query Language, is a particular
programming language that’s focused on managing and manipulating
databases rather than writing general-purpose software. After decades of
development, the language entered the mainstream and became the standard
choice for data mining and database management; combined with
object-oriented programming languages such as C++, it can also be used to
build complex business applications.
Computers store data in a categorized fashion. Our hard drives are host to
many files, whether they’re text files, sound files, or image files. Accessing
and manipulating such diverse data can’t be done efficiently without some
kind of system that can bring everything together. This is where the database
comes in.
A database can be described as a collection of data that’s stored on a
computer in such a way that we can easily access, manipulate, and manage it.
They work by storing all of our information in an assemblage of tables. This
is where SQL comes in handy, and with the help of this book, you’re going to
explore its most important features.
Databases and database management systems are important for companies,
government agencies, and many other organizations. SQL experts are always
needed, and a lot of people shy away from such work, thinking that you need
to be gifted to manage a database. That’s not the case at all, and this book
seeks to lift the veil of mystery by guiding you through the foundation of a
database, the core elements of SQL, and teaching you how to work with data.
The goal of this book is to offer you the tools you need to create a relational
database and use SQL to manipulate and manage its data.
Chapter 1:
Understanding Databases
Before the age of computers, we used typewriters and various mechanical
calculators to record data and perform mathematical operations with it. That
data was then stored in massive rooms packed with cabinets and boxes with
paperwork. Nowadays, we don’t have to rely on limiting physical space to
keep track of information. We have computers and databases.
To keep all of this data safe, however, we need to prepare:
Our two row and column variables are the political candidate and the district.
Values can be found in each cell where the row intersects with the column.
Now, let’s talk more about actual cross tabulations.
By default, SQL isn’t capable of generating cross tabulations because it lacks
the functions for it. So, the solution is to work with a well-developed
database management system like PostgreSQL, because we can find modules
for it that can extend its usability through non-standard SQL functions and
features. Take note that you can’t use the same module for every database
management system. If you’re using PostgreSQL, you can install the
tablefunc module to get the cross tabulation feature. If you use another
system, you’ll need to research which modules are available, assuming the
system supports such features at all. Before we get to our actual cross
tabulation, here’s how the
module is installed in PostgreSQL:
CREATE EXTENSION tablefunc;
The module will now be installed and you’ll be able to work with the syntax
needed for cross tabulations. We’ll go over it in the following section.
Tabulating
Let’s say you’re part of a corporation that likes to set up all sorts of fun
activities for team building during certain weekends. Such activities are often
planned between different offices or departments to stoke a bit of competition
as well, but the employees can’t agree on which activity to go for. So, we
need to analyze some data to figure out which activity is the most preferred
by the majority. Let’s say we’ll conduct a survey with 200 participants. We’ll
have a participantID column, as well as officebranch and funactivity columns.
Based on the data we collect and analyze, we will present a readable result
that can be added to a simple report for the company’s executive. Here’s how
this scenario would play out in SQL:
CREATE TABLE activitychoice (
participantID integer PRIMARY KEY,
officebranch varchar(30),
funactivity varchar(30)
);
COPY activitychoice
FROM 'C:\MyFolder\activitychoice.csv'
WITH (FORMAT CSV, HEADER);
Take note that the .CSV file is the survey we conducted on the 200
participants. So, all of the results are stored in a plain comma-separated file that any spreadsheet program can open.
Let’s continue with the exercise and see what kind of results we could get:
SELECT *
FROM activitychoice
LIMIT 5;
For the sake of this example, we’re going to pretend that most participants
from the five office branches chose paintball. Next, we need to generate the
cross tabulation like so:
SELECT *
FROM crosstab('SELECT officebranch, funactivity, count(*)
FROM activitychoice
GROUP BY officebranch, funactivity
ORDER BY officebranch',
'SELECT funactivity
FROM activitychoice
GROUP BY funactivity
ORDER BY funactivity')
AS (officebranch varchar(30),
paintball bigint,
archery bigint,
football bigint);
As you can see, we have a SELECT statement that is used to return the
contents from the cross tabulation function. Afterwards, we execute two
subqueries inside that function, with the first one generating its own data by
taking the input from three data columns (officebranch that has data on the
office locations, funactivities which has data on the preset activities, and
lastly the column that contains the intersecting values).
In order to return the number of the participants that chose a certain activity,
we need to cross the data objects. To do that, we execute a subquery to
generate a list in which another subquery will insert the categories. The
crosstab function is then used to instruct the second subquery to only return a
single column on which we apply the SELECT statement. That’s how we
gain access to all the activities in the list so that we can group them and return
the values.
You’ll notice that we’re also using the AS keyword. In this example we need
it to specify the data types inside our cross-table’s data columns. Just make
sure you have a match for the names, otherwise the subqueries won’t work
properly. So, if the subquery returns the activities in a certain order, the result
column must apply the same rules. Here’s how our theoretical end-result
would look:
officebranch  paintball  archery  football
Central       19         29       23
Uptown        47                  17
Downfield     20         14       24
The data is now readable, and we can add it to a simple report that can be
shown to the manager. We can see which activity is the preferred one, but we
also spotted a null value in the second column. This simply means that
nobody from Uptown chose archery.
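When the tablefunc extension isn’t available, the same kind of pivot can be built in plain SQL with conditional aggregation, counting only the rows that match each activity. Here’s a minimal sketch using Python’s built-in sqlite3 module; the table mirrors the example above, but the five survey rows are invented for illustration:

```python
import sqlite3

# In-memory database standing in for the survey data above
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE activitychoice (
    participantID INTEGER PRIMARY KEY,
    officebranch VARCHAR(30),
    funactivity VARCHAR(30))""")
conn.executemany(
    "INSERT INTO activitychoice VALUES (?, ?, ?)",
    [(1, 'Central', 'paintball'), (2, 'Central', 'archery'),
     (3, 'Uptown', 'paintball'), (4, 'Uptown', 'football'),
     (5, 'Downfield', 'archery')])

# Pivot: one row per branch, one column per activity, by
# counting only the rows that match each activity.
rows = conn.execute("""
    SELECT officebranch,
           SUM(CASE WHEN funactivity = 'paintball' THEN 1 ELSE 0 END) AS paintball,
           SUM(CASE WHEN funactivity = 'archery'   THEN 1 ELSE 0 END) AS archery,
           SUM(CASE WHEN funactivity = 'football'  THEN 1 ELSE 0 END) AS football
    FROM activitychoice
    GROUP BY officebranch
    ORDER BY officebranch
""").fetchall()
```

Unlike crosstab, this approach needs the activity names spelled out in the query, but it works on any SQL database.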
Chapter 3:
The Data Definition Language (DDL)
The data definition language is needed to define, delete, or manipulate
databases and the data structures within them through the use of the CREATE,
ALTER, and DROP keywords. In this chapter we are going to discuss each aspect
of SQL’s DDL.
Using DDL to Create
As mentioned before, the DDL is needed to build databases in the
management system we use. Here’s how the syntax looks for this process:
CREATE DATABASE my_Database
For instance, when we set up a database containing some information such as
“customer details,” we are going to write the following statement:
CREATE DATABASE customer_details
Keep in mind that while SQL keywords themselves are not case sensitive,
identifiers such as database and table names can be, depending on the database
system you use. With that in mind, let’s set up a number of customer
tables that will contain all of our data on our customers that’s stored in the
database we just created. This is a perfect example of working with a
relational database management system since all of our data is linked and
therefore accessing or managing our information is an easy task. Now, let’s
use the following code:
CREATE TABLE my_table
(
table_column-1 data_type,
table_column-2 data_type,
table_column-3 data_type,
…
table_column-n data_type
)
CREATE TABLE customer_accounts
(
acct_no INTEGER PRIMARY KEY,
acct_bal DOUBLE,
acct_type INTEGER,
acct_opening_date DATE
)[3]
Take note of the primary key attribute. Its purpose is to guarantee that the
“acct_no” column will contain only unique values. In addition, we won’t
have any null values. Every table should have a primary key field, and other
columns can be declared “not null” in order to make sure that they won’t
take in any null values. Furthermore, we also need a “foreign
key” so that the linked data record from other tables won’t be removed by
accident. The column with this attribute is actually just a copy of the primary
key from the other table. So, for instance, in order to create a new table like
“customer_personal_info” inside our database, we can type the following
lines:
CREATE TABLE customer_personal_info
(
cust_id INTEGER PRIMARY KEY,
first_name VARCHAR(100) NOT NULL,
second_name VARCHAR(100),
lastname VARCHAR(100) NOT NULL,
sex VARCHAR(5),
date_of_birth DATE,
address VARCHAR(200)
)[4]
As you can see, in this new table we have a cust_id primary key column. The
customer_accounts table will also have to include a cust_id column in order
to make the connection with this table. This is how we gain customer data by
using only an account number. So, the ideal method of ensuring the integrity
of information added to the two tables is to add a cust_id key in the form of a
foreign key to the customer_accounts data table. This way the data from one
table that’s related to that of another table can’t be removed by accident.
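Here’s a minimal sketch of that arrangement using Python’s built-in sqlite3 module, with the column lists trimmed and the sample values invented; note that sqlite only enforces foreign keys after PRAGMA foreign_keys = ON:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite checks FKs only when this is on

# Parent table: the primary key other tables will reference
conn.execute("""CREATE TABLE customer_personal_info (
    cust_id INTEGER PRIMARY KEY,
    first_name VARCHAR(100) NOT NULL)""")
# Child table: cust_id here is a foreign key copy of the parent key
conn.execute("""CREATE TABLE customer_accounts (
    acct_no INTEGER PRIMARY KEY,
    cust_id INTEGER,
    FOREIGN KEY (cust_id) REFERENCES customer_personal_info (cust_id))""")

conn.execute("INSERT INTO customer_personal_info VALUES (3231, 'Juan')")
conn.execute("INSERT INTO customer_accounts VALUES (411003, 3231)")  # parent exists: ok

# An account pointing at a customer that doesn't exist is rejected...
try:
    conn.execute("INSERT INTO customer_accounts VALUES (411004, 9999)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# ...and so is deleting a customer that still has an account.
try:
    conn.execute("DELETE FROM customer_personal_info WHERE cust_id = 3231")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False
```

Both failed operations raise an integrity error, which is exactly the accidental-removal protection described above.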
Adding Foreign Keys with ALTER
We created a new table, so now we need to change the other table in order to
include the foreign key. Here’s the DDL syntax we need to use:
ALTER TABLE mytable
ADD FOREIGN KEY (targeted_column)
REFERENCES related_table (related_column)
Next, we need to write the following statements in order to include the key to
the customer_accounts table and reference it to the cust_id from the other
table.
ALTER TABLE customer_accounts
ADD FOREIGN KEY (cust_id)
REFERENCES customer_personal_info(cust_id)
Creating Foreign Key DDL
In some scenarios we’ll have to create foreign keys when we set up a new
table. Here’s how we can do that:
CREATE TABLE my_table
(
column_1 data_type,
FOREIGN KEY (column_1) REFERENCES related_table (related_column)
)
Unique Constraints
In order to make sure that the data we store in a column is unique, like the
values in the primary key column, we need to apply a unique constraint. Keep in
mind that we can have an unlimited number of unique columns; however, we can
have only one primary key column. With that in mind, here’s the syntax used
to set up a unique column:
CREATE TABLE my_table
(
Column-1 data_type UNIQUE
)
Deleting Tables and Databases
Here’s the syntax used to delete a table:
DROP TABLE my_table
Take note that you can’t go back once you perform this action. So, make sure
you really need to remove a table before you use this query. Now, here’s the
syntax used to delete the entire database:
DROP DATABASE my_database
Just as before, you can’t reverse this process. Once you delete the database,
it’s gone. On a side note, you might want to just remove a column from the
table. In that case, we are going to use the ALTER TABLE statement with the
DROP COLUMN clause like so:
ALTER TABLE data_table DROP COLUMN column_name
How to Create Views
Here’s the general syntax for creating a view:
CREATE VIEW virtual_table AS
SELECT column-1, column-2, …, column-n
FROM data_table
WHERE column-n operator value[5]
Here’s an example using the customer_personal_info table:
cust_id first_nm second_nm lastname sex Date_of_birth
03051 Mary Ellen Brown Female 1980-10-19
03231 Juan John Maslow Female 1978-11-18
03146 John Ken Pascal Male 1983-07-12
03347 Alan Lucas Basal Male 1975-10-09
Table 2.1 customer_personal_info
Now, let’s say we want to create a view of only the female customers. Here’s
what we need to do:
CREATE VIEW [Female Depositors] AS
SELECT cust_id, first_nm, second_nm, lastname, sex, date_of_birth
FROM customer_personal_info
WHERE sex = ‘Female’[6]
Next, we can declare the following statements in order to see the data from
the view we created:
SELECT * FROM [Female Depositors]
This is the result we would get:
cust_id first_nm second_nm lastname sex Date_of_birth
03051 Mary Ellen Brown Female 1980-10-19
03231 Juan Maslow Female 1978-11-18
Table 2.2 view of customer_personal_info
Take note that a view stores only its defining query, not a copy of the data;
the result is generated each time the view is queried. However, the view is
presented exactly like a table.
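Here’s a small sketch of that behavior with Python’s built-in sqlite3 module; the view name and sample rows are simplified from the tables above. Note how a row added to the base table shows up in the view without recreating it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE customer_personal_info (
    cust_id INTEGER PRIMARY KEY,
    first_nm VARCHAR(100),
    lastname VARCHAR(100),
    sex VARCHAR(10))""")
conn.executemany("INSERT INTO customer_personal_info VALUES (?, ?, ?, ?)",
    [(3051, 'Mary', 'Brown', 'Female'),
     (3146, 'John', 'Pascal', 'Male'),
     (3231, 'Juan', 'Maslow', 'Female')])

# The view stores only its defining query, not a copy of the rows.
conn.execute("""CREATE VIEW female_depositors AS
    SELECT cust_id, first_nm, lastname
    FROM customer_personal_info
    WHERE sex = 'Female'""")

rows = conn.execute("SELECT * FROM female_depositors ORDER BY cust_id").fetchall()

# New rows in the base table appear in the view automatically.
conn.execute("INSERT INTO customer_personal_info VALUES (3400, 'Ann', 'Reed', 'Female')")
count = conn.execute("SELECT COUNT(*) FROM female_depositors").fetchone()[0]
```

Querying the view reads like querying an ordinary table, which is the point of the feature.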
Chapter 4:
SQL Joins and Union
We can add logical operators such as AND to our select statement in order to
process multiple tables in the same statement. Take note that you can also use
the join operators (LEFT, RIGHT, and INNER) for the same process, and you
will often get a more efficient and better optimized result.
INNER JOIN
The INNER JOIN statement will enable you to use a single statement to
process multiple tables at the same time. In order for this to work, our tables
have to include linked columns, namely a primary column connected to the
foreign columns. The INNER JOIN operation will extract the data from two
or more tables when a relation between the tables is found. Here’s how the
syntax looks for this process:
SELECT column-1, column-2… column-n
FROM data_table-1
INNER JOIN data_table-2
ON data_table-1.keycolumn = data_table-2.foreign_keycolumn
So, if we need to extract certain matching data from the two tables, we can
use a query like the one below. First, here are our two sample tables:
acct_no cust_id acct_bal acct_type acct_opening_date
0411003 03231 2540.33 100 2000-11-04
0412007 03146 10350.02 200 2000-09-13
0412010 03347 7500.00 200 2002-12-05
Table 4.1 customer_accounts
cust_id first_nm second_nm lastname sex Date_of_birth addr
03051 Mary Ellen Brown Female 1980-10-19 Coventry
03231 Juan Maslow Female 1978-11-18 York
03146 John Ken Pascal Male 1983-07-12 Liverpool
03347 Alan Basal Male 1975-10-09 Easton
Table 4.2 customer_personal_info
SELECT a.acct_no, b.first_nm AS first_name, b.lastname AS
surname, b.sex, a.acct_bal
FROM customer_accounts AS a
INNER JOIN customer_personal_info AS b
ON b.cust_id = a.cust_id[7]
Here’s the result we’re going to achieve with the above SQL statement in
tabular format:
acct_no first_name surname sex acct_bal
0411003 Juan Maslow Female 2540.33
0412007 John Pascal Male 10350.02
0412010 Alan Basal Male 7500.00
Table 4.3
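The same join can be tried out with Python’s built-in sqlite3 module. This sketch trims the tables above down to a few columns and shows that Mary, who has no account row, is left out of the INNER JOIN result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_personal_info (
    cust_id INTEGER PRIMARY KEY,
    first_nm TEXT, lastname TEXT, sex TEXT);
CREATE TABLE customer_accounts (
    acct_no INTEGER PRIMARY KEY,
    cust_id INTEGER, acct_bal REAL);
INSERT INTO customer_personal_info VALUES
    (3051, 'Mary', 'Brown', 'Female'),
    (3231, 'Juan', 'Maslow', 'Female'),
    (3146, 'John', 'Pascal', 'Male');
INSERT INTO customer_accounts VALUES
    (411003, 3231, 2540.33),
    (412007, 3146, 10350.02);
""")

# Only customers with a matching account row come back.
rows = conn.execute("""
    SELECT a.acct_no, b.first_nm, b.lastname, a.acct_bal
    FROM customer_accounts AS a
    INNER JOIN customer_personal_info AS b
        ON b.cust_id = a.cust_id
    ORDER BY a.acct_no
""").fetchall()
```

This matches the shape of Table 4.3: two rows, and no entry for the customer without an account.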
RIGHT JOIN
When we use this operator with the SQL SELECT query, we can include all
the data from the right table without the requirement of any matching data in
the left table. Here’s the syntax for this process:
SELECT column-1, column-2… column-n
FROM left-data_table
RIGHT JOIN right-data_table
ON left-data_table.keyColumn = right-data_table.foreign_keycolumn
For instance, say we want to select customer information such as acct_no,
surname, and more across both tables. With customer_accounts on the left and
customer_personal_info on the right, every customer will appear in the result
even if there’s no active customer account. Here’s how our RIGHT JOIN
statement will look:
SELECT a.acct_no, b.first_nm AS first_name, b.lastname AS
surname, b.sex, a.acct_bal
FROM customer_accounts AS a
RIGHT JOIN customer_personal_info AS b
ON b.cust_id = a.cust_id[8]
The result of this process can be seen in the following table:
acct_no first_name surname sex acct_bal
0411003 Juan Maslow Female 2540.33
0412007 John Pascal Male 10350.02
0412010 Alan Basal Male 7500.00
Mary Brown Female
Table 4.4
LEFT JOIN
This statement is basically the opposite of the previous statement.
LEFT JOIN will return the data from the left (first) table, whether we have
any matching data or not when comparing it with the right table. Let’s take a
look at the syntax:
SELECT column-1, column-2… column-n
FROM left-data_table
LEFT JOIN right-data_table
ON left-data_table.keyColumn = right-data_table.foreign_keycolumn
Now let’s say we want to see all the details about the customer’s accounts
across our two tables. Here’s how:
SELECT a.acct_no, b.first_nm AS first_name, b.lastname AS
surname, b.sex, a.acct_bal
FROM customer_accounts AS a
LEFT JOIN customer_personal_info AS b
ON b.cust_id = a.cust_id[9]
And this is the output of our LEFT JOIN statement as tabular data.
acct_no first_name surname sex acct_bal
0411003 Juan Maslow Female 2540.33
0412007 John Pascal Male 10350.02
0412010 Alan Basal Male 7500.00
Table 4.5
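Older sqlite versions lack RIGHT JOIN, so this sketch with Python’s built-in sqlite3 module flips the tables and uses LEFT JOIN, which the text notes is the mirror image: with the personal-info table on the left, a customer without an account still appears, padded with NULLs (Python’s None):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_personal_info (
    cust_id INTEGER PRIMARY KEY, first_nm TEXT);
CREATE TABLE customer_accounts (
    acct_no INTEGER PRIMARY KEY, cust_id INTEGER, acct_bal REAL);
INSERT INTO customer_personal_info VALUES (3051, 'Mary'), (3231, 'Juan');
INSERT INTO customer_accounts VALUES (411003, 3231, 2540.33);
""")

# Every customer appears; those without an account get NULLs (None).
rows = conn.execute("""
    SELECT b.first_nm, a.acct_no, a.acct_bal
    FROM customer_personal_info AS b
    LEFT JOIN customer_accounts AS a ON a.cust_id = b.cust_id
    ORDER BY b.cust_id
""").fetchall()
```

Mary comes back with no account number and no balance, just like the empty cells in Table 4.4.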
The UNION Statement
Next, we might want to combine our results in a single place. That’s what the
UNION keyword is for, and here’s how we can use it:
SELECT column-1, column-2… column-n
FROM data_table-1
UNION
SELECT column-1, column-2… column-n
FROM data_table-2[10]
Let’s say we have a bank that has a number of customers from the US and
UK, and they would like to know which ones have an active account in both
countries. Here are the two tables with customers from each country,
followed by our SQL statement:
acct_no first_name surname sex acct_bal
0411003 Juan Maslow Female 2540.33
0412007 John Pascal Male 10350.02
0412010 Alan Basal Male 7500.00
Table 4.6 London_Customers
acct_no first_name surname sex acct_bal
0413112 Deborah Johnson Female 4500.33
0414304 John Pascal Male 13360.53
0414019 Rick Bright Male 5500.70
0413014 Authur Warren Male 220118.02
Table 4.7 Washington_Customers
SELECT first_name, surname, sex
FROM London_Customers
UNION
SELECT first_name, surname, sex
FROM Washington_Customers
You’ll notice that John Pascal is mentioned only once in the resulting table.
That’s how we know our UNION statement worked as intended, since this customer
is part of both banks and UNION removes duplicate rows.
The UNION ALL Statement
This SQL statement is nearly identical to the previous UNION query, with the
only real difference being that it returns every data record from all the
involved tables, duplicates included. Here’s how we use it:
SELECT column-1, column-2… column-n
FROM data_table-1
UNION ALL
SELECT column-1, column-2… column-n
FROM data_table-2
Now we can use this command on our previous customer tables by writing
the following SQL statement:
SELECT first_name, surname, sex, acct_bal
FROM London_Customers
UNION ALL
SELECT first_name, surname, sex, acct_bal
FROM Washington_Customers[12]
Here’s the resulting table:
first_name surname sex acct_bal
Juan Maslow Female 2540.33
John Pascal Male 10350.02
Alan Basal Male 7500.00
Deborah Johnson Female 4500.33
John Pascal Male 13360.53
Rick Bright Male 5500.70
Authur Warren Male 220118.02
Table 4.9
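The difference between UNION and UNION ALL can be verified with a quick sketch in Python’s built-in sqlite3 module, using two-column stand-ins for the customer tables above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE London_Customers (first_name TEXT, surname TEXT);
CREATE TABLE Washington_Customers (first_name TEXT, surname TEXT);
INSERT INTO London_Customers VALUES ('Juan', 'Maslow'), ('John', 'Pascal');
INSERT INTO Washington_Customers VALUES ('John', 'Pascal'), ('Rick', 'Bright');
""")

# UNION removes duplicate rows across the two results...
union_rows = conn.execute("""
    SELECT first_name, surname FROM London_Customers
    UNION
    SELECT first_name, surname FROM Washington_Customers
""").fetchall()

# ...while UNION ALL keeps every row, duplicates included.
union_all_rows = conn.execute("""
    SELECT first_name, surname FROM London_Customers
    UNION ALL
    SELECT first_name, surname FROM Washington_Customers
""").fetchall()
```

John Pascal appears once in the UNION result and twice in the UNION ALL result.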
Chapter 5:
Data Integrity
Databases are more than just simple data storage units. One of the most
important data management aspects is maintaining the data’s integrity
because if it’s compromised, we can no longer trust in the quality of the
information. When the information becomes unreliable, the entire database is
compromised.
In order to ensure the integrity of the data, SQL provides us with a number of
rules and guides that we can use to place a limit on the stored table values.
The rules are known as integrity constraints and they can be applied to either
columns or tables. In this chapter we are going to focus on this aspect and
discuss all the constraint types.
Integrity Constraints
For a better understanding, database integrity constraints can be divided into
three categories:
1) Assertions: An assertion is an integrity constraint defined in its
own, separate definition. In other words, we don’t have to specify
the assertion in the definition of the data table. In addition, one
assertion can be assigned to more than a single table.
2) Table Related Constraints: This is a group of integrity constraints
that must be defined within the table. Therefore, we can define
them in the definition of the table’s column, or as a separate
element of the table.
3) Domain Constraints: These constraints have to be defined in their
own definition. This makes them similar to assertions. However, a
domain constraint will work on the specific column that’s been
defined within the domain.
Take note that, of these categories of integrity constraints, the table related
constraints come with a number of options. Therefore, they are probably the
most popular nowadays. This category can be further divided into two
subcategories, namely the table constraints and the column constraints.
Their purpose is self-explanatory, and both of them can function with other
integrity constraints as well. On the other hand, the assertions and the domain
constraints aren’t so versatile because they can only be used with a single
type of constraint.
The Not Null Constraint
Earlier, we discussed that null refers to an undefined value, not the idea that
nothing exists. So, take note that this concept is not the same as having
default values, empty strings, blank spaces, or zero values. The easiest way to
understand the concept of ‘null’ in SQL is to imagine it as a label that tells
us something about the column: that it has an absent value. In other words, an
empty column holds a null value, which signals that we’re dealing with an
undefined or unknown value.
It’s important to understand this concept, because data columns come with a
“nullability” attribute, which alerts us when we’re dealing with undefined
values. Remember that in SQL, columns accept null values by default, but we
can remove that nullability by applying the NOT NULL constraint to the
column. We’ve used this constraint in other examples, and you’ve seen already
that it translates to no longer allowing a column to contain null values.
Remember that in SQL, this constraint must be applied to a column, and
therefore we can’t use it in the other constraint categories, such as assertions
or table-based constraints. With that being said, the syntax is fairly simple
because using the NOT NULL constraint is an easy process:
(name of column) [ (domain) | (data type) ] NOT NULL
For instance, let’s say that you want to create a new table and you’re going to
call it FICTION_NOVEL_AUTHORS. We’re going to need to add three
columns to this table, namely an AUTHOR_ID, AUTHOR_DOB, and
AUTHOR_NAME. Furthermore, we need to guarantee that the entries we’re
inserting will contain valid values for the ID and the name columns. This
means that we must add the NOT NULL constraint. Here's how we’re going
to accomplish that:
CREATE TABLE FICTION_NOVEL_AUTHORS
(
AUTHOR_ID INT NOT NULL ,
AUTHOR_NAME CHARACTER(50) NOT NULL ,
AUTHOR_DOB CHARACTER(50)
);[13]
Take note that we didn’t add the constraint for the AUTHOR_DOB column.
So, if there’s no value in the new entry, a null value will be inserted by
default.
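A quick sketch with Python’s built-in sqlite3 module shows the constraint in action; the sample row is invented, and the insert that omits a required name is rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE FICTION_NOVEL_AUTHORS (
    AUTHOR_ID INT NOT NULL,
    AUTHOR_NAME CHARACTER(50) NOT NULL,
    AUTHOR_DOB CHARACTER(50))""")

# A missing AUTHOR_DOB is fine: that column is nullable.
conn.execute("INSERT INTO FICTION_NOVEL_AUTHORS VALUES (1, 'Jane Doe', NULL)")

# A missing AUTHOR_NAME violates NOT NULL and is rejected.
try:
    conn.execute("INSERT INTO FICTION_NOVEL_AUTHORS VALUES (2, NULL, '1950-01-02')")
    null_name_allowed = True
except sqlite3.IntegrityError:
    null_name_allowed = False

count = conn.execute("SELECT COUNT(*) FROM FICTION_NOVEL_AUTHORS").fetchone()[0]
```

Only the first row survives: the nullable column accepted NULL, the constrained one did not.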
The Unique Constraint
Take note that SQL offers two categories of unique constraints, UNIQUE and
PRIMARY KEY, and both can be applied either as column constraints or as
table constraints.
For now, we’re going to discuss mostly the first type of constraint and leave
the second for later.
UNIQUE is used to deny duplicate values from being inserted in the column.
If a value is already in the column, you won’t be able to add it again. Now,
let’s say we need to apply this constraint to the AUTHOR_DOB data
column. We’ll soon find out that making such a column entirely unique isn’t
a great idea. So, what do we do? One method is to apply the UNIQUE
constraint to both the Name and DOB columns because the table won’t allow
us to repeat the column pair, however, it will let us add duplicate values to
the individual columns. The only restriction we have now is the inability to
introduce the Name/DOB pair a second time in the table.
Remember that we can use this constraint on the entire table or just a column.
Here’s how we add it to the column:
(name of column) [ (domain) | (data type) ] UNIQUE
To add it to the table, we must declare the constraint as a component of the
table:
{ CONSTRAINT (name of constraint) }
UNIQUE < (name of column) { [, (name of column) ] … } >
As you can see, this version of the constraint is slightly more difficult to
apply. All UNIQUE constraints have to be defined. With that being said,
here’s an example on how we can add it at the column level:
CREATE TABLE BOOK_LIBRARY
(
AUTHOR_NAME CHARACTER (50),
BOOK_TITLE CHARACTER (70) UNIQUE,
PUBLISHED_DATE INT
) ;[14]
Next, we can instead declare the constraint at the table level and apply it
to multiple columns at once, which has a different effect. Let’s examine
this process:
CREATE TABLE BOOK_LIBRARY
(
AUTHOR_NAME CHARACTER (50) ,
BOOK_TITLE CHARACTER (70) ,
PUBLISHED_DATE INT,
CONSTRAINT UN_AUTHOR_BOOK UNIQUE (
AUTHOR_NAME, BOOK_TITLE )
) ;[15]
Now, if we need to insert new entries, the AUTHOR_NAME column
together with the BOOK_TITLE column will require unique values.
Remember that the purpose of the UNIQUE constraint is to guarantee that
we can’t add duplicates to our data. Furthermore, the constraint can’t be
applied to null values. Therefore, our affected columns can take multiple null
values even if we apply the UNIQUE ruleset to them.
But if we don’t want null values at all, we can use the NOT NULL constraint
once again, like so:
CREATE TABLE BOOK_LIBRARY
(
AUTHOR_NAME CHARACTER (50),
BOOK_TITLE CHARACTER (70) UNIQUE NOT NULL,
PUBLISHED_DATE INT
) ;[16]
Don’t forget that SQL allows us to add the NOT NULL constraint on
columns to which the table definition refers:
CREATE TABLE BOOK_LIBRARY
(
AUTHOR_NAME CHARACTER (50) ,
BOOK_TITLE CHARACTER (70) NOT NULL,
PUBLISHED_DATE INT,
CONSTRAINT UN_AUTHOR_BOOK UNIQUE (BOOK_TITLE)
) ;[17]
You’ll notice that in either example we have a constraint applied to the
BOOK_TITLE column, which means that it can’t contain duplicate values
or null values.
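The pair-wise behavior of a table-level UNIQUE constraint, and its indifference to NULLs, can be checked with Python’s built-in sqlite3 module (sample rows invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE BOOK_LIBRARY (
    AUTHOR_NAME CHARACTER(50),
    BOOK_TITLE CHARACTER(70),
    PUBLISHED_DATE INT,
    CONSTRAINT UN_AUTHOR_BOOK UNIQUE (AUTHOR_NAME, BOOK_TITLE))""")

conn.execute("INSERT INTO BOOK_LIBRARY VALUES ('Herbert', 'Dune', 1965)")
# Repeating one half of the pair is fine...
conn.execute("INSERT INTO BOOK_LIBRARY VALUES ('Herbert', 'Dune Messiah', 1969)")

# ...but repeating the whole (author, title) pair is rejected.
try:
    conn.execute("INSERT INTO BOOK_LIBRARY VALUES ('Herbert', 'Dune', 1984)")
    duplicate_pair_allowed = True
except sqlite3.IntegrityError:
    duplicate_pair_allowed = False

# UNIQUE does not apply to NULLs, so several NULL pairs can coexist.
conn.execute("INSERT INTO BOOK_LIBRARY VALUES (NULL, NULL, 2000)")
conn.execute("INSERT INTO BOOK_LIBRARY VALUES (NULL, NULL, 2001)")
count = conn.execute("SELECT COUNT(*) FROM BOOK_LIBRARY").fetchone()[0]
```

The duplicate pair fails while the two all-NULL rows both go in, matching the rules described above.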
The PRIMARY KEY Constraint
This constraint is very similar to the UNIQUE variant in the sense that we
can also use it to ban duplicate values. Furthermore, we can use it on multiple
columns, on a single column, or as a table constraint as well. The main
difference between the PRIMARY KEY and UNIQUE is that if you use the
first one on the column, that column can’t take in a null value anymore.
Therefore, we don’t have to use the NOT NULL constraint as well if we
apply the PRIMARY KEY. In addition, a table can take only a single
PRIMARY KEY constraint, although that key may span more than one column.
Remember that primary keys are unique identifiers and they’re important in
every table. That’s why we need to have the aforementioned limitations.
Furthermore, in the first chapter we discussed that we can’t have duplicate
rows in our tables. If we have such duplicates, then whenever we change the
value in one row, the same value in the duplicate row also changes. This
would make the rows redundant.
What we need to do is select the primary key for each table from a pool of
candidate keys. These candidate keys can be seen as groups of columns that can identify
the rows in unique ways. We can make sure the potential key is unique by
using either the PRIMARY KEY constraint or the UNIQUE one instead.
Just make sure you add a primary key to every table, even if you lack a
defined unique constraint. This way, you can guarantee you’ll have unique
data rows.
In order to define the primary key, first we need to specify the column (or
columns) we intend to use. Here’s how to apply the PRIMARY KEY:
(name of column) [ (domain) | (data type) ] PRIMARY KEY
We can also apply it to a table as a component by using the following syntax:
{ CONSTRAINT (name of constraint) }
PRIMARY KEY < (name of column) {, (name of column) ] … } >
We can also define the primary key as a column constraint; however, this
option can only be applied to one column. Here’s how it works:
CREATE TABLE FICTION_NOVEL_AUTHORS
(
AUTHOR_ID INT,
AUTHOR_NAME CHARACTER (50) PRIMARY KEY,
PUBLISHER_ID INT ) ; [18]
This is an example on a single column. In order to apply the key to more than
one column, we have to declare it at the table level:
CREATE TABLE FICTION_NOVEL_AUTHORS
(
AUTHOR_ID INT,
AUTHOR_NAME CHARACTER (50),
PUBLISHER_ID INT,
CONSTRAINT PK_AUTHOR_ID PRIMARY KEY ( AUTHOR_ID
, AUTHOR_NAME )
);
Using this method, we’re introducing a primary key to the AUTHOR_ID
and AUTHOR_NAME columns. This way we need to have unique paired
values in both columns. However, this doesn’t necessarily mean that we can’t
have duplicate values anymore inside the individual columns. What we’re
dealing with here is what SQL pros refer to as a “superkey”: a key that
includes more columns than are strictly needed to identify a row uniquely.
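A composite primary key like the one above can be exercised with Python’s built-in sqlite3 module: duplicate values are still fine inside each column on its own, but a repeated pair is rejected (sample rows invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE FICTION_NOVEL_AUTHORS (
    AUTHOR_ID INT,
    AUTHOR_NAME CHARACTER(50),
    CONSTRAINT PK_AUTHOR PRIMARY KEY (AUTHOR_ID, AUTHOR_NAME))""")

conn.execute("INSERT INTO FICTION_NOVEL_AUTHORS VALUES (1, 'Herbert')")
# Same AUTHOR_ID with a different name: the pair is still unique.
conn.execute("INSERT INTO FICTION_NOVEL_AUTHORS VALUES (1, 'Asimov')")

# Repeating the exact (id, name) pair violates the composite key.
try:
    conn.execute("INSERT INTO FICTION_NOVEL_AUTHORS VALUES (1, 'Herbert')")
    duplicate_pair_allowed = True
except sqlite3.IntegrityError:
    duplicate_pair_allowed = False

count = conn.execute("SELECT COUNT(*) FROM FICTION_NOVEL_AUTHORS").fetchone()[0]
```

This is exactly the superkey effect: the pair is unique, the individual columns are not.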
Take note that under normal conditions we tend to declare the primary key
and the unique constraints on the table itself. However, to do that we need to
define the constraints first. Here’s how:
CREATE TABLE FICTION_NOVEL_AUTHORS
(
AUTHOR_ID INT PRIMARY KEY,
AUTHOR_NAME CHARACTER (50),
PUBLISHER_ID INT,
CONSTRAINT UN_AUTHOR_NAME UNIQUE
(AUTHOR_NAME)
);
Here’s another example that will also work:
CREATE TABLE FICTION_NOVEL_AUTHORS
(
AUTHOR_ID INT,
AUTHOR_NAME CHARACTER (50) UNIQUE,
PUBLISHER_ID INT,
CONSTRAINT PK_PUBLISHER_ID PRIMARY KEY
(PUBLISHER_ID)
) ;[19]
Before creating a table, you need to answer a few questions:
What name describes the purpose of the table with the most
accuracy?
Which data types do I need to store?
How should I name the table columns to make them easy to
understand for everyone?
Which column do I need to designate as my primary key?
How wide should each column be?
Which columns may contain null values, and which must always
contain data?
Once you answer those questions you can create your table using the
previously mentioned “xyzcompany” database. We’ll call it EMPLOYEES
because it’s readable and describes the purpose of the table.
CREATE TABLE EMPLOYEES(
ID INT(6) AUTO_INCREMENT NOT NULL,
FIRST_NAME VARCHAR(35) NOT NULL,
LAST_NAME VARCHAR(35) NOT NULL,
POSITION VARCHAR(35),
SALARY DECIMAL(9,2),
ADDRESS VARCHAR(50),
PRIMARY KEY (ID)
);
As you can see, we now have a six-column table with the ID field being the
primary key. The first column will contain only the integer data type and it
won’t take in any null values. The second column is called FIRST_NAME,
and as the name suggests, it’s going to be a varchar data type column. We’ll
allow it a limit of 35 characters. The third and fourth columns are also
varchar columns called LAST_NAME and POSITION, and are also limited
to 35 characters. The SALARY column is going to hold decimal data types
with a scale of 2 and precision value of 9. Last, but not least, we have the
ADDRESS column which is, again, a varchar column, but with a limit of 50
characters. The primary key is the ID column.
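Since the statement above uses MySQL-flavored syntax (INT(6), AUTO_INCREMENT), here is a minimal runnable sketch of the same table using Python’s built-in sqlite3 module. Note the assumptions: SQLite’s equivalent of an auto-incrementing key is INTEGER PRIMARY KEY AUTOINCREMENT, and NUMERIC stands in for DECIMAL, which SQLite doesn’t support natively.

```python
import sqlite3

# A sketch of the EMPLOYEES table in SQLite, which ships with Python.
# SQLite uses INTEGER PRIMARY KEY AUTOINCREMENT instead of MySQL's
# INT(6) AUTO_INCREMENT, and NUMERIC instead of a true DECIMAL type.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEES (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        FIRST_NAME VARCHAR(35) NOT NULL,
        LAST_NAME VARCHAR(35) NOT NULL,
        POSITION VARCHAR(35),
        SALARY NUMERIC(9,2),
        ADDRESS VARCHAR(50)
    )
""")

# Confirm the table exists and list its six columns.
columns = [row[1] for row in conn.execute("PRAGMA table_info(EMPLOYEES)")]
print(columns)
```

Running this prints the six column names, confirming the table was created.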
Creating a New Table Based on Existing Tables
If you use the CREATE TABLE statement with a SELECT clause you can
use existing tables to create new ones. Here’s how:
CREATE TABLE new_table AS
(
SELECT [column1, column2, ... columnN]
FROM existing_table_name
[WHERE]
); [34]
By running this code you’ll generate a new table with the exact same column
definitions as those in the original version of the table. Take note that you can
choose any number of columns, or all of them, and copy them to the new
table, adding the values from the original table to its copy. Here’s an example
where we create a new table called STOCKHOLDERS using the table
above as a reference point:
CREATE TABLE STOCKHOLDERS AS
SELECT ID, FIRST_NAME, LAST_NAME, POSITION, SALARY,
ADDRESS
FROM EMPLOYEES;
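If you want to try this yourself, here’s a sketch using Python’s sqlite3 module. SQLite supports CREATE TABLE ... AS SELECT directly; the single sample row is borrowed from the insert examples later in this chapter, and the simplified column types are an assumption for the demo.

```python
import sqlite3

# SQLite supports CREATE TABLE ... AS SELECT, so we can sketch the
# STOCKHOLDERS copy. The new table's columns are inferred from the query.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEES (
        ID INTEGER PRIMARY KEY, FIRST_NAME TEXT, LAST_NAME TEXT,
        POSITION TEXT, SALARY REAL, ADDRESS TEXT
    )
""")
conn.execute("""
    INSERT INTO EMPLOYEES
    VALUES (1, 'Robert', 'Page', 'Clerk', 5000.00,
            '282 Patterson Avenue, Illinois')
""")

# Create the new table from the existing one; the rows are copied as well.
conn.execute("""
    CREATE TABLE STOCKHOLDERS AS
    SELECT ID, FIRST_NAME, LAST_NAME, POSITION, SALARY, ADDRESS
    FROM EMPLOYEES
""")
rows = conn.execute("SELECT FIRST_NAME, SALARY FROM STOCKHOLDERS").fetchall()
print(rows)
```

The copied row comes back from STOCKHOLDERS, showing that both the definitions and the data were carried over.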
How to Insert Data into a Table
In order to manipulate a database, we use SQL’s aptly named Data
Manipulation Language (DML). DML clauses are used to insert new
information in a table or update the already existing data.
Inserting New Data
You can add new data to a table in two ways. You either use the automatic
method by using a program, or you manually enter all the information. The
first method implies using an external source from which data is extracted
and added to your table, or it can refer to the transfer of data from one table to
another. The second method involves you and your keyboard.
Take note that string data may be case-sensitive, depending on your database
system and its collation settings, so always double-check everything you
insert. For example, if you store the name of an employee as John, type it
exactly the same way every time. To a case-sensitive database, JOHN, John,
and john are three different values.
To add data to your table you need to use the INSERT statement. You can
use the statement in two ways. The first method is simple. All you need to do
is add the information to each field and assign them to the columns of the
table. This option is best used when you want to insert data into all of your
columns. Here’s how the syntax looks:
INSERT INTO table_name
VALUES (‘value1’, ‘value2’, [NULL]);
To use the second method, you need to insert the data in the order in which
the columns appear, and you also have to add the names of those columns.
Normally, this system is used when you need to insert data into certain
columns. Here’s how the syntax looks:
INSERT INTO table_name (column1, column2, column3)
VALUES (‘value1’, ‘value2’, ‘value3’);
As you can see, in both methods we use commas to set the values and columns
apart from each other. Furthermore, we use quotation marks to delimit the
strings and datetime information.
Now, let’s say we have the following information about an employee:
First Name: Robert
Last Name: Page
Position: Clerk
Salary: 5,000.00
Address: 282 Patterson Avenue, Illinois
Here’s the statement we need to add all of this data to our EMPLOYEES
table:
INSERT INTO EMPLOYEES (FIRST_NAME, LAST_NAME,
POSITION, SALARY, ADDRESS)
VALUES (‘Robert’, ‘Page’, ‘Clerk’, 5000.00, ‘282 Patterson Avenue,
Illinois’);
Next, we need to display the data we just stored in our table:
SELECT * FROM EMPLOYEES;
Take note of the asterisk (*), which is used to inform the database
management system to select all the fields in the table. Here’s the output:[35]
ID FIRST_NAME LAST_NAME POSITION SALARY ADDRESS
1 Robert Page Clerk 5000.00 282 Patterson Avenue,
Illinois
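Here’s the same insert-and-select flow as a runnable sketch with Python’s sqlite3 module. SQLite fills in the INTEGER PRIMARY KEY column by itself, much like AUTO_INCREMENT does in the text’s example.

```python
import sqlite3

# A runnable sketch of the INSERT INTO ... VALUES pattern shown above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEES (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        FIRST_NAME TEXT NOT NULL,
        LAST_NAME TEXT NOT NULL,
        POSITION TEXT,
        SALARY REAL,
        ADDRESS TEXT
    )
""")
conn.execute("""
    INSERT INTO EMPLOYEES (FIRST_NAME, LAST_NAME, POSITION, SALARY, ADDRESS)
    VALUES ('Robert', 'Page', 'Clerk', 5000.00,
            '282 Patterson Avenue, Illinois')
""")

# SELECT * retrieves every field, including the auto-generated ID.
row = conn.execute("SELECT * FROM EMPLOYEES").fetchone()
print(row)
```

The ID comes back as 1 even though we never inserted it, because the primary key column generates its own values.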
Now let’s work with the next employees and add their information to the
table:
First Name Last Name Position Salary Address
John Malley Supervisor 7,000.00 5 Lake View, New York
Kristen Johnston Clerk 4,500.00 25 Jump Road, Florida
Jack Burns Agent 5,000.00 5 Green Meadows, California
In order to add this data to our database we’ll use the INSERT INTO
statement. However, it needs to be used for each employee in every line, like
this:
INSERT INTO EMPLOYEES(FIRST_NAME, LAST_NAME,
POSITION, SALARY, ADDRESS)
VALUES(‘John’, ‘Malley’, ‘Supervisor’, 7000.00, ‘5 Lake View, New
York’);
INSERT INTO EMPLOYEES(FIRST_NAME, LAST_NAME,
POSITION, SALARY, ADDRESS)
VALUES(‘Kristen’, ‘Johnston’, ‘Clerk’, 4500.00, ‘25 Jump Road,
Florida’);
INSERT INTO EMPLOYEES(FIRST_NAME, LAST_NAME,
POSITION, SALARY, ADDRESS)
VALUES(‘Jack’, ‘Burns’, ‘Agent’, 5000.00, ‘5 Green Meadows,
California’);
Now, let’s display the updated table once again:
SELECT * FROM EMPLOYEES;
And here it is:
ID FIRST_NAME LAST_NAME POSITION SALARY ADDRESS
1 Robert Page Clerk 5000.00 282 Patterson Avenue, Illinois
2 John Malley Supervisor 7000.00 5 Lake View, New York
3 Kristen Johnston Clerk 4500.00 25 Jump Road, Florida
4 Jack Burns Agent 5000.00 5 Green Meadows, California
There are a couple of general rules and guidelines to keep in mind when
manipulating your data with the ALTER TABLE command. The first is that
when you add a new column to a table that already contains data, the column
can’t carry the NOT NULL property. The existing rows have no values for
the new column, so a NOT NULL column would violate its own constraint
the moment it was created.
When you work with date and time columns and fields, you need to take note
of the following datetime data types:
1. DATE: The first type is the most basic one and you’ll work
with it frequently. The DATE type contains three data
objects, namely the year, month, and day. Furthermore, there
are fixed restrictions on the number of digits
for each object. The month and day
data items can only have two-digit values, while the year has
four digits. There’s also a restriction on the year range:
the system will allow you to add dates starting from
year one (written as 0001 due to the four-digit format) all
the way to year 9999. Finally, you need to be aware that the year,
month, and day objects are written with dashes in between, so
the value is always 10 characters long. It is
written in this format: 2020-05-31.
2. TIME WITHOUT TIME ZONE: As the name of this
data type suggests, it is used to store information that
contains the hour, minute, and second time objects. In some
ways it’s similar to the DATE data type because the temporal
items are also restricted to a set number of digits. Hours and
minutes can only be written with two digits, while
seconds take two digits, optionally followed by fractional
digits. Here’s
how the format is stored: 12:55:11.676. Take note that we can
have decimals when storing seconds and that the time objects
are separated by colons. Furthermore, the data type can be
declared using the TIME keyword alone; however, this won’t
include the use of decimals. Therefore, if your time stamp
needs to be accurate to the millisecond, then you need to use
the TIME WITHOUT TIME ZONE (y) form, where
“y” is a fill-in for the number of fractional digits you want to
allow.
3. TIME WITH TIME ZONE: This SQL datetime is similar to
the previous one; however, it also enables us to store
information on the time zone, the default being Coordinated
Universal Time (UTC). The offset of this datetime object can be
anything from -12:59 to +13:00, and it is written in a format
that takes six characters. The time zone information is appended
after the time data, and the plus or minus sign is mandatory
because it declares the direction of the temporal offset.
Finally, we can use
decimals just like in the TIME WITHOUT TIME ZONE
example.
4. TIMESTAMP WITHOUT TIME ZONE: The timestamp
data type is used to store the date as well as the time. This data
type still comes with a number of constraints, but they are the
same as in the previous examples. Just look at the DATE
restrictions for the date object and at the TIME restrictions for
the time object. In addition to having the date and time in a
single value, we also get fractional seconds by default:
while the other types require us to specify
when we want to use decimals, here the fractional precision
has a default value (6 digits in the SQL standard) that we can adjust.
5. TIMESTAMP WITH TIME ZONE: As you can probably
guess, the only real difference between this data type and the
previous one is that we can store the time zone object next to
our temporal data. If you need every possible detail about time,
you should use this data type.
6. INTERVAL: Yes, there are five datetime data types; however,
the interval data type is closely related to the others, even though
technically it doesn’t belong in the same category. The purpose
of this data type is to represent the difference between two
distinct points in time, whether those points are expressed as
dates or as times of day. Therefore, it can be used in two scenarios: when
we calculate the period between two dates and when we calculate
the period between two different times within the same day.
These are the basic datetime data types you’ll be working with in SQL.
However, there’s much more to them, and in the following section we are
going to discuss more about the period data type, which is more complex
than the INTERVAL type.
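Python’s standard datetime module can illustrate the INTERVAL idea: subtracting two dates or timestamps yields a duration, which is exactly what the interval type represents. The sample dates below are borrowed from this chapter’s examples; the variable names are ours.

```python
from datetime import date, datetime

# SQL's INTERVAL type represents the difference between two points in
# time. Python models the same idea with timedelta: subtracting two
# dates (or two datetimes) yields a duration.
hired = date(2018, 2, 2)
moved = date(2019, 12, 3)
print((moved - hired).days)          # days between the two dates

# The same works for times within a single day.
start = datetime(2020, 5, 31, 12, 55, 11)
end = datetime(2020, 5, 31, 14, 0, 0)
print(end - start)                   # an hour-minute-second duration
```

Both scenarios the text mentions, a period between two dates and a period between two times of the same day, reduce to the same subtraction.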
Time Periods
As mentioned in the previous section, a period can be defined between two
dates, or two time positions. The problem is that SQL doesn’t have a
dedicated period data type, because the idea of the period came very late
and the SQL developers didn’t want to risk damaging the well-defined
structure of the language. Therefore, period definitions are expressed as
metadata. They are part of the data tables themselves: one column stands
for the start of the period and another column marks
when the period ends. To control this aspect of our tables, SQL extends
the CREATE TABLE and ALTER TABLE syntax.
These statements add the functionality that enables us to either
declare or delete a time period. Because the period is represented by normal
columns, the same rules apply to them as for any other columns. We
simply label which columns mark the start and end of the period, while the
database management system enforces a constraint. The
constraint is used to ensure that the end of the period won’t be smaller than
the defined start marker.
Take note that there are two temporal dimensions that we need in order to
work with this kind of data. They are referred to as the transaction time and
the valid time. The first represents the period when a specific piece of
information was inserted in our database. The second dimension describes the
period during which our data is valid, meaning it shows the present reality.
The two dimensions can have different values because our data can be stored
in a database before it becomes valid. Let’s say we’re working with some
data based on a contract between two parties. That contract can be agreed
upon months before any of the data in it becomes valid. We can store the
information, but it’s not yet reflecting the reality as the contract hasn’t come
into force yet.
Finally, we can also set up tables to represent each dimension, or we can
define them both in the same table. In this case, you need to be aware that the
transaction time dimension is stored in system-controlled tables, and
therefore the time frame is attached to the system time. On the other hand, the
valid time only depends on the time presented by the program that relies on
it.
Time Period Tables
The best way to understand the concept we discussed earlier is by jumping
straight in the middle of an example. So, imagine that a company wants to
store some information on its employees during a period of time by dividing
them into categories that match their departments. Here’s how this kind of
table would look:
CREATE TABLE employees (
EmployeesID INTEGER,
EmployeesBeginning DATE,
EmployeesEnding DATE,
EmployeesDepartment VARCHAR (25),
PERIOD FOR EmployeesPeriod (EmployeesBeginning,
EmployeesEnding));
Now that we have the table, we need to add some temporal data:
INSERT INTO employees
VALUES (42, DATE ‘2018-02-02’, DATE ‘3000-12-31’,
‘TechAssistance’ );
As you can probably conclude from the code itself, our end date (year 3,000)
is used here to say that the employees’ data will remain valid for a long time,
or at least as long as they’ll still be working at this company. Take note that
we can also specify the exact time limit if we want, but in this case, we just
went with a period that can’t be too short.
Now ask yourself what will happen when one of the employees is
moved to another department in 2019 and then, in 2020, is assigned back to
his previous department. We need to change our database to take this action
into account.
UPDATE employees
FOR PORTION OF EmployeesPeriod
FROM DATE ‘2019-12-03’
TO DATE ‘2020-06-03’
SET EmployeesDepartment = ‘Development’
WHERE EmployeesID = 42;
After the update, the original row has been split into three. The first holds
information on the employment period before the departmental reassignment
happened. The second contains the temporal data on the time of
employment spent in the new department. Finally, the last row begins a new
period that marks that employee’s return to his initial department.
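To see what FOR PORTION OF does to the stored rows, here is a plain-Python sketch (not SQL) of the row-splitting logic. The function name update_portion is our own invention for illustration; real database systems perform this split internally.

```python
from datetime import date

# A sketch of the row split performed by UPDATE ... FOR PORTION OF:
# the updated portion gets the new department, while the portions
# before and after keep the old value.
def update_portion(row, frm, to, new_dept):
    """row is (emp_id, begin, end, dept); returns the resulting rows."""
    emp_id, begin, end, dept = row
    result = []
    if begin < frm:                      # portion before the update
        result.append((emp_id, begin, frm, dept))
    result.append((emp_id, max(begin, frm), min(end, to), new_dept))
    if end > to:                         # portion after the update
        result.append((emp_id, to, end, dept))
    return result

row = (42, date(2018, 2, 2), date(3000, 12, 31), "TechAssistance")
rows = update_portion(row, date(2019, 12, 3), date(2020, 6, 3), "Development")
for r in rows:
    print(r)
```

One row goes in, three come out: the period before the reassignment, the reassignment itself, and the return to the original department.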
Next, we’re going to delete some of the temporal data as the removal process
is a bit different from deleting data in a regular table. We can’t just start
deleting the rows. Imagine if the employee in our example didn’t move to
another department, but instead he quits the company and at a later date he
returns. This is a situation (a bit unlikely in this particular example) that
requires us to delete only part of the temporal information, so here’s how our
delete statement would look:
DELETE FROM employees
FOR PORTION OF EmployeesPeriod
FROM DATE ‘2018-12-03’
TO DATE ‘2020-06-03’
WHERE EmployeesID = 42;
The period in the middle that we had in the previous example no longer
exists. Now we have only the first time frame, containing data on the
employee’s initial hiring, and the last one, which represents the return to his
original department.
Now, did you notice anything weird about our tables? There’s no primary
key. All we have is an employee ID, but it’s enough for us to use it as an
identifier in our example. However, the table can contain multiple rows of
data for every single employee and thus the ID is not enough to work in place
of a proper primary key. Using our example, we can’t make sure we always
have unique values, and that’s why we should insert the EmployeesBeginning
and EmployeesEnding into the key. Unfortunately, this won’t fix all of the
potential problems.
Look at our table where one of the employees moved between departments
for certain periods of time.
If the starting and ending points of a time frame are added to the primary key,
then we’ll have unique objects. The problem here, though, is that these
periods can overlap and therefore some employees might show up as working
for both departments at the same time for a certain amount of time. This is
what we would refer to as corrupted data. So, we need to get this fixed. The
easiest approach would be a constraint definition that tells the system that the
company’s employees can only work for a single department. Here’s an
example by using the ALTER TABLE statement:
ALTER TABLE employees
ADD PRIMARY KEY (EmployeesID, EmployeesPeriod WITHOUT
OVERLAPS) ;
We can also introduce the constraint when we set up the data table:
CREATE TABLE employees (
EmployeesID INTEGER NOT NULL,
EmployeesBeginning DATE NOT NULL,
EmployeesEnding DATE NOT NULL,
EmployeesDepartment VARCHAR (25),
PERIOD FOR EmployeesPeriod (EmployeesBeginning,
EmployeesEnding),
PRIMARY KEY (EmployeesID, EmployeesPeriod WITHOUT
OVERLAPS) );
With this final form of our table, we no longer run the risk of intersecting
rows and corrupted data. In addition, we introduced the NOT NULL
constraints just in case we have a database management system that doesn’t
deal with possible null values on its own. Not all systems handle them
automatically for us, so it’s better to plug any potential leaks from the start.
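To build intuition for what WITHOUT OVERLAPS enforces, here is a plain-Python sketch of the check. The helper overlaps() is our own illustration; it treats periods as half-open intervals (the end date is excluded), which is how SQL period tables behave.

```python
from datetime import date

# Intuition for PRIMARY KEY (id, period WITHOUT OVERLAPS): two rows
# for the same employee may not have intersecting [begin, end) periods.
def overlaps(a_begin, a_end, b_begin, b_end):
    # Half-open periods intersect when each starts before the other ends.
    return a_begin < b_end and b_begin < a_end

rows = [
    (42, date(2018, 2, 2), date(2019, 12, 3)),   # TechAssistance
    (42, date(2019, 12, 3), date(2020, 6, 3)),   # Development
]
new_row = (42, date(2020, 1, 1), date(2020, 3, 1))  # overlaps the second row

conflict = any(
    r[0] == new_row[0] and overlaps(r[1], r[2], new_row[1], new_row[2])
    for r in rows
)
print(conflict)   # True: a DBMS enforcing WITHOUT OVERLAPS rejects the insert
```

Note that the two existing rows do not conflict with each other: one ends exactly where the next begins, which half-open periods allow.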
System Versioned Tables
Another type of temporal table is the system-versioned table, which serves a
different purpose. Remember that the tables in the previous section work
based on particular periods of time, and only the data that’s valid during that
period is going to be processed and used.
The purpose of system versioned tables is to enable us to determine the
auditable data on objects that were either newly inserted, altered, or deleted
from the database. For example, imagine a bank that has to determine the
time when a deposit was made. This is information that has to be
immediately registered and then managed for a specific amount of time
depending on laws and regulations. Stock brokers operate the same way
because an accurate temporal record can have a serious impact on other data.
System versioned tables make sure that all this information is accurate to
milliseconds. With this example in mind, here are two conditions a program
needs from this type of data table:
1. The table rows have to exist in their original form even after
being altered or deleted. In other words, the original data has to
be preserved.
2. Each time frame has to be processed by the system and the
program needs to be aware of this aspect.
Rows that we alter or remove will remain part of the table; however,
they’ll appear as historical rows and we won’t be able to change them. In a
way, they’re backed-up information in its original form. Furthermore, we
can’t make future alterations to the time frames attached to this data. But this
doesn’t mean that the time periods can’t be changed. It only means that we as
users can’t alter them any further, but the system can. This aspect of system
versioned information is important because it acts as an additional security
measure by not allowing any user to make modifications or remove the
original data and the time frames during which it was valid. This way audits
can be performed on the data and government agencies can always have
access to it because tampering is nearly impossible.
Now that you know more about both types of tables, let’s compare them to
get a better understanding of the difference between a period table and a
system versioned table:
When working with period tables, we can define the period and
the name of the time frame ourselves. With system-versioned
tables, on the other hand, the period is defined under
SYSTEM_TIME.
Another big difference is related to the CREATE command.
To have a system versioned table, we need to introduce the
WITH SYSTEM VERSIONING statement. In addition, we
should use the timestamp datetime data type when we declare
the time frame starting and ending points. This, however, is
just a recommendation because we should look for the best
accuracy we can get out of the system. As previously
discussed, the timestamp data type is the most precise.
Let’s now see an actual system versioned table and how it’s created:
CREATE TABLE employees_sys (
EmployeesID INTEGER,
SystemStartingPoint TIMESTAMP (12) GENERATED ALWAYS
AS ROW START,
SystemEndPoint TIMESTAMP (12) GENERATED ALWAYS AS
ROW END,
EmployeesName VARCHAR (40),
PERIOD FOR SYSTEM_TIME (SystemStartingPoint,
SystemEndPoint))
WITH SYSTEM VERSIONING;
Notice that we identify the valid system row by checking whether the time
frame is within the system time. If it’s not, then we have a historical row
instead.
The two systems we’ve been working with are different, as you’ve seen. And
now that you’re familiar with the syntax, you may have spotted other
differing elements as well.
Furthermore, you need to work with the master database and set the
target database to an inactive, SINGLE_USER state, so that only one
connection is allowed before the restore process. Take a look at
the following syntax, where we use WITH ROLLBACK IMMEDIATE to
roll back any unfinished transactions and set our database
to a one-connection state.
In this example we’re setting the database to enable a single open connection,
and the rest will be closed.
ALTER DATABASE DatabaseName SET SINGLE_USER WITH
ROLLBACK IMMEDIATE
When the restoration process is over, we can reset the database to its active
state, namely MULTI_USER. Look at the syntax below where we enable
multiple users to connect to the database.
ALTER DATABASE DatabaseName SET MULTI_USER
Restore Types
In this section we are going to discuss several database restore types you can
use based on your situation.
The first is the full restore type, which is used to restore the whole database
and all of its files. Take note that you can also use the WITH REPLACE
clause to overwrite an already existing database; in fact, if you set the
recovery model to Full, you’ll need to use this clause anyway. Furthermore,
with the FILE = parameter you can select the file you want to
restore, whether you’re performing a full or differential restore. On the other
hand, if you went with the simple recovery model, you can skip the
WITH REPLACE clause. Finally, this action will create the database, its
files, and all of the information within if it doesn’t already exist, because it’s
restoring everything from the .BAK file.
Here's an example of restoring a file when using the full recovery model:
RESTORE DATABASE DatabaseName FROM DISK =
'C:\SQLBackups\DatabaseName.BAK' WITH FILE = 1,
REPLACE
And here we have an example when using the simple recovery model
instead:
RESTORE DATABASE DatabaseName FROM DISK =
'C:\SQLBackups\DatabaseName.BAK'
Next up we have the differential restore option. Again, you should start by
using the RESTORE HEADERONLY statement to learn which backup
files are available and whether they’re full or differential file types. Then you
can choose to run a full restore of the backup file using the NORECOVERY
option, because this specifies that other files should be restored as well.
When the backup is specified, you can also use the WITH NORECOVERY
option in order to restore the differential file types, too. Lastly, if you add the
RECOVERY statement in the differential file type restore, you’ll be telling
the system that there are no other files that require restoring.
Here's the syntax for restoring the full backup file:
RESTORE DATABASE DatabaseName FROM DISK =
'C:\SQLBackups\DatabaseName.BAK' WITH FILE = 8,
NORECOVERY
Here's the syntax for restoring the differential backup file:
RESTORE DATABASE DatabaseName FROM DISK =
'C:\SQLBackups\DatabaseName.BAK' WITH FILE = 9,
RECOVERY
And finally, to run a complete database restore process you need to use the log
restore option. Once you perform the full or differential restore of the
database, you can restore the transaction log as well. Take note that you also
need to use the NORECOVERY option when you restore the backup so that
the log can be restored afterward.
Now, going back to the backup you made for your database and the data loss
simulation performed on the Product_Sales table, you can now restore the
database. To do that, you need to use the master database. Set the database to
the single-user state, run the restore process, and then set it to the multiuser
state. After the database has been restored, you can query the table
to confirm that you have access to that information once more.
Attaching and Detaching Databases
The techniques used to attach and detach databases have many similarities
with the methods used to back up or restore a database. Essentially, you’ll be
copying MDF and LDF files, and performing a process similar to backup
and restore, but faster. Furthermore, the database has to go offline, which
means that nobody can have access to it in this state. It will remain this way
until we reattach it.
Since the processes of backing up/restoring and attaching/detaching are
similar, we need to determine when to use each one. Normally, you should go
with the backup solution; however, you’ll encounter scenarios where all you
can do is go with the second option.
Imagine having a database with a large number of file groups. Attaching it
can be difficult and time-consuming. So, in this case you should back it up
and restore it to a different destination. During the backup process, the files
will group together automatically.
Now, if you have a huge database, the backup process might take a very long
time. This is when we can take advantage of the speed offered by the
attaching/detaching method. It works quickly because the database is taken
offline, detached, and then simply attached to a different location.
The two file types that can be used with this method are the .MDF and
.LDF files. The .MDF file is the main
database file, containing all the data and the information on the structure of
the database. The .LDF file, on the other hand, holds the history of the
transaction log and activities. Then we have the .BAK backup file, which
automatically groups all the files together, and can be used to restore various
versions of those files.
Choose the right method after analyzing the scenario you’re in. However, the
backup/restore method should still be your preferred option in most cases.
Just examine your situation and perform your tests before working with live
data.
We’ve attached the database, so let’s detach it from the server and then
perform the attaching process once more with the SQL syntax.
Detaching the Database
Take note that in SQL Server we have a stored procedure for detaching a
database, and it can be found inside the master database. Here’s how you can
peek inside the procedure and analyze its complexity:
When you see the login window, click on the drop-down menu and select
Server Authentication.
Then you can type the username and password to log in.
When you’re logged into the user account, you can use the Object Explorer to
open a database. Here’s how the window looks:
Take note that if you’re trying to expand our example database, you’ll have
just a blank result. Why? Simply because the current user has no access to
this particular database. Let’s talk more about this in the following section.
Database Users and Roles
The database user is the user who has access with certain privileges to a
specific database or to multiple databases. The database user can also give
access to other users in order to use the data inside, and they can also limit
that access or even fully restrict it. Take note that in the real world you
normally won’t give full access to many users. Instead you’d allow them
access to a server instance.
Database users also come with a set of database roles that are similar to the
server level roles. Here they are:
Now, try using (*) instead of entering the name of the column:
SELECT COUNT(*)
FROM SALES_REP;[47]
You’ll notice the same result after executing the statement because we don’t
have any null values in the EMP_NAME field. If that field did contain a
null, however, COUNT(EMP_NAME) would skip the row, while COUNT(*)
would still include it, because the asterisk counts entire rows rather than
the values in a single column.
You can also combine COUNT with the GROUP BY clause to count the
rows in each branch:
SELECT BRANCH, COUNT(*) FROM SALES_REP
GROUP BY BRANCH;
Take note that you can use the COUNT function together with the
DISTINCT clause to discover certain entries. For instance, when you need to
learn the number of distinct branches that were stored in the table, you can
use the following syntax:
SELECT COUNT (DISTINCT BRANCH)
FROM SALES_REP;
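You can verify the COUNT behavior with Python’s sqlite3 module. The rows below are a small subset of the SALES_REP data used in this chapter, which is enough to show the difference between counting rows and counting distinct values.

```python
import sqlite3

# The COUNT examples above, run against a small SALES_REP table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES_REP (ID INTEGER, EMP_NAME TEXT, SALES REAL, BRANCH TEXT)"
)
conn.executemany("INSERT INTO SALES_REP VALUES (?, ?, ?, ?)", [
    (1001, 'ALAN MARSCH', 3000.00, 'NEW YORK'),
    (1098, 'NEIL BANKS', 5400.00, 'LOS ANGELES'),
    (2005, 'RAIN ALONZO', 4000.00, 'NEW YORK'),
    (3008, 'MARK FIELDING', 3555.00, 'CHICAGO'),
])

# COUNT(*) counts rows; COUNT(DISTINCT ...) counts unique values.
total = conn.execute("SELECT COUNT(*) FROM SALES_REP").fetchone()[0]
branches = conn.execute(
    "SELECT COUNT(DISTINCT BRANCH) FROM SALES_REP"
).fetchone()[0]
print(total, branches)   # 4 rows, 3 distinct branches
```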
The AVG Function
Let’s start with the following example:
SELECT AVG (<expression>)
FROM “table_name”;[48]
Remember that the expression can also be a mathematical operation or a
column name. Furthermore, a mathematical operation can involve more than
one column.
In the following example we can use the AVG function in order to determine
the value of the average sales from the Sales_Rep table. Here’s how it works:
ID EMP_NAME SALES BRANCH
1001 ALAN MARSCH 3000.00 NEW YORK
1098 NEIL BANKS 5400.00 LOS ANGELES
2005 RAIN ALONZO 4000.00 NEW YORK
3008 MARK FIELDING 3555.00 CHICAGO
4356 JENNER BANKS 14600.00 NEW YORK
4810 MAINE ROD 7000.00 NEW YORK
5783 JACK RINGER 6000.00 CHICAGO
6431 MARK TWAIN 10000.00 LOS ANGELES
7543 JACKIE FELTS 3500.00 CHICAGO
MARK GOTH 5400.00 AUSTIN
SELECT AVG(Sales) FROM Sales_Rep;
And this is the output:
AVG(SALES)
6245.50
The average value you can see in the output represents the average result
from the entire sales information we have in the Sales_Rep table. It is
determined by calculating the sum of the sales field and by dividing that
value by the number of total sales entries, 10 rows in this case.
Now, let’s use the same function in a mathematical operation where we
assume a tax of 6.6% of sales and we need to determine the average tax
value.
SELECT AVG(Sales*.066) FROM Sales_Rep;
The mathematical operation is performed first, and the AVG function is
applied to the result afterward. We can also use AVG with the GROUP BY
clause to determine the average sales for each branch:
SELECT Branch, AVG(Sales)
FROM Sales_Rep
GROUP BY Branch;
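Run against the full Sales_Rep data in SQLite, the AVG query reproduces the 6245.50 figure shown above. One assumption: the MARK GOTH row has no ID in the printed table, so NULL stands in for it here.

```python
import sqlite3

# The AVG examples above, run in SQLite against the Sales_Rep data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Rep (ID INTEGER, EMP_NAME TEXT, SALES REAL, BRANCH TEXT)"
)
conn.executemany("INSERT INTO Sales_Rep VALUES (?, ?, ?, ?)", [
    (1001, 'ALAN MARSCH', 3000.00, 'NEW YORK'),
    (1098, 'NEIL BANKS', 5400.00, 'LOS ANGELES'),
    (2005, 'RAIN ALONZO', 4000.00, 'NEW YORK'),
    (3008, 'MARK FIELDING', 3555.00, 'CHICAGO'),
    (4356, 'JENNER BANKS', 14600.00, 'NEW YORK'),
    (4810, 'MAINE ROD', 7000.00, 'NEW YORK'),
    (5783, 'JACK RINGER', 6000.00, 'CHICAGO'),
    (6431, 'MARK TWAIN', 10000.00, 'LOS ANGELES'),
    (7543, 'JACKIE FELTS', 3500.00, 'CHICAGO'),
    (None, 'MARK GOTH', 5400.00, 'AUSTIN'),   # ID missing in the text
])

avg_sales = conn.execute("SELECT AVG(SALES) FROM Sales_Rep").fetchone()[0]
print(avg_sales)   # 6245.5, matching the output in the text

per_branch = conn.execute(
    "SELECT BRANCH, AVG(SALES) FROM Sales_Rep GROUP BY BRANCH ORDER BY BRANCH"
).fetchall()
print(per_branch)
```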
The ROUND Function
Let’s start with the following example:
ROUND (expression, [decimal place])
Take note that the decimal place refers to the number of decimal places to be
returned. So, if we specify a value of -1, the number will be rounded to the
nearest ten.
Now, let’s go through a couple of examples by using the following table as
our source of data.
ID Name Grade
1 Jack Knight 87.6498
2 Daisy Poult 98.4359
3 James McDuff 97.7853
4 Alicia Stone 89.9753
If you want to approximate to the nearest tenth, the syntax will look like this:
SELECT Name, ROUND (Grade, 1) Rounded_Grade FROM
Student_Grade;[49]
This would be the result:
Name Rounded_Grade
Jack Knight 87.6
Daisy Poult 98.4
James McDuff 97.8
Alicia Stone 90.0
But what if you want to round the values to the nearest ten, with no decimals
at all? To do that, you simply pass a negative parameter to the function.
SELECT Name, ROUND (Grade, -1) Rounded_Grade FROM
Student_Grade;
This is the output:
Name Rounded_Grade
Jack Knight 90
Daisy Poult 100
James McDuff 100
Alicia Stone 90
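SQL’s ROUND maps closely to Python’s built-in round(), including the negative-precision case, so we can check the Student_Grade results directly. One caveat: Python rounds exact halves to the nearest even digit, but none of these values are exact halves, so the outputs match the tables above.

```python
# SQL's ROUND(expression, decimal_place) behaves much like Python's
# round(), including negative precision. The values come from the
# Student_Grade table in the text.
grades = {
    'Jack Knight': 87.6498,
    'Daisy Poult': 98.4359,
    'James McDuff': 97.7853,
    'Alicia Stone': 89.9753,
}

to_tenth = {name: round(g, 1) for name, g in grades.items()}   # ROUND(Grade, 1)
to_tens = {name: round(g, -1) for name, g in grades.items()}   # ROUND(Grade, -1)
print(to_tenth)
print(to_tens)
```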
Let’s see another example where we use the sum function together with the
GROUP BY clause to determine the number of sales accomplished by every
company branch in our table.
SELECT Branch, SUM(Sales) FROM Sales_Rep
GROUP BY Branch;
This is the output:
BRANCH SUM(Sales)
AUSTIN 5400.00
CHICAGO 13055.00
LOS ANGELES 15400.00
NEW YORK 28600.00
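A small sqlite3 sketch shows how GROUP BY collapses rows per branch before SUM totals them. The rows here are invented for illustration, chosen only so the CHICAGO total matches the book's output:

```python
import sqlite3

# Invented rows: two CHICAGO sales that sum to 13055.00.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Rep (Branch TEXT, Sales REAL)")
conn.executemany(
    "INSERT INTO Sales_Rep VALUES (?, ?)",
    [("AUSTIN", 5400.00), ("CHICAGO", 3500.00), ("CHICAGO", 9555.00)],
)

# GROUP BY produces one output row per distinct Branch value,
# and SUM totals the Sales within each group.
for branch, total in conn.execute(
    "SELECT Branch, SUM(Sales) FROM Sales_Rep GROUP BY Branch ORDER BY Branch"
):
    print(branch, total)
# AUSTIN 5400.0
# CHICAGO 13055.0
```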
Now let’s apply the MAX function to a mathematical operation, as we did
with the other functions. We will use the data in our sales table and calculate
the highest sales tax value with the following statement:
SELECT MAX(Sales*0.066) FROM Sales_Rep;[52]
This is the output:
MAX(Sales*0.066)
963.60
Finally, we can use the function with the GROUP BY clause in order to
determine the maximum value for each company branch. Here’s how:
SELECT Branch, MAX(Sales) FROM Sales_Rep GROUP BY
Branch;[53]
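Both MAX patterns can be sketched the same way in sqlite3, again with invented rows. As with AVG, the per-row multiplication happens first and MAX then picks the largest result:

```python
import sqlite3

# Invented rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Rep (Branch TEXT, Sales REAL)")
conn.executemany(
    "INSERT INTO Sales_Rep VALUES (?, ?)",
    [("AUSTIN", 5400.00), ("CHICAGO", 3500.00), ("CHICAGO", 9555.00)],
)

# MAX over an expression: Sales * 0.066 is computed per row,
# then MAX returns the largest of those results.
top_tax = conn.execute("SELECT MAX(Sales * 0.066) FROM Sales_Rep").fetchone()[0]
print(round(top_tax, 2))  # 9555 * 0.066 = 630.63

# MAX with GROUP BY: the largest single sale in each branch.
for branch, top in conn.execute(
    "SELECT Branch, MAX(Sales) FROM Sales_Rep GROUP BY Branch ORDER BY Branch"
):
    print(branch, top)
# AUSTIN 5400.0
# CHICAGO 9555.0
```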
[1] https://www.w3schools.com/sql/sql_where.asp
[2] https://www.w3schools.com/sql/sql_orderby.asp
[3] https://sites.google.com/site/prgimr/sql/ddl-commands---create---drop---alter
[4] https://www.w3schools.com/sql/sql_primarykey.asp
[5] https://www.1keydata.com/sql/sql-create-view.html
[6] https://www.1keydata.com/sql/sql-create-view.html
[7] https://www.w3schools.com/sql/sql_join_inner.asp
[8] https://www.w3schools.com/sql/sql_join_right.asp
[9] https://www.w3schools.com/sql/sql_join_left.asp
[10] https://www.w3schools.com/sql/sql_union.asp
[11] https://www.w3schools.com/sql/sql_union.asp
[12] https://www.techonthenet.com/sql/union_all.php
[13] https://www.w3schools.com/sql/sql_notnull.asp
[14] https://www.w3schools.com/sql/sql_unique.asp
[15] https://www.w3schools.com/sql/sql_unique.asp
[16] https://chartio.com/resources/tutorials/how-to-alter-a-column-from-null-to-not-null-in-sql-server/
[17] https://www.w3schools.com/sql/sql_notnull.asp
[18] https://www.tutorialspoint.com/sql/sql-primary-key.htm
[19] https://www.tutorialspoint.com/sql/sql-primary-key.htm
[20] https://docs.microsoft.com/en-us/sql/relational-databases/tables/primary-and-foreign-key-constraints?view=sql-server-2017
[21] https://docs.faircom.com/doc/esql/32218.htm
[22] https://stackoverflow.com/questions/7573590/can-a-foreign-key-be-null-and-or-duplicate
[23] https://stackoverflow.com/questions/7573590/can-a-foreign-key-be-null-and-or-duplicate
[24] https://www.w3schools.com/sql/sql_check.asp
[25] https://stackoverflow.com/questions/11981868/limit-sql-server-column-to-a-list-of-possible-values
[26] https://www.tutorialspoint.com/sql/sql-primary-key.htm
[27] https://www.tutorialspoint.com/sql/sql-primary-key.htm
[28] https://stackoverflow.com/questions/11981868/limit-sql-server-column-to-a-list-of-possible-values
[29] https://stackoverflow.com/questions/11981868/limit-sql-server-column-to-a-list-of-possible-values
[30] https://docs.microsoft.com/en-us/biztalk/core/step-2-create-the-inventory-request-schema
[31] https://www.techonthenet.com/oracle/schemas/create_schema_statement.php
[32] (Stanek, 2010)
[33] https://www.postgresql.org/docs/9.3/sql-createschema.html
[34] https://www.techonthenet.com/sql/tables/create_table2.php
[35] https://www.w3schools.com/sql/sql_wildcards.asp
[36] https://www.w3schools.com/sql/sql_like.asp
[37] https://www.w3schools.com/sql/sql_like.asp
[38] https://www.1keydata.com/sql/alter-table-rename-column.html
[39] https://www.1keydata.com/sql/alter-table-rename-column.html
[40] https://www.1keydata.com/sql/alter-table-rename-column.html
[41] https://docs.microsoft.com/en-us/sql/relational-databases/tables/modify-columns-database-engine?view=sql-server-2017
[42] https://docs.microsoft.com/en-us/sql/relational-databases/security/authentication-access/server-level-roles?view=sql-server-2017
[43] https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-detach-and-attach-sql-server?view=sql-server-2017
[44] https://www.tutorialspoint.com/sql/sql-like-clause.htm
[45] https://www.tutorialspoint.com/sql/sql-like-clause.htm
[46] https://docs.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-2017
[47] https://docs.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-2017
[48] https://docs.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-2017
[49] https://docs.microsoft.com/en-us/sql/t-sql/functions/round-transact-sql?view=sql-server-2017
[50] https://docs.microsoft.com/en-us/sql/t-sql/functions/round-transact-sql?view=sql-server-2017
[51] https://docs.microsoft.com/en-us/sql/t-sql/functions/round-transact-sql?view=sql-server-2017
[52] https://stackoverflow.com/questions/1475589/sql-server-how-to-use-an-aggregate-function-like-max-in-a-where-clause
[53] https://stackoverflow.com/questions/1475589/sql-server-how-to-use-an-aggregate-function-like-max-in-a-where-clause
[54] https://stackoverflow.com/questions/1475589/sql-server-how-to-use-an-aggregate-function-like-max-in-a-where-clause
[55] https://github.com/lerocha/chinook-database
[56] https://github.com/jpwhite3/northwind-SQLite3