
CSCI235 Database Systems Assignment 1


School of Computing and Information Technology Session: Autumn 2021

University of Wollongong Lecturer: Janusz R. Getta

CSCI235 Database Systems


Assignment 1
15 March 2021

Scope
This assignment includes the tasks related to database normalization, indexing of relational
tables, and using cursors to search a database.

The outcomes of this assignment are due by Saturday 3 April, 2021, 7.00 pm (sharp).

Please read the information listed below very carefully.

This assignment contributes 20% of the total evaluation in the subject CSCI235.

A submission procedure is explained at the end of specification.

This assignment consists of 4 tasks and specification of each task starts from a new page.

It is recommended to solve the problems before attending the laboratory classes in order to
efficiently use supervised laboratory time.

A submission marked by Moodle as "late" is treated as a late submission no matter how
many seconds it is late.

A policy regarding late submissions is included in the subject outline.

A submission of compressed files (zipped, gzipped, rared, tared, 7-zipped, lhzed, … etc) is
not allowed. The compressed files will not be evaluated.

All files left on Moodle in a state "Draft(not submitted)" will not be evaluated.

An implementation that does not compile due to one or more syntactical and/or run time
errors scores no marks.

It is expected that all tasks included within Assignment 1 will be solved individually
without any cooperation with the other students. If you have any doubts, questions, etc.
please consult your lecturer or tutor during lab classes or office hours. Plagiarism will result
in a FAIL grade being recorded for the assessment task.

Task 1 (6 marks)
Normalization of relational schemas

In all subtasks listed below you must apply the following methodology to find the highest
normal form valid for a relational schema and to decompose it into BCNF schemas
whenever it is necessary.

(i) Apply derivations of functional dependencies to find the minimal keys.
(ii) Apply the definitions of normal forms to find the highest normal form valid for a
schema.
(iii) If a relational schema is not in BCNF then decompose it into the smallest number of
relational schemas, each in BCNF. Try to enforce as many functional dependencies as
possible in the decomposed schemas.

Please note that other methodologies and "educated guesses" do not score any marks.
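Step (i) of the methodology can be illustrated with a short Python sketch. It computes the
closure of a set of attributes and then enumerates the minimal keys by brute force. The
schema PQRS and its functional dependencies are hypothetical and are not among the subtasks
of this assignment; the function names closure and minimal_keys are also only illustrative.
The sketch is a way to cross-check a hand derivation and does not replace the written
derivations required in the deliverables.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of FDs.

    Each FD is a pair (lhs, rhs) of attribute strings, e.g. ("AB", "C").
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left-hand side is derived, the right-hand side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_keys(schema, fds):
    """Enumerate attribute subsets by size; keep the minimal ones that determine the schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            # A superset of a key already found cannot be minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys


# Hypothetical example (assumption): R(P, Q, R, S) with P -> Q and Q -> R.
schema = "PQRS"
fds = [("P", "Q"), ("Q", "R")]

print(sorted(closure("P", fds)))                                # ['P', 'Q', 'R']
print(["".join(sorted(k)) for k in minimal_keys(schema, fds)])  # ['PS']
```

In this example S appears on no right-hand side, so every key must contain S, and the
closure of PS covers the whole schema, which makes PS the only minimal key.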

(1) Consider the following relational schema and a set of functional dependencies valid in
the schema.

R (A, B, C, D, E)
F = {A → CD, CD → E}

Find the highest normal form valid in the relational schema. If the schema is not in
BCNF then decompose the schema into the smallest number of relational schemas,
each one in BCNF. Try to enforce as many functional dependencies as possible in
the decomposed schemas. List all derivations of functional dependencies, minimal
keys found, normal forms and decompositions.

(2) Consider the following relational schema and a set of functional dependencies valid in
the schema.

R (A, B, C, D, E)
F = {ABC → D, D → E}

Find the highest normal form valid in the relational schema. If the schema is not in
BCNF then decompose the schema into the smallest number of relational schemas,
each one in BCNF. Try to enforce as many functional dependencies as possible in
the decomposed schemas. List all derivations of functional dependencies, minimal
keys found, normal forms and decompositions.

(3) Consider the following relational schema and a set of functional dependencies valid in
the schema.

R (A, B, C, D, E)
F = {ABC → DE, DE → AB}

Find the highest normal form valid in the relational schema. If the schema is not in
BCNF then decompose the schema into the smallest number of relational schemas,
each one in BCNF. Try to enforce as many functional dependencies as possible in
the decomposed schemas. List all derivations of functional dependencies, minimal
keys found, normal forms and decompositions.

Deliverables
A file solution1.pdf with the solutions of the problems (1), (2), and (3). Note that
methodologies other than the one requested in the specification above and "educated
guesses" of the solutions score no marks. You must provide complete justifications of
your answers.

Submission of a file with a different name and/or different extension and/or different type
scores no marks!

Task 2 (4 marks)
Normalization of relational schemas

Consider the following conceptual schema of a sample database domain where people own
the cars manufactured by the manufacturers. A person is described by a mobile phone
number, first and last names and date of birth. The cars are described by a registration
number, model, year when a car has been manufactured, engine capacity and retail price
recommended by a manufacturer. A manufacturer is described by a name, web site, country
where a manufacturer is located and a list of car models manufactured. A manufacturer
manufactures many car models. A manufacturer determines for each car model a unique
recommended retail price. An engine capacity is the same for all cars that belong to the
same model.

A conceptual schema created by a database designer is given below.

A database designer made a few mistakes at both the conceptual modelling stage and the
logical design stage (i.e. transformation of a conceptual schema into the relational schemas).

At present, the relational schemas created by a database designer are the following.

PERSON(mobile-phone, first-name, last-name, date-of-birth)
primary key = (mobile-phone)

CAR(registration-number, model, year-when-manufactured,
    recommended-retail-price, engine-capacity, owner, mname)
primary key = (registration-number)
foreign key = (owner) references PERSON(mobile-phone)
foreign key = (mname) references MANUFACTURER(name)

MANUFACTURER(name, web-site, country, models-manufactured)
primary key = (name)
candidate key = (web-site)

Assume that at the moment all relational schemas are in 1NF.

Your task is to use the analysis of functional dependencies and normalization of relational
schemas to find the highest normal form valid for each one of the relational schemas:
PERSON, CAR and MANUFACTURER listed above. If a relational schema is not in BCNF
then you must decompose it into the relational schemas in BCNF.
You must apply the following methodology to find the highest normal form valid for a
relational schema and to decompose it into BCNF schemas whenever it is necessary.

(i) Apply derivations of functional dependencies to find the minimal keys.
(ii) Apply the definitions of normal forms to find the highest normal form valid for a
schema.
(iii) If a relational schema is not in BCNF then decompose it into the smallest number of
relational schemas, each one in BCNF. Try to enforce as many functional
dependencies as possible in the decomposed schemas.

Please note that different methodologies and "educated guesses" do not score any marks.

Repeat this process for every relational schema listed above.
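Steps (ii) and (iii) can be sketched in Python as a check for BCNF violations followed by one
decomposition step. The schema ENROLMENT(student, course, lecturer) and its functional
dependency are hypothetical and deliberately unrelated to PERSON, CAR and MANUFACTURER;
the function names bcnf_violations and split_on are illustrative assumptions. The check
below is valid for the schema on which the functional dependencies were stated; for a
decomposed schema the dependencies must first be projected onto its attributes, which this
sketch does not do.

```python
def closure(attrs, fds):
    """Closure of a set of attribute names under FDs given as (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose left-hand side is not a superkey of the schema."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != schema]


def split_on(schema, fd, fds):
    """One decomposition step on a violating FD X -> Y.

    The first schema holds the closure of X, the second holds X and the remaining attributes.
    """
    lhs, _ = fd
    x_plus = closure(lhs, fds) & schema
    return x_plus, lhs | (schema - x_plus)


# Hypothetical example (assumption): each course is taught by exactly one lecturer.
schema = {"student", "course", "lecturer"}
fds = [(frozenset({"course"}), frozenset({"lecturer"}))]

violations = bcnf_violations(schema, fds)
first, second = split_on(schema, violations[0], fds)
print(sorted(first))   # ['course', 'lecturer']
print(sorted(second))  # ['course', 'student']
```

Here the minimal key is (student, course), so course -> lecturer has a left-hand side that is
not a superkey, and splitting on it yields two schemas that are both in BCNF.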

Please remember that both the conceptual schema and the relational schemas are not
completely correct. You must be very careful when finding functional dependencies. Using
only the conceptual schema and the relational schemas is like "walking over a mine field".

Finally, please do not send me any emails saying that the design is incorrect. I know
that it is incorrect! It is your task to improve it.

Deliverables
A file solution2.pdf with a report from normalization of the relational schemas given
above. A report must include a list of functional dependencies found, complete derivations
of minimal keys, complete identification of the highest normal form valid and
decomposition into BCNF whenever it is necessary.

Task 3 (5 marks)
Indexing

Prologue
Download the files dbschema.bmp, dbcreate.sql, dbload.sql and
dbdrop.sql included in a section SAMPLE DATABASES on Moodle. To drop a sample
database, process a script dbdrop.sql. To create a sample database, process a script
dbcreate.sql. To load data into a sample database, process a script dbload.sql.

Connect to the Oracle database server and process the following SQL statement that saves a
query processing plan for a given SELECT statement in PLAN_TABLE.

EXPLAIN PLAN FOR
SELECT bank_name
FROM BANK
WHERE hq_country = 'Brasil';

Next, process the following SELECT statement to display a query processing plan stored
in PLAN_TABLE.

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

Among the others, you should get the following results.


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1547002607

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    44 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BANK |     1 |    44 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter("HQ_COUNTRY"='Brasil')

The line TABLE ACCESS FULL| BANK in the plan given above indicates that the database
system plans to read the entire table BANK to compute the query.

Next, create an index on a column hq_country in a relational table BANK.

CREATE INDEX hqc_idx ON BANK(hq_country);

Again, save a query processing plan for the same SELECT statement as before in
PLAN_TABLE and display a query processing plan stored in PLAN_TABLE.

EXPLAIN PLAN FOR
SELECT bank_name
FROM BANK
WHERE hq_country = 'Brasil';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

Among the others, you should get the following results.


PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
Plan hash value: 1642847795

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    44 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| BANK    |     1 |    44 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | HQC_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

   2 - access("HQ_COUNTRY"='Brasil')

This time, to process the query, the database system plans to use the index HQC_IDX created
a moment ago. Note the line INDEX RANGE SCAN | HQC_IDX in the plan given above. It means
that the database system plans to vertically traverse the index HQC_IDX to find the
identifiers of the rows that satisfy the condition hq_country = 'Brasil'. The line
TABLE ACCESS BY INDEX ROWID BATCHED| BANK means that the system plans to group row
identifiers pointing to the same data blocks in order to minimize the total number of block
read operations. Then, the system plans to access the data blocks of the relational table
BANK pointed to by the row identifiers to find the values of the attribute bank_name.

Conclusions
The EXPLAIN PLAN statement of SQL can be used to get information about a processing plan
created by a query processor for a given SELECT statement. A query processing plan
shows whether an index created earlier will be used for processing of the
SQL statement. We shall use the EXPLAIN PLAN statement to check whether an index
created to speed up a SELECT statement will actually be used for processing of the statement.

To drop an index, process a statement

DROP INDEX hqc_idx;

No report is expected from processing of SQL statements given above.
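The before-and-after comparison described in the Prologue can also be tried without an
Oracle server. The Python sketch below repeats the same sequence (plan without an index,
create an index, plan with the index, drop the index) on an in-memory SQLite database.
This is an analogy only: SQLite uses EXPLAIN QUERY PLAN instead of EXPLAIN PLAN and
DBMS_XPLAN.DISPLAY, its plan text differs from the Oracle output shown above, and the
table contents and the helper name plan are assumptions made for illustration.

```python
import sqlite3

# In-memory stand-in for the sample table BANK (columns and rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bank (bank_name TEXT, hq_country TEXT)")
conn.executemany(
    "INSERT INTO bank VALUES (?, ?)",
    [("Bank One", "Brasil"), ("Bank Two", "Poland"), ("Bank Three", "Australia")],
)

QUERY = "SELECT bank_name FROM bank WHERE hq_country = 'Brasil'"


def plan(connection, query):
    """Return the detail text of each step of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


before = plan(conn, QUERY)                                 # full scan of the table
conn.execute("CREATE INDEX hqc_idx ON bank(hq_country)")
after = plan(conn, QUERY)                                  # search through hqc_idx
conn.execute("DROP INDEX hqc_idx")

print("without index:", before)
print("with index:   ", after)
```

The exact wording of the plan lines depends on the SQLite version, so the sketch prints
them instead of assuming a particular text. The point to check is the same as in the Oracle
plans: the plan without an index scans the table and the plan with an index names hqc_idx.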


Problem
Your task is to find what indexes should be created to speed up processing of SELECT
statements listed below. You are expected to create one index for one SELECT statement.
To simplify the problem, assume that any index, later on used by a query processor to speed
up processing of SELECT statement, will do.

(1)
SELECT amount, bank_name
FROM TRANSACTION
ORDER BY amount, bank_name, acc_num;

(2)
SELECT bank_name
FROM ACCOUNT
WHERE account_type IN ('savings', 'checking');

(3)
SELECT acc_num, bank_name
FROM TRANSACTION
WHERE amount = 100 AND
(type = 'deposit' OR
TO_CHAR(tr_date_time, 'YYYY') = '2019' );

(4)
SELECT *
FROM TRANSACTION
WHERE type = 'deposit'
INTERSECT
SELECT *
FROM TRANSACTION
WHERE amount = 1000;

(5)
SELECT bank_name, amount, COUNT(acc_num)
FROM TRANSACTION
GROUP BY bank_name, amount;

Implement an SQL script solution3.sql, such that for each one of the SELECT statements
given above the script:
- finds and lists a query processing plan for the SELECT statement without an index,
- creates an index,
- finds and lists a query processing plan for the SELECT statement with an index,
- drops an index.
When ready, process the SQL script file solution3.sql and save a report from processing
in a file solution3.lst.

Your report must include a listing of all SQL statements processed and all feedback from
processing of SQL statements. To achieve that put the following SQLcl commands:

SPOOL solution3
SET ECHO ON
SET FEEDBACK ON
SET LINESIZE 300
SET PAGESIZE 5000

at the beginning of SQL script and

SPOOL OFF

at the end of SQL script.

Deliverables
A file solution3.lst with a report from processing of a script file solution3.sql
that lists query processing plans before and after indexing. A report must have no errors
and it must list all SQL statements processed and it must list all feedback from processing
of SQL statements.

Task 4 (5 marks)
Implement an anonymous PL/SQL block, that inserts into a relational table EMPLOYEE
3 new rows with the values of an attribute emp_num larger than the current largest value
of the attribute. The values of all other attributes are up to you.

Next, an anonymous PL/SQL block sorts a relational table EMPLOYEE in descending order
of employee numbers and displays the values of attributes emp_num, first_name,
last_name and date_of_birth from the first 5 rows obtained after sorting. The
values must be displayed in a vertical mode. A vertical mode of listing the rows is explained
below.

Assume, that the values of attributes emp_num, first_name, last_name and
date_of_birth in the first 3 rows are the following.

007 James Bond 12-DEC-1960
006 Harry Potter 06-Jun-1966
005 Robin Hood 12-DEC-1202

Then, a sample vertical way of displaying the rows is the following.

Employee number: 007        006        005
First name     : James      Harry      Robin
Last name      : Bond       Potter     Hood
Date of birth  : 12-12-1960 06-06-1966 12-12-1202

Note that the values of attributes must be left adjusted in each column and the distance
between the columns is up to you. Please remember about the types of the columns.
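The layout logic of the vertical mode can be sketched in Python before it is written in
PL/SQL. The sketch turns each retrieved row into one output column, left adjusts every
value to a fixed width and formats dates as DD-MM-YYYY, as in the sample above. The function
name vertical_listing, the column width of 12 and the representation of emp_num as a
string are assumptions; the required solution is an anonymous PL/SQL block, and this is only
an illustration of the formatting.

```python
from datetime import date


def vertical_listing(rows, width=12):
    """Build the four output lines; every input row becomes one left-adjusted column."""
    labels = ["Employee number", "First name", "Last name", "Date of birth"]
    label_width = max(len(label) for label in labels)

    columns = []
    for emp_num, first_name, last_name, dob in rows:
        # Dates are formatted explicitly as DD-MM-YYYY.
        dob_text = f"{dob.day:02d}-{dob.month:02d}-{dob.year:04d}"
        columns.append([str(emp_num), first_name, last_name, dob_text])

    lines = []
    for i, label in enumerate(labels):
        cells = "".join(column[i].ljust(width) for column in columns)
        lines.append(f"{label.ljust(label_width)}: {cells}".rstrip())
    return lines


# The three sample rows from the specification (emp_num kept as text, an assumption).
rows = [
    ("007", "James", "Bond", date(1960, 12, 12)),
    ("006", "Harry", "Potter", date(1966, 6, 6)),
    ("005", "Robin", "Hood", date(1202, 12, 12)),
]

lines = vertical_listing(rows)
for line in lines:
    print(line)
```

In PL/SQL the same idea is usually expressed by concatenating padded values into one string
per attribute while a cursor loops over the rows, and then printing the four strings.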

To list information retrieved from a sample database, use the PL/SQL package
DBMS_OUTPUT. How to use the DBMS_OUTPUT package is explained in the Cookbook,
Recipe 7.1 How to start programming in PL/SQL. Remember to put
SET SERVEROUTPUT ON at the beginning of a script file that contains your anonymous
PL/SQL block.

Your implementation must use at least one cursor and at least one exception handler.

To test your solution put an implemented anonymous PL/SQL block into SQL script file
solution4.sql and process the script.

Your report must include a listing of all PL/SQL statements processed. To achieve that put
the following SQLcl commands:

SPOOL solution4
SET SERVEROUTPUT ON
SET ECHO ON
SET FEEDBACK ON
SET LINESIZE 300
SET PAGESIZE 400

at the beginning of SQL script and

SPOOL OFF

at the end of SQL script.

Deliverables
A file solution4.lst with a report from testing of an anonymous PL/SQL block
implemented in this task. A report must have no errors and it must list all PL/SQL and SQL
statements processed.

Submission
Submit the files solution1.pdf, solution2.pdf, solution3.lst, and
solution4.lst through Moodle in the following way:
(1) Access Moodle at http://moodle.uowplatform.edu.au/
(2) To log in use a Login link located in the upper right corner of the Web page or in the
middle of the bottom of the Web page
(3) When logged select a site CSCI235 (S121) Database Systems
(4) Scroll down to a section SUBMISSIONS
(5) Click at a link In this place you can submit the outcomes of
Assignment 1
(6) Click at a button Add Submission
(7) Move a file solution1.pdf into an area You can drag and drop files
here to add them. You can also use a link Add…
(8) Repeat a step (7) for the files solution2.pdf, solution3.lst, and
solution4.lst.
(9) Click at a button Save changes
(10) Click at a button Submit assignment
(11) Click at the checkbox with a text attached: By checking this box, I
confirm that this submission is my own work, … in order to
confirm the authorship of your submission.
(12) Click at a button Continue

End of specification
