5. Introduction to Relational Databases: Relational Model, Keys
EVOLUTION OF RDBMS
Before the acceptance of Codd’s Relational Model, a database management system was just an ad
hoc collection of data designed to solve a particular type of problem, later extended to serve more general purposes.
This led to complex systems that were difficult to understand, install, maintain and use. These database systems
were plagued with the following problems:
They required large budgets and staffs of people with special skills that were in short supply.
Database administration staff and application developers needed prior training to access these
database systems.
End-user access to the data was rarely provided.
These database systems did not support the implementation of business logic as a DBMS responsibility.
Hence, the objective of developing the relational model was to address each of the
shortcomings that plagued the systems existing at the end of the 1960s, and to make DBMS products
more widely appealing to all kinds of users. Today's relational database management systems offer powerful,
yet simple solutions for a wide variety of commercial and scientific application problems. Almost every industry
uses relational systems to store, update and retrieve data for operational, transactional, and decision support
systems. Widely used database management systems such as MS SQL Server, IBM DB2, Oracle, MySQL and
Microsoft Access are based on the relational model (SQL itself is the query language they share, not a DBMS).
RELATIONAL DATABASE:
A relational database is a database system in which the database is organized and accessed
according to the relationships between data items, without any need to consider how the data is physically
stored. Relationships between data items are expressed by means of tables. It is a tool that can help you
store, manage and disseminate information of various kinds. It is a collection of inter-related objects, such as
tables, queries, forms, reports, and macros, all stored together. It is a method of structuring
data in the form of records, so that relations between different entities and attributes can be used for data access and
transformation.
Features of an RDBMS
The features of a relational database are as follows:
The ability to create multiple relations (tables) and enter data into them
An interactive query language
Retrieval of information stored in more than one table
A catalog or dictionary, which itself consists of tables (called system tables)
Relational Model:
The relational data model is the primary data model, used widely around the world for data
storage and processing. This model is simple and has all the properties and capabilities required to process data
with storage efficiency.
Attribute: Each column in a table. Attributes are the properties that define a relation, e.g.,
Student_Rollno, NAME, etc.
Tables – In the relational model, relations are saved in table format. A table is stored along with
its entities. A table has two properties: rows and columns. Rows represent records and
columns represent attributes.
Tuple – A single row of a table, which contains a single
record.
Relation Schema: A relation schema represents the name of the relation with its attributes.
Degree: The total number of attributes in the relation is called the degree of the relation.
Cardinality: Total number of rows present in the Table.
Column: The column represents the set of values for a specific attribute.
Relation instance – Relation instance is a finite set of tuples in the RDBMS system. Relation
instances never have duplicate tuples.
Attribute domain – Every attribute has a pre-defined set of permitted values, which is known as its
attribute domain.
Null Values: A NULL value in a table specifies that the field was left blank during
record creation. It is entirely different from a field filled with zero or a field that contains a space.
Relation key - Each row has one or more attributes, known as the relation key, which can identify the
row in the relation (table) uniquely.
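The terms above can be read off a small table. The following is a minimal sketch using Python's built-in sqlite3 module; the table name student, the age column and the sample rows are assumptions made up for illustration, not data from these notes.

```python
import sqlite3


def describe_relation(conn, table):
    """Return (attribute names, degree, cardinality) for a table.

    The table name is interpolated directly, so this helper is only
    meant for trusted names in a small demonstration.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    (row_count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return columns, len(columns), row_count


conn = sqlite3.connect(":memory:")
# Relation schema: student(student_rollno, name, age)  (hypothetical)
conn.execute("CREATE TABLE student (student_rollno INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 19), (2, "Ravi", 20), (3, "Meena", None)],  # None is stored as NULL
)

attributes, degree, cardinality = describe_relation(conn, "student")
print("attributes :", attributes)
print("degree     :", degree)        # number of columns
print("cardinality:", cardinality)   # number of rows (tuples)

# NULL is not the same as zero: only the row left blank is counted here.
(null_ages,) = conn.execute("SELECT COUNT(*) FROM student WHERE age IS NULL").fetchone()
(zero_ages,) = conn.execute("SELECT COUNT(*) FROM student WHERE age = 0").fetchone()
print("rows with NULL age:", null_ages, "| rows with age 0:", zero_ages)
```

Here the degree is the number of columns and the cardinality is the number of rows currently stored, so the cardinality changes as rows are inserted or deleted while the degree stays fixed by the schema.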
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are called
Relational Integrity Constraints. There are three main integrity constraints -
Key constraints
Domain constraints
Referential integrity constraints
Key Constraints
A key is an attribute or set of attributes used to identify an entity uniquely within its entity set.
An entity set can have multiple keys, out of which one is chosen as the primary key. A primary key must contain
unique values and cannot contain NULL values in the relational table.
There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely. This
minimal subset of attributes is called a key for that relation. If there is more than one such minimal subset, these are
called candidate keys.
Primary Key Constraint − A primary key constraint defines a unique identifier for each record in a
table. It guarantees that each record is identified by a single distinct value, or a distinct combination of values,
none of which can be NULL.
Foreign Key Constraint − Reference to the primary key in another table is a foreign key constraint. It
ensures that the values of a column or set of columns in one table correspond to the primary key
column(s) in another table.
Unique Constraint − In a database, a unique constraint ensures that no two values inside a column or
collection of columns are the same.
Key constraints enforce that:
In a relation with a key attribute, no two tuples can have identical values for key attributes.
A key attribute cannot have NULL values.
Key constraints are also referred to as Entity Constraints.
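The two rules above can be observed directly. This is a minimal sketch using Python's built-in sqlite3 module; the table student and the column roll_no are hypothetical names chosen for illustration. One SQLite-specific detail is an assumption worth stating: for historical reasons SQLite accepts NULL in a non-integer PRIMARY KEY column unless NOT NULL is also written, so the sketch declares it explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite does not imply it for a TEXT primary key.
conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO student VALUES ('R001', 'Asha')")


def try_insert(roll_no, name):
    """Attempt an insert and report whether the key constraint allowed it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


duplicate_result = try_insert("R001", "Ravi")  # same key value as an existing tuple
null_result = try_insert(None, "Meena")        # NULL in the key attribute
new_result = try_insert("R002", "Ravi")        # fresh, non-NULL key value

print("duplicate key:", duplicate_result)
print("NULL key     :", null_result)
print("new key      :", new_result)
```

Only the third insert should succeed; the first two violate uniqueness and the no-NULL rule respectively.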
Domain Constraints
Domain constraints can be defined as the definition of a valid set of values for an attribute.
The data types of a domain include string, character, integer, time, date, currency, etc. The value of the attribute must
belong to the corresponding domain.
Example:
Attributes have specific values in real-world scenarios. For example, age can only be a positive integer. The same
constraints are applied to the attributes of a relation. Every attribute is bound to have a specific range
of values. For example, age cannot be less than zero and telephone numbers cannot contain a digit outside 0-9.
Data type constraints − These limitations define the kinds of data that can be kept in a column. A column created as
VARCHAR can take string values, but a column specified as INTEGER can only accept integer values.
Length Constraints − These limitations define the largest amount of data that may be put in a column. For instance, a
column with the definition VARCHAR(10) may only take strings that are up to 10 characters long.
Range constraints − The allowed range of values for a column is specified by range restrictions. A column
designated as DECIMAL(5,2), for example, may only take decimal values up to 5 digits long, including 2 decimal
places.
Nullability constraints − Constraints on a column's capacity to accept NULL values are known as nullability
constraints. For instance, a column that has the NOT NULL definition cannot take NULL values.
Unique constraints − Constraints that require the presence of unique values in a column or group of columns are
known as unique constraints. For instance, duplicate values are not allowed in a column with the UNIQUE
definition.
Check constraints − These constraints specify a condition that must hold for any
data placed into the column. For instance, a column with the definition CHECK (age > 0) can only accept ages that
are greater than zero.
Default constraints − Default constraints automatically assign a value to a column when no
value is provided. For example, a column with a DEFAULT value of 0 will have 0 as its value if no other value is
specified.
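Several of the domain constraints listed above can be tried out in a few lines. The sketch below uses Python's built-in sqlite3 module; the table person and its columns are hypothetical. Because SQLite does not enforce declared lengths such as VARCHAR(10), the length limit is written here as a CHECK on length(name), which is a substitution made for this sketch rather than the general SQL way of expressing it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        name   TEXT NOT NULL CHECK (length(name) <= 10),  -- nullability + length
        age    INTEGER CHECK (age > 0),                   -- check constraint
        visits INTEGER DEFAULT 0                          -- default constraint
    )
    """
)


def accepts(name, age):
    """Return True if the row satisfies every constraint on the table."""
    try:
        conn.execute("INSERT INTO person (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False


valid_row = accepts("Asha", 19)
negative_age = accepts("Ravi", -5)             # violates CHECK (age > 0)
missing_name = accepts(None, 20)               # violates NOT NULL
long_name = accepts("A very long name", 20)    # violates the length check

# visits was not supplied, so the DEFAULT value is stored.
(default_visits,) = conn.execute(
    "SELECT visits FROM person WHERE name = 'Asha'"
).fetchone()

print(valid_row, negative_age, missing_name, long_name, default_visits)
```

Each rejected row fails exactly one constraint, which makes it easy to see which rule is responsible.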
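Referential integrity Constraints
Referential integrity constraints are based on foreign keys: every foreign key value in one table must match an
existing primary key value in the table it references.
Relational database systems are also expected to be equipped with a query language that helps users query the
database instances. There are two kinds of query languages - relational algebra and relational calculus.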
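To give a feel for relational algebra, here is a small sketch of two of its operations, selection and projection, written in plain Python over a list of dictionaries. The function names, the students data and the age threshold are all assumptions made for illustration; a real DBMS evaluates such operations very differently.

```python
def select(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate tuples."""
    result = []
    for row in relation:
        projected = {name: row[name] for name in attributes}
        if projected not in result:  # a relation never holds duplicate tuples
            result.append(projected)
    return result


# Hypothetical relation instance.
students = [
    {"student_id": 1, "name": "Asha", "age": 19},
    {"student_id": 2, "name": "Ravi", "age": 21},
    {"student_id": 3, "name": "Ravi", "age": 22},
]

older = select(students, lambda row: row["age"] >= 20)
older_names = project(older, ["name"])

print(older)
print(older_names)
```

In SQL the same query would be written declaratively as SELECT DISTINCT name FROM students WHERE age >= 20.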
The main advantages of the relational model are:
Simple and Easy To Use - Storing data in tables is much easier to understand and implement
compared to other storage techniques.
Manageability - Because of the independent nature of each relation in a relational database, it is
easy to manipulate and manage. This improves the performance of the database.
Query capability - With the introduction of relational algebra, relational databases provide easy
access to data via a high-level query language such as SQL.
Data integrity - With the introduction and implementation of relational constraints, the relational
model can maintain data integrity in the database.
The main disadvantages of the relational model in a DBMS appear when dealing with a huge amount of data:
The performance of the relational model depends upon the number of relations present in the
database.
As the number of tables increases, the requirement for physical memory increases.
The structure becomes complex and queries take longer to respond.
Because of all these factors, the cost of implementing a relational database increases.
Keys:
Keys are a very important part of the relational database model. They are used to establish and identify relationships
between tables and also to uniquely identify any record or row of data inside a table. A key can be a single attribute
or a group of attributes, where the combination acts as a key.
1. Super Key
A Super Key is defined as a set of attributes within a table that can uniquely identify each record within the table. A Super
Key is a superset of a Candidate key.
For a Student table whose columns include student_id and phone, the super keys would include student_id, (student_id, phone), etc.
Confused? The first one is pretty simple: student_id is unique for every row of data, hence it can be used to
identify each row uniquely. Any set of attributes that contains student_id, such as (student_id, phone), is therefore
also a super key. Similarly, the phone number of every student will be unique, hence phone can also
be a key.
2. Candidate Key
A candidate key is a minimal set of fields that can uniquely identify each record in a table. It is an
attribute or a set of attributes that can act as a Primary Key for a table to uniquely identify each record in that table.
There can be more than one candidate key.
In our example, student_id and phone are both candidate keys for the table Student.
A candidate key can never be NULL or empty, and its value must be unique.
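The difference between a super key and a candidate key (minimality) can be sketched in plain Python. The sample rows and the name column are hypothetical. One caveat is an important assumption: a key is a property of the schema and must hold for every valid instance, so checking a single set of rows, as done here, can only suggest keys, not prove them.

```python
from itertools import combinations


def is_super_key(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[name] for name in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attributes):
    """Return the minimal super keys for this sample, smallest sets first."""
    keys = []
    for size in range(1, len(all_attributes) + 1):
        for combo in combinations(all_attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, combo):
                keys.append(combo)
    return keys


# Hypothetical Student rows; two students share a name.
rows = [
    {"student_id": 1, "name": "Asha", "phone": "9876500001"},
    {"student_id": 2, "name": "Ravi", "phone": "9876500002"},
    {"student_id": 3, "name": "Asha", "phone": "9876500003"},
]

print(is_super_key(rows, ("student_id", "phone")))  # super key, but not minimal
print(is_super_key(rows, ("name",)))                # names repeat
print(candidate_keys(rows, ["student_id", "name", "phone"]))
```

For this sample, (student_id, phone) is a super key but not a candidate key, because student_id alone already identifies each row.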
3. Primary Key
The primary key is the candidate key that is most appropriate to become the main key for a table. It is a key that can
uniquely identify each record in a table. There can be only one primary key in a table. For the table Student we can
make the student_id column the primary key.
4. Composite Key
A Composite Key is a set of two or more attributes that together identify each tuple in a table uniquely. The attributes in
the set may not be unique when considered separately. However, when taken all together, they ensure
uniqueness. For example, in a table where student_id and subject_id together form the primary key, that primary key is a composite key.
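A composite key can be declared as a table-level PRIMARY KEY over both columns. The following is a minimal sketch using Python's built-in sqlite3 module; the table name score, the marks column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE score (
        student_id INTEGER NOT NULL,
        subject_id INTEGER NOT NULL,
        marks      INTEGER,
        PRIMARY KEY (student_id, subject_id)  -- composite key
    )
    """
)

# Neither column is unique on its own: student 1 and subject 101 both repeat.
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?)",
    [(1, 101, 78), (1, 102, 85), (2, 101, 90)],
)

# Repeating the same (student_id, subject_id) pair violates the composite key.
try:
    conn.execute("INSERT INTO score VALUES (1, 101, 60)")
    repeated_pair_accepted = True
except sqlite3.IntegrityError:
    repeated_pair_accepted = False

(row_count,) = conn.execute("SELECT COUNT(*) FROM score").fetchone()
print("repeated pair accepted:", repeated_pair_accepted, "| rows stored:", row_count)
```

The first three inserts succeed even though individual column values repeat; only the repeated pair is rejected.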
5. Secondary or Alternate Key
The candidate keys that are not selected as the primary key are known as secondary keys or alternate keys.
6. Foreign Key
A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table.
In simpler words, the foreign key is defined in a second table, but it refers to the primary key in the first table.
Foreign keys help to maintain data and referential integrity.
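The way a foreign key protects referential integrity can be sketched with Python's built-in sqlite3 module. The tables department and student and their columns are hypothetical. Note the assumption specific to SQLite: foreign key enforcement is off by default and has to be switched on per connection with PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys otherwise

# First table: holds the primary key that is referenced.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
# Second table: dept_id is a foreign key pointing at department.dept_id.
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT,
        dept_id    INTEGER REFERENCES department (dept_id)
    )
    """
)

conn.execute("INSERT INTO department VALUES (10, 'Physics')")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 10)")  # department 10 exists

# A student pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 99)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Deleting a department that is still referenced is rejected as well.
try:
    conn.execute("DELETE FROM department WHERE dept_id = 10")
    parent_delete_accepted = True
except sqlite3.IntegrityError:
    parent_delete_accepted = False

print("orphan row accepted:", orphan_accepted)
print("referenced parent deleted:", parent_delete_accepted)
```

Both operations fail because each would leave a student row whose dept_id matches no row in department.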
7. Unique Key
A Unique Key is a column or set of columns that uniquely identifies each record in a table. All values have to be
unique in this key. A unique key differs from a primary key because it can accept a NULL value (some systems, such as
SQL Server, allow only one NULL, while others allow several), whereas a
primary key cannot have any NULL values.
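How a unique key treats NULL can be checked with a short experiment. This sketch uses Python's built-in sqlite3 module with a hypothetical student table; it shows SQLite's behaviour only, where a UNIQUE column accepts more than one NULL, and other database systems may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, phone TEXT UNIQUE)"
)

conn.execute("INSERT INTO student VALUES (1, '9876500001')")
conn.execute("INSERT INTO student VALUES (2, NULL)")  # NULL is allowed in a unique key
conn.execute("INSERT INTO student VALUES (3, NULL)")  # SQLite also accepts a second NULL

# A repeated non-NULL value violates the unique key.
try:
    conn.execute("INSERT INTO student VALUES (4, '9876500001')")
    duplicate_phone_accepted = True
except sqlite3.IntegrityError:
    duplicate_phone_accepted = False

(null_phone_rows,) = conn.execute(
    "SELECT COUNT(*) FROM student WHERE phone IS NULL"
).fetchone()
print("duplicate phone accepted:", duplicate_phone_accepted)
print("rows with NULL phone:", null_phone_rows)
```

The contrast with the primary key is that student_id can never be left NULL here, while phone can.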