Intelligent Data Management With SQL Server_ACE_INTL - Aptech
No part of this book may be reproduced or copied in any form or by any means – graphic, electronic, or mechanical, including photocopying, recording, taping, or storing in an information retrieval system – or sent or transferred without the prior written permission of the copyright owner, Aptech Limited.
APTECH LIMITED
Contact E-mail: ov-support@onlinevarsity.com
Edition 1 - 2020
PREFACE
SQL Server is a client-server based Relational Database Management System (RDBMS) from
Microsoft. It provides an enterprise-level data management platform for an organization.
SQL Server includes numerous features and tools that make it an outstanding database and
data analysis platform. It is also targeted for large-scale Online Transaction Processing
(OLTP), data warehousing, and e-commerce applications. One of the key features of SQL
Server is that it is now available on the cloud too.
The book begins with an introduction to RDBMS concepts and moves on to introduce SQL
Server 2019. The book then covers various SQL Server topics such as data types, usage of
Transact-SQL, and database objects such as indexes, stored procedures, functions, and so
on. The book also introduces Azure SQL and cloud databases. The book describes
transactions, programming elements with Transact-SQL, and finally troubleshooting errors
with error handling techniques.
The book also explores SQL Server 2019 new features and enhancements. These include
features such as Big Data clusters, PolyBase, Query Store, Stretch Database, and
In-Memory enhancements. Besides these, you will also learn about the improved
Performance Tools and Transact-SQL enhancements.
The knowledge and information in this book are the result of the concentrated effort of the Design Team, which continuously strives to bring you the latest, best, and most relevant subject matter in Information Technology. As part of Aptech's quality drive, this team does intensive research and curriculum enrichment to keep the content in line with industry trends and learner requirements.
1. RDBMS Concepts
2. Entity-Relationship (E-R) Model and Normalization
3. Introduction to SQL Server 2019
4. Transact-SQL
5. Creating and Managing Databases
6. Creating Tables
7. Azure SQL
8. Accessing Data
9. Advanced Queries and Joins
10. Views, Stored Procedures, and Querying Metadata
11. Indexes
12. Triggers
13. Programming Transact-SQL
14. Transactions
15. Error Handling
16. Enhancements in SQL Server 2019
17. PolyBase, Query Store, and Stretch Database
Session - 1
RDBMS Concepts
Welcome to the Session, RDBMS Concepts.
This session deals with the concepts related to databases and database management
systems, explores various database models, and introduces the concept of an RDBMS.
1.1 Introduction
Organizations often maintain large amounts of data, which are generated as a result of day-to-day
operations. A database is an organized form of such data. It may consist of one or more related data items
called records. Think of a database as a data collection to which different questions can be asked. For
example, 'What are the phone numbers and addresses of the five nearest post offices?' or 'Do we have any
books in our library that deal with health food? If so, on which shelves are they located?' or 'Show me the
personnel records and sales figures of five best-performing sales people for the current quarter, but their
address details are not required to be shown'.
Information helps to foresee and plan events. Intelligent interpretation of data yields information. In the world of business, being able to predict an event and plan for it can save time and money. Consider an example where a car manufacturing company is planning its annual purchase of certain parts, which have to be imported since they are not available locally. If data about the purchase of these parts over the last few years is available, it can be analyzed to predict the requirement and plan the purchase in advance.
A database is a collection of data. Some like to think of a database as an organized mechanism that has
the capability of storing information. This information can be retrieved by the user in an effective and
efficient manner.
A phone book is a database. The data contained consists of individuals' names, addresses, and telephone
numbers. These listings are in alphabetical order or indexed. This allows the user to reference a particular
local resident with ease. Ultimately, this data is stored in a database somewhere on a computer. As people
move to different cities or states, entries may have to be added or removed from the phone book. Likewise,
entries will have to be modified for people changing names, addresses, or telephone numbers, and so on.
Thus, a database is a collection of data that is organized such that its contents can be easily accessed,
managed, and updated.
The two different approaches of managing data are file-based systems and database systems.
In a file-based system, different programs in the same application may be interacting with different private
data files. There is no system enforcing any standardized control on the organization and structure of these
data files.
• Unanticipated queries
In a file-based system, handling sudden/ad-hoc queries can be difficult, since it requires changes in the existing programs. For example, suppose a bank officer must generate a list of all the customers who have an account balance of $20,000 or more. The bank officer has two choices: either obtain the list of all customers and have the required information extracted manually, or hire a system programmer to write the necessary application program. Both alternatives are obviously unsatisfactory. Suppose such a program is written, and several days later, the officer must trim that list to include only those customers who opened their account more than one year ago. As a program to generate such a list does not exist, this leads to difficulty in accessing the data.
• Data isolation
Data is scattered in various files, and the files may be in different formats. Though data used by different programs in the application may be related, they reside as isolated data files.
• Security problems
In data-intensive applications, security of data is a major concern. Users should be given access only
to required data and not to the whole database.
• Integrity problems
In any application, there will be certain data integrity rules, which must be maintained. These could
be in the form of certain conditions/constraints on the elements of the data records. In the savings bank
application, one such integrity rule could be 'Customer ID, which is the unique identifier for a customer
record, should not be empty'. There can be several such integrity rules. In a file-based system, all these
rules must be explicitly programmed in the application program.
Though all these are common issues of concern to any data-intensive application, each application has to handle all these problems on its own. The application programmer must be concerned not only with implementing the application business rules but also with handling these common issues.
Databases are used to store data in an efficient and organized manner. A database allows quick and easy
management of data. For example, a company may maintain details of its employees in various databases.
At any point in time, data can be retrieved from the database, new data can be added into the databases, and data can be searched based on some criteria in these databases.
Data storage can be achieved even using simple manual files. For instance, a college has to maintain
information about teachers, students, subjects, and examinations.
Details of the teachers can be maintained in a Staff Register and details of the students could be entered in
a Student Register and so forth. However, data stored in this form is not permanent. Records in such
manual files can only be maintained for a few months or few years. The registers or files are bulky, consume
a lot of space, and hence, cannot be kept for many years.
Instead of this, if the same data were stored using a database system, it would be more permanent and long-lasting.
Some of the benefits of using such a centralized database system are as follows:
• Reduced redundancy and inconsistency: when a data item is stored in only one place, there is just one record in the database that must be changed. As a result, data inconsistency is reduced.
• Enforcement of standards: it is certain that all the names stored in the database will follow the same format if the standards are set in this manner.
• Integrity through centralized control: centralized control of the database helps in avoiding such errors. It is certain that if a record is deleted from one table, its linked record in the other table is also deleted.
A database is a collection of interrelated data, and a DBMS is a set of programs used to add or modify this data. Thus, a DBMS is a set of software programs that allows databases to be defined, constructed, and manipulated.
A DBMS provides an environment that is both convenient and efficient to use when there is a large volume of
data and many transactions to be processed. Different categories of DBMS can be used, ranging from small
systems that run on personal computers to huge systems that run on mainframes.
From a technical standpoint, DBMS products can differ widely. Different DBMS support different query
languages, although there is a semi-standardized query language called Structured Query Language (SQL).
Sophisticated languages for managing database systems are called Fourth-Generation Languages (4GLs).
The information from a database can be presented in a variety of formats. Most DBMS include a report
writer program that enables the user to output data in the form of a report. Many DBMSs also include a
graphics component that enables the user to output information in the form of graphs and charts.
It is not necessary to use a general-purpose DBMS for implementing a computerized database. Users can write their own set of programs to create and maintain the database, in effect creating their own special-purpose DBMS software. The database and the software together are called a database system.
The end user accesses the database system through application programs and queries. The DBMS software
enables the user to process the queries and programs placed by the end user. The software accesses the data
from the database.
These reports are the source of information, that is, processed data. A DBMS is also responsible for data security and integrity.
➢ Data definition
A DBMS provides functions to define the structure of the data in the application. These include
defining and modifying the record structure, the type and size of fields, and various
constraints/conditions to be satisfied by the data in each field.
➢ Data manipulation
Once the data structure is defined, data must be inserted, modified, or deleted. The functions that perform these operations are also part of a DBMS. These functions can handle planned and unplanned data manipulation requirements. Planned queries are those that form part of the application.
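As a simple illustration of these data definition and manipulation functions, the following Transact-SQL sketch defines a hypothetical Employee table (with constraints of the kind described below) and then inserts, updates, and deletes a row; the table and column names are illustrative and not part of the original text:

CREATE TABLE Employee (
    EmpNo int NOT NULL PRIMARY KEY,                               -- employee number must not be left blank
    EmpName varchar(50) NOT NULL,
    Telephone varchar(15) CHECK (Telephone NOT LIKE '%[^0-9]%')   -- telephone number may contain digits only
);

INSERT INTO Employee (EmpNo, EmpName, Telephone) VALUES (1, 'John', '5551234');
UPDATE Employee SET Telephone = '5555678' WHERE EmpNo = 1;
DELETE FROM Employee WHERE EmpNo = 1;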
➢ Data security and integrity
Data in the database should contain as few errors as possible. For example, the employee number for a new employee should not be left blank, and a telephone number should contain only digits. Such checks are taken care of by a DBMS.
Thus, the DBMS contains functions that handle the security and integrity of data in the application. These can be easily invoked by the application and hence, the application programmer does not have to code these functions in the programs.
➢ Performance
Optimizing the performance of the queries is one of the important functions of a DBMS. Hence, the
DBMS has a set of programs forming the Query Optimizer, which evaluates the different
implementations of a query and chooses the best among them.
The Windows Registry is an example of a hierarchical database storing configuration settings and options
on Microsoft Windows operating systems.
Within the hierarchical model, Department is perceived as the parent segment. The segments Project and Employee are its children. A path that traces the parent segments beginning from the left defines the tree. This ordered sequencing of segments tracing the hierarchical structure is called the hierarchical path.
It is clear from the figure that in a single department, there can be many employees and a department can
have many projects.
This model is very efficient when a database contains a large volume of data. For example, a bank's
customer account system fits the hierarchical model well because each customer's account is subject to
a number of transactions.
In the network model, data is stored in sets, instead of the hierarchical tree format. This solves the problem
of data redundancy. The set theory of the network model does not use a single-parent tree hierarchy. It
allows a child to have more than one parent. Thus, the records are physically linked through linked-lists.
Integrated Database Management System (IDMS) from Computer Associates International Inc. and
Raima Database Manager (RDM) Server by Raima Inc. are examples of a Network DBMS.
The network model together with the hierarchical data model was a major data model for implementing
numerous commercial DBMS. The network model structures and language constructs were defined by
Conference on Data Systems Language (CODASYL).
For every database, a definition of the database name, record type for each record, and the components
that make up those records is stored. This is called its network schema. A portion of the database as seen
by the application's programs that actually produce the desired information from the data contained in
the database is called sub-schema. It allows application programs to access the required data from the
database.
The relationships are easier to implement in the network database model than in the hierarchical model. However, the model also has drawbacks:
• The programmer has to be very familiar with the internal structures to access the database.
• The model provides a navigational data access environment. Hence, to move from A to E in the sequence A-B-C-D-E, the user has to move through B, C, and D to get to E.
This model is difficult to implement and maintain. Computer programmers, rather than end users, utilize this model.
Popular relational DBMSs are Oracle, Sybase, DB2, Microsoft SQL Server, and so on.
This model represents the database as a collection of relations. In this model's terminology, a row is called a tuple, a column is called an attribute, and a table is called a relation. The list of values applicable to a particular field is called its domain. It is possible for several attributes to have the same domain. The number of attributes of a relation is called the degree of the relation, and the number of tuples determines the cardinality of the relation.
In order to understand the relational model, consider tables 1.3 and 1.4.
Roll Number Student Name
1 Sam Reiner
2 John Parkinson
3 Jenny Smith
4 Lisa Hayes
5 Penny Walker
6 Peter Jordan
7 Joe Wong
Table 1.3: Students Table
The Students table displays the Roll Number and the Student Name, and the Marks table displays the
Roll Number and Marks obtained by the students. Now, two steps must be carried out for students who
have scored more than 50. First, locate the roll numbers of those who have scored more than 50 from the
Marks table. Second, their names have to be located in the Students table by matching the roll number.
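In SQL terms, these two steps can be expressed as a single join query. The sketch below assumes the two tables are named Students and Marks and share a RollNumber column (the exact table and column names are illustrative):

SELECT s.StudentName, m.Marks
FROM Students s
INNER JOIN Marks m ON m.RollNumber = s.RollNumber
WHERE m.Marks > 50;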
In a relational model, data is stored in tables. A table in a database has a unique name that identifies its
contents. Each table can be defined as an intersection of rows and columns.
Advantages: The relational database model gives programmers time to concentrate on the logical view of the database rather than being bothered about the physical view. One of the reasons for the popularity of relational databases is querying flexibility. Most relational databases use Structured Query Language (SQL). An RDBMS uses SQL to translate the user query into the technical code required to retrieve the requested data. The relational model is so easy to handle that even untrained people find it easy to generate handy reports and queries, without giving much thought to the requirement to design a proper database.
Disadvantages: Though the model hides all complexities of the system, it tends to be slower than other database systems.
As compared to all other models, the relational data model is the most popular and widely used.
Tables are related in a relational database, allowing adequate data to be retrieved in a single query
(although the desired data may exist in more than one table). By having common keys, or fields, among
relational database tables, data from multiple tables can be joined to form one large resultset.
Figure 1.5 shows two tables related to one another through a common key (data value) in a relational
database.
Thus, a relational database is a database structured on the relational model. The basic characteristic of a
relational model is that in a relational model, data is stored in relations. To understand relations, consider
the following example.
The Capitals table shown in table 1.6 displays a list of countries and their capitals, and the Currency
table shown in table 1.7 displays the countries and the local currencies used by them.
Country Capital
Greece Athens
Italy Rome
USA Washington
China Beijing
Japan Tokyo
Australia Sydney
France Paris
Table 1.6: Capitals
Country Currency
Greece Drachma
Italy Lira
USA Dollar
China Renminbi (Yuan)
Japan Yen
Australia Australian Dollar
France Francs
Table 1.7: Currency
Both the tables have a common column, that is, the Country column. Now, if the user wants to display the information about the currency used in Rome, first find the name of the country to which Rome belongs. This information can be retrieved from table 1.6. Next, that country should be looked up in table 1.7 to find the currency.
It is possible to get this information because it is possible to establish a relation between the two tables
through a common column called Country.
For example, a company might have an Employee table with a row for each employee. What attributes
might be interesting for such a table? This will depend on the application and the type of use the data will be
put to, and is determined at database design time.
Consider the scenario of a company maintaining customer and order information for products being sold
and customer-order details for a specific month, such as, August.
The tables 1.8, 1.9, 1.10, and 1.11 are used to illustrate this scenario. These tables depict tuples and
attributes in the form of rows and columns. Various terms related to these tables are given in table 1.12.
For a small personal database, one person typically defines the constructs and manipulates the database.
However, many persons are involved in the design, use, and maintenance of a large database with a few
hundred users.
Database Administrator (DBA)
• Is a person who collects the information that will be stored in the database.
• Administering these resources is the responsibility of the DBA. The DBA is also responsible for authorizing access to the database, for coordinating and monitoring its use, and for acquiring software and hardware resources as required. The DBA is accountable for problems such as breach of security or poor system response time.
Database Designer
• Is responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. It is the responsibility of database designers to communicate with all prospective database users in order to understand their requirements and to come up with a design that meets these requirements.
End User
• Invokes an application to interact with the system, or writes a query for easy retrieval,
modification, or deletion of data.
1.7.1 Entity
An entity is a person, place, thing, object, event, or even a concept, which can be distinctly identified. For
example, the entities in a university are students, faculty members, and courses.
Each entity has certain characteristics known as attributes. For example, the student entity might include
attributes such as student number, name, and grade. Each attribute should be named appropriately.
A grouping of related entities becomes an entity set. Each entity set is given a name. The name of the entity
set reflects the contents. Thus, the attributes of all the students of the university will be stored in an entity
set called Student.
DBMS: It does not require data to be in a tabular structure, nor does it enforce tabular relationships between data items.
RDBMS: In an RDBMS, a tabular structure is a must and table relationships are enforced by the system. These relationships enable the user to apply and manage business rules with minimal coding.

DBMS: Only a small amount of data can be stored and retrieved.
RDBMS: An RDBMS can store and retrieve a large amount of data.

DBMS: A DBMS is less secure than an RDBMS.
RDBMS: An RDBMS is more secure than a DBMS.

DBMS: It is a single-user system.
RDBMS: It is a multi-user system.

DBMS: Most DBMSs do not support client/server architecture.
RDBMS: It supports client/server architecture.
Table 1.13: Difference between DBMS and RDBMS
In an RDBMS, a relation is given more importance. Thus, the tables in an RDBMS are dependent and
the user can establish various integrity constraints on these tables so that the ultimate data used by the user
remains correct. In the case of a DBMS, entities are given more importance and there is no relation established among these entities.
5. A __________ describes a container for storing data and the process of storing and retrieving data from that container.
Try It Yourself
1. Create a PowerPoint presentation highlighting in brief what is a DBMS, different database
models, and key features of an RDBMS.
This session talks about Data Modeling, the E-R model, its components, symbols,
diagrams, relationships, Data Normalization, and Relational Operators.
2.1 Introduction
A data model is a group of conceptual tools that describes data, its relationships, and semantics. It also
consists of the consistency constraints that the data adheres to. The Entity-Relationship, Relational,
Network, and Hierarchical models are examples of data models. The development of every database begins
with the basic step of analyzing its data in order to determine the data model that would best represent it.
Once this step is completed, the data model is applied to the data.
Data modeling is as essential to database development as are planning and designing to any project
development. Building a database without a data model is similar to developing a project without its plans
and design. Data models help database developers to define the relational tables, primary and foreign keys,
stored procedures, and triggers required in the database.
Attributes are features that an entity has. Attributes help distinguish every
entity from another. For example, the attributes of a student would be
Attributes roll_number, name, stream, semester, and so on. The attributes of a car
would be registration_number, model, manufacturer, color, price, owner,
and so on.
An entity set is the collection of similar entities. For example, the
Entity Set employees of an organization collectively form an entity set called
employee entity set.
Relationships associate one or more entities and can be of three types. They are as follows:
➢ Self-relationships
Relationships between entities of the same entity set are called self-relationships. For example, a
manager and his team member, both belong to the employee entity set. The team member works for the
manager. Thus, the relation, 'works for', exists between two different employee entities of the same
employee entity set.
The relationship can be seen in figure 2.4.
➢ Binary relationships
Relationships that exist between entities of two different entity sets are called binary relationships. For
example, an employee belongs to a department. The relation exists between two different entities,
which belong to two different entity sets. The employee entity belongs to an employee entity set. The
department entity belongs to a department entity set.
Relationships can also be classified as per mapping cardinalities. Different mapping cardinalities are
as follows:
One-to-One
This kind of mapping exists when an entity of one entity set can be associated with only one entity of
another set. Consider the relationship between a vehicle and its registration. Every vehicle has a unique
registration. No two vehicles can have the same registration details. The relation is one-to-one, that is, one vehicle - one registration.
One-to-Many
This kind of mapping exists when an entity of one set can be associated with more than one entity of another
entity set.
Consider the relation between a customer and the customer's vehicles. A customer can have more than one
vehicle. Therefore, the mapping is a one-to-many mapping, that is, one customer - one or more vehicles.
The mapping cardinality can be seen in figure 2.8.
Many-to-One
This kind of mapping exists when many entities of one set are associated with an entity of another set. This association is done irrespective of whether the latter entity is already associated with other entities of the former entity set.
Consider the relation between a vehicle and its manufacturer. Every vehicle has only one manufacturing
company or coalition associated to it under the relation, 'manufactured by', but the same company or
coalition can manufacture more than one kind of vehicle.
Many-to-Many
This kind of mapping exists when any number of entities of one set can be associated with any number of
entities of the other entity set.
Consider the relation between a bank's customer and the customer's accounts. A customer can have more
than one account and an account can have more than one customer associated with it in case it is a joint
account or similar. Therefore, the mapping is many-to-many, that is, one or more customers associated with
one or more accounts.
The mapping cardinality can be seen in figure 2.10.
Primary keys
A primary key is an attribute that can uniquely define an entity in an entity set. Consider table 2.1
containing the details of students in a school.
Enrollment_Number Name Grade Division
786 Ashley Seven B
957 Joseph Five A
1011 Kelly One A
In a school, every student has a unique enrollment number (such as enrollment_number in table 2.1),
which is unique to the student. Any student can be identified based on the enrollment number. Thus, the
attribute enrollment_number plays the role of the primary key in the Student Details table.
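In SQL, this primary key role can be declared when the table is created; the column types in this sketch are assumed for illustration:

CREATE TABLE StudentDetails (
    Enrollment_Number int NOT NULL PRIMARY KEY,
    Name varchar(50),
    Grade varchar(10),
    Division char(1)
);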
Entity sets that do not have enough attributes to establish a primary key are called weak entity sets.
Entity sets that have enough attributes to establish a primary key are called strong entity sets. Consider the
scenario of an educational institution where at the end of each semester, students are required to complete
and submit a set of assignments. The teacher keeps track of the assignments submitted by the students. Now,
an assignment and a student can be considered as two separate entities. The assignment entity is described
by the attributes assignment_number and subject. The student entity is described by roll_number, name,
and semester. The assignment entities can be grouped to form an assignment entity set and the student
entities can be grouped to form a student entity set. The entity sets are associated by the relation 'submitted
by'. This relation is depicted in figure 2.11.
The attributes, assignment_number and subject, are not enough to identify an assignment entity
uniquely. The roll_number attribute alone is enough to uniquely identify any student entity. Therefore,
roll_number is a primary key for the student entity set. The assignment entity set is a weak entity set since
it lacks a primary key. The student entity set is a strong entity set due to the presence of the roll_number
attribute.
[Figure: E-R diagram symbols for Weak Entity, Attribute, and Key Attribute]
➢ Multi-valued
A multi-valued attribute is illustrated with a double-line ellipse, which has more than one value for at least
one instance of its entity. This attribute may have upper and lower bounds specified for any individual entity
value.
The telephone attribute of an individual may have one or more values, that is, an individual can have one
or more telephone numbers. Hence, the telephone attribute is a multi-valued attribute.
The symbol and example of a multi-valued attribute can be seen in figure 2.12.
A composite attribute may itself contain two or more attributes, which represent basic attributes having
independent meanings of their own.
The address attribute is usually a composite attribute, composed of attributes such as street, area, and so
on. The symbol and example of a composite attribute can be seen in figure 2.13.
Derived attributes are attributes whose value is entirely dependent on another attribute and are
indicated by dashed ellipses.
The age attribute of a person is the best example for derived attributes. For a particular person entity, the
age of a person can be determined from the current date and the person's birth date. The symbol and
example of a derived attribute can be seen in figure 2.14.
Consider the scenario of a bank, with customers and accounts. The E-R diagram for the scenario can be
constructed as follows:
1. Step 1: Gather data
The bank is a collection of accounts used by customers to save money.
2. Step 2: Identify entities
Customer
Account
3. Step 3: Identify the attributes
Customer: customer_name, customer_address, customer_contact
Account: account_number, account_owner, balance_amount
2.4 Normalization
Initially, all databases are characterized by a large number of columns and records. This approach has certain
drawbacks. Consider the following details of the employees in a department. Table 2.3 consists of the
employee details as well as the details of the project they are working on.
The data such as Project_Id, Project_Name, Grade, and Salary repeat many times. This repetition hampers both performance during retrieval of data and storage capacity. This repetition of data is called the repetition anomaly.
The repetition is shown in table 2.4 with the help of shaded cells.
Insertion anomaly
Suppose the department recruits a new employee named Ann. Now, consider that Ann has not been assigned
any project. Insertion of her details in the table would leave columns Project_Id and Project_Name
empty. Leaving columns blank could lead to problems later. Anomalies created by such insertions are called
insertion anomalies. The anomaly can be seen in table 2.5.
Deletion anomaly
Suppose, Bob is relieved from the project MAGNUM. Deleting the record deletes Bob's Emp_No, Grade,
and Salary details too. This loss of data is harmful as all of Bob's personal details are also lost as seen in
the table 2.6. This kind of loss of data due to deletion is called deletion anomaly. The anomaly can be seen
in table 2.6.
Updating anomaly
Suppose John was given a hike in Salary or John was demoted. The change in John's Salary or
Grade must be reflected in all projects John works for. This problem in updating all the occurrences is
called updating anomaly.
The Department Employee Details table is called an unnormalized table. These drawbacks lead to the
need for normalization.
Normalization is the process of removing unwanted redundancy and dependencies. Initially, Codd (1972)
presented three normal forms (1NF, 2NF, and 3NF), all based on dependencies among the attributes of a
relation. The fourth and fifth normal forms are based on multi-value and join dependencies and were
proposed later.
The table has data related to projects and employees. The table must be split into two tables, that is, a Project Details table and an Employee Details table. The table columns, Project_Id and Project_Name, have multiple values, so the data must be split over different rows. The resultant tables are as follows:
Project_Id Project_Name
113 BLUE STAR
124 MAGNUM
Table 2.8: Project Details
The Project_Id attribute is the primary key for the Project Details table.
The Emp_No attribute is the primary key for the Employee Details table. Therefore, in first normal
form, the initial Employee Project Details table has been reduced to the Project Details and
Employee Details tables.
A partial dependency exists when a non-key attribute depends on only a part of a composite key; second normal form requires such dependencies to be removed. The Project Details and Employee Details tables do not exhibit any partial dependencies: Project_Name is dependent only on Project_Id, and Emp_Name, Grade, and Salary are dependent only on Emp_No. The tables also need to be related through foreign keys. A third table, named Employee Project Details, is created with only two columns, Project_Id and Emp_No.
So, the project and employee details tables on conversion to second normal form generates tables Project
Details, Employee Details, and Employee Project Details as shown in tables 2.10, 2.11, and 2.12.
Project_Id Project_Name
113 BLUE STAR
124 MAGNUM
Table 2.12: Employee Project Details After Conversion to Second Normal Form
The attributes, Emp_No and Project_Id, of the Employee Project Details table together form the primary key. Such primary keys are called composite primary keys.
The Project Details, Employee Details, and Employee Project Details tables are in second
normal form. If an attribute can be determined by another non-key attribute, it is called a transitive
dependency. To make it simpler, every non-key attribute should be determined by the key attribute only.
If a non-key attribute can be determined by another non-key attribute, it must be put into another table.
On observing the different tables, it is seen that the Project Details and Employee Project Details
tables do not exhibit any such transitive dependencies. The non-key attributes are totally determined by the
key attributes. Project_Name is only determined by Project_Id. On further scrutinizing the Employee
Details table, a certain inconsistency is seen. The attribute Salary is determined by the attribute Grade
and not the key attribute Emp_No. Thus, this transitive dependency must be removed.
The Employee Details table can be split into the Employee Details and Grade Salary Details
tables as shown in tables 2.13 and 2.14.
Thus, at the end of the three normalization stages, the initial Employee Project Details table has been
reduced to the Project Details, Employee Project Details, Employee Details, and Grade
Salary Details tables as shown in tables 2.15, 2.16, 2.17, and 2.18.
Project_Id Project_Name
113 BLUE STAR
124 MAGNUM
Table 2.16: Employee Project Details After Conversion to Third Normal Form
Grade Salary
A 20,000
B 15,000
C 10,000
Table 2.18: Grade Salary Details After Conversion to Third Normal Form
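Expressed in SQL, the normalized design could be sketched as follows (column types are assumed; note the composite primary key on Employee Project Details and the foreign key linking Employee Details to Grade Salary Details):

CREATE TABLE GradeSalaryDetails (
    Grade char(1) PRIMARY KEY,
    Salary decimal(10, 2)
);

CREATE TABLE EmployeeDetails (
    Emp_No int PRIMARY KEY,
    Emp_Name varchar(50),
    Grade char(1) REFERENCES GradeSalaryDetails(Grade)
);

CREATE TABLE ProjectDetails (
    Project_Id int PRIMARY KEY,
    Project_Name varchar(50)
);

CREATE TABLE EmployeeProjectDetails (
    Project_Id int REFERENCES ProjectDetails(Project_Id),
    Emp_No int REFERENCES EmployeeDetails(Emp_No),
    PRIMARY KEY (Project_Id, Emp_No)
);

Retrieving the complete employee-project picture now requires joining these tables.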
If such joins are used very often, the performance of the database will become very poor.
The CPU time required to solve such queries will be very large too. In such cases, storing a few fields
redundantly can be ignored to increase the performance of the database. The databases that possess such
minor redundancies in order to increase performance are called denormalized databases and the process
of doing so is called denormalization.
SELECT
The SELECT operator is used to extract data that satisfies a given condition. The lowercase Greek letter
sigma, 'σ', is used to denote selection. A select operation, on the Branch Reserve Details table, to display
the details of the branches in London would result in table 2.20.
Branch Branch_Id Reserve (Billion €)
London BS-01 9.2
London BS-02 10
Table 2.20: Details of Branches in London
PROJECT
The PROJECT operator is used to project certain details of a relational table. The PROJECT operator only displays the required details, leaving out certain columns. The PROJECT operator is denoted by the Greek letter pi, 'π'. Assume that only the Branch_Id and Reserve amounts need to be displayed.
A project operation to do the same, on the Branch Reserve Details table, would result in table 2.22.
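In SQL terms, selection corresponds to the WHERE clause and projection corresponds to the column list of a SELECT statement. A sketch over an assumed BranchReserveDetails table (with columns Branch, Branch_Id, and Reserve):

-- Selection (σ): branches located in London
SELECT * FROM BranchReserveDetails WHERE Branch = 'London';

-- Projection (π): only the Branch_Id and Reserve columns
SELECT Branch_Id, Reserve FROM BranchReserveDetails;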
PRODUCT
The PRODUCT operator, denoted by '×', helps combine information from two relational tables. Consider table 2.23.
Branch_Id Loan Amount ( Billion €)
BS-01 0.56
BS-02 0.84
Table 2.23: Branch Loan Details
The product operation on the Branch Reserve Details and Branch Loan Details tables would
result in table 2.24.
Branch Branch_Id Reserve (Billion €) Loan Amount ( Billion €)
London BS-01 9.2 0.56
London BS-01 9.2 0.84
London BS-02 10 0.56
London BS-02 10 0.84
Paris BS-03 15 0.56
Paris BS-03 15 0.84
Los Angeles BS-04 50 0.56
Los Angeles BS-04 50 0.84
The product operation combines each record from the first table with all the records in the second table, thereby generating all possible combinations between the table records.
UNION
Suppose an official of the bank with the data given in tables 2.19 and 2.23 wanted to know which branches
had reserves below 20 billion Euros or loans. The resultant table would consist of branches with either
reserves below 20 billion Euros or loans or both.
This is similar to the union of two sets of data; first, set of branches with reserve less than 20 billion Euros
and second, branches with loans. Branches with both, reserves below 20 billion Euros and loans would be
displayed only once. The UNION operator does just that, it collects the data from the different tables and
presents a unified version of the complete data. The union operation is represented by the symbol, 'U'. The
union of the Branch Reserve Details and Branch Loan Details tables would generate table 2.25.
Branch Branch_Id
London BS-01
London BS-02
Paris BS-03
Table 2.25: Unified Representation of Branches with Less Reserves or Loans
INTERSECT
Suppose the same official after seeing this data wanted to know which of these branches had both low
reserves and loans too. The answer would be the intersect relational operation. The INTERSECT operator
generates data that holds true in all the tables it is applied on. It is based on the intersection set theory and
is represented by the '∩' symbol. The result of the intersection of the Branch Reserve Details and Branch
Loan Details tables would be a list of branches that have both reserves below 20 billion Euros and loans in
their account. The resultant table generated is table 2.26.
Branch Branch_Id
London BS-01
London BS-02
Table 2.26: Branches with Low Reserves and Loans
DIFFERENCE
If the same official now wanted the list of branches that had low reserves but no loans, then the official
would have to use the difference operation. The DIFFERENCE operator, symbolized as '-', generates data
from different tables too, but it generates data that holds true in one table and not the other. Thus, the branch
would have to have low reserves and no loans to be displayed.
Table 2.27 is the result generated.
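In SQL, these three set operations map onto the UNION, INTERSECT, and EXCEPT operators. A sketch assuming BranchReserveDetails and BranchLoanDetails tables that both expose a Branch_Id column, with Reserve measured in billions of Euros:

-- Branches with low reserves or loans (UNION)
SELECT Branch_Id FROM BranchReserveDetails WHERE Reserve < 20
UNION
SELECT Branch_Id FROM BranchLoanDetails;

-- Branches with both low reserves and loans (INTERSECT)
SELECT Branch_Id FROM BranchReserveDetails WHERE Reserve < 20
INTERSECT
SELECT Branch_Id FROM BranchLoanDetails;

-- Branches with low reserves but no loans (DIFFERENCE / EXCEPT)
SELECT Branch_Id FROM BranchReserveDetails WHERE Reserve < 20
EXCEPT
SELECT Branch_Id FROM BranchLoanDetails;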
JOIN
The JOIN operation is an enhancement to the product operation. It allows a selection to be performed on
the product of tables. For example, if the reserve values and loan amounts of branches with low reserves
and loan values was needed, the product of the Branch Reserve Details and Branch Loan Details
would be required. Once the product of tables 2.19 and 2.23 would be generated, only those branches would
be listed which have both reserves below 20 billion Euros and loans.
DIVIDE
Suppose an official wanted to see the branch names and reserves of all the branches that had loans. This
process can be made very easy by using the DIVIDE operator. All that the official must do is divide the
Branch Reserve Details table (shown earlier in table 2.19) by the list of branches, that is, the Branch Id
column of the Branch Loan Details table (shown earlier in table 2.23). Table 2.29 is the result generated.
Note that the attributes of the divisor table should always be a subset of the dividend table. The resultant
table would always be void of the attributes of the divisor table and the records not matching the records
in the divisor table.
1. One or more attributes that can uniquely define an entity from an entity set is called a ________ key.
(A) Primary (C) Alternate
Consider yourself as a student of second year Computer Science who has participated in the campus
interview. You have been provided with a set of interview questions as follows:
2. Champs Online is an online learning company that provides computer education to kids. The
company has basic courses in subjects of Computer Science such as fundamentals of computers,
databases, programming languages, and so on. The company has organized their data in a manner
similar to this:
Recent versions of SQL Server have brought revolutionary changes in areas such as speedy transactions, higher security, more profound insights, and the latest hybrid cloud. They also enhance mission-essential capabilities such as in-memory operations.
Microsoft launched the latest version of this product, SQL Server 2019, in November 2019. SQL Server 2019 provides industry-leading security, performance, and intelligence over your data, regardless of whether it is structured or unstructured. SQL Server 2019 also provides support for Big Data Clusters.
Existing SQL Server components such as the Database Engine, SQL Server Analysis Services, SQL Server Machine Learning Services, SQL Server on Linux, and SQL Server Master Data Services have been enhanced in SQL Server 2019. Using SQL Server 2019 not only helps an organization to store and manage huge amounts of information, but also to protect and utilize this data at different locations as required.
• Policy-Based Management to detect security policies that are non-compliant. This feature allows only authorized personnel access to the database. Security audits and events can be written automatically to log files.
• SQL Server has a built-in transparent data compression feature along with encryption. SQL Server provides access control coupled with efficient permission management tools. It also offers enhanced performance when it comes to data collection.
• Microsoft has made available different editions of SQL Server for different kinds of users. These are also priced accordingly. Thus, from hobbyists to professional developers to enterprise users, there is an edition suitable for each one.
• SQL Server is simple to install with a one-click installation procedure and a readable GUI having easy instructions for the layman.
Tools
There are a number of tools that are provided in SQL Server 2019 for development and query management of
a database. The SQL Server Installation Center must be used to install SQL Server program features and
tools. Features can also be modified or removed using SQL Server Installation Center. Table 3.1 lists
different tools available for SQL Server 2019.
There are various services that are executed on a computer running SQL Server. These services run along
with the other Windows services and can be viewed in the task manager.
SQL Server Database Engine: The Database Engine is a core service that is used for storing, processing, and securing data. It is also used for replication, full-text search, and Data Quality Services (DQS). It contains tools for managing relational and eXtensible Markup Language (XML) data.
SQL Server Analysis Services (SSAS): Analysis Services contain tools that help to create and manage Online Analytical Processing (OLAP). This is used for personal, team, and corporate business intelligence purposes. Analysis Services are also used in data mining applications.
SQL Server Reporting Services (SSRS): Reporting Services help to create, manage, publish, and deploy reports. These reports can be in tabular, matrix, graphical, or free-form format. Report applications can also be created using Reporting Services.
SQL Server Master Data Services: Master Data Services (MDS) is used for master data management. MDS is used for analyzing, managing, and reporting information such as hierarchies, granular security, transactions, business rules, and so on.
Instances
All the programs and resource allocations are saved in an instance. An instance can include memory,
configuration files, and CPU. Multiple instances can be used for different users in SQL Server 2019. Even
though many instances may be present on a single computer, they do not affect the working of other
instances. This means that all instances work in isolation. Each instance can be customized as per the
requirement. Even permissions for each instance can be granted on an individual basis. The resources can also be allocated to the instance accordingly, for example, the number of databases allowed.
In other words, an instance can be thought of as a bigger container that holds sub-containers in the form of databases, security options, server objects, and so on.
Developer Edition
Free to use and includes all features of Enterprise edition, licensed for use as a
development and test database in a non-production environment.
1. Locate the Microsoft SQL Server Management Studio tool on the list of programs on Start menu
and start the tool.
2. In the Connect to Server dialog box, select the Server type as Database Engine.
3. In the Server name box, type or select the appropriate server name.
4. Select either Windows Authentication or SQL Server Authentication, provide the required Login
and Password, and click Connect.
Note - The two authentication methods provided by SQL Server are SQL Server Authentication and
Windows Authentication. SQL Server Authentication requires a user account for login and password.
Hence, multiple user accounts can access the information using their respective usernames and
passwords. With Windows Authentication, the operating system credentials can be used to log in to the
SQL Server database. This will work only on a single machine and cannot be used in any other computer.
An SQL Server database is made up of a collection of tables that store sets of specific structured data. A table includes a set of rows (also called records or tuples) and columns (also called attributes).
Each column in the table is intended to store a specific type of information, for example, dates, names,
currency amounts, and numbers.
SQL Server 2019 works with three kinds of databases: system databases, user-defined databases, and sample databases such as AdventureWorks2019.
Table 3.2 shows the system databases that are supported by SQL Server 2019.
master: The database records all system-level information of an instance of SQL Server.
msdb: The database is used by SQL Server Agent for scheduling database alerts and various jobs.
model: The database is used as a template for all databases to be created on the particular instance of SQL Server 2019.
resource: The database is a read-only database. It contains system objects included with SQL Server 2019.
tempdb: The database holds temporary objects or intermediate result sets.
Table 3.2: System Databases
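These system databases can be listed directly from the server's catalog; for example:

SELECT name, database_id
FROM sys.databases;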
3.7.2 User-defined Databases
Using SQL Server 2019, users can create their own databases, also called user-defined databases, and
work with them. The purpose of these databases is to store user data.
The AdventureWorks2019 database schema covers many functional areas for a fictitious bicycle
manufacturer. These areas include:
The database comprises several features. Some of its key features are as follows:
A database engine that includes administration facilities, data access capabilities, Full-Text Search
facility, Common Language Runtime (CLR) integration advantage, and more
A set of integrated samples built around two feature-based scenarios: HRResume and Storefront
Analysis Services and Integration Services
Notification Services
Replication Facilities
Reporting Services
The structure includes databases, security, server objects, replications, and may also show features such as
AlwaysOn High Availability, Management, Integration Services Catalogs, and so on. Object Explorer can
be accessed through SSMS by connecting to the database server.
Security: Used to provide flexible and trustworthy security configuration in SQL Server 2019. This includes logins, roles, credentials, audits, and so on.
AlwaysOn High Availability: Used for high availability and disaster recovery. It is generally used for applications that require high uptime and failure protection.
Integration Services Catalogs: Stores all the objects of the project after the project has been deployed.
A solution is a file in which all the projects in SQL Server are saved. This acts as a top-most node in the
hierarchy. The solution file is stored as a text file with .ssmssln extension. A project comes under a
solution node. There can be more than one project in SQL Server. All the data related to database
connection metadata and other miscellaneous files are stored under a project. It is stored as a text file
with .ssmssqlproj extension. Script files are the core files in which the queries are developed and
executed. The scripts have a .sql extension.
Code Snippet 1 depicts a sample script. This script can be saved as InsertData.sql.
Code Snippet 1:
USE [AdventureWorks2019]
GO
INSERT INTO [Person].[Person]
([BusinessEntityID]
,[PersonType]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[EmailPromotion]
,[ModifiedDate])
VALUES(21907
,'EM'
,0
,'Mr.'
,'John'
,'Gareth'
,'Hopkins'
,0
,'2020-10-10')
GO
2. Which of these components of Object Explorer is used to monitor activity in computers running an
instance of SQL Server?
(A) SQL Server Deterministic Tools (C) SQL Server Diagnostic Tools
(B) SQL Server Data Tools (D) SQL Server Database Tracking
4. Which of the following statements about the tools in SQL Server 2019 are true?
a. The SQL Server Installation Center tool can be used to add, remove, and modify SQL
Server programs.
b. SQLCMD is an IDE used for Business Intelligence Components. It helps to design the
database using Visual Studio.
c. SQL Server Profiler is used to monitor an instance of the Database Engine or Analysis
Services.
d. SQL Server Installation Center is an application provided with SQL Server 2019 that
helps to develop databases, query data, and manage the overall working of SQL Server.
Statement 1: Script files are the core files in which the queries are developed and executed.
Statement 2: Script files are files that contain a set of SQL commands.
➢ Explain Transact-SQL
➢ List different categories of Transact-SQL statements
➢ Explain various data types supported by Transact-SQL
➢ Explain Transact-SQL language elements
➢ Explain sets and predicate logic
➢ Describe logical order of operators in the SELECT statement
4.1 Introduction
SQL is the universal language used in the database world. Most modern RDBMS products use some type
of SQL dialect as their primary query language. SQL can be used to create or destroy objects such as tables
on the database server and to manipulate those objects, such as adding data into them or retrieving data
from them.
Transact-SQL is Microsoft's implementation of the standard SQL. Usually referred to as T-SQL, this
language implements a standardized way to communicate to the database. The Transact-SQL language is
an enhancement to SQL, the American National Standards Institute (ANSI) standard relational database
language. It provides a comprehensive language that supports defining tables, inserting, deleting, updating,
and accessing the data in the table.
4.2 Transact-SQL
Transact-SQL is a powerful language offering features such as data types, temporary objects, and extended
stored procedures. Scrollable cursors, conditional processing, transaction control, and exception and
error-handling are also some of the features which are supported by Transact-SQL.
The Transact-SQL language in SQL Server 2019 provides improved performance, increased functionality,
and enhanced features. Enhancements include scalar functions, paging, sequences, meta-data discovery,
and better error handling support.
Note: All the queries in this session will make use of the AdventureWorks2019 sample database.
Code Snippet 1:
USE AdventureWorks2019
SELECT LoginID FROM HumanResources.Employee
WHERE JobTitle = 'Design Engineer'
Figure 4.1 shows the result that retrieves all records of employees with 'Design Engineer' as the JobTitle
from HumanResources.Employee table.
Transact-SQL includes many syntax elements that are used by or that influence most statements. These
elements include data types, predicates, functions, variables, expressions, control-of-flow, comments, and
batch separators.
Most DDL statements take the following form, where object_name can be a table, view, trigger, stored
procedure, and so on:
➢ CREATE object_name
➢ ALTER object_name
➢ DROP object_name
➢ GRANT statement
➢ REVOKE statement
➢ DENY statement
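The GRANT, REVOKE, and DENY statements listed above belong to the Data Control Language and control access to data. A minimal sketch (the principal name SalesUser is hypothetical):

GRANT SELECT ON HumanResources.Employee TO SalesUser;
DENY DELETE ON HumanResources.Employee TO SalesUser;
REVOKE SELECT ON HumanResources.Employee FROM SalesUser;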
Various items in SQL Server 2019 such as columns, variables, and expressions are assigned data types.
SQL Server 2019 supports three kinds of data types:
➢ System data types
These data types are provided by SQL Server 2019. Table 4.1 shows the commonly used system-defined
data types of SQL Server.
➢ Alias data types
These are based on the system-supplied data types. Alias data types are used when more than one table
stores the same type of data in a column and has similar characteristics such as length, nullability, and
type. In such cases, an alias data type can be created that can be used commonly by all these tables.
Alias data types can be created using the CREATE TYPE statement. The syntax for the CREATE TYPE
statement is as follows:
Syntax:
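(The syntax line below is reconstructed from the parameter descriptions that follow.)

CREATE TYPE [ schema_name. ] type_name
FROM base_type [ ( precision [ , scale ] ) ]
[ NULL | NOT NULL ]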
where,
schema_name: identifies name of the schema in which the alias data type is being created. A schema
is a collection of objects such as tables, views, and so forth in a database.
type_name: identifies name of the alias type being created.
base_type: identifies name of the system-defined data type based on which the alias data type is
being created.
precision and scale: specify precision and scale for numeric data.
NULL | NOT NULL: specifies whether the data type can hold a null value or not.
Code Snippet 2 shows how to create an alias data type using CREATE TYPE statement.
Code Snippet 2:
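-- A minimal sketch consistent with the description that follows;
-- the varchar length and the NOT NULL option shown here are assumed.
CREATE TYPE usertype FROM varchar(20) NOT NULL;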
In the code, the built-in data type varchar is stored as a new data type named usertype by using the
CREATE TYPE statement.
➢ User-defined types
These are created using programming languages supported by the .NET Framework.
Operators are used to perform arithmetic, comparison, concatenation, or assignment of values. For
example, data can be tested to verify that the COUNTRY column for the customer data is populated (or has
a NOT NULL value). In queries, anyone who can view the data in a table can perform operations on it using operators; however, appropriate permissions are required before data can be successfully changed. SQL Server has seven categories of operators. Table 4.3 describes the different operators supported in SQL Server 2019.
Operator Description Example
Comparison Compares a value against another value or an =, <, >, >=, <=, !=,
expression !>
Logical Tests for the truth of a condition AND, OR, NOT
Arithmetic Performs arithmetic operations such as addition, +, -, *, /, %
subtraction, multiplication, and division
Code Snippet 3:
DECLARE @Number int;
SET @Number = 2 + 2 * (4 + (5 - 3))
SELECT @Number
1. 2 + 2 * (4 + (5 - 3))
2. 2 + 2 * (4 + 2)
3. 2 + 2 * 6
4. 2 + 12
5. 14
4.5.2 Functions
A function is a set of Transact-SQL statements that is used to perform some task. Transact-SQL includes a
large number of functions. These functions can be useful when data is calculated or manipulated. In SQL,
functions work on the data, or group of data, to return a required value. They can be used in a SELECT list
or anywhere in an expression.
Ranking functions
Many tasks, such as creating arrays, generating sequential numbers, finding ranks, and so on, can be implemented in an easier and faster way by using ranking functions. For example, RANK, DENSE_RANK, NTILE, and ROW_NUMBER are ranking functions.
Scalar functions
In scalar functions, the input is a single value and the output received is also a single value.
There are also other scalar functions such as cursor functions, logical functions, metadata functions,
security functions, and so on that are available in SQL Server 2019.
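As a brief illustration of a ranking function, the following query (against the AdventureWorks2019 sample database used throughout this book) numbers products by descending list price:

SELECT ProductID, Name, ListPrice,
       ROW_NUMBER() OVER (ORDER BY ListPrice DESC) AS PriceRank
FROM Production.Product;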
In Transact-SQL, local variables are created and used for temporary storage while SQL statements are
executed. Data can be passed to SQL statements using local variables. The name of a local variable must
be prefixed with '@' sign.
For example,
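A minimal illustration (the variable name and value are arbitrary):

DECLARE @EmpCount int;
SET @EmpCount = 10;
SELECT @EmpCount AS EmployeeCount;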
In earlier versions of SQL Server, a concept called global variables existed which referred to in-built variables
that are defined and maintained by the system. In SQL Server 2019, they are categorized as functions. They
are prefixed with two '@' signs. The return value of these functions can be retrieved with a simple SELECT
query.
For example,
SELECT @@LANGUAGE as 'Language'
This returns the language currently used in SQL Server, such as US English.
4.5.4 Expressions
An expression is a combination of identifiers, values, and operators that SQL Server can evaluate in order
to obtain a result. Expressions can be used in several different places when accessing or changing data.
Code Snippet 4 shows an expression that operates on a column to add an integer to the results of the
YEAR function on a datetime column.
Code Snippet 4:
USE AdventureWorks2019
SELECT SalesOrderID, CustomerID, SalesPersonID, TerritoryID, YEAR(OrderDate) AS
CurrentYear, YEAR(OrderDate) + 1 AS NextYear
FROM Sales.SalesOrderHeader
Table 4.6 shows some of the commonly used control-of-flow statements in Transact-SQL.
Control-of-Flow Description
Statement
IF. . .ELSE Provides branching control based on a logical test.
WHILE Repeats a statement or a block of statements as long as the condition is true.
BEGIN. . .END Defines the scope of a block of Transact-SQL statements.
TRY. . . CATCH Defines the structure for exception and error handling.
BEGIN TRANSACTION Marks a block of statements as part of an explicit transaction.
Table 4.6: Control-of-Flow Statements
Code Snippet 5:
Here, the weekday is retrieved from the current date and checked to see whether it is either a Saturday or a
Sunday; a suitable message is then displayed accordingly.
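One possible way of writing such a check uses IF...ELSE with the DATENAME function; the exact messages shown here are illustrative:
DECLARE @DayName varchar(20);
SET @DayName = DATENAME(weekday, GETDATE());
IF (@DayName = 'Saturday' OR @DayName = 'Sunday')
    PRINT 'It is the weekend.'
ELSE
    PRINT 'It is a weekday.'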
4.5.6 Comments
Comments are descriptive text strings, also known as remarks, in program code that are ignored by the
compiler. Comments can be included inside the source code of a single statement, a batch, or a stored
procedure. SQL Server supports two styles of comments:
Single-line comments: A complete line of code or part of a line can be marked as a comment by placing
two hyphens (--) at the beginning. The remainder of the line becomes a comment.
Multiple-line comments: The forward slash-asterisk character pairs (/* and */) can be used on the same
line as code to be executed, on lines by themselves, or even within executable code. Everything between
/* and */ is considered part of the comment. For a multiple-line comment, the open-comment character
pair (/*) must begin the comment, and the close-comment character pair (*/) must end the comment.
Code Snippet 6:
USE AdventureWorks2019
-- HumanResources.Employee table contains the details of an employee.
-- This statement retrieves all the rows of the table
-- HumanResources.Employee.
SELECT * FROM HumanResources.Employee
Code Snippet 7 displays the use of /* … */ (forward slash-asterisk character pairs) style of comment.
Code Snippet 7:
USE AdventureWorks2019
/* HumanResources.Employee table contains the details of an employee.
This statement retrieves all the rows of the table HumanResources.Employee. */
SELECT * FROM HumanResources.Employee
A batch separator is used by SQL Server client tools such as SSMS to mark the end of a batch of commands
to be executed. For example, in SSMS you must specify GO as the batch separator.
In Code Snippet 8, the two statements will be grouped into one execution plan, but executed one statement
at a time. The GO keyword signals the end of a batch.
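Code Snippet 8 could look similar to the following, where two illustrative SELECT statements (assuming the AdventureWorks2019 database is in use) form a single batch terminated by GO:
SELECT FirstName FROM Person.Person
SELECT City FROM Person.Address
GO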
Table 4.7 shows different applications in the set theory and their corresponding application in SQL Server
queries.
Set Theory Applications Application in SQL Server Queries
Act on the whole set at once. Query the whole table at once.
Use declarative, set-based processing. Use attributes in SQL Server to retrieve specific data.
Elements in the set must be unique. Define unique keys in the table.
No sorting instructions. The results of querying are not retrieved in any order.
Table 4.7: Set Theory Applications
One of the common set operators is the INTERSECT operator. It returns distinct rows that are produced
by both the left and the right input queries.
Code Snippet 9:
USE AdventureWorks2019
GO
SELECT ProductID
FROM Production.Product
INTERSECT
SELECT ProductID
FROM Production.WorkOrder ;
The outcome will show 238 rows of products that have work orders.
Defining subqueries
Code Snippet 10:
USE AdventureWorks2019
SELECT SalesPersonID, YEAR(OrderDate) AS OrderYear FROM
Sales.SalesOrderHeader
WHERE CustomerID = 30084
GROUP BY SalesPersonID, YEAR(OrderDate)
HAVING COUNT(*) > 1
ORDER BY SalesPersonID, OrderYear;
In the example, the order in which SQL Server will execute the SELECT statement is as follows:
1. First, the FROM clause is evaluated to define the source table that will be queried.
2. Next, the WHERE clause is evaluated to filter the rows in the source table. This filtering is defined by the
predicate mentioned in the WHERE clause.
3. After this, the GROUP BY clause is evaluated. This clause arranges the filtered values received from
the WHERE clause.
4. Next, the HAVING clause is evaluated based on the predicate that is provided.
5. Next, the SELECT clause is executed to determine the columns that will appear in the query
results.
6. Finally, the ORDER BY statement is evaluated to display the output.
The order of execution for the SELECT statement in Code Snippet 10 would be as follows:
5. SELECT SalesPersonID, YEAR(OrderDate) AS OrderYear
1. FROM SalesOrderHeader
2. WHERE CustomerID = 30084
3. GROUP BY SalesPersonID, YEAR(OrderDate)
4. HAVING COUNT(*) > 1
6. ORDER BY SalesPersonID, OrderYear;
1. Which of the following is used to define and manage all attributes and properties of a database,
including row layouts, column definitions, key columns, file locations, and storage strategy?
(A) DDL (C) DCL
Then, using the AdventureWorks2019 database, create and execute the following queries:
2. You are given the following code that adds five days to a given date:
SET NOCOUNT ON
DECLARE @startdate DATETIME, @adddays INT;
SET @startdate = 'January 10, 1900 12:00 AM';
SET @adddays = 5;
SET NOCOUNT OFF;
SELECT @startdate + 1.25 AS 'Start Date',
@startdate + @adddays AS 'Add Date';
Create Transact-SQL code that will subtract six days from a given date.
a) Display all the records from Person.Address table having city as Montreal.
b) Display all the records from HumanResources.Department table that have a value in
DepartmentID that is greater than or equal to the value 13.
c) Display all the records from Production.ProductCategory table that do not have a value
in ProductCategoryID equal to the value 3 or the value 2.
4. Display all the records from Person.Person for all persons whose last name begins with B.
SSMS Administration utilities: From SQL Server 2005 onwards, several SQL Server administrative
utilities are integrated into SSMS. It is the core administrative console for SQL Server installations. It
enables administrators to perform high-level administrative functions, schedule routine maintenance tasks, and so forth.
SQL Server Management Objects (SQL-SMO) API: Includes complete functionality for administering
SQL Server in applications.
Transact-SQL scripts and stored procedures: These use system stored procedures and Transact- SQL
DDL statements. Figure 5.1 shows a Transact-SQL query window.
These tools also guard applications from making changes in the system objects.
System Catalog Views: Views displaying metadata for describing database objects in an SQL Server
instance.
SQL-SMO: New managed code object model, providing a set of objects used for managing Microsoft
SQL Server.
Catalog Functions, Methods, Attributes, or Properties of Data API: Used in ActiveX Data Objects (ADO),
OLE DB, or ODBC applications.
Stored Procedures and Functions: Used in Transact-SQL as stored procedures and built-in functions.
Syntax:
where,
DATABASE_NAME: is the name of the database to be created.
ON: indicates the disk files to be used to store the data sections of the database and data files.
PRIMARY: is the associated <filespec> list defining the primary file.
<filespec>: controls the file properties.
<filegroup>: controls filegroup properties.
LOG ON: indicates disk files to be used for storing the database log and log files.
COLLATE collation_name: is the default collation for the database. A collation defines rules for
comparing and sorting character data based on the standard of particular language and locale.
Collation name can be either a Windows collation name or a SQL collation name.
Code Snippet 1 shows how to create a database with database file and transaction log file with collation
name.
Code Snippet 1:
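A possible form of this snippet is shown below; the file paths, sizes, and the collation chosen are illustrative:
CREATE DATABASE Customer_DB
ON PRIMARY
(NAME = Customer_DB_data,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\Customer_DB_data.mdf',
SIZE = 8MB)
LOG ON
(NAME = Customer_DB_log,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\Customer_DB_log.ldf',
SIZE = 8MB)
COLLATE SQL_Latin1_General_CP1_CI_AS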
SQL Server databases use two files - an .mdf file, known as the primary database file, containing the
schema and data and a .ldf file, which contains the logs. A database may also use secondary database
file, which normally uses an .ndf extension.
MDF stands for Master Database File. It contains the main information of a database that is part of the
server and also points to the various other files of the database. It plays a crucial role in information storage.
LDF stands for Log Database File. This file stores information related to the transaction logs for the main data
file. It basically keeps track of the changes that have been made in the database.
Besides these, there are .ndf files representing secondary data files. Secondary data files make up all the data
files, other than the primary data file. Some databases may not have any secondary data files, while others
have several secondary data files. The recommended file name extension for secondary data files is .ndf.
Note: SQL Server does not enforce the .mdf, .ndf, and .ldf file name extensions, but these extensions
are recommended to help identify the use of the file.
Figure 5.2 shows the query and the database Customer_DB listed in the Object Explorer.
Syntax:
Code Snippet 2 shows how to rename a database Customer_DB with a new database name, CUST_DB.
Code Snippet 2:
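One way to perform the rename, using the ALTER DATABASE statement, is:
ALTER DATABASE Customer_DB
MODIFY NAME = CUST_DB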
Figure 5.3 shows database Customer_DB is renamed with a new database name, CUST_DB.
where,
login is an existing database username.
After sp_changedbowner is executed, the new owner is known as the dbo user inside the selected
database. The dbo receives permissions to perform all activities in the database. The owner of the master,
model, or tempdb system databases cannot be changed.
Code Snippet 3, when executed, makes the login 'sa' the owner of the current database and maps 'sa' to
existing aliases that are assigned to the old database owner, and will display 'Command(s) completed
successfully'.
Code Snippet 3:
USE CUST_DB
EXEC sp_changedbowner 'sa'
Table 5.1 shows the database options that are supported by SQL Server 2019.
Option Type Description
Automatic options Controls automatic behavior of database.
Cursor options Controls cursor behavior.
Recovery options Controls recovery models of database.
Miscellaneous options Controls ANSI compliance.
State options Controls state of database, such as online/offline and user connectivity.
Code Snippet 4, when executed, sets the AUTO_SHRINK option for the CUST_DB database to ON. When
the AUTO_SHRINK option is set to ON, database files that have free space are periodically shrunk.
USE CUST_DB;
ALTER DATABASE CUST_DB SET AUTO_SHRINK ON
5.3.5 Filegroups
In SQL Server, data files are used to store a database. For the sake of performance and manageability, these
data files can be grouped into filegroups. Each filegroup is used to group related files that together store a database object.
Every database has a primary filegroup by default. This filegroup contains the primary data file. The
primary file group and data files are created automatically with default property values at the time of
creation of the database. User-defined filegroups can then be created to group data files together for
administrative, data allocation, and placement purposes.
Table 5.2 shows filegroups that are supported by SQL Server 2019.
Filegroup      Description
Primary        The filegroup that consists of the primary file. All system tables are placed inside the
               primary filegroup.
User-defined   Any filegroup that is created by the user at the time of creating or modifying databases.
Filegroups can be created when the database is created for the first time or can be created later when more
files are added to the database. However, files cannot be moved to a different filegroup after the files have
been added to the database.
A file cannot be a member of more than one filegroup at the same time. A maximum of 32,767 filegroups
can be created for each database. Filegroups can contain only data files. Transaction log files cannot belong
to a filegroup.
Syntax:
CREATE DATABASE database_name
[ ON
[ PRIMARY ] [ <filespec> [ ,...n ]
[ , <filegroup> [ ,...n ] ]
[ LOG ON { <filespec> [ ,...n ] } ]
]
[ COLLATE collation_name ]
]
[;]
where,
database_name: is the name of the new database.
ON: indicates the disk files to store the data sections of the database, and data files.
PRIMARY and associated <filespec> list: define the primary file. The first file specified in the
<filespec> entry in the primary filegroup becomes the primary file.
LOG ON: indicates the disk files used to store the database log files. COLLATE
collation_name: is the default collation for the database.
Code Snippet 5 shows how to add a filegroup (PRIMARY as default) while creating a database, called
SalesDB.
Code Snippet 5:
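A possible form of this snippet is shown below; the logical file names, paths, and the user-defined filegroup name SalesGroup1 are illustrative:
CREATE DATABASE SalesDB
ON PRIMARY
(NAME = SalesDB_data,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\SalesDB_data.mdf'),
FILEGROUP SalesGroup1
(NAME = SalesDB_FG1,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\SalesDB_FG1.ndf')
LOG ON
(NAME = SalesDB_log,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\SalesDB_log.ldf')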
Figure 5.4 shows the file groups when creating SalesDB database.
Syntax:
ALTER DATABASE database_name
{ <add_or_modify_files>
| <add_or_modify_filegroups>
| <set_database_options>
| MODIFY NAME = new_database_name
| COLLATE collation_name
}
[;]
Code Snippet 6 shows how to add a filegroup to an existing database, called CUST_DB.
Code Snippet 6:
USE CUST_DB
ALTER DATABASE CUST_DB
ADD FILEGROUP FG_ReadOnly
After executing the code, SQL Server 2019 displays the message 'Command(s) completed successfully' and
the filegroup FG_ReadOnly is added to the existing database, CUST_DB.
Default Filegroup
Objects are assigned to the default filegroup when they are created in the database. The PRIMARY filegroup
is the default filegroup. The default filegroup can be changed using the ALTER DATABASE statement.
System objects and tables remain within the PRIMARY filegroup, but do not go into the new default
filegroup.
To make the FG_ReadOnly filegroup as default, it should contain at least one file inside it.
Code Snippet 7:
USE CUST_DB
ALTER DATABASE CUST_DB
ADD FILE (NAME = Cust_DB1,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\Cust_DB1.ndf')
TO FILEGROUP FG_ReadOnly
ALTER DATABASE CUST_DB
MODIFY FILEGROUP FG_ReadOnly DEFAULT
After executing the code in Code Snippet 7, SQL Server 2019 displays the message saying the filegroup
property 'DEFAULT' has been set.
Recovery of all incomplete transactions when SQL Server is started: If a server that is running SQL Server
fails, the databases may be left in an inconsistent state. When an instance of SQL Server is started, it runs a
recovery of each database.
Supporting transactional replication: The Log Reader Agent monitors the transaction log of each database
configured for replication of transactions.
Supporting standby server solutions: The standby-server solutions, database mirroring, and log shipping
depend on the transaction log.
SQL Server uses the transaction log of each database to recover transactions. The transaction log is a serial
record of all modifications that have occurred in the database as well as the transactions that performed the
modifications. This log keeps enough information to undo the modifications made during each transaction.
The transaction log records the allocation and deallocation of pages and the commit or rollback of each
transaction. This enables SQL Server either to roll a transaction forward or to back it out.
1. In Object Explorer, connect to an instance of the SQL Server Database Engine and then, expand that
instance.
2. Right-click Databases, and then, click New Database as shown in Figure 5.6.
3. In New Database dialog box, enter a database name in the Database name box.
4. To create the database by accepting all default values, click OK, as shown in Figure 5.7; otherwise,
continue with the following optional steps.
6. To change the default values of the primary data and transaction log files, in the Database files grid,
click the appropriate cell and enter the new value.
7. To change the collation of the database, select the Options page, and then, select a collation from the
list as shown in Figure 5.8.
8. To change the recovery model, select the Options page and then, select a recovery model from the list
as shown in Figure 5.9.
10. To add a new filegroup, click the Filegroups page. Click Add and then, enter the values for the
filegroup as shown in Figure 5.10.
11. To add an extended property to the database, select the Extended Properties page.
In the Name column, enter a name for the extended property.
In the Value column, enter the extended property text. For example, enter one or more
statements that describe the database.
The database will be successfully created and will be visible in Object Explorer after the Databases node
is refreshed.
3. Confirm that the correct database is selected, and then, click OK.
Apart from using SSMS to drop a database, one can also use a Transact-SQL command.
Advantages
• Provide a convenient, read-only copy of data.
• When queried, no deterioration of performance.
• Snapshot files are small and are very quick to create.
Disadvantages
• Snapshot backup cannot be created.
• Snapshot must exist on the same database server as that of the source database.
• A new user cannot be granted access to the data in a snapshot.
where,
database_snapshot_name: is the name of the new database snapshot.
ON ( NAME = logical_file_name, FILENAME = 'os_file_name' ) [ ,... n ]: is the list
of files in the source database. For the snapshot to work, all the data files must be specified
individually.
AS SNAPSHOT OF source_database_name: is a database snapshot of the source database
specified by source_database_name.
Code Snippet 9:
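A sketch of such a statement, assuming for illustration that the source database CUST_DB has two data files with the logical names Customer_DB_data and Cust_DB1, is:
CREATE DATABASE CUST_DB_Snapshot
ON
(NAME = Customer_DB_data,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\Customer_DB_data.ss'),
(NAME = Cust_DB1,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MYEXPRESS\MSSQL\DATA\Cust_DB1.ss')
AS SNAPSHOT OF CUST_DB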
3. Which of these terms correspond to the description, ‘Defines rules for comparing and sorting
character data based on the standard of particular language and locale.’
(A) a, b, c (C) b, c
➢ The SQL Server data files are used to store database files, which are further subdivided into
filegroups for the sake of performance.
➢ The CREATE DATABASE statement with various options can be used to create databases.
➢ ALTER DATABASE and DROP DATABASE are used to modify and delete a database
respectively.
➢ A transaction log in SQL Server records all transactions and the database modifications made by
each transaction.
➢ Objects are assigned to the default filegroup when they are created in the database. The PRIMARY
filegroup is the default filegroup.
• Primary filegroup with files, UnitedAir1_dat and UnitedAir2_dat. The size, maximum
size, and file growth should be 5, 10, and 15% respectively. Note that _dat is given as a suffix
to the file name to indicate that it is a data file.
2. Repeat a similar process for a database named BelAir; however, use SSMS to do this instead of
Transact-SQL.
This session explores various data types provided by SQL Server 2019 and describes how to
use them. The techniques for creation, modification, and removal of tables and columns are
also discussed.
6.1 Introduction
One of the most important types of database objects in SQL Server 2019 is a table. Tables in SQL Server
2019 contain data in the form of rows and columns. Each column may have data of a specific type and
size.
Name         Description
hierarchyid  It is a system data type with variable length. You can use it to represent a position in a
             hierarchy.
geometry     It is a spatial data type, implemented as a Common Language Runtime (CLR) data type in
             SQL Server. It represents data in a Euclidean (flat) coordinate system. SQL Server supports
             a set of methods for this data type. The geometry type is predefined and available in each
             database. You can create table columns of type geometry and operate on geometry data
             similar to how you do on other CLR types.
geography    It is a spatial type for storing ellipsoidal (round-earth) data, such as GPS latitude and
             longitude coordinates. SQL Server supports a set of methods for the geography spatial data
             type. This type too is predefined and available in each database. You can create table
             columns of type geography.
xml          It is a special data type for storing XML data in SQL Server tables.
cursor       It is a data type for variables or stored procedure OUTPUT parameters that contain a
             reference to a cursor.
Note: AdventureWorks2019 database will be used for all the subsequent statements/commands.
Syntax:
where,
database_name: is the name of the database in which the table is created.
table_name: is the name of the new table. table_name can be a maximum of 128 characters.
column_name: is the name of a column in the table. column_name can be up to
128 characters. column_name is not specified for columns that are created with
a timestamp data type. The default column name of a timestamp column is timestamp.
data_type: It specifies data type of the column.
ON [filegroup | "default"]: When you create a database in SQL Server, you can have
multiple file groups, where storage is created in multiple places, directories or disks. Each file
group can be named. The PRIMARY file group is the default one, which is always created. Giving
ON PRIMARY creates the table on this file group.
Code Snippet 1:
USE AdventureWorks2019
CREATE TABLE [dbo].[CustomerInfo](
[CustomerID number] [numeric](10, 0) NOT NULL, [CustomerName] [varchar](50)
NOT NULL)
ON [PRIMARY]
Syntax:
where,
ALTER COLUMN: specifies that the particular column must be changed or modified.
ADD: specifies that one or more column definitions are to be added.
DROP COLUMN <column_name>: specifies that column_name is to be removed from the table.
Code Snippet 2:
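One possible form of this snippet, reusing the CustomerInfo table from Code Snippet 1 (the new length and the CustomerAddress column are illustrative), is:
USE AdventureWorks2019
ALTER TABLE [dbo].[CustomerInfo]
ALTER COLUMN [CustomerName] varchar(80) NOT NULL
ALTER TABLE [dbo].[CustomerInfo]
ADD [CustomerAddress] varchar(150) NULL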
Code Snippet 3:
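Dropping the illustrative CustomerAddress column added above could be written as:
USE AdventureWorks2019
ALTER TABLE [dbo].[CustomerInfo]
DROP COLUMN [CustomerAddress]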
Before attempting to drop columns, however, it is important to ensure that the columns can be dropped.
Under certain conditions, columns cannot be dropped: for example, if they are used in a CHECK, FOREIGN
KEY, UNIQUE, or PRIMARY KEY constraint, or are associated with a DEFAULT definition, and so forth. Constraints
will be explored in a later section.
Syntax:
where,
<Table_Name>: is the name of the table to be dropped.
Code Snippet 5:
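For example, assuming the CustomerInfo table created earlier is no longer required, it can be dropped as follows:
USE AdventureWorks2019
DROP TABLE [dbo].[CustomerInfo]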
The table will be removed from the database. Once the table is removed, all data in it will be gone too.
Syntax:
where,
<Table_Name>: is the name of the table in which row is to be inserted.
[INTO]: is an optional keyword used between INSERT and the target table.
<Values>: specifies the values for columns of the table.
Code Snippet 6:
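For example, assuming a table such as the CustomerInfo table defined in Code Snippet 1 exists, a row can be inserted as follows (the values are illustrative):
USE AdventureWorks2019
INSERT INTO [dbo].[CustomerInfo] ([CustomerID number], [CustomerName])
VALUES (1001, 'Global Traders')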
The outcome of this will be that one row with the given data is inserted into the table.
➢ UPDATE Statement - The UPDATE statement modifies the data in the table.
Syntax:
UPDATE <Table_Name>
SET <Column_Name = Value>
[WHERE <Search condition>]
where,
<Table_Name>: is the name of the table where records are to be updated.
<Column_Name>: is the name of the column in the table in which record is to be updated.
<Value>: specifies the new value for the modified column.
<Search condition>: specifies the condition to be met for the rows to be updated.
Code Snippet 7 demonstrates the use of the UPDATE statement to modify the value in column
PhoneNumber.
Code Snippet 7:
UPDATE [Person].[PersonPhone]
SET [PhoneNumber] = '731-511-0142'
WHERE BusinessEntityID = 299 AND ModifiedDate = '2020-10-12'
The specified value will be updated in the PhoneNumber column of Person.PersonPhone table.
➢ DELETE Statement - The DELETE statement removes rows from a table.
Syntax:
where,
<Table_Name>: is the name of the table from which the records are to be deleted.
The WHERE clause is used to specify the condition. If WHERE clause is not included in the
DELETE statement, all the records in the table will be deleted.
Code Snippet 8 demonstrates how to delete a row from the Person.PersonPhone table whose
PhoneNumber value is 731-511-0142 and BusinessEntityID is 229.
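Using the values mentioned above, the statement could be written as:
USE AdventureWorks2019
DELETE FROM [Person].[PersonPhone]
WHERE PhoneNumber = '731-511-0142' AND BusinessEntityID = 229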
Note: If WHERE clause is not specified with the DELETE statement for a table, all the rows of the
table will be deleted.
Nullability of a column can be defined either when creating a table or modifying a table. The NULL
keyword is used to indicate that null values are allowed in the column and NOT NULL is used to indicate
that null values are not allowed.
When inserting a row, if no value is given for a nullable column (that is, it allows null values), then, SQL
Server automatically gives it a null value unless the column has been given a default definition. It is also
possible to explicitly enter a null value into a column regardless of what data type it is or whether it has a
default associated with it. Making a column non-nullable (that is, not permitting null values) enforces data
integrity by ensuring that the column contains data in every row.
In Code Snippet 9, the CREATE TABLE statement uses the NULL and NOT NULL keywords with column
definitions.
Code Snippet 9:
CREATE TABLE StoreDetails (StoreID int NOT NULL, Name varchar(40) NULL)
The result of the command is that StoreDetails table is created with StoreID and Name columns,
and StoreID will not accept any null values whereas Name will allow null values.
Thus, the code in Code Snippet 10 will not succeed, because it attempts to insert a null value into the
StoreID column.
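A statement of the following form illustrates this; the Name value is arbitrary:
INSERT INTO StoreDetails (StoreID, Name) VALUES (NULL, 'Sports Store')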
Figure 6.2 depicts the resulting error when this command is executed.
In such situations, a DEFAULT definition can be given for the column to assign it as a default value if no
value is given at the time of creation. For example, it is common to specify zero as the default for numeric
columns or 'N/A' or 'Unknown' as the default for string columns when no value is specified.
A DEFAULT definition for a column can be created at the time of table creation or added at a later stage to
an existing table. When a DEFAULT definition is added to an existing column in a table, SQL Server
applies the new default value only to rows that are newly added to the table.
In Code Snippet 11, the CREATE TABLE statement uses the DEFAULT keyword to define a default value for Price.
CREATE TABLE StoreProduct(ProductID int NOT NULL, Name varchar(40) NOT NULL, Price
money NOT NULL DEFAULT (100))
When a row is inserted using a statement as shown in Code Snippet 12, value of Price will not be
blank; it will have a value of 100.00 even though a user has not entered any value for that column.
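Such a statement could look like the following, with illustrative values and the Price column deliberately omitted:
INSERT INTO StoreProduct (ProductID, Name) VALUES (111, 'Rivets')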
Figure 6.3 shows the output of Code Snippet 12, where, though values are added only to ProductID and
Name columns, Price column will still show a value of 100.00. This is because of DEFAULT definition.
A column having IDENTITY property must be defined using one of the following data
types: tinyint, smallint, int, bigint, decimal(p, 0), or numeric(p, 0).
A column having IDENTITY property need not have a seed and increment value
specified. If they are not specified, a default value of 1 will be used for both.
A table cannot have more than one column with IDENTITY property.
The identifier column in a table must not allow null values and must not contain a
DEFAULT definition or object.
Columns defined with IDENTITY property cannot have their values updated.
The values can be explicitly inserted into the identity column of a table only if
the IDENTITY_INSERT option is set ON. When IDENTITY_INSERT is ON, INSERT
statements must supply a value.
The advantage of identifier columns is that SQL Server can automatically provide key values, thus reducing
costs that would have been incurred for extra storage and improving performance. Using identifier
columns simplifies programming and keeps primary key values short.
Once the IDENTITY property has been set, retrieving the value of the identifier column can be done by
using the IDENTITYCOL keyword with the table name in a SELECT statement. To know if a table has an
IDENTITY column, the OBJECTPROPERTY() function can be used. To retrieve the name of the IDENTITY
column in a table, the COLUMNPROPERTY function is used.
where,
seed_value: is the seed value which starts generating identity values.
increment_value: is the increment value by which to increase each time.
Code Snippet 13 demonstrates the use of IDENTITY property. HRContactPhone is created as a table with
two columns. The Person_ID column is an identity column. The seed value is 500 and the increment
value is 1.
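A possible form of the snippet is shown below; the second column, MobileNumber, is an illustrative choice:
USE AdventureWorks2019
CREATE TABLE HRContactPhone
(
Person_ID int IDENTITY(500, 1) NOT NULL,
MobileNumber bigint NOT NULL
)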
While inserting rows into the table, if IDENTITY_INSERT is not turned ON, then, explicit values for the
IDENTITY column cannot be given. Instead, statements similar to Code Snippet 14 can be given.
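For example, rows can be added without supplying a value for the identity column (the phone numbers are illustrative and the column name follows the sketch above):
INSERT INTO HRContactPhone (MobileNumber) VALUES (9833276605)
INSERT INTO HRContactPhone (MobileNumber) VALUES (7738105993)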
Figure 6.4 shows the output where IDENTITY property is incrementing Person_ID column values.
Values for a globally unique column are not automatically generated. One has to create a DEFAULT
definition with a NEWID() function for a uniqueidentifier column to generate a globally unique
value. The NEWID() function creates a unique identifier number which is a 16-byte binary string.
The column can be referenced in a SELECT list by using the ROWGUIDCOL keyword.
To know whether a table has a ROWGUIDCOL column, the OBJECTPROPERTY function is used. The
COLUMNPROPERTY function is used to retrieve the name of the ROWGUIDCOL column. Code Snippet
15 demonstrates how to use CREATE TABLE statement to create the EMPCellularPhone table.
The Person_ID column automatically generates a GUID for each new row added to the table.
Code Snippet 15:
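A possible form of the snippet is shown below; the column names and sizes are illustrative:
USE AdventureWorks2019
CREATE TABLE EMPCellularPhone
(
Person_ID uniqueidentifier ROWGUIDCOL NOT NULL DEFAULT NEWID(),
PersonName varchar(60) NOT NULL
)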
Figure 6.4 shows the output where a unique identifier is displayed against a specific PersonName.
6.4 Constraints
One of the important functions of SQL Server is to maintain and enforce data integrity. There are a number
of means to achieve this, but one of the commonly used and preferred methods is to use constraints. A
constraint is a property assigned to a column or set of columns in a table to prevent certain types of
inconsistent data values from being entered. Constraints are used to apply business logic rules and enforce
data integrity.
Constraints can be created when a table is created, as part of the table definition by using CREATE TABLE
statement or can be added at a later stage using ALTER TABLE statement.
A column constraint is specified as part of a column definition and applies only to that column. A table
constraint can apply to more than one column in a table and is declared independently from a column
definition. Table constraints must be used when more than one column is included in a constraint.
The types of constraints are: PRIMARY KEY, UNIQUE, FOREIGN KEY, CHECK, and NOT NULL.
Note: Only one primary key constraint can be created per table.
Two rows in a table cannot have the same primary key value and a column that is a primary key cannot
have NULL values. Hence, when a primary key constraint is added to existing columns of a table, SQL
Server 2019 checks to see if the rules for primary key are complied with. If the existing data in the columns
do not comply with the rules for primary key then, the constraint will not be added.
Syntax:
Code Snippet 17 demonstrates how to create a table EMPContactPhone to store contact telephone details
of a person. Since the column Employee_ID must be a primary key for identifying each row uniquely, it is created
with the primary key constraint.
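A sketch of such a table definition is shown below; the telephone columns are illustrative:
USE AdventureWorks2019
CREATE TABLE EMPContactPhone
(
Employee_ID int NOT NULL PRIMARY KEY,
MobileNumber bigint,
LandlineNumber bigint
)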
Syntax:
Having created a primary key for Employee_ID, a query is written to insert rows into the table with the
statements shown in Code Snippet 18.
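The two statements could be written as follows, with illustrative values; note that both use the same Employee_ID:
INSERT INTO EMPContactPhone VALUES (101, 9833276605, 25536417)
INSERT INTO EMPContactPhone VALUES (101, 7738105993, 27564096)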
The first INSERT statement shown in Code Snippet 18 is executed successfully, but the next INSERT
statement will fail because the value for Employee_ID is duplicate as shown in Figure 6.5.
6.4.2 UNIQUE
A UNIQUE constraint is used to ensure that only unique values are entered in a column or set of columns. It
allows developers to make sure that no duplicate values are entered. Primary keys are implicitly unique.
Unique key constraints enforce entity integrity because once the constraints are applied, no two rows in
the table can have the same value for the constrained column or columns.
Syntax:
Code Snippet 19 demonstrates how to make MobileNumber and LandlineNumber columns as unique.
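A possible form of the table definition is shown below; the data types are illustrative:
USE AdventureWorks2019
CREATE TABLE NewEmpContactPhone
(
Employee_ID int NOT NULL PRIMARY KEY,
MobileNumber bigint UNIQUE,
LandlineNumber bigint UNIQUE
)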
Though a value of NULL has been given for the LandlineNumber columns, which are defined as UNIQUE, the
command will execute successfully because UNIQUE constraints check only for the uniqueness of values,
but do not prevent null entries. The first statement shown in Code Snippet 20 is executed successfully, but
the next INSERT statement will fail even though the primary key value is different because the value for
MobileNumber is a duplicate value as shown in Figure 6.7. This is because the column MobileNumber
is defined to be unique and disallows duplicate values.
Figure 6.8 shows the result of retrieving rows from NewEmpContactPhone table. It shows the row that
was successfully inserted as an outcome of the first INSERT statement.
Syntax:
where,
table_name: is the name of the table from which to reference primary key.
<pk_column_name>: is the name of the primary key column.
Code Snippet 21 demonstrates how to create a foreign key constraint. Here, it ensures that MobileNumber
in EmpPhoneExpenses is linked to MobileNumber column in NewEmpContactPhone. This means that
you cannot insert any values in the MobileNumber column of EmpPhoneExpenses that do not exist in
MobileNumber column of NewEmpContactPhone.
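A sketch of such a definition is shown below; the Expense_ID and Amount columns are illustrative additions:
USE AdventureWorks2019
CREATE TABLE EmpPhoneExpenses
(
Expense_ID int NOT NULL PRIMARY KEY,
MobileNumber bigint FOREIGN KEY REFERENCES NewEmpContactPhone (MobileNumber),
Amount money
)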
Figure 6.9 shows the database diagram depicting the relationship between NewEmpContactPhone and
EmpPhoneExpenses. The two tables are related based on the column MobileNumber.
Figure 6.9: Database Diagram Showing Foreign Key and Relationship Between Tables
Now, a row is inserted into EmpPhoneExpenses table such that the mobile number is the same as one of
the mobile numbers in NewEmpContactPhone. The statements that will be written are shown in Code
Snippet 22.
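Assuming, for illustration, that 9833276605 is one of the mobile numbers already present in NewEmpContactPhone, the statement could be:
INSERT INTO EmpPhoneExpenses VALUES (101, 9833276605, 500)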
If there is no key in the referenced table having a value that is being inserted into the foreign key, the
insertion will fail as shown in Figure 6.10. It is, however, possible to add NULL value into a foreign key
column.
6.4.4 CHECK
A CHECK constraint limits the values that can be placed in a column. Check constraints enforce integrity
of data. For example, a CHECK constraint can be given to check if the value being entered into VoterAge is
greater than or equal to 18. If the data being entered for the column does not satisfy the condition, then,
insertion will fail.
A CHECK constraint operates by specifying a search condition, which can evaluate to TRUE, FALSE, or
unknown. Values that evaluate to FALSE are rejected. Multiple CHECK constraints can be specified for a
single column. A single CHECK constraint can also be applied to multiple columns by creating it at the
table level.
Code Snippet 23 demonstrates creating a CHECK constraint to ensure that Amount value will always be non-
zero. A NULL value can, however, be added into Amount column if the value of Amount is not known.
Code Snippet 23:
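One way to express such a constraint, applied here to the illustrative EmpPhoneExpenses table from the previous section (the constraint name is also illustrative), is:
ALTER TABLE EmpPhoneExpenses
ADD CONSTRAINT CK_Amount_NonZero CHECK (Amount <> 0)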
Once a CHECK constraint has been defined, if an INSERT statement is written with data that violates the
constraint, it will fail as shown in Code Snippet 24.
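For example, an INSERT of the following form, which supplies a zero Amount, would be rejected (the other values are illustrative):
INSERT INTO EmpPhoneExpenses VALUES (102, 9833276605, 0)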
3. Which of the following code is used to drop a table from AdventureWorks2019 database?
USE [AdventureWorks2019]
GO
(A) DROPTABLE [dbo].[Table_1]
(B) DROP TABLE [dbo].[Table_1]
(C) DELETE TABLE [dbo].[Table_1]
(D) DROP [dbo].[Table_1]
4. Which of the following property of SQL Server is used to create identifier columns that can
contain auto-generated sequential values to uniquely identify each row within a table?
(A) SELECT (C) INSERT
5. A _______________ constraint is used to ensure that only unique values are entered in a column
or set of columns.
(A) UNIQUE (C) Foreign key
SCI handles approximately 650 claims per month, but that can soar to 15,000 or more when a
hurricane or some other disaster strikes. Officers can use the software on the device type of
their choice: Tablet PCs or laptops in the field, or desktop PCs back in their offices. The use of
Microsoft SQL Server 2019 as the software's database enables officers to receive and update all the
necessary information regarding a customer or claimant.
With thousands of customers expected every month, the integrity of the data in the database
is very important. You must perform the following tasks:
d. Create a CHECK constraint in CustomerDetails table to ensure that the Amount value will
always be non-zero.
7.1 Introduction
Cloud computing is a technology trend that involves the delivery of software, platforms, and infrastructure as
services through the Internet. Microsoft Azure is a key offering in Microsoft's suite of cloud computing
products and services. The database functions of Microsoft's cloud platform are provided by Azure SQL.
The data on Azure SQL does not have the constraint of being location-specific. This means that data stored
in Azure SQL can be viewed and edited from any location, as the entire data is stored on cloud storage
platform.
Azure SQL is a cloud based relational database service that leverages existing SQL Server technologies. It
extends the functionality of Microsoft SQL Server for developing applications that are Web-based,
scalable, and distributed. Azure SQL is not just a single product but refers to a family of managed,
intelligent, and secure products that use the SQL Server database engine in the Azure cloud.
Azure SQL was earlier known by other names such as SQL Azure, SQL Server Data Services, SQL
Services, and Windows Azure SQL Database. From the time it was first released in 2010 and then renamed
to Azure SQL in 2014 to the present version, there have been many new features and enhancements added
to it.
Azure SQL can be used to store and manage data using queries and other functions that are similar to SQL
Server. It can also be used in collaboration with other Azure applications, through the Visual Studio IDE, as
shown in figure 7.1.
One of the competitors to Azure SQL is Amazon Web Services (AWS) and its Relational Database
Services (RDS) product. Azure SQL is often compared to AWS RDS.
Figure: Azure SQL is promoted as being 3.6 times faster and 86% less expensive than AWS.
Both cloud based as well as on-premises applications can use the Azure SQL database. Applications
retrieve data from Azure SQL through a protocol known as Tabular Data Stream (TDS). This protocol is
not new to Azure SQL. Whenever on-premises applications involve interaction with SQL Server Database
Engine, this protocol is used by the client and the server.
Figure 7.2 outlines these services along with depicting their purpose.
SQL Server in a VM is ideal for existing applications that require full product functionality. It offers up to
20,000 Input/output Operations Per Second (IOPS) in terms of scalability.
Azure SQL Database on the other hand is more suited for applications that require elastic scale and/or
lesser overhead. Azure SQL Database can scale out to thousands of databases and process terabytes of
data.
Scalability: The Elastic Pool feature in Azure SQL allows you to assign a shared set of compute resources
to a collection of Azure SQL databases. The benefit of this is that a single database can be moved in and
out of an elastic pool, which gives flexibility and, in turn, helps achieve cost efficiency.
Usage of TDS: TDS is used in on-premises SQL Server databases for client libraries. Hence, most
developers are familiar with TDS and its use. The same kind of TDS interface is used in Azure SQL to
build client libraries. Hence, it is easier for developers to work on Azure SQL.
Automatic failover measures: Azure SQL stores multiple copies of data on different physical locations.
Even if there is a hardware failure due to heavy usage or excessive load, Azure SQL helps to maintain
business operations by providing availability of data through other physical locations. This is done by
using automatic failover measures that are provided in Azure SQL.
Flexibility in service usage: Even small organizations can use Azure SQL, as the pricing model for Azure
SQL is based on the storage capacity that is used by an organization. If the organization requires more
storage, the price can be altered to suit the requirement. This helps organizations to be flexible in their
investment depending on the service usage.
Transact-SQL support: As Azure SQL is completely based on the relational database model, it also
supports Transact-SQL operations and queries. This concept is similar to the working of the on-premises
SQL Server. Hence, administrators do not need any additional training or support to use Azure SQL.
Backup: Backup and restore functions must be supported in on-premises SQL Server for disaster
recovery. For Azure SQL, as all the data is on the cloud platform, backup and restore is not required.
Authentication: Azure SQL supports only SQL Server authentication, whereas on-premises SQL Server
supports both SQL Server authentication and Windows Authentication.
Accounts and Logins: In Azure SQL, administrative accounts are created in the Azure management portal.
Hence, there are no separate instance-level user logins.
Firewalls: Firewall settings for allowed ports and IP addresses can be managed on physical servers for
on-premises SQL Server. As an Azure SQL database is present on the cloud, authentication through logins
is the only method to verify the user.
1. Sign up for an Azure Free Account and get credit for 30 days and get 12 months of free access to
Azure SQL Database. After 30 days, however, your Azure account will be charged.
2. Alternatively, opt for a paid purchasing model. This approach is complicated because various
database service-tier options such as Database Transaction Units (DTUs), maximum database size,
and disaster recovery options are used to determine pricing instead of hardware (CPU/RAM/HD).
A DTU is a metric that defines a mixture of CPU, memory, and read/write rates.
1. Type the address http://portal.azure.com in the Address bar of your browser. You will be redirected
to a page to enter your login credentials. Refer to Figure 7.4.
2. If you already have a Microsoft account, you can use that to sign in. Refer to figure 7.5.
The Azure Portal page will open up and you will be prompted with a variety of actions as shown in figure
7.6.
Click Start under Start with an Azure free trial. You will be asked to fill in your profile details and
verify your identity through phone and credit card as shown in figures 7.7,7.8, and 7.9 (steps 1, 2, and 3).
Finally, upon successful verification of your credit card, your account will be set up. You will see various
services available under Azure services. Click SQL databases under Azure services as shown in figure
7.10.
The SQL databases page will be displayed as shown in figure 7.11. Here, you can create and work with
SQL databases. Click Create SQL database at the bottom of the page. You will be asked to fill up information such as
database name and server name. Refer to Figure 7.12.
Since a database server does not exist yet, you will create a new one as shown in figure 7.13.
Ensure that you specify valid details. Refer to Figure 7.15 for an example.
When you finish creating the new server and specify that server name, click Review+create on the Create
SQL Database page. The database will be successfully created and a dashboard will be displayed.
Now that the database is ready, you can use SSMS to work with it, just as you do with on-premises SQL
Server databases.
Figure 7.17: Connect to Server Dialog Box with Azure SQL Server
7. Sign in to Azure by clicking Sign In. Your client IP address will be automatically populated in the
corresponding box as shown in figure 7.19.
8. Click Connect. The connection to the database is successfully established. You will see the Azure
SQL Database in the Object Explorer.
10. Add columns and save the table as shown in figure 7.21.
11. Open the Azure portal and in the SQL Database dashboard, click Query Editor on the left pane.
Figure 7.22: Specifying Login Details for SQL Database Query Editor
13. The Query Editor on the cloud will be displayed. Observe in figure 7.23 that the table you created in
SSMS is now visible on the left pane. Thus, you created a table on the cloud SQL Server database
using local SSMS.
Now, you can perform similar other tasks, including creating and executing queries from SSMS on cloud
databases.
2. Which one of the following protocols is used by applications to retrieve data from SQL Server?
(A) ABS (C) TDS
(B) DTS (D) WSQL
4. Which layer in the Azure SQL architecture offers Provisioning, Billing and Metering, and
Connection Routing?
(A) Client Layer (C) Infrastructure Layer
(B) Service Layer (D) None of these
a) USE statement is not supported by Azure SQL. Hence, the user cannot switch between databases
in Azure SQL as compared to on-premises SQL Server.
b) For a single database, the vCore-based purchase model is ideal as it is flexible and allows you to
scale memory and storage based on your requirements.
c) Automatic failover measures are provided in Azure SQL.
d) All Transact-SQL functions are supported by Azure SQL.
e) In Azure SQL, there are separate instance-level user logins.
(A) a, b, c (C) a, c, d
(B) c, d, e (D) c, d
2. Then, using SSMS, create a Members table in this database with following columns:
CardNo: 5 characters
LastName: up to 15 characters
FirstName: up to 15 characters
Address: up to 150 characters
DOB: date
Gender: 1 char (M or F)
Phone_No: up to 15 characters
3. Similarly, create a Books table in this database with the following columns:
BookID: up to 5 characters
Title: up to 200 characters
Author: up to 100 characters
Publish Date: date
Verify using Azure SQL Query Editor whether the tables have been created.
4. Alter the tables to add CardNo as primary key in Members and BookID as primary key in
Books table.
5. Insert some rows into the Members and Books tables on the cloud database, Library.
6. Display records from Members table.
7. Display records from Books table.
8. Create a database named Supermarket on Azure SQL.
9. Create tables Customer, Products, and ProductOrders in this database with following
structures:
Customer
CustNo: 5 characters
FirstName: up to 45 characters
LastName: up to 45 characters
Address: up to 150 characters
DOB: date
Gender: 1 char (M or F)
Phone_No: up to 15 characters
ProductOrders
ProductNo: int, foreign key
OrderDate: date
Quantity: int
CustNo: 5 characters
This session describes SELECT statement. It also explains the expressions and various
clauses used with SELECT statement. Finally, the session introduces the new xml data type
and describes how to work with XML data in SQL Server 2019 tables.
8.1 Introduction
The SELECT statement is a core command used to access data in SQL Server 2019. XML allows developers
to develop their own set of tags and makes it possible for other programs to understand these tags. XML is
the preferred means for developers to store, format, and manage data on the Web.
Syntax:
where,
table_name: is the table from which the data will be displayed.
<column_name1>...<column_nameN>: are the columns that are to be displayed.
Code Snippet 1:
SELECT LEFT('International',5)
The code will display only the first five characters from the extreme left of the word 'International'.
Figure 8.1: First Five Characters from the Extreme Left of the Word
Syntax:
where,
*: specifies all columns of the named tables in the FROM clause.
<table_name>: is the name of the table from which the information is to be retrieved. It is possible
to include any number of tables. When two or more tables are used, the row of each table is mapped
with the row of others. This activity takes a lot of time if the data in the tables are huge. Hence, it is
recommended to use this syntax with a condition.
Code Snippet 2 demonstrates the use of ' * ' in the SELECT statement.
Code Snippet 2:
USE AdventureWorks2019
SELECT * FROM HumanResources.Employee
GO
Syntax:
where,
<column_name1>..<column_nameN>: are the columns that are to be displayed.
For example, to display the cost rates in various locations from Production.Location table in
AdventureWorks2019 database, the SELECT statement is shown in Code Snippet 3.
Code Snippet 3:
USE AdventureWorks2019
SELECT LocationID, CostRate FROM Production.Location
GO
Figure 8.3 shows LocationID and CostRate columns from AdventureWorks2019 database.
Code Snippet 4:
USE AdventureWorks2019
SELECT Name +':'+ CountryRegionCode +'->'+ [Group] FROM Sales.SalesTerritory
GO
In this code, the column name Group of Sales.SalesTerritory is given in brackets because it must
not conflict with the keyword GROUP of Transact-SQL.
Figure 8.4 displays the country name, country region code, and corresponding group from
Sales.SalesTerritory of AdventureWorks2019 database.
If you observe, the output in figure 8.4 had no column heading. The code in Code Snippet 4 can be
modified as given in Code Snippet 5 in order to display a named column heading.
Code Snippet 5:
USE AdventureWorks2019
SELECT Name +':'+ CountryRegionCode +'->'+ [Group] AS NameRegionGroup FROM
Sales.SalesTerritory
GO
Code Snippet 6:
USE AdventureWorks2019
SELECT ModifiedDate as 'ChangedDate' FROM Person.Person
GO
The output displays 'ChangedDate' as the heading for the ModifiedDate column in the Person.Person
table. Figure 8.6 shows the original heading and the changed heading.
Note - The table used in the FROM clause of a query is called as a base table.
Code Snippet 7:
USE AdventureWorks2019
SELECT ProductID,StandardCost,StandardCost * 0.15 AS Discount FROM
Production.ProductCostHistory
GO
Code Snippet 8:
USE AdventureWorks2019
SELECT DISTINCT StandardCost FROM Production.ProductCostHistory
GO
Syntax:
where,
expression: is the number or the percentage of rows to be returned as the result.
PERCENT: returns the number of rows limited by percentage.
WITH TIES: returns any additional rows that tie with the last returned row on the ORDER BY column values.
Syntax:
where,
new_table: is the name of the new table that is to be created.
Code Snippet 9 uses an INTO clause which creates a new table Production.ProductName with details
such as the product's ID and its name from the table Production.ProductModel.
Code Snippet 9:
USE AdventureWorks2019
SELECT ProductModelID, Name INTO Production.ProductName FROM Production.
ProductModel
GO
After executing the code, a message stating '(128 row(s) affected)' is displayed. If a query is written to
display the rows of the new table, the output will be similar to figure 8.8.
where,
search_condition: is the condition to be met by the rows.
Table 8.1 shows different operators that can be used with the WHERE clause.
Operator Description
= Equal to
<> Not equal to
> Greater than
< Less than
>= Greater than or equal to
<= Less than or equal to
! Not
Operator Description
BETWEEN Between a range
LIKE Search for an ordered pattern
IN Within a range
Table 8.1: Operators
Code Snippet 10 demonstrates the equal to operator with WHERE clause to display data with EndDate
'2013-05-29 00:00:00.000'.
USE AdventureWorks2019
SELECT * FROM Production.ProductCostHistory WHERE EndDate='2013-05-29 00:00:00.000'
GO
Code Snippet 10 will return all records from the table Production.ProductCostHistory which has
the end date as 2013-05-29 00:00:00.000.
The SELECT statement with the WHERE clause returns a large number of rows. Some of the
output is shown in figure 8.9.
Code Snippet 11 demonstrates the equal to operator with the WHERE clause to display the records from
Person.Address having the city Bothell.
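The query could be written as:
USE AdventureWorks2019
SELECT * FROM Person.Address WHERE City = 'Bothell'
GO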
The query returns a large number of rows. Some of the output is shown in
figure 8.10.
Numeric values are not enclosed within any quotes as shown in Code Snippet 12.
USE AdventureWorks2019
SELECT * FROM HumanResources.Department WHERE DepartmentID < 10
GO
The query in Code Snippet 12 displays all those records where the value in DepartmentID is less than
10. The output of the query is shown in figure 8.11.
WHERE clause also uses logical operators such as AND, OR, and NOT. These operators are used with search
conditions in WHERE clauses.
AND operator joins two or more conditions and returns TRUE only when both the conditions are TRUE.
Therefore, it returns all the rows from the tables where both the conditions that are listed are true.
Code Snippet 13 demonstrates the AND operator.
USE AdventureWorks2019
SELECT * FROM Person.Address WHERE AddressID > 900 AND City='Seattle'
GO
OR operator returns TRUE and displays all the rows if it satisfies any one of the conditions.
Code Snippet 14 demonstrates OR operator.
USE AdventureWorks2019
SELECT * FROM Person.Address WHERE AddressID > 900 OR City='Seattle'
GO
The query in Code Snippet 14 will display all the rows whose AddressID is greater than 900 or whose
City is Seattle.
The NOT operator negates the search condition. Code Snippet 15 demonstrates NOT operator.
USE AdventureWorks2019
SELECT * FROM Person.Address WHERE NOT AddressID = 5
GO
Code Snippet 15 will display all the records whose AddressID is not equal to 5. Multiple logical operators
in a single SELECT statement can be used. When more than one logical operator is used, NOT is evaluated
first, then AND, and finally OR.
The GROUP BY keyword is followed by a list of columns, known as grouped columns. Grouping restricts
the number of rows in the resultset: the result contains one row for each distinct value (or combination of
values) in the grouped columns. The GROUP BY clause can have more than one grouped column.
Syntax:
where,
column_name1: is the name of the column according to which the resultset should be grouped.
For example, consider that if the total number of resource hours has to be found for each work order, the
query in Code Snippet 16 would retrieve the resultset.
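Such a query can be written as follows; the column alias is illustrative:
USE AdventureWorks2019
SELECT WorkOrderID, SUM(ActualResourceHrs) AS TotalResourceHrs
FROM Production.WorkOrderRouting
GROUP BY WorkOrderID
GO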
The GROUP BY clause can be used with different clauses, such as WHERE, ORDER BY, and HAVING. For
example, Code Snippet 17 shows the use of HAVING with GROUP BY.
USE AdventureWorks2019
SELECT WorkOrderID, SUM(ActualResourceHrs) FROM Production.WorkOrderRouting
GROUP BY WorkOrderID HAVING WorkOrderID <50
GO
The output will show total resource hours only for those work orders that have WorkOrderID less than 50.
Syntax:
The SELECT statement in Code Snippet 18 sorts the query results on the SalesLastYear column of the
Sales.SalesTerritory table.
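One possible form of this query, sorting in the default ascending order and selecting a few illustrative columns, is:
USE AdventureWorks2019
SELECT TerritoryID, Name, SalesLastYear FROM Sales.SalesTerritory
ORDER BY SalesLastYear
GO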
Native XML databases in SQL Server 2019 have a number of advantages. Some of them are listed as
follows:
Easy Data Search and Management - All the XML data is stored locally in one place,
thus making it easier to search and manage.
SQL Server 2019 supports native storage of XML data by using the xml data type. A native XML database
defines a logical model for an XML document -- as opposed to the data in that document -- and stores and
retrieves documents according to that model. At a minimum, the model must include elements, attributes,
PCDATA, and document order.
A native XML database has an XML document as its fundamental unit of (logical) storage, just as a
relational database has a row in a table as its fundamental unit of (logical) storage.
Syntax:
Code Snippet 19 creates a new table named PhoneBilling with CallDetails column belonging to xml
data type. It is assumed that AdventureWorks2019 is selected as the database.
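A possible form of the table definition, in which the names of the first two columns are illustrative, is:
USE AdventureWorks2019
CREATE TABLE Person.PhoneBilling
(
Bill_ID int NOT NULL,
MobileNumber bigint,
CallDetails xml
)
GO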
A column of type xml can also be added to a table at the time of creation or after its creation. The xml
data type columns support DEFAULT values as well as NOT NULL constraint. Data can be inserted into the
xml column in the Person.PhoneBilling table as shown in Code Snippet 20.
USE AdventureWorks2019
INSERT INTO Person.PhoneBilling VALUES (100,9833276605,
'<Info><Call>Local</Call><Time>45 minutes</Time><Charges>200</Charges>
</Info>')
SELECT CallDetails FROM Person.PhoneBilling
GO
The output is shown in figure 8.14. The CallDetails column shows the XML document that has been
added as a single value. Thus, each row in the Person.PhoneBilling table can have an XML
document in the CallDetails column.
The DECLARE statement is used to declare a variable in SQL Server; it is also used to create variables of
type xml.
Syntax:
Local variable names have to start with an at (@) sign. The value argument indicated in the
syntax is an optional parameter that helps to assign an initial value to a variable during the declaration. If
you do not specify an initial value, the variable is initialized as NULL.
The xml data type columns cannot be used as a primary key, foreign key, or as a unique constraint.
SQL Server does not perform any validation for data entered in the xml column. However, it ensures that
the data stored is well-formed. Untyped XML data can be created and stored in either table columns or
variables depending upon the need and scope of the data.
The first step in using typed XML is registering a schema. This is done by using the CREATE XML SCHEMA
COLLECTION statement as shown in Code Snippet 22.
The CREATE XML SCHEMA COLLECTION statement creates a collection of schemas, any of which can be used
to validate typed XML data with the name of the collection. This example shows a new schema called
SoccerSchemaCollection being added to the AdventureWorks2019 database. Once a schema is
registered, the schema can be used in new instances of the xml data type.
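A sketch of such a statement is shown below; the actual element names in the schema are illustrative and would depend on the XML documents to be stored:
CREATE XML SCHEMA COLLECTION SoccerSchemaCollection AS
N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Team">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="TeamName" type="xsd:string" />
<xsd:element name="Country" type="xsd:string" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>'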
Code Snippet 23 creates a table with an xml type column and specifies a schema for the column.
CREATE TABLE SoccerTeam ( TeamID int identity not null, TeamInfo xml
(SoccerSchemaCollection) )
To create new rows with the typed XML data, the INSERT statement can be used as shown in Code
Snippet 24.
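Assuming the illustrative schema sketched above, such an INSERT could be:
INSERT INTO SoccerTeam (TeamInfo)
VALUES ('<Team><TeamName>Manchester United</TeamName><Country>England</Country></Team>')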
A normal SELECT can display the data in XML format and on clicking the data in the output, it is
expanded to reveal the full XML. Figures 8.15 and 8.16 depict this.
A typed XML variable can also be created by specifying the schema collection name. For instance, in
Code Snippet 25, a variable team is declared as a typed XML variable with schema name as
SoccerSchemaCollection. The SET statement is used to assign the variable as an XML fragment.
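A sketch of such a declaration and assignment, again assuming the schema registered above, is:

DECLARE @team xml (SoccerSchemaCollection)
SET @team = '<Team><Country>Brazil</Country></Team>'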
2. The statement retrieves rows and columns from one or more tables.
(A) SELECT (C) INSERT
3. Which of the following is the correct code to display all records from
EmployeeDepartmentHistory for all employees who have DepartmentID more than 10 and
whose ShiftID is 1?
(A) USE AdventureWorks2019
    SELECT * FROM [HumanResources].[EmployeeDepartmentHistory]
    WHERE DepartmentID > 10 with ShiftID=1
(C) USE AdventureWorks2019
    SELECT * FROM [HumanResources].[EmployeeDepartmentHistory]
    WHERE DepartmentID > 10 OR ShiftID=1
4. Which of the following clause with the SELECT statement is used to specify tables to retrieve records?
(A) WHERE (C) .VALUE
(B) FROM (D) .WRITE
5. _____________________ is used to improve the efficiency of queries on XML documents that are
stored in an XML column.
a) Write a SELECT statement that lists customer ID numbers and sales order ID numbers.
4. Write a query displaying all the columns of the Production.ProductCostHistory table from
the rows that were modified on June 17, 2003.
a) Write a query that displays the product ID and name for each product from the table with the
name starting with Chain.
b) Write a query similar to the previous one that displays the products with Lock in the name.
c) Change the last query so that the products without Lock in the name are displayed.
6. Write a query that displays all the rows from the Person.Person table where the rows were
modified after December 29, 2000. Display the business entity ID number, the name columns, and
the modified date.
7. Write a query displaying the ProductID, Name, and Color columns from rows in the
Production.Product table. Display only those rows where no color has been assigned.
8. Write a query that returns the business entity ID and name columns from the Person.Person
table.
9. Write a query using the Sales.SpecialOffer table. Display the difference between the MinQty
and MaxQty columns along with the SpecialOfferID and Description columns.
10. Write a query using the Production.ProductReview table. Use CONTAINS to find all the rows
that have the word socks in the Comments column. Return the ProductID and Comments
columns.
This session explains various techniques to group and aggregate data and describes the
concept of subqueries, table expressions, joins, and explores various set operators. The
session also covers pivoting and grouping set operations.
9.1 Introduction
SQL Server 2019 includes several powerful query features that help you to retrieve data efficiently and
quickly. Data can be grouped and/or aggregated together in order to present summarized information.
Using the concept of subqueries, a resultset of a SELECT can be used as criteria for another SELECT
statement or query. Joins help you to combine column data from two or more tables based on a logical
relationship between the tables. On the other hand, set operators such as UNION and INTERSECT help you
to combine row data from two or more tables. The PIVOT and UNPIVOT operators are used to transform
the orientation of data from column-oriented to row-oriented and vice versa. The GROUPING SET
subclause of the GROUP BY clause helps to specify multiple groupings in a single query.
Syntax:
SELECT select_list FROM table_name
GROUP BY column_name1, column_name2 ,...;
where,
column_name: is the name of the column according to which the resultset should be grouped.
Consider the WorkOrderRouting table in the AdventureWorks2019 database. The total resource hours
per work order must be calculated. To achieve this, the records should be grouped by work order number,
that is, WorkOrderID.
Code Snippet 1 retrieves and displays the total resource hours per work order along with the work order
number. In this query, a built-in function named SUM() is used to calculate the total. SUM() is an aggregate
function.
Code Snippet 1:
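A sketch of the query, assuming it is the same aggregation as Code Snippet 2 but without the WHERE filter, is:

SELECT WorkOrderID, SUM(ActualResourceHrs) AS TotalHoursPerWorkOrder
FROM Production.WorkOrderRouting
GROUP BY WorkOrderID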
Executing this query will return all the work order numbers along with the total number of resource
hours per work order.
A part of the output is shown in figure 9.1.
The GROUP BY clause can also be used in combination with various other clauses. These clauses are as
follows:
Code Snippet 2:
SELECT WorkOrderID, SUM(ActualResourceHrs) AS TotalHoursPerWorkOrder
FROM Production.WorkOrderRouting WHERE WorkOrderID <50 GROUP BY WorkOrderID
As the number of records returned is more than 25, a part of the output is shown in figure 9.2.
If the grouping column contains a NULL value, that row becomes a separate group in the resultset. If
the grouping column contains more than one NULL value, the NULL values are put into a single row.
Consider the Production.Product table. There are some rows in it that have NULL values in the
Class column.
Using a GROUP BY on a query for this table will take into consideration the NULL values too. For
example, Code Snippet 3 retrieves and displays the average of the list price for each Class.
Code Snippet 3:
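A minimal sketch of the query, assuming it groups on the Class column of Production.Product, is:

SELECT Class, AVG(ListPrice) AS AvgListPrice
FROM Production.Product
GROUP BY Class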
As shown in figure 9.3, the NULL values are grouped into a single row in the output.
Syntax:
SELECT <column_name> FROM <table_name> WHERE <condition> GROUP BY ALL <column_name>
Consider the Sales.SalesTerritory table. This table has a column named Group indicating the
geographic area to which the sales territory belongs. Code Snippet 4 calculates and displays the total
sales for each group. The output must display all the groups regardless of whether they had any sales or
not. To achieve this, the code makes use of GROUP BY with ALL.
Code Snippet 4:
SELECT [Group],SUM(SalesYTD) AS 'TotalSales' FROM Sales.SalesTerritory WHERE
[Group] LIKE 'N%' OR [Group] LIKE 'E%' GROUP BY ALL [Group]
Apart from the rows that are displayed in Code Snippet 4, it will also display the group 'Pacific' with null
values as shown in figure 9.4. This is because the Pacific region did not have any sales.
The HAVING clause is used only with the SELECT statement to specify a search condition for a group. The HAVING clause acts like a WHERE clause in places where the WHERE clause cannot be used against
aggregate functions such as SUM(). Once you have created groups with a GROUP BY clause, you may
wish to filter the results further. The HAVING clause acts as a filter on groups, similar to how the WHERE
clause acts as a filter on rows returned by the FROM clause. Following is the syntax of GROUP BY with
HAVING:
Syntax:
SELECT <column_name> FROM <table_name> GROUP BY <column_name> HAVING
<search_condition>
Code Snippet 5 displays the row with the group 'Pacific' as it has total sales less than 6000000.
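A sketch of such a query, assuming it filters the grouped totals from the earlier examples, is:

SELECT [Group], SUM(SalesYTD) AS TotalSales
FROM Sales.SalesTerritory
GROUP BY [Group]
HAVING SUM(SalesYTD) < 6000000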
The output of this is only 1 row, with Group name Pacific and total sales, 5977814.9154.
➢ CUBE:
CUBE is an aggregate operator that produces super-aggregate rows. In addition to the usual rows provided by the GROUP BY clause, it also provides summary rows for every possible combination of the grouping columns in the resultset. In these summary rows, the grouping columns display NULL, while the aggregate column returns the total across all the values in that combination.
Syntax:
SELECT <column_name> FROM <table_name> GROUP BY <column_name> WITH CUBE
Code Snippet 6:
CUBE is like an extension of the GROUP BY clause. CUBE allows you to generate subtotals for all
combinations of grouping columns specified in the GROUP BY clause.
Code Snippet 6 retrieves and displays the total sales of each country (except Australia and Canada)
and also, the total of the sales of all the countries' regions.
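A sketch of the query, assuming it mirrors Code Snippet 7 (shown later) but uses WITH CUBE instead of WITH ROLLUP, is:

SELECT Name, CountryRegionCode, SUM(SalesYTD) AS TotalSales
FROM Sales.SalesTerritory
WHERE Name <> 'Australia' AND Name <> 'Canada'
GROUP BY Name, CountryRegionCode
WITH CUBE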
➢ ROLLUP:
In addition to the usual rows generated by the GROUP BY clause, ROLLUP also introduces summary rows into the resultset. It is similar to the CUBE operator, but generates a resultset that shows the groups arranged in a hierarchical order, from the lowest level to the highest. The group hierarchy in the result depends on the order in which the grouping columns are specified.
When generating grouping sets, ROLLUP assumes a hierarchy among the dimension columns and only
generates grouping sets based on this hierarchy. ROLLUP is often used to generate subtotals and totals
for reporting purposes. ROLLUP is commonly used to calculate the aggregates of hierarchical data such
as sales by year > quarter > month.
Syntax:
SELECT <column_name> FROM <table_name> GROUP BY <column_name> WITH ROLLUP
Code Snippet 7:
SELECT Name, CountryRegionCode,SUM(SalesYTD) AS TotalSales FROM
Sales.SalesTerritory
WHERE Name <> 'Australia' AND Name<> 'Canada' GROUP BY
Name,CountryRegionCode
WITH ROLLUP
The output is shown in figure 9.6. As seen in the output, the query retrieves and displays total sales of
each country (except Australia and Canada), total of the sales of these countries' regions, and then,
arranges them in order from lowest to highest.
Since aggregate functions return a single value, they can be used in SELECT statements where a single
expression is used, such as SELECT, HAVING, and ORDER BY clauses. Aggregate functions ignore NULLs,
except when using COUNT(*).
Aggregate functions in a SELECT list do not generate a column alias. You may wish to use the AS
clause to provide one.
Aggregate functions in a SELECT clause operate on all rows passed to the SELECT phase. If there is no
GROUP BY clause, all rows will be summarized.
SQL Server provides many built-in aggregate functions. Commonly used functions are included in
table 9.1.
Function Name   Syntax                Description
AVG             AVG(<expression>)     Calculates the average of all the non-NULL numeric values in a column.
To use a built-in aggregate in a SELECT clause, consider the query in Code Snippet 8.
Code Snippet 8:
SELECT AVG([UnitPrice]) AS AvgUnitPrice, MIN([OrderQty])AS MinQty,
MAX([UnitPriceDiscount]) AS MaxDiscount
FROM Sales.SalesOrderDetail;
Since the query does not use a GROUP BY clause, all rows in the table will be summarized by the aggregate
formulas in the SELECT clause. The output is shown in figure 9.7.
When using aggregates in a SELECT clause, all columns referenced in the SELECT list must be used as
inputs for an aggregate function or must be referenced in a GROUP BY clause. Failing this, there will be
an error. For example, the query in Code Snippet 9 will return an error.
Code Snippet 9:
SELECT SalesOrderID, AVG(UnitPrice) AS AvgPrice FROM Sales.SalesOrderDetail;
Besides numeric data, aggregate expressions can also operate on date, time, and character data, for example with functions such as MIN and MAX.
Figures 9.9, 9.10, and 9.11 visually depict an example of these methods. Such output can be seen by
clicking the 'spatial results' tab. When SSMS sees any spatial data in the results, it adds a tab to visualize
it.
Here, two variables are declared of the geography type and appropriate values are assigned to them.
Then, they are combined into a third variable of geography type by using the STUnion() method.
These aggregates are implemented as static methods, which work for either geography or geometry
data types. Although, aggregates are applicable to all classes of spatial data, they can be best described
with polygons.
➢ Union Aggregate
It performs a union operation on a set of geometry objects. It combines multiple spatial objects into a
single spatial object, removing interior boundaries, where applicable. Following is the syntax of
UnionAggregate:
Syntax:
UnionAggregate (geometry_operand or geography_operand)
where,
geometry_operand: is a geometry type table column comprising the set of geometry objects on
which a union operation will be performed.
geography_operand: is a geography type table column comprising the set of geography objects
on which a union operation will be performed.
Code Snippet 13 demonstrates a simple example of using the UnionAggregate. It uses the
Person.Address table in the AdventureWorks2019 database.
➢ Envelope Aggregate
The Envelope Aggregate returns a bounding area for a given set of geometry or geography objects. It
exhibits different behaviors for geography and geometry types. Based on the type of object it is applied
to, it returns different results. For the geometry type, the result is a 'traditional' rectangular polygon, which
closely bounds the selected input objects. For the geography type, the result is a circular object, which
loosely bounds the selected input objects. Furthermore, the circular object is defined using the new
CurvePolygon feature.
Syntax:
➢ Collection Aggregate
It returns a GeometryCollection or GeographyCollection instance that contains the given set of geometry or geography objects. Following is the syntax of CollectionAggregate:
Syntax:
CollectionAggregate (geometry_operand or geography_operand)
where,
geometry_operand: is a geometry type table column comprising the set of geometry objects.
geography_operand: is a geography type table column comprising the set of geography objects.
➢ Convex Hull Aggregate
It returns a convex hull polygon, which encloses one or more spatial objects for a given set of geometry/geography objects. Following is the syntax of ConvexHullAggregate:
Syntax:
where,
geometry_operand: is a geometry type table column comprising the set of geometry objects.
geography_operand: is a geography type table column comprising the set of geography
objects.
The simplest form of a subquery is one that returns just one column. The parent query can use the results of
this subquery using an = sign. The syntax for the most basic form of a subquery using just one column
with an = sign is as shown:
Syntax:
In a subquery, the innermost SELECT statement is executed first and its result is passed as criteria to the
outer SELECT statement.
Consider a scenario where it is required to determine the due date and ship date of the most recent
orders.
Code Snippet 17 shows the code to achieve this.
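A sketch of such a query, assuming the Sales.SalesOrderHeader table is used for both the inner and the outer query, is:

SELECT DueDate, ShipDate
FROM Sales.SalesOrderHeader
WHERE OrderDate =
    (SELECT MAX(OrderDate) FROM Sales.SalesOrderHeader)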
Here, a subquery has been used to achieve the desired output. The inner query or subquery retrieves the
most recent order date. This is then passed to the outer query, which displays due date and ship date
for all the orders that were made on that particular date.
• Scalar subqueries return a single value. Here, the outer query must be written to process a single
result.
• Multi-valued subqueries return a result similar to a single-column table. Here, the outer query must be
written to handle multiple possible results.
These keywords, also called predicates, are used with multi-valued queries. For example, consider that all
the first names and last names of employees whose job title is 'Research and Development Manager' have
to be displayed. Here, the inner query may return more than one row, as there may be more than one
employee with that job title. To ensure that the outer query can use the results of the inner query, the IN
keyword will have to be used.
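A sketch of the query described here could be written as follows:

SELECT FirstName, LastName
FROM Person.Person
WHERE BusinessEntityID IN
    (SELECT BusinessEntityID
     FROM HumanResources.Employee
     WHERE JobTitle = 'Research and Development Manager')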
Here, the inner query retrieves the BusinessEntityID from the HumanResources.Employee table for
those records having job title 'Research and Development Manager'. These results are then passed to the
outer query, which matches the BusinessEntityID with that in the Person.Person table. Finally,
from the records that are matching, the first and last names are extracted and displayed.
The SOME or ANY keywords evaluate to true if at least one row returned by the inner query satisfies the comparison. They compare a scalar value with a column of values. SOME and ANY are
equivalent; both return the same result. They are rarely used.
The ntext, text, and image data types cannot be used in the SELECT list of subqueries.
The SELECT list of a subquery introduced with a comparison operator can have only one expression
or column name.
Subqueries that are introduced by a comparison operator not followed by the keyword ANY or ALL
cannot include GROUP BY and HAVING clauses.
You cannot use DISTINCT keyword with subqueries that include GROUP BY.
Besides scalar and multi-valued subqueries, you can also choose between self-contained subqueries and
correlated subqueries. These are defined as follows:
➢ Self-contained subqueries are written as standalone queries, without any dependencies on the outer
query. A self-contained subquery is processed once when the outer query runs and passes its results to
the outer query.
➢ Correlated subqueries reference one or more columns from the outer query and therefore, depend on
the outer query. Correlated subqueries cannot be run separately from the outer query.
The EXISTS keyword is used with a subquery to check the existence of rows returned by the subquery.
The subquery does not actually return any data; it returns a value of TRUE or FALSE.
Syntax:
SELECT <ColumnName> FROM <table> WHERE [NOT] EXISTS
(
<Subquery_Statement>
)
where,
Subquery_Statement: specifies the subquery.
The code in Code Snippet 18 can be rewritten as shown in Code Snippet 19 using EXISTS keyword to
yield same output.
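A sketch of the EXISTS form of the query, assuming the same tables and job title as before, is:

SELECT FirstName, LastName
FROM Person.Person AS p
WHERE EXISTS
    (SELECT *
     FROM HumanResources.Employee AS e
     WHERE e.JobTitle = 'Research and Development Manager'
       AND e.BusinessEntityID = p.BusinessEntityID)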
Here, the inner subquery retrieves all those records that match job title as 'Research and Development Manager'
and whose BusinessEntityID matches with that in the Person table. If there are no records matching
both these conditions, the inner subquery will not return any rows. Thus, in that case, the EXISTS will
return false and the outer query will also not return any rows. However, the code in Code Snippet 19 will
return two rows because the given conditions are satisfied. The output will be the same as figure 9.20.
Similarly, one can use the NOT EXISTS keyword. The WHERE clause in which it is used is satisfied if there
are no rows returned by the subquery.
However, if the subquery refers to the parent query, the subquery must be re-evaluated for every iteration in the parent query. This is because the search criterion in the subquery depends on the value of the row currently being processed in the parent query.
When a subquery takes parameters from its parent query, it is known as Correlated subquery. Consider
that you want to retrieve all the business entity ids of persons whose contact information was last
modified not earlier than 2019. To do this, you can use a correlated subquery as shown in Code
Snippet 21.
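A rough sketch of such a correlated subquery is shown below; the exact tables and date filter used in Code Snippet 21 may differ, so this query is indicative only:

SELECT bec.BusinessEntityID
FROM Person.BusinessEntityContact AS bec
WHERE bec.ContactTypeID IN
    (SELECT ct.ContactTypeID
     FROM Person.ContactType AS ct
     WHERE ct.ModifiedDate >= '2019-01-01'
       AND ct.ContactTypeID = bec.ContactTypeID)   -- correlation with the outer query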
In Code Snippet 21, the inner query retrieves contact type ids for all those persons whose contact information was modified in or after 2019. These results are then passed to the outer query, which
matches these contact type ids with those in the Person.BusinessEntityContact table and displays the
business entity IDs of those records. Figure 9.22 shows part of the output.
9.7 Joins
Joins are used to retrieve data from two or more tables based on a logical relationship between tables. A
join typically specifies foreign key relationship between the tables. It defines the manner in which two tables
are related in a query by:
• Specifying the column from each table to be used for the join. A typical join specifies a foreign key from one table and its associated key in the other table.
• Specifying a logical operator such as =, <> to be used in comparing values from the columns.
Syntax:
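A general form of the join, following the same pattern as the outer join syntax shown later in this session, is:

SELECT <ColumnName1>, <ColumnName2> FROM Table_A AS Table_Alias_A
JOIN Table_B AS Table_Alias_B
ON Table_Alias_A.<CommonColumn> = Table_Alias_B.<CommonColumn>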
where,
<ColumnName1>, <ColumnName2>: Is a list of columns that must be displayed.
Table_A: Is the name of the table on the left of the JOIN keyword.
Table_B: Is the name of the table on the right of the JOIN keyword.
AS Table_Alias: Is a way of giving an alias name to the table. An alias defined for the table in a
query can be used to denote a table so that the full name of the table need not be used.
<CommonColumn>: Is a column that is common to both the tables. In this case, the join succeeds
only if the columns have matching values.
Consider that you want to list employee first names, last names, and their job titles from the HumanResources.Employee and Person.Person tables. To extract this information from the two tables, you
must join them based on BusinessEntityID as shown in Code Snippet 22.
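A sketch of the join, using the aliases A and B described next, is:

SELECT B.FirstName, B.LastName, A.JobTitle
FROM HumanResources.Employee AS A
JOIN Person.Person AS B
    ON A.BusinessEntityID = B.BusinessEntityID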
Here, the tables HumanResources.Employee and Person.Person are given aliases A and B. They are
joined together based on their business entity ids. The SELECT statement then retrieves the desired
columns through the aliases.
Figure 9.23 shows the output.
Left outer join returns all the records from the left table and only matching records from the right table.
Syntax:
SELECT <ColumnList> FROM Table_A
AS Table_Alias_A
LEFT OUTER JOIN
Table_B AS Table_Alias_B ON
Table_Alias_A.<CommonColumn> = Table_Alias_B.<CommonColumn>
Consider that you want to retrieve all the customer ids from the Sales.Customers table and order
information such as ship dates and due dates, even if the customers have not placed any orders. Since the
record count would be very large, the result is restricted to only those orders that were placed before 2019. To
achieve this, you perform a left outer join as shown in Code Snippet 24.
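A sketch of the join is shown below; placing the date filter in the ON clause (an assumption about how Code Snippet 24 restricts the orders) keeps customers without any orders in the result:

SELECT C.CustomerID, OH.DueDate, OH.ShipDate
FROM Sales.Customer AS C
LEFT OUTER JOIN Sales.SalesOrderHeader AS OH
    ON C.CustomerID = OH.CustomerID
   AND OH.OrderDate < '2019-01-01'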
In Code Snippet 24, the left outer join is constructed between the tables Sales.Customer and
Sales.SalesOrderHeader. The tables are joined based on customer ids. In this case, all records from
the left table, Sales.Customer and only matching records from the right table,
Sales.SalesOrderHeader, are returned. Figure 9.24 shows the output.
As shown in the output, some records show the due dates and ship dates as NULL. This is because for some
customers, no order is placed, hence, their records will show the dates as NULL.
The right outer join retrieves all records from the second table in the join, regardless of whether there is
matching data in the first table or not.
Syntax:
Consider that you want to retrieve all the product names from Product table and all the corresponding
sales order ids from the SalesOrderDetail table even if there is no matching record for the products in
the SalesOrderDetail table. To do this, you will use a right outer join as shown in Code Snippet 25.
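A sketch of the query, assuming the join is made on ProductID, is:

SELECT P.Name, SOD.SalesOrderID
FROM Sales.SalesOrderDetail AS SOD
RIGHT OUTER JOIN Production.Product AS P
    ON SOD.ProductID = P.ProductID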
In the code, all the records from Product table are shown regardless of whether they have been sold or
not.
9.7.3 Self-Join
A self-join is used to find records in a table that are related to other records in the same table. A table is
joined to itself in a self-join.
Code Snippet 26 demonstrates how to use a self-join to retrieve the product details of those products
that have the same color from the table Production.Product.
SELECT
    p1.ProductID,
    p1.Color,
    p1.Name,
    p2.Name AS MatchingProduct
FROM Production.Product p1
JOIN Production.Product p2
    ON p1.Color = p2.Color
   AND p1.ProductID <> p2.ProductID   -- exclude each row matching itself
The MERGE statement accomplishes the tasks in a single statement. MERGE also allows you to display
those records that were inserted, updated, or deleted by using an OUTPUT clause.
Syntax:
MERGE target_table USING source_table ON match_condition
WHEN MATCHED THEN UPDATE SET Col1 = val1 [, Col2 = val2...]
WHEN NOT MATCHED [BY TARGET] THEN INSERT (Col1 [, Col2...]) VALUES (Val1 [, Val2...])
WHEN NOT MATCHED BY SOURCE THEN DELETE
where,
target_table: is the table where changes are being made.
source_table: is the table from which rows will be inserted, updated, or deleted into the target
table.
match_conditions: are the JOIN conditions and any other comparison operators.
MATCHED: true if a row in the target_table and source_table matches the
match_condition.
NOT MATCHED: true if a row from the source_table does not exist in the target_table.
NOT MATCHED BY SOURCE: true if a row exists in the target_table but not in the source_table.
OUTPUT: An optional clause that allows to view those records that have been
inserted/deleted/updated in target_table.
If records present in the target table do not match with those of source table (NOT MATCHED BY SOURCE), then
these are deleted from the target table. The last statement displays a report consisting of rows that were
inserted/updated/deleted as shown in the output.
Key advantages of CTEs are improved readability and ease in maintenance of complex queries.
Syntax:
WITH <CTE_name>
AS (<CTE_definition>)
For example, to retrieve and display the customer count year-wise for orders present in the
Sales.SalesOrderHeader table, the code will be as given in Code Snippet 28.
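A sketch of such a CTE, assuming the year is derived from OrderDate and the customers are counted per year, is:

WITH CTE_OrderYear
AS (
    SELECT YEAR(OrderDate) AS OrderYear, CustomerID
    FROM Sales.SalesOrderHeader
)
SELECT OrderYear, COUNT(DISTINCT CustomerID) AS CustomerCount
FROM CTE_OrderYear
GROUP BY OrderYear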
Here, CTE_OrderYear is specified as the CTE name. The WITH...AS keywords begins the CTE definition.
Then, the CTE is used in the SELECT statement to retrieve and display the desired results.
CTEs are limited in scope to the execution of the outer query. Hence, when the outer query
ends, the lifetime of the CTE will end.
You must define a name for a CTE and also, define unique names for each of the columns
referenced in the SELECT clause of the CTE.
A single CTE can be referenced multiple times in the same query with one definition.
Multiple CTEs can also be defined in the same WITH clause. For example, consider Code Snippet 29. It
defines two CTEs using a single WITH clause.
This snippet assumes that three tables named Student, City, and Status are created in the
AdventureWorks2019 database.
WITH CTE_Students
AS (
    SELECT S.StudentCode, S.Name, C.CityName, St.Status
    FROM Student S
    INNER JOIN City C
        ON S.CityCode = C.CityCode
    INNER JOIN Status St
        ON S.StatusId = St.StatusId
),
StatusRecord -- This is the second CTE being defined
AS (
    SELECT Status, COUNT(Name) AS CountofStudents
    FROM CTE_Students
    GROUP BY Status
)
SELECT * FROM StatusRecord
This will list all the product ids of both tables that match with each other. If you include the ALL clause,
all rows are included in the resultset including duplicate records.
Syntax:
Query_statement1
INTERSECT
Query_statement2
where,
Query_Statement1 and Query_Statement2 are SELECT statements.
➢ Number of columns and order in which they are given must be same in both queries.
➢ Data types of the columns being used must be compatible.
where,
Query_Statement1 and Query_Statement2 are SELECT statements.
The two rules that apply to INTERSECT operator are also applicable for EXCEPT operator. Code
Snippet 33 demonstrates the EXCEPT operator.
If the order of the two tables in this example is interchanged, only those rows are returned
from Production.Product table which do not match with the rows present in
Sales.SalesOrderDetail.
Thus, in simple terms, EXCEPT operator selects all the records from the first table except those that match
with the second table. Hence, when you are using EXCEPT operator, the order of the two tables in the
queries is important. Whereas, with the INTERSECT operator, it does not matter which table is specified
first.
Syntax:
where,
table_source: is a table or table expression.
aggregate_function: is a user-defined or in-built aggregate function that accepts one or more
inputs.
value_column: is the value column of the PIVOT operator.
pivot_column: is the pivot column of the PIVOT operator. This column must be of a type that
can implicitly or explicitly be converted to nvarchar().
IN (column_list): are values in the pivot_column that will become the column names of the
output table. The list must not include any column names that already exist in the input
table_source being pivoted.
table_alias: is the alias name of the output table.
The output of this will be a table containing all columns of the table_source except the pivot_column
and value_column. These columns of the table_source, excluding the pivot_column and
value_column, are called the grouping columns of the pivot operator.
In simpler terms, to use the PIVOT operator, you supply three elements to the operator:
• In the FROM clause, the input columns must be provided. The PIVOT operator uses those columns to determine which column(s) to use for grouping the data for aggregation.
• A comma-separated list of values that occur in the source data, which will be used as the column headings for the pivoted data.
• An aggregation function, such as SUM, to be performed on the grouped rows.
Consider an example to understand the PIVOT operator. Code Snippet 34 is shown without the PIVOT
operator and demonstrates a simple GROUP BY aggregation. As the number of records would be huge,
the resultset is limited to five by specifying TOP 5.
The top five year to date sales along with territory names grouped by territory names are displayed. Now,
the same query is rewritten in Code Snippet 35 using a PIVOT so that the data is transformed from a
row-based orientation to a column-based orientation.
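A sketch of the pivoted form, assuming a fixed list of territory names as the spreading elements (the names chosen in Code Snippet 35 may differ), is:

SELECT 'SalesYTD' AS Measure,
       [Northwest], [Northeast], [Central], [Southwest], [Southeast]
FROM
    (SELECT Name, SalesYTD FROM Sales.SalesTerritory) AS SourceData
PIVOT
    (SUM(SalesYTD) FOR Name IN
        ([Northwest], [Northeast], [Central], [Southwest], [Southeast])) AS PivotedData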
As shown in figure 9.30, the data is transformed and the territory names are now seen as columns instead
of rows. This improves readability. A major challenge in writing queries using PIVOT is the requirement
to provide a fixed list of spreading elements to the PIVOT operator, such as specific territory names
given in Code Snippet 35. It would not be feasible or practical to implement this for large number of
spreading elements. To overcome this, developers can use dynamic SQL. Dynamic SQL provides a
means to build a character string that is passed to SQL Server, interpreted as a command, and then,
executed.
When unpivoting data, one or more columns are defined as the source to be converted into rows. The
data in those columns is spread, or split, into one or more new rows, depending on how many columns
are being unpivoted.
When using UNPIVOT, you supply three elements: the source columns to be unpivoted, a name for the new column that will display the unpivoted values, and a name for the column that will display the names of the unpivoted columns.
1. Which of the following statements can be used with subqueries that return one column and
many rows?
(A) ANY (C) IN
2. The ______ operator is used to display only the rows that are common to both the tables.
(A) INTERSECT (C) UNION
(B) EXCEPT (D) UNION WITH ALL
4. __________ is formed when records from two tables are combined only if the rows from both the tables are matched based on a common column.
(A) Inner join (C) Self-join
5. __________ return all rows from at least one of the tables in the FROM clause of the SELECT statement, as long as those rows meet any WHERE or HAVING conditions of the SELECT statement.
(A) Inner join (C) Self-join
6. Consider you have two tables, Products and Orders that are already populated. Based on
new orders, you want to update Quantity in Products table. You write the following code:
MERGE Products AS T
USING Orders AS S
ON S.ProductID = T.ProductID
2. Using the tables Sales.SalesPerson and Sales.SalesTerritory, retrieve IDs of all the sales
persons who operate in Canada.
3. Using the tables Sales.SalesPerson and Sales.SalesTerritory, retrieve IDs of all the
sales persons who operate in Northwest or Northeast.
4. Compare the bonus values of salespersons in the Sales.SalesPerson table to find out the sales
persons earning more bonuses. Display the SalesPersonID and bonus values in descending order.
(Hint: Use a self-join and ORDER BY ...DESC).
5. Retrieve all the values of SalesPersonID from Sales.SalesPerson table, but leave out those
values, which are present in the Sales.Store table. (Hint: Use EXCEPT operator).
7. Retrieve all the sales person IDs and territory IDs from Sales.SalesPerson table regardless of
whether they have matching records in the Sales.SalesTerritory table. (Hint: Use a left outer
join).
8. Retrieve a distinct set of Territory IDs that are present in both Sales.SalesPerson and
Sales.SalesTerritory tables. (Hint: Use INTERSECT operator).
This session explains about views and describes creating, altering, and dropping views. The
session also describes stored procedures in detail. The session concludes with an
explanation of the techniques to query metadata.
10.1 Introduction
An SQL Server database has two main categories of objects: those that store data and those that access,
manipulate, or provide access to data. Views and stored procedures belong to the latter category.
10.2 Views
A view is a virtual table that is made up of selected columns from one or more tables. The tables from
which the view is created are referred to as base tables. These base tables can be from different databases. A
view can also include columns from other views created in the same or a different database. A view can have
a maximum of 1,024 columns. The data inside the view comes from the base tables that are referenced in
the view definition. The rows and columns of views are created dynamically when the view is referenced.
Note - All the Code Snippets in the session are based on the AdventureWorks2019 database.
where,
view_name: specifies the name of the view.
select_statement: specifies the SELECT statement that defines the view.
Code Snippet 1 creates a view from the Production.Product table to display only the product id,
product number, name, and safety stock level of products.
Code Snippet 1:
CREATE VIEW vwProductInfo AS
SELECT ProductID, ProductNumber, Name, SafetyStockLevel
FROM Production.Product;
GO
Note - The prefix vw is added to a view name as per recommended coding conventions.
Code Snippet 2:
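A minimal query against the view (the actual snippet may list the columns explicitly) is:

SELECT * FROM vwProductInfo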
The result will show specified columns of all products from Production.Product table. A part of the
output is shown in figure 10.1.
Syntax:
CREATE VIEW <view_name> AS
SELECT * FROM table_name1 JOIN
table_name2
ON table_name1.column_name = table_name2.column_name
where,
view_name: specifies the name of the view.
table_name1: specifies the name of first table.
JOIN: specifies that two tables are joined using JOIN keyword.
table_name2: specifies the name of the second table.
Code Snippet 3 creates a view named vwPersonDetails with specific columns from the Person and
Employee tables of HumanResources schema. The JOIN and ON keywords join the two tables based on
BusinessEntityID column.
Code Snippet 3:
CREATE VIEW vwPersonDetails AS
SELECT
p.Title
,p.[FirstName]
,p.[MiddleName]
,p.[LastName]
,e.[JobTitle]
FROM [HumanResources].[Employee] e
INNER JOIN [Person].[Person] p
ON p.[BusinessEntityID] = e.[BusinessEntityID]
GO
This view will contain the columns Title, FirstName, MiddleName, and LastName from the Person
table and JobTitle from the Employee table. Once the view is created, you can retrieve records from it,
and manipulate and modify records as well.
Code Snippet 4 shows executing a query on this view.
Code Snippet 4:
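A minimal form of such a query is:

SELECT * FROM vwPersonDetails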
As shown in figure 10.2, all the rows may not have values for the Title or MiddleName columns - some
may have NULL in them. A person seeing this output may not be able to comprehend the meaning of the
NULL values. Hence, to replace all the NULL values in the output with a null string, the COALESCE() function
can be used as shown in Code Snippet 5.
Code Snippet 5:
CREATE VIEW vwPersonDetailsNew
AS
SELECT
COALESCE(p.Title, ' ') AS Title
,p.[FirstName]
,COALESCE(p.MiddleName, ' ') AS MiddleName
,p.[LastName]
,e.[JobTitle]
FROM [HumanResources].[Employee] e
INNER JOIN [Person].[Person] p
ON p.[BusinessEntityID] = e.[BusinessEntityID]
GO
When this view is queried with a SELECT statement, the output will be as shown in figure 10.3.
A view is created only in the current database. The base tables and views from which the view is
created can be from other databases or servers.
View names must be unique and cannot be the same as the table names in the schema.
The CREATE VIEW statement can include the ORDER BY clause only if the TOP keyword is used.
The CREATE VIEW statement cannot be combined with other Transact-SQL statements in a single
batch.
Code Snippet 6 reuses the code given in Code Snippet 5 with an ORDER BY clause.
Code Snippet 6:
CREATE VIEW vwSortedPersonDetails AS
SELECT TOP 10 COALESCE(p.Title, ' ') AS Title
,p.[FirstName]
,COALESCE(p.MiddleName, ' ') AS MiddleName
,p.[LastName]
,e.[JobTitle]
FROM [HumanResources].[Employee] e INNER JOIN [Person].[Person] p
ON p.[BusinessEntityID] = e.[BusinessEntityID] ORDER BY p.FirstName
GO
--Retrieve records from the view
SELECT * FROM vwSortedPersonDetails
The TOP keyword limits the result to the first ten employees, with their first names displayed in ascending order.
➢ INSERT
➢ UPDATE
➢ DELETE
While using the INSERT statement on a view, if any rules are violated, the record is not inserted.
In the following example, when data is inserted through the view, the insertion does not take place as the
view is created from two base tables.
Code Snippet 7:
CREATE TABLE Employee_Personal_Details (
EmpID int NOT NULL,
FirstName varchar(30) NOT NULL,
LastName varchar(30) NOT NULL, Address
varchar(30)
)
Code Snippet 9:
Code Snippet 10 uses the INSERT statement to insert data through the view vwEmployee_Details.
However, the data is not inserted as the view is created from two base tables.
Following rules and guidelines must be followed when using the INSERT statement:
➢ The INSERT statement must specify values for all columns in a view in the underlying table that do
not allow null values and have no DEFAULT definitions.
➢ When there is a self-join with the same view or base table, the INSERT statement does not work.
This insert is not allowed as the view does not contain the LastName column from the base table and
that column does not allow null values.
Assume some records are added in the table as shown in figure 10.4.
UPDATE vwProduct_Details
SET Rate=3000
WHERE ProductName='DVD Writer'
The outcome of this code affects not only the view, vwProduct_Details, but also the underlying table
from which the view was created.
Figure 10.5 shows the updated table which was automatically updated because of the view.
Large value data types include varchar(max), nvarchar(max), and varbinary(max). To update data
having large value data types, the .WRITE clause is used. The .WRITE clause specifies that a section of
the value in a column is to be modified. The .WRITE clause cannot be used to update a NULL value in a
column. Also, it cannot be used to set a column value to NULL.
Syntax:
column_name .WRITE (expression, @Offset, @Length)
where,
column_name: specifies the name of the large value data-type column.
expression: specifies the value that is copied to the column.
@Offset: specifies the starting point in the value of the column at which the expression is written.
@Length: specifies the length of the section in the column.
@Offset and @Length are specified in bytes for varbinary and varchar data types and in
characters for the nvarchar data type.
Assume that the table Product_Details is modified to include a column Description having data
type nvarchar(max).
A view is created based on this table, having the columns ProductName, Description, and Rate as
shown in Code Snippet 16.
Code Snippet 17 uses the UPDATE statement on the view vwProduct_Details. The .WRITE clause is
used with values 0 and 2 to change first two characters in the Description column. Thus, Internal
will be changed to External.
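A sketch of the update, assuming the replacement text 'Ex' is written over the first two characters so that 'Internal ...' becomes 'External ...', is:

UPDATE vwProduct_Details
SET Description.WRITE(N'Ex', 0, 2)
WHERE ProductName = 'Portable Hard Drive'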
As a result of the code, all the rows in the view that had 'Portable Hard Drive' as product name will be
updated with External instead of Internal in the Description column.
Figure 10.6 shows a sample output of the view after the update.
Following rules and guidelines must be followed when using the UPDATE statement:
When there is a self-join with the same view or base table, the UPDATE statement does not
work.
While updating a row, if a constraint or rule is violated, the statement is terminated, an error
is returned, and no records are updated.
For example, consider a view vwCustDetails that lists the account information of different customers.
When a customer closes the account, the details of this customer need to be deleted. This is done using the
DELETE statement.
Assume that a table named Customer_details and a view vwCustDetails based on the table are
created.
Code Snippet 18 is used to delete the record from the view vwCustDetails that has CustID C0004.
ALTER VIEW can be applied to indexed views; however, it unconditionally drops all indexes on the view.
Views are often altered when a user requests for additional information or makes changes in the underlying
table definition.
Note - After a view is created, if the structure of its underlying tables is altered by adding columns, the
new columns do not appear in the view. This is because the column list is interpreted only when you
first create the view. To see the new columns in the view, you must alter the view.
Syntax:
Code Snippet 19 alters the view, vwProductInfo to include the ReOrderPoint column.
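A sketch of the altered view is shown below; note that the column is named ReorderPoint in the AdventureWorks Production.Product table:

ALTER VIEW vwProductInfo AS
SELECT ProductID, ProductNumber, Name, SafetyStockLevel, ReorderPoint
FROM Production.Product;
GO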
Note - The owner of the view has the permission to drop a view and this permission is nontransferable.
However, the system administrator or database owner can drop any object by specifying the owner name
in the DROP VIEW statement.
Syntax:
Syntax:
sp_helptext <view_name>
The execution of the code will display the definition of the view as shown in figure 10.8.
Consider the view that was created in Code Snippet 16. It has been re-created in Code Snippet 22 to make
use of the AVG() function. Make sure to delete the existing view before executing this snippet.
The WITH CHECK OPTION clause forces all the modification statements executed against the view to
follow the condition set within the SELECT statement. When a row is modified, the WITH CHECK OPTION
makes sure that the data remains visible through the view.
Syntax:
CREATE VIEW <view_name>
AS select_statement [ WITH CHECK OPTION ]
where,
WITH CHECK OPTION: specifies that the modified data in the view continues to satisfy the view
definition.
Code Snippet 23 re-creates the view vwProductInfo having SafetyStockLevel less than or equal to
1000.
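A sketch of the re-created view, assuming CREATE OR ALTER is used so that the existing view is replaced, is:

CREATE OR ALTER VIEW vwProductInfo AS
SELECT ProductID, ProductNumber, Name, SafetyStockLevel
FROM Production.Product
WHERE SafetyStockLevel <= 1000
WITH CHECK OPTION
GO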
In Code Snippet 24, the UPDATE statement is used to modify the view vwProductInfo by changing the
value of the SafetyStockLevel column for the product having id 321 to 2500.
The UPDATE statement fails to execute as it violates the view definition, which specifies that
SafetyStockLevel must be less than or equal to 1000. Thus, no rows are affected in the view
vwProductInfo.
Note - Any updates performed on the base tables are not verified against the view, even if CHECK
OPTION is specified.
While using the SCHEMABINDING option in a view, you must specify the schema name along with the
object name in the SELECT statement.
Syntax:
CREATE VIEW <view_name> WITH SCHEMABINDING
AS <select_statement>
where,
view_name: specifies the name of the view.
WITH SCHEMABINDING: specifies that the view must be bound to a schema.
select_statement: Specifies the SELECT statement that defines the view.
Code Snippet 25 creates a view vwNewProductInfo with SCHEMABINDING option to bind the view to the
Production schema, which is the schema of the table Product.
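A sketch of such a view is shown below; with SCHEMABINDING, the base table must be referenced with its two-part (schema-qualified) name:

CREATE VIEW vwNewProductInfo WITH SCHEMABINDING
AS
SELECT ProductID, ProductNumber, Name, SafetyStockLevel
FROM Production.Product
GO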
The sp_refreshview stored procedure returns a value of zero if the execution is successful or a non-zero number if the execution has failed.
Syntax:
sp_refreshview '<view_name>'
Code Snippet 26 creates a table Customers with the CustID, CustName, and Address columns.
The output of Code Snippet 28 shows three columns, CustID, CustName, and Address.
Code Snippet 29 uses the ALTER TABLE statement to add a column Age to the table Customers.
To resolve this, the sp_refreshview stored procedure must be executed on the view vwCustomers
as shown in Code Snippet 31.
When a SELECT query is run again on the view, the column Age is seen in the output. This is because the
sp_refreshview procedure refreshes the metadata for the view vwCustomers.
Tables that are schema-bound to a view cannot be dropped unless the view is dropped or changed such that
it no longer has schema binding. If the view is not dropped or changed and you attempt to drop the table,
the Database Engine returns an error message.
Also, when an ALTER TABLE statement affects the view definition of a schema-bound view, the ALTER
TABLE statement fails.
Consider the schema-bound view that was created in Code Snippet 25. It is dependent on the
Production.Product table.
Code Snippet 32 tries to modify the data type of ProductID column in the Production.Product table
from int to varchar(7).
The Database Engine returns an error message as the table is schema-bound to the vwNewProductInfo
view and hence, cannot be altered such that it violates the view definition of the view.
Stored procedures are useful when repetitive tasks have to be performed. This eliminates the need for
repetitively typing out multiple Transact-SQL statements and then repetitively compiling them.
Stored procedures can accept values in the form of input parameters and return output values as defined by
the output parameters.
➢ Improved Security: The database administrator can improve security by associating database privileges with stored procedures. Users can be given permission to execute a stored procedure even if the user does not have permission to access the tables or views.
➢ Precompiled Execution: Stored procedures are compiled during the first execution. For every subsequent execution, SQL Server reuses this precompiled version. This reduces the time and resources required for compilation.
➢ Reduced Client/Server Traffic: Stored procedures help in reducing network traffic. When Transact-SQL statements are executed individually, the network is used separately for the execution of each statement. When a stored procedure is executed, the Transact-SQL statements are executed together as a single unit and the network path is not used separately for each individual statement. This reduces network traffic.
➢ Reuse of Code: Stored procedures can be used multiple times. This eliminates the need to repetitively type out hundreds of Transact-SQL statements every time a similar task is to be performed.
User-defined stored procedures are also known as custom stored procedures. These procedures are used
for reusing Transact-SQL statements for performing repetitive tasks. There are two types of user-defined
stored procedures, the Transact-SQL stored procedures and the Common Language Runtime (CLR) stored
procedures.
Transact-SQL stored procedures consist of Transact-SQL statements whereas the CLR stored procedures
are based on the .NET framework CLR methods. Both the stored procedures can take and return user-defined
parameters.
➢ Extended Stored Procedures
Extended stored procedures help SQL Server in interacting with the operating system. Extended stored
procedures are not resident objects of SQL Server. They are procedures that are implemented as Dynamic-
Link Libraries (DLL) executed outside the SQL Server environment. The application interacting with SQL Server can load and run these DLLs dynamically.
➢ System Stored Procedures
A system stored procedure is a set of pre-compiled Transact-SQL statements executed as a single unit.
System procedures are used in database administrative and informational activities. These procedures
provide easy access to the metadata information about database objects such as system tables, user-defined
tables, views, and indexes.
System stored procedures logically appear in the sys schema of system and user-defined databases. When
referencing a system stored procedure, the sys schema identifier is used. System stored procedures are
stored physically in the hidden Resource database and have the sp_ prefix. System stored procedures are
owned by the database administrator.
Note - System tables are created by default at the time of creating a new database. These tables store the
metadata information about user-defined objects such as tables and views. Users cannot access or
update the system tables using system stored procedures except through permissions granted by a
database administrator.
➢ Security Stored Procedures: These are used to manage the security of the database. For example, the sp_changedbowner security stored procedure is used to change the owner of the current database.
➢ Cursor Stored Procedures: These are used to implement the functionality of a cursor. For example, the sp_cursor_list cursor stored procedure lists all the cursors opened by the connection and describes their attributes.
➢ Database Mail and SQL Mail Stored Procedures: These are used to perform e-mail operations from within SQL Server. For example, the sp_send_dbmail database mail stored procedure sends e-mail messages to specified recipients. The message may include a query resultset or file attachments or both.
SQL Server supports two types of temporary stored procedures namely, local and global. The differences
between the two types are given in table 10.1.
Local Temporary Procedure                      Global Temporary Procedure
Visible only to the user that created it       Visible to all users
Dropped at the end of the current session      Dropped at the end of the last session
Can only be used by its owner                  Can be used by any user
Uses the # prefix before the procedure name    Uses the ## prefix before the procedure name
Note - A session is established when a user connects to the database and is ended when the user
disconnects. The complete name of a global temporary stored procedure including the prefix # #
cannot exceed 128 characters. The complete name of a local temporary stored procedure including
the prefix # cannot exceed 116 characters.
Syntax:
EXECUTE <procedure_name>
Code Snippet 33 executes the extended stored procedure xp_fileexist to check whether the
MyTest.txt file exists or not.
Note - When you execute an extended stored procedure, either in a batch or in a module, qualify the
stored procedure name with master.dbo.
For example, consider a table Customer_Details that stores details about all the customers. You would
need to type out Transact-SQL statements every time you wished to view the details about the customers.
Instead, you could create a custom stored procedure that would display these details whenever the
procedure is executed.
Creating a custom stored procedure requires CREATE PROCEDURE permission in the database and ALTER
permission on the schema in which the procedure is being created.
where,
procedure_name: specifies the name of the procedure.
@parameter: specifies the input/output parameters in the procedure.
data_type: specifies the data types of the parameters.
sql_statement: specifies one or more Transact-SQL statements to be included in the procedure.
Code Snippet 34 creates and then executes a custom stored procedure, uspGetCustTerritory, which
will display the details of customers such as customer id, territory id, and territory name.
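A sketch of the procedure, assuming the customer and territory details come from Sales.Customer and Sales.SalesTerritory, is:

CREATE PROCEDURE uspGetCustTerritory
AS
    SELECT C.CustomerID, C.TerritoryID, T.Name AS TerritoryName
    FROM Sales.Customer AS C
    JOIN Sales.SalesTerritory AS T
        ON C.TerritoryID = T.TerritoryID
GO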
To execute the stored procedure, the EXEC command is used as shown in Code Snippet 35.
EXEC uspGetCustTerritory
Input Output
Parameters Parameters
Input Parameters
Values are passed from the calling program to the stored procedure and these values are accepted into the input parameters of the stored procedure. The input parameters are defined at the time of creation of the procedure.
Syntax:
where,
data_type: specifies the system defined data type.
Following syntax is used to execute a stored procedure and pass values as input parameters:
Syntax:
Code Snippet 36 creates a stored procedure, uspGetSales with a parameter territory to accept the name
of a territory and display the sales details and salesperson id for that territory. Then, the code executes the
stored procedure with Northwest being passed as the input parameter.
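A sketch of the procedure and its execution, assuming the sales details come from Sales.SalesPerson joined to Sales.SalesTerritory, is:

CREATE PROCEDURE uspGetSales @territory varchar(40)
AS
    SELECT A.BusinessEntityID, A.SalesYTD, B.Name
    FROM Sales.SalesPerson AS A
    JOIN Sales.SalesTerritory AS B
        ON A.TerritoryID = B.TerritoryID
    WHERE B.Name = @territory
GO
EXEC uspGetSales 'Northwest'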
Output Parameters
Stored procedures occasionally need to return output back to the calling program. This transfer of data from
the stored procedure to the calling program is performed using output parameters. Output parameters are
defined at the time of creation of the procedure. To specify an output parameter, the OUTPUT keyword is
used while declaring the parameter. Also, the calling statement has to have a variable specified with the OUTPUT keyword to receive the returned value.
Following syntax is used to pass output parameters in a stored procedure and then, execute the stored
procedure with the OUTPUT parameter specified:
Syntax:
Code Snippet 37 creates a stored procedure, uspGetTotalSales with input parameter @territory to
accept the name of a territory and output parameter @sum to display the sum of sales year to date in that
territory.
Code Snippet 37:
CREATE PROCEDURE uspGetTotalSales
    @territory varchar(40),
    @sum int OUTPUT
AS
    SELECT @sum = SUM(B.SalesYTD)
    FROM Sales.SalesPerson A
    JOIN Sales.SalesTerritory B
        ON A.TerritoryID = B.TerritoryID
    WHERE B.Name = @territory
Code Snippet 38 declares a variable sumsales to accept the output of the procedure
uspGetTotalSales.
The code passes Northwest as the input to the uspGetTotalSales stored procedure and accepts the
output in the variable sumsales. The output is printed using the PRINT command.
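A sketch of the calling batch, assuming the output is captured in an int variable to match the procedure definition, is:

DECLARE @sumsales int
EXEC uspGetTotalSales @territory = 'Northwest', @sum = @sumsales OUTPUT
PRINT 'Total sales for Northwest: ' + CAST(@sumsales AS varchar(20))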
OUTPUT parameters have following characteristics:
The OUTPUT clause returns information from each row on which the INSERT, UPDATE, and DELETE
statements have been executed. This clause is useful to retrieve the value of an identity or computed column
after an INSERT or UPDATE operation.
Expand Programmability, right-click Stored Procedures, and then, click Stored Procedure as
shown in figure 10.12.
The Query Editor displays a default template in Transact SQL for the stored procedure, as shown
in figure 10.13.
The Specify Values for Template Parameters dialog box is displayed as shown in figure 10.15.
5. In the Specify Values for Template Parameters dialog box, enter the values for the parameters as
shown in table 10.2.
Parameter Value
Author Your name
Create Date Today's date
Description Returns year-to-date sales data for a territory
Procedure_Name uspGetTotals
@Param1 @territory
@Datatype_For_Param1 varchar(50)
Default_Value_For_Param1 NULL
@Param2
@Datatype_For_Param2
Default_Value_For_Param2
Table 10.2: Parameter Values
9. To test the syntax, on the Query menu, click Parse. If an error message is returned, compare the
statements with the information and correct as required.
10. To create the procedure, from the Query menu, click Execute. The procedure is created as an object
in the database.
11. To see the procedure listed in Object Explorer, right-click Stored Procedures and select Refresh.
The procedure name will be displayed in the Object Explorer tree as shown in figure 10.16.
12. To run the procedure, in Object Explorer, right-click the stored procedure name uspGetTotals and
select Execute Stored Procedure.
13. In the Execute Procedure window, enter Northwest as the value for the parameter @territory
and click OK. The procedure will be executed and the output will be displayed, as shown in figure
10.17.
where,
ENCRYPTION: encrypts the stored procedure definition.
RECOMPILE: indicates that the procedure is compiled at run-time.
sql_statement: specifies Transact-SQL statements to be included in the body of the procedure.
Code Snippet 39 modifies the definition of the stored procedure named uspGetTotals to add a new
column CostYTD to be retrieved from Sales.SalesTerritory.
Note - When you change the definition of a stored procedure, the dependent objects may fail when
executed. This happens if the dependent objects are not updated to reflect the changes made to the stored
procedure.
Stored procedures can be dropped if they are no longer required. If another stored procedure calls a deleted
procedure, an error message is displayed.
If a new procedure is created using the same name as well as the same parameters as the dropped procedure,
all calls to the dropped procedure will be executed successfully. This is because they will now refer to the
new procedure, which has the same name and parameters as the deleted procedure.
Before dropping a stored procedure, execute the sp_depends system stored procedure to determine
which objects depend on the procedure.
Syntax:
When a stored procedure calls another stored procedure, the level of nesting is said to be increased by one.
Similarly, when a called procedure completes its execution and passes control back to the calling
procedure, the level of nesting is said to be decreased by one. The maximum level of nesting supported by
SQL Server 2019 is 32.
Code Snippet 41 is used to create a stored procedure NestedProcedure that calls two other stored
procedures that were created earlier through Code Snippets 34 and 36.
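A sketch of the nesting procedure, assuming it simply calls the two earlier procedures in sequence, is:

CREATE PROCEDURE NestedProcedure
AS
    EXEC uspGetCustTerritory
    EXEC uspGetSales 'France'
GO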
When the procedure NestedProcedure is executed, this procedure in turn invokes the
uspGetCustTerritory and uspGetSales stored procedures and passes the value France as the input
parameter to the uspGetSales stored procedure.
Note - Although there can be a maximum of 32 levels of nesting, there is no limit on the number of stored procedures that can be called from a given stored procedure.
There are over 230 different system views and these are automatically included in each user-created database. These views are grouped into several different schemas.
Code Snippet 42 retrieves a list of user tables and attributes from the system catalog view sys.tables.
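A sketch of such a query is:

SELECT name, object_id, schema_id, type_desc, create_date
FROM sys.tables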
Following points in table 10.3 will help to decide whether one should query SQL Server-specific system
views or information schema views:
Information Schema Views | SQL Server System Views
They are stored in their own schema, INFORMATION_SCHEMA. | They appear in the sys schema.
They use standard terminology instead of SQL Server terms. For example, they use catalog instead of database and domain instead of user-defined data type. | They adhere to SQL Server terminology.
They may not expose all the metadata available to SQL Server's own catalog views. For example, sys.columns includes attributes for the identity property and the computed column property, while INFORMATION_SCHEMA.columns does not. | They can expose all the metadata available to SQL Server's catalog views.
Table 10.3: Information Schema Views and SQL Server System Views
In addition to views, SQL Server provides a number of built-in functions that return metadata to a query. These include scalar functions and table-valued functions, which can return information about system settings, sessions, and database objects.
SQL Server metadata functions come in a variety of formats. Some appear similar to standard scalar
functions, such as ERROR_NUMBER(). Others use special prefixes, such as @@VERSION or $PARTITION.
One of the new features in SQL Server 2019 is memory-optimized tempdb metadata. The SQL Server
team enhanced tempdb code with optimizations so that some of the metadata which could have been a
bottleneck on tempdb heavy systems can now rely on memory and be optimized for RAM access.
Large volume, large scale environments that use a lot of tempdb usually run into this type of bottleneck.
Earlier this would require some sort of workaround to reduce the use of tempdb. However, with this new
feature in place, it is possible to enable metadata to remain in memory and be optimally accessed.
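The feature is enabled through a server configuration option, for example:
ALTER SERVER CONFIGURATION
SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;
-- The SQL Server instance must be restarted for the setting to take effect.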
To query a dynamic management object, you use a SELECT statement as you would with any user-defined
view or table-valued function. For example, Code Snippet 45 returns a list of current user connections from
the sys.dm_exec_sessions view.
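A query of this kind might resemble the following sketch (the exact column list used in Code Snippet 45 is assumed):
SELECT session_id, login_name, program_name, client_version, login_time
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;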
sys.dm_exec_sessions is a server-scoped DMV that displays information about all active user
connections and internal tasks. This information includes login user, current session setting, client version,
client program name, client login time, and more. The sys.dm_exec_sessions can be used to identify a
specific session and find information about it.
Here, is_user_process is a column in the view that determines if the session is a system session or not.
A value of 1 indicates that it is not a system session but rather a user session. The program_name column
determines the name of client program that initiated the session. The login_time column establishes the
time when the session began. The output of Code Snippet 45 is shown in figure 10.18.
(A) a, c, d, e (C) a, b, d
(B) a, d (D) All of these
2. You are creating a view vwSupplier with FirstName, LastName, and City columns from the
Supplier_Details table. Which of the following code is violating the definition of a view?
(A) CREATE VIEW vwSupplier AS SELECT FirstName, LastName, City FROM Supplier_Details WHERE City IN('New York', 'Boston', 'Orlando')
(B) CREATE VIEW vwSupplier AS SELECT TOP 100 FirstName, LastName, City FROM Supplier_Details WHERE FirstName LIKE 'A%' ORDER BY FirstName
(C) CREATE VIEW vwSupplier AS SELECT FirstName, LastName, City FROM Supplier_Details ORDER BY FirstName
(D) CREATE VIEW vwSupplier AS SELECT TOP 100 FirstName, LastName, City FROM Supplier_Details
3. Which of these statements about CHECK OPTION and SCHEMABINDING options are true?
a. The CHECK OPTION ensures entity integrity.
b. The SCHEMABINDING option binds the view to the schema of the base table.
c. When a row is modified, the WITH CHECK OPTION makes sure that the data remains
visible through the view.
d. SCHEMABINDING option ensures the base table cannot be modified in a way that would
affect the view definition.
e. SCHEMABINDING option cannot be used with ALTER VIEW statements.
(A) a, b, c (C) b, c, d
(B) b, c (D) c, d, e
5. A table Item_Details is created with ItemCode, ItemName, Price, and Quantity columns.
The ItemCode column is defined as the PRIMARY KEY, ItemName is defined with UNIQUE and
NOT NULL constraints, Price is defined with the NOT NULL constraint, and Quantity is defined
with the NOT NULL constraint and having a default value specified. Which of the following views
created using columns from the Item_Details table can be used to insert records in the table?
(A) CREATE VIEW vwItemDetails AS SELECT ItemCode, ItemName, Price FROM Item_Details
(B) CREATE VIEW vwItemDetails AS SELECT ItemCode, Price, Quantity FROM Item_Details
(C) CREATE VIEW vwItemDetails AS SELECT ItemName, Price, Quantity FROM Item_Details
(D) CREATE VIEW vwItemDetails AS SELECT ItemCode, ItemName, Quantity FROM Item_Details
(A) a, d (C) a, c, d, e
(B) b, c, e (D) d
2. ShoezUnlimited is a trendy shoe store based in Miami. It stocks various kinds of footwear in its store
and sells them for profits. ShoezUnlimited maintains the details of all products in an SQL Server 2019
database. The management wants their developer to make use of stored procedures for commonly
performed tasks. Assuming that you are the developer, perform the following tasks:
a) Create the Shoes table having structure as shown in table 10.6 in the database, ShoezUnlimited.
Field Name | Data Type | Key Field | Description
ProductCode | varchar(5) | Primary Key | Product Code that uniquely identifies each shoe
BrandName | varchar(30) | | Brand name of the shoe
Category | varchar(30) | | Category of the shoe, such as sports shoe, casual wear, party wear, and so forth
UnitPrice | money | | Price of the shoe in dollars
QtyOnHand | int | | Quantity available
Table 10.6: Shoes Table
b) Add at least five records to the table. Ensure that the value of the column QtyOnHand is more
than 20 for each of the shoes.
c) Write statements to create a stored procedure named sp_PriceIncrease that will increment
the unitprice of all shoes by 10 dollars.
d) Write statements to create a stored procedure sp_QtyOnHand that will decrease the quantity on
hand of specified brands by 25. The brand name should be supplied as input.
e) Execute the stored procedures sp_PriceIncrease and sp_QtyOnHand.
3. Create a view called dbo.vw_OrderDetails that displays details of sales orders for the product
whose id is 777. Use the Sales.SalesOrderDetail table of AdventureWorks2019 database Test
the view by creating a query that retrieves data from the view.
4. Create a view called dbo.vw_Products that displays a list of the products from the
Production.Product table joined to the Production.ProductCostHistory table of
AdventureWorks2019 database. Include columns that describe the product and show the cost
history for each product. Test the view by creating a query that retrieves data from the view.
11.1 Introduction
Indexes are special data structures associated with tables or views that help speed up queries. Table 11.1
lists commonly used indexes in SQL Server.
➢ Page number
➢ Page type
➢ Amount of free space on the page
➢ Allocation unit ID of the object to which the page is allocated
Primary
• A primary data file is automatically created at the time of creation of the database. This file has
references to all other files in the database. The recommended file extension for primary data files
is .mdf.
Secondary
• These are optional user-defined data files. Data can be spread across multiple disks by putting each
file on a different disk drive. Recommended file name extension for secondary data files is .ndf.
Transaction Log
• Log files contain information about modifications carried out in the database. This information is
useful in recovery of data in contingencies such as sudden power failure or the need to shift the
database to a different server. There is at least one log file for each database. The recommended file
extension for log files is .ldf.
Indexes are automatically created when PRIMARY KEY and UNIQUE constraints are defined on a
table. Indexes reduce disk I/O operations and consume fewer system resources.
The CREATE INDEX statement is used to create an index. Following is the syntax for this statement:
Syntax
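A simplified general form of the statement (shown as a reference sketch) is:
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
ON table_name (column_name [ASC | DESC] [ ,...n ]);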
where,
index_name: specifies the name of the index.
table_name: specifies the name of the table.
column_name: specifies the name of the column.
Indexes point to the location of a row on a data page instead of searching through the table. Consider the
following facts and guidelines about indexes:
➢ Indexes increase the speed of queries that join tables or perform sorting operations.
➢ Indexes implement the uniqueness of rows if defined when you create an index.
➢ Indexes are created and maintained in ascending or descending order.
11.1.5 Scenario
In a telephone directory, where a large amount of data is stored and is frequently accessed, the storage of
data is done in an alphabetical order. If such data were unsorted, it would be nearly impossible to search
for a specific telephone number.
Similarly, in a database table having a large number of records that are frequently accessed, the data is to be
sorted for fast retrieval. When an index is created on the table, the index either physically or logically sorts
the records. Thus, searching for a specific record becomes faster and there is less strain on system resources.
This index will create logical chunks of data rows based on the department. This again will limit the
amount of data actually scanned during query retrieval.
Hence, retrieval will be faster and there will be less strain on system resources.
Note - If an allocation unit contains extents from more than one file, there will be multiple IAM
pages linked together in an IAM chain to map these extents.
Leaf nodes contain the data pages of the underlying table; root and intermediate level nodes contain index
pages holding index rows.
Each index row contains a key value and a pointer to either an intermediate level page in the B-tree or a
data row in the leaf level of the index.
By default, a clustered index has a single partition. When a clustered index has multiple partitions, each
partition has a B-tree structure that contains the data for that specific partition.
The clustered index will also have one LOB_DATA allocation unit per partition if it contains large object (LOB)
columns. It will also have one ROW_OVERFLOW_DATA allocation unit per partition if it contains variable
length columns that exceed the 8,060 byte row size limit.
Nonclustered indexes have a similar B-Tree structure as clustered indexes, but with the following
differences:
➢ The data rows of the table are not physically stored in the order defined by their nonclustered
keys.
➢ In a nonclustered index structure, the leaf level contains index rows.
➢ Nonclustered indexes are useful when you require multiple ways to search data. Some facts and
guidelines to be considered before creating a nonclustered index are as follows:
➢ When a clustered index is re-created or the DROP_EXISTING option is used, SQL Server rebuilds the
existing nonclustered indexes.
➢ A table can have up to 999 nonclustered indexes.
➢ Create clustered index before creating a nonclustered index.
Columnstore indexes use both types of data storage formats: rowstore and columnstore.
Details of columnstore, rowstore, and deltastore data storage formats are as follows:
Columnstore • Data that is logically organized as a table with rows and columns, but physically
stored in a column-wise data format.
Rowstore • Data that is logically organized as a table with rows and columns and physically
stored in a row-wise data format.
Deltastore • A holding place for rows that are too few in number to be compressed into
the columnstore. The deltastore stores these rows in rowstore format.
• Each bucket is eight bytes, which are used to store the memory address of a linked list of key entries.
• Each entry is a value for an index key, plus the address of its corresponding row in the underlying
memory-optimized table.
• Each entry points to the next entry in a linked list of entries, all chained to the current bucket.
Interplay of the hash index and the buckets is summarized in the figure 11.13.
An XML index is useful when queries on XML columns are common in the workload; however, the XML index
maintenance cost during data modification must be considered. It is also useful when XML values are
relatively large and the retrieved parts are relatively small, because building the index avoids parsing the
whole data at run time and benefits from index lookups for efficient query processing.
Types of population
Syntax
CREATE CLUSTERED INDEX index_name ON table_name (column1, column2, ...);
To create a clustered index, a new table named parts is created in the AdventureWorks2019 database under
the production schema, with the column details shown in Code Snippet 1.
Code Snippet 1:
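A minimal sketch of such a table (the column names part_id and part_name are assumed for illustration) is:
CREATE TABLE production.parts (
    part_id INT NOT NULL,
    part_name VARCHAR(100)
);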
Code Snippet 2:
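Assuming the table above, the clustered index used in the later snippets could be created as follows (the index name matches the one referenced in Code Snippet 3):
CREATE CLUSTERED INDEX ix_parts_id
ON production.parts (part_id);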
In a similar way, following syntaxes are used to Rename, Disable, Enable, and Drop indexes:
RENAME INDEX
The sp_rename is a system stored procedure that allows you to rename any user-created object in the
current database including table, index, and column.
Syntax
EXEC sp_rename
index_name,
new_index_name,
N'INDEX';
Code Snippet 3 renames the index ix_parts_id of the production.parts table to index_part_id
and output is shown in figure 11.18.
Code Snippet 3:
EXEC sp_rename
N'production.parts.ix_parts_id',
N'index_part_id',
N'INDEX';
Note: An index can also be renamed in SQL Server Management Studio (SSMS) by right-clicking the
index name and selecting Rename.
DISABLE INDEX
To disable an index, ALTER INDEX statement is used as follows:
Syntax
ALTER INDEX index_name
ON table_name
DISABLE;
Code Snippet 4 shows an example of disabling an index. It disables the index index_part_id on the
part_id column of the production.parts table.
Code Snippet 4:
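A sketch of this statement is:
ALTER INDEX index_part_id
ON production.parts
DISABLE;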
As a result, if the user tries to display all records from the production.parts table, an error is
displayed as shown in figure 11.20.
In addition, the user can also disable all indexes on a table as shown in Code Snippet 5.
Code Snippet 5:
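A sketch of this statement is:
ALTER INDEX ALL
ON production.parts
DISABLE;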
ENABLE INDEX
This statement uses the ALTER INDEX statement to ‘enable’ or rebuild an index on a table.
Syntax
ALTER INDEX index_name
ON table_name
REBUILD;
Figure 11.21 shows an example of enabling an index. It enables index_part_id on the part_id column of the
production.parts table.
Similarly, the user can enable all indexes for the production.parts table, as shown in figure 11.22.
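The statements behind these figures would resemble the following sketch:
ALTER INDEX index_part_id ON production.parts REBUILD;
ALTER INDEX ALL ON production.parts REBUILD;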
DROP INDEX
DROP INDEX statement removes one or more indexes from the current database. Following is the syntax
for DROP INDEX statement:
Syntax
DROP INDEX [IF EXISTS] index_name
ON table_name;
• Specify the name of the index that you want to remove after the DROP INDEX clause.
• Then, specify the name of the table to which the index belongs.
Code Snippet 6 shows code for dropping index INDEX_PART_ID on Production.Parts table.
Code Snippet 6:
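A sketch of this statement is:
DROP INDEX index_part_id
ON production.parts;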
Removing a nonexisting index will result in an error. However, you can use the IF EXISTS option to
conditionally drop the index and avoid the error as shown in figure 11.23.
Syntax
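A simplified general form (shown as a reference sketch) is:
CREATE [NONCLUSTERED] INDEX index_name
ON table_name (column_list);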
• First, specify the name of the index after the CREATE NONCLUSTERED INDEX clause. Note that the
NONCLUSTERED keyword is optional.
Code Snippet 7 shows how to create a nonclustered index on the Sales.Customer table of the
AdventureWorks2019 database.
Code Snippet 7:
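A sketch of such an index (the index name and key column are assumed) is:
CREATE NONCLUSTERED INDEX ix_customer_territory
ON Sales.Customer (TerritoryID);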
A unique index may consist of one or many columns. If a unique index has one column, the values in this
column will be unique. If the unique index has multiple columns, the combination of values in these columns
is unique.
Syntax
Code Snippet 8 shows how to create a unique index on the Sales.Customer table in the AdventureWorks2019
database.
Code Snippet 8:
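A sketch of such a unique index (the index name and column are assumed) is:
CREATE UNIQUE INDEX ux_customer_accountnumber
ON Sales.Customer (AccountNumber);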
Syntax
To create a filtered index, the Sales.Customer table from the AdventureWorks2019 database is used.
Code Snippet 9:
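A sketch of such a filtered index (the index name and filter condition are assumed) is:
CREATE NONCLUSTERED INDEX ix_customer_storeid
ON Sales.Customer (StoreID)
WHERE StoreID IS NOT NULL;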
Note: To improve the key lookup, you can use an index with included columns.
Note: SQL Server 2019 (15.x) supports up to 15,000 partitions by default. In versions earlier than SQL
Server 2012 (11.x), the number of partitions was limited to 1,000 by default.
Partitioning large tables or indexes can have the following manageability and performance benefits.
Benefits of Partitioning
To create a sample partition, the following table with the mentioned details is used.
Specify exactly how the table is going to be partitioned by the partitioning column (in this case, a date
column), along with the range of values included in each partition. Regarding partition boundaries, you can
specify either LEFT or RIGHT, as displayed in Code Snippet 11.
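A sketch of a partition function and scheme for a date column (the names and boundary values are assumed) is:
CREATE PARTITION FUNCTION pf_OrderDate (date)
AS RANGE RIGHT FOR VALUES ('2021-01-01', '2022-01-01', '2023-01-01');
GO
CREATE PARTITION SCHEME ps_OrderDate
AS PARTITION pf_OrderDate ALL TO ([PRIMARY]);
GO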
Code Snippet 12 determines the partition in which each record is placed.
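Such a check can be sketched with the $PARTITION function against the partition function defined above (function name assumed):
SELECT $PARTITION.pf_OrderDate('2021-06-15') AS PartitionNumber;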
Following figure 11.29 displays how XML data is stored in Instructions column of
Production.ProductModel table in AdventureWorks2019 database.
XML indexes are created on such XML data columns stored in tables and databases. Code Snippet 13 is
used to create a primary XML index named PXML_ProductModel_CatalogDescription on the
CatalogDescription column of the Production.ProductModel table.
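Such a statement might look like the following sketch:
CREATE PRIMARY XML INDEX PXML_ProductModel_CatalogDescription
ON Production.ProductModel (CatalogDescription);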
A primary XML index contains all the data in the XML column. To provide an additional performance boost
to XML queries, you can add specialized secondary XML indexes. A secondary XML index covers the same
data as its underlying primary index, but it creates a more specific index based on the primary index, as
shown in Code Snippet 14.
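A sketch of such a secondary index (the index name and the FOR PATH type are assumed) is:
CREATE XML INDEX IXML_ProductModel_CatalogDescription_Path
ON Production.ProductModel (CatalogDescription)
USING XML INDEX PXML_ProductModel_CatalogDescription
FOR PATH;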
Figure 11.31 shows that the secondary XML index specified in Code Snippet 14 is created.
Once the columnstore index is created, the basic aggregation query in Code Snippet 16, similar to what you
would see in any data warehouse environment, is used to test the created columnstore index.
SELECT ProductID,SUM(OrderQty)
FROM Sales.SalesOrderDetail
GROUP BY ProductId;
This query simply sums the order quantity for each product. By looking at the estimated execution plan,
one can see that the entire query was satisfied by scanning the columnstore index and no access to the
base table was required, as shown in figure 11.33.
3. Creating and maintaining a full-text index involves populating the index by using a process called
a _________________ and is also called as ________________.
b. Create a table named BooksMaster to store the details of the books in the library as shown in
table 11.2.
c. Create a clustered index named IX _ Title on the Title column in the BooksMaster table.
d. Create a table BooksMaster1 having field names BookCode, Title, and Book Details.
e. Specify the data type for BookDetails as xml. Create an XML document with details of ISBN,
Author, Price, Publisher, and NumPages.
f. The library wants to retrieve the publisher name of the company which prints a specific author's
book.
g. Create a primary XML index PXML_Books on the BookCode column of the BooksMaster
table.
This session explains the triggers and different types of triggers. The session also
describes the procedure to create and alter DML triggers, nested triggers, update
functions, handling multiple rows in a session, LOGON trigger, and performance
implication of triggers.
Unlike CHECK constraints, triggers can reference the columns in other tables. This feature can be used to
apply complex data integrity checks. Data integrity can be enforced by:
Checking constraints before cascading updates or deletes.
Creating multi-row triggers for actions executed on multiple rows.
Enforcing referential integrity between databases.
• INSERT trigger
• UPDATE trigger
• DELETE trigger
Code Snippet 1 creates two tables with mentioned details into AdventureWorks2019 database.
Code Snippet 1:
CREATE TABLE Locations (LocationID int, LocName varchar(100));
CREATE TABLE LocationHistory (LocationID int, ModifiedDate DATETIME);
The trigger, TRIGGER_INSERT_Locations will get executed when a record is inserted in Locations
table. It then inserts LocationID and current date from inserted table to LocationHistory table.
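A sketch of the trigger described above (assuming the table definitions from Code Snippet 1) is:
CREATE TRIGGER TRIGGER_INSERT_Locations
ON dbo.Locations
FOR INSERT
AS
INSERT INTO LocationHistory (LocationID, ModifiedDate)
SELECT LocationID, GETDATE() FROM inserted;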
To test the created INSERT trigger, code from Code Snippet 3 is used.
Code Snippet 3:
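A sketch of such a test (the sample values are assumed; 443101 matches the ID used in later snippets) is:
INSERT INTO dbo.Locations (LocationID, LocName) VALUES (443101, 'Boston');
SELECT * FROM LocationHistory;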
The trigger, TRIGGER_UPDATE_Locations is created on Locations table and will get executed when a
record is updated in Locations table. It then inserts the LocationID and current date from inserted
table to LocationHistory table. The FOR UPDATE clause specifies that this DML trigger will be
invoked after the update operations.
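A sketch of such an UPDATE trigger (trigger body assumed to mirror the INSERT trigger) is:
CREATE TRIGGER TRIGGER_UPDATE_Locations
ON dbo.Locations
FOR UPDATE
AS
INSERT INTO LocationHistory (LocationID, ModifiedDate)
SELECT LocationID, GETDATE() FROM inserted;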
To test the created UPDATE trigger, code from Code Snippet 5 is used.
Code Snippet 5:
UPDATE dbo.Locations
SET LocName='Atlanta'
Where LocationID=443101;
Then, SELECT all records from LocationHistory table to check TRIGGER_UPDATE_Locations is
executed or not as shown in figure 12.4.
• The record is deleted from the trigger table and inserted in the Deleted table.
• If there is a constraint on the record to prevent deletion, the DELETE trigger displays an error message.
• The deleted record stored in the Deleted table is copied back to the trigger table.
A DELETE trigger is created using the DELETE keyword in the CREATE TRIGGER statement as shown in
Code Snippet 6.
Code Snippet 6:
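A sketch of such a DELETE trigger (trigger body assumed) is:
CREATE TRIGGER TRIGGER_DELETE_Locations
ON dbo.Locations
FOR DELETE
AS
INSERT INTO LocationHistory (LocationID, ModifiedDate)
SELECT LocationID, GETDATE() FROM deleted;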
The trigger TRIGGER_DELETE_Locations is created on Locations table and will get executed when a
record is deleted from Locations table. It then inserts the LocationID and current date from deleted
table to LocationHistory table.
To test the created DELETE trigger, code from Code Snippet 7 is used.
Code Snippet 7:
DELETE FROM dbo.Locations
Where LocationID=443101;
Then, SELECT all records from LocationHistory table to check TRIGGER_DELETE_Locations is
executed or not as shown in figure 12.5.
Code Snippet 8 creates an AFTER INSERT trigger on the Locations table. If a new record is inserted in
the table, the AFTER INSERT trigger activates. The trigger inserts the LocationID and current date from
inserted table to LocationHistory table.
Code Snippet 8:
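A sketch of such a trigger (trigger body assumed; the name matches the one tested below) is:
CREATE TRIGGER AFTER_INSERT_Locations
ON dbo.Locations
AFTER INSERT
AS
INSERT INTO LocationHistory (LocationID, ModifiedDate)
SELECT LocationID, GETDATE() FROM inserted;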
To test the created AFTER INSERT trigger, code from Code Snippet 9 is used.
Code Snippet 9:
INSERT INTO dbo.Locations (LocationID,LocName) VALUES (443103,'SAN ROMAN');
Then, SELECT all records from LocationHistory table to check AFTER_INSERT_Locations is executed
or not as shown in figure 12.7.
Code Snippet 10 creates the trigger INSTEADOF_DELETE_Locations for delete event on Locations
table.
When a DELETE statement is issued against the table, the INSTEAD OF trigger fires and the Transact-SQL block
inside the trigger is executed, but the actual delete operation does not happen.
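A sketch of such a trigger (the message text is assumed) is:
CREATE TRIGGER INSTEADOF_DELETE_Locations
ON dbo.Locations
INSTEAD OF DELETE
AS
PRINT 'DELETE on Locations was intercepted; no rows were removed.';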
where,
[triggerschema.] triggername: is the name of the DML or DDL trigger and the schema to
which it belongs and whose order needs to be specified.
value: specifies the execution order of the trigger as FIRST, LAST, or NONE. If FIRST is
specified, then the trigger is fired first. If LAST is specified, the trigger is fired last. If NONE is
specified, the order of the firing of the trigger is undefined.
statement_type: specifies the type of SQL statement (INSERT, UPDATE, or DELETE) that invokes
the DML trigger.
Code Snippet 12 specifies that the TRIGGER_DELETE_Locations trigger defined on the Locations table
is fired first when a DELETE operation is performed on a table row.
Code Snippet 12:
EXEC sp_settriggerorder @triggername = 'TRIGGER_DELETE_Locations', @order =
'FIRST', @stmttype = 'DELETE'
where,
DML_trigger_name: specifies the name of the DML trigger whose definitions are to be
displayed.
Code Snippet 13 displays the definition of the trigger, TRIGGER_DELETE_Locations, created on the
Locations table.
Code Snippet 13:
sp_helptext TRIGGER_DELETE_Locations
If the user wants to modify any of these parameters for a DML trigger, a user can do so in any one of two
ways:
If the object referencing a DML trigger is renamed, the trigger must be modified to reflect the change
in object name.
Code Snippet 14 alters the TRIGGER_UPDATE_Locations trigger created on the Locations table
using the WITH ENCRYPTION option.
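A sketch of such a statement (the trigger body is assumed to remain the same as the original) is:
ALTER TRIGGER TRIGGER_UPDATE_Locations
ON dbo.Locations
WITH ENCRYPTION
FOR UPDATE
AS
INSERT INTO LocationHistory (LocationID, ModifiedDate)
SELECT LocationID, GETDATE() FROM inserted;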
Now, if the user tries to view the definition of the TRIGGER_UPDATE_Locations trigger using the
sp_helptext stored procedure, the following error message is displayed:
DDL triggers can invoke an event or display a message based on the modifications attempted on the
schema. DDL triggers are defined either at the database level or at the server level. Figure 12.12 displays
different types of DDL triggers.
where,
ALL SERVER: specifies that the DDL trigger executes when DDL events occur in the current
server.
DATABASE: specifies that the DDL trigger executes when DDL events occur in the current
database.
event_type: specifies the name of the DDL event that invokes the DDL trigger.
Code Snippet 16 creates a DDL trigger for dropping and altering a table.
Code Snippet 16:
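A sketch of such a DDL trigger (the trigger name and message are assumed) is:
CREATE TRIGGER Prevent_Table_Changes
ON DATABASE
FOR DROP_TABLE, ALTER_TABLE
AS
BEGIN
    PRINT 'Tables in this database cannot be dropped or altered.';
    ROLLBACK;
END;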
In this code, the DDL trigger is created for DROP TABLE and ALTER TABLE statements.
When a record is deleted from the HumanResources.Employee table, the Employee_Deletion trigger
is activated and a message is displayed. Also, the record of the employee is deleted from the
HumanResources.EmployeePayHistory table.
Code Snippet 18 creates an AFTER DELETE trigger Deletion_Confirmation on the
HumanResources.EmployeePayHistory table. When a record is deleted from the
HumanResources.EmployeePayHistory table, the Deletion_Confirmation trigger is activated. This trigger
prints a confirmation message for the record being deleted.
Thus, the Employee_Deletion and the Deletion_Confirmation triggers are seen to be nested.
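A sketch of the two triggers (trigger bodies assumed; the matching column is assumed to be BusinessEntityID) is:
CREATE TRIGGER Employee_Deletion
ON HumanResources.Employee
AFTER DELETE
AS
BEGIN
    PRINT 'Employee record deleted; removing related pay history.';
    DELETE FROM HumanResources.EmployeePayHistory
    WHERE BusinessEntityID IN (SELECT BusinessEntityID FROM deleted);
END;
GO
CREATE TRIGGER Deletion_Confirmation
ON HumanResources.EmployeePayHistory
AFTER DELETE
AS
PRINT 'A pay history record has been deleted.';
GO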
12.17 UPDATE()
The UPDATE() function returns a Boolean value that specifies whether an UPDATE or INSERT action was
performed on a specified column of a table or view.
UPDATE() function can be used anywhere inside the body of a Transact-SQL UPDATE or INSERT
trigger to test whether the trigger should execute some actions.
Following is the syntax for UPDATE():
Syntax
UPDATE (column );
where,
column: is the name of the column to test for either an INSERT or UPDATE action.
Code Snippet 19 creates a trigger Accounting on the Production.TransactionHistory table that tests
whether the TransactionID or ProductID columns are being updated.
Code Snippet 19:
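A sketch of such a trigger (the error message and severity are assumed) is:
CREATE TRIGGER Accounting
ON Production.TransactionHistory
AFTER UPDATE
AS
IF (UPDATE(TransactionID) OR UPDATE(ProductID))
BEGIN
    RAISERROR ('TransactionID or ProductID cannot be modified.', 16, 1);
    ROLLBACK TRANSACTION;
END;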
USE AdventureWorks2019;
GO
CREATE TRIGGER PODetails
ON Purchasing.PurchaseOrderDetail AFTER INSERT AS
UPDATE PurchaseOrderHeader
SET SubTotal = SubTotal + LineTotal FROM inserted
WHERE PurchaseOrderHeader.PurchaseOrderID = inserted.PurchaseOrderID;
In this code, the subtotal is calculated and stored for a single-row insert operation. Code Snippet 21 stores
a running total for a multi-row or single-row insert.
USE AdventureWorks2019;
GO
CREATE TRIGGER PODetailsMultiple
ON Purchasing.PurchaseOrderDetail AFTER INSERT AS
UPDATE Purchasing.PurchaseOrderHeader SET SubTotal = SubTotal +
(SELECT SUM(LineTotal) FROM inserted
WHERE PurchaseOrderHeader.PurchaseOrderID
= inserted.PurchaseOrderID)
WHERE PurchaseOrderHeader.PurchaseOrderID IN (SELECT PurchaseOrderID FROM
inserted);
In this code, the subtotal is calculated and stored for a multi-row or single-row insert operation.
LOGON triggers are created at the server level and are useful in following cases:
• To audit login activity
• To control the login activity
Use EVENTDATA() and write custom code to track or control the connections. Code Snippet 22
creates a simple LOGON trigger in SQL Server. This LOGON trigger tracks login activity on the
SQL Server instance and inserts the event-related XML data and the date of the login activity into the
LoginActivity table in the AdventureWorks2019 database.
-- The trigger name is assumed for illustration; LoginActivity resides in AdventureWorks2019
CREATE TRIGGER Track_Logon_Activity ON ALL SERVER FOR LOGON
AS
BEGIN
    INSERT INTO AdventureWorks2019.dbo.LoginActivity
    SELECT EVENTDATA(), GETDATE()
END;
Note: Users must be cautious while creating these triggers, as logins may fail if the trigger execution fails
or if the trigger references objects that the login does not have access to. In such cases, only a
member of the sysadmin role can connect to the server using a dedicated administrator connection.
Therefore, it is always better to enable the dedicated administrator connection when using these triggers.
(A) a, b, c (C) a, b ,d
(B) b, c, d (D) a, c, d
2. Match the description with different types of DML triggers in SQL Server 2019.
(A) a-1, b-4, c-2, d-3, e-5 (C) a-2, b-4, c-3, d-5, e-1
(B) a-2, b-4, c-5, d-1, e-3 (D) a-1, b-2, c-3, d-4, e-5
(A) a, b (C) a, d
(B) c, d (D) b, d
4. Which of these statements about the working with DML triggers of SQL Server 2019 are true?
a. Each triggering action cannot have multiple AFTER triggers.
b. Two triggering actions on a table can have the same first and last triggers.
c. DML trigger definition can be modified by dropping and recreating the trigger.
d. DML trigger definition can be viewed using the sp_helptext stored procedure.
(A) a, c (C) b, d
(B) b, c (D) c, d
5. ______________ triggers can be used to perform functions such as storing the backup
of the rows that are affected by previous actions.
a. Create the following tables in GalaxyAirlines database. Table 12.2 lists the Flight table.
b. Write statements to create a trigger trgCheckSeats that will activate whenever a new row
is being inserted into the Flight_Details table. The maximum limit of seats that a flight
can contain is 150. The trigger should check for the value of seats being inserted. If it is
more than 150, the INSERT operation is not allowed to succeed.
d. Write statements to create a trigger UpdateValid that will activate whenever a row is being
updated in the Flight_Details table. The trigger should determine if the Seats column is present
in the list of columns being updated. If yes, the UPDATE operation should not succeed
because the Seats column is defined as a constant and cannot be changed.
e. Write statements to create a DDL trigger ProhibitDelete that will activate whenever a user is
trying to delete a table from the Galaxy Airlines database. The trigger must not allow a user
to perform deletes and must display the message 'You are not allowed to delete tables in this
database'.
13.1 Introduction
Transact-SQL programming is a procedural language extension to SQL. It extends SQL by adding
subroutines and programming structures similar to those of high-level languages. Like
high-level languages, Transact-SQL programming also has rules and syntax that control and enable
programming statements to work together. Users can control the flow of programs by using conditional
statements such as IF and loops such as WHILE.
• Scripts
A script is a series of Transact-SQL statements stored in a file that is used as input to the SSMS code editor
or the sqlcmd utility.
Following features enable users to work with Transact-SQL statements:
• Variables
A variable allows a user to store data that can be used as input in a Transact-SQL statement.
• Control-of-flow
Control-of-flow is used for including conditional constructs in Transact-SQL.
• Error Handling
Error handling is a mechanism used for handling errors; it provides information to the
users about the error that occurred.
• Most of the run-time errors stop the current statement and the statements that follow in the batch.
• Some run-time errors, such as a constraint violation, stop only the current statement; the
remaining statements in the batch are executed.
The SQL statements that execute before the run-time error is encountered are unaffected. The only
exception is when the batch is in a transaction and the error results in the transaction being rolled back.
For example, suppose there are 10 statements in a batch and the sixth statement has a syntax error; then
none of the statements in the batch will execute. If the batch is compiled and the third statement
fails to run, the results of the first two statements remain unaffected because they have already executed.
In Code Snippet 1, a view is created in a batch. Because CREATE VIEW must be the only statement in a batch,
the GO commands are essential to separate the CREATE VIEW statement from the SELECT and USE
statements. This was a simple example to demonstrate the use of a batch. In the real world, a large
number of statements may be used within a single batch. It is also possible to combine two or more
batches within a transaction.
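A batch arrangement of the kind described for Code Snippet 1 might look like the following sketch (the view name and query are assumed):
USE AdventureWorks2019;
GO
CREATE VIEW dbo.vw_PersonNames
AS
SELECT FirstName, LastName FROM Person.Person;
GO
SELECT TOP (5) FirstName, LastName FROM dbo.vw_PersonNames;
GO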
Code Snippet 2 shows an example of this.
Code Snippet 2:
BEGIN TRANSACTION
GO
USE AdventureWorks2019;
GO
CREATE TABLE Company (
Id_Num int IDENTITY(100, 5),
Company_Name nvarchar(100))
GO
INSERT Company (Company_Name) VALUES (N'A Bike Store')
INSERT Company (Company_Name) VALUES (N'Progressive Sports')
INSERT Company (Company_Name) VALUES (N'Modular Cycle Systems')
INSERT Company (Company_Name) VALUES (N'Advanced Bike Components')
INSERT Company (Company_Name) VALUES (N'Metropolitan Sports Supply')
INSERT Company (Company_Name) VALUES (N'Aerobic Exercise Company')
INSERT Company (Company_Name) VALUES (N'Associated Bikes')
INSERT Company (Company_Name) VALUES (N'Exemplary Cycles')
GO
SELECT Id_Num, Company_Name FROM dbo.Company
ORDER BY Company_Name ASC;
GO
COMMIT;
GO
In Code Snippet 2, several batches are combined into one transaction. The BEGIN TRANSACTION and
COMMIT statements enclose the transaction statements. The CREATE TABLE, BEGIN TRANSACTION,
SELECT, COMMIT, and USE statements are in single-statement batches. The INSERT statements are all
included in one batch.
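Local variables are declared with the DECLARE statement. A simplified general form (shown as a reference sketch) is:
DECLARE @local_variable data_type [ = value ] [ ,...n ];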
where,
@local_variable: specifies the name of the variables and begins with @ sign.
data_type: specifies the data type. A variable cannot be of image, text, or ntext data type.
=value: Assigns an inline value to a variable. The value can be an expression or a constant value.
The value should match with the variable declaration type or it should be implicitly converted to
that type.
Code Snippet 3 uses a local variable to retrieve contact information for the last names starting with
‘Man’.
Code Snippet 3:
USE AdventureWorks2019;
GO
DECLARE @find varchar(30) = 'Man%';
SELECT p.LastName, p.FirstName, ph.PhoneNumber FROM Person.Person AS p
JOIN Person.PersonPhone AS ph ON p.BusinessEntityID = ph.BusinessEntityID
WHERE LastName LIKE @find;
In Code Snippet 3, a local variable named @find is used to store the search criteria, which will then be
used to retrieve the contact information. Here, the criteria include all last names beginning with 'Man'.
Figure 13.1 displays the output.
Syntax
SET
{ @local_variable = expression }
|
{ @local_variable { += | -= | *= | /= | %= | &= | ^= | |= } expression }
where,
@local_variable: specifies the name of the variable and begins with @ sign.
=: Assigns the value on the right side to the variable on the left side.
{= | += | -= | *= | /= | %= | &= | ^= | |= }: specifies the compound
assignment operators.
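A variable can also be assigned with the SELECT statement. A simplified general form for this use of SELECT (shown as a reference sketch) is:
SELECT @local_variable = expression [ ,...n ]
[ FROM table_source ] [ WHERE search_condition ];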
where,
@local_variable: specifies the local variable to which a value will be assigned.
=: Assigns the value on the right-hand side to the variable on the left-hand side.
{= | += | -= | *= | /= | %= | &= | ^= | |= }: specifies the compound
assignment operators.
expression: specifies any valid expression which can even include a scalar subquery.
Code Snippet 5 shows how to use SELECT to return a single value.
Code Snippet 5:
USE AdventureWorks2019;
GO
DECLARE @var1 nvarchar(30);
SELECT @var1 = 'Unnamed Company';
SELECT @var1 = Name FROM Sales.Store WHERE BusinessEntityID = 10;
SELECT @var1 AS 'Company Name';
In Code Snippet 5, the variable @var1 is assigned Unnamed Company as its value.
The query against the Store table will return zero rows as the value specified for the BusinessEntityID
does not exist in the table. The variable will then, retain the Unnamed Company value and will be
displayed with the heading Company Name. Figure 13.2 displays the output.
• It is possible to assign only one variable at a time using SET. However, using SELECT you can
make multiple assignments at once.
• SET can only assign a scalar value when assigning from a query. It raises an error and does not
work if the query returns multiple values/rows. However, SELECT assigns one of the returned
values to the variable and the user will not even know that multiple values were returned.
13.3 Synonyms
Synonyms are database objects that serve the following purposes:
• They offer another name for a different database object, also called the base object, which may
exist on a remote or local server.
• They present a layer of abstraction that guards a client application from the modifications made
to the location and the name of the base object.
For example, consider that the Department table of AdventureWorks2019 is located on the first server
named Server1. To reference this table from the second server, Server2, a client application would have
to use the following four-part name:
Server1.AdventureWorks2019.HumanResources.Department
If the location of the table was modified, for example, to another server, the client application would
have to be rectified to reflect that change. To address both these issues, users can create a
synonym, DeptEmpTable, on Server2 for the Department table on Server1.
Now, the client application only has to use the single name, DeptEmpTable, to refer to the
Department table.
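A sketch of such a synonym definition (assuming Server1 is configured as a linked server) is:
CREATE SYNONYM DeptEmpTable
FOR Server1.AdventureWorks2019.HumanResources.Department;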
Note - A synonym is a part of schema, and similar to other schema objects, the synonym name must
be unique.
Table 13.1 lists the database objects for which the users can create synonyms.
Database Objects
Extended stored procedure
SQL table-valued function
SQL stored procedure
Table (User-defined)
Replication-filter-procedure
SQL scalar function
SQL inline-tabled-valued function
View
Table 13.1: Database Objects
Permissions
DELETE
INSERT
TAKE OWNERSHIP
VIEW DEFINITION
CONTROL
EXECUTE
SELECT
UPDATE
Table 13.2: Permissions
1. In Object Explorer, expand the database where you want to create a new synonym.
2. Select the Synonyms folder, right-click it and then, click New Synonym as shown in figure 13.3.
3. In the New Synonym dialog box, provide the information as shown in figure 13.4.
where,
Synonym name: is the new name for the object. Here, Emp is the name.
Synonym schema: is the new name for the schema object. Here, the same schema name
HumanResources is used for the synonym and the object type.
Server name: is the name of the server to be connected. Here, the server name is specified as
10.2.110.140.
Database name: is the database name to connect the object. Here, AdventureWorks2019
is the database name.
Schema: is the schema that owns the object.
Object type and Object name: is the object type and name respectively. Here, the object type
selected is view and the object name that refers to the synonym is
vEmployeeDepartmentHistory.
Code Snippet 6:
USE AdventureWorks2019;
GO
CREATE SYNONYM MyAddressType
FOR AdventureWorks2019.Person.AddressType;
GO
In Code Snippet 6, a synonym is created from an existing table present in the AdventureWorks2019
database.
In Code Snippet 7, BEGIN and END statements define a sequence of Transact-SQL statements that are
executed together. If the BEGIN and END were not included, the ROLLBACK TRANSACTION
statements would execute and both PRINT messages would be displayed.
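A block of the kind described above might look like the following sketch (the exact statements are assumed for illustration):
BEGIN TRANSACTION;
GO
IF @@TRANCOUNT = 0
BEGIN
    SELECT FirstName, MiddleName
    FROM Person.Person WHERE LastName = 'Adams';
    ROLLBACK TRANSACTION;
    PRINT N'Rolling back the transaction two times would cause an error.';
END;
ROLLBACK TRANSACTION;
PRINT N'Rolled back the transaction.';
GO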
Code Snippet 8:
USE AdventureWorks2019;
GO
DECLARE @ListPrice money;
SET @ListPrice = (SELECT MAX(p.ListPrice) FROM Production.Product AS p
JOIN Production.ProductSubcategory AS s
ON p.ProductSubcategoryID = s.ProductSubcategoryID WHERE s.[Name] = 'Mountain
Bikes');
PRINT @ListPrice
IF @ListPrice <3000
PRINT 'All the products in this category can be purchased for an amount less
than 3000'
ELSE
PRINT 'The prices for some products in this category exceed 3000'
In Code Snippet 8, the IF…ELSE statement is used to form a conditional statement. First, a variable
@ListPrice is defined and a query is created to return the maximum list price of the product category
Mountain Bikes. Then, this price is compared with a value of 3000 to determine if products can be
purchased for an amount less than 3,000. If yes, an appropriate message is printed using the first PRINT
statement. If not, then the second PRINT statement executes.
• WHILE
The WHILE statement specifies a condition for the repetitive execution of the statement block. The
statements are executed repetitively as long as the specified condition is true. The execution of statements
in the WHILE loop can be controlled by using the BREAK and CONTINUE keywords.
where,
Boolean_expression: specifies the expression that returns TRUE or FALSE values.
{sql_statement | statement_block}: Is any Transact-SQL statement that defines the
statement block.
BREAK: Results in an exit from the innermost WHILE loop. Any statements that appear
after the END keyword, which marks the end of the loop, are executed.
CONTINUE: Results in the WHILE loop being restarted. The statements after the CONTINUE
keyword within the body of the loop are not executed.
Code Snippet 9 shows the use of WHILE statement.
Code Snippet 9:
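A sketch of such a loop (the variable name is assumed) is:
DECLARE @counter int = 10;
WHILE @counter <= 95
BEGIN
    IF @counter % 2 = 0
        PRINT @counter;
    SET @counter = @counter + 1;
END;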
Using Code Snippet 9, all even numbers from 10 through 95 are displayed. This is achieved
using a WHILE loop along with an IF statement. Similarly, a WHILE loop can also be used with queries
and other Transact-SQL statements.
Function Description
CONVERT Is deterministic only if one of these conditions exists:
• Scalar-valued Functions
A Scalar-valued Function (SVF) always returns an int, bit, or string value. The data type returned
from and the input parameters of SVF can be of any data type except text, ntext, image,
cursor, and timestamp.
An inline scalar function has a single statement and no function body. A multi-statement scalar
function encloses the function body in a BEGIN...END block.
• Table-valued Functions
Table-valued functions are user-defined functions that return a table. Similar to an inline scalar
function, an inline table-valued function has a single statement and no function body.
Code Snippet 10 shows the creation of a table-valued function.
Code Snippet 10:
USE AdventureWorks2019;
GO
IF OBJECT_ID (N'Sales.ufn_CustDates', N'IF') IS NOT NULL DROP FUNCTION
Sales.ufn_CustDates;
GO
CREATE FUNCTION Sales.ufn_CustDates () RETURNS TABLE
AS RETURN (
SELECT A.CustomerID, B.DueDate, B.ShipDate FROM Sales.Customer A
LEFT OUTER JOIN
Sales.SalesOrderHeader B ON
A.CustomerID = B.CustomerID AND YEAR(B.DueDate)<2020
);
Here, an inline table-valued function defines a left outer join between the tables Sales.Customer
and Sales.SalesOrderHeader.
Tables are joined based on customer ids. In this case, all records from the left table and only
matching records from the right table are returned. The resultant table is then returned from the
table-valued function.
The function is invoked as shown in Code Snippet 11.
Code Snippet 11:
SELECT * FROM Sales.ufn_CustDates();
• Permissions
The ALTER permission is required on the schema or the function. If the function specifies a
user-defined type, then it requires the EXECUTE permission on the type.
• Partitioning
Partitioning is a feature that limits the window of the current calculation to only those rows from
the resultset that contain the same values in the partition columns as the current row. It uses
the PARTITION BY clause.
Code Snippet 13 demonstrates use of the PARTITION BY and OVER clauses with aggregate functions.
Here, using the OVER clause proves to be more efficient than using subqueries to calculate the
aggregate values.
Code Snippet 13:
USE AdventureWorks2019;
GO
SELECT SalesOrderID, ProductID, OrderQty
,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS Total
,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS MaxOrderQty FROM
Sales.SalesOrderDetail
WHERE ProductId IN(776, 773);
GO
Code Snippet 14 makes use of the RANK() function which returns the rank of each row in the partition
of a resultset. The rank of a row is determined by adding 1 to the number of ranks that come before
the specified row. For example, while using descending ordering, the RANK() function returns one
more than the number of rows in the respective partition that has a greater ordering value than the
specified one.
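A sketch of such a query (the partition and ordering columns are assumed) is:
SELECT SalesOrderID, ProductID, OrderQty,
       RANK() OVER (PARTITION BY SalesOrderID ORDER BY OrderQty DESC) AS Rnk
FROM Sales.SalesOrderDetail
WHERE ProductID IN (776, 773);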
Figure 13.6 displays the output of Code Snippet 14.
Code Snippet 15 makes use of the RANK() function which returns the rank of each row in the partition
of a resultset. In general, the rank of a row is determined by adding 1 to the number of ranks that come
before the specified row. Here in this code, the first RANK() function generates the attribute Rnk_One
that depends on the default partitioning, and the second RANK function generates Rnk_Two that uses
explicit partitioning by TerritoryID.
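A sketch of such a query (the table and ordering column are assumed) is:
SELECT SalesOrderID, TerritoryID, TotalDue,
       RANK() OVER (ORDER BY TotalDue DESC) AS Rnk_One,
       RANK() OVER (PARTITION BY TerritoryID ORDER BY TotalDue DESC) AS Rnk_Two
FROM Sales.SalesOrderHeader;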
Figure 13.7 displays the partitions defined for a sample of three results of calculations in the query:
one Rnk_One value and two Rnk_Two values.
• Framing
Framing is a feature that enables you to specify a further division of rows within a window partition.
This is done by assigning upper and lower boundaries for the window frame that presents rows to
the window function. In simple terms, a frame is similar to a moving window over the data that
starts and ends at specified positions. Window frames can be defined using the ROW or RANGE
subclauses and providing starting and ending boundaries.
Code Snippet 16 displays a query against the ProductInventory, calculating the running total
quantity for each product and location.
Code Snippet 16:
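A sketch of such a query (the column list is assumed) is:
SELECT ProductID, LocationID, Quantity,
       SUM(Quantity) OVER (PARTITION BY ProductID
                           ORDER BY LocationID
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM Production.ProductInventory;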
In Code Snippet 16, the window function applies the SUM aggregate to the attribute Quantity,
partitions the window by ProductID, orders the partition rows by LocationID, and frames the
partition rows depending on the given ordering between unbounded preceding (no low boundary
point) and the current row. In other words, the result will be the sum of all prior rows in the frame,
including the current row. Figure 13.8 displays the output of Code Snippet 16.
• Ranking functions
These functions return a rank value for each row in a partition. Based on the function that is used,
many rows will return the same value as the other rows. Ranking functions are non-deterministic.
Table 13.6 lists various ranking functions.
The NTILE() function breaks a given input collection into N equal sized logical groups. To
determine how many rows belong in each group, SQL Server has to determine the total number
of rows in the input collection. The OVER clause decides the order of the rows when they have
been divided into groups. It is possible to perform the grouping in one order and return the
resultset in another order.
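For example, a sketch such as the following (the table and ordering column are assumed) splits salespeople into four groups by year-to-date sales:
SELECT BusinessEntityID, SalesYTD,
       NTILE(4) OVER (ORDER BY SalesYTD DESC) AS GroupNumber
FROM Sales.SalesPerson;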
where,
year: specifies the integer expression for a year.
month: specifies the integer expression for a month.
day: specifies the integer expression for a day.
hour: specifies the integer expression for an hour.
minute: specifies the integer expression for a minute.
seconds: specifies the integer expression for seconds.
fractions: specifies the integer expression for fractions.
• SYSDATETIMEOFFSET
This function returns a datetimeoffset(7) value that contains the date and time of the computer
on which the instance of SQL Server is running.
Following is the syntax for SYSDATETIMEOFFSET:
Syntax:
SYSDATETIMEOFFSET ();
Code Snippet 20 displays different formats used by the date and time functions.
Code Snippet 20:
SELECT SYSDATETIME() AS SYSDATETIME
,SYSDATETIMEOFFSET() AS SYSDATETIMEOFFSET
,SYSUTCDATETIME() AS SYSUTCDATETIME
Function Description
LEAD Provides access to data from a subsequent row in the same
resultset without using a self-join.
LAST_VALUE Retrieves the last value in an ordered set of values.
LAG Provides access to data from a previous row in the same
resultset without using a self-join.
FIRST_VALUE Retrieves the first value in an ordered set of values.
CUME_DIST Computes the cumulative distribution of a value in a group of
values.
PERCENTILE_CONT Computes a percentile based on a continuous distribution of
the column value in SQL.
PERCENTILE_DISC Calculates a particular percentile for sorted values in an entire
rowset or within distinct partitions of a rowset.
2. Which of the following are used to set and declare local variables provided by SQL Server?
a. DECLARE
b. SET
c. DELETE
d. INSERT
(A) a, d (C) a, b
(B) b, c (D) c, d
4. Which of the following code uses a local variable to retrieve contact information for the last
names starting with ‘Per’?
USE
AdventureWorks2019;
GO
DECLARE @find
varchar(30);
SET @find = 'Per%';
(A) SELECT p.LastName,
p.FirstName,
ph.PhoneNumber FROM
Person.Person AS p
JOIN
Person.PersonPhone AS
ph ON
p.BusinessEntityID =
ph.BusinessEntityID
WHERE LastName LIKE @find;
USE
AdventureWorks2019;
GO
@find varchar(30);
@find varchar(30) =
'Per%'; SET @find =
(C) 'Per%';
SELECT p.LastName, p.FirstName, ph.PhoneNumber
FROM Person.Person AS p
JOIN Person.PersonPhone AS ph ON p.BusinessEntityID =
ph.BusinessEntityID
WHERE LastName LIKE @find;
USE
AdventureWorks2019; GO
SET @find varchar(30);
SET @find varchar(30) =
'Per%'; SET @find = 'Per';
SELECT p.LastName, p.FirstName, ph.PhoneNumber
(D) FROM Person.Person AS p
JOIN Person.PersonPhone AS ph ON p.BusinessEntityID =
ph.BusinessEntityID
WHERE LastName LIKE @find;
2. Maxwel Distribution is a product distribution company having various departments across
various countries. You have to create a country-wise report displaying all unique departments
in the company along with the names of their leaders.
This session explains the types of transactions and the procedure to implement the
transactions. It also describes the process to control and mark a transaction and lists the
differences between the implicit and explicit transactions. The session also covers the
isolation levels, scope, different types of locks, and transaction management.
14.1 Introduction
A transaction is a single unit of work. A transaction is successful only when all data modifications that
are made in a transaction are committed and are saved in the database permanently. If the transaction is
rolled back or cancelled, then it means that the transaction has encountered errors and there are no
changes made to the contents of the database. Hence, a transaction can be either committed or rolled
back.
Suppose the first statement executes correctly but the other statements fail; then the data remains in an
incorrect state.
For example, a good scenario will be the funds transfer activity in a banking system. The transfer of funds
will need an INSERT and two UPDATE statements. First, the user has to increase the balance of the
destination account and then, decrease the balance of the source account. The user has to check that the
transactions are committed and whether the same changes are made to the source account and the
destination account.
➢ Defining Transactions
A logical unit of work must exhibit four properties, called the Atomicity, Consistency, Isolation, and
Durability (ACID) properties, to qualify as a transaction.
➢ Implementing Transactions
SQL Server supports transactions in several modes. Some of these modes are as follows:
Autocommit Transactions: Each single-line statement is automatically committed as soon as it
completes. In this mode, one does not need to write any specific statements to start and end the
transactions. It is the default mode for SQL Server Database Engine.
Explicit Transactions: Each transaction explicitly starts with the BEGIN TRANSACTION
statement and ends with a ROLLBACK or COMMIT transaction.
Implicit Transactions: A new transaction is automatically started when the earlier transaction
completes and every transaction is explicitly completed by using the ROLLBACK or COMMIT
statement.
Batch-scoped Transactions: These transactions are related to Multiple Active Result Sets (MARS).
Any implicit or explicit transaction that starts in a MARS session is a batch-scoped transaction.
Distributed Transactions: These span two or more servers known as resource managers. The
management of the transaction must be coordinated between the resource managers by a server
component called transaction manager. Each instance of the SQL Server Database Engine can
operate as a resource manager in distributed transactions coordinated by transaction managers, such
as Microsoft Distributed Transaction Coordinator (MS DTC), or other transaction managers that
support the Open Group XA specification for distributed transaction processing.
➢ BEGIN TRANSACTION
The BEGIN TRANSACTION statement marks the beginning point of an explicit or local transaction.
Code Snippet 1:
USE AdventureWorks2019;
GO
DECLARE @TranName VARCHAR(30);
SELECT @TranName = 'FirstTransaction';
BEGIN TRANSACTION @TranName;
DELETE FROM HumanResources.JobCandidate WHERE JobCandidateID = 13;
In Code Snippet 1, a transaction name is declared using a variable with value FirstTransaction. A new
transaction with this name is then created having a DELETE statement. As the transaction comprises a
single-line statement, it is implicitly committed.
➢ COMMIT TRANSACTION
The COMMIT TRANSACTION statement marks the end of a successful implicit or explicit transaction. If
@@TRANCOUNT is 1, COMMIT TRANSACTION makes all data modifications performed on the
database a permanent part of the database, releases the resources held by the transaction, and decrements
@@TRANCOUNT to 0. If @@TRANCOUNT is greater than 1, COMMIT TRANSACTION decrements
@@TRANCOUNT by 1 and keeps the transaction in an active state.
Code Snippet 2:
BEGIN TRANSACTION;
GO
DELETE FROM HumanResources.JobCandidate WHERE JobCandidateID = 11;
GO
COMMIT TRANSACTION;
GO
Code Snippet 2 defines a transaction that will delete a job candidate record having
JobCandidateID as 11.
➢ COMMIT WORK
The COMMIT WORK statement marks the end of a transaction. Following is the syntax for the COMMIT WORK
statement:
Syntax
COMMIT [WORK] [ ; ]
COMMIT TRANSACTION and COMMIT WORK are identical except for the fact that COMMIT
TRANSACTION accepts a user-defined transaction name.
➢ Marking a Transaction
Code Snippet 3:
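A sketch of such a marked transaction (the mark description and candidate ID are assumed for illustration) is:
BEGIN TRANSACTION DeleteCandidate
    WITH MARK N'Deleting a Job Candidate';
GO
DELETE FROM HumanResources.JobCandidate
WHERE JobCandidateID = 12;
GO
COMMIT TRANSACTION DeleteCandidate;
GO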
In Code Snippet 3, a transaction named DeleteCandidate is created and marked in the log.
➢ ROLLBACK TRANSACTION
This statement rolls back or cancels an implicit or explicit transaction to the starting point of the
transaction, or to a savepoint inside the transaction. A savepoint is a mechanism to roll back parts of a
transaction. The ROLLBACK TRANSACTION statement erases all data modifications made from the start of
the transaction or to a savepoint. Following is the syntax for the ROLLBACK TRANSACTION statement:
Syntax
ROLLBACK {TRAN | TRANSACTION}
[transaction_name | @tran_name_variable
| savepoint_name | @savepoint_variable]
[ ; ]
where,
transaction_name: specifies the name that is assigned to the BEGIN TRANSACTION statement.
@tran_name_variable: specifies the name of a user-defined variable that contains a valid
transaction name. The variable can be declared as char, varchar, nchar, or nvarchar data type.
savepoint_name: specifies the savepoint_name from a SAVE TRANSACTION statement. Use
savepoint_name only when a conditional roll back affects a part of a transaction.
@savepoint_variable: specifies the name of savepoint variable that contain a valid savepoint
name. The variable can be declared as char, varchar, nchar, or nvarchar data type.
Consider an example that demonstrates the use of ROLLBACK. Assume that a database named Sterling
has been created. A table named ValueTable is created in this database as shown in Code Snippet 4.
Code Snippet 4:
USE Sterling;
GO
CREATE TABLE ValueTable ([value] char)
GO
Code Snippet 5 creates a transaction that inserts two records into ValueTable. Then, it rolls back the
transaction and again inserts one record into ValueTable. When a SELECT statement is used to query
the table, you will see that only a single record with value C is displayed. This is because the earlier INSERT
operations have been rolled back or cancelled.
Code Snippet 5:
BEGIN TRANSACTION
INSERT INTO ValueTable VALUES('A'); INSERT INTO ValueTable VALUES('B');
GO
ROLLBACK TRANSACTION
INSERT INTO ValueTable VALUES('C');
SELECT [value] FROM ValueTable;
➢ ROLLBACK WORK
This statement rolls back a user-specified transaction to the beginning of the transaction. Following is the
syntax for the ROLLBACK WORK statement:
Syntax
ROLLBACK [WORK] [ ; ]
The keyword WORK is optional and is rarely used.
➢ SAVE TRANSACTION
The SAVE TRANSACTION statement sets a savepoint within a transaction. Following is the syntax for the
SAVE TRANSACTION statement:
Syntax
SAVE {TRAN | TRANSACTION} {savepoint_name | @savepoint_variable} [ ; ]
where,
savepoint_name: specifies the savepoint_name assigned.
@savepoint_variable: specifies the name of a user-defined variable that contain a valid savepoint
name. The variable can be declared as char, varchar, nchar, or nvarchar data type.
Code Snippet 6:
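A minimal sketch of how SAVE TRANSACTION is typically used (continuing the ValueTable example; the savepoint name is assumed) is:
BEGIN TRANSACTION;
INSERT INTO ValueTable VALUES('D');
SAVE TRANSACTION SavePoint1;
INSERT INTO ValueTable VALUES('E');
-- Undoes only the insert of 'E'; the insert of 'D' is committed below
ROLLBACK TRANSACTION SavePoint1;
COMMIT TRANSACTION;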
Code Snippet 7 shows the effect that nested BEGIN and COMMIT statements have on the @@TRANCOUNT
variable.
Code Snippet 7:
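A sketch of such a batch (mirroring the ROLLBACK version shown in Code Snippet 8) is:
PRINT @@TRANCOUNT
BEGIN TRAN
PRINT @@TRANCOUNT
BEGIN TRAN
PRINT @@TRANCOUNT
COMMIT
PRINT @@TRANCOUNT
COMMIT
PRINT @@TRANCOUNT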
Code Snippet 7 displays the number of times the BEGIN TRAN and COMMIT statement execute in the
current connection.
Code Snippet 8 shows the effect that nested BEGIN and ROLLBACK statements have on the @@TRANCOUNT
variable.
Code Snippet 8:
PRINT @@TRANCOUNT
BEGIN TRAN
PRINT @@TRANCOUNT
BEGIN TRAN
PRINT @@TRANCOUNT
ROLLBACK
PRINT @@TRANCOUNT
In this case, code snippet 8 displays the number of times the BEGIN and ROLLBACK statements execute
in the current connection.
Figure 14.4 displays the output of Code Snippet 8.
For creating a marked transaction in a set of databases, the following steps are required:
1. Name the transaction in the BEGIN TRAN statement and use the WITH MARK clause.
2. Execute an update against all of the databases in the set.
Code Snippet 9 updates the list price in the Product table of the AdventureWorks2019
database.
Code Snippet 9:
USE AdventureWorks2019;
GO
BEGIN TRANSACTION ListPriceUpdate
WITH MARK 'UPDATE Product list prices';
GO
UPDATE Production.Product
SET ListPrice = ListPrice * 1.20 WHERE ProductNumber LIKE 'BK-%';
GO
COMMIT TRANSACTION ListPriceUpdate;
GO
Transaction isolation levels mainly describe the protection levels from the special effects of changes made
by other transactions for read operations. A lower isolation level increases the capability of several users
to access data at the same time. However, it increases the number of concurrency effects such as dirty reads
or lost updates that users might come across. On the other hand, a higher isolation level decreases the types
of concurrency effects which user may encounter. This requires additional system resources and increases
the chance of one transaction blocking another transaction.
Selecting a suitable isolation level is based on the data integrity requirements of the application as
compared to the overheads of each isolation level. The highest isolation level, serializable, guarantees that a
transaction will retrieve the same data each time it repeats a read operation. It does this by
performing a level of locking that is likely to affect other users in a multi-user system. The lowest
isolation level, read uncommitted, may retrieve data that has been modified but not yet committed by other
transactions. All concurrency side effects can occur in read uncommitted; however, because there is no read
versioning or locking, the overhead is minimized.
Table 14.2 lists the concurrency effects that are allowed by different isolation levels.
Isolation Level      Dirty Read    NonRepeatable Read
Read committed       No            Yes
Read uncommitted     Yes           Yes
Snapshot             No            No
Repeatable Read      No            No
Serializable         No            No
Table 14.2: Isolation Levels
Transactions need to execute at an isolation level of at least repeatable read to prevent lost updates, which can occur when two transactions each retrieve the same row and then later update the row based on the originally retrieved values.
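For instance, an application can request this behavior by setting the isolation level before starting the transaction. A minimal sketch, with an illustrative table name, follows:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    SELECT Quantity FROM dbo.Inventory WHERE ProductID = 1;                 -- read the row
    UPDATE dbo.Inventory SET Quantity = Quantity - 1 WHERE ProductID = 1;   -- update the same row safely
COMMIT TRANSACTION;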
Table 14.3 lists the resource lock modes used by the Database Engine.
Lock Mode      Description
Update         Is used on resources that are to be updated.
Shared         Is used for read operations that do not change data, such as a SELECT statement.
Intent         Is used to establish a hierarchy of locks.
Exclusive      Is used for INSERT, UPDATE, or DELETE data-modification operations.
Bulk Update    Is used while copying bulk data into a table.
Schema         Is used when an operation is dependent on the table schema.
Table 14.3: Lock Modes
Different types of locks are as follows:
➢ Update Locks
These locks prevent a common form of deadlock. In a repeatable read or serializable transaction, the transaction reads data, acquiring a shared lock on the row or page, and then modifies the data, which requires converting the shared lock to an exclusive lock.
When two transactions acquire shared locks on a resource and then try to update the data simultaneously, one transaction attempts the conversion to an exclusive lock. The shared-to-exclusive conversion must wait, because the exclusive lock for one transaction is not compatible with the shared lock held by the other transaction, so a lock wait occurs. Meanwhile, the second transaction also tries to acquire an exclusive lock for its update. Because both transactions are converting to exclusive locks and each waits for the other to release its shared lock, a deadlock occurs. Update locks avoid this problem: only one transaction at a time can obtain an update lock on a resource, and the update lock is converted to an exclusive lock only when the data is actually modified.
➢ Shared Locks
These locks allow concurrent transactions to read a resource under pessimistic concurrency control. No other transaction can modify the data while shared locks exist on the resource. Shared locks on a resource are released as soon as the read operation completes, unless the isolation level is set to repeatable read or higher.
➢ Exclusive Locks
These locks prevent access to a resource by concurrent transactions. With an exclusive lock, no other transaction can change the data, and read operations can take place only under the read uncommitted isolation level or with the NOLOCK hint. DML statements such as INSERT, DELETE, and UPDATE combine modification and read operations: they first perform a read operation to get the data before modifying it, so they usually request both exclusive and shared locks. For example, an UPDATE statement may modify rows in one table based on a join with another table. In that case, the UPDATE statement requests shared locks on the rows it reads from the joined table and exclusive locks on the rows it modifies.
➢ Intent Locks
The Database Engine uses intent locks to protect placing an exclusive or shared lock on a resource that is lower in the lock hierarchy. They are called intent locks because they are acquired before a lock at the lower level and, hence, signal the intent to place locks at that lower level. An intent lock is useful for two purposes:
To prevent other transactions from changing the higher-level resource in a way that would invalidate the lock at the lower level.
To improve the efficiency of the Database Engine in detecting lock conflicts at the higher level of granularity.
For example, a shared intent lock is requested at the table level before shared locks are requested on rows or pages within that table. Setting the intent lock at the table level prevents another transaction from subsequently acquiring an exclusive lock on the table that contains those pages. Intent locks include Intent Shared (IS), Intent Exclusive (IX), and Shared with Intent Exclusive (SIX). Table 14.4 lists the lock modes in intent locks.
Lock Mode                            Description
Intent Shared (IS)                   Protects requested shared locks on some resources that are lower in the hierarchy.
Intent Exclusive (IX)                Protects requested exclusive locks on some resources lower in the hierarchy. IX is a superset of IS; it also protects requesting shared locks on lower-level resources.
Shared with Intent Exclusive (SIX)   Protects requested shared locks on all resources lower in the hierarchy and intent exclusive locks on some of the lower-level resources. Concurrent IS locks are allowed at the top-level resource.
Intent Update (IU)                   Protects requested update locks on all resources lower in the hierarchy. IU locks are used only on page resources. IU locks are converted to IX locks if an update operation takes place.
Shared Intent Update (SIU)           Provides a combination of S and IU locks, as a result of acquiring these locks separately and simultaneously holding both locks.
Table 14.4: Intent Lock Modes
➢ Schema Locks
Schema modification (Sch-M) locks are used by the Database Engine while performing a table DDL operation, such as dropping a table or a column. A schema modification lock prevents concurrent access to the table; it blocks all external operations until the lock is released.
Some DML operations, such as truncating a table, also use schema modification locks to prevent concurrent operations from accessing the affected tables.
Schema stability (Sch-S) locks are used by the Database Engine while compiling and executing queries. Schema stability locks do not block any transactional locks, including exclusive locks; hence, transactions that hold exclusive locks on a table continue to execute while a query is being compiled. However, concurrent DML and DDL operations that require schema modification locks cannot be performed on the table while a schema stability lock is held.
➢ Key-Range Locks
These locks protect the range of rows implicitly included in a recordset being read by a Transact-SQL statement while the serializable transaction isolation level is in use. Key-range locks prevent phantom reads. By protecting the ranges of keys between rows, they also prevent phantom insertions into, or deletions from, a recordset accessed by a transaction.
Truncating the log frees space in the log file for reuse by the transaction log. Log truncation starts automatically after the following events:
Under the simple recovery model, after a checkpoint.
Under the bulk-logged or full recovery model, after a log backup, provided a checkpoint has occurred since the previous backup.
Several factors can delay log truncation. When log records remain active for a long time, transaction log truncation is delayed and the transaction log can fill up. Users can discover what, if anything, is preventing log truncation by querying the log_reuse_wait and log_reuse_wait_desc columns of the sys.databases catalog view.
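A minimal sketch of such a query, with an illustrative database name, is shown below:
SELECT name, log_reuse_wait, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'AdventureWorks2019';   -- database name is illustrative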
3. Identify the function that returns a number of BEGIN TRANSACTION statements that occur in the
current connection.
(A) @@TRANCOUNTER          (C) @@TRANCOUNT
(B) @@ERRORMESSAGE         (D) @@ERROR
4. Which of the following is not the concurrency effect allowed by different isolation levels?
(A) Read committed         (C) Repeatable Read
(B) Snapshot               (D) COMMIT
(A) a-1, b-4, c-2, d-3, e-5         (C) a-2, b-4, c-3, d-5, e-1
(B) a-5, b-4, c-3, d-2, e-1         (D) a-1, b-2, c-3, d-4, e-5
This session introduces error-handling techniques in SQL Server and describes the use of
TRY-CATCH blocks. Various system functions and statements that can help display error
information are also covered in the session.
15.1 Introduction
Error handling in SQL Server has become easier through a number of different techniques. SQL Server
provides options that can help you handle errors efficiently. Often, it is not possible to capture
errors that occur at the user's end. SQL Server provides the TRY…CATCH statement, which helps handle
errors effectively at the back end. There are also a number of system functions that print error-related
information, which can help fix errors easily.
➢ Syntax Errors
Syntax errors are the errors that occur when code cannot be parsed by SQL Server. Such errors are detected
by SQL Server before beginning the execution process of a Transact-SQL block or stored procedure.
Some scenarios where syntax errors occur are as follows:
If an operator or a keyword is misspelled or used in the wrong way, the code editor displays a
tooltip showing the error. Figure 15.1 displays an example of a syntax error.
➢ Run-time Errors
Run-time errors are errors that occur when the application tries to perform an action that is supported
neither by SQL Server nor by the operating system. Run-time errors are sometimes difficult to fix as
they are not clearly identified or are external to the database.
Some instances where run-time errors can occur are as follows:
Performing a calculation such as division by 0
Trying to execute code that is not clearly defined
Run-time errors can be handled with the TRY…CATCH construct. Following is the syntax for TRY…CATCH:
BEGIN TRY
{ sql_statement | statement_block }
END TRY
BEGIN CATCH
[ { sql_statement | statement_block } ]
END CATCH
[ ; ]
where,
sql_statement: specifies any Transact-SQL statement.
statement_block: specifies the group of Transact-SQL statements in a BEGIN…END block.
A TRY…CATCH construct will catch all run-time errors that have severity higher than 10 and that do not
close the database connection. A TRY block is followed by a related CATCH block. A TRY…CATCH block
cannot span multiple batches or multiple blocks of Transact-SQL statements.
If there are no errors in the TRY block, then after the last statement in the TRY block has executed, control
passes to the statement immediately following the END CATCH statement. If there is an error in the TRY block,
control passes to the first statement inside the CATCH block. If END CATCH is the last statement in a
trigger or a stored procedure, control is passed back to the calling block.
BEGIN TRY
DECLARE @num int;
SELECT @num=217/0;
END TRY
BEGIN CATCH
PRINT 'Error occurred, unable to divide by 0'
END CATCH;
In this code, an attempt is made to divide a number by zero. This will cause an error, hence, the
TRY…CATCH statement is used here to handle the error.
Both TRY and CATCH blocks can contain nested TRY…CATCH constructs. For example, a CATCH block can
have an embedded TRY…CATCH construct to handle errors faced by the CATCH code. Errors that are
encountered in a CATCH block are treated just like errors that are generated elsewhere. If the CATCH block
encloses a nested TRY…CATCH construct, any error in the nested TRY block passes the control to the nested
CATCH block. If there is no nested TRY…CATCH construct the error is passed back to the caller.
TRY…CATCH constructs can also catch unhandled errors from triggers or stored procedures that execute
through the code in TRY block. However, as an alternative approach, triggers or stored procedures can also
enclose their own TRY…CATCH constructs to handle errors generated through their code.
Note: GOTO statements can be used to jump to a label inside the same TRY…CATCH block or to leave
the TRY…CATCH block. The TRY…CATCH construct should not be used in a user-defined function.
15.5 Information
It is a good practice to display error information along with the error, so that it can help to solve the
error quickly and efficiently.
To achieve this, system functions need to be used in the CATCH block to find information about the
error that caused the CATCH block to execute.
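The code being described next is not reproduced in this text; a minimal sketch that calls the error functions inside a CATCH block might look like this:
BEGIN TRY
    -- a divide-by-zero error is raised deliberately
    SELECT 217/0;
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()    AS ErrorNumber,
           ERROR_SEVERITY()  AS ErrorSeverity,
           ERROR_STATE()     AS ErrorState,
           ERROR_PROCEDURE() AS ErrorProcedure,
           ERROR_LINE()      AS ErrorLine,
           ERROR_MESSAGE()   AS ErrorMessage;
END CATCH;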
In this code, the SELECT statement will cause a divide-by-zero error that is handled using the TRY…CATCH
statement. The error causes execution to jump to the associated CATCH block within which the error
information will be displayed.
Figure 15.6 displays the result of the error information. The first resultset is blank because the statement
fails.
In this code, when an error occurs, the CATCH block of the TRY…CATCH construct is called and the error
information is returned.
➢ Uncommittable Transactions
If an error generated in a TRY block invalidates the state of the current transaction, the transaction is
classified as an uncommittable transaction. An uncommittable transaction can perform only read operations
or a ROLLBACK TRANSACTION. The transaction cannot execute any Transact-SQL statement that would perform
a write operation or a COMMIT TRANSACTION.
The XACT_STATE function returns -1 if the transaction has been classified as an uncommittable transaction.
When a batch is completed, the Database Engine rolls back any active uncommittable transactions. If no error
message was sent when the transaction entered the uncommittable state, then when the batch finishes, an
error message is sent to the client. This indicates that an uncommittable transaction was detected and rolled
back.
15.6 @@ERROR
The @@ERROR function returns the error number for the last Transact-SQL statement executed.
@@ERROR
The @@ERROR system function returns a value of the integer type. The function returns 0 if the previous
Transact-SQL statement encountered no errors; it returns an error number if the previous statement
encountered an error. If the error was one of the errors in the sys.messages catalog view,
then @@ERROR contains the value from the sys.messages.message_id column for that error. Because
@@ERROR is cleared and reset on each statement executed, users should check it immediately following the
statement being verified, or save it to a local variable that can be checked later.
Code Snippet 5 demonstrates how to use @@ERROR to check for a constraint violation.
Code Snippet 5:
USE AdventureWorks2019;
GO
BEGIN TRY
UPDATE HumanResources.EmployeePayHistory SET PayFrequency = 4
WHERE BusinessEntityID = 1;
END TRY
BEGIN CATCH
IF @@ERROR = 547
PRINT N'Check constraint violation has occurred.';
END CATCH
In this code, @@ERROR is used to check for a check constraint violation (which has error number 547) in
an UPDATE statement.
It displays the following error message:
(0 rows affected)
Check constraint violation has occurred.
15.7 RAISERROR
The RAISERROR statement starts error processing for a session and displays an error message. RAISERROR
can reference a user-defined message stored in the sys.messages catalog view or build dynamic error
messages at run-time. The message is returned as a server error message to the calling application or to
the associated CATCH block of a TRY…CATCH construct.
Following is the syntax for RAISERROR statement.
Syntax:
RAISERROR ( { msg_id | msg_str | @local_variable }
{ ,severity,state }
[ ,argument [ ,...n ] ] )
[ WITH option [ ,...n ] ]
where,
msg_id: specifies a user-defined error message number stored in the sys.messages catalog view by
using sp_addmessage.
Value        Description
LOG          Records the error in the error log and the application log for the instance of the Microsoft SQL Server Database Engine.
NOWAIT       Sends messages immediately to the client.
SETERROR     Sets the @@ERROR and ERROR_NUMBER values to msg_id or 50000, irrespective of the severity level.
Table 15.1: Type Specification Values
When RAISERROR executes with a severity of 11 or higher in a TRY block, it will transfer the control to
the associated CATCH block.
Code Snippet 6:
RAISERROR (N'This is an error message %s %d.', 10, 1, N'serial number', 23);
GO
In this code, the RAISERROR statement substitutes the first argument, N'serial number', for the first
conversion specification, %s, and the second argument, 23, for the second conversion specification, %d. The
code snippet displays 'This is an error message serial number 23.'. Code Snippet 7
demonstrates how to use the RAISERROR statement to return the same string.
Code Snippet 7:
In this code, both RAISERROR statements return the same string, Hel. The first statement specifies the width
and precision values directly in the conversion specification, while the second statement supplies them through the argument list.
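The two statements themselves are not reproduced above; a minimal sketch consistent with the description (the string argument N'Hello' is an assumption) might be:
RAISERROR (N'%5.3s', 10, 1, N'Hello');        -- width 5 and precision 3 in the message string: prints "  Hel"
GO
RAISERROR (N'%*.*s', 10, 1, 5, 3, N'Hello');  -- width and precision supplied in the argument list: prints "  Hel"
GO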
Code Snippet 8 demonstrates how to use RAISERROR statement inside the TRY block.
Code Snippet 8:
BEGIN TRY
RAISERROR ('Raises Error in the TRY block.', 16, 1 );
END TRY
BEGIN CATCH
DECLARE @ErrorMessage NVARCHAR(4000);
DECLARE @ErrorSeverity INT;
DECLARE @ErrorState INT;
SELECT
    @ErrorMessage = ERROR_MESSAGE(),
    @ErrorSeverity = ERROR_SEVERITY(),
    @ErrorState = ERROR_STATE();
RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState );
END CATCH;
In this code, the RAISERROR statement used inside the TRY block has severity 16, which causes the
execution to jump to the associated CATCH block.
RAISERROR is then used inside the CATCH block to return the error information about the original error.
15.8 ERROR_STATE
The ERROR_STATE function returns the state number of the error that causes the CATCH block of a
TRY…CATCH construct to execute. It can be used within the scope of a CATCH block. Code Snippet 9
demonstrates how to use ERROR_STATE in the CATCH block of a TRY…CATCH construct.
Code Snippet 9:
BEGIN TRY
SELECT 217/0;
END TRY
BEGIN CATCH
SELECT ERROR_STATE() AS ErrorState;
END CATCH;
GO
In this code, the SELECT statement generates a divide-by-zero error. The CATCH block then returns the
state of the error; the ERROR_STATE value displayed is 1.
15.9 ERROR_SEVERITY
The ERROR_SEVERITY function returns the severity of the error that causes the CATCH block of a
TRY…CATCH construct to be executed. Following is the syntax for ERROR_SEVERITY:
Syntax:
ERROR_SEVERITY ( )
It returns a NULL value if called outside the scope of the CATCH block. ERROR_SEVERITY can be called
anywhere within the scope of a CATCH block. In nested CATCH blocks, ERROR_SEVERITY will return the
error severity that is specific to the scope of the CATCH block where it is referenced. Users can use the
ERROR_SEVERITY function in a CATCH block.
Code Snippet 10 shows how to display the severity of the error.
BEGIN TRY
SELECT 217/0;
END TRY
BEGIN CATCH
SELECT ERROR_SEVERITY() AS ErrorSeverity;
END CATCH;
GO
In this code, an attempt to divide by zero generates the error and causes the CATCH block to display the
severity error as 16.
15.10 ERROR_PROCEDURE
The ERROR_PROCEDURE function returns the trigger or a stored procedure name where the error has
occurred that has caused the CATCH block of a TRY…CATCH construct to be executed.
Following is the syntax of the ERROR_PROCEDURE:
Syntax:
ERROR_PROCEDURE( )
It returns the nvarchar data type. When the function is called in a CATCH block, it will return the name of
the stored procedure where the error occurred. The function returns a NULL value if the error has not
occurred within a trigger or a stored procedure. ERROR_PROCEDURE can be called from anywhere in the
scope of a CATCH block. The function also returns NULL if it is called outside the scope of a
CATCH block.
In nested CATCH blocks, the ERROR_PROCEDURE returns the trigger or stored procedure name specific
to the scope of the CATCH block where it is referenced.
Code Snippet 11 shows the use of the ERROR_PROCEDURE function.
Code Snippet 11:
USE AdventureWorks2019;
GO
IF OBJECT_ID ( 'usp_Example', 'P' ) IS NOT NULL
DROP PROCEDURE usp_Example;
GO
CREATE PROCEDURE usp_Example AS
SELECT 217/0;
GO
BEGIN TRY
EXECUTE usp_Example;
END TRY
BEGIN CATCH
SELECT ERROR_PROCEDURE() AS ErrorProcedure;
END CATCH;
GO
In this code, the stored procedure usp_Example generates a divide-by-zero error. The ERROR_PROCEDURE function, called in the CATCH block, returns the name of this stored procedure.
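A further snippet combining several of the error functions appears to be intended at this point but is not reproduced; a minimal sketch that reuses usp_Example might be:
BEGIN TRY
    EXECUTE usp_Example;    -- the procedure created above, which divides by zero
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()    AS ErrorNumber,
           ERROR_SEVERITY()  AS ErrorSeverity,
           ERROR_STATE()     AS ErrorState,
           ERROR_PROCEDURE() AS ErrorProcedure,
           ERROR_LINE()      AS ErrorLine,
           ERROR_MESSAGE()   AS ErrorMessage;
END CATCH;
GO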
This code makes use of several error handling system functions that can help to detect and rectify an error
easily.
15.11 ERROR_NUMBER
The ERROR_NUMBER system function when called in a CATCH block returns the error number of the error
that causes the CATCH block of a TRY…CATCH construct to be executed. Following is the syntax of
ERROR_NUMBER:
Syntax:
ERROR_NUMBER( )
The function can be called from anywhere inside the scope of a CATCH block. The function returns
NULL when it is called outside the scope of a CATCH block.
ERROR_NUMBER returns the error number regardless of how many times it is run or where it is run
within the scope of the CATCH block. This is different from @@ERROR, which returns the error number only
in the statement immediately following the one that causes the error, or in the first statement of a
CATCH block.
Code Snippet 13 demonstrates the use of ERROR_NUMBER in a CATCH block.
Code Snippet 13:
BEGIN TRY
SELECT 217/0;
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() AS ErrorNumber;
END CATCH;
GO
15.12 ERROR_MESSAGE
The ERROR_MESSAGE function returns the text message of the error that causes the CATCH block of a
TRY…CATCH construct to execute.
Following is the syntax of ERROR_MESSAGE:
Syntax:
ERROR_MESSAGE( )
When the ERROR_MESSAGE function is called in the CATCH block, it returns the full text of the error
message that causes the CATCH block to execute. The text includes the values that are supplied for any
parameter that can be substituted, such as object names, times, or lengths. It also returns NULL if it is
called outside the scope of a CATCH block.
Code Snippet 14 demonstrates the use of ERROR_MESSAGE in a CATCH block.
Code Snippet 14:
BEGIN TRY
SELECT 217/0;
END TRY
BEGIN CATCH
SELECT ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
GO
In this code, similar to other examples, the SELECT statement generates a divide-by-zero error. The
CATCH block displays the error message.
15.13 ERROR_LINE
The ERROR_LINE function returns the line number at which the error occurred in the TRY…CATCH
block.
Following is the syntax of ERROR_LINE:
Syntax:
ERROR_LINE( )
When this function is called in the CATCH block, it returns the line number where the error has occurred.
If the error has occurred within a trigger or a stored procedure, it returns the line number in that trigger
or stored procedure. Similar to the other functions, this function returns NULL if it is called outside the
scope of a CATCH block.
Code Snippet 15 demonstrates the use of ERROR_LINE in a CATCH block.
BEGIN TRY
SELECT 217/0;
END TRY
BEGIN CATCH
SELECT ERROR_LINE() AS ErrorLine;
END CATCH;
GO
As a result of this code, the line number at which the error has occurred will be displayed.
The following types of errors are not handled by a CATCH block when they occur at the same execution level
as the TRY…CATCH construct:
➢ Compile errors such as syntax errors that restrict a batch from running
➢ Errors that arise in the statement-level recompilation such as object name resolution errors
occurring after compiling due to deferred name resolution.
Code Snippet 16 demonstrates how an object name resolution error is generated by the SELECT statement.
Code Snippet 16:
USE AdventureWorks2019;
GO
BEGIN TRY
SELECT * FROM Nonexistent;
END TRY
BEGIN CATCH
SELECT
ERROR_NUMBER() AS ErrorNumber,
ERROR_MESSAGE() AS ErrorMessage;
END CATCH
This code will cause the object name resolution error in the SELECT statement. It will not be caught by the
TRY…CATCHconstruct.
Running a similar SELECT statement inside a stored procedure causes the error to occur at a level lower
than the TRY block, so the error is handled by the TRY…CATCH construct. Code Snippet 17 demonstrates how
the error message is displayed in such a case.
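Code Snippet 17 is not reproduced in this text; a minimal sketch consistent with the description (the procedure name usp_GetNonexistent is an assumption) might be:
USE AdventureWorks2019;
GO
IF OBJECT_ID (N'usp_GetNonexistent', N'P') IS NOT NULL
    DROP PROCEDURE usp_GetNonexistent;
GO
CREATE PROCEDURE usp_GetNonexistent AS
    SELECT * FROM Nonexistent;   -- the object name is resolved only when the procedure runs
GO
BEGIN TRY
    EXECUTE usp_GetNonexistent;
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()  AS ErrorNumber,
           ERROR_MESSAGE() AS ErrorMessage;
END CATCH;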
15.15 THROW
The THROW statement raises an exception and transfers control of the execution to a CATCH block of a
TRY…CATCH construct.
Following is the syntax of the THROW statement:
Syntax:
THROW [ { error_number | @local_variable },
        { message | @local_variable },
        { state | @local_variable } ] [ ; ]
When THROW is used without parameters inside a CATCH block, it re-raises the error that was caught. The following code demonstrates this by inserting a duplicate primary key value into a test table:
USE tempdb;
GO
CREATE TABLE dbo.TestRethrow
(ID INT PRIMARY KEY
);
BEGIN TRY
INSERT dbo.TestRethrow(ID) VALUES(1);
INSERT dbo.TestRethrow(ID) VALUES(1);
END TRY
BEGIN CATCH
PRINT 'In catch block.';
THROW;
END CATCH;
2. Which of the following constructs can catch unhandled errors from triggers or stored procedures?
3. Which of the following functions returns the error number for the last Transact-SQL
statement executed?
4. Which of these functions returns the severity of the error that causes the CATCH block of a
TRY…CATCHconstruct to be executed?
5. The statement raises an exception and transmits the execution to a CATCH block
of a TRY…CATCHconstruct in SQL Server 2019.
2. B
3. C
4. D
5. C
➢ Run-time errors occur when the application tries to perform an action that is supported neither by
Microsoft SQL Server nor by the operating system.
➢ TRY…CATCH constructs can also catch unhandled errors from triggers or stored procedures that
execute through the code in a TRY block.
➢ GOTO statements can be used to jump to a label inside the same TRY…CATCH block or to leave a
TRY…CATCH block.
➢ Various system functions are available in Transact-SQL to print error information about the error that
occurred.
➢ The RAISERROR statement is used to start the error processing for a session and displays an error
message.
a. Write error-handling statements using the TRY…CATCH construct for both normal statements as
well as stored procedures.
This session covers some of the enhancements made in SQL Server 2019. These include
Verbose Truncation Warnings, Vulnerability Assessment, Big Data Clusters, and how to
use JSON data effectively with SQL Server 2019.
Sometimes, bad data arrives whose character length exceeds the limit defined for a column.
For example, suppose we insert bulk data into a database table by using an INSERT statement, and one value
for the Color_Name column is five characters long while the existing column allows only three characters.
How will SQL Server behave? It raises a SQL truncation error, and the INSERT statement fails in this case.
This situation is commonly called silent truncation and occurs when we try to insert string data (varchar,
nvarchar, char, or nchar) that is longer than the size of the column.
Let us first create a sample database and a table to demonstrate the issue in an earlier version of SQL Server.
SQL Server 2017 has been used for this example.
Step 1: Create a Sample DB 2017 database and set the compatibility to 140 as shown in Code Snippet 1.
Code Snippet 1:
USE [master]
GO
--Create Sample Database
CREATE DATABASE [Sample DB 2017]
GO
ALTER DATABASE [Sample DB 2017]
SET COMPATIBILITY_LEVEL = 140 -- SQL Server 2017
GO
Code Snippet 2:
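The body of Code Snippet 2 is not reproduced in this text; a minimal sketch consistent with the surrounding description (the table name SampleColors is an assumption) might be:
USE [Sample DB 2017]
GO
CREATE TABLE dbo.SampleColors (Color_Name VARCHAR(3));
GO
INSERT INTO dbo.SampleColors (Color_Name)
VALUES ('Red'), ('Blue');   -- 'Blue' has four characters, which exceeds VARCHAR(3)
GO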
In figure 16.1, as we can see, we get the SQL truncation error message 'String or binary data would be
truncated.' In other words, the data size exceeds the table's column size. However, the message does not specify
which column or which value caused the truncation.
Consider inserting or updating more than one million rows in a table that has more than 50 columns.
Finding the offending column and value would be complicated, right? Now, you do not need to worry, because this issue has
been fixed in SQL Server 2019.
You can also set the database compatibility by selecting the Database Properties and then, select the
Options and set the Compatibility level.
Now, let us look at the SQL Server 2019 behavior of this SQL truncate issue.
Step 1: Create a database in SQL Server 2019 by default the compatibility is 150 as shown in Code Snippet
3.
Code Snippet 3:
USE [master]
GO
--Create Sample Database
CREATE DATABASE [Sample DB 2019]
GO
Step 2: Create a sample table and insert few records as shown in Code Snippet 4.
Code Snippet 4:
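The body of Code Snippet 4 is not reproduced in this text; a minimal sketch, mirroring the earlier example in the new database, might be:
USE [Sample DB 2019]
GO
CREATE TABLE dbo.SampleColors (Color_Name VARCHAR(3));
GO
INSERT INTO dbo.SampleColors (Color_Name)
VALUES ('Red'), ('Blue');   -- still fails, but the message now names the database, table, column, and value
GO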
The output displayed in figure 16.2 shows an error message that is very precise. It shows the database name,
table name, and column name, and specifies the data exceeding the limit. In this case, the value 'Blue' in the
column Color_Name exceeds the column size, which is varchar(3).
Vulnerability Assessment is part of the Azure Defender for SQL offering, which is a unified package for
advanced SQL security capabilities. Vulnerability Assessment can be accessed and managed via the central
Azure Defender for SQL portal.
Note: Vulnerability Assessment is supported for Azure SQL Database, Azure SQL Managed
Instance, and Azure Synapse Analytics (formerly SQL Data Warehouse).
SQL Vulnerability Assessment is a service that provides visibility into your security state. Vulnerability
Assessment includes actionable steps to resolve security issues and enhance your database security. It can
help you discover, track, and remediate potential database vulnerabilities.
Vulnerability Assessment is a scanning service built into Azure SQL Database. The service employs a
knowledge base of rules that flag security vulnerabilities.
The rules are based on Microsoft's best practices and focus on the security issues that present the biggest
risks to your database and its valuable data. They cover database-level issues and server-level security
issues, such as server firewall settings and server-level permissions. These rules also represent many of the
requirements from various regulatory bodies to meet their compliance standards.
Results of the scan include actionable steps to resolve each issue and provide customized remediation
scripts where applicable. You can customize an assessment report for your environment by setting an
acceptable baseline for:
• Permission configurations
• Feature configurations
• Database settings
• Go to Azure SQL Database, SQL Managed Instance Database, or Azure Synapse resource in the
Azure portal.
• Then, click Select Storage on the Vulnerability Assessment pane to open the Vulnerability
Assessment settings pane for either the entire server or managed instance.
• Configure a storage account where your scan results for all databases on the server or managed instance
will be stored. After storage is configured, select Scan to scan your database for vulnerabilities as shown
in figure 16.3.
Note: The scan is lightweight and safe. It takes a few seconds to run and is entirely read-only. It
does not make any changes to your database.
When vulnerability scan is finished, the scan report is automatically displayed in the Azure portal as shown
in figure 16.4. The report presents an overview of your security state. It lists how many issues were found
and their respective severities. Results include warnings on deviations from best practices and a snapshot
of your security-related settings, such as database principals and roles and their associated permissions.
Scan report also provides a map of sensitive data discovered in your database.
Review your results and determine the findings in the report that are true security issues in your
environment as shown in figure 16.5. Drill down to each failed result to understand the impact of the
finding and why each security check failed. Use the actionable remediation information provided by the
report to resolve the issue.
As you review your assessment results, you can mark specific results as being an acceptable baseline in
your environment as shown in figure 16.6. The baseline is essentially a customization of how the results
are reported. Results that match the baseline are considered as passing in subsequent scans. After you have
established your baseline security state, Vulnerability Assessment only reports on deviations from the
baseline. In this way, you can focus your attention on the relevant issues.
After you finish setting up your Rule Baselines, run a new scan to view the customized report as shown in
figure 16.7. Vulnerability Assessment now reports only the security issues that deviate from your approved
baseline state.
Note: Kubernetes (also known as k8s or "kube") is an open source container orchestration platform
that automates many of the manual processes involved in deploying, managing, and scaling
containerized applications.
Following are some popular uses of Big Data Clusters:
• Deploy scalable clusters of SQL Server, Spark, and HDFS containers running on Kubernetes
• Read, write, and process big data from Transact-SQL or Spark
• Easily combine and analyze high-value relational data with high-volume big data
• Query external data sources
• Store big data in HDFS managed by SQL Server
• Query data from multiple external data sources through the cluster
• Use the data for AI, machine learning, and other analysis tasks
• Deploy and run applications in Big Data Clusters
• Virtualize data with PolyBase
• Query data from external SQL Server, Oracle, Teradata, MongoDB, and ODBC data sources with
external tables
• Provide high availability for the SQL Server master instance and all databases by using Always On
availability group technology
SQL Server Big Data Clusters provide flexibility in interacting with big data. Query external data sources
and store big data in HDFS managed by SQL Server. Then this data can be used for AI, machine learning,
and other analysis tasks.
Data Virtualization
SQL Server Big Data Clusters can query external data sources without moving or copying the data as
shown in figure 16.8.
Data Lake
Data Lake is a storage repository that holds a huge amount of raw data in its native format. It is a scalable
HDFS storage pool that can be used to store big data, potentially ingested from multiple external data
sources. Once the big data is stored in the HDFS storage pool of the big data cluster, it can be analyzed. You can query
the data and combine it with the relational data available in SQL Server, as shown in figure 16.9.
A SQL Server big data cluster is a cluster of Linux containers orchestrated by Kubernetes.
Kubernetes Terms
Kubernetes is an open source container orchestrator, which can scale container deployments according to
the need. Table 16.1 defines some important Kubernetes terminology:
Term      Description
Cluster   A Kubernetes cluster is a set of machines, known as nodes. One node controls the cluster and is designated the master node; the remaining nodes are worker nodes. The Kubernetes master is responsible for distributing work between the workers and for monitoring the health of the cluster.
Node      A node runs containerized applications. It can be either a physical machine or a virtual machine. A Kubernetes cluster can contain a mixture of physical machine and virtual machine nodes.
Pod       A pod is the atomic deployment unit of Kubernetes. A pod is a logical group of one or more containers and associated resources required to run an application. Each pod runs on a node; a node can run one or more pods. The Kubernetes master automatically assigns pods to nodes in the cluster.
Table 16.1: Kubernetes Terms
In SQL Server Big Data Clusters, Kubernetes is responsible for the state of the SQL Server Big Data
Clusters; Kubernetes builds and configures the cluster nodes, assigns pods to nodes, and monitors the
health of the cluster.
Figure 16.12 shows the components of a SQL Server big data cluster.
Component      Description
Controller     Provides management and security for the cluster. It contains the control service, the configuration store, and other cluster-level services such as Kibana, Grafana, and Elasticsearch.
Data Pool      Used for data persistence and caching. The data pool consists of one or more pods running SQL Server on Linux. It is used to ingest data from SQL queries or Spark jobs. SQL Server big data cluster data marts are persisted in the data pool.
Storage Pool   Consists of storage pool pods comprised of SQL Server on Linux, Spark, and HDFS. All the storage nodes in a SQL Server big data cluster are members of an HDFS cluster.
For example, most Azure services, such as Azure Search, Azure Storage, and Azure Cosmos DB, have
REST endpoints that return or consume JSON. JSON is also the main format for exchanging data between
web pages and web servers by using AJAX calls.
JSON functions in SQL Server enable you to combine NoSQL and relational concepts in the same
database. Now, it is possible to combine classic relational columns with columns that contain documents
formatted as JSON text in the same table, parse and import JSON documents in relational structures, or
format relational data to JSON text.
Code Snippet 5:
[
{
"name": "John",
"skills": ["SQL", "C#", "Azure"]
},
{
"name": "Jane",
"surname": "Doe"
}
]
By using SQL Server built-in functions and operators, the following things can be performed with JSON text,
as shown in figure 16.13.
Code Snippet 6:
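The body of Code Snippet 6 is not reproduced in this text; a minimal sketch of the kind of query being described, assuming a hypothetical PeopleCollection table with a jsonCol column that stores JSON text, might be:
SELECT FirstName, LastName,
       JSON_VALUE(jsonCol, '$.info.address.Town')     AS Town,
       JSON_VALUE(jsonCol, '$.info.address.PostCode') AS PostCode
FROM dbo.PeopleCollection
WHERE ISJSON(jsonCol) > 0
  AND JSON_VALUE(jsonCol, '$.info.address.Town') = N'Belgrade'
ORDER BY JSON_VALUE(jsonCol, '$.info.address.PostCode');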
In Code Snippet 6, both relational and JSON data are used. Values from JSON text can be used in any part of
a Transact-SQL query, including WHERE, ORDER BY, or GROUP BY clauses, window aggregates, and so on.
JSON functions use JavaScript-like syntax for referencing values inside JSON text.
To modify parts of JSON text, JSON_MODIFY (Transact-SQL) function is used to update the value of a
property in a JSON string and return the updated JSON string. Code Snippet 7 updates value of a property
in a variable that contains JSON.
Code Snippet 7:
DECLARE @json NVARCHAR(MAX);
SET @json = '{"info": {"address": [{"town": "Belgrade"}, {"town": "Paris"},
{"town":"Madrid"}]}}';
SET @json = JSON_MODIFY(@json, '$.info.address[1].town', 'London');
SELECT modifiedJson = @json;
A custom query language is not required to query JSON in SQL Server; standard T-SQL can be used. If you
wish to retrieve data or create a report on JSON data, you can easily do that by calling the OPENJSON rowset function.
Code Snippet 8 calls OPENJSON and transforms array of objects that is stored in the @json variable to a
rowset that can be queried with a standard SQL SELECT statement:
Code Snippet 8:
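The body of Code Snippet 8 is not reproduced in this text; a minimal sketch, reusing the JSON array from Code Snippet 5, might be:
DECLARE @json NVARCHAR(MAX);
SET @json = N'[
  {"name": "John", "skills": ["SQL", "C#", "Azure"]},
  {"name": "Jane", "surname": "Doe"}
]';
SELECT *
FROM OPENJSON(@json)
WITH (
    name    NVARCHAR(50),
    surname NVARCHAR(50),
    skills  NVARCHAR(MAX) AS JSON   -- the nested array is returned as raw JSON text
);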
OPENJSON transforms the array of JSON objects into a table in which each object is represented as one row
and key/value pairs are returned as cells. The output observes the following rules:
• OPENJSON converts JSON values to the types that are specified in the WITH clause
• OPENJSON can handle both flat key/value pairs and nested, hierarchically organized objects
• It is not necessary to return all the fields that are contained in the JSON text
• If JSON values do not exist, OPENJSON returns NULL values
• Path can be specified optionally after type specification to reference a nested property or to reference a
property by a different name
• Optional strict prefix in the path specifies that values for the specified properties must exist in the JSON
text
JSON documents may have sub-elements and hierarchical data that cannot be directly mapped into the
standard relational columns. In this case, you can flatten JSON hierarchy by joining parent entity with
sub-arrays.
In Code Snippet 9, the second object in the array has a sub-array representing the person's skills, and each
sub-object can be parsed by using an additional OPENJSON function call.
In Code Snippet 9, the skills array is returned by the first OPENJSON call as an original JSON text fragment and
is passed to another OPENJSON function by using the APPLY operator. The second OPENJSON call parses the JSON
array and returns the string values as a single-column rowset that is joined with the result of the first OPENJSON. A sketch of such a query follows this description.
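Code Snippet 9 itself is not reproduced in this text; a minimal sketch consistent with the description (the sample JSON is an assumption) might be:
DECLARE @json NVARCHAR(MAX);
SET @json = N'[
  {"name": "Jane", "surname": "Doe"},
  {"name": "John", "surname": "Smith", "skills": ["SQL", "C#", "Azure"]}
]';
SELECT people.name, people.surname, s.skill
FROM OPENJSON(@json)
WITH (
    name    NVARCHAR(50),
    surname NVARCHAR(50),
    skills  NVARCHAR(MAX) AS JSON       -- keep the skills sub-array as JSON text
) AS people
OUTER APPLY OPENJSON(people.skills)
WITH (skill NVARCHAR(20) '$') AS s;     -- parse the sub-array into one row per skill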
You can format SQL Server data or results of SQL queries as JSON by adding the FOR JSON clause to a
SELECT statement. The FOR JSON delegates the formatting of JSON output from your client applications
to SQL Server.
Code Snippet 10 uses PATH mode with the FOR JSON clause.
USE AdventureWorks2019;
SELECT BusinessEntityId, FirstName AS "info.name", LastName AS "info.surname",
ModifiedDate AS dob
FROM Person.Person
FOR JSON PATH;
The FOR JSON clause formats SQL results as JSON text that can be provided to any app that understands
JSON. The PATH option uses dot-separated aliases in SELECT clause to nest objects in query results.
4. Which of the following big data cluster includes a scalable HDFS storage pool?
2. Smith Sons and Corp. have following JSON data related to their account, they want you to convert the
same in tabular format with OPENJSON and WITH clause.
[
{
"Order": {
"Number":"SO43659",
"Date":"2011-05-31T00:00:00"
},
"AccountNumber":"AW29825",
"Item": {
"Price":2024.9940,
"Quantity":1
}
},
{
"Order": {
"Number":"SO43661",
"Date":"2011-06-01T00:00:00"
},
"AccountNumber":"AW73565",
"Item": {
"Price":2024.9940,
"Quantity":3
}
}
]
This session introduces SQL Server features such as PolyBase, Query Store, and Stretch
Database. PolyBase is used to process Transact-SQL queries that read data from external
data sources. The session also explains Query Store for analyzing query performance using
built-in reports. Finally, the session describes Stretch Database, which is used to migrate all or part of a
table's historical data to cost-effective storage in Azure.
➢ Describe PolyBase
➢ Explain features and advantages of PolyBase
➢ Define and describe Query Store
➢ Explain how to dynamically stretch warm and cold transactional data from SQL
Server to Azure
➢ Describe how to tune workload performance with Query Store
Queries that access external data can also be used to target relational tables in your SQL Server instance. This
allows you to combine data from external sources with the high-value relational data in your database. In SQL
Server, an external table or external data source provides the connection to Hadoop.
PolyBase pushes some computations to the Hadoop node to optimize the overall query. However,
PolyBase external access is not limited to Hadoop. Other unstructured non-relational tables are also
supported, such as delimited text files.
• Query data stored in Hadoop from SQL Server or Parallel Data Warehouse (PDW): Users are
storing data in cost-effective distributed and scalable systems, such as Hadoop. PolyBase makes it easy
to query the data by using T-SQL.
• Import data from Hadoop, Azure Blob Storage, or Azure Data Lake Store: Leverage the speed of
Microsoft SQL's columnstore technology and analysis capabilities by importing data from Hadoop,
Azure Blob Storage, or Azure Data Lake Store into relational tables. There is no need for a separate
Extract, Transform, and Load (ETL) or import tool.
• Export data to Hadoop, Azure Blob Storage, or Azure Data Lake Store: Archive data to Hadoop,
Azure Blob Storage, or Azure Data Lake Store to achieve cost-effective storage and keep it online for
easy access.
• Integrate with BI tools: Use PolyBase with Microsoft's business intelligence and analysis stack or use
any third party tools that are compatible with SQL Server.
Head Node: It contains the SQL Server instance to which PolyBase queries are submitted. Each PolyBase group can have only one head node. A head node is a logical group of SQL Server Database Engine, PolyBase Engine, and PolyBase Data Movement Service on the SQL Server instance.
Compute Node: It contains a SQL Server instance that assists with scale-out query processing on external data. A compute node is a logical group of SQL Server and the PolyBase Data Movement Service on the SQL Server instance. A PolyBase group can have multiple compute nodes. Both the head node and the compute nodes must run on the same version of SQL Server.
Scale-out Reads: When querying external SQL Server, Oracle, or Teradata instances, partitioned tables will benefit from scale-out reads. Each node in a PolyBase scale-out group can spin up to eight readers to read external data, and each reader is assigned one partition to read in the external table.
Note: PolyBase can be installed on only one SQL Server instance per machine.
Before you install PolyBase on your SQL Server instances, decide whether you want a single node
installation or a PolyBase scale-out group.
Note: SQL Server 2019 PolyBase now includes an additional option Java connector for HDFS data sources.
See SQL Server preview features for more information about this feature.
4. On the Server Configuration page, configure the SQL Server PolyBase Engine Service and SQL
Server PolyBase Data Movement Service to run under the same domain account.
This option also enables Microsoft Distributed Transaction Coordinator (MSDTC) firewall
connections and modifies MSDTC registry settings.
6. On the PolyBase Configuration page, specify a port range with at least six ports. SQL Server setup
allocates the first six available ports from the range.
Enable PolyBase
After installation, PolyBase must be enabled to access its features as shown in Code Snippet 1.
Code Snippet 1:
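The body of Code Snippet 1 is not reproduced in this text; a minimal sketch of enabling the feature at the instance level might be:
-- enable the PolyBase feature for the instance
EXEC sp_configure @configname = 'polybase enabled', @configvalue = 1;
GO
RECONFIGURE;
GO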
Once SQL Server installation is complete with the PolyBase feature, PolyBase must be configured to
interact with external data sources.
Note: PolyBase using SQL Server 2019 can interact with Hadoop, Oracle, MongoDB, Teradata, and
so on.
1. Run sp_configure with 'hadoop connectivity' and set an appropriate value for your provider. To find the
value for your provider, see the official documentation for PolyBase Connectivity Configuration. By default,
Hadoop connectivity is set to 7.
Code Snippet 2:
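The body of Code Snippet 2 is not reproduced in this text; a minimal sketch, using 7 only as an example provider value, might be:
sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO
RECONFIGURE;
GO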
2. Restart SQL Server using services.msc. Restarting SQL Server restarts these services as shown in figure
17.3.
To query the data in your Hadoop data source, you must define an external table to use in Transact-SQL
queries. Following steps describe how to configure the external table:
1. Create a master key on the database, if one does not already exist as shown in Code Snippet 3. This is
required to encrypt the credential secret.
Code Snippet 3:
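The body of Code Snippet 3 is not reproduced in this text; a minimal sketch (the password shown is illustrative) might be:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!Str0ngP@ssw0rd';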
PASSWORD ='password': It is a password that is used to encrypt the master key in the database.
Password must meet the Windows password policy requirements of the computer that is hosting instance
of SQL Server.
2. Create a database scoped credential for Kerberos-secured Hadoop clusters as shown in Code Snippet
4.
Code Snippet 4:
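The body of Code Snippet 4 is not reproduced in this text; a minimal sketch (the credential name and the identity and secret placeholders are assumptions) might be:
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH IDENTITY = '<hadoop_user_name>', SECRET = '<hadoop_password>';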
3. Create an external data source with CREATE EXTERNAL DATA SOURCE as shown in Code Snippet 5.
Code Snippet 5:
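The body of Code Snippet 5 is not reproduced in this text; a minimal sketch (the data source name and addresses are placeholders) might be:
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://<namenode_address>:8020',
    RESOURCE_MANAGER_LOCATION = '<namenode_address>:8050',   -- enables pushdown computation
    CREDENTIAL = HadoopUser1
);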
4. Create an external file format with CREATE EXTERNAL FILE FORMAT as shown in Code Snippet 6.
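Code Snippet 6 is not reproduced in this text; a minimal sketch of a delimited text file format might be:
CREATE EXTERNAL FILE FORMAT TextFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = '|', USE_TYPE_DEFAULT = TRUE)
);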
5. Create an external table pointing to data stored in Hadoop with CREATE EXTERNAL TABLE as shown
in Code Snippet 7. In Code Snippet 7, the external data used is car sensor data.
Code Snippet 7:
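The body of Code Snippet 7 is not reproduced in this text; a minimal sketch of an external table over the car sensor data (column names and HDFS path are assumptions) might be:
CREATE EXTERNAL TABLE dbo.CarSensor_Data (
    SensorKey    INT   NOT NULL,
    CustomerKey  INT   NOT NULL,
    Speed        FLOAT NOT NULL,
    YearMeasured INT   NOT NULL
)
WITH (
    LOCATION = '/Demo/',
    DATA_SOURCE = MyHadoopCluster,
    FILE_FORMAT = TextFileFormat
);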
Code Snippet 8:
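The body of Code Snippet 8 is not reproduced, and its purpose is not described in the surrounding text; one plausible sketch is a query against the external table, which can then be used like any local table:
SELECT TOP (10) CustomerKey, AVG(Speed) AS AverageSpeed
FROM dbo.CarSensor_Data
WHERE Speed > 35
GROUP BY CustomerKey;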
Query Store automatically captures a history of queries, plans, and runtime statistics and retains them for
review. It separates the data into time windows so that database usage patterns can be identified and query plan
changes on the server can be noted. Query Store can be configured by using the ALTER DATABASE SET
option.
The information about the compilation and execution is first stored in cache and then, stored to the disk.
The duration of keeping the information in the cache and the time when it is stored on the disk is
determined by the INTERVAL_LENGTH_MINUTES parameter and the DATA_FLUSH_INTERVAL_SECONDS
parameter, respectively. These parameters should be specified when Query Store is enabled and configured.
Sometimes, if there is an overload in the cache, the information is written to the disk and then, the cache
is cleared.
Note that information on both the cache and the disk can be accessed by querying the
sys.query_store_runtime_stats catalog view.
Using Transact-SQL
Use the ALTER DATABASE statement to enable the query store for a given database shown as follows:
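A minimal sketch, with an illustrative database name, might be:
ALTER DATABASE AdventureWorks2019
SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE);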
Alternatively, Transact-SQL statements can be used. Figure 17.6 shows the Transact-SQL statements to
configure the Query Store parameters.
Code Snippet 9:
SELECT name, type_desc FROM sys.all_objects
WHERE name LIKE '%query_Store%' OR name = 'query_context_settings'
Figure 17.7 displays the list.
If any queries are consuming the most of system resources, such as CPU, memory or IO, they can be
identified and accordingly changed to ensure optimal usage.
When changes to applications or platforms are planned, Query Store can be used to compare performance
before and after change implementation. These changes may include installing new versions of applications,
installing new hardware, compatibility level upgrades to the database, and adding or modifying indexes.
When Query Optimizer upgrades are planned, the performance of queries can be recorded before the
upgrade, and any regressions can be identified and fixed afterwards.
When there is a need to optimize resources, queries with a lower frequency of execution can be identified
and allocation of resources to them can be proactively handled.
Moreover, applications can still query the data in the same way as before. Enabling Stretch Database feature
makes it possible to do the following:
• Move archived data as well as current data to the cloud securely by using the encryption features
• Access and query the data on the cloud at any time, without any changes to existing applications or
queries
• Reduce storage requirements of on-premise data by using the vast storage capacity of the cloud
• Reduce processing burden on the on-premise data by running the processes on the cloud in a way that
is transparent to the applications
If you select Tasks | Stretch | Enable for an individual table and you have not yet enabled the database
for Stretch Database, the wizard configures the database for Stretch Database and lets you select tables as
part of the process.
Note:
Stretch Database migrates data to Azure. Therefore, you require an Azure account and a subscription for
billing.
Enabling Stretch Database on a database or a table requires db_owner permissions. Enabling Stretch
Database on a database also requires CONTROL DATABASE permissions.
To enable Stretch Database on a database or a table, first enable it on the local server. This operation
requires sysadmin or serveradmin permissions.
➢ If you have the required administrative permissions, Enable Database for Stretch wizard
configures the server for Stretch.
➢ If you do not have the required permissions, an administrator has to enable the option manually by running
sp_configure before you run the wizard, or an administrator has to run the wizard.
1. Enable Stretch Database on the SQL Server instance as shown in Code Snippet 10.
Code Snippet 10 runs sp_configure to enable the remote data archive option. In this example, the option
is enabled by setting its value to 1.
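The snippet itself is not reproduced in this text; a minimal sketch might be:
EXEC sp_configure 'remote data archive', 1;
GO
RECONFIGURE;
GO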
2. To configure an existing table for Stretch Database, run the ALTER TABLE command. Code Snippet
11 shows syntax that migrates the entire table and begins data migration immediately.
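Code Snippet 11 is not reproduced in this text; a minimal sketch, with a placeholder table name, might be:
ALTER TABLE <table_name>
    SET (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));
GO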
3. Select the database and create the table that must be stretched. Code Snippet 12 shows syntax to
create a new table with Stretch Database.
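Code Snippet 12 is not reproduced in this text; a minimal sketch, with illustrative table and column names, might be:
CREATE TABLE dbo.NewStretchTable (
    Id          INT NOT NULL,
    CreatedDate DATETIME2 NOT NULL
)
WITH (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));
GO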
Consider that you have created a sample database and table. This table will be stretched to the Azure
cloud. Code Snippet 13 creates a table StretchSampleTable on the StretchDemo database.
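Code Snippet 13 is not reproduced in this text; a minimal sketch, with assumed column definitions, might be:
CREATE DATABASE StretchDemo;   -- assumes the sample database does not yet exist
GO
USE StretchDemo;
GO
CREATE TABLE dbo.StretchSampleTable (
    Id        INT NOT NULL,
    FirstName NVARCHAR(50),
    LastName  NVARCHAR(50)
);
GO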
Insert sample records into the table as shown in Code Snippet 14, where sample data is inserted into
the StretchSampleTable.
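Code Snippet 14 is not reproduced in this text; a minimal sketch, with illustrative values, might be:
INSERT INTO dbo.StretchSampleTable (Id, FirstName, LastName)
VALUES (1, N'John', N'Smith'),
       (2, N'Jane', N'Doe'),
       (3, N'Sam', N'Brown');
GO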
4. On the Object Explorer, right-click the StretchDemo database, select Tasks, then select Stretch and
then, select Enable option as shown in figure 17.9.
The Enable Database for Stretch wizard opens as shown in figure 17.10.
6. Click Finish to verify the choices made and to complete enabling the database for Stretch. A
secure linked server definition between the local database and the remote database on Azure is
now established.
Constraints
• Uniqueness is not enforced for UNIQUE constraints and PRIMARY KEY constraints in the Azure
table that contains the migrated data.
DML operations
• You are not able to UPDATE or DELETE rows that have been migrated or rows that are eligible for
migration, in a Stretch-enabled table or in a view that includes Stretch-enabled tables.
• You are not able to INSERT rows into a Stretch-enabled table on a linked server.
Indexes
• You are not able to create an index for a view that includes Stretch-enabled tables.
• Filters on SQL Server indexes are not propagated to the remote table.
Following items currently prevent you from enabling Stretch for a table:
Table properties
• Tables that have more than 1,023 columns or more than 998 indexes
• FileTables or tables that contain FILESTREAM data
• Tables that are replicated, or that are actively using Change Tracking or Change Data
Capture
• Memory-optimized tables
Data types
• text, ntext, and image
• timestamp
• sql_variant
• XML
• CLR data types including geometry, geography, hierarchyid, and CLR user-defined
types
Column types
• COLUMN_SET
• Computed columns
Constraints
• Default constraints and check constraints
Indexes
• Full text indexes
• XML indexes
• Spatial indexes
• Indexed views that reference the table
(A) SQL Server PolyBase Engine (C) SQL Server Data Movement Service
2. Which of the following actions are necessary for data migration from a database table to an Azure
database through SQL Server?
Stretch Database must be enabled Stretch Database must be enabled on the
(A) (C)
on the Server instance table
Stretch Database must be enabled Stretch Database must be enabled on the
(B) (D)
on the database Server instance, database, and on the table
i. Create demo database as StarlightDemo and insert data in it from Chinook database
at following link: https://github.com/cwoodruff/ChinookDatabase
ii. Next, create sample Azure Cloud account for Starlight Digital Media.
iii. Using Stretch database functionality migrate StarlightDemo database data to Azure Cloud.