Lesson 1-SQL
Getting Started with SQL
Welcome to your first lesson in teaching yourself SQL. This lesson will
start your learning with a brief history of SQL and databases and will
provide you with a foundation that will help you through the rest of the
book. More specifically, you will learn the following:
The history of SQL and databases.
Dr. Codd's 12 rules for a relational database model.
The characteristic that differentiates a DBMS from an RDBMS is that an RDBMS uses a
set-oriented database language. For most RDBMSs, this set-oriented database language
is SQL. Set-oriented refers to the way that SQL processes data: in sets or groups.
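As a small illustration of set-oriented processing, the following sketch assumes a
hypothetical table named PRODUCT with made-up rows (neither appears elsewhere in this
lesson). A single UPDATE statement acts on every row that meets the condition; no
row-by-row loop is written.

```sql
-- Hypothetical table and data, used only for illustration.
CREATE TABLE PRODUCT (
    PROD_ID  INTEGER      NOT NULL PRIMARY KEY,
    CATEGORY VARCHAR(20)  NOT NULL,
    PRICE    DECIMAL(8,2) NOT NULL
);

INSERT INTO PRODUCT (PROD_ID, CATEGORY, PRICE) VALUES (1, 'BOOK', 10.00);
INSERT INTO PRODUCT (PROD_ID, CATEGORY, PRICE) VALUES (2, 'BOOK', 20.00);
INSERT INTO PRODUCT (PROD_ID, CATEGORY, PRICE) VALUES (3, 'TOY',  15.00);

-- One statement changes the whole set of BOOK rows at once.
UPDATE PRODUCT
SET PRICE = PRICE * 2
WHERE CATEGORY = 'BOOK';

-- The result of a query is likewise returned as a set of rows.
SELECT PROD_ID, PRICE
FROM PRODUCT
WHERE CATEGORY = 'BOOK';
```

After the UPDATE, both BOOK rows carry doubled prices, and the TOY row is untouched.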
Two standards organizations, the American National Standards Institute (ANSI) and the
International Organization for Standardization (ISO), currently promote SQL standards
to industry. The ANSI SQL standard is the standard for the SQL used throughout this
book.
Although these standard-making bodies prepare standards for database system designers
to follow, all database products differ from the ANSI standard to some degree. In truth,
although the ANSI standard has grown quite large, the amount that needs to be
implemented by a particular RDBMS to be considered compliant is quite small. Most
systems provide some proprietary extensions to SQL that extend the language into a true
procedural language.
Various RDBMSs are discussed throughout this book, and the various flavors of the SQL
language for specific implementations are reviewed in more detail in Part 7, "SQL in
Various Database Implementations."
... accessible by using a combination of the table name, primary key value, and column
name.
Most databases have had a "parent/child" relationship; that is, a parent node would
contain file pointers to its children (see Figure 1.1).
This method has several advantages and many disadvantages. In its favor is the fact that
the physical structure of data on a disk becomes unimportant. The programmer simply
stores pointers to the next location, so data can be accessed in this manner. Also, data
can be added and deleted easily. However, different groups of information cannot be
easily joined to form new information. The format of the data on the disk cannot be
arbitrarily changed after the database is created. Doing so would require the creation
of a new database structure.
FIGURE 1.1
Codd's relational database management system. (Diagram: a root node; Level 1, children
of the root; Level 2, children of Level 1; Level 3, children of Level 2.)
Codd's idea for an RDBMS uses the mathematical concepts of relational algebra to break
data into sets and related common subsets.
Because information can naturally be grouped into distinct sets, Dr. Codd organized his
database system around this concept. Under the relational model, data is separated into
sets that resemble a table structure. This table structure consists of individual data
elements called columns, or fields. A single set of a group of fields is known as a record, or
row. For instance, to create a relational database consisting of employee data, you might
start with a table called EMPLOYEE that contains the following pieces of information:
EMP_ID, LNAME, FNAME, and DOB. These four pieces of data make up the fields in the
EMPLOYEE table, shown in Table 1.1.
The eight rows are the records in the EMPLOYEE table. To retrieve a specific record from
this table (for example, Mac Williams), a user would instruct the database management
system to retrieve the records in which the LNAME field was equal to Williams. If the
DBMS had been instructed to retrieve all the fields in the record, the employee's EMP_ID,
LNAME, FNAME, and DOB would be returned to the user. SQL is the language that tells the
database to retrieve this data. A sample SQL statement that makes this query is
SELECT *
FROM EMPLOYEE
WHERE LNAME = 'Williams';
Remember that the exact syntax is not important at this point. We cover this topic in
much greater detail beginning in the next lesson.
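To try the kind of query described above (retrieving the record whose LNAME is
Williams), a minimal sketch follows. The column types, the dates of birth, and the
second row are assumptions made up for illustration; only the column names and the name
Mac Williams come from the text. The DATE 'yyyy-mm-dd' literal is ANSI syntax accepted
by MySQL and Oracle; other products may need a different date format.

```sql
-- Run in an empty scratch database. Column types and sample rows are assumptions.
CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER     NOT NULL PRIMARY KEY,
    LNAME  VARCHAR(30) NOT NULL,
    FNAME  VARCHAR(30) NOT NULL,
    DOB    DATE
);

INSERT INTO EMPLOYEE (EMP_ID, LNAME, FNAME, DOB)
VALUES (1, 'Williams', 'Mac', DATE '1970-01-15');
INSERT INTO EMPLOYEE (EMP_ID, LNAME, FNAME, DOB)
VALUES (2, 'Jones', 'Ann', DATE '1982-06-30');

-- Return every field of the record(s) whose LNAME equals Williams.
SELECT EMP_ID, LNAME, FNAME, DOB
FROM EMPLOYEE
WHERE LNAME = 'Williams';
```

Listing the column names instead of writing * returns the same fields here, and it
keeps the result stable if columns are later added to the table.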
Because the various data items can be grouped according to obvious relationships (such
as the relationship of Employee LNAME to Employee DOB), the relational database model
gives the database designer a great deal of flexibility to describe the relationships
between the data elements. Through the mathematical concepts of JOIN and UNION, relational
databases can quickly retrieve pieces of data from different sets (tables) and return
them to the user or program as one "joined" collection of data (see Figure 1.2). The join
feature enables the designer to store sets of information in separate tables to reduce
repetition.
Figure 1.3 shows a union. The union would return the data from both sources combined,
with duplicate rows removed.
FIGURE 1.2
The join feature. (Diagram: Set A and Set B combined by a JOIN.)
FIGURE 1.3
The union feature. (Diagram: Set A and Set B combined by a UNION.)
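As a rough sketch of the SQL UNION operator, the following example assumes two
hypothetical one-column tables named SET_A and SET_B (the names echo the labels in
Figure 1.3; the rows are made up). UNION stacks the results of two queries and removes
duplicate rows.

```sql
-- Hypothetical tables and rows, for illustration only.
CREATE TABLE SET_A (ITEM VARCHAR(10) NOT NULL);
CREATE TABLE SET_B (ITEM VARCHAR(10) NOT NULL);

INSERT INTO SET_A (ITEM) VALUES ('red');
INSERT INTO SET_A (ITEM) VALUES ('green');
INSERT INTO SET_B (ITEM) VALUES ('green');
INSERT INTO SET_B (ITEM) VALUES ('blue');

-- Combine both sets; a value present in both tables appears only once.
SELECT ITEM FROM SET_A
UNION
SELECT ITEM FROM SET_B;
```

With these rows, the query returns three values: red, green, and blue. The order of the
rows is not guaranteed unless an ORDER BY clause is added.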
Here's a simple example that shows how data can be logically divided between two
tables. Table 1.2 is called DEPENDENTS and contains five columns: EMP_ID, LNAME, FNAME,
SEX, and RELATIONSHIP.
It would be improper to duplicate the employee's EMP_ID, LNAME, FNAME, and DOB fields
for each record. Over time, unnecessary duplication of data would waste a great deal of
hard disk space and increase access time for the RDBMS. However, if EMP_ID and data
pertaining to family members were stored in a separate table named DEPENDENTS, the
user could join the DEPENDENTS and EMPLOYEE tables on the EMP_ID field. Instructing the
RDBMS to retrieve all fields from the DEPENDENTS and EMPLOYEE tables in which the
EMP_ID field equals 3 would return the data in Table 1.3.
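The join described above can be sketched as follows. The table and column names come
from the text; the column types and every row of data are assumptions invented for
illustration, so the output will not match the book's tables.

```sql
-- Run in an empty scratch database. Column types and all rows are assumptions.
CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER     NOT NULL PRIMARY KEY,
    LNAME  VARCHAR(30) NOT NULL,
    FNAME  VARCHAR(30) NOT NULL,
    DOB    DATE
);

CREATE TABLE DEPENDENTS (
    EMP_ID       INTEGER     NOT NULL,  -- identifies the related employee
    LNAME        VARCHAR(30) NOT NULL,
    FNAME        VARCHAR(30) NOT NULL,
    SEX          CHAR(1),
    RELATIONSHIP VARCHAR(20)
);

INSERT INTO EMPLOYEE (EMP_ID, LNAME, FNAME, DOB)
VALUES (3, 'Smith', 'Pat', DATE '1975-03-02');
INSERT INTO DEPENDENTS (EMP_ID, LNAME, FNAME, SEX, RELATIONSHIP)
VALUES (3, 'Smith', 'Lee', 'M', 'SON');
INSERT INTO DEPENDENTS (EMP_ID, LNAME, FNAME, SEX, RELATIONSHIP)
VALUES (3, 'Smith', 'Kim', 'F', 'DAUGHTER');

-- Join the two tables on EMP_ID and keep only employee 3.
SELECT E.EMP_ID, E.LNAME, E.FNAME, E.DOB,
       D.LNAME AS DEP_LNAME, D.FNAME AS DEP_FNAME, D.SEX, D.RELATIONSHIP
FROM EMPLOYEE E
JOIN DEPENDENTS D ON D.EMP_ID = E.EMP_ID
WHERE E.EMP_ID = 3;
```

The employee's details are stored once in EMPLOYEE, yet they appear on each returned
row next to a dependent, which is the reduction in repetition that the join makes
possible.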
This type of application development requires an entirely new set of programming skills.
User interface programming is now written for graphical user interfaces, whether it be
MS Windows, IBM OS/2, Apple Macintosh, or the UNIX X Window System. Using
SQL and a network connection, the application can interface to a database residing on a
remote server. The increased power of personal computer hardware enables critical
database information to be stored on a relatively inexpensive standalone server. In addition,
this server can be replaced with little or no change to the client applications.
A Cross-Product Language
You can apply the basic concepts introduced in this book in many environments: for
example, Microsoft Access running as a single-user Windows application or SQL Server
running with 100 user connections. One of SQL's greatest benefits is that it is truly a
cross-platform language and a cross-product language. Because it is also what
programmers refer to as a high-level or fourth-generation language (4GL), a large amount
of work can be done in fewer lines of code.
Early Implementations
Oracle Corporation released the first commercial RDBMS that used SQL. Although the
original versions were developed for VAX/VMS systems, Oracle was one of the first
vendors to release a DOS version of its RDBMS. (Oracle is now available on more than
70 platforms.) In the mid-1980s, Sybase released its RDBMS, SQL Server. With client
libraries for database access, support for stored procedures, and interoperability with
various networks, SQL Server became a successful product, particularly in client/server
environments.
One of the strongest points for both of these powerful database systems is their
scalability across platforms. C language code (combined with SQL) written for Oracle on a PC
is virtually identical to its counterpart written for an Oracle database running on a VAX
system.
An Overview of SQL
SQL is the de facto standard language used to manipulate and retrieve data from these
relational databases. Through SQL, a programmer or database administrator can do the
following:
Modify a database's structure
Change system security settings
Add user permissions to databases or tables
Query a database for information
Update the contents of a database
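A few of the tasks in this list can be sketched with ordinary SQL statements. The
example below is only an illustration: the column types, the PHONE column, the sample
row, and the account name report_user are all assumptions. Changing system security
settings is left out because its syntax differs widely among products.

```sql
-- Run in an empty scratch database. Everything except the EMPLOYEE, EMP_ID,
-- and LNAME names is a made-up assumption for illustration.
CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER     NOT NULL PRIMARY KEY,
    LNAME  VARCHAR(30) NOT NULL
);
INSERT INTO EMPLOYEE (EMP_ID, LNAME) VALUES (3, 'Williams');

-- Modify a database's structure: add a column to an existing table.
ALTER TABLE EMPLOYEE ADD PHONE VARCHAR(20);

-- Update the contents of a database.
UPDATE EMPLOYEE
SET PHONE = '555-0100'
WHERE EMP_ID = 3;

-- Query a database for information.
SELECT EMP_ID, LNAME, PHONE
FROM EMPLOYEE;

-- Add user permissions to a table. This succeeds only if an account named
-- report_user already exists; creating accounts is vendor specific.
GRANT SELECT ON EMPLOYEE TO report_user;
```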
NOTE The term SQL can be confusing. The S, for Structured, and the L,
for Language, are straightforward enough, but the Q is a little
misleading. Q, of course, stands for Query, which, if taken literally,
would restrict you to asking the database questions. But SQL
does much more than ask questions. With SQL you can also
create tables, add data, delete data, splice data together, trigger
actions based on changes to the database, and store your queries
within your program or database.
Unfortunately, there is no good substitute for Query. Obviously,
Structured Add Modify Delete Join Store Trigger and Query
Language (SAMDJSTQL) is a bit cumbersome. In the interest of
harmony, we will stay with SQL. However, you now know that its
function is bigger than its name.
The most commonly used statement in SQL is the SELECT statement (see Lesson 2,
"Introducing the Query"), which retrieves data from the database and returns the data to
the user. The EMPLOYEE table illustrates a typical example of a SELECT statement
situation. In addition to the SELECT statement, SQL provides statements for creating new
databases, tables, fields, and indexes as well as statements for inserting and deleting records.
ANSI SQL also recommends a core group of data manipulation functions. As you will
find out, many database systems also have tools for ensuring data integrity and enforcing
MySQL
Examples of MySQL are used in this book to demonstrate command-line SQL syntax.
MySQL (available at http://www.mysql.com/) downloads and installs with relative ease
and is gaining popularity as a DBMS. Detailed steps for getting and installing MySQL
are included as an appendix to this book. Refer to Appendix D, "Using MySQL for
Exercises," for information about obtaining and installing MySQL on your computer.
Oracle
We use Oracle, which represents the larger corporate database world, to demonstrate
command-line SQL and database management techniques. (These techniques are
important because the days of the standalone machine are drawing to an end, as are the days
when knowing one database or one operating system was enough.) In command-line SQL,
simple, standalone SQL statements are entered into Oracle's SQL*Plus tool. This tool
then returns data to the screen for the user to see, or it performs the appropriate action on
the database.
Most of the examples are directed toward the beginning programmer or first-time user of
SQL. We begin with the simplest of SQL statements and advance to the topics of
transaction management and stored procedure programming. The Oracle RDBMS comes with
graphical tools for database, user, and object administration, as well as the SQL*Loader
utility, which is used to import and export data to and from Oracle.
It includes nearly all the tools needed to demonstrate the topics discussed in this book.
It is available on virtually every platform in use today and is one of the most popular
RDBMS products worldwide.
A 30-day trial copy can be downloaded from Oracle Corporation's World Wide
Web server (http://www.oracle.com).
TIP Keep in mind that nearly all the SQL code given in this book is
portable to other database management systems. In cases where
syntax differs greatly among different vendors' products, examples
are given to illustrate these differences.
IBM DB2
IBM originally developed SQL in the late 1970s for its DB2 platform. As the world's
leading hardware vendor, IBM has also transformed its DB2 platform into what is known
as the Universal Database line.
(Diagram: the ODBC structure. The Driver Manager loads the ODBC driver. The ODBC Driver
processes ODBC calls, submits SQL requests, and returns results. The Data Source is the
underlying DBMS.)
The unique feature of ODBC (as compared to the Oracle or Sybase libraries) is that none
of its functions is database-vendor specific. For instance, you can use the same code to
perform queries against a Microsoft Access table or against an Informix database with
little or no modification. Once again, it should be noted that most vendors add some
proprietary extensions to the SQL standard, such as Microsoft's and Sybase's Transact-SQL
and Oracle's PL/SQL.
You should always consult the documentation before beginning to work with a new data
source. ODBC has developed into a standard adopted into many products, including
Visual Basic, Visual C++, FoxPro, Borland Delphi, and PowerBuilder. As always,
application developers need to weigh the benefit of using the emerging ODBC standard,
which enables you to design code without regard for a specific database, versus the speed
gained by using a database-specific function library. In other words, using ODBC will be
more portable but slower than using the Oracle or Sybase libraries.
Before the concept of dynamic SQL evolved, embedded SQL was the most popular way
to use SQL within a programming environment. Embedded SQL, which is still used,
uses static SQL, meaning that the SQL statement is compiled into the application and
cannot be changed at runtime.
The principle is much the same as a compiler versus an interpreter. The performance for
this type of SQL is good; however, it is not flexible and cannot always meet the needs
of today's changing business environments. Dynamic SQL is discussed shortly.
The ANSI 1992 standard (SQL-92) extended the language and became an international
standard. It defines three levels of SQL compliance: entry, intermediate, and full. The
new features that SQL-92 introduced include the following:
Connections to databases
Scrollable cursors
Dynamic SQL
Outer joins
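Of the features in this list, the outer join is the easiest to show in a few lines. The
sketch below uses the SQL-92 LEFT OUTER JOIN syntax; the tables are cut down to two
columns each, and the column types and rows are assumptions made up for illustration.

```sql
-- Run in an empty scratch database. Types and rows are assumptions.
CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER     NOT NULL PRIMARY KEY,
    LNAME  VARCHAR(30) NOT NULL
);

CREATE TABLE DEPENDENTS (
    EMP_ID INTEGER     NOT NULL,
    FNAME  VARCHAR(30) NOT NULL
);

INSERT INTO EMPLOYEE (EMP_ID, LNAME) VALUES (1, 'Williams');
INSERT INTO EMPLOYEE (EMP_ID, LNAME) VALUES (2, 'Jones');
INSERT INTO DEPENDENTS (EMP_ID, FNAME) VALUES (1, 'Sam');

-- Every employee is returned. Where no dependent matches, DEP_FNAME is NULL.
SELECT E.EMP_ID, E.LNAME, D.FNAME AS DEP_FNAME
FROM EMPLOYEE E
LEFT OUTER JOIN DEPENDENTS D ON D.EMP_ID = E.EMP_ID;
```

An ordinary (inner) join would drop employee 2, who has no dependents; the outer join
keeps that row and fills the missing dependent column with NULL.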
The largest ANSI standard revision (SQL3) has five interrelated documents. Other
documents may be added in the near future. The five parts are as follows:
The SQL standard has two levels of minimal conformance that a DBMS may claim: Core
SQL Support and Enhanced SQL Support.
This book covers not only all these extensions but also some proprietary extensions used
by RDBMS vendors. Dynamic SQL allows you to prepare the SQL statement at runtime.
Although the performance for this type of SQL is not as good as that of embedded SQL,
it provides the application developer (and user) with a great degree of flexibility. A
call-level interface, such as ODBC or Sybase's DB-Library, is an example of dynamic SQL.
Call-level interfaces should not be a new concept to application programmers. When
using ODBC, for instance, you simply fill a variable with your SQL statement and call
the function to send the SQL statement to the database. Errors or results can be returned
to the program through the use of other function calls designed for those purposes.
Results are returned through a process known as the binding of variables.
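The idea of filling a variable with a SQL statement and preparing it at runtime can be
tried from the MySQL command line. The sketch below uses MySQL's PREPARE, EXECUTE, and
DEALLOCATE PREPARE statements and MySQL user variables (the @ names); it is
MySQL-specific syntax, not ODBC, and the table contents and the names @sql_text, @id,
and emp_lookup are assumptions chosen for illustration.

```sql
-- MySQL-specific sketch. Run in an empty scratch database.
CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER     NOT NULL PRIMARY KEY,
    LNAME  VARCHAR(30) NOT NULL,
    FNAME  VARCHAR(30) NOT NULL
);
INSERT INTO EMPLOYEE (EMP_ID, LNAME, FNAME) VALUES (3, 'Williams', 'Mac');

-- Fill a variable with the statement text; ? marks a value supplied later.
SET @sql_text = 'SELECT LNAME, FNAME FROM EMPLOYEE WHERE EMP_ID = ?';

-- Prepare the statement at runtime from the text held in the variable.
PREPARE emp_lookup FROM @sql_text;

-- Bind a value to the placeholder and run the prepared statement.
SET @id = 3;
EXECUTE emp_lookup USING @id;

-- Release the prepared statement when it is no longer needed.
DEALLOCATE PREPARE emp_lookup;
```

Because the statement text is an ordinary string, a program could build a different
query on each run, which is the flexibility that static, embedded SQL lacks.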
Summary
This lesson covers some of the history and structure behind SQL. Because SQL and
relational databases are so closely linked, this lesson also covers (albeit briefly) the history
and function of relational databases. As you learned, databases are used by most
organizations in one form or another to manage important corporate data. Without databases,
organizations would be forced to continue storing all data in hard-copy format. Without a
standard database language such as SQL, users would lack the robust and easy-to-use
interface that allows communication with the database environment. Also, remember Dr.
Codd's rules of the relational database model as they are the basis for all relational
database management systems (RDBMSs). Lesson 2 is devoted to the most important
component of SQL: the query.
Q&A
Q Why should I be concerned about SQL?
A Until recently, if you weren't working on a large database system, you probably
had only a passing knowledge of SQL. With the advent of client/server development
tools (such as Visual Basic, Visual C++, ODBC, Borland's Delphi, and
Sybase's PowerBuilder) and the movement of several large databases (Oracle and
Sybase) to the PC platform, most business applications being developed today
require a working knowledge of SQL.
Q Why do I need to know anything about relational database theory to use SQL?
A SQL was developed to service relational databases. Without a minimal understanding
of relational database theory, you will not be able to use SQL effectively,
except in the most trivial cases.
Q All the new GUI tools enable me to click a button to write SQL. Why should I
spend time learning to write SQL manually?
A GUI tools have their place, and manually writing SQL has its place. Manually
written SQL is generally more efficient than GUI-written SQL. Also, a GUI SQL
statement is not as easy to read as a manually written SQL statement, and more
complex queries might be more cumbersome using a GUI tool than by writing the
query manually. Finally, knowing what is going on behind the scenes when you use
GUI tools will help you get the most out of them.
Q So, if SQL is standardized, should I be able to program with SQL on any
database?
A No, you will be able to program with SQL only on RDBMS databases that support
SQL, such as Microsoft Access, Oracle, Microsoft SQL Server, Sybase, and
Informix. Although each vendor's implementation will differ slightly from the
others, you should be able to use SQL with very few adjustments.
Workshop
The Workshop provides quiz questions to help solidify your understanding of the
material covered, as well as exercises to provide you with experience in using what you have
learned. Try to answer the quiz and exercise questions before checking the answers in
Appendix A, "Answers."
Quiz
1. What makes SQL a nonprocedural language?
2. How can you tell whether a database is truly relational?
Exercise
Refer to Appendix D. Download and install MySQL on your computer to prepare for
hands-on exercises in the following lessons. We will be using MySQL for as many
exercises as possible in this book because MySQL is mostly ANSI-compliant, free, and easy
to download and use. Some MySQL exercises might utilize syntax that is slightly
different from the Oracle examples we used. We will do our best to point out any differences
or noncompliance with the ANSI standard that exist in MySQL.