
Lecture 6 - Aggregating Data Using Group Functions



Objectives

After completing this lesson, you should be able to do the following:
- Identify the available group functions
- Describe the use of group functions
- Group data using the GROUP BY clause
- Include or exclude grouped rows by using the HAVING clause
What Are Group Functions?
Group functions operate on sets of rows to give one result per group.

EMP
   DEPTNO       SAL
--------- ---------
       10      2450
       10      5000
       10      1300
       20       800
       20      1100
       20      3000
       20      3000
       20      2975
       30      1600
       30      2850
       30      1250
       30       950
       30      1500
       30      1250

“maximum salary in the EMP table”

 MAX(SAL)
---------
     5000
Types of Group Functions

- AVG
- COUNT
- MAX
- MIN
- STDDEV
- SUM
- VARIANCE
Using Group Functions

SELECT    [column,] group_function(column)
FROM      table
[WHERE    condition]
[GROUP BY column]
[ORDER BY column];
Guidelines for Using Group Functions
- DISTINCT makes the function consider only nonduplicate values;
  ALL makes it consider every value, including duplicates. The
  default is ALL and therefore does not need to be specified.
- Where expr is listed, the datatypes for the arguments may be
  CHAR, VARCHAR2, NUMBER, or DATE.
- All group functions except COUNT(*) ignore null values. To
  substitute a value for null values, use the NVL function.
- Older releases of the Oracle Server typically returned GROUP BY
  results sorted in ascending order of the grouping columns, but
  this ordering is not guaranteed. To get a specific order,
  including descending order with DESC, use an ORDER BY clause.
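
To make the DISTINCT guideline concrete, here is a minimal sketch in Python. It assumes the standard-library sqlite3 module as a stand-in for Oracle (the lecture itself uses SQL*Plus), and the one-column emp table and variable names are illustrative only. The salaries are the 14 SAL values shown in the slides.

```python
import sqlite3

# Salaries from the EMP table shown in the slides (14 rows).
SALARIES = [2450, 5000, 1300, 800, 1100, 3000, 3000, 2975,
            1600, 2850, 1250, 950, 1500, 1250]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (sal INTEGER)")
conn.executemany("INSERT INTO emp (sal) VALUES (?)", [(s,) for s in SALARIES])

# Default behaviour (what the slides call ALL): every value is considered.
count_all, sum_all = conn.execute(
    "SELECT COUNT(sal), SUM(sal) FROM emp"
).fetchone()

# DISTINCT: the duplicated values 1250 and 3000 are each considered once.
count_distinct, sum_distinct = conn.execute(
    "SELECT COUNT(DISTINCT sal), SUM(DISTINCT sal) FROM emp"
).fetchone()

print(count_all, sum_all)
print(count_distinct, sum_distinct)
```

The two results differ only because 1250 and 3000 each occur twice in the SAL column.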
Using AVG and SUM Functions

You can use AVG and SUM for numeric data.

SQL> SELECT AVG(sal), MAX(sal),
  2         MIN(sal), SUM(sal)
  3  FROM   emp
  4  WHERE  job LIKE 'SALES%';

 AVG(SAL)  MAX(SAL)  MIN(SAL)  SUM(SAL)
--------- --------- --------- ---------
     1400      1600      1250      5600
Using MIN and MAX Functions

You can use MIN and MAX for any datatype.
SQL> SELECT MIN(hiredate), MAX(hiredate)
  2  FROM   emp;

MIN(HIRED MAX(HIRED
--------- ---------
17-DEC-80 12-JAN-83
Group Functions (continued)
You can use the MAX and MIN functions for any datatype. The
slide example displays the hire dates of the most senior
(earliest hired) and most junior (most recently hired) employees.
The following example displays the employee names that come first
and last in an alphabetized list of all employees.
SQL> SELECT MIN(ename), MAX(ename)
  2  FROM   emp;

MIN(ENAME) MAX(ENAME)
---------- ----------
ADAMS      WARD

Note: AVG, SUM, VARIANCE, and STDDEV functions can be used only
with numeric datatypes.
Using the COUNT Function
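COUNT(*) returns the number of rows in a table, including
duplicate rows and rows that contain nulls. With a WHERE clause,
it returns the number of rows that satisfy the condition.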
SQL> SELECT COUNT(*)
2 FROM emp
3 WHERE deptno = 30;

COUNT(*)
---------
6
Using the COUNT Function
COUNT(expr) returns the number of rows in which expr is not null.
Display the number of employees in department 30 who can
earn a commission.
SQL> SELECT COUNT(comm)
2 FROM emp
3 WHERE deptno = 30;

COUNT(COMM)
-----------
4

Notice that the result is four, not six: two employees in
department 30 cannot earn a commission and have a null value in
the COMM column, so COUNT(comm) does not count them.
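
The difference between COUNT(*) and COUNT(expr) can be checked side by side. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module in place of Oracle, the DEPTNO and SAL values come from the slides, and the individual COMM amounts are assumed (the slides only show their aggregates). Only the null or nonnull pattern of COMM matters for the counts.

```python
import sqlite3

# (deptno, sal, comm); None is stored as SQL NULL.
# deptno and sal follow the slides; the comm amounts are assumed values.
EMP = [
    (10, 2450, None), (10, 5000, None), (10, 1300, None),
    (20, 800, None), (20, 1100, None), (20, 3000, None),
    (20, 3000, None), (20, 2975, None),
    (30, 1600, 300), (30, 2850, None), (30, 1250, 500),
    (30, 950, None), (30, 1500, 0), (30, 1250, 1400),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (deptno INTEGER, sal INTEGER, comm INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP)

# COUNT(*) counts rows; COUNT(comm) counts only rows where comm is not null.
rows_in_30, comm_in_30 = conn.execute(
    "SELECT COUNT(*), COUNT(comm) FROM emp WHERE deptno = 30"
).fetchone()

print(rows_in_30, comm_in_30)
```

Note that a commission of 0 is a real value, not a null, so that row is still counted by COUNT(comm).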
The COUNT Function
Example
Display the number of rows in the EMP table that have a department
number. COUNT(deptno) counts every nonnull DEPTNO value, duplicates
included, so the result is the number of employees assigned to a
department, not the number of departments.
SQL> SELECT COUNT(deptno)
  2  FROM   emp;

COUNT(DEPTNO)
-------------
           14

Display the number of distinct departments in the EMP table.
SQL> SELECT COUNT(DISTINCT (deptno))
  2  FROM   emp;

COUNT(DISTINCT(DEPTNO))
-----------------------
                      3
Group Functions and Null Values
Group functions ignore null values in
the column.
SQL> SELECT AVG(comm)
2 FROM emp;

AVG(COMM)
---------
550

All group functions except COUNT(*) ignore null values in the column.
In the slide example, the average is calculated based only
on the rows in the table where a valid value is stored in
the COMM column. The average is calculated as the total
commission paid to all employees divided by the
number of employees receiving a commission (4).
Using the NVL Function
with Group Functions
The NVL function forces group functions
to include null values.
SQL> SELECT AVG(NVL(comm,0))
2 FROM emp;

AVG(NVL(COMM,0))
----------------
157.14286
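
The two averages above (550 and 157.14286) can be reproduced with a small sketch. It rests on several assumptions: Python's sqlite3 module replaces Oracle, SQLite has no NVL function so the standard COALESCE function is swapped in for it, and the individual commission amounts are assumed values chosen to total 2200 across four employees, which is what the slide results imply.

```python
import sqlite3

# Commission per employee; None is NULL. Four of the 14 employees have a
# value. The individual amounts are assumed; they total 2200.
COMMS = [300, 500, 1400, 0] + [None] * 10

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (comm INTEGER)")
conn.executemany("INSERT INTO emp (comm) VALUES (?)", [(c,) for c in COMMS])

# AVG(comm) divides by the 4 nonnull rows; replacing nulls with 0 first
# (COALESCE here, NVL in the slides) makes it divide by all 14 rows.
avg_ignoring_nulls, avg_nulls_as_zero = conn.execute(
    "SELECT AVG(comm), AVG(COALESCE(comm, 0)) FROM emp"
).fetchone()

print(avg_ignoring_nulls)           # 2200 / 4
print(round(avg_nulls_as_zero, 5))  # 2200 / 14
```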
Creating Groups of Data
EMP
   DEPTNO       SAL
--------- ---------
       10      2450
       10      5000
       10      1300
       20       800
       20      1100
       20      3000
       20      3000
       20      2975
       30      1600
       30      2850
       30      1250
       30       950
       30      1500
       30      1250

“average salary in EMP table for each department”

   DEPTNO  AVG(SAL)
--------- ---------
       10 2916.6667
       20      2175
       30 1566.6667

Until now, all group functions have treated the table as one large group
of information. At times, you need to divide the table of
information into smaller groups. This can be done by
using the GROUP BY clause.
Creating Groups of Data:
GROUP BY Clause
SELECT column, group_function(column)
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[ORDER BY column];

Divide rows in a table into smaller groups by using the GROUP BY
clause.
Using the GROUP BY Clause
All columns in the SELECT list that are
not in group functions must be in the
GROUP BY clause.
Display the department number and the
average salary for each department.
SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 GROUP BY deptno;

DEPTNO AVG(SAL)
--------- ---------
10 2916.6667
20 2175
30 1566.6667
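
The per-department averages above can be reproduced with the following sketch. It assumes Python's sqlite3 module as a stand-in for Oracle and a two-column emp table holding only the DEPTNO and SAL values from the slides. An ORDER BY is added so the row order does not depend on how the engine happens to group.

```python
import sqlite3

# (deptno, sal) pairs copied from the EMP table in the slides.
EMP = [
    (10, 2450), (10, 5000), (10, 1300),
    (20, 800), (20, 1100), (20, 3000), (20, 3000), (20, 2975),
    (30, 1600), (30, 2850), (30, 1250), (30, 950), (30, 1500), (30, 1250),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", EMP)

# One output row per distinct deptno; ORDER BY makes the row order explicit.
avg_by_dept = conn.execute(
    "SELECT deptno, AVG(sal) FROM emp GROUP BY deptno ORDER BY deptno"
).fetchall()

for deptno, avg_sal in avg_by_dept:
    print(deptno, round(avg_sal, 4))
```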
Using the GROUP BY Clause
The GROUP BY column does not have
to be in the SELECT list.
SQL> SELECT AVG(sal)
2 FROM emp
3 GROUP BY deptno;

AVG(SAL)
---------
2916.6667
2175
1566.6667
Grouping by More Than One Column
EMP
   DEPTNO JOB             SAL
--------- --------- ---------
       10 MANAGER        2450
       10 PRESIDENT      5000
       10 CLERK          1300
       20 CLERK           800
       20 CLERK          1100
       20 ANALYST        3000
       20 ANALYST        3000
       20 MANAGER        2975
       30 SALESMAN       1600
       30 MANAGER        2850
       30 SALESMAN       1250
       30 CLERK           950
       30 SALESMAN       1500
       30 SALESMAN       1250

“sum salaries in the EMP table for each job, grouped by department”

   DEPTNO JOB        SUM(SAL)
--------- --------- ---------
       10 CLERK          1300
       10 MANAGER        2450
       10 PRESIDENT      5000
       20 ANALYST        6000
       20 CLERK          1900
       20 MANAGER        2975
       30 CLERK           950
       30 MANAGER        2850
       30 SALESMAN       5600
Using the GROUP BY Clause
on Multiple Columns
SQL> SELECT deptno, job, sum(sal)
2 FROM emp
3 GROUP BY deptno, job;

   DEPTNO JOB        SUM(SAL)
--------- --------- ---------
       10 CLERK          1300
       10 MANAGER        2450
       10 PRESIDENT      5000
       20 ANALYST        6000
       20 CLERK          1900
...
9 rows selected.
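
As a rough check of the multi-column grouping, the sketch below rebuilds the DEPTNO, JOB, and SAL columns from the slides and runs the same GROUP BY. Python's sqlite3 module is assumed as a stand-in for Oracle, and the ORDER BY is an addition so that the nine rows come back in a predictable order.

```python
import sqlite3

# (deptno, job, sal) rows copied from the EMP table in the slides.
EMP = [
    (10, "MANAGER", 2450), (10, "PRESIDENT", 5000), (10, "CLERK", 1300),
    (20, "CLERK", 800), (20, "CLERK", 1100), (20, "ANALYST", 3000),
    (20, "ANALYST", 3000), (20, "MANAGER", 2975),
    (30, "SALESMAN", 1600), (30, "MANAGER", 2850), (30, "SALESMAN", 1250),
    (30, "CLERK", 950), (30, "SALESMAN", 1500), (30, "SALESMAN", 1250),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (deptno INTEGER, job TEXT, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP)

# Each distinct (deptno, job) combination forms one group.
sum_by_dept_job = conn.execute(
    "SELECT deptno, job, SUM(sal) FROM emp "
    "GROUP BY deptno, job ORDER BY deptno, job"
).fetchall()

for row in sum_by_dept_job:
    print(row)
```

Rows are grouped first by department and then, within each department, by job, so a job such as CLERK appears once per department rather than once overall.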
Illegal Queries
Using Group Functions
Any column or expression in the SELECT list that is not an
aggregate function must be in the GROUP BY clause.
SQL> SELECT deptno, COUNT(ename)
  2  FROM   emp;

SELECT deptno, COUNT(ename)
       *
ERROR at line 1:
ORA-00937: not a single-group group function

Column missing in the GROUP BY clause
Illegal Queries Using Group Functions
Whenever you use a mixture of individual items (DEPTNO) and group functions
(COUNT) in the same SELECT statement, you must include a GROUP BY
clause that specifies the individual items (in this case, DEPTNO). If
the GROUP BY clause is missing, then the error message “not a
single-group group function” appears and an asterisk (*) points to the
offending column. You can correct the error on the slide by adding
the GROUP BY clause.
SQL> SELECT deptno, COUNT(ename)
  2  FROM   emp
  3  GROUP BY deptno;

   DEPTNO COUNT(ENAME)
--------- ------------
       10            3
       20            5
       30            6

Any column or expression in the SELECT list that is not an aggregate function must be
in the GROUP BY clause.
Illegal Queries
Using Group Functions
- You cannot use the WHERE clause to restrict groups.
- You use the HAVING clause to restrict groups.
SQL> SELECT deptno, AVG(sal)
  2  FROM   emp
  3  WHERE  AVG(sal) > 2000
  4  GROUP BY deptno;

WHERE AVG(sal) > 2000
      *
ERROR at line 3:
ORA-00934: group function is not allowed here

Cannot use the WHERE clause to restrict groups
The WHERE clause cannot be used to restrict groups. The
SELECT statement on the slide results in an error because it uses
the WHERE clause to restrict the display of average salaries of
those departments that have an average salary greater than $2000.
You can correct the slide error by using the HAVING clause to
restrict groups.
SQL> SELECT deptno, AVG(sal)
2 FROM emp
3 GROUP BY deptno
4 HAVING AVG(sal) > 2000;

   DEPTNO  AVG(SAL)
--------- ---------
       10 2916.6667
       20      2175
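
The same contrast between WHERE and HAVING can be tried outside Oracle. The sketch below assumes Python's sqlite3 module as a stand-in; SQLite also rejects a group function in the WHERE clause, although its error text differs from ORA-00934, so the code only records that an error occurred. Table and variable names are illustrative.

```python
import sqlite3

# (deptno, sal) pairs copied from the EMP table in the slides.
EMP = [
    (10, 2450), (10, 5000), (10, 1300),
    (20, 800), (20, 1100), (20, 3000), (20, 3000), (20, 2975),
    (30, 1600), (30, 2850), (30, 1250), (30, 950), (30, 1500), (30, 1250),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", EMP)

# Illegal: WHERE is evaluated per row, before any groups exist.
try:
    conn.execute(
        "SELECT deptno, AVG(sal) FROM emp "
        "WHERE AVG(sal) > 2000 GROUP BY deptno"
    ).fetchall()
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# Legal: HAVING is evaluated per group, after the averages are computed.
having_rows = conn.execute(
    "SELECT deptno, AVG(sal) FROM emp "
    "GROUP BY deptno HAVING AVG(sal) > 2000 ORDER BY deptno"
).fetchall()

print("WHERE version failed:", where_error is not None)
print(having_rows)
```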
Excluding Group Results
EMP
   DEPTNO       SAL
--------- ---------
       10      2450
       10      5000
       10      1300
       20       800
       20      1100
       20      3000
       20      3000
       20      2975
       30      1600
       30      2850
       30      1250
       30       950
       30      1500
       30      1250

“maximum salary per department greater than $2900”

   DEPTNO  MAX(SAL)
--------- ---------
       10      5000
       20      3000
Excluding Group Results:
HAVING Clause
Use the HAVING clause to restrict groups:
- Rows are grouped.
- The group function is applied.
- Groups matching the HAVING clause are displayed.
SELECT column, group_function
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
[ORDER BY column];
Using the HAVING Clause
Display department numbers and maximum salary for those
departments whose maximum salary is greater than
$2900.
SQL> SELECT deptno, max(sal)
2 FROM emp
3 GROUP BY deptno
4 HAVING max(sal)>2900;

DEPTNO MAX(SAL)
--------- ---------
10 5000
20 3000
Using the HAVING Clause

Display the job title and total monthly salary for each
job title with a total payroll exceeding $5000.
SQL> SELECT job, SUM(sal) PAYROLL
2 FROM emp
3 WHERE job NOT LIKE 'SALES%'
4 GROUP BY job
5 HAVING SUM(sal)>5000
6 ORDER BY SUM(sal);

JOB PAYROLL
--------- ---------
ANALYST 6000
MANAGER 8275
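
The payroll query combines WHERE, GROUP BY, HAVING, and ORDER BY in one statement. A minimal sketch of it follows, assuming Python's sqlite3 module as a stand-in for Oracle and a two-column emp table built from the JOB and SAL values in the slides.

```python
import sqlite3

# (job, sal) pairs copied from the EMP table in the slides.
EMP = [
    ("MANAGER", 2450), ("PRESIDENT", 5000), ("CLERK", 1300),
    ("CLERK", 800), ("CLERK", 1100), ("ANALYST", 3000),
    ("ANALYST", 3000), ("MANAGER", 2975),
    ("SALESMAN", 1600), ("MANAGER", 2850), ("SALESMAN", 1250),
    ("CLERK", 950), ("SALESMAN", 1500), ("SALESMAN", 1250),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (job TEXT, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", EMP)

payroll_rows = conn.execute(
    "SELECT job, SUM(sal) AS payroll "
    "FROM emp "
    "WHERE job NOT LIKE 'SALES%' "   # drop salesman rows before grouping
    "GROUP BY job "                  # one group per remaining job title
    "HAVING SUM(sal) > 5000 "        # keep only groups above 5000
    "ORDER BY SUM(sal)"              # sort the surviving groups
).fetchall()

print(payroll_rows)
```

PRESIDENT is excluded because its total is exactly 5000, which does not exceed the threshold.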
Nesting Group Functions
Display the maximum average salary.

SQL> SELECT max(avg(sal))
  2  FROM   emp
  3  GROUP BY deptno;

MAX(AVG(SAL))
-------------
2916.6667
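
For readers without an Oracle instance, the sketch below computes the same maximum of the department averages using Python's sqlite3 module. One substitution is worth flagging: SQLite, to the best of my knowledge, does not accept the nested form max(avg(sal)), so the inner AVG is moved into a subquery. The alias avg_sal is a hypothetical name introduced for this illustration.

```python
import sqlite3

# (deptno, sal) pairs copied from the EMP table in the slides.
EMP = [
    (10, 2450), (10, 5000), (10, 1300),
    (20, 800), (20, 1100), (20, 3000), (20, 3000), (20, 2975),
    (30, 1600), (30, 2850), (30, 1250), (30, 950), (30, 1500), (30, 1250),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.execute("CREATE TABLE emp (deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", EMP)

# Inner query: one average per department.
# Outer query: the largest of those averages.
max_avg = conn.execute(
    "SELECT MAX(avg_sal) FROM "
    "(SELECT AVG(sal) AS avg_sal FROM emp GROUP BY deptno)"
).fetchone()[0]

print(round(max_avg, 4))
```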
Summary
SELECT column, group_function(column)
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
[ORDER BY column];

Order of evaluation of the clauses:
- WHERE clause
- GROUP BY clause
- HAVING clause
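
The evaluation order can also be traced by hand. The following plain-Python sketch is only an emulation of the three stages, not a description of how the Oracle Server executes a query. It applies them to the JOB and SAL values from the slides, using the payroll conditions from the earlier HAVING example (exclude jobs starting with SALES, keep totals above 5000).

```python
from collections import defaultdict

# (job, sal) pairs copied from the EMP table in the slides.
EMP = [
    ("MANAGER", 2450), ("PRESIDENT", 5000), ("CLERK", 1300),
    ("CLERK", 800), ("CLERK", 1100), ("ANALYST", 3000),
    ("ANALYST", 3000), ("MANAGER", 2975),
    ("SALESMAN", 1600), ("MANAGER", 2850), ("SALESMAN", 1250),
    ("CLERK", 950), ("SALESMAN", 1500), ("SALESMAN", 1250),
]

# 1. WHERE: filter individual rows before any grouping happens.
kept_rows = [(job, sal) for job, sal in EMP if not job.startswith("SALES")]

# 2. GROUP BY: collect the surviving rows into one bucket per job.
groups = defaultdict(list)
for job, sal in kept_rows:
    groups[job].append(sal)

# 3. HAVING: filter whole groups using an aggregate of each group.
payroll = {job: sum(sals) for job, sals in groups.items() if sum(sals) > 5000}

result = sorted(payroll.items(), key=lambda item: item[1])
print(result)
```

Because the WHERE stage runs first, the salesman rows never reach a group, which is why a condition on a group total has to wait for the HAVING stage.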
