Databases: Wednesday, January 21, 2009 3:20 PM

1) Databases are used to store persistent data instead of using flat files. They allow for shared data access and improved data structure. 2) Database management systems provide software to create, access, and manage databases. Larger systems include Oracle, SQL Server, and MySQL. 3) Databases follow a three level architecture of external, conceptual, and internal levels to provide data independence and flexibility.

Uploaded by

espyter
Copyright
© Attribution Non-Commercial (BY-NC)
Databases

Wednesday, January 21, 2009


3:20 PM
Databases
File Processing application
Use several flat files to store persistent data
Applications built on databases
Use database instead of files
Database
Shared collection of related data and information about the structure of the data
Database management system
Collection of software for managing databases
Large packages
 
From largest to smallest: Oracle (big), DB2 (IBM), SQL Server, MySQL, Pyrho (small memory requirement)
Databases for cloud computing
Trade-offs (advantages of each approach)
DBMS
Nice, simple way to think about data (rows and columns)
Concurrency
Access control
Can change the structure of a table
Error recovery
Flexibility to retrieve anything in the database
File processing
Concurrency: depend on software for file logging, or create your own
Speed
Don't have to learn another language to interact with a DB
 
Three level architecture
External level
Users/user rights/views
Conceptual level (tables/descriptions)
Developer? DB designer
Internal level
DB administrator (high demand/high $)
Data independence
Logical: able to make changes at the middle (conceptual) level without being forced to make changes at the external level
Physical: able to make changes at the internal level without changing the conceptual level
Objectives
All authorized users should be able to access the same data but perhaps in
slightly different ways
Different users should be authorized to access different parts of the data
Users shouldn’t have to know the structure/details of the database
DB admin should be able to change the structure
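The external level can be illustrated with a view. The following is a minimal sketch using Python's built-in sqlite3 module; the Staff table, its columns, and the StaffDirectory view are hypothetical names chosen for illustration. The view exposes only part of the data, so a user of the view never sees (or needs to know about) the salary column.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Conceptual level: the full table (hypothetical schema).
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO Staff VALUES ('S1', 'Ann', 30000)")

# External level: a view that hides the salary column from its users.
conn.execute("CREATE VIEW StaffDirectory AS SELECT staffNo, name FROM Staff")

cursor = conn.execute("SELECT * FROM StaffDirectory")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

If the base table later gains a column, queries against the view keep working unchanged, which is the point of separating the levels.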
Languages
Most database languages have two main parts
Data definition language (DDL): setting up the structure of the database, like declaring a class or a struct
Setup, name entities, give out user rights
CREATE TABLE
Data manipulation language (DML): actually working on the data
Provides basic manipulation operations
SELECT UPDATE DELETE
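The DDL/DML split can be sketched with Python's built-in sqlite3 module. The Student table and its rows are made up for illustration; INSERT is also a DML statement even though it is not listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data (hypothetical Student table).
conn.execute("CREATE TABLE Student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")

# DML: work on the data itself.
conn.execute("INSERT INTO Student (id, name, major) VALUES (1, 'Bob', 'CS')")
conn.execute("INSERT INTO Student (id, name, major) VALUES (2, 'Sue', 'Math')")
conn.execute("UPDATE Student SET major = 'Physics' WHERE id = 2")
conn.execute("DELETE FROM Student WHERE id = 1")

students = conn.execute("SELECT id, name, major FROM Student").fetchall()
print(students)
```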
Data model
Text book uses "entities" "attributes" "relationships"
Instead of
"tables" "rows" and "columns"
Simply a way to think about databases
O-O Databases
Used very rarely, ~5%, for small niche applications
Functions of DB management systems
Data storage, retrieval, and updating
User-accessible catalog
Transaction support
Complete something entirely or fail entirely
Concurrency control
Ways to recover from errors
Authorization
Networkable
Integrity of data
Promote data independence
Utilities available for it
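Transaction support ("complete or fail") might look like the sketch below, which uses Python's sqlite3 module. The Account table, the CHECK constraint, and the transfer helper are assumptions made for this example. Using the connection as a context manager commits when the block succeeds and rolls back when it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts: both updates happen, or neither does."""
    try:
        with conn:  # commit on success, roll back if an exception is raised
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit to
        # dst made earlier in the same transaction has been rolled back.
        return False


first_ok = transfer(conn, 1, 2, 30)    # fits within the balance
second_ok = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute("SELECT id, balance FROM Account ORDER BY id").fetchall()
print(first_ok, second_ok, balances)
```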
Example
1st tier
Clients
2nd tier
Middleware
Application server
3rd tier
DB Servers
Vocab
First normal form
Can't have a set/array of values stored for a particular attribute in a row
Name: Meaning [Alt]
Degree: # of columns in a table
Cardinality: # of rows in a table
Tuple: [Row]
Attribute: [Column]
Relation: [Table]
Schema: # of columns, names of columns, data types of columns [Structure]
Superkey: Column or set of columns that uniquely identifies a row within a table
Composite key: When a key consists of more than 1 attribute/column
Candidate key: A column (or columns) that will have a unique value for each row/tuple. Difference from superkey: uses the least number of columns needed to provide uniqueness
Primary key: Candidate key that is selected as the unique identifier for the rows
Alternate keys: Candidate keys that are not selected as the primary key
Entity integrity: Primary keys cannot be null
Referential integrity: Foreign key must match a candidate key from another relation or be null
Foreign key: Column(s) whose value must exist as a key in another table for the row to exist in the current table
A table can never have a duplicate tuple/row, EVER
Worst case, the entire set of columns will be the primary key
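The key and integrity rules above can be tried out in SQLite through Python's sqlite3 module. This is a sketch under stated assumptions: the Branch and Staff tables are hypothetical, foreign-key checking has to be switched on explicitly in SQLite, and NOT NULL is written out because SQLite does not otherwise forbid NULL in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY NOT NULL)")
conn.execute(
    "CREATE TABLE Staff ("
    "staffNo TEXT PRIMARY KEY NOT NULL, "
    "branchNo TEXT REFERENCES Branch(branchNo))"
)
conn.execute("INSERT INTO Branch VALUES ('B1')")
conn.execute("INSERT INTO Staff VALUES ('S1', 'B1')")
conn.execute("INSERT INTO Staff VALUES ('S2', NULL)")  # a null foreign key is allowed


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_key = rejected("INSERT INTO Staff VALUES ('S1', 'B1')")   # primary key must be unique
null_key = rejected("INSERT INTO Staff VALUES (NULL, 'B1')")        # entity integrity
dangling_ref = rejected("INSERT INTO Staff VALUES ('S3', 'B9')")    # referential integrity
staff_count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
print(duplicate_key, null_key, dangling_ref, staff_count)
```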
 
Mathematicians
Binary relation
Set of ordered pairs
A number and its square: { (1,1), (2,4), (3,9) }
(25,5) would not fit, because order within each pair matters
Order of the pairs within the set does not matter
Set does not contain repetition
Ternary relations
Set of ordered 3-tuples
R3= {('Bob', 71,160), ('sue', 65,120)}
K-ary relation
Subset of the cartesian product of k domains
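As a small illustration of "a relation is a subset of a Cartesian product", the sketch below builds the number/square relation with Python sets. The two domains are chosen arbitrarily for the example.

```python
from itertools import product

numbers = {1, 2, 3}         # first domain
square_values = {1, 4, 9}   # second domain

# Cartesian product: every ordered pair (number, value).
all_pairs = set(product(numbers, square_values))

# The binary relation "a number and its square" keeps only some of those pairs.
squares = {(n, n * n) for n in numbers}

print(len(all_pairs), sorted(squares))
```

Because `squares` is a set, adding `(2, 4)` a second time changes nothing, and `(4, 2)` is a different pair that does not belong to the relation.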
E. F. Codd
Mathematician who created the idea of relational databases
Relation (table) as we use it
Database relations
Assume we have identified domains
Domain
Finite set of values
Relation
A subset of the Cartesian product of a number of domains
+ the list of domains that we are taking the values from
Cartesian product
All combinations of values
Schema (structure)
Description of the table
# of columns
Names of columns
Data types
Relation to store standard info about cars
Color: string
Make: string
Model: string
Year: int
Automatic Trans: bool
VIN: string (primary key)

         
 
Deck of cards
Suit: enum{H,S,D,C} Value: int/enum

 
Each row will be unique
 
Relational algebra/calculus
Unary operators
-1 i.e. -(1)
Binary operators
3-2 i.e. 3.-(2)
Results are always relations
5 basic operations
Selection σ
Selecting rows from table
Unary operation
The selection condition may only refer to attributes of the argument relation; it cannot refer to other relations
Projection π
Subset of columns
Unary, on single table
Returns columns along with schema (structure)
Rather than SELECT *, you would list only the columns you want
Cartesian product X
Sets
All combinations of sets of values
RxQ
Concatenation of every row in R with every row in Q
Pairs the two rows together to create new rows
SELECT * FROM R,Q is Cartesian product in SQL
Union R ∪ S
All rows of one table added to the end of the other table, with duplicates removed to maintain uniqueness
Set difference R-S
Rows that R has that S doesn't
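The five basic operations can be sketched in a few lines of Python if a relation is modeled as a set of tuples. The helper names and the sample relations R and S are made up for illustration; columns are addressed by position here, not by name.

```python
from itertools import product


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return {row for row in rows if predicate(row)}


def project(rows, indexes):
    """Projection: keep only the listed column positions (duplicates collapse)."""
    return {tuple(row[i] for i in indexes) for row in rows}


def cartesian(left, right):
    """Cartesian product: concatenate every left row with every right row."""
    return {l + r for l, r in product(left, right)}


R = {(1, "a"), (2, "b"), (3, "a")}
S = {(3, "a"), (4, "c")}

selected = select(R, lambda row: row[1] == "a")
projected = project(R, [1])
crossed = cartesian(R, S)
union = R | S        # set union keeps rows unique
difference = R - S   # rows that R has and S doesn't

print(sorted(selected), sorted(projected), len(crossed), sorted(union), sorted(difference))
```

Every result is again a set of tuples, which matches the note that results are always relations.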
3 others that can be built from the basic 5 but make life easier
Join
Intersection
Division
σ predicate (R)
Works on a single relation R and defines a relation that contains only those tuples (rows) of R that satisfy the specified condition (predicate)
SELECT * FROM R WHERE predicate
JOINS
Selection and Cartesian product together, such as SELECT * FROM users, login WHERE users.userid = login.userid
Types of join |X|
Natural Join
Most important
Equijoin of two tables over all common attributes X, keeping one occurrence of each common attribute (the duplicate copy of each common column is eliminated from the result)
SQL
SELECT * FROM R, S WHERE R.A = S.A (equijoin; keeps both copies of A)
SELECT * FROM R NATURAL JOIN S (natural join; takes no ON clause, A appears once)
Theta join
R |X|F S === σF(R x S)
Aka: Cartesian product restricted to the rows where F is true
Equijoin
Theta join except only using =, equality, in the predicate F.
common
Outer join
Left outer join =|X|
Include all rows of the left side, even those with no value in the common attribute equal to one on the right side
These rows/tuples get NULL values in the columns of the newly created relation/table that come from the right-side relation, where no matching tuple was found
Semi join R|>S
Lets you take a local table, S, send it to a remote site (distributed database), and in return get the rows in R that match up with S, so that you can compute the join
Example
π street, city, postcode(σ t2.propertyNo = PropertyForRent.propertyNo (t2 x PropertyForRent))
Is the same as
π street, city, postcode(t2 |X| PropertyForRent) (natural join: select from the two tables where the common attribute/column values are equal)
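The join types above can be compared side by side in SQLite via Python's sqlite3 module. The tables R(A, B) and S(A, C) and their rows are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER, B TEXT)")
conn.execute("CREATE TABLE S (A INTEGER, C TEXT)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [(1, "x"), (2, "y")])
conn.executemany("INSERT INTO S VALUES (?, ?)", [(1, "p"), (3, "q")])

# Equijoin: selection over the Cartesian product; both copies of A are kept.
equijoin = conn.execute("SELECT * FROM R, S WHERE R.A = S.A").fetchall()

# Natural join: same matching rows, but the common column A appears once.
natural = conn.execute("SELECT * FROM R NATURAL JOIN S").fetchall()

# Left outer join: every row of R survives; unmatched rows get NULLs (None).
left_outer = conn.execute(
    "SELECT * FROM R LEFT OUTER JOIN S ON R.A = S.A ORDER BY R.A"
).fetchall()

print(equijoin, natural, left_outer)
```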
Division R÷S
Take the columns the two have in common, find rows in R that are paired with all values of S
The columns of S must be a subset of the columns of R
Returns all columns in R except the column(s) that were used in the comparison with S
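Division is easiest to see on a tiny example. The sketch below assumes a two-column relation R(student, course) and a one-column relation S(course); both are hypothetical, and the function handles only this simple shape.

```python
def divide(r_rows, s_values):
    """R(a, b) ÷ S(b): the a-values that are paired in R with every b in S."""
    candidates = {a for a, _ in r_rows}
    return {a for a in candidates if all((a, b) in r_rows for b in s_values)}


R = {("Ann", "DB"), ("Ann", "OS"), ("Bob", "DB")}
S = {"DB", "OS"}

took_everything = divide(R, S)    # students paired with every course in S
took_db = divide(R, {"DB"})

print(sorted(took_everything), sorted(took_db))
```

The result contains only the student column, since the course column was the one used in the comparison with S.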
Comparison of relational algebra (RA), tuple relational calculus (TRC), and domain relational calculus (DRC)
Difference
RA: R - S
TRC: {t | t ∈ ...
DRC: …
Cartesian product
RA: R x S
TRC: {t | ∃ r ∈ R ∃ s ∈ S (t[a1] = r[a1] ∧ … ∧ t[an] = r[an] ∧ t[b1] = s[b1] ∧ … ∧ t[bm] = s[bm])}, where a1…an are the attributes of R and b1…bm are the attributes of S
DRC: {<c1, …, cn, d1, …, dn> | <c1, …, cn> ∈ R ∧ <d1, …, dn> ∈ S}
Find loan-number for each loan of an amount greater than $1200
RA: Π loan-number(σ amount > 1200 (Loan))
TRC: {t | t ∈ loan ∧ t[amount] > 1200} returns whole tuples
To select just the loan number: {t | ∃ s ∈ loan (t[loan-number] = s[loan-number] ∧ s[amount] > 1200)}
Find the names of all customers who have a loan, an account, or both from the bank
RA: Π customer-name(borrower) ∪ Π customer-name(depositor)
Find the largest account balance
RA: Π balance(account) - Π account.balance(σ account.balance < d.balance (account x rename d(account)))
Subtract from the full table of balances the balances that are less than some other balance, found by concatenating the table with a renamed copy of itself
Names of customers having a loan at the Perryridge branch (natural join)
TRC: {t | ∃ s ∈ borrower (t[customer-name] = s[customer-name] ∧ ∃ u ∈ loan (u[branch-name] = "Perryridge" ∧ u[loan-number] = s[loan-number]))}
Anything you can write in one of the three (relational algebra, tuple relational calculus, domain relational calculus), you can write in the others
It's been proven
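Two of the queries discussed above can be mimicked with Python sets: the algebra version is written as explicit steps, while the calculus version reads almost directly as a set comprehension. The loan rows and balances are invented sample data, and columns are addressed by position.

```python
# loan(loan-number, branch-name, amount), hypothetical rows
loan = {
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-15", "Perryridge", 1500),
}

# Algebra style: selection first, then projection onto loan-number.
selected = {row for row in loan if row[2] > 1200}
algebra_result = {(row[0],) for row in selected}

# Calculus style: {t | ∃ s ∈ loan (t[loan-number] = s[loan-number] ∧ s[amount] > 1200)}
calculus_result = {(s[0],) for s in loan if s[2] > 1200}

# Largest balance by set difference: remove every balance that is
# smaller than some other balance; whatever remains is the maximum.
balances = {(500,), (700,), (900,)}
not_largest = {a for a in balances for d in balances if a[0] < d[0]}
largest = balances - not_largest

print(sorted(algebra_result), sorted(calculus_result), largest)
```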
Grouping operator G
For example, say you want the average age of students within each major
G min,max,count,avg,sum( )
Compute these aggregate values over the whole table
Attribute for group G min,max,count,avg,sum( )
Compute these values separately for each group of rows that share the grouping attribute
branchNo G COUNT staffNo, SUM salary (Staff)
Grouped by branchNo (i.e. perform COUNT and SUM for tuples with the same branchNo, then create a tuple of branchNo, COUNT, SUM, then move on to the next branchNo)
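The grouping step described for branchNo G COUNT staffNo, SUM salary (Staff) can be sketched in plain Python. The Staff rows below are made-up sample data.

```python
from collections import defaultdict

# Staff(staffNo, branchNo, salary), hypothetical rows
staff = [
    ("S1", "B1", 30000),
    ("S2", "B1", 12000),
    ("S3", "B2", 18000),
]

# Collect the salaries of the tuples that share a branchNo.
salaries_by_branch = defaultdict(list)
for staff_no, branch_no, salary in staff:
    salaries_by_branch[branch_no].append(salary)

# One result tuple per group: (branchNo, COUNT staffNo, SUM salary).
grouped = {
    (branch_no, len(salaries), sum(salaries))
    for branch_no, salaries in salaries_by_branch.items()
}

print(sorted(grouped))
```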
Query Language
M. Zloof created Query-by-Example (QBE)
A way to turn DRC into a graphical query language
SQL
SELECT [DISTINCT | ALL] {* | [columnExpression [AS newName]] [,…]}
FROM TableName [alias] [,…]
[WHERE condition]
[GROUP BY columnList] [HAVING condition]
[ORDER BY columnList [ASC | DESC]] (ASC is the default)
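Here is a sketch of one query that exercises most of the clauses in the template, run in SQLite through Python's sqlite3 module. The Staff table and its rows are invented for the example. WHERE filters rows before grouping, HAVING filters the groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, branchNo TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [
        ("S1", "B1", 30000),
        ("S2", "B1", 12000),
        ("S3", "B2", 18000),
        ("S4", "B3", 24000),
        ("S5", "B3", 9000),
        ("S6", "B3", 15000),
    ],
)

summary = conn.execute(
    "SELECT branchNo, COUNT(staffNo) AS staffCount, SUM(salary) AS totalSalary "
    "FROM Staff "
    "WHERE salary > 10000 "          # drops S5 before grouping
    "GROUP BY branchNo "
    "HAVING COUNT(staffNo) > 1 "     # drops branch B2, which has one row left
    "ORDER BY branchNo DESC"
).fetchall()

print(summary)
```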
 
How to do the subtraction R.A - S.A
SELECT R.A FROM R LEFT JOIN S ON R.A = S.A WHERE S.A IS NULL
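The left-join trick can be checked against SQL's own set-difference operator, EXCEPT. This sketch uses sqlite3 with two made-up single-column tables. The two queries agree here because the values in R are unique; EXCEPT removes duplicates, while the join version would keep them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER)")
conn.execute("CREATE TABLE S (A INTEGER)")
conn.executemany("INSERT INTO R VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO S VALUES (?)", [(2,), (4,)])

# Rows of R with no partner in S: the unmatched rows carry NULL for S.A.
via_left_join = conn.execute(
    "SELECT R.A FROM R LEFT JOIN S ON R.A = S.A WHERE S.A IS NULL ORDER BY R.A"
).fetchall()

# The same difference written with EXCEPT.
via_except = conn.execute(
    "SELECT A FROM R EXCEPT SELECT A FROM S ORDER BY A"
).fetchall()

print(via_left_join, via_except)
```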
SELECT COUNT(comment) FROM Viewing
Returns # of non-null values in the comment column
SELECT COUNT(*) FROM Viewing
Returns # of rows in the table, regardless of nulls
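The difference between the two counts shows up as soon as a column contains a NULL. A minimal sqlite3 sketch, where the Viewing rows and the clientNo column are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Viewing (clientNo TEXT, comment TEXT)")
conn.executemany(
    "INSERT INTO Viewing VALUES (?, ?)",
    [("C1", "too small"), ("C2", None), ("C3", "nice")],  # None is stored as NULL
)

non_null_comments = conn.execute("SELECT COUNT(comment) FROM Viewing").fetchone()[0]
all_rows = conn.execute("SELECT COUNT(*) FROM Viewing").fetchone()[0]

print(non_null_comments, all_rows)
```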
 
