Data Structure Terminology 2
Introduction
BASIC TERMINOLOGY
Definition:
A data structure is a specialized format for organizing and storing data. General data
structure types include the array, the file, the record, the table, the tree, and so on.
Any data structure is designed to organize data to suit a specific purpose so that it can
be accessed and worked with in appropriate ways.
1.1 Elementary Data Organization
1.1.1 Data and Data Item
Data are simply a collection of facts and figures; they are values or sets of values. A
data item refers to a single unit of values. Data items that are divided into sub-items are
called group items; those that are not are called elementary items. For example, a student's
name may be divided into three sub-items (first name, middle name and last name),
but the ID of a student would normally be treated as a single item.
In the above example, ID, Age, Gender, First, Middle, Last, Street and Area are elementary
data items, whereas Name and Address are group data items.
The term information is sometimes used for data with given attributes, or, in other words,
meaningful or processed data.
1.1.8 Field
A field is a single elementary unit of information representing an attribute of an
entity. A record is the collection of field values of a given entity, and a file is the
collection of records of the entities in a given entity set.
1.1.9 File
A file is a collection of records of the entities in a given entity set. For example, a file
may contain the records of the students of a particular class.
1.1.10 Key
A key is one or more field(s) in a record that take(s) unique values and can be
used to distinguish one record from the others.
Need of data structure
Digital Data
Music
Photos
Movies
Protein Shapes
DNA
gatcttttta gataagtgat ccggtgatcg cagaatcaag acacattcgt tttaaacgat
tattcacatg tattgcgtat gttgttatgt tcgcgcgatc ctctttatta gcagatcata
aagctgggat ggatatctac tttgagctaa gatctcttat taattaagga ctaaatggca
tggttttacc ttagagtaaa taggatcatg ggatcgtttg tgttatgcac ctgcttttaa
ttaatccaat atcctctgtg ttgtgagtga agtcactcgg gcatagttat ctttgaccca
Maps
0010101001010101010100100100101010000010010010100....
ASCII table: agreement for the meaning of bits
16-bit words:
Address  Contents
 0       0100101001111011
 2       0110111010100000
 4       0010000100100011
 6       1000010001010001
 8       0000000000000100
10       1001010110001010
12       1000000111000001
14       1111111111111111
We may agree to interpret bits as an address (pointer).
Physically, RAM is a randomly accessible array of bits.
Accessed
- Insert new data
- Remove old data
- Find data matching some condition
Processed
Algorithms: shortest path, minimum cut, FFT, ...
The focus of this class
insert()
delete()
find_min()
find()
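int main() {
    Dictionary D;            // Dictionary is the ADT described below; its definition is assumed
    D.insert(3, 10);         // store value 10 under key 3
    cout << D.find(3);       // look up key 3
}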
Dictionary ADT
Most basic and most useful ADT:
insert(key, value)
delete(key, value)
value = find(key)
D["AAPL"] = 130                          # associative array
my %D; $D{AAPL} = 130;                   # hash
D = {}; D["AAPL"] = 130                  # dictionary
map<string, int> D; D["AAPL"] = 130;     // map
C++ STL
Container   Access     Insert/erase   Iterator pattern
vector      const      O(n)           Random
list        O(n)       const          Bidirectional
?           O(n)       Front
?           O(n)       Front, Back
deque       const                     Random
map         O(log n)   O(log n)       Bidirectional
set         O(log n)   O(log n)       Bidirectional
string      const      O(n)           Bidirectional
array       const      O(n)           Random
valarray    const      O(n)           Random
bitset      const      O(n)           Random
push_back
find
insert
erase
size
begin, end (iterators)
operator[]
front
back
Iterators, Sequences
for_each
find_if
count
copy
reverse
sort
set_union
min
max
cities...
Show cities within a given window...
Political revolution?
- Insert, delete, rename cities
Linking:
Add pointers to each record so that we can find related records quickly.
E.g., the index at the back of a book provides links from words to the pages
on which they appear.
Partitioning:
Divide the records into 2 or more groups, each group sharing a particular
property.
Ordering
Unordered          Ordered
Pheasant, 10       Albatross, 0
Grouse, 89         Bluejay, 24
Quail, 55          Cardinal, 102
Pelican, 3         Chicken, 7
Partridge, 32      Duck, 18
Duck, 18           Eagle, 43
Woodpecker, 50     Egret, 88
Robin, 89          Finch, 38
Cardinal, 102      Goose, 67
Eagle, 43          Grouse, 89
Chicken, 7         Heron, 70
Pigeon, 201        Loon, 213
Swan, 57           Partridge, 32
Loon, 213          Pelican, 3
Turkey, 99         Pheasant, 10
Albatross, 0       Pigeon, 201
Ptarmigan, 22      Ptarmigan, 22
Finch, 38          Quail, 55
Bluejay, 24        Robin, 89
Heron, 70          Swan, 57
Egret, 88          Turkey, 99
Goose, 67          Woodpecker, 50
Search for Goose: Binary Search, O(log n)
Linking
[Figure: linked nodes holding 97, 43, 24, 78]
Partitioning
Ordering implicitly gives a partitioning based on the < relation.
Partitioning is usually combined with linking to point to the two halves.
Prototypical example is the Binary Search Tree:
[Figure: binary search tree with keys 31, 19, 16, 58, 35, 18, 98; query: Find 18]
[Figure: region split into NW, SW, NE, SE quadrants; values 98, 58, 19, 18, 35]
Remember: for templates to work, you should put all the code into the .h file.
Templates aren't likely to be required for the coding project, but they're a good
mechanism for creating reusable data structures.
So, much of programming (and thinking about programming) involves deciding
how to arrange information in memory. [Aka data structures.]