
Information Systems Basics: H. Turgut Uyar Date: 2022-09-19 1.0


Information Systems Basics

H. Turgut Uyar

Date: 2022-09-19

Version: 1.0

File System

data and programs are kept on secondary storage

conceptual unit: file

a folder is used to group files

also called a directory

a folder can contain other folders

file system hierarchy


top level folder: root
Unix File System

/
├── etc
│   └── passwd
├── usr
│   ├── bin
│   │   └── firefox
│   ├── lib
│   └── share
├── home
│   └── turing
│       ├── Documents
│       │   └── cv.pdf
│       └── Music
└── tmp

Paths

how can we refer to a file?

path: a sequence of folders, and then the file

absolute path: start from the root

relative path: start from the "current" folder

current folder: one dot

parent folder (immediately above the current): two dots
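
The difference between absolute and relative paths can be sketched with Python's `pathlib` and `posixpath` modules. This is a minimal illustration, not part of the original slides: the current folder `/home/turing` is an assumption, chosen to match the folder names used in the Unix file system example.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical current folder (an assumption for this sketch).
current = PurePosixPath("/home/turing")

# Absolute path: starts from the root, independent of the current folder.
absolute = PurePosixPath("/home/turing/Documents/cv.pdf")

# Relative path: interpreted starting from the current folder.
relative = PurePosixPath("Documents/cv.pdf")
resolved = current / relative

# One dot is the current folder, two dots is the parent folder.
via_dot = posixpath.normpath(str(current / "./Documents/cv.pdf"))
via_parent = posixpath.normpath(str(current / "Music/../Documents/cv.pdf"))

print(absolute.is_absolute(), relative.is_absolute())  # True False
print(resolved)    # /home/turing/Documents/cv.pdf
print(via_dot)     # /home/turing/Documents/cv.pdf
print(via_parent)  # /home/turing/Documents/cv.pdf
```

All three spellings end up referring to the same file; only the starting point differs.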


File Manager

program for operating on files and folders

change the current folder


copy, rename, delete, …
File Types

text: human-readable, easier to work with

binary: only machine-readable, more efficient


File Name Extensions

file names have extension parts that indicate the type

starting from the last dot

for example: .pdf

not reliable: these can easily be changed
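
A small sketch of how the extension is taken from the last dot, and why it says nothing certain about the contents. The helper name `extension` and the file names other than `cv.pdf` are made up for illustration.

```python
from pathlib import PurePosixPath


def extension(name: str) -> str:
    """Return the part of the file name starting from the last dot."""
    return PurePosixPath(name).suffix


print(extension("cv.pdf"))         # .pdf
print(extension("backup.tar.gz"))  # .gz (only the part after the last dot)

# Renaming changes the extension, but the contents stay exactly the same.
renamed = PurePosixPath("cv.pdf").with_suffix(".txt").name
print(renamed)                     # cv.txt
```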


MIME

standard categorization of file types

https://www.iana.org/assignments/media-types/media-types.xhtml

format: type/subtype

types: image, audio, video, text, …


MIME Types

image/jpeg, image/png

audio/mpeg

video/mp4, video/x-matroska

application/pdf, application/zip

text/html, text/plain

not reliable: only a declaration
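
As a hedged illustration, Python's standard `mimetypes` module maps file names to MIME types. Note that it only looks at the file name extension, so the result is a declaration and not a check of the contents. The helper `declared_type` and the example file names are assumptions made for this sketch.

```python
import mimetypes


def declared_type(name: str) -> str:
    """Guess the MIME type from the file name alone (contents are not read)."""
    guessed, _encoding = mimetypes.guess_type(name)
    return guessed or "unknown"


for name in ["cv.pdf", "photo.png", "index.html"]:
    mime = declared_type(name)
    # A MIME type has the form type/subtype.
    main_type, _, subtype = mime.partition("/")
    print(name, mime, main_type, subtype)
```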


Archiving and Compression

combine files and folders into one archive file

compress a file for smaller file size

extract archive file to get the original contents

tar (archiving)

gzip, bzip2 (compression)

zip (both)
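
The steps above (archive, compress, extract) can be sketched with Python's standard `tarfile` and `gzip` modules. The file names and contents are invented for the example, and everything is kept in memory so that no real files are touched.

```python
import gzip
import io
import tarfile

# Hypothetical files: name -> contents.
files = {
    "Documents/cv.txt": b"Alan Turing\n" * 100,
    "Music/notes.txt": b"la la la\n" * 100,
}

# Archiving: combine several files into one tar archive.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    for name, content in files.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(content)
        archive.addfile(info, io.BytesIO(content))
archive_bytes = buffer.getvalue()

# Compression: make the single archive file smaller.
compressed = gzip.compress(archive_bytes)

# Extraction: decompress, then read the original contents back.
restored = {}
with tarfile.open(fileobj=io.BytesIO(gzip.decompress(compressed)), mode="r") as archive:
    for member in archive.getmembers():
        restored[member.name] = archive.extractfile(member).read()

print(len(compressed) < len(archive_bytes))  # True
print(restored == files)                     # True
```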
Internet Addresses

how can we refer to something on the Internet?

resource: object to process


web page

document

computer


Resource Addresses

URL: Uniform Resource Locator

scheme://host/path

https://en.wikipedia.org/wiki/Alan_Turing
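
A short sketch of splitting the example URL into the parts of the `scheme://host/path` pattern, using Python's standard `urllib.parse` module.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://en.wikipedia.org/wiki/Alan_Turing")

print(parts.scheme)    # https
print(parts.hostname)  # en.wikipedia.org
print(parts.path)      # /wiki/Alan_Turing
```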
Binary Numbers

computers represent information using binary numbers

only two values: 0, 1

bit: binary digit


Representing Numbers

digits correspond to powers of 2

2^4 2^3 2^2 2^1 2^0

16 8 4 2 1
Binary Value Examples

decimal binary

2 10

3 11

4 100

5 101

13 1101

22 10110
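
The conversion behind these examples can be sketched as a sum of powers of 2. The function name `binary_to_decimal` is chosen for this illustration; Python's built-in `int(bits, 2)` does the same job.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the power of 2 for every position that holds a 1."""
    total = 0
    # The rightmost digit corresponds to 2^0, the next one to 2^1, and so on.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


for bits in ["10", "11", "100", "101", "1101", "10110"]:
    print(bits, binary_to_decimal(bits))

# For example: 10110 = 16 + 4 + 2 = 22
```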
Byte

smallest addressable unit of storage: byte

8 bits

2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0

128 64 32 16 8 4 2 1

MSB LSB

MSB: most significant bit


LSB: least significant bit

if we consider only non-negative numbers: [0, 255]


Byte Value Examples

decimal binary

0 00000000

1 00000001

22 00010110

65 01000001

128 10000000

171 10101011

255 11111111
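
A minimal sketch that writes a value as the 8 bits of one byte, padding with zeros on the left. The helper name `byte_to_bits` is an assumption made for this example.

```python
def byte_to_bits(value: int) -> str:
    """Format a value in the range [0, 255] as exactly 8 binary digits."""
    if not 0 <= value <= 255:
        raise ValueError("value does not fit into one byte")
    return format(value, "08b")


for value in [0, 1, 22, 65, 128, 171, 255]:
    bits = byte_to_bits(value)
    # bits[0] is the most significant bit, bits[-1] the least significant bit.
    print(value, bits, "MSB:", bits[0], "LSB:", bits[-1])
```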
Binary Value Notation

is the value decimal or binary?

101 or 5?

notation: binary values start with 0b

0b101
Larger Numbers

larger numbers are represented using multiple bytes

561

1000110001

10 00110001

00000010 00110001
Byte Order

also called "endianness"

big endian: most significant byte first

little endian: least significant byte first

561

BE: 00000010 00110001

LE: 00110001 00000010
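
The two byte orders for 561 can be reproduced with Python's built-in `int.to_bytes`. The helper `show` is made up for this sketch and only formats the bytes as groups of 8 bits.

```python
def show(data: bytes) -> str:
    """Format each byte as 8 binary digits, in storage order."""
    return " ".join(format(byte, "08b") for byte in data)


value = 561
big = value.to_bytes(2, byteorder="big")
little = value.to_bytes(2, byteorder="little")

print("BE:", show(big))     # BE: 00000010 00110001
print("LE:", show(little))  # LE: 00110001 00000010

# Reading the bytes with the wrong byte order gives a different number.
print(int.from_bytes(little, byteorder="little"))  # 561
print(int.from_bytes(little, byteorder="big"))     # 12546
```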


Larger Units

1000 B   kilobyte  KB     1024 B    kibibyte  KiB

1000 KB  megabyte  MB     1024 KiB  mebibyte  MiB

1000 MB  gigabyte  GB     1024 MiB  gibibyte  GiB

1000 GB  terabyte  TB     1024 GiB  tebibyte  TiB

1000 TB  petabyte  PB     1024 TiB  pebibyte  PiB
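
The gap between the decimal and binary units grows with size. The following sketch converts a byte count into either kind of unit; the 500 GB disk is a made-up example and the names `UNITS` and `to_unit` are assumptions for this illustration.

```python
# Size of each unit in bytes: powers of 1000 versus powers of 1024.
UNITS = {
    "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4, "PB": 1000**5,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4, "PiB": 1024**5,
}


def to_unit(size_in_bytes: int, unit: str) -> float:
    """Express a size given in bytes in the requested unit."""
    return size_in_bytes / UNITS[unit]


disk = 500 * UNITS["GB"]  # hypothetical disk size
print(to_unit(disk, "GB"))              # 500.0
print(round(to_unit(disk, "GiB"), 2))   # 465.66
```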


Hexadecimal Numbers

binary numbers are difficult to read

hexadecimal: base 16

digits correspond to powers of 16

16^3 16^2 16^1 16^0

4096 256 16 1
Hexadecimal Digits

dec  bin   hex     dec  bin   hex

8    1000  8       12   1100  C

9    1001  9       13   1101  D

10   1010  A       14   1110  E

11   1011  B       15   1111  F
Hexadecimal Notation

1 hex digit: 4 bits

1 byte: 2 hex digits

notation: hex values start with 0x


Hex Value Examples

dec bin hex

16 00010000 10

22 00010110 16

30 00011110 1E

65 01000001 41

128 10000000 80

171 10101011 AB

255 11111111 FF
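
The table can be regenerated with a short sketch. Python uses the same prefixes as the slides for literals: `0b` for binary and `0x` for hexadecimal. The helper name `to_hex` is an assumption made for this example.

```python
def to_hex(value: int) -> str:
    """Format a byte value as two uppercase hexadecimal digits."""
    return format(value, "02X")


for value in [16, 22, 30, 65, 128, 171, 255]:
    print(value, format(value, "08b"), to_hex(value))

# The same number written in three notations.
print(171 == 0b10101011 == 0xAB)  # True

# Converting a string of hex digits back to a number.
print(int("1E", 16))              # 30
```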
Hex-Binary Conversion

pair hexadecimal digits with groups of 4 bits

starting from the least significant bit

hexadecimal to binary:

F    3    C    0

1111 0011 1100 0000

1111001111000000

binary to hexadecimal:

10011011100001

0010 0110 1110 0001

2    6    E    1
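
Both directions of the conversion can be sketched in a few lines. The function names are chosen for this illustration. Padding with zeros on the left has the same effect as grouping the bits starting from the least significant bit.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Replace every hexadecimal digit with its group of 4 bits."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)


def binary_to_hex(bits: str) -> str:
    """Group bits in fours from the right and convert each group to a hex digit."""
    # Pad on the left so that the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


print(hex_to_binary("F3C0"))            # 1111001111000000
print(binary_to_hex("10011011100001"))  # 26E1
```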
Character Sets

how can we represent letters, punctuation signs, …?

we assign a number to each character

a set of all such assignments: character set

also called an "encoding"


ASCII Character Set

7 bits per character

128 characters

English letters

digits

punctuation signs

special characters
ASCII Table

char  #        char  #

!     0x21     A     0x41

#     0x23     B     0x42

7     0x37     Z     0x5A

?     0x3F     a     0x61

@     0x40     z     0x7A

the character '7' (numeric value 55) is different from the number 7
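
A brief sketch of looking up ASCII numbers with Python's built-in `ord` and `chr`. The helper `ascii_code` is an assumption made for this example.

```python
def ascii_code(char: str) -> str:
    """Return the ASCII number of a character in 0x notation."""
    number = ord(char)
    if number > 127:
        raise ValueError("not an ASCII character")
    return "0x" + format(number, "02X")


for char in "!#7?@ABZaz":
    print(char, ascii_code(char))

print(ord("7"))       # 55: the number assigned to the character '7'
print(ord("7") == 7)  # False: the character is not the number
print(int("7") == 7)  # True: explicit conversion from text to number
print(chr(0x41))      # A
```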
Case Sensitivity

'A' and 'a' have different numbers

most programs consider these as different letters
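
A small sketch of what this means in practice: a plain comparison treats upper and lower case as different, so a program has to convert both sides explicitly if case should be ignored. The helper name and the file names are made up for illustration.

```python
def same_ignoring_case(first: str, second: str) -> bool:
    """Compare two strings after converting both to lower case."""
    return first.lower() == second.lower()


print("A" == "a")                            # False
print(ord("a") - ord("A"))                   # 32 (0x61 - 0x41)
print("CV.PDF" == "cv.pdf")                  # False
print(same_ignoring_case("CV.PDF", "cv.pdf"))  # True
```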


ISO8859 Sets

ASCII is only for English

8 bits per character: 256 characters

ISO8859-1: Western European

the first 128 are the same as ASCII

ISO8859-9: Turkish

Turkish instead of Icelandic


ISO8859-1 and ISO8859-9

#     ISO8859-1  ISO8859-9

0x3F  ?          ?

0x41  A          A

0xD6  Ö          Ö

0xF6  ö          ö

0xD0  Ð          Ğ

0xF0  ð          ğ
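
The same byte can stand for different characters depending on the character set used to interpret it. A minimal sketch using Python's built-in codecs for the two sets; the helper name `interpret` is an assumption.

```python
def interpret(number: int, charset: str) -> str:
    """Interpret a single byte value as a character in the given character set."""
    return bytes([number]).decode(charset)


for number in [0x3F, 0x41, 0xF0]:
    western = interpret(number, "iso8859-1")
    turkish = interpret(number, "iso8859-9")
    print(hex(number), western, turkish)

# 0x3F and 0x41 are in the ASCII range, so both sets agree.
# 0xF0 is the Icelandic letter ð in ISO8859-1 and the Turkish letter ğ in ISO8859-9.
```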
Unicode

all characters in all writing systems

UTF-32: 32 bits per character


UTF-16: 16/32 bits per character

UTF-8: 8/16/24/32 bits per character

UTF-8 is the most common encoding


UTF Examples

char UTF-32 UTF-16 UTF-8

A 0x00000041 0x0041 0x41

Ö 0x000000D6 0x00D6 0xC396

Ğ 0x0000011E 0x011E 0xC49E

∞ 0x0000221E 0x221E 0xE2889E

举 0x00004E3E 0x4E3E 0xE4B8BE

💪 0x0001F4AA 0xD83DDCAA 0xF09F92AA
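
The rows of the table can be reproduced with Python's built-in codecs. This is only a sketch: the big endian variants (`utf-32-be`, `utf-16-be`) are chosen here so that the bytes appear in the same order as in the table, and the helper name `encodings` is an assumption.

```python
def encodings(char: str) -> dict:
    """Encode one character in the three UTF forms and show the bytes as hex."""
    codecs = [("UTF-32", "utf-32-be"), ("UTF-16", "utf-16-be"), ("UTF-8", "utf-8")]
    return {name: char.encode(codec).hex().upper() for name, codec in codecs}


for char in ["A", "Ö", "Ğ", "∞", "举", "💪"]:
    print(char, encodings(char))
```

UTF-32 always takes 4 bytes, while the UTF-8 result grows from 1 byte for `A` to 4 bytes for the last character.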


Metadata

data: information in the file

metadata: data describing the content

data: photograph

metadata: shooting location, date, …

data: song

metadata: title, artist, lyrics, …


Text File Metadata

actual data: text in the file

metadata: author, copyright, …

character set
Providing Metadata

in the same file along with the data

music files can contain title, artist, …

externally

character set of a text file
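
Why the character set has to be provided from outside can be sketched as follows: the bytes of a text file do not say which character set they use, so the reader must be told. The helper `read_text` and the one-letter text are assumptions made for this example.

```python
def read_text(data: bytes, charset: str) -> str:
    """Turn raw bytes into text using an externally provided character set."""
    return data.decode(charset)


# The bytes as they would be stored in a text file written in UTF-8.
data = "ö".encode("utf-8")

print(read_text(data, "utf-8"))      # ö
print(read_text(data, "iso8859-1"))  # Ã¶ (wrong character set, garbled text)
```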
