
ICT and Research Methodology


Course Code: SGS 801.

Course Title: Information and Communication Technology and Research Methods

Course Content

MODULE 1: Basic Concepts of Information and Communication Technology (ICT)
• UNIT 1: Computer Hardware and Software
• UNIT 2: Applications of ICT in Research
• UNIT 3: Internet Applications
MODULE 2: Research Methodology versus Research Methods
• UNIT 1: Sampling and Statistical Inference
• UNIT 2: Testing of Hypothesis
• UNIT 3: Regression Analysis
• UNIT 4: Factor Analysis
• UNIT 5: Discriminant Analysis
MODULE 3: Interpretation and Publication of Research Findings
MODULE 1: BASIC CONCEPTS OF ICT

Information and communication technology (ICT) is an amalgamation of two terms:
information technology and communication technology. The term is generally accepted to
mean all devices, networking components, applications and systems that when combined
allow people, systems, and organizations (i.e., businesses, non-profit agencies, governments
and criminal enterprises) to interact in the digital world. It is an extended term for
information technology (IT) which stresses the role of unified communications and the
integration of telecommunications (telephone lines and wireless signals), computers as well
as necessary enterprise software, middleware, storage, and audio-visual systems, which
enable users to access, store, transmit, and manipulate information.

The term ICT is also used to refer to the convergence of audio-visual and telephone networks
with computer networks through a single cabling or link system. There are large economic
incentives (huge cost savings due to elimination of the telephone network) to merge the
telephone network with the computer network system using a single unified system of
cabling, signal distribution and management.

However, ICT has no universal definition, as "the concepts, methods and applications
involved in ICT are constantly evolving on an almost daily basis." The broadness of ICT
covers any product that will store, retrieve, manipulate, transmit or receive information
electronically in a digital form, e.g. personal computers, digital television, email, robots. For
clarity, Zuppo provided an ICT hierarchy where all levels of the hierarchy "contain some
degree of commonality in that they are related to technologies that facilitate the transfer of
information and various types of electronically mediated communications". Skills Framework
for the Information Age is one of many models for describing and managing competencies
for ICT professionals for the 21st century.

ICT (information and communications technology, or technologies) is an umbrella term that
includes any communication device or application, encompassing: radio, television, cellular
phones, computer and network hardware and software, satellite systems and so on, as well as
the various services and applications associated with them, such as videoconferencing and
distance learning. ICTs are often spoken of in a particular context, such as ICTs in education,
health care, or libraries.

The phrase Information and Communication Technology has been used by academic
researchers since the 1980s, and the term ICT became popular after it was used in a report to
the UK government by Dennis Stevenson in 1997 and in the revised National Curriculum for
England, Wales and Northern Ireland in 2000. But in 2012, the Royal Society recommended
that the term ICT should no longer be used in British schools "as it has attracted too many
negative connotations", and with effect from 2014 the National Curriculum was changed to
use the word computing reflecting the addition of computer programming to the curriculum.
A leading group of universities consider ICT to be a soft subject and advise students against
studying A-level ICT, preferring instead A-level Computer Science. Variations of the phrase
have spread worldwide, with the United Nations creating a "United Nations Information and
Communication Technologies Task Force" and an internal "Office of Information and
Communications Technology".

IT (Information Technology) encompasses all of the technology that we use to collect,
process, protect and store information. It refers to hardware, software (computer programs),
and computer networks.

This concept involves the transfer and use of all kinds of information. ICT is the foundation
of the economy and a driving force of social change in the 21st century. Distance is no longer an
issue when it comes to accessing information; for example, working-from-home, distance
learning, e-banking, and e-government are now possible from any place with an Internet
connection and a computing device.
ICT LITERACY

Since we live in an information society, everyone is expected to be ICT literate. ICT
literacy entails:

• Awareness: As you study computers, you will become aware of their importance,
versatility, pervasiveness, and their potential for good and ill in our society.

• Knowledge: You will learn what computers are and how they work. This requires
learning some technical jargon that will help you deal with the computer and with
people who work with computers.

• Interaction: This implies learning to use a computer to perform some basic tasks or
applications.

IMPACT OF ICT ON SOCIETY


There are both positive and negative impacts of ICT on modern society. Some of these
impacts are discussed below:

Positive Impacts
i. Faster Communication Speed: In the past, it took a long time for any news or
messages to be sent. Now, with the Internet, news or messages can be sent via e-mail
to friends, business partners or anyone else efficiently. With the bandwidth,
broadband and connection speeds available on the Internet, any information can
travel fast and almost instantly. It saves time and is inexpensive.
ii. Lower Communication Cost: Using the Internet is more cost-effective than other
modes of communication such as telephone, mailing or courier services. It allows
people to access large amounts of data at a very low cost. Many basic services
on the Internet are free of charge, and the cost of connecting to the Internet is
relatively low.
iii. Reliable Mode of Communication: Computers are reliable. With the Internet,
information can be accessed and retrieved from anywhere at any time, which makes
it a reliable mode of communication. However, the input to the computer is
supplied by humans: if the data passed to the computer is faulty, the result will
be faulty as well. This is captured by the term GIGO, short for Garbage In,
Garbage Out, which means the quality of the output depends on the quality of the
input; bad input normally produces bad output.
iv. Effective Sharing of Information: With the advancement of ICT, information can
be shared by people all around the world. People can share and exchange
opinions, news and information through discussion groups, mailing lists and
forums on the Internet. This enables knowledge sharing, which contributes to
the development of a knowledge-based society.
v. Paperless Environment: ICT has popularized the notion of a paperless
environment, meaning that information can be stored and retrieved through
digital media instead of paper. Online communication via email, online chat
and instant messaging also helps in creating a paperless environment.
vi. Borderless Communication: The Internet offers fast information retrieval,
interactivity, accessibility and versatility. It has become a borderless source
of services and information; through the Internet, communication knows no
borders.
vii. Create Employment: Although many employment areas have suffered job losses,
other areas have grown and jobs have been created. Some examples of areas
where jobs have been created:
• IT Technicians: All of the computers in a business need to be maintained:
hardware fixed, software installed, etc. IT technicians do this work.
• Computer Programmers: All of the software that is now used by
businesses has to be created by computer programmers. Hundreds of
thousands of people are now employed in the software industry.
• Web Designers: Much of modern business is conducted online, and
company websites are very important. Company websites need to be
designed and built, which is the role of web designers.
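The GIGO principle mentioned under "Reliable Mode of Communication" above can be
illustrated with a short sketch (the sensor readings are hypothetical, chosen only for
illustration):

```python
def average(readings):
    """Compute the mean of a list of numeric readings."""
    return sum(readings) / len(readings)

good_input = [20.1, 19.8, 20.3]     # accurate temperature readings
bad_input = [20.1, 19.8, 2030.0]    # one mistyped reading: garbage in

print(average(good_input))   # about 20.07 -- a sensible output
print(average(bad_input))    # about 690 -- garbage out: the program ran
                             # correctly, but faulty input gave a faulty result
```

The computer performs both calculations flawlessly; only the quality of the input
distinguishes the useful result from the useless one.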

Negative Effects
i. Individualism and introversion: Nowadays, people tend to choose online
communication rather than real-time, face-to-face conversations. As a result,
people tend to become more individualistic and introverted.
ii. Moral decadence and threats to society: Some ICT users use ICT tools for
fraud, identity theft, pornography, hacking, etc., which can lead to moral
decadence and generate threats to society.
iii. Health Problems: A computer may harm users if they use it for long hours
frequently. Computer users are also exposed to bad posture, eyestrain, and
physical and mental stress. To mitigate these health problems, an ergonomic
environment can be introduced; for example, an ergonomic chair can reduce
back strain and a screen filter can minimize eye strain.
iv. Unemployment Situation: Some jobs have been lost as a result of computers being
used to do the same work that people used to do, for examples:
• Manufacturing: Many factories now have fully automated production
lines. Instead of using people to build things, computer-controlled robots
are used. Robots can run day and night, never needing a break, and don't
need to be paid! (Although robots cost a lot to purchase, in the long term
the factory saves money.)
• Secretarial Work: Offices used to employ many secretaries to produce
the documents required for the business to run. Now that people have
personal computers, they tend to type and print their own documents.
• Accounting Clerks: Companies once had large departments full of people
whose job it was to do calculations (e.g. profit, loss, billing, etc.). A
personal computer running a spreadsheet can now do the same work.
• Newspaper Printing: It used to take a team of highly skilled printers to
typeset (lay out) a newspaper page and then print thousands of
newspapers. The same task can now be performed far more quickly using
computers with DTP software and computer-controlled printing presses.
UNIT 1: COMPUTER HARDWARE AND SOFTWARE

What is a Computer?

The world has changed dramatically since the introduction of the first modern multipurpose
computer over 50 years ago. The ENIAC (Electronic Numerical Integrator and Computer),
designed by Drs. Mauchly and Eckert, two American engineers, was set up at the University
of Pennsylvania in 1946. This 30-ton machine occupied a thirty-by-thirty-foot room,
contained 18,000 vacuum tubes linked by 500 miles of wiring, and could perform 100,000 operations
per second. It consumed so much electricity that it dimmed the lights in the section of
Philadelphia where it was housed. Thanks to the development of the integrated chip, the
computer has evolved into a far smaller, more powerful, and less expensive machine. Today’s
microcomputer is 3,000 times lighter than the ENIAC, performs 4,000 times faster, and costs
several million dollars less. Other innovations have made the computer easy enough for a
child to use and versatile enough for applications ranging from astrophysics to arcade-style
games. As a consequence of their decreasing size and cost, computers can be found today in
virtually every corner, from research facilities and corporate headquarters, to schools and
homes.
When we think of a computer, we generally picture computer hardware: the monitor, the
keyboard, and the electronic circuitry contained within the rectangular case. There is more to
a computer than this, however. The missing element is software–the instructions that tell the
computer how to operate the hardware. All computers must have these two components to
function. However, it is software that gives the computer one of its most distinguishing
characteristics—the ability to program a single machine to perform many different functions.
A computer is a (nowadays electronic) device that processes information. To qualify as a
computer, a device must:
• Accept input
• Remember and store information
• Process inputs
• Produce outputs
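These four defining capabilities can be shown in a minimal sketch (the temperature
values are assumed purely for illustration):

```python
# A minimal illustration of the four defining capabilities of a computer:
# accept input, store it, process it, and produce output.

readings = []                    # storage: remember information

def accept_input(value):
    readings.append(value)       # input: accept and store a value

def process():
    # processing: compute the range of the stored readings
    return max(readings) - min(readings)

for t in [18.5, 21.0, 19.2]:     # hypothetical temperature readings
    accept_input(t)

print("Temperature range:", process())   # output: produce a result
```

However trivial, the sketch exercises all four capabilities: values are accepted,
remembered in a list, processed, and finally output.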

Application of Computer
Computers have moved into many facets of our lives. There is virtually no area of human
endeavour that computer usage has not penetrated. Though we cannot exhaustively list all the
areas of application of computers, the following are some key areas:

i. Science: One of the most important advantages of computers is in the field of science, for
research and development. Computers have played a major role in most of what we know
about ourselves and the universe. Satellites, telescopes and almost all research
tools make use of computers in one way or another. The huge calculations required for
space science, secure communication between scientists, and storage of all the gathered
information are some of the uses of computers in science and technology.

ii. Medical: An important use of computers in the medical field is for research and
development. The high-end machines used for diagnosis and treatment of many diseases are
essentially computers. For example, Magnetic Resonance Imaging (MRI), CT scan and
ultrasound devices are among the uses of computers in hospitals. Even many surgical
procedures, such as laparoscopic surgeries, need the help of computers. Web conferencing
helps doctors treat people remotely.

iii. Education: Computer uses in the field of education are countless. The Internet is a huge
source of information. There are online universities that deliver online degrees, and distance
learning is spreading far and wide. Many schools and colleges have started making use of
audio-visual ways of imparting knowledge, and a host of computer-based tools help
students in many ways.

iv. Banking: The banking sector has improved on fronts such as security and ease of use
with the help of computers. Most banking operations can be done online, known as
Internet banking, and you don't have to walk up to the bank for virtually anything. You can
withdraw money from ATMs and deposit money in any branch, thanks to the networking
effected by the use of computers. The complete banking experience has also become safer.
v. Crime Investigation: High-end computer devices have helped make justice more
effective. CCTV cameras and other computer-operated security systems have reduced the
amount of crime, and if a crime still happens there are many ways to track down the
criminal in no time. Forensic science employs computers for many of its investigative
operations.

vi. Entertainment: The field of entertainment has been revolutionized by computers.
Animation, graphic image manipulation, etc. have made the entertainment experience a
hundred times better. Computer gaming is achieving new landmarks in terms of technology.
Movie making, editing and music composition all need computers. This is only the tip of
the iceberg, and the uses of computers in society are many more. But the development of
computer technology has also given rise to many vices, such as identity theft.

vii. Government: The Government can use computers for the processing of immigration, tax
collection/administration, keeping track of criminals, computing budgets and statutory
allocations, maintaining Civil Service records, and computing wages, salaries, gratuities
and pensions.

viii. Communication: Any computer has the potential to link up with other computers
through communication systems such as telephone lines or satellites. This link-up
facilitates the exchange of memos, reports, letters and data/information, and even allows
meetings among people in geographically dispersed locations.

ix. Robotics: Robots are information machines with the manual dexterity to perform tasks
too unpleasant, too dangerous, or too critical to assign to human beings. For example, robots
are used in defense to perform underwater military missions, and for welding or
paint-spraying in factories and car assembly.

x. Energy: Energy companies use computers and geological data to locate oil, coal, natural
gas and other mineral resources. Meter-readers use hand-held computers to record how much
energy is used in a month in homes and businesses. Computers can analyze the fuel
consumption in our cars.
HARDWARE BASICS
The concept of hardware includes computer components, the physical and tangible parts of
the computer, i.e., electrical, electronic and mechanical parts which comprise a computer.

Computer working principle:

Input devices → System unit → Output devices

Data are entered into a computer via input devices, then processed and stored in the
system unit, and finally displayed by the output devices.

PERSONAL COMPUTER
Personal computer (PC), as the name suggests, is intended for personal use, as opposed to
the server, which is used by a larger number of people simultaneously, from different
locations, often via terminals. If you do not intend to move your computer frequently from
one place to another, and at the same time you want the best price/performance ratio, then
you should use a desktop computer. In comparison to laptops or tablet computers, it is much
larger in size, inconvenient to carry or move, and consumes more electricity, but it has a
much better price/performance ratio. Desktop computers are also much easier to upgrade.

LAPTOP OR TABLET PC
Laptop or tablet PC is used by individuals who have the need to travel with a computer or
simply use them for aesthetic reasons when computing power is not an issue. Laptop
computers, as opposed to tablet PCs, more closely resemble a personal computer when it
comes to data input: on a laptop, data entry is done via keyboard and mouse, while on a
tablet PC data entry is done via touch screen.

Unlike desktop computers, notebooks and tablet PCs are optimized for portability and low
power requirements at the expense of performance, and can be used without connection to
the power grid for a limited period of time, i.e. until the batteries are depleted. To prepare
a laptop or a tablet computer for use without a power connection, it is necessary to
recharge the batteries.
PORTABLE DIGITAL DEVICES
PDA (Personal Digital Assistant, or palmtop) is a convenient, small-sized computer. It
easily connects to mobile phones and can prove a good solution for less demanding users.
As the name suggests, it is a device that fits in the user's palm. Its name also tells us that
this computer is more of an assistant than a workstation, whose name in turn suggests
superiority in capabilities and computing power, especially in comparison with a PDA.

Mobile phone is a portable electronic device used for distant communication. In recent years,
the mobile phone has evolved from a simple communication device into a multi-functional
device. Additional functions, such as short text messaging (SMS), electronic mail, Internet
access, contact registration, calculator, clock, alarm, photograph recording and display,
recording and playback of video clips, sending/receiving multimedia messages (MMS), and
audio recording and playback, have turned the mobile phone into an extremely useful device,
whose absence would make active involvement and participation in modern society nearly
impossible.

Smartphone is a device that merges the functionality of phones, PDAs, cameras, camcorders
and computers. To function, smartphones use operating systems, which are the basis for
application development. Some smartphones can be connected to an external screen and
keyboard, creating a working environment similar to that of a laptop or a desktop
computer. Some operating systems for smartphones are: Google Android, Symbian,
BlackBerry OS, Palm OS, and Windows Phone.

MAIN COMPUTER PARTS


SYSTEM UNIT
The system unit (case) contains a computer's vital parts. There are two basic types of cases:

• Desktop cases are placed on a desk in a horizontal orientation.
• Towers come in three sizes (mini-tower, mid-tower and full-tower) and are vertically
oriented.
Motherboard (MBO) is the computer's basic circuit board, to which all computer components
are connected, directly or indirectly. Devices are connected to the motherboard through a
system bus, which connects all devices and ensures data flow and communication between
different devices using predefined protocols.
A protocol describes the manner in which communication between devices takes place: it
enables them to address each other and defines how they should find each other on either the
system bus or a network. Buses can, according to purpose, be divided into:

• Serial: USB, FireWire, etc.
• Parallel: AGP, PCI, etc.
• Mixed: HyperTransport, InfiniBand, PCI, etc.

Central Processing Unit (CPU, or processor) is the central part of a computer (often referred
to as the computer's "brain"). It manages all other computer parts, monitors their
mutual communication and performs arithmetic-logical operations. Processor speed is
measured in hertz (in practice, megahertz or gigahertz). The best-known manufacturers of
personal computer processors are Intel and AMD.

Graphics card is responsible for image processing and displaying it on a monitor. It has its
own graphics processor and memory. Image quality depends on the strength of these
components.

Modem enables computers to communicate via telephone lines; modems are commonly used
to connect computers to the Internet.

Connectors or ports are slots visible on the back and front sides of a computer.

COMMON INPUT / OUTPUT PORTS

Universal Serial Bus (USB) is used to connect various devices (mouse, keyboard, USB
memory).

Serial port is used, for example, for connecting a mouse (labeled COM1 or COM2).

Parallel port is used for connecting a local printer (LPT1 or LPT2).

Network port is used for connecting computers to a network.

Firewire - used for connecting computers and audio-video devices (digital cameras, etc.).
MEMORY AND STORAGE DEVICES
ROM (Read Only Memory) is a type of permanent, internal memory that is used solely for
reading. A good example is the BIOS (Basic Input/Output System), a program located in a
separate ROM chip on the motherboard which, as the name suggests, defines the basic
input/output system.

RAM (Random Access Memory) is the working memory in which data and programs
currently being processed are stored while the computer runs. It allows reading and writing
of data, and is cleared when the computer shuts down.

Cache is a small-capacity memory which allows quick access to data. By storing data from
working memory in the cache, the speed of communication between the processor and RAM
is increased. Microprocessors use up to three levels of fast cache (L1, L2 and L3) to store
frequently used data.
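The caching principle described here can be sketched with a dictionary standing in for the
small, fast store (the names, sizes and "slow store" contents are illustrative assumptions,
not a model of real CPU hardware):

```python
# Sketch of the caching principle: keep recently used values in a small,
# fast store so that repeated requests avoid the slow path.

slow_store = {n: n * n for n in range(1000)}   # stands in for working memory
cache = {}                                     # stands in for the CPU cache

def read(n):
    if n in cache:              # cache hit: answer from the fast store
        return cache[n]
    value = slow_store[n]       # cache miss: go to the slower store
    cache[n] = value            # keep the value for next time
    return value

read(5)          # first access: miss, fetched from the slow store
print(read(5))   # second access: hit, served from the cache
```

Real caches are, of course, managed in hardware and have limited size, but the
hit-or-miss logic above is the core idea.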

Hard Disk Drive (HDD) is a place for permanent data storage (its contents are not cleared
when the computer shuts down). Its features are large capacity and performance that is
faster than optical devices but slower than RAM. We can distinguish between internal and
external hard drives.

Floppy Disk Drive is used for storing and reading data on floppy disks. Disk capacity is
1.44 MB. Before memory sticks and the wider usage of CD recorders, floppy disks were a
common data carrier. Modern memory sticks have capacities measured in GB while floppy
disks hold only 1.44 MB, which is why floppy disks have become obsolete.

• CD-ROM drive is used for reading CD media.
• DVD drive is used for reading DVD discs. DVD disc capacity ranges from 4.7 GB
to 18 GB.
• Soundcard is a device used for sound creation and reproduction through computer
speakers.

COMPUTER PERFORMANCE
Factors affecting computer performance:

• Processor clock speed, amount of cache and number of cores
• The amount of installed RAM
• Graphics card (its memory and processor)
• Bus clock speed
• Number of running applications
Applications use computing resources. The processor runs applications by executing the
code that defines them; therefore the processor bears the greatest workload when running
applications. For the processor to execute an application, the application code must be
loaded into system memory, so running applications take up a certain amount of working
memory. The more applications are running, the greater the load on the processor and RAM.
That is why a computer's performance depends on the processor (clock speed, number of
cores, cache memory), the amount of working memory, and the number of applications
running.

Processor speed is measured in hertz (Hz), and due to the high clock speeds of today's
processors, it is usually expressed in megahertz (MHz) or gigahertz (GHz). Besides
frequency, processor performance depends on the number of operations that the
arithmetic-logic unit (ALU) performs in one clock cycle.
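As a rough sketch of this relationship, peak throughput can be estimated by multiplying
clock speed by operations per cycle (the clock speed, operations per cycle and core count
below are assumed figures, not measurements of any real processor):

```python
# Rough estimate of peak processor throughput:
# operations per second = clock speed (Hz) x operations per cycle x cores
clock_hz = 3.0e9        # a hypothetical 3.0 GHz processor
ops_per_cycle = 4       # assumed ALU operations completed per clock cycle
cores = 2               # assumed number of processor cores

peak_ops = clock_hz * ops_per_cycle * cores
print(f"Peak throughput: {peak_ops / 1e9:.1f} billion operations/second")  # 24.0
```

Real sustained performance is lower, since memory access and application mix also
matter, as the preceding paragraphs note.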

Measurement units
Bit (binary digit) is the basic unit used to measure the amount of information. A byte, or
octet, contains eight bits.

1 KB (kilobyte) = 1024 B (approx. 1000 B)

1 MB (megabyte) = 1024 KB (approx. 1000 KB)

1 GB (gigabyte) = 1024 MB (approx. 1000 MB)

1 TB (terabyte) = 1024 GB (approx. 1000 GB)

1 PB (petabyte) = 1024 TB (approx. 1000 TB)
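These relationships can be written out in a short sketch using the binary (1024-based)
multipliers, applied here to two capacities mentioned elsewhere in this unit (the 1.44 MB
floppy disk and the 700 MB CD):

```python
# Binary storage-unit multipliers: each unit is 1024 times the previous one.
B = 1
KB = 1024 * B
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB
PB = 1024 * TB

# Example: how many 1.44 MB floppy disks fit on one 700 MB CD?
floppy = 1.44 * MB
cd = 700 * MB
print(int(cd // floppy))   # 486 floppy disks
```

Note that marketing materials often use decimal (1000-based) multipliers instead, which
is why a drive's advertised capacity can appear larger than what the operating system
reports.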

BASIC TYPES OF STORAGE DEVICES


CD (Compact Disc) is an optical disc used for data storage. The standard capacity of a CD is
700 MB. A CD-R can be written only once (and afterwards only read), while a CD-RW
supports reading and writing data multiple times.

DVD (Digital Versatile Disc) is an optical disc which is, due to the larger capacity (about 4.7
GB), mostly used for video storage.
Blu-ray disc (BD), the successor to DVD, is an optical disc storage medium. It comes in
different capacities, depending on how many layers it has and the capacity of each layer.
Currently, the capacity of one layer is between 27 GB and 33 GB, while the overall capacity
is the product of the number of layers and the capacity of each layer.
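The capacity rule above can be written out directly (the per-layer figure is taken from the
text; the layer counts are illustrative):

```python
# Overall Blu-ray capacity = number of layers x capacity per layer (GB).
def bd_capacity(layers, gb_per_layer):
    return layers * gb_per_layer

print(bd_capacity(1, 27))   # 27 GB for a single-layer disc
print(bd_capacity(2, 27))   # 54 GB for a dual-layer disc
```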

Memory card is a type of flash memory used to store data in digital cameras, cell phones,
MP3 players etc.

USB Stick is a data storage device. It features small dimensions, relatively high capacity,
reliability and speed. It belongs to the family of flash memory, which retains data even
when not under voltage, i.e. it does not need electric power to maintain data integrity.

There is a difference between an internal hard disk drive, which is embedded in the computer
case, and an external hard disk drive, which is connected to a computer by using an
appropriate cable or USB port, and is usually used to transfer data from one computer to
another or for backup.

INPUT AND OUTPUT DEVICES


Input devices:
Mouse is an input device that facilitates work with the graphical user interface (GUI). The
mouse transmits hand movements and the screen displays the cursor (mouse pointer)
movements. Mice are divided into mechanical and optical (with respect to movement
detection), and wired and wireless (with respect to connection).

Trackball, unlike a mouse, is not movable. Hand movements are transmitted to the screen by
rolling the ball located on the upper side of the device.

Keyboard is used for data entry and issuing commands. Keyboards can also be wired or wireless.

Scanner is used to load data (images, text, etc.) from printed material into a computer. The
result of scanning is an image, but with special programs, if we scan text, we can obtain
editable text as a result. Software used to recognize text from an image is called an optical
character recognition (OCR) tool.

Touchpad is used for transmission of hand movement, but unlike working with a mouse, the
user is the one who determines the position of the cursor by touching the touchpad.

Light pen enables handwriting on screen and can be used as a mouse. It requires an
appropriate monitor type.
Joystick: mainly used in computer games. Unlike a mouse, it has many buttons which allow
control over game objects.

Microphone is a device that converts sound into an electrical signal, which can be stored on
a computer. It is mainly used for recording sound, communication between players in online
games, in combination with a web camera in video conferencing, for converting voice into
text on a computer (speech-to-text processing, e.g. for dictating textual files or emails), etc.

Webcam is a camera that stores video signal in a format appropriate for video transfer over
the Internet in real time.

Digital camera, unlike an analogue camera, stores photographs in digital format. It can be
directly connected to a computer so that photographs can be downloaded. Photograph
quality is expressed in megapixels: more megapixels mean better photograph quality, but
also more memory occupied.
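The megapixel figure follows directly from the sensor resolution, as this short sketch shows
(the 4000x3000 resolution is an assumed example, not a reference to any particular camera):

```python
# Megapixels = horizontal pixels x vertical pixels / 1,000,000.
width, height = 4000, 3000          # a hypothetical 4000x3000 sensor
megapixels = width * height / 1_000_000
print(f"{megapixels:.0f} MP")       # 12 MP

# Uncompressed size at 3 bytes per pixel (24-bit colour) -- roughly 34 MB here,
# which is why more megapixels also mean more memory occupied:
size_mb = width * height * 3 / (1024 * 1024)
print(f"approx. {size_mb:.1f} MB per photo")
```

In practice, cameras compress photographs (e.g. as JPEG), so stored files are much
smaller than this uncompressed figure.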

Output devices:
Monitor displays images from the computer; it enables us to see our work and control the
computer. In other words, working on a computer without a monitor would be inconceivable.
Common types of monitors, with regard to manufacturing technology, are CRT and LCD.
CRT monitors, based on cathode-ray tube technology, have been present on the market for a
long time, but other technologies are pushing them out. LCD monitors use liquid crystal
technology. In comparison with CRT monitors, LCD monitors use less electrical energy, do
not emit radiation and have a higher price; however, due to their smaller dimensions, more
attractive design and good picture quality, they are pushing CRT monitors out of the market.
Monitor size is expressed as the screen diagonal, measured in inches (''). Picture quality is
expressed by the resolution, i.e. the number of horizontal and vertical dots (pixels), e.g.
1920x1080.

Projector is a device used to project a computer image or other images from independent
devices, such as DVD players, Blu-ray player, etc. onto canvas or a wall.

Printer is a device used for printing data from a computer onto paper. We distinguish
between a local printer (connected directly to the computer) and a network printer
(connected directly to the network using a network card). Printers also differ according to
print technology: dot matrix, laser, inkjet, thermal and plotter.
Dot matrix printers are the oldest; they have the lowest printing cost per page, but they are
slow, make a lot of noise while printing, and are mostly appropriate for printing text.

Laser printers are similar to photocopiers when it comes to technology. They offer
exceptional print quality and speed, and are quiet. The downsides of laser printers are their
high purchase price and the high price of toner.

Inkjet printers have high print quality (somewhat lower in comparison with laser printers),
they are quiet while printing, and have a low initial cost. Ink, especially colour ink, can
cost as much as the printer itself. The printing technology is based on dispersing ink from a
container onto paper.

Plotter is used for printing large drawings (up to A0 size). Plotters are extremely expensive
and used only for professional purposes, such as printing technical drawings (blueprints) in
design firms.

Thermal printer, as its name states, leaves a print on the paper by utilizing heat. Thermal
printers use heat-sensitive paper, feature small dimensions, are quiet while printing and are
relatively cheap. They are usually used for printing receipts, which is why they are called
POS (point of sale) printers. They are also used as calculator printers and, due to their small
dimensions, as portable printers.

Input and output devices


Storage devices, because they must both write and read data, are classified as input/output devices.

Touch screen (i.e. a monitor sensitive to touch) is an output device while displaying the computer image, and at the same time an input device while receiving manual commands.

SOFTWARE
Software is, unlike hardware, intangible part of the computer. It consists of a sequence of
commands, written according to strict rules. Programs are written by programmers, in various
programming languages. A computer system needs more than the hardware described above
in order to function. The hardware by itself, even when powered-up, is incapable of
producing useful output. It must be instructed how to direct its operations in order to
transform input into output of value to the user. This is the role of software; i.e., to provide
the detailed instructions that control the operation of a computer system. Just as hardware
comprises the tangible side of the computer, so software is the intangible side of the
computer. If the CPU is the physical brain of the computer, then software is its mind.

Software instructions are programmed in a computer language, translated into machine


language, and executed by the computer. Between the user and the hardware (specifically, the
memory), generally stand two layers of software: system software and application software.

Software types:
Operating system (system software) is a program which manages the computer hardware. The first computers did not have operating systems; programs were loaded directly into the computer (e.g. from punched cards). Today, computers have an operating system which is loaded into the computer's memory during startup. Computer functions are based on its operating system. Within the operating system, drivers (responsible for the functioning of hardware devices) and various utility programs (responsible for the functionality of the computer) are installed. The most famous operating systems are:

1. Linux (Debian, Ubuntu, Fedora, Knoppix,...) - open source software


2. Microsoft Windows (XP, Vista, 7,...) - proprietary software
3. Mac OS X (Cheetah, Panther, Snow Leopard,...) - proprietary software
Application Software (utility programs) comprises all programs that users use to perform different tasks or to solve problems. Users install the appropriate utility software according to their needs. The functions a computer can perform and the tasks it can carry out are defined by the installed utility software.

Utility software can often cost more than the computer hardware, unless the software is open source.

Common utility software includes:

Text processing software is used for creating and forming text documents and nowadays,
they can contain images, charts and tables. Examples of such programs are OpenOffice.org
Writer (open source software) and Microsoft Word (proprietary software).

Spreadsheet calculation software is used for performing various calculations and presenting the results in charts. Examples of such programs are OpenOffice.org Calc (open source software) and Microsoft Excel (proprietary software).
Software for presentations is used to create professional presentations that consist of slides
with graphical and textual elements. Such a presentation can afterwards be displayed as a
"slide show” by using a projector. Examples of such programs are OpenOffice.org Impress
(open source software) and Microsoft PowerPoint (proprietary software).

Software for creating and managing database helps to manage a collection of structured
data. Examples of such programs are OpenOffice.org Base (open source software) and
Microsoft Access (proprietary software).

Common utility software installed on a computer:

- office programs - OpenOffice.org, Microsoft Office


- antivirus programs – Avira, Sophos, Kaspersky, Antivir etc.
- Internet browser: Mozilla Firefox, Microsoft Internet Explorer, Opera, Safari etc.
- programs for image editing: Adobe Photoshop, Canvas, CorelDraw, Draw etc.

PROGRAMS TO FACILITATE EASIER COMPUTER ACCESSIBILITY


We can access accessibility options: Start menu → All Programs → Accessories → Ease of
Access

Magnifier is used to enhance a part of the screen.

On-Screen Keyboard – text is entered using a mouse to click on the on-screen keyboard.

Narrator is commonly used by users with visual impairment - it can read text displayed on
monitor, it tells current cursor position, and describes certain events, like warning and error
messages generated by OS.

Windows Speech Recognition enables speech recognition, i.e. recognizes spoken word,
transfers it to text and enters it into a document; therefore, it enables you to dictate a text to a
computer, to browse the web using your voice etc.

Advantages of Computer System

Below are some advantages of computer systems


i. Accuracy and Reliability: The results produced by a computer are extremely accurate and reliable. What are often called 'computer errors' are actually human mistakes; invalid data and errors are corrected easily.

ii. Speed: The speed of the computer makes it the ideal machine for processing large amounts of data, e.g. accounting, banking operations etc.

iii. Storage/Memory Capability: Computer systems can store tremendous amounts of data, which can then be retrieved quickly and efficiently. The volume of information we deal with today is far beyond what we can handle manually.

iv. Productivity: Computers are able to perform dangerous, boring, routine jobs, such as adding long lists of numbers, punching holes in metal or monitoring water levels. Most workers (e.g. in banks) will appreciate the increased productivity when computers are used to do their jobs.

v. Flexibility: Computers can be used for various purposes, e.g. multiprogramming, batch processing, real-time processing, data collection, bank transaction processing etc.

vi. Automatic operation: Computer performs data processing automatically under the
control of internally stored programs.

vii. Configuration and adaptability: Different or suitable peripherals may be used by


business organizations to suit their business processing requirements.

Disadvantages of Computer System

Some of the disadvantages of computers are discussed below


i. Cost of initial setup may be high.
ii. Cost of maintenance may be high.
iii. Inefficient feasibility study before implementation may hamper business operations.
iv. Lack of skilled personnel may hamper computer operations and results obtained.
v. Requires regular electrical power supply.
vi. Excessive exposure to computer may result in some health problem such as poor eye sight,
wrist pain, back ache, neck pain etc.
vii. Computer virus attack may infect and destroy Data/information, which will automatically
affect business operations.
viii. It may lead to unemployment, because one computer can do the job of about 10 persons.

SOCIAL IMPLICATION OF COMPUTER SYSTEM


The society in which we live has been so profoundly affected by computers that historians
refer to the present time as the information age. This is due to the ability to store and
manipulate large amounts of information (data) using computers. As an information society,
we must consider both the social and ethical implications of our use of computers. By ethical questions we mean asking what are the morally right and wrong ways to use computers, and this can be explained as follows:

Ergonomics: this is the science that studies safe work environments. Many health-related
issues, such as carpal tunnel syndrome and computer vision syndrome (CVS), are related to
prolonged computer use.

Environmental concern: Power and paper wastes are environmental concerns associated
with computer use. Suggestions for eliminating these concerns include recycling paper and
printer toner cartridges and turning off monitors and printers when not in use.

Employee monitoring: Employee monitoring is an issue associated with computers in the


workplace. It is legal for employers to install software programs that monitor employee
computer use. As well, e-mail messages can be read without employee notification. The
invasion of privacy is a serious problem associated with computers.

Information: Because computers can store vast amounts of data we must decide what
information is proper to store, what is improper, and who should have access to the
information. Every time you use a credit card, make a phone call, withdraw money, reserve a
flight, or register at school, a computer records the transaction. These records can be used to
learn a great deal about you—where you have been, when you were there, and how much
money was spent. Should this information be available to everyone? Computers are also used
to store information about your credit rating, which determines your ability to borrow money.
If you want to buy a car and finance it at a bank, the bank first checks your credit records on a
computer to determine if you have a good credit rating. If you purchase the car and then
apply for automobile insurance, another computer will check to determine if you have traffic
violations.
UNIT 2: APPLICATIONS OF ICT IN RESEARCH

In modern society ICT is ever-present, with over three billion people having access to the
Internet. With approximately 8 out of 10 Internet users owning a smartphone, information
and data are increasing by leaps and bounds. This rapid growth, especially in developing
countries, has led ICT to become a keystone of everyday life, in which life without some
facet of technology renders most work and routine tasks dysfunctional. The most recent
authoritative data, released in 2014, shows "that Internet use continues to grow steadily, at
6.6% globally in 2014 (3.3% in developed countries, 8.7% in the developing world); the
number of Internet users in developing countries has doubled in five years (2009-2014), with
two thirds of all people online now living in the developing world."

However, hurdles are still at large. "Of the 4.3 billion people not yet using the Internet, 90%
live in developing countries. In the world's 42 Least Connected Countries (LCCs), which are
home to 2.5 billion people, access to ICTs remains largely out of reach, particularly for these
countries' large rural populations." ICT has yet to penetrate the remote areas of some countries, with many developing countries lacking any type of Internet access. This also includes the availability of telephone lines, particularly of cellular coverage, and other
forms of electronic transmission of data. The latest "Measuring the Information Society
Report" cautiously stated that the increase in the aforementioned cellular data coverage is
ostensible, as "many users have multiple subscriptions, with global growth figures sometimes
translating into little real improvement in the level of connectivity of those at the very bottom
of the pyramid; an estimated 450 million people worldwide live in places which are still out
of reach of mobile cellular service."

Favorably, the gap between the access to the Internet and mobile coverage has decreased
substantially in the last fifteen years, in which "2015 is the deadline for achievements of the
UN Millennium Development Goals (MDGs), which global leaders agreed upon in the year
2000, and the new data show ICT progress and highlight remaining gaps." ICT continues to
take on new form, with nanotechnology set to usher in a new wave of ICT electronics and
gadgets. ICT's newest additions to the modern electronic world include smart watches, such as
the Apple Watch, smart wristbands such as the Nike+ FuelBand, and smart TVs such as
Google TV. With desktops soon becoming part of a bygone era, and laptops becoming the
preferred method of computing, ICT continues to insinuate and alter itself in the ever-
changing globe.
Information communication technologies play a role in facilitating accelerated pluralism in
new social movements today. The Internet according to Bruce Bimber is "accelerating the
process of issue group formation and action" and coined the term accelerated pluralism to
explain this new phenomenon. ICTs are tools for "enabling social movement leaders and
empowering dictators" in effect promoting societal change. ICTs can be used to garner
grassroots support for a cause due to the Internet allowing for political discourse and direct
interventions with state policy as well as change the way complaints from the populace are
handled by governments.

Strictly speaking, ICT Research is organized in three broad specializations:

1. mobile communication systems


2. system development and security
3. multimedia.

However, ICTs is known to permeate all areas of endeavour and used as a powerful driver of
innovation, growth and productivity globally. New knowledge and applications created in
continual ICT Research and Development (R&D) activities are critical factors in meeting all
the challenges and risks connected with e-Business implementation and information society
development.

As ICT Research moves from Basic Research until it finally enters market uptake, it passes
through series of stages. The stages of ICT Research include:

 Basic Research
 Technology R&D
 Demonstration
 Prototyping
 Large scale validation
 Pilots
 Market Uptake.

However, as a generic technology, ICT's influence in research can be categorized into three broad ICT-driven areas of endeavour, viz:

1) Excellent Science: Future and emerging technologies; research infrastructure

2) Industrial Leadership: Leadership in enabling & industrial technologies; innovation in SMEs

3) Societal Challenges: Health, demographic change & wellbeing; food security, sustainable agriculture & the bio-based economy; secure, clean & efficient energy; smart, green & integrated transport; climate action, resource efficiency & raw materials; inclusive, innovative & reflective societies; secure societies

THE ETHICAL RESPONSIBILITIES OF AN IT PROFESSIONAL

IT (information technology) professional has responsibilities that relate to system reliability.


System reliability involves installing and updating appropriate software, keeping hardware
working and up-to-date and maintaining databases and other forms of data. Professional
ethics helps a professional choose what to do when faced with a problem at work that raises a
moral issue. One can certainly study what professionals do when faced with such problems,
and confine the enquiry to the description. Our concern here, however, is to assist with
making choices –an approach called prescriptive professional ethics.

Governments, schools, and employers rely on IT professionals to maintain their computer


systems. In addition to ensuring system reliability, an IT professional must take responsibility for the ethical aspects of the career choice. The list below shows the most commonly reported unethical behaviours of IT professionals:

i. Plagiarism
ii. Failure to protect confidential data
iii. Failure to share credit on a report
iv. Fabrication of data
v. Criticizing the ability/integrity of a colleague for one's own gain
vi. Holding back or disguising data
vii. Design of sampling strategy to favour a specific outcome
viii. Destruction of data that contradicts desired outcome
ix. Deliberately not reporting an incident
UNIT 3: INTERNET APPLICATIONS

NETWORKS
A computer network comprises at least two computers, connected by wire or wirelessly, that can exchange data, i.e. communicate. There are many reasons for connecting computers into a network, and some of them are:

• exchange of data between users that have network access,


• access to shared devices, such as network printers, network disks, etc.,
• enables user communication and socializing, etc.
Internet is the most famous and most widespread network, with nearly 2 billion users, and the number of users is still growing. An Internet application is an interactive, compiled application that can be accessed through a corporate network or through the Internet. Internet applications can perform complex business processes on either the client or the server. The application uses the Internet protocol to receive requests from a client, typically a Web browser, process the associated code, and return data to the browser.

Internet Applications:

o The World-Wide Web (WWW)


o Electronic Mail (E-Mail)
o File Transfer Protocol (FTP)
o Search Engine
o Chatting
o Video Conferencing
o E-Commerce

TYPES OF NETWORKS
Types of networks according to their size:

• LAN (Local Area Network) - a network that covers a relatively small geographical
area - it connects computers within a firm or household by wire,
• WLAN(Wireless Local Area Network) - a network that covers a relatively small
geographical area - it connects computers within a firm or household wirelessly,
• WAN (Wide Area Network) - a network that covers a relatively large geographical
area - it connects a greater number of computers and local networks.

Terms: client / server


The client-server relationship is defined in the following manner: the client sends requests and the server responds to those requests. We can use the Internet as the best-known example. The user's computer, connected to the Internet, sends a request for a certain web page (by entering the page address into the Internet browser's address bar), and the server responds. The web page is loaded into the user's browser as a result of the server's response. From this example,
we can see that communication between client and server depends on connection speed
(bandwidth). Since bandwidth is limited, the amount of data that can flow through network is
limited too. Today, for instance, while purchasing access to mobile Internet, you will notice a
limited amount of data that can be transferred within a package, i.e. amount of transferred
data is what is charged.

The reason for that is the limited bandwidth of mobile networks; since companies offering mobile Internet access do not want their networks to be congested, they discourage heavy use by charging according to the amount of transferred data. That used to be the case with ADSL Internet access as well. Today, once Internet providers have developed the communication infrastructure, they no longer need to discourage users by charging based on the amount of transferred data; instead they offer so-called "flat rate" access, charging only based on the access speed. That is why, while listening to or reading news about communication technologies, you will often hear how important it is to develop communication infrastructure.

Types of networks according to their architecture:

• client-server - all clients are connected to the server,


• P2P (peer to peer) - all computers are clients and servers at the same time.
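The client-server architecture described above can be sketched with Python's standard socket module. This is a minimal, self-contained illustration: the loopback address, the message contents, and the one-request server are assumptions made for the example, not details from the text.

```python
# Minimal client-server sketch: the server waits for a request and
# responds to it; the client sends a request and reads the response.
import socket
import threading

# Create and bind the server socket first, so the client cannot
# attempt to connect before the server is listening.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: the OS picks a free port
server.listen(1)
port = server.getsockname()[1]

def serve_one_request():
    # Server side: accept one client, read its request, answer it.
    conn, _addr = server.accept()
    request = conn.recv(1024)
    conn.sendall(b"response to: " + request)
    conn.close()

t = threading.Thread(target=serve_one_request)
t.start()

# Client side: send a request and read the server's response.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"GET page")
reply = client.recv(1024)
client.close()
t.join()
server.close()

print(reply.decode())  # → response to: GET page
```

The same request/response pattern underlies loading a web page: the browser (client) sends a request to the web server, which processes it and returns the page.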

INTERNET, INTRANET, EXTRANET


Internet ("network of all networks") is a global system comprised of interconnected
computers and computer networks, which communicate by means of using TCP/IP protocols.
Although, in its beginnings, it emerged from the need for simple data exchange, today it
affects all domains of society. For example:
• Economy: Internet banking (paying bills, transferring funds, access to account, access
to credit debt, etc.), electronic trading (stocks, various goods, intellectual services,
etc), etc.
• Socializing: social networks, forums...
• Information: news portals, blogs etc.
• Healthcare: diagnosing disease, medical examinations (for people living on an island
or in other remote places, some examinations, that require a specialist, can be done
remotely), making appointments for medical examinations, the exchange of medical
data between hospitals and institutes, surgery and remote surgery monitoring
• Education: online universities with webinars (web + seminar), websites with tutorials,
expert advice, Ideas Worth Spreading @ www.TED.com, etc.
Internet really does have many applications and a huge social impact. Perhaps the most
important trait is information exchange, because information exchange among people enables
collaboration, collaboration of like-minded people leads to ideas and actions in real life, and
coordinated actions of people results in social change.

Intranet is a private network of an organization to which only authorized employees have access (via login and password).

Extranet is the part of an Intranet to which authorized external collaborators have access.

DATA FLOW/TRANSFER

Download is a term that implies taking a copy of digital data from a network computer onto a local computer, while upload means placing digital content on a network computer. For example, when you saved a copy of a book from a web site to your computer, you downloaded digital data, that is, the book. Likewise, when someone finished writing this book, he placed (uploaded) it on a network computer (his Internet server).

Bitrate represents the speed at which data is transferred through a modem (network). It is measured in bit/s (bits per second), the unit for the speed of digital data flow through a network: it tells us how many bits can be transferred through the network in one second.

1,000 bit/s = 1 kbit/s (one kilobit, or one thousand bits, per second)
1,000,000 bit/s = 1 Mbit/s (one megabit, or one million bits, per second)
1,000,000,000 bit/s = 1 Gbit/s (one gigabit, or one billion bits, per second)

The speed of data flow can also be expressed in bytes per second (B/s). Since one byte has eight bits, bit/s and B/s differ by a factor of eight.
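The bit/byte relation described above can be sketched in a few lines of Python; the 20 Mbit/s rate and 25 MB file size used here are illustrative values, not figures from the text.

```python
# Sketch: converting a connection's bitrate to bytes per second and
# estimating download time. 1 byte = 8 bits, and network rates use
# decimal prefixes (1 Mbit/s = 1,000,000 bit/s).

def mbit_to_bytes_per_s(mbit_per_s):
    """Convert a rate in Mbit/s to bytes per second."""
    return mbit_per_s * 1_000_000 / 8

def download_time_s(file_size_bytes, mbit_per_s):
    """Seconds needed to transfer file_size_bytes at the given rate."""
    return file_size_bytes / mbit_to_bytes_per_s(mbit_per_s)

rate = 20  # an illustrative 20 Mbit/s connection
print(mbit_to_bytes_per_s(rate))          # → 2500000.0 (bytes per second)
print(download_time_s(25_000_000, rate))  # → 10.0 (seconds for a 25 MB file)
```

This is why a "20 Mbit/s" connection downloads only about 2.5 megabytes per second: the advertised figure is in bits, while file sizes are stated in bytes.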
Ways to connect to the Internet:
• Dial-up Internet access method uses a modem (56k) and a telephone line.
• Broadband is characterized by a high-speed data transfer, permanent access to the
Internet, and thus the risk of unauthorized access to the network or your personal
computer. Connection methods:
• Mobile - connecting by using a mobile network (GPRS, EDGE, UMTS, HSPA)
• Satellite - commonly used in parts of the world where there is no proper infrastructure
and there is no other way of accessing the Internet
• Wireless (Wi-Fi) - data is transferred between computers by using radio frequencies (2.4 GHz) and the corresponding antennas
• Cable - connecting to the Internet through television cable network using a cable
modem
In the early days of broadband Internet access, due to underdeveloped communication infrastructure, Internet providers charged based on data traffic rather than on time spent on the Internet (unlike dial-up access). Today, in large cities, the telecommunications infrastructure is well developed, so Internet providers charge neither by time spent online nor by the amount of transferred data, but by access speed.

Review Question:
1. Describe briefly the term computer system
2. Write briefly on the following;
i. System unit
ii. Storage
iii. Hard disk drive
iv. Mouse
v. Keyboard
vi. Monitor
vii. Printer
viii. Modem
3. Explain some areas of application of computer in our society.
4. Identify some ethics expected of IT professionals
5. Discuss social implication of computer based on Ergonomics and Environmental
concern
MODULE 2: RESEARCH METHODS VS RESEARCH METHODOLOGY

RESEARCH METHODS

A Research Method is the specific technique, tool or procedure applied to achieve a given
research objective. Here the research design is laid out. It is what a researcher does in order to
collect his data and carry out his investigations. It depends on the question that the researcher
wishes to answer, and the philosophy that underpins his view of research. Research method is
a step in a Research process. It is also one of the four main features of research design. The
other three features of research design are ontology, epistemology and methodology.
Research method pertains to all those methods which a researcher employs to undertake the research process and solve the given problem. It comprises methods of performing the research, including survey, case study, interview, questionnaire, observation, etc. These are
the approaches, which help in collecting data and conducting research, in order to achieve
specific objectives such as theory testing or development. All the instruments and behaviour,
used at various levels of the research activity are included here. These research activities
include making observations, data collection, data processing, drawing inferences, decision
making, etc. Examples of research methods are surveys, interviews, experiments,
observation, case studies, questionnaires, statistical approaches, etc

Research methods are categorized into three groups:

1st Category: The methods relating to data collection are covered here. Such methods are
used when the existing data are not sufficient, to reach the solution.

2nd Category: Incorporates the processes of analysing data, i.e. to identify patterns and
establish a relationship between data and unknowns.

3rd Category: Comprises the methods which are used to check the accuracy of the results obtained.

A research method shows how the research study is designed. Its choice depends on:

Research Questions

Research Goals

Researcher Beliefs and Values

Researcher Skills
Time and Funds

The research method to be used in a study may be:

Qualitative - descriptive in nature, e.g. case study, participatory action research, data gathering etc.

Quantitative –involves experiments/measurements, observations or surveys

Mixed Methods -drawn from both qualitative and quantitative methods

RESEARCH METHODOLOGY

Research methodology is the specific procedures or techniques used to identify, select,


process, and analyse information about a topic. In a research paper, the methodology section
allows the reader to critically evaluate a study's overall validity and reliability. The
methodology section answers two main questions: How was the data collected or generated?
How was it analysed? The methodology may include publication research, interviews,
surveys and other research techniques, and could include both present and historical
information.

Research methodology is a way to systematically solve the research problem. It can be


defined as a science of studying how research is done scientifically. Research Methodology, as its name suggests, is the study of methods, so as to solve the research problem. It is the
science of learning the way research should be performed systematically. It refers to the
rigorous analysis of the methods applied in the stream of research, to ensure that the
conclusions drawn are valid, reliable and credible too.

The researcher is expected to know both the research methods/techniques and the
methodology. They not only need to know how to develop certain indices or tests, how to
calculate the mean, the mode, the median or the standard deviation or chi-square, how to
apply particular research techniques, but they also need to know which of these methods or
techniques, are relevant and which are not, and what would they mean and indicate and why.
Researchers also need to understand the assumptions underlying various techniques and they
need to know the criteria by which they can decide that certain techniques and procedures
will be applicable to certain problems and others will not.
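As a small illustration of the descriptive measures mentioned above, the following sketch computes them with Python's standard statistics module; the scores are made-up data for the example only.

```python
# Descriptive statistics a researcher is expected to know how to
# compute: mean, median, mode and standard deviation.
import statistics

scores = [2, 3, 3, 5, 7, 10]       # illustrative data
print(statistics.mean(scores))     # → 5
print(statistics.median(scores))   # → 4.0
print(statistics.mode(scores))     # → 3
print(statistics.stdev(scores))    # sample standard deviation (≈ 3.03)
```

Knowing *how* to compute these is only half the task; the methodology dictates *which* of them is appropriate for a given research problem.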
SUMMARY: To conduct a research, the researcher uses research methods, during the course
of conducting research. Many times, the research methods are confused with research
methodology, which implies the scientific analysis of the research methods, so as to find a
solution to the problem at hand. Research methods are the strategies, tools, and techniques
used by the researcher to collect the relevant evidence needed to create theories.
Consequently, these research methods need to be credible, valid, and reliable. This is
accomplished by writing a sound methodology, which consists of a systematic and theoretical
analysis of the above research methods. A methodology allows the researcher to evaluate and
validate the rigour of the study and methods used to obtain the new information. Research
methods constitute only one component of the multidimensional research methodology.
UNIT 1: STATISTICAL INFERENCE

Each (and every) random variable has a unique probability distribution. For the most part
statisticians deal with the theory of these distributions. Engineers, on the other hand, are
mostly interested in finding factual knowledge about certain random phenomena, by way of
probability distributions of the variables directly involved, or other related variables.
In basic statistics we learn that probability density functions can be defined by certain constants called distribution parameters. These parameters in turn can be used to characterize random variables through measures of location, shape, and variability of random phenomena. The most important parameters are the mean μ and the variance σ². The parameter μ is a measure of the center of the distribution (an analogy is the center of gravity of a mass), while σ² is a measure of its spread or range (an analogy being the moment of inertia of a mass). Hence, when we speak of the mean and the variance of a random variable, we refer to two statistical parameters (constants) that greatly characterize or influence the probabilistic behaviour of the random variable. The mean or expected value of a random variable x is defined as

μ = E(x) = Σ x·p(x) (discrete case), or μ = E(x) = ∫ x·ƒ(x) dx (continuous case),

where p(x) represents probabilities of a discrete random variable and ƒ(x) represents the
probability density function of a continuous random variable. The parameters of interest are
embedded in the form of the probability density functions. As an illustration, the variance of
a random variable is defined as

σ² = E[(x − μ)²] = Σ (x − μ)²·p(x) (discrete case), or σ² = ∫ (x − μ)²·ƒ(x) dx (continuous case).

In mathematical statistics we can show that many random variables that occur in nature follow the same general form of distribution, with differences only in the parameters μ and σ². Some of these recurring distributions have been given special names, such as:

Binomial, Beta, Uniform, Hypergeometric, Normal, Cauchy, Poisson, Chi-square, Rayleigh, Geometric, Student's-t, Maxwell, Negative binomial, F distribution, Weibull, Gamma, Exponential, Erlang.
Thus, it is not difficult to see why the field of ‘‘probability and statistics’’ is a discipline
within itself, nor is it difficult to see why almost every discipline in existence needs a
working knowledge of statistics. Random phenomena (variables) exist in all phases of
activity. When one is interested in certain random phenomena, the first requirement seems to
be that one develop some means of measuring it. Upon doing so, one often collects a number
of observations of the random phenomena. Statistics deals with developing tools and
techniques for choosing those observations (a sample) and manipulating them in such a way
that useful information is gained about the underlying random variable(s). This information is
generally derived from studying probability distributions of the random variables or functions
of the random variables. The average (or mean) and/or the variance (or spread) of the
probability distribution of the random variable obviously yield useful information.
While the parameters of a statistical distribution are constant, any computation based on the
numerical values of the random observations in a sample may yield different quantities from
sample to sample. These quantities are known as ‘‘statistics.’’ The two most widely used
statistics are the mean of a sample of n observations

̅ ∑

and the variance of the sample

̅

Note that these descriptors are not theoretical in nature but are calculated from a set of n data points. Mathematical developments have shown that x̄ is usually the best single (point) estimate of μ, and s² is usually the best single (point) estimate of σ². Normally, the true values of these parameters are unknown, and statistics is used to draw inferences about them. Since knowledge of the mean and variance is of utmost importance, statistics deals extensively with developing tools and techniques for studying their behaviour.
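These two statistics can be computed directly. A minimal stdlib sketch (function names and toy data are my own):

```python
def sample_mean(xs):
    """x-bar: the arithmetic mean of the observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """s^2: the sample variance, with the usual n - 1 divisor."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

data = [98, 104, 101, 95, 102]
m, s2 = sample_mean(data), sample_variance(data)
```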
UNIT 2: STATISTICAL HYPOTHESIS TESTING
Working with a representative (random) sample of n observations, statisticians have shown that the quantity √n (x̄ − μ)/σ, for sufficiently large values of n, is a random variable that follows (or has) a normal probability distribution with μ = 0 and σ = 1.

If x̄ is the mean of a random sample of size n taken from a population having a mean μ and a finite variance σ², then

Z = (x̄ − μ) / (σ/√n)

is a random variable whose probability distribution approaches that of the standard normal distribution (μ = 0, σ = 1) as n approaches infinity. This is undoubtedly the most amazing theorem in statistics, for it does not require that one know anything about the shape of the probability distribution of the individual observations. It only requires that the distribution of those random observations have a finite mean, μ, and variance, σ². The standard normal density function is defined as

f(z) = (1/√(2π)) e^(−z²/2)
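The central limit theorem can be checked empirically. The sketch below (illustrative only; seeded for reproducibility) draws repeated samples from a decidedly non-normal uniform population and standardizes each sample mean:

```python
import random
import statistics

random.seed(42)

# Population: Uniform(0, 1), with mu = 0.5 and sigma^2 = 1/12.
mu, sigma = 0.5, (1 / 12) ** 0.5
n = 50           # observations per sample
trials = 2000    # number of sample means

z_values = []
for _ in range(trials):
    xbar = statistics.fmean(random.random() for _ in range(n))
    z_values.append((xbar - mu) / (sigma / n ** 0.5))

# If the CLT holds, the z-values should look standard normal:
# mean near 0, standard deviation near 1.
z_mean = statistics.fmean(z_values)
z_sd = statistics.stdev(z_values)
```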
Testing a mean value (μ) with σ² known

Given a value of x̄ computed using a sample of size n from an infinite population with known mean (μ) and variance (σ²), the probability that the random variable

Z = (x̄ − μ) / (σ/√n)

falls between the points −z_{α/2} and +z_{α/2} is 1 − α. Note that α is a value between zero and one and represents the probability that this standardized variable will naturally fall outside the points −z_{α/2} and +z_{α/2}. This interpretation of the natural behaviour of the random variable x̄, along with the distribution of the (transformed) variable Z, allows one to structure a hypothesis test concerning the true mean, μ. Assume that the value of σ² is known, and that the following statement is to be tested:

H₀: μ = μ₀ versus H₁: μ ≠ μ₀

The value μ₀ is the numerical value of μ which is assumed known or is hypothesized. H₀ is called the null or primary hypothesis and H₁ is called the alternative or secondary hypothesis.
If a random sample of size n is extracted from the population under study, then

Z = (x̄ − μ₀) / (σ/√n)

can be calculated. Since x̄ is the best point estimator of μ, and μ is assumed to be equal to μ₀, one would expect this random variable to fall between the points −z_{0.025} and +z_{0.025} 95% of the time (that is, when α = 0.05). The values of −z_{0.025} and +z_{0.025} can be determined by using a standard normal table; they are equal to ±1.96. These values are called critical values and obviously depend upon α. Hence, calculation of Z yields a statistic that will cause H₀ to be believed 95% of the time and H₁ to be believed only 5% of the time, when H₀ is actually true. Therefore, α can be interpreted as the magnitude of the error of rejecting the null hypothesis when in fact it is true. This error is often referred to as an error of type I. Additionally, if the null hypothesis is false, there is still a chance that the calculated value of Z will lie between ±1.96 (when α = 0.05). This result will cause the decision analyst to accept the null hypothesis when in fact it is false. The magnitude (likelihood) of this error is commonly denoted by β, and this error is called an error of type II.

In order to illustrate the basic procedures of statistical hypothesis testing, consider the
following example:

An oil investment cartel is considering the purchase of an oil well from Blow Hard, Inc. in
Texas. Current owners claim that the well produces on the average 100 barrels of oil per day,
with a standard deviation of 10 barrels per day. In order to test this claim, the cartel chooses α = 0.05 and observes daily production for 16 days. Total production over this period of time is 1690 barrels of oil. Can the owner’s claim be disputed?

Assumptions: σ = 10, infinite population

Hypothesis test: H₀: μ = 100 versus H₁: μ ≠ 100

Test statistic: Z = (x̄ − μ₀) / (σ/√n)

Critical values: ±z_{0.025} = ±1.96

Sample mean: x̄ = 1690/16 = 105.625

Calculated Z statistic: Z = (105.625 − 100) / (10/√16) = 2.25

Since the value of Z = 2.25 is greater than z_{0.025} = 1.96, one would choose to reject H₀ in favour of H₁.
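The example can be reproduced in a few lines of stdlib Python (a sketch; the two-sided critical value 1.96 is hard-coded rather than read from a table):

```python
from math import sqrt

mu0 = 100        # hypothesized mean (barrels/day)
sigma = 10       # known population standard deviation
n = 16           # days observed
total = 1690     # total production over the period

xbar = total / n                      # sample mean, 105.625
z = (xbar - mu0) / (sigma / sqrt(n))  # test statistic, 2.25
z_crit = 1.96                         # two-sided critical value at alpha = 0.05

reject_h0 = abs(z) > z_crit           # True -> dispute the owners' claim
```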
UNIT 3: REGRESSION ANALYSIS

Regression analysis is:

- A technique for measuring and explaining (reducing unexplained) variability in a system
- An aid to understanding interrelationships in complex systems
- A process for building a useful model of a system
- A method for improving forecasting or prediction
- A mechanism for focusing on important phenomena
- A system for evaluating theories or beliefs
- An aid in formulating new theory
- A method for obtaining better control of variation
- A technique for estimating equation parameters

Regression modelling involves practical problems, problems of judgment, and a good deal of art.

An equation of the form

Y = β₀ + β₁X₁ + β₂X₂ + … + β_P X_P + ε

is sometimes referred to as the general linear model. In this equation, Y is a variable whose behaviour is of interest. It was once common to refer to Y as the dependent variable, taken from the mathematical concept of a function. In statistical modelling, most authors have come to call Y the response variable. This is the convention adopted here. In the equation above, Y is a linear additive function of the X variables, which are P in number: X₁, …, X_P. These X’s were formerly often referred to as independent variables, again using the mathematical sense. They are now sometimes called regressors or explanatory variables but are more commonly called predictors.
Least-Squares Method

The great power of multiple linear regression (MLR) analysis lies in its ability to relate
simultaneously the many inter-correlated predictors to the response—to deal with non-
experimental data. Herein also lies the main source of danger. Successful modelling of non-
experimental data is a tricky business. But not all the dangers are associated with the natural inter-correlations of non-experimental data. The variety of ways in which the analyst can encounter trouble is nearly as great as the variety of problem situations.

In MLR, understanding variation is the basis for problem solving. Variation in the response is
made up (theoretically) of two parts:

1. The systematic variation (signal), which is associated with or is in response to changes in the predictors

2. Leftover variation (noise), which is called ‘‘residual error’’ or ‘‘experimental error.’’

The distinction is really not so sharp. The leftover error is actually associated with a great
many things that, in practice, might be measured (and included in the model) if analysts had
sufficient time, wisdom, patience, and money. They simply choose not to try to identify all
sources of variation. They will discontinue the search when there seems to be no regular
pattern of errors left over and when either all the reasonable predictors have been adequately
tested or the residual error variance is small enough—again depending on goals. In terms of
the true coefficients and residual error of the theoretical model, the observed response
variable may be expressed as

Y = β₀ + Σⱼ βⱼXⱼ + ε

where ε is the ‘‘residual error’’ associated with Y and (theoretically) has variance σ². The fitted model containing the estimates bⱼ of the βⱼ’s, then, is

Ŷ = b₀ + Σⱼ bⱼXⱼ
where the circumflex or ‘‘hat’’ on Y denotes the predicted or estimated value of the response.
It is like an average (where a ‘‘bar’’ is used). In fact, it is the conditional average, given the
location in the space defined by the X’s. It is an estimate of the expected or true value of the
response for that location or set of conditions.
The differences between the observed and fitted values of Y are the residual errors or, simply, ‘‘residuals,’’

e = Y − Ŷ

where e is an estimate of the ‘‘true error’’ ε. In practice, e may contain anything the analyst chooses to omit from the model. It has sample variance

s²_{Y·X} = Σ (Y − Ŷ)² / (n − P − 1)

which, for the theoretical case, is an estimate of the ‘‘experimental’’ error variance. The subscript Y·X (‘‘Y dot X’’) means ‘‘for Y, given the model containing a particular set of X’s.’’ Thus s²_{Y·X} is the sample estimate of the residual variance in Y, given the model.

Residual Variance

For P = 1, the equation above reduces to s²_{Y·X} = SSRes/(n − 2), where SSRes is the residual sum of squares given by

SSRes = SSY − SSReg

and where SSY is the (corrected) sum of squares of the Y’s and SSReg is the regression sum of squares.
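For the simple (P = 1) case, all of these sums of squares can be computed directly. A stdlib sketch using the standard least-squares formulas b₁ = SSXY/SSX and b₀ = ȳ − b₁x̄ (the toy data are my own):

```python
def ols_simple(xs, ys):
    """Least-squares fit y = b0 + b1*x, plus the residual variance s^2."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    ssx = sum((x - xbar) ** 2 for x in xs)
    ssxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = ssxy / ssx
    b0 = ybar - b1 * xbar
    ssy = sum((y - ybar) ** 2 for y in ys)                         # total (corrected) SS
    ssres = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # residual SS
    ssreg = ssy - ssres                                            # regression SS
    s2 = ssres / (n - 2)                                           # residual variance, P = 1
    return b0, b1, ssy, ssreg, s2

# A line close to y = 0.6 + 2.2x with small departures:
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 12]
b0, b1, ssy, ssreg, s2 = ols_simple(xs, ys)
```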

Model Specification

In other circumstances, where the physics and chemistry are not so well understood (e.g., in
studying the cause of a disease), the question may focus on the statistical significance of the
relationship. The analyst is attempting to decide whether the relationship seen in the sample
is something real or just the result of chance association. This decision is appropriate along
all goal sequences except where existing theory permits prior specification* of the model.
Model specification is the process of choosing an adequate representation of reality. To
decide this question of reality, the analyst would want a test model for the behaviour of the
estimator, b₁, when the association is just chance. One way would be to use the t test model with the null hypothesis that β₁ = 0 (or some other appropriate value). The alternative hypothesis might be β₁ ≠ 0. The t distribution is appropriate by the central limit theorem. Then

t = b₁ / SE(b₁)

with the critical value for t of t_{α/2, n−2}, where α represents the specified degree of risk of rejecting a true null hypothesis (claiming a non-existent association). For the simple (P = 1) case, the standard errors of b₁ and b₀ are given by

SE(b₁) = s_{Y·X} / √SSX      SE(b₀) = s_{Y·X} √(1/n + x̄²/SSX)
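Continuing the simple-regression case, the t statistic for the slope can be sketched as follows (stdlib only, with my own toy data; the critical value t_{α/2, n−2} must still be looked up in a t table):

```python
from math import sqrt

def slope_t_statistic(xs, ys):
    """t = b1 / SE(b1) for the simple linear regression of ys on xs."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    ssx = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / ssx
    b0 = ybar - b1 * xbar
    ssres = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(ssres / (n - 2))        # residual standard deviation s_{Y.X}
    return b1 / (s / sqrt(ssx))      # t statistic for H0: beta1 = 0

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 12]
t = slope_t_statistic(xs, ys)
# Compare |t| with the critical value t_{alpha/2, n-2} from a t table.
```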
UNIT 4: FACTOR ANALYSIS
Factor Analysis (FA) is an exploratory technique applied to a set of observed variables that
seeks to find underlying factors (subsets of variables) from which the observed variables
were generated. For example, an individual’s response to the questions on a college entrance
test is influenced by underlying variables such as intelligence, years in school, age, emotional
state on the day of the test, amount of practice taking tests, and so on. The answers to the
questions are the observed variables. The underlying, influential variables are the factors.
Factor analysis is carried out on the correlation matrix of the observed variables. A factor is a
weighted average of the original variables. The factor analyst hopes to find a few factors from
which the original correlation matrix may be generated.
Usually, the goal of factor analysis is to aid data interpretation. The factor analyst hopes to
identify each factor as representing a specific theoretical factor. Therefore, many of the
reports from factor analysis are designed to aid in the interpretation of the factors.
Another goal of factor analysis is to reduce the number of variables. The analyst hopes to
reduce the interpretation of a 200-question test to the study of 4 or 5 factors. One of the most
subtle tasks in factor analysis is determining the appropriate number of factors.
Factor analysis has an infinite number of solutions. If a solution contains two factors, these
may be rotated to form a new solution that does just as good a job at reproducing the
correlation matrix. Hence, one of the biggest complaints of factor analysis is that the solution
is not unique. Two researchers can find two different sets of factors that are interpreted quite
differently yet fit the original data equally well.

The principal-axis method is used here to solve the factor analysis problem. Factor analysis assumes the following partition of the correlation matrix, R:

R = AA′ + U

where A is the matrix of factor loadings and U is a diagonal matrix of unique variances. The principal-axis method proceeds according to the following steps:

1. Estimate U from the communalities, as discussed below.

2. Find L and V, the eigenvalues and eigenvectors of R − U, using standard eigenvalue analysis.

3. Calculate the loading matrix as A = V L^{1/2}.

4. Calculate the score matrix as B = R⁻¹A.

5. Calculate the factor scores as F = XB, where X is the matrix of standardized observations.

Steps 1-3 may be iterated, since a new U matrix may be estimated from the current loading matrix.

We close this section with a discussion of obtaining an initial value of U, using the initial estimation of Cureton (1983). The initial communality estimates, ĉᵢᵢ, are calculated from the correlation and inverse correlation matrices; the calculation uses r^ii, the ith diagonal element of R⁻¹, and the elements rᵢⱼ of R. The value of U is then estimated from the uniquenesses uᵢᵢ = 1 − ĉᵢᵢ.
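Steps 1-3 above can be sketched with NumPy (an illustration, not any particular package’s implementation; for simplicity the initial communalities use the plain squared-multiple-correlation estimate 1 − 1/r^ii rather than Cureton’s refinement, and the toy correlation matrix is my own):

```python
import numpy as np

def principal_axis(R, n_factors, iterations=50):
    """Iterated principal-axis factoring of a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlations, 1 - 1/r^ii.
    comm = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(iterations):
        reduced = R.copy()
        np.fill_diagonal(reduced, comm)          # R - U: communalities on the diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        L = np.clip(eigvals[order], 0, None)     # guard against small negative eigenvalues
        V = eigvecs[:, order]
        A = V * np.sqrt(L)                       # loading matrix A = V L^{1/2}
        comm = (A ** 2).sum(axis=1)              # updated communalities
    return A

# Toy correlation matrix with one dominant factor:
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
loadings = principal_axis(R, n_factors=1)
```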

UNIT 5: DISCRIMINANT ANALYSIS

Discriminant Analysis finds a set of prediction equations based on independent variables that
are used to classify individuals into groups. There are two possible objectives in a
discriminant analysis: finding a predictive equation for classifying new individuals or
interpreting the predictive equation to better understand the relationships that may exist
among the variables.

In many ways, discriminant analysis parallels multiple regression analysis. The main
difference between these two techniques is that regression analysis deals with a continuous
dependent variable, while discriminant analysis must have a discrete dependent variable. The
methodology used to complete a discriminant analysis is similar to regression analysis. You
plot each independent variable versus the group variable. You often go through a variable
selection phase to determine which independent variables are beneficial. You conduct a
residual analysis to determine the accuracy of the discriminant equations.

Suppose you have data for K groups, with nₖ observations in group k. Let N represent the total number of observations. Each observation consists of the measurements of p variables. The ith observation in group k is represented by xₖᵢ. Let M represent the vector of means of these variables across all groups and Mₖ the vector of means of observations in the kth group.

Define three sums of squares and cross products matrices, S_T, S_W, and S_A, as follows:

S_T = Σₖ Σᵢ (xₖᵢ − M)(xₖᵢ − M)′

S_W = Σₖ Σᵢ (xₖᵢ − Mₖ)(xₖᵢ − Mₖ)′

S_A = S_T − S_W

Next, define two degrees of freedom values: df1 = K − 1 and df2 = N − K.

A discriminant function is a weighted average of the values of the independent variables. The
weights are selected so that the resulting weighted average separates the observations into the
groups. High values of the average come from one group, low values of the average come
from another group. The problem reduces to one of finding the weights which, when applied to the data, best discriminate among groups according to some criterion. The solution reduces to finding the eigenvectors, V, of S_W⁻¹S_A. The canonical coefficients are the elements of these eigenvectors.

A goodness-of-fit parameter, Wilks’ lambda, is defined as follows:

Λ = |S_W| / |S_T| = Π_{j=1}^{m} 1/(1 + λⱼ)

where λⱼ is the jth eigenvalue corresponding to the eigenvectors described above and m is the minimum of K − 1 and p. The canonical correlation between the jth discriminant function and the independent variables is related to these eigenvalues as follows:

r²ⱼ = λⱼ / (1 + λⱼ)
Various other matrices are often considered during a discriminant analysis.

The overall covariance matrix, T, is given by:

T = S_T / (N − 1)

The within-group covariance matrix, W, is given by:

W = S_W / (N − K)

The among-group (or between-group) covariance matrix, A, is given by:

A = S_A / (K − 1)

The linear discriminant functions are the weighted sums obtained by applying the canonical coefficients to the data:

fⱼ = Σᵢ vᵢⱼ xᵢ

The standardized canonical coefficients are given by:

ṽᵢⱼ = vᵢⱼ √wᵢᵢ

where vᵢⱼ are the elements of V and wᵢᵢ are the diagonal elements of W.
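The matrix computations above can be sketched with NumPy (illustrative only; the two-group toy data and the function name are my own):

```python
import numpy as np

def discriminant_summary(groups):
    """Return Wilks' lambda and canonical coefficients for a list of
    per-group observation matrices (rows = observations, cols = variables)."""
    X = np.vstack(groups)
    M = X.mean(axis=0)                       # overall mean vector
    p = X.shape[1]
    St = (X - M).T @ (X - M)                 # total SSCP matrix
    Sw = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    Sa = St - Sw                             # among-group SSCP matrix
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sa)
    eigvals = np.real(eigvals)
    m = min(len(groups) - 1, p)
    lam = np.prod(1.0 / (1.0 + np.sort(eigvals)[::-1][:m]))  # Wilks' lambda
    return lam, np.real(eigvecs)

g1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])
g2 = np.array([[6.0, 7.0], [7.0, 8.0], [8.0, 8.0]])
wilks, coeffs = discriminant_summary([g1, g2])
# Well-separated groups give a Wilks' lambda near 0.
```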

The correlations between the independent variables and the canonical variates (often called structure coefficients) are also commonly reported during a discriminant analysis.