
01 Econ115a Mod2 Lesson3 BasicDataManagementusingSPSS

The document discusses scales of measurement and encoding variables in SPSS. It defines variables, data, and datasets. It describes different data types and scales of measurement including nominal, ordinal, interval, and ratio scales. It also outlines how to encode variables and data in SPSS.

Uploaded by

lyriemaecutara0
Copyright
© All Rights Reserved

Econ 115a – Econometrics

Module 2: Variables and Data

Ian Dave B. Custodio
Instructor

Lesson 3: Basic Data Management using SPSS

Learning objectives:
- Define Variables, Data, and Datasets
- Identify the different types of data
- Encode variables and data in SPSS

Outline
3.1 Variables, Data, and Datasets
3.2 Data Types
3.3 Scales of Measurement
3.4 Encoding variables in SPSS
3.5 Encoding Data in SPSS

3.1 Variables, Data, and Datasets

Variable
- a quantity that may assume any one of a set of values; something that varies
(Merriam-Webster)

- a symbol, commonly a single letter, that represents a number, called the value of
the variable, which is either arbitrary, not fully specified, or unknown.

- can take any form, as long as it stands for a value that is not fixed in advance
(e.g., age, gender, civil status, income/salary, sales, expenditures).

Data
- factual information (such as measurements or statistics) used as a basis for
reasoning, discussion, or calculation (Merriam-Webster)

- information in digital form that can be transmitted or processed (Merriam-Webster)

The process of sorting/calculating data is called “data processing”, while its result is
called “information”.

The singular form is “datum”, but nowadays “data” is used as both singular and plural.

Dataset
- refers to a file that contains one or more records (IBM)

- a collection of related sets of information that is composed of separate elements
but can be manipulated as a unit by a computer (Google/Oxford Languages)

- a collection of data

- can be written as “data set” or “dataset”

3.2 Data Types



According to source
1. Primary data – data taken directly from the respondents/samples; mostly
involves person-to-person contact.

2. Secondary data – data taken from published articles compiled and processed
by personnel from an institution and/or government agency.

According to nature
1. Quantitative data – numerical data (e.g., age, household size, income)

2. Qualitative data – non-numerical data (e.g., sex/gender, civil status, occupation)

According to time dimension


1. Cross-sectional data – data taken at one (1) time period

2. Time series data – data taken at several time periods

3. Panel data (cross-sectional time series data) – data on the same set of
observations/participants taken at several time periods.

According to measurement
1. Continuous data – data that can be divided into smaller units (e.g., height, weight,
distance)

2. Discrete data – data that cannot be divided into smaller units (e.g., number of students,
number of faculty in a school)

According to arrangement
1. Ranked data – data which can be arranged into a set of ordered categories (e.g.,
first, second, third)

2. Nominal data – discrete data which cannot be ordered (e.g., sex/gender, civil
status)

3.3 Scales of Measurement



Quick review on the measures of Central Tendency


1. Mean (µ for population, 𝑥̅ for sample)
- the arithmetic average. It is computed by summing all the data values and then
dividing by the number of values (cases).

2. Median
- the value that divides the ordered dataset into two (2) equal parts. First, arrange the
data in an array (sorted order), then find the center value. For an odd number of cases,
the median is the single middle value; for an even number of cases, add the two middle
values, then divide by two.

3. Mode
- is the most frequently occurring value in a dataset. It is determined simply by
counting how many times each value appears and then finding the value with the
highest frequency.

- a dataset can have no mode, one mode (unimodal), two modes (bimodal), or more than
two modes (multimodal)

Example: 30 14 28 7 12 4 21
4 22 8 16 20 2 10

Mean?

Median?

Mode?
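
The three questions above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module rather than SPSS; the variable names are illustrative and not part of the lesson.

```python
from statistics import mean, median, multimode

# The 14 values from the example above
values = [30, 14, 28, 7, 12, 4, 21, 4, 22, 8, 16, 20, 2, 10]

# Mean: sum of all values divided by the number of cases (198 / 14)
sample_mean = mean(values)

# Median: n is even here, so it is the average of the two middle
# values of the sorted data (12 and 14)
sample_median = median(values)

# Mode(s): multimode returns every value tied for the highest frequency
sample_modes = multimode(values)

print(f"mean    = {sample_mean:.2f}")
print(f"median  = {sample_median}")
print(f"mode(s) = {sample_modes}")
```

Sorting the values gives 2, 4, 4, 7, 8, 10, 12, 14, 16, 20, 21, 22, 28, 30, so the mean is about 14.14, the median is 13, and the only repeated value, 4, is the mode.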

Which measure is the most reliable for determining the central tendency of a
given dataset?

Answer: Median, because it is the least affected by extreme values

Mean is sensitive to outliers (extreme values)

Mode relies only on frequency of occurrence (a dataset may have no mode, or be bimodal or multimodal)
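
To illustrate the point about outliers, the sketch below compares the mean and median of a small dataset before and after one extreme value is added. The numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical values (not from the lesson)
incomes = [12, 14, 15, 16, 18]
with_outlier = incomes + [500]  # add one extreme value


def summarize(data):
    """Return (mean, median) of a list of numbers."""
    return mean(data), median(data)


base_mean, base_median = summarize(incomes)
out_mean, out_median = summarize(with_outlier)

print(f"without outlier: mean = {base_mean}, median = {base_median}")
print(f"with outlier:    mean = {out_mean:.2f}, median = {out_median}")
```

With the extreme value added, the mean jumps from 15 to about 95.83, while the median only moves from 15 to 15.5.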

Scales (Levels) of Measurement


1. Nominal
2. Ordinal
3. Interval
4. Ratio

NOTE: In SPSS, interval and ratio variables are both assigned the measurement level “Scale”

1. Nominal
- a scale used to label variables that have no quantitative values.

- the values just “name” the attribute uniquely, no ordering of the cases is implied.

For example, jersey numbers in basketball are measured at the nominal level. A player
with number 30 is not more of anything than a player with number 15 and is certainly
not twice whatever number 15 is.

Examples: gender, hair/eye color, blood type, place/address, full name, etc.

Properties
They have no natural order. For example, we can’t arrange eye colors in order of
worst to best or lowest to highest.

Categories are mutually exclusive. For example, an individual can’t have both blue
and brown eyes. Similarly, an individual can’t live both in the city and in a rural area.

The only numbers we can calculate for these variables are counts. For example, we
can count how many individuals have blonde hair, how many have black hair, how
many have brown hair, etc.

The only measure of central tendency we can calculate for these variables is the
mode. The mode tells us which category had the most counts. For example, we could
find which eye color occurred most frequently.
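
Counting categories and finding the mode of a nominal variable can be sketched as follows. The eye-color responses are hypothetical, and the example uses Python's standard library rather than SPSS.

```python
from collections import Counter

# Hypothetical eye-color responses (nominal data: labels only, no order)
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]

# Counts are the only numbers that make sense for nominal data
counts = Counter(eye_colors)

# The mode is the category with the highest count
mode_category, mode_count = counts.most_common(1)[0]

for category, n in counts.items():
    print(f"{category}: {n}")
print(f"mode = {mode_category} ({mode_count} cases)")
```

Here "brown" occurs three times, so it is the mode. A mean or median of eye colors would have no meaning.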

2. Ordinal
- a scale used to label variables that have a natural order, but no quantifiable
difference between values.

Examples: Satisfaction level, Socioeconomic status, Academic position, Level of pain



Properties
They have a natural order. For example, “very satisfied” is better than “satisfied,”
which is better than “neutral,” etc.

The difference between values can’t be evaluated. For example, we can’t exactly
say that the difference between “very satisfied” and “satisfied” is the same as the
difference between “satisfied” and “neutral.”

The two measures of central tendency we can calculate for these variables are
the mode and the median. The mode tells us which category had the most counts
and the median tells us the “middle” value.
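
One way to find the median of an ordinal variable is to map each category to its rank, take the middle rank, and map it back to the label. The sketch below assumes a five-point satisfaction scale and hypothetical responses. It uses `median_low` so that the result is always an actual category, even with an even number of cases; that choice is a convention assumed here, not something the lesson prescribes.

```python
from statistics import median_low

# Assumed five-point scale, listed from lowest to highest
levels = [
    "very dissatisfied",
    "dissatisfied",
    "neutral",
    "satisfied",
    "very satisfied",
]
rank = {label: position for position, label in enumerate(levels)}

# Hypothetical survey responses
responses = ["satisfied", "neutral", "very satisfied", "satisfied", "dissatisfied"]

# Convert labels to ranks, take the middle rank, convert back to a label
ranks = [rank[response] for response in responses]
median_response = levels[median_low(ranks)]

print(f"median response = {median_response}")
```

The ranks are used only for ordering. Averaging them would treat the gaps between categories as equal, which an ordinal scale does not guarantee.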

3. Interval
- a scale used to label variables that have a natural order and a quantifiable difference
between values, but “no true zero” value.

Examples: temperature, time (24-hour format), IQ, shoe size



Properties
These variables have a natural order.

We can measure the mean, median, and mode of these variables.

These variables have an exact difference between values. Recall that ordinal
variables have no exact difference between values – we don’t know if the difference
between “very satisfied” and “satisfied” is the same as the difference between
“satisfied” and “neutral.”

For variables on an interval scale, though, we know that the difference between a
credit score of 850 and 800 is exactly the same as the difference between 800 and 750.

These variables have no “true zero” value, that is, no value that means the complete
absence of the quantity. For example, it’s impossible to have a credit score of zero.
It’s also impossible to have an SAT score of zero.

For temperatures, zero does not mean “no temperature”, and it’s possible to have negative
values (e.g., -10° F), so zero is not a floor that values can’t go below.

4. Ratio
- a scale used to label variables that have a natural order, a quantifiable difference
between values, and a “true zero” value.

Example: height, weight, length, width, allowance, income



Properties
These variables have a natural order.

We can calculate the mean, median, mode, and a variety of other descriptive
statistics for these variables.

These variables have an exact difference between values.



These variables have a “true zero” value. For example, length, weight, and height
all have a minimum value (zero) that they can’t go below.

Because zero means the complete absence of the quantity, and ratio variables usually
can’t take on negative values, the ratio between values can be calculated.

For example, someone who weighs 200 lbs. can be said to weigh two times as much
as someone who weighs 100 lbs. Likewise, someone who is 6 feet tall is 1.5 times
as tall as someone who is 4 feet tall.
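
A quick way to see why ratios are meaningful on a ratio scale but not on an interval scale is to change the unit of measurement. In the sketch below, the weights come from the example above, while the temperatures and helper function names are illustrative assumptions.

```python
def lbs_to_kg(pounds):
    """Convert pounds to kilograms (1 lb = 0.45359237 kg)."""
    return pounds * 0.45359237


def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Ratio scale: the ratio 200 lbs / 100 lbs survives a change of unit
weight_ratio_lbs = 200 / 100
weight_ratio_kg = lbs_to_kg(200) / lbs_to_kg(100)

# Interval scale: 20 °C looks like "twice" 10 °C, but not in °F
temp_ratio_c = 20 / 10
temp_ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Equal differences do stay equal on an interval scale
diff_low = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_high = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20)

print(f"weight ratio: {weight_ratio_lbs} (lbs) vs {weight_ratio_kg:.2f} (kg)")
print(f"temperature ratio: {temp_ratio_c} (C) vs {temp_ratio_f:.2f} (F)")
print(f"temperature differences in F: {diff_low} and {diff_high}")
```

The weight ratio is 2 in both units, while the temperature ratio changes from 2 to 1.36 because the zero point of each temperature scale is arbitrary. The 10-degree Celsius differences both convert to 18 degrees Fahrenheit, so differences remain comparable.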

Image source: https://www.graphpad.com/support/faq/what-is-the-difference-between-ordinal-interval-and-ratio-variables-why-should-i-care/



3.4 Encoding variables in SPSS

Demonstration

3.5 Encoding Data in SPSS

Demonstration

THANK YOU!

Ian Dave B. Custodio
Instructor I
Visayas Socio-Economic Research and Data Analytics Center (ViSERDAC), VSU
E-mail: idcustodio@vsu.edu.ph
Telephone: 053 563 7064 local 1121

Visayas Socio-Economic Research and Data Analytics Center
1st Floor, Department of Economics, Visayas State University
6521-A Baybay City, Leyte, Philippines
E-mail: viserdac@vsu.edu.ph
Telephone: 053 563 7064 local 1121
