
Week 10: Evaluation


EVALUATION

CLO 5
Course Learning Outcomes (CLO)

Upon successful completion of the course, the student will demonstrate the ability to:

CLO1: Explain the concepts underpinning the interaction between users and computer interfaces, including dialog techniques and accessibility guidelines
CLO2: Understand the cognitive principles that support human-centered design
CLO3: Analyze and evaluate existing interactive user interfaces based on usability and principles of good design
CLO4: Apply the steps in interactive design, including requirements definition, task analysis, prototyping, and usability testing
CLO5: Build lo-fi and hi-fi prototypes for a user interface that meet HCI best practices
Kahoot Quiz

 WEEK 10

Chapter Objectives

Explain the key concepts and terms used in evaluation

Introduce a range of different types of evaluation methods

Show how different evaluation methods are used for different purposes at different stages of the design process and in different contexts of use

Demonstrate how evaluators mix and modify methods to meet the demands of evaluating novel systems
Why, what, where, and when to evaluate

Iterative design and evaluation is a continuous process that examines:

Why: To check users’ requirements and confirm that users can utilize the product and
that they like it

What: A conceptual model, early and subsequent prototypes of a new system, more
complete prototypes, and a prototype to compare with competitors’ products

Where: In natural, in-the-wild, and laboratory settings

When: Throughout design; finished products can be evaluated to collect information to inform new products
Bruce Tognazzini tells you why you need to evaluate

“Iterative design, with its repeating cycle of design and testing, is the
only validated methodology in existence that will consistently produce
successful results. If you don’t have user-testing as an integral part of
your design process you are going to throw buckets of money down the
drain.”
See AskTog.com for topical discussions about design and evaluation
Types of evaluation

Controlled settings that directly involve users (for example, usability and
research labs)

Natural settings involving users (for instance, online communities and products that are used in public places); often there is little or no control over what users do, especially in in-the-wild settings

Any setting that doesn't directly involve users (for example, consultants and researchers critique the prototypes, and may predict and model how successful they will be when used by users)
Living labs

People’s use of technology in their everyday lives can be evaluated in living labs

Such evaluations are too difficult to do in a usability lab

An early example was the Aware Home that was embedded with a
complex network of sensors and audio/video recording devices (Abowd
et al., 2000)
Living labs (continued)

More recent examples include whole blocks and cities that house hundreds of people, for example, Verma et al.'s (2017) research in Switzerland

Many citizen science projects can also be thought of as living labs, for instance, iNaturalist.org

These examples illustrate how the concept of a lab is changing to include other spaces where people's use of technology can be studied in realistic environments
Evaluation case studies

A classic experimental investigation into the physiological responses of players of a computer game

An ethnographic study of visitors at the Royal Highland Show, in which participants are directed and tracked using a mobile phone app

Crowdsourcing, in which the opinions and reactions of volunteers (that is, the crowd) inform technology evaluation
Challenge and engagement in a collaborative immersive game

Physiological measures were used

Players were more engaged when playing against another person than
when playing against a computer

Why was the physiological data that was collected normalized? (See the sketch after the source note below.)


Physiological data of participants in a videogame

Source: Mandryk and Inkpen (2004), “The Physiological Indicators for the Evaluation of Co-located Collaborative Play,” CSCW 2004, pp. 102-111. Reproduced with permission of ACM Publications.
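
A common reason for normalizing, offered here as a hedged aside rather than the study's stated rationale: raw skin-response levels vary widely from person to person, so each participant's signal is typically rescaled against their own baseline before players are compared. A minimal Python sketch with made-up numbers (not Mandryk and Inkpen's actual pipeline):

```python
# Baseline-normalize each participant's skin-response signal so that values
# are comparable across people with different resting levels.
# Hypothetical data and function name, for illustration only.

def normalize(signal, baseline):
    """Express each sample as percent change from the participant's own baseline."""
    return [(s - baseline) / baseline * 100 for s in signal]

# Two participants with very different resting conductance levels:
p1 = normalize([2.1, 2.4, 3.0], baseline=2.0)   # low-baseline participant
p2 = normalize([8.4, 9.6, 12.0], baseline=8.0)  # high-baseline participant

print(p1)  # [5.0, 20.0, 50.0] -> same relative arousal pattern
print(p2)  # [5.0, 20.0, 50.0] -> despite very different raw magnitudes
```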
Example of physiological data
a) A participant's skin response when scoring a goal against a friend
b) Another participant's response when engaging in a hockey fight against a friend versus against the computer
Ethnobot app used at the Royal Highland Show

• Ethnobot directed Billy to a particular place (Aberdeenshire Village)
• Next, Ethnobot asked “…what’s going on?”
• The screen shows five of the experience buttons from which Billy needs to select a response
Experience responses submitted in Ethnobot

Number of prewritten experience responses submitted by participants to the pre-established questions that Ethnobot asked them about their experiences
What did we learn from the case studies?

How to observe users in the lab and in natural settings

How evaluators exert different levels of control in the lab, in natural settings, and in crowdsourcing evaluation studies

Use of different evaluation methods


What did we learn from the case studies? (continued)

How to develop different data collection and analysis techniques to evaluate user experience goals such as challenge and engagement

The ability to run quick and inexpensive experiments on the Internet using crowdsourcing

How a large number of participants can be recruited using Mechanical Turk
Evaluation methods

Method           Controlled settings   Natural settings   Without users
Observing        x                     x
Asking users     x                     x
Asking experts                         x                  x
Testing          x
Modeling                                                  x
The language of evaluation

 Analytical evaluation
 Analytics
 Biases
 Controlled experiment
 Crowdsourcing
 Ecological validity
 Expert review or criticism
 Field study
 Formative evaluation
 Heuristic evaluation
 Informed consent form
 In the wild evaluation
 Living laboratory
 Predictive evaluation
 Reliability
 Scope
 Summative evaluation
 Usability laboratory
 Usability testing
 User studies
 Users or participants
Participants’ rights and getting their consent

Participants need to be told why the evaluation is being done and what they will be asked to do, and they need to be informed about their rights

Informed consent forms provide this information and act as a contract between participants and researchers

The design of the informed consent form, the evaluation process, data analysis, and data storage methods are typically approved by a high authority, such as the Institutional Review Board
Things to consider when interpreting data

Reliability: Does the method produce the same results on separate occasions? (See the sketch after this list.)

Validity: Does the method measure what it is intended to measure?

Ecological validity: Does the environment of the evaluation distort the results?

Biases: Are there biases that distort the results?

Scope: How generalizable are the results?
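
One way to make the reliability question concrete (an illustrative sketch, not something the slides prescribe) is to correlate the same measurement taken on two separate occasions; a correlation near 1 suggests the method yields consistent results:

```python
# Test-retest reliability sketch: correlate task-completion times measured
# for the same participants on two occasions. Hypothetical data.
import statistics

session1 = [34.0, 41.0, 29.0, 52.0, 38.0]  # seconds, occasion 1
session2 = [36.0, 40.0, 31.0, 49.0, 39.0]  # seconds, occasion 2

r = statistics.correlation(session1, session2)  # Pearson r (Python 3.10+)
print(f"test-retest correlation: {r:.2f}")      # close to 1.0 -> consistent method
```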


Usability testing

Involves recording performance of typical users doing typical tasks

Controlled settings

Users are observed and timed

Data is recorded on video, and key presses are logged (see the logging sketch after this list)

The data is used to calculate performance times and to identify and explain errors

User satisfaction is evaluated using questionnaires and interviews

Field observations may be used to provide contextual understanding
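
The procedure above notes that key presses are logged and performance is timed. A minimal sketch of such a logger in Python, with a hypothetical CSV event format chosen purely for illustration:

```python
# Minimal usability-test event logger: records timestamped events per
# participant so performance times and errors can be computed afterward.
# The event names and file format are hypothetical.
import csv
import time

class EventLogger:
    def __init__(self, path):
        self.start = time.monotonic()              # session clock starts at zero
        self.file = open(path, "w", newline="")
        self.writer = csv.writer(self.file)
        self.writer.writerow(["seconds", "participant", "event"])

    def log(self, participant, event):
        elapsed = time.monotonic() - self.start    # seconds since session start
        self.writer.writerow([f"{elapsed:.3f}", participant, event])

    def close(self):
        self.file.close()

logger = EventLogger("session.csv")
logger.log("P01", "task_start")
logger.log("P01", "keypress:Enter")
logger.log("P01", "error:wrong_menu")
logger.log("P01", "task_complete")
logger.close()
```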


Quantitative performance measures

Number of users successfully completing the task

Time to complete the task

Time to complete the task after time away from the task

Number and type of errors per task

Number of errors per unit of time

Number of navigations to online help or manuals

Number of users making a particular type of error
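
Several of these measures can be computed directly from per-participant session records, such as those produced by a logger like the sketch above; the field names and numbers below are hypothetical:

```python
# Compute basic usability measures from per-participant task records.
# Hypothetical data; field names are illustrative.
sessions = [
    {"participant": "P01", "completed": True,  "seconds": 172.0, "errors": 2},
    {"participant": "P02", "completed": True,  "seconds": 204.0, "errors": 5},
    {"participant": "P03", "completed": False, "seconds": 300.0, "errors": 9},
]

completed = [s for s in sessions if s["completed"]]
success_rate = len(completed) / len(sessions)
mean_time = sum(s["seconds"] for s in completed) / len(completed)
errors_per_minute = [s["errors"] / (s["seconds"] / 60) for s in sessions]

print(f"success rate:      {success_rate:.0%}")   # 67%
print(f"mean time (done):  {mean_time:.0f} s")    # 188 s
print(f"errors per minute: {[round(e, 1) for e in errors_per_minute]}")
```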


Usability lab with observers watching a user and assistant
Usability testing conditions

 Usability lab or other controlled space

 Emphasis on:
 Selecting representative users
 Developing representative tasks

 5-10 users typically selected

 Tasks usually around 30 minutes

 Test conditions are the same for every participant

 Informed consent form explains procedures and deals with ethical issues
How many participants is enough for user testing?

 The number is a practical issue


 Depends on:
 Schedule for testing
 Availability of participants
 Cost of running tests
 Typically 5-10 participants
 Some experts argue that testing should continue until no new insights are
gained
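
A widely cited model behind the 5-10 rule of thumb, not stated on the slide but consistent with it, is Nielsen and Landauer's (1993) problem-discovery curve: with n users, the expected proportion of usability problems found is 1 - (1 - p)^n, where p is the probability that a single user exposes a given problem (often estimated at about 0.31):

```python
# Problem-discovery curve (Nielsen and Landauer, 1993): proportion of
# usability problems found after testing n users, assuming each user
# independently exposes a given problem with probability p.
def proportion_found(n, p=0.31):
    return 1 - (1 - p) ** n

for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} users -> {proportion_found(n):.0%} of problems found")
# With p = 0.31, five users already surface roughly 84% of problems,
# which is why small panels are common; extra users add diminishing returns.
```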
Usability testing and Experiments

 Usability testing is applied experimentation

 Developers check that the system is usable by the intended user population by collecting data about participants’ performance on prescribed tasks

 Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more variables (see the sketch below)
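
To illustrate the experimental side, an evaluator might compare completion times under two interface variants and test whether the observed difference could plausibly be due to chance. A sketch with made-up data using SciPy's independent-samples t-test:

```python
# Independent-samples t-test comparing task-completion times (seconds)
# under two interface conditions. Hypothetical data for illustration.
from scipy import stats

design_a = [41.0, 38.5, 44.2, 39.9, 46.1, 42.3]
design_b = [35.2, 33.8, 37.5, 31.9, 36.4, 34.7]

t, p = stats.ttest_ind(design_a, design_b)
print(f"t = {t:.2f}, p = {p:.4f}")
# A small p-value (conventionally < 0.05) suggests the difference between
# the two designs is unlikely to be explained by chance alone.
```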
Summary

Evaluation and design are very closely integrated

Some of the same data gathering methods are used in evaluation as for
establishing requirements and identifying users’ needs, for example,
observation, interviews, and questionnaires

Evaluations can be done in controlled settings such as laboratories, in less controlled field settings, or where users are not present
Summary (continued)

Usability testing and experiments enable the evaluator to have a high level of control over what
gets tested, whereas evaluators typically impose little or no control on participants in field studies

Different methods can be combined to get different perspectives

Participants need to be made aware of their rights

It is important not to over-generalize findings from an evaluation

