Writing Good Exam Questions
A Self-study Workbook
Written by Dr Kate Exley
FOR TRAINING PURPOSES ONLY
Produced by the Staff and Educational Development Unit, March 2010
(minor revisions made August 2012)
10. Ways of producing accurate and clear marking guidance for questions
Appendices
Appendix 1
Appendix 2
Appendix 3
List of Figures
Figure 1
Figure 2
Figure 3
List of Tables
Table 1
Table 2
Table 3
Table 4
Table 5
1.
1.1
1.2
2.
The School's Assessment Code of Practice describes six assessment objectives that
should be kept in mind when writing examination questions and designing
assessments.
These are:
Objectives of assessment at LSHTM
3.
The goal: test items should be really difficult for people who don't understand the
subject material, but straightforward for those who do. If an item is
difficult because of complicated wording (e.g. double negatives) or vocabulary, you
will end up testing language skills rather than ability in the discipline.
Clarity
Reliability
Validity
Authenticity
Fairness
i) Clarity
EXERCISE
Testing for clarity: contrast the following versions of the same exam question
(essay-format answer required):
Version A:
Public health policy in the United Kingdom underwent a number of significant
changes during the Twentieth Century that can be directly attributed to the needs
and exigencies brought about by international conflict. Some of the changes and
developments that resulted to health systems and service delivery are still with us
today and it is important that we understand the background of circumstances that
influenced the decisions that were made. Provide a short analysis charting what you
consider to be the main transitions in public health policy brought about by the
unique needs and challenges, both direct and indirect, of an environment of
international conflict, within the UK health systems specifically, using the Second
World War as an example.
Version B:
Compare the advances in UK public health policy pre- and post-Second World War.
Think about points such as:
unclear test instructions,
confusing and ambiguous terminology,
being overly verbose,
using complicated vocabulary,
difficult or poor sentence structure,
unnecessary and distracting detail.
ii) Reliability
Does the question allow markers to grade it consistently and reproducibly and does it
allow markers to discriminate between different levels of performance? This
frequently depends on the quality of the marking guidance and clarity of the
assessment criteria. It may also be improved through providing markers with training
and opportunities to learn from more experienced assessors.
The likelihood of eliciting an accurate measure of a student's ability will be increased
when students are provided with a variety of ways to demonstrate their knowledge
and skills. For example, some students might generally do better in exams whilst
other students do better in their coursework. Including both in a course will
accommodate those differences between students. However, the DL courses are
provided through the University of London's external programme, which restricts the
mode of most assessment to examinations, so this may not be an option for all Module
leaders. Even so, within a written examination we can include a variety of
question formats that help to triangulate and cater to a student's abilities and
provide a more reliable measure of their attainments.
iii) Validity
iv) Authenticity
Authenticity is the need to match the style and approach of question setting to the
reality of practice. This is particularly important when considering the assessment of
Masters-level qualifications, which are frequently taken by mature students who are
accustomed to working within a professional context. A general example might be,
rather than setting an essay-style question, to ask students to present their understanding
in the style of a professional, industrial or clinical report.
This may be very important when considering the testing of procedural knowledge
or functioning knowledge (please see 5.1). When the exam seeks to test a
candidate's knowledge of how something works, the order or sequencing of events,
the interplay between contributing factors etc., it can be very important to ensure
this is built into the question formatting and context setting to allow authenticity.
Example
A learning outcome for a module is
'...will be able to design survey questionnaires to gather quantitative and qualitative
data in the field.'
v) Fairness
You need to give students a fair chance to demonstrate what they know and can do
and to be able to succeed in examinations. Fairness can be facilitated by being very
clear about expectations in student performance, providing examples of past
examination papers, giving opportunities for students to practise and develop exam
technique (through mocks, for example), plus transparency in the processes and
criteria that will be used to mark and grade their work.
Students should know what is expected of them in order to obtain a particular grade
and their marks should be a reflection of their abilities and not a reflection of
extraneous and irrelevant factors such as gender, disability etc. Providing a level
playing field is the aim and this is particularly important at The School when
considering the different groups of students who come to study or embark upon DL
courses, e.g. non-native English speakers, students who have previously
experienced very different educational cultures, mature professionals etc.
4.
[Figure 1. Diagram not reproduced. Surviving labels: "How do you know any of it is working?" and "Module Evaluation".]
Example
At the end of the module students should be able to select an appropriate method
and use it to test the significance of collected data.
The learning outcome clarifies what opportunities need to be built into a test question
to ensure that the test is valid. For the learning outcome given above, students
should be expected to select a method, have the scope to apply the
method to some data and, finally, be able to comment on the significance or
otherwise of the data. To clarify further, it would be beneficial to demarcate these
three different tasks within the question itself,
perhaps as separate question sections a, b and c, each with a
clear allocation of the total marks for the question.
To what extent
e.g. Using your knowledge of both prokaryotes and eukaryotes
With reference to
e.g. With reference to the published research from ..
EXERCISE
For each of the questions given below, underline the verb and the key elements of the question that indicate
its extent (limits and boundaries).
Do you feel these are appropriate for Masters-level study?
1.
Describe the three main methods of economic evaluation (40%). What are the
main strengths and weaknesses of each method? (40%). Support your answer with
examples of disease evaluation (20%)
2.
A recent retrospective analysis of health records in the Gambia has
suggested that the incidence of malaria has fallen dramatically in that country over
the last 10 years. The elimination of the disease is beginning to be discussed. The
National Malaria Control Programme has begun a surveillance system to detect
future changes.
What advice would you give the National Malaria Control Programme on how to
organize a surveillance system for malaria. Give practical tips for ensuring its quality.
3.
Write short notes on THREE of the following. In each case explain the
importance of the infectious agent and the mode of transmission in its spread and
control.
a) rotavirus diarrhoea
b) measles
c) guinea worm
d) dengue
e) tuberculosis
Please see Appendix 1 for some feedback comments on this exercise. You may
also wish to refer directly to the learning outcomes of your modules and the Masters
level descriptors in the Qualification Framework document.
5.
It is possible to test a wide variety of different kinds of knowledge, skills and attitudes
through the careful writing of examination questions.
5.1 e.g. Knowledge domains
5.2 e.g. Analysing, Evaluating
5.3 e.g. Writing skills, Time use
5.4 e.g. Ethics, equality
Again taking each of these elements in turn, let us first consider the different kinds of
knowledge and ways of knowing that you may wish to test in your students.
5.1.
Factual Knowledge
Terminology, facts, figures
Conceptual Knowledge
Classification, Principles, Theories, Structures, Frameworks
Procedural Knowledge
Algorithms, Techniques and Methods and Knowing when and how to use
them.
Metacognitive Knowledge
Strategy, Overview, Self Knowledge, Knowing how you know.
EXERCISE
Please consider the following four examination questions and decide what
kind of knowledge you feel each would test.
1.
What are the key steps and processes in bringing a new anti-cancer drug to
market and introducing it for clinical use?
2.
3.
Using the tabulated data provided, calculate the incidence risk of prostate
cancer per 1000 men, per 5 years, at each of the given levels of alcohol
consumption.
4.
Why do malaria parasites persist in the human population? Explain the choice
of drugs which could be used to prevent persistence of Plasmodium
falciparum and Plasmodium vivax.
5.2.
e.g. Do we want to test a candidate's ability to list important features, analyse the
given findings, or critique the argument they give?
Anderson et al.'s (2001) reworking of Bloom's taxonomy makes this easier, as they
chose to present the hierarchy of sub-categories as active verbs, and it is their
version in particular that has been widely used in course design and question design
in more recent years. It is, however, important to remember that:
'Although Bloom's lends itself to wide application, each discipline must define
the original classifications within the context of their field'
Crowe et al. (2008)
Figure 2.
Bloom's Taxonomy of Cognition Revisited by Anderson & Krathwohl (2001)
Create
Evaluate
Analyse
Apply
Understand
Remember
Note: some colleagues in the School may already be familiar with the original Bloom
taxonomy, which uses the terms Knowledge, Comprehension, Application, Analysis, Synthesis
and Evaluation.
Table 1
A table of suggested verbs mapped against the Anderson and Krathwohl
adapted levels of Bloom's Taxonomy of Cognition
Cognitive Level
Verb Examples
1. Remember
2. Understand
3. Apply
4. Analyse
5. Evaluate
6. Create
Table 2.
Ways in which intellectual skills can be tested through different question
stems.
Intellectual Skill
Stem
Comparing
Justifying
Summarising
Generalising
Inferring
Classifying
Creating
Applying
Analysing
Synthesising
Evaluating
(Adapted from Figure 7.11 of McMillan (2001) and Piontek, M.E. (2008))
Note: you may like to compare these question stems with Bloom's taxonomy,
given earlier, and to cross-refer to the learning
outcomes specified for your own Modules.
EXERCISE
Take a few moments to look down this list of question stems and select two
that you feel could be used to test students on your module/course.
Why have you selected these two?
5.3.
Short-answer and essay-style questions give an assessor the opportunity to
judge a range of generic or transferable skills in the way students answer the
questions or respond to the tasks set. The most obvious of these are the
ability to write clearly and appropriately, to structure and organise answers so that
the most important points are prioritised and well made, and to cite and use
source material effectively.
If these skills are to be included and given value in the assessment, this should be
clearly stated in the assessment criteria used to make judgements, and this fact
should be made clear to students. At The School this is an important issue, as many
of the Masters students are non-native English speakers. The proportion of the
marks for a test question allocated to skills such as written English should be
related to the Aims and Learning Outcomes for the course and context. In some
cases accuracy and style may be considered important, e.g. to highlight professional
skills and competencies, and be included in the assessment criteria, whilst in others
such characteristics are not what is being taught and considered.
5.4.
* Open book examinations can allow students to take their own notes or choice of
texts or previously specified items into the examination.
Check that the question does not assume a lot of background knowledge
which may be culturally specific or introduce unnecessary bias;
Provide any important (untested) background detail within the body of the
question;
Give mark or timing guides within the framing of the question that indicate the
relative importance or attached weightings for each sub-section;
Set multiple-part problem questions so that the parts are independent of
each other. This means that if a student gets the first part wrong they don't
automatically lose marks on subsequent sections, and it makes grading much
quicker and more straightforward.
E.g. in the second part of a question, write something like: 'In the next part of
the calculation, assume that the answer to Part (a) was 25, regardless of what
you actually got in Part (a). Note that 25 is NOT necessarily the correct
answer to Part (a).'
EXERCISE
Can you think of any additional aspects in the exam questions you will be writing that
should be considered to reduce the impact of stress factors?
Please list these here.
7.
There are a number of ways in which examination questions can be written and
structured that in turn require very different responses from students. Examination
papers may consist of a variety of these formats. For example a paper may consist
of an initial section of 10 compulsory, short answer questions followed by a second
section in which the student is asked to attempt three from six longer questions
which may be essay or case study or problem solving styled questions.
Here are some examples of different ways questions are written at the School with a
commentary highlighting important features (such as the need to avoid ambiguity,
bias, inequality and yet be able to discriminate between different levels of attainment
and achievement).
7.1
Objective Tests
e.g. True-False, Matching Pairs and Multiple Choice Questions
There are few examples of such question types being used extensively in summative
assessments at the School. They are included here for completeness' sake and
in acknowledgement that some teachers may well be using these question formats
as part of their class or online teaching, as self-assessment or formative
assessment opportunities for their students.
True-False
Used to test a breadth in knowledge of information but the problem of
guessing is a major worry.
Matching Pairs
Used to assess knowledge of complex and inter-connecting relationships.
7.2
Example
The investigators want to perform a sample size calculation with 80% power
and 5% 2-sided significance. They estimate that HIV-free survival at 7
months will be 60% in the control arm.
(i)
(ii)
Assume that 5% of mother-infant pairs are lost to follow up prior to the infant
reaching 7 months and adjust your sample size calculation accordingly. (4
marks)
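When drafting a model/specimen answer for a calculation question of this kind, it can help to script the arithmetic so that the marking guidance can be checked independently. The following is a minimal sketch in Python, not the Module team's own method. It assumes the standard normal-approximation formula for comparing two proportions, and the 70% HIV-free survival in the intervention arm is a hypothetical value chosen purely for illustration, since the example above does not state one. The function names are also illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_intervention, power=0.80, alpha=0.05):
    """Unrounded per-arm sample size for comparing two proportions
    (normal approximation, two-sided significance level alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    return (z_alpha + z_beta) ** 2 * variance / (p_control - p_intervention) ** 2


def inflate_for_loss(n, loss_fraction):
    """Inflate a sample size so that the required number remains
    after a given fraction is lost to follow-up."""
    return n / (1 - loss_fraction)


# 60% control-arm survival, 80% power and 5% two-sided significance come
# from the example question; 70% in the intervention arm is HYPOTHETICAL.
P_INTERVENTION = 0.70

n_raw = sample_size_per_arm(0.60, P_INTERVENTION)
n_per_arm = math.ceil(n_raw)
n_adjusted = math.ceil(inflate_for_loss(n_raw, 0.05))  # 5% loss to follow-up

print(f"Per arm, no loss to follow-up: {n_per_arm}")
print(f"Per arm, allowing for 5% loss: {n_adjusted}")
```

Working through the calculation in this way also exposes any decisions the question leaves open, for example whether students should round before or after adjusting for loss to follow-up, which the marking guidance may need to address.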
EXERCISE
How could you improve parts (i) and (ii) of the example question above?
Please see the concerns that were raised by the Module team over the page.
7.3
This longer format allows students to respond to open-ended questions at length. It is used
to test higher skills, writing and structuring skills, further reading and a deeper level
of understanding. Assessors are frequently interested in a student's ability to
organise and integrate a range of ideas and information and build an argument or
make a case (the intellectual skills of synthesis and evaluation, going back to
Bloom's taxonomy).
Two types of essay question can be readily identified: restricted-response and
extended-response. Restricted-response essays focus on understanding of basic
knowledge through relatively brief and confined written responses.
e.g. Outline the morphology, genome organisation and replication of the human
immunodeficiency virus (HIV).
Extended-response essays allow students to construct a variety of interpretations and
explanations and draw upon a wider and more flexibly defined set of information and
sources.
e.g. 'The burden of disease caused by intestinal parasites in a community reflects
the levels of personal and environmental hygiene.'
To what extent do you agree with this statement and what are its implications? Make
reference to specific infections to support your conclusions.
Table 3.
Some Common Essay Style Questions used in Exams
Question Stem
Give a Quotation Discuss
Make an Assertion Discuss
Compare and Contrast
Write-on
Outline
Describe
Explain (with examples)
Evaluate
Analyse the advantages...
Design a
EXERCISE
Look back over recent examination papers set for your course or teaching
module and add two more commonly used Question Stems to this list.
1.
2.
7.4
Here the students are provided with some data (this could be in written, tabulated,
graphical form etc) and then asked a series of questions about it. The provided
information may be some research findings or monitoring data. The questions
usually begin with a couple of straightforward interpretative questions (e.g. Using the
table of infection rates provided, which of the described drug therapies reduces the
risk of infection the most?). They then move on to more complex questions of
application and analysis that require the students to carry out standard manipulations
or calculations of the data provided. The final questions are likely to be more
evaluative and open-ended, requiring the students to predict likely impacts or
suggest improvements etc.
An Example
On a hot summer day, children in three schools had a school outing to a playground
where some of the children played in the recreational fountain. Two days later nearly
half the children had symptoms of vomiting, diarrhoea, abdominal pain and
headache. A retrospective cohort study was carried out to try to identify the source
of the outbreak with the following results.
Risk factor
Exposed to risk
factor
Ill
Not ill
72
76
3
4
18
32
87
25
4
24
80
15
19
75
(a) Define what is meant by the risk and relative risk of becoming ill associated
with each factor (10%).
(b) Calculate BOTH the risk and relative risk associated with each factor (30%).
(c) Suggest possible interpretations of the results, and the implications for
control recommendations (10%).
The investigators wanted to identify the infectious agent involved. One possibility
they considered was norovirus which is known to cause acute gastroenteritis.
Although the reverse transcription-PCR (RT-PCR) method is considered to be the gold
standard for diagnosis of this viral infection, it requires skilful personnel and a
well-equipped laboratory. A simpler diagnostic kit has been developed. The following
table shows how the simpler diagnostic kit compares to the gold standard.
Diagnostic kit
Norovirus present
Norovirus absent
(d) Would you advise the investigators to use the simpler diagnostic test in their
epidemiological study? Would your recommendations change if the simpler
diagnostic test was to be used in clinical practice? Justify your answer. (50%)
[Note on norovirus: this highly infectious RNA virus causes a self-limited, mild to
moderate disease that often occurs in outbreaks with clinical symptoms of nausea,
vomiting, diarrhoea, abdominal pain, headache, low grade fever or combination of
these symptoms. No treatment is indicated apart from rehydration in severe cases. ]
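Parts (b) and (d) of the example above rest on a handful of standard calculations, and a question writer may find it useful to script them when preparing the model/specimen answer. The sketch below is in Python and uses hypothetical counts throughout, since the values from the question's tables are not reused here. The function names are illustrative, and the sensitivity and specificity measures are an assumption about what a model answer to part (d) might draw on.

```python
def risk(ill, not_ill):
    """Risk: the proportion of a group who became ill."""
    return ill / (ill + not_ill)


def relative_risk(exposed_ill, exposed_not_ill, unexposed_ill, unexposed_not_ill):
    """Relative risk: risk in the exposed group divided by
    risk in the unexposed group."""
    return risk(exposed_ill, exposed_not_ill) / risk(unexposed_ill, unexposed_not_ill)


def sensitivity(true_pos, false_neg):
    """Proportion of gold-standard positives that the kit also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of gold-standard negatives that the kit also calls negative."""
    return true_neg / (true_neg + false_pos)


# HYPOTHETICAL cohort counts for one risk factor (not taken from the question):
# exposed: 50 ill, 50 not ill; unexposed: 25 ill, 75 not ill.
rr = relative_risk(50, 50, 25, 75)

# HYPOTHETICAL kit-versus-RT-PCR counts (not taken from the question):
# 90 true positives, 10 false negatives, 95 true negatives, 5 false positives.
sens = sensitivity(90, 10)
spec = specificity(95, 5)

print(f"Risk (exposed):   {risk(50, 50):.2f}")
print(f"Risk (unexposed): {risk(25, 75):.2f}")
print(f"Relative risk:    {rr:.2f}")
print(f"Sensitivity:      {sens:.2f}")
print(f"Specificity:      {spec:.2f}")
```

Running the model answer's numbers through a script like this gives a quick check that each part can be completed from the data supplied and that the marks allocated reflect the amount of work involved.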
EXERCISE
In section 6 we discussed a number of ways in which a question writer could minimise
the impact of extraneous factors, such as stress, interpretation, time-management etc.,
in the way they set a question. Please look over the question above and identify at
least three ways the question author has sought to do this.
1.
2.
3.
7.5
In case-study-style questions a context or situation is described in detail (e.g. this
may be a patient history or government strategy position etc.). Such questions are
often seen as being very authentic and ask students to apply their knowledge to a
particular, and novel, set of circumstances. They frequently take considerable work
and effort to write well and usually involve a team of people who craft an idea into a
realistic and challenging situation.
Note: some examples of this type of question are presented in section
11.
Giving Choice
A common structure in examination papers is to have part of the paper as a core, to be
attempted by everybody, and other sections which provide a limited amount of
choice, e.g. 'choose 2 from the following list of essay questions to complete'.
Whilst the structure of exam papers is set by the Board of Examiners and not by
individual question setters, it is nevertheless interesting to consider the impact of
providing question choice within an exam.
Many people view the giving of choice as a way to increase fairness and reduce the
effect of luck in question spotting. It allows students to address the questions for which
they feel most prepared and in which they have been most interested, so showing the best the
student can produce. However, providing choice inherently reduces the validity and
reliability of the test instrument because each student is in fact taking a different test
and has been encouraged to sample from their learning in different ways. It is nearly
impossible to create parallel exam questions that test achievement of the learning
outcomes to the same extent, and it is equally difficult to grade two different essays
absolutely comparably, both factors making consistency very difficult (Piontek,
2008).
EXERCISE
Do you personally think that the giving of a choice in an examination, (e.g.
choose 3 from the following 6 questions) is fair?
8.
It is very difficult to write a question and then immediately see the ambiguities or
errors that it contains. Separating the creating from the evaluating roles in time can
help. Write a question and then come back to it the following day and re-read with
fresh eyes. When you have a draft question, next write a model/specimen answer
and/or some marking guidance. As you do this, decide on the
appropriate breakdown of marks and try to estimate how long it will take to tackle
the question, part by part. In coming up with the marking scheme for your question
you might find it helpful to have the learning outcomes for the module or course in
sight to refer to so that you can check that you are valuing the right things and giving
credit to Masters level criteria.
Below is a checklist of questions to use once you have a draft question (doing some
of this in a group with questions on overheads can work well):
1.
2.
3.
4. How well does the question relate to intended learning outcomes (of the
teaching module or MSc)?
5.
6. What are the key words describing the task? Are they clear? (e.g. 'list', 'define',
'suggest reasons behind the effect' are better than 'interpret', 'discuss',
'evaluate')
7.
8. Check punctuation and grammar, as this can markedly change the meaning of
sentences (e.g. 'panda eats, leaves and shoots').
9.
10.
11.
12.
13. Does the question lead to answers which will distinguish between weak and
strong candidates, e.g. are there elements for candidates to demonstrate
distinction-level skills/knowledge?
Question Validation
The Masters programme that you contribute to is likely to have its own process of
question validation and of compiling the examination paper. It is important
that you ascertain this from the module leader and adhere to it.
In general terms, however, once you have the question, model/specimen answer
and marking scheme written, ask someone else to answer it (do not give them the
model/specimen answer), timing each part of the question. This allows you to check
that your estimates of the time it takes to complete each part were about right. Modify
the question, timings and marking scheme based on any misunderstanding
made clear by their answer.
It can be helpful to agree a question swap with a colleague and undertake an
informal peer review of the questions you have both written. This frequently happens
across a course team.
At this stage you will be ready to submit your question to the module leader, who
will also scrutinise your question and may get back to you with further suggested
improvements (please see the extended case study in the Appendix for further detail
about the way The School conducts examination question approval processes).
Water
Ethanol
Sodium and Potassium ions
Sugars.
Over the page you will find the edited version of Question X that was eventually
accepted and used in the examination.
The questions are usually considered together with the associated marking
guidance notes, and for this question these were the guidance notes that were
accepted:
Diffusion (neutral)
diffusion (lipid-soluble)
ion transporter
specific transporter protein
9.
Marking Approaches:
Using assessment criteria and marking schemes
Assessment criteria test the intended learning outcomes for a course or teaching
unit. They describe the knowledge and skills (and possibly attitude) that a student is
expected to demonstrate in their examination answers, and they are then used in
marking the work. The learning outcomes describe what students should be able to
do; assessment criteria describe how well they should be able to do it, i.e. they set
standards. Remember that learning outcomes define the minimum standard
required to achieve the award, and so in addition to these the assessment criteria
should provide an objective basis for interpreting and differentiating the performance
of students at the level of the outcome (a satisfactory pass) and at a series of
predefined steps above this (usually up to a level considered an excellent or
outstanding pass).
For each examination question there should be a model/specimen answer, or a set
of specific marking guidance, that is used to mark the associated student answers.
These will usually vary with each and every question and are tailored and specific.
The assessment criteria are usually more generic and used as a framework to fairly
judge the merits of each student's work across a whole course or teaching unit.
Assessment criteria describe the extent to which students have achieved the
specified learning outcomes. They are usually provided at two levels,
o the overarching criteria that describe the different bandings of overall
achievement at the Programme level e.g. First, Two-one, Two-two etc
at undergraduate level and Pass /Merit/Distinction categories at
Masters level.
o A detailed and specific level of criteria that describe and measure
achievement in particular modules of study or for individual
assessment tasks.
that a few students will fail and a similarly few students will get distinctions whilst the
majority will gain marks that cluster and peak in the middle mark range.
You will also sometimes hear experienced assessors referring to a particular piece of
student work as providing a benchmark. This is where the answer provided, for
various reasons, encapsulates the criteria for a mark or grade: for example,
determining the threshold for a distinction. This can be extremely helpful, and is a
way in which norm referencing and criterion referencing naturally come together.
Figure 3.
A Normal Distribution or bell-shaped curve.
However, not all cohorts will fit this pattern. For example, 'Computing for Beginners'
courses could form a two-peak pattern, with one cluster of students achieving very high
marks (representing the students who could have taught the course!) and a
second cluster with marks at the bottom of the range (i.e. those who had never done
any computing before!).
Absolute norm referencing also has the characteristic of effectively setting quotas:
only so many students can get 'A's and only so many can get 'B's etc. The
application of a bell-shaped curve to small groups or cohorts of students becomes
clearly unfair where we can see that variations between groups, say from year to
year, are likely to give rise to very different patterns of achievement.
Criterion-referenced grading, on the other hand, specifies a standard through the
description of clear criteria, and anybody who achieves the level or standard
described gains the marks. Everybody in the cohort could therefore potentially get an 'A',
and each student's work is individually judged in comparison to the criteria
regardless of what other students may or may not do.
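The contrast between the two approaches can be made concrete with a small sketch. The Python below is illustrative only: the 70% threshold for an 'A', the 20% quota and the cohort marks are all hypothetical values, and the function names are invented for this example. It grades the same small, strong cohort in both ways.

```python
def criterion_referenced(marks, threshold=70):
    """Award an 'A' to every student whose mark meets the stated standard.
    The threshold of 70 is a hypothetical value."""
    return ["A" if mark >= threshold else "not A" for mark in marks]


def norm_referenced(marks, quota=0.2):
    """Award an 'A' only to the top fraction of the cohort (a quota),
    whatever the absolute marks. The 20% quota is a hypothetical value."""
    n_top = max(1, round(len(marks) * quota))
    # Rank students by mark, highest first, and take the top n_top positions.
    ranked = sorted(range(len(marks)), key=lambda i: marks[i], reverse=True)
    top = set(ranked[:n_top])
    return ["A" if i in top else "not A" for i in range(len(marks))]


# A HYPOTHETICAL small cohort in which every student exceeds the threshold.
cohort = [72, 75, 78, 81, 90]

print("Criterion-referenced:", criterion_referenced(cohort))
print("Norm-referenced:     ", norm_referenced(cohort))
```

With these made-up marks every student earns an 'A' under the criterion-referenced rule, while the quota allows only the single highest mark an 'A', which is the unfairness in small cohorts described above.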
EXERCISE
Please consider the strengths and limitations of both forms of grading work.
Norm-referenced assessment
Strengths
Weaknesses / limitations
Criterion-referenced assessment
Strengths
Weaknesses / limitations
In The School's Assessment Code of Practice we can see some guidance and clarity
on this issue.
Using the full mark range: advice from the School
(Code of Practice 2012)
Markers are encouraged to use the full range of available
marks, to reflect the full range of student achievement. In
particular, markers should not feel reluctant to award 5.0
grades provided work meets the appropriate standards. The
following specific points should be noted
Table 4.
LSHTM Marking Gradepoints descriptions (Overarching criteria)
Grade point
Descriptor
Excellent
Very good
Good
Satisfactory
Unsatisfactory / poor (fail)
Not submitted (null)
Example (typical scheme)
Mark (%)   Grade point
80-100     5
70-79      4
60-69      3
50-59      2
40-49      1
<40        0

Example (higher numeric pass threshold)
Mark (%)   Grade point
95-100     5
85-94      4
75-84      3
60-74      2
50-59      1
<50        0

Example (lower numeric pass threshold)
Mark (%)   Grade point
75-100     5
60-74      4
45-59      3
30-44      2
20-29      1
<20        0
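The three example schemes above can be written down as simple lookup tables, which makes it easy to see how the same percentage mark converts to different grade points depending on the scheme. The Python below is a minimal sketch: the band boundaries are those given in the examples, while the scheme labels and the function name are illustrative and not part of any School system.

```python
# Lower bound of each band (in %) paired with its grade point, highest first.
# Boundaries follow the three example schemes; the dictionary keys are
# illustrative labels only.
SCHEMES = {
    "typical": [(80, 5), (70, 4), (60, 3), (50, 2), (40, 1)],
    "higher_pass_threshold": [(95, 5), (85, 4), (75, 3), (60, 2), (50, 1)],
    "lower_pass_threshold": [(75, 5), (60, 4), (45, 3), (30, 2), (20, 1)],
}


def grade_point(mark, scheme):
    """Convert a percentage mark (0-100) to a grade point (0-5)."""
    if not 0 <= mark <= 100:
        raise ValueError("mark must be between 0 and 100")
    for lower_bound, point in SCHEMES[scheme]:
        if mark >= lower_bound:
            return point
    return 0  # below the lowest band


for name in SCHEMES:
    print(f"65% under the {name} scheme -> grade point {grade_point(65, name)}")
```

A mark of 65% converts to grade point 3 under the typical scheme, 2 under the higher threshold and 4 under the lower threshold, which is why the conversion scheme in use needs to be stated explicitly in the marking guidance.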
Students should be made aware of the criteria on which all assessment tasks will be
marked, to improve their understanding of the standards expected of them.
The criteria used to place students in each grade category must be written down by
staff setting assessments, and adhered to by all those involved in the marking.
10. Ways of producing accurate and clear marking guidance for questions
Marking guidelines should be based directly on the Assessment criteria. For some
modules, such as those that are quantitative in nature, there is probably a need for
model/specimen answers in addition to, or instead of, marking guidelines.
The Assessment criteria will serve as the basis for the development of the marking
guidelines. For each criterion I suggest that you initially think about the major steps
in the continuum of student achievement, i.e. what do you expect from a Pass
answer at a 50% grade level and what would you expect of a Distinction answer?
Firstly, for each criterion, consider carefully what you expect students to have written
to achieve a passing mark for this criterion. Draft a detailed description of the content
and quality that markers should evaluate, in addition to what has been included in
the assignment instructions. Ask yourself: 'What would comprise the minimum of
what I would expect the student to have written for this section, or about this subject,
to achieve a passing mark?' This description or set of required
concepts/ideas/issues/definitions will serve as the basis for a grade of 2.
Once the basic expectations for a 2 grade have been drafted in association with the
original criteria, it is then necessary to describe what additional level of content
and/or quality would achieve higher marks (3, 4, 5). Please draft descriptions of what
components might achieve the different possible higher marks (3, 4, 5).
(A note on Distance Learning assessments)
N.B. Remember that many students do not have access to other facilities, such as
libraries, so the student must be able to respond to the question by reference to the
study materials ONLY and still achieve a high mark.
It should be possible to obtain a 5 by original and creative use of nothing more than
the materials provided. All instructions should be devised to allow scope for
imaginative input and cross-referencing from students who have access to nothing
more than the course materials. For a 5 grade in particular, it is original thought, not
extra facts, that would contribute.
You may well find that, depending on the nature of your course, module or subject
area, there is one type of criterion that tends to take precedence in differentiating the
marks. For example, in a strongly practice-based, professional course, the quality
and authenticity of reflective practice may be a priority criterion. In courses
concerned with exploring the impact of public policy decisions and practices the lead
criteria may be those emphasising the application of key principles and the analysis
of outcomes. If there are lead criteria, then a transparent approach would be to
emphasise these in advance to students both within the teaching and the
assessment design. There should also be links made between the criteria and the
intended learning outcomes that help to show students where the emphasis lies.
Finally, based on the basic criteria for a passing mark (2), draft a list of fundamental
omissions or errors that would result in a 1 or even a 0 (fail).
A question to challenge yourself with: does a grade of 'Outstanding' actually equate to 'Impossible to achieve'?
This is particularly important if you are likely to be assessing essay style questions
rather than numeric or quantitative questions. It is possible to score 100% in a
calculation answer and virtually impossible to score more than 80% in a discursive
essay style answer.
You have to give your students the opportunity to excel: you need to
consider how your more able students can demonstrate their additional qualities,
creativity or more in-depth knowledge or understanding to you. This is often a difficult
thing to achieve, i.e. to incorporate into the question design an opportunity to
differentiate between your able and excellent students.
Here are a couple of examples showing how the marking guidance gives clear links
to the grading structure and differentiates between the possible grades.
Example 1.
Question
Discuss what is meant by the term epidemic. Describe the main features of an
epidemic curve. Identify the main types of epidemic, giving examples.
Marking Guidance
(Based on Teaching Session 3 and the Webber book chapter 2)
Example 2.
Question
What has been the impact of HIV on the epidemiology and control of TB?
Marking Guidance
(Based on Section 2 Teaching Session 3 and the TB/HIV clinical manual)
A Grade 3 answer should provide basic information on the epidemiology and control
of TB including
It is interesting to note that in both these examples the assessor has chosen to
provide a description for a Grade 3 answer first, describing a point near the middle
of the grade-scale (the peak of the normal distribution), before going on to relate
higher (4/5) and lower (2/1) scoring grades to this mid-point.
EXERCISE
Consider an examination question that you have written or are currently in the
process of drafting. Produce some marking guidance for the question that provides
clear descriptions that differentiate between the Grades (0 to 5).
Think about which point on the grading scale you find it easiest to begin with.
Example 3.
Question
On a hot summer day, children in three schools had a school outing to a playground
where some of the children played in the recreational fountain. Two days later nearly
half the children had symptoms of vomiting, diarrhoea, abdominal pain and
headache. A retrospective cohort study was carried out to try to identify the source
of the outbreak with the following results.
[Table: Risk factor | Exposed to risk factor | Ill | Not ill. The row labels and cell
layout are not recoverable; values in source order: 72, 76, 3, 4, 18, 32, 87, 25, 4,
24, 80, 15, 19, 75.]
(a) Define what is meant by the risk and relative risk of becoming ill associated
with each factor (10 marks).
(b) Calculate BOTH the risk and relative risk associated with each factor (30
marks).
(c) Suggest possible interpretations of the results, and the implications for control
recommendations (10 marks).
Marking Guidance
Risk = children who were ill who were exposed/total number exposed
Relative risk = risk in exposed/risk in unexposed
Give 5 marks each for these definitions: total 10 marks.
Give 3 marks for each correct risk and 3 marks for each correct relative risk:
total 30 marks.
Main risk factor is playing in the recreational fountain. This suggests that
the source of the outbreak is water in the fountain, possibly indicating
faecal-oral transmission. Water in the fountain should be tested regularly
for relevant bacteria and viruses (e.g. E. coli, Salmonella, norovirus) and
should be monitored to ensure that adequate levels of chlorine are
present in the water. Alternatively, children could be prevented from
playing in the fountain (however, on a hot sunny day it may be difficult to
keep them out of the water!). Up to 10 marks for that or similar relevant
comment.
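The two definitions used in this marking guidance can be checked with a short calculation. What follows is only a minimal Python sketch: the function names are hypothetical, and the counts in the usage lines are invented for illustration and are not taken from the question's table.

```python
def risk(ill, not_ill):
    """Risk = number who were ill / total number in the group."""
    total = ill + not_ill
    if total == 0:
        raise ValueError("group is empty")
    return ill / total


def relative_risk(exposed_ill, exposed_not_ill, unexposed_ill, unexposed_not_ill):
    """Relative risk = risk in exposed / risk in unexposed."""
    risk_unexposed = risk(unexposed_ill, unexposed_not_ill)
    if risk_unexposed == 0:
        raise ValueError("risk in unexposed is zero; relative risk is undefined")
    return risk(exposed_ill, exposed_not_ill) / risk_unexposed


# Hypothetical counts for a single risk factor (illustration only).
risk_exposed = risk(50, 50)            # 50 ill out of 100 exposed
risk_unexposed = risk(25, 75)          # 25 ill out of 100 unexposed
rr = relative_risk(50, 50, 25, 75)

print(risk_exposed)    # 0.5
print(risk_unexposed)  # 0.25
print(rr)              # 2.0
```

A marker could run the same two functions over each row of the results table to produce the reference values against which part (b) is marked.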
The investigators wanted to identify the infectious agent involved. One possibility
they considered was norovirus which is known to cause acute gastroenteritis.
Although the reverse transcription-PCR (RT-PCR) method is considered to be the gold
standard for diagnosis of this viral infection, it requires skilled personnel and a
well-equipped laboratory. A simpler diagnostic kit has been developed. The following
table shows how the simpler diagnostic kit compares to the gold standard.
                      Gold standard
Diagnostic test       Norovirus present   Norovirus absent
Norovirus present     37                  3
Norovirus absent      13                  47
(d) Would you advise the investigators to use the simpler diagnostic test in their
epidemiological survey? Would your recommendations change if the simpler
diagnostic test was to be used in clinical practice? Justify your answer. (50
marks)
Marking Guidance
Sensitivity = 37/50 = 74%
Specificity = 47/50 = 94%
Give 5 marks each for calculation of sensitivity and specificity (10 marks).
Discussion of whether or not to use the test in (i) epidemiological survey
or (ii) clinical setting: up to 40 marks for answers that identify the key
requirements of a diagnostic test in the two situations and use
information from the calculation of sensitivity and specificity correctly.
Some of the following points may be included in the answer:
Possible implications of missing true cases (about 1 in 4 true cases will be
missed) and of a positive diagnosis in true negatives (6 in 100 people without
the disease will be diagnosed by the test).
(i) Epidemiological survey: simpler diagnostic test will be adequate to
identify the outbreak. Do not need to identify all cases to recognise that
this is an outbreak. As large numbers to be tested consider
cost/resources/time savings of using the simpler test.
(ii) Clinical practice: what are the implications of missing 1 in 4 true
cases? As treatment non-specific (rehydration therapy) a false-negative
diagnosis with respect to norovirus would be unlikely to affect the outcome
of the illness in the individual. However consider whether other
investigations may be undertaken in people with symptoms who have
tested negative for the disease. Also as norovirus is known to be very
infectious consider impact on behaviour of having a diagnosis of the
disease. Also identification of contacts. Less issue for time/resources in
the clinical setting so it may be better to go for the gold standard test.
Other factors: cost, resources, time (up to 10 marks).
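The figures quoted in this marking guidance can be reproduced with a few lines of Python. This is a sketch under a stated assumption about how the comparison table is read: of the 50 gold-standard positives, 37 test positive and 13 test negative, and of the 50 gold-standard negatives, 3 test positive and 47 test negative. That reading is consistent with the 74% and 94% figures, the "1 in 4" missed cases and the "6/100" false positives mentioned above. The function and variable names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of true cases that the test correctly identifies."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of non-cases that the test correctly identifies."""
    return true_neg / (true_neg + false_pos)


# Assumed reading of the comparison table (see the lead-in above).
true_pos, false_neg = 37, 13   # gold standard: norovirus present
false_pos, true_neg = 3, 47    # gold standard: norovirus absent

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)

print(f"Sensitivity = {sens:.0%}")                     # Sensitivity = 74%
print(f"Specificity = {spec:.0%}")                     # Specificity = 94%
print(f"True cases missed = {1 - sens:.0%}")           # True cases missed = 26%
print(f"Non-cases testing positive = {1 - spec:.0%}")  # Non-cases testing positive = 6%
```

Setting the two error rates side by side in this way is the starting point for the discussion marks: the student then has to argue which error matters more in a survey and which in clinical practice.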
explanations of why a particular answer is correct (or more correct than others).
However, if answers are expected to use evidence or explain with reference to the
literature, the specimen answer provided should seek to model good practice in
these academic skills whilst also emphasising that there may be other ways of
achieving positive results. In very open-ended response questions it may be best to
provide brief outlines for two or three different possible interpretations and
arguments. This can be particularly useful in a feedback mode of
presentation in which students come, review and then discuss the different
approaches taken, thus encouraging students to find their own voice.
You may like to refer to the extended case study provided in Appendix 3 that takes
you through the steps of exam question and marking guidance development together
with extracts from the module team discussions.
It is also important to remember that any grade divulged before the final meeting of
the Board of Examiners is a provisional grade, subject to external review and may be
amended at the discretion of the examiners.
Appeals
When thinking about the way we write examination questions and conduct
summative assessments it is worthwhile remembering that candidates may appeal
against a result where there is concern that the examination has not been conducted
in accordance with School policies and procedures. However, the University of
London does not allow appeals on purely academic grounds, such as challenging
the interpretation of a concept or principle.
Strategies to support students are usually based upon two guiding principles;
a)
b)
A common approach used in the School is to provide examples of past papers and
examiners' reports so that students can see the process of assessment clearly.
It is also desirable to provide opportunities for students to experience assessment
forms and formats before they count. Building mock examinations into the module
or course and giving students feedback on their approach and success is one way
that this can be done. Having formative assessment that mirrors the summative
assessment can also be helpful. This is especially true for students at the School
who may have had very diverse experiences of education and assessment
processes prior to their Masters courses either in London or by DL.
The School has produced some guidance on the delivery of feedback to students
after formal course work assessments - this particularly highlights the need for
clarity, transparency and speed of feedback turnaround time (see below).
However, I would also like to emphasise the need to provide constructive feedback
on the formative, practice or mock assessments that are part of the teaching units
at the School. Feedback here needs to be focussed on helping the students to 'do
it better next time' or, to coin a phrase, 'feed-forward'.
It may be helpful to keep in mind that ultimately learning is a transformative
process, personal to the individual, that isn't confined by or restricted to set points
of assessment. Marks provide useful measures and milestones, particularly within
formal course structures, but we also want our students to understand that learning
is lifelong, and to develop the skills needed to become sophisticated lifelong
learners.
EXERCISE
How can you provide support for your students as they prepare for and
participate in examination assessments?
Feedback on Examinations
School policy is that for coursework and project reports, students should receive
individual feedback to aid their learning. For the June exams, students receive their
grades. For DL courses, Examiners' Reports for Students are prepared, setting out
expectations with reference to the marking schemes.
Here is an example of an Examiners' Report for Students that shows variety in
feedback depending on question type. Please note where the Examiner has
identified what would be required for a sound pass and what would be expected of
an answer awarded the higher grades.
e) Corynebacterium diphtheriae
Expected: Gram-positive. Non-motile. Blood tellurite agar (black colonies). Childhood
infection of upper respiratory tract. Aerosol. Toxin (tox gene; regulated by low
iron concentrations). Vaccination against toxin; penicillin kills bacteria but does not
inactivate toxin.
Question 2
For a safe pass, the student should have discussed that N. meningitidis is a
Gram-negative diplococcus, non-motile, and lives in a certain percentage of upper
respiratory tracts within the population. They cause meningitis and
other diseases / symptoms by crossing the blood-brain barrier through the same
path that neutrophils use. As virulence factors, they have pili and fimbriae to
attach, endotoxins causing inflammation to help entry, capsules to interfere with
complement attack as well as phagocytosis, killing and degradation by
macrophages/neutrophils, and IgAase to neutralize IgA. They can be typed by
several capsule serotypes, which are not all covered by the available vaccine.
Diagnosis needs to be very fast, since the most affected ones are children and
teenagers, who can succumb to the disease rather fast. Antibiotic therapy needs to
be started quickly. It would have been excellent to name a few relevant antibiotics.
For diagnosis, growth test using CF and blood on chocolate / blood agar, and test for
sugar usage, and the latex agglutination test should be mentioned, as well as other
possible tests including PCR. The more details the better the score.
Comment (M1): This is helpful for the student to know what is a safe pass
Question 3
For a safe pass the student should have named 2 zoonotic infections such as
brucellosis, salmonellosis (Salmonella typhimurium), listeriosis (M. bovis),
leptospirosis, psittacosis, tularaemia (Francisella), anthrax (Bacillus anthracis),
Coxiella (Q fever), Lyme disease and so on, so lots of choices.
Better grades could have been achieved by describing their reservoirs, life cycles
and diseases in detail as well as how they can be controlled. Some of them have a more
complicated life cycle and are transmitted by vectors (Borrelia, Coxiella, Y. pestis);
some come from specific hosts (M. bovis from ruminants, Leptospira from rats); B.
anthracis makes spores and is therefore difficult to eliminate by simple disinfection,
and the cadavers need to be incinerated. If these issues were detailed, students
would have scored high marks.
Comment (M2): This is good
Question 4.
This question is based on the paper in your reader (Bahl et al). The questions help
you understand and interpret the data that are given in the tables, and help you
follow the discussion of the data by the authors.
Comment (M3): This is helpful for students
a) First of all, always read the titles/headers of tables carefully, because these
tell you what exactly is presented in the table: what is measured and how, what the
numbers mean, etc. The two tables give you different information: table 1 counts
episodes, and therefore gives you incidence, whereas table 2 gives prevalence, that
is days-with-disease during the observation period. Looking at risk for diarrhea, we
see in table 1 that children with low plasma zinc are at increased risk because there
is a higher incidence of diarrhea with a significantly higher RR (Relative Risk,
significant when the confidence interval does not contain 1): 1.47 (1.03, 2.09). There
is also a significantly higher risk for severe diarrhea (1.70), but the RR for prolonged
diarrhea is not significantly different (RR of 2.54, but confidence interval contains 1).
This is further supported by the prevalence data in Table 2, where we see that only
the diarrhea with fever (= more severe diarrhea) is significantly more frequent in the
children with low plasma zinc. There is no significant difference in the prevalence of
the other morbidities between the children with low and with normal plasma zinc (see
the P-values in the table).
b) First, read carefully. On what data are these statements based? In table 1 you
can see that the number of episodes of ALRI was not different between the groups
(Confidence Interval contains 1). However, the total number of days with ALRI, as
presented in Table 2, was significantly higher in the children with low plasma zinc.
Therefore, one has to conclude that there must have been more days per episode in
the children with low plasma zinc.
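The two pieces of reasoning in the Question 4 feedback above (a relative risk is significant when its confidence interval does not contain 1, and equal episode counts with more days of illness imply longer episodes) can be illustrated with a small Python sketch. The function names are hypothetical. Only the interval (1.03, 2.09) comes from the report; the second interval and all the day and episode counts are invented for illustration.

```python
def excludes_one(ci_lower, ci_upper):
    """True if the confidence interval of a ratio does not contain 1."""
    return ci_lower > 1 or ci_upper < 1


def mean_days_per_episode(total_days, episodes):
    """Average length of an episode: days with disease / number of episodes."""
    return total_days / episodes


# RR for diarrhoea incidence quoted in the report: 1.47 (1.03, 2.09).
print(excludes_one(1.03, 2.09))  # True: significant

# Hypothetical interval that spans 1 (not taken from the paper).
print(excludes_one(0.90, 7.10))  # False: not significant

# Hypothetical counts: same number of episodes, more days with disease.
print(mean_days_per_episode(300, 100))  # 3.0
print(mean_days_per_episode(450, 100))  # 4.5
```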
Questions that are divided into discrete sub-sections and are accompanied by
their associated marking schedules have many benefits for both students and
markers, providing clarity in presentation and grading reliability.
Include data or information in the question to reduce the emphasis on memory
and increase the emphasis on application and critical thinking.
Check that your draft question does not favour or disadvantage students from
particular backgrounds or cultures.
Keep sentences short, layout clear and well spaced out and use precise and
unambiguous language.
Check that the question standard and assessment criteria are at Masters
level.
Check: does the question enable students to excel and allow markers to
discriminate between able and excellent performances?
Useful web-sites
Center for Instructional Development and Research
Resources: Writing Exam Questions
A collected set of web-links and guidance sites on writing exam questions. Most from
institutions in the USA. Lots of information on writing MCQ Questions and comparing
them with other forms of written assessments.
http://depts.washington.edu/cidrweb/resources/exams.html
Appendices
Appendix 1.
Feedback comments are inserted in bold below each question.
EXERCISE
For the questions given below, underline the verb and key elements of the question
that give an indication of the extent (limits and boundaries) of the question.
Do you feel these are appropriate for Masters level study?
1. Describe the three main methods of economic evaluation (40%). What are the
main strengths and weaknesses of each method? (40%). Support your answer with
examples of disease evaluation (20%).
Describing is a relatively low level cognitive skill, but then the student is
asked to evaluate the three methods by giving strengths and weaknesses;
this is the Masters level task in this question.
Factors that give limits are the requirement to describe three methods and to
support the answer with examples.
2. A recent retrospective analysis of health records in the Gambia has
suggested that the incidence of malaria has fallen dramatically in that country over
the last 10 years. The elimination of the disease is beginning to be discussed. The
National Malaria Control Programme has begun a surveillance system to detect
future changes.
What advice would you give the National Malaria Control Programme on how to
organize a surveillance system for malaria? Give practical tips for ensuring its quality.
Giving advice requires the students to select from and apply their knowledge
in order to synthesise an appropriate surveillance system; this is Masters
level. Students are also asked to consider what makes such a system of good
quality; this could be considered a further degree of difficulty. The limits in
this question are given by the scenario of the question, which makes it specific
to a country and a disease context.
3. Write short notes on THREE of the following. In each case explain the
importance of the infectious agent and the mode of transmission in its spread and
control.
a) rotavirus diarrhoea
b) measles
c) guinea worm
d) dengue
e) tuberculosis
This question does not clearly articulate Masters level requirements, as
'Write short notes' does not indicate a level and 'Explain the importance'
may or may not require some level of evaluation and critique but could equally
be a measure of memory, depending on what had been taught in the module.
Appendix 2.
A detailed example: re-writing and formatting a question to ease interpretation
(related to chapter 6)
This example has kindly been provided by the teaching team responsible for one of
the DL programmes delivered by the School: Fundamentals of Clinical Trials. It
shows clearly the way a set of guiding principles is used to mould a clearer
question context, from a great idea to a very demanding but fair question set-up.
The team wanted to write a question that tested their students' abilities to think about
and apply key concepts rather than re-work the study materials provided. Past
experience had underlined the importance of providing relevant and realistic
question contexts, and considerable effort is made to vary the scenarios used in
question setting.
What is presented here is the first draft of the question, some team discussion
notes and then the final question as it was used to assess the DL students.
Recruitment Criteria: Pregnant HIV infected mothers, who are currently not
receiving antiretrovirals (ART) and who plan to breast feed
Randomisation: Women to be randomized into 2 groups.
Intervention
Arm
Control Arm
During pregnancy
and labour/delivery
Maternal Standard of
Care
Maternal
antiretroviral
prophylaxis from 28
weeks gestation
Maternal
antiretroviral
prophylaxis from 28
weeks gestation
Maternal Intervention
6 months maternal
antiretroviral
prophylaxis
Infant Standard of
Care
1 month infant
prophylaxis
1 month infant
prophylaxis
control
*Randomisation: Pregnant women are randomised 1:1 to the intervention or
control arm
Appendix 3
Extended Case Study showing the Development of a real Exam Question
- CT101 Fundamentals of Clinical Trials
This extended case study, based on a real example, aims to show the stages
of development that the question went through and reflections on the process
made by the course team (shown in comment boxes).
The case study includes the following sections:
3.1 Question Background
3.2 A Work in Progress (presented in four steps)
i) An Early Draft with Feedback (Autumn Term)
ii) The Question Amended after Feedback from the Exam Chair (July)
iii) Some Fine Tuning (Final Version)
iv) A Completed Work? (Some reflections on the use of the question)
3.3 A reflective exercise
Question Background
Question Motivation: We wanted to move away from the overused cardiology drug
trial examples of previous exam papers. We have a diverse tutor team that included
clinical trialists working at the Institute of Mental Health and we were inspired by a
BMJ article by Goodyer et al reporting on a Mental Health trial on major depression
in adolescents.
Question Context: A mental health trial (major depression) taking place in
adolescents. The intervention of interest was Cognitive Behavioural Therapy (CBT)
as an add-on to the standard drug treatment, selective serotonin reuptake inhibitors
(SSRIs). The outcome measurements were determined by a mental health
questionnaire.
Overall Objective: To see if students could apply key fundamental principles of
clinical trials to:
a unique patient base (i.e. adolescents)
a non-drug intervention (i.e. CBT)
an outcome that is measured by questionnaire (rather than by clinical
measurements).
Assessment Needs: The exam was composed of two questions. Prior to question
setting, we identified and allocated the key concepts (as covered in the distance
learning study material for this module) to be tested for each question. For this
question the chosen key principles to test were:
Trial designs
Recruitment
Blinding
Randomisation
Bias
The second question was to be much more numerical/statistical in nature, thus this
first question excluded calculation type questions. Question two also included
questions specifically designed to be grade differentiators. The first question was
seen as testing students' understanding and application of central and
straightforward concepts.
Question Types: We aimed to include a range of question types.
Who was involved in question development? Three key tutors, course
director(s), the external examiner and exam chair. So there was lots of input from a
number of experts, many drafts, and a long communication trail before agreeing a
final version. What follows tracks this process.
3.2 A Work in Progress
Step i
An Early Draft with Feedback (Autumn Term)
The question is given in plain text, marking guidance is indented and in italics, and
the module team's discussion notes are in the comment boxes alongside the text.
The Question
Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of
major depression in adolescents (age 11-16), although there are concerns regarding
their usefulness and a raised risk of suicide. The National Institute for
Health and Clinical Excellence (NICE) recommends the use of SSRIs in combination
with Cognitive Behavioural Therapy (CBT) in the UK. This recommendation is based
on data collected from the United States.
Comment (A1): What sort of depression, major? Should we define with a
depression score?
c) What design and conduct features would you apply in order for the trial to be
explanatory or pragmatic?
(10 marks)
Comment (A7): I'm not certain I would
know how to answer this question.
Design and conduct in one question
overwhelms me, sorry, I'm just a babe
in arms really! I'm guessing your
direction is to think about an intention to
treat analysis and how we define a
protocol deviation. If we continue down
the pragmatic route do you think we
could streamline this question?
Think about the eligibility criteria and how restrictive this should be
Think about who will be delivering the therapy intervention, what
training and experience these people would have.
Parents may not want to enter their children into a trial involving an
SSRI because of the risk of suicide, especially as they have major
depression. Therefore recruitment may be slow. As NICE
recommends SSRIs in conjunction with CBT based on US data there
may not be equipoise for this trial. Therefore clinicians may be
unwilling to randomise highly depressed children and their parents
may also be unwilling to be involved.
Both treatments would be available outside of the trial and therefore
there is not as much incentive to take part in a clinical trial. The
population may include those younger than 16 and therefore specific
consent and assent procedures would need to be put in place. There
may be a larger drop out in the CBT arm due to the extra burden of
having to attend numerous therapy sessions. Alternatively the extra
attention may be beneficial and increase retention.
e) At the design stage a third treatment arm was suggested for inclusion
consisting of placebo only. What would be the advantages and disadvantages of
including this treatment arm?
(4 marks)
Comment (A9): Like the idea of this
question because you have to think
about it. But not certain whether it is a
step too far for the students. I think
maybe we could drop and ask about
randomisation? I think we have logged
about 10 marks.
The primary outcome of the trial was the Health and Nation Outcome scale which is
a 12 item scale covering a wide range of health and social domains such as
psychiatric symptoms, physical health, functioning, relationships and housing. Each
question is marked from 0 (no problem) to 4 (severe problem). This was completed
by an interviewer at 12 weeks post randomisation.
Comment (A10): Lovely. Should we be adding anything about a composite score
and how we use that to conclude? Or is this too much info? (I.e. what is
considered as an improvement? I think this information can come later)
Two hundred adolescents were to be recruited into the trial from six centres. Simple
Randomisation was used to allocate treatment in the ratio 1:1. Each centre had one
interviewer collecting data and several therapists giving CBT.
The primary analysis of this trial showed that at 12 weeks post randomisation the
mean (standard deviation) of the primary outcome was 18 (CI 7.5) in the SSRI group
and 17.1 (CI 8.3) in the SSRI plus CBT group. The difference between the two groups
was not statistically significant under an intention-to-treat analysis.
Comment (A11): Am I right in remembering this as the confidence interval?
h) What can you conclude from this result?
j) What other information would you consider when interpreting the results of this
trial, think particularly about what may be reported in the publication?
(20 marks)
Is the sample size large enough?
How many people were included in the analysis?
Was this the best and most appropriate design?
Has there been substantial bias introduced?
What were the results of the secondary outcomes, in particular
safety?
Are the conclusions similar under a per protocol analysis?
Was the randomisation successful, i.e. are the treatment groups
balanced?
Is the trial population generalisable to inform policy decisions?
Step ii
The Question Amended after Feedback from the Exam Chair (July)
(Ready for Review By The External Examiner)
b)
b)
They should get some points for mentioning that, because this is a
pragmatic trial, the drop out will reflect the normal situation. As what
they are evaluating is a policy of recommending CBT, it will not bias
the research question.
Adolescents may drop out when they leave school
Two hundred adolescents were to be randomly assigned into the trial from six
centres. Each centre had one interviewer collecting data and several therapists
giving CBT.
The primary outcome of the trial was the total score of the Health and Nation
Outcome scale which is a 12 item scale covering a wide range of health and social
domains such as psychiatric symptoms, physical health, functioning, relationships
and housing. Each question is marked from 0 (no problem) to 4 (severe problem).
This was completed by an interviewer at 12 weeks post randomisation.
The primary analysis compared the average total score of the Health and Nation
Outcome scale at 12 weeks post randomisation between treatment groups (SSRI
alone versus SSRI+CBT).
c)
d) Identify and discuss three possible sources of bias that could occur in this trial.
(6 marks)
5 marks for each well explained challenge up to a maximum of 15
The outcome measure is subjective, different interviewers could
be rating people very differently.
It is not possible to blind participants so what they may report
may be dependent on their treatment group.
More experienced therapists could always treat the more
severely depressed participants
e)
The primary analysis of this trial showed that at 12 weeks post randomisation the
mean (standard deviation) of the primary outcome was 18 (SD=7.5) in the SSRI
group and 17.1 (SD=8.3) in the SSRI plus CBT group. The difference between the
two groups was not statistically significant under an
intention-to-treat analysis.
Comment (L3): JR: This is too broad and vague a question. Change to
something more specific.
Comment (R2): EL: Tom/Luke, any ideas to make this more specific?
Comment (L4): JR: But why wasn't baseline score used and ANCOVA
adjusted for baseline done?
f) What other information would you consider important to report in the publication
of this trial to be able to interpret its results?
(6 marks)
Is the sample size large enough?
How many people were included in the analysis?
Was this the best and most appropriate design?
Has there been substantial bias introduced?
What were the results of the secondary outcomes, in particular
safety?
Are the conclusions similar under a per protocol analysis?
Was the randomisation successful, i.e. are the treatment groups
balanced?
Is the trial population generalisable to inform policy decisions?
Step iii
Some Fine Tuning (Final Version)
b)
Two hundred adolescents were to be randomly assigned into the trial from six
centres. The primary outcome of the trial was the total score of the 12 item Health
and Nation Outcome scale covering psychiatric symptoms, physical health,
relationships and housing. This was completed by an interviewer at 12 weeks post
randomisation.
c)
e)
Step iv
A Completed Work? (Some reflections on the use of the question)
No! We still needed lots more work on the model answer to make it much
more specific for the exam marking phase. We also didn't like marking it out
of 50; it is much easier to allocate marks out of 100 (but the 50 was a constraint
placed on us by the previous exam board).
EXERCISE
Please consider and make short notes on the following:
The Process
When do you begin developing examination questions in your course team?
What are the strengths and weaknesses for you in adopting a similar question
development approach to the one described in the case study above?
Having read this case study what elements would you like to transfer to your
own approach to question writing?
The Question
How would you rate the above in terms of clarity, authenticity and fairness?
How strong would you expect the inter-marker reliability to be based on the
marking guidelines provided?