Curriculum developers have a wide choice of assessment methods in all aspects of medical education, including the specific area of medical knowledge. When selecting the appropriate tool, there is an increasing body of literature to provide a robust evidence base for developments or decisions. As a new medical school, we wished to select the most appropriate method for knowledge assessment. This article describes how a new medical school came to choose progress testing as its only method of summative assessment of undergraduate medical knowledge. The rationale, implementation, development and performance of the assessment are described. The position after the first cohort of students qualified is evaluated. Progress testing has worked well in a new school. Opportunities for further study and development exist. It is to be hoped that our experiences and evidence will assist and inform others as they consider developments for their own schools.
This article is primarily an opinion piece which aims to encourage debate and future research. There is little theoretical or practical research on how best to design progress tests. We propose that progress test designers should be clear about the primary purpose of their assessment. We provide some empirical evidence about reliability and cost based upon generalisability theory. We suggest that future research is needed in the areas of educational impact and acceptability.
There has been little work on standard setting for progress tests and it is common practice to use normative standards. This study aimed to develop a new approach to standard setting for progress tests administered at the point when students approach graduation. In this study we obtained performance data from newly qualified doctors and used this information to set the standard for the last progress test in the final year of undergraduate medical education. This external reference was validated against projections of student performance data based upon normative grading, and other published results. A simple linear growth model was used to set pass scores for progress tests earlier in the final year and this was also validated by published data. There was good agreement between standards set using the data from newly qualified doctors, the standard expected from extrapolation of the student progression data, and published performance data from another medical school. We have demonstrated that a combination of data from independent sources can be used to triangulate standard-setting decisions for progress tests. Performance data from successive cohorts of medical students could provide a fruitful source of information for standard setting for progress tests.
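The linear growth model described in this abstract can be illustrated with a short sketch. Assuming, purely for illustration, that knowledge grows roughly linearly over the final year, pass scores for earlier tests can be interpolated back from an externally set end-of-year standard. The function name and all numbers below are illustrative assumptions, not values from the study.

```python
# Hypothetical sketch of a linear growth model for standard setting:
# work backwards from a known end-of-year pass score, subtracting the
# growth still expected before each earlier test.

def interpolated_pass_scores(end_of_year_standard: float,
                             annual_growth: float,
                             n_tests_per_year: int) -> list[float]:
    """Assume knowledge grows linearly across the year, so each earlier
    test's pass score is the final standard minus the growth to come."""
    step = annual_growth / n_tests_per_year
    return [end_of_year_standard - step * (n_tests_per_year - 1 - i)
            for i in range(n_tests_per_year)]

# Example: final standard of 63%, an assumed 8 percentage points of
# growth per year, and four tests per year.
scores = interpolated_pass_scores(63.0, 8.0, 4)
print(scores)  # [57.0, 59.0, 61.0, 63.0]
```

The attraction of anchoring on the final test is that only one standard needs external validation (here, against newly qualified doctors); the earlier cut scores follow from the growth assumption.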
This paper is aimed at assessment teams which are not steeped in the culture of educational measurement, but, rather, are composed of professionals whose jobs primarily require them to work as clinicians, but whose interest in medical education has given them responsibilities for assessment. It reiterates the difference between criterion-referenced tests and norm-referenced tests. It proposes that those who design and use any assessment in medicine should be clear about which of these approaches to testing they are using. This paper does not present any new results, but synthesises what is already known about norm-referenced and criterion-referenced tests by reviewing some of the literature. It explains how these two test paradigms lead to different approaches to test design, different measures of reliability and different standard errors of measurement. It shows how these factors may lead to differences in the standards set for some assessments. Many common medical assessments are assumed to be criterion-referenced but tend to follow norm-referenced practices. Assessment designers should examine the characteristics of each type of assessment to determine which approach is more appropriate and should then apply the correct theories and methods.
Abstract: second-year modules, we have been monitoring student performance and asking students to evaluate the use of on-line examinations. Initial results (Ricketts & Wilks, in press) suggested that both student performance and student opinions were strongly affected by the on-screen style of the assessment.
To determine whether reporting plain films at faster rates leads to a deterioration in accuracy. Fourteen consultant radiologists were asked to report a total of 90 radiographs in three sets of 30. They reported the first set at the rate they would normally report and the subsequent two sets in two-thirds and one-half of the original time. The 90 radiographs were the same for each radiologist; however, the order was randomly generated for each. There was no significant difference in overall accuracy across the three film sets (p=0.74). Additionally, no significant difference in the total number of false negatives for each film set was detected (p=0.14). However, there was a significant decrease in the number of false-positive reports when the radiologists were asked to report at higher speeds (p=0.003). When reporting accident and emergency radiographs, increasing reporting speed has no overall effect upon accuracy; however, it does lead to fewer false-positive reports.
Although progress testing (PT) is well established in several medical schools, it is new to dentistry. Peninsula College of Medicine and Dentistry has recently established a Bachelor of Dental Surgery programme and has been one of the first schools to use PT in a dental setting. Issues associated with its development and its adaptation to the specific needs of the dental curriculum are considered.
Progress tests give a continuous measure of a student's growth in knowledge. However, the result at each test instance is subject to measurement error from a variety of sources. Previous tests contain useful information that might be used to reduce this error. A Bayesian statistical approach to using this prior information was investigated. We first developed a Bayesian model that used the result from only one preceding test to update both the current estimated test score and its standard error of measurement (SEM). This was then extended to include results from all previous tests. The Bayesian model leads to an exponentially weighted combination of test scores. The results show smoothing of test scores when all previous tests are included in the model. The effective sample size is doubled, leading to a 30% reduction in measurement error. A Bayesian approach can give improved score estimates and smaller SEMs. The method is simple to use with large cohorts of students and frequent tests. The smoothing of raw scores should give greater consistency in rank ordering of students and hence should better identify both high-performing students and those in need of remediation.
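One way to realise the kind of Bayesian updating this abstract describes is inverse-variance weighting with a drift term: the previous estimate's variance is inflated by between-test growth uncertainty before it is combined with the new score, which makes older scores decay geometrically (an exponentially weighted combination). This is a sketch under assumed variances, not the paper's actual model; with a drift variance of half the measurement variance, the steady-state SEM falls by a factor of about 1/√2 (roughly 30%), consistent with the doubled effective sample size reported.

```python
# Minimal sketch of Bayesian score smoothing with a between-test drift
# term. All variance values are illustrative assumptions.

def bayes_update(prior_mean, prior_var, score, score_var, drift_var):
    """Inflate the prior by between-test growth uncertainty, then
    combine it with the new score by inverse-variance weighting."""
    v = prior_var + drift_var                    # prior variance after drift
    w_prior, w_score = 1.0 / v, 1.0 / score_var  # inverse-variance weights
    post_mean = (w_prior * prior_mean + w_score * score) / (w_prior + w_score)
    post_var = 1.0 / (w_prior + w_score)
    return post_mean, post_var

score_var = 9.0            # per-test SEM of 3 points (assumed)
drift_var = score_var / 2  # growth uncertainty between tests (assumed)
mean, var = 50.0, score_var
for score in [48.0, 52.0, 50.0, 55.0, 53.0, 56.0]:
    mean, var = bayes_update(mean, var, score, score_var, drift_var)
print(mean, var ** 0.5)    # smoothed score and its reduced SEM
```

At steady state the posterior variance converges to `score_var / 2`, so the smoothed SEM is the raw SEM divided by √2: the "doubling of effective sample size" falls out of the update rule rather than being imposed.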
Medical Education, 2005, Vol. 39, pp. 221–227 (peer-reviewed journal), Feb 1, 2005
Progress testing is a form of longitudinal examination which, in principle, samples at regular intervals from the complete domain of knowledge considered a requirement for medical students on completion of the undergraduate programme. Over the course of the programme students improve their scores on the test, enabling them, as well as staff, to monitor their progress. We aimed to review methods which have been used to assess the results of individual tests, and to make recommendations on best practice. In assessing progress tests, there are a variety of choices that must be made. These include whether the test is norm- or criterion-referenced; whether marking is negative or "number-right"; whether the grades are reported on a continuous or a discontinuous scale; and whether the grades are weighted towards the most recent observations, or the entire set of grades is used to determine the final grade. Grade boundary setting in the context of progress tests is also considered, using a mathematical model to predict the consequences of different approaches. The relationships between boundary setting, progression and remediation rules are considered. We concluded that norm referencing is preferable to criterion referencing, negative marking preferable to number-right marking, and a discontinuous scale preferable to a continuous scale, and that grades should be weighted to favour the most recent outcomes, although there should still be a degree of persistence (earlier grades should not disappear altogether). Grade boundaries should be established with regard to rules on remediation and progression.
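The norm-referenced approach this abstract recommends can be sketched concretely: grade boundaries are expressed as cut-offs on the cohort's score distribution rather than as fixed criterion scores. The z-score cut-offs, grade labels and cohort data below are illustrative assumptions, not those used in the paper.

```python
# Hypothetical sketch of norm-referenced grading: boundaries are defined
# relative to the cohort mean and standard deviation, so they move with
# each cohort rather than being fixed in advance.
from statistics import mean, stdev

def norm_referenced_grade(score, cohort_scores,
                          unsat_z=-2.0, borderline_z=-1.0):
    """Grade a score by its standardised position within the cohort."""
    m, s = mean(cohort_scores), stdev(cohort_scores)
    z = (score - m) / s
    if z < unsat_z:
        return "unsatisfactory"
    if z < borderline_z:
        return "borderline"
    return "satisfactory"

cohort = [42, 48, 50, 51, 53, 55, 57, 60, 62, 66]
print(norm_referenced_grade(38, cohort), norm_referenced_grade(55, cohort))
```

A consequence worth noting, and one reason boundary rules interact with remediation rules: under pure norm referencing some fraction of each cohort will always fall below a z-score cut-off, however well the cohort as a whole performs.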
Enhancing Teaching and Learning through Assessment, 2007
… James Oldham, Adrian Freeman, Suzanne Chamberlain, Chris Ricketts, Institute of Clinical Education, Peninsula Medical School, Universities of Exeter and … the key principles of formative assessment (Manson & Bruning, 2005; Black & Wiliam, 1998; Sadler, 1989; Roos & Hamilton …)
Enhancing Teaching and Learning through Assessment, 2007
This paper presents the findings of an interactive case study that uses a problem-based learning approach to examine a typical layout planning case, whereby students assess the work of other students, which is then used as part of their continuous assessment. After a brief introduction to the topic, students are formed into small groups of about five and given the case to analyse. The introduction contains just enough information for them to tackle the case; they then submit and present their solutions. The case is then used to demonstrate further layout planning techniques used to find solutions to such situations. Finally, students are given an introduction to typical methods of evaluation, and each group evaluates the results of other groups. These evaluations are then amalgamated and used as part of the continuous assessment for the subject. The case study has been used with postgraduate students five times and the results consistently demonstrate its value both as a teaching and learning activity and as an excellent example of peer assessment.
To use progress testing, a large bank of questions is required, particularly when planning to deliver tests over a long period of time. The questions need not only to be of good quality but also balanced in subject coverage across the curriculum to allow appropriate sampling. Hence, as well as creating its own questions, an institution could share questions. Both methods allow ownership and structuring of the test appropriate to the educational requirements of the institution. Peninsula Medical School (PMS) has developed a mechanism to validate questions written in house. That mechanism can be adapted to utilise questions from an international question bank, the International Digital Electronic Access Library (IDEAL), and another UK-based question bank, the Universities Medical Assessment Partnership (UMAP). These questions have been used in our progress tests and analysed for relative performance. Data are presented to show that questions from differing sources can have comparable performance in a progress testing format. There are difficulties in transferring questions from one institution to another, including problems of curricular and cultural differences. Whilst many of these difficulties exist, our experience suggests that only a relatively small amount of work is required to adapt questions from external question banks for effective use. The longitudinal aspect of progress testing (albeit summative) may allow more flexibility in question usage than single high-stakes exams.
Peninsula Medical School, UK, employed six students to write MCQ items for a formative applied medical knowledge item bank. The students successfully generated 260 quality MCQs in their six-week contracted period. Informal feedback from students and two staff mentors suggests that the exercise provided a very effective learning environment and that students felt they were 'being paid to learn'. Further research is under way to track the progress of the students involved in the exercise, and to formally evaluate the impact on learning.