
Reliability and Validity of a Graphic Design Assessment Rubric

International Journal of Technical Research and Applications, e-ISSN: 2320-8163, www.ijtra.com, Volume 4, Issue 2 (March-April 2016), PP. 119-124

Bede Blaise Chukwunyere Onwuagboke 1, Termit Kaur Ranjit Singh 2
1 Department of Curriculum and Instruction, Alvan Ikoku Federal College of Education, Owerri, Nigeria
2 School of Educational Studies, Universiti Sains Malaysia, Pulau Penang, Malaysia
bbconwu@yahoo.com

Abstract: Assessment in the visual arts is subjective in nature, raising the concern of how to assess achievement in the subject effectively. Quality assessment of artworks is highly dependent on accurate and reliable measurement. The objective of this paper was to construct and assess the reliability and validity of a scoring rubric for grading graphic design artefacts. The paper reports the measures taken to ensure the validity of the developed rubric. The reliability of the rubric was investigated using the intraclass correlation coefficient (ICC). Findings indicate that the developed rubric shows evidence of high reliability, with an inter-rater correlation coefficient of 0.790 and an intra-rater correlation coefficient of 0.828, both within the very reliable range. It was concluded that the developed rubric is reliable. This conclusion has far-reaching implications for the development and use of rubrics in assessment, bearing in mind that subjectivity in the use of rubrics cannot be completely removed.

Index terms: Graphic design; assessment rubric; validity; reliability.

I. INTRODUCTION

The use of rubrics as assessment tools in instruction is well recognized by educationists. Rubrics have been described as descriptive scoring instructional tools [1], [2] which form the foundation on which teachers make academic judgements about students' performances and measure students' achievement and progress [2], [3], and which help ensure the development of professional skills [4]. A rubric is, in essence, a simple assessment tool that describes levels of performance on a particular task and is used to assess outcomes in a variety of performance-based contexts at all levels of education [5], [6]. Rubrics are considered beneficial in the learning situation in that they can enhance the learning process by providing both the teacher and the learners with a clear understanding of the objectives of a given (design) assignment as well as the criteria for assessment [7].

Fine and Applied Arts is a subject area in which the production of artefacts serves as a measure of learners' accomplishment. This applies to almost all branches of the subject, whether drawing, painting, sculpture, ceramics, textiles or graphics, except art history and art education, which are purely theoretical in nature. It is a subject that involves theory and practice in all the aspects that produce artefacts as measures of performance. While it may be easier to assess performance objectively in the theoretical aspects, assessment of the practical works produced by students is usually subjective, with instructors assessing the works according to personal appeal. The need for an assessment instrument capable of objectively assessing students' artwork arises from this nature of the subject. A student who produces a design rarely finds any problem with his or her own work, and therefore anticipates a very high grade.
Oftentimes students have reason to grumble over the scores given to them by their lecturers, stressing that such scores do not reflect the quality of the work they produced. To address the problem of subjectivity in graphic design assessment, and out of the need for objectivity and transparency in the assessment process, the authors embarked on this study.

Need for Assessment Rubrics

Rubrics are recommended for assessment methods where students' responses cannot be evaluated with complete objectivity, such as projects, artworks, portfolios, etc., as they can be employed to achieve reliable and valid professional judgement [8]. The use of rubrics is believed to lead to increased objectivity in the assessment of artefacts because the criteria are explicitly defined [9]. To this end, different teachers or raters can make use of a common rubric across a subject to ensure that measurement of students' performance is consistent. Rubrics have been used by teachers in the classroom to communicate expectations for an assignment, to provide focused feedback on works in progress, and to grade the final products of learning activities [10], [11]. Rubrics are valuable instruments in the educational setting for teachers and students alike. When the right type of rubric is used, it can enhance the reliable scoring of performance assessments. Rubrics also have the potential to promote learning and improve instruction, because they make expectations and criteria explicit, which in turn facilitates feedback and self-assessment on the part of the students [12].

Features and Types of Assessment Rubrics

Three basic features are considered essential in scoring rubrics: evaluation criteria, quality definitions and a scoring strategy [13]. The evaluation criteria are the factors which the assessor of an artefact must take into consideration in order to determine the quality of the work. In areas where certain concepts are difficult to define, the criteria highlight the indicators of what is to be measured, bringing clarity and understanding of the requirements of performance.
Well spelt-out assessment criteria have the advantage of informing students of the teacher's or assessor's expectations for a given assignment, giving them insight into what constitutes a complete and effective response to the design problem [2]. When assessment is criterion-based, room is created for assessment to play a leading role in the learning process [14], as it can motivate learners to greater achievement, improve the instructional process and equally enhance assessment [15].

Quality definitions entail a detailed explanation of what the learner has to do to demonstrate a level of mastery or achievement of a skill, proficiency or criterion. They are statements which distinguish a good response from a bad one. They are usually stated from the highest to the lowest level, with the other levels in between. There is no widely accepted ideal number of levels for a rubric, though it is widely argued that the levels should be few. Some researchers have suggested a minimum of three levels [13, 16] and a maximum of five [13]. Whatever the number of levels, it is important that the designer of the rubric expressly states, in clear and understandable language, what the assessor expects of the student in the given task, so as to distinguish students' performances. The next important feature of rubrics is the scoring strategy, which entails the use of a measurement scale for judging and interpreting the product.

Rubrics can be categorized as holistic or analytical [17, 18, 5], and as task specific or generic [19]. A holistic rubric is a scoring scale which assigns a level of performance by assessing performance across multiple criteria as a whole; there is no separate level of performance for each criterion. This type of rubric can be very useful for assessing students' work in large classes, where the assessor has many works to assess, as it gives a broad picture of performance at a glance. Its shortcoming is that it does not provide detailed information. Advocates of holistic rubrics maintain that they allow works to be scored quickly, giving an overview of students' performance. The decision of which type of rubric to use lies with the designer and is a function of how the result of the assessment will be put to use.

An analytical rubric, on the other hand, articulates levels of performance for each criterion, so the teacher can assess student performance criterion by criterion. In this type of rubric, the scores for each criterion can be summed to arrive at the student's final grade. It may not be easy to provide one overall score when a rubric with many criteria is in use. Analytic rubrics are preferable if the objective is to provide detailed diagnostic feedback on the strengths and weaknesses of students' artwork and on the effectiveness of an instructional intervention [13]. An analytical rubric is more time-consuming for scoring students' work; however, it provides more detailed feedback, and scoring is more consistent across students and raters [20].
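To make the contrast between the two scoring strategies concrete, the following is a minimal sketch with hypothetical criterion names; it is an illustration of the general idea, not an instrument from the paper.

```python
# Hypothetical illustration of the two scoring strategies described above
# (criterion names are invented for the example; this is not the paper's rubric).

CRITERIA = ["appearance", "creativity", "layout", "media"]

def holistic_score(overall_level: int) -> int:
    """Holistic: a single 1-4 judgement across all criteria taken as a whole."""
    if not 1 <= overall_level <= 4:
        raise ValueError("level must be between 1 and 4")
    return overall_level

def analytic_score(per_criterion: dict) -> int:
    """Analytic: one 1-4 judgement per criterion, summed into the final grade."""
    missing = set(CRITERIA) - set(per_criterion)
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    return sum(per_criterion.values())

print(holistic_score(3))  # one overall judgement, e.g. "good"
print(analytic_score({"appearance": 3, "creativity": 4,
                      "layout": 2, "media": 3}))  # 12 of a possible 16
```

The analytic version is slower to apply but yields the per-criterion diagnostic detail discussed above, which is why it suits feedback-oriented assessment.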
When a holistic or analytical rubric is designed to assess an individual assignment or task, it is said to be task specific, whereas if it is designed so that it can be used to assess a group of similar tasks, it is termed generic. Task-specific rubrics lend themselves to thoroughness of detail, which accounts for higher reliability and validity than generic rubrics [18]. Generic rubrics have the advantage of being usable across a wide range of similar courses and institutions because of their inherent flexibility. They have a wider scope and contain only the most essential ingredients of the learning outcome to be assessed across different tasks within the same assessment method [21].

For any measuring instrument to be considered effective in measuring students' achievement in a given performance-oriented task, there are two qualities the instrument must exhibit: validity and reliability, the essential qualities of good measuring tools in any setting. An assessment rubric, whether in holistic or analytical format, is expected to exhibit both. Validity of a measuring instrument refers to the extent to which scores obtained using the instrument truly reflect the underlying variable of interest. Reliability, on the other hand, is concerned with the consistency of scores across repeated measurements. The reliability and validity of assessment rubrics have often not been assessed, probably because of the effort and time commitment such assessment requires [22], despite the fact that some researchers have noted them as issues of concern in the development of rubrics [23]. This paper designs and validates a rubric for scoring graphic design artworks produced in graphic design studios in schools and colleges.

II. MATERIALS AND METHODS

The study reviewed the literature on the design and construction of assessment rubrics, then proposed and validated an assessment rubric for assessing graphic design projects and assignments. The authors constructed the rubric following the steps of an instrument development and validation methodology [24]. The step-by-step approach followed in designing the assessment rubric is as stated in Table 1.
Three experienced graphic design lecturers, none below the rank of senior lecturer, from a College of Education participated in the rubric construction exercise. Two of the lecturers are male and one is female.

(1) Designing the rubric (steps 1-3)

The first three steps for developing the rubric, as recommended by scholars [24, 19], are: 1. conceptualize the construct of interest; 2. identify and describe the levels of behaviour that underlie the construct; 3. develop the instrument by developing separate descriptive scoring schemes for each evaluation level and criterion. The experts were asked to list the criteria they normally consider in assessing designs produced by students in their classes. After in-depth discussion, four of the listed criteria were agreed upon:
1. General appearance of the design
2. Display of creativity in the design
3. Design layout
4. Use of media and technology

Graphic design is seen as the activity that organizes visual communication in society, its primary concerns being the efficiency of communication, the technology used for its implementation and the social impact it effects [25]. As the products of graphic design are visually appreciated and consumed, the general appearance of a product constitutes a major criterion for measuring its effectiveness. Five items on the scale were constructed to address the general appearance of the final design.

Because the graphic design process is a creative process which combines images and text to convey information to a given audience, the layout and organization of the design elements on the working surface was adjudged by the team a necessary criterion to assess. To this effect, five items were also developed to measure the design layout.

Art and design have long been seen as creative ventures, and the team maintained that the creativity exhibited in a work should be the hallmark of its assessment. Though it was difficult for the team to reach a consensus on the attributes that constitute creativity, they pointed out originality, innovativeness, thoughtfulness, improvement over previous products and the acquisition of more skills as manifestations of a creative work of art, amounting to another five items. Finally, the team agreed that the technical use of media and the technology involved in the creative process are worth assessing when evaluating a finished work of art; hence five items were developed.

Each of the criteria so identified was subjected to further analysis to bring out a detailed explanation of what the learner has to do to demonstrate the level of mastery or achievement of a skill related to that criterion. For each item, quality statements were constructed stating the levels at which a good response can be distinguished from a bad one. The team agreed that the statements should be at four levels, as recommended [26]. On this basis, a four-point scale with the labels basic, acceptable, good and excellent, carrying the numerical values 1-4, was adopted to show advancement from the lowest to the highest anticipated performance. The initial rubric that resulted from this exercise contained twenty items.
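To make this structure concrete, here is a minimal sketch of how such a criterion-by-level rubric could be represented and totalled. The item texts are invented placeholders: the paper reports the criteria, level labels and item counts, but does not reproduce the full instrument here.

```python
# Minimal sketch of the rubric's structure; item texts are placeholders.

LEVELS = {"basic": 1, "acceptable": 2, "good": 3, "excellent": 4}

# Four criteria; the final instrument carried 15 items in total
# (reduced from the initial 20 after the review described below).
rubric = {
    "General appearance":   ["Overall visual impact"],            # placeholder items
    "Creativity":           ["Originality of the concept"],
    "Design layout":        ["Organization of design elements"],
    "Media and technology": ["Technical handling of media"],
}

def total_score(item_ratings: dict) -> int:
    """Map each item's level label to its 1-4 value and sum across items."""
    return sum(LEVELS[label] for label in item_ratings.values())

# One level label per item; with 15 items the totals would range from 15 to 60.
ratings = {item: "good" for items in rubric.values() for item in items}
print(total_score(ratings))  # 4 placeholder items rated "good" -> 12
```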
Review, Feedback and Revising the Rubric (steps 4-5)

The developed instrument was discussed with the major stakeholders in graphic design in the College, namely students and lecturers in graphic design. The purpose was to determine the usability and appropriateness of the language used in the measure, and to ensure that the instrument covers the content area. Lecturers were asked to rate and comment on the rubric with regard to its clarity, completeness and applicability in measuring all aspects of a finished work in graphic design. The students, for their part, were asked to comment on the clarity of the language and the completeness of the rubric. Statements which were not clear to the students were rephrased, while very ambiguous ones were deleted.

The ratings and comments from the lecturers and students led to the reconstruction of some items to make them clearer, and to the shortening of some of the descriptors. Ambiguous items were deleted, as were items that appeared to be repeated in the scale. In general, the rubric received high ratings from the lecturers, and the students found it easy to interpret and apply in the context of their design assignments. In line with the reviewers' recommendations, the number of items in the rubric was reduced from twenty to fifteen: most items that appeared to be repeated were deleted, along with items that did not seem measurable by merely observing and rating the finished artworks.

Reliability and Validity of the Developed Rubric (step 5)

The essence of developing this rubric is to overcome the shortcomings of subjective evaluation of students' products by teachers. Assessments based on the rubric need to be sound, unbiased and free from distortions [19]. Establishing the validity and reliability of the rubric was therefore undertaken to ensure confidence in the use of the instrument. In an ideal situation, the score a candidate receives in a test should be independent of the scorer, and similar results should be obtained no matter when and where the assessment is carried out [12]. This is, however, rarely obtainable in practice. Multiple-choice tests tend to yield similar results irrespective of the scorer, the time of scoring and the place. This makes it imperative that rubrics measuring complex performance assessments be made reliable and consistent in measurement. The rubric is also expected to be valid by measuring what it is intended to measure. Simply put, the rubric should accurately capture the intended outcome [26] for it to be considered valid.
Validity

Validation of any measure is a process of accumulating evidence that supports the appropriateness of the conclusions drawn from students' work for specified assessment uses [27]. Test validity has been seen by researchers as the most important factor to be taken into consideration in test evaluation [28]. The validity of a measuring instrument can be looked at from several perspectives through which the needed evidence is gathered: content, construct and criterion validity. The content validity dimension focuses on evidence that students' responses or performance on the given instrument reflect the students' knowledge of the relevant content area. It is also concerned with the extent to which the assessment tool adequately samples the content domain, while ensuring that measurement does not go beyond the scope of the content being measured. Construct-related evidence deals with gathering evidence that the responses provided by individuals result from their internal thought processes rather than from chance. Criterion validity deals with evidence that a student's performance on a given task may be generalized to other relevant activities; it accumulates evidence of transfer of learning to solving real-world problems. Furthermore, [26] posits that validity is often harder to establish than reliability, and that it is preferable for assessments to exhibit multiple forms of validity as described above.

Content validity of the rubric was ensured by involving experts in the field of graphic design in the college in determining the criteria for assessment. It is most common to use experts to establish content validity, to ensure that the rubric covers the full range of the concept's meaning [29, 30, 31]. Researchers have also stressed the importance of appropriate language for the understanding and use of rubrics by both teachers and students [21], [32]. Stating the criteria and the descriptors of the expected levels of performance in clear and understandable language enables students to work to specification and makes clear to the rater what is required of the product, ensuring that no extraneous content is inherent in the rubric. This in no small measure increases the validity of the rubric. To this end, not only were the experts involved in determining the criteria for assessment, the students were also involved in the review process. They read the rubric and offered feedback on their understanding of what the teacher requires of them in the task being evaluated. The essence of the review was to eliminate ambiguity from the instrument as much as possible, as ambiguity does not make room for proper interpretation [33]. All of the above measures were taken to ensure and enhance the validity of the rubric.

Reliability

The literature on the reliability of rubrics identifies two ways of estimating the reliability of such a measure: inter-rater reliability and intra-rater reliability [27, 19, 22, 33].

Inter-rater reliability

Inter-rater reliability is concerned with the likelihood of variance in students' scores arising from the subjective judgement of different raters. In effect, the same piece of artwork produced by a student may receive as many different score values as there are scorers.
With a scoring rubric acting as a guide that formalizes the criteria used by the assessors, the similarity of the score values given to a work by different assessors can be greatly enhanced. To ensure that the developed rubric yields similar results when used by different raters, the poster designs produced by fifteen pre-service art teachers were rated by three lecturers in the department using the rubric. The inter-rater reliability of the rubric was computed using the intraclass correlation coefficient (ICC), a measure of correlation, consistency or conformity for a data set with multiple groups. A two-way random-effects ICC model was utilized in this study. Regarding the sample size needed, [34] concluded that the optimal sample size for two ratings varied from 15 for ICC = .9 to 378 for ICC = .1; for three ratings, from 13 to 159; for five ratings, from 10 to 64; and for 10 ratings, from 8 to 29. In other words, the fewer the ratings and the smaller the ICC, the larger the needed sample size. Fifteen designs were randomly selected from the products of the pre-service teachers. Three of the lecturers who participated in the development and review of the rubric were selected to act as raters. All raters rated the fifteen artworks independently, giving three scores for each poster. Following the recommendation of [35], the individual scores awarded by the raters, rather than an average measure, constituted the unit of analysis. The analysis was carried out using SPSS version 22, with a 95% confidence interval.

Intra-rater reliability

Intra-rater reliability is concerned with the probability of a rater obtaining similar scores from the same set of sampled products at measurements taken at two different points in time (test-retest reliability). It is a metric of a rater's self-consistency in the scoring of subjects [36]. Experience has shown that the psychological conditions around a rater at two different times often affect the person's performance of a task. When this happens with the same measure, the measure's reliability is suspect. A reliable measure will yield very similar results when used by the same individual at different times. To ensure that this type of reliability is inherent in the rubric, the instrument was subjected to an intra-rater reliability investigation. One of the raters was asked to rate the samples using the rubric and the ratings were recorded; after an interval of two weeks he was asked to rate the same samples again. The ICC was computed for the two ratings taken at the two different times.

III. RESULTS AND DISCUSSION

Either the consistency-agreement (CA-ICC) or the absolute-agreement (AA-ICC) intraclass correlation coefficient can serve as a useful measure of agreement, depending on whether rater variability is relevant to determining the degree of agreement. According to [37], the CA-ICC is useful when comparative judgements are made about the objects of measurement, representing correlation when the rater is fixed. On this basis, the researchers report the absolute-agreement ICC. The inter-rater investigation shows that the absolute-agreement ICC is 0.790, with a 95% confidence interval of (0.585-0.916), for the rubric on a single measure, as shown in Table 2.
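The paper's analysis was run in SPSS 22. Purely as an illustration, the sketch below shows how the same two single-measure, absolute-agreement ICC estimates could be reproduced in Python with the pingouin library; the scores are simulated placeholders, not the study's data.

```python
# Illustrative sketch (the authors used SPSS 22; this is not their script).
# Mirrors the two ICC analyses described above on simulated data:
#   inter-rater: 15 designs x 3 raters, two-way random effects,
#                absolute agreement, single measure (pingouin's "ICC2");
#   intra-rater: the same 15 designs scored twice by one rater, treating
#                the two occasions as the "raters".
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(7)
n_designs = 15
quality = rng.uniform(25, 55, n_designs)  # latent quality per design (rubric range 15-60)

# --- Inter-rater: three raters each score every design once ---
inter = pd.DataFrame(
    [{"design": d, "rater": f"R{r + 1}", "score": quality[d] + rng.normal(0, 3)}
     for d in range(n_designs) for r in range(3)]
)
icc_inter = pg.intraclass_corr(data=inter, targets="design",
                               raters="rater", ratings="score")
print(icc_inter.loc[icc_inter["Type"] == "ICC2", ["ICC", "CI95%"]])

# --- Intra-rater: one rater, two occasions two weeks apart ---
intra = pd.DataFrame(
    [{"design": d, "occasion": f"T{t + 1}", "score": quality[d] + rng.normal(0, 2.5)}
     for d in range(n_designs) for t in range(2)]
)
icc_intra = pg.intraclass_corr(data=intra, targets="design",
                               raters="occasion", ratings="score")
print(icc_intra.loc[icc_intra["Type"] == "ICC2", ["ICC", "CI95%"]])

# Interpretation thresholds cited in the paper [12, 38]:
# below 0.40 poor; 0.40-0.74 adequate/acceptable; 0.75 and above excellent.
```

Pingouin's "ICC2" row corresponds to the two-way random-effects, absolute-agreement, single-measure model reported here (0.790 inter-rater; 0.828 intra-rater in the results that follow).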
The intra-rater correlation coefficient computed gave an output of 0.828, with a 95% confidence interval of (0.572-0.938), for the one rater who repeated his rating of the sampled designs two weeks after the first rating (repeated measures). Consistency measures of 0.70 or above are usually considered acceptable in the literature [12, 38]. Table 3 shows the intra-rater reliability achieved using the rubric to assess the designs of fifteen students by a single rater. The single-measure ICC is reported instead of the average measure; use of the single-measure ICC is recommended if further research will use the ratings of a single rater [39]. The ICC (.828) is within the acceptable range according to [12, 38]: below 0.40 is poor, 0.40 to 0.74 is adequate and acceptable, and 0.75 and above is regarded as excellent correlation. Thus the rubric yielded high inter-rater and intra-rater consistency agreement among the raters, unlike many newly developed rubrics [22].

The authors attribute the high correlation coefficients to the fact that the raters used in scoring the samples were part of the team that developed the rubric and may thus have become familiar with it; in other words, the development process may have served as sufficient training in the use of the rubric for scoring [22]. Secondly, the raters' wealth of experience in rating artworks over the years in their field, all being of the rank of principal lecturer and above, may have made them conversant with what constitutes a good work of art. The practice of having an external judge moderate the scores given to students' work, which is inherent in fine art evaluation praxis, may also have influenced their pattern of evaluating and rating artworks towards achieving consensus scores.

IV. CONCLUSION

The need to give an objective assessment of the graphic design artefacts produced by pre-service teachers, so as to orient them towards objectively assessing their own students' products, necessitated the development of the graphic design assessment rubric (GDAR). For the developed rubric to be effectively utilized in the instructional process, its validity and reliability needed to be investigated. The graphic design assessment rubric was thus validated following the laid-down procedures. Similarly, the rubric is a reliable measure for assessing graphic design artefacts, based on the established ICC. The relationship between the reliability and validity of instruments is such that establishing reliability is a necessary condition for establishing validity; it does not, however, follow that a reliable assessment is by extension valid, even though a valid assessment is a reliable assessment [27]. The reliability and validity of a rubric are not a function of its type, whether holistic or analytical, task specific or generic, but rather depend on the pains taken and the care exercised in the design process. Assessment rubrics are very useful assessment tools for teachers at all levels of education, and their use in the literature is not limited to subjects that produce artefacts as evidence of achievement. We therefore recommend that teachers in subject areas that require the use of rubrics in assessment follow development procedures that ensure very high reliability and validity of their rubrics, so as to give accurate measurement of students' performance at all times.

REFERENCES
[1] Oakleaf, M. (2009). Using rubrics to assess information literacy: An examination of methodology and interrater reliability. Journal of the American Society for Information Science and Technology, 60(5), 969-983.
[2] Egodawatte, G. (2010). A rubric to self-assess and peer-assess mathematical problem solving tasks of college students. Acta Didactica Napocensia, 3(1), 78.
[3] Reynolds-Keefer, L. (2010). Rubric-referenced assessment in teacher preparation: An opportunity to learn by using. Practical Assessment, Research & Evaluation, 15(8). Retrieved December 10, 2013 from http://pareonline.net/getvn.asp%3Fv%3D15%26n%3D8
[4] Mertler, C. A. (2001). Designing scoring rubrics for your classroom. Practical Assessment, Research & Evaluation, 7(25), 1-10.
[5] Mueller, J. (2014). Rubrics (authentic assessment toolbox). Retrieved March 5, 2014 from http://jfmueller.faculty.noctrl.edu/toolbox/rubrics.htm
[6] Hafner, J. C., & Hafner, P. M. (2003). Quantitative analysis of the rubric as an assessment tool: An empirical study of student peer-group ratings. International Journal of Science Education, 25, 1509-1528.
[7] Andrade, H. G. (2005). Teaching with rubrics: The good, the bad and the ugly. College Teaching, 53, 27-30.
[8] Pellegrino, J. W., Baxter, G. P., & Glaser, R. (1999). Addressing the "two disciplines" problem: Linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education, 307-353.
[9] Peat, B. (2006). Integrating writing and research skills: Developing and testing of a rubric to measure student outcomes. Journal of Public Affairs Education, 12, 295-311.
[10] Andrade, H. (2000). Using rubrics to promote thinking and learning. Educational Leadership, 57(5), 13-18.
[11] Moskal, B. M. (2003). Recommendations for developing classroom performance assessments and scoring rubrics. Practical Assessment, Research & Evaluation, 8(14). Available: http://pareonline.net/getvn.asp?V=8&n=14
[12] Jonsson, A., & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2(2), 130-144.
[13] Popham, W. J. (1997). What's wrong and what's right with rubrics. Educational Leadership, 55(2), 72-75.
[14] Eshun, E. F. (2011). Report on the action research project on adopting innovative assessment for learning in communication design in higher education. Paper presented at the Conference on Design, Development & Research, 26-27 September 2011, Cape Town, 382-395.
[15] Gasaymeh, A-H. (2011). The implications of constructivism for rubric design and use. Higher Education International Conference (HEIC 2011). Retrieved February 23, 2014 from http://heic.info/assets/templates/heic2011/papers/05-AlMothana_Gasaymeh.pdf
[16] Stevens, D. D., & Levi, A. (2005). Leveling the field: Using rubrics to achieve greater equity in teaching and grading.
[17] Brookhart, S. M. (1999). The art and science of classroom assessment: The missing part of pedagogy. ASHE-ERIC Higher Education Report (Vol. 27, No. 1). Washington, DC: The George Washington University, Graduate School of Education and Human Development.
[18] Moskal, B. M. (2000). Scoring rubrics: What, when and how? Practical Assessment, Research & Evaluation, 7(3). Retrieved from http://pareonline.net/getvn.asp?v=7&n=3
[19] Reddy, M. Y. (2011).
Design and development of rubrics to improve assessment outcomes: A pilot study in a Master's level business program in India. Quality Assurance in Education, 19(1), 84-104.
Callison, D. (2000). Rubrics. School Library Media Activities Monthly, 17(2).
[20] Zimmaro, D. M. (2004). Developing grading rubrics. Retrieved March 5, 2014 from http://www.utexas.edu/academic/mec/research/pdf/rubricshandout.pdf
[21] Tierney, R., & Simon, M. (2004). What's still wrong with rubrics: Focusing on the consistency of performance criteria across scale levels. Practical Assessment, Research & Evaluation, 9(2), 1-10.
[22] Stellmack, M. A., Konheim-Kalkstein, Y. L., Manor, J. E., Massey, A. R., & Schmitz, J. A. P. (2009). An assessment of reliability and validity of a rubric for grading APA-style introductions. Teaching of Psychology, 36(2), 102-107.
[23] Thaler, N., Kazemi, E., & Huscher, C. (2009). Developing a rubric to assess student learning outcomes using a class assignment. Teaching of Psychology, 36(2), 113-116.
[24] Onwuegbuzie, A. J., Bustamante, R. M., & Nelson, J. A. (2010). Mixed research as a tool for developing quantitative instruments. Journal of Mixed Methods Research, 4(1), 56-78.
[25] Frascara, J. (2006). Graphic design: Fine art or social science. In A. Bennett (Ed.), Design Studies: Theory and Research in Graphic Design, 26-35.
[26] Finley, A. (2012). Making progress?: What we know about the achievement of liberal education outcomes. Association of American Colleges and Universities.
[27] Moskal, B. M., & Leydens, J. A. (2000). Scoring rubric development: Validity and reliability. Practical Assessment, Research & Evaluation, 7(10), 1-11.
[28] Downing, S. M., & Haladyna, T. M. (1997). Test item development: Validity evidence from quality assurance procedures. Applied Measurement in Education, 10(1), 61-82.
[29] Chambliss, D. F., & Schutt, R. K. (2009). Making sense of a social world: Methods of investigation. Thousand Oaks, CA: Pine Forge Press.
[30] Huck, S. W. (2008). Reading statistics and research. Boston, MA: Pearson Education.
[31] Reynolds, C. R., Livingston, R. B., & Wilson, V. (2006). Measurement and assessment in education. In A. E. Burvidovs (Ed.), Validity for teachers (pp. 117-140). Boston, MA: Pearson Education.
[32] Moni, R. W., Beswick, E., & Moni, K. B. (2005). Using student feedback to construct an assessment rubric for a concept map in physiology. Advances in Physiology Education, 29, 197-203.
[33] Payne, D. A. (2003). Applied educational assessment (2nd ed.). Belmont, CA: Wadsworth/Thomson Learning.
[34] Watts, F., Marin-Garcia, J. A., García Carbonell, A., & Aznar-Mas, L. E. (2012). Validation of a rubric to assess innovation competence. WPOM-Working Papers on Operations Management, 3(1), 61-70.
[34] Bonett, D. G. (2002). Sample size requirements for estimating intraclass correlations with desired precision. Statistics in Medicine, 21, 1331-1335.
[35] Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420.
[36] Gwet, K. L. (2014). Intrarater reliability. Wiley StatsRef: Statistics Reference Online.
[37] McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30-46.
[38] Stemler, S. E. (2004). A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability. Practical Assessment, Research, and Evaluation, 9(4). Retrieved from http://PARAonline.net/getvn.asp?v9&n=4
[39] Garson, G. D. (2009). Reliability analysis. Retrieved March 11, 2014 from http://tx.liberal.ntu.edu.tw/~purplewoo/Literature/!DataAnalysis/Reliability%20Analysis.htm