MACHINES, TECHNOLOGIES, MATERIALS. ISSN 1313-0226. ISSUE 7/2013

NEURAL NETWORK MODEL FOR ASSESSMENT COMPUTER-BASED TEST

Assistant Dr. Petar Halachev
Department of Programming and Computer System Application, University of Chemical Technology and Metallurgy – Sofia, Bulgaria

Abstract: The rapidly increasing use of information technologies in the educational process increases the need for automated tools and systems capable of objective, reliable and prompt evaluation of students' knowledge. Computer-based testing is a technological and objective form of knowledge control. In this report, a multilayer perceptron neural network and a neural network with radial-basis activation function are analyzed and compared on the basis of evaluation criteria, and a model is proposed to increase the accuracy and objectivity of evaluation in computer-based knowledge control.

Keywords: KNOWLEDGE CONTROL, NEURAL NETWORK, EDUCATIONAL PROCESS, COMPUTER-BASED TESTING

1. Introduction

The assessment of learning is one of the most important elements of computer-based control of knowledge. An incorrectly structured approach to evaluation can significantly reduce students' motivation and the effort they put into preparation, and can disrupt both the conduct of monitoring and the proper organization of training itself.

One of the problems in computer-based testing is the mismatch between the students' knowledge and the complexity of the test questions. Another problem is the teacher's selection of questions with various degrees of complexity (the complexity of the test questions tends to be unified). Creating tests with an optimal ratio between easier questions and questions of higher complexity requires long periods for collecting statistical information, analyzing it, and preparing and correcting the tests before their final application.

2. Theoretical aspects in the evaluation of computer-based tests

Theoretical developments related to the evaluation of tests date back to the creation of the tests themselves and are associated with the names of the scientists A. Binet, T. Simon, Fr. Galton, G. Rasch et al.

A. Binet and T. Simon proposed the first intelligence quotient scale and test; their goal was to identify students who needed special help in coping with the school curriculum. They carried out empirical testing of the tasks to be included in the test and proposed a method for differentiating tasks. Fr. Galton was the first to suggest statistical processing of the results of the educational process. Ch. Spearman created the Classical Test Theory. The Item Response Theory (IRT), proposed by Fr. Lord, was developed in the 1970s. It is successfully applied in computer adaptive tests and increases the reliability and accuracy of the assessment. On the basis of these theories G. Rasch [1] put forward the theory of "Rasch measurement", which was later developed by other authors (A. Birnbaum) and is applied nowadays.

The "Rasch measurement" model of G. Rasch has a number of advantages:
• It translates measurements made on ordinal and dichotomous scales into linear scales, so that qualitative data can be analyzed by means of quantitative methods. This makes the application of statistical procedures possible;
• The assessment of the difficulty of the test tasks does not depend on the size and type of the sample tested, and the assessment of the level of knowledge of those tested does not depend on the set of test tasks;
• The exclusion of data from the sample for some combinations (student - test task) does not have a critical impact on the accuracy of the study;
• The test assessment system is easy to apply in comparison with other similar systems. It is characterized by the lowest number of parameters: one parameter for the level of knowledge of each student and one parameter for the difficulty of each question of the assignment;
• The model of G. Rasch is based on the criteria "difficulty of questions" and "level of knowledge". A test task is considered more difficult than another if its rate of correct answers is smaller, irrespective of which students are being tested. Similarly, a better prepared student is more likely to answer all the questions correctly than a less prepared one;
• In the model of G. Rasch computational procedures for multidimensional analysis can be applied: for the whole set of test tasks, for each test, for each student, for each specific answer. Approximation of the testing results by the model can be used to differentiate students.

The application of "Rasch measurement" also raises problems, which may be classified as follows:
• It requires knowledge of mathematics.
• The Rasch model requires a great number of observations or replications to estimate the parameters of the model.
• The Rasch model makes strong assumptions which are not easily met by the observations ("In fact, Rasch specifications can never be met perfectly, but are nearly always met usefully by thoughtfully collected data") [2].

When models are established and applied in practice, one of the most important criteria is accuracy of estimation, i.e. the best model is characterized by high accuracy. In the assessment of tests, however, errors may always occur, so the models that give the minimal evaluation error are the most reasonable choice for assessing the complexity of tests.

3. Selection criteria for evaluation of computer-based tests

In computer-based testing it is essential to select the evaluation criteria properly. They influence the objectivity of the evaluation and the motivation of the students and the teacher. In practice three criteria are applied most frequently:

Complexity (degree of difficulty) of the questions. "The difficulty of a question (or mark point) can be thought of as the proportion of students who get the question correct. In order that students are separated out as much as possible, it is desirable for assessments overall to have a difficulty level of about 0.5 - so that the mean mark is roughly half of the mark available." [3]

Ratio of the number of correct answers to the total number of questions. Determining the optimal ratio between the number of correct answers and the total number of questions in the test is important.

Time spent to answer the questions. In conducting tests an important indicator is the time needed to solve them. This criterion is applied when it is necessary to consider not only the accuracy but also the speed of the response. When a test is conducted, a time for completion is set, and answers given after it has passed are not counted as correct. When the time specified by the teacher differs from the time actually spent by students, the causes can include vague questions or unbalanced combinations in the test.

As additional criteria in evaluating computer-based tests, the following indicators can be applied:
• Number of students assessed;
• Relevance of the level of knowledge of the students tested to the complexity of the test tasks.

To increase the objectivity of the study, the values of these criteria need to be coordinated between teachers teaching the same subject.

4. Algorithm for assessing the complexity of test

A computer test in Informatics measures the knowledge of 117 students. The number of questions in the test is 30. The time to sit the test is 60 minutes. For the purposes of the study, the students who did not respond to at least 10% of the questions in the test are excluded. The objective was to compare the degree of complexity specified in advance by the trainer with the actual results obtained by the students and, if necessary, to update the test. The algorithm of the process of assessing the complexity of the test questions is presented in Figure 1.

1. Preparation of test
2. Determination of the degree of complexity of the test
3. Experimental test
4. Assessment with neural network (steps 4.1, 4.2, 4.3, 4.4, 4.5)
5. Comparing the results of neural network, the experiment and evaluation of teacher
6. Adjustment to the test

Figure 1. Algorithm of the process of evaluating computer-based tests with neural network

The study was carried out in the following sequence:

1. Preparation of the test by the teacher. The teacher prepares a test in Informatics with 30 questions and, on the basis of his experience, defines the following parameters:
• Contents of the test: the relevance of the questions in the test to the curriculum and lectures;
• Technical performance of the test: the level of accuracy and clarity of the formulation of the questions and the answers;
• Time to solve each task (if no reply is given within the time, the task is considered unsolved);
• Criteria (limits) for the assessment of students depending on the ratio between the solved and unsolved questions in the test.
In order to assess these criteria more accurately, at this stage the opinions of experts or teachers from the same discipline may be included, and the results of previous tests examined.

2. Assessment of the degree of complexity of the questions by the teacher (maximum valuation, MV). The teacher defines a degree of complexity for each test question in the range 0.1 to 1 (questions of the lowest complexity have a weight of 0.1 and questions of the highest complexity have a weight of 1).

3. Conduct of a trial test of the students and construction of a matrix of responses to calculate a coefficient of difficulty (CD) and an index of response time (IT) for the questions in the test.

The coefficient of difficulty (CD) is given by (1) or (2):

\[ CD = 1 - \frac{N_R}{N} \qquad (1) \]

\[ CD = \frac{N_W}{N} \qquad (2) \]

where: N_R is the count of correct answers per question; N_W is the count of incorrect answers per question; N is the total number of students.

The index of response time (IT) is given by (3):

\[ IT = \frac{T_{av}}{T} \qquad (3) \]

where: T_{av} is the average time needed for a response per question; T is the time for completion of the whole test (60 minutes).

After the trial test is conducted, the results are recorded in the following matrices (Table 1 and Table 2). The matrix holds the scores of the examined students: where the student gives a right answer the value is 1, and where the student gives a wrong answer the value is 0. Table 1 presents the calculation of the coefficient of difficulty (CD).

Table 1. Matrix to estimate CD

Question    Student №1    Student №2    ...    Student №100    CD
№1          x1/1          x1/2          ...    x1/100          CD1
№2          x2/1          x2/2          ...    x2/100          CD2
...         ...           ...           ...    ...             ...
№30         x30/1         x30/2         ...    x30/100         CD30

Table 2 presents the calculation of the index of response time (IT) of the questions in the test.

Table 2. Matrix to estimate IT

Question    Student №1    Student №2    ...    Student №100    IT
№1          t1/1          t1/2          ...    t1/100          IT1
№2          t2/1          t2/2          ...    t2/100          IT2
...         ...           ...           ...    ...             ...
№30         t30/1         t30/2         ...    t30/100         IT30

The values of the coefficient of difficulty (CD) and the index of response time (IT) set standards for the ease or difficulty of the questions for the specific student group. The values of the coefficients depend on the level of preparation of the students, not on their number, i.e. the students are differentiated by their level of preparation on the basis of their answers.
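As a minimal sketch of formulas (1) to (3), the following Python fragment computes CD and IT for each question from a response matrix and a time matrix. The function names and the small two-question, four-student sample are hypothetical and are not data from the study; only the formulas and the 60-minute total time come from the text.

```python
def coefficient_of_difficulty(answers):
    """Formula (1): CD = 1 - NR / N, with answers holding 1 (right) or 0 (wrong)."""
    n = len(answers)          # N: total number of students
    nr = sum(answers)         # NR: count of correct answers for this question
    return 1 - nr / n


def coefficient_of_difficulty_from_wrong(answers):
    """Formula (2): CD = NW / N, the same quantity expressed via wrong answers."""
    nw = len(answers) - sum(answers)   # NW: count of incorrect answers
    return nw / len(answers)


def index_of_response_time(times, total_time):
    """Formula (3): IT = Tav / T for one question."""
    t_av = sum(times) / len(times)     # Tav: average response time per question
    return t_av / total_time


# Hypothetical sample: one row per question, one column per student.
responses = [[1, 1, 0, 1],
             [0, 0, 1, 0]]
times_min = [[1.0, 2.0, 3.0, 2.0],
             [4.0, 3.0, 5.0, 4.0]]
TOTAL_TIME_MIN = 60.0                  # T from the text: 60 minutes

cd = [coefficient_of_difficulty(row) for row in responses]
it = [index_of_response_time(row, TOTAL_TIME_MIN) for row in times_min]
print(cd, it)
```

In this sample the second question is answered correctly by one student in four, so it receives the higher CD, and it also takes longer on average, so it receives the higher IT.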
The complexity of the questions must correspond to the average level of knowledge in the test group; the test includes questions of varying difficulty, from easy to difficult. Obviously, very easy questions that are answered by all students, and overly complex ones that cannot be answered by any student, cannot differentiate students by their level of training. These questions need to be corrected or removed from the test.

4. Evaluation of the test with neural network

Artificial neural networks are applied to increase the precision of the evaluation. The purpose is to adjust the complexity of the questions in line with the level of the students' knowledge. Neural networks have the following advantages:
• a neural network can perform tasks a linear program cannot;
• a neural network learns and does not need to be reprogrammed;
• it can be implemented in any application. [4]

The study with neural networks is carried out in the following steps:

4.1. Selection of the structure of neural network

The choice of the type of neural network depends on the size of the data and on the task. The size of the data sample is sufficient for a study with a neural network. The main task in this paper is the evaluation of the complexity of the questions in the test by function approximation and forecasting. Two types are suitable for these purposes: the multilayer perceptron neural network and the neural network with radial-basis activation function. After the experiment is carried out, the forecasts are compared and the neural network that gives the smaller error is selected.

A multilayer perceptron neural network [5] is a set of computing elements that are arranged in layers, including:
• Multiple input nodes forming the input layer;
• One or more hidden layers of computational neurons;
• An output layer of neurons.
The number of layers is not limited, but there are usually 2 or 3.
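The layered structure listed above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: two inputs, one hidden layer of three neurons, one output neuron, a sigmoid activation, and random untrained weights. The paper does not state the layer sizes or the activation function it used.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """One layer: a row of weights and a bias for every neuron (random, untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_neurons)]
    return weights, biases


def sigmoid(z):
    """Assumed activation function; squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers, x):
    """Feed the input through each layer in turn and return the last layer's output."""
    signal = x
    for weights, biases in layers:
        signal = [sigmoid(sum(w * s for w, s in zip(row, signal)) + b)
                  for row, b in zip(weights, biases)]
    return signal


rng = random.Random(0)                   # fixed seed so the sketch is repeatable
layers = [make_layer(2, 3, rng),         # hidden layer: 2 inputs -> 3 neurons
          make_layer(3, 1, rng)]         # output layer: 3 inputs -> 1 neuron
output = forward(layers, [0.25, 0.03])   # hypothetical input vector
print(output)
```

With untrained weights the output value is meaningless; the sketch only shows how the layers are connected.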
The principle of operation of the multilayer perceptron neural network consists in the sequential transmission of the signal from layer to layer, which results in an output signal.

Neural networks with radial-basis activation function (RBF) are multilayer neural networks that consist of a hidden layer of elements with a radial-basis activation function and an output layer of linear units. RBF networks are networks with direct connections; their main purpose is the interpolation and approximation of functions for solving forecasting problems. The RBF network here consists of one input layer and one hidden layer of neurons. The number of neurons in the hidden layer corresponds to the number of centers of the data clusters obtained by K-means clustering of the training data. The input vector x is transmitted to the neurons in the hidden layer, and each neuron of the hidden layer receives full information about the input vector x.

Neural networks with radial-basis activation function were proposed by Powell in 1985 and have the following characteristics:
• In these networks each neuron is connected to each neuron of the next layer;
• Unlike back-propagation neural networks, they do not suffer from the problem of local minima;
• Neurons in the hidden layer have a nonlinear activation function;
• Fast training and good approximation capabilities.

4.2. Separation of data for the performance of neural network

• The data is distributed as follows: 70% for training, 15% for validation and 15% for testing the neural network.

4.3. Training of neural network

In order to improve the performance of a neural network, it must be trained. The multilayer perceptron neural network is trained by the method of back propagation of error.

When a neural network with radial-basis activation function is trained, the center vectors (Ci) of the RBF functions in the hidden layer are chosen first: the centers can be randomly sampled from some set of examples, or they can be determined using k-means clustering. This step is unsupervised.
The second step fits a linear model with coefficients (Wi) to the hidden layer's outputs with respect to some objective function. A common objective function, at least for regression/function estimation, is the least squares function:

\[ K(w) = \sum_{t=1}^{\infty} K_t(w), \qquad (4) \]

where:

\[ K_t(w) = \left[ y(t) - \varphi(x(t), w) \right]^2 \qquad (5) \]

Minimization of the least squares objective function by an optimal choice of weights optimizes the accuracy of fit. A backpropagation step can then be performed to fine-tune all of the RBF net's parameters [6].

The neural networks are validated after training. The student evaluations for 20 questions were used for training the neural networks, and those for 10 questions for testing. A vector consisting of 2 components is supplied to the input of the neural network:
• Coefficient of difficulty (CD): the value for each of the 20 questions;
• Index of response time (IT): the value for each of the 20 questions.
The teacher's assessment of the complexity of the given question, in the range 0.1 to 1, is supplied at the output of the neural networks. The training of the neural networks is performed successively with the data for all 20 questions of the test.

4.4. Testing the neural networks

After the training is over, the accuracy of the neural network is checked with the other 10 questions. The values of CD and IT are supplied to the input layer. At the output, the neural network gives its calculated estimate of the teacher's assessment of the complexity of each question. The testing of the neural networks is performed sequentially with all 10 questions of the test. The same experiment with these data is performed with the neural network with radial-basis activation function.

4.5. Determining the error of the forecast

The forecasts of the multilayer perceptron neural network and of the neural network with radial-basis activation function are compared with the teacher's assessment (MV) to determine the error rates.
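The RBF workflow of steps 4.3 to 4.5 can be sketched in plain Python under several assumptions that the paper does not state: Gaussian basis functions with a fixed width, centers sampled from the training examples (the text allows either sampling or k-means), a bias term in the output layer, a tiny ridge term added to the normal equations for numerical stability, and mean absolute deviation from MV as the error measure. All data values and function names are hypothetical.

```python
import math
import random


def rbf_features(x, centers, width):
    """Gaussian activations of the hidden layer, followed by a constant bias term."""
    feats = [math.exp(-sum((a - c) ** 2 for a, c in zip(x, center)) / (2 * width ** 2))
             for center in centers]
    return feats + [1.0]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_output_weights(inputs, targets, centers, width, ridge=1e-8):
    """Step 2: least-squares fit of the linear output layer via the normal equations."""
    phi = [rbf_features(x, centers, width) for x in inputs]
    m = len(phi[0])
    gram = [[sum(row[i] * row[j] for row in phi) + (ridge if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    moment = [sum(row[i] * y for row, y in zip(phi, targets)) for i in range(m)]
    return solve(gram, moment)


def predict(x, centers, width, weights):
    """Network output: weighted sum of the hidden-layer activations."""
    return sum(w * f for w, f in zip(weights, rbf_features(x, centers, width)))


def mean_abs_error(predictions, targets):
    """Assumed error measure: average absolute deviation from the teacher's MV."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical (CD, IT) inputs and teacher valuations MV.
train_x = [(0.1, 0.01), (0.3, 0.02), (0.5, 0.03), (0.7, 0.04), (0.9, 0.05)]
train_mv = [0.1, 0.3, 0.5, 0.7, 0.9]
test_x = [(0.2, 0.015), (0.8, 0.045)]
test_mv = [0.2, 0.8]
WIDTH = 0.3

centers = random.Random(0).sample(train_x, 3)          # step 1: unsupervised choice
weights = fit_output_weights(train_x, train_mv, centers, WIDTH)
predictions = [predict(x, centers, WIDTH, weights) for x in test_x]
error = mean_abs_error(predictions, test_mv)
print(predictions, error)
```

The optional backpropagation fine-tuning step is left out of the sketch; only the center selection and the linear least-squares fit are shown.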
The error of the multilayer perceptron neural network is 14% and that of the neural network with radial-basis activation function is 8% (Table 3).

Table 3. Error of the neural network

Kind of neural network                                    Error (%)
Multilayer perceptron neural network                      14
Neural network with radial-basis activation function      8

RBF appears to be the better tool for assessing the complexity of computer-based tests.

5. Comparison of the results of sample testing and the assessment of the teacher.

The results of the assessment of the teacher and the responses of students in the real test in informatics (consisting of 20 questions) are presented in Table 4.

Table 4. Results of experiments with neural networks

№ of question    Maximum valuation MV    Evaluation of RBF    Adjusted assessment
1                0.1                     0.3                  + 0.2
2                0.2                     0.1                  do not correct
3                0.2                     0.4                  + 0.2
4                0.3                     0.4                  do not correct
5                0.3                     0.1                  - 0.2
6                0.4                     0.4                  do not correct
7                0.4                     0.3                  do not correct
8                0.5                     0.3                  - 0.2
9                0.5                     0.4                  do not correct
10               0.5                     0.7                  + 0.2
11               0.5                     0.5                  do not correct
12               0.6                     0.6                  do not correct
13               0.6                     0.4                  - 0.2
14               0.7                     0.5                  - 0.2
15               0.7                     0.4                  - 0.3
16               0.8                     0.9                  do not correct
17               0.8                     0.6                  - 0.2
18               0.9                     0.7                  - 0.2
19               0.9                     0.9                  do not correct
20               1.0                     0.9                  do not correct

6. Correction of the questions in the test.

In order to improve the structure of the test, the following may be decided:
• To exclude from the test the questions for which there is a great difference in the assessment;
• To change the formulation of some questions;
• To change the answer options of the questions.

In order to improve the structure of the test, it is decided to increase or decrease the maximum valuation (MV) for questions 1, 3, 5, 8, 10, 13, 14, 15, 17, 18.

5. Conclusion

The increasingly wide use of computer-based tests for assessing knowledge is related to a number of problems:
• The complexity of the tasks set by the teacher is inadequate to the real knowledge of the students;
• The analytic relationship between the level of preparation of the students and the complexity of the tasks is incompletely studied.

To solve these problems it is necessary to reassess the complexity of the computer-based tests on the basis of the students' results, and to process and analyze the results obtained with modern technologies based on precise mathematical methods.

Artificial neural networks are widely used in various fields (research, manufacturing, etc.), but still have limited application in education. Their application in this paper allows the solution of problems related to increasing the reliability of the results of computer tests and improving knowledge control. The proposed algorithm allows not only a more accurate assessment of students' knowledge, but also effective monitoring of the whole process of learning. The results could help to improve the curriculum and the administration of computer-based tests.

Acknowledgment

This report has been produced with the financial assistance of the European Social Fund, project number BG051PO001-3.3.06-0014. The author is responsible for the content of this material, and under no circumstances can it be considered an official position of the European Union and the Ministry of Education and Science of Bulgaria.

References

[1] Fischer G., I. Molenaar, Rasch Models. Foundations, Recent Developments and Applications. New York, Berlin, 1997, Springer
[2] Alphen A., R. Halfens, A. Hasman, T. Imbos, "Likert or Rasch? Nothing is more applicable than a good theory", Journal of Advanced Nursing, 1994, 20, http://www.rasch.org/rmt/rmt82d.htm
[3] McAlpine Mhairi, Principles of assessment, Robert Clark Centre for Technological Education, University of Glasgow, CAA Centre, University of Luton, Bluepaper Number 1, 02.2002
[4] Neuro AI, Artificial Neural Networks, Digital Signal Processing, Algorithms and Applications, Neuro AI – Intelligent systems and Neural networks
[5] Rosenblatt, Frank. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, Washington DC, 1961
[6] Schwenker, Friedhelm; Kestler, Hans A.; Palm, Günther (2001). "Three learning phases for radial-basis-function networks". Neural Networks 14