Normalization Methods Application1
Normalization
1. Purpose: If a test is conducted in batches with different sets of questions,
the difficulty level may differ between batches. In that event, Normalization
is adopted to make the scores of the batches comparable.
2. Process: The Mean and Standard Deviation are ascertained for the Base Batch
as well as the Targeted Batch. A formula using these figures is applied to the
scores of the Targeted Batch to obtain the Normalized scores. The formula
combines three factors:
A) Proportion of Deviation
B) Difference between Target Value and Average Value
C) Average Value
The elements comprising the above factors are modified to achieve precise
results based on the data (scores of candidates in different Batches), resulting
in different methods. These methods are explained in Annexure “A”.
It is often said that at least 30 samples are needed to be "sure" that the
estimates of the mean and deviation are accurate enough.
Note that even when a sample is taken from a single batch, its size must be
adequate; only then can it represent the approximate distribution of that
batch. (Ref: Central Limit Theorem)
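As a rough illustration of why batch size matters, the standard error of a sample mean is the standard deviation divided by the square root of the sample size. The sketch below assumes a hypothetical batch standard deviation of 12 marks; the function name and the figures are illustrative and not taken from any actual batch.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean for a batch of n scores with deviation sd."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


# Hypothetical standard deviation of 12 marks, for three batch sizes.
for n in (5, 30, 100):
    print(n, round(standard_error(12.0, n), 2))
```

The uncertainty in the estimated mean shrinks only with the square root of the batch size, which is one reason small batches give unreliable figures for normalization.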
The choice of n = 30 as the boundary between small and large samples is only a
rule of thumb. Many books quote (around) this value; for example, Hogg and
Tanis' Probability and Statistical Inference (7e) says
"greater than 25 or 30".
D. In the following diagram, B (data plotted as a dotted line) represents a
Batch with more difficult Questions than A and C.
4. Conclusion: Considering what has been stated above and the various methods
described in Annexure “A”, in our opinion:
a) Normalization cannot be done under the following circumstances, and the
scores need to be left as they are for ranking purposes:
• In case the size of Base Batch or Target Batch is less than 30.
• In case the test Question Papers are not comparable (i.e. with different
subject matter content, different pattern / level)
b) As regards Methods B and C:
• There is a factor “Top 0.1% candidates”. If the absolute value of the
same is expected to be at least 30, then the batch size needs to be
Annexure A
Method A
- Score Normalization using Mean and Standard Deviation of Base / Standard
and Target Batch
Xn = (S2/S1)*(X-Xav) + Yav
Where:
Suffix 1 and suffix 2 denote the two sets of marks.
S represents the standard deviation (S1 for set 1, S2 for set 2).
X and Xav represent the raw score and average score for set 1.
Y and Yav represent the raw score and average score for set 2.
Xn = Normalized score.
Here set 1 is scaled against set 2, which is declared as the standard.
The Batch with the maximum average score, among Batches whose attendance is at
least 70% of the overall average attendance, is considered the Base / Standard
Batch.
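The Method A formula can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name is hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev


def normalize_method_a(target_scores, base_scores):
    """Apply Xn = (S2/S1)*(X - Xav) + Yav to every score in the target batch.

    target_scores is set 1 (to be scaled); base_scores is set 2 (the standard).
    """
    x_av = mean(target_scores)
    y_av = mean(base_scores)
    s1 = pstdev(target_scores)  # assumption: population standard deviation
    s2 = pstdev(base_scores)
    if s1 == 0:
        raise ValueError("target batch has zero deviation; cannot rescale")
    return [(s2 / s1) * (x - x_av) + y_av for x in target_scores]


# Hypothetical marks: the target batch scored lower and with a narrower spread.
print(normalize_method_a([40, 50, 60], [50, 65, 80]))
```

After scaling, the target batch has the same mean and standard deviation as the Base / Standard Batch, while the rank order of candidates within the batch is unchanged.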
Method B
$$\hat{M}_{ij} = \frac{\bar{M}_t^g - M_q^g}{\bar{M}_{ti} - M_{iq}} \times (M_{ij} - M_{iq}) + M_q^{gm}$$
Where,
$\hat{M}_{ij}$ = Normalized marks of the jth candidate in the ith shift (up to 5 decimal places)
$\bar{M}_t^g$ = Average marks of the top 0.1% candidates considering all shifts
$M_q^g$ = Sum of the mean and Standard Deviation of marks of candidates in the exam considering all batches
$\bar{M}_{ti}$ = Average marks of the top 0.1% candidates in the ith shift
$M_{iq}$ = Sum of the mean and Standard Deviation of marks of the ith shift
$M_{ij}$ = Raw / Scaled Marks obtained by the jth candidate in the ith shift
$M_q^{gm}$ = Sum of the mean marks of candidates (in the shift having max. mean) and the Standard Deviation of candidates considering all batches
______________________________________________________________________________
1. In this case, additional elements, such as the Average and Standard
Deviation for the top 0.1% of overall candidates as well as for the targeted
batch, are brought into the picture.
2. The ratio of the difference (Average score of the top 0.1% candidates minus
Average + SD of all candidates' scores) for all shifts to the same difference
for the targeted shift is used. This normalizes the data more precisely when
there is significant variation in the marks scored by the top 0.1% candidates
in different batches.
3. The concept of Base / Standard and Target Batch is maintained.
4. Since the ratio indicated in 2 is used as one of the factors for normalizing
a candidate's score, the difference between the candidate's score and the
average + SD of the scores of the Base Batch is taken, instead of the
difference between the candidate's score and the average score of the Base
Batch.
5. Having taken the proportionate difference as stated above, the Average + SD
of the scores for all batches is added to it. SD is added here because it was
also deducted from the candidate's score while calculating the proportionate
difference.
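The Method B calculation can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names and the `top_fraction` parameter are hypothetical, the population standard deviation is assumed, and the number of "top 0.1%" candidates is assumed to be rounded up with a minimum of one.

```python
import math
from statistics import mean, pstdev


def top_average(marks, top_fraction=0.001):
    """Average of the highest-scoring fraction of candidates (at least one)."""
    # Assumption: the count is rounded up; round() guards against float noise.
    k = max(1, math.ceil(round(len(marks) * top_fraction, 9)))
    return mean(sorted(marks, reverse=True)[:k])


def normalize_method_b(shifts, top_fraction=0.001):
    """shifts maps a shift name to its list of raw marks."""
    pooled = [m for marks in shifts.values() for m in marks]
    pooled_sd = pstdev(pooled)  # assumption: population standard deviation
    m_t_g = top_average(pooled, top_fraction)   # top average, all shifts
    m_q_g = mean(pooled) + pooled_sd            # mean + SD, all shifts
    # Mean of the shift with the highest mean, plus the SD over all batches.
    m_q_gm = max(mean(marks) for marks in shifts.values()) + pooled_sd
    result = {}
    for name, marks in shifts.items():
        m_ti = top_average(marks, top_fraction)  # top average, this shift
        m_iq = mean(marks) + pstdev(marks)       # mean + SD, this shift
        if m_ti == m_iq:
            raise ValueError(f"shift {name!r}: zero denominator")
        ratio = (m_t_g - m_q_g) / (m_ti - m_iq)
        result[name] = [round(ratio * (m - m_iq) + m_q_gm, 5) for m in marks]
    return result


# Hypothetical marks; top_fraction is enlarged because the batches are tiny.
demo = {"easy": [40, 50, 60, 70, 80], "hard": [20, 30, 40, 50, 60]}
print(normalize_method_b(demo, top_fraction=0.2))
```

With the hypothetical data above, the top candidate of each shift is mapped to the same normalized mark, which is the intended effect of the ratio term.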
Method C
$$\hat{M}_{ij} = \frac{\bar{M}_t^g - M_q^g}{\bar{M}_{ti} - M_{iq}} \times (M_{ij} - M_{iq}) + M_q^g$$
Where,
$\hat{M}_{ij}$ = Normalized marks of the jth candidate in the ith shift (up to 5 decimal places)
$\bar{M}_t^g$ = Average marks of the top 0.1% candidates considering all shifts
$M_q^g$ = Sum of the mean and Standard Deviation of marks of candidates in the exam considering all batches
$\bar{M}_{ti}$ = Average marks of the top 0.1% candidates in the ith shift
$M_{iq}$ = Sum of the mean and Standard Deviation of marks of the ith shift
$M_{ij}$ = Raw / Scaled Marks obtained by the jth candidate in the ith shift
____________________________________________________________________________________
Therefore, this method is appropriate where all Batches are equally important or
unique and no Batch can be taken as a Base / Standard Batch.
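A minimal sketch of Method C is given here for illustration. It differs from Method B only in the term added at the end, which uses the mean + SD over all batches instead of a Base Batch figure. The function names and the `top_fraction` parameter are hypothetical, and the population standard deviation and rounding-up of the top 0.1% count are assumptions.

```python
import math
from statistics import mean, pstdev


def top_average(marks, top_fraction=0.001):
    """Average of the highest-scoring fraction of candidates (at least one)."""
    # Assumption: the count is rounded up; round() guards against float noise.
    k = max(1, math.ceil(round(len(marks) * top_fraction, 9)))
    return mean(sorted(marks, reverse=True)[:k])


def normalize_method_c(shifts, top_fraction=0.001):
    """shifts maps a shift name to its list of raw marks."""
    pooled = [m for marks in shifts.values() for m in marks]
    m_t_g = top_average(pooled, top_fraction)  # top average, all shifts
    m_q_g = mean(pooled) + pstdev(pooled)      # mean + SD, all shifts
    result = {}
    for name, marks in shifts.items():
        m_ti = top_average(marks, top_fraction)  # top average, this shift
        m_iq = mean(marks) + pstdev(marks)       # mean + SD, this shift
        if m_ti == m_iq:
            raise ValueError(f"shift {name!r}: zero denominator")
        ratio = (m_t_g - m_q_g) / (m_ti - m_iq)
        # No Base Batch: the all-batch figure m_q_g is added back.
        result[name] = [round(ratio * (m - m_iq) + m_q_g, 5) for m in marks]
    return result


# Hypothetical marks; top_fraction is enlarged because the batches are tiny.
demo = {"easy": [40, 50, 60, 70, 80], "hard": [20, 30, 40, 50, 60]}
print(normalize_method_c(demo, top_fraction=0.2))
```

With this data the top candidate of each shift is mapped to the overall top average, so no single batch acts as the reference.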
$$Y = Y_1 + \frac{Y_2 - Y_1}{X_2 - X_1} \times (X - X_1)$$
Where:
Y = Equated Score rounded up to 2 decimal places
Y1 = Marks corresponding to immediate lower percentile from Batch II
Y2 = Marks corresponding to immediate upper percentile from Batch II
X1 = Immediate lower percentile from Batch II
X2 = Immediate upper percentile from Batch II
X = Percentile of the Candidate of the respective Batch
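The interpolation above can be sketched as follows, assuming Batch II is available as a table of (percentile, marks) pairs sorted by strictly increasing percentile. The function name and the table layout are hypothetical, and Python's built-in `round` is assumed to be an acceptable reading of the 2-decimal rounding.

```python
from bisect import bisect_left


def equate_score(x, reference):
    """Map a candidate's percentile x to equated marks using Batch II.

    reference is a list of (percentile, marks) pairs from Batch II, sorted by
    strictly increasing percentile.
    """
    percentiles = [p for p, _ in reference]
    i = bisect_left(percentiles, x)
    if i < len(reference) and percentiles[i] == x:
        return round(reference[i][1], 2)  # exact match, nothing to interpolate
    if i == 0 or i == len(reference):
        raise ValueError("percentile lies outside the Batch II table")
    (x1, y1), (x2, y2) = reference[i - 1], reference[i]
    # Y = Y1 + (Y2 - Y1) / (X2 - X1) * (X - X1)
    return round(y1 + (y2 - y1) / (x2 - x1) * (x - x1), 2)


# Hypothetical Batch II table: 80th percentile is 60 marks, 90th is 70 marks.
batch_two = [(80.0, 60.0), (90.0, 70.0)]
print(equate_score(85.0, batch_two))
```

A candidate at the 85th percentile falls halfway between the two table rows, so the equated score is halfway between 60 and 70 marks.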