In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study was to determine how different scale types affect rating performance and scores. The raters in this study rated twenty short EFL essays using the two scales, completed a rater cognition questionnaire, and took part in an in-depth interview. The ratings were analyzed using a multi-faceted Rasch analysis to compare essay scores and rater statistics across scales and rater groups. The results indicated that, when using the binary scale, the raters spent less time, and their ratings were less spread out and more consistent. Three out of four raters reported that the binary scale required less mental effort and that they felt more confident in their ratings. Across the two rater groups, the shift in rating performance when using the binary scale was larger for the teacher raters than for the expert raters. This implies that scale design had a greater effect on teacher raters. The overall findings suggest that the binary scale may be a better fit for large-scale assessment, given sufficient rater training.
               