On the Assessment of Writing Performance in L1 German in High-Stakes Tests
The Stability of Scale Descriptors in the Rating Scale for the Austrian Matura
The present study describes a first step towards validating the rating scale for assessing L1 German writing in the Austrian Matura exam. After outlining how the scale was developed in the context of the exam reform, it reports on an empirical study of the stability of the scale descriptors. A panel of 100 experienced teachers, none of whom had been trained in the use of the scale, rated the difficulty of the 70 scale descriptors. These data served as the basis for studying overall rater agreement, the correspondence between the empirically scaled sequence of descriptors and the intended sequence, and rater agreement on individual descriptors. The results indicate that the scale should not be used without prior rater training. Agreement among the assessors was greatest for the highest level of the scale: there is relatively high agreement on what constitutes excellence in L1 German writing. The descriptors at the critical pass level were found to function relatively well, although at least two of them turned out to be unstable and should be a focus of rater training. Overall, a high number of stable descriptors was found, which is remarkable given that the assessors had not yet received training in using the scale. The study closes with suggestions for areas of focus in assessor training and for minor improvements to the scale.
This work is licensed under a Creative Commons Attribution 4.0 International License.