Topic: Assessing learner's writing skills according to CEFR scales
CONTENT

INTRODUCTION ............................................................................ 2
Chapter I. Assessing learner's writing skills according to CEFR scales ................. 4
1.1 Test Development and Setting Cut-Scores in Line With the CEFR ...................... 4
1.2 Model of Communicative Competencies in the NES ..................................... 13
Chapter II. Test Specifications ........................................................ 14
2.1 Overview of the National Educational Standards Writing Competency Descriptors for the Mittlerer Schulabschluss ... 14
2.2 Percentage of Student Responses Classified as "Pass" by Sample, Task, and Rater .... 26
2.3 Aggregated Variance Component Estimates for HSA and MSA Samples .................... 27
III. Conclusion ........................................................................ 34
IV. References ......................................................................... 36
INTRODUCTION

Since its publication in 2001, the Common European Framework of Reference (CEFR; Council of Europe, 2001) has increasingly become a key reference document for language test developers who seek to gain widespread acceptance for their tests within Europe. The CEFR represents a synthesis of key aspects of second and foreign language learning, teaching, and assessment. It primarily serves as a consciousness-raising device for anyone working in these areas and as an instrument for the self-assessment of language ability via calibrated scales. In other words, it is not a how-to guide for developing language tests, even though it can serve as a basis for such endeavors. As a result, many test developers are unsure about how to use the information in the CEFR to design tests that are aligned with the CEFR in both philosophy and practice. This article is situated within this broader context and aims to provide insight into the question of how writing tasks can be aligned with the CEFR levels.
Importantly, writing ability is usually measured using open tasks that can elicit a range of written responses. These responses are generally scored by trained raters using a rating scale that covers several bands or levels of proficiency; we call this a multilevel approach. However, if one needs to determine whether a student has reached one specific level, it is worth exploring an approach in which each task is targeted at one specific level; the students' written responses are then assessed by trained raters who assign a fail/pass rating using level-specific rating instruments; we call this a level-specific approach.
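To make the contrast concrete in notation (an illustrative sketch only; the symbols below are not taken from the study's own specification), the two scoring designs can be written as

\[
\text{multilevel: } X_{nij} \in \{\mathrm{A1}, \mathrm{A2}, \mathrm{B1}, \mathrm{B2}, \mathrm{C1}, \mathrm{C2}\}, \qquad \text{level-specific: } X^{(\ell)}_{nij} \in \{0, 1\},
\]

where $n$ indexes students, $i$ tasks, $j$ raters, and $\ell$ the targeted level (e.g., B1); in the level-specific design, a separate fail/pass (0/1) judgment is made with rating instruments written for level $\ell$ alone.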
This is the task development and rating approach taken in this study. Specifically, the level-specific rating approach constitutes a rather novel approach, especially within the context of test development in line with the CEFR levels. The alignment process described in the Manual for Relating Examinations to the CEFR (Council of Europe, 2009) encompasses several steps, one of which is the "specification" stage, which focuses on task characterization as a prerequisite for the formal standard setting itself. Although this stage focuses on tasks, the standard-setting methods suggested in the Manual that are suitable for the skill of writing are examinee-centered (Council of Europe, 2009, Chapter 6). We therefore aim to make a contribution to the literature by describing key development, implementation, and scoring facets of a test-centered approach to linking level-specific writing tasks to the CEFR.
To address this aim, we first analyze the effect of the level-specific design factors (i.e., tasks, rating criteria, raters, and students) on the variability of the ratings of students' written responses. We then investigate whether these analyses suggest empirically grounded cut-scores that are in alignment with the targeted CEFR proficiency levels of the tasks.
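One way such rating variability is commonly partitioned is through a generalizability-theory variance decomposition. As a generic sketch (assuming, for illustration only, a fully crossed design with students $p$, tasks $t$, and raters $r$; the facets and design of the present study may differ), the observed-score variance is decomposed as

\[
\sigma^2(X_{ptr}) \;=\; \sigma^2_{p} + \sigma^2_{t} + \sigma^2_{r} + \sigma^2_{pt} + \sigma^2_{pr} + \sigma^2_{tr} + \sigma^2_{ptr,e},
\]

and the dependability coefficient for absolute (e.g., pass/fail) decisions based on $n_t$ tasks and $n_r$ raters is

\[
\Phi \;=\; \frac{\sigma^2_{p}}{\sigma^2_{p} + \dfrac{\sigma^2_{t}}{n_t} + \dfrac{\sigma^2_{r}}{n_r} + \dfrac{\sigma^2_{pt}}{n_t} + \dfrac{\sigma^2_{pr}}{n_r} + \dfrac{\sigma^2_{tr}}{n_t n_r} + \dfrac{\sigma^2_{ptr,e}}{n_t n_r}}.
\]

Large variance components for tasks or raters relative to students would indicate that pass/fail decisions depend heavily on which particular tasks and raters happen to be sampled.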
We have organized this article as follows. In the first main section, we present a literature review focused on the significance of the CEFR for language testing and cut-score setting, on different approaches to assessing writing, on research into difficulty-determining task characteristics, and on the influence of rating approaches and rater training on rating variability. In the second main section, we describe the broader background of our study, delineating the large-scale assessment project within which our study is situated and providing details on the development of the writing construct specification and on the task development. In the Methodology section, we describe the design of the study reported here, as well as the rating procedure used for data collection; this is followed by the research questions and the statistical analyses we conducted, with a specific emphasis on generalizability theory (G-theory) and multifaceted Rasch analysis. We then present the empirical results of our research in the fourth main section and conclude the article with a discussion of these results in the context of task development for large-scale writing assessment linked to the CEFR.1
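For readers unfamiliar with the second of these methods, a common formulation of the multifaceted (many-facet) Rasch model for dichotomous fail/pass ratings is shown below; the three facets listed (student ability, task difficulty, rater severity) mirror the design factors named above, but the parameterization actually estimated in this study may include additional facets, such as rating criteria.

\[
\log\!\left(\frac{P_{nij}}{1 - P_{nij}}\right) \;=\; B_n - D_i - C_j,
\]

where $P_{nij}$ is the probability that student $n$ receives a pass from rater $j$ on task $i$, $B_n$ is the student's ability, $D_i$ the task's difficulty, and $C_j$ the rater's severity, all expressed on a common logit scale.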