Content, Introduction, Chapter I: Assessing learner's writing skills according to CEFR scales

Date: 03.05.2023

Assessing learner's writing skills according to CEFR scales

Generalizability Theory
Generalizability theory (g-theory) is an extension of classical test theory that identifies how strongly different design facets influence the magnitude of measurement error in assessment procedures. On the one hand, this reduces the amount of unexplained measurement error; on the other hand, it allows researchers to determine by how many levels each design facet should be increased or decreased to attain a desired level of precision for relative or absolute comparisons of students. Because g-theory models are based on variance components, they can provide information about the relative contribution of each design facet to measurement error, but they do not allow estimates of task difficulty, rater severity, or student proficiency to be adjusted for the influence of other design facets.
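The link between variance components and the number of levels per facet can be illustrated with a small decision-study calculation. The following is a minimal sketch that assumes a fully crossed design with all facets treated as random and non-negative variance components; the function name `d_study`, the single-letter effect labels, and the numbers in the usage lines are hypothetical and serve only as illustration.

```python
from math import prod


def d_study(components, n_levels, object_facet="s"):
    """Return (relative, absolute) coefficients for a crossed random design.

    components: dict mapping an effect label such as "s", "r" or "sr" to its
                variance component; every character in a label names one facet.
    n_levels:   dict mapping each facet other than the object of measurement
                to the number of levels planned for it.
    """
    universe = components[object_facet]  # variance of the object of measurement
    rel_error = 0.0
    abs_error = 0.0
    for effect, var in components.items():
        if effect == object_facet:
            continue
        # Each error component shrinks with the number of levels of the
        # facets it involves (the object of measurement is not averaged over).
        divisor = prod(n_levels[f] for f in effect if f != object_facet)
        abs_error += var / divisor
        if object_facet in effect:
            # Only interactions with the object affect relative comparisons.
            rel_error += var / divisor
    relative = universe / (universe + rel_error)
    absolute = universe / (universe + abs_error)
    return relative, absolute


# Illustrative numbers only (assumed, not taken from the study).
illustrative = {"s": 4.0, "r": 1.0, "sr": 2.0}
for n_raters in (1, 2, 4):
    print(n_raters, d_study(illustrative, {"r": n_raters}))
```

Calling the function with different values in `n_levels` shows how precision changes as a facet is increased or decreased, which is the kind of question the paragraph above refers to.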
The types of information that can be obtained from a g-theory analysis depend on the design with which the data were collected. The simplest design is a fully crossed balanced design, in which an identical number of experimental units (i.e., student responses in our case) is randomly assigned to each combination of the design facets, with the rating as the outcome variable of interest. Specifically, for our context, the four design facets of interest from a g-theory perspective are tasks, rating criteria, raters, and students.
Because the Youden Square design just described results in a fully crossed balanced design such that the same four raters rate each student response to each task on all five criteria, the g-theory model for analyzing the variation in the ratings has the following form: (1)
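Written out in conventional g-theory notation, a model with the four facets named here takes the following shape. The symbols X, \mu, and \nu are assumed notation and may differ from the authors' own; the terms are grouped by order of interaction.

```latex
\begin{equation}
\begin{aligned}
X_{stcr} = \mu &+ \nu_{s} + \nu_{t} + \nu_{c} + \nu_{r} \\
               &+ \nu_{st} + \nu_{sc} + \nu_{sr} + \nu_{tc} + \nu_{tr} + \nu_{cr} \\
               &+ \nu_{stc} + \nu_{str} + \nu_{scr} + \nu_{tcr} \\
               &+ \nu_{stcr,e}
\end{aligned}
\tag{1}
\end{equation}
```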
The subscripted model coefficients denote the effects of interest, with the last term representing the confounded effect of error and the four-way interaction; s denotes students, t denotes tasks, c denotes criteria, and r denotes raters. That is, the model in Equation 1 contains four main effects, six two-way interaction effects, four three-way interaction effects, and one error term that is equivalent to the four-way interaction term because there is only one observation per cell at that level. Variance components can be estimated with different approaches. Because the response data in our case are dichotomous (below pass / pass), the assumption of a normally distributed rating is not justified. Consequently, the variance component point estimates were obtained by equating observed and expected mean squares and were interpreted descriptively, not inferentially. The estimation was done using the program GENOVA, which provided the estimates and their standard errors.
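Equating observed and expected mean squares can be sketched for a fully crossed, balanced design with one observation per cell. The code below is a minimal illustration and not GENOVA's implementation. It assumes that all facets are random and that each facet has at least two levels; the function name `variance_components` and the axis-to-facet mapping are hypothetical. Negative estimates are returned as they come out.

```python
from itertools import combinations
from math import prod

import numpy as np


def variance_components(x, facets):
    """Estimate variance components from an array with one axis per facet.

    x:      ratings, e.g. shape (students, tasks, criteria, raters).
    facets: one label character per axis, e.g. "stcr".
    """
    x = np.asarray(x, dtype=float)
    axes = range(x.ndim)
    subsets = [frozenset(c) for r in range(1, x.ndim + 1)
               for c in combinations(axes, r)]

    def marginal_mean(keep):
        # Average over every axis that is not in `keep`.
        drop = tuple(a for a in axes if a not in keep)
        return x.mean(axis=drop, keepdims=True) if drop else x

    mean_square = {}
    for s in subsets:
        # Effect estimates by inclusion-exclusion over the marginal means.
        effect = 0.0
        for r in range(len(s) + 1):
            for g in combinations(sorted(s), r):
                effect = effect + (-1) ** (len(s) - r) * marginal_mean(g)
        sum_squares = (x.size / effect.size) * float((effect ** 2).sum())
        deg_freedom = prod(x.shape[a] - 1 for a in s)
        mean_square[s] = sum_squares / deg_freedom

    def multiplier(s):
        # Number of observations behind each cell mean of effect `s`.
        return prod(x.shape[a] for a in axes if a not in s)

    # Solve the expected-mean-square equations from the highest-order
    # term (confounded with error) down to the main effects.
    sigma2 = {}
    for s in sorted(subsets, key=len, reverse=True):
        higher = sum(multiplier(b) * sigma2[b] for b in sigma2 if s < b)
        sigma2[s] = (mean_square[s] - higher) / multiplier(s)
    return {"".join(facets[a] for a in sorted(s)): v
            for s, v in sigma2.items()}
```

For a four-facet array labeled "stcr", the returned dictionary has 15 entries: four main effects, six two-way interactions, four three-way interactions, and the four-way term confounded with error.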
We performed the g-theory analyses using the model in Equation 1 separately for each of the 13 Youden Square booklets in the multiple-marking study. We then aggregated the resulting variance components across the 13 booklets as described in. The proportion of ratings in each booklet was used as the weight; it varied across booklets because each booklet contained a different number of tasks, owing to the different difficulty levels of the tasks in each booklet.
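The weighting step can be sketched as follows. The exact aggregation procedure is not spelled out in this section, so the sketch assumes a plain weighted mean of each variance component, with each booklet's share of the total number of ratings as its weight. The function name `aggregate_components` and the numbers in the usage lines are hypothetical.

```python
def aggregate_components(booklet_components, n_ratings):
    """Weighted mean of variance components across booklets.

    booklet_components: list of dicts, one per booklet, mapping an effect
                        label to its estimated variance component.
    n_ratings:          number of ratings in each booklet, in the same order.
    """
    total = sum(n_ratings)
    weights = [n / total for n in n_ratings]  # proportion of ratings per booklet
    effects = booklet_components[0].keys()
    return {
        effect: sum(w * comp[effect]
                    for w, comp in zip(weights, booklet_components))
        for effect in effects
    }


# Illustrative numbers only (assumed, not taken from the study).
booklets = [{"s": 1.0, "r": 0.2}, {"s": 3.0, "r": 0.4}]
print(aggregate_components(booklets, n_ratings=[100, 300]))
```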
