| Proposal Type: | Individual Paper |
|---|---|
| Domain: | Assessment and Evaluation |
| SIG: | Assessment and Evaluation |
| Type | Submitted Paper |
| Equipment | PC and projector |
| Paper Details |
|---|
| Title | Competency profiles from standard assessments |
|---|---|
| Abstract | Recently for
The present contribution addresses the question of whether results can be reported simultaneously on the five big ideas and the six content-related competencies of the test. The data to be analysed were collected on a second day of testing following the PISA 2006 assessment in |
| Summary | As a consequence of the mediocre German results in international large-scale assessments like TIMSS, With the PISA 2006 assessment in The question addressed in this paper is whether and how results can be reported on the big ideas as well as on the content-related competencies for each student or for relevant groups of students. More specifically, it asks how to model competency profiles using item response theory models suitable for large-scale assessment studies. What differential information do complex competency profiles provide, how reliable are the reported results, and how complex can profiles be while still being reported with sufficient reliability? The approach used to model the profiles utilizes multidimensional IRT models for large-scale assessment studies. These models rely on the item responses and on a population model for the joint distribution of all competencies and other student characteristics such as gender, socio-economic status and parental support. Differentiating the five big ideas requires a five-dimensional response model in which different items are modelled to assess each dimension. A model of the six content-related competencies assigns some items to assess several competencies at the same time. A complete model of both aspects, big ideas and content-related topics, would have to define five by six dimensions, one for each combination of big idea and competency. However, with about 70 items per student, values on 30 dimensions cannot be very reliable. Alternative models with reduced complexity are therefore defined and their fit to the data is compared. For example, assuming that the combinations of big ideas and competencies yield no differential information, a model with eleven dimensions, one for each big idea and one for each content-related competency, may be specified. Another alternative is to assume that some of these combinations do in fact show differences between students whereas others do not.
For example, within the big idea “stochastical data”, a further differentiation may not be supported by the data. Within the big idea “Algebra”, the competencies that show differences between students are not the same as within the big idea “space and shape”. The paper presents results on the empirical fit of the alternative models, on the differences these models uncover between students and groups of students, and on the reliabilities of these differences.
The implications for policy are quite obvious: From a more general perspective, the level of detail at which standard test results can be reported is crucial for their impact on the process of quality improvement in the educational system. From a more assessment-focused perspective, the approach presented here is used to evaluate the standard test developed for mathematics, in the sense that it analyses the discriminative power of the instrument. Both perspectives apply equally to testing standards in other content areas such as science, reading or a foreign language. |
| Keywords | Assessment of competence; Item response theory (IRT); Large-scale national assessment projects |
| Appendices | |
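
The dimension counts discussed in the Summary (5, 6, 30 and 11 dimensions) can be illustrated with a small sketch that builds item-by-dimension loading matrices for the alternative models and counts how many items inform each dimension. This is only an illustration under stated assumptions, not the authors' model specification: the even rotation of items over the 30 combinations is hypothetical, each item is assigned to a single competency for simplicity (the Summary notes that some items assess several competencies at once), and all function and model names are invented for this example. Only the figures of about 70 items, five big ideas and six competencies come from the text.

```python
from itertools import product

N_ITEMS = 70       # "about 70 items for each student" (from the Summary)
BIG_IDEAS = 5      # number of big ideas (from the Summary)
COMPETENCIES = 6   # number of content-related competencies (from the Summary)


def assign_items(n_items=N_ITEMS):
    """Hypothetical assignment: rotate items evenly over all
    (big idea, competency) combinations. A real test design would differ."""
    cells = list(product(range(BIG_IDEAS), range(COMPETENCIES)))
    return [cells[i % len(cells)] for i in range(n_items)]


def loading_matrix(items, model):
    """Return one 0/1 row per item, marking the dimensions it loads on."""
    rows = []
    for idea, comp in items:
        if model == "big_ideas":        # 5 dimensions
            row = [0] * BIG_IDEAS
            row[idea] = 1
        elif model == "competencies":   # 6 dimensions
            row = [0] * COMPETENCIES
            row[comp] = 1
        elif model == "crossed":        # 5 x 6 = 30 dimensions
            row = [0] * (BIG_IDEAS * COMPETENCIES)
            row[idea * COMPETENCIES + comp] = 1
        elif model == "additive":       # 5 + 6 = 11 dimensions
            row = [0] * (BIG_IDEAS + COMPETENCIES)
            row[idea] = 1
            row[BIG_IDEAS + comp] = 1
        else:
            raise ValueError(f"unknown model: {model}")
        rows.append(row)
    return rows


def items_per_dimension(matrix):
    """Column sums: how many items carry information on each dimension."""
    return [sum(col) for col in zip(*matrix)]


if __name__ == "__main__":
    items = assign_items()
    for model in ("big_ideas", "competencies", "crossed", "additive"):
        counts = items_per_dimension(loading_matrix(items, model))
        print(model, len(counts), "dimensions; items per dimension:",
              min(counts), "to", max(counts))
```

Under this hypothetical design, each dimension of the fully crossed model is informed by only two or three items, which is the intuition behind the Summary's remark that values on 30 dimensions cannot be very reliable. The additive model lets every item contribute to one big idea and one competency dimension.
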
| Authors | ||||||
|---|---|---|---|---|---|---|
| Name | Surname | Institution | Country | EARLI Number | Presenting | |
| Claus H. | Carstensen | IPN Kiel | Germany | carstensen@ipn.uni-kiel.de | * | |
| Andreas | Frey | IPN Kiel | Germany | frey@ipn.uni-kiel.de | ||

