Proposal view
Proposal Type: Symposium 
Domain: Assessment and Evaluation 
SIG: Assessment and Evaluation 
Type Invited EARLI Symposium 
Title Large-scale assessment - National and international perspectives 
Abstract

The aim of this symposium is to give an overview of methods, applications, and recent developments in the field of large-scale assessment. For many years, large-scale assessments have been the driving force behind new developments in educational measurement (e.g., application of item-response models). Drawing on representative samples, they provide insights into educational outcomes, their correlations with school and student background variables, and changes across assessment cycles. The four papers in this symposium examine recent methodological and content-related developments in national and international large-scale assessments, as well as their potential for educational research.


Mark Wilson (USA) discusses the relationship between large-scale assessments, small-scale testing, and standards-based assessments, and considers the methodological challenges of the longitudinal perspective. Beno Csapo (Hungary) presents first results from the Hungarian Educational Longitudinal Study for mathematics and reading, which aims to establish a system-wide evaluation and accountability system. Manfred Prenzel (Germany) focuses on the potential that the Programme for International Student Assessment (PISA) holds for educational research, and presents Germany’s longitudinal extensions to the international PISA 2003 assessment. Juergen Baumert (Germany) presents further data from Germany’s follow-up assessment to PISA 2003, focusing on teacher knowledge, teaching, and student progress within the PISA framework.

 
Equipment Overhead projector
PC and projector
Keywords Assessment of competence
Large-scale international assessment projects
Large-scale national assessment projects 
Chair list
Name Surname Institution Country E-Mail EARLI Number
Cordula Artelt Bamberg University Germany cordula.artelt@ppp.uni-bamberg.de  
Organiser list
Name Surname Institution Country E-Mail EARLI Number
Cordula Artelt Bamberg University Germany cordula.artelt@ppp.uni-bamberg.de  
Paper Details
Title On the large scale
Abstract

For many years, large-scale assessments have been the driving force behind new developments in educational measurement. The demands of the large-scale context have been the main driving force behind the move away from the routine application of classical test theory towards the routine use of item response models. At the same time, the limitations of large-scale testing tend to act as a brake on innovation, requiring high levels of efficiency, dependability, and sometimes just plain consistency with the past. In this presentation, I will discuss some of the more recent pressures for change, and tendencies towards inertia, that I see in large-scale testing. I will discuss the effects that the rise of so-called "standards-based" assessments is having on testing in the United States, focusing in particular on reactions to it, such as the development of concepts like "learning performances" and "learning trajectories." These reactions need to be seen as occurring in a context where "small-scale" testing, such as assessment at the classroom and individual scales, is becoming relatively much more important. I will then relate these to technical developments in the field, in particular to longitudinal perspectives on modeling, to issues in vertical equating, and to ways of enabling somewhat less rigid ideas of dimensionality, such as "essential" dimensionality and "thick" variables. I will conclude with some comments on where I see these linked issues leading, both for large- and small-scale testing and within the technical domain.


Keywords Assessment methods
Item response theory (IRT)
Large-scale international assessment projects
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Mark Wilson UC Berkeley United States MarkW@berkeley.edu   *  
Title First Results of the Hungarian Educational Longitudinal Study
Abstract

The number of longitudinal surveys conducted or launched in educational contexts has been growing over the past decade. Alongside the traditional reasons for such studies, which are mostly theoretical and developmental-psychological, new motives related to improving the quality of education now drive this kind of focused work. These include building system-wide evaluation models, improving accountability, and understanding and preventing school failure. The research questions of the first large-scale Hungarian longitudinal educational study, launched in 2003, centered on similar problems. To obtain a comprehensive picture of the 12 years of compulsory schooling, the design of the survey combines longitudinal and cross-sectional aspects. Representative samples of 1st (N1<5200), 5th (N5<4300) and 9th (N9<3755) grade students were drawn, with school classes as the units of sampling. Several questionnaires and tests were administered to the students at the beginning and at the end of the school years to collect data on their cognitive and affective characteristics, school achievement, and social background. By the end of the 2006/07 academic year, data from five survey waves will be available. This paper presents the overall results on the stability of students’ development within the education system, and discusses the role of the factors that predict later achievement and failure. The first analyses emphasize the importance of the early development of mathematics and reading skills. The data confirm the hypothesis that teachers’ evaluation is partly based on their subjective expectations: higher correlations were found between the grades given by teachers over the years than between any other cognitive or affective variables.


Keywords Accountability systems in education
Assessment of competence
Large-scale national assessment projects
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Beno Csapo University of Szeged Hungary csapo@edpsy.u-szeged.hu   *  
Title How PISA can be used for educational research
Abstract

Large-scale assessments like the OECD “Programme for International Student Assessment” (PISA) are based on excellent representative samples. These studies provide information about educational outcomes, their correlations with school and student background variables, and changes between assessment cycles. Attempts to explain differences between countries or subpopulations are limited by the survey design of these studies. On the other hand, the design of PISA can be extended by national options. The national project managers in Germany made extensive use of this opportunity in PISA 2003: in a follow-up study, all the students and an additional sample of two classes from each school were tested again in 2004. The aim of the study was to test explanatory models for the development of mathematics and science competencies under classroom conditions. All the students in this sample completed additional (national) mathematics and science assessments. The students’ parents and their mathematics teachers also completed questionnaires. The design of this study allowed multilevel analysis. The paper presents some of the findings from this study, which show that extended large-scale assessments can help to interpret international comparisons and, at the same time, can contribute significantly to educational research.


Keywords Assessment of competence
Large-scale international assessment projects
Large-scale national assessment projects
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Manfred Prenzel IPN Kiel Germany prenzel@ipn.uni-kiel.de   *  
Title On the Way to Causal Inferences: Teacher Knowledge, Teaching, and Student Progress within the Framework of PISA
Abstract

This presentation describes the longitudinal extension to PISA 2003 in Germany, which included a study of mathematics teachers’ content knowledge and pedagogical content knowledge and how these knowledge components relate to high-quality instruction. The structure of mathematics teachers’ professional knowledge will be analyzed, and structural equation modelling will be used to test the extent to which these knowledge components predict the quality of mathematics instruction and students’ learning gains.

Keywords Assessment of competence
Large-scale international assessment projects
Teacher knowledge
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Juergen Baumert MPI, Berlin Germany sekbaumert@mpib-berlin.mpg.de   *  
© European Association for Research on Learning and Instruction, 2012 All rights reserved.