Proposal view
Proposal Type: Symposium 
Domain: Assessment and Evaluation 
SIG: Assessment and Evaluation 
Type Invited SIG Symposium 
Title Engaging Learners in the Processes of Assessment 
Abstract  
Equipment PC and projector
Keywords Self-regulation
Student perceptions
Teacher assessment 
Chair list
Name Surname Institution Country E-Mail EARLI Number
Jim Ridgway Durham University United Kingdom jim.ridgway@durham.ac.uk  
Organiser list
Name Surname Institution Country E-Mail EARLI Number
Jim Ridgway Durham University United Kingdom jim.ridgway@durham.ac.uk  
Discussant list
Name Surname Institution Country E-Mail EARLI Number
Denise M. Whitelock Open University United Kingdom D.M.Whitelock@open.ac.uk  
Paper Details
Title Situative Alignment of Formative and Summative Assessment Functions to Maximize Engagement and Learning
Abstract  

 


Recent reviews have highlighted the well-established potential of formative feedback on classroom assessments as a means of improving teaching and learning. A key conclusion is that feedback must be both useful and used in order to realize its formative potential. We contend that all participants in educational systems (including students, teachers, and administrators) must engage with the whole range of assessment measures and data through formative assessment practices in order to realize and, moreover, maximize the formative potential of assessment. To this end, feedback must be appropriately useful and used not only for different participants in an educational system but across these participants as well. This presentation will summarize a program of research to coordinate student engagement in order to maximize the formative potential of classroom assessment and external testing. This program of research is distinctive because it employs contemporary situative views of knowing and learning to accomplish widely held formative goals for assessment and testing. Our presentation will first summarize relevant aspects of situative theories and then describe specific assessment innovations emerging from this view before detailing our broader multi-level/multi-method approach. Our framework employs three levels of increasingly formal assessment (close-level quizzes, proximal-level exams, and distal-level tests). In order to maximize engagement, formative and summative functions are aligned within and across levels, and refined across three increasingly formal cycles of design-based research (implementation, experimentation, and evaluation). Data and examples from three separate studies of innovative science curricula are presented to illustrate how this approach can maximize collective discourse, individual understanding, and aggregated achievement, while also providing rigorous evidence of those improvements.
We will also present the latest results from an ongoing study of elementary mathematics that is attempting to document the long-term consequences of this approach when applied to the entire fifth-grade mathematics curriculum across four elementary schools.

Summary  

Several high-profile reviews (Black & Wiliam, 1998; National Research Council, 2001; Shepard, 2001) have drawn new attention to the well-established potential of formative assessment for improving teaching and learning. This appears to have contributed to a steady stream of new studies showing that formative feedback can indeed improve learning. From our perspective, it is no longer necessary to “prove” that formative feedback is valuable. Rather, more attention needs to be directed at a key conclusion of the prior reviews: feedback from assessments must be both useful and used to realize its formative potential. This means that participants in the educational system (including students, teachers, and administrators) must be appropriately engaged in assessment practice if that potential is to be met. Instead of simply documenting improved learning upon the introduction of formative feedback, we believe that researchers should direct more attention to the complex factors that affect engagement in assessment practices in ways that enhance (and, unfortunately, undermine) teaching and learning. This presentation will summarize a program of research that does so.


 


Our program of assessment research is distinctive because it employs contemporary situative views of engagement (e.g., Gee, 2003; Beach, 2003) to accomplish widely held formative goals for assessment and testing. Our presentation will first summarize the aspects of situative theories that make them useful for improving assessment. Because situative views treat all learning as social change, they are well-suited to understanding and addressing the many complex and interactive ways that tests and assessment can affect learning and teaching. A situative view rejects the problematic dichotomy between “formative assessments” and “summative tests”, and instead considers the formative and summative functions associated with any particular assessment practice. A situative view also highlights the different timescales (Lemke, 2000) associated with particular feedback processes, which dramatically affect the usefulness and use of that feedback.


 


We will then describe specific assessment innovations that have emerged from this work, including “discursive feedback rubrics,” and the design-based refinement of “local” assessment theories.  These innovations will be presented in the context of our broader multi-level/multi-method assessment approach.  This approach employs three increasingly formal levels of assessment (i.e., close-level quizzes, proximal-level exams, and distal-level tests).  In order to maximize engagement in feedback, we iteratively align formative and summative functions within and across assessment levels, and across three increasingly formal cycles of design-based research (i.e., implementation, experimentation, and evaluation).


 


Data from three separate studies of innovative science curricula will be summarized.  These include the GenScope introductory genetics curriculum, three inquiry-oriented software programs developed by NASA, and the Quest Atlantis multi-user virtual environment. In each case, we aimed to maximize collective discourse, individual understanding, and aggregated achievement, while also providing rigorous evidence of those improvements.


We will also present the latest results from an ongoing study of elementary mathematics that is attempting to document the long-term consequences of this approach when applied to the entire fifth-grade mathematics curriculum across four elementary schools.


 


References


 


Beach, K. (2003). Learning in complex social situations meets information processing and mental representation: Some consequences for educational assessment. Measurement: Interdisciplinary Research and Perspectives, 1, 149-154.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5, 7-74.

Gee, J. P. (2003). Opportunity to learn: A language-based perspective on assessment. Assessment in Education: Principles, Policy & Practice, 10, 27-46.

Lemke, J. L. (2000). Across the scales of time: Artifacts, activities, and meanings in ecosocial systems. Mind, Culture, and Activity, 7(4), 273-290.

National Research Council (2001b). Classroom assessment and the national science education standards. J. M. Atkin, P. Black, & J. Coffey (Eds.). Washington, DC: National Academy Press.

Shepard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 1-14.

Keywords Assessment
Classroom research
Science education
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Daniel T Hickey Indiana University United States dthickey@indiana.edu   *  
Steven J Zuiker Indiana University United States szuiker@indiana.edu    
Kate Anderson Indiana University United States ka2@indiana.edu    
Title Teachers’ and students’ perceptions of assessments: Are they able to accurately estimate the difficulty levels of assessment items?
Abstract  

In today’s higher education, high quality assessments play an important role. Little is known, however, about the degree to which assessments are correctly aimed at the students’ levels of competence in relation to the defined learning goals. This contribution first reviews previous research into teachers’ and students’ perceptions of item difficulty and their ability to estimate it correctly. The review indicates that teachers tend to overestimate the difficulty of easy items and underestimate the difficulty of difficult items. Students seem to be better estimators of item difficulty. The accuracy of the estimates can be improved by: giving the estimators information about the target group and their earlier assessment results; defining the target group before the estimation process; allowing discussion of the defined target group and its corresponding standards during the estimation process; and training in item construction and estimation. In the subsequent empirical study, the accuracy with which teachers and students in higher education estimate the difficulty levels of assessment items was examined. Results show that teachers are able to estimate the difficulty levels correctly for only a small proportion of the assessment items; they overestimate the difficulty level of most of the items. Students, on the other hand, underestimate their own performances. In addition, the relationships between the students’ perceptions of the difficulty levels of the assessment items and their performances on the assessments were investigated. Results provide evidence that the students who performed best on the assessments underestimated their performances the most. Several explanations are discussed and suggestions for additional research are offered.

Summary  

Introduction


Recently there has been growing attention to academic standards related to the connection of students’ entry-levels and their assessment outcomes. Within the framework of high quality assessments, combined with attention to the sometimes implicit performance standards, conducting research on item difficulty and the perception of teachers and students of item difficulty is relevant. Little is known about the degree to which assessments in higher education are correctly aimed at the students’ levels of competence. Therefore, this contribution focuses on the item difficulty of assessments in higher education. The central question to be answered is whether teachers and students have correct perceptions of the item difficulty.


 


Part I: literature review of previous research into teachers’ perceptions of item difficulty


To find relevant studies, a wide variety of computerised databases were searched, including Educational Resources Information Center (ERIC), ISI Web of Knowledge, Science Direct, Online Contents and Google Scholar. The following keywords were used: difficulty level, assessment difficulty, item difficulty, performance standard, standard setting procedures, item construction, expectancy, accuracy, perceptions and higher education. Next, the ‘snowball method’ was employed: the references in the selected articles were reviewed for additional works. Only a few articles were found to be relevant to the context of our study. Little is known about teachers’ abilities to accurately estimate assessment and item difficulties during assessment and item construction processes. More research on the ability to estimate item difficulty has been done by researchers investigating various aspects of the modified Angoff standard setting method, because estimating the difficulty of items for a certain group of students is an explicit part of that method. In addition to teachers’ perceptions of item difficulty, students’ perceptions have been scrutinised in several studies.


 


Results of the literature review


In summary, although the outcomes of previous research into teachers’ perceptions of item difficulty are not consistent, most research shows that estimating the difficulty of items or assessment standards is a difficult job. Teachers tend to overestimate the difficulty of easy items and underestimate the difficulty of difficult items. The accuracy of the estimates can be improved by: the information the estimators (judges, teachers, assessment constructors) have about the target group (total group of students, borderline group students, average students) and former assessment results; defining the target group before the estimation process; the possibility of discussion amongst the estimators about the defined target group of students and its corresponding standards during the estimation process; the amount of training in item construction; and practice with estimating.


Students seem to be better estimators of item difficulty. Nevertheless, studies of students’ perceptions are also inconclusive.


              


Part II: empirical study investigating the accuracy of teachers’ estimations and students’ perceptions of difficulty


To verify and further investigate the accuracy of teachers’ estimations during assessment and item construction processes, and of students’ perceptions of item and assessment difficulty, an empirical study was conducted. The purposes of this empirical part were (a) to examine the accuracy of the item constructors’ and assessment composers’ estimations of the difficulty levels of the assessment items and the assessment difficulty, (b) to assess to what extent the students’ perceptions of the difficulty levels of the assessment items correspond to the statistical difficulty levels, (c) to examine to what extent the students’ perceptions of the difficulty levels of the assessment items differ from the item constructors’ and assessment composers’ estimations of the item difficulties, and (d) to investigate the relationships between students’ perceptions of the difficulty levels of the assessment items and their performances on the assessment.


 


Method and Instrumentation


Both teachers and students participated in this study. The teachers belonged to three assessment construction groups, consisting of six, seven and four teachers respectively. The students were enrolled in a first year bachelor programme. One group of 223 students completed assessment A and the questionnaire assessing the students’ perceptions of the difficulty levels of the assessment items. Another group of 138 students completed assessment B and the questionnaire. A final group of 198 students completed assessment C and the questionnaire. In addition, 30 students were randomly selected to participate in one of three focus group interviews.


To assess the teachers’ and students’ perceptions of the difficulty level of the assessment items for students, a questionnaire was developed. Every item constructor or assessment composer and student was asked to rate each assessment item by means of a 3-point rating scale ranging from (1) difficult, (2) not difficult and not easy, to (3) easy. The actual difficulty levels of the assessment items were indicated by the p values of each item.


To examine the accuracy of the teachers’ estimations of the difficulty levels of the assessment items, the teachers’ ratings were compared with the actual p values of each assessment item. A similar comparison was made between the students’ perceptions of the item difficulties and the actual item difficulty levels. Rating scale analyses were conducted to map the differences between the item constructors’ and assessment composers’ estimations, the students’ perceptions of the difficulty level of the assessments and the statistical difficulty levels of the items.
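
The comparison between ratings and p values can be illustrated with a minimal sketch. It assumes that an item's p value is the proportion of students who answered the item correctly (the usual meaning in classical item analysis), and it uses hypothetical cut-offs (0.4 and 0.8) to place p values on the 3-point scale; the study reports no such cut-offs, and this simple category match stands in for, and is not, the rating scale analyses that were actually conducted. All function names and data are invented for illustration.

```python
def item_p_values(responses):
    """Proportion of students answering each item correctly.

    responses: one list per student, holding 0 (wrong) or 1 (right) per item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[i] for student in responses) / n_students
            for i in range(n_items)]


def p_to_category(p, hard_below=0.4, easy_above=0.8):
    """Map a p value onto the 3-point scale: 1 difficult, 2 neither, 3 easy.

    The cut-offs are hypothetical; the study does not report any.
    """
    if p < hard_below:
        return 1
    if p > easy_above:
        return 3
    return 2


def classify_estimates(ratings, p_values):
    """Label each rating as accurate, or as over/underestimating difficulty."""
    labels = []
    for rating, p in zip(ratings, p_values):
        actual = p_to_category(p)
        if rating == actual:
            labels.append("accurate")
        elif rating < actual:  # rated as harder than the scores suggest
            labels.append("overestimated")
        else:  # rated as easier than the scores suggest
            labels.append("underestimated")
    return labels


# Invented scores for five students on three items
responses = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1], [1, 0, 0]]
p_values = item_p_values(responses)
print(p_values)
print(classify_estimates([2, 2, 1], p_values))
```

In this invented example the first item is answered correctly by every student but rated "not difficult and not easy", so its difficulty counts as overestimated under the assumed cut-offs.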

Keywords Assessment
Student perceptions
Teacher assessment
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Gerard van de Watering Technische Universiteit Eindhoven Netherlands G.A.v.d.Watering@tm.tue.nl    
Janine van der Rijt Maastricht University Netherlands Janine.Vanderrijt@EDIT.unimaas.nl   *  
Filip Dochy Katholieke Universiteit Leuven Belgium Filip.Dochy@Ped.Kuleuven.be    
David Gijbels University of Antwerp Belgium David.Gijbels@UA.ac.be    
Title Engaging learners in assessment via the use of blogs
Abstract  

This paper presents the results of a qualitative exploration of student teachers’ practices using online web logs (blogs) to support the process of self assessment.  Nineteen students on a teacher education program in Macau participated in the study for two semesters.  The concept of assessment for learning emphasizes constructive feedback and explicit assessment criteria; this was implemented in two technology courses.  A systematic content analysis of students’ journals and portfolios reveals students applying a variety of assessment criteria to judge their learning, including: completion of class exercises; the time taken; their ability to work independently; and their ability to follow in class.


Students were found to value the feedback provided by the tutor and their peers.  They identified enhanced motivation, and social support, as useful components of the feedback, as well as the more obvious cognitive aspects.


Findings suggest that the availability of blogs can facilitate the process of assessment for some, but not all, students. Blogs can function well as a platform, but more work needs to be done to establish the circumstances under which they can be used effectively, to the benefit of all students.

Summary  

Extensive research (Black & Wiliam, 1998) highlights the importance of formative assessment for improving learning. An important target for teacher education is to cultivate a desire in students for learning after graduation and during their professional work. Among the techniques of formative assessment for learning, Boud and Falchikov (2005) consider self-assessment an important capability which students need to acquire in order to become lifelong learners. This paper explores the role of technology in supporting students in this process of self-assessment. The concept of assessment for learning was embedded in a teacher education programme in Macau. The study set out to explore the extent to which learners engage in self-assessment when using blogs, and to better understand the criteria which students use to assess their learning. A seminal paper by Black and Wiliam (1998) shows that improving formative assessment increases student attainment. In a comprehensive review of research literature, they examined studies ranging over different ages, across subjects and over several countries. Their results showed that the use of formative assessment typically produced an effect size of 0.4 to 0.7. James & Pedder (2006) show that if assessment for learning is implemented successfully, it can help develop students’ motivation for learning as an enduring disposition. Black et al. (2003) identify constructive feedback, questioning, self-assessment and peer-assessment as techniques for formative assessment which are conducive to learning. Clearly, students who can evaluate their performance effectively are in a good position to understand and regulate their own learning.


 


Research Methodology and Design


The study focused on two technology courses where students were required to create portfolios, and to record blogs for a period of two semesters. Nineteen year 2 students in a Bachelor of Education programme took part. All but one were male. In the first lesson of the course, students were introduced to the uses of blogs, and to the assessment and grading criteria to be used for individual and group projects. The portfolios were used for assessment, and use of blogs was a course requirement. The primary objective for the use of blogs was to provide a platform for students to evaluate aspects of their taught weekly sessions; students were required to engage in self-assessment as part of the overall teaching and learning strategy.

As individuals, students were required to write weekly journals based on each lesson. They were asked to assess their learning in the lesson and to express their opinions via a blog. Each journal entry was required to be at least 50 words. The tutor and student peers provided feedback based on the content of each journal. A questionnaire was used to collect students’ views on their experiences using blogs.

The content of students’ blogs and of their digital portfolios was analysed in order to investigate the extent to which they engaged in self-assessment via the blogs and also to explore their perceptions of the quality and usefulness of the feedback they received. A systematic content analysis was conducted. Blogs and portfolios were read to determine response categories, and subsequently these response categories were used to classify all the student comments. The frequency of content relevant to students’ evaluations of each lesson was recorded. Categories of assessment criteria which students applied to assess their learning were grouped. For example, the quote “I can finish the worksheets provided in class on time and quite smoothly. (COMP-S1-L2)” was considered to be evidence of self-assessment, and that student’s assessment criteria were taken to be based on time and completion of class exercises.
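
The tallying step of such a content analysis can be sketched as follows. This is only an illustration under stated assumptions: each coded comment is represented as a hypothetical (student, lesson, criterion) tuple, the category names are invented, and the function names are not taken from the study.

```python
from collections import Counter


def students_per_lesson(entries, n_lessons):
    """Count the distinct students with a self-assessment comment in each lesson.

    entries: (student, lesson, criterion) tuples, a hypothetical coding format.
    """
    seen = {lesson: set() for lesson in range(1, n_lessons + 1)}
    for student, lesson, _criterion in entries:
        seen[lesson].add(student)
    return [len(seen[lesson]) for lesson in range(1, n_lessons + 1)]


def criteria_frequencies(entries):
    """Count how often each assessment criterion was coded."""
    return Counter(criterion for _student, _lesson, criterion in entries)


# Invented coded comments, for illustration only
entries = [
    ("S1", 2, "time"),
    ("S1", 2, "completion"),
    ("S2", 2, "completion"),
    ("S3", 3, "independence"),
]
print(students_per_lesson(entries, 3))
print(criteria_frequencies(entries).most_common())
```

A student who mentions two criteria in the same lesson is counted once for that lesson but contributes to both criterion counts.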


 


Results and interpretations


Analysis of the nineteen students’ journals collected on blogs for a period of 14 weeks showed that not all students engaged in self-assessment.  Further, evidence of self-assessment was not found in all lessons.  The distribution of the extent of students’ self-assessment is shown in Table 1.  It can be seen that the number of students practicing self-assessment is quite low on average.  The study confirms Rowntree’s (1987) observation that students develop and use self-assessment skills to variable degrees.  The study also points to the importance of formally training self-assessment skills so that more students will engage with the process (Sadler 1998).


 


Table 1. No. of students who engaged in self-assessment in each lesson

Lesson            1   2   3   4   5   6   7   8   9   10  11  12  13  14
No. of students   0   8   8   4   1   0   1   7   2   3   2   2   4   7



Students apply different criteria to judge their learning and achievement in the course. Among the criteria used, students tend to evaluate their learning or achievement in terms of their facility with software, whether they can finish the class exercises, the time required, their ability to work independently, and whether they can follow the teacher’s instruction in class and the topic of the lesson. The study echoes the finding of Tara (2003) that students often include time and effort as part of their assessment criteria.


 


Students were found to value the feedback provided by the tutor and their peers.  They identified enhanced motivation, and social support, as useful components of the feedback, as well as the more obvious cognitive aspects.


 


Conclusions


The study set out to explore the use of technology as a platform for self-assessment.  It showed that blogs and portfolios worked well with some students but not for all.  Blogs can serve as a convenient platform to communicate feedback, and to engage students in reflective learning, but more work needs to be done to establish the circumstances under which they can be used effectively, to the benefit of all students.

Keywords On-line learning
Peer interaction/friendship tutoring
Student perceptions
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Kan-Kan Chan University of Macau Macau k.k.chan@durham.ac.uk   *  
Jim Ridgway Durham University United Kingdom jim.ridgway@durham.ac.uk    
Title Embedded assessment in sciences: Encouraging thinking skills and metacognition
Abstract  

There is a consensus that the development of thinking skills and metacognition should be a major constituent of teaching toward scientific literacy. Furthermore, we believe that merely teaching thinking skills is not enough; it should take place within the wider framework of enhancing a thinking culture in the classroom, along with embedded assessment. In this study we investigated (1) the culture of a variety of science classes in which embedded assessment was employed, and (2) the effect of embedded assessment on students' higher order thinking skills. By embedded assessment, we mean employing diversified assessment modes and integrating them throughout the learning process. We studied four groups of secondary school students: chemistry majors (grades 11-12), two groups of non-science majors (grades 10-11), and a group of gifted students in a pull-out enrichment program (grades 7-9). All the groups were engaged in science courses that focused on the idea that learners should be active in the process of learning and knowledge construction. Data collection included open-ended pre- and post-questionnaires with a variety of tasks, along with interviews. Our findings indicated differences among groups with regard to thinking culture, ranging from more teacher-centered traditional teaching to mainly open discussions and inquiry that require constant thinking and reasoning. In examining the different skills for each group, we found that posing questions, argumentation, reflection, value judgment, graphing, and metacognition improved for all the groups, with some variation in net gain and significance. In classes where the learning materials and the tasks were situated in more student-centered environments we successfully achieved our goals. However, in traditional teacher-centered environments students only sometimes participated in socioscientific discourse.
Our assumption that attainment of a thinking and assessment culture is affected by the lack of substantial change in teachers' beliefs and practice needs further investigation.

Summary  

There is a consensus that the development of thinking skills and metacognition should be a major constituent of teaching toward scientific literacy. Furthermore, we believe that merely teaching thinking skills is not enough; it should take place within the wider framework of enhancing a thinking culture in the classroom, along with embedded assessment. In this study we investigated (1) the culture of a variety of science classes in which embedded assessment was employed, and (2) the effect of embedded assessment on students' higher order thinking skills. By embedded assessment, we mean employing diversified assessment modes and integrating them throughout the learning process. We studied four groups of secondary school students: chemistry majors (grades 11-12), two groups of non-science majors (grades 10-11), and a group of gifted students in a pull-out enrichment program (grades 7-9). All the groups were engaged in science courses that focused on the idea that learners should be active in the process of learning and knowledge construction. Data collection included open-ended pre- and post-questionnaires with a variety of tasks, along with interviews. Our findings indicated differences among groups with regard to thinking culture, ranging from more teacher-centered traditional teaching to mainly open discussions and inquiry that require constant thinking and reasoning. In examining the different skills for each group, we found that posing questions, argumentation, reflection, value judgment, graphing, and metacognition improved for all the groups, with some variation in net gain and significance. In classes where the learning materials and the tasks were situated in more student-centered environments we successfully achieved our goals. However, in traditional teacher-centered environments students only sometimes participated in socioscientific discourse.
Our assumption that attainment of a thinking and assessment culture is affected by the lack of substantial change in teachers' beliefs and practice needs further investigation.


 



Keywords Assessment
Metacognition
Science education
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Judy Dori Technion, Haifa Israel yjdori@technion.ac.il   *  
Tali Tal Technion, Haifa Israel rtal@techunix.technion.ac.il    
© European Association for Research on Learning and Instruction, 2012 All rights reserved.