Proposal view
Proposal Type: Individual Paper 
Domain: Assessment and Evaluation 
SIG: Assessment and Evaluation 
Type Submitted Paper 
Equipment PC and projector
Paper Details
Title Effects of item headings in aptitude tests: evidence that math-related labels impair students’ performance in deductive reasoning tasks
Abstract
Two studies examine the effects of domain-related labels used as item headings in aptitude tests. Items and exercises aimed at assessing domain-general cognitive abilities (e.g., hypothesis testing in deductive reasoning) are variously labelled under headings such as “math”, “sciences”, “verbal logic”, or “reasoning”. Past research indicates that task labelling may heavily affect solvers’ performance, depending on shared beliefs about the intrinsic difficulty of the domain evoked by the task label. We therefore hypothesized that when deductive reasoning tasks administered for the purpose of testing aptitudes are labelled as diagnostic of math-related abilities, students’ performance may be impaired compared to conditions in which the same tasks are labelled as diagnostic of verbal reasoning skills. In Study 1, a modified version of the Wason Selection Task was labelled as diagnostic of “formal math demonstration” vs. “verbal reasoning” skills, by means of a heading at the top of the page. Results confirm that the rate of incorrect (confirmatory) answers is significantly higher under the math heading than under the verbal heading, independently of students’ perceived past success in math. In Study 2, the Selection Task was inserted in a battery including six other items: three equations labelled as “math” items and three text comprehension items labelled as “verbal reasoning”. We hypothesized and found that when math items are indicated as the most diagnostic of students’ aptitudes, the rate of incorrect (confirmatory) answers on the Selection Task is significantly higher when the task is placed under the “math” rather than the “verbal reasoning” heading, whereas the Selection Task heading makes no difference either when diagnosticity is attributed to the verbal reasoning items or in the control condition. Theoretical and applied implications of the two studies will be discussed.
Summary
Items in aptitude tests are commonly subdivided into sections whose headings refer to different disciplinary domains. Headings are intended to help solvers identify the most appropriate epistemic procedures for accomplishing each task correctly. Unfortunately, tasks aimed at assessing general cognitive processes (e.g., hypothesis testing in deductive reasoning) are labelled quite inconsistently, and appear in aptitude tests under headings as varied as “math”, “sciences”, “verbal logic”, or “reasoning”. Although these tasks plausibly involve transversal abilities that are equally implied in all those domains, the literature provides evidence that solvers’ performance is heavily affected by task labels, even in tasks aimed at assessing very general cognitive processes, depending on shared beliefs about the intrinsic difficulty of each domain. Monteil and Huguet (1) administered a modified version of the Rey-Osterrieth test, normally used in the neuropsychological assessment of spatial working memory, to high-school students, and labelled the task as diagnostic of abilities with either high or low evaluative significance in school (namely, geometry vs. drawing). Results showed that when the task was labelled as diagnostic of geometry abilities, high-achieving students outperformed low achievers, whereas no difference appeared when the task was meant to test drawing abilities (2). In general, similar effects are shown when a task is labelled as diagnostic of highly valued cognitive abilities (e.g., IQ, math skills, etc.) (3, 4, 5). Drawing upon these premises, we hypothesized that when deductive reasoning tasks administered for the purpose of testing aptitudes are labelled as diagnostic of math-related abilities, students’ performance may be impaired compared to conditions in which the same tasks are labelled as diagnostic of verbal reasoning skills.

Study 1: a modified version of the Wason Selection Task (6) was administered to 122 students from an Italian Faculty of Psychology, and presented as an aptitude test for an upcoming course in basic statistics. In two experimental conditions, the task was labelled as diagnostic of “formal math demonstration” vs. “verbal reasoning” skills, by means of a heading at the top of the page, and a bogus message explained why the abilities evoked by the label (and not the alternative ones) were relevant for course achievement. Manipulation checks confirm that the experimental induction was perceived as intended. It was hypothesized that the math label would impair students’ performance, giving rise to a higher rate of incorrect (confirmatory) answers. It was also predicted that this effect might be moderated by students’ reported past success in the math domain. Logistic regression results confirm the expected main effect of the task label: the rate of confirmatory answers is significantly higher under the math heading than under the verbal heading. By contrast, the expected interaction between task label and students’ perceived past success in math does not reach significance, suggesting that, at least in our sample of university students, the math label impairs students’ performance independently of their perceived success in math.
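The dependent measure in Study 1, the rate of incorrect (confirmatory) answers, can be illustrated with a minimal sketch. It assumes the classical abstract form of the Selection Task (rule "if P then Q", four cards), where selecting P and not-Q is the logically correct answer. Treating "P and Q" and "P only" as confirmatory is an assumption based on the classical task; the scoring of the modified version used here is not described in this summary. The function names and card labels are hypothetical.

```python
from collections import Counter

# Assumed scoring for the classical rule "if P then Q".
# These sets are illustrative and not taken from the study materials.
CORRECT = frozenset({"P", "not-Q"})
CONFIRMATORY = (frozenset({"P", "Q"}), frozenset({"P"}))


def classify_selection(selected):
    """Classify one participant's card selection."""
    chosen = frozenset(selected)
    if chosen == CORRECT:
        return "correct"
    if chosen in CONFIRMATORY:
        return "confirmatory"
    return "other"


def confirmatory_rate(selections):
    """Proportion of confirmatory answers in a list of card selections."""
    if not selections:
        raise ValueError("at least one selection is required")
    counts = Counter(classify_selection(s) for s in selections)
    return counts["confirmatory"] / len(selections)
```

Under these assumptions, the rate would be computed separately for the participants in each heading condition, and the binary confirmatory vs. non-confirmatory outcome is the kind of variable a logistic regression takes as its response.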

Study 2: the aim of Study 2 was to discriminate between the effect of task labelling and a general impairing effect due to the diagnosticity attributed to math tasks. Another sample of 109 students from the same Faculty took part in the study. In this case, the Selection Task was inserted in a battery including six other items drawn from admission tests used by Italian universities: three math exercises (equations with one unknown) and three verbal reasoning items (text comprehension and syllogisms). In a 3 (test diagnosticity: math vs. verbal vs. control) x 2 (Wason’s task label: math vs. verbal) experimental design, students were informed that either the items labelled as “math demonstration” or those labelled as “verbal reasoning” were the most diagnostic of the skills needed for the upcoming statistics course, whereas in the control condition all the items were presented as equally diagnostic; the Selection Task was then inserted among either the math or the verbal reasoning items, whose presentation order was counterbalanced. We hypothesized and found a diagnosticity by task label interaction, after controlling for students’ previous grades in math-related disciplines and for students’ academic math and general self-concept (7). In detail, simple contrasts show that when math items are indicated as the most diagnostic of students’ aptitudes for the course, the rate of incorrect (confirmatory) answers on the Selection Task is significantly higher when the task is placed under the “math demonstration” rather than the “verbal reasoning” heading, whereas the Selection Task label makes no difference either when diagnosticity is attributed to the verbal items or in the control condition.
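The 3 x 2 between-subjects layout of Study 2 can be sketched as follows. The factor levels come from the description above; the balanced random assignment, the seed, and the function names are hypothetical, since the summary does not say how participants were allocated to cells.

```python
import itertools
import random
from collections import Counter

DIAGNOSTICITY = ("math", "verbal", "control")  # which items are presented as most diagnostic
TASK_LABEL = ("math", "verbal")                # heading under which the Selection Task appears


def design_cells():
    """Enumerate the six cells of the 3 x 2 design."""
    return list(itertools.product(DIAGNOSTICITY, TASK_LABEL))


def assign(n_participants, seed=0):
    """Return one (diagnosticity, task_label) cell per participant.

    Cells are repeated in turn so that cell sizes differ by at most one,
    then shuffled. This allocation scheme is an assumption, not the
    procedure reported by the author.
    """
    cells = design_cells()
    schedule = [cells[i % len(cells)] for i in range(n_participants)]
    random.Random(seed).shuffle(schedule)
    return schedule


if __name__ == "__main__":
    # With 109 participants, each cell receives 18 or 19 people.
    print(Counter(assign(109)))
```

The diagnosticity by task label interaction reported above corresponds to comparing the two task-label cells within each level of diagnosticity.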

Theoretical and applied implications of the two studies will be discussed.


(1) Monteil J.M. & Huguet P. (1991). Insertion sociale, catégorisation sociale et activités cognitives. Psychologie Française, 36, 35-46.

(2) Huguet P., Brunot S. & Monteil J.-M. (2000). Performance feedback and self-focused attention in the classroom: when past and present interact. Social Psychology of Education, 3, 277–293.

(3) Katz I., Roberts S.O. & Robinson J. M. (1965). Effects of task difficulty, race of administrator, and instructions on digit-symbol performance of negroes. Journal of Personality and Social Psychology, 2, 53–59.

(4) Spencer S.J., Steele C.M. & Quinn D.M. (1999). Stereotype threat and women’s math performance. Journal of Experimental Social Psychology, 35, 4–28.

(5) Marx D.M. & Stapel D.A. (2006). Distinguishing Stereotype Threat From Priming Effects: On the Role of the Social Self and Threat-Based Concerns. Journal of Personality and Social Psychology, 91(2), 243–254.

(6) Wason P.C. (1966). Reasoning. In B. Foss (Ed.), New Horizons in Psychology. Harmondsworth: Penguin.
Keywords Beliefs
Reasoning
Testing
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Carlo Tomasetto University of Bologna Italy carlo.tomasetto@unibo.it   *  
© European Association for Research on Learning and Instruction, 2012 All rights reserved.