Proposal view
Proposal Type: Symposium 
Domain: Learning and Cognitive Science 
SIG: Comprehension of Text and Graphics 
Type Submitted Symposium 
Title Eye Tracking as a Means for Detailed Analyses of Multimedia Learning Processes – Part 1 
Abstract

Multimedia learning is defined as building mental representations from materials that involve both verbal (spoken or written text) and pictorial information (static or dynamic visualizations; Mayer, 2005). Many studies on the effectiveness of multimedia learning have been conducted, often inspired by Mayer’s cognitive theory of multimedia learning (see Mayer, 2005) and Sweller’s cognitive load theory (see Sweller, 2005). However, these studies have mainly drawn conclusions about the cognitive effects of different types of multimedia learning materials based on (transfer test) performance measures, and measures of cognitive load and time-on-task, without directly investigating the processes underlying these effects. Hence, the empirical work presented in this double symposium focuses on detailed analyses of the processes underlying the learning effects of different types of multimedia materials by means of eye tracking. Because eye movement data can provide detailed insight into the allocation of (visual) attention and processing demands, eye tracking is a valuable tool for such studies, albeit one that is little used in educational research. In this double symposium, studies are presented that focus on learning from a variety of multimedia materials that include dynamic visualizations, static visualizations, written text, and narrated text, in varying compositions.


 

 
Equipment PC and projector
Keywords Cognitive processes/development
Information processing
Multimedia and hypermedia 
Chair list
Name Surname Institution Country E-Mail EARLI Number
Katharina Scheiter University of Tuebingen Germany k.scheiter@iwm-kmrc.de  
Organiser list
Name Surname Institution Country E-Mail EARLI Number
Katharina Scheiter Eberhard Karls University Tuebingen Germany k.scheiter@iwm-kmrc.de  
Tamara van Gog Open University of The Netherlands Netherlands tamara.vangog@ou.nl  
Peter Gerjets Knowledge Media Research Center Germany pgerjet@gwdg.de  
Discussant list
Name Surname Institution Country E-Mail EARLI Number
Mary Hegarty University of California, Santa Barbara United States hegarty@psych.ucsb.edu  
Paper Details
Title How do learners actually use multiple external representations? An analysis of eye-movements and learning outcomes
Abstract  

Although multiple external representations can have benefits, especially for learning complex and new ideas, they are often not as effective as expected. The present study employs eye-tracking methodology to take a closer look at how learners use different external representations in learning from worked examples, how these activities are related to learning outcomes, and how well the intended cognitive functions of multiple representations match the functions as perceived by the learners. Sixteen (predominantly psychology) students studied worked examples on the application of probability principles, each consisting of a text (the problem formulation), an equation (representing the solution), and a tree diagram (intended to mediate between the concrete text and the highly abstract equation). During the learning phase, the participants’ gazes were recorded. After the learning phase, the participants saw a gaze replay of their viewing behavior and were asked to think aloud. The distribution of fixation durations across the different representations indicates that single representations were not neglected. Rather, the participants switched frequently between different external representations, which might indicate that they are not processed independently of one another. Yet transitions between representations were not per se beneficial. For example, transitions between diagrams and equations were valuable only for learners with better learning prerequisites. For learners with poorer learning prerequisites, frequent transitions apparently indicated mapping difficulties. For all learners, frequent transitions between equation and text were dysfunctional, pointing to the mediating function of the diagrams. Hence, the function of transitions seems to depend on characteristics of the learners as well as on characteristics of the representations involved. Finally, the verbal protocols revealed a mismatch between intended and perceived functions. The results suggest informing learners more fully of the intended functions and/or making the intended functions more salient.

Summary
Aims and Research Questions

Pictures, tables, graphs, equations, and texts: each type of representation can fulfill beneficial cognitive functions that support learning. Representations may complement each other, constrain each other’s interpretation, and even help learners to construct deeper knowledge (Ainsworth, 2006). These functions, however, depend strongly on learners processing each single representation thoroughly, trying to relate the representations to each other, and actually using these functions. Many studies indicate that learners often have difficulty profiting from multiple representations (e.g., Hegarty, Narayanan, & Freitas, 2002; Berthold & Renkl, 2005). It is known that people often cope with complexity by focusing their attention on specific aspects. Against this background, some authors (e.g., Ainsworth, 2006) assume that learners may neglect certain elements of the learning environment in order to reduce complexity. We conducted a pilot study to get a better understanding of

(a) how learners actually use external representations in order to learn,

(b) how viewing behavior is related to learning outcomes, and

(c) how well the intended cognitive functions of multiple representations are used by the learners.

 

Methodology

 

Sample

Sixteen students (9 female, 6 male), predominantly psychology students, participated in this study. Their mean age was M = 24.21 (SD = 4.76).

 

Instruments

The pretest consisted of six questions on procedural knowledge (maximum: 6 points). The computer-based learning environment consisted of a set of 15 slides (7 slides introducing basic probability principles and 8 worked examples). Each example consisted of a word problem, a tree diagram, and an equation. While viewing a gaze replay of their viewing behavior on the first and last worked examples, participants were asked to think aloud (Van Gog, Paas, Van Merriënboer, & Witte, 2005). The post-test consisted of 7 items on procedural knowledge (maximum: 7 points) and 7 items on conceptual knowledge (ratings: 1 = low; 6 = high).

 

Procedure

After completing the pretest, participants entered the learning environment. During the learning phase gazes were recorded. Afterwards, participants were exposed to the gaze replay (stimulated recall). Finally, all participants completed the post-test.

 

Data analysis

For each worked example, eye movements were aggregated in the areas of interest (AOI) text, diagram, and equation. Three eye movement parameters are reported. Cumulative fixation durations are defined as the sum of all fixation durations within an AOI. In this context, they are interpreted as an AOI-specific time-on-task. Gaze durations are calculated by adding all consecutive fixations on an AOI between two transitions. They indicate intra-representational processing. Transition frequencies are defined as the number of eye shifts between AOIs. We used them as a measure of attempts to relate different external representations. The verbal protocols were transcribed. Two trained coders rated the extent to which learners used the functions of multiple representations on 6-point rating scales (1 = hardly discernible; 6 = highly discernible; inter-rater reliability: 91%). The main categories were ‘complementary roles’, ‘constrain interpretation’, and ‘construct deeper understanding’.
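As an illustration only, the three eye movement parameters defined above could be computed from a fixation sequence roughly as follows. This is a minimal Python sketch, not the analysis code used in the study: the data format (a list of AOI label and duration pairs), the example values, and all function names are assumptions. Counting transitions without regard to direction is also an assumption.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical fixation record for one worked example:
# (AOI label, fixation duration in seconds), in temporal order.
fixations = [
    ("text", 0.30), ("text", 0.25), ("diagram", 0.20),
    ("equation", 0.40), ("diagram", 0.35), ("diagram", 0.15),
    ("text", 0.50),
]

def cumulative_fixation_durations(fixations):
    """Sum of all fixation durations within each AOI."""
    totals = defaultdict(float)
    for aoi, duration in fixations:
        totals[aoi] += duration
    return dict(totals)

def gaze_durations(fixations):
    """Summed duration of each uninterrupted run of fixations on one AOI."""
    return [
        (aoi, sum(duration for _, duration in run))
        for aoi, run in groupby(fixations, key=lambda f: f[0])
    ]

def transition_frequencies(fixations):
    """Number of eye shifts between each pair of AOIs (direction ignored)."""
    counts = defaultdict(int)
    labels = [aoi for aoi, _ in fixations]
    for previous, current in zip(labels, labels[1:]):
        if previous != current:
            counts[frozenset((previous, current))] += 1
    return dict(counts)

print(cumulative_fixation_durations(fixations))
print(gaze_durations(fixations))
print(transition_frequencies(fixations))
```

Averaging the gaze durations per AOI and the transition counts per worked example would give values of the kind reported in the Results.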

 

Results

Due to the study’s explorative character, the level of significance was set to .10. Participants spent most of their time on the text (53%); 27% of the time was spent on the diagrams and 16% on the equations. Hence, all types of representation were processed by the learners. Gaze durations (in seconds) were longest for text (M = 4.40; SD = 2.02). They were shorter for equations (M = 1.04; SD = 0.34) and diagrams (M = 1.14; SD = 0.38). The most frequent transition between representations was the one between diagrams and equations (M = 7.96; SD = 3.70; per worked example). Transitions between text and diagram (M = 5.83; SD = 3.99) also occurred rather frequently. Transitions between text and equation (M = 1.14; SD = 0.38) were less frequent.

All correlations reported in the following are partial correlations controlling for time-on-task. Cumulative fixation duration on diagrams was positively correlated with both procedural (r = .56; p < .05) and conceptual knowledge (r = .50; p < .10); gaze duration on diagrams was correlated only with conceptual knowledge (r = .53; p < .10). Transitions between text and equations were negatively correlated with conceptual knowledge (r = -.46; p < .10). As transitions between equations and diagrams were, against our expectations, not positively correlated with conceptual learning, we computed separate correlations for students with low and high prior knowledge. For high prior knowledge students, we found the expected positive correlation (r = .62; p < .10); the opposite was true for students with low prior knowledge (r = -.74; p < .10).
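For readers unfamiliar with partial correlations, a first-order partial correlation between two variables x and y controlling for a third variable z can be obtained from the three pairwise Pearson correlations. The sketch below is illustrative only: the function names and the example values are hypothetical and are not data from the study, and the study's own computation may have differed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation of x and y with the linear influence of z removed.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values for six learners (not taken from the study).
diagram_time = [20.0, 35.0, 28.0, 40.0, 25.0, 31.0]       # seconds on diagrams
conceptual = [2.5, 4.0, 3.0, 4.5, 3.5, 3.0]               # conceptual knowledge rating
time_on_task = [90.0, 120.0, 110.0, 125.0, 100.0, 130.0]  # total learning time

print(partial_correlation(diagram_time, conceptual, time_on_task))
```

When the control variable is uncorrelated with both other variables, the partial correlation reduces to the ordinary Pearson correlation.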

In the analysis of the verbal protocols, rather low ratings were obtained for the use of the respective cognitive functions. Students were mainly (if at all) aware of the complementing function (M = 2.18; SD = 1.03). The constraining function (M = 1.29; SD = 0.19) and the construction function (M = 1.33; SD = 0.44) were hardly used.

 

Theoretical and educational significance

First, we found no support for the notion that single representations are neglected. Rather, the participants switched frequently between different external representations, which might indicate that they are not processed independently of one another.

Second, it was found that time spent on the diagrams (but not on equations or text) was positively associated with learning, suggesting that time-on-task per se was not the crucial factor. Furthermore, transitions between diagrams and equations were found to be valuable for learners with better learning prerequisites. For learners with poorer learning prerequisites, frequent transitions apparently indicated mapping difficulties. Moreover, for all learners, frequent transitions between equation and text were dysfunctional. Ignoring the diagrams that were intended to mediate between the (concrete) texts and the (highly abstract) equations apparently led to a shallow understanding of the principles to be learned.

Finally, the verbal protocols revealed that the learners were hardly aware of any of the cognitive functions. This mismatch suggests informing learners of the intended functions (e.g., by an informed training) and/or making the intended functions more salient (e.g., by prompting, colour-coding, etc.).
Keywords Cognitive processes/development
Information processing
Multimedia and hypermedia
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Rolf Schwonke University of Freiburg Germany rolf.schwonke@psychologie.uni-freiburg.de   *  
Alexander Renkl University of Freiburg Germany renkl@psychologie.uni-freiburg.de    
Kirsten Berthold University of Freiburg Germany kirsten.berthold@ifv.gess.ethz.ch    
Title Prior knowledge and interactive overview structure effects on cognitive load, disorientation and learning
Abstract
This study investigated the effects of the structure of an interactive conceptual map and of learners’ level of prior knowledge on disorientation, cognitive load, and learning. The content to which the interactive conceptual map gave access was a text on the life cycle of a retrovirus (HIV). Two types of map structure were designed: (a) a low level of structure, i.e., a network that displayed the main concepts in an unstructured fashion, and (b) a high level of structure, i.e., a hierarchy that displayed the same concepts according to categories of the domain. Eye movements were recorded during the first minutes of task performance. The results revealed that the hierarchical structure supported greater gains in factual and conceptual knowledge. However, the hierarchical structure led to better conceptual learning (comprehension of the relationships between concepts) only for low prior knowledge learners, whereas it led to better factual learning (information specific to a concept) only for high prior knowledge learners, who could use the hierarchy to focus on detailed information. The results also showed that, for all participants, the network structure imposed a higher cognitive load (i.e., perceived disorientation and complexity) than the hierarchical structure. Analyses of the eye movement data showed that the average fixation duration was longer for the hierarchical than for the network structure. Interestingly, correlation analyses revealed that the average fixation duration was negatively correlated with the mental effort ratings and disorientation scores, but only within the network structure condition.
Summary
Hypertexts are electronic information systems with a non-linear organisation that allows and requires users to determine their own sequence through the information (Conklin, 1987). The demands of using hypertext are assumed to cause cognitive overload and disorientation (Conklin, 1987; Niederhauser, Reynolds, Salmen, & Skolmoski, 2000), which hinder effective learning. However, few empirical studies have assessed disorientation or cognitive load, and even fewer have shown a negative effect of disorientation on performance (Ahuja & Webster, 2001; Otter & Johnson, 2000).

Authors generally argue or expect that high prior knowledge learners benefit more from a flexible hypertext (e.g., a network structure), whereas low prior knowledge learners benefit more from a “well organised hypertext” such as a hierarchical structure (e.g., Chen, Fan, & Macredie, 2006). However, the empirical literature suggests that domain novices perform better when a linear or hierarchical structure is provided, whereas experts do not benefit from any particular type of structure (Calisir & Gurel, 2003; Lee & Lee, 1991; Patel, Drury, & Shalin, 1998; Recker & Pirolli, 1995; Shin, Schallert, & Savenye, 1994).

Our study was conducted within the theoretical framework of cognitive load theory (Paas & Van Merriënboer, 1994; Sweller, Van Merriënboer, & Paas, 1998). We argue that a learner builds an understanding of the hypertext content by establishing coherence between concepts located in different nodes. To reach deep comprehension, learners have to invest substantial resources in processes such as deciding which node to consult next and constructing semantic links between concepts. Hence, a low prior knowledge learner will encounter high extraneous cognitive load, and learning performance will be hindered because few resources remain for germane cognitive load. When an overview displaying a hierarchical structure of the hypertext is introduced, low prior knowledge learners should build a semantic representation based on the overview structure, because the overview directs attention to the main concepts and their semantic relationships.

Concerning the effects of high prior knowledge, existing knowledge structures support learning in non-linear hypertext (i.e., a network) by freeing resources in working memory that can then be devoted to deep processing. On the one hand, prior knowledge should reduce intrinsic cognitive load and thus free resources for handling processes that are not useful for learning (i.e., extraneous cognitive load). On the other hand, prior knowledge should support the processing needed to establish coherence between concepts. High prior knowledge learners would activate prior knowledge to draw elaborative inferences from their knowledge base in order to establish coherence, and hence germane cognitive load would increase.

 

Method

The learning task consisted of studying a computer-based lesson on the life cycle of HIV (i.e., the process by which HIV infects a cell). Two types of lesson were compared, based on the structure of an interactive conceptual map: (a) a hierarchical structure (i.e., the main parts of the lesson were distinguished by categories and the sequence of the process was respected) and (b) a network structure (i.e., the concepts of the lesson were organized “randomly”). Twenty-four individuals (age M = 32.3, SD = 8.05; 15 females and 9 males) volunteered to participate. A pre-test assessed prior knowledge in the domain of cell biology. In order to increase the difference between low and high prior knowledge, the high prior knowledge learners studied, before the learning phase, the life cycle of a virus similar to HIV, which gave them a macrostructure of the life cycle of retroviruses.

Learning performance was assessed by computing the difference between post-test and pre-test scores on questions about the retrovirus life cycle. Two types of knowledge were distinguished: (a) factual knowledge (i.e., information explicitly mentioned in a text section) and (b) conceptual knowledge (i.e., implicit information that can be understood by inferring relationships between pieces of information in different text sections). For each type of question (factual and conceptual), the mental effort invested in answering the questions was measured with a nine-point subjective rating scale.
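The gain-score measure described above (post-test score minus pre-test score, computed separately per knowledge type) can be sketched as follows. This is a hypothetical Python illustration: the record layout, the function name, and all values are assumptions and are not data from the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, one per learner and knowledge type:
# (map structure, prior knowledge level, knowledge type, pre-test, post-test)
records = [
    ("hierarchical", "low", "conceptual", 1, 4),
    ("hierarchical", "low", "conceptual", 2, 4),
    ("network", "low", "conceptual", 1, 2),
    ("hierarchical", "high", "factual", 3, 7),
    ("network", "high", "factual", 3, 5),
]

def mean_gains(records):
    """Mean knowledge gain (post-test minus pre-test) per design cell."""
    gains = defaultdict(list)
    for structure, prior, knowledge_type, pre, post in records:
        gains[(structure, prior, knowledge_type)].append(post - pre)
    return {cell: mean(values) for cell, values in gains.items()}

for cell, gain in mean_gains(records).items():
    print(cell, gain)
```

Grouping by structure, prior knowledge level, and knowledge type mirrors the comparisons reported in the Results.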

Several scales were used to measure the cognitive load experienced during the learning task: mental effort, perceived complexity of the lesson, disorientation (5 items), and mental effort to process the map. Eye movements were recorded as an additional measure to assess cognitive load and the processes involved in processing the conceptual map.

 

Results

No significant differences were observed on the mental effort rating scale, but the level of perceived complexity was higher for the network structure at both levels of prior knowledge. Average disorientation ratings were also lower for the hierarchical structure than for the network structure, but the effect was significant only for the low prior knowledge learners. Analyses were conducted on the average fixation duration on the conceptual map (hierarchical vs. network) recorded during the first minutes of the learning phase. The average fixation duration was longer for the hierarchical than for the network structure. Interestingly, correlation analyses revealed that the average fixation duration was negatively correlated with the mental effort ratings and disorientation scores, but only within the network structure condition.

Concerning performance, for the factual knowledge scores, only the high prior knowledge group benefited more from the hierarchical structure than from the network structure. For the conceptual knowledge scores, the reverse pattern was observed: the low prior knowledge group tended to benefit more from the hierarchical structure, whereas the high prior knowledge learners did not benefit from either type of structure. The ratings of mental effort invested in answering all questions (factual and conceptual) were lower in the hierarchical structure condition, but only for the low prior knowledge learners.

In conclusion, the results corroborated that a hierarchical structure of an interactive conceptual map supports deeper learning for low prior knowledge learners. They also showed that high prior knowledge learners, who have a mental representation consistent with the structure of the conceptual map, may benefit more from the hierarchy by focusing on detailed information (factual knowledge). Overall, the cognitive load measures indicated that a network structure imposes higher cognitive demands. The average fixation duration appeared to be a good measure of cognitive load linked to disorientation.
Keywords Cognitive processes/development
Information processing
Multimedia and hypermedia
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Franck Amadieu University of Toulouse-Le Mirail France amadieu@univ-tlse2.fr   *  
Tamara van Gog Open University of the Netherlands Netherlands tamara.vangog@ou.nl    
Fred Paas Open University of the Netherlands Netherlands fred.paas@ou.nl    
Andre Tricot University of Toulouse-Le Mirail France andre.tricot@toulouse.iufm.fr    
Claudette Marine University of Toulouse-Le Mirail France marine@univ-tlse2.fr    
Title Understanding text and picture content as a unity
Abstract
Multimodality is at the heart of modern visual communication products. The combination of pictures and words as a means of visual expression assumes that readers read pictures on the same level as they read words: a dual-coded message should be interpreted in a dual-coded way. Our experiment tested 24 adults’ choices of sentences intended as captions for pictures. Twelve pictures, each with three sentences as possible captions, were presented to test persons in two different layouts: one with a single picture, the other with three copies of the same picture. Test persons were eye-tracked while they decided which picture-caption combination could be considered most objective and most subjective in terms of content. Analysis of time expenditure showed that test persons used approx. 25% more time on the subjective than on the objective task. Total time used on each task was the same across layouts, but in three-picture layouts test persons spent more time on pictures than on answers, and in one-picture layouts more time on answers than on pictures. Approx. 92% of choices for the objective content relation were as foreseen, and approx. 77% for the subjective. The experiment showed that the interdependence of pictures and text should be considered when designing visual communication and that both layout and complexity of content influence the behavior of the reader.

 
Summary
Graphic communication is created through visual compositions containing texts and pictures. The variety is great, ranging from the simple composition of a photographic portrait with a name underneath it on the back cover of a book to a newspaper spread containing several articles and several pictures, combined through layout with captions or headlines.

Texts are signs of language; pictures are signs of something other than language. The two categories are symbolic and iconic, respectively, in the terminology of Charles Sanders Peirce’s theory of semiosis.

Graphic communication is based on the reader’s ability to interpret these two categories of signs, understand their mutual relation, and create coherent and unified concepts of the meaning of their interconnectedness.

Reading research has not included many projects investigating how readers understand the mutual content relations between pictures and texts, although many books for teaching children to read are based on children’s understanding of the relationship between pictures and words (Rayner, 2001).

To add to the knowledge in this field, we carried out an experiment using eye-tracking equipment to register eye movements during the reading of picture-word compositions. Eye movements indicate cognitive activity (Just & Carpenter, 1976).

      

Hypothesis

When examples of the two categories of signs are presented in one graphic composition, the relation between their contents influences the understanding of the general idea being communicated.

What is communicated through the content of the picture and the text can be more or less easily understood. Therefore, the interpretation of the meaning can vary in a way that will show in the amount of time used. If the picture-text content relation is understood as self-evident, the reader will use less time than on a relation that is not self-evident.

 

Experiment

The experiment was set up with 24 randomly chosen adults, who saw a series of 24 stimuli on a computer. Each stimulus contained one of twelve photographs and four texts: a question and three possible answers. The questions invited the test person to decide which of the three answers would be meaningful as a caption to the photograph if the content relation should be experienced as, respectively, most objective, most subjective, most nonsensical, and most emotionally arousing.

The screen pictures were shown in two different layouts. In layout 1 (Fig. 1), the question was placed in the upper left corner, the picture in the center, and the answers in a column underneath the picture. In layout 2 (Fig. 2), the question was placed as in layout 1 and each picture was shown in three copies: two were placed in the right part of the layout, the third in the lower left corner. Answers were placed as captions under each picture.

 

<<<FIG>>>

 

The photographs were new, unmanipulated press photos borrowed from the Danish newspaper Politiken. They showed places, persons, and situations and were unfamiliar to the test persons.

      

Questions and answers were sentences written for the experiment. The sentences intended as possible answers to the question of whether the content relation was mostly objective were “semi-tautological”, that is, evident in their content relation to the picture. The sentences intended to be chosen as answers for the subjective content relation were sentences that made sense if, for example, one imagined them uttered by an average person who saw the picture and expressed his impression of what he saw. The sentences intended as answers for the nonsensical relation contained no meaning that could create cohesion between picture and words.

Test persons were divided into groups of six. Each group saw six photographic motifs four times, once for each question. If group A saw picture A and question A in layout 1, group B saw picture A with question A in layout 2. Both layouts were equally represented in the series, and their order was randomized.

      

Analysis

We analysed the answers to the most distinctive tasks, namely the questions of which texts were considered most objective and most subjective in relation to the photographs. The experiment resulted in 384 valid data files (4 persons, 4 groups, each seeing 24 screen pictures). A valid data file contained eye movement data for 85% of the time spent on a series of stimuli. The valid files were analyzed statistically using the Areas of Interest method.

 

Results

TIME EXPENDITURE RELATED TO TYPES OF QUESTIONS: The objective question was answered in an average of 2.3 sec. The subjective question was answered in an average of 3.0 sec (25% more time).

 

TIME EXPENDITURE RELATED TO DIFFERENCES IN LAYOUT: We found no differences in total time used between the two layouts for the objective questions, and found the same result for the subjective questions. But in one-picture layouts (Fig. 1), more time was used on the answers.

·         Objective question: approx. 25% more time on answers than pictures

·         Subjective question: approx. 40% more time on answers than pictures

 

In three-picture layouts (Fig. 2), most time was used on the pictures.

·         Objective question: approx. 80% more time on pictures than answers

·         Subjective question: approx. 100% more time on pictures than answers
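
The percentages in the two lists above read most naturally as relative differences in viewing time between two areas of the screen. The following minimal Python sketch shows that reading; the function name and the dwell times are hypothetical, and the assumption that the percentages were computed this way is ours, not stated in the analysis.

```python
def percent_more(time_a, time_b):
    """Percentage by which time spent on area A exceeds time spent on area B."""
    return (time_a - time_b) / time_b * 100.0

# Hypothetical dwell times in seconds for one stimulus (not measured values).
time_on_pictures = 1.6
time_on_answers = 0.8

print(percent_more(time_on_pictures, time_on_answers))
```

Under this reading, "100% more time on pictures than answers" means that pictures were viewed for twice as long as answers.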

      

FORESEEN AND ACTUAL ANSWERS: The experimenters and the test persons agreed on the choice of answer in 92% of the results on objective questions and in 77% of the results on subjective questions.

 

Conclusion

The experiment shows that it is worthwhile to consider the content relationship between pictures and text, because the relationship shapes the understanding of the general idea expressed through the combination of pictures and words.

If reading time should be short and the interpretation of meaning as foreseen, the content relationship between pictures and texts should be “objective”. In these cases, the reader can make several integrative saccades (Holsanova, 2005) between elements in pictures and elements in texts, thereby creating cohesion.

If reading time should be longer and the interpretation of meaning less exact, the content relationship between pictures and texts should be “subjective”, e.g. puzzling.

We also found that repeated use of the same picture affects the reader’s viewing strategies in a profound way.
Keywords Cognitive processes/development
Information processing
Multimedia and hypermedia
Appendices heie_figures.jpg 
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Niels Heie The Graphic Arts Institute of Denmark Denmark nh@dgh.dk   *  
Karen Margrethe Oesterlin The Graphic Arts Institute of Denmark Denmark kmoe@dgh.dk    
Frank Christensen The Graphic Arts Institute of Denmark Denmark fc@dgh.dk    
Title Are there age differences in utilization of illustrations in reading science textbooks?
Abstract
This project aims to study how students in two different age groups read illustrated science textbooks. By tracking their eye movements, one can register how much time these groups spend on the plain text on the one hand and on the illustrations on the other, and one can search for patterns in how, or whether, they switch between the two modalities. The literature on iconotext or multimodal text is largely based on assumptions that are not well documented with regard to eye movements during the reading of such text. Our project can be regarded as a modest start in obtaining such information. In a study of high- and low-achievers, Hannus and Hyönä (1999) found differences in how children integrated text and illustrations. In the present study, we investigate differences between children of different age groups.
Summary
In our study, we aim to test hypotheses about differences between two age groups. The first empirical topic can be formulated as a hypothesis: 9-year-old students spend relatively more time on illustrations in science textbooks than do 12-year-olds. Although there will be many possible interpretations of the results, we will have a starting point for making predictions with a better empirical foundation than traditional assumptions about the relation between illustration and text.

A second topic can be formulated as the hypothesis: Illustrations in science textbooks play a greater supporting role for learning during reading for 9-year-old students than for 12-year-olds. This hypothesis will be tested by examining the children’s back-and-forth looking between specific illustrations and the text passages that are closely linked to them. We will also study possible correlations between the eye movement patterns and the scores on questions about the text, i.e., how learning correlates with the relative time spent on illustrations in each group. The technical equipment used was the SMI iViewX Helmet with Polhemus magnetic head tracking.

 

Procedure

 

Sample

A total of 40 students participated, divided into two age groups: 20 students from 4th grade (9-year-olds) and 20 students from 7th grade (12-year-olds). The students were considered normal achievers by their teachers, and they were also evaluated as normal achievers based on standardized scores on the Raven nonverbal intelligence test and a standardized word chain test.

 

Texts

The texts were two-page passages from the PIRLS 2001 study that contain both text and illustration, where the comprehension of the text is dependent on the illustration and vice versa. In a second set of recordings, we used authentic science textbooks from 4th and 7th grade. The second set of recordings will be used for validation of the text that the two age groups read in common.

 

Procedure

The recording was undertaken in two steps. In the first recording, the students were told to study all information in the textbook passages carefully, and that they should be able to answer some questions about the text after having read it. In the second recording, they were asked to fill out a questionnaire. While answering the questionnaire, they had the opportunity to look at the science book. This two-step procedure was repeated with the authentic science textbooks, giving a total of four recordings for each subject.

 

Data analysis

a) Does the younger group spend relatively more time on illustrations, as compared with text, than the older group?

b) Does the younger group have significantly more transitions between text and illustration than the older group?

c) Does the younger group have significantly more transitions between text and illustration at relevant locations?

The analysis of the recorded material will be undertaken from December 2006 to February 2007.
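The three questions above could be operationalized per student roughly as follows. This is a sketch under stated assumptions: each fixation is taken to be already coded as a tuple `(kind, region_id, duration_ms)` with `kind` either `"text"` or `"illustration"`, and the set `linked` of (text region, illustration region) pairs that belong together is assumed to be hand-coded in advance. These names and the data layout are hypothetical, and the fixations in the usage lines are invented.

```python
def illustration_share(fixations):
    """Question a): proportion of fixation time spent on illustrations."""
    text_ms = sum(d for kind, _, d in fixations if kind == "text")
    illu_ms = sum(d for kind, _, d in fixations if kind == "illustration")
    total = text_ms + illu_ms
    return illu_ms / total if total else 0.0


def count_transitions(fixations):
    """Question b): gaze shifts between text and illustration, either way."""
    kinds = [kind for kind, _, _ in fixations]
    return sum(1 for prev, cur in zip(kinds, kinds[1:]) if prev != cur)


def count_relevant_transitions(fixations, linked):
    """Question c): transitions between a text region and its linked illustration.

    linked: set of (text_region_id, illustration_region_id) pairs
    (assumed to be coded by hand from the material).
    """
    count = 0
    for (k1, r1, _), (k2, r2, _) in zip(fixations, fixations[1:]):
        if k1 == k2:
            continue  # no change of modality, so not a transition
        text_region, illu_region = (r1, r2) if k1 == "text" else (r2, r1)
        if (text_region, illu_region) in linked:
            count += 1
    return count


# Invented fixation sequence for one reader, in temporal order.
toy_fixations = [
    ("text", "t1", 200),
    ("text", "t1", 300),
    ("illustration", "i1", 500),
    ("text", "t2", 250),
    ("illustration", "i2", 250),
]
toy_linked = {("t1", "i1"), ("t2", "i2")}
```

The per-student values would then be compared between the two age groups with whatever group-level test the analysis plan specifies; that step is not shown here.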

 

 

Theoretical and Educational significance

Mainstream theories of reading tend to treat written text as a representation of spoken language, a view that is apparent when teachers suggest audio books for students who struggle with reading. Experience tells us that such attempts most often fail. Illustrated science textbooks, like many other written texts, yield their meaning only when they are read, not when they are merely listened to. The theoretical significance of this study is therefore to show how meaning potential is realized differently in multimodal texts. The educational significance of the study depends on its results and is closely related to the theoretical significance: the insight that the science textbook must be taken seriously as a multimodal, written text has strong implications for how to work with the combination of text and illustration. A second aspect of both theoretical and practical significance is whether the fixation patterns can be described dynamically by a nuanced model of skill, defined as a flexible combination of automaticity and awareness (Tønnessen, 1999).

 

 

Hannus, M., & Hyönä, J. (1999). Utilization of illustrations during learning of science textbook passages among low- and high-ability children. Contemporary Educational Psychology, 24, 95–123.

Tønnessen, F.E. (1999). Awareness and automaticity in reading. In I. Lundberg, F.E. Tønnessen & I. Austad (Eds.), Dyslexia: Advances in theory and practice (pp. 91–99). Dordrecht, Boston, London: Kluwer Academic Publishers.

 
Keywords Cognitive processes/development
Information processing
Multimedia and hypermedia
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Oddny Judith Solheim University of Stavanger Norway oddny.j.solheim@uis.no   *  
Marianne Roskeland University of Stavanger Norway Marianne.roskeland@uis.no    
Per Henning Uppstad University of Stavanger Norway per.h.uppstad@uis.no    
Title Newspaper reading, eye tracking and multimodality
Abstract
Readers’ visual interaction with multimodal documents has been investigated in four eye-tracking studies on newspaper reading. Multimodal documents are divided into information graphics and articles containing text, photos and photo captions. Analyses of eye movement data from two experimental studies show that the spatial layout of information graphics affects reading style, amount of reading, and fixation order. Results from one study show that the amount of reading in information graphics is positively correlated with comprehension of the graphics. In the case of news articles containing photos, two studies provide evidence that photo size and photo content have no general significant effect on the reading time of related textual content. Analyses of scanpaths between picture objects and text objects within the same newspaper article show that text elements such as headlines and intros are fixated first, followed by a large number of transitions to pictures, and thereafter a large number of transitions to article text. Experimental results from one study show that shorter and easier article texts are associated with greater reading depth and better article comprehension.
Summary
We will present an overview of the eye-tracking studies of newspaper reading that our laboratory has conducted over the past five years, with a special emphasis on the role of pictures and information graphics in relation to the text.

In close co-operation with local newspapers, we have recorded the eye movements of 150 newspaper readers, who read authentic-looking newspapers with built-in experimental conditions. In our latest study, we compared reading an ordinary printed newspaper with reading on a digital news tablet.

Important methodological features of our studies are that readers can turn pages freely and that they believe they are reading real newspapers. The eye-tracker used was the SMI iView X Headset plus Polhemus magnetic headtracking. In some of the studies, comprehension tests were given to readers after the eye-tracking data had been recorded. Retrospective interview data have been collected from a subset of the participants.

 

Data analysis

The data analysis is organized in two steps.

1) Each newspaper spread is divided into a number of objects (so-called AOIs) with specific attributes:

-            Object position (pixel coordinates on newspaper spread)

-            Object extension (pixels or word count)

-            Object type (e.g. picture, headline, advertisement)

-            Object content (specified with keywords)

 

2) The newspaper objects are then related to the eye-tracking data in order to derive the following standard reading measures:

-            Accumulated fixation time on each object (absolute time)

-            Accumulated fixation time, relative to object extension (reading depth)

-            Fixation order between all objects on the same spread

-            Transitions between objects (i.e. “eye movement traffic”)

-            Heatmap showing the spatial distribution of eye movements over the spread

 

The data analysis is designed to produce both general results about newspaper reading behaviour, and specific results about eye movements in multimodal texts.
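The two analysis steps described above can be sketched as follows. This is an illustrative outline only: rectangular AOIs, the field names of the `AOI` class, and fixations given as `(x, y, duration_ms)` tuples in spread pixel coordinates are all assumptions, not a description of the software actually used. The heatmap measure is left out, and fixations that fall outside every AOI are simply skipped.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """One newspaper object (step 1). Field names are hypothetical."""
    name: str
    kind: str         # e.g. "picture", "headline", "advertisement"
    x: int            # left edge, pixel coordinates on the spread
    y: int            # top edge
    width: int
    height: int
    extension: float  # object size in pixels or word count

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def find_aoi(aois, px, py):
    """Return the name of the first AOI containing the point, else None."""
    for aoi in aois:
        if aoi.contains(px, py):
            return aoi.name
    return None


def reading_measures(aois, fixations):
    """Step 2: derive standard measures from (x, y, duration_ms) fixations."""
    dwell = Counter()        # accumulated fixation time per object
    order = []               # objects in order of first fixation
    transitions = Counter()  # (from_object, to_object) -> count
    previous = None
    for px, py, duration in fixations:
        name = find_aoi(aois, px, py)
        if name is None:
            continue  # fixation outside every AOI
        dwell[name] += duration
        if name not in order:
            order.append(name)
        if previous is not None and previous != name:
            transitions[(previous, name)] += 1
        previous = name
    extension = {a.name: a.extension for a in aois}
    # Reading depth: accumulated time relative to object extension.
    depth = {n: t / extension[n] for n, t in dwell.items() if extension[n] > 0}
    return dict(dwell), depth, order, dict(transitions)


# Invented spread with two objects and four fixations.
toy_aois = [
    AOI("headline", "headline", 0, 0, 100, 20, extension=10),   # 10 words
    AOI("photo", "picture", 0, 20, 100, 80, extension=8000),    # 8000 pixels
]
toy_fix = [(10, 10, 200), (50, 50, 400), (20, 5, 100), (500, 500, 100)]
toy_dwell, toy_depth, toy_order, toy_trans = reading_measures(toy_aois, toy_fix)
```

Because extension may be measured in pixels for pictures and in words for text, reading depth values are only comparable between objects that share the same unit.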

 

Results

General results show that short portions of text are highly attractive to news readers. Short texts such as tickers, headlines, and intros are looked at earlier on the spread than pictures. Short texts also tend to receive more accumulated reading time relative to object extension, which means that readers read deeper into short texts than into long texts.

In our latest newspaper eye-tracking study, text length and text difficulty were tested experimentally in a number of articles that were written in different variants for each condition. The results of the experiment show that short and less difficult texts facilitate reading depth and article comprehension for tabloid readers. In digital tablets, reading depth was equal regardless of text length, whereas text comprehension was poorer for long and difficult texts.

Concerning the effects of newspaper medium, an important finding is that the time spent reading the average article is significantly greater in a digital news tablet than in printed newspapers. In this respect, news reading on digital tablets shows the same tendency as reading in online newspapers. Unlike in online newspapers, however, the amount of scanning is lower in digital news tablets than in paper newspapers.

A possible explanation for the finding that reading depth is greater in digital news tablets and online papers is the use of hyperlinks in digital media. An analysis of news tablet reading showed that using links produced longer viewing times than turning pages. This suggests that news readers today are adept at exploiting links to navigate to the information that they find interesting.

Results on multimodal objects show that photographs receive an average viewing time of 1.3 seconds, whereas information graphics receive considerably more accumulated viewing time due to the close integration of textual and pictorial content. Another general difference between photos and info graphics is that photos are typically fixated earlier than info graphics in printed newspapers.

There is evidence that the spatial arrangement of the elements within an information graphics object has significant effects on readers’ eye movements. Information graphics with elements arranged in a serial “reading” order were read more extensively and effectively than graphics in which the same elements were arranged in a radial order. It is reasonable to infer that spatial layout also has an effect on comprehension, even though this relationship has not yet been established unequivocally.

A typical scanpath in a multimodal article begins with the headline and intro objects. A large number of transitions lead forward from the headline to the article photo and the photo caption. A large number of transitions are also found between the photo and the article text, providing some evidence of multimodal information integration in news readers. A fine-grained content analysis of picture-text transitions will be implemented soon.
Keywords Cognitive processes/development
Information processing
Multimedia and hypermedia
Appendices
Authors
Name Surname Institution Country e-mail EARLI Number Presenting
Kenneth Holmqvist Lund University Sweden humlab@sol.lu.se   *  
Jana Holsanova Lund University Sweden jana.holsanova@lucs.lu.se    
Nils Holmberg Lund University Sweden nils.holmberg@gmail.com    
© European Association for Research on Learning and Instruction, 2012 All rights reserved.