
Global Attention in Reading Comics: Eye-movement Indications of Interplay between Narrative Content and Layout

By Kai Mikkonen and Olli Philippe Lautenbacher


It is a widely accepted notion in research on comics that variation in page layout, i.e. the spatial arrangement of the page, affects the order of reading. Recent empirical findings have also confirmed that deviation from the conventional grid layout of evenly arranged panels can transform viewing patterns (Cohn “Navigating Comics”; Cohn and Campbell). Large panels can temporarily “block” the Z-shaped reading pattern inherited from linear text reading in Western languages, for instance, and distance between panels, or borderless arrangements, may imply non-Z-path readings.[1] Layout choices could therefore encourage readers to deviate from the Z-path that functions as the default reading pattern of comics. It is less certain how layout can direct readers’ attention and affect their comprehension of the content, however. Many researchers in the field have argued, but without empirical evidence, that layout has a significant impact on the meaning of the story, claiming for instance that a dynamic page layout, one that is modified in the course of the story, supports narrative content (Peeters; Chavanne). The opposite has also been claimed, that the narrative content of the images may affect the meaning of the layout. Hence, layout choices become meaningful because of this narrative content (Postema 29). Some recent empirical research on comics and reading habits also shares the same general hypothesis that distinctions in layout style, such as types of bleeding (panels without frames that extend “beyond” the edge of the page), serve narrative functions and correlate with and co-construct narrative readings (Bateman et al.).

Although there seems to be general agreement among empiricists and theorists alike about the effect of layout on reading order in the case of comics, there is also some disagreement vis-à-vis the impact of layout on narrative meaning. This divergence is partly a matter of degree, concerning how far layout could be said to interact with narrative content, but it is also substantial, concerning the forms of attention and comprehension that matter in reading comics. Open questions involve, in particular, the following aspects of reading comics:

  • the extent to which narrative and visual content may direct or constrain the order of reading
  • the ways in which layout contributes to or even creates narrative meanings in its interplay with the image content
  • the kinds of attention (beyond the linear Z-path) that are prompted in the reading of longer comic formats.

The varying views on these issues perhaps derive from differences between empirical research based on quantitative data on the one hand and qualitative theory-based statements on the other.[2] Thus far, however, there has been little empirical work done on the interplay between page layout and narrative content in comics. Many previous eye-tracking studies on comics conducted in the fields of psychology and linguistics focus on the image sequence. Some recent studies have also examined readers’ scanpaths with a view to optimizing the readability of comics on handheld digital displays (Rigaud et al.; Augereau et al.). Perhaps the most significant empirical research to date, undertaken by Neil Cohn and his colleagues, focused on sequences of empty panel units (Cohn “Navigating Comics”; Cohn and Campbell; Foulsham et al.). The findings reveal how the layout structure in itself may strengthen Z-path reading, as well as how the sequence of panel images could affect fixations and increase gaze consistency. Cohn’s research conspicuously leaves the question of narrative content aside, however.

The challenge of integrating narrative content in empirical research based on eye movements is real, but not insurmountable. One major problem in all empirical research based on eye tracking is that data about eye movements cannot be directly equated with forms of attention and comprehension. The first step with points of fixation is to move beyond their connection to reading order and instead interpret sequences of fixations and saccades, as well as their amplitude and duration, as indications of attention and thus coextensive with cognitive processing. What can be done, then, is to compare readers’ scanpaths (sequences of fixations and saccades[3]), detect their reading patterns from these comparisons, and then assess how these patterns relate to the visual components of the images, or the gist of the narrative scene. Any discrepancies between the linear reading path, which is supported by the layout, and the readers’ scanpaths, including large saccades across the composition, jumps forward, see-sawing and regression patterns (i.e. the reader returns to previous panels), are then significant. Therefore, by comparing the readers’ eye-movement patterns with deviations from Z-path reading and relating these findings to layout style and the visual and/or narrative content of the images, we developed an empirically informed interpretation of global attention in reading comics.

We hold that this approach could also benefit research on multimodal documents other than comics, thereby developing a more holistic understanding of what directs and constrains attention in reading.

Layout, Narrative Content, and States of Attention

Our analysis of eye movements on comic spreads focused on the deviations from the left-to-right, top-to-bottom reading path that cannot simply be attributed to layout, and where there was reason to suspect that the subjects chose alternative pathways due to an interrelation between the layout and the narrative content of the images. It is worth noting, however, that some of the content features that drew our attention in the gaze data, such as the place and shape of the speech balloons or the image frame, or bleeds, can also be perceived as layout choices. We therefore also find it problematic to claim, specifically in relation to longer formats in comics, that page layout is an independent structure and separate from meaning (see Cohn, “Navigating Comics” 1, 14; “Architecture of Visual” 5). Apparently simple manipulations of traditional panels (such as choosing vertical frames instead of horizontal ones) could be considered at the least a meaning-echoing layout.

Our study focused on the reading of double-page spreads in comics. The choice of stimuli reflected the need to investigate the implications of eye movement with regard to the readers’ states of attention and comprehension. One motive for this was that we were interested in empirically testing the common theoretical assumption in comics studies that it is possible to distinguish between linear and tabular forms of composition or phases of reading (first defined in Fresnault-Deruelle; also Mikkonen 36-37, 64-66). The notion of tabular reading has been used to refer to specific features in the composition of comics, such as when the layout choices reflect a character’s mental state (Fresnault-Deruelle 22-23), which invite a non-linear reading of the panels. A broader spatial arrangement, instead of the panel sequence, merits a more global focus and appreciation. As far as we know, the theoretical assumptions about “tabular” (or synchronic) reading in comics have never been tested empirically.

Empirical research on reading paths and principles of comprehension in the case of comics has previously focused mainly on the Z-path, or the most salient points of attention. In fact, Neil Cohn points out that his findings on the dominance of the Z-path in readers’ navigation across comic page layouts “go against the idea that comic pages are comprehended ‘holistically’”, or that readers would move across layouts in a more or less erratic fashion (“Reading without Words” 576). Although our research similarly shows that readers do not move erratically across layouts, but rather follow systematic patterns, our gaze data also strongly points to the relevance of what is referred to in the psychological study of attention as the global state of attention.

Psychological theory related to scene perception has shown how certain eye-movement patterns can reflect covert attention states, which are characterized by qualitatively different information-processing functions. In his groundbreaking investigation, Buswell (1935) was able to identify two general patterns of perception in relation to complex scenes: a general survey, in which the “eye moves with a series of relatively short pauses over the main portions of the picture”, and a second type of pattern, in which “series of fixations, usually longer in duration, are concentrated over small areas of the picture, evidencing detailed examination of those sections” (142). In this respect, it has become conventional to make a distinction between local and global states of attention (Liechty; Wedel et al.). More specifically, our division of global versus local states of attention relies on Liechty, Pieters, and Wedel’s (Liechty et al.; Wedel et al.) definition of these terms, as derived from their hidden Markov model of attention switching during scene perception. This model “distinguishes between two unobservable, latent states of attention” (designated local and global), between which viewers can regularly switch while observing a picture scene, such as print advertisements in magazines that contain both text and pictorial scene information, and which are “reflected in distinct patterns of fixations and saccades over time” (Wedel et al. 129). According to this definition, local attention focuses on specific aspects and details of the scene and examines its content in greater visual detail, whereas global attention explores “the informative and perceptually salient areas of the scene”, and possibly integrates the information contained therein (Liechty et al. 519).

In further support of this distinction, we turn to eye-tracking research in psychology in which Mallorie Leinenger and the late pioneer in the eye-tracking methodology of reading, Keith Rayner, among others, argue: “While readers may have the subjective experience that they are able to see the entire page when fixating on it, the fact that they do move their eyes around the page making a series of fixations suggests that they are actually unable to process the entire page in one fixation” (283). As Leinenger and Rayner further argue, given that readers do not fixate every single letter on the page, it is relevant to ask what the area of effective visual processing during reading is. It has therefore been possible in eye-tracking research to explore the dynamics of attentional allocation and the perceptual span in reading text in some detail. This basic observation is relevant here even if we are dealing with the complex unit of words and images on a comics page and double-page spread, where one could surmise that the reader’s eye movements are under the control not only of linguistic processing but also of visual processing. What interests us, specifically, are the readers’ eye-movement patterns around the comics page that deviate from the linear Z-pattern of reading and/or make longer (nonlocal) saccades across larger areas of the stimulus. Because of the size of the double-page spread in our stimuli, we presume that global attention requires several fixations separated by larger distances, hence longer “global” saccades. We considered both the greater amplitude and the nonlocal aspect of the saccades as possible indications of global attention. The shorter duration of the fixations could be an additional indication of global states of attention, or of switching attention states, but we focused here on the order and place of the fixations.

The area covered by foveal vision (with best acuity) is approximately 2 degrees (Rayner, “Eye movements and attention” 1459). This means that at 60 cm from the screen (the viewing distance in our reading assignment), it would correspond to an area of circa 2 cm (0.787 inches) across. Parafoveal vision (with poorer visual acuity), which covers around a maximum of 5 degrees around the foveal area, would correspondingly cover at most an area of circa 8 cm (3.149 inches), i.e. merely 1/6th of our 475 mm wide screen. The division of the stimuli into a grid-like structure of Areas of Interest (AOIs), based on the given frame-structure of the double-page composition, allowed us to identify “global” saccades vs. “local” saccades in terms of both their length (i.e. a travelled distance of more than 7 degrees) and their order. The deviations in order involve, specifically, longer saccades that relate AOIs beyond their immediate sequence, i.e. the sequential order of the panels, and furthermore are not confined to one area of the stimulus.
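The conversion between visual angle and on-screen size used above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, positions are taken in millimetres (the screen resolution is not reported here, so no pixel conversion is attempted), and the amplitude formula assumes the saccade is roughly centred on the line of sight.

```python
import math

VIEWING_DISTANCE_MM = 600.0  # viewing distance reported for the reading task


def extent_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """On-screen size (mm) subtended by a visual angle centred on the line of sight."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)


def amplitude_deg(p1_mm, p2_mm, distance_mm=VIEWING_DISTANCE_MM):
    """Approximate visual angle (degrees) between two on-screen points given in mm."""
    dist = math.hypot(p2_mm[0] - p1_mm[0], p2_mm[1] - p1_mm[1])
    return math.degrees(2.0 * math.atan(dist / (2.0 * distance_mm)))


def is_long_saccade(p1_mm, p2_mm, threshold_deg=7.0):
    """True if the saccade exceeds the 7-degree length criterion (hypothetical helper)."""
    return amplitude_deg(p1_mm, p2_mm) > threshold_deg


foveal_mm = extent_mm(2.0)     # about 21 mm, i.e. circa 2 cm
threshold_mm = extent_mm(7.0)  # on-screen length matching the 7-degree criterion
```

Under these assumptions the 2-degree foveal area comes to roughly 21 mm at 600 mm, in line with the figure of circa 2 cm given above. Length is only one of the two criteria; the order criterion requires the AOI sequence as well.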


3.1 Participants
We asked 30 students to read a series of 12 double-page spreads from comics on a computer screen. Their task was to read the given spread in its entirety and then move on to the next example at their own pace. All readers read the same spreads, presented in random order. In our analysis for this article we used gaze data from 6 representative stimuli from this larger body of data.

Eye-movement recordings were collected from 30 students in the humanities faculty taking courses in translation studies at the University of Helsinki, Finland, in April 2016. Before taking the test, the participants were given a questionnaire to complete, informing them that the intention of the study was to gather information about the reading process by means of an eye-movement detector. The background questions covered age, gender, mother tongue and proficiency in Finnish, the use of eyeglasses, contact lenses, or mascara (which might affect the accuracy of the gaze recordings), their study history, and their habits in terms of reading comics. The participants ranged in age between 21 and 47 (mean age 25 years). All participants self-identified as native or “native-like” speakers of Finnish; 21 of them were women, 8 were men, and 1 answered “other”. They reported no eyesight problems (11 used eyeglasses, 3 wore contact lenses).

In terms of their current reading habits, 3 participants reported that they never read comics, 5 read them once a year at the most, 8 read them once a month or less, 10 read them once a week or less, and 4 reported that they read comics more than once a week. With regard to past reading habits, 11 participants used to read comics much more than at present, 13 somewhat more, and 6 read about the same amount. Of the kinds of comics they were used to reading, newspaper comic strips were the most popular, being read by 22 participants, whereas 18 read web comics, 12 reported reading comic albums and graphic novels, 12 read comic books, and 11 read manga.

3.2 Apparatus and Data Analysis
The double-page spreads were read on a horizontal 475 x 297 mm screen, using the SMI RED-m eye tracker with a 120 Hz sampling rate, and at a 600 mm viewing distance, which is fairly close to a real comic-book-reading situation, at least to the extent that the participants had a whole spread in front of them.[4] The order of the stimulus documents was randomized. The reading task was self-paced: the participants simply pressed the space bar when they chose to move on to the next spread. No specific questions were asked of them after the test.

The eye-tracking data of 8 of the 30 participants was excluded. Most of these recordings had relatively poor calibration values, meaning either a mean deviation on the X or Y axis of over 0.64° or a tracking ratio of less than 98 percent; 1 of the 8 was discarded because the whole scanpath was misplaced on the screen despite acceptable calibration figures. The results of the remaining 22 participants were retained for our research. Calibration followed the standard five-point pattern and was further validated with 4 extra points.
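The exclusion criteria can be expressed as a simple filter. The sketch below is illustrative only: the `Recording` fields, the participant labels, and the sample values are hypothetical, and the `scanpath_misplaced` flag stands in for what was a manual judgment.

```python
from dataclasses import dataclass

MAX_DEVIATION_DEG = 0.64   # maximum accepted mean deviation on either axis
MIN_TRACKING_RATIO = 98.0  # minimum accepted tracking ratio, in percent


@dataclass
class Recording:
    participant: str
    dev_x_deg: float        # mean calibration deviation on the X axis
    dev_y_deg: float        # mean calibration deviation on the Y axis
    tracking_ratio: float   # percent of samples with a valid gaze position
    scanpath_misplaced: bool = False  # set by hand after visual inspection


def keep(rec):
    """Return True if the recording passes all exclusion criteria."""
    calibration_ok = (rec.dev_x_deg <= MAX_DEVIATION_DEG
                      and rec.dev_y_deg <= MAX_DEVIATION_DEG)
    tracking_ok = rec.tracking_ratio >= MIN_TRACKING_RATIO
    return calibration_ok and tracking_ok and not rec.scanpath_misplaced


# Invented example values, for illustration only.
recordings = [
    Recording("P01", 0.30, 0.40, 99.5),
    Recording("P02", 0.70, 0.20, 99.0),        # X deviation too large
    Recording("P03", 0.20, 0.20, 96.0),        # tracking ratio too low
    Recording("P04", 0.20, 0.20, 99.0, True),  # scanpath misplaced
]
retained = [rec.participant for rec in recordings if keep(rec)]
```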

The data analysis focused especially on fixation points and fixation order, in other words the places where people were looking as they read the excerpts. Thus, the distribution of the participants’ fixations across each stimulus was of particular interest. Relevant temporal aspects of viewing behavior were also noted. In particular, we considered the implications of longer saccades and longer duration in fixation. Because we used low-speed event detection, our primary events were fixations, from which saccades were then automatically derived by the SMI RED-m built-in detector. Fixation-detection parameters were set at a minimum duration of 80 ms on a maximum dispersion area of 100 px. In general, we dismissed the first registered fixation (or in some cases the first two if they were close to each other) as irrelevant because of the serial nature of our test: each first fixation area on a new double-page was actually the participant’s last fixation on the previous spread.
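To make the two detection parameters concrete, here is a minimal sketch of a generic dispersion-threshold fixation detector. It is not the SMI built-in detector, whose internals are not described here; in particular, the dispersion measure (horizontal range plus vertical range of the window) is an assumption, as are the function name and the sample format.

```python
def detect_fixations(samples, min_duration_ms=80.0, max_dispersion_px=100.0):
    """Group gaze samples into fixations with a dispersion-threshold rule.

    samples: list of (t_ms, x_px, y_px) tuples sorted by time.
    Returns a list of (start_ms, end_ms, centre_x, centre_y) tuples.
    """
    def dispersion(window):
        xs = [s[1] for s in window]
        ys = [s[2] for s in window]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Find the first sample that makes the window span the minimum duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration_ms:
            j += 1
        if j >= n:
            break  # not enough data left for another fixation
        if dispersion(samples[i:j + 1]) <= max_dispersion_px:
            # Extend the window while the samples stay within the dispersion limit.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion_px:
                j += 1
            window = samples[i:j + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            i = j + 1
        else:
            i += 1  # drop the first sample and try again
    return fixations
```

With fixations as the primary events, a saccade can then be taken as the movement between two consecutive fixation centres, which is the sense in which saccades are derived from fixations in low-speed event detection.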

We also divided the stimuli into areas of interest (AOIs), limited mainly to entire panels (including their surrounding white areas such as gutters and page borders). This enabled us to observe the true realization of the Z-path and to compare the main areas of attention. We are aware that a focus on panel transitions could overemphasize the relation between two panels only, thus potentially diminishing the importance of a variety of relations between the narrative and visual content in the images. Nor did all the stimuli consist of comparable panels—the panels differed in size, shape, position, and impression of depth.
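The mapping from fixations to AOIs can be sketched as a rectangle hit test followed by a collapse of consecutive hits in the same AOI into a single visit. The rectangular AOI shape, the class and function names, and the coordinates are assumptions for illustration; the actual AOIs followed the panel structure of each spread.

```python
from typing import NamedTuple, Optional


class AOI(NamedTuple):
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def aoi_of(x, y, aois) -> Optional[str]:
    """Name of the first AOI containing the point, or None if it lies outside all AOIs."""
    for aoi in aois:
        if aoi.contains(x, y):
            return aoi.name
    return None


def visit_sequence(fixations, aois):
    """Collapse a list of (x, y) fixations into the ordered list of AOI visits."""
    visits = []
    for x, y in fixations:
        name = aoi_of(x, y, aois)
        # Skip fixations outside every AOI and repeated hits within the same AOI.
        if name is not None and (not visits or visits[-1] != name):
            visits.append(name)
    return visits


# Hypothetical three-panel page: two panels on top, one wide panel below.
page = [
    AOI("P1", 0, 0, 100, 100),
    AOI("P2", 100, 0, 200, 100),
    AOI("P3", 0, 100, 200, 200),
]
path = visit_sequence(
    [(10, 10), (50, 60), (150, 20), (300, 300), (20, 150), (120, 30)], page
)
```

Comparing such a visit sequence with the panel order expected under Z-path reading is one way to see where a reader departs from it.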

3.3 Stimuli
The excerpts of our reading assignment were taken from comic books and graphic novels published in the 1980s and the 1990s. The participants received no information about the titles of the comic books, the authors or the artists. 1 participant reported that she was familiar with 1 of the works included in the study.[5]

We chose excerpts that reflected the diversity of visual appearance and composition in terms of the ratio of text or dialogue, as well as their graphic and layout styles. All our stimuli included narrative comics that show characters engaged in action. We did not select any excerpts from the beginning or the end of the stories to avoid specific questions related to these privileged positions within plot structures. Layout and narrative content were the two core independent variables to be observed. We focused our analysis especially on examples with a dynamic layout style, which allowed us to examine interplay between layout and narrative content. In assessing the dynamism of the layout choices, we tentatively relied on the categories created in French-language comics theory (Peeters; Groensteen 112-18; Chavanne), including regular vs. irregular or discreet vs. ostentatious styles, as well as Bateman et al.’s (2016) classification scheme. Thus, we understood a dynamic layout style as referring to layouts in which the spatial arrangement of the page and/or the double-page spread deviated from the canonical grid structure. Such deviations included inset or (partly) superimposed panels, panel bleeds, and other changing organizational principles, for instance. However, none of the examples could be categorized as being particularly radical in layout style: none of the excerpts implied theatrical, contiguous, or thoroughly ambivalent composition, for example, such as those studied by Chavanne (233-80). Only 1 spread, the Sin City excerpt (fig. 2), clearly deviated from the grid-like structure. 1 of the excerpts (Un ver dans le fruit) was plainly grid-like, except that it involved changing panel sizes, and that all the panels were vertical. We included this excerpt in our study as a point of comparison, representing a more regular layout style.

Our excerpts were taken from the Finnish translations of these original works:

  • Cosey. Le voyage en Italie, vol. 1 (1988) [VeI]
  • Frank Miller. Sin City. That Yellow Bastard (1996) [SC]
  • Will Eisner. To the Heart of the Storm (1991) [HoS]
  • Fabien Nury and Sylvain Vallée. Il était une fois en France. Tome 1: L’Empire de Monsieur Joseph (2007) [IEFeF]
  • Moebius and Bruno Marchand. Little Nemo. Tome 1: Le bon roi (1994) [LN]
  • Pascal Rabaté. Un ver dans le fruit (1997) [VdF].
Figure 1: A complete Z-path measured on VdF (Un ver dans le fruit, Rabaté 1997).

No modifications were made to the original materials. Modifying the excerpts (as in Omori et al.) might have allowed us to focus on the effects of a particular limited element of the composition. However, our objective was to look more generally at the ways in which actual forms of visual and narrative content act together with layouts in existing comic books. We also wanted to avoid any artificial separation between layout and narrative or visual content. The choice of the double-page spread as the unit of reading and analysis was motivated by the importance of this compositional element in longer formats in comics, as well as by the absence of previous empirical attention to this question. Nevertheless, certain limitations were inherent in this choice. In particular, as a fragment of a longer narrative, the double-page spread can lack important narrative context – the sense of what comes before or after the given citation – that could affect the reader’s attention.


The first step in our analysis was to identify any significant deviations from the Z-path of reading where layout seemed not to be the sole explanatory factor. Regarding content-related issues in our stimuli, we looked especially into the question of the impact of global attention as a potential reason for deviation. Global attention appeared essentially in the starting phase of the reading, regularly focusing on the center of the double-page spread, but also in the refocusing or checking for possibly meaningful elements during and at the end of reading.

Longer saccades were typical at the beginning and end of the reading process. We analyzed these patterns as indications of global attention that facilitates narrative comprehension. The possible functions of such attention include, for instance, checking the importance of the visual and narrative information on the pages and verifying the direction of reading. At the end of reading, global attention can affirm earlier connections and make new ones, and be a means of checking that nothing important has been missed.

4.1 Global attention at the beginning of reading
The beginning phase of reading – involving approximately the first 2–25 fixations before the reader’s Z-path was set in place – regularly included larger saccades. Variation at this point was also observable in the readers’ attention to the center or the upper part of the left-hand page (other than the first upper-left-hand corner panel), and in the importance of the center of the spread. These locational preferences in initial fixation could be attributable to the fact that they are optimal regions for the reader to get the “scene gist”, as described by Loschky et al. (218). For instance, readers may grasp as much as possible of the whole spread at a glance, although the size of the spread might require several fixations separated by larger distances, hence the longer saccades.

The beginning phase of many readers’ scanpaths in each stimulus included long saccades (of over 7 degrees) that cut across the first page, connected areas between the first and the second pages, and in some cases also involved movements around the whole double-page spread. Even in LN (see fig. 5), which includes figures and speech balloons in the top-left-hand-corner panel of the first page, and where this panel could thus be regarded as salient in its narrative content, only 8 readers out of 22 started reading from there. 2 of these 8 readers also checked the second panel in 1 saccade in the middle of their scrutiny of the first panel. The majority made larger saccades, however. 8 readers (8/22) stopped in the second uppermost panel for some fixations (2-8 fixations) before moving to the first panel. 6 readers (6/22) made long saccades across the first page or the whole spread before moving to the upper-left-hand corner.

The longer saccades across the first page, across the center or the top areas of two pages, or around the spread, could be regarded as more explicit indications of global attention. Longer saccades were most commonly made in our stimuli in SC (fig. 2), where we observed that almost half of the readers made long saccades across the first page or the two pages (11/22), before moving to the upper-left-hand side of the first page with two figures. Several readers also started in the center of the spread (4/22) or visited the center during the beginning phase of their reading (2/22).[6] A minority of readers started reading from one of the two upper-left-hand figures (7/22). We attribute the relevance of longer saccades and the center of the composition in this case to the layout, the narrative content of the images, and the dynamic between the two. In terms of layout, SC stands out from the other examples in that it includes two bleed images in the background, and that the three framed panels of the spread create the effect of superimposed images through broken frames and overlaying parts.

Figure 2: Three Scanpaths on SC (Sin City. That Yellow Bastard, Miller 1996).

In some other cases, fixations on the panels located near the center or the upper half of the left-hand page were more common than long saccades. This was observed especially when the central panels of the left-hand page included faces, figures, or speech balloons. 8 readers stopped in the second panel of LN (fig. 5) in the center of the first page on the way to the left-hand corner. Similar observations were made in VdF (fig. 1), IEFeF (fig. 3) and VeI (fig. 4). Fixations located around the plain center of the spread were also significant in some stimuli. We noted an unusual number of fixations around the center of the spread in the readers’ scanpaths for HoS (11 readers had their 3rd-7th fixations in this area). In addition, several readers who started reading from the first human figures on the upper-left-hand page of HoS (see fig. 6), or the first text on the left, visited the center of the spread before their 10th fixation (6 readers). 1 reader, whose untypical reading path started from the most central speech balloons on the right-hand page, moved to the left-hand page through the center of the spread. In VdF (fig. 1) the center of the spread was also relatively common as a starting point for reading (7/22). One notable aspect of the scanpaths in VdF was that only 1 reader started reading directly from the first upper-left-hand-corner panel. The most common starting point among readers who did not first attend to the center of the spread was the upper central panel on the left-hand page featuring the protagonist in relief (5/22).

It is a well-documented feature of human viewing behavior that observers’ first fixation locations tend to be clustered around the center of the image (Parkhurst 116; Dorr et al.). However, the significance of the center of the spread in HoS (fig. 6) could also be attributed to the unconventional layout of the spread (a lack of panel frames, superimposed frames, a window structure that may be reminiscent of panels, for example), and a continuing line across the center of the spread from a sewing-machine table on the left to two rooftops on the right-hand-page city-street view. In the wordless scene depicted in VdF, interest in the center of the spread is possibly emphasized by the size and perspective (bird’s-eye view) of the largest panel on this spread, situated close to the lower-center field of the composition on the left-hand page.

In comparison, in IEFeF (fig. 3) only 3 readers made some fixations in the center of the spread at the beginning of their reading. There are several factors pertaining to narrative content that could explain the relative lack of interest in the middle of the spread in this stimulus. Clearly the most common pattern (10 readers) in starting to read IEFeF was to move forward from the third panel on the left-hand page, which includes two figures and two speech balloons, and then to move to its surrounding panels. The wordless panels on the central left-hand page that depict the act of strangling (panels 5 through 9) constituted a more common starting point than the upper-left-hand-corner panel, before the readers moved to the upper-left-hand corner (6/22). We can observe here that the figures engaged in action on the left-hand page attracted the majority of the readers’ first fixations – not the left-hand top corner. With regard to the beginning phase of reading IEFeF, 2 readers’ scanpaths included long saccades across the top of the spread or the whole spread before they moved to the upper-left-hand corner panel. All in all, although we may speculate that the layout and narrative content in IEFeF were relatively less encouraging of global attention than with HoS (fig. 6) or SC (fig. 2), global attention on the central figures on the first page was a dominant pattern at the beginning of reading.

Figure 3: AOIs for IEFeF (Il était une fois en France, Tome 1, Nury & Vallée 2007).

4.2 Global attention at the ending phase
With regard to the ending phase, we observed that the dominant mode of finishing the reading was to make long saccades across the two pages. At least 9 of the 22 readers in each of the 6 stimuli, and in some cases up to 13 readers, made long non-chronological saccades across the two pages at this point. Some of these reading patterns, between 1 and 3 readers in each case, also included chronological readings of some segments of the spread and could thus be considered (partial) re-readings of the stimulus, not just instances of global attention. In addition, between 2 and 6 readers in each case, except for HoS, made large saccades across the second page at the end. These movements involved at least 3 panels and covered at least half of the entire area of the second page. Regressions that only went back to the previous panel, or the two immediate panels close to the end point of the Z-path, were also relatively common. Between 1 and 6 readers in each stimulus finished their reading assignment in this fashion. One interesting exception, however, was VeI (fig. 4) where 9 readers finished their reading in this way (in comparison with 11 readers who made large saccades across the whole spread).

Figure 4: Scanpath on VeI (Le Voyage en Italie, vol. 1, Cosey 1988). The last 5 saccades are indicated by thicker lines.

LN (fig. 5) was the only example among our stimuli in which more readers ended their reading with the last panel, instead of making larger saccades across the spread (in a ratio of 9/8). We speculate that the reasons for this anomaly concerned both the layout and the narrative content of the last panel on this spread. In terms of layout, the last vertically long panel on the extreme right on the second page suggests a determinate end to the Z-path in that the image extends over the whole page, thus framing all the smaller panels that are set in three tiers to its left. In terms of narrative content, this wordless panel depicts elongated rock formations as well as a narrow ravine that points to the upper-right-hand corner of the spread.

Figure 5: Scanpath on LN (Little Nemo, Tome I: Le bon roi, Moebius & Marchand 1994).

All the scanpaths recorded in this last panel show how the readers made long vertical saccades across the ravine. The saccades reached at least the lowest bird figure seen flying in the ravine (6 readers went all the way to the top of the ravine). Therefore, we surmise that the contents of the panel function as a kind of visual pointer that directs the gaze through the panel.

4.3 Deviations in the middle of reading
Deviations in the middle of the Z-path reading pattern were not common. However, in each case at least 1 or 2, and sometimes 3 or 4, readers made noticeably large saccades across the spread on one of the two pages. We should also note the exceptionally large number of anticipatory jumps forward at the end of IEFeF (fig. 3): a majority of the readers (13/22) visited the last panel in the lower-right-hand corner of the second page before having read the one to five panels preceding it. Here, again, we attribute these deviations to layout, narrative content, and their interplay. The vertically long wordless panel in question, which shows a man hanging on the wall of the prison cell underneath a barred window, extends over three tiers of smaller panels to its left. The readers deviated to the last panel most frequently from the panel on the third tier from the bottom, which is placed next to the head of the hanging man in the last panel, but anticipatory deviations were also made from the panel extending horizontally above the last one and from the second-tier panel placed next to it. The layout appears to facilitate such deviations, although the visually and narratively salient content of the image may also play a part in directing the gaze.

Other important, albeit more local, deviations in the middle of the Z-path were observed in the reading patterns of HoS (fig. 6), where we recorded a significant number of forward jumps in the middle of the reading. Particularly noticeable in this regard were the jumps made to the large background image on the second page (AOI008), which occupies roughly two thirds of the lower side of that page. This “panel”,[7] which attracted more transitions than any of the other AOIs on this spread, also attracted an unusual amount of attention from other directions than the immediate upper-right-hand-corner area of the page. Jumps forward comprised 47 percent of the overall transitions to this AOI, and regressions close to 30 percent. Only some of the jumps (7 out of the total of 41 to this panel) could be attributed to the fact that parts of AOI008 lie between AOI005 and AOI006 on the Z-path from the left-hand page to the right-hand page. AOI008 also had the most regressions on this spread.
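
The shares of forward jumps and regressions reported for AOI008 rest on classifying each transition into an AOI by where the gaze came from. The Python sketch below shows one way such a classification could be computed. It assumes that AOI numbers follow the Z-path order, so that arriving from the immediately preceding AOI counts as regular Z-path reading, arriving from an earlier AOI as a forward jump, and arriving from a later AOI as a regression. The function names and the example sequence are hypothetical and do not reproduce the study's data or software.

```python
from collections import Counter


def classify_transitions(aoi_sequence, target):
    """Count how the gaze arrives at `target`.

    Assumption: AOI numbers follow the Z-path order, so the expected
    predecessor of `target` is `target - 1`.
    """
    counts = Counter()
    for prev, curr in zip(aoi_sequence, aoi_sequence[1:]):
        if curr != target or prev == target:
            continue  # not an arrival at the target AOI
        if prev == target - 1:
            counts["z_path"] += 1
        elif prev < target:
            counts["forward_jump"] += 1
        else:
            counts["regression"] += 1
    return counts


def shares(counts):
    """Convert raw counts into proportions of all transitions to the target."""
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


# Invented visit sequence for one reader: an early jump from AOI 3 to AOI 8,
# a regular arrival from AOI 7, and a regression from AOI 9.
visits = [1, 2, 3, 8, 4, 5, 6, 7, 8, 9, 8]
counts = classify_transitions(visits, target=8)
print(dict(counts))
print(shares(counts))
```

Pooling such counts over all readers would give percentages of the kind reported above, provided the same category definitions are applied throughout.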

Figure 6: Key Performance Indicators (KPI) of Areas of Interest (AOI) on HoS (Heart of the Storm, Eisner 1991).

We suggest that one potential reason why there were several jumps from the first page to this panel, especially from AOI003-005, was the combined effect of layout and visual content. Namely, the black background of the frameless panels AOI003-005 bleeds onto the background of AOI008, and also forms the wall of one of the buildings depicted in this panel. Moreover, the earlier mentioned shared line of the sewing table (AOI004) and the building roofs of the street scene (AOI008) strongly connect the two pages. In addition, the upper frame of AOI008, which is higher on the left, suggests a connection with the smaller panels that are next to it on the left-hand page. Thus, AOI008 functions as a kind of background for the whole of the spread.


Our eye-tracking recordings provided us with evidence of global attention at the beginning and end of reading a double-page spread in comics. However, we suggest that global attention in reading comics should be understood not as a look in the sense of one fixation between the saccades that would indicate comprehension of the whole of the document, but rather as attention to larger segments of the composition. We believe that global attention in reading comics, as far as examples in which the page layout makes a difference are concerned, consists of a series of saccades and fixations rather than one fixation, and that this form of attention may be recognized from the first and last fixations, and the amplitude of the saccades.
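
As a rough illustration of how this form of attention might be read off the first and last fixations and the amplitude of the saccades, the Python sketch below computes saccade amplitudes from a sequence of fixation positions and counts the long saccades at the start and at the end of a recording. The window of 5 saccades, the threshold of 40 percent of the spread width, and the use of pixel distances rather than degrees of visual angle are illustrative assumptions, not parameters taken from the study.

```python
import math


def saccade_amplitudes(fixations):
    """Euclidean distance in pixels between each pair of consecutive fixations."""
    return [math.dist(a, b) for a, b in zip(fixations, fixations[1:])]


def long_saccades_at_edges(fixations, spread_width, n_edge=5, min_share=0.4):
    """Count long saccades among the first and last `n_edge` saccades.

    A saccade counts as long if its amplitude is at least `min_share` of the
    spread width (a hypothetical threshold). If the recording has fewer than
    2 * n_edge saccades, the two windows overlap.
    """
    amplitudes = saccade_amplitudes(fixations)
    threshold = min_share * spread_width
    return {
        "start": sum(a >= threshold for a in amplitudes[:n_edge]),
        "end": sum(a >= threshold for a in amplitudes[-n_edge:]),
    }


# Invented scanpath on a spread 1600 pixels wide: one long initial saccade,
# a run of short panel-to-panel steps, and two long saccades at the end.
scanpath = [(800, 500), (100, 100), (150, 110), (200, 120), (250, 130),
            (300, 140), (350, 150), (400, 160), (450, 170), (500, 180),
            (550, 190), (600, 200), (1500, 900), (200, 150)]
print(long_saccades_at_edges(scanpath, spread_width=1600))
# {'start': 1, 'end': 2}
```

A measure of this kind only flags candidate phases of global attention; whether a long saccade reflects attention to a larger segment of the composition still has to be judged against layout and content.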

In light of our empirical study, it is plausible to infer indications of global attention in eye-movement patterns in the reading of comics. We found that strong fixation patterns occurred before the readers moved to the upper left-hand corner of the spread, and also after they had ended the Z-path in the lower right-hand corner of the stimulus. These patterns can be attributed to the combined effect of narrative content, such as faces or characters engaged in action, and layout choices. Furthermore, we found evidence indicating that readers may explore other potential pathways beyond the Z-path upon reading as they encounter layout changes, specific content, or implied connections between these two. Consequently, we wish to develop some of the implications included in Neil Cohn’s observation that

data from eye-tracking experiments have shown that readers do not explore various potential pathways before progressing panel-by-panel (Nakazawa, 2002; Omori et al., 2004; Chiba et al., 2007), indicating that panel content does not provide the main motivation to their reading order (though an alternate order may be chosen if content confounds that intended order). (“Architecture of Visual” 5)

Panel content does not provide the sole motivation for the reading order in our eye-movement data, either, and our research confirms once again that the panel-by-panel Z-path remains the default order of reading. However, our study also strongly indicates that a focus on the Z-path procedure alone gives a very limited idea of the kinds of attention that matter in reading comics, especially in longer formats that use the page, or the double page spread, as units of design, and that rely on internally varying layout styles. We believe that further empirical study on the relations between eye-movement patterns, page layout and narrative content could make a substantial contribution to current understanding of reading and types of attention in multimodal documents in a more general sense.

Psychological research on attention states with complex images has also shown that information processing may start local and end global, or that it may become more global over the course of viewing, for instance because people switch repeatedly between these states to check that they have not missed any important information (Wedel et al. 136). Similarly, Jun Nakazawa’s empirical study involving novice and expert manga readers emphasizes that the kind of random eye movements he believed were characteristic of manga reading “may reflect free information gathering from pictorial stimuli and rechecking as a metacognitive function” (36). Furthermore, we believe that insights from cognitive studies suggesting that a global state of attention might occur when people switch back and forth between text and pictorial information (Liechty et al. 538) should be of further interest to the empirical study of multimodal texts such as comics. Although we could not find any clear indications of this in our gaze data, we cannot rule out the potential relevance of these kinds of switches of attention in the process of reading comics.

Our research confirms the significance of the Z-path in reading comics, even when the page layout breaks the grid pattern of the framed panels: The readers paid attention selectively, thus ignoring many elements in the pages, to follow the linear reading pattern. However, their scanpaths also strongly imply that comic-reading behavior is modifiable in more ways than conventional book reading. One reason for this is that readers need to understand the forms of interaction between layout and the narrative content of the images. Hence the importance of global attention: the readers need to find meaningful connections on a new double page spread, and again check the connections that they have established, or details they may have missed, at the end of their reading. Our research findings imply that the study of reading habits and attention in the reading of comics should not focus too strongly on 1) the structure of the panel sequence, 2) the assumption of a perfect distinction between layout and panel content, or 3) the linear Z-path alone.

We found much evidence of switching between local and global attention in the reading of double-page spreads in comics. Both layout and narrative content, and their interaction, may encourage global attention. This can be detected, in particular, in longer saccades at the beginning and end of the reading process. The findings indicate, for instance, that more complex layout choices (such as HoS and SC, in figs. 6 and 2, respectively) correspond with more variety in the scanpaths and reading patterns, especially at the beginning of the reading process, and when supported by significant narrative content such as characters engaged in action, speech balloons, or close-up images of faces. One possible objective in future empirical research on the reading of comics would be to find out whether the global attention state could be detected more systematically in the eye trajectories. Global attention is rare in the middle of reading when the reader has entered the Z-path, but it may occur at the end of the first page, for instance. A particular layout arrangement, such as highlighting the center of the spread (HoS) or juxtaposing two large panels with similar content on opposite sides (SC), may also encourage non-chronological larger saccades in the middle of the Z-path. However, these are far from the only types of modulation of the reader’s attention through layout. We could imagine other forms of contrast or juxtaposition, and other meaningful relations between various compositional elements that could be empirically tested. An interview with the participants might also contribute to a better understanding of how they process the information differently in the course of the reading, and what kinds of tasks the readers perform during those phases that we have identified here as global attention.

As much research shows, the task being performed by the viewer can influence eye movements (Yarbus; Tatler et al.; Haji-Abolhassani and Clark). When the task is to read a narrative text or a segment of a narrative, as in our examples, we presume readers are encouraged to pay attention to all elements that could help them to process the content. The task we gave our readers was to read the comic and then move forward, not to understand the narrative. Nevertheless, all our stimuli exhibited strong narrativity. That is to say, they portrayed characters acting in an evolving event or a situation in some world.[8] This in itself may activate the task of reading for a narrative, in other words comprehending the gist of the story on the basis of the narratively salient elements.


We would like to thank Maarit Koponen for her invaluable technical help during the experiment and with the data collection, and for reporting to us about the participant interviews, as well as Tuomo Häikiö for his invaluable critical comments on an earlier version of this text.


[1] The Z-pattern reading order follows the shape of the letter “z”. Readers will start in the top/left corner of the page, move horizontally to the top/right and then diagonally downward to the next panel or sequence in a “return sweep” (Rayner, “Eye Movements in Reading” 375) on the left before making another horizontal movement to the right.

[2] These derive mainly from educated generalizations from what scholars themselves do when they read comics.

[3] Holmqvist et al. define a scanpath as “the route of oculomotor events through space within a certain timespan” (254).

[4] Knowing nevertheless that the average focus distance for reading tends to be situated between 380 and 635 mm from the eyes, depending on the readers’ preferences, we are conscious that the distance used in our reading assignment is not quite in the middle of that range.

[5] The participant had read Frank Miller’s Sin City before and commented that it might have influenced her reading by slowing it down slightly.

[6] In practice, we considered the first 2 to 6 fixations to mark the beginning of reading.

[7] We need to use quotation marks here because this “panel” is not framed, and furthermore, it continues “under” the next panel (AOI009).

[8] Narratologists regularly define narrativity as “that which makes a text a narrative”. See for instance Ryan 26-30.

Works Cited

Augereau, Olivier, Mizuki Matsubara, and Koichi Kise. “Comic visualization on smartphones based on eye tracking.” MANPU ’16 Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding, 2016.

Bateman, John A., Francisco O. D. Veloso, Janina Wildfeuer, Felix HiuLaam Cheung, and Nancy Songdan Guo. “An Open Multilevel Classification Scheme for the Visual Layout of Comics and Graphic Novels: Motivation and Design.” Digital Scholarship in the Humanities, 2016, pp. 1-35.

Buswell, Guy Thomas. How People Look at Pictures. A Study of the Psychology of Perception in Art. The University of Chicago Press, 1935.

Chavanne, Renaud. Composition de la Bande dessinée. Montrouge, France: Éditions PLG/Collection « Mémoire Vive », 2010.

Chiba, S., T. Tanaka, K. Shoji, and F. Toyama. “Eye movement in reading comics.” Proceedings of the 14th Annual International Display Workshops, edited by Society for Information Display, Curran, 2007, pp. 1255–1258.

Cohn, Neil and Hannah Campbell. “Navigating comics II: Constraints on the reading order of comic page layouts.” Applied Cognitive Psychology, vol. 29, 2015, pp. 193-199.

Cohn, Neil. “Navigating Comics: an Empirical and Theoretical Approach to Strategies of Reading Comic Page Layouts.” Frontiers in Psychology, vol. 4, no. 186, 2013, pp. 1-15.

Cohn, Neil. “The Architecture of Visual Narrative Comprehension: The Interaction of Narrative Structure and Page Layout in Understanding Comics.” Frontiers in Psychology, vol. 5, no. 680, 2014, pp. 1-9.

Dorr, Michael, Thomas Martinetz, Karl Gegenfurtner, and Erhardt Barth. “Variability of eye movements when viewing dynamic natural scenes.” Journal of Vision, vol. 10, no. 10, 2010, pp. 1–17.

Foulsham, Tom. “Scene Perception.” The Handbook of Attention, edited by Jonathan M. Fawcett, Evan F. Risko, and Alan Kingstone, MIT, 2015, pp. 257-79.

Foulsham, Tom, Dean Wybrow, and Neil Cohn. “Reading Without Words: Eye Movements in the Comprehension of Comic Strips.” Applied Cognitive Psychology, vol. 30, 2016, pp. 566–579.

Fresnault-Deruelle, Pierre. “Du linéaire au tabulaire.” Communications, vol. 24, La bande dessinée et son discours, 1976, pp. 7-23.

Groensteen, Thierry. Système de la bande dessinée. PUF, 1999.

Haji-Abolhassani, Amin and James J. Clark. “An inverse Yarbus process: Predicting observers’ task from eye movement patterns.” Vision Research, vol. 103, 2014, pp. 127–142.

Holmqvist, Kenneth, Marcus Nyström, Richard Andersson, Richard Dewhurst, Halszka Jarodzka, and Joost Van De Weijer. Eye Tracking: A Comprehensive Guide to Methods and Measures. Oxford University Press, 2011.

Leinenger, Mallorie, and Keith Rayner. “Eye Movements and Visual Attention during Reading.” The Handbook of Attention, edited by Jonathan M. Fawcett, Evan F. Risko, and Alan Kingstone, MIT, 2015, pp. 281-300.

Liechty, John, Rik Pieters, and Michel Wedel. “Global and local covert visual attention: Evidence from a bayesian hidden markov model.” Psychometrika, vol. 68, no. 4, 2003, pp. 519-541.

Loschky, Lester C., John P. Hutson, Maverick E. Smith, Tim J. Smith and Joseph P. Magliano. “Viewing Static Visual Narratives Through the Lens of the Scene Perception and Event Comprehension Theory (SPECT).” Empirical Comics Research: Digital, Multimodal, and Cognitive Methods, edited by Jochen Laubrock, Janina Wildfeuer, and Alexander Dunst. Routledge, 2018, pp. 217-38.

Mikkonen, Kai. The Narratology of Comic Art. Routledge, 2017.

Nakazawa, Jun. “Analysis of manga (comic) reading processes: manga literacy and eye movement during manga reading.” Manga Studies, vol. 5, 2002, pp. 39–49.

Nakazawa, Jun. “The Development of Manga (Comic Book) Literacy in Children.” Applied Developmental Psychology: Theory, Practice, and Research from Japan, edited by David W. Schwalb, Jun Nakazawa, and Barbara J. Schwalb, Information Age Publishing, 2005, pp. 23–42.

Omori, Takahide, Takeharu Igaki, Taku Ishii, Keiko Kurata, and Naoe Masuda. “Eye catchers in comics: Controlling eye movements in reading pictorial and textual media.” Paper presented at XXVIII International Congress of Psychology (Beijing), 2004.

Parkhurst, Derrick, Klinton Law, and Ernst Niebur. “Modeling the role of salience in the allocation of overt visual attention.” Vision Research, vol. 42, 2002, pp. 107–123.

Peeters, Benoît. Lire la bande dessinée. Flammarion, 1998.

Peeters, Benoît. “Four Conceptions of the Page.” Trans. Jesse Cohn. ImageTexT: Interdisciplinary Comics Studies, vol. 3, no. 3, 2007.

Postema, Barbara. Narrative Structure in Comics. Making Sense of Fragments. RIT Press, 2013.

Rayner, Keith. “Eye Movements in Reading and Information Processing: 20 Years of Research.” Psychological Bulletin, vol. 124, no. 3, 1998, pp. 372-422.

Rayner, Keith. “Eye movements and attention in reading, scene perception, and visual search.” The Quarterly Journal of Experimental Psychology, vol. 62, no. 8, 2009, pp. 1457–1506.

Rigaud, Christophe, Thanh-Nam Le, J.-C. Burie, J.-M. Ogier, Shoya Ishimaru, Motoi Iwata, and Koichi Kise. “Semi-automatic Text and Graphics Extraction of Manga using Eye Tracking Information.” Paper presented at 2016 12th IAPR Workshop on Document Analysis Systems, 2016.

Ryan, Marie-Laure. “Towards a definition of Narrative.” The Cambridge Companion to Narrative, edited by David Herman, Cambridge University Press, 2007, pp. 22-38.

Tatler, Benjamin W., Nicholas J. Wade, Hoi Kwan, John M. Findlay, and Boris Velichkovsky. “Yarbus, Eye Movements, and Vision.” i-Perception, vol. 1, no. 1, 2010, pp. 7–27.

Wedel, Michel, Rik Pieters, and John Liechty. “Attention Switching During Scene Perception: How Goals Influence the Time Course of Eye Movements Across Advertisements.” Journal of Experimental Psychology: Applied, vol. 14, no. 2, 2008, pp. 129–138.

Yarbus, A. L. Eye Movements and Vision. Plenum, 1967.
