Auditory Attention Causes Visual Inattentional Blindness

When engaged in a visual task, we can fail to detect unexpected events that would otherwise be very noticeable. Here we ask whether a common auditory task, such as that of attending to a verbal stream, can also make us blind to the presence of visual objects that we do not anticipate. In two experiments, one hundred and twenty observers watched a dynamic display while performing either a visual or an auditory attention task, or both simultaneously. When observers were listening to verbal material, in order to either understand it or to remember it (auditory task), their probability of detecting an unexpected visual object was no higher than when they were counting bounces of moving items (visual task), although in the former case the observers' eyes and attention could move around the display freely rather than remaining focused on tracked items. Previous research has shown that attending to verbal material does not affect responses to lights flashing at irregular intervals, suggesting that driving performance is not hampered by listening. The lights, however, were expected. Our data imply that listening to the radio while driving, or to a portable audio player while walking or biking, can impair our reactions to objects or events that we do not expect.

see also Rees et al 2001). This also leads to the prediction that inattentional blindness would essentially disappear if observers were engaged in a non-visual, as opposed to a visual, attentional task.
A closely related question is: will concurrent engagement in two tasks, one visual and one non-visual (like listening to the traffic bulletin while driving), augment inattentional blindness (like failing to notice an ice patch lying on the street, say 30 m ahead)? And would this depend on the type of auditory task (eg on whether we are trying to memorise the steps of an itinerary someone is dictating to us over the cell phone, as opposed to simply comprehending what he or she is telling us)? The existence of cross-modal attentional links between vision and audition has been shown across a wide range of dual-task situations (see Spence and Driver 2004). For example, when people simulate driving and simultaneously carry out a verbal task, both performances are impaired (Horswill and McKenna 1999). However, it is not known whether detection of an unexpected event would also be affected. It has been argued that simply attending to verbal material, without active engagement in a conversation, does not interfere with driving (Strayer and Johnston 2001). This conclusion was based on a study in which participants tracked a target that flashed red or green at irregular intervals, and were asked to respond to red by pressing a button. No impairment in reaction to red lights was found when participants simultaneously listened to radio broadcasts, or to a book on tape. However, (a) the critical events were not unexpected, only the time of their occurrence was; and (b) the critical events occurred where the participants' eyes and attention were both focused.
In the work reported here, we searched for an answer to the above questions by using two types of auditory attention task (listening to a few sentences in order to understand them, listening to a list of words in order to remember them) in place of, or in addition to, a visual attention task (counting bounces in a dynamic display).
2 Experiment 1
2.1 Method
2.1.1 Participants. Ninety participants (thirty-two males and fifty-eight females; mean age 37 years) with normal or corrected-to-normal vision were tested individually. They were randomly assigned to one of three conditions: (a) visual, (b) auditory, and (c) visual and auditory (dual task). Within the auditory-task conditions, half of the participants were assigned to the comprehension condition and the other half to the recall condition (described in the next section). There were thirty participants per condition.
2.1.2 Stimuli and procedure. The visual stimuli we used were similar to those of Most et al (2001), and were presented on a portable Toshiba Satellite 1800-412 computer with a 14 inch display. On each of the five experimental trials, four black (luminance 1.0 cd m⁻²) and four white (luminance 87.4 cd m⁻²) L and T shapes moved independently on random paths, at variable velocities, against a 10.6 cm × 8.0 cm grey (luminance 15.8 cd m⁻²) background. The third, fourth, and fifth trials also contained a light-grey (luminance 42.3 cd m⁻²) cross with the same horizontal and vertical extent as the Ls and Ts, ie 8 mm, and the same thickness, ie 2 mm. As they moved, the black and white shapes could partially occlude each other, and occasionally bounce off the edges of the display window.
Auditory stimuli were prepared by using the Italian Assistant, Language Assistant Series software. They consisted of five short stories in Italian (about 26 words each) and five lists of Italian words (14 words for each list), uttered by a computerised female voice. Each story, or list, lasted 12 s. In the auditory-task conditions, each of the five trials was coupled with a different story (comprehension condition) or list of words (recall condition). Participants were told they had to listen because their comprehension, or recall, would be tested at the end of the trial (see footnote 1). We prepared five separate trials which were presented in the same order to all participants. The number of bounces was 8 on the first trial, 5 on the second, 6 on the third, and 7 on the fourth and fifth. Each trial lasted 12 s.
The ninety participants were distributed across five conditions. In the visual-task condition, thirty participants were instructed to watch the display and keep a silent tally, using their fingers, of the number of times that the white letters bounced off the edges of the display window; after each trial, they reported the number of bounces they had seen. In the auditory-task (comprehension) condition, fifteen participants watched the display and listened to short stories; after each trial, they answered three questions about the story. In the auditory-task (recall) condition, fifteen participants watched the display and listened to lists of words; after each trial, they recalled as many words as possible. In the dual-task (comprehension) condition, fifteen participants counted the number of bounces made by the white shapes and listened to short stories; after each trial, they reported the number of bounces they had seen and answered three questions about the story. In the dual-task (recall) condition, fifteen participants counted the number of bounces made by the white shapes and listened to lists of words; after each trial, they reported the number of bounces they had seen and recalled as many words as possible.
All observers viewed the display from a distance of about 60 cm and completed five consecutive trials. The first two trials contained no unexpected event. Approximately 2.45 s into the third trial (the 'critical trial'), the grey cross unexpectedly entered the display from the right side, traversed the screen horizontally along a virtual midline and exited to the left side (see figure 1). The cross remained visible for 7.15 s. After this trial, observers answered a questionnaire adapted from Most et al (2005). They were asked to report whether they had seen anything other than the black and white Ls and Ts, something that had not been present in the first two trials. If the answer was yes, they were asked to describe the colour, motion direction, and shape of the object. The shape could be picked from among four different shapes, graphically represented in the questionnaire: an E-shape, a cross, a heart, and a triangle.
(1) An example of a short story would be: "Joe has to go to the drugstore to buy some gauze and plasters. To get there he must turn right at the crossing's traffic-lights". The comprehension questions for this story would be: (1) "Where does this person have to go?"; (2) "What does he have to buy?"; (3) "In which direction does he have to turn to get there?". An example of a word list would be: "Flower, Bag, Mountain, Dog, Chair, Pen, Blackboard, Puppy, Book, Window, Candle, Light, Wall, Airplane".
Participants then completed a fourth trial (called the 'divided-attention trial', because the questionnaire had indirectly alerted them to the possibility that a novel object could appear), after which they answered a second questionnaire, identical to the first. On the fifth and last trial (the 'full-attention trial'), participants were simply asked to view the display, without performing any task. After this trial they answered a final questionnaire, identical to the previous two. Participants were debriefed at the end. No participant reported being familiar with 'inattentional blindness' concepts or experiments.
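The stimulus sizes are given in millimetres and the viewing distance as about 60 cm; the text does not report visual angles. As a rough illustration only, the sketch below converts the stated sizes to degrees of visual angle with the standard formula 2·atan(size / (2·distance)). The function name and the exact 600 mm distance are assumptions made for this example (the text says "about 60 cm"), so the results are approximate.

```python
import math


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Visual angle (degrees) subtended by an object of a given size
    viewed frontally from a given distance."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


# Assumption: "about 60 cm" is taken as exactly 600 mm.
VIEWING_DISTANCE_MM = 600.0

# Sizes stated in the Stimuli and procedure section.
shape_extent_deg = visual_angle_deg(8.0, VIEWING_DISTANCE_MM)     # Ls, Ts, cross
shape_thickness_deg = visual_angle_deg(2.0, VIEWING_DISTANCE_MM)  # stroke thickness
display_width_deg = visual_angle_deg(106.0, VIEWING_DISTANCE_MM)  # 10.6 cm
display_height_deg = visual_angle_deg(80.0, VIEWING_DISTANCE_MM)  # 8.0 cm

print(f"shape extent:    {shape_extent_deg:.2f} deg")
print(f"shape thickness: {shape_thickness_deg:.2f} deg")
print(f"display window:  {display_width_deg:.1f} x {display_height_deg:.1f} deg")
```

Under these assumptions each shape subtends roughly three-quarters of a degree and the display window roughly 10 by 7.6 degrees.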

2.2 Results
Averaging across all conditions, the cross was noticed (ie the shape was correctly reported) by 38% of the participants on the critical trial, by 72% on the divided-attention trial, and by 100% on the full-attention trial. On the critical trial, only fifteen individuals out of ninety correctly described all three attributes (shape, colour, and direction of motion) of the unexpected object. Detailed data per trial and experimental condition are shown in table 1.
2.2.1 Performance in the auditory task. Comprehension and recall performances, measured respectively as the number of questions correctly answered and the number of words correctly recalled, were analysed with two separate repeated-measures ANOVAs, where the within-subjects factor was cross presence (precritical versus critical trial) and the between-subjects factors were condition (auditory-only versus dual-task) and inattentional blindness (noticers versus non-noticers). An additional ANOVA was performed on bounce-counting accuracy, with a within-subjects factor of cross presence (precritical versus critical trial) and between-subjects factors of condition (visual-only versus dual-task) and inattentional blindness (noticers versus non-noticers). Performance in the auditory task (both comprehension and recall) was reduced when the visual and auditory tasks were combined (F(1,26) = 9.59, p = 0.005, and F(1,26) = 17.95, p < 0.0001, respectively).
The number of words correctly recalled dropped from the precritical to the critical trial (F(1,26) = 5.56, p = 0.026). (In the comprehension condition, where the answers to the final questions could partly be inferred from the general context, there was no significant effect.) A marginally significant interaction emerged between cross presence and condition (F(1,26) = 4.02, p = 0.055), owing to the fact that recall worsened in the auditory-only condition (from 5.07 to 3.87 words) but not in the dual-task one (2.73 words in both cases). In all likelihood the lack of a performance drop in the latter case reflects a floor effect, due to the high attentional demands of the dual task. The decline in recall was essentially due to noticers, as shown by a marginally significant interaction between inattentional blindness and cross presence (F(1,26) = 3.13, p = 0.09). More specifically, noticers recalled fewer words in the critical relative to the precritical trial (the difference amounted to 1.75 words and was significant, p = 0.04), whereas the performance of non-noticers did not change (the difference amounted to 0.18 words and was non-significant, p > 0.1). This might be simply explained by the fact that, unlike non-noticers, noticers either became temporarily distracted from the auditory task when noticing the cross (interference), or kept a trace of the cross in memory until the end of the trial, in addition to the traces of the words (higher memory load).
2.2.2 Performance in the visual task. Performance in the visual task was also reduced when the visual and auditory tasks were combined, with an average of 1 error (mean absolute deviation from correct number of bounces) in the dual task versus 0.5 errors in the visual-only one (F(1,56) = 5.12, p = 0.028). On the critical trial, participants who did not notice the cross made significantly more counting errors than noticers (in both the visual and dual conditions), suggesting that the unexpected object might have been subliminally monitored, thereby consuming attentional resources. This finding, together with supporting results coming from additional experiments, has been reported elsewhere (Bressan and Pizzighello 2008) and will not be expanded upon here.

2.2.3 Inattentional blindness. From the standpoint of their effects on inattentional blindness, the comprehension and recall conditions were not significantly different, either alone (χ² < 1), or within the dual task (χ² < 1); hence, they were combined. Unsurprisingly, observers were much less likely to notice the cross when they had to attend simultaneously to two tasks, one visual and one auditory, rather than to a visual task only (χ²(1, N = 60) = 5.55, p = 0.018; see table 1). Surprisingly, however, there was no significant difference in the amount of inattentional blindness between people engaged in the dual task and people engaged in the auditory task alone (χ² < 1). This was true only as long as the stimulus was unanticipated: in the divided-attention trial, the gap between the two conditions disappeared (χ² < 1), and there was a tendency for the cross to be noticed less often when the task was dual as opposed to auditory (χ²(1, N = 60) = 2.86, p = 0.09). The dual task also hampered detailed perception of the cross: the number of perfect noticers (participants who correctly reported all three attributes of the cross) was significantly smaller in the dual-task condition than in either the auditory one (χ²(1, N = 60) = 4.04, p = 0.044), or the visual one (χ²(1, N = 60) = 6.40, p = 0.011).

2.3 Discussion
Only one-third of our observers reported noticing something unexpected in the conditions in which they were listening to verbal material. Of course, they might have perceived the cross, but then forgotten about it by the time they were questioned (ie inattentional blindness may actually be "inattentional amnesia"; Wolfe 1999). However, our finding that bounce-counting accuracy in the critical trial deteriorated in non-noticers, but not in noticers, is inconsistent with the idea that the unattended item was perceived but then forgotten (see also Bressan and Pizzighello 2008). This idea presupposes that all observers are actually noticers. Yet if this were the case, performance should, if anything, be worse for noticers (who keep a trace of the object in memory until the end) than for non-noticers (who dispose of it).
We found that inattentional blindness was as likely in the dual as in the auditory-only condition: adding an auditory task to a visual task worsened inattentional blindness, but adding a visual task to an auditory one did not. A possible explanation is that of a ceiling effect. It has been shown that perception of irrelevant distractors is eliminated under conditions of high attentional load in an unrelated task (Rees et al 2001). The load of the auditory task might be so high that it engages attention fully, exhausting available capacity and making the supplemental visual task redundant. However, the higher attentional load of the dual task relative to the auditory-only task was clearly revealed by a corresponding decrease in both word retention and probability of full perception (or retention) of the cross. We must conclude that the auditory task, in itself, consumed attention only partly, and that attentional capacity was not at ceiling. Folk et al (1992, 1993, 2002) argued that, in order to capture attention, a target must be part of a top-down attentional set. When the task is visual and the distractor is auditory, they are less likely to be part of the same attentional set than when both are visual. Since in the auditory-only task the visual unexpected object is not part of the attentional set, it is also possible that, despite instructions to watch the screen, some participants may have 'defocused' the display (by converging their eyes either behind or in front of it) in order to 'focus' on the auditory task itself. Although participants were instructed to watch the screen and did keep their eyes on the monitor, by the third (critical) trial they would have realised that the visual stimulus was irrelevant to the task.
For this reason we ran a new experiment, where participants were explicitly told that it was crucial that they watched the videoclip from start to finish, and that they would be tested with a simple question about it at the end of each trial.
3 Experiment 2
3.1 Method
3.1.1 Participants. Thirty participants (one male and twenty-nine females; mean age 22 years) with normal or corrected-to-normal vision were tested individually. Five additional participants (three noticers and two non-noticers) were replaced because, when questioned at the end, they admitted they had not watched all videoclips from start to finish. All participants were tested with the auditory task: fifteen participants were randomly assigned to the comprehension condition and fifteen to the recall condition.
3.1.2 Stimuli and procedure. Apparatus, stimuli, and procedure were identical to those used in the auditory-only condition of experiment 1, with the following difference. In experiment 1 participants were asked to watch the screen, but no attempt was made to check to what extent they had actually done so. In experiment 2 participants were explicitly instructed to watch (without counting or performing any specific visual task) each videoclip for its whole duration. They were told that (a) this was important, and (b) they would be tested with a question about it at the end of each trial.
As in experiment 1, after each trial participants answered three questions about the story (comprehension condition) or recalled as many words as possible (recall condition). In addition, after the first (practice) trial they were asked which colour the letters were, and after the second (noncritical) trial they were asked which colour the background was. After the third (critical), the fourth (divided-attention), and the fifth (full-attention) trials participants answered the same questionnaire as in the previous experiment.
At the end of the experiment participants were asked whether they had indeed watched each videoclip from start to finish, and were fully debriefed. No participant reported being familiar with 'inattentional blindness' concepts or experiments.

3.2 Results
The number of participants who showed inattentional blindness in the trials of interest (critical and divided-attention; all subjects reported the cross in the full-attention trial) is given in table 1. In the critical trial, this number was lower than in either the dual-task or auditory-only conditions of experiment 1 (χ²(1, N = 60) = 11.28, p = 0.001, and χ²(1, N = 60) = 9.6, p = 0.002, respectively). Interestingly, although all thirty participants reported that they had watched each videoclip from start to finish, only twenty-four of them correctly reported the colours of both sets of moving letters, ie black and white (a further four participants recalled one colour only), and only twenty-one correctly reported the colour of the background, ie grey. On the other hand, inattentional blindness was not linked to poor attentiveness to the visual display in general: the probability of missing the cross was basically the same (χ² < 1) among those who responded correctly to both control questions (seventeen participants, five non-noticers) and those who did not (thirteen participants, three non-noticers). As in experiment 1, the decrease in the number of correctly recalled words from the precritical (4.53) to the critical (2.00) trial was significant (F(1,13) = 16.61, p = 0.001). Because participants were forced to watch the display for its whole duration, we expected a worse recall performance in experiment 2 than in experiment 1. Across trials, participants did indeed recall fewer words in experiment 2 (3.33 on average) than in experiment 1 (4.25 on average) (F(1,28) = 12.42, p = 0.001).
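The comparison between participants who answered both control questions correctly and those who did not can be checked by hand from the counts given in the text (seventeen participants with five non-noticers; thirteen participants with three non-noticers). The sketch below computes a Pearson chi-square for that 2 × 2 table. It assumes the test was run without a continuity correction, which the text does not state, and the function name is introduced here for illustration only.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Counts taken from the text of experiment 2:
#   both control questions correct: 17 participants, 5 non-noticers (12 noticers)
#   otherwise:                      13 participants, 3 non-noticers (10 noticers)
chi2 = pearson_chi2_2x2(5, 12, 3, 10)
print(f"chi-square(1, N = 30) = {chi2:.2f}")
```

With these counts the statistic comes out at about 0.15, in line with the reported value below 1.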

3.3 Discussion
Experiment 2 confirmed the finding of experiment 1 that the engagement of auditory attention per se can block an unexpected visual object from reaching awareness, even when no specific visual task is simultaneously carried out. The comparison between the two experiments suggests that, in order to focus on a verbal stream, people instinctively disengage visual attention, even when they are explicitly asked not to, and to the extent that they can get away with it (hence, less in experiment 2 than in experiment 1). This automatic disengagement of visual attention pays off: the number of words correctly recalled was indeed significantly smaller in experiment 2, that is, when participants were forced to watch the visual event from start to finish. Nonetheless, a sizeable amount of inattentional blindness occurred even in this case, although participants not only were told that watching the display was a crucial part of the experimental task, but also reported, at the end of the experiment, that they had fully followed the instructions.

4 Conclusions
Half of our observers failed to notice the unexpected visual object in the auditory-only condition, even though, in the absence of a tracking task, they were free to move both their visual attention and their eyes over the display. This shows that inattentional blindness does not need allocation of focused visual attention to concurrent stimuli (on an 'add here/subtract there' same-modality principle, as generally implied in the literature), but can be induced by allocation of auditory attention as well. Actually, the engagement of auditory attention was no less effective than the engagement of visual attention. Our experiments indicate that (a) in order to pay attention to a verbal stream observers automatically withdraw attention from the visual scene; and (b) they do this even when they are explicitly required to watch the scene, and are convinced that they have done so.
Situations in which we must pay attention to visual and verbal stimuli simultaneously are remarkably common outside laboratories. The importance of understanding how such tasks interfere with one another and with our ability to detect new visual objects is especially obvious when the consequences can be tragic, as in the case of driving accidents. Use of a cellular phone is associated with a fourfold increase in the likelihood of a crash that will result in hospital attendance (McEvoy et al 2005). The hazard is totally independent of the type of telephone: hand-held and hands-free devices entail equivalent risks. More worrying, a conversation with a passenger has about the same effects as a conversation on the cell phone, and even information-processing tasks much less engaging and emotionally loaded than a conversation, such as mental arithmetic or word games, entail significant costs (see Horrey and Wickens 2006 for a meta-analysis). These costs are manifested primarily in terms of longer reaction times. Nonetheless, no impairment in reaction times to irregularly occurring red lights was found for people who performed a visual tracking task and simultaneously listened to radio broadcasts, or to books on tape (Strayer and Johnston 2001). The authors concluded that simply attending to verbal material, without being involved in a conversation, does not affect driving performance. However reassuring it may appear to those of us who drive, this conclusion needs to be qualified by adding that this lack of interference was found for stimuli whose occurrence was erratic but expected. Our work crucially complements these data by indicating that listening to the radio while driving, or to a portable audio player while walking or biking, may impair our reactions to stimuli that we do not expect.