Neural basis of eye gaze processing deficits in autism

Kevin A. Pelphrey, James P. Morris, Gregory McCarthy
DOI: http://dx.doi.org/10.1093/brain/awh404. Pages 1038–1048. First published online: 9 March 2005

Summary

Impairments in using eye gaze to establish joint attention and to comprehend the mental states and intentions of other people are striking features of autism. Here, using event-related functional MRI (fMRI), we show that in autism, brain regions involved in gaze processing, including the superior temporal sulcus (STS) region, are not sensitive to intentions conveyed by observed gaze shifts. On congruent trials, subjects watched as a virtual actor looked towards a checkerboard that appeared in her visual field, confirming the subject's expectation regarding what the actor ‘ought to do’ in this context. On incongruent trials, she looked towards empty space, violating the subject's expectation. Consistent with a prior report from our laboratory that used this task in neurologically normal subjects, ‘errors’ (incongruent trials) evoked more activity in the STS and other brain regions linked to social cognition, indicating a strong effect of intention in typically developing subjects (n = 9). The same brain regions were activated during observation of gaze shifts in subjects with autism (n = 10), but did not differentiate congruent and incongruent trials, indicating that activity in these regions was not modulated by the context of the perceived gaze shift. These results demonstrate a difference in the response of brain regions underlying eye gaze processing in autism. We conclude that lack of modulation of the STS region by gaze shifts that convey different intentions contributes to the eye gaze processing deficits associated with autism.

  • autism
  • eye gaze
  • functional MRI
  • ADI-R = Autism Diagnostic Interview—Revised
  • BOLD = blood oxygenation level-dependent
  • fMRI = functional MRI
  • STS = superior temporal sulcus

Introduction

Autism is an aetiologically complex, severe and pervasive neurodevelopmental disorder characterized by deficits in social interactions, language and communication abnormalities, as well as the presence of restricted, repetitive behaviours, and a characteristic developmental course (American Psychiatric Association, 1994). While autism features heterogeneous impairments, the core disability is thought to revolve around social functioning (Kanner, 1943; Waterhouse et al., 1996). Among the most striking features of the social impairments in autism are deficits in coordinating visual attention with others (i.e. initiating and responding to joint attention) and understanding the mental states and social intentions of other people on the basis of information gathered from the eyes (Loveland and Landry, 1986; Mundy et al., 1986; Baron-Cohen, 1995; Dawson et al., 1998; Leekam et al., 1998, 2000; Baron-Cohen et al., 1999a; Frith and Frith, 1999). Eye gaze processing impairments appear early in the development of children with autism (Mundy et al., 1986; Dawson et al., 1998) and, while many affected individuals improve in their ability to coordinate visual attention, even high-functioning individuals with autism exhibit impairments on tasks involving mentalistic inferences from viewing expressions in the eyes (Baron-Cohen et al., 2001).

Gaze processing deficits in autism do not appear to be based in eye gaze discrimination per se, but result from impairment in using gaze to understand the intentions and mental states of other people (Baron-Cohen, 1995; Baron-Cohen et al., 1999a, 2001; Leekam et al., 1998, 2000). For example, Baron-Cohen (1995) carried out an experiment in which he showed pictures of a cartoon face (named ‘Charlie’) looking at one of four different kinds of candy and asked the child which candy Charlie prefers. Normal and mentally retarded children generally pointed to the candy that Charlie was gazing towards, i.e. these children linked the direction of gaze with its mentalistic significance and inferred that Charlie most probably preferred the candy to which his gaze was directed. In contrast, children with autism were significantly less likely to point to the candy at which Charlie was gazing. This deficit was not due simply to an inability to perceive the direction of gaze; in a different task, children with autism scored as well as normal or mentally retarded children when shown faces looking towards or away from them, and asked, ‘Which one is looking at you?’ In other words, the children with autism were able to perceive direction of gaze in a somewhat simpler task, but were unable to use such information to infer the mental state of the other person.

Although behavioural studies such as the one just reviewed have produced elegant descriptions of gaze processing deficits in autism, the identification of underlying biological mechanisms that contribute to eye gaze processing deficits and other social perception impairments in autism remains a critical and largely unmet challenge. A growing body of cognitive neuroscience research in neurologically normal subjects is mapping out key neural structures involved in social cognition. Allison et al. (2000) used the term ‘STS region’ to refer to cortex within the superior temporal sulcus (STS), to adjacent cortex on the surface of the superior temporal gyrus and middle temporal gyrus (near the straight segment of the STS) and to adjoining cortex on the surface of the angular gyrus (near the ascending limb of the STS). Several early functional neuroimaging studies of neurologically normal adults from our laboratory and others demonstrated the role of the STS region in processing observed eye movements (Puce et al., 1998; Wicker et al., 1998; Hoffman and Haxby, 2000). Subsequent work demonstrated that this region is sensitive to the social context within which a gaze shift occurs, i.e. whether the gaze is perceived to be consistent or inconsistent with the subject's expectation regarding the intention of the person making the eye movement (Pelphrey et al., 2003). In that study, which used a functional MRI (fMRI) paradigm with neurologically normal subjects nearly identical to the one used in the present study, a strong effect of context was observed in the right posterior STS region in which observation of gaze shifts away from a target (incongruent shifts) evoked a haemodynamic response with extended duration and greater amplitude compared with gaze shifts toward the target (congruent shifts). We have since demonstrated that the STS region plays a critical role in processing eye gaze signals of approach and avoidance (Pelphrey et al., 2004). 
The STS region also responds to the intentionality of other observed human actions including reaching-to-grasp movements of the arm and hand (Pelphrey et al., 2004), and is sensitive to the level of intentionality exhibited by simple geometric figures moving in a goal-directed manner (Castelli et al., 2002). These and other findings support the conclusion that the human STS is involved in social perception and social cognition via the visual analysis of social information conveyed by gaze direction, body movement and other types of biological motion (Allison et al., 2000).

In the present research, using event-related fMRI at 1.5 T, we evaluated the hypothesis that, in autism, brain regions normally involved in eye gaze processing are not sensitive to intentions conveyed by gaze shifts. During fMRI scanning, subjects with and without high-functioning autism watched as a small checkerboard appeared and flickered in an animated character's visual field (Fig. 1). On congruent (goal-directed) trials, the character shifted her gaze towards the checkerboard (Fig. 1, top panel), confirming the subject's expectation. On incongruent (non-goal-directed) trials, the character shifted her gaze towards empty space (Fig. 1, bottom panel), violating the subject's expectation. The mechanical aspects of the two conditions, including the eccentricity, velocity and duration of the gaze shifts, were equivalent; only the potential for violating the subject's expectation regarding the character's intentions differed. Based on our prior findings using this and similar fMRI paradigms in neurotypical adults (Pelphrey et al., 2003, 2004a, b), we hypothesized that neural activity in the STS region evoked by observation of gaze shifts would differentiate congruent and incongruent trials in neurologically normal subjects. In contrast, we predicted that subjects with autism would exhibit activity in the STS region during observation of eye gaze shifts (reflecting perception of the gaze shifts), but that this activity would not differ for incongruent versus congruent trials (reflecting a failure to link the perception of the gaze shift with its mentalistic significance).

Fig. 1

Depiction of task design for the experiment. Trials began when a small checkerboard appeared and flickered for 5 s in the character's field of view. In the congruent condition, the character shifted gaze towards the checkerboard with a 1 s stimulus onset asynchrony. In the incongruent condition, the character gazed towards one of five empty locations in space with a 1 s stimulus onset asynchrony.

Methods

Subjects

Ten right-handed subjects with autism (one female, nine males; 23.2 ± 9.9 years; age range 17.9–50.7 years) were recruited through the North Carolina Neurodevelopmental Disorders Research Center Subject Registry and the Treatment and Education of Autistic and Related Communication Handicapped Children programme in Chapel Hill, North Carolina, USA. Subjects or their parents consented to a protocol approved by the local Human Investigations Committee. Subjects were paid for participating. Diagnoses were based on a history of clinical diagnosis of autism, parental interview [Autism Diagnostic Interview—Revised (ADI-R); Lord et al., 1994] and proband assessment [Autism Diagnostic Observation Schedule (ADOS); Lord et al., 2000]. For the autism group, the average intelligence quotient (IQ) scores (SDs and ranges in parentheses) were: full scale = 107 (16; 83–128), verbal = 106 (20; 74–130) and performance = 107 (14; 82–123). Average (SDs in parentheses) ADI-R algorithmic scores were: 21 (8) for the Reciprocal Social Interaction Domain, 15 (5) for the Communication Domain, 6 (3) for the Restricted, Repetitive, and Stereotyped Patterns of Behaviour Domain, and 3 (2) for the Onset Domain. For the comparison group, nine right-handed neurologically normal subjects (one female, eight males; 23.4 ± 5.8 years; age range 15.5–32.4 years) screened against major psychiatric illness, developmental disability and neurological problems were recruited from the community. The average IQ scores for the comparison group were: full scale = 118 (9; 106–132), verbal = 118 (10; 108–132) and performance = 117 (9; 104–127). The subject groups were matched at the group level on age, gender and IQ. There was a trend towards higher IQ scores in the comparison group, and these subjects had a narrower age range, but the two groups did not differ significantly on any of the matching variables.

Design

An animated character was created using Poser 4.0® (Curious Labs Inc., Santa Cruz, CA) (Fig. 1). See Supplementary material for sample movies. At the start of each trial, a checkerboard-patterned box appeared and flickered for 5 s in one of three positions (i.e. above each shoulder at eye level, above eye level or below eye level) on either side of the animated character within her visual field. The character's response to the appearance of the checkerboard distinguished the two conditions. In the ‘congruent’ condition (Fig. 1, top panel), the character shifted her gaze to look towards the checkerboard after a 1 s delay. In the ‘incongruent’ condition (Fig. 1, bottom panel), she looked towards one of the five remaining empty locations in space after a 1 s delay. In both conditions, the gaze shift was maintained for 4 s. The eyes returned to the original position within 500 ms of the checkerboard's disappearance. Trials were separated by a 21 s interval during which the character alone was presented with eyes forward. The CIGAL (Voyvodic, 1999) program was used to control stimulus presentation. Stimuli were presented using an LCD projector at XGA resolution that projected images upon a translucent screen placed behind the subject's head. Subjects viewed the stimuli through custom glasses with angled mirrors. Participants were instructed to attend to the screen at all times, but otherwise were allowed to look at the stimulus presentation in any manner they wished. Subjects were also instructed to press a button with the thumb of their right hand whenever they saw the eyes move (regardless of whether the eyes acquired the target). Imaging sessions consisted of 10 runs (70 trials of each condition). Trials appeared to be seamless to the subject in that each trial began with the character's eyes facing forward without visual cues signalling the next trial.

We used a virtual actor to create our stimuli. This has the advantage of affording precise control over the movements of the actor as well as some potentially confounding variables such as background colour and lighting. However, some reports indicate that activation within the STS region may differ depending upon whether the actions of virtual or real-life actors are viewed (e.g. Perani et al., 2001).
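The timing of a single trial can be laid out as a short timeline. The following is only a sketch built from the durations stated above; the constant and function names are illustrative and are not part of the original stimulus software.

```python
# Illustrative trial timeline built from the timings stated in the Design section.
# All names are hypothetical; they are not taken from the CIGAL stimulus scripts.

CHECKERBOARD_DURATION_S = 5.0   # checkerboard flickers for 5 s
GAZE_SHIFT_DELAY_S = 1.0        # gaze shift begins 1 s after checkerboard onset
GAZE_HOLD_S = 4.0               # gaze shift is maintained for 4 s
INTER_TRIAL_INTERVAL_S = 21.0   # character alone, eyes forward


def trial_events(trial_onset_s):
    """Return (time in seconds, event label) pairs for one trial."""
    gaze_on = trial_onset_s + GAZE_SHIFT_DELAY_S
    checker_off = trial_onset_s + CHECKERBOARD_DURATION_S
    return [
        (trial_onset_s, "checkerboard on"),
        (gaze_on, "gaze shift"),
        (gaze_on + GAZE_HOLD_S, "gaze hold ends"),
        (checker_off, "checkerboard off"),
        (checker_off + INTER_TRIAL_INTERVAL_S, "next trial may begin"),
    ]


for time_s, label in trial_events(0.0):
    print(f"{time_s:5.1f} s  {label}")
```

Under these timings the end of the 4 s gaze hold coincides with the disappearance of the checkerboard, 5 s after trial onset.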

Imaging

Scanning was performed on a General Electric (Waukesha, WI) 1.5 T LX NVi MRI scanner system equipped with 41 mT/m gradients. A quadrature birdcage radio frequency head coil was used for transmission and reception (General Electric). Sixty-eight axial images were acquired using a 3D fast spoiled gradient-recalled (SPGR) sequence [repetition time (TR) = 500 ms; echo time (TE) = 20 ms; field of view (FOV) = 24 cm; image matrix = 256 × 256; voxel size = 0.9375 × 0.9375 × 2 mm]. Functional images were acquired using a gradient-recalled inward spiral pulse sequence (Glover and Law, 2001; Guo and Song, 2003) sensitive to blood oxygenation level-dependent (BOLD) contrast (TR = 1500 ms; TE = 30 ms; FOV = 24 cm; image matrix = 64 × 64; α = 90°; voxel size = 3.75 × 3.75 × 4 mm; 34 axial slices). These functional imaging parameters allowed whole-brain coverage and the spiral imaging protocol facilitated recovery of signal from anterior ventral temporal and other cortical areas that can be highly susceptible to artefact, while also providing good sensitivity to changes in BOLD contrast.
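The in-plane voxel sizes quoted above follow directly from the field of view and the image matrix. As a quick arithmetic check, assuming square in-plane matrices of 256 × 256 (anatomical) and 64 × 64 (functional), with a function name chosen here purely for illustration:

```python
def in_plane_voxel_mm(fov_cm, matrix_size):
    """In-plane voxel edge length (mm) for a square field of view and matrix."""
    return fov_cm * 10.0 / matrix_size


anatomical_mm = in_plane_voxel_mm(24, 256)   # 240 mm / 256
functional_mm = in_plane_voxel_mm(24, 64)    # 240 mm / 64

print(anatomical_mm, functional_mm)
```

The two results, 0.9375 mm and 3.75 mm, match the in-plane voxel dimensions reported for the anatomical and functional acquisitions.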

Image pre-processing

Image pre-processing was performed using SPM99 (Wellcome Department of Cognitive Neurology, London, UK) and custom MATLAB (Mathworks, Natick, MA) scripts. Three subjects were excluded due to motion artefacts (two without autism, one with autism). The 10 subjects with autism and nine subjects without autism included in the final sample did not have a >3 mm deviation in the centre of mass in the x-, y- or z-dimensions. The temporally realigned and motion-corrected scans were normalized to the Montréal Neurological Institute (MNI) template. Normalizing the scans to a common template offered the advantage of facilitating across-subject averaging and group comparisons of data in the same coordinate space. However, this approach may obscure potentially important between-group anatomical and functional differences. For example, a recent study comparing cortical sulcal maps in individuals with and without autism found anterior and superior displacements of the STS (Levitt et al., 2003); and Boddaert et al. (2004) recently reported abnormal STS volume in autism.

The functional data were high-pass filtered and spatially smoothed with an 8 mm Gaussian kernel. These normalized and smoothed data were used in the analysis procedures described below.

Data analysis

First, epochs synchronized to the trial onsets (i.e. appearance of the checkerboard) and containing two images preceding and 11 images following the onset of the stimulus events were extracted from the continuous time series of image volumes. Epochs were segregated and averaged by movement condition (incongruent or congruent). Across-subjects average time course volumes were computed for each stimulus condition. The average BOLD intensity values were then converted to percentage signal change relative to the 3 s pre-stimulus baseline. Voxel-based analyses identified gaze shift-evoked activity through correlational analyses with a reference waveform, resulting in two normalized t maps for each subject (one for each condition). Group-average maps were computed separately for each condition using the individual subject t maps as the basis of random effects analysis. For each voxel, each group of t values was tested for a significant difference from zero. The threshold for significance was set at a voxelwise uncorrected P < 0.05 (two-tailed) and a spatial extent of six contiguous functional voxels. Lower P values indicated a higher positive correlation between a voxel's waveform and the reference waveform. Aided by human brain atlases, we localized each cluster of activation.
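The epoch extraction and baseline conversion described above can be sketched in a few lines. This is a minimal illustration on a toy single-voxel time series, not the custom MATLAB code used in the study; the function names are hypothetical, and it assumes that the image acquired at trial onset is counted as the first of the 11 post-onset images.

```python
N_PRE = 2    # images preceding trial onset (2 x 1.5 s = 3 s baseline)
N_POST = 11  # images from trial onset onward (assumption: onset image included)


def extract_epoch(series, onset_index):
    """Slice one epoch of N_PRE + N_POST images around a trial onset."""
    return series[onset_index - N_PRE: onset_index + N_POST]


def average_epochs(epochs):
    """Point-by-point mean across epochs of equal length."""
    n = len(epochs)
    return [sum(values) / n for values in zip(*epochs)]


def percent_signal_change(epoch):
    """Express each point as % change from the mean of the pre-stimulus images."""
    baseline = sum(epoch[:N_PRE]) / N_PRE
    return [100.0 * (value - baseline) / baseline for value in epoch]


# Toy time series: flat signal of 100 with a single elevated image.
series = [100.0] * 40
series[15] = 102.0
onsets = [10]

epochs = [extract_epoch(series, i) for i in onsets]
psc = percent_signal_change(average_epochs(epochs))
print(psc)
```

In the toy series the elevated image falls at epoch index 7 and comes out as a 2% signal change relative to baseline.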

For each subject, a t statistic for each voxel on pair-wise comparisons between the incongruent and congruent condition was computed. This analysis was based on time course data for each condition (averaged over single trial repetitions of each condition) and was computed on the average of the two time points around the expected peak amplitude (7.5–9 s after stimulus onset) and computed across the single trial epochs. This resulted in a measure of the difference between the two conditions and provided a 3D incongruent > congruent statistical parametric map for each subject where the t statistic value at a given voxel represented an estimate of the effect size of the difference between the two epoch average activity waveforms within a time window encompassing the expected peak amplitudes of the haemodynamic response.
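A per-subject contrast of this kind can be sketched as follows. The text does not state which form of t statistic was used, so the Welch two-sample form below is an assumption, as are the function names and the convention that epoch index 2 corresponds to trial onset (t = 0 s) with one image every 1.5 s.

```python
import math
from statistics import mean, variance

TR_S = 1.5                    # repetition time of the functional images
N_PRE = 2                     # images preceding trial onset in each epoch
PEAK_WINDOW_S = (7.5, 9.0)    # expected peak of the haemodynamic response


def peak_indices():
    """Epoch indices inside the peak window, taking index N_PRE as t = 0 s."""
    first = N_PRE + round(PEAK_WINDOW_S[0] / TR_S)
    last = N_PRE + round(PEAK_WINDOW_S[1] / TR_S)
    return list(range(first, last + 1))


def peak_amplitude(epoch):
    """Mean signal over the time points around the expected peak."""
    return mean([epoch[i] for i in peak_indices()])


def two_sample_t(a, b):
    """Welch t statistic for mean(a) - mean(b) (assumed form, see lead-in)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def incongruent_vs_congruent_t(incongruent_epochs, congruent_epochs):
    """Voxelwise t for incongruent > congruent from single-trial epochs."""
    inc = [peak_amplitude(e) for e in incongruent_epochs]
    con = [peak_amplitude(e) for e in congruent_epochs]
    return two_sample_t(inc, con)
```

Under this indexing convention the peak window covers epoch indices 7 and 8, which correspond to the images at 7.5 s and 9 s after stimulus onset.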

The individual subject t statistic maps were then used in a random effects analysis across subjects. For each voxel in the MNI common space, t values (one derived from each subject) were tested for a significant difference from zero using a one-sample t test. This process provided a whole-brain normalized map of significance values for the incongruent versus congruent comparison from the random effects analysis. Lower P values indicated a larger incongruent > congruent difference between a voxel's waveform at the expected peak. The threshold for significance was set at a voxelwise uncorrected P < 0.05 (two-tailed) and a spatial extent of six functional voxels (Forman et al., 1995; Xiong et al., 1995). We localized each cluster of incongruent > congruent activation and we report the anatomical label, Brodmann area, voxel count and MNI coordinates of the centre of activation from each cluster by group in Table 1.
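The second-level step amounts to a one-sample t test at each voxel on the subjects' first-level values. The sketch below shows that calculation for a single voxel; the input values are hypothetical and the function name is illustrative.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, mu=0.0):
    """t statistic and degrees of freedom for H0: mean(values) == mu."""
    n = len(values)
    t = (mean(values) - mu) / (stdev(values) / math.sqrt(n))
    return t, n - 1


# Hypothetical first-level t values at one voxel, one per subject (n = 9).
subject_t = [2.1, 1.4, 3.0, 0.8, 2.5, 1.9, 2.2, 1.1, 2.7]
t_value, df = one_sample_t(subject_t)
print(f"t({df}) = {t_value:.2f}")
```

Because each subject contributes one value per voxel, the degrees of freedom depend on the number of subjects (n − 1) rather than on the number of trials, which is what makes this a random effects analysis.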

Table 1

Summary of observed regions of incongruent > congruent activation

Region    Side    x    y    z    Nvox    BA
Neurologically normal group
    Superior temporal sulcus    R    −49−51111222
    Middle temporal gyrus    R    −63−5402221
    Inferior parietal lobule    R    −52−37376640
    Inferior parietal lobule    L    −53−40421040
    Middle frontal gyrus    R    −46−2332639
    Precentral gyrus    R    −56−16111544
    Cingulate gyrus    R    −7−26352832
Autism group
    Middle temporal gyrus    R    −49−79−18619
    Inferior occipital gyrus    L    −38−66−3637
    Inferior frontal gyrus    L    −42−19−1847
    Insular cortex    R    35−50−9NA
  • Nvox = number of voxels in the region of interest; x, y and z refer to the stereotaxic coordinates of the centre of activation within a region of interest; R = right hemisphere; L = left hemisphere; BA = Brodmann area. The threshold for significance of the clusters reported here was set at a voxelwise uncorrected P < 0.05 (two-tailed) and a spatial extent of six functional voxels.

For selected clusters of significant incongruent > congruent activation, we conducted a set of 2 (group: autism versus typical) × 2 (condition: incongruent versus congruent) repeated-measures analysis of variance (ANOVA) procedures with the percentage BOLD signal change averaged across the two time points around the expected peak amplitude (7.5–9 s after stimulus onset) as the dependent measure. These ANOVAs allowed us to test predicted group × condition interactions in key regions of interest.
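In a 2 × 2 design with one between-subjects factor and one within-subjects factor, the interaction F is the square of a pooled-variance two-sample t test on the per-subject difference scores (incongruent minus congruent). The sketch below uses that equivalence; the data are hypothetical and the function name is illustrative, and it is not the statistical software used in the study.

```python
import math
from statistics import mean, variance


def interaction_f(diff_group_a, diff_group_b):
    """F and (df1, df2) for the group x condition interaction.

    Inputs are per-subject incongruent-minus-congruent difference scores.
    """
    na, nb = len(diff_group_a), len(diff_group_b)
    pooled_var = ((na - 1) * variance(diff_group_a) +
                  (nb - 1) * variance(diff_group_b)) / (na + nb - 2)
    t = (mean(diff_group_a) - mean(diff_group_b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))
    return t * t, (1, na + nb - 2)


# Hypothetical difference scores in % signal change (not data from the study).
typical_diff = [0.12, 0.09, 0.15, 0.11, 0.08, 0.14, 0.10, 0.13, 0.09]          # n = 9
autism_diff = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01, 0.01, 0.04, -0.03, 0.00]  # n = 10

f_value, df = interaction_f(typical_diff, autism_diff)
print(f"F{df} = {f_value:.2f}")
```

With nine comparison subjects and ten subjects with autism, the interaction has 1 and 17 degrees of freedom.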

Results

fMRI results

The congruent and incongruent eye gaze sequences strongly activated the STS region. As illustrated in Fig. 2A and B, in both groups of subjects, prominent activation was observed in the right hemisphere STS region where the STS bifurcates into the posterior portion of the main branch and its ascending limb. The anatomical location of this eye gaze-evoked STS activity is consistent with the area of the STS region identified in several prior reports of activity in neurologically normal adult subjects evoked by observation of eye movements (Puce et al., 1998; Wicker et al., 1998; Hoffman and Haxby, 2000; Pelphrey et al., 2003, 2004a).

Fig. 2

Results from random effects analyses. (A and B) Activation maps indicating regions with significant eye gaze-evoked activity (collapsing across congruent and incongruent gaze shifts) in the group of neurologically normal subjects (A) and subjects with autism (B). (C and D) Activation maps indicating regions where the average response at expected peak amplitude to incongruent gaze shifts was greater than the average response to congruent gaze shifts in neurologically normal subjects (C) and the lack of these activations in subjects with autism (D). All maps are thresholded at a voxelwise uncorrected P < 0.05 (two-tailed) and a spatial extent of six contiguous voxels.

We next identified brain regions in each group of subjects that responded more strongly to the incongruent gaze shifts than to congruent gaze shifts (incongruent > congruent). These activations are illustrated in Fig. 2C and D. Focusing first on Fig. 2C, the red colour map indicates regions of incongruent > congruent activation in the right STS region in the neurologically normal group. In these subjects, cortical regions including the right posterior STS, superior temporal gyrus, middle temporal gyrus and inferior parietal lobule responded more strongly to incongruent than to congruent gaze shifts. While our hypotheses in this study focused on the STS region, other regions of significant incongruent > congruent activity were observed including activation clusters localized to the right middle frontal gyrus, the inferior parietal lobules bilaterally, the right precentral gyrus and dorsal aspects of the cingulate gyrus. The stereotaxic coordinates of the centres of activation within each cluster of incongruent > congruent voxels identified in this random effects analysis are presented by subject group in Table 1.

In individuals with autism, the activation maps showing clusters of incongruent > congruent activations were markedly different. For the subjects with autism, we did not observe significant clusters of incongruent > congruent activity in the STS region (compare Fig. 2C and D). Rather, incongruent > congruent activity in the autism group was localized to the left inferior frontal gyrus and the right insular cortex/claustrum (Fig. 3A, B and D), the right posterior middle temporal gyrus (Fig. 3A) and the left middle and inferior occipital gyri (Fig. 3C). Note that this group difference was not due to our selection of significance threshold for the incongruent > congruent comparison. Even when we lowered this threshold considerably, the areas of differential activation identified in the neurologically normal subjects remained absent in the subjects with autism. By inspecting each subject's incongruent > congruent activation map, we determined that only one subject with autism evinced a significant cluster of incongruent > congruent activity in the STS region. Each of the neurologically normal subjects exhibited such clusters.

Fig. 3

Results from random effects analyses of the subjects with autism. Activation maps indicating regions where the average response at expected peak amplitude to incongruent gaze shifts was greater than the average response to congruent gaze shifts in subjects with autism. These activations were observed in the (A) right insula/claustrum and (B and D) left insula/inferior frontal gyrus, (A) right posterior middle temporal gyrus and (C) left middle/inferior occipital gyrus. Maps are thresholded at a voxelwise uncorrected P < 0.05 (two-tailed) and a spatial extent of six contiguous voxels.

The average time courses from clusters of incongruent > congruent activity in the right hemisphere STS region are given by condition and subject group in the two panels of Fig. 4. Two features of these graphs are particularly noteworthy. First, significant haemodynamic responses were observed in both groups for both stimulus conditions in the STS. Secondly, only the neurologically normal subjects showed a significantly greater response in the STS region to the incongruent gaze shift sequence compared with the congruent sequence (compare Fig. 4A and B). Thus, whereas both groups activated the STS during eye gaze processing, only the neurologically normal subjects demonstrated increased activity for observation of incongruent gaze shifts.

Fig. 4

Response properties from clusters of incongruent > congruent voxels. (A and B) Average BOLD signal time courses from voxels in the right posterior STS region that responded more strongly to incongruent than to congruent gaze shifts in typically developing subjects. Waveforms from neurologically normal subjects are presented in A and from subjects with autism in B.

Repeated-measures ANOVAs on the expected peak BOLD percentage signal change values from the posterior STS region confirmed the group (typical versus autism) × condition (incongruent versus congruent) interaction illustrated in Fig. 4A and B [F(1,17) = 28.41, P < 0.0001]. A significant group × condition interaction was also observed in the inferior frontal gyrus and insular cortex/claustrum [Fig. 3A, B and D; F(1,17) = 5.72, P < 0.05]. This interaction was not significant for the right posterior middle temporal gyrus (Fig. 3A) or left middle and inferior occipital gyri (Fig. 3C).

On average, activity in the STS region in the group of subjects with autism did not differ significantly for incongruent and congruent gaze shifts, but individual differences in the degree of dysfunction were apparent. To explore potential behavioural correlates of these individual differences in brain responses, we correlated scores on several algorithmic domains of the ADI-R (Lord et al., 1994) used to support the diagnosis of autism with the magnitude of incongruent > congruent differentiation in the right STS region. We assumed that higher levels of incongruent > congruent differentiation (i.e. greater incongruent minus congruent differences scores) in the STS region would indicate cortical functioning that is more similar to the functioning of the neurologically normal subjects. Higher scores on aspects of the ADI-R can indicate greater severity of autism. Consequently, we anticipated a negative correlation between the functioning of the STS region and severity of autism. As illustrated in Fig. 5, the magnitude of the incongruent > congruent difference score was negatively correlated with scores in the Reciprocal Social Interaction Domain (Fig. 5A; r = −0.78, P = 0.004), but was not significantly correlated with impairments in the Communication Domain (Fig. 5B; r = −0.47, P = 0.081) or the Restricted, Repetitive and Stereotyped Patterns of Behaviour Domain (Fig. 5C; r = −0.40, P = 0.129). We removed the verbal communication items from the Communication Domain and identified a significant correlation between the subset of non-verbal communication items and the magnitude of the incongruent > congruent difference score (Fig. 5D; r = −0.58, P = 0.039). Within and between each subject group, the magnitude of incongruent > congruent differentiation in the STS region did not correlate with age or intelligence levels (P > 0.50 for all). 
These findings suggest that the degree of neurofunctional impairment in the right STS region is related to the severity of specific and relevant features of the autism phenotype.
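A Pearson correlation can be converted to a t statistic with n − 2 degrees of freedom. The sketch below does this for the reported coefficients, assuming that all ten subjects with autism entered each correlation (n = 10); the function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic and degrees of freedom for a correlation r from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r), n - 2


N_AUTISM = 10  # assumption: every subject with autism contributed a score

for label, r in [("Reciprocal Social Interaction", -0.78),
                 ("Communication", -0.47),
                 ("Restricted, Repetitive and Stereotyped", -0.40),
                 ("Communication (non-verbal items)", -0.58)]:
    t, df = t_from_r(r, N_AUTISM)
    print(f"{label}: r = {r:.2f}, t({df}) = {t:.2f}")
```

Larger absolute t values correspond to smaller P values, so the ordering of the four domains by strength of association follows the ordering of the reported coefficients.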

Fig. 5

Correlations between responses to incongruent and congruent gaze shifts in the right posterior STS and the Autism Diagnostic Interview—Revised (ADI-R) scores reflecting impairments in the (A) Reciprocal Social Interaction Domain, (B) Communication Domain, (C) Restricted, Repetitive and Stereotyped Patterns of Behaviour Domain and (D) Communication Domain (non-verbal items only).

Behavioural results

Eye tracking studies conducted by our group and others of individuals with autism have identified pronounced abnormalities in the ways in which these individuals look at faces including viewing non-feature areas of faces more often and feature areas of faces (i.e. eyes, nose and mouth) less often than typically developing matched comparison subjects (Klin et al., 2002; Pelphrey et al., 2002). With these reports in mind, we asked our participants to press a button whenever they saw the eyes move (regardless of the stimulus condition). On average, subjects with autism correctly identified eye movements on 98.77% of trials. Neurologically normal comparison subjects correctly identified eye movements on 98.56% of trials. Levels of accuracy and reaction times did not differ for the two groups, nor did these two measurements differ by condition (P > 0.21 for all). These behavioural findings suggest that the fMRI findings reported here cannot be explained by differences in the degree to which subjects attended to the stimuli.

Discussion

We hypothesized that in neurologically normal subjects but not in individuals with autism, the STS region and other brain structures that have been linked to social cognition and social perception would differentiate observed target- or goal-directed (congruent) and non-goal-directed (incongruent) eye gaze shifts. The fMRI results presented here demonstrate that both neurologically normal subjects and subjects with autism activated the STS region in response to viewing shifts in eye gaze, but only the neurologically normal subjects showed activity that differentiated congruent and incongruent gaze shifts in the STS region. These findings converge with behavioural studies of individuals with autism suggesting that gaze processing deficits in autism are not based on problems with gaze discrimination, but rather are linked to deficits in using information from gaze direction to solve real-world social puzzles that demand awareness of contextual subtleties and the intentions of another person.

Activity associated with additional processing was observed in other brain regions including the insular cortex/inferior frontal gyrus and the posterior middle temporal and middle occipital gyri in the individuals with autism for incongruent compared with congruent gaze shifts. However, only the insular cortex/inferior frontal gyrus cluster exhibited a reliable group × condition interaction. The insular cortex activity might represent an increase in arousal level relating to observation of the incongruent gaze shift. For instance, activation in the insula has been shown to mirror changes in subjects' autonomic arousal as indexed by the galvanic skin response (Critchley et al., 2000).

The finding of greater activity to incongruent versus congruent (target-directed) gaze shifts in neurologically normal subjects replicates prior studies from our laboratory using this paradigm (Pelphrey et al., 2003) and a similar paradigm involving reaching-to-grasp action sequences (Pelphrey et al., 2004b) with neurologically normal subjects. The current results, combined with our previous findings in this area, converge to suggest that the right posterior STS region is an important component of the neural architecture supporting social cognition and social perception in neurologically normal subjects. We have interpreted these findings to suggest that this region of the STS is sensitive to the intentionality and appropriateness of biological motion (Pelphrey et al., 2004b). In this interpretation, we propose that the flashing checkerboard serves as a target stimulus capable of eliciting a specific action from the virtual actor (shifting gaze to look at the target). This establishes an expectation for the character's subsequent gaze behaviour based on the subject's implicit predictions concerning the rationality of the actor. When the character makes a non-goal-directed gaze shift towards empty space, processing demands are increased due to a violation of the subject's expectation. In essence, this violation requires the subject to revise his or her initial expectation and thus demands greater perceptual processing of the action sequence. In the framework of the philosopher of mind Dennett (1987), a non-goal-directed gaze shift violates the subjects' ‘intentional stance’ and therefore their expectations about what the actor ‘ought to do’ given the demands of the action context.

The findings from the subjects with autism presented here closely mirror the extant behavioural findings concerning gaze processing deficits in autism and suggest that individuals with autism fail to link the perceptual representation of eyes moving with the concurrent representation regarding a character's goals, motives and desires (i.e. the contents of the actor's mind) to determine the intentions of another person. Thus, additional processing does not occur in the STS region in subjects with autism, either because an initial expectation regarding what the actor ought to do is never formed (i.e. they do not spontaneously adopt an intentional stance towards the virtual actor) or because information concerning violation of this expectation never reaches the STS region and thus no demand is made for additional processing (i.e. the STS is not re-engaged when the intentional stance is violated). Both interpretations of the present findings point to a disconnection between the perceptual processing of eye movements in the STS region and the mentalistic significance of these motions.

Areas of differential activity identified in our neurologically normal subjects are consistent with imaging studies reporting circumscribed activations to several different tasks that tap ‘theory of mind’ (i.e. the ability to make inferences about the mental states of others and to make predictions about the behaviour of others based on those inferences) (Premack and Woodruff, 1978) and other aspects of social cognition (Gallagher et al., 2000; Vogeley et al., 2001; Blakemore et al., 2003; Grèzes et al., 2004). For instance, Brothers (1990) proposed a network comprising the social brain including the orbitofrontal cortex, superior temporal gyrus and the amygdala. Frith and Frith (1999) argued for involvement of the STS, inferior frontal gyrus and medial prefrontal cortex. Grèzes et al. (2004) identified activity in a network of regions including the STS, orbitofrontal cortex, paracingulate cortex and cerebellum when subjects judged the actions of another to reflect a false belief. Our findings are also concordant with previous results linking deficits in aspects of social cognition and theory of mind in autism to functional abnormalities in prefrontal cortex, the amygdala and the STS. For example, a PET study demonstrated hypoactivation of the right posterior STS in subjects with autism relative to comparison subjects during a task requiring the visual analysis of goals and intentions from moving geometric figures designed to evoke attributions of intention to varying degrees (Castelli et al., 2002).

One other neuroimaging study has examined an aspect of gaze processing in individuals with high-functioning autism. Baron-Cohen et al. (1999b) used fMRI to measure brain activity during a task requiring participants to infer the mental state of another individual from the expression conveyed by that person's eyes alone. The superior temporal gyri, left amygdala and the insula were activated in neurologically normal subjects performing this ‘Eyes Task’. Relative to neurologically normal subjects, subjects with autism activated frontal components less extensively, and showed decreased activation in the amygdala and increased activity in the superior temporal gyri. Whereas the task used in the study of Baron-Cohen et al. involved only static gaze, the current study is unique in its focus on processing of observed gaze shifts (involving moving eyes) that confirmed or violated subjects' expectations regarding the virtual actor's intentions.

The present findings unfortunately cannot address the important question of whether the neurobiological basis of the lack of differential STS activity resides in the cortex of the STS region itself, or if the dysfunction is the result of failures in communication between the STS region and other brain structures involved in social processing. Consistent with the possibility of a primary pathology in the STS region, other neuroimaging studies have revealed hypoactivation of the STS in autism during tasks involving the attribution of intentions to moving geometric figures (Castelli et al., 2002) and human speech perception (Boddaert et al., 2003; Gervais et al., 2004). Bilateral hypoperfusion of temporal lobe areas has been observed in children with autism at rest (Ohnishi et al., 2000; Zilbovicius et al., 2000). A PET study of speech perception reported abnormal laterality of responses and hypoactivation of the left superior temporal gyrus (Boddaert et al., 2003), and an fMRI study observed abnormal responses in the STS region to human voices (Gervais et al., 2004). Finally, a recent study comparing cortical sulcal maps in individuals with and without autism found anterior and superior displacements of the STS (Levitt et al., 2003); and Boddaert et al. (2004) recently reported abnormal STS volume in autism. These findings are consonant with a potential disruption in the STS region itself (but cannot rule out the alternative discussed below). In light of the recent findings of anatomical abnormalities in the STS region in autism, it is important to comment on the potential effects on our results of normalizing the images to the MNI template. Specifically, neuroanatomical abnormalities in the STS region in subjects with autism might have adversely affected the accuracy of the normalization procedures used to create group level activation maps. The reduction in accuracy could have led to higher levels of between-subject variability in the localization of activity in the STS region. This increase in variability, in turn, might have led to a reduction in power to detect incongruent > congruent activity in the group of subjects with autism relative to neurologically normal subjects.

Alternatively, there may be abnormal functional connectivity between the STS region and other regions critical to social understanding. In this model, the STS region is initially activated in an obligatory manner when the subject perceives an eye gaze shift, and a representation of this information is fed forward to higher systems that analyse the goal-directed and intentional components of these motions. These higher systems may engage and maintain activation in the STS region and thus this higher level processing is reflected in the activation patterns of these lower level systems. The locations of these putative higher systems within this model are unspecified, but may include the prefrontal regions activated in this study. In individuals with autism, the connection between higher level systems and the STS region may be broken, and thus the higher level systems do not engage and maintain activation in the STS region. For example, a recent fMRI study by Just et al. (2004) found lower functional connectivity between Wernicke's and Broca's areas during language processing in subjects with autism. Similarly, Castelli et al. (2002) reported reduced functional connectivity between the posterior STS region and a portion of extrastriate visual cortex localized to the inferior occipital gyrus in individuals with autism during a task involving attribution of intentions. In an exploratory analysis, we measured the correlation between activity at expected peak amplitude in the STS and right posterior middle temporal gyrus (averaging across conditions) in each group of subjects. In the neurologically normal subjects, there was a significant positive correlation between these regions (r = 0.74, P < 0.05). This correlation was not observed in the autism group (r = 0.05, P > 0.05). These findings, together with those of Castelli et al. (2002) and Just et al. (2004), suggest the potential value of examining functional connectivity in future studies of the STS and social cognition in autism.
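As a rough check on the exploratory correlations reported above, the following Python sketch converts each reported r into a t statistic and compares it with the two-tailed critical value at α = 0.05. It rests on assumptions that are not stated in the text: that each subject contributed one data point, so that the group sizes given in the Summary (n = 9 and n = 10) apply, and that a standard two-tailed test of r against zero was used. The function names are illustrative only and do not come from the original analysis.

```python
import math

# Two-tailed critical t values at alpha = 0.05 (standard table values),
# keyed by degrees of freedom (n - 2 for a Pearson correlation).
T_CRIT_05 = {7: 2.365, 8: 2.306}


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def significant_at_05(r: float, n: int) -> bool:
    """True if |t| exceeds the two-tailed critical value for n - 2 df."""
    return abs(t_from_r(r, n)) > T_CRIT_05[n - 2]


# Assumption: one observation per subject; n taken from the Summary.
reported = [
    ("neurologically normal", 0.74, 9),
    ("autism", 0.05, 10),
]

for group, r, n in reported:
    t = t_from_r(r, n)
    verdict = "P < 0.05" if significant_at_05(r, n) else "P > 0.05"
    print(f"{group}: r = {r:.2f}, n = {n}, t({n - 2}) = {t:.2f}, {verdict}")
```

Under these assumptions, the reported r values are consistent with the stated significance levels in both groups. With samples this small, however, the confidence interval around each r is wide, which fits the authors' description of the analysis as exploratory.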

In conclusion, our findings revealed that the STS region and other brain structures that have previously been implicated in social cognition and theory of mind were active during observation of gaze shifts in subjects with autism, but these brain regions did not differentiate congruent and incongruent gaze shifts. The results presented here thus provide direct evidence for a neural basis of a specific eye gaze processing deficit in autism—reading intentions conveyed by shifts in eye gaze—and converge with the extant behavioural literature on gaze processing deficits in autism.

Supplementary material

The Supplementary material cited in this article is available at Brain on-line.

Acknowledgments

We wish to thank K. Karcher, P. Mack, Dr Grace Baranek, Marisa Houser, Anita Gordon, Dr P. Kartheiser and Dr A. Song for assistance with several aspects of this research. This research was supported by the North Carolina Studies to Advance Autism Research and Treatment Center, Grant 1 U54 MH66418 from the National Institutes of Health. K.A.P. was supported by a Career Scientist Development Award from the National Institute for Mental Health, Grant IKOIMH071284-1. G.M. is a VA Senior Research Career Scientist.

References
