Neural basis of irony comprehension in children with autism: the role of prosody and context

A. Ting Wang, Susan S. Lee, Marian Sigman, Mirella Dapretto
DOI: http://dx.doi.org/10.1093/brain/awl032 Pages 932-943. First published online: 15 February 2006

Summary

While individuals with autism spectrum disorders (ASD) are typically impaired in interpreting the communicative intent of others, little is known about the neural bases of higher-level pragmatic impairments. Here, we used functional MRI (fMRI) to examine the neural circuitry underlying deficits in understanding irony in high-functioning children with ASD. Participants listened to short scenarios and decided whether the speaker was sincere or ironic. Three types of scenarios were used in which we varied the information available to guide this decision. Scenarios included (i) both knowledge of the event outcome and strong prosodic cues (sincere or sarcastic intonation), (ii) prosodic cues only or (iii) knowledge of the event outcome only. Although children with ASD performed well above chance, they were less accurate than typically developing (TD) children at interpreting the communicative intent behind a potentially ironic remark, particularly with regard to taking advantage of available contextual information. In contrast to prior research showing hypoactivation of regions involved in understanding the mental states of others, children with ASD showed significantly greater activity than TD children in the right inferior frontal gyrus (IFG) as well as in bilateral temporal regions. Increased activity in the ASD group fell within the network recruited in the TD group and may reflect more effortful processing needed to interpret the intended meaning of an utterance. These results confirm that children with ASD have difficulty interpreting the communicative intent of others and suggest that these individuals can recruit regions activated as part of the normative neural circuitry when task demands require explicit attention to socially relevant cues.

  • autism
  • brain development
  • fMRI
  • language pragmatics
  • social cognition

  • ASD = autism spectrum disorders
  • BA = Brodmann area
  • EKO = event knowledge only
  • IFG = inferior frontal gyrus
  • MPFC = medial prefrontal cortex
  • PCO = prosodic cues only
  • STG = superior temporal gyrus
  • STS = superior temporal sulcus
  • TD = typically developing

Introduction

A disparity between formal linguistic skills (e.g. syntax, phonology, morphology) on one hand and pragmatic impairments (i.e. difficulties with the social use of language in context) on the other is evident in high-functioning individuals with autism spectrum disorders (ASD) (Minshew et al., 1997; Tager-Flusberg, 1981). For example, individuals with ASD frequently misinterpret the intended meaning of non-literal language—such as irony—as consistent with the literal meaning (Happe, 1995). Irony, the use of words to express something other than and especially the opposite of the literal meaning, is used commonly and understood effortlessly in everyday conversations. Detecting irony, however, actually involves rather complex mental representations, as the listener needs to understand not only that the speaker does not mean exactly what she/he said, but also that she/he does not expect to be taken literally. Accordingly, prior research has demonstrated an association between ‘theory of mind’ tasks tapping into the ability to represent the mental states of others and the ability to understand irony in both typically developing (TD) children (Sullivan et al., 1995) and children with ASD (Happe, 1993). High-functioning individuals with ASD who successfully perform second-order theory of mind tasks (i.e. representing beliefs about another's beliefs) may correctly detect irony in a laboratory setting (Happe, 1993), but they still have difficulty justifying their responses and show little evidence of using and understanding irony in their everyday lives (Leekam and Prior, 1994).

An understanding of the intended meaning behind an ironic remark typically emerges between 7 and 8 years of age (Ackerman, 1981; Demorest et al., 1984; Winner and Leekam, 1991; Hancock et al., 2000). Several researchers have shown that the presence of strong intonational cues—usually lower pitch, longer tempo and greater intensity (Rockwell, 2000)—facilitates the interpretation of ironic utterances in TD children (Ackerman, 1986; Capelli et al., 1990; de Groot et al., 1995; Milosky and Ford, 1997; Keenan and Quigley, 1999). In addition, TD children rely heavily on contextual cues in order to detect a speaker's ironic intent when an event outcome is incongruent with the literal meaning of a remark (Ackerman, 1986).

Difficulty in appreciating irony is widely reported in individuals with ASD (Tantam, 1991; Happe, 1993, 1994; Leekam and Prior, 1994; Kaland et al., 2002, 2005; Martin and McDonald, 2004). This impairment could be related to deficits in using both prosodic and contextual information to make inferences about a speaker's communicative intent. Although very little is known about the ability of individuals with ASD to perceive and interpret prosodic cues in the speech of others (Paul et al., 2005), evidence suggests that impairments in extracting meaning from voices are present from a very early age. Unlike TD children and children with learning disabilities, children with autism do not show a preference for listening to their mother's voice (Klin, 1991, 1992) and may actually prefer a non-speech analogue to motherese (Kuhl et al., 2005). Furthermore, older children and adults with ASD are impaired in identifying emotion expressed through tone of voice (Hobson et al., 1989; Van Lancker et al., 1989; Rutherford et al., 2002). With regard to the ability to use contextual information in a meaningful way, several studies have shown that children and adults with ASD are less likely than controls to use sentence context to aid in the correct pronunciation of homographs (e.g. ‘there was a tear in her eye’ versus ‘there was a tear in her dress’) and tend to give the more frequent pronunciation of a word instead of the appropriate one (Frith and Snowling, 1983; Happe, 1997; Jolliffe and Baron-Cohen, 1999; Lopez and Leekam, 2003). In addition, high-functioning adults with ASD are less able than controls to use context to make a global inference about a story character's action (Jolliffe and Baron-Cohen, 2000) and to appreciate the intent behind indirect requests (Ozonoff and Miller, 1996).

Neuropsychological research suggests that right hemisphere and prefrontal regions play important roles in irony comprehension. More specifically, patients with unilateral lesions in the right hemisphere are impaired in interpreting irony relative to both healthy controls (Tompkins and Mateer, 1985; Kaplan et al., 1990; Winner et al., 1998) and patients with left hemisphere brain damage, after controlling for the effects of aphasia (Giora et al., 2000). In addition, patients with prefrontal damage are less able to detect irony relative to patients with posterior lesions (Shamay et al., 2002). In particular, ventromedial prefrontal damage is associated with difficulty in comprehending irony (Shamay-Tsoory et al., 2003, 2005), although the extent of damage to lateral prefrontal regions (inferior and middle frontal gyri) of the left hemisphere has also been found to correlate with poor performance (Giora et al., 2000; Zaidel et al., 2002). Numerous neuroimaging studies have implicated the medial prefrontal cortex (MPFC), superior temporal sulcus (STS) and temporal poles as comprising a network involved in reasoning about the mental states of others (Siegal and Varley, 2002; Frith and Frith, 2003). Thus far, only one neuroimaging study has focused on the neural circuitry supporting irony comprehension in particular (Wang et al., submitted for publication). Consistent with the broader literature on theory of mind, both TD children and adults showed selective activity in the MPFC and bilateral temporal regions during potentially ironic scenarios that contained convergent facial, prosodic and contextual cues.

In recent years, significant progress has been made in describing both structural (Bauman and Kemper, 2005; Courchesne and Pierce, 2005) and functional (see Pelphrey et al., 2004 for a review) abnormalities associated with ASD. Many researchers have focused on lower-level perceptual impairments, such as those associated with face and voice processing (Siegal and Blades, 2003; Schultz, 2005). Relatively little is known about the neural circuitry underlying higher-level pragmatic impairments that persist even in high-functioning individuals with ASD. There is some evidence to suggest abnormalities in the networks previously described as supporting theory of mind abilities. Three studies have examined the neural basis of mentalizing impairments in adults with ASD, and all have observed abnormalities in MPFC activity in comparison with normal controls (Happe et al., 1996; Castelli et al., 2002; Nieminen-von Wendt et al., 2003). In the only study to date focusing on the neural underpinnings of impairments in irony comprehension, we recently asked children with ASD to view cartoon drawings while listening to short scenarios, where one of the characters makes a remark that is potentially ironic (Wang et al., submitted for publication). Consistent with the findings from the mentalizing studies described above, reduced activity in the MPFC and the right superior temporal gyrus (STG) was observed in ASD relative to TD children during the perception of potentially ironic versus control scenarios. However, MPFC activity increased reliably when children with ASD were given explicit instructions to attend to the speaker's facial expression and tone of voice, suggesting that neural functioning in the MPFC is, under some circumstances, intact in individuals with ASD.

In everyday social interactions, multiple converging cues (i.e. facial expression, tone of voice and knowledge of event outcome) are not always present. Consider, for example, a speaker who opens a conversation with the comment, ‘I've had such a great morning!’ Without further contextual information about the events of that morning, the listener must rely on other cues, such as the speaker's facial expression and/or tone of voice to determine if the remark is sincere or ironic. Conversely, a speaker can deliver an ironic comment with a deadpan expression, where the facial cues and intonation are both neutral, forcing the listener to use contextual cues to interpret the intent of a remark. The goal of the present study was to examine the neural circuitry underlying impairments in understanding irony in children with ASD, paying particular attention to the roles of prosody and context in inferring a speaker's communicative intent in the absence of facial affect cues. Given that impairments in utilizing both prosodic (Hobson et al., 1989; Van Lancker et al., 1989; Rutherford et al., 2002) and contextual (Ozonoff and Miller, 1996; Happe, 1997; Jolliffe and Baron-Cohen, 2000) cues are associated with ASD, we expected children with ASD to be less accurate than TD children at correctly detecting irony, given either intonational information or contextual information, or both. Because of the close link between theory of mind abilities and irony comprehension, we predicted that children with ASD would show less activity in brain regions known to be involved in mentalizing (i.e. MPFC, STS, temporal poles) relative to TD children.

Methods

Participants

Two groups of children and adolescents participated in the study: 18 males with autism or Asperger syndrome (7.4–16.9 years of age) and 18 TD males (8.1–15.7 years of age). Participants were recruited through referrals from the UCLA Autism Evaluation Clinic and through flyers posted around the UCLA campus and the greater Los Angeles area. All participants were right-handed, native English speakers, with a verbal IQ > 70. Exclusionary criteria included a reported history of known neurological (e.g. epilepsy) or major psychiatric (e.g. schizophrenia) disorders other than autism and the presence of a structural brain abnormality (e.g. aneurysm). For the comparison group, the Social Communication Questionnaire (Rutter et al., 2003) was used to screen for the presence of autistic symptomatology. For the ASD group, prior clinical diagnoses of ASD were confirmed using both the Autism Diagnostic Interview-Revised (ADI-R) (Lord et al., 1994) and the Autism Diagnostic Observation Schedule-Generic (ADOS-G) (Lord et al., 2000). The Social Responsiveness Scale (SRS) (Constantino et al., 2003) was used to assess the severity of social impairment. The groups did not differ significantly in chronological age or IQ scores, as assessed using either the Wechsler Intelligence Scale for Children-Third Edition (Wechsler, 1991) or the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999). Participant characteristics, including chronological age and verbal, performance and full-scale IQ, are presented in Table 1. Written informed consent was obtained from participants and their parents according to the specifications of the UCLA Institutional Review Board.

Table 1

Participants' characteristics

Characteristics | TD (mean ± SD) | ASD (mean ± SD)
Chronological age (years) | 11.9 ± 2.3 | 11.9 ± 2.8
Verbal IQ | 108 ± 13 | 99 ± 18
Performance IQ | 103 ± 15 | 105 ± 16
Full-scale IQ | 106 ± 14 | 102 ± 18
SRS | NA | 102 ± 29

Stimuli

In each of three experimental conditions, participants listened to short scenarios and determined whether the speaker was being sincere or ironic. All scenarios ended with a remark that was literally positive, but could have a sarcastic intent. In the event knowledge + prosodic cues condition (EK + PC), both information about the valence of the event and strong prosodic cues (sincere or ironic tone of voice) were available to aid in interpreting the speaker's communicative intent. In the ‘event knowledge only’ condition (EKO), contextual cues about the event outcome were provided, but the speaker's remark was made with a neutral intonation. In the ‘prosodic cues only’ condition (PCO), no information about the event outcome was given, but the speaker's remark was made in a strongly ironic or sincere (positive) tone of voice. Examples of stimuli are presented in Table 2. The three versions of each scenario were matched in terms of syntactic structure, semantic complexity and length. Following the final remark, participants had to decide whether the speaker really meant what he/she said. Yes/no judgements were indicated by pressing a button with the index or middle finger, respectively. Instructions were clear that a ‘yes’ response should be given if the comment was sincere and should be taken literally, while a ‘no’ response should indicate that the final remark was sarcastic and the speaker meant the opposite of what he/she said. Participants were shown examples of scenarios not used during the scan and all demonstrated an understanding of the task requirements.

Table 2

Example scenarios

Condition | Scenario | Prosody
EK + PC | Jack just got his test back. Ron sees the F/A* on it and says, ‘Way to go!’ | Ironic/sincere
 | Steve went to the barbershop. When Jen sees his bad/nice haircut, she says, ‘You look great!’ | Ironic/sincere
EKO | Jack just got his test back. Ron sees the F/A on it and says, ‘Way to go’ | Neutral
 | Steve went to the barbershop. When Jen sees his bad/nice haircut, she says, ‘You look great’ | Neutral
PCO | Jack just got his test back. Ron sees the grade on it and says, ‘Way to go!’ | Ironic/sincere
 | Steve went to the barbershop. When Jen sees his new haircut, she says, ‘You look great!’ | Ironic/sincere
  • *F and A denote a failing and excellent grade, respectively, in USA schools.

To verify that the final comments sounded sincere, ironic or neutral as intended, 12 adult volunteers listened to the remarks presented in isolation (i.e. without the surrounding context) and rated them on a scale of 1–7, with 1 as the anchor for ironic, 4 as the mid-point for neutral and 7 as the end-point for sincere/complimentary. Ironic remarks received a mean rating of 1.3 ± 0.6, neutral comments were rated as 4.0 ± 0.7 and sincere comments were rated as 6.7 ± 0.5.

Activation paradigm

Scenarios were presented in three 90-s activation blocks (one per condition), which were interspersed with rest periods of 21 s. Each condition consisted of six scenarios, presented at a rate of 15 s per story, where the speaker's communicative intent was ironic in half of the scenarios and sincere in the other half. Each participant heard only one version of every scenario. In order to avoid any specific item effects, each scenario was used equally often in each condition. The order of conditions was counterbalanced across subjects in a Latin square design. Response times and accuracy were recorded during scanning.
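
As an illustration of the counterbalancing described above, the following minimal Python sketch builds a cyclic 3 × 3 Latin square over the three conditions and assigns participants to its rows in rotation. The cyclic construction, the function names and the rotation-based assignment are assumptions made for this example; the text specifies only that the order of conditions followed a Latin square design.

```python
# Illustrative sketch only: a cyclic Latin square over the three conditions.
# The construction and the assignment rule are assumptions, not the authors' procedure.
CONDITIONS = ["EK + PC", "EKO", "PCO"]


def latin_square_orders(conditions):
    """Return a cyclic Latin square: row i is the condition list rotated left by i."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def order_for_participant(index, conditions=CONDITIONS):
    """Assign participants to rows of the square in rotation (hypothetical scheme)."""
    orders = latin_square_orders(conditions)
    return orders[index % len(orders)]


if __name__ == "__main__":
    for row in latin_square_orders(CONDITIONS):
        print(row)
```

In such a square each condition occupies each serial position exactly once, so position effects are balanced across the three orders.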

Data acquisition

Images were acquired on a Siemens Allegra 3T Scanner. A T2-weighted sagittal scout was used to prescribe the planes of the functional images and to ensure that no structural abnormalities were present. For each subject, the functional data consisted of 155 whole-brain volumes collected in the axial plane parallel to the anterior–posterior commissural line using an echoplanar imaging (EPI) gradient-echo sequence [repetition time (TR) = 3000 ms, echo time (TE) = 25 ms, 3 mm slice thickness/1 mm gap, 64 × 64 matrix size and field of view (FOV) = 20 cm]. A coplanar, high-resolution EPI structural volume (TR = 5000 ms, TE = 33 ms, 128 × 128 matrix size and FOV = 20 cm) was also acquired.
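
The acquisition parameters above imply a few derived quantities that can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a square field of view sampled on a square matrix, and the function names are hypothetical.

```python
# Illustrative arithmetic on the reported acquisition parameters.
# Assumes a square FOV and a square matrix; function names are hypothetical.

def in_plane_voxel_mm(fov_cm, matrix_size):
    """In-plane voxel edge (mm) for a square FOV sampled on a square matrix."""
    return fov_cm * 10.0 / matrix_size


def slice_spacing_mm(thickness_mm, gap_mm):
    """Centre-to-centre distance between adjacent slices (mm)."""
    return thickness_mm + gap_mm


def run_duration_s(n_volumes, tr_ms):
    """Total acquisition time (s) for n_volumes collected at a fixed TR."""
    return n_volumes * tr_ms / 1000.0


if __name__ == "__main__":
    print(in_plane_voxel_mm(20, 64))    # functional EPI, in-plane voxel edge in mm
    print(in_plane_voxel_mm(20, 128))   # structural EPI, in-plane voxel edge in mm
    print(slice_spacing_mm(3, 1))       # functional slice spacing in mm
    print(run_duration_s(155, 3000))    # functional run length in s
```

Under these assumptions the functional voxels measure 3.125 mm in plane with 4 mm between slice centres, and the 155 volumes span 465 s.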

Data analysis

We analysed the imaging data using SPM99 (http://www.fil.ion.ucl.ac.uk/spm/). Functional images were first realigned to correct for head motion with automated image registration (AIR) (Woods et al., 1998a) using a linear rigid-body registration algorithm. In order to allow for inter-subject averaging, all images were then transformed into a Talairach-compatible standard space (Woods et al., 1999) using polynomial non-linear warping (Woods et al., 1998b). Functional volumes were smoothed using a 6 mm full-width-half-maximum Gaussian kernel to increase signal-to-noise ratio.
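
Smoothing kernels are conventionally specified by their full width at half maximum (FWHM), whereas a Gaussian is parameterized by its standard deviation. The following sketch shows the standard conversion for the 6 mm kernel mentioned above; it is a generic illustration with hypothetical function names and does not reproduce the SPM99 implementation.

```python
import math

# Generic FWHM-to-sigma conversion for a Gaussian kernel (not SPM99 code).


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given FWHM (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_profile(distance_mm, sigma_mm):
    """Unnormalized Gaussian weight at a given distance from the kernel centre."""
    return math.exp(-(distance_mm ** 2) / (2.0 * sigma_mm ** 2))


if __name__ == "__main__":
    sigma = fwhm_to_sigma(6.0)
    print(round(sigma, 3))
    # By definition, the weight falls to half its peak at a distance of FWHM / 2.
    print(round(gaussian_profile(3.0, sigma), 3))
```

For a 6 mm FWHM kernel this gives a standard deviation of roughly 2.55 mm.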

For each subject, condition effects were estimated according to the general linear model using a box-car reference function with a 6-s delay to compensate for the lag in haemodynamic response. Response time and accuracy scores collected during scanning were entered as regressors to ensure that, for each subject, any differences observed in activation patterns between conditions were not due to differences in task difficulty. The resulting contrast images were entered into group analyses using a random-effects model to allow for inferences at the population level (Friston et al., 1999). In order to identify significant activity for each activation condition independent of performance, analyses of covariance (ANCOVA) were conducted for each group, treating accuracy as a nuisance variable. Between-group differences were also examined using ANCOVAs, with accuracy as a covariate, in order to control for between-group differences in performance. Results were initially explored using liberal thresholds of P < 0.05, uncorrected for multiple comparisons, for both magnitude and spatial extent. However, we consider reliable and discuss only those activations that survived more stringent thresholds of P < 0.01 (t > 2.58) at the voxel level and P < 0.05, corrected for multiple comparisons, at the cluster level. Within- and between-group comparisons were made within regions where reliable activation was detected in either group across all conditions. Because the verbal IQ of children with ASD was slightly, though not significantly, lower than that of TD children, we ran supplementary ANCOVAs to control for verbal IQ, in addition to performance accuracy. These analyses, not reported here for the sake of brevity, confirmed that our findings did not reflect between-group differences in verbal IQ.
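
A delayed box-car reference function of the kind described above can be sketched as follows. The block onset used in the example (21 s) is a hypothetical value chosen for illustration, since exact onset times are not reported; only the 3-s TR, the 155 volumes, the 90-s block length and the 6-s delay come from the text. The sketch builds the regressor only and does not fit the general linear model.

```python
# Illustrative delayed box-car regressor sampled at each volume's start time.
# The block onset passed in the example is hypothetical.


def boxcar_regressor(n_volumes, tr_s, block_onsets_s, block_duration_s, delay_s=0.0):
    """Return a list with 1.0 for volumes inside a (delayed) block and 0.0 elsewhere."""
    regressor = [0.0] * n_volumes
    for onset in block_onsets_s:
        start = onset + delay_s           # shift the block to allow for haemodynamic lag
        end = start + block_duration_s
        for volume in range(n_volumes):
            t = volume * tr_s             # acquisition start time of this volume
            if start <= t < end:
                regressor[volume] = 1.0
    return regressor


if __name__ == "__main__":
    reg = boxcar_regressor(
        n_volumes=155,
        tr_s=3.0,
        block_onsets_s=[21.0],            # hypothetical onset of one condition block
        block_duration_s=90.0,
        delay_s=6.0,
    )
    print(int(sum(reg)), "volumes modelled as active")
```

With a 3-s TR, the 6-s delay shifts each block by two volumes, and a 90-s block covers 30 volumes.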

Lastly, analyses were also conducted to address the effects of head movement during scanning. For each subject, mean head motion was computed using AIR by averaging the displacements across all voxels in all functional images relative to their mean position (Woods, 2003). No between-group differences were observed in terms of the mean motion detected throughout the functional scan.

Results

Behavioural results

Both TD and ASD children performed above chance levels in all conditions (Table 3). However, a repeated-measures ANOVA yielded a main effect of group, such that TD children were significantly more accurate overall than children with ASD [F(1,34) = 7.83, P < 0.01]. Although no group × condition interactions were observed, planned comparisons revealed that TD children performed more accurately than children with ASD in the two conditions in which contextual knowledge of the event outcome was provided (the EK + PC condition and the EKO condition). Somewhat unexpectedly, the groups did not differ significantly when only prosodic cues were available (PCO condition). No between-group differences were observed in response time, either overall or in any of the conditions.

Table 3

Behavioural performance

Condition | Accuracy, TD (% correct) | Accuracy, ASD (% correct) | t | df a | Response time, TD (s) | Response time, ASD (s) | t | df a
EK + PC | 99.1 (3.9) | 90.8 (11.2) | 2.95** | 21 | 2.58 (0.23) | 2.69 (0.71) | 0.641 | 21
EKO | 93.5 (10.1) | 81.8 (17.7) | 2.45* | 27 | 2.59 (0.34) | 2.65 (0.75) | 0.273 | 24
PCO | 86.1 (13.1) | 82.2 (21.8) | 0.651 | 34 | 2.61 (0.64) | 2.68 (0.62) | 0.766 | 34
  • a Degrees of freedom reflect unequal variances between groups;
  • ** P < 0.01;
  • * P < 0.05.
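
The note on degrees of freedom in Table 3 refers to a comparison that does not assume equal variances. As a hedged illustration, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics. It assumes that the values in parentheses in Table 3 are standard deviations and that each group contains 18 children; the function name is hypothetical, and the sketch is not the authors' analysis code.

```python
import math

# Welch's unequal-variance t-test from summary statistics (illustrative sketch).
# Assumes the parenthesized values in Table 3 are SDs and n = 18 per group.


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's t-test given group means, SDs and sample sizes."""
    v1 = sd1 ** 2 / n1                    # squared standard error, group 1
    v2 = sd2 ** 2 / n2                    # squared standard error, group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    # Accuracy in the EK + PC condition: TD versus ASD.
    t, df = welch_t_from_summary(99.1, 3.9, 18, 90.8, 11.2, 18)
    print(round(t, 2), round(df, 1))
```

Under these assumptions the EK + PC accuracy comparison yields a t value of about 2.97 with about 21 degrees of freedom, close to the tabulated values; small discrepancies are expected because the tabulated means and SDs are rounded.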

Functional MRI (fMRI) results

All conditions

Across all conditions, both groups showed reliable activity in temporal regions bilaterally, including the superior and middle temporal gyri and temporal poles. While the ASD group also recruited the inferior frontal gyrus (IFG) bilaterally, the TD group showed reliable activity only in the left IFG. However, significant activity was observed in the MPFC only in the TD group. Activity in this region was detected in the ASD group only at the most liberal spatial extent threshold of P < 0.05, uncorrected for multiple comparisons. For each group, reliable activity averaged across all conditions relative to rest is shown in Fig. 1. Talairach coordinates, Brodmann area (BA), and t-scores are presented in Table 4. Directly comparing the two groups revealed that children with ASD showed significantly greater activity than TD children in the right IFG as well as temporal regions bilaterally (Table 4 and Fig. 2). Greater activity was also observed in the left pre-central gyrus in ASD relative to TD children.

Table 4

Peaks of activity across all conditions versus rest

Anatomical regionBAHTD groupASD groupASD > TD
xyztxyztxyzt
MPFC10L−662164.27
9L−1050345.32
8L−620484.50
Superior frontal gyrus6L−48545.54
Post-central gyrus213L−38−30585.60
Pre-central gyrus4L−30−14644.31−40−20563.31
IFG44L−521884.98
45L−5624146.52
45R462045.40402282.97
47R3424−64.36
STG41R38−26145.0044−22126.36
42L−50−32107.92−56−26810.12−48−26164.22
42R48−14613.7456−30107.50
22L−60−28612.23−60−3469.34
22R60−16411.4856−2049.63
STSL−62−24210.23−50−2829.24−48−3023.65
R52−3865.3456−3047.5040−4−123.21
Middle temporal gyrus21L−62−3009.21−66−3606.03
21R48−12−103.1460−3424.41
Temporal pole21L−480−287.80−446−284.26
21R402−266.08442−262.89
38L−4812−205.63−4412−265.50
38R4810−105.444412−183.48
  • BA = putative Brodmann area; H = hemisphere; L and R = left and right hemispheres, respectively; x, y, and z = the Talairach coordinates corresponding to the left–right, anterior–posterior, and inferior–superior axes, respectively; t = the highest t-score within a region.

  • P < 0.05, corrected for multiple comparisons at cluster level; P < 0.01, corrected for multiple comparisons at voxel level.

Fig. 1

Brain activity during potentially ironic scenarios relative to rest. Reliable activity was observed in the MPFC in the TD group but not in the ASD group across all conditions. Although both groups engaged the IFG in the left hemisphere, only children with ASD showed reliable prefrontal activity in the right hemisphere. In addition, the ASD group showed reliable activity in the left pre- and post-central gyrus, which probably reflects increased sensory and motor responsiveness to the response pad in the right hand. Figures are thresholded at t > 2.58, corrected for multiple comparisons at the cluster level, P < 0.05.

Fig. 2

Brain regions more strongly activated in children with ASD relative to TD children. The ASD group showed significantly greater activity than the TD group in the right IFG as well as the STS bilaterally across all conditions relative to rest. Figures are thresholded at t > 1.80, corrected for multiple comparisons at the cluster level, P < 0.05.

Event knowledge + prosodic cues

When both contextual knowledge of the event outcome and prosodic cues were available to help infer the speaker's communicative intent, both groups showed significant activity in the superior and middle temporal gyri bilaterally, the left temporal pole and the MPFC. However, while no reliable activity was observed in frontal regions outside of the MPFC in TD children, significant activity in the IFG was detected bilaterally in children with ASD. Between-group ANCOVAs revealed that the ASD group did indeed engage the IFG more strongly than did the TD group during the EK + PC condition, after controlling for the effect of accuracy (Table 5). TD children did not show any regions of greater activity relative to children with ASD for this condition.

Table 5

Peaks of activity during EK + PC versus rest

Anatomical regionBAHTD groupASD groupASD > TD
xyztxyztxyzt
MPFC9L−250343.63−856265.11
9054243.31
Superior frontal gyrus6L−810624.10
Pre-central gyrus4L−42−24584.19−40−24542.77
IFG44R5018104.53
45L−5424125.77−5420123.15
45R482443.41482423.67
47L−4824−64.22
47R4438−23.764438−23.24
STG41L−48−28145.10
41R46−24106.8944−22125.14
42L−58−24105.87
42R52−2287.7342−2285.12
22L−58−2249.01−64−3665.67
22R50−14412.7850−2048.21
STSL−58−3266.69−54−2647.03
R50−2826.76
Middle temporal gyrus21L−64−28−27.32−56−4225.05
21R50−10−123.0356−24−102.84
39L−64−54102.87
Temporal pole21L−480−287.46−58−2−184.87
38L−4812−205.57
  • BA = putative Brodmann area; H = hemisphere; L and R = left and right hemispheres, respectively; x, y, and z=the Talairach coordinates corresponding to the left–right, anterior–posterior, and inferior–superior axes, respectively; t = the highest t-score within a region.

  • P < 0.05, corrected for multiple comparisons at cluster level; P < 0.01, corrected for multiple comparisons at voxel level.

Event knowledge only

When contextual cues about the valence of the event were provided but the speaker's remark was delivered in a neutral tone of voice, both groups showed reliable activity in the right temporal pole in addition to the temporal regions activated during the EK + PC condition. Significant activity in the IFG was also observed in the left hemisphere in TD children and bilaterally in children with ASD. Reliable MPFC activity was detected only in the TD group. Between-group comparisons confirmed that the ASD group showed greater activity in the right IFG than did the TD group, as expected from the within-group activation patterns. Greater activity was also observed in the ASD group in left temporal regions, including the superior and middle temporal gyri, as well as in the left post-central gyrus. TD children showed greater activity than children with ASD in dorsomedial prefrontal regions (superior frontal gyrus, BA 6). For this condition, peaks of activity for the within- and between-group comparisons are shown in Table 6.

Table 6

Peaks of activity during EKO versus rest

Anatomical regionBAHTD groupASD groupTD > ASDASD > TD
xyztxyztxyztxyzt
MPFC8L−220503.73
9L−1648343.94
10L−662163.50
Superior frontal gyrus6L−28523.77−418462.63
Cingulate gyrus32220422.62
Pre-central gyrus4L−36−24484.80−30−14622.75
Post-central gyrus213L−44−22543.39−48−24525.05−50−24502.84
IFG44L−581463.76
45L−5018206.84−5626124.24
45R4020124.684020123.83
47L−5414−42.95−4824−64.07
47R4430−64.15
STG41R42−22123.51
42L−46−32128.64−60−1486.60
42R48−1469.3344−1684.79
22L−62−18211.32−60−1206.47−44−2023.55
22R46−2468.9756−2048.41
STSL−60−4485.72−54−4065.94−46−3242.99
R58−806.1558−2844.09
Middle temporal gyrus21L−542−125.05−62−20−85.00−44−6−163.91
21R56−10−46.2648−12−44.14
Temporal pole21L−482−223.71−54−6−204.43
21R402−263.89
38L−4012−284.07−508−225.02
38R4610−124.674412−183.61
  • BA = putative Brodmann area; H = hemisphere; L and R = left and right hemispheres respectively; x, y, and z = the Talairach coordinates corresponding to the left–right, anterior–posterior, and inferior–superior axes, respectively; t = the highest t-score within a region.

  • P < 0.05, corrected for multiple comparisons at cluster level; P < 0.01, corrected for multiple comparisons at voxel level.

Prosodic cues only

When the speaker's remark was made in a clearly sincere or sarcastic tone of voice but contextual information about the event outcome was neutral, both groups showed reliable activity in the superior and middle temporal gyri, temporal poles and IFG bilaterally, as well as in the dorsal MPFC. Between-group comparisons revealed no reliable differences in frontal regions, consistent with the finding that the ASD and TD groups showed similar behavioural performance when knowledge of the event outcome was unavailable. However, greater activity was observed in the left STS and the right temporal pole in children with ASD versus TD children (Table 7).

Table 7

Peaks of activity during PCO versus rest

Anatomical regionBAHTD groupASD groupASD > TD
xyztxyztxyzt
MPFC8L−1030524.50
Superior frontal gyrus6L−616604.68−616564.42
Cingulate gyrus32222404.47
Post-central gyrus213L−46−28564.86
IFG44L−521684.21
44R
45L−4218222.92−3816143.35
45R522483.48
47L−5432−24.27−462403.35
47R501803.283224−44.96
STG41R42−30124.1248−24124.59
42L−44−32104.69−62−20108.02
42R44−1888.5060−2485.59
22L−56−2049.73−62−2005.83
22R50−1228.7050−2026.38
STSL−62−2609.88−58−26−26.03−48−3864.19
R52−4085.88
Middle temporal gyrus21L−54−24−64.98−66−3206.84
21R48−12−104.4050−18−44.11
Temporal pole21L−460−304.29
21R403−302.94382−305.86362−283.92
38L−4812−205.55−4612−223.95
38R4410−185.87444−184.853210−263.12
  • BA = putative Brodmann area; H = hemisphere; L and R = left and right hemispheres, respectively; x, y, and z = the Talairach coordinates corresponding to the left–right, anterior–posterior, and inferior–superior axes, respectively; t = the highest t-score within a region.

  • P < 0.05, corrected for multiple comparisons at cluster level; P < 0.01, corrected for multiple comparisons at voxel level.

Correlations with social responsiveness and verbal IQ

Within the ASD group, we conducted simple regression analyses to examine the extent to which inter-subject variability in regional brain activity was associated with social and communicative impairment. Across all conditions, activity in the right temporal pole (x = 44, y = 4, z = −20) was negatively correlated with both social impairment, as assessed by the SRS (social responsiveness scale) (r1,13 = −0.71, P < 0.005), and communicative impairment, as measured by the communication subscale of the ADOS-G (r1,16 = −0.66, P < 0.005). In other words, children with a higher level of social and communicative functioning showed greater activity in the right temporal pole when attempting to infer a speaker's communicative intent.

Verbal IQ was positively associated with activity in temporal regions bilaterally and in the right IFG in children with ASD (Table 8). These regions were similar to the areas more strongly recruited by the ASD group across all conditions and may reflect a compensatory strategy ‘hacked out’ by those who are more verbally able. Verbal IQ did not correlate with activity in the TD group.

Table 8

Brain activity associated with verbal IQ in children with ASD

Anatomical region        BA     H    x      y      z      t
IFG                      45     R    38     26     6      3.35
Post-central gyrus       213    L    −56    −18    44     4.93
STG                      22     L    −62    −6     2      4.40
                         22     R    52     −12    2      4.32
Middle temporal gyrus    21     L    −64    −22    −4     3.77
                                R    46     −8     −8     4.17
Temporal pole            38     L    −46    8      −18    3.66
                         38     R    48     10     −10    3.88
  • BA = putative Brodmann area; H = hemisphere; L and R = left and right hemispheres, respectively; x, y, and z = the Talairach coordinates corresponding to the left–right, anterior–posterior, and inferior–superior axes, respectively; t = the highest t-score within a region.

Discussion

We found significant differences in the way in which children with ASD use prosodic and contextual cues to interpret irony at both the behavioural and neural levels. Despite performing at above chance levels, children with ASD were less accurate than TD children in detecting the communicative intent behind a speaker's remark overall and particularly in the conditions where the outcome of the event strongly indicated a non-literal interpretation. This finding suggests that children with ASD had difficulty in taking advantage of the contextual information available to interpret the speaker's intent, which converges with prior research demonstrating deficits in using context to make appropriate inferences (Ozonoff and Miller, 1996; Dennis et al., 2001). Although no performance differences were observed between groups when only the intonation of the speaker's remark was provided, the lack of differences in this condition probably reflects difficulty for TD children in the absence of contextual information rather than strength for children with ASD.

With respect to differences in patterns of brain activity, children with ASD recruited prefrontal and temporal regions more strongly than TD children overall. More specifically, greater activity was observed in the ASD group in the right IFG when only contextual cues were present, and bilaterally when both types of cues were available. Further, when forced to rely on prosodic cues alone, the ASD group showed heightened recruitment of temporal regions bilaterally. Importantly, the regions recruited more strongly in the ASD group were within the network also activated in the TD group. We interpret these results as consistent with the notion that increased task difficulty may be associated with greater activation of relevant brain regions (Durston et al., 2002; Tamm et al., 2002). Although we controlled for the effects of accuracy, heightened activity in children with ASD may still be attributed to more effortful processing required for adequate task performance. Indeed, activity in frontotemporal regions has been found to increase with the amount of computational demand or ambiguity imposed by language-processing tasks in normal adults even when no behavioural consequences are observed (Just et al., 1996; Mason et al., 2003). Thus, greater engagement of the IFG in children with ASD relative to TD children may reflect difficulties in integrating contextual cues with the speaker's utterance in order to interpret communicative intent. The temporal regions more strongly activated in ASD children when only prosodic cues were available have been previously established to be part of frontotemporal networks involved in processing affective prosody (Kotz et al., 2003; Mitchell et al., 2003; Wildgruber et al., 2004, 2005; Hesling et al., 2005). Greater recruitment of these temporal regions may reflect a greater burden for ASD than TD children when task demands require reliance on prosodic information alone.

Interestingly, activity in the right temporal pole was inversely related to impairment in communication and social interaction in children with ASD. Previous neuroimaging work has implicated the temporal poles as part of the neural architecture underlying mentalizing abilities (Siegal and Varley, 2002; Frith and Frith, 2003). In particular, activity in this region is elicited by the recognition of familiar faces and voices and the retrieval of relevant semantic and emotional context (Frith and Frith, 2003). Children with ASD who engaged this region when attempting to infer a speaker's communicative intent were more likely to have better social and communicative skills in real-life situations. However, we had predicted that children with ASD would show less activity than TD children in regions involved in understanding others' mental states (i.e. the MPFC, STS and temporal poles) and our results did not support this hypothesis. Across all conditions, no reliable between-group differences were observed in the MPFC, although activity in this region was significant in TD children but not so in children with ASD unless a less stringent spatial extent threshold was applied. Furthermore, temporal regions, including the STS and temporal pole, were recruited more strongly in children with ASD than TD children overall. Our results are contrary to those of Castelli et al. (2002), who found significantly reduced activity in the MPFC, STS and temporal poles in adults with ASD relative to controls while viewing animated sequences of shapes moving in a way that typically elicits the attribution of intentions.

One possible explanation for the discrepant findings may relate to the amount of automatic or implicit processing required by the experimental tasks. The animated shape paradigm used by Castelli et al. (2002) was a passive perceptual task, which relied on the automatic recruitment of networks supporting theory of mind. In contrast, the present study explicitly required participants to interpret a character's communicative intentions. Previous studies that also used a more cognitive/explicit task involving comprehension of the complex mental states of story characters have yielded results similar to ours (Happe et al., 1996; Nieminen-von Wendt et al., 2003). Specifically, Nieminen-von Wendt et al. (2003) found significant MPFC activity in both adults with ASD and healthy controls during stories requiring the interpretation of a character's actions (versus non-mentalistic physical stories). Although this activity appeared to be weaker and less extensive in the ASD group, no reliable differences were observed in this region when the two groups were compared directly. Similarly, Happe et al. (1996) used the same stories and found that both adults with ASD and normal controls showed reliable activity in the MPFC, although the region recruited by the ASD group was slightly adjacent to and more ventral than the area engaged by normal adults.

While it could be argued that some abnormalities in the neural circuitry underlying the interpretation of a character's intent were observed in the ASD group in each of these studies, the fact that no reliable between-group differences were detected in the MPFC shows that individuals with ASD do have the ability to bring this region on line under certain circumstances, specifically when task demands require attention to mental states. Moreover, we recently found that, although children with ASD showed less activity than TD children in the MPFC when interpreting communicative intent during potentially ironic scenarios, specific instructions to attend to important social cues, such as the speaker's facial expression and tone of voice, elicited significantly greater activity in this region in these children (Wang et al., submitted for publication). This distinction between deficits in implicit or automatic processing of social information and relative strength in explicit or cognitive assessment has been noted in the domain of face processing as well (Critchley et al., 2000; Wang et al., 2004). For example, Critchley et al. (2000) observed that adults with autism showed reduced activity in the amygdala relative to controls when labelling the gender of emotional faces (implicit condition), but comparable levels of activity when labelling the emotion in the same faces (explicit condition). Interestingly, in the present study, children with higher verbal IQ were more likely to show greater engagement of regions more strongly activated by the ASD group overall (i.e. right IFG and bilateral temporal regions), which suggests that this increased activity may reflect compensatory strategies involving the use of verbal reasoning skills in place of an ‘instinct’ for interpreting the communicative intent of others.

Given that children with ASD were able to perform at above chance levels in all conditions and that the regions recruited more strongly in the ASD group were within the network also activated in the TD group, our results suggest that children with ASD were interpreting the intended meaning of utterances in a more effortful manner, though using similar neural mechanisms. Together with the studies discussed above, our findings support the notion that individuals with ASD are able to recruit more normative neural networks when task demands require attention to important social cues and allow for participants to capitalize on their cognitive skills. Here, successful task performance was dependent on attention to and assessment of contextual and/or prosodic information, particularly when only one type of cue was available for interpreting the intent behind a potentially ironic remark. It is possible, and in fact quite likely, that in a less structured environment requiring the implicit interpretation of communicative intent, abnormalities in neural functioning would be more striking in children with ASD. This distinction between strengths in the explicit processing of social information and impairment when implicit processing is required remains speculative, as we did not directly examine this possibility in the present study. Nevertheless, the observations that activity in the MPFC was comparable between groups and that activity in the STS and right temporal pole was greater in ASD than TD children argue against a fundamental deficit in the neural architecture supporting mentalizing. Instead, recruitment of this circuitry when explicit processing is encouraged suggests that one possible intervention strategy would be to teach children with ASD to apply their cognitive strengths to more naturalistic settings. Perhaps increased awareness of and attention to important social cues would lead to more frequent engagement of normative neural circuitry and, even more importantly, greater success in everyday social interactions.

Acknowledgments

This study was supported by grants from the NIDCD (R03 DC005159), the NICHD (P01 HD035470), the National Alliance for Autism Research, the Cure Autism Now foundation, and the U.C. Davis M.I.N.D. Institute. For generous support the authors also wish to thank the Brain Mapping Medical Research Organization, the Brain Mapping Support Foundation, the Pierson-Lovelace Foundation, the Ahmanson Foundation, the Tamkin Foundation, the Jennifer Jones-Simon Foundation, the Capital Group Companies Charitable Foundation, the Robson Family, the William M. and Linda R. Dietel Philanthropic Fund at the Northern Piedmont Community Foundation, the Northstar Fund, and the National Center for Research Resources (grants RR12169, RR13642, and RR08655).

References
