
Expected value and prediction error abnormalities in depression and schizophrenia

Victoria B. Gradin, Poornima Kumar, Gordon Waiter, Trevor Ahearn, Catriona Stickle, Marteen Milders, Ian Reid, Jeremy Hall, J. Douglas Steele
DOI: http://dx.doi.org/10.1093/brain/awr059 pp. 1751-1764 First published online: 11 April 2011

Summary

The dopamine system has been linked to anhedonia in depression and both the positive and negative symptoms of schizophrenia, but it remains unclear how dopamine dysfunction could mechanistically relate to observed symptoms. There is considerable evidence that phasic dopamine signals encode prediction error (differences between expected and actual outcomes), with reinforcement learning theories being based on prediction error-mediated learning of associations. It has been hypothesized that abnormal encoding of neural prediction error signals could underlie anhedonia in depression and negative symptoms in schizophrenia by disrupting learning and blunting the salience of rewarding events, and contribute to psychotic symptoms by promoting aberrant perceptions and the formation of delusions. To test this, we used model based functional magnetic resonance imaging and an instrumental reward-learning task to investigate the neural correlates of prediction errors and expected-reward values in patients with depression (n = 15), patients with schizophrenia (n = 14) and healthy controls (n = 17). Both patient groups exhibited abnormalities in neural prediction errors, but the spatial pattern of abnormality differed, with the degree of abnormality correlating with syndrome severity. Specifically, reduced prediction errors in the striatum and midbrain were found in depression, with the extent of signal reduction in the bilateral caudate, nucleus accumbens and midbrain correlating with increased anhedonia severity. In schizophrenia, reduced prediction error signals were observed in the caudate, thalamus, insula and amygdala–hippocampal complex, with a trend for reduced prediction errors in the midbrain, and the degree of blunting in the encoding of prediction errors in the insula, amygdala–hippocampal complex and midbrain correlating with increased severity of psychotic symptoms. 
Schizophrenia was also associated with disruption in the encoding of expected-reward values in the bilateral amygdala–hippocampal complex and parahippocampal gyrus, with the degree of disruption correlating with psychotic symptom severity. Neural signal abnormalities did not correlate with negative symptom severity in schizophrenia. These findings support the suggestion that a disruption in the encoding of prediction error signals contributes to anhedonia symptoms in depression. In schizophrenia, the findings support the postulate of an abnormality in error-dependent updating of inferences and beliefs driving psychotic symptoms. Phasic dopamine abnormalities in depression and schizophrenia are suggested by our observation of prediction error abnormalities in dopamine-rich brain areas, given the evidence for dopamine encoding prediction errors. The findings are consistent with proposals that psychiatric syndromes reflect different disorders of neural valuation and incentive salience formation, which helps bridge the gap between biological and phenomenological levels of understanding.

  • major depression
  • schizophrenia
  • model based fMRI
  • prediction error
  • dopamine

Introduction

The dopamine system is implicated in both major depressive disorder and schizophrenia. In major depressive disorder, this is based on several lines of evidence: anhedonia (a diminished or absent ability to experience pleasure) is a core symptom of the illness and dopamine neurons are believed to be critical in the processing of pleasurable experiences (Dunlop and Nemeroff, 2007). Furthermore, behavioural studies have reported diminished reward responsiveness in patients with major depressive disorder (Pizzagalli et al., 2008), and animal, human-neuroimaging and post-mortem studies have reported a reduction in dopamine functioning (Dunlop and Nemeroff, 2007). In schizophrenia, ‘positive’ or ‘psychotic’ symptoms (defined here as delusions and hallucinations, i.e. false, abnormally fixed beliefs and aberrant perceptions, respectively) have long been linked to a putative dopamine disturbance since repeated administration of stimulant drugs that release dopamine can elicit psychotic symptoms in healthy subjects (Harris and Batki, 2000) and markedly worsen pre-existing psychotic symptoms in patients with schizophrenia (Frith and Johnstone, 2003). Furthermore, all effective anti-psychotic medications block D2 receptors (Kapur, 2003). ‘Negative’ symptoms of schizophrenia (e.g. affective flattening, apathy, social withdrawal) have also been linked to abnormalities in the dopamine system, with a similar rationale to that of anhedonia in depression (Juckel et al., 2006b; Guillin et al., 2007). However, it is not known how a disturbance in dopamine functioning could lead mechanistically to such symptoms: i.e. there remains a gap between the phenomenological and physiological domains of understanding for schizophrenia (Kapur, 2003) and major depressive disorder (Kumar et al., 2008).

Electrophysiological studies in non-human primates have demonstrated that midbrain dopamine neurons code an error in the prediction of reward; increased (or decreased) firing rate if an outcome is better (or worse) than expected and no change in activity if an outcome is as expected (Schultz, 1998). Reinforcement learning algorithms, such as temporal difference models, are a quantitative framework for modelling this dopamine response (Montague et al., 1996; Dayan and Abbott, 2001). In temporal difference algorithms, a prediction error signal (prediction error; discrepancy between expected and actual outcome) is used for improving predictions of future reward, such that, in the long term, expectations converge to actual outcomes with a minimization of prediction error, and for biasing action selection so long-term future reward is maximized (McClure et al., 2003a). Given the robust evidence for consistency between dopamine firing and temporal difference calculated prediction error signals, plus molecular biology studies of the mechanisms of dopamine modulation of neuronal plasticity (Berke and Hyman, 2000; Andrzejewski et al., 2005; Dalley et al., 2005), it has been proposed that dopamine neuronal firing acts as a teaching signal, mediating the learning of stimulus–outcomes and stimulus–response–outcome associations (Montague et al., 1996). It has also been proposed (Berridge and Robinson, 1998; Berridge, 2007) that dopamine firing contributes to the attribution of ‘incentive salience’, the process by which a stimulus grasps attention and motivates goal-directed behaviour by associations with reinforcing events.

A number of functional MRI studies have used reinforcement learning algorithms to investigate the neural encoding of prediction errors in healthy humans. Evidence for the encoding of prediction errors has been found, in the context of Pavlovian passive conditioning and instrumental decision-making paradigms, in dopamine-rich brain regions such as the striatum (O'Doherty et al., 2003b, 2004; Pessiglione et al., 2006; Glascher et al., 2009), orbito-frontal cortex (O'Doherty et al., 2003b), amygdala (Seymour et al., 2005; Kumar et al., 2008) and midbrain (Murray et al., 2007) plus in non-dopamine rich regions such as the insula (Seymour et al., 2005; Waltz et al., 2009). Functional MRI studies have also investigated neural responses during expectation of reward (at the time of the stimulus predicting the reward). Neural correlates of expected-reward value have been found in the amygdala (Gottfried et al., 2003; Hampton et al., 2006) and amygdala–hippocampal complex (Glascher et al., 2009).

Abnormalities in the encoding of prediction errors could result in dysfunctional learning and/or a reduced attribution of salience to rewarding events, which in turn could explain anhedonia symptoms in major depressive disorder and negative symptoms in schizophrenia (Juckel et al., 2006b; Kumar et al., 2008). Similarly, linking neurophysical, pharmacological and phenomenological levels of description, a dysregulation in dopamine firing involving the ‘aberrant attribution of salience’ to external objects and internal representations could underlie psychotic symptoms (Kapur, 2003). In this framework, delusions arise as cognitive schemas that provide an ‘explanation’ for the experience of aberrant salience. According to Kapur (2003), hallucinations emerge more directly due to memories and thoughts having abnormal salience. More recently, Fletcher and Frith (2009) have proposed a broader theory linking prediction errors and psychosis that postulates that a disturbance in error-dependent updating of inferences and beliefs about the world underlies psychotic symptoms.

In major depressive disorder, various functional MRI studies have investigated the functioning of the reward system, with most reporting abnormalities of the basal ganglia (Elliott et al., 1998; Keedwell et al., 2005; Epstein et al., 2006; Steele et al., 2007; Pizzagalli et al., 2009). To date, however, only one functional MRI study (Kumar et al., 2008) has investigated the encoding of prediction errors in major depressive disorder. Using a Pavlovian conditioning paradigm, these authors reported a blunting of prediction error signals in several brain regions in major depressive disorder. Encoding of prediction errors in the context of an instrumental, decision-making paradigm remains to be investigated.

In schizophrenia, consistent with the hypothesis of a dopamine disturbance driving negative symptoms, reduced ventral striatal activation in response to reward-predicting cues was observed in unmedicated patients with schizophrenia (Juckel et al., 2006b). In that report, decreased ventral striatal activation correlated with negative symptom severity. In relation to psychotic symptoms, a disrupted prediction error signal in the frontal cortex of patients with psychosis was reported, with the degree of disruption correlating with the propensity to delusion formation (Corlett et al., 2007). In another study (Murray et al., 2007), patients with psychosis exhibited attenuated and augmented prediction errors on reward and neutral trials, respectively, in the midbrain. This finding was interpreted as evidence for an abnormal dopamine-dependent motivational salience in psychosis.

Here, we use functional MRI and an instrumental reward learning paradigm, with brain function modelled by a reinforcement learning algorithm, to investigate hypothesized abnormalities in the encoding of expected-reward value and prediction errors in patients with major depressive disorder and schizophrenia. Our study has various novel features: first, there has been no previous report on the encoding of expected-reward value and prediction errors using an instrumental (decision-making) learning task in major depressive disorder. Second, there has been no previous study that has directly compared major depressive disorder and schizophrenia groups using the same associative learning paradigm. We tested two hypotheses: first, that both major depressive disorder and schizophrenia groups would exhibit abnormalities in the encoding of expected-reward value and prediction errors when compared to controls. Second, the extent of these abnormalities would correlate with relevant illness severity ratings, for each clinical syndrome. Specifically, in major depressive disorder, neural signal abnormalities would correlate with measures of anhedonia, while in schizophrenia, neural signal abnormalities would correlate with psychotic and/or negative symptoms.

Materials and methods

Subjects

The study was approved by the local research ethics committee and written informed consent was obtained from all participants. Data were acquired from three groups of subjects: a group of 15 patients with a Diagnostic and Statistical Manual of Mental Disorders (fourth edition) diagnosis of major depressive disorder; a group of 15 patients with a Diagnostic and Statistical Manual of Mental Disorders (fourth edition) diagnosis of schizophrenia and a group of 20 healthy controls with no history of illness. Subject exclusion criteria were neurological disorders and any other comorbid Diagnostic and Statistical Manual of Mental Disorders (fourth edition) Axis I or II diagnosis. No patient with major depressive disorder reported psychotic features and no patient with schizophrenia met criteria for a co-morbid diagnosis of major depressive disorder. A detailed clinical assessment consisting of a case note review, discussion with the patient's usual clinical contacts and a psychiatric interview was carried out on all patients by J.D. Steele, an experienced consultant psychiatrist. Questionnaire-based diagnosis was not used. Subjects with claustrophobia were excluded since they were unlikely to tolerate scanning. One control participant data set was excluded from analysis since the scan revealed a gross structural brain abnormality. Two other control participant data sets were excluded since post-task discussion confirmed failure to understand the goal of the task. One schizophrenia data set was excluded due to a hardware failure during image data acquisition. The three groups did not differ with respect to age and National Adult Reading Test estimated pre-morbid IQ (Nelson and Wilson, 1991) when tested by analysis of variance (ANOVA). Given the smaller proportion of females in the schizophrenia group than in the other two groups, gender was used as a covariate for the behavioural and image analyses. 
Details of the subjects included in the analysis are provided in Table 1.

Table 1

Subject details

Measure                                           Controls         Major depressive disorder   Schizophrenia
Age (years)                                       40.64 ± 11.87    45.27 ± 12.32               42.50 ± 12.27
Gender (male/female)                              7/10             6/9                         11/3
National Adult Reading Test                       113.13 ± 8.45    111.60 ± 8.43               105.83 ± 11.63
Water pleasantness rating as percentage           80.47 ± 23.38    80.13 ± 22.29               78.71 ± 18.71
Beck depression inventory                         3.18 ± 2.92      22.93 ± 8.22                17.43 ± 12.88
Beck depression inventory anhedonia subscore      0.71 ± 0.92      6.27 ± 2.28                 3.29 ± 3.27
Spielberger anxiety scale                         30.60 ± 10.71    54.60 ± 11.53               45.07 ± 12.18
Hamilton depression rating scale                  -                23.2 ± 4.3                  -
Positive and Negative Syndrome Scale: positive    -                -                           13.3 ± 2.3
Positive and Negative Syndrome Scale: negative    -                -                           12.4 ± 5.7
Positive and Negative Syndrome Scale: general     -                -                           22.3 ± 6.6
Positive and Negative Syndrome Scale: total       -                -                           46.9 ± 11.5
  • Values are mean ± SD.

Patients with major depressive disorder were receiving the following medication per day: escitalopram 15 mg, imipramine 200 mg, phenelzine 45–90 mg, trazodone 300 mg, mirtazapine 30–60 mg, venlafaxine 150–225 mg, amitriptyline 200 mg, lithium carbonate 600–800 mg (as anti-depressant augmentation), citalopram 20 mg, fluoxetine 40 mg and sertraline 25–150 mg. Patients with schizophrenia were receiving per day: clozapine 250–900 mg, quetiapine 250–700 mg, olanzapine 20 mg, risperidone 6 mg and chlorpromazine 500 mg. Depot medication was pipothiazine palmitate 50 mg every 4 weeks and flupenthixol decanoate 200 mg every 3 weeks. In addition, three patients with schizophrenia received long-term anti-depressant medication because of previous episodes of depressive illness: sertraline 50 mg, paroxetine 50 mg and citalopram 20 mg, per day.

Rating scales

All subjects completed the Beck depression inventory (Beck et al., 1961) and Spielberger State Anxiety scale (Spielberger, 1983). Consistent with previous work (Huys, 2009; Pizzagalli et al., 2009), anhedonia was investigated using a Beck depression inventory derived anhedonia subscore (questions 4, loss of pleasure; 12, loss of interest; 15, loss of energy; 21, loss of libido). Patients with major depressive disorder were assessed with the 21-item Hamilton depression rating scale (Hamilton, 1960) and patients with schizophrenia were assessed with the Positive and Negative Syndrome Scale (Kay et al., 1987). The null hypothesis of no difference between groups was tested using a one way ANOVA. All assessments were obtained by the same rater (J.D. Steele) immediately before scanning.

Instrumental reward learning paradigm

Subjects performed an instrumental reward learning task in the scanner. On each trial, a pair of fractal pictures was presented, randomly assigned to the left or right of the screen. Subjects had to choose one of the two pictures using buttons. Once a picture had been selected, it immediately increased in brightness. Two seconds after picture presentation, two drops (0.1 ml) of water were delivered (or not) according to a probabilistic schedule.

Subjects were asked to abstain from drinking fluids from the night before the scan (a routine requirement for many types of medical procedure that does not cause detectable biochemical alteration) to ensure they were thirsty at the time of the study and that water delivery was rewarding. The 0.1 ml volume was chosen empirically such that subjects could perceive the water as rewarding while minimizing the risk of inducing movement artefacts due to swallowing (Kumar et al., 2008). Water was delivered via a polythene tube attached to an electronic syringe pump (World Precision Instruments Ltd.) positioned in the scanner control room and interfaced to the image presentation and log file generating computer.

The task consisted of one high-probability stimulus and one low-probability stimulus. There were 100 trials, each of 6 s duration acquired asynchronously with blood oxygen level-dependent volume acquisition (repetition time 2.5 s). The high-probability stimulus was associated with a range of probabilities from 60% to 90%. The low-probability stimulus was associated with a range of probabilities from 0% to 20%. The associations changed slowly and none lasted longer than 20 trials. This evolving pattern was used to help maintain the subjects’ engagement in the task. By trial and error, subjects had to learn throughout the task which picture was most likely to deliver water. Instructions to subjects were: ‘two pictures will be presented and you have to select one of them. Depending on your choice, drops of water may be delivered. You should try to make your choices to maximize water delivery’.

Behavioural analysis

Immediately after scanning, subjects completed a linear analogue scale of perceived pleasantness of the delivered water. The effect of group (controls, major depressive disorder and schizophrenia) on water pleasantness ratings, reaction times and number of times the water was delivered on the instrumental task, was assessed using a general linear model with group as a fixed factor and gender as a covariate.

Reinforcement learning model

Each participant's sequence of choices and rewards was input to a reinforcement learning algorithm to generate prediction errors and expected-reward value signals for modelling neural function. We used a one-step state action reward state action (SARSA) temporal difference model (Sutton and Barto, 1998), since actual dopamine firing seems more consistent with SARSA than with alternative temporal difference models such as ‘actor-critic’ or ‘Q learning’ (Niv et al., 2006). In temporal difference models, successive trials are considered, each trial composed of consecutive time steps. At each time step, an ‘agent’ (the learner and decision maker, i.e. the subject) is assumed to be located at a state $s_t$, and interacts with the environment by selecting an action. The environment then responds by placing the agent in a new state $s_{t+1}$ and delivers an outcome $r_{t+1}$. The goal of the agent is to maximize the rewards it receives in the long run. Within this framework, a $Q^{\pi}(s_t, a)$ value is defined as the expected time-discounted future reward if action $a$ is chosen at $s_t$, and policy $\pi$ is followed thereafter (a policy directs action selection at each state). The SARSA algorithm continuously improves estimates ($\hat{Q}$) of the $Q^{\pi}$ values and at the same time changes $\pi$ towards ‘greediness’. The main characteristic of SARSA is that the prediction error associated with a decision time depends on the $\hat{Q}$ value of the chosen action (rather than on the value of the better option as in the Q learning mechanism). More specifically, at each time step, the SARSA algorithm computes a prediction error defined as:

$$\delta_t = r_{t+1} + \gamma \hat{Q}(s_{t+1}, a') - \hat{Q}(s_t, a)$$

where $a$ is the action chosen at $s_t$, $a'$ is the action chosen at $s_{t+1}$ and $\gamma$ is a discount factor, which determines how much less important later rewards are compared with those that arrive earlier. As in previous work (Kumar et al., 2008) we used $\gamma = 1.0$. The prediction error is used in the algorithm to update the estimates of the Q values on a trial by trial basis as:

$$\hat{Q}(s_t, a) \leftarrow \hat{Q}(s_t, a) + \alpha \delta_t$$

where $\alpha$ is the learning rate.

In the model, three time points were assigned to each trial of the instrumental task, with the fractal picture stimuli being presented at time point 1 (decision time) and the water reward being delivered at time point 2 (outcome time). The reward magnitude r was coded as 1 for water delivery and 0 for no-water delivery. The estimated Q values ($\hat{Q}$) were initially set to zero.

For image analysis, we focused on each subject's prediction error at the decision and outcome time. At the decision time, the subject's prediction error was the estimated Q value ($\hat{Q}$) of the chosen picture (assuming that the prediction of reward at the trial onset was zero), referred to here as the ‘expected-reward’ value. Prediction errors at the outcome time are simply referred to as prediction errors.
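The trial-by-trial signals described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `sarsa_signals`, the 0/1 coding of the two pictures and the assumption that the value of the state following the outcome is zero (so each trial ends at the outcome) are illustrative choices. The default learning rate of 0.45 is the value the paper reports for the image analysis, and the discount factor is 1.0 as stated in the text.

```python
def sarsa_signals(choices, rewards, alpha=0.45, gamma=1.0, n_actions=2):
    """Return per-trial expected-reward values and outcome prediction errors.

    choices: chosen picture per trial, coded 0 or 1 (hypothetical coding).
    rewards: 1 for water delivery, 0 for no water, as in the text.
    """
    q = [0.0] * n_actions  # Q estimates start at zero, as in the text
    expected_values = []
    prediction_errors = []
    for action, reward in zip(choices, rewards):
        # Decision time: the prediction at trial onset is assumed to be zero,
        # so the prediction error equals the Q estimate of the chosen picture.
        expected_values.append(q[action])
        # Outcome time: assumption that the trial terminates here, so the
        # value of the following state is taken as zero.
        next_value = 0.0
        delta = reward + gamma * next_value - q[action]
        prediction_errors.append(delta)
        # Update the estimate of the chosen action only.
        q[action] += alpha * delta
    return expected_values, prediction_errors


if __name__ == "__main__":
    ev, pe = sarsa_signals([0, 0, 1], [1, 0, 1])
    for trial, (v, d) in enumerate(zip(ev, pe), start=1):
        print(f"trial {trial}: expected-reward value {v:.4f}, prediction error {d:.4f}")
```

The two returned lists correspond to the decision-time and outcome-time signals that are later used as parametric regressors.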

Based on the estimated Q values ($\hat{Q}$), the model calculates the probability of choosing each action for each trial. As there were two possible picture stimuli to choose from in each trial, $a$ and $b$, the probability of choosing action $a$ was calculated using the softmax rule:

$$P(a) = \frac{\exp(\beta \hat{Q}(a))}{\exp(\beta \hat{Q}(a)) + \exp(\beta \hat{Q}(b))}$$

where β is the ‘inverse temperature’ (low β means all actions become equiprobable while high β increases the difference in probabilities for actions with different value estimates). For image analysis, values for the constants α and β had to be chosen. We selected α and β to maximize the log likelihood of the subjects' actual choices according to the model. As in previous studies (Pessiglione et al., 2006; Murray et al., 2007), a single set of parameters was fitted across all groups and subjects, since multi-subject functional MRI results have been found to be more robust if a single set of parameters is used to generate regressors for all subjects (Daw, 2009). We used α = 0.45 and β = 3.5 for the image analysis as these values were found to be optimal (Supplementary Fig. 1). In addition, for purely behavioural analyses, we estimated α and β on individual subjects. First, individual parameters were derived by maximizing the likelihood of each subject's choices under the model. Second, each subject's parameters were re-estimated applying prior information about the likely range of parameters (the prior being derived from the previous stage) to regularize estimates and avoid extreme (implausible) α- or β-values due to the inherent noisiness of the maximum likelihood estimation (Daw, 2009). We then tested whether α- and β-values differed across groups using a general linear model with group as a fixed factor and gender as a covariate.
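As a rough illustration of the fitting step, the sketch below computes the log likelihood of a sequence of choices under the softmax rule and picks the single (α, β) pair that maximizes the summed log likelihood across subjects. The grid search, the function names and the 0/1 coding of the two pictures are assumptions made for illustration; the paper does not state which optimizer was used, and the regularized second-stage estimation is not shown.

```python
import math


def choice_log_likelihood(choices, rewards, alpha, beta):
    """Log likelihood of one subject's choices under softmax action selection."""
    q = [0.0, 0.0]  # Q estimates for the two pictures start at zero
    log_lik = 0.0
    for action, reward in zip(choices, rewards):
        other = 1 - action
        # For two options, softmax reduces to a logistic function of the
        # scaled value difference: P(chosen) = 1 / (1 + exp(-diff)).
        diff = beta * (q[action] - q[other])
        if diff >= 0:
            log_lik += -math.log1p(math.exp(-diff))
        else:
            log_lik += diff - math.log1p(math.exp(diff))
        # Update the chosen action's estimate with the outcome prediction error.
        q[action] += alpha * (reward - q[action])
    return log_lik


def fit_shared_parameters(datasets, alphas, betas):
    """Grid search (an assumed method) for one parameter pair shared by all subjects.

    datasets: list of (choices, rewards) pairs, one per subject.
    Returns (best_log_likelihood, alpha, beta).
    """
    best = None
    for alpha in alphas:
        for beta in betas:
            total = sum(choice_log_likelihood(c, r, alpha, beta) for c, r in datasets)
            if best is None or total > best[0]:
                best = (total, alpha, beta)
    return best


if __name__ == "__main__":
    # Hypothetical toy data for two subjects, not taken from the study.
    data = [([0, 0, 1, 0], [1, 1, 0, 1]), ([1, 0, 0, 0], [0, 1, 1, 0])]
    alphas = [i / 20 for i in range(1, 20)]
    betas = [i / 2 for i in range(1, 21)]
    print(fit_shared_parameters(data, alphas, betas))
```

With β = 0 every choice has probability 0.5, which gives a simple sanity check on the likelihood function.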

Functional MRI data acquisition

A 3D T1-weighted image was obtained to exclude gross structural brain abnormality. For blood oxygen level-dependent response imaging, T2*-weighted gradient echo planar images were obtained using a GE Medical Systems Signa 1.5 T MRI scanner. A total of 30 axially orientated 5-mm-thick contiguous sequential slices were obtained for each volume, 246 volumes being obtained with a repetition time of 2.5 s, echo time 30 ms, flip 90°, field of view 240 mm and matrix 64 × 64. The first four volumes were discarded to allow for transient effects.

Image analysis

For preprocessing, image data were converted to Analyze format and SPM5 (Friston, 2004) was used for analysis. Images were slice-time corrected then realigned to the first image in each time series. The average realigned image was used to derive parameters for spatial normalization to the SPM5 Montreal Neurological Institute template, then the parameters applied to each image in the time series. The resultant time-series realigned and spatially normalized images were smoothed with an 8 mm Gaussian kernel.

For first level analysis, an event related model-based analysis was implemented with onset regressors at two time points: at the decision time (when the two fractals are presented) and at the outcome delivery time (when the water is either delivered or not). The expected-reward value and the prediction error signals (generated by the optimally fitted SARSA model at the decision and outcome times, respectively) were used to parametrically modulate truncated delta function onset regressors corresponding to the relevant time points, then convolved with the SPM5 haemodynamic response function, without time or dispersion derivatives. It should be noted that the probabilistic nature of the task allows decorrelating of the expected-reward value and the prediction error signals. In addition, six motion realignment terms were also included in the design matrix, to allow for any residual movement artefacts not removed by preprocessing realignment. For each subject, the two covariate images of interest were the SPM5 beta images comprising linear regression coefficients between the two parametric regressors (expected-reward value and prediction error) and the observed blood oxygen level-dependent signal.
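The construction of a parametrically modulated regressor can be sketched in plain Python as follows. This is an illustration of the general technique, not a reproduction of SPM5: the double-gamma response shape and its parameters, the mean-centring of the modulator, the 0.1 s fine time grid and all function names are assumptions.

```python
import math


def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Double-gamma haemodynamic response at time t in seconds (assumed shape)."""
    if t <= 0:
        return 0.0

    def gamma_pdf(x, shape):
        return x ** (shape - 1) * math.exp(-x) / math.gamma(shape)

    return gamma_pdf(t, peak) - ratio * gamma_pdf(t, undershoot)


def parametric_regressor(onsets, modulator, n_scans, tr=2.5, dt=0.1, hrf_length=32.0):
    """Delta functions at `onsets`, scaled by the mean-centred modulator,
    convolved with the response function and sampled once per scan."""
    mean = sum(modulator) / len(modulator)
    centred = [m - mean for m in modulator]  # mean-centring is an assumption
    n_fine = int(round(n_scans * tr / dt))
    sticks = [0.0] * n_fine
    for onset, weight in zip(onsets, centred):
        idx = int(round(onset / dt))
        if 0 <= idx < n_fine:
            sticks[idx] += weight
    kernel = [double_gamma_hrf(i * dt) for i in range(int(round(hrf_length / dt)))]
    step = int(round(tr / dt))
    regressor = []
    for scan in range(n_scans):
        i = scan * step
        total = 0.0
        # Discrete convolution evaluated only at the scan acquisition times.
        for k, h in enumerate(kernel):
            if i - k < 0:
                break
            total += sticks[i - k] * h
        regressor.append(total)
    return regressor


if __name__ == "__main__":
    # Hypothetical decision onsets (s) and expected-reward values for three trials.
    reg = parametric_regressor([0.0, 6.0, 12.0], [0.0, 0.45, 0.2475], n_scans=10)
    print([round(v, 4) for v in reg])
```

One regressor of this kind would be built from the expected-reward values at the decision onsets and another from the prediction errors at the outcome onsets.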

Two second level random effects analyses were conducted. The first consisted of testing the null hypothesis of no significant relationship between predicted reinforcement learning signals and the observed brain response within each group (control, major depressive disorder, schizophrenia). This was achieved by entering the covariate images of interest into six one-group t-tests. In order to correct for multiple comparisons, a cluster extent threshold determined by Monte Carlo simulations was applied (Slotnick et al., 2003). For an individual voxel threshold of P < 0.005 uncorrected and after running 1000 simulations, this method resulted in an estimated cluster extent threshold of 106 resampled voxels, which corresponded to a corrected threshold of P < 0.05 across the whole brain volume. The second (second level) analysis consisted of testing the null hypothesis of no difference in imaging parameter estimates of expected-reward value and prediction error between each pair of groups (control versus depression, control versus schizophrenia, depression versus schizophrenia). Comparisons between groups were performed using multiple regression with group as a covariate of interest and gender as a covariate of no interest. For the between groups maps, we used a cluster extent threshold of 141 voxels to ensure a P = 0.05 threshold corrected for multiple comparisons across the whole brain with an individual voxel threshold of P = 0.05. As this threshold is less stringent, between group differences are only reported for a priori regions of interest: striatum, medial temporal lobe (amygdala–hippocampal complex and parahippocampal gyrus), midbrain and insula. It should be noted that we did not restrict our search for between group differences to only the areas where controls demonstrated a main effect because that would have implied an assumption that controls and patients activated precisely the same network. 
Unless otherwise stated, all images are thresholded at the stated statistical parametric mapping threshold of significance. The Talairach Client (http://www.talairach.org/) tool for localization of anatomical regions was used to confirm the brain regions included in the activated clusters.

Next, we investigated whether imaging parameter estimates of expected-reward value and prediction error in the patient groups correlated with illness severity measures. This analysis was performed only for the regions of interest that exhibited abnormalities when patient groups were compared with controls. The dependent variable used for this analysis was the mean value of the parameter estimates across voxels within a 10 mm diameter sphere, centred at the maximum peak co-ordinates of the regions that showed between group differences. Both in the major depressive disorder and schizophrenia groups, hypothesized correlations between mean parameter estimates were tested against clinical ratings of illness severity: depression (Beck depression inventory), state anxiety (Spielberger) and anhedonia (Beck depression inventory subscore). Additionally, correlations with the Hamilton depression rating scale were also tested in the major depressive disorder group, and correlations with the negative symptom scale of the Positive and Negative Syndrome Scale and a psychotic symptoms subscore (delusions plus hallucinations subscores) from the Positive and Negative Syndrome Scale, were tested for the schizophrenia group. Correlations were tested using the Pearson correlation coefficient.
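The region-of-interest correlation step can be sketched as below: parameter estimates are averaged over voxels within a 10 mm diameter sphere around a peak co-ordinate, and the resulting per-subject means are correlated with a clinical rating using the Pearson coefficient. The data structures (a dictionary from millimetre co-ordinates to parameter estimates) and the function names are hypothetical, and any numbers used with it here are illustrative rather than taken from the study.

```python
import math


def sphere_mean(voxels, peak, diameter_mm=10.0):
    """Mean parameter estimate over voxels inside a sphere centred on `peak`.

    voxels: dict mapping (x, y, z) co-ordinates in mm to parameter estimates.
    """
    radius = diameter_mm / 2.0
    inside = [value for coord, value in voxels.items()
              if math.dist(coord, peak) <= radius]
    if not inside:
        raise ValueError("no voxels fall inside the sphere")
    return sum(inside) / len(inside)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-subject ROI means and symptom scores.
    roi_means = [0.8, 0.5, 0.3, 0.1]
    symptom_scores = [2, 4, 5, 9]
    print(round(pearson_r(roi_means, symptom_scores), 3))
```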

In order to examine whether abnormalities observed in the schizophrenia group were secondary to the anti-psychotic medication, we tested for correlations between brain activation in response to expected-reward value and prediction error, and medication dose in chlorpromazine equivalents at a less stringent threshold of P < 0.05 uncorrected.

Results

Subject characteristics and behavioural results

Patients with major depressive disorder or schizophrenia rated themselves more depressed, anxious and anhedonic than controls, and patients with major depressive disorder were more anhedonic than patients with schizophrenia.

In the functional MRI task, participants had the goal of maximizing the frequency of delivery of drops of water when thirsty by choosing between two fractal pictures. Subjects found the water pleasant during the task and there were no significant differences for water pleasantness ratings, or behavioural reaction times, between major depressive disorder, schizophrenia and control groups. However, patients with schizophrenia achieved water delivery significantly less often than controls (Supplementary Fig. 2A). We tested for differences across groups in the reinforcement learning model parameters α (learning rate) and β (inverse temperature) and no differences were observed (Supplementary Fig. 2B and C). Refer to the online Supplementary Material for further details on behavioural results.

Expected-reward value: within groups analyses

We tested for regions exhibiting a signal described by the calculated expected-reward value (i.e. regions that encode the learned value of the chosen fractal picture stimulus). Controls exhibited significant activations in the bilateral amygdala–hippocampal complex, bilateral parahippocampal gyrus and retrosplenial cortex (Fig. 1A and Supplementary Fig. 3). Brain regions that correlated negatively with the expected-reward value signal were also found in controls in the dorsal anterior cingulate and right dorsolateral prefrontal cortex. Negative correlations with expected-reward value have been interpreted as an expected future aversive outcome signal (Kim et al., 2006). Our reported pattern of positive and negative correlations with expected-reward value in controls is similar to previous reports (Kim et al., 2006; Glascher et al., 2009). Patients with major depressive disorder also exhibited a positive correlation with the calculated expected-reward value in the retrosplenial cortex and left parahippocampal gyrus, and a negative correlation in the dorsal anterior cingulate and other cortical areas. At a lower level of confidence (P < 0.005 uncorrected), patients with major depressive disorder exhibited small clusters of expected-reward value encoding in the bilateral amygdala–hippocampal complex (Fig. 1B). In schizophrenia, the expected-reward value was found to correlate positively with activity in the left ventral striatum, but responses in the amygdala–hippocampal complex were not present even at a low confidence threshold of P < 0.005 uncorrected (Fig. 1C). No negative correlations with the computed expected-reward value were observed for schizophrenia. Within groups expected-reward value results for the regions of interest are detailed in Table 2. Supplementary Tables 1 and 2 summarize all significant expected-reward value activations and deactivations for the three groups.

Figure 1

Expected-reward value encoding in the medial temporal lobe. (A) Controls exhibited expected-reward value encoding in the amygdala–hippocampal complex (A–H) and parahippocampal gyrus while patients with major depressive disorder (B) and schizophrenia (C) had relatively weaker or absent expected-reward value encoding. In the case of schizophrenia, absence of expected-reward value (ERV) encoding in the amygdala–hippocampal complex and parahippocampal gyrus correlated with increased severity of psychotic symptoms (delusions and hallucinations). For illustrative purposes, images are thresholded at P < 0.005, uncorrected.

Table 2

Within group activations and between group comparisons for expected-reward value

Region | x | y | z | Z-value
Controls
    Left amygdala–hippocampal complex and parahippocampal gyrus | −22 | −18 | −22 | 3.45
    Right amygdala–hippocampal complex and parahippocampal gyrus | 32 | 0 | −26 | 3.9
    Left posterior parahippocampal gyrus | −22 | −34 | −12 | 3.44
    Right posterior parahippocampal gyrus | 28 | −52 | −4 | 4.11
Major depressive disorder
    Left amygdala–hippocampal complex (a) | −26 | −16 | −18 | 3.01
    Right amygdala–hippocampal complex (a) | 32 | −12 | −20 | 2.82
    Left posterior parahippocampal gyrus | −32 | −36 | −4 | 3.26
Schizophrenia
    Left ventral putamen and nucleus accumbens | −16 | 14 | −10 | 3.57
Controls > major depressive disorder
    Right hippocampus | 32 | −22 | −10 | 2.50
    Right posterior parahippocampal gyrus | 24 | −52 | −6 | 2.81
Major depressive disorder > controls
    No significant activations in regions of interest
Controls > schizophrenia
    Left amygdala–hippocampal complex and parahippocampal gyrus | −26 | −14 | −24 | 2.87
    Right amygdala–hippocampal complex and parahippocampal gyrus | 32 | −20 | −12 | 3.26
    Left posterior parahippocampal gyrus | −26 | −38 | −18 | 2.6
    Right posterior parahippocampal gyrus | 28 | −48 | −2 | 3.54
Schizophrenia > controls
    Left ventral striatum | −6 | 0 | −4 | 3.3
Depression > schizophrenia
    Left amygdala–hippocampal complex and parahippocampal gyrus | −28 | −18 | −22 | 2.97
    Right amygdala–hippocampal complex and parahippocampal gyrus | 32 | −12 | −20 | 2.78
    Right posterior parahippocampal gyrus | 32 | −36 | 10 | 3.25
    Left posterior parahippocampal gyrus | −34 | −46 | −10 | 3.35
Schizophrenia > depression
    Left ventral putamen and nucleus accumbens | −18 | 12 | −4 | 2.7
  • Co-ordinates (x, y, z) reported in Montreal Neurological Institute space. The Z-value of the peak voxel of the region is reported.

  • a Significant at P < 0.005 uncorrected. All other results significant at P < 0.05 cluster extent corrected across the whole brain.

Expected-reward value: between groups analyses

Comparisons between groups revealed significant differences in medial-temporal lobe structures in expected-reward value encoding. Patients with major depressive disorder exhibited reduced expected-reward value signal in the right hippocampus and parahippocampal gyrus (Fig. 2A) in comparison with controls. Patients with schizophrenia exhibited significantly decreased expected-reward value encoding in the bilateral amygdala–hippocampal complex and bilateral parahippocampal gyrus compared with both major depressive disorder and control groups (Fig. 2B and C; Supplementary Fig. 4). Patients with schizophrenia exhibited an increased expected-reward value signal in the left ventral striatum compared with both controls and patients with major depressive disorder. In summary, medial temporal lobe abnormalities in expected-reward value encoding were present in both major depressive disorder and schizophrenia, with the most marked abnormalities present in schizophrenia. Between groups expected-reward value results for the regions of interest are detailed in Table 2.

Figure 2

Between group differences in expected-reward value encoding. (A) Patients with major depressive disorder had significantly reduced expected-reward value encoding in the hippocampus (H) compared with controls. (B) Patients with schizophrenia had reduced expected-reward value encoding in the amygdala–hippocampal complex (A–H) and parahippocampal gyrus compared with controls. (C) Patients with major depressive disorder had significantly increased expected-reward value encoding in the amygdala–hippocampal complex (A–H) and parahippocampal gyrus compared with schizophrenia. All regions are significant at P < 0.05 cluster extent corrected across the whole brain.

Prediction errors: within groups analyses

We tested for regions of the brain activating according to prediction error signals at the outcome time of the task. Controls showed significant prediction error encoding in the basal ganglia, bilateral nucleus accumbens and bilateral upper caudate, bilateral amygdala–hippocampal complex, midbrain, thalamus and right insula (Fig. 3A and Supplementary Fig. 5). Patients with major depressive disorder showed prediction error encoding in the bilateral amygdala–hippocampal complex and right insula (Fig. 3B). Patients with schizophrenia demonstrated significant prediction error encoding in the left ventral striatum, right amygdala and medial orbito-frontal cortex (Fig. 3C). No negative correlations with calculated prediction errors were observed in any of the groups. In summary, brain regions exhibiting prediction error responses were largely confined to dopamine-rich brain regions, medial temporal lobe structures and the insula, as previously reported. Table 3 details within group prediction error activations for the regions of interest. Refer to Supplementary Table 3 for a complete list of within group prediction error activations.

Figure 3

Prediction error encoding. (A) Controls exhibited prediction error encoding in the midbrain (MB), striatum (Str), caudate (C), insula (In) and amygdala–hippocampal (A–H) complex. (B) Patients with major depressive disorder showed prediction error encoding in the amygdala (A) and insula (In). (C) Patients with schizophrenia showed prediction error encoding in the ventral striatum (VS). All regions are significant at P < 0.05 cluster extent corrected across the whole brain.

Table 3

Within group activations and between group comparisons for prediction errors

Region | x | y | z | Z-value
Controls
    Left putamen | −24 | 4 | −4 | 4.14
    Right putamen and insula | 30 | 6 | 0 | 3.04
    Left nucleus accumbens | −6 | 4 | −6 | 3.05
    Right nucleus accumbens | 10 | 4 | −10 | 3.28
    Left amygdala–hippocampal complex | −28 | −4 | −18 | 3.45
    Right amygdala–hippocampal complex | 32 | −14 | −14 | 4.2
    Left caudate | −6 | 4 | 16 | 3.48
    Right caudate and thalamus | 6 | −6 | 18 | 4.21
    Right insula | 34 | −6 | 16 | 3.86
    Midbrain | −8 | −16 | −18 | 3.18
Major depressive disorder
    Left amygdala | −18 | −2 | −12 | 3.20
    Right amygdala | 30 | −2 | −18 | 3.62
    Right hippocampus | 30 | −18 | −8 | 4.07
    Right insula | 44 | 6 | −10 | 4.05
    Right insula | 46 | −2 | 16 | 3.87
Schizophrenia
    Left ventral putamen and nucleus accumbens | −16 | 12 | −14 | 3.90
    Right amygdala | 22 | 0 | −22 | 3.51
Controls > major depressive disorder
    Left putamen | −28 | 4 | 0 | 2.80
    Left nucleus accumbens | −8 | 0 | −6 | 2.99
    Right nucleus accumbens | 8 | 2 | −8 | 2.92
    Left caudate | −6 | 2 | 16 | 2.56
    Right caudate and thalamus | 8 | −6 | 16 | 2.68
    Midbrain | 2 | −24 | −4 | 2.79
    Right hippocampus | 22 | −22 | −20 | 3.02
Major depressive disorder > controls
    No significant activations in regions of interest
Controls > schizophrenia
    Right caudate and thalamus | 10 | −6 | 14 | 4.78
    Right insula | 32 | 6 | 4 | 2.84
    Right amygdala–hippocampal complex | 26 | −4 | −10 | 2.88
    Midbrain (a) | −6 | −16 | −16 | 2.07
Schizophrenia > controls
    No significant activations in regions of interest
Major depressive disorder > schizophrenia
    Right insula | 30 | 12 | −18 | 3.54
Schizophrenia > major depressive disorder
    Left ventral putamen and nucleus accumbens | −16 | 12 | −12 | 2.89
  • Co-ordinates (x, y, z) are reported in Montreal Neurological Institute space. The Z-value of the peak voxel of the region is reported.

  • a Significant at P < 0.05 uncorrected. All other results significant at P < 0.05 whole-brain cluster extent corrected by Monte Carlo simulations.

Prediction errors: between groups analyses

Compared with controls, patients with major depressive disorder exhibited reduced prediction error encoding in the left putamen, bilateral nucleus accumbens, bilateral superior caudate, thalamus, midbrain and right hippocampus (Fig. 4A). Compared with controls, patients with schizophrenia showed reduced prediction error encoding in the right superior caudate and thalamus, right insula and right amygdala–hippocampal complex (Fig. 4B). In schizophrenia, a trend was observed for reduced prediction error encoding in the midbrain (z = 2.07), but this result did not survive our correction for multiple testing, so it can only be regarded as exploratory. Comparing major depressive disorder and schizophrenia, patients with major depressive disorder exhibited increased prediction error encoding in the right insula, while patients with schizophrenia exhibited increased encoding in the left ventral striatum (Fig. 4C). Between groups prediction error results are detailed in Table 3. In summary, blunted prediction error encoding was found in both major depressive disorder and schizophrenia, but the pattern of abnormality differed between the two syndromes.
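To put the exploratory midbrain result (z = 2.07) in context, the sketch below converts a Z-value to an uncorrected P-value. It assumes a one-tailed test against a standard normal distribution; the tail convention is an assumption and is not stated in this section. The function name is hypothetical, and the cluster extent correction used for the other results is not reproduced here.

```python
import math


def one_tailed_p_from_z(z):
    """Upper-tail probability of a standard normal at z (uncorrected)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Z-value of the midbrain peak for controls > schizophrenia (Table 3).
p_midbrain = one_tailed_p_from_z(2.07)
print(f"uncorrected one-tailed P = {p_midbrain:.4f}")
```

Under this assumption the value falls below 0.05, which agrees with the Table 3 footnote describing the midbrain result as significant only at P < 0.05 uncorrected.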

Figure 4

Between group differences in prediction error encoding. (A) Patients with major depressive disorder had significantly reduced prediction error encoding in the midbrain (MB), caudate (C), putamen (P) and bilateral nucleus accumbens (NA) compared with controls. (B) Patients with schizophrenia had significantly reduced prediction error encoding in the thalamus-caudate (T-C), insula (In) and amygdala–hippocampal complex (A–H), and a trend to reduced prediction error encoding in the midbrain (MB) compared with controls. (C) Patients with schizophrenia had significantly reduced prediction error encoding in the insula (In) compared with patients with major depressive disorder. (D) Patients with a major depressive disorder had significantly reduced ventral striatal (VS) prediction error encoding in comparison with schizophrenia. All regions are significant at P < 0.05 cluster extent corrected across the whole brain except for the midbrain region in (B).

Expected-reward value and clinical rating correlations

In schizophrenia, decreased expected-reward value encoding in the bilateral amygdala–hippocampal complex and in the bilateral parahippocampal gyrus correlated with increased severity of psychotic symptoms (Fig. 1C), measured using relevant subscales of the Positive and Negative Syndrome Scale.

Prediction error and clinical rating correlations

In major depressive disorder, decreased prediction error encoding in the bilateral caudate, right nucleus accumbens and midbrain correlated with increased severity of anhedonia symptoms, measured using the Beck Depression Inventory anhedonia subscore (Fig. 5A). In schizophrenia, decreased prediction error encoding correlated with increasing severity of psychotic symptoms in the right insula, right amygdala–hippocampal complex and midbrain (Fig. 5B). Table 4 details the above symptom correlation findings.

Figure 5

Correlations between prediction error encoding and illness severity. (A) In major depressive disorder, decreased prediction error (PE) encoding correlated with increasing severity of anhedonia symptoms in the left and right caudate (LC, RC), midbrain (MB) and right nucleus accumbens (RNA). (B) In schizophrenia, decreased prediction error encoding correlated with increased severity of psychotic symptoms (delusions and hallucinations) in the right insula (RIn), midbrain (MB) and right amygdala–hippocampus (RA–H).

Table 4

Correlations between expected-reward value and prediction error signal strength with illness severity ratings

Region | x | y | z | r | P-value
Schizophrenia expected-reward value (a)
    Left amygdala–hippocampal complex and parahippocampal gyrus | −26 | −14 | −24 | −0.717 | 0.004
    Right amygdala–hippocampal complex and parahippocampal gyrus | 32 | −20 | −12 | −0.657 | 0.011
    Left posterior parahippocampal gyrus | −26 | −38 | −18 | −0.683 | 0.007
    Right posterior parahippocampal gyrus | 28 | −48 | −2 | −0.636 | 0.014
Major depressive disorder prediction error (b)
    Right nucleus accumbens | 8 | 2 | −8 | −0.511 | 0.051
    Left caudate | −6 | 2 | 16 | −0.570 | 0.027
    Right caudate | 8 | −6 | 16 | −0.620 | 0.014
    Midbrain | 2 | −24 | −4 | −0.580 | 0.023
Schizophrenia prediction error (a)
    Right insula | 32 | 6 | 4 | −0.690 | 0.006
    Right amygdala–hippocampal complex | 26 | −4 | −10 | −0.602 | 0.023
    Midbrain | −6 | −16 | −16 | −0.754 | 0.002
  • a Correlations with psychotic symptoms severity.

  • b Correlations with anhedonia symptom severity.

No other correlations between expected-reward value and prediction error encoding and clinical rating scales were found. No correlation between expected-reward value or prediction error signals and anti-psychotic dose (calculated as chlorpromazine equivalents) was found. In summary, blunting of the prediction errors occurred for both major depressive disorder and schizophrenia, and the extent of blunting correlated with core syndrome severity in each group.
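The relation between the correlation coefficients and P-values in Table 4 can be sketched as follows. The sketch assumes a standard two-tailed test of a Pearson correlation, with n − 2 degrees of freedom, and group sizes of 14 (schizophrenia) and 15 (major depressive disorder) as given in the Summary. The test actually used is not stated in this section, and all function names are hypothetical. The t distribution tail is obtained by numerical integration so that only the standard library is needed.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson correlation r from n observations."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2.0) / (math.sqrt(df * math.pi) * math.gamma(df / 2.0))
    return c * (1.0 + t * t / df) ** (-(df + 1) / 2.0)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2.0 * total * h / 3.0  # probability mass within [-|t|, |t|]
    return max(0.0, 1.0 - central)


def correlation_p_value(r, n):
    """Two-tailed P-value for a Pearson correlation, assuming df = n - 2."""
    return two_tailed_p(t_from_r(r, n), n - 2)


# Two rows of Table 4, with group sizes taken from the Summary.
p_scz = correlation_p_value(-0.717, 14)  # left A-H complex, schizophrenia
p_mdd = correlation_p_value(-0.511, 15)  # right nucleus accumbens, depression
print(f"{p_scz:.3f} {p_mdd:.3f}")
```

Under these assumptions the two computed values land near the tabulated 0.004 and 0.051, which suggests, but does not confirm, that the reported P-values are uncorrected two-tailed values of this kind.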

Discussion

This study investigated the neural correlates of expected-reward value and prediction error signals in major depressive disorder and schizophrenia using functional MRI, an instrumental decision-making task and a reinforcement learning algorithm. Abnormally reduced signal encoding was found in both syndromes, but the topography of abnormality differed between syndromes. Crucially, the extent of abnormal signal encoding correlated with anhedonia severity for patients with major depressive disorder, and with psychotic symptom severity (but not negative symptom severity or anhedonia) in patients with schizophrenia.

Major depressive disorder

While imaging differences were found between the major depressive disorder and control groups, behavioural differences were not observed, since patients with major depressive disorder obtained the water reward as well as controls. Thus, as has been observed in other psychiatric functional MRI studies (Murray et al., 2007), the image analysis showed increased sensitivity to detect between group differences compared with the behavioural analysis.

Compared with controls, patients with major depressive disorder showed reduced prediction error encoding in the striatum and midbrain. These striatal findings are consistent with previous studies (Elliott et al., 1998; Steele et al., 2007; Kumar et al., 2008; Pizzagalli et al., 2009), suggesting dysfunction of the striatum when processing rewarding events. A blunting of prediction error signals in the ventral striatum in major depressive disorder was reported previously using a Pavlovian paradigm (Kumar et al., 2008). In that study, subjects had to learn to predict rewarding water delivery using visual cues. In the present study, which used the same subjects but an instrumental reward learning paradigm, we replicated the finding of reduced prediction error responses in the ventral striatum and additionally found prediction error reduction in the dorsal striatum. The latter finding was attributable primarily to control participants activating both ventral and dorsal regions of the striatum in performance of the instrumental task, but only the ventral striatum in the Pavlovian task. This is consistent with previous work comparing patterns of neural activity in Pavlovian and instrumental paradigms in healthy subjects (O'Doherty et al., 2004).

Notably, decreased prediction error responses in the striatum and midbrain of patients with major depressive disorder correlated with increased severity of anhedonia symptoms. Although the relationship between structural and functional abnormalities can be complex, our striatal findings are potentially consistent with the report that reduced caudate volume is associated with anhedonia symptoms in unmedicated patients with major depressive disorder (Pizzagalli et al., 2009). Since dopamine has been hypothesized to be the neural substrate of prediction errors in dopamine-rich brain areas (Montague et al., 1996; Schultz et al., 1997; Dayan and Abbott, 2001; Tobler et al., 2006), our findings give strong support to the hypothesis of abnormal dopamine-mediated prediction errors being associated with anhedonia symptoms in major depressive disorder (Kumar et al., 2008). A link between reduced dopamine function and major depressive disorder has long been proposed (Dunlop and Nemeroff, 2007), but it is important to note that our current study specifically investigated prediction errors that have been linked to salience (Berridge, 2007) and potentially, therefore, anhedonia. The older literature, summarized by Dunlop and Nemeroff (2007), investigated long timescale dopamine changes only, which have not been linked to salience.

A study on major depressive disorder (Pizzagalli et al., 2009) distinguished anticipatory and consummatory phases of reward processing, and reported significantly reduced responses to reward outcomes in comparison to controls, but minor differences in responses to reward-predicting cues. Our findings are consistent with this study. We observed stronger differences between controls and patients with major depressive disorder in prediction error signals than in expected-reward value signals. As noted (Pizzagalli et al., 2009), it is not easy to understand this finding in neural terms. However, conditioned stimuli neural responses at the decision time are typically less strong than primary reward responses at the outcome time, so a difference in statistical power may account for the findings.

Imaging findings: schizophrenia

Patients with schizophrenia exhibited both behavioural and imaging differences in comparison with controls. They obtained significantly less water delivery than controls, which is consistent with previous studies indicating impairments in reinforcement learning in schizophrenia (Hall et al., 2009; Koch et al., 2010; Romaniuk et al., 2010). The image analysis showed abnormalities in the encoding of prediction error and expected-reward value signals in schizophrenia. Compared with controls, prediction errors were reduced in the superior caudate and thalamus, insula and amygdala–hippocampal complex. The midbrain of patients with schizophrenia exhibited a trend towards prediction error reduction. Consistent with finding prediction error abnormalities, the expected-reward value signal was also decreased in schizophrenia in the medial temporal lobe (amygdala–hippocampal complex and parahippocampal gyrus). Findings in the caudate, thalamus, midbrain and amygdala–hippocampal complex suggest a dopamine disturbance in the encoding of prediction errors. While truly diminished dopamine phasic signals may occur in major depressive disorder (Dunlop and Nemeroff, 2007), it has been hypothesized that increased noise corruption in the dopamine system in schizophrenia could interfere with normal phasic responses to events (Juckel et al., 2006b; Corlett et al., 2007; Roiser et al., 2009). This is consistent with a functional MRI study (Knutson et al., 2004) which reported that amphetamine administration blunted ventral striatal responses to reward-indicating cues. These findings suggest higher dopamine availability, rather than increased phasic signals, may interfere with the normal processing of reinforcing events.
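The distinction between a truly diminished signal and a normal signal corrupted by noise can be illustrated with a toy simulation. This is not the study's analysis. It assumes a measured signal that is a linear function of a model prediction error plus Gaussian noise, and every number (gain, noise levels, sample size, seed) is arbitrary and hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)
n_samples = 5000
model_pe = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def simulate(gain, extra_noise_sd):
    """Signal = gain * prediction error + baseline noise (+ optional extra noise)."""
    out = []
    for pe in model_pe:
        noise = rng.gauss(0.0, 1.0)
        if extra_noise_sd > 0.0:
            noise += rng.gauss(0.0, extra_noise_sd)
        out.append(gain * pe + noise)
    return out


baseline = simulate(gain=1.0, extra_noise_sd=0.0)
reduced_gain = simulate(gain=0.5, extra_noise_sd=0.0)          # truly smaller signal
noise_corrupted = simulate(gain=1.0, extra_noise_sd=math.sqrt(3.0))  # same signal, more noise

for name, sig in [("baseline", baseline), ("reduced gain", reduced_gain),
                  ("noise corrupted", noise_corrupted)]:
    print(name, round(pearson_r(model_pe, sig), 2), round(slope(model_pe, sig), 2))
```

In this toy setting both manipulations lower the correlation with the model prediction error by a similar amount, while only the reduced-gain case lowers the fitted slope. Whether anything comparable holds for the imaging data is not addressed by this sketch.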

In patients with schizophrenia, the degree of disruption in the encoding of prediction errors in the insula, amygdala–hippocampal complex and midbrain, and the blunting in expected-reward value responses in the amygdala–hippocampal complex and parahippocampal gyrus, correlated with increased severity of psychotic symptoms. These findings are in line with a previous study (Corlett et al., 2007) reporting a disrupted prediction error signal in the prefrontal cortex of patients with schizophrenia, the extent of the disruption being correlated with the experience of ‘odd beliefs’. Overall, our results give support to a theory of psychosis that postulates a disturbance in prediction error processes as a core abnormality driving symptoms (Fletcher and Frith, 2009). These authors argue that subjective experience depends significantly on predictions. Events that are predictable tend to be ignored, while surprising events capture attention and gain significance, facilitating learning. In psychosis, a disturbance in the encoding of prediction errors, involving a false attribution of attention/significance to external and internal stimuli may thus result in abnormal perceptual experiences. Delusions then arise from a subject's attempt to provide an explanation for such experiences. Since perceptions are also strongly influenced by beliefs, abnormal beliefs could drive abnormal perceptions, creating an interaction between learning and perception.

In particular, the correlation between blunting of prediction errors and psychotic symptom severity in the midbrain suggests that a disturbance in dopamine encoding of prediction errors could be linked to psychotic symptoms. This is consistent with the notion that a dysregulation of dopamine phasic responses, attributing aberrant salience to stimuli, accounts for psychotic symptoms (Kapur, 2003). Our findings are also in line with a previous functional MRI study reporting attenuated and augmented prediction error responses on reward and neutral trials, respectively, in the midbrain of psychotic patients (Murray et al., 2007).

Consistent with our findings, reduced prediction errors in the insular cortex of patients with schizophrenia during a Pavlovian reward learning task have been reported (Waltz et al., 2009). The insular cortex has been implicated in the encoding of prediction errors (Seymour et al., 2005; Waltz et al., 2009) and decision making (O'Doherty et al., 2003a). In gambling tasks, insula activation has been reported to reflect risk prediction and risk prediction error (Preuschoff et al., 2008). Furthermore, the insula has also been linked to the feeling of ‘agency’ (self-generation) during motor activity (Craig, 2009). Disruption of the feeling of agency occurs as a feature of ‘passivity phenomena’ in schizophrenia (Sims, 2005). Such phenomena involve the loss of feeling that a self-generated action, speech or thought, is self-generated, and are prominent features of Schneider's influential ‘first rank’ symptoms of schizophrenia (Sims, 2005). Here we report a correlation between a reduction in neural prediction errors and severity of psychotic symptoms in the insula. While we did not specifically assess passivity phenomena, the correlated reduction in normal prediction error encoding suggests a further link to core schizophrenia phenomenology.

Although we did not find significant correlations with the severity of negative symptoms, the observed prediction error abnormalities could contribute to these symptoms by affecting the normal processing of rewarding events (Juckel et al., 2006b). In this framework, the same core abnormality, a disturbance in the encoding of prediction errors (presumably related in some brain areas to altered dopamine functioning), could contribute to both negative and psychotic symptoms, by affecting learning of associations and blunting the salience of rewarding events, while simultaneously focusing attention towards irrelevant stimuli (Roiser et al., 2009).

Study limitations

Potential limitations of the study should be noted. First, both groups of patients were medicated because it was not possible, for ethical reasons, to withdraw patients from medication. However, it is unlikely that the major depressive disorder findings were secondary to anti-depressant medication since the reduction in neural prediction error signals correlated with increased severity of anhedonia symptoms in a number of brain regions, suggesting an illness rather than a medication effect. Additionally, a recent functional MRI study on unmedicated depressed patients also reported reduced reward responsiveness in striatal areas (Pizzagalli et al., 2009). Similarly, there are a number of reasons to believe that the reduction in neural responses in schizophrenia was not a consequence of medication. Again, the disruption in the neural encoding of expected-reward values and prediction errors correlated with increased severity of psychotic symptoms and no correlations were observed between anti-psychotic equivalent medication doses and brain activity. A different study, in which unmedicated patients with schizophrenia performed a task involving reward-predicting cues, reported failure of activation of brain regions that overlap with the regions we report (Juckel et al., 2006b). Furthermore, most of our patients with schizophrenia (11/14) were receiving atypical anti-psychotic medication. Interestingly, it has been suggested that such anti-psychotics may re-establish normal neural responses to rewards (Juckel et al., 2006a). Thus, it is possible that the neural activity we observed in the schizophrenia group in the ventral striatum, which correlated with both expected-reward values and prediction errors, might be a consequence of atypical anti-psychotics re-establishing normal reward responses in this region.

Second, while major depressive disorder and control groups were matched for gender, the schizophrenia group was not, owing to recruitment difficulties. Although we did not identify a significant gender-related difference in exploratory analyses, we nevertheless used gender as a covariate of no interest in reported calculations. Third, since the groups were not matched for performance (with patients with schizophrenia scoring less than the other two groups), there is a possibility that neural differences between patients with schizophrenia and controls were in part due to the schizophrenia group receiving a smaller number of rewards. Fourth, it should be noted that state rather than trait-based abnormalities were being investigated, as the rating scales are state measures. Fifth, episodes of depressive illness in patients with schizophrenia are common and might account for some of the similarities between schizophrenia and major depressive disorder groups, though as above, no patient with schizophrenia met criteria for major depressive disorder at the time of the study and no patient with major depressive disorder had ever been psychotic. Sixth, it should be noted that the discussion focused on the results found within a priori regions of interest. Finally, it is important to note that functional MRI cannot measure dopamine activity directly. While there is considerable evidence that the computational model we used does reflect dopamine activity in dopamine-rich brain areas (Schultz, 1998; Dayan and Abbott, 2001; McClure et al., 2003b; Pessiglione et al., 2006; Tobler et al., 2006), it is possible that other neuronal systems contributed to the observed reward related brain activity.

Conclusion

As hypothesized, we report abnormally reduced reward learning signals in major depressive disorder and schizophrenia groups during an instrumental reward learning task. Crucially, the spatial distribution differed between the disorders and the extent of the abnormalities correlated with severity of core symptoms. Our findings suggest that abnormal encoding of prediction errors in major depressive disorder could underlie anhedonia symptoms by affecting learning and salience of rewarding events. In schizophrenia, a disturbance in the encoding of prediction errors could contribute to psychotic symptoms by driving abnormal perceptions and beliefs. We speculate that, in major depressive disorder, a true reduction of reward learning signals occurs, but that in schizophrenia the observed reduction may be due to a normal signal corrupted by noise. Further work is required to test this hypothesis. Our findings support the hypothesis that major depressive disorder and schizophrenia syndromes reflect different disorders of neural valuation and incentive salience formation (Kapur, 2003; Kumar et al., 2008).

Funding

SINAPSE (studentship to V.B.G., partial); The Miller McKenzie Trust and Chief Scientist Office, Scotland, funded scanning.

Supplementary material

Supplementary material is available at Brain online.

Acknowledgements

The authors thank Prof. Keith Matthews for valuable discussions on the results of this study. The authors also would like to thank Dr Quentin Huys and Dr Nathaniel Daw for advice on behavioural modelling.

Abbreviations
SARSA = state action reward state action

References
