
Depth of processing effects on neural correlates of memory encoding
Relationship between findings from across- and within-task comparisons

Leun J. Otten, Richard N. A. Henson, Michael D. Rugg
DOI: http://dx.doi.org/10.1093/brain/124.2.399. Pages 399–412. First published online: 1 February 2001

Summary

Neuroimaging studies have implicated the prefrontal cortex and medial temporal areas in the successful encoding of verbal material into episodic memory. The present study used event-related functional MRI to investigate whether the brain areas associated with successful episodic encoding of words in a semantic study task are a subset of those demonstrating depth of processing effects. In addition, we tested whether the brain areas associated with successful episodic encoding differ depending on the nature of the study task. At study, 15 volunteers were cued to make either animacy or alphabetical decisions about words. A recognition memory test including confidence judgements followed after a delay of 15 min. Prefrontal and medial temporal regions showed greater functional MRI activations for semantically encoded words relative to alphabetically encoded words. Two of these regions (left anterior hippocampus and left ventral inferior frontal gyrus) showed greater activation for semantically encoded words that were subsequently recognized confidently. However, other regions (left posterior hippocampus and right inferior frontal cortex) demonstrated subsequent memory effects, but not effects of depth of processing. Successful memory for alphabetically encoded words was also associated with greater activation in the left anterior hippocampus and left ventral inferior frontal gyrus. The findings suggest that episodic encoding for words in a semantic study task involves a subset of the regions activated by deep relative to shallow processing. The data provide little evidence that successful episodic encoding during a shallow study task depends upon regions different from those that support the encoding of deeply studied words. Instead, the findings suggest that successful episodic encoding during a shallow study task relies on a subset of the regions engaged during successful encoding in a deep task.

Keywords

  • event-related fMRI
  • depth of processing
  • episodic encoding
  • hippocampus
  • prefrontal cortex

Abbreviations

  • BA = Brodmann area
  • BOLD = blood oxygenation level-dependent
  • EPI = echoplanar imaging
  • fMRI = functional MRI
  • HRF = haemodynamic response function
  • MNI = Montreal Neurological Institute
  • RT = reaction time

Introduction

Neuropsychological evidence indicates that episodic memory (explicit memory for recently experienced events) depends critically upon a number of circumscribed, interconnected brain regions, including the medial temporal lobe and the prefrontal cortex (Squire and Knowlton, 2000). Whereas investigation of the effects of brain lesions can identify regions whose integrity is necessary for episodic memory, the specification of the functional role of these regions is less easy. In particular, it is difficult on the basis of behavioural evidence alone to distinguish between brain regions that play a role in the encoding as opposed to the retrieval of memories (Fletcher et al., 1997).

The question of which neural structures are associated with successful memory encoding can be addressed more directly with functional neuroimaging methods. While these methods do not permit conclusions to be drawn about whether a given brain region is necessary for normal memory function (Rugg, 1999), they provide a way to determine whether the region is active during encoding, retrieval, or both. In the study described in the present paper, event-related functional MRI (fMRI) was used to identify regions associated with successful memory encoding during two tasks that differed in their semantic processing demands.

Most previous neuroimaging studies of memory encoding have employed blocked experimental designs and searched for putative neural correlates of encoding by contrasting the activities associated with different study tasks. It has been reported consistently that tasks promoting relatively good memory performance, e.g. intentional learning versus reading (Kapur et al., 1996; Kelley et al., 1998), semantic versus non-semantic classification (Kapur et al., 1994; Demb et al., 1995; Wagner et al., 1998b, experiment 1), and full versus divided attention (Shallice et al., 1994) are associated with greater activity in several regions of left prefrontal cortex (for review, see Buckner et al., 1999). Findings from more recent studies, in which non-verbal as well as verbal items were employed, add further weight to the notion that the prefrontal cortex plays a role in memory encoding, and indicate that the lateralization of encoding-related prefrontal activity is material-specific (e.g. Kelley et al., 1998; Wagner et al., 1998a; McDermott et al., 1999).

In contrast to findings for the prefrontal cortex, neuroimaging studies have been less consistent in demonstrating encoding-related activation in the medial temporal lobe. Few of the studies cited above (for exceptions, see Kelley et al., 1998; Wagner et al., 1998b, experiment 1) reported that enhanced prefrontal activity was accompanied by enhanced medial temporal activity, and in one study (Dolan and Fletcher, 1997) activity in hippocampal and prefrontal regions was found to dissociate. One procedure that has yielded robust effects in the medial temporal lobe involves a contrast between blocks of trials wherein the same item is repeated and blocks in which items are trial-unique (Stern et al., 1996; Gabrieli et al., 1997). Blocks containing unique items elicited enhanced activity in the parahippocampal cortex (but not the hippocampus). This finding was interpreted as evidence for the role of the parahippocampal region in the encoding of contextually novel stimuli (for a similar argument in respect of the hippocampus, see Dolan and Fletcher, 1997; Saykin et al., 1999).

A problem in the interpretation of studies such as those cited above arises from the use of across-task or across-block contrasts to reveal activity that may be related to memory encoding. Whereas some of the effects revealed by such contrasts may indeed reflect differences in the effectiveness with which information is encoded into memory, other effects may arise because of differences between conditions that have nothing to do with memory encoding operations. Thus the argument linking, say, the left prefrontal cortex with episodic encoding is an indirect one.

An experimental approach that avoids the foregoing problem involves contrasts not between different encoding tasks, but between different classes of item presented during a single task. This approach was adopted in several studies (Cahill et al., 1996; Alkire et al., 1998; Fernández et al., 1998, 1999a) in which it was demonstrated that the magnitude of activity in one or more medial temporal structures at the time of encoding predicted memory performance either across (Cahill et al., 1996; Alkire et al., 1998) or within (Fernández et al., 1998, 1999a) subjects. Because data were obtained over blocks of items, however, the findings offer little insight into encoding processes supported by the medial temporal lobe that operate at the level of individual items.

With the advent of event-related fMRI (Dale and Buckner, 1997; Josephs et al., 1997; Zarahn et al., 1997) it is now possible to obtain functional neuroimaging data at the individual item level. To study memory encoding, item-related activity at study is contrasted according to whether items are remembered or forgotten in a subsequent memory test. Differences between the activities associated with remembered and forgotten items (subsequent memory effects) are then taken as candidate neural correlates of encoding. This approach has been used in numerous event-related brain potential studies of memory encoding over the past 20 years or so (for reviews, see Rugg, 1995; Wagner et al., 1999).

Few studies have employed the event-related fMRI technique to investigate subsequent memory effects for individual words (Wagner et al., 1998b; Henson et al., 1999; Kirchhoff et al., 2000; Buckner et al., 2001; for a similar study involving pictures, see Brewer et al., 1998). Henson and colleagues reported that study words subsequently judged as `remembered' elicited greater left prefrontal activity [Brodmann area (BA) 9/44] than did words judged as `known' (Henson et al., 1999), a finding consistent with the view that the left prefrontal cortex supports processes important for episodic encoding. Similar findings were reported by Buckner and colleagues (Buckner et al., 2001) and Kirchhoff and colleagues (Kirchhoff et al., 2000).

Wagner and colleagues employed a concrete/abstract discrimination at encoding and tested recognition memory some 20 min later (Wagner et al., 1998b). Relative to words that were subsequently unrecognized, words recognized with high confidence were associated at encoding with activation of the left prefrontal cortex (BA 44, 45 and 47) and the left parahippocampal/fusiform cortex (BA 35/36 and 37). These regions were a subset of the regions activated in a separate, blocked, experiment in which the same study task was contrasted with a non-semantic task (case judgement). In the light of the overlap between the regions activated in the semantic versus non-semantic contrast, and the regions demonstrating a within-task subsequent memory effect, Wagner and colleagues argued that many of the cognitive operations mediating the `depth of processing' effect on memory performance were the same as those supporting effective encoding during the performance of `deep' encoding tasks (Wagner et al., 1998b; see also Tulving et al., 1994).

The present study takes as its starting point the findings of Wagner and colleagues (Wagner et al., 1998b). Two principal questions are at issue. First, if subsequent memory and depth of processing effects are compared within subjects during the same experimental procedure, do the former effects still represent a subset of the latter? Secondly, to what degree do subsequent memory effects differ according to the nature of the encoding task used to reveal them? Specifically, do the effects found previously for semantic study tasks generalize to non-semantic tasks such as those typically used to engage `shallow' encoding operations, or is it the case that the neural activity supporting effective encoding in the two classes of task differs qualitatively? This question, which has not been addressed systematically in previous studies, is important because it bears on whether episodic encoding for a given class of items relies on a single neural system irrespective of study task, or whether encoding is supported by multiple, task-specific, systems.

We addressed these questions with an experimental design in which event-related fMRI images were obtained while subjects performed two interleaved incidental encoding tasks, one of which required a semantic discrimination and the other a non-semantic discrimination. To identify overlap between depth of processing and subsequent memory effects, we compared the outcome of the contrast between the two tasks (the depth of processing effect) with the contrast between semantically encoded words that were subsequently remembered versus forgotten (the deep subsequent memory effect). To investigate the dependence of subsequent memory effects on the nature of the encoding task, we contrasted semantic (deep) with non-semantic (shallow) subsequent memory effects.

Methods

Participants

The experimental procedures were approved by the National Hospital for Neurology and Neurosurgery and Institute of Neurology Medical Ethics Committee. Fifteen volunteers (nine women) were recruited via local advertisements. Mean age was 26 (range 20–31) years. All volunteers were native speakers of English and gave their informed consent prior to the experiment. All but one volunteer was right-handed according to self-report. All subjects claimed to be in good health and to be free from neurological and psychiatric problems.

Stimulus materials

The stimulus lists were constructed from a pool of 560 words ranging in frequency between 1 and 30 per million (Kucera and Francis, 1967) and in length between four and nine letters. Three sets of 140 words each were selected at random from this pool, with the restriction that (i) the distribution of word lengths was identical across the three sets, and (ii) within each set, 35 words were animate and `alphabetical' (i.e. their first and last letters were in alphabetical order), 35 animate and non-alphabetical, 35 inanimate and alphabetical, and 35 inanimate and non-alphabetical. These sets were used to form three different study–test blocks by rotating the sets across the animacy, alphabetical and new conditions. A volunteer saw one of the three possible blocks.

For each study list, a random sequence was generated from the 140 words used for animacy decisions and 140 words used for alphabetical decisions. The study sequence was divided into two blocks of 140 trials, and two filler words were added to the beginning of each block. For each test list, a random sequence was generated from the 140 words used for animacy decisions, 140 words used for alphabetical decisions, and 140 new words. The test sequence was divided into four blocks of 105 words each, and two filler words were added at the beginning of each block. An additional 24 words were selected from the word pool to create a practice list for the study task.

Task

The experiment was composed of an incidental study task followed by a recognition memory test after a delay of 15 min. During the study task, volunteers saw a series of 280 critical words, presented one at a time. Each word was preceded by a prestimulus cue, which consisted of the presentation of either the letter `O' or the letter `X'. The type of cue indicated whether the decision about the upcoming word should be based on the semantic (animacy decision) or non-semantic (alphabetical decision) properties of the word; this decision was signalled by a button press. Animacy and alphabetical decisions were equiprobable and were intermixed randomly. The words were presented visually for 300 ms in a white upper-case Helvetica 48-point font on a black background. The time between successive word onsets was 4.8 s. At a viewing distance of ~30 cm, words subtended a vertical visual angle of ~1.4°, and a horizontal visual angle of 4–10°. The prestimulus cue was presented for 2 s and measured 1.4 × 1.2° of visual angle. There was a 100-ms blank period between the offset of the cue and the onset of the word.

The recognition memory test consisted of the re-presentation of the 140 semantically judged words and 140 alphabetically judged words from the study task, along with 140 words not seen before during the experiment. A plus sign presented before each word served as a fixation point and warning stimulus. For each word, volunteers had to decide whether they had seen the word before during the experiment (old/new judgement), indicating whether they were confident or non-confident about their decision. The words were presented visually for 300 ms in a white upper-case Helvetica 24-point font on a black background. The time between successive word onsets was 4.8 s. At a viewing distance of ~50 cm, words subtended a vertical visual angle of ~0.5° and a horizontal visual angle of 1.5–3.5°. The warning stimulus was presented for 2 s and measured 0.5 × 0.4° of visual angle. There was a 100-ms blank period between the offset of the warning stimulus and the onset of the word.

Procedure

Scanning took place during the study task only. Before entering the scanner, volunteers were given an explanation about the study task. They were told that they would see words, presented one at a time, and that each word would be preceded either by the letter `O' or by the letter `X'. When an `O' preceded a word, they had to decide whether or not the word was animate, i.e. whether it referred to a living entity or a property of one. When an `X' preceded a word, they had to decide whether or not the first and last letters of the word were in alphabetical order. Half of the volunteers were asked to make a left thumb response if a word was animate or alphabetical, and a right thumb response if a word was inanimate or non-alphabetical. For the other half, this response assignment was reversed. Both speed and accuracy were stressed. Volunteers undertook a short practice session to familiarize themselves with the study task. They were not informed about the nature of the task that would be performed subsequently outside the scanner.

Scanning began with a 15-min structural scan. Volunteers then performed two blocks of 142 trials each of the study task (~12 min each), during which the functional scans were acquired. A short rest was given between the blocks. The words were projected on to a mirror in direct view of the reclining volunteer, and responses were given with a hand-held response box.

The memory test was administered ~15 min after the completion of the study task. Volunteers were taken to another room, where they rested and conversed with the experimenter during the delay period. They were then informed about the old/new recognition memory task. It was explained that they would again see a sequence of words, presented one at a time, and that some of the words had been presented during the task they had performed in the scanner. For each word, volunteers had to decide whether or not the word had been presented during the study task. In addition, they were required to indicate whether they were confident or non-confident about their decision. One of four keys had to be depressed according to whether the word was confidently judged to be old, non-confidently judged to be old, confidently judged to be new, or non-confidently judged to be new. No specific instruction was given about how confident someone should be before pressing the `confident' key. Responses were given with the middle and index fingers of the left and right hands, which rested on the `r', `g', `k', and `o' keys of a keyboard placed on a table in front of the volunteer. Confident responses were always given with the middle fingers (`r' and `o' keys). The assignment of old responses to the left or right hand was counterbalanced across subjects. Volunteers were instructed to respond as fast as possible without sacrificing accuracy.

Four test blocks of 107 trials, each ~8 min in duration, were undertaken with short rests between the blocks. At the completion of the test blocks, volunteers were debriefed about the nature of the experiment and paid for their time.

MRI scanning methods

A 2 T Siemens Vision system (Siemens, Erlangen, Germany) was used to acquire both T1-weighted anatomical volume images [1 × 1 × 1.5 mm voxels, MPRAGE (magnetization-prepared, rapid acquisition gradient echo) sequence] and T2*-weighted echoplanar (EPI) images (64 × 64, 3 × 3 mm pixels, echo time = 40 ms) with blood oxygenation level-dependent (BOLD) contrast. Each EPI volume comprised 31 axial slices, 2 mm thick and separated by 1.5 mm, positioned to cover all but the most superior region of the brain and the cerebellum. Data were acquired during two sessions, each comprising 250 volumes, corresponding to the two study blocks. Volumes were acquired continuously with an effective repetition time of 2.85 s/volume. The first five volumes were discarded to allow for T1 equilibration effects. The constant interstimulus interval of 4.8 s allowed an effective sampling rate of the haemodynamic response of 6.7 Hz.
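The quoted effective sampling rate follows from the relationship between the interstimulus interval and the repetition time: because 4.8 s is not an integer multiple of 2.85 s, successive word onsets fall at different phases of the acquisition cycle, and the distinct phases are spaced by the greatest common divisor of the two intervals. A minimal check of that arithmetic:

```python
from math import gcd

# All times in milliseconds so the arithmetic stays exact.
isi_ms = 4800   # constant interstimulus interval (4.8 s)
tr_ms = 2850    # effective repetition time (2.85 s)

# The distinct onset-to-acquisition phases are spaced by gcd(ISI, TR),
# so the haemodynamic response is effectively sampled at 1/gcd.
step_ms = gcd(isi_ms, tr_ms)
effective_rate_hz = 1000 / step_ms

assert step_ms == 150                       # 150 ms spacing between samples
assert round(effective_rate_hz, 1) == 6.7   # the 6.7 Hz quoted above
```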

Preprocessing

For each volunteer, all volumes in a session were realigned to the first volume and resliced using a sinc interpolation in space. To correct for their different acquisition times, the signal measured in each slice was then shifted relative to the acquisition of the middle slice using a sinc interpolation in time. Each volume was normalized to a standard EPI template volume [based on the Montreal Neurological Institute (MNI) reference brain (Cocosco et al., 1997)] of 3 × 3 × 3 mm voxels in Talairach and Tournoux space (Talairach and Tournoux, 1988), using non-linear basis functions. Finally, the EPI volumes were smoothed with an 8 mm full-width half-maximum isotropic Gaussian kernel to accommodate residual anatomical differences across volunteers, and proportionally scaled to a global mean of 100.

Data analysis

The data were analysed by statistical parametric mapping (Friston et al., 1995), using the SPM99b program (Wellcome Department of Cognitive Neurology, London, UK). The volumes acquired during each session were treated as two time series. The haemodynamic response to the onset of each event type of interest was modelled with two basis functions: a canonical haemodynamic response function (HRF) (Friston et al., 1998) and a delayed HRF (Henson et al., 2001), shifted to onset 2.85 s (i.e. one repetition time) later than the canonical HRF. The use of both an early and a late response function was based on suggestions from published reports that the time of maximal activation is later for some brain regions (e.g. the prefrontal cortex) than the sensory regions on which the canonical HRF is based (Wilding and Rugg, 1996; Schacter et al., 1997; Buckner et al., 1998). The early and late response functions, when convolved with a sequence of delta functions representing the onset of each event, comprised the covariates in a general linear model, together with a constant term for each session. The covariates for the late HRF were orthogonalized with respect to those for the early HRF using a Gram–Schmidt procedure so as to give priority to the early covariate (Andrade et al., 1999). This orthogonalization attributes variance common to the early and late covariates to the early covariate; loadings on the orthogonalized late covariate account for part of the residual variance in the data not explained by the early covariate. The data were high-pass filtered to a maximum of 1/120 Hz, and both model and data were smoothed temporally with a 4 s full-width half-maximum Gaussian kernel. Parameter estimates for each covariate were calculated from the least mean squares fit of the model to the data.
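The Gram–Schmidt step can be illustrated with synthetic regressors standing in for the HRF-convolved covariates; the variable names and toy data here are illustrative, not part of SPM99:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two covariates (the real ones would be delta
# trains convolved with the canonical and delayed HRFs).
early = rng.standard_normal(200)
late = 0.6 * early + rng.standard_normal(200)  # shares variance with early

# Gram-Schmidt: project the early covariate out of the late one, so that
# variance common to both covariates is attributed to the early covariate.
late_orth = late - early * (early @ late) / (early @ early)

# The orthogonalized late covariate is uncorrelated with the early one;
# its loadings can only account for residual variance.
assert abs(early @ late_orth) < 1e-9
```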

Planned contrasts (specified in the Results section) were employed to test parameter estimates for both early and late covariates. The linear combinations of parameter estimates for each contrast were stored as separate images for each participant. These contrast images were entered into one-sample t-tests to permit inferences about condition effects across subjects (i.e. a random effects analysis). The images were subsequently transformed into statistical parametric maps of the Z statistic. Unless mentioned otherwise, contrasts were thresholded at P < 0.001, uncorrected for multiple comparisons. When reporting masked contrasts, the Z values refer to the outcome of the masked contrast only. Only activations involving contiguous clusters of at least five voxels were interpreted. We report the results from the late covariate only when they add meaningfully to the findings from the early covariate. The maxima of suprathreshold regions were localized by rendering them onto both the volunteers' normalized structural images and the MNI reference brain (Cocosco et al., 1997). They were labelled using the stereotactic system and nomenclature of Talairach and Tournoux (Talairach and Tournoux, 1988).

Results

Behavioural performance

Study task

Animacy decisions were made with an accuracy of 94% (SD = 3) and a mean reaction time (RT) of 955 ms (SD = 159). Alphabetical decisions were made with an accuracy of 92% (SD = 4) and a mean RT of 1380 ms (SD = 288). RTs were significantly longer for alphabetical than animacy decisions [F(1,14) = 79.78, P < 0.001], but the accuracy with which each type of decision was made did not differ reliably.

Anticipating the manner in which study items would be categorized for the purpose of analysing subsequent memory effects in the fMRI data (see below), RTs were calculated for study items that were subsequently recognized with high confidence as opposed to items recognized with low confidence or missed. These RTs were 954 and 951 ms, respectively, in the animacy task, and 1441 and 1362 ms, respectively, in the alphabetical task. RT did not differ reliably according to subsequent memory performance in the animacy task [F(1,14) < 1], but was slightly longer for subsequently remembered than forgotten items in the alphabetical task [F(1,14) = 10.94, P < 0.01].

Recognition memory

Recognition memory performance is shown in Table 1. Accuracy of confident and non-confident recognition was indexed by the discrimination measure Pr [probability of a hit, P(hit), minus probability of a false alarm, P(false alarm)] (Snodgrass and Corwin, 1988). This measure showed a significant interaction between confidence and type of encoding [F(1,14) = 49.89, P < 0.001]. For confident hits, discrimination was significantly greater than zero for words from both the animacy and the alphabetical task [0.48 and 0.19, respectively; F(1,14) = 146.43 and 50.99, both P < 0.001]. Words from the animacy task, however, gave rise to better recognition than did alphabetically judged words [F(1,14) = 63.79, P < 0.001]. Performance for non-confidently recognized words was not reliably greater than zero for items from the animacy task [P(hit) – P(false alarm) = 0.02; F(1,14) < 1], but was slightly greater than chance for alphabetically judged items [P(hit) – P(false alarm) = 0.07; F(1,14) = 13.32, P < 0.01].
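The Pr values above can be reproduced directly from the across-subject mean response proportions reported in Table 1 (a small sketch; the dictionary layout is just a convenient illustration):

```python
# Pr = P(hit) - P(false alarm) (Snodgrass and Corwin, 1988), computed
# from the mean response proportions in Table 1.
p_hit = {  # 'old' judgements to old words
    ("animacy", "confident"): 0.58,
    ("animacy", "non-confident"): 0.19,
    ("alphabetical", "confident"): 0.29,
    ("alphabetical", "non-confident"): 0.24,
}
p_false_alarm = {  # 'old' judgements to new words
    "confident": 0.10,
    "non-confident": 0.17,
}

pr = {
    (task, conf): round(hit - p_false_alarm[conf], 2)
    for (task, conf), hit in p_hit.items()
}

assert pr[("animacy", "confident")] == 0.48
assert pr[("alphabetical", "confident")] == 0.19
assert pr[("animacy", "non-confident")] == 0.02
assert pr[("alphabetical", "non-confident")] == 0.07
```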

Table 1

Recognition memory performance for new words and old words requiring an animacy or alphabetical decision during study

Word type           Sure old      Unsure old    Sure new      Unsure new

Proportion of responses
Old
  Animacy           0.58 (0.19)   0.19 (0.11)   0.07 (0.06)   0.15 (0.09)
  Alphabetical      0.29 (0.15)   0.24 (0.10)   0.20 (0.19)   0.27 (0.15)
New                 0.10 (0.07)   0.17 (0.07)   0.36 (0.24)   0.35 (0.19)

Mean reaction time (ms)
Old
  Animacy           1094 (163)    1536 (317)    1275* (181)   1589 (407)
  Alphabetical      1221 (258)    1573 (371)    1342 (263)    1552 (369)
New                 1255 (347)    1579 (322)    1297 (221)    1544 (360)

Values are across-subject means (standard deviation). *Mean reaction time for sure new judgements to old words requiring animacy decisions during study is based on 14 volunteers; one volunteer did not make any such judgement.

On the basis of these findings, words were categorized as `remembered' when they attracted a confident hit during the subsequent recognition test. Words were categorized as `forgotten' when they attracted a miss or a non-confident hit. Only confident hits were classified as remembered items because accurate discrimination between old and new words was carried primarily, and the depth of processing effect exclusively, by confident judgements. Furthermore, previous studies (Brewer et al., 1998; Wagner et al., 1998b) have shown that subsequent memory effects are found predominantly for confidently recognized items. Accordingly, we maximized the signal-to-noise ratio for contrasts involving subsequent memory effects by contrasting confident hits with all other responses to old words. Because discrimination was above chance for alphabetically encoded words accorded a non-confident decision, we also computed a subsequent memory comparison for these items by contrasting confident hits with misses. The outcome of this comparison did not differ from that found when non-confident hits were included as forgotten items along with misses.
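The categorization scheme described above amounts to a simple mapping from recognition responses to subsequent-memory bins (the function name is illustrative, not from the original analysis):

```python
def subsequent_memory_bin(old_new_judgement: str, confident: bool) -> str:
    """Map a recognition response to an old (studied) word onto the
    'remembered'/'forgotten' bins used for the subsequent memory contrasts.

    'old' judgements to studied words are hits; 'new' judgements are misses.
    Only confident hits count as remembered; misses and non-confident hits
    are treated as forgotten.
    """
    if old_new_judgement == "old" and confident:
        return "remembered"   # confident hit
    return "forgotten"        # non-confident hit or miss


assert subsequent_memory_bin("old", True) == "remembered"
assert subsequent_memory_bin("old", False) == "forgotten"   # non-confident hit
assert subsequent_memory_bin("new", True) == "forgotten"    # miss
```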

fMRI findings

All analyses of fMRI depth of processing and subsequent memory effects described below were confined to study trials associated with correct animacy or alphabetical decisions.

Depth of processing

Collapsed over subsequent memory performance, fMRI signals were greater for words from the animacy than for words from the alphabetical task in widespread regions of the brain, including the frontal cortex, hippocampus and the adjacent medial temporal cortex (Fig. 1A). The fMRI signals were greater for words from the alphabetical than the animacy task in, among others, the bilateral parietal and posterior prefrontal cortex (Fig. 1B). All activations loaded on the early covariate.

Fig. 1

Maximum-intensity projections detailing regions that showed significant fMRI signal increases loading on the early covariate for words studied in the animacy task versus the alphabetical task (A) and for the reverse subtraction (B). Increases that exceeded a threshold of P < 0.001 are shown.

Subsequent memory effects

Regions demonstrating a subsequent memory effect were identified by contrasts between remembered and forgotten items, performed separately for each study task. Data from one volunteer were excluded from the subsequent memory contrast for alphabetically judged words because too few (<12) words were confidently recognized.

Regions showing a subsequent memory effect for items encoded in the animacy task are given in Table 2. All of these regions were identified from contrasts performed on parameter estimates derived from the early covariate. Among these regions were the left inferior frontal gyrus (BA 47, 45, 9/44), along with a less extensive homotopic region on the right, and the anterior and posterior left hippocampus. These regions are illustrated in Figs 2A and 3A.

Table 2

Regions showing significant (P < 0.001) signal increases on the early covariate for animacy-judged words that were subsequently remembered versus forgotten

Region                                   BA      Location (x, y, z)   Peak Z score (no. voxels)
Left inferior frontal gyrus              47      –36, 36, –9          4.91 (140)
Left inferior frontal gyrus              9/44    –51, 12, 21          3.81 (25)
Left inferior frontal gyrus              45      –51, 27, 18          3.70 (52)
Left medial superior frontal gyrus       8       –3, 36, 45           3.43 (9)
Right inferior frontal gyrus             47      33, 36, –9           4.45 (74)
Right inferior frontal gyrus             9/44    57, 12, 21           3.67 (10)
Right inferior frontal gyrus             45      39, 27, 9            3.60 (20)
Right superior frontal gyrus             6       27, –3, 57           3.83 (10)
Left anterior hippocampus                        –27, –15, –12        3.99 (38)
Left posterior hippocampus                       –30, –42, 0          3.88 (7)
Left inferior temporal/fusiform gyrus    37      –48, –54, –18        4.45 (28)
Left lateral parietal cortex             40      –45, –39, 33         4.47 (9)
Left occipital cortex                    17/18   –18, –93, –6         3.54 (22)
Right anterior inferior temporal gyrus   20      33, 3, –45           3.39 (8)
Right subcentral gyrus                   4       48, –6, 18           3.63 (7)
Right lateral occipital cortex           19      12, –87, 24          3.42 (5)
Right calcarine cortex                   17      18, –81, 12          3.62 (10)
Brainstem                                        9, –24, –15          3.62 (6)
Left cerebellum                                  –30, –42, –27        3.56 (16)

Location is with respect to the system of Talairach and Tournoux (Talairach and Tournoux, 1988). Z scores refer to the peak of the activated cluster, the size of which is indicated in parentheses.
Fig. 2

(A) Significant clusters of signal increases for remembered versus forgotten words from the animacy task in the inferior frontal and medial temporal regions (P < 0.001). The activations are rendered onto the MNI reference brain. (B) The same contrast as for A, but masked by the regions that showed significant signal increases for the animacy versus alphabetical contrast (contrast and mask both thresholded at P < 0.001).

Fig. 3

Regions showing significant subsequent memory effects for words from the animacy and alphabetical tasks (threshold P < 0.001). The activations are rendered onto the MNI reference brain. (A) Effects from the animacy task (see legend to Fig. 2). (B) Effects from the alphabetical task in the left anterior hippocampus, loading on the late covariate. (C) Additional left frontal subsequent memory effect (threshold P < 0.01) in the alphabetical task, revealed by a mask (threshold P < 0.001) derived from the animacy subsequent memory effect.

To identify those regions that were not only sensitive to depth of processing but also manifested a subsequent memory effect in the animacy task, the subsequent memory contrast for that task was masked by the depth of processing effect (animacy task minus alphabetical task). Both the mask and subsequent memory contrasts were thresholded at P < 0.001, and were computed for the early covariate only. As illustrated in Fig. 2B, two regions survived the masking procedure: the left ventral inferior frontal gyrus (BA 47, 57 voxels, x = –36, y = 36, z = –9; Z = 4.91) and the left anterior hippocampus (10 voxels, x = –27, y = –12, z = –12; Z = 3.96).

For alphabetically judged words, the only area to show a subsequent memory effect was the left anterior hippocampus (10 voxels, x = –27, y = –15, z = –12; Z = 4.45; Fig. 3B). In contrast to the activations described in the foregoing sections, this effect loaded on the late rather than the early covariate. When masked in the same manner as described above for the animacy task, this left anterior hippocampal area was found not to overlap with the areas that showed a greater fMRI signal for words from the alphabetical than the animacy task.

In a subsequent set of analyses, the question of the overlap between animacy and alphabetical subsequent memory effects was addressed further. Subsequent memory effects for the alphabetical task were recomputed with a statistical threshold of P < 0.01, and were masked by the animacy subsequent memory effect (thresholded as before at P < 0.001). These analyses permit the identification, at a high level of sensitivity, of regions in which alphabetical subsequent memory effects overlap with animacy subsequent memory effects, while maintaining an acceptable type I error rate (as the two contrasts are orthogonal and independent, the probability of type I error is given by the product of their respective thresholds, i.e. 0.01 × 0.001). For the early covariate, the masked contrast revealed an alphabetical subsequent memory effect in left inferior frontal gyrus (BA 47, 17 voxels, x = –36, y = 36, z = –9; Z = 2.88) (Fig. 3C).
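The logic of this masking procedure, and of the resulting joint false-positive rate, can be sketched in code. The following is a minimal illustration, not the authors' analysis pipeline: it uses hypothetical null z-maps for two orthogonal contrasts and applies one-tailed z cut-offs corresponding to the two thresholds used above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical voxel-wise z-maps for two orthogonal (hence independent)
# contrasts: the animacy subsequent-memory contrast (used as the mask)
# and the alphabetical subsequent-memory contrast (contrast of interest).
# Both are pure noise here, to illustrate the false-positive rate.
n_voxels = 100_000
z_mask_contrast = rng.standard_normal(n_voxels)
z_test_contrast = rng.standard_normal(n_voxels)

# One-tailed z cut-offs for the two uncorrected thresholds.
z_mask_thresh = stats.norm.isf(0.001)  # mask thresholded at P < 0.001
z_test_thresh = stats.norm.isf(0.01)   # contrast thresholded at P < 0.01

# Inclusive masking: a voxel survives only if it exceeds both thresholds.
surviving = (z_mask_contrast > z_mask_thresh) & (z_test_contrast > z_test_thresh)

# For independent contrasts, the per-voxel probability of surviving by
# chance is the product of the thresholds: 0.001 * 0.01 = 1e-5.
print(surviving.mean())  # close to 1e-5 under the null
```

Under the null, roughly one voxel in 100 000 survives both thresholds, which is why the conjoint procedure can afford the more lenient P < 0.01 threshold on the contrast of interest.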

Figures 4 and 5 summarize quantitatively the most important of the effects described above. In the case of Fig. 4, it can be seen that subsequent memory effects are essentially absent for the alphabetical task in the right prefrontal cortex, and in the more dorsal of the two left frontal regions manifesting an effect in the animacy task. By contrast, subsequent memory effects for the two tasks are of similar magnitude [F(1,13) = 1.50, P > 0.20] in the ventral left frontal cortex. Figure 5 shows the relative magnitudes and variabilities of the parameter estimates obtained in the left anterior and left posterior hippocampus. The differential loadings of the animacy and alphabetical subsequent memory effects in the anterior hippocampus on the early and late covariates are clearly evident, as is the exclusivity to the animacy task of the subsequent memory effects in the posterior hippocampus.

Fig. 4

Parameter estimates for the early covariate for subsequent memory effects (i.e. differences between remembered and forgotten words) in the animacy (filled columns) and alphabetical (open columns) tasks for the left and right ventral (BA 47) and left and right dorsal (BA 9/44) inferior frontal gyrus (for coordinates, see Table 2). Error bars show the standard error of the mean.

Fig. 5

Parameter estimates for early and late covariates for subsequent memory effects in the animacy (filled columns) and alphabetical (open columns) tasks for the left anterior and left posterior hippocampus (for coordinates, see Table 2). Error bars show the standard error of the mean. Note that the scale for the parameter estimates is specific to the covariate employed to obtain the estimates.

Discussion

This study addressed two principal issues: (i) the extent to which there is overlap between regions that are more active during a deep (animacy decision) than a shallow (alphabetical decision) encoding task, and regions where activity predicts subsequent memory for the deeply encoded items; (ii) whether subsequent memory effects dissociate according to the study task (animacy versus alphabetical) in which they are obtained. With respect to the first issue, the findings indicated that a subset of the regions revealed by the depth of processing contrast did indeed manifest subsequent memory effects in the animacy task; however, effects were also found in regions that were insensitive to the depth of processing manipulation (Fig. 2). With regard to the second issue, we found little evidence that alphabetical and animacy subsequent memory effects were anatomically dissociable. Rather, the regions where effects were found for the alphabetical task were a subset of those showing effects for the animacy task. Below, after commenting on the behavioural data, we discuss the relevance of these fMRI findings to current ideas about the functional and neural bases of memory encoding.

Behavioural performance

As expected, recognition memory was substantially more accurate for items encoded in the animacy task than for those encoded in the alphabetical task. The advantage for deeply encoded items was observed despite the fact that RTs were >400 ms shorter in the animacy task, a clear demonstration that depth of processing effects in memory cannot be attributed to differences in factors such as time on task or task difficulty. The same argument applies with respect to those brain regions in which activity was greater during performance of the animacy task (Demb et al., 1995).

Comparison of hit and false alarm rates showed that non-confident recognition of items subjected to animacy decisions was at chance. By contrast, non-confident hits for alphabetically studied words marginally exceeded the corresponding false alarm rate, indicating the possibility of above-chance recognition of these items. However, the size of this effect was small (P(hit) = 0.24, P(false alarm) = 0.17), meaning that the majority (~70%) of non-confident hits to alphabetically studied words were, like their counterparts from the animacy task, based on guesses. Together with previous findings (Brewer et al., 1998; Wagner et al., 1998b), these results emphasize the importance of employing confidence judgements when recognition memory is used to identify subsequent memory effects. Without the data from such judgements, the subsequent memory effects would have been diluted (to a greater degree in the alphabetical than in the animacy task) by the inclusion of trials in which there was no veridical memory of the study episode, and in which successful recognition was the result of guessing.
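The "~70%" figure follows from simple arithmetic, assuming the non-confident false-alarm rate estimates the probability that an item attracts a non-confident "old" response by guessing alone:

```python
# Sketch of the arithmetic behind the ~70% estimate. The assumption
# (ours, for illustration) is that the false-alarm rate indexes the
# guessing rate, so the fraction of non-confident hits attributable
# to guessing is approximately P(false alarm) / P(hit).
p_hit = 0.24          # non-confident hit rate, alphabetically studied words
p_false_alarm = 0.17  # corresponding non-confident false-alarm rate

guess_fraction = p_false_alarm / p_hit
print(f"{guess_fraction:.0%}")  # prints "71%", i.e. roughly 70% guesses
```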

It is important to note that, in the animacy task, RT did not predict whether items would be recognized subsequently. This argues against the possibility that the subsequent memory effects obtained in the fMRI data for that task are correlates of trial-by-trial fluctuations in such factors as arousal, attention and time devoted to processing each item. This possibility is further diminished by the fact that fMRI data from study trials on which incorrect responses were made were excluded from the analyses of subsequent memory effects. A small difference in RT between subsequently remembered and forgotten items was found in the alphabetical task. It seems unlikely, however, that this difference was responsible for the subsequent memory effects observed in the fMRI data from this task; the regions displaying such effects were a subset of those identified in the animacy task, in which no RT differences were found.

fMRI findings

The depth of processing contrast revealed signal increases for deeply studied words in a variety of areas, including the prefrontal cortex and regions of the medial temporal lobe. With respect to the prefrontal cortex particularly, these findings broadly replicate earlier findings (e.g. Kapur et al., 1994; Demb et al., 1995; Fletcher et al., 1995; Wagner et al., 1998b, experiment 1; for review, see Poldrack et al., 1999). The subsequent memory contrast for deeply studied words revealed subsequent memory effects in two regions that overlapped with those activated by the depth of processing contrast, namely the left ventral inferior prefrontal cortex and the left anterior hippocampus. These findings support the claim by Wagner and colleagues, made on the basis of a comparison between data obtained with different groups of subjects and experimental procedures, that effective memory encoding in a deep study task involves activation of some of the same regions that are revealed by contrasts between tasks requiring deep versus shallow processing (Wagner et al., 1998b).

The overlap between regions activated by the depth of processing and deep subsequent memory effects implies the existence of cognitive operations that are engaged differentially both by semantic versus non-semantic processing and by effective versus less effective episodic encoding in a semantic task. It has been suggested that the operations supported by the left inferior prefrontal cortex might contribute to 'semantic working memory' (Gabrieli et al., 1998)—the temporary storage, manipulation and selection of an item's semantic attributes. According to this hypothesis (see also Buckner and Koutstaal, 1998; Wagner et al., 1998b, 1999), the more a study item engages semantic working memory, the more likely it is that its semantic features will be incorporated in a representation of the study episode, and thus the more likely it is that the episode will be accessible in a subsequent memory test. It is easy to see how this hypothesis can encompass the findings from between-task comparisons, as in the case of the depth of processing manipulation in the present study. Additional assumptions are needed, however, to allow the hypothesis to account for left prefrontal subsequent memory effects when, as in the present study, the behavioural data provide no evidence that subsequently remembered and forgotten items were processed differently. One might imagine, for example, that the words placing the heaviest load on semantic working memory would be those that were most difficult to classify semantically, leading to a positive relationship between RT and the probability of subsequent recognition. There is, however, no hint of such a relationship. To reconcile the working memory hypothesis with these findings, it is necessary to assume that the engagement of semantic working memory goes beyond what is required to perform the study task, and that it is the extent of this additional processing that is particularly important for memory encoding.

Subsequent memory effects were observed during the animacy study task in the right as well as the left prefrontal cortex. These right-sided effects were localized to regions broadly homotopic with those found on the left but, unlike their left-sided counterparts, they were not localized to regions sensitive to the main effect of depth of processing. Thus, right prefrontal subsequent memory effects may reflect processes that benefit episodic encoding in ways other than through cognitive operations associated with semantic processing. In this regard, it is noteworthy that studies employing non-verbal material, such as textures, pictures and faces, have reported encoding-related activity in the vicinity of the right prefrontal regions identified in the present study (Brewer et al., 1998; Kelley et al., 1998; Wagner et al., 1998a; McDermott et al., 1999). The present findings suggest that the (presumably non-linguistic) processes reflected by such right frontal activity can, at least under some circumstances, operate on information derived from ostensibly verbal input. The nature of these processes is unclear. Nonetheless, together with the previous findings just mentioned, the present results argue against the view (Tulving et al., 1994) that prefrontally mediated episodic encoding operations are invariably left-lateralized.

In addition to the left inferior prefrontal cortex, the left anterior hippocampus was sensitive to both the depth of processing and animacy subsequent memory contrasts. Greater left anterior hippocampal activation during deep rather than shallow processing was also reported by Wagner and colleagues (Wagner et al., 1998b), but, for reasons that are unclear, these authors failed to observe subsequent memory effects in this region (but see Kirchhoff et al., 2000). The present findings are consistent with the large body of evidence from animal and human studies implicating the hippocampus in episodic memory, and provide direct evidence that the hippocampus plays a role in memory encoding (see also Fernández et al., 1999b). While the exact contribution to memory of the hippocampus is still debated, it seems likely that it includes some kind of integrative function, whereby disparate elements of an encoding episode, e.g. item and contextual information, are bound together to form an episodic representation (see Amaral, 1999, and following papers). One possibility is that hippocampal subsequent memory effects reflect differences in the processing associated with such binding operations, which in turn reflect the amount and nature of the item information (in the present case, largely semantic in origin) available to the hippocampus from working memory systems supported by prefrontal cortex (for similar views, see Buckner and Koutstaal, 1998; Wagner et al., 1998b). According to this hypothesis, encoding-related activity in the prefrontal cortex and hippocampus should be organized serially, the former activity onsetting before the latter. The poor temporal resolution of the BOLD signal, however, together with difficulties in interpreting inter-regional differences in the time course of such signals (Rajapakse et al., 1998), make the question of the relative timing of prefrontal and hippocampal encoding-related activity difficult to address with fMRI. The question may prove more amenable to investigation with methods based on electroencephalography or magnetoencephalography.

Whereas the left anterior hippocampus demonstrated effects of both depth of processing and deep subsequent memory, effects in a more posterior left hippocampal region were restricted to the subsequent memory contrast. These differing patterns of activity in the anterior and posterior hippocampus add to the evidence that this structure is functionally heterogeneous (Fernández et al., 1998; Lepage et al., 1998; Strange et al., 1999). The region identified in the present study as the posterior hippocampus (peak activation at x = –30, y = –42, z = 0) is quite close to the left medial temporal region reported by Wagner and colleagues to demonstrate a subsequent memory effect (peak activation at x = –31, y = –46, z = –12) (Wagner et al., 1998b). Although the region was identified by these authors as the parahippocampal cortex, it is possible, given the proximity of the respective peaks, that the two areas overlap to some extent.

We turn now to the second issue that motivated the present study, namely whether subsequent memory effects differ according to the nature of the encoding task used to reveal them. As already discussed, for words studied in the animacy task, the subsequent memory contrast revealed activations in the bilateral inferior prefrontal cortex and the left anterior and posterior hippocampus. For alphabetically studied words, the subsequent memory contrast revealed activations in part of the same region of the left inferior prefrontal cortex that manifested deep subsequent memory effects, as well as in the left anterior hippocampus. No regions were found that demonstrated shallow, but not deep, effects. Thus, the areas associated with episodic encoding in the alphabetical task were a subset of those associated with encoding in the animacy task.

The failure to find evidence for subsequent memory effects associated uniquely with the alphabetical encoding task must, like all null results, be treated with caution. This is especially so given that the power to detect subsequent memory effects was lower for the alphabetical than the animacy task, a consequence of the fewer confident hits made to alphabetically studied words, and the need to reject one volunteer's data from the relevant analyses. Thus, the issue of whether there are subsequent memory effects specific to non-semantic study tasks remains open.

As just noted, comparison of the outcome of the subsequent memory contrasts for the two tasks revealed no regions where effects were unique to the alphabetical task. There were regions—notably the right prefrontal cortex and the left posterior hippocampus—where subsequent memory effects were found only for the animacy task. This finding might indicate a dissociation between the neural substrates of encoding in the two tasks. The difference in the power of the two subsequent memory contrasts (see above) means, however, that it is not possible to reject the possibility that the finding is a consequence of regional differences in sensitivity to subsequent memory contrasts rather than differences related more specifically to the two encoding tasks.

The anatomical overlap between the subsequent memory effects in the animacy task and the two alphabetical subsequent memory effects that did prove reliable suggests that a common set of cognitive processes acted to facilitate encoding in the two tasks. As already mentioned, one plausible account is that subsequent memory effects in the left prefrontal cortex and the left anterior hippocampus reflect the benefit of incorporating relatively elaborate semantic information about an item in a representation of its study episode. By this account, even when the task is ostensibly non-semantic, it is the items that receive the greater semantic-level processing that are encoded the most effectively. One possibility is that semantic processing, and effective episodic encoding, occurred in alphabetical trials in which volunteers confused the two encoding tasks. This is unlikely, however, as study trials on which errors were committed were not included in the analyses of the fMRI data. It seems more likely that any semantic processing accorded these items was incidental, and perhaps subsequent, to the processing necessary for alphabetical classification. For example, for any given volunteer, words may have differed in their capacity to capture or hold attention beyond what was needed for the alphabetical decision. A delay in the engagement of semantic processing in the alphabetical task relative to the animacy task may account for the finding that the left anterior hippocampal subsequent memory effect loaded on the later of the two covariates employed to detect event-related fMRI signal change (Fig. 5).

In summary, the present findings confirm previous suggestions (Wagner et al., 1998b) that some of the brain regions sensitive to depth of processing manipulations play a role in verbal memory encoding when the study task is held constant. Chief among these regions are the left inferior prefrontal cortex and left anterior hippocampus. The findings also demonstrate a role in the episodic encoding of deeply processed words for regions, such as the right prefrontal cortex and left posterior hippocampus, that are insensitive to depth of processing. There was no evidence that successful episodic encoding during a non-semantic study task depended upon regions different from those that supported the encoding of semantically studied words. Instead, the findings suggest that successful non-semantic encoding relies on a subset of the regions engaged during successful semantic encoding.

Acknowledgments

We wish to thank the radiography staff of the Wellcome Department of Cognitive Neurology and Jane Herron for their help with data acquisition. The authors and their research are supported by the Wellcome Trust.

References
