Direct comparison of the neural substrates of recognition memory for words and faces

J. J. Kim, N. C. Andreasen, D. S. O'Leary, A. K. Wiser, L. L. Boles Ponto, G. L. Watkins, R. D. Hichwa
DOI: http://dx.doi.org/10.1093/brain/122.6.1069 1069-1083 First published online: 1 June 1999

Summary

To identify the brain regions that are relatively specific to word and to face recognition memory, and the regions common to both, regional cerebral blood flow associated with different recognition memory tasks was examined using [H₂¹⁵O]PET in healthy volunteers. The tasks consisted of recognizing two types of stimuli (faces and words) in two conditions (novel and familiar), and two baseline tasks (reading words and gender classification). The statistical analyses used to identify the specific regions consisted of three subtractions: novel words minus novel faces, familiar words minus familiar faces, and reading words minus gender classification. These analyses revealed relative differences in the brain circuitry used for recognizing words and for recognizing faces within a defined level of familiarity. In order to find the regions common to both face and word recognition, overlapping areas in four subtractions (novel words minus reading words, novel faces minus gender classification, familiar words minus reading words, and familiar faces minus gender classification) were identified. The results showed that the activation sites in word recognition tended to be lateralized to the left hemisphere and distributed as numerous small loci, and particularly included the posterior portion of the left middle and inferior temporal gyri. These regions may be related to lexical retrieval during written word recognition. In contrast, the activated regions for face recognition tended to be lateralized to the right hemisphere and located in a large aggregated area, including the right lingual and fusiform gyri. These findings suggest that strikingly different neural pathways are engaged during recognition memory for words and for faces, in which a critical role in discrimination is played by semantic cueing and perceptual loading, respectively. In addition, the investigation of the regions common to word and face recognition indicates that the anterior and posterior cingulate have dissociable functions in recognition memory that vary with familiarity, and that the cerebellum may serve as the co-ordinator of all four types of recognition memory processes.

  • PET
  • recognition memory
  • word recognition
  • face recognition
  • rCBF = regional cerebral blood flow

Introduction

Recognition memory is a combined evocation of pertinent multimodal memories that permit the experience of familiarity with a given stimulus. It might be considered that the recognition memory process is composed of several stages, including the formation of a percept originating from the stimulus, matching the percept to pre-existing stored information, and a contextual non-verbal and/or verbal evocation. One of the major variables affecting the network of brain areas activated during this process is the type of stimulus. Among various types of visual stimuli, words and faces hold a special place in humans. Their recognition is fundamental to human social life; word recognition is necessary to acquire knowledge and to communicate, whereas face recognition forms a basis for recognizing and meeting others and forming interpersonal relationships.

Activation studies using PET, functional MRI and neurophysiological recording have established the existence of specialized neural substrates for recognition processes according to different object categories, including words and faces. During visual presentation of words, activation occurs in a variety of regions, including the left extrastriate cortex (Petersen et al., 1990; Puce et al., 1996), the left temporal cortex (Howard et al., 1992; Price et al., 1994; Andreasen et al., 1995a) and the left frontal cortex (Petersen et al., 1990; Price et al., 1994). During the perception of faces, major activation occurs in extrastriate areas bilaterally, particularly in the fusiform gyri (Haxby et al., 1991, 1994; Sergent et al., 1992; Puce et al., 1995, 1996; Andreasen et al., 1996; Kanwisher et al., 1997) and in the inferior temporal gyri (Puce et al., 1995). These reports indicate that considerable overlap may exist between the areas activated by the two types of stimuli even though the tasks are theoretically different.

However, determining the areas specific to word and face recognition from existing studies in the literature is difficult because of differences in task requirements, control stimuli and the statistical criteria for significant activation used in different studies. Therefore, direct comparison between two recognition conditions is needed to investigate the cortical regions differentially activated by one or the other stimulus category, and thus address the question of stimulus specificity. This can be assessed by the subtraction of one recognition condition from the other using a PET analysis technique, such as the Montreal method (Worsley et al., 1992; Arndt et al., 1995). It is also of interest to study the regions common to word and face recognition memory, because they may be considered to be a general recognition memory pathway. These regions can be identified by a visual comparison between the brain regions related to word and face recognition memory, which are obtained by subtraction of recognition conditions from their perceptual baselines.

In addition to the stimulus type, another critical component affecting the pathway used for visual recognition may be the process for retrieving information. It has been shown to vary depending on the retention interval between the initial exposure to the stimulus and its recognition (McIntosh et al., 1996) and on the level of familiarity, reflecting the degree to which the stimulus was learned before recognition (Raichle et al., 1994; Andreasen et al., 1996; Tulving et al., 1996). Recently, the time allowed for retrieval has also been suggested to affect the degree of activation of the right anterior prefrontal cortex, which has been shown to be activated during retrieval tasks (Wagner et al., 1998). Thus, these variables related to the retrieval of previously acquired information must be controlled in order to make adequate decisions about the differences between the recognition pathway used for words and that used for faces.

The present study was designed to compare the regional cerebral blood flow (rCBF) associated with different recognition tasks by using H₂¹⁵O-PET in normal subjects. The tasks consisted of recognizing two contrasting types of stimuli (faces and words) in two familiarity conditions, one familiar and one novel. The goal was, on the one hand, to identify the neural substrates selectively involved in word and in face recognition memory and, on the other hand, to find the substrates common to both types of memory.

Methods

Subjects

The subjects were 33 healthy right-handed volunteers recruited from the community through newspaper advertising. They were screened to rule out psychiatric, neurological and general medical illness by using a structured interview: a short version of the Comprehensive Assessment of Symptoms and History (CASH) (Andreasen et al., 1992b). The mean age of the subjects was 26.7 years (SD = 8.1) and their mean educational achievement was 14.5 years (SD = 1.6). Twenty-one were female and 12 were male. All gave written informed consent to a protocol approved by the University of Iowa Human Subjects Institutional Review Board.

Cognitive tasks

The experimental tasks consisted of four recognition conditions, which were chosen to permit comparisons across the following tasks: familiar words versus familiar faces and novel words versus novel faces. The two baseline conditions were reading words and gender classification. For the task referred to as recognizing familiar words or faces, subjects were required to remember a list of 18 words and 18 faces in a single training session 1 week prior to the PET experiment. Subjects were given successive study-and-test trials, with self-paced item presentation on a video monitor, until they reached a perfect performance level. Subjects were not given any explicit instructions about learning strategies. In a second training session provided on the day before the PET experiment, subjects were asked to recognize the 18 words and 18 faces by responding `yes' or `no' to indicate whether the word or face presented on the monitor had been previously learned or was one of the intermixed distractors. If they made any errors, they were re-exposed to the original lists until perfect recognition was again achieved. For the condition referred to as recognizing novel words or faces, the subjects were instructed to remember a new list of 18 words and 18 faces, which were presented on a video monitor in a single trial at a rate of one item every 2 s; the last item was shown to the subjects 60 s prior to the yes/no recognition test given during PET data acquisition.

The recognition tasks were designed so that stimulation and output were as similar as possible across tasks during PET data acquisition; all stimuli were scanned digitally and presented visually for 500 ms at 2.5-s intervals on a video monitor positioned 12–13 inches above the nose of the subject. All target and distractor words consisted of one- or two-syllable concrete nouns of similar frequency of usage (Thorndike and Lorge, 1944), and were presented as black letters (0.5 inches high × maximum 2.5 inches long) on a light grey background. All target and distractor faces were taken from a 1960 college yearbook to ensure that they would be initially unfamiliar to the subjects and relatively homogeneous in appearance: similar hairstyles among the men or the women, no extremely long hair and no facial hair. They were presented as black-and-white pictures (5 inches high × 4 inches wide) on a light grey background, and consisted of equal numbers of males and females.

All output consisted of a spoken verbal response by the subject. During the tasks, subjects were asked to respond with `yes' or `no' to indicate whether the word or face seen on the monitor was one of 18 targets learned previously or one of 22 intermixed distractors. To ensure that blood flow would be measured during an intense phase of the cognitive effort, the protocol was designed so that the proportion of targets was 80% during the 40-s acquisition time window centred around the arrival of the bolus of H₂¹⁵O in the brain (Hurtig et al., 1994). The total duration of every task was 120 s.

The experimental baseline tasks consisted of reading words and gender classification. In the former task, subjects were shown a new set of common English words of the same length and frequency as the words used for the recognition tasks and were asked to read them aloud. In the latter task, subjects were shown a new set of faces of the same type as those used for the recognition tasks and were asked to say `boy' or `girl' depending on the gender of the face.

Imaging data acquisition

The PET data were acquired with a bolus injection of 75 mCi of H₂¹⁵O in 5–7 ml saline, using a GE 4096-plus 15-slice whole-body scanner. Imaging and arterial blood sampling (for calculation of tissue perfusion) began at the time of tracer injection via a venous line and continued for 100 s in ten 10-s or twenty 5-s frames. The time from injection to bolus arrival in the brain was measured for each subject. Based on the time–activity curves, the frames reflecting the 40 s after bolus transit were summed (Hurtig et al., 1994). The summed image was reconstructed into 2-mm voxels in a 128 × 128 matrix by using a Butterworth filter (order = 6, cutoff frequency = 0.35 Nyquist interval). Using the blood curve (expressed in PET counts) and an assumed brain/blood partition coefficient of 0.90, cerebral blood flow was calculated on a voxel-by-voxel basis by the autoradiographic method (Raichle et al., 1983). Injections were repeated at intervals of ~15 min.
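
The frame-summing step can be illustrated with a minimal sketch. It is not the acquisition software used in the study; the function name, the flat-list image layout and the rule for choosing frames (a frame is included when its start time falls within the 40 s following bolus arrival) are all assumptions made for illustration, since the text does not say how frames straddling the window edge were handled.

```python
def sum_frames_after_bolus(frames, frame_duration_s, bolus_arrival_s, window_s=40.0):
    """Sum the dynamic frames that start within `window_s` seconds of bolus arrival.

    frames: list of frames, each a flat list of voxel counts (hypothetical layout).
    Returns the summed image and the indices of the frames that were used.
    """
    n_voxels = len(frames[0])
    summed = [0.0] * n_voxels
    used = []
    for i, frame in enumerate(frames):
        start_s = i * frame_duration_s  # frame 0 starts at tracer injection
        # Assumed inclusion rule: frame start lies inside the post-bolus window.
        if bolus_arrival_s <= start_s < bolus_arrival_s + window_s:
            used.append(i)
            for v in range(n_voxels):
                summed[v] += frame[v]
    return summed, used


# Twenty 5-s frames with a made-up bolus arrival time of 20 s after injection.
frames = [[1.0, 2.0] for _ in range(20)]
summed, used = sum_frames_after_bolus(frames, frame_duration_s=5.0, bolus_arrival_s=20.0)
print(used)    # frames 4 to 11, i.e. eight 5-s frames covering 40 s
print(summed)
```

Because the bolus arrival time was measured for each subject, the selected frames would differ from one injection to the next.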

MRI scans, to be used for anatomical localization of functional activity, were obtained for each subject with a standard T1-weighted three-dimensional SPGR sequence on a 1.5-tesla GE Signa scanner (echo time = 5 ms; repetition time = 24 ms; flip angle = 40°; number of excitations = 2; field of view = 26 × 26 cm; matrix = 256 × 192; slice thickness = 1.5 mm).

Image processing and statistical analysis

Images were analysed by the locally developed software package BRAINS (Brain Research: Analysis of Images, Networks, and Systems) (Andreasen et al., 1992a, 1993; Cizadlo et al., 1994). The outline of the brain was identified on the MRI images by a combination of edge detection and manual tracing. The anterior commissure–posterior commissure line was identified and used to realign the brains to a standard position and place each brain in standardized Talairach coordinate space (Talairach and Tournoux, 1988). The outline of the brain in the PET image was identified automatically using an edge-detection algorithm. The PET image for each individual was then fitted to that individual's MRI scan using a surface fit algorithm (Levin et al., 1988; Cizadlo et al., 1994). The MRIs from all 33 subjects were averaged and the functional activity measured with the PET scan was coregistered on the MRI `average brain'. The coregistered images were resampled and simultaneously visualized in all three orthogonal planes (Andreasen et al., 1992a, 1993).

Statistical analysis of the images was performed using an adaptation of the Montreal method (Worsley et al., 1992; Arndt et al., 1995). An 18-mm Hanning filter was applied to the PET images for each condition to eliminate residual anatomical variability. Images were resampled to 128 × 128 × 80 voxels by using the bounding box of the Talairach atlas (Talairach and Tournoux, 1988). Within-subject subtraction of relevant injections was then performed, followed by across-subject averaging of the subtraction images and computation of voxel-by-voxel t tests of the regional cerebral blood flow changes. Significant regions of activation were calculated and displayed on the t-map images by using a technique correcting for the large number of voxel-by-voxel t tests performed, the lack of independence between voxels and the resolution of the processed PET images.
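
The subtraction logic can be sketched in a few lines. The sketch below is a plain voxel-by-voxel paired t statistic and does not reproduce the adapted Montreal method itself (in particular, it applies no smoothing and no correction for multiple comparisons); the function name, the nested-list data layout and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def voxelwise_t(cond_a, cond_b):
    """Paired t value per voxel for condition A minus condition B.

    cond_a, cond_b: one list per subject, each holding one rCBF value per voxel,
    with subjects and voxels in the same order in both conditions.
    """
    n_subjects = len(cond_a)
    n_voxels = len(cond_a[0])
    # Within-subject subtraction images.
    diffs = [[a - b for a, b in zip(subj_a, subj_b)]
             for subj_a, subj_b in zip(cond_a, cond_b)]
    t_map = []
    for v in range(n_voxels):
        d = [diffs[s][v] for s in range(n_subjects)]
        sd = stdev(d)
        # Across-subject mean difference divided by its standard error;
        # a voxel with zero variance has no defined t value.
        t_map.append(mean(d) / (sd / math.sqrt(n_subjects)) if sd > 0 else math.nan)
    return t_map


# Made-up values for four subjects and two voxels.
words = [[52.0, 40.0], [55.0, 41.0], [51.0, 43.0], [56.0, 39.0]]
faces = [[50.0, 44.0], [51.0, 46.0], [50.0, 45.0], [52.0, 45.0]]
t_map = voxelwise_t(words, faces)
print(t_map)  # first voxel positive, second voxel negative
```

For a words minus faces subtraction, a positive value marks a voxel with higher flow during the word task and a negative value a voxel with higher flow during the face task.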

Subtractions were chosen to make comparisons across conditions. The statistical analyses used to identify the brain regions specific to word and to face recognition memory consisted of three subtractions: novel words minus novel faces, familiar words minus familiar faces, and reading words minus gender classification. The first two subtractions indicate relative differences in the brain circuitry between recognizing words and recognizing faces within a defined level of familiarity. The third compares the two baseline tasks, which include the perceptual components present in word and face recognition, and thus permits the isolation of the mnemonic component of word versus face recognition. The subtractions produced both positive and negative t values; positive t values are referred to as `word activations' and negative t values as `face activations'. The minus signs of the negative t values have been omitted in the tables, however, so that they read as relative activations for recognizing faces versus recognizing words.

Data analysis to find the brain regions shared by word and face recognition memory contained four kinds of subtractions: novel words minus reading words, novel faces minus gender classification, familiar words minus reading words, and familiar faces minus gender classification. Because the present analysis was part of a larger study, some components of these subtractions have been reported previously (Andreasen et al., 1995a, 1996; Wiser et al., 1998). The first two and last two of the subtractions were compared visually, and the overlapping areas with similar Talairach coordinates were labelled as common areas.
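
In this study the overlap was judged visually, on the basis of similar Talairach coordinates. As a simplified stand-in for that step, and not the procedure actually used, the sketch below marks a voxel as common when it exceeds a threshold in both recognition-minus-baseline t-maps. The function name and example values are hypothetical, and the default threshold of 3.61 is borrowed from the criterion used for the tables.

```python
def common_voxels(t_map_words, t_map_faces, threshold=3.61):
    """Indices of voxels that exceed `threshold` in both t-maps.

    t_map_words: t values for a word recognition minus reading words subtraction.
    t_map_faces: t values for a face recognition minus gender classification subtraction.
    Both maps must list voxels in the same order (same coordinate space).
    """
    return [i for i, (t_w, t_f) in enumerate(zip(t_map_words, t_map_faces))
            if t_w > threshold and t_f > threshold]


# Made-up t values for four voxels.
novel_words_minus_reading = [4.0, 3.0, 5.2, 3.7]
novel_faces_minus_gender = [3.9, 4.5, 1.0, 3.62]
print(common_voxels(novel_words_minus_reading, novel_faces_minus_gender))
```

A strict voxelwise intersection is more conservative than matching peaks with similar coordinates, so it could miss nearby activations that a visual comparison would count as overlapping.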

Data are reported in the tables by showing the highest t value identified in the significantly activated area (or `peak'), the volume of the peak, the x, y and z Talairach coordinates of the voxel with the highest t value in the peak, and the anatomical location of the peak. The peaks reported in the tables contain at least 50 contiguous voxels exceeding a t threshold of 3.61, which corresponds to an uncorrected significance level of 0.0005. Anatomical localization is based not only on Talairach coordinates but also on visual inspection of coregistered MRIs and PET images.
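
The reporting criterion (at least 50 contiguous voxels above t = 3.61) amounts to a cluster-size threshold, which can be sketched as follows. The function name, the nested-list volume layout and the use of face-adjacent (6-neighbour) contiguity are assumptions made for illustration; the text does not state how contiguity was defined. Array libraries offer equivalent labelling routines (for example `scipy.ndimage.label`), but the sketch uses only the standard library.

```python
from collections import deque

# Offsets to the six face-adjacent neighbours (assumed definition of contiguity).
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def find_peaks(t_vol, t_thresh=3.61, min_voxels=50):
    """Return clusters of contiguous supra-threshold voxels of sufficient size.

    t_vol: nested lists indexed as t_vol[z][y][x].
    Each result holds the highest t value, the cluster size in voxels and the
    (z, y, x) index of the voxel with the highest t value.
    """
    nz, ny, nx = len(t_vol), len(t_vol[0]), len(t_vol[0][0])
    seen = set()
    peaks = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if (z, y, x) in seen or t_vol[z][y][x] <= t_thresh:
                    continue
                # Breadth-first flood fill collects one contiguous cluster.
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                cluster = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    cluster.append((cz, cy, cx))
                    for dz, dy, dx in NEIGHBOURS:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and t_vol[n[0]][n[1]][n[2]] > t_thresh:
                            seen.add(n)
                            queue.append(n)
                if len(cluster) >= min_voxels:
                    best = max(cluster, key=lambda p: t_vol[p[0]][p[1]][p[2]])
                    peaks.append({"max_t": t_vol[best[0]][best[1]][best[2]],
                                  "n_voxels": len(cluster),
                                  "voxel": best})
    return peaks
```

The function handles positive t values only. Face activations, which are the negative t values in the words minus faces subtractions, could be found by passing a volume with the signs reversed. Converting the voxel index to Talairach coordinates would need the volume's coordinate mapping, which is not part of this sketch.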

Results

Recognition performance was monitored in order to ascertain how well the subjects remembered the words or faces and to examine the differences among the task conditions. The hit rates, based on all 40 items, were 97.8% (SD = 3.2) and 96.3% (SD = 4.3) for recognizing familiar words and faces, respectively, and 88.6% (SD = 8.6) and 82.4% (SD = 9.1) for recognizing novel words and faces, respectively. Analysis of variance (ANOVA) revealed a significant effect of task condition [F(3) = 34.03, P < 0.0001]. Tukey's test revealed that performance in the familiar condition was superior to performance in the novel condition. There was no significant difference between the performances for words and faces in the familiar condition, but in the novel condition the performance for recognizing words was superior to the performance for recognizing faces.

Recognition of novel words versus recognition of novel faces

The results of subtracting novel faces from novel words are shown in Table 1. Areas of relative activation during the novel word recognition task were distributed mainly in the left hemisphere: in the orbital and inferior frontal gyri, the middle and inferior temporal gyri, and the insula. In the right hemisphere, a single but large peak was observed in the inferior parietal cortex. Additional peaks were seen on both sides of the cerebellum.

On the other hand, areas activated by the novel face recognition task were predominantly distributed in the posterior part of the right hemisphere; a very large peak covered the left and right occipital areas and extended forwards into the right parietal cortex through the right lateral occipital cortex and into the right inferior temporal cortex through the right fusiform gyrus (see the sagittal slices of Figs 1 and 2). In the anterior part of the brain, only one peak in the right orbitofrontal gyrus appeared when recognizing novel faces.

Recognition of familiar words versus recognition of familiar faces

The areas identified as significant in the familiar words minus familiar faces subtraction are shown in Table 2. The hemispheric distribution in the familiar condition was very similar to that seen in the novel condition. Areas activated in recognizing familiar words, like those activated in the recognition of novel words, were mainly located in the left hemisphere: the middle frontal gyrus, the anterior cingulate, the insula, numerous temporal areas (Fig. 3) and the inferior parietal cortex. The right inferior parietal cortex and both sides of the cerebellum were also activated in both conditions. Areas of activation observed for the familiar but not for the novel words were located on the right side in the temporal lobe and the insula.

The similarity between areas activated in recognizing familiar faces and areas involved in recognizing novel faces was also striking. A very large peak was located in the occipital lobe on both sides and extended forwards into the right inferior parietal region and into the right inferior temporal region. In addition, the right putamen and the left precuneus were activated when recognizing familiar faces.

Reading words versus gender classification

The results of the comparison between reading words and gender classification are shown in Table 3. The general pattern of activation in reading words was different from that in word-recognition conditions; the former included numerous activations in the right cingulate, the left precentral gyrus and the right postcentral gyrus that were not found in the recognition conditions. However, activations on both sides of the insula, in the middle portion of the inferior temporal gyrus and in the right cerebellum were also seen in the two word-recognition conditions.

Gender classification, like the face recognition conditions, activated very large posterior areas on both sides of the occipital lobe, in the right fusiform gyrus and in the right inferior parietal cortex. The other activations, especially numerous frontal activations, illustrated the difference between gender classification and face recognition.

Comparison between word and face recognition memory

Table 4 shows the overlapping areas obtained by visual comparison of activated areas in the recognition task minus baseline task: novel words minus reading words versus novel faces minus gender classification, and familiar words minus reading words versus familiar faces minus gender classification. These areas represent the pathways common to novel word and face recognition memory and to familiar word and face recognition memory. They may be considered as part of a general recognition memory pathway. The left anterior cingulate cortex and the left cerebellum were commonly engaged in novel recognition memory (Fig. 4), whereas the left posterior cingulate cortex, the right superior parietal cortex and the left cerebellum were commonly involved in familiar recognition memory.

Discussion

The main goal of the present study was to investigate the differences in rCBF between word recognition and face recognition in normal subjects. Assuming that these two tasks share some of the regions involved in visual recognition, subtracting face recognition from word recognition would eliminate regions activated by both tasks. The assumption is, therefore, that this subtraction paradigm sorts out the neural substrates involved specifically in the process of recognition for words from those involved in face recognition. A second goal was to identify the pathways common to both forms of recognition memory, which was examined by identifying the overlapping regions obtained in the subtraction of the recognition task minus baseline perception task. Finally, although reading words and gender classification were used as baseline tasks, they can be considered as another set of experimental tasks, which are useful in determining the perceptual components of both sets of tasks excluding the mnemonic components.

Characteristics of activated pattern

The major conclusion to be drawn from the patterns of brain activation revealed in this experiment is that cerebral asymmetry differs strikingly between word recognition and face recognition; the activation sites for word recognition were largely lateralized to the left hemisphere whereas those for face recognition were localized in the right hemisphere. Left lateralization in visual and auditory word recognition has been demonstrated in a number of studies that specifically activated semantic and phonological processes (Petersen et al., 1988, 1990; Wise et al., 1991; Demonet et al., 1992; Howard et al., 1992; Price et al., 1994, 1996) as well as prelexical letter-string processing (Nobre and McCarthy, 1994). Right lateralization in face recognition is also consistent with the findings of previous imaging or electrophysiological studies (Ojemann et al., 1992; Sergent et al., 1992; Allison et al., 1994a; Puce et al., 1996; Kanwisher et al., 1997) and clinical studies demonstrating significant impairment of facial recognition in patients with right hemisphere lesions (Whiteley and Warrington, 1977; De Renzi, 1986; Rosler et al., 1997).

A particularly interesting activation with this lateralization pattern occurred in the orbitofrontal cortex; it was activated on the left side in novel word recognition and symmetrically activated on the right side in novel face recognition (see the transaxial slices in Fig. 1). The orbitofrontal cortex has diverse connections to other parts of the brain, including most of the limbic structures, and may play an important role in memory (Morecraft et al., 1992). The novel recognition tasks in this experiment demand a relatively short retention time, and keeping the image in memory during the short period of retention might occur by encoding processes in the prefrontal cortex (Jonides et al., 1993). Thus, the lateralization of these orbitofrontal areas might be explained by a difference in the set of buffers that temporarily store information: the left side may operate as a temporary storage site for phonological form in word recognition processes, and the right side as a temporary storage site for visuospatial form in face recognition processes.

Another striking difference between word and face recognition is in the pattern of distribution of the activation sites; those for word recognition tend to be smaller and distributed in several loci whereas those for face recognition tend to be fewer and to be aggregated around one area. These patterns are well demonstrated in the transaxial planes of Fig. 2. We believe that these patterns of activation result from the difference in semantic processing and perceptual loading. Visual word recognition relies on semantic processing as well as on visuospatial processing. Language comprehension is known to involve access to long-term memory and to activate large-scale networks in which the flow of neural signals is fast, as demonstrated by short processing times (Mesulam, 1990). Lexicosemantic processing in language comprehension seems to implicate more widely distributed regions of the association cortex in the left hemisphere such as the angular gyrus, the middle and inferior parts of the temporal lobe and the dorsolateral prefrontal areas (Demonet et al., 1992). This trend may be related to the task performance in our results; semantic cueing probably leads to higher performance in recognizing words than in recognizing faces in the novel condition requiring working memory, but not in the well-learned familiar condition. On the other hand, the facial task is not associated with semantic information (faces were chosen to be lacking in verbalizable features). Instead, this task requires sophisticated perceptual mechanisms capable of detecting subtle differences among faces and of achieving a structural representation that makes each face unique (Sergent, 1994). 
The extraction of particularities from a general configuration common to all faces is probably enabled by the analysis of a number of clues, such as the width and height of the different elements that compose a face, distances between these elements, angles, contours, illumination, expression, hairline, hair style and so on. Facial recognition, therefore, appears to require more activities in the visual association cortex than word recognition.

Areas related to word recognition

The results of this experiment show that activation sites revealed by subtracting familiar faces from familiar words include most of the activation sites revealed by subtracting novel faces from novel words. Most of the areas they have in common seem to reflect specific recognition sites for words or faces regardless of their level of familiarity; some of these are also shown in the comparison of the two baselines: reading words and gender classification. In the case of word recognition, these common areas comprise the posterior portion of the left middle and inferior temporal gyri (Fig. 3), the middle portion of the left inferior temporal gyrus, the left insula, the right inferior parietal lobule and both sides of the cerebellum. A similar set of brain areas (left middle and inferior temporal gyri, left inferior parietal region and left superior prefrontal region) is activated by lexicosemantic processing during language tasks with auditorily presented words (Demonet et al., 1992).

In particular, the posterior portions of the left middle and inferior temporal gyri activated in word recognition have also been found to be engaged in both reading aloud and reading silently (Price et al., 1994), but not in the perception of simple letter-strings (Puce et al., 1996). This confirms the important role these areas play in processing visually presented words. Although these areas are not activated in the baseline reading words condition, their absence may be a result of the subtraction, because gender classification activates similar sites (Andreasen et al., 1996). These areas are related to a lexicon for written word recognition as opposed to a spoken word lexicon, which is located in more superior and anterior sites (Howard et al., 1992). Furthermore, activation of the right temporal cortex for recognizing words is observed only in the familiar condition and not in the novel condition. This suggests that the activation may be associated with relatively long-term verbal recognition. Although verbal memory, like word recognition, is thought to be left hemisphere-dominant, the right temporal lobe has been demonstrated to be activated during some verbal memory tasks (Wise et al., 1991; Fletcher et al., 1996). Moreover, it is suggested that the amount of right temporal activity is inversely related to the ease with which word associations are accessed (Price et al., 1996). We suggest that, in this experiment, the left temporal cortex serves as a lexical processing unit and the right temporal cortex serves as a long-term lexical storage area.

Another area common to word recognition in both novel and familiar conditions is the left insula. This area is also markedly activated in baseline reading tasks, which is consistent with previous findings (Raichle et al., 1994; Andreasen et al., 1995a). Fiez and colleagues (Fiez et al., 1996) suggested that the insula might be associated with relatively automatic or overlearned tasks because it was also activated during silent number-counting. However, in the recognition paradigm of our experiment, it is difficult to explain the left insular activation as an automatic response. Instead, a more plausible interpretation is that the activation may be related to internal phonological processing; i.e. reading is accompanied by the computation of a phonological representation associated with the visual word form, and the recognition of written words evokes recoding into their phonological form, which is mediated by neural structures close to the left sylvian fissure, including Wernicke's area, the supramarginal gyrus and the insular cortex. In addition, there is PET evidence for the involvement of the left insular cortex in phonological processing (Paulesu et al., 1993).

It is noteworthy that the direct comparison in this experiment did not reveal any word-related activations in the extrastriate cortex, though they appear in the baseline reading words condition. Indeed, the extrastriate cortex, particularly on the left side, is reportedly activated by the visual presentation of words (Petersen et al., 1990; Puce et al., 1996) but not by the auditory presentation of words (Petersen et al., 1988), suggesting that visual word percepts are developed in the occipital lobe. In addition, pure alexia, the inability to read without other language deficits, can result from lesions of the left occipitotemporal boundary (Damasio and Damasio, 1983). The occipital activations triggered by visually presented words seem to reflect a prelexical rather than a lexical stage of word processing because they are insensitive to the type of word presented and can occur with real words or pseudowords (Petersen et al., 1990; Nobre et al., 1994). (As mentioned previously, the lexical processes for reading words activate temporal lobe regions rather than occipital regions.) However, in our experiment, word recognition was subtracted from face recognition, and faces probably activate the extrastriate cortex to a greater extent than do words, which would mask any word-related activation in the extrastriate cortex. This is supported by evidence from a neurophysiological experiment with an N200 wave showing that more sites in the extrastriate cortex respond to face stimuli than to letter-strings (Puce et al., 1996).

Areas related to face recognition

In the case of face recognition, the areas activated in both the novel and the familiar condition consist of the cuneus and lingual gyrus of both occipital lobes, the right fusiform and inferior temporal gyri and the right inferior parietal lobule. These areas of activation are extensive because they are connected to one another, and they parallel the classically defined ventral stream dedicated to object recognition and the dorsal stream dedicated to object localization; the sagittal views of Figs 1 and 2 illustrate these two streams. Studies of non-human primates have shown that recognizing the identity of objects is dissociated from perceiving spatial relations among objects; the former engages an occipitotemporal pathway (ventral stream) and the latter an occipitoparietal pathway (dorsal stream) (Rolls, 1984; Desimone, 1991). This dissociation of pathways for visual processing in non-human primates has also been supported by PET studies in human subjects, suggesting that there may be similar specific pathways that analyse such features as colour, motion and location (Corbetta et al., 1991; Haxby et al., 1991; Zeki et al., 1991; Horwitz et al., 1992; Jonides et al., 1993; Grady et al., 1994).

Previous activation studies have consistently shown that the bilateral lingual/fusiform gyri comprising the ventral stream are involved in the face recognition process (Haxby et al., 1991, 1994; Sergent et al., 1992; Kapur et al., 1995; Puce et al., 1995, 1996; Andreasen et al., 1996; Kanwisher et al., 1997). The fusiform gyrus in particular is activated by all face-processing tasks, suggesting that this area is responsible primarily for perceptual operations in viewing faces, independently of the mnemonic operations associated with other areas activated by encoding and retrieval (Haxby et al., 1996). Activation of the fusiform gyrus, however, is not specific to facial stimuli. Previous studies report that the fusiform gyri are also activated by visual discrimination of colour or shape (Corbetta et al., 1991; Zeki et al., 1991) and even by words presented visually (Nobre et al., 1994), but not by visual stimuli such as chequerboards or dot patterns (Fox et al., 1986, 1987). There has been some controversy concerning the existence of category-specific subsystems for object perception in the extrastriate cortex, i.e. whether or not different regions of the extrastriate cortex process different visual stimulus attributes. According to advocates of the ‘unitary system’, face recognition is just another type of object recognition. This view is based on studies of patients who have lost the ability to identify familiar faces and often also exhibit other types of visual agnosia; for example, a number of neurological patients with prosopagnosia also have cerebral achromatopsia, the inability to identify colours (Damasio et al., 1982). In contrast, some imaging studies have shown that occipitotemporal regions were more active during face perception than during object perception (Sergent et al., 1992), during face matching than during location matching (Haxby et al., 1991, 1994) and during face perception than while viewing scrambled faces (Puce et al., 1995) or textures (Malach et al., 1995; Puce et al., 1996). In addition, large N200 potentials evoked by face stimuli in the fusiform gyri have not been elicited by inanimate objects (Allison et al., 1994b). The findings of these studies suggest that neural substrates specialized for face perception, and not merely for object perception, exist in the extrastriate cortex.

The right parietal activation during face recognition in this experiment is consistent with the findings of many human lesion studies, which have implicated the right parietal lobe as an association cortex used for face recognition (Benton, 1980). As mentioned previously, the inferior parietal cortex constitutes part of the dorsal stream and is engaged in the spatial localization of objects. Thus, damage to the inferior parietal cortex can impair the capacity to localize objects in the environment but spare the ability to recognize and identify objects (Martin, 1996). In some neurophysiological studies, the inferior parietal cortex was activated by facial stimuli along with the occipitotemporal junction, which was critical for the ability to recognize (Lu et al., 1991). Our results also point to the activation of the dorsal stream during face recognition. One possible explanation is that activation of this area in face recognition reflects reliance on the perception of the spatial relationships among the different elements of a face. However, it has been suggested that perceiving the spatial arrangement of facial elements is not necessary for face recognition (Sinha and Poggio, 1996). Moreover, the right inferior parietal area is not always activated in facial processing, as the occipitotemporal area is (Sergent et al., 1992; Haxby et al., 1994; Puce et al., 1996), suggesting that this area may be related not to the formation of a facial percept but to later stages of the recognition process. Another possible explanation is that the right parietal activity reflects the matching stage of the recognition process, i.e. the matching of newly formed internal images to previously stored information. This hypothesis can also be applied to explain the right inferior parietal activation by word recognition in this experiment. This area may also be activated in the baseline gender classification condition because the task requires matching of the presented face to an internal virtual facial representation of the two genders. There is some evidence that this area is activated during recognition, but not during the encoding, of faces (Grady et al., 1995).

Neural pathways common to word and face recognition memory

In addition to identifying differences in the pathways for word and face recognition, a second goal of this study was to identify the regions common to both word and face recognition memory, because these may be considered part of a common recognition memory pathway. As shown in Table 4, one area common to both is the left cingulate gyrus; the left anterior cingulate is engaged in novel recognition while the left posterior cingulate is engaged in familiar recognition. The cingulate gyrus, a portion of the limbic association cortex, has been subdivided as anterior versus posterior, limbic versus paralimbic, or subcallosal versus supracallosal, but it is still unclear whether each subdivision has its own specific function. The supracallosal anterior cingulate is anatomically connected with the prefrontal region, and so may be functionally related to recent or working memory (McIntosh et al., 1996; Jonides et al., 1998). However, a meta-analysis of the results of 107 PET studies has recently suggested that the core function of this region is related to task difficulty (Paus et al., 1998). Thus, the activations of the anterior cingulate in this experiment are probably not specific to recognition memory, but instead reflect the task difficulty brought about by the recognition of novel materials. This concept may be extended to the possibility that the activations of the posterior cingulate in this experiment are associated with the relative ease of familiar recognition. The functional dissociation between the anterior and posterior cingulate demonstrated in our human study is supported by the findings of previous animal studies that the anterior cingulate is associated with the early stages of learning and constitutes a ‘recency system’, while the posterior cingulate is related to the later stages of learning and constitutes a ‘primacy system’ (Gabriel et al., 1990; Bussey et al., 1996).

Another area common to both recognition pathways is the left cerebellum. Many studies now indicate that the cerebellum co-ordinates and integrates a wide range of processes not confined to motor function. Our own PET studies of normal individuals have produced impressive cerebellar activations during a wide range of cognitive tasks (Andreasen et al., 1995a, b, c, d, 1996). Cerebrocerebellar connections have been established for motor, sensory and limbic regions as well as for parts of the prefrontal and parietal association cortices (Schmahmann and Pandya, 1997). In particular, neurons in the ventral dentate nuclei that project to the prefrontal cortex via the thalamus may mediate cerebellar activation during working memory tasks (Leiner et al., 1993; Middleton and Strick, 1994). This prefrontal–cerebellar connection plays an important role in diverse cognitive tasks, and its impairment may be a core mechanism of some psychiatric disorders, such as schizophrenia (Andreasen et al., 1998). Although numerous activations in various brain areas occurred in the four different subtractions presented in Table 4, the left cerebellum was the only site common to every subtraction. This suggests that the cerebellum, especially the left side, may serve as a co-ordinator or integrator of recognition memory processes.

Conclusion

We have shown that the activation sites in word recognition tend to be lateralized to the left hemisphere and distributed as numerous small loci, but those in face recognition tend to be lateralized to the right hemisphere and located in a few extended areas. We believe that semantic cueing plays a critical role in word recognition, and that the posterior portion of the left inferior and middle temporal gyri is highly involved in this process. In contrast, perceptual loading for discrimination is critical for face recognition, and is processed in the right fusiform gyrus. This set of results demonstrates that strikingly different neural pathways are engaged during recognition memory for words and faces. In addition, the investigation of the regions activated commonly by both word and face recognition memory reveals that the anterior and posterior cingulate have dissociable functions in recognition memory related to the level of familiarity, and that the cerebellum may play a role in the co-ordination of recognition memory processes.

Table 1

Comparison between recognizing words and faces in the novel condition

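| Task | Tmax | Volume (ml) | xmax | ymax | zmax | Location/Brodmann area |
| --- | --- | --- | --- | --- | --- | --- |
| Novel words | 4.77 | 0.5 | –29 | 39 | –18 | Left orbitofrontal/11 |
| Novel words | 4.37 | 0.4 | –42 | 23 | –8 | Left inferior frontal/47 |
| Novel words | 4.01 | 0.2 | –52 | –50 | 0 | Left posterior portion of middle temporal/21 |
| Novel words | 4.50 | 0.3 | –50 | –58 | –18 | Left posterior portion of inferior temporal/37 |
| Novel words | 4.32 | 0.6 | –51 | –30 | –17 | Left middle portion of inferior temporal/20 |
| Novel words | 3.95 | 0.2 | –47 | –19 | –22 | Left anterior portion of inferior temporal/20 |
| Novel words | 5.94 | 1.8 | 53 | –46 | 41 | Right inferior parietal/40 |
| Novel words | 4.05 | 0.1 | –36 | 12 | 6 | Left insula |
| Novel words | 4.21 | 0.4 | 39 | –70 | –33 | Right cerebellum |
| Novel words | 3.84 | 0.1 | –34 | –78 | –30 | Left cerebellum |
| Novel faces | 4.50 | 0.1 | 29 | 30 | –17 | Right orbitofrontal/11 |
| Novel faces | 8.00 | 26.3 | 0 | –86 | 10 | Both sides; cuneus, lingual gyrus/17, 18; right lateral occipital/18, 19; right superior and inferior parietal/7, 39, 40 |
| Novel faces | 7.53 | 12.3 | 35 | –36 | –21 | Right fusiform and inferior temporal gyri/19, 37, 20 |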
Table 2

Comparison between recognizing words and faces in the familiar condition

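| Task | Tmax | Volume (ml) | xmax | ymax | zmax | Location/Brodmann area |
| --- | --- | --- | --- | --- | --- | --- |
| Familiar words | 4.51 | 0.2 | –25 | 7 | 51 | Left middle frontal/6 |
| Familiar words | 3.83 | 0.1 | –4 | 33 | 22 | Left anterior cingulate/24 |
| Familiar words | 3.94 | 0.2 | –48 | –20 | 10 | Left transverse temporal/41 |
| Familiar words | 4.50 | 1.5 | –53 | –45 | 14 | Left posterior portion of superior temporal/22 |
| Familiar words | 4.44 | 1.6 | –50 | –42 | 1 | Left posterior portion of middle temporal/21 |
| Familiar words | 5.38 | 1.5 | –52 | –46 | –14 | Left posterior portion of inferior temporal/37 |
| Familiar words | 4.27 | 0.4 | –48 | –27 | –22 | Left middle portion of inferior temporal/20 |
| Familiar words | 4.13 | 0.3 | 60 | –47 | 16 | Right superior temporal/22 |
| Familiar words | 4.42 | 0.6 | 50 | –27 | –2 | Right middle temporal/21 |
| Familiar words | 4.74 | 1.6 | 50 | –46 | 38 | Right inferior parietal/40 |
| Familiar words | 4.71 | 1.3 | –45 | –55 | 45 | Left inferior parietal/40 |
| Familiar words | 4.36 | 0.2 | 39 | 12 | –6 | Right insula |
| Familiar words | 4.23 | 0.2 | –40 | 12 | –3 | Left insula |
| Familiar words | 4.04 | 0.8 | 13 | –80 | –24 | Right cerebellum |
| Familiar words | 4.15 | 0.3 | –14 | –92 | –30 | Left posterior cerebellum |
| Familiar words | 3.96 | 0.1 | –36 | –72 | –30 | Left lateral cerebellum |
| Familiar faces | 10.61 | 44.5 | 2 | –89 | 11 | Both sides; cuneus, lingual gyrus/17, 18; right fusiform and inferior temporal gyri/19, 37, 20 |
| Familiar faces | 4.90 | 0.6 | 29 | –41 | 44 | Right inferior parietal/40 |
| Familiar faces | 4.19 | 0.2 | –10 | –66 | 38 | Left precuneus/7 |
| Familiar faces | 4.31 | 0.4 | 16 | 6 | –6 | Right putamen |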
Table 3

Comparison between reading words and gender classification

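| Task | Tmax | Volume (ml) | xmax | ymax | zmax | Location/Brodmann area |
| --- | --- | --- | --- | --- | --- | --- |
| Reading words | 3.85 | 0.1 | 3 | 30 | –6 | Right anterior cingulate/24 |
| Reading words | 5.45 | 2.4 | 4 | –11 | 42 | Right middle cingulate/31 |
| Reading words | 4.64 | 0.7 | 53 | –3 | –6 | Right anterior portion of middle temporal/21 |
| Reading words | 5.39 | 1.3 | –35 | –35 | –19 | Left middle portion of inferior temporal/20 |
| Reading words | 3.99 | 0.2 | –36 | –12 | 44 | Left precentral gyrus/4 |
| Reading words | 3.85 | 0.1 | –46 | –12 | 38 | Left precentral gyrus/4 |
| Reading words | 4.06 | 0.2 | 46 | –16 | 26 | Right postcentral gyrus/1, 2, 3 |
| Reading words | 3.88 | 0.1 | 47 | –12 | 42 | Right postcentral gyrus/1, 2, 3 |
| Reading words | 4.24 | 0.5 | 52 | –8 | 19 | Right frontoparietal operculum/43 |
| Reading words | 5.23 | 2.3 | 43 | –11 | 3 | Right insula |
| Reading words | 6.97 | 14.6 | –36 | –1 | 7 | Left insula |
| Reading words | 4.87 | 0.7 | –19 | –96 | –6 | Left calcarine cortex/17 |
| Reading words | 4.96 | 1.4 | 19 | –78 | –44 | Right cerebellum |
| Gender classification | 5.09 | 0.8 | 39 | 35 | –6 | Right middle frontal/10 |
| Gender classification | 3.91 | 0.2 | 28 | 53 | 1 | Right middle frontal/10 |
| Gender classification | 4.24 | 0.3 | 38 | 15 | 25 | Right inferior frontal/44 |
| Gender classification | 5.43 | 3.2 | 25 | 46 | –18 | Right orbitofrontal/11 |
| Gender classification | 6.95 | 8.8 | –32 | 49 | –21 | Left orbitofrontal/11 |
| Gender classification | 5.78 | 1.6 | –6 | 49 | –27 | Left straight gyrus/11 |
| Gender classification | 4.01 | 0.1 | –3 | 33 | 32 | Left anterior cingulate/32 |
| Gender classification | 5.28 | 3.8 | 42 | –47 | 42 | Right inferior parietal/40 |
| Gender classification | 6.03 | 2.3 | –46 | –57 | 36 | Left inferior parietal/39 |
| Gender classification | 4.81 | 0.7 | –5 | –65 | 42 | Left precuneus/7 |
| Gender classification | 10.03 | 17.8 | 5 | –89 | 16 | Both sides; cuneus, lingual gyrus/17, 18 |
| Gender classification | 5.81 | 6.6 | 27 | –73 | –8 | Right fusiform gyrus/19 |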
Table 4

Common areas between word and face recognition memory pathways

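| Condition | Location/Brodmann area | Word recognition: Tmax | Word recognition: Volume (ml) | Word recognition: xmax | Word recognition: ymax | Word recognition: zmax | Face recognition: Tmax | Face recognition: Volume (ml) | Face recognition: xmax | Face recognition: ymax | Face recognition: zmax |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Novel condition | Left anterior cingulate/32 | 8.92 | 14.0 | –2 | 26 | 32 | 6.93 | 6.6 | –1 | 24 | 35 |
| Novel condition | Left cerebellum | 7.94 | 6.9 | –35 | –69 | –37 | 5.52 | 4.9 | –29 | –76 | –37 |
| Novel condition | Left cerebellum | 5.97 | 1.9 | –8 | –82 | –21 | 5.43 | 1.7 | –6 | –80 | –21 |
| Familiar condition | Left posterior cingulate/31 | 4.33 | 0.5 | –1 | –38 | 33 | 4.69 | 0.3 | –6 | –38 | 35 |
| Familiar condition | Right superior parietal/7 | 4.49 | 0.4 | 29 | –71 | 45 | 5.12 | 0.7 | 28 | –67 | 42 |
| Familiar condition | Left cerebellum | 6.79 | 12.4 | –10 | –82 | –25 | 4.46 | 0.5 | –40 | –62 | –18 |

Word recognition in the novel condition was obtained by novel words minus reading words, face recognition in novel condition was by novel faces minus gender classification, word recognition in familiar condition was by familiar words minus reading words, and face recognition in familiar condition was by familiar faces minus gender classification.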
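
As a rough illustration of how close the paired word and face peaks in Table 4 lie to each other, the following Python sketch computes the Euclidean distance between each pair of peak coordinates. This is not part of the authors' analysis. The coordinates are read from Table 4; the function name `peak_distance`, the row labels and the assumption that the coordinates are in millimetres in a common stereotactic space are illustrative only, since this section does not state the units.

```python
import math

# (row label, word-recognition peak, face-recognition peak), read from Table 4.
# The labels are illustrative; the units are assumed (not stated in this section).
PEAK_PAIRS = [
    ("Novel: left anterior cingulate/32", (-2, 26, 32), (-1, 24, 35)),
    ("Novel: left cerebellum, first row", (-35, -69, -37), (-29, -76, -37)),
    ("Novel: left cerebellum, second row", (-8, -82, -21), (-6, -80, -21)),
    ("Familiar: left posterior cingulate/31", (-1, -38, 33), (-6, -38, 35)),
    ("Familiar: right superior parietal/7", (29, -71, 45), (28, -67, 42)),
    ("Familiar: left cerebellum", (-10, -82, -25), (-40, -62, -18)),
]


def peak_distance(word_peak, face_peak):
    """Euclidean distance between two (x, y, z) peak coordinates."""
    return math.dist(word_peak, face_peak)


for label, word_peak, face_peak in PEAK_PAIRS:
    print(f"{label}: {peak_distance(word_peak, face_peak):.1f}")
```

The distances only describe the separation of the reported maxima; they say nothing about the spatial extent of each activation, which is given by the volume columns.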
Fig. 1

Statistical maps of the PET data comparing word and face recognition in the novel condition. The left side of the figure is a peak map showing only contiguous voxels exceeding a t threshold of 3.61, superimposed upon three orthogonal planes of the average MRI brain from the 33 subjects. The right side is a t map showing all t values in the image, and provides an overall landscape of relative activity. Green, yellow and red colours indicate relative increases in blood flow during word recognition; blue, purple and magenta indicate relative increases in blood flow during face recognition. The transaxial slices show that the orbitofrontal cortex is activated in the left side in novel word recognition and symmetrically activated in the right side in novel face recognition. The sagittal slices demonstrate that face recognition engages two typical streams: the occipitotemporal pathway and the occipitoparietal pathway.

Fig. 2

Telescoped view of the PET data comparing word and face recognition in the novel condition. In this view, all slices in each plane are superimposed, and all peaks identifiable in each plane are shown. The left side of the figure displays the regions activated in word recognition while the right side displays the regions activated in face recognition. The transaxial plane shows that the activation sites in word recognition (red peaks) tend to be distributed as numerous small loci in the left hemisphere whereas those in face recognition (blue peaks) tend to be located in a few large areas in the right hemisphere. The sagittal plane for face recognition (right middle) also demonstrates two typical occipitotemporal and occipitoparietal pathways.

Fig. 3

Statistical maps of the PET data comparing word and face recognition in the familiar condition. See legend of Fig. 1 for general description of images. The slices have been chosen to illustrate that posterior portions of the left superior, middle and inferior temporal gyri are activated in familiar word recognition. The middle and inferior regions are also engaged in novel word recognition, suggesting that these areas play an important role in visually presented word processing.

Fig. 4

Peak maps of the PET data showing the overlapping areas between word and face recognition memory. The left side of the figure is obtained by novel words minus reading words, while the right side is obtained by novel faces minus gender classification. See legend of Fig. 1 for general description of peak maps. The transaxial and coronal slices show two peaks in the left cerebellum as common areas. The sagittal slices show that the left supracallosal anterior cingulate is commonly activated.
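
The idea of identifying overlapping areas between two subtractions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: each t map is represented as a hypothetical dictionary from voxel coordinates to t values, the threshold of 3.61 is taken from the legend of Fig. 1, the contiguity requirement mentioned there is ignored, and the toy values are invented purely for demonstration.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]

# Threshold quoted in the legend of Fig. 1; its use here is an assumption.
T_THRESHOLD = 3.61


def suprathreshold(t_map: Dict[Voxel, float],
                   threshold: float = T_THRESHOLD) -> Set[Voxel]:
    """Return the voxels whose t value exceeds the threshold."""
    return {voxel for voxel, t in t_map.items() if t > threshold}


def common_voxels(word_map: Dict[Voxel, float],
                  face_map: Dict[Voxel, float],
                  threshold: float = T_THRESHOLD) -> Set[Voxel]:
    """Voxels that exceed the threshold in both subtraction maps."""
    return suprathreshold(word_map, threshold) & suprathreshold(face_map, threshold)


# Toy t maps (hypothetical values, not data from the study).
word_minus_reading = {(0, 0, 0): 5.2, (1, 0, 0): 4.0, (2, 0, 0): 2.1}
face_minus_gender = {(0, 0, 0): 3.9, (1, 0, 0): 3.0, (2, 0, 0): 6.5}

overlap = common_voxels(word_minus_reading, face_minus_gender)
print(sorted(overlap))
```

In this toy example only the first voxel exceeds the threshold in both maps, so it is the only one reported as common. A fuller treatment would also group suprathreshold voxels into contiguous clusters before comparing them.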
