Comparison of learning‐related neuronal activity in the dorsal premotor cortex and striatum

Previous studies have reported learning‐related changes in neuronal activity during conditional visuomotor learning, also known as arbitrary sensorimotor mapping, conditional visual discrimination, and symbolic or endogenous mapping. Qualitatively similar observations have been reported for the dorsal premotor cortex, the supplementary eye field, the prefrontal cortex, the hippocampus, the striatum and the globus pallidus. The fact that cells in both the dorsal premotor cortex (PMd) and the basal ganglia show changes in activity during associative learning enables a test of the hypothesis that cortex and basal ganglia function in distributed architectures known as cortical–basal ganglionic modules or ‘loops’. We reasoned that if these loops represent functional entities, as proposed, then learning‐related changes in activity should occur simultaneously in both the cortical and striatal nodes of a loop. The present results confirmed this prediction; as monkeys learned conditional visuomotor associations, neurons in the premotor cortex and associated parts of the putamen changed their rates at approximately the same time. For the largest number of neurons, the evolution in neural activity occurred in close correspondence to the monkeys' learning curves. As a population, however, learning‐related changes in activity continued after the monkeys reached an asymptote in performance.


Introduction
Conditional visuomotor learning (CVML) allows animals to associate any discriminable visual stimulus with any response in a learned motor repertoire. In this form of learning, animals learn by trial and error to associate a stimulus with an action or the target of action, usually based on stimulus dimensions such as colour and shape (Passingham, 1993). Previous neurophysiological studies have related changes in neural activity to this kind of learning in a number of cortical areas, including the dorsal premotor cortex (PMd) (Mitz et al., 1991), the supplementary eye field (SEF) (Chen & Wise, 1995a,b), the frontal eye field (FEF) (Chen & Wise, 1995b), the prefrontal cortex (PF) (Asaad et al., 1998), and the hippocampus (Cahusac et al., 1993; Wirth et al., 2003), as well as in subcortical structures such as the striatum (Jog et al., 1999; Pasupathy & Miller, 2002) and globus pallidus (Inase et al., 2001).
Although the mechanisms underlying CVML remain incompletely understood, the fact that neuronal activity patterns evolve over time enables a test of a widely accepted tenet of forebrain organization. According to current thinking, the cerebral cortex and basal ganglia contribute to distributed functional modules called 'loops' (DeLong & Georgopoulos, 1981; Alexander et al., 1986; Strick et al., 1995). Houk & Wise (1995) called these loops cortical–basal ganglionic modules and suggested that they are one among many kinds of distributed, recurrent modular architectures in the motor system. In cortical–basal ganglionic loops, the cortex sends a nonreciprocated axonal projection to the striatum, which engages the striatal pathways that control the output of the globus pallidus, including its output to thalamocortical neurons, thus closing the loop. Despite the widespread acceptance of this idea, few experiments have been aimed at testing it. The present study compared neuronal discharge rates in PMd and in the striatum as monkeys learned new conditional visuomotor associations, with emphasis on those parts of the putamen that receive inputs from PMd (Takada et al., 1998b; McFarland & Haber, 2000). We reasoned that if the concept of cortical–basal ganglionic loops has functional validity, then changes in the PMd and the associated parts of the striatum should occur at the same time. Alternatively, if the basal ganglia function mainly to mediate habits, then striatal activity changes should occur later than those in PMd or, if the striatum drives appetitive learning, then its activity changes might be predicted to precede those in PMd.
4-mm-thick transparent panel. A computer program divided the touch screen's surface area into a 3 × 3 grid of rectangles, invisible to the monkeys, with each of the nine rectangles measuring 8 cm wide by 5 cm tall. An infrared oculometer in front of the left eye of monkey 1 monitored eye position at 200 samples/s.

Behavioural paradigm
Both monkeys learned arbitrary associations between visual stimuli and reaching movements to one of four target locations. Figure 1 illustrates the spatial pattern of the targets. On each behavioural trial, a centrally positioned instructional stimulus (IS) indicated which one of the four targets on the touch screen would, if contacted by the monkey, produce reinforcement. Each IS consisted of two differently coloured ASCII characters, superimposed, with one ≈5.0 cm and the other ≈3.5 cm high (Gaffan & Harrison, 1988). The touch screen overlaid the monitor such that the IS always appeared in the middle rectangle of the 3 × 3 grid, and the four response targets corresponded to the four corners of the grid.
Stimuli were selected pseudorandomly, from trial to trial, from a set of eight stimuli. Within any given stimulus set, each stimulus instructed one and only one correct response, and two different stimuli instructed each response. We call this design an 8 : 4 mapping; eight stimuli 'map onto' four responses in equal numbers. The stimulus–response associations for four of the stimuli in the eight-stimulus set were highly familiar to the monkeys (familiar mappings), whereas the associations between the remaining four stimuli and their responses had to be learned through trial and error (novel mappings). For monkey 1, the stimuli used for novel mappings composed two sets of four stimuli that were highly familiar to the animal. Although the stimuli were familiar, the target or movement that they instructed varied from one block of trials to another. For monkey 2, the eight-stimulus sets included two to four novel stimuli, with the exact number varied from block to block to adjust the monkey's learning rate. In sessions with either two or three novel stimuli, additional familiar stimuli (termed fillers) instructed those responses not associated with novel stimuli. This procedure preserved an 8 : 4 mapping without inducing a strong response bias. For both monkeys, novel mappings (including filler stimuli) appeared twice as frequently as familiar mappings.
Monkey 1 began each trial by touching a metal start bar below the touch screen. It had to maintain contact with the bar until a trigger signal (TS) cued the response later in the trial. Touching the metal bar led to the immediate appearance of the four response targets, one in each corner of the screen (see Fig. 1, top). After the monkey had maintained contact for 200 ms, a small white circle (4 mm diameter) appeared in the centre of the video screen. Once the monkey made a saccadic eye movement to fixate this point and maintained fixation for 600 ms, the IS appeared there for 400 ms, termed the IS period. If the monkey broke fixation during the first 150 ms of the IS period, the trial ended. If the monkey broke fixation after this period, the IS disappeared and the trial continued, but such behaviour rarely occurred (see Fig. 4). After the IS period, a grey, 3-cm square replaced the IS and remained on the screen for 0.75–1.75 s, an interval called the instructed-delay period. The disappearance of the grey square served as the TS, after which the monkey could remove its hand from the metal bar and touch one of the four targets. The monkey had a maximum of 1.9 s to initiate a response, up from 1.5 s during training. Once the monkey touched the target and maintained contact for 100 ms, all four target outlines became filled (white). After a correct response and a variable pre-reward period (0.75–1.25 s), a tube delivered ≈0.1 mL of water directly into the monkey's mouth as reinforcement. At the same time, the targets simultaneously disappeared from the screen. After incorrect responses, the targets simultaneously disappeared from the screen after the same variable period, but the response did not produce reinforcement, and, on the next trial, the same IS appeared. This sequence of events (termed a correction trial) continued until the monkey made a correct response. A 2.5-s intertrial interval followed.
The task differed for monkey 2 in several minor respects: there was no fixation requirement; the monkey began each trial by touching a square at the bottom of the touch screen instead of a metal bar; the monkey had to contact that location for 0.5 s prior to IS appearance; the IS appeared for 1.0 s; the neutral grey square appeared for 0.5 s; the monkey had to contact a target for 1.0 s to register a response; and the pre-reward period lasted only 220 ms.

Surgery
Using aseptic techniques and isoflurane anaesthesia (1–3%, to effect), a recording chamber (27 mm × 36 mm) was implanted over the exposed dura mater of the right frontal lobe, along with head restraints. Postoperatively, the animals recovered in an enclosed environment providing constant temperature, humidity, and oxygen partial pressure, under veterinary supervision. Antibiotics and analgesics were administered at the end of surgery and at other times as recommended by the attending veterinarian. After head-restraint implantation, the animals were chaired with the head fixed for increasing periods in accord with the NIH Guidelines for the Use of Restraint Chairs with Nonhuman Primates. Animals were monitored periodically (generally, every half hour) when in head restraint to assure their general well-being.

Magnetic-resonance imaging
Recordings were guided by a coronal and a sagittal series of magnetic resonance (MR) images obtained with an electrode at the depth of the basal ganglia. A 1.5 Tesla whole body scanner (Signa, General Electric Medical Systems, USA) obtained 60 coronal and 60 sagittal MR images, derived from T1-weighted structural MR imaging, at 1-mm and 1.5-mm intervals, respectively.

Recordings
Glass-coated platinum-iridium electrodes (≈0.5–1.0 MΩ, measured at 1 kHz) recorded single-unit potentials in PMd and in the putamen. We also collected a neuronal sample from the caudate nucleus, although its relatively small size limits the extent of analyses and conclusions for this population. A low-noise preamplifier, with a gain of ten and a frequency response of 10 Hz to 10 kHz, differentially recorded the electrode signals. A multichannel amplifier, which had independently adjustable gains from 0 to 10 000, received the preamplifier outputs, and a multispike detector (Alpha-Omega Engineering, Nazareth, Israel) discriminated single-unit potentials. Typically, during the isolation of single units, the monkeys performed a version of the CVML task with only familiar mappings. Thus, the search strategy caused a potential bias towards isolating cells with activity modulation for this condition, but this search strategy did not vary across recording locations. The potential bias caused by this search strategy could, however, affect the relative frequency of cells selective for familiar vs. novel mappings. The recording site alternated between the striatum and PMd, with a few days of PMd recording sessions followed by a roughly equal number of basal ganglia sessions. We intentionally limited recordings from tonically active neurons (TANs) in the striatum in the knowledge that they represent only ≈5% of the striatal cell population (Aosaki et al., 1994).

Data analysis
To evaluate the monkey's learning for any given stimulus, we computed a three-trial moving average of the sequence of correct and incorrect responses for each IS. We chose a learning criterion of three consecutive correct responses because the probability of achieving this level of performance by chance is less than 2% ($4^{-3} \approx 0.016$) and performance rarely deteriorated dramatically once monkeys reached this criterion. For consistency, we designated the second of those three trials as the specific trial on which the monkey achieved criterion for that IS and designated that trial as normalized trial 0 for the purpose of comparisons with previously published work and the construction of population averages.
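The criterion described above can be illustrated with a minimal sketch. The function names (`chance_probability`, `criterion_trial`, `normalized_trials`) and the boolean encoding of outcomes are assumptions made for illustration and are not taken from the original analysis software.

```python
from typing import List, Optional, Sequence


def chance_probability(n_targets: int = 4, run_length: int = 3) -> float:
    """Probability of `run_length` consecutive correct guesses among `n_targets`."""
    return (1.0 / n_targets) ** run_length


def criterion_trial(outcomes: Sequence[bool], run_length: int = 3) -> Optional[int]:
    """Return the 0-based index of the trial designated as 'criterion'.

    Following the text, this is the SECOND trial of the first run of
    `run_length` consecutive correct responses to one IS. Returns None if
    the criterion is never reached.
    """
    run = 0
    for i, correct in enumerate(outcomes):
        run = run + 1 if correct else 0
        if run == run_length:
            first_of_run = i - run_length + 1
            return first_of_run + 1  # second trial of the run
    return None


def normalized_trials(outcomes: Sequence[bool]) -> Optional[List[int]]:
    """Re-index trials so that the criterion trial becomes normalized trial 0."""
    c = criterion_trial(outcomes)
    if c is None:
        return None
    return [i - c for i in range(len(outcomes))]
```

For example, with outcomes `[False, True, False, True, True, True, True]` the first run of three correct responses spans indices 3 to 5, so index 4 becomes normalized trial 0.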
For the analysis of response latencies, reaction time (RT) was defined as the time from the TS to the time when the hand broke contact with the metal start bar in monkey 1 or central hold location in monkey 2. The subsequent time taken until the monkey contacted one of the four response targets was defined as the movement time (MT).
We analysed neuronal activity during each trial for eight task periods: a reference period and seven principal task periods. The reference period corresponded to the 500 ms before IS onset; the IS-onset period began 80 ms after the IS onset for a duration of 320 ms; an IS-offset period began 80 ms after offset of the IS for a duration of 170 ms; an instructed-delay period was defined as a 250-ms period ending 250 ms before the TS was given (monkey 1 only); a premovement period covered 250 ms prior to breaking screen contact; a movement period covered the 250 ms after breaking screen contact; a pre-reward period covered the 750 ms prior to the delivery of reward; and a post-reward period lasted for 1.0 s after the delivery of reward. Occasionally, we adjusted these time windows as appropriate to the activity of an individual neuron (for seven PMd and five putamen cells, all in monkey 1). Task-related activity (one of the seven principal task periods vs. reference) and directional selectivity (four directions, one for each target) were assessed for each task period by a two-factor analysis of variance (ANOVA, α = 0.01). For monkey 2, the number of directions in any given analysis was restricted by the number (two to four) of novel associations the monkey learned in a block of trials.
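The default analysis windows listed above can be written down as a small sketch. The event names, the millisecond time base, and the helpers `task_windows` and `firing_rate` are illustrative assumptions; the per-neuron window adjustments mentioned in the text are not modelled.

```python
from typing import Dict, Sequence, Tuple

Window = Tuple[float, float]  # (start_ms, stop_ms) on the trial's clock


def task_windows(is_on: float, is_off: float, trigger: float,
                 move_onset: float, reward: float) -> Dict[str, Window]:
    """Default windows (ms) for the reference and seven principal task periods.

    `move_onset` is the time the hand broke contact; all event times are
    assumed to be given in ms on a common clock.
    """
    return {
        "reference": (is_on - 500, is_on),
        "is_onset": (is_on + 80, is_on + 80 + 320),
        "is_offset": (is_off + 80, is_off + 80 + 170),
        # 250-ms window ending 250 ms before the TS (monkey 1 only in the text)
        "instructed_delay": (trigger - 500, trigger - 250),
        "premovement": (move_onset - 250, move_onset),
        "movement": (move_onset, move_onset + 250),
        "pre_reward": (reward - 750, reward),
        "post_reward": (reward, reward + 1000),
    }


def firing_rate(spike_times_ms: Sequence[float], window: Window) -> float:
    """Mean discharge rate (spikes/s) within a half-open window [start, stop)."""
    start, stop = window
    count = sum(1 for t in spike_times_ms if start <= t < stop)
    return 1000.0 * count / (stop - start)
```

Rates computed this way for each period and trial are the kind of per-trial values that the ANOVA and the later trend analyses operate on.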
We then categorized cells according to their activity 'preference' for trials with novel mappings (novel trials) vs. those with familiar ones (familiar trials). Activity for each task period, for novel and familiar trials separately, was compared against reference-period activity (t-test, α = 0.01), and an additional comparison was made for each task period testing the activity difference between familiar and novel trials. On the basis of these comparisons, task-related cells could fall within one of five categories: familiar only (F), novel only (N), familiar preferred (F > N), novel preferred (F < N), or no preference (F = N).
As in previous studies (Mitz et al., 1991; Chen & Wise, 1995a,b), we analysed learning-related activity in each task period and each target location as a separate case. Although activity from different cases within the same neuron cannot be said to be independent from each other, experience has shown that the differences in activity in distinct task periods preclude easy generalization. Thus, analysis by case avoids the assumption that a cell's activity for one task period corresponds to that in another. To evaluate changes in neuronal activity during learning, the change-point test for continuous variables (α = 0.01) (Siegel & Castellan, 1988) was applied to novel mappings and, as a control, to familiar mappings as well. For each case showing a significant change in activity, a cross-correlation analysis measured the relative timing of changes in neuronal activity and changes in performance. We subsequently used the Kolmogorov–Smirnov two-sample test (Siegel & Castellan, 1988) to compare the case-by-case cross-correlation results among neuronal populations. Similar analyses were made on a cell-by-cell basis, in addition to a case-by-case basis.
As described in detail by Siegel & Castellan (1988), the change-point test entails the null hypothesis that no time trend exists in a data series. As applied to the present neuronal data, the null hypothesis holds that no systematic change in neuronal activity occurs over trials. On that assumption, each trial should rank on average near the median. Of course, the ranks must distribute from the highest to lowest activity levels, but if the null hypothesis is true, the cumulative sum of ranks should increase approximately linearly with trial number. The maximal deviation from that expectation signifies, according to this test, the trial on which the change occurs (the change point) and divides the series into all trials up to that point and all subsequent trials. The sampling distribution of the deviation statistic forms the basis for rejecting the null hypothesis (or failing to do so). The significance of a given degree of deviation depends on the number of trials up to and after the change point and is a form of the Kolmogorov–Smirnov test. For behavioural data, the test depends similarly on the maximal deviation from the expected cumulative sum of correct responses as the series of trials progresses.
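The logic of the rank-based change-point statistic can be sketched as follows. This is only an illustration of the idea described above: the function names are hypothetical, and significance is estimated here with a permutation (shuffle) null distribution, which is a substitution made for the sketch and not the tabled sampling distribution used in the test as described by Siegel & Castellan (1988).

```python
import random
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def change_point(values: Sequence[float]) -> Tuple[int, float]:
    """Return (m, deviation): the series splits into the first m trials and the rest.

    Under the null hypothesis the cumulative rank sum after j trials is
    expected to be j * (N + 1) / 2, so the deviation is |2 * W_j - j * (N + 1)|.
    """
    n = len(values)
    ranks = average_ranks(values)
    best_m, best_dev, cum = 0, -1.0, 0.0
    for j in range(1, n):
        cum += ranks[j - 1]
        dev = abs(2.0 * cum - j * (n + 1))
        if dev > best_dev:
            best_m, best_dev = j, dev
    return best_m, best_dev


def permutation_p_value(values: Sequence[float], n_perm: int = 2000,
                        seed: int = 0) -> float:
    """Assumed stand-in for the tabled test: shuffle trial order to build a null."""
    rng = random.Random(seed)
    observed = change_point(values)[1]
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if change_point(shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Applied to per-trial firing rates for one task period and one response direction, `change_point` locates the trial at which the cumulative rank sum departs most from its expected linear growth.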
All novel and familiar trials were subjected to the change-point test, separately, by response direction. For both types of trials, the change-point test was also applied to their reference-period activity. If reference-period activity changed significantly for a given response direction, then all data for that direction were eliminated from the analysis, except for changes of opposite sign. Cases with fewer than 5 spikes/s throughout learning were also excluded.
For each case of significant learning-related change in activity, a learning-effect index (LEI) was computed by comparing the mean activity for the first three correctly executed trials ($A_{\mathrm{early}}$) to that from three consecutive trials later during learning ($A_{\mathrm{late}}$), using a contrast ratio: $\mathrm{LEI} = (A_{\mathrm{late}} - A_{\mathrm{early}})/(A_{\mathrm{late}} + A_{\mathrm{early}})$. An LEI of zero reflects no net change in activity, positive values denote increases in activity, whilst negative values indicate decreases in activity. Similar ratios were also computed for reference periods derived from the same trials and stimuli in which cases of significant learning-related changes were detected.
© 2004 Federation of European Neuroscience Societies, European Journal of Neuroscience, 19, 721–740
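As a worked illustration of the contrast ratio, the sketch below computes an LEI from two groups of three trials. The function name and the handling of a zero denominator (returning NaN) are assumptions made for this example; the text does not say how that situation was treated, and low-rate cases were excluded from the analysis in any event.

```python
from math import isnan, nan
from typing import Sequence


def learning_effect_index(early_rates: Sequence[float],
                          late_rates: Sequence[float]) -> float:
    """Contrast ratio (A_late - A_early) / (A_late + A_early).

    `early_rates` and `late_rates` hold per-trial discharge rates (spikes/s)
    for the early and late groups of trials; each group is averaged first.
    """
    a_early = sum(early_rates) / len(early_rates)
    a_late = sum(late_rates) / len(late_rates)
    denom = a_late + a_early
    if denom == 0:
        return nan  # assumption: undefined when both means are zero
    return (a_late - a_early) / denom
```

With early rates of 2, 4 and 6 spikes/s (mean 4) and late rates of 10, 12 and 14 spikes/s (mean 12), the index is (12 − 4)/(12 + 4) = 0.5; swapping the two groups gives −0.5.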

Histology
At the completion of neurophysiological data collection, we passed 10 µA of direct, anodal current for 10 s through the electrodes in order to make electrolytic marking lesions at selected recording sites. Lesions were made at four sites (two striatal, two cortical) in each of six tracks in monkey 1. No lesions were attempted in monkey 2. The animals were later given xylazine (0.02 mg/kg) and ketamine (10–20 mg/kg i.m.) followed by induction of a deep anesthetic state with sodium pentobarbital (60 mg/kg i.v.). Following the complete loss of corneal and cutaneous reflexes, a supplementary dose of sodium pentobarbital was given (30 mg/kg) prior to the perfusion procedure. Both monkeys were perfused through the heart with heparinized physiological saline followed by 10% formol-saline, with five steel pins inserted at known chamber coordinates. The brains were subsequently removed, photographed, and then sectioned at 40 µm on a freezing microtome. A 1 : 3 series of Nissl-stained sections (cresyl violet) was used to plot the locations of recording sites and the estimated track of each penetration by reference to the four recovered electrolytic lesions (for monkey 1) and to the pin holes (for both monkeys). For both monkeys, MR images guided the histological reconstruction.

Behaviour
Both monkeys performed the CVML task with nearly 100% accuracy for familiar visuomotor mappings and quickly learned novel mappings. For comparison, we divided the behavioural data according to whether the neurophysiological recordings came from PMd or from the putamen.
The two monkeys learned novel mapping problems at similar rates, with a problem being defined as learning the correct response to a given IS. For each problem, monkey 1 took an average (± SD) of 9.6 ± 6.3 trials to reach the performance criterion (three consecutive correct responses to a given IS); monkey 2 required 9.1 ± 5.1 trials to do so. For correct trials only, the monkeys took 4.5 ± 2.5 and 3.7 ± 2.0 trials to reach criterion, respectively. The difference for each monkey gives the mean number of errors to criterion: 5.1 and 5.4 trials per problem, respectively, for monkey 1 and monkey 2. The total number of trials to criterion for a set of ISs was the product of the values presented above (9.6 and 9.1 trials to criterion) and the number of concurrent, novel mappings in the set. For monkey 1, this means that it took ≈38 trials to learn the novel mappings to criterion, consisting of ≈18 correct and ≈20 incorrect trials, excluding the interleaved familiar trials. Using the more-sensitive algorithm of Wirth et al. (2003) for detecting that learning had occurred, monkey 1 learned in 6.5 ± 4.5 trials, monkey 2 in 7.8 ± 6.3 trials. Figure 2 shows the overall learning curve for monkey 1. The number of total and correct trials to criterion did not differ with respect to recording site (PMd, putamen, and caudate) for monkey 1 (total, F(2,270) = 1.66, n.s.; correct, F(2,270) = 1.06, n.s.) or monkey 2 (total, t(64) < 1, n.s.; correct, t(64) < 1, n.s.). Table 1 gives the mean and median trials to criterion as the monkeys learned novel mappings for each recording site. Figure 3 shows the reaction times (RT) and movement times (MT) for trials with novel mappings, averaged across all recording sessions, normalized with respect to the attainment of criterion (normalized trial 0). Note that neither RT nor MT differed dramatically as a function of where recordings were made and neither showed a strong trend as the monkey learned novel mappings.
Table 2 shows mean RT and MT for three phases of learning, termed early, criterion and late, with all incorrectly performed trials excluded. Early trials comprised the first three trials with novel mappings. Criterion trials include trial '0' (see Materials and methods) and the trials immediately before and after it (trials −1 to 1 in the normalized trial scale). The next three trials (trials 2–4) composed a group of late trials. Trials with familiar mappings were divided likewise, but because early and criterion trials were the same (i.e. the monkey began at nearly perfect performance), Table 2 does not distinguish between them. Both monkeys had faster RTs for familiar than for novel mappings (monkey 1, F(1,24) = 79.42, P < 0.05; monkey 2, F(1,16) = 38.59, P < 0.05). RT decreased in late trials relative to those at the time the monkey achieved the learning criterion in both monkeys, although this occurred only for novel mappings in monkey 1 (F(1,24) = 6.34, P < 0.05) and only for familiar mappings in monkey 2 (F(1,16) = 9.80, P < 0.05). RT also decreased from early to criterion novel trials in monkey 1 (F(2,18) = 10.36, P < 0.05) but not in monkey 2 (F(2,18) < 1, n.s.). These effects did not differ significantly as a result of whether the recording site was in PMd, the putamen, or the caudate (monkey 1, F(2,24) = 2.51, n.s.; monkey 2, F(1,16) < 1, n.s.). ANOVA showed that, in monkey 1, RT (F(2,24) = 4.07, P < 0.05) and MT (F(2,24) = 84.96, P < 0.05) were significantly, if slightly, faster while we recorded from cells in PMd (confirmed by post hoc Newman–Keuls analyses).
Fig. 2. Learning curve for monkey 1, for concurrent novel mappings. Unlike learning curves in later figures (e.g. Fig. 8D), which illustrate learning only for correct trials and only for one IS at a time, this curve shows the overall improvement in performance as the monkey learns four novel conditional visuomotor mappings. Because problem sets dropped out of the data set shortly after the monkey reached the performance criterion, an average over the last seven trials of a given data set was extrapolated to infinity. The averages are similar to previously published ones (e.g. Wise & Murray, 1999) and show that novel mappings were learned quickly, with an exponential rise having a learning-rate constant of 34 trials (dashed line). The monkey reached criterion, on average, after 38 trials (dotted line).
Table 1. Mean (±SD) and median total trials and correct trials to achieve criterion for each recorded region. Criterion was set as the second of three consecutive correct trials, when performance first reached that level.
Oculomotor data were obtained for monkey 1. Inspection of the data revealed that the monkey typically fixated the IS as required, and although the monkey initially broke fixation after the IS disappeared, it typically fixated the neutral grey square prior to the end of the delay period. Following the TS, the monkey typically made a saccade to the intended target immediately prior to making the reaching movement. Thus, oculomotor behaviour for task periods such as the IS-on, premovement, and movement periods was reasonably consistent. Examination of 16 randomly selected sessions (eight for PMd and eight for putamen recordings) revealed no obvious differences in oculomotor behaviour for different recording sites. Figure 4 shows representative oculomotor records for familiar and novel trials.
EMG data were also obtained from monkey 1 from 15 muscles or muscle groups. While many of the recordings showed activity related to the task, and to reaching movements specifically, there were few instances of muscle activity changing during the course of learning (15 cases statistically, 3.6% of the sample with α = 0.01), the majority of which (n = 10) were in the pre-movement and movement periods.
Table 2. Data (±SEM) are given separately for the first three trials with a novel IS, trials at the time of reaching criterion (early, normalized trials −1 to 1) and a group of subsequent trials (late, normalized trials 2 to 4). Similar data are also given for familiar IS; the near-perfect performance of monkeys with familiar IS means that 'early' trials are typically also the first three trials.
Figure 5 shows the reconstructed recording sites. Virtually all of the cortical sample appears to be located in PMd, between the superior precentral sulcus and the superior limb of the arcuate sulcus, within ±4 mm of the frontal plane containing the posterior limit of the arcuate sulcus. Recorded putamen cells were generally located in the middle part of the putamen along its rostrocaudal axis and, in monkey 1, were concentrated in the dorsomedial aspect of the putamen. Putamen cells in monkey 2 were somewhat more caudal and more ventrolateral, on average, than those in monkey 1. Caudate cells were generally recorded from the head and body of the caudate, and not from the tail.

Neuronal database
We recorded from 120 PMd cells (all in monkey 1), 120 cells in the putamen (75 in monkey 1), and 44 cells in the caudate (22 cells in monkey 1). ANOVA revealed significant task relations for neuronal activity in all seven principal task periods, as shown in Table 3. Tests on the percentage of task-related cells revealed no significant differences between PMd and putamen ($\chi^2_1 = 3.04$, n.s.). Because of the small size of the caudate sample, this statistical test excluded those data.

Learning-related changes in activity
Cases of learning-related activity were defined as signed changes in neural activity in a given task period (and not seen in the corresponding reference period) as detected by the change-point test for continuous variables (α = 0.01). Tables 4 and 5 give the percentage and number of learning-related changes in each task period. Cases that showed statistically significant changes were divided into those increasing and those decreasing as a function of learning, roughly corresponding to changes termed learning-dependent and learning-selective in previous reports (Chen & Wise, 1995a). Figure 7 depicts two separate cells in PMd that showed learning-related changes in discharge modulation. Figure 7A–D illustrates an example of a learning-related increase in activity, during the IS-on period, from one PMd neuron in its preferred direction (down and to the right). As noted above, the phrase familiar trials refers to trials requiring the monkeys to respond according to familiar mappings; the term novel trials refers to trials involving novel mappings. During familiar trials that required a response to the bottom, right target, the cell demonstrated a low level of activity throughout the trial (Fig. 7A).
When the monkey was required to make the same movement in novel trials (Fig. 7B), the cell initially (for the first four trials on which that IS was presented) showed a low level of activity. However, as learning progressed, the cell began to show significant modulation during the IS-on period. Figure 7C and D shows the average activity in this task period (marked by the arrows above the histogram) for familiar (Fig. 7C) and novel (Fig. 7D) trials (black lines), together with activity during the reference period on the same trials (grey lines) and the performance of the monkey (unfilled circles), all smoothed with the same three-point moving average. These figures show only correctly executed trials. Accordingly, although the IS differs for familiar and novel trials, both the IS and the movement are the same for all illustrated trials within each display (e.g. Fig. 7D). Together, these findings show that the cell's activity cannot simply reflect the IS or the monkey's response. Further, the stability of activity in the reference period shows that the learning-related activity could not be accounted for in terms of unstable cell isolation or any interaction of the neuron with the electrode. Figure 7E–H illustrates the activity of a second PMd cell, one that shows learning-related changes in activity during the post-reward period. This cell was active during, and immediately prior to, the onset of the IS, but its discharge rate after the reward is of most interest here. The cell showed heightened modulation immediately after the delivery of reward, most notably for early trials requiring the learning of novel mappings. As the monkey's performance improved, i.e. as the monkey learned the novel mapping, the post-reward activity decreased (Fig. 7H), eventually to the level seen for familiar trials requiring the same response.
This pattern of activity resembles the reward-prediction error signal described for dopamine neurons and in some striatal cells by Schultz and colleagues (Hollerman et al., 1998; Tremblay et al., 1998; Waelti et al., 2001). Figure 8 depicts two putamen cells that also demonstrated learning-related changes in activity. The top four panels (Fig. 8A–D) illustrate an example of a learning-related increase in activity, during the IS-on period. Note the overall resemblance to the PMd cell shown in Fig. 7A–D. This cell shows very low firing rates during familiar trials that required a response to be made to the top right target (Fig. 8A) and shows similarly little activation during the first four novel trials involving a similar movement (Fig. 8B). However, as learning progressed the cell showed significant modulation during the IS-on period. Figure 8E–H illustrates an example of a learning-related decrease in activity in the putamen, for the pre-reward period. Figure 9 shows the learning-related activity of two caudate cells. Figure 9A–D illustrates an example of a learning-related increase during the IS-on period, much like the PMd (Fig. 7A–D) and putamen (Fig. 8A–D) cells shown previously. Figure 9E–H illustrates learning-related decreases during the IS-on period for another caudate cell.
Significant learning-related changes in activity were observed in 88 PMd (all in monkey 1), 74 putamen (47 in monkey 1), and 19 caudate cells (9 in monkey 1). As shown in Table 4, these made up 73%, 62%, and 43% (uncorrected) of the sampled neurons in the three recording sites, respectively. Neurons sometimes showed learning effects for more than one task period and for more than one response direction, yielding a total of 390 cases of significant changes in activity in PMd, 245 (162 in monkey 1) in the putamen, and 55 (27 in monkey 1) in the caudate. Tables 4 and 5 show how these cases of learning-related changes in activity were distributed across task periods. Chi-square tests showed no significant differences by task period or recording site. The majority of cells in all recording areas demonstrated directionally selective activity. For 52% of learning cases in the cortex, a change was seen only for one response direction and not for the others in a given task period. Similar values were obtained for cases in the putamen (62% monkey 1, 74% monkey 2) and the caudate (70% monkey 1, 57% monkey 2). Across task periods, the response directions showing learning-related changes in activity tended to correspond to a significant degree. For each task period in which a significant learning-related activity change was observed for one and only one response direction, we computed a directional consistency statistic based on a comparison of observed data with the same data shuffled randomly by direction. Only cells that showed stability in the reference period for all four directions were analysed. As expected, shuffling the direction assigned to each case of learning-related activity change reduced the directional consistency statistic to chance levels (23 ± 25%, SD). The observed directional consistency was significantly higher (41 ± 35%; Kruskal–Wallis test, d.f. = 1,373, χ² = 26.8, P << 0.001), which demonstrates that the response directions associated with learning-related activity changes were not randomly distributed.

Comparison of PMd and putamen populations
For cases showing significant learning-related changes in activity by the change-point test, population averages were constructed, as in previous reports (Mitz et al., 1991; Chen & Wise, 1995a). Cases with significant increases were analysed separately from those with significant decreases. For each task period passing the change-point test, a cross-correlation analysis assessed the degree to which changes in performance lagged or preceded changes in neural activity. Figure 10 shows the distribution of these peak correlation coefficients, and the leads and lags at which they occurred. Analyses of these lags using the Kolmogorov–Smirnov two-sample test (α = 0.01) revealed no significant difference between cases showing increases and decreases, for any task period, for any recording site, or for either monkey. Accordingly, all of these data sets are combined in the grand averages illustrated in Figs 11 and 12. For these population averages, activity in each case was averaged across three trials (equally weighted) and normalized with respect to the maximum averaged firing rate. Note that these averages combine the activity from several task periods during a trial. We also constructed averages for each task period and, although they appear noisier due to the smaller data sets, these averages closely resemble the grand means presented here. On average (± SEM), changes in PMd modulation lagged changes in performance by 1.5 ± 0.2 trials; in the putamen they did so by 1.8 ± 0.2; and in caudate the lag was 1.5 ± 0.4. There were no significant differences in the distribution of lags for PMd and the putamen as revealed by the Kolmogorov–Smirnov test (D(360,207) = 0.09, n.s.). The caudate did not differ from either the putamen or PMd populations, but we note the small number of caudate cells sampled.
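The lag analysis described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the Pearson correlation, the shift range, and the toy data are all assumptions made for illustration. It smooths each series with an equally weighted three-trial moving average, normalizes activity to its maximum, and reports the trial shift at which performance and activity correlate most strongly (positive values mean activity changes follow performance changes).

```python
from statistics import mean


def moving_average(values, width=3):
    """Equally weighted moving average over `width` consecutive trials."""
    return [mean(values[i:i + width]) for i in range(len(values) - width + 1)]


def normalize_to_max(values):
    """Scale a series by its maximum value (left unchanged if the maximum is 0)."""
    peak = max(values)
    return [v / peak for v in values] if peak > 0 else list(values)


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def best_lag(performance, activity, max_shift=5):
    """Return (lag, r) for the shift with the largest |r|.

    A positive lag means that changes in activity follow changes in
    performance by that many trials. Using |r| lets the same routine
    handle learning-related decreases (hypothetical design choice).
    """
    best = (0, 0.0)
    for lag in range(-max_shift, max_shift + 1):
        if lag >= 0:
            p, a = performance[:len(performance) - lag], activity[lag:]
        else:
            p, a = performance[-lag:], activity[:len(activity) + lag]
        n = min(len(p), len(a))
        if n < 3:
            continue  # too few overlapping trials to correlate
        r = pearson(p[:n], a[:n])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Toy data (invented): performance steps up at trial 5, firing rate at trial 7.
performance = moving_average([0] * 5 + [1] * 10)
activity = normalize_to_max(moving_average([2] * 7 + [10] * 8))
lag, r = best_lag(performance, activity)
```

In this toy case the activity series is a shifted, rescaled copy of the performance series, so the peak correlation falls at a lag of two trials. Real data would be noisier and the peak correspondingly broader.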
Additional analysis confirmed that these changes in population activity did not reflect cell instability, as indicated either for familiar mappings or in reference-period activity (Figs 11 and 12; ANOVA, α = 0.01). Figures 13 and 14 illustrate the size of these learning effects. For each task period showing significant learning-related activity, we computed a learning-effect index. As a benchmark for comparison, index values of ±0.33 indicate a doubling or halving of discharge rate during learning (dashed lines in Fig. 13). Figure 13 shows how this index evolved over the course of learning for the population of PMd and putamen cells. Note the stable index value near 0 for the reference period. Figure 14 shows the distribution of the index for normalized trials 4–6 in greater detail. Analysis of the effects shown in Fig. 14 revealed that the learning-effect index for the task-period activity differed significantly from that for reference activity, both for cases that showed increases (F(1,201) = 20.98; P < 0.01) and decreases (F(1,316) = 68.40; P < 0.01) during learning. No difference was observed based on recording sites (increases, F(1,201) < 1, n.s.; decreases, F(1,316) < 1, n.s.). Similar results were seen for analyses of ratios derived from normalized trials 7–9 and 10–12.
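The ±0.33 benchmark can be checked with a short worked example. The index itself is defined in Materials and methods, which this section does not reproduce, so the contrast-ratio form used here, (late − early) / (late + early), is an assumption; it is chosen only because it reproduces the stated benchmark. The function and argument names are hypothetical.

```python
def learning_effect_index(late_rate, early_rate):
    """Assumed contrast-ratio index: (late - early) / (late + early).

    Returns 0.0 when both rates are zero, so silent periods do not
    raise a division error.
    """
    total = late_rate + early_rate
    if total == 0:
        return 0.0
    return (late_rate - early_rate) / total


doubling = learning_effect_index(20.0, 10.0)   # rate doubles during learning
halving = learning_effect_index(5.0, 10.0)     # rate halves during learning
unchanged = learning_effect_index(10.0, 10.0)  # no change, as in a stable reference period
```

Under this assumed form, doubling gives (20 − 10) / 30 = +1/3 and halving gives (5 − 10) / 15 = −1/3, matching the dashed benchmark lines described for Fig. 13, and an unchanged rate gives 0.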
We also performed two more restricted analyses on the data from monkey 1. Neuroanatomical evidence indicates that the forelimb representation of PMd projects to the dorsomedial part of the putamen, within 2–3 mm of the boundary with the internal capsule, extending to the striatal bridges within the internal capsule and perhaps to the most ventrolateral aspect of the caudate nucleus at the same frontal level (Takada et al., 1998b; McFarland & Haber, 2000). Accordingly, we compared neurophysiological data for the PMd cells in monkey 1 to the 24 cells (43 cases) that fell within the most medial part of the putamen (within 2–3 mm of the internal capsule) in the same monkey.
Nearly all of these cells were located in the dorsomedial aspect of the putamen (Fig. 5A). For this subset of neurons, cross-correlation analysis showed that learning-related changes in activity in the dorsomedial putamen lagged changes in performance by 1.0 ± 0.4 trials. This result did not differ from the lag demonstrated by the PMd cases (D(360,43) = 0.10, n.s.). In a separate population analysis, data from periods of uncertain fixation were eliminated. After removing data from the IS-off and instructed-delay periods, the lags for PMd (1.3 ± 0.3 trials) and putamen (1.5 ± 0.3 trials) did not differ from those for the other task periods or from each other (D(270,103) = 0.09, n.s.).
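For readers unfamiliar with the D values quoted above, the following sketch computes the two-sample Kolmogorov–Smirnov statistic, the largest vertical distance between two empirical cumulative distributions. It is an illustration only: the function name and the lag values are invented, and the significance step (comparing D against a critical value for the two sample sizes) is omitted.

```python
from bisect import bisect_right


def ks_two_sample_d(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the maximum absolute difference
    between the empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # each distinct observation is sufficient.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Invented lag values (in trials) for two hypothetical populations.
lags_site_1 = [1, 2, 3, 4]
lags_site_2 = [3, 4, 5, 6]
d_shifted = ks_two_sample_d(lags_site_1, lags_site_2)
d_same = ks_two_sample_d(lags_site_1, lags_site_1)
```

Identical distributions give D = 0 and completely separated ones give D = 1, so values near 0.1, such as those reported here, indicate closely overlapping lag distributions.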
Finally, we categorized cells according to their preference for novel or familiar trials, using the data from monkey 1. Although the neuronal search strategy introduced a bias toward cells with activity during familiar trials, the PMd and putamen samples contained significantly more cells with a preference for novel trials than for familiar trials (Fig. 15, χ²(1) = 40.3, P < 0.05).

Discussion
The current study compared learning-related changes in activity in PMd and the putamen, part of which receives inputs from PMd (Künzle, 1978; Takada et al., 1998b; McFarland & Haber, 2000). We reasoned that if the anatomical concept of cortical–basal ganglionic loops has functional validity, then changes in the cortex and the associated parts of the striatum should occur concurrently. The loop hypothesis of telencephalic organization, although well known and reasonably well accepted, has rarely been subjected to experimental test. Similar patterns of task-related activity in neocortex and the parts of the striatum to which it projects have often been noted in the past. For instance, cells in the head of the caudate resemble those in ventral prefrontal cortex, the stimulus-selective properties of caudal striatal cells (Caan et al., 1984; Brown et al., 1995) resemble those of inferotemporal areas (Gross, 1992), and the reinforcement-related properties of ventral striatal cells (Williams et al., 1993) may reflect striatal afferents from limbic regions. However, such similarities between striatal and cortical activity are only weakly suggestive of a loop organization; they could result from many causes. If neocortex and its targets in the striatum genuinely function as integrated, recurrent neural networks, as has been proposed (DeLong & Georgopoulos, 1981; Alexander & DeLong, 1985; Alexander et al., 1991; see also Houk & Wise, 1995; Beiser & Houk, 1998; Hikosaka, 1998), then activity in both regions should change contemporaneously during learning.
In accord with the loop hypothesis, we found no significant differences in the timing, across trials, of activity changes for PMd and putamen. (The caudate nucleus also showed learning-related activity changes at approximately the same time.) For both increasing and decreasing learning-related activity, the population averages showed only small changes in activity, if any, until approximately the time that learning reached asymptotic levels. Then these populations commenced a steady rise or fall in activity over the course of 10–15 trials for each IS (≈ 40–60 trials overall, incorrect and familiar trials excluded).
This finding has some relevance to the concept of consolidation. In general, memory appears to progress from a short-lived, fragile form to a long-lasting, more stable one. According to current thinking, neurons store information in two ways: sustained, recurrent activity levels for short periods of retention and changes in synaptic weights for long periods (e.g. O'Reilly et al., 2002). The changes in activity observed in PMd or striatal neurons were not likely to reflect information stored in recurrent circuits because the activity rates typically reset to baseline levels during the intertrial interval. Instead, learning-related changes in activity probably reflected the strength of synapses upon the cells we studied. We cannot comment on how long those changes lasted, because we typically monitored cells for less than an hour. But the finding that changes continued to occur after the monkeys reached a behavioural plateau (Figs 11A and B, and 12A and B) suggests that the synaptic weights continued to adjust, which could have played a role in stabilizing the information stored in the synaptic weight matrix of the relevant neural networks.
[Table footnote] Because there was no delay period for monkey 2, the number of comparisons for that task period was 300 for the putamen and 88 for the caudate.
[Figure caption, partial] Same as in A, for novel trials. (C and D) Three-point moving averages for the data above each graph, for the task period depicted by the arrows above the histograms (filled square), for activity in the reference period (grey circle) and for behaviour (proportion correct responses of the three trials in the moving average, unfilled circle). Note that because only correct trials are included in these averages, the learning appears to be faster than in Fig. 2, although the monkey did learn this mapping relatively rapidly. (E–H) Data from a second PMd cell in the format of A–D, except that the filled squares in G and H show activity for the post-reward period and for the upper right target. Because this cell shows significant pre-IS, anticipatory activity, post-movement data are used for reference activity. Abbreviations: IS, instruction stimulus; Rew, reward.

Interpretational problems
In considering changes in PMd and putamen activity, it is important to consider factors other than associative learning. For example, in monkey 1, reaction time (RT) was slightly faster for PMd recordings than for the striatum. This small difference may have reflected the fact that although recording sessions between PMd and striatum were usually intermixed, on occasion recording began in PMd and progressed to the striatum later in the session. However, these differences were subtle, as shown by Fig. 3, and unrelated to learning rates. Monkey 1 also showed slightly faster RTs after achieving criterion performance vs. later trials (Table 2). Whilst pre-movement activity may correlate with RT (Lee & Assad, 2003), previous studies have ruled out this account of learning-related activity (Mitz et al., 1991; Chen & Wise, 1995a). Neurons in PMd (Fu et al., 1993; Wise et al., 1997; Crammond & Kalaska, 2000; Gomez et al., 2000; Messier & Kalaska, 2000; Cisek et al., 2003) and its targets in the putamen (Crutcher & DeLong, 1984; Turner & Anderson, 1997; Ueda & Kimura, 2003) encode movement direction and amplitude, but neither these kinematic parameters nor EMG activity varied substantially during CVML. The small and infrequent differences in EMG activity during learning provide an unlikely account for the neurophysiological results.
As for factors such as oculomotor behaviour and attention, the task required the monkeys to attend to the centre of the screen for knowledge of both where and when to respond. Attention may have varied in some subtle way, but the stimulus remained highly salient from the beginning to the end of the learning session. There was also a fixation requirement during stimulus presentation for monkey 1. That monkey had relatively stable gaze at the centre of the screen at the time of both the IS and TS (Fig. 4). Moreover, although the monkeys did show increased oculomotor activity during the delay period for novel trials, there was no evidence for any differences in eye movements for PMd and striatal recordings. Furthermore, removal from the analyses of the periods of inconsistent gaze did not affect the timing of learning-related activity changes relative to behaviour. Similarly, changes in neural activity were unlikely to reflect changes in response to any particular dimensions of the visual stimulus. Monkey 1 performed the task using only familiar stimuli, some of which changed their response mappings daily and some of which did not. For correctly executed responses, the same response followed the same stimulus early vs. late in learning, yet the activity significantly changed in cells classed as learning related. It could not reflect simply the features of the stimulus or the motor response.
It is possible that learning-related changes in neural activity reflected changes in the internal state of the monkey, such as reward expectancy or motivation. Such accounts can be discounted for the majority of neurons, in which learning-related changes were apparent for only one of the four targets, as such nonspecific factors would not be directionally specific. Nevertheless, although they cannot account for learning-related activity changes, nonspecific factors such as reward expectancy contribute to neural activity in both PMd and the striatum (Fiorillo et al., 2003).
An account of learning-related activity in terms of stimulus sensitization or habituation is also unlikely, as previous work has ruled out both mechanisms (Mitz et al., 1991; Chen & Wise, 1995a). In the present study, the use of highly familiar stimuli for novel mappings in monkey 1 argues further against these possibilities.
The lack of change in reference-period activity and its rarity in familiar-trial activity (Table 5) argue against the idea that the results reflect a change in cell isolation or irritation artefacts. Furthermore, the fact that the response directions associated with learning-related activity changes were significantly nonrandom indicates some consistency across the trial. When learning-related activity occurs for one response direction in a given task period, that neuron has a significantly greater likelihood of showing learning-related activity changes for that same response in other task periods. Artefactual learning-related changes should be randomly distributed with respect to response direction.

Functional implications for conditional visuomotor learning
The mechanisms of CVML are of particular interest because this form of behaviour allows individuals to learn a wide variety of goal-directed actions guided by arbitrary associations among motor responses and sensory cues. The behavioural flexibility that these mechanisms afford may underlie the symbolic guidance of actions, including social communication.
The current study showed that cells in both the PMd and the striatum demonstrate changes in firing rates that accompanied CVML. This finding confirms previous reports for both regions (Mitz et al., 1991; Tremblay et al., 1998). The timing of neuronal activity changes with respect to behaviour was similar to that found in PMd by Mitz et al. (1991), and in SEF by Chen & Wise (1995a,b).
There is also evidence from recordings in rats that the striatum plays a role in the acquisition and performance of stimulus–response associations. Jog et al. (1999) reported that the percentage of task-related cells in dorsolateral striatum increased as rats gradually learned a two-choice auditory stimulus–response task. Data from individual cells recorded from multiple sessions revealed similar findings. Their task was a conditional motor learning task, and differs from CVML only in the sensory modality of the IS. Although the authors interpreted their results in terms of 'habits', it seems unlikely, based on the relatively small number of trials the rats experienced, that their rats performed according to the formal definition of a habit from animal learning theory (Balleine & Dickinson, 1998). Carelli et al. (1997), in contrast, recorded from cells in dorsolateral striatum while rats learned a single association between a tone and a lever press. They reported that, as rats became proficient at the task over the course of hundreds of trials (perhaps instigating a habit), the extent of activity related to the conditioned response decreased with learning. Thus, one might be tempted to conclude that, in rats, activity in the dorsal striatum increases during appetitively driven decisions (as in Jog et al., 1999), but that it decreases as the rat forms a habit (as in Carelli et al., 1997). However, much more work will be needed to substantiate such a claim. The present data give no support to the idea that dorsal striatum functions exclusively in habits (McDonald & White, 1993).
Regarding the preference for novel over familiar trials mentioned above, we observed this property both for the putamen and PMd. This result occurred notwithstanding the sampling bias toward cells with activity on familiar trials. These findings resemble those of Chen & Wise (1995b) for eye-movement-related activity in the SEF. They found a preponderance of preferences for novel stimuli and mappings, although both preferences were observed. The present results also concur with those of Tremblay et al. (1998), who reported similar numbers of neurons that demonstrate either an increase or decrease in neuronal activity in novel trials vs. familiar ones.
The reports of learning-related changes in neural activity in premotor cortex (including SEF) (Mitz et al., 1991; Chen & Wise, 1995a,b), prefrontal cortex (Asaad et al., 1998), the basal ganglia (Inase et al., 2001), and the hippocampus (Cahusac et al., 1993; Wirth et al., 2003) largely parallel the results of neuropsychological studies in monkeys. These studies demonstrate that the neural substrates necessary for normal performance or acquisition of CVML include PMd (Halsband & Passingham, 1985; Petrides, 1985), the orbital and ventral aspects of the prefrontal cortex and its interaction with inferotemporal cortex (Gaffan & Harrison, 1988; Eacott & Gaffan, 1992; Wang et al., 2000; Bussey et al., 2001, 2002), the basal ganglia in conjunction either with thalamus (Canavan et al., 1989) or PMd (Nixon et al., 2002), and the hippocampal system (Rupniak & Gaffan, 1987; Murray & Wise, 1996; Brasted et al., 2002, 2003). In contrast, lesion studies have found no evidence for parietal involvement in CVML (Rushworth et al., 1997; Pisella et al., 2000), whilst the discrimination of response–reward contingencies, which could conceivably assist CVML acquisition, has recently been attributed to the cingulate cortex (Hadland et al., 2003).
[Figure caption, partial] Figure 3 (bottom) shows the number of cases contributing to each data point. (B) Neuronal activity for all cases with learning-related increases in activity on novel trials, by recording site and monkey. Data are normalized relative to the maximum firing rate observed for each case (see Materials and methods). Note that the data for the second monkey (unfilled triangles) cover a smaller range of trials, reflecting the smaller number of associations the monkey was required to learn in a given problem set. (C) Activity in the reference period for all cases shown in B. (D) Performance accuracy for the familiar trials having the same target as in A. (E) Neuronal activity for the familiar trials shown in D. For each case, data come from the same task period that contributed to the learning-related increases in B. (F) Population cross-correlations of the changes in performance plotted in A and the changes in neuronal firing plotted in B. Note that positive leads denote changes in behavioural performance occurring prior to changes in neural activity, whilst negative lags denote changes in behavioural performance occurring after changes in neural activity.
The neuroimaging literature also supports a role for cortex and basal ganglia in CVML (Paus et al., 1993; Deiber et al., 1997; Toni & Passingham, 1999), including increased striatal involvement as learning progresses (Toni et al., 2001a). Imaging studies have, in general, failed to detect accompanying blood-flow changes in PMd during learning, or they have found only small changes (Deiber et al., 1991; Paus et al., 1993; Deiber et al., 1997; Toni & Passingham, 1999; Toni et al., 2001a). By contrast, many imaging studies provide evidence for a role of PMd in performing according to familiar mappings (Sweeney et al., 1996; Grafton et al., 1998; Toni et al., 2001b). Such inconsistency in neuroimaging findings may arise for any number of reasons, discussed elsewhere (Brasted & Wise, 2004). For example, the combination of learning-related decreases and increases in PMd, as shown in the current study and elsewhere (Mitz et al., 1991; Chen & Wise, 1995a), makes it difficult to predict a particular neuroimaging result.
Nevertheless, a recent analysis of 'effective connectivity' has led to a contention that corticostriatal interactions strengthen during CVML (Toni et al., 2002). The analytical methods of these investigators suggested that, as learning progressed, variation in the BOLD signal in the medial temporal and inferior frontal areas became increasingly correlated with that seen in the striatum. In addition, changes in the BOLD signal in the striatum became increasingly correlated with those subsequently seen in the premotor cortex. These analyses led the authors to infer that CVML depends on increased activity in corticostriatal pathways, an inference that is consistent with the time course of cortical and striatal activity reported in the current study. However, the conclusions of Toni et al. (2002) depend on a degree of corticostriatal convergence that has not been conclusively demonstrated with neuroanatomical methods. An alternative means for interaction is through direct corticocortical projections. The importance of interaction between ventral prefrontal and inferotemporal cortex has been established for CVML (Eacott & Gaffan, 1992; Bussey et al., 2002). It is interesting to note that although the most severe deficits in CVML follow dorsal premotor and ventral prefrontal lesions (and neuroimaging results point to those areas as well; Toni et al., 2001a; Eliassen et al., 2003), evidence of strong direct corticocortical connectivity between these two regions remains elusive (Lu et al., 1994; Ghosh & Gattera, 1995; Stephan et al., 2000; Wang et al., 2002). An additional mechanism for the integration of segregated basal ganglia-thalamocortical circuits could involve striato-pallido-thalamocortical projections (Toni et al., 2002) or interaction via the claustrum (Tanné-Gariépy et al., 2002).

Striatal recording sites in relation to PMd–basal ganglia anatomy
PMd both sends corticostriatal projections and receives inputs from pallidothalamocortical projections (Alexander et al., 1986; Parent & Hazrati, 1995; Sakai et al., 1996; Rouiller et al., 1999; Middleton & Strick, 2000), and it projects predominantly to dorsomedial aspects of the middle rostrocaudal levels of the putamen (Künzle, 1978; Takada et al., 1998b; McFarland & Haber, 2000), near the projections from forelimb representation in other nonprimary motor areas such as the supplementary motor area (SMA) (Strick et al., 1995; Inase et al., 1996; Takada et al., 1998a), the pre-SMA (Inase et al., 1999) and the ventral premotor cortex (PMv) (Takada et al., 1998b). The forelimb representation in primary motor cortex (M1) innervates more lateral regions within the putamen (Künzle, 1975; Flaherty & Graybiel, 1993; Parthasarathy & Graybiel, 1997; Takada et al., 1998a,b). In the current study, the cortical cells studied were located in PMd domain of the putamen, principally its forelimb area (Kurata et al., 1985; Matelli et al., 1991; Godschalk et al., 1995; Raos et al., 2003). However, no independent confirmation of the motor or mechanoreceptive fields was attempted in the present study. Some putamen cells were located in more lateral areas which receive projections from hand and arm representations of M1, and some striatal cells were located more medially, in the parts of the caudate nucleus that probably receive input from areas 8 and 9.
Fig. 15. Proportion of cells in PMd and putamen, for monkey 1, classified according to their preference for novel (N) or familiar (F) trials. N, significant difference from reference-period activity in novel trials only; F, significant difference from reference-period activity in familiar trials only. N < F, N > F, significant difference between familiar and novel trials, as well as from reference-period activity in both. N = F, significant task-related activity relative to reference-period activity, but no significant difference between novel and familiar trials.

Functional implications for cortex–basal ganglia interactions
A variety of theories have relied on the idea that striatal connectivity is well suited to detecting complex contextual input patterns and on evidence that it uses such context for the prediction of reinforcement (Schultz, 1998; Suri & Schultz, 1999; Bar-Gad et al., 2000; Suri et al., 2001). On this view, the striatum detects the context for a learned action, estimates a predicted outcome, and provides this information to the cortex as well as to targets in the brainstem (Houk & Wise, 1995; Mink, 1996; Bar-Gad et al., 2000; Bar-Gad & Bergman, 2001; Gurney et al., 2001). Consistent with this idea, the results of the current study demonstrate that as monkeys learn the context for a given response, related areas of the putamen and PMd exhibit changes in activity, and do so with a similar time course. Of further relevance is the finding that learning-related activity occurred as often during pre-reward and post-reward periods, after the associative response had been selected and executed, as during the earlier periods in which the responses were selected. This finding suggests a requirement for cells in both PMd and putamen to monitor the outcome of a context-based response choice. Neurons in the prefrontal cortex, SEF, and the anterior cingulate cortex appear to monitor the consequences of learned actions (Stuphorn et al., 2000; Hollerman et al., 2000; Ito et al., 2003), and it is likely that neurons in PMd and the putamen participate in similar processes. A recent review by Schultz et al. (2003) also notes the general similarity and simultaneity of changes in the striatum and associated parts of frontal cortex in the context of learning. The findings reported here therefore accord with the idea that related areas of cortex and the striatum play a role in context recognition and the contextual addressing of motor skills.

Note added in proof
Readers are referred to a recent study by Hadj-Bouziane & Boussaoud (2003).