Chronic nicotine increases midbrain dopamine neuron activity and biases individual strategies towards reduced exploration in mice

Dongelmans, Malou; Durand-de Cuttoli, Romain; Nguyen, Claire; Come, Maxime; Duranté, Etienne K.; Lemoine, Damien; Brito, Raphaël; Ahmed Yahia, Tarek; Mondoloni, Sarah; Didienne, Steve; Bousseyrol, Elise; Hannesse, Bernadette; Reynolds, Lauren M.; Torquet, Nicolas; Dalkara, Deniz; Marti, Fabio; Mourot, Alexandre; Naudé, Jérémie; Faure, Philippe

doi:10.1038/s41467-021-27268-7

Article
Open access
Published: 26 November 2021

Nature Communications volume 12, Article number: 6945 (2021)

Abstract

Long-term exposure to nicotine alters brain circuits and induces profound changes in decision-making strategies, affecting behaviors both related and unrelated to drug seeking and consumption. Using an intracranial self-stimulation reward-based foraging task, we investigated in mice the impact of chronic nicotine on midbrain dopamine neuron activity and its consequence on the trade-off between exploitation and exploration. Model-based and archetypal analysis revealed substantial inter-individual variability in decision-making strategies, with mice passively exposed to nicotine shifting toward a more exploitative profile compared to non-exposed animals. We then mimicked the effect of chronic nicotine on the tonic activity of dopamine neurons using optogenetics, and found that photo-stimulated mice adopted a behavioral phenotype similar to that of mice exposed to chronic nicotine. Our results reveal a key role of tonic midbrain dopamine in the exploration/exploitation trade-off and highlight a potential mechanism by which nicotine affects the exploration/exploitation balance and decision-making.

Introduction

Nicotine is the primary reinforcing component driving tobacco addiction^1,2,3. Like most addictive substances, nicotine is hypothesized to perpetuate addiction through alterations in dopamine (DA) signaling and plasticity in the mesocorticolimbic pathway⁴. Repeated activation of ventral tegmental area (VTA) DA neurons by nicotine not only leads to reinforcement but also to craving and lack of self-control over intake⁵. Concurrently, chronic exposure to nicotine exerts numerous effects on brain circuits, affecting personality traits and behaviors that extend beyond drug-seeking or consumption^6,7, such as changes to emotional state or levels of stress^8,9 and anxiety¹⁰. Chronic nicotine exposure also impacts various components of decision-making processes, such as impulsivity^11,12 or exploratory behaviors^13,14, which may contribute to the persistence of drug consumption by promoting relapse and susceptibility to other addictions¹⁵. However, directly linking the cellular effects of nicotine to modifications of decision-making has been elusive. Understanding the molecular and circuit-level mechanisms by which nicotine acts on decision-making is needed to decipher its multifaceted effects. Here we take advantage of a decision-making framework in a rodent model to address the impact of chronic nicotine exposure on VTA DA neuron activity and decision-making parameters.

Among the components of decision-making, the explore/exploit trade-off is of particular interest. Exploitation refers to choosing the option that seems, based on the history of rewards, the optimal choice. However, when faced with two alternatives, one with low and one with high probability of reward, animals do not purely exploit; they also choose the less likely rewarded option a significant portion of the time. The origin of such seemingly suboptimal choices remains poorly understood. It has been interpreted as noise, error, risk-seeking, irrational belief, or exploration^7,16,17,18,19. In the context of exploration, choosing an option with less likelihood of immediate reward is essential to gather information about unknown or uncertain outcomes in a changing environment. As new information is crucial for learning and behavioral adaptation^7,17, exploration is central to the emergence and organization of behaviors²⁰. Nevertheless, optimizing behavioral strategies requires exploiting reward knowledge. Exploitation and exploration thus constitute important, yet opposing, adaptive processes. Hence, determining the exact trade-off between exploration and exploitation is key to decision-making. This trade-off is ubiquitous across species and pervades a number of altered behaviors under specific psychiatric conditions, such as addiction^6,7. It is thus an ecologically valid tool for translational research and for dissecting the link between the impact of the drug at the molecular, circuit and behavioral levels. In the context of addiction, a modification of this trade-off will impact the global equilibrium of decisions between drug and non-drug rewards. Determining whether chronic nicotine exposure alters this exploration–exploitation trade-off is thus fundamental to understanding modifications of individual traits associated with continued nicotine consumption.

Altered DA function is a promising candidate to link chronic nicotine exposure to changes in decision-making behavior. This neuromodulator, which is at the crossroads of motivation, learning and decision-making, is hijacked, in the context of addiction, by most drugs of abuse^21,22,23. Changes in the spontaneous tonic firing of VTA DA neurons, as a consequence of repetitive drug use, can indeed alter the subjective value assigned to available rewards²¹, as well as the motivational salience of the drug or of drug-predicting cues²⁴, influencing decisions about which reward to pursue²⁵. Tonic DA can scale the performance of a learned behavior²⁶, the incentive value associated with environmental stimuli²⁷, or signal the average reward²⁸. In the exploration/exploitation framework, the role of tonic DA remains debated. The effect of DA manipulation on the exploration/exploitation balance is convincing but varies depending on the task^29,30,31. Increasing tonic striatal DA release has been suggested to either increase²⁹ or decrease³¹ the level of exploration. Decreasing tonic striatal DA has also been suggested to increase exploration³². Hence, drug-induced alterations of DA transmission may modify behavioral choices, either positively or negatively depending on the environment and the specific type of DA manipulation.

We have shown that decisions in reward-based foraging are modulated by the cholinergic neurotransmission of the VTA, with a particular role of nicotinic acetylcholine receptors in directed exploration, driven by expected uncertainty³³. Here we demonstrated that chronic nicotine exposure increases the tonic activity of VTA DA neurons and reduces undirected exploration to favor exploitation, with mice focusing on the most valuable options at the expense of information gathering. Acutely increasing the tonic activity of VTA DA neurons using optogenetics is sufficient to mimic the behavioral bias (decreased exploration) induced by nicotine, suggesting that the DA control of the exploration/exploitation balance is altered by long-term nicotine exposure.

Results

Mouse choices depend on reward probability, uncertainty and on motor cost

To assess choice behavior in an uncertain environment, we used a multi-armed ICSS (intracranial self-stimulation) bandit task for mice where specific locations, hereafter called targets, were associated with medial forebrain bundle (MFB) stimulations as rewards (Fig. 1a and Supplementary Fig. 1)^19,33. The task takes place in a circular open-field (interior diameter = 68 cm), with three explicitly marked targets forming the apices of a triangle (Fig. 1b). Passing over each target results in the delivery of a rewarding intracranial electrical stimulation. Mice cannot receive two consecutive stimulations at the same target, and thus learn to forage from one target to another in order to continue receiving stimulations (Fig. 1b, left). During the training period (5 min daily sessions), hereafter called the deterministic setting (DS, Fig. 1c, left), every visit to a target was reinforced by a stimulation (reward probability P = 100% at each location, P₁₀₀). At the end of the DS, mice were confronted with a probabilistic setting (PS, Fig. 1c, right) where each target was now associated with a different probability of stimulation delivery (P = 100%, 50%, and 25%, Fig. 1c, right). As previously shown³³, the PS induced a marked change in the behavioral pattern compared to the deterministic one. Trajectories at the end of the DS were almost circular, with very few directional changes (i.e., returning to the previous target, Fig. 1d) due to the associated motor cost (mice have to do a U-turn instead of going forward)¹⁹. In contrast, mice distributed their choices differently in the PS by incorporating more directional changes—an adaptation from the circular strategy (Fig. 1d). Directional changes in the PS were not random: rather, they allowed animals to focus on specific targets. 
Indeed, compared to the DS where mice visited the three targets with a uniform distribution, in the PS mice more often visited the targets associated with the highest reward probabilities (i.e., P₁₀₀ and P₅₀, Fig. 1e). Contrary to a purely exploitative strategy with alternating visits between P₁₀₀ and P₅₀, mice continued to visit all three points, prompting us to further investigate the exploration/exploitation trade-off in their choices. However, the global distribution of visits does not directly measure choices. Indeed, since mice cannot receive two consecutive rewards from the same target, the distribution of visits over targets is the result of binary choices in three gambles (G₁₀₀, G₂₅, G₅₀), each between two payoffs (here, G₁₀₀ = {P₅₀ vs P₂₅}, G₂₅ = {P₁₀₀ vs P₅₀}, G₅₀ = {P₁₀₀ vs P₂₅}) (Fig. 1f). Hence, we investigated free choices in each gamble independently, and the resulting trade-off between exploitation and exploration. When faced with a choice between two alternatives, exploitation corresponds to choosing the option to which the animal assigns the highest value, while exploration corresponds to choosing the less valued alternative. Animals purely exploiting would always choose the high-probability option, but we found that mice chose the less likely rewarded option a significant portion of the time, consistent with balancing exploitation and exploration in their choice behavior. For G₁₀₀ and G₅₀, mice chose the optimal location (i.e., the one associated with the highest probability of reward) in more than 50% of trials. However, for G₂₅ (i.e., the free binary choice between P₁₀₀ and P₅₀ when the animal is on P₂₅) the probability of choosing P₁₀₀ over P₅₀ was not different from a random choice (Fig. 1f), which we interpreted as mice assigning a positive motivational value to expected uncertainty, which is maximal at P₅₀ (ref. 33). Overall, mice biased their choices depending on the motor cost, and the probability and uncertainty of reward delivery. Behavior in the task was therefore the result of a balance between exploratory and exploitative choices.
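The gamble structure described above can be made concrete with a short sketch. It only encodes the rule that the current target cannot be chosen again, so each position offers a binary choice between the two other targets. The names `REWARD_PROB` and `gamble` are illustrative and do not come from the original analysis code.

```python
# Reward probabilities of the three targets in the probabilistic setting (PS).
REWARD_PROB = {"P100": 1.0, "P50": 0.5, "P25": 0.25}


def gamble(current):
    """Return the two targets available from `current`, best payoff first.

    A mouse cannot be rewarded twice in a row at the same target, so the
    gamble named after the current target is a choice between the other two.
    """
    options = [t for t in REWARD_PROB if t != current]
    return sorted(options, key=REWARD_PROB.get, reverse=True)


for current in REWARD_PROB:
    best, other = gamble(current)
    # e.g. on P100 the animal plays gamble G100 = {P50 vs P25}
    print(f"G{current[1:]}: {best} vs {other}")
```

In this encoding, "exploitation" in a given gamble means picking the first returned option, and "exploration" means picking the second.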

**Fig. 1: Mice exhibited suboptimal behavior and exploratory choices in a spatial version of a multi-armed bandit task with probabilistic settings.**

Nicotine exposure biases choices toward the most valuable options and promotes exploitation

We next aimed to investigate the effects of chronic nicotine exposure on decision-making behavior and on the balance between exploration and exploitation in the same task. To do so, we implanted osmotic minipumps subcutaneously to expose mice to continuous nicotine (Nic, 10 mg/kg/day) or saline (Sal) for 3 weeks and then compared their behavior at the end of the DS and in the PS of the ICSS task (Fig. 2a). Because nicotine induces long-lasting adaptations in the midbrain DA system³⁴, and because VTA DA neurons have been associated with decision-making under uncertainty^22,33, we first analyzed the spontaneous tonic activity of VTA DA cells in anesthetized mice. We recorded neurons from mice chronically exposed to either saline or nicotine via minipump, and that either had performed the behavioral task (“ICSS”, at the end of PS), or were behaviorally naive. DA neuron firing was analyzed with respect to the average firing frequency and the percentage of spikes within bursts (%SWB). As previously reported^9,35, chronic exposure to nicotine increased the tonic activity of DA neurons, both in terms of firing frequency and bursting activity, when compared to mice implanted with a saline minipump, both in mice that performed the task (ICSS) and in those that did not (no ICSS) (Fig. 2b). Furthermore, mice exposed to the ICSS task exhibited an increase in firing frequency, but no change in bursting activity, when compared to mice that were not stimulated (Fig. 2b).

**Fig. 2: Chronic exposure to nicotine altered both spontaneous DA activity and choice strategies.**

We then analyzed the behavior of mice in the ICSS task. Overall, we did not see any behavioral difference between mice implanted with a saline minipump (n = 23) and the non-implanted mice (n = 32) analyzed in Fig. 1 (Supplementary Fig. 2a–c). Therefore, these two groups were pooled and henceforth referred to as control (Ctl, n = 55). Trajectories at the end of the DS were stereotyped, almost circular, in both control and nicotine-treated mice. Both groups distributed their visits equally over the three locations (Fig. 2c) and their respective probabilities of directional changes were equal (∆ = −2.7%, Fig. 2d). However, the total number of rewards was higher for nicotine-treated than for control mice (∆ = 26, Fig. 2e), as a consequence of the decrease in the mean time-to-goal (i.e., the time necessary to go from one target to the next) in nicotine-treated mice (∆ = 0.83 s, Fig. 2f). When mice were placed in a classical open-field (without ICSS), a greater velocity was observed in mice exposed to nicotine, yet only at the beginning of the session (first 5 min) (Supplementary Fig. 2d). This result suggests that the increased speed observed in the ICSS task for nicotine-treated mice may arise from the combined effects of nicotine exposure and the stimulation rewards.

Clear differences in the behavior of nicotine- and saline-exposed mice were observed in the PS. Both groups distributed their choices depending on the probability of receiving a reward, but with markedly different strategies. Notably, while control mice made a substantial share of their visits to P₂₅, nicotine-treated mice instead focused on visiting the two most rewarded options (i.e., P₅₀ and P₁₀₀, Fig. 2g, ∆₂₅ = −5%, ∆₅₀ = 2.7%, ∆₁₀₀ = 2.3%), which was associated with an increase in the percentage of directional changes (∆ = 11%, Fig. 2h). These alterations in the overall distribution of visits resulted from changes in successive binary choices, with an increase in optimal choice selection in gamble G₁₀₀ (Fig. 2i, ∆ = 10%) for nicotine-treated mice compared to control mice. We also observed an increase in the total number of obtained rewards (∆ = 17.9, p = 0.002) and in the percentage of success (number of rewards divided by the number of trials, ∆ = 2%, p = 0.02) in nicotine-treated mice compared to control mice. Finally, the comparison of mean time-to-goal between the two groups (∆ = −1.1 s, Fig. 2j) indicates again an increased velocity in nicotine-treated mice, as was already observed in the DS. This increase in speed in the PS was not associated with a decrease in the number of directional changes made by nicotine-treated mice, suggesting that animals did not enter an automatic circular mode, disengaged from actual choices, but instead remained in a deliberative process. Altogether, these results indicate that chronic nicotine modifies the decision-making strategy of mice by biasing choices towards the seemingly most valuable options.

In the PS, adopting a purely exploitative strategy to maximize the success rate would require mice to choose the alternative with the highest probability of reward in each gamble, leading to a sequence of choices with solely the alternation of visits between P₁₀₀ and P₅₀. Both control and nicotine-treated groups clearly deviated from this strategy of pure exploitation, although nicotine-treated mice were more exploitative on average. Yet, population analyses (i.e., averaging over groups of animals) classically do not reflect the wide range of distinct behaviors and strategies that can be adopted by individuals. We therefore further analyzed our behavioral data, with the aim of revealing individual profiles and their adaptation under nicotine exposure.

Idiosyncrasy in choice behavior suggests individual strategies

Visual inspection of individual trajectories revealed that in the PS, some mice retained a circular strategy (with either an ascending {P₂₅ - P₅₀ - P₁₀₀} or descending {P₁₀₀ - P₅₀ - P₂₅} order) while others had what we hereafter call a gain-optimizing (GO) strategy, alternating between targets associated with the highest reward probabilities (P₁₀₀ and P₅₀) (Fig. 3a, lower left). By “gain-optimizing” strategy, we mean a very basic definition of optimality based only on maximizing the number of rewards, but which does not take into account the potential advantage of exploration. Theoretically, always choosing the most valuable option would lead to an average success rate of 75% (Fig. 3a, lower right) while a purely circular strategy would lead to an average success rate of 58.3% (Fig. 3a, upper right). Accordingly, the percentage of directional changes was correlated with the success rate (Fig. 3b, for control and nicotine-treated mice). In this graphical representation, the line (Fig. 3b, red line) that connects the theoretical points of the circular strategy (0% directional changes, 58.3% success) and of the GO strategy (100% directional changes, 75% success) represents a progressive shift in strategy. We found that, experimentally, the slope (s = 17.1 ± 1.5, black line, Fig. 3b) of the correlation between the percentage of directional changes and success rate was almost parallel to the theoretical line from circular to GO strategies (S_th = 16.7, red line, Fig. 3b), indicating that most of the directional changes were not random, but consisted of back-and-forth sequences between the P₅₀ and P₁₀₀ targets.
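The two theoretical success rates and the slope of the line joining them follow from simple averaging of the reward probabilities. The sketch below reproduces that arithmetic, assuming that each strategy visits its targets equally often; the function name `success_rate` is illustrative.

```python
from fractions import Fraction

# Reward probabilities of the three targets in the probabilistic setting.
P = {"P100": Fraction(1), "P50": Fraction(1, 2), "P25": Fraction(1, 4)}


def success_rate(visited):
    """Expected fraction of rewarded visits when the listed targets are
    visited equally often (assumption of this sketch)."""
    return sum(P[t] for t in visited) / len(visited)


# Purely circular strategy: all three targets in turn, no directional change.
circular = success_rate(["P100", "P50", "P25"])
# Gain-optimizing strategy: alternation between P100 and P50 only.
gain_optimizing = success_rate(["P100", "P50"])

# Slope of the line joining (0 directional changes, circular success) to
# (all directional changes, GO success), in percentage points of success
# per unit fraction of directional changes.
slope_th = float(gain_optimizing - circular) * 100

print(f"circular: {float(circular):.1%}")       # 58.3%
print(f"GO: {float(gain_optimizing):.1%}")      # 75.0%
print(f"theoretical slope: {slope_th:.1f}")     # 16.7
```

The computed slope matches the theoretical value S_th = 16.7 quoted in the text.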

**Fig. 3: Mice exhibited inter-individual differences in choice strategies which were differentially affected by chronic nicotine exposure.**

Differences in individual choice patterns were neither due to random variations, nor to different learning speeds, but rather a consequence of robust individual strategies. This is suggested first by the overall stability of the behaviors as indicated by the convergence to a plateau at sessions 8–10 (Supplementary Fig. 4a), and by the absence of any positive correlations between decision-making parameters and session number after the first 5 sessions (Supplementary Fig. 4b). Furthermore, to test whether the variabilities in behavior were robust for each individual from trial to trial, we compared the percentage of directional changes for two consecutive sessions for each animal of the control group. Directional changes showed a strong positive correlation from one session to the next (Fig. 3c), suggesting a strong consistency in individual behaviors. This observation was generalized by demonstrating that intra-individual variations are lower than the inter-individual variations (Supplementary Fig. 4c).

Having established that inter-individual variations in the PS performance arise from the strategies each mouse adopts within the task, we next aimed to characterize individual behaviors of all mice (both control and nicotine-treated groups, i.e., n = 82) in the task. For that purpose, we used a seven-dimensional dataset based on the statistics of (i) the directional changes, (ii) the target distributions and (iii) the three gambles (see data, Fig. 1d–f) followed by archetypal analysis^36,37. While clustering methods have been classically used to split high-dimensional datasets into clusters by aggregating individual data onto typical observations (the cluster centers), archetypal analysis depicts individual behavior more as a continuum within an “archetypal landscape” defined by extreme strategies: the archetypes. Individual data points are represented as linear combinations of extrema (vertices corresponding to archetypal strategies) of the dataset. The seven-dimensional dataset was used to identify three archetypal phenotypes. The three archetypes and their characteristics (Fig. 3d) differentiated mice exhibiting a GO strategy (i.e., focusing on P₅₀ and P₁₀₀) (Fig. 3a, in gray), from mice with circular patterns (equal distribution between the three targets, Fig. 3a), which either turned in a descending manner (labeled Des, in blue, sequence P₁₀₀ - P₅₀ - P₂₅ associated with high G₁₀₀ and G₂₅ but low G₅₀) or an ascending manner (labeled Asc, in green, sequence P₂₅ - P₅₀ - P₁₀₀ associated with low G₁₀₀ and G₂₅ but high G₅₀). The individual behavior of each of the 82 mice could be defined as a weighted combination of these three extrema in a ternary plot (Fig. 3e). An animal’s behavior in this ternary plot is defined by three coordinates (a,b,c) that sum to 1 and that depict its relative archetypal composition. Therefore, these coefficients (a,b,c) could be used to assign each individual to its nearest archetype based on its behavioral profile (Fig. 3e, left). 
This assignment revealed that 23.2% of the mice were closer to the GO archetype (gray), while the remaining mice were evenly distributed between the Des (39%, blue) and Asc archetypes (37.8%, green) (Fig. 3e, right). To analyze the effect of chronic nicotine, we split the control and nicotine-treated mice, and showed that these two groups are distributed differently in the archetypal space as indicated by a modification (i) of the distribution of the archetype’s assignments (Fig. 3f) and (ii) of the archetypal composition (Fig. 3g). Overall, chronic nicotine exposure produced an apparent displacement of the population further from Asc and Des apices and closer to the GO apex, thus it favored the emergence of the more exploitative, and thus less explorative, GO phenotype.

Nicotine modifies decision parameters associated with exploration

To quantitatively describe the effects of nicotine on the decision processes underlying steady-state choice behavior in mice, we modeled individual data using a softmax model of decision-making. In this model, the probability of choosing target A over B depends on the difference between their expected values, here the probability P of reward delivery associated with each target (as the stimulation magnitudes were the same for all targets), and the “inverse temperature” parameter β which represents the sensitivity to the difference of values (∆V). A small β favors exploration (the proportion of respective choices is less sensitive to ∆V, with a null β meaning that all options have the same probability of being selected, independently of their respective value), while a large β indicates exploitation (high sensitivity to ∆V, with an infinite β meaning that options associated with higher reward probabilities are always selected). β can thus be considered as a proxy to measure the exploration/exploitation trade-off. “Choosing the highest rewarded option” and “exploring less” are therefore equivalent in this exploit/explore framework. This model was adapted to account for the behavior of mice in the PS as follows: first, decisions were biased towards actions with the most uncertain consequences, by assigning a bonus value φ to the expected uncertainties, i.e., the variance P(1 − P) associated with each location³³. This allowed us to explain the atypically low probability of choosing P₁₀₀ over P₅₀ in G₂₅ (Fig. 1f). Second, to account for the circular bias observed in both DS and PS, we added a motor cost which decreases the value of a target if it requires the animal to perform a directional change¹⁹. Thus, in this adapted softmax model (Fig. 4a and “Methods”), three latent parameters, not directly observed but inferred from the model, were used: the “exploration/exploitation” parameter β, the uncertainty bonus φ, and the motor cost κ. The value of a given target was defined as the weighted sum of its expected value (100%, 50%, or 25%), its expected uncertainty (weighted by parameter φ), and its expected motor cost (weighted by parameter κ), with β setting the sensitivity of choices to differences in this value.
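The adapted softmax rule can be sketched in a few lines. This is a minimal illustration, not the fitted model: the exact parameterization is given in the paper's Methods, and the additive form V = P + φ·P(1 − P) − κ·[directional change] used here is an assumption, as are the function names `target_value` and `choice_probability` and all numerical values.

```python
import math

# Reward probabilities of the three targets in the probabilistic setting.
REWARD_PROB = {"P100": 1.0, "P50": 0.5, "P25": 0.25}


def target_value(target, previous, phi, kappa):
    """Assumed value of a target: expected reward, plus an uncertainty bonus
    phi * P(1 - P), minus a motor cost kappa if reaching the target requires
    a directional change (i.e. returning to the previously visited target)."""
    p = REWARD_PROB[target]
    uncertainty = p * (1.0 - p)
    motor_cost = kappa if target == previous else 0.0
    return p + phi * uncertainty - motor_cost


def choice_probability(option_a, option_b, previous, beta, phi, kappa):
    """Softmax probability of choosing option_a over option_b.

    beta is the inverse temperature: beta = 0 gives random choice,
    a large beta gives near-deterministic choice of the higher value."""
    dv = (target_value(option_a, previous, phi, kappa)
          - target_value(option_b, previous, phi, kappa))
    return 1.0 / (1.0 + math.exp(-beta * dv))


# Gamble G25 (animal on P25, arriving from P50, so returning to P50 is a U-turn).
# Illustrative parameters only.
print(choice_probability("P100", "P50", previous="P50",
                         beta=3.0, phi=1.0, kappa=0.2))
```

Under this assumed form, a sufficiently large uncertainty bonus can cancel the value difference between P₁₀₀ and P₅₀: with φ = 2 and no motor cost, both targets have the same value and the choice in G₂₅ is at chance whatever the value of β, which is the kind of effect the bonus was introduced to capture.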

**Fig. 4: Computational modeling suggests that decision parameters differ between the three archetypes and are differentially affected by nicotine exposure.**

We fitted the transition function of each mouse from the control group (n = 55) with this model, and obtained positive β, φ, and κ values (Fig. 4b, left). We then compared the output of this model (labeled M3: β > 0, φ > 0, and κ > 0) with two simpler ones, M1 and M2, to test whether mouse choices can be explained by simpler hypotheses. In M1, β and φ are set to 0, hence choices would be solely driven by motor cost (i.e., a bias against U-turns), which could explain circling behaviors independently of the probabilities associated with reward delivery. In M2, φ is set to 0, which would correspond to animals not taking uncertainty into account. Comparison of the models (Bayesian information criterion, Fig. 4b, right; and likelihood ratio test for nested models, Supplementary Fig. 5) indicated that M3 provides the best fit for the data, and suggests that mice used motor cost, reward probabilities and uncertainty of the reward location to drive their choices.
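For readers unfamiliar with the Bayesian information criterion, the comparison between nested models can be illustrated as follows. The standard definition BIC = k·ln(n) − 2·ln(L) is used, where k is the number of free parameters, n the number of observations and L the maximized likelihood; the lowest BIC is preferred. The log-likelihood values and the number of observations below are hypothetical placeholders, not results from the study.

```python
import math


def bic(log_likelihood, n_params, n_observations):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_observations) - 2.0 * log_likelihood


# Number of free parameters per model, as described in the text:
# M1 fits kappa only, M2 fits beta and kappa, M3 fits beta, phi and kappa.
N_PARAMS = {"M1": 1, "M2": 2, "M3": 3}

# HYPOTHETICAL maximized log-likelihoods for one animal (illustration only).
log_lik = {"M1": -1380.0, "M2": -1300.0, "M3": -1280.0}
n_choices = 2000  # hypothetical number of observed choices

scores = {m: bic(log_lik[m], N_PARAMS[m], n_choices) for m in N_PARAMS}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

The penalty term k·ln(n) means that the extra parameter of M3 is only justified when it improves the log-likelihood by more than ln(n)/2 relative to M2.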

The generative performance of the model was then assessed by simulating sequences of choices (n = 2000 model choices) for n = 55 mice with their respective model parameters (Fig. 4c, see also Supplementary Fig. 5). The model accurately reproduced the mean distribution of targets (Fig. 4c, left), the proportion of directional changes (Fig. 4c, middle), and the choice transition function (Fig. 4c, right). Individual transition functions from nicotine-treated mice (n = 27) were then fitted by the same model. When compared with the model parameters of control mice, nicotine exposure increased the value sensitivity parameter β, but did not affect the cost of directional changes (κ parameter), nor the uncertainty bonus φ (Fig. 4d). We thus asked whether recapitulating these effects on decision parameter β would be sufficient to mimic the effect of nicotine. We modeled the choices (n = 2000) using decision-making parameters from the control population (n = 55, as in Fig. 4b, c) modified by the average difference observed in the β parameter from nicotine-treated mice. We compared the three main behavioral measures altered by nicotine: (i) the probability of choosing the most valuable option in gamble G₁₀₀ (choosing P₅₀ over P₂₅), (ii) the percentage of directional changes, and (iii) the probability of visiting P₂₅. By applying an increase in β (derived from nicotine-treated mice) to the control model parameters, the model accurately reproduced the changes observed in decision-making strategy following chronic nicotine exposure for the three measures (Fig. 4e). Conversely, by applying a decrease in β (i.e., subtracting the average effect of nicotine from the nicotine-treated model parameters) we were able to simulate the conversion of a nicotine-treated behavioral profile into a control profile. These results thus suggest a specific effect of nicotine on the β parameter.
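The logic of this simulation can be sketched as follows: choices are drawn sequentially from a softmax rule, and behavioral measures are computed from the resulting sequence. This is a minimal sketch under stated assumptions, not the authors' code. The value function (expected reward plus φ·P(1 − P) minus κ for a directional change), the function names, and the parameter values are all illustrative and were not fitted to data.

```python
import math
import random

REWARD_PROB = {"P100": 1.0, "P50": 0.5, "P25": 0.25}


def value(target, previous, phi, kappa):
    """Assumed target value: reward probability + uncertainty bonus
    - motor cost when the move is a directional change (U-turn)."""
    p = REWARD_PROB[target]
    u_turn_cost = kappa if target == previous else 0.0
    return p + phi * p * (1.0 - p) - u_turn_cost


def simulate(beta, phi, kappa, n_choices=2000, seed=0):
    """Simulate a sequence of binary choices and return two summary measures."""
    rng = random.Random(seed)
    previous, current = "P25", "P100"  # arbitrary starting position
    visits = {t: 0 for t in REWARD_PROB}
    u_turns = 0
    for _ in range(n_choices):
        # The current target cannot be chosen again: binary gamble.
        a, b = [t for t in REWARD_PROB if t != current]
        dv = value(a, previous, phi, kappa) - value(b, previous, phi, kappa)
        p_a = 1.0 / (1.0 + math.exp(-beta * dv))
        chosen = a if rng.random() < p_a else b
        u_turns += chosen == previous  # returning to the previous target
        visits[chosen] += 1
        previous, current = current, chosen
    return {
        "p25_visits": visits["P25"] / n_choices,
        "directional_changes": u_turns / n_choices,
    }


# Same phi and kappa, only beta differs (illustrative values).
low_beta = simulate(beta=1.0, phi=0.5, kappa=0.1)
high_beta = simulate(beta=8.0, phi=0.5, kappa=0.1)
print(low_beta)
print(high_beta)
```

With these illustrative parameters, raising β alone reduces visits to P₂₅ and increases directional changes, which is the direction of the nicotine effect reported in the text.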

Finally, we assessed the correspondence between the archetypal analysis and the decision-making model, by comparing the value of the three parameters (β, φ, κ) depending on the archetypal composition (see “Methods”). Overall, the three archetypes corresponded to different combinations of the β and φ model parameters (Fig. 4f), and an almost homogeneous distribution of motor cost κ. The GO (gray) archetype was associated with a high value of β (corresponding to exploitation) and φ, which is consistent with individuals that favor the alternation between locations associated with higher probability (P₁₀₀ and P₅₀). The Des and Asc phenotypes corresponded to strong circular behaviors and a low sensitivity to value (low β value), resulting in a strong impact of the motor cost parameter (κ) on their strategies. The Des and Asc groups differed in the value of their uncertainty preference φ (∆ = 1.012, p = 0.0079), which was related to the directionality of their preferred rotation: a low φ corresponds to mice choosing the certain (P₁₀₀) reward over the uncertain (P₅₀) reward, resulting in a tendency for the sequence P₂₅ → P₁₀₀ → P₅₀ observed in Des mice (blue). Conversely, a high φ is associated with the reverse sequence P₂₅ → P₅₀ → P₁₀₀ observed in Asc mice (green). Such decomposition of the archetypal phenotypes into their underlying decision-making processes illustrates how the distribution of individual decision-making strategies (Asc, Des, and GO) in a population could correspond to transitions in the parameter values of the same model. Overall, the model identifies the effect of nicotine as an increase in β, which is consistent with a displacement of exposed mice towards the GO profile in the archetypal space.

Optogenetic DA neuron stimulation recapitulates the effects of nicotine

Nicotine exposure is known to induce modifications in a number of brain areas³⁸, including an increase in the tonic activity of VTA DA neurons, as we indeed confirmed in this study (Fig. 2b). Furthermore, the tonic activity of DA neurons has been proposed to play a role in the balance between exploration and exploitation^29,30,31. We thus asked whether directly and acutely modifying the activity of VTA DA neurons is sufficient to alter decision-making behavior within a session and recapitulate the effects of chronic nicotine in our ICSS task. To specifically and bi-directionally manipulate VTA DA neurons, we expressed either an excitatory channelrhodopsin (CatCh³⁹) or an inhibitory halorhodopsin variant (Jaws⁴⁰) in DAT^iCRE mice using a Cre-dependent viral strategy (Supplementary Fig. 6a). We confirmed in patch-clamp recordings that continuous 5 ms-light pulses at 8 Hz (470 nm) reliably increased VTA DA neuron activity in CatCh-transduced mice (Fig. 5a), while 500 ms-light pulses at 0.5 Hz (520 nm) reliably decreased their activity in Jaws-transduced mice (Fig. 5b). We then specifically tested the hypotheses that optogenetic activation of VTA DA neurons should reproduce the increased exploitation seen in nicotine-treated animals, and that optogenetic inhibition should produce the opposite effect.

**Fig. 5: Optogenetic manipulation of VTA DA neuron activity recapitulated the behavioral adaptations observed under chronic nicotine exposure.**

After mice completed both the DS and PS in the ICSS task, they went through optogenetic sessions maintaining the PS rules, with an alternating schedule of 2 days with photo-stimulation (ON, photo-stimulation started 5 min prior to the start of the daily session and was maintained throughout the 5 min session) and without (OFF) (Fig. 5c). During the OFF days, mice were connected to the optical fiber patch-cord but did not receive light stimulation. For each pair of ON/OFF experiments, we estimated the effect of the photo-stimulation on the four main behavioral measures that were altered by chronic nicotine (Fig. 5d and Supplementary Fig. 6d). As expected, optogenetic activation increased directional changes (Fig. 5d, left) and decreased the probability to visit P₂₅ (Fig. 5d, right), favoring alternations between P₁₀₀ and P₅₀, similar to the effect of nicotine. Opposite effects were observed for these two parameters when the firing rate was reduced in VTA DA cells using Jaws (Fig. 5d). Optogenetic activation reduced time-to-goal without affecting the choice in the gamble G₁₀₀, while optogenetic inhibition did not significantly affect either of these two parameters (Supplementary Fig. 6d). We fitted the transition function of CatCh- and Jaws-transduced mice with our decision-making model. The effects of photo-activating VTA DA neurons on decision-making during the ICSS task could be modeled as an increase of β (Fig. 5e), as observed with chronic nicotine. Photo-inhibition of VTA DA neurons, however, did not significantly affect the exploration/exploitation trade-off parameter β (Fig. 5e). For each pair of ON/OFF experiments, we also estimated the effect of the photo-stimulation on our seven main behavioral measures (Fig. 1d–f) and β parameter by calculating for a given measure (M) the difference ∆ = M_ON − M_OFF. 
These differences were compared with the net effect of nicotine obtained for each of these parameters by subtracting the mean control value from the mean effect of nicotine (Supplementary Fig. 6e, red). Overall, nicotine and photo-stimulation produced a similar pattern of effects on our behavioral measures (Supplementary Fig. 6e), while inhibition produced the opposite effect, albeit to a lesser extent.

Finally, by analyzing decision-making behaviors between the stimulated (ON) and non-stimulated (OFF) conditions in the previously identified archetypal space, we revealed that VTA DA neuron activation draws individual phenotypes towards the GO archetype (i.e., increased GO archetypal composition), while VTA DA neuron inhibition draws individuals away from GO (Fig. 5f). Thus, altering the firing pattern of VTA DA neurons is, by changing both the motor cost and the balance between exploration and exploitation behavior, sufficient to bias decision behaviors in the ICSS task, as suggested by our simulations (Fig. 4). Furthermore, increasing VTA DA neuron firing mimicked the effects of chronic nicotine exposure on decision-making measures, linking the behavioral alterations with the physiological changes to DA neurons we observed in nicotine-treated mice.

Discussion

Understanding how nicotine affects decision-making has been challenging, because two different physiological aspects need to be distinguished⁴¹: (i) nicotine as a reinforcer that directly activates the dopaminergic system to produce reinforcement and nicotine-seeking, and (ii) nicotine as a neuromodulator that alters nicotine-independent decision-making processes by modifying the dynamics and computational properties of cholinoceptive circuits. Here, using a multi-armed ICSS bandit task, we show that mice passively treated with nicotine forage more frequently at locations with the highest probabilities of reward (P₅₀ and P₁₀₀) compared to non-exposed animals, suggesting a bias in the exploration/exploitation trade-off toward decreased exploration. These behavioral changes were accompanied by modifications in the spontaneous activity of VTA DA neurons. We further showed that inter-individual variations in foraging strategies emerge between mice, despite the fact that they are all males of the same genetic background. This suggests that animals idiosyncratically adapt their behavior in response to task constraints, rather than all converging toward a theoretical “optimal” performance maximizing reward. Finally, optogenetically increasing or decreasing VTA DA neuron activity shifted individual strategies, recapitulating the results from nicotine-exposed mice and computational modeling. Together, our data suggest that modifications of the dopaminergic activity, notably through chronic nicotine exposure, scale the exploration/exploitation trade-off.

In our task, mice adapted their choices according to the probability of reward delivery, but they also consistently kept visiting the targets associated with lower reward probabilities in all of the gambles, even after extended training. Such a high level of exploratory behavior is potentially attributable to the setup, which is characterized by the delivery of small rewards, serially repeated gambles with short delays between trials, and learning through experience⁴². In the exploration/exploitation framework, the fact that mice continue to visit targets with the lowest reward probability in each of the gambles, despite intensive learning, can reflect (i) exploratory noise, generally modeled via decreased value sensitivity (or increased randomness) β in the softmax model, (ii) directed exploration, if one considers that mice continue to explore locations associated with low reward probability to reduce the uncertainty associated with probabilistic omission and gain information about task contingencies, and (iii) uncertainty-seeking, which is neither really exploratory nor exploitative, but considers that mice simply attribute a positive value to expected uncertainty, like a bonus for playing or gambling. Our analysis also introduces the idea of qualitative inter-individual variations, sometimes called “idiosyncrasy”, in choice strategy. Model comparison first suggests that all mice, even those that are away from the GO archetype, used information about reward probabilities and uncertainty. It also shows that the inter-individual variations were well described by a single computational model of decision-making that takes into account the exploration/exploitation trade-off, uncertainty, motor cost, and continuous variations of two latent variables⁴³ inferred through the model. Note that sex or strain differences may constitute another layer of variability⁴⁴, which we are currently addressing in ongoing experiments.
Finally, despite variations in individual choice behaviors, the consequences of nicotine administration were consistent, with a clear effect on the β parameter, and a strategy biased towards the exploitation of the highest reward values.

This interpretation is supported by two findings. Firstly, we could demonstrate that variations in behavior are not due to chance but indicate individual personalities as revealed by the strong correlation in descriptive parameter values between consecutive sessions (Fig. 4b). Secondly, it is conceivable that mice eventually achieve the optimal behavior (alternating solely between P₅₀ and P₁₀₀), albeit at different rates of learning. In this case, a possible interpretation of our data would be that nicotine facilitates learning⁴⁵ and speeds up the convergence toward the GO profile. However, this hypothesis is unlikely, and is not supported by our data. Indeed, despite significant adaptations during the transition between DS and PS, as well as during the first deterministic and probabilistic sessions (1–4), the animals’ behavior at the end of each setting is overall stable, and the choice behavior close to steady-state. In addition, manipulating the activity of VTA DA neurons using optogenetics acutely altered the behavioral strategies of the mice, with kinetics that are incompatible with synaptic plasticity or learning. We thus argue that inter-individual variations in task performance, as well as nicotine effects, should not be interpreted as differences in learning processes or knowledge of the environment, but rather as differences in using the knowledge acquired about the statistical structure of the environment (quantified by variations in β, φ, and κ) to develop their strategy within the task.

The increase of β reflects an amplified exploitative behavior, an effect that has been previously linked to enhanced tonic DA activity, which is hypothesized to modulate the bias towards optimal choices^29,30,31. In this study, we demonstrate a direct link between the tonic activity of DA neurons and exploitation using electrophysiological and optogenetic approaches. By clearly distinguishing action selection (choices) from action execution (time-to-goal), the multi-armed ICSS bandit task makes it possible to identify which components of value-based decision-making are modified in relation to tonic DA. We explicitly demonstrate an increase in value sensitivity due to nicotine-induced alterations in tonic DA activity. Previous ICSS studies have observed that chronic exposure to drugs sensitizes the brain reward system, and in doing so lowers the stimulation threshold (expressed as a current intensity or stimulation frequency)⁴⁶ required for ICSS⁴⁷. Here we expand these results by quantifying the effects of such increased value sensitivity on choices between ICSS-mediated rewarding locations, and further identifying a causal link between these behavioral modifications and increased tonic activity of VTA DA neurons.

In the context of DA neuron physiology, activity varies in frequency and in degree of burst firing. Bursting has been defined as successive action potentials separated by less than 80 ms⁴⁸, occurring on top of a regular “pacemaking” firing activity. Spontaneous activity (regular spiking and bursting) is associated with the neuromodulatory function of DA and its ability to shift ongoing dynamics of target structures. In this context, bursting is not necessarily locked to any behaviorally relevant or salient event. By contrast, phasic activity is related to event-locked increase in firing²², which can typically be observed as a synchronous increase in firing rate in a population of neurons, but is not necessarily composed of bursts of action potentials (i.e., it can be single spikes but time-locked to an event during successive trials). DA phasic activity modulates synaptic plasticity⁴⁹ and is critical for learning the value of stimuli or actions⁵⁰. The observed increase in VTA DA neuron activity (both in bursts and in firing rate) after nicotine exposure suggests that dopaminergic tone is modified in nicotine-exposed animals. Such an increase in the basal activity of VTA DA neurons^9,35 occurs through desensitization and up-regulation of nicotinic receptors and the long-term strengthening of glutamatergic synaptic transmission⁵¹. Here we show that acutely elevating VTA DA neuron activity using optogenetics is sufficient to mimic the behavioral alterations seen in mice under chronic nicotine exposure. Nicotine and optogenetic stimulation obviously act differently on DA neurons, yet they both lead to an increase in VTA DA neuron activity. Our optogenetic experiments confirmed once more that acutely modifying DA neuron activity did not change the animal’s knowledge of reward probabilities (learning), but the way the animals used the learned contingencies (values, probabilities and uncertainties) to shape a decision strategy or policy.
We thus link DA tonic neuromodulatory function and modifications of decision-making parameters (here β).
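The 80 ms inter-spike-interval definition of bursting mentioned above can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the function names and the example spike times are hypothetical, and only the 80 ms criterion stated in the text is applied (no separate burst-termination criterion is implemented).

```python
def detect_bursts(spike_times_s, max_isi_s=0.080):
    """Group spikes into bursts: runs of at least two spikes whose
    successive inter-spike intervals are all shorter than max_isi_s."""
    bursts = []
    current = [spike_times_s[0]] if spike_times_s else []
    for prev, nxt in zip(spike_times_s, spike_times_s[1:]):
        if nxt - prev < max_isi_s:
            current.append(nxt)          # spike continues the current run
        else:
            if len(current) >= 2:        # a run of >= 2 spikes is a burst
                bursts.append(current)
            current = [nxt]              # start a new candidate run
    if len(current) >= 2:
        bursts.append(current)
    return bursts


def percent_spikes_in_bursts(spike_times_s, max_isi_s=0.080):
    """Percentage of all spikes that belong to a burst."""
    if not spike_times_s:
        return 0.0
    n_in_bursts = sum(len(b) for b in detect_bursts(spike_times_s, max_isi_s))
    return 100.0 * n_in_bursts / len(spike_times_s)


# Hypothetical spike train (seconds), for illustration only.
example_spikes = [0.00, 0.25, 0.30, 0.34, 0.60, 0.90, 0.95, 1.30]
example_bursts = detect_bursts(example_spikes)
example_percent = percent_spikes_in_bursts(example_spikes)
```

With this made-up train, two bursts are found (three spikes and two spikes), so five of the eight spikes fall within bursts.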

Variations in neuromodulatory functions, including those in the catecholamine and cholinergic systems, contribute to the process of individuation^52,53,54. DA, and in particular from the VTA, has been linked to a cluster of traits (extraversion, novelty-seeking, etc.) conceptually related to reward-seeking^55,56. However, despite the substantial attention paid to DA in personality neuroscience, and despite a clear association between modulations in dopaminergic function and variations in individual traits, defining which specific traits are influenced by DA remains a challenging task. Our data suggest that modification in basal VTA DA neuron activity can directly modify the expression of a specific trait: the exploit/explore trade-off here estimated through the β variable. This result is reminiscent of the observations made from male mice living continuously in a large environment, which display idiosyncratic behavioral strategies during a decision-making task, and for which the exploration/exploitation balance was correlated with the activity of their DA system⁵⁴.

Nicotine exposure alters decision-making processes⁶. Non-contingency studies have previously shown that yoked nicotine exposure increases the incentive salience of non-nicotine stimuli⁵⁷, similar to the sensitization to ICSS rewards⁴⁷. These studies suggest an essential role of contextual cues in smoking and the nicotine-induced increase in reward sensitivity. Neuroeconomics studies have also linked smoking with increased impulsivity (delay-discounting task¹¹), lack of counterfactual learning signals⁵⁸, and decreased behavioral flexibility (exploration in a dynamic bandit task¹³). Our results further reveal that nicotine exposure decreases exploration. In addition, we provide a mechanistic understanding of how reward processing may be altered at the level of the VTA in response to chronic nicotine. This insight is translationally valuable as nicotine-induced alterations in explore/exploit processes likely also have implications for the everyday life of smokers, particularly as they can increase vulnerability for addiction to other drugs of abuse and for behavioral disorders such as pathological gambling that rely on value-based decisions^7,59 and present a high comorbidity with tobacco addiction⁶⁰. Our data underscore altered choice behaviors in smokers that likely participate in, but are not limited to, addiction⁶. Finally, such an explore–exploit paradigm and archetypal analysis could be very useful to study the effects of other drugs of abuse on decision-making. Indeed, humans with methamphetamine⁶¹ or alcohol use disorders⁶² display alterations in bandit tasks, but human studies cannot disambiguate whether altered decision-making facilitates, or results from, drug use. Hence, preclinical studies are needed to dissect the causal mechanisms underlying alterations in economic decisions, and to understand the dynamics of drug users’ profiles in general, and of smokers (or vape users) in particular.

Methods

Animals

Experiments were performed on adult C57Bl/6Rj DAT^iCRE and wild-type (Janvier Labs, France) mice. Male mice, from 8 to 16 weeks old, weighing 25–35 g, were used for all the experiments. They were kept in an animal facility where temperature (20 ± 2 °C) and humidity were automatically monitored and a 12/12 h light–dark cycle was maintained. All experiments were performed in accordance with the recommendations for animal experiments issued by the European Commission directives 219/1990, 220/1990, and 2010/63, and approved by Sorbonne University.

AAV production

AAV vectors were produced as previously described⁶³ using the cotransfection method and purified by iodixanol gradient ultracentrifugation⁶⁴. AAV vector stocks were titered by quantitative PCR (qPCR)⁶⁵ using SYBR Green (Thermo Fisher Scientific). Additional information is provided in Supplementary Table 1.

Intracranial self-stimulation (ICSS) electrode implantation

Mice were anaesthetized with a gas mixture of oxygen (1 L/min) and 1–3% of isoflurane (Piramal Healthcare, UK), then placed into a stereotaxic frame (Kopf Instruments, CA, USA). After the administration of a local anesthetic (Lurocain, 0.1 mL at 0.67 mg/kg), a median incision revealed the skull, which was drilled at the level of the medial forebrain bundle (MFB). A bipolar stimulating electrode (PlasticOne 2-channels, stainless-steel, 10 mm) for ICSS was then implanted unilaterally (left or right, randomized) in the brain (stereotaxic coordinates from bregma according to the Paxinos mouse brain atlas: AP −1.4 mm, ML ±1.2 mm, DV −4.8 mm from the brain). Dental cement (SuperBond, Sun Medical) was used to fix the implant to the skull. After stitching and administration of a dermal antiseptic, mice were placed back in their home-cage and had at least 5 days to recover from surgery. An analgesic (buprenorphine solution at 0.015 mg/L, 0.1 mL/10 g) was delivered after the surgery and, if necessary, on the following recovery days. The efficacy of electrical stimulation was verified through the rate of acquisition during the DS (see Intracranial self-stimulation (ICSS) bandit task).

Implantation of osmotic minipumps

After 5 days of training in the DS (see Behavioral methods), animals were anesthetized with a gas mixture of oxygen (1 L/min) and 1–3% of isoflurane (IsoVet, Piramal Healthcare, UK). After the administration of a local anesthetic, an incision was performed at the level of the interscapular zone, to subcutaneously implant an osmotic minipump (Model 2004, ALZET, CA, USA) containing 200 μL of either a solution of nicotine hydrogen tartrate salt (Sigma-Aldrich, USA) at a dose of 10 mg/kg/day (free base) or saline solution (H₂O with 0.9% NaCl) for the control group. Both solutions were prepared in the laboratory. Minipumps delivered their content with a flow rate of 0.25 μL/h over 28 days (covering the remaining training days in the DS and all sessions in the PS). The surgical wound was closed with surgical stitches. Animals had 2 days of rest to recover from the minipump surgery before going further with their behavioral training.
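As a worked check of the minipump figures given above, the following Python sketch computes the volume delivered over 28 days and the nicotine concentration that would be needed to reach 10 mg/kg/day. The 30 g body weight is an assumption chosen within the 25–35 g range reported for the animals; the actual concentrations used are not stated in the text, and the names are hypothetical.

```python
FLOW_UL_PER_H = 0.25        # minipump flow rate (from the text)
DURATION_DAYS = 28          # delivery duration (from the text)
RESERVOIR_UL = 200.0        # volume loaded in the pump (from the text)
DOSE_MG_PER_KG_DAY = 10.0   # nicotine free base (from the text)

daily_volume_ul = FLOW_UL_PER_H * 24                  # volume released per day
total_delivered_ul = daily_volume_ul * DURATION_DAYS  # volume released in 28 days


def required_concentration_mg_per_ml(body_weight_g):
    """Free-base concentration needed to deliver the target daily dose.

    body_weight_g is an assumption supplied by the caller, not a value
    reported for individual animals.
    """
    daily_dose_mg = DOSE_MG_PER_KG_DAY * body_weight_g / 1000.0
    return daily_dose_mg / daily_volume_ul * 1000.0   # mg/uL -> mg/mL


concentration_30g = required_concentration_mg_per_ml(30.0)
print(f"{daily_volume_ul:.1f} uL/day, {total_delivered_ul:.0f} uL over "
      f"{DURATION_DAYS} days, ~{concentration_30g:.0f} mg/mL for a 30 g mouse")
```

Under these assumptions the pump releases 6 μL per day, i.e., 168 μL of the 200 μL reservoir over 28 days, and a 30 g mouse would require a free-base concentration of about 50 mg/mL.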

Virus injections and optogenetics experiments

DAT^iCRE mice were anaesthetized (Oxygen 1 L/min, Isoflurane 1–3%) and implanted with an ICSS electrode as described above. They were then injected unilaterally (randomized left/right side and ipsi/contralateral side regarding the ICSS electrode) in the VTA (1 μL, coordinates from bregma: AP −3.1 mm; ML ±0.5 mm; DV −4.55 mm from the skull) with an adeno-associated virus (see Supplementary Table 1; AAV5.Ef1α.DIO.hCatCh.YFP, AAV5.Ef1α.DIO.Jaws.eGFP or AAV5.Ef1α.DIO.YFP). A double-floxed inverse open reading frame (DIO) restricted the expression of CatCh (Ca²⁺-translocating channelrhodopsin) or Jaws (red-shifted cruxhalorhodopsin) to VTA dopaminergic neurons of DAT^iCRE mice.

For optogenetic experiments on freely moving mice, an optical fiber (200 μm core, NA = 0.39, Thorlabs) coupled to a ferrule (1.25 mm) was implanted just above the VTA ipsilateral to the viral injection (coordinates from bregma: AP −3.1 mm, ML ±0.5 mm, DV −4.4 mm from the skull), and fixed to the skull with dental cement (SuperBond, Sun Medical). The behavioral task began at least 4 weeks after virus injection to allow the transgene to be expressed in the target DA cells. An ultra-high-power LED (470 nm or 520 nm, Prizmatix) coupled to a patch cord (500 μm core, NA = 0.5, Prizmatix) was used for optical stimulation (output intensity of 10 mW). Optical stimulation was delivered continuously, starting 5 min before and continuing throughout the 5 min of ON sessions of the task. Excitatory opsin (CatCh) was stimulated using 470 nm light pulses of 5 ms duration and 8 Hz frequency. Inhibitory opsin (Jaws) was stimulated using 520 nm light pulses of 500 ms duration and 0.5 Hz frequency. The experiment followed a schedule of paired ON and OFF days after the end of the training phase (DS + PS). The optical stimulation patch cord was plugged onto the ferrule during all experimental sessions (ON and OFF days) to habituate animals and control for latent experimental effects.
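To make the two stimulation protocols concrete, the short Python sketch below derives the duty cycle and pulse count of each pulse train over the 10 min of light delivery (5 min before plus 5 min during an ON session). The function and variable names are hypothetical, and the sketch describes timing only; it is not the code that drove the LED.

```python
def duty_cycle(pulse_width_s, frequency_hz):
    """Fraction of time the light is on for a periodic pulse train."""
    return pulse_width_s * frequency_hz


def pulse_onsets(frequency_hz, duration_s):
    """Onset times (s) of every pulse delivered within duration_s."""
    n_pulses = int(round(frequency_hz * duration_s))
    return [i / frequency_hz for i in range(n_pulses)]


STIM_DURATION_S = 10 * 60  # 5 min pre-session + 5 min session (from the text)

# (pulse width in s, frequency in Hz), as stated in the text
PROTOCOLS = {
    "CatCh_470nm": (0.005, 8.0),
    "Jaws_520nm": (0.500, 0.5),
}

summary = {}
for name, (width_s, freq_hz) in PROTOCOLS.items():
    onsets = pulse_onsets(freq_hz, STIM_DURATION_S)
    summary[name] = {
        "duty_cycle": duty_cycle(width_s, freq_hz),
        "n_pulses": len(onsets),
        "light_on_s": len(onsets) * width_s,
    }
```

Under these parameters the CatCh protocol keeps the light on 4% of the time (4800 pulses, 24 s of light in total), whereas the Jaws protocol has a 25% duty cycle (300 pulses, 150 s of light).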

Ex vivo patch-clamp recordings of VTA DA neurons

To verify the functional expression of the excitatory opsin CatCh and the inhibitory opsin Jaws, 10–12-week-old male DAT^iCRE mice were injected with the viruses described above. After 4 weeks, mice were deeply anesthetized with an intraperitoneal (IP) injection of a mix of ketamine/xylazine. Coronal midbrain sections (250 µm) were sliced using a Compresstome (VF-200; Precisionary Instruments) after intracardiac perfusion of cold (4 °C) sucrose-based artificial cerebrospinal fluid (SB-aCSF) containing (in mM): 125 NaCl, 2.5 KCl, 1.25 NaH₂PO₄, 5.9 MgCl₂, 26 NaHCO₃, 25 Sucrose, 2.5 Glucose, and 1 Kynurenate (pH 7.2, 325 mOsm). After 10–60 min at 35 °C for recovery, slices were transferred into oxygenated aCSF containing (in mM): 125 NaCl, 2.5 KCl, 1.25 NaH₂PO₄, 2 CaCl₂, 1 MgCl₂, 26 NaHCO₃, 15 Sucrose, and 10 Glucose (pH 7.2, 325 mOsm) at room temperature for the rest of the day and individually transferred to a recording chamber continuously perfused at 2 mL/min with oxygenated aCSF. Patch pipettes (4–8 MΩ) were pulled from thin wall borosilicate glass (G150TF-3, Warner Instruments) using a micropipette puller (P-87, Sutter Instruments, Novato, CA) and filled with a KGlu-based intra-pipette solution containing (in mM): 116 K-gluconate, 10-20 HEPES, 0.5 EGTA, 6 KCl, 2 NaCl, 4 ATP, 0.3 GTP, and 2 mg/mL biocytin (pH adjusted to 7.2). Transfected VTA DA cells were visualized using an upright microscope coupled with a Dodt contrast lens and illuminated with a white light source (Scientifica). To characterize CatCh expression, a 460 nm LED (CoolLED) was used both for visualizing YFP-positive cells (using a band-pass filter cube) and for optical stimulation through the microscope (1 s continuous for light-evoked current in voltage-clamp mode and 8 Hz with 5 ms/pulse to drive neuronal firing in current-clamp mode). To characterize Jaws expression, 20 s of photo-stimulation (500 ms pulses at 0.5 Hz) with a 525 nm LED (pE-2, CoolLED) was applied in current-clamp mode (−60 mV).
Whole-cell recordings were performed using a patch-clamp amplifier (Axoclamp 200B, Molecular Devices) connected to a Digidata (1550 LowNoise acquisition system, Molecular Devices). Signals were low-pass filtered (Bessel, 2 kHz) and collected at 10 kHz using the data acquisition software pClamp 10.5 (Molecular Devices). All the electrophysiological recordings were extracted using Clampfit (Molecular Devices) and analyzed with R.

In vivo juxtacellular recordings of VTA DA neurons

Mice were deeply anaesthetized with chloral hydrate (8%), 400 mg/kg IP, supplemented as required to maintain optimal anesthesia throughout the experiment. The scalp was opened and a hole was drilled in the skull above the location of the VTA. Extracellular recording electrodes were constructed from 1.5 mm O.D./1.17 mm I.D. borosilicate glass tubing (Harvard Apparatus) using a vertical electrode puller (Narishige). Under microscopic control, the tip was broken to obtain a diameter of approximately 1 µm. The electrodes were filled with a 0.5% NaCl solution containing 1.5% of Neurobiotin tracer (AbCys) yielding impedances of 6–9 MΩ. Electrical signals were amplified by a high-impedance amplifier (Axon Instruments) and monitored through an audio monitor (A.M. Systems Inc.). The signal was digitized, sampled at 25 kHz and recorded using Spike2 software (Cambridge Electronic Design) for later analysis. The electrophysiological activity was sampled in the central region of the VTA (coordinates from bregma: 3.1–4 mm AP, 0.3–0.7 mm ML, and 4–4.8 mm DV from the brain surface). Individual electrode tracks were separated from one another by at least 0.1 mm in the horizontal plane. Spontaneously active DA neurons were identified based on previously established electrophysiological criteria^66,67.

Fluorescence immunohistochemistry

After euthanasia, induced by IP injection of euthasol (0.1 mL per 30 g at 150 mg/kg) or by paraformaldehyde (PFA) intracardiac perfusion, brains were rapidly removed and fixed in 4% PFA. Following a period of fixation at 4 °C, serial 60 μm sections were cut from the midbrain with a vibratome. Immunohistochemistry was performed as follows: free-floating VTA brain sections were incubated for 1 h at 4 °C in a blocking solution of phosphate-buffered saline (PBS) containing 3% bovine serum albumin (BSA, Sigma A4503) and 0.2% Triton X-100 and then incubated overnight at 4 °C with a mouse anti-tyrosine hydroxylase antibody (TH, Sigma, T1299) at 1:500 dilution in PBS containing 1.5% BSA and 0.2% Triton X-100 (see Supplementary Table 1). The following day, sections were rinsed with PBS and then incubated for 3 h at 22–25 °C with Cy3-conjugated anti-mouse (Jackson ImmunoResearch, 715-165-150) at 1:500 dilution in a solution of 1.5% BSA in PBS. After three rinses in PBS, slices were wet-mounted using Prolong Gold Antifade Reagent (Invitrogen, P36930). Microscopy was carried out with a fluorescent microscope Leica DMR, and images captured in gray level using MetaView software (Universal Imaging Corporation) and colored post-acquisition with ImageJ.

For the optogenetic experiments on DAT^iCRE mice, an immunohistochemical identification of the transfected neurons was performed as described above, with an addition of chicken anti-eYFP antibodies (Life technologies Molecular Probes, A-6455) at 1:1000 dilution (Supplementary Table 1). A goat anti-chicken AlexaFluor 488 secondary antibody (711-225-152, Jackson ImmunoResearch) at 1:1000 dilution was then used in a solution of 1.5% BSA in PBS. Co-labeling of VTA neurons for TH and YFP confirmed their neurochemical phenotype and the success of the transfection.

Intracranial self-stimulation (ICSS) bandit task

Behavioral setup

The ICSS bandit task took place in a circular open-field with a diameter of 68 cm. Three explicit square-shaped marks (1 × 1 cm) were placed in the open field, forming an equilateral triangle (side = 35 cm). Entry in the circular zones (diameter = 6 cm) around each mark was associated with the delivery of a rewarding ICSS stimulation. Experiments were performed using a video camera, connected to a video-tracking system, out of sight of the experimenter. A LabVIEW (National Instruments) application precisely tracked and recorded the animal’s position with a camera (20 frames/s). When a mouse was detected in one of the circular rewarding zones, an electrical stimulator received a TTL signal from the software application and generated a 200 ms train of 5 ms biphasic square waves pulsed at 100 Hz (20 pulses per train). ICSS intensity was adjusted during training (see “Training settings”), within a range of 20–200 μA, so that mice achieved between 50 and 150 visits per session (5 min duration) for two successive sessions, and was then kept constant for the rest of the experiment. Mice with insufficient scores in the DS and PS (<40 visits despite increasing the intensity to the maximum of 200 μA) were excluded.
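The geometry of the setup described above can be sketched in a few lines of Python. This is not the LabVIEW tracking application: the placement of the triangle at the center of the arena, the coordinate system, and all names are assumptions made for illustration.

```python
import math

TRIANGLE_SIDE_CM = 35.0    # side of the triangle formed by the marks (from the text)
ZONE_DIAMETER_CM = 6.0     # diameter of each rewarding zone (from the text)
TRAIN_DURATION_S = 0.200   # duration of one ICSS train (from the text)
TRAIN_FREQUENCY_HZ = 100   # pulse frequency within a train (from the text)

# Assumption: the triangle is centered on the arena center, taken as (0, 0).
CIRCUMRADIUS_CM = TRIANGLE_SIDE_CM / math.sqrt(3)
MARKS = [
    (CIRCUMRADIUS_CM * math.cos(math.radians(angle)),
     CIRCUMRADIUS_CM * math.sin(math.radians(angle)))
    for angle in (90, 210, 330)
]


def zone_of(x_cm, y_cm):
    """Index of the rewarding zone containing (x, y), or None if outside all zones."""
    for index, (mark_x, mark_y) in enumerate(MARKS):
        if math.hypot(x_cm - mark_x, y_cm - mark_y) <= ZONE_DIAMETER_CM / 2:
            return index
    return None


# Number of pulses in one stimulation train: duration x frequency.
pulses_per_train = round(TRAIN_DURATION_S * TRAIN_FREQUENCY_HZ)
```

In this sketch a tracked position triggers a stimulation only when `zone_of` returns a zone index, and `pulses_per_train` recovers the 20 pulses per train stated in the text.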

Training settings

The training consisted of two settings: the deterministic setting (DS) and the probabilistic setting (PS), both consisting of at least 10 daily sessions of 5 min. In the DS, all zones were associated with an ICSS delivery (P = 100%). However, two consecutive rewards could not be delivered on the same target, which motivates mice to alternate between targets. In the PS, the zones were associated with three different probabilities (P = 25%, P = 50%, P = 100%) to obtain an ICSS stimulation. The locations of the three probabilities were pseudo-randomly assigned for each mouse.
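The reward rules of the two settings can be illustrated with the following Python sketch. It rests on one interpretation of the rule, namely that a target cannot deliver a stimulation if it was also the previously visited target; the target labels, the probability assignment, and the function name are hypothetical.

```python
import random


def run_session(visits, reward_prob, rng):
    """Return one boolean per visit: True if the visit triggered an ICSS.

    Assumed rule: re-entering the target that was just visited is never
    rewarded; otherwise the reward is drawn with the target's probability.
    """
    outcomes = []
    previous = None
    for target in visits:
        if target == previous:
            outcomes.append(False)
        else:
            outcomes.append(rng.random() < reward_prob[target])
        previous = target
    return outcomes


# Deterministic setting: every target is rewarded with P = 100%.
DS = {"A": 1.0, "B": 1.0, "C": 1.0}
# Probabilistic setting: hypothetical assignment of the three probabilities.
PS = {"A": 0.25, "B": 0.50, "C": 1.00}

ds_outcomes = run_session(list("ABAAC"), DS, random.Random(0))
ps_outcomes = run_session(list("ACBCAB"), PS, random.Random(0))
```

In the deterministic example only the immediate return to target A goes unrewarded, which is what pushes animals to alternate between targets.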

Data acquisition per experimental group

Different experimental groups underwent the ICSS bandit task. Firstly, locomotion and choice behavior of the mice that had been implanted with osmotic minipumps (saline, n = 23; nicotine, n = 27) were analyzed and compared between the last 2 days of both training settings. For optogenetics experiments, the DAT^iCRE mice (n = 21) completed the training, followed by a schedule of paired sessions with photo-stimulation (ON) alternated with days without photo-stimulation (OFF). The control animals (n = 55) were obtained by pooling together mice implanted with a saline minipump (n = 23) and non-implanted mice (n = 32). Figure 1 used only data from the non-implanted mice group. Figs. 2, 3, and 4 used the pooled control group.

Behavioral measures

For all of those groups, the following measures were analyzed and compared in the PS, as well as in the DS for the saline vs nicotine experiment: (i) number of visits, (ii) time-to-goal, (iii) choice repartition (proportion of visits at each location P₂₅, P₅₀, and P₁₀₀), and (iv) percentage of directional changes (i.e., U-turns, in which the (n + 2)th visit is at the same target as the nth visit). Furthermore, the ICSS bandit task can be seen as a Markovian decision process. Every transition between zones can be considered as a binary choice between two options, since the occupied zone cannot be reinforced twice in a row. The sequence of choices in a session is summarized by the proportions obtained in three specific binary choices (or gambles; e.g., G_C is the number of visits to target A divided by the total number of visits to targets A and B when the animal is in target C). The three gambles (G) were named after the point on which the mouse is positioned at the time of the choice: G₂₅ = 100% vs 50%, G₁₀₀ = 50% vs 25% and G₅₀ = 100% vs 25%. The target selected in these gambles reflects the balance between exploitative (choosing the most valuable option) and exploratory (choosing the least valuable option) choices. With a softmax-based decision-making model fitted in the laboratory, we computed three parameters: the value sensitivity or inverse temperature (the power to discriminate between values in a binary choice), the uncertainty bonus (the preference for expected uncertainty, considering the reward variance of every option in a binary choice) and the motor cost of a directional change (the value of a target decreases if reaching it requires going back to the previous target).
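As an illustration of how the choice-related measures above can be obtained from a sequence of visits, here is a minimal Python sketch. Targets are encoded by their reward probability in percent; the example sequence is made up, the function names are hypothetical, and two conventions are assumptions: directional changes are expressed relative to the number of visits from the third one onward, and each gamble is expressed as the proportion of choices for the higher-probability option.

```python
from collections import Counter

TARGETS = (25, 50, 100)  # targets labeled by reward probability (%)


def choice_repartition(visits):
    """Proportion of visits at each target."""
    counts = Counter(visits)
    return {target: counts[target] / len(visits) for target in TARGETS}


def directional_change_rate(visits):
    """Percentage of visits (from the third on) returning to the target
    visited two steps earlier, i.e., U-turns."""
    u_turns = sum(1 for n in range(2, len(visits)) if visits[n] == visits[n - 2])
    return 100.0 * u_turns / (len(visits) - 2)


def gambles(visits):
    """For each occupied target, proportion of next choices that went to the
    higher-probability option among the two remaining targets."""
    best, total = Counter(), Counter()
    for current, nxt in zip(visits, visits[1:]):
        alternatives = [t for t in TARGETS if t != current]
        total[current] += 1
        if nxt == max(alternatives):
            best[current] += 1
    return {current: best[current] / total[current] for current in total}


# Hypothetical visit sequence, for illustration only.
example_visits = [100, 50, 100, 50, 25, 100, 50, 100, 25, 50, 100]
example_repartition = choice_repartition(example_visits)
example_u_turns = directional_change_rate(example_visits)
example_gambles = gambles(example_visits)
```

For this made-up sequence, the gambles are G₁₀₀ = 0.75, G₅₀ = 0.75 and G₂₅ = 0.5, and three of the nine scorable visits are U-turns.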

Decision model

Decision-making models determined the probability P_i of choosing the next state i, as a function (the “choice rule”) of a decision variable. Because mice could not return to the same rewarding target, they had to choose between the two remaining ones. Accordingly, we modeled decisions between two alternatives labeled A and B and used a “softmax” choice rule, defined by P_A = 1/(1 + e^(−β(V_A − V_B))), where β is an inverse temperature parameter reflecting the sensitivity of choice to the difference of values V_i. The decision variable or value V of an option is modeled as the expected (average) reward + expected uncertainty + U-turn cost^19,33.

As mice could not receive two consecutive rewards on the same location, a 6 × 3 matrix is sufficient to describe the probability of choices between A, B, and C (the three targets) depending on the two preceding choices. For instance, after performing the sequence BA, the values for the three following options {A, B, C} are given by {V_A = 0; V_B = p_b + φ·p_b·(1 − p_b) − κ; V_C = p_c + φ·p_c·(1 − p_c)}. The U-turn cost is only applied to the choice B, as the BAB sequence would constitute a U-turn. Likewise, after the sequence CA, the values are given by {V_A = 0; V_B = p_b + φ·p_b·(1 − p_b); V_C = p_c + φ·p_c·(1 − p_c) − κ}. The same holds after AB, CB, AC, BC sequences, effectively resulting in a 6 × 3 matrix of choices. The free parameters of the model (β, φ, κ) were fitted by maximizing the data likelihood. Given a sequence of choices c = c_1..T, the data likelihood is the product of the probabilities of the individual choices given by the softmax choice rule⁶⁸. We used the optim function in R to perform the fits, with the constraints that β ∈ ]0,10], φ ∈ ]0,5] and κ ∈ ]0,5].
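The value function, the 6 × 3 transition matrix and the log-likelihood described above can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the R code used for the fits: the function names and the target labels are hypothetical, the currently occupied target is simply given probability 0, and the parameter optimization itself (performed with optim in R) is not reproduced.

```python
import math


def option_value(p, phi, kappa, is_u_turn):
    """Expected reward + uncertainty bonus, minus the U-turn cost if applicable."""
    return p + phi * p * (1.0 - p) - (kappa if is_u_turn else 0.0)


def choice_probs(prev, current, reward_prob, beta, phi, kappa):
    """Probability of each next target given the two preceding visits."""
    a, b = [t for t in reward_prob if t != current]
    v_a = option_value(reward_prob[a], phi, kappa, is_u_turn=(a == prev))
    v_b = option_value(reward_prob[b], phi, kappa, is_u_turn=(b == prev))
    p_a = 1.0 / (1.0 + math.exp(-beta * (v_a - v_b)))  # softmax choice rule
    return {current: 0.0, a: p_a, b: 1.0 - p_a}


def transition_matrix(reward_prob, beta, phi, kappa):
    """Six rows (ordered pairs of distinct targets) by three columns (next target)."""
    targets = list(reward_prob)
    rows = {}
    for prev in targets:
        for current in targets:
            if prev != current:
                probs = choice_probs(prev, current, reward_prob, beta, phi, kappa)
                rows[(prev, current)] = [probs[t] for t in targets]
    return rows


def log_likelihood(visits, reward_prob, beta, phi, kappa):
    """Sum of log choice probabilities; assumes no target is visited twice in a row."""
    total = 0.0
    for prev, current, nxt in zip(visits, visits[1:], visits[2:]):
        total += math.log(choice_probs(prev, current, reward_prob, beta, phi, kappa)[nxt])
    return total


# Hypothetical assignment of reward probabilities to targets A, B and C.
REWARD_PROB = {"A": 1.0, "B": 0.5, "C": 0.25}
example_matrix = transition_matrix(REWARD_PROB, beta=2.0, phi=0.5, kappa=0.3)
```

Maximizing `log_likelihood` over (β, φ, κ) within the stated bounds with any bounded numerical optimizer would correspond to the fitting step described in the text.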

In the model, the β parameter reflects how much the difference in total value between the two options (∆V) translates into more or less preference for the best option in a given gamble. With a small β, choices have low sensitivity for ∆V, with the extreme case of a null β where both options have the same probability to be selected in each gamble (G₁₀₀ = G₅₀ = G₂₅ = 50%), leading to equal global distribution of visits (P₁₀₀ = P₅₀ = P₂₅ = 33%), independently of their respective value. On the contrary, a large β indicates a high sensitivity to ∆V, with an infinite beta indicating that options associated with higher reward probabilities are always selected (G₁₀₀ = G₅₀ = 100% and G₂₅ would not even exist considering that animals would never visit P₂₅, with P₁₀₀ = P₅₀ = 50% and P₂₅ = 0%).
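The two limiting cases described above can be checked numerically with a short Python sketch. It is a simplified illustration that sets the uncertainty bonus and the U-turn cost to zero (φ = κ = 0), so that the next choice depends only on the occupied target; the function names are hypothetical, and the long-run distribution of visits is obtained by iterating a "lazy" version of the chain, which has the same stationary distribution but converges even when the animal almost strictly alternates.

```python
import math

REWARD_PROB = [0.25, 0.50, 1.00]  # targets P25, P50, P100


def transition(beta):
    """3 x 3 transition matrix under the softmax rule with phi = kappa = 0 (assumption)."""
    matrix = [[0.0] * 3 for _ in range(3)]
    for s in range(3):
        a, b = [t for t in range(3) if t != s]
        p_a = 1.0 / (1.0 + math.exp(-beta * (REWARD_PROB[a] - REWARD_PROB[b])))
        matrix[s][a] = p_a
        matrix[s][b] = 1.0 - p_a
    return matrix


def stationary(matrix, n_iter=500):
    """Long-run proportion of visits at each target.

    Iterates the lazy chain 0.5 * I + 0.5 * P, whose stationary distribution
    equals that of P.
    """
    pi = [1.0 / 3.0] * 3
    for _ in range(n_iter):
        stepped = [sum(pi[s] * matrix[s][t] for s in range(3)) for t in range(3)]
        pi = [0.5 * pi[t] + 0.5 * stepped[t] for t in range(3)]
    return pi


visits_beta_0 = stationary(transition(0.0))    # no value sensitivity
visits_beta_50 = stationary(transition(50.0))  # very high value sensitivity
```

With β = 0 the three targets are visited equally often, whereas with a very large β the visits concentrate on P₅₀ and P₁₀₀ and P₂₅ is almost never visited.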

Model comparison

To compare models⁶⁸, we used the Bayesian information criterion (BIC) to correct the raw likelihoods for the number of free parameters fitted. BIC scores were aggregated across mice (Fig. 4b). M1 and M2 are nested cases of M3. In M3, β > 0, φ > 0, and κ > 0. In M1, β and φ = 0, so that choices are only driven by a motor cost κ > 0. In M2, φ = 0, corresponding to animals that do not take uncertainty into account. A likelihood ratio test was used to estimate the probability of the observed data under the null hypothesis that these data are generated by the simplest model. For that we computed d, twice the difference in log likelihoods between M3 and M2 or M1. Under the null hypothesis, d follows a chi-square distribution with a number of degrees of freedom n equal to the difference in the number of parameters between M3 and M1 or M2 (here n = 1 or 2)⁶⁸.
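The two model-comparison quantities described above can be written compactly in Python. This is a minimal sketch rather than the analysis code: the function names are hypothetical, the log-likelihood values in the example are made up, and the chi-square tail probability is given only for the two cases needed here (1 or 2 degrees of freedom), for which closed forms exist.

```python
import math


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


def lrt_p_value(log_lik_simple, log_lik_full, df):
    """Likelihood ratio test for nested models.

    d = 2 * (lnL_full - lnL_simple) is compared with a chi-square
    distribution with df degrees of freedom.
    """
    d = 2.0 * (log_lik_full - log_lik_simple)
    if d <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(d / 2.0))  # chi-square tail, 1 d.o.f.
    if df == 2:
        return math.exp(-d / 2.0)             # chi-square tail, 2 d.o.f.
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# Made-up log-likelihoods for one animal with 200 choices, for illustration only.
N_CHOICES = 200
ll_m2, ll_m3 = -120.0, -115.0                # M2 (beta, kappa) vs M3 (beta, phi, kappa)
bic_m2 = bic(ll_m2, n_params=2, n_obs=N_CHOICES)
bic_m3 = bic(ll_m3, n_params=3, n_obs=N_CHOICES)
p_m2_vs_m3 = lrt_p_value(ll_m2, ll_m3, df=1)
```

In this made-up case, M3 has the lower BIC despite its extra parameter, and the likelihood ratio test (d = 10, one degree of freedom) rejects M2.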

Statistical analysis

All statistical analyses were computed using R (The R Project, version 4.0.0) and Python with custom programs. The results were plotted as a mean ± sem. The total number (n) of observations in each group and the statistics used are indicated in figure legends. Classical comparisons between means were performed using parametric tests (Student’s t-test, or ANOVA for comparing more than two groups when parameters followed a normal distribution (Shapiro test P > 0.05), and non-parametric tests (here, Wilcoxon or Mann-Whitney) when the distribution was skewed. Multiple comparisons were corrected using a sequentially rejective multiple test procedure (Holm). Probability distributions were compared using the Kolmogorov–Smirnov (KS) test, and proportions were evaluated using a chi-squared test (χ²). All statistical tests were two-sided except for the optogenetic experiment (Fig. 5) where statistical tests were paired and one-sided (we test hypotheses driven by nicotine effect and model). P > 0.05 was considered not to be statistically significant. For archetypal analysis, all computations and graphics have been done using the statistical software R and the archetype package (version 2.2-0.1). Briefly, given an n × m matrix representing a multivariate dataset with n observations (n = number of animals) and m attributes (here m = 7, consisting of the directional changes rate, the target distributions (3 values) and the three gambles (see data, Fig. 1c–e)), the archetypal analysis finds the matrix Z of k m-dimensional archetypes (k is the number of archetypes). Z is obtained by minimizing || X-α Z^T | | ₂, with α the coefficients of the archetypes (α_i,1..k ≥ 0 and ∑ α_i,1..k = 1), and | |.||₂ a matrix norm. The archetype is also a convex combination of the data points Z = X^Tδ, with δ ≥ 0 and their sum must be 1⁶⁹. The α-coefficient depicts the relative archetypal composition of a given observation. 
For k = 3 archetypes and an observation i, α_i,1; α_i,2; α_i,3 ≥ 0 and α_i,1 + α_i,2 + α_i,3 = 1. A ternary plot can then be used to visualize the data. The coefficients (α_i,1; α_i,2; α_i,3) are used to assign individual behavior to its nearest archetype (i.e., the archetype k that maximizes α_i,k). The α_i,j are also used as variables to estimate the archetypal composition of the population. For Fig. 4e, archetypal composition (0 ≤ α_i,j ≤ 1) was binned into five intervals. A pure archetype corresponds to 1; the archetypal composition decreases linearly with increasing distance from the archetype, and 0 corresponds to points on the opposite side.
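To make the use of the α coefficients concrete, here is a small Python sketch. It assumes the α matrix has already been estimated elsewhere (for instance in R); the function names and example values are hypothetical. Two choices are assumptions of this sketch rather than statements from the text: the five bins are taken to be equal-width, and the population composition is summarized as the mean coefficient per archetype.

```python
def check_alpha(row, tol=1e-6):
    """Verify the constraints alpha_i,k >= 0 and sum_k alpha_i,k = 1."""
    if any(a < -tol for a in row) or abs(sum(row) - 1.0) > tol:
        raise ValueError(f"not a valid convex combination: {row}")

def nearest_archetype(row):
    """Index k of the archetype with the largest coefficient alpha_i,k."""
    check_alpha(row)
    return max(range(len(row)), key=lambda k: row[k])

def population_composition(alphas):
    """Mean coefficient per archetype across animals (assumed summary)."""
    for row in alphas:
        check_alpha(row)
    n = len(alphas)
    return [sum(row[k] for row in alphas) / n for k in range(len(alphas[0]))]

def composition_bin(a, n_bins=5):
    """Equal-width bin index in [0, n_bins - 1]; a = 1 falls in the last bin."""
    return min(int(a * n_bins), n_bins - 1)

# Hypothetical coefficients for three animals and k = 3 archetypes.
alphas = [
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
    [0.30, 0.50, 0.20],
]

for i, row in enumerate(alphas):
    bins = [composition_bin(a) for a in row]
    print(f"animal {i}: nearest archetype = {nearest_archetype(row)}, bins = {bins}")

print("population composition:", population_composition(alphas))
```

Because each row of α sums to 1, every animal maps to a single point in the ternary plot, and the index returned by `nearest_archetype` is the vertex closest to that point.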

Statistics and reproducibility

All experiments were replicated with success.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The raw data from the online behavioral experiment (i.e., the trajectories) are available from the corresponding author. Source data are provided with this paper.

Code availability

Animal trajectories were collected with a homemade LabVIEW program (version 2014). The results were generated using code written in R (version 4.1.0) and Python (version 3.8.5). All code used to run the analysis is available from the authors upon request. Sample code for the model and the archetypal analysis is publicly available at https://zenodo.org/record/5596424#.YXfBAy8ivgY

References

Marti, F. et al. Smoke extracts and nicotine, but not tobacco extracts, potentiate firing and burst activity of ventral tegmental area dopaminergic neurons in mice. Neuropsychopharmacology 36, 2244–2257 (2011).
Fowler, C. D., Arends, M. A. & Kenny, P. J. Subtypes of nicotinic acetylcholine receptors in nicotine reward, dependence, and withdrawal: evidence from genetically modified mice. Behav. Pharmacol. 19, 461–484 (2008).
Stolerman, I. P. & Jarvis, M. J. The scientific case that nicotine is addictive. Psychopharmacology (Berl.) 117, 2–10 (1995). discussion 14-20.
Lüscher, C. & Malenka, R. C. Drug-evoked synaptic plasticity in addiction: from molecular changes to circuit remodeling. Neuron 69, 650–663 (2011).
American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders (DSM-5®) (APA, 2013).
Naudé, J., Dongelmans, M. & Faure, P. Nicotinic alteration of decision-making. Neuropharmacology 96, 244–254 (2015).
Addicott, M. A., Pearson, J. M., Sweitzer, M. M., Barack, D. L. & Platt, M. L. A primer on foraging and the explore/exploit trade-off for psychiatry research. Neuropsychopharmacology 42, 1931–1939 (2017).
Picciotto, M. R., Lewis, A. S., Schalkwyk, G. I. V. & Mineur, Y. S. Mood and anxiety regulation by nicotinic acetylcholine receptors: a potential pathway to modulate aggression and related behavioral states. Neuropharmacology 96, 235–243 (2015).
Morel, C. et al. Nicotinic receptors mediate stress-nicotine detrimental interplay via dopamine cells’ activity. Mol. Psychiatry 23, 1597–1605 (2017).
Nguyen, C. et al. Nicotine inhibits the VTA-to-amygdala dopamine pathway to promote anxiety. Neuron 109, 2604–2615.e9 (2021).
Locey, M. L. & Dallery, J. Isolating behavioral mechanisms of intertemporal choice: nicotine effects on delay discounting and amount sensitivity. J. Exp. Anal. Behav. 91, 213–223 (2009).
Viñals, X. et al. Overexpression of α3/α5/β4 nicotinic receptor subunits modifies impulsive-like behavior. Drug Alcohol Depend. 122, 247–252 (2012).
Addicott, M. A., Pearson, J. M., Froeliger, B., Platt, M. L. & McClernon, F. J. Smoking automaticity and tolerance moderate brain activation during explore–exploit behavior. Psychiatry Res. 224, 254–261 (2014).
Addicott, M. A., Pearson, J. M., Wilson, J., Platt, M. L. & McClernon, F. J. Smoking and the bandit: a preliminary study of smoker and nonsmoker differences in exploratory behavior measured with a multiarmed bandit task. Exp. Clin. Psychopharmacol. 21, 66–73 (2013).
Levine, A. et al. Molecular mechanism for a gateway drug: epigenetic changes initiated by nicotine prime gene expression by cocaine. Sci. Transl. Med. 3, 107ra109 (2011).
Wyart, V. & Koechlin, E. Choice variability and suboptimality in uncertain environments. Curr. Opin. Behav. Sci. 11, 109–115 (2016).
Cohen, J. D., McClure, S. M. & Yu, A. J. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos. Trans. R. Soc. Lond. Ser. B, Biol. Sci. 362, 933–942 (2007).
Wilson, R. C., Geana, A., White, J. M., Ludvig, E. A. & Cohen, J. D. Humans use directed and random exploration to solve the explore–exploit dilemma. J. Exp. Psychol. Gen. 143, 2074–2081 (2014).
Belkaid, M. et al. Mice adaptively generate choice variability in a deterministic task. Commun. Biol. 3, 1–9 (2020).
Berlyne, D. E. Curiosity and exploration. Science 153, 25–33 (1966).
Redish, A. D., Jensen, S. & Johnson, A. A unified framework for addiction: vulnerabilities in the decision process. Behav. Brain Sci. 31, 415–437 (2008). discussion 437-87.
Schultz, W. Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 30, 259–288 (2007).
Lüscher, C., Robbins, T. W. & Everitt, B. J. The transition to compulsion in addiction. Nat. Rev. Neurosci. 21, 1–17 (2020).
Kalivas, P. W. & Volkow, N. D. The neural basis of addiction: a pathology of motivation and choice. Am. J. Psychiat 162, 1403–1413 (2005).
Mizumori, S. J. Y. & Jo, Y. S. Homeostatic regulation of memory systems and adaptive decisions. Hippocampus 23, 1103–1124 (2013).
Cagniard, B. et al. Dopamine scales performance in the absence of new learning. Neuron 51, 541–547 (2006).
Westbrook, A. & Braver, T. S. Dopamine does double duty in motivating cognitive effort. Neuron 89, 695–710 (2016).
Niv, Y., Daw, N. D., Joel, D. & Dayan, P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl.) 191, 507–520 (2007).
Beeler, J. A., Daw, N., Frazier, C. R. M. & Zhuang, X. Tonic dopamine modulates exploitation of reward learning. Front. Behav. Neurosci. 4, 170 (2010).
Frank, M. J., Doll, B. B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat. Neurosci. 12, 1062–1068 (2009).
Humphries, M. D., Khamassi, M. & Gurney, K. Dopaminergic control of the exploration–exploitation trade-off via the basal ganglia. Front. Neurosci. 6, 9 (2012).
Cinotti, F. et al. Dopamine blockade impairs the exploration–exploitation trade-off in rats. Sci. Rep. 9, 6770 (2019).
Naudé, J. et al. Nicotinic receptors in the ventral tegmental area promote uncertainty-seeking. Nat. Neurosci. 19, 471–478 (2016).
Epping-Jordan, M. P., Watkins, S. S., Koob, G. F. & Markou, A. Dramatic decreases in brain reward function during nicotine withdrawal. Nature 393, 76–79 (1998).
Tolu, S. et al. Nicotine enhances alcohol intake and dopaminergic responses through β2* and β4* nicotinic acetylcholine receptors. Sci. Rep. 7, 45116 (2017).
Cutler, A. & Breiman, L. Archetypal analysis. Technometrics 36, 338–347 (1994).
Hart, Y. et al. Inferring biological tasks using Pareto analysis of high-dimensional data. Nat. Methods 12, 233–235 (2015). 3 p following 235.
Besson, M. et al. Long-term effects of chronic nicotine exposure on brain nicotinic receptors. Proc. Natl Acad. Sci. USA 104, 8155–8160 (2007).
Kleinlogel, S. et al. Ultra light-sensitive and fast neuronal activation with the Ca²⁺-permeable channelrhodopsin CatCh. Nat. Neurosci. 14, 513–518 (2011).
Chuong, A. S. et al. Noninvasive optical inhibition with a red-shifted microbial rhodopsin. Nat. Neurosci. 17, 1123–1129 (2014).
Faure, P., Tolu, S., Valverde, S. & Naudé, J. Role of nicotinic acetylcholine receptors in regulating dopamine neuron activity. Neuroscience 282C, 86–100 (2014).
Heilbronner, S. R. & Hayden, B. Y. Contextual factors explain risk-seeking preferences in rhesus monkeys. Front. Neurosci. 7, 7 (2013).
Musall, S., Urai, A. E., Sussillo, D. & Churchland, A. K. Harnessing behavioral diversity to understand neural computations for cognition. Curr. Opin. Neurobiol. 58, 229–238 (2019).
Calipari, E. S. et al. Dopaminergic dynamics underlying sex-specific cocaine reward. Nat. Commun. 8, 13877–15 (2017).
Rezvani, A. H. & Levin, E. D. Cognitive effects of nicotine. Biol. Psychiat 49, 258–267 (2001).
Hernandez, G., Trujillo-Pisanty, I., Cossette, M.-P., Conover, K. & Shizgal, P. Role of dopamine tone in the pursuit of brain stimulation reward. J. Neurosci. 32, 11032–11041 (2012).
Kenny, P. J. & Markou, A. Nicotine self-administration acutely activates brain reward systems and induces a long-lasting increase in reward sensitivity. Neuropsychopharmacology 31, 1203–1211 (2006).
Grace, A. A. & Bunney, B. S. The control of firing pattern in nigral dopamine neurons: burst firing. J. Neurosci. 4, 2877–2890 (1984).
Tritsch, N. X. & Sabatini, B. L. Dopaminergic modulation of synaptic transmission in cortex and striatum. Neuron 76, 33–50 (2012).
Steinberg, E. E. et al. A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973 (2013).
Juarez, B. et al. Midbrain circuit regulation of individual alcohol drinking behaviors in mice. Nat. Commun. 8, 2220 (2017).
Stern, S., Kirst, C. & Bargmann, C. I. Neuromodulatory control of long-term behavioral patterns and individuality across development. Cell 171, 1–25 (2017).
MacDonald, S. W. S., Nyberg, L. & Bäckman, L. Intra-individual variability in behavior: links to brain structure, neurotransmission and neuronal activity. Trends Neurosci. 29, 474–480 (2006).
Torquet, N. et al. Social interactions impact on the dopaminergic system and drive individuality. Nat. Commun. 9, 3081 (2018).
Smillie, L. D. & Wacker, J. Dopaminergic foundations of personality and individual differences. Front. Hum. Neurosci. 8, 874 (2014).
DeYoung, C. G. The neuromodulator of exploration: a unifying theory of the role of dopamine in personality. Front. Hum. Neurosci. 7, 762 (2013).
Palmatier, M. I. et al. Dissociating the primary reinforcing and reinforcement-enhancing effects of nicotine using a rat self-administration paradigm with concurrently available drug and environmental reinforcers. Psychopharmacology (Berl.) 184, 391–400 (2006).
Chiu, P. H., Lohrenz, T. M. & Montague, P. R. Smokers’ brains compute, but ignore, a fictive error signal in a sequential investment task. Nat. Neurosci. 11, 514–520 (2008).
Addicott, M. A., Pearson, J. M., Kaiser, N., Platt, M. L. & McClernon, F. J. Suboptimal foraging behavior: a new perspective on gambling. Behav. Neurosci. 129, 656–665 (2015).
McGrath, D. S. & Barrett, S. P. The comorbidity of tobacco smoking and gambling: a review of the literature. Drug Alcohol Rev. 28, 676–681 (2009).
Harlé, K. M. et al. Altered statistical learning and decision-making in methamphetamine dependence: evidence from a two-armed bandit task. Front. Psychol. 6, 1910 (2015).
Morris, L. S. et al. Biases in the explore–exploit tradeoff in addictions: the role of avoidance of uncertainty. Neuropsychopharmacology 41, 940–948 (2016).
Khabou, H. et al. Noninvasive gene delivery to foveal cones for vision restoration. JCI Insight 3, D358 (2018).
Choi, V. W., Asokan, A., Haberman, R. A. & Samulski, R. J. Production of recombinant adeno-associated viral vectors. Curr. Protoc. Hum. Genet. Chapter 12, Unit 12.9-12.9.21 (2007).
Aurnhammer, C. et al. Universal real-time PCR for the detection and quantification of adeno-associated virus serotype 2-derived inverted terminal repeat sequences. Hum. Gene Ther. Methods 23, 18–28 (2012).
Exley, R. et al. Distinct contributions of nicotinic acetylcholine receptor subunit alpha4 and subunit alpha6 to the reinforcing effects of nicotine. Proc. Natl Acad. Sci. USA 108, 7577–7582 (2011).
Ungless, M. A. & Grace, A. A. Are you or aren’t you? Challenges associated with physiologically identifying dopamine neurons. Trends Neurosci. 35, 422–430 (2012).
Daw, N. D. Trial-by-trial data analysis using computational models. In Decision Making, Affect, and Learning (eds Delgado, M. R., Phelps, E. A. & Robbins, T. W.) 3–38 (Oxford University Press, 2011).
Eugster, M. J. A. & Leisch, F. From spider-man to hero - archetypal analysis in R. J. Stat. Softw. 30, 1–23 (2009).

Acknowledgements

We are grateful to the animal facilities (IBPS), Camille Robert and Paris Vision Institute AAV production facility for viral production and purification. This work was supported by the Centre National de la Recherche Scientifique CNRS UMR 8246, INSERM U1130, the Foundation for Medical Research (FRM, Equipe FRM DEQ2013326488 to P.F.), FRM FDT201904008060 (to S.M.), the French National Cancer Institute Grants TABAC-16-022 and TABAC-19-020 (to P.F.), French state funds managed by the ANR (ANR-16 Nicostress, ANR-17 SNP-Nic, ANR-20 Nicado to P.F., ANR-19 Vampire to F.M.) and the LabEx Bio-Psy (to P.F.). M.L.D., R.D.C., and S.M. were the recipients of a fourth-year PhD fellowship from FRM (FDT20160435171, FDT20170437427, and FDT201904008060), C.N. was the recipient of a doctoral fellowship from the LabEx Bio-Psy, D.L. was the recipient of a post-doctoral fellowship from the LabEx Bio-Psy, and L.M.R. was supported by a NIDA–Inserm Postdoctoral Drug Abuse Research Fellowship.

Author information

Romain Durand-de Cuttoli
Present address: Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
These authors contributed equally: Malou Dongelmans, Romain Durand-de Cuttoli, Claire Nguyen, Maxime Come.

Authors and Affiliations

Sorbonne Université, INSERM, CNRS, Neuroscience Paris Seine - Institut de Biologie Paris Seine (NPS - IBPS), 75005, Paris, France
Malou Dongelmans, Romain Durand-de Cuttoli, Claire Nguyen, Maxime Come, Etienne K. Duranté, Damien Lemoine, Raphaël Brito, Tarek Ahmed Yahia, Sarah Mondoloni, Steve Didienne, Elise Bousseyrol, Bernadette Hannesse, Lauren M. Reynolds, Nicolas Torquet, Fabio Marti, Alexandre Mourot, Jérémie Naudé & Philippe Faure
Brain Plasticity Unit, CNRS, ESPCI Paris, PSL Research University, 75005, Paris, France
Maxime Come, Steve Didienne, Elise Bousseyrol, Lauren M. Reynolds, Fabio Marti, Alexandre Mourot, Jérémie Naudé & Philippe Faure
Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
Deniz Dalkara

Contributions

M.D., R.D.C., C.N. and M.C. contributed equally to this work. P.F. and M.D. designed the study. M.D., R.D.C., C.N., M.C., T.A.Y., E.K.D., R.B., E.B., B.H. and N.T. performed the behavioral experiments. M.D., R.D.C., S.M. and N.T. performed the minipump implantations. J.N., D.L. and S.D. contributed to setup developments. C.N., S.M., R.D.C., D.L. and F.M. performed electrophysiological recordings. M.D., R.D.C., C.N., M.C., E.K.D., R.B., T.A.Y., E.B., N.T. and J.N. performed the surgeries and virus injections. C.N. and S.M. performed the immunohistochemistry experiments. D.D. provided the viruses. J.N. and P.F. developed the model. A.M. developed the optogenetic setup. M.D., R.D.C., C.N., M.C., S.M., J.N., F.M. and P.F. analyzed the data. P.F. wrote the paper with inputs from M.D., R.D.C., C.N., M.C., L.M.R., J.N., F.M. and A.M.

Corresponding author

Correspondence to Philippe Faure.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review information

Nature Communications thanks Brandon Henderson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Dongelmans, M., Durand-de Cuttoli, R., Nguyen, C. et al. Chronic nicotine increases midbrain dopamine neuron activity and biases individual strategies towards reduced exploration in mice. Nat Commun 12, 6945 (2021). https://doi.org/10.1038/s41467-021-27268-7

Received: 23 February 2021
Accepted: 04 November 2021
Published: 26 November 2021
DOI: https://doi.org/10.1038/s41467-021-27268-7
