Main

Exoplanet population studies reveal the crucial impact of planet formation and early evolutionary mechanisms1,2 on their demographic characteristics. However, evolutionary processes such as atmospheric mass loss driven by host star extreme ultraviolet (XUV) flux3, interior cooling4 and contraction5 can substantially alter their thermal structure and composition within the first 100 Myr, thereby obscuring the imprints of planet formation. In this context, young transiting exoplanets represent a unique opportunity to probe the atmospheres of freshly formed planets and test formation and early evolution theories6,7,8. However, studying these young planets is challenging as most of them do not have well-constrained masses due to the large uncertainties in radial velocity (RV) measurements from their highly variable host stars9. Young stars are known to have large spot coverage and frequent flaring activity10, which can contaminate the measured transmission spectrum by the transit light source effect11. Most of the known young transiting planets12,13 lie above the radius valley and are theoretically predicted to be Neptune or sub-Neptune/super-Earth progenitors5.

V1298 Tau is one of the youngest transiting multi-planet systems known, consisting of three confirmed planets in a near 3:2:1 mean motion resonance and a fourth planet with an unconfirmed period14,15,16,17. The host is a 23-Myr-old weak-lined T Tauri star, which is a member of Group 29 (a young association in the foreground of the Taurus–Auriga star-forming region)18,19. Several age estimates have been published for V1298 Tau: 23 ± 4 Myr (ref. 15), 20 ± 10 Myr (ref. 20) and 28 ± 4 Myr (ref. 21). All of these estimates agree within 1σ and we adopt 23 ± 4 Myr from ref. 15. We observed one primary transit of V1298 Tau b using ten Hubble Space Telescope (HST) orbits with the Wide Field Camera 3 (WFC3) instrument in Grism-141 (G141) observing mode for General-Observer program (GO) 16083 (see ‘Observations’ in the Methods). V1298 Tau b is a warm (equilibrium temperature (Tequ) = 670 K)14, Jovian-sized planet (0.8–0.9 RJ; where RJ is the radius of Jupiter)14,16,17 orbiting its host star in 24.14 days (refs. 14,15,16,17). Mass estimates made using RVs yield a Jovian mass (0.64 ± 0.19 MJ (ref. 20); <0.5 MJ (ref. 16)), but the reliability of these constraints has recently been questioned9. The high stellar activity and uncertainty in the orbital period of planet e could result in unreliable RV mass estimates for all of the V1298 Tau planets9. Using RV mass measurements from ref. 20, ref. 22 concluded that V1298 Tau b would be stable to atmospheric mass loss due to its strong gravity.

The raw HST images were reduced using an in-house data-reduction pipeline23 (see ‘Data reduction’ in the Methods for details). We extracted a broadband integrated ‘white’ light curve in the HST/WFC3 G141 bandpass (1.12 μm–1.65 μm) and used a divide-white common-mode approach to derive systematics-corrected spectroscopic light curves24. The extracted white and detrended spectroscopic light curves are shown in Extended Data Figs. 1 and 2, respectively. The detrended spectroscopic light curves were fitted with a batman planetary transit model, linear limb darkening and a linear stellar baseline (see ‘Light curve analysis’ in the Methods and Table 1 for details). We estimated the effect of unocculted star spots on the transmission spectrum using techniques outlined in ref. 11 (see ‘Accounting for stellar activity’ in the Methods).

Table 1 Best-fit transit depths, linear limb-darkening coefficients and root mean squared residual compared with expected photon noise for V1298 Tau b

Results

The transmission spectrum of V1298 Tau b (Fig. 1) shows a high-amplitude absorption feature around the 1.4 μm water band (~400 ppm) that is larger than those of known Neptune/super-Neptune-mass planets, such as HAT-P-26b (250 ppm)25 and GJ 3470b (150 ppm)26. The water absorption amplitude is also large in comparison with well-studied hot Jupiters, such as HD209458b (~200 ppm)27. The amplitude of the water feature is indicative of a large atmospheric scale height, revealing an extended H-rich atmosphere. Assuming a clear, cloud-free, H/He-rich and isothermal atmosphere, we determined the scale height of this planet (1,100 ± 200 km), from which we estimated the mass to be 24 ± 5 Earth masses (M⊕) using a known method28. This mass estimate becomes an upper limit if the atmosphere is partly cloudy or hazy. We also derived an estimate of the planet mass from atmospheric retrievals, keeping the mass free (see ‘Mass estimate’ in the Methods and Extended Data Fig. 3). These fits yielded a 99.73% (3σ) upper limit on the mass of 23 M⊕, consistent with the analytically derived mass upper limit (24 ± 5 M⊕). RV measurements of this system provide a Jovian/sub-Jovian mass (220 ± 70 M⊕) for V1298 Tau b16,20. However, our observation rules out a 100 M⊕ (~2σ lower limit from ref. 20) transmission spectrum model (Fig. 1) at ~5σ confidence. We compared the derived mass and radius of planet b to the population of exoplanets (Fig. 2). With a density upper limit of 0.12 g cm⁻³, V1298 Tau b is comparable to the lowest-density planets known (super-puffs)29, but has a clear atmosphere compared with most super-puffs30. The estimated upper limit of the mass of V1298 Tau b is consistent with a Neptune/sub-Neptune-mass planet with a substantial H/He envelope (~40%; Fig. 2).
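
The link between scale height, mass and density can be checked to first order with the isothermal relation H = kT/(μg), where g = GM/R². The sketch below is a minimal illustration rather than the full formalism of ref. 28; the mean molecular weight (2.3 u, typical of an H/He-dominated gas), the use of the 670 K equilibrium temperature and the 0.91 RJ radius are assumptions made here for the example.

```python
import math

# Physical constants (SI units)
K_B = 1.380649e-23      # Boltzmann constant, J/K
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
AMU = 1.66053907e-27    # atomic mass unit, kg
M_EARTH = 5.972e24      # Earth mass, kg
R_JUP = 7.1492e7        # Jupiter equatorial radius, m


def mass_from_scale_height(h_m, temp_k, radius_m, mu_amu=2.3):
    """Planet mass (kg) implied by an isothermal scale height H = kT / (mu g)."""
    gravity = K_B * temp_k / (mu_amu * AMU * h_m)   # gravity in m/s^2
    return gravity * radius_m**2 / G                # M = g R^2 / G


def bulk_density_cgs(mass_kg, radius_m):
    """Mean density in g/cm^3 for a sphere of the given mass and radius."""
    volume = 4.0 / 3.0 * math.pi * radius_m**3
    return mass_kg / volume / 1000.0                # kg/m^3 -> g/cm^3


radius = 0.91 * R_JUP                               # assumed radius (ref. 14 value)
mass = mass_from_scale_height(1.1e6, 670.0, radius)
print(f"mass from H = 1,100 km: {mass / M_EARTH:.1f} Earth masses")
print(f"density at 24 Earth masses: {bulk_density_cgs(24 * M_EARTH, radius):.3f} g/cm^3")
```

Under these assumptions the mass lands within the quoted 24 ± 5 M⊕ range and the density close to 0.12 g cm⁻³; a heavier mean molecular weight or a cooler atmosphere would lower the inferred mass.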

Fig. 1: Observed HST/WFC3 transmission spectrum and retrieved TP profile and molecular abundances for V1298 Tau b.

a, Extracted transmission spectrum (stellar activity not corrected) of V1298 Tau b with 1σ uncertainties (green data points and error bars). The transit depths and uncertainties have been derived by fitting the spectroscopic light curves using an MCMC algorithm. The extraction of the spectrum is described in detail in ‘Light curve analysis’ in the Methods, and the transit depths and uncertainties are shown in Table 1. The solid blue line shows a best-fit atmospheric model from retrievals of the observed spectrum (see ‘Atmospheric models’ in the Methods). The dash-dotted black line shows a transmission spectrum for a 100 M⊕ solar metallicity model and the orange dashed line represents a 24 M⊕ 100× solar metallicity model, both of which are ruled out at 5σ. The red dotted line represents an isothermal model, which shows an absorption feature around 1.6 μm due to methane and is ruled out at 3σ. b, Best-fit TP profile for the 24 M⊕ retrieval with a 1σ confidence interval (red shaded region). The CH4/CO transition33 line, with the CH4-dominated region on the left and the CO-dominated region on the right, is shown. The coloured dashed lines represent the equilibrium abundances for the chemical species included in our model (calculated for the red solid TP profile). The retrieved grey cloud deck and quenching pressure and the 1σ confidence region from our retrieval analysis (Table 2) are shown.

Fig. 2: V1298 Tau b in the exoplanet mass–radius diagram.

The mass upper limit of V1298 Tau b (red star) was calculated from the observed transmission spectrum using free-mass retrievals and the formalism presented in ref. 28 (see ‘Mass estimate’ in the Methods). Other known young planets and low-density super-puffs are shown for comparison. The population of mature transiting planets is shown for comparison (data and uncertainties obtained from the NASA Exoplanet Archive80). The arrows indicate mass upper limits. The blue dotted lines are theoretical models from ref. 38 and show that the measured mass and radius of V1298 Tau b are consistent with an atmosphere with a large H/He envelope (~40% mass fraction assuming a 10 M⊕ core). Constant density contours show the density (ρ) of V1298 Tau b (0.12 g cm⁻³) is among the lowest discovered.

The observed transmission spectrum (without a stellar activity correction applied) was modelled using a one-dimensional (1D) radiative transfer code (PetitRADTRANS)31. We fixed the planetary mass to the estimated upper limit (24 M⊕). We modelled the atmosphere using an analytic temperature–pressure (TP) profile32 and a grey cloud deck. The dominant carbon-bearing species at 670 K is expected to be methane33. However, we did not detect methane absorption around 1.6 μm (Fig. 1). The absence of methane has been reported for other warm planets26, which may be explained by vertical mixing33. We simulated the effect of vertical mixing using a ‘quench’ pressure in our models (see ‘Atmospheric models’ in the Methods). The observed and modelled transmission spectra, and the retrieved atmospheric properties, are shown in Fig. 1. The retrievals converge to a low atmospheric metallicity (\(\log Z=-0.1_{-0.72}^{+0.66}\,\mathrm{solar}\)) compared with theoretical mass–metallicity relations for exoplanets34 based on core-accretion theory35 and known constraints for Neptune/sub-Neptune planets36,37. The observed spectrum can also be explained by lower planet masses (10–20 M⊕) with even lower metallicities (0.1–0.01 solar). Cloudy models are marginally favoured over cloud-free models (Extended Data Fig. 4 and Table 2) for the fixed-mass retrievals. However, the retrieved cloud deck pressures are relatively high (\(10^{-2}\) bar), implying that clouds do not have a strong effect on the atmospheric absorption feature. Similarly, for free-mass retrievals, our posteriors showed an upper limit of ~\(10^{-1}\) bar for the cloud deck pressure.

Table 2 Atmospheric parameters of V1298 Tau b retrieved from its transmission spectrum

Using the aforementioned derived mass upper limit and radius measurement (0.91 ± 0.05 RJ) from ref. 14, we estimated a bulk metallicity of 6× solar for V1298 Tau b, applying the formalism from ref. 34. This estimate should be interpreted cautiously as the models presented in ref. 34 do not account for high interior flux from the planet. Given that V1298 Tau b is young with a potentially hot interior, the true bulk metallicity could be higher. The estimated bulk metallicity is shown in Fig. 3. However, the posteriors from our atmospheric retrievals (Extended Data Fig. 5) ruled out 6× solar values at ~3σ. The atmospheric metallicity of V1298 Tau b is significantly lower than its estimated bulk metallicity, which indicates that the atmosphere and interior of this planet may not be well mixed34.

Fig. 3: V1298 Tau b in a mass–metallicity diagram with a sample of exoplanets.

The uncertainties (error bars) have been taken from individual references for each planet in the sample, which was compiled from ref. 81. The atmospheric metallicity of V1298 Tau b is derived from the retrieval analysis and has been converted into a multiple of stellar metallicity using the estimate of 0.1 ± 0.15 dex (ref. 20). The metallicity uncertainty is calculated by adding the atmospheric metallicity uncertainty from the retrievals (Table 2) and the stellar metallicity uncertainty in quadrature. The dashed blue and brown lines show the metallicities of Neptune and Jupiter, respectively. We note that the Solar System metallicity estimates are from methane abundance measurements41, whereas the metallicity estimates for exoplanets are derived from oxygen abundance measurements. V1298 Tau b has a mass consistent with Neptune/sub-Neptunes or potentially even super-Earths, but its metallicity is comparable to giant planets such as Jupiter. From the estimated mass upper limit and radius of V1298 Tau b, we estimate a bulk metallicity following the formalism of ref. 34, shown by a black dotted line.
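
The conversion to stellar-relative metallicity and the quadrature sum described in this caption can be written out as a short worked example. This is a sketch only: it assumes that the conversion is a simple subtraction in log space and that the asymmetric retrieval uncertainties are each combined separately with the stellar uncertainty; the function name is illustrative.

```python
import math


def to_stellar_relative(log_z_atm, err_plus, err_minus, feh_star, feh_err):
    """Express log10 metallicity relative to the host star (assumed: subtraction in dex).

    The upper and lower uncertainties are each added in quadrature
    with the stellar metallicity uncertainty.
    """
    value = log_z_atm - feh_star
    upper = math.hypot(err_plus, feh_err)
    lower = math.hypot(err_minus, feh_err)
    return value, upper, lower


# Retrieved log Z = -0.1 (+0.66 / -0.72) solar; stellar [Fe/H] = 0.1 +/- 0.15 dex
value, upper, lower = to_stellar_relative(-0.1, 0.66, 0.72, 0.1, 0.15)
print(f"log(Z / Z_star) = {value:.2f} (+{upper:.2f} / -{lower:.2f})")
```

Because the stellar uncertainty (0.15 dex) is small compared with the retrieval uncertainties, the quadrature sum widens the error bars only slightly.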

In Fig. 3 we present V1298 Tau b in the context of the exoplanet population. V1298 Tau b has a mass consistent with a Neptune/sub-Neptune or even a super-Earth and a metallicity comparable to (or lower than) Jupiter. A high-metallicity atmosphere (100× solar) with the estimated mass upper limit of this planet can be ruled out at ~5σ confidence (see the orange dashed line in Fig. 1). Therefore, in spite of being a likely Neptune/sub-Neptune, or even a super-Earth progenitor, V1298 Tau b possesses an atmosphere that is 100–1,000× depleted in metal compared with Neptune and Uranus. The atmospheric metallicity in Fig. 3 has been shown with respect to stellar metallicity.

We also investigated the reason for the absence of expected methane. We ran chemical kinetics models that incorporated a self-consistent TP profile, photochemistry and vertical mixing (see ‘Atmospheric models’ in the Methods). We found that photochemical processes did not play an important role in methane depletion (see ‘Atmospheric models’ in the Methods and Supplementary Figs. 3 and 4). These models, with different internal temperatures, demonstrate that it is possible to remove methane through deep quenching, although it requires a high interior temperature (Extended Data Fig. 6). At the highest intrinsic temperature we tested (Tint = 400 K), the quenched molar fraction of methane was still \(10^{-4.7}\), which is close to the detectability limit (\(10^{-5.5}\)) with HST33. Retrievals using free chemistry (see ‘Atmospheric models’ in the Methods and Supplementary Fig. 1) put an upper limit of \(10^{-6}\) on the methane volume mixing ratio. JWST observations of this planet will be more sensitive to such low methane abundances.

Discussion

The differences between the masses estimated from the atmospheric scale height and those from dynamical studies16,20 could potentially originate from the treatment of the impact of stellar activity on RV signals, as recently demonstrated in ref. 9. There is also uncertainty in the orbital period of planet e (ref. 17) in this system, which could substantially impact the RV mass constraints. Complementary efforts to estimate the mass from transit timing variation measurements are ongoing (Livingston et al., manuscript in preparation).

The low envelope metallicity and relatively large H/He content that we measured for V1298 Tau b are in agreement with early evolution models38, yet this planet must have been on the verge of runaway gas accretion. We emphasize that the origin and early evolution of Neptunes/sub-Neptunes have been open questions: it is not known why these planets accreted only a small fraction of H/He and did not become gas giants7,39. Such systems probably formed in situ, either early with an enhanced atmospheric opacity due to dust grains7 or with substantial disk–envelope interaction to replenish the proto-atmosphere with high-entropy gas40. Late formation in a depleting transitional disk in which the core does not have enough time to accrete a large H/He envelope6 can also produce Neptune/sub-Neptune-mass planets.

The standard core-accretion picture of planet formation35 predicts a mass–metallicity relationship that has been observed in the Solar System41 and reported for exoplanets42. The relatively water-poor atmosphere of V1298 Tau b that we found in this work indicates that this planet must have spent most of its accretion phase within the water ice line, thereby failing to accrete volatile-rich pebbles43. The volatile content of the inner disk can be strongly affected by the growth of massive planets in the outer part of the disk44. In this scenario, a massive planet, formed beyond the water ice line, blocks the supply of volatile-rich pebbles to the inner part of the disk. RV constraints on the mass of V1298 Tau e put it in a Jupiter/sub-Jupiter range16,17,20, with an orbital period that may be longer than 40 days. Pebble filtering could therefore play an important role in this system by producing volatile-poor atmospheres of the inner planets.

Alternatively, V1298 Tau b could have accreted volatile-rich material that ended up being locked up in the interior of the planet. Young planets could also experience extreme mass loss driven by the intense XUV flux of their active host stars. Using our updated mass upper limit (24 M⊕) for this planet, we simulated the mass and radius evolution (Fig. 4; see also ‘Atmospheric evolution models’ in the Methods). We estimated the Jeans escape parameter45 to be 27. Our calculations suggest that V1298 Tau b is susceptible to photoevaporation, in contrast to the conclusions of ref. 22 based on RV mass estimates. V1298 Tau b may lose up to a few Earth masses within the first billion years of its life. The rocky pebble/planetesimal accretion theory of planet formation46 predicts a gradually mixed interior structure, as observed for Jupiter47. We show two possible interior and evolution models for V1298 Tau b (Extended Data Fig. 7): a core–envelope structure and a diluted core structure. The observed mass, radius and metal-poor envelope can all be explained by both models, but in the diluted core scenario, the atmospheric metallicity is expected to evolve due to the removal of the upper layer of the atmosphere by mass loss, as well as convective mixing in the interior that could ultimately reconcile V1298 Tau b with the mature exoplanet population48.
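
The Jeans escape parameter compares the gravitational binding energy of a gas particle with its thermal energy, λ = GMm/(kTR). The sketch below evaluates it for the mass upper limit; the choice of atomic hydrogen as the escaping particle, the 670 K equilibrium temperature and the 0.91 RJ radius are assumptions made for this example, as the inputs behind the quoted value are not listed here.

```python
# Physical constants (SI units)
K_B = 1.380649e-23      # Boltzmann constant, J/K
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
M_H = 1.6735575e-27     # mass of a hydrogen atom, kg
M_EARTH = 5.972e24      # Earth mass, kg
R_JUP = 7.1492e7        # Jupiter equatorial radius, m


def jeans_parameter(mass_kg, radius_m, temp_k, particle_mass_kg=M_H):
    """Jeans escape parameter: gravitational over thermal energy of one particle."""
    return G * mass_kg * particle_mass_kg / (K_B * temp_k * radius_m)


# Assumed inputs: 24 Earth masses, 0.91 Jupiter radii, 670 K, atomic hydrogen
lam = jeans_parameter(24 * M_EARTH, 0.91 * R_JUP, 670.0)
print(f"Jeans escape parameter: {lam:.1f}")
```

With these inputs the parameter comes out close to the quoted value of 27. Since λ scales linearly with mass, the RV-based mass of ~220 M⊕ would give a value roughly nine times larger, which is why the two mass estimates lead to opposite conclusions about mass loss.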

Fig. 4: Evolutionary track for the mass and radius of V1298 Tau b using the mass upper limit derived in this work.

a,b, The mass (a) and radius (b) evolution tracks have been calculated for the first billion years using energy-limited atmospheric evolution models presented in the platypos code75 (see ‘Atmospheric evolution models’ in the Methods). The evolution of the radius is a combined effect of atmospheric contraction and mass loss. Simulations for different values of the mass loss efficiency parameter (0.1, 0.3, 0.5) are shown with different colours. The solid lines show a high stellar activity track (activity timescale 250 Myr) and the dash-dotted lines show a low stellar activity track (activity timescale 100 Myr; for details see ref. 75). The current position of V1298 Tau b is marked by an orange star. The arrow on the orange star in a represents the mass upper limit we use from this work. The error bars on the age and radius of V1298 Tau b represent the uncertainties obtained from ref. 14.
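
For orientation, the energy-limited approximation underlying such tracks can be written as Ṁ = ε π (βR)² F_XUV R / (K G M). The sketch below is a generic textbook form, not the platypos implementation; the XUV flux is a placeholder chosen purely for illustration (it is not a measurement for V1298 Tau), and β = K = 1 is assumed.

```python
import math

G = 6.67430e-11             # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24          # Earth mass, kg
R_JUP = 7.1492e7            # Jupiter equatorial radius, m
SECONDS_PER_MYR = 3.156e13  # seconds in one million years


def energy_limited_mass_loss(f_xuv, radius_m, mass_kg, efficiency,
                             beta=1.0, k_tide=1.0):
    """Energy-limited mass-loss rate in kg/s.

    f_xuv      : XUV flux at the planet, W/m^2
    efficiency : heating efficiency epsilon (0.1-0.5 in the figure)
    beta       : XUV absorption radius in units of the planet radius (assumed 1)
    k_tide     : Roche-lobe correction factor (assumed 1)
    """
    cross_section = math.pi * (beta * radius_m) ** 2
    return efficiency * cross_section * f_xuv * radius_m / (k_tide * G * mass_kg)


F_XUV_PLACEHOLDER = 10.0    # W/m^2; hypothetical value, for illustration only

for eps in (0.1, 0.3, 0.5):
    rate = energy_limited_mass_loss(F_XUV_PLACEHOLDER, 0.91 * R_JUP,
                                    24 * M_EARTH, eps)
    per_myr = rate * SECONDS_PER_MYR / M_EARTH
    print(f"efficiency {eps}: {per_myr:.2e} Earth masses per Myr")
```

The rate scales linearly with the efficiency and inversely with the planet mass, so a lower true mass makes the planet more vulnerable. A real evolution track also follows the decline of the stellar XUV flux and the contraction of the radius with time.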

Self-consistent atmospheric modelling for V1298 Tau b required an extremely high internal temperature (~400 K) and strong vertical mixing to explain the non-detection of methane. The internal temperature (170 K) expected from the early evolution models (Extended Data Fig. 7) is consistent with previous theoretical expectations33. Internal temperatures as high as 300–400 K may require external heating mechanisms such as tidal heating49. Alternatively, photolytic destruction of methane could also potentially produce a methane-poor atmosphere50, which may be feasible given the young age and high activity levels of V1298 Tau. We tested this hypothesis by re-running our self-consistent forward model using a published UV spectrum of V1298 Tau51. However, we found that photochemistry did not impact the methane abundance for V1298 Tau b for pressures higher than \(10^{-4}\) bar, even for an extreme case (1,000× solar XUV flux; Supplementary Fig. 4).

The detection of a strong water vapour absorption feature in the near-infrared (NIR) spectrum of V1298 Tau b allowed us to put a stringent upper limit on its mass as determined analytically from the atmospheric scale height (24 ± 5 M⊕) and from retrievals (23 M⊕ at a 99.73% level). The V1298 Tau system is likely to have formed either late, within the water ice line in a gas-poor, dry and depleting protoplanetary disk, or early in the inner region of the disk with an accretion rate that was probably moderated by disk gas replenishment or enhanced envelope opacity. V1298 Tau b is likely to undergo atmospheric mass loss and could end up as a Neptune, a low-density sub-Neptune or even a super-Earth (Fig. 4). The haze-free, low-metallicity atmosphere of V1298 Tau b, a potential Neptune/sub-Neptune progenitor, is distinct from its hazy/metal-rich mature counterparts. This could be a manifestation of the ongoing evolution of its atmospheric composition. We leveraged the absence of spectral signatures of methane and provide constraints on the internal temperature (>300 K) of the planet, which hints at external heating mechanisms such as tidal heating. Future JWST studies will be important in further investigation of the atmospheric chemistry of this planet.

Methods

Observations

The observations were taken using the HST/WFC3 G141 grism in bidirectional spatial scanning mode, covering a range of 1.1–1.7 μm with a scan rate of 0.23 arcsec s⁻¹. This resulted in 180 exposures over ten HST orbits. The individual pixels reached a maximum flux level of 30,000 electrons, which is roughly 40% of the saturation level and well within the linear response regime of the detector. We used the 256 × 256 pixel subarray and SPARS25, NSAMP=5 readout mode, which resulted in 88.4 s exposures.

Data reduction

We used a data-reduction pipeline for our data analysis23,52. The WFC3 IR detector was read multiple times non-destructively (without flushing out the accumulated charge) during an exposure. First, sub-exposures were formed for each exposure by subtracting consecutive non-destructive reads and each sub-exposure was reduced separately for improved background subtraction and cosmic ray rejection. We calculated a wavelength solution by matching the first exposure of the visit to a convolution of a PHOENIX stellar spectrum53 for V1298 Tau (temperature = 4,920 K (ref. 15)) with the response function of G141.

We applied a wavelength-dependent flat-field correction and flagged as bad any pixels assigned data-quality flags DQ = 4, 32 or 512 by calwf3. We then applied a local median filter to identify cosmic rays, clipping pixels that deviated by more than five median deviations. On average, we found that 0.53% of pixels were affected by cosmic rays for each sub-exposure. To account for the dispersion direction drift of the spectrum, we used the first exposure of a visit as a template and shifted the spectrum for each exposure along the dispersion direction to match the template. The maximum shift that we measured was 0.3 pixels. Finally, we applied an optimal extraction algorithm54 on each sub-exposure to maximize signal-to-noise ratios. We shifted and shrank the spectra of each sub-exposure to match the wavelength grid of the first sub-exposure by a maximum of 1.05 pixels and 0.65%, respectively.

Light curve analysis

WFC3 light curves are known to exhibit strong time-dependent ramp-like (charge-trapping) and visit-long systematics24,27,55. It is known that the first orbit of each visit has stronger systematics than the rest of the visit. Following common practice27, we excluded this orbit from the rest of the analysis. We modelled the white light curve instrumental systematics using the charge-trapping model RECTE56. The out-of-transit baseline is a combination of instrumental visit-long slopes, well known for HST/WFC3 time-series observations27, and rotational variability from the active young host star. We modelled the instrumental systematics, stellar rotational variability and transit shape simultaneously. Visit-long slopes have been modelled using linear functions in time27,57, but the temporal nonlinearity exhibited by the baseline highlights the effects of stellar variability. We tested polynomial functions of first order to fourth order, as well as a sinusoidal function, to model the baseline. A third-order polynomial provided the best fit (lowest Bayesian information criterion (BIC) value) to the observations (Supplementary Table 2). Therefore, we modelled the stellar baseline using a third-order polynomial and the stellar disk using a linear limb-darkening model. The best-fit polynomial function is shown in Extended Data Fig. 1 and shows ~0.3% variability during the entire visit. The planetary transit signal was modelled using batman58; we fixed the orbital parameters to known literature values15,16,17. We ran a Markov chain Monte Carlo (MCMC) using emcee59 to estimate model parameter uncertainties (Supplementary Fig. 2). We found the ninth exposure of the seventh orbit to be affected by a satellite crossing event and excluded this exposure60.
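
The BIC comparison between baseline orders can be illustrated on synthetic data. The sketch below is a simplified stand-in: it fits only a polynomial baseline by least squares to made-up data with a cubic trend, whereas the analysis described above fits the baseline jointly with the RECTE systematics and the transit model. All numerical values are illustrative.

```python
import numpy as np


def bic_polynomial(t, y, sigma, order):
    """BIC = chi^2 + k ln(n) for a least-squares polynomial of the given order."""
    coeffs = np.polyfit(t, y, order)
    residuals = y - np.polyval(coeffs, t)
    chi2 = np.sum((residuals / sigma) ** 2)
    n_params = order + 1
    return chi2 + n_params * np.log(t.size)


# Synthetic baseline with a cubic trend plus white noise (illustrative values)
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 150)
sigma = 1e-4
y = 1.0 + 2e-3 * t - 3e-3 * t**2 + 4e-3 * t**3 + rng.normal(0.0, sigma, t.size)

bics = {order: bic_polynomial(t, y, sigma, order) for order in range(1, 5)}
best_order = min(bics, key=bics.get)
for order, value in bics.items():
    print(f"order {order}: BIC = {value:.1f}")
print(f"preferred order: {best_order}")
```

Each extra coefficient costs ln(n) in BIC, so a higher order is preferred only if it lowers χ² by more than that penalty. The same criterion is behind the ΔBIC values quoted elsewhere in the Methods.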

We generated 7-pixel-bin spectroscopic light curves from the reduced 1D stellar spectra across 17 wavelength channels. We detrended the spectroscopic light curves using a common-mode approach given the deviations from the standard HST instrument systematics (possibly due to stellar activity). The common-mode divide-white approach has been used previously for WFC3 analysis24; it adopts an agnostic approach to the exact mathematical form of the instrument systematics assuming it is independent of wavelength. We modelled the spectroscopic light curves using a batman model and a linear stellar baseline. We fitted for the linear limb-darkening coefficient. The observed white light curve, best-fit transit model and the derived systematics function are shown in Extended Data Fig. 1. The systematics-detrended spectroscopic light curves, along with the residuals, are shown in Extended Data Fig. 2. We also derived the transmission spectrum by fitting each spectroscopic light curve using RECTE and polynomial stellar baseline models, and the derived spectrum agrees within 1σ with the common-mode spectrum. However, the quality of the fits in the common-mode approach was superior. The residual noise in all the spectroscopic channels was less than 1.3× the expected photon noise and the average precision on the extracted transit depths was 47 ppm. The fitted transit depths and linear limb-darkening coefficients are shown in Table 1. The root mean-squared noise is relatively high61; however, this could be a combination of stellar variability, spot crossings and high measured horizontal (x) shifts.
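
The divide-white step can be sketched in a few lines: the systematics are estimated as the ratio of the observed white light curve to its best-fit transit model, and each spectroscopic light curve is divided by that ratio. The example uses synthetic box-shaped transits and a made-up systematics function; the function and variable names are illustrative and not taken from the pipeline.

```python
import numpy as np


def divide_white(spec_flux, white_flux, white_model):
    """Remove wavelength-independent systematics from a spectroscopic light curve."""
    common_mode = white_flux / white_model   # systematics seen in the white light curve
    return spec_flux / common_mode


# Synthetic demonstration (all values illustrative)
t = np.linspace(-0.2, 0.2, 100)                        # time from mid-transit, days
in_transit = np.abs(t) < 0.1
white_model = np.where(in_transit, 1.0 - 0.0050, 1.0)  # box-shaped white transit
spec_model = np.where(in_transit, 1.0 - 0.0054, 1.0)   # slightly deeper channel
systematics = 1.0 + 0.002 * t - 0.001 * np.exp(-(t + 0.2) / 0.05)

white_flux = white_model * systematics
spec_flux = spec_model * systematics
corrected = divide_white(spec_flux, white_flux, white_model)

print("max deviation from true channel model:",
      np.max(np.abs(corrected - spec_model)))
```

The correction is exact only when the systematics are identical in every channel, which is the assumption stated above; any wavelength-dependent residual is left for the linear baseline fitted to each spectroscopic light curve.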

We noted a possible bright spot occultation in the third orbit and also a potential flaring event affecting the latter half of the seventh orbit (Extended Data Fig. 1). To estimate the effect of these exposures on the derived transmission spectrum, we fitted the spectroscopic light curves with and without these exposures. We did not find any change in the derived transmission spectrum and the average residuals decreased by 3 ppm when these exposures were excluded. We concluded that the removal of these exposures did not significantly affect the spectrum. We also tested the effect of the large horizontal drift of the telescope. We incorporated a linear function of x shifts as a correction factor for the white light curve fits, following the approach used in ref. 55. We found ΔBIC = 3 when we included horizontal drift in the fitting algorithm, and hence we concluded that the improvement from including horizontal drifts was not statistically significant.

Accounting for stellar activity

V1298 Tau is a young pre-main-sequence star, known to exhibit 2–3% variability in Kepler and Transiting Exoplanet Survey Satellite light curves15,16,17. Variability in such young stars can be attributed to photospheric inhomogeneity (star spots and faculae) and fast stellar rotation. Unocculted star spots can contaminate the observed transmission spectrum62. We estimated the effect of stellar contamination on the transmission spectrum of V1298 Tau b following the prescription of ref. 11. We adopted a surface inhomogeneity model (20% spot coverage) for V1298 Tau from ref. 63. Photospheric temperature contrasts have been studied for T Tauri stars64; stars with photospheric temperatures similar to V1298 Tau can have spot temperature contrasts of up to 1,000 K. We estimated an extreme case contamination spectrum for V1298 Tau assuming 20% spot coverage and a 1,000 K spot temperature contrast. The stellar-contamination correction function is shown in Supplementary Fig. 5. As V1298 Tau is a K-type star, even extreme temperature contrasts do not result in the appearance of water absorption features in the contamination spectrum, which can be the case for cooler M dwarfs62. Therefore, the mass measurement, which depends on the transit depth difference between different spectroscopic channels, is relatively unaffected. A comparison between the corrected and uncorrected spectra is shown in Supplementary Fig. 5. We re-ran retrievals on the contamination-corrected transmission spectrum. The retrievals were identical in set-up to the uncorrected case (see ‘Atmospheric models’). The posterior distributions of the parameters are shown in Extended Data Fig. 5 with the posteriors from the uncorrected spectrum. All of the parameters agreed for both cases within 1σ. The retrieved atmospheric metallicity in the corrected case favoured more sub-solar values compared with the uncorrected case, thereby confirming the robustness of the conclusions drawn in this work.
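
For unocculted spots, the contamination factor that multiplies the true transit depth can be written as ε = 1 / [1 − f_spot (1 − F_spot/F_phot)]. The sketch below evaluates it across the G141 bandpass using blackbody spectra, which is a simplifying assumption made here (stellar model spectra would be needed to capture molecular features); the 4,920 K photosphere, 1,000 K contrast and 20% coverage follow the extreme case described above.

```python
import numpy as np

H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def planck(wavelength_m, temp_k):
    """Blackbody spectral radiance B_lambda(T) in SI units."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return 2.0 * H * C**2 / wavelength_m**5 / np.expm1(x)


def contamination_factor(wavelength_m, t_phot, t_spot, spot_fraction):
    """Factor by which unocculted spots inflate the observed transit depth."""
    flux_ratio = planck(wavelength_m, t_spot) / planck(wavelength_m, t_phot)
    return 1.0 / (1.0 - spot_fraction * (1.0 - flux_ratio))


# Extreme case: 20% coverage, spots 1,000 K cooler than a 4,920 K photosphere
wavelengths = np.linspace(1.12e-6, 1.65e-6, 17)   # metres
epsilon = contamination_factor(wavelengths, 4920.0, 3920.0, 0.20)
for wl, eps in zip(wavelengths[::4], epsilon[::4]):
    print(f"{wl * 1e6:.2f} um: epsilon = {eps:.4f}")
```

In this blackbody approximation the factor varies smoothly and monotonically with wavelength, so it rescales and slightly tilts the spectrum without creating a band-shaped feature near 1.4 μm.
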
The retrieved parameters are shown in Table 2. The effect of stellar absorption has been seen in the limb-darkening coefficients24. We set the limb-darkening coefficients as free parameters while fitting for the spectroscopic light curves; the results are provided in Table 1. The limb-darkening coefficients did not show any effect of stellar absorption. To further confirm that the water absorption feature we found in the spectrum of V1298 Tau b around 1.4 μm is of planetary origin, we defined a quantity B as the ratio of the flux observed in two wavelength bands:

$$B=\frac{\int\nolimits_{{\lambda }_{1}}^{{\lambda }_{2}}F(\lambda )\,{\mathrm{d}}\lambda }{\int\nolimits_{{\lambda }_{3}}^{{\lambda }_{4}}F(\lambda )\,{\mathrm{d}}\lambda }$$
(1)

where F is electrons per unit wavelength in the 1D extracted spectra of our reduced exposures, λ1 and λ2 give us lower and upper limits of the first wavelength band and λ3 and λ4 give us lower and upper limits of the second wavelength band. We calculated B for all of the exposures, first using the wavelengths 1.25–1.35 μm (left end of the water feature) and 1.45–1.55 μm (right end of water feature) (top panel in Supplementary Fig. 6) and subsequently using 1.35–1.45 μm (centre of water feature) and 1.45–1.55 μm (bottom panel in Supplementary Fig. 6). For the latter case we found an excess absorption during the transit of the planet, which indicates that the water absorption is of planetary origin.
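
Equation (1) amounts to a ratio of two band-integrated fluxes computed for every exposure. The sketch below implements it with a simple trapezoidal rule on a synthetic spectrum; the flat spectrum and the 1% in-band dip are illustrative stand-ins for real extracted spectra, and the helper names are hypothetical.

```python
import numpy as np


def band_integral(wavelength, flux, lower, upper):
    """Trapezoidal integral of flux over lower <= wavelength <= upper."""
    mask = (wavelength >= lower) & (wavelength <= upper)
    w, f = wavelength[mask], flux[mask]
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w))


def band_ratio(wavelength, flux, band_a, band_b):
    """B = integrated flux in band_a divided by integrated flux in band_b."""
    return (band_integral(wavelength, flux, *band_a)
            / band_integral(wavelength, flux, *band_b))


wavelength = np.linspace(1.1, 1.7, 601)             # micrometres
out_of_transit = np.ones_like(wavelength)           # synthetic flat spectrum
in_transit = out_of_transit.copy()
in_transit[(wavelength > 1.36) & (wavelength < 1.44)] *= 0.99   # extra absorption

centre, right = (1.35, 1.45), (1.45, 1.55)
b_out = band_ratio(wavelength, out_of_transit, centre, right)
b_in = band_ratio(wavelength, in_transit, centre, right)
print(f"B out of transit: {b_out:.4f}, B in transit: {b_in:.4f}")
```

A drop in B during transit for the band centred on the water feature, with no matching drop for the band on its wing, is the signature described above.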

Atmospheric models

We used the publicly available 1D radiative transfer code PetitRADTRANS to retrieve the atmospheric properties of V1298 Tau b from its observed transmission spectrum. The transmission spectrum did not show a methane absorption signature around 1.6 μm as would be expected for a warm planet like V1298 Tau b based on equilibrium chemistry. The lack of methane can be explained by disequilibrium processes such as vertical mixing33,65 dredging up methane-poor gas from the hot interior parts of the atmosphere. In our retrieval framework, we modelled this effect using a quenching pressure: the volume mixing ratios of C-, H-, O- and N-bearing molecules were calculated using PetitRADTRANS, but the molecular concentrations were held constant above the quench point (that is, at pressures lower than the quenching pressure). We modelled the atmospheric thermal structure with a Guillot TP profile32 shown in equation (2), where Tequ and Tint are the equilibrium and internal temperatures of the planet, respectively. κIR is the average IR atmospheric opacity and γ is the ratio between optical and IR opacity. τ is the optical depth and g is acceleration due to gravity. We constrained the models by fixing the values of both κIR (to 0.01 cm² g⁻¹) and γ (to 0.01), assuming the atmospheric opacity in the observed bandpass to be water dominated. We included H2O, CH4, CO2 and CO opacities in our retrieval framework, as these molecular species have absorption features in the NIR66. We did not include HCN or NH3 opacities in our retrievals as we did not find evidence of these species in free retrievals (Supplementary Fig. 1). We assumed a grey cloud deck opacity model to simulate cloud absorption.

$$\begin{array}{ll}\displaystyle{T}^{4}=\frac{3{T}_{\mathrm{int}}^{\,4}}{4}\left(\frac{2}{3}+\tau \right)+\frac{3{T}_{\mathrm{equ}}^{\,4}}{4}&\displaystyle\left[\frac{2}{3}+\frac{1}{\gamma \sqrt{3}}+\left(\frac{\gamma }{\sqrt{3}}-\frac{1}{\gamma \sqrt{3}}\right){\mathrm{e}}^{-\gamma \tau \sqrt{3}}\right],\\ &\tau =P{\kappa }_{\mathrm{IR}}/g\end{array}$$
(2)

We fixed the mass of the planet to 24 M⊕, the upper limit estimated analytically from the scale height. The free parameters in our models were the atmospheric metallicity, the C/O ratio, Rp, Tequ, Tint, Pquench and Pcloud. We ran an MCMC with 50 walkers, 3,000 burn-in steps and 30,000 post-burn-in steps. We imposed uniform priors on all fitted parameters. The posterior distribution of the fitted parameters is shown in Extended Data Fig. 5. We retrieved a sub-solar/solar metallicity. The retrieved equilibrium temperature was consistent with the expected equilibrium temperature of the planet. The retrieved parameters are summarized in Table 2.

We tested the importance of the internal temperature by fixing the internal temperature to 0 K (that is, fitting for an isothermal atmosphere). Models with high internal temperatures were statistically favoured by ΔBIC = 50. Comparing 0 K interior temperature models with 300 K interior temperature models (Fig. 1), we concluded that high internal temperature models are preferred at 3σ. We also performed free-chemistry retrievals using an isothermal atmosphere (Supplementary Fig. 1). This yielded an upper limit to the methane volume mixing ratio in the atmosphere (~10⁻⁶) that is lower than the detection threshold for HST33, thereby independently confirming the non-detection of methane. The free retrieval did not yield evidence for other molecular species such as HCN or NH3, putting upper limits of 10⁻⁶ on their volume mixing ratios. We explored the effect of fixing the planet’s mass in Extended Data Fig. 4 and the following section.
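The model comparison above relies on the Bayesian information criterion. As a minimal sketch, assuming Gaussian uncertainties so that BIC = χ² + k ln n up to a model-independent constant, the comparison can be written as follows. The function names and the χ² values, parameter counts and number of data points are hypothetical placeholders, not values from this analysis.

```python
import math


def bic(chi2, n_params, n_points):
    """Bayesian information criterion for Gaussian errors (constant term dropped)."""
    return chi2 + n_params * math.log(n_points)


def delta_bic(chi2_a, k_a, chi2_b, k_b, n_points):
    """BIC(model A) - BIC(model B); a positive value favours model B."""
    return bic(chi2_a, k_a, n_points) - bic(chi2_b, k_b, n_points)


# Hypothetical placeholder numbers: model A has Tint fixed (one parameter fewer),
# model B fits Tint freely.
d = delta_bic(chi2_a=80.0, k_a=6, chi2_b=25.0, k_b=7, n_points=28)
print(f"delta BIC = {d:.1f}")
```

A ΔBIC above about 10 is conventionally read as very strong evidence against the higher-BIC model, so the extra free parameter is justified only if it lowers χ² by more than ln n.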

We constructed self-consistent atmospheric models with varying internal temperatures to study the quenching of methane and CO in the deep atmosphere. We computed the TP profile using petitCODE67,68, assuming radiative–convective equilibrium. Irradiation onto the planet was computed assuming planet-wide energy redistribution, with a host star effective temperature and radius of 4,970 K and 1.31 R⊙, a semi-major axis of 0.1688 au and planetary intrinsic temperatures of 100–400 K. Guided by our retrievals, we adopted a solar metallicity with a slightly sub-solar C/O of 0.3. We achieved this C/O by reducing the carbon abundance from its solar value. The resulting temperature profiles are shown in Supplementary Fig. 7. Subsequently, we used a 1D chemical kinetics model69 in combination with a photochemical network70 to calculate self-consistent vertical quenching pressures for the main atmospheric species. We performed our calculations with a constant eddy diffusion coefficient (Kzz) of 10¹⁰ cm² s⁻¹. This value, although high, is in line with the expected values for convective mixing in giant planets and brown dwarfs71,72,73. We included photochemistry in our models, but we found that it did not substantially affect the molecular abundances at pressures typically probed by transmission spectroscopy. We tested the effect of a higher XUV flux of the host star by computing models for scaled solar spectra (10–1,000×; Supplementary Figs. 3 and 4). The resulting chemical disequilibrium abundances for methane, CO and water are shown in Supplementary Fig. 4. We found that the planet should have a high internal temperature (~300–400 K) for the carbon chemistry to be dominated by CO. This is consistent with the high internal temperature and deep quenching concluded from the retrieval analysis.
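For orientation, the mixing timescale implied by this Kzz can be estimated to order of magnitude as t_mix ≈ L²/Kzz, with quenching expected near the level where the chemical timescale of a species becomes comparable to t_mix. The sketch below sets the mixing length L equal to a scale height of 1,000 km (the value estimated from the spectrum in the ‘Mass estimate’ section, which need not hold in the deep atmosphere). This choice and the function name are assumptions, and the estimate is no substitute for the kinetics model used here.

```python
SECONDS_PER_DAY = 86400.0


def mixing_timescale_s(mixing_length_cm, kzz_cm2_s):
    """Order-of-magnitude vertical mixing timescale, t_mix ~ L^2 / Kzz, in seconds."""
    return mixing_length_cm**2 / kzz_cm2_s


scale_height_cm = 1.0e8   # 1,000 km, assumed equal to the mixing length
kzz = 1.0e10              # cm^2 s^-1, the constant value adopted in the text

t_mix = mixing_timescale_s(scale_height_cm, kzz)
print(f"t_mix ~ {t_mix:.1e} s ({t_mix / SECONDS_PER_DAY:.1f} days)")
```

Under these assumptions the mixing timescale is about 10⁶ s, or roughly 12 days, so any reaction slower than this at a given level cannot keep the gas in local chemical equilibrium.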

Mass estimate

We ran an atmospheric retrieval keeping the mass as a free parameter. The posterior distribution is shown in Extended Data Fig. 3. We found upper limits on the planet mass of 14, 19 and 23 M⊕ at 68%, 95% and 99.73% confidence, respectively. In this case, the metallicity had an upper limit of the solar value at 95% confidence. The 3σ mass upper limit derived from the free-mass fit is consistent with the analytical calculation; however, the free-mass fit favoured lower masses (10 M⊕ from the posterior distribution peak; Extended Data Fig. 3). Overall, even for planet masses lower than 24 M⊕, given the Jovian size of the planet, the atmosphere is expected to be H/He dominated and the atmospheric chemistry to remain unaffected. Therefore, the conclusions of a mass lower than 24 M⊕ and a solar/sub-solar atmospheric metallicity are further supported by this test.

We estimated the mass of V1298 Tau b (Mp) from the transmission spectrum using an analytical approach described in ref. 28.

$${M}_{\mathrm{p}}=\frac{kT{R}_{\mathrm{p}}^{2}}{\mu GH}$$
(3)

We used the radius measurement from K2 (0.91 ± 0.05 RJ; ref. 14) and an equilibrium temperature of 670 K for the calculation. Here, k is the Boltzmann constant, T is the atmospheric temperature, G is the universal gravitational constant, μ is the mean molecular weight, which is assumed to be 2.33 for a primordial H/He atmosphere, and H is the atmospheric scale height of the planet. We estimated the scale height from the observed spectrum of V1298 Tau b. The height of an atmosphere can be estimated using equation (1) of ref. 74:

$$z({\lambda }_{2})-z({\lambda }_{1})=H\ln\left(\frac{\sigma ({\lambda }_{2})}{\sigma ({\lambda }_{1})}\right)$$
(4)

In equation (4), σ is the absorption cross-section at a given wavelength and z is the measured radius of the planet at a given wavelength. We estimated 2.7 scale heights to account for the 1.4 μm water absorption feature, assuming a water-dominated atmospheric opacity and a cloud-free atmosphere. Given the young age and inflated size, we assumed a primordial H/He-rich atmosphere and fixed the mean molecular mass to 2.33. We found a large atmospheric scale height for V1298 Tau b of 1,000 ± 200 km and a mass estimate of 24 ± 5 M⊕. The reported radius for V1298 Tau b differs between K2 (0.91 ± 0.05 RJ; ref. 14) and the Transiting Exoplanet Survey Satellite (0.85 ± 0.03 RJ; ref. 17), which is explained by the stellar variability. From the present HST white light curve, we measured a radius of 0.84 ± 0.003 RJ for this planet. We repeated the analytical mass calculation using the HST radius measurement (18 ± 4 M⊕). The mass estimates for different radii of this planet are shown in Supplementary Table 1. The mass estimates are consistent within 1σ for the K2 and HST radii. As V1298 Tau is a variable star, absolute transit depths are expected to vary between epochs, biasing radius measurements. We therefore adopted the radius from K2 observations, which was obtained from multiple transits (three) and had a long baseline coverage (80 days) for effective modelling of the stellar variability. However, the scale height measurement of V1298 Tau b from our observation should be unaffected, as it is a relative measurement and independent of the continuum. Given that V1298 Tau is a K-type star, the atmospheric water absorption is relatively unattenuated by spot contamination (see ‘Accounting for stellar activity’ for further details). We tested this by analytically estimating the planet mass from an extreme-case stellar-contamination-corrected spectrum (for a comparison between uncorrected and stellar-contamination-corrected spectra, see Supplementary Fig. 6). This yielded a mass estimate of 23 ± 5 M⊕, consistent with our previous estimate.
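Equation (3) can be checked numerically with the values quoted above. The sketch below uses T = 670 K, μ = 2.33, H = 1,000 ± 200 km and the K2 radius of 0.91 ± 0.05 RJ. The physical constants, the function names and the first-order error propagation (which ignores any uncertainty on T and μ) are assumptions of this example rather than the procedure of ref. 28.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J K^-1
G = 6.6743e-11          # gravitational constant, m^3 kg^-1 s^-2
M_U = 1.66053907e-27    # atomic mass unit, kg
R_JUP = 7.1492e7        # nominal equatorial Jupiter radius, m
M_EARTH = 5.9722e24     # Earth mass, kg


def mass_from_scale_height(temp_k, radius_rjup, scale_height_km, mu=2.33):
    """Planet mass in Earth masses from equation (3): Mp = k T Rp^2 / (mu G H)."""
    radius_m = radius_rjup * R_JUP
    height_m = scale_height_km * 1.0e3
    mass_kg = K_B * temp_k * radius_m**2 / (mu * M_U * G * height_m)
    return mass_kg / M_EARTH


def mass_uncertainty(mass, radius, radius_err, height, height_err):
    """First-order propagation for Mp proportional to Rp^2 / H (T and mu held fixed)."""
    frac = math.sqrt((height_err / height) ** 2 + (2.0 * radius_err / radius) ** 2)
    return mass * frac


mass = mass_from_scale_height(670.0, 0.91, 1000.0)
err = mass_uncertainty(mass, 0.91, 0.05, 1000.0, 200.0)
print(f"Mp = {mass:.1f} +/- {err:.1f} Earth masses")
```

With these inputs the result falls within the quoted 24 ± 5 Earth masses; the exact value shifts by about one Earth mass depending on the Jupiter radius convention adopted (equatorial or volumetric mean). Because Mp scales as Rp²/H, a smaller adopted radius lowers the mass estimate at fixed scale height.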

This estimate can be interpreted as an upper limit, given the assumption of a clear atmosphere. In the presence of aerosols (for example, clouds, hazes) in the atmosphere, the measured scale height from the spectrum would be underestimated, therefore leading to an overestimation of the mass. Given the observed spectrum, a cloud-free case would therefore yield the maximum possible mass for this planet.

To estimate the impact of the mass uncertainty on the atmospheric parameters, we ran retrievals on the observed transmission spectrum with the temperature fixed to 670 K for different masses (24, 15, 10 and 5 M⊕) and for both cloud-free and cloudy cases (Extended Data Fig. 4). We included the same molecules as in the 24 M⊕ case, as molecular opacities do not depend on the planet’s gravity. We were able to fit the observations with the 24, 15 and 10 M⊕ models. The 24 M⊕ (mass upper limit) model converged at solar atmospheric metallicity; for lower-mass models (10–20 M⊕) our retrievals converged at even lower (0.01–0.1 × solar) metallicities to fit the water absorption feature. However, for the 5 M⊕ case, our retrievals did not converge as they could not reproduce the water absorption signal. We tested the robustness of the estimated mass upper limit by running a retrieval with 40 M⊕. This model failed to reproduce the observed water feature and could be rejected at high confidence.

Thus, from the transmission spectrum we could estimate a robust mass upper limit and conclude that V1298 Tau b is likely to be a Neptune, a low-density sub-Neptune or even a super-Earth progenitor75.

Atmospheric evolution models

The atmospheric evolution models shown in Fig. 4 were simulated using the open-source platypos code75. The code calculated the mass loss rate at a given point in time using the energy-limited mass loss formalism76,77 and evolved the planet’s physical properties (mass and radius) at every step of the calculation. The radius evolution was a combined effect of atmospheric contraction and mass loss, and the updated sizes of the planets were calculated from the scaling relation given in ref. 38. We adopted the stellar luminosity from ref. 75. These simulations were performed considering the estimated mass upper limit (24 M⊕). For lower masses, we can expect higher mass loss rates.
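The mass dependence noted above follows directly from the energy-limited formula. A simplified sketch, assuming that the XUV absorption radius equals the planetary radius and neglecting the Roche-lobe correction factor, gives a mass loss rate of ε π FXUV Rp³ / (G Mp). The heating efficiency ε = 0.1 and the XUV flux used below are illustrative placeholders, not values from this work, and the function is not part of the platypos interface.

```python
import math

G_CGS = 6.6743e-8       # gravitational constant, cm^3 g^-1 s^-2
M_EARTH_G = 5.9722e27   # Earth mass, g
R_JUP_CM = 7.1492e9     # nominal equatorial Jupiter radius, cm


def energy_limited_mass_loss(f_xuv, mass_earth, radius_jup, efficiency=0.1):
    """Simplified energy-limited mass loss rate in g s^-1.

    Assumes R_XUV = Rp and no Roche-lobe correction (both simplifications).
    f_xuv is the XUV flux at the planet in erg s^-1 cm^-2.
    """
    mass = mass_earth * M_EARTH_G
    radius = radius_jup * R_JUP_CM
    return efficiency * math.pi * f_xuv * radius**3 / (G_CGS * mass)


F_XUV = 1.0e4  # erg s^-1 cm^-2, illustrative placeholder only

rates = {}
for mass_earth in (24.0, 15.0, 10.0):
    rates[mass_earth] = energy_limited_mass_loss(F_XUV, mass_earth, 0.91)
    print(f"{mass_earth:4.0f} Earth masses: {rates[mass_earth]:.2e} g/s")
```

At fixed radius and flux the rate scales as 1/Mp in this approximation, so a 10 Earth-mass planet loses mass 2.4 times faster than a 24 Earth-mass planet of the same size.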

Comparison with a previous study

V1298 Tau b was included in a sample of 70 transiting exoplanets whose spectra are presented in ref. 78. Edwards et al. (ref. 78) used a different pipeline (Iraclis55) for the data reduction. They also used a common-mode approach to derive the spectrum of this planet. The transmission spectrum obtained in this work is consistent within 1σ with the results of ref. 78, except for a constant offset of ~500 ppm. This offset arises because ref. 78 included the third orbit in their white light curve fits, which we chose to exclude because of a potential spot crossing in that orbit. We tested the effect of including the third orbit in the white light curve fits and found ΔBIC = 170 in favour of excluding it. The transmission spectra obtained in ref. 78 and in this work are shown together for comparison in Supplementary Fig. 6.