Introduction

Nitrogen dioxide (NO2) pollution is a significant environmental concern stemming from sources such as vehicle emissions, industrial processes, and combustion. This gas belongs to the nitrogen oxides (NOx) family and contributes to poor air quality, leading to respiratory issues and environmental damage. NO2 reacts in the atmosphere to form harmful particles and ozone, impacting human health and ecosystems, and even contributing to climate change1,2,3,4,5,6. Specifically, emissions of NOx play a key role in creating photochemical smog, triggering acid rain, and causing ecological harm in water reservoirs7. Furthermore, high NOx levels also elevate O3 concentrations, adversely affecting agriculture. Consequently, monitoring and reducing NO2 levels are critical for mitigating its adverse effects on both human health and the environment. Strict regulations have been implemented to control NO2 levels, such as the CAFE Directive, which sets an annual average limit of 40 µg/m3 and permits hourly concentrations above 200 µg/m3 for no more than 18 h per year8. The World Health Organization (WHO) has proposed even stricter limits9. However, about one-sixth of European monitoring stations report NO2 levels that exceed these limits, especially in urban zones and along transportation corridors. The economic toll of air pollution, including NO2, is substantial2,10.

Traditional methods for NO2 monitoring rely on stationary and bulky equipment, demanding controlled environments and regular maintenance. Commonly used measurement approaches encompass photofragment chemiluminescence11, long-range differential optical absorption spectroscopy12, laser-induced fluorescence13, and cavity ring-down spectroscopy14. Although these methods exhibit high sensitivity, some present limitations (e.g., unsuitability for localized monitoring12) or require intricate hardware (e.g., a vacuum system and a pulsed laser13). These deficiencies in traditional monitoring systems have driven the development of alternative methods that are cost-effective, easily deployable, and straightforward to maintain. In recent years, considerable research effort has been directed towards the development of portable platforms, which may enhance the spatial resolution of air quality monitoring. The latter is essential for urban areas with diverse pollutant distributions15,16,17. Nevertheless, low-cost sensors encounter reliability limitations18,19,20 due to instability21, fabrication inaccuracies22,23, and cross-sensitivity to multiple gases24,25,26. They are also sensitive to environmental conditions, especially temperature and humidity27,28. In spite of these constraints, affordable sensors may complement sparsely positioned reference stations and serve as cost-efficient air quality monitoring solutions29. They may also become foundations of integrated sensor networks30,31, including those deployed on cars or aerial vehicles32,33.

Enhancing the reliability of low-cost sensors has been a focal point in research, primarily focusing on refining calibration methods. These techniques are typically categorized into two types: laboratory-based and field-based34. While laboratory procedures are more precise in theory, they often fall short in practice as the actual operating conditions of sensors seldom align with controlled laboratory settings18,19. Consequently, field-based techniques are more prevalent, relying on reference data collected from public air monitoring stations. Numerical modelling for calibration typically involves either rudimentary regression techniques or more advanced machine learning approaches. In Ref.35, methods such as multivariate linear regression (MLR), support vector regression (SVR), and random forest regression (RFR) were employed to calibrate electrochemical NO and NO2 sensors based on temperature and humidity data. A study presented in Ref.36 utilized ridge regression, random forest regression (RFR), Gaussian process regression (GPR), and MLR to correct low-cost NO2 and PM10 sensors based on temperature and humidity. In Ref.37, calibration of a chemiluminescence NO-NO2-NOx analyser using MLR was showcased, also integrating temperature and humidity data. Further investigations into diverse regression models have been reported in Refs.38,39,40.

In recent times, there has been a surge in interest in employing artificial intelligence methods, specifically neural networks (NNs) and diverse machine learning techniques, to achieve more dependable correction of low-cost sensors. For instance, Ref.29 employed single linear regression (SLR), multivariate linear regression (MLR), random forest regression (RFR), and long short-term memory networks (LSTM) for calibrating CO, NO2, O3, and SO2 sensors, noting LSTM's superior performance compared to regression procedures. Meanwhile, in Ref.15, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) were used to calibrate CO and O3 sensors using temperature and humidity data, showcasing advantages over linear regression (LR), SVR, or LSTM combined with CNN. Extensive literature, as observed in Refs.41,42,43,44, showcases the application of various ANN surrogates, e.g. Bayesian NNs, shallow NNs, or dynamic NNs for low-cost sensor calibration.

In this research, we introduce an innovative method for precise calibration of affordable NO2 sensors. The technique revolves around statistical preprocessing of low-cost sensor data to align its distribution with reference data before further refinement. Central to this approach is an artificial neural network (ANN) surrogate, tailored to predict sensor correction coefficients that encompass additive adjustment and multiplicative scaling. The surrogate model is trained using environmental variables (temperature, humidity, atmospheric pressure), data cross-referenced from auxiliary NO2 sensors, and short time series of previous readings from the primary sensor. Global data scaling is also integrated as an additional calibration mechanism. To validate our calibration methodology, we applied it to a custom-designed autonomous monitoring platform equipped with NO2 and environmental detectors, supported by electronic circuitry for monitoring implementation and data transfer protocols. Reference data was collected over five months from high-precision public stations in Gdansk, Poland. The results demonstrate exceptional calibration efficacy, achieving a correlation coefficient close to 0.95 with reference data and an extremely low RMSE below 2.4 µg/m3, even within a broad NO2 measurement range (from 0 to 60 µg/m3). Additional experiments conducted with different sets of surrogate model inputs and by excluding certain algorithmic tools highlight the vital role of each mechanism within the calibration framework, reaffirming their significance in enhancing correction quality.

Autonomous NO2 monitoring platform

The article will showcase the sensor calibration methodology implemented on a custom-designed autonomous monitoring platform developed at Gdansk University of Technology, Poland. Section "Hardware description" details the hardware specifications, while Section "Monitoring platform: output data" delves into the data output from the platform's sensors.

Hardware description

The system is a comprehensive setup comprising multiple sensors for monitoring environmental factors such as temperature, humidity, and atmospheric pressure. It integrates a primary nitrogen dioxide sensor and two redundant sensors for cross-validation purposes. Furthermore, it includes a GSM modem for wirelessly transmitting measurement data to the cloud. The air quality monitoring protocols are managed by off-the-shelf components coordinated by the BeagleBone® Blue microprocessor system45, which houses a 1 GHz ARM® Cortex-A8 processor, 512 MB DDR3 RAM, and 4 GB eMMC memory, operating on the Linux OS.

The system relies on a rechargeable 7.4 V/4400 mAh battery capable of sustaining operations for at least twenty hours without external power sources. The block diagram of the platform, featuring sensor details, is illustrated in Fig. 1. Data transmission occurs via the GSM modem, making the measurement data available online. The system is mounted on a polyethylene terephthalate base plate, as depicted in Fig. 2. The gas sensors (ST, SGX, MICS) are closely positioned (see Fig. 2a) along with environmental detectors monitoring their operational conditions. An auxiliary environmental sensor is placed at the device's edge.

Figure 1
figure 1

Autonomous air monitoring platform designed at Gdansk University of Technology, Poland: (a) block diagram, (b) included sensors46,47,48,49.

Figure 2
figure 2

Autonomous monitoring platform designed at Gdansk University of Technology, Poland: (a) internals (top view), (b) internals (bottom view), (c) systems mounted in weather-proof enclosure.

The employment of auxiliary sensors serves to address variations between external and internal temperatures and humidity, primarily influenced by heat generated by the electronic circuitry. An Intel USB Stick module is also installed for potential on-board execution of calibration procedures. The platform is accommodated in a weatherproof enclosure, cf. Fig. 2c.

Monitoring platform: output data

The monitoring platform, detailed in Section "Hardware description", gathers NO2 measurements from the primary sensor and two redundant sensors, along with environmental sensor data (internal and external temperature, humidity, and atmospheric pressure). Figure 3a visually represents these outputs, while Fig. 3b introduces the notation used in this study. It is crucial to note that this platform captures environmental parameters both within the system (close to the NO2 sensors) and externally (at the edge of the platform). The variations in internal and external temperature and humidity stem from the heat produced by the electronic circuitry. Given the influence of these parameters on sensor performance, incorporating both sets of temperature and humidity data can significantly enhance the reliability of the calibration process. Additionally, although the accuracy of the auxiliary NO2 sensors within the platform is limited, their readings offer indirect yet valuable insights into the factors affecting the primary sensor, notably its cross-sensitivity to other gases.

Figure 3
figure 3

Outputs of the low-cost monitoring platform of Section "Hardware description": (a) NO2 reading from the low-cost sensor under calibration (ys). The sensor also produces auxiliary outputs: auxiliary NO2 readings (S1 and S2), outside and inside temperature (To and Ti, respectively), outside and inside humidity (Ho and Hi, respectively), and atmospheric pressure (P); (b) symbols of data produced by the platform’s sensors. The number N stands for the total number of data samples obtained from the platforms, further divided into training and testing sets (cf. Section "Precise sensor calibration using statistical pre-processing, ANN surrogates, and global data scaling").

Reference data. Public monitoring stations

The calibration process for the low-cost sensor will utilize reference data obtained from high-precision public monitoring stations strategically located in Gdansk, Poland, operated by the ARMAG Foundation50. The geographical distribution of these stations is illustrated in Fig. 4a. The stations are housed within air-conditioned containers and are equipped with high-performance air monitoring instruments, detailed in Fig. 4b. The specific sensors used for NO-NO2-NOx measurements are listed in Fig. 4c. ARMAG provides open access to the generated data on their website (https://armaag.gda.pl/en/). Measurements are carried out hourly and are accessible on the foundation’s website for a duration of three days. To enable extended data collection periods, a custom script has been prepared, which allows automated download of this information into a text file hosted on a dedicated server.
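The accumulation step of such a script can be sketched as below. The actual download endpoint and file format used by the authors are not specified here, so this hypothetical helper operates on already-parsed (timestamp, value) pairs; its only task is the merge logic implied by the text: since the website exposes only the last three days of hourly readings, periodic downloads overlap and must be deduplicated.

```python
def merge_hourly_rows(existing, downloaded):
    # The station website exposes only the last three days of hourly
    # readings, so periodic downloads overlap; keep each timestamp once
    # and keep the accumulated record sorted chronologically.
    seen = {ts for ts, _ in existing}
    merged = list(existing)
    for ts, value in downloaded:
        if ts not in seen:
            merged.append((ts, value))
            seen.add(ts)
    merged.sort(key=lambda row: row[0])
    return merged
```

A daily cron job would then append the merged rows to the text file on the dedicated server.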

Figure 4
figure 4

Reference monitoring stations of the ARMAG foundation used to acquire reference data: (a) station locations in the city of Gdansk, (b) photograph of the selected station with the proposed low-cost platform mounted in the vicinity, (c) NOx sensors installed on the stations.

Precise sensor calibration using statistical pre-processing, ANN surrogates, and global data scaling

This section delineates the comprehensive methodology devised for the calibration of low-cost NO2 sensors. The task of correcting the sensor is formulated in Section "Sensor calibration. Problem statement". Further details regarding the affine correction scheme are provided in Section "Additive and multiplicative low-cost sensor correction". Section "Statistical pre-processing of low-cost sensor measurements" delves into the statistical pre-processing of data, designed to enhance the initial alignment between the outputs of the reference and low-cost sensors. An in-depth exploration of the primary calibration model, an artificial neural network (ANN) surrogate, is presented in Section "Sensor calibration using neural network surrogate". The various configurations of inputs to the ANN model are elucidated in Section "Calibration model inputs". These encompass fundamental environmental parameters and redundant NO2 sensor readings (Section "Calibration input configuration I: basic setup"), expanded sets incorporating differentials (Section "Calibration input configuration II: differentials"), and time-series-based inputs comprising prior NO2 measurements from the primary sensor (Section "Calibration input configuration III: time series of prior NO2 measurements"). Additionally, Section "Global data scaling" discusses an auxiliary calibration mechanism, specifically global data scaling. The comprehensive workflow for NO2 monitoring utilizing the calibrated low-cost sensor is elucidated in Section "Operating flow of NO2 monitoring by means of calibrated sensor".

Sensor calibration. Problem statement

Sensor calibration is based on two datasets. The first one comprises NO2 readings obtained from the reference stations, as outlined in Section "Reference data. Public monitoring stations". The respective samples will be denoted as yr(j), j = 1, …, N, where N is the total number of points. The datasets obtained from the autonomous platform described in Section "Autonomous NO2 monitoring platform", i.e., {ys(j)} and the respective environmental parameter vectors {zs(j)} (cf. Fig. 3), are in correspondence with {yr(j)}, i.e., the respective outputs are collected at the same time intervals. Figure 5 elucidates the division of this data into training and testing sets. The testing set consists of several two-week sequences gathered at different time intervals during the five-month measurement campaign, as elaborated in Section "Results and discussion".
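A minimal sketch of this division, assuming hourly samples and hypothetical block start indices (the actual test sequences are specified in Section "Results and discussion"):

```python
import numpy as np

def split_train_test(n_samples, test_starts, block_len):
    # Several contiguous blocks (e.g., two-week sequences of hourly data,
    # block_len = 14 * 24) form the testing set; everything else is training.
    test_mask = np.zeros(n_samples, dtype=bool)
    for start in test_starts:
        test_mask[start:start + block_len] = True
    idx = np.arange(n_samples)
    return idx[~test_mask], idx[test_mask]
```

Keeping the test blocks contiguous (rather than sampling randomly) prevents temporally adjacent, strongly correlated samples from leaking between the sets.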

Figure 5
figure 5

Division of the reference and low-cost sensor data into training and testing sets.

Sensor calibration is realized using the training datasets {yr0(j)}, {ys0(j)}, and {zs0(j)}, j = 1, …, N0 (cf. Fig. 5). The correction coefficients are jointly denoted as C(ys,zs;p), cf. Fig. 6, where p stands for the combined calibration model hyper-parameters. The corrected sensor’s output is denoted as yc = FCAL(ys,C(ys,zs;p)). Based on this terminology, the calibration problem is posed as a nonlinear minimization task.

$${\mathbf{p}}^{*} = \arg \mathop {\min }\limits_{{\mathbf{p}}} \sqrt {\sum\limits_{j = 1}^{{N_{0} }} {\left( {y_{r0}^{(j)} - F_{CAL} \left( {y_{s0}^{(j)} ,C(y_{s0}^{(j)} ,{\mathbf{z}}_{s0}^{(j)} ,{\mathbf{p}})} \right)} \right)^{2} } }$$
(1)
Figure 6
figure 6

Overall flow of the low-cost sensor calibration. Auxiliary data and sensor output ys are used to obtain the correction coefficients C(ys,zs,p), which are then used to compute the corrected sensor output yc; see Sections "Additive and multiplicative low-cost sensor correction" through "Global data scaling" for details. A more detailed procedure is discussed in Section "Global data scaling".

The aim of (1) is to optimize the hyper-parameters of the calibration model to maximize the (least-squares) alignment between the NO2 readings from the reference and corrected low-cost sensors across the training set.
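Expressed in code, the objective of (1) reads as follows. Here `correction` and `f_cal` are placeholders for C(·;p) and FCAL, whose concrete forms are developed in the subsequent sections; a hyper-parameter search (e.g., ANN training) would minimize this quantity over p.

```python
import numpy as np

def calibration_objective(p, y_r0, y_s0, z_s0, correction, f_cal):
    # Eq. (1): root of the summed squared mismatch between the reference
    # readings and the corrected low-cost sensor outputs over the
    # training set.  `correction` plays the role of C(ys, zs; p) and
    # `f_cal` that of FCAL; both are supplied by the calibration scheme.
    y_c = np.array([f_cal(y, correction(y, z, p))
                    for y, z in zip(y_s0, z_s0)])
    return np.sqrt(np.sum((y_r0 - y_c) ** 2))
```

With a perfect correction the objective vanishes; any residual mismatch raises it, which is what drives the optimization of p.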

Additive and multiplicative low-cost sensor correction

Conventional correction methods often model the disparities between reference and low-cost sensor readings directly. In this study, we adopt an affine scaling approach that involves both additive and multiplicative correction. This method introduces additional degrees of freedom, enhancing the reliability of the calibration process. In our case, it is recommended to use a multiplicative scaling factor greater than one, as the typical amplitude variations in reference data are higher than those in low-cost sensor measurements, cf. Fig. 7. Details of this correction process are outlined in Fig. 8. It is essential to note that for A(j) to be greater than unity, the hyper-parameter α must be less than unity (cf. (8)). In practice, α can be optimized simultaneously with training the NN calibration model (see Section "Statistical pre-processing of low-cost sensor measurements"). Through preliminary experiments, a suitable value of α = 0.8 was found; it will be utilized in our validation studies discussed in Section "Results and discussion".

Figure 7
figure 7

Selected reference and low-cost sensor training data subsets. A typical amplitude of low-cost sensor data variations is lower than for the reference; therefore, multiplicative scaling with coefficient A > 1 may be advantageous in improving the calibration quality.

Figure 8
figure 8

Fundamental output correction of the low-cost NO2 sensor: affine scaling.

As indicated in Fig. 8, the ANN model is identified based on the training data in the form of the coefficients A and D computed for each training sample. In other words, the coefficients A(j) and D(j) are computed for each pair of the raw sensor data ys0(j) and yr0(j) so that perfect matching is ensured as shown in (5). Subsequently, the calibration ANN model is trained to render the values of A and D for any combination of auxiliary parameters zs and primary sensor reading ys. The information about the reference reading at this combination is encoded in the training pairs A(j), D(j) combined with their corresponding sensor output ys0(j).
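In code, the per-sample training targets can be generated as below. The exact split between A(j) and D(j) is governed by the paper's Eq. (8), which is not reproduced in this excerpt; the formula used here is a hypothetical stand-in that merely satisfies the two stated properties: perfect matching A(j)·ys0(j) + D(j) = yr0(j), and A(j) > 1 whenever α < 1 and the reference reading exceeds the sensor reading.

```python
import numpy as np

def affine_coefficients(y_s0, y_r0, alpha=0.8):
    # ILLUSTRATIVE split only -- the actual formula is Eq. (8) in the
    # source.  With alpha < 1 the denominator is below y_r0 whenever
    # y_r0 > y_s0, yielding A > 1 as required; D then closes the gap so
    # that A * y_s0 + D = y_r0 holds exactly for every training sample.
    A = y_r0 / (alpha * y_r0 + (1.0 - alpha) * y_s0)
    D = y_r0 - A * y_s0
    return A, D
```

The pairs (A(j), D(j)) produced this way, together with ys0(j) and zs0(j), constitute the training set for the ANN surrogate.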

Statistical pre-processing of low-cost sensor measurements

One of the keystones of the proposed calibration procedure is statistical pre-processing of the low-cost sensor readings. The usefulness of this procedure stems from the observations made in Section "Additive and multiplicative low-cost sensor correction", specifically, the discrepancies between typical measured NO2 levels at the reference station and those of the low-cost sensor, as illustrated in Fig. 7. These discrepancies are well-represented on the histogram plots shown in Fig. 9. The statistical distribution of the measurements for the low-cost sensor is shifted towards lower values, which indicates that the typical readings are lower than for the reference.

Figure 9
figure 9

Histograms of the reference NO2 readings (top) and raw (uncorrected) low-cost sensor NO2 measurements (bottom), obtained for the complete training datasets. Note that the statistical distribution for the low-cost sensor is shifted towards lower values, which indicates that the typical readings are lower than for the reference, as also observed in Fig. 7.

The proposed pre-processing procedure aims at reducing the aforementioned misalignment by initial scaling of the low-cost sensor readings using a nonlinear transformation of the form

$$P(y_{s} ,{\mathbf{s}}) = P\left( {y_{s} ,[s_{1} \;s_{2} \;s_{3} ]^{T} } \right) = s_{1} + s_{2} y_{s} + s_{3} y_{s}^{2}$$
(9)

which is applied to all sensor measurements simultaneously. The second-order polynomial has been chosen as the simplest nonlinear function that can be utilized to match the probability distributions represented by the histograms. The idea is as follows. Assuming that the probability distributions are broadly similar, an affine transformation (shift + linear scaling) is generally sufficient because it allows for matching the distribution means and standard deviations. The second-order term has been added to introduce a slight nonlinearity, thereby improving the quality of histogram matching. We will also use a vector notation for P, i.e.,

$$P({\mathbf{y}},{\mathbf{s}}) = P\left( {[y_{1} \;...\;y_{N} ]^{T} ,[s_{1} \;s_{2} \;s_{3} ]^{T} } \right) = \left[ \begin{gathered} s_{1} + s_{2} y_{1} + s_{3} y_{1}^{2} \\ \vdots \\ s_{1} + s_{2} y_{N} + s_{3} y_{N}^{2} \\ \end{gathered} \right]$$
(10)

The coefficient vector s is determined so as to improve the alignment of the smoothed histograms shown in Fig. 10. A smoothed histogram is defined as

$$H({\mathbf{y}}) = \left[ {{\mathbf{z}}\;\;S({\mathbf{N}}_{{\mathbf{y}}} )} \right]$$
(11)

where

$${\mathbf{z}} = \left[ {z_{1} \;z_{2} \;...\;z_{M} } \right]^{T}$$
(12)

is a vector of histogram bins (i.e., intervals splitting the horizontal axis in Fig. 9 into respective compartments), whereas

$${\mathbf{N}}_{{\mathbf{y}}} = \left[ {n_{y.1} \;n_{y.2} \;...\;n_{y.M} } \right]^{T}$$
(13)

denotes the vector of the number of (training data) readings that fall within the respective intervals. The function S() represents a smoothing procedure.

Figure 10
figure 10

Smoothed histograms of the reference versus raw low-cost sensor (top) and the reference versus pre-processed low-cost sensor (bottom). As can be observed, pre-processing aligns the measurement distribution of the low-cost sensor with that of the reference, thereby making it better prepared for further calibration.

Having defined the smoothed histogram, the pre-processing is accomplished by solving

$${\mathbf{s}}^{*} = \arg \mathop {\min }\limits_{{\mathbf{s}}} \left\| {H({\mathbf{y}}_{r} ) - H(P({\mathbf{y}}_{s} ,{\mathbf{s}}))} \right\|$$
(14)

where yr and ys stand for the aggregated reference and low-cost sensor NO2 readings.

Note that if the histogram bins z are identical for the reference and the sensor (which is assumed here), the functional in (14) boils down to comparing the respective S(Ny) vectors. Solving problem (14) is equivalent to matching the smoothed histograms of the reference and pre-processed low-cost sensor data. The unknown variables in this process are the scaling polynomial coefficients, i.e., the vector s defined in Eq. (9). Note that the matching is not performed on the raw numbers of observations falling into the bins, as these are discrete, and solving the resulting least-squares regression problem would be problematic with gradient-based routines. Instead, matching is performed on the smoothed histograms, which are continuous functions of the bin indices. The process (14) effectively fits the second-order polynomial that determines the histogram scaling.
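The fit of (14) can be sketched as follows. The smoothing operator S() is taken to be a Gaussian filter, the bin count and smoothing width are illustrative choices, and the derivative-free Nelder-Mead method is substituted for a gradient-based routine; none of these specific choices are stated in the source.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.optimize import minimize

def smoothed_hist(y, bins, sigma=2.0):
    # S(N_y) in (11): bin counts passed through Gaussian smoothing.
    counts, _ = np.histogram(y, bins=bins)
    return gaussian_filter1d(counts.astype(float), sigma)

def fit_preprocessing(y_ref, y_sensor, n_bins=40):
    # Shared bins z for both histograms, as assumed in the text.
    lo = min(y_ref.min(), y_sensor.min())
    hi = max(y_ref.max(), y_sensor.max())
    bins = np.linspace(lo, hi, n_bins + 1)
    h_ref = smoothed_hist(y_ref, bins)

    def objective(s):  # functional of (14)
        y_p = s[0] + s[1] * y_sensor + s[2] * y_sensor ** 2
        return np.linalg.norm(h_ref - smoothed_hist(y_p, bins))

    s0 = np.array([0.0, 1.0, 0.0])  # identity transform as the start
    res = minimize(objective, s0, method="Nelder-Mead")
    return res.x, bins
```

The returned coefficient vector s* is then applied via (9) to every low-cost sensor reading before the ANN-based correction.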

Figure 10 shows the smoothed histograms before (top) and after pre-processing (bottom), indicating considerable improvement in terms of the alignment. Direct comparison between raw (non-smoothed) histograms can be found in Fig. 11. Figure 12 shows the effects of pre-processing for selected subsets of the training data. As mentioned earlier, pre-processing will be employed as the first calibration step, followed by surrogate-predicted correction to be discussed from Section "Sensor calibration using neural network surrogate" on.

Figure 11
figure 11

A comparison between the reference data (red) and pre-processed (blue) low-cost sensor histogram. Good alignment between the two datasets can be observed. Overlapping data marked purple.

Figure 12
figure 12

The effects of statistical pre-processing illustrated for two selected subsets of the training data. As can be observed, pre-processing leads to a significant improvement of the correlation between the reference and low-cost sensor readings.

Sensor calibration using neural network surrogate

The primary calibration model employed in this study is an artificial neural network (ANN) surrogate. Specifically, we have opted for a multi-layer perceptron (MLP) architecture51,52 featuring three fully connected hidden layers, each consisting of twenty neurons with a sigmoid activation function, as illustrated in Fig. 13. The model's hyper-parameters are identified using the backpropagation-based Levenberg–Marquardt algorithm53 (setup: 1000 learning epochs, performance evaluated using mean-square error (MSE), randomized training/validation data division). It should be emphasized that this division pertains to the training data itself (i.e., the training data is internally split into 'training' and 'validation' subsets for the purpose of ANN training in each epoch). The testing data specified in Fig. 5 is kept separate and only used for model validation in the numerical experiments of Section "Results and discussion".
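A structural sketch of this surrogate using scikit-learn is given below. Note that scikit-learn offers no Levenberg–Marquardt trainer, so L-BFGS is substituted here; only the layer sizes and sigmoid activation follow the description above.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def build_calibration_ann():
    # Three fully connected hidden layers of twenty sigmoid neurons each,
    # mirroring the architecture in Fig. 13.  scikit-learn provides no
    # Levenberg-Marquardt solver, so L-BFGS stands in for it here.
    return MLPRegressor(hidden_layer_sizes=(20, 20, 20),
                        activation="logistic",
                        solver="lbfgs",
                        max_iter=1000,
                        random_state=0)
```

The network maps the input vector [ys, zs] to the two affine correction coefficients (A, D), i.e., it is a two-output regression model.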

Figure 13
figure 13

ANN surrogate used as the core calibration model. Here, we employ a multi-layer perceptron (MLP) with three fully-connected hidden layers. When statistical data pre-processing is utilized (cf. Section "Statistical pre-processing of low-cost sensor measurements"), the input ys of the primary sensor reading is not taken directly from the sensor; instead, it is the pre-processed value.

We deliberately chose a relatively simple ANN architecture to expedite the training process and prioritize its role as a regression model. Given the ample training samples available, the model's sensitivity to the number of layers and neurons is limited. Furthermore, this streamlined architecture effectively mitigates inherent noise present in both the reference and sensor readings.

The calibration model takes inputs comprising environmental factors (internal/external temperature, humidity, etc.) and NO2 measurements from both the primary and auxiliary sensors. The outputs of the neural network (NN) model are the affine scaling coefficients A and D. In Section "Calibration model inputs", we delve into diverse extended input sets aimed at bolstering the calibration process's reliability. The effects of these expanded sets, alongside the consequences of restricting inputs to various subsets of the vector zs, will be analysed in Section "Results and discussion" to assess how input configuration impacts the efficacy of calibration.

Calibration model inputs

In this section, we discuss various input configurations of the ANN calibration model. Section "Calibration input configuration I: basic setup" recalls the basic parameter set discussed earlier. The extended input set, integrating differentials of environmental variables and primary NO2 readings, is explored in Section "Calibration input configuration II: differentials".

Section "Calibration input configuration III: time series of prior NO2 measurements" analyses the final setup that involves time series of prior NO2 measurements from the low-cost sensor. In our investigations, we focus on potential benefits of particular setups in terms of improving the calibration process dependability.

Calibration input configuration I: basic setup

The fundamental configuration of the calibration model inputs includes the auxiliary data vector zs = [To Ti Ho Hi P S1 S2]T. This set of values comprises external/internal temperature, humidity, atmospheric pressure, and NO2 data from redundant sensors. These elements are augmented by the primary sensor's NO2 measurements, ys. Section "Results and discussion" will further investigate constrained variations of this arrangement to determine the individual elements' significance.

Calibration input configuration II: differentials

The basic input arrangement elucidated in Section "Calibration input configuration I: basic setup" can be extended by incorporating additional parameters representing local (temporal) fluctuations in environmental variables and NO2 readings. More specifically, we define differentials

$$\Delta y_{s}^{(j)} = \frac{{y_{s}^{(j)} - y_{s}^{(j)} ( - \Delta t)}}{\Delta t}$$
(15)

where Δt is the time interval between subsequent sensor readings; ys(j)(–Δt) stands for the last measurement taken before ys(j). Differentials of the environmental parameters are defined in a similar manner

$$\Delta T_{o}^{(j)} = \frac{{T_{o}^{(j)} - T_{o}^{(j)} ( - \Delta t)}}{\Delta t},\,\,\,\Delta T_{i}^{(j)} = \frac{{T_{i}^{(j)} - T_{i}^{(j)} ( - \Delta t)}}{\Delta t}$$
(16)
$$\Delta H_{o}^{(j)} = \frac{{H_{o}^{(j)} - H_{o}^{(j)} ( - \Delta t)}}{\Delta t},\,\,\Delta H_{i}^{(j)} = \frac{{H_{i}^{(j)} - H_{i}^{(j)} ( - \Delta t)}}{\Delta t}$$
(17)
$$\Delta P_{{}}^{(j)} = \frac{{P_{{}}^{(j)} - P_{{}}^{(j)} ( - \Delta t)}}{\Delta t}$$
(18)

Note that computing (15)–(18) only requires storing one extra set of readings. The differentials, especially Δys(j), quantify local fluctuations in the NO2 level, which facilitates prediction of forthcoming changes. Moreover, integrating differentials of environmental variables can provide explicit or implicit insights into the dynamics of relevant factors, such as cross-sensitivity to other gases. Adding the differentials as supplementary inputs to the NN surrogate allows us to explore their potential contribution to enhancing calibration quality.
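The backward differences (15)–(18) can be computed with a single helper applied to each signal in turn; the treatment of the first sample, which has no prior reading, is an assumption here (the source does not specify it).

```python
import numpy as np

def differentials(x, dt=1.0):
    # Backward finite differences per (15)-(18): each sample minus the
    # last reading taken before it, divided by the sampling interval.
    d = np.empty_like(x, dtype=float)
    d[1:] = (x[1:] - x[:-1]) / dt
    d[0] = 0.0  # no prior reading exists for the first sample (assumption)
    return d
```

Applying the helper to ys, To, Ti, Ho, Hi, and P yields the six extra surrogate inputs of configuration II.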

A visual illustration is provided in Fig. 14. In particular, Fig. 14a shows, for a selected sequence of the training data, the NO2 readings from the low-cost sensor alongside the respective differentials. Meanwhile, Fig. 14b and c demonstrate the effects of incorporating the differentials as auxiliary calibration model inputs. The flow diagram of the modified calibration process involving differentials can be found in Fig. 15.

Figure 14
figure 14

Differentials used as additional ANN surrogate inputs to enhance calibration dependability: (a) selected training data sequence (NO2 readings from the low-cost sensor) and its corresponding differentials (15); (b) the effects of incorporating differentials shown for a selected sequence of testing data; (c) the effects of differentials shown for another testing data sequence. Note that including differentials (here, of all environmental variables and the primary NO2 readings from the low-cost sensor) noticeably improves data alignment.

Figure 15
figure 15

Calibration of the low-cost sensor with differentials used as additional calibration model inputs. Auxiliary data and sensor output ys are used to obtain the correction coefficients C(ys, zs, Δys, Δzs, p), which are used to compute the corrected sensor output yc. The pre-processing step is not shown for clarity.

Calibration input configuration III: time series of prior NO2 measurements

Expanding the concept of differentials might involve integrating an extended series of previous sensor measurements. This may not be suitable for mobile monitoring platforms but can significantly enhance the calibration of stationary systems, such as the one discussed in Section "Autonomous NO2 monitoring platform". The additional inputs for the calibration surrogate comprise

$$y_{s}^{(j)} ( - s\Delta t),\,\,\,s\, = \,1,\,2,\, \ldots ,\,N_{s} .$$
(19)

In (19), Δt is the reading time interval, whereas Ns is the number of prior measurements used as extra inputs. Although a natural choice for incorporating a time series such as (19) would be recurrent neural networks (RNNs)54, in our case Ns is fixed throughout, making feedforward networks a sufficient representation. Note that Ns = 1 is equivalent to the incorporation of differentials described in Section "Calibration input configuration II: differentials".
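Assembling the extra inputs (19) amounts to building a matrix of lagged readings. In this sketch, samples with fewer than Ns predecessors are zero-padded; how the paper handles these initial samples is not stated.

```python
import numpy as np

def lagged_inputs(ys, n_lags):
    # Row j holds [ys(j - dt), ..., ys(j - Ns*dt)] per (19); initial rows
    # lacking a full history are zero-padded (assumption).
    n = len(ys)
    X = np.zeros((n, n_lags))
    for s in range(1, n_lags + 1):
        X[s:, s - 1] = ys[:n - s]
    return X
```

These columns are concatenated with ys and zs (and, optionally, the differentials) to form the full input vector of configuration III.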

The extended flow diagram of the calibration procedure involving the time series of length Ns is shown in Fig. 16. Figure 17 demonstrates the advantages of including short time series as auxiliary calibration model inputs for Ns = 3. Section "Results and discussion" will present a comprehensive analysis of the effect of the length Ns on calibration process reliability.

Figure 16

Calibration of the low-cost sensor with time series of prior measurements used as additional calibration model inputs. Auxiliary data are used to obtain the correction coefficients C(ys, zs, Ns, p), used to compute the corrected sensor output yc. The pre-processing step is not shown for clarity.

Figure 17

The effects of incorporating a time series of length Ns = 3 of prior NO2 readings into the NN calibration model, along with the environmental parameter differentials. Shown are reference and calibrated low-cost sensor data without and with the mentioned time series, obtained for two selected sequences of the testing data: (a) first sequence, (b) second sequence.

Global data scaling

The last algorithmic component integrated into the proposed calibration process involves global data scaling. This approach adjusts the correction coefficients predicted by the ANN surrogate based on the current values of environmental factors, NO2 measurements from both primary and redundant sensors, potential differentials, and an Ns-long time series of primary NO2 data. The surrogate aims to minimize the disparity between the reference and low-cost sensor data in the least-squares sense (cf. (1)). Yet, resolving (1) might reveal certain systematic discrepancies dependent on the measured NO2 level, as depicted in Fig. 18a and b for a specific subset of training data. This distinction becomes apparent when examining the data sorted by reference NO2 levels and through the scatter plot's slight skew seen in the bottom panel of Fig. 18b.

Figure 18

Global response correction: (a) a subset of selected training data; (b) the same data arranged based on increasing NO2 reference readings (top) accompanied by the corresponding scatter plot (bottom). Despite the apparent alignment showcased in Fig. 18a, there is an observable systematic offset dependent on the level; (c) the same data after the application of global data scaling, showcasing a notable decrease in the systematic offset and an enhancement in the symmetry of the scatter plot. In this instance, global correction results in an improved correlation coefficient, rising from 0.93 to 0.95, and a reduction in RMSE from 2.1 to 1.8 µg/m3.

The global data scaling aims at reducing the discussed offsets by means of an affine transformation of the smoothed sensor measurements. In plain words, it corresponds to a ‘rotation’ of the scatter plot, rendering it less skewed with respect to the identity mapping. A rigorous formulation of the process is given in Fig. 19. Coefficients AG and DG are determined from the complete dataset; they are not functions of the environmental or auxiliary parameters.
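A minimal sketch of such a global affine correction, assuming for illustration that AG and DG are obtained by ordinary least squares over all (calibrated, reference) training pairs rather than the exact ordered-data procedure of Fig. 19:

```python
import numpy as np

def fit_global_scaling(y_c, y_ref):
    """Fit the global affine correction y_cG = A_G * y_c + D_G over the
    whole training set by ordinary least squares. This is a sketch: the
    paper derives A_G, D_G from the ordered data (Fig. 19); a plain
    least-squares fit on (y_c, y_ref) pairs is assumed here."""
    A = np.vstack([y_c, np.ones_like(y_c)]).T   # design matrix [y_c, 1]
    (A_G, D_G), *_ = np.linalg.lstsq(A, y_ref, rcond=None)
    return A_G, D_G

def apply_global_scaling(y_c, A_G, D_G):
    """Apply the fitted affine transformation to calibrated readings."""
    return A_G * y_c + D_G
```

Because AG and DG are constants fitted once on the entire training set, applying them at run time adds essentially no computational cost.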

Figure 19

Global response correction through affine transformation of the ordered NO2 data from the calibrated low-cost sensor.

The impact of implementing global data scaling is evident in Fig. 18c. In the depicted case, there is a noticeable reduction in the offset and an enhanced symmetry within the scatter plot. Simultaneously, the correlation coefficient improves from 0.93 to 0.95, while the RMSE decreases from 2.1 to 1.8 µg/m3 based on the training data. Although its advantages might be somewhat constrained for the testing data, global data scaling still proves beneficial, as shown in Section "Results and discussion".

Again, it should be noted that the global data correction is a separate stage, applied after calibrating the sensor using the scaling coefficients A and D rendered by the ANN model. The inputs of the ANN model are the auxiliary parameters (vector zs), the primary sensor measurement ys, and (optionally) the differentials and the time series of prior measurements.

The ANN model produces coefficients A and D as functions of these input variables and applies them to the low-cost sensor readings as in (2). The global correction (20) is applied afterwards using coefficients AG and DG obtained for the entire training dataset (i.e., not functions of individual measurements). These coefficients are the same for all samples undergoing the global correction process.

Operating flow of NO2 monitoring by means of calibrated sensor

Below, we summarize the operation of the complete calibration process of the low-cost sensor. The procedure combines the correction mechanisms detailed in Sections "Additive and multiplicative low-cost sensor correction" through "Global data scaling". The first step is pre-processing elucidated in Section "Statistical pre-processing of low-cost sensor measurements", where the overall distributions of the sensor and the reference data are aligned. Subsequently, the ANN surrogate predicts the (local) correction coefficients using the auxiliary vector zs and NO2 reading ys from the low-cost sensor, their differentials, as well as an Ns-long time series of prior NO2 measurements from the primary sensor. The intermediate outcome yc is obtained by applying the affine correction (2), (3). The last stage is global data scaling (20), (21), which produces the final corrected NO2 reading. A flow diagram of the process has been shown in Fig. 20.
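The operating flow summarized above can be sketched for a single pre-processed reading as follows (ann_predict stands in for the trained ANN surrogate; the affine form A·ys + D and all names are illustrative assumptions):

```python
import numpy as np

def calibrate_reading(y_s, z_s, lags, ann_predict, A_G, D_G):
    """End-to-end correction of one pre-processed low-cost sensor reading.

    y_s         : current (pre-processed) NO2 reading from the sensor
    z_s         : auxiliary input vector (environmental data, etc.)
    lags        : N_s prior primary-sensor readings, cf. (19)
    ann_predict : stand-in for the trained ANN surrogate; maps the
                  assembled input vector to local coefficients (A, D)
    A_G, D_G    : global scaling coefficients fitted on the training set
    """
    x = np.concatenate([z_s, [y_s], lags])  # surrogate input vector
    A, D = ann_predict(x)                   # local correction coefficients
    y_c = A * y_s + D                       # affine correction, cf. (2)
    return A_G * y_c + D_G                  # global scaling, cf. (20)
```

With identity coefficients (A = AG = 1, D = DG = 0) the reading passes through unchanged, which makes the stages easy to enable or disable individually when comparing setups.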

Figure 20

Low-cost sensor calibration procedure as proposed in this study. Pre-processing of the sensor readings is followed by generating (local) calibration coefficients using the ANN surrogate (based on the auxiliary vector zs, the actual NO2 reading ys from the low-cost sensor, their differentials, as well as a short-term time series of prior nitrogen dioxide readings from the primary sensor). The affine scaling is then applied to the sensor reading to produce the outcome yc. Subsequently, global response correction is superimposed to produce the final corrected reading yc.G.

Results and discussion

This section concentrates on validating the proposed calibration method for the low-cost sensor, applied to the autonomous monitoring platform detailed in Section "Autonomous NO2 monitoring platform". The content is organized as follows. Section "Reference and low-cost sensor datasets" discusses the reference and low-cost sensor datasets. Section "Results" presents results obtained from various calibration setups explored in comparative experiments. Finally, Section "Discussion" summarizes findings and discusses the performance of the calibration process.

Reference and low-cost sensor datasets

The proposed calibration procedure has been validated using the datasets acquired from the reference stations (as outlined in Section "Reference data. Public monitoring stations") and the monitoring platforms (detailed in Section "Autonomous NO2 monitoring platform"). The data was collected hourly between March and August 2023, cf. Figure 21. For the sake of illustration, Fig. 22 presents selected subsets of the reference and uncorrected low-cost sensor training and testing data. Significant disparities between the readings from the reference and the sensor can be observed, which poses a considerable challenge for the calibration process.

Figure 21

Characterization of the training and testing data acquired to carry out calibration of the low-cost sensor of Section "Autonomous NO2 monitoring platform".

Figure 22

Selected subsets of NO2 readings from the reference stations and the raw (uncorrected) low-cost sensors: (a) training data, (b) testing data.

Results

In this analysis, we delve into the calibration outcomes of the low-cost NO2 sensor within the monitoring platform highlighted in Section "Autonomous NO2 monitoring platform". We explore various setups of the calibration model inputs to assess the importance of specific algorithmic elements within the correction scheme. Additionally, we selectively enable or disable auxiliary mechanisms, i.e., pre-processing and global data scaling for some configurations. Table 1 presents all the scrutinized setups. Each configuration undergoes ten independent training cycles, and the model with the optimal set of hyper-parameters is chosen as the final model.

Table 1 Input setups of the calibration model considered in verification experiments.

The calibration setups under examination are divided into four groups, denoted as A to D. The first group encompasses configurations that do not utilize the time series of previous NO2 measurements. The second group involves setups that incorporate time series of past readings, varying in length (Ns), excluding global response correction. The third group combines time-series-based calibration with global data scaling. The final group incorporates pre-processing as detailed in Section "Statistical pre-processing of low-cost sensor measurements". Experimenting with different Ns values enables us to identify the most effective time series length.

The results from all calibration setups are consolidated in Table 2, encompassing the correlation coefficient and modeling error (RMSE) for both training and testing data (see Fig. 23 for definitions). To streamline the presentation, data visualization is provided for two specific calibration setups: B.4 and D.3. Figure 24 displays the reference, raw low-cost sensor, and calibrated sensor NO2 measurements (training data) for two chosen eight-week periods. Figure 25 illustrates the same information for testing data across three two-week periods, while Fig. 26 showcases scatter plots for the testing data. Finally, Fig. 27 presents NO2 measurements for setups B.4 and D.3 based on ascending reference readings.

Table 2 Sensor calibration performance: correlation coefficients and RMSE.
Figure 23

Definitions of the correlation coefficient r and RMSE.
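Assuming the standard definitions of Pearson's r and RMSE (which Fig. 23 is presumed to follow), the two quality metrics can be computed as:

```python
import numpy as np

def correlation_and_rmse(y_ref, y_cal):
    """Pearson correlation coefficient r and root-mean-square error
    (RMSE) between reference and calibrated sensor readings; standard
    definitions, assumed to match those of Fig. 23."""
    r = np.corrcoef(y_ref, y_cal)[0, 1]
    rmse = np.sqrt(np.mean((y_ref - y_cal) ** 2))
    return r, rmse
```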

Figure 24

Sensor calibration performance for selected subsets of the training data: (a) setup B.4, (b) setup D.3.

Figure 25

Sensor calibration performance for selected subsets of the testing data: (a) setup B.4, (b) setup D.3.

Figure 26

Scatter plots for the testing data (uncorrected—gray, corrected—black): (a) setup B.4, (b) setup D.3.

Figure 27

Performance of sensor calibration for: (a) setup B.4, (b) setup D.3. Shown are the entire training dataset (top) and testing dataset (bottom), arranged in ascending order according to NO2 reference readings. Note the substantial enhancement achieved through calibration: the calibrated sensor readings are much closer to their corresponding reference measurements than the raw data.

Discussion

The experiments in Section "Results" aimed to verify the effectiveness of the proposed calibration process. One crucial aspect under examination was whether the introduced correction strategy could adequately align the reference and low-cost sensor readings, ensuring reliable monitoring of nitrogen dioxide. Furthermore, we aimed to verify the relevance of the correction mechanisms, specifically the pre-processing and global data scaling procedures, and the benefits of incorporating environmental parameter differentials and time series of prior NO2 readings from the low-cost sensor as additional calibration inputs. We were also interested in identifying the optimal length Ns of this series. It is also important to recall that the initial discrepancies between the low-cost sensor and the reference measurements are significant, whereas the NO2 level changes considerably (from almost zero to sixty µg/m3) and often quickly, which makes calibration a challenging endeavour.

The findings in Table 2 showcase the exceptional performance of the proposed calibration technique. Among the calibration setups assessed, the most effective configurations belong to group D, specifically D.3 and D.4. These setups integrate all correction mechanisms outlined in Section "Precise sensor calibration using statistical pre-processing, ANN surrogates, and global data scaling", encompassing pre-processing, global data scaling, and leveraging extended input variables covering environmental parameters, auxiliary NO2 readings, differentials, and medium-length time series (Ns ranging between four and six). For instance, in setup D.3, the correlation coefficient reaches approximately 0.95, with an RMSE of 2.4 µg/m3 for the testing data. Moreover, the average relative RMS error is merely around 11 percent. The precision of the calibrated sensor is evident in its excellent alignment with the reference data, as observed in both the training (Fig. 24b) and testing data (Fig. 25b). The reported numbers are particularly impressive when compared to the metrics of the raw (uncorrected) sensor, which are as follows: correlation coefficients 0.07 and 0.04 (training and testing data, respectively), and RMSE of 8.9 and 10.8 µg/m3 (training and testing data, respectively).

A review of the results across various calibration setups underscores the significance of each incorporated correction mechanism. For instance, augmenting the inputs of the calibration model significantly impacts both the correlation coefficient and RMSE. Comparing configurations A.1, A.2, A.3, A.4, and A.7 (excluding global response correction) highlights this: the correlation coefficient improves from 0.7 to 0.89, and RMSE drops from 5.6 to 3.4 µg/m3. Integrating global response correction consistently bolsters the correlation coefficient by nearly 0.02 and reduces RMSE by about 0.2 µg/m3 (e.g., comparing setup A.5 versus A.4, or C.1 versus B.1).

Introducing time series data further enhances results, achieving up to a 0.03 improvement in correlation coefficient and a reduction of 0.3 µg/m3 in RMSE (e.g., setups C.3 or C.4). Moreover, data pre-processing significantly contributes to calibration enhancements by adding up to 0.03 to the correlation coefficient and reducing RMSE by nearly 0.3 µg/m3. These improvements are visually evident in Figs. 24, 25, and 26, where transitioning from simpler configurations to more advanced ones (e.g., B.4 and D.3) noticeably improves alignment between the reference and corrected low-cost sensor readings. Additionally, it centres the scatter plots closer to the identity function.

The enhancements in reliability are also visually highlighted in Fig. 27, where both training and testing data are arranged by ascending reference NO2 levels. Moving from the simpler setup A.2 through intermediate stages (A.7 and B.4) to the advanced configuration D.3 significantly reduces deviations between the reference and calibrated sensor readings. An in-depth analysis of setups B and C reveals that the most favourable configuration in terms of the time series length is Ns = 4, showcasing the highest correlation coefficient and minimal RMSE. However, with the inclusion of pre-processing (setups D), the impact of Ns becomes less distinctive, suggesting that the calibration performance becomes more resilient to variations in this parameter.

Additional experiments were conducted to verify the effects of including auxiliary NO2 sensor readings as supplementary calibration inputs. The considered setups are listed in Table 3, and the results are encapsulated in Table 4. Note that setups E.1 and E.5 were previously considered as Cases E.1 and E.3 in Table 1; they are repeated to ensure completeness of the data in Tables 3 and 4. Incorporating auxiliary NO2 sensor data does improve the dependability of the calibration process. It can also be observed that the second auxiliary sensor S2 has a slightly higher impact, as can be inferred from the values of the correlation coefficient and RMSE. On the other hand, when the auxiliary sensors are not utilized, data alignment degrades noticeably (cf. setup E.2 versus E.3, E.4, or E.5). Furthermore, including the primary sensor measurements is also important.

Table 3 Verification case studies: calibration model setup.
Table 4 Sensor calibration performance for calibration scenarios listed in Table 3.

For supplementary validation, the calibration approach introduced in this paper has been compared to several benchmark methods, specifically, linear regression, neural-network-based calibration, as well as calibration implemented using a convolutional neural network (CNN)55. In the case of ANN/CNN, the neural network predicts the calibrated model output directly instead of rendering the correction coefficients. Linear regression is a model of the form

$$S({\mathbf{z}}_{s} ) = \alpha_{0} + \alpha_{1} T_{o} + \alpha_{2} T_{i} + \alpha_{3} H_{o} + \alpha_{4} H_{i} + \alpha_{5} S_{1} + \alpha_{6} S_{2}$$
(22)

when using vector zs as calibration input, and

$$S_{y} ({\mathbf{z}}_{s} ,y_{s} ) = \alpha_{0} + \alpha_{1} T_{o} + \alpha_{2} T_{i} + \alpha_{3} H_{o} + \alpha_{4} H_{i} + \alpha_{5} S_{1} + \alpha_{6} S_{2} + \alpha_{7} y_{s}$$
(23)

when using extended calibration inputs (i.e., primary sensor data). The coefficients in (22) and (23) are found through least-squares regression based on the training data. The ANN uses the same architecture as described in Section "Precise sensor calibration using statistical pre-processing, ANN surrogates, and global data scaling". The CNN architecture uses filters of size 4 × 1 × 1 and three convolution layers of spatial sizes 32, 16, and 8, followed by a fully connected layer of 64 neurons (version I); layers of sizes 64, 32, and 16 (version II); and 126, 64, and 32 (version III), with batch normalization and ReLU layers between the convolution layers. The CNN is trained using the ADAM algorithm with a mini-batch size of 1000 [70]. Table 5 gathers the numerical results. The calibration methodology proposed in this study provides significantly better results, both in terms of correlation coefficients and RMSE. Utilization of affine correction (cf. Table 2) is superior to direct prediction of the calibrated sensor output, whether using an ANN of the same architecture or a CNN.
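A least-squares fit of the benchmark model (23) can be sketched as follows (variable names mirror the equation; the use of NumPy's lstsq is an implementation assumption):

```python
import numpy as np

def fit_linear_benchmark(T_o, T_i, H_o, H_i, S1, S2, y_s, y_ref):
    """Least-squares fit of the linear benchmark (23):
    S_y = a0 + a1*T_o + a2*T_i + a3*H_o + a4*H_i + a5*S1 + a6*S2 + a7*y_s.
    All arguments are equal-length 1-D arrays of training samples;
    returns the coefficient vector [a0, ..., a7]."""
    X = np.column_stack([np.ones_like(y_ref),
                         T_o, T_i, H_o, H_i, S1, S2, y_s])
    alpha, *_ = np.linalg.lstsq(X, y_ref, rcond=None)
    return alpha
```

Dropping the y_s column recovers the reduced model (22), which uses only the auxiliary vector zs as calibration input.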

Table 5 Comparative studies: linear regression and direct ANN/CNN-based prediction.

In summary, the showcased calibration approach proves remarkably effective. The corrected low-cost sensor measurements closely align with the reference readings, particularly in the advanced configurations, such as D.3, representing the optimal calibration setup. In practical terms, this sensor correction can be integrated offline or implemented within the platform using its on-board computational resources, as outlined in Section "Autonomous NO2 monitoring platform".

Conclusion

This article introduced an innovative methodology for high-efficiency calibration of affordable nitrogen dioxide sensors. The proposed technique integrates various correction mechanisms, encompassing data pre-processing, additive and multiplicative response adjustments executed by an artificial neural network (ANN) surrogate, and global data scaling. The pre-processing step focuses on aligning the distribution of low-cost sensor readings across the entire training dataset with reference measurements. Utilizing the ANN surrogate, the method predicts specific correction coefficients based on environmental parameters and additional NO2 readings from redundant sensors. Additionally, the calibration model explores extended input parameters, including differentials of environmental variables and historical time series data from the primary sensor, proving their significance. Global data scaling acts as the final step, enhancing scatter plot symmetry and consequently reducing prediction errors for the calibrated sensor.

Our technique was applied and validated on a monitoring platform developed at Gdansk University of Technology, Poland, comprising primary and secondary NO2 detectors, environmental sensors, and custom-designed electronic systems for data transmission and monitoring protocols. The validation involved data from public monitoring stations in Gdansk, Poland. Extensive comparative experiments across diverse calibration model configurations underscored the importance of the integrated algorithmic components. The most comprehensive setup, encompassing all correction mechanisms, demonstrated exceptional reliability, achieving a correlation coefficient of 0.95 between reference and corrected sensor data, with an RMSE below 2.4 µg/m3 (an average relative RMS error of just eleven percent). This high efficacy underscores the practical viability of low-cost NO2 monitoring.

Future endeavors will focus on refining the precision of calibrated low-cost NO2 monitoring. One avenue involves integrating supplementary gas detectors like SO2, CO, and O3 into the measurement platform. This addition aims to leverage their readings as supplemental data sources to further refine the calibration model, particularly regarding cross-sensitivity considerations. Additionally, exploring advanced machine learning methodologies, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), is on the agenda. RNNs, adept at managing time series of varying lengths, may specifically enhance monitoring reliability by harnessing such data.