Global 1 km × 1 km gridded revised real gross domestic product and electricity consumption during 1992–2019 based on calibrated nighttime light data (2024)

Study areas and Data preprocessing

Given that the estimations were based on the top-down approach, the study areas depended on the countries that provided the available data. The GDP data includes 175 countries (or regions), and the electricity consumption data includes 134 countries (or regions). As such, the research scope covers over 70% of the global land area, and over 90% of the GDP and electricity consumption.

Two sets of nighttime light data were used in this study: Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) images and National Polar-Orbiting Partnership’s Visible Infrared Imaging Radiometer Suite (NPP/VIIRS) images. Among the available versions, we selected the annual stable-light DMSP/OLS images, from which noise has been removed, and the monthly cloud-free NPP/VIIRS images, because they fit economic output and other socioeconomic factors better. The DMSP/OLS series comprises images from six DMSP satellites: F10 (1992–1994), F12 (1994–1999), F14 (1997–2003), F15 (2000–2007), F16 (2004–2009), and F18 (2010–2013). The DMSP/OLS images use the WGS-84 geographic coordinate reference system, the swath width is 3000 km, and the spatial resolution is 30 arc seconds (approximately 1 km near the equator and 0.8 km at 40° north latitude). The images cover −180° to 180° in longitude and −65° to 75° in latitude (covering all areas of the world where human activities exist). The NPP/VIIRS images have a finer spatial resolution (413 m) than the DMSP/OLS images. In addition, unlike DMSP/OLS images, which only provide relative radiance values (digital numbers, DN) in the range of 0–63, NPP/VIIRS images provide absolute radiance values in units of nW/cm2/sr. Because satellite images suffer from several problems, such as saturation, discontinuities, and white noise, these datasets were pre-processed before further use.

For the DMSP/OLS images, we reprojected the images to the Mollweide projection and resampled them to a spatial resolution of 1 km. Next, following the invariant region method, we applied a power function to reduce saturation, calibrating the images with the power function parameters provided by Shi et al.17. Where two satellites both provided images for the same year (e.g., F10-1994 and F12-1994), we averaged them to obtain a single image for that year. To address the discontinuities, we applied an annual continuity correction based on the assumption that the stable DN value of a pixel in a given year should not be less than the stable DN value of the same pixel in the previous year29.
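The calibration and same-year averaging steps can be sketched as follows. This is a minimal illustration, assuming the power function has the form DN_cal = a × DN^b. The coefficients `a` and `b` and the tiny arrays are hypothetical placeholders, not the values from Shi et al.17.

```python
import numpy as np

def calibrate_power(dn, a, b):
    """Apply a power-function calibration DN_cal = a * DN**b (assumed form)."""
    dn = np.asarray(dn, dtype=float)
    return a * np.power(dn, b)

def merge_same_year(img_a, img_b):
    """Average two calibrated images from different satellites for one year."""
    return (np.asarray(img_a, dtype=float) + np.asarray(img_b, dtype=float)) / 2.0

# Hypothetical 2x2 images; a=1, b=1 are identity placeholders, not real parameters.
f10_1994 = calibrate_power([[0, 10], [30, 63]], a=1.0, b=1.0)
f12_1994 = calibrate_power([[0, 14], [34, 63]], a=1.0, b=1.0)
dn_1994 = merge_same_year(f10_1994, f12_1994)
```

In practice each satellite-year would have its own coefficient pair, so `calibrate_power` would be called once per image before merging.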

For the NPP/VIIRS images, we adopted 0.3 nW/cm2/sr as the threshold to remove noise, which is consistent with previous studies26,30. To avoid the influence of stray light contamination in summer, monthly images from June to August were removed. Next, we averaged the remaining monthly data to estimate the annual NPP/VIIRS images from 2014 to 2019. To address the discontinuities, we applied the same annual continuity correction as for the DMSP/OLS images. Finally, to better match the DMSP/OLS images, we resampled the NPP/VIIRS images from a resolution of 0.5 km × 0.5 km to 1 km × 1 km.
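A rough sketch of these NPP/VIIRS steps is given below. It assumes that values below the threshold are set to zero in each monthly image before averaging and that resampling is a simple 2 × 2 block mean. The text does not specify either choice, and the function names are illustrative.

```python
import numpy as np

def annual_viirs(monthly, threshold=0.3, drop_months=(6, 7, 8)):
    """monthly: array (12, H, W) of radiance, index 0 = January.
    Zero out sub-threshold values, drop the listed months, then average."""
    monthly = np.asarray(monthly, dtype=float)
    keep = [m - 1 for m in range(1, 13) if m not in drop_months]
    kept = monthly[keep]
    kept = np.where(kept < threshold, 0.0, kept)
    return kept.mean(axis=0)

def block_mean_2x2(img):
    """Aggregate a (2H, 2W) grid to (H, W) by averaging each 2x2 block."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape[0] // 2, img.shape[1] // 2
    return img[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))

# Synthetic example: one noisy pixel and very bright summer months.
monthly_demo = np.full((12, 2, 2), 1.0)
monthly_demo[:, 0, 0] = 0.1      # below the noise threshold in every month
monthly_demo[5:8] = 100.0        # June to August, removed before averaging
annual_demo = annual_viirs(monthly_demo)
coarse_demo = block_mean_2x2(annual_demo)
```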

Matching of the two sets of nighttime light data

The gap between DMSP/OLS and NPP/VIIRS images is mainly driven by differences in sensors and spread functions and by spatial and temporal inconsistency21. Because the relationship between the two data sets resembles a “black box,” Chen et al.31 used an artificial neural network (ANN) to learn the underlying function linking the two data sets, and the matching results proved successful. Building on their study, we employed a particle swarm optimization-back propagation (PSO-BP) algorithm to unify the scale of DMSP/OLS and NPP/VIIRS images. The initial parameters of the PSO-BP algorithm followed Chen et al.31: the C1 and C2 values were both set to 2.0, the model had one hidden layer with five nodes, and the maximum iteration number and population size were set to 50 and 20, respectively.
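To make the stated settings concrete, the sketch below runs only the particle swarm stage over the weights of a network with one hidden layer of five nodes, using C1 = C2 = 2.0, a population of 20, and 50 iterations. The back-propagation fine-tuning that normally follows in PSO-BP is omitted. The inertia weight, velocity clamp, tanh activation, and synthetic data are assumptions for illustration and are not taken from the paper or from Chen et al.31.

```python
import numpy as np

N_IN, N_HID = 3, 5                      # three inputs, five hidden nodes
N_W = N_IN * N_HID + N_HID + N_HID + 1  # all weights and biases, flattened

def predict(w, x):
    """One hidden layer (tanh, assumed) and a linear output node."""
    w1 = w[:N_IN * N_HID].reshape(N_IN, N_HID)
    b1 = w[N_IN * N_HID:N_IN * N_HID + N_HID]
    w2 = w[N_IN * N_HID + N_HID:N_IN * N_HID + 2 * N_HID]
    return np.tanh(x @ w1 + b1) @ w2 + w[-1]

def mse(w, x, y):
    return float(np.mean((predict(w, x) - y) ** 2))

def pso(x, y, n_particles=20, n_iter=50, c1=2.0, c2=2.0, inertia=0.7, seed=0):
    """Particle swarm search over network weights; returns best weights
    and the best loss after initialisation and after each iteration."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n_particles, N_W))
    vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), np.array([mse(p, x, y) for p in pos])
    g = int(np.argmin(pbest_f))
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    history = [gbest_f]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -1.0, 1.0)   # assumed clamp to keep the swarm stable
        pos = pos + vel
        f = np.array([mse(p, x, y) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
        history.append(gbest_f)
    return gbest, history

# Synthetic, already-normalized data standing in for real pixel samples.
rng_demo = np.random.default_rng(1)
x_demo = rng_demo.random((200, N_IN))
y_demo = 0.6 * x_demo[:, 0] + 0.1
w_best, loss_history = pso(x_demo, y_demo)
```

The best loss recorded in `loss_history` can only stay level or fall from one iteration to the next, which makes it a convenient check that the swarm update is wired correctly.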

Moreover, because our target is pixel-level matching, errors of hundreds of millions of pixels make the matching effect very poor, even after using machine learning. The difficulty is mainly driven by the ‘high–low’ or ‘low–high’ problems in the pixel DN values of the two images26 (i.e., the pixels in the NPP/VIIRS image have a high (low) DN value in 2013, whereas the pixels in the same place have a low (high) DN value in the DMSP/OLS image). Therefore, we proposed the principle of ‘high to high’ and ‘low to low’ for the matching job.

Thus, we divided the DMSP/OLS and NPP/VIIRS images into nine categories based on the natural interval method. By matching similar attributes in the two images, we extracted sampling points that met the principles of ‘high to high’ and ‘low to low’. Subsequently, in line with Chen et al.31 and Li et al.28, we used the logarithm of the pixel DN values in the 2013 NPP/VIIRS image, together with the pixels’ latitude and longitude, as input factors. The DN values of the pixels in the 2013 DMSP/OLS image were selected as the output factors. In addition, following general practice in machine learning28,31, the input and output factors were normalized to avoid the influence of the indicators’ units. Considering continental heterogeneity, we estimated separate PSO-BP neural network parameters for six continents (i.e., North America, South America, Oceania, Africa, Asia, and Europe). Antarctica was not considered in this study because the coverage of the sensors that provided the DMSP/OLS and NPP/VIIRS images did not include Antarctica. The matching results based on the training sets (60% of the total samples) are presented in Fig. 1. In particular, the NED on the x-axis represents normalized DN values converted from the 2013 NPP/VIIRS image scale to the 2013 DMSP/OLS image scale; the NOD on the y-axis represents normalized values of the original 2013 DMSP/OLS image DN values.
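The ‘high to high’ and ‘low to low’ sampling rule can be illustrated with a short sketch. Quantile breaks are used here as a simple stand-in for the natural interval classification, and min-max scaling is assumed for the normalization. Neither choice is confirmed by the text.

```python
import numpy as np

def classify(values, n_classes=9):
    """Assign each value to one of n_classes bins (0..n_classes-1).
    Uses quantile breaks, a stand-in for natural-interval breaks."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0, 1, n_classes + 1)[1:-1])
    return np.digitize(values, edges)

def matched_samples(viirs, dmsp, n_classes=9):
    """Boolean mask of pixels that fall in the same class in both images."""
    return classify(viirs, n_classes) == classify(dmsp, n_classes)

def min_max(v):
    """Scale values to [0, 1] (assumed normalization)."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

# Synthetic pixels: a perfectly monotone relationship keeps every pixel.
viirs_demo = np.arange(1, 19)
dmsp_demo = 3 * viirs_demo
mask_demo = matched_samples(viirs_demo, dmsp_demo)
```

Only pixels where `mask_demo` is true would be passed on as training samples; pixels that are bright in one image and dim in the other are dropped.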

Different training results for the pixel normalized values on the six continents. (a–f) Training results for the pixel normalized values on (a) North America, (b) Oceania, (c) Africa, (d) South America, (e) Europe, and (f) Asia.


As shown in Fig. 1, the \({R}^{2}\) values of the training results for all six continents were > 0.96, indicating that the PSO-BP neural network performed well in identifying the underlying relationship between DMSP/OLS and NPP/VIIRS images in 2013. The test results are shown in Fig. 2. The test performance on the six continents can be used to evaluate the predictive ability of the algorithm (i.e., whether the parameters of the PSO-BP algorithm can be employed to convert the normalized pixel DN values of NPP/VIIRS images during 2014–2019 to the scale of normalized DN values in DMSP/OLS images). Except for Oceania, where the \({R}^{2}\) was only 0.91, the \({R}^{2}\) values of the other five continents all exceeded 0.98. The poorer fit for Oceania may be due to the lack of light at night in most parts of Oceania. Because Oceania has fewer stable light sources, its poorer prediction results have limited impact on the matching of the two sets of nighttime light data at the global scale. Subsequently, the global converted normalized DN values of NPP/VIIRS images during 2013–2019 were denormalized to the original range, which was consistent with the scale of DN values in DMSP/OLS images. Moreover, the final global converted DN values of the 2013 NPP/VIIRS images were compared again with the DN values of the DMSP/OLS images to verify the matching: the global coefficient of determination was > 0.98, which was higher than that obtained in previous studies (for example, 0.91 in Zhao et al.32, 0.84 in Lv et al.33, and 0.87 in Chen et al.34).

Different test results for the pixel normalized values on the six continents. (a–f) Test results for the pixel normalized values on (a) North America, (b) Oceania, (c) Africa, (d) South America, (e) Europe, and (f) Asia.


Based on the trained parameters of the neural network, we transformed the scale of the NPP/VIIRS data from 2014 to 2019 to the scale of the DMSP/OLS data. Because the network was built on the “high to high” and “low to low” principle, pixels with high DN values in the NPP/VIIRS images are converted into high DN values at the scale of DMSP/OLS images. However, the matching was not yet complete. First, certain pixels with low DN values in the NPP/VIIRS images were transformed into low DN values at the DMSP/OLS scale, which did not match the high DN values in the same regions of the 2013 DMSP/OLS image. Second, although the correlation coefficient was close to 1, there were evident and unavoidable discontinuities in some grids between 2013 and 2014, which also exist in previous studies32.

Therefore, an inter-annual continuity correction was applied to the transformed NPP/VIIRS images from 2014 to 2019. Under this correction, pixels with high DN values in the DMSP/OLS image were maintained in the converted NPP/VIIRS images for the period of 2014–2019, which resolved the potential discontinuity. The equation is as follows:

$$\begin{array}{l}D{N}_{i,t}=\left\{\begin{array}{cc}D{N}_{i,t-1}, & if\,D{N}_{i,t-1}\ge D{N}_{i,t}\\ D{N}_{i,t}, & otherwise\end{array}\right.\left(t=2014,\; ..,2019\right),\\ \end{array}$$

(1)
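Applied year by year to the already corrected series, Eq. (1) amounts to a running maximum along the time axis. The sketch below illustrates this reading. The sequential application and the array layout (years first) are assumptions.

```python
import numpy as np

def enforce_non_decreasing(stack):
    """stack: DN values with shape (T, ...) ordered by year.
    Each year keeps the larger of its own value and the previous
    (already corrected) year's value, i.e. a running maximum."""
    return np.maximum.accumulate(np.asarray(stack, dtype=float), axis=0)

# One pixel over four consecutive years (synthetic values).
series_demo = enforce_non_decreasing([[5.0], [3.0], [6.0], [4.0]])
```

Here the dips in the second and fourth years are replaced by the preceding values, so the corrected series is 5, 5, 6, 6.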

In summary, based on the PSO-BP algorithm, we converted the scale of NPP/VIIRS data from 2013 to 2019 to the scale of DMSP/OLS data and obtained stable and continuous global 1 km × 1 km gridded nighttime light data for 1992–2019, which laid the foundation for further calculations of global 1 km × 1 km gridded GDP and electricity consumption during the period. Figure 3 presents the spatial distribution of global nighttime light data in 2019.

The 1 km × 1 km gridded nighttime light data in 2019.


Calculation of real GDP and electricity consumption based on growth rate

Owing to errors in official GDP growth attributed to poor statistical methods or intentional manipulation10,11,21, nighttime light data has been employed extensively to revise official national GDP growth data. In the approaches proposed by Henderson et al.10 and Guerrero et al.11, the revised growth estimate is a weighted composite of conventionally measured growth and growth predicted from nighttime light data. Following these studies, we employed nighttime light data to revise the real GDP growth rate. In particular, the real GDP growth rate was estimated using Eq. (2).

$${y}_{i,t}^{\wedge * }=\rho {y}_{i,t}+\left(1-\rho \right){y}_{i,t}^{{\prime} }$$

(2)

where \({y}_{i,t}^{\wedge * }\) is the revised estimate of the ith country’s real GDP growth in period t; \({y}_{i,t}\) is the official GDP growth of the ith country in period t; \({y}_{i,t}^{{\prime} }\) is the ith country’s predicted GDP growth based on the night-time light data in period t; ρ is the weight of official growth; and \(\left(1-\rho \right)\) is the weight of predicted growth based on the night-time light data. In line with the idea proposed by Henderson et al.10, the optimal value of ρ was specified to minimize the variance of the measurement error in this estimate relative to the true value of GDP growth, \({y}_{i,t}^{* }\). As long as the optimal weight \(\left(1-\rho \right)\) is positive, the use of night-time light data improves our ability to measure true GDP growth. The variance of the error in this composite GDP growth is given by the following equation:

$$var\left({y}_{i}^{\wedge * }-{y}_{i}^{* }\right)={\rho }^{2}var\left({y}_{i}-{y}_{i}^{* }\right)+{\left(1-\rho \right)}^{2}var\left({y}_{i}^{{\prime} }-{y}_{i}^{* }\right)$$

(3)

Following Henderson et al.10, the relationships between the night-time light data and real GDP growth/official GDP growth were described as the following equations:

$${y}_{i}={y}_{i}^{* }+{\varepsilon }_{y,i}$$

(4)

$$sdn{a}_{i}=\beta {y}_{i}^{* }+{\varepsilon }_{sdna,i}$$

(5)

$${y}_{i}=\gamma sdn{a}_{i}+{e}_{i}$$

(6)

$${\sigma }_{y}^{2}={\varepsilon }_{y,i}^{2}$$

(7)

$${\sigma }_{sdna}^{2}={\varepsilon }_{sdna,i}^{2}$$

(8)

where \(sdn{a}_{i}\) is the growth of the sum of DN values per area; \({\varepsilon }_{y,i}\), \({\varepsilon }_{{sdna},i}\) and \({e}_{i}\) are the errors; β is the elasticity of lights growth with respect to real GDP growth; γ is the elasticity of official GDP growth with respect to lights growth; \({\sigma }_{y}^{2}\) and \({\sigma }_{sdna}^{2}\) are the variances of the errors. Based on the assumption that the degree of measurement error in GDP growth has no effect on the estimated value of the parameter, we have \(cov({\varepsilon }_{y},{\varepsilon }_{sdna})=0\). The following equations can then be derived, where \({\sigma }_{y* }^{2}\) denotes the variance of real GDP growth:

$$var(sdna)={\beta }^{2}{\sigma }_{y* }^{2}+{\sigma }_{sdna}^{2}$$

(9)

$$cov\left(sdna,y\right)=cov\left({y}^{* },sdna\right)=\beta {\sigma }_{y* }^{2}$$

(10)

$$var(y)={\sigma }_{y* }^{2}+{\sigma }_{y}^{2}$$

(11)

Then, the relationship between \({\gamma }^{\wedge }\) and the structural parameter β is as follows:

$$Plim\left({\gamma }^{\wedge }\right)=\frac{cov(sdna,y)}{var({\rm{sdna}})}=\frac{1}{\beta }\left(\frac{{\beta }^{2}{\sigma }_{y* }^{2}}{{\beta }^{2}{\sigma }_{y* }^{2}+{\sigma }_{sdna}^{2}}\right)$$

(12)

Thus, Eq. (3) can be rewritten as follows:

$$var\left({y}_{i}^{\wedge * }-{y}_{i}^{* }\right)={\rho }^{2}{\sigma }_{y}^{2}+{\left(1-\rho \right)}^{2}\frac{{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}{{\beta }^{2}{\sigma }_{y* }^{2}+{\sigma }_{sdna}^{2}}$$

(13)

From Eq. (13), we solve for the weight ρ which minimizes this variance:

$${\rho }^{* }=\frac{{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}{{\sigma }_{y}^{2}\left({\beta }^{2}{\sigma }_{y* }^{2}+{\sigma }_{sdna}^{2}\right)+{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}$$

(14)
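The closed form in Eq. (14) can be checked numerically against the variance in Eq. (13). The sketch below does this with illustrative variance values that are not estimates from the paper. The function and argument names are hypothetical.

```python
def composite_variance(rho, var_y_err, var_sdna_err, var_signal, beta):
    """Eq. (13): error variance of the composite growth estimate.
    var_signal is the variance of true GDP growth (sigma_{y*}^2)."""
    light_term = var_sdna_err * var_signal / (beta ** 2 * var_signal + var_sdna_err)
    return rho ** 2 * var_y_err + (1 - rho) ** 2 * light_term

def optimal_rho(var_y_err, var_sdna_err, var_signal, beta):
    """Eq. (14): weight on official growth that minimizes Eq. (13)."""
    num = var_sdna_err * var_signal
    return num / (var_y_err * (beta ** 2 * var_signal + var_sdna_err) + num)

# Illustrative inputs only.
args_demo = (1.0, 2.0, 3.0, 1.0)
rho_demo = optimal_rho(*args_demo)          # equals 6/11 for these inputs
v_at_opt = composite_variance(rho_demo, *args_demo)
```

Because Eq. (13) is a convex quadratic in ρ, moving ρ slightly in either direction from `rho_demo` should never lower the variance, which is a quick sanity check on the algebra.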

Furthermore, following Henderson et al.10, ρ is estimated separately for countries with good- and bad-quality data: \({\rho }_{i,good}\) and \({\rho }_{i,bad}\). Accordingly, Eq. (11) splits into Eqs. (15) and (16).

$$var({y}_{good})={\sigma }_{y* }^{2}+{\sigma }_{y,good}^{2}$$

(15)

$$var({y}_{bad})={\sigma }_{y* }^{2}+{\sigma }_{y,bad}^{2}$$

(16)

The ratio of signal to total variance in official GDP growth was then specified for countries with good-quality statistics. A higher ratio of signal to total variance indicates more reliable GDP growth data. The ratio is defined as follows:

$${\rm{\phi }}=\frac{{\sigma }_{{\rm{y}}* }^{2}}{{\sigma }_{{\rm{y}}* }^{2}+{\sigma }_{{\rm{y}},{\rm{good}}}^{2}},$$

(17)

where ϕ was set to 0.9 based on Henderson et al.10 and Guerrero et al.11. Therefore, \({\rho }_{i,good}\) and \({\rho }_{i,bad}\) can be determined with the following equations:

$${\rho }_{i,good}=\frac{{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}{{\sigma }_{y,good}^{2}\left({\beta }^{2}{\sigma }_{y* }^{2}+{\sigma }_{sdna}^{2}\right)+{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}$$

(18)

$${\rho }_{i,bad}=\frac{{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}{{\sigma }_{y,bad}^{2}\left({\beta }^{2}{\sigma }_{y* }^{2}+{\sigma }_{sdna}^{2}\right)+{\sigma }_{sdna}^{2}{\sigma }_{y* }^{2}}$$

(19)

According to previous studies10,21,35, statistics from developed countries are generally of better quality, while those from developing countries are less reliable. Therefore, we characterized the quality of a country’s data based on whether it is a developed country. In addition, the weights applied to growth predicted from nighttime light data (i.e., \(\left(1-\rho \right)\)) differed between developed and developing countries, which is consistent with Henderson et al.10. The classification into developed and developing countries was based on that of the United Nations (Statistics Division) provided by the World Bank36. Based on the above equations, we obtained the optimal weights of the official GDP growth rate in developed and developing countries (i.e., \({\rho }_{good}=0.94\;and\;{\rho }_{bad}=0.66\)).

Furthermore, each grid’s real GDP growth rate during 1993–2019 can be estimated using the following equation:

$${{\rm{gy}}}_{ij,t}^{* }=\left\{\begin{array}{c}{\rho }_{gb}\times {y}_{i,t}+(1-{\rho }_{gb})\times \left(\frac{D{N}_{ij,t}-D{N}_{ij,t-1}}{D{N}_{ij,t-1}}\right)\times \alpha ,if\;D{N}_{ij,t-1}\ne 0\\ {y}_{i,t},if\;D{N}_{ij,t-1}=0\end{array}\right.,$$

(20)

where \(g{y}_{ij,t}^{* }\) denotes the jth grid in the ith country’s real GDP growth; \(gb=good,bad\); α represents the elasticity of the nighttime light data to GDP (i.e., 0.45 based on the regression results), which was obtained by Eq. (6).
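A compact sketch of Eq. (20) for one country and one year follows. The default α = 0.45 and the example weight ρ = 0.66 are the values reported in the text. The array values and the function name are illustrative.

```python
import numpy as np

def grid_gdp_growth(dn_prev, dn_curr, official_growth, rho, alpha=0.45):
    """Eq. (20): blend official national growth with light-based growth
    for each grid cell; cells unlit in the previous year take the
    official growth rate."""
    dn_prev = np.asarray(dn_prev, dtype=float)
    dn_curr = np.asarray(dn_curr, dtype=float)
    lit = dn_prev != 0
    light_growth = np.zeros_like(dn_prev)
    light_growth[lit] = (dn_curr[lit] - dn_prev[lit]) / dn_prev[lit]
    blended = rho * official_growth + (1 - rho) * light_growth * alpha
    return np.where(lit, blended, official_growth)

# Two cells: one lit in both years, one newly lit (synthetic values).
growth_demo = grid_gdp_growth([10.0, 0.0], [12.0, 5.0],
                              official_growth=0.03, rho=0.66)
```

For the first cell, lights grow by 20%, so the blended rate is 0.66 × 0.03 + 0.34 × 0.20 × 0.45 = 0.0504. The second cell falls back to the official 3%.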

Next, based on the gridded real GDP growth rate during 1993–2019, the gridded GDP data in 1992 or 2019 were estimated as basic values to obtain the gridded real GDP data in other years. Since the DN values in newly built-up areas were zero in 1992, these areas’ basic GDP values in 1992 were also zero, thereby leading to values of zero in subsequent years. Thus, the gridded GDP data in 2019 was selected as the basic value, which was calculated based on the top-down method.

Finally, the gridded real GDP based on the real growth rate can be calculated using Eq. (21).

$${{\rm{RGY}}}_{ij,t}^{* }=\left\{\begin{array}{c}\frac{{{\rm{RGY}}}_{ij,t+1}^{* }}{1+g{y}_{ij,t}^{* }},if\;D{N}_{ij,t}\ne 0\\ 0,if\;D{N}_{ij,t}=0\end{array},\right.$$

(21)

where \(RG{Y}_{ij,t}^{* }\) denotes the jth grid in the ith country’s real GDP in the period of t based on the revised real growth rate. The calculations were based on the hypothesis that there is no GDP when the DN value is zero, which is consistent with Shi et al.17 and Wang et al.16.
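The backward recursion in Eq. (21) can be sketched as below. The sketch assumes that `growth[t]` holds the rate linking year t to year t + 1. How this maps onto the subscript convention of Eq. (21) is an assumption, and the example values are synthetic.

```python
import numpy as np

def backcast(base, growth, dn):
    """Work backwards from the final-year grid.
    base:   values in the final year, shape (...)
    growth: rates with shape (T, ...), growth[t] links year t to year t+1
    dn:     DN values with shape (T+1, ...)
    Cells with DN equal to zero in a year are assigned zero in that year."""
    growth = np.asarray(growth, dtype=float)
    dn = np.asarray(dn, dtype=float)
    n_steps = growth.shape[0]
    out = np.zeros((n_steps + 1,) + np.shape(base))
    out[-1] = base
    for t in range(n_steps - 1, -1, -1):
        out[t] = np.where(dn[t] != 0, out[t + 1] / (1.0 + growth[t]), 0.0)
    return out

# Two cells, two years: the second cell is unlit in the earlier year.
gdp_demo = backcast(base=[110.0, 50.0],
                    growth=[[0.10, 0.25]],
                    dn=[[5.0, 0.0], [6.0, 3.0]])
```

Starting from the most recent year is what allows cells that were dark early in the period to carry non-zero values once they become lit.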

As for electricity consumption, the gridded growth rate of nighttime light data was used to estimate the growth rate of gridded electricity consumption. However, because the growth rate of electricity consumption was mainly driven by the industrial sectors rather than the residential sector37,38, the growth rate of the nighttime light data may not comprehensively capture the growth rate of electricity consumption. Thus, we combined the growth of official GDP and nighttime light data to better reveal the gridded growth rate of electricity consumption, which is presented in Eq. (22).

$${\rm{ln}}E{C}_{it}=\gamma ln(SD{N}_{it})+\pi ln({Y}_{it})+{c}_{it}+{\tau }_{it},$$

(22)

where \(E{C}_{it}\) denotes the ith country’s electricity consumption in period t, \(SD{N}_{it}\) denotes the ith country’s sum of DN values in period t, \({Y}_{it}\) denotes the ith country’s official GDP in period t, \({{\rm{c}}}_{it}\) denotes the constant, \({{\rm{\tau }}}_{it}\) denotes the errors, and γ and π denote the coefficients (estimated as 0.22 and 0.71, respectively). Then, the gridded growth rate of electricity consumption \({{\rm{gecg}}}_{ij,t}^{* }\) was calculated using Eq. (23).

$${{\rm{gecg}}}_{ij,t}^{* }=\left\{\begin{array}{c}\gamma \times \left(\frac{D{N}_{ij,t}}{D{N}_{ij,t-1}}-1\right)+\pi \times \left(\frac{{Y}_{i,t}}{{Y}_{i,t-1}}-1\right),if\;D{N}_{ij,t-1}\ne 0\\ 0,if\;D{N}_{ij,t-1}=0\end{array}\right.,$$

(23)
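Eq. (23) can be sketched in the same style. The defaults γ = 0.22 and π = 0.71 are the coefficients reported above. The input values and the function name are illustrative.

```python
import numpy as np

def grid_ec_growth(dn_prev, dn_curr, gdp_prev, gdp_curr, gamma=0.22, pi=0.71):
    """Eq. (23): electricity growth per grid cell as a weighted sum of
    cell-level light growth and national official GDP growth; zero
    where the cell was unlit in the previous year."""
    dn_prev = np.asarray(dn_prev, dtype=float)
    dn_curr = np.asarray(dn_curr, dtype=float)
    lit = dn_prev != 0
    ratio = np.ones_like(dn_prev)
    ratio[lit] = dn_curr[lit] / dn_prev[lit]
    rate = gamma * (ratio - 1.0) + pi * (gdp_curr / gdp_prev - 1.0)
    return np.where(lit, rate, 0.0)

# Lights up 10% in the first cell, national GDP up 4% (synthetic values).
ec_growth_demo = grid_ec_growth([10.0, 0.0], [11.0, 4.0], 100.0, 104.0)
```

For the first cell this gives 0.22 × 0.10 + 0.71 × 0.04 = 0.0504, while the second cell is assigned zero because it was unlit in the previous year.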

Given that worldwide electricity consumption data were openly and freely available only for 1992–2015, we selected the gridded electricity consumption data in 2015 as the basic values. Then, the gridded electricity consumption \({{\rm{GEC}}}_{ij,t}^{* }\) was calculated using Eq. (24).

$${{\rm{GEC}}}_{ij,t}^{* }=\left\{\begin{array}{c}\frac{GE{C}_{ij,t+1}}{1+{{\rm{gecg}}}_{ij,t}^{* }},if\;D{N}_{ij,t-1}\ne 0\\ 0,if\;D{N}_{ij,t-1}=0\end{array}\right.$$

(24)

With regard to the basic values of gridded GDP in 2019 and electricity consumption in 2015, we first established the relationships between national nighttime light data (i.e., the sum of the DN values) and each targeted variable (i.e., GDP and electricity consumption) based on the top-down approach. The ratios of GDP and electricity consumption to the nighttime light data (i.e., the coefficients of the targeted variables per unit of DN value) can then be estimated for different countries (or regions) during 1992–2019, and each 1 km × 1 km grid can be assigned GDP and electricity consumption with the DN value as the weight. These ratios were estimated using the following equations:

$${Y}_{it}^{* }={\beta }_{it}SD{N}_{it}+{\mu }_{it},$$

(25)

$$E{C}_{it}={\theta }_{it}SD{N}_{it}+{\epsilon }_{it},$$

(26)

where \({Y}_{it}^{* }\) represents the ith country’s (or region’s) real GDP in the period t; \({\beta }_{it}\) and \({\theta }_{it}\) represent the coefficients of the ith country’s (or region’s) in the period t; \({\mu }_{it}\) and \({\epsilon }_{it}\) denote the errors.

Furthermore, in line with Chen et al.31, we employed the PSO-BP algorithm to fit and train the relationship among real GDP, electricity consumption, and nighttime light data. The real GDP and electricity consumption were selected as the output factors. The sum of DN values and dummy variables for country (or region) identity and year were used as input factors. The other initial parameters were the same as those described in the earlier section on matching the two sets of nighttime light data. Following general practice in machine learning28,31, the input and output factors were normalized to avoid the influence of the indicators’ units. The results are shown in Fig. 4. In particular, the NEGDP/NEEC on the x-axis represents our estimated national normalized GDP/electricity consumption predicted from the input factors; the NAGDP and NAEC on the y-axis represent national normalized actual GDP and electricity consumption, respectively.

Training and all-sample results for the relationship between national normalized actual GDP/electricity consumption and our estimated GDP/electricity consumption predicted based on the input factors. (a) Training results for the relationship between national normalized actual GDP and our estimated GDP predicted based on the input factors; (b) Training results for the relationship between national normalized actual electricity consumption and our estimated electricity consumption predicted based on the input factors; (c) All-sample results for the relationship between national normalized actual GDP and our estimated GDP predicted based on the input factors; (d) All-sample results for the relationship between national normalized actual electricity consumption and our estimated electricity consumption predicted based on the input factors.


Notably, the coefficients of determination \({R}^{2}\) for normalized GDP and electricity consumption were over 0.99. Thus, both the training and all-sample results fitted the data closely, which indicates that the algorithm was effective. Then, based on the top-down method and a DN value-based weighted-average strategy39,40,41, we obtained the 1 km × 1 km gridded GDP in 2019 and electricity consumption in 2015. Finally, the gridded real GDP and electricity consumption during 1992–2019 were calculated from the growth rates using Eqs. (21) and (24).
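The DN-weighted top-down allocation can be illustrated with a minimal sketch, assuming each grid cell receives a share of the national total proportional to its DN value. The function name and numbers are hypothetical.

```python
import numpy as np

def allocate_top_down(national_total, dn):
    """Distribute a national total over grid cells in proportion to DN.
    Returns zeros if the country has no lit cells."""
    dn = np.asarray(dn, dtype=float)
    total_dn = dn.sum()
    if total_dn == 0:
        return np.zeros_like(dn)
    return national_total * dn / total_dn

# A 2x2 country with a national total of 1000 units (synthetic values).
grid_demo = allocate_top_down(1000.0, [[1.0, 3.0], [0.0, 4.0]])
```

By construction the cell values sum back to the national total, and unlit cells receive nothing, in line with the hypothesis that there is no GDP where the DN value is zero.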
